advanced · 5 agents · 14 skills
A five-stage ETL pipeline for production data processing. Agents handle ingestion from S3 and databases, parallel transformation, data validation with quarantine for bad records, loading into target stores, and automated reporting via Slack dashboards.
clawhub install pilot-etl-data-pipeline-setup

Skills:
pilot-s3-bridge, pilot-database-bridge, pilot-task-chain, pilot-cron, pilot-task-router, pilot-stream-data, pilot-task-parallel, pilot-audit-log, pilot-alert, pilot-quarantine, pilot-receipt, pilot-webhook-bridge, pilot-metrics, pilot-slack-bridge

Agents:
<your-prefix>-ingest - Data Ingestion (pilot-s3-bridge, pilot-database-bridge, pilot-task-chain, pilot-cron)
<your-prefix>-transform - Data Transformer (pilot-task-router, pilot-stream-data, pilot-task-parallel)
<your-prefix>-validate - Data Validator (pilot-task-router, pilot-audit-log, pilot-alert, pilot-quarantine)
<your-prefix>-loader - Data Loader (pilot-database-bridge, pilot-task-chain, pilot-receipt)
<your-prefix>-reporter - Pipeline Reporter (pilot-webhook-bridge, pilot-metrics, pilot-slack-bridge, pilot-cron)

Data flows:
<your-prefix>-ingest → <your-prefix>-transform:1001 - raw data batches
<your-prefix>-transform → <your-prefix>-validate:1001 - transformed records
<your-prefix>-validate → <your-prefix>-loader:1001 - validated records
<your-prefix>-loader → <your-prefix>-reporter:1002 - load receipts
<your-prefix>-validate → <your-prefix>-reporter:1002 - validation metrics

# Replace <your-prefix> with a unique name for your deployment (e.g. acme)
# On ingestion server
clawhub install pilot-s3-bridge pilot-database-bridge pilot-task-chain pilot-cron
pilotctl set-hostname <your-prefix>-ingest
# On transform server
clawhub install pilot-task-router pilot-stream-data pilot-task-parallel
pilotctl set-hostname <your-prefix>-transform
# On validation server
clawhub install pilot-task-router pilot-audit-log pilot-alert pilot-quarantine
pilotctl set-hostname <your-prefix>-validate
# On loader server
clawhub install pilot-database-bridge pilot-task-chain pilot-receipt
pilotctl set-hostname <your-prefix>-loader
# On reporting server
clawhub install pilot-webhook-bridge pilot-metrics pilot-slack-bridge pilot-cron
pilotctl set-hostname <your-prefix>-reporter
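
The per-server steps above all follow the same pattern: install that role's skills, then set the hostname to the prefix plus the role. As a rough sketch of how the prefix substitution could be scripted, the helper below only prints the commands for review and does not run them. The names PREFIX and print_setup are hypothetical and not part of clawhub or pilotctl; "acme" is the example prefix mentioned earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of clawhub or pilotctl): prints the setup
# commands for one role so they can be reviewed before being run by hand.
# PREFIX is an assumed variable name; it defaults to the example "acme".
PREFIX="${PREFIX:-acme}"

print_setup() {
  role="$1"
  shift
  # "$*" joins the remaining arguments (the skill names) with spaces.
  printf 'clawhub install %s\n' "$*"
  printf 'pilotctl set-hostname %s-%s\n' "$PREFIX" "$role"
}

# Role-to-skill mapping copied from the install steps above.
print_setup ingest    pilot-s3-bridge pilot-database-bridge pilot-task-chain pilot-cron
print_setup transform pilot-task-router pilot-stream-data pilot-task-parallel
print_setup validate  pilot-task-router pilot-audit-log pilot-alert pilot-quarantine
print_setup loader    pilot-database-bridge pilot-task-chain pilot-receipt
print_setup reporter  pilot-webhook-bridge pilot-metrics pilot-slack-bridge pilot-cron
```

Each pair of printed lines still has to be run on the matching server; the script only saves retyping the prefix.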
# On ingest:
pilotctl handshake <your-prefix>-transform "setup: etl-data-pipeline"
# On transform:
pilotctl handshake <your-prefix>-ingest "setup: etl-data-pipeline"
# On loader:
pilotctl handshake <your-prefix>-reporter "setup: etl-data-pipeline"
# On reporter:
pilotctl handshake <your-prefix>-loader "setup: etl-data-pipeline"
# On loader:
pilotctl handshake <your-prefix>-validate "setup: etl-data-pipeline"
# On validate:
pilotctl handshake <your-prefix>-loader "setup: etl-data-pipeline"
# On reporter:
pilotctl handshake <your-prefix>-validate "setup: etl-data-pipeline"
# On validate:
pilotctl handshake <your-prefix>-reporter "setup: etl-data-pipeline"
# On transform:
pilotctl handshake <your-prefix>-validate "setup: etl-data-pipeline"
# On validate:
pilotctl handshake <your-prefix>-transform "setup: etl-data-pipeline"
pilotctl trust
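
The handshake steps above pair up exactly along the five data flows: each flow produces one handshake in each direction. A minimal sketch of that mapping, assuming the same "acme" example prefix, is shown below. It prints the commands rather than executing them, and print_handshakes, PREFIX, and the "source:destination" argument format are hypothetical conventions chosen for this example, not pilotctl features.

```shell
#!/bin/sh
# Hypothetical helper (not part of pilotctl): prints the mutual handshake
# commands implied by a list of data flows. PREFIX is an assumed variable name.
PREFIX="${PREFIX:-acme}"

print_handshakes() {
  # Each argument is "source:destination", using the role names from the
  # data flows (the flow's port number is not needed here).
  for flow in "$@"; do
    src="${flow%%:*}"   # text before the colon
    dst="${flow##*:}"   # text after the colon
    printf '# On %s:\n' "$src"
    printf 'pilotctl handshake %s-%s "setup: etl-data-pipeline"\n' "$PREFIX" "$dst"
    printf '# On %s:\n' "$dst"
    printf 'pilotctl handshake %s-%s "setup: etl-data-pipeline"\n' "$PREFIX" "$src"
  done
}

# The five data flows of this pipeline.
print_handshakes ingest:transform transform:validate validate:loader \
  loader:reporter validate:reporter
```

Deriving the list from the flows makes it easier to spot a missing direction if a flow is added later.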