ETL Data Pipeline

advanced · 5 agents · 14 skills

A five-stage ETL pipeline for production data processing. Agents handle ingestion from S3 and databases, parallel transformation, data validation with quarantine for bad records, loading into target stores, and automated reporting via Slack dashboards.

Install

clawhub install pilot-etl-data-pipeline-setup

Skills used

Agents

Data flows

Quick start

# Replace <your-prefix> with a unique name for your deployment (e.g. acme)
# On ingestion server
clawhub install pilot-s3-bridge pilot-database-bridge pilot-task-chain pilot-cron
pilotctl set-hostname <your-prefix>-ingest

# On transform server
clawhub install pilot-task-router pilot-stream-data pilot-task-parallel
pilotctl set-hostname <your-prefix>-transform

# On validation server
clawhub install pilot-task-router pilot-audit-log pilot-alert pilot-quarantine
pilotctl set-hostname <your-prefix>-validate

# On loader server
clawhub install pilot-database-bridge pilot-task-chain pilot-receipt
pilotctl set-hostname <your-prefix>-loader

# On reporting server
clawhub install pilot-webhook-bridge pilot-metrics pilot-slack-bridge pilot-cron
pilotctl set-hostname <your-prefix>-reporter

# Handshakes (each pair runs the command in both directions)
# On ingest:
pilotctl handshake <your-prefix>-transform "setup: etl-data-pipeline"
# On transform:
pilotctl handshake <your-prefix>-ingest "setup: etl-data-pipeline"
# On loader:
pilotctl handshake <your-prefix>-reporter "setup: etl-data-pipeline"
# On reporter:
pilotctl handshake <your-prefix>-loader "setup: etl-data-pipeline"
# On loader:
pilotctl handshake <your-prefix>-validate "setup: etl-data-pipeline"
# On validate:
pilotctl handshake <your-prefix>-loader "setup: etl-data-pipeline"
# On reporter:
pilotctl handshake <your-prefix>-validate "setup: etl-data-pipeline"
# On validate:
pilotctl handshake <your-prefix>-reporter "setup: etl-data-pipeline"
# On transform:
pilotctl handshake <your-prefix>-validate "setup: etl-data-pipeline"
# On validate:
pilotctl handshake <your-prefix>-transform "setup: etl-data-pipeline"
pilotctl trust
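
The ten handshake commands above follow one pattern: five agent pairs, each handshaking in both directions. As a minimal sketch, the script below prints that command list for a given prefix so it can be reviewed before being run on each server. The function name `print_handshakes` and the `a:b` pair notation are illustrative assumptions, not part of `pilotctl` or `clawhub`; the script only echoes text and never calls `pilotctl` itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of pilotctl): prints, but does not run,
# the handshake commands from the quick start for one deployment prefix.

print_handshakes() {
  # $1 = deployment prefix, e.g. acme
  # Pairs are copied from the quick start; "a:b" means a and b handshake each other.
  for pair in ingest:transform transform:validate validate:loader \
              validate:reporter loader:reporter; do
    a=${pair%%:*}   # text before the colon
    b=${pair##*:}   # text after the colon
    printf '# On %s:\n' "$a"
    printf 'pilotctl handshake %s-%s "setup: etl-data-pipeline"\n' "$1" "$b"
    printf '# On %s:\n' "$b"
    printf 'pilotctl handshake %s-%s "setup: etl-data-pipeline"\n' "$1" "$a"
  done
}

# Default to the example prefix from the quick start when none is given.
print_handshakes "${1:-acme}"
```

Each printed command still has to be run on the server named in the comment above it. Keeping the pairs in one list makes it easier to check that no direction is missing.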