Knowledge Base (RAG)

intermediate · 4 agents · 11 skills

A retrieval-augmented generation pipeline. Documents are ingested from S3 or file shares, chunked and embedded in parallel, indexed into a vector store, and served to query agents. Health checks and load balancing keep the query layer responsive under load.

Install

clawhub install pilot-knowledge-base-rag-setup

Quick start

# Replace <your-prefix> with a unique name for your deployment (e.g. acme)
# On ingestion node
clawhub install pilot-s3-bridge pilot-share pilot-chunk-transfer pilot-cron
pilotctl set-hostname <your-prefix>-rag-ingest

# On embedding node (GPU recommended)
clawhub install pilot-task-parallel pilot-share pilot-metrics pilot-task-chain
pilotctl set-hostname <your-prefix>-rag-embedder

# On indexer node
clawhub install pilot-database-bridge pilot-share pilot-task-chain pilot-health
pilotctl set-hostname <your-prefix>-rag-indexer

# On query server
clawhub install pilot-api-gateway pilot-health pilot-load-balancer pilot-metrics
pilotctl set-hostname <your-prefix>-rag-query

# Handshake each pair of adjacent stages (run on both sides)
# ingest <-> embedder
# On rag-ingest:
pilotctl handshake <your-prefix>-rag-embedder "rag pipeline"
# On rag-embedder:
pilotctl handshake <your-prefix>-rag-ingest "rag pipeline"

# embedder <-> indexer
# On rag-embedder:
pilotctl handshake <your-prefix>-rag-indexer "rag pipeline"
# On rag-indexer:
pilotctl handshake <your-prefix>-rag-embedder "rag pipeline"

# indexer <-> query
# On rag-indexer:
pilotctl handshake <your-prefix>-rag-query "rag pipeline"
# On rag-query:
pilotctl handshake <your-prefix>-rag-indexer "rag pipeline"
pilotctl trust
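
The handshakes above follow one pattern: each stage pairs with the next stage in data-flow order, and the command is run once from each side. As a rough sketch, that pattern can be generated for any prefix. The helper name print_handshakes is hypothetical and not part of pilotctl; the script only prints the commands from this quick start and never runs pilotctl itself.

```shell
#!/bin/sh
# Dry-run helper: prints the quick-start handshake commands for a prefix.
# print_handshakes is a hypothetical name, not a pilotctl feature.
# Nothing here calls pilotctl; review the output, then run it per node.

print_handshakes() {
    prefix="$1"
    if [ -z "$prefix" ]; then
        echo "usage: print_handshakes <prefix>" >&2
        return 1
    fi
    # Stages in data-flow order; each stage pairs with the one after it.
    set -- ingest embedder indexer query
    while [ "$#" -gt 1 ]; do
        a="$1"
        b="$2"
        echo "# ${a} <-> ${b}"
        echo "# On rag-${a}:"
        echo "pilotctl handshake ${prefix}-rag-${b} \"rag pipeline\""
        echo "# On rag-${b}:"
        echo "pilotctl handshake ${prefix}-rag-${a} \"rag pipeline\""
        shift
    done
}

# Example: print the six handshake commands for the prefix "acme".
print_handshakes acme
```

Printing rather than executing keeps the sketch safe to run anywhere, since each command still has to be issued on the node named in the comment above it.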