Run Your Agent Network Without Cloud Dependency
"What happens to my devices if the cloud service shuts down?" This question used to be hypothetical. It is not anymore.
In April 2022, Insteon -- one of the largest smart home platforms in the United States -- went dark overnight. The company ceased operations without warning. Their cloud servers went offline. Every Insteon hub, switch, dimmer, and sensor stopped responding. Customers who had spent thousands of dollars building their smart homes woke up to expensive paperweights. There was no migration path, no export tool, no local fallback. The devices were physically functional but digitally dead, because every command routed through servers that no longer existed.
Insteon was not an outlier. It was a preview.
In August 2023, Google shut down Cloud IoT Core, its managed service for connecting and managing IoT devices. Millions of devices that depended on Google's MQTT broker for communication had to be migrated to alternative platforms. Google gave roughly a year of notice, which was generous compared to Insteon's zero, but the migration was still painful: new SDKs, new authentication flows, new billing accounts, new terms of service.
In 2025, Belkin announced the end-of-life of Wemo's cloud services, effective January 2026. Every Wemo smart plug, switch, and camera that relied on the Wemo app for remote access lost that capability. Local control still works for some devices, but only if you are on the same Wi-Fi network. Remote access -- the entire point of a "smart" device -- is gone.
The pattern is clear: cloud-dependent devices are rented, not owned. The vendor controls the infrastructure your devices depend on, and the vendor can take it away at any time, for any reason.
Why Cloud Dependency Is a Design Flaw, Not a Feature
Cloud dependency is not inherent to networked devices. It is a design choice -- and it is the wrong one for most deployments. Here is why.
Single Point of Failure
When every device communicates through a cloud service, that service is a single point of failure for your entire fleet. If the cloud has an outage, all of your devices stop working simultaneously. AWS has had multiple multi-hour outages affecting us-east-1. Google Cloud had a global networking outage in 2023. Azure had a 14-hour authentication outage in 2025. During each of these events, millions of IoT devices became unresponsive -- not because the devices had problems, but because the cloud they depended on did.
Latency for Local Communication
Two devices in the same room should not need to communicate through a data center on another continent. But in cloud-dependent architectures, they do. A temperature sensor sends a reading to the cloud, the cloud processes a rule, the cloud sends a command to the thermostat, and the thermostat adjusts. Round-trip: 200-500ms. The same operation over a local connection: 2ms.
This is not just a performance issue. For safety-critical applications -- industrial control, medical devices, autonomous vehicles -- the latency introduced by cloud round-trips can be dangerous. An emergency stop command that takes 500ms because it routes through AWS is 498ms too slow.
Privacy and Data Sovereignty
Cloud-dependent devices send all of their data to the vendor's servers. Every sensor reading, every camera frame, every voice command, every usage pattern flows through infrastructure you do not control. The vendor's privacy policy governs your data, not your own policies. If the vendor is acquired, if they change their terms of service, or if they receive a legal request from a jurisdiction you have never heard of, your data is exposed.
For businesses operating under GDPR, HIPAA, or industry-specific regulations, cloud dependency creates a compliance burden. You need to audit the vendor's data handling practices, sign data processing agreements, and monitor their compliance continuously. Self-hosting eliminates this entire category of risk.
The Account Tax
Cloud-dependent devices require cloud accounts. Each vendor wants you to create an account, verify your email, accept terms of service, provide payment information, and configure authentication. As one frustrated home automation user put it: "30 minutes logging into somebody else's website per device."
For a home with 20 smart devices from 5 vendors, that is 5 accounts to create, 5 apps to install, 5 password resets when you forget them, and 5 separate interfaces to manage your own devices. This is not a user experience problem. It is an ownership problem. You bought the hardware, but the software belongs to someone else.
The fundamental question: Why does Home Assistant need to contact Google to add a Matter device? Why does your thermostat need the vendor's cloud to talk to your temperature sensor in the next room? Cloud dependency exists because it is convenient for the vendor, not because it is necessary for the user.
Self-Hosting a Pilot Rendezvous Server
Pilot Protocol's rendezvous server is the only piece of infrastructure you need to run an agent network. It handles agent registration, hostname resolution, NAT traversal coordination, and trust relay. It is a single Go binary with no external dependencies -- no database, no message queue, no container runtime.
Here is how to set it up. This takes approximately five minutes.
Step 1: Install
# Install pilotctl (requires Go 1.21+)
$ go install github.com/TeoSlayer/pilotprotocol/cmd/pilotctl@latest
That is the entire install. One command. No Docker, no Kubernetes, no Terraform. The binary is statically linked and runs on Linux, macOS, and Windows.
Step 2: Start the Rendezvous Server
# Start the rendezvous server with registry and beacon
$ pilotctl rendezvous start --listen :9000 --beacon :9001
Registry listening on :9000 (TCP)
Beacon listening on :9001 (UDP)
Persistence: /var/lib/pilot/registry.json
# For production, run as a systemd service
$ sudo systemctl enable pilot-rendezvous
$ sudo systemctl start pilot-rendezvous
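The `pilot-rendezvous` unit referenced above is something you write yourself. A minimal hand-rolled service file might look like this (a sketch -- the binary path, flags, and `StateDirectory` are assumptions to adapt to your install):

```ini
# /etc/systemd/system/pilot-rendezvous.service
[Unit]
Description=Pilot Protocol rendezvous server
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/pilotctl rendezvous start --listen :9000 --beacon :9001
Restart=on-failure
# Creates /var/lib/pilot and keeps the registry state there.
StateDirectory=pilot

[Install]
WantedBy=multi-user.target
```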
The registry listens on TCP port 9000 for agent registrations, lookups, and handshake relay. The beacon listens on UDP port 9001 for STUN discovery and hole-punch coordination. Both run in the same process.
The rendezvous server persists its state to a JSON file (/var/lib/pilot/registry.json) using atomic writes. If the server restarts, it reloads the previous state and agents reconnect automatically. There is no database to configure, no schema migration to run, and no backup process more complex than copying a single file.
Step 3: Connect Agents
# Agent 1: point at your rendezvous server
$ pilotctl init --hostname sensor-1
$ pilotctl daemon start --registry your-server.local:9000
Registered as sensor-1 (1:0001.0000.0001)
# Agent 2: same rendezvous, same network
$ pilotctl init --hostname controller
$ pilotctl daemon start --registry your-server.local:9000
Registered as controller (1:0001.0000.0002)
# Agents can now discover and connect to each other
$ pilotctl find sensor-1
sensor-1 1:0001.0000.0001 public 192.168.1.50:4000
No accounts. No API keys. No terms of service. No third-party cloud. The rendezvous server runs on your hardware, your agents register with it, and all communication stays on your network.
Hardware Requirements
The rendezvous server is lightweight. For a fleet of up to 1,000 agents, a Raspberry Pi 4 is more than sufficient. For larger deployments, any modern server or VM works. Here are the measured resource requirements:
| Fleet Size | RAM | CPU | Bandwidth |
|---|---|---|---|
| 10 agents | ~20 MB | Negligible | ~1 KB/s |
| 100 agents | ~35 MB | <1% single core | ~10 KB/s |
| 1,000 agents | ~80 MB | ~5% single core | ~100 KB/s |
| 10,000 agents | ~400 MB | ~1 core | ~1 MB/s |
For production deployments exceeding 1,000 agents, Pilot supports hot-standby replication: a secondary rendezvous server receives push-based snapshots from the primary and can take over if the primary fails. See How We Run 10,000 Agents on 3 VMs for the operational details.
Agent Enrollment Without Third-Party Accounts
In cloud-dependent systems, enrolling a new device means creating a vendor account, registering the device serial number, generating API credentials, and configuring the device with those credentials. In Pilot Protocol, enrollment is a single command.
# New agent enrollment -- no accounts, no credentials to manage
$ pilotctl init --hostname new-sensor
Identity created: ~/.pilot/identity.key
Public key: 7a2c...f819 (Ed25519)
Virtual address: 1:0001.0000.0005
$ pilotctl daemon start --registry your-server.local:9000
Registered as new-sensor (1:0001.0000.0005)
The agent generates its own Ed25519 key pair locally. No certificate authority, no credential server, no enrollment API. The private key never leaves the device. The public key is registered with the rendezvous server alongside the agent's virtual address.
Authentication is cryptographic: the agent proves its identity by signing messages with its private key. Other agents verify the signature with the registered public key. There is no password to rotate, no token to refresh, and no OAuth flow to debug.
For fleet provisioning, you can script the enrollment process:
# Provision 50 sensors from a deployment script
for i in $(seq 1 50); do
ssh sensor-$i "pilotctl init --hostname sensor-$i && \
pilotctl daemon start --registry rendezvous.local:9000"
done
Each sensor gets its own cryptographic identity, its own virtual address, and its own hostname. No shared credentials, no master key, no single secret that compromises the entire fleet if leaked.
What Happens When the Rendezvous Goes Down
This is the question that exposes the critical difference between cloud-dependent and self-hosted architectures: what breaks when the central server is unavailable?
In a cloud-dependent system, the answer is "everything." Devices cannot authenticate, cannot discover peers, cannot send commands, and cannot receive updates. The entire deployment is dead.
In Pilot Protocol, the answer is "new discovery stops, but everything else keeps working."
- Active connections continue -- agents that have already discovered each other and established tunnels continue communicating. The tunnel is a direct UDP connection between agents. The rendezvous server is not in the data path.
- Trusted relationships persist -- trust state is stored locally on each agent in ~/.pilot/. It does not depend on the rendezvous server. If the server goes down, agents can still communicate with all of their previously trusted peers.
- New discovery fails -- agents cannot look up new peers by hostname because the registry is unavailable. This is the only thing that breaks.
- NAT traversal degrades -- STUN discovery and hole-punch coordination require the beacon, which runs on the rendezvous server. New connections behind NAT may not be established. Existing hole-punched connections continue working because the NAT mappings are maintained by keepalive probes.
- Recovery is automatic -- when the rendezvous server comes back, agents reconnect and re-register automatically. No manual intervention required.
Compare this to what happened when Insteon's cloud went down: devices could not execute local automation rules, could not be controlled via the app, and could not even be reset to work with a different platform. The cloud was not just the discovery layer -- it was the control plane, the data plane, and the authentication layer, all in one.
Design principle: The rendezvous server is the phonebook, not the phone network. If the phonebook is unavailable, you cannot look up new numbers, but you can still call anyone whose number you already have. This is a fundamentally more resilient architecture than routing all traffic through a central service.
Comparison: Pilot vs. MQTT Broker vs. Cloud IoT Platforms
| Property | Pilot Protocol | MQTT Broker | Cloud IoT (AWS/GCP/Azure) |
|---|---|---|---|
| Self-hostable | Yes (single binary) | Yes (Mosquitto, etc.) | No |
| Cloud account required | No | No (self-hosted) | Yes |
| Central server in data path | No (discovery only) | Yes (all messages) | Yes (all messages) |
| P2P communication | Yes (after discovery) | No (always via broker) | No (always via cloud) |
| NAT traversal | Automatic (STUN/punch/relay) | Not supported | Not supported |
| Encryption | End-to-end (X25519 + AES-256-GCM) | TLS to broker only | TLS to cloud only |
| Survives server outage | Yes (existing connections continue) | No (all messages stop) | No (all messages stop) |
| Per-device identity | Ed25519 key pair (auto-generated) | Username/password or certificate | Cloud-managed certificate |
| Vendor lock-in | None (open source, MIT license) | Low (MQTT is a standard) | High (proprietary SDKs) |
| Monthly cost at 1K devices | $0 (self-hosted) | $0 (self-hosted) | $50-500/month |
MQTT is the closest alternative for self-hosted deployments. Mosquitto is excellent, battle-tested, and widely used. But MQTT brokers are always in the data path -- every message flows through the broker, making it a bottleneck and a single point of failure. Pilot's rendezvous server is only in the discovery path; actual communication is peer-to-peer.
MQTT also does not provide NAT traversal. If your devices are behind different NATs, you need a publicly reachable broker (which reintroduces cloud dependency) or a VPN (which reintroduces configuration complexity). Pilot's automatic NAT traversal works behind any NAT type without additional infrastructure. For the technical details, see Connect AI Agents Behind NAT Without a VPN.
Migration Guide: Cloud-Dependent to Self-Hosted
If you are currently running devices on a cloud IoT platform and want to migrate to a self-hosted Pilot network, here is the process.
Step 1: Deploy Your Rendezvous Server
Choose a machine with a stable IP address. This can be a cloud VM (the irony is intentional -- using a $5/month VM to run your own infrastructure is different from depending on a vendor's managed service), a Raspberry Pi, or any server on your network.
$ pilotctl rendezvous start --listen :9000 --beacon :9001
Step 2: Install Pilot on Each Device
Pilot compiles to a static binary for Linux (amd64, arm64, arm), macOS, and Windows. For embedded devices, cross-compile from your development machine:
# Cross-compile for ARM (Raspberry Pi, IoT devices)
$ GOOS=linux GOARCH=arm64 go build -o pilotctl ./cmd/pilotctl
$ scp pilotctl sensor-device:/usr/local/bin/
Step 3: Initialize and Connect Each Device
# On each device
$ pilotctl init --hostname device-name
$ pilotctl daemon start --registry your-server:9000
Step 4: Establish Trust Between Devices
Devices that need to communicate establish trust through the handshake protocol. For fleet deployments, script the handshake requests (and, if your policy allows, configure auto-approval rules on the receiving agents so each request does not need manual confirmation):
# On the controller: request trust with each sensor
$ pilotctl handshake sensor-1 "Fleet enrollment"
$ pilotctl handshake sensor-2 "Fleet enrollment"
# Or script it for the entire fleet
for sensor in $(pilotctl --json peers --search "sensor" | jq -r '.[].hostname'); do
pilotctl handshake "$sensor" "Automated fleet enrollment"
done
Step 5: Decommission the Cloud Service
Once all devices are communicating over Pilot, disable the cloud service. Your devices are now self-hosted. The rendezvous server runs on your hardware, the communication is peer-to-peer, and there is no monthly bill, no vendor dependency, and no risk of service shutdown.
If the rendezvous server is on your local network but your devices need to communicate across the internet, ensure the rendezvous server has a publicly reachable IP (or run just the rendezvous on a cheap cloud VM). The server is lightweight enough that a $5/month VM can serve thousands of devices.
The Ownership Principle
The devices you buy should work for as long as the hardware functions. Not for as long as the vendor stays in business. Not for as long as the vendor's cloud service runs. Not for as long as the vendor's terms of service allow.
Self-hosting your agent network with Pilot Protocol means you own the infrastructure. The software is open source (MIT license). The binary is self-contained. The state is a JSON file you can back up with cp. There is no vendor that can shut down your deployment by shutting down their servers.
Insteon's customers learned this lesson the hard way. Wemo's customers are learning it now. The next cloud shutdown -- and there will be a next one -- does not have to affect you.
For a complete deployment guide with high availability, monitoring, and fleet management, see Building a Private Agent Network for Your Company. For the security model that protects your self-hosted network, see How to Secure AI Agent Communication With Zero Trust. For a step-by-step quickstart, see Build a Multi-Agent Network in 5 Minutes.
Try Pilot Protocol
One binary, five minutes, zero cloud dependency. Self-host your agent network and own your infrastructure. No accounts, no vendor lock-in, no service shutdowns.
View on GitHub
Pilot Protocol