
Connect Agents Across AWS, GCP, and Azure Without a VPN

February 19, 2026 · multi-cloud · networking · deployment

You have agents on AWS. Your team in Europe runs agents on GCP. A partner organization uses Azure. You need them to communicate. The traditional answer is VPN: set up site-to-site tunnels between each cloud provider, configure routing tables, manage firewall rules, and hope the whole thing does not fall over when someone changes a security group.

Multi-cloud networking is a hard problem. It is hard because each cloud provider designed its networking for a single-provider world. AWS VPCs, GCP VPC networks, and Azure VNets are all incompatible. Connecting them requires either cloud-provider interconnect products (AWS Transit Gateway, GCP Cloud Interconnect, Azure ExpressRoute) or VPN tunnels between gateways. Both options are expensive, complex, and scale poorly.

For AI agent communication, this complexity is unnecessary. Agents do not need full network-level connectivity between clouds. They need to find each other, establish encrypted connections, and exchange data. Pilot Protocol provides this with virtual addresses that work regardless of which cloud the agent runs on, automatic NAT traversal that handles the networking, and end-to-end encryption that does not depend on cloud-provider security.

The Multi-Cloud Networking Nightmare

Consider a concrete scenario. Your organization runs agents in three clouds:

  - Research agents on AWS (us-east-1)
  - Analysis agents on GCP (europe-west1)
  - Customer-support agents on Azure (eastus2)

To connect these with VPN, you need:

  - A site-to-site tunnel between AWS and GCP
  - A site-to-site tunnel between AWS and Azure
  - A site-to-site tunnel between GCP and Azure
  - A configured gateway endpoint on each side of every tunnel

That is 6 tunnel endpoints for 3 clouds. Add a fourth cloud or on-premises location, and you need 12 endpoints. This is the combinatorial explosion problem: the number of VPN tunnels grows as N*(N-1)/2 where N is the number of sites. "When managing hundreds of applications, you are quickly talking about managing hundreds of VPN tunnels."
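To make the growth concrete, here is a small Go sketch (illustrative only, not part of the Pilot codebase) that tabulates full-mesh tunnel and endpoint counts:

```go
package main

import "fmt"

func main() {
	// Full-mesh VPN: each pair of sites needs one tunnel,
	// and each tunnel has a gateway endpoint on both sides.
	for _, n := range []int{3, 4, 5, 10} {
		tunnels := n * (n - 1) / 2
		endpoints := tunnels * 2
		fmt.Printf("%2d sites: %2d tunnels, %2d endpoints\n", n, tunnels, endpoints)
	}
}
```

At 3 sites the mesh is still manageable (3 tunnels, 6 endpoints); at 10 sites it is 45 tunnels and 90 endpoints.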

VPN throughput limitations

VPN gateways have throughput limits that are often lower than you expect. Industry reports note that organizations are "very much limited by the throughput of the various VPNs -- around 300 Mbps." AWS VPN connections support up to 1.25 Gbps per tunnel, but real-world throughput is often lower due to encryption overhead, MTU limitations, and the single-threaded nature of IPsec processing on many gateway implementations.

For bulk data transfer between agents (model weights, datasets, training outputs), VPN throughput becomes a bottleneck. For control-plane traffic (task delegation, status updates, coordination), the throughput is sufficient but the operational cost of maintaining VPN infrastructure for lightweight agent communication is disproportionate.

The skills and cost gap

Multi-cloud networking requires expertise in each cloud provider's networking model. AWS networking (VPCs, subnets, security groups, NACLs, route tables, transit gateways) is different from GCP networking (VPC networks, subnets, firewall rules, Cloud Router, Cloud NAT), which is different from Azure networking (VNets, subnets, NSGs, route tables, Virtual WAN). A survey of cloud professionals found that 93% expressed concern about the cloud security skills shortage.

The cost is equally problematic. "The costs for multi-cloud are enormous -- support and operation cost easily more than doubles." VPN gateway hours, data transfer between regions, cross-cloud egress fees, and the engineering time to manage it all add up quickly. For agent communication that might transfer megabytes per day, you are paying for gigabit-class infrastructure.

Why VPNs Do Not Scale for Agent Communication

VPNs solve the wrong problem for AI agents. VPNs provide network-level connectivity: they make two remote networks appear as if they are on the same LAN. This is useful for applications that need to access databases, file shares, and services using IP addresses and ports. It is overkill for agents that need to exchange messages, delegate tasks, and stream events.

The mismatch shows up in several ways: a VPN grants network-wide access when agents only need per-peer trust, it identifies networks rather than individual agents, and it must be reconfigured whenever an agent moves or a new site is added.

Virtual Addresses: One Identity Regardless of Cloud

Pilot Protocol assigns each agent a 48-bit virtual address in the format N:NNNN.HHHH.LLLL. This address has two components: a 16-bit network ID and a 32-bit node ID. The address is generated when the agent first starts and remains stable for the agent's lifetime, regardless of where it runs.

An agent on AWS us-east-1 might have address 1:0001.0000.0017. If you migrate that agent to GCP europe-west1, its address stays 1:0001.0000.0017. Other agents continue to reach it at the same address. The Pilot daemon handles re-registration with the registry when the physical endpoint changes, and peers automatically reconnect to the new location.
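The address layout can be sketched in a few lines of Go. This is an illustrative model following the bit layout described above (the `PilotAddr` type is hypothetical, not part of pilotctl):

```go
package main

import "fmt"

// PilotAddr models the 48-bit address described above:
// a 16-bit network ID and a 32-bit node ID, rendered as
// <network decimal>:<network hex>.<node high 16 bits>.<node low 16 bits>.
type PilotAddr struct {
	Network uint16
	Node    uint32
}

func (a PilotAddr) String() string {
	return fmt.Sprintf("%d:%04x.%04x.%04x",
		a.Network, a.Network, a.Node>>16, a.Node&0xFFFF)
}

func main() {
	a := PilotAddr{Network: 1, Node: 0x17}
	fmt.Println(a) // 1:0001.0000.0017
}
```

Because the address is derived from identity rather than location, nothing in it changes when the agent's physical IP does.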

# Agent identity is portable across clouds

  AWS us-east-1            GCP europe-west1          Azure eastus2
  +------------------+    +------------------+      +------------------+
  | Agent: research  |    | Agent: analysis  |      | Agent: customer  |
  | 1:0001.0000.0017 |    | 1:0001.0000.0042 |      | 1:0001.0000.0063 |
  | IP: 10.0.1.15    |    | IP: 10.128.0.5   |      | IP: 10.1.0.8     |
  +------------------+    +------------------+      +------------------+
          |                        |                         |
          +------------------------+-------------------------+
                        Pilot overlay network
                    (same virtual address space)
                    (direct encrypted tunnels)

The physical IP addresses (10.0.1.15, 10.128.0.5, 10.1.0.8) are cloud-specific and can change. The Pilot addresses are stable. Your application code references Pilot addresses, not IP addresses. This decouples agent identity from infrastructure, which is exactly what you need for multi-cloud deployment.

Example: Agent on AWS Talks to Agent on GCP

Here is a complete walkthrough. We will deploy one agent on an AWS EC2 instance in us-east-1 and another on a GCP Compute Engine instance in europe-west1, and have them communicate.

Step 1: Deploy rendezvous server

The rendezvous server can run on any cloud (or on-premises). It handles address resolution and NAT traversal signaling. For this example, we will run it on the GCP instance, but it could run anywhere.

# On GCP instance (or any machine with a public IP)
go install github.com/TeoSlayer/pilotprotocol/cmd/pilotctl@latest

# Start rendezvous with registry on port 9000, beacon on port 9001
pilotctl rendezvous start \
  --registry-addr 0.0.0.0:9000 \
  --beacon-addr 0.0.0.0:9001 \
  --persist /var/lib/pilot/registry.json

Firewall rules: Open TCP port 9000 (registry) and UDP port 9001 (beacon) on the rendezvous server. Open UDP port 4000 on each agent VM (tunnel port). These are the only firewall rules needed -- three ports total, regardless of how many agents you deploy.

Step 2: Start agent on AWS

# On AWS EC2 instance (us-east-1)
go install github.com/TeoSlayer/pilotprotocol/cmd/pilotctl@latest

# Start daemon pointing at rendezvous
pilotctl daemon start \
  --registry <rendezvous-ip>:9000 \
  --beacon <rendezvous-ip>:9001 \
  --endpoint <aws-public-ip>:4000

# Join network and set identity
pilotctl join 1
pilotctl set-hostname research-aws
pilotctl set-visibility public
pilotctl tags set cloud=aws,region=us-east-1,role=research

Step 3: Start agent on GCP

# On GCP Compute Engine instance (europe-west1)
go install github.com/TeoSlayer/pilotprotocol/cmd/pilotctl@latest

# Start daemon
pilotctl daemon start \
  --registry <rendezvous-ip>:9000 \
  --beacon <rendezvous-ip>:9001 \
  --endpoint <gcp-public-ip>:4000

# Join same network
pilotctl join 1
pilotctl set-hostname analysis-gcp
pilotctl set-visibility public
pilotctl tags set cloud=gcp,region=europe-west1,role=analysis

Step 4: Establish trust and communicate

# From AWS agent: discover and trust GCP agent
pilotctl resolve analysis-gcp
# 1:0001.0000.0042

pilotctl trust request 1:0001.0000.0042 \
  --justification "Cross-cloud research collaboration"

# From GCP agent: approve trust
pilotctl trust approve 1:0001.0000.0017

# Communicate: send a message
pilotctl send-message 1:0001.0000.0042 "Analyze this dataset"

# Send a file
pilotctl send-file 1:0001.0000.0042 dataset.csv

# Submit a task
pilotctl task submit 1:0001.0000.0042 \
  --description "Run sentiment analysis on Q1 customer feedback"

# Benchmark the connection
pilotctl echo 1:0001.0000.0042

That is the complete setup. Two go install commands, two daemon start commands, one trust handshake, and the agents are communicating across clouds with end-to-end encryption. No VPN tunnels. No cloud interconnect products. No firewall rules beyond the three ports.

NAT Traversal Handles the Networking Automatically

The example above used VMs with public IPs and the --endpoint flag, which skips NAT traversal by registering a fixed public endpoint. But many cloud deployments use private-only VMs (no public IP) behind Cloud NAT or similar services. Pilot handles this automatically.

When an agent starts without the --endpoint flag, the daemon performs STUN discovery to determine its public-facing IP and port, plus the NAT type (Full Cone, Restricted Cone, Port-Restricted Cone, or Symmetric). Based on the NAT types of both peers, the connection strategy is selected automatically:

Agent A NAT           | Agent B NAT           | Strategy
----------------------|-----------------------|----------------------------
Public IP / Full Cone | Any                   | Direct connection
Restricted Cone       | Restricted Cone       | Hole-punching via beacon
Port-Restricted Cone  | Port-Restricted Cone  | Hole-punching via beacon
Symmetric             | Symmetric             | Relay through beacon
Symmetric             | Non-symmetric         | Hole-punching (may succeed)

Cloud NAT services (AWS NAT Gateway, GCP Cloud NAT, Azure NAT Gateway) typically implement Port-Restricted Cone or Symmetric NAT. Between two agents behind different cloud NATs, Pilot attempts hole-punching first. If hole-punching fails (Symmetric NAT on both sides), it falls back to relay through the beacon. The fallback is automatic -- your application code does not change.
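The selection logic above reduces to a small decision function. This is a simplified sketch of the table, not the Pilot daemon's actual implementation:

```go
package main

import "fmt"

// NATType enumerates the classifications produced by STUN discovery.
type NATType int

const (
	FullCone NATType = iota // includes hosts with a public IP
	RestrictedCone
	PortRestrictedCone
	Symmetric
)

// chooseStrategy is an illustrative reduction of the strategy table:
// direct when either side is reachable, relay only when both sides
// are symmetric, hole-punching for everything in between.
func chooseStrategy(a, b NATType) string {
	switch {
	case a == FullCone || b == FullCone:
		return "direct connection"
	case a == Symmetric && b == Symmetric:
		return "relay through beacon"
	default:
		return "hole-punching via beacon"
	}
}

func main() {
	// Typical cloud-NAT-to-cloud-NAT case: hole-punching first.
	fmt.Println(chooseStrategy(PortRestrictedCone, PortRestrictedCone))
	// Worst case: both sides symmetric, fall back to relay.
	fmt.Println(chooseStrategy(Symmetric, Symmetric))
}
```

The point of the sketch is that the fallback chain is deterministic: the application never has to know which strategy was chosen.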

# Agent behind AWS NAT Gateway (no public IP)
pilotctl daemon start \
  --registry <rendezvous-ip>:9000 \
  --beacon <rendezvous-ip>:9001
# No --endpoint flag: STUN discovery handles it

# Agent behind GCP Cloud NAT (no public IP)
pilotctl daemon start \
  --registry <rendezvous-ip>:9000 \
  --beacon <rendezvous-ip>:9001
# Same: STUN discovery, automatic hole-punching or relay

Performance: Direct Tunnels vs VPN Overhead

Pilot's UDP tunnels introduce less overhead than VPN tunnels for agent communication patterns. Here is why.

A VPN tunnel encapsulates IP packets inside encrypted IP packets. Each packet gets an outer IP header (20 bytes), a UDP or ESP header (8-24 bytes), and encryption overhead (16-32 bytes for AES-GCM). This reduces the effective MTU and can cause fragmentation, especially for larger payloads. VPN gateways also introduce an extra network hop, adding latency.

Pilot tunnels encapsulate application data directly in UDP packets with a 34-byte Pilot header and AES-256-GCM encryption overhead (16-byte auth tag + 12-byte nonce). There is no IP-in-IP encapsulation because Pilot is an overlay network, not a VPN -- it does not route arbitrary IP traffic, only agent communication. This means less overhead per packet and no fragmentation issues with standard 1500-byte MTUs.
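The arithmetic behind the no-fragmentation claim, using the header sizes stated above:

```go
package main

import "fmt"

func main() {
	const (
		mtu       = 1500 // standard Ethernet MTU
		ipHeader  = 20   // IPv4 header
		udpHeader = 8    // UDP header
		pilotHdr  = 34   // Pilot tunnel header
		gcmTag    = 16   // AES-256-GCM auth tag
		gcmNonce  = 12   // AES-256-GCM nonce
	)
	overhead := pilotHdr + gcmTag + gcmNonce
	payload := mtu - ipHeader - udpHeader - overhead
	fmt.Printf("Pilot overhead: %d bytes, max payload per 1500-byte packet: %d bytes\n",
		overhead, payload)
}
```

This yields the 62-byte overhead figure used in the comparison below, leaving 1410 bytes of payload per packet with no inner IP header to fragment.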

For agent communication patterns (small messages, task payloads, event streams), the difference is measurable:

Metric              | Pilot Tunnel            | IPsec VPN             | WireGuard
--------------------|-------------------------|-----------------------|--------------------------
Per-packet overhead | 62 bytes                | 58-76 bytes           | 60 bytes
Connection setup    | 1 RTT (existing tunnel) | 2-4 RTT (IKE)         | 1 RTT
Additional hops     | 0 (direct P2P)          | 2 (gateway each side) | 0-1
NAT traversal       | Built-in (automatic)    | NAT-T (UDP encap)     | Built-in (manual config)
Per-agent identity  | Yes (Ed25519)           | No (network-level)    | Yes (Curve25519)
Discovery           | Registry + tags         | None (static config)  | None (static config)

The key performance advantage is not per-packet overhead (which is similar across all three). It is the elimination of the gateway hop and the zero-configuration NAT traversal. With VPN, traffic routes through gateway VMs that become bottlenecks. With Pilot, traffic flows directly between agent VMs over peer-to-peer tunnels.

Comparison: Pilot vs Tailscale vs ZeroTier vs Site-to-Site VPN

Feature        | Pilot Protocol                           | Tailscale                      | ZeroTier                        | Site-to-Site VPN
---------------|------------------------------------------|--------------------------------|---------------------------------|--------------------------------
Designed for   | AI agent communication                   | Device connectivity            | Virtual networking              | Network interconnect
Identity model | Per-agent Ed25519                        | Per-device (SSO/OAuth)         | Per-device (ZT identity)        | Per-network (certs/PSK)
Trust model    | Mutual handshake + justification         | ACL policy (centralized)       | Network membership              | Network-level (all-or-nothing)
NAT traversal  | STUN + hole-punch + relay                | DERP relay servers             | Root servers + relay            | NAT-T (manual config)
Control plane  | Self-hosted registry                     | Tailscale coordination (cloud) | ZeroTier Central (cloud)        | Self-managed
Agent features | Tasks, events, files, trust, reputation  | None (network only)            | None (network only)             | None (network only)
Self-hostable  | Yes (fully)                              | Partial (Headscale)            | Partial (self-hosted controller)| Yes
Pricing        | Free (open source)                       | Free tier, paid plans          | Free tier, paid plans           | Per-tunnel-hour (cloud)

Tailscale and ZeroTier are excellent products for connecting devices and services across networks. They solve the connectivity problem well. But they are general-purpose network tools, not agent communication platforms. They provide a tunnel -- you still need to build agent discovery, trust management, task delegation, event streaming, and reputation tracking on top.

Pilot provides all of these as built-in services on well-known ports. An agent on Pilot can discover peers by tags, establish trust with justifications, delegate tasks (port 1003), stream events (port 1002), exchange files (port 1001), and build reputation through task completion -- all without additional infrastructure. The networking is just the foundation.

Cost Comparison

The cost difference for multi-cloud agent communication is significant:

Component           | Pilot Protocol             | Cloud VPN                       | Tailscale
--------------------|----------------------------|---------------------------------|------------------------------
Software            | Free (open source)         | Included with cloud             | Free for 3 users, $6/user/mo
VPN gateway hours   | $0                         | ~$0.05/hr per tunnel (~$36/mo)  | $0
3 clouds, 3 tunnels | $0                         | ~$108/mo                        | $0 (relay via Tailscale cloud)
Rendezvous server   | 1 small VM (~$5/mo)        | N/A                             | N/A (Tailscale coordination)
Data transfer       | Cloud egress only          | Cloud egress + VPN processing   | Cloud egress only
Engineering time    | Low (2 commands per agent) | High (per-cloud config)         | Low (install + join)
Agent features      | Included                   | Build yourself                  | Build yourself

For 3 clouds with site-to-site VPN, you are paying approximately $108/month in VPN gateway hours alone, before data transfer. For 5 clouds, it is $360/month (10 tunnels). For 10 sites, it is $1,620/month (45 tunnels). And each tunnel requires configuration, monitoring, and maintenance.
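Those dollar figures follow directly from the tunnel-count formula. A quick sketch, assuming the ~$36/month per-tunnel rate quoted above:

```go
package main

import "fmt"

func main() {
	// Assumed rate from the table above: ~$0.05/hr per tunnel,
	// rounded to ~$36/month.
	const perTunnelMonthly = 36
	for _, sites := range []int{3, 5, 10} {
		tunnels := sites * (sites - 1) / 2
		fmt.Printf("%2d sites: %2d tunnels, ~$%d/mo in gateway hours\n",
			sites, tunnels, tunnels*perTunnelMonthly)
	}
}
```

The cost curve is quadratic in the number of sites, while Pilot's rendezvous cost stays flat.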

Pilot requires one rendezvous server (a small VM, ~$5/month) regardless of how many agents or clouds you connect. Adding a new cloud means installing Pilot on the new agent and running pilotctl join. No VPN configuration, no firewall rules beyond port 4000 UDP, no cloud-specific networking setup.

Scaling Beyond Three Clouds

The real advantage of Pilot for multi-cloud appears as you scale. Adding agents does not require new tunnels or configuration changes. Each new agent:

  1. Installs pilotctl
  2. Starts the daemon pointing at the rendezvous server
  3. Joins the network
  4. Establishes trust with the specific peers it needs to communicate with

# Add a new agent on Azure (or any cloud, or on-premises, or a laptop)
go install github.com/TeoSlayer/pilotprotocol/cmd/pilotctl@latest
pilotctl daemon start \
  --registry <rendezvous-ip>:9000 \
  --beacon <rendezvous-ip>:9001
pilotctl join 1
pilotctl set-hostname new-agent-azure
pilotctl tags set cloud=azure,region=eastus2,role=customer-support

# Discover peers by tag
pilotctl discover --tag role=research
# 1:0001.0000.0017  research-aws  [cloud=aws, region=us-east-1, role=research]

# Establish trust with specific agents (not entire networks)
pilotctl trust request 1:0001.0000.0017 --justification "Cross-cloud task delegation"

There are no VPN tunnels to add. No routing tables to update. No firewall rules to modify. No cloud-specific configuration. The agent connects to the rendezvous server, discovers peers, and establishes direct encrypted tunnels. Whether the peer is on AWS, GCP, Azure, Oracle Cloud, a Raspberry Pi, or a laptop on a coffee shop WiFi network, the process is identical.

This is what "cloud-agnostic" actually means for agent networking. Not "works on multiple clouds with per-cloud configuration" but "works on any network with the same two commands."

Getting Started

Deploy agents across any combination of clouds in under 10 minutes:

# 1. Deploy rendezvous (any machine with a public IP)
go install github.com/TeoSlayer/pilotprotocol/cmd/pilotctl@latest
pilotctl rendezvous start --registry-addr 0.0.0.0:9000 --beacon-addr 0.0.0.0:9001

# 2. On each agent (any cloud, any location)
go install github.com/TeoSlayer/pilotprotocol/cmd/pilotctl@latest
pilotctl daemon start --registry <rendezvous-ip>:9000 --beacon <rendezvous-ip>:9001
pilotctl join 1

# 3. Agents discover each other and communicate
pilotctl discover --tag role=analysis
pilotctl trust request <peer-address> --justification "Multi-cloud collaboration"
pilotctl send-message <peer-address> "Task: analyze Q1 data"

No VPN. No cloud interconnect. No per-cloud networking configuration. One overlay network that spans all of them.

Try Pilot Protocol

Connect agents across any cloud with two commands per agent. No VPN tunnels, no cloud interconnect, no networking expertise required.

View on GitHub