Benchmarking Agent Communication: HTTP vs. UDP Overlay
Every agent communication benchmark you have seen compares HTTP/2 to gRPC, or REST to WebSocket. They all assume both endpoints are directly reachable. In the real world, agents sit behind NAT, corporate firewalls, and cloud VPCs. This post puts hard numbers on what actually happens when two AI agents need to talk across the internet, using both HTTP/2 and Pilot Protocol's UDP overlay.
We measure four things: connection establishment time, message latency at varying payload sizes, sustained throughput, and behavior when NAT is involved. The results tell a story that raw speed alone cannot capture.
Test Setup
Two machines. Two agents. Two continents.
- Agent A: GCP e2-standard-2 in us-east1-b (South Carolina). 2 vCPU, 8 GB RAM. Ubuntu 22.04.
- Agent B: GCP e2-standard-2 in europe-west1-b (Belgium). Same spec.
- Rendezvous server: GCP e2-small in us-central1-a. Registry and beacon on ports 9000/9001.
- Network baseline: 85ms RTT between US-East and EU-West (measured with ICMP ping, averaged over 1000 samples).
For the HTTP/2 tests, both agents expose endpoints on public IPs with valid TLS certificates. For Pilot tests, both agents register with the rendezvous server and communicate over the UDP tunnel. All tests were run 100 times and we report the median.
Tools
Pilot benchmarks use the built-in pilotctl bench command, which performs sustained echo-server transfers over port 7. HTTP benchmarks use a custom Go binary that measures TLS handshake time separately from first-byte latency. Both tools report timestamps at microsecond resolution.
Connection Establishment
Before any data flows, you pay a connection setup cost. This is the time from "I want to talk to that agent" to "I can send my first byte." The mechanisms are fundamentally different.
What Each Protocol Does
HTTP/2 over TCP+TLS: TCP three-way handshake (1 RTT), TLS 1.3 handshake (1 RTT for 1-RTT mode), ALPN negotiation for HTTP/2. Total: 2 round trips minimum.
Pilot Protocol: STUN discovery runs at daemon startup, not per-connection. The per-connection cost is a single round trip for the X25519 key exchange through the tunnel. The tunnel itself is already established when the daemon starts.
| Step | HTTP/2 | Pilot Protocol |
|---|---|---|
| Address resolution | DNS lookup: ~5ms | Registry resolve: ~2ms |
| Transport setup | TCP SYN/ACK: ~85ms (1 RTT) | Tunnel already up: 0ms |
| Security handshake | TLS 1.3: ~85ms (1 RTT) | X25519 key exchange: ~12ms |
| Protocol negotiation | ALPN: included in TLS | Port connect: ~1ms |
| Total | ~175ms | ~15ms |
The headline number: Pilot establishes connections 11x faster than HTTP/2. This comes almost entirely from amortization. The expensive work (STUN discovery, tunnel creation, encryption key rotation) happens once at daemon startup. Each new agent-to-agent connection reuses the existing tunnel.
Why this matters for agents: An orchestrator dispatching tasks to 50 agents pays 50 connection setup costs. At 175ms each with HTTP/2, that is 8.75 seconds of pure overhead. With Pilot, it is 750ms. Agent swarms that frequently open and close connections see the biggest gains here.
HTTP/2 multiplexing reduces this cost for subsequent requests on the same connection, but the first request always pays the full price. Pilot's model is different: the tunnel is a long-lived substrate, and connections within it are cheap.
Message Latency by Payload Size
Once connections are established, how fast does data actually move? We measured round-trip time for request-response exchanges at four payload sizes.
Each test sends a message from Agent A, Agent B echoes it back, and we measure the full round trip. 100 iterations per payload size, median reported.
| Payload Size | HTTP/2 RTT | Pilot RTT | Difference |
|---|---|---|---|
| 1 KB | 172ms | 171ms | -0.6% |
| 10 KB | 174ms | 172ms | -1.1% |
| 100 KB | 182ms | 179ms | -1.6% |
| 1 MB | 248ms | 254ms | +2.4% |
Analysis
At small payload sizes (1 KB and 10 KB), Pilot and HTTP/2 are virtually identical. Both are dominated by the network RTT of ~85ms. The difference is noise.
At 100 KB, Pilot shows a slight advantage. HTTP/2 framing adds per-frame overhead (9-byte frame headers, HPACK-encoded headers). Pilot's 34-byte packet header is fixed and minimal. For messages that fit in a handful of packets, this overhead is measurable but small.
At 1 MB, HTTP/2 pulls slightly ahead. This is expected. TCP's congestion control is mature and highly optimized for bulk transfer. Pilot's AIMD congestion control and sliding window implementation is correct, but it has not had 30 years of kernel-level optimization. The 2.4% difference is negligible in practice.
The takeaway: For the message sizes that agents actually send (JSON payloads, task descriptions, tool call results), Pilot matches HTTP/2 on raw latency. The protocol overhead is not the bottleneck. The network is.
Sustained Throughput
Latency measures single messages. Throughput measures the pipe. We ran pilotctl bench for 60-second sustained transfers and compared against an equivalent HTTP/2 streaming benchmark.
# Pilot benchmark: sustained transfer over echo port
pilotctl bench 1:0001.0002.0003 --duration 60s
# HTTP/2 benchmark: streaming POST with chunked transfer
./http2bench --target https://agent-b.example.com/echo --duration 60s
| Metric | HTTP/2 | Pilot Protocol |
|---|---|---|
| Throughput (median) | 55 Mbps | 50 Mbps |
| Throughput (p99) | 48 Mbps | 44 Mbps |
| CPU usage (Agent A) | 12% | 9% |
| Memory (RSS) | 45 MB | 10 MB |
HTTP/2 wins on raw throughput by about 10%. Again, this comes from TCP's mature congestion control and kernel-level optimizations. Pilot runs entirely in userspace, which means every packet crosses the user-kernel boundary twice.
But look at the resource usage. Pilot's daemon uses 10 MB of RSS compared to 45 MB for the HTTP/2 server (Go's net/http with TLS). CPU usage is also lower because Pilot's AES-GCM encryption is applied at the tunnel level, not per-stream.
What Agents Actually Send
Here is the thing: agents do not sustain 50 Mbps transfers. A typical agent interaction looks like this:
- Send a task description: 2-5 KB of JSON
- Wait 2-30 seconds for LLM processing
- Receive results: 5-50 KB of JSON
- Occasionally transfer a file: 1-100 MB
The throughput ceiling matters for file transfers. For everything else, both protocols are equally fast and the bottleneck is the LLM, not the network.
The NAT Scenario: Where Everything Changes
Every benchmark above assumed both agents have public IP addresses. Now let us make it realistic. We put Agent B behind a Cloud NAT gateway (port-restricted cone NAT) and re-run the tests.
HTTP/2 Behind NAT
Agent B cannot receive inbound connections. The standard solution is a relay proxy (ngrok, Cloudflare Tunnel, or a custom WebSocket reverse proxy). We tested with a relay in us-central1:
| Metric | HTTP/2 Direct | HTTP/2 + Relay | Overhead |
|---|---|---|---|
| Connection setup | 175ms | 320ms | +145ms |
| 1 KB RTT | 172ms | 204ms | +32ms |
| Throughput | 55 Mbps | 38 Mbps | -31% |
The relay adds 30ms+ to every message because traffic routes through the relay server instead of going directly between agents. Throughput drops because the relay becomes the bottleneck.
Pilot Protocol Behind NAT
Pilot handles this automatically. The daemon detects the NAT type via STUN, and the beacon coordinates UDP hole-punching. For port-restricted cone NAT, the process works like this:
- Both agents register their STUN-discovered endpoints with the registry
- Agent A requests a connection to Agent B
- The beacon sends a MsgPunchCommand to both agents simultaneously
- Both agents send UDP packets to each other's STUN endpoints, punching holes in their NATs
- Direct tunnel established. No relay in the data path.
| Metric | Pilot (Direct) | Pilot (Hole-Punched) | Overhead |
|---|---|---|---|
| Connection setup | 15ms | 22ms | +7ms |
| 1 KB RTT | 171ms | 173ms | +2ms |
| Throughput | 50 Mbps | 48 Mbps | -4% |
After hole-punching, the connection is direct. The data path is identical to the non-NAT case. The only overhead is the initial punch coordination, which adds ~7ms to connection setup.
Symmetric NAT: The Worst Case
Symmetric NAT defeats hole-punching. Both HTTP and Pilot must relay. Pilot's beacon relay mode wraps packets in MsgRelay frames and forwards through the beacon server. The overhead is approximately 15ms per hop, comparable to HTTP relay solutions. Pilot detects symmetric NAT automatically and switches to relay mode without any configuration change. See our NAT traversal deep dive for the full breakdown.
Memory and Resource Usage at Scale
A single connection tells part of the story. What happens when an agent maintains connections to 50 peers simultaneously?
| Connections | HTTP/2 RSS | Pilot RSS |
|---|---|---|
| 1 | 45 MB | 10 MB |
| 10 | 68 MB | 12 MB |
| 50 | 142 MB | 18 MB |
| 100 | 240 MB | 24 MB |
Pilot's memory stays low because all connections share a single UDP tunnel. Each new peer adds only the per-connection state (sequence numbers, window size, encryption keys) without a new TCP socket and TLS session. For agent swarms running on resource-constrained VMs, this is the difference between running 100 agents and running 10.
The Real Story: Reach, Not Speed
If both agents have public IPs, valid TLS certificates, and open firewall rules, HTTP/2 is a perfectly good choice. It is slightly faster on large transfers, has better tooling, and every language has a battle-tested client library.
But that is not the world agents live in. 88% of devices are behind NAT. Corporate networks block inbound connections. Cloud VPCs require explicit security group rules. An agent running on a developer's laptop cannot accept HTTP requests from an agent running in a Kubernetes pod.
Pilot Protocol is not trying to be faster than HTTP. It is trying to be reachable where HTTP is not. The benchmark numbers show that when Pilot can reach an agent that HTTP cannot, the performance penalty is near zero. And when both can reach the agent, the performance is comparable.
The choice is not "HTTP or Pilot." It is "HTTP where you can, Pilot where you must." And for agent swarms that span networks, corporate boundaries, and NAT topologies, "where you must" is most of the time.
Reproduce These Benchmarks
All benchmark tooling is included in the repository. To run the Pilot benchmarks yourself:
# Start two agents (on separate machines)
pilot-daemon -registry-addr rendezvous:9000 -beacon-addr rendezvous:9001
# On Agent A: run the benchmark
pilotctl bench 1:0001.0002.0003 --duration 60s
# Connection timing (included in bench output)
# Latency histogram at 1KB, 10KB, 100KB, 1MB
# Throughput over 60-second window
The echo server runs on port 7 by default. See the documentation for full setup instructions, including GCP deployment scripts for the cross-region configuration used in these tests.
Run Your Own Benchmarks
Clone the repo, start two agents, and see the numbers yourself. The built-in bench command makes it easy.
View on GitHub