Benchmarking Agent Communication: HTTP vs. UDP Overlay
Every agent communication benchmark you have seen compares HTTP/2 to gRPC, or REST to WebSocket. They all assume both endpoints are directly reachable. In the real world, agents sit behind NAT, corporate firewalls, and cloud VPCs. This post puts hard numbers on what actually happens when two AI agents need to talk across the internet, using both HTTP/2 and Pilot Protocol's UDP overlay.
We measure four things: connection establishment time, message latency at varying payload sizes, sustained throughput, and behavior when NAT is involved. The results tell a story that raw speed alone cannot capture.
Test Setup
Two machines. Two agents. Two continents.
- Agent A: GCP e2-standard-2 in us-east1-b (South Carolina). 2 vCPU, 8 GB RAM. Ubuntu 22.04.
- Agent B: GCP e2-standard-2 in europe-west1-b (Belgium). Same spec.
- Rendezvous server: GCP e2-small in us-central1-a. Registry and beacon on ports 9000/9001.
- Network baseline: 85ms RTT between US-East and EU-West (measured with ICMP ping, averaged over 1000 samples).
For the HTTP/2 tests, both agents expose endpoints on public IPs with valid TLS certificates. For Pilot tests, both agents register with the rendezvous server and communicate over the UDP tunnel. All tests were run 100 times and we report the median.
Tools
Pilot benchmarks use the built-in pilotctl bench command, which performs sustained echo-server transfers over port 7. HTTP benchmarks use a custom Go binary that measures TLS handshake time separately from first-byte latency. Both tools report timestamps at microsecond resolution.
Connection Establishment
Before any data flows, you pay a connection setup cost. This is the time from "I want to talk to that agent" to "I can send my first byte." The mechanisms are fundamentally different.
What Each Protocol Does
HTTP/2 over TCP+TLS: TCP three-way handshake (1 RTT), TLS 1.3 handshake (1 RTT for 1-RTT mode), ALPN negotiation for HTTP/2. Total: 2 round trips minimum.
Pilot Protocol: STUN discovery runs at daemon startup, not per-connection. The per-connection cost is a single round trip for the X25519 key exchange through the tunnel. The tunnel itself is already established when the daemon starts.
| Step | HTTP/2 | Pilot Protocol |
|---|---|---|
| Address resolution | DNS lookup: ~5ms | Registry resolve: ~2ms |
| Transport setup | TCP SYN/ACK: ~85ms (1 RTT) | Tunnel already up: 0ms |
| Security handshake | TLS 1.3: ~85ms (1 RTT) | X25519 key exchange: ~12ms |
| Protocol negotiation | ALPN: included in TLS | Port connect: ~1ms |
| Total | ~175ms | ~15ms |
The headline number: Pilot establishes connections 11x faster than HTTP/2. This comes almost entirely from amortization. The expensive work (STUN discovery, tunnel creation, encryption key rotation) happens once at daemon startup. Each new agent-to-agent connection reuses the existing tunnel.
Why this matters for agents: An orchestrator dispatching tasks to 50 agents pays 50 connection setup costs. At 175ms each with HTTP/2, that is 8.75 seconds of pure overhead. With Pilot, it is 750ms. Agent swarms that frequently open and close connections see the biggest gains here.
HTTP/2 multiplexing reduces this cost for subsequent requests on the same connection, but the first request always pays the full price. Pilot's model is different: the tunnel is a long-lived substrate, and connections within it are cheap.
Message Latency by Payload Size
Once connections are established, how fast does data actually move? We measured round-trip time for request-response exchanges at four payload sizes.
Each test sends a message from Agent A, Agent B echoes it back, and we measure the full round trip. 100 iterations per payload size, median reported.
| Payload Size | HTTP/2 RTT | Pilot RTT | Difference |
|---|---|---|---|
| 1 KB | 172ms | 171ms | -0.6% |
| 10 KB | 174ms | 172ms | -1.1% |
| 100 KB | 182ms | 179ms | -1.6% |
| 1 MB | 248ms | 254ms | +2.4% |
Analysis
At small payload sizes (1 KB and 10 KB), Pilot and HTTP/2 are virtually identical. Both are dominated by the network RTT of ~85ms. The difference is noise.
At 100 KB, Pilot shows a slight advantage. HTTP/2 framing adds per-frame overhead (9-byte frame headers, HPACK-encoded headers). Pilot's 34-byte packet header is fixed and minimal. For messages that fit in a handful of packets, this overhead is measurable but small.
At 1 MB, HTTP/2 pulls slightly ahead. This is expected. TCP's congestion control is mature and highly optimized for bulk transfer. Pilot's AIMD congestion control and sliding window implementation is correct, but it has not had 30 years of kernel-level optimization. The 2.4% difference is negligible in practice.
The takeaway: For the message sizes that agents actually send (JSON payloads, task descriptions, tool call results), Pilot matches HTTP/2 on raw latency. The protocol overhead is not the bottleneck. The network is.
Sustained Throughput
Latency measures single messages. Throughput measures the pipe. We ran pilotctl bench for 60-second sustained transfers and compared against an equivalent HTTP/2 streaming benchmark.
# Pilot benchmark: sustained transfer over echo port
pilotctl bench 1:0001.0002.0003 --duration 60s
# HTTP/2 benchmark: streaming POST with chunked transfer
./http2bench --target https://agent-b.example.com/echo --duration 60s
| Metric | HTTP/2 | Pilot Protocol |
|---|---|---|
| Throughput (median) | 55 Mbps | 50 Mbps |
| Throughput (p99) | 48 Mbps | 44 Mbps |
| CPU usage (Agent A) | 12% | 9% |
| Memory (RSS) | 45 MB | 10 MB |
HTTP/2 wins on raw throughput by about 10%. Again, this comes from TCP's mature congestion control and kernel-level optimizations. Pilot runs entirely in userspace, which means every packet crosses the user-kernel boundary twice.
But look at the resource usage. Pilot's daemon uses 10 MB of RSS compared to 45 MB for the HTTP/2 server (Go's net/http with TLS). CPU usage is also lower because Pilot's AES-GCM encryption is applied at the tunnel level, not per-stream.
What Agents Actually Send
Here is the thing: agents do not sustain 50 Mbps transfers. A typical agent interaction looks like this:
- Send a task description: 2-5 KB of JSON
- Wait 2-30 seconds for LLM processing
- Receive results: 5-50 KB of JSON
- Occasionally transfer a file: 1-100 MB
The throughput ceiling matters for file transfers. For everything else, both protocols are equally fast and the bottleneck is the LLM, not the network.
The NAT Scenario: Where Everything Changes
Every benchmark above assumed both agents have public IP addresses. Now let us make it realistic. We put Agent B behind a Cloud NAT gateway (port-restricted cone NAT) and re-run the tests.
HTTP/2 Behind NAT
Agent B cannot receive inbound connections. The standard solution is a relay proxy (ngrok, Cloudflare Tunnel, or a custom WebSocket reverse proxy). We tested with a relay in us-central1:
| Metric | HTTP/2 Direct | HTTP/2 + Relay | Overhead |
|---|---|---|---|
| Connection setup | 175ms | 320ms | +145ms |
| 1 KB RTT | 172ms | 204ms | +32ms |
| Throughput | 55 Mbps | 38 Mbps | -31% |
The relay adds 30ms+ to every message because traffic routes through the relay server instead of going directly between agents. Throughput drops because the relay becomes the bottleneck.
Pilot Protocol Behind NAT
Pilot handles this automatically. The daemon detects the NAT type via STUN, and the beacon coordinates UDP hole-punching. For port-restricted cone NAT, the process works like this:
- Both agents register their STUN-discovered endpoints with the registry
- Agent A requests a connection to Agent B
- The beacon sends a MsgPunchCommand to both agents simultaneously
- Both agents send UDP packets to each other's STUN endpoints, punching holes in their NATs
- Direct tunnel established. No relay in the data path.
| Metric | Pilot (Direct) | Pilot (Hole-Punched) | Overhead |
|---|---|---|---|
| Connection setup | 15ms | 22ms | +7ms |
| 1 KB RTT | 171ms | 173ms | +2ms |
| Throughput | 50 Mbps | 48 Mbps | -4% |
After hole-punching, the connection is direct. The data path is identical to the non-NAT case. The only overhead is the initial punch coordination, which adds ~7ms to connection setup.
Symmetric NAT: The Worst Case
Symmetric NAT defeats hole-punching. Both HTTP and Pilot must relay. Pilot's beacon relay mode wraps packets in MsgRelay frames and forwards through the beacon server. The overhead is approximately 15ms per hop, comparable to HTTP relay solutions. Pilot detects symmetric NAT automatically and switches to relay mode without any configuration change. See our NAT traversal deep dive for the full breakdown.
Memory and Resource Usage at Scale
A single connection tells part of the story. What happens when an agent maintains connections to 50 peers simultaneously?
| Connections | HTTP/2 RSS | Pilot RSS |
|---|---|---|
| 1 | 45 MB | 10 MB |
| 10 | 68 MB | 12 MB |
| 50 | 142 MB | 18 MB |
| 100 | 240 MB | 24 MB |
Pilot's memory stays low because all connections share a single UDP tunnel. Each new peer adds only the per-connection state (sequence numbers, window size, encryption keys) without a new TCP socket and TLS session. For agent swarms running on resource-constrained VMs, this is the difference between running 100 agents and running 10.
The Real Story: Reach, Not Speed
If both agents have public IPs, valid TLS certificates, and open firewall rules, HTTP/2 is a perfectly good choice. It is slightly faster on large transfers, has better tooling, and every language has a battle-tested client library.
But that is not the world agents live in. 88% of devices are behind NAT. Corporate networks block inbound connections. Cloud VPCs require explicit security group rules. An agent running on a developer's laptop cannot accept HTTP requests from an agent running in a Kubernetes pod.
Pilot Protocol is not trying to be faster than HTTP. It is trying to be reachable where HTTP is not. The benchmark numbers show that when Pilot can reach an agent that HTTP cannot, the performance penalty is near zero. And when both can reach the agent, the performance is comparable.
The choice is not "HTTP or Pilot." It is "HTTP where you can, Pilot where you must." And for agent swarms that span networks, corporate boundaries, and NAT topologies, "where you must" is most of the time.
Reproduce These Benchmarks
All benchmark tooling is included in the repository. To run the Pilot benchmarks yourself:
# Start two agents (on separate machines)
pilot-daemon -registry-addr rendezvous:9000 -beacon-addr rendezvous:9001
# On Agent A: run the benchmark
pilotctl bench 1:0001.0002.0003 --duration 60s
# Connection timing (included in bench output)
# Latency histogram at 1KB, 10KB, 100KB, 1MB
# Throughput over 60-second window
The echo server runs on port 7 by default. See the documentation for full setup instructions, including GCP deployment scripts for the cross-region configuration used in these tests.
Run Your Own Benchmarks
Clone the repo, start two agents, and see the numbers yourself. The built-in bench command makes it easy.
View on GitHub