Move Beyond REST: Persistent Connections for Agents

February 26, 2026 architecturereal-timenetworking

REST APIs are fundamentally unidirectional. The client asks, the server responds. If the server has new information for the client, it cannot push it. The client must ask again. And again. Studies on real-world polling systems show that only 1.5% of HTTP polls find new data. The other 98.5% are wasted requests -- burning bandwidth, CPU cycles, and money to learn that nothing changed.

For agents that need real-time coordination, REST polling is architectural debt. WebSocket is the common upgrade path, but it brings its own scaling nightmares. MQTT and gRPC streaming help in specific contexts but carry trade-offs that do not fit agent-to-agent communication. This article examines what agents actually need from a connection model, why existing options fall short, and how persistent overlay connections solve the problems without the operational pain.

REST Polling Waste

Consider a common agent pattern: Agent A produces tasks, Agent B consumes them. With REST, Agent B polls Agent A's API every second:

# Agent B polling loop (typical implementation)
while True:
    response = requests.get("https://agent-a.example.com/api/tasks/pending")
    if response.json()["tasks"]:
        process_tasks(response.json()["tasks"])
    time.sleep(1)  # Poll every second

If Agent A produces a new task every 60 seconds on average, Agent B makes 60 requests to find 1 task. That is a 1.7% hit rate. Scale this to 100 agents polling each other and you have 6,000 requests per minute generating useful data 1.7% of the time. The other 5,900 requests are pure waste.

You can reduce polling frequency, but that increases latency. Poll every 10 seconds and you save 90% of requests but add up to 10 seconds of delay before Agent B notices a new task. For real-time agent coordination -- where agents negotiate, delegate, and react in sub-second timeframes -- this is unacceptable.

The fundamental problem: REST is a request-response protocol. Agent communication is a conversation. Conversations require both parties to speak when they have something to say, not only when asked.

WebSocket Scaling Pain

WebSocket solves the bidirectional problem. Both sides can send messages at any time. The connection stays open. No polling. But WebSocket introduces three operational nightmares that get worse at scale.

Stateful connections complicate load balancing. Each WebSocket connection is a persistent TCP connection pinned to a specific server. If Agent B connects to Server 3, all subsequent messages must go through Server 3. Standard HTTP load balancers (round-robin, least-connections) break because they assume stateless requests. You need sticky sessions, which means uneven load distribution. If Server 3 goes down, all its WebSocket connections die and must reconnect to other servers, causing a thundering herd.

WebSocket does not natively handle reconnection. When a connection drops -- and it will, due to load balancer timeouts, network changes, or server restarts -- the client must implement reconnection logic: exponential backoff, session resumption, message deduplication, and state recovery. Every WebSocket client reimplements this differently, and subtle bugs (duplicate messages, lost messages, infinite reconnection loops) are endemic.

Scaling challenges grow non-linearly. With 10 agents, you have up to 45 possible WebSocket connections (N*(N-1)/2). With 100 agents, you have 4,950. With 1,000 agents, you have 499,500. Each connection consumes a file descriptor, a TCP socket buffer, and memory for the connection state. Server-side WebSocket at scale requires careful resource management, connection pooling, and dedicated infrastructure that does not come for free.

WebSocket also assumes a client-server topology. One side listens, the other connects. For agent-to-agent communication where both sides are peers, you need both sides to be servers, which means both need public IPs or a relay. This brings us back to the infrastructure problem REST already had.

What Agents Actually Need

Agent communication has specific requirements that differ from browser-to-server web traffic:

Bidirectional. Both agents must be able to initiate messages at any time. Not request-response. Not client-server. Peer to peer.
Persistent. The connection should stay alive across idle periods, network changes, and transient failures. Agents run for hours or days, not seconds.
NAT-traversing. Agents run on laptops behind WiFi routers, cloud VMs with security groups, edge devices on cellular. They cannot accept inbound TCP connections without port forwarding.
Encrypted. Agent messages contain task data, credentials, model outputs, and coordination signals. Plaintext is not acceptable, but managing TLS certificates per agent is impractical.
Lightweight. Adding a connection to another agent should not require deploying a message broker, configuring a WebSocket gateway, or managing connection pooling infrastructure.

No single existing protocol hits all five. REST misses bidirectional and persistent. WebSocket misses NAT-traversing and lightweight. gRPC streaming misses NAT-traversing. MQTT hits most but requires a broker (not lightweight for small deployments) and does not do NAT traversal.

Pilot Connections: Stateful, Encrypted, Auto-Reconnecting

Pilot Protocol connections are persistent UDP tunnels between agents. When Agent A connects to Agent B, the following happens once:

Agent A resolves Agent B's virtual address through the registry.
NAT traversal negotiates a path (direct, hole-punched, or relayed).
X25519 key exchange establishes a shared secret.
AES-256-GCM encryption begins on all subsequent data.

After this one-time setup (~200ms), the tunnel stays open. Keepalive probes every 30 seconds maintain the NAT mapping. If a probe fails, the tunnel automatically re-establishes. If the network changes (WiFi to cellular, IP rebind), the tunnel detects the change and reconnects.

Both sides can send data at any time. There is no client or server. The connection is peer-to-peer and bidirectional by default.

# Agent A sends a message to Agent B
pilotctl send-message 1:0001.0002.0001 --data '{"task": "analyze", "payload": "..."}'

# Agent B sends a message to Agent A (same tunnel, reverse direction)
pilotctl send-message 1:0001.0001.0001 --data '{"result": "complete", "output": "..."}'

No polling. No WebSocket reconnection logic. No load balancer configuration. The tunnel handles persistence, encryption, NAT traversal, and bidirectional communication as a single primitive.

Comparison: REST vs WebSocket vs gRPC vs MQTT vs Pilot

Property	REST	WebSocket	gRPC Stream	MQTT	Pilot
Direction	Unidirectional	Bidirectional	Bidirectional	Pub/Sub	Bidirectional
Persistent	No (per request)	Yes (fragile)	Yes (fragile)	Yes	Yes (keepalive)
NAT traversal	No	No	No	No (needs broker)	Yes (automatic)
Encryption	TLS (configured)	TLS (configured)	TLS (configured)	TLS (configured)	AES-256-GCM (built-in)
Auto-reconnect	N/A	DIY	DIY	Library-dependent	Built-in
Topology	Client-server	Client-server	Client-server	Star (broker)	Peer-to-peer
Broker required	No	No	No	Yes	No
Peer discovery	DNS/config	DNS/config	DNS/config	Topic-based	Tag-based registry
Idle overhead	Polling cost	TCP keepalive	HTTP/2 ping	MQTT keepalive	UDP probe (30s)
Memory per conn	0 (stateless)	~8KB (TCP buffer)	~8KB (TCP buffer)	~4KB (broker side)	~2KB (UDP state)

The key differentiator is the combination. Many protocols are bidirectional (WebSocket, gRPC). Some are persistent (MQTT). None of the standard options traverse NAT without additional infrastructure. Pilot combines all five properties -- bidirectional, persistent, NAT-traversing, encrypted, and lightweight -- in a single connection primitive.

Code Example: Bidirectional Agent Messaging

Here is a Go example of two agents with a persistent bidirectional connection. Agent A sends tasks, Agent B sends results, and either side can initiate at any time.

package main

import (
    "encoding/json"
    "fmt"
    "time"

    "github.com/pilot-protocol/pilotprotocol/pkg/driver"
)

type Message struct {
    Type    string          `json:"type"`
    Payload json.RawMessage `json:"payload"`
    Ts      int64           `json:"ts"`
}

// Agent A: sends tasks and receives results
func runAgentA() {
    d, _ := driver.Connect()
    stream, _ := d.OpenEventStream()

    // Subscribe to results from Agent B
    results, _ := stream.Subscribe("agent-b.results")

    // Handle incoming results in background
    go func() {
        for event := range results {
            var msg Message
            json.Unmarshal(event.Data, &msg)
            fmt.Printf("[A] Received result: %s\n", string(msg.Payload))
        }
    }()

    // Send tasks periodically
    for i := 0; ; i++ {
        task := Message{
            Type:    "task",
            Payload: json.RawMessage(fmt.Sprintf(`{"id":%d,"work":"analyze dataset %d"}`, i, i)),
            Ts:      time.Now().Unix(),
        }
        data, _ := json.Marshal(task)
        stream.Publish("agent-a.tasks", data)
        fmt.Printf("[A] Sent task %d\n", i)
        time.Sleep(10 * time.Second)
    }
}

// Agent B: receives tasks and sends results
func runAgentB() {
    d, _ := driver.Connect()
    stream, _ := d.OpenEventStream()

    // Subscribe to tasks from Agent A
    tasks, _ := stream.Subscribe("agent-a.tasks")

    for event := range tasks {
        var msg Message
        json.Unmarshal(event.Data, &msg)
        fmt.Printf("[B] Received task: %s\n", string(msg.Payload))

        // Process the task
        time.Sleep(2 * time.Second) // Simulate work

        // Send result back (B initiates, not responding to a request)
        result := Message{
            Type:    "result",
            Payload: json.RawMessage(`{"status":"complete","confidence":0.95}`),
            Ts:      time.Now().Unix(),
        }
        data, _ := json.Marshal(result)
        stream.Publish("agent-b.results", data)
        fmt.Printf("[B] Sent result\n")
    }
}

Both agents can send messages at any time. Agent B does not wait to be polled. Agent A does not need to hold the connection open. The event stream on port 1002 handles the persistent, bidirectional channel. If either agent restarts, it resubscribes and picks up new events immediately.

The CLI equivalent for quick testing:

# Terminal 1: Agent A subscribes to results
pilotctl subscribe  "agent-b.results"

# Terminal 2: Agent B subscribes to tasks
pilotctl subscribe  "agent-a.tasks"

# Terminal 3: Agent A publishes a task
pilotctl publish  agent-a.tasks --data '{"id":1,"work":"analyze dataset"}'

# Terminal 4: Agent B publishes a result
pilotctl publish  agent-b.results --data '{"status":"complete","confidence":0.95}'

When REST Is Still the Right Choice

Persistent connections are not always better. REST remains the right choice in these scenarios:

Public APIs. If external clients consume your agent's output via HTTP, keep the REST API. Pilot is for agent-to-agent communication, not browser-to-agent.
Stateless operations. If an agent provides a pure function (input in, output out, no ongoing conversation), REST's request-response model is the natural fit. Do not force persistence where it is not needed.
Existing infrastructure. If you have API gateways, rate limiters, authentication middleware, and monitoring built around REST, replacing it with persistent connections means rebuilding all of that. Not worth it for marginal gains.
Low-frequency communication. If two agents exchange messages once per hour, a persistent connection is wasted resources. An HTTP call once per hour is fine.
Interoperability. REST is universal. Every language, framework, and platform speaks HTTP. If your agents need to talk to systems you do not control, REST is the lingua franca.

The question is not "REST or Pilot" but "which communication patterns benefit from persistent connections?" Polling loops, real-time coordination, streaming data, and bidirectional conversations are candidates for migration. CRUD operations, webhooks, and occasional queries are fine on REST.

Migration Path: Pilot Alongside REST

You do not need to replace your REST APIs. The practical migration path is to run Pilot alongside them and selectively move communication patterns that benefit from persistence.

# Step 1: Install Pilot on your agents
curl -fsSL https://pilotprotocol.network/install.sh | sh
pilotctl daemon start --email [email protected]

# Step 2: Keep your existing REST API running
# Agents continue to serve HTTP on their public endpoints

# Step 3: Add Pilot for specific patterns
# Replace polling loops with event subscriptions
pilotctl subscribe  "tasks.new"  # Instead of GET /api/tasks/pending every second

# Step 4: Use Pilot for agent-to-agent coordination
# Keep REST for external API consumers
pilotctl send-message 1:0001.0002.0001 --data '{"action":"delegate","task_id":"abc"}'

This is not an all-or-nothing migration. You can start with one polling loop, replace it with an event subscription, and measure the improvement. If it works, migrate the next pattern. If it does not, keep REST for that case.

The Pilot daemon runs alongside your existing services, using 10 MB of memory. It does not interfere with your HTTP servers, does not require port changes, and does not need new firewall rules. It is an additional communication layer, not a replacement for your existing stack.

For agent architectures that are growing past the point where REST polling is efficient -- where you are adding WebSocket servers, connection managers, and reconnection logic to work around HTTP's limitations -- the persistent tunnel model eliminates that complexity by making persistence, bidirectionality, and NAT traversal properties of the connection itself, rather than features you build on top of a protocol that was not designed for them.