Nobody Likes Lag: How to Make Low-Latency Dev Sandboxes

Nobody likes lag, especially not developers. When you type a character in a terminal or editor, you expect it to appear instantly. At Compyle, we spin up ephemeral cloud dev environments where an agent and a user share an IDE + terminal. So how can we make a remote sandbox feel local?

TL;DR: If you want low latency sandboxes, cut out the middlemen and put your servers next to your users.

The Naive Approach

Here is what our initial architecture looked like:

[Figure: naive architecture diagram]

The flow would be something like:

  1. User starts task
  2. We provision a new sandbox in our primary region
  3. We communicate with the agent through a socket server that handles authorization, routing requests to the correct sandbox, and persisting any messages the agent sends (we don’t want to give the sandbox any credentials).
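To make the socket server's three jobs concrete, here is a minimal in-memory sketch of that relay. It is an illustration rather than the production code: the names `SocketServer`, `Sandbox`, `relay`, and `message_log` are hypothetical, the "database" is a plain list, and the sandbox is a local object instead of a remote machine.

```python
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    """Stand-in for a remote sandbox; a real one sits behind a network connection."""
    sandbox_id: str
    inbox: list[str] = field(default_factory=list)

    def handle(self, message: str) -> str:
        # Record the request and produce a placeholder agent reply.
        self.inbox.append(message)
        return f"agent reply to: {message}"


@dataclass
class SocketServer:
    """Hypothetical relay sitting between clients and sandboxes."""
    sessions: dict[str, str]        # session token -> sandbox id
    sandboxes: dict[str, Sandbox]   # sandbox id -> sandbox
    message_log: list[tuple[str, str]] = field(default_factory=list)  # stand-in for the database

    def relay(self, token: str, message: str) -> str:
        # 1. Authorization: only known session tokens get through.
        sandbox_id = self.sessions.get(token)
        if sandbox_id is None:
            raise PermissionError("unknown session token")

        # 2. Routing: forward the request to the sandbox that owns this session.
        reply = self.sandboxes[sandbox_id].handle(message)

        # 3. Persistence: the relay stores the agent's message itself,
        #    so the sandbox never needs database credentials.
        self.message_log.append((sandbox_id, reply))
        return reply
```

Note that in this shape every message passes through `relay` and waits for the persistence step before it reaches the client.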

There are three main concerns with running a coding agent sandbox.

  • Startup time
  • Latency
  • Security

I can confidently say that this architecture does pretty poorly on the first two categories. Here’s the scorecard:

Startup time: bad (10-30 seconds)

While many sandbox companies advertise sub-second cold starts, this simply wasn’t the case for us. Our Docker image is a few hundred megabytes and we attach an encrypted volume to each machine. In practice, machines took 10 seconds to start in the best case, and 30 seconds in the worst. That said, now is a good time to mention that we use Fly.io for our sandboxes, and love it. More on this later.

With this approach, users stared at a loading screen for up to 30 seconds before the agent’s first message appeared. Yikes.

Latency: also bad (>200ms)

We had two glaring issues on the latency front:

  • Every request made an extra network hop. Worse yet, for WebSocket connections we were stitching the client connection to the agent connection in the socket server, which added its own overhead.
  • Persistence was on the hot path between the agent and the client, so on top of the extra hop, every message also paid for any database queries made along the way.

For the agent messages, an extra 150ms isn’t the end of the world because LLM calls make up the lion’s share of the gap between messages. The real issue is in the terminal and the IDE. There was about 200ms of lag between hitting a key and seeing the character in the terminal. Genuinely unbearable. Same for opening a file in the IDE.
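A back-of-the-envelope model shows why every middleman hurts keystrokes in particular: the key press crosses each link on the way to the sandbox, and the echoed character crosses each one again on the way back. The function name `echo_round_trip_ms` and all of the numbers below are placeholders picked for illustration, not measurements from our system.

```python
def echo_round_trip_ms(link_latencies_ms, relay_overhead_ms=0.0):
    """Estimate the delay between a key press and its echo on screen.

    link_latencies_ms: one-way latency of each network link between the
        client and the sandbox, in order.
    relay_overhead_ms: processing cost added by each intermediate server,
        paid once per direction.
    """
    links = list(link_latencies_ms)
    relays = max(len(links) - 1, 0)  # every link after the first implies a relay in between
    one_way = sum(links) + relays * relay_overhead_ms
    return 2 * one_way  # keystroke goes out, echoed character comes back


# Hypothetical numbers: client -> socket server -> sandbox in a distant region.
via_relay = echo_round_trip_ms([40.0, 50.0], relay_overhead_ms=5.0)

# Hypothetical numbers: client talks directly to a sandbox in a nearby region.
direct = echo_round_trip_ms([20.0])
```

With these placeholder values, `via_relay` comes out to 190.0 and `direct` to 40.0. The exact figures don't matter; the point is that both removing a hop and shortening the remaining link are paid back twice on every keystroke, which is the idea behind the TL;DR.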

Security: it’s fine

The agent wasn’t exposed to the public internet and doesn’t have any secrets. I’ll save security details for another time, though.

The point here is that this architecture does poorly on two of our three concerns; it’s painfully slow.

Fixing startup time: the warm pool
