Jan 19, 2026
This is the first post in a series about my ongoing journey to control a real RC car over the internet from a browser. With FPV video, a race countdown, an admin “referee” dashboard, and latency low enough that you don’t feel like you’re driving via postcards.
The project is called Tether Rally. Under the hood, it’s a mix of WebRTC, Raspberry Pi, ESP32, Cloudflare edge plumbing, and one slightly unhinged hardware constraint: ARRMA Big Rock’s 2-in-1 ESC/receiver.

Why this is harder than it sounds
If you pick a “normal” hobby RC car, you can usually inject control signals pretty cleanly:
- receiver outputs PWM
- servo + ESC take PWM
- Raspberry Pi can generate PWM (or you use a PCA9685)
- done
But ARRMA Big Rock (and friends) ship with an integrated receiver+ESC where the receiver is basically welded to the ESC from your perspective. No handy signal pins. No “just connect to channel 1/2”. So to control it digitally, I had two choices:
- replace the electronics stack (boring, expensive, loses the stock radio feel)
- control the transmitter instead
I went with option 2. Which leads to the core trick of this whole project:
The ESP32 sits on the transmitter and fakes the joystick voltages. The transmitter then sends a normal radio signal to the car, like nothing happened. That means the internet control link has to reach the transmitter, not just the car.
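To make the trick concrete: the ESP32's built-in DAC is 8-bit (codes 0–255 across roughly 0–3.3V), so "faking the joystick" means mapping a normalized command onto a DAC code around the stick's center voltage. Here's the idea as a Python sketch (the real firmware is C++ on the ESP32, and the usable code range below is an assumption, not a measurement):

```python
# Sketch of the joystick-faking idea. The ESP32's DACs are 8-bit, so a
# centered stick sits near code 128; the usable min/max are hypothetical.
DAC_MIN, DAC_CENTER, DAC_MAX = 54, 128, 202

def stick_to_dac(value: float) -> int:
    """Map a normalized input in [-1.0, 1.0] to an 8-bit DAC code."""
    value = max(-1.0, min(1.0, value))  # clamp out-of-range commands
    if value >= 0:
        return round(DAC_CENTER + value * (DAC_MAX - DAC_CENTER))
    return round(DAC_CENTER + value * (DAC_CENTER - DAC_MIN))
```

From the transmitter's point of view, the resulting voltage is indistinguishable from a thumb on the stick.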
Architecture: two variants, one cursed and one clean
Clean version (most cars):
Browser ──WebRTC──> Pi (on car) ──PWM──> Servo/ESC
                        │
                        └──WebRTC Video──> Browser
ARRMA Big Rock version (the one I actually built):
Browser ──WebRTC──> Pi (on car) ──WiFi/UDP──> ESP32 (on transmitter) ──DAC──> Transmitter ~~Radio~~> Car
                        │
                        └──WebRTC Video──> Browser
The Pi is on the car because it owns the camera. The ESP32 is on the transmitter because the transmitter is the only controllable “input device” the car accepts.
This split also creates a fun constraint: the Pi and ESP32 must share a LAN. In my setup they’re typically both on an iPhone hotspot or a portable 4G router.

The two hard problems: latency and safety
Latency
At first I tried the classic “easy” approach: browser → Cloudflare Worker → Pi/ESP via WebSockets.
It worked, but it had that unmistakable feel of driving a car through an interpreter. Steering corrections arrive late, you overcorrect, then you overcorrect the overcorrection, and suddenly you’ve invented drift racing against your will.
So I switched the control path to WebRTC DataChannel:
- unordered packets
- no retransmits
- basically “UDP with better NAT traversal and fewer tears”
That dropped the control RTT to something like 10–15ms on LAN and ~100–200ms over the internet (which is in the “playable” range, especially if your throttle limit isn’t set to “lawnmower”).
Safety
If you’re going to let random people drive a real car remotely, you need the system to fail boring.
So the safety rules live where they can’t be bypassed:
- throttle limits enforced on the ESP32 (not the browser)
- auto-neutral on connection loss
- slew-rate limiting so the car can’t instantly snap from 0 to full send
- race state machine blocks input until it’s actually time to race
Access is token-based; more on that later.
Video: “it has to feel live” (and also not melt the Pi)
For FPV I ended up with:
- Raspberry Pi Zero 2W
- Camera Module 3 (Wide)
- MediaMTX serving WebRTC via WHEP
- Cloudflare Tunnel to expose the endpoints
- Cloudflare TURN for NAT traversal
MediaMTX is doing a lot of heavy lifting here: it can grab frames from the Pi camera stack, hardware-encode H.264, and serve WebRTC without me writing a custom signaling/video pipeline.
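For context, the MediaMTX side can be surprisingly small. Something like this works as a starting point (a hedged sketch: the path name and resolution are assumptions, check the MediaMTX docs for your camera):

```yaml
# mediamtx.yml sketch (illustrative; "cam" path name and resolution are assumptions)
paths:
  cam:
    source: rpiCamera        # MediaMTX's built-in Pi camera source
    rpiCameraWidth: 1280
    rpiCameraHeight: 720
# WHEP playback is then served at http://<pi>:8889/cam/whep
```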
The result is usually ~100–300ms video latency depending on network and whether you get a direct P2P route or a relayed TURN path.
That’s still higher than control latency, which is exactly what you want. Control should be snappy; video can be “good enough” as long as it’s stable and you don’t get multi-second buffering.

Controls: DataChannel → Pi → UDP → ESP32 → DAC voltages
Controls are sent from the browser at 50Hz over a binary protocol. On the Pi, a Python relay receives DataChannel packets and forwards them via UDP to the ESP32 on the LAN.
Binary protocol (the boring part that makes everything less boring)
Packet format: seq(uint16 LE) + cmd(uint8) + payload
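In Python, that framing is one struct call away. The header matches the format above; the cmd value and the int8 throttle/steering payload are illustrative assumptions, not the actual wire layout:

```python
import struct

# Header per the post: seq (uint16, little-endian) + cmd (uint8).
# CMD_DRIVE and the int8 throttle/steering payload are assumptions.
CMD_DRIVE = 0x01

def pack_drive(seq: int, throttle: float, steering: float) -> bytes:
    """Encode a drive packet; throttle/steering in [-1.0, 1.0] scaled to int8."""
    t = max(-127, min(127, round(throttle * 127)))
    s = max(-127, min(127, round(steering * 127)))
    return struct.pack("<HBbb", seq & 0xFFFF, CMD_DRIVE, t, s)

def unpack_drive(pkt: bytes):
    seq, cmd, t, s = struct.unpack("<HBbb", pkt)
    return seq, cmd, t / 127, s / 127
```

Five bytes per packet at 50Hz is 250 bytes/sec of control traffic, which is nothing even on a bad link.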
Why UDP between Pi and ESP32? Because it’s LAN-local, tiny packets, and the ESP32 already has enough going on. UDP is perfect here, and “reliability” is handled by high frequency updates + timeouts instead of retransmits.
Control link evolution: from “it works” to “it feels right”
The first version of Tether Rally’s control path was the classic “ship it” architecture: Browser → Cloudflare Worker → WebSocket → (something on the other end) → car.
It was easy to deploy, easy to reason about, and it worked on basically any network without thinking too hard about NAT traversal. But it had one fatal flaw: It felt like driving while someone narrates your steering inputs to a second person, who then turns the wheel for you.
Not always. Not consistently. But often enough that you start driving the latency instead of driving the car.
Why WebSockets felt bad (even when the RTT looked “okay”)
WebSockets run over TCP. That sentence is the whole story, but it's worth unpacking because the failure mode isn't obvious until you try to do realtime control.
What I cared about for controls wasn’t throughput or reliability. It was:
- freshness (the latest input is the only one that matters)
- predictable timing (jitter feels worse than raw latency)
- graceful loss (missing one packet should be invisible, not catastrophic)
TCP gives you almost the opposite:
- reliability: if a packet is lost, it will be retransmitted
- ordering: packets must be delivered in order
- head-of-line blocking: one missing packet can stall everything behind it
This is great for loading web pages. It’s a small disaster for “steer left, no wait steer right, no wait...”.
The control loop is sending updates at 50Hz. That's one packet every 20ms. If you drop one packet on a shaky link, TCP doesn't just shrug and move on; it tries to heal the timeline. The retransmitted "old" steering update arrives late, newer updates queue up behind it (because ordering), and then the whole clump of stale inputs gets delivered together.
So the car does the exact thing you never want: it briefly becomes more correct about the past instead of more correct about the present.
And from the driver’s perspective, that shows up as weirdness: steering that “sticks” for a beat, sudden catch-up wiggles, throttle that feels mushy and then jumps.
Even if average RTT is fine, this kind of jitter is poison for control feel.
The aha moment: controls want UDP semantics
For control packets, if one is lost you don't want it back, you want the next one. So what I actually wanted was: NAT traversal, a datagram-ish channel (unreliable, unordered), and easy browser support. Which is basically: WebRTC DataChannel configured like UDP.
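One consequence of unordered, unreliable delivery: the receiver has to discard stale packets itself, which is what the uint16 sequence number in the protocol is for. A wraparound-aware freshness check is a one-liner:

```python
# Accept a packet only if its sequence number is "ahead" of the last one seen,
# treating the 16-bit counter as circular so the wrap from 65535 back to 0
# doesn't freeze the controls.
def is_newer(seq: int, last_seq: int) -> bool:
    """True if seq is ahead of last_seq on a circular uint16 counter."""
    return 0 < ((seq - last_seq) & 0xFFFF) < 0x8000
```

Anything that fails the check simply gets dropped, and the next 20ms packet replaces it anyway.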
Switching to WebRTC DataChannel
Once I moved the browser→Pi link to WebRTC, two things happened immediately: RTT dropped dramatically on good paths (especially LAN), but more importantly jitter stopped feeling like betrayal.
ESP32 firmware: the part that stopped the car from twitching like it’s haunted
Analog joystick emulation sounds simple until you discover all the ways noise and scheduling ruin your day. The big fix was making the ESP32 behave like a real control loop, not “a WiFi packet handler that sometimes writes a DAC”.
So the firmware runs two tasks:
- Core 0: UDP receive (updates target throttle/steering)
- Core 1: control loop at 200Hz
  - EMA smoothing (α ≈ 0.25)
  - slew rate limiting (~8/sec max change)
  - staged timeout: 80ms hold → 250ms neutral
That’s what killed the stutter. Packet timing can jitter; the output loop stays smooth.
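The structure is easy to show in miniature. Below is a pure-Python simulation of that Core 1 loop using the constants from the post; the exact behavior of the 80ms "hold" stage is my interpretation (freeze the output, don't yet snap to neutral), and the real firmware is C++:

```python
ALPHA = 0.25                       # EMA smoothing factor (from the post)
DT_MS = 5.0                        # 200 Hz loop period in milliseconds
MAX_SLEW = 8.0 * DT_MS / 1000.0    # ~8 full-scale units/sec, expressed per tick
HOLD_MS, NEUTRAL_MS = 80.0, 250.0  # staged timeout from the post

class ControlLoop:
    """Pure-Python sketch of the Core 1 loop; real firmware is C++ on the ESP32."""

    def __init__(self) -> None:
        self.target = 0.0            # latest value written by the UDP task (Core 0)
        self.output = 0.0            # smoothed value that goes to the DAC
        self.since_packet_ms = 0.0

    def on_packet(self, value: float) -> None:
        self.target = value
        self.since_packet_ms = 0.0

    def tick(self) -> float:
        self.since_packet_ms += DT_MS
        if self.since_packet_ms > NEUTRAL_MS:
            target = 0.0             # stale link: fail boring, go neutral
        elif self.since_packet_ms > HOLD_MS:
            return self.output       # brief gap: hold the current output
        else:
            target = self.target
        # EMA toward the target, then clamp the per-tick change (slew limit)
        smoothed = self.output + ALPHA * (target - self.output)
        step = max(-MAX_SLEW, min(MAX_SLEW, smoothed - self.output))
        self.output += step
        return self.output
```

Note how the loop never touches the network: packets only move the target, and the output chases it at a bounded rate regardless of when packets arrive.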
The admin dashboard
If you’ve ever tried to run anything resembling a tournament, you quickly learn that “just let them drive” turns into chaos. So the admin dashboard is basically a tiny race director:
- see whether the ESP32 is discovered on the network
- see whether a player is connected
- hit Start/Stop Race, kick the player, etc.
Having the player press a “Ready” button also solved a subtle UX problem: video connect time varies a lot. Blocking controls until video is confirmed avoids the “I’m driving blind while WebRTC negotiates” moment.
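The "blocks input until it's actually time to race" rule from the safety section is essentially one gate over a small state machine. A sketch (the state names are my invention, not the actual implementation):

```python
from enum import Enum, auto

# Hypothetical race states; the actual project may name or split these differently.
class RaceState(Enum):
    IDLE = auto()       # no player / admin hasn't opened a slot
    READY = auto()      # player pressed "Ready" (video confirmed)
    COUNTDOWN = auto()  # admin hit Start, countdown running
    RACING = auto()
    FINISHED = auto()

def controls_allowed(state: RaceState) -> bool:
    """Drive commands are only forwarded while the race is actually running."""
    return state is RaceState.RACING
```

Every control packet gets checked against this gate before it's forwarded, so a player mashing the throttle during the countdown does exactly nothing.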

YouTube restreaming: turning the whole thing into a spectator sport
If you want to, you can restream the camera feed to YouTube. The restreamer is a separate service (on Fly.io) that consumes the WHEP stream, re-encodes it, and pushes it to YouTube over RTMP. It’s not necessary for racing, but it’s perfect for “tournament vibes”.
The plan: tournaments and real timing
The next milestones are what turn this from a tech demo into a racing platform:
- Tournament queue system: A simple queue where people join a tournament slot, get their turn, and spectators can watch.
- Timing system: Manual timing is fine for testing, but real racing needs automated timing with reliable ordering. I’m currently leaning toward BLE beacons for start/finish (and eventually checkpoints), because they’re cheap and easy to set up.