
Why WebRTC Breaks for Robot Teleop


6 minute read


WebRTC is the default plumbing for almost every teleop stack in robotics today. It is also, in our experience, the reason most of those stacks do not scale.

That is not a controversial claim inside teams that have actually shipped robots to customers. It is the standard six to twelve month learning curve, and the shape of it is almost always the same. A robotics company starts a teleop project, picks WebRTC because it is the obvious open standard with off the shelf libraries, builds a demo, watches the demo work, ships to a customer, and then discovers, slowly and painfully, that the protocol that runs Google Meet does not run a humanoid hand.

What follows is what breaks, and why the patches teams reach for do not fix it.

WebRTC was built for video calls

This is the original sin. WebRTC was designed around call setup time, codec negotiation, NAT traversal, and adaptive bitrate to keep faces talking. Every one of those priorities is reasonable for a video call. Every one of them is wrong for a robot.

The asymmetry runs deep. Humans are slow. A 150ms delay between speaker and listener is invisible because human conversation has 500ms gaps in it. Teleop, by contrast, is a tight control loop where the operator is the controller and the robot is the plant. Closed loop control has hard latency requirements that conversational video simply does not have: at an end effector speed of half a meter per second, 150ms of delay means the hand has already moved 75mm before the operator sees the result of a command.

Loss tolerance is asymmetric too. A dropped frame on a Zoom call is invisible; the next frame replaces it before your eyes register the gap. A dropped control packet to a robot hand is the difference between a clean grasp and a broken object on the floor.

What breaks in the field

Five things consistently go wrong; a sketch of the browser side knobs several of them implicate follows the list:

• Jitter buffers. WebRTC implementations buffer incoming packets to smooth out variable arrival times. The default targets prioritize smooth video over low latency, so even on a healthy network the buffers add 50 to 100ms of delay the operator can never recover. Tune the buffer down and you trade smoothness for responsiveness, a tradeoff most teams only discover after their pilot.

• Retransmits and NACK storms. When packets drop, WebRTC asks for them again. For a video call this is fine because audio and video can wait a beat. For a control loop, the retransmitted packet arrives after the action window has closed. Worse, NACK storms during a lossy moment create packet bursts that worsen congestion at exactly the moment the operator most needs the channel clean.

• Congestion control built for the public internet. WebRTC uses Google Congestion Control or similar algorithms that read packet loss as a signal of congestion and back off the send rate. That is correct most of the time on the public internet, and wrong most of the time on cellular networks, where packet loss is often a brief radio event rather than persistent congestion. The result is a throttled video stream and a degraded operator experience at exactly the wrong moment.

• Adaptive bitrate that downshifts on you. WebRTC drops video quality when the channel narrows. For a video call, a slightly blurry face is still recognizable. For teleop, the operator might be inspecting a millimeter scale grasp affordance, and a blurry image at the wrong moment is a missed task and a corrupted training trajectory.

• Codec profiles tuned for the wrong thing. The H.264 and VP9 configurations WebRTC negotiates by default are tuned for natural video, where motion compensation between frames does most of the work. Robot environments have hard edges, fast camera motion, and machine learning relevant detail that these defaults compress poorly without configuration most teams never reach.
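To make those failure modes concrete, here is a rough sketch of the knobs each end of a browser based WebRTC teleop link typically ends up touching. Treat it as an illustration of where the tuning surface lives, not a recipe: contentHint and setCodecPreferences are standard, while degradationPreference and the jitterBufferTarget / playoutDelayHint receiver hints are nonstandard or unevenly supported across browsers.

```typescript
// Illustrative only. Robot side: shape how the camera feed is encoded.
// Several hints are nonstandard or unevenly supported, hence the casts.
async function tuneRobotSender(pc: RTCPeerConnection, cameraTrack: MediaStreamTrack) {
  // Preserve spatial detail (edges, fine texture) over smooth motion.
  cameraTrack.contentHint = "detail";

  const transceiver = pc.addTransceiver(cameraTrack, { direction: "sendonly" });

  // Prefer a codec/profile you have actually validated on your scenes.
  // setCodecPreferences takes entries from the receiver capability list.
  const caps = RTCRtpReceiver.getCapabilities("video");
  if (caps) {
    const h264 = caps.codecs.filter((c) => c.mimeType === "video/H264");
    const rest = caps.codecs.filter((c) => c.mimeType !== "video/H264");
    transceiver.setCodecPreferences([...h264, ...rest]);
  }

  // Where supported: drop framerate before resolution when bandwidth narrows.
  const params = transceiver.sender.getParameters();
  (params as any).degradationPreference = "maintain-resolution";
  await transceiver.sender.setParameters(params);
}

// Operator side: shrink the receive jitter buffer. Nonstandard: newer Chromium
// exposes jitterBufferTarget (milliseconds); older builds used playoutDelayHint (seconds).
function tuneOperatorReceiver(pc: RTCPeerConnection) {
  for (const receiver of pc.getReceivers()) {
    (receiver as any).jitterBufferTarget = 0;
  }
}
```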

The six to twelve month tax

Every robotics company we talk to has spent six to twelve months patching around these issues. They build custom jitter buffers. They add side channels for control. They write their own congestion control. They tune codec parameters. They run their own TURN servers because the public ones are too slow.
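What those patches look like in practice is usually something like the sketch below. It is illustrative only: the TURN hostname, credentials, and the latestCommand helper are placeholders, not anything shipped by Adamo. The two moves shown are the most common ones, pointing ICE at a relay you operate yourself, and carrying control on an unordered, zero retransmit data channel so stale commands drop instead of queueing.

```typescript
// Illustrative patches on top of stock WebRTC; hostnames and helpers are placeholders.
declare function latestCommand(): { joints: number[] }; // hypothetical operator input source

const pc = new RTCPeerConnection({
  iceServers: [
    {
      urls: ["turn:turn.example.internal:3478?transport=udp"], // self-hosted relay
      username: "robot-fleet",
      credential: "short-lived-token",
    },
  ],
});

// Side channel for control: unordered, no retransmits. A late joystick sample
// is worthless, so let it drop rather than arrive after the action window closes.
const control = pc.createDataChannel("control", {
  ordered: false,
  maxRetransmits: 0,
});

control.onopen = () => {
  setInterval(() => {
    control.send(JSON.stringify({ t: performance.now(), cmd: latestCommand() }));
  }, 10); // roughly 100 Hz command stream
};
```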

None of this engineering differentiates their product. It is networking plumbing they did not want to build, in service of a protocol that was never designed for their problem. The version of this story we hear most often goes something like this:

“We spent two engineers for a year and our latency is now 150ms instead of 250.”

The patches are also fragile. They assume a specific network environment. They break when the operator location changes or the robot deploys on a new cellular provider. And they tend to get worse over time as the WebRTC stack evolves underneath them.

Why a rebuilt protocol matters

The shape of the right answer is a transport whose design priorities are control loop latency, jitter floor, and graceful degradation, rather than call quality. That protocol does not exist in the open standards world because nobody needed it until robots started deploying. So Adamo built one.

The headlines are familiar; the mechanisms behind them are the part worth understanding:

• Glass to glass latency below 50ms, with the gap against WebRTC widening as network conditions degrade. The protocol holds its responsiveness in exactly the moments where WebRTC stacks collapse.

• Control commands ride on a dedicated channel, isolated from video. If the link saturates, the control plane keeps responding while video bitrate adapts. The robot does not freeze. An illustrative sketch of this separation follows the list.

• First class ROS and ROS2 integration. Drops into existing robot stacks without the months of glue code teams normally write to bridge their middleware to a video transport.

• Encryption on every stream by default. TLS 1.3 as the floor, no plaintext path, no configuration required to turn it on.

• Integration measured in hours, not months. A working teleop pipeline from one config file and a container image, with no custom networking work required from your team.
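The control plane point deserves a concrete picture. The sketch below is purely illustrative TypeScript, not Adamo's API or implementation; Link, TeleopSender, and the pacing constant are invented for the example. It shows the invariant that matters: control never queues behind video, so a shrinking link budget shows up as a lower video bitrate rather than a frozen robot.

```typescript
// Purely illustrative: the control/video separation principle, not Adamo's code.
const TICKS_PER_SECOND = 100; // pacing rate for this example

interface Link {
  budgetBytes(): number; // bytes the transport believes it can send this tick
  send(kind: "control" | "video", payload: Uint8Array): void;
}

interface Encoder {
  setBitrate(bitsPerSecond: number): void;
}

class TeleopSender {
  private controlQueue: Uint8Array[] = [];
  private videoQueue: Uint8Array[] = [];

  constructor(private link: Link, private encoder: Encoder) {}

  queueControl(msg: Uint8Array) { this.controlQueue.push(msg); }
  queueVideo(frame: Uint8Array) { this.videoQueue.push(frame); }

  tick() {
    let budget = this.link.budgetBytes();

    // Control is tiny and latency critical: always drained first, never behind video.
    for (const msg of this.controlQueue.splice(0)) {
      this.link.send("control", msg);
      budget -= msg.byteLength;
    }

    // Whatever is left shapes video: lower the encoder bitrate rather than
    // letting frames (and everything behind them) queue up.
    this.encoder.setBitrate(Math.max(0, budget) * 8 * TICKS_PER_SECOND);
    while (this.videoQueue.length > 0 && this.videoQueue[0].byteLength <= budget) {
      const frame = this.videoQueue.shift()!;
      this.link.send("video", frame);
      budget -= frame.byteLength;
    }
  }
}
```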

What this means for teams shipping robots

If you are building teleop on WebRTC today, you have three options. You can keep patching. You can accept the latency floor and design your tasks around it. Or you can replace the networking stack with something that was actually built for the job.

We have watched a dozen teams go through this decision in the past year. The teams that win move quickly on the third option. The teams that lose spend another quarter building a worse version of what they could have bought.

If you are somewhere in that decision, we would be glad to run a head to head latency test against your current setup. Book a slot at adamohq.com.
