A Three-Machine Relay for Real-Time Robot Inference
GPU workstation, relay/control node, and robot — turned into one control loop with defensible interfaces.
Role — Architecture, interface design, networking, debugging
- An 8B model doesn't fit on a robot. Splitting perception, inference, and control across machines buys compute but spends reliability — every hop adds latency, jitter, and a new way to fail.
- Distributed inference/control topology
- Almost every ambitious embodied-AI system is a distributed system in disguise. The relay pattern here — edge sensing, remote inference, local control — is the pattern the field keeps rebuilding.
- Built within the K1 research program under Dr. Yiyan Li — SDK primitives from Booster; topology, translation, and integration were the research work.
Overview
The systems half of the Booster K1 deployment, treated on its own terms: how camera streaming, model inference on an RTX 5090, and SDK velocity control were split across three machines, what the interfaces between them look like, and how the seams were debugged. The live overlay video shows the whole loop at once — the instruction at the top, the model's reasoning, and the velocity commands with latency and buffer state at the bottom.
Sensing and actuation stay near the robot; heavy inference runs where the compute is; a relay node in between owns translation, pacing, and safety. Each machine has one job, and the interfaces between them are the design surface.
- Robot — sensing & actuation
- Relay / control node — pacing, translation, safety
- GPU workstation — model inference
- Return path — commands & telemetry
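To make the relay's translation and safety role concrete, here is a minimal Python sketch of what a contract between model output and velocity command might look like. It is an illustration under stated assumptions, not the project's actual interface: the names `ModelAction`, `VelocityCommand`, and `translate`, the field layout, and the limit values are all hypothetical, and no Booster SDK call is shown.

```python
import math
from dataclasses import dataclass

# Hypothetical limits, chosen for illustration only.
MAX_LINEAR_MPS = 0.5
MAX_ANGULAR_RPS = 1.0


@dataclass(frozen=True)
class ModelAction:
    """Hypothetical output of the remote model for one planning step."""
    vx: float         # requested forward velocity, m/s
    vy: float         # requested lateral velocity, m/s
    yaw_rate: float   # requested turn rate, rad/s
    issued_at: float  # sender timestamp, seconds


@dataclass(frozen=True)
class VelocityCommand:
    """Hypothetical command the relay would hand to the robot-side control call."""
    vx: float
    vy: float
    yaw_rate: float


STOP = VelocityCommand(0.0, 0.0, 0.0)


def clamp(value: float, limit: float) -> float:
    """Restrict value to the closed interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def translate(action: ModelAction) -> VelocityCommand:
    """Turn a model action into a bounded command; reject non-finite values."""
    values = (action.vx, action.vy, action.yaw_rate)
    if not all(math.isfinite(v) for v in values):
        return STOP
    return VelocityCommand(
        vx=clamp(action.vx, MAX_LINEAR_MPS),
        vy=clamp(action.vy, MAX_LINEAR_MPS),
        yaw_rate=clamp(action.yaw_rate, MAX_ANGULAR_RPS),
    )
```

Keeping the clamp and the non-finite check on the relay, rather than on the GPU workstation, means a malformed model output is stopped one hop before it reaches the robot.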
Contributions
- Designed the three-machine topology: robot camera streaming, relay/control machine, GPU inference workstation.
- Implemented command translation from model outputs to robot SDK velocity control, bridging model planning at ~1 Hz with 50 Hz control.
- Owned cross-machine debugging — the integration seams where most failures lived.
- Built the live diagnostic overlay: instruction, model reasoning, velocity commands, per-step latency, and buffer state in one view.
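The rate mismatch in the contributions above (planning at roughly 1 Hz, control at 50 Hz) can be bridged in several ways. The sketch below shows one common option, assumed here rather than taken from the project: hold the most recent plan for every control tick, and fall back to a stop command once that plan is older than a timeout. `CommandHold`, `simulate`, and the 1.5 s timeout are hypothetical names and values. Time is passed in explicitly so the logic can be checked without a real clock.

```python
from typing import Dict, List, Optional, Tuple

Velocity = Tuple[float, float, float]  # (vx, vy, yaw_rate)
STOP: Velocity = (0.0, 0.0, 0.0)


class CommandHold:
    """Holds the latest plan for the fast loop; returns STOP once it is stale."""

    def __init__(self, max_age_s: float) -> None:
        self.max_age_s = max_age_s
        self._command: Velocity = STOP
        self._received_at: Optional[float] = None

    def update(self, command: Velocity, now: float) -> None:
        """Record a new plan from the slow (planning) side."""
        self._command = command
        self._received_at = now

    def current(self, now: float) -> Velocity:
        """Return the command the fast (control) side should send at time now."""
        if self._received_at is None:
            return STOP
        if now - self._received_at > self.max_age_s:
            return STOP
        return self._command


def simulate(
    plan_arrivals: Dict[float, Velocity],
    control_hz: int = 50,
    duration_s: float = 3.0,
    max_age_s: float = 1.5,
) -> List[Velocity]:
    """Replay a schedule of plan arrivals against a fixed-rate control loop."""
    hold = CommandHold(max_age_s)
    pending = sorted(plan_arrivals.items())
    sent: List[Velocity] = []
    for tick in range(int(duration_s * control_hz)):
        now = tick / control_hz
        # Deliver every plan that has arrived by this tick.
        while pending and pending[0][0] <= now:
            arrived_at, command = pending.pop(0)
            hold.update(command, arrived_at)
        sent.append(hold.current(now))
    return sent
```

With plans arriving at 0 s and 1 s and nothing afterwards, `simulate` repeats each plan on every tick until the second one ages past the timeout, then emits `STOP`. Whether a stale plan should stop the robot or decay more gradually is a design choice this sketch does not settle.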
Evidence & evaluation
Evidence
- Live loop recording (attached): embedded above; shows instruction, reasoning, and velocity commands with a live latency readout.
- Topology diagram (attached): hop-by-hop diagram in the gallery.
- Interface definitions (pending): the actual message/command contracts between machines.
- Timing measurements (pending): per-hop latency and jitter under load.
Metrics
~350 ms
Not yet measured
Measure camera→relay, relay→GPU, GPU→command.
Not yet measured
What the loop actually holds during runs.
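For the per-hop measurements still outstanding, the following Python sketch shows one way the numbers could be summarized once timestamps exist. It assumes each control step carries a timestamp per stage (`camera`, `relay`, `gpu`, `command`, all hypothetical keys) and that the clocks on the three machines are synchronized closely enough for differences between them to be meaningful; without that, per-hop figures would have to come from round trips timed on a single machine. Jitter is reported here as the population standard deviation, which is one of several reasonable definitions.

```python
import math
import statistics
from typing import Dict, List, Sequence

# Hops named in the measurement plan: camera→relay, relay→GPU, GPU→command.
HOPS = (("camera", "relay"), ("relay", "gpu"), ("gpu", "command"))


def hop_latencies_ms(steps: Sequence[Dict[str, float]]) -> Dict[str, List[float]]:
    """Convert per-step stage timestamps (seconds) into per-hop latencies (ms)."""
    per_hop: Dict[str, List[float]] = {f"{a}->{b}": [] for a, b in HOPS}
    for stamps in steps:
        for a, b in HOPS:
            per_hop[f"{a}->{b}"].append((stamps[b] - stamps[a]) * 1000.0)
    return per_hop


def summarize(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Mean, nearest-rank 95th percentile, and jitter for one hop."""
    ordered = sorted(samples_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "mean_ms": statistics.fmean(ordered),
        "p95_ms": ordered[p95_index],
        "jitter_ms": statistics.pstdev(ordered),
    }
```

Calling `summarize` on each list returned by `hop_latencies_ms` gives one row per hop, which is the shape the pending timing table would need.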
Limitations
- A production version would need monitoring, failover, and a security posture this research loop doesn't have.
Lessons & tradeoffs
- Interfaces between machines deserve the same design attention as the model — most failures were seam failures.
Artifacts
- Live loop recording
- Topology diagram
- Integration postmortem (not yet published)