Systems Integration | Selected work | 2026

A Three-Machine Relay for Real-Time Robot Inference

GPU workstation, relay/control node, and robot — turned into one control loop with defensible interfaces.

Role — Architecture, interface design, networking, debugging

Python, Robot SDK, Camera streaming, RTX 5090 workstation, Networking

Problem: An 8B model doesn't fit on a robot. Splitting perception, inference, and control across machines buys compute but spends reliability — every hop adds latency, jitter, and a new way to fail.
System type: Distributed inference/control topology
Why it matters: Almost every ambitious embodied-AI system is a distributed system in disguise. The relay pattern here — edge sensing, remote inference, local control — is the pattern the field keeps rebuilding.
Team context: Built within the K1 research program under Dr. Yiyan Li — SDK primitives from Booster; topology, translation, and integration were the research work.

The view from inside the loop: instruction → reasoning → velocity commands, with live latency.

Overview

The systems half of the Booster K1 deployment, treated on its own terms: how camera streaming, model inference on an RTX 5090, and SDK velocity control were split across three machines, what the interfaces between them look like, and how the seams were debugged. The live overlay video shows the whole loop at once — the instruction at the top, the model's reasoning, and the velocity commands with latency and buffer state at the bottom.

System architecture

Sensing and actuation stay near the robot; heavy inference runs where the compute is; a relay node in between owns translation, pacing, and safety. Each machine has one job, and the interfaces between them are the design surface.

Robot — sensing & actuation

Relay / control node — pacing, translation, safety

GPU workstation — model inference

Return path — commands & telemetry

Figure: The relay, hop by hop. Per-hop view of the three-machine relay: robot to relay to GPU workstation and back.
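
To make "the interfaces are the design surface" concrete, here is a minimal sketch of what a relay-side contract could look like. Everything in it is an assumption for illustration: the message names (`PlanMessage`, `VelocityCommand`), the field names, the JSON encoding, and the velocity limits are hypothetical and are not the project's actual interface definitions, which are still pending. No Booster SDK calls are used.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical contracts: names, fields, and limits are illustrative
# assumptions, not the project's real message definitions.


@dataclass(frozen=True)
class PlanMessage:
    """GPU workstation -> relay: one planning result."""
    seq: int         # monotonically increasing plan counter
    frame_ts: float  # capture time of the frame the plan is based on (s)
    vx: float        # requested forward velocity (m/s)
    vy: float        # requested lateral velocity (m/s)
    wz: float        # requested yaw rate (rad/s)


@dataclass(frozen=True)
class VelocityCommand:
    """Relay -> robot: the bounded command sent at the control rate."""
    vx: float
    vy: float
    wz: float


# Assumed safety envelope enforced by the relay (placeholder values).
MAX_LINEAR = 0.5  # m/s
MAX_YAW = 1.0     # rad/s


def clamp(value: float, limit: float) -> float:
    """Limit value to the symmetric range [-limit, limit]."""
    return max(-limit, min(limit, value))


def translate(plan: PlanMessage) -> VelocityCommand:
    """Translate a model plan into a velocity command inside the envelope."""
    return VelocityCommand(
        vx=clamp(plan.vx, MAX_LINEAR),
        vy=clamp(plan.vy, MAX_LINEAR),
        wz=clamp(plan.wz, MAX_YAW),
    )


def encode(msg) -> bytes:
    """Serialize a message dataclass to UTF-8 JSON for the wire."""
    return json.dumps(asdict(msg)).encode("utf-8")


def decode_plan(payload: bytes) -> PlanMessage:
    """Parse a PlanMessage from UTF-8 JSON; unknown fields raise TypeError."""
    return PlanMessage(**json.loads(payload.decode("utf-8")))
```

Keeping the clamp in the relay rather than in the model server means an out-of-range plan is bounded even if the inference side misbehaves, which fits the relay's stated role of owning translation and safety.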

Contributions

Designed the three-machine topology: robot camera streaming, relay/control machine, GPU inference workstation.
Implemented command translation from model outputs to robot SDK velocity control, pacing ~1 Hz planning against 50 Hz control.
Owned cross-machine debugging — the integration seams where most failures lived.
Built the live diagnostic overlay: instruction, model reasoning, velocity commands, per-step latency, and buffer state in one view.
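
The pacing problem in the second contribution (plans arriving at roughly 1 Hz, commands leaving at 50 Hz) can be sketched as a hold-last-command buffer with a staleness timeout. This is a simplified illustration, not the project's implementation: `CommandHolder`, `simulate`, and the `max_age_s` timeout are hypothetical names and values, and the loop is simulated with tick counts instead of real clocks or SDK calls.

```python
from typing import Dict, List, Optional, Tuple

Velocity = Tuple[float, float, float]  # (vx, vy, wz)
STOP: Velocity = (0.0, 0.0, 0.0)


class CommandHolder:
    """Holds the latest plan and serves it to a faster control loop.

    If no plan has arrived within max_age_s, current() returns STOP so the
    robot does not keep executing a stale command.
    """

    def __init__(self, max_age_s: float) -> None:
        self.max_age_s = max_age_s
        self._cmd: Velocity = STOP
        self._stamp: Optional[float] = None

    def update(self, cmd: Velocity, now: float) -> None:
        """Record a new plan and the time it arrived."""
        self._cmd = cmd
        self._stamp = now

    def current(self, now: float) -> Velocity:
        """Return the held command, or STOP if it is missing or too old."""
        if self._stamp is None or now - self._stamp > self.max_age_s:
            return STOP
        return self._cmd


def simulate(
    duration_s: float,
    control_hz: int,
    plans: Dict[int, Velocity],
    max_age_s: float,
) -> List[Velocity]:
    """Step a simulated control loop; return the command sent at each tick.

    plans maps a control-tick index to the plan that arrives at that tick.
    """
    holder = CommandHolder(max_age_s)
    sent = []
    for tick in range(int(duration_s * control_hz)):
        now = tick / control_hz
        if tick in plans:
            holder.update(plans[tick], now)
        sent.append(holder.current(now))
    return sent


# Plans arrive at t=0 s and t=1 s (1 Hz), then the planner goes silent.
forward: Velocity = (0.3, 0.0, 0.0)
turn: Velocity = (0.0, 0.0, 0.5)
sent = simulate(4.0, 50, {0: forward, 50: turn}, max_age_s=1.5)
```

At 50 Hz, each 1 Hz plan is replayed for 50 control ticks. In this run the second plan is held until it is more than 1.5 s old, after which every tick sends a zero-velocity command.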

Evidence & evaluation

Evidence

Live loop recording (attached): Embedded above. Shows instruction, reasoning, and velocity commands with a live latency readout.
Topology diagram (attached): Hop-by-hop diagram in the gallery.
Interface definitions (pending): The actual message/command contracts between machines.
Timing measurements (pending): Per-hop latency and jitter under load.

Metrics

Inference step: ~350 ms
Per-hop latency: not yet measured. To measure: camera→relay, relay→GPU, GPU→command.
Sustained frame rate: not yet measured. To measure: what the loop actually holds during runs.
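
One way the pending per-hop measurements could be reduced to numbers is sketched below. It assumes each frame carries a timestamp at every hop boundary; the record keys (`t_camera`, `t_relay`, and so on) are hypothetical. Timestamps taken on different machines are only comparable if the clocks are synchronized, so a real measurement would need a shared time source or round-trip timing from a single machine. The sample values are made up to exercise the functions and are not measurements from this system.

```python
import math
import statistics
from typing import Dict, List


def hop_latencies_ms(
    records: List[Dict[str, float]], start_key: str, end_key: str
) -> List[float]:
    """Latency of one hop for each record, in milliseconds."""
    return [(r[end_key] - r[start_key]) * 1000.0 for r in records]


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Mean, nearest-rank p95, and jitter (population std dev) of a series."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(0.95 * len(ordered)))  # nearest-rank percentile
    return {
        "mean_ms": statistics.fmean(ordered),
        "p95_ms": ordered[rank - 1],
        "jitter_ms": statistics.pstdev(ordered),
    }


# Made-up timestamps (seconds) for two frames, for illustration only.
records = [
    {"t_camera": 0.000, "t_relay": 0.010, "t_gpu_done": 0.360, "t_command": 0.365},
    {"t_camera": 1.000, "t_relay": 1.020, "t_gpu_done": 1.380, "t_command": 1.384},
]

camera_to_relay = summarize(hop_latencies_ms(records, "t_camera", "t_relay"))
relay_to_gpu = summarize(hop_latencies_ms(records, "t_relay", "t_gpu_done"))
gpu_to_command = summarize(hop_latencies_ms(records, "t_gpu_done", "t_command"))
```

Reporting p95 and standard deviation alongside the mean matters here because a control loop is hurt more by occasional long hops than by a slightly higher average.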

Limitations

A production version would need monitoring, failover, and a security posture this research loop doesn't have.

Lessons & tradeoffs

Interfaces between machines deserve the same design attention as the model — most failures were seam failures.

Artifacts

Live loop recording
Topology diagram
Integration postmortem (not yet published)