Jangara Bliss
Systems Integration · Selected work · 2026

A Three-Machine Relay for Real-Time Robot Inference

GPU workstation, relay/control node, and robot — turned into one control loop with defensible interfaces.

Role — Architecture, interface design, networking, debugging

Python · Robot SDK · Camera streaming · RTX 5090 workstation · Networking
Problem
An 8B model doesn't fit on a robot. Splitting perception, inference, and control across machines buys compute but spends reliability — every hop adds latency, jitter, and a new way to fail.
System type
Distributed inference/control topology
Why it matters
Almost every ambitious embodied-AI system is a distributed system in disguise. The relay pattern here — edge sensing, remote inference, local control — is the pattern the field keeps rebuilding.
Team context
Built within the K1 research program under Dr. Yiyan Li. The SDK primitives came from Booster; the topology, translation, and integration were the research work.
The view from inside the loop: instruction → reasoning → velocity commands, with live latency.

01

Overview

The systems half of the Booster K1 deployment, treated on its own terms: how camera streaming, model inference on an RTX 5090, and SDK velocity control were split across three machines, what the interfaces between them look like, and how the seams were debugged. The live overlay video shows the whole loop at once — the instruction at the top, the model's reasoning, and the velocity commands with latency and buffer state at the bottom.

System architecture

Sensing and actuation stay near the robot; heavy inference runs where the compute is; a relay node in between owns translation, pacing, and safety. Each machine has one job, and the interfaces between them are the design surface.

  1. Robot — sensing & actuation
  2. Relay / control node — pacing, translation, safety
  3. GPU workstation — model inference
  4. Return path — commands & telemetry
Per-hop view of the three-machine relay: robot to relay to GPU workstation and back
The relay, hop by hop.
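
The division of labor above can be made concrete with a minimal sketch of the messages that might cross each hop and of the relay's translation and safety step. The project's real interface definitions are listed as pending, so everything here is an assumption for illustration: the `Frame`, `ModelAction`, and `VelocityCommand` types, the action vocabulary, and the speed limits are hypothetical and are not taken from the Booster SDK or from the model's actual output format.

```python
from dataclasses import dataclass

# All names, fields, and limits below are hypothetical placeholders.


@dataclass(frozen=True)
class Frame:
    """Robot -> relay -> GPU workstation: one camera frame."""
    seq: int
    captured_at: float  # seconds, on the capturing machine's clock
    jpeg: bytes


@dataclass(frozen=True)
class ModelAction:
    """GPU workstation -> relay: the model's decision for one frame."""
    seq: int
    action: str  # assumed discrete vocabulary, e.g. "forward"


@dataclass(frozen=True)
class VelocityCommand:
    """Relay -> robot: requested body velocity."""
    vx: float  # forward, m/s
    vy: float  # lateral, m/s
    wz: float  # yaw rate, rad/s


STOP = VelocityCommand(0.0, 0.0, 0.0)

# Assumed mapping from model actions to nominal velocities.
ACTION_TABLE = {
    "forward": VelocityCommand(0.3, 0.0, 0.0),
    "turn_left": VelocityCommand(0.0, 0.0, 0.5),
    "turn_right": VelocityCommand(0.0, 0.0, -0.5),
    "stop": STOP,
}

MAX_LINEAR = 0.5   # m/s, assumed safety limit
MAX_ANGULAR = 1.0  # rad/s, assumed safety limit


def clamp(value: float, limit: float) -> float:
    """Restrict value to the range [-limit, limit]."""
    return max(-limit, min(limit, value))


def translate(action: ModelAction) -> VelocityCommand:
    """Map a model action to a clamped command; unknown actions mean stop."""
    raw = ACTION_TABLE.get(action.action, STOP)
    return VelocityCommand(
        clamp(raw.vx, MAX_LINEAR),
        clamp(raw.vy, MAX_LINEAR),
        clamp(raw.wz, MAX_ANGULAR),
    )
```

Keeping translation and clamping in one relay-side function means the robot only ever receives bounded commands, whatever the model emits. Falling back to `STOP` for unrecognized actions is one possible policy, chosen here only to keep the sketch conservative.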

02

Contributions

  • Designed the three-machine topology: robot camera streaming, relay/control machine, GPU inference workstation.
  • Implemented command translation from model outputs to robot SDK velocity control, pacing ~1 Hz planning against 50 Hz control.
  • Owned cross-machine debugging — the integration seams where most failures lived.
  • Built the live diagnostic overlay: instruction, model reasoning, velocity commands, per-step latency, and buffer state in one view.
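
One common way to pace a slow planner against a fast control loop is to hold the most recent command and resend it on every control tick, falling back to a stop when the plan goes stale. The sketch below illustrates that idea only; it is not the project's implementation. `CommandHolder`, `run_ticks`, the `(vx, vy, wz)` tuple, and the 2-second staleness limit are all hypothetical, and `send` stands in for whatever SDK call actually issues a velocity.

```python
from typing import Callable, Optional, Tuple

Velocity = Tuple[float, float, float]  # hypothetical (vx, vy, wz)
STOP: Velocity = (0.0, 0.0, 0.0)


class CommandHolder:
    """Holds the latest planned velocity for reuse by a faster control loop."""

    def __init__(self, max_age_s: float = 2.0) -> None:
        self.max_age_s = max_age_s  # assumed staleness limit
        self._latest: Velocity = STOP
        self._stamp: Optional[float] = None

    def update(self, cmd: Velocity, now: float) -> None:
        """Called when a new plan arrives (roughly once per second)."""
        self._latest = cmd
        self._stamp = now

    def current(self, now: float) -> Velocity:
        """Called on every control tick; returns STOP if the plan is stale."""
        if self._stamp is None or now - self._stamp > self.max_age_s:
            return STOP
        return self._latest


def run_ticks(
    holder: CommandHolder,
    send: Callable[[Velocity], None],
    start_s: float,
    n_ticks: int,
    control_hz: int = 50,
) -> None:
    """Emit n_ticks commands at simulated control-rate timestamps."""
    for i in range(n_ticks):
        send(holder.current(start_s + i / control_hz))


# Usage: one plan at t=0 is replayed across one second of 50 Hz ticks.
sent = []
holder = CommandHolder(max_age_s=2.0)
holder.update((0.3, 0.0, 0.0), now=0.0)
run_ticks(holder, sent.append, start_s=0.0, n_ticks=50)
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the pacing logic testable without a robot or a real-time loop. In a live loop the caller would supply a monotonic clock reading instead.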

03

Evidence & evaluation

Evidence

  • Live loop recording (attached): embedded above, showing the instruction, reasoning, and velocity commands with a live latency readout.
  • Topology diagram (attached): hop-by-hop diagram in the gallery.
  • Interface definitions (pending): the actual message/command contracts between machines.
  • Timing measurements (pending): per-hop latency and jitter under load.

Metrics

  • Inference step: ~350 ms
  • Per-hop latency: not yet measured. To measure: camera→relay, relay→GPU, GPU→command.
  • Sustained frame rate: not yet measured. The figure of interest is what the loop actually holds during runs.
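
The pending per-hop measurement could be approached by stamping each frame at four points and differencing the stamps. The following is a minimal sketch under stated assumptions: the stamp names (`camera`, `relay`, `gpu`, `command`) and both functions are hypothetical, the machines' clocks are assumed to be synchronized closely enough for cross-machine differences to be meaningful, and jitter is summarized here as the population standard deviation, which is only one of several reasonable definitions. No measured values are implied.

```python
import statistics
from typing import Dict, List

# Hypothetical stamp names, in the order a frame passes through them.
HOPS = (("camera", "relay"), ("relay", "gpu"), ("gpu", "command"))


def per_hop_latency_ms(stamps: Dict[str, float]) -> Dict[str, float]:
    """Convert one frame's timestamps (seconds) into per-hop latencies (ms)."""
    return {
        f"{src}->{dst}": (stamps[dst] - stamps[src]) * 1000.0
        for src, dst in HOPS
    }


def summarize(samples: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Report mean latency and jitter (population std dev) for each hop."""
    summary = {}
    for hop in samples[0]:
        values = [sample[hop] for sample in samples]
        summary[hop] = {
            "mean_ms": statistics.fmean(values),
            "jitter_ms": statistics.pstdev(values),
        }
    return summary
```

If the clocks cannot be synchronized, a round-trip measurement taken entirely on the relay's clock avoids the problem, at the cost of not separating the outbound and return legs.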

04

Limitations

  • A production version would need monitoring, failover, and a security posture this research loop doesn't have.

05

Lessons & tradeoffs

  • Interfaces between machines deserve the same design attention as the model — most failures were seam failures.

06

Artifacts