From demo to deployment: the loop that actually closes.
Bench evaluates, Data cleans, Synth augments. Ship faster, fail less with tooling your team can inspect and extend.
Code, docs, and early SDK work live on GitHub.
Example Bench report
run_8c91a4 · kitchen_handover · real Franka
Three subtasks pass. Handover still fails.
→ Find bottlenecks before hardware ever moves.
Verdict
Ship drawer + pick + pour.
Replay handover failures before sign-off.
A demo proves a robot can. Deployment needs proof it will — across seeds, across embodiments, on the hardware that ships.
One loop. Four building blocks.
Data, Synth, Bench, and API work together from failed rollout to verified policy.
OpenBot Data
Clean episodes, mine failures, build the next training set.
Explore Data
OpenBot Synth
Generate hard cases from the failures Bench finds.
Explore Synth
OpenBot Bench
Acceptance metrics across seeds, subtasks, and embodiments.
Explore Bench
OpenBot API
Wire the loop into your runner, CI, or agent.
Explore API
A report built for sign-off.
Task success, failure point, and next action in one view.
kitchen_handover · open_drawer → pick_mug → pour → handover
Subtask success
200 rollouts, real Franka
- open_drawer: 98%
- pick_mug: 91%
- pour: 82%
- handover: 60%
Success across 10 seeds
73% ± 5.2
Ship for drawer + pick + pour. Replay failed handovers in Synth, then re-run Bench.
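The verdict above can be reproduced with a few lines of plain Python. This is a minimal sketch, not OpenBot's actual sign-off logic: the 0.80 threshold and the helper names `split_by_threshold` and `summarize_seeds` are assumptions made for illustration, and the report does not say whether its ± figure is a standard deviation or a confidence interval. Only the four subtask rates come from the example report.

```python
from statistics import mean, stdev

# Subtask success rates from the example report (200 rollouts, real Franka).
SUBTASK_SUCCESS = {
    "open_drawer": 0.98,
    "pick_mug": 0.91,
    "pour": 0.82,
    "handover": 0.60,
}

# Hypothetical acceptance bar chosen for this sketch, not an OpenBot default.
SHIP_THRESHOLD = 0.80


def split_by_threshold(subtask_success, threshold):
    """Return (ship, replay) lists of subtask names, preserving task order."""
    ship = [name for name, rate in subtask_success.items() if rate >= threshold]
    replay = [name for name, rate in subtask_success.items() if rate < threshold]
    return ship, replay


def summarize_seeds(per_seed_success):
    """Mean and sample standard deviation of task success across seeds."""
    return mean(per_seed_success), stdev(per_seed_success)


ship, replay = split_by_threshold(SUBTASK_SUCCESS, SHIP_THRESHOLD)
bottleneck = min(SUBTASK_SUCCESS, key=SUBTASK_SUCCESS.get)

print("ship:", ship)
print("replay:", replay)
print("bottleneck:", bottleneck)
```

With the report's numbers and this assumed threshold, handover is the only subtask sent back for replay, which matches the verdict. `summarize_seeds` expects the per-seed task success rates, one value per seed, which the example report does not list individually.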
Open core. Hosted scale.
The methods that decide readiness are public. Managed workflows are available for teams that need scale.
Auditable methods
Metrics and schemas stay public.
A shared standard
Use the same contracts in your own runner.
Hosted at scale
Move to managed rollouts when needed.
Actual SDK, not a mockup.
The openbot Python client is public in the repo today.
from openbot import Client

ob = Client()  # reads OPENBOT_API_KEY
run = ob.bench.rollout(
    policy="openvla-7b",
    embodiment="franka_panda",
    task="open_drawer → pick_mug → pour → handover",
    rollouts=200,
    seeds=10,
)
result = run.wait()  # poll until the run finishes
print(result.task_success)  # e.g. 0.73
print(result.subtask["handover"])  # e.g. 0.60 ← bottleneck
print(result.sim_to_real_gap)  # e.g. -0.29

Build your verification loop today.
Clone the core, read the spec, run the examples.
Explore GitHub→
Bring OpenBot into your robot loop.
We're onboarding teams that need evaluation, data, and synthesis wired into real deployment workflows.
Request early access→