OpenBot
The trust layer for robots that ship

From demo to deployment: the loop that actually closes.

Bench evaluates, Data cleans, Synth augments. Ship faster, fail less with tooling your team can inspect and extend.

Code, docs, and early SDK work live on GitHub.

Example Bench report

run_8c91a4 · kitchen_handover · real Franka

Three subtasks pass. Handover still fails. Find bottlenecks before hardware ever moves.

open_drawer: 98%
pick_mug: 91%
pour: 82%
handover: 60%

Verdict

Ship drawer + pick + pour.

Replay handover failures before sign-off.

Source available: MIT
Dataset formats: 6
Policy types: 6
Embodiments targeted: 12+

The gap

A demo proves a robot can. Deployment needs proof that it will: across seeds, across embodiments, on the hardware that ships.

Example report

A report built for sign-off.

Task success, failure point, and next action in one view.

openbot bench · kitchen_handover
run_8c91a4 · policy: openvla-7b · embodiment: franka_panda · 200 rollouts × 10 seeds

kitchen_handover · open_drawer → pick_mug → pour → handover

Conditional pass

Task success: 73% (+8 pp)
Sim→Real gap closed: −29 pp (+12 pp)
Intervention rate: 14% (−6 pp)
Mean time-to-success: 18.4 s (−2.1 s)

Subtask success

200 rollouts, real Franka
  • open_drawer: 98%
  • pick_mug: 91%
  • pour: 82%
  • handover: 60%

Success across 10 seeds

73% ± 5.2

[Chart: success per seed, seeds 0–9]
Worst: seed 3 · 65%
Best: seed 2 · 80%
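
As a rough sketch of how a per-seed summary like the one above could be computed, the snippet below takes a list of per-seed success rates and reports the mean, the spread, and the worst and best seeds. The function name `summarize_seeds`, the choice of sample standard deviation, and the input values are all assumptions for illustration; the page does not publish the per-seed numbers or say how the ± figure is defined.

```python
from statistics import mean, stdev


def summarize_seeds(rates):
    """Return mean, sample standard deviation, and (seed, rate) for worst and best."""
    indexed = list(enumerate(rates))
    worst = min(indexed, key=lambda pair: pair[1])
    best = max(indexed, key=lambda pair: pair[1])
    return mean(rates), stdev(rates), worst, best


# Placeholder per-seed success rates, NOT the values behind the report above.
example_rates = [0.70, 0.75, 0.80, 0.65]

avg, spread, worst, best = summarize_seeds(example_rates)
print(f"{avg:.1%} ± {spread * 100:.1f}")
print(f"Worst: seed {worst[0]} · {worst[1]:.0%}")
print(f"Best: seed {best[0]} · {best[1]:.0%}")
```

Reporting the worst seed alongside the mean keeps a single bad seed from hiding behind a healthy average.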

Ship for drawer + pick + pour. Replay failed handovers in Synth, then re-run Bench.
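
The verdict above amounts to a threshold check on subtask success. A minimal sketch, using the four subtask rates from the example report and a hypothetical 80% sign-off threshold (the threshold and the `triage` helper are assumptions, not part of the openbot SDK):

```python
# Subtask success rates copied from the example report.
SUBTASK_SUCCESS = {
    "open_drawer": 0.98,
    "pick_mug": 0.91,
    "pour": 0.82,
    "handover": 0.60,
}

# Hypothetical sign-off threshold, chosen for illustration only.
SHIP_THRESHOLD = 0.80


def triage(subtask_success, threshold=SHIP_THRESHOLD):
    """Split subtasks into those at or above the threshold and those below it."""
    ship = [name for name, rate in subtask_success.items() if rate >= threshold]
    replay = [name for name, rate in subtask_success.items() if rate < threshold]
    return ship, replay


ship, replay = triage(SUBTASK_SUCCESS)
print("Ship:", " + ".join(ship))
print("Replay before sign-off:", ", ".join(replay))
```

With these inputs the split matches the report's verdict: the first three subtasks clear the threshold and handover does not. A real sign-off rule would likely weigh seed variance and intervention rate as well.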

Open source

Open core. Hosted scale.

The methods that decide readiness are public. Managed workflows are available for teams that need scale.

Auditable methods

Metrics and schemas stay public.

A shared standard

Use the same contracts in your own runner.

Hosted at scale

Move to managed rollouts when needed.

Real code

Actual SDK, not a mockup.

The openbot Python client is public in the repo today.

bench_eval.py
from openbot import Client

ob = Client()                       # reads OPENBOT_API_KEY

run = ob.bench.rollout(
    policy="openvla-7b",
    embodiment="franka_panda",
    task="open_drawer → pick_mug → pour → handover",
    rollouts=200,
    seeds=10,
)

result = run.wait()                 # poll until the run finishes
print(result.task_success)          # e.g. 0.73
print(result.subtask["handover"])   # e.g. 0.60  ← bottleneck
print(result.sim_to_real_gap)       # e.g. -0.29

Open source

Build your verification loop today.

Clone the core, read the spec, run the examples.

Explore GitHub

Early access

Bring OpenBot into your robot loop.

We're onboarding teams that need evaluation, data, and synthesis wired into real deployment workflows.

Request early access