Excited to share that our paper on Implicit Intelligence was accepted to ICML (International Conference on Machine Learning) 2026.

Do AI agents understand us, or just follow instructions? Real requests aren't fully specified. Your phone already holds the context: a dinner reservation, the address in Maps, a text that someone's late, and Do Not Disturb still on. When you ask an agent to "schedule a car pickup", you naturally want it to weave all of this context together, knowing whether it should contact the restaurant to modify the dinner reservation, or when and where to schedule your car.

Today's agents usually don't. We built Implicit Intelligence to measure this gap across 205 real-world scenarios involving accessibility, privacy, and safety. Top models still struggle: Opus 4.6 reaches just 53.2% and GPT 5.2 Pro 48.3%, showing a sizable gap between how these models perform and what users expect on requests grounded in messy, real-world context.

Curious how other models compare, or want to learn more about the methodology behind the benchmark? Check out our leaderboard and paper (link in the comments). Ved Sirdeshmukh Marc Wetter
Labelbox
Software Development
San Francisco, California 36,818 followers
The data factory for leading AI teams
About us
Labelbox builds and operates reinforcement learning data factories for the world’s leading AI labs and enterprises, powering the next generation of frontier models and AI applications.
- Website: https://labelbox.com/
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: San Francisco, California
- Type: Privately Held
- Founded: 2018
Locations
- Primary: 510 Treat Ave, San Francisco, California 94110, US
Updates
Hiring AI builders: Most AI progress today isn't bottlenecked by models. It's bottlenecked by data and environments. That layer shapes what models learn, how they behave, and whether they improve. It's also the part people sometimes overlook, yet it raises the ceiling on performance, and it's what Labelbox is focused on building.

We're hiring a small group to work at the frontier:
- Forward Deployed Engineers (RL environments)
- Forward Deployed Research Scientists
- FDE Manager

You'll design environments, shape feedback loops, and work alongside leading AI teams pushing what these systems can do. If you want to build the infrastructure that actually determines how state-of-the-art AI systems learn and improve, reach out (JD + links in the comments 👇)
This week, we had the pleasure of hosting 50+ researchers and builders from leading AI companies to meet, talk and socialize (MTS 😎) at Labelbox HQ. Huge thanks to Dwarkesh Patel, Sholto Douglas (Anthropic), Mo Bavarian (OpenAI), and Melvin Johnson (DeepMind) for leading our fireside chat on scaling RL and the pursuit of AGI.
Labelbox reposted this
Keeping customer and workforce data secure is the highest priority at Labelbox, which is why we use a trust platform that is thorough, continuous, and adaptive. https://lnkd.in/gZUpfD2s
🏆 Forbes’ 2026 list of America’s Best Startup Employers is out, and we’re proud to see Labelbox on the list. We’re committed to enabling the next generation of AI by powering the data and evaluation for the world’s most advanced teams. Recognition like this reflects the people building that mission every day. See the full list: https://bit.ly/4u8CumB
Voice agents are evolving from rigid turn-based designs toward continuous, natural conversation, enabling streaming comprehension and generation at the same time. However, most existing benchmarks are either turn-based or latency-focused and do not directly test whether models can maintain reasoning when users interrupt or update objectives mid-utterance. We introduce EchoChain 🔊, a novel benchmark for evaluating reasoning under pressure in full-duplex dialogue.

Key findings:
- Full-duplex models often fail to properly integrate interruption information, in some cases ignoring the interruption entirely.
- A major weakness in today's most advanced models is that they struggle to stay consistent when new input arrives while they're still responding.
- In many cases, a model performs well when it can respond without interruption, but struggles once it's interrupted mid-response.

Check out the full analysis in our blog post. Stay tuned for the arXiv paper as well, which will be released in the coming days. https://lnkd.in/g3QkNZdb
Model safety is often judged by refusal rates on AI safety benchmarks. But what if our evaluations are flagging overtly negative or sensitive language rather than detecting genuine adversarial behavior? In our latest research, we show that when this language is removed, frontier models previously labeled as safe frequently fail, exposing a gap between how model safety is evaluated on benchmarks and how adversarial behavior occurs in the real world.

Key findings:
- AI safety benchmarks are over-reliant on explicit triggering language, provoking model refusals unrealistically.
- Removing these cues significantly degrades safety performance, challenging prior assumptions about the robustness of safety evaluations.
- We found evidence that both internal safety evaluations and safety alignment techniques rely on similar language patterns, further calling the robustness of these evaluations into question.
- Our novel "intent laundering" framework serves as a strong diagnostic and red-teaming tool, exposing where model safety succeeds and where it fails.

Read the full blog post for the complete analysis. https://lnkd.in/g84dywcR
Today, Dario (CEO of Anthropic) x Dwarkesh unpacked where AI is headed, from exponential scaling to what he calls a "country of geniuses in a data center".

A few key takeaways:
- RL is about generalization, not specialization: Like early pretraining, the goal isn't mastering one task, but building rich environments and broad data so models generalize across domains.
- 1–3 years to a "country of geniuses": Dario estimates ~50/50 odds that AI systems collectively match the output of an entire nation of top experts within a few years. Not a single superintelligence, but millions of genius-level systems working in parallel.
- Context as the next unlock: With context windows in the tens of millions of tokens, models could absorb months of workflow in one pass. The goal: steerable, human-aligned systems, as opposed to unchecked autonomous actors.
- Software engineering goes end to end: Models are moving from writing code to executing full engineering cycles: setup, debugging, iteration. Bottlenecks now shift from syntax to judgment.
- Diffusion will lag capability, briefly: Enterprise adoption lags even amid rapid capability growth, but AI can onboard itself via docs, Slack threads, and codebases. By compressing the adoption curve, trillions in AI-driven revenue by 2030 becomes realistic.

Excited to be featured in this conversation, showcasing how we help leading AI teams build high-fidelity RL environments and tighten the iteration loop so models learn from the most informative experiences.
We're excited to share that we've acquired Upcraft to bring AI agents to the heart of how we scale human expertise for frontier AI. Upcraft's AI-powered automation strengthens Alignerr by helping us recruit, engage, and empower a global network of domain experts who train and evaluate the world's most advanced models. As leading AI teams invest billions into post-training and reinforcement learning, expert-generated data has become the true bottleneck for injecting models with the taste and judgment that only deep human expertise can provide. A big welcome to Greg Caplan and the Upcraft team; we look forward to building together. https://lnkd.in/g4rjRNeA
Elon x Dwarkesh x John Collison from Stripe just went live. Their almost three-hour chat (over some Guinness 🍻) dives into what actually limits the next phase of AI and how Elon plans to break through.

A few takeaways from this must-watch episode:
- Space as the next data center: Solar power in orbit is roughly five times more effective than on Earth. Within thirty to thirty-six months, Musk believes space could become the most economically viable location for AI compute, with Starship launching massive power and compute capacity into orbit.
- Humanoid robots as the economic unlock: Optimus could be the ultimate productivity multiplier, potentially expanding the global economy by orders of magnitude. The hardest problem is hands. The endgame is robots that eventually build robots.
- Power as the next bottleneck: Electricity production outside China is flat while compute demand is exploding. Musk says the true scaling wall for AI on Earth is utilities, not just models.
- Debuggability as a safety requirement: Tools that show where a model's reasoning went wrong, trace the origin of errors, or detect potential deception will be essential as AI grows more capable.
- Efficiency as an existential issue: Interest on the national debt now exceeds the military budget. Musk argues that massive productivity gains from AI and robotics are not optional; they are existential.

We're excited to be featured in the conversation, helping leading AI teams scale high-quality robotics and reinforcement learning data so their models learn from the right experiences and reach their full potential.