jbarnes850


Repository for "Training Language Models To Explain Their Own Computations"

Python 5 1 Updated Dec 22, 2025
Jupyter Notebook 172 2 Updated Dec 19, 2025

A benchmark that challenges language models to code solutions for scientific problems

Python 164 27 Updated Dec 22, 2025

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Python 97 3 Updated Dec 24, 2025

Measuring how well CLI agents can post-train LLMs

Python 51 7 Updated Dec 17, 2025

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 2,418 359 Updated Dec 23, 2025

OpenTinker is an RL-as-a-Service infrastructure for foundation models

Python 370 25 Updated Dec 25, 2025

MiniMax-M2, a model built for Max coding & agentic workflows.

2,121 162 Updated Nov 13, 2025

This was designed for interp researchers who want to do research on or with interp agents to give quality of life improvements and fix some of the annoying things you get from only using Claude cod…

Jupyter Notebook 81 5 Updated Dec 19, 2025

Accelerating MoE with IO and Tile-aware Optimizations

Python 462 27 Updated Dec 25, 2025

ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs)

Python 262 31 Updated Dec 23, 2025

MoE training for Me and You and maybe other people

Python 296 25 Updated Dec 17, 2025

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 2,337 206 Updated Dec 23, 2025
TypeScript 2 Updated Dec 16, 2025
Python 18 2 Updated Dec 24, 2025

My learning notes for ML SYS.

Python 4,804 309 Updated Dec 24, 2025

Build RL environments for LLM training

Python 536 32 Updated Dec 23, 2025

Developer Asset Hub for NVIDIA Nemotron — A one-stop resource for training recipes, usage cookbooks, and full end-to-end reference examples to build with Nemotron models

Jupyter Notebook 246 40 Updated Dec 23, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,876 1,816 Updated Oct 13, 2025

Open-source release accompanying Gao et al. 2025

Python 471 48 Updated Dec 11, 2025

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Python 236 6 Updated Dec 10, 2025

Repository for getting started with the OfficeQA Benchmark.

Python 42 4 Updated Dec 18, 2025

ThetaEvolve: Test-time Learning on Open Problems, enabling RL training on AlphaEvolve/OpenEvolve and emphasizing scaling test-time compute

Python 85 7 Updated Dec 8, 2025

Evolve your language agent with Agentic Context Engineering (ACE)

Python 439 50 Updated Nov 18, 2025

A live benchmark and evaluation framework for open-ended deep research in the wild.

Python 102 10 Updated Nov 13, 2025
Python 627 60 Updated Dec 25, 2025

Causal RL Environments Simulator

Python 18 2 Updated Dec 20, 2025