Stars
An LLM post-training framework with vLLM for RL Scaling
A benchmark for evaluating AI agents on realistic business workflows
Undetected version of the Playwright testing and automation library.
Rubric compiler and judge engine for LLM evaluation
Autonomous nanoGPT optimizer speedrun
Fast LLM speculative inference server for consumer hardware.
TokenSpeed is a speed-of-light LLM inference engine.
Programmable chat templates for LLM training and inference.
🚀 An open and lightweight modification to Windows, designed to optimize performance, privacy and usability.
The agent that grows with you
Simple & Scalable Pretraining for Neural Architecture Research
Harness for RLM-style rollouts. Only for RL training
Control panel for vLLM, SGLang, llama.cpp, and ExLlamaV3
Training library for Megatron-based models with bidirectional Hugging Face conversion capability
τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
A protocol for connecting any editor to any agent
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
Autonomous environment research loop for building verifiers RL training environments
AI agents running research on single-GPU nanochat training automatically
Train AI coding agents with RL (GRPO) on Prime Intellect — works with Pi Agent
A lightweight, AI-native training framework for large language models. Designed for fast iteration, reproducible experiments, and modular configuration across SFT, RLVR, and evaluation workflows.
Developing as part of Prime Intellect's RL Residency
Environments by the Prime Intellect Research Team
Detect leaked asyncio tasks, threads, and event loop blocking with stack trace in Python. Inspired by goleak.