Stars
qqr is an RL training framework for open-ended agents.
Kubernetes-compatible infrastructure for Affine
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Accompanying code for the Nature publication "Discovering State-of-the-art Reinforcement Learning Algorithms"
A framework for efficient model inference with omni-modality models
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
An extremely fast Python type checker and language server, written in Rust.
Official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"
Official code implementation of Context Cascade Compression: Exploring the Upper Limits of Text Compression
Official PyTorch Implementation of "Flow Map Distillation Without Data"
[ASPLOS'26] Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Tooling for exact and MinHash deduplication of large-scale text datasets
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
Interactive visualizations of the geometric intuition behind diffusion models.
🌚 🌍 🌝 Enhanced edition of GeoIP rule files, with support for customizing the V2Ray dat format file geoip.dat, MaxMind mmdb format files, sing-box SRS format files, mihomo MRS format files, Clash rulesets, Surge rulesets, and more. Enhanced edition of GeoIP files for V2Ray, Xray-core, sing-box,…
Label Studio is a multi-type data labeling and annotation tool with standardized output format
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
Native Multimodal Models are World Learners
[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)