Stars
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning."
SGLang is a fast serving framework for large language models and vision language models.
DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Qwen3Guard is a multilingual guardrail model series developed by the Qwen team at Alibaba Cloud.
nnScaler: Compiling DNN models for Parallel Training
Maps between a 1-D space-filling Hilbert curve and N-D coordinates
MiroMind Research Agent: Fully Open-Source Deep Research Agent with Reproducible State-of-the-Art Performance on FutureX, GAIA, HLE, BrowseComp and xBench.
This is the official implementation of the paper "The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward."
🚀 Efficient implementations of state-of-the-art linear attention models
Official Repository of "Learning what reinforcement learning can't"
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
Lightweight coding agent that runs in your terminal
The official GitHub repo for "Training Optimal Large Diffusion Language Models", the first large-scale scaling law for diffusion language models.
Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey.
[NeurIPS2025] "AI-Researcher: Autonomous Scientific Innovation" -- A production-ready version: https://novix.science/chat
Repository for the paper 'Is In-Context Learning Learning?'
MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining
🧬 Augmenting zero-shot mutant prediction by retrieval-based logits fusion. (ISMB/ECCB 2025)
Awesome In-Context RL: A curated list of In-Context Reinforcement Learning
Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"