Stars
Official implementation of the paper [ICLR2026] Stop Unnecessary Reflection: Training LRMs for Efficient Reasoning with Adaptive Reflection and Length Coordinated Penalty
[NeurIPS'25] The official code implementation for the paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing"
[ICLR 2026] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
[NeurIPS 2025] A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings
[ICLR 2026] Efficient Reasoning with Balanced Thinking
📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice
No fortress, purely open ground. OpenManus is Coming.
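🚀🚀 [LLM] Train a small 64M-parameter GPT entirely from scratch in just 2 hours!
AgentX aims to let even complete beginners build their own Agent through natural language, with no barrier to entry. AgentX uses a self-developed MCP gateway and model high-availability components to build high availability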
🤗 smolagents: a barebones library for agents that think in code.
An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of…
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
A natural language interface for computers
Open-source agentic AI data assistant for the next generation of AI + Data products.
Minimal reproduction of DeepSeek R1-Zero
verl: Volcano Engine Reinforcement Learning for LLMs
A GUI client for Windows, Linux, and macOS that supports Xray, sing-box, and others
Ongoing research training transformer models at scale
[NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604
Review automated kernel generation in the era of LLMs
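Course materials from the Yanqi Lake campus of the University of Chinese Academy of Sciences (UCAS) for 2024-2025, covering reinforcement learning, intelligent computing systems, pattern recognition, matrix analysis and applications, principles and algorithms of artificial intelligence, and natural language processing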
One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)
Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"
Pruning the Unsurprising: Efficient LLM Reasoning via First-Token Surprisal
This is the official code for the OThink-R1 project.