- UCAS
- Beijing, China
Starred repositories
The evaluation code for MultiIF multi-turn and multi-lingual instruction following
Z-library: official Z-lib mirror URLs and entry points (2026/6/7)
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
Machine Learning Engineering Open Book
This repository contains a Freebase dump parser that extracts links to Wikipedia.
Language Savant. If your repository's language is being reported incorrectly, send us a pull request!
Collection of LLM completions for reasoning-gym task datasets
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones
[Up-to-date] Awesome Agentic Deep Research Resources
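🥢 Cook like Laoxiangji 🐔. The main part was completed in 2024; this is not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report" and has been summarized, edited, and organized. CookLikeHOC.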
Bridge Megatron-Core to Hugging Face/Reinforcement Learning
Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.
[EMNLP 2025] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains papers, codes, datasets, evaluations, and analyses.
AI-Powered Python & Python-Powered AI (Python-Use)
[EMNLP 2025] Awesome RAG Reasoning Resources
open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality
(best/better) practices of megatron on veRL and tuning guide
Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024]
A Python module to repair invalid JSON from LLMs
Paper list for Efficient Reasoning.
slime is an LLM post-training framework for RL Scaling.
A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enabling layer-wise analysis of hidden states and predictions.