Stars
Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
Chinese Legal LLaMA (LLaMA for the Chinese legal domain)
Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
Fast and memory-efficient exact attention
Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLMs).
[ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models"
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A book for Learning the Foundations of LLMs
Source code for a LoRA-based continual relation extraction method.
Continual Learning for Transformers that allows training on multiple tasks sequentially while preserving knowledge from earlier tasks using Elastic Weight Consolidation.
PyContinual (An Easy and Extendible Framework for Continual Learning)
Code for the paper "Evaluating Large Language Models Trained on Code"
Official Repository of "Learning to Reason under Off-Policy Guidance"
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Train a medical LLM, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
A high-throughput and memory-efficient inference and serving engine for LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Sky-T1: Train your own O1 preview model within $450
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
verl: Volcano Engine Reinforcement Learning for LLMs
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.