Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
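For reference, a minimal offline-inference sketch using vLLM's Python API; the model ID is an illustrative choice, not one prescribed by the repo:

```python
from vllm import LLM, SamplingParams

# Load a model and run batched, memory-efficient generation.
llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")  # illustrative model choice
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["What is continual learning?"], params)
for out in outputs:
    print(out.outputs[0].text)
```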
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
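A minimal sketch of wrapping a PyTorch model with DeepSpeed's ZeRO stage-2 sharding; the config values here are illustrative assumptions, not recommendations:

```python
import torch
import deepspeed

model = torch.nn.Linear(512, 512)  # stand-in for a real network

# Illustrative config: ZeRO stage 2 shards optimizer states and gradients
# across data-parallel workers to cut per-GPU memory.
ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler).
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
```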
Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
Fast and memory-efficient exact attention
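A minimal sketch of calling the flash-attn kernel directly; inputs follow the (batch, seqlen, num_heads, head_dim) convention and must be fp16/bf16 tensors on a CUDA device:

```python
import torch
from flash_attn import flash_attn_func

# q, k, v: (batch, seqlen, num_heads, head_dim), half precision, on GPU.
q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")

# Exact attention, computed without materializing the full score matrix.
out = flash_attn_func(q, k, v, causal=True)  # shape (2, 1024, 8, 64)
```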
verl: Volcano Engine Reinforcement Learning for LLMs
A book for learning the foundations of LLMs
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …)
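A minimal PEFT fine-tuning sketch using the generic Hugging Face peft API rather than this repo's own CLI; the base model and target module names are assumptions that vary by architecture:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative base model; any causal LM is wrapped the same way.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# LoRA: train small low-rank adapters instead of all weights.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of parameters
```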
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains a medical LLM, implementing continual pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.
Sky-T1: Train your own O1 preview model within $450
Code for the paper "Evaluating Large Language Models Trained on Code"
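The paper's unbiased pass@k estimator is worth writing out: with n samples per problem of which c pass, pass@k = 1 - C(n-c, k)/C(n, k). A minimal numpy sketch of that formula:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k: 1 - C(n-c, k) / C(n, k).

    n: samples generated per problem, c: samples that passed, k: budget.
    """
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    # Product form (1 - k/i) avoids computing huge binomial coefficients.
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

print(pass_at_k(n=20, c=3, k=5))  # estimated P(at least one pass in 5 tries)
```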
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
Chinese legal LLaMA (LLaMA for the Chinese legal domain)
Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLMs).
Official Repository of "Learning to Reason under Off-Policy Guidance"
PyContinual (An Easy and Extendible Framework for Continual Learning)
Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
[ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models"
Continual learning for Transformers: trains on multiple tasks sequentially while preserving knowledge from earlier tasks using Elastic Weight Consolidation.
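A minimal sketch of the EWC penalty itself, assuming a precomputed diagonal Fisher estimate (fisher) and parameters saved after the previous task (old_params), both keyed by parameter name:

```python
import torch

def ewc_penalty(model, fisher, old_params, lam=1.0):
    """Quadratic EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    fisher and old_params are assumptions of this sketch: dicts mapping
    parameter names to tensors stored after training the previous task.
    """
    loss = 0.0
    for name, param in model.named_parameters():
        if name in fisher:  # penalize drift only where Fisher info is stored
            loss = loss + (fisher[name] * (param - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

# Usage: total_loss = task_loss + ewc_penalty(model, fisher, old_params)
```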
Source code for a LoRA-based continual relation extraction method.
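Since LoRA recurs across several of these projects, a from-scratch sketch of the low-rank update W + (alpha/r) * B @ A may help; the dimensions and init scale are illustrative, not taken from this repo:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (alpha/r) * B @ A."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        # Base output plus the scaled low-rank correction x @ A^T @ B^T.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(4, 768))  # shape (4, 768)
```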