- EECS, Peking University
- Beijing, China, the Earth
- https://github.com/aishoot
Stars
🚀🚀 [LLM] 🌏 Train a small 26M-parameter GPT fully from scratch in just 2 hours!
"One-Person Business Methodology", 2nd edition; also suitable for non-technical readers pursuing other side businesses (e.g., self-media, e-commerce, digital products).
Create Epic Math and Physics Animations & Study Notes From Text and Images.
A community-maintained Python framework for creating mathematical animations.
Qihoo360 / 360-LLaMA-Factory
Forked from hiyouga/LLaMA-Factory; adds Sequence Parallelism into LLaMA-Factory
A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
Fully open data curation for reasoning models
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A series of math-specific large language models based on the Qwen2 series.
Train transformer language models with reinforcement learning.
Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied by carefully written, concise descriptions to help readers g…
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
[ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems.
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR 2024]
Ikaros-521 / AI-Vtuber
Forked from sandboxdream/AI-Vtuber. AI Vtuber is a virtual streamer (Live2D/UE/xuniren) driven by [ChatterBot/ChatGPT/claude/langchain/chatglm/text-gen-webui/Wenda/Qwen/kimi/ollama] that can interact with viewers in real time on [Bilibili/Douyin/Kuaishou/WeChat Channels/Pinduoduo/Douyu/YouTube/twitch/TikTok] livestreams, or chat directly locally…
This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)