- EECS, Peking University
- Beijing, China, the Earth
- https://github.com/aishoot
Stars
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
🚀🚀 Train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏
《一人企业方法论》 (One-Person Company Methodology), 2nd edition; also suitable for non-technical readers running other side businesses (e.g., self-media, e-commerce, digital products).
Create Epic Math and Physics Animations & Study Notes From Text and Images.
A community-maintained Python framework for creating mathematical animations.
A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
Fully open data curation for reasoning models
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A series of math-specific large language models of our Qwen2 series.
Train transformer language models with reinforcement learning.
Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied by carefully written, concise descriptions to help readers g…
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
[ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems.
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR 2024]
A collection of research papers on Self-Correcting Large Language Models with Automated Feedback.
LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)