- EECS, Peking University
- Beijing, China, the Earth
- https://github.com/aishoot
Stars
🚀🚀 [LLM] 🌏 Train a small 26M-parameter GPT fully from scratch in just 2 hours!
"One-Person Business Methodology", 2nd edition; also suitable for non-technical readers pursuing other side businesses (e.g., self-media, e-commerce, digital products).
Create Epic Math and Physics Animations & Study Notes From Text and Images.
A community-maintained Python framework for creating mathematical animations.
Qihoo360 / 360-LLaMA-Factory
Forked from hiyouga/LLaMA-Factory; adds Sequence Parallelism into LLaMA-Factory
A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
Fully open data curation for reasoning models
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A series of math-specific large language models based on the Qwen2 series.
Train transformer language models with reinforcement learning.
Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied by carefully written, concise descriptions to help readers g…
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
[ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems.
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR 2024]
Ikaros-521 / AI-Vtuber
Forked from sandboxdream/AI-Vtuber. AI Vtuber is a virtual streamer (Live2D/UE/xuniren) driven by [ChatterBot/ChatGPT/claude/langchain/chatglm/text-gen-webui/Wenda/Qwen/kimi/ollama] that can interact with viewers in real time on [Bilibili/Douyin/Kuaishou/WeChat Channels/Pinduoduo/Douyu/YouTube/twitch/TikTok] livestreams, or chat directly locally…
This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)