Starred repositories
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
Text-audio foundation model from Boson AI
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
Official Repo for Open-Reasoner-Zero
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI (Kunlun Inc.), specializing in vision-language reasoning.
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
Democratizing Reinforcement Learning for LLMs
Fully open data curation for reasoning models
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face.
An Open-source RL System from ByteDance Seed and Tsinghua AIR
Fully open reproduction of DeepSeek-R1
Explore the Multimodal “Aha Moment” on 2B Model
A brief and partial summary of RLHF algorithms.
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on linear attention
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
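This project shares the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications)
Free downloads of English-language magazines including The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic, in epub, mobi, and pdf formats, updated weekly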
SGLang is a fast serving framework for large language models and vision language models.
🚀 Efficient implementations of state-of-the-art linear attention models
ChatYuan: Large Language Model for Dialogue in Chinese and English
✨✨Latest Advances on Multimodal Large Language Models