Starred repositories
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Fully Open Framework for Democratized Multimodal Training
High-quality single-file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
A modular, primitive-first, Python-first PyTorch library for Reinforcement Learning.
AI agents running research on single-GPU nanochat training automatically
slime is an LLM post-training framework for RL Scaling.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …
Scalable data preprocessing and curation toolkit for LLMs
Minimal reproduction of DeepSeek R1-Zero
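A complete large language model learning project that implements a GPT-style language model from scratch. It aims to help developers understand the underlying principles of large language models and to demystify them by having developers implement each component by hand.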
Scalable toolkit for efficient model reinforcement
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Official inference repo for FLUX.1 models
Official repository for "PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout" (CVPR 2023).
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
The minimal OpenCV for Android, iOS, ARM Linux, Windows, Linux, macOS, HarmonyOS, WebAssembly, watchOS, tvOS, visionOS
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
[ICCV 2021 Oral] Deep Evidential Action Recognition
Open-source and strong foundation image recognition models.
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Segment Anything in High Quality [NeurIPS 2023]
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
[ICME 2022] Official Implementation for "Adaptive Mean-Residue Loss for Robust Facial Age Estimation"
[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos"
[CVPR 2023] All in One: Exploring Unified Video-Language Pre-training