Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, and TTS models 2x faster with 70% less VRAM.
A toolkit for developing and comparing reinforcement learning algorithms.
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
A generative world for general-purpose robotics & embodied AI learning.
Fully open reproduction of DeepSeek-R1
Code for the paper "Language Models are Unsupervised Multitask Learners"
Build resilient language agents as graphs.
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solutions team.
SGLang is a fast serving framework for large language models and vision language models.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Train transformer language models with reinforcement learning.
A flexible framework for experimenting with cutting-edge LLM inference optimizations
Ongoing research training transformer models at scale
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
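The LoRA idea behind loralib can be sketched in a few lines: freeze the pretrained weight W and learn only a low-rank update B @ A scaled by alpha / r. The shapes and variable names below are illustrative, not loralib's API; this is a minimal NumPy sketch of the math, assuming the standard zero-initialization of B.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 4, 8

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))                   # trainable, zero init

def forward(x):
    # Effective weight is W + (alpha / r) * B @ A
    return x @ (W + (alpha / r) * (B @ A)).T

x = rng.standard_normal((2, d_in))
# Because B starts at zero, the adapted model initially matches the base model.
base_out = x @ W.T

# Only A and B are trained: r * (d_in + d_out) parameters instead of d_out * d_in.
lora_params = A.size + B.size
full_params = W.size
```

Because the update is rank-r, the trainable parameter count here is 512 versus 4096 for full fine-tuning of this layer, which is the memory saving the paper exploits.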
Minimal reproduction of DeepSeek R1-Zero
An easy-to-use, scalable, high-performance RLHF framework built on Ray and vLLM (PPO, GRPO, REINFORCE++, dynamic sampling, async agentic RL)
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
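The GRPO objective mentioned by the two trainers above replaces a learned value baseline with a group-relative one: sample several completions per prompt, score each, and normalize the rewards within that group. A minimal sketch of that advantage computation (shapes and names are mine, not either library's API):

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of sampled completions."""
    rewards = np.asarray(rewards, dtype=float)
    # Advantage = (reward - group mean) / group std, so better-than-average
    # completions get positive advantage and worse ones negative.
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Four completions for one prompt, two scored 1.0 and two scored 0.0.
adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

The normalized advantages then weight the policy-gradient update exactly as a PPO advantage would, without training a separate critic.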
A Collection of Variational Autoencoders (VAE) in PyTorch.
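All the VAE variants in such a collection share the same core mechanism: encode x to a mean and log-variance, then sample the latent with the reparameterization trick so gradients flow through the sampling step. A minimal NumPy sketch of that trick and the closed-form KL term (function names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    # z = mu + sigma * eps with eps ~ N(0, I); sampling noise is external
    # to the parameters, so the operation is differentiable in mu, log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

# Batch of 4 samples with an 8-dimensional latent, posterior = prior.
mu = np.zeros((4, 8))
log_var = np.zeros((4, 8))
z = reparameterize(mu, log_var)
kl = kl_divergence(mu, log_var)
```

The training loss is then reconstruction error plus this KL term; when the posterior equals the standard-normal prior, the KL term is exactly zero.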
An interactive NVIDIA GPU process viewer and beyond: a one-stop solution for GPU process management.
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
Robust recipes to align language models with human and AI preferences
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
A PyTorch native platform for training generative AI models
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving