Stars
A NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library.
A construction kit for reinforcement learning environment management.
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
A lightweight, powerful framework for multi-agent workflows
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos"
Fully Open Framework for Democratized Multimodal Training
InternLM / GroupedGEMM
Forked from fanshiqing/grouped_gemm
PyTorch bindings for CUTLASS and CUBLAS Grouped GEMM, Permute and Unpermute.
InternLM / AdaptiveGEMM
Forked from deepseek-ai/DeepGEMM
AdaptiveGEMM: FP8 GEMM with Adaptation to Various Lengths of Group M
A debugging and profiling tool that can trace and visualize python code execution
How to optimize some algorithms in CUDA.
Reference PyTorch implementation and models for DINOv3
Implementation of FP8/INT8 rollout for RL training without performance drop.
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Scalable toolkit for efficient model reinforcement
(Best/better) practices for Megatron on veRL, with a tuning guide
SkyRL: A Modular Full-stack RL Library for LLMs
siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.