Starred repositories
Development repository for the Triton language and compiler
A high-throughput and memory-efficient inference and serving engine for LLMs
SGLang is a fast serving framework for large language models and vision language models.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
torchcomms: a modern PyTorch communications API
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
🤩 Easy-to-use global IM bot platform designed for the LLM era ⚡️ Bots for QQ / QQ Channel / Discord / LINE / WeChat (including WeCom) / Telegram / Feishu / DingTalk / Slack 🧩 Integrated with ChatGPT(GPT),…
Production-ready platform for agentic workflow development.
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Enjoy the magic of Diffusion models!
🚀 Efficient implementations of state-of-the-art linear attention models
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently and pull request…
🚀🚀 Train a small 26M-parameter GPT completely from scratch in just 2 hours! 🌏
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Real-time face swap and one-click video deepfake with only a single image
Reference PyTorch implementation and models for DINOv3
Diffusion model (SD, Flux, Wan, Qwen Image, ...) inference in pure C/C++
[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation
Faster Whisper transcription with CTranslate2