Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
A generative world for general-purpose robotics & embodied AI learning.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
[Lumina Embodied AI] Embodied AI Technical Guide (Embodied-AI-Guide)
High-speed Large Language Model Serving for Local Deployment
A debugging and profiling tool that can trace and visualize Python code execution
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Supercharge Your LLM with the Fastest KV Cache Layer
Cost-efficient and pluggable Infrastructure components for GenAI inference
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, code, and related websites
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
The official Python client for the Hugging Face Hub.
Collection of AWESOME vision-language models for vision tasks
A collection of modern C++ libraries, including coro_http, coro_rpc, compile-time reflection, struct_pack, struct_json, struct_xml, struct_pb, easylog, async_simple, etc.
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥
Library to read/write .npy and .npz files in C/C++
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.