❄️
raising a baby

Organizations

@alpa-projects


A high-throughput and memory-efficient inference and serving engine for LLMs

Python 69,507 13,194 Updated Feb 5, 2026

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,395 4,778 Updated Jun 2, 2025

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,384 591 Updated Oct 28, 2024

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,042 841 Updated Feb 5, 2026

DyNet: The Dynamic Neural Network Toolkit

C++ 3,435 704 Updated Dec 1, 2023

Training and serving large-scale neural networks with auto parallelization.

Python 3,183 359 Updated Dec 9, 2023

shadowsocks.wiki

3,123 516 Updated Apr 22, 2025

A unified inference and post-training framework for accelerated video generation.

Python 3,041 253 Updated Feb 5, 2026

An end-to-end PyTorch framework for image and video classification

Python 1,612 274 Updated Jun 27, 2024

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,316 78 Updated Mar 6, 2025

Resource-adaptive cluster scheduler for deep learning training.

Python 452 81 Updated Mar 5, 2023

[ICML 2024] CLLMs: Consistency Large Language Models

Python 410 21 Updated Nov 16, 2024

Lua 266 64 Updated Jan 26, 2023

Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes

Python 242 22 Updated May 12, 2023

[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning models without training.

Python 219 28 Updated May 31, 2025

PMLS-Caffe: Distributed Deep Learning Framework for Parallel ML System

C++ 193 63 Updated May 10, 2018

Simple Distributed Deep Learning on TensorFlow

Python 134 25 Updated Jun 17, 2025

GPU-specialized parameter server for GPU machine learning.

C++ 102 40 Updated Apr 5, 2018

Automatic Photo Adjustment Using Deep Neural Networks

Python 97 29 Updated Sep 2, 2015

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)

Python 93 17 Updated Jul 14, 2023

d3LLM: Ultra-Fast Diffusion LLM 🚀

Python 87 2 Updated Feb 4, 2026

Jupyter Notebook 84 11 Updated Oct 17, 2025

The source of LMSYS website and blogs

JavaScript 78 67 Updated Feb 4, 2026

[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank

Python 69 16 Updated Nov 4, 2024

[NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning

Python 63 6 Updated Oct 31, 2025

Multi-Turn RL Training System with AgentTrainer for Language Model Game Reinforcement Learning

Python 58 10 Updated Dec 18, 2025

A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and caching.

55 4 Updated Oct 27, 2025

Hyperparameter tuning via uncertainty modeling

Python 49 4 Updated May 3, 2024

(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.

Python 44 7 Updated Nov 4, 2022

Distributed ML Optimizer

Python 35 1 Updated Jul 28, 2021