lbh2001
17 stars written in Python

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 69,907 13,329 Updated Feb 10, 2026

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 41,192 7,198 Updated Feb 10, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 23,457 4,375 Updated Feb 10, 2026

Nano vLLM

Python 11,596 1,548 Updated Nov 3, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,591 653 Updated Feb 9, 2026

My learning notes for ML SYS.

Python 5,307 346 Updated Jan 30, 2026

A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

Python 5,122 440 Updated Feb 9, 2026

FlashInfer: Kernel Library for LLM Serving

Python 4,928 698 Updated Feb 10, 2026

Performance-optimized AI inference on your GPUs. Unlock superior throughput by selecting and tuning engines like vLLM or SGLang.

Python 4,494 459 Updated Feb 9, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 3,408 420 Updated Feb 9, 2026

Subtitles for the CSAPP video course, plus translated slides (PPT) and labs.

Python 2,768 283 Updated May 29, 2024

A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.

Python 2,387 275 Updated Feb 7, 2026

This repository has 1426 stars; try it if you don't believe it.

Python 1,434 37 Updated Sep 13, 2022

Step-by-step analysis of the LevelDB source code

Python 368 73 Updated Nov 11, 2024

Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.

Python 265 48 Updated Feb 4, 2026

[ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.

Python 105 9 Updated Dec 20, 2024

This repo releases the detailed benchmark code and results of Sea Labs AI.

Python 13 1 Updated Jan 3, 2026