Organizations

@istio @k8smeetup @servicemesher @cloudnativeto @aeraki-mesh @YuCloudNative

Starred repositories

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,001 578 Updated Mar 13, 2026

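This project shares the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).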

HTML 24,016 2,763 Updated Mar 12, 2026

Kubernetes-native AI serving platform for scalable model serving.

Go 305 83 Updated Apr 16, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,564 1,034 Updated Apr 16, 2026

Nano vLLM

Python 12,941 1,941 Updated Apr 13, 2026

Community maintained hardware plugin for vLLM on Ascend

Python 1,935 1,088 Updated Apr 16, 2026

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 3,099 484 Updated Apr 16, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 76,943 15,696 Updated Apr 16, 2026

vLLM Kunlun (vllm-kunlun) is a community-maintained hardware plugin designed to seamlessly run vLLM on the Kunlun XPU.

Python 397 67 Updated Apr 14, 2026

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers.

HTML 13,891 1,371 Updated Apr 30, 2025

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,049 466 Updated Apr 16, 2026

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Go 3,720 620 Updated Apr 16, 2026

Documents and code accompanying Bilibili videos @堂吉诃德拉曼查的英豪

613 74 Updated Aug 16, 2025

AI fundamentals: GPU architecture, CUDA programming, large-model basics, and AI Agent topics

HTML 1,070 167 Updated Apr 16, 2026
Python 127 19 Updated Feb 19, 2026

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,974 286 Updated May 15, 2025

LeaderWorkerSet: An API for deploying a group of pods as a unit of replication

Go 697 144 Updated Apr 15, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 25,918 5,401 Updated Apr 16, 2026

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 70,190 8,592 Updated Apr 12, 2026

KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale

Go 1,237 181 Updated Apr 16, 2026

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

JavaScript 1 Updated Mar 19, 2025

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

Python 132,215 18,769 Updated Apr 16, 2026

Reproductions of large-model algorithms, plus some study notes

Python 3,266 439 Updated Mar 21, 2026

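AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the upper-layer software stack, that supports training and inference for large AI models.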

Jupyter Notebook 6,752 883 Updated Dec 22, 2025

My learning notes for ML SYS.

Python 6,024 394 Updated Apr 8, 2026

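Large-model knowledge that anyone can understand; a must-read before LLM interviews in the spring and autumn campus recruiting seasons, so you can talk confidently with interviewers.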

Jupyter Notebook 6,291 583 Updated Apr 12, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,556 1,007 Updated Apr 7, 2026

Golang Version Manager

Go 2,735 254 Updated Dec 14, 2025