lambda7xx

Xiao lambda7xx

A journey of a thousand miles begins with a single step. Build Systems. Think AI in System, Think System in AI.

346 followers · 1.3k following

Shanghai Jiao Tong University
Shanghai

Achievements

x2 x2

Lists (2)

LLM Serving

5 repositories

🌟mlsys

Stars

xiaoxuanNLP / GoLongRL

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

Python 47 Updated May 23, 2026

modelcorp / axon

Python 6 2 Updated Jun 15, 2026

inclusionAI / humming

Python 138 18 Updated Jun 10, 2026

inclusionAI / DR-Venus

Python 88 11 Updated May 8, 2026

JustinTong0323 / sgl-eval

Forked from sgl-project/sgl-eval

Python 1 Updated Jun 14, 2026

pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Python 3,462 459 Updated Jun 15, 2026

jshn9515 / deep-learning-notes

Personal deep learning study notes and tutorial-style notebooks

Python 467 19 Updated Jun 15, 2026

WingEdge777 / vitamin-cuda

🍎 One kernel a day keeps high latency away. A hands-on CUDA learning path featuring a rich collection of kernels, from the basics to peak performance, seamlessly integrated as PyTorch C++ extensions.

Cuda 146 8 Updated Jun 13, 2026

aipoch / medical-research-skills

Hundreds of agent skills for medical research, including protocol design, data analysis, evidence insights, and academic writing.

Python 1,140 78 Updated Jun 15, 2026

Infini-AI-Lab / Sparrow

Python 11 Updated Jun 10, 2026

Infini-AI-Lab / vortex_torch

Vortex: Programmable Sparse Attention for Agents as Algorithm Designers

Python 60 7 Updated Jun 8, 2026

scitix / InstantTensor

An ultra-fast, distributed Safetensors loader

C++ 61 8 Updated May 27, 2026

FareedKhan-dev / train-llm-from-scratch

A straightforward method for training your LLM, from downloading data to generating text.

Python 6,219 844 Updated Jun 15, 2026

mlsyscourse / assignment-distributed-training

Python 13 15 Updated Feb 15, 2026

jiazhihao / agentic-compiler

An Agentic Compiler for CUDA

9 Updated May 17, 2026

thinkwee / AgentsMeetRL

Awesome List for Agentic RL

HTML 1,573 61 Updated May 26, 2026

Tencent-Hunyuan / UniRL

UniRL is a Framework for Unified Multimodal Model Reinforcement Learning

Python 615 32 Updated Jun 15, 2026

NetX-lab / Frontier

Frontier: A Discrete-Event Simulator for Modern LLM Serving

Python 28 4 Updated Jun 14, 2026

verl-project / uni-agent

A unified framework for building, running, and training general agents at scale.

Python 340 44 Updated Jun 15, 2026

mvanhorn / last30days-skill

AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary

Python 42,782 3,486 Updated Jun 10, 2026

zhen8838 / handson-polyhedral

tutorials about polyhedral compilation.

Jupyter Notebook 65 10 Updated Jun 6, 2026

thustorage / RoundPipe

Large DNNs training framework for consumer GPUs

Python 88 15 Updated Jun 1, 2026

Aravind0403 / clairvoyant-scheduler

Go sidecar proxy that eliminates Head-of-Line Blocking in LLM inference via ML-driven SJF scheduling — zero backend modification. Paper in preparation

Python 1 Updated Jun 5, 2026

excalidraw / excalidraw

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 125,401 14,030 Updated Jun 15, 2026

foundry-org / foundry

Foundry materializes CUDA graphs along with its execution context to disk to support fast cold start of serving engines.

C++ 36 3 Updated Jun 15, 2026

vdcores / vdcores

Virtual Decoupled Cores: Composable Programming Framework and Runtime for Async GPUs

Python 17 5 Updated Jun 10, 2026

blitz-serving / blitz-router

BlitzScale Router - Distributed LLM Inference Router (Rust)

Rust 3 1 Updated May 25, 2026

hk011 / yanxi-paper-note

AI breaks down papers so that everyone can understand frontier research

TypeScript 15 Updated Jun 9, 2026

huangyibo / SwiftRDMA

SwiftRDMA -- Exposing RDMA NIC Resources for Software-Defined RDMA Scheduling

C++ 19 Updated Jun 9, 2026

RL-Align / RL-Kernel

Modern RL Post-training Infrastructure: Optimized for NVIDIA/AMD GPUs with a focus on vLLM and DeepSpeed integration, CUDA/ROCm/Triton kernels, and transparent hardware-aware scaling.

Python 124 22 Updated Jun 15, 2026