Weili17

Starred repositories

Use Garry Tan's exact Claude Code setup: 15 opinionated tools that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA

TypeScript 53,838 6,970 Updated Mar 28, 2026

Linux kernel source tree

C 225,458 61,280 Updated Mar 28, 2026

Offline optimization of your disaggregated Dynamo graph

Python 232 84 Updated Mar 28, 2026

Machine Learning Engineering Open Book

Python 17,562 1,115 Updated Mar 16, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 3,805 527 Updated Mar 13, 2026

Nano vLLM

Python 12,478 1,801 Updated Nov 3, 2025

Jupyter Notebook 602 27 Updated Aug 23, 2024

High Performance LLM Inference Operator Library

C++ 801 74 Updated Feb 5, 2026

[ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding

Python 145 12 Updated Dec 4, 2024

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 4,934 437 Updated Mar 28, 2026

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 3,022 253 Updated Mar 28, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 339,381 66,829 Updated Mar 28, 2026

A hyperparameter optimization framework

Python 13,778 1,290 Updated Mar 27, 2026

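Sharing AI Infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more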

Jupyter Notebook 1,401 109 Updated Mar 28, 2026

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 2,255 319 Updated Mar 28, 2026

Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

Python 558 74 Updated Mar 26, 2026

A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM

Python 306 57 Updated Mar 27, 2026

Provides pre-built flash-attention package wheels for Linux and Windows platforms using GitHub Actions

Python 1,163 60 Updated Mar 28, 2026

Extending eBPF Programmability and Observability to GPUs (merged into https://github.com/eunomia-bpf/bpftime)

C++ 296 13 Updated Nov 24, 2025

eBPF Developer Tutorial: Learning eBPF Step by Step with Examples

C 4,010 571 Updated Mar 11, 2026

ebpf-go is a pure-Go library to read, modify and load eBPF programs and attach them to various hooks in the Linux kernel.

Go 7,619 841 Updated Mar 26, 2026

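Under the conventional paradigm of jointly optimizing recommender-system algorithms and systems, front-line companies have nearly exhausted the gains available from any single task or single business line. Starting in 2019 we turned our attention to extracting and fusing multiple kinds of information and proposed the OneRec algorithm, which aims to integrate knowledge from a wide range of platform and external information, break down data silos, and greatly expand the "Extra World Knowledge" available to recommendation. Algorithms implemented so far cover behavioral data, content descriptions, social information, knowledge graphs, and more. In OneRec, the integration of each kind of information with the overall algorithm is pluggable…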

Python 229 32 Updated Jan 13, 2026

[Pytorch] Generative retrieval model using semantic IDs from "Recommender Systems with Generative Retrieval"

Python 759 106 Updated Sep 22, 2025

CMake 44 66 Updated Mar 25, 2026

Distributed Compiler based on Triton for Parallel Systems

Python 1,398 135 Updated Mar 11, 2026

Unified Collective Communication Library

C 298 127 Updated Mar 25, 2026

Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 690 40 Updated Mar 8, 2026