Starred repositories

1066 results for source starred repositories

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 162,760 25,565 Updated Feb 4, 2026
Python 149 22 Updated Oct 9, 2024

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,204 118 Updated Feb 4, 2026

Supercharge Your LLM with the Fastest KV Cache Layer

Python 6,839 886 Updated Feb 4, 2026

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

Python 3,306 226 Updated Jan 29, 2026

Perplexity open source garden for inference technology

Rust 358 29 Updated Dec 25, 2025
Jupyter Notebook 395 79 Updated Feb 3, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,689 545 Updated Feb 4, 2026

Tile primitives for speedy kernels

Cuda 3,119 234 Updated Feb 4, 2026

Official inference repo for FLUX.1 models

Python 25,187 1,851 Updated Jul 31, 2025

An early research stage expert-parallel load balancer for MoE models based on linear programming.

Python 495 33 Updated Nov 19, 2025
Python 1 2 Updated Jan 22, 2025

A guidance language for controlling large language models.

Jupyter Notebook 21,254 1,145 Updated Feb 4, 2026
Jupyter Notebook 21 3 Updated Sep 26, 2025

Disaggregated serving system for Large Language Models (LLMs).

Jupyter Notebook 772 85 Updated Apr 6, 2025

High performance Transformer implementation in C++.

C++ 151 17 Updated Jan 18, 2025

DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit

C++ 92 8 Updated Jan 26, 2026

Contexts Optical Compression

Python 22,381 2,055 Updated Jan 27, 2026

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 461 81 Updated Feb 4, 2026

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 945 46 Updated Oct 29, 2025

Venus Collective Communication Library, supported by SII and Infrawaves.

C++ 137 7 Updated Feb 4, 2026

Accepted to MLSys 2026

Python 70 6 Updated Jan 29, 2026
C++ 17 5 Updated Sep 10, 2025

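A deep dive into new Linux kernel features, represented by io_uring, cgroup, ebpf, and llvm, including open-source projects, code examples, articles, videos, architecture mind maps, and more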

C 1,881 286 Updated May 20, 2024

Seamless operability between C++11 and Python

C++ 17,694 2,265 Updated Feb 3, 2026

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,573 651 Updated Feb 4, 2026

An annotated nano_vllm repository that also adds MiniCPM4 support and the ability to register new models

Python 155 28 Updated Aug 11, 2025

Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…

Python 36,503 5,685 Updated Feb 4, 2026

ArcticInference: vLLM plugin for high-throughput, low-latency inference

Python 385 46 Updated Feb 2, 2026

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

Python 5,014 434 Updated Feb 4, 2026