Starred repositories
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Official inference repo for FLUX.1 models
An early research stage expert-parallel load balancer for MoE models based on linear programming.
A guidance language for controlling large language models.
Disaggregated serving system for Large Language Models (LLMs).
High performance Transformer implementation in C++.
DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit
MSCCL++: A GPU-driven communication stack for scalable AI applications
A throughput-oriented high-performance serving framework for LLMs
Venus Collective Communication Library, supported by SII and Infrawaves.
Efficient Compute-Communication Overlap for Distributed LLM Inference
A deep dive into new Linux kernel features, exemplified by io_uring, cgroup, eBPF, and LLVM, including open-source projects, code examples, articles, videos, and architecture mind maps
Seamless operability between C++11 and Python
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
An annotated nano_vllm repository, with a completed MiniCPM4 adaptation and support for registering new models
bytedance-iaas / sglang
Forked from sgl-project/sglang: SGLang is a fast serving framework for large language models and vision language models.
preminstrel / vllm
Forked from vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing production. Qlib supports diverse ML modeling paradigms, i…
ArcticInference: vLLM plugin for high-throughput, low-latency inference
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Ring attention implementation with flash attention
Next-generation AI Agent Optimization Platform: Cozeloop addresses challenges in AI agent development by providing full-lifecycle management capabilities from development, debugging, and evaluation…