Stars
A framework for efficient model inference with omni-modality models
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills, subagents, and a message gateway, it handles different levels of…
KV cache store for distributed LLM inference
A fast, clean, responsive Hugo theme.
Cost-efficient and pluggable Infrastructure components for GenAI inference
Universal LLM Deployment Engine with ML Compilation
Analyze the inference of Large Language Models (LLMs), covering aspects like computation, storage, transmission, and the hardware roofline model in a user-friendly interface.
Command-line tool to create and query container image manifest lists/indexes
Reproduction of "Pre-warming is Not Enough" (SoCC'24)
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Predict the performance of LLM inference services
A throughput-oriented high-performance serving framework for LLMs
vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)
Efficient and easy multi-instance LLM serving
A library developed by Volcano Engine for high-performance reading and writing of PyTorch model files.
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Serverless LLM Serving for Everyone.
Fast Distributed Inference Serving for Large Language Models
Custom controller that extends the Horizontal Pod Autoscaler
SpotServe: Serving Generative Large Language Models on Preemptible Instances
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild