lengrongfu

rongfu.leng lengrongfu

63 followers · 122 following

Achievements

x3 x3

Organizations

Lists (1)

🔮 Future ideas

11 repositories

Starred repositories

kubernetes-sigs / agent-sandbox

agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.

Go 513 71 Updated Dec 19, 2025

vllm-project / vllm-omni

A framework for efficient model inference with omni-modality models

Python 1,228 163 Updated Dec 22, 2025

uccl-project / uccl

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,133 106 Updated Dec 22, 2025

perplexityai / pplx-garden

Perplexity open source garden for inference technology

Rust 307 25 Updated Dec 9, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes for ML SYS.

Python 4,746 300 Updated Dec 22, 2025

vllm-project / semantic-router

Intelligent Router for Mixture-of-Models

Go 2,535 351 Updated Dec 22, 2025

moeru-ai / airi

💖🧸 Self hosted, you owned Grok Companion, a container of souls of waifu, cyber livings to bring them into our worlds, wishing to achieve Neuro-sama's altitude. Capable of realtime voice chat, Minec…

Vue 16,168 1,501 Updated Dec 22, 2025

LMCache / LMCache

Supercharge Your LLM with the Fastest KV Cache Layer

Python 6,396 805 Updated Dec 22, 2025

BaizeAI / kube-snapshot

Go 6 5 Updated Dec 10, 2025

vllm-project / production-stack

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 2,049 342 Updated Dec 20, 2025

DaoCloud / dify-chart

Smarty 8 Updated May 26, 2025

OpenCIDN / ocimirror

Go 45 6 Updated Dec 8, 2025

vllm-project / aibrix

Cost-efficient and pluggable Infrastructure components for GenAI inference

Go 4,478 501 Updated Dec 13, 2025

SJTU-IPADS / PhoenixOS

Fast OS-level support for GPU checkpoint and restore

C++ 263 28 Updated Sep 28, 2025

kvcache-ai / ktransformers

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 16,238 1,190 Updated Dec 22, 2025

daisyfbk / containerdsnoop

Intercept gRPC traffic of containerd with eBPF

Go 2 Updated Jan 23, 2024

chaunceyjiang / fake-gpu

This project is designed to simulate GPU information, making it easier to test scenarios where a GPU is not available.

C++ 59 4 Updated Mar 5, 2025

agiresearch / AIOS

AIOS: AI Agent Operating System

Python 4,880 643 Updated Nov 24, 2025

InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!

Go 279 44 Updated Dec 15, 2025

Genesis-Embodied-AI / Genesis

A generative world for general-purpose robotics & embodied AI learning.

Python 27,826 2,573 Updated Dec 19, 2025

google / dranet

DRANET is a Kubernetes Network Driver that uses Dynamic Resource Allocation (DRA) to deliver high-performance networking for demanding applications in Kubernetes.

Go 159 24 Updated Dec 9, 2025