lengrongfu

Organizations

@kubernetes @Project-HAMi

Starred repositories

agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.

Go 513 71 Updated Dec 19, 2025

A framework for efficient model inference with omni-modality models

Python 1,228 163 Updated Dec 22, 2025

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,133 106 Updated Dec 22, 2025

Perplexity open source garden for inference technology

Rust 307 25 Updated Dec 9, 2025

My learning notes for ML SYS.

Python 4,746 300 Updated Dec 22, 2025

Intelligent Router for Mixture-of-Models

Go 2,535 351 Updated Dec 22, 2025

💖🧸 Self hosted, you owned Grok Companion, a container of souls of waifu, cyber livings to bring them into our worlds, wishing to achieve Neuro-sama's altitude. Capable of realtime voice chat, Minec…

Vue 16,168 1,501 Updated Dec 22, 2025

Supercharge Your LLM with the Fastest KV Cache Layer

Python 6,396 805 Updated Dec 22, 2025

Go 6 5 Updated Dec 10, 2025

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 2,049 342 Updated Dec 20, 2025

Smarty 8 Updated May 26, 2025

Go 45 6 Updated Dec 8, 2025

Cost-efficient and pluggable Infrastructure components for GenAI inference

Go 4,478 501 Updated Dec 13, 2025

Fast OS-level support for GPU checkpoint and restore

C++ 263 28 Updated Sep 28, 2025

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 16,238 1,190 Updated Dec 22, 2025

Intercept gRPC traffic of containerd with eBPF

Go 2 Updated Jan 23, 2024

This project is designed to simulate GPU information, making it easier to test scenarios where a GPU is not available.

C++ 59 4 Updated Mar 5, 2025

AIOS: AI Agent Operating System

Python 4,880 643 Updated Nov 24, 2025

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!

Go 279 44 Updated Dec 15, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 27,826 2,573 Updated Dec 19, 2025

DRANET is a Kubernetes Network Driver that uses Dynamic Resource Allocation (DRA) to deliver high-performance networking for demanding applications in Kubernetes.

Go 159 24 Updated Dec 9, 2025

Heterogeneous AI Computing Virtualization Middleware (Project under CNCF)

Go 2,785 434 Updated Dec 22, 2025

Visualize your multi-stage Dockerfiles

Go 249 16 Updated Dec 18, 2025

Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

Python 716 63 Updated Jan 7, 2024

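❤️ An AI pair-programming tool made with love: 🐌 Guii Devtool integrates easily into existing front-end projects and lets you customize and optimize code through natural-language instructions. We don't replace creators or hackers; we only want to be a close companion at their desk, ✨ building great products together.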

212 Updated Jul 30, 2024

groupcache is a caching and cache-filling library, intended as a replacement for memcached in many cases.

Go 13,292 1,396 Updated Nov 29, 2024

Unlock Unlimited Potential! Share Your GPU Power Across Your Local Network!

Go 72 3 Updated May 22, 2025

This is the Rust course used by the Android team at Google. It provides you the material to quickly teach Rust.

Rust 32,419 1,943 Updated Dec 5, 2025

LLM inference in C/C++

C++ 91,803 14,185 Updated Dec 22, 2025

Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.

Go 158,033 13,981 Updated Dec 21, 2025