  • Tencent
  • Shenzhen, China

Python 144 19 Updated Oct 9, 2024

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,215 268 Updated Dec 19, 2025

Kubernetes AI Toolchain Operator

Go 849 149 Updated Dec 23, 2025

High Performance KV Cache Store for LLM

C 43 4 Updated Nov 27, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 451 34 Updated May 30, 2025

Public repository for Agent Skills

Python 26,262 2,429 Updated Dec 20, 2025

Persist and reuse KV Cache to speed up your LLM.

Python 219 54 Updated Dec 24, 2025

DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit

C++ 86 7 Updated Dec 23, 2025

Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

Python 726 73 Updated Nov 30, 2025

Ongoing research training transformer models at scale

Python 14,692 3,406 Updated Dec 24, 2025

HuggingFace conversion and training library for Megatron-based models

Python 306 109 Updated Dec 24, 2025

eTran: Extensible Kernel Transport with eBPF

C 37 3 Updated Apr 28, 2025

TeRM: Extending RDMA-Attached Memory with SSD [FAST'24]

C++ 45 2 Updated Oct 21, 2024

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 871 72 Updated Dec 23, 2025

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,300 180 Updated Dec 17, 2025

Python 134 23 Updated Dec 24, 2025

A simple software update checking service.

Rust 1 Updated Aug 23, 2025

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,142 106 Updated Dec 24, 2025

AI-related notes

2,344 243 Updated Dec 2, 2025

FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI…

101,974 27,154 Updated Dec 19, 2025

Multi-agent LLM-based Chinese financial trading framework - TradingAgents Chinese enhanced edition

Python 14,105 3,101 Updated Nov 24, 2025

Over 50% faster on average than the traditional memcpy in gcc 4.9 or vc2012

C 638 153 Updated Apr 7, 2024

This is a plugin that lets EC2 developers use libfabric as the network provider while running NCCL applications.

C++ 202 83 Updated Dec 20, 2025

Efficient GPU communication over multiple NICs.

C++ 21 4 Updated Nov 20, 2025

Documentation of NVIDIA chip/hardware interfaces

C 1,317 98 Updated Aug 18, 2025

[NSDI25] AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training

C++ 29 3 Updated May 2, 2025

Infiniband Verbs Performance Tests

C 890 366 Updated Dec 14, 2025

AI-based command line tool to quickly generate standardized commit messages.

Rust 5 2 Updated Dec 15, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 21,945 3,856 Updated Dec 24, 2025