CTO @ TensorMesh;
Core developer at @LMCache and @vllm-project
- TensorMesh
- United States
- https://apostac.github.io/about.html
- in/yihuacheng-215133327
Stars: 7 (filtered: written in Python)
- A high-throughput and memory-efficient inference and serving engine for LLMs
- Supercharge Your LLM with the Fastest KV Cache Layer
- vLLM's reference system for K8s-native cluster-wide deployment with community-driven performance optimization