Works at Tencent-WXG, focusing on model inference optimization, including inference engines and model compression.
- Shanghai
Stars
A high-throughput and memory-efficient inference and serving engine for LLMs