Stars
A tool for running and customizing real-time, interactive generative AI pipelines and models
TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration
GammaOS Android Firmware Distribution
Android Messages as a Cross-platform Desktop App
Patches needed to build VMware (Player and Workstation) host modules against recent kernels
Dynamically edit AMD Ryzen processor P-States