Stars
From-scratch PyTorch implementation of Google's TurboQuant (ICLR 2026) for LLM KV cache compression. 5x compression at 3-bit with 99.5% attention fidelity.
Single Channel Speech Enhancement Methods and Toolbox
Room Impulse Response reconstruction with Physics Informed Neural Networks
Audio-text dataset for PyTorch training, based on WebDataset.
Efficient Inference of Transformer models
Official PyTorch implementation of 'Rec-RIR: Monaural Blind Room Impulse Response Identification via DNN-based Reverberant Speech Reconstruction in STFT Domain'
Precision Alignment, Infinite Possibilities
🔥 Clone and recreate any website as a modern React app in seconds
Official inference framework for 1-bit LLMs
CUDA Templates and Python DSLs for High-Performance Linear Algebra
Accessible large language models via k-bit quantization for PyTorch.
BitBLAS is a library for mixed-precision matrix multiplication, especially for quantized LLM deployment.