- SenseTime
- Beijing, China
Stars
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Light-tts is a lightweight TTS inference framework optimized for CosyVoice2, enabling fast and scalable speech synthesis in Python.
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLM, VLM, and video generation models.
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
Development repository for the Triton language and compiler
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
gushiqiao / Dipoorlet
Forked from ModelTC/Dipoorlet
Offline Quantization Tools for Deploy.
PaddleFormers is an easy-to-use model zoo library of pre-trained large language models, based on PaddlePaddle.