Starred repositories
35 stars written in Python
A powerful toolkit for compressing large models, including LLMs, VLMs, and video generation models.
A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…