Quantized LLM training in pure CUDA/C++.
Updated Nov 11, 2025 · C++
QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
FakeQuantize with Learned Step Size (LSQ+) as Observer in PyTorch
Model Quantization with PyTorch, TensorFlow & Larq
Supports fixed-posit quantised training, inference, and fine-tuning of neural networks (PyTorch-based) using highly optimised floating-point multiplication on the GPU