National Yang Ming Chiao Tung University, Taiwan
Stars
Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022.
QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing
Caffe implementation of accurate low-precision neural networks
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
This repository implements the paper "Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations"
Simple PYNQ KV260 tutorial: Porting C-based design into FPGA via Xilinx HLS
PYNQ-Torch: a framework to develop PyTorch accelerators on the PYNQ platform
AMD University Program HLS tutorial
Quantization of Convolutional Neural Networks.
Papers and code on quantized networks, collected for easier survey and reference.
A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility
micronet, a model compression and deployment library. Compression: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-…
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it
A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Official PyTorch implementation of Relational Knowledge Distillation, CVPR 2019
This repository provides an FPGA-based solution for executing object detection, focusing specifically on the popular YOLOv5 model architecture.
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
PyTorch implementation of Towards Efficient Training for Neural Network Quantization
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (ICLR 2021)
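Several of the entries above (the nonuniform-to-uniform quantization paper, DoReFa-style QAT in micronet, LQ-Nets, BSQ) train networks through a non-differentiable rounding step via the straight-through estimator (STE). A minimal sketch of the common pattern, assuming plain uniform unsigned quantization to a fixed range; the function names and the fixed clipping range are illustrative, not taken from any of the listed repositories:

```python
def fake_quantize(x, bits=4, x_max=1.0):
    """Quantize a scalar x uniformly to `bits` bits on [0, x_max], then
    dequantize back to a float (so-called 'fake' or simulated quantization)."""
    levels = (1 << bits) - 1          # number of quantization steps
    scale = x_max / levels            # step size
    x_clipped = min(max(x, 0.0), x_max)
    return round(x_clipped / scale) * scale

def ste_grad(x, x_max=1.0):
    """Straight-through estimator: backprop treats rounding as identity,
    so the gradient w.r.t. x is just the clipping mask."""
    return 1.0 if 0.0 <= x <= x_max else 0.0
```

In a real QAT pipeline this pair is what a framework's `round` with a custom backward implements: the forward pass sees quantized values, while gradients flow through the rounding unchanged (zeroed only where the input was clipped).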