ptq
Here are 18 public repositories matching this topic...
Generating a TensorRT model from an ONNX model
Updated Jun 22, 2023 - C++
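As a rough illustration of the ONNX-to-TensorRT workflow such a repository typically automates (this is a minimal sketch, not code from the project; the file paths and the precision flag are assumptions):

```python
# Minimal sketch: build a TensorRT engine from an ONNX file.
# Assumes TensorRT's Python bindings are installed; "model.onnx" and
# "model.engine" are placeholder paths.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # INT8 PTQ would also need a calibrator

engine = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine)
```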
Build an AI model to classify beverages for blind individuals
Updated Aug 16, 2023 - Python
A post-training quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784
Updated Feb 21, 2024 - Python
EfficientNetV2 (EfficientNetV2-B2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling the imbalanced dataset, etc.
Updated May 4, 2024 - Jupyter Notebook
A more readable and flexible YOLOv5 with additional backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin Transformer, etc.), extra modules (CBAM, DCN, and so on), and TensorRT support
Updated Aug 19, 2024 - Python
mi-optimize is a versatile tool for the quantization and evaluation of large language models (LLMs). It integrates a range of quantization methods and evaluation techniques, letting users tailor their approach to specific requirements and constraints.
Updated Nov 28, 2024 - Python
Quantization of models: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT)
Updated Dec 29, 2024 - Jupyter Notebook
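For readers new to the PTQ/QAT distinction the entry above draws, a minimal eager-mode post-training static quantization sketch in stock PyTorch looks roughly like the following (the toy model, layer sizes, and calibration data are illustrative assumptions, not code from the repository); QAT follows the same prepare/convert flow but uses `prepare_qat` around a training loop:

```python
import torch
import torch.nn as nn

# Toy float model; QuantStub/DeQuantStub mark where tensors enter and
# leave the quantized region.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.fc = nn.Linear(16, 4)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.fc(x))
        return self.dequant(x)

model = TinyNet().eval()
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
prepared = torch.ao.quantization.prepare(model)   # insert observers

# Calibration: run a few representative batches so observers collect ranges.
for _ in range(8):
    prepared(torch.randn(32, 16))

quantized = torch.ao.quantization.convert(prepared)  # INT8 weights/activations
print(quantized(torch.randn(1, 16)))
```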
[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Updated May 4, 2025 - Python
[ICML 2025] Fast and Low-Cost Genomic Foundation Models via Outlier Removal.
Updated Jun 19, 2025 - Python
A lightweight quantization module for PyTorch models.
Updated Sep 20, 2025 - Python
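For comparison with lightweight wrappers like the one above, the simplest built-in route in PyTorch itself is dynamic post-training quantization; the model and shapes below are placeholders used only to illustrate the call:

```python
import torch
import torch.nn as nn

# Dynamic PTQ: Linear weights are stored in INT8, activations are
# quantized on the fly at inference time.
float_model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
).eval()

quantized_model = torch.ao.quantization.quantize_dynamic(
    float_model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 128)
print(quantized_model(x).shape)  # same interface as the float model
```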
AURA: Augmented Representation for Unified Accuracy-aware Quantization
Updated Sep 24, 2025 - Cuda
🎯 Fine-tune large language models and use them for text-related tasks. This repository provides a straightforward approach to fine-tuning models like Gemma, Llama 🦙, and Mistral 🌪️ for various NLP tasks. 🔧 It includes training 📚, fine-tuning 🛠️, and inference pipelines ⚙️. 🚀
Updated Nov 28, 2025 - Jupyter Notebook
An enhanced OCR text-recognition web application built on major AI models, supporting model export for deployment, model quantization, model pruning, hyperparameter search, and visual debugging and analysis...
Updated Feb 1, 2026 - Python
Brevitas: neural network quantization in PyTorch
Updated Feb 3, 2026 - Python
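To give a flavour of quantization-aware model definition with Brevitas layers (a sketch only; the bit widths, layer sizes, and the MLP itself are illustrative assumptions rather than code from the project):

```python
import torch
from torch import nn
import brevitas.nn as qnn

# A small MLP whose weights and activations are fake-quantized to 4 bits
# during training; the bit widths here are arbitrary examples.
class QuantMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = qnn.QuantLinear(784, 256, bias=True, weight_bit_width=4)
        self.act1 = qnn.QuantReLU(bit_width=4)
        self.fc2 = qnn.QuantLinear(256, 10, bias=True, weight_bit_width=4)

    def forward(self, x):
        return self.fc2(self.act1(self.fc1(x)))

model = QuantMLP()
print(model(torch.randn(2, 784)).shape)
```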
Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
Updated Feb 6, 2026 - Python