ptq
Here are 18 public repositories matching this topic...
Generating a TensorRT model from an ONNX model
Updated Jun 22, 2023 - C++
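As a rough illustration of the ONNX-to-TensorRT workflow such a repository typically automates (this is a minimal sketch, not code from the project; the file paths and the precision flag are assumptions):

```python
# Minimal sketch: build a TensorRT engine from an ONNX file.
# Assumes TensorRT's Python bindings are installed; "model.onnx" and
# "model.engine" are placeholder paths.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # INT8 PTQ would also need a calibrator

engine = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine)
```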
Build an AI model to classify beverages for blind individuals
Updated Aug 16, 2023 - Python
A post-training quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784
Updated Feb 21, 2024 - Python
EfficientNetV2 (EfficientNetV2-B2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling the imbalanced dataset, etc.
Updated May 4, 2024 - Jupyter Notebook
A more readable and flexible YOLOv5 with additional backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin Transformer, etc.), extra modules (CBAM, DCN, and so on), and TensorRT support
Updated Aug 19, 2024 - Python
mi-optimize is a versatile tool for the quantization and evaluation of large language models (LLMs). It integrates a range of quantization methods and evaluation techniques, letting users tailor their approach to specific requirements and constraints.
Updated Nov 28, 2024 - Python
Quantization of models: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT)
Updated Dec 29, 2024 - Jupyter Notebook
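For readers new to the PTQ/QAT distinction the entry above draws, a minimal eager-mode post-training static quantization sketch in stock PyTorch looks roughly like the following (the toy model, layer sizes, and calibration data are illustrative assumptions, not code from the repository); QAT follows the same prepare/convert flow but uses `prepare_qat` around a training loop:

```python
import torch
import torch.nn as nn

# Toy float model; QuantStub/DeQuantStub mark where tensors enter and
# leave the quantized region.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.fc = nn.Linear(16, 4)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.fc(x))
        return self.dequant(x)

model = TinyNet().eval()
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
prepared = torch.ao.quantization.prepare(model)   # insert observers

# Calibration: run a few representative batches so observers collect ranges.
for _ in range(8):
    prepared(torch.randn(32, 16))

quantized = torch.ao.quantization.convert(prepared)  # INT8 weights/activations
print(quantized(torch.randn(1, 16)))
```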
[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Updated May 4, 2025 - Python
[ICML 2025] Fast and Low-Cost Genomic Foundation Models via Outlier Removal.
Updated Jun 19, 2025 - Python
A lightweight quantization module for PyTorch models.
Updated Sep 20, 2025 - Python
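For comparison with lightweight wrappers like the one above, the simplest built-in route in PyTorch itself is dynamic post-training quantization; the model and shapes below are placeholders used only to illustrate the call:

```python
import torch
import torch.nn as nn

# Dynamic PTQ: Linear weights are stored in INT8, activations are
# quantized on the fly at inference time.
float_model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
).eval()

quantized_model = torch.ao.quantization.quantize_dynamic(
    float_model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 128)
print(quantized_model(x).shape)  # same interface as the float model
```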
AURA: Augmented Representation for Unified Accuracy-aware Quantization
Updated Sep 24, 2025 - Cuda
🎯 Fine-tune large language models and use them for text-related tasks. This repository provides a straightforward approach to fine-tuning models like Gemma, Llama 🦙, and Mistral 🌪️ for various NLP tasks. 🔧 It includes training 📚, fine-tuning 🛠️, and inference pipelines ⚙️. 🚀
Updated Nov 28, 2025 - Jupyter Notebook
An enhanced OCR text-recognition web application built on major AI models, supporting model export for deployment, model quantization, model pruning, hyperparameter search, and visual debugging and analysis...
Updated Feb 1, 2026 - Python
Brevitas: neural network quantization in PyTorch
Updated Feb 3, 2026 - Python
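To give a flavour of quantization-aware model definition with Brevitas layers (a sketch only; the bit widths, layer sizes, and the MLP itself are illustrative assumptions rather than code from the project):

```python
import torch
from torch import nn
import brevitas.nn as qnn

# A small MLP whose weights and activations are fake-quantized to 4 bits
# during training; the bit widths here are arbitrary examples.
class QuantMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = qnn.QuantLinear(784, 256, bias=True, weight_bit_width=4)
        self.act1 = qnn.QuantReLU(bit_width=4)
        self.fc2 = qnn.QuantLinear(256, 10, bias=True, weight_bit_width=4)

    def forward(self, x):
        return self.fc2(self.act1(self.fc1(x)))

model = QuantMLP()
print(model(torch.randn(2, 784)).shape)
```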
Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
Updated Feb 6, 2026 - Python