int8

Here are 66 public repositories matching this topic...

intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

sparsity pruning quantization knowledge-distillation auto-tuning int8 low-precision quantization-aware-training post-training-quantization awq int4 large-language-models gptq smoothquant sparsegpt fp4 mxformat

Updated Apr 3, 2026
Python

intel / neural-speed

An innovative library for efficient LLM inference via low-bit quantization

Updated Aug 30, 2024
C++

clancylian / retinaface

Reimplementation of RetinaFace using C++ and TensorRT

caffe tensorrt int8 retinaface mxnet2caffe

Updated Dec 4, 2019
C++

Wulingtian / yolov5_tensorrt_int8_tools

TensorRT INT8 quantization of YOLOv5 ONNX models

tensorrt int8 onnx yolov5

Updated Apr 23, 2021
Python

Wulingtian / yolov5_tensorrt_int8

TensorRT INT8 quantized deployment of the YOLOv5s model; measured at 3.3 ms per frame!

tensorrt int8 yolov5

Updated Apr 23, 2021
C++

xuanandsix / Tensorrt-int8-quantization-pipline

A simple pipeline for INT8 quantization based on TensorRT.

quantization tensorrt int8 yolox classifaction

Updated Oct 14, 2022
Python

Wulingtian / RepVGG_TensorRT_int8

RepVGG TensorRT INT8 quantization; measured inference under 1 ms per frame!

tensorrt int8 repvgg

Updated Apr 23, 2021
Python

the0807 / YOLOv8-ONNX-TensorRT

👀 Apply YOLOv8 exported to ONNX or TensorRT (FP16, INT8) to a real-time camera feed

computer-vision object-detection fp16 tensorrt int8 onnx yolov8

Updated May 23, 2024
Python

sylvesterkaczmarek / phisat2-trustworthy-onboard-ai

Trustworthy onboard satellite AI in PyTorch→ONNX→INT8 with calibration, telemetry, and a PhiSat-2 EO tile-filter demo.

space telemetry calibration esa satellites cubesat quantization earth-observation int8 onnx edge-ai onnxruntime quantization-efficient-network satellite-security onboard-ai phisat-2 phisat2

Updated Nov 10, 2025
Python

EricRollei / Comfy_HunyuanImage3

Nodes to run Hunyuan Image 3 locally with BF16 and NF4 quantized options in ComfyUI

Updated Feb 21, 2026
Python

Wulingtian / nanodet_tensorrt_int8

NanoDet INT8 quantization; measured inference at 2 ms per frame!

tensorrt int8 nanodet

Updated Apr 23, 2021
C++

ppogg / ncnn-yolov4-int8

NCNN + INT8 + YOLOv4: model quantization and real-time inference

real-time int8 ncnn yolov4

Updated Aug 24, 2021
C++

BoumedineBillal / yolo26n_esp

World's First NMS-Free YOLOv26n on ESP32-P4. Features end-to-end Int8 QAT and custom C++ optimizations achieving 30% faster inference than the official ESP-DL YOLOv11n (1.7s vs 2.4s).

computer-vision esp32 simd yolo object-detection quantization esp-idf risc-v int8 qat onnx embedded-ai edge-ai tinyml ultralytics esp-dl esp32-p4 graph-surgery nms-free

Updated Feb 4, 2026
Jupyter Notebook

Egorundel / int8_calibrator_cpp

INT8 calibrator for ONNX models with a dynamic batch_size at the input and an NMS module at the output. C++ implementation.

cpp calibration tensorrt int8 onnx

Updated Oct 15, 2024
C++

aahouzi / llama2-chatbot-cpu

A LLaMA2-7b chatbot with memory, running on CPU and optimized using smooth quantization, 4-bit quantization, or Intel® Extension for PyTorch with bfloat16.

Updated Feb 27, 2024
Python

whitelok / tensorrt-int8-python-sample

TensorRT INT8 Python sample.

python machine-learning ai deep-learning inference nvidia tensorrt int8 int8-inference tensorrt-int8-python

Updated Jan 28, 2019
Python

cbalint13 / rvv-kernels

RISC-V Vector kernel C/LLVM-IR generator

kernel math vector llvm riscv int8 tvm rvv

Updated Oct 8, 2025
C

dasdristanta13 / LLM-Lora-PEFT_accumulate

LLM-Lora-PEFT_accumulate explores optimizations for Large Language Models (LLMs) using PEFT, LoRA, and QLoRA. Contribute experiments and implementations to enhance LLM efficiency. Join discussions and push the boundaries of LLM optimization. Let's make LLMs more efficient together!

falcon llama lora alpaca int8 peft llm qlora bitsandbytes

Updated Jun 16, 2023
Jupyter Notebook

egbertYeah / mt-yolov6_tensorrt

MT-Yolov6 TensorRT Inference with Python.

tensorrt int8 yolov6

Updated Jul 2, 2022
Python

umitkacar / onnx-tensorrt-optimization

40x faster AI inference: ONNX to TensorRT optimization with FP16/INT8 quantization, multi-GPU support, and deployment

Updated Nov 14, 2025
Python