Efficient in-memory representation for ONNX, in Python
DA2Lite is an automated model compression toolkit for PyTorch.
This repository covers a complete deep learning application development workflow, using classic handwritten character recognition as the example, built on the LeNet network. Inference is implemented with torch, onnxruntime, and openvino 💖
Don't Think It Twice: Exploit Shift Invariance for Efficient Online Streaming Inference of CNNs
Model quantization techniques for efficient LLM inference. Experiments with INT8, INT4, and mixed-precision quantization.
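The INT8 scheme mentioned above can be sketched framework-agnostically. This is a generic symmetric per-tensor quantizer written for illustration (not code from that repository), using NumPy only; real toolkits add per-channel scales and calibration:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: map floats into [-127, 127]."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from INT8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.0, 0.25, 0.75], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)  # close to w, within one quantization step
```

The round-trip error is bounded by half a quantization step (scale / 2), which is why INT8 usually costs little accuracy while quartering FP32 storage.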
Arbitrary Numbers
PyTorch implementation of normalization-free LLMs investigating entropic behavior to find desirable activation functions
ptdeco is a library for model optimization by matrix decomposition built on top of PyTorch
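The matrix-decomposition idea behind such libraries can be illustrated with a plain truncated SVD (a generic sketch, not ptdeco's actual API): a weight matrix is replaced by two thinner factors, trading a small approximation error for fewer parameters:

```python
import numpy as np

def low_rank(W, rank):
    """Factor W ≈ A @ B with inner dimension `rank` via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # (m, rank): left factor, scaled by singular values
    B = Vt[:rank]               # (rank, n): right factor
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4)) @ rng.standard_normal((4, 16))  # exactly rank 4
A, B = low_rank(W, 4)
# A and B hold 8*4 + 4*16 = 96 values versus 128 in W
```

Because this W is exactly rank 4, the factorization reconstructs it to numerical precision; for real layers one picks the smallest rank that keeps accuracy acceptable.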
Convert and optimize BirdNET models for ONNX Runtime inference on GPUs, CPUs, and embedded devices
TinyML & Edge AI: On-device inference, model quantization, embedded ML, ultra-low-power AI for microcontrollers and IoT devices.
Mobile AI: iOS CoreML, Android TFLite, on-device inference, ONNX, TensorRT, and ML deployment for smartphones.
A lightweight, mobile-optimized Neural Machine Translation (NMT) framework in PyTorch. LingoLite features a modern transformer architecture with state-of-the-art optimizations for efficient multilingual translation on resource-constrained devices.
Semantic model router with parallel LLM classification, prompt caching, and vision short-circuiting. Optimizes request routing with sub-100ms overhead for Open WebUI.
Tools and experiments for converting Human Activity Recognition (HAR) models to TensorFlow Lite for efficient on-device inference on mobile and wearable devices.
A minimal reproducibility study of https://arxiv.org/abs/1911.05248, with experiments on compression of deep neural networks.
An end-to-end project with API deployment for predicting Spain's electricity shortfall.
Detects spam messages using a Long Short-Term Memory (LSTM) model with Word2Vec word embeddings. The model was tuned with grid search, reaching a best accuracy of 95.65%.
Comprehensive performance analysis of DeepSeek V3 quantization levels (FP16, Q8_0, Q4_0) on 16GB GPU environments.
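The trade-off between those quantization levels comes down to bits per weight. A rough back-of-the-envelope calculator, assuming the nominal effective sizes of llama.cpp's GGUF block formats (~8.5 bits/weight for Q8_0 and ~4.5 for Q4_0, per-block scales included) and DeepSeek V3's published 671B total parameters:

```python
def model_size_gb(n_params, bits_per_weight):
    """Approximate weight footprint in GB (decimal) for a given precision."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 671e9  # DeepSeek V3 total parameter count (published figure)
for name, bits in [("FP16", 16.0), ("Q8_0", 8.5), ("Q4_0", 4.5)]:
    print(f"{name:5s} ~{model_size_gb(N_PARAMS, bits):6.0f} GB")
```

Even at Q4_0 the full model far exceeds 16 GB of VRAM, which is why such analyses hinge on how many layers can be offloaded to the GPU rather than whether the whole model fits.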
Optimized IDKL Model for Visible-Infrared Person Re-Identification focusing on efficiency for resource-constrained hardware.