Enforce ML model performance in CI/CD by benchmarking inference, validating SLAs, blocking regressions, and generating deployment-ready reports.
🐦 Convert and optimize BirdNET models to ONNX for efficient inference on GPUs, CPUs, and embedded devices like Raspberry Pi.
🔍 Showcase AI-driven healthcare projects focused on predictive maintenance, concept drift, and interpretability, prioritizing clinical safety and explainability.
🎛️ Monitor NVIDIA GPUs in real-time, track model usage, and analyze performance metrics efficiently with this Docker-based solution.
🔧 Fine-tune large language models locally on your data, export to GGUF, and train on CPU with ease using the Mobius LLM Fine-Tuning Engine.
🛠️ Optimize LLMs with advanced pruning strategies and real-time visualization for smaller, faster models without losing intelligence.
First thermal super-resolution system to achieve 34.2 dB PSNR at 229+ FPS, using a novel IMDN architecture with specialized thermal adaptations. Features RGB→thermal transfer learning, a thermal-aware multi-component loss, and real-time inference (2x: 270.6 FPS, 3x: 256.1 FPS, 4x: 250.9 FPS). Production-ready PyTorch + CUDA implementation.
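For reference, the PSNR figure quoted above is derived from mean squared error; a minimal sketch (assuming pixel values normalized to [0, 1], and nested-list images rather than tensors):

```python
import math

def psnr(reference, reconstructed, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-sized images,
    given as nested lists of pixel values in [0, peak]."""
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher is better: 34.2 dB corresponds to an MSE of roughly 3.8e-4 on a [0, 1] scale.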
Early exit inference framework for HuggingFace LLMs — skip unnecessary transformer layers when the model is already confident. Supports LLaMA, Mistral, Phi, Gemma, Qwen, Pythia and more.
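The early-exit idea can be illustrated with a toy layer stack: after each layer, an (assumed) classifier head produces logits, and inference stops as soon as softmax confidence crosses a threshold. The layers, heads, and threshold here are illustrative stand-ins, not the framework's actual API:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_forward(x, layers, heads, threshold=0.9):
    """Run layers in order; after each one, score logits from its head
    and return early once the top softmax probability is confident enough."""
    for depth, (layer, head) in enumerate(zip(layers, heads), start=1):
        x = layer(x)
        probs = softmax(head(x))
        if max(probs) >= threshold:
            return probs, depth  # confident: skip the remaining layers
    return probs, depth  # fell through: full-depth inference
```

The savings come from the `return` inside the loop: confident inputs pay for only a prefix of the network.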
An optimized, lightweight RetinaFace model featuring custom modules (ECA-CBAM, WFPN). Achieves +3.07% AP on WiderFace Hard with reduced FLOPs.
This project uses Machine Learning to analyze patient health metrics and predict the likelihood of liver disease.
Aegis, an intelligent optimization plugin for OpenClaw. Provides model selection suggestions, prompt optimization, cost tracking, and quality evaluation.
MLBuild enforces inference performance SLAs in CI, automatically blocking slow ML models before they reach production.
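A CI latency gate of this kind can be sketched as follows; the model function, sample input, and latency budget are all hypothetical placeholders, not MLBuild's actual interface:

```python
import statistics
import time

def benchmark_p95_ms(fn, sample, warmup=10, runs=100):
    """Measure p95 latency in milliseconds for a single-input inference call."""
    for _ in range(warmup):
        fn(sample)  # warm caches / JIT before timing
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(sample)
        latencies.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=20) yields 19 cut points; index 18 is the 95th percentile
    return statistics.quantiles(latencies, n=20)[18]

def enforce_sla(fn, sample, budget_ms):
    """Fail the CI job (non-zero exit) if p95 latency exceeds the budget."""
    p95 = benchmark_p95_ms(fn, sample)
    if p95 > budget_ms:
        raise SystemExit(f"SLA violation: p95 {p95:.2f} ms > budget {budget_ms} ms")
    print(f"SLA ok: p95 {p95:.2f} ms <= budget {budget_ms} ms")
```

Raising `SystemExit` makes the script's exit code non-zero, which is what causes a CI runner to block the merge.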
The Trafic-Monitoring project performs vehicle detection, car-make classification, and license-plate OCR in C++ with ONNX Runtime + OpenCV.
Helios: an edge AI deployment framework.
Efficient in-memory representation for ONNX, in Python
MobileNetV3 object detection with TFLite quantization — fp32/fp16/int8 edge deployment benchmarks
Analyzes bank marketing data to improve campaign strategies. Models: Decision Tree and Random Forest. Libraries: Pandas, Matplotlib, Seaborn, Scikit-Learn, NumPy. Findings: customer patterns and seasonal behaviors.
Automated INT8 quantization pipeline for ONNX models (segmentation, classification, and anomaly detection) using ONNX Runtime QDQ format. Supports efficient deployment on edge devices such as Raspberry Pi.
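The core arithmetic behind symmetric per-tensor INT8 quantization, as used in QDQ-style pipelines, is simple; a minimal sketch independent of ONNX Runtime:

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: the scale maps the max
    absolute value onto 127; values are rounded and clamped to [-127, 127]."""
    amax = max(abs(v) for v in values)
    scale = amax / 127.0 if amax > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from INT8 codes."""
    return [qi * scale for qi in q]
```

A real QDQ pipeline additionally calibrates activation ranges on sample data and inserts QuantizeLinear/DequantizeLinear node pairs into the graph, but the quantize/dequantize round trip above is the arithmetic each such pair performs.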
Comprehensive TensorFlow 2.15+ learning hub with 22+ hands-on notebooks covering computer vision, NLP, and generative AI. Features production-ready model optimization, multi-format deployment (TFLite, ONNX), distributed training, and complete MLOps pipelines. Includes pre-trained models, Docker support, and automated testing.
Static ONNX graph repair tool that zero-pads weight tensors to satisfy CMSIS-NN fast-path alignment constraints; no retraining required.
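Zero-padding a weight tensor's channel count up to an alignment boundary can be sketched as below. The multiple-of-4 constraint is an assumption chosen for illustration, and the list-of-lists layout stands in for a real tensor:

```python
def pad_channels(weights, multiple=4):
    """Zero-pad each per-output-channel weight row so the input-channel
    count becomes a multiple of `multiple`. The zero weights contribute
    nothing to dot products, so outputs are unchanged as long as the
    activations are padded with matching zero channels."""
    in_ch = len(weights[0])
    pad = (-in_ch) % multiple  # 0 when already aligned
    return [row + [0.0] * pad for row in weights]
```

Because only zeros are appended, this is a pure graph transformation: the padded model is numerically equivalent, which is why no retraining is needed.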