Run LLM inference in an Android app with llama.cpp, ExecuTorch, LiteRT, ONNX, and more.
Updated Nov 12, 2025 - Kotlin
An OBS plugin for removing background in portrait images (video), making it easy to replace the background when recording or streaming.
Real-time SAM2 segmentation on edge devices - 40x faster C++ inference with ONNX Runtime for iOS/Android deployment
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
The Triton backend for the ONNX Runtime.
A Rust runtime for YOLOv8 and YOLOv10 object detection using ONNX Runtime
Highly Performant, Modular, Memory Safe and Production-ready Inference, Ingestion and Indexing built in Rust 🦀
YoloDotNet - A C# .NET 8.0 project for Classification, Object Detection, OBB Detection, Segmentation and Pose Estimation in both images and live video streams.
Offline STT & VAD Service (Rust, Tonic) (Q1:2025)
The next-gen database for AI: an infrastructure designed for data and AI, positioned as the MySQL of the AI era.
🚀 A high performance real-time object detection solution using YOLO11 ⚡️ powered by ONNX-Runtime
Based on TensorRT 8.2.4, compares inference speed across different TensorRT APIs.
Developed for Temasek Polytechnic, an AI-Driven Telegram Bot that automates Image Processing Tasks for Scavenger Hunts
Enterprise-unique, blazing-fast and precise C# .NET 9.0 library for background removal using ONNX Runtime with DirectML. FP32 and FP16 versions! Always open for PRs.
Face recognition and analytics library based on deep neural networks and ONNX runtime
My personal website.
Run large language models like Qwen and LLaMA locally on Android for offline, private, real-time question answering and chat - powered by ONNX Runtime.
Real-time computer vision for embedded Rust with OpenCV and YOLOv8 built with resource-constrained devices such as the Raspberry Pi Zero 2W in mind.
High-performance C++17 engine for real-time, stateful log anomaly detection. Uses a multi-tiered system combining heuristics, statistical Z-scores, and ONNX machine learning to find threats. Features flexible alerting (JSON, Syslog, HTTP) and live configuration reloading for operational maturity.
Inspired by checkout line assignments, this project simulates advanced load balancing in cloud systems, using reinforcement learning to assign jobs with uncertain workloads across servers.