Production-ready toolkit to run AI locally (C++; updated Apr 1, 2026)
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Declarative way to run AI models in React Native on device, powered by ExecuTorch.
Talk to your Mac, query your docs, no cloud required. On-device voice AI + RAG
TinyChatEngine: On-Device LLM Inference Library
On-device Neural Engine
⚡ Native MLX Swift LLM inference server for Apple Silicon. OpenAI-compatible API, SSD streaming for 100B+ MoE models, TurboQuant KV cache compression, + iOS iPhone app.
Simplified AI runtime integration for mobile app development
A curated collection of LLM-powered Flutter apps built using RAG, AI Agents, Multi-Agent Systems, MCP, and Voice Agents.
[MobiSys 2026] On-device parallel ultra-low-bit (ternary) LLM inference with LUT-based mpGeMM kernel.
On-device AI SDK powering ToolNeuron — LLM chat & tool calling (llama.cpp), Stable Diffusion image generation (QNN/MNN), image processing (upscale, segment, inpaint, depth, style), and TTS. Native C++ + Kotlin JNI. Fork or clone to use in your own app.
Free AI-powered malware scanner for Linux. Detects evasive threats without signatures. Scans offline, open-source CLI with optional cloud intelligence.
onenm_local_llm is a Flutter plugin that simplifies on-device language model inference on Android using llama.cpp. It removes the complexity of setting up native runtimes, model loading, and inference pipelines, so developers can integrate local AI into their apps through a simple API.
BLOCKSET: Efficient out-of-core tree ensemble inference
High-performance On-Device MoA (Mixture of Agents) Engine in C++. Optimized for CPU inference with RadixCache & PagedAttention. (Tiny-MoA Native)
Run powerful AI models entirely on your Android device. 100% offline, private, no API keys. Built with Kotlin, Jetpack Compose, Material 3, and llama.cpp. Download GGUF models from Hugging Face. On-device LLM inference for Android.
Low-latency edge speech framework for streaming ASR, fluency event detection, and assistive phrase completion