Production-ready toolkit to run AI locally (C++; updated Apr 1, 2026)
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Declarative way to run AI models in React Native on device, powered by ExecuTorch.
Talk to your Mac, query your docs, no cloud required. On-device voice AI + RAG
TinyChatEngine: On-Device LLM Inference Library
On-device Neural Engine
⚡ Native MLX Swift LLM inference server for Apple Silicon. OpenAI-compatible API, SSD streaming for 100B+ MoE models, TurboQuant KV cache compression, + iOS iPhone app.
Simplified AI runtime integration for mobile app development
A curated collection of LLM-powered Flutter apps built using RAG, AI Agents, Multi-Agent Systems, MCP, and Voice Agents.
[MobiSys 2026] On-device parallel ultra-low-bit (ternary) LLM inference with LUT-based mpGeMM kernel.
On-device AI SDK powering ToolNeuron — LLM chat & tool calling (llama.cpp), Stable Diffusion image generation (QNN/MNN), image processing (upscale, segment, inpaint, depth, style), and TTS. Native C++ + Kotlin JNI. Fork or clone to use in your own app.
Free AI-powered malware scanner for Linux. Detects evasive threats without signatures. Scans offline, open-source CLI with optional cloud intelligence.
onenm_local_llm is a Flutter plugin that simplifies on-device language model inference on Android using llama.cpp. It removes the complexity of setting up native runtimes, model loading, and inference pipelines, so developers can integrate local AI into their apps through a simple API.
BLOCKSET: Efficient out-of-core tree ensemble inference
High-performance On-Device MoA (Mixture of Agents) Engine in C++. Optimized for CPU inference with RadixCache & PagedAttention. (Tiny-MoA Native)
Run powerful AI models entirely on your Android device. 100% offline, private, no API keys. Built with Kotlin, Jetpack Compose, Material 3, and llama.cpp. Download GGUF models from Hugging Face. On-device LLM inference for Android.
Low-latency edge speech framework for streaming ASR, fluency event detection, and assistive phrase completion