BLOCKSET: Efficient out-of-core tree ensemble inference
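The core trick behind out-of-core tree inference is to keep the serialized ensemble on disk and page node blocks in on demand, rather than loading every tree into RAM. Below is a minimal C++ sketch of that general idea; the node layout, block size, and function names are illustrative assumptions, not BLOCKSET's actual on-disk format.

```cpp
// Hypothetical sketch of out-of-core tree inference: node records live in a
// binary file and are paged in one fixed-size block at a time, so memory use
// is bounded by the block size, not the ensemble size. Layout is assumed.
#include <cstdint>
#include <cstdio>
#include <vector>

struct Node {                 // one serialized decision-tree node (assumed layout)
    int32_t feature;          // feature index; -1 marks a leaf
    float   threshold;        // split threshold (or leaf value if feature == -1)
    int32_t left, right;      // global node indices of the children
};

constexpr size_t kBlockNodes = 4096;   // nodes per on-disk block (assumed)

// Read the block containing global node index `idx` into `buf`.
bool load_block(std::FILE* f, int32_t idx, std::vector<Node>& buf) {
    size_t block = idx / kBlockNodes;
    if (std::fseek(f, long(block * kBlockNodes * sizeof(Node)), SEEK_SET) != 0)
        return false;
    buf.resize(kBlockNodes);
    size_t n = std::fread(buf.data(), sizeof(Node), kBlockNodes, f);
    buf.resize(n);             // last block on disk may be partial
    return n > 0;
}

// Score one example against one tree rooted at `root`, paging blocks as needed.
float score_tree(std::FILE* f, int32_t root, const std::vector<float>& x) {
    std::vector<Node> buf;
    size_t cached_block = SIZE_MAX;    // which block is currently in memory
    int32_t idx = root;
    for (;;) {
        size_t block = idx / kBlockNodes;
        if (block != cached_block) {   // cache miss: fetch the block from disk
            if (!load_block(f, idx, buf)) return 0.0f;  // IO error: bail out
            cached_block = block;
        }
        const Node& n = buf[idx % kBlockNodes];
        if (n.feature < 0) return n.threshold;          // leaf: stored prediction
        idx = (x[n.feature] < n.threshold) ? n.left : n.right;
    }
}
```

Laying sibling subtrees out in the same block makes consecutive traversal steps hit the cached block far more often; that locality is the whole point of blocked node storage.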
TinyChatEngine: On-Device LLM Inference Library
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Simplified AI runtime integration for mobile app development
A curated collection of LLM-powered Flutter apps built using RAG, AI Agents, Multi-Agent Systems, MCP, and Voice Agents.
High-performance On-Device MoA (Mixture of Agents) Engine in C++. Optimized for CPU inference with RadixCache & PagedAttention. (Tiny-MoA Native)
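PagedAttention, named in the entry above, manages the KV cache as fixed-size physical blocks indexed through a per-sequence block table, so memory is allocated lazily and shared prefixes can reuse blocks. A minimal C++ sketch of that block-table bookkeeping follows, under assumed names and sizes; it is not Tiny-MoA's actual API.

```cpp
// Minimal sketch of the paged-KV idea behind PagedAttention: the KV cache is
// carved into fixed-size blocks, and each sequence maps logical token
// positions to physical blocks on demand. Names and sizes are illustrative.
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

constexpr size_t kBlockTokens = 16;      // tokens per KV block (assumed)

struct KvPool {
    std::vector<int> free_blocks;        // indices of unused physical blocks
    explicit KvPool(int n_blocks) {
        for (int i = n_blocks - 1; i >= 0; --i) free_blocks.push_back(i);
    }
    int alloc() {
        if (free_blocks.empty()) throw std::runtime_error("KV pool exhausted");
        int b = free_blocks.back();
        free_blocks.pop_back();
        return b;
    }
};

struct Sequence {
    std::vector<int> block_table;        // logical block -> physical block
    size_t n_tokens = 0;

    // Reserve a KV slot for the next token; returns (physical block, offset).
    std::pair<int, size_t> append(KvPool& pool) {
        if (n_tokens % kBlockTokens == 0)        // current block is full
            block_table.push_back(pool.alloc()); // grab a fresh block lazily
        size_t off = n_tokens % kBlockTokens;
        ++n_tokens;
        return {block_table.back(), off};
    }
};
```

Because blocks are referenced only through the table, two sequences sharing a prompt prefix can point at the same physical blocks, which is what a radix-tree prefix cache (RadixCache) exploits.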
Free AI-powered malware scanner for Linux. Detects evasive threats without signatures. Scans offline, open-source CLI with optional cloud intelligence.
On-device AI SDK powering ToolNeuron — LLM chat & tool calling (llama.cpp), Stable Diffusion image generation (QNN/MNN), image processing (upscale, segment, inpaint, depth, style), and TTS. Native C++ + Kotlin JNI. Fork or clone to use in your own app.
Run powerful AI models entirely on your Android device: 100% offline and private, no API keys. Built with Kotlin, Jetpack Compose, Material 3, and llama.cpp; download GGUF models from Hugging Face.
Talk to your Mac, query your docs, no cloud required. On-device voice AI + RAG
onenm_local_llm is a Flutter plugin that simplifies on-device language model inference on Android using llama.cpp. It removes the complexity of setting up native runtimes, model loading, and inference pipelines, so developers can integrate local AI into their apps through a simple API.
Low-latency edge speech framework for streaming ASR, fluency event detection, and assistive phrase completion
[MobiSys 2026] On-device parallel ultra-low-bit (ternary) LLM inference with LUT-based mpGeMM kernel.
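LUT-based mpGeMM replaces multiplications with table lookups: because ternary weights take only three values, the partial dot product of a small weight group against a group of activations can be precomputed once and reused across every output row. Here is a hedged scalar C++ sketch of a ternary GEMV in that spirit; the repo's real kernel is SIMD-parallel and far more optimized, and the packing scheme below is an assumption.

```cpp
// Sketch of a LUT-based ternary GEMV. Weights in {-1, 0, +1} are packed
// 4 per byte (2 bits each); for every group of 4 activations we precompute
// a 256-entry table of partial dot products, so the inner loop is pure
// table lookups and adds, with no multiplications.
#include <cstdint>
#include <vector>

// Decode the j-th 2-bit field of `code`: 0 -> 0, 1 -> +1, 2 -> -1.
inline float decode(uint8_t code, int j) {
    static const float kVals[4] = {0.f, 1.f, -1.f, 0.f};
    return kVals[(code >> (2 * j)) & 3];
}

// y = W x, with W given row-major as packed ternary codes (requires K % 4 == 0).
void ternary_gemv(const std::vector<uint8_t>& W_packed, // M x (K/4) bytes
                  const std::vector<float>& x,          // K activations
                  std::vector<float>& y, int M, int K) {
    int groups = K / 4;
    // table[g*256 + c] = dot(decode(c), x[4g .. 4g+3]), shared by all rows.
    std::vector<float> table(size_t(groups) * 256);
    for (int g = 0; g < groups; ++g)
        for (int c = 0; c < 256; ++c) {
            float s = 0.f;
            for (int j = 0; j < 4; ++j)
                s += decode(uint8_t(c), j) * x[4 * g + j];
            table[size_t(g) * 256 + c] = s;
        }
    // Each output element is now just `groups` lookups and adds.
    y.assign(M, 0.f);
    for (int m = 0; m < M; ++m)
        for (int g = 0; g < groups; ++g)
            y[m] += table[size_t(g) * 256 + W_packed[size_t(m) * groups + g]];
}
```

The table-build cost is amortized over all M rows, which is why the lookup approach pays off for the tall weight matrices in LLM layers.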
A declarative way to run AI models on-device in React Native, powered by ExecuTorch.
On-device Neural Engine
Production-ready toolkit to run AI locally