Stars
A full Transformer on a custom chip: microGPT in RTL, generating names on a Virtex-5 FPGA at ~56k tokens/second.
From-scratch C++/CUDA inference engine for Qwen3-8B, with zero external libraries
Grammarly for your terminal — underlines typos in Claude Code, Codex, Aider & any other terminal app
A native macOS menu-bar app that keeps your keyboard input source locked — global, per-app, and per-URL.
A diff tool for Typst documents, similar to latexdiff for LaTeX
A pure Rust Excel/OpenDocument SpreadSheets file reader: rust on metal sheets
A Rust RTF parser & lexer designed for speed and memory efficiency
Community model zoo + knowledge base for Apple Core AI (iOS/macOS 27): Qwen3.5 & Gemma 4 converted end-to-end, verified on-device (iPhone 17 Pro GPU/ANE), conversion gotchas, custom Metal kernels, …
A self-hostable Wasm sandbox for JavaScript workers
A unified toolkit for benchmarking and running inference across all open-source TTS models
🐹 A free, open-source, native macOS GUI for the Mole CLI (mo): clean, uninstall, optimize, analyze disk, and watch live status. Plus long-range history + an MCP server for AI agents.
A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for Apple silicon.
GPU-accelerated, S3-compatible storage gateway with transparent compression. Drop-in replacement for AWS S3 endpoints; cuts your S3 bill 50-80% with no app changes (Rust, nvCOMP, zstd).
Python bindings for access to the on-device model at the core of Apple Intelligence through the Foundation Models framework
Bridges PyTorch and Core AI. Convert existing models to Core AI IR, or author new ones from PyTorch via composite ops, custom op lowerings, and inline Metal GPU kernels.
An Agent Skill for designing cross-platform desktop apps that feel native — distilled from Raycast's 2.0 deep-dive and reverse engineering of Raycast Beta.app. Eight architectural tenets, four-laye…
Fork() for AI agent microVMs. Spawn 100 children in ~100ms from a warm parent; BRANCH a live VM in ~150ms. KVM-isolated, snapshot CoW.
An open-source LLM router that optimizes your agent for cost and performance — with every run.
KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.
Kimi Code CLI — The Starting Point for Next-Gen Agents
Neutral, reproducible benchmark for local LLMs on Apple Silicon (Mac · iPhone · iPad) — MLX, llama.cpp, CoreML, Apple Foundation Models
Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
Edge.js is a secure JavaScript runtime, designed for Edge computing and AI workloads