LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar
Updated Apr 17, 2026 - Python
MLX is a NumPy-like array framework designed for efficient and flexible machine learning on Apple silicon, brought to you by Apple machine learning research.
Flexible and powerful tensor operations for readable and reliable code (for PyTorch, JAX, TensorFlow, and others)
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU clusters.
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
MLX native implementations of state-of-the-art generative image models
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning — natively on MLX. Unsloth-compatible API.
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
Optimized Whisper models for streaming and on-device use
Open-source healthcare AI
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
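Because the server exposes OpenAI-compatible endpoints, a standard chat-completions request can target it directly. A minimal sketch, assuming a server listening at a hypothetical local port (the actual port and model id depend on your setup):

```python
import json
import urllib.request

# Hypothetical local endpoint; the port MLX Omni Server uses may differ.
url = "http://localhost:10240/v1/chat/completions"

# Standard OpenAI chat-completions payload; the model id is an assumption,
# any model the local server has loaded would work.
payload = {
    "model": "mlx-community/Llama-3.2-3B-Instruct-4bit",
    "messages": [{"role": "user", "content": "Say hello."}],
}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request; it is left out here
# because it requires the local server to be running.
```

The same request shape works against any of the OpenAI-compatible servers listed above, which is what lets existing OpenAI SDK clients run unchanged by only switching the base URL.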
Implementation of F5-TTS in MLX
Generate accurate transcripts using Apple's MLX framework
Run Qwen3-TTS text-to-speech locally on Mac (M1/M2/M3/M4). Voice cloning, voice design, custom voices. 100% offline using MLX.