A local, private voice studio for Apple Silicon — on Mac and iPhone. Write a script, shape how the voice should sound, and generate speech right on your device.
On your Mac today · iPhone arriving soon.
Formerly QwenVoice — now Vocello 2.0.
- 🎙️ Three ways to make a voice — pick a built-in speaker, describe one in plain language, or clone from a reference clip you have rights to (record it right in the app, or import a file).
- 🔒 Private by default — after a one-time model download, every line renders on your device. No scripts uploaded, no audio sent to a cloud TTS service.
- ⚡ Native Swift + MLX — no Python runtime, no bundled weights, no per-line meter and no cloud queue.
- 📱 iPhone arriving soon — the same on-device engine on Apple Silicon iPhone; on-device generation already works, with the App Store / TestFlight distribution lane still in progress.
| Platform | Build | Notes |
|---|---|---|
| macOS 26+ (Apple Silicon) | Vocello 2.0.0 — Download DMG | Signed, notarized, stable. Double-click to open. |
| macOS 15 | QwenVoice 1.2.3 — Download legacy | Legacy build. No 2.x backport planned. |
| iPhone (iOS 26+, Apple Silicon) | Arriving soon | The same on-device engine; ships via App Store / TestFlight (not GitHub Releases). |
- Private by default. After models are installed, generation runs locally and your scripts, history, and generated audio stay in local app storage unless you export them.
- No subscription meter. Download the models you want, then generate on your own hardware without paying per line or waiting on a cloud queue.
- Three voice workflows. Use a built-in speaker, describe a new voice, or clone from a reference clip you own or have permission to use — recorded in the app or imported.
- Built for Apple Silicon. A native Swift + MLX engine (replacing the old bundled Python runtime) keeps generation local, private, and fully on-device — and it's what makes the iPhone app possible (Python can't ship on iPhone; on-device MLX can).
- Download Vocello-macos26.dmg.
- Open the DMG and drag Vocello.app to /Applications.
- Open Vocello.
- Go to Settings → Model downloads and install the voice models you want.
- Generate from Custom Voice, Voice Design, or Voice Cloning.
No Python setup or local server is required — install the app, download models from Settings, and generate locally.
The DMG is signed with an Apple Developer ID certificate and notarized with a stapled ticket, so the first launch opens with a double-click (no right-click bypass). To verify:

    xcrun stapler validate Vocello-macos26.dmg              # "The validate action worked!"
    spctl --assess --type install -vv Vocello-macos26.dmg   # accepted, source=Notarized Developer ID

A release-metadata.txt (commit SHA, Xcode version, SDK, marketing version, build number) is attached to the same release for build provenance.
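
The two verification commands above can be wrapped in one small shell function that stops at the first failed check. This is only a sketch and not a script from the Vocello repo: the function name `verify_dmg` is hypothetical, and it calls nothing beyond the two commands already shown.

```shell
# Hypothetical helper: run both verification commands against a DMG.
# Returns non-zero as soon as one check fails.
verify_dmg() {
  dmg="${1:-Vocello-macos26.dmg}"
  # Confirm the notarization ticket is stapled to the DMG.
  xcrun stapler validate "$dmg" || return 1
  # Ask Gatekeeper whether it would accept the DMG for install.
  spctl --assess --type install -vv "$dmg" || return 1
  echo "verified: $dmg"
}
```

Calling `verify_dmg` with no argument checks Vocello-macos26.dmg in the current directory; pass a path to check a DMG stored elsewhere.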
- macOS 26.0+ on an Apple Silicon Mac — available now.
- iPhone (iOS 26.0+) on Apple Silicon — arriving soon via App Store / TestFlight.
- Voice models installed from Settings → Model downloads.
Speed models are smaller 4-bit packages for faster startup and lower memory use. Quality models are larger 8-bit packages for devices with more headroom.
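
As a rough back-of-the-envelope check on why the 4-bit packages are smaller: weights stored at b bits each take about parameters × b / 8 bytes, so an 8-bit package is roughly twice the size of its 4-bit counterpart. The sketch below does that arithmetic. The parameter count is a made-up illustration, not the size of any Vocello model, and real packages carry extra data (quantization scales, tokenizer and config files) that this estimate ignores.

```shell
# Approximate weight size in MB for a given parameter count and bit width.
# Ignores quantization metadata and non-weight files, so treat it as a rough estimate.
approx_weight_mb() {
  params="$1"   # number of parameters (hypothetical input)
  bits="$2"     # bits per weight: 4 for Speed, 8 for Quality
  echo $(( params * bits / 8 / 1000000 ))
}

approx_weight_mb 1000000000 4   # hypothetical 1B-parameter model at 4-bit -> 500
approx_weight_mb 1000000000 8   # same hypothetical model at 8-bit -> 1000
```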
Vocello 2.0.0 is the current stable macOS release. For macOS 15, use QwenVoice v1.2.3; no 2.x backport is planned. Every macOS GitHub Release ships a notarized, stapled, Developer ID–signed DMG — a normal double-click install with no Gatekeeper workarounds.
- Speech generation runs locally after models are installed.
- Generated audio, recorded reference clips, and history stay in local app storage unless you export them.
- Model downloads come from Hugging Face when you install a voice model.
- Recording a reference clip and transcript auto-fill ask for the Microphone and Speech Recognition permissions on first use. Both run entirely on your Mac — recognition is on-device only, and nothing is sent to Apple or anyone else. (Transcript auto-fill additionally requires Siri to be enabled, a macOS requirement; the app explains this and links the right Settings pane.)
- Voice cloning should only be used with voices you own or have permission to use.
The main branch contains the current Vocello codebase (macOS app, iPhone app, and the vocello CLI). The stable macOS release is tagged v2.0.0.
Vocello's engine is native Swift + MLX — no Python, no bundled weights. On macOS it runs out-of-process in an isolated XPC service; on iPhone it runs in-process, fully on-device. Architecture, engine invariants, and release policy live in CLAUDE.md.

    git clone https://github.com/PowerBeef/QwenVoice.git
    cd QwenVoice
    ./scripts/regenerate_project.sh
    open QwenVoice.xcodeproj

The Xcode project is generated from project.yml (edit it, not the .xcodeproj, then rerun regenerate_project.sh). SPM dependencies (MLX, Swift HuggingFace, GRDB, and the vendored mlx-audio) are deliberately pinned to exact versions for backend determinism; bumping them follows a benchmark-gated process documented in CLAUDE.md.
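
Because the .xcodeproj is a generated artifact, one way to avoid working from a stale project is to regenerate whenever project.yml is newer than the generated project file. The sketch below is an assumption about workflow, not a script from the repo: the function name `regenerate_if_stale` is hypothetical, and it relies only on regenerate_project.sh, project.yml, and the standard project.pbxproj file that every .xcodeproj bundle contains.

```shell
# Hypothetical helper: rerun the generator when project.yml is newer than the
# generated project file, or when the project is missing. Run from the repo root.
regenerate_if_stale() {
  pbxproj="QwenVoice.xcodeproj/project.pbxproj"
  if [ ! -e "$pbxproj" ] || [ project.yml -nt "$pbxproj" ]; then
    ./scripts/regenerate_project.sh || return 1
    echo "regenerated"
  else
    echo "up to date"
  fi
}
```

Comparing against project.pbxproj instead of the .xcodeproj directory is deliberate: a directory's modification time does not necessarily change when a file inside it is rewritten.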
Useful checks:

    ./scripts/check_project_inputs.sh
    ./scripts/build_foundation_targets.sh macos
    ./scripts/build_foundation_targets.sh ios

More technical detail:
- CLAUDE.md: repo guide covering build, architecture, engine invariants, dependency pinning, release policy, and conventions
- docs/reference/cli.md: the headless vocello command-line tool
- docs/reference/privacy-storage.md: local storage and deletion details
Vocello ships a headless command-line tool, vocello, built from source alongside the app (it is not part of the app download). It drives the same local Swift + MLX engine in-process — no Python, no bundled weights — and serves two roles: scriptable local generation from the terminal, and the deterministic driver for the perf/quality benchmarks. It uses the models you install in the app (Settings → Model downloads); the CLI itself does not download weights.

    ./scripts/build.sh cli               # build build/vocello
    build/vocello <command> [options]    # run it (runs in place)

| Command | What it does |
|---|---|
| generate | Synthesize one clip (Custom Voice / Voice Design / Voice Cloning); supports --stream, --json, and piped stdin. |
| custom / design / clone | Shortcuts for generate --mode … (also pick the mode interactively, or list them with modes). |
| batch | Synthesize many clips from a file with a single model load. |
| voices | List, enroll, or delete saved clone voices. |
| speakers | List the built-in Custom Voice speakers. |
| models | Inventory installed/available models (state, size). |
| bench | Drive the perf/quality matrix and aggregate the results. |

    # Generate a clip (mode shortcut), or pipe a script in
    build/vocello custom --variant speed --text "The train left at dawn."
    echo "Hello there." | build/vocello generate --variant speed --stream --json

    # Discover modes/speakers/models, then bulk synth (one model load)
    build/vocello modes
    build/vocello speakers list
    build/vocello batch --file lines.txt --mode custom --variant speed --out-dir /tmp/batch

stdout is machine-readable (an output path, or JSON with --json); progress notes go to stderr. Full reference: docs/reference/cli.md.
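
Because stdout carries only the result and progress notes go to stderr, a script can capture the output path with plain command substitution. The following is a minimal sketch, assuming the non-JSON output is a single path as described above. The `synth_line` function and the `VOCELLO_BIN` variable are hypothetical names introduced for illustration; only the `custom`, `--variant`, and `--text` options shown above are used.

```shell
# Path to the CLI; override it to point at a different build (hypothetical variable).
VOCELLO_BIN="${VOCELLO_BIN:-build/vocello}"

# Hypothetical wrapper: synthesize one line and print the resulting audio path.
# Command substitution captures stdout only, so progress notes written to
# stderr still reach the terminal.
synth_line() {
  out_path=$("$VOCELLO_BIN" custom --variant speed --text "$1") || return 1
  # Treat an empty result as a failure so callers never get a blank path.
  [ -n "$out_path" ] || return 1
  echo "$out_path"
}
```

A caller could then write `clip=$(synth_line "The train left at dawn.")` and pass `$clip` to whatever consumes the audio next.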
Vocello is available under the MIT License.
Vocello builds on Qwen3-TTS, mlx-audio, MLX, and GRDB.swift.