Want to hear your own voice?
Zero-shot voice cloning — record or upload a short reference and synthesize across 31 languages.
A 99M-parameter open-weight text-to-speech model running locally on CPU via ONNX Runtime. No GPU. No cloud. No API.
Open weights. Runs in your browser or fully offline on your machine.
Bring your own voice and hear it speak 31 languages.
Hundreds of preset voices and emotion presets, right in your browser.
One 99M-parameter model. No per-language fine-tuning. No GPU.
Highlighted languages have audio samples below.
Same input text, same reference voice prompt, three systems. Supertonic 3 is ours — 99M params on CPU. OmniVoice and Chatterbox Multilingual are 5–8× larger and run on a GPU.
Pick a surface — bring your own voice, browse preset voices, or build with the API.
Zero-shot voice cloning — record or upload a short reference and synthesize across 31 languages.
Pick a voice, pick an emotion, and try Supertonic 3 right in your browser — no install needed.
Integrate character-driven, expressive voice generation across 31 languages with adjustable speech controls.
RTF (real-time factor) is synthesis time divided by audio duration: the seconds of compute needed per second of speech, so lower is faster. ×RT is its inverse. Running on a 16-thread CPU, Supertonic 3 comes within about 2% of the 800M-parameter OmniVoice on an RTX 3090 (RTF 0.200 vs 0.196).
N = 30 · same machine, same text, same reference voices

| Model | Hardware | Params | N | Synth | Audio | RTF ↓ | ×RT ↑ |
|---|---|---|---|---|---|---|---|
| Supertonic 3 | CPU (16 threads) | 99M | 30 | 57.99 s | 289.92 s | 0.200 | 5.00× |
| OmniVoice | RTX 3090 | 800M | 30 | 53.90 s | 275.17 s | 0.196 | 5.11× |
| Chatterbox Multilingual | RTX 3090 | 500M | 30 | 199.70 s | 252.68 s | 0.790 | 1.27× |
Seconds of speech produced per second of wall-clock time, across the same 30 inputs.
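
The RTF and ×RT columns can be re-derived from the Synth and Audio columns. The sketch below assumes those two columns are totals over the 30 inputs (an assumption, though the published ratios are consistent with it); the `BenchmarkRow` class and its field names are illustrative and not part of any Supertonic package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRow:
    """One table row; field names are hypothetical, values come from the table."""
    model: str
    synth_seconds: float  # assumed: total wall-clock synthesis time over N inputs
    audio_seconds: float  # assumed: total duration of the audio produced

    @property
    def rtf(self) -> float:
        # Real-time factor: compute seconds per second of audio (lower is faster).
        return self.synth_seconds / self.audio_seconds

    @property
    def speedup(self) -> float:
        # xRT: seconds of audio per second of compute (higher is faster).
        return self.audio_seconds / self.synth_seconds


ROWS = [
    BenchmarkRow("Supertonic 3", 57.99, 289.92),
    BenchmarkRow("OmniVoice", 53.90, 275.17),
    BenchmarkRow("Chatterbox Multilingual", 199.70, 252.68),
]

for row in ROWS:
    print(f"{row.model}: RTF {row.rtf:.3f}, {row.speedup:.2f}x real time")
```

Rounded to the table's precision, the computed values match the published RTF and ×RT figures for all three systems.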
./samples/ on this page).

Officially supported runtimes. Each tab links to working examples in the upstream repo.
# pip install supertonic
from supertonic import TTS
tts = TTS(auto_download=True)
# 1) Default: synthesize English with voice "M1"
style = tts.get_voice_style(voice_name="M1")
wav, duration = tts.synthesize(
    "A gentle breeze moved through the open window.",
    voice_style=style,
    lang="en",
)
tts.save_audio(wav, "output.wav")
# 2) Swap the voice → "M2"
style = tts.get_voice_style(voice_name="M2")
# 3) Swap the language → Japanese
wav, _ = tts.synthesize("こんにちは、世界。", voice_style=style, lang="ja")
Full reference and example scripts: supertonic-py docs.
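
To check the real-time factor on your own hardware, one option is a small engine-agnostic timing harness. This is a minimal sketch, not the benchmark script behind the published numbers: `measure_rtf`, `SynthFn`, and `fake_synth` are hypothetical names, and the adapter contract (text in, samples plus sample rate out) is an assumption. To time a real engine you would wrap its synthesis call in a function with that shape.

```python
import time
from typing import Callable, Sequence, Tuple

# Hypothetical adapter contract: text in, (samples, sample_rate) out.
SynthFn = Callable[[str], Tuple[Sequence[float], int]]


def measure_rtf(synth: SynthFn, texts: Sequence[str]) -> Tuple[float, float, float]:
    """Return (synth_seconds, audio_seconds, rtf) summed over all texts."""
    synth_seconds = 0.0
    audio_seconds = 0.0
    for text in texts:
        start = time.perf_counter()
        samples, sample_rate = synth(text)
        synth_seconds += time.perf_counter() - start
        audio_seconds += len(samples) / sample_rate
    if audio_seconds == 0:
        raise ValueError("no audio produced")
    return synth_seconds, audio_seconds, synth_seconds / audio_seconds


def fake_synth(text: str) -> Tuple[Sequence[float], int]:
    """Stand-in engine: 0.1 s of silence per character at 16 kHz."""
    sample_rate = 16_000
    return [0.0] * (len(text) * sample_rate // 10), sample_rate


if __name__ == "__main__":
    synth_s, audio_s, rtf = measure_rtf(fake_synth, ["hello", "world"])
    print(f"synth {synth_s:.4f} s, audio {audio_s:.1f} s, RTF {rtf:.4f}")
```

Summing times over all inputs before dividing weights longer utterances more heavily than averaging per-utterance ratios would.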
// npm install @supertone/supertonic
import { TTS } from "@supertone/supertonic";
const tts = await TTS.load({ autoDownload: true });
const style = await tts.getVoiceStyle("M1");
const { wav } = await tts.synthesize("Hello from Node.", { style, lang: "en" });
See the node/ folder in the upstream repo.
// runs in browsers via onnxruntime-web
import { TTS } from "@supertone/supertonic-web";
const tts = await TTS.load();
const { wav } = await tts.synthesize("Hello from the browser.", { lang: "en" });
See the web/ folder in the upstream repo.
// Swift Package Manager: github.com/supertone-inc/supertonic-swift
import Supertonic
let tts = try Supertonic.TTS(autoDownload: true)
let wav = try tts.synthesize("Hello from iOS.", lang: "en")
See the ios/ folder in the upstream repo.
// Gradle: implementation("ai.supertone:supertonic-android:3.+")
val tts = Supertonic.TTS(context, autoDownload = true)
val wav = tts.synthesize("Hello from Android.", lang = "en")
See the android/ folder in the upstream repo.
// CMake: find_package(Supertonic CONFIG REQUIRED)
#include <supertonic/tts.hpp>
auto tts = supertonic::TTS::create({ .auto_download = true });
auto wav = tts->synthesize("Hello from C++.", { .lang = "en" });
See the cpp/ folder in the upstream repo.
The trained Supertonic 3 model is released under the OpenRAIL-M license. Weights are open and usable commercially, with use-based restrictions (no harm, no impersonation without consent) and an attribution requirement.
Note: OpenRAIL-M is not equivalent to MIT — it imposes downstream use restrictions. Read the full license text before deploying.
The Python package, runtime bindings, and example code in the upstream repo are MIT-licensed. Use, modify, and redistribute them freely, provided the copyright and license notice stays with every copy.
Standard MIT terms: no warranty, license notice must be retained, no restrictions on commercial use.