THTTS (Thai TTS)

This project is the first implementation of Text-to-Speech (TTS) in Thai using the Wyoming protocol, making it fully compatible with Home Assistant. It enables local, streaming Thai voice synthesis for smarter automations and AI assistants—no cloud required.

Bring your local AI to life in Thai with seamless integration, low latency, and a privacy-first design.

Model Attribution

All model weights are provided by VIZINTZOR via Hugging Face:

VITS Thai Female/Male: MMS-TTS-THAI-FEMALEV2, MMS-TTS-THAI-MALEV2
F5-TTS Thai: F5-TTS-THAI, F5-TTS-TH-V2

Please acknowledge and cite VIZINTZOR if you use these models in your work.

Recommended Model

For best quality and performance, use F5-TTS v1.

How to Run

You can run the server using either direct uv commands or the provided entrypoint.sh script (recommended for Docker and easy switching).

1. Using `uv` Directly

VITS Thai (Female/Male)

uv run python src/wyoming_thai_vits.py --log-level INFO --host 0.0.0.0 --port 10200 \
  --model-id VIZINTZOR/MMS-TTS-THAI-FEMALEV2

uv run python src/wyoming_thai_vits.py --log-level INFO --host 0.0.0.0 --port 10200 \
  --model-id VIZINTZOR/MMS-TTS-THAI-MALEV2

F5-TTS Thai v1 (Recommended)

uv run python src/wyoming_thai_f5.py --log-level INFO --host 0.0.0.0 --port 10200 \
  --model-version v1

F5-TTS Thai v2

uv run python src/wyoming_thai_f5.py --log-level INFO --host 0.0.0.0 --port 10200 \
  --model-version v2

2. Using `entrypoint.sh` (Recommended)

Set the backend via the `THTTS_BACKEND` environment variable:

`VITS` for the VITS models
`F5_V1` for F5-TTS v1 (recommended)
`F5_V2` for F5-TTS v2

Example:

THTTS_BACKEND=F5_V1 ./entrypoint.sh

You can override other parameters via environment variables (see below).
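
The backend switch can be pictured as a small dispatch table. The sketch below is not the contents of `entrypoint.sh`; it is a hypothetical Python equivalent that assumes the script maps `THTTS_BACKEND` onto the `uv` commands shown in the previous section. The script paths, flags, and defaults are taken from this README; `BACKENDS` and `build_command` are illustrative names.

```python
# Hypothetical sketch of how a launcher could turn THTTS_* variables into
# the uv commands documented above. Not the actual entrypoint.sh logic.

# backend name -> (server script, F5 model version or None for VITS)
BACKENDS = {
    "VITS": ("src/wyoming_thai_vits.py", None),
    "F5_V1": ("src/wyoming_thai_f5.py", "v1"),
    "F5_V2": ("src/wyoming_thai_f5.py", "v2"),
}


def build_command(env):
    """Build the argv list for the selected backend from an env mapping."""
    backend = env.get("THTTS_BACKEND", "VITS")
    if backend not in BACKENDS:
        raise ValueError(f"unknown THTTS_BACKEND: {backend!r}")
    script, version = BACKENDS[backend]

    cmd = [
        "uv", "run", "python", script,
        "--log-level", env.get("THTTS_LOG_LEVEL", "INFO"),
        "--host", env.get("THTTS_HOST", "0.0.0.0"),
        "--port", env.get("THTTS_PORT", "10200"),
    ]
    if version is None:
        # VITS takes a Hugging Face model ID.
        cmd += ["--model-id",
                env.get("THTTS_MODEL", "VIZINTZOR/MMS-TTS-THAI-FEMALEV2")]
    else:
        # F5 takes a model version.
        cmd += ["--model-version", version]
    return cmd
```

Calling `build_command(os.environ)` and passing the result to `subprocess.run` would start the selected server, under the assumptions stated above.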

Environment Variables

Variable	Default Value	Description
`THTTS_BACKEND`	`VITS`	Model backend: `VITS`, `F5_V1`, or `F5_V2`
`THTTS_HOST`	`0.0.0.0`	Bind address
`THTTS_PORT`	`10200`	Port to listen on
`THTTS_LOG_LEVEL`	`INFO`	Log level (`DEBUG`, `INFO`, etc.)
`THTTS_MODEL`	`VIZINTZOR/MMS-TTS-THAI-FEMALEV2`	VITS model ID
`THTTS_REF_AUDIO`	`hf_sample`	F5 reference audio path
`THTTS_REF_TEXT`	(empty)	F5 reference transcript
`THTTS_DEVICE`	`auto`	`auto`, `cpu`, or `cuda`
`THTTS_SPEED`	`1.0`	F5 speech speed multiplier
`THTTS_NFE_STEPS`	`32`	F5 denoising steps
`THTTS_MAX_CONCURRENT`	`1`	Max concurrent synth requests
`THTTS_CKPT_FILE`	(auto-selected by backend)	F5 checkpoint file path
`THTTS_VOCAB_FILE`	(auto-selected by backend)	F5 vocab file path
`THTTS_SPEAK_SPEED`
`THTTS_MAX_WAIT_MS`
`THTTS_MIN_SENT_CHARS`
`THTTS_VOICES_YAML`		Path to a voices list YAML file for multi-voice support (see the "Voices List yaml File" section)
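
To show how the typed variables in the table might be consumed, here is a minimal sketch of a settings loader. It is not the project's code: `Settings`, `load_settings`, and every validation rule are assumptions made for illustration. Only the variable names and default values come from the table, and variables without a documented default are left out.

```python
from dataclasses import dataclass

VALID_BACKENDS = {"VITS", "F5_V1", "F5_V2"}
VALID_DEVICES = {"auto", "cpu", "cuda"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed view of the documented THTTS_* variables."""
    backend: str
    host: str
    port: int
    log_level: str
    device: str
    speed: float
    nfe_steps: int
    max_concurrent: int


def load_settings(env):
    """Read settings from an env mapping, applying the documented defaults."""
    settings = Settings(
        backend=env.get("THTTS_BACKEND", "VITS"),
        host=env.get("THTTS_HOST", "0.0.0.0"),
        port=int(env.get("THTTS_PORT", "10200")),
        log_level=env.get("THTTS_LOG_LEVEL", "INFO"),
        device=env.get("THTTS_DEVICE", "auto"),
        speed=float(env.get("THTTS_SPEED", "1.0")),
        nfe_steps=int(env.get("THTTS_NFE_STEPS", "32")),
        max_concurrent=int(env.get("THTTS_MAX_CONCURRENT", "1")),
    )
    # The checks below are assumptions, not documented server behavior.
    if settings.backend not in VALID_BACKENDS:
        raise ValueError(f"THTTS_BACKEND must be one of {sorted(VALID_BACKENDS)}")
    if settings.device not in VALID_DEVICES:
        raise ValueError(f"THTTS_DEVICE must be one of {sorted(VALID_DEVICES)}")
    if not 1 <= settings.port <= 65535:
        raise ValueError("THTTS_PORT must be between 1 and 65535")
    if settings.speed <= 0 or settings.nfe_steps < 1 or settings.max_concurrent < 1:
        raise ValueError("speed, NFE steps, and concurrency must be positive")
    return settings
```

Passing `os.environ` gives the effective configuration; passing a plain dict makes the loader easy to exercise without touching the real environment.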

Voices List yaml File

Set `THTTS_VOICES_YAML` to the path of a YAML file like the following to serve multiple voices at the same time:

- name: default
  attribution:
    name: VIZINTZOR/F5-TTS-THAI
    url: https://huggingface.co/VIZINTZOR/F5-TTS-THAI
  languages: ["th", "th-TH"]
  description: Default Original
  installed: true
  version: "1.0"
  ref_sound_path: /mnt/data/services/thtts/ref_sound/original__ฉันเดินทางไปเที่ยวที่จังหวัดเชียงใหม่ในช่วงฤดูหนาวเพื่อสัมผัสอากาศเย็นสบาย.wav
  ref_sound_sentence: ฉันเดินทางไปเที่ยวที่จังหวัดเชียงใหม่ในช่วงฤดูหนาวเพื่อสัมผัสอากาศเย็นสบาย

- name: meme
  attribution:
    name: VIZINTZOR/F5-TTS-THAI
    url: https://huggingface.co/VIZINTZOR/F5-TTS-THAI
  languages: ["th", "th-TH"]
  description: meme Female
  installed: true
  version: "1.0"
  ref_sound_path: /mnt/data/services/thtts/ref_sound/meme__ชั้นเดินทางไปเที่ยวที่จังหวัดเชียงใหม่ในช่วงฤดูหนาวเพื่อสัมผัสอากาศเย็นสบาย.mp3
  ref_sound_sentence: ชั้นเดินทางไปเที่ยวที่จังหวัดเชียงใหม่ในช่วงฤดูหนาวเพื่อสัมผัสอากาศเย็นสบาย
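
A quick sanity check of a voices list before starting the server can catch missing fields or duplicate names. The sketch below is an illustration, not part of THTTS: it assumes the file has already been parsed into a list of dicts (for example with PyYAML's `yaml.safe_load`), and the choice of required keys is a guess based on the example above. `validate_voices` and the sample paths are hypothetical.

```python
# Keys assumed to be required, based only on the example entries above.
REQUIRED_KEYS = ("name", "languages", "ref_sound_path", "ref_sound_sentence")


def validate_voices(voices):
    """Return a dict of voice name -> entry; raise ValueError on problems."""
    by_name = {}
    for index, voice in enumerate(voices):
        missing = [key for key in REQUIRED_KEYS if not voice.get(key)]
        if missing:
            raise ValueError(f"voice #{index} is missing: {', '.join(missing)}")
        name = voice["name"]
        if name in by_name:
            raise ValueError(f"duplicate voice name: {name}")
        by_name[name] = voice
    return by_name


# Already-parsed entries mirroring the YAML structure; paths are placeholders.
sample = [
    {
        "name": "default",
        "languages": ["th", "th-TH"],
        "ref_sound_path": "/path/to/ref_sound/default.wav",
        "ref_sound_sentence": "ฉันเดินทางไปเที่ยวที่จังหวัดเชียงใหม่",
    },
    {
        "name": "meme",
        "languages": ["th", "th-TH"],
        "ref_sound_path": "/path/to/ref_sound/meme.mp3",
        "ref_sound_sentence": "ชั้นเดินทางไปเที่ยวที่จังหวัดเชียงใหม่",
    },
]

voices_by_name = validate_voices(sample)
```

Each reference transcript should match what is spoken in its reference audio, which is why both fields are treated as a pair here.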

3. Docker Compose (NVIDIA GPU)

services:
  thtts:
    image: ghcr.io/zen3515/thtts:latest
    container_name: thtts
    restart: unless-stopped
    shm_size: "2g" # please adjust
    environment:
      - THTTS_BACKEND=F5_V1
      - THTTS_HOST=0.0.0.0
      - THTTS_PORT=10200
      - THTTS_LOG_LEVEL=INFO
      - THTTS_DEVICE=auto
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    ports:
      - "10200:10200"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

Note:

Make sure the NVIDIA Container Toolkit is installed on the host.
Adjust the THTTS_BACKEND and other environment variables as needed.

How to Test

Query Info

printf '{"type":"describe","data":{}}\n' | nc 127.0.0.1 10200
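
The same query can be made from Python, which also makes the reply easier to inspect. This is a minimal sketch based on my understanding of the Wyoming wire format: each event is a JSON header line, optionally followed by `data_length` bytes of extra JSON data and `payload_length` bytes of binary payload. Treat that framing as an assumption and check it against the Wyoming protocol documentation; `encode_event`, `read_event`, and `describe` are hypothetical helper names.

```python
import json
import socket


def encode_event(event_type, data=None):
    """Serialize an event as one JSON line, like the printf command above."""
    header = {"type": event_type, "data": data or {}}
    return json.dumps(header, ensure_ascii=False).encode("utf-8") + b"\n"


def read_event(stream):
    """Read one event from a binary file-like object.

    Returns (type, data, payload). Assumes the header may announce extra
    JSON data (data_length) and a binary payload (payload_length).
    """
    line = stream.readline()
    if not line:
        raise ConnectionError("connection closed before an event arrived")
    header = json.loads(line)
    data = header.get("data") or {}
    data_length = header.get("data_length") or 0
    if data_length:
        data.update(json.loads(stream.read(data_length)))
    payload_length = header.get("payload_length") or 0
    payload = stream.read(payload_length) if payload_length else b""
    return header["type"], data, payload


def describe(host="127.0.0.1", port=10200, timeout=5.0):
    """Send a describe event and return the server's first reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_event("describe"))
        with sock.makefile("rb") as stream:
            return read_event(stream)
```

With the server running, `describe()` should return the event the server sends in response to `describe`, which can then be printed or inspected for the advertised voices.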

Synthesize Speech

Connect the server to Home Assistant. Its Wyoming integration is probably the client that follows the Wyoming protocol most closely, so it is the most reliable way to test synthesis.

License

See individual model pages on Hugging Face for license details.

