awni/voxmlx

voxmlx

Realtime speech-to-text with Voxtral Mini Realtime in MLX.

Install

pip install voxmlx

Usage

voxmlx

Transcribe audio from a file, or stream from the microphone in real time.

Stream from microphone:

voxmlx

Transcribe a file:

voxmlx --audio audio.flac

Options:

| Flag | Description | Default |
| --- | --- | --- |
| --audio | Path to audio file (omit to stream from mic) | None |
| --model | Model path or HuggingFace model ID | mlx-community/Voxtral-Mini-4B-Realtime-6bit |
| --temp | Sampling temperature (0 = greedy) | 0.0 |
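
For scripting, the command line can also be assembled from Python. The following is a minimal sketch, not part of voxmlx: `build_voxmlx_command` and `run_voxmlx` are hypothetical helper names, and they use only the flags listed in the table above.

```python
import subprocess
from typing import List, Optional


def build_voxmlx_command(
    audio: Optional[str] = None,
    model: Optional[str] = None,
    temp: Optional[float] = None,
) -> List[str]:
    """Build an argument list for the `voxmlx` CLI (hypothetical helper).

    Any argument left as None is omitted, so the CLI falls back to its
    own default (for example, streaming from the microphone when no
    audio path is given).
    """
    cmd = ["voxmlx"]
    if audio is not None:
        cmd += ["--audio", audio]
    if model is not None:
        cmd += ["--model", model]
    if temp is not None:
        cmd += ["--temp", str(temp)]
    return cmd


def run_voxmlx(**kwargs) -> int:
    """Run the CLI and return its exit code. Requires voxmlx to be installed."""
    return subprocess.run(build_voxmlx_command(**kwargs)).returncode


# Example (not executed here):
# run_voxmlx(audio="audio.flac")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.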

voxmlx-convert

Convert Voxtral weights to voxmlx/MLX format with optional quantization.

Basic conversion:

voxmlx-convert --mlx-path voxtral-mlx

4-bit quantized conversion:

voxmlx-convert -q --mlx-path voxtral-mlx-4bit

Convert and upload to HuggingFace:

voxmlx-convert -q --mlx-path voxtral-mlx-4bit --upload-repo username/voxtral-mlx-4bit

Options:

| Flag | Description | Default |
| --- | --- | --- |
| --hf-path | HuggingFace model ID or local path | mistralai/Voxtral-Mini-4B-Realtime-2602 |
| --mlx-path | Output directory | mlx_model |
| -q, --quantize | Quantize the model | Off |
| --group-size | Quantization group size | 64 |
| --bits | Bits per weight | 4 |
| --dtype | Cast weights (float16, bfloat16, float32) | None |
| --upload-repo | HuggingFace repo to upload converted model | None |
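
To get a feel for how --bits and --group-size affect the size of the converted model, here is a rough back-of-the-envelope sketch. It assumes group-wise affine quantization that stores one 16-bit scale and one 16-bit bias per group, and it takes the parameter count (about 4 billion) from the "4B" in the model name. Both are assumptions, and layers left unquantized are ignored, so treat the result as an estimate only.

```python
def effective_bits_per_weight(bits: int = 4, group_size: int = 64,
                              overhead_bits: int = 32) -> float:
    """Bits stored per weight, including per-group metadata.

    overhead_bits is the assumed cost of one scale plus one bias per
    group (2 x 16 bits), spread over the group_size weights in the group.
    """
    return bits + overhead_bits / group_size


def estimated_size_gb(num_params: float, bits: int = 4,
                      group_size: int = 64) -> float:
    """Approximate weight size in gigabytes (10**9 bytes)."""
    total_bits = num_params * effective_bits_per_weight(bits, group_size)
    return total_bits / 8 / 1e9


# With the defaults (4 bits, group size 64) each weight costs 4.5 bits,
# so an assumed 4e9 parameters come to about 2.25 GB.
print(effective_bits_per_weight())   # 4.5
print(estimated_size_gb(4e9))        # 2.25
```

Under these assumptions a smaller group size raises the per-weight overhead (for example, group size 32 adds 1 bit per weight instead of 0.5).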

Python API

from voxmlx import transcribe

text = transcribe("audio.flac")
print(text)
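
Transcribing several files can be sketched on top of the same call. The helper below is hypothetical (`transcribe_many` is not part of voxmlx). It only assumes what the example above shows: `transcribe` takes a path string and returns the transcript text. The transcription function is passed in as an argument so the helper can be tried without a model loaded.

```python
from typing import Callable, Dict, Iterable


def transcribe_many(
    paths: Iterable[str],
    transcribe_fn: Callable[[str], str],
) -> Dict[str, str]:
    """Map each audio path to its transcript, in input order.

    Paths that appear more than once are transcribed only once.
    """
    results: Dict[str, str] = {}
    for path in paths:
        if path in results:
            continue  # already transcribed, skip the duplicate
        results[path] = transcribe_fn(path)
    return results


# Usage with voxmlx installed (not executed here):
# from voxmlx import transcribe
# texts = transcribe_many(["first.flac", "second.flac"], transcribe)
# for path, text in texts.items():
#     print(path, text)
```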
