3 unstable releases

0.2.0	Mar 18, 2026
0.1.1	Mar 18, 2026
0.1.0	Mar 18, 2026

voice-dsp

DSP primitives for the voice TTS pipeline, built on mlx-rs (Apple MLX).

Install

[dependencies]
voice-dsp = "0.1"

What's inside

STFT / iSTFT — Short-Time Fourier Transform and its inverse, matching PyTorch conventions
MlxStft — batched STFT wrapper used by the vocoder pipeline (transform → magnitude + phase, inverse → audio)
Windowing — Hanning window generation
Interpolation — 1-D nearest/linear interpolation for upsampling tensors
Phase utilities — mlx_angle (complex argument) and mlx_unwrap (phase unwrapping)
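
To make the windowing, interpolation, and phase-unwrapping items above concrete, here is a minimal sketch of the underlying math in plain Rust on slices. It is not the crate's implementation and does not use mlx_rs::Array. The function names (hann_periodic, interp_linear, unwrap_phase) are hypothetical, and the conventions chosen here (periodic window, endpoints mapped onto endpoints when interpolating) are assumptions that may differ from what voice-dsp does.

```rust
use std::f64::consts::PI;

/// Periodic Hann window of length `n`: w[k] = 0.5 * (1 - cos(2*pi*k / n)).
/// (Assumed convention; a symmetric window would divide by n - 1 instead.)
fn hann_periodic(n: usize) -> Vec<f64> {
    (0..n)
        .map(|k| 0.5 * (1.0 - (2.0 * PI * k as f64 / n as f64).cos()))
        .collect()
}

/// Linear interpolation of `x` to `out_len` samples, mapping the first and
/// last input samples onto the first and last output samples.
fn interp_linear(x: &[f64], out_len: usize) -> Vec<f64> {
    if x.is_empty() || out_len == 0 {
        return Vec::new();
    }
    if x.len() == 1 || out_len == 1 {
        return vec![x[0]; out_len];
    }
    let scale = (x.len() - 1) as f64 / (out_len - 1) as f64;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * scale;
            // Clamp so that `lo + 1` is always a valid index.
            let lo = (pos.floor() as usize).min(x.len() - 2);
            let frac = pos - lo as f64;
            x[lo] * (1.0 - frac) + x[lo + 1] * frac
        })
        .collect()
}

/// Phase unwrapping: when consecutive input samples differ by pi or more in
/// magnitude, shift all following samples by a multiple of 2*pi so the jump
/// disappears.
fn unwrap_phase(phase: &[f64]) -> Vec<f64> {
    let mut out = Vec::with_capacity(phase.len());
    let mut offset = 0.0;
    for (i, &p) in phase.iter().enumerate() {
        if i > 0 {
            let delta = p - phase[i - 1];
            offset -= 2.0 * PI * (delta / (2.0 * PI)).round();
        }
        out.push(p + offset);
    }
    out
}
```

For example, unwrap_phase(&[3.0, -3.0]) moves the second sample up by 2π, so the output continues smoothly from 3.0 instead of jumping by 6 radians.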

Usage

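use voice_dsp::{stft, istft, hanning, MlxStft};

// Batched STFT for the vocoder.
// `audio_batch` is an mlx_rs::Array prepared by the caller.
// The binding is named `vocoder_stft` so it does not shadow the imported `stft` function.
let vocoder_stft = MlxStft::new(1024, 256, 1024)?;
let (magnitude, phase) = vocoder_stft.transform(&audio_batch)?;
let reconstructed = vocoder_stft.inverse(&magnitude, &phase)?;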

All functions operate on mlx_rs::Array and return Result<_, mlx_rs::error::Exception>.
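
When sizing buffers around the STFT, it can help to estimate the output shape up front. The sketch below is plain arithmetic, not part of voice-dsp. It assumes that the three arguments in MlxStft::new(1024, 256, 1024) are FFT size, hop length, and window length (the README does not say), and that "PyTorch conventions" means centered framing (padding by n_fft / 2 on each side) with a one-sided spectrum. The helper names are hypothetical.

```rust
/// Number of STFT frames when the signal is padded by `n_fft / 2` samples on
/// both sides before framing (the `center=True` convention in PyTorch).
/// Assumes `num_samples >= 1` and `hop >= 1`.
fn num_frames_centered(num_samples: usize, n_fft: usize, hop: usize) -> usize {
    let padded = num_samples + 2 * (n_fft / 2);
    (padded - n_fft) / hop + 1
}

/// Number of frequency bins in a one-sided STFT of real-valued input.
fn num_bins_onesided(n_fft: usize) -> usize {
    n_fft / 2 + 1
}
```

Under these assumptions, 24000 samples with n_fft = 1024 and hop = 256 would yield 94 frames of 513 bins each. Check the actual output shapes of transform before relying on these numbers.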

Requirements

macOS with Apple Silicon (MLX requirement)

License

MIT
