3 unstable releases
| Version | Released |
|---|---|
| 0.2.0 | Mar 18, 2026 |
| 0.1.1 | Mar 18, 2026 |
| 0.1.0 | Mar 18, 2026 |
voice-dsp
DSP primitives for the voice TTS pipeline, built on mlx-rs (Apple MLX).
Install
[dependencies]
voice-dsp = "0.1"
What's inside
- STFT / iSTFT: Short-Time Fourier Transform and its inverse, matching PyTorch conventions
- MlxStft: batched STFT wrapper used by the vocoder pipeline (transform → magnitude + phase, inverse → audio)
- Windowing: Hanning window generation
- Interpolation: 1-D nearest/linear interpolation for upsampling tensors
- Phase utilities: mlx_angle (complex argument) and mlx_unwrap (phase unwrapping)
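For readers new to these primitives, here is a plain-Rust sketch (standard library only, on `Vec<f32>` rather than `mlx_rs::Array`) of what a Hann window, 1-D linear interpolation, and phase unwrapping compute. The names `hann_periodic`, `lerp_resample`, and `unwrap_phase` are hypothetical and are not this crate's API. The periodic window form and the endpoint-aligned interpolation are assumptions; the README does not say which variants the crate implements.

```rust
use std::f32::consts::PI;

/// Periodic Hann window: w[k] = 0.5 * (1 - cos(2*pi*k / n)).
/// Assumption: periodic form (the default of `torch.hann_window`).
fn hann_periodic(n: usize) -> Vec<f32> {
    (0..n)
        .map(|k| 0.5 * (1.0 - (2.0 * PI * k as f32 / n as f32).cos()))
        .collect()
}

/// Linear interpolation of `x` to `out_len` samples, keeping the first
/// and last input samples fixed (endpoint-aligned; an assumption).
fn lerp_resample(x: &[f32], out_len: usize) -> Vec<f32> {
    if x.is_empty() || out_len == 0 {
        return Vec::new();
    }
    if x.len() == 1 || out_len == 1 {
        return vec![x[0]; out_len];
    }
    let scale = (x.len() - 1) as f32 / (out_len - 1) as f32;
    (0..out_len)
        .map(|i| {
            let pos = i as f32 * scale;
            // Clamp so that `lo + 1` is always a valid index.
            let lo = (pos.floor() as usize).min(x.len() - 2);
            let t = pos - lo as f32;
            x[lo] * (1.0 - t) + x[lo + 1] * t
        })
        .collect()
}

/// Phase unwrapping for input already wrapped into (-pi, pi]:
/// whenever consecutive samples jump by more than pi, shift the
/// rest of the signal by 2*pi in the opposite direction.
fn unwrap_phase(phase: &[f32]) -> Vec<f32> {
    let mut out = Vec::with_capacity(phase.len());
    let mut offset = 0.0f32;
    for (i, &p) in phase.iter().enumerate() {
        if i > 0 {
            let d = p - phase[i - 1];
            if d > PI {
                offset -= 2.0 * PI;
            } else if d < -PI {
                offset += 2.0 * PI;
            }
        }
        out.push(p + offset);
    }
    out
}
```

The complex argument computed by a function like mlx_angle corresponds to `im.atan2(re)` on scalar `f32` values.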
Usage
use voice_dsp::{stft, istft, hanning, MlxStft};
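// Batched STFT for the vocoder.
// `audio_batch` is an mlx_rs::Array; `?` needs an enclosing function that returns Result.
// The wrapper is bound to `vocoder_stft` so it does not shadow the imported `stft` function.
let vocoder_stft = MlxStft::new(1024, 256, 1024)?;
let (magnitude, phase) = vocoder_stft.transform(&audio_batch)?;
let reconstructed = vocoder_stft.inverse(&magnitude, &phase)?;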
All functions operate on mlx_rs::Array and return Result<_, mlx_rs::error::Exception>.
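As a rough guide to the array shapes involved, the sketch below computes how many frequency bins and frames a PyTorch-style STFT produces. It assumes the three arguments to `MlxStft::new` are FFT size, hop length, and window length, in that order, and that the transform is one-sided with centered frames, as in the `torch.stft` defaults. This README confirms neither assumption. `stft_shape` is a hypothetical helper, not part of the crate.

```rust
/// Returns (frequency_bins, frames) for a one-sided, centered STFT,
/// following the shape rule documented for `torch.stft`:
/// bins = n_fft / 2 + 1, frames = 1 + num_samples / hop_length.
fn stft_shape(num_samples: usize, n_fft: usize, hop_length: usize) -> (usize, usize) {
    assert!(hop_length > 0, "hop_length must be non-zero");
    let freq_bins = n_fft / 2 + 1;
    let frames = 1 + num_samples / hop_length; // integer division
    (freq_bins, frames)
}
```

Under these assumptions, one second of 24 kHz audio with an FFT size of 1024 and a hop of 256 gives 513 bins and 94 frames.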
Requirements
- macOS with Apple Silicon (MLX requirement)
License
MIT