Note: This project is still a work in progress. It has been tested on Windows only.
Usage: ds_onnx_infer [-h] --ds-file VAR --acoustic-config VAR --vocoder-config VAR
[--spk VAR] [--transpose VAR] [--dur] [--variance] [--pitch] --out VAR
[--depth VAR] [--steps VAR]
[--ep VAR] [--device-index VAR] [--verbose]
Optional arguments:
-h, --help shows help message and exits
-v, --version prints version information and exits
--ds-file Path to .ds file [required]
--acoustic-config Path to acoustic dsconfig.yaml [required]
--vocoder-config Path to vocoder.yaml [required]
--spk Speaker Mixture (e.g. "name" or "name1|name2" or "name1:0.25|name2:0.75")
--transpose Transpose pitch in semitones [default: 0]
--dur Use Duration Predictor
--variance Use Multi-Variance Predictor
--pitch Use Pitch Predictor
--out Output Audio Filename (*.wav) [required]
--depth Shallow diffusion depth (type: int, range: [1, 1000]) or rectified flow depth (type: float, range: [0, 1]) [default: 1]
--ep Execution provider for audio inference (cpu/directml/cuda) [default: "cpu"]
--device-index GPU device index [default: 0]
--verbose Show debug-level logs
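
Example invocation (a minimal sketch, not taken from the project itself): the options above could be combined as follows. The file paths song.ds, acoustic/dsconfig.yaml, vocoder/vocoder.yaml and output.wav are hypothetical placeholders, and the speaker names are the ones from the --spk help text. The snippet assumes a POSIX-like shell such as Git Bash; in cmd or PowerShell, put the same arguments on a single line instead.

```shell
# Collect the arguments as positional parameters so they are easy to review.
# All paths below are hypothetical placeholders; replace them with your own files.
set -- \
  --ds-file "song.ds" \
  --acoustic-config "acoustic/dsconfig.yaml" \
  --vocoder-config "vocoder/vocoder.yaml" \
  --spk "name1:0.25|name2:0.75" \
  --out "output.wav"

# Run the tool if it is on PATH; otherwise just print the command line.
if command -v ds_onnx_infer >/dev/null 2>&1; then
  ds_onnx_infer "$@"
else
  echo "ds_onnx_infer not found on PATH; would run: ds_onnx_infer $*"
fi
```

Only the three required inputs, --out and a speaker mixture are passed here, so --ep falls back to its documented default of "cpu". The --spk value is quoted because | would otherwise be interpreted by the shell as a pipe.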
See docs/BUILD.md for detailed build instructions.
See docs/FAQ.md for frequently asked questions.