- South Africa
- https://orcid.org/0000-0002-8168-7857
Stars
Here you see how to track your portfolio the right way
Umami is a modern, privacy-focused alternative to Google Analytics.
ishine / tell-stories-webui
Forked from c4fun/tell-stories-webui
Dynamic Voice Actor Assignment and Emotional Narration for Realistic Story Play
An open-source, cross-platform terminal for seamless workflows
General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
gau-nernst / torchtune
Forked from meta-pytorch/torchtune
A Native-PyTorch Library for LLM Fine-tuning
The official Implementation of PeriodWave and PeriodWave-Turbo
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
Implementation of TTS model based on NVIDIA P-Flow TTS Paper
Unofficial implementation of the WaveNeXt vocoder
SubFix: Efficient Web-Based Audio Subtitle Editing and Multilingual Automatic Annotation Tool.
Full version of wav2lip-onnx including face alignment and face enhancement and more...
Zero-Shot Speech Editing and Text-to-Speech in the Wild
The #1 open-source voice interface for desktop, mobile, and ESP32 chips.
Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Solution for Zalo AI Challenge 2022 - Lyrics Alignment
AI-Unicamp / TTS
Forked from coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A collection of datasets for the purpose of emotion recognition/detection in speech.
An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal.
Jason Riggle's chart of phonological features in JSON format + extras