Stars
Project that allows one to use a microphone with OpenAI Whisper.
Snes9x - Portable Super Nintendo Entertainment System (TM) emulator
Cross-platform (Qt), open-source (GPLv3) video editor
Downloads videos and playlists from YouTube
Demo Programs for the "Talking Head(?) Anime from a Single Image 3: Now the Body Too" Project
The code for the bark-voicecloning model. Training and inference.
Build and share delightful machine learning apps, all in Python. Star to support our work!
liujing04/Retrieval-based-Voice-Conversion-WebUI reconstruction project
Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)
Easily train a good VC model with voice data <= 10 mins!
Text-Prompted Generative Audio Model
152334H / tortoise-tts-fast
Forked from neonbjb/tortoise-tts
Fast TorToiSe inference (5x or your money back!)
Python script that slices audio with silence detection
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
GUI for a Vocal Remover that uses Deep Neural Networks.
flutydeer / audio-slicer
Forked from openvpi/audio-slicer
A simple GUI application that slices audio with silence detection
SoftVC VITS Singing Voice Conversion
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Audio Slicer that uses silence detection to split .wav audio files into multiple .wav samples.
Deezer source separation library including pretrained models.
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
A multi-voice TTS system trained with an emphasis on quality
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
MelGAN vocoder (compatible with NVIDIA/tacotron2)
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)