-
stretch_audio Public
Pretty decent algorithm to stretch audio with far fewer artifacts than WSOLA/librosa.
-
awesome-russian-speech Public
Forked from alphacep/awesome-russian-speech
Russian speech technology links
-
whisperX Public
Forked from m-bain/whisperX
WhisperX: Automatic Speech Recognition with Accurate Word-level Timestamps.
Python · BSD 2-Clause "Simplified" License · Updated May 1, 2025
-
llama-cpp-python Public
Forked from abetlen/llama-cpp-python
Python bindings for llama.cpp
Python · MIT License · Updated Apr 11, 2025
-
vocos Public
Forked from gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Python · MIT License · Updated Mar 8, 2025
-
stable-diffusion-webui Public
Forked from AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion Forge with support for SD3.5
Python · GNU Affero General Public License v3.0 · Updated Jan 14, 2025
-
DataProcessingFramework Public
Forked from ai-forever/DataProcessingFramework
Framework for processing and filtering datasets
Python · Apache License 2.0 · Updated Dec 18, 2024
-
CRAFT-text-detection Public
Forked from boomb0om/CRAFT-text-detection
An unofficial PyTorch implementation of CRAFT text detector with better interface and fp16 support
Jupyter Notebook · Updated Dec 2, 2024
-
Grounded-Segment-Anything Public
Forked from IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Jupyter Notebook · Apache License 2.0 · Updated Oct 22, 2024
-
stable-diffusion-docker Public
Forked from FurkanGozukara/stable-diffusion-docker
Docker image for Stable Diffusion WebUI with ControlNet, After Detailer, Dreambooth, Deforum and ReActor extensions, as well as Kohya_ss and ComfyUI
-
Q-VITS2-Voice-Cloning Public
Forked from FENRlR/MB-iSTFT-VITS2
WIP: VITS 2 with quantized output of text-encoder and voice cloning
-
OpenVoice Public
Forked from myshell-ai/OpenVoice
Instant voice cloning by MyShell.
Python · MIT License · Updated Jul 5, 2024
-
demucs Public
Forked from facebookresearch/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Python · MIT License · Updated Jul 5, 2024
-
flask-elastic-image-search Public
Forked from radoondas/flask-elastic-image-search
Python · Apache License 2.0 · Updated Jun 21, 2024
-
vits2_pytorch_bigvgan Public
Forked from p0p4k/vits2_pytorch
unofficial vits2-TTS implementation in pytorch
-
HiFi-GAN Public
Forked from rishikksh20/HiFi-GAN
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Python · MIT License · Updated May 28, 2023
-
serverless-template-whisper Public template
Forked from sahil280114/serverless-template-whisper
Python · MIT License · Updated Feb 7, 2023
-
address-normalizer Public
Looks up the selected address in FIAS (the Russian Federal Information Address System)
-
ParlAI Public
Forked from facebookresearch/ParlAI
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
Python · MIT License · Updated Aug 13, 2022
-
gruut-ipa Public
Forked from rhasspy/gruut-ipa
Python library for manipulating pronunciations using the International Phonetic Alphabet (IPA)
Python · MIT License · Updated Jun 24, 2022
-
awesome-speech-recognition-speech-synthesis-papers Public
Forked from zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
MIT License · Updated Jun 3, 2022
-
Paper Template for INTERSPEECH 2021
TeX · Updated Feb 19, 2021