- gerzz.inc
- shanghai
- dubbing-ai.com
-
viet-tts Public
Forked from dangvansam/viet-tts
VietTTS: An Open-Source Vietnamese Text to Speech
Python Apache License 2.0 Updated Oct 29, 2024 -
vec2wav2.0 Public
Forked from cantabile-kwok/vec2wav2.0
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
Python GNU General Public License v3.0 Updated Oct 29, 2024 -
F5-TTS Public
Forked from SWivid/F5-TTS
Official code for "A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Python MIT License Updated Oct 29, 2024 -
CMOT Public
Forked from ictnlp/CMOT
Code for ACL 2023 main conference paper "CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation"
Python Updated Oct 29, 2024 -
GLM-4-Voice Public
Forked from THUDM/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
Python Apache License 2.0 Updated Oct 28, 2024 -
seed-vc Public
Forked from Plachtaa/seed-vc
zero-shot voice conversion with in context learning
Python GNU General Public License v3.0 Updated Oct 28, 2024 -
AP-BWE Public
Forked from yxlu-0102/AP-BWE
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
Python MIT License Updated Oct 28, 2024 -
ggml Public
Forked from ggerganov/ggml
Tensor library for machine learning
C++ MIT License Updated Oct 27, 2024 -
GLM-4 Public
Forked from THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Python Apache License 2.0 Updated Oct 27, 2024 -
audiotools Public
Forked from descriptinc/audiotools
Object-oriented handling of audio data, with GPU-powered augmentations, and more.
Python MIT License Updated Oct 27, 2024 -
podcastfy Public
Forked from souzatharsis/podcastfy
An Open Source alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI
Python Apache License 2.0 Updated Oct 27, 2024 -
libllm Public
Forked from ling0322/libllm
Efficient inference of large language models.
C++ MIT License Updated Oct 27, 2024 -
llama.cpp Public
Forked from ggerganov/llama.cpp
Port of Facebook's LLaMA model in C/C++
C++ MIT License Updated Oct 26, 2024 -
LlamaVoice Public
Forked from OpenT2S/LlamaVoice
LlamaVoice is a llama-based large voice generation model, providing inference and training ability.
-
streaming-sensevoice Public
Forked from pengzhendong/streaming-sensevoice
Pseudo Streaming SenseVoice with Hotwords
Python Apache License 2.0 Updated Oct 26, 2024 -
SenseVoice.cpp Public
Forked from lovemefan/SenseVoice.cpp
Port of Funasr's Sense-voice model in C/C++
-
CosyVoice Public
Forked from FunAudioLLM/CosyVoice
LLM based TTS model, providing inference/training/deployment full-stack ability.
Python Apache License 2.0 Updated Oct 25, 2024 -
SubFix Public
Forked from cronrpc/SubFix
Web-based tool for efficient batch editing, precise subtitle correction, and flexible audio control.
Python Apache License 2.0 Updated Oct 25, 2024 -
noScribe Public
Forked from kaixxx/noScribe
Cutting-edge AI technology for automated audio transcription. A nice GUI for OpenAI's Whisper and pyannote (speaker identification)
Python GNU General Public License v3.0 Updated Oct 25, 2024 -
WhisperBiasing Public
Forked from BriansIDP/WhisperBiasing
Jupyter Notebook MIT License Updated Oct 25, 2024 -
SpeechT5 Public
Forked from microsoft/SpeechT5
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing (ACL'2022)
Python MIT License Updated Oct 25, 2024 -
MeloTTS.cpp Public
Forked from apinge/MeloTTS.cpp
A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting mixed English and Chinese languages.
C++ Apache License 2.0 Updated Oct 24, 2024 -
moonshine Public
Forked from usefulsensors/moonshine
Fast and accurate automatic speech recognition (ASR) for edge devices
Python MIT License Updated Oct 22, 2024 -
WaveFM Public
Forked from PKBHY/WaveFM
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
-
ReDimNet Public
Forked from IDRnD/ReDimNet
The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"
Python MIT License Updated Oct 22, 2024 -
asr-ctc-decoder-hotword Public
Forked from pengzhendong/asr-decoder
CTC decoder with hotwords for ASR.
Python Apache License 2.0 Updated Oct 21, 2024 -
AuxiliaryASR-NB Public
Forked from lemoi18/AuxiliaryASR-Conformer
Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) For Norwegian
Python MIT License Updated Oct 21, 2024 -
Amphion Public
Forked from open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…