QinHsiu

🎯

Focusing

Man proposes, God disposes.

31 followers · 158 following

02:36 (UTC -12:00)
https://qinhsiu.github.io

Stars

Awesome-TTS

some amazing TTS projects

122 repositories

spotify / klio

Smarter data pipelines for audio.

Python 866 52 Updated Jan 10, 2024

liusongxiang / StarGAN-Voice-Conversion

A PyTorch implementation of the paper: StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks

Python 523 92 Updated Oct 11, 2019

rishikksh20 / VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

Python 321 59 Updated Jul 25, 2024

rsxdalv / TTS-WebUI

A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio, Piper TTS, GPT-SoVITS, CosyVoice, XTTSv2, DIA, Kokoro, OpenVoice, ParlerTTS, Stable Audio, MMS, StyleTTS2, MAGNet, AudioGen, …

TypeScript 2,840 295 Updated Nov 23, 2025

rsxdalv / musicgen-prompts

Site for sharing MusicGen + AudioGen Prompts and Creations

TypeScript 48 5 Updated Mar 25, 2025

audeering / w2v2-how-to

How to use our public wav2vec2 dimensional emotion model

Jupyter Notebook 532 50 Updated May 22, 2023

b04901014 / FT-w2v2-ser

Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition

Python 153 34 Updated Oct 26, 2021

vbelz / Speech-enhancement

Deep learning for audio denoising

Python 742 130 Updated Oct 15, 2023

neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 14,744 2,048 Updated Nov 19, 2024

Rikorose / DeepFilterNet

Noise suppression using deep filtering

Python 3,654 373 Updated Oct 17, 2024

zcaceres / spec_augment

🔦 A PyTorch implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

Jupyter Notebook 499 61 Updated Jun 11, 2021

NVIDIA-NeMo / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,357 3,243 Updated Dec 24, 2025

QinHsiu / CL4SRL

A self-supervised framework for Text-to-Speech

Python 1 Updated Nov 5, 2023

keonlee9420 / Cross-Speaker-Emotion-Transfer

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

Python 194 26 Updated Nov 9, 2022

kaituoxu / Speech-Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

Python 805 196 Updated Apr 6, 2023

facebookresearch / audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

Python 2,846 280 Updated Sep 15, 2024

jianchang512 / clone-voice

A sound cloning tool with a web interface, using your voice or any sound to record audio

Python 8,863 976 Updated Aug 29, 2025

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,561 772 Updated May 27, 2025

SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

Python 19,619 1,638 Updated Nov 19, 2025

RasaHQ / rasa

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

Python 20,949 4,906 Updated Dec 18, 2025

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 53,476 5,855 Updated Dec 25, 2025

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 38,397 4,170 Updated Dec 3, 2025

gpt-omni / mini-omni

open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python 3,502 302 Updated Nov 5, 2024

zzw922cn / awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

3,105 513 Updated Oct 19, 2023

zzw922cn / Automatic_Speech_Recognition

End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow

Python 2,841 534 Updated Mar 24, 2023

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Python 4,971 1,170 Updated Dec 19, 2025

remsky / Kokoro-FastAPI

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching

Python 4,158 689 Updated Dec 13, 2025

QwenLM / Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 2,016 155 Updated Apr 21, 2025

fishaudio / fish-speech

SOTA Open Source TTS

Python 24,406 2,006 Updated Dec 1, 2025

jianchang512 / ChatTTS-ui

A simple native web interface that uses ChatTTS to synthesize text into speech, along with support for external API interfaces.

Python 7,467 914 Updated Dec 5, 2025
