45 results for source starred repositories written in Python

A Deep-Learning-Based Chinese Speech Recognition System

Python 8,367 1,898 Updated Sep 6, 2025

A Python package to analyze and compare voices with deep learning

Python 3,242 476 Updated Oct 12, 2023

WaveRNN Vocoder + TTS

Python 2,178 692 Updated Jul 2, 2022

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Python 2,165 616 Updated Oct 27, 2023

Speech emotion recognition implemented in Keras (LSTM, CNN, SVM, MLP)

Python 1,296 226 Updated Mar 25, 2023

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Python 1,095 218 Updated Oct 23, 2024

Unsupervised Speech Decomposition Via Triple Information Bottleneck

Python 699 96 Updated Oct 23, 2024

Voice Conversion Tool Kit

Python 608 114 Updated Feb 27, 2023

Speech emotion recognition using convolutional recurrent networks, based on IEMOCAP

Python 408 142 Updated Jul 8, 2019

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

Python 374 71 Updated Dec 8, 2022

Implementation of "Perceptual Losses for Real-Time Style Transfer and Super-Resolution" in PyTorch

Python 315 70 Updated Sep 23, 2020

End-to-End Automatic Speech Recognition on PyTorch

Python 304 62 Updated Jun 2, 2022

Implementation code of non-parallel sequence-to-sequence VC

Python 248 56 Updated Mar 24, 2023

Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention

Python 203 37 Updated Nov 30, 2020
Python 154 34 Updated Dec 20, 2023

This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition".

Python 137 27 Updated Oct 24, 2021

This is the implementation of the Speaker Odyssey 2020 paper "Transforming spectrum and prosody for emotional voice conversion with non-parallel training data".

Python 125 28 Updated Dec 14, 2020

This is the official implementation of the paper AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization.

Python 115 19 Updated Dec 7, 2020

A PyTorch-based end-to-end speech recognition system.

Python 114 24 Updated Jan 16, 2021

This is the implementation of our Interspeech 2020 paper "Converting anyone's emotion: towards speaker-independent emotional voice conversion".

Python 90 13 Updated Nov 13, 2020

This is the implementation of our Interspeech 2021 paper: Limited data emotional voice conversion leveraging text-to-speech: two-stage sequence-to-sequence training.

Python 87 16 Updated Dec 31, 2022

A system for singing voice synthesis

Python 79 19 Updated Jan 11, 2023

Style tokens with Tacotron 2

Python 62 16 Updated Jul 6, 2023

Implementation of the paper "Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning" from INTERSPEECH 2019

Python 57 11 Updated Dec 20, 2020
Python 49 9 Updated May 3, 2020

End-to-end Speech Emotion Recognition using BLSTMs with self-attention and Multi-domain training

Python 49 11 Updated Dec 7, 2023

This is the code for a controllable emotional voice conversion (EVC) framework for seen and unseen emotion generation.

Python 45 14 Updated Nov 3, 2021

3-D Convolutional Recurrent Neural Networks With Attention Model for Speech Emotion Recognition.

Python 45 3 Updated Nov 13, 2020

ERISHA is a multilingual, multispeaker expressive speech synthesis framework. It can transfer expressivity to the voice of a speaker for whom no expressive speech corpus is available.

Python 44 18 Updated Dec 17, 2020