View colingogo's full-sized avatar

16 Updated Feb 28, 2026

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,676 2,888 Updated Sep 2, 2024

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 39,162 4,675 Updated Aug 19, 2024

Singing voice conversion based on Whisper, with LoRA for singing voice cloning

Python 647 80 Updated Nov 3, 2023

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 100,973 28,098 Updated Jun 23, 2026

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 45,599 6,122 Updated Aug 16, 2024

An unofficial implementation of the autoregressive vocoder Multiband-WaveRNN. Audio samples at https://rongjiehuang.github.io/Multiband-WaveRNN/

Python 28 5 Updated Feb 12, 2021

A No-Recurrence Sequence-to-Sequence Model for Speech Recognition

Python 378 66 Updated Jul 21, 2022

Deep Learning Chinese Word Segment

C++ 2,070 632 Updated May 18, 2018

FastSpeech2 with cross-lingual support

Python 2 3 Updated May 3, 2022

Chinese Text Normalization and Dataset

Python 91 17 Updated May 14, 2022

Neural network-based forced alignment with bidirectional attention mechanism

Python 78 8 Updated Jan 17, 2025

[SIGGRAPH 2022 Journal Track] AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

Python 1,101 100 Updated Feb 15, 2023

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Python 5,207 467 Updated Nov 13, 2025

SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code

Python 201 31 Updated Sep 4, 2022

CUDA implementation of the MelGAN vocoder

Cuda 9 Updated Nov 1, 2021

Chinese speech synthesis implemented with tacotronV2 + wavernn (Tensorflow + pytorch)

Python 535 132 Updated May 22, 2023

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Python 845 157 Updated Oct 10, 2023

C++ implementation of Hidden Markov Model classifier

C++ 6 4 Updated Mar 26, 2015

A system for singing voice synthesis

Python 79 19 Updated Jan 11, 2023

Quasi-Periodic Parallel WaveGAN Pytorch implementation

Python 46 6 Updated Oct 29, 2022

A PyTorch implementation of FB-MelGAN

Python 90 7 Updated May 26, 2020

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Python 713 155 Updated Jul 12, 2022

Implementation of Gaussian Mixture Variational Autoencoder (GMVAE) for Unsupervised Clustering

Python 353 64 Updated Oct 2, 2020

Official Tensorflow implementation of U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation (ICLR 2020)

Python 6,117 1,016 Updated May 20, 2021

Quasi-RNN for language modeling

Python 58 17 Updated Feb 9, 2017

Speech Enhancement Generative Adversarial Network in TensorFlow

Python 861 281 Updated Mar 24, 2023

Pytorch implementation for few-shot photorealistic video-to-video translation.

Python 1,798 272 Updated Oct 27, 2021

ODAS: Open embeddeD Audition System

C 1,027 276 Updated Dec 5, 2024

Utilities for resampling and filtering audio data

Python 47 8 Updated Jan 9, 2020