Stars
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Train transformer language models with reinforcement learning.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
kaldi-asr/kaldi is the official location of the Kaldi project.
A beautiful, simple, clean, and responsive Jekyll theme for academics
pathogen.vim: manage your runtimepath
Foundational Models for State-of-the-Art Speech and Text Translation
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
⚡ Machine Learning in Action (Python3): kNN, decision trees, Bayes, logistic regression, SVM, linear regression, tree regression
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Google Drive Public File Downloader when Curl/Wget Fails
Muzic: Music Understanding and Generation with Artificial Intelligence
Align Anything: Training All-modality Model with Feedback
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
Hidden Markov Models in Python, with scikit-learn like API
Core Engine of Singing Voice Conversion & Singing Voice Clone
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
The official repo of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud.
FSA/FST algorithms, differentiable, with PyTorch compatibility.
A fundamental toolkit designed for music, song, and audio generation
Awesome speech/audio LLMs, representation learning, and codec models
A Framework for Speech, Language, Audio, Music Processing with Large Language Model