Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
🚀🚀 "Large model": train a small 64M-parameter GPT completely from scratch in just 2 hours! 🌏
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Easily train a good VC model with voice data <= 10 mins!
PyTorch Tutorial for Deep Learning Researchers
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
SoftVC VITS Singing Voice Conversion
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
A very simple framework for state-of-the-art Natural Language Processing (NLP)
An open source implementation of CLIP.
A MNIST-like fashion product database. Benchmark 👇
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to it.
Official implementation of AnimateDiff.
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
The most powerful local music generation model that outperforms almost all commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.
A Deep-Learning-Based Chinese Speech Recognition System
Code for the paper "Jukebox: A Generative Model for Music"
Text-audio foundation model from Boson AI
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, …
Inference and training library for high-quality TTS models.