A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,684 3,325 Updated Feb 5, 2026

jindongwang / transferlearning

Transfer learning / domain adaptation / domain generalization / multi-task learning etc. Papers, code, datasets, applications, tutorials.

Python 14,269 3,846 Updated Feb 18, 2025

jina-ai / clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

Python 12,818 2,077 Updated Jan 23, 2024

zai-org / CogVideo

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,397 1,251 Updated Nov 4, 2025

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Python 11,179 1,646 Updated Feb 4, 2026

lucidrains / denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in PyTorch

Python 10,443 1,266 Updated Aug 4, 2025

espnet / espnet

End-to-End Speech Processing Toolkit

Python 9,717 2,379 Updated Feb 4, 2026

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,683 793 Updated May 27, 2025

NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch

Python 8,911 1,513 Updated Jan 26, 2026

Plachtaa / VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,959 783 Updated Feb 11, 2024

zai-org / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 7,044 607 Updated Jul 4, 2025

yenchenlin / nerf-pytorch

A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.

Python 6,008 1,133 Updated Jul 25, 2024

MoonInTheRiver / DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,721 792 Updated Mar 19, 2025

nateraw / stable-diffusion-videos

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

Python 4,660 450 Updated Dec 16, 2025

ssut / py-googletrans

(unofficial) Googletrans: Free and Unlimited Google translate API for Python. Translates totally free of charge.

Python 4,211 744 Updated Apr 25, 2025

metavoiceio / metavoice-src

Foundational model for human-like, expressive TTS

Python 4,190 691 Updated Jul 30, 2024

TensorSpeech / TensorFlowTTS

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Python 3,993 807 Updated Jul 5, 2024

gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking. It features real-time end-to-end speech input and streaming audio output for conversation.

Python 3,520 307 Updated Nov 5, 2024

NVIDIA / flownet2-pytorch

PyTorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Python 3,273 746 Updated May 28, 2023

magenta / ddsp

DDSP: Differentiable Digital Signal Processing

Python 3,204 370 Updated Jan 9, 2026

Chenpeng Du cpdu

Stars

coqui-ai / TTS

deepspeedai / DeepSpeed

hpcaitech / ColossalAI

google-research / bert

fxsjy / jieba

facebookresearch / fairseq

haotian-liu / LLaVA

pytorch / examples

microsoft / VibeVoice

mnielsen / neural-networks-and-deep-learning

NVIDIA-NeMo / NeMo