98 stars written in Python

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 44,456 5,949 Updated Aug 16, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,539 4,704 Updated Feb 3, 2026

Making large AI models cheaper, faster and more accessible

Python 41,339 4,538 Updated Jan 19, 2026

TensorFlow code and pre-trained models for BERT

Python 39,834 9,713 Updated Jul 23, 2024

Jieba Chinese word segmentation

Python 34,742 6,720 Updated Aug 21, 2024

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 32,137 6,658 Updated Sep 30, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,425 2,723 Updated Aug 12, 2024

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

Python 23,725 9,792 Updated Sep 1, 2025

Open-Source Frontier Voice AI

Python 22,920 2,504 Updated Feb 3, 2026

Code samples for my book "Neural Networks and Deep Learning"

Python 17,407 6,993 Updated Jun 2, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,684 3,325 Updated Feb 5, 2026

Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, tutorials.

Python 14,269 3,846 Updated Feb 18, 2025

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

Python 12,818 2,077 Updated Jan 23, 2024

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,397 1,251 Updated Nov 4, 2025

A PyTorch-based Speech Toolkit

Python 11,179 1,646 Updated Feb 4, 2026

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Python 10,443 1,266 Updated Aug 4, 2025

End-to-End Speech Processing Toolkit

Python 9,717 2,379 Updated Feb 4, 2026

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,683 793 Updated May 27, 2025

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

Python 8,911 1,513 Updated Jan 26, 2026

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,959 783 Updated Feb 11, 2024

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 7,044 607 Updated Jul 4, 2025

A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.

Python 6,008 1,133 Updated Jul 25, 2024

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,721 792 Updated Mar 19, 2025

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

Python 4,660 450 Updated Dec 16, 2025

(unofficial) Googletrans: Free and Unlimited Google translate API for Python. Translates totally free of charge.

Python 4,211 744 Updated Apr 25, 2025

Foundational model for human-like, expressive TTS

Python 4,190 691 Updated Jul 30, 2024

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Python 3,993 807 Updated Jul 5, 2024

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,520 307 Updated Nov 5, 2024

Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Python 3,273 746 Updated May 28, 2023

DDSP: Differentiable Digital Signal Processing

Python 3,204 370 Updated Jan 9, 2026