qichilu

Charles qichilu

2 followers · 0 following

Stars

vibevoice-community / VibeVoice

VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)

Python 1,069 403 Updated Jan 23, 2026

SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 14,337 2,115 Updated Apr 4, 2026

Plachtaa / VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,956 779 Updated Feb 11, 2024

modelscope / KAN-TTS

KAN-TTS is a speech-synthesis training framework. Demos are available at https://modelscope.cn/models?page=1&tasks=text-to-speech

Python 524 88 Updated Dec 28, 2023

lifeiteng / vall-e

PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Python 2,206 333 Updated Sep 10, 2025

MIC-DKFZ / nnUNet

Python 8,292 2,357 Updated Apr 14, 2026

KrishnaDN / x-vector-pytorch

Implementation of the paper "Spoken Language Recognition using X-vectors" in PyTorch

Python 108 26 Updated Jul 20, 2020

modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 2,889 261 Updated Dec 8, 2025

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 8,808 764 Updated Mar 26, 2026

PaddlePaddle / PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 75,675 10,240 Updated Apr 14, 2026

microsoft / NeuralSpeech

Python 1,459 185 Updated Feb 11, 2024

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 15,686 1,647 Updated Mar 17, 2026

mansr / sox

Forked from cbagwell/sox

SoX, Swiss Army knife of sound processing

C 60 33 Updated Nov 20, 2017

jadore801120 / attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Python 9,684 2,096 Updated Apr 16, 2024

xinjli / allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Python 726 100 Updated Apr 26, 2024

antematter / kaldi-asr-client

Python client for Triton's Kaldi backend

C++ 2 Updated Dec 27, 2022

NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Jupyter Notebook 14,780 3,410 Updated Aug 12, 2024

triton-inference-server / python_backend

Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python.

C++ 673 193 Updated Apr 15, 2026

NVIDIA-AI-IOT / torch2trt

An easy to use PyTorch to TensorRT converter

Python 4,864 700 Updated Aug 17, 2024

NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 12,900 2,345 Updated Apr 13, 2026

pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Python 2,964 393 Updated Apr 15, 2026

jik876 / hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 2,341 554 Updated Jul 27, 2024

rishikksh20 / Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Python 112 32 Updated Aug 26, 2021

Labmem-Zhouyx / CDFSE_FastSpeech2

The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis”

Python 88 12 Updated Dec 20, 2022

jailuthra / asr

Kaldi ASR wrapper scripts

Python 2 1 Updated Jul 17, 2017

ming024 / FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Python 2,166 614 Updated Oct 27, 2023

MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi

Python 1,796 287 Updated Mar 31, 2026

xcmyz / FastVocoder

Includes Basis-MelGAN, MelGAN, HifiGAN and Multiband-HifiGAN; NHV may be added in the future.

Python 157 19 Updated Jul 2, 2021

CMsmartvoice / One-Shot-Voice-Cloning

☺️ One-shot voice cloning based on Unet-TTS

Jupyter Notebook 245 43 Updated Mar 22, 2022

facebookresearch / textlesslib

Library for Textless Spoken Language Processing

Python 557 57 Updated Aug 29, 2023
