UESTC PhD, TJU Master's
Starred repositories
daanzu / py-webrtcvad-wheels
Forked from wiseman/py-webrtcvad
Python interface to the WebRTC Voice Activity Detector (VAD) [released with binary wheels!]
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
PyTorch implementation of "Deep Speech 2: End-to-End Speech Recognition in English and Mandarin" (ICML, 2016)
A repository that will hold my experiments with various variational models
Wav2Lip version 288 and pipeline to train
🐍 Geometric Computer Vision Library for Spatial AI
Implementation of ViViT: A Video Vision Transformer
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
PyTorch implementation of deep audio embedding calculation
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
🔉 spafe: Simplified Python Audio Features Extraction
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html
This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1
Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained
Audio-Visual Speech Recognition using Sequence to Sequence Models
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
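"Hello 算法" (Hello Algo): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is in translation.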
🔥Highlighting the top ML papers every week.
A pipeline to read lips and generate speech for the read content, i.e., Lip to Speech Synthesis.