Stars
Assist Non-native Viewers: Multimodal Crosslingual Summarization for How2 Videos
GangmingZhao / pytorch-boat
Forked from mahaoyuHKU/pytorch-boat
This is an unofficial implementation of BOAT: Bilateral Local Attention Vision Transformer
PyTorch/Python implementation of the joint CNN-LSTM deep learning model
This repository includes the code to reproduce our paper "RawBoost: A Raw Data Boosting and Augmentation Method applied to Automatic Speaker Verification Anti-Spoofing".
This repository includes the code to reproduce our paper "End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection" (https://arxiv.o…
Implementation of Monaural Speech Enhancement with Recursive Learning in the Time Domain
The implementation of "A Recursive Network with Dynamic Attention for Monaural Speech Enhancement"
This repo provides the network code and the processed samples of the manuscript "Glance and Gaze: A Collaborative Learning Framework for Single-channel Speech Enhancement", which was accepted by El…
Official implementation of the SPL paper "One-class Learning Towards Synthetic Voice Spoofing Detection"
This repository contains some material on speech enhancement and dereverberation. On the one hand, I summarize this work for my further understanding. On the other hand, I hope that all beginners o…
Deep Learning Based Monaural Speech Dereverberation Models: Hope We Can Get Better Performance of Dereverberation
Production First and Production Ready End-to-End Speech Recognition Toolkit
Python code for Lite Audio-Visual Speech Enhancement.
Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Spectral Subtraction, Wiener Filtering, MMSE
The PyTorch improved version of TPAMI 2017 paper: Face Alignment in Full Pose Range: A 3D Total Solution.
The official PyTorch implementation of Towards Fast, Accurate and Stable 3D Dense Face Alignment, ECCV 2020.
A PyTorch implementation of dual-path RNNs (DPRNNs) based speech separation described in "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation".
transform-average-concatenate (TAC) method for end-to-end microphone permutation and number invariant ad-hoc beamforming.
A must-read paper for speech separation based on neural networks
A multi-GPU PyTorch implementation of Speech Transformer, an end-to-end Transformer-based ASR model, for Mandarin Chinese. This code is based on kaituo xu's work.
A PyTorch implementation of Conv-TasNet