Stars
Robust Speech Recognition via Large-Scale Weak Supervision
Deezer source separation library including pretrained models.
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Code for the paper Hybrid Spectrogram and Waveform Source Separation
The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Transformer: PyTorch Implementation of "Attention Is All You Need"
On-device AI across mobile, embedded and edge for PyTorch
Code and hyperparameters for the paper "Generative Adversarial Networks"
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
music21: a Toolkit for Computer-Aided Musical Analysis and Computational Musicology
NeuroKit2: The Python Toolbox for Neurophysiological Signal Processing
Reformer, the efficient Transformer, in Pytorch
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain. In which, we present a ca…
[NeurIPS 2021] "TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up", Yifan Jiang, Shiyu Chang, Zhangyang Wang
Official repository for the "Big Transfer (BiT): General Visual Representation Learning" paper.
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network
Official PyTorch implementation of the "Asymmetric Loss For Multi-Label Classification" (ICCV 2021) paper
[TNSRE 23] EEG Transformer 2.0. i. Convolutional Transformer for EEG Decoding. ii. Novel visualization - Class Activation Topography.
Tensorflow 2.x implementation of the DTLN real time speech denoising model. With TF-lite, ONNX and real-time audio processing support.
Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)
A Python library aimed at acousticians.