Stars
The largest collection of PyTorch image encoders / backbones, including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (V…
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Automatic headphone equalization from frequency responses
Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Python library for audio and music analysis
A PyTorch implementation of EfficientNet
The most powerful local music generation model that outperforms most commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.
🎮 ⌨ An easy-to-use tool to change the behaviour of your input devices.
HeartMuLa Official Repo: The Most Powerful Open-Source Music Generation Model of 2026
Speech-to-Text-WaveNet: End-to-end sentence-level English speech recognition based on DeepMind's WaveNet and TensorFlow
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
A collection of themes for kitty terminal 😻
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
Gin provides a lightweight configuration framework for Python
Python audio and music signal processing library
Pretrained EfficientNet, EfficientNet-Lite, MixNet, MobileNetV3 / V2, MNASNet A1 and B1, FBNet, Single-Path NAS
A Language Server Protocol implementation for Ruff.
[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
A lightweight, simple-to-use, RNN wake word listener
A tool for converting ONNX files to LiteRT/TFLite/TensorFlow, PyTorch native code (nn.Module), TorchScript (.pt), state_dict (.pt), Exported Program (.pt2), and Dynamo ONNX. It also supports direct…
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation