Robust Speech Recognition via Large-Scale Weak Supervision

Python 96,484 11,917 Updated Dec 15, 2025

[ICASSP'26] Real-time streaming voice anonymization & voice conversion

Python 61 6 Updated Mar 16, 2026

The best ChatGPT that $100 can buy.

Python 50,053 6,558 Updated Mar 17, 2026

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,968 3,384 Updated Mar 23, 2026

OceanHackWeek - Tutorials

55 82 Updated Feb 12, 2026

A lightweight psychoacoustic bass enhancement plugin - in stereo where available!

Rust 196 7 Updated Dec 30, 2023

A lightweight, local-first, and 🆓 experiment tracking library from Hugging Face 🤗

Python 1,334 100 Updated Mar 23, 2026

PyTorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)

Python 472 39 Updated Sep 3, 2023

A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.

Python 137 11 Updated Sep 19, 2025

An opinionated docker container for a web-interface around the music organizer beets

TypeScript 363 22 Updated Mar 17, 2026

Beets plugin to manage external files

Python 130 26 Updated Mar 20, 2026

🎛 Stemgen is a Stem file generator. Convert any track into a Stem and have fun with Traktor.

Python 270 49 Updated Sep 1, 2025

Download Tidal tracks, videos, albums, playlists & artists! Tidal downloader that supports master quality.

Python 369 41 Updated Mar 8, 2026

Audio Dataset for training CLAP and other models

Python 732 59 Updated Jan 8, 2026

Official implementation of the paper MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction

Python 18 Updated Feb 19, 2026

Generative Model Evaluation Lab - An evaluation suite for your generative models.

Python 7 Updated Dec 13, 2024

an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM

Rust 33,486 3,114 Updated Mar 23, 2026

TorchCFM: a Conditional Flow Matching library

Python 2,372 194 Updated Nov 11, 2025

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 4,638 477 Updated Mar 3, 2026

Python 117 8 Updated Feb 26, 2026

Livecoding networked visuals in the browser

JavaScript 2,613 310 Updated Sep 14, 2025

Awesome list for vjing/visuals-related resources

341 14 Updated Feb 26, 2025

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.

Python 63 7 Updated Nov 18, 2021

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 2,062 164 Updated Apr 21, 2025

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 1,128 86 Updated Aug 14, 2025

Code, slides, and examples from my generative AI video course... taking you all the way from VAEs to near real-time Stable Diffusion with PyTorch and Hugging Face!

Jupyter Notebook 21 9 Updated Dec 19, 2024

Fine-tune Stable Audio Open with DiT ControlNet.

Python 249 10 Updated May 16, 2025

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and t…

Python 33 4 Updated Sep 4, 2022

Accurate and general beat tracker

Python 246 46 Updated Feb 27, 2026

Flexible LoRA Implementation to use with stable-audio-tools

Python 81 6 Updated Sep 9, 2024