Eric Bezzam ebezzam

Audio ML @huggingface

105 followers · 1 following

Stars

harvard-edge / cs249r_book

Machine Learning Systems

JavaScript 23,576 2,830 Updated Apr 12, 2026

keith2018 / TinyTorch

A lightweight deep learning training framework implemented from scratch in C++, featuring a PyTorch-style API.

C++ 168 26 Updated Apr 4, 2026

Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.

Python 2,376 231 Updated Jan 30, 2026

huggingface / hf-mount

Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.

Rust 668 38 Updated Apr 3, 2026

MatthewCYM / VoiceBench

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 353 23 Updated Mar 21, 2026

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,283 110 Updated Mar 2, 2025

boson-ai / EmergentTTS-Eval-public

[NeurIPS '25] Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.

Python 207 15 Updated Dec 9, 2025

Blinorot / ALARM

Official Implementation of "ALARM: Audio–Language Alignment for Reasoning Models"

Python 9 1 Updated Mar 14, 2026

Deep-unlearning / smol-audio

Jupyter Notebook 2 Updated Mar 11, 2026

andimarafioti / nano-parakeet

Pure-PyTorch Parakeet TDT inference

Python 36 6 Updated Mar 10, 2026

HumeAI / tada

Open Source Speech Language Model

Jupyter Notebook 959 100 Updated Mar 24, 2026

scienceetonnante / eleusis-llm-benchmark

Benchmarking Large Language Models using the Eleusis card game

Python 14 3 Updated Feb 16, 2026

andimarafioti / faster-qwen3-tts

Real-time text-to-speech with Qwen3-TTS

Python 852 120 Updated Mar 27, 2026

ace-step / ACE-Step-1.5

The most powerful local music generation model that outperforms almost all commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.

Python 9,067 1,041 Updated Apr 13, 2026

nvidia-riva / python-clients

Riva Python client API and CLI utils

Python 124 49 Updated Mar 25, 2026

ekwek1 / soprano

Soprano: Instant, Ultra-Realistic Text-to-Speech

Python 1,218 108 Updated Jan 15, 2026

LCAV / pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Python 1,833 479 Updated Apr 6, 2026

resemble-ai / chatterbox

SoTA open-source TTS

Python 24,269 3,230 Updated Mar 26, 2026

NVIDIA-NeMo / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 17,073 3,400 Updated Apr 12, 2026