ebezzam

Highlights

  • Pro

Organizations

@LCAV


Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.

Rust 601 33 Updated Mar 31, 2026

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 345 23 Updated Mar 21, 2026

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,282 110 Updated Mar 2, 2025

[NeurIPS '25] Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.

Python 204 15 Updated Dec 9, 2025

Official Implementation of "ALARM: Audio–Language Alignment for Reasoning Models"

Python 8 1 Updated Mar 14, 2026
Jupyter Notebook 2 Updated Mar 11, 2026

Pure-PyTorch Parakeet TDT inference

Python 33 6 Updated Mar 10, 2026

Open Source Speech Language Model

Jupyter Notebook 936 99 Updated Mar 24, 2026

Benchmarking Large Language Models using the Eleusis card game

Python 14 3 Updated Feb 16, 2026

Real-time text-to-speech with Qwen3-TTS

Python 795 110 Updated Mar 27, 2026

The most powerful local music generation model that outperforms most commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.

Python 8,358 953 Updated Mar 31, 2026

Riva Python client API and CLI utils

Python 123 48 Updated Mar 25, 2026

Soprano: Instant, Ultra-Realistic Text-to-Speech

Python 1,211 107 Updated Jan 15, 2026

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Python 1,823 479 Updated Mar 16, 2026

SoTA open-source TTS

Python 24,078 3,191 Updated Mar 26, 2026

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 17,014 3,394 Updated Mar 31, 2026

The Hugging Face Course on Transformers for Audio

MDX 490 150 Updated Jan 16, 2026

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,743 245 Updated Dec 30, 2025
Python 18 1 Updated Nov 19, 2025

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 3,265 425 Updated Dec 11, 2025
Jupyter Notebook 188 12 Updated Nov 3, 2025

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 33,226 6,887 Updated Mar 31, 2026

On-device TTS model by Neuphonic

Python 5,098 560 Updated Mar 23, 2026

Official repo for MMAU-Pro Benchmark

Python 19 1 Updated Sep 25, 2025

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python 934 48 Updated Jun 3, 2025

Whisper realtime streaming for long speech-to-text transcription and translation

Python 3,582 413 Updated Nov 12, 2025

A fast multimodal LLM for real-time voice

Python 4,385 369 Updated Dec 12, 2025