Highlights

  • Pro

Organizations

@LCAV

Machine Learning Systems

JavaScript 23,576 2,830 Updated Apr 12, 2026

A lightweight deep learning training framework implemented from scratch in C++, featuring a PyTorch-style API.

C++ 168 26 Updated Apr 4, 2026

Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.

Python 2,376 231 Updated Jan 30, 2026

Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.

Rust 668 38 Updated Apr 3, 2026

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 353 23 Updated Mar 21, 2026

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,283 110 Updated Mar 2, 2025

[NeurIPS '25] Benchmark for evaluating TTS models on complex prosodic, expressive, and linguistic challenges.

Python 207 15 Updated Dec 9, 2025

Official Implementation of "ALARM: Audio–Language Alignment for Reasoning Models"

Python 9 1 Updated Mar 14, 2026
Jupyter Notebook 2 Updated Mar 11, 2026

Pure-PyTorch Parakeet TDT inference

Python 36 6 Updated Mar 10, 2026

Open Source Speech Language Model

Jupyter Notebook 959 100 Updated Mar 24, 2026

Benchmarking Large Language Models using the Eleusis card game

Python 14 3 Updated Feb 16, 2026

Real-time text-to-speech with Qwen3-TTS

Python 852 120 Updated Mar 27, 2026

The most powerful local music generation model that outperforms almost all commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.

Python 9,067 1,041 Updated Apr 13, 2026

Riva Python client API and CLI utils

Python 124 49 Updated Mar 25, 2026

Soprano: Instant, Ultra-Realistic Text-to-Speech

Python 1,218 108 Updated Jan 15, 2026

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Python 1,833 479 Updated Apr 6, 2026

SoTA open-source TTS

Python 24,269 3,230 Updated Mar 26, 2026

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 17,073 3,400 Updated Apr 12, 2026

The Hugging Face Course on Transformers for Audio

MDX 490 151 Updated Apr 8, 2026

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,759 247 Updated Dec 30, 2025
Python 18 1 Updated Nov 19, 2025

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 3,280 426 Updated Dec 11, 2025
Jupyter Notebook 191 12 Updated Nov 3, 2025

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 33,309 6,917 Updated Apr 12, 2026

On-device TTS model by Neuphonic

Python 5,143 563 Updated Mar 23, 2026

Official repo for MMAU-Pro Benchmark

Python 20 1 Updated Sep 25, 2025