atuxhe

37 followers · 86 following

Lists (30)

Acoustic Echo Cancellation

15 repositories

audio effect

41 repositories

Audio Synthesis

BabyCry Det

corpus

10 repositories

database

73 repositories

Hearing Aid

16 repositories

HIFI-DSP

10 repositories

KWS

112 repositories

LLM

868 repositories

machine translation

13 repositories

mic array

Music & Song AGI

13 repositories

Music Source Separation

25 repositories

neural audio codec

200 repositories

NLP

59 repositories

NN algorithm

205 repositories

Pronunciation Assessment

10 repositories

SED

27 repositories

Sound Source Localization

Sound Source Separation

15 repositories

Speaker Diarization

104 repositories

Speaker ID

41 repositories

Speech Enhancement

608 repositories

Speech Recognition

585 repositories

Spoken Language ID

Spoken Language Understanding

Text-to-Speech

758 repositories

TinyML

53 repositories

voice agent

151 repositories

Starred repositories

xiaomi-research / dasheng-audiogen

End-to-end text-to-audio scene generation model

42 1 Updated Jun 16, 2026

junxi25liu / TinyAudio

Parameter-efficient text-to-audio generation for edge and low-memory deployment.

13 Updated May 29, 2026

yizhuoyang / NeuralMUSIC

Jupyter Notebook 1 Updated Mar 5, 2026

hanshounsu / sdpcodec-open

Sdpcodec (Interspeech 2026) source code

Python 3 1 Updated Jun 18, 2026

thunlp / OPD

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Python 684 43 Updated May 30, 2026

AutoArk / open-audio-opd

Industrial audio online policy distillation (OPD) training stack for ASR and TTS, distilling compact audio models from stronger teacher models.

Python 197 14 Updated Jun 5, 2026

alphacep / GigaAM

Forked from salute-developers/GigaAM

Foundational Model for Speech Recognition Tasks

Python 1 Updated Jun 17, 2026

TigreGotico / tugaphone

TugaPhone is a Python library that phonemizes arbitrary Portuguese text across major Lusophone dialects (pt-PT, pt-BR, pt-AO, pt-MZ, pt-TL). It uses a curated phonetic lexicon plus a rule-based fal…

Python 3 Updated Jun 13, 2026

TigreGotico / phonematcher

A Python library for phonetic fuzzy searching and segment-to-segment distance computation. It allows you to find words that "sound like" a query by analyzing International Phonetic Alphabet (IPA) f…

Python 8 2 Updated May 30, 2026

TigreGotico / ALIGN

A browser-based tool for aligning audio with text transcriptions and IPA (International Phonetic Alphabet) at word, grapheme, and sentence level. Runs entirely client-side, no server, no dependenci…

JavaScript 7 Updated Jun 13, 2026

TigreGotico / vadonnx

Python 2 Updated Jun 16, 2026

sanghyang00 / ur-bert

Official implementation of the Interspeech 2026 paper: UR-BERT: Scaling Text Encoders for Massively Multilingual TTS Through Universal Romanization and Speech Token Prediction

Python 4 1 Updated Jun 17, 2026

BUTSpeechFIT / Dixtral

Python 6 1 Updated Jun 16, 2026

lihenryhfl / basis

Python 1 Updated Jan 8, 2026

dzq84 / meantok

Python 6 Updated Jun 7, 2026

Etherll / Timbre

Extract a target speaker’s clean, non-overlapped speech from multi-speaker audio and export word-safe LJSpeech-style TTS datasets.

Python 18 2 Updated Jun 14, 2026

maats0519 / maats_mqm

Python 4 1 Updated May 20, 2025

MaikeZuefle / f-actor

Python 27 1 Updated Feb 27, 2026

vectominist / usad

Official implementation of "USAD: Universal Speech and Audio Representation via Distillation"

Python 10 1 Updated Jun 7, 2026

Zyphra / ZONOS2

Zonos2 is a leading open-weight text-to-speech MoE.

Python 214 24 Updated Jun 16, 2026

flowtyone / QwenASR-he

Python 3 Updated Jun 14, 2026

KoelLabs / ML

Koel Labs innovates open-source speech research, inclusive speech technologies, and real-time pronunciation feedback for language learners! This repo contains the ML training, evaluation, and data …

Jupyter Notebook 21 6 Updated Jun 17, 2026

salman-ha / MambAdapter

Codebase for the Interspeech 2026 Paper: MambAdapter: Lightweight Mamba-Based Adapters for Parameter-Efficient Transfer Learning in Speech and Audio

Python 3 Updated Jun 10, 2026

Aratako / MioTTS-Inference

Inference server for MioTTS, a lightweight and fast LLM-based TTS model.

Python 144 21 Updated Feb 14, 2026

bloodraven66 / EndpointAnticipation

Python 5 Updated Jun 15, 2026

NoizAI / AudioX-Turbo

🚀 Fastest Anything-to-Audio Gen for conditioned sound and music creation.

Python 61 15 Updated Jun 16, 2026

BayLing-Models / BayLing-Duplex

Native full-duplex speech dialogue inference for BayLing-Duplex.

Python 14 2 Updated Jun 17, 2026

soundai2016 / pimm_nasdaq_earningscall

Python 2 2 Updated Sep 25, 2025

soundai2016 / EmoAlignBench

Python 1 Updated Jun 5, 2026

soundai2016 / survey_acoustic_world_models

3 1 Updated Jun 16, 2025

Starred topics

Natural language processing

Deep learning