MOSS-Audio is an open-source foundation model for unified audio understanding, enabling speech, sound, music, captioning, QA, and reasoning in real-world scenarios.

Python 574 40 Updated Jun 2, 2026

GGML-based pure C++ inference for BS Roformer/Mel-Band-Roformer vocal separation

C++ 23 4 Updated Jun 16, 2026

Android FFmpeg video editing in the style of Jianying (CapCut), with a preview bar and fast frame extraction

C 261 57 Updated Apr 30, 2023

MultiModal Audio Generation in Raw Waveform Space.

Python 3 Updated May 25, 2026

The official implementation of WaveNet-VNNs for Active Noise Control (ANC), a fully causal solution.

Python 19 6 Updated Apr 25, 2025

A curated list of models, benchmarks, tools and guides for audio editing

21 3 Updated Jun 9, 2026

[ICLR 2026] SmartDJ: declarative audio editing with an audio language model.

Python 66 2 Updated Apr 25, 2026

Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoregressive.

Python 8,599 784 Updated Jun 9, 2026

The source code for CineSRD and the SubtitleSD benchmark is provided in this repository.

Python 4 1 Updated Mar 15, 2026

An Open-Source Project to Unify Audio Processing and Generation

Python 395 27 Updated May 7, 2026

Rebuild of GTCRN using Grouped TCNs, amidst other changes. Initially an attempt to target MCU deployment.

Python 26 6 Updated Jan 12, 2026

The awesome collection of OpenClaw skills. 5,400+ skills filtered and categorized from the official OpenClaw Skills Registry.🦞

50,331 4,902 Updated Jun 16, 2026

PASE: Phonologically Anchored Speech Enhancer

Python 67 9 Updated Jun 17, 2026

This is the PyTorch implementation of the Universal Source Separation with Weakly labelled Data.

Python 366 21 Updated Sep 1, 2023

This project is derived from zipEnhancer, Alibaba's open-source speech denoising model

Python 41 8 Updated May 8, 2026

Speed-optimized streaming neural speech enhancement network

Python 130 33 Updated Jun 7, 2026
Python 5 Updated Apr 28, 2024

This is the official implementation of the LiSenNet

Python 160 20 Updated Nov 15, 2024

The official repo of UL-UNAS, an ultra-lightweight SE model.

Python 176 28 Updated Jun 17, 2026

PyTorch-based room impulse response (RIR) simulation toolkit with dynamic scenes, GPU acceleration.

Python 21 Updated Feb 18, 2026

The official implementation of GTCRN, an ultra-lightweight SE model.

Python 670 111 Updated Jan 18, 2026

A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation

Python 244 25 Updated Mar 9, 2026
Jupyter Notebook 13 1 Updated Aug 13, 2023

Codebase for the paper "Visually Informed Binaural Audio Generation without Binaural Audios" (CVPR 2021)

Python 72 12 Updated Jul 8, 2021

A Python Library for Full Reference Binaural Fidelity Testing, Visualization & Feature Generation

Python 30 4 Updated Oct 30, 2025

Spatial Audio Python Package

Python 194 19 Updated May 11, 2025

Official page of "DeepASA: An Object-Oriented Multi-Purpose Network for Auditory Scene Analysis"

Python 25 3 Updated Apr 15, 2026