
Organizations

@SparkAudio


🚀 Universal AI IDE account manager: supports Antigravity / Codex / GitHub Copilot / Windsurf / Kiro / Cursor / Gemini-cli / CodeBuddy, with multi-account switching, quota monitoring, automatic wake-up, and multi-instance management.

Rust 11,280 1,023 Updated Jun 12, 2026

[INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset

Python 23 Updated Sep 29, 2025
Python 492 34 Updated Jun 12, 2026

WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling

Python 174 6 Updated Jun 6, 2026

Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation

Python 36 1 Updated Jun 9, 2026

An end-to-end framework for multi-speaker transcription that jointly models who spoke, when, and what.

Python 244 10 Updated Jun 4, 2026

Real-Time Streamable Generative Speech Restoration with Flow Matching

Python 41 6 Updated Jun 5, 2026

Official implementation of paper "Vocoder is not all you need".

Python 14 Updated Jun 5, 2026

PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

Python 319 47 Updated Aug 25, 2021

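A powerful copy-trading system for the Polymarket prediction market, supporting automated copy trading, multi-account management, real-time order push notifications, and statistical analysis.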

Kotlin 196 44 Updated May 26, 2026

X-ASR is a series of automatic speech recognition models based on the icefall framework, focusing on streaming ASR and low-latency deployment.

Swift 115 11 Updated Jun 11, 2026

End-to-end text-to-audio scene generation model

38 1 Updated May 28, 2026

An enhancement tool for CodexApp that aims to make Codex easier and more comfortable to use.

Rust 17,970 1,125 Updated Jun 13, 2026

A self-hosted ML coding practice platform. 68 problems from ReLU to flow matching — attention, training, RLHF, diffusion, and more. Instant feedback in the browser.

Python 1,136 103 Updated May 12, 2026
Python 247 49 Updated Jun 3, 2026

Multi-modal Emotion detection from IEMOCAP on Speech, Text, Motion-Capture Data using Neural Nets.

Jupyter Notebook 174 72 Updated Dec 13, 2020

LIA-X: Interpretable Latent Portrait Animator

Python 105 12 Updated Sep 17, 2025

Talker-T2AV Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling

Python 75 3 Updated May 24, 2026

First foundation ASR built for the real world - 7 atomic acoustic conditions, 54 compound scenarios, 2.6M samples, and up to ~30% gains over SOTA where every other model falls apart. **You'll come …

Python 983 63 Updated Jun 2, 2026

MultiModal Audio Generation in Raw Waveform Space.

Python 152 10 Updated May 26, 2026

A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows

Python 64,381 7,117 Updated May 22, 2026

A curated list of practical Codex skills for automating workflows across the Codex CLI and API.

Python 13,615 1,313 Updated May 15, 2026

A suite of plugins for legal workflows

Python 8,248 1,522 Updated Jun 4, 2026

ICLR 2026 Oral: WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM

Python 39 3 Updated Mar 16, 2026

MCore-Bridge: Providing Megatron-Core model definitions for state-of-the-art large models and making Megatron training as simple as Transformers — with support for 300+ large language models (Qwen3…

Python 75 18 Updated Jun 11, 2026

A geometry-aware audio codec leveraging two-dimensional quantization

Python 5 1 Updated May 15, 2026

Official code for "WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling"

Python 62 7 Updated May 13, 2026

Academic Research Skills for Claude Code: research → write → review → revise → finalize

Python 30,825 2,540 Updated Jun 13, 2026

Towards Fine-Grained Multi-Dimensional Speech Understanding: Data Pipeline, Benchmark, and Model

Python 24 1 Updated May 21, 2026