pandamq

Mingqi Jiang pandamq

35 followers · 147 following

Lists (31)

Stars

jlcodes99 / cockpit-tools

🚀 Universal AI IDE account management tool: supports Antigravity / Codex / GitHub Copilot / Windsurf / Kiro / Cursor / Gemini-cli / CodeBuddy, with multi-account switching, quota monitoring, automatic wake-up, and multi-instance management.

Rust 11,280 1,023 Updated Jun 12, 2026

kaistmm / voxsim_trainer

[INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset

Python 23 Updated Sep 29, 2025

rednote-hilab / dots.tts

Python 492 34 Updated Jun 12, 2026

cwx-worst-one / WavTTS

WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling

Python 174 6 Updated Jun 6, 2026

ASLP-lab / FlashTTS

Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation

Python 36 1 Updated Jun 9, 2026

Soul-AILab / SoulX-Transcriber

An end-to-end framework for multi-speaker transcription that jointly models who spoke, when, and what.

Python 244 10 Updated Jun 4, 2026

sp-uhh / streamfm

Real-Time Streamable Generative Speech Restoration with Flow Matching

Python 41 6 Updated Jun 5, 2026

PoTaTo-Mika / Shore-TTS

Official implementation of the paper "Vocoder is not all you need".

Python 14 Updated Jun 5, 2026

keonlee9420 / Expressive-FastSpeech2

PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

Python 319 47 Updated Aug 25, 2021

WrBug / PolyHermes

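A powerful copy-trading system for the Polymarket prediction market, supporting automated copy trading, multi-account management, real-time order push notifications, and statistical analysis.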

Kotlin 196 44 Updated May 26, 2026

Gilgamesh-J / X-ASR

X-ASR is a series of automatic speech recognition models based on the icefall framework, focusing on streaming ASR and low-latency deployment.

Swift 115 11 Updated Jun 11, 2026

xiaomi-research / dasheng-audiogen

End-to-end text-to-audio scene generation model

38 1 Updated May 28, 2026

BigPizzaV3 / CodexPlusPlus

An enhancement tool for CodexApp that aims to make Codex easier and more comfortable to use.

Rust 17,970 1,125 Updated Jun 13, 2026

whwangovo / pyre-code

A self-hosted ML coding practice platform. 68 problems from ReLU to flow matching — attention, training, RLHF, diffusion, and more. Instant feedback in the browser.

Python 1,136 103 Updated May 12, 2026

a710128 / nanovllm-voxcpm

Python 247 49 Updated Jun 3, 2026

Stability-AI / stable-audio-3

Python 480 56 Updated Jun 9, 2026

Samarth-Tripathi / IEMOCAP-Emotion-Detection

Multi-modal Emotion detection from IEMOCAP on Speech, Text, Motion-Capture Data using Neural Nets.

Jupyter Notebook 174 72 Updated Dec 13, 2020

wyhsirius / LIA-X

LIA-X: Interpretable Latent Portrait Animator

Python 105 12 Updated Sep 17, 2025

zhenye234 / Talker-T2AV

Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling

Python 75 3 Updated May 24, 2026

xzf-thu / Mega-ASR

First foundation ASR built for the real world - 7 atomic acoustic conditions, 54 compound scenarios, 2.6M samples, and up to ~30% gains over SOTA where every other model falls apart. **You'll come …

Python 983 63 Updated Jun 2, 2026

facebookresearch / WavFlow

MultiModal Audio Generation in Raw Waveform Space.

Python 152 10 Updated May 26, 2026

ComposioHQ / awesome-claude-skills

A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows

Python 64,381 7,117 Updated May 22, 2026

ComposioHQ / awesome-codex-skills

A curated list of practical Codex skills for automating workflows across the Codex CLI and API.

Python 13,615 1,313 Updated May 15, 2026

anthropics / claude-for-legal

A suite of plugins for legal workflows

Python 8,248 1,522 Updated Jun 4, 2026

TCL606 / WAVE

ICLR 2026 Oral: WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM

Python 39 3 Updated Mar 16, 2026

modelscope / mcore-bridge

MCore-Bridge: Providing Megatron-Core model definitions for state-of-the-art large models and making Megatron training as simple as Transformers — with support for 300+ large language models (Qwen3…

Python 75 18 Updated Jun 11, 2026

tashQ / Q2D2

A geometry-aware audio codec leveraging two-dimensional quantization

Python 5 1 Updated May 15, 2026

yanghaha0908 / WavCube

Official code for "WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling"

Python 62 7 Updated May 13, 2026

Imbad0202 / academic-research-skills

Academic Research Skills for Claude Code: research → write → review → revise → finalize

Python 30,825 2,540 Updated Jun 13, 2026

ASLP-lab / FMSU

Towards Fine-Grained Multi-Dimensional Speech Understanding: Data Pipeline, Benchmark, and Model

Python 24 1 Updated May 21, 2026

Lists (31)

agent

asr

chatai

codec

crawl

dataset

diffusion

flow-tts

interesting

large language model

learn

llm

llm-multimodal

llm-tts

music

paper list

prompt

RL

speaker

speech understand

ssl

super_resolution

tools

train & inference

tts

tts-dadapipe

tts-eval

tts-postprocess

video

vocoder

judging