Python 4 Updated Sep 15, 2024

Automatic Depression Detection: a GRU/BiLSTM-based Model and an Emotional Audio-Textual Corpus

Python 215 36 Updated Jul 10, 2023

We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through contextual perception and Chain of Thought (CoT).

Python 49 3 Updated Mar 3, 2025

M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. ACL 2022

124 7 Updated Sep 24, 2022

Sparse Adapter Fusion for Continual Learning in NLP - EACL 2026

Python 15 Updated Apr 9, 2026

ActorMind: Emulating Human Actor Reasoning for Speech Role-Playing - ACL Findings 2026

25 1 Updated Jun 4, 2026

Prototype Conditioned Generative Replay for Continual Learning in NLP - NAACL 2025

Python 26 Updated Apr 9, 2026

A curated list of papers and resources based on the survey "Agentic Reasoning for Large Language Models"

1,279 100 Updated Mar 9, 2026

The agent engineering platform.

Python 139,693 23,162 Updated Jun 19, 2026

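A Chinese mental health support question-answering dataset with rich annotations of support strategies. It can be used to generate long counseling texts that are rich in support strategies.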

267 21 Updated Jul 21, 2024

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis

Python 151 18 Updated Jan 1, 2025

A repository of implementations of neural speech editing algorithms.

Python 207 21 Updated Jan 9, 2024

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 9,366 786 Updated Mar 26, 2026

A feature-rich command-line audio/video downloader

Python 171,648 14,470 Updated Jun 18, 2026

PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models

1,145 96 Updated Dec 15, 2025

Dialogue model that produces empathetic responses when trained on the EmpatheticDialogues dataset.

Python 553 69 Updated Dec 3, 2021

Simple text to phones converter for multiple languages

Python 1,555 198 Updated Sep 26, 2024

g2p: English Grapheme To Phoneme Conversion

Python 924 134 Updated Jan 5, 2023

Vector (and Scalar) Quantization, in Pytorch

Python 3,964 331 Updated Jun 5, 2026

Uni-MoE: Lychee's Large Multimodal Model Family.

Python 1,110 69 Updated Dec 22, 2025

🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Python 93 13 Updated Dec 3, 2024

CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech

Jupyter Notebook 368 41 Updated Aug 14, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 14,780 2,158 Updated May 18, 2026

GitHub repository for the ACL 2025 paper: Recent Advances in Speech Language Models: A Survey.

209 10 Updated Jun 17, 2025

🎛 🔊 A Python library for audio.

C++ 6,171 339 Updated May 21, 2026

[ACL 2025 Main] ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec

Python 274 14 Updated Nov 22, 2024

SOTA Open Source TTS

Python 30,867 2,637 Updated Jun 9, 2026

MiniMax-M2, a model built for Max coding & agentic workflows.

2,599 215 Updated Nov 13, 2025