Organizations

@100steps


A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)

Python 38 3 Updated Mar 31, 2026

RealSI: Open Benchmark for Simultaneous Interpretation in Real-world Scenarios

Python 79 8 Updated Jul 4, 2025

[AAAI 2026] Playmate2: Training-Free Multi-Character Audio-Driven Animation via Diffusion Transformer with Reward Feedback

Python 298 28 Updated Nov 21, 2025

EtrajEval: Official framework for emotional support evaluation in language models, from the paper "Detecting Emotional Dynamic Trajectories: An Evaluation Framework for Emotional Support in Languag…

Python 16 6 Updated Nov 14, 2025

LongCat Audio Tokenizer and Detokenizer

Python 299 23 Updated Apr 15, 2026

MiMo-Audio: Audio Language Models are Few-Shot Learners

Python 1,019 102 Updated Mar 3, 2026

Text Normalization & Inverse Text Normalization

Python 752 101 Updated Feb 27, 2026

OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU.

Python 486 31 Updated Nov 23, 2025

Pseudo Streaming SenseVoice with Hotwords

Python 444 52 Updated Mar 13, 2025

Official code for "EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting"

Python 115 12 Updated Oct 16, 2025

Multilingual Voice Understanding Model

Python 7,986 731 Updated Dec 30, 2025

An evaluation benchmark for emotional-companionship dialogue systems, designed on the PQAEF framework (https://github.com/QuwanAI/PQAEF)

Python 41 14 Updated Sep 1, 2025

A toolkit for processing speech data and creating speech datasets

Python 207 43 Updated Mar 29, 2026

FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.

Python 247 25 Updated Feb 25, 2026

CosyVoice_DPO_NOTES: Supercharge Your CosyVoice Model with Cutting-Edge DPO Fine-Tuning!

Python 124 19 Updated Aug 8, 2025

[AAAI 2026] EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation

Python 871 102 Updated Mar 18, 2026

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,393 101 Updated Mar 16, 2026

Text-audio foundation model from Boson AI

Python 8,020 618 Updated Jan 18, 2026

MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enablin…

Python 1,296 125 Updated Mar 23, 2026

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 20,073 2,468 Updated Mar 16, 2026

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 1,206 145 Updated Sep 5, 2024

Variational Autoencoder (VAE) with Normalizing Flows

Python 72 8 Updated Oct 10, 2024

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 27,121 1,974 Updated Jan 9, 2026

No fortress, purely open ground. OpenManus is Coming.

Python 55,798 9,736 Updated Feb 11, 2026

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 4,568 345 Updated Jun 21, 2025

Accelerating CosyVoice2 inference with vLLM

Jupyter Notebook 491 64 Updated Apr 26, 2025

A native-PyTorch library for large scale M-LLM (text/audio) training with tp/cp/dp.

Python 230 30 Updated Apr 8, 2026

Genshin Datasets For SVC/SVS/TTS

721 40 Updated Jan 11, 2026

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,767 808 Updated Mar 25, 2026

🚀 Truly open-source AI avatar (digital human) toolkit for offline video generation and digital human cloning.

C 12,767 2,111 Updated Apr 17, 2026