deyituo

Stars

titanwings / colleague-skill

Transforming cold farewells into warm skills? It's giving rebirth era. Welcome to Digital Life 1.0. 🫶

Python 13,749 1,289 Updated Apr 13, 2026

k2-fsa / OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

Python 3,227 490 Updated Apr 13, 2026

ajd12342 / paraspeechclap

Codebase for 'ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining'

Python 12 1 Updated Apr 6, 2026

ultraworkers / claw-code

The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.

Rust 183,352 107,824 Updated Apr 13, 2026

arjunchandra2 / TRACE

Automatic evaluation of speech-to-speech models via TRACE.

Python 6 Updated Feb 10, 2026

meituan-longcat / LongCat-AudioDiT

Python 426 39 Updated Apr 3, 2026

XiaomiMiMo / MiMo-Audio

MiMo-Audio: Audio Language Models are Few-Shot Learners

Python 1,017 103 Updated Mar 3, 2026

haidog-yaqub / DiffPitcher

Diffusion-based singing voice pitch correction

Python 140 21 Updated Sep 20, 2024

Tencent / Covo-Audio

Covo-Audio is a 7B-parameter end-to-end large audio language model that directly processes continuous audio inputs and generates audio outputs within a single unified architecture.

Python 137 14 Updated Mar 17, 2026

Soul-AILab / SoulX-Duplug

Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.

Python 178 16 Updated Mar 20, 2026

public-clis / bilibili-cli

A CLI for Bilibili — browse videos, users, search, and feeds from the terminal

Python 663 70 Updated Mar 14, 2026

jackwener / xiaohongshu-cli

A CLI for Xiaohongshu (小红书) — search, read, interact via reverse-engineered API

Python 1,629 165 Updated Mar 21, 2026

Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.

Python 2,391 234 Updated Jan 30, 2026

TaoRuijie / TalkNet-ASD

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Python 464 100 Updated Oct 23, 2023

FunAudioLLM / FunResearch

This repository is maintained by the Speech Team at Alibaba’s Tongyi Lab, serving as an open-source platform for our cutting-edge research in speech, audio, NLP technologies. We believe in accelera…

Python 32 4 Updated Apr 12, 2026

FunAudioLLM / FunCineForge

Python 376 29 Updated Mar 25, 2026

ASLP-lab / OSUM-Pangu

An Open-Source Multidimension Speech Understanding Foundation Model Built upon OpenPangu on Ascend NPUs

Python 29 Updated Mar 15, 2026

jingzhunxue / FlowMirror_HydraVox

FlowMirror-HydraVox — A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens per step for faster, high-quality speech synthesis, featuri…

Python 50 4 Updated Feb 17, 2026

haidog-yaqub / MeanFlow

Unofficial PyTorch implementation of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.

Python 1,115 64 Updated Dec 17, 2025

anthropics / claude-code

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 113,463 18,968 Updated Apr 13, 2026

Master-PLC / DistDF

Official implementation of "DistDF: Time-Series Forecasting Needs Joint-Distribution Wasserstein Alignment"

Python 15 Updated Sep 29, 2025

inclusionAI / Ming-omni-tts

Ming-omni-tts: Simple and Efficient Unified Generation of Speech, Music, and Sound with Precise Control

Python 220 16 Updated Feb 26, 2026

inclusionAI / Ming

Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.

Jupyter Notebook 648 57 Updated Mar 17, 2026

ShandaAI / Hive

A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation

Python 236 26 Updated Mar 9, 2026

Woodygan / AudioJudge

Jupyter Notebook 8 Updated Sep 8, 2025

GiantAILab / YingMusic-SVC

Official implementation of YingMusic-SVC.

Python 127 12 Updated Dec 29, 2025

yfyeung / CLSP

[ACL 2026 Main] Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training

71 2 Updated Apr 6, 2026

QwenLM / Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 10,633 1,379 Updated Mar 17, 2026

FlashLabs-AI-Corp / FlashLabs-Chroma

World's first open-source real-time end-to-end spoken dialogue model with personalized voice cloning.

Jupyter Notebook 530 59 Updated Jan 28, 2026

lucidrains / x-transformers

A concise but complete full-attention transformer with a set of promising experimental features from various papers

Python 5,823 507 Updated Apr 13, 2026
