cacard

👽

I may be slow to respond.

7 followers · 12 following

Beijing

Starred repositories

meituan-longcat / LongCat-Video

Python 4,361 692 Updated May 27, 2026

ZFTurbo / MSS_ONNX_TensorRT

Forked from DanilChernov1/VKR

Python 21 6 Updated Jun 17, 2025

netease-youdao / Confucius4-TTS

Confucius4-TTS: a Multilingual and Cross-Lingual Zero-Shot TTS Engine

Python 198 19 Updated Jun 17, 2026

santinic / audiblez

Generate audiobooks from e-books

Python 7,656 654 Updated Feb 27, 2026

Soul-AILab / SoulX-FlashTalk

SoulX-FlashTalk is the first 14B model to achieve sub-second start-up latency (0.87s) while maintaining a real-time throughput of 32 FPS on an 8xH800 node.

Python 1,349 127 Updated May 21, 2026

liuzhao1225 / YouDub-webui

Python 4,830 518 Updated Jun 17, 2026

Lynpoint / CyberVerse

Self-hosted, real-time digital human agent platform. Build voice-first AI agents with WebRTC, persona memory, tools, RAG, and optional digital-human video.

Python 1,230 172 Updated Jun 17, 2026

Soul-AILab / SoulX-LiveAct

Official inference code for SoulX-LiveAct: Towards Hour-Scale Real-Time Human Animation with Neighbor Forcing and ConvKV Memory

Python 1,121 106 Updated Jun 15, 2026

opendataloader-project / opendataloader-pdf

PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

Java 25,304 2,391 Updated Jun 18, 2026

asr-pub / index-tts-lora

High-quality speech synthesis with LoRA fine-tuning on index-tts, enhancing prosody and naturalness for single and multi-speaker voices.

Python 306 25 Updated Mar 12, 2026

wenet-e2e / WeTextProcessing

Text Normalization & Inverse Text Normalization

Python 784 111 Updated Jun 15, 2026

OpenMOSS / MOSS-TTS-Nano

MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed for realtime speech generation, can run direc…

Python 3,516 450 Updated Jun 2, 2026

OpenBMB / VoxCPM

VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

Python 30,642 3,456 Updated Jun 10, 2026

FunAudioLLM / FunCineForge

Python 432 34 Updated Mar 25, 2026

speechio / chinese_text_normalization

Chinese text normalization for speech processing

Python 732 151 Updated Mar 18, 2023

meituan-longcat / LongCat-AudioDiT

Python 525 47 Updated Apr 3, 2026

k2-fsa / OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

Python 7,576 1,187 Updated Jun 11, 2026

Tencent-Hunyuan / Hy-MT

Python 781 71 Updated Jun 1, 2026

nomadkaraoke / python-audio-separator

Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)

Python 1,248 188 Updated May 18, 2026