Starred repositories
Get your documents ready for gen AI
OCR model that handles complex tables, forms, handwriting with full layout.
GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning
Send a phone call from an AI agent in an API call. Or call the bot directly from the configured phone number!
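🤖 A DIY-friendly multimodal AI chatbot | 🚀 Quick integration with WeChat, QQ, Telegram, and other chat platforms | 🦈 Supports DeepSeek, Grok, Claude, Ollama, Gemini, OpenAI | Workflow system, web search, AI image generation, persona tuning, virtual maid, voice conversation |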
Official Code Repo for UniVA: Universal Video Agents
SkyReels V1: The first and most advanced open-source human-centric video foundation model
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
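The ZQuant quantitative analysis platform is a full-featured stock quantitative analysis system built on FastAPI. It provides data services, a backtesting engine, strategy management, and more, aiming to give quantitative analysts a one-stop solution spanning data collection, strategy development, backtest analysis, and result management.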
Step-by-step Jupyter notebook tutorials for ChatTTS
Officially recommended ChatTTS resource collection project, gathering related resources from across the web along with frequently asked questions
Algorithmic Trading in Python with Machine Learning
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
DeepAnalyze is the first agentic LLM for autonomous data science. 🎈 Your AI data analyst: automatically analyzes large volumes of data and generates professional analysis reports in one click!
Silero Models: pre-trained text-to-speech models made embarrassingly simple
TTS model capable of streaming conversational audio in realtime.
Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your personalized ChatTTS voice.
Instant voice cloning by MIT and MyShell. Audio foundation model.
Enhanced Supertonic TTS with Docker, FastAPI, Web UI, and comprehensive API documentation
Extension of ChatTTS, 3x Faster on Windows, Support Voice Cloning and Mobile Deployment
Added vLLM support to IndexTTS for faster inference.
High-quality speech synthesis with LoRA fine-tuning on index-tts, enhancing prosody and naturalness for single and multi-speaker voices.