Stars
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Conversion between Traditional and Simplified Chinese
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
Run macOS VM in a Docker! Run near native OSX-KVM in Docker! X11 Forwarding! CI/CD for OS X Security Research! Docker mac Containers.
End-to-end realtime stack for connecting humans and AI
WebRTC and ORTC implementation for Python using asyncio
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
21 Lessons, Get Started Building with Generative AI
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Open-Sora: Democratizing Efficient Video Production for All
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Effortless data labeling with AI support from Segment Anything and other awesome models.
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Question and Answer based on Anything.
A Python package to build AI-powered real-time audio applications
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
PaddlePaddle Code Convert Toolkit: a deep learning code conversion tool for PaddlePaddle (飞桨)
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion