Stars
A feature-rich command-line audio/video downloader
Robust Speech Recognition via Large-Scale Weak Supervision
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Rembg is a tool to remove image backgrounds.
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
🚀🎬 ShortGPT - Experimental AI framework for YouTube Shorts / TikTok channel automation
Recognize CAPTCHAs with a CNN in TensorFlow. This project targets character-based image CAPTCHAs, implementing a convolutional neural network in TensorFlow to recognize them.
A Python-based tool for automatically generating PowerPoint presentations from markdown files using AI agents.