Stars
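🎬 卡卡字幕助手 | VideoCaptioner - An LLM-based intelligent subtitle assistant that handles the full workflow of video subtitle generation, sentence segmentation, correction, and subtitle translation. An LLM-powered tool for easy and efficient video subtitling.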
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Real-time face swap and one-click video deepfake with only a single image
bigfootcn / WeChatMsg-
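Forked from LC044/WeChatMsg. Extracts WeChat chat history and exports it to HTML, Word, and Excel documents for permanent storage; analyzes the chat history to generate an annual chat report; and uses the chat data to train a personal AI chat assistant.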
An open-source AI content search engine designed specifically for content creators. Supports extraction of text, images, and short videos. Allows full local deployment (web app, RAG server, LLM ser…
A generative speech model for daily dialogue.
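🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls as well as online batch parsing and downloading.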
Incredibly fast Whisper-large-v3
Curated list of project-based tutorials
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Translate the video from one language to another and embed dubbing & subtitles.
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Official Code for DragGAN (SIGGRAPH 2023)
A GUI tool for generating subtitles from video and audio and exporting them as SRT files. No third-party API application is required; audio-to-text conversion runs locally. A Transformer-based video subtitle generation framework.
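A visual no-code/code-free web crawler/spider. 易采集: a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Alias: ServiceWrapper, an intelligent service wrapping system for web applications.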
🤖️ Cross-platform AI language practice app
Ecoute is a live transcription tool that provides real-time transcripts for both the user's microphone input (You) and the user's speakers output (Speaker) in a textbox.
🔉 Youtube Videos Transcription with OpenAI's Whisper
serp-ai / bark-with-voice-clone
Forked from suno-ai/bark. 🔊 Text-prompted Generative Audio Model - With the ability to clone voices
so-vits-svc fork with realtime support, improved interface and more features.
SoftVC VITS Singing Voice Conversion
Repo for BenCao (本草) [original name: HuaTuo (华驼)], Instruction-tuning Large Language Models with Chinese Medical Knowledge.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
An all in one solution for adding Temporal Stability to a Stable Diffusion Render via an automatic1111 extension