Starred repositories
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
A mobile client that makes it easy to use Google Gemini Pro 2.0 for free
A curated list of awesome remote jobs and resources. Inspired by https://github.com/vinta/awesome-python
✨ AsrTools: Smart Voice-to-Text Tool | Efficient Batch Processing | User-Friendly Interface | No GPU Required | Supports SRT/TXT Output | Turn your audio into accurate text in an instant!
Detect songs from the microphone or Spotify and display their lyrics in real time, synced to the song.
Lua scripts for importing .krc .qrc .lrc to Aegisub
FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀
🚀 The best real-time interactive AI avatar (digital human) with on-premise deployment and <1.5 s latency.
Unofficial Implementation of Animate Anyone by Novita AI
[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
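Generate high-definition short videos with one click using AI large language models.
FFmpegCommand is an FFmpeg command library for Android for fast audio and video processing. Features include audio/video trimming, audio/video transcoding, decoding audio/video to raw data, audio/video encoding, video to image or GIF conversion, video watermarking, multi-picture stitching, audio mixing, video brightness and contrast adjustment, and audio fade-in and fade-out effects.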
Shell scripts to create video slideshows using images and videos
A system of bots that collects clips automatically via custom made filters, lets you easily browse these clips, and puts them together into a compilation video ready to be uploaded straight to any …
Robust Speech Recognition via Large-Scale Weak Supervision
Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)
Your most handy video processing software
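Legado 3.0 Book Reader with powerful controls & full functions ❤️ Legado 3.0 is a tool for reading web content from customizable sources, giving web-literature enthusiasts a convenient, fast, and comfortable reading experience.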
Python Backtesting library for trading strategies
A simple TradingView Telegram Bot based on CHART-IMG API version 1 & 2
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation