Starred repositories
speech self-supervised representations
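The world's best prompts (prompts with a combined valuation of over 30 billion). An overseas user, x1xh, successfully obtained the complete official system prompts and internal tools of v0, Manus, Cursor, Same.dev, and Lovable.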
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
Tracking the latest and greatest research papers on video generation.
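Automatic video generator: given a topic, it automatically generates a narrated video. The user enters a topic as text; the system calls a large language model to generate the story or narration text, then calls a speech synthesis API to generate the narration audio, calls a text-to-image API to generate illustrations that match the text, and finally merges the audio and illustrations into a narrated video.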
GoatWu / CausVid-Plus
Forked from tianweiy/CausVid
Unofficial extension implementation of CausVid
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
This is a repo to track the latest autoregressive visual generation papers.
A unified inference and post-training framework for accelerated video generation.
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
Pusa: Thousands Timesteps Video Diffusion Model
The ultimate training toolkit for finetuning diffusion models
A pipeline parallel training script for diffusion models.
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…
All my self-trained and released AI upscaling models. After gathering and applying over 600 different upscaling models, I learned how to train my own models, and these are the results.
This repository automatically updates a list of the top 100 repositories related to ComfyUI based on the number of stars on GitHub.
zero-shot voice conversion & singing voice conversion, with real-time support
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Making large AI models cheaper, faster and more accessible
Enjoy the magic of Diffusion models!
Simple ControlNet module for the CogVideoX model.
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
SkyReels V1: The first and most advanced open-source human-centric video foundation model
[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation
Exploring Applications of GRPO
Pippo: High-Resolution Multi-View Humans from a Single Image