Stars
[SiggraphAsia25] OmnimatteZero: Fast Training-free Omnimatte with Pre-trained Video Diffusion Models
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
Official implementation of "VideoMaMa: Mask-Guided Video Matting via Generative Prior", CVPR 2026
High-quality, training-free inpainting for every Stable Diffusion model. Supports ComfyUI
A ComfyUI node pack that implements FreeLong (NeurIPS 2024) spectral blending for Wan 2.2 video generation
[CVPR 2026] PersonaLive! : Expressive Portrait Image Animation for Live Streaming
Native and Compact Structured Latents for 3D Generation
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
PainterFLF2V boosts first-last frame motion with inverse structural repulsion. PainterFLF2V: Dynamically enhances the original first-last frame node, allowing you to customize the dynamic enhanceme…
Real-Time Diffusion-Based Streaming Video Super-Resolution / A real-time streaming video super-resolution model based on the diffusion architecture
Multilingual Document Layout Parsing in a Single Vision-Language Model
MatteoKartoon / BiRefNet
Forked from ZhengPeng7/BiRefNet
ToonOut, a fork of BiRefNet focused on background removal for anime images. We open-source our dataset & our weights. See our paper at: https://arxiv.org/abs/2509.06839
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
leililei / HoYo-Spine
Forked from c2t-r/HoYo-Spine
Collections of web event spines made by HoYo
[CVPR'26] ObjectClear: Precise Object and Effect Removal with Adaptive Target-Aware Attention
tamilpp25 / Iridium-SR
Forked from Akka0/Iridium-NG
A KCP packet sniffer + visualizer in one, backend rewritten in Go. (For SR)
OmniGen2: Exploration to Advanced Multimodal Generation. https://arxiv.org/abs/2506.18871
A consolidated collection of 287 Python scripts from the TCAX forum, making it easy to download them and create effect subtitles with TCAX.
Asset extraction tool for Unity games, supports GI 6.0+, HSR 3.6+, ZZZ 2.2+ and many more (*゚∀゚*)
aelurum / AssetStudio
Forked from Perfare/AssetStudio
AssetStudioMod - modified version of Perfare's AssetStudio, mainly focused on UI optimization and some functionality enhancements.
MoviiGen 1.1: Towards Cinematic-Quality Video Generative Models
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Ultralytics YOLO with Additional Knowledge Distillation Capability
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run!
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer