Stars
PersonaLive! : Expressive Portrait Image Animation for Live Streaming
Native and Compact Structured Latents for 3D Generation
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
PainterFLF2V boosts first-last frame motion with inverse structural repulsion. PainterFLF2V: Dynamically enhances the original first-last frame node, allowing you to customize the dynamic enhanceme…
Real-Time Diffusion-Based Streaming Video Super-Resolution / A real-time streaming video super-resolution model based on the diffusion architecture
Multilingual Document Layout Parsing in a Single Vision-Language Model
MatteoKartoon / BiRefNet
Forked from ZhengPeng7/BiRefNet
ToonOut, a fork of BiRefNet focused on background removal for anime images. We open-source our dataset & our weights. See our paper at: https://arxiv.org/abs/2509.06839
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
leililei / HoYo-Spine
Forked from c2t-r/HoYo-Spine
Collections of web event spines made by HoYo
ObjectClear: Complete Object Removal via Object-Effect Attention
tamilpp25 / Iridium-SR
Forked from Akka0/Iridium-NG
A KCP packet sniffer + visualizer in one, backend rewritten in Go. (For SR)
OmniGen2: Exploration to Advanced Multimodal Generation.
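A consolidated collection of 287 py scripts from the TCAX forum, so that everyone can easily download them and use TCAX to create effect subtitles.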
Updated AssetStudio, supports GI 6.0+, HSR 3.6+, ZZZ 2.2+ (and more), with improvements and new features (*゚∀゚*)
aelurum / AssetStudio
Forked from Perfare/AssetStudio
AssetStudioMod - modified version of Perfare's AssetStudio, mainly focused on UI optimization and some functionality enhancements.
MoviiGen 1.1: Towards Cinematic-Quality Video Generative Models
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Ultralytics YOLO with Additional Knowledge Distillation Capability
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run!
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
A TTS model capable of generating ultra-realistic dialogue in one pass.
Let's make video diffusion practical!
A set of nodes to edit videos using the Hunyuan Video model
[NeurIPS 2025] OmniSVG is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of generating complex and detailed SVGs, from sim…