

PICABench: How Far Are We from Physically Realistic Image Editing?

Python 27 Updated Nov 5, 2025
Python 1,509 65 Updated Oct 28, 2025

This is an early exploration that introduces Interleaving Reasoning to the text-to-image generation field and achieves SoTA benchmark performance. It also significantly improves the quality, fine-grain…

Python 70 Updated Sep 14, 2025

[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Python 442 36 Updated Oct 29, 2025

Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

Python 907 36 Updated Oct 14, 2025

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

Python 885 65 Updated Sep 26, 2025

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 274 8 Updated Oct 5, 2025

Post-training with Tinker

Python 1,435 112 Updated Nov 4, 2025

Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)

Python 2,788 197 Updated Sep 12, 2025

Nano-consistent-150k

Jupyter Notebook 228 8 Updated Oct 20, 2025

LongLive: Real-time Interactive Long Video Generation

Python 789 49 Updated Nov 3, 2025

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,372 103 Updated Oct 31, 2025

[NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs".

Python 63 3 Updated Jun 17, 2024

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,124 2,425 Updated Nov 5, 2025

Fully Open Framework for Democratized Multimodal Training

Python 602 41 Updated Nov 2, 2025

OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Python 386 6 Updated Oct 15, 2025

Official PyTorch implementation of "Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use"

15 Updated Sep 16, 2025

Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction.

Python 1,320 121 Updated Oct 22, 2025

[ICCV 2025, Oral] TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models

Python 784 39 Updated Aug 8, 2025

[ICCV 2025] GameFactory: Creating New Games with Generative Interactive Videos

Python 431 15 Updated Mar 22, 2025

[ICCV'25 Best Paper Finalist] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Python 1,585 76 Updated Oct 23, 2025

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python 1,731 102 Updated Oct 28, 2025

Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition

Python 614 68 Updated Oct 16, 2025

[ICCV 2025 ⭐highlight⭐] Implementation of VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory

Python 389 14 Updated Jul 25, 2025
Python 566 16 Updated Oct 20, 2025

[ICCV 2025 & ICCV 2025 RIWM Outstanding Paper] Aether: Geometric-Aware Unified World Modeling

Python 520 5 Updated Oct 26, 2025

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 8,114 546 Updated Nov 3, 2025

DeepVerse: 4D Autoregressive Video Generation as a World Model

Python 188 8 Updated Aug 11, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,105 1,902 Updated Nov 1, 2025

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

691 63 Updated Aug 27, 2025