Starred repositories

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

776 86 Updated Aug 27, 2025

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Python 821 25 Updated Nov 25, 2025

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Python 238 5 Updated Aug 15, 2025

[ICCV 2023] Consistent Image Synthesis and Editing

Python 828 36 Updated Aug 19, 2024

Multimodal Models in Real World

Jupyter Notebook 551 23 Updated Feb 24, 2025

[TOG 2024] StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter

Python 263 18 Updated Apr 5, 2025

Official repository for LTX-Video

Python 8,914 833 Updated Oct 25, 2025

Tiny AutoEncoder for Hunyuan Video (and other video models)

Python 251 5 Updated Dec 17, 2025

An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation

Python 1,347 66 Updated Oct 16, 2025

Digital Mind Extension

JavaScript 7,105 1,072 Updated Oct 26, 2025

High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale

Rust 5,000 364 Updated Dec 18, 2025

LLM101n: Let's build a Storyteller

35,892 1,961 Updated Aug 1, 2024

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Jupyter Notebook 634 24 Updated May 24, 2024

A collection of awesome text-to-image generation studies.

TeX 719 35 Updated Dec 11, 2025

MoviiGen 1.1: Towards Cinematic-Quality Video Generative Models

Python 180 8 Updated Jul 21, 2025

Jupyter Notebook 1,574 100 Updated Nov 5, 2025

Official PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT

Python 155 5 Updated Oct 21, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,391 284 Updated Jul 17, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,761 103 Updated Nov 4, 2025

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,882 122 Updated Dec 18, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 2,865 287 Updated Dec 11, 2025

Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"

Python 1,155 64 Updated Jun 13, 2025

LLM/VLM gaming agents and model evaluation through games.

Python 831 88 Updated Nov 16, 2025

TransNet V2: Shot Boundary Detection Neural Network

Python 808 128 Updated Dec 4, 2023

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,941 2,206 Updated Dec 15, 2025

Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025) and "UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers"

Python 767 73 Updated Dec 4, 2025

Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer" (ICCV 2025)

Python 1,706 127 Updated Jul 25, 2025

Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]

Python 110 9 Updated Feb 20, 2025

The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]

Python 20 Updated Feb 27, 2025

[ICCV 2025, Oral] TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models

Python 806 40 Updated Dec 17, 2025