Starred repositories (yhw-yhw)

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

932 115 Updated Aug 27, 2025

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Python 876 29 Updated Dec 23, 2025

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Python 245 5 Updated Aug 15, 2025

[ICCV 2023] Consistent Image Synthesis and Editing

Python 845 37 Updated Aug 19, 2024

Multimodal Models in Real World

Jupyter Notebook 559 23 Updated Feb 24, 2025

[TOG 2024] StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter

Python 267 18 Updated Apr 5, 2025

Official repository for LTX-Video

Python 10,154 990 Updated Jan 5, 2026

Tiny AutoEncoder for Hunyuan Video (and other video models)

Python 368 13 Updated Mar 14, 2026

An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation

Python 1,589 78 Updated Oct 16, 2025

Digital Mind Extension

JavaScript 7,500 1,101 Updated Oct 26, 2025

High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale

Rust 5,446 455 Updated Apr 30, 2026

LLM101n: Let's build a Storyteller

36,850 2,020 Updated Aug 1, 2024

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Jupyter Notebook 664 28 Updated May 24, 2024

A collection of awesome text-to-image generation studies.

TeX 758 40 Updated Apr 25, 2026

MoviiGen 1.1: Towards Cinematic-Quality Video Generative Models

Python 183 9 Updated Jul 21, 2025

Jupyter Notebook 1,815 115 Updated Nov 5, 2025

Official PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT

Python 172 6 Updated Oct 21, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,519 315 Updated Jul 17, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 2,233 155 Updated Nov 4, 2025

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 2,261 156 Updated Apr 13, 2026

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 3,334 403 Updated Jan 17, 2026

Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"

Python 1,283 73 Updated Jan 5, 2026

[ICLR 2026] LLM/VLM gaming agents and model evaluation through games.

Python 925 98 Updated Nov 16, 2025

TransNet V2: Shot Boundary Detection Neural Network

Python 927 141 Updated Dec 4, 2023

Wan: Open and Advanced Large-Scale Video Generative Models

Python 15,935 2,653 Updated Mar 5, 2026

Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025), UltraViCo (ICLR 2026) and UltraImage

Python 807 75 Updated Mar 8, 2026

Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer" (ICCV 2025)

Python 1,724 126 Updated Jul 25, 2025

Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]

Python 114 8 Updated Feb 20, 2025

The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]

Python 21 Updated Feb 27, 2025

[ICCV 2025, Oral] TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models

Python 853 41 Updated Dec 17, 2025