ifzhang

Organizations

@hustvl


A hardware-aware, efficient implementation of "Mixture-of-Depths Attention".

Python 162 4 Updated Apr 15, 2026

LLM-powered stock analysis system for A-share, H-share, and US markets: multi-source market data + real-time news + LLM decision dashboard + multi-channel push notifications, running on a schedule at zero cost.

Python 30,055 30,739 Updated Apr 13, 2026

[Tech Report] Alive: A Unified Audio-Video Generation Model

502 36 Updated Mar 31, 2026

Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory

Python 151 4 Updated Feb 9, 2026

HunyuanVideo-1.5: A leading lightweight video generation model

Python 4,375 217 Updated Apr 10, 2026

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 750 27 Updated Nov 27, 2025
Python 1,688 194 Updated Nov 15, 2025

The official UniVerse-1 code.

Python 123 10 Updated Oct 13, 2025

Repo for SeedVR2 (ICLR2026) & SeedVR (CVPR2025 Highlight)

Python 1,142 66 Updated Jan 27, 2026

MTVCraft: An Open Veo3-style Audio-Video Generation Demo

Python 100 12 Updated Oct 8, 2025

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

925 115 Updated Aug 27, 2025

Structured Video Comprehension of Real-World Shorts

Python 237 7 Updated Sep 21, 2025

InternRobotics' open platform for building generalized navigation foundation models.

Jupyter Notebook 797 111 Updated Mar 10, 2026

Official PyTorch Implementation of "Optimal Stepsize for Diffusion Sampling".

Python 200 12 Updated Apr 13, 2025

MAGI-1: Autoregressive Video Generation at Scale

Python 3,678 236 Updated Jun 17, 2025

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 2,243 144 Updated Mar 25, 2026

[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Python 1,441 57 Updated Dec 16, 2025

ReNeg: Learning Negative Embedding with Reward Guidance

Python 35 Updated Dec 22, 2025

(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators

Python 642 35 Updated Nov 10, 2025

[CVPR 2025] StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models

Python 306 27 Updated Jan 27, 2026

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,607 87 Updated Mar 16, 2025

The official implementation of "[MASK] is All You Need"

Jupyter Notebook 126 6 Updated Jul 23, 2025

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 4,340 332 Updated Jan 5, 2026

[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 1,559 93 Updated Nov 10, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 11,970 1,226 Updated Nov 21, 2025

[ECCV 2024] Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

Python 1,314 100 Updated Jul 4, 2025

[CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving

Python 1,352 129 Updated Dec 8, 2025

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

Python 960 63 Updated Mar 24, 2026

Bridging Large Vision-Language Models and End-to-End Autonomous Driving

Python 545 44 Updated Mar 15, 2026