Stanford University
- Stanford, California
- https://bchao1.github.io
- @BrianCChao
- in/brian-chao-85425415a
Starred repositories
Official inference repo for FLUX.2 models
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
[arXiv 2025] Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers
Official implementation of ICML2025 paper "ToMA: Token Merge with Attention for Diffusion Models"
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
HunyuanVideo-1.5: A leading lightweight video generation model
OpenMMLab Pose Estimation Toolbox and Benchmark.
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
A curated list of egocentric (first-person) vision and related area resources
[NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation
[arXiv 2025] A survey of controllable video generation: the official awesome list for "Controllable video generation: A survey"
Get cookies.txt, NEVER send information outside.
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
Source code of paper "NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer"
ViPE: Video Pose Engine for Geometric 3D Perception
Export iMessage data + run iMessage Diagnostics
A sparse attention kernel supporting mixed sparse patterns
FlashInfer: Kernel Library for LLM Serving
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
[NeurIPS 2025] Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
📹 A more flexible framework that can generate videos at any resolution and creates videos from images.
Wan: Open and Advanced Large-Scale Video Generative Models
Official PyTorch Implementation for Dual-Process Image Generation, ICCV 2025
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.