xichenpan

Flash xichenpan

CS PhD @nyu | Prev: @SJTU-CSE @facebookresearch @microsoft @alibaba @HorizonRobotics

353 followers · 169 following

@nyu
New York
xichenpan.com
in/xichenpan
@xichen_pan


Lists (1)


🔮 Future ideas

Stars

nnnth / UniLIP

[ICLR 2026 🔥] Official implementation of "UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing"

Python 149 5 Updated Jan 26, 2026

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,894 83 Updated Feb 25, 2026

lst627 / CLIP-Embeds

Python 8 Updated Jan 13, 2026

PicoTrex / Awesome-Nano-Banana-images

A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…

22,777 2,360 Updated Dec 12, 2025

facebookresearch / metaquery

Official implementation of the paper "Transfer between Modalities with MetaQueries"

Python 319 14 Updated Oct 12, 2025

tang-bd / fuse-dit

[CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

Python 134 5 Updated May 16, 2025

JiuhaiChen / BLIP3o

Official implementation of BLIP3o-Series

Python 1,653 78 Updated Nov 29, 2025

vision-x-nyu / pisa-experiments

Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025)

Jupyter Notebook 56 3 Updated May 8, 2025

zeyofu / Commonsense-T2I

Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]

Python 24 1 Updated Aug 13, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,998 137 Updated Nov 7, 2025

VectorSpaceLab / OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

Jupyter Notebook 4,320 362 Updated Dec 4, 2025

overleaf-workshop / Overleaf-Workshop

Open Overleaf/ShareLaTeX projects in VS Code, with full collaboration support.

TypeScript 1,551 62 Updated May 11, 2026

EvolvingLMMs-Lab / lmms-eval

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 4,143 589 Updated May 20, 2026

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 23,846 2,747 Updated May 19, 2026

nyu-systems / Grendel-GS

[ICLR 2025 Oral] On Scaling Up 3D Gaussian Splatting Training

Python 667 43 Updated Sep 24, 2025

FutureXiang / soda

Unofficial implementation of "SODA: Bottleneck Diffusion Models for Representation Learning"

Jupyter Notebook 97 4 Updated Mar 21, 2024

lllyasviel / Omost

Your image is almost there!

Python 7,621 438 Updated Jul 26, 2024

penghao-wu / vstar

PyTorch Implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"

Python 704 43 Updated Jan 7, 2024

pkunlp-icler / FastV

[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Python 579 29 Updated Jan 4, 2025

xavihart / PDM-Pure

PDM-based Purifier

Python 23 Updated Nov 5, 2024

xichenpan / Kosmos-G

Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models

Python 75 4 Updated May 25, 2024

conda-forge / miniforge

A conda-forge distribution.

Shell 9,788 502 Updated May 14, 2026

princeton-vl / infinigen

Infinite Photorealistic Worlds using Procedural Generation

Python 6,966 588 Updated May 19, 2026

dangeng / visual_anagrams

Code for the paper "Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models"

Jupyter Notebook 965 99 Updated May 3, 2026

guoyww / AnimateDiff

Official implementation of AnimateDiff.

Python 12,119 1,075 Updated Jul 31, 2024

xai-org / grok-1

Grok open release

Python 51,671 8,482 Updated Aug 30, 2024

deepseek-ai / DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 4,107 594 Updated Apr 24, 2024

VAST-AI-Research / TripoSR

TripoSR: Fast 3D Object Reconstruction from a Single Image

Python 6,509 823 Updated Aug 16, 2024

naver / dust3r

DUSt3R: Geometric 3D Vision Made Easy

Python 7,146 753 Updated Sep 24, 2025

microsoft / infinibatch

Efficient, check-pointed data loading for deep learning with massive data sets.

Python 211 17 Updated Jun 12, 2023