JunlinHan

Organizations

@torrvision @facebookresearch


Tempo: Small Vision-Language Models are Smart Compressors for Long Video Understanding

Python 68 2 Updated Apr 29, 2026

🦞 Just talk to your agent — it learns and EVOLVES 🧬.

Python 3,386 440 Updated Apr 11, 2026

Learning to See by Looking at Noise

Python 115 8 Updated Nov 24, 2024

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI

Python 1,359 72 Updated Jan 27, 2026

Procedural Image Programs for Representation Learning - NeurIPS 2022

Python 42 3 Updated Feb 4, 2026

[CVPR 2026] Mesh4D: 4D Mesh Reconstruction and Tracking from Monocular Video

Python 95 7 Updated Jan 9, 2026

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Jupyter Notebook 2,725 274 Updated May 12, 2026

A generative world for general-purpose robotics & embodied AI learning.

Python 28,793 2,708 Updated May 16, 2026

A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Python 648 216 Updated Aug 30, 2021
Python 63 20 Updated Apr 19, 2021

[CVPR 2026] 👋 Dataset and Benchmark code for EgoEdit

Python 147 5 Updated Apr 5, 2026

This repository contains the official code and data for CogIP-Bench (Cognition Image Property Benchmark) and the associated alignment methods described in the paper "From Pixels to Feelings: Aligni…

Python 6 Updated Dec 1, 2025
Jupyter Notebook 136 7 Updated Nov 8, 2025

Implementation of Reinforcement Pre-Training (RPT) for Language Models - ArXiv:2506.08007

Python 22 2 Updated Jul 19, 2025

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 2,282 156 Updated Apr 13, 2026

Fully Open Framework for Democratized Multimodal Training

Python 839 67 Updated May 18, 2026

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,885 82 Updated Feb 25, 2026

NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024

Python 1,837 76 Updated Nov 27, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 19,193 1,764 Updated Jan 30, 2026

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction (ICCV 2025)

Python 842 42 Updated Jan 28, 2026

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 27,239 1,979 Updated Jan 9, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 21,350 3,883 Updated May 18, 2026

[ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective

250 16 Updated Jan 26, 2026

[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Python 1,570 93 Updated May 7, 2025

MapAnything: Universal Feed-Forward Metric 3D Reconstruction

Python 3,377 252 Updated Mar 23, 2026

(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators

Python 643 35 Updated Nov 10, 2025

🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training

Python 287 13 Updated Mar 3, 2026

Code for Words That Make Language Models Perceive

Jupyter Notebook 42 3 Updated Oct 14, 2025

[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".

Python 460 10 Updated Aug 8, 2025
Python 27 Updated Oct 10, 2025