Junlin Han JunlinHan

AI research

104 followers · 64 following

University of Oxford & Meta AI
London
https://junlinhan.github.io/
@han_junlin

Stars

FeiElysia / Tempo

Tempo: Small Vision-Language Models are Smart Compressors for Long Video Understanding

Python · 68 stars · 2 forks · Updated Apr 29, 2026

aiming-lab / MetaClaw

🦞 Just talk to your agent — it learns and EVOLVES 🧬.

Python · 3,386 stars · 440 forks · Updated Apr 11, 2026

mbaradad / learning_with_noise

Learning to See by Looking at Noise

Python · 115 stars · 8 forks · Updated Nov 24, 2024

lucidrains / transfusion-pytorch

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python · 1,359 stars · 72 forks · Updated Jan 27, 2026

mbaradad / shaders21k

Procedural Image Programs for Representation Learning - NeurIPS 2022

Python · 42 stars · 3 forks · Updated Feb 4, 2026

jzr99 / Mesh4D

[CVPR 2026] Mesh4D: 4D Mesh Reconstruction and Tracking from Monocular Video

Python · 95 stars · 7 forks · Updated Jan 9, 2026

google-research / kubric

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Jupyter Notebook · 2,725 stars · 274 forks · Updated May 12, 2026

Genesis-Embodied-AI / genesis-world

A generative world for general-purpose robotics & embodied AI learning.

Python · 28,793 stars · 2,708 forks · Updated May 16, 2026

facebookresearch / clevr-dataset-gen

A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Python · 648 stars · 216 forks · Updated Aug 30, 2021

AlexKuhnle / ShapeWorld

Python · 63 stars · 20 forks · Updated Apr 19, 2021

snap-research / EgoEdit

[CVPR 2026] 👋 Dataset and Benchmark code for EgoEdit

Python · 147 stars · 5 forks · Updated Apr 5, 2026

Follen-cry / MLLM_Cognition_Alignment

This repository contains the official code and data for CogIP-Bench (Cognition Image Property Benchmark) and the associated alignment methods described in the paper "From Pixels to Feelings: Aligni…

Python · 6 stars · Updated Dec 1, 2025

apple / ml-atoken

Jupyter Notebook · 136 stars · 7 forks · Updated Nov 8, 2025

ProCreations-Official / reinforcement-pretraining

Implementation of Reinforcement Pre-Training (RPT) for Language Models - ArXiv:2506.08007

Python · 22 stars · 2 forks · Updated Jul 19, 2025

facebookresearch / perception_models

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook · 2,282 stars · 156 forks · Updated Apr 13, 2026

EvolvingLMMs-Lab / LLaVA-OneVision-2

Fully Open Framework for Democratized Multimodal Training

Python · 839 stars · 67 forks · Updated May 18, 2026

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python · 1,885 stars · 82 forks · Updated Feb 25, 2026

facebookresearch / MetaCLIP

NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024

Python · 1,837 stars · 76 forks · Updated Nov 27, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook · 19,193 stars · 1,764 forks · Updated Jan 30, 2026

caiyuanhao1998 / Open-DiffusionGS

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction (ICCV 2025)

Python · 842 stars · 42 forks · Updated Jan 28, 2026

QwenLM / Qwen3

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python · 27,239 stars · 1,979 forks · Updated Jan 9, 2026

verl-project / verl

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python · 21,350 stars · 3,883 forks · Updated May 18, 2026

NVlabs / RLP

[ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective

250 stars · 16 forks · Updated Jan 26, 2026

facebookresearch / fast3r

[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Python · 1,570 stars · 93 forks · Updated May 7, 2025

facebookresearch / map-anything

MapAnything: Universal Feed-Forward Metric 3D Reconstruction

Python · 3,377 stars · 252 forks · Updated Mar 23, 2026

FoundationVision / Liquid

(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators

Python · 643 stars · 35 forks · Updated Nov 10, 2025

yunlong10 / Awesome-Video-LMM-Post-Training

🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training

Python · 287 stars · 13 forks · Updated Mar 3, 2026

sophicle / sensory

Code for Words That Make Language Models Perceive

Jupyter Notebook · 42 stars · 3 forks · Updated Oct 14, 2025

ByteVisionLab / TokenFlow

[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".

Python · 460 stars · 10 forks · Updated Aug 8, 2025

FrankYang-17 / RealUnify

Python · 27 stars · Updated Oct 10, 2025