Efficient Universal Perception Encoder: a single on-device vision encoder with versatile representations that match or exceed specialized experts across multiple task domains.

Python 665 38 Updated Apr 14, 2026

AIDC-AI / Awesome-Unified-Multimodal-Models

Awesome Unified Multimodal Models

1,282 40 Updated Mar 24, 2026

showlab / Show-o

[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,953 91 Updated Jan 8, 2026

facebookresearch / chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 2,101 117 Updated Jul 29, 2024

joseph-nagel / diffusion-demo

PyTorch denoising diffusion demo

Jupyter Notebook 21 10 Updated Apr 1, 2026

RunpeiDong / DreamLLM

[ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation

Python 461 7 Updated Dec 2, 2024

LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,935 123 Updated Feb 20, 2026

deepseek-ai / Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,747 2,229 Updated Feb 1, 2025

YanFangCS / GenLIP

Official repo for "Let ViT Speak: Generative Language-Image Pre-training"

Python 129 4 Updated Jun 10, 2026

thunlp / OPD

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Python 670 43 Updated May 30, 2026

Tencent-Hunyuan / Hy3-preview

Hy3 preview (295B A21B), a leading reasoning and agent model in its size, with great cost efficiency

Python 380 18 Updated Apr 23, 2026

OpenSenseNova / SenseNova-U1

SenseNova-U series: Native Unified Paradigm with NEO-unify from the First Principles

Python 3,207 279 Updated Jun 15, 2026

facebookresearch / tuna-2

Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation

Python 716 28 Updated Jun 9, 2026

zlab-princeton / VisionFoundry

VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

Python 52 1 Updated Apr 28, 2026

zlab-princeton / vero

Vero: An Open RL Recipe for General Visual Reasoning

Python 125 11 Updated Jun 15, 2026

Lin Chen lchen1019

Lists (24)

1️⃣ Unified Model

🧰 Agent

🦁 Backbone

🦅 Dataset Distillation

🌲 Eval

💎 generation

😪 Hallucination

🛥️ infra

🐰 KD

💭 latent

☕ LoRA

🔢 math

🌟 MLLM

🍄 MTP

🐤 Open-Vocabulary Detection

⭐ Open-Vocabulary Segmentation

👁️‍🗨️ Post-Training

🚀 Pretraining

🍦 RAG

🤔 Reasoning LLM

🍰 SAM

✊ SAM+CLIP

✈️ Segmentation

🚘 V2A

Stars