JaeYoo Park bellos1203

Graduate Student, SNU CVLab

24 followers · 26 following

Seoul National University
Seoul, Korea
https://bellos1203.github.io

Highlights

Pro

Stars

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python 1,372 52 Updated Nov 28, 2025

yejy53 / Echo-4o

Echo-4o

Jupyter Notebook 463 28 Updated Dec 9, 2025

tyfeld / MMaDA-Parallel

Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"

Python 280 7 Updated Nov 19, 2025

zhengdian1 / AIA

Python 30 1 Updated Dec 2, 2025

deepseek-ai / DeepSeek-OCR

Contexts Optical Compression

Python 21,561 1,928 Updated Oct 25, 2025

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,649 55 Updated Nov 15, 2025

rongyaofang / prism-bench

This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark"

Python 112 1 Updated Sep 12, 2025

jacklishufan / LaViDa

Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding

Python 186 9 Updated Dec 17, 2025

HL-hanlin / Bifrost-1

Official implementation of Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025)

Python 44 3 Updated Nov 24, 2025

HorizonWind2004 / reconstruction-alignment

Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.

Python 338 11 Updated Dec 22, 2025

geehokim / FedLPA

Official Implementation of FedLPA (NeurIPS 2025)

4 Updated Oct 11, 2025

stepfun-ai / NextStep-1

Python 581 16 Updated Dec 24, 2025

facebookresearch / dinov3

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 8,992 664 Updated Nov 20, 2025

yinboc / dito

Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers"

Python 162 5 Updated Jan 31, 2025

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,465 1,999 Updated Nov 1, 2025

YuqingWang1029 / TokenBridge

[ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/TokenBridge

Python 150 4 Updated Jul 24, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,502 481 Updated Oct 27, 2025

LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,829 113 Updated Sep 27, 2024

bytedance / 1d-tokenizer

This repo contains the code for 1D tokenizer and generator

Jupyter Notebook 1,086 59 Updated Mar 20, 2025

csuhan / Tar

[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Python 192 6 Updated Sep 18, 2025

FoundationVision / UniTok

[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding

Python 492 10 Updated Nov 14, 2025

casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Python 2,299 296 Updated May 11, 2025

lllyasviel / FramePack

Let's make video diffusion practical!

Python 16,391 1,596 Updated Oct 16, 2025

facebookresearch / perception_models

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,984 130 Updated Dec 18, 2025

allenai / olmocr

Toolkit for linearizing PDFs for LLM datasets/training

Python 16,357 1,265 Updated Dec 20, 2025

illuin-tech / colpali

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Python 2,402 224 Updated Dec 19, 2025

opendatalab / OHR-Bench

(ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

Python 94 14 Updated Dec 3, 2025

dhryougit / learning-to-translate-noise

Python 15 2 Updated Dec 11, 2024

ylingfeng / FGVP

Official Codes for Fine-Grained Visual Prompting, NeurIPS 2023

Python 56 2 Updated Feb 1, 2024

om-ai-lab / VL-CheckList

Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]

Python 136 4 Updated Sep 29, 2024