
Native Multimodal Models are World Learners

Python 1,372 52 Updated Nov 28, 2025

Echo-4o

Jupyter Notebook 463 28 Updated Dec 9, 2025

Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"

Python 280 7 Updated Nov 19, 2025
Python 30 1 Updated Dec 2, 2025

Contexts Optical Compression

Python 21,561 1,928 Updated Oct 25, 2025

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,649 55 Updated Nov 15, 2025

This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark"

Python 112 1 Updated Sep 12, 2025

Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding

Python 186 9 Updated Dec 17, 2025

Official implementation of Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025)

Python 44 3 Updated Nov 24, 2025

Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.

Python 338 11 Updated Dec 22, 2025

Official Implementation of FedLPA (NeurIPS 2025)

4 Updated Oct 11, 2025
Python 581 16 Updated Dec 24, 2025

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 8,992 664 Updated Nov 20, 2025

Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers"

Python 162 5 Updated Jan 31, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,465 1,999 Updated Nov 1, 2025

[ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/TokenBridge

Python 150 4 Updated Jul 24, 2025

Open-source unified multimodal model

Python 5,502 481 Updated Oct 27, 2025

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,829 113 Updated Sep 27, 2024

This repo contains the code for 1D tokenizer and generator

Jupyter Notebook 1,086 59 Updated Mar 20, 2025

[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Python 192 6 Updated Sep 18, 2025

[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding

Python 492 10 Updated Nov 14, 2025

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Python 2,299 296 Updated May 11, 2025

Let's make video diffusion practical!

Python 16,391 1,596 Updated Oct 16, 2025

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,984 130 Updated Dec 18, 2025

Toolkit for linearizing PDFs for LLM datasets/training

Python 16,357 1,265 Updated Dec 20, 2025

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Python 2,402 224 Updated Dec 19, 2025

(ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

Python 94 14 Updated Dec 3, 2025

Official Codes for Fine-Grained Visual Prompting, NeurIPS 2023

Python 56 2 Updated Feb 1, 2024

Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]

Python 136 4 Updated Sep 29, 2024