[ACL-main-2026] We introduce Chart2Code, the first user-driven, hierarchical benchmark that systematically evaluates Large Multimodal Models on chart-to-code tasks of increasing difficulty.

Python 28 Updated Jan 27, 2026

yeezhu / UNIT

PyTorch implementation of "UNIT: Unifying Image and Text Recognition in One Vision Encoder", NeurIPS 2024.

Python 34 2 Updated Sep 26, 2024

deepseek-ai / DeepSeek-OCR

Contexts Optical Compression

Python 23,288 2,151 Updated Jan 27, 2026

showlab / Code2Video

[ICML 2026] Video generation via code

Python 1,795 252 Updated May 31, 2026

showlab / Paper2Video

Automatic Video Generation from Scientific Papers

Python 2,313 328 Updated Mar 5, 2026

UT-Austin-RobIn / lang4sim2real

Code for Data Collection & Training in Sim+Real Envs: [RSS 2024] Natural Language Can Help Bridge the Sim2Real Gap

Python 11 Updated Oct 25, 2025

allenai / molmoact

Official Repository for MolmoAct

Python 369 41 Updated May 11, 2026

X-Omni-Team / X-Omni

Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).

Python 425 11 Updated Aug 26, 2025

weijiawu / Awesome-RL-for-Multimodal-Foundation-Models

📖 This is a repository for organizing papers, code, and other resources related to Visual Reinforcement Learning.

449 22 Updated Apr 28, 2026

showlab / SMS

[ICCV 2025] Balanced Image Stylization with Style Matching Score

Python 70 2 Updated Mar 9, 2026

CSU-JPG / MVPBench

Seeing is Not Reasoning: MVPBench for Graph-based Evaluation of Multi-path Visual Physical CoT

Python 15 Updated Jul 30, 2025

AIGText / Glyph-ByT5

[ECCV2024] This is an official inference code of the paper "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…

Jupyter Notebook 622 31 Updated Sep 5, 2025

FoundationVision / UniTok

[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding

Python 527 12 Updated Nov 14, 2025

SilentView / GigaTok

[ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"

Python 204 2 Updated Jan 7, 2026

Alpha-VLLM / Lumina-mGPT-2.0

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Python 1,084 51 Updated Nov 3, 2025

deepseek-ai / Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,747 2,231 Updated Feb 1, 2025

360CVGroup / PlanGen

Unified layout planning and image generation, ICCV2025

Python 45 3 Updated Jan 19, 2026

Jinpeng Wang FingerRec

Stars

OpenGVLab / EgoExoLearn

CSU-JPG / FlowInOne

shareAI-lab / learn-claude-code

CSU-JPG / VJA

CSU-JPG / MIND

CSU-JPG / VIST

CSU-JPG / Glance

Yuanshi9815 / ViBT

showlab / Adv-GRPO

ThinkMorph / ThinkMorph

showlab / AUI

CSU-JPG / VCode

baaivision / Emu3.5

CSU-JPG / Chart2Code