UniRL is a Framework for Unified Multimodal Model Reinforcement Learning

Python 625 35 Updated Jun 17, 2026

A Curated List of Vision-Language-Action (VLA) and World Action Models (WAM) Research and Beyond

756 26 Updated Jun 16, 2026

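[Three Years of Interviews, Five Years of Mock Exams] An interview guide for AIGC/LLM/AI Agent algorithm engineers. Covers practical interview and written-test experience and core knowledge across the AI industry, including AIGC, LLMs, AI agents, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, big data mining, embodied AI, the metaverse, and AGI.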

3,945 413 Updated Jun 15, 2026

Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Python 371 20 Updated May 13, 2026

Implementation for "The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer"

Python 83 3 Updated Oct 29, 2025

Trainable fast and memory-efficient sparse attention

Python 706 52 Updated Jun 16, 2026

Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models [CVPR 2025]

Python 82 1 Updated Jun 24, 2025

MTRefSeg: An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation

Python 45 Updated Jun 4, 2026

ICML2025

Python 65 5 Updated Aug 28, 2025
Python 10 Updated May 26, 2026
Python 3 Updated May 28, 2026

Unofficial PyTorch reproduction of DeepSeek's Thinking with Visual Primitives.

Python 119 12 Updated Jun 11, 2026
Python 93 7 Updated Oct 10, 2025
Python 37 2 Updated Jun 7, 2026

Efficient Universal Perception Encoder: a single on-device vision encoder with versatile representations that match or exceed specialized experts across multiple task domains.

Python 665 38 Updated Apr 14, 2026

Awesome Unified Multimodal Models

1,282 40 Updated Mar 24, 2026

[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,953 91 Updated Jan 8, 2026

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 2,101 117 Updated Jul 29, 2024

PyTorch denoising diffusion demo

Jupyter Notebook 21 10 Updated Apr 1, 2026

[ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation

Python 461 7 Updated Dec 2, 2024

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,935 123 Updated Feb 20, 2026

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,747 2,229 Updated Feb 1, 2025

Official repo for "Let ViT Speak: Generative Language-Image Pre-training"

Python 129 4 Updated Jun 10, 2026

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Python 670 43 Updated May 30, 2026

Hy3 preview (295B A21B), a leading reasoning and agent model for its size, with great cost efficiency

Python 380 18 Updated Apr 23, 2026

SenseNova-U series: Native Unified Paradigm with NEO-unify from the First Principles

Python 3,207 279 Updated Jun 15, 2026

Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation

Python 716 28 Updated Jun 9, 2026

VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

Python 52 1 Updated Apr 28, 2026

Vero: An Open RL Recipe for General Visual Reasoning

Python 125 11 Updated Jun 15, 2026