
Towards Scalable Pre-training of Visual Tokenizers for Generation

Python 236 5 Updated Dec 16, 2025

WorldPlay: Interactive World Modeling with Real-Time Latency and Geometric Consistency

Python 623 35 Updated Dec 19, 2025

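An AI-native PPT generator based on nano banana pro🍌, moving toward a true "Vibe PPT". It supports uploading any template image, uploading any material with intelligent parsing, automatic PPT generation from a single sentence, an outline, or page descriptions, spoken edits to specified regions, and one-click export.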

TypeScript 5,478 604 Updated Dec 20, 2025
Python 40 1 Updated Dec 16, 2025
Jupyter Notebook 2,746 366 Updated May 2, 2025

A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…

18,946 1,972 Updated Dec 12, 2025

Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition

Python 660 72 Updated Nov 28, 2025

Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering, designed to operate efficiently under stringent computational constraints.

Python 276 12 Updated Dec 21, 2025

ENACT is a benchmark that evaluates embodied cognition through world modeling from egocentric interaction. It is designed to be simple and to have a scalable dataset.

Python 33 1 Updated Nov 27, 2025

Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>

Python 4,823 307 Updated Mar 7, 2025
Python 7,542 445 Updated Dec 14, 2025

Adapting Self-Supervised Representations as a Latent Space for Efficient Generation

33 Updated Oct 17, 2025

MiMo-Embodied

Python 320 11 Updated Nov 21, 2025

PyTorch implementation of JiT https://arxiv.org/abs/2511.13720

Python 1,831 108 Updated Dec 8, 2025

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 6,279 727 Updated Dec 21, 2025

[NeurIPS 2025] VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models

Python 137 7 Updated Nov 8, 2025

Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.

Python 335 11 Updated Dec 16, 2025
Jupyter Notebook 97 1 Updated Nov 8, 2025

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 658 24 Updated Nov 27, 2025

Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer

Python 133 5 Updated Oct 14, 2025

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,640 53 Updated Nov 15, 2025

A PyTorch Implementation of Image Style Transfer Using Convolutional Neural Networks

Python 24 3 Updated Apr 6, 2019

This is the official repo for the paper "LongCat-Flash-Omni Technical Report"

Python 444 24 Updated Dec 15, 2025

Native Multimodal Models are World Learners

Python 1,367 52 Updated Nov 28, 2025

🐻 Uniform Discrete Diffusion with Metric Path for Video Generation

Python 81 2 Updated Dec 21, 2025

Contexts Optical Compression

Python 21,515 1,925 Updated Oct 25, 2025

[NeurIPS 2025] Unveiling Chain of Step Reasoning for Vision-Language Models with Fine-grained Rewards

Python 16 Updated Oct 6, 2025

InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy

Python 317 16 Updated Dec 17, 2025