View liubl1217's full-sized avatar

Python 1,384 112 Updated Feb 12, 2026
Python 9 Updated May 9, 2026

Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation

Python 716 28 Updated Jun 9, 2026

Reference code for the Meta-Harness paper.

Python 1,101 105 Updated Apr 29, 2026

[ICLR 2026] Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models

Python 131 Updated Jan 30, 2026

[ICLR 2026] Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision

Python 228 7 Updated May 31, 2026
Python 610 58 Updated Feb 26, 2026
Python 1,237 78 Updated Nov 20, 2025

Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)

Python 3,404 273 Updated Sep 12, 2025

Benchmarking physical understanding in generative video models

Python 305 33 Updated May 27, 2026

[ICLR 2026] "VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use"

Python 193 8 Updated Mar 20, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 5,016 373 Updated Apr 6, 2026

Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)

Python 725 28 Updated Sep 24, 2025
Jupyter Notebook 168 7 Updated Jun 8, 2026

Let's make video diffusion practical!

Python 17,033 1,702 Updated Oct 16, 2025

A framework for few-shot evaluation of language models.

Python 12,980 3,346 Updated Jun 2, 2026

Fully open data curation for reasoning models

Python 2,279 189 Updated Dec 2, 2025

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents

Python 3,279 106 Updated May 11, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 22,012 4,087 Updated Jun 16, 2026

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 4,231 603 Updated Jun 11, 2026

[NeurIPS 2025] SpatialLM: Training Large Language Models for Structured Indoor Modeling

Python 4,594 383 Updated Sep 26, 2025

Explore the Multimodal “Aha Moment” on 2B Model

Python 623 23 Updated Mar 18, 2025

[CVPR 2025] EgoLife: Towards Egocentric Life Assistant

Python 440 19 Updated Mar 19, 2025

[ICLR'25] Reconstructive Visual Instruction Tuning

Python 135 5 Updated Apr 9, 2025

✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,517 182 Updated Mar 28, 2025

The first behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks.

Python 767 76 Updated Jun 10, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 12,215 1,250 Updated Nov 21, 2025

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 19,364 2,476 Updated May 30, 2026

Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.

Python 1,130 69 Updated Feb 7, 2025

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,935 123 Updated Feb 20, 2026