A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.

639 15 Updated Nov 5, 2025

Native Multimodal Models are World Learners

Python 1,136 39 Updated Nov 5, 2025

Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)

Jupyter Notebook 683 42 Updated Nov 4, 2025

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,479 40 Updated Oct 15, 2025

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference

Python 1,165 36 Updated Oct 26, 2025

Recipes to train reward models for RLHF.

Python 1,474 102 Updated Apr 24, 2025

[NeurIPS 2025] Pixel-Perfect Depth

Python 611 23 Updated Oct 13, 2025

Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give it a star 🌟 if you find it useful.

Python 171 3 Updated Oct 12, 2025

UniVid: The Open-Source Unified Video Model

Python 24 Updated Oct 13, 2025
Python 24 Updated Oct 10, 2025

Fully Open Framework for Democratized Multimodal Training

Python 603 41 Updated Nov 2, 2025

Code of BRIDGE: Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation

Python 113 2 Updated Sep 30, 2025

A minimal implementation of DeepMind's Genie world model

Python 1,013 74 Updated Sep 28, 2025

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,373 103 Updated Oct 31, 2025

Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Jupyter Notebook 15,990 1,261 Updated Oct 27, 2025

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,821 159 Updated Oct 9, 2025

OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Python 386 6 Updated Oct 15, 2025

Official repository for the UAE paper, unified-GRPO, and unified-Bench

Python 147 6 Updated Sep 12, 2025

[NeurIPS 2025 (Spotlight)] The implementation for the paper "4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos"

Python 312 5 Updated Sep 19, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 47,938 3,920 Updated Nov 5, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,378 243 Updated Nov 5, 2025

Unified Reinforcement Learning Framework

Python 786 79 Updated Sep 6, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,319 807 Updated Oct 31, 2025

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 828 54 Updated May 14, 2025

A fork to add multimodal model training to open-r1

Python 1,416 70 Updated Feb 8, 2025

Witness the aha moment of VLM with less than $3.

Python 3,975 290 Updated May 19, 2025

Fully open reproduction of DeepSeek-R1

Python 25,614 2,401 Updated Sep 8, 2025

Train transformer language models with reinforcement learning.

Python 16,171 2,277 Updated Nov 6, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 3,984 297 Updated Nov 3, 2025