- Shanghai AI Lab
- Shanghai, China
- @Haoyu__Guo
Lists (24)
2DV
3D segmentation
3DV
4D
Acceleration / Compression
Datasets
Experience
Framework
GAN
Generation
Human
Indoor
Inverse rendering
Learning
MVS / Stereo matching
NLP
Other
Representation
Review / Survey
RL
SfM / SLAM
Surface reconstruction
Tools
View synthesis
Stars
A curated list of awesome works in world modeling, aiming to serve as a one-stop resource for researchers, practitioners, and enthusiasts.
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
Recipes to train reward models for RLHF.
[NeurIPS 2025] Pixel-Perfect Depth
Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models.
Fully Open Framework for Democratized Multimodal Training
Code for BRIDGE: Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation
A minimal implementation of DeepMind's Genie world model
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
Qwen3-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
Official repository for the UAE paper, unified-GRPO, and unified-Bench
[NeurIPS 2025 (Spotlight)] Implementation of the paper "4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos"
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
slime is an LLM post-training framework for RL Scaling.
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
A fork to add multimodal model training to open-r1
Witness the aha moment of VLM with less than $3.
Fully open reproduction of DeepSeek-R1
Train transformer language models with reinforcement learning.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL