GengzeZhou

🎯

Focusing

Ph.D. @ UoA working on Embodied AI, Vision-and-Language Navigation.

68 followers · 39 following

Australian Institute for Machine Learning (AIML)
Adelaide, Australia
20:35 (UTC +10:30)
https://gengzezhou.github.io/
in/gengze-zhou-159095203

Achievements

Highlights

Pro

Lists (1)

🗺️ Navigation

26 repositories

Starred repositories

NVlabs / DDO

[ICML 2025 Spotlight] Direct Discriminative Optimization: Supercharging Diffusion/Autoregressive with GAN-type Discrimination

Python 109 3 Updated Jul 31, 2025

xlang-ai / OSWorld

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 2,418 359 Updated Dec 23, 2025

GengzeZhou / SAR

Official implementation of Rethinking Training Dynamics in Scale-wise Autoregressive Generation

Jupyter Notebook 1 Updated Dec 17, 2025

showlab / EVOLVE-VLA

EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models

HTML 46 Updated Dec 17, 2025

jiaosiyu1999 / FlexVAR

Python 121 7 Updated Feb 28, 2025

Lixsp11 / sekai-codebase

[NeurIPS 2025] The official repository of "Sekai: A Video Dataset towards World Exploration"

Python 217 5 Updated Dec 5, 2025

sansan0 / TrendRadar

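🎯 Say goodbye to information overload: AI helps you make sense of trending news, with simple public-opinion monitoring and analysis. Multi-platform trending-topic aggregation plus an MCP-based AI analysis tool. Monitors 35 platforms (Douyin, Zhihu, Bilibili, Wallstreetcn, Cailian Press, etc.), with smart filtering, automatic push, and AI conversational analysis (dig deep into the news in natural language with 13 tools, including trend tracking, sentiment analysis, and similarity search). Supports push via WeCom/personal WeChat/Feishu/DingTalk/Telegram/email/ntfy/bark/slack, phone notifications in 1 minute, no need for…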

Python 40,381 20,885 Updated Dec 23, 2025

YuyaoGe / Awesome-Vibe-Coding

98 12 Updated Oct 29, 2025

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python 1,372 52 Updated Nov 28, 2025

ali-vilab / TeaCache

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Python 1,208 49 Updated Jun 8, 2025

QwenLM / Qwen-Image

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,524 367 Updated Dec 24, 2025

Inception3D / TTT3R

A simple state update rule to enhance length generalization for CUT3R

Python 545 17 Updated Oct 1, 2025

nv-tlabs / GEN3C

[CVPR 2025 Highlight] GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control

Jupyter Notebook 1,217 65 Updated Sep 24, 2025

Alibaba-NLP / DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,732 1,361 Updated Dec 24, 2025

InternRobotics / InternNav

InternRobotics' open platform for building generalized navigation foundation models.

Jupyter Notebook 535 59 Updated Dec 23, 2025

zhangyuejoslin / VLN-Survey-with-Foundation-Models

[TMLR 2024] repository for VLN with foundation models

229 11 Updated Oct 25, 2025

fla-org / flash-linear-attention

🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,118 339 Updated Dec 25, 2025

RL4VLM / RL4VLM

Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Jupyter Notebook 404 36 Updated Dec 15, 2024

Hhhhhhao / continuous_tokenizer

Python 294 7 Updated May 29, 2025

tianweiy / CausVid

(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Python 1,124 63 Updated Aug 7, 2025

TauricResearch / TradingAgents

TradingAgents: Multi-Agents LLM Financial Trading Framework

Python 26,957 5,094 Updated Oct 9, 2025

sihyun-yu / REPA

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,474 71 Updated Mar 16, 2025

DreamLM / Dream

Dream 7B, a large diffusion language model

Python 1,120 72 Updated Nov 21, 2025

ByteVisionLab / DetailFlow

🔥 Official impl. of "DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction"

Python 161 8 Updated Jul 10, 2025

facebookresearch / nwm

Official code for the CVPR 2025 paper "Navigation World Models".

Python 484 43 Updated Nov 24, 2025

a1600012888 / LaCT

Code release for paper "Test-Time Training Done Right"

Python 345 16 Updated Nov 18, 2025

wz0919 / EPiC

Official implementation of EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance

Python 46 1 Updated Jun 2, 2025

bareblackfoot / Object2HabitatMap

Awesome habitat top down map work 🤩

Python 35 2 Updated Apr 7, 2024

Qinyu-Allen-Zhao / DiSA

Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation

Jupyter Notebook 144 1 Updated May 27, 2025

yifan123 / flow_grpo

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,782 105 Updated Nov 4, 2025

Starred topics

Computer vision

Artificial Intelligence