GengzeZhou
🎯 Focusing

Starred repositories

[ICML 2025 Spotlight] Direct Discriminative Optimization: Supercharging Diffusion/Autoregressive with GAN-type Discrimination

Python 109 3 Updated Jul 31, 2025

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 2,418 359 Updated Dec 23, 2025

Official implementation of Rethinking Training Dynamics in Scale-wise Autoregressive Generation

Jupyter Notebook 1 Updated Dec 17, 2025

EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models

HTML 46 Updated Dec 17, 2025
Python 121 7 Updated Feb 28, 2025

[NeurIPS 2025] The official repository of "Sekai: A Video Dataset towards World Exploration"

Python 217 5 Updated Dec 5, 2025

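🎯 Say goodbye to information overload: AI helps you make sense of trending news, with simple public-opinion monitoring and analysis. A multi-platform trending-topic aggregator plus an MCP-based AI analysis tool. Monitors 35 platforms (Douyin, Zhihu, Bilibili, Wallstreetcn, Cailian Press, etc.) with smart filtering, automatic push, and AI conversational analysis (dig into news in natural language: trend tracking, sentiment analysis, similarity search, and other tools, 13 in total). Supports push via WeCom/personal WeChat/Feishu/DingTalk/Telegram/email/ntfy/bark/slack, phone notifications within 1 minute, no need for…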

Python 40,381 20,885 Updated Dec 23, 2025

Native Multimodal Models are World Learners

Python 1,372 52 Updated Nov 28, 2025

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Python 1,208 49 Updated Jun 8, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,524 367 Updated Dec 24, 2025

A simple state update rule to enhance length generalization for CUT3R

Python 545 17 Updated Oct 1, 2025

[CVPR 2025 Highlight] GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control

Jupyter Notebook 1,217 65 Updated Sep 24, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,732 1,361 Updated Dec 24, 2025

InternRobotics' open platform for building generalized navigation foundation models.

Jupyter Notebook 535 59 Updated Dec 23, 2025

[TMLR 2024] repository for VLN with foundation models

229 11 Updated Oct 25, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,118 339 Updated Dec 25, 2025

Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Jupyter Notebook 404 36 Updated Dec 15, 2024

(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Python 1,124 63 Updated Aug 7, 2025

TradingAgents: Multi-Agents LLM Financial Trading Framework

Python 26,957 5,094 Updated Oct 9, 2025

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,474 71 Updated Mar 16, 2025

Dream 7B, a large diffusion language model

Python 1,120 72 Updated Nov 21, 2025

🔥 Official impl. of "DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction"

Python 161 8 Updated Jul 10, 2025

Official code for the CVPR 2025 paper "Navigation World Models".

Python 484 43 Updated Nov 24, 2025

Code release for paper "Test-Time Training Done Right"

Python 345 16 Updated Nov 18, 2025

Official implementation of EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance

Python 46 1 Updated Jun 2, 2025

Awesome habitat top down map work 🤩

Python 35 2 Updated Apr 7, 2024

Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation

Jupyter Notebook 144 1 Updated May 27, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,782 105 Updated Nov 4, 2025