Qiang (Jony) ZHANG jonyzhang2023

Hi 👋, I'm Qiang Jony ZHANG

🔭 I'm the Chair of the Research Committee and Chief Researcher at the China National Innovation Center of Embodied AI Robotics. (I'm looking forward to collaborating on exciting projects in this field!) Previously, I participated in 'Tianhe 天河' supercomputing projects in China and also had a very pleasant working experience at DJI.
📚 Visit my Google Scholar page for my publications.
🌟 I’m currently working on Humanoid Robots.
🗞️ News
- 2025-11-29: Our work [SPO] has been adopted as the baseline reinforcement learning algorithm by [PI0.6].
- 2025-08: We won first place in the 100m, second and third place in the 400m, second place in the 1500m, second place in the 4x100m, first place in the material organization competition, and second place in the material handling competition at the first World Robotics Conference (WHR2025).
- 2025-07-29: Released the Humanoid General Multimodal Perception Module, Humanoid Occupancy.
- 2025-07-09: Released the motion control framework for the humanoid robot marathon champion: TienKung Marathon Control Framework.
- 2025-05-24: Published an article in People's Daily.
- 2025-05-08: The "Innovation China" program I participated in was officially broadcast.
- 2025-04-24: Co-hosted a seminar on embodied intelligent robots in cooperation with Peking University.
- 2025-04-19: The Tiangong humanoid robot made history! It successfully completed a half-marathon!
- 2025-04-10: Interviewed by the "Innovation China" column of the China Association for Science and Technology.
- 2025-03-29: Delivered an invited talk at CEAI 2025.
- 2025-03-06: Invited to join the CNR Finance Jin Ding Think Tank.
- 2025-01-08: Participated in the design of the long-term planning and technical roadmap for robots in Beijing City.
- 2024-12-30: Invited by the National Health Commission to serve as an expert reviewer for the artificial intelligence medical project.
- 2024-11-25: Interviewed by the Global Times: "Advancements in AI for football and robot marathons".
- 2024-11-16: Invited to attend the AGIROS Open Source Community Conference at the Institute of Software, Chinese Academy of Sciences, and delivered a keynote speech on "Embodied Intelligence of Humanoid Robots".
- 2024-11-09: Joined the Intel China Academic Talent Program and delivered a keynote speech on "Research on Embodied AI of Humanoid Robots".
- 2024-11-05: Invited to attend the BAAI "ZhiYuan Forum - Embodiment and World Model Summit".
- 2024-09-30: Delivered a talk at the School of Computer Science, Peking University, on Multimodal Perception and Large Model Decision-Making for Humanoid Robots.
- 2024-07-29: Interviewed by Mango (Hunan) TV on the topic of "Embodied AI and Humanoid Robots Tian Gong".
- 2024-07-05: Invited by the China Internet Research Institute to draft a white paper on embodied intelligence in the industrial internet, and presented a report on the integration and innovation of new industrial networks and humanoid robot embodied intelligence technology.
- 2024-07-01: Invited to give a talk in XMech at Zhejiang University.
- 2024-05-29: Interviewed by the Beijing Association for Science and Technology.
- 2024-05-09: The humanoid robot platform 'TianGong' is continuously being iterated and developed.
📝 I’m currently learning Embodied AI, RL, Vision Perception, LLM, and Control & Planning in Robotics. Here are some papers I've published recently:

Preprints

Title	Link
SwiftVLA: Unlocking Spatiotemporal Dynamics for Lightweight VLA Models at Minimal Overhead	Link
DPL: Depth-only Perceptive Humanoid Locomotion via Realistic Depth Synthesis and Cross-Attention Terrain Reconstruction	Link
Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition	Link
PoseDiff: A Unified Diffusion Model Bridging Robot Pose Estimation and Video-to-Action Control	Link
EgoDemoGen: Novel Egocentric Demonstration Generation Enables Viewpoint-Robust Manipulation	Link
TopoNav: Topological Graphs as a Key Enabler for Advanced Object Navigation	Link
HumanoidVerse: A Versatile Humanoid for Vision-Language Guided Multi-Object Rearrangement	Link
LOVON: Legged Open-Vocabulary Object Navigator	Link
ArtVIP: Articulated Digital Assets of Visual Realism, Modular Interaction, and Physical Fidelity for Robot Learning	Link
Survival Games: Human-LLM Strategic Showdowns under Severe Resource Scarcity	Link
Occupancy World Model for Robots	Link
RoboOcc: Enhancing the Geometric and Semantic Scene Understanding for Robots	Link
The Meta-Representation Hypothesis	Link
EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks	Link
HumanoidPano: Hybrid Spherical Panoramic-LiDAR Cross-Modal Perception for Humanoid Robots	Link
NeuGPT: Unified multi-modal Neural GPT	Link
Recursive Cleaning for Large-scale Protein Data via Multimodal Learning	Link
Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning	Link
Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning	Link
MAD: Multi-Alignment MEG-to-Text Decoding	Link
NeuSpeech: Decode Neural signal as Speech	Link
Manipulation Facing Threats: Evaluating Physical Vulnerabilities in End-to-End Vision Language Action Models	Link
E2H: A Two-Stage Non-Invasive Neural Signal Driven Humanoid Robotic Whole-Body Control Framework	Link
Typography Leads Semantic Diversifying: Amplifying Adversarial Transferability across Multimodal Large Language Models	Link
A Dual-Agent Adversarial Framework for Robust Generalization in Deep Reinforcement Learning	Link

Publications

Conference/Journal	Title	Link
IROS 2024	Whole-body Humanoid Robot Locomotion with Human Reference	Link
AAAI 2025	What You See is What You Reach: Towards Spatial Navigation with High-Level Human Instructions	Link
IEEE Transactions on Visualization and Computer Graphics	DEGS: Deformable Event-based 3D Gaussian Splatting from RGB and Event Stream	Link
PM2CE workshop @ IROS 2025	Humanoid Occupancy: Enabling A Generalized Multimodal Occupancy Perception System on Humanoid Robots	Link
H2R workshop @ CoRL 2025	UniTracker: Learning Universal Whole-Body Motion Tracker for Humanoid Robots	Link
WS-Sim2Real @ Humanoids 2025	LiPS: Large-Scale Humanoid Robot Reinforcement Learning with Parallel-Series Structures	Link
WS-Sim2Real @ Humanoids 2025	Trinity: A Modular Humanoid Robot AI System	Link
IJCAI 2025 Deepfake Workshop (Best Student Paper)	Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models	Link
CoRL 2025	Omni-Perception: Omnidirectional Collision Avoidance for Legged Locomotion in Dynamic Environments	Link
ACMMM 2025	Transfer Attack for Bad and Good: Explain and Boost Adversarial Transferability across Multimodal Large Language Models	Link
ICCV 2025	What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?	Link
ICCV 2025	Learning Null Geodesics for Gravitational Lensing Rendering in General Relativity	Link
IROS 2025	Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models	Link
IROS 2025	Distillation-PPO: A Novel Two-Stage Reinforcement Learning Framework for Humanoid Robot Perceptive Locomotion	Link
ACL 2025	MapNav: A Novel Memory Representation via Annotated Semantic Maps for VLM-based Vision-and-Language Navigation	Link
ICML 2025	Simple Policy Optimization	Link
Generative Models for Robot Learning Workshop @ ICLR2025	Modality-Composable Diffusion Policy via Inference-Time Distribution-level Composition	Link
RSS 2025	RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation	Link
ICME 2025	ES-Parkour: Advanced Robot Parkour with Bio-inspired Event Camera and Spiking Neural Network	Link
CVPR 2025	Uncovering Vision Modality Threats in Image-to-Image Tasks	Link
ICRA 2025	Multi-Floor Zero-Shot Object Navigation Policy	Link
ICASSP 2025	Fully spiking neural network for legged robots	Link
ICASSP 2025	Event Masked Autoencoder: Point-wise Action Recognition with Event-Based Cameras	Link
NeurIPS 2024	DEL: Discrete Element Learner for Learning 3D Dynamics from 2D Observations	Link
IEEE Transactions on Artificial Intelligence	Spiking Diffusion Models	Link
NeurIPS 2024	Spiking Neural Network as Adaptive Event Stream Slicer	Link
ICRA 2024	Prompting Multi-Modal Tokens to Enhance End-to-End Autonomous Driving Imitation Learning with LLMs	Link
ICRA 2024	Prompt, Plan, Perform: LLM-based Humanoid Control via Quantized Imitation Learning	Link
IROS 2024	Reinforcement Learning with Generalizable Gaussian Splatting	Link
IROS 2024	TriHelper: Zero-Shot Object Navigation with Dynamic Assistance	Link
WACV 2024	Spiking Denoising Diffusion Probabilistic Models	Link
ICCV 2023	Masked Spiking Transformer	Link
CoRL 2022	RoboTube: Learning Household Manipulation from Human Videos with Simulated Twin Environments	Link

👯 I have hidden some of my previous work; feel free to reach out if you'd like to chat about it. I am currently preparing my personal website and more, and I’m looking to collaborate on Humanoid Robots, Embodied AI, RL, Vision Perception, LLM, and Control & Planning in Robotics.
📫 How to reach me: jony.zhang@x-humanoid.com | zituka@foxmail.com | qzhang749@connect.hkust-gz.edu.cn

💻 My favorite

Python	JavaScript	C++	AWS	Github		Git	HTML5	VsCode
C	Linux	Ubuntu	RedHat	gitlab	MATLAB	notion	PS	pytorch
R	stackoverflow	tensorflow	twitter	vim	raspberrypi	opencv	jenkins	ai
arduino	apple	cmake	eclipse	gmail	instagram	pycharm	sublime	bash