Juo-Tung (Justin) Chen

I am a PhD student at Johns Hopkins University in Mechanical Engineering, advised by Axel Krieger in the Intelligent Medical Robotic Systems and Equipment Lab (IMERSE). I am currently interning at NVIDIA (May 2026).

I received my Master's degree in Robotics from Johns Hopkins University and my Bachelor's degree in Biomechatronics Engineering from National Taiwan University. During my master's, I worked in the Intuitive Computing Lab advised by Chien-Ming Huang, and during my undergrad, I worked in the Robots and Medical Mechatronics Lab (RMML) advised by Ping-Lang Yen.

Research Interests

My research sits at the intersection of surgical robotics and robot learning, with a focus on building autonomous surgical systems. I am driven by the following questions:
  1. How can we train surgical foundation models that generalize across procedures, robot platforms, and institutions by scaling multi-modal data collection?
  2. How can hierarchical imitation learning enable robots to complete long-horizon surgical tasks that require both high-level reasoning and precise low-level control?
  3. How can surgical tool pose estimation from monocular video unlock large-scale pretraining data for surgical foundation models, by recovering kinematics from the vast amount of existing surgical footage that lacks robot state recordings?

CV  |  LinkedIn  |  GitHub  |  Google Scholar  |  Email

News

  • [May, 2026]  Started my internship at NVIDIA in Santa Clara, CA!
  • [Apr, 2026]  Our paper Open-H-Embodiment is on arXiv!
  • [Feb, 2026]  Our paper Cosmos-Surg-dVRK was accepted to RA-L!
  • [Jan, 2026]  Our ImitateCholec dataset paper was published in Nature Scientific Data!
  • [Sep, 2025]  Our paper SutureBot was accepted to NeurIPS 2025!
  • [Jul, 2025]  Our paper SRT-H was featured on the cover of Science Robotics!
  • [Jun, 2025]  Our paper SurgiPose was accepted to IROS 2025!

Publications

* Equal contribution

ImitateCholec: A Multimodal Dataset for Long-Horizon Imitation Learning in Robotic Cholecystectomy
Pascal Hansen, Ji Woong Brian Kim, Antony Goldenberg, Juo-Tung Chen, Yuanzhe Amos Li, Anton Deguet, Brandon White, De Ru Tsai, Richard Cha, Jeffrey Jopling, Paul Maria Scheikl, Axel Krieger
Nature Scientific Data 2026
Paper  |  Dataset
ImitateCholec is a multimodal dataset of expert robotic demonstrations for laparoscopic cholecystectomy, designed to support long-horizon imitation learning. It provides synchronized video, robot kinematics, and task annotations spanning full surgical procedures, enabling robots to learn from human surgical expertise at scale.

Cosmos-Surg-dVRK: World Foundation Model-based Automated Online Evaluation of Surgical Robot Policy Learning
Lukas Zbinden, Nigel Nelson, Juo-Tung Chen, Xinhao Chen, Ji Woong Kim, Mahdi Azizian, Axel Krieger, Sean Huver
RA-L 2025
Paper  |  Website
We leverage world foundation models to enable automated online evaluation of surgical robot policies without requiring physical execution. Cosmos-Surg-dVRK generates realistic video rollout predictions that correlate with real-world task performance, enabling efficient policy iteration and reducing the time and cost of surgical robot learning.

SurgiPose: Estimating Surgical Tool Kinematics from Monocular Video for Surgical Robot Learning
Juo-Tung Chen, XinHao Chen, Ji Woong Kim, Paul Maria Scheikl, Richard Jaepyeong Cha, Axel Krieger
IROS 2025
Paper
SurgiPose estimates 6-DoF kinematics of surgical tools from monocular endoscopic video by combining visual feature extraction with robot kinematics priors. This makes it possible to recover tool poses from existing surgical footage that lacks robot state recordings, unlocking large-scale pretraining data for surgical foundation models.

Reducing Performance Variability and Overcoming Limited Spatial Ability: Targeted Training for Remote Robot Teleoperation
Juo-Tung Chen*, Tsung-Chi Lin*, Chien-Ming Huang
IROS 2024
Paper
We investigate how targeted training can reduce performance variability in remote robot teleoperation and help users with limited spatial ability achieve expert-level proficiency. Our study identifies key perceptual and motor skills underlying teleoperation performance and proposes structured training protocols to address individual deficiencies.

Forgetful Large Language Models: Lessons Learned from Using LLMs in Robot Programming
Juo-Tung Chen, Chien-Ming Huang
AAAI Symposium 2023
Paper  |  Slides
We present lessons learned from deploying large language models for robot programming, focusing on the "forgetfulness" problem where LLMs fail to maintain consistent context across long interaction sequences. Our analysis surfaces practical guidelines for LLM-robot integration and highlights open challenges for the community.

Alchemist: LLM-Aided End-User Development of Robot Applications
Ulas Berk Karl, Juo-Tung Chen, Victor Nikhil Antony, Chien-Ming Huang
HRI 2024
Paper  |  Website
Alchemist is an LLM-powered end-user development platform that enables non-experts to create robot applications through natural language interaction. By abstracting low-level robot APIs into conversational interfaces, Alchemist lowers the barrier to robot programming while maintaining flexibility for complex task specifications.