Dongxia Wu
Incoming Assistant Professor, MBZUAI
Currently Postdoctoral Scholar, Stanford University
ATLAS Lab — AI for Trustworthy Learning And Science
Trustworthy, world-grounded AI for scientific discovery.
Dongxia [dot] Wu [at] mbzuai [dot] ac [dot] ae
I am a postdoctoral scholar at Stanford University, advised by Prof. Emily B. Fox. In Aug 2026 I will join MBZUAI as a tenure-track Assistant Professor, jointly in the Department of Statistics and Data Science and the Department of Machine Learning, where I will lead the ATLAS Lab (AI for Trustworthy Learning And Science).
I received my Ph.D. in Computer Science from UC San Diego in 2025, advised by Prof. Rose Yu and Prof. Yi-An Ma, and my B.S. in Applied Mathematics, Physics, and Computer Science from UW–Madison in 2020.
My research builds trustworthy AI (calibrated, auditable, and physics-grounded models) and applies it to scientific problems, from molecular and cellular biology to the spatiotemporal dynamics of the physical world, working toward scientific world models.
News
- 2026: The ATLAS Lab is recruiting PhD/master’s students, postdocs, RAs, and visiting students; see Prospective Students.
- 2026: We are hosting the Structured Data for Health workshop at ICML 2026 in Seoul!
- 2026: We are hosting the 2nd MMFM-BIOMED workshop at CVPR 2026 in Denver!
- 2026: Honored to receive the UC San Diego CSE Department Dissertation Award (Runner-Up) for my Ph.D. thesis.
- 2026: New paper "Divide and Learn: Multi-Objective Combinatorial Optimization at Scale" accepted at ICML 2026.
- 2025: New paper "MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning" accepted at ICML 2025.
- 2025: I started a postdoc at Stanford University with Prof. Emily B. Fox.
- 2025: I defended my Ph.D. thesis at UC San Diego. Thanks to my advisors Prof. Rose Yu and Prof. Yi-An Ma!
Research
The ATLAS Lab (AI for Trustworthy Learning And Science) builds machine learning that is trustworthy — calibrated, auditable, and physics-grounded — and uses it to advance science, with a focus on biology and the spatiotemporal dynamics of the physical world. My research spans two intertwined thrusts:
Trustworthy AI
Calibrated, auditable, and physics-grounded models.
As AI moves into high-stakes scientific decisions, accuracy alone is not enough. Models must know what they don’t know (calibrated uncertainty), expose the reasoning behind their decisions (auditability), and respect known physical and biological laws (physics-grounded inductive bias). I develop probabilistic and generative methods (Bayesian deep learning, uncertainty quantification, and constrained generative modeling) that turn black-box predictions into reliable, accountable decisions.
AI for Science
From molecular and cellular biology to the spatiotemporal dynamics of the world.
Scientific data are scarce, noisy, multi-fidelity, and structured across space and time. I build generative and Bayesian models that learn efficiently from such data to accelerate discovery — especially in biology, spanning molecule, protein, and material design, single-cell modeling, and biomedical imaging — and to build world models that capture how complex physical systems evolve across space and time. I also take a data-centric view, curating open datasets and rigorous benchmarks that make progress in AI for science measurable and reproducible.
Prospective Students
The ATLAS Lab at MBZUAI is recruiting! I am looking for highly motivated PhD students, master’s students, postdocs, research assistants (RAs), and visiting students, and I recruit from both the Department of Statistics and Data Science and the Department of Machine Learning. PhD positions begin in Fall 2027; postdoc, master’s, RA, and visiting positions can begin as early as Fall 2026. I am especially interested in candidates working across these areas:
- Generative Modeling & Reinforcement Learning: Diffusion- and flow-based generative models and reinforcement learning, core building blocks of modern machine learning.
- Foundation Models & Agents: Large language and vision–language models (LLMs/VLMs), multimodal reasoning, and agentic systems for science.
- Probabilistic Machine Learning: Bayesian inference, uncertainty quantification, sequential decision-making, and experimental design.
- AI for Biology & Science: Molecular, protein, and material design; single-cell modeling; biomedical imaging; time-series modeling; and spatiotemporal modeling.
If you would like to work with me, please email your CV, academic transcript, and a brief research statement to Dongxia [dot] Wu [at] mbzuai [dot] ac [dot] ae, using the subject line [Your Name] — [Position] — [Research Direction(s)] — ATLAS Lab Application. In your email, please note which of the four directions above best match your interests and why.
Publications
For a full and up-to-date list, see my Google Scholar.
- Calibrating LLMs with Semantic-level Reward. arXiv preprint, 2026.
- BALAR: A Bayesian Agentic Loop for Active Reasoning. arXiv preprint, 2026.
- CellFluxRL: Biologically-Constrained Virtual Cell Modeling via Reinforcement Learning. arXiv preprint, 2026.
- Uncertainty-Aware Distribution-to-Distribution Flow Matching for Scientific Imaging. arXiv preprint, 2026.
- Divide and Learn: Multi-Objective Combinatorial Optimization at Scale. International Conference on Machine Learning (ICML), 2026.
- MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning. International Conference on Machine Learning (ICML), 2025.
- Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints. International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.
- Learning Granger Causality from Instance-wise Self-attentive Hawkes Processes. International Conference on Artificial Intelligence and Statistics (AISTATS), 2024.
- Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling. International Conference on Machine Learning (ICML), 2024.
- Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization. NeurIPS Workshop on Bayesian Decision-making and Uncertainty, 2024.
- Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs. NeurIPS Workshop on Statistical Foundations of LLMs and Foundation Models, 2024.
- GLEAM-AI: Neural Surrogate for Accelerated Epidemic Analytics and Forecasting. NeurIPS Workshop on Bayesian Decision-making and Uncertainty, 2024.
- Disentangled Multi-Fidelity Deep Bayesian Active Learning. International Conference on Machine Learning (ICML), 2023.
- Deep Bayesian Active Learning for Accelerating Stochastic Simulation. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023.
- Multi-Fidelity Hierarchical Neural Processes. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022.
- DeepViFi: Detecting Oncoviral Infections in Cancer Genomes Using Transformers. ACM Conference on Bioinformatics, Computational Biology and Health Informatics (BCB), 2022.
- Evaluation of Individual and Ensemble Probabilistic Forecasts of COVID-19 Mortality in the United States. Proceedings of the National Academy of Sciences (PNAS), 2022.
- Quantifying Uncertainty in Deep Spatiotemporal Forecasting. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021.
- DeepGLEAM: A Hybrid Mechanistic and Deep Learning Model for COVID-19 Forecasting. arXiv preprint, 2021.
- A Deep Learning Based Automatic Defect Analysis Framework for In-situ TEM Ion Irradiations. Computational Materials Science, 2021.
- Multi-Defect Detection and Analysis of Electron Microscopy Images with Deep Learning. Computational Materials Science, 2021.