CS PhD student @ NYU Courant / NYU Shanghai
Generative Models for Scientific Discovery
Diffusion, LLM pretraining, and post-training with GFlowNet/RL
I work on generative modeling for molecules, RNA, and proteins. My goal is to build reliable, controllable models that respect scientific constraints.
Diffusion, LLMs, GFlowNet / RL, AI for Science
Affiliation
New York University (Courant)
Google Scholar
GitHub
Status: Active research on diffusion + LLMs for science
Research
I'm interested in building foundation models and generative systems that can be controlled and trusted in scientific domains.
01
Diffusion for Molecular Design
Data-scarce generation, stereochemistry control, and structure-aware guidance.
classifier guidance, stereochemistry, 3D
02
LLM Pretraining for Scientific Sequences
Pretraining and adaptation for chemical languages and biological sequences (DNA/RNA/protein).
pretraining, foundation models, scientific tokens
03
Post-training with GFlowNet / RL
Stabilizing autoregressive GFlowNets and aligning generation with scientific objectives.
trajectory balance, post-training, RL
04
AI for Science: RNA & Proteins
Molecular structure/function modeling and generative design.
RNA, protein, molecule generation
News
2026
2025
2025-12
2025-02: Chiralcat published in Artificial Intelligence Chemistry.
2023
2023-07