VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics

Updates

[Mar 2026] Pre-trained checkpoints released on HuggingFace!
[Jan 2026] VICON has been accepted by TMLR! The accepted paper is available on OpenReview.

Installation

# Clone repository
git clone https://github.com/Eydcao/VICON.git
cd VICON

# Create conda environment (installs Python + Poetry)
conda env create -f environment.yml
conda activate vicon

# Install dependencies via Poetry
poetry install --no-root

Pre-trained Checkpoints

Model	Dataset	Download
VICON	Combined (All 3 PDEs)

Dataset

VICON was evaluated on three fluid dynamics datasets:

PDEArena-Incomp (incompressible Navier-Stokes)
PDEBench-Comp-HighVis (compressible Navier-Stokes)
PDEBench-Comp-LowVis (compressible Navier-Stokes with numerical-zero viscosity)

Refer to dataset_prepare/README.md for details.

Usage

We use Hydra for configuration management, allowing flexible parameter modifications via command line or config files.

Training

# Train the model with specific configurations
# Assuming GPUs 0 and 1, enable wandb logging
# $COMPRESSIBLE2D_DIR, $EULER2D_DIR, and $NS2D_DIR must be set to the prepared dataset folders
CUDA_VISIBLE_DEVICES="0,1" python src/train.py \
    plot=0 board=1 amp=0 \
    dataset_workers=2 multi_gpu=1 \
    datasets.train_batch_size=30 \
    loss.min_ex=5 \
    model.transformer.num_layers=10 \
    model.use_patch_pos_encoding=True \
    model.use_func_pos_encoding=True \
    datasets.types.COMPRESSIBLE2D.folder=$COMPRESSIBLE2D_DIR \
    datasets.types.EULER2D.folder=$EULER2D_DIR \
    datasets.types.NS2D.folder=$NS2D_DIR

Evaluation

# Run rollout evaluation
python src/rollout.py \
    rollout.ckpt_dir=/path/to/checkpoint/dir \
    rollout.ckpt_stamp=checkpoint_name \
    board=0

Refer to the configs/ folder for detailed configuration options.

Motivations

Current approaches to operator learning for PDEs face challenges that limit their practical application:

Single Operator Learning
- Requires complete retraining when equation type or parameters change
- Impractical for real-world deployment where system conditions vary
Pretrain-Finetune Approach
- Can handle multiple PDE types during pretraining
- Still requires substantial data collection (hundreds of frames/trajectories) for finetuning
- Challenging in downstream applications with limited data availability, e.g., online environments

The ICON Innovation and Its Limitations

ICON (Yang et al., 2024) introduced a novel perspective inspired by in-context learning in LLMs:

Defines physical fields before and after a given timestep as query/answer (or COND/QoI) pairs
Extracts dynamics directly from a few such pairs without requiring finetuning

However, ICON faces an architectural limitation:

Processes entire discretized physical fields point by point, with one token per sample point of each query/answer
Results in extremely long transformer sequences
Becomes computationally infeasible for real-scale, high-dimensional data

VICON's Solution

Figure 1: Schematic overview of VICON architecture.

Inspired by Vision Transformers (ViT), which efficiently handle large images by processing them in patches, VICON overcomes these limitations while maintaining the benefits of in-context learning. Our contributions include:

First implementation of in-context learning for 2D PDEs without requiring explicit PDE information
State-of-the-art empirical results compared to existing methods
Flexible rollout capabilities through learning to extract dynamics from pairs with varying timestep sizes

Method

VICON combines the in-context operator learning framework with a vision transformer architecture through several key components:

1. Patch-wise Processing

Divides input physical fields into manageable patches
Significantly reduces sequence length compared to the token-per-point approach of the original ICON
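
To make the sequence-length saving concrete, here is a minimal sketch of patch-wise tokenization in plain Python. The 128 x 128 resolution, the 16 x 16 patch size, and the `patchify` helper are illustrative assumptions, not values or functions taken from the VICON code base.

```python
def patchify(field, patch):
    """Split a 2D field (list of rows) into non-overlapping patch x patch tiles.

    Tiles are returned in row-major order, each flattened to a list.
    """
    height, width = len(field), len(field[0])
    if height % patch or width % patch:
        raise ValueError("field size must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([field[top + i][left + j]
                          for i in range(patch) for j in range(patch)])
    return tiles


# Hypothetical single-channel field; sizes are assumptions for illustration.
size, patch = 128, 16
field = [[float(r * size + c) for c in range(size)] for r in range(size)]
tiles = patchify(field, patch)

tokens_per_point = size * size   # one token per grid point
tokens_per_patch = len(tiles)    # one token per patch
print(tokens_per_point, tokens_per_patch)
```

Under these assumed sizes, a single field costs 16384 tokens in the per-point scheme and 64 tokens in the patch scheme, a 256-fold reduction before multiple pairs are concatenated into one prompt.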

2. Dual Positional Encoding System

VICON uses two types of positional encoding to convey the spatial and functional relationships between tokens:

a) Patch Position Encoding

Encodes relative spatial relationships between patches
Maintains awareness of physical space structure

b) Function Position Encoding

Indicates the role of each patch in the sequence:
- Which pair in the sequence it belongs to
- Whether it's part of the input (query) or output (answer) in that pair
Crucial for maintaining the in-context learning structure
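
The two encodings can be pictured as two independent indices attached to every patch token. The sketch below only builds those indices; it assumes a prompt laid out pair by pair with the query frame before the answer frame, and it uses absolute patch coordinates for simplicity. The helper names, the layout, and the index formulas are hypothetical, and how VICON turns such indices into embeddings is not specified here.

```python
def token_indices(num_pairs, patches_per_side):
    """List (pair_id, role, patch_row, patch_col) for every token in a prompt.

    role is 0 for the query (COND) frame and 1 for the answer (QoI) frame.
    """
    tokens = []
    for pair_id in range(num_pairs):
        for role in (0, 1):
            for row in range(patches_per_side):
                for col in range(patches_per_side):
                    tokens.append((pair_id, role, row, col))
    return tokens


def function_index(pair_id, role):
    """Single index identifying which pair a token is in and its role there."""
    return 2 * pair_id + role


def patch_index(row, col, patches_per_side):
    """Single index identifying where the patch sits in the spatial grid."""
    return row * patches_per_side + col


# Hypothetical prompt: 3 pairs, each frame split into a 4 x 4 grid of patches.
tokens = token_indices(num_pairs=3, patches_per_side=4)
for pair_id, role, row, col in tokens[:2]:
    print(function_index(pair_id, role), patch_index(row, col, 4))
```

Tokens from the same frame share a function index and differ only in patch index, while the same spatial patch in different frames shares a patch index and differs only in function index.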

3. Flexible Rollout Strategies

VICON's unique training approach enables versatile rollout schemes:

Forms pairs with varying timestep sizes during training
Allows a single trained model to:
- Extract dynamics at different time scales
- Perform rollouts with various timestep strides
- Potentially reduce the number of rollout steps when appropriate, minimizing error accumulation in long-term predictions

Figure 2: VICON's flexible rollout strategy: Starting with timestep dt=1, the model progressively accumulates flow fields needed for larger stride predictions up to dt=5, enabling efficient long-term predictions with larger timestep strides.
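
As a rough illustration of why larger strides shorten a rollout, the sketch below enumerates the frame-index pairs available at a given stride and the frames visited when stepping to a fixed horizon. The helper names, the 6-frame trajectory, and the 20-frame horizon are assumptions made for this example; the sketch does not reproduce the progressive accumulation schedule of Figure 2.

```python
def make_pairs(num_frames, stride):
    """Index pairs (t, t + stride) available in a trajectory of num_frames frames."""
    return [(t, t + stride) for t in range(num_frames - stride)]


def rollout_targets(start, horizon, stride):
    """Frame indices visited when stepping from start up to horizon with a fixed stride."""
    return list(range(start + stride, horizon + 1, stride))


# Pairs that a hypothetical 6-frame trajectory offers at the smallest and largest stride.
pairs_stride_1 = make_pairs(6, 1)
pairs_stride_5 = make_pairs(6, 5)

# Number of model calls needed to reach frame 20 from frame 0.
steps_stride_1 = len(rollout_targets(0, 20, 1))
steps_stride_5 = len(rollout_targets(0, 20, 5))
print(steps_stride_1, steps_stride_5)
```

With these assumed numbers, stride 5 reaches the horizon in 4 steps instead of 20, which is the mechanism by which fewer rollout steps can limit error accumulation.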

Results

Performance Improvements over MPP

Reduction in scaled L2 error for long-term predictions:
- 40% for PDEBench-Comp-LowVis
- 61.6% for PDEBench-Comp-HighVis
67% reduction in turbulence kinetic energy prediction error for PDEBench-Comp-LowVis
3x faster inference time

Figure 3: Comparison of turbulence kinetic energy predictions: VICON demonstrates superior accuracy over MPP in both RMSE metrics and advanced physical statistics.

Timestep Stride Generalization and Flexible Rollout Strategies

VICON demonstrates exceptional adaptability in handling varying time strides:

Maintains prediction accuracy with previously unseen larger strides (smax=6,7), despite training only on smax=1 to 5
Particularly effective for the PDEArena-Incomp dataset, where multi-step rollout (smax=5) substantially outperforms single-step predictions
Enables direct application to experimental settings with hardware-constrained sampling rates, eliminating the need for interpolation or retraining the model

Figure 4: Comparison with state-of-the-art performance. VICON is additionally evaluated with different timestep strides, including previously unseen ones. For PDEArena-Incomp dataset, maximum stride size (smax=5) achieves optimal rollout performance.

Citation

@article{cao2026vicon,
 author = {Cao, Y. and Liu, Y. and Yang, L. and Yu, R. and Schaeffer, H. and Osher, S.},
 title = {{VICON}: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction},
 journal = {Transactions on Machine Learning Research},
 year = {2026},
 issn = {2835-8856},
 url = {https://openreview.net/forum?id=6V3YmHULQ3}
}

Poster

Acknowledgements

This work was supported by the US Army Research Office (Army-ECASE W911NF-23-1-0231), US Department of Energy, IARPA HAYSTAC Program, CDC, DARPA AIE FoundSci, DARPA YFA, NSF grants (#2205093, #2100237, #2146343, #2134274), AFOSR MURI (FA9550-21-1-0084), NSF DMS 2427558, NUS Presidential Young Professorship, STROBE NSF STC887 DMR 1548924, and ONR N00014-20-1-2787.
