Disclaimer 1: This is NOT the official repository. Please check Yingxuan You's homepage for announcements on open-source code for the paper. Code is mostly written by AI-agents with adequate human guidance.
Disclaimer 2: DrivAerNet++ contains monstrous amount of 39TB data. Even downloading the absoultely necessary data for the present project is a quazy task (especially in China mainland!). This project has never been actually trained, however the downloading and training efforts are on-going.
A faithful reproduction of the paper "PhysGen: Physically Grounded 3D Shape Generation for Industrial Design" (arXiv:2512.00422) by Yingxuan You, Chen Zhao, Hantao Zhang, Mingda Xu, and Pascal Fua (CVLab, EPFL).
PhysGen is a physics-guided 3D shape generation framework that incorporates physical awareness into the generation process. The key innovation is an alternating update strategy that combines flow-matching velocity updates with physics-based refinement.
- SP-VAE (Shape-and-Physics VAE): Jointly encodes shape and physics information into a unified latent space
- Flow Matching Model: Rectified flow formulation for shape generation using Diffusion Transformer (DiT)
- Physics-Guided Generation: Alternating velocity-based updates and physical refinement (Algorithm 1)
- Multi-Dataset Support: DrivAerNet++, ShapeNet, and BlendedNet datasets
The Shape-and-Physics VAE consists of four components:
| Component | Architecture | Key Features |
|---|---|---|
| Shape Encoder | Dual cross-attention | Bidirectional attention between uniform and salient points (based on Dora [5]) |
| Shape Decoder | SDF-based decoding | Cross-attention between query points and latent; predicts SDF values |
| Pressure Decoder | Three-branch fusion | Self-attention + Squeeze-Excitation + MLP with learnable fusion weights |
| Drag Decoder | Three-layer MLP head | Same three-branch architecture as pressure decoder; sigmoid output |
Hyperparameters:
- Latent dimension: 512
- Encoder/Decoder layers: 6
- Attention heads: 8
- Hidden dimension: 512
- SE reduction factor: 16
Stage 1: Independent Training
- Shape encoder-decoder: 1000 epochs with SDF + KL loss (λ_sdf=1.0, λ_KL=0.001)
- Pressure decoder: 1500 epochs with MAE + MSE loss; encoder frozen
- Drag decoder: 1500 epochs with MAE + MSE loss; encoder frozen
Stage 2: Joint Fine-tuning
- All components: 500 epochs with combined loss
- Loss weights: λ_shape=10, λ_press=0.1, λ_drag=10
- Lower learning rate (1e-5)
- Rectified Flow Formulation [25]: Linear interpolation z_t = t·z_1 + (1-t)·ε
- DiT Architecture: 12 layers, 512 hidden dim, 8 heads, MLP ratio 4.0
- Sampling: 100 steps at inference using Euler integration
- Physics-Aware Regularization: λ_d = 0.03 (Equation 10)
For K=20 alternating iterations:
Phase 1: Velocity-based update (25 steps) with physics regularization
Phase 2: Physical refinement (20 steps) using surface pressure
Re-noise to t=0.75
Final velocity-based update from t=0.75 to t=1.0
Directional weights for physical refinement:
- λ_x = 0.2 (drag minimization)
- λ_y = 0.1 (lateral symmetry)
- λ_z = 0.1 (negative lift for traction)
# Clone repository
git clone <repo-url>
cd physgen
# Install dependencies
pip install -r requirements.txtCore:
- Python 3.8+
- PyTorch >= 2.0.0
- NumPy, SciPy
3D Processing:
- trimesh >= 3.21.0
- pymcubes >= 0.1.4
Deep Learning:
- torchvision, timm, einops
Optional:
- OpenFOAM (for CFD simulation)
The primary dataset for training SP-VAE and flow matching models. This is a large-scale aerodynamic design dataset:
- Size: 8,000 high-quality vehicle geometries
- Split: 5,819 training samples, 1,147 test samples
- Physics Data: Each geometry includes high-fidelity CFD simulations:
- Drag coefficients (dimensionless, C_d)
- Surface pressure fields (continuous pressure values at surface points)
- Volumetric 3D flow fields
- Directional forces (drag, lateral, lift)
- Vehicle Body Styles: Fastback, notchback, and estateback
- Variations: Underbody structure and wheel configurations
⚠️ Important Evidence from the Paper: The paper explicitly states the physics information used is A = {C_d, P} where:
- C_d = "dimensionless coefficient of drag"
- P = "pressure field measured at surface points" (Line 214-227)
The volumetric 3D flow fields are mentioned only as part of what the dataset CONTAINS (for reference), NOT for training.
The full DrivAerNet++ dataset is ~39TB due to high-resolution CFD data. However, only a small portion is required for training PhysGen:
| Data Component | Paper Evidence | Used for Training | File Location | Approx. Size |
|---|---|---|---|---|
| Mesh Geometry | Lines 339-360: SDF loss | ✅ YES - Required | meshes/*.obj |
~1-2 GB |
| Surface Pressure | Lines 366-373: "ground-truth surface pressure" | ✅ YES - Required | physics/*.npz (key: 'pressure') |
~500 MB |
| Drag Coefficient | Lines 374-393: "ground-truth drag coefficient" | ✅ YES - Required | physics/*.npz (key: 'drag') |
~500 MB |
| Surface Point Locations | Lines 394-399 | ✅ YES - Required for pressure interpolation | physics/*.npz (key: 'surface_points') |
~500 MB |
| Volumetric Flow Fields | Lines 868, 1654: "full 3D flow fields" (dataset contents only) | ❌ NO - Not used for training | flow_fields/* |
~30+ TB |
Direct Quotes from Paper:
"In this paper, we take A to be aerodynamic properties formulated as A = {C_d, P}, where C_d is the dimensionless coefficient of drag and P : R^3 → R^+ is the pressure field measured at surface points." (Lines 214-227)
"where ˆp and ˆC_d represent the ground-truth surface pressure and drag coefficient, respectively." (Lines 394-399)
Minimum Required Data (for training):
data/drivaernet++/
meshes/
*.obj # 3D mesh geometry (~1-2 GB for 8000 meshes)
physics/
*.npz # Contains: pressure, drag, surface_points (~1-2 GB)
train.txt # Training IDs
test.txt # Test IDs
What you can SKIP downloading:
- ❌ Volumetric flow field data (the largest component at ~30TB)
- ❌ High-resolution CFD mesh data used for simulations
- ✅ Only needed for OpenFOAM validation at inference time (not training)
Important: The
.npzfiles must contain at minimum:
pressure: Surface pressure values (array)drag: Drag coefficient (scalar)surface_points: 3D coordinates of surface points where pressure is measured
Data Format:
Used for out-of-distribution evaluation:
- Category: Vehicle/car split
- Purpose: Test generalization to unseen geometries
- Characteristics: Physically imperfect but geometrically reasonable meshes
- Preprocessing: Shapes uniformly rescaled to match DrivAerNet++ scale
- Usage: Used as initial shapes for physics-guided generation pipeline
- Data Format:
data/shapenet/
car/
meshes/
*.obj
train.txt
test.txt
Used for cross-domain transferability evaluation:
- Size: 999 distinct blended-wing-body (BWB) aircraft geometries
- Conditions: ~9 aerodynamic conditions per geometry
- Total Cases: 8,830 converged CFD simulations
- Turbulence Model: Spalart-Allmaras
- Mesh Size: 9-14 million cells per case
- Purpose: Evaluate transfer to domains beyond ground vehicles (aircraft)
For each 3D mesh during training, the following point clouds are extracted:
- Uniform Surface Points (P_u): 32,768 points using Sharp Edge Sampling (SES)
- Salient Edge Points (P_s): 32,768 points using SES
- Query Points: Downsampled to 1,024 points each via Farthest Point Sampling (FPS), concatenated into 2,048-point set for cross-attention
- Supervision:
- Uniformly sampled coarse points in bounding box
- Sharp points perturbed around ground-truth surface
Stage 1: Train Shape Encoder-Decoder
python train_spvae.py \
--data_dir data/drivaernet++ \
--stage shape \
--checkpoint_dir checkpoints \
--seed 42 \
--deterministicStage 1: Train Pressure Decoder
python train_spvae.py \
--data_dir data/drivaernet++ \
--stage pressure \
--checkpoint_dir checkpoints \
--seed 42Stage 1: Train Drag Decoder
python train_spvae.py \
--data_dir data/drivaernet++ \
--stage drag \
--checkpoint_dir checkpoints \
--seed 42Stage 2: Joint Fine-tuning
python train_spvae.py \
--data_dir data/drivaernet++ \
--stage joint \
--checkpoint_dir checkpoints \
--seed 42Train All Stages:
python train_spvae.py \
--data_dir data/drivaernet++ \
--stage all \
--checkpoint_dir checkpoints \
--seed 42 \
--deterministicpython train_flow_matching.py \
--data_dir data/drivaernet++ \
--spvae_checkpoint checkpoints/spvae_final.pt \
--checkpoint_dir checkpoints \
--seed 42Unconditional Generation
python generate.py \
--spvae_checkpoint checkpoints/spvae_final.pt \
--flow_matching_checkpoint checkpoints/flow_matching_final.pt \
--mode unconditional \
--num_samples 100 \
--output_dir outputs/unconditional \
--seed 42Generation with Target Drag
python generate.py \
--spvae_checkpoint checkpoints/spvae_final.pt \
--flow_matching_checkpoint checkpoints/flow_matching_final.pt \
--mode unconditional \
--target_drag 0.25 \
--num_samples 100 \
--output_dir outputs/target_drag \
--seed 42Generation from Sketch
python generate.py \
--spvae_checkpoint checkpoints/spvae_final.pt \
--flow_matching_checkpoint checkpoints/flow_matching_final.pt \
--mode sketch \
--sketch_path sketches/car_sketch.png \
--target_drag 0.25 \
--output_dir outputs/from_sketch \
--seed 42Shape Reconstruction
python evaluate.py \
--mode reconstruction \
--spvae_checkpoint checkpoints/spvae_final.pt \
--data_dir data/drivaernet++ \
--output_file results/reconstruction.json \
--seed 42Physics Estimation
python evaluate.py \
--mode physics \
--spvae_checkpoint checkpoints/spvae_final.pt \
--data_dir data/drivaernet++ \
--output_file results/physics.json \
--seed 42Generation Quality
python evaluate.py \
--mode generation \
--generated_dir outputs/unconditional \
--gt_dir data/drivaernet++/meshes \
--output_file results/generation.json \
--seed 42physgen/
├── configs/
│ ├── __init__.py
│ └── config.py # All hyperparameters from paper
├── data/
│ ├── __init__.py
│ └── dataset.py # DrivAerNet++, ShapeNet, BlendedNet loaders
├── models/
│ ├── __init__.py
│ ├── attention.py # Multi-head, cross-attention, DiT blocks, SE
│ ├── spvae.py # SP-VAE (encoder + 3 decoders)
│ ├── flow_matching.py # Rectified flow with DiT
│ └── physics_guided.py # Algorithm 1 implementation
├── utils/
│ ├── __init__.py
│ ├── geometry.py # Point sampling, SDF, marching cubes
│ ├── metrics.py # All evaluation metrics
│ └── seed.py # Deterministic seeding for reproducibility
├── train_spvae.py # Two-stage SP-VAE training
├── train_flow_matching.py # Flow matching training
├── generate.py # Inference script (Algorithm 1)
├── evaluate.py # Evaluation script
├── test_training.py # Training loop verification test
├── requirements.txt # Dependencies
└── README.md # This file
| Paper Section | Component | File | Status |
|---|---|---|---|
| 3.1.1 | Shape Encoder (dual cross-attention) | models/spvae.py::ShapeEncoder |
✓ |
| 3.1.1 | Shape Decoder (SDF-based) | models/spvae.py::ShapeDecoder |
✓ |
| 3.1.2 | Pressure Decoder (3-branch) | models/spvae.py::PressureDecoder |
✓ |
| 3.1.2 | Drag Decoder (3-layer MLP) | models/spvae.py::DragDecoder |
✓ |
| 3.1.3 | Stage 1: Independent training | train_spvae.py |
✓ |
| 3.1.3 | Stage 2: Joint fine-tuning | train_spvae.py |
✓ |
| 3.2.1 | Rectified flow formulation | models/flow_matching.py |
✓ |
| 3.2.1 | DiT architecture (12 layers, 512 dim) | models/flow_matching.py::DiT |
✓ |
| 3.2.1 | Physics-aware regularization | models/flow_matching.py::sample_with_physics_guidance |
✓ |
| 3.2.2 | Directional forces (Eq. 11) | models/physics_guided.py::compute_directional_forces |
✓ |
| 3.2.2 | Physics loss (Eq. 12-13) | models/physics_guided.py::compute_physics_loss |
✓ |
| 3.2.3 | Algorithm 1 (alternating update) | models/physics_guided.py::generate |
✓ |
| 4.1 | Evaluation metrics | utils/metrics.py |
✓ |
All hyperparameters are extracted verbatim from the paper:
| Parameter | Paper Value | Code Location |
|---|---|---|
| Latent dimension | 512 | config.py:19 |
| DiT depth | 12 | config.py:49 |
| DiT hidden size | 512 | config.py:50 |
| DiT num heads | 8 | config.py:51 |
| λ_sdf | 1.0 | config.py:22 |
| λ_KL | 0.001 | config.py:23 |
| λ_shape | 10 | config.py:26 |
| λ_press | 0.1 | config.py:27 |
| λ_drag | 10 | config.py:28 |
| λ_d | 0.03 | config.py:70 |
| λ_x | 0.2 | config.py:76 |
| λ_y | 0.1 | config.py:77 |
| λ_z | 0.1 | config.py:78 |
| K (alternating iters) | 20 | config.py:65 |
| M (refinement steps) | 20 | config.py:67 |
| t_ns (re-noising) | 0.75 | config.py:73 |
| Shape epochs | 1000 | config.py:31 |
| Pressure epochs | 1500 | config.py:32 |
| Drag epochs | 1500 | config.py:33 |
| Joint epochs | 500 | config.py:34 |
| FM epochs | 1000 | config.py:55 |
| Sampling steps | 100 | config.py:46 |
Based on the paper (Tables 4, 5, 6):
Shape Reconstruction (DrivAerNet++)
| Metric | Value |
|---|---|
| Overall Accuracy | 96.73% |
| Overall IoU | 91.89% |
| Sharp Accuracy | 95.64% |
| Sharp IoU | 91.50% |
Drag Estimation
| Metric | Value |
|---|---|
| MSE | 4.0 × 10^-5 |
| MAE | 4.83 × 10^-3 |
| Max AE | 2.70 × 10^-2 |
Pressure Estimation
| Metric | Value |
|---|---|
| MSE | 4.55 × 10^-2 |
| MAE | 1.09 × 10^-1 |
| Rel L2 | 20.02% |
| Rel L1 | 17.78% |
Physics-Guided Generation
| Metric | Value |
|---|---|
| F-score (0.01) | 89.65 |
| Chamfer Distance | 20.99 × 10^-3 |
| Overall Accuracy | 66.48% |
The implementation has been thoroughly verified:
- ✓ All equations from paper correctly implemented
- ✓ All hyperparameters match paper specifications
- ✓ Architecture details match paper descriptions
- ✓ Training loop executes without errors
- ✓ Gradient flow is healthy (no NaN/Inf)
- ✓ Checkpoint save/load works correctly
- ✓ Deterministic seeding implemented
- ✓ No hardcoded paths blocking execution
Run the verification test:
python test_training.pyTraining:
- SP-VAE Stage 1 (Shape): ~2 days on 4×A100 GPUs
- SP-VAE Stage 1 (Pressure): ~22 hours on 4×A100 GPUs
- SP-VAE Stage 1 (Drag): ~1 day on 1×A100 GPU
- SP-VAE Stage 2 (Joint): ~18 hours on 4×A100 GPUs
- Flow Matching: ~2-3 days on 4×A100 GPUs
Inference:
- Generation: ~210 seconds per sample (20 alternating iterations)
- Data Availability: DrivAerNet++ dataset must be obtained separately from the authors
- Computational Cost: Full training requires significant GPU resources (4×A100 GPUs)
- CFD Simulation: OpenFOAM integration for drag verification not included; requires external setup
- Image Encoding: DINOv2 model is downloaded automatically via torch.hub
For deterministic results:
- Use
--seed 42(or any fixed seed) in all commands - Use
--deterministicflag for fully deterministic mode (may impact performance) - RNG state is saved in checkpoints and restored on resume
@article{you2025physgen,
title={PhysGen: Physically Grounded 3D Shape Generation for Industrial Design},
author={You, Yingxuan and Zhao, Chen and Zhang, Hantao and Xu, Mingda and Fua, Pascal},
journal={arXiv preprint arXiv:2512.00422},
year={2025}
}[1] Chan et al. "Efficient geometry-aware 3D generative adversarial networks." CVPR 2022. [2] Chang et al. "ShapeNet: An Information-Rich 3D Model Repository." arXiv 2015. [3] Chen et al. "TripNet: Learning large-scale high-fidelity 3D car aerodynamics." arXiv 2025. [5] Chen et al. "Dora: Sampling and benchmarking for 3D shape variational auto-encoders." CVPR 2025. [10] Dhariwal & Nichol. "Diffusion models beat GANs on image synthesis." NeurIPS 2021. [11] Elrefaie et al. "DrivAerNet: A parametric car dataset for data-driven aerodynamic design." IDETC 2024. [12] Elrefaie et al. "DrivAerNet++: A Large-Scale Multimodal Car Dataset." arXiv 2024. [15] Hu et al. "Squeeze-and-excitation networks." CVPR 2018. [16] Jasak et al. "OpenFOAM: A C++ Library for Complex Physics Simulations." 2007. [22] Li et al. "TriposG: High-fidelity 3D shape synthesis using large-scale rectified flow models." arXiv 2025. [25] Liu et al. "Flow straight and fast: Learning to generate and transfer data with rectified flow." ICLR 2023. [26] Lorensen & Cline. "Marching cubes: A high resolution 3D surface construction algorithm." 1998. [29] Moenning & Dodgson. "Fast marching farthest point sampling." 2003. [30] Oquab et al. "DINOv2: Learning robust visual features without supervision." TMLR 2024. [32] Peebles & Xie. "Scalable diffusion models with transformers." ICCV 2023. [39] Sung et al. "BlendedNet: A blended wing body aircraft dataset." IDETC 2025. [41] Vatani et al. "TripOptimizer: Generative 3D shape optimization and drag prediction." arXiv 2025. [43] Wu et al. "Transolver: A fast transformer solver for PDEs on general geometries." ICML 2024. [48] Zhang et al. "3DShape2VecSet: A 3D Shape Representation for Neural Fields and Generative Diffusion Models." SIGGRAPH 2023.
This implementation is for research purposes only.