Skip to content

algorithmx/physgen

Repository files navigation

PhysGen: Physically Grounded 3D Shape Generation for Industrial Design

Disclaimer 1: This is NOT the official repository. Please check Yingxuan You's homepage for announcements on open-source code for the paper. Code is mostly written by AI-agents with adequate human guidance.

Disclaimer 2: DrivAerNet++ contains monstrous amount of 39TB data. Even downloading the absoultely necessary data for the present project is a quazy task (especially in China mainland!). This project has never been actually trained, however the downloading and training efforts are on-going.

A faithful reproduction of the paper "PhysGen: Physically Grounded 3D Shape Generation for Industrial Design" (arXiv:2512.00422) by Yingxuan You, Chen Zhao, Hantao Zhang, Mingda Xu, and Pascal Fua (CVLab, EPFL).

Overview

PhysGen is a physics-guided 3D shape generation framework that incorporates physical awareness into the generation process. The key innovation is an alternating update strategy that combines flow-matching velocity updates with physics-based refinement.

Key Features

  • SP-VAE (Shape-and-Physics VAE): Jointly encodes shape and physics information into a unified latent space
  • Flow Matching Model: Rectified flow formulation for shape generation using Diffusion Transformer (DiT)
  • Physics-Guided Generation: Alternating velocity-based updates and physical refinement (Algorithm 1)
  • Multi-Dataset Support: DrivAerNet++, ShapeNet, and BlendedNet datasets

Architecture

SP-VAE (Section 3.1)

The Shape-and-Physics VAE consists of four components:

Component Architecture Key Features
Shape Encoder Dual cross-attention Bidirectional attention between uniform and salient points (based on Dora [5])
Shape Decoder SDF-based decoding Cross-attention between query points and latent; predicts SDF values
Pressure Decoder Three-branch fusion Self-attention + Squeeze-Excitation + MLP with learnable fusion weights
Drag Decoder Three-layer MLP head Same three-branch architecture as pressure decoder; sigmoid output

Hyperparameters:

  • Latent dimension: 512
  • Encoder/Decoder layers: 6
  • Attention heads: 8
  • Hidden dimension: 512
  • SE reduction factor: 16

Training Strategy (Section 3.1.3)

Stage 1: Independent Training

  • Shape encoder-decoder: 1000 epochs with SDF + KL loss (λ_sdf=1.0, λ_KL=0.001)
  • Pressure decoder: 1500 epochs with MAE + MSE loss; encoder frozen
  • Drag decoder: 1500 epochs with MAE + MSE loss; encoder frozen

Stage 2: Joint Fine-tuning

  • All components: 500 epochs with combined loss
  • Loss weights: λ_shape=10, λ_press=0.1, λ_drag=10
  • Lower learning rate (1e-5)

Flow Matching (Section 3.2.1)

  • Rectified Flow Formulation [25]: Linear interpolation z_t = t·z_1 + (1-t)·ε
  • DiT Architecture: 12 layers, 512 hidden dim, 8 heads, MLP ratio 4.0
  • Sampling: 100 steps at inference using Euler integration
  • Physics-Aware Regularization: λ_d = 0.03 (Equation 10)

Physics-Guided Generation (Algorithm 1)

For K=20 alternating iterations:
  Phase 1: Velocity-based update (25 steps) with physics regularization
  Phase 2: Physical refinement (20 steps) using surface pressure
  Re-noise to t=0.75
Final velocity-based update from t=0.75 to t=1.0

Directional weights for physical refinement:

  • λ_x = 0.2 (drag minimization)
  • λ_y = 0.1 (lateral symmetry)
  • λ_z = 0.1 (negative lift for traction)

Installation

# Clone repository
git clone <repo-url>
cd physgen

# Install dependencies
pip install -r requirements.txt

Dependencies

Core:

  • Python 3.8+
  • PyTorch >= 2.0.0
  • NumPy, SciPy

3D Processing:

  • trimesh >= 3.21.0
  • pymcubes >= 0.1.4

Deep Learning:

  • torchvision, timm, einops

Optional:

  • OpenFOAM (for CFD simulation)

Data Preparation

DrivAerNet++ Dataset [11, 12]

The primary dataset for training SP-VAE and flow matching models. This is a large-scale aerodynamic design dataset:

  • Size: 8,000 high-quality vehicle geometries
  • Split: 5,819 training samples, 1,147 test samples
  • Physics Data: Each geometry includes high-fidelity CFD simulations:
    • Drag coefficients (dimensionless, C_d)
    • Surface pressure fields (continuous pressure values at surface points)
    • Volumetric 3D flow fields
    • Directional forces (drag, lateral, lift)
  • Vehicle Body Styles: Fastback, notchback, and estateback
  • Variations: Underbody structure and wheel configurations

What Data is Actually Required for Training

⚠️ Important Evidence from the Paper: The paper explicitly states the physics information used is A = {C_d, P} where:

  • C_d = "dimensionless coefficient of drag"
  • P = "pressure field measured at surface points" (Line 214-227)

The volumetric 3D flow fields are mentioned only as part of what the dataset CONTAINS (for reference), NOT for training.

The full DrivAerNet++ dataset is ~39TB due to high-resolution CFD data. However, only a small portion is required for training PhysGen:

Data Component Paper Evidence Used for Training File Location Approx. Size
Mesh Geometry Lines 339-360: SDF loss ✅ YES - Required meshes/*.obj ~1-2 GB
Surface Pressure Lines 366-373: "ground-truth surface pressure" ✅ YES - Required physics/*.npz (key: 'pressure') ~500 MB
Drag Coefficient Lines 374-393: "ground-truth drag coefficient" ✅ YES - Required physics/*.npz (key: 'drag') ~500 MB
Surface Point Locations Lines 394-399 ✅ YES - Required for pressure interpolation physics/*.npz (key: 'surface_points') ~500 MB
Volumetric Flow Fields Lines 868, 1654: "full 3D flow fields" (dataset contents only) ❌ NO - Not used for training flow_fields/* ~30+ TB

Direct Quotes from Paper:

"In this paper, we take A to be aerodynamic properties formulated as A = {C_d, P}, where C_d is the dimensionless coefficient of drag and P : R^3 → R^+ is the pressure field measured at surface points." (Lines 214-227)

"where ˆp and ˆC_d represent the ground-truth surface pressure and drag coefficient, respectively." (Lines 394-399)

Minimum Required Data (for training):

data/drivaernet++/
  meshes/
    *.obj           # 3D mesh geometry (~1-2 GB for 8000 meshes)
  physics/
    *.npz           # Contains: pressure, drag, surface_points (~1-2 GB)
  train.txt         # Training IDs
  test.txt          # Test IDs

What you can SKIP downloading:

  • ❌ Volumetric flow field data (the largest component at ~30TB)
  • ❌ High-resolution CFD mesh data used for simulations
  • ✅ Only needed for OpenFOAM validation at inference time (not training)

Important: The .npz files must contain at minimum:

  • pressure: Surface pressure values (array)
  • drag: Drag coefficient (scalar)
  • surface_points: 3D coordinates of surface points where pressure is measured

Data Format:

ShapeNet [2] (Generalization Testing)

Used for out-of-distribution evaluation:

  • Category: Vehicle/car split
  • Purpose: Test generalization to unseen geometries
  • Characteristics: Physically imperfect but geometrically reasonable meshes
  • Preprocessing: Shapes uniformly rescaled to match DrivAerNet++ scale
  • Usage: Used as initial shapes for physics-guided generation pipeline
  • Data Format:
data/shapenet/
  car/
    meshes/
      *.obj
    train.txt
    test.txt

BlendedNet [39] (Aircraft Generalization)

Used for cross-domain transferability evaluation:

  • Size: 999 distinct blended-wing-body (BWB) aircraft geometries
  • Conditions: ~9 aerodynamic conditions per geometry
  • Total Cases: 8,830 converged CFD simulations
  • Turbulence Model: Spalart-Allmaras
  • Mesh Size: 9-14 million cells per case
  • Purpose: Evaluate transfer to domains beyond ground vehicles (aircraft)

SP-VAE Input Requirements

For each 3D mesh during training, the following point clouds are extracted:

  • Uniform Surface Points (P_u): 32,768 points using Sharp Edge Sampling (SES)
  • Salient Edge Points (P_s): 32,768 points using SES
  • Query Points: Downsampled to 1,024 points each via Farthest Point Sampling (FPS), concatenated into 2,048-point set for cross-attention
  • Supervision:
    • Uniformly sampled coarse points in bounding box
    • Sharp points perturbed around ground-truth surface

Usage

Training SP-VAE

Stage 1: Train Shape Encoder-Decoder

python train_spvae.py \
  --data_dir data/drivaernet++ \
  --stage shape \
  --checkpoint_dir checkpoints \
  --seed 42 \
  --deterministic

Stage 1: Train Pressure Decoder

python train_spvae.py \
  --data_dir data/drivaernet++ \
  --stage pressure \
  --checkpoint_dir checkpoints \
  --seed 42

Stage 1: Train Drag Decoder

python train_spvae.py \
  --data_dir data/drivaernet++ \
  --stage drag \
  --checkpoint_dir checkpoints \
  --seed 42

Stage 2: Joint Fine-tuning

python train_spvae.py \
  --data_dir data/drivaernet++ \
  --stage joint \
  --checkpoint_dir checkpoints \
  --seed 42

Train All Stages:

python train_spvae.py \
  --data_dir data/drivaernet++ \
  --stage all \
  --checkpoint_dir checkpoints \
  --seed 42 \
  --deterministic

Training Flow Matching Model

python train_flow_matching.py \
  --data_dir data/drivaernet++ \
  --spvae_checkpoint checkpoints/spvae_final.pt \
  --checkpoint_dir checkpoints \
  --seed 42

Generation

Unconditional Generation

python generate.py \
  --spvae_checkpoint checkpoints/spvae_final.pt \
  --flow_matching_checkpoint checkpoints/flow_matching_final.pt \
  --mode unconditional \
  --num_samples 100 \
  --output_dir outputs/unconditional \
  --seed 42

Generation with Target Drag

python generate.py \
  --spvae_checkpoint checkpoints/spvae_final.pt \
  --flow_matching_checkpoint checkpoints/flow_matching_final.pt \
  --mode unconditional \
  --target_drag 0.25 \
  --num_samples 100 \
  --output_dir outputs/target_drag \
  --seed 42

Generation from Sketch

python generate.py \
  --spvae_checkpoint checkpoints/spvae_final.pt \
  --flow_matching_checkpoint checkpoints/flow_matching_final.pt \
  --mode sketch \
  --sketch_path sketches/car_sketch.png \
  --target_drag 0.25 \
  --output_dir outputs/from_sketch \
  --seed 42

Evaluation

Shape Reconstruction

python evaluate.py \
  --mode reconstruction \
  --spvae_checkpoint checkpoints/spvae_final.pt \
  --data_dir data/drivaernet++ \
  --output_file results/reconstruction.json \
  --seed 42

Physics Estimation

python evaluate.py \
  --mode physics \
  --spvae_checkpoint checkpoints/spvae_final.pt \
  --data_dir data/drivaernet++ \
  --output_file results/physics.json \
  --seed 42

Generation Quality

python evaluate.py \
  --mode generation \
  --generated_dir outputs/unconditional \
  --gt_dir data/drivaernet++/meshes \
  --output_file results/generation.json \
  --seed 42

File Structure

physgen/
├── configs/
│   ├── __init__.py
│   └── config.py                 # All hyperparameters from paper
├── data/
│   ├── __init__.py
│   └── dataset.py                # DrivAerNet++, ShapeNet, BlendedNet loaders
├── models/
│   ├── __init__.py
│   ├── attention.py              # Multi-head, cross-attention, DiT blocks, SE
│   ├── spvae.py                  # SP-VAE (encoder + 3 decoders)
│   ├── flow_matching.py          # Rectified flow with DiT
│   └── physics_guided.py         # Algorithm 1 implementation
├── utils/
│   ├── __init__.py
│   ├── geometry.py               # Point sampling, SDF, marching cubes
│   ├── metrics.py                # All evaluation metrics
│   └── seed.py                   # Deterministic seeding for reproducibility
├── train_spvae.py                # Two-stage SP-VAE training
├── train_flow_matching.py        # Flow matching training
├── generate.py                   # Inference script (Algorithm 1)
├── evaluate.py                   # Evaluation script
├── test_training.py              # Training loop verification test
├── requirements.txt              # Dependencies
└── README.md                     # This file

Implementation Details

Paper-to-Code Mapping

Paper Section Component File Status
3.1.1 Shape Encoder (dual cross-attention) models/spvae.py::ShapeEncoder
3.1.1 Shape Decoder (SDF-based) models/spvae.py::ShapeDecoder
3.1.2 Pressure Decoder (3-branch) models/spvae.py::PressureDecoder
3.1.2 Drag Decoder (3-layer MLP) models/spvae.py::DragDecoder
3.1.3 Stage 1: Independent training train_spvae.py
3.1.3 Stage 2: Joint fine-tuning train_spvae.py
3.2.1 Rectified flow formulation models/flow_matching.py
3.2.1 DiT architecture (12 layers, 512 dim) models/flow_matching.py::DiT
3.2.1 Physics-aware regularization models/flow_matching.py::sample_with_physics_guidance
3.2.2 Directional forces (Eq. 11) models/physics_guided.py::compute_directional_forces
3.2.2 Physics loss (Eq. 12-13) models/physics_guided.py::compute_physics_loss
3.2.3 Algorithm 1 (alternating update) models/physics_guided.py::generate
4.1 Evaluation metrics utils/metrics.py

Hyperparameter Verification

All hyperparameters are extracted verbatim from the paper:

Parameter Paper Value Code Location
Latent dimension 512 config.py:19
DiT depth 12 config.py:49
DiT hidden size 512 config.py:50
DiT num heads 8 config.py:51
λ_sdf 1.0 config.py:22
λ_KL 0.001 config.py:23
λ_shape 10 config.py:26
λ_press 0.1 config.py:27
λ_drag 10 config.py:28
λ_d 0.03 config.py:70
λ_x 0.2 config.py:76
λ_y 0.1 config.py:77
λ_z 0.1 config.py:78
K (alternating iters) 20 config.py:65
M (refinement steps) 20 config.py:67
t_ns (re-noising) 0.75 config.py:73
Shape epochs 1000 config.py:31
Pressure epochs 1500 config.py:32
Drag epochs 1500 config.py:33
Joint epochs 500 config.py:34
FM epochs 1000 config.py:55
Sampling steps 100 config.py:46

Expected Results

Based on the paper (Tables 4, 5, 6):

Shape Reconstruction (DrivAerNet++)

Metric Value
Overall Accuracy 96.73%
Overall IoU 91.89%
Sharp Accuracy 95.64%
Sharp IoU 91.50%

Drag Estimation

Metric Value
MSE 4.0 × 10^-5
MAE 4.83 × 10^-3
Max AE 2.70 × 10^-2

Pressure Estimation

Metric Value
MSE 4.55 × 10^-2
MAE 1.09 × 10^-1
Rel L2 20.02%
Rel L1 17.78%

Physics-Guided Generation

Metric Value
F-score (0.01) 89.65
Chamfer Distance 20.99 × 10^-3
Overall Accuracy 66.48%

Verification

The implementation has been thoroughly verified:

  • ✓ All equations from paper correctly implemented
  • ✓ All hyperparameters match paper specifications
  • ✓ Architecture details match paper descriptions
  • ✓ Training loop executes without errors
  • ✓ Gradient flow is healthy (no NaN/Inf)
  • ✓ Checkpoint save/load works correctly
  • ✓ Deterministic seeding implemented
  • ✓ No hardcoded paths blocking execution

Run the verification test:

python test_training.py

Computational Requirements

Training:

  • SP-VAE Stage 1 (Shape): ~2 days on 4×A100 GPUs
  • SP-VAE Stage 1 (Pressure): ~22 hours on 4×A100 GPUs
  • SP-VAE Stage 1 (Drag): ~1 day on 1×A100 GPU
  • SP-VAE Stage 2 (Joint): ~18 hours on 4×A100 GPUs
  • Flow Matching: ~2-3 days on 4×A100 GPUs

Inference:

  • Generation: ~210 seconds per sample (20 alternating iterations)

Known Limitations

  1. Data Availability: DrivAerNet++ dataset must be obtained separately from the authors
  2. Computational Cost: Full training requires significant GPU resources (4×A100 GPUs)
  3. CFD Simulation: OpenFOAM integration for drag verification not included; requires external setup
  4. Image Encoding: DINOv2 model is downloaded automatically via torch.hub

Reproducibility

For deterministic results:

  • Use --seed 42 (or any fixed seed) in all commands
  • Use --deterministic flag for fully deterministic mode (may impact performance)
  • RNG state is saved in checkpoints and restored on resume

Citation

@article{you2025physgen,
  title={PhysGen: Physically Grounded 3D Shape Generation for Industrial Design},
  author={You, Yingxuan and Zhao, Chen and Zhang, Hantao and Xu, Mingda and Fua, Pascal},
  journal={arXiv preprint arXiv:2512.00422},
  year={2025}
}

References

[1] Chan et al. "Efficient geometry-aware 3D generative adversarial networks." CVPR 2022. [2] Chang et al. "ShapeNet: An Information-Rich 3D Model Repository." arXiv 2015. [3] Chen et al. "TripNet: Learning large-scale high-fidelity 3D car aerodynamics." arXiv 2025. [5] Chen et al. "Dora: Sampling and benchmarking for 3D shape variational auto-encoders." CVPR 2025. [10] Dhariwal & Nichol. "Diffusion models beat GANs on image synthesis." NeurIPS 2021. [11] Elrefaie et al. "DrivAerNet: A parametric car dataset for data-driven aerodynamic design." IDETC 2024. [12] Elrefaie et al. "DrivAerNet++: A Large-Scale Multimodal Car Dataset." arXiv 2024. [15] Hu et al. "Squeeze-and-excitation networks." CVPR 2018. [16] Jasak et al. "OpenFOAM: A C++ Library for Complex Physics Simulations." 2007. [22] Li et al. "TriposG: High-fidelity 3D shape synthesis using large-scale rectified flow models." arXiv 2025. [25] Liu et al. "Flow straight and fast: Learning to generate and transfer data with rectified flow." ICLR 2023. [26] Lorensen & Cline. "Marching cubes: A high resolution 3D surface construction algorithm." 1998. [29] Moenning & Dodgson. "Fast marching farthest point sampling." 2003. [30] Oquab et al. "DINOv2: Learning robust visual features without supervision." TMLR 2024. [32] Peebles & Xie. "Scalable diffusion models with transformers." ICCV 2023. [39] Sung et al. "BlendedNet: A blended wing body aircraft dataset." IDETC 2025. [41] Vatani et al. "TripOptimizer: Generative 3D shape optimization and drag prediction." arXiv 2025. [43] Wu et al. "Transolver: A fast transformer solver for PDEs on general geometries." ICML 2024. [48] Zhang et al. "3DShape2VecSet: A 3D Shape Representation for Neural Fields and Generative Diffusion Models." SIGGRAPH 2023.

License

This implementation is for research purposes only.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages