Skip to content

Conversation

@bernalde
Copy link
Member

Title: Add constraint validation, solver-timeout docs, single-case runner, and tests

Summary

This PR introduces post-solve constraint validation for Pyomo benchmark runs, a lightweight
single-case runner to debug individual parameter combinations, documentation of the IPOPT
timeout semantics, a Future Work roadmap, and unit tests for the new validation and adapter
behavior. The changes are scoped to benchmarking utilities and do not modify core physics.

Key Changes

  • benchmarks/scripts/validate.py: Added validate_constraints() to check dryness, product
    temperature, and ramp-rate constraints and return detailed violation metrics.
  • benchmarks/scripts/run_single_case.py: New utility to run one scipy baseline + one Pyomo run
    with full terminal output for debugging (supports --tee, --solver-timeout, ramp overrides).
  • benchmarks/src/adapters.py: Ensure CLI ramp overrides are applied before extracting
    ramp_constraints metadata; pass solver_timeout through to optimizers.
  • lyopronto/pyomo_models/optimizers.py: Add solver_timeout parameter usage; set IPOPT
    max_cpu_time (CPU seconds). Note: IPOPT <=3.13 does not support max_wall_time.
  • benchmarks/notebooks/grid_analysis_SIMPLE.ipynb: Add JSONL v2 schema explanation and an
    aggregated constraint-metrics summary cell for analysis convenience.
  • docs/FUTURE_WORK.md: Roadmap capturing next steps (wall-time guard, logging refactor, tests).
  • tests/test_benchmarks_validate.py: New unit tests for validate_constraints and adapter ramp
    override logic.
  • .gitignore: Ignore generated benchmark artifacts under benchmarks/results/.

Why

Earlier cases showed that Pyomo solvers reported success even when constraints (drying target or
ramp limits) were violated; this PR fixes that by validating trajectories post-solve and marking
failed runs appropriately in analysis. It also supplies developer tooling to reproduce and
diagnose problematic parameter combinations faster.

Testing / Validation Steps

Local (quick):

conda activate lyopronto
pytest tests/test_benchmarks_validate.py -q
pytest tests -q  # full suite (takes ~5 minutes)

Manual scenario reproduction:

# Run a single-case debug with verbose IPOPT output
python benchmarks/scripts/run_single_case.py \
  --task both --R0 0.4 --A1 20 --method fd --n-elements 1000 \
  --ramp-Tsh-max 40 --ramp-Pch-max 0.05 --solver-timeout 180 --tee

Review Checklist

  • Code compiles and tests pass (CI will run full suite)
  • Confirm validate_constraints metrics are included in JSONL output when running
    grid_cli.py generate ... (this is done by grid_cli integration)
  • Documentation updated: README and notebook reflect behavior and schema
  • Decide fate of legacy benchmark scripts deleted from repo (separate cleanup PR recommended)

Risks and Notes

  • IPOPT max_cpu_time limits CPU time, not wall-clock time. Upgrading to IPOPT 3.14 enables
    max_wall_time. For environments where wall time is essential, use a process-level wrapper
    (e.g., timeout) or upgrade the solver.
  • Staged solve prints can be noisy; consider centralizing logging behind a verbosity flag.

Files Changed (high level)

  • benchmarks/scripts/validate.py (new/updated)
  • benchmarks/scripts/run_single_case.py (new)
  • benchmarks/src/adapters.py (updated)
  • lyopronto/pyomo_models/optimizers.py (updated)
  • benchmarks/notebooks/grid_analysis_SIMPLE.ipynb (updated)
  • docs/FUTURE_WORK.md (new)
  • tests/test_benchmarks_validate.py (new)
  • .gitignore (updated)

Suggested Reviewers

  • @team-lyo (benchmark maintainers)
  • @physics-dev (physics/ops team for solver semantics)

Notes for merging

  • Target branch: pyomo (this PR should be merged into the pyomo integration branch)
  • After merge: open follow-up PRs for test expansion and logging refactor as described in
    docs/FUTURE_WORK.md.

Generated by automation; edit before submitting the PR if needed.

- Grid: grid.jsonl + ratio CSV/PNG for product.A1 x ht.KC
- Batch: series_baseline.jsonl and summary JSON
- Includes objective_time_hr and solver metadata in records
Critical Bug Fix:
- Fix IPOPT warmstart options being unconditionally enabled
- Now properly conditional on warmstart_scipy parameter
- Affects optimize_Tsh_pyomo, optimize_Pch_pyomo, optimize_Pch_Tsh_pyomo
- Prevents state leakage between independent benchmark runs

Benchmark Infrastructure:
- Add comprehensive CLI-based benchmarking system (grid_cli.py)
- Add adapter layer for scipy/Pyomo optimizer integration
- Add JSONL-based result schema with structured validation
- Add Makefile for common benchmark workflows

Benchmark Analysis:
- Add Jupyter notebook for visualization (grid_analysis.ipynb)
- Add baseline 3×3 Tsh grid results (A1 × KC parameter sweep)
- Add heatmaps showing objective parity and speedup
- Demonstrate ~5-10% objective difference, ~250× average speedup

Documentation:
- docs/BENCHMARK_WARMSTART_FIX.md - Root cause and fix details
- docs/BENCHMARK_REFACTOR_COMPLETE.md - Architecture overview
- benchmarks/README.md - Usage guide and CLI reference
- benchmarks/results/README.md - Version control policy

Version Control:
- .gitignore for generated benchmark data
- Keep representative baseline examples for documentation
- All generated data is reproducible via CLI

Verification:
- Regenerated benchmarks confirm no warmstart state leakage
- FD: -5.23% objective difference, 289× speedup
- Collocation: -9.14% objective difference, 267× speedup
- First-run overhead (~1s) is expected (JIT/library initialization)
- Document CLI-based benchmarking workflow
- Add quick-start examples for parameter grid benchmarks
- Reference Makefile shortcuts and detailed documentation
Critical Fix:
- Original schedule was too short (35 hours), causing all Pch benchmarks to fail
- Product dried only 70% before schedule expired with baseline A1=16
- Higher resistance products (A1=18, A1=20) require even longer times

Root Cause:
- Drying time scales with product resistance (A1 parameter)
- A1=16: ~70 hours needed
- A1=18: ~78 hours needed
- A1=20: ~86 hours needed
- Original schedule [300, 1800] minutes = 35 hours total (insufficient)

Solution:
- Increased schedule to [300, 5700] minutes = 100 hours total
- Provides margin for parameter grid variations
- Applied to both scipy and Pyomo Pch adapters

Verification:
- All 27 benchmark combinations now pass (9 scipy + 9 FD + 9 colloc)
- 3×3 parameter grid: A1 ∈ {16, 18, 20} × KC ∈ {2.75e-4, 3.3e-4, 4.0e-4}

Results (baseline_Pch_3x3.jsonl):
- Scipy: 76±7 hr objective, 50±4 s wall time
- FD: 11±1 hr objective (-85%), 0.2 s wall time (1244× speedup)
- Collocation: 11±1 hr objective (-86%), <0.1 s wall time (1187× speedup)

Note: Pyomo finds dramatically better pressure trajectories (~7× faster drying)
than scipy's sequential optimization approach. This is expected and demonstrates
the value of simultaneous optimization.
Introduce single-case runner (run_single_case.py) for interactive Pyomo vs scipy debugging.
Add Future Work roadmap (docs/FUTURE_WORK.md).
Enhance analysis notebook with JSONL schema & constraint metrics summary.
Extend IPOPT configuration with explicit CPU-time note.
Update .gitignore to exclude generated benchmark artifacts.
Update README benchmarking workflow.
@bernalde bernalde self-assigned this Nov 24, 2025
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces post-solve constraint validation for Pyomo benchmark runs, a lightweight single-case runner for debugging individual parameter combinations, documentation of IPOPT timeout semantics, a future work roadmap, and unit tests for validation and adapter behavior. The changes are focused on benchmarking utilities and do not modify core physics code.

Key Changes

  • Added constraint validation (validate_constraints()) to verify dryness, temperature, and ramp-rate constraints post-solve
  • Created run_single_case.py for debugging individual parameter combinations with full terminal output
  • Updated adapter logic to pass solver_timeout and apply CLI ramp overrides before extracting metadata
  • Enhanced optimizers.py with solver_timeout parameter, ramp-rate constraints, and metadata tracking
  • Added new example (example_ramp_constraints.py) demonstrating ramp-rate constraint usage
  • Created comprehensive tests in test_benchmarks_validate.py

Reviewed changes

Copilot reviewed 61 out of 112 changed files in this pull request and generated 33 comments.

Show a summary per file
File Description
tests/test_benchmarks_validate.py New test file for constraint validation and adapter behavior (118 lines)
requirements.txt Added seaborn>=0.12.0 dependency
lyopronto/pyomo_models/optimizers.py Added solver timeout, ramp constraints, metadata tracking, treat_n_elements_as_effective parameter
examples/example_ramp_constraints.py New example demonstrating ramp-rate constraints with 4 scenarios
docs/archive/*.md Multiple archive documentation files (historical reference)
benchmarks/src/__init__.py New package initialization for benchmarks core modules
benchmarks/results/.gitignore Added to ignore generated benchmark results

meta: Dict[str, Any] = {}
model_ref = None
results_ref = None
solver_opts = {}
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable solver_opts is not used.

Copilot uses AI. Check for mistakes.
return pd.DataFrame()

# Get parameter names from first row
param1_name = method_df.iloc[0]['param1_name']
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable param1_name is not used.

Copilot uses AI. Check for mistakes.

# Get parameter names from first row
param1_name = method_df.iloc[0]['param1_name']
param2_name = method_df.iloc[0]['param2_name']
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable param2_name is not used.

Copilot uses AI. Check for mistakes.

# Expected patterns
valid_prefixes = ['traj_', 'heatmap_obj_', 'heatmap_speed_', 'nominal_']
valid_suffixes = ['_fd.png', '_colloc.png', '_combined.png', '.png']
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable valid_suffixes is not used.

Copilot uses AI. Check for mistakes.
return 0

base_scen = copy.deepcopy(SCENARIOS[scenario_name])
vial = base_scen["vial"]
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable vial is not used.

Copilot uses AI. Check for mistakes.
from pathlib import Path
from typing import Any, Dict

import numpy as np
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'np' is not used.

Copilot uses AI. Check for mistakes.
"""
from __future__ import annotations
import sys, platform, datetime, json, hashlib
from typing import Any, Dict, List, Tuple
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'Tuple' is not used.
Import of 'List' is not used.

Copilot uses AI. Check for mistakes.
)
if result.returncode == 0:
return result.stdout.strip()[:12] # Short SHA
except (subprocess.TimeoutExpired, FileNotFoundError, Exception):
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Copilot uses AI. Check for mistakes.
'ncp': None if use_finite_differences else n_collocation,
'treat_effective': treat_n_elements_as_effective if not use_finite_differences else False,
}
except Exception:
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Copilot uses AI. Check for mistakes.
import numpy as _np
if isinstance(o, _np.ndarray):
return o.tolist()
except Exception:
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants