add input design documentation to book source #3716

divine7022 · 2025-12-11T12:15:00Z

Description

Add comprehensive documentation for the `input_design` design matrix used to coordinate parameter draws and input file selection.

Closes #3677

Motivation and Context

Review Time Estimate

Immediately
Within one week
When possible

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation.
My name is in the list of CITATION.cff
I agree that PEcAn Project may distribute my contribution under any or all of
- the same license as the existing code,
- and/or the BSD 3-clause license.
I have updated the CHANGELOG.md.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.
I have added tests to cover my changes.
All new and existing tests passed.

mdietze · 2025-12-11T14:15:27Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+Multi-site ensembles and sensitivity analyses that sample over input files use
+an `input_design` data.frame to keep parameter draws and input files aligned
+across runs. The design is created up front (typically via
+`generate_joint_ensemble_design()` inside `runModule.run.write.configs()`) and


I'd drop the "inside" bit. I think the preferred usage would be to generate the design first then pass it in.

dropped and clarified

mdietze · 2025-12-11T14:18:29Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+`write.sa.configs()`. It is not saved automatically to `samples.Rdata`, so keep
+your copy if you need to reuse it.
+
+- **Parameter column:** `param` gives the 1-based index of the posterior draw to


Unclear what a "1-based index" is. You could just call this an "index" or "index (i.e. row number)"
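To illustrate the "index (i.e. row number)" reading suggested here, a minimal sketch in R. The `ensemble_samples` table and its values are made up for illustration and are not PEcAn output; only the `param` column name comes from the documentation under review.

```r
# Hypothetical posterior samples: one row per draw, one column per trait.
ensemble_samples <- data.frame(
  SLA   = c(12.1, 15.3, 9.8, 14.0),
  Vcmax = c(55, 62, 48, 70)
)

# Each value of `param` is the row number of the draw used for that run.
input_design <- data.frame(param = c(3, 1, 4))

# Select one row of samples per run, in design order.
run_params <- ensemble_samples[input_design$param, , drop = FALSE]
print(run_params)
```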

mdietze · 2025-12-11T14:21:02Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+across runs. The design is created up front (typically via
+`generate_joint_ensemble_design()` inside `runModule.run.write.configs()`) and
+passed through to `run.write.configs()`, `write.ensemble.configs()`, and
+`write.sa.configs()`. It is not saved automatically to `samples.Rdata`, so keep


AFAIK write.sa.configs doesn't currently use the input design, though we've discussed a future direction of pulling out the SA design code into the design function and then merging write.ensemble.configs & write.sa.configs, but that might not happen for months.

mdietze · 2025-12-11T14:22:19Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+
+- **Parameter column:** `param` gives the 1-based index of the posterior draw to
+  use for each run. `run.write.configs()` reads it when building ensemble
+  samples so leave it in the design even though the downstream config writers do


The "even though the downstream config writers do not reference it directly" sure sounds like a bug that should be fixed, not a "feature" that should be documented.

mdietze · 2025-12-11T14:22:53Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+  samples so leave it in the design even though the downstream config writers do
+  not reference it directly.
+- **Input columns:** any name that matches a tag under `run/inputs` (for example
+  `met`, `soil`, `veg`, `poolinitcond`). Values are 1-based indices into that


Drop "1-based" throughout the PR.

mdietze · 2025-12-11T14:23:41Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+- **Input columns:** any name that matches a tag under `run/inputs` (for example
+  `met`, `soil`, `veg`, `poolinitcond`). Values are 1-based indices into that
+  input’s `path` list. Leaving a column out keeps that input fixed across runs.
+- **Row count and order:** include at least one row per run you plan to write.


Drop the "at least" -- you can't specify multiple rows per run.
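For readers following the input-column description, here is a rough sketch of how a design row might map to input paths. The `inputs` list is a hypothetical stand-in for `settings$run$inputs`, `resolve_input()` is an illustrative helper rather than a PEcAn function, and the rule that an unsampled input uses its first path is an assumption.

```r
# Hypothetical stand-in for settings$run$inputs: each tag has a list of paths.
inputs <- list(
  met  = list(path = list("met_ens1.nc", "met_ens2.nc")),
  soil = list(path = list("soil_default.nc"))
)

# One row per run; there is no `soil` column, so soil stays fixed.
input_design <- data.frame(param = 1:3, met = c(2, 1, 2))

# Illustrative helper (not a PEcAn function): pick the path for one tag and run.
resolve_input <- function(tag, run) {
  paths <- inputs[[tag]]$path
  if (tag %in% names(input_design)) {
    paths[[input_design[[tag]][run]]]
  } else {
    paths[[1]]  # assumption: an unsampled input uses its first path
  }
}

met_paths  <- vapply(1:3, function(r) resolve_input("met", r), character(1))
soil_paths <- vapply(1:3, function(r) resolve_input("soil", r), character(1))
print(data.frame(run = 1:3, met = met_paths, soil = soil_paths))
```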

mdietze · 2025-12-11T14:25:36Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+- **Row count and order:** include at least one row per run you plan to write.
+  For ensembles this means `ensemble.size` rows; for sensitivity analysis it
+  should cover the median run (row 1) plus every trait/quantile combination in
+  the order they are written. Extra rows are ignored; too few will leave later


The text "Extra rows are ignored; too few will leave later
runs without input paths and lead to confusing readme/config files, so size
the table to cover every planned run" also sounds like a bug. I'd prefer the code threw an error before starting any runs if it detects a mismatch between the design size and ensemble size.

added validation in runModule.run.write.configs and run.write.configs
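A minimal sketch of the kind of up-front check requested above. `check_design_size()` is a hypothetical helper written for this example and is not the validation code merged in this PR.

```r
# Illustrative check (hypothetical helper, not the code merged in this PR).
check_design_size <- function(input_design, ensemble_size) {
  if (!is.null(ensemble_size) && nrow(input_design) != ensemble_size) {
    stop(
      "input_design has ", nrow(input_design),
      " rows but ensemble size is ", ensemble_size
    )
  }
  invisible(TRUE)
}

design <- data.frame(param = 1:3, met = c(1, 2, 1))
check_design_size(design, 3)  # passes silently

# A mismatch stops before any run is written.
result <- tryCatch(
  check_design_size(design, 5),
  error = function(e) conditionMessage(e)
)
print(result)
```

Failing before any config is written avoids the half-populated readme/config files described in the original text.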

mdietze · 2025-12-11T14:26:35Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+|------:|----:|-----:|
+| 1 | 1 | 1 |
+| 2 | 2 | 1 |
+| 3 | 2 | 2 |


I'd make the met column be 1,2,1,2 as that's an easier design to understand
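One way to build the alternating design suggested here, as a sketch. The run count and the number of met files are assumed values, and the column names follow the `param`/`met` naming used elsewhere in this PR.

```r
n_runs <- 4
n_met  <- 2  # assumed number of met files available

input_design <- data.frame(
  param = seq_len(n_runs),
  met   = rep_len(seq_len(n_met), n_runs)  # cycles 1, 2, 1, 2
)
print(input_design)
```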

infotroph · 2025-12-11T17:01:41Z

base/workflow/R/run.write.configs.R

+#' @param input_design data.frame design matrix linking parameter draws and any
+#'   sampled inputs across runs. Include a `param` column whose values select
+#'   rows from `trait.samples`/`ensemble.samples` plus optional columns named for
+#'   `settings$run$inputs` tags (e.g. `met`, `soil`) with 1-based indices into
+#'   each input's `path` list. Provide at least one row per planned run (median +
+#'   all SA members and/or `ensemble.size`). Usually generated by
+#'   `runModule.run.write.configs()` via `generate_joint_ensemble_design()`, but
+#'   custom designs may be supplied.


Since `run.write.configs` is typically called internally while `runModule.run.write.configs` is user-facing, I suggest moving the full details there and having the "as documented in..." here.

Suggested change

-#' @param input_design data.frame design matrix linking parameter draws and any
-#'   sampled inputs across runs. Include a `param` column whose values select
-#'   rows from `trait.samples`/`ensemble.samples` plus optional columns named for
-#'   `settings$run$inputs` tags (e.g. `met`, `soil`) with 1-based indices into
-#'   each input's `path` list. Provide at least one row per planned run (median +
-#'   all SA members and/or `ensemble.size`). Usually generated by
-#'   `runModule.run.write.configs()` via `generate_joint_ensemble_design()`, but
-#'   custom designs may be supplied.
+#' @param input_design data frame containing the design matrix describing parameter and input indices, as
+#'   documented in \code{runModule.run.write.configs()}.

infotroph · 2025-12-11T17:18:36Z

base/workflow/R/runModule.run.write.configs.R

+#' @param input_design design matrix describing parameter and input indices, as
+#'   documented in \code{run.write.configs()}. Defaults to the object returned by
+#'   \code{generate_joint_ensemble_design()} when NULL.


Suggested change

-#' @param input_design design matrix describing parameter and input indices, as
-#'   documented in \code{run.write.configs()}. Defaults to the object returned by
-#'   \code{generate_joint_ensemble_design()} when NULL.
+#' @param input_design data.frame design matrix linking parameter draws and any
+#'   sampled inputs across runs. Include a `param` column whose values select
+#'   rows from `trait.samples`/`ensemble.samples` plus optional columns named for
+#'   `settings$run$inputs` tags (e.g. `met`, `soil`) with 1-based indices into
+#'   each input's `path` list. Provide at least one row per planned run (median +
+#'   all SA members and/or `ensemble.size`). Usually generated by
+#'   `generate_joint_ensemble_design()` but custom designs may be supplied.
+#'   If NULL, `generate_joint_ensemble_design()` will be called internally.

infotroph · 2025-12-11T21:36:00Z

CHANGELOG.md

+- Added documentation for the `input_design` design matrix that coordinates 
+  parameter draws and input file selections across ensemble runs. The matrix 
+  requires a `param` column with parameter sample indices and one row per run 
+  (#3677).


I think this is covered by the block above

Suggested change

-- Added documentation for the `input_design` design matrix that coordinates
-  parameter draws and input file selections across ensemble runs. The matrix
-  requires a `param` column with parameter sample indices and one row per run
-  (#3677).

infotroph · 2025-12-11T21:41:02Z

book_source/03_topical_pages/03_pecan_xml.Rmd

+and passed to `runModule.run.write.configs()`. It is not saved automatically to
+`samples.Rdata`, so keep your copy if you need to reuse it.


Not for this PR, but flagging for improvement later: Needing this sentence is a good indicator the current behavior is unintuitive and the design should be saved somewhere by default (though not in samples.Rdata, as established in other threads).

I completely agree. Relying on the user to manually persist the design matrix is fragile.
This reminds me of a potential gap that could be filled here: #3708.
Since we are moving toward a runs_manifest.csv to track run metadata (like PFTs and traits), the natural home for the input design seems to be that same manifest. We could extend the manifest schema to include columns for the inputs (e.g. param_index, met_index, ...); that way runs_manifest.csv becomes the complete 'recipe'.
This would eliminate the need for a separate saved object entirely.
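As a rough illustration of that idea, the sketch below appends design columns to a manifest and round-trips it through CSV. The manifest layout, the `run_id` values, and the `_index` suffix are assumptions made for this example; the actual runs_manifest.csv schema is not defined in this thread.

```r
# Hypothetical manifest: real runs_manifest.csv columns may differ.
manifest <- data.frame(run_id = c("ENS-00001", "ENS-00002", "ENS-00003"))
input_design <- data.frame(param = 1:3, met = c(1, 2, 1))

# Suffix design columns with "_index", following the naming floated above.
design_cols <- input_design
names(design_cols) <- paste0(names(design_cols), "_index")
manifest <- cbind(manifest, design_cols)

# Round-trip through CSV to show the design can be recovered later.
path <- tempfile(fileext = ".csv")
write.csv(manifest, path, row.names = FALSE)
restored <- read.csv(path)
print(restored)
```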

mdietze · 2025-12-15T17:03:15Z

base/workflow/R/runModule.run.write.configs.R

+
+    # Validate design matrix size for MultiSettings
+    if (!is.null(settings$ensemble$size) && nrow(input_design) != settings$ensemble$size) {
+      stop("input_design has ", nrow(input_design), " rows but settings$ensemble$size is ",


@infotroph do you think these need to be switched to PEcAn's logger?
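For context on the question, a sketch of what routing the same message through PEcAn's logger could look like. It assumes `PEcAn.logger::logger.severe()` logs the message and then raises an error, and it falls back to base `stop()` when the package is not installed; `report_mismatch()` is a hypothetical wrapper, not code from this PR.

```r
# Sketch: report a design/ensemble size mismatch through PEcAn.logger when it
# is installed, falling back to base stop() otherwise.
report_mismatch <- function(n_rows, ensemble_size) {
  msg <- paste0(
    "input_design has ", n_rows,
    " rows but settings$ensemble$size is ", ensemble_size
  )
  if (requireNamespace("PEcAn.logger", quietly = TRUE)) {
    PEcAn.logger::logger.severe(msg)
  } else {
    stop(msg)
  }
}

# Either branch is expected to signal an error that callers can catch.
outcome <- tryCatch(
  report_mismatch(3, 5),
  error = function(e) "stopped"
)
```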

divine7022 added 8 commits December 11, 2025 11:35

update CHANGELOG.md

1498dee

update roxy

e4e26d6

update roxy

4dc4cd9

update run.write.configs.Rd

52e79ba

runModule.run.write.configs.Rd

6ef45da

update roxy

035a6c9

update write.ensemble.configs.Rd

2b252df

add input_design documentation to book

6a9161c

github-actions bot added Modules Base Documentation labels Dec 11, 2025

mdietze requested changes Dec 11, 2025

infotroph reviewed Dec 11, 2025

divine7022 added 6 commits December 11, 2025 19:18

made message clear

c5c76e8

add input_design row validation

bba6126

add input_design row validation to runModule.run.write.configs

2d5cf79

update roxy

3cba978

update roxy

6aa8237

update the documenation

4297f36

infotroph reviewed Dec 11, 2025

drop reduendent info

2b2c601

infotroph approved these changes Dec 12, 2025

mdietze reviewed Dec 15, 2025

mdietze approved these changes Dec 15, 2025

infotroph reviewed Dec 15, 2025

Apply suggestions from code review

4dc275d

infotroph merged commit de5bf23 into PecanProject:release/v1.10.0 Dec 15, 2025
19 of 26 checks passed
