@yantosca yantosca commented Feb 12, 2024

Name and Institution (Required)

Name: Bob Yantosca
Institution: Harvard + GCST

Confirm you have reviewed the following documentation

Describe the update

For legacy reasons, several benchmark-related files lived in the gcpy/ folder. These have now been moved into the gcpy/benchmark/modules/ folder (along with several YAML configuration files). Imports and function calls in the various benchmark routines have been updated accordingly.

Expected changes

Prior to this update, the gcpy/gcpy/ folder contained these files:

aod_species.yml           emission_inventories.yml       __pycache__/
append_grid_corners.py    emission_species.yml           raveller_1D.py
benchmark/                examples/                      regrid.py
benchmark_categories.yml  file_regrid.py                 regrid_restart_file.py
benchmark_funcs.py        grid.py                        species_database.yml
bpch_to_nc_names.yml      grid_stretching_transforms.py  ste_flux.py
budget_ox.py              __init__.py                    units.py
budget_tt.py              lumped_species.yml             util.py
constants.py              mean_oh_from_logs.py           _version.py
cstools.py                oh_metrics.py
date_time.py              plot/

After this update, it contains these files:

append_grid_corners.py  date_time.py                   plot/                   units.py
benchmark/              examples/                      plot.py                 util.py
bpch_to_nc_names.yml    file_regrid.py                 __pycache__/            _version.py
colormaps/              grid.py                        raveller_1D.py
constants.py            grid_stretching_transforms.py  regrid.py
cstools.py              __init__.py                    regrid_restart_file.py

and the gcpy/benchmark/modules folder now contains these files:

aod_species.yml                budget_ox.py               oh_metrics.py
benchmark_categories.yml       budget_tt.py               __pycache__/
benchmark_drydep.py            emission_inventories.yml   README.md
benchmark_funcs.py             emission_species.yml       run_1yr_fullchem_benchmark.py
benchmark_models_vs_obs.py     GC_72_vertical_levels.csv  run_1yr_tt_benchmark.py
benchmark_models_vs_sondes.py  __init__.py                species_database.yml
benchmark_utils.py             lumped_species.yml         ste_flux.py

Furthermore, several benchmark-related functions have been moved out of gcpy/util.py and into gcpy/benchmark/modules/benchmark_utils.py. This helps to keep the general GCPy routines separate from the benchmark-specific routines.

Also, the functionality to rename variables starting with SpeciesConc_ to SpeciesConcVV_ has been moved into the rename_speciesconc_to_speciesconcvv function in gcpy/benchmark/modules/benchmark_utils.py.
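
As a rough illustration of the renaming described above, the sketch below builds a rename mapping from a list of variable names. It is not the actual implementation of rename_speciesconc_to_speciesconcvv; the helper name `speciesconc_rename_map` and the sample variable names are hypothetical.

```python
def speciesconc_rename_map(var_names):
    """Map each 'SpeciesConc_*' name to its 'SpeciesConcVV_*' equivalent.

    Hypothetical helper for illustration only; names that do not start
    with 'SpeciesConc_' are left out of the mapping.
    """
    old_prefix = "SpeciesConc_"
    new_prefix = "SpeciesConcVV_"
    return {
        name: new_prefix + name[len(old_prefix):]
        for name in var_names
        if name.startswith(old_prefix)
    }


# Sample variable names (assumed, not taken from a real file)
names = ["SpeciesConc_O3", "SpeciesConcVV_NO", "Met_T"]
rename_map = speciesconc_rename_map(names)

# With an xarray Dataset `ds`, the mapping could then be applied
# with ds.rename(rename_map).
```

Names that already carry the SpeciesConcVV_ prefix do not match the old prefix, so applying the mapping twice is harmless.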

Lastly, this PR fixes a few items from #288, namely:

  • The YAML configuration file is now copied to each of the benchmark results folders (i.e. BenchmarkResults/GCC_version_comparison, BenchmarkResults/GCHP_GCC_comparison, BenchmarkResults/GCHP_version_comparison), not just for the TransportTracers benchmark, but for all benchmark types.
  • In TransportTracers benchmarks:
    - Mass conservation tables are created for Ref and Dev
    - Mass tables are created for Ref and Dev
    - Radionuclide budget tables are created for Ref and Dev
    - STE flux tables are created for Ref and Dev (GCC vs GCC only)
  • A bug was fixed where the directory was used in several tables instead of the version string.

Tagging @lizziel.

Related Github Issue(s)

gcpy/benchmark/benchmark_slurm.sh
- "${config/.yml/.log}: -> "${config/.yml/.log}"

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/aod_species.yml
gcpy/benchmark_categories.yml
gcpy/benchmark_funcs.py
gcpy/budget_ox.py
gcpy/budget_tt.py
gcpy/emission_inventories.yml
gcpy/emission_species.yml
gcpy/lumped_species.yml
gcpy/oh_metrics.py
gcpy/species_database.yml
gcpy/ste_flux.py
- Moved from gcpy -> gcpy/benchmark/modules

gcpy/benchmark/run_benchmark.py
gcpy/benchmark/modules/run_1yr_fullchem_benchmark.py
gcpy/benchmark/modules/run_1yr_tt_benchmark.py
- Updated import statements accordingly
- Import benchmark routines by name

gcpy/mean_oh_from_logs.py
- Removed, this is obsolete

gcpy/__init__.py
- Removed references to benchmark_funcs, budget_ox, budget_tt,
  mean_oh_from_logs, oh_metrics, and ste_flux

CHANGELOG.md
- Updated accordingly

The following benchmark-related functions were moved from
gcpy/util.py to gcpy/benchmark/modules/benchmark_utils.py:
- get_species_categories
- archive_species_categories
- get_lumped_species_definitions
- archive_lumped_species_definitions
- add_lumped_species_to_dataset

Also:

gcpy/benchmark/modules/benchmark_utils.py
- Now import xarray
- Add constants for YAML files that are now contained in the
  gcpy/benchmark/modules folder.  These should not change very much,
  except when benchmark categories or lumped species change.
- Updated Pydoc headers

gcpy/benchmark/modules/benchmark_funcs.py
- Now import several constants and functions from benchmark_utils.py
- Removed local constants for YAML files
- Updated function calls accordingly to be consistent with the new
  functions in benchmark_utils.py

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
@yantosca yantosca added the "category: Feature Request", "topic: Benchmark Plots and Tables", and "topic: Structural Modifications" labels Feb 12, 2024
@yantosca yantosca self-assigned this Feb 12, 2024
gcpy/benchmark/modules/benchmark_utils.py
- Now use "ofile" instead of "lumped_spc" in print statements
  in these functions:
  - archive_lumped_species_definitions
  - archive_species_categories

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/util.py
- Removed "import shutil" statement

gcpy/benchmark/modules/benchmark_utils.py
- Added "import shutil" statement.  This is necessary to use the
  shutil.copyfile command

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
…file

docs/requirements.txt
- Symbolic link to docs/environment_files/read_the_docs_requirements.txt.
  The online RTD build looks for a requirements.txt file.

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/benchmark_utils.py
- Added function "rename_speciesconc_to_speciesconcvv", to centralize
  the renaming of variables starting with "SpeciesConc_" to
  "SpeciesConcVV_". This is necessary for backwards compatibility
  with GEOS-Chem versions prior to 14.1.0.

gcpy/benchmark/modules/benchmark_funcs.py
gcpy/benchmark/modules/benchmark_models_vs_obs.py
gcpy/benchmark/modules/benchmark_models_vs_sondes.py
gcpy/benchmark/modules/budget_tt.py
- Now import and call "rename_speciesconc_to_speciesconcvv" from
  benchmark_utils.py

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/benchmark/modules/budget_tt.py
- Add quiet=True to read_config_file to suppress printout of the
  species_database.yml path
- Add the version number (globvars.devstr) to the file path, so we can
  save Ref & Dev radionuclide budget tables

gcpy/benchmark/modules/ste_flux.py
- Add the version number (globvars.devstr) to the file path, so we
  can save Ref & Dev STE flux tables
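
A minimal sketch of why the version string matters in the file path: without it, the Ref and Dev tables would be written to the same file. The helper `versioned_table_path`, the base name, and the version strings below are hypothetical and do not reflect the actual naming used in budget_tt.py or ste_flux.py.

```python
import os


def versioned_table_path(dst, basename, version, ext=".txt"):
    """Return a table path that embeds the version string.

    Hypothetical helper: embedding the version keeps Ref and Dev
    output from overwriting each other in the same folder.
    """
    return os.path.join(dst, f"{basename}.{version}{ext}")


# Assumed folder, base name, and version strings, for illustration only
ref_path = versioned_table_path("Tables", "budget_table", "ref_version")
dev_path = versioned_table_path("Tables", "budget_table", "dev_version")
```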

gcpy/benchmark/modules/benchmark_funcs.py
- Updated Pydoc headers
- Add the version number to the output file name in routine
  make_benchmark_mass_conservation_table

gcpy/benchmark/modules/run_1yr_tt_benchmark.py
- Updated function calls to transport_tracers_budgets,
  make_benchmark_ste_tables, and make_benchmark_massconv_tables so that
  we can now print Ref & Dev tables for GCC vs GCC, GCHP vs GCC, and
  GCHP vs GCHP comparisons.

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
gcpy/util.py
- Add new function, copy_file_to_dir, which is a wrapper for the
  shutil.copyfile function.  This abstracts the use of shutil.copyfile
  to a single location.
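
A wrapper of this kind might look like the sketch below. Only the function name and its use of shutil.copyfile come from the commit message; the argument names, the directory creation, and the return value are assumptions.

```python
import os
import shutil


def copy_file_to_dir(ifile, dest_dir):
    """Copy ifile into dest_dir, keeping its base name.

    Sketch only: the argument names and the returned path are
    assumptions, not the actual signature in gcpy/util.py.
    """
    # Assumption: create the destination folder if it is missing
    os.makedirs(dest_dir, exist_ok=True)
    ofile = os.path.join(dest_dir, os.path.basename(ifile))
    shutil.copyfile(ifile, ofile)
    return ofile
```

Keeping the copy logic in one place means callers such as the benchmark scripts only need to pass a source file and a target folder.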

gcpy/benchmark/modules/run_1yr_fullchem_benchmark.py
gcpy/benchmark/modules/run_1yr_tt_benchmark.py
gcpy/benchmark/run_benchmark.py
- Updated Pydoc header documentation w/ more up-to-date usage information
- Import copy_file_to_dir from gcpy/util.py
- No longer import shutil.copyfile
- Call copy_file_to_dir to copy __file__ (i.e. the name of the script
  being run) to the benchmark results directories
- Call copy_file_to_dir to copy the YAML configuration file
  to the benchmark results directories

Also in gcpy/benchmark/run_benchmark.py:
- In main: Add the configuration file name (the 1st argument)
  to the config dict (as config["configuration_file_name"]) so that
  we can use it to copy the config file to the benchmark results
  directories.
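
The change to main can be pictured roughly as follows. This is a sketch under assumptions: the YAML parsing is replaced by an empty placeholder dict, and the helper `attach_config_file_name` and the file name `my_benchmark.yml` are hypothetical. Only the key config["configuration_file_name"] comes from the commit message.

```python
def attach_config_file_name(config, config_file):
    """Store the configuration file name in the config dict.

    Hypothetical helper; whether the real code stores the name
    exactly as passed is an assumption.
    """
    config["configuration_file_name"] = config_file
    return config


def main(argv):
    """Take the configuration file name from the 1st argument."""
    config_file = argv[1]
    config = {}  # placeholder for the parsed YAML settings
    return attach_config_file_name(config, config_file)


# Hypothetical command line: run_benchmark.py my_benchmark.yml
config = main(["run_benchmark.py", "my_benchmark.yml"])
```

Later steps can then read the name back from the dict when copying the file into each results folder.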

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
@yantosca yantosca linked an issue Feb 21, 2024 that may be closed by this pull request
@yantosca yantosca requested a review from lizziel February 21, 2024 21:45
@yantosca yantosca marked this pull request as ready for review February 21, 2024 21:45
@yantosca yantosca changed the title from "[WIP] Move remaining benchmark-related code out of gcpy and into gcpy/benchmark/modules folder" to "Move remaining benchmark-related code out of gcpy and into gcpy/benchmark/modules folder (plus other improvements)" Feb 21, 2024
gcpy/benchmark/modules/benchmark_utils.py
- Now call util.copy_file_to_dir instead of shutil.copyfile
  in the following functions:
  - archive_lumped_species_definitions
  - archive_species_categories

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
Comment on lines 254 to 255
#bmk_mons_gchp_ref = all_months_gchp_ref[bmk_mon_inds]
#bmk_sec_per_month_ref = sec_per_month_ref[bmk_mon_inds]

Can these lines be removed?

@msulprizio: Now fixed in commit 3b380be.

gcpy/benchmark/modules/run_1yr_fullchem_benchmark.py
- Fixed incorrect comment in Pydoc header

gcpy/benchmark/modules/run_1yr_tt_benchmark.py
- Bug fix: gchp_vs_gchp_refrstdir should use ["data"]["ref"] instead
  of ["data"]["dev"] search keys
- Removed commented-out code that defines bmk_mons_gchp_{ref,dev}
  and bmk_sec_per_month_{ref,dev} variables; these aren't used.

@lizziel lizziel left a comment

Approved!

@yantosca yantosca merged commit f7d0dc5 into dev Mar 8, 2024
@yantosca yantosca deleted the feature/move-bmk-code-to-gcpy-benchmark-folder branch March 8, 2024 18:19

Successfully merging this pull request may close these issues.

[FEATURE REQUEST] Transport tracer benchmark improvements