yantosca
Contributor

This PR fixes an issue that seems to have been introduced with recent versions of xarray. The following updates were made:

(1) The following code in routine get_extent_for_colors (in gcpy/plot.py):

          return ds_new.where(\
              ds_new[lon_var] >= minlon, drop=True).\
              where(ds_new[lon_var] <= maxlon, drop=True).\
              where(ds_new[lat_var]>= minlat, drop=True).\
              where(ds_new[lat_var] <= maxlat, drop=True)

needed to be changed to

          # Add .compute() to force evaluation of ds_new[lon_var]
          # See https://github.com/geoschem/gcpy/issues/254
          # Also note: This may return as a dask.array.Array object
          return ds_new.where(\
              ds_new[lon_var].compute() >= minlon, drop=True).\
              where(ds_new[lon_var].compute() <= maxlon, drop=True).\
              where(ds_new[lat_var].compute() >= minlat, drop=True).\
              where(ds_new[lat_var].compute() <= maxlat, drop=True)

as calling where with drop=True on an xarray object silently evaluates the data. Using .compute() forces xarray to do the actual computation. This behavior seems to have changed in xarray recently. For a similar issue, see: hainegroup/oceanspy#332. The object returned also seems to be of type dask.array.Array instead of xarray.DataArray or numpy.ndarray.
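
For reference, the extent filter described above can be reproduced in isolation. This is only a minimal sketch under stated assumptions: subset_by_extent is a hypothetical name (not the GCPy routine), the dataset is a small in-memory example with 1-D lon/lat coordinates, and the variable name conc is invented. On in-memory data, .compute() has no visible effect; it is included to mirror the fix for the case where the coordinate data are loaded lazily.

```python
import numpy as np
import xarray as xr


def subset_by_extent(ds, lon_var, lat_var, minlon, maxlon, minlat, maxlat):
    """Hypothetical helper: keep only points inside the lon/lat box.

    .compute() forces evaluation of the coordinate data before it is
    used as a condition in where(..., drop=True).
    """
    lon = ds[lon_var].compute()
    lat = ds[lat_var].compute()
    return (
        ds.where(lon >= minlon, drop=True)
        .where(lon <= maxlon, drop=True)
        .where(lat >= minlat, drop=True)
        .where(lat <= maxlat, drop=True)
    )


# Small illustrative dataset (values and names are assumptions)
ds_new = xr.Dataset(
    {"conc": (("lat", "lon"), np.arange(25.0).reshape(5, 5))},
    coords={
        "lat": np.array([-20.0, -10.0, 0.0, 10.0, 20.0]),
        "lon": np.array([0.0, 10.0, 20.0, 30.0, 40.0]),
    },
)

result = subset_by_extent(ds_new, "lon", "lat", 10.0, 30.0, -10.0, 10.0)
print(result.sizes)
```

With a Dask-backed dataset the same call pattern applies; the explicit .compute() is what makes the conditions concrete before xarray drops the out-of-range points.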

(2) We must now add this import statement:

from dask.array import Array as DaskArray

so that DaskArray can be included in the allowed types passed to verify_variable_type.

(3) We must now also add DaskArray to the calls to verify_variable_type in six_plot and single_panel in plot.py:

    verify_variable_type(plot_val, (np.ndarray, xr.DataArray, DaskArray))
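
To illustrate why the extra type is needed, here is a minimal sketch of a tuple-based type check. check_plot_type is a hypothetical stand-in written for this example; it is not the GCPy verify_variable_type implementation, which is not shown in this PR. The sketch assumes dask is installed.

```python
import numpy as np
import xarray as xr
import dask.array as da
from dask.array import Array as DaskArray


def check_plot_type(var, allowed_types):
    """Hypothetical stand-in for a type check: raise TypeError
    if var is not an instance of one of allowed_types."""
    if not isinstance(var, allowed_types):
        raise TypeError(
            f"Got {type(var).__name__}, expected one of: "
            + ", ".join(t.__name__ for t in allowed_types)
        )


ALLOWED = (np.ndarray, xr.DataArray, DaskArray)

# A Dask array is neither a numpy.ndarray nor an xarray.DataArray,
# so it only passes once DaskArray is part of the allowed tuple.
plot_val = da.ones((4, 4), chunks=2)
check_plot_type(plot_val, ALLOWED)

try:
    check_plot_type(plot_val, (np.ndarray, xr.DataArray))
    rejected_without_dask = False
except TypeError:
    rejected_without_dask = True

print(rejected_without_dask)
```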

(4) Update Pydoc headers accordingly:

        """
        ... etc ...

        plot_vals: xarray.DataArray, numpy.ndarray, or dask.array.Array
            Single data variable GEOS-Chem output to plot

        ... etc ...
        """

(5) Because these fixes allow benchmark plots to proceed, we can remove the pegged xarray entry from environment.yml:

    #
    # NOTE: The most recent xarray (2023.8.0) seems to break backwards
    # compatibility with the benchmark plotting code.  Peg to 2023.2.0
    # until we can update GCPy for the most recent xarray.
    #  -- Bob Yantosca (29 Aug 2023)
    #
    - xarray==2023.2.0                # Read data from netCDF etc files

and replace it with

    - xarray                          # Read data from netCDF etc files

gcpy/plot.py
- Import the dask.array.Array type definition (as DaskArray)
- Update the calls to verify_variable_type in routines "six_plot"
  and "single_panel" so that allowable input arguments may be of type
  xarray.DataArray, numpy.ndarray, or dask.array.Array.
- In internal routine "get_extent_for_colors" (located within the
  "compare_single_level" routine), we must now use the expression
  ds_new[lon_var].compute() so that Xarray will evaluate the
  "ds_new[lon_var]" expression.  This may return as a dask.array.Array.
- Updated Pydoc comments

We have confirmed that this update works with xarray==2023.8.0.

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
docs/environment_files/environment.yml
- In the prior commit we have added updates to plot.py that render
  the pegging of xarray to version 2023.2.0 unnecessary.  Restore
  the original code in the GCPy environment.yml file.

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
plot.py
- Updated Pydoc in "six_plot" to state that plot_val can be
  of type xarray.DataArray, numpy.ndarray, dask.array.Array

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
@yantosca added the "topic: Benchmark Plots and Tables" and "category: Bug Fix" labels Aug 29, 2023
@yantosca added this to the 1.4.0 milestone Aug 29, 2023
@yantosca requested a review from msulprizio August 29, 2023 21:44
@yantosca self-assigned this Aug 29, 2023
@msulprizio
Contributor

This fix resolves the error reported in #254.

gcpy/plot.py
- "plot_vals" should be "plot_val" in the Pydoc header, as this
  is the name of the argument.

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
Development

Successfully merging this pull request may close these issues.

[BUG/ISSUE] Index error when creating 1-year benchmark plots for GCClassic vs GCHP