Selecting a subset of levels while creating Zarr file from multiple level NetCDF files #498

@mo-akm

Describe the bug
Context:

I am creating an Anemoi Zarr dataset from NetCDF files.

The files contain data on 33 pressure levels in Pa, with levels as follows:
1000, 2000, 3000, 4000, 5000, 7000, 10000, 12500, 15000, 17500, 20000, 22500, 25000, 27500, 30000, 32500, 35000, 37500, 40000, 45000, 50000, 55000, 60000, 65000, 70000, 75000, 80000, 85000, 90000, 92500, 95000, 97500, 100000

I wish to sub-select the levels with the following example entries in my recipe file:

pressure: &pressure_levels
  - 3000
  - 7000
  - 10000
  - 15000
  - 20000
  - 25000
  - 30000
  - 40000
  - 50000
  - 70000
  - 85000
  - 95000
  - 100000

input:
  join:
    - netcdf:
        path: /path/to/files/*-relative_humidity_on_pressure_levels.nc
        param: relative_humidity
        pressure: *pressure_levels

Problem:
As soon as I include a level with a pressure value above 35000 (that is, 37500 or higher), I get the following error message:

File ".pixi/envs/default/lib/python3.14/site-packages/anemoi/datasets/create/sources/xarray_support/fieldlist.py", line 228, in sel
    v = v.sel(missing, **rest)
  File ".pixi/envs/default/lib/python3.14/site-packages/anemoi/datasets/create/sources/xarray_support/variable.py", line 213, in sel
    i = c.index(v)
  File ".pixi/envs/default/lib/python3.14/site-packages/anemoi/datasets/create/sources/xarray_support/coordinates.py", line 168, in index
    return self._index_multiple(value)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File ".pixi/envs/default/lib/python3.14/site-packages/anemoi/datasets/create/sources/xarray_support/coordinates.py", line 228, in _index_multiple
    if np.all(values[index] == value):
              ^^^^^^^^^^^^^^^^^^^^^^
ValueError: operands could not be broadcast together with shapes (7,) (8,)

Exploration:

If every requested pressure value is 35000 or below, the selected levels are written to the output Zarr successfully. The failure does not appear to depend on the spacing between the requested levels or on how many levels lie between them, only on their absolute values. See the table below for more details.

If I do not specify any levels, all the levels are written to the output Zarr successfully.

Pressure levels (Pa)                        Outcome
1000, 2000, 3000, 32500                     Success
1000, 2000, 3000, 35000                     Success
1000, 2000, 3000, 4000, 35000               Success
1000, 2000, 3000, 4000, 37500               Failure. ValueError: operands could not be broadcast together with shapes (4,) (5,)
40000, 50000, 70000, 85000, 95000, 100000   Failure. ValueError: operands could not be broadcast together with shapes (0,) (6,)
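
The pattern in the table is consistent with one possible cause, sketched below. This is an assumption, not something confirmed from the anemoi source or from the file itself. Suppose the pressure coordinate is stored in descending order (100000 first) and the lookup uses a binary search that expects ascending order. Any requested value above the middle element of the 33 levels (35000) would then land past the end of the array. If those out-of-range positions are dropped, fewer matches remain than levels requested, which would give the mismatched shapes seen in the errors. The sketch uses Python's bisect module as a stand-in for the real lookup; `stored` and `match_shapes` are hypothetical names.

```python
from bisect import bisect_left

# The 33 levels listed in this issue, in Pa.
LEVELS = [
    1000, 2000, 3000, 4000, 5000, 7000, 10000, 12500, 15000, 17500, 20000,
    22500, 25000, 27500, 30000, 32500, 35000, 37500, 40000, 45000, 50000,
    55000, 60000, 65000, 70000, 75000, 80000, 85000, 90000, 92500, 95000,
    97500, 100000,
]

# ASSUMPTION: the file stores the coordinate in descending order.
stored = sorted(LEVELS, reverse=True)


def match_shapes(requested):
    """Return (positions kept, levels requested) for a binary search that
    wrongly treats `stored` as ascending and drops out-of-range positions."""
    positions = [bisect_left(stored, level) for level in requested]
    kept = [p for p in positions if p < len(stored)]
    return (len(kept), len(requested))


for requested in (
    [1000, 2000, 3000, 35000],
    [1000, 2000, 3000, 4000, 37500],
    [40000, 50000, 70000, 85000, 95000, 100000],
):
    # Equal sizes: the comparison can run. Unequal sizes: broadcast error.
    print(requested, match_shapes(requested))
```

Under these assumptions the three requests give (4, 4), (4, 5) and (0, 6), the same sizes as in the table. Printing the pressure coordinate of the source file would confirm or rule out the descending-order assumption.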

Version number
I am using the following versions/branch/sha1 of the anemoi packages: 0.5.28

URL to sample input data
https://met-office-atmospheric-model-data.s3-eu-west-2.amazonaws.com/global-deterministic-10km/20251210T0000Z/20251210T0000Z-PT0000H00M-relative_humidity_on_pressure_levels.nc

Expected behavior
I expect to produce a Zarr file containing the selected levels, regardless of which pressure values are requested, provided they are available in the source data.
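
The expectation above amounts to a lookup that does not depend on how the coordinate is ordered in the file. The following is a minimal sketch of that idea, not the anemoi implementation; `select_level_indices` is a hypothetical helper introduced only for illustration.

```python
def select_level_indices(stored_levels, requested_levels):
    """Return the position of each requested level in `stored_levels`,
    whatever order the levels are stored in."""
    # Map each stored value to its position once, so no sorting is assumed.
    position = {level: i for i, level in enumerate(stored_levels)}
    missing = [level for level in requested_levels if level not in position]
    if missing:
        # Only levels absent from the source data should be an error.
        raise KeyError(f"levels not present in source data: {missing}")
    return [position[level] for level in requested_levels]


# Works for a descending coordinate as well as an ascending one.
descending = [100000, 85000, 50000, 37500, 35000, 1000]
print(select_level_indices(descending, [1000, 37500, 100000]))
```

With the descending example the call returns the positions 5, 3 and 0, and a level that is not in the source raises KeyError instead of a shape error.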
