Support reading datasets from CDS (not only ERA5/MARS) #468

@nicrie

Is your feature request related to a problem? Please describe.

We are evaluating the anemoi framework for developing an ML model that maps ERA5 to WFDE5 for near-real-time updates. YAML recipe files currently appear to support fetching ERA5 from the CDS via a translated MARS request, but WFDE5 is available only on the CDS, not on MARS. As a workaround, we generate anemoi datasets from local WFDE5 files. This works technically, but it is not reproducible for others and doesn’t scale to an operational workflow.

Describe the solution you'd like

Allow YAML recipe files to define inputs from the CDS directly (e.g., specifying dataset name, variables, period, bounding box).
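
To make the request more concrete, here is a rough Python sketch of the information such a recipe input might carry and how it could be turned into a request dictionary. Everything in it is an assumption for illustration: `CdsInput`, `build_cds_request`, the field names, and the request keys are hypothetical and are not part of anemoi or of any existing recipe schema. The real request keys depend on the individual CDS dataset.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CdsInput:
    """Hypothetical recipe-level description of a CDS input (not an anemoi API)."""

    dataset: str  # CDS dataset identifier (placeholder values used below)
    variables: list[str]  # variable names as the dataset defines them
    start: date  # first day of the requested period
    end: date  # last day of the requested period
    area: tuple[float, float, float, float]  # north, west, south, east in degrees


def build_cds_request(spec: CdsInput) -> dict:
    """Validate the spec and return an illustrative request dictionary.

    The keys used here are assumptions; each CDS dataset defines its own.
    """
    north, west, south, east = spec.area
    if spec.start > spec.end:
        raise ValueError("start must not be after end")
    if not -90.0 <= south <= north <= 90.0:
        raise ValueError("area must satisfy -90 <= south <= north <= 90")
    if not spec.variables:
        raise ValueError("at least one variable is required")
    return {
        "dataset": spec.dataset,
        "variable": list(spec.variables),
        "date": f"{spec.start.isoformat()}/{spec.end.isoformat()}",
        "area": [north, west, south, east],
    }


if __name__ == "__main__":
    # Placeholder dataset and variable names, not real CDS identifiers.
    example = CdsInput(
        dataset="example-dataset-id",
        variables=["example_variable"],
        start=date(2020, 1, 1),
        end=date(2020, 1, 31),
        area=(60.0, -10.0, 35.0, 30.0),
    )
    print(build_cds_request(example))
```

The sketch deliberately stops before any download step. The point is only that dataset name, variables, period, and bounding box are enough to describe the input declaratively, so the same recipe could be shared and re-run by others.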

Describe alternatives you've considered

  • Generating datasets from local WFDE5 files (works but not shareable or reproducible).
  • Maintaining a manual pre-download pipeline outside of anemoi, which breaks the clean integration with the operational workflow.

Additional context

Not entirely sure whether this belongs here or in anemoi-datasets; happy to move the issue if needed.

Tagging @NatalieZelenka and @sahahner in case they would like to contribute additional input.

Organisation

ECMWF

Metadata

Labels

enhancement: New feature or request

Projects

Status

To be triaged
