Mrb 807 avoid regridding forecasts#130
Open
Louis-Frey wants to merge 2 commits into
Open
Conversation
…y aligned Avoids O(N log N) kd-tree build and query (~1M points at 1km resolution) when forecast and truth share the same grid, reducing baseline spatial verification from ~20 minutes to near-instantaneous.
jonasbhend
reviewed
May 8, 2026
Comment on lines
+120
to
+122
| result = map_forecast_to_truth(fcst, truth) | ||
|
|
||
| assert result is fcst |
Contributor
There was a problem hiding this comment.
Hi @Louis-Frey. Great addition. The below is just a suggestion:
This doesn't really test if we abort early, no? Either the test should be expanded to actually test timing (probably difficult), or one could add a log statement to the if branch to check if the early option is taken as suspected (and check if not if the grids differ).
Contributor
Author
There was a problem hiding this comment.
Hi Jonas, thanks! I will have a look...
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Performance bottleneck in map_forecast_to_truth
Date: 2026-04-02
Context
During testing of the spatial verification pipeline (evalml experiment --spatial)
with a single-init-time config (forecasters-ich1-oper-fixed.yaml, 2025-03-01),
the verif_metrics_spatial_baseline jobs for ICON-CH1-EPS took ~20 minutes despite
having only one init time to process — i.e., nothing to aggregate.
Root cause
The bottleneck is in src/verification/spatial.py: map_forecast_to_truth().
Even when the forecast (ICON-CH1-EPS) and truth (KENDA-CH1) are on the same
1km grid and no actual remapping is needed, the function always:
At 1km resolution this involves ~1M grid points, making the kd-tree operations
the dominant cost regardless of how many init times are processed.
There is already a TODO comment in the code acknowledging this:
TODO: return fcst unchanged when forecast and truth are already aligned
(src/verification/spatial.py, line 124)
Recommended fix
Before building the kd-tree, check whether forecast and truth lat/lon
coordinates are already aligned (e.g. same shape and max abs difference
below a small tolerance). If so, return fcst unchanged immediately.
This would make the baseline spatial verification near-instantaneous for
same-grid configurations (ICON-CH1-EPS vs KENDA-CH1), and would also
benefit any other same-grid run/truth combination.