Skip to content

🐛 Enrich partial indicator-upgrader mapping in Anomalist#6260

Open
paarriagadap wants to merge 3 commits into
masterfrom
bug-indicatorupgrader-mapping
Open

🐛 Enrich partial indicator-upgrader mapping in Anomalist#6260
paarriagadap wants to merge 3 commits into
masterfrom
bug-indicatorupgrader-mapping

Conversation

@paarriagadap

@paarriagadap paarriagadap commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Written by Claude Code — @paarriagadap at the wheel.

Summary

Fixes a coverage gap in Anomalist's upgrade detectors, found during the Public Finances in Modern History update (#6256).

The bug. Anomalist's upgrade_missing / upgrade_change detectors only compare old→new variable pairs from the wizard's variable-mapping table. The indicator upgrader persists mappings only for charted indicators, and load_variable_mapping used any non-empty table as-is — so a partial mapping silently restricted the upgrade detectors to the charted subset, while suppressing the shortName-inference fallback that an empty table would have triggered. A partial mapping was strictly worse than none.

Real instance. In #6256, the IMF dropped all USA observations before 1929 across all ten indicators, and revised ~1,600 country-year values per column. Only one indicator (expenditure) is used in charts, so only it was in the mapping — Anomalist flagged the USA data loss on expenditure alone, and the other nine indicators (including gross debt, with revisions up to 26 pp of GDP) were never scanned by either upgrade detector.

Why full coverage matters. A dataset update upgrades all of the dataset's indicators, not just the ones currently feeding standalone charts. Indicators outside the chart-derived mapping are still consumed through MDims and explorers, published in the catalog and data API, cited in articles, and may get charted later. Anomalist QA is the gate where regressions in any of them should surface; scoping it to the charted subset means, e.g., an MDim-only indicator could lose or corrupt data without any anomaly ever being raised.

The fix.

  • load_variable_mapping now always infers old→new pairs by shortName for each new dataset and its previous version, and overlays the indicator upgrader's explicit entries on top (explicit wins). The empty-table and no-pairs behaviors are unchanged.
  • owidbot/anomalist.py already computed the new→old dataset pairs but discarded them, calling load_variable_mapping without them — meaning the staging bot never even reached the inference fallback. It now passes them through.

Testing

Against the #6256 staging DB (which has the real partial mapping — 2 expenditure-only entries):

  • load_variable_mapping([7987], {7987: 7049}) → 11 pairs covering all 10 indicators, with both explicit entries preserved and 9 pairs enriched by shortName (log line confirms).
  • load_variable_mapping([7987]) (no pairs, old bot call shape) → 2 pairs, unchanged behavior.
  • Manually re-running the upgrade detectors with the full mapping on that staging DB made the USA flag appear on all ten indicators in upgrade_missing (previously: expenditure only) and grew upgrade_change from expenditure-only to 1,049 reduced rows.
  • make check passes.

🤖 Generated with Claude Code

@owidbot

owidbot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs Docs Preview

Login: ssh owid@staging-site-bug-indicatorupgrader-mappin

chart-diff: ✅ No charts for review.
data-diff: ✅ No differences found

Automatically updated datasets matching excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2026-06-12 13:02:27 UTC
Execution time: 8.43 seconds

… in Anomalist

The indicator upgrader persists variable mappings only for charted
indicators. Anomalist used any non-empty wizard mapping as-is, so its
upgrade detectors (upgrade_missing / upgrade_change) silently scanned only
those indicators — a partial mapping suppressed the shortName-inference
fallback that an empty table would have triggered. Found on the PFMH update,
where the USA's dropped pre-1929 history was flagged only on expenditure
(the lone charted indicator) and invisible on the other nine.

load_variable_mapping now always infers pairs by shortName for each new
dataset and its previous version, and overlays the upgrader's explicit
entries on top (explicit wins). owidbot now passes the new->old dataset
pairs it already computes, so the staging bot benefits too.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@paarriagadap

Copy link
Copy Markdown
Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1d43f6afee

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/wizard/app_pages/anomalist/utils.py Outdated
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@paarriagadap

Copy link
Copy Markdown
Contributor Author

Hi @pabloarosado! Claude suggested doing this, so we can access all the (old version, new version) indicator pairs (not only the ones in charts) and check them in Anomalist.

This is relevant because we might use indicators in MDims that are not in charts.

Let me know if this makes sense.

@paarriagadap paarriagadap marked this pull request as ready for review June 12, 2026 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants