
📊 Add Maddison, WID and ILO regions to regions dataset#6262

Draft
paarriagadap wants to merge 4 commits into master from data-maddison-regions

paarriagadap commented Jun 12, 2026

Contributor

Written by Claude Code — @paarriagadap at the wheel.

Summary

Adds the 8 Maddison Project Database regions, the 9 World Inequality Database regions, and the 29 ILOSTAT regions to the regions dataset (garden/regions/2023-01-01/regions.yml), following the same convention as the existing WHO, WB, UN, UN M49, UN SDG, PEW, and IEA sections.

Maddison Project Database

| Code | Name | Members |
| --- | --- | --- |
| MADDISON_EA | East Asia (Maddison) | 6 |
| MADDISON_EE | Eastern Europe (Maddison) | 32 |
| MADDISON_LA | Latin America (Maddison) | 26 |
| MADDISON_MENA | Middle East and North Africa (Maddison) | 20 |
| MADDISON_SSEA | South and South East Asia (Maddison) | 16 |
| MADDISON_SSA | Sub Saharan Africa (Maddison) | 46 |
| MADDISON_WE | Western Europe (Maddison) | 19 |
| MADDISON_WO | Western offshoots (Maddison) | 4 |

World Inequality Database

| Code | Name | Members |
| --- | --- | --- |
| WID_EA | East Asia (WID) | 8 |
| WID_EUR | Europe (WID) | 46 |
| WID_LA | Latin America (WID) | 43 |
| WID_MENA | MENA (WID) | 20 |
| WID_NA | North America (WID) | 4 |
| WID_OC | Oceania (WID) | 16 |
| WID_RCA | Russia and Central Asia (WID) | 11 |
| WID_SSEA | South & South-East Asia (WID) | 19 |
| WID_SSA | Sub-Saharan Africa (WID) | 49 |

ILOSTAT

The 29 regions published as "(ILO)" entities in garden/un/2026-02-03/ilostat, defined hierarchically (parents are compositions of child region codes, using the same nesting mechanism as UNM49_AFR):

  • Africa (ILO_AFR) = Northern Africa + Sub-Saharan Africa (= Central + Eastern + Southern + Western Africa)
  • Americas (ILO_AMR) = Latin America and the Caribbean (= Caribbean + Central America + South America) + Northern America
  • Arab States (ILO_ARB)
  • Asia and the Pacific (ILO_ASP) = Eastern Asia + South-Eastern Asia and the Pacific (= South-Eastern Asia + Pacific Islands) + Southern Asia
  • Europe and Central Asia (ILO_ECA) = Northern, Southern and Western Europe (= Northern + Southern + Western Europe) + Eastern Europe + Central and Western Asia (= Central Asia + Western Asia)
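
The nesting described above (a parent region listing child region codes, which in turn list countries) can be illustrated with a minimal sketch. This is not the regions step's actual code: the function name `expand_members`, the dict layout, and the codes `ILO_NAF` and `ILO_SAF` are hypothetical, and the member lists are truncated for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical, heavily truncated definitions. ILO_NAF and ILO_SAF are
# made-up codes for this sketch, and the member lists are NOT the full
# ILOSTAT membership.
REGIONS: Dict[str, List[str]] = {
    "ILO_AFR": ["ILO_NAF", "ILO_SSA"],  # parent composed of child regions
    "ILO_SSA": ["ILO_CAF", "ILO_SAF"],  # child that is itself a composition
    "ILO_NAF": ["DZA", "EGY"],
    "ILO_CAF": ["AGO", "CAF"],
    "ILO_SAF": ["BWA", "ZAF"],
}


def expand_members(
    code: str, regions: Dict[str, List[str]], _path: Tuple[str, ...] = ()
) -> List[str]:
    """Recursively replace nested region codes with their leaf members."""
    if code in _path:
        cycle = " -> ".join(_path + (code,))
        raise ValueError(f"Cyclic region definition: {cycle}")
    if code not in regions:
        # Anything without its own definition is treated as a leaf (a country).
        return [code]
    leaves: List[str] = []
    for member in regions[code]:
        leaves.extend(expand_members(member, regions, _path + (code,)))
    return leaves


africa = expand_members("ILO_AFR", REGIONS)
```

Here `africa` contains only country codes, in definition order, so a parent never needs to repeat the country lists of its children.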

Details

  • Region names match the aggregate entities already published in the respective garden/grapher datasets (e.g. "Western offshoots (Maddison)", "Russia and Central Asia (WID)", "Arab States (ILO)"), so the new definitions line up with the entities users see in charts.
  • Maddison membership is taken from the data itself: extracted from the region column of garden/ggdc/2024-04-26/maddison_project_database, which assigns each of the 169 MPD countries to one region. A set-equality check confirmed the YAML lists match the data exactly for all 8 regions.
  • WID membership comes from WID's official country table ("world region" column). The combined "North America & Oceania" region is split into North America and Oceania using WID's "sub-division of world regions" column, since those are the entities OWID publishes. All 216 countries resolved to region codes via ISO alpha-2 (Kosovo → OWID_KOS by name), and a set-equality check confirmed all 9 YAML lists match the table exactly.
  • ILO membership comes from ILOSTAT's table of contents (ilo_region / ilo_subregion_detailed labels, via the regions table of garden/un/2026-02-03/ilostat). Verified: the 5 broad regions are pairwise disjoint and every country lands in the region ILOSTAT assigns it to.
  • Channel Islands deviation (documented in a # NOTE): ILOSTAT lists Guernsey and Jersey under Western Europe while also listing Channel Islands (= Guernsey + Jersey in OWID's definitions) under Northern Europe. Keeping both would double-count Guernsey/Jersey after aggregate expansion and fail the regions step's uniqueness check. They are kept only via Channel Islands in Northern Europe, where ILOSTAT reports the bulk of the data for these territories (1,687 rows vs 55).
  • Historical entities are included where the source assigns them a region: USSR (OWID_USS), Czechoslovakia (OWID_CZS), and Yugoslavia (OWID_YGS) in Maddison's Eastern Europe, Sudan (former) (OWID_SDN) in Maddison's Sub-Saharan Africa, and Netherlands Antilles (ANT) in ILO's Caribbean (alongside its successors, matching ILOSTAT's own listing; ANT has no members, so nothing double-counts).
  • Aggregates don't need entries in regions.codes.csv, so only regions.yml is touched. The regions garden step rebuilds cleanly, including its post-expansion duplicate-member check.
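
The two kinds of checks mentioned in the list above (set equality between the YAML lists and the source, and uniqueness of members after aggregate expansion) might look roughly like the sketch below. The helper names `membership_diff` and `duplicated_members` are hypothetical and are not taken from the ETL codebase; the example lists are truncated and illustrative, not the real ILOSTAT membership.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def membership_diff(
    yaml_members: Iterable[str], source_members: Iterable[str]
) -> Dict[str, Set[str]]:
    """Compare two member lists as sets; both results are empty when they match."""
    yaml_set, source_set = set(yaml_members), set(source_members)
    return {
        "missing_from_yaml": source_set - yaml_set,
        "extra_in_yaml": yaml_set - source_set,
    }


def duplicated_members(expanded: List[str]) -> List[str]:
    """Return the codes that appear more than once in an expanded member list."""
    counts = Counter(expanded)
    return sorted(code for code, n in counts.items() if n > 1)


# Set equality ignores ordering, so these two lists count as a match.
diff = membership_diff(["AUT", "BEL", "CHE"], ["CHE", "AUT", "BEL"])

# Illustrative leaf lists if BOTH ILOSTAT listings had been kept:
# Guernsey (GGY) and Jersey (JEY) would arrive once via Channel Islands
# under Northern Europe and once directly under Western Europe.
northern_europe = ["GBR", "GGY", "JEY"]  # truncated, illustrative
western_europe = ["FRA", "GGY", "JEY"]   # truncated, illustrative
duplicates = duplicated_members(northern_europe + western_europe)
```

In this toy case `duplicates` flags Guernsey and Jersey, which is the situation the Channel Islands note avoids by keeping them only under Northern Europe.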

🤖 Generated with Claude Code

owidbot commented Jun 12, 2026

Contributor

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs Docs Preview

Login: ssh owid@staging-site-data-maddison-regions

chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences
2026-06-12 17:12:44 [info     ] Skipped datasets with identical data (source_checksum cascade) count=9
= Dataset garden/hyde/2024-01-02/all_indicators
  = Table all_indicators
= Dataset garden/hyde/2026-06-08/all_indicators
  = Table all_indicators
= Dataset garden/regions/2023-01-01/regions
  = Table regions
    ~ Dim code
+       + New values: 46 / 416 (11.06%)
             code
          ILO_CAF
          ILO_EAS
          ILO_SEA
          ILO_SSA
          ILO_WEU
    ~ Column aliases (new data)
+       + New values: 46 / 416 (11.06%)
             code aliases
          ILO_CAF     NaN
          ILO_EAS     NaN
          ILO_SEA     NaN
          ILO_SSA     NaN
          ILO_WEU     NaN
    ~ Column cow_code (new data)
+       + New values: 46 / 416 (11.06%)
             code  cow_code
          ILO_CAF      <NA>
          ILO_EAS      <NA>
          ILO_SEA      <NA>
          ILO_SSA      <NA>
          ILO_WEU      <NA>
    ~ Column cow_letter (new data)
+       + New values: 46 / 416 (11.06%)
             code cow_letter
          ILO_CAF        NaN
          ILO_EAS        NaN
          ILO_SEA        NaN
          ILO_SSA        NaN
          ILO_WEU        NaN
    ~ Column defined_by (new data)
+       + New values: 46 / 416 (11.06%)
             code defined_by
          ILO_CAF        ilo
          ILO_EAS        ilo
          ILO_SEA        ilo
          ILO_SSA        ilo
          ILO_WEU        ilo
    ~ Column end_year (new data)
+       + New values: 46 / 416 (11.06%)
             code  end_year
          ILO_CAF      <NA>
          ILO_EAS      <NA>
          ILO_SEA      <NA>
          ILO_SSA      <NA>
          ILO_WEU      <NA>
    ~ Column imf_code (new data)
+       + New values: 46 / 416 (11.06%)
             code  imf_code
          ILO_CAF      <NA>
          ILO_EAS      <NA>
          ILO_SEA      <NA>
          ILO_SSA      <NA>
          ILO_WEU      <NA>
    ~ Column is_historical (new data)
+       + New values: 46 / 416 (11.06%)
             code  is_historical
          ILO_CAF          False
          ILO_EAS          False
          ILO_SEA          False
          ILO_SSA          False
          ILO_WEU          False
    ~ Column iso_alpha2 (new data)
+       + New values: 46 / 416 (11.06%)
             code iso_alpha2
          ILO_CAF        NaN
          ILO_EAS        NaN
          ILO_SEA        NaN
          ILO_SSA        NaN
          ILO_WEU        NaN
    ~ Column iso_alpha3 (new data)
+       + New values: 46 / 416 (11.06%)
             code iso_alpha3
          ILO_CAF        NaN
          ILO_EAS        NaN
          ILO_SEA        NaN
          ILO_SSA        NaN
          ILO_WEU        NaN
    ~ Column kansas_code (new data)
+       + New values: 46 / 416 (11.06%)
             code kansas_code
          ILO_CAF         NaN
          ILO_EAS         NaN
          ILO_SEA         NaN
          ILO_SSA         NaN
          ILO_WEU         NaN
    ~ Column legacy_country_id (new data)
+       + New values: 46 / 416 (11.06%)
             code  legacy_country_id
          ILO_CAF               <NA>
          ILO_EAS               <NA>
          ILO_SEA               <NA>
          ILO_SSA               <NA>
          ILO_WEU               <NA>
    ~ Column legacy_entity_id (new data)
+       + New values: 46 / 416 (11.06%)
             code  legacy_entity_id
          ILO_CAF              <NA>
          ILO_EAS              <NA>
          ILO_SEA              <NA>
          ILO_SSA              <NA>
          ILO_WEU              <NA>
    ~ Column marc_code (new data)
+       + New values: 46 / 416 (11.06%)
             code marc_code
          ILO_CAF       NaN
          ILO_EAS       NaN
          ILO_SEA       NaN
          ILO_SSA       NaN
          ILO_WEU       NaN
    ~ Column members (new data)
+       + New values: 46 / 416 (11.06%)
             code                                                                                                                                                                                                                                                                                                                                                        members
          ILO_CAF                                                                                                                                                                                                                                                                                                ["AGO", "CAF", "CMR", "COD", "COG", "GAB", "GNQ", "STP", "TCD"]
          ILO_EAS                                                                                                                                                                                                                                                                                                       ["CHN", "HKG", "JPN", "KOR", "MAC", "MNG", "PRK", "TWN"]
          ILO_SEA                                                                                                                                                                                                                                                                                  ["BRN", "IDN", "KHM", "LAO", "MMR", "MYS", "PHL", "SGP", "THA", "TLS", "VNM"]
          ILO_SSA ["AGO", "CAF", "CMR", "COD", "COG", "GAB", "GNQ", "STP", "TCD", "BDI", "COM", "DJI", "ERI", "ETH", "KEN", "MDG", "MOZ", "MUS", "MWI", "REU", "RWA", "SOM", "SSD", "SYC", "TZA", "UGA", "ZMB", "ZWE", "BWA", "LSO", "NAM", "SWZ", "ZAF", "BEN", "BFA", "CIV", "CPV", "GHA", "GIN", "GMB", "GNB", "LBR", "MLI", "MRT", "NER", "NGA", "SEN", "SHN", "SLE", "TGO"]
          ILO_WEU                                                                                                                                                                                                                                                                                                ["AUT", "BEL", "CHE", "DEU", "FRA", "LIE", "LUX", "MCO", "NLD"]
    ~ Column name (new data)
+       + New values: 46 / 416 (11.06%)
             code                     name
          ILO_CAF     Central Africa (ILO)
          ILO_EAS       Eastern Asia (ILO)
          ILO_SEA South-Eastern Asia (ILO)
          ILO_SSA Sub-Saharan Africa (ILO)
          ILO_WEU     Western Europe (ILO)
    ~ Column ncd_code (new data)
+       + New values: 46 / 416 (11.06%)
             code ncd_code
          ILO_CAF      NaN
          ILO_EAS      NaN
          ILO_SEA      NaN
          ILO_SSA      NaN
          ILO_WEU      NaN
    ~ Column penn_code (new data)
+       + New values: 46 / 416 (11.06%)
             code penn_code
          ILO_CAF       NaN
          ILO_EAS       NaN
          ILO_SEA       NaN
          ILO_SSA       NaN
          ILO_WEU       NaN
    ~ Column region_type (new data)
+       + New values: 46 / 416 (11.06%)
             code region_type
          ILO_CAF   aggregate
          ILO_EAS   aggregate
          ILO_SEA   aggregate
          ILO_SSA   aggregate
          ILO_WEU   aggregate
    ~ Column related (new data)
+       + New values: 46 / 416 (11.06%)
             code related
          ILO_CAF     NaN
          ILO_EAS     NaN
          ILO_SEA     NaN
          ILO_SSA     NaN
          ILO_WEU     NaN
    ~ Column short_name (new data)
+       + New values: 46 / 416 (11.06%)
             code               short_name
          ILO_CAF     Central Africa (ILO)
          ILO_EAS       Eastern Asia (ILO)
          ILO_SEA South-Eastern Asia (ILO)
          ILO_SSA Sub-Saharan Africa (ILO)
          ILO_WEU     Western Europe (ILO)
    ~ Column successors (new data)
+       + New values: 46 / 416 (11.06%)
             code successors
          ILO_CAF        NaN
          ILO_EAS        NaN
          ILO_SEA        NaN
          ILO_SSA        NaN
          ILO_WEU        NaN
    ~ Column unctad_code (new data)
+       + New values: 46 / 416 (11.06%)
             code unctad_code
          ILO_CAF         NaN
          ILO_EAS         NaN
          ILO_SEA         NaN
          ILO_SSA         NaN
          ILO_WEU         NaN
    ~ Column wikidata_code (new data)
+       + New values: 46 / 416 (11.06%)
             code wikidata_code
          ILO_CAF           NaN
          ILO_EAS           NaN
          ILO_SEA           NaN
          ILO_SSA           NaN
          ILO_WEU           NaN
= Dataset garden/un/2024-07-12/un_wpp
  = Table population
  = Table deaths
  = Table fertility_single
  = Table life_expectancy
  = Table births
  = Table natural_change_rate
  = Table growth_rate
  = Table mean_age_childbearing
  = Table dependency_ratio
  = Table sex_ratio
  = Table population_january
  = Table fertility_rate
  = Table mortality_rate
  = Table median_age
  = Table migration
= Dataset garden/worldbank_icp/2024-05-30/icp_2021_currencies
  = Table icp_2021_currencies


Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2026-06-12 17:13:05 UTC
Execution time: 39.95 seconds

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@paarriagadap changed the title from "📊 Add Maddison Project Database regions to regions dataset" to "📊 Add Maddison Project Database and WID regions to regions dataset" on Jun 12, 2026
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@paarriagadap changed the title from "📊 Add Maddison Project Database and WID regions to regions dataset" to "📊 Add Maddison, WID and ILO regions to regions dataset" on Jun 12, 2026
