📊 Add Maddison, WID and ILO regions to regions dataset #6262
Draft
paarriagadap wants to merge 4 commits into
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences
2026-06-12 17:12:44 [info ] Skipped datasets with identical data (source_checksum cascade) count=9
= Dataset garden/hyde/2024-01-02/all_indicators
= Table all_indicators
= Dataset garden/hyde/2026-06-08/all_indicators
= Table all_indicators
= Dataset garden/regions/2023-01-01/regions
= Table regions
~ Dim code
+ + New values: 46 / 416 (11.06%)
code
ILO_CAF
ILO_EAS
ILO_SEA
ILO_SSA
ILO_WEU
~ Column aliases (new data)
+ + New values: 46 / 416 (11.06%)
code aliases
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column cow_code (new data)
+ + New values: 46 / 416 (11.06%)
code cow_code
ILO_CAF <NA>
ILO_EAS <NA>
ILO_SEA <NA>
ILO_SSA <NA>
ILO_WEU <NA>
~ Column cow_letter (new data)
+ + New values: 46 / 416 (11.06%)
code cow_letter
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column defined_by (new data)
+ + New values: 46 / 416 (11.06%)
code defined_by
ILO_CAF ilo
ILO_EAS ilo
ILO_SEA ilo
ILO_SSA ilo
ILO_WEU ilo
~ Column end_year (new data)
+ + New values: 46 / 416 (11.06%)
code end_year
ILO_CAF <NA>
ILO_EAS <NA>
ILO_SEA <NA>
ILO_SSA <NA>
ILO_WEU <NA>
~ Column imf_code (new data)
+ + New values: 46 / 416 (11.06%)
code imf_code
ILO_CAF <NA>
ILO_EAS <NA>
ILO_SEA <NA>
ILO_SSA <NA>
ILO_WEU <NA>
~ Column is_historical (new data)
+ + New values: 46 / 416 (11.06%)
code is_historical
ILO_CAF False
ILO_EAS False
ILO_SEA False
ILO_SSA False
ILO_WEU False
~ Column iso_alpha2 (new data)
+ + New values: 46 / 416 (11.06%)
code iso_alpha2
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column iso_alpha3 (new data)
+ + New values: 46 / 416 (11.06%)
code iso_alpha3
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column kansas_code (new data)
+ + New values: 46 / 416 (11.06%)
code kansas_code
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column legacy_country_id (new data)
+ + New values: 46 / 416 (11.06%)
code legacy_country_id
ILO_CAF <NA>
ILO_EAS <NA>
ILO_SEA <NA>
ILO_SSA <NA>
ILO_WEU <NA>
~ Column legacy_entity_id (new data)
+ + New values: 46 / 416 (11.06%)
code legacy_entity_id
ILO_CAF <NA>
ILO_EAS <NA>
ILO_SEA <NA>
ILO_SSA <NA>
ILO_WEU <NA>
~ Column marc_code (new data)
+ + New values: 46 / 416 (11.06%)
code marc_code
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column members (new data)
+ + New values: 46 / 416 (11.06%)
code members
ILO_CAF ["AGO", "CAF", "CMR", "COD", "COG", "GAB", "GNQ", "STP", "TCD"]
ILO_EAS ["CHN", "HKG", "JPN", "KOR", "MAC", "MNG", "PRK", "TWN"]
ILO_SEA ["BRN", "IDN", "KHM", "LAO", "MMR", "MYS", "PHL", "SGP", "THA", "TLS", "VNM"]
ILO_SSA ["AGO", "CAF", "CMR", "COD", "COG", "GAB", "GNQ", "STP", "TCD", "BDI", "COM", "DJI", "ERI", "ETH", "KEN", "MDG", "MOZ", "MUS", "MWI", "REU", "RWA", "SOM", "SSD", "SYC", "TZA", "UGA", "ZMB", "ZWE", "BWA", "LSO", "NAM", "SWZ", "ZAF", "BEN", "BFA", "CIV", "CPV", "GHA", "GIN", "GMB", "GNB", "LBR", "MLI", "MRT", "NER", "NGA", "SEN", "SHN", "SLE", "TGO"]
ILO_WEU ["AUT", "BEL", "CHE", "DEU", "FRA", "LIE", "LUX", "MCO", "NLD"]
~ Column name (new data)
+ + New values: 46 / 416 (11.06%)
code name
ILO_CAF Central Africa (ILO)
ILO_EAS Eastern Asia (ILO)
ILO_SEA South-Eastern Asia (ILO)
ILO_SSA Sub-Saharan Africa (ILO)
ILO_WEU Western Europe (ILO)
~ Column ncd_code (new data)
+ + New values: 46 / 416 (11.06%)
code ncd_code
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column penn_code (new data)
+ + New values: 46 / 416 (11.06%)
code penn_code
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column region_type (new data)
+ + New values: 46 / 416 (11.06%)
code region_type
ILO_CAF aggregate
ILO_EAS aggregate
ILO_SEA aggregate
ILO_SSA aggregate
ILO_WEU aggregate
~ Column related (new data)
+ + New values: 46 / 416 (11.06%)
code related
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column short_name (new data)
+ + New values: 46 / 416 (11.06%)
code short_name
ILO_CAF Central Africa (ILO)
ILO_EAS Eastern Asia (ILO)
ILO_SEA South-Eastern Asia (ILO)
ILO_SSA Sub-Saharan Africa (ILO)
ILO_WEU Western Europe (ILO)
~ Column successors (new data)
+ + New values: 46 / 416 (11.06%)
code successors
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column unctad_code (new data)
+ + New values: 46 / 416 (11.06%)
code unctad_code
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
~ Column wikidata_code (new data)
+ + New values: 46 / 416 (11.06%)
code wikidata_code
ILO_CAF NaN
ILO_EAS NaN
ILO_SEA NaN
ILO_SSA NaN
ILO_WEU NaN
= Dataset garden/un/2024-07-12/un_wpp
= Table population
= Table deaths
= Table fertility_single
= Table life_expectancy
= Table births
= Table natural_change_rate
= Table growth_rate
= Table mean_age_childbearing
= Table dependency_ratio
= Table sex_ratio
= Table population_january
= Table fertility_rate
= Table mortality_rate
= Table median_age
= Table migration
= Dataset garden/worldbank_icp/2024-05-30/icp_2021_currencies
= Table icp_2021_currencies
Legend: +New ~Modified -Removed =Identical
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
Automatically updated datasets matching excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included.
Edited: 2026-06-12 17:13:05 UTC
Summary
Adds the 8 Maddison Project Database regions, the 9 World Inequality Database regions, and the 29 ILOSTAT regions to the regions dataset (`garden/regions/2023-01-01/regions.yml`), following the same convention as the existing WHO, WB, UN, UN M49, UN SDG, PEW, and IEA sections.

Maddison Project Database
`MADDISON_EA`, `MADDISON_EE`, `MADDISON_LA`, `MADDISON_MENA`, `MADDISON_SSEA`, `MADDISON_SSA`, `MADDISON_WE`, `MADDISON_WO`

World Inequality Database
`WID_EA`, `WID_EUR`, `WID_LA`, `WID_MENA`, `WID_NA`, `WID_OC`, `WID_RCA`, `WID_SSEA`, `WID_SSA`

ILOSTAT
The 29 regions published as "(ILO)" entities in `garden/un/2026-02-03/ilostat`, defined hierarchically (parents are compositions of child region codes, using the same nesting mechanism as `UNM49_AFR`):
- `ILO_AFR` = Northern Africa + Sub-Saharan Africa (= Central + Eastern + Southern + Western Africa)
- `ILO_AMR` = Latin America and the Caribbean (= Caribbean + Central America + South America) + Northern America
- `ILO_ARB`
- `ILO_ASP` = Eastern Asia + South-Eastern Asia and the Pacific (= South-Eastern Asia + Pacific Islands) + Southern Asia
- `ILO_ECA` = Northern, Southern and Western Europe (= Northern + Southern + Western Europe) + Eastern Europe + Central and Western Asia (= Central Asia + Western Asia)

Details
- […] `region` column of `garden/ggdc/2024-04-26/maddison_project_database`, which assigns each of the 169 MPD countries to one region. A set-equality check confirmed the YAML lists match the data exactly for all 8 regions.
- […] `OWID_KOS` by name), and a set-equality check confirmed all 9 YAML lists match the table exactly.
- […] `ilo_region` / `ilo_subregion_detailed` labels, via the `regions` table of `garden/un/2026-02-03/ilostat`). Verified: the 5 broad regions are pairwise disjoint and every country lands in the region ILOSTAT assigns it to.
- […] `# NOTE`): ILOSTAT lists Guernsey and Jersey under Western Europe while also listing Channel Islands (= Guernsey + Jersey in OWID's definitions) under Northern Europe. Keeping both would double-count Guernsey/Jersey after aggregate expansion and fail the regions step's uniqueness check. They are kept only via Channel Islands in Northern Europe, where ILOSTAT reports the bulk of the data for these territories (1,687 rows vs 55).
- […] `OWID_USS`), Czechoslovakia (`OWID_CZS`), and Yugoslavia (`OWID_YGS`) in Maddison's Eastern Europe, Sudan (former) (`OWID_SDN`) in Maddison's Sub-Saharan Africa, and Netherlands Antilles (`ANT`) in ILO's Caribbean (alongside its successors, matching ILOSTAT's own listing; `ANT` has no `members`, so nothing double-counts).
- […] `regions.codes.csv`, so only `regions.yml` is touched. The regions garden step rebuilds cleanly, including its post-expansion duplicate-member check.

🤖 Generated with Claude Code
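
For reviewers who want to reason about the checks mentioned in the details, here is a minimal sketch of nested-region expansion followed by a duplicate-member check, a set-equality check, and a pairwise-disjointness check. It is not the code of the regions garden step: the region codes, country codes, and the helper names `expand`, `duplicates`, and `pairwise_disjoint` are all hypothetical toy values chosen for illustration.

```python
from collections import Counter

# Toy definitions (hypothetical, NOT the real regions.yml content).
# A member is either a country code or the code of another region.
REGIONS = {
    "CHILD_A": ["AAA", "BBB"],
    "CHILD_B": ["CCC"],
    "PARENT": ["CHILD_A", "CHILD_B", "DDD"],
}


def expand(code, regions):
    """Recursively replace nested region codes by their country members."""
    out = []
    for member in regions[code]:
        if member in regions:
            out.extend(expand(member, regions))
        else:
            out.append(member)
    return out


def duplicates(members):
    """Return the members that appear more than once, sorted."""
    return sorted(m for m, n in Counter(members).items() if n > 1)


def pairwise_disjoint(codes, regions):
    """True if no country belongs to two of the given regions."""
    seen = set()
    for code in codes:
        members = set(expand(code, regions))
        if seen & members:
            return False
        seen |= members
    return True


expanded = expand("PARENT", REGIONS)
assert duplicates(expanded) == []
# Set-equality against an independently obtained member list.
assert set(expanded) == {"AAA", "BBB", "CCC", "DDD"}
assert pairwise_disjoint(["CHILD_A", "CHILD_B"], REGIONS)

# A country listed under two children surfaces only after expansion.
overlap = dict(REGIONS, CHILD_B=["CCC", "AAA"])
assert duplicates(expand("PARENT", overlap)) == ["AAA"]
```

The last assertion mirrors the Guernsey/Jersey situation in spirit: a member reachable through two child regions is invisible in the parent's own list and only shows up once the hierarchy is flattened.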