We estimate a hospital's population catchment from the proportion of the population in each surrounding ZIP Code that is likely to visit that hospital. The "likely" proportions are taken from the Texas DSHS Inpatient Public Use Data File (IP PUDF) according to Equation (1) below. The IP PUDF is released quarterly, so a time span can cover any range of quarters from 2018Q3 to 2023Q4, for any disease defined by a set of ICD-10-CM codes. The finest spatial resolution available for patient addresses is the ZIP Code. We convert all ZIP Codes to ZIP Code Tabulation Areas (ZCTAs) so that PO Box ZIP Codes are assigned to polygons that have a US Census Bureau population estimate. If a hospital discharged fewer than 30 patients in a quarter, its patient ZIP Codes for that quarter are anonymized in the data and those records are not considered.
For example, COVID-19 has the ICD-10-CM code U071, so we consider all patients with U071 as their principal diagnosis, plus any patients with U071 as a secondary diagnosis that was present on admission.
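
As a minimal sketch of this selection rule, the R snippet below filters a toy discharge table. The column names (`princ_diag`, `oth_diag`, `oth_diag_poa`) and the helper `select_disease_records()` are hypothetical and are not taken from the IP PUDF or from this repository; the real file spreads diagnoses over many columns, so treat this as an illustration of the logic only.

```r
# Toy discharge records; all column names here are illustrative assumptions.
discharges <- data.frame(
  record_id    = 1:4,
  princ_diag   = c("U071", "J189", "I10",  "J189"),
  oth_diag     = c("I10",  "U071", "E119", "U071"),
  oth_diag_poa = c("Y",    "Y",    "Y",    "N"),
  stringsAsFactors = FALSE
)

# Keep a record if the code is the principal diagnosis, or if it is a
# secondary diagnosis flagged as present on admission ("Y").
select_disease_records <- function(df, icd_codes) {
  is_principal     <- df$princ_diag %in% icd_codes
  is_secondary_poa <- df$oth_diag %in% icd_codes & df$oth_diag_poa == "Y"
  df[is_principal | is_secondary_poa, , drop = FALSE]
}

covid_records <- select_disease_records(discharges, icd_codes = "U071")
print(covid_records$record_id)
```

Record 4 is dropped in this toy data because its secondary U071 code is not flagged as present on admission.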

| ZCTA | Hospital | Patients from ZCTA at hospital | Total patients from ZCTA | ZCTA population | Contribution to catchment |
|---|---|---|---|---|---|
| 1 | 1 | 100 | 100 | 1000 | 1000 |
| 2 | 1 | 50 | 100 | 500 | 250 |
| 2 | 2 | 50 | 100 | 500 | 250 |

$C_1 = 1000 + 250 = 1250$

$C_2 = 250$
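
The worked example is consistent with each ZCTA-hospital pair contributing (patients from the ZCTA at the hospital / total patients from the ZCTA) × ZCTA population, and with a hospital's catchment being the sum of its contributions. The R sketch below reproduces $C_1$ and $C_2$ under that reading. The formula is inferred from the numbers above, and the data frame and column names are hypothetical, so this is not necessarily how run_estimation_fns.R computes it.

```r
# ZCTA-hospital pairs from the worked example; column names are assumptions.
pairs <- data.frame(
  zcta            = c(1, 2, 2),
  hospital        = c(1, 1, 2),
  patients_pair   = c(100, 50, 50),   # patients from this ZCTA at this hospital
  zcta_population = c(1000, 500, 500)
)

# Total patients from each ZCTA across all hospitals.
pairs$patients_zcta <- ave(pairs$patients_pair, pairs$zcta, FUN = sum)

# Share of the ZCTA's patients seen at the hospital, scaled by ZCTA population.
pairs$contribution <- pairs$patients_pair / pairs$patients_zcta * pairs$zcta_population

# Sum the contributions per hospital to get the estimated catchment.
catchment <- aggregate(contribution ~ hospital, data = pairs, FUN = sum)
print(catchment)
```

Computing the ZCTA totals from the pairs themselves keeps the shares for each ZCTA summing to 1, so a ZCTA's population is split across hospitals rather than counted more than once.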
The Rproj file is inside the code folder, so open it to work in RStudio. run_estimation_fns.R will create private_results/ZCTA-HOSP-PAIR_DISEASE_DATE-RANGE.csv, private_results/HOSP_CATCHMENTS/HOSP-CATCH-CALC_DISEASE_DATE-RANGE.csv and private_results/HOSP_CATCHMENTS/HOSP-POP-CATCH_DISEASE_DATE-RANGE.csv. HOSP-CATCH-CALC_DISEASE_DATE-RANGE.csv contains the calculations used to estimate the catchment: the ZCTA-Hospital pairs plus the ZCTA population. HOSP-POP-CATCH_DISEASE_DATE-RANGE.csv contains only the hospital THCIC_IDs and their estimated catchments.
The THCIC_IDs come with only the hospital/provider names, in the facility file associated with each year-quarter. create_thcic-id_to_ccn_crosswalk.R uses 3 sources of hospital names to determine which city a hospital is in based on its name alone. All hospitals for which a match was found are in private_results/matched_hospital_2024-12-01.csv and those still missing are in private_results/missing_hospital_2024-12-01.csv. For the subset of missing hospitals required for the catchment file, the city is taken to be the city of the ZIP Code that sends the most patients to that hospital. This works well in most cases, but can be off for a few hospitals where the top contribution is only 1-2 admitted patients. The final file to use is private_results/HOSP_CATCHMENTS/CITY-HOSP-POP-CATCH_DISEASE_DATE-RANGE.csv.
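
The top-visiting-ZIP fallback can be sketched roughly as follows. The data frame, its column names, the ZIP-to-city lookup, and the 2-patient threshold used for flagging are all hypothetical; create_thcic-id_to_ccn_crosswalk.R may implement this step differently.

```r
# Toy patient counts per hospital and ZIP Code (illustrative values only).
visits <- data.frame(
  thcic_id = c("A", "A", "B", "B"),
  zip      = c("78701", "78702", "77002", "77003"),
  patients = c(40, 10, 1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical ZIP Code to city lookup.
zip_city <- c("78701" = "Austin",  "78702" = "Austin",
              "77002" = "Houston", "77003" = "Houston")

# For each hospital, keep the row with the most patients (first row wins ties).
top_rows <- do.call(rbind, lapply(split(visits, visits$thcic_id), function(d) {
  d[which.max(d$patients), , drop = FALSE]
}))

# Assign the city of the top visiting ZIP Code.
top_rows$city <- unname(zip_city[top_rows$zip])

# Flag assignments that rest on very few patients, since these are the
# ones most likely to be wrong.
top_rows$low_support <- top_rows$patients <= 2
print(top_rows[, c("thcic_id", "zip", "city", "low_support")])
```

Flagging low-support assignments makes it easier to review by hand the hospitals whose city rests on only one or two admissions.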