haolunshi/COVID-19-data

2019 Novel Coronavirus (2019-nCoV) and COVID-19 Unpivoted Data

License: BSD 3-clause

This data set collates a growing number of critical indicators for assessment, monitoring and forecasting of the global COVID-19 situation. The data set is maintained by Starschema, an international data services consultancy.

Real-time data, easy to work with

A range of data sets have been published that are useful for monitoring and understanding the spread of COVID-19. Our efforts are intended to collate, curate and unify the most valuable data sources for enterprises, individuals and public health experts to assess the situation and make data-driven decisions. This single source easily blends with other data sources so you can analyze the movement of the SARS-CoV-2 pandemic over time, in any context.

Data sets

Currently, the following data sets are included:

| Name | Source | Table name |
| --- | --- | --- |
| US COVID-19 testing and mortality | The COVID Tracking Project | CT_US_COVID_TESTS |
| Global data on healthcare providers | OpenStreetMap, via Healthsites.io | HS_BULK_DATA |
| Global case counts | JHU CSSE | JHU_COVID_19 |
| US healthcare capacity by state, 2018 | The Henry J. Kaiser Family Foundation | KFF_HCP_CAPACITY |
| US policy actions by state | The Henry J. Kaiser Family Foundation | KFF_US_POLICY_ACTIONS |
| US actions to mitigate spread, by state | The Henry J. Kaiser Family Foundation | KFF_US_STATE_MITIGATIONS |
| ICU beds by county, US | The Henry J. Kaiser Family Foundation | KFF_US_ICU_BEDS |
| Italy case statistics, summary | Protezione Civile | PCM_DPS_COVID19 |
| Italy case statistics, detailed | Protezione Civile | PCM_DPS_COVID19_DETAILS |
| WHO situation reports | World Health Organization | WHO_SITUATION_REPORTS |
| US case and mortality counts, by county | The New York Times | NYT_US_COVID19 |
| COVID-19 cases and deaths, Canada, province level | ViriHealth | VH_CAN_DETAILED |
| Travel restrictions by country | World Food Programme via HDX | HUM_RESTRICTIONS_COUNTRY |
| Travel restrictions by airline | World Food Programme via HDX | HUM_RESTRICTIONS_AIRLINE |
| ACAPS public health restriction data | ACAPS via HDX | HDX_ACAPS |
| Detailed case counts by province, sex and age band, Belgium | Sciensano | SCS_BE_DETAILED_PROVINCE_CASE_COUNTS |
| Detailed hospitalisations by type of hospital care, Belgium | Sciensano | SCS_BE_DETAILED_HOSPITALISATIONS |
| Detailed mortality by region, sex and age band, Belgium | Sciensano | SCS_BE_DETAILED_MORTALITY |
| Number of tests performed by day, Belgium | Sciensano | SCS_BE_DETAILED_TESTS |

Technical details

Conventions

By convention, we unify geographies to ISO 3166-1 alpha-2 country codes and ISO 3166-2 subdivision codes. We use pycountry's country definitions and mappings.
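
As a rough illustration of this convention, the sketch below maps a free-form country name to its ISO 3166-1 alpha-2 code. It is not taken from the repository's notebooks: the helper name `to_alpha2` and its `lookup` parameter are hypothetical, and by default it relies on `pycountry.countries.lookup`, which needs the third-party pycountry package to be installed.

```python
from typing import Callable, Optional


def to_alpha2(name: str, lookup: Optional[Callable] = None) -> Optional[str]:
    """Return the ISO 3166-1 alpha-2 code for a country name, or None.

    `to_alpha2` is a hypothetical helper for illustration only.
    `lookup` is any callable that returns an object with an `alpha_2`
    attribute and raises LookupError when nothing matches.
    """
    if lookup is None:
        # Imported lazily so that the function can also be used with a
        # custom lookup when pycountry is not installed.
        import pycountry

        lookup = pycountry.countries.lookup
    try:
        return lookup(name).alpha_2
    except LookupError:
        # Unknown geography: leave it to the caller to decide what to do.
        return None
```

With pycountry installed, `to_alpha2("Italy")` should return `"IT"`. Subdivisions can be resolved in a similar way through `pycountry.subdivisions`, for example `pycountry.subdivisions.get(code="US-CA")`.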

Outputs

Raw data is available through several channels.

Snowflake Data Exchange

The COVID-19 data set is available on Snowflake Data Exchange. This data set is continuously refreshed.

The METADATA table provides column-level metadata about each table. Where no column is specified, the information pertains to the entire table.

S3 raw CSVs

Raw CSV files are available on AWS S3:

| Name | Source | S3 URI |
| --- | --- | --- |
| US COVID-19 testing and mortality | The COVID Tracking Project | s3://starschema.covid/CT_US_COVID_TESTS.csv |
| Global data on healthcare providers | OpenStreetMap, via Healthsites.io | s3://starschema.covid/HS_BULK_DATA.csv |
| Global case counts | JHU CSSE | s3://starschema.covid/JHU_COVID-19.csv |
| US healthcare capacity by state, 2018 | The Henry J. Kaiser Family Foundation | s3://starschema.covid/KFF_HCP_capacity.csv |
| US policy actions by state | The Henry J. Kaiser Family Foundation | s3://starschema.covid/KFF_US_POLICY_ACTIONS.csv |
| US actions to mitigate spread, by state | The Henry J. Kaiser Family Foundation | s3://starschema.covid/KFF_US_STATE_MITIGATIONS.csv |
| ICU beds by county, US | The Henry J. Kaiser Family Foundation | s3://starschema.covid/KFF_US_ICU_BEDS.csv |
| Italy case statistics, summary | Protezione Civile | s3://starschema.covid/PCM_DPS_COVID19.csv |
| Italy case statistics, detailed | Protezione Civile | s3://starschema.covid/PCM_DPS_COVID19-DETAILS.csv |
| WHO situation reports | World Health Organization | s3://starschema.covid/WHO_SITUATION_REPORTS.csv |
| US case and mortality counts, by county | The New York Times | s3://starschema.covid/NYT_US_COVID19.csv |
| COVID-19 cases and deaths, Canada, province level | ViriHealth | s3://starschema.covid/VH_CAN_DETAILED.csv |
| Travel restrictions by country | World Food Programme via HDX | s3://starschema.covid/HUM_RESTRICTIONS_COUNTRY |
| Travel restrictions by airline | World Food Programme via HDX | s3://starschema.covid/HUM_RESTRICTIONS_AIRLINE |
| ACAPS public health restriction data | ACAPS via HDX | s3://starschema.covid/HDX_ACAPS.csv |
| Detailed case counts by province, sex and age band, Belgium | Sciensano | s3://starschema.covid/SCS_BE_DETAILED_PROVINCE_CASE_COUNTS.csv |
| Detailed hospitalisations by type of hospital care, Belgium | Sciensano | s3://starschema.covid/SCS_BE_DETAILED_HOSPITALISATIONS.csv |
| Detailed mortality by region, sex and age band, Belgium | Sciensano | s3://starschema.covid/SCS_BE_DETAILED_MORTALITY.csv |
| Number of tests performed by day, Belgium | Sciensano | s3://starschema.covid/SCS_BE_DETAILED_TESTS.csv |
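
One possible way to read the first rows of one of these files from Python is sketched below. It makes several assumptions that this README does not confirm: that the bucket allows anonymous (unsigned) reads, that the files are UTF-8 encoded CSVs with a header row, and that the third-party boto3 package is installed. The helper names `split_s3_uri` and `load_rows` are hypothetical.

```python
import csv
import io
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def load_rows(uri: str, limit: int = 5) -> list:
    """Download a CSV object and return its first `limit` rows as dicts.

    Assumption: the bucket permits anonymous access, so requests are
    sent unsigned. If it does not, configure credentials instead.
    """
    # Imported lazily: boto3 and botocore are third-party packages.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    bucket, key = split_s3_uri(uri)
    client = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    body = client.get_object(Bucket=bucket, Key=key)["Body"].read()

    # Assumption: UTF-8 text with a header row.
    reader = csv.DictReader(io.StringIO(body.decode("utf-8")))
    return [row for _, row in zip(range(limit), reader)]
```

For example, `load_rows("s3://starschema.covid/JHU_COVID-19.csv")` would return up to five rows, provided the object is still published at that location.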

Tableau Web Data Connector

A Tableau Web Data Connector (WDC) is available for integrating the COVID-19 data set into your Tableau dashboards and analytical applications. It currently supports the JHU CSSE data set and the Italian case counts released by the Dipartimento della Protezione Civile. The reach of the WDC is being expanded; please check back for details.

Transformations

All applied transformation sets are documented in the Jupyter notebooks in the notebooks/ folder.
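
For readers unfamiliar with the term, the following minimal sketch shows what unpivoting a wide table into long form looks like. It is not the code from the notebooks: the function `unpivot`, the column names and the sample values are all made up for illustration, and the actual transformations may differ.

```python
def unpivot(rows, id_columns, var_name="date", value_name="cases"):
    """Turn wide rows (one column per date) into long rows.

    Every column not listed in `id_columns` becomes its own output row,
    with the column name stored under `var_name` and the cell value
    stored under `value_name`.
    """
    long_rows = []
    for row in rows:
        ids = {col: row[col] for col in id_columns}
        for col, value in row.items():
            if col in id_columns:
                continue
            long_rows.append({**ids, var_name: col, value_name: value})
    return long_rows


# Made-up sample data, not real case counts.
wide = [
    {"country": "IT", "2020-03-01": 10, "2020-03-02": 12},
    {"country": "BE", "2020-03-01": 3, "2020-03-02": 4},
]
long_rows = unpivot(wide, id_columns=["country"])
```

Each element of `long_rows` then holds a single observation, such as `{"country": "IT", "date": "2020-03-01", "cases": 10}`, which is the shape that blends most easily with other data sources.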

Credits

The original data flow was designed by Allan Walker for Mapbox in Alteryx.

Use and disclaimer

Use of this data source is subject to your implied acceptance of the following terms.

Data and transformations are provided 'as is', without any warranty or representation, express or implied, of correctness, usefulness or fitness to purpose. Starschema Inc. and its contributors disclaim all representations and warranties of any kind with respect to the data or code in this repository to the fullest extent permitted under applicable law.

The 2019 novel coronavirus (2019-nCoV)/COVID-19 outbreak is a rapidly evolving situation. Data may be out of date or incorrect due to reporting constraints. Before making healthcare or other personal decisions, please consult a physician licensed to practice in your jurisdiction and/or the website of the public health authorities in your jurisdiction, such as the CDC, Public Health England or Public Health Canada. Nothing in this repository is to be construed as medical advice.

Citation

To cite this work:

Foldi, T. and Csefalvay, K. 2019 Novel Coronavirus (2019-nCoV) and COVID-19 Unpivoted Data. Available on: https://github.com/starschema/COVID-19-data.
