This project aims to analyze the COVID-19 pandemic using publicly available data. The project includes a Jupyter notebook with Python code to extract, clean, and visualize COVID-19 data from various sources. Additionally, the project provides a dashboard to interactively explore the data.

Sannidhi-Shetty2/COVID-19-Analysis

COVID-19-Analysis

Introduction

COVID-19 is a respiratory illness caused by SARS-CoV-2 that was first identified in 2019 and spread into a global pandemic. Symptoms range from mild to severe, and the pandemic has affected all aspects of life. Preventive measures include vaccination, masks, and distancing. Research and collaboration are key to managing its spread.

Problem Aimed to Solve

This project aims to analyze the COVID-19 pandemic using publicly available data. The project includes a Jupyter notebook with Python code to extract, clean, and visualize COVID-19 data from various sources. Additionally, the project provides a dashboard to explore the data interactively.

User's Manual

Files/Folder and description:
  • Dataset Folder: state-wise and district-wise data in CSV format.
  • Python File: the .ipynb notebook containing the data extraction and data cleaning steps.
  • MySQL File: the .sql file containing the exploratory data analysis.

Analysis

o Maharashtra recorded the highest number of confirmed and deceased cases.

o October had the highest number of total confirmed cases.

o The highest vaccination rate was observed in Sikkim.

o The lowest vaccination rate was observed in Uttar Pradesh.

o Dadra and Nagar Haveli had the highest recovery rate.

o October also had the highest number of deceased cases.
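
Findings like these can be derived from a state-level summary table. The sketch below uses made-up numbers and placeholder state names, and the column names (`confirmed`, `recovered`, `deceased`, `vaccinated`, `population`) are hypothetical. It also assumes that recovery rate means recovered / confirmed and that vaccination rate means vaccinated / population; this README does not state the definitions the project actually used.

```python
import pandas as pd

# Illustrative figures only; these are not the project's data.
summary = pd.DataFrame(
    {
        "state": ["A", "B", "C"],
        "confirmed": [1000, 400, 50],
        "recovered": [900, 380, 49],
        "deceased": [30, 5, 1],
        "vaccinated": [600, 500, 90],
        "population": [2000, 1000, 100],
    }
).set_index("state")

# Assumed definitions of the two rates (see the note above).
summary["recovery_rate"] = summary["recovered"] / summary["confirmed"]
summary["vaccination_rate"] = summary["vaccinated"] / summary["population"]

# idxmax / idxmin return the index label (here, the state) of the extreme value.
most_confirmed = summary["confirmed"].idxmax()
highest_recovery = summary["recovery_rate"].idxmax()
highest_vaccination = summary["vaccination_rate"].idxmax()
lowest_vaccination = summary["vaccination_rate"].idxmin()
```

Indexing the table by state keeps each lookup to a single `idxmax` or `idxmin` call; the same pattern would apply to a month-level table for the October findings.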


Methodology

1. Imported the data from the API using the requests library.

2. Parsed the response with the json library, since the data was returned in JSON format.

3. Checked for null values and replaced them with zero, then checked for duplicates.

4. Started analysing the data using pandas functions such as groupby and sort_values.

5. Used nested 'for' loops to extract the relevant data from the nested dictionary.

6. Exported each state's data from the dataframe to CSV files and imported them into MySQL.

7. Aggregated the data by month and by week for each state.

8. Imported the aggregated data into Excel for further analysis.
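
The extraction, cleaning, and aggregation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The JSON layout (state, then `dates`, then a date, then `delta` with daily counts), the sample values, and the helper names `flatten_timeseries` and `aggregate_by_period` are all assumptions made for the example. The API call itself is replaced by an inline sample string so that the sketch runs without network access.

```python
import json

import pandas as pd

# Hypothetical payload standing in for the API response. In the real workflow
# this text would come from a requests.get(...) call against the project's
# API endpoint, which is not named in this README.
SAMPLE_JSON = """
{
  "MH": {"dates": {
    "2021-10-01": {"delta": {"confirmed": 100, "deceased": 2}},
    "2021-10-02": {"delta": {"confirmed": 130}},
    "2021-11-01": {"delta": {"confirmed": 150, "deceased": 3}}
  }},
  "SK": {"dates": {
    "2021-10-01": {"delta": {"confirmed": 10, "deceased": 0}}
  }}
}
"""


def flatten_timeseries(payload: dict) -> pd.DataFrame:
    """Walk the nested dictionary with nested for loops: one row per state and date."""
    rows = []
    for state, state_data in payload.items():
        for date, day_data in state_data.get("dates", {}).items():
            row = {"state": state, "date": date}
            row.update(day_data.get("delta", {}))
            rows.append(row)
    df = pd.DataFrame(rows)
    df["date"] = pd.to_datetime(df["date"])
    # Replace missing values with zero. Dropping duplicate rows is an
    # assumption; the methodology only says duplicates were checked.
    return df.fillna(0).drop_duplicates().reset_index(drop=True)


def aggregate_by_period(df: pd.DataFrame, freq: str) -> pd.DataFrame:
    """Sum the numeric columns per state and period ("M" = month, "W" = week)."""
    period = df["date"].dt.to_period(freq).rename("period")
    return df.groupby(["state", period]).sum(numeric_only=True).reset_index()


data = json.loads(SAMPLE_JSON)
daily = flatten_timeseries(data)
monthly = aggregate_by_period(daily, "M")
weekly = aggregate_by_period(daily, "W")
```

From here, a call such as `monthly.to_csv("monthly.csv", index=False)` (the file name is illustrative) would produce a file that can be loaded into MySQL or Excel, as in steps 6 and 8.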

Dashboard

Challenges and Learnings

  • JSON Data Extraction: Navigated nested JSON structures to extract relevant COVID-19 information.
  • Data Cleaning: Tackled missing values and inconsistencies in COVID-19 data for accurate analysis.
  • Code Optimization: Improved efficiency in processing and analyzing extensive COVID-19 datasets.
  • Domain Understanding: Gained insights into public health and epidemiology through COVID-19 data analysis.
  • Collaborative Workflow: Utilized version control and teamwork for successful project completion.

Conclusions

  1. The analysis focused on the weekly progression of COVID-19 cases, recoveries, deaths, and tests, providing valuable insights into the pandemic's impact across various regions and timeframes.
  2. Fluctuations in the number of cases and deaths were observed, underscoring the dynamic nature of the pandemic's effects in different geographical areas.
  3. Through effective data visualization using charts and graphs, the project facilitated a clearer understanding of the data, aiding in the interpretation of trends and patterns.
  4. The project's findings hold practical significance for public health authorities, enabling them to devise more targeted and efficient strategies for containing the virus's transmission.
  5. Policymakers can benefit from the analysis by making informed decisions on resource allocation, directing support to regions experiencing the highest impact from the pandemic.
