Saudisis/AQA

Air Quality Analysis (AQA) Library

Welcome to the Air Quality Analysis (AQA) Library, an integrated framework designed to automate the download, processing, and harmonization of environmental data from multiple sources.
This project enables the combination of ARPA Lombardia ground data, Sentinel-5P satellite observations, and ERA5 reanalysis variables to generate daily, spatially consistent air quality indicators.


Data Requirements

To execute this notebook successfully, you will need access to the following datasets:

  • ARPA Lombardia (Ground Sensors)
    Hourly pollutant concentration measurements (NO₂, SO₂, CO, O₃, PM10, PM2.5, etc.), including idsensore, lat, lng, provincia, and timestamp columns.

  • Sentinel-5P (Satellite)
    Atmospheric column densities for selected pollutants (NO2_column_number_density, O3_column_number_density, etc.) accessed through the Google Earth Engine API.

  • ERA5 (Reanalysis)
    Meteorological parameters (temperature, pressure, wind speed, boundary layer height, radiation, precipitation, etc.) obtained from CMCC or the Copernicus Climate Data Store (CDS).


Features

  • Automated download of ERA5, Sentinel-5P, and ARPA datasets.
  • Data cleaning and normalization for all input sources.
  • Temporal harmonization to 12:00–15:00 mean values.
  • Spatial merging of satellite, reanalysis, and sensor data.
  • Integrated summary table for pollutant concentrations and weather variables.
  • Visualization tools for pollutant maps and correlation analysis.
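The temporal harmonization step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `midday_mean`, the `value` column, and the choice to include readings stamped exactly 15:00 are all assumptions.

```python
import pandas as pd

def midday_mean(df, time_col="timestamp", value_col="value", key_cols=("idsensore",)):
    """Average the readings taken between 12:00 and 15:00 per key and calendar day."""
    df = df.copy()
    df[time_col] = pd.to_datetime(df[time_col])
    hours = df[time_col].dt.hour
    # Assumption: the window keeps hourly readings stamped 12:00, 13:00, 14:00 and 15:00.
    window = df[(hours >= 12) & (hours <= 15)]
    window = window.assign(date=window[time_col].dt.date)
    return window.groupby([*key_cols, "date"], as_index=False)[value_col].mean()

# Hypothetical hourly readings for one sensor; 11:00 and 16:00 fall outside the window.
hourly = pd.DataFrame({
    "idsensore": [1, 1, 1, 1, 1],
    "timestamp": [
        "2024-01-01 11:00", "2024-01-01 12:00", "2024-01-01 13:00",
        "2024-01-01 15:00", "2024-01-01 16:00",
    ],
    "value": [100.0, 10.0, 20.0, 30.0, 200.0],
})
daily = midday_mean(hourly)
```

Here `daily` holds one row per sensor and day, so sources with different native time steps can be compared on the same daily footing.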

Installation

Option 1 — Conda

On Windows, use environment.yml. On macOS, use nobuilds.yml instead and substitute that file name in the command below.

conda env create -f environment.yml
conda activate AQA_DayRange

Option 2 — Pip

python -m venv .venv
.venv\Scripts\activate   # Windows
source .venv/bin/activate  # macOS/Linux
pip install -r requirements.txt

Code

All functions and the full workflow are implemented in a single Jupyter notebook:

AQA_DayRange.ipynb

This notebook contains:

  • ARPA data access and cleaning.
  • ERA5 and Sentinel-5P integration.
  • Spatial grid generation and interpolation.
  • Computation of current (curr_) and previous (prev_) daily means.
  • Export of harmonized results to CSV and visualization of pollutant maps.
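One plausible way to build the curr_ and prev_ columns is sketched below. It assumes that prev_ refers to the previous calendar day, which the README does not state explicitly; the helper name `add_prev_day` and the `no2` column are hypothetical.

```python
import pandas as pd

def add_prev_day(daily, value_cols, key_col="idsensore", date_col="date"):
    """Attach the previous calendar day's means as prev_<col> next to curr_<col>."""
    daily = daily.copy()
    daily[date_col] = pd.to_datetime(daily[date_col])
    curr = daily.rename(columns={c: f"curr_{c}" for c in value_cols})
    prev = daily[[key_col, date_col, *value_cols]].rename(
        columns={c: f"prev_{c}" for c in value_cols}
    )
    # Move each row one day forward so it lines up with the following day's record.
    prev[date_col] = prev[date_col] + pd.Timedelta(days=1)
    return curr.merge(prev, on=[key_col, date_col], how="left")

# Hypothetical daily means with a gap on 2024-01-03.
daily_means = pd.DataFrame({
    "idsensore": [1, 1, 1],
    "date": ["2024-01-01", "2024-01-02", "2024-01-04"],
    "no2": [10.0, 20.0, 40.0],
})
result = add_prev_day(daily_means, ["no2"])
```

Joining on a shifted date, rather than shifting rows, leaves prev_ empty when the preceding day is missing instead of silently borrowing an older value.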

Workflow Overview

1. ARPA Data Processing

Loads ARPA Lombardia datasets using the API and organizes metadata for sensors, pollutants, and coordinates.

Example:

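import pandas as pd
import requests
meta = pd.read_csv(meta_url).dropna(subset=["idsensore", "lat", "lng"])  # sensor metadata with valid coordinates
data = requests.get(data_url, headers=headers).json()  # measurement records as JSON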
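A cleaning step for the downloaded records might look like the sketch below. The field names `data` (timestamp) and `valore` (value), and the use of -9999 as an invalid-reading marker, are assumptions about the ARPA records and should be checked against the actual dataset; `clean_arpa` is a hypothetical helper.

```python
import pandas as pd

def clean_arpa(records, meta):
    """Turn raw JSON records into a typed table joined with sensor metadata."""
    df = pd.DataFrame.from_records(records)
    df["valore"] = pd.to_numeric(df["valore"], errors="coerce")
    df["data"] = pd.to_datetime(df["data"], errors="coerce")
    # Assumption: -9999 marks an invalid reading; unparseable rows are dropped too.
    df = df[df["valore"] != -9999].dropna(subset=["valore", "data"])
    # Compare sensor ids as strings so JSON text ids match numeric ids from the CSV.
    df = df.assign(idsensore=df["idsensore"].astype(str))
    meta = meta.assign(idsensore=meta["idsensore"].astype(str))
    return df.merge(meta, on="idsensore", how="inner")

# Hypothetical records: one valid, one flagged invalid, one from an unknown sensor.
records = [
    {"idsensore": "100", "data": "2024-01-01T12:00:00.000", "valore": "41.5"},
    {"idsensore": "100", "data": "2024-01-01T13:00:00.000", "valore": "-9999"},
    {"idsensore": "999", "data": "2024-01-01T12:00:00.000", "valore": "12.0"},
]
sensors = pd.DataFrame({"idsensore": [100], "lat": [45.46], "lng": [9.19]})
clean = clean_arpa(records, sensors)
```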

2. ERA5 Variable Extraction

Downloads meteorological variables (temperature, pressure, wind speed, radiation, boundary layer height, etc.) using the CMCC or CDS API and converts them to a harmonized format.

Example:

import cdsapi
c = cdsapi.Client()
c.retrieve('reanalysis-era5-single-levels', {...}, 'era5_data.nc')
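The conversion to a harmonized format is not detailed here. As a rough sketch, assuming the download contains the standard ERA5 fields `u10`, `v10` (10 m wind components in m/s), `t2m` (2 m temperature in K), and `sp` (surface pressure in Pa), the unit handling could look like this. The function name and output keys are hypothetical.

```python
import math

def harmonize_era5(u10, v10, t2m_kelvin, sp_pa):
    """Convert raw ERA5 values at one grid cell into commonly used units."""
    return {
        # Wind speed is the magnitude of the (u, v) wind vector.
        "wind_speed_ms": math.hypot(u10, v10),
        # Kelvin to degrees Celsius.
        "temperature_c": t2m_kelvin - 273.15,
        # Pascal to hectopascal.
        "pressure_hpa": sp_pa / 100.0,
    }

sample = harmonize_era5(u10=3.0, v10=4.0, t2m_kelvin=293.15, sp_pa=101325.0)
```

The same arithmetic applies element-wise to xarray or NumPy arrays once the NetCDF file is opened.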

3. Sentinel-5P Pollutant Integration

Retrieves pollutant column density data (e.g., NO₂, O₃, CO, SO₂) via Google Earth Engine and scales values for consistency with ground units.

Example:

pollutants = {
    "no2": {"collection": "COPERNICUS/S5P/OFFL/L3_NO2", "band": "NO2_column_number_density"}
}
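The exact scaling applied in the notebook is not stated here. The Sentinel-5P column density bands are expressed in mol/m², so the sketch below shows two common unit conversions only; it does not turn a column density into a ground-level concentration. The function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

def mol_m2_to_umol_m2(column_density):
    """Convert a column density from mol/m^2 to micromol/m^2."""
    return column_density * 1e6

def mol_m2_to_molecules_cm2(column_density):
    """Convert a column density from mol/m^2 to molecules/cm^2 (1 m^2 = 1e4 cm^2)."""
    return column_density * AVOGADRO / 1e4

# Hypothetical NO2 column value in mol/m^2.
no2_column = 1e-4
no2_umol = mol_m2_to_umol_m2(no2_column)
no2_molecules = mol_m2_to_molecules_cm2(no2_column)
```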

4. Spatial Harmonization and Merging

Combines all datasets (ARPA, ERA5, Sentinel) through coordinate matching and averaging based on the AOI grid.

Example:

summary = pd.merge(ground_df, sentinel_df, on=["Latitude", "Longitude"])
summary = pd.merge(summary, era5_df, on=["Latitude", "Longitude"])
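Merging on raw coordinates only works when every source reports identical floating-point values. A hedged alternative, assuming a regular grid with a hypothetical 0.1° step (the AOI grid resolution is not given here), is to snap each point to an integer cell index and merge on that instead. The helper `snap_to_grid` and the sample columns are illustrative.

```python
import pandas as pd

def snap_to_grid(df, step=0.1, lat_col="Latitude", lon_col="Longitude"):
    """Add integer grid-cell indices so nearby points share the same merge key."""
    out = df.copy()
    for col in (lat_col, lon_col):
        # Integer indices avoid comparing floating-point coordinates for equality.
        out[col + "_cell"] = (out[col] / step).round().astype("int64")
    return out

# Hypothetical ground station and two satellite grid cells.
ground = pd.DataFrame({"Latitude": [45.4642], "Longitude": [9.19], "no2_ground": [38.0]})
sentinel = pd.DataFrame({
    "Latitude": [45.5, 45.6],
    "Longitude": [9.2, 9.2],
    "no2_column": [1.1e-4, 0.9e-4],
})
cells = ["Latitude_cell", "Longitude_cell"]
summary = snap_to_grid(ground).merge(
    snap_to_grid(sentinel).drop(columns=["Latitude", "Longitude"]),
    on=cells,
    how="left",
)
```

A left join keeps every ground station even when no satellite cell matches, which makes coverage gaps visible as missing values.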

Outputs

  • CSV results: results/ARPA_ERA_SP5-<date>.csv
  • Summary tables: harmonized pollutant and meteorological data

Repository Structure

AQA/
│
├── AQA_DayRange.ipynb        # Main analysis notebook
├── environment.yml           # Conda environment for Windows
├── nobuilds.yml              # Conda environment for macOS
├── requirements.txt          # pip dependencies
└── README.md                 # Project documentation

Technologies Used

  • Python (pandas, geopandas, numpy, matplotlib, xarray, requests)
  • Google Earth Engine API
  • Copernicus CDS / CMCC ERA5
  • ARPA Lombardia Open Data API
  • GeoPandas + Matplotlib for geospatial analysis

Testing

The data pipeline has been validated across multiple pollutants and date ranges.


License

This project is licensed under the MIT License.
See the LICENSE file for details.


Author

Claudia Isabela Saud-Miño
Politecnico di Milano — Environmental & Geoinformatics Research
📧 Contact via GitHub

About

An approach for integrating air quality data from multiple sources, including Sentinel-5P satellite data, ARPA ground sensors, and ERA5.
