GEEO is a processing pipeline and collection of algorithms for obtaining Analysis-Ready-Data (ARD) from Landsat and Sentinel-2 using the Google Earth Engine Python API.

Geographical and Ecological Earth Observation (geeo)

GEEO is a processing pipeline and collection of algorithms that uses the Google Earth Engine (GEE) Python API for creating Analysis-Ready-Data (ARD) from optical imagery, including the Landsat and Sentinel-2 archives. The package is structured along hierarchical levels, emphasizing a standardized, reproducible and efficient workflow that covers image preprocessing, harmonization, and spatial organisation through to advanced feature generation and time series analyses. Processing instructions are defined using a .yml-file or Python dictionary, facilitating stand-alone use or interactive integration into processing workflows using GEE.

GEEO includes processing routines frequently applied in geographical and ecological studies that use satellite remote sensing. The selection of routines is primarily influenced by work conducted in the Biogeography Lab and Earth Observation Lab at Humboldt University of Berlin. Inspiration for the modular structure and parameter file communication comes from David Frantz' Framework for Operational Radiometric Correction for Environmental monitoring (FORCE), a highly advanced all-in-one processing engine for medium-resolution Earth Observation image archives that runs on your own computing infrastructure.


Access and installation

You need access to Google Earth Engine and its Python API. The following sections describe how to set up the latter in a local Python environment or in a Jupyter Notebook hosted on Google Colab.

Local Python environment

Make sure you have a Python 3 distribution of your choice installed (e.g., Miniconda). Before installing GEEO, preferably set up a new environment and install the package's dependencies using conda (alternatively, the dependencies are installed automatically when installing GEEO with pip):

conda create -n geeo_env -c conda-forge python=3.12 ipykernel earthengine-api pyyaml pandas geopandas matplotlib tqdm ipyleaflet ipywidgets gdal scikit-learn eerepr geemap
conda activate geeo_env

Once created and activated, you can directly install the package from GitHub using pip:

pip install git+https://github.com/leonsnill/geeo.git

Google Colab

You can also directly get started using Google Colab for hosting a Jupyter Notebook.

In a new .ipynb-notebook, simply install GEEO in the first code chunk like so:

!pip install git+https://github.com/leonsnill/geeo.git

Import after installation

Import, authenticate and initialize the Earth Engine Python API, then import GEEO. Make sure to pass the name of a Google Cloud project that has the Earth Engine API enabled when initializing.

# Google cloud project name with Earth Engine API enabled
gcloud_project_name = ''

import ee
ee.Authenticate()
ee.Initialize(project=gcloud_project_name)
import geeo

Documentation and examples

The docs folder contains the documentation of the parameter settings that instruct the core processing chain of geeo, along with their descriptions. To become familiar with the module design and handling, the docs folder also contains example Jupyter Notebooks on how to instruct certain processing chains, visualize outputs and request exports. You can also open them in Colab:

  • Introducing geeo: Open in Colab
  • Worked example on habitat mapping: Open in Colab

Additional tutorials

  • Spatial tiling and metadata: Open in Colab
  • Auxiliary exports: Open in Colab

More tutorials are in development and the list will be updated in the future.

Quick use overview

Import, authenticate and initialize the Earth Engine Python API, then import GEEO:

gcloud_project_name = ''
import ee
ee.Authenticate()
ee.Initialize(project=gcloud_project_name)
import geeo

The settings for the main processing chain of geeo can be defined using either a .yml-file or a Python dictionary.

Option 1) Parameter file

# create new .yml file to set instructions
geeo.create_parameter_file('new_param_file')

# open parameter file in editor, set instructions, and save.

# then simply run instructions:
run = geeo.run_param('new_param_file.yml')

The run variable (dictionary) will contain all parameter settings and newly calculated products. If exports are requested, they will be printed to console and run in the Earth Engine Task Manager.

Option 2) Dictionary

GEEO also accepts processing instructions directly as a Python dictionary:

# specify key-value pair matching parameter names
# all non-specified parameters will be set to default settings from parameter template
param_dict = {
    'YEAR_MIN': 1985,
    'YEAR_MAX': 1990,
    # ...
}
# run
run = geeo.run_param(param_dict)
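
To make the fallback to default settings more concrete, here is a minimal sketch of how user-specified keys could override a template of defaults. It does not reproduce geeo's internals: the DEFAULTS values and the merge_with_defaults helper are hypothetical and only illustrate the idea, and rejecting unknown keys is a choice made for this sketch, not documented geeo behaviour.

```python
# Hypothetical defaults for illustration only; geeo's real parameter template
# defines many more parameters and may use different default values.
DEFAULTS = {
    'YEAR_MIN': 2023,
    'YEAR_MAX': 2023,
    'SENSORS': ['L8', 'L9'],
    'FEATURES': ['NDVI'],
    'EXPORT_IMAGE': False,
}


def merge_with_defaults(user_params, defaults=DEFAULTS):
    """Return a new dict in which user-specified values override the defaults."""
    # Assumption of this sketch: unknown keys are treated as typos and rejected.
    unknown = set(user_params) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    # Later entries win, so user values replace defaults with the same key.
    return {**defaults, **user_params}


params = merge_with_defaults({'YEAR_MIN': 1985, 'YEAR_MAX': 1990})
print(params['YEAR_MIN'], params['YEAR_MAX'], params['SENSORS'])
```

Only the two year parameters change in this sketch; every other key keeps its template value, which is why short dictionaries are sufficient for many tasks.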

Let's use the dictionary approach to illustrate a basic workflow. Consider the simple task of calculating Spectral-Temporal-Metrics (STMs) of the NDVI from Landsat for three seasons over the greater Berlin area in 2024. STMs are a commonly applied form of statistical reduction in remote sensing, in which pixel-wise statistics of individual bands/features (-> NDVI) are calculated for a given set of imagery (-> Landsat) over time (-> Mar-May, Jun-Aug, Sep-Nov 2024). We only edit the parameters that we actually need for calculating the desired output:

# -----------------------------------------------------------------
# Option 2) python dictionary of user-settings

param_dict = {
    
    'YEAR_MIN': 2024,
    'YEAR_MAX': 2024,
    'ROI': [13.07, 52.37, 13.78, 52.64],  # Berlin simplified to bounding box coordinates
    'SENSORS': ['L8', 'L9'],  # Landsat-8 and Landsat-9 Collection 2 Surface Reflectance Tier 1 data
    'FEATURES': ['NDVI'],  # we only want the NDVI
    
    'STM': ['p10', 'p50', 'p90', 'stdDev'],  # reducer metrics: 10th, 50th (median) and 90th percentiles, standard deviation
    'FOLD_CUSTOM': {'month': ['3-5', '6-8', '9-11']},  # spring, summer, autumn sub-windows
    'STM_FOLDING': True,  # apply sub-windows to STM calculation
    
    'EXPORT_IMAGE': False,  # setting to export image products in general
    'EXPORT_STM': False  # setting to export STM products in particular (if EXPORT_IMAGE was True!)

}
# all remaining settings will be set to default values from blueprint!

# run the instructions
run = geeo.run_param(param_dict)

# get STM ee.ImageCollection (collection of ee.Image subwindows)
stm = run.get('STM')
print('STM bands: ', stm.first().bandNames().getInfo())
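
The month sub-windows passed to FOLD_CUSTOM above can be read as inclusive month ranges. The sketch below shows that reading in plain Python, without Earth Engine. The helper names parse_month_window and assign_window and the acquisition dates are hypothetical, and the sketch assumes ranges that do not wrap around the year end; how geeo parses these strings internally is not shown here.

```python
from datetime import date


def parse_month_window(window):
    """Turn a string such as '3-5' into the list of months [3, 4, 5]."""
    start, end = (int(part) for part in window.split('-'))
    return list(range(start, end + 1))


def assign_window(day, windows):
    """Return the first window string that contains the month of `day`, else None."""
    for window in windows:
        if day.month in parse_month_window(window):
            return window
    return None


windows = ['3-5', '6-8', '9-11']  # spring, summer, autumn
# Hypothetical acquisition dates for illustration
acquisitions = [date(2024, 1, 15), date(2024, 4, 2), date(2024, 7, 19), date(2024, 10, 30)]

for day in acquisitions:
    print(day.isoformat(), '->', assign_window(day, windows))
```

Under this reading, observations outside all windows (here the January date) do not contribute to any of the seasonal STMs.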

We left many parameters at their default values for illustrative purposes. This includes export settings regarding resampling, projection and metadata, as well as quality masking settings. They are all set to reasonable and, where possible, common/generic values (e.g. cloud masking). Nevertheless, being explicit about these settings in your actual implementations is highly recommended!

In essence, these few lines of code do the following:

  • Combine Landsat-8 and Landsat-9 Collection 2 Tier 1 Surface Reflectance collections into one collection, including scaling the bands to actual reflectance values (0-100)
  • Apply quality masks to 'invalid' pixels (cloud, cloud shadow, snow/ice, fill, dilated clouds)
  • Calculate the NDVI
  • Calculate the percentiles and standard deviation for three temporal subwindows for 2024
  • Export the result to Drive/Asset (in theory; here we set it to False to avoid triggering unnecessary processing)
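
The masking, NDVI and reduction steps listed above can be sketched for a single pixel in plain Python. This is a conceptual illustration only, not geeo's or Earth Engine's implementation: the observations are made up, the percentile helper uses linear interpolation, and the standard deviation is the population form; Earth Engine's reducers may differ in both respects.

```python
import statistics


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def percentile(values, q):
    """Percentile (q in 0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


# Hypothetical observations of one pixel: (month, red, nir, is_clear)
observations = [
    (3, 0.08, 0.30, True),
    (4, 0.06, 0.40, True),
    (5, 0.30, 0.32, False),  # flagged as cloud, so it is masked out
    (5, 0.05, 0.45, True),
    (7, 0.04, 0.50, True),   # summer, outside the spring window
]

# Steps 2-3: keep clear observations of the spring window (Mar-May), compute NDVI
clear = [ndvi(nir, red) for month, red, nir, ok in observations
         if ok and 3 <= month <= 5]

# Step 4: reduce the NDVI time series to spectral-temporal metrics
stm = {
    'p10': percentile(clear, 10),
    'p50': percentile(clear, 50),
    'p90': percentile(clear, 90),
    'stdDev': statistics.pstdev(clear),
}
print({name: round(value, 3) for name, value in stm.items()})
```

In the actual workflow the same reduction runs per pixel and per sub-window on the Earth Engine servers, which is why the result is a collection of images rather than a single dictionary.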
