A tool for benchmarking and hyper-parameter optimization using TerraTorch.
Leverages MLflow for experiment logging, Optuna for hyperparameter optimization, and Ray for parallelization.
We recommend using Python 3.10, 3.11, or 3.12, and using a virtual environment for all commands in this guide.
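For example, a virtual environment can be created and activated with Python's built-in `venv` module (the directory name `.venv` below is just a convention):

```
python3 -m venv .venv
source .venv/bin/activate
```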
Install from PyPI:

```
pip install terratorch-iterate
```

Alternatively, install from source:

```
git clone https://github.com/IBM/terratorch-iterate.git
cd terratorch-iterate
pip install --upgrade pip setuptools wheel
pip install -e .
pip install -e ".[dev]"
pip install -e ".[test]"
pip install -e ".[utilities]"
```

This tool allows you to design a benchmark test for a backbone that exists in TerraTorch over:
- Several tasks
- Several hyperparameter configurations
To do this, it relies on a configuration file where the benchmark is defined. This consists of:
- `experiment_name`: MLflow experiment to run the benchmark on. This is the highest-level grouping of runs in MLflow.
- `run_name`: Name of the parent (top-level) run under the experiment. NOTE: This item should not be included in the config if you wish to use the parameters extraction function in `mlflow_utils` to compile results.
- `defaults`: Defaults that are set for all tasks. Can be overridden under each task.
- `tasks`: List of tasks to perform. Tasks specify parameters for the decoder, the datamodule to be used, and training parameters.
- `n_trials`: Number of trials to be carried out per task, in the case of hyperparameter tuning.
- `save_models`: Whether to save models. Defaults to False. (Setting this to true can take up a lot of space.) Models will be logged as artifacts for each run in MLflow.
- `optimization_space`: Hyperparameter space to search over. Bayesian optimization tends to work well with a small number of hyperparameters.
- `run_repetitions`: Number of repetitions to be run using the best settings from optimization.
- `storage_uri`: Location to use for MLflow storage for the hyperparameter optimization (HPO) stage. During optimization, additional folders will be created in the parent directory of `storage_uri`. For example, if `storage_uri` is `/opt/benchmark_experiments/hpo`, additional folders will include:

```
/opt/benchmark_experiments/
├── hpo_results
├── hpo_repeated_exp
├── repeated_exp_output_csv
├── job_logs
└── optuna_db
```
Please see `configs/benchmark_v2_template.yaml` in the git repo for an example.
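As a rough illustration, a benchmark config might look like the sketch below. The top-level keys are the ones described above, but the nested contents of `defaults`, `tasks`, and `optimization_space` are placeholders, not the real schema; treat `configs/benchmark_v2_template.yaml` as the authoritative reference.

```yaml
# Illustrative sketch only -- the nested structure is hypothetical;
# see configs/benchmark_v2_template.yaml for the real layout.
experiment_name: my_backbone_benchmark
# run_name is omitted so the mlflow_utils parameters extraction can be used
defaults:
  # settings shared by all tasks; can be overridden per task
tasks:
  # one entry per task, each specifying decoder, datamodule and training parameters
n_trials: 16          # HPO trials per task
save_models: false    # true logs model artifacts per run (can use a lot of space)
optimization_space:
  # hyperparameters to search over; keep this small for Bayesian optimization
run_repetitions: 5    # repetitions using the best settings from optimization
storage_uri: /opt/benchmark_experiments/hpo
```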
Besides the `--config` argument, terratorch-iterate also has two other arguments:
- If users include the `--hpo` argument, terratorch-iterate will optimize hyperparameters. Otherwise, it will rerun the best experiment.
- If users include the `--repeat` argument, terratorch-iterate will repeat the best experiment. Otherwise, terratorch-iterate will not rerun any experiment.
If users want to optimize hyperparameters:

```
terratorch iterate --hpo --config <config-file>
```

Another way to run terratorch-iterate is to omit `terratorch`:

```
iterate --hpo --config <config-file>
```

For instance:

```
iterate --hpo --config configs/dofa_large_patch16_224_upernetdecoder_true_modified.yaml
```

If users want to rerun the best experiment, please use the same config file. Additionally, the `parent_run_id`, which is the MLflow run id from optimization, should be added as shown below:

```
iterate --repeat --config <config-file> --parent_run_id <mlflow run_id from hpo>
```

For instance:

```
iterate --repeat --config configs/dofa_large_patch16_224_upernetdecoder_true_modified.yaml --parent_run_id 61bdee4a35a94f988ad30c46c87d4fbd
```

If users want to optimize hyperparameters and then rerun the best experiment in a single command, please use both arguments as shown below:

```
iterate --hpo --repeat --config <config-file>
```

For instance:

```
iterate --hpo --repeat --config configs/dofa_large_patch16_224_upernetdecoder_true_modified.yaml
```

To check the experiment results, use:

```
mlflow ui --host $(hostname -f) --port <port> --backend-store-uri <storage_uri>
```
Summarizing results and hyperparameters of multiple experiments relies on a configuration file where the experiments and tasks are defined. This consists of:
- `list_of_experiment_names`: List of MLflow experiment names which have been completed.
- `task_names`: List of tasks found in each experiment in `list_of_experiment_names`.
- `run_repetitions`: Number of repetitions to be expected for each experiment. This should be the same as the value used in each experiment config file during optimization.
- `storage_uri`: Location to use for MLflow storage for the hyperparameter optimization (HPO) stage. This should be the same value used as `storage_uri` in each experiment config file during optimization (see above).
- `benchmark_name`: String used to name the resulting CSV file.
See `configs/summarize_results_template.yaml` in the git repo for an example.
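A minimal sketch of such a config, with placeholder experiment and task names (the key names follow the list above; see `configs/summarize_results_template.yaml` for the actual template):

```yaml
# Placeholder values for illustration only
list_of_experiment_names:
  - my_backbone_benchmark
  - another_backbone_benchmark
task_names:
  - my_segmentation_task
  - my_classification_task
run_repetitions: 5                           # must match the experiment configs
storage_uri: /opt/benchmark_experiments/hpo  # must match the experiment configs
benchmark_name: my_benchmark                 # names the resulting csv file
```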
To summarize results and hyperparameters, please run the following:
```
iterate --summarize --config <summarize-config-file>
```

For instance:

```
iterate --summarize --config configs/summarize_results.yaml
```

The results and hyperparameters are extracted into a CSV file. For example, if `storage_uri` is `/opt/benchmark_experiments/hpo`, then the summarized results will be saved in the last file shown below:

```
/opt/benchmark_experiments/
├── hpo_results
├── hpo_repeated_exp
├── repeated_exp_output_csv
├── job_logs
├── optuna_db
└── summarized_results/
    └── <benchmark_name>/
        └── results_and_parameters.csv
```
You can also parallelize your runs over a Ray cluster.
Check out the instructions in the docs.
terratorch-iterate provides a utility to convert terratorch config files into a single terratorch-iterate config file. Follow these steps:
- Go to `<$TERRATORCH-ITERATE-HOME>/benchmark/config_util`.
- Copy all terratorch config files that you want to be converted into a new directory. Note that all YAML files within this directory will be parsed to generate the terratorch-iterate file, so make sure that only YAML files that you want to be parsed are located in this directory.
- Run the `build_geobench_configs.py` script, specifying the following input parameters:
  - `input_dir`: Full path to the directory that contains all terratorch config YAML files
  - `output_dir`: Full path to the directory that will store the generated files
  - `template`: Full or relative path to the template file
  - `prefix`: Prefix for the generated file names; e.g., if prefix is set to `my_prefix`, then a generated filename could be `my_prefix_Clay.yaml`
For instance:

```
python3 build_geobench_configs.py --input_dir /Users/john/terratorch/examples/confs/geobenchv2_detection --output_dir /Users/john/terratorch-iterate/benchmark/config_util --template geobenchv2_template.yaml --prefix my_prefix_
```