
From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations

An End-to-End Framework for Interpretable Ecological Inference (ECAI 2025 Workshop)

🔬 Species recognition → Causal analysis of habitat preferences → Human-readable ecological explanations

Paper: arXiv:2506.10559

Contents

  • Overview
  • Code Structure
  • Getting Started
  • Citation
  • Acknowledgements

Overview

[Figure: overview of the BioX framework]

BioX is an ecological AI pipeline that transforms species images into causal habitat insights. It comprises six stages:

  1. 🔍 Species recognition from images (via BioCLIP)
  2. 🌍 Presence & background point collection (via GBIF + spatial sampling)
  3. 🌦️ Bioclimatic variable extraction (BIO1–BIO19 from WorldClim)
  4. 🔗 Causal graph discovery among climate variables (via NOTEARS)
  5. 📈 Causal inference on species presence and climate variables (via DoWhy)
  6. 📖 Explanation generation in natural language (rule-based or LLMs)

Code Structure

  • run.py: Run the full pipeline from image to explanation.
  • config.py: Configuration for paths, model params, etc.
  • modules/: All core modules
    • species_recognition.py: Run BioCLIP on input images
    • data_collection.py: Get GBIF data (gbif_extractor.py), background points (background_generator.py), and BIO variables (bioclim_extractor.py).
    • notears_bios.py: Learn causal structure among BIOs
    • bio2p_dowhy.py: Perform causal inference with DoWhy
    • explain.py: Generate human-readable habitat explanations
  • utils/logger_setup.py: Unified logging.
  • test/: Just put input image(s) here.
  • requirements.txt: Python dependencies
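The modules above are orchestrated end to end by run.py. As a rough sketch of that control flow (every per-module function name below is an illustrative assumption, not the repository's actual API):

# Hypothetical sketch of run.py's control flow; the imported names are
# assumptions for illustration, not the repository's actual API.
from modules.species_recognition import recognize_species            # assumed name
from modules.data_collection import collect_points, extract_bioclim  # assumed names
from modules.notears_bios import learn_causal_graph                  # assumed name
from modules.bio2p_dowhy import estimate_effects                     # assumed name
from modules.explain import generate_explanation                     # assumed name

def pipeline(image_path):
    species = recognize_species(image_path)        # 1. BioCLIP species ID
    points = collect_points(species)               # 2. GBIF presence + background points
    table = extract_bioclim(points)                # 3. BIO1-BIO19 values per point
    graph = learn_causal_graph(table)              # 4. NOTEARS structure over BIOs
    effects = estimate_effects(table, graph)       # 5. DoWhy effects on presence
    return generate_explanation(species, effects)  # 6. rule-based or LLM narrative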

Getting Started

0. System Requirements (Example Setup)

  • Microsoft Windows 11
  • Python version: 3.10
  • PyTorch version: 2.7.0 (Official Website)
  • CUDA version: 11.8
  • GPU: NVIDIA RTX A5000

1. Installation

Clone the repo and create a virtual environment:

git clone https://github.com/Yutong-Zhou-cv/BioX.git
cd BioX
conda env create -f environment.yaml
conda activate gta
pip install -r requirements.txt
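To confirm the environment roughly matches the example setup above, you can optionally verify the installed PyTorch build and CUDA availability:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"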

2. Prepare Image(s)

Place your species image(s) in the test/ folder. Supported formats: .jpg, .jpeg, .png, .bmp.
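For reference, a minimal sketch of how such an input scan could work (the repository's actual discovery logic may differ):

from pathlib import Path

SUPPORTED = {".jpg", ".jpeg", ".png", ".bmp"}
image_paths = sorted(p for p in Path("test").iterdir()
                     if p.is_file() and p.suffix.lower() in SUPPORTED)
print(f"Found {len(image_paths)} input image(s) in test/")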

3. External Data Dependencies

This program requires several external datasets. If automatic download fails due to network restrictions, please download them manually as described below.

3.1 Natural Earth Land Mask

Download the Natural Earth land mask (ne_10m_land.zip) from: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_land.zip

Unzip the files and confirm ne_10m_land.shp is placed under temp/.

3.2 WorldClim BIO Variables

Download the BIO variables (wc2.1_2.5m_bio.zip) from: https://geodata.ucdavis.edu/climate/worldclim/2_1/base/

Extract the downloaded archive so that all BIO GeoTIFF layers sit under temp/wc2.1_2.5m_bio/.
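If you would rather script the manual step, here is a minimal sketch that fetches and unpacks both archives into the expected temp/ layout (the full WorldClim URL is assumed by appending the archive name to the directory listed above):

import urllib.request
import zipfile
from pathlib import Path

def fetch_and_unzip(url, dest_dir):
    """Download an archive (if not already present) and extract it into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)

# Natural Earth land mask -> temp/ne_10m_land.shp (plus sidecar files)
fetch_and_unzip("https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_land.zip", "temp")
# WorldClim BIO variables -> temp/wc2.1_2.5m_bio/*.tif (URL assumed from the archive name)
fetch_and_unzip("https://geodata.ucdavis.edu/climate/worldclim/2_1/base/wc2.1_2.5m_bio.zip", "temp/wc2.1_2.5m_bio")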

4. Explanation Modes

In config.py, set the explanation mode:

EXPLAIN_PARMS = {
    "explanation_mode": "llm",  # "llm" (uses a local LLM via Ollama) or "rule" (template-based)
    "llm_model": "llama3.3:70b"
}

"llm": Use LLM (e.g., LLaMA3) to generate rich, biologically grounded narratives.

"rule": Use concise, template-based explanations for speed or offline use.

If you want the paper-style narratives, use "llm" and set up Ollama as follows.

5. Ollama Setup Instructions

5.1 Install and start Ollama

Download from https://ollama.com/download and verify installation:

ollama --version

Linux (optional, background service):

sudo systemctl enable --now ollama
systemctl status ollama

5.2 Download an LLM

The default in config.py is llama3.3:70b (very large). If your machine is resource-limited, pull a smaller model and update llm_model accordingly.

# Large (example from config)
ollama pull llama3.3:70b
# Smaller alternatives (pick one you can run)
ollama pull llama3.2:3b
ollama pull llama3.1:8b

Quick test (should print a response):

ollama run llama3.2:3b "hello"
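Under the hood, the "llm" mode talks to Ollama's local REST API (http://127.0.0.1:11434). A minimal sketch of such a call follows; the prompt and response handling are illustrative, not the repository's actual code:

import json
import urllib.request

def ollama_generate(prompt, model="llama3.2:3b",
                    url="http://127.0.0.1:11434/api/generate"):
    """POST a non-streaming generation request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=300) as r:
        return json.loads(r.read().decode("utf-8"))["response"]  # generated text

print(ollama_generate("Summarize the habitat preferences of Quercus robur in two sentences."))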

5.3 [Optional] Fallback if Ollama is unavailable

If you want the pipeline to fall back automatically from "llm" to "rule" when Ollama isn't reachable, add the following to your explanation module (e.g., in explain.py):

import json
import socket
import urllib.request

def _ollama_available(host="127.0.0.1", port=11434,
                      http_url="http://127.0.0.1:11434/api/tags", timeout=1.5):
    """Return True only if the Ollama server accepts TCP and answers its tags API."""
    try:
        # Fast TCP probe first, then a real HTTP request to /api/tags.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        with urllib.request.urlopen(http_url, timeout=timeout) as r:
            if r.status == 200:
                json.loads(r.read().decode("utf-8"))  # valid JSON => healthy server
                return True
    except Exception:
        return False
    return False

def pick_explanation_mode(explain_parms):
    """Resolve the configured mode, downgrading "llm" to "rule" if Ollama is down."""
    mode = explain_parms.get("explanation_mode", "rule")
    if mode == "llm" and not _ollama_available():
        print("[Explain] Ollama not detected; falling back to rule-based explanations.")
        return "rule"
    return mode
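To wire the fallback in, resolve the mode once at startup and pass it downstream (assuming EXPLAIN_PARMS is importable from config.py):

from config import EXPLAIN_PARMS

mode = pick_explanation_mode(EXPLAIN_PARMS)  # "llm", or "rule" if Ollama is unreachable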

6. Testing

python run.py

Citation

If you use BioX in your research, please cite:

Zhou, Y. & Ryo, M. (2025). From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations.
arXiv:2506.10559

@article{zhou2025images,
  title={From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations},
  author={Zhou, Yutong and Ryo, Masahiro},
  journal={arXiv preprint arXiv:2506.10559},
  year={2025}
}

Acknowledgements

"If I have seen further, it is by standing on the shoulders of giants." – Isaac Newton

This project builds upon a rich ecosystem of open ecological, statistical, and machine-learning tools. We are deeply grateful to the scientific community for these foundational contributions:

  • BioCLIP: A Vision Foundation Model for the Tree of Life, Samuel Stevens et al. [Paper] [Project]
@inproceedings{stevens2024bioclip,
  title={BioCLIP: A vision foundation model for the tree of life},
  author={Stevens, Samuel and Wu, Jiaman and Thompson, Matthew J and Campolongo, Elizabeth G and Song, Chan Hee and Carlyn, David Edward and Dong, Li and Dahdul, Wasila M and Stewart, Charles and Berger-Wolf, Tanya and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={19412--19424},
  year={2024}
}
  • GBIF | Global Biodiversity Information Facility [Website]
  • WorldClim | Global climate and weather data [Website]
  • Natural Earth | Land [Website]
  • DAGs with NO TEARS: Continuous Optimization for Structure Learning, Xun Zheng et al. [Paper] [Code]
@inproceedings{zheng2018dags,
  author={Zheng, Xun and Aragam, Bryon and Ravikumar, Pradeep and Xing, Eric P.},
  booktitle={Advances in Neural Information Processing Systems},
  title={DAGs with NO TEARS: Continuous Optimization for Structure Learning},
  year={2018}
}
  • DoWhy: An End-to-End Library for Causal Inference, Amit Sharma and Emre Kiciman. [Paper] [Code]
@misc{sharma2020dowhyendtoendlibrarycausal,
  title={DoWhy: An End-to-End Library for Causal Inference},
  author={Amit Sharma and Emre Kiciman},
  year={2020},
  eprint={2011.04216},
  archivePrefix={arXiv},
  primaryClass={stat.ME},
  url={https://arxiv.org/abs/2011.04216},
}
