From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations
An End-to-End Framework for Interpretable Ecological Inference
Species recognition → Causal analysis of habitat preferences → Human-readable ecological explanations
- Our paper has been accepted at the AISE 2025 workshop of the European Conference on Artificial Intelligence (ECAI 2025).
BioX is an ecological AI pipeline that transforms species images into causal habitat insights. It contains:
- Species recognition from images (via BioCLIP)
- Presence & background point collection (via GBIF + spatial sampling)
- Bioclimatic variable extraction (BIO1–BIO19 from WorldClim)
- Causal graph discovery among climate variables (via NOTEARS)
- Causal inference on species presence and climate variables (via DoWhy; see the sketch after this list)
- Explanation generation in natural language (rule-based or LLMs)
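To make the causal steps concrete, here is a minimal, self-contained DoWhy sketch on synthetic data. The two BIO variables, the hand-written GML graph (BioX learns this structure with NOTEARS instead), and the `backdoor.linear_regression` estimator are illustrative assumptions, not the pipeline's actual configuration:

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Synthetic stand-in for the pipeline's point data: presence/background points
# with two bioclimatic covariates (the real pipeline uses BIO1-BIO19).
rng = np.random.default_rng(0)
n = 500
bio1 = rng.normal(10, 5, n)                  # annual mean temperature
bio12 = rng.normal(800, 200, n) + 20 * bio1  # annual precipitation, partly driven by BIO1
presence = rng.binomial(1, 1 / (1 + np.exp(-(0.3 * bio1 + 0.002 * bio12 - 5))))
df = pd.DataFrame({"BIO1": bio1, "BIO12": bio12, "presence": presence})

# Hand-written causal graph in GML; in BioX this structure comes from NOTEARS.
graph = """graph [directed 1
  node [id "BIO1" label "BIO1"]
  node [id "BIO12" label "BIO12"]
  node [id "presence" label "presence"]
  edge [source "BIO1" target "BIO12"]
  edge [source "BIO1" target "presence"]
  edge [source "BIO12" target "presence"]
]"""

model = CausalModel(data=df, treatment="BIO1", outcome="presence", graph=graph)
estimand = model.identify_effect()
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print(f"Estimated effect of BIO1 on presence: {estimate.value:.4f}")
```

The same pattern extends to all 19 BIO variables once the causal graph has been learned.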
- `run.py`: Run the full pipeline from image to explanation.
- `config.py`: Configuration for paths, model parameters, etc.
- `modules/`: All core modules
  - `species_recognition.py`: Run BioCLIP on input images
  - `data_collection.py`: Get GBIF data (`gbif_extractor.py`), background points (`background_generator.py`), and BIO variables (`bioclim_extractor.py`)
  - `notears_bios.py`: Learn the causal structure among BIO variables
  - `bio2p_dowhy.py`: Perform causal inference with DoWhy
  - `explain.py`: Generate human-readable habitat explanations
- `utils/logger_setup.py`: Unified logging
- `test/`: Put input image(s) here
- `requirements.txt`: Python dependencies
- Microsoft Windows 11
- Python version: 3.10
- PyTorch version: 2.7.0 (Official Website)
- CUDA version: 11.8
- GPU: NVIDIA RTX A5000
Clone the repo and create a virtual environment:
```bash
git clone https://github.com/Yutong-Zhou-cv/BioX.git
cd BioX
conda env create -f environment.yaml
conda activate gta
pip install -r requirements.txt
```

Place your species image(s) in the `test/` folder.
Supported formats: "jpg", "png", "jpeg", "bmp".
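To check your inputs quickly, here is a tiny sketch (a hypothetical helper; `run.py` performs its own discovery) that lists the images the pipeline would pick up:

```python
from pathlib import Path

# Collect input images from test/ by extension.
SUPPORTED = {".jpg", ".jpeg", ".png", ".bmp"}
images = sorted(p for p in Path("test").iterdir()
                if p.is_file() and p.suffix.lower() in SUPPORTED)
print(f"Found {len(images)} image(s):", *images, sep="\n")
```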
This program requires several external datasets. If automatic download fails due to network restrictions, please download them manually as described below.
Download the Natural Earth land mask (ne_10m_land.zip) from: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_land.zip
Unzip the archive and confirm that `ne_10m_land.shp` is placed directly under `temp/`.
Download the BIO variables (wc2.1_2.5m_bio.zip) from: https://geodata.ucdavis.edu/climate/worldclim/2_1/base/
Download the archive and extract all of its files under `temp/wc2.1_2.5m_bio/`.
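If you prefer to script this manual step, here is a small stdlib-only sketch mirroring the two downloads above (the `fetch_and_unzip` helper is hypothetical, not part of the repo):

```python
import urllib.request
import zipfile
from pathlib import Path

def fetch_and_unzip(url: str, dest: Path) -> None:
    """Download a zip archive (if not already present) and extract it into dest."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)

# Natural Earth land mask: ne_10m_land.shp should end up directly under temp/.
fetch_and_unzip(
    "https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_land.zip",
    Path("temp"),
)
# WorldClim BIO variables: extracted under temp/wc2.1_2.5m_bio/.
fetch_and_unzip(
    "https://geodata.ucdavis.edu/climate/worldclim/2_1/base/wc2.1_2.5m_bio.zip",
    Path("temp/wc2.1_2.5m_bio"),
)
```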
In config.py, set the explanation mode:
```python
EXPLAIN_PARMS = {
    "explanation_mode": "llm",  # "llm" (uses a local LLM via Ollama) or "rule" (template-based)
    "llm_model": "llama3.3:70b",
}
```

- `"llm"`: Use an LLM (e.g., Llama 3) to generate rich, biologically grounded narratives (sketched below).
- `"rule"`: Use concise, template-based explanations for speed or offline use.
If you want the paper-style narratives, use "llm" and set up Ollama as follows.
Download from https://ollama.com/download and verify installation:
```bash
ollama --version
```

Linux only (optional, to run Ollama as a background service):

```bash
sudo systemctl enable --now ollama
systemctl status ollama
```

The default model in `config.py` is `llama3.3:70b` (very large). If your machine is resource-limited, pull a smaller model and update `llm_model` accordingly:
```bash
# Large (example from config)
ollama pull llama3.3:70b

# Smaller alternatives (pick one you can run)
ollama pull llama3.2:3b
ollama pull llama3.1:8b
```

Quick test (should print a response):

```bash
ollama run llama3.2:3b "hello"
```

If you want the pipeline to fall back automatically from "llm" to "rule" when Ollama isn't reachable, add the following to your explanation module (e.g., in `explain.py`):
```python
import json
import socket
import urllib.request


def _ollama_available(host="127.0.0.1", port=11434,
                      http_url="http://127.0.0.1:11434/api/tags", timeout=1.5):
    """Return True if a local Ollama server is reachable and serving its API."""
    try:
        # Cheap TCP probe first, then confirm the HTTP API answers with valid JSON.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        with urllib.request.urlopen(http_url, timeout=timeout) as r:
            if r.status == 200:
                json.loads(r.read().decode("utf-8"))
                return True
    except Exception:
        return False
    return False


def pick_explanation_mode(explain_parms):
    """Resolve the configured mode, falling back to "rule" when Ollama is down."""
    mode = explain_parms.get("explanation_mode", "rule")
    if mode == "llm" and not _ollama_available():
        print("[Explain] Ollama not detected; falling back to rule-based explanations.")
        return "rule"
    return mode
```

Call `pick_explanation_mode(EXPLAIN_PARMS)` wherever the mode is read, then run the full pipeline:

```bash
python run.py
```

If you use BioX in your research, please cite:
Zhou, Y. & Ryo, M. (2025). From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations.
arXiv:2506.10559
```bibtex
@article{zhou2025images,
  title={From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations},
  author={Zhou, Yutong and Ryo, Masahiro},
  journal={arXiv preprint arXiv:2506.10559},
  year={2025}
}
```

"If I have seen further, it is by standing on the shoulders of giants." — Isaac Newton
This project builds upon a rich ecosystem of open ecological, statistical, and machine-learning tools. We are deeply grateful to the scientific community for these foundational contributions:
```bibtex
@inproceedings{stevens2024bioclip,
  title={BioCLIP: A vision foundation model for the tree of life},
  author={Stevens, Samuel and Wu, Jiaman and Thompson, Matthew J and Campolongo, Elizabeth G and Song, Chan Hee and Carlyn, David Edward and Dong, Li and Dahdul, Wasila M and Stewart, Charles and Berger-Wolf, Tanya and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={19412--19424},
  year={2024}
}
```

- GBIF | Global Biodiversity Information Facility [Website]
- WorldClim | Global climate and weather data [Website]
- Natural Earth | Land [Website]
- DAGs with NO TEARS: Continuous Optimization for Structure Learning, Xun Zheng et al. [Paper] [Code]
```bibtex
@inproceedings{zheng2018dags,
  author={Zheng, Xun and Aragam, Bryon and Ravikumar, Pradeep and Xing, Eric P.},
  booktitle={Advances in Neural Information Processing Systems},
  title={DAGs with NO TEARS: Continuous Optimization for Structure Learning},
  year={2018}
}
```

```bibtex
@misc{sharma2020dowhyendtoendlibrarycausal,
  title={DoWhy: An End-to-End Library for Causal Inference},
  author={Amit Sharma and Emre Kiciman},
  year={2020},
  eprint={2011.04216},
  archivePrefix={arXiv},
  primaryClass={stat.ME},
  url={https://arxiv.org/abs/2011.04216},
}
```