Global Nature Watch Agent

Language Interface for Maps & WRI/LCL data APIs.

Project overview

The core of this project is an LLM-powered agent that drives the conversations for Global Nature Watch. The project is fully open source and can be run locally, provided you have the keys needed to access the external services.

Agent

Our agent is a simple ReAct agent implemented in LangGraph. At a high level, its tools do the following:

- Provide information about its capabilities
- Retrieve areas of interest
- Select appropriate datasets
- Retrieve statistics from the WRI analytics API
- Generate insights, including charts, from the data

The LLM is pluggable; we rely mostly on Sonnet and Gemini for planning and tool calling.
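To make the ReAct pattern concrete, here is a minimal sketch of the reason/act/observe loop in plain Python. It is not the project's LangGraph implementation: the tool names, the `scripted_model` stand-in for the LLM, and `run_agent` are all hypothetical and exist only to illustrate the control flow.

```python
from typing import Callable

# Hypothetical tools; the real agent's tool names and signatures may differ.
def get_capabilities(_: str) -> str:
    return "I can find areas, select datasets, fetch statistics and chart them."

def pick_aoi(query: str) -> str:
    return f"AOI matched for '{query}'"

TOOLS: dict[str, Callable[[str], str]] = {
    "get_capabilities": get_capabilities,
    "pick_aoi": pick_aoi,
}

def scripted_model(history: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM: returns (action, argument) based on the history."""
    if not any(entry.startswith("observation:") for entry in history):
        return ("pick_aoi", "Para, Brazil")
    return ("final", "Found the area; ready to select datasets.")

def run_agent(question: str, model=scripted_model, max_steps: int = 5) -> str:
    """Alternate between model decisions and tool calls until a final answer."""
    history = [f"user: {question}"]
    for _ in range(max_steps):
        action, argument = model(history)
        if action == "final":
            return argument
        observation = TOOLS[action](argument)  # act, then record the observation
        history.append(f"observation: {observation}")
    return "Stopped: step limit reached."

print(run_agent("How much forest was lost in Para?"))
```

Swapping the LLM then amounts to replacing the `model` callable, which is the sense in which the model is pluggable in this sketch.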

For detailed technical architecture, see Agent Architecture Documentation.

Infrastructure

The agent relies on a set of services that are deployed alongside it:

- eoAPI to provide access to the LCL data in a STAC catalog and to serve tiles
- Langfuse for tracing agent interactions
- PostgreSQL for the API data and geographic search of AOIs
- FastAPI deployment for the API

All these services are managed and deployed through our deploy repository, project-zeno-deploy.

Frontend

The frontend application for this project is a Next.js project that can be found at project-zeno-next.

Evals

We use an evaluation framework for end-to-end testing of the agent against the deployed API. The framework can be found in the gnw-evals repository.

For adding eval cases and running evals locally, see tests/evals/README.md.

STAC

We have a set of scripts to ingest STAC data into the eoAPI deployment. The ingestion code for STAC can be found in the gnw-stac repository.

Versioning

This project uses Calendar Versioning (CalVer) with the format YYYY.M.D.

Version bumps happen automatically via the bump-calver pre-commit hook. When you commit on a branch that is at the same version as main, the hook updates pyproject.toml to today's CalVer date and stages the change (causing the commit to fail once so you re-run it with the bumped version included). If two commits land on the same day, a build suffix is appended (YYYY.M.D.1, YYYY.M.D.2, …).
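The bump rule described above can be sketched as a small function. This is an illustration of the versioning scheme only, not the code of the bump-calver hook; the function name `next_calver` and its inputs are assumptions made for the example.

```python
from datetime import date

def next_calver(current: str, today: date) -> str:
    """Return the next YYYY.M.D version, adding a build suffix for same-day bumps."""
    base = f"{today.year}.{today.month}.{today.day}"  # no zero padding
    parts = current.split(".")
    if ".".join(parts[:3]) != base:
        # The current version is from another day: use today's date.
        return base
    # Same day: increment the build suffix (a missing suffix counts as 0).
    build = int(parts[3]) if len(parts) > 3 else 0
    return f"{base}.{build + 1}"

print(next_calver("2025.1.5", date(2025, 1, 6)))    # 2025.1.6
print(next_calver("2025.1.6", date(2025, 1, 6)))    # 2025.1.6.1
print(next_calver("2025.1.6.1", date(2025, 1, 6)))  # 2025.1.6.2
```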

The CI workflow (calver-bump.yml) runs on pull requests and verifies that the version was already bumped by the hook — it does not commit to main. No manual version changes are needed. The current version is always readable at the /api/v1/version endpoint.

Dependencies

uv
postgresql (for using local DB instead of docker)
docker

Development Setup

There are two ways to run the project locally:

Option 1 (Host-based): Run infrastructure in Docker, API on the host via uv/make — best for active development with fast iteration.
Option 2 (Full Docker): Run everything in containers via docker-compose.yaml — closer to the production environment.

Option 1: Host-based Development

We use uv for package management and docker-compose for running the system locally.

Clone and setup:

git clone git@github.com:wri/project-zeno.git
cd project-zeno
uv sync
source .venv/bin/activate

Environment configuration:

cp .env.example .env
# Edit .env with your API keys and credentials

Build dataset RAG database:

Our agent uses a RAG database to select datasets. The RAG database can be built locally using
```
uv run python src/ingest/embed_datasets.py
```
As an alternative, the current production table can also be retrieved from S3 if you have the corresponding access permissions.
```
aws s3 sync s3://zeno-static-data/ data/
```

Start infrastructure services:

make up       # Start Docker services (PostgreSQL, test DB, migrations, Langfuse via docker-compose.dev.yaml)

Ingest data (required after starting database):

After starting the database and infrastructure services, you need to ingest the required datasets. Feel free to run all or just the ones you need.

This downloads ~2 GB of data per dataset except for WDPA which is ~10 GB. It's ok to skip WDPA if you don't need it.

Make sure you're set up with WRI AWS credentials in your .env file to access the S3 bucket.
```
uv run python src/ingest/ingest_gadm.py
uv run python src/ingest/ingest_kba.py
uv run python src/ingest/ingest_landmark.py
uv run python src/ingest/ingest_wdpa.py
```
See src/ingest/ directory for details on each ingestion script.

Start application services:

make api      # Run API locally (port 8000)

Or start everything at once (after data ingestion):

make dev      # Starts API (requires infrastructure already running)

Set up local Langfuse:

Langfuse traces every agent run, tool call, and LLM interaction — useful for debugging the full agent flow. It is included in make up — no separate setup is required.

a. Access the Langfuse UI at http://localhost:3001 and log in with the pre-seeded credentials:
- Email: admin@example.com
- Password: Password123!
b. Update your .env with the pre-configured keys:
```
LANGFUSE_HOST=http://localhost:3001
LANGFUSE_PUBLIC_KEY=zeno-public-key-123
LANGFUSE_SECRET_KEY=zeno-secret-key-123
LANGFUSE_TRACING_ENABLED=true
```
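If traces do not show up, a quick check that the four variables above are actually set can help. The helper below is a hypothetical convenience and not part of the project; it only inspects a mapping such as `os.environ`.

```python
import os

# Variable names taken from the .env snippet above.
REQUIRED_LANGFUSE_VARS = (
    "LANGFUSE_HOST",
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_TRACING_ENABLED",
)

def missing_langfuse_vars(env) -> list[str]:
    """Return the names of required Langfuse variables that are unset or empty."""
    return [name for name in REQUIRED_LANGFUSE_VARS if not env.get(name)]

missing = missing_langfuse_vars(os.environ)
print("Missing Langfuse settings:", missing or "none")
```

Note that this reads the process environment, so the values in .env must already have been exported or loaded for the check to see them.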
Access the application:
- API: http://localhost:8000
- Langfuse: http://localhost:3001

Local PostgreSQL (optional)

By default make up starts a PostgreSQL container. If you prefer to use a local PostgreSQL instance instead:

createuser -s postgres # if you don't have a postgres user
createdb -U postgres zeno-data-local
# Run migrations from the db directory (alembic reads DATABASE_URL from env; same URL as .env is fine)
cd db && DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/zeno-data-local uv run alembic upgrade head && cd ..

# Check if you have the database running
psql zeno-data-local

# Check if you have the tables created
\dt

# Output
#               List of relations
#  Schema |      Name       | Type  |  Owner
# --------+-----------------+-------+----------
#  public | alembic_version | table | postgres
#  public | threads         | table | postgres
#  public | users           | table | postgres

Then add the database URL to your .env:

DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/zeno-data-local
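The URL packs several settings into one string. As a rough aid for checking it, the sketch below splits it into its parts with the standard library; `describe_database_url` is a hypothetical helper, not something the project provides.

```python
from urllib.parse import urlsplit

def describe_database_url(url: str) -> dict:
    """Split a SQLAlchemy-style URL into dialect, driver, host, port, user and database."""
    parts = urlsplit(url)
    # The scheme has the form "dialect+driver", e.g. "postgresql+asyncpg".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }

info = describe_database_url(
    "postgresql+asyncpg://postgres:postgres@localhost:5432/zeno-data-local"
)
print(info)
```

The host, port, and database name printed here should match the instance and database created in the steps above.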

Option 2: Full Dockerized Development

Runs all services — API, PostgreSQL, Langfuse, and ClickHouse — in containers using docker-compose.yaml.

Environment configuration:
```
cp .env.example .env
# Edit .env with your API keys and credentials
```
Langfuse is pre-seeded and tracing is enabled automatically, so no .env changes are needed.

Start all services:
```
docker compose up -d
```

Ingest data (required after starting database):

Exec into the API container and run the ingestion scripts:

docker compose exec api bash
uv run python src/ingest/ingest_gadm.py
uv run python src/ingest/ingest_kba.py
uv run python src/ingest/ingest_landmark.py
uv run python src/ingest/ingest_wdpa.py
exit

See src/ingest/ directory for details on each ingestion script.

Verify Langfuse:

Langfuse traces every agent run, tool call, and LLM interaction — useful for debugging the full agent flow. Access the UI at http://localhost:3001 and log in with the pre-seeded credentials:
- Email: admin@example.com
- Password: Password123!
Note: The API container communicates with Langfuse on port 3000 (internal service name langfuse-web:3000). Port 3001 is only for browser access to the Langfuse UI.

Access the application:
- API: http://localhost:8000
- Langfuse: http://localhost:3001

Development Commands

make help     # Show all available commands
make up       # Start Docker infrastructure
make down     # Stop Docker infrastructure
make api      # Run API with hot reload
make dev      # Start full development environment

Testing

API Tests

Running make up brings up a zeno-data_test database (in docker-compose.dev.yaml) used by pytest. Set TEST_DATABASE_URL in .env (e.g. postgresql+asyncpg://postgres:postgres@localhost:5434/zeno-data_test for the dev test container). You can also create the test database manually:

createuser -s postgres # if you don't have a postgres user
createdb -U postgres zeno-data_test

Then run the API tests using pytest:

uv run pytest tests/api/

CI (GitHub Actions)

Lint (pre-commit/uv) runs on every push. Pytest runs on every pull request with a Postgres service. tests/tools runs only when the PR has the run-tools-tests label or you start the “Tools Tests” workflow manually (it embeds datasets first).

CLI User Management

For user administration commands (making users admin), see CLI Documentation.

Environment Files

.env - Base configuration (production settings)

The system automatically loads .env.
