Multi-agent LLM system for spatial constraint analysis using geospatial data and RAG-based regulatory knowledge.
This repository is the results of a collaboration between the Hogeschool Utrecht (University of Applied Science Utrecht) and the Province of Utrecht. It contains the result of a feasibility study in which we tried to gain knowledge on how well current AI system (LLMs specifically) are able to perform spatial planning and reasoning tasks. As such it houses geospatial utilities and an agent-based workflow for analyzing spatial constraints (e.g., wind turbine placement restrictions) within the Province of Utrecht, Netherlands. The system combines:
- RAG (Retrieval-Augmented Generation) for retrieving relevant legal/regulatory constraints
- Multi-agent orchestration (reasoning + validation agents) powered by Gemini 2.5 Pro
- Geospatial processing with GML/GPKG data (EPSG:28992 RD New coordinate system <-- Dutch coordinate system)
For questions and/or comments, see Support at the bottom of the page.
- Agent-based reasoning workflow: Sequential pipeline of LLM agents (reasoning → validation → optional extensions)
- Geospatial constraint mapping: Query and process Dutch PDOK spatial datasets (Top10NL, administrative boundaries)
- RAG-powered legal analysis: FAISS vector search over regulatory documents for spatial planning
- GML/GPKG utilities: CRS-aware polygon processing (WGS84 ↔ RD New conversion)
- Automated geometry operations: Buffered cuts, polygon simplification, multi-layer GPKG output
.
├── agent_toolkit.py # Agent orchestration (ReasoningAgent, ValidationAgent, AgentOrchestrator)
├── agent_prompts.py # LLM system prompts for agents
├── agent_workflow.ipynb # Main workflow: RAG query → agent analysis → geometry processing
├── RAG_setup.py # Build FAISS index from PDF documents (one-time setup)
├── utils.py # Core geospatial helpers (CRS transforms, GML I/O, polygon ops)
├── requirements.txt # Python dependencies
├── data/ # Data directory (not in repo, see Setup)
│ └── spatial_genai_storage/
│ ├── data_RAG/ # PDF source documents for RAG
│ ├── database_RAG/ # FAISS index + metadata
│ └── data_PDOK/ # GPKG files (e.g., Top10NL)
├── archive/ # Historical code examples from previous attempts
│ ├── code/ # Sample scripts (WFS fetching, data entry)
│ └── results/ # Archived boundary files
-
System Dependencies:
- GDAL (required for GML/GPKG I/O):
# Ubuntu/Debian sudo apt-get install gdal-bin libgdal-dev # macOS brew install gdal
- Python 3.10+
- GDAL (required for GML/GPKG I/O):
-
API Keys:
- OpenAI API key (for embeddings:
text-embedding-3-large) - Google Gemini API key (for agent reasoning:
gemini-2.5-pro)
- OpenAI API key (for embeddings:
# Clone repository
git clone https://github.com/uashogeschoolutrecht/SpatialGenAI.git
cd SpatialGenAI
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env and add:
# OPENAI_API_KEY=your_key_here
# GOOGLE_API_KEY=your_key_hereOur data was stored in data/spatial_genai_storage/:
data_RAG/: PDF documents with legal/regulatory textdata_PDOK/: GeoPackage files (we used TOP10NL Geopackage - top10nl_Compleet-2024.gpkg , see e.g., PDOK for lots of free public resources through convenient API access)database_RAG/: Will be created by RAG setup
python RAG_setup.py \
--data-dir data/spatial_genai_storage/data_RAG \
--index-dir data/spatial_genai_storage/database_RAG \
--query "windenergie restricties" # Optional test queryCreates FAISS index from PDFs for semantic search over regulatory documents.
Open and execute agent_workflow.ipynb:
jupyter notebook agent_workflow.ipynbWorkflow steps:
- Query RAG database for relevant legal constraints (e.g., wind turbine restrictions)
- Reasoning agent analyzes constraints → proposes spatial filters (table/column/value)
- Validation agent reviews proposal → provides feedback
- Iterate for N rounds with feedback loop
- Apply filters to GeoPackage data → cut geometries from base polygon
- Export results to multi-layer GPKG with categorized constraints
Example output:
utrecht_cut_with_categories.gpkg: Remaining suitable area + cut layers per constraint category
-
ReasoningAgent (
agent_toolkit.py):- Analyzes thematic objects (e.g., "windturbine") + legal context
- Outputs structured JSON:
[{tabel, kolom, waarde, categorie, reden}, ...] - Refines proposals based on validation feedback
-
ValidationAgent (
agent_toolkit.py):- Reviews reasoning agent's output for completeness/correctness
- Returns:
{approved: bool, comments: [...], issues: [...]} - Drives feedback loop for iterative refinement
-
AgentOrchestrator:
- Sequential pipeline:
agents=[reasoning, validation, ...] - Manages multi-round execution with feedback propagation
- Saves round results to JSON for audit trail
- Sequential pipeline:
Add custom agents to pipeline:
from agent_toolkit import Agent, AgentOrchestrator
class OptimizationAgent(Agent):
def build_prompts(self, context):
# Access previous agents' outputs
reasoning_output = context.get("filters")
return system_prompt, user_prompt
def parse_response(self, raw):
return parse_llm_json(raw)
# 3-agent pipeline
orchestrator = AgentOrchestrator(agents=[
reasoning_agent,
validation_agent,
OptimizationAgent(name="optimizer", system_prompt_template=PROMPT3)
])
results = orchestrator.run_rounds(context, num_rounds=2)OPENAI_API_KEY=sk-... # For embeddings (text-embedding-3-large)
GOOGLE_API_KEY=AIza... # For Gemini 2.5 Pro (agent reasoning)NUM_ROUNDS = 2 # Agent feedback loop iterations
EMBEDDER = "text-embedding-3-large"
EMB_DIM = 3072 # Embedding dimensions
BBOX_UTRECHT_PROV = (109311, 430032, 169326, 479261) # RD coordinatesInput:
- Thematic object: "windturbine" (160m hub height, 162m rotor diameter, 241m tip height)
- Base polygon: Province of Utrecht
- Query: "Legal restrictions for wind turbine placement"
Process:
- RAG retrieves 30 relevant document chunks (laws, policies, spatial plans)
- Reasoning agent identifies constraints:
{tabel: "top10nl_gebouw_vlak", kolom: "typebouw", waarde: "woning", categorie: "harde belemmering"}{tabel: "top10nl_waterdeel_vlak", kolom: "typewater", waarde: "meer", categorie: "complexe belemmering"}- ... (more filters)
- Validation agent reviews, requests clarifications
- Round 2: Reasoning agent refines based on feedback
- Geometry processing: Cut buildings, water, roads from base polygon
- Output: Multi-layer GPKG with remaining suitable areas + cut zones by category
# Test agent orchestrator (mock agents, no LLM calls)
python test_multi_agent.py
# Verify geospatial imports
python -c "import geopandas; import faiss; import openai; print('✓ All imports OK')"
# Test CRS transforms
python -c "from utils import wgs84_to_rd, rd_to_wgs84; print('✓ Utils imported')"For questions about this repository, contact:
- Fabian Kok [fabian.kok@hu.nl], Hogeschool Utrecht, main developer
- Rob Peters, Provincie Utrecht, project owner
- GeoPandas Documentation
- PDOK Services (Dutch geospatial data)
- FAISS Documentation
- Google Gemini API