A Rust-based service that tracks rain gauge readings from the Maricopa County Flood Control District (MCFCD) and provides a RESTful API for querying rainfall data by rain year or calendar year.
- Automated Data Collection: Periodically fetches rain gauge data from the MCFCD website
- Postgres Storage: Stores readings with automatic deduplication
- REST API: Query rainfall by rain year (Oct 1 - Sep 30) or calendar year
- Kubernetes Ready: Includes deployment manifests for K8s clusters
- Production Ready: Built with Axum, SQLx, and Tokio for performance and reliability
All endpoints are prefixed with /api/v1.
GET /api/v1/health
Returns service health status.
GET /api/v1/readings/{gauge_id}/water-year/{year}
Returns all readings for a specific gauge for a water year (Oct 1 of year-1 through Sep 30 of year).
Example: GET /api/v1/readings/59700/water-year/2025 returns readings for gauge 59700 from Oct 1, 2024 to Sep 30, 2025.
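A minimal client sketch in Rust (using reqwest and serde) might look like the following; the bare-array response shape and the field names here are assumptions, not the service's documented schema:

```rust
use serde::Deserialize;

// Illustrative response item; real field names may differ.
#[derive(Deserialize, Debug)]
struct Reading {
    reading_time: String,
    rainfall_inches: f64,
}

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let url = "http://localhost:8080/api/v1/readings/59700/water-year/2025";
    let readings: Vec<Reading> = reqwest::get(url).await?.json().await?;
    println!("{} readings in water year 2025", readings.len());
    Ok(())
}
```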
GET /api/v1/readings/{gauge_id}/calendar-year/{year}
Returns all readings for a specific gauge for a calendar year (Jan 1 through Dec 31).
Example: GET /api/v1/readings/59700/calendar-year/2025 returns readings for gauge 59700 from Jan 1, 2025 to Dec 31, 2025.
GET /api/v1/readings/{gauge_id}/latest
Returns the most recent reading for a specific gauge.
Example: GET /api/v1/readings/59700/latest returns the latest reading for gauge 59700.
GET /api/v1/gauges?page=1&page_size=50
Returns a paginated list of all rain gauges with their latest rainfall data.
Query parameters:
- page (optional): Page number (default: 1)
- page_size (optional): Number of items per page (default: 50, max: 100)
Example: GET /api/v1/gauges?page=1&page_size=25
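Server-side, the documented defaults and the 100-item cap could be modeled roughly as below; the struct and field names are illustrative, not the service's actual types:

```rust
use serde::Deserialize;

/// Query parameters for /api/v1/gauges; a sketch mirroring the docs above.
#[derive(Deserialize)]
struct Pagination {
    #[serde(default = "default_page")]
    page: u32,
    #[serde(default = "default_page_size")]
    page_size: u32,
}

fn default_page() -> u32 { 1 }
fn default_page_size() -> u32 { 50 }

impl Pagination {
    /// Enforce the documented bounds: page >= 1, page_size <= 100.
    fn clamped(self) -> Self {
        Self { page: self.page.max(1), page_size: self.page_size.clamp(1, 100) }
    }
}

fn main() {
    let p = Pagination { page: 0, page_size: 500 }.clamped();
    assert_eq!((p.page, p.page_size), (1, 100));
}
```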
GET /api/v1/gauges/{station_id}
Returns detailed information for a specific gauge by its station ID.
Example: GET /api/v1/gauges/59700 returns data for gauge 59700.
The service uses environment variables for configuration. Copy the example file and customize:
cp .env.example .env

Important: Edit DATABASE_URL in .env based on your setup:

- For docker-compose: use postgres as host (default in example)
- For local development: change to localhost
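A rough sketch of what environment-based loading looks like with dotenvy; BIND_ADDR is a hypothetical variable used for illustration, only DATABASE_URL is confirmed above:

```rust
use std::env;

struct Config {
    database_url: String,
    bind_addr: String,
}

fn load_config() -> Config {
    // Load .env if present; real environment variables take precedence.
    dotenvy::dotenv().ok();
    Config {
        database_url: env::var("DATABASE_URL").expect("DATABASE_URL must be set"),
        // Hypothetical variable with a fallback default.
        bind_addr: env::var("BIND_ADDR").unwrap_or_else(|_| "0.0.0.0:8080".into()),
    }
}

fn main() {
    let cfg = load_config();
    println!("db configured: {}", !cfg.database_url.is_empty());
    println!("listening on {}", cfg.bind_addr);
}
```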
Important: Before running with Docker, you need to generate SQLx metadata once:
# 1. Copy environment file
cp .env.example .env
# Edit .env if needed to customize settings
# 2. First time setup - requires local PostgreSQL for SQLx preparation
createdb rain_tracker
./prepare-sqlx.sh
# 3. Now you can use docker-compose
make docker-up
make docker-logs
# Or directly with docker-compose
docker-compose up -d
docker-compose logs -f app
# Stop
make docker-down
# or: docker-compose down

The service will be available at http://localhost:8080.
- Rust 1.75+
- PostgreSQL 14+ (PostgreSQL 18 recommended)
createdb rain_tracker

# Run the helper script to set up SQLx offline mode
./prepare-sqlx.sh

SQLx uses compile-time query verification. You have two options:
Option 1: With database connection (recommended for development)
export DATABASE_URL=postgres://postgres:password@localhost:5432/rain_tracker
cargo build

Option 2: Offline mode (for CI/CD without database)
# First time: Generate query metadata (requires DATABASE_URL)
cargo sqlx prepare
# Then build without database
cargo build
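As a sketch of what the compile-time check buys you (assuming a readings table whose column names here are hypothetical), a typo in the SQL below fails `cargo build` rather than surfacing at runtime:

```rust
use sqlx::PgPool;

struct Reading {
    station_id: String,
    rainfall_inches: f64,
}

/// Fetch the newest reading for one station.
async fn latest(pool: &PgPool, station: &str) -> sqlx::Result<Option<Reading>> {
    // query_as! is validated against DATABASE_URL (or the .sqlx metadata
    // in offline mode) while compiling; misspell a column and the build fails.
    sqlx::query_as!(
        Reading,
        "SELECT station_id, rainfall_inches FROM readings \
         WHERE station_id = $1 ORDER BY reading_time DESC LIMIT 1",
        station
    )
    .fetch_optional(pool)
    .await
}
```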
Migrations are automatically run on startup, or manually with:

sqlx migrate run

Run the service with:

cargo run

The service includes a CLI tool for importing historical rainfall data from MCFCD Excel files (2022+).
To import historical data from a local Excel file:
# Build the import tool
cargo build --bin historical-import
# Import water year 2023 (Oct 2022 - Sep 2023)
cargo run --bin historical-import -- \
--database-url "postgres://postgres:password@localhost:5432/rain_tracker" \
excel -f plans/pcp_WY_2023.xlsx -w 2023 -y

The import process will:
- Parse the Excel file (all 12 monthly sheets)
- Insert readings into the database (with automatic deduplication; see the sketch after this list)
- Recalculate monthly rainfall summaries for affected months
- Show progress bars for each step
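One way the deduplicating insert can work (a sketch assuming a unique constraint on (station_id, reading_time); column names are illustrative):

```rust
use chrono::{DateTime, Utc};
use sqlx::PgPool;

/// Insert one reading; returns true if it was new, false if a duplicate.
async fn insert_reading(
    pool: &PgPool,
    station_id: &str,
    reading_time: DateTime<Utc>,
    rainfall_inches: f64,
) -> sqlx::Result<bool> {
    // ON CONFLICT DO NOTHING makes re-imports idempotent: rows that
    // collide on the unique key are skipped rather than erroring.
    let result = sqlx::query(
        "INSERT INTO readings (station_id, reading_time, rainfall_inches) \
         VALUES ($1, $2, $3) ON CONFLICT DO NOTHING",
    )
    .bind(station_id)
    .bind(reading_time)
    .bind(rainfall_inches)
    .execute(pool)
    .await?;
    Ok(result.rows_affected() == 1)
}
```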
Example output:
Parsing Excel file for water year 2023...
✓ Parsed 8593 readings
Inserting 8593 readings into database...
[00:03] ████████████████████████████ 8593/8593
✓ Inserted 8593 new readings, 0 duplicates skipped
Recalculating monthly summaries for 84 station-months...
[00:01] ████████████████████████████ 84/84
✓ Monthly summaries recalculated
Import completed successfully!
For production environments, use the Kubernetes job manifest:
# Edit the water year value in the manifest
# Then create the job:
kubectl create -f k8s/jobs/historical-single-year-import.yaml
# Or use the helper script:
./scripts/import-water-year.sh 2023
# Monitor the job:
kubectl logs -f -l job-type=historical-single-year
# Check job status:
kubectl get jobs -l job-type=historical-single-year

The job will automatically clean up after 24 hours.
Imported historical data is tagged with a data_source field:
- live_scrape - Real-time data from the current scraper
- excel_WY_2023 - Historical data from the Water Year 2023 Excel file
- pdf_1119 - Historical data from the November 2019 PDF file (future)
This allows you to query and analyze data by source if needed.
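For example, counting readings per source might look like this sketch (the data_source column is described above; the table name is assumed):

```rust
use sqlx::PgPool;

/// Count readings tagged with a given data_source value.
async fn count_by_source(pool: &PgPool, source: &str) -> sqlx::Result<i64> {
    let (count,): (i64,) =
        sqlx::query_as("SELECT COUNT(*) FROM readings WHERE data_source = $1")
            .bind(source)
            .fetch_one(pool)
            .await?;
    Ok(count)
}
```

Calling count_by_source(&pool, "excel_WY_2023") would then report how many rows came from the Water Year 2023 import.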
To avoid CI failures, run the same checks locally before committing:
# Run all CI checks (format, clippy, tests, openapi)
make ci-check
# Or run individually:
make fmt # Check code formatting
make clippy # Run clippy with warnings as errors
make test # Run all tests
make openapi     # Generate openapi.json spec file

The API is documented using OpenAPI 3.1. The specification is automatically generated from code annotations (see the sketch after the list below):
- OpenAPI Spec: openapi.json (automatically kept in sync)
- Interactive Docs: Start the service and visit http://localhost:8080/docs for the Redoc UI
- Raw JSON Spec: http://localhost:8080/api-docs/openapi.json
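As a sketch of what "generated from code annotations" looks like, assuming utoipa-style annotations (the crate actually used, and the handler shown, are assumptions):

```rust
use utoipa::OpenApi;

/// Health check handler (illustrative; the real handler differs).
#[utoipa::path(
    get,
    path = "/api/v1/health",
    responses((status = 200, description = "Service is healthy"))
)]
async fn health() -> &'static str {
    "ok"
}

#[derive(OpenApi)]
#[openapi(paths(health))]
struct ApiDoc;

fn main() {
    // Emit the derived spec as JSON, e.g. what `make openapi` might write out.
    println!("{}", ApiDoc::openapi().to_pretty_json().unwrap());
}
```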
To regenerate the OpenAPI spec:
make openapi

The pre-commit hook automatically regenerates openapi.json and stages it for commit, ensuring the spec is always up to date with the code.
A git pre-commit hook is installed that automatically runs clippy before each commit. This prevents accidentally committing code with clippy warnings that would fail CI.
To bypass the hook (not recommended):
git commit --no-verify

Run the unit tests with:

cargo test --lib

Running the full test suite requires a test database:

createdb rain_tracker_test
DATABASE_URL=postgres://postgres:password@localhost:5432/rain_tracker_test cargo test

Build the Docker image:

docker build -t rain-tracker-service:latest .

Deploy with the Kubernetes manifests:

kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secret.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml

Edit k8s/secret.yaml with your actual database URL before deploying.
- Fetcher Module: Scrapes HTML table from MCFCD website using reqwest and scraper
- Database Layer: SQLx with Postgres for storage, supports both rain year and calendar year queries
- Scheduler: Tokio-based periodic task that fetches new readings every N minutes (see the sketch after this list)
- API Layer: Axum REST framework with JSON responses
- Configuration: Environment-based config with dotenvy
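A sketch of the scheduler's shape (fetch_and_store is a hypothetical stand-in for the real fetcher plus database write):

```rust
use std::time::Duration;

async fn fetch_and_store() {
    // In the real service: scrape the MCFCD table with reqwest/scraper
    // and insert the parsed readings via SQLx.
}

async fn run_scheduler(interval_minutes: u64) {
    let mut ticker = tokio::time::interval(Duration::from_secs(interval_minutes * 60));
    loop {
        ticker.tick().await; // the first tick completes immediately
        fetch_and_store().await;
    }
}

#[tokio::main]
async fn main() {
    run_scheduler(15).await; // 15 minutes is an example, not the configured value
}
```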
- ✅ HTML parsing logic (fetcher module)
- ✅ Date range calculations (rain year logic)
- ✅ Rain reading struct parsing
- ✅ Database insert and retrieval operations
- ✅ Water year query correctness
- ✅ Calendar year query correctness
- ✅ Latest reading retrieval
- Deploy to K8s cluster
- Verify database migrations succeed
- Test /api/v1/health endpoint returns 200
- Test /api/v1/readings/{gauge_id}/water-year/2025 returns valid data
- Test /api/v1/readings/{gauge_id}/calendar-year/2025 returns valid data
- Test /api/v1/readings/{gauge_id}/latest returns valid data
- Test /api/v1/gauges endpoint returns paginated gauge list
- Test /api/v1/gauges/{station_id} returns gauge details
- Verify scheduler fetches data at configured interval
- Check logs for errors
- Verify data deduplication works (no duplicate readings)
A rain year (also called a water year) runs from October 1st of the previous calendar year through September 30th of the named year; the sketch below shows the arithmetic. For example:
- Rain Year 2025: Oct 1, 2024 - Sep 30, 2025
- Rain Year 2026: Oct 1, 2025 - Sep 30, 2026
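In code, those boundaries reduce to a small chrono calculation (a sketch, not the service's actual helper):

```rust
use chrono::NaiveDate;

/// Inclusive start and end dates of a given rain year.
fn rain_year_range(year: i32) -> (NaiveDate, NaiveDate) {
    let start = NaiveDate::from_ymd_opt(year - 1, 10, 1).unwrap();
    let end = NaiveDate::from_ymd_opt(year, 9, 30).unwrap();
    (start, end)
}

fn main() {
    let (start, end) = rain_year_range(2025);
    assert_eq!(start, NaiveDate::from_ymd_opt(2024, 10, 1).unwrap());
    assert_eq!(end, NaiveDate::from_ymd_opt(2025, 9, 30).unwrap());
}
```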
The MCFCD table contains (mirrored by the struct sketch below):
- Date and Time of reading
- Cumulative rainfall (inches) for the current rain year
- Incremental rainfall (inches) for that specific reading
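A hypothetical struct mirroring those columns (the fetcher's real type may differ):

```rust
use chrono::NaiveDateTime;

#[derive(Debug, Clone)]
struct RainReading {
    /// Date and time of the reading.
    reading_time: NaiveDateTime,
    /// Cumulative rainfall (inches) for the current rain year.
    cumulative_inches: f64,
    /// Incremental rainfall (inches) for this specific reading.
    incremental_inches: f64,
}
```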
- Added historical-import CLI tool for importing historical rainfall data
- Support for MCFCD Excel files (Water Year format, 2022+)
- Automatic monthly rainfall summary recalculation after import
- Data source tracking (data_source column) to distinguish live vs historical data
- Kubernetes job manifest for production imports
- Helper scripts for easy import execution
- Successfully tested with 8,593+ readings from Water Year 2023
- Breaking Change: All readings endpoints now require a gauge ID parameter
  - Old: GET /api/v1/readings/water-year/{year}
  - New: GET /api/v1/readings/{gauge_id}/water-year/{year}
  - Old: GET /api/v1/readings/calendar-year/{year}
  - New: GET /api/v1/readings/{gauge_id}/calendar-year/{year}
  - Old: GET /api/v1/readings/latest
  - New: GET /api/v1/readings/{gauge_id}/latest
- Added /api/v1/gauges endpoint for listing all gauges with pagination
- Added /api/v1/gauges/{station_id} endpoint for getting specific gauge details
- Simplified /api/v1/health endpoint to only return status (removed latest_reading field)
- Improved database queries to filter by gauge ID for better performance and consistency
- Fixed non-deterministic behavior in latest reading queries
- Updated HTTP tests to use gauge-specific endpoints
- Added support for tracking multiple rain gauges
- Implemented gauge metadata storage (name, location, elevation, etc.)
- Added 6-hour and 24-hour rainfall aggregations per gauge
- Enhanced scraper to handle multi-gauge data from MCFCD
- Basic rain tracker functionality for single gauge
- Water year and calendar year queries
- Automated data collection from MCFCD website
- PostgreSQL storage with deduplication
- REST API with health check endpoint
MIT