SSURGO

A targets pipeline for building SSURGO databases with DuckDB.

Overview

This R project provides a reproducible pipeline for processing and building SSURGO (Soil Survey Geographic Database) databases using the targets package and DuckDB.

Features

  • Reproducible Workflows: Built on the targets package for reliable, efficient, scalable data pipelines
  • DuckDB Integration: Leverages DuckDB for columnar data storage and querying of spatial and tabular data
  • R-based Pipeline: Written entirely in R, the project leverages the soilDB package for downloading data and creating the database

Installation

To get started, make sure R 4.0.0 or later is installed, then clone this repository.

This project uses renv to manage a consistent, isolated set of package dependencies. When you open the project in R, the .Rprofile automatically activates renv, ensuring dependencies are loaded from the project-local library.

First-time Setup

# Start R in the SSURGO/ directory, or set the working directory from within R
setwd("path/to/SSURGO")

# renv will auto-activate; initialize and discover dependencies from DESCRIPTION
renv::init()

After Setup

Dependencies are managed in the project-local environment via renv. To add or remove a dependency, edit DESCRIPTION and run renv::install() or renv::remove(), then run renv::snapshot() to record the change in your local lock file.

Once dependencies are set up, run SSURGO.R to generate the _targets.R pipeline file:

source("SSURGO.R")

The first four targets control which soil survey areas go into the database. The default setup builds a database covering all US states, but you can choose any subset of one or more states, or use any other method to create the ssas target (a character vector of area symbols).
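As a minimal sketch of one such alternative method, the helper below filters a vector of area symbols by state, relying on the convention that a SSURGO area symbol begins with a two-letter state or territory code (for example, CA630). The function name `subset_ssas` and the sample symbols are hypothetical and not part of this repository.

```r
# Hypothetical helper (not part of this repository): keep only the area
# symbols whose two-letter prefix matches one of the requested states.
subset_ssas <- function(areasymbols, states) {
  areasymbols <- toupper(areasymbols)
  keep <- substr(areasymbols, 1, 2) %in% toupper(states)
  sort(unique(areasymbols[keep]))
}

# Illustrative input only; a real pipeline would use the full list of
# available soil survey area symbols.
all_symbols <- c("CA630", "CA077", "NV628", "OR001")
ssas <- subset_ssas(all_symbols, c("CA", "NV"))
print(ssas)
```

The resulting character vector has the shape the ssas target expects, so a function like this could be called from the target that defines it.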

Usage

This project uses the targets package to manage the pipeline.

To run the workflow, be sure your working directory is the ./SSURGO/ folder containing _targets.R.

# Load the targets library
library(targets)

# View the pipeline
tar_visnetwork()  # Visualize the pipeline DAG

# Run the pipeline
tar_make()
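Because tar_make() looks for _targets.R in the working directory, it can help to check for the file first. The sketch below uses a hypothetical helper, `check_pipeline_dir()`, which is not part of this repository; the commented calls are standard functions from the targets package.

```r
# Hypothetical helper (not part of this repository): confirm that `dir`
# contains the generated _targets.R before running the pipeline.
check_pipeline_dir <- function(dir = getwd()) {
  script <- file.path(dir, "_targets.R")
  if (!file.exists(script)) {
    stop("No _targets.R found in ", dir,
         "; run source(\"SSURGO.R\") first.", call. = FALSE)
  }
  invisible(script)
}

# A typical session, assuming the targets package is installed:
# check_pipeline_dir()
# targets::tar_outdated()   # list targets that are not up to date
# targets::tar_make()       # build only what is outdated
# targets::tar_progress()   # summarize build status per target
```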

Project Structure

SSURGO/
|-- _targets.R           # Main targets pipeline configuration (generated by SSURGO.R)
|-- SSURGO.R             # Entry point for `tar_script()` _targets.R generation
|-- R/                   # Core R functions and wrappers
|-- man/                 # Documentation files
|-- DESCRIPTION          # Package metadata
|-- NAMESPACE            # Package namespace
|-- README.md            # This file

Dependencies

All runtime dependencies are declared in DESCRIPTION and managed by renv:

  • targets: Workflow orchestration
  • duckdb: In-process SQL database engine
  • soilDB: SSURGO data download utilities
  • sf: Spatial data handling
  • Plus supporting packages (DBI, tarchetypes, geotargets)

R >= 4.0.0 is required. See DESCRIPTION for full details.
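To confirm that the requirements above are met in the active library, a quick check along these lines may be useful. The helper name `missing_pkgs` is hypothetical; the package list is copied from the bullets above.

```r
# Packages named in the Dependencies list above
required <- c("targets", "duckdb", "soilDB", "sf",
              "DBI", "tarchetypes", "geotargets")

# Hypothetical helper: return the packages that cannot be loaded
missing_pkgs <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

# TRUE when the running R session satisfies the minimum version
r_ok <- getRversion() >= "4.0.0"

print(r_ok)
print(missing_pkgs(required))
```

An empty character vector from `missing_pkgs()` means every listed package is available in the current library.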

How It Works

The pipeline follows a structured approach:

  1. Data Ingestion: Download and prepare SSURGO data sources
  2. Database Building: Construct optimized DuckDB databases
  3. Output: Generate final database artifacts
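The first two steps could look roughly like the sketch below. This is not the repository's actual code: the function names, directory, and file name are assumptions, and it loads tabular data only. It relies on `soilDB::downloadSSURGO()` and `soilDB::createSSURGO()` together with a DBI connection to DuckDB.

```r
# Illustrative helpers only; names and defaults are assumptions, not this
# repository's actual functions.

# 1. Data ingestion: download SSURGO archives for a set of area symbols
download_ssas <- function(ssas, exdir = "ssurgo_raw") {
  soilDB::downloadSSURGO(areasymbols = ssas, destdir = exdir)
  exdir
}

# 2. Database building: load the downloaded tables into a DuckDB file
build_duckdb <- function(exdir, dbfile = "ssurgo.duckdb") {
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = dbfile)
  # Always close the connection, even if loading fails
  on.exit(DBI::dbDisconnect(con, shutdown = TRUE), add = TRUE)
  # Tabular data only in this sketch; spatial layers may need extra handling
  soilDB::createSSURGO(exdir = exdir, conn = con, include_spatial = FALSE)
  dbfile
}

# 3. Output: the returned path to the DuckDB file is the final artifact.
# Example (requires network access and the packages above):
# build_duckdb(download_ssas(c("CA630", "CA077")))
```

Returning file paths from each step is a common pattern in targets pipelines, since it lets each target track the artifact produced by the previous one.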

Dependency Management with renv

This project uses renv to provide each user with an isolated, project-local R package library. The .Rprofile file automatically activates renv when you start R in this directory, so no manual setup is needed beyond the initial renv::init().

Why this approach? Rather than pinning package versions in version control, each user maintains a local renv.lock file (which is .gitignored). This allows:

  • Flexibility: Teams can work with newer package versions if desired
  • Isolation: This project's dependencies don't affect your other R work
  • Simplicity: No need to manage global package state

Contributing

Please raise any issues on the Issue Tracker.

License

This project is licensed under the terms specified in LICENSE.md.

Author

Andrew G. Brown (@brownag)
