remotePARTS is a software package for the R statistical programming language. The package contains tools for analyzing spatiotemporal data, typically obtained via remote sensing.
These tools were created to test map-scale hypotheses about trends in large remotely sensed data sets, but they are useful for analyzing trends in any spatial data, with or without a temporal component. Statistical tests are conducted with the PARTS method for analyzing spatially autocorrelated time series (Ives et al., 2021). The method’s unique approach can handle extremely large data sets that other spatiotemporal models cannot, while still appropriately accounting for autocorrelation structure. This is done by partitioning the data into smaller chunks, analyzing chunks separately and then combining the separate analyses into a single test, that accounts for correlations among chunks, of the map-scale hypotheses.
To install the package and it’s dependencies from CRAN, use the following R code:
install.packages("remotePARTS")To install the latest stable version of this package from github, use
remotes::install_github("morrowcj/remotePARTS")and to test out the newest features and functionality, use
remotes::install_github("morrowcj/remotePARTS", ref = "develop")To ensure the vignette is built when installing from GitHub, use
remotes::install_github("morrowcj/remotePARTS", build_vignettes = TRUE)Then, upon successful installation, load the package with
library(remotePARTS).
Since the matrix operations in this package rely on C++ code, as implemented via the RcppEigen package, the latest version of Rtools is required for Windows and C++11 is required for other systems.
To cite this package in publications, please use:
Morrow CJ, Ives AR (2025). “remotePARTS: Spatiotemporal autoregression analyses for large data sets.” Journal of Open Source Software, 10(109), 7937. doi:10.21105/joss.07937 https://doi.org/10.21105/joss.07937.
If you wish to contribute to this package, report bugs, suggest new features, tests, or behavior, correct typos, update documentation, or anything else, please submit a GitHub Issue. We welcome and appreciate any and all feedback.
To manually run the package’s unit tests from a cloned directory, after installing dependencies, use
devtools::test() # run unit tests in tests/testthat/
devtools::check() # full R CMD check (tests + docs)Test coverage is currently low, especially among functions with stochastic outcomes. As the package develops, additional tests will be added.
A typical remotePARTS workflow is comprised of two broad steps for analyzing trends in spatiotemporal datasets: 1) time series analysis and 2) spatial analysis. For purely spatial problems, step 1 is skipped. We briefly summarize these steps and the expected data structure below.
Currently, remotePARTS requires that the data are formatted as “flat”
files (i.e., data frames with 1 row per pixel) with x- and
y-coordinates. We will not go into detail of how to prepare your data
here, as other packages are dedicated to reading and manipulating
spatial data (e.g., see raster::rasterToPoints()). We recognize that
this is a limitation, since flat files are highly inefficient. Future
versions of this package may include interfaces with raster objects if
enough users express interest.
To demonstrate the package’s basic functionality, we first simulate a small spatiotemporal data set for analysis:
library(tibble); library(dplyr); library(tidyr); library(viridisLite)
library(ggplot2); library(remotePARTS)
# set a random seed, for reproducibility
set.seed(42) # don't panic
# simulate a spatiotemporal response variable
sim_spatiotemp <- function(
n, k, n_time = 4, b.0 = 0, b.x = 0.5, b.y = 1,
b.xy = 0.1, sd.xy = 0.2, b.t = 0.2, ar = 0.4,
sd.t = 0.1
){
coords = expand_grid(
x = seq(0, 1, length.out = n), y = seq(0, 1, length.out = k)
)
time = seq_len(n_time)
tibble::tibble(
x = coords[[1]], y = coords[[2]],
z.0 = b.0 + x*b.x + y*b.y + x*y*b.xy,
eps = rnorm(n = length(x), mean = 0, sd = sd.xy),
time.effect = list(time * b.t),
) |>
rowwise() |>
mutate(
z.0 = z.0 + eps,
sp.innov = list(z.0 + time.effect + rnorm(n_time, sd = sd.t)),
z = list(arima.sim(list(ar = ar), n_time, innov = sp.innov))
) |>
unnest_wider(z, names_sep = ".") |>
select(-"time.effect", -"sp.innov", -"eps")
}The function defined above generates a data frame for n k
pixels. The response variable (z) depends upon the x and y
coordinates of the map. The resulting spatial patterns (z.0) are used
as the random innovations of an AR(1) time series model to generate the
spatiotemporal response (z.1 – z.4). These data are visualized
below:
# build the data
dat <- sim_spatiotemp(n = 100, k = 100)
# extract coordinates
coords <- select(dat, x, y)
# visualize the data
dat |>
pivot_longer(cols = z.1:z.4, names_to = "time", values_to = "z") |>
mutate(time = as.numeric(gsub("z\\.", "", time))) |>
ggplot(aes(x = x, y = y, fill = z)) +
facet_wrap(~time, labeller = "label_both") +
geom_tile() +
scale_fill_viridis_c(option = "magma") With properly structured data, the first step is to conduct a time
series analysis. This is done with the fitAR_map function.
# fit a pixel-wise autoregression model to the full map
AR_fit <- fitAR_map(Y = dat |> select(z.1:z.4) |> as.matrix(), coords = coords)
# combine results into a data frame
df <- data.frame(
coords = AR_fit$coords, coefs = AR_fit$coefficients,
resids = AR_fit$residuals
)This function returns time series regression coefficients and residual estimates for each pixel:
df |>
ggplot(aes(x = coords.x, y = coords.y, fill = coefs.t)) +
geom_tile() +
labs(x = "x", y = "y", fill = "t coef") +
scale_fill_viridis_c(option = "magma")df |>
pivot_longer(resids.1:resids.4, names_to = "time", values_to = "resids") |>
mutate(time = as.numeric(gsub("resids\\.", "", time))) |>
ggplot(aes(x = coords.x, y = coords.y, fill = resids)) +
facet_wrap(~time, labeller = "label_both") +
geom_tile() +
labs(x = "x", y = "y", fill = "resid") +
scale_fill_viridis_c(option = "magma")The second step is to conduct a spatial analysis with
fitGLS_partition. In this case, we’ll estimate how the temporal trend
differs across the x and y coordinates.
# randomly divide the data into partitions
partitions <- sample_partitions(npix = nrow(df), partsize = 1000)
# fit the partitioned GLS
part_GLS <- fitGLS_partition(
formula = coefs.t ~ coords.x + coords.y + coords.x:coords.y, data = df,
partmat = partitions, coord.names = c("coords.x", "coords.y"), ncores = 8
)The results provide coefficient estimates that are corrected for spatial and temporal autocorrelation:
part_GLS$overall$t.test
#> Est SE t.stat pval.t
#> (Intercept) 0.30626642 0.007117123 43.032336 0.000000e+00
#> coords.x 0.09598428 0.012076984 7.947703 2.105658e-15
#> coords.y 0.19475968 0.012106987 16.086552 1.673033e-57
#> coords.x:coords.y 0.05523622 0.020571183 2.685126 7.262241e-03Note that these are are not direct estimates of the parameters used to
generate the data with sim_spatiotemp above. For example the
coefficients for coords.x and coords.y are estimates of b.t
b.x and b.t b.y.
This method can also be used for purely spatial problems. Here we will
use our original spatial variable (z.0):
# add the spatial variable into the data frame
df$z.0 <- dat$z.0
# fit the partitioned GLS
part_GLS1 <- fitGLS_partition(
formula = z.0 ~ coords.x + coords.y + coords.x:coords.y, data = df,
partmat = partitions, coord.names = c("coords.x", "coords.y"), ncores = 8
)part_GLS1$overall$t.test
#> Est SE t.stat pval.t
#> (Intercept) -0.005677504 0.01282505 -0.4426885 6.580007e-01
#> coords.x 0.503970052 0.02145689 23.4875584 8.834878e-119
#> coords.y 1.007575874 0.02153613 46.7853715 0.000000e+00
#> coords.x:coords.y 0.090887894 0.03606875 2.5198518 1.175592e-02In this case, the coefficients are direct estimates of the spatial
parameters (b.0, b.x, b.y, b.xy) given to sim_spatiotemp.
For detailed examples of how to use remotePARTS and all its options,
with real data, see the Alaska vignette:
vignette("Alaska")The latest stable version of the vignette is also hosted online at https://morrowcj.github.io/remotePARTS/Alaska.html.
Ives, Anthony R., et al. “Statistical inference for trends in spatiotemporal data.” Remote Sensing of Environment 266 (2021): 112678. https://doi.org/10.1016/j.rse.2021.112678