The varlasso
R package is designed to fit vector
autoregressive (VAR) models, with and without observation error in a
Bayesian setting. The basic formulation of these types of models
consists of a process model describing the latent time series in xt,
xt = Bxt − 1 + u + wt − 1 where Bxt − 1 is a matrix of coefficients describing interactions among time series, u is an optional trend, and wt − 1 is a vector of deviations from a multivariate normal distribution, wt ∼ MVN(0, Q). The covariance matrix Q may be an identity matrix or include estimated covariance between time series.
In the context of ecological examples using the VAR approach (sometimes called multivariate autoregressive or MAR models), the xt can be viewed as log-abundances of different species, and the B matrix a matrix of interactions. Diagonals of the B matrix represent density dependence, and off-diagonals represent interspecific interactions.
The other component of a VAR model is optionally observation errors. To avoid confusion, we refer to the model with observation error as a VARSS (Vector Autoregressive State Space) model. The observation model can be written as
yt ∼ MVN(xt, R) where R is a covariance matrix of the observation errors.
We can fit a VAR model to simulated data
iter <- 50 # MCMC iterations, make this much larger
warmup <- 20 # MCMC warmup, make this much larger
thin <- 1
chains <- 1 # MCMC chains, make this 3 or 4
n_ts <- 4 # number of time series to simulate
n_time <- 70 # number of time points to simulate for
n_burn <- 50 # burn-in for time series
xx <- matrix(0, n_time, n_ts)
set.seed(123)
xx[1, ] <- rnorm(n_ts) # initial state
B <- matrix(rnorm(n = n_ts * n_ts, mean = 0, sd = 0.1), n_ts, n_ts)
diag(B) <- runif(n_ts, min = 0.7, max = 0.9) # make diagonals 0.7 - 0.9
Here is what the B matrix looks like,
0.73 | -0.07 | 0.04 | 0.05 |
0.17 | 0.78 | 0.01 | -0.20 |
0.05 | 0.12 | 0.78 | 0.07 |
-0.13 | 0.04 | 0.18 | 0.77 |
Next we can simulate the evolution of the states,
for (i in 2:n_time) { # simualte evolution of states
xx[i, ] <- B %*% xx[i - 1, ] + rnorm(n_ts, mean = 0, sd = 0.04)
}
xx <- xx[-c(1:n_burn), ] # drop burn in data
Finally we can add observation error noise
yy <- xx + rnorm(n = nrow(xx) * ncol(xx), mean = 0, sd = 0.02)
df <- data.frame(
"time" = rep(1:nrow(yy), ncol(yy)),
"species" = sort(rep(1:ncol(yy), nrow(yy))),
"y" = c(yy)
)
head(df)
#> time species y
#> 1 1 1 -0.08156230
#> 2 2 1 0.01676388
#> 3 3 1 -0.04290659
#> 4 4 1 0.04923091
#> 5 5 1 0.01031494
#> 6 6 1 0.02013256
We can fit a model to this data using the varlasso::fit
function, which at a minimum takes in the dataframe and MCMC
parameters.
There are a few optional arguments that may be applied to this or any
model. The shared_q
and shared_r
arguments can
be used to pass in vectors indicating which process error or observation
error variances may be shared across time series. By default, all
variances are shared across time series. In the example below, process
variances are shared among the 1st and 4th time series, and unique for
time series 2 and 3. Similarly for the observation error variances, time
series 1 and 4 have equal variances and time series 2 and 3 have equal
variances.
m <- fit(data = df,
chains = chains,
shared_q = c(1,2,3,1),
shared_r = c(1,2,2,1),
iter = iter,
warmup = warmup,
thin = thin)
In some cases, it may be desireable to fix observation variances;
this can be done using the fixed_r
argument. The
fixed_r
argument can be a vector of fixed values, or
alternatively a single scalar.
m <- fit(data = df,
chains = chains,
fixed_r = c(0.1, 0.2, 0.2, 0.1),
iter = iter,
warmup = warmup,
thin = thin)
For some cases, we may also want to ignore the trend u. This can be turned on /
off using the est_trend
argument
Finally, observation error can be turned on / off using the
varss
argument. When FALSE, a VAR model is constructed, and
if TRUE a VARSS model is fit.
There are a number of additional arguments documented in
?fit
Many VAR or VARSS models struggle as the number of time series
increases, because of the dimensionality of the interaction matrix B. The varlasso
in this package allows models with alternative priors on the diagonal
and off-diagonal elements. By default, these priors are
Bi, i ∼ N(0.7, 1)
and Bi, j ∼ N(0, 1)
These priors may be specified by changing the arguments
b_diag
and b_offdiag
.
The varlasso
package currently only allows for normal
priors on the diagonal elements. But because the number of off-diagonal
elements can become prohibitively large to work with, we allow for a
number of shrinkage priors. These can be changed by using the
off_diag_prior
argument (defaults to ‘normal’).
As a first option, we may use a partial pooling prior (or hierarchical prior) that estimates a standard deviation, e.g.
Bi, i ∼ N(0.7, σB) This prior can be specified using the following
m <- fit(data = df,
chains = chains,
off_diag_prior = "normal_pp",
iter = iter,
warmup = warmup,
thin = thin)
Second, we might want to include more extremes in the tails than a normal prior, so instead a Student t-distribution can be included,
Bi, i ∼ Student − t(0.7, σB, ν)
where ν is included as an
estimated degrees of freedom parameter. This prior is specified by
setting off_diag_prior
to “student_t”. The ν parameter can also be optionally
fixed with the nu
argument, e.g.
m <- fit(data = df,
chains = chains,
nu = 5,
off_diag_prior = "student-t",
iter = iter,
warmup = warmup,
thin = thin)
As a third prior, we include the Laplace prior (this is also known as a double exponential distribution). This prior has an estimated scale parameter, and is specified with the chunk below
Bi, i ∼ Laplace(0, σB)
m <- fit(data = df,
chains = chains,
nu = 5,
off_diag_prior = "laplace",
iter = iter,
warmup = warmup,
thin = thin)
Finally, we can include a regularized horseshoe prior,
Bi, i ∼ hs(θ)
We adopt the same parameterization in rstanarm
and
brms
. This prior requires specifying several
hyperparameters θ, stored as a
list in hs
. The elements of this list include
df
Degrees of freedom for the Student-t prior for local
shrinkagedf_global
Student-t degrees of freedom for global
shrinkagedf_slab
Degrees of freedom of the Student-t prior on
the slab regularizationscale_global
Scale parameter for Student-t prior on
global shrinkagescale_slab
Scale of the Student-t prior on
shrinkage