--- title: "Dynamic linear models in package atsar" author: "Eric Ward" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Dynamic linear models in package atsar} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` ## Installation ```{r install, eval=TRUE, warning=FALSE, message=FALSE, results='hide'} library(rstan) library(atsar) # for optimizing stan on your machine, #rstan_options(auto_write = TRUE) #options(mc.cores = parallel::detectCores()) mcmc_list = list(n_mcmc = 1000, n_burn = 500, n_chain = 1, n_thin = 1) ``` ## Data We'll use the same data in the MARSS manual, from Mark's example in the Columbia River. The data are accessed with ```{r,eval=FALSE} library(MARSS) data(SalmonSurvCUI) ``` ### Fitting a univariate DLM We can start with fitting a DLM that includes no covariates, but a random walk in the intercept (mean). ```{r, message=FALSE, warning=FALSE, results='hide',eval=FALSE} mod = fit_stan(y = SalmonSurvCUI$logit.s, model_name="dlm-intercept",mcmc_list=mcmc_list) ``` Next, we can fit a model with constant intercept, but time-varying slope. ```{r, message=FALSE, warning=FALSE, results='hide',eval=FALSE} mod2 = fit_stan(y = SalmonSurvCUI$logit.s, x = SalmonSurvCUI$CUI.apr, model_name="dlm-slope",mcmc_list=mcmc_list) ``` Finally, we can fit a model with both a time varying intercept and a time varying slope. ```{r, message=FALSE, warning=FALSE, results='hide',eval=FALSE} mod = fit_stan(y = SalmonSurvCUI$logit.s, x = model.matrix(lm(SalmonSurvCUI$logit.s ~ SalmonSurvCUI$CUI.apr)), model_name="dlm", mcmc_list=mcmc_list) ``` ## Fitting non-normal errors We've implemented several non-normal observation model distributions in `stan_fit`, including "binomial", "poisson", "gamma", etc. As an example of how to use these in a DLM setting, we'll fit a DLM with time-varying mean and Poisson response to a time series of ecoli data, collected weekly. The data are in the `tscount` package. ```{r eval=FALSE} y = tscount::ecoli ``` We could fit a Poisson GLM in `fit_stan` with the following, ```{r, message=FALSE, warning=FALSE, results='hide',eval=FALSE} mod = fit_stan(y=y$cases, x = model.matrix(lm(y$cases~1)), model="regression", family="poisson", mcmc_list=mcmc_list) ``` But the regression model assumes independent observations between time steps. To include the autocorrelation between time points, we can fit a DLM, ```{r, message=FALSE, warning=FALSE, results='hide'} y = data.frame(cases = rpois(20,2)) mod = fit_stan(y$cases, model="dlm-intercept", family="poisson", mcmc_list=mcmc_list) ``` This more flexible model captures the data much better. Note: the credible intervals are on the mean, not new data. \break ```{r, fig.cap="Estimated fit of the DLM with Poisson observation model applied to E. coli count data."} pars = extract(mod) plot(apply(exp(pars$intercept), 2, mean), type="l", lwd=3, ylim=c(0,100), ylab="E coli", xlab="") lines(apply(exp(pars$intercept), 2, quantile,0.025)) lines(apply(exp(pars$intercept), 2, quantile,0.975)) points(y$cases, col="red", cex=0.3) ```