--- output: md_document: variant: gfm --- ```{r, echo = FALSE} knitr::opts_chunk$set( fig.path = "README_figs/README-" ) ``` # ubms: Unmarked Bayesian Models with Stan [![R build status](https://github.com/kenkellner/ubms/workflows/R-CMD-check/badge.svg)](https://github.com/kenkellner/ubms/actions) [![Codecov test coverage](https://codecov.io/gh/kenkellner/ubms/branch/master/graph/badge.svg)](https://codecov.io/gh/kenkellner/ubms?branch=master) [![CRAN status](https://www.r-pkg.org/badges/version/ubms)](https://cran.r-project.org/web/packages/ubms/index.html) `ubms` is an `R` package for fitting Bayesian hierarchical models of animal occurrence and abundance. The package has a formula-based interface compatible with [unmarked](https://cran.r-project.org/web/packages/unmarked/index.html), but the model is fit using MCMC with [Stan](https://mc-stan.org/) instead of using maximum likelihood. Currently there are Stan versions of unmarked functions `occu`, `occuRN`, `colext`, `occuTTD`, `pcount`, `distsamp`, and `multinomPois`. These functions follow the `stan_` prefix naming format established by [rstanarm](https://cran.r-project.org/web/packages/rstanarm/index.html). For example, the Stan version of the `unmarked` function `occu` is `stan_occu`. Advantages compared to `unmarked`: 1. Obtain posterior distributions of parameters and derived parameters 2. Include random effects in parameter formulas (same syntax as `lme4`) 3. Assess model fit using WAIC and LOO via the [loo](https://cran.r-project.org/web/packages/loo/index.html) package Disadvantages compared to `unmarked`: 1. MCMC is slower than maximum likelihood 2. Not all model types are supported 3. Potential for convergence issues ## Installation `ubms` is on [CRAN](https://cran.r-project.org/web/packages/ubms/index.html): ```{r, eval=FALSE} install.packages("ubms") ``` Alternatively, the latest development version can be installed from Github: ```{r, eval=FALSE} # install.packages("devtools") devtools::install_github("kenkellner/ubms") ``` ## Example Simulate occupancy data including a random effect on occupancy: ```{r,message=FALSE} library(ubms) set.seed(123) dat_occ <- data.frame(x1=rnorm(500)) dat_p <- data.frame(x2=rnorm(500*5)) y <- matrix(NA, 500, 5) z <- rep(NA, 500) b <- c(0.4, -0.5, 0.3, 0.5) re_fac <- factor(sample(letters[1:26], 500, replace=T)) dat_occ$group <- re_fac re <- rnorm(26, 0, 1.2) re_idx <- as.numeric(re_fac) idx <- 1 for (i in 1:500){ z[i] <- rbinom(1,1, plogis(b[1] + b[2]*dat_occ$x1[i] + re[re_idx[i]])) for (j in 1:5){ y[i,j] <- z[i]*rbinom(1,1, plogis(b[3] + b[4]*dat_p$x2[idx])) idx <- idx + 1 } } ``` Create `unmarked` frame: ```{r} umf <- unmarkedFrameOccu(y=y, siteCovs=dat_occ, obsCovs=dat_p) ``` Fit a model with a random intercept, using 3 parallel cores: ```{r, eval=FALSE} (fm <- stan_occu(~x2 ~x1 + (1|group), umf, chains=3, cores=3, seed=123)) ``` ```{r, echo=FALSE, warning=FALSE} (fm <- stan_occu(~x2 ~x1 + (1|group), umf, chains=3, cores=3, refresh=0, seed=123)) ``` Examine residuals for occupancy and detection submodels (following [Wright et al. 2019](https://doi.org/10.1002/ecy.2703)). Each panel represents a draw from the posterior distribution. ```{r resids, fig.height=7} plot(fm) ``` Assess model goodness-of-fit with a posterior predictive check, using the MacKenzie-Bailey chi-square test: ```{r, eval=FALSE} fm_fit <- gof(fm, draws=500) plot(fm_fit) ``` ```{r gof, echo=FALSE} fm_fit <- gof(fm, draws=500, quiet=TRUE) plot(fm_fit) ``` Look at the marginal effect of `x2` on detection: ```{r marginal} plot_effects(fm, "det") ``` A more detailed vignette for the package is available [here](https://kenkellner.com/ubms/articles/ubms.html).