author: lpautrel <lea.pautrel@terroiko.fr>, 2023-12-08 09:31:33 +0100
committer: lpautrel <lea.pautrel@terroiko.fr>, 2023-12-08 09:31:33 +0100
commit: c454c3405600ea25c131959fab64d0961bf4a082
tree: 32cc6ffa1019046c8d8a53da4a2bcb6960d4584f
parent: c23ca587a472a5db916dfaf6c217c1e87910c63d

Init contributing to unmarked

-rw-r--r-- contributing_to_unmarked.Rmd | 89
1 files changed, 89 insertions, 0 deletions
diff --git a/contributing_to_unmarked.Rmd b/contributing_to_unmarked.Rmd
new file mode 100644
index 0000000..333f2c6
--- /dev/null
+++ b/contributing_to_unmarked.Rmd
@@ -0,0 +1,89 @@
+---
+title: Draft guide to adding a new unmarked function
+author: Ken Kellner
+---
+
+This guide uses the recently developed `gdistremoval` function for examples, mainly because most of the relevant code is in a single file instead of spread around.
+
+`unmarked` uses S4 for objects and methods. If you aren't familiar with S4, you may want to consult a book or tutorial such as [this one](https://kasperdanielhansen.github.io/genbioconductor/html/R_S4.html).
+
+# Write the likelihood function
+
+* Should be an R function that takes as input a vector of parameter values, the response variable, design matrices, and other settings/required data
+* The output should be the negative log-likelihood
+* Should be written so it can be used with the `optim()` function
+* Example: `gdistremoval` doesn't have an R version of the likelihood function, but here's the R version for single-season occupancy `occu`: [R Example](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/occu.R#L65)
+* After writing the R version, in most cases it will be very useful to also write a C++ version
+* The C++ version could use Rcpp/RcppArmadillo or TMB
+* Examples: [C++](https://github.com/rbchan/unmarked/blob/master/src/nll_gdistremoval.cpp), [TMB](https://github.com/rbchan/unmarked/blob/master/src/TMB/tmb_gdistremoval.hpp)
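+
+As a rough illustration of the points above, here is a minimal sketch of a negative log-likelihood for a basic single-season occupancy model, written so it can be passed to `optim()`. The function name `nll_occu`, its argument layout, and the assumption of no missing values are choices made for this example only; this is not the code used inside `unmarked`.
+
+```r
+# Negative log-likelihood for a basic single-season occupancy model.
+# params: c(occupancy coefficients, detection coefficients)
+# y: M x J matrix of 0/1 detections (assumed to have no missing values)
+# X: M-row design matrix for occupancy
+# V: (M * J)-row design matrix for detection, ordered by site, then visit
+nll_occu <- function(params, y, X, V) {
+  M <- nrow(y)
+  J <- ncol(y)
+  n_state <- ncol(X)
+  beta_psi <- params[seq_len(n_state)]
+  beta_p <- params[n_state + seq_len(ncol(V))]
+
+  psi <- plogis(drop(X %*% beta_psi))
+  p <- matrix(plogis(drop(V %*% beta_p)), M, J, byrow = TRUE)
+
+  # Probability of each site's detection history, given the site is occupied
+  cp <- apply(p^y * (1 - p)^(1 - y), 1, prod)
+  # A site with no detections may also simply be unoccupied
+  no_det <- rowSums(y) == 0
+
+  -sum(log(psi * cp + (1 - psi) * no_det))
+}
+
+# Evaluate at all-zero parameters for two sites with two visits each
+y <- matrix(c(1, 0,
+              0, 0), nrow = 2, byrow = TRUE)
+X <- matrix(1, nrow = 2, ncol = 1)  # intercept-only occupancy
+V <- matrix(1, nrow = 4, ncol = 1)  # intercept-only detection
+nll_occu(c(0, 0), y, X, V)
+```
+
+Because the function takes the parameter vector first and returns a single number, it can be handed to `optim()` directly, with `y`, `X` and `V` passed as extra arguments.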
+
+# Design the unmarkedFrame object for the model
+
+* Most model types in unmarked have their own `unmarkedFrame`, a specialized kind of data frame
+* This is an S4 object which contains, at a minimum, the response (y), and site covariates, and may also include observation covariates, primary period covariates, and other info related to study design (such as distance breaks)
+* In some cases you may be able to use an existing `unmarkedFrame` type
+* [Example](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/gdistremoval.R#L1)
+* Also need to write some S4 methods for the `unmarkedFrame`, such as how to subset it with `[`
+* [Examples](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/gdistremoval.R#L67)
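+
+To make the idea concrete, the sketch below defines a small S4 class and a `[` method for subsetting sites. The class name `toyFrame` and its slots are hypothetical simplifications for this example; a real `unmarkedFrame` subclass would extend the existing `unmarkedFrame` class and carry more information.
+
+```r
+library(methods)
+
+# Hypothetical, simplified stand-in for an unmarkedFrame-like class
+setClass("toyFrame",
+  representation(y = "matrix", siteCovs = "data.frame", dist.breaks = "numeric"),
+  validity = function(object) {
+    if (nrow(object@y) != nrow(object@siteCovs)) {
+      return("y and siteCovs must have one row per site")
+    }
+    TRUE
+  }
+)
+
+# Subset sites with umf[i, ]; study design info is carried over unchanged
+setMethod("[", signature(x = "toyFrame", i = "numeric", j = "missing"),
+  function(x, i, j, ..., drop = TRUE) {
+    new("toyFrame",
+        y = x@y[i, , drop = FALSE],
+        siteCovs = x@siteCovs[i, , drop = FALSE],
+        dist.breaks = x@dist.breaks)
+  }
+)
+
+umf <- new("toyFrame",
+           y = matrix(c(1, 0,
+                        2, 0,
+                        0, 3), nrow = 3, byrow = TRUE),
+           siteCovs = data.frame(elev = c(0.1, -0.5, 1.2)),
+           dist.breaks = c(0, 50, 100))
+
+# Keep only sites 2 and 3
+umf_sub <- umf[2:3, ]
+```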
+
+# Start a draft of the fitting function
+
+The fitting function should take as arguments:
+
+* R formulas for each submodel (e.g. state, detection). We have found over time that it is better to have a separate argument per formula instead of a single combined formula, which is how `occu` does it.
+* The `unmarkedFrame`
+* Other model-specific settings, such as key functions or parameterizations to use
+* [Example](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/gdistremoval.R#L257)
+
+# Write the getDesign method for the model
+
+* Most models have their own `getDesign` function, an S4 method
+* The purpose of this function is to convert the information in the `unmarkedFrame` into a format usable by the likelihood function
+* As such it usually generates design matrices from formulas and components of the `unmarkedFrame`
+* It often also has code to handle missing values, such as by dropping sites that don't have measurements, or giving the user warnings if covariates are missing, etc.
+* This is frequently the most tedious and difficult part of adding a new function
+* [Example](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/gdistremoval.R#L177)
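+
+The following sketch shows the general shape of this step for a single submodel: build a design matrix from a formula, then drop sites with missing data and warn the user. The helper `toy_get_design` and its arguments are hypothetical and work on plain R objects; a real `getDesign` method would be an S4 method operating on the `unmarkedFrame` and would handle every submodel.
+
+```r
+# Hypothetical helper, not the real unmarked getDesign method
+toy_get_design <- function(state_formula, site_covs, y) {
+  # Keep rows with missing covariates so they can be detected below
+  mf <- model.frame(state_formula, site_covs, na.action = na.pass)
+  X <- model.matrix(state_formula, mf)
+
+  # Drop a site if any covariate is missing or all of its observations are missing
+  drop_site <- !complete.cases(X) | rowSums(!is.na(y)) == 0
+  if (any(drop_site)) {
+    warning(sum(drop_site), " site(s) dropped due to missing values")
+  }
+
+  list(y = y[!drop_site, , drop = FALSE],
+       X = X[!drop_site, , drop = FALSE],
+       removed_sites = which(drop_site))
+}
+
+site_covs <- data.frame(elev = c(0.2, NA, -1.0, 0.7))
+y <- matrix(c(1, 0,
+              0, 0,
+              NA, NA,
+              0, 1), nrow = 4, byrow = TRUE)
+
+# Site 2 has a missing covariate and site 3 has no observations
+des <- suppressWarnings(toy_get_design(~ elev, site_covs, y))
+des$removed_sites
+```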
+
+# Write a draft of the complete fitting function process
+
+* Simulate some data using your model
+* Construct the `unmarkedFrame`
+* Provide formulas, `unmarkedFrame`, other options to your draft fitting function
+* Process them with `getDesign`
+* Pass results from `getDesign` as inputs to your likelihood function
+* Optimize the likelihood function
+* Check the resulting parameter estimates for accuracy
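+
+The steps above can be sketched end to end with base R alone. This example assumes a simple occupancy model with one site covariate and constant detection, and it skips the `unmarkedFrame` and `getDesign` steps by building the design matrix directly; the object names and true parameter values are arbitrary choices for illustration.
+
+```r
+set.seed(123)
+
+# 1. Simulate data from the model
+M <- 1000  # sites
+J <- 4     # visits per site
+elev <- rnorm(M)
+truth <- c(psi_int = 0.5, psi_elev = 1, p_int = 0)
+psi <- plogis(truth[["psi_int"]] + truth[["psi_elev"]] * elev)
+z <- rbinom(M, 1, psi)  # latent occupancy state
+y <- matrix(rbinom(M * J, 1, z * plogis(truth[["p_int"]])), M, J)
+
+# 2. Build the inputs the likelihood needs (normally the job of getDesign)
+X <- model.matrix(~ elev)
+
+# 3. Negative log-likelihood with constant detection probability
+nll <- function(params, y, X) {
+  psi <- plogis(drop(X %*% params[1:2]))
+  p <- plogis(params[3])
+  n_det <- rowSums(y)
+  cp <- p^n_det * (1 - p)^(ncol(y) - n_det)
+  -sum(log(psi * cp + (1 - psi) * (n_det == 0)))
+}
+
+# 4. Optimize, starting all parameters at zero
+fit <- optim(c(0, 0, 0), nll, y = y, X = X, method = "BFGS", hessian = TRUE)
+
+# 5. Compare the estimates with the values used in the simulation
+est <- setNames(fit$par, names(truth))
+round(cbind(truth, est), 2)
+```
+
+With a reasonably large simulated dataset the estimates should land close to the true values; large discrepancies usually point to a bug in either the simulation or the likelihood.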
+
+# Output processing
+
+* Output from `optim` should be organized into `unmarkedEstimate` (S4) objects
+* One `unmarkedEstimate` per submodel (e.g. state, detection)
+* These objects include the parameter estimates and other info about link functions etc.
+* [Example](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/gdistremoval.R#L429)
+* Draft a new `unmarkedFit` S4 object type for your model
+* The main component of these objects is a list of the `unmarkedEstimates`
+* [Example definition](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/gdistremoval.R#L255), [Example creation of object](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/gdistremoval.R#L466)
+* Return this from your fitting function
+* Test everything out
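+
+As a loose sketch of the first few steps, the code below splits `optim` output into one object per submodel, using the inverse Hessian as the covariance matrix. The class `toyEstimate`, the helper `organize_estimates`, and the quadratic objective standing in for a real negative log-likelihood are all assumptions made for this example; the real `unmarkedEstimate` class has its own slots and constructor.
+
+```r
+library(methods)
+
+# Hypothetical, simplified stand-in for unmarked's estimate objects
+setClass("toyEstimate",
+  representation(name = "character", estimates = "numeric",
+                 covMat = "matrix", invlink = "character"))
+
+# Split optim() output into one toyEstimate per submodel.
+# idx: named list of parameter positions belonging to each submodel
+# invlink: named character vector of inverse link function names
+organize_estimates <- function(opt, idx, invlink) {
+  # The inverse Hessian of a negative log-likelihood approximates the covariance
+  cov_all <- solve(opt$hessian)
+  out <- lapply(names(idx), function(nm) {
+    i <- idx[[nm]]
+    new("toyEstimate",
+        name = nm,
+        estimates = opt$par[i],
+        covMat = cov_all[i, i, drop = FALSE],
+        invlink = invlink[[nm]])
+  })
+  setNames(out, names(idx))
+}
+
+# Stand-in objective with a known minimum at c(1, 2, 3) and identity Hessian
+fn <- function(b) 0.5 * sum((b - c(1, 2, 3))^2)
+opt <- optim(c(0, 0, 0), fn, method = "BFGS", hessian = TRUE)
+
+ests <- organize_estimates(opt,
+                           idx = list(state = 1:2, det = 3),
+                           invlink = c(state = "plogis", det = "plogis"))
+
+# Standard errors for the state submodel
+sqrt(diag(ests$state@covMat))
+```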
+
+# unmarkedFit Methods
+
+* Develop methods specific to your `unmarkedFit` type for operating on the output
+* Required methods include: `fitted`, `residuals`, `getP`, `ranef`, `simulate`, and `plot`
+* Make sure your model type works with other `unmarked` functions such as `parboot`, `nonparboot` etc.
+* [Examples](https://github.com/rbchan/unmarked/blob/c82e63947d7df7dfc896066e51dbf63bda3babf4/R/gdistremoval.R#L476)
+
+# Write tests
+
+* Write tests for your `unmarkedFrame`, fitting function, and methods code
+* Tests should be fast but cover all the key configurations
+* Use the `testthat` package
+* [Example](https://github.com/rbchan/unmarked/blob/master/tests/testthat/test_gdistremoval.R)
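+
+For orientation, a test file might look roughly like the sketch below. The function `nll_binom` is a hypothetical stand-in for the code under test, chosen because its expected values can be worked out by hand; tests for a real model would call your fitting function and methods instead.
+
+```r
+library(testthat)
+
+# Hypothetical function under test: a small binomial negative log-likelihood
+nll_binom <- function(logit_p, successes, trials) {
+  -sum(dbinom(successes, trials, plogis(logit_p), log = TRUE))
+}
+
+test_that("nll_binom returns the expected value", {
+  # With p = 0.5, P(1 success in 2 trials) = 0.5
+  expect_equal(nll_binom(0, successes = 1, trials = 2), -log(0.5))
+})
+
+test_that("optimizing nll_binom recovers the MLE", {
+  fit <- optim(0, nll_binom, successes = 30, trials = 100, method = "BFGS")
+  # The MLE of p is successes / trials
+  expect_equal(plogis(fit$par), 0.3, tolerance = 1e-3)
+})
+```
+
+Small fixed inputs like these keep the tests fast while still checking both the likelihood value and the optimization.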
+
+# Add to unmarked
+
+* Fork the `unmarked` [repository](https://github.com/rbchan/unmarked) on GitHub
+* Make a new branch named after your new function
+* Add the new code
+* Send a pull request on GitHub
+* Probably fix a few things
+* Merged and done!