-
Notifications
You must be signed in to change notification settings - Fork 8
/
3.2_daily_nest_survival.Rmd
254 lines (177 loc) · 9.68 KB
/
3.2_daily_nest_survival.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
# Daily nest survival {#dailynestsurv}
<!-- todo: explain why we need to start with first at day of detection, give literatur to describe the known fate model (e.g. King's book) -->
## Background
Analyses of nest survival is important for understanding the mechanisms of population dynamics. The life-span of a nest could be used as a measure of nest survival. However, this measure very often is biased towards nests that survived longer because these nests are detected by the ornithologists with higher probability [@Mayfield1975]. In order not to overestimate nest survival, daily nest survival conditional on survival to the previous day can be estimated.
## Models for estimating daily nest survival
What model is best used depends on the type of data available. Data may look:
1. Regular (e.g. daily) nest controls, all nests monitored from their first egg onward
2. Regular nest controls, nests found during the course of the study at different stages and nestling ages
3. Irregular nest controls, all nests monitored from their first egg onward
4. Irregular nest controls, nests found during the course of the study at different stages and nestling ages
Table: (\#tab:nestsurvmod) Models useful for estimating daily nest survival. Data numbers correspond to the descriptions above.
Model | Data | Software, R-code |
:-------|:------------------|:------------------|
Binomial or Bernoulli model | 1, (3) | `glm`, `glmer`,... |
Cox proportional hazard model | 1,2,3,4 | `brm`, soon: `stan_cox` |
Known fate model | 1, 2 | Stan code below |
Known fate model | 3, 4 | Stan code below |
Logistic exposure model | 1,2,3,4 | `glm`, `glmer`using a link function that depends on exposure time |
@Shaffer2004 explains how to adapt the link function in a Bernoulli model to account for having found the nests at different nest ages (exposure time). Ben Bolker explains how to implement the logistic exposure model in R [here](https://rpubs.com/bbolker/logregexp).
## Known fate model
A natural model that allows estimating daily nest survival is the known-fate survival model. It is a Markov model that models the state of a nest $i$ at day $t$ (whether it is alive, $y_{it}=1$ or not $y_{it}=0$) as a Bernoulli variable dependent on the state of the nest the day before.
$$ y_{it} \sim Bernoulli(y_{it-1}S_{it})$$
The daily nest survival $S_{it}$ can be linearly related to predictor variables that are measured on the nest or on the day level.
$$logit(S_{it}) = \textbf{X} \beta$$
It is also possible to add random effects if needed.
## The Stan model {#dailynestsurvstan}
The following Stan model code is saved as `daily_nest_survival.stan`.
```{r engine='cat', engine.opts=list(file="stanmodels/daily_nest_survival.stan",lang="stan")}
data {
int<lower=0> Nnests; // number of nests
int<lower=0> last[Nnests]; // day of last observation (alive or dead)
int<lower=0> first[Nnests]; // day of first observation (alive or dead)
int<lower=0> maxage; // maximum of last
int<lower=0> y[Nnests, maxage]; // indicator of alive nests
real cover[Nnests]; // a covariate of the nest
real age[maxage]; // a covariate of the date
}
parameters {
vector[3] b; // coef of linear pred for S
}
model {
real S[Nnests, maxage-1]; // survival probability
for(i in 1:Nnests){
for(t in first[i]:(last[i]-1)){
S[i,t] = inv_logit(b[1] + b[2]*cover[i] + b[3]*age[t]);
}
}
// priors
b[1]~normal(0,5);
b[2]~normal(0,3);
b[3]~normal(0,3);
// likelihood
for (i in 1:Nnests) {
for(t in (first[i]+1):last[i]){
y[i,t]~bernoulli(y[i,t-1]*S[i,t-1]);
}
}
}
```
## Prepare data and run Stan
Data is from @Grendelmeier2018.
```{r, echo=TRUE}
load("RData/nest_surv_data.rda")
str(datax)
datax$y[is.na(datax$y)] <- 0 # Stan does not allow for NA's in the outcome
```
```{r, echo=TRUE, cache=TRUE, results=FALSE}
# Run STAN
library(rstan)
mod <- stan(file = "stanmodels/daily_nest_survival.stan", data=datax,
chains=5, iter=2500, control=list(adapt_delta=0.9), verbose = FALSE)
```
## Check convergence
We love exploring the performance of the Markov chains by using the function `launch_shinystan` from the package `shinystan`.
## Look at results
It looks like cover does not affect daily nest survival, but daily nest survival decreases with the age of the nestlings.
```{r printmodel, echo=TRUE}
#launch_shinystan(mod)
print(mod)
```
```{r effplots, echo=TRUE, fig.cap="Estimated daily nest survival probability in relation to nest age. Dotted lines are 95% uncertainty intervals of the regression line."}
# effect plot
bsim <- as.data.frame(mod)
nsim <- nrow(bsim)
newdat <- data.frame(age=seq(1, datax$maxage, length=100))
newdat$age.z <- (newdat$age-mean(1:datax$maxage))/sd((1:datax$maxage))
Xmat <- model.matrix(~age.z, data=newdat)
fitmat <- matrix(ncol=nsim, nrow=nrow(newdat))
for(i in 1:nsim) fitmat[,i] <- plogis(Xmat%*%as.numeric(bsim[i,c(1,3)]))
newdat$fit <- apply(fitmat, 1, median)
newdat$lwr <- apply(fitmat, 1, quantile, prob=0.025)
newdat$upr <- apply(fitmat, 1, quantile, prob=0.975)
plot(newdat$age, newdat$fit, ylim=c(0.8,1), type="l",
las=1, ylab="Daily nest survival", xlab="Age [d]")
lines(newdat$age, newdat$lwr, lty=3)
lines(newdat$age, newdat$upr, lty=3)
```
## Known fate model for irregular nest controls
When nest are controlled only irregularly, it may happen that a nest is found predated or dead after a longer break in controlling. In such cases, we know that the nest was predated or it died due to other causes some when between the last control when the nest was still alive and when it was found dead. In such cases, we need to tell the model that the nest could have died any time during the interval when we were not controlling.
To do so, we create a variable that indicates the time (e.g. day since first egg) when the nest was last seen alive (`lastlive`). A second variable indicates the time of the last check which is either the equal to `lastlive` when the nest survived until the last check, or it is larger than `lastlive` when the nest failure has been recorded. A last variable, `gap`, measures the time interval in which the nest failure occurred. A `gap` of zero means that the nest was still alive at the last control, a `gap`of 1 means that the nest failure occurred during the first day after `lastlive`, a `gap` of 2 means that the nest failure either occurred at the first or second day after `lastlive`.
```{r, echo=TRUE, cache=TRUE, results=FALSE}
# time when nest was last observed alive
lastlive <- apply(datax$y, 1, function(x) max(c(1:length(x))[x==1]))
# time when nest was last checked (alive or dead)
lastcheck <- lastlive+1
# here, we turn the above data into a format that can be used for
# irregular nest controls. WOULD BE NICE TO HAVE A REAL DATA EXAMPLE!
# when nest was observed alive at the last check, then lastcheck equals lastlive
lastcheck[lastlive==datax$last] <- datax$last[lastlive==datax$last]
datax1 <- list(Nnests=datax$Nnests,
lastlive = lastlive,
lastcheck= lastcheck,
first=datax$first,
cover=datax$cover,
age=datax$age,
maxage=datax$maxage)
# time between last seen alive and first seen dead (= lastcheck)
datax1$gap <- datax1$lastcheck-datax1$lastlive
```
In the Stan model code, we specify the likelihood for each gap separately.
```{r engine='cat', engine.opts=list(file="stanmodels/daily_nest_survival_irreg.stan",lang="stan")}
data {
int<lower=0> Nnests; // number of nests
int<lower=0> lastlive[Nnests]; // day of last observation (alive)
int<lower=0> lastcheck[Nnests]; // day of observed death or, if alive, last day of study
int<lower=0> first[Nnests]; // day of first observation (alive or dead)
int<lower=0> maxage; // maximum of last
real cover[Nnests]; // a covariate of the nest
real age[maxage]; // a covariate of the date
int<lower=0> gap[Nnests]; // obsdead - lastlive
}
parameters {
vector[3] b; // coef of linear pred for S
}
model {
real S[Nnests, maxage-1]; // survival probability
for(i in 1:Nnests){
for(t in first[i]:(lastcheck[i]-1)){
S[i,t] = inv_logit(b[1] + b[2]*cover[i] + b[3]*age[t]);
}
}
// priors
b[1]~normal(0,1.5);
b[2]~normal(0,3);
b[3]~normal(0,3);
// likelihood
for (i in 1:Nnests) {
for(t in (first[i]+1):lastlive[i]){
1~bernoulli(S[i,t-1]);
}
if(gap[i]==1){
target += log(1-S[i,lastlive[i]]); //
}
if(gap[i]==2){
target += log((1-S[i,lastlive[i]]) + S[i,lastlive[i]]*(1-S[i,lastlive[i]+1])); //
}
if(gap[i]==3){
target += log((1-S[i,lastlive[i]]) + S[i,lastlive[i]]*(1-S[i,lastlive[i]+1]) +
prod(S[i,lastlive[i]:(lastlive[i]+1)])*(1-S[i,lastlive[i]+2])); //
}
if(gap[i]==4){
target += log((1-S[i,lastlive[i]]) + S[i,lastlive[i]]*(1-S[i,lastlive[i]+1]) +
prod(S[i,lastlive[i]:(lastlive[i]+1)])*(1-S[i,lastlive[i]+2]) +
prod(S[i,lastlive[i]:(lastlive[i]+2)])*(1-S[i,lastlive[i]+3])); //
}
}
}
```
```{r, echo=TRUE, cache=TRUE, results=FALSE}
# Run STAN
mod1 <- stan(file = "stanmodels/daily_nest_survival_irreg.stan", data=datax1,
chains=5, iter=2500, control=list(adapt_delta=0.9), verbose = FALSE)
```
## Further reading {-}
Helpful links:
https://deepai.org/publication/bayesian-survival-analysis-using-the-rstanarm-r-package [@Brilleman.2020]
https://www.hammerlab.org/2017/06/26/introducing-survivalstan/