Intro to Occupancy Modeling

UA Summer R Workshop: Week 4


Nicholas M. Caruso
Christina L. Staudhammer
28 June 2016

Intro to Occupancy Models


Understanding how different variables relate to the presence and absence of a species at a location is a
recurring theme in ecology. Often, however, we cannot detect a species perfectly (i.e., failing to detect a
species during a survey doesn't necessarily mean the site is unoccupied). Here we will go over the basics of occupancy
modeling and how to perform and visualize these analyses in R. I strongly recommend reading Occupancy
Estimation and Modeling (MacKenzie et al., 2006).

State (Ecological) and Observation Processes


Occupancy models are essentially hierarchical logistic regression models, in which the probability of occupancy
is modeled jointly with the probability of detection. Under the occupancy modeling framework, our observed data
are a realization of two processes: the state or ecological process (whether a species is present at a given site) and
the observation process (whether we detect a species at a site, conditional on that species being present). Thus, if
we detect a species, we know that it is present; however, if we do not detect a species, it is either absent from
that site or present but undetected. Using repeated sampling (typically temporal replicates; spatial replicates can
be used if sites are close together, which depends on the species of interest), we can account for imperfect detection,
and by collecting relevant covariate data (e.g., time of year, weather; depends on the species of interest) we can
account for variable detection (e.g., a salamander species that only emerges after rainfall events or during
cooler temperatures).
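The two processes described above can be written compactly. The notation below is an assumption made for illustration (i indexes sites, j indexes surveys, x_i is a site covariate, and w_ij is an observation covariate); the symbols psi, z, p, and y match those used in the simulation later in this document.

```latex
\begin{align}
z_i &\sim \mathrm{Bernoulli}(\psi_i), &
\mathrm{logit}(\psi_i) &= \alpha_\psi + \beta_\psi x_i \\
y_{ij} \mid z_i &\sim \mathrm{Bernoulli}(z_i \, p_{ij}), &
\mathrm{logit}(p_{ij}) &= \alpha_p + \beta_p w_{ij}
\end{align}
```

When z_i = 0 the detection probability z_i p_ij is zero, so an unoccupied site can only produce non-detections.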

Assumptions
1) No false identifications
2) Sites are closed: no change in ecological state during sampling
     Can use multi-season (dynamic) occupancy models to model colonization and extinction
3) Sites are independent
     Sites should be far enough apart such that movement does not occur between sites
     Individuals detected at one site cannot be detected at another site
4) No unexplained variability in occupancy among sites or in detection among sites or surveys
     We must include covariates in our model that explain variation in detection/occupancy

Sampling Design
1) The duration between temporal replicates should be short enough that the ecological state does not
change (otherwise dynamic models are needed)

2) Sites should be chosen randomly, stratified randomly, or other probabilistic sampling schemes (depending
on species of interest)

Create Data
For our imaginary study, we surveyed 30 different sites for a given species with five replicate surveys for each
site.
sites <- 30
surveys <- 5
y <- array(dim=c(sites,surveys))

Ecological (state) Process


We will assume that the occurrence of this species varies positively with a given covariate across our sites.
The expected occupancy probability (psi) is a logistic function of the covariate, while the
occurrence (z) of this species (1 or 0) at each site follows a Bernoulli distribution with probability psi. We can
find the total number of occupied sites by summing our occurrence vector.
set.seed(8675309)
a.psi <- 0.1
b.psi <- 10
covar.psi <- sort(runif(n=sites, min=-1, max=1))
psi <- plogis(a.psi + b.psi*covar.psi)
## Add Bernoulli noise - draw occurrence indicator z from occupancy prob (psi)
z <- rbinom(n=sites, size=1, prob=psi)
total.occu <- sum(z)

Observation process
Our observations of this species are conditional on its state (i.e., we cannot observe a species if it is not present
at a site). Similar to the ecological process, the expected detection probability (p) is a logistic function of a
covariate, and our observations of this species (y) follow a Bernoulli distribution. Note that we multiply
detection probability (p) by occurrence (z), which forces the detection probability to zero wherever the
species is absent. We can then compute the naive estimate of occupancy (a site counts as occupied if the species
was detected in at least one survey) and the naive predictions from a logistic regression of those observations
on the covariate of interest.
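## Reusing the seed makes covar.p identical to covar.psi, so the same
## covariate raises occupancy while lowering detection
set.seed(8675309)
a.p <- -0.1
b.p <- -4
covar.p <- sort(runif(n=sites, min=-1, max=1))
p <- plogis(a.p + b.p*covar.p)
## Detection is only possible at occupied sites (z = 1)
prob.det <- z*p
for(i in 1:surveys){
  y[,i] <- rbinom(n=sites, size=1, prob=prob.det)
}
## A site is naively "occupied" if the species was detected at least once
naive.occu <- apply(y, 1, max)
naive.pred <- plogis(predict(glm(naive.occu~covar.psi,
                                 family=binomial)))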
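The naive estimate counts an occupied site as empty whenever every survey misses the species. For a constant per-survey detection probability p and K surveys, the chance of at least one detection at an occupied site is 1 - (1 - p)^K. The sketch below uses made-up values of p (p.demo and p.star are illustrative names, not objects from the workshop code) to show how quickly that chance drops when detection is low.

```r
# Hypothetical per-survey detection probabilities
p.demo <- c(0.1, 0.3, 0.5, 0.9)
surveys.demo <- 5

# Probability of detecting the species at least once at an occupied site
p.star <- 1 - (1 - p.demo)^surveys.demo
round(p.star, 3)
```

With p = 0.1, an occupied site has less than an even chance of being recorded as occupied after five surveys, which is why the naive curve falls away from the expected curve where detection is poor.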

Plot Naive and Expected Occupancy


par(mfrow=c(1,2))
plot(p~covar.p, ylim=c(0,1), xlab="Detection Covariate",
     ylab="Detection Probability (p)", type="l", xlim=c(-1,1),
     lwd=2, col="black", las=1, frame.plot=FALSE)
plot(psi~covar.psi, xlab="Occupancy Covariate",
     ylab=expression(paste("Occupancy Probability ( ",psi,")")),
     las=1, type="l", col="steelblue1", lwd=2,
     frame.plot=FALSE, ylim=c(0,1), xlim=c(-1,1))
lines(naive.pred~covar.psi, lwd=2, lty=2, col="firebrick1")
legend('topleft', legend=c('Expected','Naive prediction'),
       bty='n', lty=c(1,2), lwd=c(2,2), col=c('steelblue1','firebrick1'))

[Figure: left panel, detection probability (p) against the detection covariate; right panel, occupancy probability (psi) against the occupancy covariate, showing the expected curve and the naive prediction.]

Single Season Occupancy Model: unmarked package


Unmarked Dataframe
Before setting up our models, we need to convert our dataset into an unmarked data frame using
unmarkedFrameOccu(). The observation covariates should be supplied as a named list, in which each element
is a sites-by-surveys matrix holding a covariate that varies among sites and/or surveys, while the site
covariates should be in a data frame.
# install.packages('unmarked')
library(unmarked)
obs.cov <- list(covar.p = matrix(covar.p, nrow=sites, ncol=surveys,
byrow=FALSE))
data.umf <- unmarkedFrameOccu(y=y, siteCovs=data.frame(covar.psi=covar.psi),
obsCovs=obs.cov)
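Because covar.p has one value per site, matrix() with byrow=FALSE recycles it column by column, so every survey column holds the same site-level values. A small base R sketch with made-up numbers (x.demo and m.demo are illustrative names, not part of the workshop data):

```r
# Three hypothetical sites, two surveys
x.demo <- c(-0.5, 0.0, 0.5)
m.demo <- matrix(x.demo, nrow=3, ncol=2, byrow=FALSE)
m.demo

# Each column repeats the site-level covariate
identical(m.demo[,1], m.demo[,2])
```

A covariate that truly changes from survey to survey (for example, temperature on the day of each visit) would instead need a different value in each cell of the matrix.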

Fit a constant model


For a single-season occupancy model, we will use the occu() function. The model is specified by a double
right-hand-side formula, in which we describe the covariates for detection first and occupancy second.
Here, we model both detection and occupancy as constants.
const.mod <- occu(~1 ~1, data.umf)
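To make concrete what a constant psi(.) p(.) model estimates, the likelihood can be written by hand and maximized with optim(). This is a minimal sketch in base R, not the code unmarked uses internally; occ.nll, y.sim, and the true values (psi = 0.6, p = 0.4, 200 sites) are assumptions chosen only for illustration.

```r
# Negative log-likelihood for a constant occupancy model.
# par holds logit(psi) and logit(p); y is a sites x surveys 0/1 matrix.
occ.nll <- function(par, y) {
  psi <- plogis(par[1])
  p <- plogis(par[2])
  K <- ncol(y)
  d <- rowSums(y)  # number of detections per site
  # Occupied with this detection history, or unoccupied (only if d == 0)
  lik <- psi * p^d * (1 - p)^(K - d) + (1 - psi) * (d == 0)
  -sum(log(lik))
}

# Simulate a hypothetical data set
set.seed(1)
n.sites <- 200
n.surveys <- 5
z.sim <- rbinom(n.sites, size=1, prob=0.6)
y.sim <- matrix(rbinom(n.sites * n.surveys, size=1,
                       prob=rep(z.sim * 0.4, n.surveys)),
                nrow=n.sites)

# Maximize the likelihood and back-transform to the probability scale
fit <- optim(c(0, 0), occ.nll, y=y.sim)
est <- plogis(fit$par)
names(est) <- c("psi", "p")
round(est, 2)
```

The second term of the likelihood is what separates this from a plain logistic regression: a site with no detections contributes both the chance that it is unoccupied and the chance that it is occupied but was missed every time.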

Covariates
We can also fit covariates for both detection and occupancy. First, we will determine whether our covariate
explains detection and occupancy better than the constant model by comparing the four competing models
with AICc, using the AICcmodavg package.
det.mod <- occu(~covar.p ~1, data.umf)
occu.mod <- occu(~1 ~covar.psi, data.umf)
full.mod <- occu(~covar.p ~covar.psi, data.umf)
# install.packages('AICcmodavg')
library(AICcmodavg)
library(knitr)

## aictab() expects the candidate models as a list
kable(aictab(cand.set=list(const.mod, det.mod, occu.mod, full.mod),
             modnames=c('psi(.) p(.)',
                        'psi(.) p(cov)',
                        'psi(cov) p(.)',
                        'psi(cov) p(cov)')), digits=2)

     Modnames          K   AICc  Delta_AICc  ModelLik  AICcWt      LL  Cum.Wt
4    psi(cov) p(cov)   4  61.09        0.00      1.00    0.92  -25.75    0.92
2    psi(.) p(cov)     3  67.53        6.44      0.04    0.04  -30.31    0.95
1    psi(.) p(.)       2  68.04        6.95      0.03    0.03  -31.80    0.98
3    psi(cov) p(.)     3  69.09        7.99      0.02    0.02  -31.08    1.00
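The AICc and weight columns can be reproduced by hand from the log-likelihoods (LL), the number of parameters (K), and the number of sites (n = 30). The sketch below retypes the rounded LL values from the table, so results can differ from the table in the second decimal place; ll, k, and n.sites are illustrative names.

```r
# Rounded values copied from the table, in the same row order
ll <- c(-25.75, -30.31, -31.80, -31.08)
k <- c(4, 3, 2, 3)
n.sites <- 30

# AICc = -2*LL + 2K + small-sample correction
aicc <- -2*ll + 2*k + (2*k*(k + 1)) / (n.sites - k - 1)
delta <- aicc - min(aicc)

# Akaike weights: relative likelihoods rescaled to sum to 1
wt <- exp(-0.5*delta) / sum(exp(-0.5*delta))
round(data.frame(AICc=aicc, Delta_AICc=delta, AICcWt=wt), 2)
```

The full model carries roughly 92% of the weight, so the covariate is well supported for both detection and occupancy in this simulated data set.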

Sites Occupied
We can estimate the number of sites predicted to be occupied from both the occupancy model and the logistic
regression. To extract the state predictions, we will use ranef() to estimate the posterior distribution of the
latent occurrence state at each site; bup(), which extracts the Best Unbiased Predictor from that posterior (here,
the posterior mode); and confint(), which computes confidence intervals. For the logistic regression, we'll
extract the predictions and standard errors using augment() in the broom package.
re <- ranef(full.mod)
## Recent broom releases need se_fit=TRUE to return the .se.fit column
glm.aug <- broom::augment(glm(naive.occu~covar.psi,
                              family=binomial), se_fit=TRUE)
library(dplyr)
## Interval of +/- 1 SE on the link (logit) scale
glm.aug <- glm.aug %>%
  select(.fitted, .se.fit) %>%
  mutate(lower=.fitted-.se.fit,
         upper=.fitted+.se.fit)
kable(data.frame(Estimate=c('Reality','Occupancy Model Estimate','Naive Estimate'),
                 'Sites Occupied'=c(total.occu, sum(bup(re, stat="mode")), sum(naive.pred)),
                 'Lower.CI'=c(NA, sum(confint(re)[,1]), sum(plogis(glm.aug$lower))),
                 'Upper.CI'=c(NA, sum(confint(re)[,2]), sum(plogis(glm.aug$upper)))),
      digits=1)
Estimate                   Sites.Occupied  Lower.CI  Upper.CI
Reality                                19        NA        NA
Occupancy Model Estimate               18      13.0      20.0
Naive Estimate                          6       3.6       9.6
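A site with at least one detection is occupied with certainty, so the interesting cases are the sites with no detections. In the simplest setting of constant psi and p, Bayes' rule gives the probability that such a site is nonetheless occupied. The sketch below shows that calculation with made-up values; cond.occu is an illustrative helper, not a function from unmarked, and the fitted model here uses covariate-specific psi and p instead.

```r
# Probability a site is occupied given no detections in K surveys
cond.occu <- function(psi, p, K) {
  missed <- psi * (1 - p)^K      # occupied but never detected
  missed / (missed + (1 - psi))  # over all ways to record no detections
}

# Moderate detection: an all-zero history is strong evidence of absence
cond.occu(psi=0.6, p=0.4, K=5)

# Poor detection: an all-zero history says much less
cond.occu(psi=0.6, p=0.1, K=5)
```

This is why the occupancy model recovers sites that the naive estimate drops: where detection is poor, undetected sites keep a substantial probability of being occupied.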

Plot Detection and Occupancy


## Plot occupancy prediction with naive prediction and expected
pred.psi <- predict(full.mod, type="state")
pred.dat <- data.frame(pred.psi=pred.psi$Predicted,
                       reality=psi,
                       naive.pred=naive.pred, covar.psi=covar.psi)
point.dat <- data.frame(naive.occu=naive.occu,
                        reality=z, covar.psi=covar.psi)
library(tidyr)
library(dplyr)
pred.dat <- pred.dat %>%
  gather(estimate, value, pred.psi:naive.pred)
point.dat <- point.dat %>%
  gather(estimate, value, naive.occu:reality)
library(ggplot2)
ggplot(pred.dat) +
  geom_line(aes(x=covar.psi, y=value, linetype=estimate), lwd=1) +
  geom_jitter(data=point.dat, aes(x=covar.psi, y=value, shape=estimate),
              width=0.1, height=0.01, size=2) +
  theme_bw() +
  scale_shape_manual('', labels=c('Observed','Expected'),
                     values=c(1,15)) +
  scale_linetype_manual('Estimate', labels=c('Naive','Occupancy','Reality'),
                        values=c(3,2,1)) +
  labs(x='Covariate', y=expression(paste('Occupancy Probability ( ',psi,')'))) +
  theme(legend.position=c(0.1, 0.8),
        legend.background=element_rect('transparent'))

[Figure: occupancy probability (psi) against the covariate, with lines for the naive, occupancy-model, and "Reality" estimates and points for observed and expected occurrence.]
