
STAT 431

Geoffrey Michael Williams


Computing Project
21 Oct 2015
Background: In the Barro Colorado Island 50 hectare tropical forest dynamics plot in Panama,
ecologists have been collecting data on all individuals > 10 cm diameter-at-breast-height (DBH)
of all 252 species of trees that occur on the plot since 1982, at roughly 5 year intervals. The data
are available online through the Smithsonian Tropical Research Institute's website. Ecologists are
generally interested in tracking changes in the species composition of the forest community, and
may want to know which species are increasing or decreasing in abundance. For a given species,
abundance is calculated as the number of individuals in the plot. Over the course of the 30 years
the plot data have been collected, changes in abundance for each species are approximately linear,
because tree lifespans (>200 yrs) are much longer than 30 years. 252 linear regressions, one per
species, were performed in R. Each regression treated species abundance as the response variable
and the census year (1982, 1985, 1990, 1995, 2000, and 2005) as the independent variable. Next, each
model was tested for a significant linear trend with a two-tailed t-test (alpha=0.002) of Beta 1,
the average change in number of individuals for that species per year. The species that passed the
test were plotted with their regression lines in color-coded scatter plots (Appendix 1). In each
test, the null hypothesis was that the coefficient Beta 1 is equal to zero; it was tested by
comparing the magnitude of the t-statistic for Beta 1 to the critical t (df=4, alpha/2=0.001) = 7.173.
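As a minimal sketch of the test described above, the following R code fits one regression with lm() and compares the t statistic of the slope with the critical value from qt(). The counts in y are made-up illustration values, not plot data, and the names y, fit, tstat, tcrit and significant are assumptions of this sketch.

```r
# Hypothetical counts for one species at the six censuses (illustration only)
t <- c(0, 3, 8, 13, 18, 23)           # years since 1982
y <- c(120, 131, 150, 166, 189, 204)  # made-up abundances, not plot data

fit <- lm(y ~ t)                      # simple linear regression of abundance on time
tstat <- summary(fit)$coefficients["t", "t value"]  # t statistic for Beta 1

alpha <- 0.002
tcrit <- qt(1 - alpha / 2, df = length(t) - 2)      # two-tailed critical t, df = 4

significant <- abs(tstat) > tcrit     # TRUE when H0: Beta 1 = 0 is rejected
```

Here tcrit reproduces the tabulated value of about 7.173 used in the report.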
Results: 43 species of trees showed rates of increase or decline that were statistically
significant at alpha=0.002. Fitted models for all 43 species had an R2 of at least 0.928 and an
F-statistic greater than the critical F(1, 4, alpha=0.002) = 51.5. The average change in number of
individuals per year among these species ranged from -24.4 to 28.4. The results are summarized in
Appendix 2.
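The statistics reported above are linked: in a regression with a single predictor, F equals the square of the t statistic for Beta 1, and R2 = F / (F + df) with df = 4 residual degrees of freedom. The sketch below checks both relations with base R's qt() and qf() and with two rows copied from Appendix 2; the helper name r2_from_f is an assumption of this example.

```r
df_res <- 4                          # residual degrees of freedom (6 time points - 2)
alpha <- 0.002

tcrit <- qt(1 - alpha / 2, df_res)   # two-tailed critical t
fcrit <- qf(1 - alpha, 1, df_res)    # critical F(1, 4); equals tcrit^2

# R-squared implied by an F statistic when there is a single predictor
r2_from_f <- function(f, df = df_res) f / (f + df)

# F statistics copied from Appendix 2
f_adelia <- 213.442099049975         # Adelia triloba
f_garcinia <- 52.0363999895552       # Garcinia intermedia

r2_adelia <- r2_from_f(f_adelia)     # reported R-squared: 0.981604298259277
r2_garcinia <- r2_from_f(f_garcinia) # reported R-squared: 0.928617826970584
```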
Discussion: The linear model was chosen because it approximates changes in abundance for a
given species accurately over a short time scale; over longer time scales, population growth and
decay are nonlinear. The major limitation of the chosen analysis is the small sample size of each
time series used in the regressions; there are only 6 time points for each species.
This may explain the high F-statistics and Beta 1 t-statistics. To account for this shortcoming in
the data, alpha was set to 0.002 instead of 0.05. Six time points should be sufficient because
we are assuming recruitment and mortality are constant, as explained in the following paragraph.
Another concern is that larger overall abundance could, by chance alone, produce larger changes
in abundance. However, this is countered by the observation of consistent trends over time.
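To make the effect of the stricter significance level concrete, the sketch below compares the two-tailed critical t at alpha = 0.05 and alpha = 0.002 for df = 4 using base R's qt(). It also derives the smallest R2 that can reach significance at each level, using the single-predictor identity R2 = t^2 / (t^2 + df). The variable names are assumptions of this example.

```r
df_res <- 4                          # 6 time points minus 2 estimated coefficients
alphas <- c(0.05, 0.002)             # conventional level and the level used here

tcrit <- qt(1 - alphas / 2, df_res)  # two-tailed critical t for each alpha
names(tcrit) <- alphas

# Smallest R-squared that reaches significance at each level (one predictor)
r2_min <- tcrit^2 / (tcrit^2 + df_res)
```

With only four residual degrees of freedom, the critical |t| rises from about 2.776 to about 7.173, so a species needs a very tight linear fit to be counted as changing.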
The rate of change in abundance is equal to the difference between the recruitment rate (new
saplings >10cm DBH per year) and the mortality rate. Assuming negligible variation in recruitment
and mortality, additional major fluctuations in abundance would not have been observed under a
more frequent sampling scheme. Estimated Beta 1 and abundance could be input as parameters into a
model of population growth over longer time scales. Another question that could be asked is whether
changes in abundance have changed which species are the most or the least abundant.
Tropical forests are composed of many rare species, which contribute to very high diversity, and
a few hyperdominant species that account for a disproportionate share of total abundance.
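As a small worked example of using Beta 1 as an input to further modelling, the sketch below converts fitted slopes into the net change implied between the 1982 and 2005 censuses. The two slopes are copied from Appendix 2; treating the rate as exactly constant over the period is the assumption of this sketch, and net_change is a name introduced here.

```r
years <- 23   # span of the censuses used in the regressions, 1982 to 2005

# Fitted slopes (trees per year) copied from Appendix 2
beta1 <- c("Faramea occidentalis" = 28.4409381663113,
           "Trichilia tuberculata" = -24.4243070362473)

# Net change in abundance implied by a constant rate (recruitment minus mortality)
net_change <- beta1 * years
round(net_change)
```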
Appendix 1: Scatter plots with fitted regression lines (figure not reproduced). x axis: time; y axis: abundance (# of individuals); 34 changing spp
Appendix 2: List of species with statistically significant positive (+) linear coefficients
[1] Alseis blackiana Cassipourea elliptica Cupania seemannii
[4] Drypetes standleyi Eugenia oerstediana Faramea occidentalis
[7] Garcinia intermedia Hirtella triandra Inga acuminata
[10] Inga thibaudiana Lacmellea panamensis Laetia procera
[13] Pouteria reticulata Protium tenuifolium Spondias radlkoferi
[16] Tabernaemontana arborea Tachigali versicolor Tetragastris panamensis
[19] Trichilia pallida Xylopia macrantha

List of species with statistically significant negative (-) linear coefficients


[1] Adelia triloba Astrocaryum standleyanum Beilschmiedia pendula
[4] Casearia arborea Casearia sylvestris Dendropanax arboreus
[7] Ficus tonduzii Guarea 'fuzzy' Guatteria dumetorum
[10] Hasseltia floribunda Hirtella americana Inga cocleensis
[13] Lacistema aggregatum Lindackeria laurina Lonchocarpus heptaphyllus
[16] Platymiscium pinnatum Platypodium elegans Poulsenia armata
[19] Siparuna guianensis Sloanea terniflora Trichilia tuberculata
[22] Trophis racemosa Zuelania guidonia

All significant species: species name, rate of change in # of trees/yr (Beta 1), R-squared, and F statistic
Row Species Beta_1 R_squared F
1 Adelia triloba -2.03539445628998 0.981604298259277 213.442099049975
2 Alseis blackiana 9.30703624733475 0.970441712163511 131.325835587207
3 Astrocaryum standleyanum -3.05373134328358 0.970477528480617 131.490011307953
4 Beilschmiedia pendula -1.56460554371002 0.94963403471829 75.4187101870688
5 Casearia arborea -3.09808102345416 0.970613594178035 132.117360667843
6 Casearia sylvestris -1.18592750533049 0.972020156888916 138.960058214748
7 Cassipourea elliptica 1.28102345415778 0.975708533542084 160.666880319054
8 Cupania seemannii 1.16162046908316 0.945117729325721 68.8832818113459
9 Dendropanax arboreus -1.61791044776119 0.99390418374449 652.187757691084
10 Drypetes standleyi 5.1590618336887 0.952832347092786 80.8038804870926
11 Eugenia oerstediana 2.5863539445629 0.937443176583637 59.941865675898
12 Faramea occidentalis 28.4409381663113 0.930186450234486 53.2954679060872
13 Ficus tonduzii -1.21151385927505 0.986503546632395 292.374157791785
14 Garcinia intermedia 2.25245202558635 0.928617826970584 52.0363999895552
15 Guarea 'fuzzy' -2.32196162046908 0.988590274724697 346.578116780642
16 Guatteria dumetorum -4.07547974413646 0.945349868078739 69.1928699781216
17 Hasseltia floribunda -3.43965884861407 0.957492002792694 90.0999403122319
18 Hirtella americana -0.249466950959488 0.953844224256867 82.6630434782609
19 Hirtella triandra 10.9462686567164 0.989417645823737 373.987726868184
20 Inga acuminata 1.8409381663113 0.945098116775584 68.8572457824381
21 Inga cocleensis -1.25501066098081 0.977891547595055 176.926277730114
22 Inga thibaudiana 1.86993603411514 0.957793424786226 90.7719633668507
23 Lacistema aggregatum -0.527078891257996 0.936019410337475 58.5189611583545
24 Lacmellea panamensis 0.922814498933902 0.947329495110514 71.9438324806808
25 Laetia procera 0.280597014925373 0.982089552238806 219.333333333333
26 Lindackeria laurina -1.5091684434968 0.983781013360689 242.62453265144
27 Lonchocarpus heptaphyllus -1.79957356076759 0.96667520703147 116.030753192599
28 Platymiscium pinnatum -0.867803837953092 0.940852855745627 63.6279480679112
29 Platypodium elegans -0.95181236673774 0.931774211648524 54.6288571616554
30 Poulsenia armata -12.4660980810235 0.987321749880273 311.500953382842
31 Pouteria reticulata 2.66098081023454 0.945803158798333 69.8050393954877
32 Protium tenuifolium 2.6272921108742 0.964587729878788 108.955198475232
33 Siparuna guianensis -0.322814498933902 0.950858271178847 77.3972177201512
34 Sloanea terniflora -0.850746268656716 0.963792621220982 106.474719101124
35 Spondias radlkoferi 0.33773987206823 0.928784648187633 52.1676646706586
36 Tabernaemontana arborea 2.8865671641791 0.950302666909604 76.4872163406485
37 Tachigali versicolor 1.15991471215352 0.930120288047633 53.2412204951066
38 Tetragastris panamensis 3.55479744136461 0.948067298157644 73.0227594193397
39 Trichilia pallida 0.727931769722814 0.970007440216116 129.366409163559
40 Trichilia tuberculata -24.4243070362473 0.992421305207608 523.795367088204
41 Trophis racemosa -1.45756929637527 0.984579417986296 255.393581671913
42 Xylopia macrantha 4.63283582089552 0.984183190422545 248.895502118301
43 Zuelania guidonia -0.282302771855011 0.934422174840085 56.9962283782027

Appendix 3. R Code (also attached as a document)


abundance<-read.csv("Abundance.csv") #50 ha plot data
t=c(0,3,8,13,18,23) # this is a vector representing time (1982,85,90, etc.)
tbar=mean(t) # average time
n=length(abundance$Species.Name) #number of regressions to be performed
#linear regression determining change in abundance of species over time
#storing values in the matrix "species", cols = beta0, beta1, t-stat for beta1, R2, F
species=matrix(nrow=n,ncol=5)
for (i in 1:n) {
  #y is the vector of abundance values taken from the data set for the entry with index i
  y=c(abundance[i,3],abundance[i,4],abundance[i,5],abundance[i,6],abundance[i,7],abundance[i,8])
  ybar=mean(y) #mean abundance
  sxy=sum((t-tbar)*(y-ybar)) #sum of cross products of the time and abundance deviations
  sxx=sum((t-tbar)^2) #sum of squared deviations of the time vector
  beta1=sxy/sxx #slope: change in number of trees per year
  beta0=ybar-beta1*tbar #intercept: fitted abundance at t=0 (1982)
  MSRes=sum((y-(beta0+beta1*t))^2)*(1/4) #residual mean square, df=4
  tstat=beta1/((MSRes/sxx)^.5) #t value for the linear coefficient (not named T, which R reads as TRUE)
  SSReg=sum((beta0+beta1*t-ybar)^2) #sum of squares from the regression
  SSTot=sum((y-ybar)^2) #total sum of squares of the abundance data
  R2=SSReg/SSTot
  Fstat=SSReg/MSRes #F statistic (not named F, which R reads as FALSE)
  species[i,]=c(beta0,beta1,tstat,R2,Fstat)
}

tc=7.173 #critical two-sided t value for alpha=0.002 and df=4

#this prints the matrix of regression results built in the first for loop,
#then lists the species whose abundance changed positively or negatively with time
#at a significance level of 0.002 (t-test of the slope), and the range of the significant slopes
species
#outputs species that changed positively and negatively
abundance$Species.Name[which(abs(species[,3])>tc & species[,2]>0)]
abundance$Species.Name[which(abs(species[,3])>tc & species[,2]<0)]
#species with significant linear coefficients
spp=abundance$Species.Name[which(abs(species[,3])>tc)]
#average changes in abundance in trees/year, R2 and F statistics for significant spp
data=species[which(abs(species[,3])>tc),c(2,4,5)]
#output species and linear coefficient, and stats listed in previous comment, together
#(a data frame keeps the statistics numeric and gives each column a name)
data.frame(Species=as.vector(spp),Beta1=data[,1],R_Squared=data[,2],F=data[,3])
#range of values for average change in abundance in trees/year
range(species[which(abs(species[,3])>tc),2])
#the rest of this script outputs a color-coded 4-pane graphical representation of the models
#for those species whose abundances showed a significant linear trend over time, with alpha=0.002
#based on a two-tailed t-test of the slope where critical t (df=4, alpha/2=0.001) = 7.173
layout(matrix(c(1,2,3,4),2,2,byrow=TRUE))
set1=which(abs(species[,3])>tc & abundance$X1982 >= 500) #choosing rows; >= so a count of exactly 500 is not dropped
z=length(set1)
ylim=range(subset.data.frame(abundance[set1,],select=X1982:X2005))
plot.new()
plot.window(range(t),c(0,ylim[2]))
axis(1)
axis(2)
palette(rainbow(z))
leg=5
for (j in 1:z) {
points(t,subset.data.frame(abundance[set1[j],],select=X1982:X2005),col=j)
text(leg,species[set1[j],1]+species[set1[j],2]*leg-50,abundance$Species.Name[set1[j]],col=j)
lines(c(0,23),c(species[set1[j],1],species[set1[j],1]+species[set1[j],2]*23),col=j)
leg=leg+5
if (leg > 20){leg=5}
}
set2=which(abs(species[,3])>tc & abundance$X1982 >= 190 & abundance$X1982 < 500) #>= so a count of exactly 190 is not dropped
z=length(set2)
ylim=range(subset.data.frame(abundance[set2,],select=X1982:X2005))
plot.new()
plot.window(range(t),c(ylim[1],ylim[2]))
axis(1)
axis(2)
palette(rainbow(z))
for (k in 1:z) {
points(t,subset.data.frame(abundance[set2[k],],select=X1982:X2005),col=k)
text(leg,species[set2[k],1]+species[set2[k],2]*leg-20,abundance$Species.Name[set2[k]],col=k)
lines(c(0,23),c(species[set2[k],1],species[set2[k],1]+species[set2[k],2]*23),col=k)
leg=leg+5
if (leg > 20){leg=5}
}
set3=which(abs(species[,3])>tc & abundance$X1982 >= 75 & abundance$X1982 < 190) #>= so a count of exactly 75 is not dropped
z=length(set3)
ylim=range(subset.data.frame(abundance[set3,],select=X1982:X2005))
plot.new()
plot.window(range(t),c(0,ylim[2]))
axis(1)
axis(2)
palette(rainbow(z))
for (j in 1:z) {
points(t,subset.data.frame(abundance[set3[j],],select=X1982:X2005),col=j)
text(leg,species[set3[j],1]+species[set3[j],2]*leg-10,abundance$Species.Name[set3[j]],col=j)
lines(c(0,23),c(species[set3[j],1],species[set3[j],1]+species[set3[j],2]*23),col=j)
leg=leg+3
if (leg > 20){leg=5}
}
set4=which(abs(species[,3])>tc & abundance$X1982 < 75)
z=length(set4)
ylim=range(subset.data.frame(abundance[set4,],select=X1982:X2005))
plot.new()
plot.window(range(t),c(0,ylim[2]))
axis(1)
axis(2)
palette(rainbow(z))
for (k in 1:z) {
points(t,subset.data.frame(abundance[set4[k],],select=X1982:X2005),col=k)
text(leg,species[set4[k],1]+species[set4[k],2]*leg-5,abundance$Species.Name[set4[k]],col=k)
lines(c(0,23),c(species[set4[k],1],species[set4[k],1]+species[set4[k],2]*23),col=k)
leg=leg+5
if (leg > 20){leg=5}
}
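
As a cross-check on the hand-written formulas in the regression loop, the sketch below wraps them in a function and compares the results with R's built-in lm() for one species. The counts are made-up illustration values, and the function name slope_stats is an assumption, not part of the original script.

```r
# Hand-computed regression statistics, following the formulas in the script
slope_stats <- function(t, y) {
  tbar <- mean(t)
  ybar <- mean(y)
  sxx <- sum((t - tbar)^2)                         # sum of squared deviations of time
  beta1 <- sum((t - tbar) * (y - ybar)) / sxx      # slope
  beta0 <- ybar - beta1 * tbar                     # intercept
  fitted <- beta0 + beta1 * t
  ms_res <- sum((y - fitted)^2) / (length(y) - 2)  # residual mean square
  ss_reg <- sum((fitted - ybar)^2)                 # regression sum of squares
  c(beta0 = beta0, beta1 = beta1,
    tstat = beta1 / sqrt(ms_res / sxx),
    R2 = ss_reg / sum((y - ybar)^2),
    Fstat = ss_reg / ms_res)
}

t <- c(0, 3, 8, 13, 18, 23)            # years since 1982
y <- c(310, 295, 262, 240, 205, 181)   # made-up abundances, not plot data

by_hand <- slope_stats(t, y)
fit <- summary(lm(y ~ t))              # built-in fit for comparison

# Each comparison returns TRUE when the hand formulas agree with lm()
all.equal(unname(by_hand["beta1"]), fit$coefficients["t", "Estimate"])
all.equal(unname(by_hand["tstat"]), fit$coefficients["t", "t value"])
all.equal(unname(by_hand["R2"]), fit$r.squared)
all.equal(unname(by_hand["Fstat"]), unname(fit$fstatistic["value"]))
```

Looping over lm() directly would give the same numbers with less code; the explicit formulas are kept in the script because they show how each statistic is built.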
