
1999

No. 2

IRRI Biodiversity Software Series. II.

COLLECT1 and COLLECT2: Programs for Calculating Statistics of Collectors' Curves

Developed by W.J. Zhang and K.G. Schoenly

IRRI

INTERNATIONAL RICE RESEARCH INSTITUTE


The International Rice Research Institute (IRRI) was established in 1960 by the Ford and Rockefeller Foundations with the help and approval of the Government of the Philippines. Today IRRI is one of the 16 nonprofit international research centers supported by the Consultative Group on International Agricultural Research (CGIAR). The CGIAR is sponsored by the Food and Agriculture Organization of the United Nations, the International Bank for Reconstruction and Development (World Bank), the United Nations Development Programme (UNDP), and the United Nations Environment Programme (UNEP). Its membership comprises donor countries, international and regional organizations, and private foundations.

As listed in its most recent Corporate Report, IRRI receives support, through the CGIAR, from a number of donors including UNDP, World Bank, European Union, Asian Development Bank, Rockefeller Foundation, and the international aid agencies of the following governments:

Australia, Belgium, Canada, People's Republic of China, Denmark, France, Germany, India, Indonesia, Islamic Republic of Iran, Japan, Republic of Korea, The Netherlands, Norway, Peru, Philippines, Spain, Sweden, Switzerland, Thailand, United Kingdom, and United States.

The responsibility for this publication rests with the International Rice Research Institute.

IRRI Technical Bulletins

The IRRI Technical Bulletin is a rapid means of presenting results of research on a specialized technical subject such as the development of experimental methods, specialized software, or other solutions to complex research problems.

Copyright International Rice Research Institute 1999

Mailing Address: MCPO Box 3127, 1271 Makati City, Philippines

Phone: (63-2) 845-0563, 844-3351 to 53

Fax: (63-2) 891-1292, 845-0606

Email: IRRI@CGIAR.ORG

Telex: (ITT) 40890 Rice PM; (CWT) 14519 IRILB PS

Cable: RICEFOUND MANILA

URL: http://www.cgiar.org/irri

Riceweb: http://www.riceweb.org

Riceworld: http://www.riceworld.org

Courier address: Suite 1099, Pacific Bank Building, 6776 Ayala Avenue, Makati, Metro Manila, Philippines

Tel. (63-2) 891-1236, 891-1174, 891-1258, 891-1303

Suggested citation:

Zhang WJ, Schoenly KG. 1999. IRRI Biodiversity Software Series. II. COLLECT1 and COLLECT2: programs for calculating statistics of collectors' curves. IRRI Technical Bulletin No. 2. Manila (Philippines): International Rice Research Institute. 15 p.

Cover: All photos from IRRI archives except for photo of rat, which was taken by N.Q. Hung.

ISBN 971-22-0134-1 ISSN

COLLECT1 and COLLECT2: Programs for Calculating Statistics of Collectors' Curves

Developed by W.J. Zhang and K.G. Schoenly

Biodiversity studies in natural resource ecology and agriculture often begin with issues of sampling. For example, researchers may need to know how representative and complete their sample of an ecological community is. Although sampling methods for estimating population sizes of individual species are commonplace (e.g., Pedigo and Buntin 1994, Sutherland 1996), methods for estimating sizes of ecological communities (i.e., species richness) have rarely been systematized in methods manuals. When embarking on any large-scale study in biodiversity, a small pilot study is advised (Krebs 1999) to test the magnitude, resolution, extent, and efficiency of the sampling program.

A pilot study in agrobiodiversity might consist of a set of samples gathered from one experimental plot or farmer's field. To gauge completeness of sampling, a yield-effort or collector's curve may be drawn (Cohen 1978, Dickerson and Robinson 1985, Cohen et al 1993, Colwell and Coddington 1994, Schoenly et al 1996), which plots the cumulative number of taxa caught or observed (y-axis) against the cumulative effort of sampling (x-axis). The collector's curve is a step function with a slope that should decrease as sampling effort increases and as fewer taxa remain to be sampled (Fig. 1). The slope of a collector's curve has two implications. If sampling stops while the collector's curve is still rapidly increasing, the implication is that the inventory of the ecological community derived from this sampling is incomplete. Alternatively, if sampling ceases when the slope of the collector's curve reaches zero (or close to zero), we can conclude that sampling is probably complete. As the example in Figure 1 shows, additional sampling is likely to yield additional taxa, even after 100 samples, because the slope of the collector's curve is nonzero. (Throughout this text, taxa and species are used synonymously.)

Overview of two software programs

This document describes two programs (COLLECT1, COLLECT2) for calculating, testing, and bias-correcting collectors' curves in biodiversity research. Although collectors' curves have rarely been reported (Cohen 1978), their use can help researchers identify, earlier rather than later, the shortcomings, extent, and efficiency of their sampling programs. The sensitivity of collectors' curves for gauging sampling completeness makes them an appropriate and informative tool for biodiversity assessment and quality assurance. COLLECT1 calculates, bias-corrects, and tests the observed (real data) curve against the expected (random placement) hypothesis. COLLECT2 bootstraps the sampling order to analyze the effect of sampling intensity on the average number of individuals of newly collected taxa.

Research questions

Collectors' curves increase ecological understanding of dominance-diversity relationships and spatial distributions of taxa. Some practical questions in rice pest management that can be addressed by collectors' curves (and their associated statistics) include:

1. If a single or aggregate set of rice fields has definable borders, is there a finite number of species in it?

2. Does the sequence of samples a researcher takes in a rice field affect the shape of the collector's curve and, if so, how can sample-order bias be minimized?

3. What is the minimum number of samples needed to capture the most common taxa that make up 75%, 90%, and 95% of the total abundance of the community?

4. Does a statistical model that assumes a random distribution of taxa among samples (i.e., random placement model) provide a good fit to the observed collector's curve?

5. What is the quantitative relationship between richness and abundance of newly observed taxa not found in previous samples and at what sample size does sampling tend to concentrate on rare taxa (Cohen 1978)?

Fig. 1. A collector's curve for rice invertebrates gathered from 100 suction samples of a 0.16-m2 enclosure derived from a 0.25-ha plot on the IRRI experimental farm 29 days after transplanting during the 1996 dry season. Each point is the mean of 100 randomizations of sample pooling order. (Axes: cumulative sample size, 0-100, on the x-axis; cumulative number of taxa sampled, 0-200, on the y-axis.)

Technical information

Two versions each of COLLECT1 and COLLECT2 were written for use in MS-DOS (QBASIC™) and Windows™ (DELPHI-3™) environments to serve the different operating systems in use at national agricultural research stations. Both versions use the algorithms of Coleman et al (1982) and Colwell and Coddington (1994) but vary in screen appearance and have different data/memory capacities. The QBASIC version of each program was developed using the Microsoft QBASIC language, which runs under MS-DOS and Windows platforms, and requires 7 kb of system memory for COLLECT1 and 6 kb for COLLECT2, for a total system memory of 13 kb. The Windows-based version was created using designer tools and Windows interfaces contained within the DELPHI-3 development kit. The source code in the DELPHI-3 version is Object PASCAL™, which is Borland's object-oriented extension to the PASCAL language. The memory required to run the DELPHI-3 programs is 324 kb for COLLECT1 and 322 kb for COLLECT2. DELPHI-3 versions run under advanced Windows environments such as Windows 95 and later.

Program COLLECT1

According to Colwell and Coddington (1994), complete counts of locally occurring species are feasible for plants and philopatric animals (e.g., territorial mammals). For other taxonomic groups (e.g., invertebrates), estimation by sampling is the best option. Arithmetically, a collector's curve plots the cumulative number of taxa, defined as the sum of the cumulative number of taxa in the previous sample(s) and the number of taxa in the present sample that were not observed in any previous sample. For the first sample, the cumulative number of taxa is defined to equal its number of taxa. While calculating cumulative counts for each sample point, COLLECT1 gives equal weight to common and rare taxa.
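The arithmetic definition above can be illustrated with a minimal Python sketch. It is not the COLLECT1 source code (which is written in QBASIC and Object PASCAL); the function name and the small example matrix are hypothetical, and the matrix layout (taxa as rows, samples as columns) follows the input format used in this bulletin.

```python
def collectors_curve(matrix):
    """Cumulative number of taxa when samples are pooled in column order.

    matrix[i][j] is the abundance of taxon i (row) in sample j (column).
    Each taxon counts once, whether common or rare.
    """
    n_samples = len(matrix[0]) if matrix else 0
    seen = set()   # row indices of taxa observed so far
    curve = []
    for j in range(n_samples):
        for i, row in enumerate(matrix):
            if row[j] > 0:
                seen.add(i)
        curve.append(len(seen))
    return curve


# Hypothetical 4-taxon, 3-sample matrix (not data from this bulletin).
example = [
    [3, 0, 1],
    [0, 2, 0],
    [1, 1, 0],
    [0, 0, 0],  # a taxon on the master list that was never caught
]
print(collectors_curve(example))  # [2, 3, 3]
```

The first sample contributes two taxa, the second adds one taxon not seen before, and the third adds none, so the curve is flat at the end.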

As an example, Figure 1 shows a collector's curve of rice-associated invertebrates suction-sampled from a 0.25-ha experimental plot on the IRRI farm at 29 days after transplanting during the 1996 dry season (K. Schoenly et al, unpublished data). A total of 100 standardized samples were collected from a plastic barrel enclosure (0.16-m2 area) with sampling duration fixed at 2 min. Laboratory counts of vial contents produced a sample-by-taxa matrix in which the samples corresponded to columns and taxa corresponded to rows. Because the samples were all collected on the same day and were selected at random sites within the 0.25-ha plot, the order in which the samples were pooled to produce the collector's curve is also random.

Bootstrap procedures

The order in which samples are added to the total number of samples affects the shape of the collector's curve (Colwell and Coddington 1994). Variation in curve shape due to sample order is different from sampling error caused by between-sample heterogeneity. Following the algorithm of Colwell and Coddington (1994), program COLLECT1 bootstraps (randomizes without replacement) the columns of the sample-by-taxa matrix. In so doing, COLLECT1 produces a different sampling pathway through the crop field. Repeating this process, say 1,000 times, generates a family of collectors' curves from which the mean number of taxa and its standard deviation (or confidence interval) can be calculated for each sample size in the collector's curve. Colwell and Coddington (1994) claim that the means become stable (from their example at least) after 20 randomizations. In Figure 2, the collector's curve shows these means (as points) and their 95% confidence intervals (as vertical error bars). The occasional small dips and peaks appearing in Figures 1 and 2 can be smoothed by increasing the number of simulations.

Fig. 2. Collector's curve for rice invertebrates gathered from 100 suction samples from the IRRI farm, 29 days after transplanting, dry season 1996. Each point is the mean of 100 randomizations of sample pooling order; vertical bars are means ±2 standard deviations. Data from Figure 1. (Axes: cumulative sample size, 0-100, on the x-axis; cumulative number of taxa sampled, 0-200, on the y-axis.)
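The randomization of sample pooling order can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's own code: the function names, the default of 100 runs, the fixed seed, and the use of the population standard deviation are all choices made for the example.

```python
import random
import statistics


def cumulative_taxa(matrix, order):
    """Cumulative taxa count when sample columns are pooled in `order`."""
    seen, curve = set(), []
    for j in order:
        seen.update(i for i, row in enumerate(matrix) if row[j] > 0)
        curve.append(len(seen))
    return curve


def randomized_curve(matrix, n_runs=100, seed=0):
    """Mean and SD of the cumulative taxa count at each sample size.

    Every run shuffles the column order (a permutation, i.e. sampling
    without replacement) and recomputes the collector's curve.
    """
    rng = random.Random(seed)          # seeded only for reproducibility
    n_samples = len(matrix[0])
    runs = []
    for _ in range(n_runs):
        order = list(range(n_samples))
        rng.shuffle(order)
        runs.append(cumulative_taxa(matrix, order))
    means = [statistics.mean([r[k] for r in runs]) for k in range(n_samples)]
    # Population SD is an assumption; the bulletin does not say which form is used.
    sds = [statistics.pstdev([r[k] for r in runs]) for k in range(n_samples)]
    return means, sds


# Hypothetical matrix: rows are taxa, columns are samples.
demo = [
    [3, 0, 1, 0],
    [0, 2, 0, 0],
    [1, 1, 0, 4],
    [0, 0, 0, 1],
]
means, sds = randomized_curve(demo, n_runs=200, seed=1)
lower = [m - 2 * s for m, s in zip(means, sds)]  # approximate lower bound
upper = [m + 2 * s for m, s in zip(means, sds)]  # approximate upper bound
```

Whatever the order, the last point always equals the total number of taxa observed, so its standard deviation is zero; the spread among runs is largest at small sample sizes.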

Testing collectors' curves against the random placement hypothesis

Even if sample order bias is corrected by bootstrapping procedures, variation in curve shape due to heterogeneity of environmental influences remains a likely significant source of sampling error. Coleman et al (1982) described an analytical model to test whether individuals among the samples (of definable size) obey the hypothesis of random placement, which assumes a lack of correlation in the location of individuals. Under the random placement hypothesis, consider a collection C of N individuals from S taxa, with $n_i$ the number of individuals in C belonging to the ith taxon, and suppose that each member of C occurs in one of k nonoverlapping regions (or samples) that have areas $a_1, a_2, \ldots, a_k$. The number s of taxa in a given region is a random variable whose magnitude depends on the area a of the region; the relative area is defined as $\alpha = a / \sum_{j=1}^{k} a_j$. The mean value $\bar{s}$ and its variance $\sigma^2$ are calculated as follows:

$$\bar{s}(\alpha) = S - \sum_{i=1}^{S} (1 - \alpha)^{n_i}$$

$$\sigma^2(\alpha) = \sum_{i=1}^{S} (1 - \alpha)^{n_i} - \sum_{i=1}^{S} (1 - \alpha)^{2 n_i}$$

The expected collector's curve is obtained by plotting $\bar{s}$ against $\log_{10}(a)$. One way to examine the level of homogeneity is to compare the observed mean collector's curve with the expected collector's curve. If 95% of the plotted points (means) of the observed collector's curve fall more than two standard deviations outside the expected collector's curve, then the observed community is statistically more heterogeneous in taxa composition (at the nominal 0.05 level) than sampling error (alone) can account for (Coleman et al 1982). Following the calculation of both the observed and expected collectors' curves and their confidence limits, COLLECT1 informs users whether the observed case passes (or fails) the random placement hypothesis. As an example, the observed case (data from Figures 1 and 2) is a good fit to the expected case (Fig. 3); thus, we can conclude that these suction samples are no more heterogeneous in taxonomic composition than is expected under the random placement hypothesis.
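As a rough illustration of the random placement model of Coleman et al (1982), the Python sketch below computes the expected number of taxa and its standard deviation for pooled samples. It assumes that all samples have equal area, so that pooling k of K samples gives a relative area of k/K, and it uses the standard variance expression for this model (the sum of (1 - alpha)^n_i minus the sum of (1 - alpha)^(2 n_i)). The function names and the abundance vector are hypothetical.

```python
import math


def random_placement(abundances, alpha):
    """Expected taxa count and SD in a region of relative area alpha.

    abundances: total individuals per taxon (n_i) in the whole collection.
    Taxa with zero individuals are ignored, so S counts observed taxa only.
    """
    counts = [n for n in abundances if n > 0]
    absent = [(1.0 - alpha) ** n for n in counts]  # P(taxon i missing from region)
    mean = len(counts) - sum(absent)
    variance = sum(absent) - sum(p * p for p in absent)
    return mean, math.sqrt(max(variance, 0.0))     # guard against tiny negatives


def expected_curve(abundances, n_samples):
    """Expected curve for 1..n_samples pooled samples of equal area (assumption)."""
    return [random_placement(abundances, k / n_samples)
            for k in range(1, n_samples + 1)]


# Hypothetical collection: four taxa with 10, 5, 1, and 1 individuals.
curve = expected_curve([10, 5, 1, 1], n_samples=4)
for k, (mean, sd) in enumerate(curve, start=1):
    print(k, round(mean, 2), round(mean - 2 * sd, 2), round(mean + 2 * sd, 2))
```

When all samples are pooled the relative area is 1, so the expected count equals the number of taxa in the collection and the standard deviation is zero, which is the behavior seen in the last row of the COLLECT1 output.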

Fig. 3. Collector's curve of rice invertebrates from 100 samples (data as in Figures 1 and 2) from the IRRI farm, 29 days after transplanting, dry season 1996. Each point is the observed mean of 100 randomizations of sample pooling order. Broken lines are the expected means ±2 standard deviations for the random placement hypothesis. (Axes: cumulative sample size, 0-100, on the x-axis; cumulative number of taxa sampled, 0-200, on the y-axis.)

Colwell and Coddington (1994) suggest that the sampling-without-replacement version of the rarefaction method (e.g., Simberloff 1978) can also test whether real data pass the random placement hypothesis. But because the method of Coleman et al (1982) is computationally more efficient than the rarefaction method, Colwell and Coddington (1994) prefer the former over the latter. A collector's curve that passes the random placement hypothesis (i.e., the collection represents a uniform sampling process for this time and place) also satisfies a necessary assumption of extrapolation methods that estimate how many additional taxa are likely to be observed if sampling continues without bound (Colwell and Coddington 1994). Extrapolation methods for estimating total species richness will be described in another installment of IRRI's biodiversity program series.

Use of abundance thresholds in collectors' curves

Ecological communities are commonly named for the most abundant taxa in them (e.g., a pinyon-juniper woodland). Thus, the taxonomic composition of most ecological communities, including rice-invertebrate assemblages, contains a few common taxa, several to many taxa of intermediate abundance, and usually many taxa of one or two individuals each, called "singletons" and "doubletons" (May 1981). Moreover, rice-invertebrate communities show positive and significant correlations between abundance, distribution, and persistence over the cropping season (Schoenly et al 1998). Practically speaking, if taking a few samples in a single field, say 10 samples, faithfully captures the same abundant taxa (at a particular abundance level or "threshold") as taking 100 samples does, then future biodiversity surveys can be done at a lower cost and with a minimal loss of essential ecological information.

As an example, the upper curve of Figure 4 shows the same collector's curve as Figures 1-3, which yielded 173 taxa. Sorting these taxa in descending order of abundance reveals that the 16 most abundant taxa represent 75% of the total abundance, 37 taxa represent 90%, and 52 taxa represent 95%. One hundred randomizations of sample pooling order of each subset produce the three lower collectors' curves in Figure 4, which show the means (as points) and their 95% confidence intervals (as vertical bars). Because these abundance-threshold curves were derived from the actual abundance distributions, these curves predict that five samples are enough to capture the most abundant (≤52) taxa for this time and place.

Fig. 4. Collectors' curves for all taxa (upper curve) and for taxa comprising 75%, 90%, and 95% of total invertebrate abundance, 29 days after transplanting, dry season 1996. The upper curve is the same as in Figures 1-3. Numbers beside each curve denote the cumulative number of taxa in each case. Each point is the mean of 100 randomizations of sample pooling order; vertical bars are means ±2 standard deviations. (Axes: cumulative sample size, 0-100, on the x-axis; cumulative number of taxa sampled, 0-200, on the y-axis.)
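The selection of taxa for an abundance threshold can be sketched in Python as below. This is one plausible reading of the procedure (rank taxa by total abundance and keep the top-ranked taxa until their summed abundance reaches the threshold fraction); the function name, the tie-breaking behavior, and the example totals are assumptions made for illustration.

```python
def taxa_for_threshold(totals, fraction):
    """IDs of the most abundant taxa that together reach `fraction`
    of total community abundance.

    totals maps taxon ID -> total individuals across all samples.
    Ties in abundance keep their original dictionary order (an arbitrary choice).
    """
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    target = fraction * sum(totals.values())
    chosen, running = [], 0
    for taxon, count in ranked:
        if running >= target:
            break
        chosen.append(taxon)
        running += count
    return chosen


# Hypothetical totals for five taxa (100 individuals overall).
totals = {"A": 50, "B": 30, "C": 15, "D": 4, "E": 1}
print(taxa_for_threshold(totals, 0.75))  # ['A', 'B']
print(taxa_for_threshold(totals, 0.90))  # ['A', 'B', 'C']
```

The rows of the sample-by-taxa matrix belonging to the chosen taxa would then be passed through the same randomization of sample pooling order to draw the lower curves of a figure such as Figure 4.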

Program COLLECT2

Collectors' curves capture qualitative information that describes relationships between sample size and species richness. As we saw above, abundance thresholds offer one option for estimating minimum sample sizes in biodiversity studies. Recall that because locally abundant taxa tend to have wider spatial coverage than rare taxa, sampling is likely to capture the abundant taxa first, followed by taxa of progressively greater rarity. For rice invertebrates at least, the most serious insect pests and their most promising biocontrol agents (Ooi and Shepard 1994) are among the most abundant taxa in the ecological community (K.G. Schoenly et al, unpublished data). Thus, understanding when sampling efforts "switch" from collecting abundant taxa (pests and their natural enemies) to rare taxa (after the most common taxa have been collected) is a practical sampling issue for plant protection, monitoring, and research. In other contexts, it may be desirable to continue sampling in order to uncover more rare taxa.

As an example, Figure 5 uses the same data from earlier examples (Figures 1-4) to show how increasing sampling effort affects the average abundance of newly collected taxa (i.e., taxa not collected in any previous sample). In this example, the rarest taxa (i.e., singletons) begin to accumulate after 14 samples, that is, when the average abundance of newly collected taxa equals 1. Thus, we can conclude that the most abundant taxa have been collected between the 1st and 14th samples. Note that, unlike COLLECT1, COLLECT2 uses abundance data directly rather than assigning each taxon equal weight, and assumes that the arithmetic average, rather than the median or mode, is an adequate measure of central tendency in species-abundance data.
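The quantity plotted by COLLECT2 can be approximated with the Python sketch below. It rests on two assumptions that the text does not spell out: the abundance of a newly collected taxon is its count in the sample where it first appears, and a sample that adds no new taxa contributes an average of zero. Function names and the example matrix are hypothetical.

```python
import random


def new_taxa_abundance(matrix, order):
    """Mean abundance of newly collected taxa for each pooled sample.

    matrix[i][j] is the abundance of taxon i (row) in sample j (column);
    `order` is the sequence in which sample columns are pooled.
    """
    seen, result = set(), []
    for j in order:
        new = [i for i, row in enumerate(matrix) if row[j] > 0 and i not in seen]
        individuals = sum(matrix[i][j] for i in new)
        # Assumption: no new taxa in this sample -> average recorded as 0.
        result.append(individuals / len(new) if new else 0.0)
        seen.update(new)
    return result


def mean_new_taxa_abundance(matrix, n_runs=100, seed=0):
    """Average the per-sample values over random pooling orders."""
    rng = random.Random(seed)
    n_samples = len(matrix[0])
    sums = [0.0] * n_samples
    for _ in range(n_runs):
        order = list(range(n_samples))
        rng.shuffle(order)
        for k, value in enumerate(new_taxa_abundance(matrix, order)):
            sums[k] += value
    return [total / n_runs for total in sums]


# Hypothetical matrix: rows are taxa, columns are samples.
demo = [
    [3, 0, 1],
    [0, 2, 0],
    [1, 1, 0],
]
print(new_taxa_abundance(demo, [0, 1, 2]))  # [2.0, 2.0, 0.0]
averaged = mean_new_taxa_abundance(demo, n_runs=50, seed=2)
```

The sample size at which the averaged value first drops to about 1 would correspond to the point where singletons begin to predominate among newly collected taxa.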

Fig. 5. The relationship between sampling intensity and average abundance of taxa not collected in any previous sample (newly sampled taxa) at the IRRI farm, 29 days after transplanting, dry season 1996. When mean abundance equals 1, sampling tends to concentrate on rare taxa (e.g., singletons). Each point is the mean of 100 randomizations of sample pooling order. (Axes: cumulative sample size, 0-100, on the x-axis; mean abundance of newly sampled taxa on the y-axis. A vertical broken line marks where singletons begin to predominate.)

Input file format

Data for COLLECT1 and COLLECT2 must be in the form of a data table or matrix in which taxa (immatures or adults) correspond to rows and samples correspond to columns. Cells of the matrix contain integers corresponding to sampled abundances, separated by one or more spaces for clarity. The data file for COLLECT1 and COLLECT2 must be a space-delimited text file. DELPHI-3 versions, unlike QBASIC versions, require that all input files be given the extension ".txt" because this is the default extension for input files (in the input file dialog boxes) of COLLECT1 and COLLECT2.

The space-delimited text format

The space-delimited format has been incorporated to make it easier to export data from spreadsheet files, such as Microsoft Excel or Corel Quattro Pro. The first row of each space-delimited text file consists of a numerical label for each sample, separated by spaces. Subsequent rows contain the taxon ID number, taken from the master list, followed by the sampled abundance for that taxon in each sample, all separated by spaces (Fig. 6).

1 2 3 4 5 6 7
727 15 4 0 0 0 0 4
-618 10 0 0 0 0 0 0
-270 9 2 0 0 2 1 5
1000 8 0 8 8 11 2 29
378 7 1 0 0 0 2 3
-620 7 0 1 0 0 4 5
-378 6 0 0 0 0 0 0
1079 6 1 10 2 2 5 20
914 5 1 2 4 8 5 20
1094 5 0 0 0 0 0 0
-149 4 0 3 0 0 0 3
-581 4 0 3 0 2 0 5
Fig. 6. Data matrix containing the space-delimited file format.

Using Excel to create space-delimited text files

The following procedure describes how to create a space-delimited .txt file using Microsoft Excel 97:

1. Using the mouse, highlight the matrix you wish to create as a text file.

2. Choose the Edit/Copy command to copy the contents of the matrix to the clipboard.

3. Choose the File/New command to create a new file, then click Edit/Paste to transfer the contents of the clipboard to the new file.

4. If the file to be saved has 50-100 data columns, use the Format/Column/Width option to change the width of the first column (taxon ID number) to 7 spaces. Highlight the remaining columns with the pointer, click Format/Column/Width again, and enter 4 spaces for these columns. This step will ensure that the width of this data matrix does not exceed 240 spaces, the maximum width allowed by Excel.

5. Choose File/Save As/Formatted Text (space delimited)(*.prn) and click Save.

6. In the file name box, type in a file name. Change the ".prn" extension (entered by default by MS Excel) to ".txt".

7. A warning box appears informing you that the "selected file type will save only the active sheet". Click Save.

If your spreadsheet software is other than Excel, consult your user manual on how to create space-delimited text files.
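Before running either program, a space-delimited file in the format of Figure 6 can be checked with a short script. The Python sketch below is an independent illustration, not part of COLLECT1 or COLLECT2; the function name is hypothetical, and taxon IDs are kept as text because the bulletin does not explain the meaning of negative ID numbers.

```python
def parse_matrix(text):
    """Parse a space-delimited taxa-by-sample table.

    The first non-blank line holds the sample labels; every later line holds
    a taxon ID followed by one integer abundance per sample.
    Returns (sample_labels, taxon_ids, matrix) with matrix[i][j] the
    abundance of taxon i in sample j.
    """
    lines = [line.split() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("input is empty")
    labels = lines[0]
    taxon_ids, matrix = [], []
    for number, fields in enumerate(lines[1:], start=2):
        if len(fields) != len(labels) + 1:
            raise ValueError(
                f"row {number}: expected {len(labels) + 1} fields, got {len(fields)}"
            )
        taxon_ids.append(fields[0])
        matrix.append([int(value) for value in fields[1:]])
    return labels, taxon_ids, matrix


# First rows of the matrix shown in Figure 6.
sample_text = """\
1 2 3 4 5 6 7
727 15 4 0 0 0 0 4
-618 10 0 0 0 0 0 0
-270 9 2 0 0 2 1 5
"""
labels, taxon_ids, matrix = parse_matrix(sample_text)
print(len(labels), taxon_ids, matrix[0])
```

To check a file on disk, its contents can be read first, for example with `open(path).read()`, and passed to the same function; a row with a missing or extra value raises an error that names the offending row.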

Program installation

QBASIC versions of COLLECT1 and COLLECT2 (*.bas) require the executable file qbasic.exe to run. You should copy all of these files into the same MS-DOS or Windows directory on the hard drive. DELPHI-3 versions are stand-alone executable files that do not require .dll (dynamic link library) files for installation. As ordinary executable files, COLLECT1 and COLLECT2 can be copied into MS-DOS, Windows, and other environments that have a screen resolution of 800 x 600 pixels or more.

Starting COLLECT1 and COLLECT2

For the QBASIC version, double-click qbasic.exe, then click File/Open to see and select the program of interest. After opening the program, click Run and follow the data input instructions. For DELPHI-3 versions, double-click the program icon in Windows Explorer or in the Program Manager, or enter the program name on the MS-DOS command line. After a few seconds, a window appears containing different buttons and options for parameter input. Additional program details become available by clicking Help along the bottom panel.

Running the programs

Because of space limitations, our tutorial below will show only the DELPHI-3 (Windows) versions of COLLECT1 and COLLECT2. QBASIC versions lack the familiar Windows format but retain inputting instructions and numerical algorithms like those of the DELPHI-3 versions.

Program windows in COLLECT1 and COLLECT2

The program windows in COLLECT1 and COLLECT2 look virtually identical: only the nature of the output is different. After starting either program and waiting several seconds, the first program window appears (Fig. 7). In the right panel, specify the number of taxa (e.g., 20), the number of samples (e.g., 10), and the number of Monte Carlo simulations desired (e.g., 1,000) and click Enter. The second program window requires you to specify the path and file name of the data set (Fig. 8). Hint statements, activated when the mouse pointer approaches an input box, specify the nature of the parameter needed for each step of input. The menu at the bottom is an action panel that includes options to Run the program (after data inputting is complete), Stop it (during program execution), Restart the program, and seek Help. In the right corner of the window, two buttons familiar to Windows users let you Minimize the window but keep the program running ( _ ) and Close the window so you can exit the program (x).

Fig. 7. Program window, with labeled parts: scroll bars, output window, data input window, and action panel.

While running, COLLECT1 and COLLECT2 display the approximate running time and remaining time in the progress window. Running time refers to the approximate real (elapsed) time COLLECT1 or COLLECT2 has taken since execution began, whereas remaining time refers to the approximate time left to completion in minutes and seconds (Fig. 9). After execution, you can scan the results in the display window by scrolling up or down on the scroll bar.

Output files

After program execution, you can save the output as a space-delimited text file or print it, or specify the path and file name of the graph. For each option, you can save or cancel by using the appropriate button.

Fig. 8.

Fig. 9.

For COLLECT1, the first file that you can save is the text file of all results (Fig. 10). The output of this file starts with values for the first sample and ends with the maximum sample size you specified at input. The last section of the output gives the test results expected under the random placement hypothesis. If 95% or more of the points (means) of the observed case fall within the mean ± 2 SD of the expected results, then the message "... these field data are a good fit to the random placement hypothesis" is displayed (Fig. 10). Alternatively, if 95% of the observed means fall outside the confidence intervals of the expected curve, then the message "... these field data are a poor fit to the random placement hypothesis" is displayed. For graphing purposes, the observed case uses columns 1 (SSze = sample size) and 2 (MnObsS = mean observed S); the expected results use columns 1, 3 (ExpS = expected S), 5 (ExpS-2SD = expected mean - 2 SD), and 6 (ExpS+2SD = expected mean + 2 SD).
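The good-fit criterion described above can be checked by hand or with a few lines of Python. The sketch below uses the observed means (MnObsS) and expected bounds (ExpS-2SD, ExpS+2SD) printed in the example output of Figure 10; the variable names are hypothetical, and only the "good fit" rule stated in the text (95% or more of the points within the bounds) is applied.

```python
# (sample size, MnObsS, ExpS-2SD, ExpS+2SD) from the example output in Figure 10.
rows = [
    (1, 10.51, 8.35, 14.53),
    (2, 13.71, 10.97, 17.08),
    (3, 15.34, 12.75, 18.52),
    (4, 16.70, 14.12, 19.43),
    (5, 17.65, 15.23, 20.05),
    (6, 18.33, 16.17, 20.45),
    (7, 18.92, 17.00, 20.71),
    (8, 19.37, 17.79, 20.82),
    (9, 19.69, 18.60, 20.76),
    (10, 20.00, 20.00, 20.00),
]

# Count observed means lying outside the expected mean +/- 2 SD (bounds inclusive).
n_outside = sum(1 for _, observed, low, high in rows if not low <= observed <= high)
fraction_within = 1 - n_outside / len(rows)
good_fit = fraction_within >= 0.95

print(n_outside, good_fit)  # 0 True
```

None of the 10 points falls outside the bounds, which agrees with the message reported in the example output.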

SSze  MnObsS  ExpS   ExpSD  ExpS-2SD  ExpS+2SD  ObsS-2SD  ObsS+2SD
1     10.51   11.44  1.55   8.35      14.53     6.78      14.25
2     13.71   14.03  1.53   10.97     17.08     10.01     17.42
3     15.34   15.63  1.44   12.75     18.52     11.87     18.82
4     16.70   16.78  1.33   14.12     19.43     13.78     19.61
5     17.65   17.64  1.21   15.23     20.05     15.16     20.14
6     18.33   18.31  1.07   16.17     20.45     16.31     20.35
7     18.92   18.85  0.93   17.00     20.71     17.28     20.56
8     19.37   19.30  0.76   17.79     20.82     18.13     20.61
9     19.69   19.68  0.54   18.60     20.76     18.77     20.62
10    20.00   20.00  0.00   20.00     20.00     20.00     20.00

For these observed data: 0 of the 10 data points (0.00%) fell outside the random placement hypothesis (RPH) confidence interval. Consequently, these field data are a good fit to the random placement hypothesis.

Fig. 10. Contents of output file colldem1.out.

For COLLECT2, the first file that you can save is the text file of all results (Fig. 11). Like that of COLLECT1, the output of COLLECT2 starts with values for the first sample and ends with the maximum sample size you specified at input. For graphing purposes, columns 1 (SSze = sample size) and 4 (AvgNewInd = average number of individuals of newly sampled taxa) are the most useful. In addition, columns 6 (AvgNewInd-2 SD) and 7 (AvgNewInd+2 SD) provide approximate lower and upper confidence limits (roughly 95%). The lower bound (col 6) can extend below 0, but because this is biologically impossible (i.e., we cannot have negative individuals!), the lower bound can be set to 0.

SSze  MnObsS  NewInd  AvgNewInd  AvgNewIndSD  AvgNewInd-2 SD  AvgNewInd+2 SD
1     10      55      5.46       0.00         5.46            5.46
2     13      7       2.92       4.12         -5.32           11.17
3     15      2       1.57       2.62         -3.66           6.81
4     16      2       1.18       2.24         -3.30           5.67
5     16      1       0.90       1.95         -3.01           4.81
6     17      1       0.67       1.71         -2.74           4.09
7     17      0       0.39       0.93         -1.46           2.24
8     18      0       0.30       0.81         -1.31           1.91
9     18      0       0.23       0.42         -0.61           1.08
10    18      0       0.20       0.40         -0.60           1.00

Fig. 11. Sample output of COLLECT2.

Case study: 100-site grid, IRRI lowland farm, 1996

A 10-by-10 grid of 100 sampling sites was established in plot 321 at the IRRI Farm in the 1996 dry and wet seasons. Sampling sites were spaced 2 m apart, yielding a square sampling area of 18 m on each side. Each sampling site was suction-sampled for invertebrates every 2 wk in the growing season using a Rice-Vac apparatus (Domingo and Schoenly 1998) and a bucket enclosure (0.16 m2). Laboratory sorting treated different life stages (immatures, adults) separately. Data tables from this study were stored in Excel spreadsheets as a sample-by-taxa matrix. The program COLLECT1 was run four times on each of three sampling dates (29, 57, and 99 days after transplanting, DT) using all taxa and those comprising the 75%, 90%, and 95% abundance thresholds.

Collectors' curves for 29, 57, and 99 DT reveal catches of 173, 236, and 211 taxa, respectively, in the dry season (Fig. 12). In each of the three cases, curves for all taxa failed to level off, suggesting that additional sampling would likely uncover more taxa. The observed points (means) in each of the three cases fit the expected (mean ± 2 SD) cases, indicating that the observed data are no more heterogeneous in taxonomic composition than sampling error alone can account for.

A smaller number of common taxa at the different abundance thresholds was found earlier (29 DT) rather than later (57 and 99 DT). Taken together, none of these crop stages required more than 19 samples to capture the largest fraction of common taxa (95%). Figure 13 shows that later crop stages require a higher sampling intensity (about 24 samples) than earlier stages before singletons predominate among taxa not found in previous samples.

Execution errors in COLLECT1 and COLLECT2

If a problem develops during execution of either of these programs, an error will be displayed on the lower right panel. The list below includes some common errors and their explanation, which may help in troubleshooting:

Error                                  Explanation

Divided by zero with integer!          An integer value divided by zero.
Divided by zero with floating point!   A floating point value divided by zero.
Range check error!                     Integer value exceeds defined range.
Floating point overflow!               A floating point value exceeds upper limit.
Floating point underflow!              A floating point value falls below lower limit.
File cannot be accessed!               File not open, or read-only, etc.
Input or output error!                 A nonfile, file has no data or not enough data, too many taxa or samples.
Invalid floating point operation!      Square root of negative value, etc.
Math or other error!                   Not enough memory on hard disk, etc.

Other features of COLLECT1 and COLLECT2

Using the STOP button

You can stop a program at any time during execution (after data entry) by clicking the Stop button near the bottom of the program window. If Stop is clicked, the program will return to its first program window (i.e., first parameter input window).

Saving output

The program window(s) for saving results will appear automatically following program execution. Recall that in COLLECT1 and COLLECT2, one text file and one graphics file can be saved. As each file save option appears, you can save any or all of the files by simply entering the path and a unique file name in the file-naming box. Likewise, if you wish to ignore certain files, simply click the Cancel button.

Printing results

After you save a file(s), the program box for printing appears in COLLECT1 and COLLECT2. If you wish to print all results, then click the All Results and Print buttons. If you wish to print a partial set of results, then click the Selected Results and Print buttons. If no printing is desired, simply click the Cancel button.

Fig. 12. Collectors' curves for all taxa (upper curve) and for taxa comprising 75%, 90%, and 95% of total invertebrate abundance for (A) 29, (B) 57, and (C) 99 days after transplanting, dry season 1996, IRRI. Numbers beside each curve denote the number of taxa in each case. Each point is the mean of 100 randomizations of sample pooling order; vertical bars are means ±2 standard deviations.

Fig. 13. Relationship between sampling intensity and average number of individuals for newly sampled invertebrate taxa for (A) 29, (B) 57, and (C) 99 days after transplanting. Each point is the mean of 100 randomizations of sample pooling order. The vertical broken line in each graph identifies the sample size when singletons tend to predominate. Axes: cumulative sample size (x) and mean abundance of newly sampled taxa (y).
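The statistic plotted in Fig. 13 (average number of individuals for newly sampled taxa at each cumulative sample size, averaged over random pooling orders) can be sketched in Python as follows. This is an illustration under stated assumptions, not the programs' own code: the function name `mean_new_taxa_abundance` and the input format are hypothetical, and skipping pooling steps that add no new taxon is an assumed convention that the manual does not specify.

```python
import random


def mean_new_taxa_abundance(samples, n_randomizations=100, seed=0):
    """Average abundance of taxa first seen at each cumulative sample size.

    samples: list of dicts mapping taxon name -> individuals in that sample
             (hypothetical input format).
    Returns one value per cumulative sample size, or None where no
    randomization added a new taxon at that step.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(samples)
    sums = [0.0] * n
    counts = [0] * n
    for _ in range(n_randomizations):
        order = samples[:]
        rng.shuffle(order)  # one random sample pooling order
        seen = set()
        for k, sample in enumerate(order):
            # abundances of taxa not found in any previously pooled sample
            new = [c for t, c in sample.items() if c > 0 and t not in seen]
            if new:  # assumption: steps with no new taxa are skipped
                sums[k] += sum(new) / len(new)
                counts[k] += 1
            seen.update(t for t, c in sample.items() if c > 0)
    # guard against dividing by zero when a step never added a new taxon
    return [sums[k] / counts[k] if counts[k] else None for k in range(n)]
```

Values near 1.0 indicate that newly added taxa are mostly singletons, which is the condition the vertical broken line in Fig. 13 marks.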

Using the Restart and Help buttons

To return a program to the starting program window, you only need to click the Restart button. COLLECT1 and COLLECT2 come with a Help button, which is located near the bottom of the program window. In COLLECT1, information in Help includes a verbal and mathematical description of the collector's curve and bootstrap algorithms of Colwell and Coddington (1994). This information can be scanned in the display window by scrolling up or down on the scroll bar. The QBASIC versions (collect1.bas, collect2.bas) include help text.

Disclaimer

This software was tested on IBM-compatible PCs in 1998-99 using field data collected at IRRI in 1996. In the current version, we made every effort to test these programs thoroughly and corrected known programming errors. Nevertheless, a computer program subjected to repeated use by different users on different machines will invariably reveal additional errors. Should you uncover what you believe is a new programming bug and find that you can repeatedly reproduce this error, please advise us, preferably by e-mail (b.hardy@cgiar.org), and send us the following: (1) a description of the problem, (2) your computer model and processor, and (3) a copy of the data set you were using. We also welcome suggestions on how COLLECT1 and COLLECT2 can be improved.

References

Cohen JE. 1978. Food webs and niche space. Monographs in Population Biology 11. Princeton, N.J. (USA): Princeton University Press.

Cohen JE and 23 others. 1993. Improving food webs. Ecology 74:252-258.

Coleman BD, Mares MA, Willig MR, Hsieh YH. 1982. Randomness, area, and species richness. Ecology 63:1121-1133.

Colwell RK, Coddington JA. 1994. Estimating terrestrial biodiversity through extrapolation. Phil. Trans. Royal Soc. London B 345:101-118.

Dickerson JE, Robinson JV. 1985. Microcosms as islands: a test of the MacArthur-Wilson equilibrium theory. Ecology 66:966-980.

Domingo I, Schoenly K. 1998. An improved suction apparatus for sampling invertebrate communities in flooded rice. Int. Rice Res. Notes 23(2):38-39.

Krebs CJ. 1999. Ecological methodology. 2nd edition. New York (USA): Benjamin/Cummings.

May RM. 1981. Patterns in multi-species communities. In: May RM, editor. Theoretical ecology. Sunderland, Mass. (USA): Sinauer Associates. p 197-227.

Ooi P, Shepard M. 1994. Predators and parasitoids of rice insect pests. In: Heinrichs EA, editor. Biology and management of rice insects. New Delhi: Wiley Eastern. p 585-612.

Pedigo LP, Buntin GD. 1994. Handbook of sampling methods for arthropods in agriculture. Boca Raton, Fla. (USA): CRC Press.

Schoenly KG, Cohen JE, Heong KL, Litsinger JA, Aquino GB, Barrion AT, Arida G. 1996. Food web dynamics of irrigated rice fields at five elevations in Luzon, Philippines. Bull. Entomol. Res. 86:451-466.

Schoenly KG, Justo HD Jr., Barrion AT, Harris MK, Bottrell DG. 1998. Analysis of invertebrate biodiversity in a Philippine farmer's irrigated rice field. Environ. Entomol. 27:1125-1136.

Simberloff D. 1978. Rarefaction as a distribution-free method of expressing and estimating diversity. In: Grassle JF, Patil GP, Smith WK, Taillie C, editors. Ecological diversity in theory and practice. Fairland, Md. (USA): International Cooperative Publishing House. p 159-176.

Sutherland WJ, editor. 1996. Ecological census techniques. Cambridge (UK): Cambridge University Press.

Acknowledgments

This project was supported by the Asian Development Bank, through RETA 5711, "Exploiting Biodiversity for Sustainable Rice Pest Management." We thank the participants of the three lighthouse sites from the Philippines (L. Flor, A. Velilla), China (J. Tang), and Vietnam (L.P. Lan) for beta testing these programs. For the data in the 1996 field study we cited and analyzed in this manual, we thank L. Datoon, A. Rivera, E. Hernandez, and E. Revilla for processing the invertebrate samples; I. Domingo and V. Magalit for supervising the field study and for encoding the data in Excel spreadsheets; E. Rico, R. Abuyo, J. Reyes, and B. Aquino for collecting the field samples; and A.T. Barrion for his taxonomic assistance. We also thank V. Salazar for producing the cover page for this software series, and the staff of IRRI's Communication and Publications Services, particularly B. Hardy, G. Reyes, and B. Lazaro.
