
Practical Experiments in Statistics

Craig A. Stone and Lorin D. Mumaw, San Jose State University, San Jose, CA 95192-0101

Gaining practical knowledge of statistics is important for undergraduates in the physical sciences. Degree programs in chemistry, physics, biology, and math generally require students to take a course in statistics. Other fields, such as psychology and business, also rely on a knowledge of statistics. Learning the concepts of statistics is essential if students are to understand the quality and especially the limitations of their data. Without this understanding it may be difficult to compare two different observations whose values suggest different conclusions. Statistics can help in designing those experiments by more clearly defining a property or leading to a more firmly established conclusion.

Although students may be exposed to a thorough theoretical treatment of statistics, they often miss the benefit of reducing this theory to practice. Laboratory experiments are time-consuming, so the size of data sets is limited. Instruments that have a high throughput are expensive. Few are available in a classroom setting, and they are probably unavailable for classes in math, psychology, and business. It is thus difficult to generate the large data sets needed to study statistical concepts.

The experiments described here can be applied to any field that requires a knowledge of statistics. They are easy to carry out, and they use inexpensive instrumentation. Sealed sources of radioactive nuclides are used to generate the data. Nuclear decay, a microscopic property, produces a natural statistical fluctuation on which the experiments are based. The sources are small, often with an intensity on the order of the 241Am sources found in home smoke detectors. Thus, no special licenses or handling procedures are needed for either the sources or the instruments. Radiation-detection instruments are available through several manufacturers who market their equipment to high school and university science programs.
An introduction to these concepts can be found in numerous books on statistics. The following are suggested references for various fields: general statistics (1, 2); general sciences, mathematics, and engineering (3, 4); nuclear science applications (5); biology (6, 7); psychology (8, 9); and business (10, 11).

The primary goal is for students to become familiar with probability distributions. Experiments are designed to acquire a data set large enough to generate a series of frequency distributions. From them, students learn how the size of a data set (i.e., the number of measurements) affects


Figure 1. Gaussian distributions with mean values of 25, 50, and 100. In most experiments in nuclear science, the width of the curve is defined as the square root of the mean value.


Figure 2. Time distribution for a typical data set. The data set used to construct this figure contained 500 measurements.

518

Journal of Chemical Education

the quality of derived factors. Values for the 1σ standard deviation and for 2σ are extracted from the frequency distribution curve, showing the amount of valid information that lies outside these boundaries. Advanced classes extend these ideas by comparing the performance of two instruments. Students must characterize the stability of the instruments and qualitatively compare both the time and frequency distributions. In a second part of the experiment they extract the instrumental standard deviation from the observed total standard deviation. A final set of experiments explores data sampling and inhomogeneity.
Counting Statistics in Radioactive Decay

The binomial distribution is the underlying probability distribution that describes statistics for nuclear decay. It is valid when the probability of an event is constant. The expression, applied to radiation detection, is given by

P(n) = [N! / (n!(N − n)!)] p^n (1 − p)^(N − n)

where n is the number of decays observed during a time interval Δt; P(n) is the probability of observing n decays; N is the population of radioactive nuclides; and p is the probability that a nucleus will decay in Δt. Applying this distribution to normal counting conditions is difficult because large factorials are involved.

Two properties of radioactive sources simplify the probability distribution. The sources contain a very large population of nuclides, and their half-life is very long compared to the time over which the experiment is carried out. The probability that a particular nucleus decays in any time interval Δt is thus very small, p << 1. When this is true, and when the mean value is large, the normal or Gaussian distribution adequately describes the probability distribution:

P(n) = (2πn̄)^(−1/2) exp[−(n − n̄)² / (2n̄)]

where n̄ is the mean number of decays per interval.
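As a sketch of the regime the text describes, the counting process can be simulated in a few lines of Python. The population size, decay probability, and number of trials below are illustrative assumptions, not values from the article; the point is that binomial counts with large N and small p have a variance nearly equal to their mean.

```python
import random
import statistics

random.seed(42)

# Illustrative assumptions: N nuclides, each decaying with small
# probability p during one interval, give a binomial count per interval
# with mean N*p = 100.
N, p, trials = 10_000, 0.01, 300

counts = [sum(random.random() < p for _ in range(N)) for _ in range(trials)]

mean = statistics.fmean(counts)
var = statistics.variance(counts)
# For this p << 1 regime the variance comes out close to the mean.
print(f"mean = {mean:.1f}, variance = {var:.1f}")
```

Increasing N while shrinking p (holding N*p fixed) drives the ratio of variance to mean toward 1, the limit in which the Gaussian form above applies.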

Figure 3. Frequency distributions for data sets with 50, 100, 200, 300, 400, and 500 measurements.
Volume 72

Number 6 June 1995

519

Figure 4. The effect that data binning has on the distribution curve. Distributions are shown for bin sizes 2, 5, 10, and 20. The width of the graphs is constant.

The variance of the distribution is σ², and it is equivalent to the mean, n̄. This provides a simple method of estimating the inherent scatter in the data. The standard deviation in an experiment is the square root of the mean for the distribution. A mean of 100 gives a standard deviation of 10%, 1000 gives 3.2%, and so forth. One thus intuitively knows the magnitude of the uncertainty. Figure 1 illustrates how the width of the distribution varies with the mean.

Geiger-Muller Counting System

Experiments are best carried out with radiation-detection equipment. These instruments are compact, simple to use, and have a high data throughput. The instruments are sensitive enough that individual decays can be detected. Measurements are carried out by counting the number of decays that occur within small time intervals. Data will have a natural statistical fluctuation, and this gives the known response used to understand statistical concepts.

Laboratory courses can use almost any radiation-detection instrument to generate frequency distributions. Many gamma-ray detectors use single crystals of NaI(Tl), Ge, or Si(Li) to convert the radiation into an electrical signal. Liquid scintillation systems are used in many biochemistry programs, and they can be run in an automated mode. Gas-filled counters for alpha and beta spectroscopy can be used as well. Many instruments from physics or chemistry laboratories can be adapted to these experiments. Whatever instrument is used, it must have a short cycle time, allowing students to complete a measurement in about 10 s. An experiment with 250 data points may take as much as 50 min to carry out with such a cycle time (including time required to record the results). Some instruments can be operated in an automatic, repetitive mode.
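The relative-uncertainty arithmetic above (10% at a mean of 100, 3.2% at 1000) follows directly from σ = √n̄ and can be checked in a short sketch:

```python
import math

# The relative uncertainty from counting statistics alone is
# sqrt(mean)/mean, which shrinks as the mean count grows.
for mean in (100, 1000, 10000):
    sigma = math.sqrt(mean)
    print(f"mean {mean:>6}: sigma = {sigma:6.1f}  ({100 * sigma / mean:.1f}% relative)")
```

This is why counting longer (raising the mean) is the simplest way to tighten the distribution.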
Although a system that automatically records the information is more efficient, the extra time spent manually carrying out the experiment has an advantage: Students notice the large scatter in the points when they see the data one measurement at a time.

Results from the experiments should be displayed using graphics software applications. Some commercial applications have functions that generate a frequency distribution from raw data. If they are not available, a spreadsheet application should be used to process the information before graphing. Data are first sorted by increasing value. Students then count the number of occurrences for each value, plotting the number of occurrences versus value.
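The sort-and-count spreadsheet procedure can also be sketched in code. The data set below is a hypothetical stand-in for 250 counter readings (Gaussian counts with σ = √mean), not measurements from the article:

```python
from collections import Counter
import random

random.seed(1)
# Hypothetical stand-in for 250 Geiger-Muller readings: Gaussian counts
# with sigma = sqrt(mean), rounded to whole decays.
data = [round(random.gauss(100, 10)) for _ in range(250)]

def frequency_distribution(values, bin_size=5):
    """Tally occurrences per bin of `bin_size` counts, mirroring the
    sort-and-count spreadsheet procedure described in the text."""
    bins = Counter(v // bin_size * bin_size for v in values)
    return sorted(bins.items())

# A crude text histogram: one '*' per occurrence in each 5-count bin.
for start, n in frequency_distribution(data):
    print(f"{start:3d}-{start + 4:3d}: {'*' * n}")
```

Changing `bin_size` here reproduces the binning trade-off shown in Figure 4: too small and the shape drowns in scatter, too large and detail is filtered out.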

Statistical Fluctuation in a Data Set

In the first experiment students construct a large data set. Measurements are carried out by placing a radioactive source close to the end window of a Geiger-Muller tube. The time interval should be set so that at least 100 decays are detected in a measurement. If the time interval is too large, the experiment can be long. A set of conditions is chosen (i.e., the time interval and source-to-detector distance) that produces 250 measurements in a reasonable amount of time. Much of the data acquisition time is used to record the results. Students should work in pairs; one partner starts the count and calls the result to the second partner, who records it. A typical set of data is graphed as a time distribution in Figure 2 as the value versus the order in which the measurement was made. The data set used here contains 500 measurements.

Number of Points and Appropriate Bin Size

Figure 3 shows how the number of points in a data set affects the frequency distribution. These distributions were generated using the same data that produced the time distribution in Figure 2. The 50-point distribution (part a of Figure 3, generated using the first 50 points of the 500-point data set) is starting to take the form of a Gaussian distribution but has a large amount of scatter. A 500-point distribution (part f) is well-defined with several points in the wings of the Gaussian curve. Such a large data set, though, is tedious to construct. Sets with 250 points have worked well under typical classroom situations. Distributions shown in Figure 3 were constructed by combining data in bins of five counts. So, for the 500-point distribution (Fig. 3f), the first vertical bar on the left represents the number of measurements with values between 536 and 540 decays. An appropriate bin size must be chosen to show sufficient detail. A bin size that is too small will make it difficult to understand the distribution, and one that is too large filters out the detail. Figure 4 shows the effect that the bin size has on the frequency-distribution curve.

Optimum Number for Calculating the Mean

Students are often required to make replicate measurements on a system and, from this information, to calculate a mean value and its standard deviation. Figure 5 shows how the mean depends on the size of the data set for one experiment. The horizontal axis represents the number of measurements used to calculate the mean. Each value was calculated by summing from the first to the nth measurement. A log scale was used for the horizontal axis to emphasize the variation with small sample sizes. This figure shows the mean of three measurements to be high by 1.2% of the value found with the entire set. By 250 measurements the mean settles down to within 0.2% of its final value. Figure 6 shows how the standard deviation of the mean varies with the size of the data set. The optimum number of measurements for calculating the mean depends on the scatter of the data.
If a Gaussian or related behavior is assumed to exist in the system, then a small number of measurements may be used. Noise or other nonstandard behavior increases scatter and can skew the data. The mean will be more unstable, requiring a larger number of measurements. In general the performance of each system must be determined. The contribution of the instrument to the scatter will be studied in a later section.
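The running-mean behavior plotted in Figure 5 is easy to reproduce with simulated counts. The mean value and set size below are assumptions chosen for illustration:

```python
import random

random.seed(7)
# Simulated counting data: 500 measurements, assumed mean of 500 decays
# per interval, scatter of sqrt(mean) from counting statistics.
data = [random.gauss(500, 500 ** 0.5) for _ in range(500)]

# Running mean: the mean of the first n measurements, as plotted in Fig. 5.
running, total = [], 0.0
for i, x in enumerate(data, start=1):
    total += x
    running.append(total / i)

final = running[-1]
for n in (3, 10, 50, 250, 500):
    dev = 100 * (running[n - 1] - final) / final
    print(f"n = {n:3d}: mean = {running[n - 1]:6.1f} ({dev:+.2f}% from final)")
```

As in the figure, the small-n means wander by a percent or so and settle toward the full-set value as n grows.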
Extracting the Observed Standard Deviation


Figure 5. How the mean varies with the number of points.


Figure 6. How the standard deviation varies with the size of the data set.


It is straightforward to determine ranges for various multiples of σ once a large data set has been assembled. The purpose of the more tedious calculation described here is to emphasize how many measurements lie beyond these ranges. Doing this, students will gain a better appreciation of this figure of merit. Spreadsheet computer programs are simple methods of determining the standard deviation. A sorting function is used to arrange the data in increasing value. Students then determine the range that encompasses 68.3% of the data (1σ), centered about the mean. The 2σ boundary is at 95.5%, and the 3σ boundary is at 99.7%. For data sets of 250 measurements, they simply select 85 points on either side of the measurement that is closest to the mean value. The rest of the data set (31.7%) lies beyond the ±1σ boundary. Statistical functions in computer programs can calculate accurate values for the standard deviation, but it is important to emphasize the amount of valid data that lies beyond this boundary.

A second method is to generate a time-distribution plot (i.e., the value of a measurement versus time) and take the number of the measurement as time. Horizontal lines are drawn at the mean value and at the mean value ±1σ. Figure 2 shows a time distribution with lines drawn at these values.

Figure 7. Time distributions for two instruments.
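The sorted-range procedure (85 points on either side of the central measurement for a 250-point set) can be sketched as follows. The data are simulated with an assumed mean of 600 counts; the median of the sorted set stands in for the measurement closest to the mean:

```python
import random
import statistics

random.seed(3)
# 250 simulated measurements with an assumed mean of 600 counts.
data = sorted(round(random.gauss(600, 600 ** 0.5)) for _ in range(250))

# Sorted-range method from the text: the +/-1 sigma band holds 68.3% of
# the data, so take ~85 points on either side of the central measurement
# (the median approximates the point nearest the mean).
center = len(data) // 2
half = round(0.683 * len(data) / 2)   # 85 points for a 250-point set
low, high = data[center - half], data[center + half]
sigma_est = (high - low) / 2

print(f"68.3% band: {low}..{high} -> sigma about {sigma_est:.1f}")
print(f"statistics.stdev gives {statistics.stdev(data):.1f}")
```

The two estimates agree to within sampling scatter, while the hand method makes visible the 31.7% of the data left outside the ±1σ band.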

An important application of the frequency distribution is the determination of the performance of an instrument. Each instrument will contribute to the total width of the frequency distribution, broadening and possibly skewing the data. This experiment shows how it is possible to choose the optimum instrument for a measurement, and it shows that the observed uncertainty has several components. A radiation-detection measurement is useful here because the instrumental uncertainty can be calculated from the observed total uncertainty and the contribution due to the natural scatter of the data.

The experiment can be carried out in two ways. Students can choose to construct two data sets with different instruments. If laboratory time is limited, groups work as a team, sharing data sets and independently analyzing the data. A second method is to work with one instrument. After assembling a counting system, students collect the first data set, change one experimental parameter, and then collect the second data set. Some parameters that can be changed: Students can switch to a different power supply or amplifier. They can also change the high voltage, the preamplifier capacitance, or the amplifier gain. Data sets are constructed for each set of conditions.

Figure 8. Frequency distributions for two instruments. Part a is the frequency distribution for the instrument whose time distribution is shown in part a of Figure 7. Likewise, part b of this figure is the frequency distribution for the instrument whose time distribution is shown in part b of Figure 7.
Comparing the Stability

Figure 9. Sampling curve for a homogeneous system (part a). Part b shows the time distribution for this system, and part c shows its frequency distribution.

Time distributions are used to compare the stability of the two data sets. These are generated by plotting measurements with their value on the vertical axis and time on the horizontal axis. The number of the measurement (e.g., 1, 2, 3, ...) is taken as time or Δt. A constant should be added to each measurement in one data set to vertically offset that distribution from the other. Figure 7 shows an example of such a graph.

It is possible to measure the stability of an instrument by looking qualitatively at the time distributions. The graphs should look random. A noticeable slope suggests that an experimental parameter is drifting. Linear-regression programs can be used to calculate the slope of the data. Other features to look for include an oscillation that is of a lower frequency than the statistical fluctuation or a region that varies significantly from the mean. These features might suggest that one data set is better than the other or that one instrument performs better than the other. Part a of Figure 7 has a large drift and is obviously the poorer instrument of the two.
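The regression drift test can be sketched with two hypothetical instruments: one stable, one with an artificial upward ramp (0.3 counts per measurement, an assumed value) added to the same statistical scatter:

```python
import random

random.seed(5)
# Two hypothetical instruments sharing the same counting scatter; the
# second has a steady drift of 0.3 counts per measurement added.
stable = [random.gauss(600, 600 ** 0.5) for _ in range(250)]
drifting = [x + 0.3 * i for i, x in enumerate(stable)]

def slope(ys):
    """Least-squares slope of value vs. measurement number (a drift test)."""
    n = len(ys)
    mx = (n - 1) / 2
    my = sum(ys) / n
    num = sum((i - mx) * (y - my) for i, y in enumerate(ys))
    den = sum((i - mx) ** 2 for i in range(n))
    return num / den

print(f"stable:   slope = {slope(stable):+.3f} counts per measurement")
print(f"drifting: slope = {slope(drifting):+.3f} counts per measurement")
```

A slope well beyond the scatter expected for a flat data set flags the drifting instrument, matching the qualitative judgment made from Figure 7a.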

Comparing the Frequency Distributions

Instrumental performance is also determined by comparing the frequency distributions. The distributions should be symmetric and as narrow as possible. Excess noise in one component of the instrument will increase the width of the distribution and can lead to skewing. Without noise the standard deviation is the square root of the mean. If instrumental noise has a Gaussian distribution, then it will combine with the statistical fluctuation by

σ_obs² = σ_decay² + σ_inst²

where σ_obs is the observed standard deviation; σ_decay is the natural statistical fluctuation of the data from nuclear decay; and σ_inst is the instrumental uncertainty. Students should calculate σ_inst to determine if the instrument significantly changes the assumed model (where σ is the square root of the mean). The standard deviation σ_inst is a quantitative figure of merit for comparing the two instruments. Figure 8 shows that no such calculations are necessary to determine which is the optimum instrument.

Sampling and Inhomogeneity

A common measurement problem is sampling some feature of a large system. What sample size provides representative results? In a fairly homogeneous system a measurement with a particular sample size can take on a Gaussian distribution. The degree of scatter decreases as the size of the sample increases until the sample includes the entire system under study. Nuclear-decay data sets can be used to illustrate this, assuming instrumental noise does not significantly distort the distribution. The system is the set of measurements,

and the sample is the individual measurement, each collected with a time Δt. Sample sizes are increased by increasing the measurement time. Assume that the counting times are held to multiples of Δt. Counting for a longer time period is similar to summing every n measurements. Because each measurement is equal to its variance, summing the measurements properly propagates the uncertainties: the total uncertainty is still the square root of the sum. The original set of measurements can thus be used to explore variations in sample size.
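The summing trick can be sketched directly: group a simulated 500-point data set into blocks of n measurements, sum each block (equivalent to counting for nΔt), and normalize back to one Δt. The mean of 600 counts is an assumed value for illustration:

```python
import math
import random
import statistics

random.seed(9)
# 500 simulated counts with an assumed mean of 600 per interval dt.
base = [round(random.gauss(600, math.sqrt(600))) for _ in range(500)]

def resample(data, n):
    """Sum every n consecutive measurements (equivalent to counting for
    n*dt) and normalize each sum back to one dt, as the text describes."""
    return [sum(data[i:i + n]) / n for i in range(0, len(data) - n + 1, n)]

for n in (1, 5, 25):
    scatter = statistics.stdev(resample(base, n))
    print(f"sample size {n:2d}*dt: scatter = {scatter:5.2f}")
```

The scatter of the normalized values shrinks roughly as 1/√n, which is what flattens the sampling curve at large sample sizes.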
Sampling Curves

A sampling curve is shown in Figure 9a. In this figure the value for a measurement is plotted versus sample size, which is in units of Δt. The data used to generate this figure are the same as those used to generate Figures 2 and 3. Summed measurements are normalized to 1Δt. The shape of the sampling curve can be understood as a series of frequency distributions viewed from above. A homogeneous system will have a time distribution that is uniform, randomly varying about a mean value. This is shown as part b of Figure 9, along with the resulting frequency distribution (part c).

Inhomogeneous Systems

Inhomogeneity skews the results. Spikes may be apparent in the time distribution, and it may have a slope or other nonstandard behavior. The frequency distribution becomes asymmetric, and a larger number of samples is needed to obtain a representative sampling of the system. Three inhomogeneous systems are shown in Figures 10-12. The assumed time distributions were generated by adding a function to the original data set and normalizing the entire data set so that the sum of all points is equivalent to that from the original data set. In this way, the systems are equivalent in total concentration or response but have different inhomogeneities.

Figure 10 has a time distribution with a 10% (positive) slope over the 500 points. The frequency distribution is asymmetric and is distorted on the right side; the sampling curve is much broader at large sample sizes. In Figure 11, a system is shown that could describe an elemental concentration highly dependent on particle size, an exponential function. The frequency distribution does not appear as a Gaussian function and spreads over a wide range of counts. Likewise, the sampling curve is broadly distributed. A system with two components or phases is shown in Figure 12. A Gaussian peak was superimposed on the otherwise random distribution. This peak is evident in the frequency distribution as the region to the right of the primary peak. The sampling curve almost appears to have two components. A lower-valued sampling curve, centered near a mean of about 620, is fairly well-defined, and a second weaker sampling curve is suggested near a mean value of about 750.

Figure 10. Sampling curve (part a) for a system whose time distribution has a positive bias. A function was added to the homogeneous distribution giving a rise of 10% over the 500 points. Part b is the time distribution for this system, and part c is the frequency distribution.

Figure 11. Sampling curve for a system with an exponential bias (part a). The homogeneous system was modified using a function with an exponential form. Data were then normalized so that the sum of all values is equivalent to that for the homogeneous system. Part b is the time distribution for this system, and part c is the frequency distribution.

Figure 12. Sampling curve for a two-component system (part a). A Gaussian distribution, centered at measurement number 200, was superimposed on the homogeneous distribution. The data were normalized so that the sum of all values is equivalent to that from the homogeneous system. Part b is the time distribution for this system, and part c is the frequency distribution.

Conclusion

Several courses at San Jose State University have used these experiments for two years. Most of the experience has been in courses in nuclear science and health physics, courses that traditionally have a strong emphasis in statistics and instrumentation. The Chemistry Department of this university teaches a course in scientific computing. As part of this course students use the experiments to learn about data processing and issues of instrument performance. Students in each course carry out these experiments on a variety of the radiation-detection instruments in the Nuclear Science Facility. During the upcoming year the experiments may be extended to courses within the Physics Department and later to other departments around campus.
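The construction used for the assumed time distributions (add a bias function, then renormalize so the total matches the homogeneous set) can be sketched for the linear case of Figure 10. The baseline mean and set size are assumed values:

```python
import math
import random

random.seed(11)
# Homogeneous baseline: 500 simulated counts (assumed mean of 600).
homog = [random.gauss(600, math.sqrt(600)) for _ in range(500)]

def add_linear_bias(data, rise=0.10):
    """Impose a linear ramp rising by `rise` (10%) over the set, then
    rescale so the total matches the homogeneous set -- the construction
    described for the assumed time distributions of Figures 10-12."""
    n = len(data)
    biased = [x * (1 + rise * i / (n - 1)) for i, x in enumerate(data)]
    scale = sum(data) / sum(biased)
    return [x * scale for x in biased]

biased = add_linear_bias(homog)
first = sum(biased[:50]) / 50
last = sum(biased[-50:]) / 50
print(f"totals: {sum(biased):.1f} vs {sum(homog):.1f} (equal by construction)")
print(f"mean of first 50: {first:.1f}, mean of last 50: {last:.1f}")
```

Swapping the ramp for an exponential, or adding a localized Gaussian bump, produces the other two inhomogeneous systems while keeping the total response fixed.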

Literature Cited
1. Witte, R. S. Statistics, 4th ed.; Harcourt Brace Jovanovich College: New York, 1993.
2. Baird, D. C. Experimentation, 2nd ed.; Prentice Hall: Englewood Cliffs, NJ, 1983.
3. Larsen, R. J.; Marx, M. L. An Introduction to Mathematical Statistics and Its Applications, 2nd ed.; Prentice Hall: Englewood Cliffs, NJ, 1986.
4. Hogg, R. V.; Ledolter, J. Applied Statistics for Engineers and Physical Scientists, 2nd ed.; Macmillan: New York, 1992.
5. Knoll, G. F. Radiation Detection and Measurement, 2nd ed.; John Wiley and Sons: New York, 1989.
6. Clarke, G. M. Statistics and Experimental Design, 2nd ed.; Edward Arnold: London.
7. …and Hall: New York, 1985.
8. Howell, D. C. Fundamental Statistics for the Behavioral Sciences, 2nd ed.; Duxbury: Belmont, CA, 1989.
9. Ferguson, G. A. Statistical Analysis in Psychology and Education, 4th ed.; McGraw-Hill: New York, 1976.
10. Winn, P. R.; Johnson, R. H. Business Statistics; Macmillan: New York, 1978.
11. Mendenhall, W.; Reinmuth, J. E.; Beaver, R.; Duhan, D. Statistics for Management and Economics, 5th ed.; PWS: Boston, 1986.


