Introduction to Inference
6.1 Estimating with Confidence
6.2 Tests of Significance
6.3 Use and Abuse of Tests
6.4 Power and Inference as a Decision

Introduction

The purpose of statistical inference is to draw conclusions from data. We have examined data and arrived at conclusions many times previously. Formal inference adds an emphasis on substantiating our conclusions by probability calculations. Probability allows us to take chance variation into account and so to correct our judgment by calculation. Here is an example.

EXAMPLE 6.1 The Wade Tract in Thomas County, Georgia, is an old-growth forest of longleaf pine trees (Pinus palustris) that has survived in a relatively undisturbed state since before the settlement of the area by Europeans. Foresters who study these trees are interested in how the trees are distributed in the forest. Are the locations of the trees random, with no particular patterns? Or is there some sort of clustering, resulting in regions that have more trees than others? Figure 6.1 gives a plot of the locations of all 584 longleaf pine trees in a 200-meter by 200-meter region in the Wade Tract. Do the locations appear to be random, or do there appear to be clusters of trees? One approach to the analysis of these data indicates that a pattern as clustered as or more clustered than the one in Figure 6.1 would occur only 4% of the time if, in fact, the locations of longleaf pine trees in the Wade Tract are random. Because this chance is fairly small, we conclude that there is some clustering of these trees. Our probability calculation helps us to distinguish between patterns that are consistent or inconsistent with the random-location scenario.

FIGURE 6.1 The distribution of longleaf pine trees (north-south versus east-west position), for Example 6.1.

Our unaided judgment can also err in the opposite direction, seeing a systematic effect when only chance variation is at work. Give a new drug and a placebo to 20 patients each; 12 of those taking the drug show improvement, but only 8 of the placebo patients improve. Is the drug more effective than the placebo? Perhaps, but a difference this large or larger between the results in the two groups would occur about one time in five simply because of chance variation. An effect that could so easily be just chance is not convincing.

In this chapter we introduce the two most prominent types of formal statistical inference. Section 6.1 concerns confidence intervals for estimating the value of a population parameter. Section 6.2 presents tests of significance, which assess the evidence for a claim. Both types of inference are based on the sampling distributions of statistics. That is, both report probabilities that state what would happen if we used the inference method many times. This kind of probability statement is characteristic of standard statistical inference. Users of statistics must understand the nature of the reasoning employed and the meaning of the probability statements that appear, for example, on computer output for statistical procedures.

Because the methods of formal inference are based on sampling distributions, they require a probability model for the data. Trustworthy probability models can arise in many ways, but the model is most secure and inference is most reliable when the data are produced by a properly randomized design.
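Returning to the probability calculation in Example 6.1: the text does not say which method the foresters used, but the following Python sketch shows one generic way such a "4% of the time" figure could be obtained by simulation, using a mean nearest-neighbor distance as an assumed measure of clustering and placeholder coordinates in place of the real 584 tree locations.

import numpy as np

rng = np.random.default_rng(seed=0)

def mean_nearest_neighbor(points):
    # Mean distance from each tree to its closest neighbor;
    # smaller values indicate a more clustered pattern.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)          # ignore each point's distance to itself
    return dists.min(axis=1).mean()

n_trees, side = 584, 200.0                   # 584 trees in a 200 m x 200 m region
# Placeholder coordinates; the real Wade Tract data are not reproduced here.
observed_xy = rng.uniform(0, side, size=(n_trees, 2))
observed_stat = mean_nearest_neighbor(observed_xy)

# Simulate the "completely random locations" scenario and count how often it
# produces a pattern at least as clustered as the observed one. With the real
# tree coordinates this proportion would be about 0.04 according to the text;
# with the placeholder (random) data above it will be near 0.5.
n_sims = 200
sim_stats = np.array([
    mean_nearest_neighbor(rng.uniform(0, side, size=(n_trees, 2)))
    for _ in range(n_sims)
])
print((sim_stats <= observed_stat).mean())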
When you use statistical inference, you are acting as if the data come from a random sample or a randomized experiment. If this is not true, your conclusions may be open to challenge. Do not be overly impressed by the complex details of formal inference. This elaborate machinery cannot remedy basic flaws in producing the data such as voluntary response samples and confounded experiments. Use the common sense developed in your study of the first three chapters of this book, and proceed to detailed formal inference only when you are satisfied that the data deserve such analysis.

The primary purpose of this chapter is to describe the reasoning used in statistical inference. We will discuss only a few specific inference techniques, and these require an unrealistic assumption: that we know the standard deviation σ. Later chapters will present inference methods for use in most of the settings we met in learning to explore data. There are libraries, both of books and of computer software, full of more elaborate statistical techniques. Informed use of any of these methods requires an understanding of the underlying reasoning. A computer will do the arithmetic, but you must still exercise judgment based on understanding.

6.1 Estimating with Confidence

The SAT tests are widely used measures of readiness for college study. There are two parts, one for verbal reasoning ability (SATV) and one for mathematical reasoning ability (SATM). In April 1995, the scores were recentered so that the mean is approximately 500 in a large "standardization group." This scale is maintained from year to year so that scores have a constant interpretation. In 2003, 1,406,324 college-bound seniors took the SAT. Their mean SATV score was 507 with a standard deviation of 111. For the SATM the mean was 519 with a standard deviation of 115.

You want to estimate the mean SATM score for the more than 385,000 high school seniors in California. You know better than to trust data from the students who choose to take the SAT. Only about 45% of California students take the SAT. These self-selected students are planning to attend college and are not representative of all California seniors. At considerable effort and expense, you give the test to a simple random sample (SRS) of 500 California high school seniors. The mean score for your sample is x̄ = 461. What can you say about the mean score μ in the population of all 385,000 seniors?

The sample mean x̄ is the natural estimator of the unknown population mean μ. We know that x̄ is an unbiased estimator of μ. More important, the law of large numbers says that the sample mean must approach the population mean as the size of the sample grows. The value x̄ = 461 therefore appears to be a reasonable estimate of the mean score μ that all 385,000 students would achieve if they took the test. But how reliable is this estimate? A second sample would surely not give 461 again. Unbiasedness says only that there is no systematic tendency to underestimate or overestimate the truth. Could we plausibly get a sample mean of 410 or 510 on repeated samples? An estimate without an indication of its variability is of little value.

Statistical confidence

Just as unbiasedness of an estimator concerns the center of its sampling distribution, questions about variation are answered by looking at the spread.
We know that if the entire population of SAT scores has mean μ and standard deviation σ, then in repeated samples of size 500 the sample mean x̄ follows the N(μ, σ/√500) distribution. Let us suppose that we know that the standard deviation σ of SATM scores in our California population is σ = 100. (This is not realistic. We will see in the next chapter how to proceed when σ is not known. For now, we are more interested in statistical reasoning than in details of realistic methods.) In repeated sampling the sample mean x̄ has a normal distribution centered at the unknown population mean μ and having standard deviation

σ/√500 = 100/√500 ≈ 4.5

Now we are in business. Consider this line of thought, which is illustrated by Figure 6.2:

• The 68-95-99.7 rule says that the probability is about 0.95 that x̄ will be within 9 points (two standard deviations of x̄) of the population mean score μ.
• To say that x̄ lies within 9 points of μ is the same as saying that μ is within 9 points of x̄.
• So 95% of all samples will capture the true μ in the interval from x̄ − 9 to x̄ + 9.

FIGURE 6.2 x̄ lies within ±9 of μ in 95% of all samples, so μ also lies within ±9 of x̄ in those samples.

We have simply restated a fact about the sampling distribution of x̄. The language of statistical inference uses this fact about what would happen in the long run to express our confidence in the results of any one sample. Our sample gave x̄ = 461. We say that we are 95% confident that the unknown mean score for all California seniors lies between

x̄ − 9 = 461 − 9 = 452   and   x̄ + 9 = 461 + 9 = 470

Be sure you understand the grounds for our confidence. There are only two possibilities:

1. The interval between 452 and 470 contains the true μ.
2. Our SRS was one of the few samples for which x̄ is not within 9 points of the true μ. Only 5% of all samples give such inaccurate results.

We cannot know whether our sample is one of the 95% for which the interval x̄ ± 9 catches μ or one of the unlucky 5%. The statement that we are 95% confident that the unknown μ lies between 452 and 470 is shorthand for saying, "We arrived at these numbers by a method that gives correct results 95% of the time."

Confidence intervals

The interval of numbers between the values x̄ ± 9 is called a 95% confidence interval for μ. Like most confidence intervals we will meet, this one has the form

estimate ± margin of error

The estimate (x̄ in this case) is our guess for the value of the unknown parameter. The margin of error ±9 shows how accurate we believe our guess is, based on the variability of the estimate. The confidence level shows how confident we are that the procedure will catch the true population mean μ.

Figure 6.3 illustrates the behavior of 95% confidence intervals in repeated sampling. The center of each interval is at x̄ and therefore varies from sample to sample. The sampling distribution of x̄ appears at the top of the figure to show the long-term pattern of this variation. The 95% confidence intervals x̄ ± 9 from 25 SRSs appear below. The center x̄ of each interval is marked by a dot. The arrows on either side of the dot span the confidence interval. All except one of the 25 intervals cover the true value of μ. In a very large number of samples, 95% of the confidence intervals would contain μ.
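A short Python sketch of this repeated-sampling idea (an illustration, not part of the text): it computes the x̄ ± 9 interval for the sample above and then simulates many SRSs to check that roughly 95% of the resulting intervals cover the true mean. The value used for the "true" μ and the random seed are arbitrary choices made only for the simulation.

import math
import numpy as np

sigma, n = 100, 500                      # known sigma and sample size from the example
se = sigma / math.sqrt(n)                # standard deviation of x-bar, about 4.5
x_bar = 461
print(f"95% interval: {x_bar - 2*se:.0f} to {x_bar + 2*se:.0f}")   # about 452 to 470

rng = np.random.default_rng(seed=1)
true_mu = 460                            # assumed "true" mean, chosen only for illustration
covered = 0
n_samples = 10_000
for _ in range(n_samples):
    sample_mean = rng.normal(true_mu, se)      # draw x-bar from its sampling distribution
    if sample_mean - 2*se <= true_mu <= sample_mean + 2*se:
        covered += 1
print(f"Proportion of intervals covering mu: {covered / n_samples:.3f}")   # close to 0.95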
With the Confidence Interval applet you can construct many diagrams similar to the one displayed in Figure 6.3.

FIGURE 6.3 Twenty-five samples from the same population gave these 95% confidence intervals. In the long run, 95% of all samples give an interval that covers μ.

Statisticians have constructed confidence intervals for many different parameters based on a variety of designs for data collection. We will meet a number of these in later chapters. You need to know two important things about a confidence interval:

1. It is an interval of the form (a, b), where a and b are numbers computed from the data.
2. It has a property called a confidence level that gives the probability that the interval covers the parameter.

Users can choose the confidence level, but 95% is the standard for most situations. Occasionally, 90% or 99% is used. We will use C to stand for the confidence level in decimal form. For example, a 95% confidence level corresponds to C = 0.95.

CONFIDENCE INTERVAL
A level C confidence interval for a parameter is an interval computed from sample data by a method that has probability C of producing an interval containing the true value of the parameter.

Confidence interval for a population mean

We will now construct a level C confidence interval for the mean μ of a population when the data are an SRS of size n. The construction is based on the sampling distribution of the sample mean x̄. This distribution is exactly N(μ, σ/√n) when the population has the N(μ, σ) distribution. The central limit theorem says that this same sampling distribution is approximately correct for large samples whenever the population mean and standard deviation are μ and σ.

Our construction of a 95% confidence interval for the mean SAT score began by noting that any normal distribution has probability about 0.95 within ±2 standard deviations of its mean. To construct a level C confidence interval we first catch the central C area under a normal curve. That is, we must find the number z* such that any normal distribution has probability C within ±z* standard deviations of its mean. Because all normal distributions have the same standardized form, we can obtain everything we need from the standard normal curve. Figure 6.4 shows how C and z* are related. Values of z* for many choices of C appear in the row labeled z* at the bottom of Table D at the back of the book. Here are the most important entries from that row:

    z*    1.645    1.960    2.576
    C     90%      95%      99%

Any normal curve has probability C between the point z* standard deviations below the mean and the point z* above the mean, as Figure 6.4 reminds us. The sample mean x̄ has the normal distribution with mean μ and standard deviation σ/√n. So there is probability C that x̄ lies between

μ − z*σ/√n   and   μ + z*σ/√n

FIGURE 6.4 The area between −z* and z* under the standard normal curve is C.

This is exactly the same as saying that the unknown population mean μ lies between

x̄ − z*σ/√n   and   x̄ + z*σ/√n

That is, there is probability C that the interval x̄ ± z*σ/√n contains μ. That is our confidence interval. The estimate of the unknown μ is x̄, and the margin of error is z*σ/√n.

CONFIDENCE INTERVAL FOR A POPULATION MEAN
Choose an SRS of size n from a population having unknown mean μ and known standard deviation σ. The margin of error for a level C confidence interval for μ is

m = z*σ/√n

Here z* is the value on the standard normal curve with area C between the critical points −z* and z*. The level C confidence interval for μ is

x̄ ± m

This interval is exact when the population distribution is normal and is approximately correct for large n in other cases.
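The recipe in the box translates directly into a few lines of code. The sketch below is an illustration, not part of the text: it finds z* for a chosen level C using scipy and forms the interval x̄ ± z*σ/√n. The helper names z_star and mean_ci are made up for this example.

import math
from scipy.stats import norm

def z_star(C):
    # Critical value with central area C under the standard normal curve.
    return norm.ppf((1 + C) / 2)

def mean_ci(x_bar, sigma, n, C=0.95):
    m = z_star(C) * sigma / math.sqrt(n)    # margin of error
    return x_bar - m, x_bar + m

for C in (0.90, 0.95, 0.99):
    print(f"C = {C:.2f}: z* = {z_star(C):.3f}")    # 1.645, 1.960, 2.576

print(mean_ci(461, sigma=100, n=500))               # roughly (452.2, 469.8)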
EXAMPLE 6.3 The National Student Loan Survey collects data to examine questions related to the amount of money that borrowers owe. The survey selected a sample of 1280 borrowers who began repayment of their loans between four and six months prior to the study. The mean of the debt for undergraduate study was $18,900 and the standard deviation was about $49,000. This distribution is clearly skewed, but because our sample size is quite large, we can rely on the central limit theorem to assure us that the confidence interval based on the normal distribution will be a good approximation.

Let's compute a 95% confidence interval for the true mean debt for all borrowers. Although the standard deviation is estimated from the data collected, we will treat it as a known quantity for our calculations here. For 95% confidence, we see from Table D that z* = 1.960. The margin of error for the 95% confidence interval for μ is therefore

m = z*σ/√n = 1.960 × 49,000/√1280 = 2684

We have computed the margin of error with more digits than we really need. Our mean is rounded to the nearest $100, so we will do the same for the margin of error. Therefore, we will use m = 2700. The 95% confidence interval is

x̄ ± m = 18,900 ± 2700 = (16,200, 21,600)

We are 95% confident that the mean debt is between $16,200 and $21,600.

Note that we have rounded the results to the nearest $100. Keeping additional digits would provide no additional useful information. In this example we used a value for σ based on a large sample. Because of this, the confidence interval that we calculated will be the same as the one that we would compute using the methods discussed in the next chapter, where we no longer need to assume that σ is known.

Suppose the researchers who designed the National Student Loan Survey had used a different sample size. How would this affect the confidence interval? We can answer this question by changing the sample size in our calculations and assuming that the mean and standard deviation are the same.

EXAMPLE 6.4 Let's assume that the sample mean of the debt for undergraduate study is $18,900 and the standard deviation is about $49,000, as in Example 6.3. But suppose that the sample size is only 320. The margin of error for 95% confidence is

m = z*σ/√n = 1.960 × 49,000/√320 ≈ 5400

and the 95% confidence interval is

x̄ ± m = 18,900 ± 5400 = (13,500, 24,300)

FIGURE 6.5 Confidence intervals for n = 1280 and n = 320, for Examples 6.3 and 6.4.

Notice that the margin of error for this example is twice as large as the margin of error that we computed in Example 6.3. The only change that we made was to assume that the sample size is 320 rather than 1280. The new sample size is exactly one-fourth of the original 1280. We double the margin of error when we reduce the sample size to one-fourth of the original value. Figure 6.5 illustrates the effect in terms of the intervals.
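The arithmetic of Examples 6.3 and 6.4 can be checked with a short script. This sketch is an illustration only; it prints the unrounded margins of error, which the text then rounds to the nearest $100.

import math

z_star = 1.960                 # 95% confidence, from Table D
x_bar, sigma = 18_900, 49_000

for n in (1280, 320):
    m = z_star * sigma / math.sqrt(n)
    print(f"n = {n:>4}: margin of error = {m:,.0f}, "
          f"interval = ({x_bar - m:,.0f}, {x_bar + m:,.0f})")
# n = 1280: margin of error = 2,684, interval = (16,216, 21,584)
# n =  320: margin of error = 5,369, interval = (13,531, 24,269)
# The text rounds these to 2700 and 5400, giving (16,200, 21,600) and (13,500, 24,300).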
The argument leading to the form of confidence intervals for the population mean μ rested on the fact that the statistic used to estimate μ has a normal distribution. Because many sample estimates have normal distributions (at least approximately), it is useful to notice that the confidence interval has the form

estimate ± z*σ_estimate

The estimate based on the sample is the center of the confidence interval. The margin of error is z*σ_estimate. The desired confidence level determines z* from Table D. The standard deviation of the estimate is found from a knowledge of the sampling distribution in a particular case. When the estimate is x̄ from an SRS, the standard deviation of the estimate is σ/√n.

How confidence intervals behave

The margin of error z*σ/√n for the mean of a normal population illustrates several important properties that are shared by all confidence intervals in common use. The user chooses the confidence level, and the margin of error follows from this choice. High confidence is desirable and so is a small margin of error. High confidence says that our method almost always gives correct answers. A small margin of error says that we have pinned down the parameter quite precisely.

Suppose that you calculate a margin of error and decide that it is too large. Here are your choices to reduce it:

• Use a lower level of confidence (smaller C).
• Increase the sample size (larger n).
• Reduce σ.

For most problems you would choose a confidence level of 90%, 95%, or 99%, so z* can be 1.645, 1.960, or 2.576. A look at Figure 6.4 will convince you that z* will be smaller for lower confidence (smaller C). Table D shows that this is indeed the case. If n and σ are unchanged, a smaller z* leads to a smaller margin of error. Similarly, increasing the sample size n reduces the margin of error for any fixed confidence level. The square root in the formula implies that we must multiply the number of observations by 4 in order to cut the margin of error in half.

The standard deviation σ measures the variation in the population. You can think of the variation among individuals in the population as noise that obscures the average value μ. It is harder to pin down the mean μ of a highly variable population; that is why the margin of error of a confidence interval increases with σ. In practice, we can sometimes reduce σ by carefully controlling the measurement process or by restricting our attention to only part of a large population.

EXAMPLE 6.5 Suppose that for the student loan data in Example 6.3, we wanted 99% confidence rather than 95%. Table D tells us that for 99% confidence, z* = 2.576. The margin of error for 99% confidence based on 1280 observations is

m = z*σ/√n = 2.576 × 49,000/√1280 ≈ 3500

and the 99% confidence interval is

x̄ ± m = 18,900 ± 3500 = (15,400, 22,400)

Requiring 99%, rather than 95%, confidence has increased the margin of error from 2700 to 3500. Figure 6.6 compares the two intervals.

FIGURE 6.6 Confidence intervals at 95% and 99% confidence, for Examples 6.3 and 6.5.

Choosing the sample size

A wise user of statistics never plans data collection without at the same time planning the inference. You can arrange to have both high confidence and a small margin of error. The margin of error of the confidence interval x̄ ± z*σ/√n is m = z*σ/√n.
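As a sketch of these two levers (an illustration, not part of the text), the code below recomputes the student loan margin of error at the three common confidence levels and then, by solving m = z*σ/√n for n, finds the sample size needed for a hypothetical target margin of error of $2000.

import math
from scipy.stats import norm

x_bar, sigma, n = 18_900, 49_000, 1280

for C in (0.90, 0.95, 0.99):
    z = norm.ppf((1 + C) / 2)
    m = z * sigma / math.sqrt(n)
    print(f"{C:.0%} confidence: margin of error = {m:,.0f}")
# 90%: about 2,253   95%: about 2,684   99%: about 3,528

# Sample size needed for a hypothetical target margin of error of $2000 at 95%
# confidence, obtained by solving m = z* sigma / sqrt(n) for n:
m_target = 2000
n_needed = (1.960 * sigma / m_target) ** 2
print(math.ceil(n_needed))                 # 2306 borrowers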
