Inference About Variables

Chapter 21.
Inference about Variables: Part III Review

The purpose of this chapter is to review concepts covered in Chapters 17 through 20. No additional SPSS instructions are given in this section. Some additional problems are listed below for extra practice on the concepts already presented.
Chapter 21 Exercises
21.1 Wikipedia.
21.3 Spinning Euros.
21.5 Men and muscle.
21.7 Butterflies mating.
21.9 Listening to rap: Hispanics and whites.
21.11 Mouse endurance.
21.13 More on mouse endurance.
21.15 College-educated parents.
21.19 Very-low-birth-weight babies.
21.21 Very low birth weight, drug use, and IQ.
21.23 Breast cancer.
21.25 Cholesterol on drugs.
21.27 Cholesterol in pets.
21.39 Conditions for inference.
21.41 Starting to talk.
21.45 More on dyeing fabrics.
21.47 Parents' behavior.
21.49 Keeping crackers from breaking.
Chapter 21 SPSS Solutions
NOTE: SPSS does not do inference based on Z distributions, nor does it perform inference on variables that are already summarized. If you really want to use SPSS for these problems, follow the instructions below (you'll basically be using Transform, Compute Variable as a calculator) or use another technology, such as a graphing calculator or another statistics program like Minitab or Crunchit.
21.1 The survey found 36% of adult Internet users using Wikipedia. We calculate the confidence interval as shown below.
Based on this sample, between 33.6% and 38.4% of adult Internet users consult Wikipedia, with 95% confidence.
21.3 If the coin is balanced, we should have half heads, so we have $H_0: p = 0.5$. The question is whether the coin is unbalanced, so we also have $H_a: p \neq 0.5$. The observed proportion of heads is $\hat{p} = 140/250 = 0.56$. We compute the test statistic and its P-value (doubling the area above the test statistic) below.
We have z = 1.90 with P-value 0.0574. At the 0.05 level, this is not sufficient evidence to conclude that the coin is unbalanced (although it might be close).
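The calculation in 21.3 can also be checked outside SPSS. The following is a minimal sketch in Python using only the standard library; the function name `one_proportion_z_test` is an illustrative choice, not part of SPSS or the textbook.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided large-sample z test of H0: p = p0 (hypothetical helper)."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    # Two-sided P-value: double the area beyond |z| under the standard Normal curve.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Exercise 21.3: 140 heads in 250 spins, H0: p = 0.5
z, p_value = one_proportion_z_test(140, 250, 0.5)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

Because the sketch carries the unrounded z into the P-value, its result can differ in the third or fourth decimal place from a value computed after rounding z to 1.90.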
21.5 We need a 95% confidence interval for the mean muscle gap perceived by American/European men. We first need the correct t* for a sample of n = 200 (199 df), then compute the lower and upper ends of the interval.
With 95% confidence, these men think they need between 2.001 and 2.699 kilograms more muscle to make them attractive to women.
21.7 We'll manually compute a two-sample t test to determine whether the mean time between matings differs between the two groups. With a t statistic of 10.42, the P-value is essentially 0, so there is a significant difference in time between matings; butterflies given the large spermatophore wait longer between matings.
21.9 From the material before Exercise 21.8, 45% of the 314 Hispanics and 23% of the 567 whites listen to rap every day. We use this information to compute the interval.
Based on our information, the proportion of Hispanics who listen to rap every day is between 16.5 and 27.4 percentage points higher than the proportion of whites who do, with 90% confidence.
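The interval in 21.9 follows the large-sample formula for a difference of two proportions. Here is a minimal sketch in Python, assuming the sample proportions are entered as the rounded percentages quoted in the exercise (0.45 and 0.23); `two_proportion_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(p1, n1, p2, n2, confidence):
    """Large-sample confidence interval for p1 - p2 (hypothetical helper)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error: each sample contributes its own variance term.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Exercise 21.9: 45% of 314 Hispanics, 23% of 567 whites, 90% confidence
lower, upper = two_proportion_ci(0.45, 314, 0.23, 567, 0.90)
print(f"{lower:.3f} to {upper:.3f}")
```

Starting from rounded percentages rather than the underlying counts can shift the endpoints by about a tenth of a percentage point.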
21.11 The t procedures will be reasonably accurate for these data because of the large sample sizes (the central limit theorem), even though the distributions are skewed to the right. We want to know if female mice have significantly higher endurance, on average, so we compute a test statistic and find the one-sided P-value, using the conservative degrees of freedom.
The conclusion will depend on the alpha level of the test. At the 0.05 level, we conclude that female mice do have more endurance, on average. At the 0.01 level, we fail to show that female mice have higher mean endurance than male mice.
21.13 We first find the conservative t*, then compute the interval.
With 95% confidence, female mice have endurance between 0.49 and 8.91 minutes more than male mice, on average.
21.15 From Exercise 21.14, 5617 students had at least one parent who graduated from college in a sample of 17,554 students. This is a sample proportion of $\hat{p} = 5617/17554 = 0.32$. We compute the confidence interval below.
The information in this survey indicates that between 31.1% and 32.9% of 17-year-old students had at least one parent who graduated from college, with 99% confidence.
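For a single proportion, the large-sample interval is $\hat{p} \pm z^* \sqrt{\hat{p}(1 - \hat{p})/n}$. The sketch below applies it to the counts in 21.15; it is one way to do the arithmetic in Python, and `one_proportion_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_ci(successes, n, confidence):
    """Large-sample confidence interval for one proportion (hypothetical helper)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Exercise 21.15: 5617 of 17,554 students, 99% confidence
lower, upper = one_proportion_ci(5617, 17554, 0.99)
print(f"{lower:.3f} to {upper:.3f}")
```

The same helper applies to any exercise in this chapter that gives a count of successes and a sample size large enough for the large-sample method.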
21.19 This is an observational study. It would be unethical to randomly assign babies to be born early and have very low birth weights. We first compute a hypothesis test for a null hypothesis of no difference in graduation rates against the alternative $H_a: p_{VLBW} < p_{NBW}$. The observed proportions are $\hat{p}_{VLBW} = 179/242 = 0.7397$ and $\hat{p}_{NBW} = 193/233 = 0.8283$. Further, the pooled estimate of the proportion is $\hat{p} = (179 + 193)/(242 + 233) = 0.7832$.
With test statistic z = -2.34 and P-value 0.0096, we conclude that very low birth weight babies are less likely to have graduated from high school by age 20 than normal weight babies.
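The pooled two-proportion test used in 21.19 can be sketched as follows. This is a minimal Python illustration with a hypothetical helper name, `pooled_two_proportion_z`, and a one-sided (lower-tail) P-value to match the alternative that the VLBW proportion is smaller.

```python
from math import sqrt
from statistics import NormalDist


def pooled_two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the pooled proportion (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Exercise 21.19: 179 of 242 VLBW graduates vs. 193 of 233 NBW graduates
z = pooled_two_proportion_z(179, 242, 193, 233)
p_value = NormalDist().cdf(z)  # lower-tail area, for Ha: p_VLBW < p_NBW
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

For a two-sided alternative, as in the drug-use question of 21.21, the P-value would instead be twice the area beyond |z|.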
21.21 The first of these questions is a test of proportions. We want to test $H_0: p_{VLBW} = p_{NBW}$ against the two-tailed alternative. From the data, we have $\hat{p}_{VLBW} = 37/126 = 0.2937$ and $\hat{p}_{NBW} = 52/124 = 0.4194$. We also have the pooled estimate $\hat{p} = (37 + 52)/(126 + 124) = 0.356$. We compute the test statistic and its P-value below.
For the question about drug use, it seems the VLBW women are less likely to use illegal drugs; z = -2.08 with P-value 0.0375. The question about IQ needs a two-sample t test. We compute the test statistic and the conservative P-value below. The results of this test are very similar. The VLBW women have significantly lower mean IQ (at the 5% level) than women of normal birth weight.
21.23 We want to test $H_0: p_{LF} = p_N$ against $H_a: p_{LF} < p_N$, where $p_{LF}$ is the proportion of women who eat low-fat diets that will develop breast cancer and $p_N$ is the proportion of women who eat normal diets that will develop breast cancer. From the data, we have $\hat{p}_{LF} = 655/19541 = 0.0335$ and $\hat{p}_N = 1072/29294 = 0.0366$. We also have the pooled estimate $\hat{p} = (655 + 1072)/(19541 + 29294) = 0.0354$.
The difference is significant at the 5% level with a P-value of 0.036. The indication is that a low-fat diet reduces the risk of breast cancer.
21.25 We'll manually compute a two-sample t test to determine whether the pets have a higher mean cholesterol level than clinic dogs. With a test statistic of 1.17 and conservative P-value 0.1273, these data do not show that pets have a higher mean cholesterol level.
21.27 With only summary statistics, we'll manually compute the interval using the conservative degrees of freedom. The interval is $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$. This becomes $(193 - 174) \pm 2.074 \sqrt{68^2/26 + 44^2/23} = 19 \pm 33.571$. We are 95% confident the difference in mean cholesterol levels is between -14.57 and 52.57; since 0 is included in the interval, clinic and pet dogs may have the same mean cholesterol level.
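The two-sample t computations in 21.25 and 21.27 share the same standard error, so they can be checked together. The sketch below uses the summary values that appear in the computation above (means 193 and 174, standard deviations 68 and 44, sample sizes 26 and 23) and takes t* = 2.074 from the text rather than computing it; treating group 1 as the pets is an assumption based on the direction of the test in 21.25, and `two_sample_t_summary` is a hypothetical helper name.

```python
from math import sqrt


def two_sample_t_summary(mean1, s1, n1, mean2, s2, n2):
    """Difference, standard error, and t statistic from summary data (hypothetical helper)."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)  # unpooled standard error
    diff = mean1 - mean2
    return diff, se, diff / se


# Group 1 assumed to be pets, group 2 clinic dogs (see lead-in).
diff, se, t_stat = two_sample_t_summary(193, 68, 26, 174, 44, 23)

conservative_df = min(26, 23) - 1  # smaller sample size minus 1
t_star = 2.074                     # 95% critical value quoted in the text
margin = t_star * se
lower, upper = diff - margin, diff + margin

print(f"t = {t_stat:.2f} with {conservative_df} df")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

The conservative degrees of freedom give a wider interval and a larger P-value than the software approximation would, which is why the manual result errs on the side of caution.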
21.39 If a rat is successful in 80 of 80 trials, its success rate is $\hat{p} = 80/80 = 1$. A large-sample confidence interval will be from 1 to 1, since $\hat{p}(1 - \hat{p}) = 0$ (in other words, there is no variability). To find the plus four estimate, add four to the number of trials (this becomes 84) and two to the number of successes (this becomes 82). The plus four estimate is then $\tilde{p} = 82/84 = 0.9762$.
We'll estimate such a rat would be successful at least 94.4% of the time (it can't be right more than 100% of the time) with 95% confidence.
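The plus four adjustment in 21.39 is easy to script. This is a minimal sketch in Python; `plus_four_ci` is a hypothetical helper name, and capping the upper end at 1 reflects the remark that a success rate cannot exceed 100%.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_ci(successes, n, confidence=0.95):
    """Plus four confidence interval for one proportion (hypothetical helper)."""
    # Add 2 successes and 2 failures (4 trials in all) before estimating.
    p_tilde = (successes + 2) / (n + 4)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    # A proportion cannot leave the range [0, 1], so clip the endpoints.
    return p_tilde, max(0.0, p_tilde - margin), min(1.0, p_tilde + margin)


# Exercise 21.39: 80 successes in 80 trials
p_tilde, lower, upper = plus_four_ci(80, 80)
print(f"estimate {p_tilde:.4f}, interval {lower:.3f} to {upper:.3f}")
```

Unlike the large-sample formula, this interval has nonzero width even when every trial is a success.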
21.41 Open data file ex21_41.por (or enter the data). This is a small sample (n = 20), so we check a boxplot for skewness and outliers. Use Graphs, Legacy Dialogs, Boxplot. Enter the variable name and click OK. This boxplot shows that the child whose first word was at 26 months is an outlier. The distance from the median to the right side also indicates a skew. With an outlier and skewness in a sample this small, inference using t distributions is not appropriate for these data.
21.45 An outline of the experiment might be as below.
We want to know which gives the lower score (darker color) on average. Open data file ex21-45. This file has one column for the method and another for the color score. Before doing a test, we should check to see that our data are (roughly) Normal, since these are small samples. We create side-by-side boxplots using Graphs, Legacy Dialogs, Boxplot with a Simple chart of data for Summaries for groups of cases.
Method C has a high outlier (43.13). However, with sample sizes of n = 8, the 1.5 × IQR criterion is not always reliable. Since that value is not particularly extreme, and it certainly appears that Method B gives the darker color (has lower values), we'll proceed to the test. The methods listed in the data file are B and C; SPSS requires integer values for groups. We created a new variable called Numgroup where 0 = B and 1 = C. To perform the test, use Analyze, Compare Means, Independent Samples T-Test.
t-test for Equality of Means (variable: Color)

                                                                                             95% Confidence Interval
                                                                                             of the Difference
                              t       df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower     Upper
Equal variances assumed       -8.794  14      .000             -1.21000         .13759                 -1.50510  -.91490
Equal variances not assumed   -8.794  13.727  .000             -1.21000         .13759                 -1.50565  -.91435
The test statistic is t = -8.79, and SPSS reports the P-value as 0.000 (that is, less than 0.0005). We have clear evidence of a difference between the two methods; method B gives darker colors (on average).
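The SPSS output for 21.45 can be sanity-checked by hand, since the t statistic is the mean difference divided by its standard error. The sketch below does this for the "equal variances assumed" row; the critical value 2.1448 is the standard table value of t* for 14 degrees of freedom at 95% confidence and is supplied here as an assumption rather than read from the output.

```python
# Values taken from the "Equal variances assumed" row of the SPSS output.
mean_difference = -1.21000
std_error = 0.13759
df = 14

t_stat = mean_difference / std_error

# Assumption: two-sided 95% critical value of t with 14 df, from a standard t table.
t_star = 2.1448
lower = mean_difference - t_star * std_error
upper = mean_difference + t_star * std_error

print(f"t = {t_stat:.3f}")
print(f"95% CI: {lower:.4f} to {upper:.4f}")
```

Agreement with the printed t and interval endpoints confirms that the columns of the output have been read in the right order.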
21.47 The data in Table 21.1 give the number of drinks per session by female students whose parents do or do not allow them to drink. In the table, there are 65 students whose parents allow them to drink and 33 who do not (making a total of 98 female students represented); this makes $\hat{p} = 65/98 = 0.6633$.
Assuming these sophomore students are representative of female sophomore students in general, we estimate that between 57.0% and 75.7% have at least one parent who allows them to drink, with 95% confidence.
21.49 For the experiment discussed in part (a), we're interested in the proportion of crackers with cracking. Since there were only 3 microwaved crackers that showed visible cracking, we'll use a plus four confidence interval to estimate the difference in proportions. We add 1 success (a cracked cracker) to each group, and 2 trials to each group, and find $\tilde{p}_1 = 4/67 = 0.060$ and $\tilde{p}_2 = 58/67 = 0.866$. To find the 95% confidence interval, we compute the margin of error as $z^* \sqrt{\tilde{p}_1(1 - \tilde{p}_1)/(n_1 + 2) + \tilde{p}_2(1 - \tilde{p}_2)/(n_2 + 2)}$, or $1.96 \sqrt{(0.06)(0.94)/67 + (0.866)(0.134)/67} = 0.099$. Based on this information, the proportion of crackers with cracking is between 70.6 and 90.5 percentage points lower for microwaved crackers, with 95% confidence.
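The plus four interval for a difference of proportions in 21.49(a) can be sketched as follows. The counts 3 of 65 and 57 of 65 are inferred from the adjusted fractions 4/67 and 58/67 quoted in the solution, and `plus_four_diff_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Plus four interval for p2 - p1: add 1 success and 2 trials per group (hypothetical helper)."""
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    diff = p2 - p1
    return margin, diff - margin, diff + margin


# Exercise 21.49(a): 3 of 65 microwaved crackers cracked vs. 57 of 65 in the other group
# (counts inferred from the adjusted fractions in the solution).
margin, lower, upper = plus_four_diff_ci(3, 65, 57, 65)
print(f"margin {margin:.3f}, interval {lower:.3f} to {upper:.3f}")
```

The adjustment matters here because one group has very few successes, which is exactly the situation where the large-sample interval is unreliable.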
For the experiment in part (b), we'll analyze the data with a two-sample t test, since we're interested in mean pressure to break the crackers. Since the intent of the experiment is to show that microwaving improves resistance to breaking, we have selected the "greater than" alternative hypothesis. The test statistic is t = 6.91 with P-value 0.0000. These data clearly show that microwaving crackers improves their resistance to breaking.
