
1. The data in the Excel spreadsheet linked below give the ages and salaries of the chief executive officers of 59 companies with sales between $5 million and $350 million. The correlation between age and salary can be characterized as: 1. Strong and positive. 2. Strong and negative. 3. Weak and positive. 4. Weak and negative.

By running Correlation under Data Analysis in Excel, we obtain the following output:

          Age        Salary
Age       1
Salary    0.127555   1

Since the r value is 0.127555, which is positive but far from 1, the correlation between these two variables is weak and positive. Therefore, 3 is the right choice.

Age: 53 43 33 45 46 55 41 55 36 45 55 50 49 47 69 51 48 62 45 37 50 50 50 58 53 57 53 61 47 56 44 46 58 48 38 74 60 32 51 50 40 61 63 56 45 61 70 59 57 69 44 56 50 56 43 48 52 62 48

Salary ($thousands): 145 621 262 208 362 424 339 736 291 58 498 643 390 332 750 368 659 234 396 300 343 536 543 217 298 1103 406 254 862 204 206 250 21 298 350 800 726 370 536 291 808 543 149 350 242 198 213 296 317 482 155 802 200 282 573 388 250 396 572

2. A political consultant conducts a survey to determine what position the mayoral candidate she works for should take on a proposed smoking ban in restaurants. Which of the following survey questions will deliver an unbiased response? 1. Should the city ban smoking in restaurants to protect our children from second-hand smoke? 2. Should tobacco smoke, a known cause of lung cancer, be banned from public spaces such as restaurants? 3. Does the city have the right to restrict recreational activities, such as moderate consumption of alcohol or tobacco, on the premises of privately-owned businesses? 4. None of the above.

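For 1, the phrase "to protect our children from second-hand smoke" pushes the respondent toward supporting the ban. For 2, "a known cause of lung cancer" does the same. For 3, the wording "recreational activities", "moderate consumption" and "privately-owned businesses" pushes the respondent toward opposing the ban. All three questions are leading, so none will deliver an unbiased response. Therefore, 4 is the right choice.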

3. A nutrition researcher wants to determine the mean fat content of hen's eggs. She collects a sample of 40 eggs. She calculates a mean fat content of 23 grams, with a sample standard deviation of 8 grams. From these statistics she calculates a 90% confidence interval of [20.9 grams, 25.1 grams]. What can the researcher do to decrease the width of the confidence interval? 1. Increase the confidence level. 2. Decrease the confidence level. 3. Decrease the sample size. 4. None of the above.

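The width of a confidence interval is twice the margin of error, which is critical value*standard deviation/sqrt(n), where n is the sample size. When the confidence level increases, the critical value increases and so does the width, so choice 1 is not right. When the sample size decreases, the margin of error increases and thus the width increases, so choice 3 is not right. Decreasing the confidence level lowers the critical value and narrows the interval. Therefore, 2 is the right choice.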
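
The effect of the confidence level and the sample size on the interval width can be checked numerically. The sketch below is only an illustration: it assumes a normal (z) critical value rather than a t value, and the helper name `ci_width` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sd, n, confidence):
    """Width of a two-sided z-based confidence interval for a mean."""
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sd / sqrt(n)


sd, n = 8, 40  # values from question 3

w80 = ci_width(sd, n, 0.80)        # lower confidence level
w90 = ci_width(sd, n, 0.90)        # the researcher's interval
w95 = ci_width(sd, n, 0.95)        # higher confidence level
w90_small = ci_width(sd, 20, 0.90)  # smaller sample (hypothetical n = 20)

print(f"80%: {w80:.2f}  90%: {w90:.2f}  95%: {w95:.2f}  90%, n=20: {w90_small:.2f}")
```

Under these assumptions the 90% width comes out close to the 4.2 grams of the interval in the question, the 80% interval is narrower, and both the 95% interval and the smaller-sample interval are wider.
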
4. In a random sample of 321 senior citizens, 61 were found to own a home computer. Based on this sample, the 95% confidence interval for the proportion of computer-owners among senior citizens is: 1. [2.6%; 7.4%]. 2. [13.4%; 24.6%]. 3. [14.7%; 23.3%]. 4. The answer cannot be determined from the information given.

The critical value for a 95% confidence interval is 1.96.
Proportion: p=61/321=0.19
Margin of error=1.96*sqrt(0.19*(1-0.19)/321)=0.043
Upper limit: 0.19+0.043=0.233
Lower limit: 0.19-0.043=0.147
Therefore, 3 is the right choice.
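
The same arithmetic can be written as a short Python sketch. The function name `proportion_ci` is hypothetical, and the critical value 1.96 is taken from the solution above rather than computed.

```python
from math import sqrt


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    p = successes / n
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


low, high = proportion_ci(61, 321)  # 61 computer owners out of 321
print(f"[{low:.1%}; {high:.1%}]")
```
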
5. Preliminary estimates suggest that about 58% of students at a state university favor implementing an honor code. To obtain a 95% confidence interval for the proportion of all students at the university favoring the honor code, what is the minimum sample size needed if the total width of the confidence interval must be less than 5 percentage points (i.e., the confidence interval should extend at most 2.5 percentage points above and below the sample proportion)? 1. 375. 2. 264. 3. 1,498. 4. The answer cannot be determined from the information given.

The critical value for a 95% confidence interval is 1.96.
Margin of error=critical value*standard error
0.025=1.96*sqrt(0.58*(1-0.58)/n)
n=(1.96/0.025)^2*0.58*0.42=1497.3, which rounds up to 1,498.
Therefore, 3 is the right choice.
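
Solving the margin-of-error equation for n and rounding up can be sketched as follows. The helper name `min_sample_size` is an assumption made for this illustration; the inputs are the ones given in the question.

```python
from math import ceil


def min_sample_size(p, margin, z=1.96):
    """Smallest n whose margin of error z*sqrt(p*(1-p)/n) does not exceed `margin`."""
    # Round up, because a fractional respondent is not possible and
    # rounding down would leave the interval slightly too wide.
    return ceil((z / margin) ** 2 * p * (1 - p))


n_needed = min_sample_size(0.58, 0.025)
print(n_needed)
```
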
6. In a survey of twelve Harbor Business School graduates, the mean starting salary was $93,000, with a standard deviation of $17,000. The 95% confidence interval for the average starting salary among all Harbor graduates is: 1. [$83,382; $102,618]. 2. [$82,727; $103,327]. 3. [$82,199; $103,801]. 4. [$59,000; $127,000].

With 12-1=11 degrees of freedom, the critical t value for a 95% confidence interval is 2.200985.
Margin of error=2.200985*17000/sqrt(12)=10801
Upper limit: 93000+10801=103801
Lower limit: 93000-10801=82199
Therefore, 3 is the right choice.
7. In a survey of 53 randomly selected patrons of a shopping mall, the mean amount of currency carried is $42, with a standard deviation of $78. What is the 95% confidence interval for the mean amount of currency carried by mall patrons? 1. [$39.1; $44.9]. 2. [$24.4; $59.6]. 3. [$21.0; $63.0]. 4. [$14.4; $69.6].

The critical value for a 95% confidence interval is 1.96.
Margin of error=1.96*78/sqrt(53)=21.0
Upper limit: 42+21.0=63.0
Lower limit: 42-21.0=21.0
Therefore, 3 is the right choice.
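
The interval calculations in questions 6 and 7 differ only in the critical value used. A minimal sketch, assuming the critical values are supplied by the reader (2.200985 and 1.96 are the ones quoted in the solutions) and using a hypothetical helper `mean_ci`:

```python
from math import sqrt


def mean_ci(mean, sd, n, critical):
    """Confidence interval for a mean, given a t or z critical value."""
    margin = critical * sd / sqrt(n)
    return mean - margin, mean + margin


# Question 6: small sample, t critical value with 11 degrees of freedom.
low6, high6 = mean_ci(93_000, 17_000, 12, critical=2.200985)

# Question 7: larger sample, normal critical value as used in the solution.
low7, high7 = mean_ci(42, 78, 53, critical=1.96)

print(f"Q6: [{low6:.0f}; {high6:.0f}]")
print(f"Q7: [{low7:.1f}; {high7:.1f}]")
```
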
8. A filling machine in a brewery is designed to fill bottles with 355 ml of hard cider. In practice, however, volumes vary slightly from bottle to bottle. In a sample of 49 bottles, the mean volume of cider is found to be 354 ml, with a standard deviation of 3.5 ml. At a significance level of 0.01, which conclusion can the brewer draw? 1. The true mean volume of all bottles filled is 354 ml. 2. The machine is not filling bottles to an average volume of 355 ml. 3. There is not enough evidence to indicate that the machine is not filling bottles to an average volume of 355 ml. 4. The machine is filling bottles to an average volume of 355 ml.

Ho: μ=355
Ha: μ≠355
This is a two-tailed t test. The degrees of freedom are 49-1=48, and at the 0.01 significance level the critical t values are ±2.68.
Test value t=(354-355)/(3.5/sqrt(49))=-2
Since -2.68<-2<2.68, we cannot reject Ho. Therefore, 3 is the right choice.
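
As a quick check of the test value, the calculation can be sketched in Python. The function name `one_sample_t` is hypothetical, and the critical value 2.68 is copied from the solution rather than computed.

```python
from math import sqrt


def one_sample_t(sample_mean, mu0, sd, n):
    """t statistic for testing a sample mean against a hypothesized mean mu0."""
    return (sample_mean - mu0) / (sd / sqrt(n))


t_stat = one_sample_t(354, 355, 3.5, 49)

T_CRITICAL = 2.68  # two-tailed, 0.01 level, 48 degrees of freedom (from the solution)
reject_h0 = abs(t_stat) > T_CRITICAL

print(t_stat, reject_h0)
```
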
9. To conduct a one-sided hypothesis test of the claim that houses located on corner lots (corner-lot houses) have higher average selling prices than those located on non-corner lots, the following alternative hypothesis should be used: 1. The average selling price of a corner-lot house is higher than it is commonly believed to be. 2. The average selling price of a corner-lot house is higher than the average selling price of all houses. 3. The average selling price of a corner-lot house is the same as the average selling price of a house not located on a corner lot. 4. The average selling price of a corner-lot house is higher than the average selling price of a house not located on a corner lot.

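Ho: μ1=μ2
Ha: μ1>μ2
Here μ1 is the average selling price of corner-lot houses and μ2 is the average selling price of houses not located on corner lots. Therefore, 4 is the right choice.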



10. The data in the Excel spreadsheet linked below indicate the selling prices of houses located on corner lots ("corner-lot houses") and of houses not located on corner lots. Conduct a one-sided hypothesis test of the claim that corner-lot houses have higher average selling prices than those located on non-corner lots. Using a 99% confidence level, which of the following statements do the data support? 1. Upscale, expensive neighborhoods have more street corners. 2. The average selling price of a corner-lot house is higher than that of the average house not located on a corner lot. 3. The average selling price of a corner-lot house is no more than that of the average house not located on a corner lot. 4. There is not enough evidence to support the claim that the average selling price of a corner-lot house is higher than that of the average house not located on a corner lot.

After running a t test assuming equal variances, we obtain the output below. The one-tail P value is about 0.043. Since 0.043>0.01, we cannot reject the null hypothesis at the 99% confidence level. Therefore, 4 is the right choice.

t-Test: Two-Sample Assuming Equal Variances

                                Corner (in $hundreds)    Non Corner (in $hundreds)
Mean                            1146.725                 1019.103896
Variance                        160686.8712              132807.9364
Observations                    40                       77
Pooled Variance                 142262.5317
Hypothesized Mean Difference    0
df                              115
t Stat                          1.736039721
P(T<=t) one-tail                0.042617538
t Critical one-tail             1.65821183
P(T<=t) two-tail                0.085235077
t Critical two-tail             1.980807541
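
The pooled variance, degrees of freedom and t statistic in the Excel output can be reproduced from the summary statistics alone. This is a sketch under that assumption; the function name `pooled_t` is made up for the example, and the P value is not recomputed because it requires the t distribution.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic assuming equal variances, from summary statistics."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, pooled_var, df


t_stat, pooled_var, df = pooled_t(
    1146.725, 160686.8712, 40,     # corner-lot houses
    1019.103896, 132807.9364, 77,  # non-corner-lot houses
)
print(f"t = {t_stat:.4f}, pooled variance = {pooled_var:.2f}, df = {df}")
```
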

Corner-lot House Price (in $hundreds): 2150 1999 1800 1375 1250 1110 1139 995 900 1695 1553 1300 1020 1020 925 725 1299 1250 1080 1050 835 805 750 773 1295 975 700 2100 600 1844 699 1330 1129 1050 1000 1030 940 874 766 739

Non-corner Lot House Price (in $hundreds): 2050 2080 2150 1900 1560 1450 1449 1270 1235 1170 1180 1155 995 975 975 960 860 1250 922 899 850 876 890 870 700 720 720 749 731 670 2150 1599 1350 1239 1200 1125 1100 1049 955 934 875 889 855 810 799 759 755 750 730 729 710 690 670 619 939 820 780 770 620 540 1070 725 660 580 1580 1160 1109 1050 1045 1020 975 950 920 945 872 870 869

11. Two semiconductor factories are being compared to see if there is a difference in the average defect rates of the chips they produce. In the first factory, 250 chips are sampled. In the second factory, 350 chips are sampled. The proportions of defective chips are 4.0% and 6.0%, respectively. Using a confidence level of 95%, which of the following statements is supported by the data? 1. There is not sufficient evidence to show a significant difference in the average defect rates of the two factories. 2. There is a significant difference in the average defect rates of the two factories. 3. The first factory's average defect rate is lower than the second factory's on 95 out of 100 days of operation. 4. None of the above.

Ho: p1=p2
Ha: p1≠p2
This is a two-tailed test. The degrees of freedom are 250+350-2=598, so at the 0.05 significance level the critical values are approximately ±1.96.
Overall proportion p=(250*0.04+0.06*350)/(250+350)=0.05167
Test value t=(0.04-0.06)/sqrt(0.05167*(1-0.05167)*(1/250+1/350))=-1.09
Since -1.96<-1.09<1.96, we cannot reject the null hypothesis. Therefore, 1 is the right choice.
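
The pooled-proportion test value can be checked with a few lines of Python. The function name `two_proportion_stat` is hypothetical, and 1.96 is the critical value used in the solution.

```python
from math import sqrt


def two_proportion_stat(p1, n1, p2, n2):
    """Test statistic for the difference of two proportions using the pooled proportion."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


stat = two_proportion_stat(0.04, 250, 0.06, 350)
significant = abs(stat) > 1.96  # two-tailed test at the 0.05 level

print(f"{stat:.2f}", significant)
```
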
12. The regression analysis below relates average annual per capita beef consumption (in pounds) and the independent variable "average annual beef price" (in dollars per pound). The coefficient on beef price tells us that:

[Regression output: Beef Consumption and Price]

1. For every price increase of $1, average beef consumption decreases by 9.31 pounds. 2. For every price increase of $1, average beef consumption increases by 9.31 pounds. 3. For every price increase of $9.31, average beef consumption decreases by 1 pound. 4. For every price increase of $9.31, average beef consumption increases by 1 pound.

The coefficient -9.31 means that for every one-dollar increase in price, beef consumption decreases by 9.31 pounds. Therefore, 1 is the right choice.
13. The regression analysis below relates average annual per capita beef consumption (in pounds) and the independent variable "average annual beef price" (in dollars per pound). In a year in which the average price of beef is at $3.51 per pound, we can expect average annual per capita beef consumption to be approximately:

[Regression output: Beef Consumption and Price]

1. 55.2 pounds 2. 52.6 pounds 3. 53.6 pounds 4. 117.9 pounds

Based on the regression output, the regression equation is y=-9.31x+85.24, where x is the beef price and y is the beef consumption.
When x=3.51, y=-9.31*3.51+85.24=52.56≈52.6.
Therefore, 2 is the right choice.
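
The fitted line can be evaluated directly. The sketch below assumes the slope and intercept quoted in the solution (-9.31 and 85.24); the function name `predict_beef_consumption` is made up for the example.

```python
def predict_beef_consumption(price, slope=-9.31, intercept=85.24):
    """Predicted annual per capita beef consumption (pounds) at a given price ($/lb)."""
    return intercept + slope * price


predicted = predict_beef_consumption(3.51)

# The slope is the change in predicted consumption for a $1 price increase.
change_per_dollar = predict_beef_consumption(4.51) - predict_beef_consumption(3.51)

print(f"{predicted:.1f} pounds, {change_per_dollar:.2f} pounds per extra dollar")
```

The second quantity illustrates the interpretation used in question 12: a $1 price increase lowers predicted consumption by 9.31 pounds.
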
14. The regression analysis below relates average annual per capita beef consumption (in pounds) and the independent variable "average annual per capita pork consumption" (in pounds). At what level is the coefficient of the independent variable pork consumption significant?

[Regression output: Beef Consumption and Pork Consumption]

1. 0.10. 2. 0.05. 3. 0.01. 4. None of the above.

Since the P value for the independent variable is 0.2563, the coefficient is not significant at any level below 0.2563, which rules out 0.10, 0.05 and 0.01. Therefore, 4 is the right choice.
15. The regression analysis below relates average annual per capita beef consumption (in pounds) and the independent variable "average annual per capita pork consumption" (in pounds). Which of the following statements is true?

[Regression output: Beef Consumption and Pork Consumption]

1. Beef consumption can never be less than 65.09 pounds. 2. Beef consumption can never be greater than 65.09 pounds. 3. The y-intercept of the regression line is 65.09 pounds. 4. The x-intercept of the regression line is 65.09 pounds.

Based on the output, we have the linear regression equation: y=-0.19x+65.09. Therefore, 3 is the right choice.
16. The regression analysis at the bottom relates average annual per capita beef consumption (in pounds) and the independent variables "average annual per capita pork consumption" (in pounds) and "average annual beef price" (in dollars per pound). Which of the independent variables is significant at the 0.01 level?

[Regression output: Beef Consumption, Pork Consumption, and Beef Price]

1. Beef price only. 2. Pork consumption only. 3. Both independent variables. 4. Neither independent variable.

Since the p values for avg beef price and avg pork consumption are 0.0000, they are smaller than 0.01. Therefore, both of them are significant. In this case, 3 is the right choice.
