
Probability Distribution

1. Create boxplots for both X and Y. Are there any outliers? No outliers identified. See boxplot below.

[Figure: Boxplot of X, Y]

2. Make a Scatterplot with Regression. Does there appear to be a linear relationship? What two points appear to be potential outliers? Yes, there does appear to be a linear relationship. Two points, row 18 (X=9, Y=6) and row 19 (X=5, Y=9), are possible outliers.
[Figure: Scatterplot of Y vs X]

3. Check for outliers using the semi-studentized method. Are there any outliers? What are the absolute values of the semi-studentized residuals you identified in question 2?

No, since all semi-studentized residuals have an absolute value less than four. For the two points from question 2, the absolute semi-studentized residuals are 1.97819 (row 18) and 2.06186 (row 19).

4. Do a check of normality by using a probability plot of the residuals. Include: a) the null and alternative hypotheses, b) the p-value of the test, c) your decision based on a 0.05 level of significance, and d) Minitab copy of your plot.

a) Ho: The residuals come from a normal distribution
   Ha: The residuals do not come from a normal distribution
b) The p-value is 0.942
c) Since the p-value is greater than 0.05, we fail to reject Ho and conclude that the assumption of normality is plausible.
d)
[Figure: Probability Plot of RESI1, Normal - 95% CI. Horizontal axis: RESI1; vertical axis: Percent. Mean -2.13163E-15, StDev 1.556, N 20, AD 0.158, P-Value 0.942]

5. Do a check of equal variances by performing a Modified Levene Test. Include: a) the null and alternative hypotheses, b) the p-value of the test, c) your decision based on a 0.05 level of significance, and d) Minitab copy of your plot.

a) Ho: The variances are equal
   Ha: The variances are not equal
b) The p-value is 0.533
   NOTE: Remember that Levene's test is more robust against violations of normality than the F-test, making Levene's test a better overall test of equal variances. The only condition for Levene's test is that the variable being tested is continuous.
c) Since the p-value is greater than 0.05, we fail to reject Ho and conclude that the assumption of equal variances is plausible.
d)

[Figure: Test for Equal Variances for RESI1. Panels: 95% Bonferroni Confidence Intervals for StDevs by group; RESI1 by group. F-Test: Test Statistic 0.83, P-Value 0.861. Levene's Test: Test Statistic 0.40, P-Value 0.533]

6. Perform a Lack of Fit Test to check if the linear regression function is appropriate. Include: a) the null and alternative hypotheses, b) the correct F-statistic and p-value of the test, c) your decision based on a 0.05 level of significance, and d) Minitab copy of your ANOVA output.

a) Ho: The linear regression function is appropriate
   Ha: The linear regression function is not appropriate
b) The F-statistic is 0.95 and the p-value is 0.507
c) Since the p-value is greater than 0.05, we fail to reject Ho and conclude that it is plausible that the linear regression function is appropriate.
d)

Analysis of Variance

Source          DF       SS      MS      F      P
Regression       1   70.769  70.769  27.67  0.000
Residual Error  18   46.031   2.557
  Lack of Fit    7   17.364   2.481   0.95  0.507
  Pure Error    11   28.667   2.606
Total           19  116.800
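The F-statistic reported in part b) can be checked by hand from the sums of squares in the ANOVA output. The following is a minimal Python sketch of that arithmetic only; the variable names are illustrative and not part of the Minitab output, and the p-value is read from the output rather than recomputed.

```python
# Values copied from the Minitab ANOVA output for question 6.
ss_lack_of_fit, df_lack_of_fit = 17.364, 7
ss_pure_error, df_pure_error = 28.667, 11
ss_residual_error = 46.031

# Lack of fit and pure error should add up to the residual error sum of squares.
partition_gap = abs((ss_lack_of_fit + ss_pure_error) - ss_residual_error)

# Mean squares are sums of squares divided by their degrees of freedom.
ms_lack_of_fit = ss_lack_of_fit / df_lack_of_fit
ms_pure_error = ss_pure_error / df_pure_error

# The lack-of-fit F-statistic is the ratio of the two mean squares.
f_lack_of_fit = ms_lack_of_fit / ms_pure_error

print(f"SSLF + SSPE differs from SSE by {partition_gap:.6f}")
print(f"MSLF = {ms_lack_of_fit:.3f}")
print(f"MSPE = {ms_pure_error:.3f}")
print(f"F    = {f_lack_of_fit:.2f}")
```

The note that this is the "correct" F-statistic matters here: the lack-of-fit test uses the F in the Lack of Fit row (MSLF/MSPE), not the regression F of 27.67.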

7. Even though you may not have found any assumption violations, perform a Box-Cox analysis on Y to see if any transformation is suggested. Include a) the estimated and rounded lambda values, b) the interpretation of this value, and c) the Box-Cox plot. NOTE: This can only be done using Minitab Version 15 or higher, i.e., student version 14 does not contain the Box-Cox program.

a) The estimated lambda is 1.07 and the rounded lambda is 1.00
b) The rounded value implies raising Y to the power of 1.00, which means no transformation is necessary
c)

[Figure: Box-Cox Plot of Y. Horizontal axis: Lambda. Lambda (using 95.0% confidence): Estimate 1.07, Lower CL 0.34, Upper CL 1.89, Rounded Value 1.00]

8. Find Bonferroni joint confidence intervals for Bo and B1 with a 90% family confidence level and include your interpretation of these intervals. You can use the Minitab output to find s{bo} and s{b1}.

With a sample size, n, of 20, the degrees of freedom are n - 2, or 18. Since we are interested in two joint intervals, for Bo and B1, g is equal to 2 for our Bonferroni correction. The intervals are

bo +/- B*s{bo} and b1 +/- B*s{b1}, where B = t(1 - alpha/4; n - 2).

From the t-table, the Bonferroni multiplier using 18 DF and 1 - alpha/4 = 0.975 for an alpha of 0.10 is 2.101. Plugging into the equations:

For Bo: 1.377 +/- 2.101*0.8442 = 1.377 +/- 1.774, so -0.397 <= Bo <= 3.151
For B1: 0.8652 +/- 2.101*0.1645 = 0.8652 +/- 0.3456, so 0.5196 <= B1 <= 1.2108

Interpretation: We are 90% confident that both intervals contain the true intercept and slope.
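The interval arithmetic for question 8 can be reproduced with a short Python sketch. It assumes the estimates, standard errors, and the t-table multiplier of 2.101 exactly as quoted in the solution; the function name `bonferroni_interval` is hypothetical and used only for illustration.

```python
# Estimates and standard errors as quoted in the solution (from the Minitab output).
b0, se_b0 = 1.377, 0.8442
b1, se_b1 = 0.8652, 0.1645

# Bonferroni multiplier B = t(1 - alpha/4; n - 2) for alpha = 0.10 and n = 20.
# Taken from the t-table as in the text; it is not recomputed here.
B = 2.101


def bonferroni_interval(estimate, std_error, multiplier):
    """Return (lower, upper) for estimate +/- multiplier * std_error."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


b0_low, b0_high = bonferroni_interval(b0, se_b0, B)
b1_low, b1_high = bonferroni_interval(b1, se_b1, B)

print(f"{b0_low:.3f} <= Bo <= {b0_high:.3f}")
print(f"{b1_low:.4f} <= B1 <= {b1_high:.4f}")
```

Because both intervals share the same multiplier, changing the family confidence level only requires looking up a new value of B.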

9. Use Minitab to find Bonferroni simultaneous confidence intervals for new X observations of 0 and 10 using a 95% family confidence level. Include the output and your interpretation of these intervals.
Follow-up question 1: What is the interpretation of the level of confidence for the confidence intervals in the output?
Follow-up question 2: Can you think of a reason why these new X values might not be reliable?
Follow-up question 3: Show mathematically how one would use the Minitab output to get the simultaneous level of confidence for new observations.

Interpretation: We are 95 percent confident in both of the following intervals being correct: that the reading achievement stanine for a reading readiness stanine of 0 would be from -0.687 to 3.441 and the reading achievement stanine for a reading readiness stanine of 10 would be from 7.706 to 12.351

Predicted Values for New Observations

New Obs     Fit  SE Fit        97.5% CI            97.5% PI
      1   1.377   0.844  (-0.687,  3.441)  (-3.044,  5.798)
      2  10.029   0.950  ( 7.706, 12.351)  ( 5.481, 14.576)X

X denotes a point that is an outlier in the predictors.

Values of Predictors for New Observations

New Obs     X
      1   0.0
      2  10.0

Follow-up 1: The 97.5% level of confidence is how confident we are in any ONE of the intervals being correct.

Follow-up 2: The x-values used in this analysis ranged from 1 to 9, so applying the regression equation to values outside this range raises the possibility of improper extrapolation.

Follow-up 3: This 97.5% level of confidence is found using 1 - alpha/g = 0.975. For this particular problem we are interested in two simultaneous intervals, so g = 2. Using algebra to find alpha, we get alpha/g = 0.025, resulting in an alpha of 0.05, or a 95% simultaneous level of confidence. NOTE: Software systems by default use alpha/2 when constructing confidence intervals, which is why when solving this equation we do not use alpha/2 but instead alpha/g. If one were to use alpha/2g based on the level of confidence in the output, one would divide by 2 twice.

10. What is the value and interpretation of the coefficient of determination? Using the output and correct values show two ways this value can be calculated.

From the output the coefficient of determination, or R-squared, is 60.6%, meaning that 60.6 percent of the variation in reading achievement stanines can be explained by reading readiness stanines.

S = 1.59914   R-Sq = 60.6%   R-Sq(adj) = 58.4%

Analysis of Variance

Source          DF       SS      MS      F      P
Regression       1   70.769  70.769  27.67  0.000
Residual Error  18   46.031   2.557
Total           19  116.800

Two possible methods for calculating R-squared are:
1) (SSR/SST)*100% = (70.769/116.8)*100% = 60.6%
2) [1 - (SSE/SST)]*100% = [1 - (46.031/116.8)]*100% = 60.6%
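As a quick check, both calculations can be run in Python using the sums of squares from the ANOVA output. This is only a sketch of the arithmetic, and the variable names are illustrative.

```python
# Sums of squares copied from the ANOVA output for question 10.
ssr = 70.769   # Regression
sse = 46.031   # Residual Error
sst = 116.800  # Total

# Method 1: share of total variation explained by the regression.
r_sq_from_ssr = ssr / sst * 100

# Method 2: one minus the share of total variation left unexplained.
r_sq_from_sse = (1 - sse / sst) * 100

print(f"R-Sq via SSR/SST     = {r_sq_from_ssr:.1f}%")
print(f"R-Sq via 1 - SSE/SST = {r_sq_from_sse:.1f}%")
```

The two methods agree because SSR + SSE = SST, so SSR/SST and 1 - SSE/SST are the same quantity written two ways.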

11. From our in-class example of Sales-Advertising, the test results were as follows: the intercept had T = -0.16 and p-value of 0.885; the slope test had T = 3.66 and p-value of 0.035; and the ANOVA test had F = 13.66 and p-value of 0.035. Use Minitab to find these p-values by going to Calc > Probability Distributions and selecting either T or F as appropriate. Then select the radio button for Cumulative Probability, enter the appropriate degrees of freedom for the test, click the radio button for Input Constant, and enter the value of the test statistic in the text box. Click OK. Show how one gets from this output to the p-value. Include a copy of the Minitab output for each test.

Test of Intercept: From the output we would take 0.441524 and multiply by two to get 0.883, which is approximately 0.885 due to rounding.

Cumulative Distribution Function

Student's t distribution with 3 DF

    x  P( X <= x )
-0.16     0.441524

Test of Slope: From the output we would subtract 0.982377 from 1 and then double this result, getting 0.017623*2 = 0.035246, which is approximately 0.035.

Cumulative Distribution Function

Student's t distribution with 3 DF

   x  P( X <= x )
3.66     0.982377

F-Test: From this output we would simply subtract 0.965626 from 1 to get 0.034374, which is approximately 0.035.

Cumulative Distribution Function

F distribution with 1 DF in numerator and 3 DF in denominator

    x  P( X <= x )
13.66     0.965626
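The steps from cumulative probability to p-value can also be sketched outside Minitab. The Python example below is not part of the original solution and rests on two standard results that are assumptions here: the closed-form CDF of the t distribution with 3 degrees of freedom, and the fact that an F statistic with 1 numerator DF is the square of a t statistic with the same denominator DF. The function names are hypothetical.

```python
import math


def t3_cdf(x):
    """CDF of Student's t distribution with 3 degrees of freedom (closed form)."""
    u = x / math.sqrt(3)
    return 0.5 + (math.atan(u) + u / (1 + u * u)) / math.pi


def two_sided_t_p(t_stat):
    """Two-sided p-value: twice the tail area beyond |t_stat|."""
    return 2 * (1 - t3_cdf(abs(t_stat)))


def f_1_3_p(f_stat):
    """Upper-tail p-value for F(1, 3), using F = T^2 when the numerator DF is 1."""
    return two_sided_t_p(math.sqrt(f_stat))


# Test statistics from the Sales-Advertising example.
p_intercept = two_sided_t_p(-0.16)
p_slope = two_sided_t_p(3.66)
p_anova = f_1_3_p(13.66)

print(f"P(T <= -0.16) = {t3_cdf(-0.16):.6f}, intercept p-value = {p_intercept:.3f}")
print(f"P(T <= 3.66)  = {t3_cdf(3.66):.6f}, slope p-value = {p_slope:.3f}")
print(f"F-test p-value = {p_anova:.3f}")
```

For a negative t statistic the lower tail is doubled directly, while for a positive one the cumulative probability is first subtracted from 1; taking the absolute value, as in `two_sided_t_p`, covers both cases. The F-test is one-sided, so its upper tail is not doubled.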
