Deviations from the means: $x_d = x - \bar{x}$, $y_d = y - \bar{y}$.

  x_d    y_d    x_d^2    y_d^2    (x_d)(y_d)
   -6     -5       36       25        30
   -4     -4       16       16        16
   -2     -2        4        4         4
    0      0        0        0         0
    2      2        4        4         4
    4      3       16        9        12
    6      6       36       36        36
Total             112       94       102

$r = \frac{\sum x_d y_d}{\sqrt{(\sum x_d^2)(\sum y_d^2)}} = \frac{102}{\sqrt{(112)(94)}} = 0.994093$

$b = \frac{\sum x_d y_d}{\sum x_d^2} = \frac{102}{112} = 0.910714$

$a = \bar{y} - b\bar{x} = 0.625002$

Interpretations:
a = 0.625002: when x is zero, the estimated value of y is on average 0.625002 units.
b = 0.910714: if x increases by 1 unit, y increases on average by 0.910714 units.
$r^2 = 0.988222$: 98.82% of the total variation in y is explained by the independent variable x; only 1.18% is left unexplained, which is due to other variables. The model fits very well.

Prediction: $\hat{y} = 3.630359$
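The sums, r, b, and r squared in this example can be checked with a few lines of Python. This is only a minimal sketch: it starts from the deviation values listed above (the raw x and y data are not reproduced here), and the variable names are illustrative.

```python
import math

# Deviations from the means, copied from the worked example.
xd = [-6, -4, -2, 0, 2, 4, 6]
yd = [-5, -4, -2, 0, 2, 3, 6]

# Sums of squares and cross-products of the deviations.
sxx = sum(d * d for d in xd)
syy = sum(d * d for d in yd)
sxy = sum(p * q for p, q in zip(xd, yd))

r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
b = sxy / sxx                   # slope of the regression of y on x
r_squared = r * r               # coefficient of determination

print(sxx, syy, sxy)
print(round(r, 6), round(b, 6), round(r_squared, 6))
```

The intercept is not recomputed here because it needs the means of x and y, which this sketch does not assume.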
[Figure: scatter diagrams of Y against X illustrating zero correlation (r = 0), a positive correlation (r = .60), and a negative correlation (r = -.52).]
[Figure: scale of the strength and direction of the correlation coefficient, running from -1.00 (perfect negative correlation) through strong, moderate, and weak negative correlation, to no correlation, then through weak, moderate, and strong positive correlation, to 1.00 (perfect positive correlation).]
Example 2: The owner of Maumee Motors wants to study the relationship between the age of a car and its selling price. Listed below is a random sample of 12 used cars sold at Maumee Motors during the last year.
a. If we want to estimate selling price based on the age of the car, which variable is the dependent variable and which is the independent variable?
b. Draw a scatter diagram.
c. Determine the coefficient of correlation.
d. Determine the coefficient of determination.
e. Interpret these statistical measures. Does it surprise you that the relationship is inverse?
18. a. Determine the regression equation.
b. Estimate the selling price of a 10-year-old car.
c. Interpret the regression equation.

Car     Age (yrs) X    Selling price ($000) Y
 1           9               8.1
 2           7               6
 3          11               3.6
 4          12               4
 5           8               5
 6           7              10
 7           8               7.6
 8          11               8
 9          10               8
10          12               6
11           6               8.6
12           6               8
Mean         8.9167          6.9083
$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} = \frac{-26.292}{\sqrt{(54.92)(42.59)}} = -0.544$

$r^2 = 0.2959$

$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{-26.292}{54.92} = -0.479$

$\hat{y} = 11.18 - 0.479x$

Fitted values and residuals, $e = y - \hat{y}$:

Car     Fitted value    Residual e    e^2
 1         6.87            1.23       1.52
 2         7.83           -1.83       3.33
 3         5.91           -2.31       5.34
 4         5.43           -1.43       2.05
 5         7.35           -2.35       5.51
 6         7.83            2.17       4.73
 7         7.35            0.25       0.06
 8         5.91            2.09       4.36
 9         6.39            1.61       2.59
10         5.43            0.57       0.32
11         8.30            0.30       0.09
12         8.30           -0.30       0.09
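The Example 2 figures can be reproduced from the age and price data with a short Python sketch. The helper name `fit_line` is hypothetical and introduced only for illustration; it applies the same deviation formulas used in the hand calculation.

```python
import math

def fit_line(x, y):
    """Least-squares fit of y on x; returns (intercept, slope, r)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

# Age (years) and selling price ($000) of the 12 cars in Example 2.
age = [9, 7, 11, 12, 8, 7, 8, 11, 10, 12, 6, 6]
price = [8.1, 6, 3.6, 4, 5, 10, 7.6, 8, 8, 6, 8.6, 8]

a, b, r = fit_line(age, price)
price_at_10 = a + b * 10  # estimated price ($000) of a 10-year-old car

print(round(a, 2), round(b, 3), round(r, 3), round(price_at_10, 2))
```

The estimate for a 10-year-old car should agree with the fitted value already listed for car 9, which is also 10 years old.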
Example 4. Associated with a job are two random variables: CPU time required (Y) and the number of disk I/O operations (X). Given the following data, compute the sample correlation coefficient.

Number (X)    398   390   410   502   590   305   210   252   398   392
Time (Y)       40    38    42    50    60    30    20    25    40    39

a. Draw a scatter diagram from these data. Does a linear fit seem reasonable? Assuming we wish to predict the CPU time requirement given an I/O request count, perform a linear regression:
y = a + bx
Compute point estimates of a and b as well as 90 percent confidence intervals.
b. Next suppose we want to predict the I/O request count given a CPU time requirement. Perform a linear regression of X on Y, and calculate 90 percent confidence intervals for c and d with the regression line:
x = c + dy
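$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} = \frac{11593.2}{\sqrt{(111464.1)(1208.4)}} = 0.998919$, $r^2 = 0.99784$

$b_{yx} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{11593.2}{111464.1} = 0.104008$, $a = \bar{y} - b\bar{x} = 38.4 - 0.104008 \times 384.7 = -1.612$

Thus, the estimated regression line is $\hat{y} = -1.612 + 0.104008x$.

[Figure: scatter diagram of Time (Y) against Number (X).]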
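As a check on the sums used above, the following Python sketch recomputes them from the I/O and CPU time data and fits both regression lines (Y on X and X on Y). The variable names are illustrative, and the confidence intervals asked for in the question are not computed here.

```python
import math

# Number of disk I/O operations (X) and CPU time (Y) from the data table.
x = [398, 390, 410, 502, 590, 305, 210, 252, 398, 392]
y = [40, 38, 42, 50, 60, 30, 20, 25, 40, 39]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)

# Regression of Y on X: y = a + b x
b = sxy / sxx
a = y_bar - b * x_bar

# Regression of X on Y: x = c + d y
d = sxy / syy
c = x_bar - d * y_bar

print(round(sxy, 1), round(sxx, 1), round(syy, 1))
print(round(r, 6), round(a, 3), round(b, 6), round(c, 5), round(d, 5))
```

Note that the two lines are not inverses of each other: b times d equals r squared, so they coincide only when the correlation is perfect.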
b. Regression of X on Y, x = c + dy:

$\hat{x} = 16.29643 + 9.59384y$

--------------------------------------------------------------------------------------------
Example 4: The failure rate of a certain electronic device is suspected to increase linearly with its temperature. Fit a least-squares line through the data in the following table.

Table: The Failure Rate versus Temperature

Temp (F)    Failure rate
   55          1.90
   65          1.93
   75          1.97
   85          2.00
   95          2.01
  105          2.01
   55          1.94
   65          1.95
   75          1.97
   85          2.02
   95          2.02
  105          2.04

1. Draw a scatter diagram.
2. Estimate a least squares line.
3. Comment on the line.
4. Determine the correlation coefficient and the coefficient of determination, and comment on them.
[Figure: Temp (F) Line Fit Plot, showing failure rate against Temp (F).]
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.93037806
R Square             0.86560333
Adjusted R Square    0.85216366
Standard Error       0.01663902
Observations         12

ANOVA
              df    SS          MS          F          Significance F
Regression     1    0.017831    0.017831    64.4066    1.145E-05
Residual      10    0.002769    0.000277
Total         11

              Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept     1.79942857      0.023007          78.21204    2.85E-15    1.7481657    1.85069149
Temp (F)      0.00225714      0.000281          8.025373    1.15E-05    0.0016305    0.00288381
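The main entries of this summary output can be recomputed directly from the temperature and failure-rate data. The Python sketch below is a minimal illustration with variable names chosen here; it covers the coefficients, R Square, the standard error, and F, but not the p-values or confidence limits.

```python
import math

# Temperature (F) and failure rate from the table in this example.
temp = [55, 65, 75, 85, 95, 105, 55, 65, 75, 85, 95, 105]
rate = [1.90, 1.93, 1.97, 2.00, 2.01, 2.01,
        1.94, 1.95, 1.97, 2.02, 2.02, 2.04]

n = len(temp)
x_bar = sum(temp) / n
y_bar = sum(rate) / n

sxx = sum((x - x_bar) ** 2 for x in temp)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temp, rate))
sst = sum((y - y_bar) ** 2 for y in rate)  # total sum of squares

slope = sxy / sxx
intercept = y_bar - slope * x_bar

ssr = slope * sxy                # regression sum of squares
sse = sst - ssr                  # residual sum of squares
r_square = ssr / sst
std_error = math.sqrt(sse / (n - 2))
f_stat = ssr / (sse / (n - 2))   # MS regression / MS residual, 1 and n-2 df

print(round(intercept, 8), round(slope, 8))
print(round(r_square, 8), round(std_error, 8), round(f_stat, 4))
```

The fitted line is roughly failure rate = 1.7994 + 0.002257 × Temp (F), matching the Coefficients column.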
Example 5: An economist is interested in the relationship between the disposable income of a family and the amount of money spent annually on food. For a preliminary study, the economist takes a random sample of eight middle-income families of the same size (father, mother, two children). The results are as follows, where x denotes disposable income, in thousands of dollars, and y denotes food expenditure, in hundreds of taka.

x    30   36   27   20   16   24   19   25
y    55   60   42   40   37   26   39   43
a. Determine the regression equation for the data.
b. Graph the regression equation and the data points.
c. Describe the apparent relationship between disposable income and annual food expenditure.
d. What does the slope of the regression line represent in terms of disposable income and annual food expenditure?
e. Use the regression equation to predict the annual food expenditure of a family with a disposable income of Tk25000.
f. Identify the predictor and response variables.
g. Discuss the graphical implications of the value of r.
h. Determine and interpret the value of r.

Example: A department store gives in-service training to its salesmen, which is followed by a test. It is considering whether it should terminate the services of any salesman who does not do well in the test. The following data give the test scores and the sales made by the salesmen during a certain period.
Test Scores    Sales (Thousand Tk)
    15               32
    20               37
    25               49
    22               38
    27               51
    23               46
    16               33
    21               41
    20               39
i) Compute the correlation coefficient between the test scores and the sales.
ii) Does it indicate that terminating the services of salesmen with low test scores is justified?
iii) If the firm wants minimum sales of Taka 55000, what is the minimum test score that will ensure continuation of service?
[Figure: Scatter diagram of sales against test scores.]
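Parts i) and iii) of the department store example can be explored with the Python sketch below. It rests on two assumptions that the text does not confirm: the data are read as (test score, sales) pairs, and part iii) is approached by regressing test score on sales and evaluating that line at sales of 55 (thousand Tk). Inverting the regression of sales on test score is another possible reading and gives a somewhat different answer.

```python
import math

# Test scores and sales (thousand Tk), read as pairs from the table.
score = [15, 20, 25, 22, 27, 23, 16, 21, 20]
sales = [32, 37, 49, 38, 51, 46, 33, 41, 39]

n = len(score)
score_bar = sum(score) / n
sales_bar = sum(sales) / n

s_score = sum((s - score_bar) ** 2 for s in score)
s_sales = sum((v - sales_bar) ** 2 for v in sales)
s_cross = sum((s - score_bar) * (v - sales_bar)
              for s, v in zip(score, sales))

# i) correlation coefficient between test scores and sales
r = s_cross / math.sqrt(s_score * s_sales)

# iii) assumed approach: regress test score on sales, score = c + d * sales
d = s_cross / s_sales
c = score_bar - d * sales_bar
score_for_55 = c + d * 55

print(round(r, 4), round(score_for_55, 2))
```

A correlation this close to 1 indicates a strong positive linear association between test scores and sales, which is the evidence part ii) asks the reader to weigh.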
Analysis
Click Analyze. Click Correlate. Click Bivariate. Select the variables X and Y by highlighting them and clicking the Right Arrow.
Click OK.
Plot: