MULTIPLE REGRESSION
&
MULTIFACTOR ANOVA (GLM)
Y, Response variable
Continuous: output has a mean and variance
Discrete: output is a proportion, e.g., 15 out of 50 or 30%
You RARELY look at just one variable at a time, and doing so is not an efficient way to
experiment. Typically, multiple factors influence a response, and looking at them at
the same time allows us to investigate interactions!
Overview
Example:
$\widehat{\text{Print\_Q}} = 80.0 - 0.0968\,\text{Rough}$
1. s
2. R²
3. Adjusted R²
Print_Q vs. Rough and Strength
R² and s
Summary
Multiple R    R-Square    Adjusted R-Square    StErr of Estimate
0.8645        0.7474      0.7287               1.9441

$R_a^2 = 1 - \dfrac{SSE/(n - p - 1)}{SST/(n - 1)}$

The adjusted R² will always be smaller than the unadjusted coefficient.

Regression Table   Coefficient   Standard Error   t-Value   p-Value    95% CI Lower   95% CI Upper
Constant           61.6823       5.1856           11.8950   < 0.0001   51.0424        72.3223
Rough              -0.0825       0.0125           -6.6107   < 0.0001   -0.1082        -0.0569
Strength           0.1060        0.0290           3.6540    0.0011     0.0465         0.1656
After adjusting for the use of 2 predictors with n = 30,
nearly 73% of the variation in print quality can be
explained by the roughness and strength of the
paper used.
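The summary statistics for the Print_Q example can be recomputed from the sums of squares in its ANOVA table. The sketch below is only an illustration: the function name `fit_summary` and its argument names are assumptions, and the inputs (explained SS 301.9384, unexplained SS 102.0496, n = 30, p = 2) are the values reported for this example.

```python
import math

def fit_summary(ss_explained, ss_unexplained, n, p):
    """Return R-square, adjusted R-square, and s from ANOVA sums of squares.

    n is the number of observations, p the number of predictors.
    """
    sst = ss_explained + ss_unexplained      # total sum of squares
    r2 = ss_explained / sst                  # share of variation explained
    mse = ss_unexplained / (n - p - 1)       # error mean square
    adj_r2 = 1 - mse / (sst / (n - 1))       # penalizes extra predictors
    s = math.sqrt(mse)                       # standard error of estimate
    return r2, adj_r2, s

# Values from the Print_Q vs. Rough and Strength example
r2, adj_r2, s = fit_summary(301.9384, 102.0496, n=30, p=2)
print(f"R-square = {r2:.4f}, adjusted R-square = {adj_r2:.4f}, s = {s:.4f}")
```

The printed values should agree with the summary table to four decimals (0.7474, 0.7287, and 1.9441).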
“Flaw” in R²
H0: b1 = b2 = 0
Ha: Not both b1 and b2 are 0
[Figure: F distribution with the "Reject H0" region to the right of the critical value 3.35]
ANOVA Table    Degrees of Freedom   Sum of Squares   Mean of Squares   F-Ratio    p-Value
Explained      2                    301.9384         150.9692          39.9430    < 0.0001
Unexplained    27                   102.0496         3.7796
If α = 0.05, Fc = 3.3541
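The overall F-test can be checked by hand from the ANOVA table. This is a minimal sketch, with `overall_f_test` as an assumed helper name; the critical value 3.3541 is taken from the slide rather than computed, since the standard library has no F distribution.

```python
def overall_f_test(ss_explained, df_explained, ss_unexplained, df_unexplained, f_critical):
    """Return the F-ratio and whether H0 (all slopes are zero) is rejected."""
    ms_explained = ss_explained / df_explained        # mean square, explained
    ms_unexplained = ss_unexplained / df_unexplained  # mean square, unexplained
    f_ratio = ms_explained / ms_unexplained
    return f_ratio, f_ratio > f_critical

# Values from the ANOVA table; Fc = 3.3541 at alpha = 0.05 with (2, 27) df
f_ratio, reject_h0 = overall_f_test(301.9384, 2, 102.0496, 27, f_critical=3.3541)
print(f"F = {f_ratio:.4f}, reject H0: {reject_h0}")
```

Because the F-ratio is far beyond the critical value, H0 is rejected: at least one of Rough and Strength has a nonzero slope.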
Goal: Parsimony
We want to identify variables that are not
significant in the model and consider removing
them.
Remove one predictor (the least significant) at a time.
Test statistic: $t_{obs} = \dfrac{b_j - 0}{s_{b_j}} \sim t(n - p - 1)$
[Figure: t distribution with two rejection regions of area α/2 each, below -tc and above tc]
t-Tests
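The t-tests for the Print_Q coefficients can be sketched as follows. The names `coefficients` and `t_test` are assumptions. The critical value is not computed from a t distribution; it is recovered from the regression table, because a 95% confidence interval has half-width tc times the standard error.

```python
# (coefficient, standard error, 95% CI lower, 95% CI upper) from the regression table
coefficients = {
    "Constant": (61.6823, 5.1856, 51.0424, 72.3223),
    "Rough": (-0.0825, 0.0125, -0.1082, -0.0569),
    "Strength": (0.1060, 0.0290, 0.0465, 0.1656),
}

# Recover tc for n - p - 1 = 27 df from the Constant row: half-width / standard error
_, se_const, lower, upper = coefficients["Constant"]
t_critical = (upper - lower) / 2 / se_const

def t_test(b_j, s_bj, t_c):
    """Two-sided test of H0: b_j = 0. Returns (t_obs, reject)."""
    t_obs = (b_j - 0) / s_bj
    return t_obs, abs(t_obs) > t_c

results = {name: t_test(b, se, t_critical) for name, (b, se, _, _) in coefficients.items()}
for name, (t_obs, reject) in results.items():
    print(f"{name}: t = {t_obs:.3f}, reject H0: {reject}")
```

The t-values computed here differ slightly from the table (for example -6.600 versus -6.6107 for Rough) because the table's coefficients and standard errors are rounded to four decimals.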
Source           df            SS     MS                             F
Factor A         a-1           SSA    MSA = SSA/(a-1)                MSA/MSE
Factor B         b-1           SSB    MSB = SSB/(b-1)                MSB/MSE
AB Interaction   (a-1)(b-1)    SSAB   MSAB = SSAB/[(a-1)(b-1)]       MSAB/MSE
Error            ab(n-1)       SSE    MSE = SSE/[ab(n-1)]            -
Total            N-1           SST    -                              -
GLM provides flexibility: it handles both balanced and unbalanced designs. Here,
since the design is balanced, the Sequential SS and Adjusted SS are the same.
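For a balanced design, the entries of the two-way ANOVA table above can be computed directly from the cell data. The sketch below assumes a hypothetical helper `two_way_anova` and a made-up 2 x 2 data set with n = 2 replicates per cell; it is not the data behind the slides.

```python
def mean(xs):
    return sum(xs) / len(xs)

def two_way_anova(cells):
    """cells[i][j] holds the n replicate responses at level i of A and level j of B."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    all_obs = [y for row in cells for cell in row for y in cell]
    grand = mean(all_obs)
    a_means = [mean([y for cell in row for y in cell]) for row in cells]
    b_means = [mean([y for row in cells for y in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_e = sum((y - cell_means[i][j]) ** 2
               for i in range(a) for j in range(b) for y in cells[i][j])
    ss_t = sum((y - grand) ** 2 for y in all_obs)

    df_e = a * b * (n - 1)
    mse = ss_e / df_e
    table = {}
    for name, ss, df in [("A", ss_a, a - 1), ("B", ss_b, b - 1),
                         ("AB", ss_ab, (a - 1) * (b - 1))]:
        table[name] = {"df": df, "SS": ss, "MS": ss / df, "F": (ss / df) / mse}
    table["Error"] = {"df": df_e, "SS": ss_e, "MS": mse}
    table["Total"] = {"df": a * b * n - 1, "SS": ss_t}
    return table

# Hypothetical balanced data: 2 levels of A, 2 levels of B, 2 replicates per cell
cells = [[[5.0, 6.0], [7.0, 8.0]],
         [[6.0, 7.0], [4.0, 5.0]]]
anova = two_way_anova(cells)
for source, row in anova.items():
    print(source, row)
```

In this made-up data the main effects are small while the interaction is large, which is exactly the kind of structure that looking at one factor at a time would miss. The formulas used here hold only for balanced designs; unbalanced designs need the GLM approach.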
Main Effects Plot
[Figure: main effects plot of the mean response, with panels for Height and Width; a second plot shows Mean against Width (Narrow, Wide)]
Examining One Variable at a Time
Notice that SST is the same in both models: 64.34. The SS for Width is also
the same: 25.81. However, in the one-way ANOVA,
this factor is not significant at an alpha of 0.01, whereas in the two-way
ANOVA with the interaction, it is very significant.
Examining One Variable at a Time
Same issue with Height. If a factor is not part of the analysis, its effect ends up
in the error term, which is the denominator of the F-ratios. The error term is
supposed to contain only random error, not assignable causes. If there are
assignable causes in the error term, significant effects can be missed.
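The effect of leaving a factor out can be illustrated with a small calculation. The data below are made up (a hypothetical 2 x 2 Height by Width layout with two replicates per cell) and are not the data behind the slides. The sketch computes the F-ratio for Width twice: once with Height pooled into the error term, and once with a within-cell error term as in the two-way model with interaction.

```python
def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical responses keyed by (Height, Width)
data = {
    ("Short", "Narrow"): [4.0, 5.0],
    ("Short", "Wide"): [6.0, 7.0],
    ("Tall", "Narrow"): [7.0, 8.0],
    ("Tall", "Wide"): [9.0, 10.0],
}

all_obs = [y for ys in data.values() for y in ys]
grand = mean(all_obs)
ss_total = sum((y - grand) ** 2 for y in all_obs)

# SS for Width is the same no matter which other terms are in the model (balanced data)
widths = sorted({w for _, w in data})
width_obs = {w: [y for (_h, w2), ys in data.items() if w2 == w for y in ys] for w in widths}
ss_width = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in width_obs.values())
df_width = len(widths) - 1

# One-way ANOVA on Width: everything else, including Height, lands in the error term
ss_err_one = ss_total - ss_width
df_err_one = len(all_obs) - len(widths)
f_one_way = (ss_width / df_width) / (ss_err_one / df_err_one)

# Two-way ANOVA with interaction: the error term is only within-cell variation
ss_err_two = sum((y - mean(ys)) ** 2 for ys in data.values() for y in ys)
df_err_two = sum(len(ys) - 1 for ys in data.values())
f_two_way = (ss_width / df_width) / (ss_err_two / df_err_two)

print(f"SST = {ss_total}, SS Width = {ss_width}")
print(f"F for Width, one-way: {f_one_way:.2f}; two-way: {f_two_way:.2f}")
```

SST and the SS for Width do not change between the two analyses; only the error mean square in the denominator does, and that is what moves the F-ratio.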
BREAK
FRIDAY’S
HANDS-ON EXERCISE
Overview
[Figure: catapult diagram with the Start Angle and Stop Pin labeled]
Generate MLR Data: Directions
Break into groups of 4 or 5
Get catapult, duct tape, tape measure, 12’ aluminum foil & rubber ball
Duct tape catapult to floor securely
Use aluminum foil to mark where the ball lands; secure it to the
floor with a couple of pieces of tape to prevent it from sliding when the
ball lands on it
Place, and then secure, the tape measure alongside the aluminum foil,
starting from the front of the catapult
Response variable = Distance ball travels before hitting the ground
Experimental factors, independent variables…
Hook pin
Vertical pin
Start angle
[Figure: experimental setup showing the catapult, aluminum foil, and tape measure]
Generate MLR Data: Directions
Experimental factors
Hook pin
Vertical pin
Start angle
Constants:
Put cup at location 1
Put stop pin in location 3
Generate MLR Data
For Minitab, create a column for each variable: the response variable and the
three factors
You will have to generate the interactions in Minitab to include them in the analysis
Multiply two main effects and store the product in a separate column to create the interaction
Calc>Calculator will give you the above dialog box (left)
Store it in the next column; here it is labeled “HP*VP” for the interaction of Hook Pin and Vertical Pin
In the “Expression” box, double click on Hook Pin, click on the multiplication button, *, and double click on Vertical Pin
Click on “OK”
Continue this for the other two two-way interactions, HP*SA and VP*SA
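Outside Minitab, the same interaction columns can be built by multiplying the main-effect columns element by element. This is a minimal sketch: the `columns` dictionary and the factor settings in it are hypothetical placeholders, not recommended catapult settings.

```python
from itertools import combinations

# Hypothetical factor settings, one list entry per catapult shot
columns = {
    "HP": [1, 3, 1, 3],          # hook pin position
    "VP": [2, 2, 4, 4],          # vertical pin position
    "SA": [160, 160, 180, 180],  # start angle
}

# Snapshot the main effects first so new columns are not paired with each other
main_effects = list(columns.items())
for (name_a, col_a), (name_b, col_b) in combinations(main_effects, 2):
    columns[f"{name_a}*{name_b}"] = [x * y for x, y in zip(col_a, col_b)]

for name, values in columns.items():
    print(name, values)
```

This yields the three two-way interaction columns HP*VP, HP*SA, and VP*SA, which can then be used as predictors alongside the main effects.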
Generate MLR Data
The other columns are the factors you will regress on,
the main effects and their interactions
Generate MLR Data
Select: Stat>Regression>Regression
Click “OK”
Click “OK”
Generate MLR Data
Click on OK
Click on OK
Generate MLR Data
Determine
Which factors are significant; eliminate insignificant factors/interactions
Provide model with only significant variables
If multicollinearity exists
What’s the proof
If it does, what are the issues with your model
Validate your assumptions
Residuals are normal, independent, have constant variance, and E{e_i} = 0
How much of the total variation does your model explain
What is the ballpark 95% range within which I can expect a predicted value to fall
Is your model good or are there issues
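Two of the checks above lend themselves to a quick calculation. The sketch below makes several assumptions: the function names are hypothetical, the variance inflation factor (VIF) is computed only for the special case of two predictors (where it equals 1/(1 - r²), with r their correlation), and the ballpark 95% range is taken as the prediction plus or minus 2s, which ignores the uncertainty in the estimated coefficients. The example columns are made up.

```python
import math

def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = correlation(x1, x2)
    return 1 / (1 - r ** 2)

def ballpark_95_range(predicted, s):
    """Rough 95% range for a predicted value: prediction +/- 2 standard errors."""
    return predicted - 2 * s, predicted + 2 * s

# Hypothetical columns: a main effect and an interaction built from it
hook_pin = [1, 3, 1, 3]
hp_times_vp = [2, 6, 4, 12]
vif = vif_two_predictors(hook_pin, hp_times_vp)

# Hypothetical prediction of 60.0 with s = 1.9441 (the Print_Q example's s)
low, high = ballpark_95_range(60.0, 1.9441)
print(f"VIF = {vif:.2f}, ballpark 95% range = ({low:.2f}, {high:.2f})")
```

With more than two predictors, each VIF comes from regressing that predictor on all the others, which Minitab reports directly. A commonly cited rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity.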