
Analysis of Variance

David Chow
Nov 2014

Chap 11-1

Learning Objectives
In this chapter, you learn:

The basic concepts of experimental design

How to use one-way analysis of variance (ANOVA)

How to use two-way analysis of variance and interpret the interaction effect

General ANOVA Setting

Researcher designs an experiment, collects data, and draws conclusions
Researcher controls one or more factors of interest
Observe effects on the dependent variable
Main Question: Are the groups (populations) the same?

Each factor (independent variable) contains two or more treatments (levels)
Levels can be numerical or categorical
Different levels give different groups, with each group representing a population

Completely Randomized Design

CRD is the simplest experimental design

Only one factor under consideration

Test subjects (assumed to be homogeneous) are randomly assigned to different treatment levels
Treatment (level)   Placebo (P)   Vaccine (V)
                    300           300

Eg: A medical experiment

Subjects are randomly assigned to get one treatment (either P or V)

Dependent variable = number of colds reported

Are fewer colds reported in the vaccine group?

One-Way ANOVA: Assumptions

Evaluate the difference among the means of three or more groups
Eg1: Accident rates for 1st, 2nd, and 3rd shift
Eg2: Expected mileage for five brands of tires

Assumptions
Populations are normally distributed
Populations have equal variances
Samples are randomly and independently drawn

Setting Hypotheses

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$

All population means are equal (c = number of groups)

i.e., no factor effect

$H_1$: Not all of the population means are the same

At least one pair with different population means

i.e., there is a factor effect

Graphical Presentation
[Figure: group distributions when H0 is true ($\mu_1 = \mu_2 = \mu_3$), and two cases in which H0 is NOT true (the means are not all equal)]

Idea: Partitioning the Variation

Total variation can be split into two parts:

SST = SSA + SSW


SST = Total Sum of Squares (total variation)
SSA = Sum of Squares Among Groups (among-group variation due to factor)
SSW = Sum of Squares Within Groups (within-group variation due to ____)

Obtaining the Mean Squares


The Mean Squares are obtained by dividing the various sums of squares by their associated degrees of freedom

Mean Square Among (d.f. = c-1):   $MSA = \dfrac{SSA}{c-1}$

Mean Square Within (d.f. = n-c):  $MSW = \dfrac{SSW}{n-c}$

Mean Square Total (d.f. = n-1):   $MST = \dfrac{SST}{n-1}$

One-Way ANOVA Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square (Variance)   F
Among Groups          c - 1                SSA              MSA = SSA / (c - 1)      FSTAT = MSA / MSW
Within Groups         n - c                SSW              MSW = SSW / (n - c)
Total                 n - 1                SST

FSTAT has df1 = c - 1 and df2 = n - c

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom

Interpreting F Statistic

The F statistic is the ratio of two variance estimates: among groups to within groups

The ratio must always be positive

df1 = c - 1 will typically be small
df2 = n - c will typically be large

One-Tail F-test
Decision Rule:
Reject H0 if FSTAT > $F_\alpha$, otherwise do not reject H0

[Figure: F distribution with the "Do not reject H0" region to the left of $F_\alpha$ and the "Reject H0" region of area $\alpha$ to the right]
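The decision rule can be sketched in a few lines of Python. This is an illustration rather than part of the original slides: the function names are hypothetical, and the closed-form critical value relies on the fact that with exactly three groups (df1 = 2) the upper tail of the F distribution is P(F > f) = (1 + 2f/df2)^(-df2/2). For other values of df1, an F-table or a statistics library would be needed instead.

```python
def f_critical_df1_2(alpha: float, df2: int) -> float:
    """Upper-tail critical value of F(2, df2).

    Inverts P(F > f) = (1 + 2 f / df2) ** (-df2 / 2), which holds only for df1 = 2.
    """
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


def decide(f_stat: float, f_crit: float) -> str:
    """One-tail rule: reject H0 only when FSTAT exceeds the critical value."""
    return "Reject H0" if f_stat > f_crit else "Do not reject H0"


# Illustrative setting: c = 3 groups and n = 15 observations, so df1 = 2, df2 = 12.
f_crit = f_critical_df1_2(alpha=0.05, df2=12)
print(round(f_crit, 2))        # 3.89
print(decide(25.275, f_crit))  # Reject H0
print(decide(0.939, f_crit))   # Do not reject H0
```

The two F values passed to `decide` are the test statistics from the golf club and car wax examples in these slides.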

Eg: Are the Clubs Different?

When three different golf clubs are used, they hit the ball different distances.
You randomly select five measurements for each club.
At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

Computations by EXCEL

Excel Output
SUMMARY
Groups   Count   Sum    Average   Variance
Club 1   5       1246   249.2     108.2
Club 2   5       1130   226       77.5
Club 3   5       1029   205.8     94.2

ANOVA
Source of Variation   SS       df   MS       F        P-value    F crit
Between Groups        4716.4   2    2358.2   25.275   4.99E-05   3.89
Within Groups         1119.6   12   93.3
Total                 5836.0   14
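The Excel figures can be reproduced directly from the raw distances. The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the dictionary layout are illustrative choices, not something taken from the slides.

```python
from statistics import mean

# Golf-club distances from the example (five measurements per club).
groups = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}


def one_way_anova(samples):
    """Return the sums of squares, degrees of freedom, mean squares and FSTAT."""
    all_values = [x for g in samples for x in g]
    n, c = len(all_values), len(samples)
    grand_mean = mean(all_values)
    # Among-group variation: group means around the grand mean, weighted by group size.
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    # Within-group variation: observations around their own group mean.
    ssw = sum((x - mean(g)) ** 2 for g in samples for x in g)
    # Total variation: observations around the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    msa, msw = ssa / (c - 1), ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": sst, "df1": c - 1, "df2": n - c,
            "MSA": msa, "MSW": msw, "FSTAT": msa / msw}


table = one_way_anova(list(groups.values()))
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

Up to rounding, the printed values agree with the SS, df, MS and F columns of the Excel output, and SSA + SSW equals SST.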

Statistical Decision
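$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: $\mu_j$ not all equal
$\alpha = 0.05$
df1 = ___   df2 = ___

Test Statistic:
$F_{STAT} = \dfrac{MSA}{MSW} = \dfrac{2358.2}{93.3} = 25.275$

Critical Value:
$F_\alpha = 3.89$

Decision:
Reject H0 at $\alpha = 0.05$

Conclusion:
There is evidence that at least one $\mu_j$ differs from the rest

[Figure: F distribution with $\alpha = .05$; "Do not reject H0" to the left of $F_\alpha = 3.89$, "Reject H0" to the right, with FSTAT = 25.275 in the rejection region]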
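The same decision can be reached through the p-value reported by Excel. As a rough check, the sketch below recomputes that p-value in pure Python. It assumes df1 = 2 (three groups), for which the F upper tail has the closed form P(F > f) = (1 + 2f/df2)^(-df2/2); the function name is hypothetical, and for other degrees of freedom a statistics library would be the usual route.

```python
def f_pvalue_df1_2(f_stat: float, df2: int) -> float:
    """Upper-tail p-value P(F > f_stat) for F(2, df2); valid only when df1 = 2."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)


alpha = 0.05
p_value = f_pvalue_df1_2(f_stat=2358.2 / 93.3, df2=12)
print(f"p-value = {p_value:.2e}")  # p-value = 4.99e-05
print("Reject H0" if p_value < alpha else "Do not reject H0")  # Reject H0
```

Applied to the car wax example later in the slides (F = 0.939, df2 = 12), the same formula gives a p-value of about 0.42.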

Scatter Plot
[Figure: scatter plot of Distance (190 to 270) against Club (1, 2, 3) for the golf club data, marking the sample means $\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$ and the grand mean $\bar{\bar{X}} = 227.0$]

ANOVA Assumptions

Randomness and Independence
Select random samples from the c groups (or randomly assign the levels)

Normality
The sample values for each group are from a normal population

Homogeneity of Variance
All populations sampled from have the same variance

Chapter Summary

One-way ANOVA
Its logic & assumptions
F-test for difference in c means

(Below are not covered)
If H0 rejected: Tukey-Kramer procedure for multiple comparisons
Assumption check: Levene test for homogeneity of variance
Another experimental design: randomized block design

Two-way analysis of variance
Examined effects of multiple factors
Examined interaction between factors

Appendix: Math Details


Total Sum of Squares


SST = SSA + SSW

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$

Where:
SST = Total sum of squares
c = number of groups or levels
$n_j$ = number of observations in group j
$X_{ij}$ = ith observation from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)

Total Variation
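$$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2$$

[Figure: Response X plotted for Group 1, Group 2 and Group 3, with the grand mean $\bar{\bar{X}}$ marked]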

Among-Group Variation
SST = SSA + SSW

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Where:
SSA = Sum of squares among groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)

Among-Group Variation

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Variation Due to Differences Among Groups

$$MSA = \frac{SSA}{c-1}$$

Mean Square Among = SSA/degrees of freedom

Among-Group Variation

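$$SSA = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c(\bar{X}_c - \bar{\bar{X}})^2$$

[Figure: Response X for Group 1, Group 2 and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]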

Within-Group Variation
SST = SSA + SSW

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Where:
SSW = Sum of squares within groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$X_{ij}$ = ith observation in group j

Within-Group Variation

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Summing the variation within each group and then adding over all groups

$$MSW = \frac{SSW}{n-c}$$

Mean Square Within = SSW/degrees of freedom

Within-Group Variation

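$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$$

[Figure: Response X for Group 1, Group 2 and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]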
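The identity SST = SSA + SSW follows from the definitions in this appendix. A short derivation, using only the symbols defined there, splits each deviation from the grand mean into a within-group part and an among-group part:

```latex
\begin{aligned}
SST &= \sum_{j=1}^{c}\sum_{i=1}^{n_j} \bigl(X_{ij} - \bar{\bar{X}}\bigr)^2
     = \sum_{j=1}^{c}\sum_{i=1}^{n_j}
       \bigl[(X_{ij} - \bar{X}_j) + (\bar{X}_j - \bar{\bar{X}})\bigr]^2 \\
    &= \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2
     + 2\sum_{j=1}^{c} (\bar{X}_j - \bar{\bar{X}}) \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)
     + \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2 \\
    &= SSW + 0 + SSA
\end{aligned}
```

The middle term vanishes because the deviations of the observations in group j from their own mean $\bar{X}_j$ always sum to zero.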

Eg: Car Wax Effectiveness

The number of times each car went through the carwash before its wax deteriorated is shown on the next slide
The wax producer must decide which wax to market
Are the three waxes equally effective?

Factor: Car wax
Treatments (Levels): Type 1, Type 2, Type 3
Subjects: Cars
Response variable: Number of washes

Eg: Car Wax Effectiveness

Observation       Wax Type 1   Wax Type 2   Wax Type 3
1                 27           33           29
2                 30           28           28
3                 29           31           30
4                 28           30           32
5                 31           30           31

Sample Mean       29.0         30.4         30.0
Sample Variance   2.5          3.3          2.5
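The summary rows of the table can be checked with Python's standard `statistics` module, whose `variance` function computes the sample variance (divisor n - 1). This is only a verification sketch; the dictionary keys are illustrative labels.

```python
from statistics import mean, variance

# Number of washes before the wax deteriorated, one list per wax type.
washes = {
    "Type 1": [27, 30, 29, 28, 31],
    "Type 2": [33, 28, 31, 30, 30],
    "Type 3": [29, 28, 30, 32, 31],
}

# Sample mean and sample variance (n - 1 divisor) for each wax type.
summary = {wax: (mean(x), variance(x)) for wax, x in washes.items()}
for wax, (m, v) in summary.items():
    print(f"{wax}: mean = {m:.1f}, variance = {v:.1f}")
```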

Eg: Car Wax Effectiveness

Hypotheses
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not all the means are equal
where:
$\mu_1$ = mean number of washes using Type 1 wax
$\mu_2$ = mean number of washes using Type 2 wax
$\mu_3$ = mean number of washes using Type 3 wax

Eg: Car Wax Effectiveness

ANOVA Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F   p-Value
Treatments            5.2              a                    c              e   .42
Error                 33.2             b                    d
Total                 38.4             14

Rejection Rule (given $\alpha$ = 0.05)
p-Value Approach: Reject H0 if p-value < .05
Critical Value Approach: Reject H0 if F > F.05 = h

ANSWER
ANOVA Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F         p-Value
Treatments            5.2              a=2                  c=2.60         e=0.939   .42
Error                 33.2             b=12                 d=2.77
Total                 38.4             14

Critical Value: F.05 = 3.89

ANSWER
Conclusion
p-value approach: From the F-table, the p-value is greater than 0.10, since FSTAT = 0.939 is below F.10 = 2.81. (Excel gives an exact p-value of 0.42.) Do not reject H0
Critical value approach: FSTAT = 0.939 < F.05 = 3.89, do not reject H0
There is insufficient evidence to conclude that the mean numbers of washes for the three wax types are not all the same
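The blanks in the ANOVA table follow from the two given sums of squares and the sample layout (c = 3 wax types, n = 15 cars). A minimal sketch of that arithmetic, with variable names chosen here for illustration and the critical value 3.89 taken as given from the slide:

```python
# Given on the slide: sums of squares for treatments and error.
ssa, ssw = 5.2, 33.2
c, n = 3, 15            # number of wax types, total number of cars

a = c - 1               # blank a: treatment degrees of freedom
b = n - c               # blank b: error degrees of freedom
msa = ssa / a           # blank c: mean square for treatments
msw = ssw / b           # blank d: mean square for error
f_stat = msa / msw      # blank e: F statistic

f_crit = 3.89           # F.05 with (2, 12) degrees of freedom, as given on the slide
print(a, b, round(msa, 2), round(msw, 2), round(f_stat, 3))
print("Reject H0" if f_stat > f_crit else "Do not reject H0")
```

The unrounded ratio is about 0.940; the slide's 0.939 is what results from dividing 2.60 by the rounded value 2.77. Either way the statistic is far below 3.89, so the decision is unchanged.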
