Vous êtes sur la page 1sur 31

Statistics for Managers

Using Microsoft Excel


5th Edition
Chapter 11
Analysis of Variance

Statistics for Managers Using Microsoft Excel, 5e 2008 Prentice-Hall, Inc.

Chap 11-1

Learning Objectives
After completing this chapter, you should be able to:
Recognize situations in which to use analysis of variance

(ANOVA)
Understand different analysis of variance designs
Evaluate assumptions of the model
Perform a single-factor ANOVA and interpret the results
Conduct and interpret a Tukey-Kramer post-analysis to
determine which means are different
Analyze two-factor analysis of variance tests
Conduct and interpret a Tukey-Kramer post-analysis procedure
to determine which factors are different
Statistics for Managers Using Microsoft Excel, 5e

Chap 11-2

General ANOVA Analysis


Investigator controls one or more independent variables
Called factors or treatment variables
One factor contains three or more levels or groups or

categories/classifications
Other factors contains two or more levels or groups or
categories/classifications
Experimental design: the plan used to test the hypothesis

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-3

One-Factor ANOVA
Also known as Completely Randomized

Design and One-way ANOVA


Experimental units (subjects) are assigned
randomly to treatments
Subjects are assumed homogeneous

Only one factor or independent variable


With three or more treatment levels

Analyzed by one-factor analysis of variance


Statistics for Managers Using Microsoft Excel, 5e

Chap 11-4

One-Factor Analysis of Variance


Evaluates the difference among the means of three or more
groups
Examples: Accident rates for 1st, 2nd, and 3rd shift
Expected mileage for five brands of tires

Assumptions
Populations are normally distributed

(test with Box plot or Normal Probability Plot)


Populations have equal variances
(use Levenes Test for Homogeneity of Variance)
Samples are randomly and independently drawn
Statistics for Managers Using Microsoft Excel, 5e

Chap 11-5

Why Analysis of Variance?


We could compare the means in pairs using a t test for

difference of means
Each t test contains Type 1 error
The total Type 1 error with k pairs of means is 1- (1 - ) k
If there are 5 means and you use = .05
Must perform 10 comparisons
Type I error is 1 (.95) 10 = .40
40% of the time you will reject the null hypothesis of equal

means in favor of the alternative even when the null is true!


Statistics for Managers Using Microsoft Excel, 5e

Chap 11-6

Hypotheses: One-Factor ANOVA

H0 : 1 2 3 c
All population means are equal
i.e., no treatment effect (no variation in means among groups)

H1 : Not all of the population means are the same


At least one population mean is different
i.e., there is a treatment (groups) effect
Does not mean that all population means are different (at

least one of the means is different from the others)


Statistics for Managers Using Microsoft Excel, 5e

Chap 11-7

Hypotheses: One-Factor
ANOVA
H0 : 1 2 3 c
H1 : Not all j are the same
All Means are the same:
The Null Hypothesis is True
(No Group Effect)

1 2 3
Statistics for Managers Using Microsoft Excel, 5e

Chap 11-8

Hypotheses: One-Factor
ANOVA
H 0 : 1 2 3 c

At least one mean is different:


The Null Hypothesis is NOT true
(Treatment Effect is present)

H1 : Not all j are the same

or

1 2 3

1 2 3

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-9

One-Factor ANOVA Table


Source of
Variation

df

SS

MS
(Variance)

P-value

Between
Groups

c-1

SSA

MSA

P(X=F)

Within
Groups

n-c

SSW

MSW

Total

n-1

SST =
SSA+SSW

F-Ratio

MSA
MSW

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-10

One-Factor ANOVA
Test Statistic
H0: 1= 2 = = c
H1: At least two population means are different
Test statistic

MSA
F
MSW

MSA is mean squares among variances


MSW is mean squares within variances
Degrees of freedom
df1 = c 1

(c = number of groups)

df2 = n c

(n = sum of all sample sizes)

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-11

One-Factor ANOVA
Test Statistic
The F statistic is the ratio of the among variance to the

within variance
The ratio must always be positive
df1 = c -1 will typically be small
df2 = n - c will typically be large

Decision Rule:
Reject H0 if F > FU,
otherwise do not reject H0

= .05

Do not
reject H0

Statistics for Managers Using Microsoft Excel, 5e

Reject H0

FU
Chap 11-12

One-Factor ANOVA
F Test Example
You want to see if three different
golf clubs yield different
distances. You randomly select
five measurements from trials on
an automated driving machine
for each club. At the .05
significance level, is there a
difference in mean driving
distance?

Club 1
254
263
241
237
251

Statistics for Managers Using Microsoft Excel, 5e

Club 2
234
218
235
227
216

Club 3
200
222
197
206
204

Chap 11-13

One-Way ANOVA
Example
Distance
270

Club 1
254
263
241
237
251

Club 2
234
218
235
227
216

Club 3
200
222
197
206
204

x1 249.2 x 2 226.0 x 3 205.8


x 227.0

260
250
240
230
220

X 1

X2

210

200
190

Statistics for Managers Using Microsoft Excel, 5e

2
Club

X
X3
3

Chap 11-14

ANOVA -- Single Factor:


Excel Output
EXCEL: Tools | Data Analysis | ANOVA: Single Factor
SUMMARY
Groups

Count

Sum

Average

Variance

Club 1

1246

249.2

108.2

Club 2

1130

226

77.5

Club 3

1029

205.8

94.2

ANOVA
Source of
Variation

SS

df

MS

P-value

25.275

4.99E-05

Between
Groups

4716.4

2358.2

Within
Groups

1119.6

12

93.3

Total

5836.0

14

Statistics for Managers Using Microsoft Excel, 5e

F crit
3.89

Chap 11-15

One-Factor ANOVA Example


Solution
H0: 1 = 2 = 3
H1: i not all equal
= .05
p-value: 4.99E-05
Decision:
Reject H0 at = 0.05
Conclusion:

There is evidence that


at least one i differs
from the rest

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-16

The Tukey-Kramer Procedure


Tells which population means are significantly different
e.g.: 1 = 2 3
Done after rejection of equal means in ANOVA
Allows pair-wise comparisons
Compare absolute mean differences with critical

range

1= 2

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-17

Tukey-Kramer Critical Range

Critical Range QU

MSW 1
1

2 n j n j'

where:
QU = Value from Studentized Range Distribution with c
and n - c degrees of freedom for the desired level
of (see appendix E.9 table)
MSW = Mean Square Within
nj and nj = Sample sizes from groups j and j
Statistics for Managers Using Microsoft Excel, 5e

Chap 11-18

The Tukey-Kramer Procedure:


Example
Club 1
254
263
241
237
251

Club 2
234
218
235
227
216

Club 3
200
222
197
206
204

1. PhStat computes the absolute


mean differences:

x1 x 2 249.2 226.0 23.2


x1 x 3 249.2 205.8 43.4
x 2 x 3 226.0 205.8 20.2

2. You find a QU value from the table in appendix E.9 with


c = 3 (across the table) and n c = 15 3 = 12 degrees of freedom
(down the table) for the desired level of ( = .05 used here):

QU 3.77
Statistics for Managers Using Microsoft Excel, 5e

Chap 11-19

The Tukey-Kramer Procedure:


Example
(continued)
3. PhStat computes the Critical Range:
Critical Range QU

MSW 1 1
93.3 1 1

3.77
16.285

2 n j n j'
2 5 5

4. Compare:
5. All of the absolute mean differences x x 23.2
1
2
are greater than critical range.
Therefore there is a significant
x1 x 3 43.4
difference between each pair of
x 2 x 3 20.2
means at 5% level of significance.
PhStat does all the calculations for you
but you must input the Q value
Statistics for Managers Using Microsoft Excel, 5e

Chap 11-20

Tukey-Kramer in PHStat

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-21

ANOVA Assumptions
Levenes Test
Tests the assumption that the variances of each

group are equal.


First, define the null and alternative hypotheses:
H0: 21 = 22 = =2c
H1: Not all 2j are equal

Second, compute the absolute value of the difference

between each value and the median of each group.


Third, perform a one-way ANOVA on these absolute
differences.

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-22

Two-Factor ANOVA
Examines the effect of
Two factors of interest on the dependent

variable
e.g., Percent carbonation and line speed on soft

drink bottling process


Interaction between the different levels of these

two factors
e.g., Does the effect of one particular carbonation

level depend on which level the line speed is set?

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-23

Two-Factor ANOVA
Assumptions
Populations are normally distributed
Populations have equal variances
Independent random samples are selected

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-24

Two-Factor ANOVA
Sources of Variation
Two Factors of interest: A and B
r = number of levels of factor A
c = number of levels of factor B
n/ = number of replications for each cell
n = total number of observations in all cells
(n = rcn/)
Xijk = value of the kth observation of level i
of factor A and level j of factor B
Statistics for Managers Using Microsoft Excel, 5e

Chap 11-25

Two-Factor ANOVA Summary Table


With Replication
Source of
Variation

Degrees of
Freedom

Sum of
Squares

Mean
Squares

F
Statistic

p-value

Sample
Factor A
(Row)

r1

SSA

MSA =
SSA/(r 1)

MSA/
MSE

f (FA)

Columns
Factor B

c1

SSB

MSB =
SSB/(c 1)

MSB/
MSE

f (FB)

MSAB/
MSE

f (FA&B)

Interaction
(AB)

(r 1)(c 1)

SSAB

MSAB =
SSAB/ [(r 1)(c 1)]

Within
Error

rc n 1)

SSE

MSE =
SSE/[rc n 1)]

Total

rc n 1

SST

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-26

Two-Factor ANOVA
With Replication
As production manager,

you want to see if 3 filling


machines have different
mean filling times when
used with 5 types of boxes.
At the .05 level, is there a
difference in machines, in
boxes? Is there an
interaction?

Box Machine1 Machine2 Machine3


1
25.40
23.40
20.00
26.40
24.40
21.00
2
26.31
21.80
22.20
25.90
23.00
22.00
3
24.10
23.50
19.75
24.40
22.40
19.00
4
23.74
22.75
20.60
25.40
23.40
20.00
5
25.10
21.60
20.40
26.20
22.90
21.90

Statistics for Managers Using Microsoft Excel, 5e

Chap 11-28

Summary Table
Source of
Variation
Sample

(Boxes)
Columns

(Machines)

Degrees of Sum of
Mean
Freedom
Squares Square

P-Value

5-1=4

7.4714

1.8678

3.6868

3-1=2

106.298

53.149

104.908 1.52E-09

Interaction (5-1)(3-1) = 8

9.7032

1.2129

7.5994

.5066

Within
(Error)

53(2-1)=15

Total

352 -1 = 29 131.0720

Statistics for Managers Using Microsoft Excel, 5e

2.3941

.0277

.0690

Chap 11-29

The Tukey-Kramer Procedure:


With Replication Factor A
critical range Qr ,rc ( n ' 1)

MSW
cn '

1. No Macro - compute by formula only


2. MSW (Within) from ANOVA Printout
3. Q from Table E.9 in the Book page 860
alpha = .05 or .01
r is the number of levels of factor A (across table)
rc(n-1) (down the table)
c is the number of levels of factor B
n is the number of replications
Statistics for Managers Using Microsoft Excel, 5e

Chap 11-30

The Tukey-Kramer Procedure:


With Replication Factor B
MSW
critical range Qc ,rc ( n ' 1)
rn'
1. No Macro compute by formula only
2. MSW (Within) from ANOVA Printout
3. Q from Table E.9 in the Book page 860
alpha = .05 or .01
c is the number of levels of factor B (across
table)
rc(n-1) (down the table)
r is the number of levels of factor A
n isfor
the
number
replications
Statistics
Managers
Usingof
Microsoft
Excel, 5e
Chap 11-31

Chapter Summary
Described one-factor analysis of variance
ANOVA assumptions
ANOVA test for difference in c means
The Tukey-Kramer procedure for multiple comparisons

Described two-factor analysis of variance


Examined effects of multiple factors
Examined interaction between factors for the model with replicated

observations
The Tukey-Kramer procedure for multiple comparisons for both

factor A and factor B for the model with replication


Statistics for Managers Using Microsoft Excel, 5e

Chap 11-32

Vous aimerez peut-être aussi