
About the Supplemental Text Material

I have prepared supplemental text material to accompany the 6th edition of Introduction to Statistical Quality
Control. This material consists of (1) additional background reading on some aspects of statistics and
statistical quality control and improvement, (2) extensions of and elaboration on some textbook topics and (3)
some new topics that I could not easily find a home for in the text without making the book much too long.
Much of this material has been prepared in at least partial response to the many excellent and very helpful
suggestions that have been made over the years by textbook users. However, sometimes there just wasn't any
way to easily accommodate their suggestions directly in the book. Some of the supplemental material is also
in response to FAQs or frequently asked questions from students. I have also provided a list of references
for this supplemental material that are not cited in the textbook.
Feedback from my colleagues indicates that this book is used in a variety of ways. Most often, it is used as
the textbook in an upper-division undergraduate course on statistical quality control and improvement.
However, there are a significant number of instructors that use the book as the basis of a graduate-level course,
or offer a course taken by a mixture of advanced undergraduates and graduate students. Obviously the topical
content and depth of coverage varies widely in these courses. Consequently, I have included some
supplemental material on topics that might be of interest in a more advanced undergraduate or graduate-level
course.
There is considerable personal bias in my selection of topics for the supplemental material. The coverage is
far from comprehensive.
I have not felt as constrained about mathematical level or statistical background of the readers in the
supplemental material as I have tried to be in writing the textbook. There are sections of the supplemental
material that will require more background in statistics than is required to read the text material. However, I
think that many instructors will be able to use selected portions of this supplemental material in their courses
quite effectively, depending on the maturity and background of the students.

Supplemental Text Material Contents


Chapter 3

S3-1. Independent Random Variables


S3-2. Development of the Poisson Distribution
S3-3. The Mean and Variance of the Normal Distribution
S3-4. More about the Lognormal Distribution
S3-5. More about the Gamma Distribution
S3-6. The Failure Rate for the Exponential Distribution
S3-7. The Failure Rate for the Weibull Distribution

Chapter 4
S4-1. Random Samples
S4-2. Expected Value and Variance Operators
S4-3. Proof That E(x̄) = μ and E(S²) = σ²
S4-4. More about Parameter Estimation
S4-5. Proof That E(s) = c₄σ
S4-6. More about Checking Assumptions in the t-Test
S4-7. Expected Mean Squares in the Single-Factor Analysis of Variance

Chapter 5
S5-1. A Simple Alternative to Runs Rules on the x̄ Chart

Chapter 6
S6-1. s² is not Always an Unbiased Estimator of σ²
S6-2. Should We Use d2 or d2* in Estimating σ via the Range Method?
S6-3. Determining When the Process has Shifted
S6-4. More about Monitoring Variability with Individual Observations
S6-5. Detecting Drifts versus Shifts in the Process Mean
S6-6. The Mean Square Successive Difference as an Estimator of σ²

Chapter 7
S7-1. Probability Limits on Control Charts

Chapter 8
S8-1. Fixed Versus Random Factors in the Analysis of Variance
S8-2. More about Analysis of Variance Methods for Measurement Systems Capability Studies

Chapter 9
S9-1. The Markov Chain Approach for Finding the ARLs for Cusum and EWMA Control
Charts
S9-2. Integral Equations versus Markov Chains for Finding the ARL

Chapter 10
S10-1. Difference Control Charts
S10-2. Control Charts for Contrasts
S10-3. Run Sum and Zone Control Charts
S10-4. More about Adaptive Control Charts

Chapter 11
S11-1. Multivariate Cusum Control Charts

Chapter 13
S13-1. Guidelines for Planning Experiments
S13-2. Using a t-Test for Detecting Curvature
S13-3. Blocking in Designed Experiments
S13-4. More about Expected Mean Squares in the Analysis of Variance

Chapter 14
S14-1. Response Surface Designs
S14-2. Fitting Regression Models by Least Squares
S14-3. More about Robust Design and Process Robustness Studies

Chapter 15
S15-1. A Lot Sensitive Compliance (LTPD) Sampling Plan
S15-2. Consideration of Inspection Errors

Supplemental Material for Chapter 3


S3.1. Independent Random Variables
Preliminary Remarks
Readers encounter random variables throughout the textbook. An informal definition of and notation
for random variables is used. A random variable may be thought of informally as any variable for
which the measured or observed value depends on a random or chance mechanism. That is, the value
of a random variable cannot be known in advance of actual observation of the phenomena. Formally,
of course, a random variable is a function that assigns a real number to each outcome in the sample
space of the observed phenomena. Furthermore, it is customary to distinguish between the random
variable and its observed value or realization by using an upper-case letter to denote the random
variable (say X) and the actual numerical value x that is the result of an observation or a measured
value. This formal notation is not used in the book because (1) it is not widely employed in the
statistical quality control field and (2) it is usually quite clear from the context whether we are
discussing the random variable or its realization.
Independent Random Variables
In the textbook, we make frequent use of the concept of independent random variables. Most readers
have been exposed to this in a basic statistics course, but here a brief review of the concept is given.
For convenience, we consider only the case of continuous random variables. For the case of discrete
random variables, refer to Montgomery and Runger (2007).
Often there will be two or more random variables that jointly define some physical phenomena of interest. For
example, suppose we consider injection-molded components used to assemble a connector for an automotive
application. To adequately describe the connector, we might need to study both the hole interior diameter and
the wall thickness of the component. Let x1 represent the hole interior diameter and x2 represent the wall
thickness. The joint probability distribution (or density function) of these two continuous random variables
can be specified by providing a method for calculating the probability that x1 and x2 assume a value in any
region R of two-dimensional space, where the region R is often called the range space of the random variable.
This is analogous to the probability density function for a single random variable. Let this joint probability
density function be denoted by f(x1, x2). The double integral of this joint probability density function over a specified region R provides the probability that x1 and x2 assume values in the range space R.
A joint probability density function has the following properties:

a. f(x1, x2) ≥ 0 for all x1, x2

b. \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1\,dx_2 = 1

c. For any region R of two-dimensional space, P\{(x_1, x_2) \in R\} = \iint_R f(x_1, x_2)\,dx_1\,dx_2

The two random variables x1 and x2 are independent if f(x1, x2) = f1(x1)f2(x2), where f1(x1) and f2(x2) are the marginal probability distributions of x1 and x2, respectively, defined as

f_1(x_1) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_2 \quad \text{and} \quad f_2(x_2) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1

In general, if there are p random variables x1 , x2 ,..., x p then the joint probability density function is

f ( x1 , x2 ,..., x p ) , with the properties:


a. f ( x1 , x2 ,..., x p ) 0 for all x1, x2 ,..., x p
4

b.

... f ( x , x ..., x
1

)dx1dx2 ...dx p 1

c. For any region R of p-dimensional space,

P{( x1 , x2 ,..., x p ) R} ... f ( x1 , x2 ,..., x p )dx1dx2 ...dx p


R

The random variables x1, x2, , xp are independent if

f ( x1 , x2 ,..., x p ) f1 ( x1 ) f 2 ( x2 )... f p ( x p )
where fi ( xi ) are the marginal probability distributions of x1, x2 , , xp, respectively, defined as

fi ( xi ) ... f ( x1 , x2 ,..., x p ) dx1dx2 ...dxi 1dxi 1...dx p


Rxi

S3.2. Development of the Poisson Distribution


The Poisson distribution is widely used in statistical quality control and improvement, frequently as
the underlying probability model for count data. As noted in Section 3.2.3 of the text, the Poisson
distribution can be derived as a limiting form of the binomial distribution, and it can also be developed
from a probability argument based on the birth and death process. We now give a summary of both
developments.
The Poisson Distribution as a Limiting Form of the Binomial Distribution
Consider the binomial distribution

p(x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n

Let λ = np, so that p = λ/n. We may now write the binomial distribution as

p(x) = \frac{n(n-1)(n-2)\cdots(n-x+1)}{x!}\left(\frac{\lambda}{n}\right)^x \left(1-\frac{\lambda}{n}\right)^{n-x}
     = \frac{\lambda^x}{x!}\,(1)\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{x-1}{n}\right)\left(1-\frac{\lambda}{n}\right)^{n-x}

Let n → ∞ and p → 0 so that λ = np remains constant. The terms (1 − 1/n), (1 − 2/n), ..., (1 − (x − 1)/n), and (1 − λ/n)^{-x} all approach unity. Furthermore,

\left(1-\frac{\lambda}{n}\right)^n \to e^{-\lambda} \text{ as } n \to \infty

Thus, upon substitution we see that the limiting form of the binomial distribution is

p(x) = \frac{\lambda^x e^{-\lambda}}{x!}

which is the Poisson distribution.
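As a quick numerical illustration of this limit (an addition to the text, not part of the original derivation), the short Python sketch below compares the binomial pmf with p = λ/n to the Poisson pmf as n grows; the value λ = 2 is an arbitrary choice.

    from math import comb, exp, factorial

    lam = 2.0  # arbitrary choice of the Poisson mean, for illustration only

    def binomial_pmf(x, n, p):
        # binomial probability mass function
        return comb(n, x) * p**x * (1 - p)**(n - x)

    def poisson_pmf(x, lam):
        # Poisson probability mass function
        return lam**x * exp(-lam) / factorial(x)

    for n in (10, 100, 1000):
        p = lam / n
        # maximum absolute difference over x = 0, 1, ..., 10
        err = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
        print(f"n = {n:5d}: max |binomial - Poisson| = {err:.6f}")

The printed differences shrink toward zero as n increases, which is exactly the limiting behavior derived above.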


Development of the Poisson Distribution from the Poisson Process
Consider a collection of time-oriented events, arbitrarily called arrivals or births. Let xt be the number of these arrivals or births that occur in the interval [0, t). Note that the range space of xt is R = {0, 1, ...}. Assume that the numbers of births during non-overlapping time intervals are independent random variables, and that there is a positive constant λ such that for any small time interval Δt, the following statements are true:
1. The probability that exactly one birth will occur in an interval of length Δt is λΔt.
2. The probability that zero births will occur in the interval is 1 − λΔt.
3. The probability that more than one birth will occur in the interval is zero.
The parameter λ is often called the mean arrival rate or the mean birth rate. This type of process, in which the probability of observing exactly one event in a small interval of time is constant (or the probability of occurrence of an event is directly proportional to the length of the time interval), and the occurrence of events in non-overlapping time intervals is independent, is called a Poisson process.
In the following, let

P\{x_t = x\} = p_x(t), \quad x = 0, 1, 2, \ldots

Suppose that there have been no births up to time t. The probability that there are no births at the end of time t + Δt is

p_0(t + \Delta t) = (1 - \lambda \Delta t)\,p_0(t)

Note that

\frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} = -\lambda\,p_0(t)

so consequently

\lim_{\Delta t \to 0} \frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} = p_0'(t) = -\lambda\,p_0(t)

For x > 0 births at the end of time t + Δt we have

p_x(t + \Delta t) = p_{x-1}(t)\,\lambda \Delta t + (1 - \lambda \Delta t)\,p_x(t)

and

\lim_{\Delta t \to 0} \frac{p_x(t + \Delta t) - p_x(t)}{\Delta t} = p_x'(t) = \lambda\,p_{x-1}(t) - \lambda\,p_x(t)

Thus we have a system of differential equations that describes the arrivals or births:

p_0'(t) = -\lambda\,p_0(t) \quad \text{for } x = 0
p_x'(t) = \lambda\,p_{x-1}(t) - \lambda\,p_x(t) \quad \text{for } x = 1, 2, \ldots

The solution to this set of equations is

p_x(t) = \frac{(\lambda t)^x e^{-\lambda t}}{x!}, \quad x = 0, 1, 2, \ldots

Obviously, for a fixed value of t this is the Poisson distribution.
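A simulation offers another way to see this connection (this sketch is an addition, not part of the original text): generate exponential inter-arrival times with rate λ, count the arrivals in [0, t), and compare the observed mean and variance of the counts with λt.

    import numpy as np

    rng = np.random.default_rng(1)
    lam, t, reps = 3.0, 2.0, 100_000  # arbitrary rate, interval length, and replications

    counts = np.empty(reps, dtype=int)
    for k in range(reps):
        # accumulate exponential inter-arrival times until the interval [0, t) is exceeded
        arrivals, time = 0, 0.0
        while True:
            time += rng.exponential(1.0 / lam)
            if time >= t:
                break
            arrivals += 1
        counts[k] = arrivals

    # For a Poisson process the count in [0, t) is Poisson(lambda*t),
    # so its mean and variance should both be close to lambda*t = 6.
    print("mean =", counts.mean(), " variance =", counts.var(), " lambda*t =", lam * t)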


S3.3. The Mean and Variance of the Normal Distribution
In Section 3.3.1 we introduce the normal distribution, with probability density function

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2\sigma^2}(x-\mu)^2}, \quad -\infty < x < \infty

and we stated that μ and σ² are the mean and variance, respectively, of the distribution. We now show that this claim is correct.
Note that f(x) ≥ 0. We first evaluate the integral I = \int_{-\infty}^{\infty} f(x)\,dx, showing that it is equal to 1. In the integral, change the variable of integration to z = (x − μ)/σ. Then

I = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}\,dz

Since I ≥ 0, if I² = 1 then I = 1. Now we may write

I^2 = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-x^2/2}\,dx \int_{-\infty}^{\infty} e^{-y^2/2}\,dy
    = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy

If we switch to polar coordinates, then x = r cos θ, y = r sin θ, and

I^2 = \frac{1}{2\pi}\int_0^{2\pi}\int_0^{\infty} e^{-r^2/2}\,r\,dr\,d\theta = \frac{1}{2\pi}\int_0^{2\pi} d\theta = 1

So we have shown that f(x) has the properties of a probability density function.
The integrand obtained by the substitution z = (x − μ)/σ is, of course, the standard normal distribution, an important special case of the more general normal distribution. The standard normal probability density function has a special notation, namely

\phi(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}, \quad -\infty < z < \infty

and the cumulative standard normal distribution is

\Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt

Several useful properties of the standard normal distribution can be found by basic calculus:
1. \phi(-z) = \phi(z) for all real z, so \phi(z) is an even function (symmetric about 0) of z
2. \phi'(z) = -z\,\phi(z)
3. \phi''(z) = (z^2 - 1)\,\phi(z)
Consequently, \phi(z) has a unique maximum at z = 0, inflection points at z = ±1, and both \phi(z) \to 0 and \phi'(z) \to 0 as z \to \pm\infty.
The mean and variance of the standard normal distribution are found as follows:

E(z) = \int_{-\infty}^{\infty} z\,\phi(z)\,dz = \int_{-\infty}^{\infty} -\phi'(z)\,dz = -\phi(z)\Big|_{-\infty}^{\infty} = 0

and

E(z^2) = \int_{-\infty}^{\infty} z^2\phi(z)\,dz = \int_{-\infty}^{\infty} [\phi(z) + \phi''(z)]\,dz = \int_{-\infty}^{\infty} \phi(z)\,dz + \phi'(z)\Big|_{-\infty}^{\infty} = 1 + 0 = 1

Because the variance of a random variable can be expressed in terms of expectation as σ² = E(z − μ)² = E(z²) − μ², we have shown that the mean and variance of the standard normal distribution are 0 and 1, respectively.
Now consider the case where x follows the more general normal distribution. Using the substitution z = (x − μ)/σ, we have

E(x) = \int_{-\infty}^{\infty} x\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx
     = \int_{-\infty}^{\infty} (\mu + \sigma z)\,\phi(z)\,dz
     = \mu\int_{-\infty}^{\infty} \phi(z)\,dz + \sigma\int_{-\infty}^{\infty} z\,\phi(z)\,dz
     = \mu(1) + \sigma(0) = \mu

and

E(x^2) = \int_{-\infty}^{\infty} x^2\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx
       = \int_{-\infty}^{\infty} (\mu + \sigma z)^2\,\phi(z)\,dz
       = \mu^2\int_{-\infty}^{\infty} \phi(z)\,dz + 2\mu\sigma\int_{-\infty}^{\infty} z\,\phi(z)\,dz + \sigma^2\int_{-\infty}^{\infty} z^2\phi(z)\,dz
       = \mu^2 + \sigma^2

Therefore, it follows that V(x) = E(x²) − μ² = (μ² + σ²) − μ² = σ².

S3.4. More about the Lognormal Distribution


The lognormal distribution is a general distribution of wide applicability. The lognormal distribution is defined only for positive values of the random variable x, and the probability density function is

f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\,\exp\!\left[-\frac{(\ln x - \mu)^2}{2\sigma^2}\right], \quad x > 0

The parameters of the lognormal distribution are −∞ < μ < ∞ and σ² > 0. The lognormal random variable is related to the normal random variable in that y = ln x is normally distributed with mean μ and variance σ².

The mean and variance of the lognormal distribution are

E(x) = \mu_x = e^{\mu + \frac{1}{2}\sigma^2}
V(x) = \sigma_x^2 = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1)

The median and mode of the lognormal distribution are

\tilde{x} = e^{\mu} \quad \text{(median)}
mo = e^{\mu - \sigma^2} \quad \text{(mode)}

In general, the kth origin moment of the lognormal random variable is

E(x^k) = e^{k\mu + \frac{1}{2}k^2\sigma^2}
Like the gamma and Weibull distributions, the lognormal finds application in reliability engineering, often as a model for the survival time of components or systems. Some important properties of the lognormal distribution are:
1. If x1 and x2 are independent lognormal random variables with parameters (μ1, σ1²) and (μ2, σ2²), respectively, then y = x1x2 is a lognormal random variable with parameters μ1 + μ2 and σ1² + σ2².
2. If x1, x2, ..., xk are independently and identically distributed lognormal random variables with parameters μ and σ², then the geometric mean of the xi, or (\prod_{i=1}^{k} x_i)^{1/k}, has a lognormal distribution with parameters μ and σ²/k.
3. If x is a lognormal random variable with parameters μ and σ², and if a, b, and c are constants such that b = e^c, then the random variable y = bx^a has a lognormal distribution with parameters c + aμ and a²σ².
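The moment formulas above are easy to check by simulation. The following minimal sketch (an addition to the text, with arbitrarily chosen μ = 1 and σ = 0.5) compares sample moments of exponentiated normal draws with the theoretical E(x), V(x), and median.

    import numpy as np

    rng = np.random.default_rng(7)
    mu, sigma = 1.0, 0.5                                 # arbitrary lognormal parameters
    x = np.exp(rng.normal(mu, sigma, size=1_000_000))    # y = ln x is N(mu, sigma^2)

    mean_theory = np.exp(mu + 0.5 * sigma**2)
    var_theory = np.exp(2 * mu + sigma**2) * (np.exp(sigma**2) - 1)

    print("sample mean     =", x.mean(),     " theory =", mean_theory)
    print("sample variance =", x.var(),      " theory =", var_theory)
    print("sample median   =", np.median(x), " theory =", np.exp(mu))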

S3.5. More about the Gamma Distribution


The gamma distribution is introduced in Section 3.3.4. The gamma probability density function is

f(x) = \frac{\lambda}{\Gamma(r)}\,(\lambda x)^{r-1} e^{-\lambda x}, \quad x \ge 0

where r > 0 is a shape parameter and λ > 0 is a scale parameter. The parameter r is called a shape parameter because it determines the basic shape of the graph of the density function. For example, if r = 1, the gamma distribution reduces to an exponential distribution. There are actually three basic shapes: r < 1, or hyperexponential; r = 1, or exponential; and r > 1, or unimodal with right skew.
The cumulative distribution function of the gamma is

F(x; r, \lambda) = \int_0^x \frac{\lambda}{\Gamma(r)}\,(\lambda t)^{r-1} e^{-\lambda t}\,dt

The substitution u = λt in this integral results in F(x; r, λ) = F(λx; r, 1), which depends on λ only through the product λx. We typically call such a parameter a scale parameter. It can be important to have a scale parameter in a probability distribution so that the results do not depend on the scale of measurement actually used. For example, suppose that we are measuring time in months and λ = 1/6 per month. The probability that x is less than or equal to 12 months is F(12/6; r, 1) = F(2; r, 1). If we wish to consider measuring time in weeks, then λ = 1/24 per week, and the probability that x is less than or equal to 48 weeks is just F(48/24; r, 1) = F(2; r, 1). Therefore, different scales of measurement can be accommodated by changing the scale parameter without having to change to a more general form of the distribution.
When r is an integer, the gamma distribution is sometimes called the Erlang distribution. Another special case of the gamma distribution arises when we let r = 1/2, 1, 3/2, 2, ... and λ = 1/2; this is the chi-square distribution with r/λ = 1, 2, ... degrees of freedom. The chi-square distribution is very important in statistical inference.
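The scale-parameter argument can be verified numerically. The sketch below is an addition to the text; it uses scipy.stats.gamma (parameterized by a shape a and a scale equal to 1/λ) to check that F(x; r, λ) = F(λx; r, 1) for the months/weeks example, with an arbitrary shape r = 2.5.

    from scipy.stats import gamma

    r = 2.5                    # arbitrary shape parameter for the illustration
    lam_month = 1.0 / 6.0      # rate per month, so the scale is 1/lambda = 6 months
    lam_week = 1.0 / 24.0      # the same rate expressed per week (scale = 24 weeks)

    p_months = gamma.cdf(12, r, scale=1.0 / lam_month)  # P(x <= 12 months)
    p_weeks  = gamma.cdf(48, r, scale=1.0 / lam_week)   # P(x <= 48 weeks)
    p_unit   = gamma.cdf(2.0, r, scale=1.0)             # F(lambda*x; r, 1) = F(2; r, 1)

    print(p_months, p_weeks, p_unit)   # all three probabilities agree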

S3.6. The Failure Rate for the Exponential Distribution


The exponential distribution

f(x) = \lambda e^{-\lambda x}, \quad x \ge 0

was introduced in Section 3.3.3 of the text. The exponential distribution is frequently used in reliability engineering as a model for the lifetime or time to failure of a component or system. Generally, we define the reliability function of the unit as

R(t) = P\{x > t\} = 1 - \int_0^t f(x)\,dx = 1 - F(t)

where, of course, F(t) is the cumulative distribution function. In biomedical applications, the reliability function is usually called the survival function. For the exponential distribution, the reliability function is

R(t) = e^{-\lambda t}
The Hazard Function
The mean and variance of a distribution are quite important in reliability applications, but an additional
property called the hazard function or the instantaneous failure rate is also useful. The hazard function is the
conditional density function of failure at time t, given that the unit has survived until time t. Therefore, letting
X denote the random variable and x denote the realization,

h(x) = \lim_{\Delta x \to 0} \frac{P\{x < X \le x + \Delta x \mid X > x\}}{\Delta x}
     = \lim_{\Delta x \to 0} \frac{P\{x < X \le x + \Delta x\}}{\Delta x\,P\{X > x\}}
     = \lim_{\Delta x \to 0} \frac{F(x + \Delta x) - F(x)}{\Delta x\,[1 - F(x)]}
     = \frac{f(x)}{1 - F(x)}

It turns out that specifying a hazard function completely determines the cumulative distribution function (and vice versa).
The Hazard Function for the Exponential Distribution
For the exponential distribution, the hazard function is

h(x) = \frac{f(x)}{1 - F(x)} = \frac{\lambda e^{-\lambda x}}{e^{-\lambda x}} = \lambda

That is, the hazard function for the exponential distribution is constant, and the failure rate is just the reciprocal of the mean time to failure 1/λ.
A constant failure rate implies that the reliability of the unit at time t does not depend on its age. This may be a reasonable assumption for some types of units, such as electrical components, but it's probably unreasonable for mechanical components. It is probably not a good assumption for many types of system-level products that are made up of many components (such as an automobile). Generally, an increasing hazard function indicates that the unit is more likely to fail in the next increment of time than it would have been in an earlier increment of time of the same length. This is likely due to aging or wear.
Despite the apparent simplicity of its hazard function, the exponential distribution has been an important distribution in reliability engineering. This is partly because the constant failure rate assumption is probably not unreasonable over some region of the unit's life.
S3.7. The Failure Rate for the Weibull Distribution
The instantaneous failure rate or the hazard function was defined in Section S3.6 of the Supplemental Text
Material. For the Weibull distribution, the hazard function is

h(x) = \frac{f(x)}{1 - F(x)} = \frac{(\beta/\theta)(x/\theta)^{\beta-1}\,e^{-(x/\theta)^\beta}}{e^{-(x/\theta)^\beta}} = \frac{\beta}{\theta}\left(\frac{x}{\theta}\right)^{\beta-1}

Note that if β = 1 the Weibull hazard function is constant. This should be no surprise, since for β = 1 the Weibull distribution reduces to the exponential. When β > 1, the Weibull hazard function increases with x, approaching infinity as x → ∞. Consequently, the Weibull is a fairly common choice as a model for components or systems that experience deterioration due to wear-out or fatigue. For the case where β < 1, the Weibull hazard function decreases with x, approaching 0 as x → ∞.
For comparison purposes, note that the hazard function for the gamma distribution with parameters r and λ is also constant for the case r = 1 (the gamma also reduces to the exponential when r = 1). When r > 1 the gamma hazard function increases, and when r < 1 it decreases. However, when r > 1 the gamma hazard function approaches λ from below, while if r < 1 it approaches λ from above. Therefore, even though the graphs of the gamma and Weibull densities can look very similar, and both distributions can produce reasonable fits to the same sample of data, they clearly have very different characteristics in terms of describing survival or reliability data.
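To make this comparison concrete, the short sketch below (an addition to the text) evaluates h(x) = f(x)/[1 − F(x)] numerically for a Weibull and a gamma distribution, both with shape greater than 1 (arbitrary parameter choices): the Weibull hazard grows without bound while the gamma hazard levels off near λ.

    import numpy as np
    from scipy.stats import weibull_min, gamma

    beta, theta = 2.0, 1.0   # Weibull shape and scale (arbitrary, beta > 1)
    r, lam = 2.0, 1.0        # gamma shape and rate (arbitrary, r > 1)

    x = np.array([0.5, 1.0, 2.0, 3.0, 5.0])

    # hazard = pdf / survival function for each distribution
    h_weibull = weibull_min.pdf(x, beta, scale=theta) / weibull_min.sf(x, beta, scale=theta)
    h_gamma = gamma.pdf(x, r, scale=1 / lam) / gamma.sf(x, r, scale=1 / lam)

    print("x        :", x)
    print("Weibull h:", np.round(h_weibull, 3))  # increases without bound (here h = 2x)
    print("gamma h  :", np.round(h_gamma, 3))    # approaches lambda = 1 from below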


Supplemental Material for Chapter 4


S4.1. Random Samples
To properly apply many statistical techniques, the sample drawn from the population of interest must be a
random sample. To properly define a random sample, let x be a random variable that represents the results
of selecting one observation from the population of interest. Let f ( x ) be the probability distribution of x.
Now suppose that n observations (a sample) are obtained independently from the population under unchanging
conditions. That is, we do not let the outcome from one observation influence the outcome from another
observation. Let xi be the random variable that represents the observation obtained on the ith trial. Then the
observations x1 , x2 ,..., xn are a random sample.
In a random sample the marginal probability distributions f(x1), f(x2), ..., f(xn) are all identical, the observations in the sample are independent, and, by definition, the joint probability distribution of the random sample is f(x1, x2, ..., xn) = f(x1)f(x2)···f(xn).

S4.2. Expected Value and Variance Operators


Readers should have prior exposure to mathematical expectation from a basic statistics course. Here
some of the basic properties of expectation are reviewed.
The expected value of a random variable x is denoted by E(x) and is given by

E(x) = \begin{cases} \sum_{\text{all } x_i} x_i\,p(x_i), & x \text{ a discrete random variable} \\ \int_{-\infty}^{\infty} x f(x)\,dx, & x \text{ a continuous random variable} \end{cases}

The expectation of a random variable is very useful in that it provides a straightforward characterization of the distribution, and it has a simple practical interpretation as the center of mass, centroid, or mean of the distribution.
Now suppose that y is a function of the random variable x, say y = h(x). Note that y is also a random variable. The expectation of h(x) is defined as

E[h(x)] = \begin{cases} \sum_{\text{all } x_i} h(x_i)\,p(x_i), & x \text{ a discrete random variable} \\ \int_{-\infty}^{\infty} h(x) f(x)\,dx, & x \text{ a continuous random variable} \end{cases}

An interesting result, sometimes called the theorem of the unconscious statistician, states that if x is a continuous random variable with probability density function f(x) and y = h(x) is a function of x having probability density function g(y), then the expectation of y can be found either directly from the definition of expectation applied to g(y), or as the expectation of the function h(x) taken with respect to the probability density function of x. That is, we may write either

E(y) = \int_{-\infty}^{\infty} y\,g(y)\,dy

or

E(y) = E[h(x)] = \int_{-\infty}^{\infty} h(x) f(x)\,dx

The name for this theorem comes from the fact that we often apply it without consciously thinking
about whether the theorem is true in our particular case.
Useful Properties of Expectation I:
Let x be a random variable with mean μ, and let c be a constant. Then
1. E(c) = c
2. E(x) = μ
3. E(cx) = cE(x) = cμ
4. E[ch(x)] = cE[h(x)]
5. If c1 and c2 are constants and h1 and h2 are functions, then

E[c_1 h_1(x) + c_2 h_2(x)] = c_1 E[h_1(x)] + c_2 E[h_2(x)]

Because of property 5, expectation is called a linear (or distributive) operator.
Now consider the function h(x) = (x − c)², where c is a constant, and suppose that E[(x − c)²] exists. To find the value of c for which E[(x − c)²] is a minimum, write

E[(x - c)^2] = E[x^2 - 2xc + c^2] = E(x^2) - 2cE(x) + c^2

Now the derivative of E[(x − c)²] with respect to c is −2E(x) + 2c, and this derivative is zero when c = E(x). Therefore, E[(x − c)²] is a minimum when c = E(x).

The variance of the random variable x is defined as

V(x) = E[(x - \mu)^2] = \sigma^2

and we usually call V(·) = E[(x − μ)²] the variance operator. It is straightforward to show that if c is a constant, then

V(cx) = c^2\sigma^2

The variance is analogous to the moment of inertia in mechanics.
Useful Properties of Expectation II:
Let x1 and x2 be random variables with means μ1 and μ2 and variances σ1² and σ2², respectively, and let c1 and c2 be constants. Then
1. E(x1 + x2) = μ1 + μ2
2. It is possible to show that V(x1 + x2) = σ1² + σ2² + 2Cov(x1, x2), where

Cov(x_1, x_2) = E[(x_1 - \mu_1)(x_2 - \mu_2)]

is the covariance of the random variables x1 and x2. The covariance is a measure of the linear association between x1 and x2. More specifically, we may show that if x1 and x2 are independent, then Cov(x1, x2) = 0.
3. V(x1 − x2) = σ1² + σ2² − 2Cov(x1, x2)
4. If the random variables x1 and x2 are independent, V(x1 ± x2) = σ1² + σ2²
5. If the random variables x1 and x2 are independent, E(x1x2) = E(x1)E(x2) = μ1μ2
6. Regardless of whether x1 and x2 are independent, in general

E\!\left(\frac{x_1}{x_2}\right) \neq \frac{E(x_1)}{E(x_2)}

7. For the single random variable x,

V(x + x) = 4\sigma^2

because Cov(x, x) = σ².

Moments
Although we do not make much use of the notion of the moments of a random variable in the book, for completeness we give the definition. Let the function of the random variable x be

h(x) = x^k

where k is a positive integer. Then the expectation of h(x) = x^k is called the kth moment about the origin of the random variable x and is given by

E(x^k) = \begin{cases} \sum_{\text{all } x_i} x_i^k\,p(x_i), & x \text{ a discrete random variable} \\ \int_{-\infty}^{\infty} x^k f(x)\,dx, & x \text{ a continuous random variable} \end{cases}

Note that the first origin moment is just the mean μ of the random variable x. The second origin moment is

E(x^2) = \mu^2 + \sigma^2

Moments about the mean are defined as

E[(x - \mu)^k] = \begin{cases} \sum_{\text{all } x_i} (x_i - \mu)^k\,p(x_i), & x \text{ a discrete random variable} \\ \int_{-\infty}^{\infty} (x - \mu)^k f(x)\,dx, & x \text{ a continuous random variable} \end{cases}

The second moment about the mean is the variance σ² of the random variable x.
S4.3. Proof That E(x̄) = μ and E(s²) = σ²
It is easy to show that the sample average x̄ and the sample variance s² are unbiased estimators of the corresponding population parameters μ and σ², respectively. Suppose that the random variable x has mean μ and variance σ², and that x1, x2, ..., xn is a random sample of size n from the population. Then

E(\bar{x}) = E\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \mu

because the expected value of each observation in the sample is E(xi) = μ. Now consider

E(s^2) = E\!\left[\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}\right] = \frac{1}{n-1}\,E\!\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right]

It is convenient to write \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2, and so

E\!\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right] = \sum_{i=1}^{n} E(x_i^2) - nE(\bar{x}^2)

Now E(x_i^2) = \mu^2 + \sigma^2 and E(\bar{x}^2) = \mu^2 + \sigma^2/n. Therefore

E(s^2) = \frac{1}{n-1}\left[\sum_{i=1}^{n}(\mu^2 + \sigma^2) - n(\mu^2 + \sigma^2/n)\right]
       = \frac{1}{n-1}\left[n\mu^2 + n\sigma^2 - n\mu^2 - \sigma^2\right]
       = \frac{(n-1)\sigma^2}{n-1}
       = \sigma^2
Note that:
a. These results do not depend on the form of the distribution of the random variable x. Many people think that an assumption of normality is required, but this is unnecessary.
b. Even though E(s²) = σ², the sample standard deviation s is not an unbiased estimator of the population standard deviation σ. This is discussed more fully in Section S4.5.
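A small simulation (added here as an illustration, not part of the original text) makes both notes concrete: the average of many sample variances is close to σ² even for a markedly non-normal population, while the average sample standard deviation falls short of σ for normal data.

    import numpy as np

    rng = np.random.default_rng(42)
    n, reps = 5, 200_000

    # (a) exponential population with mean 1, so sigma^2 = 1: E(s^2) is still sigma^2
    samples = rng.exponential(1.0, size=(reps, n))
    print("non-normal population, average s^2 =", samples.var(axis=1, ddof=1).mean())

    # (b) normal population with sigma = 1: s is biased low (E(s) = c4*sigma < sigma)
    samples = rng.normal(0.0, 1.0, size=(reps, n))
    print("normal population,     average s   =", samples.std(axis=1, ddof=1).mean())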


S4.4. More About Parameter Estimation


Throughout the book estimators of various population or process parameters are given without much
discussion concerning how these estimators are generated. Often they are simply logical or
intuitive estimators, such as using the sample average x as an estimator of the population mean .
There are methods for developing point estimators of population parameters. These methods are
typically discussed in detail in courses in mathematical statistics. We now give a brief overview of
some of these methods.
The Method of Maximum Likelihood
One of the best methods for obtaining a point estimator of a population parameter is the method of
maximum likelihood. Suppose that x is a random variable with probability distribution f(x; θ), where θ is a single unknown parameter. Let x1, x2, ..., xn be the observations in a random sample of size n. Then the likelihood function of the sample is

L(\theta) = f(x_1; \theta)\,f(x_2; \theta) \cdots f(x_n; \theta)

The maximum likelihood estimator of θ is the value of θ that maximizes the likelihood function L(θ).
Example 1 The Exponential Distribution
To illustrate the maximum likelihood estimation procedure, let x be exponentially distributed with parameter λ. The likelihood function of a random sample of size n, say x1, x2, ..., xn, is

L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^n e^{-\lambda \sum_{i=1}^{n} x_i}

Now it turns out that, in general, if the maximum likelihood estimator maximizes L(λ), it will also maximize the log likelihood, ln L(λ). For the exponential distribution, the log likelihood is

\ln L(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i

Now

\frac{d \ln L(\lambda)}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i

Equating the derivative to zero and solving for the estimator of λ, we obtain

\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}

Thus the maximum likelihood estimator (or the MLE) of λ is the reciprocal of the sample average.
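The result λ̂ = 1/x̄ is easy to confirm numerically. The sketch below (an addition, not part of the original example) evaluates the exponential log likelihood on a grid for simulated data and shows that the grid maximizer agrees with the reciprocal of the sample average; the true rate of 2 is an arbitrary choice.

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.exponential(scale=1.0 / 2.0, size=200)   # simulated data with true lambda = 2

    def log_likelihood(lam, x):
        # ln L(lambda) = n*ln(lambda) - lambda*sum(x_i)
        return len(x) * np.log(lam) - lam * x.sum()

    grid = np.linspace(0.1, 5.0, 2000)
    lam_grid = grid[np.argmax(log_likelihood(grid, x))]

    print("grid maximizer     =", lam_grid)
    print("1 / sample average =", 1.0 / x.mean())    # the closed-form MLE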
Maximum likelihood estimation can be used in situations where there are several unknown parameters, say θ1, θ2, ..., θp, to be estimated. The maximum likelihood estimators are found simply by equating the p first partial derivatives ∂L(θ1, θ2, ..., θp)/∂θi, i = 1, 2, ..., p, of the likelihood (or the log likelihood) to zero and solving the resulting system of equations.
Example 2 The Normal Distribution
Let x be normally distributed with the parameters μ and σ² unknown. The likelihood function of a random sample of size n is

L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2\sigma^2}(x_i - \mu)^2}
                 = \frac{1}{(2\pi\sigma^2)^{n/2}}\,e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2}

The log-likelihood function is

\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2

Now

\frac{\partial \ln L(\mu, \sigma^2)}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu)

\frac{\partial \ln L(\mu, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2

The solution to these equations yields the MLEs

\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}

\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2

Generally, we like the method of maximum likelihood because when n is large, (1) it results in
estimators that are approximately unbiased, (2) the variance of a MLE is as small as or nearly as small
as the variance that could be obtained with any other estimation technique, and (3) MLEs are
approximately normally distributed. Furthermore, the MLE has an invariance property; that is, if θ̂ is the MLE of θ, then the MLE of a function of θ, say h(θ), is the same function h(θ̂) of the MLE θ̂. There are also some other nice statistical properties that MLEs enjoy; see a book on mathematical statistics, such as Hogg and Craig (1978) or Bain and Engelhardt (1987).
The unbiased property of the MLE is a large-sample or asymptotic property. To illustrate, consider the MLE for σ² in the normal distribution of Example 2 above. We can easily show that

E(\hat{\sigma}^2) = \frac{n-1}{n}\,\sigma^2

Now the bias in estimation of σ² is

E(\hat{\sigma}^2) - \sigma^2 = \frac{n-1}{n}\,\sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}

Notice that the bias in estimating σ² goes to zero as the sample size n → ∞. Therefore, the MLE is an asymptotically unbiased estimator.

The Method of Moments

Estimation by the method of moments involves equating the origin moments of the probability distribution (which are functions of the unknown parameters) to the sample moments, and solving for the unknown parameters. We can define the first p sample moments as

M_k = \frac{1}{n}\sum_{i=1}^{n} x_i^k, \quad k = 1, 2, \ldots, p

and the first p moments about the origin of the random variable x are just

\mu_k' = E(x^k), \quad k = 1, 2, \ldots, p

Example 3 The Normal Distribution
For the normal distribution the first two origin moments are

\mu_1' = \mu
\mu_2' = \mu^2 + \sigma^2

and the first two sample moments are

M_1 = \bar{x}
M_2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2

Equating the sample and origin moments results in

\mu = \bar{x}
\mu^2 + \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2

The solution gives the moment estimators of μ and σ²:

\hat{\mu} = \bar{x}
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2

The method of moments often yields estimators that are reasonably good. For example, in the above example the moment estimators are identical to the MLEs. However, moment estimators are generally not as good as MLEs because they do not have such nice statistical properties. For example, moment estimators usually have larger variances than MLEs.
Least Squares Estimation
The method of least squares is one of the oldest and most widely used methods of parameter
estimation. Section 4.6 gives an introduction to least squares for fitting regression models. Unlike
the method of maximum likelihood and the method of moments, least squares can be employed when
the distribution of the random variable is unknown.
To illustrate, suppose that the random variable x can be described by the simple location model

x_i = \mu + \varepsilon_i, \quad i = 1, 2, \ldots, n

where the parameter μ is unknown and the εi are random errors. We don't know the distribution of the errors, but we can assume that they have mean zero and constant variance. The least squares estimator of μ is chosen so that the sum of squares of the model errors εi is minimized. The least squares function for a sample of n observations x1, x2, ..., xn is

L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (x_i - \mu)^2

Differentiating L and equating the derivative to zero results in the least squares estimator of μ:

\hat{\mu} = \bar{x}

In general, the least squares function will contain p unknown parameters, and L will be minimized by solving the equations that result when the first partial derivatives of L with respect to the unknown parameters are equated to zero. These equations are called the least squares normal equations. See Section 4.6 in the textbook.
The method of least squares dates from work by Gauss in the early 1800s. It has a very well-developed and indeed quite elegant theory. For a discussion of the use of least squares in estimating the parameters in regression models and many illustrative examples, see Section 4.6 and Montgomery, Peck and Vining (2007); for a very readable and concise presentation of the theory, see Myers and Milton (1991).
S4.5. Proof That E(s) = c₄σ
In Section S4.3 of the Supplemental Text Material we showed that the sample variance is an unbiased estimator of the population variance; that is, E(s²) = σ², and that this result does not depend on the form of the distribution. However, the sample standard deviation is not an unbiased estimator of the population standard deviation. This is easy to demonstrate for the case where the random variable x follows a normal distribution.
Let x have a normal distribution with mean μ and variance σ², and let x1, x2, ..., xn be a random sample of size n from the population. Now the distribution of

\frac{(n-1)s^2}{\sigma^2}

is chi-square with n − 1 degrees of freedom, denoted \chi^2_{n-1}. Therefore the distribution of s² is σ²/(n − 1) times a \chi^2_{n-1} random variable. So when sampling from a normal distribution, the expected value of s² is

E(s^2) = E\!\left(\frac{\sigma^2}{n-1}\,\chi^2_{n-1}\right) = \frac{\sigma^2}{n-1}\,E(\chi^2_{n-1}) = \frac{\sigma^2}{n-1}\,(n-1) = \sigma^2

because the mean of a chi-square random variable with n − 1 degrees of freedom is n − 1. Now it follows that the distribution of

\frac{\sqrt{n-1}\,s}{\sigma}

is a chi distribution with n − 1 degrees of freedom, denoted \chi_{n-1}. The expected value of s can be written as

E(s) = E\!\left(\frac{\sigma}{\sqrt{n-1}}\,\chi_{n-1}\right) = \frac{\sigma}{\sqrt{n-1}}\,E(\chi_{n-1})

The mean of the chi distribution with n − 1 degrees of freedom is

E(\chi_{n-1}) = \sqrt{2}\,\frac{\Gamma(n/2)}{\Gamma[(n-1)/2]}

where the gamma function is \Gamma(r) = \int_0^{\infty} y^{r-1} e^{-y}\,dy. Then

E(s) = \sqrt{\frac{2}{n-1}}\,\frac{\Gamma(n/2)}{\Gamma[(n-1)/2]}\,\sigma = c_4\sigma

The constant c4 is given in Appendix Table VI.
While s is a biased estimator of σ, the bias gets small fairly quickly as the sample size n increases. From Appendix Table VI, note that c4 = 0.9400 for a sample of n = 5, c4 = 0.9727 for a sample of n = 10, and c4 = 0.9896, or very nearly unity, for a sample of n = 25.
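The constant c4 can be computed directly from the gamma-function expression above; the short sketch below (an addition to the text) reproduces the values just quoted from Appendix Table VI.

    from math import gamma, sqrt

    def c4(n):
        # c4 = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2)
        return sqrt(2.0 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

    for n in (5, 10, 25):
        print(f"n = {n:2d}: c4 = {c4(n):.4f}")   # 0.9400, 0.9727, 0.9896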

S4.6. More about Checking Assumptions in the t-Test


The two-sample t-test can be presented from the viewpoint of a simple linear regression model.
This is a very instructive way to think about the t-test, as it fits in nicely with the general notion of a
factorial experiment with factors at two levels. This type of experiment is very important in process
development and improvement, and is discussed extensively in Chapter 13. This also leads to another
way to check assumptions in the t-test. This method is equivalent to the normal probability plotting
of the original data discussed in Chapter 4.
We will use the data on the two catalysts in Example 4.9 to illustrate. In the two-sample t-test
scenario, we have a factor x with two levels, which we can arbitrarily call low and high. We will
use x = -1 to denote the low level of this factor (Catalyst 1) and x = +1 to denote the high level of this
factor (Catalyst 2). The figure below is a scatter plot (from Minitab) of the yield data resulting from
using the two catalysts shown in Table 4.2 of the textbook.

[Figure: Scatterplot of Yield versus Catalyst, with Catalyst coded −1.0 to +1.0 on the horizontal axis and Yield (approximately 89 to 98) on the vertical axis.]

We will fit a simple linear regression model to these data, say

y_{ij} = \beta_0 + \beta_1 x_{ij} + \varepsilon_{ij}

where β0 and β1 are the intercept and slope, respectively, of the regression line, and the regressor or predictor variable is x_{1j} = −1 and x_{2j} = +1. The method of least squares can be used to estimate the slope and intercept in this model. Assuming that we have equal sample sizes n for each factor level, the least squares normal equations are:

2n\hat{\beta}_0 = \sum_{i=1}^{2}\sum_{j=1}^{n} y_{ij}

2n\hat{\beta}_1 = \sum_{j=1}^{n} y_{2j} - \sum_{j=1}^{n} y_{1j}

The solution to these equations is

\hat{\beta}_0 = \bar{y}

\hat{\beta}_1 = \frac{1}{2}(\bar{y}_2 - \bar{y}_1)

Note that the least squares estimator of the intercept is the average of all the observations from both samples, while the estimator of the slope is one-half of the difference between the sample averages at the high and low levels of the factor x. Below is the output from the linear regression procedure in Minitab for the catalyst data.

Regression Analysis: Yield versus Catalyst

The regression equation is
Yield = 92.5 + 0.239 Catalyst

Predictor      Coef   SE Coef       T      P
Constant    92.4938    0.6752  136.98  0.000
Catalyst     0.2387    0.6752    0.35  0.729

S = 2.70086   R-Sq = 0.9%   R-Sq(adj) = 0.0%

Analysis of Variance

Source           DF       SS     MS     F      P
Regression        1    0.912  0.912  0.13  0.729
Residual Error   14  102.125  7.295
Total            15  103.037

Notice that the estimate of the slope (given in the column labeled Coef and the row labeled Catalyst above) is

\hat{\beta}_1 = \frac{1}{2}(\bar{y}_2 - \bar{y}_1) = \frac{1}{2}(92.7325 - 92.2550) = 0.2387

and the estimate of the intercept is

\hat{\beta}_0 = \frac{1}{2}(\bar{y}_2 + \bar{y}_1) = \frac{1}{2}(92.7325 + 92.2550) = 92.4938

Furthermore, notice that the t-statistic associated with the slope is equal to 0.35, exactly the same value (apart from sign, because we subtracted the averages in the reverse order) we gave in the text. Now in simple linear regression, the t-test on the slope is actually testing the hypotheses

H_0: \beta_1 = 0
H_1: \beta_1 \neq 0

and this is equivalent to testing H0: μ1 = μ2.
It is easy to show that the t-test statistic used for testing that the slope equals zero in simple linear regression is identical to the usual two-sample t-test. Recall that to test the above hypotheses in simple linear regression the t-statistic is

t_0 = \frac{\hat{\beta}_1}{\sqrt{\hat{\sigma}^2 / S_{xx}}}

where S_{xx} = \sum_{i=1}^{2}\sum_{j=1}^{n}(x_{ij} - \bar{x})^2 is the corrected sum of squares of the x's. Now in our specific problem, \bar{x} = 0, x_{1j} = -1, and x_{2j} = +1, so S_{xx} = 2n. Therefore, since we have already observed that the estimate of σ is just sp,

t_0 = \frac{\hat{\beta}_1}{\sqrt{\hat{\sigma}^2 / S_{xx}}} = \frac{\frac{1}{2}(\bar{y}_2 - \bar{y}_1)}{s_p/\sqrt{2n}} = \frac{\bar{y}_2 - \bar{y}_1}{s_p\sqrt{2/n}}

This is the usual two-sample t-test statistic for the case of equal sample sizes.
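This equivalence is easy to verify numerically. The sketch below (an addition to the text) uses the catalyst yield values from the residual listing that follows and computes both the pooled two-sample t-statistic and the regression-based statistic t0 = (ȳ2 − ȳ1)/(sp√(2/n)); both come out at about 0.35.

    import numpy as np
    from scipy import stats

    # yield data for the two catalysts, taken from the residual listing below
    y1 = np.array([91.50, 94.18, 92.18, 95.39, 91.79, 89.07, 94.72, 89.21])  # Catalyst 1 (x = -1)
    y2 = np.array([89.19, 90.95, 90.46, 93.21, 97.19, 97.04, 91.07, 92.75])  # Catalyst 2 (x = +1)
    n = len(y1)

    # pooled standard deviation
    sp = np.sqrt(((n - 1) * y1.var(ddof=1) + (n - 1) * y2.var(ddof=1)) / (2 * n - 2))

    t_regression = (y2.mean() - y1.mean()) / (sp * np.sqrt(2.0 / n))  # slope t-statistic
    t_two_sample, _ = stats.ttest_ind(y2, y1)                          # pooled two-sample t-test

    print("t from regression formula =", round(t_regression, 2))
    print("t from two-sample t-test  =", round(t_two_sample, 2))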
Most regression software packages will also compute a table or listing of the residuals from the model. The
residuals from the Minitab regression model fit obtained above are as follows:
Obs  Catalyst   Yield     Fit  SE Fit  Residual  St Resid
  1     -1.00  91.500  92.255   0.955    -0.755     -0.30
  2     -1.00  94.180  92.255   0.955     1.925      0.76
  3     -1.00  92.180  92.255   0.955    -0.075     -0.03
  4     -1.00  95.390  92.255   0.955     3.135      1.24
  5     -1.00  91.790  92.255   0.955    -0.465     -0.18
  6     -1.00  89.070  92.255   0.955    -3.185     -1.26
  7     -1.00  94.720  92.255   0.955     2.465      0.98
  8     -1.00  89.210  92.255   0.955    -3.045     -1.21
  9      1.00  89.190  92.733   0.955    -3.543     -1.40
 10      1.00  90.950  92.733   0.955    -1.783     -0.71
 11      1.00  90.460  92.733   0.955    -2.273     -0.90
 12      1.00  93.210  92.733   0.955     0.477      0.19
 13      1.00  97.190  92.733   0.955     4.457      1.76
 14      1.00  97.040  92.733   0.955     4.307      1.70
 15      1.00  91.070  92.733   0.955    -1.663     -0.66
 16      1.00  92.750  92.733   0.955     0.017      0.01

The column labeled Fit contains the predicted values of yield from the regression model, which just turn out
to be the averages of the two samples. The residuals are in the sixth column of this table. They are just the
differences between the observed values of yield and the corresponding predicted values. A normal probability
plot of the residuals follows.

[Figure: Normal probability plot of the residuals (response is Yield), with residuals from about −7.5 to 5.0 on the horizontal axis and normal probability (percent) on the vertical axis.]

Notice that the residuals plot approximately along a straight line, indicating that there is no serious
problem with the normality assumption in these data. This is equivalent to plotting the original yield
data on separate probability plots as we did in Chapter 3.
S4.7. Expected Mean Squares in the Single-Factor Analysis of Variance
In section 4.5.2 we give the expected values of the mean squares for treatments and error in the single-factor
analysis of variance (ANOVA). These quantities may be derived by straightforward application of the
expectation operator.
Consider first the mean square for treatments:

E(MS_{\text{Treatments}}) = E\!\left(\frac{SS_{\text{Treatments}}}{a-1}\right)

Now for a balanced design (equal number of observations n in each treatment)

SS_{\text{Treatments}} = \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 - \frac{1}{an}\,y_{..}^2

and the single-factor ANOVA model is

y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n

In addition, we will find the following useful:

E(\varepsilon_{ij}) = E(\varepsilon_{i.}) = E(\varepsilon_{..}) = 0, \quad E(\varepsilon_{ij}^2) = \sigma^2, \quad E(\varepsilon_{i.}^2) = n\sigma^2, \quad E(\varepsilon_{..}^2) = an\sigma^2

Now

E(SS_{\text{Treatments}}) = E\!\left(\frac{1}{n}\sum_{i=1}^{a} y_{i.}^2\right) - E\!\left(\frac{1}{an}\,y_{..}^2\right)

Consider the first term on the right-hand side of the above expression:

E\!\left(\frac{1}{n}\sum_{i=1}^{a} y_{i.}^2\right) = \frac{1}{n}\sum_{i=1}^{a} E(n\mu + n\tau_i + \varepsilon_{i.})^2

Squaring the expression in parentheses and taking expectation results in

E\!\left(\frac{1}{n}\sum_{i=1}^{a} y_{i.}^2\right) = \frac{1}{n}\left[a(n\mu)^2 + n^2\sum_{i=1}^{a}\tau_i^2 + an\sigma^2\right] = an\mu^2 + n\sum_{i=1}^{a}\tau_i^2 + a\sigma^2

because the three cross-product terms are all zero. Now consider the second term on the right-hand side of E(SS_Treatments):

E\!\left(\frac{1}{an}\,y_{..}^2\right) = \frac{1}{an}\,E\!\left(an\mu + n\sum_{i=1}^{a}\tau_i + \varepsilon_{..}\right)^2 = \frac{1}{an}\,E(an\mu + \varepsilon_{..})^2

since \sum_{i=1}^{a}\tau_i = 0. Upon squaring the term in parentheses and taking expectation, we obtain

E\!\left(\frac{1}{an}\,y_{..}^2\right) = \frac{1}{an}\left[(an\mu)^2 + an\sigma^2\right] = an\mu^2 + \sigma^2

since the expected value of the cross-product is zero. Therefore,

E(SS_{\text{Treatments}}) = an\mu^2 + n\sum_{i=1}^{a}\tau_i^2 + a\sigma^2 - (an\mu^2 + \sigma^2) = \sigma^2(a-1) + n\sum_{i=1}^{a}\tau_i^2

Consequently, the expected value of the mean square for treatments is

E(MS_{\text{Treatments}}) = E\!\left(\frac{SS_{\text{Treatments}}}{a-1}\right) = \frac{\sigma^2(a-1) + n\sum_{i=1}^{a}\tau_i^2}{a-1} = \sigma^2 + \frac{n\sum_{i=1}^{a}\tau_i^2}{a-1}

This is the result given in the textbook.


For the error mean square, we obtain

E(MS_E) = E\!\left(\frac{SS_E}{N-a}\right)
        = \frac{1}{N-a}\,E\!\left[\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2\right]
        = \frac{1}{N-a}\,E\!\left[\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2\right]

Substituting the model into this last expression, we obtain

E(MS_E) = \frac{1}{N-a}\,E\!\left[\sum_{i=1}^{a}\sum_{j=1}^{n}(\mu + \tau_i + \varepsilon_{ij})^2 - \frac{1}{n}\sum_{i=1}^{a}\left(\sum_{j=1}^{n}(\mu + \tau_i + \varepsilon_{ij})\right)^2\right]

After squaring and taking expectation, this last equation becomes

E(MS_E) = \frac{1}{N-a}\left[N\mu^2 + n\sum_{i=1}^{a}\tau_i^2 + N\sigma^2 - \left(N\mu^2 + n\sum_{i=1}^{a}\tau_i^2 + a\sigma^2\right)\right] = \frac{(N-a)\sigma^2}{N-a} = \sigma^2

where N = an is the total number of observations.
27

Supplemental Material for Chapter 5


S5.1. A Simple Alternative to Runs Rules on the x̄ Chart
It is well known that while Shewhart control charts detect large shifts quickly, they are relatively insensitive to small or moderately sized process shifts. Various sensitizing rules (sometimes called runs rules) have been proposed to enhance the effectiveness of the chart in detecting small shifts. Of these rules, the Western Electric rules are among the most popular. The Western Electric rules are of the "r out of m" form; that is, if r out of the last m consecutive points exceed some limit, an out-of-control signal is generated.
In a very fundamental paper, Champ and Woodall (1987) point out that the use of these sensitizing rules does
indeed increase chart sensitivity, but at the expense of (sometimes greatly) increasing the rate of false alarms,
hence decreasing the in-control ARL. Generally, I do not think that the sensitizing rules should be used
routinely on a control chart, particularly once the process has been brought into a state of control. They do
have some application in the establishment of control limits (Phase 1 of control chart usage) and in trying to
bring an unruly process into control, but even then they need to be used carefully to avoid false alarms.
Obviously, Cusum and EWMA control charts provide an effective alternative to Shewhart control charts for
the problem of small shifts. However, Klein (2000) has proposed another solution. His solution is simple but
elegant: use an r out of m consecutive point rule, but apply the rule to a single control limit rather than to a set
of interior warning-type limits. He analyzes the following two rules:
1. If two consecutive points exceed a control limit, the process is out of control. The width of the control limits should be ±1.78σx̄.
2. If two out of three consecutive points exceed a control limit, the process is out of control. The width of the control limits should be ±1.93σx̄.
These rules would be applied to one side of the chart at a time, just as we do with the Western Electric rules. Klein (2000) presents the ARL performance of these rules for the x̄ chart, using actual control limit widths of ±1.7814σx̄ and ±1.9307σx̄, as these choices make the in-control ARL exactly equal to 370, the value associated with the usual three-sigma limits on the Shewhart chart. The table shown below is adapted from his results. Notice that Professor Klein's procedure greatly improves the ability of the Shewhart x̄ chart to detect small shifts. The improvement is not as much as can be obtained with an EWMA or a Cusum, but it is substantial, and considering the simplicity of Klein's procedure, it should be more widely used in practice.

Shift in process      ARL for the Shewhart     ARL for the Shewhart       ARL for the Shewhart
mean, in standard     x̄ chart with three-      x̄ chart with ±1.7814σx̄     x̄ chart with ±1.9307σx̄
deviation units       sigma control limits     control limits (rule 1)    control limits (rule 2)
0.0                   370                      350                        370
0.2                   308                      277                        271
0.4                   200                      150                        142
0.6                   120                       79                         73
0.8                    72                       44                         40
1.0                    44                       26                         23
2.0                   6.3                      4.6                        4.3
3.0                   2.0                      2.4                        2.4
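The ARLs in this table can be approximated by simulation. The sketch below is an addition (not code from Klein's paper): it estimates the ARL of rule 1, two consecutive points beyond the same ±1.7814σ limit applied to each side separately, for a few mean shifts, which can be compared with the second ARL column above.

    import numpy as np

    rng = np.random.default_rng(123)

    def run_length(shift, limit=1.7814):
        """Observations until two consecutive points fall beyond the same
        control limit (upper or lower) on a chart with +/- limit*sigma limits."""
        above = below = 0
        t = 0
        while True:
            t += 1
            z = rng.normal(shift, 1.0)
            above = above + 1 if z > limit else 0
            below = below + 1 if z < -limit else 0
            if above >= 2 or below >= 2:
                return t

    for shift in (0.0, 0.8, 2.0):
        arl = np.mean([run_length(shift) for _ in range(10_000)])
        print(f"shift = {shift}: estimated ARL = {arl:.1f}")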


Supplemental Material for Chapter 6


S6.1. s² is not Always an Unbiased Estimator of σ²
An important property of the sample variance is that it is an unbiased estimator of the population
variance, as demonstrated in Section S4.3 of the Supplemental Text Material. However, this unbiased
property depends on the assumption that the sample data has been drawn from a stable process; that
is, a process that is in statistical control. In statistical quality control work we sometimes make this
assumption, but if it is incorrect, it can have serious consequences on the estimates of the process
parameters we obtain.
To illustrate, suppose that in the sequence of individual observations

x_1, x_2, \ldots, x_t, x_{t+1}, \ldots, x_m

the process is in control with mean μ0 and standard deviation σ for the first t observations, but between xt and xt+1 an assignable cause occurs that results in a sustained shift in the process mean to a new level μ = μ0 + δσ, and the mean remains at this new level for the remaining sample observations xt+1, ..., xm. Under these conditions, Woodall and Montgomery (2000-01) show that

E(s^2) = \sigma^2 + \frac{t(m-t)}{m(m-1)}\,(\delta\sigma)^2     (S6.1)

In fact, this result holds for any case in which the mean of t of the observations is μ0 and the mean of the remaining m − t observations is μ0 + δσ, since the order of the observations is not relevant in computing s². Note that s² is biased upwards; that is, s² tends to overestimate σ². Furthermore, the extent of the bias depends on the magnitude of the shift in the mean (δσ), the time period following which the shift occurs (t), and the number of available observations (m). For example, if there are m = 25 observations and the process mean shifts from μ0 to μ0 + σ (that is, δ = 1) between the 20th and the 21st observation (t = 20), then s² will overestimate σ² by 16.7% on average. If the shift in the mean occurs earlier, say between the 10th and 11th observations, then s² will overestimate σ² by 25% on average.
The proof of Equation (S6.1) is straightforward. Since we can write

s^2 = \frac{1}{m-1}\left[\sum_{i=1}^{m} x_i^2 - m\bar{x}^2\right]

then

E(s^2) = \frac{1}{m-1}\left[\sum_{i=1}^{m} E(x_i^2) - mE(\bar{x}^2)\right]

Now

\sum_{i=1}^{m} E(x_i^2) = \sum_{i=1}^{t} E(x_i^2) + \sum_{i=t+1}^{m} E(x_i^2)
 = t(\sigma^2 + \mu_0^2) + (m-t)\left[\sigma^2 + (\mu_0 + \delta\sigma)^2\right]
 = t\mu_0^2 + (m-t)(\mu_0 + \delta\sigma)^2 + m\sigma^2

and

mE(\bar{x}^2) = m\left[\frac{\sigma^2}{m} + \left(\mu_0 + \frac{m-t}{m}\,\delta\sigma\right)^2\right] = \sigma^2 + \frac{1}{m}\left[m\mu_0 + (m-t)\delta\sigma\right]^2

Therefore

E(s^2) = \frac{1}{m-1}\left\{t\mu_0^2 + (m-t)(\mu_0 + \delta\sigma)^2 + m\sigma^2 - \sigma^2 - \frac{1}{m}\left[m\mu_0 + (m-t)\delta\sigma\right]^2\right\}
 = \frac{1}{m-1}\left\{(m-1)\sigma^2 + (m-t)(\delta\sigma)^2 - \frac{(m-t)^2}{m}\,(\delta\sigma)^2\right\}
 = \sigma^2 + \frac{(m-t)(\delta\sigma)^2}{m-1}\left[1 - \frac{m-t}{m}\right]
 = \sigma^2 + \frac{t(m-t)}{m(m-1)}\,(\delta\sigma)^2
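A quick computation (added here as an illustration) reproduces the two numerical examples given below Equation (S6.1).

    def pct_overestimate(m, t, delta):
        # bias factor from Equation (S6.1): E(s^2) = sigma^2 * (1 + t(m-t)/(m(m-1)) * delta^2)
        return 100.0 * t * (m - t) * delta**2 / (m * (m - 1))

    print(pct_overestimate(m=25, t=20, delta=1))   # about 16.7 percent
    print(pct_overestimate(m=25, t=10, delta=1))   # 25 percent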

S6.2. Should We Use d2 or d2* in Estimating σ via the Range Method?


In the textbook, we make use of the range method for estimation of the process standard deviation, particularly
in constructing variables control charts (for example, see the x̄ and R charts of Chapter 5). We use the estimator σ̂ = R̄/d2. Sometimes an alternative estimator, σ̂ = R̄/d2*, is encountered. In this section we discuss the nature and potential uses of these two estimators. Much of this discussion is adapted from Woodall and Montgomery (2000-01). The original work on using ranges to estimate the standard deviation of a normal distribution is due to Tippett (1925). See also the paper by Duncan (1955).
Suppose one has m independent samples, each of size n, from one or more populations assumed to be normally distributed with standard deviation σ. We denote the sample ranges of the m samples or subgroups as R1, R2, ..., Rm. Note that this type of data arises frequently in statistical process control applications and gauge repeatability and reproducibility (R & R) studies (refer to Chapter 8). It is well known that E(Ri) = d2σ and Var(Ri) = d3²σ² for i = 1, 2, ..., m, where d2 and d3 are constants that depend on the sample size n. Values of these constants are tabled in virtually all textbooks and training materials on statistical process control. See, for example, Appendix Table VI for values of d2 and d3 for n = 2 to 25.
There are two estimators of the process standard deviation based on the average sample range

\bar{R} = \frac{1}{m}\sum_{i=1}^{m} R_i     (S6.2)

that are commonly encountered in practice. The estimator

\hat{\sigma}_1 = \bar{R}/d_2     (S6.3)

is widely used after the application of control charts to estimate process variability and to assess process capability. In Chapter 4 we report the relative efficiency of the range estimator given in Equation (S6.3) to the sample standard deviation for various sample sizes. For example, if n = 5, the relative efficiency of the range estimator compared to the sample standard deviation is 0.955. Consequently, there is little practical difference between the two estimators. Equation (S6.3) is also frequently used to determine the usual three-sigma limits on the Shewhart x̄ chart in statistical process control. The estimator

\hat{\sigma}_2 = \bar{R}/d_2^*     (S6.4)

is more often used in gauge R & R studies and in variables acceptance sampling. Here d2* represents a constant whose value depends on both m and n. See Chrysler, Ford, GM (1995), Military Standard 414 (1957), and Duncan (1986).
Patnaik (1950) showed that R̄/σ is distributed approximately as a multiple of a χ-distribution. In particular, R̄/σ is distributed approximately as d_2^*\chi_\nu/\sqrt{\nu}, where ν represents the fractional degrees of freedom for the χ distribution. Patnaik (1950) used the approximation

d_2^* \approx d_2\left(1 + \frac{1}{4\nu} + \frac{1}{32\nu^2} - \frac{5}{128\nu^3}\right)     (S6.5)

It has been pointed out by Duncan (1986), Wheeler (1995), and Luko (1996), among others, that σ̂1 is an unbiased estimator of σ and that σ̂2² is an unbiased estimator of σ². For σ̂2² to be an unbiased estimator of σ², however, David (1951) showed that no approximation for d2* was required. He showed that

d_2^* = (d_2^2 + V_n/m)^{1/2}     (S6.6)

where Vn is the variance of the sample range for a sample of size n from a normal population with unit variance. It is important to note that Vn = d3², so Equation (S6.6) can easily be used to determine values of d2* from the widely available tables of d2 and d3. Thus, a table of d2* values, such as the ones given by Duncan (1986), Wheeler (1995), and many others, is not required so long as values of d2 and d3 are tabled, as they usually are (once again, see Appendix Table VI). Also, use of the approximation

d_2^* \approx d_2\left(1 + \frac{1}{4\nu}\right)

given by Duncan (1986) and Wheeler (1995) becomes unnecessary.


The table of d2* values given by Duncan (1986) is the most frequently recommended. If a table is required, the ones by Nelson (1975) and Luko (1996) provide values of d2* that are slightly more accurate, since their values are based on Equation (S6.6).
It has been noted that as m increases, d2* approaches d2. This has frequently been argued by noting that ν increases as m increases. The fact that d2* approaches d2 as m increases is more easily seen, however, from Equation (S6.6), as pointed out by Luko (1996).

Sometimes use of Equation (S6.4) is recommended without any explanation. See, for example, the AIAG measurement systems capability guidelines [Chrysler, Ford, and GM (1995)]. The choice between σ̂1 and σ̂2 has often not been explained clearly in the literature. It is frequently stated that the use of Equation (S6.3) requires that R̄ be obtained from a fairly large number of individual ranges. See, for example, Bissell (1994, p. 289). Grant and Leavenworth (1996, p. 128) state that "Strictly speaking, the validity of the exact value of the d2 factor assumes that the ranges have been averaged for a fair number of subgroups, say, 20 or more. When only a few subgroups are available, a better estimate of σ is obtained using a factor that writers on statistics have designated as d2*." Nelson (1975) writes, "If fewer than a large number of subgroups are used, Equation (S6.3) gives an estimate of σ which does not have the same expected value as the standard deviation estimator." In fact, Equation (S6.3) produces an unbiased estimator of σ regardless of the number of samples m, whereas the pooled standard deviation does not (refer to Section S4.5 of the Supplemental Text Material). The choice between σ̂1 and σ̂2 depends upon whether one is interested in obtaining an unbiased estimator of σ or σ². As m increases, both estimators (S6.3) and (S6.4) become equivalent, since each is a consistent estimator of σ.
It is interesting to note that among all estimators of the form c\bar{R}\ (c > 0), the one minimizing the mean squared error in estimating σ has

c = \frac{d_2}{(d_2^*)^2}

The derivation of this result is given in the proofs at the end of this section. If we let

\hat{\sigma}_3 = \frac{d_2}{(d_2^*)^2}\,\bar{R}

then it is shown in the proofs below that

MSE(\hat{\sigma}_3) = \sigma^2\left(1 - \frac{d_2^2}{(d_2^*)^2}\right)

Luko (1996) compared the mean squared error of σ̂2 in estimating σ to that of σ̂1 and recommended σ̂2 on the basis of uniformly lower MSE values. By definition, σ̂3 leads to a further reduction in MSE. It is shown in the proofs at the end of this section that the percentage reduction in MSE from using σ̂3 instead of σ̂2 is

50\left(\frac{d_2^* - d_2}{d_2^*}\right)

Values of the percentage reduction are given in Table S6.1. Notice that when both the number of subgroups and the subgroup size are small, a moderate reduction in mean squared error can be obtained by using σ̂3.
Table S6.1. Percentage Reduction in Mean Squared Error from Using σ̂3 Instead of σ̂2

Subgroup                          Number of Subgroups, m
Size, n        1        2        3        4        5        7       10       15       20
   2     10.1191   5.9077   4.1769   3.2314   2.6352   1.9251   1.3711   0.9267   0.6998
   3      5.7269   3.1238   2.1485   1.6374   1.3228   0.9556   0.6747   0.4528   0.3408
   4      4.0231   2.1379   1.4560   1.1040   0.8890   0.6399   0.4505   0.3017   0.2268
   5      3.1291   1.6403   1.1116   0.8407   0.6759   0.4856   0.3414   0.2284   0.1716
   6      2.5846   1.3437   0.9079   0.6856   0.5507   0.3952   0.2776   0.1856   0.1394
   7      2.2160   1.1457   0.7726   0.5828   0.4679   0.3355   0.2356   0.1574   0.1182
   8      1.9532   1.0058   0.6773   0.5106   0.4097   0.2937   0.2061   0.1377   0.1034
   9      1.7536   0.9003   0.6056   0.4563   0.3660   0.2623   0.1840   0.1229   0.0923
  10      1.5963   0.8176   0.5495   0.4138   0.3319   0.2377   0.1668   0.1114   0.0836

Proofs

Result 1: Let \hat{\sigma} = c\bar{R}. Then MSE(\hat{\sigma}) = \sigma^2[c^2(d_2^*)^2 - 2cd_2 + 1].

Proof:

MSE(\hat{\sigma}) = E[(c\bar{R} - \sigma)^2]
 = E[c^2\bar{R}^2 - 2c\bar{R}\sigma + \sigma^2]
 = c^2 E(\bar{R}^2) - 2c\sigma E(\bar{R}) + \sigma^2

Now E(\bar{R}^2) = Var(\bar{R}) + [E(\bar{R})]^2 = d_3^2\sigma^2/m + (d_2\sigma)^2. Thus

MSE(\hat{\sigma}) = c^2 d_3^2\sigma^2/m + c^2 d_2^2\sigma^2 - 2c\sigma(d_2\sigma) + \sigma^2
 = \sigma^2[c^2(d_3^2/m + d_2^2) - 2cd_2 + 1]
 = \sigma^2[c^2(d_2^*)^2 - 2cd_2 + 1]

Result 2: The value of c that minimizes the mean squared error of estimators of the form c\bar{R} in estimating σ is

c = \frac{d_2}{(d_2^*)^2}

Proof:

MSE(\hat{\sigma}) = \sigma^2[c^2(d_2^*)^2 - 2cd_2 + 1]
\frac{d\,MSE(\hat{\sigma})}{dc} = \sigma^2[2c(d_2^*)^2 - 2d_2] = 0
c = \frac{d_2}{(d_2^*)^2}

Result 3: The mean squared error of \hat{\sigma}_3 = \frac{d_2}{(d_2^*)^2}\bar{R} is \sigma^2\left(1 - \frac{d_2^2}{(d_2^*)^2}\right).

Proof: Using Result 1 with c = d_2/(d_2^*)^2,

MSE(\hat{\sigma}_3) = \sigma^2\left[\frac{d_2^2}{(d_2^*)^4}(d_2^*)^2 - \frac{2d_2^2}{(d_2^*)^2} + 1\right]
 = \sigma^2\left[\frac{d_2^2}{(d_2^*)^2} - \frac{2d_2^2}{(d_2^*)^2} + 1\right]
 = \sigma^2\left(1 - \frac{d_2^2}{(d_2^*)^2}\right)

Note that MSE(\hat{\sigma}_3) \to 0 as n \to \infty and MSE(\hat{\sigma}_3) \to 0 as m \to \infty.

Result 4: Let \hat{\sigma}_2 = \frac{\bar{R}}{d_2^*} and \hat{\sigma}_3 = \frac{d_2}{(d_2^*)^2}\bar{R}. Then \left[\frac{MSE(\hat{\sigma}_2) - MSE(\hat{\sigma}_3)}{MSE(\hat{\sigma}_2)}\right] \times 100, the percent reduction in mean squared error from using the minimum mean squared error estimator instead of \hat{\sigma}_2 = \bar{R}/d_2^* [as recommended by Luko (1996)], is

50\left(\frac{d_2^* - d_2}{d_2^*}\right)

Proof:

Luko (1996) shows that MSE(\hat{\sigma}_2) = \frac{2\sigma^2(d_2^* - d_2)}{d_2^*}, therefore

MSE(\hat{\sigma}_2) - MSE(\hat{\sigma}_3) = \sigma^2\left[\frac{2(d_2^* - d_2)}{d_2^*} - \left(1 - \frac{d_2^2}{(d_2^*)^2}\right)\right]
 = \sigma^2\left[\frac{2(d_2^* - d_2)}{d_2^*} - \frac{(d_2^*)^2 - d_2^2}{(d_2^*)^2}\right]
 = \sigma^2\left[\frac{2(d_2^* - d_2)}{d_2^*} - \frac{(d_2^* - d_2)(d_2^* + d_2)}{(d_2^*)^2}\right]
 = \sigma^2\,\frac{(d_2^* - d_2)}{d_2^*}\left(2 - \frac{d_2^* + d_2}{d_2^*}\right)
 = \sigma^2\,\frac{(d_2^* - d_2)^2}{(d_2^*)^2}

Consequently,

\left[\frac{MSE(\hat{\sigma}_2) - MSE(\hat{\sigma}_3)}{MSE(\hat{\sigma}_2)}\right] \times 100 = \frac{\sigma^2(d_2^* - d_2)^2/(d_2^*)^2}{2\sigma^2(d_2^* - d_2)/d_2^*} \times 100 = 50\left(\frac{d_2^* - d_2}{d_2^*}\right)

S6.3. Determining When the Process has Shifted

Control charts monitor a process to determine whether an assignable cause has occurred. Knowing
when the assignable cause has occurred would be very helpful in its identification and eventual
removal. Unfortunately, the time of occurrence of the assignable cause does not always coincide with
the control chart signal. In fact, given what is known about average run length performance of control
charts, it is actually very unlikely that the assignable cause occurs at the time of the signal. Therefore,
when a signal occurs, the control chart analyst should look earlier in the process history to determine
the assignable cause.
But where should we start? The Cusum control chart provides some guidance: simply search
backwards on the Cusum status chart to find the point in time where the Cusum last crossed zero
(refer to Chapter 9). However, the Shewhart x̄ control chart provides no such simple guidance.
Samuel, Pignatiello, and Calvin (1998) use some theoretical results by Hinkley (1970) on change-point
problems to suggest a procedure to determine the time of a shift in the process mean following
a signal on the Shewhart x̄ control chart. They assume the standard x̄ control chart with in-control
value of the process mean μ0. Suppose that the chart signals at subgroup average x̄T. Now the in-control
subgroups are x̄1, x̄2, ..., x̄t, and the out-of-control subgroups are x̄t+1, x̄t+2, ..., x̄T, where
obviously t < T. Their procedure consists of finding the value of t in the range 0 ≤ t < T that
maximizes

    Ct = (T − t)(x̄T,t − μ0)²

where

    x̄T,t = [1/(T − t)] Σ(i = t+1 to T) x̄i

is the reverse cumulative average; that is, the average of the T − t most recent
subgroup averages. The value of t that maximizes Ct is the estimator of the last subgroup that was
selected from the in-control process.
You may also find it interesting and useful to read the material on change point procedures for process
monitoring in Chapter 10.
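A minimal sketch of this estimator follows. The function and variable names are ours and the data are simulated for illustration; the statistic Ct and the search over t are exactly as described above.

```python
import numpy as np

def estimate_last_in_control(xbar, mu0):
    """Return the estimate of the last in-control subgroup, t, after a signal at subgroup T.
    xbar holds the subgroup averages; Ct = (T - t)*(reverse cumulative average - mu0)^2."""
    xbar = np.asarray(xbar, dtype=float)
    T = len(xbar)
    C = np.empty(T)
    for t in range(T):                      # candidate values 0 <= t < T
        rev_avg = xbar[t:].mean()           # average of the T - t most recent subgroup averages
        C[t] = (T - t) * (rev_avg - mu0) ** 2
    return int(np.argmax(C))                # t = 0 means the shift preceded the first subgroup

# Hypothetical example: the mean shifts by one sigma-x-bar after subgroup 12.
rng = np.random.default_rng(1)
xbar = np.concatenate([rng.normal(0.0, 1.0, 12), rng.normal(1.0, 1.0, 6)])
print(estimate_last_in_control(xbar, mu0=0.0))
```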
S6.4. More about Monitoring Variability with Individual Observations
As noted in the textbook, when one is monitoring a process using individual (as opposed to
subgrouped) measurements, a moving range control chart does not provide much useful additional
information about shifts in process variance beyond that which is provided by the individuals control
chart (or a Cusum or EWMA of the individual observations). Sullivan and Woodall (1996) describe
a change-point procedure that is much more effective than the individuals (or Cusum/EWMA) and
moving range charts.
Assume that the process is normally distributed. Then divide the n observations into two partitions
of n1 and n2 observations, with n1 = 2, 3, ..., n − 2 observations in the first partition and n2 = n − n1 in the
second. The log-likelihood of each partition is computed, using the maximum likelihood estimators
for μ and σ² in each partition. The two log-likelihood functions are then added. Call the sum of
the two log-likelihood functions La. Let L0 denote the log-likelihood computed without any partitions.
Then find the maximum value of the likelihood ratio r = 2(La − L0). The value of n1 at which this
maximum value occurs is the change point; that is, it is the estimate of the point in time at which a
change in either the process mean or the process variance (or both) has occurred.

Sullivan and Woodall show how to obtain a control chart for the likelihood ratio r. The control limits
must be obtained either by simulation or by approximation. When the control chart signals, the
quantity r can be decomposed into two components; one that is zero if the means in each partition are
equal, and another that is zero if the variances in each partition are equal. The relative size of these
two components suggests whether it is the process mean or the process variance that has shifted.
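The likelihood-ratio statistic is simple to compute for a single sequence of individual observations. The sketch below is not Sullivan and Woodall's implementation, just a direct coding of r = 2(La − L0) with simulated data; the names are ours.

```python
import numpy as np

def normal_loglik(x):
    """Normal log-likelihood evaluated at the maximum likelihood estimates of mu and sigma^2."""
    n = len(x)
    sigma2 = np.var(x)                     # MLE of the variance (divides by n)
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)

def lr_change_point(x):
    """Return (r, n1): the maximized likelihood ratio and the estimated change point."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    L0 = normal_loglik(x)                  # log-likelihood with no partition
    best_r, best_n1 = -np.inf, None
    for n1 in range(2, n - 1):             # n1 = 2, 3, ..., n - 2
        La = normal_loglik(x[:n1]) + normal_loglik(x[n1:])
        r = 2.0 * (La - L0)
        if r > best_r:
            best_r, best_n1 = r, n1
    return best_r, best_n1

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(10, 1, 15), rng.normal(10, 3, 10)])   # variance shift after obs 15
print(lr_change_point(x))
```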
S6.5. Detecting Drifts versus Shifts in the Process Mean
In studying the performance of control charts, most of our attention is directed towards describing
what will happen on the chart following a sustained shift in the process parameter. This is done
largely for convenience, and because such performance studies must start somewhere, and a sustained
shift is certainly a likely scenario. However, a drifting process parameter is also a likely possibility.
Aerne, Champ, and Rigdon (1991) have studied several control charting schemes when the process
mean drifts according to a linear trend. Their study encompasses the Shewhart control chart, the
Shewhart chart with supplementary runs rules, the EWMA control chart, and the Cusum. They design
the charts so that the in-control ARL is 465. Some of the previous studies of control charts with
drifting means did not do this, and different charts have different values of ARL0, thereby making it
difficult to draw conclusions about chart performance. See Aerne, Champ, and Rigdon (1991) for
references and further details.
They report that, in general, Cusum and EWMA charts perform better in detecting trends than does
the Shewhart control chart. For small to moderate trends, both of these charts are significantly better
than the Shewhart chart with and without runs rules. There is not much difference in performance
between the Cusum and the EWMA.


S6.6. The Mean Square Successive Difference as an Estimator of σ²

An alternative to the moving range estimator of the process standard deviation is the mean square successive
difference as an estimator of σ². The mean square successive difference is defined as

    MSSD = [1/(2(n − 1))] Σ(i = 2 to n) (xi − x_{i−1})²

It is easy to show that the MSSD is an unbiased estimator of σ². Let x1, x2, ..., xn be a random sample of size
n from a population with mean μ and variance σ². Without any loss of generality, we may take the mean to be
zero. Then

    E(MSSD) = [1/(2(n − 1))] E[ Σ(i = 2 to n) (xi − x_{i−1})² ]
            = [1/(2(n − 1))] Σ(i = 2 to n) E(xi² + x_{i−1}² − 2 xi x_{i−1})
            = [1/(2(n − 1))] [(n − 1)σ² + (n − 1)σ²]
            = 2(n − 1)σ² / [2(n − 1)]
            = σ²

Therefore, the mean square successive difference is an unbiased estimator of the population variance.
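Computationally the MSSD involves nothing more than the squared successive differences; a minimal sketch (function name ours) is:

```python
import numpy as np

def mssd(x):
    """Mean square successive difference, an unbiased estimator of sigma^2."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x)                              # successive differences x_i - x_{i-1}
    return np.sum(d ** 2) / (2.0 * (len(x) - 1))

rng = np.random.default_rng(3)
x = rng.normal(50.0, 2.0, 500)                  # true sigma^2 = 4
print(mssd(x))                                  # should be close to 4
```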


Supplemental Material for Chapter 7


S7.1. Probability Limits on Control Charts
In Chapters 6 and 7 of the textbook, probability limits for control charts are briefly discussed. The usual three-sigma limits are almost always used with variables control charts, although, as we point out, there can be
some occasional advantage to the use of probability limits, such as on the range chart to obtain a non-zero
lower control limit.
The standard applications of attributes control charts almost always use the three-sigma limits as well,
although their use is potentially somewhat more troublesome here. When three-sigma limits are used on
attributes charts, we are basically assuming that the normal approximation to either the binomial or Poisson
distribution is appropriate, at least to the extent that the distribution of the attribute chart statistic is
approximately symmetric, and that the symmetric three-sigma control limits are satisfactory.
This will, of course, not always be the case. If the binomial probability p is small and the sample size n is not
large, or if the Poisson mean is small, then symmetric three-sigma control limits on the p or c chart may not
be appropriate, and probability limits may be much better.
For example, consider a p chart with p = 0.07 and n = 100. The center line is at 0.07 and the usual three-sigma
control limits are LCL = -0.007, which is set to zero, and UCL = 0.147. A short table of cumulative binomial probabilities
computed from Minitab follows.
Cumulative Distribution Function
Binomial with n = 100 and p = 0.07

    x    P(X <= x)        x    P(X <= x)
    0      0.0007        11      0.9531
    1      0.0060        12      0.9776
    2      0.0258        13      0.9901
    3      0.0744        14      0.9959
    4      0.1632        15      0.9984
    5      0.2914        16      0.9994
    6      0.4443        17      0.9998
    7      0.5988        18      0.9999
    8      0.7340        19      1.0000
    9      0.8380        20      1.0000
   10      0.9092

If the lower control limit is at zero and the upper control limit is at 0.147, then any sample with 15 or more
defective items will plot beyond the upper control limit. The above table shows that the probability of
obtaining 15 or more defectives when the process is in-control is 1 0.9959 = 0.0041. This is about 50%
greater than the false alarm rate on the normal-theory three-sigma limit control chart (0.0027). However, if
we were to set the lower control limit at 0.01 and the upper control limit at 0.15, and conclude that the process
is out-of-control only if a control limit is exceeded, then the false alarm rate is 0.0007 + 0.0016 = 0.0023,
which is very close to the advertised value of 0.0027. Furthermore, there is a nonzero LCL, which can be very
useful in practice. Notice that the control limits are not symmetric around the center line. However, the
distribution of p is not symmetric, so this should not be too surprising.
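These false alarm rates are easy to reproduce with any binomial routine. The sketch below assumes the scipy library is available and simply recomputes the two probabilities discussed above.

```python
from scipy.stats import binom

n, p = 100, 0.07
dist = binom(n, p)

# Usual three-sigma limits (LCL = 0, UCL = 0.147): a signal requires 15 or more defectives.
print(1 - dist.cdf(14))                    # about 0.0041

# Probability-type limits LCL = 0.01, UCL = 0.15: signal on 0 defectives or 16 or more.
print(dist.cdf(0) + (1 - dist.cdf(15)))    # about 0.0023, close to the nominal 0.0027
```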
There are several other interesting approaches to setting probability-type limits on attribute control charts.
Refer to Ryan (2000), Acosta-Mejia (1999), Ryan and Schwertman (1997), Schwertman and Ryan (1997), and
Shore (2000).


Supplemental Material for Chapter 8


S8.1. Fixed Versus Random Factors in the Analysis of Variance
In chapter 4, we present the standard analysis of variance (ANOVA) for a single-factor experiment, assuming
that the factor is a fixed factor. By a fixed factor, we mean that all levels of the factor of interest were studied
in the experiment. Sometimes the levels of a factor are selected at random from a large (theoretically infinite)
population of factor levels. This leads to a random effects ANOVA model.
In the single factor case, there are only modest differences between the fixed and random models. The model
for a random effects experiment is still written as

    yij = μ + τi + εij

but now the treatment effects τi are random variables, because the treatment levels actually used in the
experiment have been chosen at random. The population of treatments is assumed to be normally and
independently distributed with mean zero and variance στ². Note that the variance of an observation is

    V(yij) = V(τi + εij) = στ² + σ²

We often call στ² and σ² variance components, and the random model is sometimes called the components
of variance model. All of the computations in the random model are the same as in the fixed effects model,
but since we are studying an entire population of treatments, it doesn't make much sense to formulate
hypotheses about the individual factor levels selected in the experiment. Instead, we test the following
hypotheses about the variance of the treatment effects:

    H0: στ² = 0
    H1: στ² > 0
The test statistic for these hypotheses is the usual F-ratio, F = MSTreatments/MSE. If the null hypothesis is not
rejected, there is no variability in the population of treatments, while if the null hypothesis is rejected, there is
significant variability among the treatments in the entire population that was sampled. Notice that the
conclusions of the ANOVA extend to the entire population of treatments.
The expected mean squares in the random model are different from their fixed effects model counterparts. It
can be shown that

    E(MSTreatments) = σ² + nστ²
    E(MSE) = σ²

Frequently, the objective of an experiment involving random factors is to estimate the variance components.
A logical way to do this is to equate the expected values of the mean squares to their observed values and solve
the resulting equations. This leads to

    σ̂τ² = (MSTreatments − MSE)/n
    σ̂² = MSE
A typical application of experiments where some of the factors are random is in a measurement systems
capability study, as discussed in Chapter 8. The model used there is a factorial model, so the analysis and the
expected mean squares are somewhat more complicated than in the single-factor model considered here.


S8.2. Analysis of Variance Methods for Measurement Systems Capability Studies


In Chapter 8 an analysis of variance model approach to measurement systems studies is presented.
This method replaces the tabular approach that was presented along with the ANOVA method in
earlier editions of the book. The tabular approach is a relatively simple method, but it is not the most
general or efficient approach to conducting gage studies. Gauge and measurement systems studies
are designed experiments, and often we find that the gauge study must be conducted using an
experimental design that does not nicely fit into the tabular analysis scheme. For example, suppose
that the operators used with each instrument (or gauge) are different because the instruments are in
different physical locations. Then operators are nested within instruments, and the experiment has
been conducted as a nested design.
As another example, suppose that the operators are not selected at random, because the specific operators used
in the study are the only ones that actually perform the measurements. This is a mixed model experiment, and
the random effects approach that the tabular method is based on is inappropriate. The random effects model
analysis of variance approach in the text is also inappropriate for this situation. Dolezal, Burdick, and Birch
(1998), Montgomery (2001), and Burdick, Borror, and Montgomery (2003) discuss the mixed model analysis
of variance for gauge R & R studies.

The tabular approach does not lend itself to constructing confidence intervals on the variance
components or functions of the variance components of interest. For that reason we do not
recommend the tabular approach for general use. There are three general approaches to constructing
these confidence intervals: (1) the Satterthwaite method, (2) the maximum likelihood large-sample
method, and (3) the modified large sample method. Montgomery (2001) gives an overview of these
different methods. Of the three approaches, there is good evidence that the modified large sample
approach is the best in the sense that it produces confidence intervals that are closest to the stated
level of confidence.
Hamada and Weerahandi (2000) show how generalized inference can be applied to the problem of determining
confidence intervals in measurement systems capability studies. The technique is somewhat more involved
than the three methods referenced above. Either numerical integration or simulation must be used to find the
desired confidence intervals. Burdick, Borror, and Montgomery (2003) discuss this technique.
While the tabular method should be abandoned, the control charting aspect of measurement systems capability
studies should be used more consistently. All too often a measurement study is conducted and analyzed via
some computer program without adequate graphical analysis of the data. Furthermore, some of the advice in
various quality standards and reference sources regarding these studies is just not very good and can produce
results of questionable validity. The most reliable measure of gauge capability is the probability that parts are
misclassified.


Supplemental Material for Chapter 9


S9.1. The Markov Chain Approach to Finding the ARLs for Cusum and EWMA Control
Charts
When the observations drawn from the process are independent, average run lengths or ARLs are easy to
determine for Shewhart control charts because the points plotted on the chart are independent. The distribution
of run length is geometric, so the ARL of the chart is just the mean of the geometric distribution, or 1/p, where
p is the probability that a single point plots outside the control limits.
The sequence of plotted points on Cusum and EWMA charts is not independent, so another approach must be
used to find the ARLs. The Markov chain approach developed by Brook and Evans (1972) is very widely
used. We give a brief discussion of this procedure for a one-sided Cusum.

The Cusum control chart statistic C⁺ (or C⁻) forms a Markov process with a continuous state space. By
discretizing the continuous random variable C⁺ (or C⁻) with a finite set of values, approximate ARLs can
be obtained from Markov chain theory. For the upper one-sided Cusum with upper decision interval H, the
intervals are defined as follows:

    (−∞, w/2], [w/2, 3w/2], ..., [(k − 1/2)w, (k + 1/2)w], ..., [(m − 3/2)w, H], [H, ∞)

where m + 1 is the number of states and w = 2H/(2m − 1). The elements of the transition probability matrix
of the Markov chain P = [pij] are

    pi0 = ∫_{−∞}^{w/2} f(x − iw + k) dx,                    i = 0, 1, ..., m − 1

    pij = ∫_{(j−1/2)w}^{(j+1/2)w} f(x − iw + k) dx,         i = 0, 1, ..., m − 1;  j = 1, 2, ..., m − 1

    pim = ∫_{H}^{∞} f(x − iw + k) dx,                       i = 0, 1, ..., m − 1

    pmj = 0,  j = 0, 1, ..., m − 1
    pmm = 1
The absorbing state is m and f denotes the probability density function of the variable that is being monitored
with the Cusum.
From the theory of Markov chains, the expected first passage times from state i to the absorbing state are

    μi = 1 + Σ(j = 0 to m−1) pij μj,    i = 0, 1, ..., m − 1

Thus, μi is the ARL given that the process started in state i. Let Q be the matrix of transition probabilities
obtained by deleting the last row and column of P. Then the vector of ARLs is found by computing

    μ = (I − Q)⁻¹ 1

where 1 is an m × 1 vector of 1s and I is the m × m identity matrix.
When the process is out of control, this procedure gives a vector of initial-state (or zero-state) ARLs. That is,
the process shifts out of control at the initial start-up of the control chart. It is also possible to calculate steady-state ARLs that describe performance assuming that the process shifts out of control after the control chart has
been operating for a long period of time. There is typically very little difference between initial-state and
steady-state ARLs.
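The Brook and Evans recursion is easy to program. The sketch below assumes standardized normal observations; the function name, the number of transient states m, and the example design (k = 0.5, H = 5) are our illustrative choices.

```python
import numpy as np
from scipy.stats import norm

def cusum_arl(k, h, shift=0.0, m=100):
    """Zero-state ARL of the upper one-sided Cusum via the Markov chain approximation."""
    w = 2.0 * h / (2 * m - 1)                  # width of the discretization intervals
    f = norm(loc=shift, scale=1.0)             # distribution of the monitored variable
    Q = np.zeros((m, m))                       # transient part of the transition matrix
    for i in range(m):                         # state i corresponds to a Cusum value near i*w
        Q[i, 0] = f.cdf(w / 2 - i * w + k)
        for j in range(1, m):
            Q[i, j] = f.cdf((j + 0.5) * w - i * w + k) - f.cdf((j - 0.5) * w - i * w + k)
    arl = np.linalg.solve(np.eye(m) - Q, np.ones(m))   # solves (I - Q) mu = 1
    return arl[0]                              # ARL when the Cusum starts at zero

print(cusum_arl(k=0.5, h=5.0))                 # in-control ARL, roughly 465
print(cusum_arl(k=0.5, h=5.0, shift=1.0))      # ARL after a one-sigma shift
```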


Let P(n, i) be the probability that the run length takes on the value n given that the chart started in state i. Collect
these quantities into a vector, say

    pn = [P(n, 0), P(n, 1), ..., P(n, m − 1)]′

for n = 1, 2, .... These probabilities can be calculated by solving the following equations:

    p1 = (I − Q)1
    pn = Q p_{n−1},    n = 2, 3, ...
This technique can be used to calculate the probability distribution of the run length, given the control chart
started in state i. Some authors believe that the distribution of run length or its percentiles is more useful than
the ARL, since the distribution of run length is usually highly skewed and so the ARL may not be a typical
value in any sense.

S9.2. Integral Equations Versus Markov Chains for Finding the ARL
Two methods are used to find the ARL distribution of control charts, the Markov chain method and
an approach that uses integral equations. The Markov chain method is described in Section S9.1 of
the Supplemental Text Material. This section gives an overview of the integral equation approach
for the Cusum control chart. Some of the notation defined in Section S9.1 will be used here.
Let P (n, u ) and R (u ) be the probability that the run length takes on the value n and the ARL for the Cusum
when the procedure begins with initial value u. For the one-sided upper Cusum

    P(1, u) = 1 − ∫_{−∞}^{w/2} f(x − u + k) dx − Σ(j = 1 to m−1) ∫_{(j−1/2)w}^{(j+1/2)w} f(x − u + k) dx
and

    P(n, u) = P(n − 1, 0) ∫_{−∞}^{0} f(x − u + k) dx + ∫_{0}^{H} P(n − 1, y) f(y − u + k) dy

            = P(n − 1, 0) ∫_{−∞}^{0} f(x − u + k) dx + ∫_{0}^{w/2} P(n − 1, y) f(y − u + k) dy
              + Σ(j = 1 to m−1) ∫_{(j−1/2)w}^{(j+1/2)w} P(n − 1, y) f(y − u + k) dy

            ≈ P(n − 1, 0) ∫_{−∞}^{w/2} f(x − u + k) dx + Σ(j = 1 to m−1) P(n − 1, ξj) ∫_{(j−1/2)w}^{(j+1/2)w} f(x − u + k) dx

for n = 1, 2, ..., where ξ0 ∈ (−∞, w/2) and ξj ∈ [(j − 1/2)w, (j + 1/2)w), j = 1, 2, ..., m − 1. If w is small,
then ξj is approximately the midpoint jw of the jth interval for j = 1, 2, ..., m − 1, and considering only the
values of P(n, u) for which u = iw results in

    P(1, iw) = 1 − Σ(j = 0 to m−1) pij

    P(n, iw) = Σ(j = 0 to m−1) pij P(n − 1, jw),    n = 2, 3, ...

But these last equations are just the equations used for calculating the probabilities of first-passage
times in a Markov chain. Therefore, the solution to the integral equation approach involves solving
equations identical to those used in the Markov chain procedure.
Champ and Rigdon (1991) give an excellent discussion of the Markov chain and integral equation
techniques for finding ARLs for both the Cusum and the EWMA control charts. They observe that
the Markov chain approach involves obtaining an exact solution to an approximate formulation of the
ARL problem, while the integral equation approach involves finding an approximate solution to the
exact formulation of the ARL problem. They point out that more accurate solutions can likely be
found via the integral equation approach. However, there are problems for which only the Markov
chain method will work, such as the case of a drifting mean.


Supplemental Material for Chapter 10


S10.1. Difference Control Charts
The difference control chart is briefly mentioned in Chapter 10, and a reference is given to a paper by Grubbs
(1946). There are actually two types of difference control charts in the literature. Grubbs compared samples
from a current production process to a reference sample. His application was in the context of testing
ordinance. The plotted quantity was the difference in the current sample average and the reference sample
average. This quantity would be plotted on a control chart with center line at zero and control limits at

A2 R12 R22 , where R12 and R22 are the average ranges for the reference samples (1) and the current
production samples (2) used to establish the control limits.
The second type of difference control chart was suggested by Ott (1947), who considered the situation where
differences are observed between paired measurements within each subgroup (much as in a paired t-test), and
the average difference for each subgroup is plotted on the chart. The center line for this chart is zero, and the
control limits are at ±A2 R̄, where R̄ is the average of the ranges of the differences. This chart would be
useful in instrument calibration, where one measurement on each unit is from a standard instrument (say in a
laboratory) and the other is from an instrument used in different conditions (such as in production).

S10.2. Control Charts for Contrasts


There are many manufacturing processes where process monitoring is important but traditional
statistical control charts cannot be effectively used because of rational subgrouping considerations.
Examples occur frequently in the chemical and processing industries, stamping, casting and molding
operations, and electronics and semiconductor manufacturing.
As an illustration, consider a furnace used to create an oxide layer on silicon wafers. In each run of the furnace
a set of m wafers will be processed, and at the completion of the run a single measurement of oxide thickness
will be taken at each of n sites or locations on each wafer. These mn thickness measurements will be evaluated
to ensure the stability of the process, check for the possible presence of assignable causes, and to determine
any necessary modifications to the furnace operating conditions (or the recipe) before any subsequent runs are
initiated. Figure S10.1, adapted from Runger and Fowler (1999) and Czitrom and Reece (1997), shows a
typical oxidation furnace with m = 4 wafers and n = 9 sites on each wafer. In Chapter 6 of the textbook,
Example 6.11 illustrates an aerospace casting where vane height and inner diameter are the characteristics of
interest. Each casting has five vanes that are measured to monitor the height characteristic and the diameter
of a casting is measured at 24 locations using a coordinate measuring machine.
In these applications it would be inappropriate to monitor the process with traditional x and R charts. For
example in the oxidation furnace, assuming a rational subgroup of either n = 9 or n = 36 is not correct because
all sites experience the processing activities during each furnace run simultaneously. That is, there is much
less variability between the observations at the 9 sites than would be anticipated in observations collected from
a process where all measurements reflect the processing activity(s) each unit experiences independently. What
usually occurs when this misapplication of the standard charts is implemented is that the control limits on the
X chart will be too narrow. Then if the process experiences moderate run-to-run variability, there will be
many out-of-control points on the X chart that engineers and process operating personnel cannot associate
with specific upsets or assignable causes.

Figure S10.1. Diagram of a Furnace where four wafers are simultaneously processed and nine quality
measurements are performed on each wafer.

The most widely used approach to monitoring these processes is to first consider the average of all mn
observations from a run as a single observation and to use a Shewhart control chart for individuals to monitor
the overall process mean. The control limits for this chart are usually found by applying a moving range to
the sequence of averages. Thus, the control limits for the individuals chart reflect run-to-run variability, not
variability within a run. The variability within a run is monitored by applying a control chart for s (the standard
deviation) or s 2 to all mn observations from each run. It is interesting to note that this approach is so widely
used that at least one popular statistical software package (Minitab) includes it as a standard control charting
option (called the between/within procedure in Minitab). This procedure was illustrated in Example 6.11.

Runger and Fowler (1999) show how the structure of the data obtained on these processes can be
represented by an analysis of variance model, and how control charts based on contrasts can be
designed to detect specific assignable causes of potential interest. Below we briefly review their
results and relate them to some other methods. Then we analyze the average run performance of the
contrast charts and show that the use of specifically designed contrast charts can greatly enhance the
ability of the monitoring scheme to detect assignable causes. We confine our analysis to Shewhart
charts, but both Cusum and EWMA control charts would be very effective alternatives, because they
are more effective in detecting small process shifts, which are likely to be of interest in many of these
applications.
Contrast Control Charts
We consider the oxidation process in Figure S10.1, but allow m wafers in each run with n measurements or
sites per wafer. The appropriate model for oxide thickness is

    yij = μ + ri + sj + εij                                                  (S10.1)

where yij is the oxide thickness measurement from run i and site j, ri is the run effect, sj is the site effect, and
εij is a random error component. We assume that the site effects are fixed effects, since the measurements are
generally taken at the same locations on all wafers. The run effect is a random factor and we assume it is
distributed as NID(0, σr²). We assume that the error term is distributed as NID(0, σ²). Notice that equation
(S10.1) is essentially an analysis of variance model.


Let yt be a vector of all measurements from the process at the end of run t. It is customary in most applications
to update the control charts at the completion of every run. A contrast is a linear combination of the elements
of the observation vector yt , say

    c = c′yt

where the elements of the vector c sum to zero and, for convenience, we assume that the
contrast vector has unit length. That is,

    c′1 = 0 and c′c = 1

Any contrast vector is orthogonal to the vector that generates the mean, since the mean can be written
as

    ȳt = 1′yt/mn
Thus, a contrast generates information that is different from the information produced by the overall
mean from the current run. Based on the particular problem, the control chart analyst can choose the
elements of the contrast vector c to provide information of interest to that specific process.

For example, suppose that we were interested in detecting process shifts that could cause a difference
in mean thickness between the top and bottom of the furnace. The engineering cause of such a
difference could be a temperature gradient along the furnace from top to bottom. To detect this
disturbance, we would want the contrast to compare the average oxide thickness of the top wafer in
the furnace to the average thickness of the bottom wafer. Thus, if m = 4, the vector c has mn = 36
components, the first 9 of which are +1, the last 9 of which are −1, and the middle 18 elements are
zero. To normalize the contrast to unit length we would actually use

    c = [1, 1, ..., 1, 0, 0, ..., 0, −1, −1, ..., −1]′/√18

One could also divide the elements of c by nine to compute the averages of the top and bottom wafers,
but this is not really necessary.
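At the end of each run the contrast statistic is just an inner product. The sketch below builds the normalized top-versus-bottom contrast for m = 4 wafers and n = 9 sites and evaluates c′y for a simulated run; the data and names are hypothetical.

```python
import numpy as np

m, n = 4, 9                                   # wafers per run, sites per wafer
# Top-versus-bottom contrast: +1 on the top wafer's sites, -1 on the bottom wafer's sites,
# 0 elsewhere, scaled to unit length (18 nonzero entries, so divide by sqrt(18)).
c = np.concatenate([np.ones(n), np.zeros(n * (m - 2)), -np.ones(n)]) / np.sqrt(2 * n)

def contrast_statistic(y_run):
    """Value of c'y for one run; y_run holds the m*n thickness measurements."""
    return float(c @ np.asarray(y_run, dtype=float).ravel())

rng = np.random.default_rng(5)
y = rng.normal(100.0, 1.0, size=(m, n))       # an in-control run with sigma = 1
print(contrast_statistic(y))
# Because c'1 = 0 and c'c = 1, the run effect cancels and c'y has standard deviation
# sigma, so Shewhart limits of 0 +/- 3*sigma could be used for this contrast chart.
```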
In practice, a set of k contrasts, say

    c1, c2, ..., ck

can be used to define control charts to monitor a process to detect k assignable causes of interest.
These simultaneous control charts have overall false alarm rate α, where

    α = 1 − Π(i = 1 to k) (1 − αi)                                           (S10.2)

and αi is the false alarm rate for the ith contrast. If the contrasts are orthogonal, then Equation (S10.2)
holds exactly, while if the contrasts are not orthogonal then the Bonferroni inequality applies and the α
in Equation (S10.2) is a lower bound on the false alarm rate.

Related Procedures
Several authors have suggested related approaches for process monitoring when non-standard
conditions relative to rational subgrouping apply. Yashchin (1994), Czitrom and Reece (1997), and
Hurwicz and Spagon (1997) all present control charts or other similar techniques based on variance
components. The major difference in this approach in comparison to these authors is the use of an
analysis-of-variance type partitioning based on contrasts instead of variance components as the basis
of the monitoring scheme. Roes and Does (1995) do discuss the use of contrasts, and Hurwicz and
Spagon discuss contrasts to estimate the variance contributed by sites within a wafer. However, the
Runger and Fowler model is the most widely applicable of all the techniques we have encountered.
Even though the methodology used to monitor specific differences in processing conditions has been
studied by all these authors, the statistical performance of these charts has not been demonstrated.
We now present some performance results for Shewhart control charts.
Average Run Length Performance of Shewhart Charts
In this section we assume that the process shown in Figure S10.1 is of interest. The following
scenarios are considered:

A change in the mean of the top versus the bottom wafer.

Changes on the left versus the right side of all wafers.

Significant changes between the outside and the inside of each wafer.

Four wafers are selected from the tube.

The contrasts for these charts are c1 (top versus bottom), c2 (left versus right), and c3 (edge versus
center). The top-versus-bottom contrast, for example, is

    c1′ = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, ..., 0, −1, −1, −1, −1, −1, −1, −1, −1, −1]/√18

and c2 and c3 are constructed in the same way: the sites entering each comparison receive weights of
+1 and −1 (zero for sites not involved), and each vector is scaled to unit length.

A comparison of the ARL values obtained using these contrasts and the traditional approach (an
individuals control chart for the mean of all 36 observations) is presented in Tables S10.1, S10.2, and
S10.3. From inspection of these tables, we see that the charts for the orthogonal contrasts, originally
with the same in-control ARL as the traditional chart, are more sensitive to changes at specific
locations, thus improving the chances of early detection of an assignable cause. Notice that the
improvement is dramatic for small shifts, say on the order of 1.5 standard deviations or less.
A similar analysis was performed for a modified version of the process shown in Figure S10.1. In
this example, there are seven measurements per wafer for a total of 28 measurements in a run. There
are still three measurements at the center of the wafer, but now there are only four measurements at the
perimeter, one in each corner. The same types of contrasts used in the previous example (top versus
bottom, left versus right, and edge versus center) were analyzed and the ARL results are presented in
Tables S10.4, S10.5, and S10.6.

Table S10.1. Average Run Length Performance of Traditional and Orthogonal Contrast Charts for a
Shift in the Edges of all Wafers. In this chart m = 4 and n = 9.

Size of Shift           Edge-versus-Center
(in multiples of σ)         Contrast           Traditional Chart
       0.5                    11.7                   13.6
       1.0                     1.9                    2.2
       1.5                     1.1                    1.1
       2.0
       2.5

Table S10.2. Average Run Length Performance of Traditional and Orthogonal Contrast Charts for a
Shift in the Top Wafer. In this chart m = 4 and n = 9.

Size of Shift           Top-versus-Bottom
(in multiples of σ)         Contrast           Traditional Chart
       0.5                    23.4                   47
       1.0                     3.9                   10
       1.5                     1.5                    3.4
       2.0                     1.1                    1.7
       2.5                                            1.2

Table S10.3. Average Run Length Performance of Traditional and Orthogonal Contrast Charts for a
Shift in the Left Side of all Wafers. In this chart m = 4 and n = 9.

Size of Shift           Left-versus-Right
(in multiples of σ)         Contrast           Traditional Chart
       0.5                    26.7                   57.2
       1.0                     4.6                   13.6
       1.5                     1.7                    4.6
       2.0                     1.1                    2.2
       2.5                     1.1                    1.4

Table S10.4. Average Run Length Comparison between Traditional and Orthogonal Contrast Charts for
a Shift in the Edge of all Wafers. In this chart m = 4 and n = 7.

Size of Shift           Edge-versus-Center
(in multiples of σ)         Contrast           Traditional Chart
       0.5                    26.7                   46.4
       1.0                     4.6                    9.8
       1.5                     1.7                    3.3
       2.0                     1.1                    1.7
       2.5                                            1.2

Table S10.5. Average Run Length Comparison between Traditional and Orthogonal Contrast Charts for
a Change in the Top Wafer. In this chart m = 4 and n = 7.

Size of Shift           Top-versus-Bottom
(in multiples of σ)         Contrast           Traditional Chart
       0.5                    30.8                   57.9
       1.0                     5.5                   13.8
       1.5                                            4.7
       2.0                     1.2                    2.2
       2.5                     1.1                    1.4

Table S10.6. Average Run Length Performance of Traditional and Orthogonal Contrast Charts
for a Shift in the Left Side of all Wafers. In this chart m = 4 and n = 7.

Size of Shift           Left-versus-Right
(in multiples of σ)         Contrast           Traditional Chart
       0.5                    26.7                   46.4
       1.0                     4.6                    9.8
       1.5                     1.7                    3.3
       2.0                     1.1                    1.7
       2.5                                            1.2

Decreasing the number of measurements per wafer has increased the relative importance of the changes in the
mean of a subset of the observations and the traditional control charts signal the shift faster than in the previous
example. Still, note that the control charts based on orthogonal contrasts represent a considerable improvement
over the traditional approach.

S10.3. Run Sum and Zone Control Charts


The run sum control chart was introduced by Roberts (1966), and has been studied further by
Reynolds (1971) and Champ and Rigdon (1997). For a run sum chart for the sample mean, the procedure
divides the possible values of x̄ into regions on either side of the center line of the control chart. If
μ0 is the center line and σ0 is the process standard deviation, then the regions above the center line,
say, are defined as

    [μ0 + Ai σ0/√n, μ0 + A_{i+1} σ0/√n),    i = 0, 1, 2, ..., a

for 0 = A0 < A1 < A2 < ... < Aa < A_{a+1} = ∞, where the constants Ai are determined by the user. A


similar set of regions is defined below the center line. A score is assigned to each region, say si for
the ith region above the center line and s-i for the ith region below the center line. The score si is
nonnegative, while the score s-i is nonpositive. The run sum chart operates by observing the region
in which the subgroup averages fall and accumulating the scores for those regions. The cumulative
score begins at zero. The charting procedure continues until either the cumulative score reaches or
exceeds either a positive upper limit or a negative lower limit in which case an out-of-control signal
is generated, or until the subgroup average falls on the other side of the center line in which case the
scoring starts over with the cumulative score starting according to the current value of x .
Jaehn (1987) discusses a special case of the run sum control chart, usually called the zone control
chart. In the zone control chart, there are only three regions on either side of the center line
corresponding to one-, two-, and three-sigma intervals (as in the Western Electric rules), and the zone
scores are often taken as 1, 2, 4, and 8 (this is the value assigned to a point outside the three-sigma
limits and it is also the total score that triggers an alarm). Davis, Homer, and Woodall (1990) studied
the performance of the zone control chart and recommended the zone scores 0, 2, 4, and 8 (or
equivalently, 0, 1, 2, and 4).
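The scoring logic is easy to express in code. The sketch below (names ours) uses the 0, 2, 4, 8 zone scores recommended by Davis, Homer, and Woodall (1990), restarts the cumulative score whenever the average crosses the center line, and signals when the cumulative score reaches the alarm value.

```python
import numpy as np

def zone_chart(xbar, mu0, sigma0, n, scores=(0, 2, 4, 8), alarm=8):
    """Cumulative zone-chart scores for a sequence of subgroup averages.
    Returns a list of (cumulative score, alarm indicator) pairs."""
    sigma_xbar = sigma0 / np.sqrt(n)
    cum, last_side, out = 0.0, 0, []
    for x in xbar:
        z = (x - mu0) / sigma_xbar
        side = 1 if z >= 0 else -1
        if side != last_side:
            cum = 0.0                       # scoring starts over on the other side of the center line
        cum += scores[min(int(abs(z)), 3)]  # zones: 0-1, 1-2, 2-3 sigma, and beyond 3 sigma
        last_side = side
        out.append((cum, cum >= alarm))
    return out

print(zone_chart([0.4, 1.2, 2.5, 1.1], mu0=0.0, sigma0=1.0, n=1))
```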
Champ and Rigdon (1997) use a Markov chain approach to study the average run length properties
of several versions of the run sum control chart. They observe that the run sum control chart can be
designed so that it has the same in-control ARL as a Shewhart with supplementary runs rules and
better ARL performance than the Shewhart chart with runs rules in detecting small or moderate sized
shifts. Their results are consistent with those of Davis, Homer, and Woodall (1990). Jin and Davis
(1991) give a FORTRAN computer program for finding the ARLs of the zone control chart.
Champ and Rigdon (1997) also compare the zone control chart to the Cusum and EWMA control
charts. They observe that by using a sufficient number of regions, the zone control chart can be made
competitive with the Cusum and EWMA, so it could be a viable alternative to these charts.
S10.4. More About Adaptive Control Charts
Section 10.5 of the text discusses adaptive control charts; that is, control charts on which either the
sample size or the sampling interval, or both, are changed periodically depending on the current value
of the sample statistic. Some authors refer to these schemes as variable sample size (VSS) or
variable sampling interval (VSI) control charts. A procedure that changes both parameters would
be called a VSS/SI control chart. The successful application of these types of charts requires some
flexibility on the part of the organization using them, in that occasionally larger than usual samples
will be taken, or a sample will be taken sooner than routinely scheduled. However, the adaptive
schemes offer real advantages in improving control chart performance.
The textbook illustrates a two-state or two-zone system; that is, the control chart has an inner zone in
which the smaller sample size (or longest time between samples) is used, and an outer zone in which
the larger sample size (or shortest time between samples) is used. The book presents an example
involving an x chart demonstrating that an improvement of at least 50% in ATS performance is
possible if the sampling interval can be adapted using a two-state system. An obvious question
concerns the number of states: are two states optimal, or can even better results be obtained by
designing a system with more than two states?
Several authors have examined this question. Runger and Montgomery (1993) have shown that for
the VSI control chart two states are optimal if one is considering the initial-state or zero-state
performance of the control chart (that is, the process is out of control when the control chart is started
up). However, if one considers steady-state performance (the process shifts after the control chart has
been in operation for a long time), then a VSI control chart with more than two states will be optimal.

These authors show that a well-designed two-state VSI control chart will perform nearly as well as
the optimal chart, so that in practical use, there is little to be gained in operational performance by
using more than two states. Zimmer, Montgomery and Runger (1998) consider the VSS control chart
and show that two states are not optimal, although the performance improvements when using more
than two states are modest, and mostly occur when the interest is in detecting small process shifts.
Zimmer, Montgomery and Runger (2000) summarize the performance of numerous adaptive control
chart schemes, and offer some practical guidelines for their use. They observe that, in general,
performance improves more quickly from adapting the sample size than from adapting the sampling
interval.
Tagaras (1998) also gives a nice literature review of the major work in the field up through about
1997. Baxley (1995) gives an interesting account of using VSI control charts in nylon
manufacturing. Park and Reynolds (1994) have presented an economic model of the VSS control
chart, and Prabhu, Montgomery, and Runger (1997) have investigated economic-statistical design of
VSS/SI control charting schemes.


Supplemental Material for Chapter 11


S11.1. Multivariate Cusum Control Charts
In Chapter 11 the multivariate EWMA (or MEWMA) control chart is presented as a relatively
straightforward extension of the univariate EWMA. It was noted that several authors have developed
multivariate extensions of the Cusum. Crosier (1988) proposed two multivariate Cusum procedures.
The one with the best ARL performance is based on the statistic

    Ci = [(S_{i−1} + Xi)′ Σ⁻¹ (S_{i−1} + Xi)]^(1/2)

where

    Si = 0                                  if Ci ≤ k
    Si = (S_{i−1} + Xi)(1 − k/Ci)           if Ci > k

with S0 = 0 and k > 0. An out-of-control signal is generated when

    Yi = (Si′ Σ⁻¹ Si)^(1/2) > H

where k and H are the reference value and decision interval for the procedure, respectively.
Two different forms of the multivariate Cusum were proposed by Pignatiello and Runger (1990).
Their best-performing control chart is based on the following vectors of cumulative sums:

    Di = Σ(j = i − li + 1 to i) Xj

and

    MCi = max{0, (Di′ Σ⁻¹ Di)^(1/2) − k li}

where k > 0, li = l_{i−1} + 1 if MC_{i−1} > 0, and li = 1 otherwise. An out-of-control signal is generated if
MCi > H.
Both of these multivariate Cusums have better ARL performance than the Hotelling T² or the chi-square
control chart. However, the MEWMA has very similar ARL performance to both of these
multivariate Cusums and is much easier to implement in practice, so it should be preferred.
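Crosier's procedure is straightforward to program once the in-control covariance matrix (or an estimate of it) is available. The sketch below is a minimal illustration; the reference value k = 0.5 and decision interval H = 5.5 are hypothetical choices, not recommended design values.

```python
import numpy as np

def crosier_mcusum(X, sigma_inv, k, h):
    """Crosier's multivariate Cusum: returns (Y_i, signal) for each observation vector."""
    S = np.zeros(X.shape[1])
    out = []
    for x in X:
        v = S + x
        C = float(np.sqrt(v @ sigma_inv @ v))
        S = np.zeros_like(S) if C <= k else v * (1.0 - k / C)   # shrink the cumulative vector
        Y = float(np.sqrt(S @ sigma_inv @ S))
        out.append((Y, Y > h))
    return out

# Hypothetical bivariate data with identity covariance: in control for 20 observations,
# then a mean shift of (1, 1).
rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 2)),
               rng.normal([1.0, 1.0], 1.0, size=(15, 2))])
for i, (Y, signal) in enumerate(crosier_mcusum(X, np.eye(2), k=0.5, h=5.5), start=1):
    if signal:
        print("signal at observation", i)
        break
```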


Supplemental Material for Chapter 13


S13.1. Guidelines for Planning Experiments

Coleman and Montgomery (1993) present a discussion of methodology and some


guide sheets useful in the pre-experimental planning phases of designing and
conducting an industrial experiment. The guide sheets are particularly appropriate
for complex, high-payoff or high-consequence experiments involving (possibly)
many factors or other issues that need careful consideration and (possibly) many
responses. They are most likely to be useful in the earliest stages of experimentation
with a process or system. Coleman and Montgomery suggest that the guide sheets
work most effectively when they are filled out by a team of experimenters, including
engineers and scientists with specialized process knowledge, operators and
technicians, managers and (if available) individuals with specialized training and
experience in designing experiments. The sheets are intended to encourage
discussion and resolution of technical and logistical issues before the experiment is
actually conducted.
Coleman and Montgomery give an example involving manufacturing impellers on a
CNC-machine that are used in a jet turbine engine. To achieve the desired
performance objectives, it is necessary to produce parts with blade profiles that
closely match the engineering specifications. The objective of the experiment was to
study the effect of different tool vendors and machine set-up parameters on the
dimensional variability of the parts produced by the CNC-machines.
The master guide sheet is shown in Table S13.1 below. It contains information
useful in filling out the individual sheets for a particular experiment. Writing the
objective of the experiment is usually harder than it appears. Objectives should be
unbiased, specific, measurable and of practical consequence. To be unbiased, the
experimenters must encourage participation by knowledgeable and interested people
with diverse perspectives. It is all too easy to design a very narrow experiment to
prove a pet theory. To be specific and measurable the objectives should be detailed
enough and stated so that it is clear when they have been met. To be of practical
consequence, there should be something that will be done differently as a result of the
experiment, such as a new set of operating conditions for the process, a new material
source, or perhaps a new experiment will be conducted. All interested parties should
agree that the proper objectives have been set.
The relevant background should contain information from previous experiments, if
any, observational data that may have been collected routinely by process operating
personnel, field quality or reliability data, knowledge based on physical laws or
theories, and expert opinion. This information helps quantify what new knowledge
could be gained by the present experiment and motivates discussion by all team
members. Table S13.2 shows the beginning of the guide sheet for the CNC-machining experiment.


Response variables come to mind easily for most experimenters. When there is a
choice, one should select continuous responses, because generally binary and ordinal
data carry much less information and continuous responses measured on a well-defined numerical scale are typically easier to analyze. On the other hand, there are
many situations where a count of defectives, a proportion, or even a subjective
ranking must be used as a response.
Measurement precision is an important aspect of selecting the response variables in
an experiment. Insuring that the measurement process is in a state of statistical
control is highly desirable. That is, ideally there is a well-established system of
insuring both accuracy and precision of the measurement methods to be used. The
amount of error in measurement imparted by the gauges used should be understood.
If the gauge error is large relative to the change in the response variable that is
important to detect, then the experimenter will want to know this before conducting
the experiment. Sometimes repeat measurements can be made on each experimental
unit or test specimen to reduce the impact of measurement error. For example, when
measuring the number average molecular weight of a polymer with a gel permeation
chromatograph (GPC) each sample can be tested several times and the average of
those molecular weight readings reported as the observation for that sample. When
measurement precision is unacceptable, a measurement systems capability study may
be performed to attempt to improve the system. These studies are often fairly
complicated designed experiments. Chapter 8 presents an example of a factorial
experiment used to study the capability of a measurement system.
The impeller involved in this experiment is shown in Figure S13.1. Table S13.3 lists
the information about the response variables. Notice that there are three response
variables of interest here.
As with response variables, most experimenters can easily generate a list of candidate
design factors to be studied in the experiment. Coleman and Montgomery call these
control variables. We often call them controllable variables, design factors, or
process variables in the text. Control variables can be continuous or categorical
(discrete). The ability of the experimenters to measure and set these factors is
important. Generally, small errors in the ability to set, hold or measure the levels of
control variables are of relatively little consequence. Sometimes when the
measurement or setting error is large, a numerical control variable such as
temperature will have to be treated as a categorical control variable (low or high
temperature). Alternatively, there are errors-in-variables statistical models that can
be employed, although their use is beyond the scope of this book. Information about
the control variables for the CNC-machining example is shown in Table S13.4.
Held-constant factors are control variables whose effects are not of interest in this
experiment. The worksheets can force meaningful discussion about which factors are
adequately controlled, and if any potentially important factors (for purposes of the
present experiment) have inadvertently been held constant when they should have

been included as control variables. Sometimes subject-matter experts will elect to


hold too many factors constant and as a result fail to identify useful new information.
Often this information is in the form of interactions among process variables.
Table S13.1. Master Guide Sheet. This guide can be used to help plan and design an
experiment. It serves as a checklist to improve experimentation and ensures that results are not
corrupted for lack of careful planning. Note that it may not be possible to answer all questions
completely. If convenient, use supplementary sheets for topics 4-8
1. Experimenter's Name and Organization:
Brief Title of Experiment:
2. Objectives of the experiment (should be unbiased, specific, measurable, and
of practical consequence):
3. Relevant background on response and control variables: (a) theoretical relationships; (b)
expert knowledge/experience; (c) previous experiments. Where does this experiment fit into the
study of the process or system?:
4. List: (a) each response variable, (b) the normal response variable level at which the process
runs, the distribution or range of normal operation, (c) the precision or range to which it can be
measured (and how):
5. List: (a) each control variable, (b) the normal control variable level at which the process is run,
and the distribution or range of normal operation, (c) the precision (s) or range to which it can be
set (for the experiment, not ordinary plant operations) and the precision to which it can be
measured, (d) the proposed control variable settings, and
(e) the predicted effect (at least qualitative) that the settings will have on each response variable:
6. List: (a) each factor to be "held constant" in the experiment, (b) its desired level
and allowable s or range of variation, (c) the precision or range to which it can
measured (and how), (d) how it can be controlled, and (e) its expected impact, if any,
on each of the responses:
7. List: (a) each nuisance factor (perhaps time-varying), (b) measurement precision, (c)strategy
(e.g., blocking, randomization, or selection), and (d) anticipated effect:
8. List and label known or suspected interactions:
9. List restrictions on the experiment, e.g., ease of changing control variables, methods of data
acquisition, materials, duration, number of runs, type of experimental unit (need for a split-plot
design), illegal or irrelevant experimental regions, limits to randomization, run order, cost of
changing a control variable setting, etc.:
10. Give current design preferences, if any, and reasons for preference, including
blocking and randomization:
11. If possible, propose analysis and presentation techniques, e.g., plots,
ANOVA, regression, plots, t tests, etc.:


12. Who will be responsible for the coordination of the experiment?


13. Should trial runs be conducted? Why / why not?

Table S13.2. Beginning of Guide Sheet for CNC-Machining Study.


1. Experimenter's Name and Organization: John Smith, Process Eng. Group
Brief Title of Experiment: CNC Machining Study
2. Objectives of the experiment (should be unbiased, specific, measurable, and
of practical consequence):
For machined titanium forgings, quantify the effects of tool vendor; shifts in a-axis, x- axis, y-axis,
and z-axis; spindle speed; fixture height; feed rate; and spindle position on
the average and variability in blade profile for class X impellers, such as shown in
Figure 1.
3. Relevant background on response and control variables: (a) theoretical relationships; (b)
expert knowledge/experience; (c) previous experiments. Where does this experiment fit into the
study of the process or system?
(a) Because of tool geometry, x-axis shifts would be expected to produce thinner blades, an
undesirable characteristic of the airfoil.
(b) This family of parts has been produced for over 10 years; historical experience indicates that
externally reground tools do not perform as well as those from the internal vendor (our own
regrind operation).
(c) Smith (1987) observed in an internal process engineering study that current spindle speeds
and feed rates work well in producing parts that are at the nominal profile required by the
engineering drawings - but no study was done of the sensitivity to variations in set-up
parameters.

Results of this experiment will be used to determine machine set-up parameters for impeller
machining. A robust process is desirable; that is, on-target and low variability performance
regardless of which tool vendor is used.


Figure S13.1. Jet engine impeller (side view). The z-axis is vertical, x-axis is horizontal, y-axis is
into the page. 1 = height of wheel, 2 = diameter of wheel, 3 = inducer blade height, 4 = exducer blade
height, 5 = z height of blade.
Table S13.3. Response Variables

Response variable    Normal operating level          Measurement precision,           Relationship of response
(units)              and range                       accuracy - how known?            variable to objective
Blade profile        Nominal (target) ±1 x 10-3      σE ≈ 1 x 10-5 inches, from a     Estimate mean absolute difference
(inches)             inches to ±2 x 10-3 inches      coordinate measurement           from target and standard deviation
                     at all points                   machine capability study
Surface finish       Smooth to rough (requiring      Visual criterion (compare        Should be as smooth as possible
                     hand finish)                    to standards)
Surface defect       Typically 0 to 10               Visual criterion (compare        Must not be excessive in
count                                                to standards)                    number or magnitude


Table S13.4. Control Variables

Control variable    Normal level         Measurement precision and     Proposed settings, based     Predicted effects
(units)             and range            setting error - how known?    on predicted effects         (for various responses)
x-axis shift*       0-.020 inches        .001 inches (experience)      0, .015 inches               Difference
(inches)
y-axis shift*       0-.020 inches        .001 inches (experience)      0, .015 inches               Difference
(inches)
z-axis shift*       0-.020 inches        .001 inches (experience)      0, .015 inches               Difference
(inches)
Tool vendor         Internal, external                                 Internal, external           External is more variable
a-axis shift*       0-.030 degrees       .001 degrees (guess)          0, .030 degrees              Unknown
(degrees)
Spindle speed       85-115%              1% (indicator on              90%, 110%                    None?
(% of nominal)                           control panel)
Fixture height      0-.025 inches        .002 inches (guess)           0, .015 inches               Unknown
Feed rate           90-110%              1% (indicator on              90%, 110%                    None?
(% of nominal)                           control panel)

*The x, y, and z axes are used to refer to the part and the CNC machine. The a-axis refers only to the machine.

In the CNC experiment, this worksheet helped the experimenters recognize that the machine had to be fully
warmed up before cutting any blade forgings. The actual procedure used was to mount the forged blanks on
the machine and run a 30-minute cycle without the cutting tool engaged. This allowed all machine parts and
the lubricant to reach normal, steady-state operating temperature. The use of a typical (i.e., mid-level) operator
and the use of one lot of forgings were decisions made for experimental insurance. Table S13.5 shows the
held-constant factors for the CNC-machining experiment.

Table S13.5. Held-Constant Factors

Factor              Desired experimental level     Measurement precision -     How to control               Anticipated
(units)             and allowable range            how known?                  (in experiment)              effects
Type of cutting     Standard type                  Not sure, but thought       Use one type                 None
fluid                                              to be adequate
Temperature of      100 F. when machine            1-2 F. (estimate)           Do runs when machine         None
cutting fluid       is warmed up                                               has reached 100 F.
(degrees F.)
Operator            Several operators normally                                 Use one "mid-level"          None
                    work in the process                                        operator
Titanium            Material properties may        Precision of lab            Use one lot (or block on     Slight
forgings            vary from unit to unit         tests unknown               forging lot, only if
                                                                               necessary)

Nuisance factors are variables that probably have some effect on the response, but which are of little or no
interest to the experimenter. They differ from held-constant factors in that they either cannot be held entirely
constant, or they cannot be controlled at all. For example, if two lots of forgings were required to run the
experiment, then the potential lot-to-lot differences in the material would be a nuisance variable that could not
be held entirely constant. In a chemical process we often cannot control the viscosity (say) of the incoming
material feed stream; it may vary almost continuously over time. In these cases, nuisance variables must be
considered in either the design or the analysis of the experiment. If a nuisance variable can be controlled, then
we can use a design technique called blocking to eliminate its effect. If the nuisance variable cannot be
controlled but it can be measured, then we can reduce its effect by an analysis technique called the analysis of
covariance. Montgomery (2005) gives an introduction to the analysis of covariance.
Table S13.6 shows the nuisance variables identified in the CNC-machining experiment. In this experiment,
the only nuisance factor thought to have potentially serious effects was the machine spindle. The machine has
four spindles, and ultimately a decision was made to run the experiment in four blocks. The other factors were
held constant at levels below which problems might be encountered.


Table S13.6. Nuisance Factors

Nuisance factor (units) | Measurement precision - how known? | Strategy (e.g., randomization, blocking, etc.) | Anticipated effects

Viscosity of cutting fluid            | Standard viscosity        | Measure viscosity at start and end            | None to slight
Ambient temperature (°F)              | 1-2°F by room thermometer | Make runs below 80°F                          | Slight, unless very hot weather
Spindle                               |                           | Block or randomize on machine spindle         | Spindle-to-spindle variation could be large
Vibration of machine during operation |                           | Do not move heavy objects in CNC machine shop | Severe vibration can introduce variation within an impeller

Coleman and Montgomery also found it useful to introduce an interaction sheet. The concept of interactions
among process variables is not an intuitive one, even to well-trained engineers and scientists. Now it is clearly
unrealistic to think that the experimenters can identify all of the important interactions at the outset of the
planning process. In most situations, the experimenters really dont know which main effects are likely to be
important, so asking them to make decisions about interactions is impractical. However, sometimes the
statistically-trained team members can use this as an opportunity to teach others about the interaction
phenomena. When more is known about the process, it might be possible to use the worksheet to motivate
questions such as are there certain interactions that must be estimated? Table S13.7 shows the results of this
exercise for the CNC-machining example.
Table S13.7. Interactions

(The interaction worksheet is a matrix with the control variables x shift, y shift, z shift, Vendor, a shift, Speed, Height, and Feed as its rows and columns. Only two anticipated interactions were entered on the sheet: one involving the tool vendor, marked P, and one involving the feed rate, marked F, D; all other cells were left blank.)

NOTE: Response variables are P = profile difference, F = surface finish and D = surface defects

Two final points: First, an experiment without a coordinator will probably fail. Furthermore, if something can go wrong, it probably will, so the coordinator has a significant responsibility for checking to ensure that the experiment is being conducted as planned. Second, concerning trial runs, this is often a very good idea, particularly if this is the first in a series of experiments, or if the experiment has high significance or impact. A trial run can consist of a center point in a factorial or a small part of the experiment, perhaps one of the blocks. Since many experiments often involve people and machines doing something they have not done before, practice is a good idea. Another reason for trial runs is that we can use them to get an estimate of the magnitude of experimental error. If the experimental error is much larger than anticipated, then this may indicate the need for redesigning a significant part of the experiment. Trial runs are also a good opportunity to ensure that measurement and data-acquisition or collection systems are operating as anticipated. Most experimenters never regret performing trial runs.

Blank Guide Sheets from Coleman and Montgomery (1993)


Response Variables
response variable (units) | normal operating level & range | meas. precision, accuracy - how known? | relationship of response variable to objective

Control Variables
control variable (units) | normal level & range | meas. precision & setting error - how known? | proposed settings, based on predicted effects | predicted effects (for various responses)

Held Constant Factors
factor (units) | desired experimental level & allowable range | measurement precision - how known? | how to control (in experiment) | anticipated effects

Nuisance Factors
nuisance factor (units) | measurement precision - how known? | strategy (e.g., randomization, blocking, etc.) | anticipated effects

Interactions
control var. 1, control var. 2, ... (the control variables are listed as both the rows and the columns of a blank interaction matrix)

Other Graphical Aids for Planning Experiments


In addition to the tables in Coleman and Montgomery's Technometrics paper, there are a number of useful
graphical aids to pre-experimental planning. Perhaps the first person to suggest graphical methods for planning
an experiment was Andrews (1964), who proposed a schematic diagram of the system much like Figure 13-1
in the textbook, with inputs, experimental variables, and responses all clearly labeled. These diagrams can be
very helpful in focusing attention on the broad aspects of the problem.
Barton (1997, 1998, 1999) has discussed a number of useful graphical aids in planning experiments. He
suggests using IDEF0 diagrams to identify and classify variables. IDEF0 stands for Integrated Computer
Aided Manufacturing Identification Language, Level 0. The U. S. Air Force developed it to represent the
subroutines and functions of complex computer software systems. The IDEF0 diagram is a block diagram that
resembles Figure 13-1 in the textbook. IDEF0 diagrams are hierarchical; that is, the process or system can be
decomposed into a series of process steps or systems and represented as a sequence of lower-level boxes drawn
within the main block diagram.
Barton also suggests that cause-and-effect diagrams can be useful in identifying and classifying variables
in an experimental design problem. These diagrams are very useful in organizing and conducting
brainstorming or other problem-solving meetings in which process variables and their potential role in the
experiment are discussed and decided.
Both of these techniques can be very helpful in uncovering intermediate variables. These are variables that
are often confused with the directly adjustable process variables. For example, the burning rate of a rocket
propellant may be affected by the presence of voids in the propellant material. However, the voids are the
result of mixing techniques, curing temperature and other process variables and so the experimenter cannot
directly control the voids themselves.
Some other useful papers on planning experiments include Bishop, Petersen and Trayser (1982), Hahn (1977, 1984), and Hunter (1977).

S13.2. Using a t-Test for Detecting Curvature

In Section 13-5.4 of the textbook we discuss the addition of center points to a $2^k$ factorial design. This is a very useful idea as it allows an estimate of pure error to be obtained even though the factorial design points are not replicated, and it permits the experimenter to obtain an assessment of model adequacy with respect to certain second-order terms. Specifically, we present an F-test for the hypotheses

$$H_0: \beta_{11} + \beta_{22} + \cdots + \beta_{kk} = 0$$
$$H_1: \beta_{11} + \beta_{22} + \cdots + \beta_{kk} \neq 0$$

An equivalent t-statistic can also be employed to test these hypotheses. Some


computer software programs report the t-test instead of (or in addition to) the F-test.
It is not difficult to develop the t-test and to show that it is equivalent to the F-test.
Suppose that the appropriate model for the response is a complete quadratic polynomial and that the experimenter has conducted an unreplicated full $2^k$ factorial design with $n_F$ design points plus $n_C$ center points. Let $\bar{y}_F$ and $\bar{y}_C$ represent the averages of the responses at the factorial and center points, respectively. Also let $\hat{\sigma}^2$ be the estimate of the variance obtained using the center points. It is easy to show that

$$E(\bar{y}_F) = \frac{1}{n_F}\left(n_F\beta_0 + n_F\beta_{11} + n_F\beta_{22} + \cdots + n_F\beta_{kk}\right) = \beta_0 + \beta_{11} + \beta_{22} + \cdots + \beta_{kk}$$

and

$$E(\bar{y}_C) = \frac{1}{n_C}\left(n_C\beta_0\right) = \beta_0$$

Therefore,

$$E(\bar{y}_F - \bar{y}_C) = \beta_{11} + \beta_{22} + \cdots + \beta_{kk}$$

and so we see that the difference in averages $\bar{y}_F - \bar{y}_C$ is an unbiased estimator of the sum of the pure quadratic model parameters. Now the variance of $\bar{y}_F - \bar{y}_C$ is

$$V(\bar{y}_F - \bar{y}_C) = \sigma^2\left(\frac{1}{n_F} + \frac{1}{n_C}\right)$$

Consequently, a test of the above hypotheses can be conducted using the statistic

$$t_0 = \frac{\bar{y}_F - \bar{y}_C}{\sqrt{\hat{\sigma}^2\left(\dfrac{1}{n_F} + \dfrac{1}{n_C}\right)}}$$

which under the null hypothesis follows a t distribution with $n_C - 1$ degrees of freedom. We would reject the null hypothesis (that is, no pure quadratic curvature) if $|t_0| > t_{\alpha/2, n_C - 1}$.
This t-test is equivalent to the F-test given in the book. To see this, square the t-statistic above:

$$t_0^2 = \frac{(\bar{y}_F - \bar{y}_C)^2}{\hat{\sigma}^2\left(\dfrac{1}{n_F} + \dfrac{1}{n_C}\right)} = \frac{n_F n_C (\bar{y}_F - \bar{y}_C)^2}{(n_F + n_C)\,\hat{\sigma}^2}$$

This ratio is identical to the F-test presented in the textbook. Furthermore, we know that the square
of a t random variable with (say) v degrees of freedom is an F random variable with 1 numerator and
v denominator degrees of freedom, so the t-test for pure quadratic effects is indeed equivalent to
the F-test.
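As a concrete illustration of the computation, here is a minimal sketch in Python; the factorial and center-point responses below are hypothetical values, not data from the textbook.

```python
import numpy as np
from scipy import stats

# Hypothetical responses from an unreplicated 2^2 factorial plus five center points
y_factorial = np.array([39.3, 40.0, 40.9, 41.5])        # n_F = 4 factorial-point responses
y_center = np.array([40.3, 40.5, 40.7, 40.2, 40.6])     # n_C = 5 center-point responses

n_F, n_C = len(y_factorial), len(y_center)
ybar_F, ybar_C = y_factorial.mean(), y_center.mean()
sigma2_hat = y_center.var(ddof=1)                        # pure-error estimate from the center points

# t statistic for pure quadratic curvature and its two-sided P-value on n_C - 1 df
t0 = (ybar_F - ybar_C) / np.sqrt(sigma2_hat * (1.0 / n_F + 1.0 / n_C))
p_value = 2.0 * stats.t.sf(abs(t0), df=n_C - 1)
print(f"t0 = {t0:.3f}, P = {p_value:.4f}, t0^2 = {t0**2:.3f}")
# Squaring t0 reproduces the curvature F statistic n_F*n_C*(ybar_F - ybar_C)^2 / ((n_F + n_C)*sigma2_hat)
```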

S13.3. Blocking in Designed Experiments


In many experimental problems it is necessary to design the experiment so that variability arising from
nuisance factors can be controlled. As an example, consider the grocery bag paper tensile strength experiment
described in the book. Recall that the runs must be conducted in a pilot plant. Suppose that each run takes
about two hours to complete, so that at most four runs can be made on a single day. Now it is certainly possible
that pilot plant operations may not be completely consistent from day-to-day, due to variations in
environmental conditions, materials, operation of the test equipment for making the tensile strength
measurements, changes in operating personnel, and so forth. All these sources of variability can be combined
into a single source of nuisance variability called time.

A simple method can be used to keep the variability associated with a nuisance variable from impacting
experimental results. On each day (or, in general, at each possible level) of the nuisance variable, test all
treatment or factor levels of interest. In our example, this would consist of testing all four hardwood
concentrations that are of interest on a single day. On each day, the four tests are conducted in random order.
This type of experimental design is called a randomized complete block design or RCBD. In the RCBD,
the block size must be large enough to hold all the treatments. If this condition is not satisfied, then an
incomplete block design must be used. These incomplete block designs are discussed in some experimental
design textbooks; for example, see Montgomery (2005).
In general, for a RCBD with a treatments we will run a complete replicate of these treatments in each of b
blocks. The order in which the runs are made in each block is completely random. The statistical model for
the RCBD is

$$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \qquad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b$$

where $\mu$ is an overall mean, $\tau_i$ is the ith treatment effect, $\beta_j$ is the jth block effect, and $\epsilon_{ij}$ is a random error term, taken to be NID(0, $\sigma^2$). We will think of the treatments as fixed factors. Defining the treatment and block effects as deviations from an overall mean leads to the test on equality of the treatment means being equivalent to a test of the hypotheses

$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0 \quad \text{versus} \quad H_1: \text{at least one } \tau_i \neq 0$$

The analysis of variance (ANOVA) can be adapted to the analysis of the RCBD. The fundamental ANOVA equality becomes

$$\sum_{i=1}^{a}\sum_{j=1}^{b}(y_{ij} - \bar{y}_{..})^2 = b\sum_{i=1}^{a}(\bar{y}_{i.} - \bar{y}_{..})^2 + a\sum_{j=1}^{b}(\bar{y}_{.j} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}(y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2$$

or

$$SS_T = SS_{Factor} + SS_{Blocks} + SS_E$$

The number of degrees of freedom associated with these sums of squares is

$$ab - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1)$$

The null hypothesis of no difference in factor level or treatment means is tested by the statistic

$$F_0 = \frac{MS_{Factor}}{MS_E} = \frac{SS_{Factor}/(a-1)}{SS_E/[(a-1)(b-1)]}$$

To illustrate the procedure, reconsider the tensile strength experiment data in Table 4-6, and suppose
that the experiment was run as a RCBD. Now the columns of Table 4-6 (currently labeled
Observations) would be labeled blocks or days. Minitab will perform the RCBD analysis. The
output follows.
ANOVA: strength versus concentration, day (or blocks)

Factor  Type   Levels  Values
conc    fixed       4  5, 10, 15, 20
day     fixed       6  1, 2, 3, 4, 5, 6

Analysis of Variance for strength

Source   DF       SS       MS      F      P
conc      3  382.792  127.597  25.03  0.000
day       5   53.708   10.742   2.11  0.121
Error    15   76.458    5.097
Total    23  512.958

Notice that the concentration factor is significant; that is, there is evidence that changing the hardwood concentration affects the mean strength. The F-ratio for blocks or days is small, suggesting that the variability associated with the blocks was small. There are some technical problems associated with statistical testing of block effects; see the discussion in Montgomery (2005, Chapter 5).
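The same RCBD analysis is easy to reproduce outside of Minitab. The following sketch uses Python with statsmodels; the strength values were entered by hand as illustrative numbers, so treat the resulting sums of squares as an example of the workflow rather than as the textbook's output.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Illustrative tensile-strength data: 4 hardwood concentrations (treatments),
# each observed once on each of 6 days (blocks).
data = pd.DataFrame({
    "conc": [5]*6 + [10]*6 + [15]*6 + [20]*6,
    "day": list(range(1, 7)) * 4,
    "strength": [7, 8, 15, 11, 9, 10,
                 12, 17, 13, 18, 19, 15,
                 14, 18, 19, 17, 16, 18,
                 19, 25, 22, 23, 18, 20],
})

# RCBD model: strength = overall mean + treatment effect + block effect + error
model = smf.ols("strength ~ C(conc) + C(day)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))   # ANOVA table with conc, day, and residual lines
```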
The blocking principle can be extended to experiments with more complex treatment structures. For example, in Section 13-5.5, we observe that in a replicated factorial experiment, each replicate can be run in a single block. Thus a nuisance factor can be accommodated in a factorial experiment. As an illustration, consider the router experiment (Example 13-6 in the text). Suppose that each of the replicates was run on a single printed circuit board. Considering boards (or replicates) as blocks, we can analyze this experiment as a $2^2$ factorial in four blocks. The Minitab analysis follows. Notice that both main effects and the interaction are important. There is also some indication that the block effects are significant.

Fractional Factorial Fit: Vibration versus A, B

Estimated Effects and Coefficients for Vibration (coded units)

Term        Effect     Coef  SE Coef      T      P
Constant            23.831   0.4359   54.67  0.000
Block 1              1.744   0.7550    2.31  0.046
Block 2              1.494   0.7550    1.98  0.079
Block 3             -2.156   0.7550   -2.86  0.019
A           16.638    8.319  0.4359   19.08  0.000
B            7.538    3.769  0.4359    8.65  0.000
A*B          8.712    4.356  0.4359    9.99  0.000

Analysis of Variance for Vibration (coded units)

Source               DF   Seq SS   Adj SS   Adj MS       F      P
Blocks                3    44.36    44.36   14.787    4.86  0.028
Main Effects          2  1334.48  1334.48  667.241  219.48  0.000
2-Way Interactions    1   303.63   303.63  303.631   99.88  0.000
Residual Error        9    27.36    27.36    3.040
Total                15  1709.83

S13.4. More About Expected Mean Squares in the Analysis of Variance

The Two-Factor Fixed-Effects Model


In Section 13.4 we describe the two-factor factorial experiment and present the
analysis of variance for the fixed-effects case. We observe that dividing the main
effect and interaction mean squares by the mean square for error forms the proper test
statistics. Examining the expected mean squares can verify this.
Consider the two-factor fixed-effects model

$$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}, \qquad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b; \; k = 1, 2, \ldots, n$$

given as Equation (13.2) in the textbook. It is relatively easy to develop the expected
mean squares from direct application of the expectation operator.
For an illustration, consider finding the expected value for one of the main effect
mean squares, say
$$E(MS_A) = E\left(\frac{SS_A}{a-1}\right) = \frac{1}{a-1}E(SS_A)$$

where $SS_A$ is the sum of squares for the row factor. Since

$$SS_A = \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2 - \frac{y_{...}^2}{abn}$$

we have

$$E(SS_A) = \frac{1}{bn}\sum_{i=1}^{a} E(y_{i..}^2) - E\left(\frac{y_{...}^2}{abn}\right)$$

Recall that $\tau_. = 0$, $\beta_. = 0$, $(\tau\beta)_{.j} = 0$, $(\tau\beta)_{i.} = 0$, and $(\tau\beta)_{..} = 0$, where the dot subscript implies summation over that subscript. Now

$$y_{i..} = \sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk} = bn\mu + bn\tau_i + n\beta_. + n(\tau\beta)_{i.} + \epsilon_{i..} = bn\mu + bn\tau_i + \epsilon_{i..}$$

and

$$\frac{1}{bn}\sum_{i=1}^{a}E(y_{i..}^2) = \frac{1}{bn}\sum_{i=1}^{a}E\left[(bn\mu)^2 + (bn)^2\tau_i^2 + \epsilon_{i..}^2 + 2(bn)^2\mu\tau_i + 2bn\mu\epsilon_{i..} + 2bn\tau_i\epsilon_{i..}\right]$$
$$= \frac{1}{bn}\left[a(bn\mu)^2 + (bn)^2\sum_{i=1}^{a}\tau_i^2 + abn\sigma^2\right] = abn\mu^2 + bn\sum_{i=1}^{a}\tau_i^2 + a\sigma^2$$

Furthermore, we can easily show that

$$y_{...} = abn\mu + \epsilon_{...}$$

so

$$\frac{1}{abn}E(y_{...}^2) = \frac{1}{abn}E\left[(abn\mu)^2 + \epsilon_{...}^2 + 2abn\mu\epsilon_{...}\right] = \frac{1}{abn}\left[(abn\mu)^2 + abn\sigma^2\right] = abn\mu^2 + \sigma^2$$

Therefore

$$E(MS_A) = E\left(\frac{SS_A}{a-1}\right) = \frac{1}{a-1}\left[abn\mu^2 + bn\sum_{i=1}^{a}\tau_i^2 + a\sigma^2 - \left(abn\mu^2 + \sigma^2\right)\right]$$
$$= \frac{1}{a-1}\left[\sigma^2(a-1) + bn\sum_{i=1}^{a}\tau_i^2\right] = \sigma^2 + \frac{bn\sum_{i=1}^{a}\tau_i^2}{a-1}$$

The other expected mean squares are derived similarly. In general, for the fixed-effects model, the expected value of the mean squares for main effects and interaction are equal to the error variance $\sigma^2$ plus a term involving the corresponding fixed effect. The fixed-effect term will be zero if the treatment means are zero or if the interaction effects are negligible. The expected value of the error mean square is $\sigma^2$, so the ratio of the model term mean square to the error mean square results in a one-sided upper-tail test. The use of the F-distribution as the reference distribution follows from the normality assumption on the response variable.
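The algebra above can also be checked numerically. The short simulation below is only a sketch: the effect sizes are arbitrary hypothetical values, and the interaction effects are set to zero. It averages MS_A over many simulated data sets and compares the result to the theoretical value $\sigma^2 + bn\sum\tau_i^2/(a-1)$.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, n, sigma = 3, 4, 2, 1.0
tau = np.array([-1.0, 0.0, 1.0])            # treatment effects, constrained to sum to zero
beta = np.array([-0.5, -0.5, 0.5, 0.5])     # B-factor effects, also summing to zero

ms_a = []
for _ in range(20_000):
    eps = rng.normal(0.0, sigma, size=(a, b, n))
    y = 10.0 + tau[:, None, None] + beta[None, :, None] + eps   # no interaction terms
    ybar_i = y.mean(axis=(1, 2))                                # factor-A level averages
    ss_a = b * n * np.sum((ybar_i - y.mean()) ** 2)
    ms_a.append(ss_a / (a - 1))

print(np.mean(ms_a))                                  # simulated average of MS_A
print(sigma**2 + b * n * np.sum(tau**2) / (a - 1))    # theoretical E(MS_A), 9.0 here
```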
The Random Effects Model
In Section 8.6.2 we discuss briefly the use of analysis of variance methods for
measurement systems capability studies. The two-factor factorial random effects
model is assumed to be appropriate for the problem. The model is
$$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}, \qquad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b; \; k = 1, 2, \ldots, n$$

as given in Equation (8.23) in the textbook. We list the expected mean squares for this model in Equation (8.26), but do not formally develop them. It is relatively easy to develop the expected mean squares from direct application of the expectation operator.
For example, consider finding

$$E(MS_A) = E\left(\frac{SS_A}{a-1}\right) = \frac{1}{a-1}E(SS_A)$$

where $SS_A$ is the sum of squares for the row factor. Recall that the model components $\tau_i$, $\beta_j$ and $(\tau\beta)_{ij}$ are normally and independently distributed with means zero and variances $\sigma_\tau^2$, $\sigma_\beta^2$, and $\sigma_{\tau\beta}^2$, respectively. The sum of squares and its expectation are defined as

$$SS_A = \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2 - \frac{y_{...}^2}{abn}$$

and

$$E(SS_A) = \frac{1}{bn}\sum_{i=1}^{a} E(y_{i..}^2) - E\left(\frac{y_{...}^2}{abn}\right)$$

Now

$$y_{i..} = \sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk} = bn\mu + bn\tau_i + n\beta_. + n(\tau\beta)_{i.} + \epsilon_{i..}$$

and

$$\frac{1}{bn}\sum_{i=1}^{a}E(y_{i..}^2) = \frac{1}{bn}\left[a(bn\mu)^2 + a(bn)^2\sigma_\tau^2 + abn^2\sigma_\beta^2 + abn^2\sigma_{\tau\beta}^2 + abn\sigma^2\right] = abn\mu^2 + abn\sigma_\tau^2 + an\sigma_\beta^2 + an\sigma_{\tau\beta}^2 + a\sigma^2$$

Furthermore, we can show that

$$y_{...} = abn\mu + bn\tau_. + an\beta_. + n(\tau\beta)_{..} + \epsilon_{...}$$

so the second term in the expected value of $SS_A$ becomes

$$\frac{1}{abn}E(y_{...}^2) = \frac{1}{abn}\left[(abn\mu)^2 + a(bn)^2\sigma_\tau^2 + b(an)^2\sigma_\beta^2 + abn^2\sigma_{\tau\beta}^2 + abn\sigma^2\right] = abn\mu^2 + bn\sigma_\tau^2 + an\sigma_\beta^2 + n\sigma_{\tau\beta}^2 + \sigma^2$$

We can now collect the components of the expected value of the sum of squares for factor A and find the expected mean square as follows:

$$E(MS_A) = E\left(\frac{SS_A}{a-1}\right) = \frac{1}{a-1}\left[\frac{1}{bn}\sum_{i=1}^{a}E(y_{i..}^2) - E\left(\frac{y_{...}^2}{abn}\right)\right]$$
$$= \frac{1}{a-1}\left[\sigma^2(a-1) + n(a-1)\sigma_{\tau\beta}^2 + bn(a-1)\sigma_\tau^2\right] = \sigma^2 + n\sigma_{\tau\beta}^2 + bn\sigma_\tau^2$$
This agrees with the first result in Equation (8.26).
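In a measurement systems capability study, expected mean squares of this form are what justify the usual ANOVA-method estimates of the variance components. The sketch below solves the expected-mean-square equations of the two-factor random model; the numerical mean squares and the values of a, b, and n are hypothetical, the labeling of factor A as parts and factor B as operators is our assumption for illustration, and negative estimates are truncated at zero (a common but not universal convention).

```python
def variance_components(ms_a, ms_b, ms_ab, ms_e, a, b, n):
    """Solve the expected-mean-square equations of the two-factor random-effects model:
       E(MS_A)  = sigma^2 + n*sigma_tb^2 + b*n*sigma_t^2
       E(MS_B)  = sigma^2 + n*sigma_tb^2 + a*n*sigma_b^2
       E(MS_AB) = sigma^2 + n*sigma_tb^2
       E(MS_E)  = sigma^2
    """
    sigma2 = ms_e
    sigma2_tb = max((ms_ab - ms_e) / n, 0.0)      # interaction component
    sigma2_t = max((ms_a - ms_ab) / (b * n), 0.0) # factor A (e.g., parts)
    sigma2_b = max((ms_b - ms_ab) / (a * n), 0.0) # factor B (e.g., operators)
    return {"error": sigma2, "A": sigma2_t, "B": sigma2_b, "AB": sigma2_tb}

# Hypothetical mean squares for a = 20 parts, b = 3 operators, n = 2 measurements each
print(variance_components(ms_a=62.39, ms_b=1.31, ms_ab=0.46, ms_e=0.51, a=20, b=3, n=2))
```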


There are situations where only a specific set of operators performs the measurements in a gage R & R study, so we cannot think of the operators as having
been selected at random from a large population. Thus an ANOVA model involving
parts chosen at random and fixed operator effects would be appropriate. This is a
mixed model ANOVA. For details of using mixed models in measurement systems
capability studies, see Montgomery (2005), Burdick, Borror, and Montgomery
(2003), and Dolezal, Burdick, and Birch (1998).


Supplemental Material for Chapter 14


S14.1. Response Surface Designs
Example 14.2 introduces the central composite design (CCD), perhaps the most widely used design for fitting
the second-order response surface model. The CCD is very attractive for several reasons: (1) it requires fewer
runs than some of its competitors, such as the $3^k$ factorial, (2) it can be built up from a first-order design (the $2^k$) by adding the axial runs, and (3) the design has some nice properties, such as the rotatability property
discussed in the text.
The factorial runs in the CCD are important in estimating the first-order (or main) effects in the model as well
as the interaction or cross-product terms. The axial runs contribute towards estimation of the pure quadratic
terms, plus they also contribute to estimation of the main effects. The center points contribute to estimation
of the pure quadratic terms. The CCD can also be run in blocks, with the factorial portion of the design plus
center points forming one block and the axial runs plus some additional center points forming the second block.
For other blocking strategies involving CCDs, refer to Montgomery (2005) or Myers and Montgomery (2002).
In addition to the CCD, there are some other designs that can be useful for fitting a second-order response
surface model. Some of these designs have been created as alternatives to overcome some possible objections
to the CCD.
A fairly common criticism is that the CCD requires 5 levels for each design factor, while the minimum number
of levels required to fit a second-order model is 3. It is easy to modify the CCD to contain only 3 levels for
each factor by setting the axial distance $\alpha = 1$. This places the axial runs in the center of each face of the
cube, resulting in a design called the face-centered cube. The Box-Behnken design is also a second-order
design with all factors at 3 levels. In this design, the runs are located at the centers of the edges of the cube
and not at the corners.
Another criticism of the CCD is that although the designs are not large, they are far from minimal. For
example, a CCD in k = 4 design factors requires 16 factorial runs, 8 axial runs, plus $n_C \geq 1$ center points. This results in a design with at least 25 runs, while the second-order model in k = 4 design factors has only 15 parameters. Obviously, there are situations where it would be desirable to reduce the number of required runs. One approach is to use a fractional factorial in the cube. However, the fraction must be either of resolution V, or it must be of resolution III* (main effects aliased with two-factor interactions, but no two-factor interactions aliased with each other). A resolution IV design cannot be used because that results in two-factor interactions aliased with each other, so the cross-product terms in the second-order model cannot be estimated. A small composite design is a CCD with a resolution III* fraction in the cube. For the k = 4 design factor
example, this would involve setting the generator D = AB for the one-half fraction (the standard resolution IV
half-fraction uses D = ABC). This results in a small composite design with a minimum of 17 runs. Hybrid
designs are another type of small response surface design that are in many ways superior to the small
composite. These and other types of response surface designs are supported by several statistics software
packages. These designs are discussed extensively in Myers and Montgomery (2002).
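To make the run-count arithmetic concrete, here is a small sketch that builds the coded design matrix of a CCD directly, without assuming any design-of-experiments library. With k = 4 and a single center point it produces the 16 + 8 + 1 = 25 runs mentioned above; passing alpha = 1 instead gives the face-centered cube.

```python
import itertools
import numpy as np

def central_composite(k, n_center=4, alpha=None):
    """Coded design matrix of a CCD: 2^k factorial corners, 2k axial runs at
    distance alpha from the center, and n_center center points."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25    # rotatable choice: alpha = (number of factorial runs)^(1/4)
    corners = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.zeros((2 * k, k))
    for j in range(k):
        axial[2 * j, j] = -alpha
        axial[2 * j + 1, j] = alpha
    center = np.zeros((n_center, k))
    return np.vstack([corners, axial, center])

design = central_composite(k=4, n_center=1)
print(design.shape)                 # (25, 4): 16 factorial + 8 axial + 1 center run
```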

S14.2. More about Robust Design and Process Robustness Studies


In Chapters 13 and 14 we emphasize the importance of using designed experiments for product and process improvement. Today, many engineers and scientists are exposed to the principles of statistically designed experiments as part of their formal technical education. However, during the 1960-1980 time period, the principles of experimental design (and statistical methods, in general) were not as widely used as they are today.
In the early 1980s, Genichi Taguchi, a Japanese engineer, introduced his approach to using
experimental design for

1. Designing products or processes so that they are robust to environmental conditions.
2. Designing/developing products so that they are robust to component variation.
3. Minimizing variation around a target value.

Taguchi called this the robust parameter design problem. In Chapter 14 we extend the idea somewhat to
include not only robust product design but process robustness studies.

Taguchi defined meaningful engineering problems and the philosophy that he recommended is sound.
However, he advocated some novel methods of statistical data analysis and some approaches to the design of
experiments that the process of peer review revealed were unnecessarily complicated, inefficient, and
sometimes ineffective. In this section, we will briefly overview Taguchi's philosophy regarding quality
engineering and experimental design. We will present some examples of his approach to robust parameter
design, and we will use these examples to highlight the problems with his technical methods. It is possible to
combine his sound engineering concepts with more efficient and effective experimental design and analysis
based on response surface methods, as we did in the process robustness studies examples in Chapter 14.

The Taguchi Philosophy


Taguchi advocates a philosophy of quality engineering that is broadly applicable. He considers three stages
in product (or process) development: system design, parameter design, and tolerance design. In system design,
the engineer uses scientific and engineering principles to determine the basic system configuration. For
example, if we wish to measure an unknown resistance, we may use our knowledge of electrical circuits to
determine that the basic system should be configured as a Wheatstone bridge. If we are designing a process
to assemble printed circuit boards, we will determine the need for specific types of axial insertion machines,
surface-mount placement machines, flow solder machines, and so forth.
In the parameter design stage, the specific values for the system parameters are determined. This would
involve choosing the nominal resistor and power supply values for the Wheatstone bridge, the number and
type of component placement machines for the printed circuit board assembly process, and so forth. Usually,
the objective is to specify these nominal parameter values such that the variability transmitted from
uncontrollable or noise variables is minimized.
Tolerance design is used to determine the best tolerances for the parameters. For example, in the Wheatstone
bridge, tolerance design methods would reveal which components in the design were most sensitive and where
the tolerances should be set. If a component does not have much effect on the performance of the circuit, it
can be specified with a wide tolerance.
Taguchi recommends that statistical experimental design methods be employed to assist in this process,
particularly during parameter design and tolerance design. We will focus on parameter design. Experimental
design methods can be used to find a best product or process design, where by "best" we mean a product or
process that is robust or insensitive to uncontrollable factors that will influence the product or process once it
is in routine operation.
The notion of robust design is not new. Engineers have always tried to design products so that they will work
well under uncontrollable conditions. For example, commercial transport aircraft fly about as well in a
thunderstorm as they do in clear air. Taguchi deserves recognition for realizing that experimental design can
be used as a formal part of the engineering design process to help accomplish this objective.


A key component of Taguchi's philosophy is the reduction of variability. Generally, each product or process
performance characteristic will have a target or nominal value. The objective is to reduce the variability
around this target value. Taguchi models the departures that may occur from this target value with a loss
function. The loss refers to the cost that is incurred by society when the consumer uses a product whose
quality characteristics differ from the nominal. The concept of societal loss is a departure from traditional
thinking. Taguchi imposes a quadratic loss function of the form
L(y) = k (y - T)2
shown in Figure S14.1 below. Clearly this type of function will penalize even small departures of y from the
target T. Again, this is a departure from traditional thinking, which usually attaches penalties only to cases
where y is outside of the upper and lower specifications (say y > USL or y < LSL in Figure S14.1). However,
the Taguchi philosophy regarding reduction of variability and the emphasis on minimizing costs is entirely
consistent with the continuous improvement philosophy of Deming and Juran.
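One consequence of the quadratic loss function is worth writing out, since it connects the loss directly to the two quantities Taguchi asks us to manage, the process mean and the variance. Taking the expectation of L(y) over the distribution of y (using only the loss function above and the usual definitions of the mean and variance) gives

$$E[L(y)] = k\,E(y - T)^2 = k\,E\left[(y - \mu) + (\mu - T)\right]^2 = k\left[\sigma^2 + (\mu - T)^2\right]$$

so the expected loss is reduced both by moving the mean toward the target and by shrinking the variance, which is exactly the emphasis of the three ideas summarized below.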
In summary, Taguchi's philosophy involves three central ideas:
1. Products and processes should be designed so that they are robust to external sources of variability.
2. Experimental design methods are an engineering tool to help accomplish this objective.
3. Operation on-target is more important than conformance to specifications.

Figure S14.1. Taguchi's Quadratic Loss Function

These are sound concepts, and their value should be readily apparent. Furthermore, as we have seen in the
textbook, experimental design methods can play a major role in translating these ideas into practice.
We now turn to a discussion of the specific methods that Taguchi recommends for applying his concepts in
practice. As we will see, his approach to experimental design and data analysis can be improved.
Taguchi's Technical Methods

An Example
We will use a connector pull-off force example to illustrate Taguchi's technical methods. For more
information about the problem, refer to the original article in Quality Progress in December 1987 (see "The
Taguchi Approach to Parameter Design," by D. M. Byrne and S. Taguchi, Quality Progress, December 1987,
pp. 19-26). The experiment involves finding a method to assemble an elastomeric connector to a nylon tube
that would deliver the required pull-off performance to be suitable for use in an automotive engine application.
The specific objective of the experiment is to maximize the pull-off force. Four controllable and three
uncontrollable noise factors were identified. These factors are shown in Table S14.1 below. We want to find
the levels of the controllable factors that are the least influenced by the noise factors and that provides the
maximum pull-off force. Notice that although the noise factors are not controllable during routine operations,
they can be controlled for the purposes of a test. Each controllable factor is tested at three levels, and each
noise factor is tested at two levels.


In the Taguchi parameter design methodology, one experimental design is selected for the controllable factors
and another experimental design is selected for the noise factors. These designs are shown in Table S14.2.
Taguchi refers to these designs as orthogonal arrays, and represents the factor levels with integers 1, 2, and
3. In this case the designs selected are just a standard $2^3$ and a $3^{4-2}$ fractional factorial. Taguchi calls these the
L8 and L9 orthogonal arrays, respectively.
The two designs are combined as shown in Table S14.3 below. This is called a crossed or product array
design, composed of the inner array containing the controllable factors, and the outer array containing the
noise factors. Literally, each of the 9 runs from the inner array is tested across the 8 runs from the outer array,
for a total sample size of 72 runs. The observed pull-off force is reported in Table S14.3.

Table S14.1. Factors and Levels for the Taguchi Parameter Design Example

Controllable Factors                              Levels
A = Interference                          Low       Medium     High
B = Connector wall thickness              Thin      Medium     Thick
C = Insertion depth                       Shallow   Medium     Deep
D = Percent adhesive in connector pre-dip Low       Medium     High

Uncontrollable Factors                            Levels
E = Conditioning time                     24 h                 120 h
F = Conditioning temperature              72°F                 150°F
G = Conditioning relative humidity        25%                  75%

Table S14.2. Designs for the Controllable and Uncontrollable Factors

(a) L9 Orthogonal Array for the Controllable Factors

Run   A  B  C  D
 1    1  1  1  1
 2    1  2  2  2
 3    1  3  3  3
 4    2  1  2  3
 5    2  2  3  1
 6    2  3  1  2
 7    3  1  3  2
 8    3  2  1  3
 9    3  3  2  1

(b) L8 Orthogonal Array for the Uncontrollable Factors

Run   E  F  ExF  G  ExG  FxG  e
 1    1  1   1   1   1    1   1
 2    1  1   1   2   2    2   2
 3    1  2   2   1   1    2   2
 4    1  2   2   2   2    1   1
 5    2  1   2   1   2    1   2
 6    2  1   2   2   1    2   1
 7    2  2   1   1   2    2   1
 8    2  2   1   2   1    1   2

Table S14.3. Parameter Design with Both Inner and Outer Arrays

Inner Array (L9)    Outer Array (L8) responses (outer-array runs 1-8)                 ȳ        SNL
Run  A  B  C  D
 1   1  1  1  1    15.6   9.5  16.9  19.9  19.6  19.6  20.0  19.1    17.525   24.025
 2   1  2  2  2    15.0  16.2  19.4  19.2  19.7  19.8  24.2  21.9    19.475   25.522
 3   1  3  3  3    16.3  16.7  19.1  15.6  22.6  18.2  23.3  20.4    19.025   25.335
 4   2  1  2  3    18.3  17.4  18.9  18.6  21.0  18.9  23.2  24.7    20.125   25.904
 5   2  2  3  1    19.7  18.6  19.4  25.1  25.6  21.4  27.5  25.3    22.825   26.908
 6   2  3  1  2    16.2  16.3  20.0  19.8  14.7  19.6  22.5  24.7    19.225   25.326
 7   3  1  3  2    16.4  19.1  18.4  23.6  16.8  18.6  24.3  21.6    19.800   25.711
 8   3  2  1  3    14.2  15.6  15.1  16.8  17.8  19.6  23.2  24.2    18.338   24.852
 9   3  3  2  1    16.1  19.9  19.3  17.3  23.1  22.7  22.6  28.6    21.200   26.152

Data Analysis and Conclusions


The data from this experiment may now be analyzed. Taguchi recommends analyzing the mean response for
each run in the inner array (see Table S14.3), and he also suggests analyzing variation using an appropriately
chosen signal-to-noise ratio (SN). These signal-to-noise ratios are derived from the quadratic loss function,
and three of them are considered to be "standard" and widely applicable. They are defined as follows:

1. Nominal the best:

$$SN_T = 10\log\left(\frac{\bar{y}^2}{S^2}\right)$$

2. Larger the better:

$$SN_L = -10\log\left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{y_i^2}\right)$$

3. Smaller the better:

$$SN_S = -10\log\left(\frac{1}{n}\sum_{i=1}^{n}y_i^2\right)$$
Notice that these SN ratios are expressed on a decibel scale. We would use SNT if the objective is to reduce
variability around a specific target, SNL if the system is optimized when the response is as large as possible,


and SNS if the system is optimized when the response is as small as possible. Factor levels that maximize the
appropriate SN ratio are optimal.
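A short sketch of these computations in Python follows; the helper function names are ours, not standard library routines. Applying the larger-the-better ratio to the first inner-array run of Table S14.3 should reproduce the tabled values of roughly 17.525 and 24.025, up to rounding.

```python
import numpy as np

def sn_nominal_the_best(y):
    y = np.asarray(y, dtype=float)
    return 10.0 * np.log10(y.mean() ** 2 / y.var(ddof=1))   # SN_T

def sn_larger_the_better(y):
    y = np.asarray(y, dtype=float)
    return -10.0 * np.log10(np.mean(1.0 / y ** 2))           # SN_L

def sn_smaller_the_better(y):
    y = np.asarray(y, dtype=float)
    return -10.0 * np.log10(np.mean(y ** 2))                  # SN_S

run1 = [15.6, 9.5, 16.9, 19.9, 19.6, 19.6, 20.0, 19.1]        # first row of Table S14.3
print(round(np.mean(run1), 3), round(sn_larger_the_better(run1), 3))
```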
In this problem, we would use SNL because the objective is to maximize the pull-off force. The last two columns of Table S14.3 contain $\bar{y}$ and SNL values for each of the nine inner-array runs. Taguchi-oriented

practitioners often use the analysis of variance to determine the factors that influence y and the factors that
influence the signal-to-noise ratio. They also employ graphs of the "marginal means" of each factor, such as
the ones shown in Figures S14.2 and S14.3. The usual approach is to examine the graphs and "pick the winner."
In this case, factors A and C have larger effects than do B and D. In terms of maximizing SNL we would select
AMedium, CDeep, BMedium, and DLow. In terms of maximizing the average pull-off force $\bar{y}$, we would choose AMedium, CMedium, BMedium and DLow. Notice that there is almost no difference between CMedium and CDeep. The implication is that this choice of levels will maximize the mean pull-off force and reduce variability in the pull-off force.
Taguchi advocates claim that the use of the SN ratio generally eliminates the need for examining specific
interactions between the controllable and noise factors, although sometimes looking at these interactions
improves process understanding. The authors of this study found that the AG and DE interactions were large.
Analysis of these interactions, shown in Figure S14.4, suggests that AMedium is best. (It gives the highest pull-off force and a slope close to zero, indicating that if we choose AMedium the effect of relative humidity is minimized.) The analysis also suggests that DLow gives the highest pull-off force regardless of the conditioning
time.

Figure S14.2. The Effects of Controllable Factors on Each Response

Figure S14.3. The Effects of Controllable Factors on the Signal to Noise Ratio

When cost and other factors were taken into account, the experimenters in this example finally decided to use
AMedium, BThin, CMedium, and DLow. (BThin was much less expensive than BMedium, and CMedium was felt to give
slightly less variability than CDeep.) Since this combination was not a run in the original nine inner array trials,
five additional tests were made at this set of conditions as a confirmation experiment. For this confirmation
experiment, the levels used on the noise variables were ELow, FLow, and GLow. The authors report that good
results were obtained from the confirmation test.

Critique of Taguchi's Experimental Strategy and Designs


The advocates of Taguchi's approach to parameter design utilize the orthogonal array designs, two of which
(the L8 and the L9) were presented in the foregoing example. There are other orthogonal arrays: the L4, L12,
L16, L18, and L27. These designs were not developed by Taguchi; for example, the L8 is a $2_{III}^{7-4}$ fractional factorial, the L9 is a $3_{III}^{4-2}$ fractional factorial, the L12 is a Plackett-Burman design, the L16 is a $2_{III}^{15-11}$ fractional factorial, and so on. Box, Bisgaard, and Fung (1988) trace the origin of these designs. Some of these designs have very complex alias structures. In particular, the L12 and all of the designs that use three-level factors will involve partial aliasing of two-factor interactions with main effects. If any two-factor interactions are large, this may lead to a situation in which the experimenter does not get the correct answer. For more details on aliasing in these types of designs, see Montgomery (2005).

Figure S14.4. The AG and DE Interactions
Taguchi argues that we do not need to consider two-factor interactions explicitly. He claims that it is possible
to eliminate these interactions either by correctly specifying the response and design factors or by using a
sliding setting approach to choose factor levels. As an example of the latter approach, consider the two
factors pressure and temperature. Varying these factors independently will probably produce an interaction.
However, if temperature levels are chosen contingent on the pressure levels, then the interaction effect can be
minimized. In practice, these two approaches are usually difficult to implement unless we have an unusually
high level of process knowledge. The lack of provision for adequately dealing with potential interactions
between the controllable process factors is a major weakness of the Taguchi approach to parameter design.
Instead of designing the experiment to investigate potential interactions, Taguchi prefers to use three-level
factors to estimate curvature. For example, in the inner and outer array design used by Byrne and Taguchi, all
four controllable factors were run at three levels. Let x1, x2, x3 and x4 represent the controllable factors and let
z1, z2, and z3 represent the three noise factors. Recall that the noise factors were run at two levels in a complete factorial design. The design they used allows us to fit the following model:

$$y = \beta_0 + \sum_{j=1}^{4}\beta_j x_j + \sum_{j=1}^{4}\beta_{jj} x_j^2 + \sum_{j=1}^{3}\gamma_j z_j + \sum_{i<j}\gamma_{ij} z_i z_j + \sum_{i=1}^{3}\sum_{j=1}^{4}\delta_{ij} z_i x_j + \epsilon$$

Notice that we can fit the linear and quadratic effects of the controllable factors but not their two-factor
interactions (which are aliased with the main effects). We can also fit the linear effects of the noise factors
and all the two-factor interactions involving the noise factors. Finally, we can fit the two-factor interactions
involving the controllable factors and the noise factors. It may be unwise to ignore potential interactions in
the controllable factors.
This is a rather odd strategy, since interaction is a form of curvature. A much safer strategy is to identify
potential effects and interactions that may be important and then consider curvature only in the important
variables if there is evidence that the curvature is important. This will usually lead to fewer experiments,
simpler interpretation of the data, and better overall process understanding.
Another criticism of the Taguchi approach to parameter design is that the crossed array structure usually leads
to a very large experiment. For example, in the foregoing application, the authors used 72 tests to investigate
only seven factors, and they still could not estimate any of the two-factor interactions among the four
controllable factors.
There are several alternative experimental designs that would be superior to the inner and outer method used
in this example. Suppose that we run all seven factors at two levels in the combined array design approach discussed in the textbook. Consider the $2_{IV}^{7-2}$ fractional factorial design. The alias relationships for this design
are shown in the top half of Table S14.4. Notice that this design requires only 32 runs (as compared to 72).
In the bottom half of Table S14.4, two different possible schemes for assigning process controllable variables
and noise variables to the letters A through G are given. The first assignment scheme allows all the interactions
between controllable factors and noise factors to be estimated, and it allows main effect estimates to be made
that are clear of two-factor interactions. The second assignment scheme allows all the controllable factor main
effects and their two-factor interactions to be estimated; it allows all noise factor main effects to be estimated
clear of two-factor interactions; and it aliases only three interactions between controllable factors and noise
factors with a two-factor interaction between two noise factors. Both of these arrangements present much
cleaner alias relationships than are obtained from the inner and outer array parameter design, which also
required over twice as many runs.
In general, the crossed array approach is often unnecessary. A better strategy is to use the combined array
design discussed in the textbook. This approach will almost always lead to a dramatic reduction in the size of
the experiment, and at the same time, it will produce information that is more likely to improve process
understanding. For more discussion of this approach, see Myers and Montgomery (2002). We can also use a
combined array design that allows the experimenter to directly model the noise factors as a complete quadratic
and to fit all interactions between the controllable factors and the noise factors, as demonstrated in Chapter 14
of the textbook.
Another possible issue with the Taguchi inner and outer array design relates to the order in which the runs are
performed. Now we know that for experimental validity, the runs in a designed experiment should be
conducted in random order. However, in many crossed array experiments, it is possible that the run order
wasn't randomized. In some cases it would be more convenient to fix each row in the inner array (that is, set the levels of the controllable factors) and run all outer-array trials. In other cases, it might be more convenient to fix each column in the outer array and then run each of the inner-array trials at that combination of noise factors.

Table S14.4. An Alternative Parameter Design

A one-quarter fraction of 7 factors in 32 runs. Resolution IV.
I = ABCDF = ABDEG = CEFG.

Aliases:
A                  AF = BCD            CG = EF
B                  AG = BDE            DE = ABG
C = EFG            BC = ADF            DF = ABC
D                  BD = ACF = AEG      DG = ABE
E = CFG            BE = ADG            ACE = AFG
F = CEG            BF = ACD            ACG = AEF
G = CEF            BG = ADE            BCE = BFG
AB = CDF = DEG     CD = ABF            BCG = BEF
AC = BDF           CE = FG             CDE = DFG
AD = BCF = BEG     CF = ABD = EG       CDG = DEF
AE = BDG

Factor Assignment Schemes:

1. Controllable factors are assigned to the letters C, E, F, and G. Noise factors are assigned to the letters A, B, and D. All interactions between controllable factors and noise factors can be estimated, and all controllable factor main effects can be estimated clear of two-factor interactions.

2. Controllable factors are assigned to the letters A, B, C, and D. Noise factors are assigned to the letters E, F, and G. All controllable factor main effects and two-factor interactions can be estimated; only the CE, CF, and CG interactions are aliased with interactions of the noise factors.
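For readers who want to see the combined array explicitly, the following sketch generates the 32-run design from the generators implied by the defining relation quoted in Table S14.4 (F = ABCD and G = ABDE, so that I = ABCDF = ABDEG and, by multiplication, CEFG).

```python
import itertools
import numpy as np

# Full 2^5 in the base factors A, B, C, D, E, then the two generated columns
base = np.array(list(itertools.product([-1, 1], repeat=5)))
A, B, C, D, E = base.T
F = A * B * C * D                  # generator F = ABCD
G = A * B * D * E                  # generator G = ABDE
design = np.column_stack([A, B, C, D, E, F, G])

print(design.shape)                # (32, 7)
print(np.all(C * E * F * G == 1))  # True: CEFG is the generalized interaction in the defining relation
```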

Exactly which strategy is pursued probably depends on which group of factors is easiest to change, the
controllable factors or the noise factors. If the tests are run in either manner described above, then a split-plot
structure has been introduced into the experiment. If this is not accounted for in the analysis, then the results
and conclusions can be misleading. There is no evidence that Taguchi advocates used split-plot analysis
methods. Furthermore, since Taguchi frequently downplayed the importance of randomization, it is highly
likely that many actual inner and outer array experiments were inadvertently conducted as split-plots, and
perhaps incorrectly analyzed. Montgomery (2005) discusses split-plot designs and their analysis. Box and
Jones give a good discussion of split-plot designs in process robustness studies.
A final aspect of Taguchi's parameter design is the use of linear graphs to assign factors to the columns of the
orthogonal array. A set of linear graphs for the L8 design is shown in Figure S14.5. In these graphs, each
number represents a column in the design. A line segment on the graph corresponds to an interaction between
the nodes it connects. To assign variables to columns in an orthogonal array, assign the variables to nodes
first; then when the nodes are used up, assign the variables to the line segments. When you assign variables
to the nodes, strike out any line segments that correspond to interactions that might be important. The linear
graphs in Figure S14.5 imply that column 3 in the L8 design contains the interaction between columns 1 and 2,
column 5 contains the interaction between columns 1 and 4, and so forth. If we had four factors, we would
assign them to columns 1, 2, 4, and 7. This would ensure that each main effect is clear of two-factor
interactions. What is not clear is the two-factor interaction aliasing. If the main effects are in columns 1, 2, 4,
and 7, then column 3 contains the 1-2 and the 4-7 interaction, column 5 contains the 1-4 and the 2-7 interaction,
and column 6 contains the 1-7 and the 2-4 interaction. This is clearly the case because four variables in eight
runs is a resolution IV plan with all pairs of two-factor interactions aliased. In order to understand fully the
two-factor interaction aliasing, Taguchi would refer the experiment designer to a supplementary interaction
table.

Taguchi (1986) gives a collection of linear graphs for each of his recommended orthogonal array
designs. These linear graphs seem to have been developed heuristically. Unfortunately, their use
can lead to inefficient designs. For examples, see his car engine experiment [Taguchi and Wu (1980)]
and his cutting tool experiment [Taguchi (1986)]. Both of these are 16-run designs that he sets up as
resolution III designs in which main effects are aliased with two-factor interactions. Conventional
methods for constructing these designs would have resulted in resolution IV plans in which the main
effects are clear of the two-factor interactions. For the experimenter who simply wants to generate a
good design, the linear graph approach may not produce the best result. A better approach is to use
a table that presents the design and its full alias structure. These tables are easy to construct and are
routinely displayed by several widely available and inexpensive computer programs.

Figure S14.5. Linear Graphs for the L8 Design

Critique of Taguchi's Data Analysis Methods


Several of Taguchi's data analysis methods are questionable. For example, he recommends some variations
of the analysis of variance that are known to produce spurious results, and he also proposes some unique
methods for the analysis of attribute and life testing data. For a discussion and critique of these methods, refer
to Myers and Montgomery (2002) and the references contained therein. In this section we focus on three
aspects of his recommendations concerning data analysis: the use of "marginal means" plots to optimize factor
settings, the use of signal-to-noise ratios, and some of his uses of the analysis of variance.
Consider the use of "marginal means" plots and the associated "pick the winner" optimization that was
demonstrated previously in the pull-off force problem. To keep the situation simple, suppose that we have two
factors A and B, each at three levels, as shown in Table S14.5. The "marginal means" plots are shown in Figure
S14.6. From looking at these graphs, we would select A3 and B1, as the optimum combination, assuming that
we wish to maximize y. However, this is the wrong answer. Direct inspection of Table S14.5 or the AB
interaction plot in Figure S14.7 shows that the combination of A3 and B2 produces the maximum value of y.
In general, playing "pick the winner" with marginal averages can never be guaranteed to produce the optimum.
The Taguchi advocates recommend that a confirmation experiment be run, although this offers no guarantees
either. We might be confirming a response that differs dramatically from the optimum. The best way to find
a set of optimum conditions is with the use of response surface methods, as discussed and illustrated in Chapter
14 of the textbook.
Taguchi's signal-to-noise ratios are his recommended performance measures in a wide variety of situations.
By maximizing the appropriate SN ratio, he claims that variability is minimized.

Table S14.5. Data for the "Marginal Means" Plots in Figure S14.6

(A 3 x 3 table of responses for the three levels of factor A crossed with the three levels of factor B, with the A-level and B-level averages shown in the margins. The marginal averages are largest for A3 and B1, but the largest individual response, 14, occurs at the A3, B2 combination.)

Figure S14.6. Marginal Means Plots for the Data in Table S14.5

Figure S14.7. The AB Interaction Plot for the Data in Table S14.5.

Consider first the signal-to-noise ratio for the target-is-best case,

$$SN_T = 10\log\left(\frac{\bar{y}^2}{S^2}\right)$$

This ratio would be used if we wish to minimize variability around a fixed target value. It has been suggested by Taguchi that it is preferable to work with SNT instead of the standard deviation because in many cases the process mean and standard deviation are related. (As $\mu$ gets larger, $\sigma$ gets larger, for example.) In such cases,

he argues that we cannot directly minimize the standard deviation and then bring the mean on target. Taguchi
claims he found empirically that the use of the SNT ratio coupled with a two-stage optimization procedure
would lead to a combination of factor levels where the standard deviation is minimized and the mean is on
target. The optimization procedure consists of (1) finding the set of controllable factors that affect SNT, called
the control factors, and setting them to levels that maximize SNT and then (2) finding the set of factors that
have significant effects on the mean but do not influence the SNT ratio, called the signal factors, and using
these factors to bring the mean on target.
Given that this partitioning of factors is possible, SNT is an example of a performance measure independent
of adjustment (PERMIA) [see Leon et al. (1987)]. The signal factors would be the adjustment factors. The
motivation behind the signal-to-noise ratio is to uncouple location and dispersion effects. It can be shown that
the use of SNT is equivalent to an analysis of the standard deviation of the logarithm of the original data. Thus,
using SNT implies that a log transformation will always uncouple location and dispersion effects. There is no
assurance that this will happen. A much safer approach is to investigate what type of transformation is
appropriate.
Note that we can write the SNT ratio as

$$SN_T = 10\log\left(\frac{\bar{y}^2}{S^2}\right) = 10\log(\bar{y}^2) - 10\log(S^2)$$
If the mean is fixed at a target value (estimated by $\bar{y}$), then maximizing the SNT ratio is equivalent to
minimizing log (S2). Using log (S2) would require fewer calculations, is more intuitively appealing, and would
provide a clearer understanding of the factor relationships that influence process variability - in other words,
it would provide better process understanding. Furthermore, if we minimize log (S2) directly, we eliminate the
risk of obtaining wrong answers from the maximization of SNT if some of the manipulated factors drive the
mean y upward instead of driving S2 downward. In general, if the response variable can be expressed in terms
of the model
y ( x d , x a ) ( x d )

where xd is the subset of factors that drive the dispersion effects and xa is the subset of adjustment factors that
do not affect variability, then maximizing SNT will be equivalent to minimizing the standard deviation.
Considering the other potential problems surrounding SNT , it is likely to be safer to work directly with the
standard deviation (or its logarithm) as a response variable, as suggested in the textbook. For more discussion,
refer to Myers and Montgomery (2002).
The ratios SNL and SNS are even more troublesome. These quantities may be completely ineffective in
identifying dispersion effects, although they may serve to identify location effects, that is, factors that drive
the mean. The reason for this is relatively easy to see. Consider the SNS (smaller-the-better) ratio:

$$SN_S = -10\log\left(\frac{1}{n}\sum_{i=1}^{n}y_i^2\right)$$

The ratio is motivated by the assumption of a quadratic loss function with y nonnegative. The loss function for such a case would be

$$L = C\left(\frac{1}{n}\sum_{i=1}^{n}y_i^2\right)$$


where C is a constant. Now

$$\log L = \log C + \log\left(\frac{1}{n}\sum_{i=1}^{n}y_i^2\right)$$

and

$$SN_S = 10\log C - 10\log L$$

so maximizing SNS will minimize L. However, it is easy to show that

$$\frac{1}{n}\sum_{i=1}^{n}y_i^2 = \bar{y}^2 + \frac{1}{n}\left(\sum_{i=1}^{n}y_i^2 - n\bar{y}^2\right) = \bar{y}^2 + \frac{n-1}{n}S^2$$

Therefore, the use of SNS as a response variable confounds location and dispersion effects.
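A two-line numerical example makes the confounding easy to see. The samples below are hypothetical and were chosen only so that both have the same mean square, and therefore the same SN_S, even though one has zero variance and the other is highly variable.

```python
import numpy as np

def sn_smaller_the_better(y):
    y = np.asarray(y, dtype=float)
    return -10.0 * np.log10(np.mean(y ** 2))

a = np.array([10.0, 10.0, 10.0, 10.0])            # mean 10, variance 0
b = np.array([2.0, 6.0, 10.0, np.sqrt(260.0)])    # chosen so that mean(y^2) = 100 as well

for name, y in [("a", a), ("b", b)]:
    print(name, round(sn_smaller_the_better(y), 2), round(y.mean(), 2), round(y.var(ddof=1), 2))
# Both samples give SN_S = -20 dB, so the ratio cannot separate location from dispersion.
```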
The confounding of location and dispersion effects was observed in the analysis of the SNL ratio in the pull-off force example. In Figures S14.2 and S14.3 notice that the plots of $\bar{y}$ and SNL versus each factor have approximately the same shape, implying that both responses measure location. Furthermore, since the SNS and SNL ratios involve $y^2$ and $1/y^2$, they will be very sensitive to outliers or values near zero, and they are not invariant to linear transformation of the original response. We strongly recommend that these signal-to-noise ratios not be used.
A better approach for isolating location and dispersion effects is to develop separate response surface models
for y and log(S2). If no replication is available to estimate variability at each run in the design, methods for
analyzing residuals can be used. Another very effective approach is based on the use of the response model,
as demonstrated in the textbook and in Myers and Montgomery (2002). Recall that this allows both a response
surface for the variance and a response surface for the mean to be obtained for a single model containing both
the controllable design factors and the noise variables. Then standard response surface methods can be used
to optimize the mean and variance.
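A minimal sketch of that strategy, again in Python with statsmodels and entirely hypothetical crossed-array data: the per-run mean and log sample variance are computed from the outer-array replicates and then modeled separately as functions of the controllable factors.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical inner-array runs (coded controllable factors x1, x2) with the
# outer-array (noise) replicates for each run stored as a list of responses.
runs = pd.DataFrame({
    "x1": [-1, -1, 1, 1],
    "x2": [-1, 1, -1, 1],
    "y": [[15.2, 16.1, 14.8], [19.5, 20.3, 18.9],
          [17.1, 16.4, 17.8], [24.2, 21.7, 26.0]],
})
runs["ybar"] = runs["y"].apply(np.mean)
runs["lnS2"] = runs["y"].apply(lambda v: np.log(np.var(v, ddof=1)))

location_model = smf.ols("ybar ~ x1 + x2", data=runs).fit()     # model for the mean
dispersion_model = smf.ols("lnS2 ~ x1 + x2", data=runs).fit()   # model for log variance
print(location_model.params)
print(dispersion_model.params)
```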
Finally, we turn to some of the applications of the analysis of variance recommended by Taguchi. As an
example for discussion, consider the experiment reported by Quinlan (1985) at a symposium on Taguchi
methods sponsored by the American Supplier Institute. The experiment concerned the quality improvement
of speedometer cables. Specifically, the objective was to reduce the shrinkage in the plastic casing material.
(Excessive shrinkage causes the cables to be noisy.) The experiment used an L16 orthogonal array (the $2_{III}^{15-11}$ design). The shrinkage values for four samples taken from 3000-foot lengths of the product manufactured at each set of test conditions were measured and the responses $\bar{y}$ and SNS computed.
Quinlan, following the Taguchi approach to data analysis, used SNS as the response variable in an analysis of
variance. The error mean square was formed by pooling the mean squares associated with the seven effects
that had the smallest absolute magnitude. This resulted in all eight remaining factors having significant effects
(in order of magnitude: E, G, K, A, C, F, D, H). The author did note that E and G were the most important.

Pooling of mean squares as in this example is a procedure that has long been known to produce considerable bias in the ANOVA test results. To illustrate the problem, consider the 15 NID(0, 1) random numbers shown in column 1 of Table S14.6. The square of each of these numbers, shown in column 2 of the table, is a single-degree-of-freedom mean square corresponding to the observed random number. The seven smallest random numbers are marked with an asterisk in column 1 of Table S14.6. The corresponding mean squares are pooled to form a mean square for error with seven degrees of freedom. This quantity is

$$MS_E = \frac{0.5088}{7} = 0.0727$$

Finally, column 3 of Table S14.6 presents the F ratio formed by dividing each of the eight remaining mean
squares by MSE. Now F0.05,1,7 = 5.59, and this implies that five of the eight effects would be judged significant
at the 0.05 level. Recall that since the original data came from a normal distribution with mean zero, none of
the effects is different from zero.
Analysis methods such as this virtually guarantee erroneous conclusions. The normal probability plotting of
effects avoids this invalid pooling of mean squares and provides a simple, easy to interpret method of analysis.
Box (1988) provides an alternate analysis of the Quinlan data that correctly reveals E and G to be important along with other interesting results not apparent in the original analysis.

Table S14.6. Pooling of Mean Squares

NID(0,1) Random Numbers   Mean Squares with One Degree of Freedom      F0
-0.8607                   0.7408                                    10.19
-0.8820                   0.7779                                    10.70
 0.3608*                  0.1302
 0.0227*                  0.0005
 0.1903*                  0.0362
-0.3071*                  0.0943
 1.2075                   1.4581                                    20.06
 0.5641                   0.3182                                     4.38
-0.3936*                  0.1549
-0.6940                   0.4816                                     6.63
-0.3028*                  0.0917
 0.5832                   0.3401                                     4.68
 0.0324*                  0.0010
 1.0202                   1.0408                                    14.32
-0.6347                   0.4028                                     5.54
It is important to note that the Taguchi analysis identified negligible factors as significant. This can have a profound impact on our use of experimental design to enhance process knowledge. Experimental design methods should make gaining process knowledge easier, not harder.
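The bias introduced by this kind of pooling is easy to see in a small simulation. The sketch below (Python with NumPy and SciPy; the simulation itself is ours, and only the 15 effects, the pooling of the seven smallest mean squares, and the 0.05 significance level are taken from the example above) repeats the Table S13.6 exercise many times under the null hypothesis that every effect is zero and reports how often at least one of the eight retained effects is nevertheless declared significant.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
n_effects, n_pooled, alpha = 15, 7, 0.05
f_crit = f.ppf(1 - alpha, 1, n_pooled)        # F_{0.05,1,7}, about 5.59

n_sim, false_alarms = 10_000, 0
for _ in range(n_sim):
    effects = rng.standard_normal(n_effects)  # 15 NID(0,1) "effect estimates"
    ms = np.sort(effects**2)                  # single-d.f. mean squares
    ms_error = ms[:n_pooled].mean()           # pool the 7 smallest
    f_ratios = ms[n_pooled:] / ms_error       # test the 8 largest against MS_E
    false_alarms += np.any(f_ratios > f_crit)

print(f"At least one spurious significant effect in "
      f"{false_alarms / n_sim:.0%} of simulated null experiments")
```

In runs of this sketch the proportion is far above the nominal 5% false alarm rate, which is exactly the point: pooling the smallest mean squares deflates the error estimate and inflates every F ratio.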

Some Final Remarks


In this section we have directed some major criticisms toward the specific methods of experimental design and
data analysis used in the Taguchi approach to parameter design. Remember that these comments have focused
on technical issues, and that the broad philosophy recommended by Taguchi is inherently sound.

On the other hand, while the Taguchi controversy was in full bloom, many companies reported success with
the use of Taguchi's parameter design methods. If the methods are flawed, why do they produce successful
results? Taguchi advocates often refute criticism with the remark that "they work." We must remember that
the "best guess" and "one-factor-at-a-time" methods will also work, and occasionally they produce good results.
This is no reason to claim that they are good methods. Most of the successful applications of Taguchi's
technical methods have been in industries where there was no history of good experimental design practice.
Designers and developers were using the best guess and one-factor-at-a-time methods (or other unstructured
approaches), and since the Taguchi approach is based on the factorial design concept, it often produced better
results than the methods it replaced. In other words, the factorial design is so powerful that, even when it is
used inefficiently, it will often work well.
As pointed out earlier, the Taguchi approach to parameter design often leads to large, comprehensive experiments, sometimes with 70 or more runs. Many of the successful applications of this approach were in
industries characterized by a high-volume, low-cost manufacturing environment. In such situations, large
designs may not be a real problem, if it is really no more difficult to make 72 runs than to make 16 or 32 runs.
On the other hand, in industries characterized by low-volume and/or high-cost manufacturing (such as the
aerospace industry, chemical and process industries, electronics and semiconductor manufacturing, and so
forth), these methodological inefficiencies can be significant.
A final point concerns the learning process. If the Taguchi approach to parameter design works and yields
good results, we may still not know what has caused the result because of the aliasing of critical interactions.
In other words, we may have solved a problem (a short-term success), but we may not have gained process
knowledge, which could be invaluable in future problems.
In summary, we should support Taguchi's philosophy of quality engineering. However, we must rely on
simpler, more efficient methods that are easier to learn and apply to carry this philosophy into practice. The
response surface modeling framework that we present in Chapter 14 of the textbook is an ideal approach to
process optimization and as we have demonstrated, it is fully adaptable to the robust parameter design problem
and to process robustness studies.


Supplemental Material for Chapter 15


S15-1. A Lot Sensitive Compliance (LTPD) Sampling Plan
In this section we briefly describe a sampling plan that is particularly useful in compliance testing, or
sampling for product safety related purposes. The procedure is also very useful in any situation where
minimum sample sizes are required. Developed by Schilling (1978), the plans provide for rejection
of the lot if any defectives are found in the sample. They are based on a well-defined relationship
between the sampling plan and the size of the lots submitted for inspection. The sampling procedure
gives the proportion of the lot that must be sampled to guarantee that the fraction defective in the lot
is less than a prescribed limit with probability 0.90. That is, for a specified LTPD the probability of
lot acceptance is 0.10. Sample sizes are based on the hypergeometric distribution. Schilling gives a
table to assist in implementing the procedure.
In general, the OC curve of an acceptance-sampling plan with a zero acceptance number has a very undesirable shape at low fractions nonconforming. Since these lot sensitive compliance plans have c = 0, the manufacturer's process must operate at an average level of quality that is less than about 5% of the LTPD in order to have a reasonably small probability of a good lot being rejected. If the process average is closer than that to the LTPD, a plan with c = 0 is not a good choice and a Dodge-Romig LTPD plan would likely be preferable. However, these plans have much larger sample sizes than Schilling's plans.
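To make the underlying calculation concrete, here is a minimal sketch (ours, not Schilling's table; the function name and the choice to round the number of defectives at the LTPD up to D = ⌈LTPD × N⌉ are assumptions made for illustration). It finds the smallest c = 0 sample size for which a lot at the LTPD is accepted with probability at most 0.10, using the hypergeometric distribution.

```python
from math import ceil, comb

def lot_sensitive_n(N, ltpd, beta=0.10):
    """Smallest sample size n for a c = 0 plan such that a lot of size N
    containing D = ceil(ltpd * N) defectives is accepted (zero defectives
    found in the sample) with probability at most beta."""
    D = ceil(ltpd * N)
    for n in range(1, N + 1):
        # Hypergeometric probability of drawing no defectives in the sample;
        # math.comb returns 0 when the sample cannot avoid the defectives.
        p_accept = comb(N - D, n) / comb(N, n)
        if p_accept <= beta:
            return n
    return N

# Example: a lot of 500 with LTPD = 0.05 requires sampling about 43 items,
# roughly 9% of the lot.
print(lot_sensitive_n(500, 0.05))
```

Schilling (1978) tabulates the required fraction of the lot directly, so the table is what would normally be used in practice; the sketch simply shows where those fractions come from.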
S15-2. Consideration of Inspection Error
A basic assumption in the construction of acceptance sampling plans is that the inspection process is
error-free. However, this is often not the case with inspection activities; indeed, many inspection
processes are known to be error-prone. While these errors are usually unintentional, they have the
effect of distorting (sometimes severely) the performance measures of the inspection or acceptance
sampling plan that has been adopted. In complex inspection activities, error rates of 20-30% are not
terribly unusual. In this section we illustrate some of the effects of inspection error on single sampling
plans for attributes.
Two types of errors are possible in attributes inspection. An item that is good can be classified as defective
(this is a type I error), or an item that is defective can be classified as good (this is a type II error). Define

E1 = the event that a good item is classified as a defective
E2 = the event that a defective item is classified as good
A = the event that an item is defective
B = the event that an item is classified as a defective

Then

$$P(B) = P(A)\left[1 - P(E_2)\right] + \left[1 - P(A)\right]P(E_1)$$
By defining the quantities

p = P(A) = the true fraction defective
pe = P(B) = the apparent fraction defective
e1 = P(E1) = the probability that a type I error is made
e2 = P(E2) = the probability that a type II error is made

the expression for the apparent fraction defective can be written as

$$p_e = p(1 - e_2) + (1 - p)e_1$$
The OC curve of a sampling plan plots the probability of lot acceptance versus the lot fraction
defective. When inspection error is present, the probability of lot acceptance is
$$P_{a(e)} = \sum_{d=0}^{c} \binom{n}{d} p_e^d (1 - p_e)^{n-d}$$

Thus the OC curve of a sampling plan when inspection error is present can be easily obtained.
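A short sketch of these two calculations (the function names are ours, chosen for illustration; the binomial model follows the equations above):

```python
from math import comb

def apparent_fraction_defective(p, e1, e2):
    """pe = p(1 - e2) + (1 - p)e1."""
    return p * (1 - e2) + (1 - p) * e1

def prob_accept_with_error(n, c, p, e1, e2):
    """Pa(e): probability of lot acceptance when inspection error is present."""
    pe = apparent_fraction_defective(p, e1, e2)
    return sum(comb(n, d) * pe**d * (1 - pe)**(n - d) for d in range(c + 1))

# Example: the n = 89, c = 2 plan of the example below, at p = 0.01,
# with e1 = 0.01 and e2 = 0.15 (pe = 0.0184)
print(prob_accept_with_error(89, 2, 0.01, 0.01, 0.15))
```

For these values the acceptance probability drops to about 0.77, compared with about 0.94 for error-free inspection, which illustrates how strongly inspection error can distort the OC curve.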
Now consider how inspection error impacts AOQ and ATI. For error-free inspection, the AOQ for
single sampling inspection for attributes is
$$AOQ = \frac{(N - n)\, p\, P_a}{N}$$

If defective items are replaced, and if the inspection of the replacements also involves inspection
error, then the AOQ becomes

$$AOQ = \frac{n p e_2 + p(N - n)(1 - p_e)P_{a(e)} + p(N - n)\left(1 - P_{a(e)}\right)e_2}{N(1 - p_e)}$$

If the defectives discovered are not replaced, then


$$AOQ = \frac{n p e_2 + p(N - n)P_{a(e)} + p(N - n)\left(1 - P_{a(e)}\right)e_2}{N - n p_e - \left(1 - P_{a(e)}\right)(N - n)p_e}$$
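These AOQ expressions translate directly into code. In the sketch below (names are illustrative) the quantities pe and Pa(e) are passed in as arguments; they would be computed with the helpers in the earlier sketch.

```python
def aoq_with_error(N, n, p, pe, pa, e2, replace=True):
    """Average outgoing quality under inspection error.  pe is the apparent
    fraction defective and pa is Pa(e); replace indicates whether items
    classified as defective are replaced."""
    numerator = n * p * e2 + p * (N - n) * (1 - pa) * e2
    if replace:
        numerator += p * (N - n) * (1 - pe) * pa
        denominator = N * (1 - pe)
    else:
        numerator += p * (N - n) * pa
        denominator = N - n * pe - (1 - pa) * (N - n) * pe
    return numerator / denominator

# Illustrative call: the lot size N = 1000 is assumed here; pe and pa
# correspond roughly to the n = 89, c = 2 plan at p = 0.01 with
# e1 = 0.01 and e2 = 0.15.
print(aoq_with_error(N=1000, n=89, p=0.01, pe=0.0184, pa=0.77, e2=0.15))
```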

The ATI assuming error-free inspection and 100% screening of rejected lots is

$$ATI = n + (1 - P_a)(N - n)$$
The corresponding ATI when the discovered defectives are replaced and the replacement process is subject to
inspection error is

$$ATI = \frac{n + \left(1 - P_{a(e)}\right)(N - n)}{1 - p_e}$$

If the discovered defectives are not replaced, then

$$ATI = n + \left(1 - P_{a(e)}\right)(N - n)$$
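The ATI expressions can be sketched in the same way (again with pe and Pa(e) supplied from the earlier helpers; names are illustrative):

```python
def ati_with_error(N, n, pe, pa, replace=True):
    """Average total inspection under inspection error, assuming 100%
    screening of rejected lots.  pe is the apparent fraction defective and
    pa is Pa(e)."""
    ati = n + (1 - pa) * (N - n)
    # When discovered defectives are replaced, the replacements are also
    # inspected, which inflates total inspection by the factor 1/(1 - pe).
    return ati / (1 - pe) if replace else ati
```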
The design of a sampling plan is usually accomplished by choosing a set of parameters for the plan such that
the OC curve passes through two points, generally at the AQL and LTPD levels of quality. When there is
inspection error, the probabilities of acceptance no longer have the values that would have been obtained with
perfect inspection. However, it is possible to modify the design procedure by forcing the actual OC curve to
fit the desired points.
To illustrate this process, suppose that we want the OC curve to match the points (AQL, 1 − α) and (LTPD, β). Then in designing the sampling plan we need to use

$$AQL_e = AQL(1 - e_2) + (1 - AQL)e_1$$

$$LTPD_e = LTPD(1 - e_2) + (1 - LTPD)e_1$$
If the observed OC curve fits these points, then the actual OC curve will fit the desired points.


To illustrate, suppose that AQL = 0.01, 1 − α = 0.95, LTPD = 0.06, and β = 0.10. The single sampling plan designed without any consideration of inspection error has n = 89, c = 2. Now suppose that there is inspection error with e1 = 0.01 and e2 = 0.15. Then we can calculate

$$AQL_e = AQL(1 - e_2) + (1 - AQL)e_1 = 0.01(1 - 0.15) + (1 - 0.01)(0.01) = 0.0184$$

and

$$LTPD_e = LTPD(1 - e_2) + (1 - LTPD)e_1 = 0.06(1 - 0.15) + (1 - 0.06)(0.01) = 0.0604$$

The single sampling plan whose OC curve passes through the points (0.0184, 0.95) and (0.0604, 0.10) has the parameters n = 150, c = 5. Notice that the effect of inspection error is to increase the sample size required to obtain the desired protection at the specified points on the OC curve.
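As a rough check of this plan (a sketch only; the plan above would normally be read from standard two-point design tables or a nomograph, and the binomial model is used here), the OC curve of the n = 150, c = 5 plan can be evaluated at the two error-inflated quality levels:

```python
from math import comb

def oc_point(n, c, p):
    """Binomial probability of lot acceptance at fraction defective p."""
    return sum(comb(n, d) * p**d * (1 - p)**(n - d) for d in range(c + 1))

print(oc_point(150, 5, 0.0184))   # about 0.94, close to the target 0.95
print(oc_point(150, 5, 0.0604))   # about 0.105, close to the target 0.10
```

Neither point is hit exactly, which is typical of plans taken from tables, but both are close to the nominal risks.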
Inspection error has an effect on AOQ. Incorrect classification of a good item reduces the AOQ because more
screening inspection occurs. Incorrect classification of a defective item results in higher AOQ values because
it reduces the likelihood that lots will be screened. The general impact of a type I inspection error is to increase
the ATI, while a type II inspection error decreases the ATI.


References for the Supplemental Text Material


1. Acosta-Mejia, C. J. (1999), Improved P Charts to Monitor Process Quality, IIE Transactions,
Vol. 31, pp. 509-516.
2. Aerne, L. A., Champ, C. W., and Rigdon, S. E. (1991), Evaluation of Control Charts Under
Linear Trend, Communications in Statistics: Theory and Methods, Vol. 20, No. 10, pp. 3341-3349.
3. Andrews, H. P. (1964). The Role of Statistics in Setting Food Specifications, Proceedings of
the Sixteenth Annual Conference of the Research Council of the American Meat Institute, pp. 43-56. Reprinted in Experiments in Industry: Design, Analysis, and Interpretation of Results, eds.
R. D. Snee, L. B. Hare and J. R. Trout, American Society for Quality Control, Milwaukee, WI
1985.
4. Bain, L. J., and Engelhardt, M. (1987), Introduction to Probability and Mathematical Statistics,
2nd edition, PWS-Kent, Boston, MA.
5. Barton, R. R. (1997). Pre-experiment Planning for Designed Experiments: Graphical Methods,
Journal of Quality Technology, Vol. 29, pp. 307-316.
6. Barton, R. R. (1998). Design-plots for Factorial and Fractional Factorial Designs, Journal of
Quality Technology, Vol. 30, pp. 40-54.
7. Barton, R. R. (1999). Graphical Methods for the Design of Experiments, Springer Lecture Notes
in Statistics 143, Springer-Verlag, New York.
8. Baxley, R. V., Jr. (1995), An Application of Variable Sampling Interval Control Charts, Journal
of Quality Technology, Vol. 27, pp. 275-282.
9. Bishop, T., Petersen, B. and Trayser, D. (1982). Another Look at the Statistician's Role in
Experimental Planning and Design, The American Statistician, Vol. 36, pp. 387-389.
10. Bissell, D. (1994), Statistical Methods for SPC and TQM, Chapman and Hall, London.
11. Box, G. E. P. and S. Jones (1992). Split-Plot Designs for Robust Product Experimentation.
Journal of Applied Statistics, Vol. 19, pp. 3-26.
12. Champ, C. W., and Rigdon, S. E. (1991), A Comparison of the Markov Chain and the Integral
Equation Approaches for Evaluating the Run Length Distribution of Quality Control Charts,
Communications in Statistics: Simulation and Computation, Vol. 20, No. 1, pp. 191-204.
13. Champ, C. W., and Rigdon, S. E. (1997), An Analysis of the Run Sum Control Chart, Journal of Quality Technology, Vol. 29, pp. 407-417.


14. Charnes, J. M., and Gitlow, H. S. (1995), Using Control Charts to Corroborate Bribery in
Jai Alai, The American Statistician, Vol. 49, pp. 386-389.
15. Ciminera, J. L., and Lease, M. P. (1992), Developing Control Charts to Review and Monitor
Medication Errors, Hospital Pharmacy, Vol. 27, p. 192.
16. Czitrom, V., and Reece, J. E. (1997). Virgin Versus Recycled Wafers for Furnace Qualification:
Is the Expense Justified? in V. Czitrom and P. D. Spagon (eds), Statistical Case Studies for
Industrial Process Improvement, pp. 87-104, Society for Industrial and Applied Mathematics,
Philadelphia, PA.
17. David, H. A. (1951), Further Applications of the Range to Analysis of Variance, Biometrika,
Vol. 38, pp. 393-407


18. Davis, R. B., Homer, A., and Woodall, W. H. (1990), Performance of the Zone Control Chart,
Communications in Statistics: Theory and Methods, Vol. 19, pp. 1581-1587.
19. Dolezal, K. K., Burdick, R. K., and Birch, N. J. (1998), Analysis of a Two-Factor R & R Study
with Fixed Operators, Journal of Quality Technology, Vol. 30, pp. 163-170.
20. Duncan, A.J. (1955), The Use of the Range in Computing Variabilities, Industrial Quality
Control, Vol. 9, No. 5, pp. 18-22. (Clarification and Further Comment, Vol. 11, No. 8, p. 70;
Errata, Vol. 12, No. 2, p. 36).
21. Finison, L. J., Spencer, M., and Finison, K. S. (1993), Total Quality Measurement in Health
Care: Using Individuals Charts in Infection Control, pp. 349-359, in ASQC Quality Congress
Transactions, ASQC, Milwaukee, WI.
22. Grant, E. L. and Leavenworth, R. S. (1996), Statistical Quality Control, 7th Edition, McGraw-Hill, New York.
23. Hahn, G. J. (1977). Some Things Engineers Should Know About Experimental Design, Journal
of Quality Technology, Vol. 9, pp. 13-20.
24. Hahn, G. J. (1984). Experimental Design in a Complex World, Technometrics, Vol. 26, pp. 19-31.
25. Hamada, M., and Weerahandi, S. (2000), Measurement Systems Assessment via Generalized
Inference, Journal of Quality Technology, Vol. 32, pp. 241-253.
26. Herath, H. S. B., Park, C. S., and Prueitt, G. C. (1995), Monitoring Projects using Cash Flow
Control Charts, The Engineering Economist, Vol. 41, pp. 27.
27. Hinkley, D. V. (1970), Inference About the Change-Point in a Sequence of Random Variables,
Biometrika, Vol. 57, pp. 1-17.
28. Hogg, R. V., and Craig, A. T. (1978), Introduction to Mathematical Statistics, 4th edition,
Macmillan, New York.
29. Howarth, R. J. (1995), Quality Control Charting for the Analytical Laboratory: Part 1. Univariate
Methods, Analyst, Vol. 120, pp. 1851-1873.
30. Hunter, W. G. (1977). Some Ideas About Teaching Design of Experiments With 25 Examples of
Experiments Conducted by Students, The American Statistician, Vol. 31, pp. 12-17.
31. Hurwicz, A., and Spagon P. D. (1997), Identifying Sources of Variation in a Wafer Planarization
Process, in V. Czitrom and P. D. Spagon (eds), Statistical Case Studies for Industrial Process
Improvement, pp. 105-144, Society for Industrial and Applied Mathematics, Philadelphia, PA.
32. Jaehn, A. H. (1987), Zone Control Charts SPC Made Easy, Quality, Vol. 26 (October), pp.
51-53.
33. Jin, C., and Davis, R. B. (1991), Calculation of Average Run lengths for Zone Control Charts
With Specified Zone Scores, Journal of Quality Technology, Vol. 23, pp. 355-358.
34. Klein, M. (2000), Two Alternatives to the Shewhart x̄ Control Chart, Journal of Quality
Technology, Vol. 32, pp. 427-431.
35. Leon, R. V., A. C. Shoemaker and R. N. Kackar (1987). Performance Measures Independent of
Adjustment. Technometrics, Vol. 29, pp. 253-265
36. Levey, S., and Jennings, E. R. (1992), Historical Perspectives: The use of Control Charts in the
Clinical Laboratory, Archives of Pathology and Laboratory Medicine, Vol. 116, pp. 791-798.
37. Luko, S. N. (1996), Concerning the Estimators R̄/d2 and R̄/d2* in Estimating Variability in a Normal Universe, Quality Engineering, Vol. 8, No. 3, pp. 481-487.

38. Montgomery, D. C., Peck, E. A., and Vining, G. G. (2006), Introduction to Linear Regression
Analysis, 4th edition, John Wiley & Sons, New York.
39. Myers, R. H., and Milton, J. S. (1991), A First Course in the Theory of Linear Statistical Models,
PWS-Kent, Boston, MA.
40. Nelson, L. S. (1975), Use of the Range to Estimate Variability, Journal of Quality Technology,
Vol. 7, pp. 46-48.
41. Ott, E. R. (1947), An Indirect Calibration of an Electronic Test Set, Industrial Quality Control,
Vol. 3, No. 4, pp. 11-14.
42. Park, C., and Reynolds, M. R., Jr. (1994), Economic Design of a Variable Sample Size X̄ Chart,
Communications in Statistics: Simulation and Computation, Vol. 23, pp. 467-483.
43. Patnaik, P. R. (1950), The Use of the Mean Range as an Estimator of Variance in Statistical
Tests, Biometrika, Vol. 37, pp. 78-87.
44. Prabhu, S. S., Montgomery, D. C., and Runger, G. C. (1997), Economic-Statistical Design of an
Adaptive X̄ Chart, International Journal of Production Economics, Vol. 49, pp. 1-15.
45. Quinlan, J. (1985). Product Improvement by Application of Taguchi Methods. Third Supplier
Symposium on Taguchi Methods, American Supplier Institute, Inc., Dearborn, MI.
46. Reynolds, J. H. (1971), The Run Sum Control Chart Procedure, Journal of Quality Technology,
Vol. 3, pp. 23-27.
47. Roberts, S. W. (1966), A Comparison of Some Control Chart Procedures, Technometrics, Vol.
8, pp. 411-430.
48. Rodriguez, R. N. (1996), Health Care Applications of Statistical Process Control: Examples
using the SAS System, pp. 1381-1396, in Proceedings of the 21st SAS Users Group Conference,
SAS Users Group International, Cary, NC.
49. Roes, K. C. B., and Does, R. J. M. M. (1995), Shewhart-Type Charts in Nonstandard Situations (with discussion), Technometrics, Vol. 37, pp. 15-40.
50. Runger, G. C., and Fowler, J. W. (1998). Run-to-Run Control Charts with Contrasts. Quality
and Reliability Engineering International, Vol. 14, pp. 262-272.
51. Runger, G. C., and Montgomery, D. C. (1993), Adaptive Sampling Enhancements for Shewhart
Control Charts, IIE Transactions, Vol. 25, pp. 41-51.
52. Ryan, T. P., and Schwertman, N. C. (1997), Optimal Limits for Attributes Control Charts,
Journal of Quality Technology, Vol. 29, pp. 86-98.
53. Samuel, T. R., Pignatiello, J. J., Jr., and Calvin, J. A. (1998), Identifying the Time of a Step
Change with x̄ Control Charts, Quality Engineering, Vol. 10, pp. 521-527.
54. Schilling, E. G. (1978), A Lot Sensitive Sampling Plan for Compliance Testing and Acceptance
Inspection, Journal of Quality Technology, Vol. 10, pp. 47-51.
55. Schwertman, N. C., and Ryan, T. P. (1997), Implementing Optimal Attributes Control Charts,
Journal of Quality Technology, Vol. 29, pp. 99-104.
56. Shore, H. (2000), General Control Charts for Attributes, IIE Transactions, Vol. 32, pp. 1149-1160.
57. Sullivan, J. H., and Woodall, W. H. (1996), A Comparison of Control Charts for Individual
Observations, Journal of Quality Technology, Vol. 28, pp. 398-408.


58. Tagaras, G. (1998), A Survey of Recent Developments in the Design of Adaptive Control
Charts, Journal of Quality Technology, Vol. 30, pp. 212-231.
59. Tippett, L. H. C. (1925), On the Extreme Individuals and the Range in Samples Taken from a
Normal Population, Biometrika, Vol. 17, pp. 364-387.
60. Wheeler, D. J. (1995), Advanced Topics in Statistical Process Control, SPC Press, Knoxville, TN.
61. Woodall, W. H., and Montgomery, D. C. (2000-01), Using Ranges to Estimate Variability,
Quality Engineering, Vol. 13, pp. 211-217.
62. Yashchin, E. (1994), Monitoring Variance Components, Technometrics, Vol. 36, pp. 379-393.
63. Zimmer, L. S., Montgomery, D. C., and Runger, G. C. (1998), A Three-State Adaptive Sample
Size X̄ Control Chart, International Journal of Production Research, Vol. 36, pp. 733-743.
64. Zimmer, L. S., Montgomery, D. C., and Runger, G. C. (2000), Guidelines for the Application of
Adaptive Control Charting Schemes, International Journal of Production Research, Vol. 38, pp.
1977-1992.
