What and When To Randomize

STATISTICS ROUNDTABLE
by Christine Anderson-Cook
QUALITY PROGRESS, MARCH 2006, p. 59

Three common mistakes to watch out for.

One of the most highly stressed principles of statistical design of experiments is the need for proper randomization. Unfortunately, it is sometimes misunderstood and misapplied. The motivation for randomization is to remove some of the subjectivity from the experiment and to offer protection from systematic but unknown or unaccounted-for factors affecting the value of the response.

For example, if you are interested in assessing the effect of an adjustment to your process, it would be a mistake to obtain all the data from one condition and then all the data from the other condition. Suppose, unknown to you, there is a warm-up period for one of the machines involved in the process. The warm-up period would affect only the data from the first condition, and its effect would be confounded with the difference between the two conditions you are trying to assess.

If, however, you randomized the order in which the data were collected from the two conditions, then the unknown warm-up period would influence both conditions, not just one. When you perform the analysis, the warm-up effect might increase the variance of the measurements you obtained, perhaps making it harder to find a significant difference between the two conditions, but it would likely not systematically bias your results and lead you to a false conclusion.

This practical protection from unknown causes through randomization is also the theoretical basis for the validity of any inference or testing you might perform. Since you knew ahead of time there were natural differences between the experimental units (the units to which you were applying the two treatments, condition one and condition two), randomization assured you that, on average, there would be relatively little difference between the two groups of experimental units before receiving the treatment. Any significant differences you saw could therefore be attributed to the difference between the conditions.

Three Common Mistakes

Appropriate randomization dramatically improves the quality of the data collected and allows you to make valid inferences about whether your treatments cause the values observed for the response. As you try to implement an experiment with randomization, however, you could easily make one of several common mistakes and thereby make your experiment invalid or, at the very least, less effective.

Mistake one: randomizing which data to collect, not which observational units get which chosen input level. Consider a not entirely fictitious story from my consulting experience where an experimenter was interested in studying the effect of temperatures between 100 and 200 degrees on the response. The scientist came to see me after running the experiment involving 30 observations, proudly saying he had randomized the temperatures. Figure 1 shows the range of temperatures he used to collect his experimental data.

FIGURE 1. Range of Temperatures (horizontal axis: temperature, 100 to 200)

While his intentions may have been good, his execution significantly reduced the experiment's effectiveness. Because he had randomly selected 30 temperatures rather than controlling which ones to use, the experiment was more difficult to run. There were no observations at the extremes of the range of interest (neither 100 nor 200 degrees had been selected), and the uneven spacing of the temperatures would not provide optimum information about the relationship between the explanatory variable and the response. Also, because no temperature was measured twice, no measure of pure error was possible.

This experiment would have been run more effectively had the experimenter consciously selected the particular temperatures he wanted to consider. For example, if he was unclear about the shape of the relationship between the input and output before running the experiment and was worried about detecting a phase change in the response at a particular temperature, he should have selected an equally spaced design with replicates to measure natural variability at different locations (see Figure 2). However, if he thought the relationship would be more continuous with only a moderate amount of curvature, then a design with just three equally spaced temperatures would provide maximum power for detecting differences between the levels of the input variable (see Figure 3).

FIGURE 2. Equally Spaced Design With Replicates (horizontal axis: temperature, 100 to 200)

Once the appropriate levels of the input factor, temperature, had been chosen, his randomization step would have involved determining which temperature each of the 30 experimental units would receive. The choice of factor levels should be based on current understanding of the process and on what the relationship will likely look like. It should never be left to chance through randomization.

Mistake two: choosing a randomization approach that does not ensure balance or protect against changes in the amount of data collected. Consider a simple experiment in which you want to compare the relative effect of two drugs. The patients enter the study at different times and, therefore, are assigned at random to one of the two drugs as they arrive. One choice for randomization would be to flip a coin each time a new patient arrives and assign him or her to a particular drug based on whether the coin showed heads or tails. This, however, might leave the number of patients receiving each drug unbalanced.

A better approach would be to devise an assignment schedule based on the number of subjects planned for the study. For example, if you know the study is designed to continue until 120 patients have been included, then the assignment schedule might include a balanced randomization for the first 20 patients, then the next 20 and so on. The randomization for the first 20 patients might look something like this: 2 1 2 2 1 1 2 2 1 2 1 1 2 1 1 2 2 1 1 2.
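A balanced schedule like the one above could be generated along the following lines. This is only a minimal Python sketch under stated assumptions, not the procedure any particular study used: the function name `permuted_block_schedule`, its parameters and the seed value are hypothetical, and the treatment labels 1 and 2 simply mirror the example sequence.

```python
import random


def permuted_block_schedule(n_patients, block_size, treatments=(1, 2), seed=None):
    """Build an assignment schedule that is balanced within every block.

    Hypothetical helper for illustration: each block of `block_size`
    consecutive patients receives every treatment equally often, in a
    random order.
    """
    if block_size % len(treatments) != 0:
        raise ValueError("block_size must be a multiple of the number of treatments")
    if n_patients % block_size != 0:
        raise ValueError("n_patients must be a multiple of block_size")

    rng = random.Random(seed)  # seeded so the schedule can be reproduced
    copies_per_block = block_size // len(treatments)

    schedule = []
    for _ in range(n_patients // block_size):
        block = list(treatments) * copies_per_block  # e.g. ten 1s and ten 2s
        rng.shuffle(block)                           # random order within the block
        schedule.extend(block)
    return schedule


# 120 planned patients, randomized in groups of 20 (seed chosen arbitrarily).
schedule = permuted_block_schedule(120, 20, seed=2006)
print(schedule[:20])
```

With two drugs and blocks of 20, the running difference between the numbers of patients on each drug returns to zero at the end of every block and can never exceed 10 in between, whereas a sequence of coin flips carries no such guarantee.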
FIGURE 3. Equally Spaced Temperatures (horizontal axis: temperature, 100 to 200; 10 instances at each of three temperatures)

By doing the randomization in groups of 20, the number of patients receiving each drug will not be too unbalanced if the study is terminated early or preliminary results are needed.

The best randomization for this experiment might have separate randomization schedules for patients based on other demographic information, which would ideally be balanced across the allocation to drugs. For example, based on a patient profile, it would be easy to stratify the patients into four categories based on gender and whether their condition is severe or mild. This would lead to four randomization schedules: male-severe, female-severe, male-mild and female-mild. This way, you could consciously include known factors that might affect the performance of the drugs in the study and not have to adjust for them after the experiment. Generally, if there are known factors that might affect the response to the treatments and can be measured before the assignment to treatments, then you should include this stratification in the design of the experiment.

Mistake three: the implementation and subsequent analysis of the experiment do not match the intended design protocol. Suppose you are interested in running a simple experiment to study the effect of two factors, each at two levels, on your response of interest. Your software package yields the run order in Table 1 for your 2^2 factorial experiment with two replicates.

TABLE 1
2^2 Factorial Experiment With Two Replicates

Run order   Factor one   Factor two
1           Low          Low
2           Low          High
3           High         High
4           Low          High
5           High         Low
6           High         High
7           Low          Low
8           High         Low

Notice the level of factor one does not change between observations one and two or between observations five and six. Similarly, the level of factor two does not change across observations two, three and four or between observations seven and eight. If you collect the first two observations and do not reset the level of factor one between the runs (because skipping the reset might appear to introduce less variability and is simpler to run), then the response values for this pair of observations will be correlated with each other. The same is true for observations five and six, observations two, three and four, and observations seven and eight. This is called an inadvertent split-plot and requires a different, more complicated analysis to correctly estimate the error terms in the model [1].

If resetting a factor for each experimental unit will be too costly or difficult, then you should select a different design. Let's say it was not practical to change the level of factor one for each run. You might choose a split-plot experiment to reduce the number of level changes required. For example, it might only be practical to change the level every second run. The factors that have restrictions on the number of changes are called whole plot factors, and those that will be reset for each observation are called subplot factors. In this case, a superior design is the one shown in Table 2.

TABLE 2
A Superior Design

Whole plot   Factor one            Run order   Factor two
             (whole plot factor)               (subplot factor)
A            Low                   1, 2        Low (run 1), High (run 2)
B            High                  3, 4        both levels, one run each
C            Low                   5, 6        both levels, one run each
D            High                  7, 8        both levels, one run each

The randomization for this experiment actually occurs in two separate stages:
1. The order in which to run the four whole plots is randomized (for example, A, B, C, D or C, B, D, A).
2. The order in which to run the two observations within each whole plot is randomized (1, 2 or 2, 1).

The analysis of this split-plot experiment will include two error terms: one for the variability associated with changing the whole plots and one for the variability associated with different observations [2, 3].

Randomization is an essential component of running a good statistically designed experiment. To choose the levels of factors to be examined, you need to understand the relationship being studied and the number of observations for each combination to examine. Randomization will remove subjectivity and bias as to which experimental units get which treatments after you have selected the proper combinations. If you know of factors that might influence the response, you should account for them in the design whenever possible. And if you don't reset the levels of each factor for each observation, your analysis should reflect this, and you should likely choose a split-plot design to improve the characteristics of the design. With its inclusion in many statistical software packages, randomization has become much easier and more accessible. Just remember when and how to use it, and you'll run even better experiments.
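Returning to the split-plot design of mistake three, the two randomization stages can be sketched in Python as follows. This is an illustrative sketch only: the function name `split_plot_run_order`, the seed and the factor one level attached to each whole plot label are assumptions made for the example, not a reproduction of any software package's output.

```python
import random

# Assumed for illustration: whole plot label -> level of factor one
# (the whole plot factor). Two whole plots at each level.
WHOLE_PLOTS = {"A": "Low", "B": "High", "C": "Low", "D": "High"}

# Levels of factor two (the subplot factor), reset for every observation.
SUBPLOT_LEVELS = ("Low", "High")


def split_plot_run_order(whole_plots, subplot_levels, seed=None):
    """Return (run, whole plot, factor one, factor two) tuples.

    Stage 1 shuffles the order of the whole plots; stage 2 shuffles the
    order of the subplot levels separately inside each whole plot.
    """
    rng = random.Random(seed)

    plot_order = list(whole_plots)
    rng.shuffle(plot_order)          # stage 1: order of the whole plots

    runs = []
    for label in plot_order:
        levels = list(subplot_levels)
        rng.shuffle(levels)          # stage 2: order within this whole plot
        for level in levels:
            runs.append((len(runs) + 1, label, whole_plots[label], level))
    return runs


for run in split_plot_run_order(WHOLE_PLOTS, SUBPLOT_LEVELS, seed=61):
    print(run)
```

Because factor one is changed only when the experiment moves to a new whole plot, it is reset at most every second run, while factor two is set afresh for each observation. The analysis still needs the two error terms described above.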
REFERENCES
1. Jitendra Ganju and James M. Lucas, "Detecting Randomization Restrictions Caused by Factors," Journal of Statistical Planning and Inference, 1999, pp. 129-140.
2. Jennifer D. Letsinger, Raymond H. Myers and Marvin Lentner, "Response Surface Methods for Bi-Randomization Structure," Journal of Quality Technology, 1996, pp. 381-397.
3. G. Geoffrey Vining, Scott M. Kowalski and Douglas C. Montgomery, "Response Surface Designs Within a Split-Plot Structure," Journal of Quality Technology, 2005, pp. 115-129.
CHRISTINE ANDERSON-COOK is a technical staff member of Los Alamos National Laboratory in Los Alamos, NM. She earned a doctorate in statistics from the University of Waterloo in Ontario. Anderson-Cook is a Senior Member of ASQ.
Please comment
If you would like to comment on this article, please post your remarks on the Quality Progress Discussion Board at www.asq.org, or e-mail them to editor@asq.org.
