
Between-Subjects One-Way ANOVA Description


The one-way analysis of variance (ANOVA) is used with one categorical independent variable and one continuous dependent variable. The independent variable can consist of any number of groups (levels). For example, an experimenter hypothesizes that learning in groups of three will be more effective than learning in pairs or individually. Students are randomly assigned to three groups and all students study a section of text. Those in group one study the text individually (control group), those in group two study in groups of two, and those in group three study in groups of three. After studying for a set period of time, all students complete a test over the material they studied.

First, note that this is a between-subjects design, since there are different subjects in each experimental condition. Second, notice that instead of two groups (i.e., levels) of the independent variable, we now have three. The t-test, which is often used in similar experiments with two groups, is only appropriate when there are exactly two levels of one independent variable. When there is a categorical independent variable and a continuous dependent variable, and there are more than two levels of the independent variable and/or more than one independent variable (a case that would require a multi-way, as opposed to one-way, ANOVA), the appropriate analysis is the workhorse of experimental psychology research, the analysis of variance.

When there are more than two levels of the independent variable, the analysis goes through two steps. First, we carry out an overall F test to determine whether there is a significant difference among any of the means. If this F score is statistically significant, we carry out a second step in which we compare two means at a time in order to determine specifically where the significant difference lies.
Let's say that we have run the experiment on group learning and we recognize that the appropriate analysis is the between-subjects one-way analysis of variance. We use a statistical program and analyze the data with group as the independent variable and test score as the dependent variable. Our results might look something like the following:

Source            D.F.   Sum of Squares   Mean Squares   F Ratio   F Prob.
Between Groups      2        1392.47         696.23       11.87     .0002
Within Groups      27        1583.40          58.64
Total              29        2975.87
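The Mean Squares and F Ratio columns of a table like this follow directly from the sums of squares and degrees of freedom, which can be checked with a few lines of Python. This is a minimal sketch, not the output of any particular statistical program; the function names are illustrative, and the tail probability uses a closed form that is valid only when the between-groups degrees of freedom equal 2, as they do here.

```python
def complete_anova_table(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from sums of squares and df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


def f_tail_prob_two_numerator_df(f_value, df_within):
    """P(F >= f_value) for an F distribution with (2, df_within) df.

    Assumption: this closed form holds only when the numerator
    (between-groups) degrees of freedom are exactly 2.
    """
    return (1.0 + 2.0 * f_value / df_within) ** (-df_within / 2.0)


# Sums of squares and degrees of freedom taken from the table above.
ms_b, ms_w, f = complete_anova_table(1392.47, 2, 1583.40, 27)
p = f_tail_prob_two_numerator_df(f, 27)
print(f"MS between = {ms_b:.2f}, MS within = {ms_w:.2f}")
print(f"F = {f:.2f}, probability = {p:.4f}")
```

With more than three groups the closed form no longer applies, and an F table or a statistics library would be needed for the probability.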



The "Between Groups" row represents what is often called "explained variance" or "systematic variance". We can think of this as variance that is due to the independent variable, the difference among the three groups. For example, the difference between a person's score in group one and a person's score in group two would represent explained variance. The "Within Groups" variance represents what is often called "error variance". This is the variance within your groups, variance that is not due to the independent variable. For example, the difference between one person in group one and another person in group one would represent error variance. Intuitively, it's important to understand that, at its heart, the analysis of variance and the F score it yields is a ratio of explained variance to error variance. The actual F score (ratio) is in the next-to-last column, and the probability of an F of this magnitude is in the final column. As you can see, this probability is well below the .05 cutoff, so we can conclude that the groups are statistically significantly different from one another.

But two very important questions remain. First, which means are significantly different from which other means and, second, what were the actual scores of the groups? To answer the pair-comparison questions we run a series of Tukey's post-hoc tests, which are like a series of t-tests. The post-hoc tests are more stringent than regular t-tests, however, because the more tests you perform, the more likely it is that you will find a significant difference just by chance. Your post-hoc tests, which statistical programs often present in a table, might look something like this:

Mean     Group    Grp 1   Grp 2   Grp 3
72.30    Grp 1
86.60    Grp 2      *
86.90    Grp 3      *

This table represents a matrix with the groups listed along each axis. It's important, at this point, to note the way the groups were originally coded (1 = individual; 2 = dyad; 3 = triad). First of all, note the means. Clearly, the mean for those in group 1 was substantially lower than the means of the other two groups, which were practically identical. When we look at the Tukey's post-hoc table, we see that the post-hoc tests are consistent with what we observed with the means. Note that the stars are in the boxes that correspond to groups (1 vs. 2) and (1 vs. 3). This means that these mean differences were statistically significant. We can sum these results up by saying something like "Those who studied individually scored significantly lower than those who studied in dyads or triads, while the latter two groups did not differ significantly from one another."

In other words, this experiment indicates that studying in a group is more effective than studying individually, but the size of the group (two vs. three members) is not important. The experimenter's original hypothesis that learning in triads is more effective than learning in dyads or individually was not supported. In the case of this experiment, this seems obvious based on the means, but in many "real world" studies this is not the case, and the estimates of statistical significance become very important.

The analysis of variance is a simple test to do with a computer, but can get pretty complicated when calculated by hand, even with a small sample size. However, calculating a one-way analysis of variance and subsequent post-hoc tests by hand will give you an appreciation for what the computer is doing and also help you to better understand the underlying logic.



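$$SS_{total} = \sum x^2 - \frac{(\sum x)^2}{N}$$

$$SS_{among} = \sum_{g=1}^{r} \frac{(\sum x_g)^2}{n} - \frac{(\sum x)^2}{N}$$

$$SS_{within} = SS_{total} - SS_{among}$$

$$df_{among} = r - 1 \qquad df_{within} = N - r$$

Here $\sum x_g$ is the sum of the observations in group g, and $\sum x$ is the sum over all groups.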

x = individual observation
r = number of groups
N = total number of observations (all groups)
n = number of observations in group

Steps (assuming three groups)

Create six columns: "x1", "x1²", "x2", "x2²", "x3", and "x3²"

1. Put the raw data, according to group, in "x1", "x2", and "x3"
2. Calculate the sum for group 1.
3. Calculate $(\sum x)^2$ for group 1.
4. Calculate the mean for group 1.
5. Calculate $\sum x^2$ for group 1.
6. Repeat steps 2-5 for groups 2 and 3.
7. Set up the SStotal and SSamong formulas and calculate.
8. Calculate SSwithin.
9. Enter the sums of squares into the ANOVA table, and complete the table by calculating dfamong, dfwithin, MSamong, MSwithin, and F.
10. Check to see if F is statistically significant on a probability table with the appropriate degrees of freedom and p < .05.
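The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `one_way_anova` and the dictionary keys are made up for this sketch, and the scores are the three groups of eight from the background-sound example in this document. Each group's squared sum is divided by that group's own size, which reduces to the equal-n case when all groups have the same number of observations.

```python
def one_way_anova(groups):
    """One-way between-subjects ANOVA from raw scores (computational formulas)."""
    r = len(groups)                                     # number of groups
    total_n = sum(len(g) for g in groups)               # N, all observations
    sum_all = sum(sum(g) for g in groups)               # sum of every score
    sum_sq_all = sum(x * x for g in groups for x in g)  # sum of every squared score
    correction = sum_all ** 2 / total_n                 # (sum of x)^2 / N

    ss_total = sum_sq_all - correction
    ss_among = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_among

    df_among = r - 1
    df_within = total_n - r
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total,
        "ss_among": ss_among,
        "ss_within": ss_within,
        "df_among": df_among,
        "df_within": df_within,
        "ms_among": ms_among,
        "ms_within": ms_within,
        "f": ms_among / ms_within,
    }


scores = [
    [7, 4, 6, 8, 6, 6, 2, 9],  # group 1: constant sound
    [5, 5, 3, 4, 4, 7, 2, 2],  # group 2: random sound
    [2, 4, 7, 1, 2, 1, 5, 5],  # group 3: no sound
]
table = one_way_anova(scores)
for name, value in table.items():
    print(f"{name:>10}: {value:.2f}")
```

Step 10, the comparison against a critical F value, is left to a probability table, as in the steps themselves.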


Tukey's HSD Post Hoc Test Steps


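Formula:

$$q = \frac{M_1 - M_2}{\sqrt{MS_{within} / n}}$$

M = treatment/group mean
n = number per treatment/group
$MS_{within}$ = Mean Square Within from the analysis of variance

Steps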

1. Calculate an analysis of variance (e.g., one-way between-subjects ANOVA).
2. Select two means and note the relevant variables (means, Mean Square Within, and number per condition/group).
3. Calculate Tukey's test for each mean comparison.
4. Check to see if Tukey's score is statistically significant with Tukey's probability/critical value table, taking into account the appropriate dfwithin and number of treatments.
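Steps 2 and 3 can be sketched in Python as below. The sketch assumes the usual studentized-range form of Tukey's score, the difference between two means divided by the square root of (Mean Square Within / n), with equal group sizes; the function name `tukey_scores` is illustrative. The inputs are the group means, Mean Square Within and n from the background-sound example in this document.

```python
import math
from itertools import combinations


def tukey_scores(means, ms_within, n):
    """Return {(i, j): score} for every pair of groups, numbered from 1.

    Assumes equal group sizes n and the form |Mi - Mj| / sqrt(MSwithin / n).
    """
    standard_error = math.sqrt(ms_within / n)
    return {
        (i + 1, j + 1): abs(means[i] - means[j]) / standard_error
        for i, j in combinations(range(len(means)), 2)
    }


# Means, Mean Square Within and group size from the background-sound example.
scores = tukey_scores(means=[6, 4, 3.375], ms_within=4.18, n=8)
for pair, value in scores.items():
    print(f"group {pair[0]} vs. group {pair[1]}: {value:.2f}")
```

Each score still has to be compared with the critical value from a Tukey table for three treatments and dfwithin = 21 (step 4); that lookup is not automated here.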


Between Subjects One-Way ANOVA example


Problem: Susan Sound predicts that students will learn most effectively with a constant background sound, as opposed to an unpredictable sound or no sound at all. She randomly divides twenty-four students into three groups of eight. All students study a passage of text for 30 minutes. Those in group 1 study with background sound at a constant volume. Those in group 2 study with noise that changes volume periodically. Those in group 3 study with no sound at all. After studying, all students take a 10-point multiple-choice test over the material. Their scores follow:

Group               Test scores
1) constant sound   7 4 6 8 6 6 2 9
2) random sound     5 5 3 4 4 7 2 2
3) no sound         2 4 7 1 2 1 5 5

x1    x1²    x2    x2²    x3    x3²
7     49     5     25     2     4
4     16     5     25     4     16
6     36     3     9      7     49
8     64     4     16     1     1
6     36     4     16     2     4
6     36     7     49     1     1
2     4      2     4      5     25
9     81     2     4      5     25

$\sum x_1 = 48$    $\sum x_1^2 = 322$    $(\sum x_1)^2 = 2304$    $M_1 = 6$
$\sum x_2 = 32$    $\sum x_2^2 = 148$    $(\sum x_2)^2 = 1024$    $M_2 = 4$
$\sum x_3 = 27$    $\sum x_3^2 = 125$    $(\sum x_3)^2 = 729$     $M_3 = 3.375$

$$SS_{total} = \sum x^2 - \frac{(\sum x)^2}{N} = (322 + 148 + 125) - \frac{107^2}{24} = 595 - 477.04 = 117.96$$

$$SS_{among} = \frac{2304 + 1024 + 729}{8} - \frac{107^2}{24} = 507.13 - 477.04 \approx 30.08$$

$$SS_{within} = 117.96 - 30.08 = 87.88$$

Source    SS       df    MS       F
Among     30.08     2    15.04    3.59*
Within    87.88    21     4.18

*(According to the F significance/probability table with df = (2, 21), F must be at least 3.4668 to reach p < .05, so the F score is statistically significant.)

Interpretation: Susan can conclude that her hypothesis may be supported. The means are as she predicted, in that the constant sound group has the highest score. However, the significant F only indicates that at least two means are significantly different from one another; she can't know which specific mean pairs significantly differ until she conducts a post-hoc analysis (e.g., Tukey's HSD).
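As a cross-check on the hand calculation, the same sums of squares can be computed from deviations rather than from raw-score sums: the among-groups sum of squares from the distance of each group mean to the grand mean, and the within-groups sum of squares from the distance of each score to its own group mean. The Python below is a sketch of that check under those definitions; the variable and function names are illustrative.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


groups = {
    "constant sound": [7, 4, 6, 8, 6, 6, 2, 9],
    "random sound": [5, 5, 3, 4, 4, 7, 2, 2],
    "no sound": [2, 4, 7, 1, 2, 1, 5, 5],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Explained (among-groups) variation: group means around the grand mean.
ss_among = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# Error (within-groups) variation: scores around their own group mean.
ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

df_among = len(groups) - 1
df_within = len(all_scores) - len(groups)
f_ratio = (ss_among / df_among) / (ss_within / df_within)

print(f"SSamong = {ss_among:.2f}, SSwithin = {ss_within:.2f}, F = {f_ratio:.2f}")
```

Both routes should agree up to rounding, which is a useful sanity check when the raw-score sums are computed by hand.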

Analysis of Variance Example

A manager wishes to determine whether the mean times required to complete a certain task differ for the three levels of employee training. He randomly selected 10 employees with each of the three levels of training (Beginner, Intermediate and Advanced). Do the data provide sufficient evidence to indicate that the mean times required to complete a certain task differ for at least two of the three levels of training? The data are summarized in the table.

Level of Training    n     Mean    s²
Advanced             10    24.2    21.54
Intermediate         10    27.1    18.64
Beginner             10    30.2    17.76

Ha: The mean times required to complete a certain task differ for at least two of the three levels of training.

Ho: The mean times required to complete a certain task do not differ among the three levels of training ($\mu_B = \mu_I = \mu_A$).

Assumptions: The samples were drawn independently and randomly from the three populations. The time required to complete the task is normally distributed for each of the three levels of training. The populations have equal variances.

Test statistic: F = MS(Treatments) / MS(Error)




Calculations:

$$SS_{Treatments} = 10(24.2 - 27.167)^2 + 10(27.1 - 27.167)^2 + 10(30.2 - 27.167)^2 \approx 180.067$$

$$SS_{Error} = 9(21.54) + 9(18.64) + 9(17.76) = 521.46$$

Here 27.167 is the grand mean, (24.2 + 27.1 + 30.2)/3 = 27.1666...

Source        df    SS         MS        F
Treatments     2    180.067    90.033    4.662
Error         27    521.46     19.313
Total         29    701.527
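Because only group sizes, means and variances are given, the ANOVA table can be rebuilt from those summary statistics alone. The Python below is a sketch of that calculation using the values from the training table; the function name `anova_from_summary` and its return keys are illustrative.

```python
def anova_from_summary(sizes, means, variances):
    """One-way ANOVA table from group sizes, means and sample variances."""
    total_n = sum(sizes)
    k = len(sizes)
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / total_n

    # Treatment SS: weighted squared distance of group means from the grand mean.
    ss_treatments = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    # Error SS: each sample variance times its own degrees of freedom.
    ss_error = sum((n - 1) * v for n, v in zip(sizes, variances))

    df_treatments, df_error = k - 1, total_n - k
    ms_treatments = ss_treatments / df_treatments
    ms_error = ss_error / df_error
    return {
        "ss_treatments": ss_treatments,
        "ss_error": ss_error,
        "ss_total": ss_treatments + ss_error,
        "df_treatments": df_treatments,
        "df_error": df_error,
        "ms_treatments": ms_treatments,
        "ms_error": ms_error,
        "f": ms_treatments / ms_error,
    }


# Advanced, Intermediate, Beginner (values from the summary table).
result = anova_from_summary(
    sizes=[10, 10, 10],
    means=[24.2, 27.1, 30.2],
    variances=[21.54, 18.64, 17.76],
)
for name, value in result.items():
    print(f"{name:>14}: {value:.3f}")
```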

Decision: Reject Ho. Conclusion: There is sufficient evidence to indicate that the mean times required to complete a certain task differ for at least two of the three levels of training. Which pairs of means differ? The Bonferroni Test is done for all possible pairs of means.

Decision rule: Reject Ho if the confidence interval for the difference between the two means does not contain 0.

c = number of pairs = p(p-1)/2 = 3(2)/2 = 3

t.0083 = 2.554 with 27 error degrees of freedom, where .0083 = .05/(2c) = .05/6. (This value is not in the t table; it was obtained from a computer program.)

Since t.010 < t.0083 < t.005 (2.473 < t.0083 < 2.771), use t.005 when using a table. If you reject the null hypothesis with t = 2.771, you will also reject it with t.0083.
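The three pairwise intervals can be sketched in Python as below. The interval form used here is an assumption, since it is not spelled out in the text: the difference between two sample means, plus or minus the critical t times the square root of MSE(1/n_i + 1/n_j). The critical value 2.554 and the MSE of 19.313 are taken from this example; the function name `bonferroni_intervals` is illustrative.

```python
import math
from itertools import combinations


def bonferroni_intervals(means, sizes, mse, t_crit):
    """Return {(a, b): (lower, upper)} for every pair of groups.

    Assumed interval: (mean_a - mean_b) +/- t_crit * sqrt(mse * (1/n_a + 1/n_b)).
    """
    intervals = {}
    for a, b in combinations(means, 2):
        diff = means[a] - means[b]
        margin = t_crit * math.sqrt(mse * (1 / sizes[a] + 1 / sizes[b]))
        intervals[(a, b)] = (diff - margin, diff + margin)
    return intervals


means = {"Advanced": 24.2, "Intermediate": 27.1, "Beginner": 30.2}
sizes = {"Advanced": 10, "Intermediate": 10, "Beginner": 10}

intervals = bonferroni_intervals(means, sizes, mse=19.313, t_crit=2.554)
for (a, b), (lower, upper) in intervals.items():
    verdict = "differ" if lower > 0 or upper < 0 else "no evidence of a difference"
    print(f"{a} - {b}: ({lower:.2f}, {upper:.2f}) -> {verdict}")

# Pairs whose interval excludes 0.
significant = [pair for pair, (lo, hi) in intervals.items() if lo > 0 or hi < 0]
```

Substituting the table value t.005 = 2.771 for `t_crit` widens every interval slightly, which is why it is the safe choice when the exact critical value is unavailable.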

There is sufficient evidence to indicate that the mean response time for the advanced level of training is less than the mean response time for the beginning level. There is not sufficient evidence to indicate that the mean response time for the intermediate level differs from the mean response time of either of the other two levels.

A researcher would like to find out whether a man's nickname affects his cholesterol reading (though it is not clear why she believes it should). She records the cholesterol readings of 23 men nicknamed Sam, 24 men nicknamed Lou and 19 men nicknamed Mac; her data appear in the table below. She wants to know whether the differences in the average readings are significant; i.e., whether the average reading of all men nicknamed Sam is different from the average reading of all Lous, or whether these averages differ from the average reading of all Macs.

Sam    Lou    Mac
364    260    156
245    204    438
284    221    272
172    285    345
198    308    198
239    262    137
259    196    166
188    299    236
256    316    168
263    216    269
329    155    296
136    212    236
272    201    275
245    175    269
209    241    142
298    233    184
342    279    301
217    368    262
358    413    258
412    240
382    243
593    325
261    156
       280

Using the ANOVA formulas, we have:

I = 3; $n_1 = 23$, $n_2 = 24$, and $n_3 = 19$, and $x_{1,1} = 364$, $x_{1,2} = 260$, $x_{1,3} = 156$, $x_{2,1} = 245$, etc.;

N = 23 + 24 + 19 = 66;

$AV_1 = 283.6$, $AV_2 = 253.7$, $AV_3 = 242.5$ (at least approximately; all figures except the degrees of freedom are approximate from here on);

AV = [23(283.6) + 24(253.7) + 19(242.5)]/66 = 260.9;

$SSE = (364 - 283.6)^2 + (260 - 253.7)^2 + (156 - 242.5)^2 + (245 - 283.6)^2 + \dots$ (a total of 66 terms) $= 406{,}259.7$;

$SSG = 23(283.6 - 260.9)^2 + 24(253.7 - 260.9)^2 + 19(242.5 - 260.9)^2 = 19{,}485.3$;

DFE = 66 - 3 = 63, and so MSE = 6448.6;

DFG = 3 - 1 = 2, and so MSG = 9742.7;

and hence, finally, F = 1.51.

And by consulting an F-distribution table with DFG = 2 and DFE = 63, we find that the probability of an F-value of at least 1.51 is about 23%, not small enough to conclude statistical significance. We do not reject the null hypothesis: the data do not show that the average cholesterol readings of all Sams, all Lous and all Macs differ.
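The whole computation can also be reproduced from the raw readings with a short Python script. This is a sketch, not the spreadsheet mentioned in the text: the variable names mirror the ones used above, and the tail probability relies on a closed form that is valid only because DFG = 2 (for other numerator degrees of freedom an F table or a statistics library would be needed).

```python
readings = {
    "Sam": [364, 245, 284, 172, 198, 239, 259, 188, 256, 263, 329, 136,
            272, 245, 209, 298, 342, 217, 358, 412, 382, 593, 261],
    "Lou": [260, 204, 221, 285, 308, 262, 196, 299, 316, 216, 155, 212,
            201, 175, 241, 233, 279, 368, 413, 240, 243, 325, 156, 280],
    "Mac": [156, 438, 272, 345, 198, 137, 166, 236, 168, 269, 296, 236,
            275, 269, 142, 184, 301, 262, 258],
}

total_n = sum(len(values) for values in readings.values())
group_means = {name: sum(v) / len(v) for name, v in readings.items()}
grand_mean = sum(sum(v) for v in readings.values()) / total_n

# SSE: squared distance of every reading from its own group average.
sse = sum((x - group_means[name]) ** 2
          for name, values in readings.items() for x in values)
# SSG: squared distance of each group average from the overall average,
# weighted by the group size.
ssg = sum(len(values) * (group_means[name] - grand_mean) ** 2
          for name, values in readings.items())

dfg = len(readings) - 1
dfe = total_n - len(readings)
f_value = (ssg / dfg) / (sse / dfe)

# Assumption: this closed form for P(F >= f_value) holds only when dfg == 2.
p_value = (1.0 + 2.0 * f_value / dfe) ** (-dfe / 2.0)

print(f"SSE = {sse:.1f}, SSG = {ssg:.1f}")
print(f"F = {f_value:.2f}, probability = {p_value:.2f}")
```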

Here is a screenshot of an Excel spreadsheet in which these computations are done.