Types of T-Tests
The t-test is used when your data has only two levels of the
independent variable. There is a t-test for dissertations
involving experimental designs with randomized groups
(independent samples), and another t-test for dissertations with
experimental designs involving correlated groups (matched
pairs or within-subjects designs). Knowing what kind of sample
you have is key to selecting the appropriate t-test for your
analyses.
For this analysis, you would use the t-test for independent
means. The crux of your paper is determining whether the 1.3
difference between these means is a statistically reliable
difference or whether the means differ simply because of
sampling error.
Correlated Samples
Using the above example, let's say your work involved one
group of subjects, but each subject listened to the song first,
without seeing the artist's face, then rated how much they liked
it. Then, the same subject saw the artist's face and listened to
the song again. For your analysis, you would use the t-test for
correlated samples, because each person in your sample made
two observations. Obviously, the ratings for this sample are
correlated, because they came from the same individual. This
type of experimental design is called a "within-subjects" design.
Once you have calculated the t-score for your groups, you
need to know whether the t value is large enough to
conclude that the difference you found between the two groups
is significant. Most statistical packages used for analyses (SPSS,
etc.) will report a probability (p) value for you. If, for your
dissertation, you have set your significance level at .05, any p
value smaller than this means that you have significant findings.
Dissertation committees and dissertation chairs love significant
results!
The T-Test
The t-test assesses whether the means of two groups are statistically different from each other. This
analysis is appropriate whenever you want to compare the means of two groups, and it is especially
appropriate for comparing the outcomes of a treatment group and a control group in an experimental design.

Figure 1 shows the distributions for the treated (blue) and control (green) groups in a study. Actually, the
figure shows the idealized distribution; the actual distribution would usually be depicted with a histogram or
bar graph. The figure indicates where the control and treatment group means are located. The question the
t-test addresses is whether those means are statistically different.
What does it mean to say that the averages for two groups are statistically different? Consider the three
situations shown in Figure 2. The first thing to notice about the three situations is that the difference
between the means is the same in all three. But you should also notice that the three situations don't look
the same; they tell very different stories. The top example shows a case with moderate variability of scores
within each group. The second situation shows the high-variability case. The third shows the case with low
variability. Clearly, we would conclude that the two groups appear most different or distinct in the bottom, or
low-variability, case. Why? Because there is relatively little overlap between the two bell-shaped curves. In
the high-variability case, the group difference appears least striking because the two bell-shaped
distributions overlap so much.
This leads us to a very important conclusion: when we are looking at the differences between scores for two
groups, we have to judge the difference between their means relative to the spread or variability of their
scores.
The formula for the t-test is a ratio. The top part of the ratio is just the difference between the two means or
averages. The bottom part is a measure of the variability or dispersion of the scores. This formula is
essentially another example of the signal-to-noise metaphor in research: the difference between the means
is the signal that, in this case, we think our program or treatment introduced into the data; the bottom part of
the formula is a measure of variability that is essentially noise that may make it harder to see the group
difference. Figure 3 shows the formula for the t-test and how the numerator and denominator are related to
the distributions.
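Figure 3 is not reproduced here, but the ratio described above can be reconstructed as follows, writing T and C for the treatment and control groups:

```latex
t = \frac{\bar{x}_T - \bar{x}_C}{\mathrm{SE}(\bar{x}_T - \bar{x}_C)}
```

The numerator is the signal (the difference between the group means) and the denominator is the noise (the standard error of that difference).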
The top part of the formula is easy to compute -- just find the difference between the means. The bottom
part is called the standard error of the difference. To compute it, we take the variance for each group and
divide it by the number of people in that group. We add these two values and then take their square root.
Figure 4. Formula for the standard error of the difference between the means.
Remember that the variance is simply the square of the standard deviation.
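The steps just described (difference of means over the standard error of the difference) can be sketched in a few lines. The group scores below are hypothetical, and the function name is an illustration:

```python
import math
import statistics

def independent_t(group1, group2):
    """t for independent means: difference of the two means divided by
    the standard error of the difference. The SE is built exactly as
    described above: each group's variance divided by its group size,
    summed, then square-rooted."""
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    se = math.sqrt(v1 / len(group1) + v2 / len(group2))
    return (m1 - m2) / se

# Hypothetical scores for a treated and a control group
treated = [22, 25, 27, 24, 26, 28]
control = [20, 21, 23, 19, 22, 20]
t = independent_t(treated, control)  # positive: the treated mean is larger
```

`statistics.variance` uses n - 1 in the denominator, i.e., the sample variance, which is the square of the sample standard deviation.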
The t-value will be positive if the first mean is larger than the second and negative if it is smaller. Once you
compute the t-value you have to look it up in a table of significance to test whether the ratio is large enough
to say that the difference between the groups is not likely to have been a chance finding. To test the
significance, you need to set a risk level (called the alpha level). In most social research, the "rule of thumb"
is to set the alpha level at .05. This means that five times out of a hundred you would find a statistically
significant difference between the means even if there was none (i.e., by "chance"). You also need to
determine the degrees of freedom (df) for the test. In the t-test, the degrees of freedom is the sum of the
persons in both groups minus 2. Given the alpha level, the df, and the t-value, you can look the t-value up in
a standard table of significance (available as an appendix in the back of most statistics texts) to determine
whether the t-value is large enough to be significant. If it is, you can conclude that the difference between
the means for the two groups is statistically significant (even given the variability). Fortunately, statistical
computer programs routinely print the significance test results and save you the trouble of looking them up
in a table.
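The lookup procedure above can be sketched directly. The observed t below is hypothetical; 2.228 is the standard two-tailed critical value for alpha = .05 at df = 10, as found in the t table of most statistics texts:

```python
# Degrees of freedom for the independent-samples t-test:
# the number of persons in both groups, minus 2.
n1, n2 = 6, 6
df = n1 + n2 - 2          # 10

# Two-tailed critical value for alpha = .05 at df = 10,
# taken from a standard table of significance.
t_critical = 2.228
t_observed = 4.22          # hypothetical t computed from the two groups

# Significant if the t-value is farther from zero than the critical value
significant = abs(t_observed) > t_critical
```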
Let’s suppose that at the end of the experiment the experimental group gets a
mean of 16.61 on a fear of HIV scale and the control group gets a mean of
29.67 (where the higher the score, the greater the fear of HIV). These
means support our research hypothesis. But can we be certain that our research
hypothesis is correct? If you’ve been reading various topics on statistics, you
already know that the answer is "no" because of the Null Hypothesis, which
says that there is no true difference between the means; that is, the difference
was created merely by the chance errors created by random sampling (these
errors are known as sampling errors). Put more simply,
unrepresentative groups may have been assigned to the two
conditions purely by chance.
The t test is often used to test the null hypothesis regarding the observed
difference between two means (to test the null hypothesis between
two medians, the median test is used; it is a specialized form of chi square test).
For the example we are considering, a series of computations (which are
beyond the scope of this paper) would be performed to obtain a value
of t (which, in this case, is 5.38) and a value of degrees of freedom (which, in
this case, is df = 179). These values are not of any special interest to us except
that they are used to get the probability (p) of obtaining a difference this
large if the null hypothesis is true. In this particular case, p is less than .05.
Thus, in a research report, you may read a statement such as this:

The difference between the means is statistically significant
(t = 5.38, df = 179, p < .05).
The term statistically significant indicates that the null hypothesis has been
rejected. You will recall that when p is .05 or less (such as .01 or .001), we
reject the null hypothesis. When the observed difference would be unlikely
to occur if the null hypothesis were true, we reject the null hypothesis.
Having rejected the null hypothesis, we are in a position to assert that our
research hypothesis probably is true (assuming no procedural bias was allowed
to affect the results, such as testing the control group immediately after a major
news story on a celebrity person with AIDS, while testing the experimental
group at an earlier time).
Whether the null hypothesis is rejected depends on several factors:
1. Sample size. The larger the sample, the less likely it is that an observed
difference is due to sampling errors. Large samples provide more precise
information. Thus, when the sample is large, we are more likely to
reject the null hypothesis than when the sample is small.
2. The size of the difference between means. The larger the difference, the
less likely that the difference is due to sampling errors. Thus, when the
difference between the means is large, we are more likely to reject
the null hypothesis than when the difference is small.
3. The amount of variation in the population. When a population is very
heterogeneous (has much variability), there is more potential for
sampling error. Thus, when there is little variation (as indicated by
the standard deviations of the samples), we are more likely to reject
the null hypothesis than when there is much variation.
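A small sketch, using hypothetical numbers and the equal-groups simplification of the standard-error formula from earlier, shows how each of the three factors above pushes the t-value up:

```python
import math

def t_value(mean_diff, sd, n):
    """t for two equal-sized groups (n each) with equal standard
    deviations: mean difference over sqrt(sd^2/n + sd^2/n)."""
    return mean_diff / math.sqrt(2 * sd**2 / n)

baseline       = t_value(mean_diff=5,  sd=10, n=30)
bigger_sample  = t_value(mean_diff=5,  sd=10, n=120)  # factor 1: larger n
bigger_diff    = t_value(mean_diff=10, sd=10, n=30)   # factor 2: larger difference
less_variation = t_value(mean_diff=5,  sd=5,  n=30)   # factor 3: less variability
```

Each of the three variations yields a larger t than the baseline, making rejection of the null hypothesis more likely.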