
Rift Valley University,

Department of Public Health

Teresa Kisi (MPH in Epidemiology and Biostatistics, Assist. Prof.)


5/15/2016 1
terek7@gmail.com
Types of t-test:
– One-sample t-test: used to compare a single mean
to a fixed number or "gold standard".
– Paired t-test: used to compare two means based on
samples that are paired in some way.
– Two-sample t-test: used to compare two population
means based on independent samples from the two
populations or groups.

One-sample t-test:

– The One-Sample T Test procedure tests whether the
mean of a single variable differs from a specified
constant.
– Assumptions: This test assumes that the data are
normally distributed.
– If the 95% confidence interval for the mean difference
does not contain the null value (0), this also indicates
that the test is significant at the 5% level.

One-sample t-test, Example:

– A researcher is planning a psychological intervention
study, but before he proceeds he wants to characterize
his participants' depression levels. He tests each
participant on a particular depression index, on which
a score of 4.0 is deemed to indicate 'normal' levels
of depression.

One-sample t-test, Example cont’d…

– Lower scores indicate less depression and higher
scores indicate greater depression. He has recruited
40 participants to take part in the study. Depression
scores are recorded in the variable dep_score. He
wants to know whether his sample is representative
of the normal population (i.e., whether the mean score
differs statistically significantly from 4.0).
One-Sample Test (Test Value = 4)

                                                               95% Confidence Interval
                                                               of the Difference
            t        df    Sig. (2-tailed)   Mean Difference   Lower      Upper
dep_score   -3.347   39    .002              -.26875           -.4312     -.1063
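The arithmetic behind the One-Sample Test output can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure itself: the function name `one_sample_t` and the list `demo_scores` are hypothetical (the study's raw scores are not given in these slides). The second half only rearranges the reported values to recover the implied standard error and critical t value.

```python
import math
import statistics

def one_sample_t(values, test_value):
    """Return (t, df, mean_difference, standard_error) for a one-sample t-test."""
    n = len(values)
    mean_diff = statistics.mean(values) - test_value
    se = statistics.stdev(values) / math.sqrt(n)  # sample SD uses the n - 1 denominator
    return mean_diff / se, n - 1, mean_diff, se

# Hypothetical scores, for illustration only (NOT the study data).
demo_scores = [3.5, 4.1, 3.8, 3.6, 4.0, 3.7]
t, df, diff, se = one_sample_t(demo_scores, 4.0)
print(f"demo: t = {t:.3f}, df = {df}, mean difference = {diff:.3f}")

# Values reported in the One-Sample Test output for dep_score.
reported_t, reported_diff = -3.347, -0.26875
reported_lower, reported_upper = -0.4312, -0.1063

# t = mean difference / SE, so SE = mean difference / t.
implied_se = reported_diff / reported_t
# The CI is mean difference +/- (critical t) * SE, so the half-width gives the critical t.
implied_critical = (reported_upper - reported_lower) / 2 / implied_se
print(f"implied SE = {implied_se:.4f}, implied critical t = {implied_critical:.2f}")
```

The implied critical value of about 2.02 is what one would expect for a 95% interval with 39 degrees of freedom, so the reported t, mean difference and confidence limits are mutually consistent.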
Paired t-test:

– One of the most common experimental designs is
the "pre-post" design.
– A study of this type often consists of two
measurements taken on the same subject, one
before and one after the introduction of a treatment
or a stimulus.
– The basic idea is simple. If the treatment had no
effect, the average difference between the
measurements is equal to 0 and the null hypothesis
holds.
Paired t-test cont’d…

– On the other hand, if the treatment did have an
effect (intended or unintended!), the average
difference is not 0 and the null hypothesis is
rejected.
– The Paired-Samples T Test procedure is used to test
the hypothesis of no difference between the means of
two paired variables.

Paired t-test cont’d…

Example (dietstudy)
 A physician is evaluating a new diet for her patients
with a family history of heart disease. To test the
effectiveness of this diet, 16 patients are placed on
the diet for 6 months.
 Their weights and triglyceride levels are measured
before and after the study, and the physician wants
to know if either set of measurements has changed.

Paired t-test cont’d…

– Use the Paired-Samples T Test to determine whether
there is a statistically significant difference between
the pre- and post-diet weights and triglyceride levels
of these patients.

Paired t-test cont’d…

– Select Triglyceride (tg0) and Final Triglyceride (tg4)
as the first set of paired variables.
– Select Weight and Final Weight as the second pair.
Paired Samples Test

                                             Paired Differences
                                                                                        95% Confidence Interval
                                                                                        of the Difference
                                             Mean     Std. Deviation  Std. Error Mean   Lower     Upper     t        df   Sig. (2-tailed)
Pair 1  Triglyceride - Final triglyceride    14.063   46.875          11.719            -10.915   39.040    1.200    15   .249
Pair 2  Weight - Final weight                 8.063    2.886            .722              6.525    9.600   11.175    15   .000

Since the significance value for change in weight is less than
0.05, you can conclude that the average loss of 8.06 pounds per
patient is not due to chance variation, and can be attributed to
the diet.
Paired t-test cont’d…

– However, the significance value greater than 0.10 for
change in triglyceride level shows that the diet did not
significantly reduce their triglyceride levels.
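The t values in the Paired Samples Test output follow directly from the mean and standard deviation of the paired differences. The sketch below recomputes them from the reported summary statistics (n = 16 patients); the helper name `paired_t_from_summary` is an assumption made for illustration.

```python
import math

def paired_t_from_summary(mean_diff, sd_diff, n):
    """Return (t, df, standard_error) for a paired t-test from summary statistics."""
    se = sd_diff / math.sqrt(n)       # standard error of the mean difference
    return mean_diff / se, n - 1, se  # t = mean difference / SE, df = n - 1

# Mean and SD of the paired differences as reported in the output above.
pairs = {
    "Triglyceride - Final triglyceride": (14.063, 46.875),
    "Weight - Final weight": (8.063, 2.886),
}

results = {}
for label, (mean_diff, sd_diff) in pairs.items():
    results[label] = paired_t_from_summary(mean_diff, sd_diff, 16)
    t, df, se = results[label]
    print(f"{label}: SE = {se:.3f}, t = {t:.3f}, df = {df}")
```

Both recomputed t values agree with the table (1.200 and 11.175), which shows that the paired test is simply a one-sample t-test applied to the within-patient differences.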

Two-sample t-test:

First, there are a number of prerequisites that need
to be met:
– Data for both groups must be metric.
– The distribution of the relevant variable in each
population must be reasonably Normal.
– The population standard deviations of the two
variables concerned should be approximately the
same, but this requirement becomes less
important as sample sizes get larger. You can
check this by examining the two sample standard
deviations.
Two-sample t-test: cont’d…

 Example:
 Suppose you want to compare (by estimating the
difference between them), the population mean
birth weights of infants born in a maternity unit
with that of infants born at home (sample data is
given as two sample test). The two samples were
selected independently with no attempt at
matching.

Two-sample t-test: cont’d…

– It is important to remember that a difference
between two sample values does not necessarily
mean that there is a difference in the two
population values. Any difference in these sample
birth weight means might simply be due to chance.

A low significance value for the t test (typically less than 0.05)
indicates that there is a significant difference between the two
group means.
Two-sample t-test: cont’d…

Independent Samples Test (Hospiatlbirth)

                               Levene's Test for
                               Equality of Variances   t-test for Equality of Means
                                                                                                                        95% Confidence Interval
                                                                                                                        of the Difference
                               F       Sig.            t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower       Upper
Equal variances assumed        .039    .845            -.833   58       .409              -81.96667         98.45150                -279.039    115.10543
Equal variances not assumed                            -.833   57.967   .409              -81.96667         98.45150                -279.041    115.10779

– If you want to know whether there is a statistically
significant difference between two population
means, calculate the 95 percent confidence interval
for the difference and see if it contains zero.

Two-sample t-test: cont’d…

Since this confidence interval includes zero, you can
conclude that there is no statistically significant
difference in population mean birth weights of infants
born in a maternity unit and infants born at home.

If the significance value for the Levene test is high (typically
greater than 0.05), use the results that assume equal
variances for both groups.

If the significance value for the Levene test is low,
use the results that do not assume equal variances for
both groups.
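The two rows of the Independent Samples Test output correspond to two ways of computing the standard error and degrees of freedom. The following is a minimal sketch of both, assuming the usual pooled-variance formula for the "equal variances assumed" row and the Welch-Satterthwaite approximation for the "not assumed" row. The function names and the small birth-weight lists are hypothetical; the real sample data are not reproduced in these slides.

```python
import math
import statistics

def two_sample_t(x, y, equal_var=True):
    """Return (t, df) for an independent two-sample t-test."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    diff = statistics.mean(x) - statistics.mean(y)
    if equal_var:
        # Pooled variance: weighted average of the two sample variances.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        se = math.sqrt(v1 / n1 + v2 / n2)
        # Welch-Satterthwaite approximation to the degrees of freedom.
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return diff / se, df

def choose_row(levene_sig, alpha=0.05):
    """Pick which row of the output to read, following the rule on this slide."""
    return "equal variances assumed" if levene_sig > alpha else "equal variances not assumed"

# Hypothetical birth weights in grams, for illustration only (NOT the study data).
hospital = [3100, 3250, 2900, 3400, 3050]
home = [3300, 3150, 3500, 3000, 3450]
print(two_sample_t(hospital, home, equal_var=True))
print(two_sample_t(hospital, home, equal_var=False))

# Reported output: Levene Sig. = .845, mean difference and its standard error.
print(choose_row(0.845))
reported_t = -81.96667 / 98.45150
print(f"t from reported difference and SE = {reported_t:.3f}")
```

With the reported Levene significance of .845 the rule selects the first row, and dividing the reported mean difference by its standard error reproduces the tabulated t of -.833.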
– A t-distribution can be used for testing
hypotheses about differences of means for
independent samples if both populations are
normal and have the same variances.
– However, the usual two-sample t-test cannot be
applied when more complex sets of data
comprising more than two groups are considered.
In this regard, one-way analysis of variance
(ANOVA) is used to compare the means of several
groups.

 Comparison of several means - Analysis of variance
– It is used when there is a single way of classifying
individuals, that is, when the subgroups to be
compared are defined by just one factor.
E.g., say you are interested in comparing the blood
pressure level of three groups of patients who take
three different treatments. There is only one grouping
(type of treatment administered) that you are using
to define the groups.
– When there are two factors classifying the
observations we need two-way analysis of variance,
and so on.
Dependent variable ===> blood pressure (outcome)
Independent variable ===> treatment (factor)

Hypothesis of ANOVA:
– Test the null hypothesis that the means of two or
more groups are equal.
One-Way ANOVA also offers:
– Group-level statistics for the dependent variable
– A test of variance equality
– A plot of group means
– Pairwise multiple comparisons, which describe the
nature of the group differences
– One-way analysis of variance is based on assessing
how much of the overall variation in the data is
attributable to differences between the group
means, and comparing this with the amount
attributable to differences between individuals in
the same group.
– The calculations for one way ANOVA are
expressed in relation to the sum of the
observations in each sample.

– Suppose we have k samples of observations, with
n_i observations in the ith sample. Writing y_{ij} for
the jth observation in the ith group, we calculate:
– \bar{y}_i = mean of the observations in the ith group
– T = sum of all observations = \sum_{i=1}^{k} n_i \bar{y}_i = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}
– S = sum of squares of all observations = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}^2
– N = total number of observations = \sum_{i=1}^{k} n_i

 One way ANOVA partitions the total sum of squares
(SST) into two distinct components.
– The sum of squares due to differences between
the group means (SSB).
– The sum of squares due to differences among the
observations within each group (SSW). This is also
called the residual or unexplained sum of squares.
SST = SSB + SSW

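– SSB = total sum of squared deviations of the group
means about the grand mean, weighted by group size:
SSB = \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2
– SSW = total sum of squared deviations of each
observation about its group mean:
SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2
Here y_{ij} is the jth observation in group i, \bar{y}_i is
the mean of group i, n_i is its size and \bar{y} is the
grand mean of all N observations.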

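SST = total sum of squared deviations of each
observation about the grand mean:

SST = SSB + SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2

where y_{ij} is the jth observation in group i and
\bar{y} is the grand mean.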

 The sum of squares for one way ANOVA are given as
follows:
Source of variation            Sum of squares

Between groups (Explained)     SSB = \sum_{i=1}^{k} n_i \bar{y}_i^2 - T^2/N
Within groups (Unexplained)    SSW = S - \sum_{i=1}^{k} n_i \bar{y}_i^2
Total                          SST = S - T^2/N  (= SSB + SSW)

Here \bar{y}_i is the mean of group i, n_i is its size, T is the
sum of all observations, S is the sum of their squares and N is
the total number of observations.
– The significance test for differences between the
groups is based on a comparison of the between-groups
and within-groups mean squares.
– If the observed differences between the means of
the groups are simply due to chance variation, the
variation between these group means will be
about the same as the variation between individuals
within the same group.

 If there are real differences, the between groups
variation will be larger. The mean squares are
compared using the F-test. This test is sometimes
known as the variance-ratio test.

F = \frac{\text{Between-groups mean square}}{\text{Within-groups mean square}}

Df between-groups = k-1
Df within-groups = N-k
where:
N is the total number of observations and
k is the number of groups.
One way ANOVA table looks like the following:

Source of variation   DF     SS     Mean square    F                           P
Between groups        k-1    SSB    SSB/(k-1)      [SSB/(k-1)] / [SSW/(N-k)]
Within groups         N-k    SSW    SSW/(N-k)
Total                 N-1    SST
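As a small illustration of how the table is filled in, the sketch below turns two sums of squares into degrees of freedom, mean squares and the F ratio. The function name `anova_table` is an assumption; the input numbers are the sums of squares reported for Example 1 (red cell folate, k = 3 groups, N = 22 patients) later in these slides. The P value is not computed here because it requires the F distribution, which the Python standard library does not provide.

```python
def anova_table(ssb, ssw, k, n_total):
    """Build the one-way ANOVA table entries from the two sums of squares."""
    df_between = k - 1
    df_within = n_total - k
    ms_between = ssb / df_between
    ms_within = ssw / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n_total - 1,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "sst": ssb + ssw,
        "F": ms_between / ms_within,
    }

# Sums of squares reported for Example 1 (k = 3 groups, N = 22 observations).
table = anova_table(ssb=15515.766, ssw=39716.097, k=3, n_total=22)
for name, value in table.items():
    print(f"{name:>10}: {value:.3f}")
```

The resulting mean squares (7757.883 and about 2090.32) and F of about 3.711 match the SPSS ANOVA table for that example.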

 Assumptions
– The data are normally distributed, that is, the
samples come from Normally distributed
populations.
– The population value for the standard deviation
between individuals is the same for each group
(equal variance).
– Moderate departures from Normality and moderately
unequal standard deviations may be safely ignored.
Otherwise, transforming the data or using an
assumption-free method may be useful.
 Example 1
Twenty-two patients undergoing cardiac bypass surgery were
randomized to one of three ventilation groups:
Group I: Patients received a 50% nitrous oxide and
50% oxygen mixture continuously for 24 hours;
Group II: Patients received a 50% nitrous oxide and 50%
oxygen mixture only during the operation;
Group III: Patients received no nitrous oxide but received
35-50% oxygen for 24 hours.
– The table below shows red cell folate levels for the
three groups after 24 hours' ventilation. We wish to
compare the three groups, and test the null
hypothesis that the three groups have the same mean
red cell folate level.
– Examination of the data does not reveal any obvious
outliers and the data in each group look like plausible
samples from a Normal distribution. The standard
deviation in group I is rather higher than those in the
other groups, but moderate variability is not a
problem.
– Levene's test is useful for assessing the null
hypothesis that two or more samples come from
populations with the same variance. Some computer
programs incorporate this test (e.g., SPSS).
Bartlett's test is used in Stata.

Test of Homogeneity of Variances
Red cell folate levels (µg/l)

Levene Statistic   df1   df2   Sig.
3.823              2     19    .040

The significance value is less than 0.05, suggesting that the variances for
the three groups are not equal and the equal-variance assumption is not met.
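One common form of Levene's statistic is a one-way ANOVA F ratio computed on the absolute deviations of each observation from its own group mean. The sketch below assumes that mean-centred form (other variants centre on the median) and applies it to the red cell folate data of Example 1; the function names are hypothetical. The significance value is not computed, since it needs the F distribution.

```python
import statistics

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ssb = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((y - statistics.mean(g)) ** 2 for g in groups for y in g)
    return (ssb / (k - 1)) / (ssw / (n_total - k)), k - 1, n_total - k

def levene_statistic(groups):
    """Levene's statistic (mean-centred): ANOVA F on |y - group mean|."""
    deviations = [[abs(y - statistics.mean(g)) for y in g] for g in groups]
    return one_way_f(deviations)

# Red cell folate levels (µg/l) from Example 1.
folate = [
    [243, 251, 275, 291, 347, 354, 380, 392],       # Group I
    [206, 210, 226, 249, 255, 273, 285, 295, 309],  # Group II
    [241, 258, 270, 293, 328],                      # Group III
]
w, df1, df2 = levene_statistic(folate)
print(f"Levene statistic = {w:.3f}, df1 = {df1}, df2 = {df2}")
```

With this definition the statistic comes out at about 3.823 on (2, 19) degrees of freedom, in agreement with the Test of Homogeneity of Variances output.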
Example 1: Red cell folate levels (μg/l) in three groups
of cardiac bypass patients given different levels of
nitrous oxide ventilation (Amess et al., 1978)

        Group I   Group II   Group III
        (n=8)     (n=9)      (n=5)
        243       206        241
        251       210        258
        275       226        270
        291       249        293
        347       255        328
        354       273
        380       285
        392       295
                  309

Mean    316.6     256.4      278.0
SD       58.7      37.1       33.8
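The group means and standard deviations can be reproduced directly from the listed values. This is a minimal sketch using Python's standard `statistics` module; the dictionary name `folate` is just a label chosen for illustration.

```python
import statistics

# Red cell folate levels (µg/l) for the three ventilation groups.
folate = {
    "Group I": [243, 251, 275, 291, 347, 354, 380, 392],
    "Group II": [206, 210, 226, 249, 255, 273, 285, 295, 309],
    "Group III": [241, 258, 270, 293, 328],
}

# n, mean and sample standard deviation (n - 1 denominator) for each group.
summary = {
    name: (len(values), statistics.mean(values), statistics.stdev(values))
    for name, values in folate.items()
}

for name, (n, mean, sd) in summary.items():
    print(f"{name}: n = {n}, mean = {mean:.1f}, SD = {sd:.1f}")
```

Note that `statistics.stdev` uses the n - 1 denominator, which is the sample standard deviation reported on this slide.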
 Hypotheses
– Ho : μ1 = μ2 = μ3, i.e., the population means of the
three groups are equal.
– HA : Differences exist between at least some of the
means/groups.

ANOVA table
Red cell folate levels (µg/l)

                                          Sum of Squares   df   Mean Square   F       Sig.
Between Groups (explained by the model)   15515.766         2   7757.883      3.711   .044
Within Groups (unexplained variation)     39716.097        19   2090.321
Total                                     55231.864        21

Since the P value is less than 0.05, the null hypothesis
is rejected. This is a global test: it does not show which
groups differ.
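The sums of squares in the ANOVA table can be recomputed from the raw folate values with the shortcut formulas based on T (sum of all observations), S (sum of squared observations) and N. The sketch below is one way to do this with plain Python; the variable names are chosen for illustration and the P value is omitted because it requires the F distribution.

```python
# Red cell folate levels (µg/l) from Example 1.
groups = [
    [243, 251, 275, 291, 347, 354, 380, 392],       # Group I
    [206, 210, 226, 249, 255, 273, 285, 295, 309],  # Group II
    [241, 258, 270, 293, 328],                      # Group III
]

k = len(groups)
N = sum(len(g) for g in groups)           # total number of observations
T = sum(sum(g) for g in groups)           # sum of all observations
S = sum(y * y for g in groups for y in g) # sum of squares of all observations

# sum over groups of n_i * (group mean)^2, written as (group total)^2 / n_i
group_term = sum(sum(g) ** 2 / len(g) for g in groups)

ssb = group_term - T ** 2 / N   # between groups (explained)
ssw = S - group_term            # within groups (unexplained)
sst = S - T ** 2 / N            # total, equal to ssb + ssw

f_ratio = (ssb / (k - 1)) / (ssw / (N - k))
print(f"SSB = {ssb:.3f}, SSW = {ssw:.3f}, SST = {sst:.3f}, F = {f_ratio:.3f}")
```

These values agree with the SPSS output (15515.766, 39716.097, 55231.864 and F = 3.711).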

 Pair-wise comparisons of group means
– One way ANOVA is an extension of the two
sample t test. When there are only two groups,
the F value will be the square of the
corresponding t value with (1, N-2) degrees of
freedom. Remember the degrees of freedom for
the two sample t test is N-2.
– With two groups the interpretation of a significant
difference is reasonably straightforward, but how
do we interpret significant variation among the
means of three or more groups?
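The identity F = t² for two groups can be checked numerically. The sketch below uses only Groups I and II of the red cell folate data (so N = 17 and the t test has 15 degrees of freedom); restricting to these two groups is purely for illustration and is not an analysis performed in these slides. The function names are hypothetical.

```python
import math
import statistics

group_1 = [243, 251, 275, 291, 347, 354, 380, 392]
group_2 = [206, 210, 226, 249, 255, 273, 285, 295, 309]

def pooled_t(x, y):
    """Two-sample t statistic with a pooled (equal-variance) standard error."""
    n1, n2 = len(x), len(y)
    pooled_var = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se

def one_way_f(groups):
    """One-way ANOVA F ratio (between-groups MS over within-groups MS)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ssb = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((y - statistics.mean(g)) ** 2 for g in groups for y in g)
    return (ssb / (k - 1)) / (ssw / (n_total - k))

t = pooled_t(group_1, group_2)
f = one_way_f([group_1, group_2])
print(f"t = {t:.4f}, t squared = {t ** 2:.4f}, F = {f:.4f}")
```

The equality holds only for the pooled (equal-variance) t test, since one-way ANOVA also pools the within-group variance.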
– Further analysis is required to find out how the
means differ, for example, whether one group
differs from all the others.
– It should be noted that pair-wise comparisons are
carried out only when the overall comparison of
groups in the analysis of variance is significant.
This is called post hoc multiple comparison.
– With k groups, there are ½k(k-1) possible pair-wise
comparisons of group means.

If we perform m pair-wise comparisons (m = ½k(k-1) when
all pairs of k groups are compared), then we should
multiply the P value obtained from each test by m;
that is, we calculate P' = mP with the restriction that
P' cannot exceed 1. This is the Bonferroni correction.
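The adjustment described here is easy to express in code. The sketch below multiplies each P value by the number of comparisons performed and caps the result at 1; the function name `bonferroni` and the unadjusted P values are hypothetical and serve only to show the mechanics.

```python
from itertools import combinations

def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]

# All pair-wise comparisons among three groups: 3 * (3 - 1) / 2 = 3 pairs.
group_names = ["I", "II", "III"]
pairs = list(combinations(group_names, 2))

# Hypothetical unadjusted P values, one per pair (illustration only).
raw_p = [0.01, 0.2, 0.6]
adjusted_p = bonferroni(raw_p)

for (a, b), p, p_adj in zip(pairs, raw_p, adjusted_p):
    print(f"Group {a} vs Group {b}: P = {p:.3f}, adjusted P' = {p_adj:.3f}")
```

The cap at 1 matters for the last pair: 3 × 0.6 would be 1.8, which is not a valid probability, so the adjusted value is reported as 1.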

The Post Hoc tests are divided into two sets:
– The first set assumes groups with equal variances.
– The second set does not assume that the variances
are equal.
Do post hoc test for example 1 above using:
– Bonferroni method
– Scheffe method

Post hoc test result (Bonferroni)
Multiple Comparisons
Dependent Variable: Red cell folate levels (µg/l)

                                   Mean Difference                           95% Confidence Interval
             (I) group  (J) group  (I-J)            Std. Error   Sig.        Lower Bound   Upper Bound
Bonferroni   1.00       2.00        60.18056*       22.21594      .042          1.8614      118.4998
                        3.00        38.62500        26.06443      .464        -29.7969      107.0469
             2.00       1.00       -60.18056*       22.21594      .042       -118.4998       -1.8614
                        3.00       -21.55556        25.50141     1.000        -88.4995       45.3884
             3.00       1.00       -38.62500        26.06443      .464       -107.0469       29.7969
                        2.00        21.55556        25.50141     1.000        -45.3884       88.4995
*. The mean difference is significant at the .05 level.

A. Groups I and II
– P-value = 0.042
– 95% CI = (1.86, 118.50)
B. Groups I and III
– P-value = 0.464
– 95% CI = (-29.80, 107.05)
C. Groups II and III
– P-value = 1.00
– 95% CI = (-88.50, 45.39)
Therefore, the main explanation for the difference between
the groups that was identified in the analysis of variance is
the difference between groups I and II.
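The Mean Difference and Std. Error columns of the Multiple Comparisons output are consistent with the usual pooled formula, SE = sqrt(MSW (1/n_i + 1/n_j)), where MSW is the within-groups mean square from the ANOVA table. The sketch below is a check under that assumption, not a description of how SPSS computes its output; the adjusted P values and confidence limits are not reproduced because they need the t distribution.

```python
import math
import statistics
from itertools import combinations

# Red cell folate levels (µg/l) from Example 1.
folate = {
    "I": [243, 251, 275, 291, 347, 354, 380, 392],
    "II": [206, 210, 226, 249, 255, 273, 285, 295, 309],
    "III": [241, 258, 270, 293, 328],
}
ms_within = 2090.321  # within-groups mean square reported in the ANOVA table

comparisons = {}
for a, b in combinations(folate, 2):
    diff = statistics.mean(folate[a]) - statistics.mean(folate[b])
    # Pooled standard error of the difference between two group means.
    se = math.sqrt(ms_within * (1 / len(folate[a]) + 1 / len(folate[b])))
    comparisons[(a, b)] = (diff, se)
    print(f"Group {a} - Group {b}: difference = {diff:.5f}, SE = {se:.5f}")
```

All three pairs match the table (for example 60.18056 with standard error 22.21594 for Groups I and II), and only that pair has a confidence interval excluding zero.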
Post hoc test result (Scheffe)
Multiple Comparisons
Dependent Variable: Red cell folate levels (µg/l)

                                   Mean Difference                           95% Confidence Interval
             (I) group  (J) group  (I-J)            Std. Error   Sig.        Lower Bound   Upper Bound
Scheffe      1.00       2.00        60.18056*       22.21594      .045          1.2192      119.1420
                        3.00        38.62500        26.06443      .354        -30.5503      107.8003
             2.00       1.00       -60.18056*       22.21594      .045       -119.1420       -1.2192
                        3.00       -21.55556        25.50141      .704        -89.2366       46.1255
             3.00       1.00       -38.62500        26.06443      .354       -107.8003       30.5503
                        2.00        21.55556        25.50141      .704        -46.1255       89.2366
*. The mean difference is significant at the .05 level.
