Multilevel Modelling
William Browne
University of Nottingham
(With thanks to
Mousa Golalizadeh and Lynda Leese)
Summary
• Introduction to sample size calculations.
• A simulation-based approach.
• PINT for balanced 2 level models.
• Effect of balance.
• Other approaches.
• Cross classified models.
• Future plans.
Background
• Many quantitative social science research questions are
of the form of a hypothesis – A has a significant effect on
B.
• To answer such a question, data are collected that allow
the researcher to (hopefully) test whether A has a
statistically significant effect on B. (In fact we aim to reject
the hypothesis that A does not significantly affect B.)
• A test is performed and either the researcher is happy
that A indeed has a significant effect on B, or is left
wondering why the data collected do not back up the
hypothesis. Is the hypothesis false, or were the data not
sufficient?
• The sufficiency of the data is the motivation for sample
size calculations.
Example
• Suppose I have the research question ‘Are Welshmen
on average taller than 175 cm?’
• I now need to get hold of a random sample of n
Welshmen and measure each of their heights.
• I make some statistical assumption about the distribution
of the heights of Welshmen e.g. that they come from a
Normal distribution.
• I might like to check this assumption by plotting a
histogram of the data.
• I can then form a statistical hypothesis test to test
whether indeed Welshmen are taller than 175 cm.
• I need to decide how big to make n, my sample of
Welshmen.
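Deciding how big to make n can itself be done by simulation: pick a candidate n, repeatedly generate samples under an assumed alternative, and count how often the test rejects. A minimal sketch, where the true mean of 180 cm and standard deviation of 7 cm are illustrative assumptions rather than values from the lecture:

```python
import math
import random
import statistics

def power_by_simulation(n, true_mean=180.0, sd=7.0, null_mean=175.0,
                        n_sims=2000, seed=1):
    """Estimate the power of a one-sided, one-sample test of
    H0: mu = null_mean by simulating n_sims samples of size n
    from an assumed alternative, Normal(true_mean, sd)."""
    rng = random.Random(seed)
    z_crit = 1.645  # one-sided 5% critical value (normal approximation)
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        t = (xbar - null_mean) / (s / math.sqrt(n))
        if t > z_crit:
            rejections += 1
    return rejections / n_sims
```

Increasing n until the estimated power exceeds a target (say 0.8) gives a simulation-based sample size.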
Hypothesis Testing
• Let us assume our null hypothesis is that
the average height of Welshmen (μ) is
175 cm.
• So we test H0: μ = 175 vs HA: μ > 175 (or
alternatively H0: θ = 0 vs HA: θ > 0, where θ = μ − 175).
• In practice we calculate from our sample its
mean (x̄) and variance (s²) and
use these along with n to form a test
statistic which we can compare with the
distribution assumed under H0.
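For this one-sided test the statistic is t = (x̄ − 175) / (s/√n). A sketch of the calculation, where the critical value 1.645 is an illustrative normal-approximation choice (for small n the t distribution's critical value would be used instead):

```python
import math
import statistics

def one_sided_test(sample, null_mean=175.0, crit=1.645):
    """Return the test statistic t = (xbar - mu0) / (s / sqrt(n))
    and whether it exceeds the one-sided critical value."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation
    t = (xbar - null_mean) / (s / math.sqrt(n))
    return t, t > crit
```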
Type I and Type II errors
• No hypothesis test is perfect and there is always the possibility of
errors.

                              Truth
                        H0 True        H0 False
  Decision  Reject H0   Type I error   Correct
            Accept H0   Correct        Type II error
Power for combinations of number of schools (N) and pupils per school (n):

                          Schools (N)
                  20     30     40     50     60
  Pupils (n) 20   0.33   0.45   0.54   0.63   0.69
             30   0.42   0.55   0.67   0.76   0.82
             40   0.49   0.64   0.76   0.83   0.89
             50   0.57   0.69   0.80   0.89   0.93
             60   0.60   0.76   0.85   0.91   0.95
Effect of balance
• Here we look at 3 scenarios: balanced, unbalanced and
severely unbalanced.
• We will consider the variance components model and
construct power curves by evaluating each scenario at
4,8,12,…,100 schools.
• The balanced case for N schools has 10 pupils per
school.
• The equivalent unbalanced case has N/2 schools
containing 5 pupils, N/4 schools containing 10 pupils and
N/4 schools containing 20 pupils.
• The severely unbalanced case has N−1 schools each
containing only 1 pupil and 1 school containing 9N+1 pupils.
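The three designs can be written down directly; each allocates 10N pupils in total, so the scenarios differ only in how pupils are spread across schools. A sketch (assuming N is divisible by 4 in the unbalanced case):

```python
def balanced(N):
    """N schools with 10 pupils each."""
    return [10] * N

def unbalanced(N):
    """N/2 schools of 5 pupils, N/4 of 10 and N/4 of 20 (N divisible by 4)."""
    return [5] * (N // 2) + [10] * (N // 4) + [20] * (N // 4)

def severely_unbalanced(N):
    """N-1 schools of 1 pupil each and one school of 9N+1 pupils."""
    return [1] * (N - 1) + [9 * N + 1]
```

Each function returns the list of school sizes, which is all a simulation needs to generate a dataset under that design.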
Results
• Here we see the power curves for the 3 scenarios. Note the lower power
for the unbalanced design and the strange behaviour under severe imbalance.
Number of zero variances
Extremely unbalanced designs are really estimating the effect of the large
school instead of the global mean and hence the level 2 variance is often
estimated as 0.
Subsampling approach / post-hoc
power calculations
• We have chosen a parametric approach where, given
effect sizes, we simulate datasets prior to any actual
data being collected.
• An alternative post-data-collection non-parametric
approach is to subsample from a large existing dataset
and perform power calculations on these subsamples.
• Such an approach has been investigated by Afshartous
(1995) and Mok (1995).
• The advantage of this approach is that no distributional
assumptions need be made in the dataset generation.
• The disadvantage is that post-data power calculations in
some sense miss the boat in that we really need the
power calculations to guide us in our sampling. However
such calculations may be useful for similar future
studies.
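The subsampling idea can be sketched as below. The interface is hypothetical; for multilevel data one would subsample whole schools rather than individual observations, as in the papers cited above:

```python
import random

def subsample_power(dataset, n, rejects, n_reps=1000, seed=0):
    """Non-parametric post-hoc power estimate: repeatedly draw
    subsamples of size n (without replacement) from an existing
    dataset and record the proportion on which `rejects` -- a
    function applying the hypothesis test and returning True on
    rejection -- rejects H0."""
    rng = random.Random(seed)
    count = sum(rejects(rng.sample(dataset, n)) for _ in range(n_reps))
    return count / n_reps
```

No distributional assumption enters the generation step: the subsamples inherit whatever distribution the real data have.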
Bayesian approach
• A recent more Bayesian approach is described
in Wang and Gelfand (2002).
• Here rather than fix an effect size for each
unknown parameter the user instead can give a
prior distribution (the sampling prior) which is
used in the generation of the simulated datasets.
• They then use MCMC to fit models to their
simulated datasets and evaluate performance
criteria based on the posterior samples.
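The data-generation step of this approach can be sketched as follows; the Normal(180, 2) sampling prior and within-sample standard deviation of 7 are illustrative assumptions, and the MCMC fitting stage is omitted:

```python
import random

def generate_under_sampling_prior(n, seed=2):
    """Wang and Gelfand (2002) style generation: draw the unknown
    effect (here a mean) from a sampling prior, then simulate a
    dataset of size n conditional on that draw."""
    rng = random.Random(seed)
    mu = rng.gauss(180.0, 2.0)  # draw from the sampling prior
    data = [rng.gauss(mu, 7.0) for _ in range(n)]
    return mu, data
```

Repeating this over many prior draws lets the performance criteria average over uncertainty in the effect size rather than conditioning on a single fixed value.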
Cross-classified models
• In our ESRC grant we intend to focus on these
model types for our power calculations as they are
outside the remit of PINT.
y_i = β_0 + β_1 x_i + u^(2)_school(i) + u^(3)_district(i) + e_i
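Simulated datasets for power work under this model can be generated directly from the equation above; the default variance values in this sketch are illustrative:

```python
import random

def simulate_cross_classified(school_of, district_of, x,
                              beta0=0.0, beta1=1.0,
                              sd_school=1.0, sd_district=1.0, sd_e=1.0,
                              seed=3):
    """Simulate y_i = b0 + b1*x_i + u2_school(i) + u3_district(i) + e_i,
    where schools and districts are crossed (non-nested)
    classifications.  school_of[i] and district_of[i] give pupil i's
    school and district; x[i] is the pupil-level covariate."""
    rng = random.Random(seed)
    u2 = {s: rng.gauss(0.0, sd_school) for s in set(school_of)}
    u3 = {d: rng.gauss(0.0, sd_district) for d in set(district_of)}
    return [beta0 + beta1 * x[i] + u2[school_of[i]]
            + u3[district_of[i]] + rng.gauss(0.0, sd_e)
            for i in range(len(x))]
```

Because pupils are classified by both factors and schools are not nested within districts, one random effect is drawn per school and per district, and each pupil receives the sum of the two.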