
How many participants will I need?

Issues related to sample size calculations in Health Research


Max Bulsara
Institute of Health and Rehabilitation Research, University of Notre Dame

Why do we need to calculate a sample size?


Biological data is highly variable
A crucial element in the planning of any research project
Must have adequate numbers to test a given hypothesis
Must be able to generalise results
Differences could be real, random variation, or both

How many subjects do I need?


Always seek advice from a Statistician; this talk is intended to give you an idea of the issues involved
Reasonable knowledge of the likely outcome
Estimates of time and money available

Inference

[Diagram: population, sample population, and random sample, linked by "judgement" and "inference" arrows.]

The inference from a random sample to the population is based on probability.

Where do we start?
A variable is a characteristic (or measurement) that varies from one member of the population to another. The scale on which a characteristic is measured has implications for the way in which information is displayed and summarised. Measurement scales may be quantitative or categorical. Quantitative scales may be discrete (e.g. number of children) or continuous (e.g. blood pressure); results are usually expressed as means. Categorical scales may be nominal (e.g. sex or blood group) or ordinal (e.g. pain, grade of breast cancer); results are often expressed as proportions.

SAMPLE SIZE FOR SIMPLE DESIGN

Quick and Easy using Effect Size:


This is the ratio of the clinically significant difference to its standard deviation.
Example: suppose an intervention program claims to reduce weight in obese people by 2 kg in the first week, and the standard deviation in this group of people is known to be 10 kg. Then the effect size is E = 2/10 = 0.2

Quick and Easy:


A quick and easy way to calculate a sample size is to use the following formula:

n = 16 / E²   (per group)

The 16 here is approximately 2 × 7.9, twice the entry for 80% power at α = 0.05 in the table of commonly used values:

Power (%)   α = 0.05   α = 0.01
   50          3.8        6.6
   80          7.9       11.7
   90         10.5       14.9
   95         13.0       17.8

For our example we get: n=16/0.04 = 400 per group

Crit Care. 2002; 6(4): 335-341.
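The quick rule is the 80%-power, α = 0.05 case of the usual normal-approximation formula for comparing two means, n = 2 × (z(1-α/2) + z(1-β))² / E² per group. A minimal Python sketch (the function name is my own) that reproduces both the rule and the table entries:

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two means:
    n = 2 * (z(1-alpha/2) + z(1-beta))^2 / E^2.
    With alpha = 0.05 and power = 0.80 the numerator is
    2 * 7.849 ~ 15.7, hence the quick rule n = 16 / E^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

print(n_per_group(0.2))              # 393; the quick rule gives 16/0.04 = 400
print(n_per_group(0.2, power=0.90))  # 526, using the 10.5 entry of the table
```

The exact formula gives slightly smaller numbers than the rounded rule (15.7 rather than 16 in the numerator); the quick rule is deliberately conservative.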

Quick and Easy:

[Bar chart: sample size per group, n = 16/E², plotted against effect size]

Effect size:   0.2   0.3   0.4   0.5   0.6   0.7   0.8
Sample size:   400   178   100    64    44    33    25

Quick and Easy:


This method is for continuous outcomes, e.g. weight, blood pressure, pulse rate, height, BMI, etc. You need to decide how much of a difference you would consider to be clinically significant.

Note: this is just an approximate method

Jargon - A typical example


Q: A medical treatment has a 45% cure rate. A new surgical procedure is proposed. Due to surgical morbidity, researchers judge that a cure rate of 70% needs to be achieved in order to justify its use. In order to conduct a clinical trial to compare these two therapies, how many patients do we need?
A: We need 80 patients randomised to each therapy to produce a 90% chance of achieving a statistically significant difference.

Jargon: we have 90% power to detect a 25% difference
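The answer above can be sanity-checked with the standard normal-approximation formula for comparing two proportions. A minimal Python sketch (the function name is my own; the slide's 80 per group presumably includes a continuity-correction or exact-method adjustment, so this plain approximation comes out a little lower):

```python
from math import ceil
from statistics import NormalDist

def n_two_proportions(p1, p2, alpha=0.05, power=0.90):
    """Per-group sample size for detecting a difference between two
    proportions (normal approximation, no continuity correction)."""
    z = NormalDist().inv_cdf
    z_alpha, z_beta = z(1 - alpha / 2), z(power)
    numerator = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(numerator / (p1 - p2) ** 2)

print(n_two_proportions(0.45, 0.70))  # 77, in the same ballpark as the slide's 80
```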

Issues to consider

(J. Peat et al, 2001)

Clinically significant difference
Variability
Resources
Subject availability
Statistical power
Ethics

Outcome of any trial

                                  What is true in the population?
                                  No Effect             Treatment Effect
Conclusion    No Effect           Correct conclusion    Type II error
reached in                        (p = 1 - α)           (p = β)
our study     Treatment Effect    Type I error          Correct conclusion
                                  (p = α)               (p = 1 - β) = Power

Type I and II Errors


Type I: reject the null hypothesis when it is true (the probability of making this error is α)
Type II: accept the null hypothesis when it is false (the probability of making this error is β)
1 - β is the power of a study: rejecting the null hypothesis when in fact it is false

SAMPLE SIZE FOR COMPLEX DESIGN

Cluster-Randomised Trials (CRT)


Assume we are interested in measuring the effect of an intervention or treatment implemented at a cluster or group level. A cluster is a naturally occurring grouping, and clusters are randomised to the intervention or treatment groups.
Examples:
School-based research: children clustered within schools
Health research: patients clustered within ward, general practice, or health practitioner
Occupational: workers clustered within worksites
Slide courtesy of: Thérèse Shaw

CRT: Terminology
Study condition: groups to which clusters are randomly assigned, e.g. intervention vs comparison group
Clusters: groups which are sampled and assigned to conditions, e.g. schools, hospital wards, general practices, worksites, communities, neighbourhoods
Members: units of observation, e.g. schoolchildren, patients, workers

CRT: Effect of Clustering


Traditional statistical methods assume independence of observations. However, members of the same cluster will have things in common, e.g. SES, environment, culture, etc.
Example:
Dietary intake and physical activity in neighbourhoods
Bullying outcomes in schools
Patient characteristics at GP clinics

CRT: Effect of Clustering


Ignoring the homogeneity within clusters leads to an underestimation of the variance of an intervention effect. The p-value for the test of the intervention effect then comes out too small, and thus potentially incorrect conclusions are drawn
i.e. concluding that an intervention has an impact when this is not the case

CRT: Effect of Clustering


The difference in variance is a factor known as the design effect (DEFF) or variance inflation factor (VIF):

DEFF = 1 + (k - 1) × ICC

where k = number of members per cluster and ICC = the intracluster (intraclass) correlation coefficient, i.e. the degree of resemblance between members of the same cluster

CRT: Sample size


We need to inflate the sample size by the design effect, so:
Calculate the sample size required as if conducting a Simple Random Sample (SRS)
Multiply that sample size by the design effect, 1 + (k - 1) × ICC

Example: n = 200 if SRS; with k = 25 and ICC = 0.02, deff ≈ 1.5, so the required n = 300
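The two steps above can be sketched in a few lines of Python (function names are my own; cluster counts are rounded up, which is why 296 subjects become 12 clusters of 25, i.e. 300 in the example):

```python
import math

def design_effect(k, icc):
    """Design effect (variance inflation factor): 1 + (k - 1) * ICC."""
    return 1 + (k - 1) * icc

def crt_sample_size(n_srs, k, icc):
    """Inflate a simple-random-sample size by the design effect.
    Returns (inflated total n, number of clusters of size k needed,
    with the cluster count rounded up)."""
    n = n_srs * design_effect(k, icc)
    return round(n), math.ceil(round(n) / k)

print(design_effect(25, 0.02))         # 1.48
print(crt_sample_size(200, 25, 0.02))  # (296, 12): 12 clusters of 25 gives n = 300
```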

CRT Example: Design effect changes


Members per cluster   ICC = 0.02   ICC = 0.04   ICC = 0.20
         5               1.08         1.16         1.8
        10               1.18         1.36         2.8
        25               1.48         1.96         5.8
        50               1.98         2.96        10.8
       100               2.98         4.96        20.8

CRT Example: Sample size & # of clusters

(SRS n = 200; each cell shows total n / number of clusters)

  k     ICC = 0.02    ICC = 0.04    ICC = 0.20
  5      216 / 44      232 / 47      360 / 72
 10      236 / 24      272 / 28      560 / 56
 25      296 / 12      392 / 16     1160 / 47
 50      396 /  8      592 / 12     2160 / 44
100      596 /  6      992 / 10     4160 / 42

ISSUES RELATED TO SAMPLE SIZE

Sample size too small


What happens?
A Type I or Type II error may occur
Not enough power to show that a clinically important difference is significant
The estimate of effect will be inaccurate
A small difference between groups will not reach statistical significance
The study may be unethical as it cannot fulfil its original aims

Sample size too large


What happens?
A small difference that is not clinically important will be statistically significant
Research resources will be wasted
Inaccurate results due to difficulty in maintaining quality
The response rate may be too low
It is unethical to study more subjects than are needed

Limitations
Sample size calculations make no allowances for the following:
Study subjects who drop out
Subjects who fail to attend or do not comply with the intervention strategy
Having to screen subjects who do not fulfil the eligibility criteria
Subjects with missing data
Variability in measures being larger than expected
Providing power to conduct sub-group analyses (you need to plan these ahead)

Avoid "canned" Effect Sizes


Cohen (1988) defines this as d = δ/σ, with d = 0.2 (small), d = 0.5 (medium) and d = 0.8 (large) effect sizes.
Example: in an industrial experiment, measurements can be made using:
A machine (accurate to a few microns)
A caliper (accurate to a few thousandths of an inch)
A ruler (accurate to a sixteenth of an inch)
No matter which measuring instrument you use, you get the same sample size for a "medium" effect size at 80% power, yet the instrument will have a huge effect on the results.

Retrospective (Post-hoc) Power Calculation


Should we do this? NO! It is an obvious answer to an uninteresting question! If the hypothesis test is significant, the observed power will be high; if the test is not significant, the observed power will be low.
Analogy: if a car makes it to the top of a hill it was powerful enough; if it doesn't, then it wasn't powerful enough (Lenth, 2001)

Retrospective (Post-hoc) Power Calculation


Observed power is a function of the p-value of a test. Do not use pre-experiment numbers to interpret post-experiment results!
Analogy: trying to convince someone that buying a lottery ticket was foolish (a before-experiment perspective) after they have won a jackpot (an after-experiment perspective).
Sample size and power calculations should not be used as a data-analytical tool; they are for planning purposes only!

A typical example: Reporting Statistical Power


Q: A medical treatment has a 45% cure rate. A new surgical procedure is proposed. Due to surgical morbidity, researchers judge that a cure rate of 70% needs to be achieved in order to justify its use. In order to conduct a clinical trial to compare these two therapies, how many patients do we need?
A: We need 80 patients randomised to each therapy to produce a 90% chance of achieving a statistically significant difference.

Reporting Statistical Power


In this example we have 90% power to detect a 25% difference between the 2 treatments with n = 80 per group. Suppose we observed the following cure rates: group 1: 60/80 (75%) and group 2: 49/80 (61%); difference 14%, p = 0.06. There are two ways to report this:
1) a statistically non-significant result (p > 0.05) was observed, OR
2) report the observed difference (14%), the exact p-value (0.06), and the 95% confidence interval of the difference (14% ± 15% = -1% to 29%)
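The second style of report can be reproduced with a standard Wald interval for a difference of two proportions. A minimal Python sketch (the function name is my own; no continuity correction, and the slide's limits reflect rounding the half-width to 15% before adding it to the 14% difference):

```python
from math import sqrt
from statistics import NormalDist

def diff_in_proportions_ci(x1, n1, x2, n2, conf=0.95):
    """Observed difference in two proportions with a Wald confidence
    interval (simple normal approximation, no continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se

d, lo, hi = diff_in_proportions_ci(60, 80, 49, 80)
print(f"difference {d:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

The interval straddles zero, which is exactly why reporting it is more informative than the bare statement "p > 0.05": the data are compatible both with no difference and with a difference as large as about 28%.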

THANK YOU
Coming Soon:
9th Sep 09: An Introduction to Cochrane Systematic Reviews
Dr Bruce Walker, Murdoch University

7th Oct 09: The Power of Words: Applying Qualitative Research successfully in a Health Care Setting
Dr Caroline Bulsara, University of WA
