
How many participants will I need?

Issues related to sample size calculations in Health Research


Max Bulsara
Institute of Health and Rehabilitation Research, University of Notre Dame

Why do we need to calculate a sample size?


Biological data is highly variable
A crucial element in the planning of any research project
Must have adequate numbers to test a given hypothesis
Must be able to generalise results
Differences could be real, random variation, or both

How many subjects do I need?


Always seek advice from a Statistician; this talk is intended to give you an idea of the issues involved
Reasonable knowledge of the likely outcome
Estimates of time and money available

Inference

[Diagram: population, sample population, and random sample, linked by "judgement" and "inference" arrows.]

The inference from a random sample to the population is based on probability.

Where do we start?
A variable is a characteristic (or measurement) that varies from one member of the population to another. The scale on which a characteristic is measured has implications for the way in which information is displayed and summarised. Measurement scales may be quantitative or categorical. Quantitative scales may be discrete (e.g. number of children) or continuous (e.g. blood pressure); results are usually expressed as means. Categorical scales may be nominal (e.g. sex or blood group) or ordinal (e.g. pain, grade of breast cancer); results are often expressed as proportions.

SAMPLE SIZE FOR SIMPLE DESIGN

Quick and Easy using Effect Size:


This is the ratio of the clinically significant difference to its standard deviation.
Example: suppose an intervention program claims to reduce weight in obese people by 2 kg in the first week, and the standard deviation in this group of people is known to be 10 kg. Then the effect size is E = 2/10 = 0.2

Quick and Easy:


A quick and easy way to calculate a sample size is to use the following formula:

n = 16 / E²   (per group)

The 16 here is approximately 2 × 7.9, twice the entry for 80% power at α = 0.05 in the table of commonly used values:

Power (%)   α = 0.05   α = 0.01
   50          3.8        6.6
   80          7.9       11.7
   90         10.5       14.9
   95         13.0       17.8

For our example we get: n=16/0.04 = 400 per group

Crit Care. 2002; 6(4): 335-341.
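The quick rule is the 80%-power, α = 0.05 case of the usual normal-approximation formula for comparing two means, n = 2 × (z(1-α/2) + z(1-β))² / E² per group. A minimal Python sketch (the function name is my own) that reproduces both the rule and the table entries:

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two means:
    n = 2 * (z(1-alpha/2) + z(1-beta))^2 / E^2.
    With alpha = 0.05 and power = 0.80 the numerator is
    2 * 7.849 ~ 15.7, hence the quick rule n = 16 / E^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

print(n_per_group(0.2))              # 393; the quick rule gives 16/0.04 = 400
print(n_per_group(0.2, power=0.90))  # 526, using the 10.5 entry of the table
```

The exact formula gives slightly smaller numbers than the rounded rule (15.7 rather than 16 in the numerator); the quick rule is deliberately conservative.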

Quick and Easy:

[Bar chart: sample size per group, n = 16/E², plotted against effect size]

Effect size:   0.2   0.3   0.4   0.5   0.6   0.7   0.8
Sample size:   400   178   100    64    44    33    25

Quick and Easy:


This method is for continuous outcomes, e.g. weight, blood pressure, pulse rate, height, BMI, etc. You need to decide how much of a difference you would consider to be clinically significant.

Note: this is just an approximate method

Jargon - A typical example


Q: A medical treatment has a 45% cure rate. A new surgical procedure is proposed. Due to surgical morbidity, researchers judge that a cure rate of 70% needs to be achieved in order to justify its use. In order to conduct a clinical trial to compare these two therapies, how many patients do we need?
A: We need 80 patients randomised to each therapy to produce a 90% chance of achieving a statistically significant difference.

Jargon: we have 90% power to detect a 25% difference
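The answer above can be sanity-checked with the standard normal-approximation formula for comparing two proportions. A minimal Python sketch (the function name is my own; the slide's 80 per group presumably includes a continuity-correction or exact-method adjustment, so this plain approximation comes out a little lower):

```python
from math import ceil
from statistics import NormalDist

def n_two_proportions(p1, p2, alpha=0.05, power=0.90):
    """Per-group sample size for detecting a difference between two
    proportions (normal approximation, no continuity correction)."""
    z = NormalDist().inv_cdf
    z_alpha, z_beta = z(1 - alpha / 2), z(power)
    numerator = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(numerator / (p1 - p2) ** 2)

print(n_two_proportions(0.45, 0.70))  # 77, in the same ballpark as the slide's 80
```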

Issues to consider

(J. Peat et al, 2001)

Clinically significant difference
Variability
Resources
Subject availability
Statistical power
Ethics

Outcome of any trial

                                  What is true in the population?
                                  No Effect             Treatment Effect
Conclusion    No Effect           Correct conclusion    Type II error
reached in                        (p = 1 - α)           (p = β)
our study     Treatment Effect    Type I error          Correct conclusion
                                  (p = α)               (p = 1 - β) = Power

Type I and II Errors


Type I: reject the null hypothesis when it is true (the probability of making this error is α)
Type II: accept the null hypothesis when it is false (the probability of making this error is β)
1 - β is the power of a study: rejecting the null hypothesis when in fact it is false

SAMPLE SIZE FOR COMPLEX DESIGN

Cluster-Randomised Trials (CRT)


Assume we are interested in measuring the effect of an intervention or treatment implemented at a cluster or group level. A cluster is a naturally occurring grouping, and clusters are randomised to the intervention or treatment groups.
Examples:
School-based research: children clustered within schools
Health research: patients clustered within ward, general practice, or health practitioner
Occupational: workers clustered within worksites
Slide courtesy of: Thérèse Shaw

CRT: Terminology
Study condition: groups to which clusters are randomly assigned, e.g. intervention vs comparison group
Clusters: groups which are sampled and assigned to conditions, e.g. schools, hospital wards, general practices, worksites, communities, neighbourhoods
Members: units of observation, e.g. schoolchildren, patients, workers

CRT: Effect of Clustering


Traditional statistical methods assume independence of observations. However, members of the same cluster will have things in common, e.g. SES, environment, culture, etc.
Example:
Dietary intake and physical activity in neighbourhoods
Bullying outcomes in schools
Patient characteristics at GP clinics

CRT: Effect of Clustering


Ignoring the homogeneity within clusters leads to an underestimation of the variance of an intervention effect. The p-value for the test of the intervention effect then comes out too small, and thus potentially incorrect conclusions are drawn
i.e. concluding that an intervention has an impact when this is not the case

CRT: Effect of Clustering


The difference in variance is a factor known as the design effect (DEFF) or variance inflation factor (VIF):

DEFF = 1 + (k - 1) × ICC

where k = number of members per cluster and ICC = the intracluster (intraclass) correlation coefficient, i.e. the degree of resemblance between members of the same cluster

CRT: Sample size


We need to inflate the sample size by the design effect, so:
Calculate the sample size required as if conducting a Simple Random Sample (SRS)
Multiply that sample size by the design effect, 1 + (k - 1) × ICC

Example: n = 200 if SRS; with k = 25 and ICC = 0.02, deff ≈ 1.5, so the required n = 300
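The two steps above can be sketched in a few lines of Python (function names are my own; cluster counts are rounded up, which is why 296 subjects become 12 clusters of 25, i.e. 300 in the example):

```python
import math

def design_effect(k, icc):
    """Design effect (variance inflation factor): 1 + (k - 1) * ICC."""
    return 1 + (k - 1) * icc

def crt_sample_size(n_srs, k, icc):
    """Inflate a simple-random-sample size by the design effect.
    Returns (inflated total n, number of clusters of size k needed,
    with the cluster count rounded up)."""
    n = n_srs * design_effect(k, icc)
    return round(n), math.ceil(round(n) / k)

print(design_effect(25, 0.02))         # 1.48
print(crt_sample_size(200, 25, 0.02))  # (296, 12): 12 clusters of 25 gives n = 300
```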

CRT Example: Design effect changes


Members per cluster   ICC = 0.02   ICC = 0.04   ICC = 0.20
         5               1.08         1.16         1.8
        10               1.18         1.36         2.8
        25               1.48         1.96         5.8
        50               1.98         2.96        10.8
       100               2.98         4.96        20.8

CRT Example: Sample size & # of clusters

(SRS n = 200; each cell shows total n / number of clusters)

  k     ICC = 0.02    ICC = 0.04    ICC = 0.20
  5      216 / 44      232 / 47      360 / 72
 10      236 / 24      272 / 28      560 / 56
 25      296 / 12      392 / 16     1160 / 47
 50      396 /  8      592 / 12     2160 / 44
100      596 /  6      992 / 10     4160 / 42

ISSUES RELATED TO SAMPLE SIZE

Sample size too small


What happens?
A Type I or Type II error may occur
Not enough power to show that a clinically important difference is significant
The estimate of effect will be inaccurate
A small difference between groups will not reach statistical significance
The study may be unethical as it cannot fulfil its original aims

Sample size too large


What happens?
A small difference that is not clinically important will be statistically significant
Research resources will be wasted
Inaccurate results due to difficulty in maintaining quality
The response rate may be too low
It is unethical to study more subjects than are needed

Limitations
Sample size calculations make no allowances for the following:
Study subjects who drop out
Subjects who fail to attend or do not comply with the intervention strategy
Having to screen subjects who do not fulfil the eligibility criteria
Subjects with missing data
Variability in measures being larger than expected
Providing power to conduct sub-group analyses (you need to plan these ahead)

Avoid "canned" Effect Sizes


Cohen (1988) defines this as d = δ/σ, with d = 0.2 (small), d = 0.5 (medium) and d = 0.8 (large) effect sizes.
Example: in an industrial experiment, measurements can be made using:
A machine (accurate to a few microns)
A caliper (accurate to a few thousandths of an inch)
A ruler (accurate to a sixteenth of an inch)
No matter which measuring instrument you use, you get the same sample size for a "medium" effect size at 80% power, yet the instrument will have a huge effect on the results.

Retrospective (Post-hoc) Power Calculation


Should we do this? NO! It is an obvious answer to an uninteresting question! If the hypothesis test is significant, the observed power will be high; if the test is not significant, the observed power will be low.
Analogy: if a car makes it to the top of a hill it was powerful enough; if it doesn't, then it wasn't powerful enough (Lenth, 2001)

Retrospective (Post-hoc) Power Calculation


Observed power is a function of the p-value of a test. Do not use pre-experiment numbers to interpret post-experiment results!
Analogy: trying to convince someone that buying a lottery ticket was foolish (a before-experiment perspective) after they have won a jackpot (an after-experiment perspective).
Sample size and power calculations should not be used as a data-analytical tool; they are for planning purposes only!

A typical example: Reporting Statistical Power


Q: A medical treatment has a 45% cure rate. A new surgical procedure is proposed. Due to surgical morbidity, researchers judge that a cure rate of 70% needs to be achieved in order to justify its use. In order to conduct a clinical trial to compare these two therapies, how many patients do we need?
A: We need 80 patients randomised to each therapy to produce a 90% chance of achieving a statistically significant difference.

Reporting Statistical Power


In this example we have 90% power to detect a 25% difference between the 2 treatments with n = 80 per group. Suppose we observed the following cure rates: group 1: 60/80 (75%) and group 2: 49/80 (61%); difference 14%, p = 0.06. There are two ways to report this:
1) a statistically non-significant result (p > 0.05) was observed, OR
2) report the observed difference (14%), the exact p-value (0.06), and the 95% confidence interval of the difference (14% ± 15% = -1% to 29%)
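The second style of report can be reproduced with a standard Wald interval for a difference of two proportions. A minimal Python sketch (the function name is my own; no continuity correction, and the slide's limits reflect rounding the half-width to 15% before adding it to the 14% difference):

```python
from math import sqrt
from statistics import NormalDist

def diff_in_proportions_ci(x1, n1, x2, n2, conf=0.95):
    """Observed difference in two proportions with a Wald confidence
    interval (simple normal approximation, no continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se

d, lo, hi = diff_in_proportions_ci(60, 80, 49, 80)
print(f"difference {d:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

The interval straddles zero, which is exactly why reporting it is more informative than the bare statement "p > 0.05": the data are compatible both with no difference and with a difference as large as about 28%.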

THANK YOU
Coming Soon:
9th Sep 09: An Introduction to Cochrane Systematic Reviews
Dr Bruce Walker, Murdoch University

7th Oct 09: The Power of Words: Applying Qualitative Research successfully in a Health Care Setting
Dr Caroline Bulsara, University of WA
