
Stratified sampling

From Wikipedia, the free encyclopedia

In statistics, stratified sampling is a method of sampling from a population.
In statistical surveys, when subpopulations within an overall population vary, it is advantageous to sample
each subpopulation (stratum) independently. Stratification is the process of dividing members of the
population into homogeneous subgroups before sampling. The strata should be mutually exclusive: every
element in the population must be assigned to only one stratum. The strata should also be collectively
exhaustive: no population element can be excluded. Then simple random sampling or systematic sampling
is applied within each stratum. This often improves the representativeness of the sample by reducing
sampling error. It can produce a weighted mean that has less variability than the arithmetic mean of a
simple random sample of the population.
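The weighted mean mentioned above can be written out in standard survey-sampling notation. The symbols are introduced here for illustration and do not appear in the article: stratum h holds N_h of the N population elements, n_h of them are sampled by simple random sampling without replacement, \bar{x}_h is the sample mean in stratum h, and S_h^2 is the within-stratum variance.

```latex
\bar{x}_{st} = \sum_{h=1}^{H} W_h \, \bar{x}_h ,
\qquad W_h = \frac{N_h}{N},
\qquad
\operatorname{Var}\left(\bar{x}_{st}\right)
  = \sum_{h=1}^{H} W_h^{2} \left(1 - \frac{n_h}{N_h}\right) \frac{S_h^{2}}{n_h}
```

Only the within-stratum variances S_h^2 enter the expression, which is why homogeneous strata tend to give a less variable estimate than a simple random sample of the same total size.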
In computational statistics, stratified sampling is a method of variance reduction when Monte Carlo
methods are used to estimate population statistics from a known population.
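The variance-reduction effect can be illustrated with a small simulation. This is a minimal sketch, not a reference implementation: the two-stratum population, the sample size of 50, the 2000 repetitions, and the function names srs_mean and stratified_mean are all assumptions made for the example.

```python
import random
import statistics

def srs_mean(population, n, rng):
    """Mean of a simple random sample of size n, drawn without replacement."""
    return statistics.fmean(rng.sample(population, n))

def stratified_mean(strata, n, rng):
    """Weighted mean from proportionate stratified sampling.

    strata is a list of lists, one list of values per stratum. Each stratum
    receives a share of n proportional to its size (at least one unit).
    """
    total = sum(len(s) for s in strata)
    estimate = 0.0
    for stratum in strata:
        n_h = max(1, round(n * len(stratum) / total))
        sample = rng.sample(stratum, n_h)
        # Weight each stratum mean by the stratum's share of the population.
        estimate += (len(stratum) / total) * statistics.fmean(sample)
    return estimate

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical known population: two strata with very different means.
low = [rng.gauss(10, 1) for _ in range(600)]
high = [rng.gauss(50, 1) for _ in range(400)]
population = low + high

# Repeat both sampling schemes many times and compare the spread of the estimates.
srs_estimates = [srs_mean(population, 50, rng) for _ in range(2000)]
strat_estimates = [stratified_mean([low, high], 50, rng) for _ in range(2000)]

var_srs = statistics.variance(srs_estimates)
var_strat = statistics.variance(strat_estimates)
print("variance of SRS estimates:       ", var_srs)
print("variance of stratified estimates:", var_strat)
```

Because the strata differ far more between themselves than within themselves, the stratified estimates should vary much less from run to run than the simple-random-sample estimates.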

Contents
1 Stratified sampling strategies
2 Advantages
3 Disadvantages
4 Practical example
5 See also
6 References
7 Further reading

Stratified sampling strategies
1. Proportionate allocation uses a sampling fraction in each of the strata that is proportional to that of
the total population. For instance, if a population of X individuals consists of m in the male stratum and
f in the female stratum (where m + f = X), then the relative size of the two samples (x1 = m/X males,
x2 = f/X females) should reflect this proportion.
2. Optimum allocation (or disproportionate allocation): the sampling fraction of each stratum is
proportional to the standard deviation of the distribution of the variable. Larger samples are taken in the
strata with the greatest variability to generate the least possible sampling variance.
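The two strategies can be sketched as follows. The optimum-allocation function uses Neyman allocation (sample size proportional to stratum size times stratum standard deviation, equal sampling cost per unit), which is one common formalisation and is an assumption here rather than something the text specifies. The stratum sizes, standard deviations, and function names are hypothetical.

```python
def proportionate_allocation(sizes, n):
    """Split a total sample of n across strata in proportion to stratum size."""
    total = sum(sizes.values())
    return {name: n * size / total for name, size in sizes.items()}

def neyman_allocation(sizes, std_devs, n):
    """Split n in proportion to (stratum size x stratum standard deviation).

    Assumes the same sampling cost per unit in every stratum.
    """
    weights = {name: sizes[name] * std_devs[name] for name in sizes}
    total = sum(weights.values())
    return {name: n * w / total for name, w in weights.items()}

# Hypothetical population: m = 600 males, f = 400 females, X = 1000.
sizes = {"male": 600, "female": 400}
# Hypothetical standard deviations of the survey variable in each stratum.
std_devs = {"male": 5.0, "female": 15.0}

proportionate = proportionate_allocation(sizes, 90)
optimum = neyman_allocation(sizes, std_devs, 90)
print(proportionate)
print(optimum)
```

The functions return unrounded values; in practice the results need a rounding rule that keeps the total equal to n. With these numbers the more variable female stratum receives the larger sample under Neyman allocation even though it is the smaller stratum.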
Stratified sampling ensures that at least one observation is picked from each of the strata, even if the
probability of it being selected is far less than 1. Hence the statistical properties of the population may
not be preserved if there are thin strata. A rule of thumb used to avoid this is that the population should
consist of no more than six strata, but the rule can change in special cases: for example, if there are
100 strata each with 1 million observations, it is perfectly fine to do a 10% stratified sampling on them.
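The effect of thin strata can be seen in a short sketch of fixed-fraction sampling that forces at least one observation per stratum. The function name stratified_sample, the record layout, and the stratum sizes are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, fraction, rng):
    """Draw a simple random sample of the given fraction from every stratum.

    At least one record is taken from each stratum, however small it is.
    """
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    sample = []
    for members in strata.values():
        n_h = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n_h))
    return sample

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: one large stratum and one thin stratum.
records = [("common", i) for i in range(950)] + [("rare", i) for i in range(5)]
sample = stratified_sample(records, lambda r: r[0], 0.10, rng)

counts = {"common": 0, "rare": 0}
for label, _ in sample:
    counts[label] += 1
print(counts)
```

Here the thin stratum makes up 5 of 955 population records but 1 of 96 sampled records, roughly double its population share, so unweighted statistics computed from the sample would over-represent it.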
A real-world example of using stratified sampling would be a political survey. If the respondents needed
to reflect the diversity of the population, the researcher would specifically seek to include participants
from various minority groups, such as those defined by race or religion, in proportion to their share of
the total population as mentioned above. A stratified survey could thus claim to be more representative of
the population than a survey based on simple random sampling or systematic sampling.

Advantages
If population density varies greatly within a region, stratified sampling will ensure that estimates can be
made with equal accuracy in different parts of the region, and that comparisons of subregions can be made
with equal statistical power. For example, in Ontario a survey taken throughout the province might use a
larger sampling fraction in the less populated north, since the disparity in population between north and
south is so great that a sampling fraction based on the provincial sample as a whole might result in the
collection of only a handful of data from the north.
Randomized stratification can also be used to improve population representativeness in a study.
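When a larger sampling fraction is used in one stratum, as in the Ontario example, province-wide estimates have to weight each stratum back to its population share. The sketch below assumes made-up population counts, sample sizes, and stratum means; none of these figures come from the article.

```python
def design_weights(stratum_sizes, sample_sizes):
    """Number of population units represented by each sampled unit, per stratum."""
    return {h: stratum_sizes[h] / sample_sizes[h] for h in stratum_sizes}

def weighted_mean(stratum_means, stratum_sizes):
    """Population-level mean: stratum means weighted by population share."""
    total = sum(stratum_sizes.values())
    return sum(stratum_means[h] * stratum_sizes[h] / total for h in stratum_sizes)

# Hypothetical figures: a sparsely populated north and a populous south,
# each given the same sample size so both can be estimated equally well.
stratum_sizes = {"north": 100_000, "south": 900_000}
sample_sizes = {"north": 500, "south": 500}
stratum_means = {"north": 30.0, "south": 50.0}  # hypothetical survey results

weights = design_weights(stratum_sizes, sample_sizes)
overall = weighted_mean(stratum_means, stratum_sizes)
# Pooling the two equal-sized samples without weights would give this instead.
unweighted = sum(stratum_means.values()) / len(stratum_means)
print(weights)
print(overall, unweighted)
```

Each northern respondent stands for far fewer people than each southern respondent, so the weighted province-wide mean sits much closer to the southern stratum mean than the unweighted pooled mean does.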

Disadvantages
Stratified sampling is not useful when the population cannot be exhaustively partitioned into disjoint
subgroups. It would be a misapplication of the technique to make subgroups' sample sizes proportional to
the amount of data available from the subgroups, rather than scaling sample sizes to subgroup sizes (or to
their variances, if known to vary significantly, e.g. by means of an F test). If the data need to be
stratified by variance, the subgroup sample sizes cannot (at the same time) be made proportional to the
subgroups' sizes within the total population. For an efficient way to partition sampling resources among
groups that vary in their means, their variances, and their costs, see "optimum allocation". Stratified
sampling in the case of unknown class priors (the ratio of subpopulations in the entire population) can
have a deleterious effect on the performance of any analysis on the dataset, e.g. classification.[1] In that
regard, a minimax sampling ratio can be used to make the dataset robust with respect to uncertainty in the
underlying data-generating process.[1]

Practical example
In general the size of the sample in each stratum is taken in proportion to the size of the stratum. This is
called proportional allocation. Suppose that in a company there are the following staff:[2]
male, full-time: 90
male, part-time: 18
female, full-time: 9
female, part-time: 63
Total: 180
and we are asked to take a sample of 40 staff, stratified according to the above categories.

The first step is to find the total number of staff (180) and calculate the percentage in each group.
% male, full-time = 90 / 180 = 50%
% male, part-time = 18 / 180 = 10%
% female, full-time = 9 / 180 = 5%
% female, part-time = 63 / 180 = 35%
This tells us that of our sample of 40,
50% should be male, full-time.
10% should be male, part-time.
5% should be female, full-time.
35% should be female, part-time.
50% of 40 is 20.
10% of 40 is 4.
5% of 40 is 2.
35% of 40 is 14.
Another easy way, without having to calculate the percentage, is to multiply each group size by the sample
size and divide by the total population size (size of entire staff):
male, full-time = 90 × (40 / 180) = 20
male, part-time = 18 × (40 / 180) = 4
female, full-time = 9 × (40 / 180) = 2
female, part-time = 63 × (40 / 180) = 14
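The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using the staff counts from the example; the variable names are illustrative.

```python
# Staff counts from the example above.
staff = {
    "male, full-time": 90,
    "male, part-time": 18,
    "female, full-time": 9,
    "female, part-time": 63,
}
sample_size = 40
total = sum(staff.values())  # 180

# Method 1: percentage of each group in the population.
percentages = {group: 100 * count / total for group, count in staff.items()}

# Method 2: group size x sample size / population size.
allocation = {group: count * sample_size / total for group, count in staff.items()}

for group in staff:
    print(f"{group}: {percentages[group]:.0f}% of staff, sample {allocation[group]:.0f}")
```

The counts in this example divide evenly. With other data the products would not be whole numbers, and a rounding rule that keeps the total at the required sample size would be needed.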

See also
Opinion poll
Statistical benchmarking
Stratified sample size

References
1. Shahrokh Esfahani, Mohammad; Dougherty, Edward R. (2014). "Effect of separate sampling on
classification accuracy" (http://bioinformatics.oxfordjournals.org/content/30/2/242). Bioinformatics 30 (2):
242–250. doi:10.1093/bioinformatics/btt662 (http://dx.doi.org/10.1093%2Fbioinformatics%2Fbtt662).
2. Hunt, Neville; Tyrrell, Sidney (2001). "Stratified Sampling"
(http://nestor.coventry.ac.uk/~nhunt/meths/strati.html). Webpage at Coventry University. Retrieved 12 July 2012.

Further reading
Särndal, Carl-Erik; et al. (2003). "Stratified Sampling". Model Assisted Survey Sampling. New York:
Springer. pp. 100–109. ISBN 0387406204.
Retrieved from "http://en.wikipedia.org/w/index.php?title=Stratified_sampling&oldid=641589820"
This page was last modified on 8 January 2015, at 15:50.
