
THE CONFIDENCE INTERVAL MINIPROJECT

17 January 2016

Stephenson Gokingco
D/C Statistics
Crow
Period OB

What is the true mean number of kilobytes of all the photos of all the pages with chapter content
in my AP Human Geography textbook?

Last week my friend asked me if I could send him digital copies of all the pages of the
first chapter of our AP Human Geography textbook because he hadn't gotten one from Mr.
Cutler yet. I happily obliged because he got into the class late due to some scheduling issues, and
I could understand the stress he was under. Over the weekend, I took around 26
photos with my iPhone of the 26 pages of the first chapter and emailed all of them to him. I
noticed that as I was managing the photos in my files folder, each photo had a different file size
in kilobytes. The numbers didn't vary too much, but I could
definitely see a clear spread. So I thought to myself, "This is going to be a lot of data
that I'm going to have to handle and send if this were to continue until the end of the semester.
Exactly what am I getting myself into?" The variation in numbers probably has to do with the
fact that every page is different from the next, which means that every photo will be different
from the next. Depending on how much color, text, graphs, charts, etc. are on the page, different
kilobyte numbers will result because the compressed image file then has to encode a different
amount of information.
Finding the true mean of all of this data will not only tell my friend exactly how
much I'm helping him, but it will also be a tool I can use to convince him to get a textbook as
soon as possible. I'll basically be using statistics to back up my friendly nudge asking him to
get his textbook already. The next time someone sends me photos of textbook pages, I'll be sure
not to take their helpfulness for granted because I've crunched the numbers.

The population from which I took my sample was the 420 pages of chapter text in my AP
Human Geography textbook.

My sampling procedure was simple. I assigned numbers from 1 to 420 to each of the
pages (e.g., Page 1 was assigned 1). I entered this range of numbers into a random number
generator (random.org) to generate a simple random sample of sample size n = 30. The 30
random numbers I got were: 75, 100, 171, 294, 292, 119, 65, 76, 300, 22, 381, 377, 308, 108,
363, 296, 244, 212, 73, 177, 123, 43, 42, 238, 417, 256, 350, 252, 299, 337. I went to each of
these pages in my AP Human Geography textbook and took a picture of each page with my
iPhone 5s. To keep the procedure consistent, I made sure to keep the book in consistent
conditions for each photo. I made sure to fit only the page on my camera screen for each shot. In
other words, all of the images have a similar view. I uploaded all of these photos to my
computer and found the kilobyte size of each one. The numbers I got were (in no particular
order): 1054, 1098, 1148, 1065, 1176, 1066, 1076, 1142, 720, 845, 821, 953, 991, 960, 951, 944,
1013, 933, 1011, 998, 926, 925, 897, 935, 939, 957, 931, 986, 896, 1020. The units for these
measurements were kilobytes. This is truly a simple random sample because
every single page had the same probability of being selected, and every possible group of 30
pages was equally likely to be chosen.
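The draw itself was done on random.org, but the same kind of simple random sample can be sketched in R. This is only an illustration, assuming R's built-in `sample()` as a stand-in for random.org; the seed is arbitrary and the pages it picks will not match the 30 listed above.

```r
# Hypothetical stand-in for the random.org draw: a simple random sample
# of 30 page numbers from 1 to 420, drawn without replacement.
set.seed(2016)  # arbitrary seed, only so the sketch is repeatable
pages <- sample(1:420, size = 30, replace = FALSE)

# Sorted list of pages to photograph
sort(pages)
```

Drawing without replacement matters here: no page can be picked twice, so every group of 30 distinct pages has the same chance of being the sample.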
To be honest, I can't really think of a reason why my sample would not be representative
of my population because the pages I got were a fair mix of text-heavy, graphic-heavy, and
mostly blank pages. Some pages have very little content on them because they are usually the last
page of a chapter, and some pages are different from the others because they are chapter
introduction pages with a different layout. All of these variations were represented in my sample,
and no number range was left out. There was only one page in the 400s range because there are
five times as many pages in each of the 100s, 200s, etc.
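A quick way to check how the sampled pages are spread across the book is to count how many fall in each block of 100 page numbers. Here is a rough sketch using the 30 page numbers listed earlier; the variable names are just for illustration.

```r
# The 30 sampled page numbers from the sampling procedure
pages <- c(75, 100, 171, 294, 292, 119, 65, 76, 300, 22, 381, 377, 308, 108,
           363, 296, 244, 212, 73, 177, 123, 43, 42, 238, 417, 256, 350, 252,
           299, 337)

# Integer-divide by 100 to label each page with its block:
# 0 = pages 1-99, 1 = the 100s, 2 = the 200s, 3 = the 300s, 4 = pages 400-420
hundreds <- pages %/% 100

# Count of sampled pages in each block
table(hundreds)
```

Every block shows up at least once, and the short 400s block shows up only once.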

The 90% confidence interval for the true population mean was (948.57, 1009.89). This
means that we are 90% confident that the true mean number of kilobytes of all the photos of all
the pages with chapter content in my AP Human Geography textbook is between 948.57 and
1009.89 kilobytes. The 95% confidence interval for the true population mean was (942.33,
1016.14). This means that we are 95% confident that the true mean number of kilobytes of all the
photos of all the pages with chapter content in my AP Human Geography textbook is between
942.33 and 1016.14 kilobytes. The 99% confidence interval for the true population mean was
(929.50, 1028.97). This means that we are 99% confident that the true mean number of kilobytes
of all the photos of all the pages with chapter content in my AP Human Geography textbook is
between 929.50 and 1028.97 kilobytes. All of these intervals are in kilobytes; see
Appendix A for the work and code.
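Each interval follows the one-sample t formula, sample mean ± t* × s/√n with df = n − 1. The sketch below recomputes all three intervals, assuming `qt()` for the critical values instead of the rounded table values, so results can differ from the reported ones in the second decimal place. The helper name `t_interval` is made up for this example.

```r
# The 30 photo sizes in kilobytes
kb <- c(1054, 1098, 1148, 1065, 1176, 1066, 1076, 1142, 720, 845, 821, 953,
        991, 960, 951, 944, 1013, 933, 1011, 998, 926, 925, 897, 935, 939,
        957, 931, 986, 896, 1020)

n    <- length(kb)
xbar <- mean(kb)
se   <- sd(kb) / sqrt(n)  # standard error of the sample mean

# Hypothetical helper: t interval for the mean at a given confidence level
t_interval <- function(level) {
  t_star <- qt(1 - (1 - level) / 2, df = n - 1)
  c(lower = xbar - t_star * se, upper = xbar + t_star * se)
}

# One column per confidence level: 90%, 95%, 99%
intervals <- sapply(c(0.90, 0.95, 0.99), t_interval)
round(intervals, 2)

# Cross-check of the 95% interval with R's built-in t.test()
t.test(kb, conf.level = 0.95)$conf.int
```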

As the confidence level increases, the interval also widens, which is to be expected.
Using these confidence intervals, I can say that the photos I take will average close to a
megabyte (1000 kilobytes) of data each. Multiply this by the number of pages that I will have to send,
provided that my friend never gets his book, and I've got approximately 420 megabytes of data
to store and send. Not only will this take a toll on my iPhone, it will also be hard for my
computer to manage because I already have a lot of storage space taken up. We live in a world
where the information we store can't even be touched because it's all around us in bits,
and we have to learn what's worth allocating space to. For me, it will take too much time
and effort to store and send chapter after chapter, page after page, after every unit
we do in the class. For these reasons, I will definitely try to get my friend to get his book as soon
as possible, and I will use these statistics to convince him.
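The 420-megabyte figure can be given a range by scaling an interval for the mean up to all 420 pages. This is a back-of-the-envelope sketch, assuming the 95% interval reported earlier and 1000 kilobytes per megabyte as in the text.

```r
pages_total <- 420                 # pages of chapter content in the book
ci95_kb     <- c(942.33, 1016.14)  # reported 95% interval for the mean, in KB

# Total for all pages = number of pages * mean size, converted to megabytes
total_mb <- pages_total * ci95_kb / 1000
total_mb
```

Under these assumptions the full set of photos comes to roughly 396 to 427 megabytes.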
The only thing I would change about this experiment would be to use some form of
mounted camera to obtain more precise data and results. I'm probably not the most reliable
person in the world when it comes to taking still photo after still photo, but I still did the best I
could. How I took the pictures could've introduced some experimental and human error into the
data. Additionally, I could always raise the sample size to 42 because that would not break the
10% rule when it comes to determining a sample size for a population. By doing this, the t* values
would move closer to the z* values and my intervals should be somewhat narrower, though I would
still use t* because the population standard deviation is unknown.
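To see how much the critical values would actually change with a sample of 42, the t* and z* values can be compared side by side. This is a small sketch, assuming the 10% rule caps the sample at one tenth of the 420 pages; the column names are made up for this example.

```r
N     <- 420
max_n <- N %/% 10  # largest sample allowed by the 10% rule (42)

levels     <- c(0.90, 0.95, 0.99)
upper_tail <- 1 - (1 - levels) / 2  # upper-tail probability for each level

critical <- data.frame(
  level  = levels,
  t_df29 = qt(upper_tail, df = 29),         # current sample, n = 30
  t_df41 = qt(upper_tail, df = max_n - 1),  # larger sample, n = 42
  z      = qnorm(upper_tail)                # normal critical values
)
round(critical, 3)
```

The t* values at df = 41 are still slightly larger than the z* values, so t* stays the more cautious choice whenever the standard deviation is estimated from the sample.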

Works Cited
Haahr, Mads. "True Random Number Service." RANDOM.ORG. Randomness and Integrity
Services Ltd., 1998. Web. 17 Jan. 2016.
De Blij, Harm J., Erin Hogan Fouberg, and Alexander B. Murphy. Human Geography: People,
Place, and Culture. 11th ed. Hoboken, NJ: Wiley, 2015. Print.

Appendix A
> ##### START HERE #####
>
> # the data
> kb <- c(1054, 1098, 1148, 1065, 1176, 1066, 1076, 1142, 720, 845, 821, 953, 991, 960, 951,
    944, 1013, 933, 1011, 998, 926, 925, 897, 935, 939, 957, 931, 986, 896, 1020)
> kb
 [1] 1054 1098 1148 1065 1176 1066 1076 1142  720  845  821  953  991  960  951  944 1013  933 1011  998
[21]  926  925  897  935  939  957  931  986  896 1020
> # sample mean and standard deviation
> mean <- mean(kb)
> mean
[1] 979.2333
> sd <- sd(kb)
> sd
[1] 98.8446
> # 90% confidence interval
> CI90pos <- mean + 1.699 * (sd / sqrt(30))
> CI90pos
[1] 1009.894
> CI90neg <- mean - 1.699 * (sd / sqrt(30))
> CI90neg
[1] 948.5724
> interval90 <- c(CI90neg, CI90pos)
> interval90
[1]  948.5724 1009.8943
> # 95% confidence interval
> CI95pos <- mean + 2.045 * (sd / sqrt(30))
> CI95pos
[1] 1016.138
> CI95neg <- mean - 2.045 * (sd / sqrt(30))
> CI95neg
[1] 942.3283
> interval95 <- c(CI95neg, CI95pos)
> interval95
[1]  942.3283 1016.1384
> # 99% confidence interval
> CI99pos <- mean + 2.756 * (sd / sqrt(30))
> CI99pos
[1] 1028.969
> CI99neg <- mean - 2.756 * (sd / sqrt(30))
> CI99neg
[1] 929.4973
> interval99 <- c(CI99neg, CI99pos)
> interval99
[1]  929.4973 1028.9694
>
> # The numbers 1.699, 2.045, and 2.756 are t* values with df = 29
