
Chapter One

What is Statistics?

1.1

What is Statistics?
Statistics is a way to get information from data.

1.2

What is Statistics?
Statistics is a way to get information from data:
Data → Statistics → Information

1.3

Example 2.6 Stats Anxiety


A student enrolled in a business program is attending the first
class of the required statistics course. The student is somewhat
apprehensive because he believes the myth that the course is
difficult.
To alleviate his anxiety, the student asks the professor about
last year's marks.
The professor obliges and provides a list of the final marks,
which are composed of term work plus the final exam. What
information can the student obtain from the list?
1.4

Example 2.6 Stats Anxiety

1.5

Example 2.6 Stats Anxiety


What is the typical mark?
Mean (the average mark) = 72.67
Median (the mark such that 50% of the marks lie above it and
50% below) = 72
Is this enough information?
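As a minimal sketch of how the two measures of a typical mark are computed, the snippet below uses Python's standard `statistics` module. The list `marks` is hypothetical and is not the professor's list, so the results will not match the 72.67 and 72 quoted above.

```python
from statistics import mean, median

# Hypothetical marks for illustration only; NOT the professor's actual list.
marks = [53, 61, 64, 68, 70, 72, 74, 77, 81, 85, 88, 92]

typical_mean = mean(marks)      # arithmetic average of all marks
typical_median = median(marks)  # middle value: half the marks lie on each side

print(f"Mean   = {typical_mean:.2f}")
print(f"Median = {typical_median}")
```

With an even number of marks, `median` averages the two middle values.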

1.6

Example 2.6 Stats Anxiety


Are most of the marks clustered around the
mean or are they more spread out?
Range = Maximum − Minimum = 92 − 53 = 39
Variance
Standard deviation
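The three measures of spread can be sketched as follows. The `marks` list is again hypothetical; only its minimum (53) and maximum (92) were chosen to agree with the slide. The sample versions of variance and standard deviation (dividing by n − 1) are used here as an assumption, since the slide does not say which version is meant.

```python
from statistics import stdev, variance

# Hypothetical marks for illustration only; min and max match the slide.
marks = [53, 61, 64, 68, 70, 72, 74, 77, 81, 85, 88, 92]

mark_range = max(marks) - min(marks)  # distance between the extreme marks
sample_variance = variance(marks)     # average squared deviation, divisor n - 1
sample_std = stdev(marks)             # square root of the sample variance

print(f"Range              = {mark_range}")
print(f"Variance           = {sample_variance:.2f}")
print(f"Standard deviation = {sample_std:.2f}")
```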

1.7

Example 2.6 Stats Anxiety


Are there many marks below 60 or above 80?
What proportion are A, B, C, D grades?
A graphical technique, the histogram, can provide
us with this and other information.
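A rough text version of such a histogram might look like the sketch below. Both the `marks` list and the grade cutoffs in `letter_grade` (A from 80, B from 70, C from 60, D otherwise) are assumptions made for illustration; the slides do not state the grading scale.

```python
from collections import Counter

# Hypothetical marks for illustration only.
marks = [53, 61, 64, 68, 70, 72, 74, 77, 81, 85, 88, 92]


def letter_grade(mark):
    """Map a mark to a letter grade using assumed (hypothetical) cutoffs."""
    if mark >= 80:
        return "A"
    if mark >= 70:
        return "B"
    if mark >= 60:
        return "C"
    return "D"


counts = Counter(letter_grade(m) for m in marks)
proportions = {g: counts[g] / len(marks) for g in "ABCD"}

# One row per grade: a bar of '#' characters plus the proportion of marks.
for g in "ABCD":
    print(f"{g}: {'#' * counts[g]} ({proportions[g]:.0%})")
```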

1.8

Example 2.6 Stats Anxiety

1.9

Descriptive Statistics
Descriptive statistics deals with methods of organizing,
summarizing, and presenting data in a convenient and
informative way.
One form of descriptive statistics uses graphical techniques,
which allow statistics practitioners to present data in ways that
make it easy for the reader to extract useful information.
Chapters 2 and 3 introduce several graphical methods.

1.10

Descriptive Statistics
Another form of descriptive statistics uses numerical
techniques to summarize data.
The mean and median are popular numerical techniques to
describe the location of the data.
The range, variance, and standard deviation measure the
variability of the data.
Chapter 4 introduces several numerical statistical measures
that describe different features of the data.
1.11

Case 12.1 Pepsi's Exclusivity Agreement
A large university with a total enrollment of about 50,000
students has offered Pepsi-Cola an exclusivity agreement that
would give Pepsi exclusive rights to sell its products at all
university facilities for the next year, with an option for future
years.
In return, the university would receive 35% of the on-campus
revenues and an additional lump sum of $200,000 per year.
Pepsi has been given 2 weeks to respond.

1.12

Introduce new concepts via examples
Population
Sample
Parameters
Statistics
Statistical inference
Confidence level
Significance level

1.13

Case 12.1 Pepsi's Exclusivity Agreement
The market for soft drinks is measured in terms of 12-ounce
cans.
Pepsi currently sells an average of 22,000 cans per week (over
the 40 weeks of the year that the university operates).
The cans sell for an average of $1 each. The costs,
including labor, are 30 cents per can.
Pepsi is unsure of its market share but suspects it is
considerably less than 50%.
1.14

Case 12.1 Pepsi's Exclusivity Agreement
A quick analysis reveals that if its current market share were
25%, then, with an exclusivity agreement,
Pepsi would sell 88,000 cans per week (22,000 is 25% of 88,000),
or 3,520,000 cans per year (88,000 cans over 40 weeks).
The profit or loss can then be calculated.
The only problem is that we do not know how many soft
drinks are sold weekly at the university.
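The calculation for an assumed market share can be sketched as below. The figures come from the case; the function name `annual_profit_with_agreement` is hypothetical, and two assumptions are made that the case does not spell out: the university's 35% is taken on gross sales revenue, and price and cost per can stay unchanged under the agreement.

```python
CANS_PER_WEEK_NOW = 22_000   # Pepsi's current average weekly sales
WEEKS_PER_YEAR = 40          # weeks per year the university operates
PRICE_PER_CAN = 1.00         # dollars
COST_PER_CAN = 0.30          # dollars, including labor
UNIVERSITY_SHARE = 0.35      # assumed to apply to gross on-campus revenue
LUMP_SUM = 200_000           # dollars per year


def annual_profit_with_agreement(market_share):
    """Pepsi's annual profit if it captured the whole campus market."""
    total_weekly_cans = CANS_PER_WEEK_NOW / market_share
    annual_cans = total_weekly_cans * WEEKS_PER_YEAR
    revenue = annual_cans * PRICE_PER_CAN
    costs = annual_cans * COST_PER_CAN
    return revenue - costs - UNIVERSITY_SHARE * revenue - LUMP_SUM


# Profit today, without any agreement, for comparison.
current_profit = CANS_PER_WEEK_NOW * WEEKS_PER_YEAR * (PRICE_PER_CAN - COST_PER_CAN)

for share in (0.25, 0.40, 0.50):
    profit = annual_profit_with_agreement(share)
    print(f"share {share:.0%}: profit with agreement = ${profit:,.0f} "
          f"(current = ${current_profit:,.0f})")
```

Under these assumptions, a 25% share gives 3,520,000 × (1.00 − 0.30 − 0.35) − 200,000 = $1,032,000, against $616,000 today; the conclusion changes as the assumed share rises, which is why the true market size matters.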
1.15

Case 12.1 Pepsi's Exclusivity Agreement
Pepsi assigned a recent university graduate to survey the
university's students to supply the missing information.
Accordingly, she organizes a survey that asks 500 students to
keep track of the number of soft drinks they purchase in the
next 7 days.
The responses are stored in a file (Case12.1) on the disk that
accompanies this book.

1.16

Inferential statistics
The information we would like to acquire in Case 12.1 is an
estimate of annual profits from the exclusivity agreement. The
data are the numbers of cans of soft drinks consumed in 7 days
by the 500 students in the sample.
We want to know the mean number of soft drinks consumed
by all 50,000 students on campus.
To accomplish this goal we need another branch of statistics:
inferential statistics.

1.17

Inferential statistics
Inferential statistics is a body of methods used to draw
conclusions or inferences about characteristics of populations
based on sample data.
The population in question in this case is the soft drink
consumption of the university's 50,000 students.
The cost of interviewing each student would be prohibitive and
extremely time-consuming.
Statistical techniques make such endeavors unnecessary.
Instead, we can sample a much smaller number of students
(the sample size is 500) and infer from the data the number of
soft drinks consumed by all 50,000 students. We can then
estimate annual profits for Pepsi.
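A minimal sketch of this scaling-up step is shown below. The survey responses are simulated (the real ones are in the file that accompanies the book), so the printed numbers are illustrative only; the variable names are hypothetical.

```python
import random
from statistics import mean

random.seed(1)

# SIMULATED responses: cans bought in 7 days by each of 500 students.
# These stand in for the real survey data, which are not reproduced here.
sample = [random.randint(0, 4) for _ in range(500)]

POPULATION_SIZE = 50_000   # students on campus
WEEKS_PER_YEAR = 40        # weeks per year the university operates

sample_mean = mean(sample)                          # cans per student per week
est_weekly_cans = sample_mean * POPULATION_SIZE     # inferred campus-wide weekly sales
est_annual_cans = est_weekly_cans * WEEKS_PER_YEAR  # inferred annual sales

print(f"Sample mean: {sample_mean:.2f} cans per student per week")
print(f"Estimated annual sales: {est_annual_cans:,.0f} cans")
```

The estimated annual volume can then be fed into a profit calculation; how far the estimate may be from the true value is what the confidence level quantifies.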
1.18

Example 12.5
When an election for political office takes place, the television
networks cancel regular programming and instead provide
election coverage.
Usually the ballots are counted and the results are reported. This
takes time.
However, for important offices such as president or senator in
large states, the networks actively compete to see which will
be the first to predict a winner.

1.19

Example 12.5
This is done through exit polls, wherein a random sample of
voters who exit the polling booth is asked for whom they
voted.
From the data, the sample proportion of voters supporting the
candidates is computed.
A statistical technique is applied to determine whether there is
enough evidence to infer that the leading candidate will garner
enough votes to win.

1.20

Example 12.5
The exit poll results from the state of Florida during the 2000
elections were recorded (only the votes for the Republican
candidate George W. Bush and the Democrat Albert Gore).
Suppose that the results (765 people who voted for either Bush
or Gore) were stored in a file on the disk (1 = Gore and 2 =
Bush).
File: Xm1205
The network analysts would like to know whether they can
conclude that George W. Bush will win the state of Florida.

1.21

Example 12.5
Example 12.5 describes a very common application of
statistical inference.
The population the television networks wanted to make
inferences about is the approximately 5 million Floridians who
voted for Bush or Gore for president.
The sample consisted of the 765 people randomly selected by
the polling company who voted for either of the two main
candidates.

1.22

Example 12.5
The characteristic of the population that we would like to
know is the proportion of the total electorate that voted for
Bush.
Specifically, we would like to know whether more than 50%
of the electorate voted for Bush (counting only those who
voted for either the Republican or Democratic candidate).
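One way to ask "is it more than 50%?" is a one-sided z-test for a proportion, sketched below. This choice of technique is an assumption, since the slides only say that "a statistical technique is applied". The sample size of 765 comes from the example, but `bush_votes` is a hypothetical count, not the value in the data file.

```python
from math import sqrt
from statistics import NormalDist

n = 765            # sample size stated in Example 12.5
bush_votes = 400   # HYPOTHETICAL count; the real one is in the data file

p_hat = bush_votes / n                    # sample proportion (a statistic)
p0 = 0.5                                  # threshold needed to win
standard_error = sqrt(p0 * (1 - p0) / n)  # spread of p_hat if the true share is 50%
z = (p_hat - p0) / standard_error         # how many standard errors above 50%
p_value = 1 - NormalDist().cdf(z)         # chance of a z this large if the share is 50%

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.3f}")
```

If the p-value falls below the chosen significance level (5% is a common choice), the networks would conclude that the candidate holds more than half of the two-party vote; with the hypothetical count above it does not.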

1.23

Example 12.5
Because we will not ask every one of the 5 million actual
voters for whom they voted, we cannot predict the outcome
with 100% certainty.
A sample that is only a small fraction of the size of the
population can lead to correct inferences only a certain
percentage of the time.
You will find that statistics practitioners can control that
percentage and usually set it between 90% and 99%.

1.24

Key Statistical Concepts


Population
A population is the group of all items of interest to
a statistics practitioner.
Frequently very large; sometimes infinite.
E.g., all 5 million Florida voters, per Example 12.5.

Sample
A sample is a subset of data drawn from the
population.
Potentially very large, but smaller than the population.
E.g., a sample of 765 voters exit-polled on election day.
1.25

Key Statistical Concepts


Parameter
A descriptive measure of a population.
Statistic
A descriptive measure of a sample.

1.26

Key Statistical Concepts


[Diagram: the sample is a subset of the population; the parameter
belongs to the population and the statistic to the sample.]

Populations have parameters;
samples have statistics.
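The distinction can be made concrete with a small simulation. Everything in it is made up for illustration: a synthetic population of 50,000 values, from which a random sample of 500 is drawn.

```python
import random
from statistics import mean

random.seed(42)

# Synthetic population (illustrative values only).
population = [random.gauss(70, 10) for _ in range(50_000)]

# A sample is a subset drawn from the population.
sample = random.sample(population, 500)

parameter = mean(population)  # describes the whole population
statistic = mean(sample)      # describes only the sample

print(f"Parameter (population mean): {parameter:.2f}")
print(f"Statistic (sample mean):     {statistic:.2f}")
```

In practice the parameter is unknown because the whole population cannot be measured; the statistic is what we can actually compute.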
1.27

Descriptive Statistics
Descriptive statistics is a set of methods for organizing,
summarizing, and presenting data in a convenient and
informative way. These methods include:
Graphical techniques (Chapters 2 and 3), and
Numerical techniques (Chapter 4).

The actual method used depends on what information we
would like to extract. Are we interested in
measure(s) of central location? and/or
measure(s) of variability (dispersion)?

Descriptive statistics helps to answer these questions.
1.28

Inferential Statistics
Descriptive statistics describes the data set that's being
analyzed, but doesn't allow us to draw any conclusions or
make any inferences about the wider population. Hence we need
another branch of statistics: inferential statistics.
Inferential statistics is also a set of methods, but it is used
to draw conclusions or inferences about characteristics of
populations based on data from a sample.

1.29

Statistical Inference
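Statistical inference is the process of making an estimate,
prediction, or decision about a population based on a sample.

[Diagram: a sample (statistic) is drawn from the population
(parameter); inference runs from the sample back to the population.]

What can we infer about a population's parameters
based on a sample's statistics?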
1.30

Statistical Inference
We use statistics to make inferences about parameters.
Therefore, we can make an estimate, prediction, or decision
about a population based on sample data.
Thus, we can apply what we know about a sample to the
larger population from which it was drawn!

1.31

Statistical Inference
Rationale:
Large populations make investigating each member impractical
and expensive.
It is easier and cheaper to take a sample and make estimates about
the population from the sample.

However:
Such conclusions and estimates are not always going to be correct.
For this reason, we build into the statistical inference measures of
reliability, namely confidence level and significance level.

1.32

Confidence & Significance Levels


The confidence level is the proportion of times that an
estimating procedure will be correct.
E.g., a confidence level of 95% means that estimates based on this
form of statistical inference will be correct 95% of the time.

When the purpose of the statistical inference is to draw a
conclusion about a population, the significance level
measures how frequently the conclusion will be wrong in
the long run.
E.g., a 5% significance level means that, in the long run, this type
of conclusion will be wrong 5% of the time.
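The "correct 95% of the time" idea can be checked with a small simulation, sketched below. All settings are assumptions chosen for illustration: a true proportion `TRUE_P` of 0.52, samples of 765, 2,000 repeated polls, and the usual normal-approximation interval for a proportion.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)

TRUE_P = 0.52   # assumed population proportion (the parameter)
N = 765         # size of each simulated sample
TRIALS = 2000   # number of repeated polls
z = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence

covered = 0
for _ in range(TRIALS):
    # Draw one sample and compute its proportion (the statistic).
    successes = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = successes / N
    margin = z * sqrt(p_hat * (1 - p_hat) / N)
    # The estimate is "correct" when the interval contains the parameter.
    if p_hat - margin <= TRUE_P <= p_hat + margin:
        covered += 1

coverage = covered / TRIALS
print(f"Intervals containing the true proportion: {coverage:.1%}")
```

The printed share should land close to 95%, with the remainder being the long-run error rate.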

1.33

Confidence & Significance Levels


If we use α (Greek letter alpha) to represent significance,
then our confidence level is 1 − α.
This relationship can also be stated as:
Confidence level
+ Significance level
= 1

1.34

Confidence & Significance Levels


Consider a statement from polling data you may hear about
in the news:
"This poll is considered accurate within 3.4
percentage points, 19 times out of 20."

In this case, our confidence level is 95% (19/20 = 0.95),
while our significance level is 5%.
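As a rough illustration of where such a statement comes from, the sketch below works backward from the quoted margin to the sample size it would imply. This rests on assumptions the news statement does not give: a simple random sample, the normal approximation, and the worst-case proportion of 0.5.

```python
from math import ceil, sqrt
from statistics import NormalDist

confidence_level = 19 / 20            # "19 times out of 20"
alpha = 1 - confidence_level          # significance level
z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value, about 1.96

margin = 0.034  # "within 3.4 percentage points"
p = 0.5         # worst-case proportion (ASSUMED)

# Margin = z * sqrt(p * (1 - p) / n), solved for n and rounded up.
implied_n = ceil((z * sqrt(p * (1 - p)) / margin) ** 2)

print(f"alpha = {alpha:.2f}, z = {z:.3f}, implied sample size = {implied_n}")
```

Under these assumptions the poll would have surveyed roughly 830 people.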

1.35

Statistical Applications in Business


Statistical analysis plays an important role in virtually all
aspects of business and economics.
Throughout this course, we will see applications of statistics
in accounting, economics, finance, human resources
management, marketing, and operations management.

1.36
