
Introduction to Statistical Inference

By Dr. Saddam Hussain
Objectives
• To define statistics
• To discuss the wide range of applications of statistics in business
• To understand the branches of statistics
• To describe the levels of measurement of data
What is Statistics?
• A collection of tools used for converting raw data into information to help decision makers in their work.
• The science of collecting, organizing, presenting, analyzing, and interpreting data for the purpose of assisting in making more effective decisions.
• A branch of mathematics
• Facts and figures
What is Statistics?

“Statistics is a way to get information from data”

Data → Statistics → Information

Data: Facts, especially numerical facts, collected together for reference or information.
Information: Knowledge communicated concerning some particular fact.

Statistics is a tool for creating new understanding 
from a set of numbers.
Applications of Statistics in Business
• Accounting – auditing and cost estimation
• Finance – investments and portfolio management
• Human resources – compensation, job satisfaction, performance measurement
• Operations – quality management, forecasting, MIS, capacity planning, materials control
• Marketing – market analysis, consumer research, pricing
• Economics – regional, national, and international economic performance
• International Business – market and demographic analysis
Key Statistical Concepts…
Population
• A population is the group of all items of interest to a statistics practitioner.
• Frequently very large; sometimes infinite.
E.g. all blue collar workers in Pakistan
Sample
• A sample is a set of data drawn from the population.
• Potentially very large, but smaller than the population.
E.g. a sample of 765 blue collar workers
Key Statistical Concepts…

Parameter
— A descriptive measure of a
population.

Statistic
— A descriptive measure of a
sample.
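
As a minimal sketch of the difference between a parameter and a statistic, the Python snippet below builds a hypothetical population of 10,000 simulated monthly wages and draws a random sample of 765 from it. The wage figures, the normal distribution, and all variable names are assumptions made for illustration only; they are not real data.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated monthly wages for 10,000 workers.
population = [random.gauss(30000, 5000) for _ in range(10000)]

# Parameter: a descriptive measure of the whole population.
population_mean = statistics.mean(population)

# Sample: 765 workers drawn at random, without replacement.
sample = random.sample(population, 765)

# Statistic: the same descriptive measure, computed on the sample only.
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

In practice the population is rarely available in full, so only the statistic can be computed; the simulation makes both visible so they can be compared.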
Key Statistical Concepts…
[Diagram: the sample is a subset of the population; a parameter describes the population, a statistic describes the sample.]
• Populations have Parameters,
• Samples have Statistics.
Branches of Statistics

[Diagram: Statistics branches into Descriptive Statistics and Inferential Statistics; a further level shows Parametric Statistics and Non-Parametric Statistics.]

Descriptive Statistics…
• …are methods of organizing, summarizing, and presenting data in a convenient and informative way. These methods include:
  • Graphical Techniques
  • Numerical Techniques
• The actual method used depends on what information we would like to extract. Are we interested in…
  • measure(s) of central location? and/or
  • measure(s) of variability (dispersion)?
• Descriptive Statistics helps to answer these questions…
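
The numerical techniques mentioned above can be sketched in a few lines of Python. The example uses the seven student marks that appear later in these slides and the standard-library `statistics` module; the choice of mean, median, range, and sample standard deviation as the measures is an assumption, since the slides do not name specific ones.

```python
import statistics

# Seven student marks out of 100.
marks = [67, 74, 71, 83, 93, 55, 48]

# Measures of central location
mean_mark = statistics.mean(marks)
median_mark = statistics.median(marks)   # 71

# Measures of variability (dispersion)
mark_range = max(marks) - min(marks)     # 45
sample_std = statistics.stdev(marks)     # sample standard deviation (n - 1 denominator)

print(f"Mean:   {mean_mark:.2f}")
print(f"Median: {median_mark}")
print(f"Range:  {mark_range}")
print(f"Sample standard deviation: {sample_std:.2f}")
```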
Inferential Statistics…
• Descriptive statistics describes the data set that’s being analyzed, but doesn’t allow us to draw any conclusions or make any inferences about the data. Hence we need another branch of statistics: inferential statistics.

• Inferential statistics is also a set of methods, but it is used to draw conclusions or inferences about characteristics of populations based on data from a sample.
Statistical Inference…
Statistical inference is the process of making
an estimate, prediction, or decision about a
population based on a sample.
[Diagram: a sample is drawn from the population; the sample’s statistic is used to make an inference about the population’s parameter.]

What can we infer about a Population’s Parameters
based on a Sample’s Statistics?
Population Vs Sample
Population
• A population is a collection of all the elements we are studying and about which we are trying to draw conclusions.
  • All items of interest
  • Group of interest to investigator
Sample
• A sample is a collection of some, but not all, of the elements of the population.
  • Portion of population
  • Will be used to reach conclusions about population
Statistical Inference…
We use statistics to make inferences about parameters.

Therefore, we can make an estimate, prediction, or decision about a population based on sample data.

Thus, we can apply what we know about a sample to the larger population from which it was drawn!
Statistical Inference…
• Rationale:
  • Large populations make investigating each member impractical and expensive.
  • It is easier and cheaper to take a sample and make estimates about the population from the sample.
• However:
  Such conclusions and estimates are not always going to be correct.
  For this reason, we build “measures of reliability” into the statistical inference, namely the confidence level and the significance level.
Confidence & Significance Levels…
The confidence level is the proportion of times that an estimating procedure will be correct.
E.g. a confidence level of 95% means that, in the long run, estimates based on this form of statistical inference will be correct (the interval estimate will contain the true parameter) 95% of the time.
When the purpose of the statistical inference is to draw a conclusion about a population, the significance level measures how frequently the conclusion will be wrong in the long run.
E.g. a 5% significance level means that, in the long run, this type of conclusion will be wrong 5% of the time.
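
The long-run reading of a 95% confidence level can be illustrated with a small simulation. The sketch below assumes a normally distributed population with a known standard deviation and uses the textbook interval x̄ ± 1.96·σ/√n; the population values, sample size, and number of trials are hypothetical choices, not part of the slides.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

MU, SIGMA = 50.0, 10.0   # hypothetical population mean and standard deviation
N = 30                   # sample size
TRIALS = 2000            # number of repeated samples
Z_95 = 1.96              # standard normal critical value for 95% confidence

# With sigma known, every interval has the same half-width.
half_width = Z_95 * SIGMA / math.sqrt(N)

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.mean(sample)
    # Count the interval as "correct" if it contains the true mean.
    if x_bar - half_width <= MU <= x_bar + half_width:
        covered += 1

coverage = covered / TRIALS
print(f"Intervals containing the true mean: {coverage:.1%}")
```

The share of intervals that contain the true mean should land close to 95%. The remaining share, close to 5%, corresponds to how often a two-sided test at the 5% significance level would wrongly reject the true value of the mean.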
Process of Inferential Statistics
[Diagram: select a random sample from the population; calculate the sample mean x̄ (statistic) to estimate the population mean μ (parameter).]
Types of Data and Information
Definitions…
A variable is some characteristic of a population or sample.
E.g. student grades; workers’ salaries
Typically denoted with a capital letter: X, Y, Z…
The values of the variable are the range of possible values for a variable.
E.g. student marks (0..100) or letter grades (A, A-, B+, B, B-…)
Data are the observed values of a variable.
E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
Types of Data & Information
Data (at least for purposes of Statistics) fall into three main groups:

• Interval Data
• Nominal Data
• Ordinal Data
Interval Data…
Interval data
• Real numbers, e.g. heights, weights, prices, etc.
• Also referred to as quantitative or numerical.

Arithmetic operations can be performed on Interval Data, thus it’s meaningful to talk about 2*Height, or Price + $1, and so on.
Nominal Data…
Nominal Data
• The values of nominal data are categories.
E.g. responses to questions about marital
status, coded as:
Single = 1, Married = 2, Divorced = 3,
Widowed = 4

Because the numbers are arbitrary, arithmetic operations don’t make any sense (e.g. does Widowed ÷ 2 = Married?!)
Nominal data are also called qualitative or categorical.
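
A short Python sketch can show why arithmetic on nominal codes is meaningless. The responses below are hypothetical, and the second coding is an arbitrary relabelling introduced only for illustration: the "average" changes when the codes change, while the count per category does not.

```python
from collections import Counter
import statistics

# Hypothetical survey responses on marital status.
responses = ["Single", "Married", "Married", "Widowed",
             "Single", "Divorced", "Married"]

# Two equally valid codings: the numbers are only labels.
coding_a = {"Single": 1, "Married": 2, "Divorced": 3, "Widowed": 4}
coding_b = {"Single": 4, "Married": 1, "Divorced": 2, "Widowed": 3}

# The "mean marital status" depends entirely on the arbitrary coding.
mean_a = statistics.mean([coding_a[r] for r in responses])
mean_b = statistics.mean([coding_b[r] for r in responses])

# Counting observations per category does not depend on the coding.
counts = Counter(responses)

print(f"Mean under coding A: {mean_a:.2f}")
print(f"Mean under coding B: {mean_b:.2f}")
print(f"Counts per category: {dict(counts)}")
```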
Ordinal Data…
Ordinal Data appear to be categorical in nature, but their values have an order, a ranking, to them:
E.g. College course rating system:
poor = 1, fair = 2, good = 3, very good = 4, excellent = 5

While it’s still not meaningful to do arithmetic on this data (e.g. does 2*fair = very good?!), we can say things like: excellent > poor or fair < very good
That is, order is maintained no matter what numeric values are assigned to each category.
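
To illustrate that order survives any order-preserving recoding, the sketch below codes a hypothetical set of nine course ratings in two different ways and finds the median category under each. The ratings, the second coding, and the helper `median_category` are assumptions introduced for this example.

```python
import statistics

# Hypothetical course ratings collected from nine students.
ratings = ["good", "poor", "excellent", "fair", "good",
           "very good", "good", "fair", "excellent"]

# Two codings that both preserve the order poor < fair < ... < excellent.
coding_a = {"poor": 1, "fair": 2, "good": 3, "very good": 4, "excellent": 5}
coding_b = {"poor": 6, "fair": 18, "good": 23, "very good": 45, "excellent": 88}

def median_category(coding):
    """Return the category whose code is the median of the coded ratings."""
    label_of = {code: label for label, code in coding.items()}
    # median_low always returns an actual data value, so it maps back to a label.
    median_code = statistics.median_low([coding[r] for r in ratings])
    return label_of[median_code]

print("Median category, coding A:", median_category(coding_a))
print("Median category, coding B:", median_category(coding_b))
print("excellent > poor under both codings:",
      coding_a["excellent"] > coding_a["poor"]
      and coding_b["excellent"] > coding_b["poor"])
```

Both codings give the same median category because the median depends only on rank, which is the kind of calculation ordinal data supports.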
E.g. Representing Student Grades…
Is the data categorical?
• No: Interval Data, e.g. {0..100}
• Yes: Categorical Data. Is it ordered?
  • Yes: Ordinal Data, e.g. {F, D, C, B, A} (rank order to data)
  • No: Nominal Data, e.g. {Pass | Fail} (NO rank order to data)
Calculations for Types of Data
As mentioned above,
• All calculations are permitted on interval data.
• Only calculations involving a ranking process are allowed for ordinal data.
• No calculations are allowed for nominal data, except counting the number of observations in each category.
This lends itself to the following
“hierarchy of data”…
Hierarchy of Data…
Interval
Values are real numbers.
All calculations are valid.
Data may be treated as ordinal or nominal.
Ordinal
Values must represent the ranked order of the
data.
Calculations based on an ordering process are
valid.
Data may be treated as nominal but not as
interval.
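Nominal
Values are arbitrary numbers that represent categories.
Only calculations based on counting the observations in each category are valid.
Data may not be treated as ordinal or interval.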
