Lecture No. 1
Statistics and Probability
STATISTICS
Derived from the word STATUS, meaning a political
state.
Data
Quantitative (Numeric)
Qualitative (Non-Numeric)
Variable
A quantity that varies from one individual to
another.
Variable
Quantitative (Numeric)
Qualitative (Non-Numeric)
OBSERVATIONS AND VARIABLES
In statistics, an observation often means any sort
of numerical recording of information, whether it is a
physical measurement such as height or weight; a
classification such as heads or tails, or an answer to a
question such as yes or no.
Variable:
A characteristic that varies from one individual or
object to another is called a variable.
For example, age is a variable as it varies from person to
person. A variable can assume a number of values. The
given set of all possible values from which the variable
takes on a value is called its Domain. If for a given
problem, the domain of a variable contains only one
value, then the variable is referred to as a constant.
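As a small illustration of the terms domain and constant, the sketch below treats the distinct values recorded for a variable as its domain in a given problem. The function names `domain` and `is_constant` and the data values are hypothetical, chosen only for this illustration.

```python
def domain(observations):
    """Return the set of distinct values recorded for a variable.

    Assumption: the observed values stand in for the set of all
    possible values of the variable in the problem at hand.
    """
    return set(observations)


def is_constant(observations):
    """A variable whose domain contains only one value is a constant."""
    return len(domain(observations)) == 1


# Hypothetical data: age varies from person to person, so it is a variable.
ages = [19, 21, 20, 19, 22]

# Hypothetical data: every can in one shipment has the same nominal size.
can_sizes_gallons = [1, 1, 1, 1, 1]

print(is_constant(ages))               # False
print(is_constant(can_sizes_gallons))  # True
```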
QUANTITATIVE & QUALITATIVE VARIABLES
Quantitative (Numeric): Continuous or Discrete
Qualitative (Non-Numeric)
Continuous Variable
Arises from measurement, e.g. height, weight etc.
Discrete Variable
Arises from counting, e.g. number of sisters.
Its possible values are separated by gaps or jumps.
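The difference between counting and measuring can be sketched in a few lines. A count can only take whole-number values, so there is a gap between any two neighbouring counts, whereas any value between two measurements is itself a possible measurement. The helper names and the numbers below are hypothetical.

```python
def is_valid_count(value):
    """A count (e.g. number of sisters) must be a non-negative whole number."""
    return value >= 0 and float(value).is_integer()


def midpoint(a, b):
    """Return the value halfway between a and b."""
    return (a + b) / 2


# Discrete: halfway between 1 sister and 2 sisters is 1.5,
# which is not a possible count. This is the "gap" or "jump".
print(is_valid_count(midpoint(1, 2)))  # False

# Continuous: halfway between two heights (in cm) is still a possible height.
print(midpoint(170.0, 171.0))          # 170.5
```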
DISCRETE AND CONTINUOUS VARIABLES:
Measurement Scales
Errors of Measurements
Population
Sample
Five Elements of an Inferential
Statistical Problem:
A population
One or more variables of interest
A sample
An Inference
A measure of Reliability
To understand the concept of reliability,
an important point to grasp is that making
an inference about the population from the
sample is only part of the story.
We also need to know its reliability, that is,
how good our inference is.
Measure of Reliability
A measure of reliability is a statement
(usually quantified) about the degree of
uncertainty associated with a statistical
inference.
The point to be noted is that the only way we
can be certain that an inference about a
population is correct is to include the entire
population in our sample.
However, because of resource constraints
(i.e. insufficient time and/or money), we
usually cannot work with the whole
population, so we base our inference on
just a portion of the population (i.e. a sample).
Consequently, whenever possible, it is
important to determine and report the
reliability of each inference made.
As such, reliability is the fifth element of
an inferential statistical problem.
Example
A large paint retailer has had numerous
complaints from customers about under-
filled paint cans.
As a result, the retailer has begun inspecting
incoming shipments of paint from
suppliers.
Shipments with under-filling problems will be
sent back to the supplier.
A recent shipment contained 2,440 gallon-
size cans.
The retailer sampled 50 cans and weighed
each on a scale capable of measuring
weight to four decimal places.
Properly filled cans weigh 10 pounds.
a) Describe a population
b) Describe a variable of interest
c) Describe a sample
d) Describe the Inference
e) Describe a measure of uncertainty of our
inference.
Solution
a) The population is the set of units of
interest to the retailer, which is the
shipment of 2,440 cans of paint.
b) The weight of the paint cans is the variable
the retailer wishes to evaluate.
c) The sample is a subset of the population.
In this case, it is the 50 cans of paint
selected by the retailer.
d) The inference of interest involves the
generalization of the information contained in
the sample of paint cans to the population of
paint cans.
In particular, the retailer wants to learn about
the extent of the under-filling problem (if any)
in the population.
This might be accomplished by finding the
average weight of the cans in the sample,
and using it to estimate the average weight
of the cans in the population.
e) As far as the measure of reliability of our
inference is concerned, the point to be
noted is that, using statistical methods,
we can determine a bound on the
estimation error.
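The five elements of this example can be sketched with simulated data. The lecture gives no actual weights, so the population below is made up (normally distributed around a hypothetical 9.9 pounds); only the shipment size of 2,440 and the sample size of 50 come from the example. Because the whole population is simulated, the estimation error can be computed directly here, which the retailer could not do in practice.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

SHIPMENT_SIZE = 2440  # cans in the shipment (from the example)
SAMPLE_SIZE = 50      # cans weighed by the retailer (from the example)

# ASSUMPTION: made-up weights in pounds, recorded to four decimal places.
population = [round(random.gauss(9.9, 0.2), 4) for _ in range(SHIPMENT_SIZE)]

# The sample: 50 cans drawn at random, without replacement.
sample = random.sample(population, SAMPLE_SIZE)

# The inference: use the sample mean to estimate the population mean.
sample_mean = statistics.mean(sample)
population_mean = statistics.mean(population)

# Estimation error: difference between the sample and population means.
estimation_error = abs(sample_mean - population_mean)

print(f"sample mean:      {sample_mean:.4f} pounds")
print(f"population mean:  {population_mean:.4f} pounds")
print(f"estimation error: {estimation_error:.4f} pounds")
```

Running the sketch with a different seed gives a different sample and a different estimation error, which is why a measure of reliability is reported alongside the estimate.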
Bound on the Estimation Error
This bound is simply a number that our
estimation error (i.e. the difference between
the average weight of sample and average
weight of population of cans) is not likely to
exceed.
This bound is a measure of the uncertainty
of our inference, or, in other words, the
reliability of the statistical inference.
For Example:
If the sample of 50 cans yields a mean
weight of 9 pounds, it does not follow (nor is
it likely) that the mean weight of the population
of cans is also exactly 9 pounds.
Nevertheless, we can use sound statistical
reasoning to ensure that our sampling
procedure will generate an estimate that is
almost certainly within a specified limit of the
true mean weight of all the cans.
For example, such reasoning might assure us that
the estimate of the population mean from the sample
is almost certainly within 1 pound of the actual
population mean.
The implication is that the actual mean weight of
the entire population of cans is between
9 - 1 = 8 pounds and 9 + 1 = 10 pounds, that is,
(9 ± 1) pounds.
This interval represents a measure of reliability
for the inference.
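The interval arithmetic above can be sketched as follows. The lecture does not say how its bound is computed, so `bound_on_error` is an assumption: it uses a common rule of thumb, a multiple of the sample standard deviation divided by the square root of the sample size. The function names and the short list of weights are hypothetical.

```python
import math
import statistics


def bound_on_error(sample, multiplier=2.0):
    """Rough bound on the estimation error of the sample mean.

    ASSUMPTION: multiplier * (sample standard deviation / sqrt(n)),
    a common rule of thumb, not necessarily the lecture's method.
    """
    n = len(sample)
    return multiplier * statistics.stdev(sample) / math.sqrt(n)


def interval(estimate, bound):
    """Return (lower, upper) = (estimate - bound, estimate + bound)."""
    return (estimate - bound, estimate + bound)


# Numbers from the lecture: sample mean 9 pounds, bound 1 pound.
print(interval(9, 1))  # (8, 10)

# Hypothetical weights in pounds, for illustration only.
weights = [9.8, 10.1, 9.9, 10.0, 9.7, 10.2, 9.9, 10.0]
mean = statistics.mean(weights)
low, high = interval(mean, bound_on_error(weights))
print(f"estimated mean weight: between {low:.3f} and {high:.3f} pounds")
```

A narrower interval means a more reliable inference; under this rule of thumb, a larger sample narrows the interval because the bound shrinks with the square root of the sample size.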
IN TODAY'S LECTURE,
YOU LEARNT:
The nature of the science of Statistics
The importance of Statistics in various
fields
Some technical concepts such as
The meaning of data
Various types of variables
Various types of measurement scales
The concept of errors of measurement
IN THE NEXT LECTURE,
YOU WILL LEARN:
Concept of sampling
Random versus non-random sampling
Simple random sampling
A brief introduction to other types of random sampling
Methods of data collection
In other words, you will begin your journey in a
subject with reference to which it has been said
that statistical thinking will one day be as
necessary for efficient citizenship as the ability to
read and write.