
Section 1

Introduction to Statistics and Sampling

Learning Objectives
Populations and samples
Types of statistics
Good and bad sampling practices

What is Statistics all About?


Statistics deals with the collection,
organization, and presentation of data.
Goals:
Make seemingly random data meaningful
(descriptive statistics and graphs).
Make generalizations about a population
(make inferences) in the presence of
variability.
Predict what might happen in the future.

Samples and Populations


Example: What percentage of
Canadians prefer Liberals over the
other political parties?

Clearly, asking all 35 million Canadians
is not possible because

A better idea is to ask a randomly
selected group (sample) of Canadians.
Use the results of that sample to make
the prediction that the results for all
Canadians will be similar.

Example:
If 60% of the SAMPLE prefer Liberals, then
we hope that about 60% of the ENTIRE
POPULATION of 35 million Canadians also
prefer Liberals.
The sample should not be biased (more
on this later).
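
The idea of using a sample to estimate a population percentage can be tried out in a small simulation. This is only a sketch: the population size (100,000 instead of 35 million), the true 60% preference, and the sample size of 1,000 are all assumed values chosen for illustration, not real polling data.

```python
import random

random.seed(1)  # fixed seed so the illustration is repeatable

# Hypothetical population: 1 = prefers Liberals, 0 = prefers another party.
# Assumed values: 100,000 people, exactly 60% of whom prefer Liberals.
POPULATION_SIZE = 100_000
population = [1] * 60_000 + [0] * 40_000

# Ask a randomly selected group (the sample) instead of everyone.
SAMPLE_SIZE = 1_000
sample = random.sample(population, SAMPLE_SIZE)

population_pct = 100 * sum(population) / POPULATION_SIZE
sample_pct = 100 * sum(sample) / SAMPLE_SIZE

print(f"Population: {population_pct:.1f}% prefer Liberals")
print(f"Sample:     {sample_pct:.1f}% prefer Liberals")
```

The sample percentage will usually land close to, but not exactly on, the population percentage. That leftover difference is the variability mentioned earlier.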

Definitions
Population: The collection of all
individuals (or things) under
consideration (all 35 million Canadians).
Sample: The part of the population
from which information is obtained (the
randomly selected group of Canadians).

Two Types of Statistics


The Liberals example demonstrates
two types of statistics.

Type 1: Descriptive Statistics


Methods of organizing and
summarizing data.
Find sample means, standard deviations,
medians, etc. More on these later!
Tables, scatterplots, histograms, pie charts,
etc.
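
As a quick illustration of these summaries, the sketch below uses Python's standard `statistics` module. The data set (weekly exercise hours for 8 students) is made up for the example.

```python
import statistics

# Hypothetical sample: hours of exercise per week reported by 8 students.
hours = [2, 4, 4, 4, 5, 5, 7, 9]

sample_mean = statistics.mean(hours)      # arithmetic average
sample_median = statistics.median(hours)  # middle value of the sorted data
sample_sd = statistics.stdev(hours)       # sample standard deviation (divides by n - 1)

print("mean:", sample_mean)
print("median:", sample_median)
print("standard deviation:", round(sample_sd, 3))
```

These numbers only describe the 8 students in the sample. Saying anything about all students would be an inference, which is the second type of statistics.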


Type 2: Inferential Statistics


Methods of drawing conclusions about
an entire POPULATION based on the
results from a SAMPLE.
Example: If 60% of the sample favour the
Liberals, conclude (i.e., infer) that a
similar percentage of ALL Canadians also
prefer the Liberals.


In Class Exercise 1.1


Either by yourself or in a group,
determine whether the following is an
example of descriptive or inferential
statistics, and explain your answer.


In Class Exercise 1.1: Descriptive or Inferential?
A sports columnist wants to know how
much, on average, sports professionals
make annually. A sample was taken
and it was found that the mean annual
salary of the sample is $2.1 million.
The conclusion was then made that the
average salary of all sports
professionals was about $2.1 million.

3 Ways to Obtain Population Information
Sampling: A small group of the population is
randomly chosen to obtain information about a
population (a good balance of info and
efficiency).
Census: Collect information from EVERYONE in
the population (usually very costly).
Designed Experiment: Like sampling, but also
tries to determine a CAUSATION rather than just
an association (costly and time consuming).

Good and Bad Samples


Sampling is the most common way to
obtain information.
We must be careful to choose the
sample properly.
A properly chosen sample is called a
REPRESENTATIVE sample (i.e., its results
will be similar to the results for the
population).
Improperly choosing a sample causes
BIAS.

Bias
BIAS means that the sample was NOT
representative of the population.
In that case, you'd have to review your
sampling methodology and start again.
To minimize bias, you must choose a
COMPLETELY RANDOM sample.


What Causes Bias?


Selecting individuals from the
population in a non-random way.
Subjects were (unintentionally) chosen
because of a particular characteristic
they possess.
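Example: A sample of PEI residents found
that 55% prefer the Liberals. Using that to
infer that 55% of all Canadians prefer the
Liberals is biased, because PEI residents
are only one part of the population.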
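
The effect of sampling from only one part of the population can be sketched with a simulation. Everything here is hypothetical: the two regions, their sizes, and their preference percentages are invented to make the effect visible and are not real data.

```python
import random

random.seed(2)  # fixed seed so the illustration is repeatable

# Hypothetical population of 100,000 voters in two regions (1 = prefers Liberals).
# Assumed: region "A" is small and 55% Liberal; region "B" is large and 40% Liberal.
region_a = [("A", 1)] * 5_500 + [("A", 0)] * 4_500     # 10,000 people
region_b = [("B", 1)] * 36_000 + [("B", 0)] * 54_000   # 90,000 people
population = region_a + region_b

def percent_liberal(people):
    """Percentage of the given people who prefer Liberals."""
    return 100 * sum(vote for _, vote in people) / len(people)

true_pct = percent_liberal(population)

# Biased sample: everyone comes from region A only.
biased_pct = percent_liberal(random.sample(region_a, 500))
# Random sample: every person in the population has the same chance.
random_pct = percent_liberal(random.sample(population, 500))

print(f"True value:    {true_pct:.1f}%")
print(f"Biased sample: {biased_pct:.1f}%")
print(f"Random sample: {random_pct:.1f}%")
```

Both samples contain 500 people, so the problem with the first one is not its size. It is that people outside region A had no chance of being chosen.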

Bias: Famous Example


A 1948 Gallup poll asked a sample of
Americans whom they planned to vote for.
The results of the sample predicted that
Dewey would win with 45% of the vote.
Truman then actually won with 49% of
the vote.
The reason is that the sample was
taken by TELEPHONE (not a common
thing to have at the time).

In Class Exercise 1.2


Determine whether the following
scenarios would result in a
representative sample or a biased
sample.
Include your reasoning.


In Class Exercise 1.2: Representative or Not?
To determine the average height of
UPEI students, a researcher took a
random sample of students who were at
the sports centre.


How to Get a Representative Sample
Use Simple Random Sampling, which
is a straightforward Random Number
Table method.
A random number table is posted on
Moodle.
If you bought a book for this course, it will
contain a random number table.


Simple Random Sampling


Number your population from 1 to N
(where N is the total number of subjects
in the population).
Use your random number table to find
a random number between 1 and N.
Repeat until you have the desired
number of subjects (say, k subjects) in
your sample.
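
The same steps can be sketched in code, with a software random number generator standing in for the printed table. The function name `simple_random_sample` is made up for this illustration, and skipping numbers that were already chosen is an assumption (it keeps any subject from appearing twice).

```python
import random

def simple_random_sample(N, k, rng=random):
    """Return k distinct subject numbers chosen at random from 1..N."""
    if not 0 <= k <= N:
        raise ValueError("k must be between 0 and N")
    chosen = []
    while len(chosen) < k:
        number = rng.randint(1, N)  # random number between 1 and N, inclusive
        if number not in chosen:    # skip subjects that were already picked
            chosen.append(number)
    return chosen

sample = simple_random_sample(N=67, k=4)
print(sample)
```

Python's built-in `random.sample(range(1, N + 1), k)` gives the same kind of result in one call. The loop is written out here only to mirror the steps above.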

How to Use a Random Number Table

              Column number
Line number   00-09         10-19         20-29
00            15544 80712   97742 21500   97081 42451
01            01011 21285   04729 39986   73150 31548
02            47435 53308   40718 29050   74858 64517
03            91312 75137   86274 59834   69844 19853
04            12775 08768   80791 16298   22934 09630


Example: Choose a Sample of 4 from a Population of 67.

              Column number
Line number   00-09         10-19         20-29
00            15544 80712   97742 21500   97081 42451
01            01011 21285   04729 39986   73150 31548
02            47435 53308   40718 29050   74858 64517
03            91312 75137   86274 59834   69844 19853
04            12775 08768   80791 16298   22934 09630

Conclusion
In this section, we learned that
Sampling is usually the preferred method
of getting information about a population.
Good samples are representative. Bad
samples are biased.
To avoid bias, you must use complete
randomization to select a sample.

