
Graphical Descriptive Techniques

Chapter 2

Learning Objectives

Understand different types of data:


- Interval
- Nominal
- Ordinal

Learn how to describe a set of nominal data.
Learn how to describe the relationship between two nominal variables.

Populations & Samples


[Diagram: a sample is a subset of a population.]

The graphical & tabular methods presented here apply to both entire populations and samples drawn from populations.

Definitions
Variable: some characteristic of a population or sample.

E.g. student grades.
Typically denoted with a capital letter: X, Y, Z

Values: range of possible values for a variable.

E.g. student grades (0..100)

Data: observed values of a variable.

E.g. student grades: {67, 74, 71, 83, 93, 55, 48}

Variable: what you want to measure (example: the price of gas)
Values: the possible range (example: 2.90 to 5.00)
Data: the actual gas prices observed

Types of Data & Information

Interval Data

e.g. heights, weights, prices, etc.

Nominal Data

e.g. Marital status:

Single = 1, Married = 2, Divorced = 3, Widowed = 4

Ordinal Data

e.g. College course rating system:

poor = 1, fair = 2, good = 3, very good = 4, excellent = 5
We can say things like: excellent > poor or fair < very good

Interval: numerical/quantitative; if arithmetic can be applied, the data are interval. Ex: avg price, avg qty, age, distance traveled
Nominal & Ordinal: both qualitative/categorical
Nominal: categories only, e.g. single <> married; we can't say one is better than the other. Ex: gender, race
Ordinal: order matters (different from nominal). Ex: grades

Hierarchy of Data & Types of Calculations


Interval
- Values are real numbers.
- All calculations are valid.
- Data may be treated as ordinal or nominal.

Ordinal
- Values must represent the ranked order of the data.
- Calculations based on a ranking process are valid.
- Data may be treated as nominal but not as interval.

Nominal
- Values are the arbitrary numbers that represent categories.
- Only calculations based on the count/frequencies of occurrence are valid.
- Data may not be treated as ordinal or interval.
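The hierarchy above can be illustrated with a minimal Python sketch. All three lists are hypothetical values made up for illustration (they are not data from these slides), and the choice of mean, median, and count as representative calculations is an assumption about what "valid calculations" means for each level.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical illustrative values, not data from the slides.
prices = [2.90, 3.40, 3.75, 5.00]  # interval data
ratings = [1, 3, 3, 4, 5]          # ordinal codes: poor = 1 ... excellent = 5
marital = [1, 2, 2, 4, 1, 2]       # nominal codes: single = 1 ... widowed = 4

# Interval: arithmetic such as an average is meaningful.
avg_price = mean(prices)

# Ordinal: rank-based summaries such as the median are meaningful,
# but an average of the codes would not be.
median_rating = median(ratings)

# Nominal: only counting how often each code occurs is meaningful.
marital_counts = Counter(marital)
most_common_code, most_common_count = marital_counts.most_common(1)[0]
```

Nothing stops Python from averaging the marital status codes; the restriction comes from what the numbers mean, not from the software.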

Your Turn

In-class team exercises (pages 17-18)


2.2 2.3 2.4 2.5 2.6

Graphical & Tabular Techniques for Nominal Data


The only allowable calculation on nominal data is to count the frequency of each value of the variable.

We can summarize the data in a table that presents the categories and their counts, called a frequency distribution.

A relative frequency distribution lists the categories and the proportion with which each occurs.

Tabular description = Table
Frequency Distribution: looks at counts only (own versus rent)
Relative Frequency Distribution: proportion / percentage (% renting vs % buying)
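As a rough sketch of the two definitions above, the following Python counts each code and then divides by the total. The `sample` list and its coding (own = 1, rent = 2) are hypothetical, and the function names are chosen here for illustration.

```python
from collections import Counter


def frequency_distribution(codes):
    """Count how many times each value occurs, sorted by value."""
    counts = Counter(codes)
    return dict(sorted(counts.items()))


def relative_frequency_distribution(codes):
    """Proportion of observations falling in each category.

    Assumes codes is non-empty; an empty list would divide by zero.
    """
    freq = frequency_distribution(codes)
    total = sum(freq.values())
    return {value: count / total for value, count in freq.items()}


# Hypothetical sample: own = 1, rent = 2
sample = [1, 2, 1, 1, 2, 1, 2, 1]
freq = frequency_distribution(sample)
rel = relative_frequency_distribution(sample)
```

The relative frequencies always sum to 1, which is a quick sanity check on any table built this way.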

Work Status in the General Social Survey 2008


Survey respondents were asked the following:

"Last week were you working full time, part time, going to school, keeping house, or what?"

The responses were:
1. Working full time
2. Working part time
3. Temporarily not working
4. Unemployed, laid off
5. Retired
6. School
7. Keeping house
8. Other

The responses were recorded using the codes 1, 2, 3, 4, 5, 6, 7, and 8.

Generally, a variable has a short name. Variable = Work Status; Values: 1-8

Work Status in the General Social Survey 2008


2023 responses. Our task is to construct a frequency and relative frequency distribution for these data and graphically summarize the data by producing a bar chart and a pie chart.


Survey Data (150 observations)


1
1 5 1 3 3 3 7 2 6 1 6 3 4 5 2 5 5 2 1 1 5 1 2 3 3 6 1 5 5 4 1 3 1 6 3 1 1 2 1 1 2 1 1 5 1 2 1 3 7 6 3 7 4 4 2 4 3 5 1 1 1 3 1 4 3 6 1 1 1 1 3 5 5 3 7 6 5 1 7 2 5 3 5 7 5 3 5 1 3 3 1 5 3 5 5 3 1 3 3 4 1 5 5 6 3 6 1 3 2 1 3 1 1 6 1 5 1 5 1 3 3 5 3 6 3 6 3 1 1 1 7 1 5 4 6 1 1 5 5 5 3 6 2 4 7 6 1 1 3 5 1 3 3 6 6 1 3 2 3 1 3 3 1 4 3 5 3 7 1 5 1 5 3 5 2 2 7 3 3 3 1 5 6 6 7 6 7 5 1 5 1 1 3 5 3 1 3 1 3 1 1 3 1 5 2 1 7 3 7 5 2 5 1 5 3 5 1 1 5 3 5 5 1 2 1 1 2 2 5 1 4 4 1 5 3 6 6 3 3 7 3 5 4 1 5 6 1 1 5 5 1 5 5 3 1 1 3 6 1 5 1 5 5 1 7 3 1 1 6 5 1 3 3 1 1 1 1 1 7 1 5 1 1 5


Frequency & Relative Frequency Distributions

Frequency Distribution: =countif(A1:A2023,1) = 1003 (BAR CHARTS)
Relative Frequency Distribution with Pivot tables (PIE CHARTS)

Nominal Data (Frequency)

Bar Charts are often used to display frequencies.


Frequency Distribution: =countif(A1:A2023,1) = 1003 (BAR CHARTS)
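Outside Excel, the same idea can be sketched in plain Python: count each code (the job COUNTIF does per category) and draw one bar per category. The counts below are hypothetical, and `text_bar_chart` is a helper name invented for this sketch; it scales the bars so the largest category fills `width` characters.

```python
def text_bar_chart(freq, labels, width=40):
    """Return one text line per category, with bar length proportional to its count."""
    largest = max(freq.values())
    lines = []
    for value, count in sorted(freq.items()):
        bar = "#" * round(width * count / largest)
        name = labels.get(value, str(value))
        lines.append(f"{name:<20} {bar} {count}")
    return lines


# Hypothetical counts for two of the work status codes.
labels = {1: "Working full time", 2: "Working part time"}
chart = text_bar_chart({1: 8, 2: 4}, labels, width=8)
for line in chart:
    print(line)
```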

Nominal Data (Relative Frequency)

Pie Charts show relative frequencies.


Relative frequency Distribution with Pivot tables (PIE CHARTS)

Nominal Data
It's all the same information (based on the same data), just presented differently.



Your Turn

In-class team exercises using MS Excel (pages 29-31):


2.21 2.28 2.32


Describing the Relationship between Two Nominal Variables

Newspaper Readership Survey


In a major North American city there are four competing newspapers: the Post, Globe, Sun, and Star.

To help design advertising campaigns, the advertising managers of the newspapers need to know which segments of the newspaper market are reading their papers.
A survey was conducted to analyze the relationship between newspapers and occupation.


Newspaper Readership Survey

A sample of newspaper readers was asked to report which newspaper they read:

Globe (1) Post (2) Star (3) Sun (4)

The readers were also asked to indicate whether they were a blue-collar worker (1), white-collar worker (2), or professional (3).

How many possible combinations of these two variables are there?
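One way to check the count is to enumerate every (newspaper, occupation) pair. This is a minimal sketch using the codes given above; the dictionary and variable names are chosen here for illustration.

```python
from itertools import product

# Codes as given in the survey description.
newspapers = {1: "Globe", 2: "Post", 3: "Star", 4: "Sun"}
occupations = {1: "Blue collar", 2: "White collar", 3: "Professional"}

# Every pairing of a newspaper code with an occupation code.
combinations = list(product(newspapers, occupations))
n_combinations = len(combinations)  # 4 newspapers x 3 occupations
```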

Cross-classification table of Frequencies

As a first step we need to produce a cross-classification table, which lists the frequency of each combination of the values of the two variables.

By counting the number of times each of the 12 possible combinations occurs, we can produce the following cross-tabulation (cross-classification):

Newspaper   Blue Collar   White Collar   Professional   Total
Globe            27            29             33           89
Post             18            43             51          112
Star             38            21             22           81
Sun              37            15             20           72
Total           120           108            126          354
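The counting step can be sketched in Python as follows. The `pairs` list is a small hypothetical set of (newspaper, occupation) responses, not the survey data, and the function names are invented for this sketch.

```python
from collections import Counter


def cross_classify(pairs, row_values, col_values):
    """Count each (row, column) combination, including combinations that never occur."""
    counts = Counter(pairs)
    return {r: {c: counts.get((r, c), 0) for c in col_values} for r in row_values}


def row_totals(table):
    return {r: sum(cols.values()) for r, cols in table.items()}


def column_totals(table):
    totals = Counter()
    for cols in table.values():
        totals.update(cols)  # adds each column's count to the running total
    return dict(totals)


# Hypothetical responses: (newspaper code, occupation code)
pairs = [(1, 1), (2, 3), (2, 3), (3, 1), (4, 1), (2, 2)]
table = cross_classify(pairs, row_values=[1, 2, 3, 4], col_values=[1, 2, 3])
```

Filling in zeros for combinations that never occur keeps the table rectangular, so the row and column totals line up with the layout of the printed table.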


Relative Frequencies

If occupation and newspaper are related, then there will be notable differences in newspapers read by occupations.

An easy way to see this is to convert the frequencies in each column to relative frequencies.

Newspaper   Blue Collar       White Collar      Professional
Globe       27/120 = 0.23     29/108 = 0.27     33/126 = 0.26
Post        18/120 = 0.15     43/108 = 0.40     51/126 = 0.40
Star        38/120 = 0.32     21/108 = 0.19     22/126 = 0.17
Sun         37/120 = 0.31     15/108 = 0.14     20/126 = 0.16
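The column conversion can be reproduced with a short Python sketch. The frequencies are the ones from the cross-classification table; the function name and the dictionary layout are assumptions made for this example.

```python
# Frequencies from the cross-classification table (rows: newspaper, columns: occupation).
frequencies = {
    "Globe": {"Blue Collar": 27, "White Collar": 29, "Professional": 33},
    "Post": {"Blue Collar": 18, "White Collar": 43, "Professional": 51},
    "Star": {"Blue Collar": 38, "White Collar": 21, "Professional": 22},
    "Sun": {"Blue Collar": 37, "White Collar": 15, "Professional": 20},
}


def column_relative_frequencies(table):
    """Divide each cell by the total of its column."""
    column_totals = {}
    for row in table.values():
        for column, count in row.items():
            column_totals[column] = column_totals.get(column, 0) + count
    return {
        row_name: {column: count / column_totals[column] for column, count in row.items()}
        for row_name, row in table.items()
    }


relative = column_relative_frequencies(frequencies)

# Round to two decimals for display, as in the table.
rounded = {
    name: {column: round(value, 2) for column, value in row.items()}
    for name, row in relative.items()
}
```

Dividing by column totals (rather than row totals or the grand total) is what makes the occupations directly comparable, since each occupation's proportions then sum to 1.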


Interpretation
The relative frequencies in columns 2 and 3 (White Collar and Professional) are similar, but there are large differences between columns 1 (Blue Collar) and 2 and between columns 1 and 3.
[Table of relative frequencies repeated, with the White Collar and Professional columns marked "similar" and the Blue Collar column marked "dissimilar".]

This tells us that blue collar workers tend to read different newspapers from both white collar workers and professionals and that white collar and professionals are quite similar in their newspaper choice.


Graphing the Relationship between 2 Nominal Variables


[Figure: bar charts of newspaper readership (G&M, Post, Star, Sun) for each occupation (Blue collar, White collar, Professional); vertical axis from 0 to 60.]

Use the data from the cross-classification table to create bar charts

Interpretation

If the two variables are unrelated, the patterns exhibited in the bar charts should be approximately the same.

If some relationship exists, then some bar charts will differ from others.

The graphs tell us the same story as did the table.

The shapes of the bar charts for occupations 2 and 3 (White-collar and Professional) are very similar. Both differ considerably from the bar chart for occupation 1 (Blue-collar).


Your Turn

In-class team exercises using MS Excel (pages 39-40):

2.44


Homework

Pages 41-42:

2.50 2.52 2.54
