Académique Documents
Professionnel Documents
Culture Documents
18-1
Chapter Outline
1) Overview
2) Basic Concept
3) Relation to Regression and ANOVA
4) Discriminant Analysis Model
5) Statistics Associated with Discriminant Analysis
18-2
Chapter Outline
7)
Formulation
ii.
Estimation
Validation
18-3
Chapter Outline
9) The Logit Model
i.
Estimation
ii.
Model Fit
An Illustrative Application
10) Summary
2007 Prentice Hall
18-4
ANOVA REGRESSION
DISCRIMINANT/LOGIT
Similarities
Number of
One
dependent
variables
Number of
independent Multiple
variables
One
One
Multiple
Multiple
Differences
Nature of the
dependent Metric
Metric
variables
Nature of the
independent Categorical
2007
Prentice Hall
variables
Categorical
Metric
Metric
18-5
Discriminant Analysis
Discriminant analysis is a technique for analyzing
data when the criterion or dependent variable is
categorical and the predictor or independent variables
are interval in nature.
The objectives of discriminant analysis are as follows:
Development of discriminant functions, or linear
combinations of the predictor or independent variables,
which will best discriminate between the categories of
the criterion or dependent variable (groups).
Examination of whether significant differences exist
among the groups, in terms of the predictor variables.
Determination of which predictor variables contribute to
most of the intergroup differences.
Classification of cases to one of the groups based on the
values of the predictor variables.
Evaluation of the accuracy of classification.
18-6
Discriminant Analysis
18-7
Geometric Interpretation
Fig. 18.1
G1
X2
1
1
1
1
1 1 1
1
1
2
11
1
1
2 22
21
2
2
2
2 2
2
22
22
G2
G1
G2
X1
D
18-8
discriminant score
b 's
X 's
This occurs when the ratio of between-group sum of squares to withingroup sum of squares for the discriminant scores is at a maximum.
18-9
Statistics Associated
with Discriminant
Canonical correlation. Canonical correlation measures
Analysis
18-10
18-11
18-12
18-13
Conducting Discriminant
Analysis
Fig. 18.2
18-14
Conducting Discriminant
Analysis
Identify the objectives, the criterion variable, and the
Formulate
the Problem
independent variables.
The criterion variable must consist of two or more mutually
exclusive and collectively exhaustive categories.
The predictor variables should be selected based on a
theoretical model or previous research, or the experience
of the researcher.
One part of the sample, called the estimation or analysis
sample, is used for estimation of the discriminant
function.
The other part, called the holdout or validation sample,
is reserved for validating the discriminant function.
Often the distribution of the number of cases in the
analysis and validation samples follows the distribution in
the total sample.
18-15
Attitude
Amount
Resort
Head of
Spent on
Travel
($000)
Vacation
1
1
M (2)
2
1
H (3)
3
1
H (3)
4
1
L (1)
5
1
H (3)
6
1
H (3)
7
1
M Prentice
(2)
2007
Hall
Family
Toward
Attached
No.
to Family
Size
Visit
Household
Income
Family
Vacation
50.2
43
70.3
61
62.9
52
48.5
36
52.7
55
75.0
68
46.2
62
18-16
Attitude
Amount
Resort
Head of
Spent on
Income
Travel
Household Family
($000)
Vacation
16
17
18
19
20
21
22
23
24
25
26
27
28
2
2
2
2
2
2
2
2
2
2
2
2
2
32.1
36.2
43.2
50.4
44.1
38.3
55.0
46.1
35.0
37.3
41.8
57.0
33.4
Family
Toward
Attached
No.
Size
Visit
to Family
Vacation
5
4
2
5
6
6
1
3
6
2
5
8
6
4
3
5
2
6
6
2
5
4
7
1
3
8
3
2
2
4
3
2
2
3
5
4
3
2
2
58
55
57
37
42
45
57
51
64
54
56
36
50
L
L
M
M
M
L
M
L
L
L
M
M
L
(1)
(1)
(2)
(2)
(2)
(1)
(2)
(1)
(1)
(1)
(2)
(2)
(1)
18-17
Information on Resort
Visits:
Table
Holdout Sample
18.3
Annual
Attitude
Amount
Head of
Income
Family
Resort
Spent on
Travel
Family
Toward
50.8
63.6
54.0
45.0
68.0
62.1
35.0
49.6
39.4
37.0
Size
Visit
to Family
($000)
Vacation
1
1
2
1
3
1
4
1
5
1
6
1
7
2
8
2
9
2
2
200710
Prentice Hall
Attached
No.
Household
Vacation
4
7
6
5
6
5
4
5
6
2
7
4
7
4
6
6
3
3
5
6
3
7
4
3
6
3
4
5
3
5
45
55
58
60
46
56
54
39
44
51
M(2)
H (3)
M(2)
M(2)
H (3)
H (3)
L (1)
L (1)
H (3)
L (1)
18-18
Conducting Discriminant
Analysis
Estimate the Discriminant
Function Coefficients
18-19
Analysis
Table
18.4
GROUP MEANS
VISIT
INCOME
TRAVEL VACATION
HSIZE
1
2
Total
60.52000
41.91333
51.21667
5.40000
4.33333
4.86667
4.33333
2.80000
3.56667
5.80000
4.06667
4.9333
AGE
53.73333
50.13333
51.93333
9.83065
7.55115
12.79523
1.91982
1.95180
1.97804
1.82052
2.05171
2.09981
1.00000
0.19745
0.09148
0.08887
- 0.01431
1.00000
0.08434
-0.01681
-0.19709
1.00000
0.07046
0.01742
1.23443
.94112
1.33089
HSIZE
1.00000
-0.04301
8.77062
8.27101
8.57395
AGE
1.00000
Wilks'
INCOME
TRAVEL
VACATION
HSIZE
AGE
0.45310
0.92479
0.82377
0.65672
0.95441
F
33.800
2.277
5.990
14.640
1.338
Significance
0.0000
0.1425
0.0209
0.0007
0.2572
Cont.
18-20
1.7862
% of
Cum Canonical
After
Variance % Correlation Function
100.00
100.00
Wilks'
Chi-square
: 0
0 .3589
0.8007 :
26.130
df
5
0.0001
0.74301
0.09611
0.23329
0.46911
0.20922
Structure Matrix:
0.82202
0.54096
0.34607
0.21337
0.16354
Cont.
18-21
15
Group
15
12
80.0%
0
0.0%
Percent of grouped cases correctly classified: 90.00%
Membership
2
3
20.0%
15
100.0%
Cont.
18-22
Group
Membership
No. of Cases 1
2
Group
4
66.7%
2
33.3%
Group
0
0.0%
6
100.0%
18-23
18-24
Conducting Discriminant
Analysis
The interpretation of the discriminant weights, or coefficients, is
similar to that inthe
multiple
regression analysis.
Interpret
Results
18-25
18-26
INCOME
TRAVEL
VACATION
HSIZE
38.57000
4.50000
4.70000
50.11000
4.00000
4.20000
64.97000
6.10000
5.90000
4.20000
56.00000
Total
51.21667
4.86667
4.93333
3.56667
51.93333
3.10000
3.40000
AGE
50.30000
49.50000
5.29718
1.71594
1.88856
1.19722
8.09732
6.00231
2.35702
2.48551
1.50555
9.25263
8.61434
1.19722
1.66333
1.13529
7.60117
Total
12.79523
1.97804
2.09981
1.33089
8.57395
TRAVEL
VACATION
HSIZE
INCOME
1.00000
TRAVEL
0.05120
1.00000
VACATION
0.30681
0.03588
1.00000
HSIZE
0.38050
0.00474
0.22080
1.00000
AGE
-0.20939
-0.34022
-0.01326
-0.02512
AGE
1.00000 Cont.
18-27
Wilks' Lambda
INCOME
TRAVEL
VACATION
HSIZE
AGE
0.26215
0.78790
0.87411
0.88214
Significance
38.00
0.0000
3.634 0.0400
0.88060
1.830
1.944
0.1626
1.804 0.1840
0.1797
3.8190
0.24
0.2469
% of
Cum Canonical
After
Variance % Correlation Function
93.93
6.07
: 0
0.8902
93.93
100.00
0.4450
Wilks'
Chi-square df
0.1664
: 1
44.831 10
0.00
0.8020
5.517
FUNC 1
1.04740
0.33991
-0.14198
-0.16317
0.49474
-0.42076
0.76851
0.53354
0.12932
0.52447
FUNC
2
Cont.
18-28
FUNC 1
0.85556*
0.19319*
0.21935
0.14899
0.16576
FUNC 2
-0.27833
0.07749
0.58829*
0.45362*
0.34079*
FUNC 1
FUNC 2
0.1542658
-0.6197148E-01
0.1867977
0.4223430
-0.6952264E-01
0.2612652
-0.1265334
0.1002796
0.5928055E-01
0.6284206E-01
-11.09442
-3.791600
FUNC 1
-2.04100
-0.40479
2.44578
FUNC 2
0.41847
-0.65867
0.24020
Cont.
18-29
9
90.0%
1
10.0%
0
0.0%
1
10.0%
9
90.0%
0
0.0%
Group
10
Group
10
0
2
0.0%
20.0%
Percent of grouped cases correctly classified: 86.67%
8
80.0%
Group
3
75.0%
1
25.0%
0
0.0%
Group
0
0.0%
3
75.0%
1
25.0%
Group
1
0
25.0%
0.0%
of grouped cases correctly classified: 75.00%
2007 Percent
Prentice Hall
3
75.0%
18-30
All-Groups Scattergram
Fig. 18.3
Across: Function
1
Down: Function
2
4.
0
1 1
1 *1
23
1 1 12 *
2 2
1
1
2 2
1
2
0.
0
3 *3 3
3
3
-4.0
* indicates a group
centroid
-6.0
2007 Prentice Hall
-4.0 -2.0
0.0
2.0
4.0
6.0
18-31
Territorial Map
Fig. 18.4
8.0
4.0
0.0
4.0
-8.0
-8.0
13
13
Across: Function 1
13
Down: Function 2
13
13
* Indicates a
13
group
13
centroid
11 3
112
3
112233
* 1 1 1 2 2 2 2 3 *3
112 *
223
233
12 1 2
2233
1 1 22
223
21 2
11
233
12
122
223
2
112
233
1 12
12
2233
1 12 1 2 2
223
112
1 1 1 22
233
2
-6.0
-4.0
-2.0
0.0
2.0
4.0
6.0
8.0
18-32
Stepwise Discriminant
Analysis
Stepwise discriminant analysis is analogous to
18-33
Stepwise Discriminant
Analysis
18-34
18-35
P
log
a a
1 P
Or
Or
... a
P
a X
log
1 P
i 0
18-36
Model Formulation
k
exp(
P
1 exp(
i 0
a X
i
a X
i 0
Where:
= Probability of success
Xi
= Independent variable i
ai
= parameter to be estimated.
18-37
When Xi approaches
, P approaches 0.
When Xi approaches
, P approaches 1.
18-38
The Cox & Snell R Square can not equal 1.0, even if
the fit is perfect
18-39
Significance Testing
18-40
Interpretation of Coefficients
18-41
Loyalty
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
Brand
4
6
5
7
6
3
5
5
7
7
6
5
7
5
7
3
4
2
5
4
3
3
3
4
6
3
4
3
5
1
Product
3
4
2
5
3
4
5
4
5
6
7
6
3
1
5
1
6
5
2
1
3
4
6
4
3
6
3
5
5
3
Shopping
5
4
4
5
4
5
5
2
4
4
2
4
3
4
5
3
2
2
4
3
4
5
3
2
6
3
2
2
3
2
18-42
Results of Logistic
Regression
Table 18.7
Internal Value
0
Loyal
Model Summary
Step
1
-2 Log
likelihood
23.471(a)
Nagelkerke R
Square
.604
a Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.
18-43
Results of Logistic
Regression
Table 18.7, cont.
Classification Tablea
Predicted
Observed
Step 1 Loyalty to the
Brand
Not Loyal
Loyal
Overall Percentage
Percentage
Correct
80.0
80.0
80.0
a.
The cut value is .500
a
Variables in the Equation
Step
1
Brand
Product
Shopping
Constant
B
1.274
.186
.590
-8.642
S.E.
.479
.322
.491
3.346
Wald
7.075
.335
1.442
6.672
df
1
1
1
1
Sig.
.008
.563
.230
.010
Exp(B)
3.575
1.205
1.804
.000
18-44
SPSS Windows
The DISCRIMINANT program performs both twogroup and multiple discriminant analysis. To
select this procedure using SPSS for Windows
click:
Analyze>Classify>Discriminant
The run logit analysis or logistic regression using
SPSS for Windows, click:
18-45
5.
6.
7.
8.
9.
18-46
2.
3.
4.
5.
6.
Click OK.
18-47