Discriminant Analysis

Discriminant Analysis_2001-2002
Quantitative Techniques for Marketing 6.0 Discriminant Analysis
Aims

To give you an appreciation of the theoretical and practical issues in the application of discriminant analysis to marketing decision making. In particular, the topic aims to indicate the objectives of discriminant analysis, data requirements, methodology and technical concepts, practical issues and potential applications.

Objectives

By the end of this section, and after additional private study and completion of the assignment, you should be able to:
- Understand the suitability of discriminant analysis in the context of data properties and research objectives.
- Appreciate theoretical issues and understand technical concepts of discriminant analysis.
- Understand how to decide how many discriminant functions to derive.
- Interpret the results of a discriminant function.
- Evaluate the classification performance of a discriminant function.
- Appreciate the use of discriminant analysis in marketing.
- Conduct a discriminant analysis using SPSS software.
- Critically evaluate published studies in discriminant analysis.
- Write a structured, critical account of the methodology, application, results and interpretation of discriminant analysis.
Content

6.1 Introduction: Objectives of Discriminant Analysis
6.2 Data Requirements
6.3 The Discriminant Function
    6.3.1 Estimation of the function
    6.3.2 Unstandardised and Standardised Coefficients
6.4 Classification using Discriminant Analysis
    6.4.1 Discriminant Scores
    6.4.2 Classification Criteria
    6.4.3 Evaluating the Discriminant Function using Classification Performance
6.5 Multiple Discriminant Analysis
    6.5.1 Choosing the Number of Functions - Eigenvalue criteria and Wilks' Lambda
6.6 Applications of Discriminant Analysis to Marketing
6.7 Summary
Reading

Hair J F, Anderson R E, Tatham R L and Black W C (1998) Multivariate Data Analysis, 5th Edition, New Jersey, USA: Prentice Hall International. Chapter 4.
Albers-Miller, N. D. (1999). Consumer Misbehaviour: Why People Buy Illicit Goods, Journal of Consumer Marketing, Vol. 16 No. 3, 273-287.
Beharrell, B. and Crockett, A. (1992). New Age Food! New Age Consumers! With or Without Technology Fix Please, British Food Journal, Vol. 94 No. 7.
Crask M R and Perreault J R (1971) Validation of Discriminant Analysis in Marketing Research. Journal of Marketing Research, Vol. XIV, No. 1, Feb.
Cunningham, I. C. M. and Cunningham, W. H. (1973) The Urban In-Home Shopper: Socio-Economic and Attitudinal Characteristics, Journal of Retailing, 49, 42-50.
Gamesalingham, S. and Kumar, K. (2001). Detection of Financial Stress via Multivariate Statistical Analysis, Managerial Finance, Vol. 27 No. 4, 45-55.
Korgaonkar, P., Silverblatt, R. and O'Leary, B. (2001). Web Advertising and Hispanics, Journal of Consumer Marketing, Vol. 18 No. 2, 134-152.
Kuei, C-H, Madu, C. N., Chinho, L. and Min, H. (1997). An Empirical Investigation of the Association Between Quality Management Practices and Organisational Climate, International Journal of Quality Science, Vol. 2 No. 2, 121-137.
Mannion M A, Cowan C and Gannon M (2000) Factors Associated With Perceived Quality Influencing Beef Consumption in Ireland, British Food Journal, Vol. 102, No. 2, pp. 195-210.
McEnally M R and Hawes J M (1984). The Market for Generic Brand Grocery Products, Journal of Marketing, Winter, 75-83.
Montgomery D B (1975). New Product Distribution: An Analysis of Supermarket Buyer Decisions, Journal of Marketing Research, (August 1975), 255-264.
Morrison, D. G. (1969) On the Interpretation of Discriminant Analysis. Journal of Marketing Research, Vol. 1, May, 156-163.
Perry M (1969) Discriminant Analysis of Relations between Consumers' Attitudes, Behaviour and Intentions. Journal of Advertising Research, Vol. 9, No. 2, 1969, 34-39.
Pessemier E A, Burger P C and Tigert D J (1967) Can New Product Buyers be Identified? Journal of Marketing Research, Vol. 4, November, pp. 349-354.
Robertson T S and Kennedy J N (1968) Prediction of Consumer Innovators: Application of Multiple Discriminant Analysis. Journal of Marketing Research, Vol. 5, No. 1, Feb., pp. 64-69.
Sands S and Moore P (1981) Store Site Selection by Discriminant Analysis. Journal of the Market Research Society, Vol. 23, No. 1, Jan, pp. 40-51.
Siu, W-S and Tsoi, M-Y (1998). Nutrition Label Usage of Chinese Consumers. British Food Journal, Vol. 100, No. 1, 25-29.
Steel P, Storey D and Wynarczyk P (1985) The Prediction of Small Company Failure Using Financial Statement Analysis, CURDS, University of Newcastle upon Tyne, Discussion Paper No 19. (See MRN for loan copy)
Taffler R and Houston A W (1980) How to Identify Failing Companies Before it is Late, Professional Administration. (See MRN for loan copy)
Tomlinson M (1994) Do Distinct Class Preferences for Foods Exist?, British Food Journal, Vol. 96 (7), 11-17.
Waldron D G (1978) The Image of Craftsmanship. A Predictor Variable Influencing the Purchase of European Automobiles by Americans. European Journal of Marketing, Vol. 12, No 8, pp. 554-561.
Williams, C. E. and Tse, E. C. Y. (1995). The Relationship Between Strategy and Entrepreneurship: The US Restaurant Sector. International Journal of Contemporary Hospitality Management, Vol. 7 No. 1, 22-26.
6.1 Introduction and Objectives

1. The aim of discriminant analysis is to explain and predict the group membership of objects on the basis of measurements on explanatory variables.

2. Explanation and prediction use a discriminant function, a linear combination of the explanatory (discriminating) variables. For example, for 2 groups and 2 explanatory variables:

   $$D = a_1 x_1 + a_2 x_2 \qquad (1)$$

   where $D$ is the discriminant score, $a_1, a_2$ are coefficients and $x_1, x_2$ are explanatory variables.

3. Analysis concerns estimation of the coefficients for an appropriate set of variables, interpretation of the relative importance of the variables and evaluation of the predictive power of the model.

4. Marketing applications include product usage, store site selection and company failure.
6.2 Data Requirements

1. The data must include a dependent [nominal] variable and [metric] explanatory variables. In this example freezer ownership [cat] is determined by family size [size] and real disposable monthly income [income]:

   $$\text{cat} = a_1\,\text{size} + a_2\,\text{income}$$

   No.   cat   size   income   dat
   1     2     1      2500     1
   2     2     2      3000     1
   3     2     2      4000     1
   4     2     5      4500     1
   5     2     4      5000     1
   6     2     2      5500     1
   7     1     4      6000     1
   8     1     4      7000     1
   9     1     2      8500     1
   10    1     4      10000    1
   11    2     1      2500     2
   12    1     2      3000     2
   13    2     2      4000     2
   14    2     5      4500     2
   15    1     4      5000     2
   16    2     2      5500     2
   17    2     4      6000     2
   18    1     4      7000     2
   19    1     2      8500     2
   20    2     4      10000    2

   Variables
   cat    = group variable: 1 = freezer owner, 2 = non-freezer owner
   size   = family size
   income = monthly disposable income in real terms
   dat    = data selection variable: 1 = data for estimation, 2 = data for classification
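The data set above can be held and split in code as a minimal sketch. The notes themselves use SPSS; the tuple layout and the helper name `split_by_dat` are illustrative assumptions rather than part of the original material.

```python
# Freezer-ownership data from Section 6.2, one tuple per observation:
# (cat, size, income, dat)
ROWS = [
    (2, 1, 2500, 1), (2, 2, 3000, 1), (2, 2, 4000, 1), (2, 5, 4500, 1),
    (2, 4, 5000, 1), (2, 2, 5500, 1), (1, 4, 6000, 1), (1, 4, 7000, 1),
    (1, 2, 8500, 1), (1, 4, 10000, 1),
    (2, 1, 2500, 2), (1, 2, 3000, 2), (2, 2, 4000, 2), (2, 5, 4500, 2),
    (1, 4, 5000, 2), (2, 2, 5500, 2), (2, 4, 6000, 2), (1, 4, 7000, 2),
    (1, 2, 8500, 2), (2, 4, 10000, 2),
]


def split_by_dat(rows):
    """Return (estimation_rows, classification_rows) using the dat flag."""
    estimation = [r for r in rows if r[3] == 1]
    classification = [r for r in rows if r[3] == 2]
    return estimation, classification


estimation, classification = split_by_dat(ROWS)
# Number of freezer owners (cat == 1) in the estimation sample.
owners = sum(1 for r in estimation if r[0] == 1)
print(len(estimation), len(classification), owners)
```

Keeping the two halves separate from the start makes it harder to evaluate the function on the same cases that were used to estimate it.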
6.3 The Discriminant Function

1. With g groups, a maximum of g-1 discriminant functions is necessary. We consider the 2-group case.

6.3.1 Estimation

1. The objective is to find a linear combination of the explanatory variables which separates the groups. For example, with p explanatory variables the discriminant function is:

   $$D = a_1 x_1 + a_2 x_2 + \dots + a_p x_p \qquad (2)$$

2. With g groups, estimation can employ canonical correlations to derive up to g-1 functions as a descending hierarchy, so that:
   D1 explains the group differences the most;
   D2 explains group differences not explained by D1;
   etc.
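For the 2-group case, estimation can be sketched with Fisher's linear discriminant: the coefficient vector is proportional to the inverse of the pooled within-group covariance matrix times the difference in group means. The sketch below applies this to the estimation sample of Section 6.2. Two conventions are assumed here and are not stated in the notes: the coefficients are scaled so that the pooled within-group variance of D equals 1, and the constant is chosen so that D is zero at the grand mean. All function names are hypothetical.

```python
import math

# Estimation sample (dat = 1) from Section 6.2: (cat, size, income)
ESTIMATION = [
    (2, 1, 2500), (2, 2, 3000), (2, 2, 4000), (2, 5, 4500), (2, 4, 5000),
    (2, 2, 5500), (1, 4, 6000), (1, 4, 7000), (1, 2, 8500), (1, 4, 10000),
]


def mean_vector(points):
    """Mean of a list of (size, income) pairs."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(2)]


def scatter(points, centre):
    """2x2 sums of squares and cross-products about `centre`."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        dev = [p[0] - centre[0], p[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                out[i][j] += dev[i] * dev[j]
    return out


def fisher_two_group(rows):
    """Return ([a_size, a_income], constant) for groups coded 1 and 2."""
    g1 = [(size, inc) for cat, size, inc in rows if cat == 1]
    g2 = [(size, inc) for cat, size, inc in rows if cat == 2]
    m1, m2 = mean_vector(g1), mean_vector(g2)
    w1, w2 = scatter(g1, m1), scatter(g2, m2)
    dof = len(rows) - 2
    # Pooled within-group covariance matrix.
    cov = [[(w1[i][j] + w2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    # Direction = inverse(cov) times the difference in group means.
    raw = [(cov[1][1] * diff[0] - cov[0][1] * diff[1]) / det,
           (cov[0][0] * diff[1] - cov[1][0] * diff[0]) / det]
    # Assumed scaling: pooled within-group variance of D equals 1.
    scale = math.sqrt(raw[0] * diff[0] + raw[1] * diff[1])
    coef = [raw[0] / scale, raw[1] / scale]
    # Assumed constant: D is zero at the grand mean of the sample.
    grand = mean_vector([(size, inc) for _, size, inc in rows])
    constant = -(coef[0] * grand[0] + coef[1] * grand[1])
    return coef, constant


coef, constant = fisher_two_group(ESTIMATION)
print([round(c, 6) for c in coef], round(constant, 4))
```

Under these assumptions the result should agree, up to rounding, with the unstandardised coefficients reported for the freezer example in Section 6.3.2.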
6.3.2 Discriminant Function Coefficients

1. Output usually presents 2 types of function:
   (a) based on unstandardised coefficients ( au )
   (b) based on standardised coefficients ( as )

2. Standardised coefficients are those that would be obtained if the explanatory variables had zero means and unit variances. They are used to interpret the relative importance of the explanatory variables. Unstandardised coefficients are used for prediction. See the example in point 3.

3. Example: Freezer Ownership

   Group: 1 = freezer owner, 2 = non-owner
   Explanatory variables: Size = family size, Income = real family income
   Results (based on data from Section 6.2):

   Variable    Standardised   Unstandardised
   Size        0.00537        0.004013
   Income      0.99877        0.000709
   Constant    n/a            -3.979905

   $$D_s = 0.00537\,\text{size} + 0.99877\,\text{income}$$
   $$D_u = 0.004013\,\text{size} + 0.000709\,\text{income} - 3.979905$$

   The standardised coefficient for income (0.99877) is far larger than that for size (0.00537), so income is the dominant discriminating variable in this example.
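The link between the two sets of coefficients can be illustrated with a short sketch. It assumes the common convention that each standardised coefficient equals the unstandardised coefficient multiplied by the pooled within-group standard deviation of its variable; the notes do not state which convention the software used, and the names below are hypothetical.

```python
import math

# Estimation sample (dat = 1) from Section 6.2: (cat, size, income)
ESTIMATION = [
    (2, 1, 2500), (2, 2, 3000), (2, 2, 4000), (2, 5, 4500), (2, 4, 5000),
    (2, 2, 5500), (1, 4, 6000), (1, 4, 7000), (1, 2, 8500), (1, 4, 10000),
]

# Unstandardised coefficients as reported in the example table.
UNSTANDARDISED = {"size": 0.004013, "income": 0.000709}


def pooled_within_sd(rows, col):
    """Pooled within-group standard deviation of column `col` (1 = size, 2 = income)."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row[col])
    ss = 0.0
    for values in groups.values():
        m = sum(values) / len(values)
        ss += sum((x - m) ** 2 for x in values)
    # Degrees of freedom: number of cases minus number of groups.
    return math.sqrt(ss / (len(rows) - len(groups)))


standardised = {
    "size": UNSTANDARDISED["size"] * pooled_within_sd(ESTIMATION, 1),
    "income": UNSTANDARDISED["income"] * pooled_within_sd(ESTIMATION, 2),
}
print({k: round(v, 5) for k, v in standardised.items()})
```

Because the reported unstandardised coefficients are rounded, the values obtained this way match the reported standardised coefficients only to about three decimal places.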
6.4 Classification Using Discriminant Analysis

6.4.1 Discriminant Scores

1. Discriminant scores are the basis for prediction. They are derived from the unstandardised function and values of the explanatory variables not used for estimation.

2. We classify the items on the basis of:
   (a) individual scores
   (b) group scores

3. Individual scores are obtained by substituting individual values of the variables in the unstandardised function.

   No.   Group   Size   Income   D score
   1     2       1      2500     -2.2045
   2     2       2      3000     -1.8462
   3     2       2      4000     -1.1377
   4     2       5      4500     -0.7714
   5     2       4      5000     -0.4211
   6     2       2      5500     -0.0749
   7     1       4      6000      0.2874
   8     1       4      7000      0.9960
   9     1       2      8500      2.0508
   10    1       4      10000     3.1216

4. Group scores are obtained by substituting average group values for each variable in the unstandardised function.

   Group   Average size   Average income   D score
   1       3.50000        7875.000          1.614
   2       2.66667        4083.333         -1.076
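As a sketch of how the individual and group scores above are obtained, the unstandardised function from Section 6.3.2 can be evaluated directly. The function name `d_score` is an illustrative choice. Since the published coefficients are rounded, the computed scores differ from the tabulated ones in the third decimal place.

```python
# Unstandardised function from Section 6.3.2 (rounded coefficients).
COEF_SIZE = 0.004013
COEF_INCOME = 0.000709
CONSTANT = -3.979905


def d_score(size, income):
    """Discriminant score for one case (or for a pair of group averages)."""
    return COEF_SIZE * size + COEF_INCOME * income + CONSTANT


# (size, income) for the ten cases tabulated in Section 6.4.1.
CASES = [
    (1, 2500), (2, 3000), (2, 4000), (5, 4500), (4, 5000),
    (2, 5500), (4, 6000), (4, 7000), (2, 8500), (4, 10000),
]

individual = [d_score(size, income) for size, income in CASES]

# Group scores: substitute the group averages of each variable.
group = {
    1: d_score(3.5, 7875.0),
    2: d_score(2.66667, 4083.333),
}

for number, score in enumerate(individual, start=1):
    print(number, round(score, 4))
print({g: round(s, 3) for g, s in group.items()})
```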
6.4.2 Classification Criteria

1. Classification compares individual scores with the group scores. Individuals are classified in the nearest group. We can implement this using a mid-point (M) between the group scores and classify cases by comparing their score (Di) with M:
   if Di < M, classify in the left group;
   if Di > M, classify in the right group.
   With the group scores of Section 6.4.1, M = (1.614 - 1.076) / 2 = 0.269, with group 2 on the left and group 1 on the right.

2. The resulting classification is:

   No.   Actual group   Size   Income   D score   Predicted group
   1     2              1      2500     -2.2045   2
   2     2              2      3000     -1.8462   2
   3     2              2      4000     -1.1377   2
   4     2              5      4500     -0.7714   2
   5     2              4      5000     -0.4211   2
   6     2              2      5500     -0.0749   2
   7     1              4      6000      0.2874   1
   8     1              4      7000      0.9960   1
   9     1              2      8500      2.0508   1
   10    1              4      10000     3.1216   1
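The mid-point rule can be sketched as follows, using the group scores and D scores tabulated in Sections 6.4.1 and 6.4.2. The function names are hypothetical, and the treatment of a score exactly equal to M (assigned to the right group here) is an assumption, since the notes do not cover that case.

```python
# Group scores from Section 6.4.1.
GROUP_SCORES = {1: 1.614, 2: -1.076}

# Individual D scores from the table in Section 6.4.2.
D_SCORES = [-2.2045, -1.8462, -1.1377, -0.7714, -0.4211,
            -0.0749, 0.2874, 0.9960, 2.0508, 3.1216]


def midpoint(group_scores):
    """Mid-point M between the two group scores."""
    left, right = sorted(group_scores.values())
    return (left + right) / 2


def classify(d, group_scores):
    """Assign a score to the left group if d < M, otherwise to the right group."""
    m = midpoint(group_scores)
    left_group = min(group_scores, key=group_scores.get)
    right_group = max(group_scores, key=group_scores.get)
    # Assumption: a score exactly equal to M goes to the right group.
    return left_group if d < m else right_group


predicted = [classify(d, GROUP_SCORES) for d in D_SCORES]
print(round(midpoint(GROUP_SCORES), 3), predicted)
```

A plain mid-point treats the two groups symmetrically; it does not adjust for unequal group sizes or prior probabilities.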
6.4.3 Evaluation of the Discriminant Function From Classification Criteria

1. Evaluation compares actual group membership with predicted group membership.

2. Morrison (1969) suggests this should be carried out using a holdout sample: data not used for estimation.

3. That is, the full data set is split, with some data used for estimation and the rest used as a holdout sample for evaluation.

4. For example, suppose we have 10 additional observations and classify these on the basis of the existing function and group scores. The D scores and the actual and predicted group memberships are:

   No.   Actual group   Size   Income   D score   Predicted group
   1     2              1      2500     -2.2045   2
   2     1              2      3000     -1.8462   2
   3     2              2      4000     -1.1377   2
   4     2              5      4500     -0.7714   2
   5     1              4      5000     -0.4211   2
   6     2              2      5500     -0.0749   2
   7     2              4      6000      0.2874   1
   8     1              4      7000      0.9960   1
   9     1              2      8500      2.0508   1
   10    2              4      10000     3.1216   1

5. Actual and predicted group allocation is summarised in a confusion or classification matrix as follows:

                     Predicted group
   Actual group      1      2      Totals
   1                 2      2      4
   2                 2      4      6
   Totals            4      6      10
6. Performance is summarised by C, the per cent correct classification:

   $$C = \frac{2 + 4}{10} \times 100 = 60\% \qquad (3)$$

7. In addition to this measure we need a comparative basis on which to evaluate performance.

8. Common bases are:
   (a) the C max criterion
   (b) the C pro criterion

9. C max is based on classification of all cases into the dominant group. With
   p = probability of belonging to group 1
   1-p = probability of belonging to group 2

   $$C_{max} = \max(p,\ 1-p)$$

   the criterion is: if C > C max, the function is good.
   e.g. C max = max(0.40, 0.60) = 0.60 and C = 60%, so here C only equals C max and does not exceed it.

10. C pro is based on the probability of correct classification in all groups using a random method.

11. Defining
   p = proportion of actual cases in group 1
   1-p = proportion of actual cases in group 2

   $$C_{pro} = p^2 + (1-p)^2 \qquad (4)$$

   the criterion is: if C > C pro, the function is good.
   e.g. C pro = (0.40)^2 + (0.60)^2 = 0.52 and C = 60%, so here C exceeds C pro.
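The evaluation steps of this section (classification matrix, per cent correct C, and the C max and C pro benchmarks) can be reproduced with a short sketch. It uses the actual and predicted groups of the ten holdout observations; the function names are illustrative and not taken from the notes.

```python
import math

# Holdout sample from Section 6.4.3.
ACTUAL = [2, 1, 2, 2, 1, 2, 2, 1, 1, 2]
PREDICTED = [2, 2, 2, 2, 2, 2, 1, 1, 1, 1]


def confusion_matrix(actual, predicted, groups=(1, 2)):
    """Counts keyed by (actual group, predicted group)."""
    counts = {(a, p): 0 for a in groups for p in groups}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts


def percent_correct(counts):
    """C: cases on the diagonal as a percentage of all cases."""
    total = sum(counts.values())
    hits = sum(n for (a, p), n in counts.items() if a == p)
    return 100.0 * hits / total


def c_max(actual):
    """Per cent correct if every case were put in the larger group."""
    p = actual.count(1) / len(actual)
    return 100.0 * max(p, 1 - p)


def c_pro(actual):
    """Per cent correct expected from random allocation in proportion to group size."""
    p = actual.count(1) / len(actual)
    return 100.0 * (p ** 2 + (1 - p) ** 2)


matrix = confusion_matrix(ACTUAL, PREDICTED)
c = percent_correct(matrix)
print(matrix)
print(c, c_max(ACTUAL), c_pro(ACTUAL))
# Small tolerance so that floating-point noise does not turn a tie into a pass.
beats_c_max = c > c_max(ACTUAL) and not math.isclose(c, c_max(ACTUAL))
beats_c_pro = c > c_pro(ACTUAL) and not math.isclose(c, c_pro(ACTUAL))
print(beats_c_max, beats_c_pro)
```

In this example the function matches but does not beat the C max benchmark, while it does beat C pro, which shows why the choice of benchmark matters when group sizes are unequal.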
6.5 Multiple Discriminant Analysis

1. With g groups the maximum number of functions required is g-1, but successful discrimination may be possible with fewer functions.

2. Choice of the number of functions:
   (a) Eigenvalue/variance criterion
   (b) Wilks' lambda

3. Eigenvalue/variance, e.g. 4 groups (3 functions at most):

   Function number   Eigenvalue   Percentage variance   Comment
   1                 0.31781      61.52                 substantial
   2                 0.19802      38.33                 fair
   3                 0.00078      0.15                  poor
   Total             0.51661      100.00

   The contribution of function 3 is poor, so it would not be derived.

4. Wilks' Lambda is an inverse measure of the discriminatory power in the explanatory variables which has not been removed by the current set of discriminant functions.

5. The statistical test is for the significance of the information which has not been explained by the current set of functions.

6. The hypotheses are:
   H0: remaining information is not significant
   H1: remaining information is significant
8. The test is a χ² test.

9. Example

   After function   Lambda   Actual χ²   df   Critical χ²   Sig.
   0                0.633    85.538      12   21.026        0.000
   1                0.834    33.931      6    12.592        0.010
   2                0.999    0.145       2    5.991         0.243

   Note
   1. df = degrees of freedom
   2. Critical χ² is based on a 5 per cent significance level.
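Both criteria for choosing the number of functions can be sketched from the figures in the two tables of this section. The percentage of variance is each eigenvalue divided by the eigenvalue total, and the χ² rule derives a further function only while the actual χ² exceeds the critical value. The stopping rule as coded, and the function names, are illustrative assumptions rather than a description of the SPSS procedure.

```python
# Eigenvalues of the three possible functions (4-group example).
EIGENVALUES = [0.31781, 0.19802, 0.00078]

# Residual tests: (functions already derived, actual chi-square, critical chi-square at 5%).
WILKS_TESTS = [(0, 85.538, 21.026), (1, 33.931, 12.592), (2, 0.145, 5.991)]


def percent_variance(eigenvalues):
    """Share of the eigenvalue total attributable to each function, in per cent."""
    total = sum(eigenvalues)
    return [100.0 * e / total for e in eigenvalues]


def functions_to_derive(tests):
    """Number of functions to keep: stop at the first non-significant residual test."""
    kept = 0
    for derived, actual, critical in tests:
        if actual <= critical:
            # Remaining information is not significant: H0 is not rejected.
            break
        kept = derived + 1
    return kept


shares = percent_variance(EIGENVALUES)
n_functions = functions_to_derive(WILKS_TESTS)
print([round(s, 2) for s in shares], n_functions)
```

On these figures the two criteria agree: the third function contributes a negligible share of variance, and the information remaining after two functions is not significant.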
6.6 Applications of Discriminant Analysis to Marketing

1. Discriminant analysis has been applied to three main areas:
   (a) product/service users and non-users;
   (b) store site selection;
   (c) prediction of company failure.

2. In the application to product use, consumers are classified by their degree of product use (user vs. non-user; heavy, medium or light user) or by the time that elapses before they try a product (early adopter, late adopter, non-adopter). Psychographic and demographic variables are then used as discriminating variables. See, for example, the applications to product innovators (Robertson and Kennedy, 1968), buyers of a new supermarket product (Montgomery, 1975), generic brand grocery products (McEnally and Hawes, 1984) and buyers of a new detergent (Pessemier, Burger and Tigert, 1967).

3. A study of beef consumption behaviour in Ireland by Mannion et al (2000) employed factor analysis of a 25-item scale concerned with the importance of a series of attributes (health, safety and quality) associated with beef. The solution produced 7 factors that were subsequently employed in a discriminant analysis of consumers who had maintained their consumption and consumers who had reduced consumption.

4. The application to store site selection classifies stores on the basis of performance and uses demographic characteristics of the population to discriminate between good and bad sites (Sands and Moore, 1981). The objective of the analysis is to formulate a screening policy for new store sites.

5. The application to the prediction of company failure classifies firms on the basis of their performance and uses financial ratios to discriminate between good and bad performers. The objective is to provide a decision framework to anticipate company decline and institute policies to prevent failure. See, for example, Steel, Storey and Wynarczyk (1985), Taffler and Houston (1980) and Taffler (1982).
6.7 Summary

1. Discriminant analysis is a useful aid to classify people or objects into groups using metric or non-metric discriminating variables.

2. Its advantage over univariate analysis is that it establishes inter-group profiles and identifies a hierarchy of relevant variables.

3. The effect of estimation bias on the evaluation of predictive performance can be avoided by using a split data set.

4. A critical aspect of the analysis involves the identification of relevant discriminating variables. In more sophisticated applications, the definition of both dependent and discriminating variables may be less straightforward, e.g. defining product adopters or devising psychographic variables to measure social mobility or venturesomeness.