Applied Marketing (Market Research Methods)

Topic 9:

Discriminant analysis

Dr James Abdey

Overview

Discriminant analysis

Discriminant analysis model

Discriminant analysis statistics

Formulate the problem

Estimate the discriminant function coefficients

Determine the significance of discriminant functions

Interpret the results

Assess validity of discriminant analysis

Overview

We discuss the technique of discriminant analysis, initially by examining its relationship to regression analysis

Modelling of discriminant analysis is presented, along with formulation, estimation, significance, interpretation and validation of results

Two-group and multiple group discriminant analyses are introduced

Discriminant analysis

Discriminant analysis is a technique for analysing data when the dependent variable is categorical and the independent variables are measurable

The main objectives of discriminant analysis are:

Development of discriminant functions, i.e. linear combinations of the independent variables, which will best discriminate between the categories (groups) of the dependent variable

Checking whether significant differences exist among the groups, in terms of the independent variables

Determination of which independent variables contribute to most of the intergroup differences

Classification of cases to one of the groups based on the values of the independent variables, and determining the accuracy of classification

When the dependent variable has two categories, the technique is called two-group discriminant analysis

When three or more categories are involved, the technique is called multiple discriminant analysis

In the two-group case, it is possible to derive only one discriminant function

In the multiple-group case, more than one function may be computed

In general, with M groups and k independent variables, it is possible to estimate up to min(M − 1, k) discriminant functions, i.e. the smaller of M − 1 and k

The first function has the highest ratio of between-groups to within-groups sum of squares

The second function, uncorrelated with the first, has the second highest ratio, and so on

However, not all the functions may be statistically significant
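
As a small illustration of the limit on the number of functions, the following Python sketch computes the smaller of M − 1 and k. The function name is hypothetical and chosen for this example only.

```python
def max_discriminant_functions(n_groups: int, n_predictors: int) -> int:
    """Upper limit on the number of discriminant functions: min(M - 1, k)."""
    if n_groups < 2 or n_predictors < 1:
        raise ValueError("need at least two groups and one independent variable")
    return min(n_groups - 1, n_predictors)


print(max_discriminant_functions(2, 5))   # two-group case: 1
print(max_discriminant_functions(4, 2))   # limited by the number of predictors: 2
print(max_discriminant_functions(4, 10))  # limited by the number of groups: 3
```

This gives only the maximum that can be estimated; it says nothing about how many of those functions turn out to be statistically significant.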

Discriminant analysis model

The discriminant analysis model involves linear combinations of the following form:

$$D = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_k X_k$$

where

D = discriminant score

β's = discriminant coefficients

X's = independent variables

The coefficients, β, are estimated so that the groups differ as much as possible on the values of the discriminant function

This occurs when the ratio of between-group sum of squares to within-group sum of squares for the discriminant scores is at a maximum
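
The maximisation can be sketched for the simplest setting, assuming two groups and two independent variables. The weights are taken proportional to the inverse of the pooled within-group sums of squares and cross-products matrix multiplied by the difference in group means (Fisher's classical solution). The data and function names are made up for illustration, and the constant term is omitted because it does not affect the ratio.

```python
from statistics import mean


def fisher_weights(group_a, group_b):
    """Weights proportional to W^-1 (mean_a - mean_b) for two groups, two predictors."""
    def centre(rows):
        means = [mean(col) for col in zip(*rows)]
        deviations = [[x - m for x, m in zip(row, means)] for row in rows]
        return means, deviations

    mean_a, dev_a = centre(group_a)
    mean_b, dev_b = centre(group_b)
    devs = dev_a + dev_b
    # Pooled within-group sums of squares and cross-products (2 x 2 matrix W).
    w11 = sum(d[0] * d[0] for d in devs)
    w12 = sum(d[0] * d[1] for d in devs)
    w22 = sum(d[1] * d[1] for d in devs)
    det = w11 * w22 - w12 * w12
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    # Multiply the inverse of W by the difference in group means.
    return [(w22 * diff[0] - w12 * diff[1]) / det,
            (w11 * diff[1] - w12 * diff[0]) / det]


def ss_ratio(weights, group_a, group_b):
    """Between-group over within-group sum of squares of the discriminant scores."""
    scores_a = [sum(w * x for w, x in zip(weights, row)) for row in group_a]
    scores_b = [sum(w * x for w, x in zip(weights, row)) for row in group_b]
    grand = mean(scores_a + scores_b)
    between = (len(scores_a) * (mean(scores_a) - grand) ** 2
               + len(scores_b) * (mean(scores_b) - grand) ** 2)
    within = (sum((s - mean(scores_a)) ** 2 for s in scores_a)
              + sum((s - mean(scores_b)) ** 2 for s in scores_b))
    return between / within


# Made-up data: two predictors measured on four cases per group.
group_a = [(2.0, 3.0), (3.0, 3.0), (4.0, 5.0), (3.0, 5.0)]
group_b = [(6.0, 4.0), (7.0, 6.0), (8.0, 5.0), (7.0, 5.0)]

best = fisher_weights(group_a, group_b)
print(ss_ratio(best, group_a, group_b))        # ratio for the estimated weights
print(ss_ratio([1.0, 0.0], group_a, group_b))  # ratio using the first predictor alone
```

Any other choice of weights, such as using a single predictor on its own, gives a ratio no larger than the one obtained with the estimated weights.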

Discriminant analysis statistics

Canonical correlation – the extent of association between the discriminant scores and the groups. It is a measure of association between a discriminant function and the set of dummy variables that define the group membership

Centroid – mean values for the discriminant scores for a particular group. There are as many centroids as there are groups with one centroid per group

Classification matrix – contains the number of correctly classified and misclassified cases

Discriminant function coefficients – the (unstandardised) multipliers of independent variables, when the variables are in the original units of measurement

Discriminant scores – the discriminant function coefficients are multiplied by the values of the respective independent variables. These products are summed and added to the constant term to obtain the discriminant scores
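
The calculation of a discriminant score can be sketched in a few lines of Python. The constant, coefficients and case values below are hypothetical numbers used purely to show the arithmetic, not estimates from any data set.

```python
def discriminant_score(constant, coefficients, values):
    """Constant term plus the sum of coefficient x value products for one case."""
    if len(coefficients) != len(values):
        raise ValueError("one coefficient is needed for each independent variable")
    return constant + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical unstandardised coefficients for two independent variables.
constant = -7.0
coefficients = [0.08, 0.5]

# Hypothetical case, measured in the original units of the variables.
case_values = [60.0, 7.0]
print(discriminant_score(constant, coefficients, case_values))
```

Because the coefficients are unstandardised, the case values must be supplied in the original units of measurement.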

Eigenvalue – For each discriminant function, the eigenvalue is the ratio of between-group to within-group sums of squares. Larger eigenvalues indicate better functions
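
A minimal Python sketch of the eigenvalue, computed directly from discriminant scores grouped by category. It also reports the canonical correlation using the standard identity sqrt(eigenvalue / (1 + eigenvalue)), which is not stated on the slides. The scores and function names are made up for illustration.

```python
import math
from statistics import mean


def eigenvalue(scores_by_group):
    """Between-group divided by within-group sum of squares of the scores."""
    all_scores = [s for scores in scores_by_group for s in scores]
    grand = mean(all_scores)
    between = sum(len(scores) * (mean(scores) - grand) ** 2
                  for scores in scores_by_group)
    within = sum((s - mean(scores)) ** 2
                 for scores in scores_by_group for s in scores)
    return between / within


def canonical_correlation(ev):
    """Canonical correlation implied by an eigenvalue."""
    return math.sqrt(ev / (1.0 + ev))


# Made-up discriminant scores for two groups of four cases each.
scores = [[-2.0, -1.0, -1.0, 0.0], [0.0, 1.0, 1.0, 2.0]]
ev = eigenvalue(scores)
print(ev, canonical_correlation(ev))
```

For these scores the between-group sum of squares is 8 and the within-group sum of squares is 4, so the eigenvalue is 2.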

Group means and group standard deviations – computed for each independent variable for each group

Standardised discriminant function coefficients – used as the multipliers when the independent variables have been standardised, i.e. have a mean of 0 and a variance of 1

Structure correlations – simple correlations between the predictors and the discriminant function

Total correlation matrix – treating the cases as a single sample, a total correlation matrix is obtained

Wilks’ λ – for each independent variable, Wilks’ λ is the ratio of the within-group sum of squares to the total sum of squares. Its value varies between 0 and 1. Large values of λ (near 1) indicate that group means do not seem to be different. Small values of λ (near 0) indicate that the group means do seem to be different
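
The univariate Wilks’ λ can be illustrated with a short Python sketch for a single independent variable. The two data sets are made up: one in which the groups are clearly separated and one in which they overlap heavily. The function name is hypothetical.

```python
from statistics import mean


def univariate_wilks_lambda(values_by_group):
    """Within-group sum of squares divided by the total sum of squares."""
    all_values = [v for values in values_by_group for v in values]
    grand = mean(all_values)
    within = sum((v - mean(values)) ** 2
                 for values in values_by_group for v in values)
    total = sum((v - grand) ** 2 for v in all_values)
    return within / total


# Made-up values of one independent variable in two groups.
separated = [[10.0, 12.0, 11.0], [20.0, 22.0, 21.0]]
overlapping = [[10.0, 20.0, 15.0], [12.0, 22.0, 17.0]]

print(univariate_wilks_lambda(separated))    # close to 0: group means differ
print(univariate_wilks_lambda(overlapping))  # close to 1: group means are similar
```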

Formulate the problem

Identify the objectives, the dependent variable and the independent variables

The dependent variable must consist of two or more mutually exclusive and collectively exhaustive categories

The independent variables should be selected based on a theoretical model or previous research, or the experience of the researcher

One part of the sample, called the estimation sample, is used for estimation of the discriminant function

The other part, called the validation sample, is reserved for validating the discriminant function

Often the distribution of the number of cases in the estimation and validation samples follows the distribution in the total sample
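
One way to obtain estimation and validation samples that mirror the group distribution of the total sample is a stratified random split. The Python sketch below is an illustration only; the function name, the 50/50 share and the toy cases are assumptions, not part of the slides.

```python
import random
from collections import defaultdict


def stratified_split(cases, group_of, estimation_share=0.5, seed=0):
    """Split cases so that each group keeps roughly the same share in both samples."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for case in cases:
        by_group[group_of(case)].append(case)

    estimation, validation = [], []
    for members in by_group.values():
        members = members[:]      # copy, so the caller's list is left untouched
        rng.shuffle(members)
        cut = round(len(members) * estimation_share)
        estimation.extend(members[:cut])
        validation.extend(members[cut:])
    return estimation, validation


# Toy sample: six cases in group "A" and four in group "B".
cases = [("A", i) for i in range(6)] + [("B", i) for i in range(4)]
estimation, validation = stratified_split(cases, group_of=lambda case: case[0])
print(len(estimation), len(validation))
```

With these toy cases each sample contains three cases from group A and two from group B, matching the 60/40 split in the total sample.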

Estimate the discriminant function coefficients

The direct method involves estimating the discriminant function so that all the independent variables are included simultaneously

In stepwise discriminant analysis, the independent variables are entered sequentially, based on their ability to discriminate among the groups

Determine the significance of discriminant functions

The null hypothesis that, in the population, the means of all discriminant functions in all groups are equal can be statistically tested

In SPSS this test is based on Wilks’ λ – if several functions are tested simultaneously (as in the case of multiple discriminant analysis), the Wilks’ λ statistic is the product of the univariate λ for each function
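
The product form can be sketched in Python, taking the univariate λ for each function to be 1 / (1 + eigenvalue). The conversion to a test statistic uses Bartlett's chi-square approximation, which is an assumption here because the slides do not say how the significance level is obtained. The eigenvalues and sample sizes are hypothetical.

```python
import math


def wilks_lambda_from_eigenvalues(eigenvalues):
    """Product of the univariate lambdas, each equal to 1 / (1 + eigenvalue)."""
    lam = 1.0
    for ev in eigenvalues:
        lam *= 1.0 / (1.0 + ev)
    return lam


def bartlett_chi_square(eigenvalues, n_cases, n_predictors, n_groups):
    """Approximate chi-square statistic and degrees of freedom for all functions jointly."""
    lam = wilks_lambda_from_eigenvalues(eigenvalues)
    chi2 = -((n_cases - 1) - (n_predictors + n_groups) / 2.0) * math.log(lam)
    df = n_predictors * (n_groups - 1)
    return chi2, df


# Hypothetical two-group analysis: one function, 30 cases, 5 predictors.
chi2, df = bartlett_chi_square([1.0], n_cases=30, n_predictors=5, n_groups=2)
print(wilks_lambda_from_eigenvalues([1.0]), chi2, df)
```

The statistic would then be compared with a chi-square distribution on the stated degrees of freedom; a large value leads to rejection of the null hypothesis.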

If the null hypothesis is rejected, indicating significant discrimination, one can proceed to interpret the results

Interpret the results

The interpretation of the discriminant coefficients is similar to that in multiple regression analysis

Given the multicollinearity in the independent variables, there is no unambiguous measure of the relative importance of the independent variables in discriminating between the groups

Nevertheless, we can obtain some idea of the relative importance of the variables by examining the absolute magnitude of the standardised discriminant function coefficients

Some idea of the relative importance of the independent variables can also be obtained by examining the structure correlations

These simple correlations between each independent variable and the discriminant function represent the variance that the independent variable shares with the function

Another aid to interpreting discriminant analysis results is to develop a characteristic profile for each group by describing each group in terms of the group means for the independent variables

Assess validity of discriminant analysis

The discriminant coefficients, estimated by using the estimation sample, are multiplied by the values of the independent variables in the validation sample to generate discriminant scores for the cases in the validation sample

The cases are then assigned to groups based on their discriminant scores and an appropriate decision rule
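
The slides do not specify the decision rule. One simple option, assumed here for illustration, is to assign each case to the group whose centroid is closest to its discriminant score; for two groups of equal size this amounts to a cut-off midway between the two centroids. The group labels and centroid values are hypothetical.

```python
def assign_to_nearest_centroid(score, centroids):
    """Return the label of the group whose centroid is closest to the score."""
    return min(centroids, key=lambda group: abs(score - centroids[group]))


# Hypothetical centroids estimated from the estimation sample.
centroids = {"buyers": 1.2, "non-buyers": -1.2}

# Hypothetical discriminant scores for cases in the validation sample.
for score in (0.4, -0.1, 2.3):
    print(score, assign_to_nearest_centroid(score, centroids))
```

With unequal group sizes or unequal costs of misclassification, a different cut-off may be more appropriate.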

The hit ratio, or the percentage of cases correctly classified, can then be determined by summing the diagonal elements of the classification matrix and dividing by the total number of cases
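
The hit ratio calculation can be sketched in Python as follows. The classification matrix is made up, with rows taken as actual groups and columns as predicted groups.

```python
def hit_ratio(classification_matrix):
    """Share of cases on the diagonal of a square classification matrix."""
    correct = sum(classification_matrix[i][i]
                  for i in range(len(classification_matrix)))
    total = sum(sum(row) for row in classification_matrix)
    return correct / total


# Made-up matrix for a validation sample of 30 cases in two groups.
matrix = [[12, 3],
          [2, 13]]
print(hit_ratio(matrix))  # 25 of the 30 cases lie on the diagonal
```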

It is helpful to compare the percentage of cases correctly classified by discriminant analysis to the percentage that would be obtained by chance

Classification accuracy achieved by discriminant analysis should be at least 25% greater than that obtained by chance
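
The comparison with chance can be sketched in Python under two assumptions that the slides do not spell out. First, chance accuracy is taken as the sum of squared group proportions, which equals 1 divided by the number of groups when the groups are of equal size. Second, "25% greater" is read in relative terms, i.e. 1.25 times the chance accuracy. The function names and figures are hypothetical.

```python
def proportional_chance(group_sizes):
    """Chance accuracy as the sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)


def beats_chance(hit_ratio, group_sizes, margin=0.25):
    """True if the hit ratio is at least (1 + margin) times the chance accuracy."""
    return hit_ratio >= (1.0 + margin) * proportional_chance(group_sizes)


# Two equal groups of 15: chance is 0.5, so the threshold is 0.625.
print(proportional_chance([15, 15]))
print(beats_chance(0.80, [15, 15]))
print(beats_chance(0.60, [15, 15]))
```

Under these assumptions a hit ratio of 80% clears the threshold for two equal groups, whereas 60% does not.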
