Applied Marketing (Market Research Methods)

Topic 9:

Discriminant analysis

Dr James Abdey

Overview

Discriminant analysis

Discriminant analysis model

Discriminant analysis statistics

Formulate the problem

Estimate the discriminant function coefficients

Determine the significance of discriminant functions

Interpret the results

Assess validity of discriminant analysis

Overview

We discuss the technique of discriminant analysis, initially by examining its relationship to regression analysis

Modelling of discriminant analysis is presented, along with formulation, estimation, significance, interpretation and validation of results

Two-group and multiple group discriminant analyses are introduced

Discriminant analysis

Discriminant analysis is a technique for analysing data when the dependent variable is categorical and the independent variables are metric (measured on an interval or ratio scale)

The main objectives of discriminant analysis are:

Development of discriminant functions, i.e. linear combinations of the independent variables, which will best discriminate between the categories (groups) of the dependent variable

Checking whether significant differences exist among the groups, in terms of the independent variables

Determination of which independent variables contribute to most of the intergroup differences

Classification of cases to one of the groups based on the values of the independent variables, and determining the accuracy of classification

When the dependent variable has two categories, the technique is called two-group discriminant analysis

When three or more categories are involved, the technique is called multiple discriminant analysis

In the two-group case, it is possible to derive only one discriminant function

In the multiple-group case, more than one function may be computed

In general, with M groups and k independent variables, it is possible to estimate up to the smaller of M − 1 and k discriminant functions (for example, three groups and five independent variables allow at most two functions)

The first function has the highest ratio of between-groups to within-groups sum of squares

The second function, uncorrelated with the first, has the second highest ratio, and so on

However, not all the functions may be statistically significant

Discriminant analysis model

The discriminant analysis model involves linear combinations of the following form:

$$D = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_k X_k$$

where

D = discriminant score
β's = discriminant coefficients
X's = independent variables

The coefficients, β, are estimated so that the groups differ as much as possible on the values of the discriminant function

This occurs when the ratio of between-group sum of squares to within-group sum of squares for the discriminant scores is at a maximum
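For the two-group case, this maximisation has a well-known closed form (Fisher's linear discriminant): the coefficient vector is proportional to the inverse of the pooled within-group sums-of-squares-and-cross-products matrix multiplied by the difference between the two group mean vectors. The sketch below illustrates this in plain Python for two independent variables. The data and all function names are hypothetical and not taken from the lecture, and the constant term is left out because it shifts every score equally and so does not change the ratio.

```python
# Hypothetical data: two groups, two predictors (X1, X2) per case.
GROUP_A = [(2, 3), (3, 3), (4, 5), (3, 5)]
GROUP_B = [(6, 5), (7, 8), (8, 7), (7, 6)]


def mean_vector(rows):
    """Column means of a list of (X1, X2) tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def within_sscp(groups):
    """Pooled within-group sums of squares and cross-products (2x2)."""
    w = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups:
        m = mean_vector(rows)
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for a in range(2):
                for b in range(2):
                    w[a][b] += d[a] * d[b]
    return w


def fisher_weights(group_1, group_2):
    """Solve W * beta = (m1 - m2) for beta using the 2x2 inverse."""
    w = within_sscp([group_1, group_2])
    m1, m2 = mean_vector(group_1), mean_vector(group_2)
    diff = (m1[0] - m2[0], m1[1] - m2[1])
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    return [
        (w[1][1] * diff[0] - w[0][1] * diff[1]) / det,
        (w[0][0] * diff[1] - w[1][0] * diff[0]) / det,
    ]


def ss_ratio(weights, groups):
    """Between-group / within-group sum of squares of the scores."""
    scores = [[weights[0] * r[0] + weights[1] * r[1] for r in rows]
              for rows in groups]
    n_total = sum(len(g) for g in scores)
    grand = sum(sum(g) for g in scores) / n_total
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in scores)
    within = sum((s - sum(g) / len(g)) ** 2 for g in scores for s in g)
    return between / within


beta = fisher_weights(GROUP_A, GROUP_B)
best = ss_ratio(beta, [GROUP_A, GROUP_B])
# Any other direction, e.g. using X1 alone, gives a ratio no larger than `best`.
x1_only = ss_ratio([1.0, 0.0], [GROUP_A, GROUP_B])
print(beta, best, x1_only)
```

Rescaling `beta` by any non-zero constant leaves the ratio unchanged, which is why statistical packages apply an additional normalisation before reporting coefficients.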

Discriminant analysis statistics

Canonical correlation – the extent of association between the discriminant scores and the groups. It is a measure of association between a discriminant function and the set of dummy variables that define the group membership

Centroid – mean values for the discriminant scores for a particular group. There are as many centroids as there are groups with one centroid per group

Classification matrix – contains the number of correctly classified and misclassified cases

Discriminant function coefficients – the (unstandardised) multipliers of independent variables, when the variables are in the original units of measurement

Discriminant scores – the discriminant function coefficients are multiplied by the values of the respective independent variables. These products are summed and added to the constant term to obtain the discriminant scores
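The calculation of discriminant scores, and of a group centroid as the mean of those scores, can be sketched as follows. The coefficient values, the cases and the function names are made up for illustration only.

```python
def discriminant_score(constant, coefficients, values):
    """D = b0 + b1*X1 + ... + bk*Xk for one case."""
    if len(coefficients) != len(values):
        raise ValueError("one coefficient is needed per independent variable")
    return constant + sum(b * x for b, x in zip(coefficients, values))


def centroid(constant, coefficients, cases):
    """Mean discriminant score over all cases belonging to one group."""
    scores = [discriminant_score(constant, coefficients, c) for c in cases]
    return sum(scores) / len(scores)


# Hypothetical unstandardised coefficients and two cases from one group.
B0, B = -1.0, [0.5, 0.25]
group_cases = [(2, 4), (4, 4)]
print(discriminant_score(B0, B, group_cases[0]))  # 1.0
print(centroid(B0, B, group_cases))               # 1.5
```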

Eigenvalue – For each discriminant function, the eigenvalue is the ratio of between-group to within-group sums of squares. Larger eigenvalues indicate better functions
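As a small numerical illustration, the eigenvalue of a function can be computed directly from its discriminant scores. The sketch also derives the canonical correlation from the eigenvalue using the standard identity sqrt(eigenvalue / (1 + eigenvalue)); that identity is not stated in the lecture and is included here as an assumption, and the scores are hypothetical.

```python
import math


def eigenvalue(score_groups):
    """Between-group SS divided by within-group SS of discriminant scores."""
    all_scores = [s for g in score_groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                  for g in score_groups)
    within = sum((s - sum(g) / len(g)) ** 2 for g in score_groups for s in g)
    return between / within


def canonical_correlation(eig):
    """Canonical correlation implied by an eigenvalue: sqrt(eig / (1 + eig))."""
    return math.sqrt(eig / (1.0 + eig))


# Hypothetical discriminant scores for two groups of three cases each.
scores = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
eig = eigenvalue(scores)  # between SS = 24, within SS = 4
print(eig, canonical_correlation(eig))
```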

Group means and group standard deviations – computed for each independent variable for each group

Standardised discriminant function coefficients – used as the multipliers when the independent variables have been standardised, i.e. have a mean of 0 and a variance of 1

Structure correlations – simple correlations between the predictors and the discriminant function

Total correlation matrix – treating the cases as a single sample, a total correlation matrix is obtained

Wilks’ λ – for each independent variable, Wilks’ λ is the ratio of the within-group sum of squares to the total sum of squares. Its value varies between 0 and 1. Large values of λ (near 1) indicate that the group means do not seem to be different. Small values of λ (near 0) indicate that the group means do seem to be different
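The univariate Wilks’ λ defined above can be computed for a single independent variable as in the following sketch. The data values and the function name are hypothetical; the two examples show one variable whose group means are far apart and one whose group means coincide.

```python
def univariate_wilks_lambda(groups):
    """Within-group SS / total SS for one independent variable."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    total_ss = sum((x - grand_mean) ** 2 for x in values)
    within_ss = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return within_ss / total_ss


# Hypothetical values of one predictor, split by group.
well_separated = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
identical_means = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
print(univariate_wilks_lambda(well_separated))   # 4 / 28, about 0.143
print(univariate_wilks_lambda(identical_means))  # 1.0
```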

Formulate the problem

Identify the objectives, the dependent variable and the independent variables

The dependent variable must consist of two or more mutually exclusive and collectively exhaustive categories

The independent variables should be selected based on a theoretical model or previous research, or the experience of the researcher

One part of the sample, called the estimation sample, is used for estimation of the discriminant function

The other part, called the validation sample, is reserved for validating the discriminant function

Often the estimation and validation samples are drawn so that the proportion of cases in each group matches the distribution in the total sample
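One way to split a sample into estimation and validation parts while keeping the group proportions of the total sample is a stratified random split. The sketch below is only an illustration: the function name, the 50/50 split, the seed and the group labels are all assumptions rather than anything prescribed in the lecture.

```python
import random
from collections import defaultdict


def stratified_split(labels, validation_fraction=0.5, seed=0):
    """Return (estimation_idx, validation_idx), preserving group proportions."""
    by_group = defaultdict(list)
    for index, label in enumerate(labels):
        by_group[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    estimation, validation = [], []
    for indices in by_group.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_validation = round(len(shuffled) * validation_fraction)
        validation.extend(shuffled[:n_validation])
        estimation.extend(shuffled[n_validation:])
    return sorted(estimation), sorted(validation)


# Hypothetical sample: 6 cases in group "buyer", 4 in group "non-buyer".
labels = ["buyer"] * 6 + ["non-buyer"] * 4
est_idx, val_idx = stratified_split(labels, validation_fraction=0.5)
print([labels[i] for i in val_idx])
```

Splitting within each group, rather than across the whole sample at once, is what keeps the group shares in both parts close to those of the total sample.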

Estimate the discriminant function coefficients

The direct method involves estimating the discriminant function so that all the independent variables are included simultaneously

In stepwise discriminant analysis, the independent variables are entered sequentially, based on their ability to discriminate among the groups

Determine the significance of discriminant functions

The null hypothesis that, in the population, the means of all discriminant functions in all groups are equal can be statistically tested

In SPSS this test is based on Wilks’ λ – if several functions are tested simultaneously (as in the case of multiple discriminant analysis), the Wilks’ λ statistic is the product of the univariate λ for each function

If the null hypothesis is rejected, indicating significant discrimination, one can proceed to interpret the results
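The overall Wilks’ λ used in this test can be sketched as a product over the functions, using the fact that each function's univariate λ (within-group over total sum of squares of its scores) equals 1 / (1 + eigenvalue). The lecture does not say how λ is converted into a test statistic; the chi-square transformation below (Bartlett's approximation) is a commonly used choice and is included here as an assumption. The function names and the example numbers are hypothetical.

```python
import math


def wilks_lambda(eigenvalues):
    """Overall Wilks' lambda: product of 1 / (1 + eigenvalue) per function."""
    result = 1.0
    for eig in eigenvalues:
        result *= 1.0 / (1.0 + eig)
    return result


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Chi-square approximation (assumed here) and its degrees of freedom."""
    statistic = -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(lam)
    df = n_predictors * (n_groups - 1)
    return statistic, df


# Hypothetical example: 3 groups, 4 predictors, 60 cases, two functions.
lam = wilks_lambda([1.0, 0.25])  # 0.5 * 0.8
statistic, df = bartlett_chi_square(lam, n_cases=60, n_predictors=4, n_groups=3)
print(lam, statistic, df)
```

The statistic would then be compared with a chi-square distribution on `df` degrees of freedom; a small λ gives a large statistic and hence evidence against equal group means.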

Interpret the results

The interpretation of the discriminant coefficients is similar to that in multiple regression analysis

Given the multicollinearity in the independent variables, there is no unambiguous measure of the relative importance of the independent variables in discriminating between the groups

Nevertheless, we can obtain some idea of the relative importance of the variables by examining the absolute magnitude of the standardised discriminant function coefficients
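To see why standardised coefficients are comparable across variables, the sketch below converts an unstandardised coefficient into a standardised one by multiplying it by the pooled within-group standard deviation of its variable. This conversion is a common convention and is an assumption here, since the lecture does not give the formula; the data, coefficients and function names are hypothetical.

```python
import math


def pooled_within_sd(groups):
    """Pooled within-group standard deviation of one independent variable."""
    n_cases = sum(len(g) for g in groups)
    within_ss = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return math.sqrt(within_ss / (n_cases - len(groups)))


def standardise(coefficient, groups):
    """Standardised coefficient = unstandardised coefficient * pooled SD."""
    return coefficient * pooled_within_sd(groups)


# Hypothetical predictor measured in two different units (second = 2 * first).
in_units_a = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
in_units_b = [[2.0, 4.0, 6.0], [10.0, 12.0, 14.0]]
# Doubling the measurement scale halves the unstandardised coefficient,
# but the standardised coefficient is the same in both cases.
print(standardise(0.8, in_units_a), standardise(0.4, in_units_b))
```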

Some idea of the relative importance of the independent variables can also be obtained by examining the structure correlations

These simple correlations between each independent variable and the discriminant function represent the variance that the independent variable shares with the function

Another aid to interpreting discriminant analysis results is to develop a characteristic profile for each group by describing each group in terms of the group means for the independent variables

Assess validity of discriminant analysis

The discriminant coefficients, estimated by using the estimation sample, are multiplied by the values of the independent variables in the validation sample to generate discriminant scores for the cases in the validation sample

The cases are then assigned to groups based on their discriminant scores and an appropriate decision rule
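The lecture does not specify the decision rule. One simple possibility, shown here purely as an assumption, is to assign each case to the group whose centroid is nearest to its discriminant score; for two groups of equal size this is the same as using the midpoint between the two centroids as a cut-off. The centroids, scores and group labels below are hypothetical.

```python
def assign_group(score, centroids):
    """Assign a case to the group whose centroid is closest to its score."""
    return min(centroids, key=lambda group: abs(score - centroids[group]))


# Hypothetical centroids from the estimation sample.
centroids = {"buyer": 1.5, "non-buyer": -1.5}
validation_scores = [2.1, 0.2, -0.1, -1.9]
print([assign_group(s, centroids) for s in validation_scores])
# ['buyer', 'buyer', 'non-buyer', 'non-buyer']
```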

The hit ratio, or the percentage of cases correctly classified, can then be determined by summing the diagonal elements of the classification matrix and dividing by the total number of cases
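The hit ratio calculation can be sketched as follows, assuming the classification matrix is laid out with actual groups as rows and predicted groups as columns. The counts are invented for illustration.

```python
def hit_ratio(classification_matrix):
    """Share of cases on the diagonal (rows = actual, columns = predicted)."""
    correct = sum(classification_matrix[i][i]
                  for i in range(len(classification_matrix)))
    total = sum(sum(row) for row in classification_matrix)
    return correct / total


# Hypothetical validation-sample classification matrix for two groups.
matrix = [[12, 3],
          [2, 13]]
print(hit_ratio(matrix))  # 25 / 30, about 0.833
```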

It is helpful to compare the percentage of cases correctly classified by discriminant analysis to the percentage that would be obtained by chance

Classification accuracy achieved by discriminant analysis should be at least 25% greater than that obtained by chance
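A minimal sketch of this comparison is given below. It makes two assumptions that the lecture does not spell out: chance accuracy is taken to be the proportional chance criterion (the sum of squared group proportions), and "25% greater" is read in relative terms, i.e. the hit ratio should be at least 1.25 times the chance rate. The function names and group sizes are hypothetical.

```python
def proportional_chance(group_sizes):
    """Chance hit ratio when cases are assigned in proportion to group sizes."""
    n = sum(group_sizes)
    return sum((size / n) ** 2 for size in group_sizes)


def beats_chance(hit_ratio, group_sizes, margin=0.25):
    """True if the hit ratio is at least (1 + margin) times the chance rate."""
    return hit_ratio >= (1 + margin) * proportional_chance(group_sizes)


# Hypothetical validation sample with two groups of 15 cases each.
sizes = [15, 15]
print(proportional_chance(sizes))    # 0.5
print(beats_chance(25 / 30, sizes))  # True, since 0.833 >= 0.625
```

With unequal groups the chance rate rises, so the same hit ratio is less impressive; some analysts instead compare against the share of the largest group, which is a stricter benchmark.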