© Rohit Vishal Kumar WBUT 2009 -1-

FACTOR ANALYSIS

INTRODUCTION
Factor analysis (FA) is an exploratory technique applied to a set of observed variables that seeks to find underlying factors (subsets of variables) from which the observed variables were generated. For example, an individual's response to the questions on a college entrance test is influenced by underlying variables such as intelligence, years in school, age, emotional state on the day of the test, amount of practice taking tests, and so on. The answers to the questions are the observed variables. The underlying, influential variables are the factors.

Factor analysis is carried out on the correlation matrix of the observed variables. A factor is a weighted average of the original variables. The factor analyst hopes to find a few factors from which the original correlation matrix may be generated. Usually the goal of factor analysis is to aid data interpretation. The factor analyst hopes to identify each factor as representing a specific theoretical factor. Therefore, many of the reports from factor analysis are designed to aid in the interpretation of the factors.

Another goal of factor analysis is to reduce the number of variables. The analyst hopes to reduce the interpretation of a 200-question test to the study of 4 or 5 factors. One of the most subtle tasks in factor analysis is determining the appropriate number of factors.

Factor analysis has an infinite number of solutions. If a solution contains two factors, these may be rotated to form a new solution that does just as good a job at reproducing the correlation matrix. Hence, one of the biggest complaints about factor analysis is that the solution is not unique. Two researchers can find two different sets of factors that are interpreted quite differently yet fit the original data equally well.

The program provides the principal axis method of factor analysis. The results may be rotated using varimax or quartimax rotation. The factor scores may be stored for further analysis.

HOW MANY FACTORS?
Several methods have been proposed for determining the number of factors that should be kept for further analysis. Several of these methods will now be discussed. However, remember that important information about possible outliers and linear dependencies may be determined from the factors associated with the relatively small eigenvalues, so these should be investigated as well.

Kaiser (1960) proposed dropping factors whose eigenvalues are less than one, since these provide less information than is provided by a single variable. Jolliffe (1972) feels that Kaiser's criterion is too large. He suggests using a cutoff on the eigenvalues of 0.7 when correlation matrices are analyzed. Other authors note that if the largest eigenvalue is close to one, then holding to a cutoff of one may cause useful factors to be dropped. However, if the largest eigenvalues are several times larger than one, then those near one may be dropped.

Cattell (1966) documented the scree graph, which is described in the SCREE PLOT section below. Studying this chart is probably the most popular method for determining the number of factors, but it is subjective, causing different people to analyze the same data with different results.

Another criterion is to preset a certain percentage of the variation that must be accounted for and then keep enough factors so that this variation is achieved. Usually, however, this cutoff percentage is used as a lower limit. That is, if the designated number of factors does not account for at least 50% of the variance, then the whole analysis is aborted.

ROTATION
Factor analysis finds a set of dimensions (or co-ordinates) in a subspace of the space defined by the set of variables. These co-ordinates are represented as axes. They are orthogonal (perpendicular) to one another. For example, suppose you analyse three variables that are represented in three-dimensional space. Each variable becomes one axis. Now suppose that the data lie near a two-dimensional plane within the three dimensions. A factor analysis of this data should uncover two factors that would account for the two dimensions. You may rotate the axes of this two-dimensional plane while keeping the 90-degree angle between them, just as the blades of a helicopter propeller rotate yet maintain the same angles among themselves. The hope is that rotating the axes will improve your ability to interpret the "meaning" of each factor. Many different types of rotation have been suggested. Most of them were developed for use in factor analysis. The program provides two orthogonal rotation options: varimax and quartimax.

Varimax Rotation: Varimax rotation is the most popular orthogonal rotation technique. In this technique, the axes are rotated to maximise the sum of the variances of the squared loadings within each column of the loadings matrix. Maximising according to this criterion forces the loadings to be either large or small. The hope is that by rotating the factors, you will obtain new factors that are each highly correlated with only a few of the original variables. This simplifies the interpretation of the factor to a consideration of these two or three variables. Another way of stating the goal of varimax rotation is that it clusters the variables into groups; each "group" is actually a new factor. Since varimax seeks to maximise a specific criterion, it produces a unique solution (except for differences in sign). This has added to its popularity.

SCREE PLOT
This is a rough bar plot of the eigenvalues. It enables you to quickly note the relative size of each eigenvalue. Many authors recommend it as a method of determining how many factors to retain. The word scree, first used by Cattell (1966), is usually defined as the rubble at the bottom of a cliff. When using the scree plot, you must determine which eigenvalues form the “cliff” and which form the “rubble.” Cattell & Jaspers (1967) suggest keeping those that make up the cliff plus the first factor of the rubble.

VALIDATING OUTPUT
Phi: This is the Gleason-Staelin redundancy measure of how interrelated the variables are. A value of zero means that there is no correlation among the variables, while a value of one indicates perfect correlation among the variables. It is good to perform factor analysis if the value of Phi is between 0.50 and 1.00.

Bartlett Test, df, Prob: This is Bartlett’s sphericity test (Bartlett, 1950) for testing the null hypothesis that the correlation matrix is an identity matrix (all correlations are zero). If you get a probability value greater than 0.05, you should not perform a factor analysis on the data. The test is valid for large samples (N>150). It uses a Chi-square distribution with p(p-1)/2 degrees of freedom. Note that this test is only available when you analyse a correlation matrix.

FACTOR ANALYSIS – EXAMPLE AND OUTPUT

The raw data are taken from the example given in the Factor Analysis chapter of Naresh K. Malhotra’s Marketing
Research – An Applied Orientation.

V1  V2  V3  V4  V5  V6
 7   3   6   4   2   4
 1   3   2   4   5   4
 6   2   7   4   1   3
 4   5   4   6   2   5
 1   2   2   3   6   2
 6   3   6   4   2   4
 5   3   6   3   4   3
 6   4   7   4   1   4
 3   4   2   3   6   3
 2   6   2   6   7   6
 6   4   7   3   2   3
 2   3   1   4   5   4
 7   2   6   4   1   3
 4   6   4   5   3   6
 1   3   2   2   6   4
 6   4   6   3   3   4
 5   3   6   3   3   4
 7   3   7   4   1   4
 2   4   3   3   6   3
 3   5   3   6   4   6
 1   3   2   3   5   3
 5   4   5   4   2   4
 2   2   1   5   4   4
 4   6   4   6   4   7
 6   5   4   2   1   4
 3   5   4   6   4   7
 4   4   7   2   2   5
 3   7   2   6   4   3
 4   6   3   7   2   7
 2   3   2   4   7   2

The explanation of the data is as follows:

V1 = Prevents cavities
V2 = Gives shiny teeth
V3 = Provides strong gums
V4 = Provides fresh breath
V5 = Prevents tooth decay
V6 = Gives attractive teeth

30 respondents were asked to rate the above six attributes on a scale of 1 - 7 in response to a question as to how important the attributes are while purchasing toothpaste.

The scale used was:

1 = Strongly Disagree
2 = Disagree
3 = Somewhat Disagree
4 = Neither agree nor disagree
5 = Somewhat Agree
6 = Agree
7 = Strongly Agree

So, according to the first respondent, the most important attribute that can influence the purchase of a toothpaste is its ability to prevent cavities (7), then its ability to provide strong gums (6), followed by its ability to provide fresh breath (4) and give attractive teeth (4), then its ability to give shiny teeth (3), and least of all its ability to prevent tooth decay (2).

The objective of the factor analysis exercise is to find out which attributes convey similar meaning and can be clubbed together.

The data entry window for both NCSS and SPSS looks similar and is reproduced below from NCSS

From the Analysis Menu in NCSS we choose “Analysis” -> “Multivariate Analysis” -> “Factor Analysis”. On doing
so, the factor analysis options selection box opens up, which is shown below:

The choices are as follows:

Variables to be included in Factor analysis – V1 to V6
Data Input Format – Regular Data
Factor Rotation – Varimax Rotation
Missing Value Estimation – None
Number of factors to extract – 2 [1]
Maximum Iteration – 6 [2]

[1] SPSS has an option for limiting the number of factors, but by default it generates the full factor analysis, and
the onus of choosing the factors then lies with the investigator.
[2] SPSS determines the number of iterations without intervention from the researcher. NCSS has a default
value of 6, which can be increased or decreased as required. Most factor rotations should converge within 5
iterations, but the best limit is 30.

THE OUTPUT FROM NCSS:

Factor Analysis Report


Page/Date/Time 1 27/02/2002 21:28:28
Database C:\Rohit - Important\Rohit\IIS_WBM\Cases\factor\factor.S0

1. Descriptive Statistics Section


Variables   Count   Mean       Standard Deviation   Communality
V1          30      3.933333   1.981524             0.927653
V2          30      3.9        1.373392             0.561986
V3          30      4.1        2.056948             0.836849
V4          30      4.1        1.373392             0.601356
V5          30      3.5        1.907336             0.789274
V6          30      4.166667   1.391683             0.720238

Table 1 gives us the descriptive statistics.
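
The count, mean and standard deviation columns of Table 1 can be checked directly from the raw data. The following is a minimal sketch in Python using only the standard library. The names DATA and describe are illustrative and are not part of NCSS or SPSS, and the sketch assumes that the reported standard deviation is the sample (n - 1) standard deviation.

```python
import statistics

# Raw ratings from the data table: one row per respondent, columns V1..V6.
DATA = [
    (7, 3, 6, 4, 2, 4), (1, 3, 2, 4, 5, 4), (6, 2, 7, 4, 1, 3),
    (4, 5, 4, 6, 2, 5), (1, 2, 2, 3, 6, 2), (6, 3, 6, 4, 2, 4),
    (5, 3, 6, 3, 4, 3), (6, 4, 7, 4, 1, 4), (3, 4, 2, 3, 6, 3),
    (2, 6, 2, 6, 7, 6), (6, 4, 7, 3, 2, 3), (2, 3, 1, 4, 5, 4),
    (7, 2, 6, 4, 1, 3), (4, 6, 4, 5, 3, 6), (1, 3, 2, 2, 6, 4),
    (6, 4, 6, 3, 3, 4), (5, 3, 6, 3, 3, 4), (7, 3, 7, 4, 1, 4),
    (2, 4, 3, 3, 6, 3), (3, 5, 3, 6, 4, 6), (1, 3, 2, 3, 5, 3),
    (5, 4, 5, 4, 2, 4), (2, 2, 1, 5, 4, 4), (4, 6, 4, 6, 4, 7),
    (6, 5, 4, 2, 1, 4), (3, 5, 4, 6, 4, 7), (4, 4, 7, 2, 2, 5),
    (3, 7, 2, 6, 4, 3), (4, 6, 3, 7, 2, 7), (2, 3, 2, 4, 7, 2),
]


def describe(rows):
    """Return {variable: (count, mean, sample standard deviation)}."""
    summary = {}
    # zip(*rows) turns the list of respondents into one tuple per variable.
    for j, column in enumerate(zip(*rows), start=1):
        summary[f"V{j}"] = (
            len(column),
            statistics.mean(column),
            statistics.stdev(column),  # divides by n - 1 (assumed to match NCSS)
        )
    return summary


if __name__ == "__main__":
    for name, (count, mean, sd) in describe(DATA).items():
        print(f"{name}  {count}  {mean:.6f}  {sd:.6f}")
```

The communality column of Table 1 is not a descriptive statistic; it comes from the factor solution and is therefore not computed here.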



2. Correlation Section
Variables
Variables V1 V2 V3 V4 V5 V6
V1 1.000000 -0.053218 0.873090 -0.086162 -0.857637 0.004168
V2 -0.053218 1.000000 -0.155020 0.572212 0.019746 0.640465
V3 0.873090 -0.155020 1.000000 -0.247788 -0.777848 -0.018069
V4 -0.086162 0.572212 -0.247788 1.000000 -0.006582 0.640465
V5 -0.857637 0.019746 -0.777848 -0.006582 1.000000 -0.136403
V6 0.004168 0.640465 -0.018069 0.640465 -0.136403 1.000000
Phi=0.473692 Log(Det|R|)=-4.254032 Bartlett Test=111.31 DF=15 Prob=0.000000

This is the correlation matrix generated by NCSS. Notice that the main diagonal contains 1.00. This matrix forms
the input for factor analysis in both the centroid method and the principal component method.
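
The summary line printed under the correlation matrix can be reproduced from the matrix itself. The sketch below is in Python with the standard library only. It assumes that Phi is the square root of the mean squared off-diagonal correlation and that the Bartlett statistic is -(N - 1 - (2p + 5)/6) ln|R|; both assumptions reproduce the values printed by NCSS for this data set, but they are not taken from the NCSS documentation. The function names are illustrative.

```python
import math

# Correlation matrix as reported in the Correlation Section (rows and columns V1..V6).
R = [
    [1.000000, -0.053218, 0.873090, -0.086162, -0.857637, 0.004168],
    [-0.053218, 1.000000, -0.155020, 0.572212, 0.019746, 0.640465],
    [0.873090, -0.155020, 1.000000, -0.247788, -0.777848, -0.018069],
    [-0.086162, 0.572212, -0.247788, 1.000000, -0.006582, 0.640465],
    [-0.857637, 0.019746, -0.777848, -0.006582, 1.000000, -0.136403],
    [0.004168, 0.640465, -0.018069, 0.640465, -0.136403, 1.000000],
]


def gleason_staelin_phi(corr):
    """Square root of the mean squared off-diagonal correlation (assumed definition)."""
    p = len(corr)
    off_diag_sq = sum(corr[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
    return math.sqrt(off_diag_sq / (p * (p - 1)))


def bartlett_sphericity(log_det_r, n, p):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test.

    log_det_r is the natural log of the determinant of the correlation matrix,
    n is the number of respondents and p the number of variables.
    """
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det_r
    df = p * (p - 1) // 2
    return chi_square, df


if __name__ == "__main__":
    print(f"Phi = {gleason_staelin_phi(R):.6f}")
    chi_square, df = bartlett_sphericity(-4.254032, n=30, p=6)
    print(f"Bartlett Test = {chi_square:.2f}, DF = {df}")
```

For this data set Phi is about 0.47, slightly below the 0.50 guideline, while the Bartlett probability is far below 0.05. The sample size of 30 is also well under the N>150 for which the Bartlett test is stated to be valid.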

3. Eigenvectors after Varimax Rotation


Factors
Variables Factor1 Factor2
V1 0.591531 -0.123071
V2 -0.128823 -0.527405
V3 0.570093 -0.028183
V4 -0.153863 -0.538050
V5 -0.529954 0.189998
V6 -0.062964 -0.616689

In NCSS we had to specify the number of factors to extract. Initially a large number is chosen, and the number
of factors is then decreased in the selection box until we get the desired number of factors and/or
NCSS stops complaining that it cannot solve the factor analysis. Table 3 gives us the eigenvectors of the factors
after applying the varimax rotation. NCSS does not report the initial factor solution. SPSS, on the other hand, first
generates the initial solution and then applies the rotation to give us the rotated factor matrix.

4. Factor Loadings after Varimax Rotation


Factors
Variables Factor1 Factor2
V1 0.962670 0.030345
V2 -0.054005 -0.747709
V3 0.902385 0.150168
V4 -0.090303 -0.770196
V5 -0.884852 0.079443
V6 0.074402 -0.845401

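Table 4 gives us the factor loadings for the factors, and Table 5 below gives us the communalities after rotation. Each entry in Table 5 is the square of the corresponding loading in Table 4, and the communality of a variable is the sum of its squared loadings across the factors.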

5. Communalities after Varimax Rotation


Factors
Variables Factor1 Factor2 Communality
V1 0.926733 0.000921 0.927653
V2 0.002917 0.559069 0.561986
V3 0.814299 0.022550 0.836849
V4 0.008155 0.593202 0.601356
V5 0.782963 0.006311 0.789274
V6 0.005536 0.714703 0.720238
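
The relationship between Table 4 and Table 5 can be checked with a few lines of arithmetic. This is a minimal Python sketch using the loadings printed in Table 4; the names LOADINGS, communalities and variance_explained are illustrative and do not come from NCSS.

```python
# Rotated factor loadings from Table 4 (NCSS): (Factor1, Factor2) per variable.
LOADINGS = {
    "V1": (0.962670, 0.030345),
    "V2": (-0.054005, -0.747709),
    "V3": (0.902385, 0.150168),
    "V4": (-0.090303, -0.770196),
    "V5": (-0.884852, 0.079443),
    "V6": (0.074402, -0.845401),
}


def communalities(loadings):
    """Return {variable: (squared loading per factor, communality)}."""
    result = {}
    for name, row in loadings.items():
        squared = tuple(value ** 2 for value in row)
        result[name] = (squared, sum(squared))  # communality = row sum of squares
    return result


def variance_explained(loadings):
    """Sum the squared loadings down each factor column."""
    return [sum(value ** 2 for value in column) for column in zip(*loadings.values())]


if __name__ == "__main__":
    for name, (squared, total) in communalities(LOADINGS).items():
        print(name, " ".join(f"{s:.6f}" for s in squared), f"{total:.6f}")
    print("Variance explained per factor:", variance_explained(LOADINGS))
```

The row sums reproduce the Communality column of Table 5 up to rounding. The column sums give the amount of variance that each rotated factor accounts for.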

6. Factor Structure Summary after Varimax Rotation

Factor1 Factor2
V1 V6
V3 V4
V5 V2

Table 6, as generated by NCSS, is an improvement over SPSS. In Table 6 NCSS tries to show which attributes are
covered by factor 1 and which by factor 2. SPSS does not make any attempt to club attributes under the various
factors; this clubbing is left to the researcher. However, the NCSS output should not be taken as final: the
researcher should apply their own judgement and see whether further improvement can be made.
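
One simple way to do this clubbing by hand is to assign each variable to the factor on which it has the largest absolute loading. The Python sketch below applies that rule to the Table 4 loadings. The rule is an assumption made for illustration and is not necessarily the rule NCSS uses; the function name is hypothetical.

```python
# Rotated factor loadings from Table 4 (NCSS): (Factor1, Factor2) per variable.
LOADINGS = {
    "V1": (0.962670, 0.030345),
    "V2": (-0.054005, -0.747709),
    "V3": (0.902385, 0.150168),
    "V4": (-0.090303, -0.770196),
    "V5": (-0.884852, 0.079443),
    "V6": (0.074402, -0.845401),
}


def group_by_dominant_factor(loadings):
    """Assign each variable to the factor with the largest absolute loading."""
    groups = {}
    for name, row in loadings.items():
        dominant = max(range(len(row)), key=lambda k: abs(row[k]))
        groups.setdefault(f"Factor{dominant + 1}", []).append(name)
    return groups


if __name__ == "__main__":
    for factor, members in sorted(group_by_dominant_factor(LOADINGS).items()):
        print(factor, members)
```

With these loadings the rule gives the same two groups as Table 6. Variables whose largest loading is only moderate, or that load on several factors at once, would still need the researcher's judgement.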

7. Plots Section

[Figure: Factor Scores plot. Score1 on the vertical axis (-1.50 to 2.00) against Score2 on the horizontal axis (-3.00 to 2.00); the points are labelled with the numbers 1 to 30.]

THE OUTPUT FROM SPSS:

1. Analysis number 1 List-wise deletion of cases with missing values

Mean Std Dev Label


V1 3.93333 1.98152 Prevents Cavities
V2 3.90000 1.37339 Gives Shiny Teeth
V3 4.10000 2.05695 Strengthens Gums
V4 4.10000 1.37339 Freshens Breath
V5 3.50000 1.90734 Prevents Tooth Decay
V6 4.16667 1.39168 Attractive Teeth

Number of Cases = 30

Extraction 1 for analysis 1, Principal Components Analysis (PC)

Table 1 is the descriptive statistics table as generated from the data.

2. Initial Statistics:

Variable Communality * Factor Eigenvalue Pct of Var Cum Pct


V1 1.00000 * 1 2.73119 45.5 45.5
V2 1.00000 * 2 2.21812 37.0 82.5
V3 1.00000 * 3 0.44160 07.4 89.8
V4 1.00000 * 4 0.34126 05.7 95.5
V5 1.00000 * 5 0.18263 03.0 98.6
V6 1.00000 * 6 0.08521 01.4 100.0

PC extracted 2 factors.

Table 2 shows the start of the analysis. Initially SPSS treats every attribute under study as a factor, so
each of the six variables is assigned an initial communality of 1. The initial eigenvalues are calculated, and
from them the percentage of variation explained by each factor and the cumulative percentage are determined.
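
The percentage columns of Table 2, and the eigenvalue cutoffs attributed to Kaiser (1960) and Jolliffe (1972), can be sketched in Python as follows. The eigenvalues are copied from Table 2; the function names are illustrative. The sketch relies on the fact that the eigenvalues of a correlation matrix of p variables sum to p, so each percentage is the eigenvalue divided by p.

```python
# Initial eigenvalues as reported by SPSS in Table 2.
EIGENVALUES = [2.73119, 2.21812, 0.44160, 0.34126, 0.18263, 0.08521]


def variance_table(eigenvalues):
    """Return [(factor number, eigenvalue, pct of variance, cumulative pct)]."""
    p = len(eigenvalues)  # number of variables; eigenvalues sum to p
    rows, cumulative = [], 0.0
    for k, value in enumerate(eigenvalues, start=1):
        pct = 100.0 * value / p
        cumulative += pct
        rows.append((k, value, pct, cumulative))
    return rows


def count_above(eigenvalues, cutoff):
    """Number of factors whose eigenvalue exceeds the cutoff."""
    return sum(1 for value in eigenvalues if value > cutoff)


if __name__ == "__main__":
    for k, value, pct, cumulative in variance_table(EIGENVALUES):
        bar = "#" * round(value * 10)  # rough text version of a scree plot
        print(f"{k}  {value:.5f}  {pct:5.1f}  {cumulative:5.1f}  {bar}")
    print("Kaiser cutoff (1.0):", count_above(EIGENVALUES, 1.0))
    print("Jolliffe cutoff (0.7):", count_above(EIGENVALUES, 0.7))
```

For this data set both cutoffs retain the same two factors, which matches the "PC extracted 2 factors" line in the SPSS output.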

3. Factor Matrix:
Factor 1 Factor 2
V1 0.92834 0.25323
V2 -0.30053 0.79525
V3 0.93618 0.13089
V4 -0.34158 0.78897
V5 -0.86876 -0.35079
V6 -0.17664 0.87116

Table 3 is the final factor matrix without rotation. SPSS extracted two factors, factor 1 and factor 2. The factor loadings
are provided in the above matrix. If no rotation procedure is selected, then Table 4 below is the final output. If
a rotation procedure is selected, then the final rotated matrix, Table 5, is also generated. Note that SPSS does
not provide any sort of clubbing as to which attribute belongs to which factor. It is the work of the researcher to
interpret the factor loadings and do the clubbing.

4. Final Statistics:

Variable Communality * Factor Eigenvalue Pct of Var Cum Pct


V1 0.92594 * 1 2.73119 45.5 45.5
V2 0.72274 * 2 2.21812 37.0 82.5
V3 0.89357 *
V4 0.73915 *
V5 0.87779 *
V6 0.79012 *

VARIMAX rotation 1 for extraction 1 in analysis 1 - Kaiser Normalization.


VARIMAX converged in 3 iterations.

Table 4 presents the final summary of the factor analysis after applying the rotation. As per the table, factor 1 and
factor 2 are the two factors that have been extracted, and they explain 82.5% of the variation present in the data.

5. Rotated Factor Matrix:

Factor 1 Factor 2
V1 0.96189 -0.02663
V2 -0.05721 0.84821
V3 0.93394 -0.14599
V4 -0.09832 0.85410
V5 -0.93313 -0.08401
V6 0.08337 0.88497

Table 5 is the final factor matrix with rotation. Note that the factor loadings are different from those of the unrotated
factor matrix.
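
Although the loadings change, an orthogonal rotation leaves each variable's communality unchanged while making the loadings more extreme. The Python sketch below compares the SPSS matrices in Table 3 and Table 5. The varimax criterion is computed here in its raw form (the sum over factors of the variance of the squared loadings) without the Kaiser normalization that SPSS applies, so it is a simplified illustration rather than the SPSS computation; the function names are illustrative.

```python
# Unrotated loadings (Table 3) and varimax-rotated loadings (Table 5), rows V1..V6.
UNROTATED = [
    (0.92834, 0.25323), (-0.30053, 0.79525), (0.93618, 0.13089),
    (-0.34158, 0.78897), (-0.86876, -0.35079), (-0.17664, 0.87116),
]
ROTATED = [
    (0.96189, -0.02663), (-0.05721, 0.84821), (0.93394, -0.14599),
    (-0.09832, 0.85410), (-0.93313, -0.08401), (0.08337, 0.88497),
]


def communalities(loadings):
    """Sum of squared loadings for each variable (row)."""
    return [sum(value ** 2 for value in row) for row in loadings]


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings (raw form)."""
    total = 0.0
    for column in zip(*loadings):
        squared = [value ** 2 for value in column]
        mean = sum(squared) / len(squared)
        total += sum((s - mean) ** 2 for s in squared) / len(squared)
    return total


if __name__ == "__main__":
    for before, after in zip(communalities(UNROTATED), communalities(ROTATED)):
        print(f"{before:.5f}  {after:.5f}")
    print("criterion before rotation:", varimax_criterion(UNROTATED))
    print("criterion after rotation: ", varimax_criterion(ROTATED))
```

The communalities agree with the Final Statistics table before and after rotation, and the criterion is larger for the rotated matrix, which is what varimax is designed to achieve.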
