
Karl G. Jöreskog — Factor Analysis and Its Extensions

Slide 1: Factor Analysis and Its Extensions

Karl G. Jöreskog, Uppsala University

Alternative titles:
Factor Analysis at 100: The Last 50 Years
Factor Analysis: 50 Years in 50 Minutes

Slide 2: Uppsala Symposium 1953

My history of factor analysis begins just about 50 years ago, in 1953, at the Uppsala Symposium on Psychological Factor Analysis. I was in high school then, fully unaware of what was going on in Uppsala, and I had no idea what factor analysis was. This symposium was hosted by Herman Wold, professor and chair of statistics at Uppsala University. Wold had met Louis and Thelma Thurstone in Stockholm and was inspired by their work on factor analysis. Some prominent people took part in this symposium, including Maurice Bartlett, D. N. Lawley, Georg Rasch, and Peter Whittle. The Uppsala Symposium is of minor importance in the history of factor analysis, but it had a great consequence for me, for six years later Wold suggested that I do a dissertation on factor analysis.

Slide 3: Communalities

Many papers on factor analysis at that time focused on the question: What numbers should be put in the diagonal of the correlation matrix R to make it approximately equal to \Lambda\Lambda', where \Lambda is a p x k matrix of factor loadings?

    R_c \approx \Lambda\Lambda'    (1)

The numbers in the diagonal of R_c are called communalities. Guttman (1956) showed that the squared multiple correlation R_i^2 in the regression of the ith variable on all the other variables is a lower bound for the communality c_i^2 of the ith variable:

    c_i^2 \geq R_i^2    (2)

Slide 4: Uniqueness

The counterpart of the communality c_i^2 is the uniqueness u_i^2 = 1 - c_i^2. Hence, (2) is equivalent to

    u_i^2 \leq 1 - R_i^2 = 1/r^{ii} ,    (3)

where r^{ii} is the ith diagonal element of R^{-1}. In my dissertation I therefore suggested that

    u_i^2 = \theta / r^{ii} ,    (4)

where \theta is a parameter to be estimated.
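As a quick numerical check of Guttman's bound, the sketch below builds a correlation matrix from a one-factor model (so the true communalities are known exactly) and verifies (2) and (3). The loading values are made up for illustration; they come from no real data set.

```python
import numpy as np

# One-factor model: R = lam lam' + diag(1 - lam^2),
# so the true communality of variable i is lam_i^2.
lam = np.array([0.9, 0.8, 0.7, 0.6])
R = np.outer(lam, lam)
np.fill_diagonal(R, 1.0)

R_inv = np.linalg.inv(R)
# Squared multiple correlation of variable i on the others:
# R_i^2 = 1 - 1/r^ii, where r^ii is the ith diagonal of R^{-1}.
smc = 1.0 - 1.0 / np.diag(R_inv)

communality = lam ** 2            # c_i^2
uniqueness = 1.0 - communality    # u_i^2

# Guttman (1956): c_i^2 >= R_i^2, equivalently u_i^2 <= 1/r^ii.
assert np.all(communality >= smc - 1e-12)
assert np.all(uniqueness <= 1.0 / np.diag(R_inv) + 1e-12)
```

Here the bound holds with room to spare; it becomes an equality only in the limit where each variable is perfectly predictable from the others.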
Slide 5: Statistical Formulation

Much of the discussion in the 50s was about procedures for choosing communalities and estimating factor loadings. There was a need for a statistical formulation. So in my dissertation, I suggested that one could estimate the covariance matrix subject to the constraint

    \Sigma = \Lambda\Lambda' + \theta(diag \Sigma^{-1})^{-1}    (5)

and I investigated a simple non-iterative procedure for estimating \theta and \Lambda from the sample covariance matrix S. Later I developed a maximum likelihood method for estimating model (5).

Slide 6: Maximum Likelihood Factor Analysis

Factor analysis as a statistical method was formulated already by Lawley (1940), see also Lawley & Maxwell (1963, 1971), and it was further developed by Anderson & Rubin (1956). However, as late as the mid 60s there was no good way of computing the estimates. Using well-established notation, the problem is to minimize the function

    F_ML(\Lambda, \Psi) = log|\Sigma| + tr(S\Sigma^{-1}) - log|S| - p ,    (6)

where

    \Sigma = \Lambda\Lambda' + \Psi^2 ,    (7)

and \Psi^2 is the diagonal matrix of error variances.

Slide 7: Heywood Cases

Jöreskog (1967) solved this problem by focusing on the concentrated fit function

    f(\Psi) = min_\Lambda F(\Lambda, \Psi) ,    (8)

which could be minimized numerically. If one or more of the \psi_i^2 gets close to zero, this procedure becomes unstable. Jöreskog (1977) therefore developed the procedure further by reparameterizing

    \theta_i = ln \psi_i^2 ,    \psi_i = +(e^{\theta_i})^{1/2} .    (9)

This leads to a very fast and efficient algorithm, the use of which has been very successful.

Slide 8: Other Fit Functions

Unweighted Least Squares (ULS):

    F_ULS(\Lambda, \Psi) = (1/2) tr[(S - \Sigma)^2]    (10)

Generalized Least Squares (GLS):

    F_GLS(\Lambda, \Psi) = (1/2) tr[(I - S^{-1}\Sigma)^2]    (11)

Each of these fit functions can also be minimized by minimizing the corresponding concentrated fit function (8).
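A minimal sketch of the ML fit function (6)-(7) follows; it is not code from the talk. It checks the defining property of (6): the function is zero when the model-implied \Sigma equals S exactly, and positive otherwise. The loadings and error variances are illustrative values.

```python
import numpy as np

def f_ml(S, Lam, Psi2):
    """ML fit function (6): log|Sigma| + tr(S Sigma^{-1}) - log|S| - p,
    with Sigma = Lam Lam' + Psi2 as in (7)."""
    Sigma = Lam @ Lam.T + Psi2
    p = S.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_S = np.linalg.slogdet(S)
    return logdet_Sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_S - p

# A one-factor model with made-up parameter values.
Lam = np.array([[0.9], [0.8], [0.7], [0.6]])
Psi2 = np.diag(1.0 - Lam[:, 0] ** 2)
Sigma = Lam @ Lam.T + Psi2

# At S = Sigma the fit is exact and F_ML = 0; any other S gives F_ML > 0.
exact = f_ml(Sigma, Lam, Psi2)
perturbed = f_ml(Sigma + 0.05 * np.eye(4), Lam, Psi2)
assert abs(exact) < 1e-10
assert perturbed > 0
```

In practice (6) is minimized over \Lambda and \Psi, which is what the concentrated fit function (8) makes numerically tractable.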
Slide 9: Weighted Least Squares

    F_V(\Lambda, \Psi) = (1/2) tr{[(S - \Sigma)V]^2}    (12)

    ULS: V = I    (13)
    GLS: V = S^{-1}    (14)
    ML:  V = \Sigma^{-1}    (15)

ML = Iteratively Reweighted Least Squares

Slide 10: Exploratory Factor Analysis

Exploratory factor analysis is a technique often used to detect and assess latent sources of variation and covariation in observed measurements. It is widely recognized that exploratory factor analysis can be quite useful in the early stages of experimentation or test development. Thurstone's (1938) primary mental abilities, French's (1951) factors in aptitude and achievement tests, and Guilford's (1956) structure of intelligence are good examples of this. The results of an exploratory factor analysis may have heuristic and suggestive value and may generate hypotheses which are capable of more objective testing by other multivariate methods.

Slide 11

As more knowledge is gained about the nature of social and psychological measurements, however, exploratory factor analysis may not be a useful tool and may even become a hindrance. Most studies are to some extent both exploratory and confirmatory, since they involve some variables of known and other variables of unknown composition. The former should be chosen with great care in order that as much information as possible about the latter may be extracted. It is highly desirable that a hypothesis which has been suggested by mainly exploratory procedures should subsequently be confirmed, or disproved, by obtaining new data and subjecting these to more rigorous statistical techniques.

Slide 12

The basic idea of factor analysis is the following. For a given set of response variables x_1, ..., x_p one wants to find a set of underlying latent factors \xi_1, ..., \xi_k, fewer in number than the observed variables. These latent factors are supposed to account for the intercorrelations of the response variables in the sense that when the factors are partialed out from the observed variables, there should no longer remain any correlations between these. If both the observed response variables and the latent factors are measured in deviations from the mean, this leads to the model

    x_i = \lambda_{i1}\xi_1 + \lambda_{i2}\xi_2 + ... + \lambda_{ik}\xi_k + \delta_i ,    (16)

where \delta_i, the unique part of x_i, is assumed to be uncorrelated with \xi_1, \xi_2, ..., \xi_k and with \delta_j for j \neq i. In matrix notation (16) is

    x = \Lambda\xi + \delta    (17)
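The claim that partialing the factors out of the observed variables removes all correlations can be checked directly at the population level. The sketch below (illustrative loadings, not from the talk) forms the implied covariance matrix of model (17) and shows that the partial covariance matrix of x given \xi, computed from Cov(x, \xi) = \Lambda\Phi, is exactly diagonal.

```python
import numpy as np

# Model (17): x = Lam xi + delta, Cov(xi) = Phi, Cov(delta) = Psi2 (diagonal).
Lam = np.array([[0.8, 0.0],
                [0.7, 0.0],
                [0.6, 0.3],
                [0.0, 0.8],
                [0.0, 0.7]])
Phi = np.array([[1.0, 0.4],
                [0.4, 1.0]])
Psi2 = np.diag([0.36, 0.51, 0.50, 0.36, 0.51])

Sigma = Lam @ Phi @ Lam.T + Psi2   # implied covariance matrix of x

# Partial covariance of x given xi, using Cov(x, xi) = Lam Phi:
Cov_x_xi = Lam @ Phi
partial = Sigma - Cov_x_xi @ np.linalg.inv(Phi) @ Cov_x_xi.T

# After the factors are partialed out, only the diagonal Psi2 remains:
assert np.allclose(partial, Psi2)
```

So whatever correlations the observed variables share are carried entirely by the common factors, which is the defining idea of the model.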
Slide 13

The unique part \delta_i consists of two components: a specific factor s_i and a pure random measurement error e_i. These are indistinguishable, unless the measurements x_i are designed in such a way that they can be separately identified (panel designs and multitrait-multimethod designs). The term \delta_i is often called the measurement error in x_i, even though it is widely recognized that this term may also contain a specific factor, as stated above.

Slide 14: Rotation

    \Sigma = \Lambda\Phi\Lambda' + \Psi^2 ,    (18)

where \Phi and \Psi^2 are the covariance matrices of \xi and \delta, respectively. Let T be an arbitrary non-singular matrix of order k x k and let

    \xi* = T\xi ,    \Lambda* = \Lambda T^{-1} ,    \Phi* = T\Phi T' .

Then we have identically

    \Lambda*\Phi*\Lambda*' = \Lambda\Phi\Lambda' .

This shows that at least k^2 independent conditions must be imposed on \Lambda and/or \Phi to make these identified.

Slide 15: Two-Stage Least-Squares

    y = \gamma'x + u ,    (19)

    \hat\gamma = S_xx^{-1} s_xy ,    (20)

    \hat\gamma = (S'_zx S_zz^{-1} S_zx)^{-1} S'_zx S_zz^{-1} s_zy ,    (21)

    Est[ACov(\hat\gamma)] = (n - p)^{-1} \hat\sigma_uu (S'_zx S_zz^{-1} S_zx)^{-1} ,    (22)

    \hat\sigma_uu = s_yy - 2\hat\gamma's_xy + \hat\gamma'S_xx\hat\gamma    (23)

Slide 16: Reference Variables Solution

Partitioning x into two parts x_1 (k x 1) and x_2 (q x 1), where q = p - k, and \delta similarly into \delta_1 (k x 1) and \delta_2 (q x 1), (17) can be written

    x_1 = \xi + \delta_1    (24)
    x_2 = \Lambda_2\xi + \delta_2 ,    (25)

where \Lambda_2 (q x k) consists of the last q = p - k rows of \Lambda. The matrix \Lambda_2 may, but need not, contain a priori specified elements. We say that the model is unrestricted when \Lambda_2 is entirely unspecified and that the model is restricted when \Lambda_2 contains a priori specified elements.
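The rotation indeterminacy of Slide 14 is easy to verify numerically: for any non-singular T, the transformed loadings and factor covariance matrix reproduce \Lambda\Phi\Lambda' exactly, so \Sigma is unchanged. The matrices below are randomly generated for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

Lam = rng.normal(size=(6, 2))       # arbitrary p x k loadings
A = rng.normal(size=(2, 2))
Phi = A @ A.T + np.eye(2)           # a positive definite factor covariance
T = np.array([[2.0, 1.0],
              [0.5, 1.5]])          # any non-singular k x k matrix

Lam_star = Lam @ np.linalg.inv(T)   # Lambda* = Lambda T^{-1}
Phi_star = T @ Phi @ T.T            # Phi* = T Phi T'

# Lam* Phi* Lam*' = Lam T^{-1} (T Phi T') T'^{-1} Lam' = Lam Phi Lam'
assert np.allclose(Lam_star @ Phi_star @ Lam_star.T, Lam @ Phi @ Lam.T)
```

Since T has k^2 free elements, exactly k^2 independent conditions on \Lambda and/or \Phi are needed to pin the solution down, which is the counting argument on the slide.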
Slide 17

Solving (24) for \xi and substituting this into (25) gives

    x_2 = \Lambda_2 x_1 + u ,    (26)

where u = \delta_2 - \Lambda_2\delta_1. Each equation in (26) is of the form (19), but it is not a regression equation, because u is correlated with x_1, since \delta_1 is correlated with x_1. Let

    x_i = \lambda_i'x_1 + u_i    (27)

be the i-th equation in (26), where \lambda_i' is the i-th row of \Lambda_2, and let x^(i) ((q - 1) x 1) be a vector of the remaining variables in x_2. Then u_i is uncorrelated with x^(i), so that x^(i) can be used as instrumental variables for estimating (27). Provided q \geq k + 1, this can be done for each i = 1, 2, ..., q.

Slide 18: Confirmatory Factor Analysis

In a confirmatory factor analysis, the investigator has such knowledge about the factorial nature of the variables that he/she is able to specify that each measure x_i depends only on a few of the factors \xi_j. If x_i does not depend on \xi_j, then \lambda_ij = 0 in (16) (Slide 12). In many applications, the latent factor \xi_j represents a theoretical construct and the observed measures x_i are designed to be indicators of this construct. In this case there is only one non-zero \lambda_ij in each equation (16).

Slide 19: Multigroup Analysis

Consider data from several groups or populations. These may be different nations, states, or regions, culturally or socioeconomically different groups, groups of individuals selected on the basis of some known selection variables, groups receiving different treatments, control groups, etc. In fact, they may be any set of mutually exclusive groups of individuals that are clearly defined. It is assumed that a number of variables have been measured on a number of individuals from each population. This approach is particularly useful in comparing a number of treatment and control groups regardless of whether individuals have been assigned to the groups randomly or not.

Slide 20: Factorial Invariance

Consider the situation where the same tests have been administered in G different groups and the factor analysis model applied in each group:

    x_g = \Lambda_g\xi_g + \delta_g ,    g = 1, 2, ..., G    (28)

The covariance matrix in group g is

    \Sigma_g = \Lambda_g\Phi_g\Lambda_g' + \Psi_g^2    (29)

Hypothesis of factorial invariance:

    \Lambda_1 = \Lambda_2 = ... = \Lambda_G    (30)
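The point of Slide 17 — that (26) is not a regression, but can be estimated with instruments via formula (21) — can be illustrated on a simulated endogenous-regressor example. This is a generic sketch of the sample version of (21), not code from the talk; all data-generating values are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
gamma_true = 0.5

# x and the error u share a common component c, so the OLS formula (20)
# is biased; z is a valid instrument: correlated with x, uncorrelated with u.
c = rng.normal(size=n)
z = rng.normal(size=n)
x = 0.8 * z + c + rng.normal(size=n)
u = c + rng.normal(size=n)
y = gamma_true * x + u

def tsls(y, X, Z):
    """Two-stage least squares: the sample version of (21)."""
    Szz = Z.T @ Z
    Szx = Z.T @ X
    szy = Z.T @ y
    M = Szx.T @ np.linalg.solve(Szz, Szx)
    return np.linalg.solve(M, Szx.T @ np.linalg.solve(Szz, szy))

gamma_ols = (x @ y) / (x @ x)         # formula (20), no instruments
gamma_tsls = tsls(y, x[:, None], z[:, None])[0]

assert abs(gamma_tsls - gamma_true) < 0.05   # consistent
assert gamma_ols - gamma_true > 0.2          # biased upward by Cov(x, u)
```

In the reference-variables setting the instruments x^(i) are the remaining variables of x_2, exactly as the slide describes.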
Slide 21: Factorial Invariance with Latent Means

Sörbom (1974) extended the model in Slide 20 to include intercepts \tau_g in (28):

    x_g = \tau_g + \Lambda_g\xi_g + \delta_g ,    g = 1, 2, ..., G    (31)

Under complete factorial invariance,

    \tau_1 = \tau_2 = ... = \tau_G    (32)
    \Lambda_1 = \Lambda_2 = ... = \Lambda_G ,    (33)

he showed that one can estimate the mean vector and covariance matrix of \xi in each group on a scale common to all groups.

Slide 22

Let \bar z_g and S_g be the sample mean vector and covariance matrix in group g, and let \mu_g(\theta) and \Sigma_g(\theta) be the corresponding population mean vector and covariance matrix, g = 1, 2, ..., G. The fit function for the multigroup case is defined as

    F(\theta) = \sum_{g=1}^G (N_g/N) F_g(\theta) ,    (34)

where F_g(\theta) = F(\bar z_g, S_g, \mu_g(\theta), \Sigma_g(\theta)) is any of the fit functions defined for a single group. Here N_g is the sample size in group g and N = N_1 + N_2 + ... + N_G is the total sample size. To test the model, one can again use c = (N - 1) times the minimum of F as a \chi^2 with degrees of freedom d = Gk(k + 1)/2 - t, where k is the number of variables.

Slide 23: Econometric Models

    y_t = \alpha + By_t + \Gamma x_t + z_t ,    (35)

    t = 1, 2, ..., N    (36)

    y_ti = \alpha_i + \beta_(i)'y_t(i) + \gamma_(i)'x_t(i) + z_ti ,    (37)

    \Sigma = Cov | y | = | A\Gamma\Phi\Gamma'A' + A\Psi A'    A\Gamma\Phi |
                 | x |   | \Phi\Gamma'A'                      \Phi       | ,    (38)

where A = (I - B)^{-1}.

Slide 24: Some History of LISREL

The idea of combining features of both econometrics and psychometrics into a single mathematical model was born in my mind in the spring of 1970. This idea was inspired by work of Professor Arthur S. Goldberger published in Psychometrika, 1971. The first version of LISREL was a linear structural equation model for latent variables, each with a single observed, possibly fallible, indicator. I presented this model at the conference on Structural Equation Models in the Social Sciences held in Madison, Wisconsin, in November 1970. The proceedings of this conference, edited by Professors Goldberger and Duncan, were published in 1973.
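The multigroup fit function (34) is just a sample-size-weighted sum of single-group fit functions. The sketch below (illustrative parameter values, not from the talk) uses the ML discrepancy as F_g and checks that the pooled fit is zero exactly when every group's model-implied matrix matches its sample matrix.

```python
import numpy as np

def f_ml(S, Sigma):
    """Single-group ML discrepancy F(S, Sigma)."""
    p = S.shape[0]
    return (np.linalg.slogdet(Sigma)[1]
            + np.trace(S @ np.linalg.inv(Sigma))
            - np.linalg.slogdet(S)[1] - p)

def f_multigroup(S_list, N_list, Sigma_list):
    """Pooled fit function (34): sum over g of (N_g / N) F_g."""
    N = sum(N_list)
    return sum((Ng / N) * f_ml(Sg, Sg_model)
               for Sg, Ng, Sg_model in zip(S_list, N_list, Sigma_list))

# Two groups sharing a common loading matrix (factorial invariance, (30))
# but with group-specific factor variance and error variances.
Lam = np.array([[1.0], [0.8], [0.6]])
Sigma1 = 1.0 * Lam @ Lam.T + np.diag([0.3, 0.4, 0.5])
Sigma2 = 1.5 * Lam @ Lam.T + np.diag([0.2, 0.3, 0.4])

# Pooled fit is 0 when each implied matrix matches its sample matrix,
# and positive when the group structures are swapped.
F0 = f_multigroup([Sigma1, Sigma2], [100, 200], [Sigma1, Sigma2])
F1 = f_multigroup([Sigma1, Sigma2], [100, 200], [Sigma2, Sigma1])
assert abs(F0) < 1e-10
assert F1 > 0
```

Because each group contributes in proportion to N_g/N, larger groups dominate the pooled discrepancy, as they should.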
Slide 25

This LISREL model was generalized in 1971-72 to include models previously developed for multiple indicators of latent variables, for confirmatory factor analysis, for simultaneous factor analysis in several populations, and more general models for covariance structures. The basic form of the LISREL model has remained the same ever since and is still the same model as used today. The general form of the LISREL model, due to its flexible specification in terms of fixed and free parameters and simple equality constraints, has proven to be so rich that it can handle not only the large variety of problems studied by hundreds of behavioral science researchers but also complex models, such as multiplicative MTMM models, non-linear models, and time series models, far beyond the type of models for which it was originally conceived.

Slide 26

The first version of LISREL made generally available and with a written manual was LISREL III. It had fixed column input, fixed dimensions, only the maximum likelihood method, and users had to provide starting values for all parameters. The versions that followed demonstrated an enormous development in both statistical methodology and programming technology:

LISREL IV (1978) had Keywords, Free Form Input, and Dynamic Storage Allocation
LISREL V (1981) had Automatic Starting Values, Unweighted and Generalized Least Squares, and Total Effects
LISREL VI (1984) had Parameter Plots, Modification Indices, and Automatic Model Modification
LISREL 7 (1988) had PRELIS, Weighted Least Squares, and Completely Standardized Solution
LISREL 8 (1994) had SIMPLIS, Path Diagrams, and Non-linear Constraints

Slide 27: The LISREL Model in LISREL Notation

[Path diagram: x-indicators of the latent \xi-variables and y-indicators of the latent \eta-variables, with loadings \lambda^(x) and \lambda^(y), structural paths \gamma and \beta, factor covariances \phi, and error terms \delta, \epsilon, \zeta.]

Slide 28: The LISREL Model with Means

    y = \tau_y + \Lambda_y\eta + \epsilon
    x = \tau_x + \Lambda_x\xi + \delta
    \eta = \alpha + B\eta + \Gamma\xi + \zeta

y, x = Observed Variables        \eta, \xi = Latent Variables
\epsilon, \delta = Measurement Errors        \zeta = Structural Errors
\tau_y, \tau_x, \alpha = Intercept Terms
\Lambda_y, \Lambda_x, B, \Gamma = Parameter Matrices
Slide 29: Assumptions

\epsilon is uncorrelated with \eta
\delta is uncorrelated with \xi
\zeta is uncorrelated with \xi
\zeta is uncorrelated with \epsilon and \delta
I - B is non-singular

Slide 30: LISREL as a Mean and Covariance Structure

Let

    \kappa = E(\xi)    \Phi = Cov(\xi)
    \Psi = Cov(\zeta)    \Theta_\epsilon = Cov(\epsilon)    \Theta_\delta = Cov(\delta)

Then it follows that the mean vector \mu and covariance matrix \Sigma of z = (y', x')' are

Slide 31

    \mu = | \tau_y + \Lambda_y A(\alpha + \Gamma\kappa) |
          | \tau_x + \Lambda_x\kappa                    | ,

    \Sigma = | \Lambda_y A(\Gamma\Phi\Gamma' + \Psi)A'\Lambda_y' + \Theta_\epsilon    \Lambda_y A\Gamma\Phi\Lambda_x'        |
             | \Lambda_x\Phi\Gamma'A'\Lambda_y'                                       \Lambda_x\Phi\Lambda_x' + \Theta_\delta | ,

where A = (I - B)^{-1}. The elements of \mu and \Sigma are functions of the elements of \kappa, \Phi, \tau_y, \tau_x, \alpha, \Lambda_y, \Lambda_x, B, \Gamma, \Psi, \Theta_\epsilon, and \Theta_\delta, which are of three kinds:

fixed parameters that have been assigned specified values,
constrained parameters that are unknown but linear or non-linear functions of one or more other parameters, and
free parameters that are unknown and not constrained.

Slide 32: General Mean and Covariance Structures

Let \mu and \Sigma be functions of a parameter vector \theta:

    \mu = \mu(\theta)    \Sigma = \Sigma(\theta)    (39)

or

    \mu = \mu(\theta)    \sigma = \sigma(\theta) ,    (40)

where \sigma is a vector of the non-duplicated elements of \Sigma.
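The implied covariance matrix of Slide 31 can be validated against a Monte Carlo simulation of the model equations from Slide 28. The sketch below does this for a deliberately tiny model (one \xi, one \eta, two indicators each, all intercepts zero); every parameter value is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Minimal LISREL-style model: 1 ksi, 1 eta, 2 x- and 2 y-indicators.
Lam_y = np.array([[1.0], [0.7]])
Lam_x = np.array([[1.0], [0.8]])
B = np.array([[0.0]])            # no eta -> eta paths
Gamma = np.array([[0.6]])
Phi = np.array([[1.0]])          # Cov(ksi)
Psi = np.array([[0.4]])          # Cov(zeta)
Te = np.diag([0.3, 0.3])         # Cov(epsilon)
Td = np.diag([0.2, 0.2])         # Cov(delta)

A = np.linalg.inv(np.eye(1) - B)

# Implied covariance matrix of z = (y', x')' (Slide 31):
Syy = Lam_y @ A @ (Gamma @ Phi @ Gamma.T + Psi) @ A.T @ Lam_y.T + Te
Syx = Lam_y @ A @ Gamma @ Phi @ Lam_x.T
Sxx = Lam_x @ Phi @ Lam_x.T + Td
Sigma = np.block([[Syy, Syx], [Syx.T, Sxx]])

# Monte Carlo check from the model equations themselves.
n = 500_000
ksi = rng.normal(scale=1.0, size=n)
zeta = rng.normal(scale=np.sqrt(0.4), size=n)
eta = A[0, 0] * (Gamma[0, 0] * ksi + zeta)
y = eta[:, None] * Lam_y[:, 0] + rng.normal(scale=np.sqrt(0.3), size=(n, 2))
x = ksi[:, None] * Lam_x[:, 0] + rng.normal(scale=np.sqrt(0.2), size=(n, 2))
S = np.cov(np.hstack([y, x]), rowvar=False)

assert np.max(np.abs(S - Sigma)) < 0.02
```

With half a million draws the sample covariance matrix agrees with the closed-form \Sigma to well within sampling error, confirming the block formulas.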
Slide 33: Some Formulas

Let s be a vector of the non-duplicated elements of S and assume that \sigma = \sigma(\theta). Then

    n^{1/2}(s - \sigma) ~ N(0, \Omega)    (41)

Definitions:

k = number of observed variables
s = (1/2)k(k + 1)
t = number of independent parameters < s
d = s - t

Slide 34

K = D(D'D)^{-1}, where D (k^2 x s) is the duplication matrix

    s = K'vec(S)    vec(S) = Ds

    \Omega (s x s) = n ACov(s) ,    n = N - 1

    W (s x s) = n Est[ACov(s)]

    W_NT = 2K'(\hat\Sigma \otimes \hat\Sigma)K    under NT

    W_NNT = (w_gh,ij)    under NNT

    w_gh,ij = n Est[ACov(s_gh, s_ij)] = m_ghij - s_gh s_ij ,

    m_ghij = (1/N) \sum_{a=1}^N (z_ag - \bar z_g)(z_ah - \bar z_h)(z_ai - \bar z_i)(z_aj - \bar z_j)

Slide 35

    \Delta (s x t) = \partial\sigma/\partial\theta'    evaluated at \hat\theta

    \Delta_c (s x d) = orthogonal complement to \Delta:    \Delta_c'\Delta = 0 ,    [\Delta | \Delta_c] non-singular

Slide 36: Fit Functions

    F = (s - \sigma)'V(s - \sigma) ,

where the weight matrix V (s x s) is defined differently for different fit functions:

    ULS:  V = D'D = diag(1, 2, 1, 2, 2, 1, ...)
    GLS:  V = D'(S^{-1} \otimes S^{-1})D
    ML:   V = D'(\hat\Sigma^{-1} \otimes \hat\Sigma^{-1})D
    WLS:  V = W_NNT^{-1} , or a generalized inverse W_NNT^- if W_NNT is singular
    DWLS: V = D_W^{-1} ,    D_W = diag W ,    with W = W_NT or W = W_NNT
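The duplication matrix machinery of Slide 34 is easy to build and check directly. The sketch below constructs D for k = 3 (ordering the non-duplicated elements row-wise through the lower triangle) and verifies s = K'vec(S), vec(S) = Ds, and that D'D is the diagonal matrix diag(1, 2, 1, 2, 2, 1) appearing in the ULS weight on Slide 36. This is a generic illustration, not code from the slides.

```python
import numpy as np

def duplication_matrix(k):
    """D (k^2 x k(k+1)/2): vec(S) = D s for symmetric S, where s holds the
    non-duplicated elements of S in row-wise lower-triangular order."""
    D = np.zeros((k * k, k * (k + 1) // 2))
    col = 0
    for i in range(k):
        for j in range(i + 1):
            D[j * k + i, col] = 1.0   # index of S[i, j] in column-major vec(S)
            D[i * k + j, col] = 1.0   # index of S[j, i] (same cell when i == j)
            col += 1
    return D

k = 3
D = duplication_matrix(k)
K = D @ np.linalg.inv(D.T @ D)     # K = D(D'D)^{-1}

rng = np.random.default_rng(3)
A = rng.normal(size=(k, k))
S = A @ A.T                        # an arbitrary symmetric matrix
vecS = S.flatten(order="F")        # column-major vec(S)

s = K.T @ vecS                     # s = K' vec(S)
assert np.allclose(D @ s, vecS)    # vec(S) = D s
assert np.allclose(np.diag(D.T @ D), [1, 2, 1, 2, 2, 1])
```

D'D is diagonal because each column of D touches a distinct pair of cells of vec(S), with entry 2 for off-diagonal elements (which appear twice in vec(S)) and 1 for diagonal elements.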
Slide 37: Results

    E = \Delta'V\Delta

    n ACov(\hat\theta) = E^{-1}\Delta'VWV\Delta E^{-1}

    n ACov(s - \hat\sigma) = W - \Delta E^{-1}\Delta'

    c_2 = n(s - \hat\sigma)'\Delta_c(\Delta_c'W_NT\Delta_c)^{-1}\Delta_c'(s - \hat\sigma)

    h_1 = tr[(\Delta_c'W_NT\Delta_c)^{-1}(\Delta_c'W_NNT\Delta_c)]

    c_3 = (d/h_1)c_2

    c_4 = n(s - \hat\sigma)'\Delta_c(\Delta_c'W_NNT\Delta_c)^{-1}\Delta_c'(s - \hat\sigma)

with W = W_NT or W = W_NNT.

Slide 38: The Success of Structural Equation Modeling

There has been an enormous development of structural equation modeling in the last 30 years. Proof:

Thousands of journal articles
Hundreds of dissertations
Numerous books

Slide 39

What are the factors behind this development?

Models can be tested
Computer technology
Simple command language
Path diagram
SEM courses at many universities
Journal of Structural Equation Modeling

Slide 40: The Growth of Structural Equation Modeling

[Bar chart: number of journals and articles by year, 1994-2001; vertical axis 0 to 350.]
Slide 41: Main Virtues of SEM Methodology

SEM has the power to test complex hypotheses involving causal relationships among construct or latent variables
SEM unifies several multivariate methods into one analytic framework
SEM specifically expresses the effects of latent variables on each other and the effects of latent variables on observed variables
SEM can be used to test alternative hypotheses
SEM gives social and behavioral researchers powerful tools for stating theories more exactly, testing theories more precisely, and generating a more thorough understanding of observed data

Slide 42: Retraction

Is it really that great? In the preface of his 1975 book, O. D. Duncan said that he was fascinated by the formal properties of causal models but held a rather agnostic view of their utility. We have certainly made great strides in the formal realm, but are there really any great substantive applications?

Slide 43: Closing the Circle — Back to Factor Analysis

Latent Variable Models

    f(x) = \int h(\xi)g(x | \xi)d\xi    (42)

    f(x) = \int h(\xi) \prod_{i=1}^p g(x_i | \xi)d\xi    (43)

    \xi ~ N(0, I) ,  x | \xi ~ N(\mu + \Lambda\xi, \Psi^2)  ==>  x ~ N(\mu, \Lambda\Lambda' + \Psi^2)    (44)

Slide 44: Binary and Ordinal Variables

    x_i = 1, 2, ..., m_i    (45)

    g_i(x_i = s | \xi) = F(\alpha_s^(i) - \sum_{j=1}^k \beta_ij\xi_j) - F(\alpha_{s-1}^(i) - \sum_{j=1}^k \beta_ij\xi_j)    (46)

    -\infty = \alpha_0^(i) < \alpha_1^(i) < \alpha_2^(i) < ... < \alpha_{m_i - 1}^(i) < \alpha_{m_i}^(i) = +\infty

    NOR: F(t) = \Phi(t) = \int_{-\infty}^t (1/\sqrt{2\pi}) e^{-u^2/2} du    (47)

    POM: F(t) = \Psi(t) = e^t / (1 + e^t)    (48)
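The ordinal response model (46) can be checked numerically: for any monotone CDF F and ordered thresholds, the category probabilities are positive and telescope to 1. The sketch below does this for both the normal (NOR, (47)) and logistic (POM, (48)) link; the threshold values and the linear predictor value are illustrative.

```python
import math

def category_probs(thresholds, beta_dot_xi, F):
    """Category probabilities (46):
    P(x = s | xi) = F(alpha_s - b'xi) - F(alpha_{s-1} - b'xi),
    with alpha_0 = -inf and alpha_m = +inf handled as F = 0 and F = 1."""
    alphas = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [0.0 if a == -math.inf
           else 1.0 if a == math.inf
           else F(a - beta_dot_xi)
           for a in alphas]
    return [cdf[s] - cdf[s - 1] for s in range(1, len(cdf))]

def F_nor(t):   # (47): standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def F_pom(t):   # (48): logistic CDF
    return math.exp(t) / (1.0 + math.exp(t))

thresholds = [-1.0, 0.2, 1.1]   # alpha_1 < alpha_2 < alpha_3, so m_i = 4
p_nor = category_probs(thresholds, 0.6, F_nor)
p_pom = category_probs(thresholds, 0.6, F_pom)

for p in (p_nor, p_pom):
    assert all(q > 0 for q in p)            # monotone F, ordered thresholds
    assert abs(sum(p) - 1.0) < 1e-12        # probabilities telescope to 1
```

Swapping F changes the shape of the category probabilities but not these structural properties, which is why (46) covers both the probit and the proportional-odds formulation.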