
Publisher: Routledge. Informa Ltd, registered in England and Wales, Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK.

Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/hsem20

J. Fernando Vera (University of Granada, Spain) and Chiristian D. Rivera (University of Los Andes, Venezuela)

Published online: 31 Jan 2014.

To cite this article: J. Fernando Vera & Chiristian D. Rivera (2014). A Structural Equation Multidimensional Scaling Model for One-Mode Asymmetric Dissimilarity Data. Structural Equation Modeling: A Multidisciplinary Journal, 21:1, 54-62.


Structural Equation Modeling: A Multidisciplinary Journal, 21: 54–62, 2014

Copyright © Taylor & Francis Group, LLC

ISSN: 1070-5511 print / 1532-8007 online

DOI: 10.1080/10705511.2014.856696

A Structural Equation Multidimensional Scaling Model for One-Mode Asymmetric Dissimilarity Data

J. Fernando Vera¹ and Chiristian D. Rivera²

¹University of Granada, Spain
²University of Los Andes, Venezuela

A multidimensional scaling (MDS) model is proposed for 2-way 1-mode asymmetric dissimilarity data, to estimate the unknown symmetric subjacent dissimilarity matrix while the objects are represented in a low-dimensional space. In the context of least squares MDS allowing transformations, and considering both triangular parts of the asymmetric dissimilarity matrix as effects of the unobserved symmetric dissimilarities, an alternating estimation procedure is proposed in which the unknown symmetric dissimilarity matrix is estimated in a covariance structure framework. Real and artificial data are analyzed to illustrate the proposed procedure.

Keywords: structural equation model

For one-mode proximity data, multidimensional scaling (MDS) usually states a monotone relation between dissimilarities and distances, according to which symmetry constitutes an important subjacent hypothesis in the formulation of the model. Nevertheless, there are practical situations for which the symmetry is not manifest when proximity data are compiled, and the MDS subjacent ideal relation between dissimilarities and distances is then weakened. Since the early work of Kruskal (1964), several procedures have been developed to deal with asymmetry in MDS.

One sort of model (not restricted to the MDS context) is based on the decomposition of the proximity matrix into two components, one symmetric and the other skew-symmetric, as discussed extensively in Zielman and Heiser (1996). Some procedures represent the symmetric component by MDS, with separate specialized visualization techniques for the skew-symmetric component through the singular value decomposition of the skew-symmetric matrix (e.g., Constantine & Gower, 1978; Gower, 1977). Other such decomposition procedures simultaneously deal with the symmetric and the skew-symmetric parts of the data matrix by applying the spectral decomposition of a Hermitian matrix (Escoufier & Grorud, 1980), modifying the distances with a skew-symmetric form (Weeks & Bentler, 1982), embedding skew-symmetries as drift vectors into MDS plots (Borg, 1979; Borg & Groenen, 1995), representing objects by circles with different radii (Okada & Imaizumi, 1987), or representing the asymmetric structure in the MDS configuration derived from the symmetric part using a vector model (Yadohisa & Niki, 2000), among other approaches. See Borg and Groenen (2005, chap. 23) or Saito and Yadohisa (2005, chap. 3) for an extensive overview of this kind of asymmetric MDS model.

In addition to the preceding procedures, other asymmetric MDS models for one-mode dissimilarity data are based on the direct analysis of the raw data matrix. In a nonmetric context, an approach suggested by Kruskal (1964) regards the asymmetry as due to errors; the data are then symmetrized, the STRESS formulation is extended to all matrix entries, or an MDS model is adjusted to the upper and lower triangular parts of the dissimilarity matrix separately. Another approach assumes the asymmetry to be of interest in its own right, and several procedures have been suggested in which the distances are generalized, or based on extensions of scalar product models, among other possibilities (see Saito & Yadohisa [2005, chap. 4] for a further description of these models).

Correspondence should be addressed to J. Fernando Vera, Department of Statistics and O.R., Faculty of Sciences, University of Granada, 18071 Granada, Spain. E-mail: jfvera@ugr.es

In this article we adopt a different perspective, based on the structural equation modeling (SEM) framework, by which


the asymmetry is taken into account as an outcome of measurement errors in MDS, so that asymmetric proximities are fully represented by distances between points in an MDS space. Thus, the upper and lower triangular parts of the observed asymmetric dissimilarity matrix are assumed to be imperfect measurements of an unobserved subjacent symmetric dissimilarity matrix (or latent matrix), to which each triangular matrix (or effect indicator) is linearly related. A typical situation that can be considered under this perspective appears when asymmetric dissimilarities are obtained, for example, from confusion or same–different error-type experiments as in Miller and Nicely (1955), Rothkopf (1957), or Wang and Bilger (1973) (see Zielman and Heiser, 1996, for other illustrative examples). In this situation, the principal aim of the SEM stage is to deal with the unknown symmetric dissimilarity matrix that best explains the observed variability in a covariance structure estimation problem. The effective incorporation of the error terms in the model is an important benefit derived from the covariance structure methodology for the parameter estimation, which helps improve the MDS estimated configuration.

A least squares alternating estimation procedure using SMACOF (de Leeuw & Heiser, 1980) is developed such that in each iteration the MDS configuration is attained, and the unknown symmetric dissimilarities are estimated in a covariance structural problem (see, e.g., Vera, Macías, & Heiser, 2009a, 2009b, for such combined MDS procedures in a classification context).

THE MODEL

Let us consider a set of n objects, O = {oi | i = 1, . . . , n}, and denote by Δ = (δij), i, j = 1, . . . , n, an observed asymmetric matrix of dissimilarities, where Δ1 and Δ2 denote the upper and lower triangular parts of Δ, respectively, written in vector form. In this context, both triangular parts of Δ are considered as error-contaminated observed measurements that are linearly related to an unknown symmetric dissimilarity matrix Δ* = (δ*ij). The aim of MDS is to find a configuration, X, of n points xi, i = 1, . . . , n, usually in a Euclidean space of dimension K, that best approximates the unknown symmetric matrix of dissimilarities Δ* by means of the matrix of Euclidean distances D(X) = (dij(X)) given by

    dij(X) = ( Σ_{k=1}^{K} (xik − xjk)² )^{1/2}.

Thus, the MDS problem can be formulated in a least squares framework, by minimizing

    STRESS = Σ_{i<j} ( d̂ij − dij(X) )²,    (1)

where in this context, the d̂ij values, i, j = 1, . . . , n, represent symmetric dissimilarities or, in general, transformed dissimilarities also referred to as disparities (see Vera, Heiser, & Murillo, 2007, for a global MDS algorithm in any Minkowski metric).

Because the symmetric subjacent dissimilarity matrix Δ* is a priori unknown in the asymmetric MDS framework, the matrix entries arranged in vector form can be considered elements of a latent variable denoted by δ*, and the covariance structure methodology can be employed for their estimation. If only one unobserved dependent η variable and one unobserved independent ξ variable are considered, the general structural equation model can be formulated as

    η = γξ + ζ,    (2)

where the γ coefficient represents the direct effect of the ξ variable on the η variable, and ζ is a random disturbance term or error in the equation, which is assumed to be uncorrelated with ξ.

In the particular situation where only one indicator y is linearly related to a latent dependent variable, the general measurement model can be expressed by

    y = λy η + e,    (3)

    x = λx ξ + ε,    (4)

where x is the vector of p observed indicators of the independent ξ variable. The coefficient λy and the vector λx = (λ1, . . . , λp)′ represent the expected effects of η on y and of ξ on x, respectively, and the variable e and the vector of variables ε are measurement error terms assumed to be uncorrelated with each other, and also with the latent variables η and ξ. Without loss of generality, it is assumed that the error variables have mean zero and that x, y, η, and ξ are written as deviation scores. Hence, denoting the vector of parameters of the model by θ, the covariance structure matrix adopts the expression

    Σ(θ) = [ λy²(γ²σξ² + σζ²) + σe²
             γ λy σξ² λx                σξ² λx λx′ + Θε ],    (5)

where σξ² and σζ² are the variances of ξ and ζ, respectively, and σe² and Θε are the variance of e and the covariance matrix of the elements of ε, respectively.

Therefore, when transformations are allowed in the MDS framework, and following the usual covariance structure estimation methodology, it can be assumed that the unknown distances arranged in vector form constitute a latent variable denoted by d*, which can be measured through a coefficient λd using an indicator variable d(X), the values of which are obtained from a previous estimation of the configuration matrix X.
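To make Equation 1 concrete, the distance matrix D(X) and the raw STRESS of a given configuration can be computed as follows. This is a minimal sketch in Python (the authors' implementation is in MatLab); the three-point configuration is a hypothetical example, not data from the article:

```python
import numpy as np

def distance_matrix(X):
    """Euclidean distances d_ij(X) between the rows of an n x K configuration."""
    diff = X[:, None, :] - X[None, :, :]          # pairwise coordinate differences
    return np.sqrt((diff ** 2).sum(axis=2))

def raw_stress(dhat, X):
    """Raw STRESS of Equation 1: sum over i < j of (dhat_ij - d_ij(X))^2."""
    D = distance_matrix(X)
    iu = np.triu_indices(len(X), k=1)             # upper-triangular pairs i < j
    return ((dhat[iu] - D[iu]) ** 2).sum()

# Hypothetical example: a configuration reproducing its own distances has zero STRESS.
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
dhat = distance_matrix(X)                          # use the true distances as disparities
print(raw_stress(dhat, X))                         # 0.0
```

Any other configuration yields a positive STRESS; for instance, doubling all coordinates doubles every distance, so the loss becomes the sum of the squared original distances.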


Hence, an effect relationship between the latent variable δ* and the observed indicator variables δ1 and δ2 (corresponding to the two triangular parts of the observed dissimilarities, suitably arranged in vector form) can be summarized using the structural parameters λ1 and λ2. Thus, in terms of a covariance structure model, the problem can be formulated by means of the system of structural equations given by

    d* = b δ* + ζ,    (6)

    d(X) = λd d* + e,    (7)

    δ1 = λ1 δ* + ε1,    (8)

    δ2 = λ2 δ* + ε2.    (9)

Equation 6 models the structural spatial relation between the unknown distances D* (measured by D(X)) and the unknown symmetric dissimilarities Δ*, denoted as the latent variables d* and δ*, respectively. In the MDS framework this relation can be perceived as a disparity relation with an error term, where the distances are determined by an alternating estimation procedure. Coefficient b is a regression coefficient that describes the effects of the dissimilarities on the distances, and ζ is the error term of the spatial relationship, with mean zero and variance σζ², which is uncorrelated with δ*. Thus, D̂ = b̂ Δ̂*, where b̂ is also estimated in the SEM phase, represents the optimal disparities in MDS. After the disparities are estimated, the distances can be estimated by considering them as predictors of the symmetric disparities in an alternating estimation procedure using the Guttman transform (de Leeuw & Heiser, 1980).

Equation 7 shows the measurement model for d*, considering d(X) as the sole indicator. To solve the inherent identification problem that arises when only one indicator is considered, we express the error variance as proportional to the variance of the symmetric dissimilarity matrix. Thus, σe² = κ σd(X)², where κ is a number between zero and one. The value of κ can be predetermined or established on the basis of the decrease in the STRESS value at each iteration of the estimation algorithm, and the value of λd is fixed at one.

Equations 8 and 9 show the measurement model for δ* (measured by δ1 and δ2), in which the two triangular parts of Δ are considered as effect indicators of the unobserved symmetric dissimilarity matrix Δ*. The error terms ε1 and ε2 are assumed to be uncorrelated with each other and with δ* and e, and e is assumed to be uncorrelated with d*. In addition, it is assumed that the error terms ε1 and ε2 have the same variance, denoted by σε², which is coherent with the usual MDS framework.

To identify the model globally, the scale for δ* is set by assigning the value σδ*² = 1, and the number of nonredundant elements in Σ is p(p + 1)/2 = 6, where p is the number of observed variables. Thus, θ = (b, λ1, λ2, σε², σζ²)′, and the covariance structure of the effect indicators model can be written as

    Σ(θ) = [ b² + σζ² + κ σd(X)²
             b λx                  λx λx′ + Θε ],    (10)

where λx = (λ1, λ2)′, and Θε = σε² I denotes the diagonal covariance matrix of the vector ε. From this, it is straightforward to show that the model is algebraically identified (see Appendix), and the values of the symmetric dissimilarity matrix Δ* expressed in vector form can be given from the usual regression method of factor score estimation by

    Δ̂* = Δ̃ Σ̂⁻¹ λ̂x,    (11)

where Δ̃ = [δ1 δ2] is the n(n − 1)/2 × 2 block matrix formed by the values of the observed dissimilarities, λ̂x = (λ̂1, λ̂2)′, and Σ̂⁻¹ denotes the inverse matrix of the estimated values of λx λx′ + Θε.

AN ALTERNATE SEM–MDS ALGORITHM

In the usual least squares MDS framework, two main alternating estimation phases are involved when disparities are taken into account. In the first, the parameters of the model are estimated assuming the disparities are known; in the second, the disparities are estimated when the remaining parameters are known. This procedure continues such that, at the end, the loss function (usually the STRESS) is minimized. Then, the aim in the proposed least squares ratio MDS model is to determine a configuration X* such that the distances d*ij = dij(X*) approach as much as possible the symmetric disparities d̂*ij = b̂ δ̂*ij, by minimizing

    STRESS = Σ_{i<j} ( d̂*ij − dij(X*) )².    (12)

It is well known that in the presence of unobserved variables, or measurement errors in the regression analysis, the SEM methodology achieves solutions that are at least as good as those obtained by the usual least squares procedure (Bentler, 1983; Jöreskog, 1978). Thus, for the overall estimation procedure in this context, the disparities can be estimated in the covariance structure framework in an alternating estimation procedure that minimizes the STRESS using SMACOF (de Leeuw & Heiser, 1980); from the previous values of the parameter estimators, the configuration is first estimated by minimizing the STRESS (Equation 12) from the estimated disparities, and then the values of the symmetric dissimilarity matrix Δ* are estimated in a structural equation model of three observed variables, assuming the configuration X and thus the distance matrix D(X)


is known, such that the STRESS (Equation 12) is also decreased.

At the rth iteration, the distance matrix D(X̂(r−1)) is known in the SEM phase, for which the sample covariance matrix S(r) can be written as

    S(r) = [ VAR(D(X̂(r−1)))
             COV(D(X̂(r−1)), Δ1)    VAR(Δ1)
             COV(D(X̂(r−1)), Δ2)    COV(Δ1, Δ2)    VAR(Δ2) ].    (13)

Because no probabilistic assumption is made for the dissimilarities, unweighted least squares estimation is employed in this context (Finney & DiStefano, 2006) for the SEM parameter estimation problem. Using the estimated parameter values, the symmetric dissimilarities are estimated using the usual ordinary least squares regression method relating each observed score and the corresponding factor (Grice, 2001). Both estimation procedures are in consonance with the minimization of the STRESS in an ordinary least squares regression problem given the estimated disparities. Thus, the aim in this SEM phase is to minimize the loss function given by

    F(θ(r)) = ½ tr[ (S(r) − Σ(θ(r)))² ],    (14)

which provides a consistent estimator of θ (Bollen, 1989; Browne, 1982). Then, the estimation process continues to achieve the overall convergence of the STRESS. The proposed structural equation multidimensional scaling (SEMDS) procedure can be summarized as follows:

1. Using the average of Δ1 and Δ2, the initial configuration X(0) is obtained by classical MDS. Then, the optimal symmetric dissimilarity matrix Δ*(0) and coefficient b̂(0) are calculated in the SEM framework by minimizing Equation 14. The initial disparities D̂(0) are then calculated and normalized such that Σ_{ij} d̂ij(0)² = n(n − 1)/2; the initial STRESS value is calculated using Equation 1.

2. In the rth iteration, the configuration X(r) is estimated in the optimal configuration phase by the Guttman transform (see, e.g., Borg & Groenen, 2005) from D̂(r−1),

       X(r) = (1/n) B(X̂(r−1)) X̂(r−1),    (15)

   where the elements of matrix B are given by bii = −Σ_{j≠i} bij, and

       bij = d̂ij(r−1) / dij(X̂(r−1)),   for i ≠ j and dij(X̂(r−1)) ≠ 0,
       bij = 0,                         for i ≠ j and dij(X̂(r−1)) = 0.

3. In the SEM phase, the symmetric dissimilarities Δ*(r) and the value of b̂(r) are then obtained by minimizing Equation 14, using the values of Δ1, Δ2, and the last given distances D(X̂(r−1)). A common trust-region-reflective algorithm for nonlinear least-squares problems can be employed (see, e.g., Moré & Sorensen, 1983, or Coleman, Branch, & Grace, 1999, for further details) to minimize Equation 14 or, equivalently,

       F(θ(r)) = ½ (S11(r) − θ1² − θ5 − κ σd(X̂(r−1))²)² + (S21(r) − θ1θ2)² + ½ (S22(r) − θ2² − θ4)²
               + (S31(r) − θ1θ3)² + (S32(r) − θ2θ3)² + ½ (S33(r) − θ3² − θ4)².    (16)

   Then, the optimal disparities D̂(r) are calculated and normalized such that Σ_{ij} d̂ij(r)² = n(n − 1)/2, to avoid trivial solutions; the STRESS at the rth iteration is then calculated.

4. The difference between two consecutive STRESS values is obtained, and the algorithm returns to the MDS phase until the convergence criterion is attained.

The algorithm stops if the overall convergence criterion is achieved; that is, if the difference between two consecutive values of the STRESS is less than a small positive constant, or when a maximum number of iterations is reached.
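The two alternating phases can be sketched as follows. This is a minimal Python illustration rather than the authors' MatLab code: the Guttman transform of Equation 15 is written in the equivalent sign convention of Borg and Groenen (2005), and the SEM phase is reduced to the factor-score regression of Equation 11 with the loadings, error variance, and b treated as given (the full minimization of Equation 16 is omitted). The demonstration data are hypothetical:

```python
import numpy as np

def distances(X):
    """Euclidean distance matrix D(X) between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def raw_stress(dhat, X):
    """Raw STRESS between disparities dhat and the distances of X (Equation 12)."""
    iu = np.triu_indices(len(X), k=1)
    return ((dhat[iu] - distances(X)[iu]) ** 2).sum()

def guttman_update(X, dhat):
    """One Guttman-transform step X <- (1/n) B(X) X (Equation 15), in the
    standard sign convention of Borg and Groenen (2005)."""
    n = len(X)
    D = distances(X)
    with np.errstate(divide="ignore", invalid="ignore"):
        B = np.where(D > 0, -dhat / D, 0.0)   # b_ij for i != j, 0 when d_ij(X) = 0
    np.fill_diagonal(B, 0.0)
    np.fill_diagonal(B, -B.sum(axis=1))       # b_ii = -sum_{j != i} b_ij
    return B @ X / n

def factor_score_disparities(delta1, delta2, lam, sigma_eps2, b):
    """SEM-phase surrogate: regression factor scores of Equation 11, scaled by b
    to give disparities; lam, sigma_eps2, and b are assumed already estimated."""
    lam = np.asarray(lam, dtype=float)
    Sigma = np.outer(lam, lam) + sigma_eps2 * np.eye(2)   # lam lam' + Theta_eps
    delta_star = np.column_stack([delta1, delta2]) @ np.linalg.solve(Sigma, lam)
    return b * delta_star

# Hypothetical demonstration: SMACOF iterations on fixed disparities.
rng = np.random.default_rng(0)
target = rng.random((10, 2))
dhat_full = distances(target)                 # symmetric "disparities"
X = rng.random((10, 2))                       # arbitrary starting configuration
stress = [raw_stress(dhat_full, X)]
for _ in range(100):
    X = guttman_update(X, dhat_full)
    stress.append(raw_stress(dhat_full, X))
# The STRESS sequence is nonincreasing, as guaranteed by the SMACOF majorization.
```

In the SEMDS algorithm proper, the disparities would be re-estimated between Guttman steps, via Equations 11 and 16, rather than held fixed as in this sketch.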

ILLUSTRATIVE APPLICATIONS

To illustrate the proposed algorithm, it was implemented in MatLab,¹ working on an Intel Core i5 computer with 4 GB of RAM under Microsoft Windows 7. A maximum of 1000 iterations, or a difference in subsequent STRESS values of less than 10⁻⁷, was employed as the overall convergence criterion, together with the iterative cycle in the SEM phase in which Equation 14 is minimized.

¹ The program and data sets are available on request.

To test the performance of the model, 8100 artificial data sets were obtained by generating true distances from a configuration and adding a random error to the given distances as shown in Equations 8 and 9, following a methodology similar to that proposed by Weeks and Bentler (1979). First, a configuration matrix was obtained by randomly generating n points from a uniform distribution in two dimensions, for sizes of n equal to 15, 25, and 50. For each configuration,


the Euclidean distances between the rows were calculated and organized in vector form, representing samples of sizes 105, 300, and 1225, respectively. Errors were drawn from a normal distribution with a mean of zero and variance proportional to the variance of the true distances, with proportion p equal to .05, .10, .15, .25, .50, 1.00, 1.50, 2.00, and 3.00. For cases in which this methodology produced negative values, a constant was added such that the smallest value was zero, and the whole process was repeated 100 times for replications. The factor loadings λ1 and λ2 were set to (0.5,0.5), (0.3,0.7), and (1,2), and the 8100 data sets generated were analyzed with the proposed SEMDS procedure. All data sets were also analyzed using SMACOF, considering the corresponding averaged matrix, given by Δ = (Δ1 + Δ2)/2, as the source symmetric dissimilarities.

For each simulated data set, the correlation coefficient between each pair of perturbed distances Δ1 and Δ2 was calculated, and for the given configuration, the values of the normalized STRESS and of the scaled Procrustes statistic for the recovered and the original configurations were calculated. Correlation values for each pair of factor loadings (λ1, λ2) and each error variance level p were averaged across replications corresponding to all values of n. As expected, for lower values of λ1 and λ2, high correlation values were found only for low error variance levels up to p = .10, but for the higher values of factor loadings, the correlation increased for all values of p, as shown in Figure 1.

The averaged differences between the results obtained by the SMACOF algorithm when applied to the (Δ1 + Δ2)/2 symmetric matrix and the proposed SEMDS algorithm when applied to the asymmetric data sets, in terms of the STRESS and of the Procrustes goodness-of-fit value, are summarized in Figure 2, for each error variance term and factor loadings. Negative values for these differences reflect the superior performance of the proposed SEMDS procedure, compared with the MDS analysis of the symmetrized dissimilarities for the simulated data sets. In general, the SEMDS procedure correctly recovers the original factor loadings in all situations, and for large data sets, it produces the best results for all factor loadings and error levels, both in terms of STRESS and of the Procrustes goodness-of-fit criterion between the recovered and the original configurations.

For smaller data sets and in terms of the STRESS criterion, the two procedures achieved similar results for equal factor loadings (0.5,0.5) up to p = .5, corresponding to an averaged correlation coefficient of .32. For factor loadings (0.3,0.7), the SEMDS procedure achieved better results up to p = .5 for n = 15, corresponding to an average correlation coefficient of .27, and up to p = 2 for n = 25, corresponding to an average correlation coefficient of .087. For the remaining larger error levels (and lower correlation values) in each situation, the SMACOF procedure produced better results for the smaller data sets in terms of the STRESS criterion. Nevertheless, in terms of the Procrustes goodness-of-fit criterion, the results were quite different; in all situations except for the combination of n = 15, p = 2, and factor loadings (0.5,0.5), the proposed SEMDS procedure achieved the best configuration.

To illustrate the performance of the proposed model for a real data set, the classic Miller and Nicely (1955) data of perceptual confusions between 16 English consonant phonemes were used. In this test, female subjects listened to female speakers reading consonant–vowel syllables formed by pairing the consonants /b,d,g,p,t,k,v,D,z,Z,f,θ,s,S,m,n/ with the vowel /a/ (as in father), and the subjects were required to write down the consonant they heard after each syllable was spoken. The confusion (or errors of identification) matrices were compiled under 17 different experimental conditions. The first four 16 × 16 tables given summarize the data obtained when noise-masking conditions produced varying speech-to-noise (S/N) ratios, with the addition of random noise at different levels. Only the first of these four tables is analyzed here, because several entries were zeros for the remaining three tables.

The original similarities were transformed into dissimilarities by the normalization procedure described by Hubert (1972) for this data set. Thus, conditional probabilities were first estimated by dividing each entry of the original matrix by the corresponding row sum. Then, an asymmetric dissimilarity matrix was obtained by considering δij = 1 − sij/(max(sij) + c), where the constant c was selected as 0.001, to avoid zero entries, while maintaining the absolute size of the entries (see Arabie & Soli, 1979).

FIGURE 1 Averaged correlation values between the simulated δ1 and δ2 across all replications and sizes, for each pair of factor loadings and error variance level. (Figure appears in color online.)

Each triangular part of this asymmetric dissimilarity matrix, Δ1 and Δ2, can be considered a different measurement of

FIGURE 2 Averaged difference values for the STRESS (left panels) and for the Procrustes goodness-of-fit values (right panels) between the obtained results for SMACOF and for the proposed SEMDS algorithm for the simulated data sets, for each error variance term and pair of factor loadings. Note. SEMDS = structural equation multidimensional scaling; MDS = multidimensional scaling. (Figure appears in color online.)


an unknown and subjacent symmetric relation between the pair of signals, comprising 120 entries. The asymmetric dissimilarity matrix was analyzed by the proposed SEMDS methodology, and the results obtained were compared with those achieved by the usual SMACOF procedure for ratio MDS when the dissimilarities were first symmetrized as (Δ1 + Δ2)/2. For the SMACOF procedure, a raw STRESS value of 9.0052 was found, for an intercept value of b = 1.6083. For the SEMDS procedure, a lower STRESS value of 8.8448 was found, associated with the parameter values θ = (0.3632, 0.0970, 0.0614, 0.0158, 2.2463e−14)′, with a correlation coefficient value of 0.16 between the two triangular parts of the asymmetric dissimilarity matrix. Thus, the best configuration in terms of the STRESS criterion was achieved by the SEMDS procedure. Furthermore, both procedures showed differences in the given configurations following a Procrustes transformation, especially in the voiced stops and fricative consonants b and Z, as can be appreciated in Figure 3.

FIGURE 3 Representation of the optimal configuration for the SEMDS procedure (dots), and the Procrustes configuration obtained with the SMACOF procedure when the dissimilarities were first symmetrized (crosses). Note. SEMDS = structural equation multidimensional scaling. (Figure appears in color online.)

CONCLUSIONS

Asymmetry is a problematic issue that frequently arises in measuring the relationship between a pair of objects to be represented by MDS, as a monotone relation between dissimilarities and distances is required. When the asymmetry is not of interest in its own right (assuming it appears because of error fluctuations), then the upper and lower triangular parts of the dissimilarity matrix can be considered imperfect measurements of an unobserved subjacent symmetric dissimilarity matrix. In this situation, an average transformation of the two triangular parts of the original matrix is usually advisable to address object representation for one-way, one-mode data in MDS.

In this context, we propose an MDS model allowing transformations for asymmetric dissimilarity data, such that the unknown symmetric dissimilarity matrix is estimated as disparities in an SEM framework. Considering the triangular parts of the asymmetric dissimilarity matrix as effect indicators of the unobserved symmetric dissimilarity matrix (or latent matrix), a least squares alternating estimation procedure using SMACOF is developed such that in each iteration, the MDS configuration is attained while the unknown symmetric dissimilarities are estimated in an algebraically identified covariance structural problem.

To test the performance of the model, several artificial data sets were generated by adding a random error to the true Euclidean distances from a configuration, under several conditions of error variance and data size, for a fixed pair of factor loading values. All generated data sets were analyzed with the proposed SEMDS procedure, and SMACOF was also used, considering the corresponding averaged matrix, given by Δ = (Δ1 + Δ2)/2, as the source symmetric dissimilarities in ratio MDS. For large data sets, the SEMDS procedure gave the best results in all situations, both in terms of STRESS and of the Procrustes goodness-of-fit criterion for the recovered configuration. For small data sets, the SEMDS procedure also achieved the best, or at least equally good, results as those of the MDS procedure for the symmetrized dissimilarities, for data with a high correlation coefficient (larger than .35). For low correlated data sets, and in terms of the Procrustes criterion, the proposed SEMDS procedure also achieved the best configuration in all situations except for n = 15, p = 2, and factor loadings of (0.5,0.5).

The classical Miller and Nicely (1955) data of perceptual confusions between 16 English consonant phonemes were also analyzed, using the data normalization proposed by Hubert (1972). In terms of STRESS, the best configuration was achieved by the SEMDS procedure when it was applied to the asymmetric dissimilarities obtained from the normalized data set, compared to the results produced by the SMACOF procedure for the symmetrized dissimilarities. Both procedures showed differences in their configurations when compared following a Procrustes transformation.

The results obtained show that when the upper and the lower triangular parts of an asymmetric dissimilarity matrix can be considered imperfect measurements of an unobserved subjacent symmetric dissimilarity matrix, the incorporation of the error terms in the model is an important benefit

AN SEMDS MODEL FOR ONE-MODE ASYMMETRIC DISSIMILARITY DATA 61

derived from the SEM methodology for parameter estimation, which helps improve the estimation of the configuration to represent the objects in a low-dimensional space using MDS. This methodology also enables us to address the situation in which the lack of symmetry is significant and the unknown symmetric dissimilarity matrix is a composite matrix, a situation that is currently being investigated by the authors.

ACKNOWLEDGMENTS

The authors would like to thank Albert Satorra and three anonymous reviewers for valuable comments on an earlier draft of this article.

REFERENCES

Arabie, P., & Soli, S. D. (1979). The interface between the type of regression and methods of collecting proximities data. In R. Colledge & J. N. Rayner (Eds.), Multidimensional analysis of large data sets (pp. 90–115). Minneapolis, MN: University of Minnesota Press.
Bentler, P. M. (1983). Simultaneous equation systems as moment structure models. Journal of Econometrics, 22, 13–42.
Bollen, K. A. (1989). Structural equations with latent variables. New York, NY: Wiley.
Borg, I. (1979). Ein Verfahren zur Analyse metrischer asymmetrischer Proximitätsmatrizen [A procedure for analyzing metric asymmetric proximity matrices]. Archiv für Psychologie, 131, 183–194.
Borg, I., & Groenen, P. J. F. (1995). Asymmetries in multidimensional scaling. In F. Faulbaum (Ed.), Softstat '95 (pp. 31–35). Stuttgart, Germany: Lucius.
Borg, I., & Groenen, P. J. F. (2005). Modern multidimensional scaling: Theory and applications (2nd ed.). New York, NY: Springer.
Browne, M. W. (1982). Covariance structures. In D. M. Hawkins (Ed.), Topics in multivariate analysis (pp. 72–141). Cambridge, UK: Cambridge University Press.
Coleman, T. F., Branch, M. A., & Grace, A. (1999). Optimization toolbox: User's guide, version 2. Natick, MA: The MathWorks, Inc.
Constantine, A. G., & Gower, J. C. (1978). Graphic representations of asymmetric matrices. Applied Statistics, 27, 297–304.
de Leeuw, J., & Heiser, W. J. (1980). Multidimensional scaling with restrictions on the configuration. In P. R. Krishnaiah (Ed.), Multivariate analysis (Vol. V, pp. 501–522). Amsterdam, The Netherlands: North-Holland.
Escoufier, Y., & Ground, A. (1980). Analyse factorielle des matrices carrées non symétriques [Factorial analysis of non-symmetric square matrices]. In E. Diday (Ed.), Data analysis and informatics (pp. 263–276). New York, NY: North-Holland.
Finney, S. J., & DiStefano, C. (2006). Nonnormal and categorical data in structural equation models. In G. R. Hancock & R. O. Mueller (Eds.), A second course in structural equation modeling (pp. 269–314). Greenwich, CT: Information Age.
Gower, J. C. (1977). The analysis of asymmetry and orthogonality. In J. R. Barra, F. Brodeau, G. Romier, & B. Van Cutsem (Eds.), Recent developments in statistics (pp. 109–123). Amsterdam, The Netherlands: North-Holland.
Grice, J. W. (2001). Computing and evaluating factor scores. Psychological Methods, 6, 430–450.
Hubert, L. (1972). Some extensions of Johnson's hierarchical clustering algorithms. Psychometrika, 37, 261–274.
Jöreskog, K. G. (1978). Structural analysis of covariance and correlation matrices. Psychometrika, 43, 443–477.
Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29, 1–27.
Miller, G. A., & Nicely, P. E. (1955). An analysis of perceptual confusions among some English consonants. The Journal of the Acoustical Society of America, 27, 338–352.
Moré, J. J., & Sorensen, D. C. (1983). Computing a trust region step. SIAM Journal on Scientific and Statistical Computing, 3, 553–572.
Okada, A., & Imaizumi, T. (1987). Geometric models for asymmetric similarity data. Behaviormetrika, 21, 81–96.
Rothkopf, E. Z. (1957). A measure of stimulus similarity and errors in some paired-associate learning. Journal of Experimental Psychology, 53, 94–101.
Saito, T., & Yadohisa, H. (2005). Data analysis of asymmetric structures: Advanced approaches in computational statistics. New York, NY: Marcel Dekker.
Vera, J. F., Heiser, W. J., & Murillo, A. (2007). Global optimization in any Minkowski metric: A permutation-translation simulated annealing algorithm for multidimensional scaling. Journal of Classification, 24, 277–301.
Vera, J. F., Macías, R., & Heiser, W. J. (2009a). A dual latent class unfolding model for two-way two-mode preference rating data. Computational Statistics and Data Analysis, 53, 3231–3244.
Vera, J. F., Macías, R., & Heiser, W. J. (2009b). A latent class multidimensional scaling model for two-way one-mode continuous rating dissimilarity data. Psychometrika, 74, 297–315.
Wang, M. D., & Bilger, R. C. (1973). Consonant confusions in noise: A study of perceptual features. The Journal of the Acoustical Society of America, 54, 1248–1266.
Weeks, D. G., & Bentler, P. M. (1979). A comparison of linear and monotone multidimensional scaling models. Psychological Bulletin, 86, 349–354.
Weeks, D. G., & Bentler, P. M. (1982). Restricted multidimensional scaling models for asymmetric proximities. Psychometrika, 47, 201–208.
Yadohisa, H., & Niki, N. (2000). Vector representation of asymmetry in multidimensional scaling. Journal of the Japanese Society of Computational Statistics, 13, 1–14.
Zielman, B., & Heiser, W. J. (1996). Models for asymmetric proximities. British Journal of Mathematical and Statistical Psychology, 49, 127–146.

APPENDIX
GLOBAL IDENTIFICATION OF THE SEM MODEL

For the global identification of the model in the SEM step, let us show algebraically how each element of θ = (b, λ1, λ2, σ², σζ²) can be solved for in terms of one or more known-to-be-identified elements of the covariance matrix Σ of the observed variables, given by

\[
\Sigma = \begin{bmatrix}
\mathrm{VAR}(d^{*}) & & \\
\mathrm{COV}(\delta_{1}, d^{*}) & \mathrm{VAR}(\delta_{1}) & \\
\mathrm{COV}(\delta_{2}, d^{*}) & \mathrm{COV}(\delta_{2}, \delta_{1}) & \mathrm{VAR}(\delta_{2})
\end{bmatrix}. \tag{A.1}
\]

Considering the five unknowns in θ, substituting the parameters into the implied covariance matrix Σθ in Equation 10, and considering the covariance structure Σ = Σ(θ), the six derived equations can be solved for the five unknowns in θ as follows:

\[
\mathrm{VAR}(d^{*}) = b^{2} + \sigma_{\zeta}^{2} + k\sigma_{d(X)}^{2} \tag{A.2}
\]

\[
\mathrm{COV}(\delta_{1}, d^{*}) = \lambda_{1} b \tag{A.3}
\]

62 VERA AND RIVERA

\[
\mathrm{COV}(\delta_{2}, d^{*}) = \lambda_{2} b \tag{A.4}
\]

\[
\mathrm{VAR}(\delta_{1}) = \lambda_{1}^{2} + \sigma^{2} \tag{A.5}
\]

\[
\mathrm{COV}(\delta_{2}, \delta_{1}) = \lambda_{1}\lambda_{2} \tag{A.6}
\]

\[
\mathrm{VAR}(\delta_{2}) = \lambda_{2}^{2} + \sigma^{2}. \tag{A.7}
\]

From Equations A.3 and A.4 it follows that λ2 COV(δ1, d*) = λ1 COV(δ2, d*). Thus, considering the two equations derived from this expression (first after multiplying by λ2 and second after multiplying by λ1), and after replacing λ1λ2 by COV(δ1, δ2) from Equation A.6, the values of λ1 and λ2 can be given by

\[
\lambda_{1} = \left( \frac{\mathrm{COV}(\delta_{1}, \delta_{2})\, \mathrm{COV}(\delta_{1}, d^{*})}{\mathrm{COV}(\delta_{2}, d^{*})} \right)^{1/2}, \qquad
\lambda_{2} = \left( \frac{\mathrm{COV}(\delta_{1}, \delta_{2})\, \mathrm{COV}(\delta_{2}, d^{*})}{\mathrm{COV}(\delta_{1}, d^{*})} \right)^{1/2}.
\]

Therefore, σ² can be identified by means of Equation A.5 or A.7, whereas b can be identified by means of Equation A.3 or A.4, and σζ² is also identified by means of Equation A.2.
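The identification argument can be checked numerically. The following Python sketch uses hypothetical parameter values (not taken from the article): it builds the implied second moments from a chosen θ = (b, λ1, λ2, σ², σζ²) and a known distance term kσ²d(X), then recovers every element of θ through Equations A.2 through A.7.

```python
import math

# Hypothetical "true" parameters, for illustration only.
b, lam1, lam2, sig2, sigz2 = 1.5, 0.8, 0.6, 0.25, 0.10
k_sigd2 = 2.0  # k * sigma^2_d(X), treated as known

# Implied second moments (Equations A.2-A.7).
var_d  = b**2 + sigz2 + k_sigd2   # A.2
cov1d  = lam1 * b                 # A.3
cov2d  = lam2 * b                 # A.4
var_d1 = lam1**2 + sig2           # A.5
cov12  = lam1 * lam2              # A.6
var_d2 = lam2**2 + sig2           # A.7

# Recover the parameters from the "observed" moments.
lam1_hat = math.sqrt(cov12 * cov1d / cov2d)  # closed form for lambda_1
lam2_hat = math.sqrt(cov12 * cov2d / cov1d)  # closed form for lambda_2
b_hat    = cov1d / lam1_hat                  # from A.3
sig2_hat = var_d1 - lam1_hat**2              # from A.5
sigz2_hat = var_d - b_hat**2 - k_sigd2       # from A.2

# The recovered values match the true parameters up to floating-point error.
print(lam1_hat, lam2_hat, b_hat, sig2_hat, sigz2_hat)
```

Because six moment equations determine five unknowns, the recovery is overdetermined: here σ² is taken from Equation A.5, but Equation A.7 would give the same value.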

