Correspondence Analysis
Julie Josse, François Husson, Sébastien Lê
Applied Mathematics Department, Agrocampus Ouest
Introduction
Inertia decomposition
Graphical representation
Helps to interpret
History
Theoretical principles: Fisher (1940)
Correspondence Analysis was actively developed in 1965 ... in Rennes!
J.-P. Benzécri: mathematician and linguist
PhD thesis of his student B. Escofier: Correspondence Analysis
The beginning of the "French school"
CA in R packages
anacor (de Leeuw and Mair)
ca (Nenadic and Greenacre)
ade4 (Chessel)
vegan (Dixon)
homals (de Leeuw)
FactoMineR (Husson et al.)
Data, examples
Notations
Aim
Study the relationship (the correspondence) between the two variables, i.e. the gap to independence
Visualize the association between levels
Example
12 perfumes described by 39 words:
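\[
\chi^2 = \sum_{i,j} \frac{(n f_{ij} - n f_{i.} f_{.j})^2}{n f_{i.} f_{.j}} = n \sum_{i,j} \frac{(f_{ij} - f_{i.} f_{.j})^2}{f_{i.} f_{.j}} = n \, \Phi^2
\]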
\[
x_{ij} = \frac{f_{ij} - f_{i.} f_{.j}}{\sqrt{f_{i.} f_{.j}}}
\]
CA: visualize the residuals matrix X (the gap to independence)
As usual, the association structure of X is revealed using the SVD
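As a minimal sketch of this step, the R code below builds the residuals matrix X from a small contingency table and applies the SVD. The 3 x 3 table `N` is a made-up illustration (not the perfume data), and the variable names `P`, `fi`, `fj` and `E` are assumptions chosen for this example.

```r
# Hypothetical 3 x 3 contingency table (illustrative counts, not the perfume data)
N <- matrix(c(20, 10,  5,
              10, 25, 10,
               5, 10, 30),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))

n  <- sum(N)
P  <- N / n           # relative frequencies f_ij
fi <- rowSums(P)      # row margins f_i.
fj <- colSums(P)      # column margins f_.j
E  <- outer(fi, fj)   # independence model f_i. * f_.j

# Residuals matrix: x_ij = (f_ij - f_i. f_.j) / sqrt(f_i. f_.j)
X <- (P - E) / sqrt(E)

# SVD of X; the squared singular values are the eigenvalues lambda_s
sv     <- svd(X)
lambda <- sv$d^2
```

The last singular value is numerically zero, since X has rank at most min(I, J) - 1.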
Total inertia
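\[
\text{Total inertia} = \Phi^2 = \operatorname{trace}(XX') = \sum_s \lambda_s = \frac{\chi^2}{n} = \sum_{i,j} \frac{(f_{ij} - f_{i.} f_{.j})^2}{f_{i.} f_{.j}}
\]
where the $\lambda_s$ are the eigenvalues of $XX'$ (the squared singular values of $X$).
\[
\Phi^2 = \sum_{i,j} \frac{f_{i.}}{f_{.j}} \left( \frac{f_{ij}}{f_{i.}} - f_{.j} \right)^2
\]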
\[
\Phi^2 = \sum_{i,j} \frac{(f_{ij} - f_{i.} f_{.j})^2}{f_{i.} f_{.j}} = \sum_i f_{i.} \sum_j \frac{1}{f_{.j}} \left( \frac{f_{ij}}{f_{i.}} - f_{.j} \right)^2
\]
Total inertia = weighted sum of the squared distances from the row profiles to the average profile:
the weight of a row profile is its mass $f_{i.}$
the squared distance is a Euclidean distance in which each squared difference is divided by the corresponding average value $f_{.j}$
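The decomposition above can be checked numerically. The sketch below, in base R, computes the total inertia as the mass-weighted sum of squared distances from the row profiles to the average profile, then compares it with the chi-square statistic divided by n. The table `N` is a hypothetical example, not the perfume data.

```r
# Hypothetical 3 x 3 contingency table (illustrative counts)
N <- matrix(c(20, 10,  5,
              10, 25, 10,
               5, 10, 30),
            nrow = 3, byrow = TRUE)

n  <- sum(N)
P  <- N / n
fi <- rowSums(P)                      # row masses f_i.
fj <- colSums(P)                      # average row profile f_.j

# Row profiles f_ij / f_i. (each row sums to 1)
row_profiles <- sweep(P, 1, fi, "/")

# Squared distance of each row profile to the average profile,
# each squared difference being divided by f_.j
d2 <- apply(row_profiles, 1, function(p) sum((p - fj)^2 / fj))

# Total inertia = sum of masses times squared distances
inertia <- sum(fi * d2)

# Same quantity obtained from the chi-square statistic
chi2 <- unname(chisq.test(N)$statistic)
all.equal(inertia, chi2 / n)
```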
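$\chi^2$-distance
The row i is a point in $\mathbb{R}^J$ (with the weight $f_{i.}$):
\[
d^2_{\chi^2}(i, l) = \sum_j \frac{1}{f_{.j}} \left( \frac{f_{ij}}{f_{i.}} - \frac{f_{lj}}{f_{l.}} \right)^2
\]
Symmetrically, for two columns j and h:
\[
d^2_{\chi^2}(j, h) = \sum_i \frac{1}{f_{i.}} \left( \frac{f_{ij}}{f_{.j}} - \frac{f_{ih}}{f_{.h}} \right)^2
\]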
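A small R sketch of the distance between two row profiles follows. The function name `chi2_dist_rows` and the table `N` are hypothetical, introduced only for illustration.

```r
# Squared chi-square distance between the profiles of rows i and l
# of a contingency table N (hypothetical helper, for illustration only)
chi2_dist_rows <- function(N, i, l) {
  P  <- N / sum(N)
  fi <- rowSums(P)   # row margins f_i.
  fj <- colSums(P)   # column margins f_.j
  sum((P[i, ] / fi[i] - P[l, ] / fi[l])^2 / fj)
}

# Hypothetical 3 x 3 contingency table (illustrative counts)
N <- matrix(c(20, 10,  5,
              10, 25, 10,
               5, 10, 30),
            nrow = 3, byrow = TRUE)

chi2_dist_rows(N, 1, 2)
```

Dividing each column of the row profiles by sqrt(f_.j) turns this into an ordinary Euclidean distance, which is why standard tools such as `dist()` can be reused on the rescaled profiles.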
In PCA, we look for the dimensions which best represent the variability between individuals (i.e. the distance to the barycenter); in CA, we look for the dimensions which best represent the gap to independence.
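\[
F_s(i) = \frac{1}{\sqrt{\lambda_s}} \sum_j \frac{f_{ij}}{f_{i.}} \, G_s(j)
\]
\[
G_s(j) = \frac{1}{\sqrt{\lambda_s}} \sum_i \frac{f_{ij}}{f_{.j}} \, F_s(i)
\]
where $F_s(i)$ and $G_s(j)$ are the coordinates of row i and column j on dimension s, and $\lambda_s$ is the s-th eigenvalue.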
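These two formulae can be verified numerically. The sketch below computes row and column coordinates from the SVD of the residuals matrix, then rebuilds each set of coordinates from the other. It assumes the usual construction of the coordinates (singular vectors scaled by the singular values and by the inverse square roots of the margins). The table `N` is hypothetical.

```r
# Hypothetical 3 x 3 contingency table (illustrative counts)
N <- matrix(c(20, 10,  5,
              10, 25, 10,
               5, 10, 30),
            nrow = 3, byrow = TRUE)

P  <- N / sum(N)
fi <- rowSums(P)
fj <- colSums(P)
E  <- outer(fi, fj)
X  <- (P - E) / sqrt(E)

sv <- svd(X)
k  <- 1:2                      # non-trivial dimensions: min(I, J) - 1
sqrt_lambda <- sv$d[k]         # sqrt(lambda_s)

# Coordinates of the rows (Fs) and of the columns (Gs)
Fs <- diag(1 / sqrt(fi)) %*% sv$u[, k] %*% diag(sqrt_lambda)
Gs <- diag(1 / sqrt(fj)) %*% sv$v[, k] %*% diag(sqrt_lambda)

row_profiles <- sweep(P, 1, fi, "/")   # f_ij / f_i.
col_profiles <- sweep(P, 2, fj, "/")   # f_ij / f_.j

# Transition formulae: each cloud rebuilt from the other one
Fs_from_G <- (row_profiles %*% Gs) %*% diag(1 / sqrt_lambda)
Gs_from_F <- (t(col_profiles) %*% Fs) %*% diag(1 / sqrt_lambda)

all.equal(Fs_from_G, Fs, check.attributes = FALSE)
all.equal(Gs_from_F, Gs, check.attributes = FALSE)
```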
Graphical representation
[Figure: CA factor map of the perfume data, showing the perfumes and the descriptive words. Axes: Dim 1 (60.46%) and Dim 2 (21.12%).]
The barycenter represents independence
The distance between levels of the same variable can be interpreted
The representations provided are pseudo-barycentric (dilatation): transition formulae
It is not possible to interpret the distance between levels of the two different variables
Helps to interpret
Supplementary information can be added (zero weight)!
Percentage of variance for each axis: information brought by dimension s, $\lambda_s / \sum_k \lambda_k$
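Both points can be sketched in base R. The code below computes the percentage of inertia per dimension, then projects a supplementary column with the transition formula: the column plays no part in building the axes (zero weight) and is only positioned afterwards. The table `N` and the supplementary counts `sup` are hypothetical.

```r
# Hypothetical 3 x 3 contingency table of active columns (illustrative counts)
N <- matrix(c(20, 10,  5,
              10, 25, 10,
               5, 10, 30),
            nrow = 3, byrow = TRUE)

P  <- N / sum(N)
fi <- rowSums(P)
fj <- colSums(P)
E  <- outer(fi, fj)
X  <- (P - E) / sqrt(E)

sv     <- svd(X)
k      <- 1:2                  # non-trivial dimensions
lambda <- sv$d[k]^2            # eigenvalues

# Percentage of inertia carried by each dimension
pct <- 100 * lambda / sum(lambda)

# Row coordinates F_s(i), built from the active columns only
Fs <- diag(1 / sqrt(fi)) %*% sv$u[, k] %*% diag(sqrt(lambda))

# Hypothetical supplementary column: counts over the same three rows
sup         <- c(8, 12, 20)
sup_profile <- sup / sum(sup)  # f_ij / f_.j for the new column

# Transition formula applied to the supplementary profile
G_sup <- (sup_profile %*% Fs) %*% diag(1 / sqrt(lambda))
```

Projecting an active column in this way returns its own coordinates, which is a convenient sanity check.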
Helps to interpret
Contribution: $\mathrm{ctr}_s(i) = f_{i.} \, F_s(i)^2 / \lambda_s$
Be careful: extreme points are not necessarily those which contribute the most to the dimension
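As a rough illustration, the contributions of the rows can be computed directly from the masses, the coordinates and the eigenvalues. The table `N` is hypothetical, and the contributions are expressed here as proportions that sum to 1 on each dimension.

```r
# Hypothetical 3 x 3 contingency table (illustrative counts)
N <- matrix(c(20, 10,  5,
              10, 25, 10,
               5, 10, 30),
            nrow = 3, byrow = TRUE)

P  <- N / sum(N)
fi <- rowSums(P)
fj <- colSums(P)
E  <- outer(fi, fj)
X  <- (P - E) / sqrt(E)

sv     <- svd(X)
k      <- 1:2                  # non-trivial dimensions
lambda <- sv$d[k]^2

# Row coordinates F_s(i)
Fs <- diag(1 / sqrt(fi)) %*% sv$u[, k] %*% diag(sqrt(lambda))

# Contribution of row i to dimension s: f_i. * F_s(i)^2 / lambda_s
ctr_rows <- sweep(fi * Fs^2, 2, lambda, "/")

colSums(ctr_rows)   # each dimension's contributions sum to 1
```

Because the contribution multiplies the squared coordinate by the mass, a point far from the origin but with a small mass can contribute less than a heavier point closer to the origin.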
Practice
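library(FactoMineR)

# Read the perfumes x words table (perfume names in the first column)
perfume <- read.table("perfume.txt", header = TRUE, sep = "\t", row.names = 1)

# CA with columns 16 to 39 treated as supplementary
res.ca <- CA(perfume, col.sup = 16:39)

# Factor maps: first hiding the rows, then hiding the columns
plot(res.ca, invisible = "row")
plot(res.ca, invisible = c("col", "col.sup"))

# Eigenvalues and their bar plot
res.ca$eig
barplot(res.ca$eig[, 1], main = "Eigenvalues", names.arg = 1:nrow(res.ca$eig))

# Coordinates, quality of representation (cos2) and contributions
res.ca$row$coord
res.ca$row$cos2
res.ca$row$contrib
res.ca$col$coord
res.ca$col$cos2
res.ca$col$contrib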