
CLASSIFYING OBJECTIVE INTERESTINGNESS MEASURES

BASED ON THE TENDENCY OF VALUE VARIATION

Nghia Quoc Phan1, Hiep Xuan Huynh2, Fabrice Guillet3, Régis Gras4

TITLE
CLASSIFICATION DES MESURES D'INTÉRÊT OBJECTIVES BASÉE SUR LA
TENDANCE DE VARIATION DES VALEURS

RÉSUMÉ
Ces dernières années, la découverte de connaissances à partir des données a suscité
l'intérêt de nombreux chercheurs, et de nombreux résultats de recherche ont été utilisés
efficacement dans plusieurs domaines de la vie. Les mesures d'intérêt jouent un rôle
important dans le domaine de la découverte de connaissances. C'est pourquoi l'étude des
mesures d'intérêt déjà développées est de plus en plus importante. La plupart des études
se basent sur deux méthodes principales : la classification basée sur les propriétés d'une
mesure et la classification basée sur le comportement d'une mesure. Dans notre étude, nous
examinons la tendance de variation de la valeur des mesures d'intérêt objectives
asymétriques en prenant la dérivée partielle de la fonction qui calcule la valeur de chaque
mesure selon le tableau de contingence 2x2. Nos résultats montrent que les mesures d'intérêt
objectives asymétriques peuvent être classées en considérant la variation (l'augmentation,
la réduction ou la stabilité) à partir des formules dérivées de chaque mesure.
Mots-clés : mesures d'intérêt objectives, classification, dérivées partielles, tendance de
variation des valeurs, intensité d'implication.

ABSTRACT
In recent years, Knowledge Discovery from Databases has attracted the interest of many
researchers, and its results have been applied effectively in many areas of life.
Interestingness measures play an important role in knowledge discovery, so the
classification of interestingness measures is itself an active line of research. Existing
classifications mostly rely on two approaches: classification based on the properties of
measures and classification based on the behavior of measures. In this study, we propose a
new classification method that examines the value variation of asymmetric objective
interestingness measures by taking the partial derivatives of the function that computes
each measure from the 2x2 contingency table. Our results show that asymmetric objective
interestingness measures can be classified by considering the variation (increase, decrease
or stability) indicated by the derivative formulas of each measure.
Keywords: objective interestingness measures, classification, partial derivative, tendency of
value variation, implication intensity.

1 Tra Vinh University, 126 National Road 53, Tra Vinh City, Vietnam, nghiatvnt@gmail.com
2 Can Tho University, 3/2 Street, Ninh Kieu District, Can Tho City, Vietnam, hxhiep@ctu.edu.vn
3 University of Nantes, Polytechnic School of Nantes University, France, fabrice.guillet@univ-nantes.fr
4 University of Nantes, Polytechnic School of Nantes University, France, regisgra@club-internet.fr
VIII Colloque International –VIII International Conference
A.S.I. Analyse Statistique Implicative ––Statistical Implicative Analysis
Radès (Tunisie) - Novembre 2015
http://sites.univ-lyon2.fr/ASI8/

1 Introduction
Interestingness measures (Piatetsky-Shapiro and Matheus, 1994; Guillet and Hamilton,
2007), both subjective and objective (Silberschatz and Tuzhilin, 1995), play an important
role in evaluating the quality of knowledge expressed as association rules. According to
our survey, the study of interestingness measures has developed rapidly over the last ten
years (Jalali-Heravi and Zaïane, 2010; Jon Hills et al., 2012; David Glass, 2013;
Martínez-Ballesteros et al., 2014). Researchers in this field follow two main directions:
(1) proposing new measures (Liu, 2008; David Glass, 2013; Selvarangam and Ramesh Kumar,
2014); (2) studying the properties, behavior and variation trends of the measures in order
to rank, cluster and classify them (Tan et al., 2002; Lenca et al., 2004; Blanchard et al.,
2009; Huynh et al., 2005, 2006, 2007). The second direction aims to help users select
appropriate measures for their particular application. As the number of interestingness
measures keeps increasing, their classification has attracted the attention of many
researchers. Existing work includes: classification based on objective and subjective
criteria (Silberschatz and Tuzhilin, 1995, 1996); evaluation of a good measure based on
three principles (Piatetsky-Shapiro, 1991); evaluation based on key properties, which
concludes that each measure satisfies only some of the properties and may therefore suit
only some specific areas (Tan et al., 2002, 2004); classification based on a multi-criteria
decision system (Lenca et al., 2004, 2008); classification based on three criteria, namely
the subject, the scope and the nature of the measures (Blanchard et al., 2009);
classification of 62 measures by defining 12 properties and assigning values for the first
nine, which helps users select appropriate measures for their specific application
(Maddouri and Gammoudi, 2007); classification of 40 objective interestingness measures by
examining six properties: independence, balance, symmetry, variation, descriptive and
statistical (Huynh et al., 2007, 2011); classification of 62 measures into seven classes,
based on 19 properties of the measure formulations and on classification techniques
(Guillaume et al., 2012); classification of 35 measures by examining their behavior (Huynh
et al., 2005, 2006, 2007); classification of 61 measures by studying more than 110 data
sets (Tew et al., 2013); and examination of the value variation of an individual measure by
taking the partial derivatives of the function that computes it (Gras and Kuntz, 2008).
Determining the variability of interestingness measures is one of the most important
criteria in assessing objective interestingness measures (Huynh et al., 2011). Building on
the study of value variation through the partial derivatives of the Implication index (Gras
and Kuntz, 2008), we propose a new method for classifying objective interestingness
measures. The method examines the partial derivatives of the function that computes each
measure with respect to each input parameter of the data set. It reveals whether a measure
is increasing, decreasing or independent with respect to each input parameter. From this
result, the measures are divided, for each input parameter, into increasing measures,
decreasing measures and independent measures. This classification helps users select
appropriate measures for their application once they know the input parameters of the
actual data set.

This paper is organized into five sections. Section 1 introduces objective interestingness
measures and the methods used to classify them. Section 2 reviews three classification
approaches: classification based on the properties of measures, classification based on
the behavior of measures, and classification based on the value variation of individual
measures. Section 3 details the method for evaluating the behavior of measures according
to the tendency of value variation. Section 4 classifies the asymmetric interestingness
measures and comments on the results on the basis of the partial derivative formulas. The
final section summarizes the main results of the paper.

2 The classification methods of interestingness measures


The classification of measures has recently attracted the interest of many researchers.
It follows three main approaches: classification based on the properties of the measures,
classification based on the behavior of the measures, and classification based on the
value variation of individual measures.

2.1 Classification based on examining of measure properties

Classification based on theoretical properties was the first approach used by many
researchers to classify the measures. At the general level, interestingness measures are
divided into two separate classes (Silberschatz and Tuzhilin, 1996): subjective
interestingness measures and objective interestingness measures. Subjective measures
evaluate knowledge models on the basis of the goals, knowledge and beliefs of users.
Objective measures evaluate knowledge models on the basis of the statistical structure of
the data. At the detailed level, three key principles for a good measure were first
proposed as classification criteria (Piatetsky-Shapiro, 1991): (1) the value of the rule
$a \to b$ equals 0 if a and b are independent; (2) the value increases monotonically with
$a \cap b$; (3) the value decreases monotonically with a or b. Geng and Hamilton (2006)
examined 38 common objective measures against nine criteria: conciseness, coverage,
reliability, peculiarity, diversity, novelty, surprisingness, utility, and actionability.
A multi-criteria decision system was designed to classify 20 measures based on their
theoretical properties and on experimental results on 10 datasets (Lenca et al., 2004,
2008); as a result, the 20 measures were divided into five classes. Forty objective
interestingness measures, reviewed against theoretical properties such as independence,
balance, symmetry, variability, descriptive and statistical, were divided into five
classes: balance, independence, descriptive, statistical and others (Huynh et al., 2007,
2011). A classification extending the properties to 19 and the measures to 61 divided the
measures into seven separate classes (Guillaume et al., 2012).

2.2 Classification based on the behaviors of interestingness measures

Classification based on a survey of the behavior of the measures relies on the
experimental investigation of the interestingness measures on specific data sets. From
the experimental results, we can predict the behavior of the measures for each specific
application area as a basis for their classification. This enables users to choose the
appropriate interestingness measures for their specific application. The assessment of
interestingness measures through their behavior was first proposed to investigate 35
measures by calculating the distance between interestingness measures and applying two
clustering methods: Agglomerative Hierarchical Clustering (AHC) and Partitioning Around
Medoids (PAM) (Huynh et al., 2005). The measures were classified into 16 separate
clusters, which helps decision makers choose the measures that yield the best knowledge.
The behavior of the measures was then surveyed with correlation graph methods. This
approach was conducted on two prototypical datasets with opposite characteristics: a
strongly correlated one (the mushroom dataset) and a weakly correlated one (a synthetic
dataset). The results showed that the correlation between interestingness measures depends
on the nature of the data and on the ranking of the rules, and revealed six stable clusters
among 40 measures (Huynh et al., 2006). Another study examined the behavior of 61 objective
interestingness measures through experiments on 110 different datasets (Tew et al., 2013).
It classified the 61 measures into 21 clusters and confirmed that the selection of measures
depends on the dataset. However, it did not give criteria for selecting a good data set for
inclusion in the experiments.

2.3 Classification based on value variation of each individual measure

Surveying the value variation of each individual interestingness measure reveals whether
the measure tends to increase or decrease with respect to the parameters of the function
that computes it (Gras and Kuntz, 2008). The method takes the partial derivative of that
function with respect to each parameter. The result characterizes the behavior of each
measure. Specifically, from the partial derivative formulas of a measure, we can examine
the relationship between the measure and each parameter, as well as the interdependence
between the parameters of the function.

3 Evaluation of the behavior of objective interestingness measures according to variable value trends

3.1 Objective interestingness measures

Suppose that we have a finite set T of transactions. An association rule is expressed as
$a \to b$, where A and B are two disjoint itemsets ($A \cap B = \emptyset$). Itemset A
(respectively B) is associated with the subset of transactions
$t_A = T(A) = \{t \in T : A \subseteq t\}$ (respectively $t_B = T(B)$). The negated itemset
$\bar{A}$ (respectively $\bar{B}$) is associated with
$t_{\bar{A}} = T(\bar{A}) = T - T(A) = \{t \in T : A \not\subseteq t\}$ (respectively
$t_{\bar{B}} = T(\bar{B})$). In order to accept or reject an association rule $a \to b$, it
is quite common to consider the number $n_{A\bar{B}} = \mathrm{card}(t_A \cap t_{\bar{B}})$
of counter-examples (Gras et al., 2008). Each association rule is described by the
cardinalities $n = |T|$, $n_A = |t_A|$, $n_B = |t_B|$, $n_{\bar{A}} = |t_{\bar{A}}|$ and
$n_{\bar{B}} = |t_{\bar{B}}|$ (Figure 1).
The interestingness value of an association rule under an objective interestingness
measure is then calculated from the cardinalities of the rule:
$m(a \to b) = f(n, n_A, n_B, n_{A\bar{B}}) \in \mathbb{R}$. To simplify the calculation,
the following equivalent transformations
should be used: $n_{AB} = n_A - n_{A\bar{B}}$, $n_{\bar{A}} = n - n_A$,
$n_{\bar{B}} = n - n_B$, $n_{\bar{A}B} = n_B - n_A + n_{A\bar{B}}$,
$n_{\bar{A}\bar{B}} = n - n_B - n_{A\bar{B}}$.

Figure 1. The cardinality of an association rule A ⟶ B


For example, consider two itemsets A and B, where A has one element and B has three
elements: A = {Bread} and B = {Milk, Diapers, Beer}. The association rule has the form
$a \to b$, with $n = 100$, $n_A = 50$, $n_B = 80$ and $n_{A\bar{B}} = 10$.
The objective interestingness measure Support Expectation is given by the formula:
\[ m(a \to b) = f(n, n_A, n_B, n_{A\bar{B}}) = \frac{n_A (n_B - n_A + n_{A\bar{B}})}{n (n - n_A)} \]
Therefore, the interestingness value is:
\[ m(a \to b) = \frac{50 (80 - 50 + 10)}{100 (100 - 50)} = 0.4 \]
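
The transformations and the Support Expectation example above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the function names
`contingency_cells` and `support_expectation` and the dictionary keys are hypothetical and
do not come from the paper; the formulas are the ones given in the text.

```python
def contingency_cells(n, n_a, n_b, n_ab_bar):
    """Derive the four cells of the 2x2 table from (n, n_A, n_B, n_AB_bar)."""
    cells = {
        "A_and_B": n_a - n_ab_bar,            # n_AB
        "A_and_notB": n_ab_bar,               # counter-examples of a -> b
        "notA_and_B": n_b - n_a + n_ab_bar,   # n_(not A)B
        "notA_and_notB": n - n_b - n_ab_bar,  # n_(not A)(not B)
    }
    if min(cells.values()) < 0:
        raise ValueError("inconsistent cardinalities: a cell count is negative")
    return cells


def support_expectation(n, n_a, n_b, n_ab_bar):
    """Support Expectation as written in the text: n_A * n_(not A)B / (n * (n - n_A))."""
    cells = contingency_cells(n, n_a, n_b, n_ab_bar)
    return n_a * cells["notA_and_B"] / (n * (n - n_a))


# Values of the worked example: n = 100, n_A = 50, n_B = 80, n_AB_bar = 10.
cells = contingency_cells(100, 50, 80, 10)
value = support_expectation(100, 50, 80, 10)
print(cells)
print(value)
```

The consistency check on the cells is a design choice of this sketch: it rejects parameter
combinations that cannot come from a real transaction set.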

3.2 Statistical implicative analysis

The theory of statistical implicative analysis was developed to evaluate the learning
behavior of pupils in the teaching of algebra and geometry (Gras, 1979). It has since
influenced many areas such as pedagogy, psychology and computer science. It has developed a
unifying methodology and created synergy between scientific disciplines such as
mathematics, statistics, psychology, education and data mining. It has also provided a
framework for evaluating the strength of implications formed through the acquisition of
knowledge by natural human processes or by artificial processes. In data mining in
particular, the extraction of knowledge from a large set of association rules to support
decision making is not performed manually. Therefore, the development of interestingness
measures plays an important role in the evaluation of association rules, and this is a
clear success of the theory of statistical implicative analysis.

3.3 Examining the behavior of measures according to variable value trends

The study of the stability of the implication index (Gras and Kuntz, 2008) observes the
small variations of this measure in the neighborhood of the parameters $n$, $n_A$, $n_B$
and $n_{A\bar{B}}$. To do this, the partial derivative of the implication index formula is
taken with respect to each parameter. This method has proved useful for studying
interestingness measures, for example to determine whether a measure increases or decreases
and to examine the relationship between the dependent parameters $n$, $n_A$, $n_B$ and
$n_{A\bar{B}}$. The study was then extended to four measures: Loevinger, Lift, MC and
Confidence (Gras et al., 2014). However, these measures were only examined through the
partial derivative with respect to the parameter $n_{A\bar{B}}$. This study confirms the
supporting role of examining measures through their partial derivatives with respect to
each parameter in the study of objective interestingness measures in general and
classification of measures in
particular. On the other hand, the results also showed that the parameters $n$, $n_A$,
$n_B$ and $n_{A\bar{B}}$ play a limited role in indicating whether a measure increases or
decreases.
Example 1. Consider the Added value measure (Sahar, 2003), defined by the formula:
\[ f = \frac{n_A - n_{A\bar{B}}}{n_A} - \frac{n_B}{n} \]
The partial derivatives of f with respect to each parameter $n$, $n_A$, $n_B$,
$n_{A\bar{B}}$ are:
\[ \frac{\partial f}{\partial n} = \frac{n_B}{n^2}; \quad \frac{\partial f}{\partial n_A} = \frac{n_{A\bar{B}}}{n_A^2}; \quad \frac{\partial f}{\partial n_B} = -\frac{1}{n}; \quad \frac{\partial f}{\partial n_{A\bar{B}}} = -\frac{1}{n_A} \]
For $n = 100$, $n_A = 20$, $n_B = 40$ and $n_{A\bar{B}} = 4$, we obtain:
\[ f = 0.4; \quad \frac{\partial f}{\partial n} = 0.004; \quad \frac{\partial f}{\partial n_A} = 0.01; \quad \frac{\partial f}{\partial n_B} = -0.01; \quad \frac{\partial f}{\partial n_{A\bar{B}}} = -0.05 \]
Thus, the interestingness value increases as $n$ and $n_A$ increase, whereas it decreases
as $n_B$ and $n_{A\bar{B}}$ increase. The rate of variation with respect to $n$
(respectively $n_A$) depends on the value of that parameter, whereas the rate of decrease
with respect to $n_B$ (respectively $n_{A\bar{B}}$) does not depend on the value of that
parameter.
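
The derivative formulas of Example 1 can be cross-checked numerically. The sketch below
compares them with central finite differences at the point used in the example. The helper
names (`added_value`, `added_value_gradient`, `central_difference`) and the step size `h`
are assumptions made for illustration; the paper itself works with the closed-form
derivatives only.

```python
def added_value(n, n_a, n_b, n_ab_bar):
    """Added value: (n_A - n_AB_bar) / n_A - n_B / n."""
    return (n_a - n_ab_bar) / n_a - n_b / n


def added_value_gradient(n, n_a, n_b, n_ab_bar):
    """Closed-form partial derivatives, in the order (n, n_A, n_B, n_AB_bar)."""
    return (n_b / n**2, n_ab_bar / n_a**2, -1 / n, -1 / n_a)


def central_difference(f, point, index, h=1e-4):
    """Approximate the partial derivative of f along one coordinate of point."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


point = (100.0, 20.0, 40.0, 4.0)  # n, n_A, n_B, n_AB_bar from Example 1
analytic = added_value_gradient(*point)
numeric = tuple(central_difference(added_value, point, i) for i in range(4))
for name, a, b in zip(("n", "n_A", "n_B", "n_AB_bar"), analytic, numeric):
    print(f"d f / d {name}: analytic={a:.6f} numeric={b:.6f}")
```

Treating the counts as real numbers is what makes the derivative meaningful here; the same
relaxation is implicit in the closed-form derivatives.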
Example 2. Consider the Conviction measure (Brin et al., 1997), defined by the formula:
\[ f = \frac{n_A (n - n_B)}{n \, n_{A\bar{B}}} \]
The partial derivatives of f with respect to each parameter $n$, $n_A$, $n_B$,
$n_{A\bar{B}}$ are:
\[ \frac{\partial f}{\partial n} = \frac{n_A n_B}{n_{A\bar{B}} \, n^2}; \quad \frac{\partial f}{\partial n_A} = \frac{n - n_B}{n \, n_{A\bar{B}}}; \quad \frac{\partial f}{\partial n_B} = -\frac{n_A}{n \, n_{A\bar{B}}}; \quad \frac{\partial f}{\partial n_{A\bar{B}}} = -\frac{n_A (n - n_B)}{n \, n_{A\bar{B}}^2} \]
For $n = 100$, $n_A = 20$, $n_B = 40$ and $n_{A\bar{B}} = 4$, we obtain:
\[ f = 3; \quad \frac{\partial f}{\partial n} = 0.02; \quad \frac{\partial f}{\partial n_A} = 0.15; \quad \frac{\partial f}{\partial n_B} = -0.05; \quad \frac{\partial f}{\partial n_{A\bar{B}}} = -0.75 \]
Unlike the Added value measure, the rate of variation of the Conviction measure depends on
the value of the parameter concerned for $n$ and $n_{A\bar{B}}$, and is independent of it
for $n_A$ and $n_B$. The interestingness value increases as $n$ and $n_A$ increase, and its
rate of growth with respect to $n_A$ does not depend on $n_A$. Conversely, the
interestingness value decreases as $n_B$ and $n_{A\bar{B}}$ increase, and its rate of
decrease with respect to $n_B$ does not depend on $n_B$.
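
As a sketch of the intermediate step, which the text does not spell out, the derivative of
Conviction with respect to n follows directly once the measure is written as a product in
which only one factor depends on n. The numbers are those of Example 2.

```latex
\[
f = \frac{n_A (n - n_B)}{n \, n_{A\bar{B}}}
  = \frac{n_A}{n_{A\bar{B}}} \left( 1 - \frac{n_B}{n} \right)
\quad \Longrightarrow \quad
\frac{\partial f}{\partial n}
  = \frac{n_A}{n_{A\bar{B}}} \cdot \frac{n_B}{n^2}
  = \frac{n_A n_B}{n_{A\bar{B}} \, n^2}
\]
\[
\left. \frac{\partial f}{\partial n} \right|_{(100,\,20,\,40,\,4)}
  = \frac{20 \cdot 40}{4 \cdot 100^2} = 0.02
\]
```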
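Example 3. Consider the Coverage measure (Geng and Hamilton, 2006), defined by the
formula:
\[ f = \frac{n_A}{n} \]
The partial derivatives of f with respect to each of the four parameters $n$, $n_A$,
$n_B$, $n_{A\bar{B}}$ are:
\[ \frac{\partial f}{\partial n} = -\frac{n_A}{n^2}; \quad \frac{\partial f}{\partial n_A} = \frac{1}{n}; \quad \frac{\partial f}{\partial n_B} = 0; \quad \frac{\partial f}{\partial n_{A\bar{B}}} = 0 \]
For $n = 100$, $n_A = 20$, $n_B = 40$ and $n_{A\bar{B}} = 4$, we obtain:
\[ f = 0.2; \quad \frac{\partial f}{\partial n} = -0.002; \quad \frac{\partial f}{\partial n_A} = 0.01; \quad \frac{\partial f}{\partial n_B} = 0; \quad \frac{\partial f}{\partial n_{A\bar{B}}} = 0 \]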

Unlike the two measures above, the rate of variation of the Coverage measure with respect
to $n$ depends on $n$ itself, and the value decreases as $n$ increases. In contrast, the
value increases with $n_A$ at a rate that is independent of $n_A$. The two parameters $n_B$
and $n_{A\bar{B}}$ play no part in the variation of this measure. This is a concrete
example of the limited role of the parameters in indicating the variation of the measures,
as mentioned above.

4 Classification based on examining the tendency of value variation


Based on a comprehensive, recently published list of objective interestingness measures
(Tew et al., 2013) used to evaluate the quality of knowledge in the form of association
rules $a \to b$ (Agrawal and Srikant, 1994), we found that the group of symmetric measures
($m(a \to b) = m(b \to a)$) and the group of asymmetric measures
($m(a \to b) \neq m(b \to a)$) are of almost equal size. According to the five criteria
used to evaluate objective interestingness measures based on a 2x2 contingency table for
variables a and b, a good measure has to be symmetric under permutation of the two
variables a and b (Tan et al., 2002). This means that the two association rules $a \to b$
and $b \to a$ must have the same interestingness value. However, this does not hold in many
applications. The Confidence measure is an example: it is regarded as a good measure by
many researchers and is used in many practical applications, yet it is asymmetric. For this
reason, this paper focuses on the group of asymmetric objective interestingness measures.
We calculate the partial derivatives of the function $f(n, n_A, n_B, n_{A\bar{B}})$ that
computes each measure of this group with respect to its four parameters, in order to survey
the value variability of each measure according to the 2x2 contingency table. From the
result of this survey, the group of asymmetric measures is divided into four different
classes by each parameter. This classification helps researchers and users see the
relationship between the value of each parameter and the value variation of the asymmetric
measures, from which they can select suitable measures for their study and application.
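
The classification idea described above can be illustrated with a small sketch that labels
a measure, for each parameter, by the sign of its partial derivative. Everything in it is
an assumption made for illustration: the names `tendency` and `classify`, the three labels,
the tolerance, and the use of a finite difference at a single point instead of the
closed-form derivatives. A sign observed at one point describes local behavior only and
does not establish a global property.

```python
def confidence(n, n_a, n_b, n_ab_bar):
    return (n_a - n_ab_bar) / n_a


def conviction(n, n_a, n_b, n_ab_bar):
    return n_a * (n - n_b) / (n * n_ab_bar)


def coverage(n, n_a, n_b, n_ab_bar):
    return n_a / n


PARAMETERS = ("n", "n_A", "n_B", "n_AB_bar")


def tendency(f, point, index, h=1e-4, tol=1e-9):
    """Label the local variation of f along one parameter (hypothetical labels)."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    slope = (f(*up) - f(*down)) / (2 * h)
    if slope > tol:
        return "increasing"
    if slope < -tol:
        return "decreasing"
    return "independent"


def classify(measures, point):
    """Map each measure name to its tendency for every parameter at the given point."""
    return {
        name: {p: tendency(f, point, i) for i, p in enumerate(PARAMETERS)}
        for name, f in measures.items()
    }


measures = {"Confidence": confidence, "Conviction": conviction, "Coverage": coverage}
result = classify(measures, (100.0, 20.0, 40.0, 4.0))
for name, row in result.items():
    print(name, row)
```

Grouping the measures that share a label for a given parameter then yields, for that
parameter, the classes of increasing, decreasing and independent measures.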
Table 1: Formulas of asymmetric objective interestingness measures
No | Name of interestingness measure | $f(n, n_A, n_B, n_{A\bar{B}})$
1 | 1-way Support | $\frac{n_A - n_{A\bar{B}}}{n_A} \log_2 \frac{n (n_A - n_{A\bar{B}})}{n_A n_B}$
2 | Added value, Pavillon, Centred Confidence, Dependency | $\frac{n_A - n_{A\bar{B}}}{n_A} - \frac{n_B}{n}$
3 | Bayes factor, Odd multiplier | $\frac{n n_A - n_A n_B - n n_{A\bar{B}} + n_B n_{A\bar{B}}}{n_B n_{A\bar{B}}}$
4 | Causal-Confidence | $1 - \frac{1}{2} \left( \frac{1}{n_A} + \frac{1}{n - n_B} \right) n_{A\bar{B}}$
5 | Causal-Confirmed confidence | $1 - \frac{1}{2} \left( \frac{3}{n_A} + \frac{1}{n - n_B} \right) n_{A\bar{B}}$
6 | Loevinger, Certainty Factor, Satisfaction | $1 - \frac{n n_{A\bar{B}}}{n_A (n - n_B)}$

7 | Relative Risk, Class correlation ratio | $\frac{(n_A - n_{A\bar{B}})(n - n_A)}{n_A (n_B - n_A + n_{A\bar{B}})}$
8 | Collective strength | $\frac{(n_A - n_{A\bar{B}})(n - n_B - n_{A\bar{B}})(n_A (n - n_B) + n_B (n - n_A))}{((n - n_A)(n - n_B) + n_A n_B)(n_B - n_A + 2 n_{A\bar{B}})}$
9 | Confidence | $\frac{n_A - n_{A\bar{B}}}{n_A}$
10 | Causal Confirm | $\frac{n + n_A - n_B - 4 n_{A\bar{B}}}{n}$
11 | Conviction | $\frac{n_A (n - n_B)}{n n_{A\bar{B}}}$
12 | Coverage | $\frac{n_A}{n}$
13 | Descriptive Confirmed-Confidence, Ganascia Index | $1 - \frac{2 n_{A\bar{B}}}{n_A}$
14 | Descriptive-Confirm | $\frac{n_A - 2 n_{A\bar{B}}}{n}$
1
√𝐼𝐼 ((1 − 𝐻 ∝ )(1 − 𝐻 ∝̅ ̅ ))2∝ with (α=1) and
Entropic 𝐵|𝐴 𝐴|𝐵
15. Implication 𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅ − 𝑛𝐴 𝑛𝐴 − 𝑛𝐵 − 𝑛𝐴𝐵̅ 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅
Intensity 1 𝐻𝐴|𝐵 = 𝑙𝑜𝑔2 + 𝑙𝑜𝑔2
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵
where II is the Implication Intensity
1
√𝐼𝐼 ((1 − 𝐻 ∝ )(1 − 𝐻 ∝̅ ̅ ))2∝ with (α=2) and
𝐵|𝐴 𝐴|𝐵
Entropic
16. Implication 𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅ − 𝑛𝐴 𝑛𝐴 − 𝑛𝐵 − 𝑛𝐴𝐵̅ 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅
𝐻𝐴|𝐵 = 𝑙𝑜𝑔2 + 𝑙𝑜𝑔2
Intensity 2 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

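17 | Examples and counter-examples rate (Exam-Cex-rate, Excounterex rate) | $\frac{n_A - 2 n_{A\bar{B}}}{n_A - n_{A\bar{B}}}$
18 | Gain, Fukuda | $\frac{n_A (1 - \theta) - n_{A\bar{B}}}{n}$
19 | Gini index | $\frac{(n_A - n_{A\bar{B}})^2 + n_{A\bar{B}}^2}{n n_A} + \frac{(n_B - n_A + n_{A\bar{B}})^2 + (n - n_B - n_{A\bar{B}})^2}{n (n - n_A)} - \frac{n_B^2 + (n - n_B)^2}{n^2}$
20 | Goodman–Kruskal | $\frac{\alpha}{\beta}$, where
$\alpha = \max\left(\frac{n_A - n_{A\bar{B}}}{n}, \frac{n_{A\bar{B}}}{n}\right) + \max\left(\frac{n_B - n_A + n_{A\bar{B}}}{n}, \frac{n - n_B - n_{A\bar{B}}}{n}\right) + \max\left(\frac{n_A - n_{A\bar{B}}}{n}, \frac{n_B - n_A + n_{A\bar{B}}}{n}\right) + \max\left(\frac{n_{A\bar{B}}}{n}, \frac{n - n_B - n_{A\bar{B}}}{n}\right) - \max\left(\frac{n_A}{n}, \frac{n - n_A}{n}\right) - \max\left(\frac{n_B}{n}, \frac{n - n_B}{n}\right)$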

$\beta = 2 - \max\left(\frac{n_A}{n}, \frac{n - n_A}{n}\right) - \max\left(\frac{n_B}{n}, \frac{n - n_B}{n}\right)$
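21 | Implication index | $\frac{n_{A\bar{B}} - \frac{n_A (n - n_B)}{n}}{\sqrt{\frac{n_A (n - n_B)}{n}}}$
22 | Implication Intensity (II) | $1 - \sum_{k = \max(0, n_A - n_B)}^{n_{A\bar{B}}} \frac{C_{n_B}^{n_A - k} \, C_{n - n_B}^{k}}{C_{n}^{n_A}}$
23 | Probabilistic measure of deviation from equilibrium (IPEE), Indice Probabiliste d'Ecart d'Equilibre | $1 - \frac{1}{2^{n_A}} \sum_{k=0}^{n_{A\bar{B}}} C_{n_A}^{k}$
24 | Directed Information ratio (DIR) | $\begin{cases} -\infty & \text{if } \frac{n_B}{n} = 1 \\ 0 & \text{if } \frac{n_B}{n} \le \frac{1}{2} \text{ and } \frac{n_A - n_{A\bar{B}}}{n_A} \le \frac{1}{2} \\ 1 + \frac{n_A - n_{A\bar{B}}}{n_A} \log_2 \frac{n_A - n_{A\bar{B}}}{n_A} + \frac{n_{A\bar{B}}}{n_A} \log_2 \frac{n_{A\bar{B}}}{n_A} & \text{if } \frac{n_B}{n} \le \frac{1}{2} \text{ and } \frac{n_A - n_{A\bar{B}}}{n_A} > \frac{1}{2} \\ 1 + \frac{1}{\frac{n_B}{n} \log_2 \frac{n_B}{n} + \frac{n - n_B}{n} \log_2 \frac{n - n_B}{n}} & \text{if } \frac{n_B}{n} > \frac{1}{2} \text{ and } \frac{n_A - n_{A\bar{B}}}{n_A} \le \frac{1}{2} \\ 1 - \frac{\frac{n_A - n_{A\bar{B}}}{n_A} \log_2 \frac{n_A - n_{A\bar{B}}}{n_A} + \frac{n_{A\bar{B}}}{n_A} \log_2 \frac{n_{A\bar{B}}}{n_A}}{\frac{n_B}{n} \log_2 \frac{n_B}{n} + \frac{n - n_B}{n} \log_2 \frac{n - n_B}{n}} & \text{if } \frac{n_B}{n} > \frac{1}{2} \text{ and } \frac{n_A - n_{A\bar{B}}}{n_A} > \frac{1}{2} \end{cases}$
25 | MGK, Ion | $\begin{cases} 1 - \frac{n n_{A\bar{B}}}{n_A (n - n_B)} & \text{if } \frac{n - n_{A\bar{B}}}{n_A} > \frac{n_B}{n} \\ \frac{n (n_A - n_{A\bar{B}})}{n_A n_B} - 1 & \text{otherwise} \end{cases}$
26 | J-measure | $\frac{n_A - n_{A\bar{B}}}{n} \log_2 \frac{n (n_A - n_{A\bar{B}})}{n_A n_B} + \frac{n_{A\bar{B}}}{n} \log_2 \frac{n n_{A\bar{B}}}{n_A (n - n_B)}$
27 | Klosgen | $\sqrt{\frac{n_A - n_{A\bar{B}}}{n}} \left( \frac{n - n_B}{n} - \frac{n_{A\bar{B}}}{n_A} \right)$
28 | K-measure | $\left( \frac{n_A - n_{A\bar{B}}}{n_A} - \frac{n - n_B - n_{A\bar{B}}}{n - n_A} \right) \left( \log_2 (n - n_B) - \log_2 n_B \right)$
29 | Kulczynski index | $\frac{n_A - n_{A\bar{B}}}{2} \left( \frac{1}{n_A} + \frac{1}{n_B} \right)$
30 | Laplace | $\frac{n_A - n_{A\bar{B}} + 1}{n_A + 2}$
31 | Least contradiction | $\frac{n_A - 2 n_{A\bar{B}}}{n_B}$
32 | Leverage, Leverage 1 | $1 - \frac{n_{A\bar{B}}}{n_A} - \frac{n_A n_B}{n^2}$
33 | Mutual Information MI, 2-way Support Variation | $\frac{n_A - n_{A\bar{B}}}{n} \log_2 \frac{n (n - n_{A\bar{B}})}{n_A n_B} + \frac{n_{A\bar{B}}}{n} \log_2 \frac{n n_{A\bar{B}}}{n_A (n - n_B)} + \frac{n_B - n_A + n_{A\bar{B}}}{n} \log_2 \frac{n (n_B - n_A + n_{A\bar{B}})}{(n - n_A) n_B} + \frac{n - n_B - n_{A\bar{B}}}{n} \log_2 \frac{n (n - n_B - n_{A\bar{B}})}{(n - n_A)(n - n_B)}$
34 | Prevalence | $\frac{n_B}{n}$
35 | Putative Causal Dependency | $\frac{3}{2} + \frac{4 n_A - 3 n_B}{2n} - \left( \frac{3}{2 n_A} + \frac{2}{n - n_B} \right) n_{A\bar{B}}$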
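36 | Recall, Completeness | $\frac{n_A - n_{A\bar{B}}}{n_B}$
37 | Sebag and Schoenauer | $\frac{n_A}{n_{A\bar{B}}} - 1$
38 | Specificity 1, Negative Reliability | $\frac{n - n_B - n_{A\bar{B}}}{n - n_A}$
39 | Zhang | $\frac{n n_A - n_A n_B - n n_{A\bar{B}}}{\max((n_A - n_{A\bar{B}})(n - n_B), \; n_B n_{A\bar{B}})}$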
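
The asymmetry criterion $m(a \to b) \neq m(b \to a)$ used to select the measures of
Table 1 can be checked numerically. The sketch below is an illustration under assumptions:
the helper names `reversed_rule` and `is_symmetric_at` are hypothetical, and Lift is not
part of Table 1. It is included only as a symmetric contrast, using its usual definition
$n (n_A - n_{A\bar{B}}) / (n_A n_B)$.

```python
def reversed_rule(n, n_a, n_b, n_ab_bar):
    """Parameters of the rule b -> a expressed from those of a -> b."""
    n_ba_bar = n_b - n_a + n_ab_bar  # transactions containing B but not A
    return n, n_b, n_a, n_ba_bar


def confidence(n, n_a, n_b, n_ab_bar):
    return (n_a - n_ab_bar) / n_a


def lift(n, n_a, n_b, n_ab_bar):
    return n * (n_a - n_ab_bar) / (n_a * n_b)


def is_symmetric_at(measure, point, tol=1e-12):
    """True if the measure gives the same value for a -> b and b -> a at this point."""
    return abs(measure(*point) - measure(*reversed_rule(*point))) <= tol


point = (100, 20, 40, 4)  # n, n_A, n_B, n_AB_bar
print("Confidence:", confidence(*point), confidence(*reversed_rule(*point)))
print("Lift:", lift(*point), lift(*reversed_rule(*point)))
print(is_symmetric_at(confidence, point), is_symmetric_at(lift, point))
```

A single point can only show that a measure is asymmetric; equal values at one point do not
prove symmetry in general.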

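Table 2: Formulas of the partial derivative with respect to the parameter n

No | $\partial f / \partial n$
1 | $\frac{n_A - n_{A\bar{B}}}{n \, n_A \ln 2}$
2 | $\frac{n_B}{n^2}$
3 | $\frac{n_A - n_{A\bar{B}}}{n_B n_{A\bar{B}}}$
4 | $\frac{n_{A\bar{B}}}{2 (n - n_B)^2}$
5 | $\frac{n_{A\bar{B}}}{2 (n - n_B)^2}$
6 | $\frac{n_B n_{A\bar{B}}}{n_A (n - n_B)^2}$
7 | $\frac{n_A - n_{A\bar{B}}}{n_A (n_B - n_A + n_{A\bar{B}})}$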
(𝑛𝐴 − 𝑛𝐴𝐵̅ ) (𝑛𝐴 (𝑛 − 𝑛𝐵 ) + 𝑛𝐵 ((𝑛 − 𝑛𝐴 )) + (𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(𝑛𝐴 + 𝑛𝐵 )) ((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )(𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ ) − 𝑘
2
(((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )(𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ ))
8
𝑘 = (𝑛𝐴 − 𝑛𝐴𝐵̅ ) (𝑛𝐴 (𝑛 − 𝑛𝐵 ) + 𝑛𝐵 ((𝑛 − 𝑛𝐴 )) + (𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(𝑛𝐴 + 𝑛𝐵 )) ((𝑛 − 𝑛𝐵 )
+ (𝑛 − 𝑛𝐵 )(𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ )
9 | $0$
10 | $\frac{-n_A + n_B + 4 n_{A\bar{B}}}{n^2}$
11 | $\frac{n_A n_B}{n_{A\bar{B}} \, n^2}$
12 | $-\frac{n_A}{n^2}$
13 | $0$
14 | $-\frac{n_A - 2 n_{A\bar{B}}}{n^2}$
1 1 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵
1 1 1
𝐼𝐼𝑛′ (ℎ)2 + 𝐼𝐼 (ℎ)−2 (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) (1 − (− (𝑙𝑜𝑔2 − ))))
2 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐵 𝑛𝐵 𝑙𝑛2

1
2√𝐼𝐼(ℎ)2

15 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛


ℎ = 1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵
− (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ))
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

1 1 3 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 2


𝐼𝐼𝑛′ (ℎ)4 + 𝐼𝐼 (ℎ)−4 (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) )𝑘
4 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴

1
2√𝐼𝐼(ℎ)4
𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 2
ℎ = 1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 2
16 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) )
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

𝑛𝐴𝐵 𝑛 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 1 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵


𝑘 = (−2 (− 𝑙𝑜𝑔2 𝐴𝐵 − 𝑙𝑜𝑔2 ) (− (𝑙𝑜𝑔2
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵
1
+ ))
𝑙𝑛2

17 | $0$
18 | $-\frac{n_A (1 - \theta) - n_{A\bar{B}}}{n^2}$
(𝑛𝐴 − 𝑛𝐴𝐵̅ )2 + 𝑛𝐴𝐵̅ 2

𝑛 2 𝑛 2𝐴
(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )2 (2𝑛 − 𝑛𝐴 ) + 2(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ ) − (𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )2 (2𝑛 − 𝑛𝐴 )
19 +
(𝑛(𝑛 − 𝑛𝐴 ))2
2
2𝑛𝑛𝐵 + 2(𝑛 − 𝑛𝐴 ) − 2𝑛(𝑛 − 𝑛𝐵 )2

𝑛4
𝛼
where
𝛽
𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ 𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐴 𝑛 − 𝑛𝐵
𝛼=( + + + − − ) (2
𝑛2 𝑛2 𝑛2 𝑛2 𝑛2 𝑛2
𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵
− 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , ))
𝑛 𝑛 𝑛 𝑛
𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅ 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅
20 − (𝑚𝑎𝑥 ( , ) + 𝑚𝑎𝑥 ( , )
𝑛 𝑛 𝑛 𝑛
𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅
+ 𝑚𝑎𝑥 ( , ) + 𝑚𝑎𝑥 ( , )
𝑛 𝑛 𝑛 𝑛
𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵 𝑛 − 𝑛𝐴 𝑛 − 𝑛𝐵
− 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , )) (− − )
𝑛 𝑛 𝑛 𝑛 𝑛2 𝑛2
𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵 2
𝛽 = (2 − 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , ))
𝑛 𝑛 𝑛 𝑛
1 𝑛𝐴 (𝑛 − 𝑛𝐵 )
21 (𝑛𝐴𝐵̅ + )
2√ 𝑛 𝑛
𝑛𝐴𝐵̅ 𝑛𝐵 ! (𝑛 − 𝑛𝐵 − 1)! 𝑛𝐴 ! (𝑛 − 𝑛𝐴 − 1)!
22 −∑
𝑘=𝑚𝑎𝑥 (1,𝑛𝐴 −𝑛𝐵 ) ((𝑛𝐴 − 𝑘)! (𝑛𝐵 − 𝑛𝐴 + 𝑘)! 𝑘! (𝑛 − 𝑛𝐵 − 𝑘 − 1)! (𝑛 − 1)!

23 0

𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
0 𝑖𝑓( ≤ 𝑎𝑛𝑑 >
𝑛 2 𝑛𝐴 2
1 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛−𝑛𝐵 1
(𝑙𝑜𝑔2 + )+( (𝑙𝑜𝑔2 )+ ) 𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
𝑛2 𝑛 𝑙𝑛2 𝑛2 𝑛 𝑙𝑛2
𝑛−𝑛𝐵 2
𝑖𝑓( > 𝑎𝑛𝑑 ≤
24 (
𝑛𝐵
𝑙𝑜𝑔2
𝑛𝐵
+
𝑛−𝑛𝐵
𝑙𝑜𝑔2 ) 𝑛 2 𝑛𝐴 2
𝑛 𝑛 𝑛 𝑛
𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 1 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛−𝑛𝐵 1
( 𝑙𝑜𝑔2 + 𝑙𝑜𝑔2 )( 2 (𝑙𝑜𝑔2 + )+( (𝑙𝑜𝑔2 )+ )) 𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛 𝑛 𝑙𝑛2 𝑛2 𝑛 𝑙𝑛2
− 𝑛−𝑛𝐵 2
𝑖𝑓( > 𝑎𝑛𝑑 >
(
𝑛𝐵
𝑙𝑜𝑔2
𝑛𝐵
+
𝑛−𝑛𝐵
𝑙𝑜𝑔2 ) 𝑛 2 𝑛𝐴 2
{ 𝑛 𝑛 𝑛 𝑛

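25 | $\begin{cases} -\frac{n_{A\bar{B}} (n n_A - n_A n_B) - n n_A n_{A\bar{B}}}{(n n_A - n_A n_B)^2} & \text{if } \frac{n - n_{A\bar{B}}}{n_A} > \frac{n_B}{n} \\ \frac{n_A - n_{A\bar{B}}}{n_A n_B} & \text{otherwise} \end{cases}$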
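26 | $\frac{n_A - n_{A\bar{B}}}{n^2} \left( -\log_2 \frac{n (n_A - n_{A\bar{B}})}{n_A n_B} + \frac{1}{\ln 2} \right) + \frac{n_{A\bar{B}}}{n^2} \left( -\log_2 \frac{n n_{A\bar{B}}}{n_A (n - n_B)} - \frac{n_B}{(n - n_B) \ln 2} \right)$
27 | $-\frac{\frac{n_A - n_{A\bar{B}}}{n^2}}{2 \sqrt{\frac{n_A - n_{A\bar{B}}}{n}}} \left( \frac{n - n_B}{n} - \frac{n_{A\bar{B}}}{n_A} \right) + \sqrt{\frac{n_A - n_{A\bar{B}}}{n}} \left( \frac{n_B}{n^2} \right)$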
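28 | $\left( -\frac{(n - n_A) - (n - n_B - n_{A\bar{B}})}{(n - n_A)^2} \right) \left( \log_2 (n - n_B) - \log_2 n_B \right) + \left( \frac{n_A - n_{A\bar{B}}}{n_A} - \frac{n - n_B - n_{A\bar{B}}}{n - n_A} \right) \left( \frac{1}{(n - n_B) \ln 2} \right)$
29 | $0$
30 | $0$
31 | $0$
32 | $\frac{2 n \, n_A n_B}{n^4}$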
𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛(𝑛 − 𝑛𝐴𝐵̅ ) (2𝑛 − 𝑛𝐴𝐵̅ )
(−𝑙𝑜𝑔2 + )
𝑛2 𝑛𝐴 𝑛𝐵 𝑛(𝑛 − 𝑛𝐴𝐵̅ )𝑙𝑛2
𝑛𝐴𝐵̅ 𝑛𝑛𝐴𝐵̅ 𝑛𝐵
+ 2 (−𝑙𝑜𝑔2 − )
𝑛 𝑛𝐴 (𝑛 − 𝑛𝐵 ) (𝑛 − 𝑛𝐵 )𝑙𝑛2
𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ 𝑛(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )
+ (−𝑙𝑜𝑔2
𝑛2 (𝑛 − 𝑛𝐴 )𝑛𝐵
33 (𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐴 ) − 𝑛(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )
+ )
(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )𝑙𝑛2
𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ 𝑛(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )
+ 2
(𝑙𝑜𝑔2
𝑛 (𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 )
(2𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) − 𝑛(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(2𝑛 − 𝑛𝐵 − 𝑛𝐴 )
+
(𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 )(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )𝑙𝑛2
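34 | $-\frac{n_B}{n^2}$
35 | $-\frac{4 n_A - 3 n_B}{2 n^2} + \frac{2 n_{A\bar{B}}}{(n - n_B)^2}$
36 | $0$
37 | $0$
38 | $\frac{-n_A + n_B + n_{A\bar{B}}}{(n - n_A)^2}$
39 | $\frac{(n_A - n_{A\bar{B}}) \max((n_A - n_{A\bar{B}})(n - n_B), \; n_B n_{A\bar{B}}) - (n n_A - n_A n_B - n n_{A\bar{B}})(n_A - n_{A\bar{B}})}{\left( \max((n_A - n_{A\bar{B}})(n - n_B), \; n_B n_{A\bar{B}}) \right)^2}$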

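Table 3: Formulas of the partial derivative with respect to the parameter $n_A$

No | $\partial f / \partial n_A$
1 | $\frac{n_{A\bar{B}}}{n_A^2} \left( \log_2 \frac{n (n_A - n_{A\bar{B}})}{n_A n_B} + \frac{1}{\ln 2} \right)$
2 | $\frac{n_{A\bar{B}}}{n_A^2}$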

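3 | $\frac{n - n_B}{n_B n_{A\bar{B}}}$
4 | $\frac{n_{A\bar{B}}}{2 n_A^2}$
5 | $\frac{3 n_{A\bar{B}}}{2 n_A^2}$
6 | $\frac{1}{n_A^2} \cdot \frac{n n_{A\bar{B}}}{n - n_B}$
7 | $\frac{n n_A^2 - n_A^2 n_B + n n_B n_{A\bar{B}} - 2 n n_A n_{A\bar{B}} + n n_{A\bar{B}}^2}{(n_A (n_B - n_A + n_{A\bar{B}}))^2}$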
ℎ + (𝑛𝐴 − 𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(𝑛𝐴 (𝑛 − 𝑛𝐵 ) + 𝑛𝐵 (𝑛 − 𝑛𝐴 )((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ))(𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ )((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )
2
(((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )(𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ ))
8
ℎ = ((𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(𝑛𝐴 (𝑛 − 𝑛𝐵 )(𝑛𝐴 + 𝑛𝐵 ))(𝑛𝐵 ((𝑛 − 𝑛𝐴 ))(𝑛𝐴 + 𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐵 )
− 𝑛𝐵 )((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )(𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ )
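9 | $\frac{n_{A\bar{B}}}{n_A^2}$
10 | $\frac{1}{n}$
11 | $\frac{n - n_B}{n n_{A\bar{B}}}$
12 | $\frac{1}{n}$
13 | $\frac{2 n_{A\bar{B}}}{n_A^2}$
14 | $\frac{1}{n}$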
1 1 𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵
1 1 1
𝐼𝐼𝑛′ 𝐴 (ℎ)2 + 𝐼𝐼 (ℎ)−2 (( 2 (𝑙𝑜𝑔2 + )+ 2 (𝑙𝑜𝑔2 + )) (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 )))
2 𝑛𝐴 𝑛𝐴 𝑙𝑛2 𝑛𝐴 𝑛𝐴 𝑙𝑛2 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

1
2
𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛 𝑛 𝑛 𝑛 𝑛−𝑛𝐵 −𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵
2√𝐼𝐼 (1 − (− 𝑙𝑜𝑔2 − 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 ) (1 − (− 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 − 𝑙𝑜𝑔2 )))
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

15 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛


ℎ = 1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵
− (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ))
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

1 1 3 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵 2


𝐼𝐼𝑛′ (ℎ)4 + 𝐼𝐼 (ℎ)−4 ((1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) ) − 𝑘)
4 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

1
2√𝐼𝐼(ℎ)4
𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 2
ℎ = 1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
16
𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 2
− (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) )
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵
𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 𝑛 𝑛𝐴 − 𝑛𝐴𝐵 1
𝑘 = 2 (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (− 𝐴𝐵 2 (𝑙𝑜𝑔2 + )
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑙𝑛2
1 𝑛 1
− 2 (𝑙𝑜𝑔2 𝐴𝐵 − + ))
𝑛𝐴 𝑛𝐴 𝑙𝑛2
17    $\frac{n_{A\bar{B}}}{(n_A - n_{A\bar{B}})^2}$
18    $\frac{1 - \theta}{n}$
2𝑛𝑛𝐴 (𝑛𝐴 − 𝑛𝐴𝐵̅ ) − 𝑛𝐴 ((𝑛𝐴 − 𝑛𝐴𝐵̅ )2 + 𝑛𝐴𝐵̅ 2 )
𝑛2 𝑛2𝐴
19
2(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ ) + ((𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )2 + (𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )2 )(2𝑛 − 𝑛𝐴

(𝑛(𝑛 − 𝑛𝐴 ))2
𝛼
where
𝛽
3 𝑛𝐴 𝑛 − 𝑛𝐴 𝑛 𝐵 𝑛 − 𝑛𝐵
𝛼 = ( ) (2 − 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , ))
𝑛 𝑛 𝑛 𝑛 𝑛
𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅ 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅
− (𝑚𝑎𝑥 ( , ) + 𝑚𝑎𝑥 ( , )
𝑛 𝑛 𝑛 𝑛
20 𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅
+ 𝑚𝑎𝑥 ( , ) + 𝑚𝑎𝑥 ( , )
𝑛 𝑛 𝑛 𝑛
𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵 1
− 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , )) ( )
𝑛 𝑛 𝑛 𝑛 𝑛
𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵 2
𝛽 = (2 − 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , ))
𝑛 𝑛 𝑛 𝑛

n  nB
3
1 nAB n 2 1
 ( ) 
21 n  nB nA 2 nA
2
n
𝑛𝐴𝐵
̅ 𝑛𝐵 ! (𝑛 − 𝑛𝐵 )! (𝑛𝐴 − 1)! (𝑛 − 𝑛𝐴 )!
22 −∑
𝑘=𝑚𝑎𝑥 (1,𝑛𝐴 −𝑛𝐵 ) ((𝑛𝐴 − 𝑘 − 1)! (𝑛𝐵 − 𝑛𝐴 + 𝑘)! 𝑘! (𝑛 − 𝑛𝐵 − 𝑘)! 𝑛!

2 𝑛𝐴𝐵
̅ (𝑛𝐴 − 1)!
23 − ∑
2𝑛𝐴 𝑙𝑛2 𝑘=1 𝑘! (𝑛𝐴 − 𝑘 − 1)!

𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 1 𝑛 𝑛 𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1


2
(𝑙𝑜𝑔2 + ) + 2 (𝑙𝑜𝑔2 𝐴𝐵 + 𝐴𝐵 ) 𝑖𝑓( ≤ 𝑎𝑛𝑑 >
𝑛 𝐴 𝑛𝐴 𝑙𝑛2 𝑛 𝐴 𝑛𝐴 𝑙𝑛2 𝑛 2 𝑛𝐴 2
𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
0 𝑖𝑓( > 𝑎𝑛𝑑 ≤
24 𝑛 2 𝑛𝐴 2
𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 1 𝑛𝐴𝐵 𝑛𝐴𝐵
(𝑙𝑜𝑔2 + )+ (𝑙𝑜𝑔2 + ) 𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
𝑛2 𝐴 𝑛𝐴 𝑙𝑛2 𝑛2 𝐴 𝑛𝐴 𝑙𝑛2
− 𝑛𝐵 𝑛𝐵 𝑛−𝑛𝐵 𝑛−𝑛𝐵 𝑖𝑓( > 𝑎𝑛𝑑 >
{ 𝑙𝑜𝑔2 𝑙𝑜𝑔2+ 𝑛 2 𝑛𝐴 2
𝑛 𝑛 𝑛 𝑛
𝑛𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐴𝐵̅ 𝑛𝐵
𝑖𝑓 >
(𝑛 − 𝑛𝐵 )𝑛2𝐴 𝑛𝐴 𝑛
25 (𝑛𝐴 𝑛𝐵 (𝑛𝐴 − 𝑛𝐴𝐵̅ )) − 𝑛𝑛𝐵 (𝑛𝐴 − 𝑛𝐴𝐵̅ )
𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
{ 𝑛𝐵 𝑛 2𝐴
1 𝑛(𝑛𝐴 − 𝑛𝐴𝐵̅ ) 𝑛𝐵 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅
26 (𝑙𝑜𝑔2 − ) +
𝑛 𝑛𝐴 𝑛𝐵 𝑙𝑛2 𝑛𝑛𝐴 𝑙𝑛2
1
𝑛 𝑛 − 𝑛𝐵 𝑛𝐴𝐵̅ 𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅
27 ( − ) −√ ( 2 )
𝑛 −𝑛 ̅ 𝑛 𝑛𝐴 𝑛 𝑛 𝐴
2√ 𝐴 𝐴𝐵
𝑛
( )
𝑛𝐴𝐵̅ ) 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅
28 ( 2 − ) (𝑙𝑜𝑔2 (𝑛 − 𝑛𝐵 ) − 𝑙𝑜𝑔2 𝑛𝐵 )
𝑛 𝐴 (𝑛 − 𝑛𝐴 )2
29    $\frac{1}{2}\left(\frac{1}{n_B} + \frac{n_{A\bar{B}}}{n_A^2}\right)$
30    $\frac{3 + n_{A\bar{B}}}{(n_A + 2)^2}$
31    $\frac{1}{n_B}$
32    $\frac{n_{A\bar{B}}}{n_A^2} - \frac{n_B}{n^2}$
1 𝑛(𝑛 − 𝑛𝐴𝐵̅ ) 𝑛𝐴 − 𝑛𝐴𝐵̅
(𝑙𝑜𝑔2 + )
𝑛 𝑛𝐴 𝑛𝐵 𝑛𝐴 𝑙𝑛2
𝑛𝐴𝐵̅ 1 𝑛(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )
+ − 𝑙𝑜𝑔2
𝑛𝑛𝐴 𝑙𝑛2 𝑛 (𝑛 − 𝑛𝐴 )𝑛𝐵
33 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ (𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐴 )
+ ( )
𝑛 (𝑛 − 𝑛𝐴 )(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )𝑙𝑛2
𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ (𝑛(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ ))(𝑛 − 𝑛𝐵 )
+
𝑛 ((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ))2 𝑙𝑛2
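34    $0$
35    $\frac{2}{n} + \frac{3 n_{A\bar{B}}}{2 n_A^2}$
36    $\frac{1}{n_B}$
37    $\frac{1}{n_{A\bar{B}}}$
38    $\frac{n - n_B - n_{A\bar{B}}}{(n - n_A)^2}$
39    $\frac{(n - n_B)\max\left((n_A - n_{A\bar{B}})(n - n_B),\, n_B n_{A\bar{B}}\right) - (n n_A - n_A n_B - n n_{A\bar{B}})(n - n_B)}{\left(\max\left((n_A - n_{A\bar{B}})(n - n_B),\, n_B n_{A\bar{B}}\right)\right)^2}$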

Table 4: Formula of partial derivative under the parameter 𝑛𝐵


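No    $\partial Z / \partial n_B$
1     $-\frac{n_A - n_{A\bar{B}}}{n_A n_B \ln 2}$
2     $-\frac{1}{n}$
3     $-\frac{n_A - n_{A\bar{B}}}{n_{A\bar{B}}}\cdot\frac{n}{n_B^2}$
4     $\frac{n_{A\bar{B}}}{2 (n - n_B)^2}$
5     $\frac{n_{A\bar{B}}}{2 (n - n_B)^2}$
6     $\frac{n\, n_{A\bar{B}}}{n_A (n - n_B)^2}$
7     $-\frac{(n_A - n_{A\bar{B}})(n - n_A)}{n_A (n_B - n_A + n_{A\bar{B}})^2}$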
−(( 𝑛𝐴 − 𝑛𝐴𝐵̅ )(𝑛𝐴 (𝑛 − 𝑛𝐵 )) + ℎ + 𝑘 + 𝑙
2
(((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )(𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ ))
ℎ = (𝑛𝐵 ((𝑛 − 𝑛𝐴 )) + (𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )) ((𝑛 − 2𝑛𝐴 ))((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )(𝑛𝐵 − 𝑛𝐴
8
+ 2𝑛𝐴𝐵̅ )
𝑘 = (𝑛𝐴 − 𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(𝑛𝐴 (𝑛 − 𝑛𝐵 ))
𝑙 = 𝑛𝐵 (𝑛 − 𝑛𝐴 )((𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐴 ))(2𝑛𝐴𝐵̅ )((𝑛 − 𝑛𝐴 )𝑛 − 𝑛𝐵 (𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )

9     $0$
10    $-\frac{1}{n}$
11    $-\frac{n_A}{n\, n_{A\bar{B}}}$
12    $0$
13    $0$
14    $0$
1 1 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 (𝑛−𝑛𝐴𝐵 ) 𝑛−𝑛𝐵 −𝑛𝐴𝐵
1 −2 1 1
𝐼𝐼𝑛′ 𝐴 (ℎ) + 𝐼𝐼 (ℎ)
2 (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) (( 2 (𝑙𝑜𝑔2 + )− 2 (𝑙𝑜𝑔2 + ))))
2 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐵 𝑛𝐵 𝑙𝑛2 𝑛𝐵 𝑛𝐵 𝑙𝑛2

1
2
𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛 𝑛 𝑛 𝑛 𝑛−𝑛𝐵 −𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵
2 √𝐼𝐼 (1 − (− 𝑙𝑜𝑔2 − 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 ) (1 − (− 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 − 𝑙𝑜𝑔2 )))
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

15
𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛
ℎ = 1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵
− (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ))
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵
1
1 3 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 2
𝐼𝐼𝑛′ (ℎ)4 + 𝐼𝐼 (ℎ)−4 (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) 𝑙)
4 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
1
𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 2 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵 2 4
2 √𝐼𝐼 (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) ))
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 2


ℎ = 1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 2
16 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) )
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵
𝑛 𝑛 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 𝑛 1
𝑙 = (−2 (− 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 − 𝑙𝑜𝑔2 ) (− 𝐴𝐵2 (𝑙𝑜𝑔2 𝐴𝐵 + )
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑙𝑛2
(𝑛 − 𝑛𝐴𝐵 ) 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 1
− 2 (𝑙𝑜𝑔2 + )))
𝑛𝐵 𝑛𝐵 𝑙𝑛2

17    $0$
18    $0$
19    $\frac{2(n_B - n_A + n_{A\bar{B}}) - 2(n - n_B - n_{A\bar{B}})}{n(n - n_A)} - \frac{2 n_B + 2(n - n_B)}{n^2}$
𝛼
𝑤ℎ𝑒𝑟𝑒
𝛽
2 𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵 𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅
𝛼 = ( ) (2 − 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , )) (𝑚𝑎𝑥 ( , )
𝑛 𝑛 𝑛 𝑛 𝑛 𝑛 𝑛
20 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ 𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅
+ 𝑚𝑎𝑥 ( , ) + 𝑚𝑎𝑥 ( , )
𝑛 𝑛 𝑛 𝑛
𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ 𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵 1
+ 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , ))( )
𝑛 𝑛 𝑛 𝑛 𝑛 𝑛 𝑛
𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵 2
𝛽 = (2 − 𝑚𝑎𝑥 ( , ) − 𝑚𝑎𝑥 ( , ))
𝑛 𝑛 𝑛 𝑛
1 3 1 1
1 n   1 n 
21 n A B ( A ) 2 ( n  nB ) 2  ( A ) 2 ( n  nB ) 2
2 n 2 n
𝑛𝐴𝐵
̅
(𝑛𝐵 − 1)! (𝑛 − 𝑛𝐵 )! 𝑛𝐴 ! (𝑛 − 𝑛𝐴 )!
22 − ∑
((𝑛𝐴 − 𝑘)! (𝑛𝐵 − 𝑛𝐴 + 𝑘 − 1)! 𝑘! (𝑛 − 𝑛𝐵 − 𝑘)! 𝑛!
𝑘=𝑚𝑎𝑥 (1,𝑛𝐴 −𝑛𝐵 )
23 0


𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
0 𝑖𝑓( ≤ 𝑎𝑛𝑑 >
𝑛 2 𝑛𝐴 2
1 𝑛𝐵 1 1 𝑛−𝑛𝐵 1
(𝑙𝑜𝑔2 + ) − 𝑙𝑜𝑔2 − 𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
𝑛 𝑛 𝑙𝑛2 𝑛 𝑛 (𝑛−𝑛𝐵 )𝑙𝑛2
𝑛−𝑛𝐵 2
𝑖𝑓( > 𝑎𝑛𝑑 ≤
24 𝑛 𝑛
( 𝐵 𝑙𝑜𝑔2 𝐵 +
𝑛−𝑛𝐵
𝑙𝑜𝑔2 ) 𝑛 2 𝑛𝐴 2
𝑛 𝑛 𝑛 𝑛
𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 1 𝑛 1 1 𝑛−𝑛𝐵 1
( 𝑙𝑜𝑔2 + 𝑙𝑜𝑔2 𝐴𝐵 )( (𝑙𝑜𝑔2 𝐵 + ) − 𝑙𝑜𝑔2 − ) 𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛 𝑛 𝐴 𝑛 𝑙𝑛2 𝑛 𝑛 (𝑛−𝑛 )𝑙𝑛2
𝐵
− 𝑛𝐵 𝑛𝐵 𝑛−𝑛𝐵 𝑛−𝑛𝐵 2
𝑖𝑓( > 𝑎𝑛𝑑 >
( 𝑙𝑜𝑔2 + 𝑙𝑜𝑔2 ) 𝑛 2 𝑛𝐴 2
{ 𝑛 𝑛 𝑛 𝑛

𝑛𝑛𝐴 𝑛𝐴𝐵̅ 𝑛 − 𝑛𝐴𝐵̅ 𝑛𝐵


− 2
𝑖𝑓 >
(𝑛𝑛𝐴 − 𝑛𝐴 𝑛𝐵 ) 𝑛𝐴 𝑛
25 𝑛(𝑛𝐴 − 𝑛𝐴𝐵̅ )
𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
{ 𝑛𝐴 𝑛2 𝐵
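26    $\frac{n_A - n_{A\bar{B}}}{n\, n_B \ln 2} + \frac{n_{A\bar{B}}}{n (n - n_B)\ln 2}$
27    $-\frac{1}{n}\sqrt{\frac{n_A - n_{A\bar{B}}}{n}}$
28    $\left(\frac{1}{n - n_A}\right)\left(\log_2 (n - n_B) - \log_2 n_B\right) + \left(\frac{n_A - n_{A\bar{B}}}{n_A} - \frac{n - n_B - n_{A\bar{B}}}{n - n_A}\right)\left(\frac{-1}{(n - n_B)\ln 2} - \frac{1}{n_B \ln 2}\right)$
29    $-\left(\frac{n_A - n_{A\bar{B}}}{2 n_B^2}\right)$
30    $0$
31    $-\frac{n_A - 2 n_{A\bar{B}}}{n_B^2}$
32    $-\frac{n_A}{n^2}$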
𝑛𝐴 − 𝑛𝐴𝐵̅ 𝑛𝐴𝐵̅ 1 𝑛(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )
+ + 𝑙𝑜𝑔2
𝑛𝑛𝐵 𝑙𝑛2 𝑛(𝑛 − 𝑛𝐵 )𝑙𝑛2 𝑛 (𝑛 − 𝑛𝐴 )𝑛𝐵
𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ 𝑛𝐴 − 𝑛𝐴𝐵̅
+ ( )
𝑛 𝑛𝐵 (𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )𝑙𝑛2
33 1 𝑛(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )
− 𝑙𝑜𝑔2
𝑛 (𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 )
𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ (𝑛 − 𝑛𝐵 ) + (𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )
− ( )
𝑛 (𝑛 − 𝑛𝐵 ) + (𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )𝑙𝑛2
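34    $\frac{1}{n}$
35    $\frac{-3}{2n} - \frac{2 n_{A\bar{B}}}{(n - n_B)^2}$
36    $-\frac{n_A - n_{A\bar{B}}}{n_B^2}$
37    $0$
38    $\frac{-1}{n - n_A}$
39    $\frac{(n n_A - n_A n_B - n n_{A\bar{B}})\, n_B}{\left(\max\left((n_A - n_{A\bar{B}})(n - n_B),\, n_B n_{A\bar{B}}\right)\right)^2}$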

Table 5: Formula of partial derivative under the parameter nAB̅

No    $\partial Z / \partial n_{A\bar{B}}$
1     $-\frac{1}{n_A}\log_2\frac{n(n_A - n_{A\bar{B}})}{n_A n_B} - \frac{1}{(n_A - n_{A\bar{B}})\ln 2}$
2     $-\frac{1}{n_A}$
3     $-n_A\,\frac{n - n_B}{n_B\, n_{A\bar{B}}^2}$
4     $-\frac{1}{2}\left(\frac{1}{n_A} + \frac{1}{n - n_B}\right)$
5     $-\frac{1}{2}\left(\frac{3}{n_A} + \frac{1}{n - n_B}\right)$
6     $-\frac{n}{n_A (n - n_B)}$
7     $-\frac{(n - n_A)\, n_B}{n_A (n_B - n_A + n_{A\bar{B}})^2}$
ℎ − 𝑘 + 𝑛𝐵 (𝑛 − 𝑛𝐴 ) − 𝑙)(2((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )
2
(((𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) + 𝑛𝐴 𝑛𝐵 )(𝑛𝐵 − 𝑛𝐴 + 2𝑛𝐴𝐵̅ ))
8
ℎ = −((𝑛𝐴 (𝑛 − 𝑛𝐵 )) + (𝑛𝐵 ((𝑛 − 𝑛𝐴 ))((𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ ))
𝑘 = (𝑛𝐴 − 𝑛𝐴𝐵̅ )(𝑛𝐴 − 𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(𝑛𝐴 (𝑛 − 𝑛𝐵 ))
𝑙 = (𝑛𝐴 − 𝑛𝐴𝐵̅ )(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )(𝑛𝐴 (𝑛 − 𝑛𝐵 ) + 𝑛𝐵 (𝑛 − 𝑛𝐴 )
9     $-\frac{1}{n_A}$
10    $-\frac{4}{n}$
11    $-\frac{n_A (n - n_B)}{n\, n_{A\bar{B}}^2}$
12    $0$
13    $-\frac{2}{n_A}$
14    $-\frac{2}{n}$
1 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵
1 1 1 1 1
𝐼𝐼 (ℎ)−2 (𝑘 + (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) (( (𝑙𝑜𝑔2 + )− (𝑙𝑜𝑔2 + )))))
2 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐵 𝑛𝐵 𝑙𝑛2 𝑛𝐵 𝑛𝐵 𝑙𝑛2

1
2
𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛 𝑛 𝑛 𝑛 𝑛−𝑛𝐵 −𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵
2 √𝐼𝐼 (1 − (− 𝑙𝑜𝑔2 − 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 ) (1 − (− 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 − 𝑙𝑜𝑔2 )))
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛


ℎ = 1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
15 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵
− (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ))
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵
1 𝑛𝐴 − 𝑛𝐴𝐵 1 1 𝑛 1
𝑘 = ((− (𝑙𝑜𝑔2 + ) + (𝑙𝑜𝑔2 𝐴𝐵 + )) (1
𝑛𝐴 𝑛𝐴 𝑙𝑛2 𝑛𝐴 𝑛𝐴 𝑙𝑛2
𝑛𝐴𝐵 𝑛 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵
− (− 𝑙𝑜𝑔2 𝐴𝐵 − 𝑙𝑜𝑔2 )))
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵

1
1 3 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 2
𝐼𝐼𝑛′ (ℎ)4 + 𝐼𝐼 (ℎ)−4 (𝑘) + (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) 𝑙)
4 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
1
𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴 −𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛𝐴𝐵 2 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵 𝑛−𝑛𝐵 −𝑛𝐴𝐵 2 4
2 √𝐼𝐼 (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) (1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) ))
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵
16
𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 2
ℎ = 1 − (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) (1
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴
𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 2
− (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 ) )
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵


𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴 − 𝑛𝐴𝐵 𝑛𝐴𝐵 𝑛 1 𝑛𝐴 − 𝑛𝐴𝐵 1


𝑘 = −2 (− 𝑙𝑜𝑔2 − 𝑙𝑜𝑔2 𝐴𝐵 ) ( (𝑙𝑜𝑔2 − )
𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑛𝐴 𝑙𝑛2
1 𝑛 1
− (𝑙𝑜𝑔2 𝐴𝐵 + )) (1
𝑛𝐴 𝑛𝐴 𝑙𝑛2
𝑛 𝑛 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 2
− (− 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 − 𝑙𝑜𝑔2 ) )
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵
𝑛 𝑛 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 1 𝑛 1
𝑙 = (−2 (− 𝐴𝐵 𝑙𝑜𝑔2 𝐴𝐵 − 𝑙𝑜𝑔2 ) (− (𝑙𝑜𝑔2 𝐴𝐵 + )
𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑛𝐵 𝑙𝑛2
1 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵 1
+ 𝑙𝑜𝑔2 − )))
𝑛𝐵 𝑛𝐵 𝑙𝑛2
17    $-\frac{n_A}{(n_A - n_{A\bar{B}})^2}$
18    $-\frac{1}{n}$
19    $\frac{-2(n_A - n_{A\bar{B}}) + 2 n_{A\bar{B}}}{n\, n_A} + \frac{2(n_B - n_A + n_{A\bar{B}}) - 2(n - n_B - n_{A\bar{B}})}{n (n - n_A)}$
𝛼
𝑤ℎ𝑒𝑟𝑒
𝛽

−1 1 1 −1 −1 1 1 −1
20 𝛼 = (𝑚𝑎𝑥 ( , ) + 𝑚𝑎𝑥 ( , ) + 𝑚𝑎𝑥 ( , ) + 𝑚𝑎𝑥 ( , ))
𝑛 𝑛 𝑛 𝑛 𝑛 𝑛 𝑛 𝑛

𝑛𝐴 𝑛 − 𝑛𝐴 𝑛𝐵 𝑛 − 𝑛𝐵
𝛽 = 2 − 𝑚𝑎𝑥 ( , )−( , )
𝑛 𝑛 𝑛 𝑛
21    $\frac{1}{\sqrt{\frac{n_A (n - n_B)}{n}}}$
𝑛𝐴𝐵
̅ −1 𝑛𝐵 ! (𝑛 − 𝑛𝐵 )! (𝑛𝐴 )! (𝑛 − 𝑛𝐴 )!
22 −∑
𝑘=𝑚𝑎𝑥 (1,𝑛𝐴 −𝑛𝐵 ) ((𝑛 𝐴 − 𝑘)! (𝑛 𝐵 − 𝑛𝐴 + 𝑘)! 𝑘! (𝑛 − 𝑛 𝐵 − 𝑘)! 𝑛!
23 0
1 𝑛𝐴 − 𝑛𝐴𝐵 1 1 𝑛 1 𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
− 𝑙𝑜𝑔2 − + (𝑙𝑜𝑔2 𝐴𝐵 + ) 𝑖𝑓( ≤ 𝑎𝑛𝑑 >
𝑛𝐴 𝑛𝐴 (𝑛𝐴 − 𝑛𝐴𝐵 )𝑙𝑛2 𝑛𝐴 𝑛𝐴 𝑙𝑛2 𝑛 2 𝑛𝐴 2
𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
0 𝑖𝑓( > 𝑎𝑛𝑑 ≤
24 𝑛 2 𝑛𝐴 2
1 𝑛𝐴 −𝑛𝐴𝐵 1 1 𝑛 1
− 𝑙𝑜𝑔2 − + (𝑙𝑜𝑔2 𝐴𝐵 + )
𝑛𝐴 𝑛𝐴 (𝑛𝐴 −𝑛𝐴𝐵 )𝑙𝑛2 𝑛𝐴 𝑛𝐴 𝑙𝑛2 𝑛𝐵 1 𝑛𝐴 − 𝑛𝐴𝐵 1
− 𝑛𝐵 𝑛𝐵 𝑛−𝑛𝐵 𝑛−𝑛𝐵 𝑖𝑓( > 𝑎𝑛𝑑 >
{ 𝑙𝑜𝑔2 + 𝑙𝑜𝑔2 𝑛 2 𝑛𝐴 2
𝑛 𝑛 𝑛 𝑛
25    $\begin{cases} -\dfrac{n}{n_A (n - n_B)} & \text{if } \dfrac{n - n_{A\bar{B}}}{n_A} > \dfrac{n_B}{n} \\ -\dfrac{n}{n_A n_B} & \text{otherwise} \end{cases}$
26    $-\frac{1}{n}\left(\log_2\frac{n(n_A - n_{A\bar{B}})}{n_A n_B} + \frac{1}{\ln 2}\right) + \frac{1}{n}\left(\log_2\frac{n n_{A\bar{B}}}{n_A (n - n_B)} + \frac{1}{\ln 2}\right)$
27    $\frac{-\frac{1}{n}}{2\sqrt{\frac{n_A - n_{A\bar{B}}}{n}}}\left(\frac{n - n_B}{n} - \frac{n_{A\bar{B}}}{n_A}\right) - \frac{1}{n_A}\sqrt{\frac{n_A - n_{A\bar{B}}}{n}}$
28    $-\left(\frac{1}{n_A} - \frac{1}{n - n_A}\right)\left(\log_2 (n - n_B) - \log_2 n_B\right)$
29    $-\frac{1}{2}\left(\frac{1}{n_A} + \frac{1}{n_B}\right)$
30    $-\frac{1}{n_A + 2}$
31    $-\frac{2}{n_B}$
32    $-\frac{1}{n_A}$
1 𝑛(𝑛 − 𝑛𝐴𝐵̅ ) 𝑛𝐴 − 𝑛𝐴𝐵̅
− 𝑙𝑜𝑔2 +
𝑛 𝑛𝐴 𝑛𝐵 (𝑛 − 𝑛𝐴𝐵̅ )𝑙𝑛2
1 𝑛𝑛𝐴𝐵̅ 1 1 𝑛(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )
33 + 𝑙𝑜𝑔2 + + 𝑙𝑜𝑔2
𝑛 𝑛𝐴 (𝑛 − 𝑛𝐵 ) 𝑛𝑙𝑛2 𝑛 (𝑛 − 𝑛𝐴 )𝑛𝐵
𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ 1 𝑛(𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ ) 𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅
+ − 𝑙𝑜𝑔2 −
𝑛(𝑛𝐵 − 𝑛𝐴 + 𝑛𝐴𝐵̅ )𝑙𝑛2 𝑛 (𝑛 − 𝑛𝐴 )(𝑛 − 𝑛𝐵 ) (𝑛 − 𝑛𝐵 − 𝑛𝐴𝐵̅ )𝑙𝑛2
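34    $0$
35    $-\left(\frac{3}{2 n_A} + \frac{2}{n - n_B}\right)$
36    $-\frac{1}{n_B}$
37    $-\frac{n_A}{n_{A\bar{B}}^2}$
38    $\frac{-1}{n - n_A}$
39    $\frac{-n \max\left((n_A - n_{A\bar{B}})(n - n_B),\, n_B n_{A\bar{B}}\right) - (n n_A - n_A n_B - n n_{A\bar{B}})\max\left(-(n - n_B),\, n_B\right)}{\left(\max\left((n_A - n_{A\bar{B}})(n - n_B),\, n_B n_{A\bar{B}}\right)\right)^2}$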

In the next step, we examine the derivative formula of each measure and classify the
measures into groups that tend to increase, decrease or remain independent with respect
to each of the parameters $n$, $n_A$, $n_B$ and $n_{A\bar{B}}$.
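The classification step can be sketched numerically. The snippet below is a minimal illustration, not the authors' procedure: it replaces the symbolic partial derivatives with central finite differences, assumes the usual definitions of Confidence, Coverage, Prevalence and Added value in terms of the contingency counts, and uses hypothetical names (`tendency`, `signature`) and an arbitrary evaluation point.

```python
# Counts of the 2x2 contingency table of a rule a -> b (illustrative names):
#   n        total number of records
#   n_a      records satisfying A
#   n_b      records satisfying B
#   n_ab_bar counter-examples (A holds, B does not)

def confidence(n, n_a, n_b, n_ab_bar):
    return (n_a - n_ab_bar) / n_a

def coverage(n, n_a, n_b, n_ab_bar):
    return n_a / n

def prevalence(n, n_a, n_b, n_ab_bar):
    return n_b / n

def added_value(n, n_a, n_b, n_ab_bar):
    return (n_a - n_ab_bar) / n_a - n_b / n

def tendency(measure, point, index, step=1e-3, tol=1e-9):
    """Sign of a central finite difference in one argument: 1, -1 or 0."""
    lo, hi = list(point), list(point)
    lo[index] -= step
    hi[index] += step
    slope = (measure(*hi) - measure(*lo)) / (2 * step)
    if abs(slope) < tol:
        return 0
    return 1 if slope > 0 else -1

def signature(measure, point):
    """Tendencies with respect to (n, n_a, n_b, n_ab_bar), in that order."""
    return [tendency(measure, point, i) for i in range(4)]

# Arbitrary evaluation point; a sign found here is only local evidence.
point = (1000.0, 300.0, 400.0, 50.0)
for measure in (confidence, coverage, prevalence, added_value):
    print(measure.__name__, signature(measure, point))
```

A single evaluation point cannot detect measures whose derivative changes sign, so a numerical check of this kind complements the derivative formulas rather than replacing them.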
Table 6: Variation of the measures according to their partial derivatives with respect to the 4 parameters
(1: increasing; -1: decreasing; 0: independent).
No  Name of interestingness measures                        $n$         $n_A$       $n_B$       $n_{A\bar{B}}$
1   1-way Support                                           1           1           -1          -1
2   Added value, Pavillon, Centred Confidence, Dependency   1           1           -1          -1
3   Bayes factor, Odd multiplier                            1           1           -1          -1
4   Causal-Confidence                                       1           1           -1          -1
5   Causal-Confirmed confidence                             1           1           -1          -1
6   Loevinger, Certainty Factor, Satisfaction               1           1           -1          -1
7   Relative Risk, Class correlation ratio                  1           1, 0, -1    -1          -1
8   Collective strength                                     1           1, 0, -1    -1          -1
9   Confidence                                              0           1           0           -1
10  Causal Confirm                                          1           1           -1          -1
11  Conviction                                              1           1           -1          -1
12  Coverage                                                -1          1           0           0
13  Descriptive Confirmed-Confidence, Ganascia Index        0           1           0           -1
14  Descriptive-Confirm                                     -1          1           0           -1
15  Entropic Implication Intensity 1                        1           1           -1          -1
16  Entropic Implication Intensity 2                        1           1           -1          -1
17  Examples and counter-examples rate                      0           1           0           -1
18  Gain, Fukuda                                            -1          1           0           -1
19  Gini index                                              1           1           -1          -1, 0, 1
20  Goodman–Kruskal                                         1           1           -1          -1, 0, 1
21  Implication index                                       -1          -1          1           1
22  Implication Intensity (II)                              1           1           -1          -1
23  Probabilistic measure of deviation from equilibrium     0           -1          0           0
    (IPEE), Indice Probabiliste d’Ecart d’Equilibre
24  Directed Information ratio (DIR)                        0, 1, -1    1, 0, -1    0, 1, -1    -1, 0, -1
25  MGK, Ion                                                -1, 1       1           -1, 1       -1
26  J-measure                                               1           1           -1          -1, 0, 1
27  Klosgen                                                 1           1           -1          -1
28  K-measure                                               1           1           -1          -1
29  Kulczynski index                                        0           1           -1          -1
30  Laplace                                                 0           1           0           -1
31  Least contradiction                                     0           1           -1          -1
32  Leverage, Leverage 1                                    1           -1          -1          -1
33  Mutual Information MI, 2-way Support Variation          1           1           -1          -1, 0, 1
34  Prevalence                                              -1          0           1           0
35  Putative Causal Dependency                              -1          1           -1          -1
36  Recall, Completeness                                    0           1           -1          -1
37  Sebag and Schoenauer                                    0           1           0           -1
38  Specificity 1, Negative Reliability                     1           1           -1          -1
39  Zhang Zhang                                             1           1           -1          -1

Table 6 shows that most of the measures vary monotonically, either increasing or
decreasing, with each of the four parameters $n$, $n_A$, $n_B$, $n_{A\bar{B}}$. The majority
of the measures increase with parameter $n_A$. This reflects a general rule of objective
interestingness measures: the more examples of a rule appear, the more reliable the
association rule is. Likewise, a relatively large number of measures decrease with
parameter $n_{A\bar{B}}$. This is also a general rule of interestingness measures: as the
number of counter-examples increases, the interestingness value decreases.
Among the surveyed measures, some vary in a more interesting way. They do not follow
the pattern of the majority, which always increase, always decrease or remain independent
with respect to the parameters $n$, $n_A$, $n_B$, $n_{A\bar{B}}$. For example, the variation
of Relative risk and Collective strength depends on the value of parameter $n_A$. The
partial derivative of these two measures with respect to $n_A$ is first positive, then
reaches 0, and finally becomes negative. This means that, as $n_A$ increases, each of
these measures first increases, reaches an extreme value and then decreases. The Gini
index, Goodman-Kruskal, J-measure and Mutual Information behave in a similar way with
respect to parameter $n_{A\bar{B}}$. In particular, the variation of Directed Information
Ratio (DIR) and MGK depends on specific conditions on $\frac{n_A - n_{AB}}{n_A}$,
$\frac{n - n_{A\bar{B}}}{n_A}$ and $\frac{n_B}{n}$.


Table 7: Classification of measures based on partial derivative under parameter n

Decreases with $n$:
    Coverage; Descriptive-Confirm; Gain, Fukuda; Implication index; Prevalence;
    Putative Causal Dependency
Independent of $n$:
    Confidence; Descriptive Confirmed-Confidence, Ganascia Index; Examples and
    counter-examples rate; Probabilistic measure of deviation from equilibrium (IPEE),
    Indice Probabiliste d’Ecart d’Equilibre; Kulczynski index; Laplace; Least
    contradiction; Recall, Completeness; Sebag and Schoenauer
Increases with $n$:
    1-way Support; Added value, Pavillon, Centred Confidence, Dependency; Bayes factor,
    Odd multiplier; Causal-Confidence; Causal-Confirmed confidence; Loevinger, Certainty
    Factor, Satisfaction; Relative Risk, Class correlation ratio; Collective strength;
    Causal Confirm; Conviction; Entropic Implication Intensity 1; Entropic Implication
    Intensity 2; Implication Intensity; Gini index; Goodman–Kruskal; J-measure; Klosgen;
    K-measure; Leverage, Leverage 1; Mutual Information MI, 2-way Support Variation;
    Specificity 1, Negative Reliability; Zhang Zhang
Others:
    Directed Information ratio (DIR); MGK, Ion

Table 7 shows that more than 50 percent of the surveyed measures increase with
parameter $n$. This reflects the fact that the interestingness value of a measure depends
on the size of the data set used to evaluate it. The measures in this class share one
characteristic: their speed of variation depends on the rate of change of parameter $n$.
In contrast, the class of measures that decrease with $n$ is relatively small. It
comprises Coverage, Descriptive-Confirm, Gain (Fukuda), Implication index, Prevalence and
Putative Causal Dependency. The measures that are independent of $n$ form a special
class, because most of them are descriptive in nature: if a measure satisfies the
descriptive property, its interestingness value does not depend on the value of
parameter $n$ (the size of the data). This result shows that examining the variation
tendency of the measures through partial derivatives gives results as accurate as other
methods. DIR and MGK form the last class; their variation depends on the specific values
of the parameters.


Figure 2: Comparison of the variation of Implication index and Implication intensity under parameter $n$
Figure 2 shows the decreasing variation of Implication index and the increasing
variation of Implication intensity under parameter $n$, obtained with the ARQAT tool
implemented in R (Nghia and Hiep, 2014). These two measures are representative of the
classes of measures that decrease and increase with parameter $n$, respectively.
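The opposite tendencies of the two measures can be reproduced with a few lines of code. The sketch below is independent of ARQAT and rests on two assumptions: the Implication index is taken in its usual form (observed minus expected counter-examples, divided by the square root of the expected number), and the Implication intensity is approximated by the Gaussian tail 1 - Phi(q) rather than by the factorial sum used in the derivative tables. The counts are arbitrary illustrative values.

```python
import math

def implication_index(n, n_a, n_b, n_ab_bar):
    # Expected number of counter-examples if A and B were independent.
    expected = n_a * (n - n_b) / n
    return (n_ab_bar - expected) / math.sqrt(expected)

def implication_intensity_gaussian(n, n_a, n_b, n_ab_bar):
    # Assumption: Gaussian approximation, intensity = 1 - Phi(q).
    q = implication_index(n, n_a, n_b, n_ab_bar)
    return 0.5 * math.erfc(q / math.sqrt(2))

# Illustrative counts: only n varies, the other three parameters are fixed.
n_a, n_b, n_ab_bar = 300, 400, 170
sizes = [1000, 1500, 2000, 3000]

index_values = [implication_index(n, n_a, n_b, n_ab_bar) for n in sizes]
intensity_values = [implication_intensity_gaussian(n, n_a, n_b, n_ab_bar) for n in sizes]

for n, q, phi in zip(sizes, index_values, intensity_values):
    print(f"n={n:5d}  index={q:8.3f}  intensity={phi:.6f}")
```

On these values the index falls and the intensity rises as $n$ grows, which is the behaviour Table 6 records for rows 21 and 22.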
Table 8: Classification of measures based on partial derivative under parameter nA
Decreases with $n_A$:
    Implication index; Probabilistic measure of deviation from equilibrium (IPEE),
    Indice Probabiliste d’Ecart d’Equilibre; Leverage, Leverage 1
Independent of $n_A$:
    Prevalence
Increases with $n_A$:
    1-way Support; Added value, Pavillon, Centred Confidence, Dependency; Bayes factor,
    Odd multiplier; Causal-Confidence; Causal-Confirmed confidence; Loevinger, Certainty
    Factor, Satisfaction; Confidence; Causal Confirm; Conviction; Coverage; Descriptive
    Confirmed-Confidence, Ganascia Index; Descriptive-Confirm; Entropic Implication
    Intensity 1; Entropic Implication Intensity 2; Examples and counter-examples rate;
    Gain, Fukuda; Gini index; Goodman–Kruskal; Implication Intensity; MGK, Ion;
    J-measure; Klosgen; K-measure; Kulczynski index; Laplace; Least contradiction;
    Mutual Information MI, 2-way Support Variation; Putative Causal Dependency; Recall,
    Completeness; Sebag and Schoenauer; Specificity 1, Negative Reliability; Zhang Zhang
Others:
    Relative Risk, Class correlation ratio; Collective strength; Directed Information
    ratio (DIR)

Table 8 shows that the class of objective interestingness measures that increase with
parameter $n_A$ accounts for a relatively high proportion (32/39), and that Prevalence is
the only measure independent of $n_A$. This result indicates that the interestingness of
an association rule $a \to b$ depends on the number of elements of the set $A$ ($n_A$):
as $n_A$ grows, these measures move toward their maximum value, as indicated by the sign
of the derivative. In particular, the measures derived from the Confidence measure belong
to the class that increases with $n_A$. This is consistent with the principle for
determining the reliability of an association rule $a \to b$. The class of measures that
decrease with $n_A$ is small (3 of the surveyed measures): Implication index, IPEE and
Leverage. For the group consisting of Relative risk (Class correlation ratio), Collective
strength and Directed Information ratio, the variation depends on the specific value of
parameter $n_A$.

Figure 3: Decreasing variation of Implication index under parameter $n_A$

The Implication index decreases with parameter $n_A$, as shown in Figure 3. It is
representative of the class of measures that decrease with $n_A$.
Table 9: Classification of measures based on partial derivative under parameter $n_B$
Decreases with $n_B$:
    1-way Support; Added value, Pavillon, Centred Confidence, Dependency; Bayes factor,
    Odd multiplier; Causal-Confidence; Causal-Confirmed confidence; Loevinger, Certainty
    Factor, Satisfaction; Relative Risk, Class correlation ratio; Collective strength;
    Causal Confirm; Conviction; Entropic Implication Intensity 1; Entropic Implication
    Intensity 2; Gini index; Goodman–Kruskal; Implication Intensity; J-measure; Klosgen;
    K-measure; Kulczynski index; Least contradiction; Leverage, Leverage 1; Mutual
    Information MI, 2-way Support Variation; Putative Causal Dependency; Recall,
    Completeness; Specificity 1, Negative Reliability; Zhang Zhang
Independent of $n_B$:
    Confidence; Coverage; Descriptive Confirmed-Confidence, Ganascia Index;
    Descriptive-Confirm; Examples and counter-examples rate; Gain, Fukuda; Probabilistic
    measure of deviation from equilibrium (IPEE), Indice Probabiliste d’Ecart
    d’Equilibre; Laplace; Sebag and Schoenauer
Increases with $n_B$:
    Implication index; Prevalence
Others:
    Directed Information ratio (DIR); MGK, Ion

Table 9 shows that most of the surveyed measures decrease with parameter $n_B$: as the
value of $n_B$ increases, their interestingness value decreases. This is entirely
consistent with the principles for determining the interestingness value of an
association rule $a \to b$, because the surveyed measures are asymmetric. As with
parameter $n$, most of the measures that are independent of $n_B$ are descriptive
measures, such as Confidence, Coverage, Laplace, and Sebag and Schoenauer. Only two
measures increase with $n_B$: Implication index and Prevalence. Their interestingness
value increases when the value of $n_B$ increases. As in the results for parameter $n$,
the variation of Directed Information Ratio and MGK depends on the specific value of
parameter $n_B$.
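As a worked illustration of the three monotone classes of Table 9, the partial derivative with respect to $n_B$ can be written out for one measure of each class. The block below assumes the usual definitions of Added value, Confidence and Prevalence in terms of the contingency counts; the resulting derivatives agree with rows 2, 9 and 34 of Table 4.

```latex
\begin{align*}
\text{Added value:} \quad
  f &= \frac{n_A - n_{A\bar{B}}}{n_A} - \frac{n_B}{n},
  & \frac{\partial f}{\partial n_B} &= -\frac{1}{n} < 0
  & &\text{(decreases with } n_B\text{)} \\
\text{Confidence:} \quad
  f &= \frac{n_A - n_{A\bar{B}}}{n_A},
  & \frac{\partial f}{\partial n_B} &= 0
  & &\text{(independent of } n_B\text{)} \\
\text{Prevalence:} \quad
  f &= \frac{n_B}{n},
  & \frac{\partial f}{\partial n_B} &= \frac{1}{n} > 0
  & &\text{(increases with } n_B\text{)}
\end{align*}
```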

Figure 4: Independence of the Laplace measure from parameter $n_B$

Figure 4 shows the independence of the Laplace measure from parameter $n_B$, obtained
with the ARQAT tool implemented in R (Nghia and Hiep, 2014). Laplace is representative
of the class of measures that are independent of parameter $n_B$.


Table 10: Classification of measures based on partial derivative under parameter $n_{A\bar{B}}$
Decreases with $n_{A\bar{B}}$:
    1-way Support; Added value, Pavillon, Centred Confidence, Dependency; Bayes factor,
    Odd multiplier; Causal-Confidence; Causal-Confirmed confidence; Loevinger, Certainty
    Factor, Satisfaction; Relative Risk, Class correlation ratio; Collective strength;
    Confidence; Causal Confirm; Conviction; Descriptive Confirmed-Confidence, Ganascia
    Index; Descriptive-Confirm; Entropic Implication Intensity 1; Entropic Implication
    Intensity 2; Examples and counter-examples rate; Gain, Fukuda; Implication Intensity;
    MGK, Ion; Klosgen; K-measure; Kulczynski index; Laplace; Least contradiction;
    Leverage, Leverage 1; Putative Causal Dependency; Recall, Completeness; Sebag and
    Schoenauer; Specificity 1, Negative Reliability; Zhang Zhang
Independent of $n_{A\bar{B}}$:
    Coverage; Probabilistic measure of deviation from equilibrium (IPEE), Indice
    Probabiliste d’Ecart d’Equilibre; Prevalence
Increases with $n_{A\bar{B}}$:
    Implication index
Others:
    Directed Information ratio (DIR); Gini index; Goodman–Kruskal; J-measure; Mutual
    Information MI, 2-way Support Variation

Table 10 shows that the class of objective interestingness measures that decrease with
parameter $n_{A\bar{B}}$ accounts for about 77% of the surveyed measures (30 of the 39
measures listed in the table). This accurately reflects the role of parameter
$n_{A\bar{B}}$ in determining the interestingness of an association rule $a \to b$: as the
number of counter-examples $n_{A\bar{B}}$ increases, the interestingness value of the rule
decreases. Most of the measures in this class are derived from the Confidence measure,
whose formula defines the reliability of an association rule,
$f(a \to b) = \frac{n_A - n_{A\bar{B}}}{n_A}$. Parameter $n_{A\bar{B}}$ is therefore always
inversely related to the reliability. The class of measures that are independent of
$n_{A\bar{B}}$ is relatively small. It comprises three measures: Coverage, IPEE and
Prevalence. Only one measure, the Implication index, increases with $n_{A\bar{B}}$. This
looks paradoxical, since an interestingness value is expected to decrease as the number
of counter-examples increases, whereas here it increases. The last class contains five
measures. The variation of the Directed Information ratio depends on constraints on two
expressions, $\frac{n_A - n_{AB}}{n_A}$ and $\frac{n_B}{n}$. The variation of the remaining
measures depends on the specific value of parameter $n_{A\bar{B}}$.
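For the Confidence measure itself, the four partial derivatives follow directly from the formula quoted above. This short worked example treats the four counts as continuous variables, as the derivative tables do, and recovers the signs listed for Confidence in Table 6 (0, 1, 0, -1).

```latex
\begin{align*}
f(a \to b) &= \frac{n_A - n_{A\bar{B}}}{n_A} = 1 - \frac{n_{A\bar{B}}}{n_A} \\
\frac{\partial f}{\partial n} &= 0
  && \text{(independent of } n\text{)} \\
\frac{\partial f}{\partial n_A} &= \frac{n_{A\bar{B}}}{n_A^2} > 0 \quad (n_{A\bar{B}} > 0)
  && \text{(increases with } n_A\text{)} \\
\frac{\partial f}{\partial n_B} &= 0
  && \text{(independent of } n_B\text{)} \\
\frac{\partial f}{\partial n_{A\bar{B}}} &= -\frac{1}{n_A} < 0
  && \text{(decreases with } n_{A\bar{B}}\text{)}
\end{align*}
```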

Figure 5: Variation of the J-measure depending on the specific value of parameter $n_{A\bar{B}}$
The variation of the J-measure depends on the specific value of parameter
$n_{A\bar{B}}$, as shown in Figure 5. The J-measure is representative of the class of
measures whose variation depends on the specific value of parameter $n_{A\bar{B}}$.
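This non-monotone behaviour is easy to check numerically. The sketch below assumes the two-term form of the J-measure that is consistent with its derivative in Table 5 (one term for the examples, one for the counter-examples); the counts and the helper name `j_measure` are illustrative choices, not values taken from the paper.

```python
import math

def j_measure(n, n_a, n_b, n_ab_bar):
    # Two-term J-measure on contingency counts (assumed form, see lead-in).
    n_ab = n_a - n_ab_bar  # number of examples of the rule a -> b
    examples = n_ab / n * math.log2(n * n_ab / (n_a * n_b))
    counter = n_ab_bar / n * math.log2(n * n_ab_bar / (n_a * (n - n_b)))
    return examples + counter

# Illustrative counts: only the number of counter-examples varies.
n, n_a, n_b = 1000, 300, 400
counts = range(20, 281, 20)
values = [j_measure(n, n_a, n_b, c) for c in counts]

# The turning point is where the sign of the partial derivative changes.
turning_point = counts[values.index(min(values))]
print("turning point at n_ab_bar =", turning_point)
```

With these counts the value decreases until the number of counter-examples reaches the level expected under independence, $n_A (n - n_B) / n = 180$, and increases afterwards, which corresponds to the entry "-1, 0, 1" for the J-measure in Table 6.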

5 Conclusion
The classification of interestingness measures has attracted many researchers in the
field of Knowledge Discovery from Databases. Existing studies focus on two main
classification methods: one based on the properties of the measures and the other based
on the behavior of the measures. In this paper, we examine the value variation of the
asymmetric objective interestingness measures by taking their partial derivatives with
respect to the four parameters $n$, $n_A$, $n_B$, $n_{A\bar{B}}$. The measures are
classified according to whether the derivative formula of each measure indicates an
increasing, decreasing or independent variation, which characterizes the variation
tendency of the asymmetric objective interestingness measures. This classification gives
researchers and users more insight into this group of measures: the increasing or
decreasing variation of each measure, the relationship between the value variation and
the statistical parameters $n$, $n_A$, $n_B$, $n_{A\bar{B}}$, and the interdependence
between these parameters in the formula that computes the interestingness value of each
measure. This information is a basis for choosing an appropriate measure for a specific
application.

References
[1] A. Silberschatz and A. Tuzhilin (1995), On subjective measures of interestingness
in knowledge discovery, KDD'95 - Proceedings of the First International
Conference on Knowledge Discovery and Data Mining, 275-281.


[2] A. Silberschatz and A. Tuzhilin (1996), What makes patterns interesting in
knowledge discovery systems, IEEE Transactions on Knowledge and Data
Engineering 8(6), 970-974.
[3] C. Tew et al. (2013), Behavior-based clustering and analysis of interestingness
measures for association rule mining, Journal of Data Mining and Knowledge
Discovery 28, Springer-Verlag, 1004-1045.
[4] David H. Glass (2013), Confirmation measures of association rule interestingness,
Knowledge-Based Systems 44, 65–77.
[5] F. Guillet and H. J. Hamilton (2007), Quality Measures in Data Mining - Series in
Computational Intelligence 43, Springer-Verlag.
[6] G. Piatetsky-Shapiro and C. J. Matheus (1994), The interestingness of deviations,
AAAI'94 - Knowledge Discovery in Databases Workshop, 25-36.
[7] G. Piatetsky-Shapiro (1991), Discovery, analysis, and presentation of strong rules,
Knowledge Discovery in Databases, 229-248.
[8] H. X. Huynh et al. (2007), A graph-based clustering approach to evaluate
interestingness measures: a tool and a comparative study (Chapter 2), Quality
Measures in Data Mining, Springer-Verlag, 25-50.
[9] H. X. Huynh et al. (2006), Extracting representative measures for the post-
processing of association rules, The 2006 IEEE International Conference on
Research, Innovation and Vision for the Future (RIVF’08), 100-106.
[10] H. X. Huynh et al. (2005), Data Analysis Approach for Evaluating the Behavior of
Interestingness Measures, Discovery Science (LNCS 3735), Springer-Verlag, 130-
137.
[11] H. X. Huynh et al. (2006), Evaluating Interestingness Measures with Linear
Correlation Graph, Advances in Applied Artificial Intelligence (LNCS 4031),
Springer-Verlag, 312 – 321.
[12] H. X. Huynh et al. (2012), Classification of objective interestingness measures,
Journal of Can Tho University (2011:20a), 147 - 158.
[13] J. Blanchard et al. (2009), Semantics-based classification of rule interestingness
measures, Post-Mining of Association Rules: techniques for effective knowledge
extraction (Y. Zhao, C. Zhang, L. Cao editors), IGI Global, 56-79.
[14] J. Liu et al. (2008), A New Interestingness Measure of Association Rules, The
Second International Conference on Genetic and Evolutionary Computing
(WGEC’08), 393 – 397.
[15] Jon Hills et al. (2012), Interestingness Measures for Fixed Consequent Rules,
Intelligent Data Engineering and Automated Learning - IDEAL 2012
(LNCS Volume 7435), Springer-Verlag, 68–75.
[16] K. Selvarangam and K. Ramesh Kumar (2014), Selecting Perfect Interestingness
Measures By Coefficient Of Variation Based Ranking Algorithm, Journal of
Computer Science (Volume 10, Issue 9), 1672-1679.
[17] L. Geng and H. J. Hamilton (2006), Interestingness measures for data mining: A
survey, ACM Computing Surveys (Volume 38), 1-32.
[18] M. Jalali-Heravi and O. Zaïane (2010), A study on interestingness measures for
associative classifiers, SAC’10 - Proceedings of the 25th ACM Symposium on
Applied Computing, 1039-1046.
[19] M. Martínez-Ballesteros et al. (2014), Selecting the best measures to discover
quantitative association rules, Neurocomputing (Volume 126), 3–14.
[20] Maddouri and Gammoudi (2007), On Semantic Properties of Interestingness
Measures for Extracting Rules from Data, ICANNGA (1) 2007, Springer-Verlag
Berlin Heidelberg, 148-158.
[21] P. Lenca et al. (2004), A multicriteria decision aid for interestingness measure
selection, LUSSI-TR-2004-01-EN, 1–27.
[22] P. Lenca et al. (2008), On selecting interestingness measures for association rules:
user oriented description and multiple criteria decision aid, European Journal of
Operational Research (Volume 184, Issue 2), 610–626.
[23] P. N. Tan et al. (2002), Selecting the Right Interestingness Measure for Association
Patterns, SIGKDD ’02 Edmonton, Alberta, Canada ACM 1-58113-567-X/02/0007,
1-10.
[24] P. N. Tan et al. (2004), Selecting the right objective measure for association
analysis, Information Systems (Volume 29, Issue 4), 293-313.
[25] R. Gras (1979), Contribution à l’étude expérimentale et à l’analyse de certaines
acquisitions cognitives et de certains objectifs en didactique des mathématiques,
PhD thesis, Université de Rennes 1.
[26] R. Gras and P. Kuntz (2008), An overview of the Statistical Implicative Analysis
(SIA) development, Statistical Implicative Analysis - Studies in Computational
Intelligence (Volume 127), Springer-Verlag, 11-40.
[27] R. Gras et al. (2014), Notion de champ implicatif en analyse statistique implicative,
article submitted to the international conference ASI8.
[28] R. Agrawal and R. Srikant (1994), Fast algorithms for mining association rules,
VLDB'94 - Proceedings of the 20th International Conference on Very Large Data
Bases, 487-499.
[29] S. Brin et al. (1997), Dynamic itemset counting and implication rules for market
basket data, Proceedings of the ACM SIGMOD international conference on
management of data, 255–264.
[30] S. Sahar (2003), What is interesting: studies on interestingness in knowledge
discovery, PhD thesis, School of Computer Science, Tel-Aviv University.
[31] S. Guillaume et al. (2012), Categorization of interestingness measures for
knowledge extraction, CoRR abs/1206.6741, 1-34.

VIII Colloque International – VIII International Conference
A.S.I. Analyse Statistique Implicative – Statistical Implicative Analysis
Radès (Tunisie) - Novembre 2015
http://sites.univ-lyon2.fr/ASI8/