
2015 3rd International Conference on Information and Communication Technology (ICoICT)

A Comparative Study on Market Basket Analysis and Apriori Association Technique
Warnia Nengsih
Department of Computer
Politeknik Caltex Riau, Riau, Indonesia
email: warnia@pcr.ac.id
This study carries out a comparison simulation of knowledge discovery using
market basket analysis without a particular algorithm and market basket
analysis using the apriori algorithm. The research serves as a lesson in
identifying the resulting concept, the rule-creation process, and the rules
obtained. It therefore offers new insight into each step, from the use of the
system through to the final result.

Abstract-Association rules are one of the data mining techniques used to
identify the relation between one item and another. Creating rules that
generate new knowledge requires determining how frequently the data appear in
the item sets, so that the percentage value of each datum is easier to
recognize; this can be done with certain algorithms, for example apriori. This
research discusses the comparison between market basket analysis using the
apriori algorithm and market basket analysis without an algorithm in creating
rules to generate new knowledge. The comparison indicators are the concept,
the rule-creation process, and the resulting rules. The comparison reveals
that both methods share the same concept and differ in the rule-creation
process, while the resulting rules remain the same.
Keywords-data mining, market basket analysis, apriori.

II. RELATED WORK

Market basket analysis is one of the data mining techniques used to discover
the pattern of relations between one item and another. The percentage with
which combined items occur together provides new knowledge, which is very
useful for decision makers. Market basket analysis can be combined with
several available algorithms. This research discusses the implementation of
market basket analysis without an algorithm and market basket analysis using
the apriori algorithm.

I. INTRODUCTION
Knowledge of the relations between the items in transaction data can be
obtained with the association rules technique. The association rules technique
is generally classified as an unsupervised (descriptive) data mining
technique, in which the knowledge is derived from the generated rules. The
pattern of relations among the items can serve as knowledge for determining
future strategy.

Association rules are stated in the form X → Y, where X and Y are disjoint
item sets, that is, X ∩ Y = ∅ (Tan et al., 2006). Association analysis relies
on two basic measures, support and confidence.

Many methods are used to determine the relations among items, and they can be
combined with various algorithms. Because knowledge discovery can apply
several techniques to the same data, a comparative analysis of the discovered
knowledge is needed. This research discusses the comparison between market
basket analysis using the apriori algorithm and market basket analysis without
an algorithm to generate new knowledge.

The simulation to discover new knowledge goes through several steps that
depend on the technique used. The comparison indicators are the concept, the
rule-creation process, and the resulting rules. Market basket analysis is
often referred to as association rules, while the apriori algorithm is one of
the algorithms of the association technique.

978-1-4799-7752-9/15/$31.00 2015 IEEE

Support
Support is the percentage of transactions that contain an item or a
combination of items. It is used to identify the item combinations that
fulfill the minimum support requirement. The support value of a single item is
obtained with the following formula:

$$S(A) = \frac{\text{number of transactions containing } A}{\text{total number of transactions}} \quad (1)$$

The support value of two items is obtained with the following formula:

$$S(A \cap B) = \frac{\text{number of transactions containing } A \text{ and } B}{\text{total number of transactions}} \quad (2)$$
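As a minimal sketch of formulas (1) and (2), the following Python snippet computes support over a small list of transactions. The transactions are hypothetical: they are an assumption made for illustration, constructed only so that their item and pair counts agree with the frequencies reported in this paper (for example, item a in 4 of 10 transactions and the pair a, b in 3 of 10). The function name `support` is likewise illustrative.

```python
# Hypothetical transactions: NOT the paper's tested item set, only a toy
# data set whose counts agree with the frequencies reported in the text.
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "c", "f"},
    {"b", "c", "f"},
    {"b", "d", "f"},
    {"b", "d", "f"},
    {"d", "f"},
    {"d", "e"},
    {"f"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(support({"a"}, TRANSACTIONS))       # formula (1): 4/10 -> 0.4
print(support({"a", "b"}, TRANSACTIONS))  # formula (2): 3/10 -> 0.3
```

An item combination is kept only if its support reaches the chosen minimum, which in the simulation discussed here corresponds to 3 of 10 transactions.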


Confidence
Confidence is the frequency with which item B appears in the transactions that
contain item A. After all high-frequency patterns have been found, the rules
are derived from them:

$$\mathrm{Conf}(A \rightarrow B) = P(B \mid A) = \frac{\text{number of transactions containing } A \text{ and } B}{\text{number of transactions containing } A} \quad (3)$$

The following is the process of creating the rules in the simulation of market
basket analysis without using an algorithm. The support and confidence values
are obtained as follows:

If a then b: support 3/10 = 0.3, confidence 3/4 = 0.75
If a then c: support 3/10 = 0.3, confidence 3/4 = 0.75
If b then c: support 3/10 = 0.3, confidence 3/6 = 0.5
If b then d: support 3/10 = 0.3, confidence 3/6 = 0.5
If b then f: support 3/10 = 0.3, confidence 3/6 = 0.5
If d then f: support 3/10 = 0.3, confidence 3/5 = 0.6
If b then a: support 3/10 = 0.3, confidence 3/6 = 0.5
If c then a: support 3/10 = 0.3, confidence 3/4 = 0.75
If c then b: support 3/10 = 0.3, confidence 3/4 = 0.75
If d then b: support 3/10 = 0.3, confidence 3/5 = 0.6
If f then b: support 3/10 = 0.3, confidence 3/6 = 0.5
If f then d: support 3/10 = 0.3, confidence 3/6 = 0.5
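The confidence arithmetic above can be reproduced with a short Python sketch. The item frequencies are read off the denominators of the listed confidence values (a: 4, b: 6, c: 4, d: 5, f: 6) and every listed pair occurs together in 3 of the 10 transactions. The names `ITEM_COUNT`, `PAIR_COUNT`, `support` and `confidence` are assumptions introduced for illustration only.

```python
from fractions import Fraction

TOTAL_TRANSACTIONS = 10

# Item frequencies taken from the denominators of the confidence values.
ITEM_COUNT = {"a": 4, "b": 6, "c": 4, "d": 5, "f": 6}

# Each listed pair of items occurs together in 3 transactions.
PAIR_COUNT = {
    frozenset("ab"): 3,
    frozenset("ac"): 3,
    frozenset("bc"): 3,
    frozenset("bd"): 3,
    frozenset("bf"): 3,
    frozenset("df"): 3,
}


def support(antecedent, consequent):
    """Formula (2): transactions containing both items / all transactions."""
    pair = frozenset((antecedent, consequent))
    return Fraction(PAIR_COUNT[pair], TOTAL_TRANSACTIONS)


def confidence(antecedent, consequent):
    """Formula (3): transactions with both items / transactions with antecedent."""
    pair = frozenset((antecedent, consequent))
    return Fraction(PAIR_COUNT[pair], ITEM_COUNT[antecedent])


for pair in PAIR_COUNT:
    x, y = sorted(pair)
    for lhs, rhs in ((x, y), (y, x)):
        print(f"if {lhs} then {rhs}: "
              f"support={float(support(lhs, rhs)):.2f}, "
              f"confidence={float(confidence(lhs, rhs)):.2f}")
```

Using `Fraction` keeps the values exact, so a rule such as "if b then a" is reported as 3/6 rather than as a rounded decimal.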
The apriori algorithm belongs to the association rules family of data mining
techniques. Besides apriori, other methods of this type are the generalized
rule induction method and the hash-based algorithm. To obtain the frequent
item sets, the processes are (Erwin, 2009):
1. Combination
   Items are combined into candidate item sets until no further combination
   can be formed.
2. Reduction
   The resulting combinations are then reduced using the minimum support.
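The two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names `combine`, `reduce_by_support` and `frequent_itemsets`, the demo transactions, and the minimum support of 0.5 are all hypothetical.

```python
from itertools import combinations


def combine(frequent_sets, k):
    """Combination: build k-item candidates from frequent (k-1)-item sets."""
    items = sorted(set().union(*frequent_sets))
    candidates = set()
    for combo in combinations(items, k):
        # Keep a candidate only if all of its (k-1)-item subsets are frequent.
        subsets = combinations(combo, k - 1)
        if all(frozenset(s) in frequent_sets for s in subsets):
            candidates.add(frozenset(combo))
    return candidates


def reduce_by_support(candidates, transactions, min_support):
    """Reduction: drop candidates whose support is below the minimum."""
    kept = {}
    for candidate in candidates:
        hits = sum(1 for t in transactions if candidate <= t)
        value = hits / len(transactions)
        if value >= min_support:
            kept[candidate] = value
    return kept


def frequent_itemsets(transactions, min_support):
    """Alternate combination and reduction until no item set survives."""
    singles = {frozenset([item]) for t in transactions for item in t}
    level = reduce_by_support(singles, transactions, min_support)
    result = dict(level)
    k = 2
    while level:
        candidates = combine(set(level), k)
        level = reduce_by_support(candidates, transactions, min_support)
        result.update(level)
        k += 1
    return result


# Hypothetical demo data, unrelated to the paper's tested item set.
demo = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
for itemset, value in sorted(frequent_itemsets(demo, 0.5).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), value)
```

In the demo, the pair b, c is removed in the reduction step, so the three-item candidate a, b, c is never counted; this is the saving that the combination and reduction cycle is meant to provide.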
III. RESULT AND ANALYSIS

This section discusses the results of market basket analysis without the
apriori algorithm and with the apriori algorithm in terms of the concept, the
rule-creation process, and the resulting rules.

The complete support and confidence values can be seen in Table 2.

Judging from the results of the calculated simulation, market basket analysis
without an algorithm and market basket analysis with an algorithm share the
same basic concept. The following is the process of creating the rules.

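TABLE 1 TESTED ITEM SET
[Table body not recoverable; surviving column header: Item Set]

TABLE 2 SUPPORT VALUE AND CONFIDENCE

Rule | Support; Confidence
If a then b | 0.3; 0.75
If a then c | 0.3; 0.75
If b then c | 0.3; 0.5
If b then d | 0.3; 0.5
If b then f | 0.3; 0.5
If d then f | 0.3; 0.6
If b then a | 0.3; 0.5
If c then a | 0.3; 0.75
If c then b | 0.3; 0.75
If d then b | 0.3; 0.6
If f then b | 0.3; 0.5
If f then d | 0.3; 0.5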

Table 1 shows the simulated case used to compare the two methods, from which
the new findings are derived.

From the calculation steps on the item sets above, 12 rules are obtained:
{a,b}, {a,c}, {b,c}, {b,d}, {b,f}, {d,f}, {b,a}, {c,a}, {c,b}, {d,b}, {f,b},
{f,d}. Written in if-then form, the 12 rules are:
If a then b
If a then c
If b then c
If b then d
If b then f
If d then f
If b then a
If c then a
If c then b
If d then b
If f then b
If f then d

Another important step is determining the support and confidence values.
Support and confidence have a significant influence on the creation of the
rules.

The calculated confidence and support values can be seen in Table 5. From that
table, attention is focused on the records with high support and confidence
values, namely those with a confidence equal to 75%.

According to Table 2, the 12 resulting rules are then selected by confidence
value, in line with the concept of item relations, which yields the following
rules: {a,b}, {a,c}, {c,a}, {c,b}.

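For the simulation of market basket analysis using the apriori algorithm, the
rules are created through the following process:

TABLE III. TESTED ITEM SET
[Table body not recoverable; surviving column header: Customer; surviving cell values: 10, 4, 6, 3, 6, 3]

According to Table 3, the co-occurrence frequency of the items is obtained as
shown in Table 4.

TABLE IV. FREQUENCY COMBINATION OF ITEMS
[Table body not recoverable; column headers: Item Set, Frekuensi Bersamaan (co-occurrence frequency)]

TABLE V. FREQUENCY COMBINATION OF ITEMS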

Rule | Support | Confidence
F1={a,b}, if a then b | 3/10 = 30% | 3/4 = 75%
F1={b,a}, if b then a | 3/10 = 30% | 3/6 = 50%
F1={c,a}, if c then a | 3/10 = 30% | 3/4 = 75%
F1={a,c}, if a then c | 3/10 = 30% | 3/4 = 75%
F1={c,b}, if c then b | 3/10 = 30% | 3/4 = 75%
F1={b,c}, if b then c | 3/10 = 30% | 3/6 = 50%
F1={d,b}, if d then b | 3/10 = 30% | 3/5 = 60%
F1={b,d}, if b then d | 3/10 = 30% | 3/6 = 50%
F1={f,b}, if f then b | 3/10 = 30% | 3/6 = 50%
F1={b,f}, if b then f | 3/10 = 30% | 3/6 = 50%
F1={f,d}, if f then d | 3/10 = 30% | 3/6 = 50%
F1={d,f}, if d then f | 3/10 = 30% | 3/5 = 60%

From the above result, 4 rules are obtained: {a,b}, {a,c}, {c,a}, {c,b}. The
following is the comparison of the rules obtained from the two methods
discussed.
TABLE 6 THE COMPARISON OF THE RULE ACHIEVED

Apriori rule | MBA rule
F1={a,b}, if a then b | F1={a,b}, if a then b
F2={c,a}, if c then a | F2={c,a}, if c then a
F3={a,c}, if a then c | F3={a,c}, if a then c
F4={c,b}, if c then b | F4={c,b}, if c then b


Table 6 shows that both compared methods produce the same rules. The overall
result of the comparison of the two methods is shown in Table 7.
TABLE 7 ANALYSIS OF COMPARISON OF THOSE METHODS

Measured Parameter | MBA with Apriori vs MBA without algorithm
Concept | Same
The creation of the rule | Different
The resulted rule | Same
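The pattern summarized in Table 7 (same rules, different process) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the transactions are hypothetical, built so that their counts agree with the frequencies reported in this paper plus one rare item e, and the function names are invented here. The exhaustive variant tests every possible pair of items, while the apriori-style variant first discards items that are infrequent on their own.

```python
from itertools import combinations, permutations

# Hypothetical transactions: a toy data set, not the paper's tested item set.
TRANSACTIONS = [
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "f"},
    {"b", "c", "f"}, {"b", "d", "f"}, {"b", "d", "f"}, {"d", "f"},
    {"d", "e"}, {"f"},
]
MIN_SUPPORT = 0.3
MIN_CONFIDENCE = 0.75
ITEMS = sorted(set().union(*TRANSACTIONS))


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in TRANSACTIONS if set(itemset) <= t)


def is_frequent(itemset):
    return count(itemset) / len(TRANSACTIONS) >= MIN_SUPPORT


def exhaustive_pairs():
    """Without a particular algorithm: test every possible pair of items."""
    return {frozenset(p) for p in combinations(ITEMS, 2) if is_frequent(p)}


def apriori_pairs():
    """Apriori style: only combine items that are frequent on their own."""
    frequent_items = [i for i in ITEMS if is_frequent({i})]
    return {frozenset(p) for p in combinations(frequent_items, 2)
            if is_frequent(p)}


def rules_from_pairs(pairs):
    """Turn frequent pairs into rules that reach the minimum confidence."""
    rules = set()
    for pair in pairs:
        for lhs, rhs in permutations(sorted(pair), 2):
            if count({lhs, rhs}) / count({lhs}) >= MIN_CONFIDENCE:
                rules.add((lhs, rhs))
    return rules


print(sorted(rules_from_pairs(exhaustive_pairs())))
print(sorted(rules_from_pairs(apriori_pairs())))
```

On this toy data both variants yield the rules a to b, a to c, c to a and c to b; the apriori-style variant reaches them without ever counting a pair that involves the rare item e.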

IV. CONCLUSION AND DISCUSSION


Both compared methods produce the same rules. Overall, the comparison shows
that the two methods share the same concept and differ in the process of
creating the rules, while the resulting rules are the same.
REFERENCES
[1] Pal, Shankar K. & Mitra, Pabitra, Pattern Recognition Algorithms for Data Mining. CRC Press, 2004.
[2] Tan, P. N., Steinbach, M., Kumar, V., Introduction to Data Mining. Addison-Wesley, 2006.
[3] Wu, Xindong, Kumar, Vipin, The Top Ten Algorithms in Data Mining. New York: CRC, 2009.
[4] Hand, David J., Principles of Data Mining. MIT Press, 2001.
[5] Edelstein, Herbert A., Introduction to Data Mining and Knowledge Discovery, Third Edition. Two Crows Corporation, 1999.
[6] Han, Jiawei, Kamber, Micheline, Data Mining: Concepts and Techniques. San Diego, CA: Academic Press, 2001.
[8] Olson, David, Shi, Yong, Introduction to Business Data Mining. McGraw-Hill International Edition, 2007.
[9] Edelstein, Herbert, Introduction to Data Mining and Knowledge Discovery, Third Edition. Two Crows Corporation, 1999.
[10] Young, M., The Technical Writer's Handbook. Mill Valley, CA: University Science, 1989.