Association Rule Mining
Applied to Remotely Sensed Data

Presented by
Madhusmita Sahu
(CSE, 950014)
Contents
Introduction
Apriori Algorithm
Mining Rules from Imagery Data
- Problem definition
- Partitioning quantitative attributes
- Finding large itemsets from imagery data
New Pruning Techniques for Fast Data Mining
- Technique one
- Technique two
An Example of Applying the New Algorithm
Conclusion
References
REMOTE SENSING
Remote sensing acquires information about the Earth's surface from satellite or airborne sensors; each pixel of a multispectral image records a reflectance value in each spectral band.
Association Rule Mining
Associations
Simple rules in categorical data
Sample applications
Market Basket Analysis
Buys(Milk) ⇒ Buys(Eggs)
Transaction Processing
Income(Hi) & Single(Y) ⇒ Owns(Computer)
Search for Strong Rules
Support(A ⇒ B) = P(A ∪ B)
Confidence(A ⇒ B) = P(B | A) = P(A ∩ B) / P(A)
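The two measures above can be computed directly from a transaction list; a minimal sketch with a hypothetical market-basket database (item names are illustrative, not from the slides):

```python
# A hypothetical market-basket database; each transaction is a set of items.
transactions = [
    {"Milk", "Eggs", "Bread"},
    {"Milk", "Eggs"},
    {"Milk", "Butter"},
    {"Eggs", "Bread"},
]

def support(itemset, transactions):
    """support(A) = fraction of transactions containing every item of A."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """confidence(A => B) = P(B | A) = support(A and B) / support(A)."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

print(support({"Milk", "Eggs"}, transactions))       # 0.5
print(confidence({"Milk"}, {"Eggs"}, transactions))  # 0.6666666666666666
```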
The Apriori Algorithm : Pseudo code
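The Apriori procedure named above can be sketched as a minimal runnable implementation (assumptions: set-valued transactions and an absolute minimum support count):

```python
from itertools import combinations

def apriori(transactions, minsup_count):
    """Level-wise Apriori sketch: returns every large (frequent) itemset
    as {frozenset: support_count}. minsup_count is an absolute count."""
    # L1: large 1-itemsets
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    large = {s: c for s, c in counts.items() if c >= minsup_count}
    result = dict(large)
    k = 2
    while large:
        # Join step: Lk-1 * Lk-1 -> candidate k-itemsets
        candidates = set()
        for a, b in combinations(large, 2):
            union = a | b
            if len(union) == k:
                # Prune step: every (k-1)-subset of a candidate must be large
                if all(frozenset(sub) in large for sub in combinations(union, k - 1)):
                    candidates.add(union)
        # Support counting over the whole database
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        large = {s: c for s, c in counts.items() if c >= minsup_count}
        result.update(large)
        k += 1
    return result
```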
NEW PRUNING TECHNIQUES FOR FAST DATA MINING
Technique one
Ck : candidate k-itemsets
Lk : large k-itemsets
* : the concatenation (join) operation
|Ck| : number of itemsets in the candidate k-itemsets
|Lk| : number of itemsets in the large k-itemsets
Rj : number of intervals from band j that are large 1-itemsets (so |L1| = R1 + R2 + … + RN)

1. According to the Apriori algorithm, Apriori uses L1 * L1 to generate the candidate set of 2-itemsets C2:
|C2|apriori = |L1|(|L1| − 1) / 2
            = (R1 + R2 + … + RN)(R1 + R2 + … + RN − 1) / 2
Contd…
Two different intervals from the same band can never occur together in the same pixel, so any candidate 2-itemset whose two intervals come from the same band has zero support. Technique one prunes these candidates before support counting.
Technique two
During the mining process, allowing the user to interact with the mining engine and exploiting the user's prior knowledge speeds up the mining algorithm by restricting the search space.
Consider only one band, bandN, in the output. The association rules then have the form: band1 Λ … Λ band(N−1) ⇒ bandN.
We are not interested in itemsets that do not contain bandN, so we prune every candidate 2-itemset in which no interval is chosen from bandN.
The number of remaining candidate 2-itemsets is
|C2|new = RN × (R1 + R2 + … + R(N−1))
The number of candidate 2-itemsets pruned by this technique is
|C2|prune2 = Σ Ri × Rj over all band pairs 1 ≤ i < j ≤ N−1
Contd…
The number of candidate 2-itemsets pruned by technique one (both intervals from the same band) is
|C2|prune1 = Σ Rj(Rj − 1) / 2 over all bands j = 1 … N
The total number of pruned candidate 2-itemsets is
|C2|prune = |C2|prune1 + |C2|prune2
Contd…
|C2|prune = [Σ Rj(Rj − 1) / 2 over j = 1 … N] + [Σ Ri × Rj over 1 ≤ i < j ≤ N−1]
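As a sanity check, the counting formulas can be evaluated numerically; a short sketch assuming the per-band large-interval counts of the later example (Rj = 2 for each of N = 4 bands):

```python
# Numeric check of the pruning formulas, using the per-band large-interval
# counts from the slides' example: Rj = 2 for each of N = 4 bands.
R = [2, 2, 2, 2]
N = len(R)

c2_apriori = sum(R) * (sum(R) - 1) // 2        # |C2|apriori = |L1|(|L1|-1)/2
prune1 = sum(r * (r - 1) // 2 for r in R)      # same-band pairs (technique one)
prune2 = sum(R[i] * R[j]                       # pairs without bandN (technique two)
             for i in range(N - 1) for j in range(i + 1, N - 1))
c2_new = R[-1] * sum(R[:-1])                   # |C2|new = RN * (R1 + ... + R(N-1))

print(c2_apriori, prune1, prune2, c2_new)  # 28 4 12 12
```

The printed values reproduce the example's counts: 28 Apriori candidates, 16 pruned, 12 remaining.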
Steps
Step 1: Choose one of the partitioning methods (equal depth, uneven depth, or discontinuous partition) to determine the intervals.
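Step 1's equal-depth option can be sketched as follows; `equal_depth_intervals` is a hypothetical helper, and "diameter" is read here as the number of intervals per band (an assumption):

```python
def equal_depth_intervals(values, n_intervals):
    """Equal-depth (equal-frequency) partitioning: split the sorted values
    into n_intervals bins of (near) equal size and return the (low, high)
    value range of each bin."""
    ordered = sorted(values)
    size, rem = divmod(len(ordered), n_intervals)
    bounds, start = [], 0
    for i in range(n_intervals):
        end = start + size + (1 if i < rem else 0)  # spread the remainder
        bounds.append((ordered[start], ordered[end - 1]))
        start = end
    return bounds

# Eight hypothetical band-1 pixel values split into two intervals:
print(equal_depth_intervals([12, 45, 7, 30, 22, 9, 41, 18], 2))
# [(7, 18), (22, 45)]
```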
An example of applying the new algorithm (assume the user selects equal-depth partitioning: diameter two for band1 and band4, diameter three for band2 and band3)

Pixel  b11 b12 b13 b14 b21 b25 b26 b28 b31 b32 b37 b38 b41 b42 b43 b44
1       1   0   0   0   0   1   0   0   0   0   1   0   0   0   0   1
2       1   0   0   0   0   1   0   0   0   0   1   0   0   0   0   1
3       1   0   0   0   0   1   0   0   0   0   1   0   0   0   1   0
4       0   1   0   0   0   0   1   0   0   1   0   0   0   1   0   0
5       0   1   0   0   0   0   1   0   0   1   0   0   0   1   0   0
Contd…
Apply the new pruning techniques for candidate 2-itemset generation. Assume minsup = 40% and minconf = 60%.
Candidate 1-itemsets:
{b11,b12,b13,b14,b21,b22,b23,b24,b25,b26,b27,b28,b31,b32,b33,b34,b35,b36,b37,b38,b41,b42,b43,b44}
• Large 1-itemsets (support counts in parentheses):
{b11(3),b12(2),b25(3),b26(2),b32(2),b37(3),b42(2),b44(2)}
• Candidate 2-itemsets: {{b42,b11},{b42,b12},{b42,b25},{b42,b26},{b42,b32},{b42,b37},{b44,b11},{b44,b12},{b44,b25},{b44,b26},{b44,b32},{b44,b37}}
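The pruned candidate generation above can be reproduced in a few lines; the sketch below assumes the band of an interval bXY is the digit X (consistent with the slides' naming):

```python
from itertools import combinations

# Large 1-itemsets from the slide example.
large_1 = ["b11", "b12", "b25", "b26", "b32", "b37", "b42", "b44"]
OUTPUT_BAND = "4"  # bandN, the only band wanted in the rule consequent

def band(item):
    return item[1]  # the first digit of "bXY" is the band number

candidates = []
pruned_same_band = 0   # technique one
pruned_no_bandN = 0    # technique two
for a, b in combinations(large_1, 2):
    if band(a) == band(b):
        pruned_same_band += 1      # two intervals of one band never co-occur
    elif OUTPUT_BAND not in (band(a), band(b)):
        pruned_no_bandN += 1       # itemset cannot yield a bandN consequent
    else:
        candidates.append({a, b})

print(pruned_same_band, pruned_no_bandN, len(candidates))  # 4 12 12
```

Of the 28 Apriori candidate pairs, 16 are pruned and 12 survive, matching the counts on the next slide.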
An example contd…
• Applying pruning technique one: |C2|prune1 = 1 + 1 + 1 + 1 = 4
• Applying pruning technique two: |C2|prune2 = 2 × (2 + 2) + 2 × 2 = 12
• Total number of pruned candidate 2-itemsets: 12 + 4 = 16
• Applying the Apriori algorithm, the number of candidate 2-itemsets would be |C2|apriori = (8 × 7) / 2 = 28
• The percentage of pruning is 16/28 ≈ 57%, so the execution efficiency of the mining process is improved.
• The remaining steps are the same as in the Apriori algorithm.
Conclusion
References
Thank You!!