AssociationBasedRecommendation PDF

Association based Recommender System
Mamata Jenamani
Professor
Department of Industrial & Systems Engineering
Association based recommendation system
• A variation of collaborative filtering
• Recommends items that are likely to be purchased together
with items that the user has purchased in the past
or has shown interest in purchasing
– Based on co-occurrences of items that users frequently
purchase or view together
• Information used
– Unary rating
• Type of recommendation decision
– Prediction
– Top-N recommendations
• Personalized
Introduction to frequent pattern analysis
• Frequent pattern: a pattern (a set of items, subsequences,
substructures, etc.) that occurs frequently in a data set
• Frequent pattern analysis is the basis of association rule
mining
• Motivation: Finding inherent regularities in data
– What products were often purchased together?
– What are the subsequent purchases after buying a PC?
– What kinds of DNA are sensitive to this new drug?
– Can we automatically classify web documents?
• Applications
– Basket data analysis, cross-marketing, catalog design, sale campaign
analysis, Web log (click stream) analysis, and DNA sequence analysis.
Basic Concepts: Frequent Patterns
and Association Rules
Transaction-id   Items bought
10               A, B, D
20               A, C, D
30               A, D, E
40               B, E, F
50               B, C, D, E, F
[Figure: Venn diagram of customers who buy A, customers who buy B, and customers who buy both]
• Itemset X = {x1, …, xk}
• Find all the rules X → Y with minimum support and confidence
– support, s: probability that a transaction contains X ∪ Y
– confidence, c: conditional probability that a transaction
having X also contains Y
• Let supmin = 50%, confmin = 50%
– Freq. Pat.: {A:3, B:3, D:4, E:3, AD:3}
– Association rules:
   A → D (60%, 100%)
   D → A (60%, 75%)
Interestingness measures
• Association rule mining searches for interesting relationships
among items in a given data set.
• Two measures of interestingness: support and confidence
• Find all the rules X → Y with minimum support and confidence
– support, S: probability that a transaction contains X ∪ Y
• S = (# of tuples containing both X and Y)/(total number of tuples)
• Support count = # of tuples containing both X and Y
– confidence, C: conditional probability that a transaction having
X also contains Y
• C = (# of tuples containing both X and Y)/(# of tuples containing X)
= (support count of X ∪ Y)/(support count of X)
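The two measures can be checked against the five-transaction example from the Basic Concepts slide. The following is a minimal sketch; the helper names support_count, support, and confidence are illustrative assumptions, not part of the original slides.

```python
# Toy transaction database from the Basic Concepts slide (transaction id -> items bought).
transactions = {
    10: {"A", "B", "D"},
    20: {"A", "C", "D"},
    30: {"A", "D", "E"},
    40: {"B", "E", "F"},
    50: {"B", "C", "D", "E", "F"},
}

def support_count(itemset, db):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for items in db.values() if set(itemset) <= items)

def support(x, y, db):
    """S = (# transactions containing both X and Y) / (total # transactions)."""
    return support_count(set(x) | set(y), db) / len(db)

def confidence(x, y, db):
    """C = (# transactions containing both X and Y) / (# transactions containing X)."""
    return support_count(set(x) | set(y), db) / support_count(x, db)

print(support({"A"}, {"D"}, transactions), confidence({"A"}, {"D"}, transactions))  # 0.6 1.0
print(support({"D"}, {"A"}, transactions), confidence({"D"}, {"A"}, transactions))  # 0.6 0.75
```

The printed values agree with the rules A → D (60%, 100%) and D → A (60%, 75%) given on the slide.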
Algorithms for association rule mining
• Three major approaches
– Apriori algorithm
– Frequent pattern growth
– Vertical data format approach
The Apriori Algorithm
• Apriori Principle
– Suppose an itemset is not frequent (i.e., does not have the
minimum support). If an item A is added to this set, the
resulting set cannot occur more frequently, so it cannot be frequent either.
– This is an anti-monotone property
• If a set cannot pass a test, then all its supersets will also fail the
test.
– Two steps of the algorithm
• Join
• Prune
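The anti-monotone property can be verified by brute force on a small data set. The sketch below reuses the five-transaction example from the Basic Concepts slide and checks that adding an item to an itemset never raises its support count; the names support_count and violations are assumptions made for illustration.

```python
from itertools import combinations

transactions = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"A", "D", "E"},
    {"B", "E", "F"},
    {"B", "C", "D", "E", "F"},
]

def support_count(itemset, db):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in db if itemset <= t)

items = sorted(set().union(*transactions))

# Try every itemset and every single-item extension of it, and record any case
# where the extended set is more frequent than the set it extends.
violations = [
    (base, extra)
    for k in range(1, len(items))
    for base in map(frozenset, combinations(items, k))
    for extra in items
    if extra not in base
    and support_count(base | {extra}, transactions) > support_count(base, transactions)
]
print(violations)  # [] : no extension is ever more frequent than its base set
```

This is why the prune step can discard a candidate as soon as one of its subsets is known to be infrequent, without scanning the database.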
The algorithm
• Scan the database D once to count the candidate 1-itemsets C1
• C1 = Prune(C1)
• L1 ← C1
• Repeat the join step until no frequent or candidate set can be generated
• Join
– Ck ← set of k-itemsets generated by joining Lk-1 with itself
– Ck = Prune(Ck)
– Lk ← Ck
• Prune(Ck)
– Delete the itemsets in Ck that do not satisfy the Apriori property
– If any (k-1)-subset of a candidate is not in Lk-1, then the
k-itemset cannot be frequent
– Scan D to get the frequency count of each set in Ck. Delete the sets
that do not satisfy the minimum support count.
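The join and prune steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function name apriori and the parameter min_count are assumptions, and the join simply unions pairs of frequent (k-1)-itemsets instead of using the usual prefix-based join. After pruning, both joins give the same candidate set.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    def count(candidates):
        # One scan of the database: count each candidate itemset.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # First scan: candidate 1-itemsets C1, then keep those meeting min_count (L1).
    c1 = {frozenset([item]) for t in transactions for item in t}
    level = {s: n for s, n in count(c1).items() if n >= min_count}
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join: union pairs of frequent (k-1)-itemsets that form a k-itemset.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Scan: count the survivors and keep those meeting min_count (Lk).
        level = {s: n for s, n in count(candidates).items() if n >= min_count}
        frequent.update(level)
        k += 1
    return frequent

# Five-transaction example from the Basic Concepts slide; supmin = 50% means count >= 3.
db = [{"A", "B", "D"}, {"A", "C", "D"}, {"A", "D", "E"},
      {"B", "E", "F"}, {"B", "C", "D", "E", "F"}]
result = apriori(db, min_count=3)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

On this data the sketch returns A:3, B:3, D:4, E:3, and AD:3, matching the frequent patterns listed on the Basic Concepts slide.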
Assignment
• Derive the frequent patterns from the given transaction database.
• Generate association rules.
Tid   Items
10    A, C, D
20    B, C, E
30    A, B, C, E
40    B, E
Solution
Supmin = 2 (50%)
Database TDB
Tid   Items
10    A, C, D
20    B, C, E
30    A, B, C, E
40    B, E
1st scan: C1
Itemset   sup
{A}       2
{B}       3
{C}       3
{D}       1
{E}       3
L1
Itemset   sup
{A}       2
{B}       3
{C}       3
{E}       3
C2 (candidates from joining L1 with itself)
Itemset
{A, B}
{A, C}
{A, E}
{B, C}
{B, E}
{C, E}
2nd scan: C2 with counts
Itemset   sup
{A, B}    1
{A, C}    2
{A, E}    1
{B, C}    2
{B, E}    3
{C, E}    2
L2
Itemset   sup
{A, C}    2
{B, C}    2
{B, E}    3
{C, E}    2
3rd scan: C3
Itemset     sup
{A, B, C}   1
{A, B, E}   1
{A, C, E}   1
{B, C, E}   2
L3
Itemset     sup
{B, C, E}   2
• If the prune step is applied before the 3rd scan, only {B, C, E} remains in C3,
because each of the other candidates has a 2-subset that is not in L2.
Solution
• Association rules and confidence
– B → C {2/3}
– C → B {2/3}
– B → E {3/3}
– E → B {3/3}
– B → {C, E} {2/3}
– {C, E} → B {2/2}
– C → {B, E} {2/3}
– {B, E} → C {2/3}
– E → {B, C} {2/3}
– {B, C} → E {2/2}
• Assuming we keep only the rules with 100% confidence, 4 of the rules
listed above qualify: B → E, E → B, {C, E} → B, and {B, C} → E
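Rule confidences such as these can be derived mechanically from the support counts. The sketch below enumerates only the rules obtained by splitting the frequent itemset {B, C, E} into a left-hand and a right-hand side, using the support counts from the solution tables; the function name rules_from and the counts dictionary are illustrative assumptions.

```python
from itertools import combinations

# Support counts taken from the solution tables (L1, L2, L3).
counts = {
    frozenset("B"): 3, frozenset("C"): 3, frozenset("E"): 3,
    frozenset("BC"): 2, frozenset("BE"): 3, frozenset("CE"): 2,
    frozenset("BCE"): 2,
}

def rules_from(itemset, counts):
    """Yield (lhs, rhs, confidence) for every split of `itemset` into LHS -> RHS."""
    itemset = frozenset(itemset)
    for size in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), size):
            lhs = frozenset(lhs)
            # confidence = support count of the whole itemset / support count of the LHS
            yield lhs, itemset - lhs, counts[itemset] / counts[lhs]

for lhs, rhs, conf in rules_from("BCE", counts):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

Of the six rules produced for {B, C, E}, only {B, C} → E and {C, E} → B reach a confidence of 1.0; the other four have confidence 2/3.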
Association rule based recommendation
generation
• Generate association rules from the transaction database
• To generate Top-N recommendations
– Find the association rules supported by the active user (rules whose
LHS appears in the active user's transaction)
– Let Ip be the set of unique items suggested by the RHS of these rules
– Sort Ip by the confidence of the rules that suggest each item.
An item that appears in more rules scores higher.
– Choose the top N of these items
• Prediction
– An item can be recommended if it appears in the RHS of an
association rule supported by the active user.
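The Top-N procedure above can be sketched as follows. The slides do not say exactly how confidences are combined when several rules suggest the same item, so this sketch assumes the score of an item is the sum of the confidences of the supported rules that suggest it. It also assumes that items the user already has are not recommended again. The function name recommend and the rule list are hypothetical; the rules are a subset of those in the worked solution.

```python
from collections import defaultdict

# Hypothetical rule list: (LHS, RHS, confidence), taken from the worked solution.
rules = [
    (frozenset("B"), frozenset("C"), 2 / 3),
    (frozenset("B"), frozenset("E"), 3 / 3),
    (frozenset("B"), frozenset("CE"), 2 / 3),
    (frozenset("C"), frozenset("B"), 2 / 3),
    (frozenset("BC"), frozenset("E"), 2 / 2),
    (frozenset("CE"), frozenset("B"), 2 / 2),
]

def recommend(user_items, rules, n):
    """Return up to n items ranked by the summed confidence of supported rules."""
    user_items = set(user_items)
    scores = defaultdict(float)
    for lhs, rhs, conf in rules:
        if lhs <= user_items:              # rule is supported by the active user
            for item in rhs - user_items:  # assumption: skip items already owned
                scores[item] += conf       # assumption: sum confidences across rules
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]

print(recommend({"B"}, rules, 2))       # ['E', 'C']
print(recommend({"B", "C"}, rules, 2))  # ['E']
```

For the prediction task, an item would be recommended to the user when it appears in the list returned with a sufficiently large n, that is, when at least one supported rule has it on the RHS.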
