Survey Recommender System Algorithm

Survey of Recommendation
Systems and Algorithms
Term Paper for
EE 380L: DATA MINING
Spring 2000
By
Yuan Qu
Xiaoyun Yang
Tianping Huang
May 5, 2000
Table of Contents
I. Introduction
II. Recommendation Systems
III. Algorithms
IV. Discussion
V. Reference
I. Introduction
In daily life, we make most of our choices by relying on recommendations
from other people, whether by word of mouth, recommendation letters, movie and book
reviews printed in newspapers, or general surveys. In this information age, a huge
amount of news is published on the Internet every day. This leads to a clear demand for
automated methods that locate and retrieve information according to users’ individual
interests. The growing number of people accessing the Internet also provides new
possibilities for organizing and recommending information.
Recommendation systems can assist and augment this natural social process.
These systems recommend items that you may want according to what you wanted in the past.
The main purpose of recommendation systems is to provide tools for people to
leverage the information hunting and gathering activities of other people or groups of
people. Recommendation systems have become an important application area and the focus
of considerable recent academic and commercial interest.
Recommendation systems are basically divided into two categories. One is called
content-based filtering; the other is collaborative filtering (or social filtering). In a
content-based filtering system, each user is assumed to operate independently. As a result,
document representations in content-based filtering systems can exploit only information
that can be derived from document contents. In a collaborative filtering system, the
representation of a document is based on evaluations of that document made by its prior
readers. The idea is that communities of shared interest can be automatically identified
by exchanging this sort of information. In practice, a collaborative filtering system
provides a basis for selecting information items regardless of whether their content can
be represented in a way that is useful for selection. In this paper, the focus is on
collaborative filtering.
Collaborative filtering was introduced by the developers of the first
recommendation system, Tapestry, in 1992 [Goldberg, et al. 1992]. Within several years
the concept of collaborative filtering had been applied in dozens of publicly available
systems, several proprietary systems, and even some commercially available systems. In
1996, dozens of researchers from academia and business gathered at UC-Berkeley to share
their ideas and experiences with these emerging filtering methods
[Collaborative Filtering workshop, 1996]. They presented the vision and definition of
collaborative filtering and described some applications of the technique. Since then,
more and more published articles have demonstrated applications of collaborative
filtering methods.
In this paper, we survey the recommendation systems available on the Internet.
We then describe the characteristics of each recommendation system. Finally, the
algorithms of some well-known recommendation systems are introduced in detail.
II. Recommendation Systems
There are many recommendation systems on the Web. According to the
purpose of their application, recommendation systems can be classified into three
categories [Resnick, 1997], as shown in Figure 1.
recommendation systems
    movies or music:    EachMovie, Morse, Firefly, ...
    news or articles:   Tapestry, GroupLens, Lotus Notes, ...
    web pages:          Phoaks, GAB, Fab, ...
Figure 1. The recommendation systems’ categories
The systems in the first category are used for recommending movies, music, videos, or
other services. In this category the database is relatively stable; like a population
database, it may not change for years. Typical systems include EachMovie,
Firefly, and Morse. The second category is used for news or articles in a newsgroup.
The users in a newsgroup generally have similar goals or interests. The database is
also relatively stable, although it may be updated every few weeks or sooner.
Representative systems are Tapestry, GroupLens, and Lotus Notes. The last category is for
recommending web pages. The information in this category is dynamic; that is, pages
can be added to or deleted from the system at any time. At the same time, the users may
have different tastes. Phoaks, GAB, and Fab are the most useful systems of this kind.
A brief introduction to each recommendation system is given below:
Do-I-Care
When a user revisits a favorite Web page, the Do-I-Care system [Turnbull, 1998;
Collaborative Filtering workshop, 1996] alerts the user if the page has changed.
The system uses a model-based algorithm built on a Bayesian classifier. After some
users have trained the model many times, other users can get good predictions.
According to the report from Mark Ackerman (U. of California-Irvine)
[Collaborative Filtering workshop, 1996], the accuracy of Do-I-Care can reach 70-90%.
The accuracy of the system is said to reach 100% in an application that tracks airline
fare sales.
Fab
In a collaborative filtering system, when a new item or a new user enters the system,
the system has no basis for calculating the similarity between users, and it cannot
consider the new item until some users have rated or recommended it. This is called the
cold-start problem. Content-based filtering does not have this problem. To eliminate
it, the Fab recommendation system [Turnbull, 1998] combines collaborative and
content-based filtering.
The Fab system is a web-based recommendation service that incorporates both
collaborative and content-based filtering methods. Users’ profiles are constructed as a
collection of keywords contained in the documents that each user rates highly.
Documents are presented for rating when either the content of the document matches
previous documents that were rated highly, or neighboring users rate the document highly.
Every time a favorable or unfavorable rating is received, the profile of the user is updated
to reflect the new rating.
Collection agents are sent out over the web to look for documents with specific
content, each agent using a different set of keywords. The retrieved documents are
passed to a central server, where a selection agent matched to each user's profile
scours the documents looking for interesting material. Relevant documents are
then presented to the user for rating. This rating dynamically affects the selection agent’s
behavior and changes the user's profile. The rating also affects the collection agent that
retrieved the document. Unpopular collection agents are removed and replaced with more
successful ones over time.
The Fab system combines the best features of both content-based and
collaborative filtering methods and also manages to keep the system dynamically updated
to the current users' tastes. One potential shortcoming is Fab's reliance on explicit user
feedback.
Firefly
The Firefly system [Turnbull, 1997 and 1998] provides recommendations based on
similarities between users. Initially, the system was used for music and movie
recommendation. It has since been extended to other media, such as
newsgroups, books, and web pages.
The system uses users’ profiles as input and applies a constrained Pearson algorithm
to make predictions between users. The basic idea of the algorithm is: a) the
system maintains a user profile, which records “like” or “dislike” for specific items; b) the
system compares the similarities of users and decides which group of users the user
belongs to; and c) the system gives a recommendation according to the profiles of similar
users.
GAB
GAB [Wittenburg, et al., 1998] stands for group asynchronous browsing. The
idea of the GAB system is to collect and merge the bookmark and hotlist files
of users and then serve the merged files back to users. This means the system must be
able to reach users’ bookmarks and extract information from them, which raises privacy
concerns. To address the privacy problem, the system provides a mechanism that lets a
user mark his/her bookmarks as “private” or “public”.
The system uses a multi-tree data structure for the bookmarks. To avoid getting
lost in hyperspace and to increase the connectivity of the merged subject tree database, the
system defines sibling and cousin relations. A sibling relation between items A and B means
that A and B belong to the same specific subject, while a cousin relation means
that A and B belong to the same broad subject but not the same specific subject. The system
has also been applied to monitoring changes in the content of web pages.
Grassroots
The Grassroots system [Turnbull, 1998] is described as “A System Providing A
Uniform Framework for Communicating, Structuring, Sharing Information, and
Organizing People”.
This system provides a special Web-page interface for accessing all of the
information it works with. In practice, Grassroots also lets participants continue using
other mechanisms and takes as much advantage of them as possible. The main engine of
the Grassroots system is a Web server and proxy server setup that can be used with any
Web browser.
GroupLens
Resnick [Resnick, et al. 1994] presented the GroupLens system, which is built
on a simple premise: "the heuristic that people who agreed in the past will probably
agree again". This system also uses the Pearson algorithm to provide predictions. In its
early stage, the system used explicit votes (on a scale of 1 to 5, where 1 stands for
“dislike it” and 5 for “like it”). The updated version also uses implicit methods to get
feedback from the user, such as monitoring reading time. The most notable
characteristics of the system are its openness and scalability.
Openness means that the system allows other researchers to create
clients that work with the system servers, or even to change those servers if they have
improvements. Scalability means that the system can still provide accurate predictions
as the number of users increases, although the database and the calculation time become
very large.
Letizia & Let’s Browse
Let’s Browse and its predecessor, Letizia [Lieberman, 1996; Pryor, 1998], are web
agents that assist a user while browsing. By monitoring a user’s
behavior, such as the time spent on a web page, the Letizia system learns the user’s
interests and provides recommendations. Let’s Browse, an improvement on Letizia, provides
recommendations using a group’s profiles instead of a single profile. If
multiple users are reading the same page at the same time, Let’s Browse can determine
which users are near the monitor and use their profiles to recommend sites
for the entire group.
Lotus Notes
Lotus Notes [Turnbull, 1998] is a system that is used as a foundation for
collaborative filtering techniques. The system serves newsgroups. All Notes
users should have similar goals or information interests because they are working in the
same group.
Lotus provides a feature to let people annotate documents. After annotation, the
user can send or distribute these links or comments to others. To protect user’s privacy,
the system uses an agent to represent an individual. These agents extract significant
phrases from the document that the user reads, and then exchange the learning results
anonymously.
Mosaic
The Mosaic system [Turnbull, 1997] was the first Web tool that facilitated
collaboration. As in the Pointers recommendation system, Mosaic users can
publish and distribute bookmarks and add comments to web pages. This
simple feature enabled users to actively share information with others.
PHOAKS
Terveen [Terveen et al., 1997] introduced the PHOAKS (People Helping One
Another Know Stuff) system, which recommends URLs that are likely to interest
users. The system automatically recognizes web resource references in newsgroup
messages, attempts to classify them, and introduces them to other users. That is, the
system scans the group’s messages and extracts the most important URLs in
these messages. After sorting these links, the system recommends the URLs to users.
The system uses implicit feedback and also takes role specialization into account.
Pointers
This system [Maltz, 1995] is implemented inside the Lotus Notes environment. If
one person is an expert in an area, other users in the group will
want to see his/her recommendations. So the system provides a mechanism that lets the
“information mediators” in a workgroup easily distribute references to, and commentary
on, the documents they find. This mechanism is realized with a “pointer”, which
consists of a URL link, contextual information, and optional comments by the sender. The
system is very easy to use but not anonymous.
Siteseer
Siteseer [Turnbull, 1997] is a collaborative system using web browser bookmarks
to find neighbors and recommend sites. Users with significant overlap in bookmark
listings are determined to be close to one another, allowing previously unvisited sites to
be recommended to one another.
Tapestry
Tapestry is the first collaborative recommendation system [Goldberg, 1992]. It uses
free-text annotations or explicit “like it” or “hate it” annotations. The system is
designed for newsgroups, so it is not easy for a group to explore new areas with it.
Yahoo!
Turnbull [Turnbull, 1998] considered Yahoo! to be a recommendation system that
realizes collaborative filtering manually. Yahoo! relies on an expert to update
the Yahoo! index as quickly as possible, which means that every site is examined by a
person when it is added. The system also allows web users to submit pages. Because of its
openness, the form of the Yahoo! index has become very popular and has become a
classification standard.
WebWatcher
The WebWatcher system [Joachims, 1996] acts like a tour guide in a museum. It
provides interactive communication between the server and users and makes
recommendations. A user who enters the system can describe his/her interest by typing
it in, and the system then recommends related web sites. This is not the
same as a keyword-based search engine: the system uses the user’s profile and other
users’ previous tours to calculate the similarities between users and predict the user’s
interest. The system also uses the users’ experience to reinforce its learning.
III. Algorithms on Collaborative Filtering
Today recommendation systems are used in many fields. Virtually all
topics that could be of potential interest to users are covered by special-purpose
recommendation systems: Web pages, news stories, emails, movies, music videos, books,
CDs, restaurants, and many more. These recommendation systems predict a user’s
interests and preferences based on all users’ profiles, using information retrieval
techniques. The underlying techniques used in today’s recommendation systems fall into
two distinct categories: content-based filtering and collaborative filtering. Content-based
filtering uses the actual content features of items, while collaborative
filtering predicts a new user’s preferences from other users’ ratings, assuming that
like-minded people tend to make similar choices. Here, we concentrate on the algorithms
used in collaborative filtering.
Collaborative filtering or recommender systems predict additional topics or
products that a new user might like, based on a database of user preferences. Many
collaborative filtering algorithms have been proposed. Breese, et al., 1998, classified
these algorithms into two categories: memory-based algorithms and model-based
algorithms. Following their classification, we collected and classified the
collaborative filtering algorithms available so far.
Memory-based Algorithms
These algorithms are called memory-based because they operate over the entire user
database to make predictions. Basically, they all try to find the similarity or
correlation between the new (active) user and the other users in the database. All users’
preferences can be represented by their votes (explicit or implicit) on the products
(which could be anything related to the users’ interests). The new user has an average
vote over the products he/she has rated. The predicted vote of the new user on another
product is then calculated by adding to this average a weighted sum of the other users’
votes. The weights are determined by the similarity between the new user and the other
users: the more similar two users are, the more the other user contributes to the sum,
and so the larger the weight. Let I_i be the set of items on which user i has voted, and
let v_{i,j} be the vote of user i on product j. Then the average vote of user i is:
\[ \bar{v}_i = \frac{1}{|I_i|} \sum_{j \in I_i} v_{i,j} \]
The predicted vote of the new (active) user a on product j is:
\[ p_{a,j} = \bar{v}_a + k \sum_{i=1}^{n} w(a,i)\,(v_{i,j} - \bar{v}_i) \]
where k is a normalizing factor and w(a,i) is the weight that user i
contributes to the active user.
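The prediction rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings are hypothetical, the weight function is passed in as a parameter, and the normalizing factor k is taken to be one over the sum of the absolute weights, which is one common choice and not something this survey prescribes.

```python
def mean_vote(votes):
    """Average vote of one user; `votes` maps item -> rating."""
    return sum(votes.values()) / len(votes)


def predict_vote(active, others, item, weight):
    """p_{a,j} = mean(v_a) + k * sum_i w(a,i) * (v_{i,j} - mean(v_i))."""
    base = mean_vote(active)
    total, norm = 0.0, 0.0
    for votes in others:
        if item not in votes:
            continue  # a user who did not rate the item contributes nothing
        w = weight(active, votes)
        total += w * (votes[item] - mean_vote(votes))
        norm += abs(w)
    if norm == 0.0:
        return base  # no usable neighbours: fall back to the user's average
    return base + total / norm  # assumption: k = 1 / sum of |w(a,i)|


def constant_weight(a, i):
    """Placeholder weight; a real system would use a similarity measure."""
    return 1.0


# Hypothetical ratings, not taken from any table in this paper.
active = {"book1": 4, "book2": 2}
others = [
    {"book1": 5, "book2": 1, "book3": 3},
    {"book1": 3, "book3": 5},
]
prediction = predict_vote(active, others, "book3", constant_weight)
print(prediction)
```

Any of the weight definitions discussed in this section could be supplied in place of `constant_weight`.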
The weights are calculated by comparing the votes on the set of products that both
the active user and the other user have rated. Here we collect three major
methods for defining the weights.
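Mean Squared Differences:
This method defines the weight as the inverse of the mean squared difference between
the votes of the two users,
\[ w(a,i) = \frac{1}{\overline{(v_{a,j} - v_{i,j})^2}} \]
where the bar denotes the mean over the products j that both users have rated.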
Pearson Correlation:
\[ w(a,i) = \frac{\sum_j (v_{a,j} - \bar{v}_a)(v_{i,j} - \bar{v}_i)}{\sqrt{\sum_j (v_{a,j} - \bar{v}_a)^2 \, \sum_j (v_{i,j} - \bar{v}_i)^2}} \]
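A minimal Python sketch of the Pearson weight follows. It assumes that each user's votes are stored as a dictionary from item to rating, that the sums run over the items both users have rated, and that each mean is taken over all of that user's votes. The three users are hypothetical.

```python
from math import sqrt


def pearson_weight(a, i):
    """Pearson correlation between two users over their common items."""
    common = set(a) & set(i)
    if not common:
        return 0.0
    mean_a = sum(a.values()) / len(a)
    mean_i = sum(i.values()) / len(i)
    num = sum((a[j] - mean_a) * (i[j] - mean_i) for j in common)
    den = sqrt(
        sum((a[j] - mean_a) ** 2 for j in common)
        * sum((i[j] - mean_i) ** 2 for j in common)
    )
    return num / den if den else 0.0  # undefined correlation treated as 0


# Hypothetical users: b agrees with a, c is the opposite of a.
user_a = {"m1": 1, "m2": 2, "m3": 3}
user_b = {"m1": 2, "m2": 4, "m3": 6}
user_c = {"m1": 3, "m2": 2, "m3": 1}
print(pearson_weight(user_a, user_b), pearson_weight(user_a, user_c))
```

Returning 0 when the denominator vanishes is a design choice of this sketch, so that such a neighbour simply drops out of the weighted sum.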
Vector Similarity:
This method defines the weight as the cosine of the angle between the vote vector of
the active user and the vote vector of the other user.
\[ w(a,i) = \sum_j \frac{v_{a,j}}{\sqrt{\sum_{k \in I_a} v_{a,k}^2}} \cdot \frac{v_{i,j}}{\sqrt{\sum_{k \in I_i} v_{i,k}^2}} \]
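The vector similarity weight can be sketched in the same style. The sketch assumes dictionary-based votes, a numerator over the commonly rated items, and norms over each user's own rated items; the example vectors are hypothetical.

```python
from math import sqrt


def vector_similarity(a, i):
    """Cosine of the angle between two users' vote vectors."""
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_i = sqrt(sum(v * v for v in i.values()))
    if norm_a == 0 or norm_i == 0:
        return 0.0
    common = set(a) & set(i)
    return sum(a[j] * i[j] for j in common) / (norm_a * norm_i)


# Hypothetical vote vectors.
u1 = {"x": 3, "y": 4}
u2 = {"x": 4, "y": 3}
u3 = {"z": 5}
print(vector_similarity(u1, u2))
```

Users with no item in common get a weight of 0, since their vote vectors are orthogonal under this representation.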
Improvement on Memory-based Algorithms
In order to improve the performance of standard memory-based algorithms,
several modifications are proposed.
Default Voting:
book1 book2 book3 book4 book5 book6

user 1 5 1
user 2 3 1 5
user 3 3 5 4
user 4 4 2 ?
Usually we are dealing with very sparse databases, in which there are many products
that users did not vote on (explicitly or implicitly). When using memory-based algorithms,
we use only the entries at the intersection of two users’ votes. In the example above, to
calculate the weight that user 1 contributes, we can use only the ratings for book1. To
deal with this problem, default votes are introduced. In most cases, a neutral or negative
preference is assigned to the unobserved products, so that the union of the voted sets can
be used in the weight calculation instead of the intersection. This method does not
necessarily improve the performance of memory-based algorithms, however, because an
unobserved product is not necessarily a less interesting one.
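A simplified Python sketch of default voting is given below. It is an illustration and not the exact formulation of the cited authors: it computes a plain Pearson correlation over the union of the two users' rated items and fills each missing vote with an assumed default value. The default of 2.0 and the placement of the ratings are hypothetical.

```python
from math import sqrt


def pearson_with_default(a, i, default=2.0):
    """Pearson correlation over the union of rated items.

    Missing votes are replaced by `default` (an assumed, mildly
    negative value on a 1 to 5 scale).
    """
    items = sorted(set(a) | set(i))
    xs = [a.get(j, default) for j in items]
    ys = [i.get(j, default) for j in items]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sqrt(
        sum((x - mean_x) ** 2 for x in xs)
        * sum((y - mean_y) ** 2 for y in ys)
    )
    return num / den if den else 0.0


# Hypothetical placement of votes: the two users share only book1.
user1 = {"book1": 5, "book2": 1}
user4 = {"book1": 4, "book3": 2}
weight = pearson_with_default(user1, user4)
print(weight)
```

With default votes the correlation is computed over three items instead of one, which is the point of the method, although the result depends directly on the default value chosen.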
Inverse User Frequency:
The idea of inverse user frequency is that universally liked products are not as
useful as less common products in capturing the similarity between users. So the
weight is modified by introducing a factor f_j, defined as:
\[ f_j = \log \frac{n}{n_j} \]
where n is the total number of users and n_j is the number of users who have
voted for product j. The relative correlation weight is then
\[ w(a,i) = \frac{\sum_j f_j \sum_j f_j v_{a,j} v_{i,j} - \left(\sum_j f_j v_{a,j}\right)\left(\sum_j f_j v_{i,j}\right)}{\sqrt{UV}} \]
where
\[ U = \sum_j f_j \sum_j f_j v_{a,j}^2 - \Big(\sum_j f_j v_{a,j}\Big)^2 \]
\[ V = \sum_j f_j \sum_j f_j v_{i,j}^2 - \Big(\sum_j f_j v_{i,j}\Big)^2 \]
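The inverse user frequency and the resulting weighted correlation can be sketched as follows. The sketch assumes dictionary-based votes, restricts the sums to the items both users have rated, and reads U and V as weighted variances (sum of f times sum of f v squared, minus the squared weighted sum). The four users are hypothetical.

```python
from math import log, sqrt


def inverse_user_frequency(all_votes):
    """f_j = log(n / n_j) for every item j that appears in the database."""
    n = len(all_votes)
    counts = {}
    for votes in all_votes:
        for j in votes:
            counts[j] = counts.get(j, 0) + 1
    return {j: log(n / n_j) for j, n_j in counts.items()}


def iuf_weight(a, i, f):
    """Correlation between users a and i with each item weighted by f_j."""
    common = set(a) & set(i)
    sf = sum(f[j] for j in common)
    sa = sum(f[j] * a[j] for j in common)
    si = sum(f[j] * i[j] for j in common)
    saa = sum(f[j] * a[j] ** 2 for j in common)
    sii = sum(f[j] * i[j] ** 2 for j in common)
    sai = sum(f[j] * a[j] * i[j] for j in common)
    u = sf * saa - sa ** 2
    v = sf * sii - si ** 2
    if u <= 0 or v <= 0:
        return 0.0  # degenerate case: no weighted variance to correlate
    return (sf * sai - sa * si) / sqrt(u * v)


# Hypothetical database of four users.
users = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 2, "b": 4, "c": 6, "d": 1},
    {"a": 5, "d": 2},
    {"b": 1},
]
f = inverse_user_frequency(users)
print(f["c"], iuf_weight(users[0], users[1], f))
```

An item that every user has rated gets f_j = log(1) = 0 and therefore no longer influences the weight.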
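Case Amplification:
Case amplification emphasizes the contribution of the most similar users to the
prediction by favoring the weights close to 1 and shrinking the smaller ones. The new
weights are calculated as:
\[ w'_{a,i} = \begin{cases} w_{a,i}^{\rho} & \text{if } w_{a,i} \ge 0 \\ -(-w_{a,i})^{\rho} & \text{if } w_{a,i} < 0 \end{cases} \]
where ρ > 1 is the amplification power.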
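As a small illustration, the transform can be written as a single Python function. The default power of 2.5 is an assumed value for the sketch, not one given in this survey.

```python
def amplify(w, rho=2.5):
    """Case amplification: keep the sign, raise the magnitude to rho."""
    return w ** rho if w >= 0 else -((-w) ** rho)


# Weights near 1 are barely changed; small weights shrink sharply.
for w in (1.0, 0.9, 0.5, -0.5):
    print(w, amplify(w))
```

Because the magnitudes of correlation weights are at most 1, a power greater than 1 reduces every weight, but it reduces the small ones proportionally more.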
Voting by category:
In some collaborative filtering applications, the dimensions of the users’ voting
matrix can become unmanageable, preventing practical calculations over the whole
matrix. Without the default voting method mentioned above there may be very few common
votes on the same products, yet providing default votes may not improve the
performance. Gokiso-cho et al., 1998 proposed a voting-by-category algorithm.
Basically, they assume the existence of a small number of generated clusters or
pre-existing categories to which products can be assigned. The voting matrix is then
transformed into a much lower dimension by converting users’ votes on products into
votes on categories. In the same example below, the original 4 by 6 matrix becomes
4 by 3 and users have more common votes.
category1 category2 category3

book1 book2 book3 book4 book5 book6
user 1 5 1
user 2 3 1 5
user 3 3 5 4
user 4 4 2 ?
The new votes of users on categories are calculated as:
\[ v_{i,c} = \frac{1}{|I_i \cap c|} \sum_{j \in I_i \cap c} v_{i,j} \]
That is, each entry of the new matrix is the average of a given user’s votes on the
products in one category.
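The reduction from products to categories can be sketched in Python as below. The assignment of two books to each category and the sample votes are assumptions made for the illustration.

```python
def category_votes(votes, category_of):
    """Average a user's product votes within each category."""
    sums, counts = {}, {}
    for item, rating in votes.items():
        c = category_of[item]
        sums[c] = sums.get(c, 0) + rating
        counts[c] = counts.get(c, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


# Assumed mapping: two books per category (hypothetical).
category_of = {
    "book1": "category1", "book2": "category1",
    "book3": "category2", "book4": "category2",
    "book5": "category3", "book6": "category3",
}
reduced = category_votes({"book1": 5, "book2": 1, "book5": 4}, category_of)
print(reduced)
```

Categories in which the user has rated nothing stay absent from the result, so the reduced matrix can still contain missing entries, only fewer of them.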
The categories may be pre-defined or unknown. To deal with unknown
categories, the EM algorithm can be used.
This method can be combined with all the other algorithms (including the model-based
algorithms). We put it here because the original authors use it along with the correlation
algorithm.
Model-based Algorithms
Model-based algorithms first generate a descriptive model by compiling the users’
preferences; recommendations are then predicted by appealing to the model. From a
probabilistic perspective, collaborative filtering can be viewed as calculating the
expected value of a vote, given the user’s profile or previous votes.
\[ p_{a,j} = E(v_{a,j}) = \sum_{i=0}^{m} \Pr(v_{a,j} = i \mid v_{a,k}, k \in I_a) \cdot i \]
Cluster Models:
Based on the idea that there are certain groups or types of users that capture a
common set of preferences and tastes, Breese, et al. proposed a cluster method in which
like-minded users are classified into the same group. Given a user’s class membership,
the user’s votes are assumed to be independent, so the joint probability of class and
votes can be calculated by the “naïve” Bayes formulation,
\[ \Pr(C = c, v_1, \ldots, v_n) = \Pr(C = c) \prod_{i=1}^{n} \Pr(v_i \mid C = c) \]
Once we know the probability of observing an individual of a given class with a given set
of votes, the expectation of a future vote can be easily calculated. Since the classes and
the number of classes are unknown, the EM algorithm is used to find the model structure
with maximum likelihood.
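Assuming the class priors and the per-class vote probabilities have already been learned, the prediction step of the cluster model can be sketched in Python as follows. All parameter values below are hypothetical, and the EM training itself is not shown.

```python
def class_posterior(observed, prior, cond):
    """Pr(C = c | observed votes) from the naive Bayes joint probability."""
    joint = {}
    for c, p in prior.items():
        for item, vote in observed.items():
            p *= cond[c][item][vote]  # votes are independent given the class
        joint[c] = p
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}


def expected_vote(item, observed, prior, cond):
    """Expected vote on `item`, averaging over the class posterior."""
    post = class_posterior(observed, prior, cond)
    return sum(
        post[c] * sum(v * p for v, p in cond[c][item].items())
        for c in post
    )


# Hypothetical model: two classes, two items, binary votes (0 or 1).
prior = {"A": 0.5, "B": 0.5}
cond = {
    "A": {"x": {0: 0.2, 1: 0.8}, "y": {0: 0.1, 1: 0.9}},
    "B": {"x": {0: 0.8, 1: 0.2}, "y": {0: 0.7, 1: 0.3}},
}
posterior = class_posterior({"x": 1}, prior, cond)
prediction = expected_vote("y", {"x": 1}, prior, cond)
print(posterior, prediction)
```

A user who liked item x is most likely in class A, so the expected vote on y is pulled toward class A's preference for y.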
Ungar [Ungar, et al., 1998] proposed a new clustering method. Unlike the
standard cluster models, it assumes that people come from classes (e.g., intellectual or
fun) and that products also come from classes. Here is an example from their paper,
Batman Rambo Andre Hiver Whispers Star Wars

Lyle y y
Ellen y y y
Jason y y
Fred y y
Dean y y y
In this movie database example, people can be classified as intellectual or fun,
and movies belong to one of three categories: action, foreign, or classic. A “y” in the
table means that the person likes the movie. For each pair of person class and movie
class, the fraction of person/movie pairs with a “y” in the table is
               action   foreign   classic

intellectual   0/6      5/9       2/3
fun            3/4      0/6       2/2
Based on the observation above, they establish a model with three sets
of parameters: Pk (the probability that a random person is in class k), Pl (the
probability that a random movie is in class l), and Pkl (the probability that a person in
class k is linked to a movie in class l).
Here, the class assignments are unknown. They tried repeated clustering and
Gibbs sampling. In the repeated clustering method, people are first clustered
based on movies and movies based on people; on the second and later passes, people are
clustered based on movie clusters and movies based on people clusters. For the clustering
they use k-means instead of the EM algorithm, because of the constraint that a person is
always in the same class and a movie is always in the same class. They reported that
the Gibbs sampling method outperforms repeated clustering.
Bayesian Network Models:
An alternative model formulation for probabilistic collaborative filtering is a
Bayesian belief network with a node corresponding to each product in the database.
Missing data can be represented by a “no vote” value. After a learning algorithm has been
applied to train the belief network, each item in the resulting network has a set of
parent items that are the best predictors of its votes. A decision tree can be used to
represent each conditional probability table.
Neural Network Models:
As with the Bayesian network models, collaborative filtering can be seen as a
classification task. Based on a set of ratings from users for products, we can induce a
model for each user that allows us to classify unseen products into two or more classes.
Missing data can be indicated by a “no vote” state. Here is an example given in
the paper by Billsus and Pazzani [Billsus, D. and Pazzani, M., 1998].
I1 I2 I3 I4 I5
U1 4 3
U2 1 2
U3 3 4 2 4
U4 4 2 1 ?
Here Ui is the ith user and Ii is the ith item. Users rate the items from 1 to 4, where 4
is the highest rating. Since in the end only items the active user would like are
recommended, they transform the rating matrix by replacing each rating greater than 2
with 1 and every other rating with 0. To represent the “no vote” value, they further split
every user’s row into two rows (like and dislike).
E1 E2 E3
U1 like 1 0 1
U1 dislike 0 0 0
U2 like 0 0 0
U2 dislike 0 1 0
U3 like 1 1 0
U3 dislike 0 0 1
Class like dislike dislike
Here U4’s ratings for I1, I2, I3 are class labels. After converting a data set of user ratings
for items into this format, we can apply virtually any supervised learning algorithm.
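The conversion can be sketched in Python as follows. The ratings used here cover only items I1 to I3 and are placed so that they agree with the converted table above; since the column positions of the original rating table are not recoverable from this text, that placement is an assumption. The function and variable names are illustrative.

```python
def to_training_examples(ratings, active, threshold=2):
    """Turn a rating matrix into labelled examples for the active user.

    Each item rated by the active user becomes one example. Its features
    are a (like, dislike) pair of 0/1 flags per other user, and its label
    is the active user's own like/dislike for that item.
    """
    users = sorted(u for u in ratings if u != active)
    examples = {}
    for item, own in sorted(ratings[active].items()):
        features = []
        for u in users:
            v = ratings[u].get(item)  # None means "no vote"
            features.append(1 if v is not None and v > threshold else 0)
            features.append(1 if v is not None and v <= threshold else 0)
        examples[item] = (features, "like" if own > threshold else "dislike")
    return examples


# Assumed placement of the ratings for items I1 to I3 only.
ratings = {
    "U1": {"I1": 4, "I3": 3},
    "U2": {"I2": 1},
    "U3": {"I1": 3, "I2": 4, "I3": 2},
    "U4": {"I1": 4, "I2": 2, "I3": 1},
}
examples = to_training_examples(ratings, "U4")
for item, (features, label) in examples.items():
    print(item, features, label)
```

Each feature list reads down one column of the converted table (U1 like, U1 dislike, U2 like, and so on), and the resulting examples can be passed to any supervised learner.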
Other Algorithms
A hybrid memory- and model-based approach:
Pennock [Pennock, David M. and Horvitz, Eric 1999] proposed a CF method
called personality diagnosis (PD), which can be seen as a hybrid between memory- and
model-based approaches. All data is maintained throughout the process, new data can be
added incrementally, and predictions have meaningful probabilistic semantics.
In this algorithm, each user’s preferences are interpreted as a manifestation of
their underlying “personality type”. Because users’ votes are affected by other
environmental factors, such as previous users’ votes or the current user’s mood, they
assumed that all users report their ratings with Gaussian noise. If we define user i’s
true personality type as a vector of “true” ratings V_i^{true}, then user i’s actual
rating is drawn from an independent normal distribution,
\[ \Pr(v_{i,j} = x \mid v_{i,j}^{\mathrm{true}} = y) \propto e^{-(x - y)^2 / 2\sigma^2} \]
where σ is a free parameter.
They further assumed that the distribution of voting vectors in the database is
representative of the distribution in the target population of users. So we have,
\[ \Pr(V_a^{\mathrm{true}} = V_i) = \frac{1}{n} \]
where n is the total number of users in the database. The probability that the active
user has the same personality type as any other user can then be calculated by applying
Bayes’ rule.
\[ \Pr(V_a^{\mathrm{true}} = V_i \mid v_{a,1} = x_1, \ldots, v_{a,m} = x_m) \propto \Pr(v_{a,1} = x_1 \mid v_{a,1}^{\mathrm{true}} = v_{i,1}) \cdots \Pr(v_{a,m} = x_m \mid v_{a,m}^{\mathrm{true}} = v_{i,m}) \cdot \Pr(V_a^{\mathrm{true}} = V_i) \]
Then the probability distribution of the active user’s vote on an unseen product j would be,
\[ \Pr(v_{a,j} = x_j \mid v_{a,1} = x_1, \ldots, v_{a,m} = x_m) = \sum_{i=1}^{n} \Pr(v_{a,j} = x_j \mid V_a^{\mathrm{true}} = V_i) \cdot \Pr(V_a^{\mathrm{true}} = V_i \mid v_{a,1} = x_1, \ldots, v_{a,m} = x_m) \]
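A compact Python sketch of personality diagnosis is given below. It makes several simplifying assumptions that are not in the text above: items a neighbour has not rated are skipped instead of being modelled explicitly, the uniform prior 1/n is dropped because it cancels in the normalization, and σ is set to 1. The ratings are hypothetical.

```python
from math import exp


def pd_predict(active, others, item, values, sigma=1.0):
    """Distribution over the active user's vote on `item`."""

    def likelihood(x, y):
        # Gaussian noise model, up to a constant factor.
        return exp(-((x - y) ** 2) / (2 * sigma ** 2))

    scores = {x: 0.0 for x in values}
    for votes in others:
        if item not in votes:
            continue  # this neighbour says nothing about the item
        # Unnormalized probability that the active user shares this
        # neighbour's personality type, from the commonly rated items.
        match = 1.0
        for j, x in active.items():
            if j in votes:
                match *= likelihood(x, votes[j])
        for x in values:
            scores[x] += likelihood(x, votes[item]) * match
    total = sum(scores.values())
    if total == 0.0:
        return scores
    return {x: s / total for x, s in scores.items()}


# Hypothetical ratings on a 1 to 5 scale.
active = {"a": 5, "b": 1}
others = [
    {"a": 5, "b": 1, "c": 4},  # agrees with the active user
    {"a": 1, "b": 5, "c": 2},  # disagrees with the active user
]
distribution = pd_predict(active, others, "c", values=range(1, 6))
most_likely = max(distribution, key=distribution.get)
print(most_likely, distribution)
```

The result is a full probability distribution over the possible votes, so a system can report either the most probable vote or the expected vote.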
Improvements:
We have now seen the memory-based and model-based collaborative filtering
methods. Both have their advantages and drawbacks. Memory-based methods
are simple and easy to implement, but they may be time- and space-consuming. At the
least, memory-based methods have difficulty with the two problems below:
1) Missing data: To find the similarity between users, the difference (distance) between
users has to be computed. If there are missing data, either only the products that the
users being compared have all voted on are used, or a default vote is assigned to the
missing data. The first approach has problems with sparse databases. In the second,
giving average or somewhat negative votes to the missing data may obscure the similarity
between users.
2) Memory-based methods cannot handle the situation in which two users are very similar
but have not rated the same set of products. For example,
product1 product2 product3 product4 product5 product6

user1 1 0 1 1 1
user2 0 1 1 1 1
user3 1 ?
User1 and user2 are very similar in this example, however, when we use memory-
based methods to predict user3’s preference on product6, only user1’s votes could be
used to predict.
Among model-based methods, clustering methods can handle missing data to some extent
by clustering products into fewer categories; the new votes for categories are averaged
over the available votes for the products in each category. But clustering methods may
over-generalize and hurt performance. Bayesian network or neural network models can
handle the missing data and problem (2) above reasonably well. But for
large databases containing many users, we end up with thousands of features while
the amount of training data is very limited, so those models become impractical.
Recently, a promising algorithm has been proposed. The idea is that users rate
products based on latent features of the products. All products in the database share a
set of common features, and users rate products highly because they rate those features
highly. So by factoring people’s ratings into features using linear algebra, we can
predict how users will react to documents they have not seen before, based on their
preferences for these features. Singular Value Decomposition (SVD) allows us to break
data sets down into these components and analyze the principal components of the data.
We will see below how SVD can be used to capture the hidden features and help to reduce
the dimension of databases.
Singular Value Decomposition:
The user rating vectors can be represented by an m × n matrix A, with m users and
n products,
\[ A = [a_{i,j}] \]
where a_{i,j} is the rating of user i for product j. Through singular value
decomposition, A can be factored into USV^T, where U and V are orthogonal matrices
and S is zero except for its diagonal entries, which are the singular values of A.
U represents the response of each user to certain features. V represents the amount of
each feature present in each product. S relates to the importance of each feature in the
overall determination of the rating. Here is an example given by Pryor
[Pryor, H. Michael, 1998] in his report. Suppose the rating matrix A is,
5 4 2 6
A=
3 7 5 2


6 4 1 4

The SVD of A would be:
0.6000 − 0.4124 − 0.6855 

U =
0.5811 0.8136 0.0192 


0.5498 − 0.4099 0.7278 

14 .4890 0.0000 0.0000 0.0000 

S =
 0.0000 4.9324 0.0000 0.0000 


 0.0000 0.0000 1.6550 
0.0000 
0.5551 − 0.4218 0.6023 − 0.3889 

0.5982 0.4878 0.1835 0.6088 
V = 
0.3213 0.5744 − 0.3306 − 0.6764 
 
0.4805 − 0.5041 − 0.7031 0.1437 
We can see that the feature with singular value 14.4890 in S is the most important
feature. So the dimension of S can be reduced by keeping only the most important
features, in this case only the one represented by 14.4890. A new rating matrix can then
be generated by converting the original rating matrix into the feature space. Since V is
orthogonal,
\[ AV = US \]
and the new rating matrix M is
\[ M = US' \]
In this case S' = [14.4890], paired with the first column of U. Once we have the new
rating matrix M in the feature space, we can apply memory-based or model-based methods
to it. It has been shown that exploiting latent structure in matrices of user ratings can
lead to improved predictive performance.
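The dominant feature of the example matrix can be checked with a short pure-Python sketch. It uses power iteration on A^T A, which is a technique swapped in here for brevity and not the method of the cited report; a full SVD routine from a numerical library would normally be used instead. The function name and the iteration count are assumptions of the sketch.

```python
from math import sqrt

# Rating matrix from the example: 3 users by 4 products.
A = [
    [5, 4, 2, 6],
    [3, 7, 5, 2],
    [6, 4, 1, 4],
]


def top_singular_triplet(a, iterations=200):
    """Largest singular value of `a` with its left and right vectors."""
    rows, cols = len(a), len(a[0])
    v = [1.0] * cols
    for _ in range(iterations):
        # One step of power iteration on A^T A.
        av = [sum(a[r][c] * v[c] for c in range(cols)) for r in range(rows)]
        atav = [sum(a[r][c] * av[r] for r in range(rows)) for c in range(cols)]
        norm = sqrt(sum(x * x for x in atav))
        v = [x / norm for x in atav]
    av = [sum(a[r][c] * v[c] for c in range(cols)) for r in range(rows)]
    sigma = sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av]
    return u, sigma, v


u, sigma, v = top_singular_triplet(A)
# Ratings projected onto the single most important feature: M = U S'.
M = [sigma * x for x in u]
print(round(sigma, 4), [round(x, 4) for x in M])
```

Each entry of M is one user's score on the dominant feature, so the three users are now described by one number each instead of four ratings.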
Current recommender systems use both Content-Based Filtering (CBF) and
Collaborative Filtering (CF) methods. CBF filters information by
matching information content against the user’s interests, so it is able to filter
information that has not yet been evaluated by other people. For this reason CBF and CF
are combined in recommender systems: CBF can be used to deal with products that have
not yet been rated, while CF recommends new products based on previous users’ votes.
IV. Discussion
As introduced above, future recommendation systems should have the
following features:
1) Solve the “cold-start” problem.
Collaborative recommendation systems generally suffer from this problem: the
system has no basis for recommending a new item to users or for providing accurate
predictions for a new user. Since content-based filtering is based on the features of
the item, it does not suffer from this cold-start problem. The Fab system integrates
content-based filtering and collaborative filtering. Building on this integration,
Condliff et al. [Condliff, 1998] propose a Bayesian methodology for recommendation
systems. This proposal uses Bayesian theory to give a good prediction by fully
incorporating all of the available data, such as user ratings, user features, and item
features. Claypool et al. [Claypool, 1999] also provide an approach to the cold-start
problem: their system bases its prediction on a weighted average of the content-based
filtering prediction and the collaborative filtering prediction.
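As a minimal sketch of the weighted-average idea, the following Python function combines the two predictions. The function name, the fixed `content_weight` parameter, and the fallback to the content-based prediction when no collaborative prediction exists are assumptions made for illustration; they are not taken from Claypool's system.

```python
from typing import Optional

def hybrid_prediction(content_pred: float,
                      collab_pred: Optional[float],
                      content_weight: float = 0.5) -> float:
    """Weighted average of a content-based and a collaborative prediction.

    Hypothetical helper: `content_weight` in [0, 1] is the share given to
    the content-based prediction; the rest goes to the collaborative one.
    """
    if not 0.0 <= content_weight <= 1.0:
        raise ValueError("content_weight must lie in [0, 1]")
    # Cold start (assumption): with no votes yet there is no collaborative
    # prediction, so rely on the content-based prediction alone.
    if collab_pred is None:
        return content_pred
    return content_weight * content_pred + (1.0 - content_weight) * collab_pred

# An item with votes: both predictions contribute.
print(hybrid_prediction(4.0, 2.0, content_weight=0.25))
# A brand-new item with no votes: only the content-based prediction is used.
print(hybrid_prediction(4.0, None))
```

With this form, a new item can still be scored from its content, and the collaborative prediction takes over gradually if the weight is lowered as votes accumulate.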
2) Easy for users to participate or vote
Generally speaking, people like to receive recommendations but do not like to
provide them. Since the system depends on users' votes to calculate the similarities
between users, it is very important to collect enough data from the users. The system
should therefore provide a very easy interface for a user to vote or annotate.
Although explicit annotations or votes feed the calculation directly, implicit
feedback from users is even more helpful in reducing the sparsity of the matrices
used for similarity calculation. Implicit methods include monitoring the user's
behavior and monitoring the user's browsing time on a page: the longer a person
stays, the more interest the person shows. The system can also use compensation
methods. For example, a user who wants further recommendations must first vote on
what he or she has read.
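One way to turn browsing time into an implicit vote is a simple monotone mapping, sketched below in Python. The function name, the 0 to 5 vote scale, and the 120-second saturation point are all assumptions chosen for illustration; the surveyed systems do not prescribe these values.

```python
def implicit_vote(seconds_on_page: float,
                  max_vote: float = 5.0,
                  saturation_seconds: float = 120.0) -> float:
    """Map reading time to an implicit vote in [0, max_vote].

    Hypothetical mapping: the vote grows linearly with the time spent on
    the page and is capped once `saturation_seconds` is reached.
    """
    if seconds_on_page < 0:
        raise ValueError("seconds_on_page must be non-negative")
    capped = min(seconds_on_page, saturation_seconds)
    return max_vote * capped / saturation_seconds

# Longer stays give higher implicit votes, up to the cap.
for seconds in (0, 60, 600):
    print(seconds, "->", implicit_vote(seconds))
```

Votes obtained this way could fill entries of the rating matrix for which the user gave no explicit vote, which is how implicit feedback reduces sparsity.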
3) Privacy
Privacy becomes an issue whenever a system collects information about its users, so
important social issues arise on an individual scale as well. In collaborative filtering,
users share their document annotations. On the one hand, people do not like to reveal
their personal identity; on the other hand, people like to see who made the
annotations. For example, if an annotation is provided by an expert in the area, people
in the group are more likely to read the annotated information. The system should
therefore provide a mechanism that allows a user to adopt a pseudonym, and it should
offer different levels of privacy protection.
4) Algorithm
A good algorithm should have the following features:
1. handling missing data
2. handling sparse data
3. cost-efficiency
V. References
Ariyoshi, Yusuke 1999. Improvement of Combination Information Filtering Method
Based on Reliabilities.
http://www-ai.cs.uni-dortmund.de/EVENTS/IJCAI99-MLIF/papers.html
Billsus, D. and Pazzani, M. 1998. Learning Collaborative Filters. Proceedings of
ICML’98, 46-53. Morgan Kaufmann.
Breese, J., Heckerman, D. and Kadie, C. 1998. Empirical Analysis of Predictive
Algorithms for Collaborative Filtering. Proceedings of the Fourteenth Conference on
Uncertainty in Artificial Intelligence, Madison, WI.
Claypool, Mark; Gokhale, Anuja; Miranda, Tim et al. 1999. Combining Content-Based
and Collaborative Filters in an Online Newspaper.
http://www.cs.wpi.edu/~claypool/papers/content-collab/
Collaborative Filtering workshop 1996. Berkeley, CA. Webpage:
http://www.sims.berkeley.edu/resources/collab/collab-report.htr.
Condliff, Michelle Keim; Lewis, David D.; Madigan, David and Posse, Christian 1998.
Bayesian Mixed-Effects Models for Recommender Systems.
http://www.cs.umbc.edu/~ian/sigir99-rec/
Goldberg, D., Nichols, D., Oki, B. M. and Terry, D. 1992. Using Collaborative Filtering
to Weave an Information Tapestry. Commun. ACM 35, 12.
Joachims, Thorsten; Freitag, Dayne and Mitchell, Tom 1996. WebWatcher: A Tour
Guide for the World Wide Web.
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/project-home.html
Lieberman, H. 1996. Letizia: An Agent That Assists Web Browsing. MIT Media Lab.
Maltz, David and Ehrlich, Kate 1995. Pointing the Way: Active Collaborative Filtering.
http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/ke_bdy.htm.
Oard, Douglas W. and Marchionini, Gary 1996. A Conceptual Framework for Text
Filtering. http://www.ee.umd.edu/medlab/filter/papers/filter/filter.html
Pennock, David M. and Horvitz, Eric 1999. Collaborative Filtering by Personality
Diagnosis: A Hybrid Memory- and Model-Based Approach.
http://www.research.microsoft.com/~horvitz/cfpd.htm
Pryor, H. Michael 1998. The Effects of Singular Value Decomposition on Collaborative
Filtering. Computer Science Technical Report PCS-TR98-338, Dartmouth College.
Resnick, Paul and Varian, Hal R. 1997. Recommender Systems. Communications
of the ACM, vol. 40, no. 3, March 1997.
Resnick, Paul; Iacovou, Neophytos et al. 1994. GroupLens: An Open Architecture
for Collaborative Filtering of Netnews. Proceedings of ACM 1994
Conference on Computer Supported Cooperative Work, Chapel Hill, NC, pages
175-186.
Shardanand, Upendra and Maes, Pattie 1995. Social Information Filtering: Algorithms
for Automating “Word of Mouth”.
http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/us_bdy.htm
Terveen, Loren G.; Hill, William C. et al. 1998. Building Task-Specific Interfaces
to High Volume Conversational Data.
http://www.acm.org/sigchi/chi97/proceedings/paper/lgt.htm
Turnbull, Don 1998. Augmenting Information Seeking on the World Wide Web Using
Collaborative Filtering Techniques.
http://donturn.fis.utoronto.ca/research/augmentis.htn
Turnbull, Don 1997. KMDI Final Summary: Collaborative Filtering.
http://donturn.fis.utoronto.ca/research/kmdi-cf.html
Ungar, Lyle H. and Foster, Dean P. 1998. A Formal Statistical Approach to
Collaborative Filtering. In AAAI Workshop on Recommendation System.
http://www.cis.upenn.edu/~ungar/papers.html
Wittenburg, Kent; Das, Duco; Hill, Will and Stead, Larry 1998. Group Asynchronous
Browsing on the World Wide Web.
http://www.w3.org/Conferences/WWW4/Papers/98/