Académique Documents
Professionnel Documents
Culture Documents
Recommender systems are information filtering systems that present relevant item
suggestions to users from a large collection of possible items, for this purpose
recommender systems traditionally rely on the historic interaction of the user with
the system to build and accurate representation of the user's interests. We
understand relevance defined in as the: "utility or usefulness of the information in
regard to the user's task and needs".
There are two types of information recommender systems use in order to predict
the relevance of an item for a user:
Explicit feedback consists on the direct feedback of the user on an item. Ratings
are usual representation of the user opinion of a user for an item, for example a
numerical rating (1 to 5), a rating on like a scale (Strongly disagree, Disagree,
Neither agree nor disagree, Agree, Strongly agree) or binary ratings (like, dislike).
Implicit feedback on the other hand is the collection of actions users exert on
items, these actions indirectly reflect the opinion of the user on the item, and for
example a user can view, click or buy an item.
These views and clicks, combined with purchase events, are the most common
examples of implicit feedback used in recommender systems.
It is shown that using such implicit feedback in recommender systems produces
results that are consistent with results achieved when using explicit feedback.
Content based filtering (CB) which uses the features or characteristics of the
items to find out the relevance for the user.
Collaborative filtering (CF) which uses only the opinions of the users on the
items.
Content based (CB) filtering systems operate under the assumption that the user
will like similar items to the ones she has liked in the past. CB systems extract the
characteristics or features of the data items and use these characteristics to
represent the item and the user under a common knowledge model.
Data item representation: The data analyzer component creates a data item
representation under a knowledge model that reflects the relevant characteristics of
the data item that are useful for the filtering component.
User representation: The user model component creates a user profile that
reflects the current user's situation and desires and represents it under a knowledge
model. To properly represent the user's situation and interests explicit information
from the user can be used.
Matching: The filtering component is in charge of using the item and user
representation to predict the relevance of the item for the said user.
Learning: The learning component keeps up to date the user model representation
based on the feedback received by the user.
Although CB systems are simple, easy to implement and its easy to explain why an
item has been classified as relevant, several researchers such as have noted the
limitations of these kind of systems in their filtering performance.
We Identified the following Limitations:
(1) Limited content analysis: In order to have a good representation of data items,
data items must be characterized by features. If this feature extraction of data items
is difficult or impractical the representation of the data items will not be accurate
and a conceptual gap between the representation and the real features will occur,
this inaccurate classification of items will affect the accuracy of the filtering
system.
(2)Overspecialization: A CB system will only classify as relevant data items that
are similar to the ones that the user has classified as relevant in the past. This
means that the system cannot predict the relevance of a data item that is unlike the
ones the user has seen.
(3)New user problem: A user has to express her opinion on a sufficient number of
data items in order to build an useful user profile.
We have noted the limitations of these kind of systems in their predictive accuracy,
in particular we remark the following problems:
(1) Sparsity: In some systems data items are rated rarely, making it difficult to find
similarities between users or between items. For model based systems there is also
a problem since
(2)New item problem: When a new item is added to the system is difficult to
predict its relevance because the user and item profile has little or no information,
this problem is critical in systems where items appear and disappear frequently.
(3)Scalability: As the number of items and users
increase, the neighborhood formation process is more demanding computationally,
also as vector representations increase their dimensionality, distance metrics to
detect similarities become less significant. On model based systems, the
computational cost of learning the model parameters as the number of users and
items increase is not negligible
(4)Complexity and explainability: Although model based CF gives better
predictive performance than memory based CF, most of the times developer prefer
using memory based CF for two reasons: Model based CF is more complex;
adjusting parameters can be and time consuming and difficult to maintain. On the
other hand it's difficult to find out what the latent dimensions in the model are
representing and it's difficult to explain to the user how the relevance of an item
has been calculated.