
Weboo: Using Web Browser For User Social Profiling Used For Recommendations


Narendra Rajput¹, Sanket Bhawkar², Rajesh Basrur³, Rubeena Khan⁴

Department of Computer Engineering, M.E.S. College of Engineering, Pune, India

¹bknarendra2008@gmail.com, ²sanketb89@gmail.com,
³rajeshbasrur@yahoo.co.in, ⁴rubeenakhan@mescoepune.org

Abstract: Recommender systems mine data on the web to identify and
recommend information that users might be interested in, which gives
the user a more personal web experience. The social web has become a
buzzword today, and even recommender systems have not remained
untouched by this social effect. As the title suggests, in this paper
we propose a way to use the web browser to create a tag-based profile
of a user by learning about his interests, which can then be used for
personalised recommendations. We also propose an architecture that
uses the web browser to collect user information from social networks,
describe how this data is processed by the recommender, and provide an
API to access the processed data in your own recommender applications.
We also explain how this approach is advantageous over the
recommenders commonly available today.
Keywords: Knowledge Base, Social Tagging, Folksonomy,
Collaborative Filtering, Ontology.

I. INTRODUCTION
During the last decade we have seen the rise of social sites such as
Facebook¹, Twitter², Flickr³, LinkedIn⁴ and Delicious⁵, which have
had a great impact on the web today. The main reason for this success
has been the way these sites allow people to interact with other
people. Users may share their content and express themselves the way
they want, and while doing this they knowingly or unknowingly leave
behind traces of their areas of interest, likes, preferences, etc.
Thus social networks are a gold mine of the information we need for
recommending things to the user.
Traditionally, most recommenders are concerned only with information
specific to a particular domain. But the information obtained from
the social profiles of a user is multi-faceted and can be used in
multi-domain recommendations.
Recommendation systems used today can be broadly classified into
content-based, item-based collaborative filtering, user-based
collaborative filtering and hybrid recommendation systems. Each of
them is suitable in different scenarios.
¹ www.facebook.com
² www.twitter.com
³ www.flickr.com
⁴ www.linkedin.com
⁵ www.delicious.com

Collaborative filtering based recommendation systems are quite popular
and the most successful. But they face the cold start problem and have
difficulty handling changed user interests. Since users interact with
social sites on a regular basis, these sites hold up-to-date
information and can be used as sources for learning about the users.
This information is mostly in the form of tags or other content such
as hash tags and likes.
Social networks allow users to add tags, which are a kind of keyword
assigned by the user to classify items of interest. Other sites such
as Facebook use a mechanism called likes to allow users to specify
their items of interest. Facebook also allows users to specify their
favourite music, books, movies, sportsmen, etc.
In this paper we propose a mechanism to handle these different forms
of data, filter them and classify them. Today many folksonomy-based
recommendation systems use this tag-based information to make
recommendations relevant to users. However, a large part of these tags
are noisy, which degrades the performance of the system when the tags
are organised into a meaningful set of categories. Our approach
combines the different forms of data into keywords, filters them and
maps them into suitable categories by making use of knowledge bases
such as Freebase, Wikipedia, etc.
In the later sections, we also propose an architecture for the
implementation of the system.
II. RELATED WORK
In [1] Fabian Abel, Nicola Henze, Eelco Herder and Daniel Krause
introduce the Mypes project, which connects different online social
network services. It collects, aggregates and filters information from
the profiles and makes it available to the end user and third-party
applications in an easy-to-use form. It has been the main source of
motivation for our paper.
Social networks like Facebook, Twitter, LinkedIn and Flickr provide
APIs that allow developers to access the users' social information,
which can be used in their applications.
In [2] and [3] Iván Cantador et al. propose a method for categorizing
tags into ontologies specifying domains of interest. We make use of a
similar approach to filter our keywords and then make use of the
Freebase knowledge base to categorize the social tags into
purpose-oriented ontologies.
Mike Gartrell, Richard Han, Qin Lv and Shivakant Mishra [4] proposed
an online news recommendation system called SocialNews that leverages
the power of the Facebook likes mechanism to generate recommendations.
III. OVERVIEW OF OUR APPROACH
Our approach to collecting information from services and then
filtering and categorizing it can be summarised as follows:
The raw tags and other forms of information collected from the
services are inconsistent and noisy. So first, based upon the nature
of the information, they are filtered in several steps.
a) Facebook likes don't need any refinement, since they are already
categorized by Facebook.
b) The tags retrieved from Flickr, Delicious and Last.fm, and hash
tags from Twitter, can contain some noise and need some filtering.
The raw tags can contain spelling mistakes (like schol → school or
Bombai → Bombay), plural forms of words (comment, comments), stop
words (like a, the, an, etc.), abbreviations (like LA for Los
Angeles) or special characters such as accented letters (like ü, é,
â, ç, ñ).
The filtering takes place in the following steps:
i) In the first step the tags that are too large or too small
(depending on threshold values) are removed. Then the tags with
special characters are converted to normal ASCII form. The stop words
like articles (a, an, the), conjunctions (like and, but, or, yet,
for, nor, so) and pronouns (like he, she, it, they) are removed from
the tags.
ii) Then the spelling mistakes in the tags are corrected by using the
Google "Did you mean" feature, inspired by the approach taken in [2].
Google corrects the words and finds results relevant to a word which
is very close to the misspelled word. This can be done by passing the
misspelled word to Google as a search query; in the result Google
returns the word which is close to the misspelled word, generally a
word that has a very small Levenshtein distance from it.
iii) Many times, sites that allow users to add tags combine separate
words into a single word by removing spaces for ease of handling
data. For example:
artificialintelligence → artificial intelligence
newyork → new york
sanfrancisco → san francisco
Such compound words also need to be handled. This can be done by
using Google's "Did you mean" feature.
iv) Tags can also contain a word with the same meaning in different
forms, like a plural form or verb form (blogs, blogging → blog). So
these tags need to be stemmed. In this process we can make use of the
WordNet knowledge base.
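
The first two filtering steps can be illustrated with a minimal Python
sketch. It is not the implementation used in this paper: the
thresholds, the stop-word list and the small VOCABULARY are
hypothetical values chosen for illustration, and a local
nearest-word lookup by Levenshtein distance stands in for Google's
"Did you mean" feature, which is not called here.

```python
import unicodedata

# Hypothetical thresholds and word lists, chosen only for illustration.
MIN_LEN, MAX_LEN = 2, 25
STOP_WORDS = {"a", "an", "the", "and", "but", "or", "yet", "for", "nor",
              "so", "he", "she", "it", "they"}
VOCABULARY = ["school", "bombay", "blog", "music"]


def to_ascii(tag):
    """Fold accented letters to their closest ASCII form (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFKD", tag)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def clean_tag(tag):
    """Step i: ASCII folding, stop-word removal, then length thresholds.

    Returns the cleaned tag, or None if it is too short or too long.
    """
    words = [w for w in to_ascii(tag).lower().split() if w not in STOP_WORDS]
    cleaned = " ".join(words)
    if not (MIN_LEN <= len(cleaned) <= MAX_LEN):
        return None
    return cleaned


def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_word(tag, vocabulary=VOCABULARY, max_distance=2):
    """Step ii stand-in: nearest vocabulary word within max_distance edits."""
    best = min(vocabulary, key=lambda word: levenshtein(tag, word))
    return best if levenshtein(tag, best) <= max_distance else tag
```

Under these assumptions, clean_tag("The Beatles") keeps only
"beatles", and closest_word("schol") maps to "school" because the two
differ by a single edit.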

Once the words are filtered, the next task is to derive the semantics
of the filtered keywords in order to derive the concepts, and then
assign the concepts to suitable ontologies.
Freebase is a collaborative knowledge base (KB) in the form of a
collection of structured data generated from many sources. Its data
is accessible through an open API and as a database dump to
programmers, who can use it in their applications.
The main feature of Freebase that makes it useful for our system is
the fact that it uses a graph model instead of tables. It arranges the
data in the form of a set of nodes and a set of edges that establish
relationships between the nodes. Freebase can be queried through the
Metaweb Query Language.
Our approach is to use Freebase for categorization of tags. After the
keywords have been filtered, we try to define the category of a tag by
searching for an entry in Freebase. The search returns the IDs of the
most relevant topics from the knowledge base. Another query to
Freebase returns the category of the item of interest. We do not have
to worry about the ambiguity of meanings because Freebase already has
a field, score, that gives the similarity score between the tag and a
topic from Freebase. Once we have categorized all tags, a graph of
user interests is derived from the categorized objects.
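
The categorization step can be sketched as follows. This is only an
illustration under stated assumptions: FAKE_KB is a hypothetical
in-memory stand-in for the knowledge base (no Freebase API is
called), the topic IDs, categories and scores are made up, and
linking each topic to its category node is one possible way to form
the interest graph, which the text above does not specify.

```python
from collections import defaultdict

# Hypothetical stand-in for the knowledge base: tag -> candidate topics.
# IDs, categories and scores are invented for illustration only.
FAKE_KB = {
    "jaguar": [
        {"id": "topic:jaguar_cars", "category": "automotive", "score": 41.0},
        {"id": "topic:jaguar_animal", "category": "biology", "score": 57.5},
    ],
    "blog": [
        {"id": "topic:blog", "category": "internet", "score": 80.2},
    ],
    "wiki": [
        {"id": "topic:wiki", "category": "internet", "score": 75.0},
    ],
}


def categorise(tag, kb=FAKE_KB):
    """Pick the candidate topic with the highest similarity score.

    Returns (topic_id, category), or None if the tag has no entry.
    """
    candidates = kb.get(tag, [])
    if not candidates:
        return None
    best = max(candidates, key=lambda candidate: candidate["score"])
    return best["id"], best["category"]


def build_interest_graph(tags, kb=FAKE_KB):
    """Build an adjacency map linking each topic to its category node."""
    graph = defaultdict(set)
    for tag in tags:
        result = categorise(tag, kb)
        if result is None:
            continue  # uncategorised tags are skipped in this sketch
        topic_id, category = result
        category_node = "category:" + category
        graph[topic_id].add(category_node)
        graph[category_node].add(topic_id)
    return dict(graph)
```

With this toy data, the ambiguous tag "jaguar" resolves to the
biology topic because that candidate carries the higher score, and
"blog" and "wiki" end up attached to the same category node.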
A. RECOMMENDATION ALGORITHM
In this graph, objects/tags are represented as nodes, the
relationships among them are represented as unsigned edges, and
weights are defined based on the strength of the relationship.
Different graph-based algorithms can now be applied to the graph to
assign weights to the edges between the nodes.
For this purpose we suggest a couple of approaches:
i) Path based similarity: In this method the weight is defined as a
function of the number of paths between the nodes or the length of
the edges.
ii) Random walk: In this approach the similarity between two nodes is
defined as a function of the probability of reaching one node from the
other by walking randomly.
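
Both approaches can be sketched on a small example graph. The code
below is an illustration, not the algorithm of this paper: GRAPH is a
made-up interest graph, the path-based weight is taken to be
1 / (1 + shortest path length), and the random-walk weight is the
probability that a uniform random walk reaches the target within a
fixed number of steps. Both formulas are assumptions, since the text
above leaves the exact functions open.

```python
from collections import deque

# Made-up undirected interest graph: node -> set of neighbouring nodes.
GRAPH = {
    "rock": {"guitar", "concert"},
    "guitar": {"rock", "concert"},
    "concert": {"rock", "guitar", "travel"},
    "travel": {"concert"},
}


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return distance + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


def path_similarity(graph, source, target):
    """Assumed path-based weight: shorter paths give higher similarity."""
    distance = shortest_path_length(graph, source, target)
    return 0.0 if distance is None else 1.0 / (1.0 + distance)


def random_walk_similarity(graph, source, target, steps=3):
    """Probability that a uniform random walk from source hits target
    within `steps` steps (walks stop as soon as they reach the target)."""
    if source == target:
        return 1.0
    probabilities = {source: 1.0}  # mass of walks that have not hit target
    reached = 0.0
    for _ in range(steps):
        next_probabilities = {}
        for node, mass in probabilities.items():
            neighbours = graph[node]
            if not neighbours:
                continue  # dead end: this walk can never reach the target
            share = mass / len(neighbours)
            for neighbour in neighbours:
                if neighbour == target:
                    reached += share
                else:
                    next_probabilities[neighbour] = (
                        next_probabilities.get(neighbour, 0.0) + share)
        probabilities = next_probabilities
    return reached
```

On this graph "rock" and "travel" are two hops apart, so the
path-based weight is 1/3, while a two-step random walk from "rock"
reaches "travel" with probability 1/6.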

IV. OVERVIEW OF ARCHITECTURE
Users interact with social sites through browsers regularly, and hence
new data is generated on a regular basis. By capturing the user's
social interaction through the web browser we can filter the
information and send it to the recommender system for further
processing, instead of having the system keep polling and gathering
each user's information by itself. Thus, we have a kind of event
handling mechanism that triggers the sending of user information to
the recommender whenever new data is generated.
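
The push-based flow described above can be sketched as a small event
handler. The class and method names (BrowserCollector, Recommender,
on_social_event, receive) are hypothetical and exist only for this
illustration; a real browser extension would send the data over the
network rather than call the recommender directly.

```python
class Recommender:
    """Hypothetical recommender endpoint that stores pushed profile data."""

    def __init__(self):
        self.profiles = {}

    def receive(self, user_id, tags):
        # Append newly pushed tags to the user's stored profile.
        self.profiles.setdefault(user_id, []).extend(tags)


class BrowserCollector:
    """Hypothetical browser-side component: filters tags and pushes them."""

    def __init__(self, user_id, recommender, tag_filter):
        self.user_id = user_id
        self.recommender = recommender
        self.tag_filter = tag_filter

    def on_social_event(self, raw_tags):
        """Called whenever the user generates new social data.

        Filters the raw tags and pushes the result, so the recommender
        never has to poll the social sites itself.
        """
        cleaned = [tag for tag in map(self.tag_filter, raw_tags) if tag]
        if cleaned:
            self.recommender.receive(self.user_id, cleaned)
        return cleaned


def simple_filter(tag):
    """Toy filter: trim, lowercase, and drop tags shorter than 2 characters."""
    tag = tag.strip().lower()
    return tag if len(tag) >= 2 else None
```

In this sketch the recommender's stored profile changes only when
on_social_event fires with at least one tag that survives filtering.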

Once the filtered profile data is sent to the recommender, it applies
the process of categorisation to the tags using Freebase, creating the
user graph, which can then be used for recommendation.
Since our architecture is extensible, it depends on the nature of the
application that uses the service to implement this interface and use
a suitable algorithm for assigning weights to the nodes.
The recommender system can be used through the provided API.

V. CONCLUSION
Combining the user's social information with a well-structured
knowledge base derived from the world's largest encyclopedia promises
to be very useful for deriving recommendations for users.
Similarly, our approach of making use of browsers for collecting the
social information and making it a part of our recommendation system
in the process of learning about the user promises to be very
effective due to its distributed, flexible architecture.

REFERENCES

[1] Fabian Abel, Nicola Henze, Eelco Herder, Daniel Krause, "Linkage,
Aggregation, Alignment, and Enrichment of Public User Profiles with
Mypes", I-SEMANTICS '10: Proceedings of the 6th International
Conference on Semantic Systems, ACM, New York, NY, USA, 2010.
[2] Iván Cantador, Ioannis Konstas, Joemon M. Jose, "Categorising
social tags to improve folksonomy-based recommendations", Web
Semantics: Science, Services and Agents on the World Wide Web,
Volume 9, Issue 1, March 2011, Pages 1-15.
[3] Iván Cantador, Martin Szomszor, Harith Alani, Miriam Fernandez,
Pablo Castells, "Enriching Ontological User Profiles with Tagging
History for Multi-Domain Recommendations", 1st International
Workshop on Collective Semantics: Collective Intelligence and the
Semantic Web (CISWeb 2008), 5th European Semantic Web Conference
(ESWC 2008), Tenerife, Spain.
[4] Mike Gartrell, Richard Han, Qin Lv, Shivakant Mishra,
"SocialNews: Enhancing Online News Recommendations By Leveraging
Social Network Information", Technical Report CU-CS-1084-11.
[5] Hugo Liu, Pattie Maes, "InterestMap: Harvesting Social Network
Profiles for Recommendations", Beyond Personalization, 2005.
[6] Satnam Alag, "Collective Intelligence in Action", Manning.
[7] Freebase API, http://wiki.freebase.com/wiki/Freebase_API
[8] X. Jin, C. Wang, J. Luo, X. Yu, J. Han, "LikeMiner: A system for
mining the power of 'like' in social media networks", KDD '11:
Proceedings of the 17th ACM SIGKDD Conference on Knowledge Discovery
and Data Mining, 2011. Demo.
[9] Tanvi Surti, "Social Recommender systems: Improving
recommendations through personalization".
[10] Karen H. L. Tso-Sutter, Leandro Balby Marinho, Lars
Schmidt-Thieme, "Tag-aware Recommender Systems by Fusion of
Collaborative Filtering Algorithms", SAC '08: Proceedings of the 2008
ACM Symposium on Applied Computing, ACM, New York, NY, USA.
