
Trust & Reputation in Multi-Agent Systems
Dr. Jordi Sabater Mir
jsabater@iiia.csic.es

EASSS 2012, Valencia, Spain

Dr. Javier Carbó
jcarbo@inf.uc3m.es

Dr. Jordi Sabater-Mir


IIIA Artificial Intelligence Research Institute
CSIC Spanish National Research Council

Outline
Introduction
Approaches to control the interaction
Computational reputation models
eBay
ReGreT

A cognitive perspective on computational reputation models
A cognitive view on Reputation
Repage, a computational cognitive reputation model
[Properly] Integrating a [cognitive] reputation model into a
[cognitive] agent architecture
Arguing about reputation concepts

Trust
A complete absence of trust would prevent [one] even getting up in the morning.
Niklas Luhmann, 1979

Trust
A couple of definitions that I like:

Trust begins where knowledge [certainty] ends: trust provides a basis for dealing with uncertain, complex, and threatening images of the future. (Luhmann, 1979)

Trust is the outcome of observations leading to the belief that the actions of another may be relied upon, without explicit guarantee, to achieve a goal in a risky situation. (Elofson, 2001)

Trust

Epistemic

The subjective probability by which an individual, A, expects that another individual, B, performs a given action on which its welfare depends [Gambetta]
An expectation about an uncertain behaviour [Marsh]
The decision and the act of relying on, counting on, depending on [the trustee] [Castelfranchi & Falcone]

Motivational

Reputation

"After death, a tiger leaves behind


his skin, a man his reputation"
Vietnamese proverb

Reputation
What a social entity says about a target regarding his/her behavior
It is always associated with a specific behaviour/property.
The social evaluation linked to the reputation is not necessarily a belief of the issuer.
Reputation cannot exist without communication.
Social entity: a set of individuals plus a set of social relations among these individuals, or properties that identify them as a group in front of its own members and the society at large.

What is reputation good for?
Reputation is one of the elements that allows us to build trust.
Reputation also has a social dimension. It is useful not only for the individual but also for the society, as a mechanism for social order.

But... why do we need computational models of those concepts?

What we are talking about...
[Figure sequence: interactions involving Mr. Yellow, Mr. Pink and Mr. Green.]
Trust based on...
Direct experiences (two years ago, with Mr. Yellow)
Third party information (Mr. Pink and Mr. Green, about Mr. Yellow)
Reputation (of Mr. Yellow)

Characteristics of computational trust and reputation mechanisms
Each agent is a norm enforcer and is also under surveillance by the others. No central authority is needed.
Their nature allows them to reach where laws and central authorities cannot.
Punishment is usually based on ostracism. Therefore, exclusion must be a punishment for the outsider.

Characteristics of computational trust and reputation mechanisms
Bootstrap problem.
Not all kinds of environments are suitable for these mechanisms. A social environment is necessary.

Approaches to control the interaction

Different approaches to control the interaction
Security approach
Agent identity validation.
Integrity, authenticity of messages.
...
Institutional approach
Social approach
Trust and reputation mechanisms are at this level.

They are complementary and cover different aspects of interaction.

Computational reputation
models

Classification dimensions
Paradigm type
Mathematical approach
Cognitive approach

Information sources

Direct experiences
Witness information
Sociological information
Prejudice

Visibility types
Subjective
Global

Models granularity
Single context
Multi context

Agent behaviour assumptions


Cheating is not considered
Agents can hide or bias the
information but they never lie

Type of exchanged information

Subjective vs Global
Global
The reputation is maintained as a centralized resource.
All the agents in that society have access to the same reputation values.
Advantages:
Reputation information is available even if you are a newcomer, and it does not depend on how well connected you are or how good your informants are.
Agents can be simpler because they don't need to calculate reputation values, just use them.
Disadvantages:
Particular mental states of the agent or its singular situation are not taken into account when reputation is calculated. Therefore, a global view is only possible when we can assume that all the agents think and behave similarly.
It is not always desirable for an agent to make public the information about its direct experiences or to submit that information to an external authority.
Therefore, a high trust in the central institution managing reputation is essential.

Subjective vs Global
Subjective
The reputation is maintained by each agent and is calculated according to its
own direct experiences, information from its contacts, its social relations...
Advantages:
Reputation values can be calculated taking into account the current state of
the agent and its individual particularities.

Disadvantages:
The models are more complex, usually because they can use extra sources of
information.
Each agent has to worry about getting the information to build reputation
values.
Less information is available so the models have to be more accurate to
avoid noise.

A global reputation model: eBay


Model oriented to support trust between buyer and seller.
Completely centralized.
Buyers and sellers may leave comments about each other
after transactions.
Comment: a line of text + numeric evaluation (-1,0,1)
Each eBay member has a Feedback score that is the
summation of the numerical evaluations.
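
The feedback score described above can be sketched as follows. This is only a minimal illustration of "summation of the numerical evaluations"; the function name `feedback_score` and the input format (a list of integers) are assumptions made for the example, not part of eBay's system.

```python
def feedback_score(evaluations):
    """Sum the numeric evaluations (-1, 0 or 1) left for one member."""
    for value in evaluations:
        if value not in (-1, 0, 1):
            raise ValueError(f"invalid evaluation: {value!r}")
    return sum(evaluations)


# Three positive comments, one neutral, one negative.
score = feedback_score([1, 1, 0, -1, 1])
```

Note that a single sum hides the context and the age of each comment, which is why the model relies on a large number of opinions.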

eBay model
Specifically oriented to scenarios with the following characteristics:
A lot of users (we are talking about millions)
Few chances of repeating interaction with the same partner
Easy to change identity
Human oriented

Considers reputation as a global property and uses a single value that is not dependent on the context.

A great number of opinions that dilute false or biased information is the only way to increase the reliability of the reputation value.

A subjective reputation model: ReGreT

What is the ReGreT system?


It is a modular trust and reputation system
oriented to complex e-commerce environments
where social relations among individuals play
an important role.

[Figure: the ReGreT system. Databases: ODB, IDB, SDB. Modules: Direct Trust; Reputation model (Witness reputation, Neighbourhood reputation, System reputation); Credibility; Trust.]

Outcomes and Impressions


Outcome:
The initial contract
to take a particular course of action, or
to establish the terms and conditions of a transaction,
AND
the actual result of the contract.
Example:

Outcome
Contract: Price =c 2000, Quality =c A, Quantity =c 300
Fulfillment: Price =f 2000, Quality =f C, Quantity =f 295

Outcomes and Impressions

Outcome
Contract: Price =c 2000, Quality =c A, Quantity =c 300
Fulfillment: Price =f 2000, Quality =f C, Quantity =f 295

Behavioural aspects that can be evaluated on this outcome:
offers_good_prices
maintains_agreed_quantities

Impression:
The subjective evaluation of an outcome from a specific point of view.
Imp(o, φ1), Imp(o, φ2), Imp(o, φ3)
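
A rough sketch of how an outcome and one impression over it might be represented. The class `Outcome`, the function `maintains_agreed_quantities` and, in particular, the rating formula (1 when the agreed quantity is delivered, decreasing linearly with the shortfall, clipped to [-1, 1]) are assumptions made for illustration; the slides do not specify how an impression is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    contract: dict      # agreed values, e.g. {"price": 2000, ...}
    fulfillment: dict   # actual values after the transaction


def maintains_agreed_quantities(outcome):
    """Hypothetical impression: rate the delivered quantity in [-1, 1]."""
    agreed = outcome.contract["quantity"]
    delivered = outcome.fulfillment["quantity"]
    if delivered >= agreed:
        return 1.0
    shortfall = (agreed - delivered) / agreed
    return max(-1.0, 1.0 - 2.0 * shortfall)


# The example outcome from the slide.
o = Outcome(
    contract={"price": 2000, "quality": "A", "quantity": 300},
    fulfillment={"price": 2000, "quality": "C", "quantity": 295},
)
rating = maintains_agreed_quantities(o)
```

The same outcome can receive a different rating from another point of view (for example, quality), which is what makes the impression subjective.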

[Figure: the ReGreT system, highlighting Direct Trust.]

Reliability of the value based on:
Number of outcomes
Deviation: the greater the variability in the rating values, the more volatile the other agent will be in the fulfillment of its agreements.

Direct Trust
Trust relationship calculated directly from an agent's outcomes database.

$$DT_{a \to b}(\varphi) = \sum_{o_i \in ODB^{a,b}_{gr(\varphi)}} \rho(t, t_i) \cdot Imp(o_i, \varphi)$$

$$\rho(t, t_i) = \frac{f(t_i, t)}{\sum_{o_j \in IDB^{a,b}_{gr(\varphi)}} f(t_j, t)}$$

$$f(t_i, t) = \frac{t_i}{t}$$
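
The direct trust value is a time-weighted average of impressions, with recent outcomes weighing more. A minimal sketch under some assumptions: impressions are given as `(t_i, rating)` pairs with positive timestamps, ratings lie in [-1, 1], and the names `direct_trust`, `impressions` and `now` are illustrative only.

```python
def direct_trust(impressions, now):
    """Time-weighted average of ratings, using f(t_i, t) = t_i / t."""
    if not impressions:
        raise ValueError("no outcomes available for this behaviour")
    # f(t_i, t): more recent outcomes (larger t_i) get a larger weight.
    raw = [t_i / now for t_i, _ in impressions]
    total = sum(raw)
    # rho(t, t_i): the weights normalised so that they sum to 1.
    return sum((w / total) * rating
               for w, (_, rating) in zip(raw, impressions))


# An old bad experience (t=1) and a recent good one (t=3), evaluated at t=4.
value = direct_trust([(1, -1.0), (3, 1.0)], now=4)
```

With these inputs the recent positive outcome dominates, so the value is positive even though the two ratings are symmetric.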

Direct Trust
DT reliability

$$DTRL_{a \to b}(\varphi) = No(ODB^{a,b}_{gr(\varphi)}) \cdot (1 - Dv(ODB^{a,b}_{gr(\varphi)}))$$

Number of outcomes (No): $No(ODB^{a,b}_{gr(\varphi)})$, with $itm = 10$

Deviation (Dv): the greater the variability in the rating values, the more volatile the other agent will be in the fulfillment of its agreements.
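
A sketch of how the two reliability factors could be combined. Several parts are assumptions, since the slide only names the factors: `number_factor` uses a linear ramp that reaches 1 at `itm` outcomes as a stand-in for the actual No function, `deviation` is a weighted mean absolute deviation halved so that it stays in [0, 1] for ratings in [-1, 1], and all function names are hypothetical.

```python
def weights(impressions, now):
    """Normalised time weights, with f(t_i, t) = t_i / t."""
    raw = [t_i / now for t_i, _ in impressions]
    total = sum(raw)
    return [w / total for w in raw]


def direct_trust(impressions, now):
    return sum(w * r for w, (_, r) in zip(weights(impressions, now), impressions))


def number_factor(n_outcomes, itm=10):
    # Assumption: linear ramp; the slide only gives the threshold itm = 10.
    return min(n_outcomes / itm, 1.0)


def deviation(impressions, now):
    # Assumption: weighted mean absolute deviation, scaled to [0, 1].
    dt = direct_trust(impressions, now)
    spread = sum(w * abs(r - dt)
                 for w, (_, r) in zip(weights(impressions, now), impressions))
    return spread / 2.0


def direct_trust_reliability(impressions, now, itm=10):
    """DTRL = No * (1 - Dv)."""
    return number_factor(len(impressions), itm) * (1.0 - deviation(impressions, now))


reliability = direct_trust_reliability([(1, -1.0), (3, 1.0)], now=4)
```

Few outcomes or highly variable ratings both push the reliability towards 0, which is the behaviour the slide describes.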


Witness reputation
Reputation that an agent builds on another agent based
on the beliefs gathered from society members (witnesses).
Problems of witness information:
Can be false.

Can be incomplete.
It may suffer from the correlated evidence problem.

[Figures: sociograms over the agents a1, a2, b1, b2, c1, c2, d1, d2 and u1..u9, one per relation type.]
trade
cooperation: big exchange of sincere information and some kind of predisposition to help if it is possible.
competition: agents tend to use all the available mechanisms to take some advantage from their competitors.
Witness reputation
Step 1: Identifying the witnesses
Initial set of witnesses: agents that have had a trade relation with the target agent.
[Figure: trade sociogram around b2.]

Grouping agents with frequent interactions among them and considering each one of these groups as a single source of reputation values:
Minimizes the correlated evidence problem.
Reduces the number of queries to agents that probably will give us more or less the same information.

To group agents ReGreT relies on sociograms.
Witness reputation
Heuristic to identify groups and the best agents to represent them:
1. Identify the components of the graph.
2. For each component, find the set of cut-points.
3. For each component that does not have any cut-point, select a central point (node with the largest degree).
[Figure: cooperation sociogram among the witnesses of b2 (u2..u8), marking a cut-point and a central point.]
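
The three-step heuristic can be sketched on a small undirected graph. This is an illustrative sketch, not ReGreT's implementation: the graph is assumed to be a dict mapping each agent to the set of its neighbours, cut-points are found by brute force (remove a node and check whether the rest of its component stays connected), and ties for the central point are broken by name.

```python
def components(graph):
    """Connected components of an undirected graph {node: set(neighbours)}."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        result.append(comp)
    return result


def cut_points(graph, comp):
    """Nodes whose removal disconnects the rest of their component."""
    points = set()
    for node in comp:
        rest = comp - {node}
        if len(rest) < 2:
            continue
        sub = {n: graph[n] & rest for n in rest}
        if len(components(sub)) > 1:
            points.add(node)
    return points


def representatives(graph):
    """Cut-points where they exist, otherwise one central point per component."""
    reps = set()
    for comp in components(graph):
        cps = cut_points(graph, comp)
        if cps:
            reps |= cps
        else:
            # Largest degree; sorted() makes the tie-break deterministic.
            reps.add(max(sorted(comp), key=lambda n: len(graph[n])))
    return reps


# Hypothetical cooperation sociogram: a path u2-u3-u4 and a triangle u5-u6-u7.
sociogram = {
    "u2": {"u3"}, "u3": {"u2", "u4"}, "u4": {"u3"},
    "u5": {"u6", "u7"}, "u6": {"u5", "u7"}, "u7": {"u5", "u6"},
}
selected = representatives(sociogram)
```

For larger sociograms a linear-time articulation point algorithm would replace the brute-force check; the selection logic stays the same.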

Witness reputation
Step 1: Identifying the witnesses
Initial set of witnesses: agents that have had a trade relation with the target agent.
Grouping and selecting the most representative witnesses.
[Figure: the set of witnesses of b2 is reduced from u2..u8 to the representatives u2, u3 and u5.]

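Witness reputation
Step 1: Identifying the witnesses
Step 2: Who can I trust?
[Figure: the selected witnesses u2, u3 and u5 report their values about b2, e.g. $Trust_{u2 \to b2}(\varphi)$, $TrustRL_{u2 \to b2}(\varphi)$ and $Trust_{u5 \to b2}(\varphi)$, $TrustRL_{u5 \to b2}(\varphi)$.]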


Credibility model
Two methods are used to evaluate the credibility of
witnesses:

Credibility
(witnessCr)

Social relations
(socialCr)

Past history
(infoCr)

Credibility model
socialCr(a,w,b): credibility that agent a assigns to agent w when w is giving information about b, considering the social structure among w, b and itself.
[Figure: possible social structures among a, w and b, built from competitive and cooperative relations.]
w - witness
b - target agent
a - source agent

Credibility model
ReGreT uses fuzzy rules to calculate how the structure of social relations influences the credibility of the information.
IF coop(w,b) is h
THEN socialCr(a,w,b) is vl
[Figure: fuzzy sets for the intensity of a relation: low (l), moderate (m), high (h); and for the credibility: very_low (vl), low (l), moderate (m), high (h), very_high (vh).]


Neighbourhood reputation
The trust on the agents that are in the neighbourhood of the target agent, and their relation with it, are the elements used to calculate what we call the Neighbourhood reputation.
ReGreT uses fuzzy rules to model this reputation.
IF $DT_{a \to n_i}$(offers_good_quality) is X AND coop(b, $n_i$) low
THEN $R_{a \xrightarrow{n_i} b}$(offers_good_quality) is X

IF $DTRL_{a \to n_i}$(offers_good_quality) is X AND coop(b, $n_i$) is Y
THEN $RL_{a \xrightarrow{n_i} b}$(offers_good_quality) is T(X,Y)


System reputation
The idea behind the System reputation is to use the
common knowledge about social groups and the role that
the agent is playing in the society as a mechanism to assign
reputation values to other agents.
The knowledge necessary to calculate a system reputation is usually inherited from the group or groups to which the agent belongs.

Trust
If the agent has a reliable direct trust value, it will use that
as a measure of trust. If that value is not so reliable then it
will use reputation.
[Figure: Trust combines Direct Trust with the Reputation model (Witness reputation, Neighbourhood reputation, System reputation).]
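
The selection rule for the final trust value can be sketched as a simple fallback. The reliability threshold of 0.5 and the function name `trust` are assumptions for illustration; the slide only says that direct trust is used when its value is reliable and reputation otherwise.

```python
def trust(direct_trust, direct_trust_reliability, reputation, threshold=0.5):
    """Use direct trust when it is reliable enough, otherwise reputation."""
    if direct_trust_reliability >= threshold:
        return direct_trust
    return reputation


# Many consistent direct experiences: rely on them.
confident = trust(0.8, 0.9, reputation=-0.2)
# Few or noisy direct experiences: fall back to reputation.
fallback = trust(0.8, 0.1, reputation=-0.2)
```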

A cognitive perspective on computational reputation models
A cognitive view on Reputation
Repage, a computational cognitive reputation model
[Properly] Integrating a [cognitive] reputation model into a [cognitive] agent architecture
Arguing about reputation concepts

Social evaluation
A social evaluation, as the name suggests, is the evaluation by a social
entity of a property related to a social aspect.
Social evaluations may concern physical, mental, and social properties of
targets.
A social evaluation includes at least three sets of agents:
a set E of agents who share the evaluation (evaluators)
a set T of evaluation targets
a set B of beneficiaries
We can find examples where the different sets intersect totally, partially,
etc...
e (e in E) may evaluate t (t in T) with regard to a state of the world that is in b's (b in B) interest, but of which b is not necessarily aware.
Example: quality of TV programs during children's time slot

Image and Reputation


Both are social evaluations.
They concern other agents' (targets) attitudes toward socially desirable behaviour but...
...whereas image consists of a set of evaluative beliefs about the characteristics of a target,
reputation concerns the voice that is circulating about the same target.

"Reputation in artificial societies" [Rosaria Conte, Mario Paolucci]

Image
An evaluative belief; it tells whether the target is good or bad with respect
to a given behaviour [Conte & Paolucci]
It is the result of an internal reasoning on different sources of information that leads the agent to create a belief about the behaviour of another agent.

Beliefs
The agent has accepted the social evaluation as something true, and its decisions from now on will take this into account.

Reputation
A voice is something that is said, a piece of information that is being transmitted.
Reputation: a voice about a social evaluation that is recognised by the members of a group to be circulating among them.

Beliefs
B(S(f))
The agent believes that the social evaluation f is communicated.
This does not imply that the agent believes that f is true.

Reputation
Implications:
The agent that spreads a reputation, because it is not implicit that it believes the associated social evaluation, takes no responsibility for that social evaluation (the responsibility associated with the action of spreading that reputation is a different matter).
This fact allows reputation to circulate more easily than image (less/no fear of retaliation).
Notice that if an agent believes what people say, image and reputation collapse.
This distinction has important advantages from a technical point of view.

Gossip
In order for reputation to exist, it has to be transmitted. We cannot have reputation without communication.
Gossip currently has the meaning of idle talk or rumour, especially about the personal or private affairs of others. It usually has a bad connotation, but in fact it is an essential element of human nature.
The antecedent of gossip is grooming.
Studies from evolutionary psychology have found gossip to be very important as a mechanism to spread reputation [Sommerfeld et al. 07, Dunbar 04]

Gossip and reputation complement social norms: reputation evolves along with implicit norms to encourage socially desirable conduct, such as benevolence or altruism, and discourage socially unacceptable conduct, like cheating.


RepAge
What is the RepAge model?
It is a reputation model evolved from a cognitive theory by Conte and Paolucci.
The model is designed with special attention to the internal representation of the elements used to build images and reputations, as well as the inter-relations of these elements.

RepAge memory
[Figure: the RepAge memory, with Rep and Img predicates, each carrying a value and a strength (e.g. Strength: 0.6).]


What do you mean by properly?

Current models
[Figure: an agent (inputs, planner, decision mechanism, comm) that queries the Trust & Reputation system as a black box and gets back a value. Reactive.]

The next generation?
[Figure: the Trust & Reputation system integrated with the planner, the decision mechanism and the comm module of the agent.]
Not only reactive...
... proactive

BDI model
Very popular model in the multiagent community.
It has its origins in the theory of human practical reasoning [Bratman] and the notion of intentional systems [Dennett].
The main idea is that we can talk about computer programs as if they had a mental state.
Specifically, the BDI model is based on three mental attitudes:
Beliefs - what the agent thinks is true about the world.
Desires - world states the agent would like to achieve.
Intentions - world states the agent is putting effort into achieving.

BDI model
The agent is described in terms of these mental attitudes.
The decision-making model underlying the BDI model is known as practical reasoning.

In short, practical reasoning is what allows the agent to go from beliefs, desires and intentions to actions.
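
A toy illustration of the step from beliefs, desires and intentions to actions. This is a deliberately simplified sketch and not a real BDI interpreter: the class `BDIAgent`, the idea that each desire has a single enabling belief, and the plan library are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BDIAgent:
    beliefs: set = field(default_factory=set)
    # desire -> belief that must hold for the desire to be adopted
    desires: dict = field(default_factory=dict)
    # intention -> sequence of actions that achieves it
    plans: dict = field(default_factory=dict)
    intentions: list = field(default_factory=list)

    def deliberate(self):
        """Adopt as intentions the desires that the beliefs make achievable."""
        self.intentions = [
            desire for desire, enabling_belief in self.desires.items()
            if enabling_belief in self.beliefs
        ]

    def act(self):
        """Means-ends step: turn the current intentions into actions."""
        return [action
                for intention in self.intentions
                for action in self.plans.get(intention, [])]


agent = BDIAgent(
    beliefs={"seller_is_trusted"},
    desires={"have_item": "seller_is_trusted", "have_refund": "item_is_broken"},
    plans={"have_item": ["send_order", "pay"]},
)
agent.deliberate()
actions = agent.act()
```

In this sketch a belief such as `seller_is_trusted` is where the output of a trust and reputation model would enter the reasoning.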

Multicontext systems
Logics: declarative languages, each with a set of axioms and a number of rules of inference.
Units: structural entities representing the main architecture components. Each unit has a single logic associated with it.
Bridge rules: rules of inference which relate formulae in different units.
Theories: sets of formulae written in the logic associated with a unit.

[Figure: three units U1, U2 and U3 connected by a bridge rule with premises U1:b and U2:d and conclusion U3:a.]
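
A minimal sketch of how a bridge rule moves information between units. The representation is an assumption made for illustration: each unit's theory is a plain set of formula strings, the internal logic of each unit is ignored, and `apply_bridge_rules` is a hypothetical helper that fires rules until nothing new is derived.

```python
def apply_bridge_rules(units, rules):
    """Fire bridge rules until no unit gains a new formula.

    units: {unit name: set of formulae}
    rules: list of (premises, conclusion), where premises is a tuple of
           (unit, formula) pairs and conclusion is one (unit, formula) pair.
    """
    changed = True
    while changed:
        changed = False
        for premises, (target_unit, formula) in rules:
            holds = all(f in units[u] for u, f in premises)
            if holds and formula not in units[target_unit]:
                units[target_unit].add(formula)
                changed = True
    return units


# The rule from the figure: from U1:b and U2:d, derive U3:a.
units = {"U1": {"b"}, "U2": {"d"}, "U3": set()}
rules = [((("U1", "b"), ("U2", "d")), ("U3", "a"))]
apply_bridge_rules(units, rules)
```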

Repage integration in a BDI architecture

BC-LOGIC

Grounding Image and Reputation to BC-Logic

Repage integration in a BDI architecture

Desire and Intention context

Generating Realistic Desires

Generating Intentions

Repage integration in a BDI architecture


Arguing about Reputation Concepts


Goal: Allow agents to participate in argumentation-based dialogs regarding reputation elements in order to:
- Decide on the acceptance of a communicated social evaluation based on its reliability.
Is the argument associated with a communicated social evaluation (and according to my knowledge) strong enough to consider its inclusion in the knowledge base of my reputation model?
- Help in the process of trust alignment.

What we need:
A language that allows the exchange of reputation-related information.
An argumentation framework that fits the requirements imposed by the particular nature of reputation.
A dialog protocol to allow agents to establish information-seeking dialogs.

The language: LRep


LREP: First-order sorted language with special predicates representing the typology of social evaluations we use: Img, Rep, ShV, ShE, DE, Comm.

SF: Set of constant formulas
Allows LREP formulas to be nested in communications

SV: Set of evaluative values
Ex 2: Linguistic Labels
f: {0, 1, 2, 3, 4}

The reputation argumentation framework


Given the nature of social evaluations (the values of a social evaluation are graded), we need an argumentation framework that allows attacks to be weighted.
Example: We have to be able to differentiate between Img(j,seller,VG) being attacked by Img(j,seller,G) or being attacked by Img(j,seller,VB).

Specifically, we instantiate the Weighted Abstract Argumentation Framework defined in
P.E. Dunne, A. Hunter, P. McBurney, S. Parsons, and M. Wooldridge, "Inconsistency tolerance in weighted argument systems", in AAMAS'09, pp. 851-858, (2009).
Basically, this framework introduces the notions of strength and inconsistency budgets (defined as the amount of inconsistency that the system can tolerate regarding attacks) in a classical Dung's framework.
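
To make the idea of weighted attacks concrete, here is a small sketch. It rests on assumptions that are not stated in the slides: the linguistic labels are taken to be VB < B < N < G < VG mapped to 0..4, the strength of an attack is taken to be the distance between the two values, and `within_budget` is a hypothetical helper that checks whether a set of attacks can be disregarded under an inconsistency budget.

```python
# Assumed mapping of linguistic labels to {0, 1, 2, 3, 4}.
RANK = {"VB": 0, "B": 1, "N": 2, "G": 3, "VG": 4}


def attack_strength(value, attacking_value):
    """Assumed strength: distance between the two evaluative values."""
    return abs(RANK[value] - RANK[attacking_value])


def within_budget(attack_strengths, budget):
    """True if the total weight of the attacks does not exceed the budget."""
    return sum(attack_strengths) <= budget


# Img(j,seller,VG) attacked by Img(j,seller,G) versus by Img(j,seller,VB).
mild = attack_strength("VG", "G")
severe = attack_strength("VG", "VB")
```

Under a budget of 1, the mild disagreement could be tolerated while the severe one could not, which is the distinction the example in the slide asks for.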

Building Argumentative Theories


Reputation theory: set of ground elements (expressed in LREP) gathered by j through interactions and communications.
Consequence relation (reputation model): specific to each agent.
Argumentative theory: built from the reputation theory.
Simple shared consequence relation.
[Figure: the reputation-related information level and, on top of it, the argumentation level.]

Attack and Strength


f:
{ 0 , 1, 2 , 3 , 4 }

Strength of the attack

Example of argumentative dialog
Agent i: proponent
Agent j: opponent
Roles: seller; Inf (informant); sell(q), quality; sell(dt), delivery time

Each agent is equipped with a Reputation Weighted Argument System
[Figures: successive steps of the dialog between i and j, showing the strength of each attack.]

Using Inconsistency Budgets

Outline
+ PART II:
Trust Computing Approaches
Security
Institutional
Social
Evaluation of Trust and Reputation Models

EASSS 2010, Saint-Etienne, France

Dr. Javier Carbó
GIAA Group of Applied Artificial Intelligence
Univ. Carlos III de Madrid

Trust in Information Security


Same Word, Different World
Security approach tackles hard problems of trust.
They view trust as an objective, universal and
verifiable property of agents.
Their trust problems have solutions:
False identity
Reading/modification of messages by third parties
Repudiation of messages
Certificates of accomplishing tasks/services
according to standards

An example: Public Key Infrastructure
Entities: registration authority, certificate authority, LDAP directory
1. Client identity
2. Private key sent
3. Public key sent
4. Publication of certificate
5. Certificate sent

Trust in I&S, limitations


Their trust relies on central entities:
Authorities, Trusted Third Parties (TTPs)
Partially solved using hierarchies of TTPs.
They ignore part of the problem:
- The top authority has to be trusted by some other means
Their scope is far away from real-life trust issues: lies, defection, collusions, social norm violations, ...

Institutional approach
Institutions have proved to successfully regulate human societies for a long time:
- created to achieve particular goals while complying with norms.
- responsible for defining the rules of the game (norms), enforcing them and assessing penalties in case of violation.
Examples: auction houses, parliaments, stock exchange markets, ...
The institutional approach is focused on the existence of organizations:
Providing an execution infrastructure
Controlling access to resources
Sanctioning/rewarding agents' behaviors

An example: e-institutions


Institutional approach, limitations


They view trust as a partially objective, local and verifiable property of agents.
Intrusive control on the agents (modification of the execution resources, process killing, ...)
They require a shared agreement to define what is expected (norm compliance, case law)
They require a central entity and global supervision
Repositories and access control entities should be centralised
Low scalability if every agent is observed by the institution
Assumes that the institution itself is trusted

Social approach
The social approach consists in the idea of a self-organized society (Adam Smith's invisible hand)
Each agent has its own evaluation criteria of what is expected: no social norms, just individual norms
Each agent is in charge of rewards and punishments (often in terms of more/less future cooperative interactions)
No central entity at all; it consists of a completely distributed social control of malicious agents.
Trust as an emergent property
Avoids privacy issues caused by centralized approaches

Social approach, limitations


Unlimited, but undefined and unexpected trust scope: we view trust as a subjective, local and unverifiable property of agents.
Exclusion/isolation is the typical punishment for malicious agents. It is difficult to enforce in open and dynamic societies of agents.
Malicious behaviors may occur; they are supposed to be prevented by the lack of incentives and by punishments.
It is difficult to define which domain and society is appropriate to test this social approach.

Ways to evaluate any system
Integration on real applications
Using real data from public datasets
Using realistic data generated artificially
Using ad hoc simulated data with no justification/motivation
None of the above

Ways to evaluate T&R in agent systems
Integration of T&R on real agent applications
Using real T&R data from public datasets
Using realistic T&R data generated artificially
Using ad hoc simulated data with no justification/motivation
None of the above

Real Applications using T&R in an agent system
What real application are we looking for?
Trust and reputation:
System that uses (for something) and exchanges subjective opinions about other participants
Recommender Systems
Agent System:
Distributed view, no central entity collects, aggregates and publishes a final valuation ???

Real Applications using T&R in an agent system
Desiderata of application domains:
(To be filled by students)

Real data & public datasets


Assuming real agent applications exist, would data be publicly available?
Privacy concerns
Lack of incentives to save data over time
Distribution of data. Heisenberg uncertainty principle: if users knew their subjective opinions would be collected by a central entity, they would not behave as if their opinions had just a private (supposed-to-be friendly) reader.
No agents, no distribution: public datasets from recommender systems

A view on privacy concerns


Anonymity: use of arbitrary/secure pseudonyms
Using concordance: similarity between users within a single context. Mean of the differences when rating a set of items. Users tend to agree. ("Private Collaborative Filtering using estimated concordance measures", N. Lathia, S. Hailes, L. Capra, 2007)
Secure pair-wise comparison of fuzzy ratings ("Introducing newcomers into a fuzzy reputation agent system", J. Carbo, J.M. Molina, J. Davila, 2002)
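
The one-line description "mean of differences rating a set of items" can be sketched as below. This follows the wording of the slide only and is not the estimator defined in the cited paper; the function name `mean_rating_difference` and the dict-based rating format are assumptions.

```python
def mean_rating_difference(ratings_a, ratings_b):
    """Mean absolute difference over the items both users have rated.

    Returns None when the two users share no rated item.
    """
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return None
    return sum(abs(ratings_a[i] - ratings_b[i]) for i in common) / len(common)


alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 3, "m4": 2}
difference = mean_rating_difference(alice, bob)
```

A small difference means the two users tend to agree, so one can act as a useful recommender for the other.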

Real Data & Public Datasets


MovieLens, www.grouplens.org: Two datasets:
100,000 ratings for 1682 movies by 943 users.
1 million ratings for 3900 movies by 6040 users.
These are the standard datasets that many
recommendation system papers use in their evaluation

My paper with MovieLens


We selected users among those who had rated 70 or more movies, and we also selected the movies that were evaluated more than 35 times, in order to avoid the sparsity problem.
Finally we had 53 users and 28 movies.
The average number of votes per user is approximately 18, so the sparsity of the selected set of users and movies is under 35%.
"Agent-based collaborative filtering based on fuzzy recommendations", J. Carbó, J.M. Molina, IJWET v1 n4, 2004

Real Data & Public Datasets


BookCrossing (BX) dataset:
www.informatik.uni-freiburg.de/~cziegler/BX
collected by Cai-Nicolas Ziegler in a 4-week crawl (August
/ September 2004) from the Book-Crossing community.
It contains 278,858 users providing 1,149,780 ratings
(explicit / implicit) about 271,379 books.

Real Data & Public Datasets


Last.fm Dataset
top artists played by all users:
contains <user, artist-mbid, artist-name, total-plays>
tuples for ~360,000 users about 186,642 artists.
full listening history of 1000 users:
Tuples of <user-id, timestamp, artist-mbid, artist-name, song-mbid, song-title>
Collected by Oscar Celma, Univ. Pompeu Fabra
www.dtic.upf.edu/~ocelma/MusicRecommendationDataset

Real Data & Public Datasets


Jester Joke Data Set:
Ken Goldberg from UC Berkeley released a dataset from
Jester Joke Recommender System.
4.1 million continuous ratings (-10.00 to +10.00) of 100
jokes from 73,496 users.
www.ieor.berkeley.edu/~goldberg/jester-data/
It differentiates itself from other datasets by having a
much smaller number of rateable items.

Real Data & Public Datasets


Epinions dataset, collected by P. Massa:
in a 5-week crawl (November/December 2003) from the
Epinions.com
Not just ratings about items, also trust statements:
49,290 users who rated a total of
139,738 different items at least once, writing 664,824
reviews.
487,181 issued trust statements.
only positive trust statements and not negative ones

Real Data & Public Datasets


Advogato: www.trustlet.org
a weighted dataset. Opinions aggregated (centrally) on a
3 levels base, Apprentice, Journeyer, and Master
Tuples of: minami -> polo [level="Journeyer"];
Used to test trust propagation in social networks (assuming trust transitivity).
Trust metric (by P. Massa) uses this information in order
to assign to every user a final certification level
aggregating weighted opinions.
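
Lines in the tuple format shown above can be read into a trust graph with a few lines of code. This is a sketch that only handles the single-line form quoted in the slide; the regular expression, the function name `parse_certifications` and the second sample edge are assumptions for illustration.

```python
import re

# Matches lines such as: minami -> polo [level="Journeyer"];
EDGE = re.compile(r'^\s*(\S+)\s*->\s*(\S+)\s*\[level="([^"]+)"\];?\s*$')


def parse_certifications(lines):
    """Return {(source, target): level} for every line that matches."""
    edges = {}
    for line in lines:
        match = EDGE.match(line)
        if match:
            source, target, level = match.groups()
            edges[(source, target)] = level
    return edges


sample = [
    'minami -> polo [level="Journeyer"];',
    'polo -> minami [level="Master"];',   # hypothetical second edge
    'this line is not an edge',
]
graph = parse_certifications(sample)
```

The resulting dictionary is a directed, labelled graph on which a trust propagation metric can then be tested.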

Real Data & Public Datasets


MoviePilot dataset: www.moviepilot.com
this dataset contains information related to concepts
from the world of cinema, e.g. single movies, movie
universes (such as the world of Harry Potter movies),
upcoming details (trailers, teasers, news, etc.)
RecSysChallenge: live evaluation session will take place
where algorithms trained on offline data will be
evaluated online, on real users.
Mendeley dataset: www.mendeley.com
recommendations to users about scientific papers that
they might be interested in.

Real Data & Public Datasets


Public datasets from recommender systems involve no agents
and no distribution.
Authors have to distribute opinions to participants in
some way.
Ratings about items, not trust statements.
Ratio of # of ratings to # of items too low
Ratio of # of ratings to # of users too low
No time-stamps
Papers intend to be based on real data, but the required
transformation from centralized to distributed
aggregation distorts the reality of these data.
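
The two ratios mentioned above can be checked directly from the figures quoted for the Jester and Epinions datasets. This is only a sketch; treating each Epinions review as one rating is an assumption.

```python
# Figures quoted in the dataset slides (Epinions reviews are counted as ratings,
# which is an assumption made for this illustration).
datasets = {
    "Jester": {"ratings": 4_100_000, "items": 100, "users": 73_496},
    "Epinions": {"ratings": 664_824, "items": 139_738, "users": 49_290},
}


def density(stats):
    """Return (ratings per item, ratings per user) for one dataset."""
    return stats["ratings"] / stats["items"], stats["ratings"] / stats["users"]


for name, stats in datasets.items():
    per_item, per_user = density(stats)
    print(f"{name}: {per_item:.1f} ratings/item, {per_user:.1f} ratings/user")
```

With these numbers Epinions has fewer than five reviews per item on average, which illustrates how sparse the item side of such data can be.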

Realistic Data
We need to generate realistic data to test trust and
reputation in agent systems.
Several technical/design problems arise:
How many users, ratings and items do we need?
How dynamic should the society of agents be?
But the hardest part is the psychological/sociological
one:
How do individuals make trust decisions? Which types of
individuals are there?
How does a real human society trust? How many individuals
of each type belong to a real human society?

Realistic Data
Large-scale simulation with NetLogo
(http://ccl.northwestern.edu/netlogo/)
Others: MASON (https://mason.dev.java.net/), RePast
(http://repast.sourceforge.net/)
But most are ad hoc simulations which are
difficult for third parties to repeat.
Many of them use unrealistic agents with binary
altruist/egoist behaviour based on game theory views.

Examples of AdHoc Simulations


Convergence of reputation image to real behaviour of
agents. Static behaviours, no recommendations, just
consume/provide services. Worst case.
Maximum influence of cooperation. Free and honest
recommendations from every agent based on consumed
services. Best case.
Inclusion of dynamic behaviours, different % of malicious
agents in society, collusions between recommenders and
providers, etc. Compare results with the previous ones.
Avoiding malicious agents using fuzzy recommendations, J.
Carbo, J. M. Molina, J. Dávila. Journal of Organizational
Computing & Electronic Commerce, vol. 17, num. 1

Technical/Design Problems to generate simulated data
Lessons learned from the ART testbed experience.
http://megatron.iiia.csic.es/art-testbed/
A testbed would help to compute fair comparisons:
Researchers can perform easily-repeatable experiments
in a common environment against accepted
benchmarks
Relative Success:
3 international competitions held jointly with AAMAS'06-08.
Over 15 participants in each competition.
Several journal and conference publications use it.

Art Domain

the ART testbed

ART Interface

The agent system is displayed as a topology on the left, while
on the right two panels show the details of particular agent
statistics and of global system statistics.

The ART testbed


The simulation creates opinions according to an error
distribution of zero mean and a standard deviation s:
$s = (s^* + \alpha / c_g)\, t$
where $s^*$, unique for each era, is assigned to an appraiser
from a uniform distribution.
$t$ is the true value of the painting to be appraised.
$\alpha$ is a hidden value fixed for all appraisers that balances
opinion-generation cost and final accuracy.
$c_g$ is the cost an appraiser decides to pay to generate an
opinion. Therefore, the minimum achievable error
distribution standard deviation is $s^* \cdot t$.
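
A minimal sketch of the opinion-generation rule described above. The function and parameter names are hypothetical, and the use of a Gaussian error is an assumption: the slide only states a zero-mean error distribution with the given standard deviation.

```python
import random


def opinion_std(expertise_std, hidden_factor, cost, true_value):
    """Standard deviation of the opinion error.

    expertise_std: per-era value assigned to the appraiser
    hidden_factor: hidden value fixed for all appraisers
    cost:          amount paid to generate the opinion (must be positive)
    true_value:    true value of the painting
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (expertise_std + hidden_factor / cost) * true_value


def generate_opinion(expertise_std, hidden_factor, cost, true_value, rng=random):
    """Draw an opinion as the true value plus zero-mean Gaussian error (assumption)."""
    sigma = opinion_std(expertise_std, hidden_factor, cost, true_value)
    return true_value + rng.gauss(0.0, sigma)


# Paying more shrinks the error towards the floor expertise_std * true_value.
print(opinion_std(0.1, 0.5, 10, 1000))
print(opinion_std(0.1, 0.5, 1000, 1000))
```

The sketch makes the trade-off visible: however much an appraiser pays, the error cannot drop below the floor set by its expertise in that era.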

The ART testbed


Each appraiser a's actual client share $r_a$ takes into
account the appraiser's client share from the previous
timestep:
$r_a = q \cdot r'_a + (1 - q) \cdot \tilde{r}_a$
where $r'_a$ is appraiser a's client share in the previous
timestep and $\tilde{r}_a$ is its preliminary client share
for the current timestep.
$q$ is a value that reflects the influence of previous client
share size on next client share size (thus the volatility in
client share magnitudes due to frequent accuracy
oscillations may be reduced).
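
The client share update can be sketched as a simple weighted average. This assumes the second term of the formula is a freshly computed share for the current timestep; the names below are hypothetical.

```python
def update_client_share(previous_share, preliminary_share, q):
    """Blend the previous client share with the newly computed one.

    q close to 1 keeps shares stable; q close to 0 lets them follow
    the most recent accuracy almost immediately.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    return q * previous_share + (1.0 - q) * preliminary_share


# Same jump in the preliminary share, damped differently by q.
share = 0.2
for q in (0.1, 0.5, 0.9):
    print(q, update_client_share(share, 0.4, q))
```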

2006 ART Competition


2006 Competition setup:
Clients per agent: 20, Painting eras: 10, games with 5
agents
Costs 100/10/1, Sensing-Cost-Accuracy=0.5, Winner iam
from Southampton Univ.
Post competition discussion notes:
Larger number of agents required, Definition of dummy
agents, Relate # of eras with # of agents, More fair
distribution of expertise (just uniform), More abrupt
change in # of clients (greater q), Improving expertise
over time?

2006 ART Winner conclusions


The ART of IAM: The Winning Strategy for the 2006
Competition, Luke Teacy et al., Trust WS, AAMAS'07.
It is generally more economical for an agent to purchase
opinions from a number of third parties than it is to
invest heavily in its own opinion
There is little apparent advantage to reputation sharing.
reputation is most valuable in cases where direct
experience is relatively more difficult to acquire
The final lesson is that although trust can be viewed as a
sociological concept, and inspiration for computational
models of trust can be drawn from multiple disciplines,
the problem of combining estimates of unknown
variables (such as trustee behaviour) is fundamentally a
statistical one.

2007 ART Competition


2007 Competition Setup:
Costs 100/10/0.1, All agents have equal sum of expertise
values, Painting eras: static but unknown, Expertise
assignments may change during the course of the game,
Include dummy agents, games with 25 agents
2007 Competition Discussion Notes:
It needs to facilitate reputation exchange
It doesn't have to produce all changes at the same time,
Gradual changes
Studying barriers to entry; how a new agent joins an
existing MAS: Cold start vs. Hot start (exploration vs.
exploitation)
More competitive dummy agents
Relationship between opinion generation cost and accuracy

2008 ART Competition


2008 Competition Setup:
Each agent is limited in the number of certainty and opinion
requests that it can send.
Certainty requests have a cost.
The use of self-opinions is denied.
Wider range of expertise values
Every time step, select randomly a number of eras to
change, and add a given amount of positive change
(increase value). For every positive change, apply also a
negative change of the same amount, so that the average
expertise of the agent is not modified
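
The expertise-change rule above can be sketched as follows. It assumes that the negative changes are applied to eras other than the ones that were increased and that values are not clamped; both points are assumptions, since the slide does not specify them.

```python
import random


def shift_expertise(expertise, n_changes, amount, rng=random):
    """Raise n_changes random eras by amount and lower n_changes other eras
    by the same amount, so the agent's average expertise is unchanged."""
    chosen = rng.sample(range(len(expertise)), 2 * n_changes)
    updated = list(expertise)
    for era in chosen[:n_changes]:
        updated[era] += amount  # positive change
    for era in chosen[n_changes:]:
        updated[era] -= amount  # compensating negative change
    return updated


before = [0.5] * 10
after = shift_expertise(before, n_changes=3, amount=0.1, rng=random.Random(1))
print(after)
```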

Evaluation criteria
Lack of criteria on which and how the very different trust
decisions should be considered
Conte and Paolucci 02:
epistemic decisions: those about updating and
generating trust opinions from received reputations
pragmatic-strategic decisions: those about how to
behave with partners using this reputation-based trust
memetic decisions: those about how and
when to share reputation with others.

Main Evaluation Criteria of The ART testbed
The winning agent is selected as the appraiser with the
highest bank account balance in the direct confrontation
of appraiser agents repeated X times.
In other words, the appraiser who is able to:
estimate the value of its paintings most accurately
purchase information most prudently.
Where an ART iteration involves 19 steps (11 decisions, 8
interactions) to be taken by an agent.

Trust decisions in ART testbed


1. How should our agent aggregate reputation information
about others?
2. How should our agent update the trust weights of providers and
recommenders afterwards?
3. How many agents should our agent ask for reputation
information about other agents?
4. How many reputation and opinion requests from other
agents should our agent answer?
5. How many agents should our agent ask for opinions about our
assigned paintings?
6. How much time (economic value) should our agent spend
building requested opinions about the paintings of the other
agents?
7. How much time (economic value) should our agent spend
building the appraisals of its own paintings?
(AUTOPROVIDER!)

Limitations of Main Evaluation Criteria of ART testbed
From my point of view:
Evaluates all trust decisions jointly: should participants
play provider and consumer roles jointly or just the role
of opinion consumers?
Is the direct confrontation of competitor agents the right
scenario to compare them?

Providers vs. Consumers


Playing games with two participants of the 2007 competition
(iam2 and afras) and 8 other dummy agents.
Dummy agents are implemented ad hoc to be the sole
opinion providers; they do not ask the 2007
participants for any service.
Neither of the two 2007 participants ever provides
opinions/reputations; they are just consumers.
Differences between both agents were much smaller than
the official competition stated (absolutely and relatively).
An extension of a fuzzy reputation agent trust model in the
ART testbed, Soft Computing, vol. 14, issue 8, 2010

Trust Strategies in
Evolutive Agent Societies
An evolutionarily stable strategy (ESS) is a strategy which,
if adopted by a population of players, cannot be invaded
by any alternative strategy
An evolutionarily stable trust strategy is a strategy which,
if it becomes dominant (adopted by a majority of agents),
cannot be defeated by any alternative trust strategy.
Justification: The goal of trust strategies is to establish
some kind of social control over malicious/distrustful
agents
Assumption: agents may change their trust strategy. Agents
with a failing trust strategy would get rid of it and
adopt a successful trust strategy in the future.

An evolutive view of ART games


We consider a failing trust strategy to be the one that lost
(earning less money than the others) the last ART game.
We consider the successful trust strategy to be the one that
won the last ART game (earning more money than the
others).
In this way, in consecutive games the participant who lost
the game is replaced by the one who won it.
We have applied this to the 16 participant agents of the 2007
ART competition.
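
The replacement scheme described above can be sketched as a short loop. Here `play_game` is a hypothetical stand-in for running one ART game and returning one earnings value per agent; it is not part of the testbed API.

```python
from collections import Counter


def evolve(population, play_game, n_games):
    """After each game, replace the lowest earner with a copy of the highest earner.

    population: list of strategy names, one per agent
    play_game:  callable returning a list of earnings aligned with population
    """
    population = list(population)
    history = []
    for _ in range(n_games):
        earnings = play_game(population)
        indices = range(len(population))
        winner_index = max(indices, key=earnings.__getitem__)
        loser_index = min(indices, key=earnings.__getitem__)
        population[loser_index] = population[winner_index]
        history.append(Counter(population))  # strategy counts after this game
    return population, history


# Toy game with fixed payoffs per strategy, for illustration only.
payoff = {"a": 3, "b": 2, "c": 1}
final, history = evolve(["a", "b", "c"], lambda pop: [payoff[s] for s in pop], 2)
print(final, history)
```

Tracking the strategy counts after every game shows whether one strategy ever reaches a majority, which is the dominance condition used in the definition above.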

[Figure: starting from the 16 participants in the 2007 competition, each ART game yields a winner and a loser; the loser is replaced by the winner for the next ART game, and so on.]

Game  Winner         Earnings  Loser          Earnings
1     iam2           17377     xerxes         -8610
2     iam2           14321     lesmes         -13700
3     iam2           10360     reneil         -14757
4     iam2           10447     blizzard       -7093
5     agentevicente  8975      Rex            -5495
6     iam2           8512      alatriste      -999
7     artgente       8994      agentevicente  2011
8     artgente       10611     agentevicente  1322
9     artgente       8932      novel          424
10    iam2           9017      IMM            1392
11    artgente       7715      marmota        1445
12    artgente       8722      spartan        2083
13    artgente       8966      zecariocales   1324
14    artgente       8372      iam2           2599
15    artgente       7475      iam2           2298
16    artgente       8384      UNO            2719
17    artgente       7639      iam2           2878
18    iam2           6279      JAM            3486
19    iam2           14674     artgente       2811
20    artgente       8035      iam2           3395

Results of repeated games


The 2007 winner is not an Evolutionarily Stable Strategy.
Although the strategy of the 2007 winner spreads
in the society of agents (up to 6 iam2 agents out of 16), it
never becomes dominant (no majority of iam2 agents).
The iam2 strategy is defeated by the artgente strategy, which
becomes dominant (11 artgente agents out of 16).
Therefore its superiority as winner of the 2007 competition
is, at least, relative.
The equilibrium of trust strategies that forms an
evolutionarily stable society is composed of 10-11
artgente agents and 6-5 iam2 agents.

CompetitionRank  EvolutionRank  Agent          ExcludedInGame
?                1              artgente       -
1                2              iam2           -
?                3              JAM            18
?                4              UNO            16
?                5              zecariocales   13
?                6              spartan        12
?                7              marmota        11
13               8              IMM            10
10               9              novel          9
15               10             agentevicente  8
11               11             alatriste      6
12               12             rex            5
?                13             Blizzard       4
?                14             reneil         3
14               15             lesmes         2
16               16             xerxes         1

Other Evaluation Criteria of the ART testbed
The testbed also provides functionality to compute:
the average accuracy of the appraisers' final appraisals
(final appraisal error mean)
the consistency of that accuracy (final appraisal error
standard deviation)
the quantities of each type of message passed
between appraisers.
Could we take into account other relevant evaluation
criteria?
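
The first two measures listed above can be sketched in a few lines. Using the relative absolute error and the population standard deviation is an assumption; the slide does not state how the testbed defines the appraisal error.

```python
from statistics import mean, pstdev


def appraisal_error_stats(appraisals, true_values):
    """Return (mean, standard deviation) of the relative final appraisal error."""
    errors = [abs(a - t) / t for a, t in zip(appraisals, true_values)]
    return mean(errors), pstdev(errors)


# Two appraisals that are each off by 10% of the true value.
print(appraisal_error_stats([110, 90], [100, 100]))
```

A low mean with a high standard deviation would indicate an appraiser that is accurate on average but unreliable from painting to painting.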

Evaluation criteria from the agent-based view
Characterization and Evaluation of Multi-agent System, P.
Davidsson, S. Johansson, M. Svahnberg. In Software
Engineering for Multi-Agent Systems IV, LNCS 3914, 2006.
9 Quality attributes:
1. Reactivity: How fast are opinions re-evaluated when
there are changes in expertise?
2. Load balancing: How evenly is the load balanced
between the appraisers?
3. Fairness: Are all the providers treated equally?
4. Utilization of resources: Are the available
abilities/information utilized as much as is possible?

Evaluation criteria from the agent-based view
5. Responsiveness: How long does it take for the
appraisers to get a response to an individual request?
6. Communication overhead: How much extra
communication is needed for the appraisals?
7. Robustness: How vulnerable is the agent to the absence
of responses?
8. Modifiability: How easy is it to change the behaviour of
the agent in very different conditions?
9. Scalability: How good is the system at handling large
numbers of providers and consumers?

Evaluation criteria from the agent-based view
Evaluation of Multi-Agent Systems: The case of Interaction,
H. Joumaa, Y. Demazeau, J.M. Vincent, 3rd Int. Conf. on
Information & Communication Technologies: from
Theory to Applications. IEEE Computer Society, Los
Alamitos (2008)
An evaluation at the interaction level, based on the
weight of the information brought by a message.
A function is defined in order to calculate the weight of
pertinent messages.

Evaluation criteria from the agent-based view
The relation between the received message m and the
effects on the agent is studied in order to calculate the
(m) value. According to the model, two kinds of
functions are considered:
A function that associates weight to the message
according to its type.
A function that associates weight to the message
according to the change provoked on the internal
state and the actions triggered by its reception.
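
The two kinds of weighting functions can be illustrated with a rough sketch. Everything here is hypothetical: the message types, the weights, and the way state change is counted are assumptions for illustration and are not taken from the cited model.

```python
# Hypothetical weights per message type (assumption).
TYPE_WEIGHT = {"opinion-request": 1.0, "reputation-request": 0.5}


def weight_by_type(message):
    """Weight a message according to its type only."""
    return TYPE_WEIGHT.get(message["type"], 0.0)


def weight_by_effect(state_before, state_after, triggered_actions):
    """Weight a message by what it changed: the number of internal state
    entries that differ after reception plus the number of actions triggered."""
    changed = sum(
        1 for key, value in state_after.items() if state_before.get(key) != value
    )
    return changed + len(triggered_actions)


message = {"type": "opinion-request"}
print(weight_by_type(message))
print(weight_by_effect({"trust": 0.4}, {"trust": 0.6, "asked": True}, ["reply"]))
```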

Consciousness Scale
Too much quantification (AI is not just statistics)
Compare agents qualitatively: measure their level of
consciousness
A scale of 13 consciousness levels according to the cognitive
skills of an agent, the Cognitive Power of an agent.
The higher the level obtained, the more the behavior of
the agent resembles humans
www.consscale.com

Bio-inspired order of Cognitive Skills


From the point of view of emotions (Damasio, 1999):

Emotion
Feeling
Feeling of a Feeling
Fake Emotions

Bio-inspired order of Cognitive Skills


From the point of view of perception and action (Perner,
1999):

Perception
Adaptation
Attention
Set Shifting
Planning
Imagination

Bio-inspired order of Cognitive Skills


From the point of view of Theory of Mind (Lewis 2003):

I Know
I Know I Know
I Know You Know
I Know You Know I Know

Consciousness Levels
Super-Conscious

Human-like
Social

Empathic
Self-Conscious
Emotional
Executive
Attentional
Adaptive
Reactive

Evaluating agents with ConsScale

Thank you !

EASSS 2010, Saint-Etienne, France
