
Information and Multi-Sensor Coordination
Greg Hager and Hugh Durrant-Whyte*
GRASP Laboratory
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104
Abstract
The control and integration of distributed, multi-sensor perceptual systems is a complex and challenging problem. The observations or opinions of different sensors are often disparate, incomparable and are usually only partial views. Sensor information is inherently uncertain, and in addition the individual sensors may themselves be in error with respect to the system as a whole. The successful operation of a multi-sensor system must account for this uncertainty and provide for the aggregation of disparate information in an intelligent and robust manner.
We consider the sensors of a multi-sensor system to be members or agents of a team, able to offer opinions and bargain in group decisions. We will analyze the coordination and control of this structure using a theory of team decision making. We present some new analytic results on multi-sensor aggregation and detail a simulation which we use to investigate our ideas. This simulation provides a basis for the analysis of complex agent structures cooperating in the presence of uncertainty. The results of this study are discussed with reference to multi-sensor robot systems, distributed AI and decision making under uncertainty.
1 Introduction
The general problem of seeking, sensing, and using perceptual information is a complex and, as yet, unsolved problem. Complications arise due to the inherent uncertainty of information from perceptual sources, incompleteness of information from partial views, and questions of deployment, coordination and fusion of multiple data sources. Yet another dimension of complexity results from organizational and computational considerations. We feel that these three topics - information, control, and organization - are fundamental for understanding and constructing complex, intelligent robotics systems. In this paper, we are concerned with developing useful analytic methods for describing, analyzing and comparing the behavior of such constructions based on these criteria.
This material is based on work supported under a National Science Foundation Graduate Fellowship and by the National Science Foundation under Grants DMC-8411879 and DMC-12838. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation.
We assume from the outset that such robotic systems are basically task-oriented, goal-directed agents. The behavior of a system is determined entirely by the goal it is working toward, and the information it has about its environment. At any point in time, such an agent should use the available information to select some feasible action. The most preferable action should be that which is expected to lead the system closest to the current goal. In short, we will consider the question of driving robotics systems as a large and complex problem in estimation and control. To adopt the nomenclature of decision theory [2], at any point in time an agent has a local information structure reflecting the state of the world, a set of feasible actions to choose from, and a utility which supplies a preference ordering of actions with respect to states of the world. We generally assume that a rational decision maker is one which, at any point in time, takes that action which maximizes its utility. Our commitment, as a result of casting the problem in a decision theoretic perspective, is to provide principled means for specifying information structures, actions, and (perhaps most crucially) determination of utility.
This monolithic formulation is certainly too naive and general to successfully attack the problem. The state of the system is a complex entity which must be decomposed and analyzed to be understood. The resulting procedures for control will undoubtedly be computationally complex. Computer resources, like human problem solvers, have resource limitations which bound the complexity of problems that can be solved by a single agent - otherwise known as bounded rationality [2]. Such computational considerations suggest distributing the workload to increase the problem solving potential of the system. From a practical standpoint the system itself is composed of physically distinct devices, each with its own special characteristics. Software and hardware modules should be designed so that information and control local to subtasks is kept locally, and only information germane to other subtasks is made available. Ultimately, sensors and subtasks could be independent modules which can be added or removed from a system without catastrophic results. In this case we desire each subtask to have the ability to cooperate and coordinate its actions with a group while maintaining its own local processing intelligence, local control variables, and possibly some local autonomy.
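The decision-theoretic agent described above can be sketched in a few lines. This is an illustrative construction of ours, not an implementation from the paper: the belief, actions and utility below are hypothetical stand-ins for an agent's information structure, feasible action set and preference ordering.

```python
# Minimal sketch of a rational agent: given an information structure (here a
# belief over world states), a set of feasible actions, and a utility over
# (action, state) pairs, take the action maximizing expected utility.

def expected_utility(action, belief, utility):
    """Expected utility of an action under a belief {state: probability}."""
    return sum(p * utility(action, state) for state, p in belief.items())

def rational_choice(actions, belief, utility):
    """Select the feasible action with maximal expected utility."""
    return max(actions, key=lambda a: expected_utility(a, belief, utility))

# Hypothetical example: two world states, three actions; the utility rewards
# actions matching the true state.
belief = {"near": 0.7, "far": 0.3}
utility = lambda a, s: 1.0 if a == s else 0.0
best = rational_choice(["near", "far", "wait"], belief, utility)
```

Under this belief the agent selects "near", the action with the highest expected utility.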
Our solution is to view the system as decomposed into several distinct decision-makers. These modules are to be organized and communicate in such a manner as to achieve the common goal of the system. Organizations of this type are often referred to as a team [8,16]. We propose to consider a team theoretic formulation of multi-sensor systems in the following sense: The agents are considered as members of the team, each observing the environment and making local decisions based on the information available to them. A manager (executive or coordinator) makes use of utility considerations to converge the opinions of the sensor system. Section 2 will be devoted to a review of team decision theory and present some new analytic results [7].
One criticism of decision theory is that optimal solutions are often difficult or impossible to find. In order to aid in analysis of these problems, we have built a simulation environment. We use the simulation to examine various non-optimal and heuristic solutions to otherwise intractable problems, and experiment with different loss functions to determine the character of the resultant decision method. The simulation is a generalization of classic pursuit and evasion games [14] to teams of pursuers and evaders. Each team member has local sensor and state variables. They are coordinated through a team executive. Section 3 will be devoted to a detailed look at the simulation and our results to date.
We feel that the team formulation of sensor systems has implications for the broader study of Artificial Intelligence. AI is relevant to this work in at least two respects:
Firstly, it is certainly possible to consider the agents of the system as performing some reasoning process. Considering AI systems as decision-makers seems a plausible approach to the construction of intelligent distributed systems. Thus, this work has commonalities with Distributed AI in that both are interested in questions of structuring information and communication between intelligent systems.
Secondly, we often want to interpret the information available to the system, and to communicate information as interpretations rather than simple signals. This is primarily a problem in representation of information. Again, AI has focussed on the interpretation of information, and the representation of that interpretation.
More generally, we would like to discover when systems like this can be profitably posed as a decision problem. Section 4 will be devoted to an in depth discussion of the general merits and shortcomings of the organizational view, and attempt to define when it is most appropriate or useful.
2 A Team-Theoretic Formulation of
Multi-Sensor Systems
Team theory originated from problems in game theory [26] and multi-person control. The basis for the analysis of cooperation amongst structures with different opinions or interests was formulated by Nash [20] in the well known bargaining problem. Nash's solution for the two person cooperative game was developed into the concepts of information, group rationality and multi-person decisions by Savage [24]. Team theory has since been extensively used by economists to analyze structure [16], information [18] and communication.
Section 2.1 introduces the team structure and defines the function of the team members and manager. Different team organizations are discussed and the concepts of information structure, team decision, team utility and cooperation are defined in Section 2.2. Section 2.3 applies these techniques to the multi-sensor team and a method for aggregating opinions is derived. Due to lack of space, we will assume some familiarity with probability and decision theory.1
2.1 Team Preliminaries
A sensor or member of a team of sensors is characterized by its information structure and its decision function. Consider a team comprising n members or sensors, each making observations of the state of the environment. The information structure of the i'th team member is a function η_i which describes the character of the sensor observations z_i ∈ Z_i in terms of the state of the environment θ ∈ Θ and the other sensors' actions a_j ∈ A_j, j = 1, …, n, so that:

z_i = η_i(θ; a_1, …, a_n)    (1)

Collectively the n-tuple η = (η_1, …, η_n) is called the information structure of the team. The action a_i of the i'th team member is related to its information z_i by a decision function δ_i ∈ D_i as a_i = δ_i(z_i). We may also allow randomized rules, in which case δ_i associates information with a distribution over the set of feasible actions. Collectively the n-tuple δ = (δ_1, …, δ_n) is called the team decision function. For an estimation problem, the action space A_i is the same as the space of possible states of nature Θ: our action is to choose an estimate a_i = θ̂_i ∈ Θ.
There are a number of different forms that the information structure can take, which in turn characterizes the type of problem to be solved. If for all team members η_i is defined only on Θ (η_i : Θ → Z_i), the resulting structure is called a static team [16]. When η_i also depends on the other team members' actions, then the structure is called a dynamic team [13]. Clearly, as each team member cannot make decisions and be aware of the results simultaneously, the general form of information structure for a dynamic team must induce a causal relation on the team member actions a_i. We can apply a precedence structure on the time instant a member makes a decision, so that if member i makes a decision prior to member j then the information structure η_i will not be a function of a_j. Indexing the team members by their decision making precedence order we can rewrite the information structure as:

z_i = η_i(θ; a_1, …, a_{i-1})
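The static team structure above can be made concrete with a small sketch. This is our own illustrative rendering of the notation, not code from the paper: each member pairs an information structure η_i (state to observation) with a decision function δ_i (observation to action), and in a static team the observations depend only on the state θ.

```python
# Sketch of a static team member: an information structure eta (state ->
# observation z_i) paired with a decision function delta (z_i -> action a_i).

class TeamMember:
    def __init__(self, eta, delta):
        self.eta = eta      # information structure: state -> observation
        self.delta = delta  # decision function: observation -> action

    def act(self, state):
        # In a static team the observation depends only on the state theta.
        return self.delta(self.eta(state))

# Hypothetical estimation problem: each member observes theta with a known
# fixed bias b and estimates theta by removing it, so a_i is an estimate of
# the state (the action space equals the state space).
team = [TeamMember(lambda th, b=b: th + b, lambda z, b=b: z - b)
        for b in (0.5, -0.5)]
actions = [m.act(10.0) for m in team]   # the team decision (a_1, a_2)
```

With known biases both members recover the true state, illustrating how the team decision function is just the tuple of local decision functions.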

1 An extended version of this paper appears as GRASP Lab tech. report 71.
A sensor or member of a team will be considered rational if it can place a preference ordering on its actions that admits a utility function u_i ∈ U. One possible set of rationality axioms can be found in [2, p. 43] and the proof that these axioms admit a utility function can be found in [5]. A decision rule δ(·) can be evaluated in terms of its payoff:

u_i(δ, θ) = ∫ u_i(δ(z_i), θ) f(z_i|θ) dz_i = E[u_i(δ(z_i), θ)]

We assume that a rational team member is attempting to maximize its payoff.
The team utility is a function which assigns a value to each team action: L(θ, a_1, a_2, …, a_n). The role of L is very important in characterizing the team. The interpretation of team action due to Ho, Chu, Marschak and Radner [13,16] is that the goal of every team member is to maximize L regardless of personal loss (in fact, personal loss is not even defined). We will call this an "altruistic" team. An alternative formulation is to allow individual team members to have a personal utility as well as an interest in the team. For example a team member may agree to cooperate and be subject to the utility L, or to disagree with the other team members and be subject to a personal utility. In this case a rational team member will agree to cooperate only if it will gain by doing so: when the team utility exceeds its personal utility. We shall call this an "antagonistic" team.
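The distinction between the two team types reduces to a one-line membership test, sketched below with hypothetical utility values of our own choosing.

```python
# Sketch of the antagonistic-team rule described above: a rational member
# agrees to cooperate only when the team utility it would receive exceeds
# its personal utility for acting alone.

def cooperates(team_utility, personal_utility):
    """Antagonistic membership test: join only if the team pays better."""
    return team_utility > personal_utility

# An altruistic member, by contrast, maximizes team utility unconditionally
# (personal loss is not even defined), so it always takes the team action.
members = [(0.9, 0.4), (0.3, 0.6)]   # hypothetical (team, personal) utilities
decisions = [cooperates(t, p) for t, p in members]
```

The first member joins the team; the second is better off acting on its personal utility and disagrees.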
The idea of individual rationality can be extended to include so-called group rationality. Nash first introduced a set of group rationality axioms. There has been considerable disagreement about these axioms [28], and a number of other definitions have been suggested, e.g. [10]. The underlying basis for providing group rationality is the ability of a team to put a preference ordering on group decisions. Unlike individual utility considerations, this involves a number of assumptions about the nature of the group or team. For example, each team member must assume some subjective knowledge of the other players' rationality, interpersonal comparisons of utility require preferences to be congruent, and assumptions must be made about indifference, dominance and dictatorship.
2.2 Team Organizations
The problems associated with the extension of individual to group rationality are all concerned with the comparison of individual utilities. The existence of a group preference ordering is equivalent to requiring that the combination of individual team member utilities that forms the team utility is convex. If this is satisfied then we say that the group decision is also person-by-person optimal. The key principle in group decision making is the idea of Pareto optimal decision rules:

Definition: The group decision δ* is Pareto-optimal if every other rule δ ∈ D decreases at least one team member's utility.

If the risk set of the team L(θ; δ_1, …, δ_n) ∈ R^n is convex, then it can be shown [13] that such a team decision is also person-by-person optimal, so that for all team members i = 1, …, n the team action a = [a_1, …, a_n]^T also satisfies:

max_{a_i ∈ A_i} E[L(δ_1(z_1), …, a_i = δ_i(z_i), …, δ_n(z_n))]    (2)
If the class of group decision rules D includes all jointly randomized rules then L will always be convex. If we really believed in an altruistic team, we must use this class and be subject to these results. Considerable work has been done on finding solutions to Equation 2 under these conditions [16,13,12,11], particularly as regards the effect of information structure on distributed control problems.
We are primarily interested in teams of observers - sensors making observations of the state of the environment. In this case the team members can be considered as Bayesian estimators, and the team decision is to come to a consensus view of the observed state of nature. The static team of estimators is often called a Multi-Bayesian system [28]. These systems have many of the same characteristics as more general team decision problems. Weerahandi [27] has shown that the set of non-randomized decision rules is not complete in these systems. If two team members using decision rules δ = [δ_1, δ_2] have utilities u(θ) = [u_1(δ_1, θ), u_2(δ_2, θ)], then the team utility function L(θ) = L(u(θ)) will only admit a consensus if it satisfies the inequality:

E[L(u(θ))] ≥ L(E[u(θ)])    (3)

This is the Jensen inequality, and it is well known that this will be satisfied if and only if the function L(u(θ)) and the risk set are convex. Generally, this will only be true when the set D of decision rules includes jointly randomized decision rules.
Consider the team utility L as a function of the team member utilities, so that L = L(u_1, …, u_n) = L(u). The group rationality principles described above restrict the functions L that are of interest to those that have the following properties [1]:

1. Unanimity: ∂L/∂u_i > 0 ∀i.

2. No dictator: there is no j such that L = u_j regardless of the other members' utilities.

3. Indifference: If ∀i, ∃δ_1, δ_2 such that u_i(δ_1, θ) = u_i(δ_2, θ), then L(δ_1) = L(δ_2).

If the team utility function L satisfies these properties, we will say that the team is rational. The function L is often called an "opinion pool". Two common examples of opinion pools are the generalized Nash product:

L(θ; δ_1, …, δ_n) = c ∏_{i=1}^{n} u_i(δ_i, θ)^{α_i},  α_i ≥ 0

and the logarithmic or linear opinion pool:

L(θ; δ_1, …, δ_n) = ∑_{i=1}^{n} λ_i u_i(δ_i, θ),  λ_i ≥ 0
The value of the generalized Nash product can be seen by noting that if u_i(δ_i(z), θ) = f(z_i|θ) and α_i = 1 then L is the posterior density of θ with respect to the observations z_i. A criticism leveled at the generalized Nash product is that it assumes independence of opinions; however this may be accounted for through the weights α_i. A criticism of the linear opinion pool is that there is no reinforcement of opinion.
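The two pools above are simple to state in code. The sketch below is our illustration with hypothetical utility values; the multiplicative pool reinforces agreeing opinions, while the linear pool merely averages them.

```python
# Illustrative implementations of the two opinion pools: the generalized
# Nash product and the linear opinion pool.

import math

def nash_product(utilities, alphas, c=1.0):
    """Generalized Nash product: c * prod(u_i ** alpha_i), alpha_i >= 0."""
    return c * math.prod(u ** a for u, a in zip(utilities, alphas))

def linear_pool(utilities, lambdas):
    """Linear opinion pool: sum(lambda_i * u_i), lambda_i >= 0."""
    return sum(l * u for l, u in zip(utilities, lambdas))

# With u_i = f(z_i | theta) and alpha_i = 1, the Nash product is proportional
# to the posterior density of theta under independent observations.
u = [0.5, 0.8]
nash = nash_product(u, [1.0, 1.0])
lin = linear_pool(u, [0.5, 0.5])
```

Unequal weights α_i can down-weight sensors whose opinions are not independent, as noted above.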
Suppose we now restrict group decision rules δ ∈ D to non-randomized decisions. This allows team members to disagree in the following sense: If the team risk set u = [u_1(δ_1, θ), …, u_n(δ_n, θ)] is convex for non-randomized δ, then Equation 3 holds and a consensus may be reached. If however u is concave in at least one u_i, and if randomized rules are disallowed, it is better (in terms of utility) for the associated team members to disagree: as if they were acting as an antagonistic team. It should be clear from this example that the difference between antagonistic and altruistic teams is the ability to obtain a convex "opinion" space.
If all the u_i are convex functions, then L will always be convex on the class of non-randomized decisions. However in location estimation or Multi-Bayesian systems, the u_i will often be concave, so that L(u) will be guaranteed convex only in the class of randomized rules. Thus L(u) will always be convex for an altruistic team. For an antagonistic team L will only be convex when agreement can be reached (in the class of non-randomized decisions); otherwise, if opinions diverge sufficiently, then L will be concave. Concavity will generally take the form of separating team members into convex groups of opinions - coalitions - which may overlap.
Our interest in these results centers on finding when agreement can be reached and in calculating the value of the consensus. We summarize these concepts in the following:
Result 1: Consider a team with member utilities u_i(δ_i, θ) and team utility satisfying the group rationality conditions. Then:

1.1. Consensus: Cooperation will only occur when the set of risk points L(δ_1, …, δ_n) ∈ R^n is convex.

1.2. Altruistic: If δ ∈ D is the class of all randomized decision rules then L will always be convex.

1.3. Antagonistic: If ∀i, u_i is convex then L will be convex in the class of non-randomized decision rules.

1.4. Disagreement: When L is concave there is no best decision and agreement cannot be reached.

The point at which L becomes concave for each member is called the disagreement point; the value of a member's utility at this point is called the security level.
2.3 Multi-Sensor Teams
The fusion of sensor observations requires that we have a method for comparing information from disparate sources. We consider each sensor to be a member of an antagonistic team in the following sense: Each sensor comes up with uncertain, partial views of the state of the environment; the goal of the executive is to integrate the various sensor opinions by offering incentives and interpretations for combining disparate viewpoints. The antagonistic team structure allows members to disagree if for some reason they have made a mistake or cannot reconcile their views with those of the other team members. An altruistic team could not take this action.
We suggest that the comparison of diverse observations can be interpreted in terms of a comparison of the utility of a consensus decision. Suppose we have two observations z_1 and z_2 which are not directly comparable. Each observation contributes to some higher level description of the environment, and each is dependent on the other. We can interpret any decision δ about the environment in terms of its utility to the observations: u_1(δ(z_1), θ) and u_2(δ(z_2), θ). Although z_1 and z_2 cannot be compared directly, their contributions to particular decisions can be evaluated in a common utility framework. The team theoretic comparison of utilities admits a measure of disagreement and allows for the evaluation of sensor information in a consistent manner.
Define Θ to be the set of states of nature and consider a robot system with sensors S_j, j = 1, …, m, taking sequences of observations Z_i = {z_i^1, …, z_i^n} of features in the environment. We will restrict interest to the static team structure, so that Z_i = η_i(θ). Locally, sensors can make decisions based on local observations as θ̂ = δ_i(Z_i) from comparable sequences Z_i = {z_i^1, …, z_i^n}, with respect to a common utility u_i(δ_i(Z_i), θ). Jointly the sensor team has a utility L = L(θ; δ_1, …, δ_n), which can be considered as a function of the individual utilities L = L(u_1, …, u_n) satisfying the group rationality conditions.
If the observations from different sensors are incomparable, they must be interpreted in some common framework. This will be the case when the sensors are located in different locations, for example. Let D_i interpret S_i's observations in some common description framework. Then the team loss can be written as:

L = L(u_1(δ[D_1(z_1)], θ), …, u_n(δ[D_n(z_n)], θ))
By selecting L and analyzing its convexity, we will establish the character of the sensor team.
The rationality axioms derived from utility theory require that we be able to put a preference ordering on decisions δ(·). It seems reasonable that the preference ordering admitted by an observation z_i will be the same ordering as that obtained by a maximum likelihood estimator (unbiased Bayes rationality). In this case, the utility function of an observation will be coincident with its likelihood function. Thus the Gaussian distribution N(z_i, Λ_i) associated with the observation z_i can also be considered as the preference ordering or posterior utility function of z_i on any resulting estimate θ. In this framework, two observations z_i and z_j will have a basis for agreement only if their combined utility exceeds their individual utility; that is, a consensus can only be reached if the set of observation utilities forms a convex set.
To fix this, define u_i(z_i, θ) = f_i(z_i|θ) ~ N(z_i, Λ_i) as the utility to the observation z_i of the estimate θ, and let L denote the team utility. Then, in terms of expected utility, a consensus can only be reached if L satisfies Equation 3, i.e. the function L is convex.
The function L will be convex if and only if its matrix of second order derivatives is non-negative definite. If L satisfies the group rationality principles, this requires that ∂²u_i/∂θ² ≥ 0 for i = 1, …, n. Differentiating u_i shows that the sign of ∂²u_i/∂θ² is determined by the factor 1 − (θ − z_i)^T Λ_i^{-1} (θ − z_i). For these to be positive, and hence the set u ∈ R^n to be convex, we must find a θ which satisfies:

(θ − z_i)^T Λ_i^{-1} (θ − z_i) ≤ 1    (4)

for all i = 1, …, n observations.
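Equation 4 says a non-randomized consensus needs an estimate θ lying within one standard deviation (in the Mahalanobis sense) of every observation. The sketch below, our own scalar illustration, checks whether such a region exists for a set of hypothetical (z_i, λ_i) pairs.

```python
# Scalar sketch of the consensus condition of Equation 4: find a theta with
# (theta - z_i)^2 / lambda_i <= 1 for every observation.

def within_one_sigma(theta, z, lam):
    """Scalar form of Equation 4: (theta - z)^2 / lambda <= 1."""
    return (theta - z) ** 2 / lam <= 1.0

def consensus_region_nonempty(observations, step=0.01):
    """Brute-force search for a theta satisfying Equation 4 for all (z, lam)."""
    lo = min(z - lam ** 0.5 for z, lam in observations)
    hi = max(z + lam ** 0.5 for z, lam in observations)
    theta = lo
    while theta <= hi:
        if all(within_one_sigma(theta, z, lam) for z, lam in observations):
            return True
        theta += step
    return False

agree = consensus_region_nonempty([(0.0, 1.0), (1.5, 1.0)])    # intervals overlap
diverge = consensus_region_nonempty([(0.0, 1.0), (5.0, 1.0)])  # too far apart
```

With unit variances, observations at 0 and 1.5 share a consensus region while observations at 0 and 5 do not, matching the disagreement behavior described above.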
Consider any two observations z_i and z_j. They can form a consensus if we can find a θ that satisfies Equation 4 for both z_i and z_j. To compare observations, we interpret them in a common framework as D_i(z_i) and D_j(z_j). If J_i and J_j are the Jacobians of D_i and D_j respectively [6], define E_i = J_i^T Λ_i^{-1} J_i. This is the information matrix of the observation z_i transformed to the common frame of reference by the transformation D_i.
Since the left hand side of Equation 4 is always positive, we must find a θ which satisfies:

½[(θ − D_i(z_i))^T E_i (θ − D_i(z_i)) + (θ − D_j(z_j))^T E_j (θ − D_j(z_j))] ≤ 1    (5)

The value of θ which makes the left hand side of this equation a minimum (and which is also the consensus when it exists) is given by the usual combination of normal observations [2]:

θ = (E_i + E_j)^{-1} (E_i D_i(z_i) + E_j D_j(z_j))

Substituting this into Equation 5 gives:

½(D_i(z_i) − D_j(z_j))^T E_i (E_i + E_j)^{-1} E_j (D_i(z_i) − D_j(z_j)) ≤ 1    (6)
We will say that z_i and z_j admit a Bayesian (non-randomized) consensus if and only if they satisfy Equation 6. The left side of Equation 6, which we will denote as d_ij, is called the generalized Mahalanobis distance (a restricted form of this is derived in [27]) and is a measure of disagreement between two observations. Figure 1 shows plots of u_i against u_j for various values of d_ij, which clearly demonstrate that the convexity of the set [u_i, u_j] corresponds to requiring that d_ij ≤ 1.

Figure 1: Plot of Mahalanobis distances

This measure can be further extended to consider more than two observations at a time. For example, if each observation z_i, i = 1, …, n has the same variance-covariance matrix Λ, then a consensus can be obtained only if:

(1/2n²) ∑_{i=1}^{n} ∑_{j=1}^{n} d_ij ≤ 1    (7)
It is clear that a set of observations that satisfy Equation 6 pair-wise will also satisfy Equation 7.
In most real situations, it is unlikely that we will know the variance-covariance matrices exactly. In this case, any estimates of the Λ_i act as if they were thresholds, in the sense that the larger the Λ_i that is used, the more disagreement will be tolerated.
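The consensus test of Equations 5 and 6 is computationally very light. The sketch below is our scalar illustration, taking the Jacobians as identity so that E_i = 1/λ_i; the values are hypothetical.

```python
# Scalar sketch of Equations 5 and 6: d_ij is the generalized Mahalanobis
# disagreement measure, and a Bayesian consensus exists iff d_ij <= 1.

def fused_estimate(x_i, e_i, x_j, e_j):
    """Information-weighted combination of two interpreted observations."""
    return (e_i * x_i + e_j * x_j) / (e_i + e_j)

def mahalanobis_disagreement(x_i, e_i, x_j, e_j):
    """Scalar Equation 6: 0.5 * (x_i - x_j)^2 * E_i * E_j / (E_i + E_j)."""
    return 0.5 * (x_i - x_j) ** 2 * e_i * e_j / (e_i + e_j)

def consensus(x_i, e_i, x_j, e_j):
    """Return the consensus estimate, or None when the members disagree."""
    if mahalanobis_disagreement(x_i, e_i, x_j, e_j) <= 1.0:
        return fused_estimate(x_i, e_i, x_j, e_j)
    return None

theta = consensus(0.0, 1.0, 1.0, 1.0)   # d_ij = 0.25: consensus exists
reject = consensus(0.0, 1.0, 4.0, 1.0)  # d_ij = 4.0: disagreement
```

Inflating the variances (shrinking E_i) lowers d_ij, which is exactly the thresholding effect noted in the paragraph above.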
3 Simulation Studies
To this point, we have discussed the theoretical aspects of estimation in the team framework. Our goal is to eventually pose problems of multi-sensor control and coordination and solve them in a similar manner. However, finding and analyzing solutions to decision, control, or game problems, especially in the face of anything less than perfect information, can be extremely difficult. From a technical perspective, solutions under even relatively simple losses are complex optimization problems. Other heuristic or ad hoc approaches must often be considered. Methodologically, there is a question as to what the proper loss functions are for different problems. Ideally, the loss function should reflect the actual state of affairs under consideration since it reflects the preferences of the decision maker. Whereas in the economics literature, losses are usually derived from utility considerations based on monetary rewards, we have a much wider set of competing criteria to consider. This complicates matters to the point that we need to gain intuition about the issues involved before hypothesizing a solution.
In order to deal with these issues, we have constructed a simulation on a Symbolics Lisp Machine. The simulation takes the form of a game of pursuit and evasion similar in character to the classic differential game known as the homicidal chauffeur [14]. The classical form of this game consists of a pursuer and evader moving at constant velocity in a plane. Both players have perfect information about the other's state, and attempt to use this information to intercept or evade their opponent respectively. The payoff structure of the game is the time until capture. The major changes we have made are that we have equipped the players with imperfect sensing devices (i.e. the players use imperfect state information), and we allow multiple pursuers and evaders grouped into teams coordinated by a team executive. It is important to note that the motivation for using the pursuit-evasion framework is primarily to provide each team with a well-defined method for comparing structures and control policies. The game is not of intrinsic value by itself, but forms a strict, flexible, closed system in which sensor models, organizational structures and decision methods may be implemented and easily evaluated.
The simulation is constructed so that we can vary the structure of team members, as well as overall team structure, and quickly evaluate the effects of the changes based on the character of the simulated game that ensues. We have in mind to allow variation in such factors as dynamics, sensors, information integration policies, incentive structures, and uncertainty of information, and observe what types of policies lead to adequate performance in these circumstances. We expect to transport what we learn from the simulation to real-world problems of multi-sensor robot systems currently being developed in the GRASP laboratory [21]. We imagine a situation where this simulation provides an environment in which distributed expert coordination and control problems can be investigated before implementation, and conversely that applications of the sensor systems under development will suggest what directions, sensor models and dynamics would be most fruitful to explore in the simulation. The remainder of this section details the current structure of the simulation environment and outlines our initial experiences with it.
3.1 The Game Environment
The simulation takes place on a planar field possibly littered with obstacles. The basic execution cycle involves team members taking sensor readings, executives integrating information and offering incentives, and finally team members making decisions. The state variables are updated as the game moves to a new step. A game terminates when and if the pursuit robots, which are equipped with a simple ballistics system, capture all the evaders. This is a medium level of granularity with emphasis on the general behavior of teams, not the precise performance issues of team members. Some time-constraint issues can be investigated by including time parameters in the payoff functions, but computational complexity issues and investigations of asynchronous behavior are outside the scope of our considerations. For instance, if some decision policy is computationally more complex than another, differences in performance will not reflect that complexity.
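The sense/integrate/decide/update cycle above can be sketched as a toy game loop. All class and method names below are ours (the actual simulation is a Symbolics Lisp Machine program); the executive's incentive role is elided, and sensing is perfect in this toy version.

```python
# Toy skeleton of the basic simulation cycle: members take sensor readings,
# make decisions, the state advances, and the game ends when every evader
# is captured. One-dimensional positions for brevity.

class ToyWorld:
    def __init__(self, evaders):
        self.evaders = evaders                  # remaining evader positions

    def all_captured(self):
        return not self.evaders

class ToyPursuer:
    def __init__(self, pos):
        self.pos = pos

    def sense(self, world):
        # Perfect sensing here; the real simulation wraps noise models
        # around the sensor readings.
        return min(world.evaders, default=None)

    def decide(self, reading):
        if reading is not None:
            self.pos += 1 if reading > self.pos else -1   # step toward target

def game_step(members, world):
    readings = [m.sense(world) for m in members]   # sensor readings
    for m, z in zip(members, readings):            # local decisions
        m.decide(z)
    world.evaders = [e for e in world.evaders      # capture on contact
                     if all(abs(m.pos - e) > 0 for m in members)]

def run_game(members, world, max_steps=100):
    steps = 0
    while not world.all_captured() and steps < max_steps:
        game_step(members, world)
        steps += 1
    return steps

steps = run_game([ToyPursuer(0)], ToyWorld([3]))
```

A single pursuer starting at 0 captures an evader at 3 in three steps; the point of the skeleton is the cycle structure, not the dynamics.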
3.2 The Structure of Team Members
The character of individual team members is determined by three modules:
1. the kinematics and dynamics of motion on the plane,
2. what sensors are available, the noise characteristics of those sensors, and their kinematics and dynamics, and
3. the ballistics which determine the termination of the game.
The team members are constant velocity, variable direction units operating in a plane with state variables x, y, and θ. Since the robots move with constant velocity, the only directly controlled variable is θ. The only dynamical consideration involved is how we allow the robot to change its current heading to some new desired heading θ_d - the single control. Currently, we assume that when reorienting, each agent can turn with some fixed (possibly infinite) angular velocity, w. This has the effect of defining a minimal turning radius. Pursuers generally have some finite w, while evaders have infinite w - i.e. they turn instantaneously.
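The rate-limited reorientation just described can be sketched in a few lines. This is our illustration, not the simulation's code: the single control is the desired heading θ_d, and the turn toward it is clamped by the rate w.

```python
# Sketch of rate-limited heading control: turn from theta toward theta_d by
# at most w radians per unit time; finite w induces a minimum turning radius.

import math

def update_heading(theta, theta_d, w, dt):
    """One step of heading control with turn rate limit w."""
    # Signed shortest angular error in (-pi, pi].
    err = math.atan2(math.sin(theta_d - theta), math.cos(theta_d - theta))
    step = max(-w * dt, min(w * dt, err))   # clamp the turn to the rate limit
    return theta + step

# A pursuer with finite w turns gradually; an evader with w = infinity
# turns instantaneously, reaching theta_d in a single step.
pursuer = update_heading(0.0, math.pi / 2, w=0.5, dt=1.0)
evader = update_heading(0.0, math.pi / 2, w=math.inf, dt=1.0)
```

With w = 0.5 the pursuer turns only 0.5 rad toward the quarter-turn target, while the evader snaps directly to it.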
The sensor model we are currently using is a range and direction sensor. The sensor has a limited cone of data gathering, and a limited range. It has a single control variable α with which the robot can select where to point the sensor. We assume that sensors typically return noisy data, so we have different noise models which we "wrap around" the sensor to make it more closely approximate real data gathering devices. This induces decision problems in dealing with both the noise and range limitations of devices. The fact that the sensors are distributed introduces issues in integrating noisy observations from different frames of reference [6]. Finally, since sensors are transported by the robots, there are issues involved in resolving the conflict between actions for pursuit or evasion, and actions which will allow more efficient gathering of information.
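A sensor model of the kind described above can be sketched as follows. This is our own construction with hypothetical parameter values: a limited field-of-view cone, a maximum range, and Gaussian noise "wrapped around" the ideal range and bearing reading.

```python
# Illustrative range-and-direction sensor: a limited data-gathering cone, a
# limited range, and Gaussian noise wrapped around the ideal reading.

import math
import random

def sense(sensor_pos, pointing, target_pos, fov=math.pi / 4, max_range=10.0,
          range_sigma=0.1, bearing_sigma=0.02, rng=random):
    """Return a noisy (range, bearing) reading, or None if out of the cone."""
    dx = target_pos[0] - sensor_pos[0]
    dy = target_pos[1] - sensor_pos[1]
    true_range = math.hypot(dx, dy)
    true_bearing = math.atan2(dy, dx)
    # Signed offset of the target bearing from the pointing direction.
    off = math.atan2(math.sin(true_bearing - pointing),
                     math.cos(true_bearing - pointing))
    if true_range > max_range or abs(off) > fov / 2:
        return None                               # outside the data cone
    return (true_range + rng.gauss(0.0, range_sigma),
            true_bearing + rng.gauss(0.0, bearing_sigma))

reading = sense((0.0, 0.0), 0.0, (5.0, 0.0), rng=random.Random(0))  # in cone
miss = sense((0.0, 0.0), 0.0, (0.0, 5.0))       # 90 degrees off-axis: None
```

The pointing angle here plays the role of the control variable α: pointing the cone determines which targets produce readings at all.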
Terination of the game occurs when all evaders are
eliminatd. We define a captue region which delineates
how close a pursuer must come to eliminat an evader.
However, when information is noisy, te aea in which the
evader ca be located will have a associatd uncertainty.
We sometimes equip each pursuer wit some mehaism t
"shoot" evaders, allowing the possibility of uncertainty in
obseation t make it "mss". Pa of the payoff strcture
of the game can include csts for using projectiles and miss
ing; thereby adding incentive t localize the evader t te
best degre possible.
3.3 Information Structures, Organization, and Control
The interesting issues are how the robot systems are controlled, and how team members interact. Each team member must
104
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I

Figure 2: The Pursuit-Evasion Game
make decisions, based on available information, about the best settings of its control variables. Thus, each team member has a local information structure and a utility function or decision rule as outlined in Section 2. The type of information available - its completeness and fidelity - along with the utility function determine how an agent behaves for a fixed set of controls. We have specifically modularized information structures and decision processes so that variations are easily compared.
The information structure can consist of only local information, or can contain information communicated from other team members, as well as computed information based on observation history. The team executive provides the basic organizational structure of the team. It is equipped with the team information structure, and computes the team utility which is offered as the incentive for a team member to cooperate with the team. The issues here involve integrating the team information, and the exact nature of the team utility. In our experiments, we either use the executive as an information integrator and team blackboard, or restrict robots to their own local information.
Our experiments to date have been in controlling the direction of travel of robots. Three decision methods have been used:
1. purely local information and decisions,
2. completely global information and a linear pool team utility (see Section 2.2), and
3. a mix of global and local information with a non-convex utility structure.
The first of these is the obvious strategy of chasing whatever is within the viewing radius and avoiding any obstacles. The second amounts to the executive choosing an evader to follow, and the team members agreeing at all costs. This has the undesirable property of the pursuers being destroyed while attempting purely global objectives. The first method disregards centralized control, possibly missing an opportunity for team members to cooperate for common benefit. The latter disregards individual concerns for global objectives, possibly disregarding important local objectives.
The final method uses the executive to integrate information about evaders, and to offer a team incentive to chase a particular evader. But it also lets team members compute an incentive to avoid obstacles which fall in their path. Figure 2 shows a team configuration where some team members (those labeled "P-D") are disagreeing with the team in order to avoid an obstacle, while the rest of the team (labeled "P-A") are following the executive's order to chase an evader. This is our first experience with a mix of local and global control.
Our next objective in the simulation is to consider noisy observations and develop sensor control algorithms. Our idea for this project is the following: recall that sensors return distance and direction. We henceforth assume both quantities are distributed according to some probability distribution. The information structure of the team will consist of the current best estimate and the information matrix of measurements, integrated as in Section 2. The utility for an angle α_i of a sensor i will be the expected change in the information for the closest evader within the cone of vision. This means that individual members will choose that evader for which they can contribute the "maximum" information. We have not developed any team policies for this scenario yet.
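The proposed utility can be sketched in scalar form. Everything below is a simplification of our own: information is a single number per evader rather than a matrix, measurements are independent and Gaussian, and the cone test uses a fixed half-angle.

```python
import math

def info_gain(current_info, meas_info):
    """Expected entropy reduction (nats) of a scalar Gaussian estimate
    with Fisher information current_info after fusing an independent
    measurement carrying information meas_info."""
    return 0.5 * math.log(1.0 + meas_info / current_info)

def best_angle(sensor_pos, candidate_angles, evaders, info, meas_info,
               half_cone=math.pi / 6):
    """Pick the pointing angle whose cone of vision contains the evader
    to which this sensor can contribute the most information, i.e. the
    nearest visible evader about which the least is currently known."""
    def gain(angle):
        visible = []
        for e in evaders:
            bearing = math.atan2(e[1] - sensor_pos[1], e[0] - sensor_pos[0])
            if abs(math.remainder(bearing - angle, math.tau)) <= half_cone:
                visible.append(e)
        if not visible:
            return 0.0
        target = min(visible, key=lambda e: math.dist(sensor_pos, e))
        return info_gain(info[target], meas_info)
    return max(candidate_angles, key=gain)
```

With two evaders of unequal prior information, the sensor turns toward the one it knows least about, since that cone offers the larger expected gain.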
4 Evaluation and Speculation
We have considered a very basic, static, non-recursive team structure for the sensors and cues of a robot system. The results obtained for the aggregation of agent opinions are intuitively appealing and computationally very simple. Similarly, the initial simulation experiments with distributed control seem promising. However, it is clearly the case that the methods presented thus far could easily be developed without recourse to the concepts of team and information structure. We have chosen to introduce these ideas for two main reasons: firstly, as a device through which the interactions between sensor agents may be easily explained, and secondly, because we feel that team theoretic methodology has great potential for understanding and implementing more complex organizational structures in a systematic manner. Our main point is that team theory is neither a completely abstract non-computational formulation of the problem, nor a computational technique or algorithm with no theoretical potential, but is in fact our analog of a computational theory[15]. We assert that the inherent elements of cooperation and uncertainty make team theory the appropriate tool for this class of problems[11]. This section discusses the advantages of team theory and suggests issues which need to be explored more fully.
4.1 Information and Structure

Many of the advantages of team theoretic descriptions lie in the ability to analyze the effect of different agent (team member) information structures on the overall system capabilities. Recall that in the general case, the i-th team member's information structure may well depend on the other team members' actions and information, either as a preference ordering (non-recursive) or in the form of a dialogue (recursive) structure. For example, consider a stereo camera and a tactile sensor, acting together in a team. It is often the case that the stereo matching algorithm is incapable of finding disparities and three dimensional locations that are horizontal in the view plane, whereas a tactile sensor mounted on a gripper jaw is best at finding just such horizontal features. In addition, it is reasonable to assume that, while a vision system is good at finding global locations, a touch sensor is better for refining observations and resolving local ambiguities. What is required is a sharing of information between these two sensors: their respective information structures should be made dependent on each other's actions. We can imagine specifying the problem so that the solution is anything from a simple optimal control to an extended dialogue taking place between the two sensors, resolving each other's observations and actions, arriving at a consensus decision about the environment. This example clearly shows the advantages of a team theoretic analysis. We can postulate alternative information structures for the sensors, and the dynamics of the exchange of opinions can be analyzed: Is a consensus obtained? When is a decision made? Should communication bandwidth be increased or decreased?

Another aspect of this scenario is that the two sensors have partitioned the environment into a kind of "who knows what" information structure. In general, not all the information about a robotics system is relevant to the construction of specific portions of the system. Analogously, not all the information available via sensors is relevant to the performance of all parts of the system. In the example above, the spatial characteristics of the camera image are of interest only to the camera agent, and the response characteristics of the tactile sensor are of relevance only to the tactile controller. The ambient illumination as measured by the camera has no relevance to decisions made by the tactile sensor, even though the two may cooperate in disambiguating edges. Team theory allows information and control to reside where appropriate, and thereby reduces problem complexity and increases performance potential. We believe that this is a crucial principle for the construction of intelligent robotics systems.

To this point, we have not discussed uncertainty of information. However, information from perceptual sources is sure to have some associated uncertainty. Uncertainty adds an entirely new dimension to any discussion of information: we must consider some grade of belief in information[4] and how that belief should influence the choice of action. In the case of perfect sensing, information is either adequate or inadequate, and new information can be derived by using the constraints of the problem at hand. Hence, new facts will either lead to more information or be redundant. On the other hand, if information is uncertain, adding more unrelated observations may not really increase the available information. Multiple correlated observations may, in fact, be a better strategy, since that is likely to reduce uncertainty. Encoding considerations such as this presents no problem in team theory, as information structures are perfectly capable of modeling uncertain information sources. The hard questions that arise are how to structure the pooling of information between sensors with dependent information, how to take action in the face of uncertainty, and what control methods are most appropriate for directing the gathering of information. We are currently exploring these issues.

4.2 Loss Considerations and Control

The loss function associated with an agent or a team determines the essential nature of the decision maker. In the standard optimal control formulation, the specification of information structures and loss provides the criteria for selecting the optimal control law or decision rule. However, optimal rules are often difficult to derive, and have a computationally complex nature. General results are known only for a restricted class of information structure and loss function formulations. Another method for selecting controls is to postulate a class of admissible controls, and choose the member of this class which minimizes the loss. Lastly, we can consider constructing decision rules ad hoc and evaluating their performance relative to an objective based on simulation studies. In any case, the character of the loss function is crucial in determining the resultant decision rule or control law.

One area which needs more exploration is a methodology for the specification of loss functions. Ideally, the loss function should be justifiable in terms of objective criteria related to the problem. Pragmatically, it is often dictated by mathematical convenience. From the team perspective, more work needs to be done on the interaction of team and local loss characterizations. Section 2 presented some results in this direction, but more work is surely needed, particularly in the case where the team objectives are not expressible as some combination of members' objectives.
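The distinction between team objectives that are and are not expressible as combinations of member objectives can be seen in a toy example of our own: a weighted sum of member losses versus a worst-off-member loss, which no fixed weighting reproduces.

```python
def linear_team_loss(member_losses, weights):
    """Team loss as a convex combination of member losses."""
    return sum(w * l for w, l in zip(weights, member_losses))

def worst_case_team_loss(member_losses):
    """A team objective not expressible as a fixed combination of
    member objectives: the loss of the worst-off member."""
    return max(member_losses)

# Two joint actions, each giving losses (member 1, member 2):
a = (0.0, 0.8)   # member 1 fully satisfied, member 2 badly off
b = (0.4, 0.5)   # both moderately off
```

With equal weights, the linear team loss prefers `a` (0.4 versus 0.45), while the worst-case objective prefers `b` (0.5 versus 0.8); the two characterizations lead the team to different decisions.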
To illustrate what we have in mind, consider formulating loss functions for controlling a system based on a desired state of information. That is, if the team has as its goal some state of information (for example, the information needed to move an arm), what action is most appropriate for progressing from the current information state toward the desired information state? Should it select an action which will change the uncertainty associated with current information, or go ahead with an action that adds uncorrelated evidence? How should it decide that it has enough information? More concretely, should the executive take another picture with the camera, or perhaps take a different view, or maybe use another sensor altogether? Maybe the sensors themselves should decide individually what to do. These are all issues dealing with the interaction of information and action. By using team theory, we can easily formulate the problem, specify loss functions or decision methods based on, for example, the parameters of a probability distribution associated with some information source, and examine the results via simulation or by analytic methods.
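One way to make the "enough information?" question operational, under assumptions of our own (scalar Gaussian estimates, independent measurements, additive costs), is to score each candidate sensing action by its expected entropy reduction net of cost:

```python
import math

def gaussian_entropy(var):
    """Differential entropy (nats) of a 1-D Gaussian with variance var."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

def fused_var(var, meas_var):
    """Posterior variance after fusing one independent Gaussian measurement."""
    return 1.0 / (1.0 / var + 1.0 / meas_var)

def choose_action(current_var, actions, enough_var=None):
    """actions maps a name to (measurement_variance, cost). Returns the
    action maximizing entropy reduction minus cost, or 'stop' once the
    current uncertainty is already below the desired level."""
    if enough_var is not None and current_var <= enough_var:
        return "stop"
    def net_value(item):
        meas_var, cost = item[1]
        gain = gaussian_entropy(current_var) - \
               gaussian_entropy(fused_var(current_var, meas_var))
        return gain - cost
    return max(actions.items(), key=net_value)[0]
```

Here "take another picture", "take a different view", and "use another sensor" are simply hypothetical entries in `actions` with different noise levels and costs; the executive or the individual sensors could each run the same rule over their own action sets.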
4.3 Decision Theory and AI

As we stated at the outset, we consider our work relevant to AI in that we may want to consider information as interpreted, and would like to consider parts of a system as intelligent reasoning agents. In related work dealing with the interaction of intelligent agents, Rosenschein and Genesereth in [22] and Ginsburg in [9] have investigated variations on game theoretical definitions of rationality for coordinating intelligent agents. However, these results are an attempt to analyze the interaction of intelligent agents with no a priori structure and to investigate the consequences of various rationality assumptions. We, on the other hand, postulate a given team structure and are interested in discovering its properties. This is an important fundamental distinction to keep in mind.
It is our view that knowledge-based reasoning agents can be used effectively in the team theoretic framework; but we must be able to describe them in terms of the other system elements, that is, as decision makers with information structures and preferences about actions. In order to achieve this objective, we must develop information structures compatible with AI conceptions of information as discrete (usually logical) tokens, and somehow connect control structures and loss formulations. At this point, we can sketch at least one possibility. First, view such reasoning agents as consisting of two phases: computing the information structure, and selecting an optimal action in the face of available information. This is similar to the classic separation of estimation and control in the control theory literature[3]. Computation of the information structure amounts to using furnished information and making implicit information explicit relative to a given model[23]. That is, some part of the information in the knowledge base is used to infer new facts from given information. The complete set of such facts forms (in a limiting sense) the information structure. Some of the theoretical analyses of (logical) knowledge have detailed methods for describing this process of inference using variants of modal logic[23].
Loss formulations for the preference of actions can be specified using a conception of action similar to the situation calculus[17,19]. In this system, an action is a mapping between world states, where each state represents a configuration of the world. Moore[19] has shown how both information and action can be represented and related within the conceptual framework of world states, making loss formulations based on information possible. The actual details of this procedure are beyond the scope of this paper, but we can show that several problems in the planning domain can, in fact, be reduced to decision problems posed in this manner. As a further example, consider building a decision-maker who attempts to fill in gaps in an incomplete, discrete knowledge base. The specification of information and loss functions can be done in terms of world states as presented above, and the actual implementation of the system done as a rule-based system.
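The gap-filling decision maker can be caricatured in a few lines. This toy encoding is ours, not Moore's formalism: world states are sets of ground facts, actions are mappings between states, and the loss simply counts the goal facts still unknown.

```python
# World states are sets of ground facts; an action maps a state to the
# state holding after execution. The best action fills the most gaps.

def apply_action(state, adds):
    """Result state: the prior facts plus those the action establishes."""
    return state | adds

def loss(state, goal_facts):
    """Number of goal facts still missing from the knowledge base."""
    return len(goal_facts - state)

def best_action(state, actions, goal_facts):
    """actions: name -> set of facts that action would establish.
    Choose the action whose successor state has minimum loss."""
    return min(actions,
               key=lambda a: loss(apply_action(state, actions[a]), goal_facts))
```

The fact names below are, of course, purely hypothetical; a rule-based system would supply the `actions` table by regressing its inference rules.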
Finally, we may attempt to combine this agent with agents which attempt to reduce uncertainty in the probabilistic sense outlined in the previous subsection: for instance, a camera and a tactile sensor which have local probabilistic uncertainty reduction methods, and a global executive which is building models of the environment. Using team theory, we can analyze possible methods for control and cooperation of these disparate agents and offer a coherent explanation of the full system's behavior.
5 Conclusions and Future Research

Analysis of the general team organization with respect to team members' information structures provides a systematic framework for addressing a number of important questions concerning the effect of sensor agent capabilities on overall system performance. We summarize some of the more important issues:

1. Could a sensor benefit from guidance by another team member? Should communication between members be increased?

2. Should a sensor's ability to make observations be enhanced in any way, by changing hardware or finding algorithmic bottlenecks?

3. When should an exchange of opinions and dynamic consensus be attempted?

4. What overall system structure (as described by the information structures of the team members) is best (or better) for different tasks?
Similarly, there are a number of important questions that can be addressed by analyzing the effect of individual team members' utility and decision functions, including:

1. Communication and time costs in the decision process to provide for real time action.

2. Inclusion of decisions to take new observations of the environment if previous opinions are rejected by other team members, or if insufficient information was obtained on a first pass.

3. Effect of new decision heuristics on overall system performance.

Of course, all these ideas may well be difficult to consider analytically, though this formalism does reduce the search space of alternatives and provides a framework within which these issues may be evaluated. The team theoretic organization is a powerful method for analyzing multi-agent systems, but it is certainly not the complete answer.
Acknowledgment: The authors would like to thank Dr. Max Mintz for many valuable discussions about this subject.
References
[1] M. Bacharach. Group decisions in the face of differences of opinion. Management Science, 22:182, 1975.

[2] J. O. Berger. Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, New York, 1985.

[3] D. P. Bertsekas. Dynamic Programming and Stochastic Control. Volume 125 of Mathematics in Science and Engineering, Academic Press, Inc., New York, first edition, 1976.

[4] P. Cheeseman. In defense of probability. In Proceedings of IJCAI-85, pages 1002-1007, Los Angeles, August 1985.

[5] M. DeGroot. Optimal Statistical Decisions. McGraw-Hill, 1970.

[6] H. Durrant-Whyte. Concerning uncertain geometry in robotics. Presented at The International Workshop on Geometric Reasoning, Oxford, U.K., July 1986.

[7] H. Durrant-Whyte. Integration and Coordination of Multisensor Robot Systems. PhD thesis, University of Pennsylvania, Philadelphia, PA, August 1986.

[8] M. Fox. An organizational view of distributed systems. IEEE Transactions on Systems, Man, and Cybernetics, 11(1):70-80, January 1981.

[9] M. Ginsburg. Decision procedures. In Proc. Workshop on Distributed Problem Solving, page 42, 1985.

[10] P. Harsanyi. Cooperation in Games of Social Systems. Cambridge University Press, Cambridge, 1975.

[11] Y. Ho. Team decision theory and information structures. Proceedings of the IEEE, 68(6):644-654, June 1980.

[12] Y. Ho, I. Blau, and T. Basar. A Tale of Four Information Structures. Technical Report 657, Harvard University, 1974. Part 1.

[13] Y. Ho and K. Chu. Team decision theory and information structures in optimal control. IEEE Trans. on Automatic Control, 17:15, 1972.

[14] R. Isaacs. Differential Games. Robert E. Krieger Publishing Co., Huntington, New York, 1975.

[15] D. Marr. Vision. Freeman, San Francisco, 1982.

[16] J. Marschak and R. Radner. Economic Theory of Teams. Yale University Press, New Haven, 1972.

[17] J. McCarthy and P. Hayes. Some philosophical problems from the standpoint of artificial intelligence. In B. Meltzer and D. Michie, editors, Machine Intelligence, Edinburgh University Press, Edinburgh, 1969.

[18] C. McGuire. Comparisons of information structures. In C. McGuire and R. Radner, editors, Decision and Organization, chapter 5, pages 101-130, North-Holland Publishing Co., Amsterdam, 1972.

[19] R. C. Moore. Knowledge and Action. Technical Report 191, SRI International, Menlo Park, October 1980.

[20] J. Nash. The bargaining problem. Econometrica, 18:155-162, 1950.

[21] R. Paul, H. Durrant-Whyte, and M. Mintz. A robust, distributed multi-sensor robot control system. In Proc. Third Int. Symposium of Robotics Research, Gouvieux, France, 1985.

[22] J. S. Rosenschein and M. R. Genesereth. Deals among rational agents. In Proceedings of the Ninth IJCAI, pages 91-99, Los Angeles, August 1985.

[23] S. Rosenschein. Formal Theories of Knowledge in AI and Robotics. Technical Report 362, SRI International, Menlo Park, CA, September 1985.

[24] L. Savage. The Foundations of Statistics. John Wiley, 1954.

[25] H. Simon. Theories of bounded rationality. In C. McGuire and R. Radner, editors, Decision and Organization, chapter 8, pages 161-176, North-Holland Publishing Co., Amsterdam, 1972.

[26] J. Von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, Princeton, NJ, 1944.

[27] S. Weerahandi and J. Zidek. Elements of multi-Bayesian decision theory. The Annals of Statistics, 11:1032, 1983.

[28] S. Weerahandi and J. Zidek. Multi-Bayesian statistical decision theory. J. Royal Statistical Society, 144:85, 1981.