Académique Documents
Professionnel Documents
Culture Documents
71 - 75
Olac Fuentes
University of Rochester
U.S.A.
e-mail: fuentes@jsbach.cic.ipn.mx
e-mail: {rao,vanwie}@cs.rochester.edu
ABSTRACT
robot navigation.
INTRODUCTION
Dbe Fuentes, Rajesh P. N. Rao and Michael Van Wie: Hierarchical Leamlng of Reacltve Behaviors in en Autonomous Nlob'e Robot
2 TASK DESCRIPfION
The task to be learned by the robot (figure 1) is one of nav
igation and obstacle-avoidance. Specifically, we expect the
robot to learn appropriatesensorimotor strategies for navigat
ing between two points in an obstacle-ridden environment.
Three classes of sensory input are available to the robot:
Figure 1: The robot used for the experiments.
Bump Sensors: Realized using digital microswitches,
Obstacle Delection
:
:
t
Obslade Avoidan<e
r
r
Bellcon Followlng
Seo8Ol}' Inpul
Actuators
Olac Fuentes, Rajesh P. N. Roo and Michael Van WIe: Hierarchical Leaming of Redcfive 8ehaviOlS in an Autonomous Mobile Robot
PERCEPTUAL GOALS
n(p) = n(p) + 1
(f) if h(p)
Oloe Fuentes, Rojesh P. N. Roo ond Mchoel Van Wie: Hierarchical LeamingofReactive Sehaviors in an AufonomousNlobile Robot
120
110
100
..,e
"l!
..
"
&
~
90
;; 80
~O.3
.",
.2
'S
Sil
:3 0.2
70
&
f!
~ 60
50
0.1
40
O~----~------~------~------~----~~
500
1000
1500
2000
2500
Cycle
30
10
15
20
25
30
Trip number
35
40
45
50
0.3.--------,.--------,-------,-------,---------,
ure 4 plots collisions per robot control cycle versus time for tbe
lowest level behavior, where tbe robot can feel but not see ob
stacles; figure 5 plots collisions-per-c.yele versus time for tbe
second level behavoir, where tbe robot can botb feel and see
obstaeles; and figure 6 plots average time-per-trip against trip
number during tbe beacon-following behavior.
0.25
0.2
."
()
l'" 0.15
j
'IJ!
8 0.1
0.05
O~-----5~OO~----7.10700~----~1~50~O----~20700~--~2500
Cyc!e
EXPERIMENTAL RESULTS
74
ter a few hundred cycles, tbe robot ha~ learned the appropri
starting this time witb tbe final value from tbe first behavior,
the previous case, after a few hundred cycles tbe robot suc
cessfulIy learns a policy tbat results in significantl y fewer col
lisions. It should be noted tbat given the finite turning radius
of the robot and the cluttered environment, collisions cannot
be completely elimnated.
Pi
io
ta
C2
in
sil
o(
hi
lo
ac
as
Qloe Fuentes, Rojesh P. N. Roo ond Michael Van Wie: Hierarchical Leaming ofReactive Behaviors in an Autonomou$ Mobile Robot
REFERENCES
[1] Rodney A. Brooks. Arobust layered control system for a
ence,1990.
FLgure 7: The Obstacle AvoidancelBeacon following behav
ior of the robot after hierarchicallearning.
task ofhoming to the location ofthe goal beacon).
Figure 7 depicts the behavior toward the end of the learning
period, when all three behaviors are active.
CONCLUSIONS