Vous êtes sur la page 1sur 6

On Maintaining Connectivity of a Colony of Autonomous Explorer Mobile Robots

Jonathan Matas Palma Olate


Escuela de Ingeniera de Ejecucion en Electronica
Universidad del Bo-Bo
Concepcion, Chile
Email: jmpalma@alumnos.ubiobio.cl

AbstractIn the context of multi-robot exploring, we work


on the particular problem of maintaining communication
links between an explorer robots and a base station using
autonomous robot routers. In this paper, we propose and
describe a kind of exploring application which considers a
variable amount of robots over time. This problem opens
various research challenges. Afterwards, we focus on the
problem of maintaining a single chain of router robots
between an explorer and the base station. We propose an
simple algorithm for driving router robots. The algorithm
generates reactions using a heuristic that considers network
changes perceived by RSSI measures. Actions performed
by router robots depend on achieving goals consisting in
algebraic relationships between current and past RSSI values. Simulation results show effectiveness of the proposal
allowing to drive router robots to good positions, in a freeof-obstacles environment, considering the proposed heuristic
and the adopted simulation model.
Keywords-Multi-robot applications;
maintenance; RSSI; simulation.

localization;

link

I. I NTRODUCTION
Exploration of unknown, hostile or difficult (even impossible) to access places is a challenging task for robotics.
Applications include exploration over disaster areas, tactical military missions, and planetary exploration. In many
cases, the use of robots on exploration applications is
limited to the operation of one or various independent
units, however, since some years ago, the concept of
using colonies of multiple collaborating robots to perform
exploration tasks has been developed [1].
Whether to remotely control an exploring robot or to
communicate findings or any kind of data, many robotic
applications require to maintain a wireless communication
link between a robotic unit and a gateway node. Depending
on the application, this gateway can be of different nature
(e.g., a portable device, a base station, etc.), enabling
remote monitoring/control, data storage, and other functionalities. Noticeably, in these cases the exploring robot
coverage is constrained by the signal range so, when the
task involves larger regions, routing units installation is
required.
An alternative to the installation of routing units is
the use of a network of mobile routing robots that autonomously seek the best position to forward data packets
between explorer nodes and the gateway, maintaining a
certain link quality [2].
The task of maintaining links in a multi-robot scenario
is a complex task, specially when the use of global

Cristian Duran-Faundez
Departamento de Ingeniera Electrica y Electronica
Universidad del Bo-Bo
Concepcion, Chile
Email: crduran@ubiobio.cl

positioning technologies (such as GPS) or other kind of


external infrastructure is impossible. Recently, various algorithms using signal properties such as the RSS (received
signal strength) have been reported, to allow finding and
following of moving agents by autonomous robots; a task
referred in the bibliography as tethering. Tethering is the
robot task of following a mobile agent (human, robot, etc.),
with all the different required capabilities to it, in order to
provide network connectivity [3].
Normally, multi-robot tethering considers a fixed number of robotic units working in an unknown environment. In this paper, we present and address new kind of
exploring problems where the amount of routing nodes
is variable in time. The contributions of this paper are
twofold: Firstly, we propose and describe a multi-robot
exploring application which needs of dynamic topologies of routing robot colonies, discussing its involved
scientific challenges. Secondly, we address the particular
sub-problem of maintaining a single communication link
between an explorer robot and a gateway base station, by
proposing and evaluating a RSS-based tethering algorithm
for enabling self-adaptive driving of intermediate nodes.
The remainder of this paper is organized as follows: In
Section II, we present some works related to the problem of maintaining communication links in colonies of
multiple robots. In Section III, we introduce and describe
a kind of multi-robot application that can be adopted
to perform communication link maintenance in exploring applications. A proposal for addressing the problem
of maintaining a single communication link between an
explorer robot and a gateway base station is presented in
Section IV. Simulation results are commented in Section
V. Finally, Section VI concludes and give some future
directions of this work.
II. R ELATED WORKS
Multiple works about the problem of finding the position of wireless devices in indoor and outdoor environments are reported in the literature. Different kind of
techniques are adopted, using some signal propagation
property such as time or angle of arrival (ToA or AoA),
or received signal strength indicator (RSSI) measures [4],
[5], [6]. Many works allow localization of mobile wireless
nodes, but with the need of a static infrastructure of
known-position nodes (beacons). Such a infrastructure is

not possible in communication link maintenance applications with mobile router robots, moreover, all robots
should be considered to move over time so this kind of
algorithms are not well suited because there are no fixed
reference points (except, may be, the gateway node).
The literature reports a special kind of problem involving colonies of multiple robots allowing detection and
following of target nodes to provide network connectivity.
Such issue is known as tethering. In [7], a doublelayer chain tethering algorithm is proposed. Authors argue
that single-layer chain tethering has drawbacks because
communication links can be lost if intermediate nodes
have problems. Given a quantity of deployed nodes, the
developed algorithm makes nodes to maintain a particular
topological configuration even in presence of obstacles.
A RSSI-based metric is proposed as a quality link metric.
The algorithm is simulated in Matlab, showing how nodes
initially deployed in random positions always tend to
maintain the reference double-layer chain configuration.
Real world issues of localization and tethering are addressed in [3]. Authors design and implement a tethering
algorithm allowing a LANdroid robot to follow another
LANdroid, using a RSS-based distance model. Finally,
in [2], a RSSI-based multi-robot tethering algorithm is
proposed. This algorithm is based on cluster geometries,
which are local topological configurations that robots can
recognize and use to orient themselves and diminish uncertainty. This work identify three type of nodes: Gateways,
Targets and LANdroids. Our work is similar in the sense
that our proposed applications involves similar types of
nodes, and we use RSSI as a link quality indicator.
However, the problem we address introduce the problem
of using different number of router nodes over time.
III. A

MULTI - ROBOT EXPLORING APPLICATION

The reference multi-robot exploring application we propose includes the following kind of nodes:
1) A base station, which has two general functionalities: First, it is the networks gateway node, so
it is the point where application data come to and
go from the exploring robot colony. Second, it is
a place where robots are stored, maintained, and/or
energetically recharged. For the stated problem, this
is considered to be a non-constrained node (i.e. a
place with unlimited resources: energy, etc.).
2) A set of explorer robots, which are sent to make
some exploration mission in an unknown environment. Explorer robots start moving away from the
base station, and they must periodically (or eventually) exchange application data from it. For the
stated problem, these are target nodes that wander on
the environment in an unpredictable (autonomous),
although not random, way [2].
3) A set of router robots, which are sent to get positions allowing them to forward data packets between
the explorer robots and the base station, maintaining
certain quality links. For the stated problem, these
mobile nodes are driven by a control algorithm that

allows them: (i) to decide when to activate and leave


the base station, (ii) to decide where and how to get
an optimal (or pseudo-optimal) position to forward
data packets, and (iii) to decide when to go back to
the base station for maintenance, electrical recharge,
and/or deactivate (among others).
The problem starts when explorer robots leave the base
station and move away following some non-determined
path. Explorer robots can eventually get closer or farther,
or keep a certain distance to the base station, or stop
a certain time. When an explorer robot signal quality
decreases beyond a certain level, a router robot, stored
at the base station, is activated and automatically moves
to the best possible place to forward data packets between
the explorer and the base station. Eventually, the distance
between the nodes (or obstacles) will produce similar
decreases of the communication link quality so additional
router robots can be activated, forming a multi-robot
communication chain. The main goal of this application
is to make router robots to maintain communication links
between the explorer robots and the base station, maintaining a certain link quality. This scenario entails a set of
sub-problems, including:
1) Definition of a link quality metric. Which is related to define and adopt suitable metrics as quality
references. Current works mainly adopt the RSSI,
which can be combined with other simple metrics
such as the PLR (packet loss rate), LQI (link quality
indicator), etc.
2) Drive router robots to optimal positions to forward data packets. Which includes methods to find
optimal (or pseudo-optimal) positions and to get
those positions minimizing resources and considering topology dynamics and geographic issues.
3) Minimizing the amount of robot routers maintaining
global performance. Which includes methods to
order router robots to activate and for part of the
routing network, and methods to tell a node when
to go back to the base station in order to economize
energy and get recharged. Minimizing the amount of
router nodes involves the effect that a router robot
can be used to provide connection to more than
one explorer robot. The fact that, in some cases, an
explorer robots can also be used as a router robot
should be considered.
In particular, this goal could require predictive behavior instead of reactive. As an example, Figure
1 illustrates an example of the problem of sending
redundant router robots to the base station, which
may cause loss of some communication links.
4) Addressing robustness. Which involves methods to
deal with communication errors and incidentals such
as robot failures. A sample solution was described
in Section II, by applying multiple communication
chains.
5) Sharing tasks. To provide ways to make robots share
some tasks, e.g., to make a router robot take other
routers job allowing the second one to go to the

(a)

(b)

(c)

(d)

(e)

(f)

Figure 1: Example of the problem of sending redundant routing robots to the base station. In (a), two independent
router chains maintain communication links to explorer robots E1 and E2. In (b), E2 approaches E1. In (c), E2 is in
the proximities of router R1, so it can share the same communication chain, and routers R3 and R4 can go back to the
base station. In (a), In (d), E2 moves away from R1. In (e), E2 moves further away, so nodes R3 and R4 are sent to
maintain the communication link. In (f), E2 losses connectivity because router nodes are still far from it.

base station for recharge or maintenance.


IV. A LGORITHM

FOR MAINTAINING A SINGLE


COMMUNICATION CHAIN

As said, the particularity of control algorithms for


router robots is networks dynamics, which changes over
time due to nodes movements. Goal coordinate PO for a
robotic router is unknown, but it is characterized by having
RSSI values from following and previous nodes (in the
communication chain) equal and maximum. The operating
principle of the proposed heuristic is based on the control
machine learning strategy and cluster topology proposed
in [3], and the reinforcement learning control strategy in
[2]. In the second work, different actions (movements) are
performed according to a policy which seeks to minimize,
using RSSI values, the distance between a robotic router
and an optimal point.
A. Previous experiments
Raw RSSI values do not provide useful information,
so algebraic relations between RSSI values for current
iteration k and previous iteration (k 1) are designed.
These relationships are useful for evaluating the benefit
of a taken action.
Our first implemented controllers seek to maximize the
sum of RSSI values and minimize their difference. These
models considered that PO is a coordinate where RSSI
values are equal, with zero difference, and maximal. The
seek for maximizing the sum of RSSI is due to the fact
that the function of sum of RSSI has a local maximum
in the straight line formed by the coordinates of zero
difference. This approach does not achieve to approach
a router robot to the neighborhood of PO. This is because
mobile networks dynamics can make correct actions
be considered as erroneous and the opposite. Another
detected factor is that on the seek for an orientation,
router units can move to areas with too low RSSI values,
impeding location. We executed many experiments where
two main problems are evidenced (details in [8]): First,

the difficulty for deciding whether a performed action is


favorable or not, by using only RSSI measures. The second
one is the execution of unnecessary movements during
orientation phases. Considering these two problems, we
developed the heuristic presented below.
B. Proposal
1) Orientation method: Let us define V S and V I as
the communication links between a router robot and the
neighboring node closest to the explorer and the base station, respectively, on the communication chain. We define
RSSIVS and RSSIVI as the values resulting from a low pass
filter taking as input a set of RSSI measures taken during
a router robot movement from V S and V I, respectively. In
the seek for a characterization of the benefit of an executed
action two indicators of orientation are adopted. Those
indicators are the absolute RSSI difference Di fa and the
digital RSSI difference Di fd whose equations are given
by:
Di fa

Di fd

|RSSIVS RSSIVI |
RSSIVS RSSIVI
Di fa

(1)
(2)

Di fa is relevant because PO is a coordinate where


the difference of RSSI from previous and next nodes
in the chain is zero. The sign of Di fd indicates which
neighboring node is closest.
With RSSI values and orientation indicators the following status indicators can be defined:
1) Decrease Di fa , which is calculated as:

1 if Di fa (k) Di fa (k 1) < 0
MInDi f =
0 otherwise
(3)
and returns 1 if Di fa decreases since previous iteration, and 0 if do not.

go back to the last position where the three goals where


accomplished. On the other hand, if SSk1 = SSk = S2 ,
by executing A9 action going, risks of having the node
trapped in an infinite loop of actions A9 can occur, being
the node confined to two coordinates. To avoid this,
when (SS2 SS2 ) occurs, a random action Ai is selected.
Finally, in cases where current status fulfills the three
goals, the action selection is performed by using a Qlearning-based vector Q(SS1 Ai ), where i = 1 . . . 8. We
adopted as learning coefficient = 0.5, and we assign a
reward +1 when pair State-Action is accomplished. StateAction vector coefficients are initialized as zero. In this
way, in execution time, more successful State-Action pairs
will have better values. Future actions are selected by
State-Action pair with bigger value.
The following algorithm allows to select the action
to perform at the iteration k, depending on the status
transition SSk1 SSk :
Require: RSSI captures
Require: Calculation of goals and categorizations for
states SS(k) and SS(k 1).
1: if SSk1 = SS1 and SSk = SS2 then
2:
return A9
3: end if
4: if SSk1 = SS2 and SSk = SS2 then
5:
return random Ai ;
i [1, 8]
6: end if
7: Ac = MaxAc InQ(Q, SS)
8: Q(SS, Ac ) (1 )Q(SS, Ac ) + [r(SS, Ac ) +

maxAA Q(s , A )]
s
9: return Ac
V. S IMULATION

RESULTS

In order to show the performance of the proposed


algorithm, we implemented a simulation environment in
Matlab software. For estimating RSSI values, classical
Log-Normal model with Gaussian shadowing is adopted.
Figure 2a shows three possible paths followed by a
router GG1 searching PO, in a network with an static
explorer. Colors vary from blue (iteration zero) to dark
red (last iteration). Figure 2b shows parameter Di fa for
100

3.5

Difa T1

3
GG1 T1
2.5

Dif T
RSSI (db)

GG1 T3

1.5

Difa T2

80

TG
GW
GG1 T2

2
Meters (M)

2) Decreases of RSSI values for a communication link


V X, can be calculated as:

1 if RSSIVX (k) RSSIVX (k 1) < 0
MinV X =
0 otherwise
(4)
3) Increase of RSSI values for a communication link
V X, which is defined as:

1 if RSSIVX (k) RSSIVX (k 1) > 0
MaxV X =
0 otherwise
(5)
From our experiments, we have determined that the
policy which makes to converge a router robot GGi to
a suitable position consists on the achievement of the
following three goals:
1) MINDi f = 1 (which is determined by equation 3).
2) MaxV L = 1 (which is determined by equation 4).
3) MinVC = 1 (which is determined by equation 5).
where communication links V L and VC are the farthest
and the closest of the two neighbors to the involved router
robot in the communication chain, and can be determined
with indicator Di fd .
In a network where neighbors are static, actions allowing to achieve just one of the previous goals is enough
to move GGi towards a suitable coordinate, but when the
network is composed of more than a router robot, this is
not accomplished.
2) Algorithm: A router robot GGi must perform actions
that decrease its distance to PO, avoiding (as possible) to
perform actions that move it away from this goal. The
action to perform is decided by the state values. Without
lost of generality, we define height possible actions for a
router robot: A1: up, A2: down, A3: right, A4: left, A5:
up-right, A6: down-right, A7: down-left, and A8: up-left.
States determine the three defined binary goals MINDi f ,
MaxV L and MinVC , having 23 different situations. It is
not always easy to discern whether a combination of
these three states is favorable or not, hence we consider
a favorable action it the three goals are accomplished. As
it is preferable to avoid actions moving GGi away from
PO, when an action not accomplishing the three goals is
performed, an action A9 forcing GGi going back to the
last coordinate where the three goals where accomplished
is performed.
The proposed algorithm is inspired in reinforcement
learning technique Q-learning. We define two states, SS1
and SS2, being these: SS1 the accomplishing of the three
goals MINDi f , MaxV L and MinVC , and SS2 the nonaccomplishment of at least one of them.
For the proposed model there are four transitions
(SSk1 SSk ), i.e. transitions of passing from an state
SSk1 in previous iteration k 1 to the current state
SSk . These transitions are: (SS1 SS2), (SS2 SS2 ),
(SS1 SS1), and (SS2 SS2).
The performed action is chosen in the following way:
If current state SSk does not fulfill the three goals (SSk =
SS2 ), two cases can arise. On the one hand, if SSk1 = SS1
(previous state was good) the action to perform is A9, i.e.,

1
0.5

60
40
20

0
0.5
0.5

0
0

0.5

1.5
2
Meters (M)

2.5

(a) Trajectories.

3.5

10

20
Meters (M)

30

40

(b) Di fa .

Figure 2: Execution example with one robotic router and


a static explorer.
each trajectory. At the beginning Di fa is equal for the
three trajectories and converges to near-zero values.
Seeking an orientation, at least the first action is random. First action accomplishing the goals is rewarded in
vector Q(s, a). In the example, as selected action is the

most rewarded in Q, once an action is selected as suitable,


the unique rewarded action in first steps is the same so we
observe straight movements. When zero-difference curve
is reached, the node stays confined.
Figure 3 illustrates an execution example of the driving
algorithm in a scenario with a mobile explorer T G and one
router robot GG. T G starts besides the base station GW
and performs a zigzag movement. When T G gets some
distance, RSSI decreases passing a certain threshold Umdb
so GG is activated and executes the algorithm following
T G. Figure 3a shows GGs followed trajectory. Figure 3b
shows RSSI values for links V S and V I. The algorithm
allows the convergence between RSSIVS and RSSIVS to
similar value, while GG follows T G.

model, which at very small distances throws RSSI values


exceeding the maximum transmission power values of
real-world electronic devices.
C. Results
Results exposed consider a total of 1000 simulations,
each consisting on 500 iterations of the algorithm, for
same initial conditions and movement of unit T G. Figure
4 corresponds to the expected movements at different
intervals. Each point corresponds to the center of mass
(mean location for each iteration). Figure 5 exposes the
expected Di fa values and the standard deviation, for the
three GG units.
1.5

RSSI Inf
RSSI Sup

0.8
0.6

20
RSSI (db)

Meters (M)

0.4
0.2
0
0.2
0.4

40

0.5
1

GG1

1.5

GG2

0
0.5
GG1

GG2

1.5

GG3

GG3
2
2.5

60

0.6

1
0.5

Meters (M)

GG
GW
TG

Meters (M)

0
1

1.5

1
0.5

GW
TG
0

4
Meters (M)

2.5

GW
TG
0

4
Meters (M)

0.8

3
4
Meters (M)

(a) Trajectories.

100

200
300
Iteration (k)

400

500

(b) Di fa .

Figure 3: Sample simulation with one robotic router and


a mobile explorer.

(b) k = 100

1.5

1.5

0.5

0.5

0
0.5
GG

GG2

1.5

2.5

A. Simulation description

B. Evaluation method
The proposed heuristic ensures convergence in an area
near PO, implying an error between the ideal RSSI and average RSSI of the area where GG converges. As explained
above, paths performed by GG units are not unique while
finding PO. Thus, as individual simulations could not be
representative of general behavior, we provide statistical
results in terms of mean and standard deviation for the
indicators of interest. Mean provides expected behavior
and standard deviation allows intervals of validity of the
expected value. RSSI values may present discontinuities
and far values of the standard deviation. These abrupt
values that may occur in some simulations are identified
when GG units are activated. This is because of the initial
values of the RSSI filters and/or because of the RSSI

4
Meters (M)

GG1

GG2
GG

GW
TG

GW
TG
0

0
0.5

1.5

GG3

2.5

(c) k = 200

4
Meters (M)

1.5

0.5

0.5

0
0.5
GG1

GG2

1.5

4
Meters (M)

GG1

GG

GW
TG
0

0
0.5

1.5

GG3

2
2.5

(d) k = 300

1.5

Meters (M)

For evaluation purposes, we adopt a chain consisting in


three router robots and one explorer. Such a scheme has
various interesting possible relationships for a router robot,
considering the proposed reference application. Those
relationships include: to be directly connected to the base
station and a GG robot, to be connected to two GG
robots, and to be connected to a GG and an explorer
robot. The simulations to perform considers the following
characteristics: First, all of the robotic nodes (explorer and
routers) start besides the base station. Second, the explorer
robot moves in a zigzag at a half of the router nodes
speed. Third, router units GGi are activated when the RSSI
of the previous node in the chain decreases until achieve
Umdb = 68db.

Meters (M)

(a) k = 5

Meters (M)

80

Meters (M)

GG3
GW
TG

2.5

(e) k = 400

4
Meters (M)

(f) k = 500

Figure 4: Expected trajectory at different iterations k.


To minimize Di fa , which implies to approach RSSI
values, may not imply the convergence in the surroundings
of PO. We define two kinds of ideal coordinates: Optimal
General Point (OGP) and Instantaneous Optimal Point
(IOP). Each of them determines a value where links
RSSI of neighboring nodes is maximum and equal. IOP
is particular for the iteration. IOP is the same for each
simulation due to the fact that the movement of T G is the
same in all cases and the base station is static. Let us define
the difference between RSSI obtained by a unit in PO and
current RSSI obtained by a unit as eR SSI. This indicator
is useful to evaluate the performance of the algorithm.
Figure 6 shows eR SSI expected value in the left graphics
and eR SSI standard deviation in right column.
VI. C ONCLUSION
This paper describes a kind of application for maintaining communication links between explorer robots and a
base station using a colony of autonomous router robots.

60

40

RSSI (db)

RSSI (db)

Difa GG1
40

20

30
20
10

DE Dif GG
a

200
400
Iteracion (k)

600

(a) Average Di fa , GG1 .

DE Dif GG
a

40

60

RSSI (db)

RSSI (db)

40
20

30
20
10

200
400
Iteracion (k)

600

(c) Average Di fa , GG2 .

200
400
Iteracion (k)

600

(d) Di fa standard deviation, GG2 .

80

50
Difa GG3

DE Difa GG3

40

60

RSSI (db)

RSSI (db)

600

50
Dif GG

40
20
0

200
400
Iteracion (k)

(b) Di fa standard deviation, GG1 .

80

30

ACKNOWLEDGMENT

20
10

200
400
Iteracion (k)

600

(e) Average Di fa , GG3 .

200
400
Iteracion (k)

600

(f) Di fa standard deviation, GG3 .

Figure 5: Difference and standard deviation.

60

30

10
5

100

200

300
400
Iteration (k)

500

600

200

300
400
Iteration (k)

500

600

RSSI (db)

40
30

DE Error RSSI VI
DE Error RSSI VS

25

50

100

200

300
400
Iteration (k)

500

600

(c) Average RSSI, GG2 .

100

200

300
400
Iteration (k)

500

600

(d) Standard deviation, GG2 .

60

30
Error RSSI VI
Error RSSI VS

50

30

20

10

10

100

200

300
400
Iteration (k)

500

(e) Average RSSI, GG3 .

600

[3] S. Zickler and M. Veloso, Rss-based relative localization


and tethering for moving robots in unknown environments,
in Robotics and Automation (ICRA), 2010 IEEE International Conference on, May 2010, pp. 54665471.
[4] H. Karl and A. Willig, Protocols and Architectures for
Wireless Sensor Networks. John Wiley & Sons, Ltd, 2005.

[6] Q. Mao and B. Fidan, Localization Algorithms and Strategies for Wireless Sensor Networks. Information Science
Reference, 2009.
[7] X. Chen and J. Tan, An adaptive mobile robots tethering
algorithm in constrained environments, in Intelligent Robots
and Systems, 2009. IROS 2009. IEEE/RSJ International
Conference on, Oct 2009, pp. 13771382.

15

20

DE Error RSSI VI
DE Error RSSI VS

25

RSSI (db)

40

[2] P. P. Reddy and M. M. Veloso, Rssi-based physical layout


classification and target tethering in mobile ad-hoc networks, in Intelligent Robots and Systems (IROS), 2011
IEEE/RSJ International Conference on, Sept 2011, pp.
23272332.

[5] A. Bensky, Wireless positioning technologies and applications, 2008, Ed. Artech House.

15

10

R EFERENCES

20

10

20

100

30
Error RSSI VI
Error RSSI VS

60

(b) Standard deviation, GG1 .

70

RSSI (db)

15

10

(a) Average RSSI, GG1 .

RSSI (db)

20

20

DE Error RSSI VI
DE Error RSSI VS

25

RSSI (db)

RSSI (db)

40

This work was supported by the Research Department


of the University of Bio-Bio (Project DIUBB 121910 2/R).

[1] I. Rekleitis, G. Dudek, and E. Milios, Multi-robot collaboration for robust exploration, in Robotics and Automation,
2000. Proceedings. ICRA 00. IEEE International Conference on, vol. 4, 2000, pp. 31643169 vol.4.

30
Error RSSI VI
Error RSSI VS

50

Different sub-problems are discussed. After, a RSSIbased algorithm for maintaining a single communication
chain between one explorer robot and the base station is
proposed. This algorithm is based in a simple heuristic
consisting on the evaluation of three goals: decreasing
absolute RSSI difference between obtained data from
previous and next neighboring nodes in the communication chain, the increase of the farthest neighbor RSSI,
and the decrease of the nearest neighbor RSSI. This
heuristic is used to select the next action to perform by
a robotic router, combining simple decisions with a Qlearning-based decision process. Simulation results over
1000 repetitions of a simulation scheme shows good
performance of the algorithm to make converge router
robots to near-optimal positions, considering the simulated
models restrictions.
Future works will focus on improving the guidance
algorithm, and dealing with all different sub-problems
described in Section III.

100

200

300
400
Iteration (k)

500

600

(f) Standard deviation, GG3 .

Figure 6: Average error and standard deviation.

[8] J. M. Palma Olate, Aseguramiento de enlaces inalambricos


de datos en una colonia de robots moviles autonomos,
Universidad del Bo-Bo, Concepcion, Chile, Memoria de
ttulo de Ingeniero de Ejecucion en Electronica, 2014.