
Journal of Network and Computer Applications 38 (2014) 185–201
Review
A survey on intelligent routing protocols in wireless sensor networks

Wenjing Guo, Wei Zhang*
Department of Computer Science and Technology, East China Normal University, 500 Dongchuan Road, Shanghai 200241, China
Article info
Article history:
Received 27 September 2012
Received in revised form 12 February 2013
Accepted 1 April 2013
Available online 10 April 2013
Abstract
This paper surveys intelligent routing protocols that contribute to the optimization of network lifetime
in wireless sensor networks (WSNs). Unlike other surveys on routing protocols for WSNs,
this paper first puts forward new ideas on the definition of network lifetime. Then, with a view to
prolonging network lifetime, it discusses routing protocols based on intelligent algorithms such as
reinforcement learning (RL), ant colony optimization (ACO), fuzzy logic (FL), genetic algorithm (GA),
and neural networks (NNs). Intelligent algorithms provide adaptive mechanisms that exhibit intelligent
behavior in complex and dynamic environments like WSNs. Inspired by this idea, several intelligent
routing protocols have recently been designed for WSNs. Under each category, the paper discusses the
representative routing algorithms and analyzes their performance with respect to network lifetime as defined
in three aspects. It aims to assist in optimizing network lifetime in
WSNs and to offer a guide for collaboration between WSNs and computational
intelligence (CI).
© 2013 Elsevier Ltd. All rights reserved.
Keywords:
Intelligent routing protocols
Reinforcement learning
Ant colony optimization
Fuzzy logic
Genetic algorithm
Neural networks
Contents
1. Introduction
1.1. Our contributions
1.2. Taxonomy of intelligent routing protocols in WSNs
2. Reinforcement learning based routing protocols
2.1. Q-Routing
2.1.1. Protocol definition
2.1.2. Functioning of the scheme
2.1.3. Results and performance analysis
2.1.4. Applications
2.2. AdaR
2.2.1. Protocol definition
2.2.2. Functioning of the scheme
2.2.3. Results and performance analysis
2.2.4. Applications
2.3. ATP
2.3.1. Protocol definition
2.3.2. Functioning of the scheme
2.3.3. Results and performance analysis
2.3.4. Applications
2.4. FROMS
2.4.1. Protocol definition
2.4.2. Functioning of the scheme
2.4.3. Results and performance analysis
2.4.4. Applications
2.5. QELAR
2.5.1. Protocol definition
2.5.2. Functioning of the scheme
2.5.3. Results and performance analysis
2.5.4. Applications
2.6. Summary
3. Ant colony optimization based routing protocols
3.1. BAR
3.1.1. Protocol definition
3.1.2. Functioning of the scheme
3.1.3. Results and performance analysis
3.1.4. Applications
3.2. SC-FF-FP
3.2.1. Protocol definition
3.2.2. Functioning of the scheme
3.2.3. Results and performance analysis
3.2.4. Applications
3.3. EEABR
3.3.1. Protocol definition
3.3.2. Functioning of the scheme
3.3.3. Results and performance analysis
3.3.4. Applications
3.4. ACORC
3.4.1. Protocol definition
3.4.2. Functioning of the scheme
3.4.3. Results and performance analysis
3.4.4. Applications
3.5. Summary
4. Fuzzy logic based routing protocols
4.1. FCH
4.1.1. Protocol definition
4.1.2. Functioning of the scheme
4.1.3. Results and performance analysis
4.1.4. Applications
4.2. FMO
4.2.1. Protocol definition
4.2.2. Functioning of the scheme
4.2.3. Results and performance analysis
4.2.4. Applications
4.3. Summary
5. Genetic algorithm based routing protocols
5.1. GA-Routing
5.1.1. Protocol definition
5.1.2. Functioning of the scheme
5.1.3. Results and performance analysis
5.1.4. Applications
5.2. GA-EECP
5.2.1. Protocol definition
5.2.2. Functioning of the scheme
5.2.3. Results and performance analysis
5.2.4. Applications
5.3. Summary
6. Neural networks based routing protocols
6.1. SIR
6.1.1. Protocol definition
6.1.2. Functioning of the scheme
6.1.3. Results and performance analysis
6.1.4. Applications
6.2. Summary
7. Analysis of network lifetime
7.1. Definition 1 of network lifetime
7.2. Definition 2 of network lifetime
7.3. Definition 3 of network lifetime
8. Conclusion
Acknowledgments
References

* Corresponding author. Tel.: +86 18918797512.
E-mail address: wzhang@cs.ecnu.edu.cn (W. Zhang).
1084-8045/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.jnca.2013.04.001
1. Introduction
A wireless sensor network (WSN) is a network comprising a
large number of wirelessly connected heterogeneous sensors
that are spatially distributed across a field of interest. WSNs have
been applied in many fields, such as military investigation, medical
treatment, environmental monitoring and industrial management.
However, WSNs differ from other networks in that sensor nodes
have a limited energy supply and constrained computation and communication abilities. Therefore, prolonging the network lifetime is an important and challenging issue, and it is the focus
of routing protocol design for WSNs.
A great many routing protocols have been specifically designed for
WSNs; they are commonly classified as data-centric, hierarchical and location-based. In
recent years, with the development of computational intelligence (CI),
routing protocols based on intelligent algorithms such as reinforcement learning (RL), ant colony optimization (ACO), fuzzy logic (FL),
genetic algorithm (GA), and neural networks (NNs) have been
proposed to improve the performance of WSNs. Intelligent algorithms
provide adaptive mechanisms that enable or facilitate intelligent
behavior in complex and changing environments, and they can be
used to design all-in-one distributed real-time algorithms. Such
algorithms have proved to work well under WSN-specific conditions such as communication failures, changing topologies and mobility.
Thus, some researchers use intelligent algorithms to address
the routing issue in WSNs. However, these algorithms have
different properties, and the choice among them should depend on the
specific application scenario. GA and NNs have very high processing
demands and are usually centralized solutions. They are slightly better
suited to clustering when the clustering schemes can be pre-deployed. FL is suitable for implementing routing and clustering
heuristics such as link or cluster-head quality classification. However, it
generates non-optimal solutions, and fuzzy rules need to be re-learnt
upon topology changes. ACO is very flexible, but generates a lot of
additional traffic because of the forward and backward ants. RL has
been shown to work very well for routing and can be implemented at
nearly no additional cost. It should be the first choice when looking
for a flexible and low-cost routing approach.
1.1. Our contributions
In this paper, we discuss some representative intelligence-based
routing algorithms. There have been several surveys (Karaki and
Kamal, 2004; Villalba et al., 2009; Singh et al., 2010; Baranidharan
and Shanti, 2010; Halawani and Khan, 2010; Celik et al., 2010; Saleem
et al., 2011; Zungeru et al., 2012) on routing protocols for WSNs.
The earlier ones focus on traditional routing protocols, and the later
ones on ACO based routing protocols. However, besides ACO, many
other intelligent algorithms such as RL, FL, GA and NNs have also been
used to optimize routing in WSNs. This paper intends to
present a comprehensive survey of intelligent routing protocols in
WSNs. The main contributions of this paper are as follows:
1. With a view to optimizing network lifetime in WSNs,
this paper selects some typical intelligent routing protocols
for discussion. It intends to provide new ideas and incentives for
WSNs, and at the same time to offer a guide for collaboration between WSNs and CI.
2. To evaluate the performance of these protocols comprehensively, it
puts forward new ideas on the definition of network
lifetime. Most routing algorithms for WSNs are proposed
to prolong the network lifetime, which is defined as the time
until the first sensor node is drained of its energy. However,
this time is not always the most relevant measure. Applications of
WSNs are concerned with whether the network can
provide an acceptable service, which may depend on the ratio of
living nodes, the connectivity to the base station, or the
efficiency of packet delivery. Thus, this paper takes the
demands of applications into account and defines network lifetime in three aspects:
Definition 1. The time until the first node is drained of its energy.
Definition 2. The time until the first living node has no path to
the base station.
Definition 3. The ratio of packet delivery when no data can be
transmitted to the base station.
The major measures for optimizing network lifetime in the
above aspects include decreasing and balancing energy consumption, building multiple paths, reducing latency and improving link
reliability. In the following sections, this paper evaluates routing algorithms for WSNs from this perspective.
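To make Definitions 1 and 2 concrete, the following is a minimal sketch of how they could be measured on a simulation trace. The trace format (one dictionary of residual energies per round), the undirected adjacency dictionary `links`, the function names, and the assumption that the base station never runs out of energy are all hypothetical choices made for illustration; they are not taken from any surveyed protocol.

```python
from collections import deque


def reachable_from_base(links, alive, base):
    """Return the base station plus every living node that can reach it
    through a chain of living nodes (breadth-first search)."""
    seen = {base}
    queue = deque([base])
    while queue:
        u = queue.popleft()
        for v in links.get(u, ()):
            if v in alive and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def lifetimes(links, energy_trace, base):
    """energy_trace[t][node] is the residual energy of a sensor at round t.

    Returns (t1, t2): the first round at which Definition 1 holds (some node
    is drained) and the first round at which Definition 2 holds (some living
    node has no path to the base station). A value is None if never reached.
    """
    t1 = t2 = None
    for t, energy in enumerate(energy_trace):
        alive = {n for n, e in energy.items() if e > 0}
        if t1 is None and len(alive) < len(energy):
            t1 = t
        connected = reachable_from_base(links, alive, base)
        if t2 is None and any(n not in connected for n in alive):
            t2 = t
    return t1, t2


# Hypothetical topology: node b can reach the base through a or through c.
links = {"base": ["a", "c"], "a": ["base", "b"],
         "b": ["a", "c"], "c": ["base", "b"]}
trace = [
    {"a": 2, "b": 2, "c": 2},
    {"a": 0, "b": 1, "c": 1},   # a is drained, b still routes through c
    {"a": 0, "b": 1, "c": 0},   # c is drained, b is now cut off
]
print(lifetimes(links, trace, "base"))
```

In this toy trace the two definitions give different lifetimes, which illustrates why a redundant path can keep the network useful after the first node has died.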
1.2. Taxonomy of intelligent routing protocols in WSNs
On the basis of the intelligent algorithms they use,
intelligence-based routing protocols in WSNs
can be classified into five categories: RL based routing protocols,
ACO based routing protocols, FL based routing protocols, GA based
routing protocols, and NNs based routing protocols. As shown
in Table 1, some representative routing protocols
are listed in each category for discussion and analysis.
The rest of the paper is organized as follows. Sections 2–6
discuss representative routing protocols based on the intelligent algorithms RL, ACO, FL, GA and NNs, respectively. Section 7 analyzes
the performance of these intelligent routing algorithms in terms of
the three definitions of network lifetime. Finally, Section 8 concludes
the paper and points out open research problems.
Table 1
Taxonomy of intelligent routing protocols in WSNs.
Category | Routing protocols
RL based | Q-learning based routing (Q-Routing); Adaptive routing (AdaR); Adaptive tree protocol (ATP); Feedback routing for optimizing multiple sinks (FROMS); Q-learning-based energy-efficient and lifetime-aware routing (QELAR)
ACO based | Basic ant routing (BAR); Sensor-driven cost-aware ant routing (SC); Flooded forward ant routing (FF); Flooded piggybacked ant routing (FP); Energy-efficient ant based routing (EEABR); Routing using ant colony optimization router chip (ACORC)
FL based | Cluster-head election using fuzzy logic (FCH); Fuzzy multi-objective routing (FMO)
GA based | Genetic algorithm based routing (GA-Routing); Genetic algorithm based energy-efficient clustering protocol (GA-EECP)
NNs based | Sensor intelligence routing (SIR)

2. Reinforcement learning based routing protocols

Reinforcement learning (RL) (Sutton and Barto, 1998; Kaelbling
et al., 1996) is a sub-area of machine learning, a field that
attempts to use computer programs to generate patterns or rules
from large data sets. RL deals with how an agent should take actions
in an environment so as to maximize the long-term reward. The agent
acquires its knowledge by actively exploring its environment
and then determines the next action according to that knowledge.
Since it does not know the best action beforehand, the agent has to
try many different actions and learn from its experience. In
a particular state, it selects one of the possible actions and receives a
reward from the environment.
A reinforcement learning task is described as a Markov decision
process (MDP) $(S, A, P, R)$, in which $S$ is the set of possible states, $A$
is the set of possible actions, $P$ denotes the state transition
probability, and $R$ indicates the environment's reward for the corresponding action. In addition, the policy $\pi_t: S \to A$ is the mapping from
states to actions. Such a policy defines how the learning agent
behaves at time step $t$. The function $V^{\pi}(s)$ defines the expected total
reward that can be received by the agent in state $s$ under the policy
$\pi$. The goal of solving an MDP is to find an optimal policy $\pi^{*}$ under
which the cumulative reward is maximized.
RL algorithms can be used to optimize network
performance. RL has medium requirements for memory and
computation at each node, because each node keeps a number of
possible actions and their reward values, and it needs some time to
converge. However, it is easy to implement and highly flexible
to topology changes. By using distributed learning, it is able to
achieve optimal results at nearly no additional cost. Therefore, RL
is well suited to distributed problems such as routing in
WSNs. For large-scale networks, however, the complexity of learning
should be considered carefully, since it increases exponentially with
the number of agents. In addition, the main difficulty of RL is the
fundamental tradeoff between exploration and exploitation.
Exploration seeks new knowledge, whereas exploitation
reuses the state-action pairs that have already earned
good rewards. The former can bring about a long-term improvement, which is conducive to converging to the optimum.
The latter can enhance performance in a short time,
but may converge to a non-optimal solution. The two
strategies should be balanced according to the requirements at hand. At present, it is more popular to seek the optimal solution
by exploration.
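One common way to balance the two strategies is the epsilon-greedy rule, shown below as a minimal sketch. It is a generic RL technique rather than something prescribed by the surveyed protocols; the function name, the `epsilon` parameter and the convention that a higher Q-value is better are assumptions made for this illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Choose an action from q_values (a dict mapping action -> Q-value).

    With probability epsilon a random action is tried (exploration);
    otherwise the action with the highest Q-value is taken (exploitation).
    """
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=q_values.get)


# Hypothetical Q-values of three candidate next hops.
q = {"n1": 0.2, "n2": 0.9, "n3": 0.5}
print(epsilon_greedy(q, epsilon=0.1))
```

A larger `epsilon` favors exploration and slower but more reliable convergence, while a smaller one favors short-term performance.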
2.1. Q-Routing
2.1.1. Protocol definition
Q-Routing, proposed in Boyan and Littman (1994), is one of the
earliest works on routing using machine learning techniques.
It learns the best paths by minimizing delivery time,
and assigns a Q-value to each neighbor of each node. This
Q-value is defined as the estimated time needed to deliver a packet
to the sink when the current node takes this particular neighbor as the
next hop.
2.1.2. Functioning of the scheme
The learning process of this protocol is as follows.
When a node y receives a packet from node x, it immediately
sends back a reward representing the minimum time taken to
forward the packet to the sink, which is computed in Eq. (1).
$$t = \min_{z \in \text{neighbors of } y} Q_y(d, z) \tag{1}$$
where $Q_y(d, z)$ is the estimated time spent on the packet delivery
for node y taking node z as the next hop to the sink (destination d).
Then, node x updates its estimate of the time remaining in the trip
according to
$$\Delta Q_x(d, y) = \eta \Big( \overbrace{q + s + t}^{\text{new estimate}} - \overbrace{Q_x(d, y)}^{\text{old estimate}} \Big) \tag{2}$$
where $\eta$ is the learning rate, $q$ indicates the units of time spent in the
queue of x, and $s$ represents the units of time in transmission
between node x and node y.
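The two steps of the learning process can be sketched in a few lines. This is only an illustration under stated assumptions: each node's table is stored as a nested dictionary `table[destination][neighbor]`, the update takes the form Q ← Q + η((q + s + t) − Q) as in Boyan and Littman's formulation, and the function names and numbers are hypothetical.

```python
def reward_from(q_y, dest):
    """Eq. (1): node y reports its smallest estimated delivery time to dest."""
    return min(q_y[dest].values())


def update(q_x, dest, y, q, s, t, eta=0.5):
    """Eq. (2): node x moves Q_x(dest, y) toward the new estimate q + s + t.

    q is the time spent in x's queue, s the transmission time from x to y,
    t the reward returned by y, and eta the learning rate.
    """
    old = q_x[dest][y]
    q_x[dest][y] = old + eta * ((q + s + t) - old)
    return q_x[dest][y]


# Hypothetical tables: y knows two routes to the sink, x currently
# believes that going through y takes 10 time units.
q_y = {"sink": {"z1": 4.0, "z2": 6.0}}
q_x = {"sink": {"y": 10.0}}

t = reward_from(q_y, "sink")
print(update(q_x, "sink", "y", q=1.0, s=1.0, t=t))
```

Because only the table entry for the neighbor that was actually used is touched, each update is local and cheap, which matches the fully distributed character of the protocol.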
2.1.3. Results and performance analysis
This protocol has been validated by experiments on
a 36-node, irregularly connected network. Simulations
show that Q-Routing is highly efficient under high network loads
and performs well under changing network topology.
Q-Routing is able to discover efficient routing policies in a
dynamically changing network without knowing the network
topology and traffic patterns in advance. In the process, it learns
the best paths and thereby increases the rate of packet delivery. Thus, this
protocol can indirectly optimize the network lifetime in terms of
the third definition. However, the protocol has only been tested in simulations, which are not fully realistic from the standpoint of
actual telecommunication networks. In addition, future work
should reduce the routing information stored at each
node to increase scalability.
2.1.4. Applications
Q-Routing was originally developed for wired, packet-switched networks. However, it can be easily applied to wireless
networks. Furthermore, its fully distributed nature is well suited to
the applications of WSNs.
2.2. AdaR
Adaptive Routing (AdaR), proposed in Wang and Wang (2006), is
a routing scheme based on a least squares RL technique for
WSNs. It applies Least Squares Policy Iteration (LSPI) to learn an
optimal routing strategy in terms of hop count, residual energy,
aggregation ratio and link reliability. LSPI is a model-free learning
algorithm. Unlike traditional RL algorithms such as Q-Routing,
which estimate the optimal action-value function directly, LSPI
approximates the Q-values $Q^{\pi}$ for a given policy $\pi$ with a parametric
function.
The value function is approximated as a linear weighted
combination of k basis features, as shown in Eq. (3). The k
basis features can be designed manually according to the concrete
requirements. In Eq. (3), $\phi_i(s, a)$ is the ith basis feature of the state-action
pair, and $w_i$ is the corresponding weight in the linear equation.
$$\hat{Q}^{\pi}(s, a; w) = \sum_{i=1}^{k} \phi_i(s, a)\, w_i = \phi(s, a)^{T} w \tag{3}$$
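Then, the weights w of the above linear function can be
extracted by solving the linear system in Eq. (4).
LSPI learns A and b by sampling from the environment. A sample
is a tuple $(s, a, s', r)$, where $s$, $a$, $s'$ and $r$ are the current state, action,
new state, and immediate reward, respectively. Given a set of L such
samples, the factors $\Phi$, $P\Pi\Phi$ and $R$ can be approximately constructed as in Eq. (5). Accordingly, the weights w can be acquired by
$$w = A^{-1} b \tag{4}$$
where
$$A = \Phi^{T} (\Phi - \gamma P\Pi\Phi), \qquad b = \Phi^{T} R,$$
$$\Phi = \begin{pmatrix} \phi(s_{d_1}, a_{d_1})^{T} \\ \vdots \\ \phi(s_{d_L}, a_{d_L})^{T} \end{pmatrix}, \quad P\Pi\Phi = \begin{pmatrix} \phi(s'_{d_1}, \pi(s'_{d_1}))^{T} \\ \vdots \\ \phi(s'_{d_L}, \pi(s'_{d_L}))^{T} \end{pmatrix}, \quad R = \begin{pmatrix} r_{d_1} \\ \vdots \\ r_{d_L} \end{pmatrix} \tag{5}$$
and $\gamma$ is the discount factor.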
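The weight computation can be illustrated with a small pure-Python sketch. It assumes the standard LSPI construction, in which A is the sum over samples of φ(s, a)(φ(s, a) − γφ(s', π(s')))ᵀ and b is the sum of φ(s, a)·r, with γ the discount factor. The helper names, the one-hot feature function and the two-state toy problem are hypothetical and are not taken from the AdaR paper.

```python
def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lstdq_weights(samples, phi, policy, gamma):
    """Build A and b from samples (s, a, s_next, r) and return w = A^-1 b."""
    k = len(phi(samples[0][0], samples[0][1]))
    a_mat = [[0.0] * k for _ in range(k)]
    b_vec = [0.0] * k
    for s, a, s_next, r in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))
        for i in range(k):
            b_vec[i] += f[i] * r
            for j in range(k):
                a_mat[i][j] += f[i] * (f[j] - gamma * f_next[j])
    return solve(a_mat, b_vec)


# Toy problem: state 0 moves to state 1, state 1 loops on itself,
# every step earns reward 1, and features are one-hot in the state.
def phi(s, a):
    return [1.0, 0.0] if s == 0 else [0.0, 1.0]


def policy(s):
    return "go"


samples = [(0, "go", 1, 1.0), (1, "go", 1, 1.0)]
w = lstdq_weights(samples, phi, policy, gamma=0.5)
print(w)
```

With a discount factor of 0.5 and a reward of 1 per step, the exact value of both states is 1 / (1 − 0.5) = 2, which the solved weights should reproduce for this toy case.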
AdaR samples from the environment and uses LSPI to update
the weights of the linear functions for the current policy. The
updated weights in turn improve the current policy. The specific
process is as follows:
(1) Upon receiving a packet, node $s$ forwards it to node $s'$ based on
the current Q-values, and appends the corresponding features
$\phi(s, a)$ (Eq. (6)) to the packet.
$$\phi(s, a) = \{d(s, a),\ e(s, a),\ c(s, a),\ l(s, a)\} \tag{6}$$
where $d(s, a)$ is the difference between the hop counts of $s$ and $s'$ to
the base station, $e(s, a)$ is the residual energy of $s'$, $c(s, a)$
represents the number of routing paths crossing at $s'$, and
$l(s, a)$ indicates the reliability of the link between $s$ and $s'$.
(2) Once the packet arrives at the base station, the information of
the whole routing path can be traced. Then, based on the
quality of the routing path, the immediate reward $r$ for each
tuple $(s, a, s')$ is calculated.
(3) When the base station has collected a certain number of samples
$(s, a, s', r)$, it invokes the LSPI procedure to estimate the new
weights of the linear functions, and disseminates the updated
w via a network-wide broadcast.
(4) The current policy is improved with the updated weights w.
Each node selects the action with the highest Q-value.
(5) This procedure repeats with the improved policy until a fixed-point
policy is reached, i.e., the weights of the policies in
successive iterations do not differ significantly.
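Steps (4) and (5) of this process can be sketched as follows. The sketch assumes that each candidate next hop is described by a feature tuple in the order (d, e, c, l) of Eq. (6) and that the Q-value is the linear combination of Eq. (3). The function names, the tolerance and all numbers are hypothetical illustration values, not values from the AdaR paper.

```python
def q_value(features, w):
    """Linear Q-value: the dot product of a feature vector and the weights."""
    return sum(f * wi for f, wi in zip(features, w))


def choose_next_hop(candidates, w):
    """Step (4): pick the neighbor whose features give the highest Q-value.

    candidates maps a neighbor id to its (d, e, c, l) feature tuple.
    """
    return max(candidates, key=lambda n: q_value(candidates[n], w))


def converged(w_old, w_new, tol=1e-3):
    """Step (5): stop when successive weight vectors barely differ."""
    return max(abs(a - b) for a, b in zip(w_old, w_new)) < tol


# Hypothetical weights broadcast by the base station and three neighbors.
w = (1.0, 0.5, 0.2, 1.0)
candidates = {
    "n1": (1, 0.2, 0, 0.90),
    "n2": (1, 0.8, 2, 0.70),
    "n3": (0, 0.9, 1, 0.95),
}
print(choose_next_hop(candidates, w))
print(converged((1.0, 0.5), (1.0004, 0.5002)))
```

The node only needs the broadcast weights and locally known features to act, while the expensive weight estimation stays at the base station.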

Experimental results have shown that this protocol achieves a
significant performance improvement over a basic Q-learning
implementation in terms of convergence speed and
sensitivity to the initial parameters.
AdaR takes multiple routing metrics into account when determining
the routing path, so it can make appropriate trade-offs between
multiple optimization goals to maximize network lifetime in terms
of the first and third definitions. Moreover, it uses the LSPI
procedure to learn the optimal routing strategy, which is both data
efficient and insensitive to the initial setting. However, AdaR also
has a number of drawbacks. Firstly, it is hard to implement and
maintain this protocol in a real-world scenario. Secondly, the
protocol scales poorly because the packet grows
at each hop. Thirdly, a lot of extra cost is incurred to
broadcast the new Q-values. Fourthly, a new round of learning
has to start in case of link failure or node mobility, which
wastes resources. Furthermore, AdaR is not a distributed protocol: packets are delivered from the sensors to a
centralized base station, which calculates the optimal policy offline.
2.2.4. Applications
AdaR is well suited to ad hoc sensor networks since it
is based on a least squares RL technique, which is both data
efficient and insensitive to the initial setting. In addition, this
protocol considers multiple routing metrics to learn efficient
routing strategies. Thus, it adapts well to applications that pursue multiple optimization goals.
2.3. ATP
Adaptive tree protocol (ATP), proposed in Zhang and Huang
(2006), suggests using an RL-based meta-routing strategy
for constraint-based routing. Based on this RL strategy, a
spanning tree is constructed at initialization and then automatically
maintained during the routing process. Without additional control
packets for tree maintenance, the adaptive spanning tree can
maintain the best connectivity to the base station in spite of node
failures or mobility of the base station. By using a general constraint-based routing specification, message-initiated constraint-based
routing (MCBR), the same strategy can be applied to achieve load
balancing and to control network congestion. MCBR is a framework of routing mechanisms composed of the explicit specification of constraint-based destinations, route constraints and QoS
requirements for messages, and a set of QoS-aware meta-strategies. It separates routing specifications from routing strategies, so that
general-purpose meta-routing strategies can be applied.
The core of this protocol is the RL-based meta-routing strategy.
It consists of three phases: the initialization phase, the forwarding phase,
and the confirmation phase. Learning happens in all phases. First of all,
given the routing specification of a message, a cost function is
defined. According to the cost function, each node is assigned a
Q-value indicating the minimum cost from this node to the
destination, and it also stores its neighbors' Q-values, called NQ-values.
For each node, the initial Q-value can be obtained during network
initialization if the sink node is known, or estimated when the
node receives a packet of that type for the first time. The NQ-values are estimated initially according to the neighbors' information and updated upon receiving packets from neighbors. When a
node forwards a packet to another one, it attaches its current Q-value for that type of message to the packet. When a node
overhears a packet of type m from node n with Q-value $Q(n)$,
whether it is the designated receiver or not, it updates the
corresponding NQ-value by Eq. (7). Further, it re-estimates its
own Q-value according to Eq. (8), where $\alpha$ is the learning rate, and
$O_m$ is the current value of the local objective function.
$$NQ_m(n) \leftarrow \alpha\, NQ_m(n) + (1 - \alpha)\, Q(n) \tag{7}$$
$$Q_m \leftarrow (1 - \alpha)\, Q_m + \alpha \big( O_m + \min_{n} NQ_m(n) \big) \tag{8}$$
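The two update rules can be sketched as below. The sketch assumes one particular reading of Eqs. (7) and (8): the overheard value is blended as NQ ← α·NQ + (1 − α)·Q(n), and the node's own value as Q ← (1 − α)·Q + α·(O + min NQ). That reading, the function names and the numbers are assumptions for illustration and should be checked against Zhang and Huang (2006).

```python
def update_nq(nq, n, q_n, alpha):
    """Eq. (7), as read here: blend the stored NQ-value of neighbor n
    with the Q-value q_n that was just overheard from n."""
    nq[n] = alpha * nq[n] + (1 - alpha) * q_n
    return nq[n]


def update_q(q, nq, o_m, alpha):
    """Eq. (8), as read here: re-estimate the node's own Q-value from the
    local objective o_m plus the best (smallest) neighbor NQ-value."""
    return (1 - alpha) * q + alpha * (o_m + min(nq.values()))


# Hypothetical state of one node for one message type m.
nq = {"a": 4.0, "b": 2.0}   # stored NQ-values of neighbors a and b
q = 5.0                     # the node's own Q-value

update_nq(nq, "a", q_n=2.0, alpha=0.25)
q = update_q(q, nq, o_m=1.0, alpha=0.25)
print(nq, q)
```

Both rules use only values carried in overheard packets, which is why no extra control traffic is needed for learning.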
ATP makes use of the above RL-based strategies to build an
adaptive spanning tree. In the initialization phase, if the sink node
is known, an initial spanning tree rooted at the sink is built. Each
node except the sink node has a pointer to its parent, which is the
neighbor with the smallest NQ-value. Such an initial spanning tree
may not be optimal, and connections may change from time to
time in the case of a mobile sink. In the forwarding phase, each
packet is forwarded once, and each node passes the received
packet to its parent. During this process, the structure of the tree
changes as the NQ-values change. In the confirmation
phase, the mechanism of implicit packet confirmation is applied. If
the packet is not heard from the forwarding node within a certain
time period, the NQ-value of that node is updated. Accordingly, the
relevant parent-child relationship is changed.
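The overhearing update and the parent selection described above can be illustrated with a minimal Python sketch. It assumes the update forms NQ ← αNQ + (1 − α)Q(n) for Eq. (7) and Q ← (1 − α)Q + α(O + min NQ) for Eq. (8); the class name `AtpNode`, its method names, and the rule for initializing an unknown neighbor are illustrative assumptions, not part of the protocol description.

```python
class AtpNode:
    """Per-node state for one message type (illustrative names only)."""

    def __init__(self, q_init, alpha=0.5):
        self.q = q_init      # own Q-value: estimated cost to the sink
        self.alpha = alpha   # learning rate
        self.nq = {}         # neighbor id -> NQ-value

    def overhear(self, neighbor, q_neighbor, local_cost):
        """Process a packet overheard from `neighbor` that carries Q(n)."""
        a = self.alpha
        # Assumption: an unknown neighbor starts from the value it reports.
        old = self.nq.get(neighbor, q_neighbor)
        # Assumed form of Eq. (7): blend stored NQ-value and reported Q(n).
        self.nq[neighbor] = a * old + (1 - a) * q_neighbor
        # Assumed form of Eq. (8): re-estimate own Q-value from the best neighbor.
        best = min(self.nq.values())
        self.q = (1 - a) * self.q + a * (local_cost + best)

    def parent(self):
        """Tree parent: the neighbor with the smallest NQ-value."""
        return min(self.nq, key=self.nq.get)


node = AtpNode(q_init=5.0, alpha=0.5)
node.overhear("a", q_neighbor=2.0, local_cost=1.0)
node.overhear("b", q_neighbor=1.0, local_cost=1.0)
print(node.parent(), node.q)
```

In this sketch the parent pointer is never stored; it is recomputed from the NQ-values, which mirrors the statement that the tree structure changes as the NQ-values change.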
This protocol was shown to be robust against unpredictable
link failures and mobile sinks by the experiments in Zhang and Huang
(2006). Compared to traditional routing trees, ATP considers
new routing metrics for energy-aware load balancing to increase
lifetime in terms of the first definition, and for congestion-aware
routing to reduce latency and increase reliability, which optimizes
lifetime in terms of the third definition. In addition, ATP can
adjust to a better structure during routing without extra control
packets, and it removes asymmetric or broken links automatically.
Furthermore, the simulation results reveal that this protocol has
much better connectivity and can enhance network lifetime in
terms of the second definition. For a particular application, the
parameters of the routing protocol can be tuned in accordance
with the demand. However, the selection of parameters and the
relationship between different parameters still leave considerable
room for research.
2.3.4. Applications
ATP is well suited to applications that demand high reliability
and to applications in which the sink node needs to move to
gather information, since this protocol is robust against unpredictable link
failures and mobile sinks. In addition, ATP can be applied to achieve
different routing objectives by using the same strategy.
2.4. FROMS
Feedback routing for optimizing multiple sinks (FROMS), proposed in Forster and Murphy (2007), is an energy-aware multicast
routing protocol based on the RL algorithm. Each node shares its local
information as feedback with its neighboring
nodes, and the RL algorithm is used to minimize the energy
dissipation while delivering packets to many sinks simultaneously.
The target is an optimal broadcast Steiner tree, in which a minimum
number of broadcasts is needed to deliver one packet from an
independent source to all sinks.
In this protocol, each node acts as an agent and learns the best hop costs
to any combination of sinks. An action is one possible routing
decision for a packet, and is defined as a set of sub-actions $\{a_1, \ldots, a_k\}$.
A sub-action $a_i = (n_i, D_i)$ indicates that neighbor $n_i$ is the
intended next hop for routing to destinations $D_i$. A complete action is
a set of sub-actions such that each sink is covered by exactly one sub-action.
FROMS works as follows.
(1) First, the routing information is gathered, and the initial
Q-values of actions are estimated.
Each sink broadcasts an announcement indicating its interest
in receiving a particular data type. In this way, the hop counts to
all known sinks are acquired. Then, the initial Q-values of sub-actions
are estimated based on the individual hop counts to
each sink, as shown in Eq. (9). This estimation
assumes that the packet will not share any links after the next
hop. Thus, it is an upper bound of the actual costs.
$$Q(a_i) = \left( \sum_{d \in D_i} hops_d^{n_i} \right) - 2\,(|D_i| - 1) \qquad (9)$$
The first part of the formula calculates the total number of
hops to individually reach the sinks using neighbor $n_i$, and the
second part is subtracted from this total because
broadcast communication is used.
The Q-value of a complete action a with sub-actions $\{a_1, \ldots, a_k\}$ is
$$Q(a) = \left( \sum_{i=1}^{k} Q(a_i) \right) - (k - 1) \qquad (10)$$
(2) When data begins to flow in the network, nodes working as
agents start to learn the real Q-values of the shared paths in
the network.
At each step, a forwarding node chooses an action to reach the
desired set of destinations according to its exploration
strategy. The node broadcasts the received packet, piggybacking
its evaluation of the goodness of the sub-action as a reward
to its neighbors. The reward is calculated as
$$R(a_i) = c_a + \min_a Q(a) \qquad (11)$$
where $c_a$ is the action's cost, which is always 1 in the hop count
metric.
Upon receiving a packet, the node extracts the reward and
updates its Q-value of the sub-action:
$$Q_{new}(a_i) = Q_{old}(a_i) + \alpha \left( R(a_i) - Q_{old}(a_i) \right) \qquad (12)$$
where $\alpha$ is the learning rate of the algorithm.
(3) After a finite number of steps, the Q-values no longer change.
That is to say, the learning protocol converges. Then, the
exploration strategy must be updated.
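The Q-value bookkeeping in steps (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration under the hop-count metric, assuming the forms of Eqs. (9) to (12) as read from the text; the function names and the sample hop counts are hypothetical.

```python
def subaction_q(hops_via_neighbor, sinks):
    """Eq. (9): initial Q-value of a sub-action (n_i, D_i)."""
    total_hops = sum(hops_via_neighbor[d] for d in sinks)
    return total_hops - 2 * (len(sinks) - 1)


def action_q(subaction_qs):
    """Eq. (10): Q-value of a complete action made of k sub-actions."""
    return sum(subaction_qs) - (len(subaction_qs) - 1)


def reward(action_cost, own_action_qs):
    """Eq. (11): action cost plus the sender's best (minimum) Q-value."""
    return action_cost + min(own_action_qs)


def update_q(q_old, r, alpha):
    """Eq. (12): move the stored Q-value toward the received reward."""
    return q_old + alpha * (r - q_old)


# Hypothetical hop counts to two sinks when routing via one neighbor.
hops_via_n1 = {"sinkA": 3, "sinkB": 4}
q_sub = subaction_q(hops_via_n1, ["sinkA", "sinkB"])
q_full = action_q([q_sub, 2])
r = reward(1, [q_full, 8])
q_updated = update_q(q_sub, r, alpha=0.5)
print(q_sub, q_full, r, q_updated)
```

The exploration strategy itself is left out on purpose, since the text does not fix one.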
Simulation results show that FROMS significantly decreases
routing cost even when the additional expense of learning is included.
Moreover, FROMS also performs well in the case of node failure and
sink mobility, which improves connectivity
and increases its applicability. Thus, it can optimize the network
lifetime in terms of the first and second definitions. However,
the exploration strategy, the initialization of Q-values, and
the eventual stopping of exploration need further
study before the protocol can be applied in different environments.
2.4.4. Applications
FROMS enables efficient routing to multiple sinks. Thus, it can
be applied to applications in which multiple, possibly
mobile users collect data from a monitored area. Moreover, FROMS
innately supports node failure and sink mobility, which further
increases the applicability of this protocol.
2.5. QELAR
QELAR, presented in Hu et al. (2010), is an adaptive energy-aware
distributed routing protocol based on the RL algorithm. By
making use of the RL technique, this protocol strives to learn the
environment effectively in order to better adapt to dynamic networks,
to reduce networking overhead for higher energy efficiency, and
to distribute energy consumption more evenly by considering the
residual energy of each node as well as the energy distribution
among a group of nodes in the reward function.
The details of QELAR are presented as follows.
(1) When a node receives or overhears a packet whose structure is
defined in Fig. 1, it retrieves the information of the previous
forwarder from the tuple and updates its own local copy.
(2) If the Next Forwarder field in the received packet indicates
that the current node is not the one to forward the packet, it
simply drops the packet.
(3) Otherwise, the eligible forwarder calculates the Q-values
associated with each of its neighbors by Eq. (13), and chooses
the one with the highest Q-value as the next forwarder. Then,
before forwarding the packet, the node puts information about
itself and the next forwarder in the relevant fields.
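$$Q_n(s_t, a_t) = r_t + \gamma \sum_{s_{t+1} \in S} P^{a_t}_{s_t s_{t+1}} \max_a Q_n(s_{t+1}, a) \qquad (13)$$
where $Q_n(s_t, a_t)$ denotes the expected reward that can be
received by taking an action $a_t$ at node $s_t$, $\gamma$ is the discount factor, and $r_t$ is the direct reward
received after taking an action from node $s_t$ at time t, which is
computed as
$$r_t = P^{a_m}_{s_n s_m} R^{a_m}_{s_n s_m} + P^{a_m}_{s_n s_n} R^{a_m}_{s_n s_n} \qquad (14)$$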
where P in the first part is the success rate of forwarding
packets along the link from node $s_n$ to node $s_m$, and in the
second part is the failure rate. In addition, R denotes the
reward function, which is shown in
$$R^{a_m}_{s_n s_m} = -g - \alpha_1 \left[ c(s_n) + c(s_m) \right] + \alpha_2 \left[ d(s_n) + d(s_m) \right] \qquad (15)$$
where g represents the constant cost for a node forwarding a
packet, $\alpha_1$ and $\alpha_2$ are weighting parameters, $c(s_n)$, defined in Eq. (16), is a cost function of the residual
energy of node n, and $d(s_n)$ is the reward of energy distribution
in a group of sensor nodes including the node holding a packet
and all its direct neighbors, which is calculated in Eq. (17).
It reveals that the larger the difference between the residual
energy of a node and its group average, the greater the advantage
for the node to be chosen as the next forwarder.
$$c(s_n) = 1 - \frac{E_{res}(s_n)}{E_{init}(s_n)} \qquad (16)$$
$$d(s_n) = \frac{2}{\pi} \arctan\left( E_{res}(s_n) - \bar{E}(s_n) \right) \qquad (17)$$
(4) At each step, once the next forwarder has been determined, the
maximum Q-value of the current node is updated, which
serves for later Q-value calculations.
(5) The procedure goes on until the packet arrives at the destination.
Fig. 1. Packet structure in QELAR.
During this period, QELAR adopts a mechanism to detect
transmission failure: after a node sends a packet, it keeps the packet
in memory rather than removing it from the buffer immediately.
If the next forwarder successfully receives the packet, it will
forward the packet further along the next hop, and this forwarded
packet, overheard by the previous forwarder, is taken as an acknowledgment.
If the transmission fails, the sender will not overhear the packet
held in its memory within a certain time, and retransmission is
triggered. If the number of retransmissions for one packet exceeds a
predefined constant, the forwarding attempt is regarded as a failure
and no more retransmissions occur. In this way, nodes can sense
failures and Q-values are updated accordingly, i.e., when a link
is disrupted, the failure rate of forwarding packets along the link is
increased, and the success rate is decreased.
QELAR has been evaluated on the Aqua-Sim platform with
various network configurations. Simulation results show that this
protocol can improve network performance in multiple aspects.
Compared to the routing protocol VBF, QELAR yields 20% longer
lifetime on average in terms of the first definition. Even in a
moderately sparse network, QELAR can achieve a high delivery rate
and energy efficiency, which greatly enhances network lifetime in
terms of the third definition.
2.5.4. Applications
QELAR was originally proposed for underwater wireless
sensor networks (UWSNs) to prolong network lifetime. In fact, it
can be applied to a variety of applications, since this protocol
makes few assumptions and is more realistic. It can be easily tuned
to trade latency or energy efficiency for lifetime.
2.6. Summary
The RL-based routing protocols mentioned above have
enhanced the network lifetime of WSNs. Q-Routing takes the
minimal delivery times into account to learn the best paths. Such
a strategy increases the rate of packet delivery. Thus, this protocol
can indirectly optimize the network lifetime in terms of the third
definition. AdaR considers hop count, residual energy, aggregated
ratio and link reliability to learn an optimal routing strategy.
Accordingly, this protocol balances energy consumption and
improves link reliability, which leads to an enhancement of network
lifetime in terms of the first and third definitions. ATP
considers new routing metrics for energy-aware load balancing
to increase lifetime in terms of the first definition, and for
congestion-aware routing to reduce latency and increase reliability,
which optimizes lifetime in terms of the third definition. In addition,
ATP is robust against unpredictable link failures and mobile sinks. Such
a protocol has much better connectivity and can enhance network
lifetime in terms of the second definition. FROMS significantly
decreases routing cost even when the additional expense of
learning is included. Moreover, FROMS also performs well in the case of node
failure and sink mobility, which improves
connectivity and increases its applicability. Thus, it can optimize
the network lifetime in terms of the first and second definitions.
QELAR has been shown to yield longer lifetime in terms of
the first definition. Even in a moderately sparse network, QELAR
can achieve a high delivery rate and energy efficiency, which greatly
enhances network lifetime in terms of the third definition.
However, there are still some problems to be solved.
Most of these protocols have been tested only by simulation, and these
simulations are not fully realistic from the standpoint of actual
telecommunication networks. In addition, each protocol has
its own limitations. Q-Routing should store less routing
information at each node to increase its scalability. AdaR is not
a distributed protocol and is hardly scalable because of the
growing size of the packet at each hop. For ATP, there is considerable
room for research on the selection of parameters to meet the
requirements of a particular application. In FROMS, the
exploration strategy, the initialization of Q-values, and the eventual
stopping of exploration need further study before the protocol can be
applied in different environments.
3. Ant colony optimization based routing protocols

The ant colony optimization (ACO) algorithm originates from
the actual behavior of ants, which communicate with each other through
a medium called pheromone. Pheromone is a volatile chemical
substance released by ants which in turn affects their moving
decisions (Wei, 2007). While walking, ants lay pheromone on the
ground, and they smell the current strength of pheromone to
guide themselves. At the beginning, no pheromone is laid on the
branches and the ants have no information about the
lengths of the branches. However, once a shorter branch is found, it will
receive pheromone at a higher rate than the longer one. Thus,
there will be positive feedback within the group of ants
(Ellabib et al., 2007). The more pheromone ants leave
on a path, the larger the probability that they visit this path next time.
It may seem that this method only finds a locally shortest path, but in fact
it approaches the globally shortest one. There is some probability that
ants deviate and follow another path rather than the one
regarded as the best at present. As ants go through the
paths between source and destination, such a deviation may
find a much shorter way, and then more and more ants will be
attracted there. Therefore, the solution approaches the global shortest path over time.
The self-organizing dynamics driven by local interactions among a
number of relatively simple individuals will lead to a global
optimization (Ye et al., 2001; Subramanian and Katz, 2000).
The ACO algorithm can be used to address combinatorial
optimization problems such as the asymmetric traveling salesman problem,
vehicle routing, WSN routing, and so on. In WSNs, ACO is a popular
approach to the routing problem. However, several challenges of ACO
should be considered. Firstly, there is a contradiction between
accelerating convergence and preventing prematurity
and stagnation. On the one hand, some researchers exploit a
learning mechanism to optimize the pheromone feedback and
speed up convergence, but this brings on prematurity and stagnation.
On the other hand, the change of pheromone can be restricted to a
fixed range. In this case, prematurity and stagnation can be
effectively prevented, but convergence is slowed.
Secondly, all the links have the same pheromone at first, so the ants
walk randomly with no hint. An early selection may be incorrect, and
later selections will then be misguided. Thirdly, in the process of
ACO, any mistake in the selection of a path or the updating of
pheromone will affect the final optimization.
3.1. BAR
Basic ant routing (BAR) (Camilo et al., 2006) is a simple
implementation of the ACO algorithm for the routing problem in WSNs.
It follows the basic idea of the ACO algorithm.
Referring to AntNet (Di Caro and Dorigo, 1998), the process of
BAR can be described as follows:
(1) At regular intervals, a forward ant is launched from each source node
towards the destination node.
(2) Each forward ant searches for the destination by selecting the
next hop node according to the link probability distribution
in Eq. (18). Initially, all the links have equal probability.
$$p_k(r,s) = \begin{cases} \dfrac{[T(r,s)]^{\alpha}\,[E(s)]^{\beta}}{\sum_{u \notin M_k} [T(r,u)]^{\alpha}\,[E(u)]^{\beta}} & \text{if } s \notin M_k \\ 0 & \text{otherwise} \end{cases} \qquad (18)$$
where $p_k(r,s)$ is the probability for ant k to choose node s as the
next hop of node r, $M_k$ is a memory carried by ant k which
stores the identity of each node visited by ant k, T is the
routing table at each node storing the amount of pheromone
trail on each corresponding link, E is the visibility function defined
in Eq. (19), and $\alpha$ and $\beta$ are the parameters that control the
relative importance of pheromone trail versus visibility.
$$E(s) = \frac{1}{C - e_s} \qquad (19)$$
where C is the initial energy level of nodes, and $e_s$ is the
current energy level of node s.
In general, each ant avoids traversing the same node twice, and for
a node that has not been visited, the probability of being
selected depends on two factors. The pheromone trail factor
reveals that it is highly desirable to use a link which has
carried a lot of traffic, and the visibility factor, which is the
reciprocal of the consumed energy, means that nodes with more
energy are chosen with higher probability.
(3) Once a forward ant reaches the destination node, a backward
ant which will backtrack to the source node is created.
In addition, the destination node computes the amount of
pheromone trail that the backward ant will drop during its
journey. Then, the backward ant moves back along the links
traversed by the forward ant and releases some
amount of pheromone on each link of that path according to
$$\Delta T_k = \frac{1}{N - Fd_k} \qquad (20)$$
where N is the total number of nodes in the network, and $Fd_k$
is the number of nodes traveled by the forward ant k.
(4) Upon receiving a backward ant from neighbor node s, node r
updates its routing table according to
$$T_k(r,s) = (1 - \rho)\,T_k(r,s) + \Delta T_k \qquad (21)$$
where $\rho$ is a coefficient, and $(1 - \rho)$ represents the evaporation
of pheromone trail.
(5) Once a backward ant reaches the source node, its mission is
finished.
(6) The above procedure is performed for several iterations, after which
each node knows the path along which to send a packet towards a specific
destination.
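The next-hop rule and the pheromone bookkeeping of the steps above can be sketched as follows. This is a minimal Python illustration, assuming the forms of Eqs. (18) to (21) as read from the text; the function names, the sample values, and the small constant `eps` (added here only to avoid division by zero for a node that still has its full initial energy) are assumptions of this sketch.

```python
def next_hop_probs(pheromone, energy, visited, initial_energy,
                   alpha=1.0, beta=1.0, eps=1e-9):
    """Eq. (18): probability of each unvisited neighbor being chosen.

    pheromone: neighbor -> T(r, s); energy: neighbor -> e_s.
    """
    weights = {}
    for s, trail in pheromone.items():
        if s in visited:
            continue  # nodes already in the ant's memory M_k get probability 0
        # Eq. (19): visibility is the reciprocal of the consumed energy.
        visibility = 1.0 / (initial_energy - energy[s] + eps)
        weights[s] = (trail ** alpha) * (visibility ** beta)
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()} if total > 0 else {}


def deposit_amount(num_nodes, forward_hops):
    """Eq. (20): pheromone dropped by the backward ant."""
    return 1.0 / (num_nodes - forward_hops)


def update_trail(trail, delta, rho):
    """Eq. (21): evaporate the old trail, then add the new deposit."""
    return (1 - rho) * trail + delta


probs = next_hop_probs(
    pheromone={"a": 1.0, "b": 1.0, "c": 1.0},
    energy={"a": 9.0, "b": 8.0, "c": 5.0},
    visited={"c"},
    initial_energy=10.0,
)
print(probs)
print(update_trail(1.0, deposit_amount(10, 4), rho=0.1))
```

With equal pheromone on all links, the neighbor that has consumed less energy receives the larger probability, which matches the role of the visibility factor described above.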

The BAR protocol chooses the routing path according to the
probability distribution, which can assist in the optimization of
the network lifetime in terms of the second definition. However,
there are some problems. Initially, the forward ants have no sense
of where the destination is: the energy levels of nodes and the pheromone
trails on links are all the same. Such a condition is equivalent to a random
walk in a maze with no hint. The probability distribution helps
the search for the destination, but it is not modified until the first
forward ant arrives at the destination and travels back. What is
worse, ants that have successfully reached the destination may
not be able to move back to the source due to asymmetric links,
which results in even slower updating. In addition, collisions and
node failures also lower the performance. Furthermore, in updating
the pheromone trail, the energy level of the path is not taken into
account, although it is a significant aspect in WSNs.
3.1.4. Applications
BAR is a basic ant-based routing protocol. Because of the
inherent advantage of the ACO algorithm, this protocol is suitable for
applications in which the sink node is the focus and ordinary
nodes must keep good connectivity to the sink.
3.2. SC-FF-FP
Sensor-driven cost-aware ant routing (SC), flooded forward ant
routing (FF), and flooded piggybacked ant routing (FP), proposed in
Zhang et al. (2004), are three ant-based routing algorithms for
WSNs. They address the initial pheromone settings to achieve a
good start-up.
In SC, ants are equipped with sensors to sense the best
direction to go even initially. Each node stores the probability
distribution of its links in a routing table, as in
BAR. In addition, it estimates and stores the cost to the destination
from each neighbor. The definition of cost may vary
depending on different requirements. For WSNs, the cost should be
energy-related.
In FF, forward ants are flooded to the destination. Once the
forward ants arrive at the destination, backward ants are created
to traverse back to the source. Through such a flooding phase, the
probability distributions on multiple paths are updated just as
in BAR. The flooding can be stopped if the probability distribution
is good enough for the data ants to reach the destination. When a
shorter path is traversed, the rate of releasing flooding ants is
decreased.
Like FF, FP also adopts the flooding mechanism to release
ants. However, unlike FF, it combines forward ants with data ants.
In SC, two ways can be used to obtain the cost estimation.
For each neighbor node n, the initial probability distribution is
calculated according to
$$p_n = \frac{e^{C - Q_n}}{\sum_{n \in N} e^{C - Q_n}} \qquad (22)$$
where N is the set of all neighbors of the current node, $Q_n$ is the
cost estimation for neighbor node n, and C is the cost from
the current node to the destination. If the current node is the
destination, C is 0; otherwise it is computed by
$$C = \min_{n \in N} (c_n + Q_n) \qquad (23)$$
where $c_n$ is the local cost function.
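The effect of the exponential weighting in SC can be illustrated with a short Python sketch. It assumes the normalized exponential form $p_n \propto e^{C - Q_n}$ for Eq. (22) and the minimum rule of Eq. (23); the function name and the sample costs are hypothetical.

```python
import math


def sc_initial_probs(cost_estimates, local_costs):
    """Initial link probabilities in SC for a non-destination node.

    cost_estimates: neighbor -> Q_n (estimated cost from n to the destination)
    local_costs:    neighbor -> c_n (local cost of using neighbor n)
    """
    # Eq. (23): cost from the current node to the destination.
    c = min(local_costs[n] + cost_estimates[n] for n in cost_estimates)
    # Assumed form of Eq. (22): exponential weights, then normalization.
    weights = {n: math.exp(c - q) for n, q in cost_estimates.items()}
    total = sum(weights.values())
    return c, {n: w / total for n, w in weights.items()}


c, probs = sc_initial_probs(
    cost_estimates={"a": 2.0, "b": 4.0},
    local_costs={"a": 1.0, "b": 1.0},
)
print(c, probs)
```

Under this form, a neighbor whose cost estimate is two units lower already receives most of the probability mass, in contrast to the uniform start of BAR.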

In FF, two strategies are used to control the forward flooding.
First, a restricted broadcast mechanism is used. A neighbor
node will broadcast a forward ant only if it is closer to the
destination than the node which broadcast it earlier.
Link probabilities are used for the estimation, i.e., a forward ant
is broadcast only if the condition in Eq. (24) is met.
If initially all links have the same probability, each node will
broadcast a forward ant once. Second, a mechanism of delayed
transmission is also used, in which a random delay is added to
each transmission. Upon hearing the same ant from other nodes, a
node will stop broadcasting.
$$p_n < \frac{1}{|N|} \qquad (24)$$
where n is the neighbor the ant is coming from, and N is the set of
neighbors.
In FP, the flooded data ants carrying the forward list are
controlled by the same strategy as in FF. The data ants not only pass
the data to the destination, but also remember the traversed paths,
along which the backward ants update the corresponding pheromone
trail. The probability distribution constrains the flooding towards
the destination for future data ants.
In SC, the exponential term in Eq. (22) makes the initial
probability distribution more differentiated. Accordingly, at the beginning,
the good links are chosen with a much higher probability, rather
than with the same probability as in BAR. Therefore, SC provides better
energy efficiency, which contributes to the optimization of network
lifetime in terms of the first definition. However, it is still not
very effective in terms of latency.
Compared to BAR and SC, FF has shorter delays, which can
indirectly optimize the network lifetime in terms of the last
definition. However, the success of this protocol depends significantly
on an appropriate frequency of flooding ants, since there
is a collision problem in FF. The flooded forward ants create a
significant amount of traffic, so that the data ants and the
backward ants suffer interference. Accordingly, it is important to
control the frequency of flooding forward ants.
FP provides higher success rates of data delivery, that is, it
further contributes to the enhancement of network lifetime in
terms of the third definition, although it consumes more energy
than SC and FF.
3.2.4. Applications
These three ant-based protocols focus on building the initial
pheromone distribution. SC provides better energy efficiency, FF
shortens the delay, and FP has higher success rates. Depending on
the requirements of the application, these protocols can be
used independently to build a good system start-up.
3.3. EEABR
Energy-efficient ant based routing (EEABR), presented in Camilo
et al. (2006), is an improved ant-based algorithm to maximize the
lifetime of WSNs. The algorithm strives to reduce the communication
load related to the ants and the energy spent on communications.
In addition, both the energy levels of sensor nodes and the
lengths of routed paths are taken into account to update the
pheromone trail.
In the basic ant-based algorithm, each ant carries a memory storing
all the visited nodes. In a network with a large number of
sensor nodes, the memory of the ants would thus be so big that it would be
infeasible to send ants through the network. To solve this
problem, EEABR stores only the last two visited nodes in the
memory of each ant. Each node keeps a record of each
received and sent ant in its own memory. Each memory record contains
the previous node, the forward node, the ant identification and a
timeout value. Upon receiving a forward ant, a node looks into its
memory. If no record of the received ant is found, the node
saves the required information, restarts a timer, and forwards the
ant to the next node. Otherwise, the ant is eliminated. Once a node
receives a backward ant, it searches its memory to find the next
node to which the ant must be sent.
On the other hand, to improve the performance in terms of
energy efficiency, EEABR modifies the rule for the amount of
pheromone released by backward ants according to Eq. (25).
In addition, the equation used to update the routing tables at each
node is changed to Eq. (26).
$$\Delta T_k = \frac{1}{C - \left[ \dfrac{EMin_k - Fd_k}{EAvg_k - Fd_k} \right]} \qquad (25)$$
where C and $Fd_k$ have the same meanings as in BAR, and $EAvg_k$
and $EMin_k$ respectively represent the average energy of the nodes that
have been visited by ant k and the minimum energy of these
nodes.
$$T_k(r,s) = (1 - \rho)\,T_k(r,s) + \frac{\Delta T_k}{Bd_k} \qquad (26)$$
where $(1 - \rho)$ is still the evaporation of pheromone trail, and $Bd_k$ is
the number of nodes visited by backward ant k until node r. The aim is to
build a better pheromone distribution, in which nodes near the
sink have higher pheromone levels, and to force remote nodes to
find better paths. Such a strategy is good for a network with a
mobile sink since the pheromone adaptation will be much quicker.
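The two modified rules can be sketched in Python as follows. This is a minimal illustration, assuming the forms of Eqs. (25) and (26) as read from the text, without any additional scaling coefficient; the function names and the numeric values are hypothetical.

```python
def eeabr_deposit(initial_energy, e_min, e_avg, forward_hops):
    """Eq. (25): pheromone amount carried by backward ant k.

    e_min / e_avg: minimum / average energy of the nodes visited by the ant.
    forward_hops:  Fd_k, the number of nodes visited by the forward ant.
    """
    ratio = (e_min - forward_hops) / (e_avg - forward_hops)
    return 1.0 / (initial_energy - ratio)


def eeabr_update(trail, delta, backward_hops, rho):
    """Eq. (26): evaporation plus a deposit scaled down by Bd_k."""
    return (1 - rho) * trail + delta / backward_hops


# A path whose weakest node has more energy earns a larger deposit.
weak_path = eeabr_deposit(100.0, e_min=40.0, e_avg=70.0, forward_hops=10)
strong_path = eeabr_deposit(100.0, e_min=55.0, e_avg=70.0, forward_hops=10)
print(weak_path, strong_path)

# Nodes close to the sink (small Bd_k) receive more pheromone.
print(eeabr_update(1.0, 0.6, backward_hops=1, rho=0.5),
      eeabr_update(1.0, 0.6, backward_hops=3, rho=0.5))
```

Dividing the deposit by $Bd_k$ is what concentrates pheromone near the sink in this sketch: the same backward ant leaves 0.6 one hop from the sink but only 0.2 three hops away.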
EEABR uses lightweight ants to find routing paths between the
sensor nodes and the sink nodes, which are optimized in terms of
distance and energy levels. Simulation results show that in
different WSN scenarios, EEABR minimizes the communication
load and maximizes the energy saving, which emphasizes the
enhancement of network lifetime in terms of the first definition.
In view of the features of ACO, EEABR can also optimize the second
aspect of network lifetime.
3.3.4. Applications
EEABR minimizes communication load and maximizes energy
savings. Moreover, it makes sensor nodes in the network keep
good connectivity to the sink. In different WSN scenarios, this
protocol can lead to good results.
3.4. ACORC
Routing using ant colony optimization router chip (ACORC),
proposed in Okdem and Karaboga (2009), makes use of ACO to
optimize routing paths. It provides an effective multi-path data
transmission method to achieve reliable communications in the
case of node faults. By developing an adaptive approach, the
network lifetime is maximized, while data transmission is
achieved efficiently.
The operation of this routing scheme is summarized as follows:
A node having information for the base station initializes the
routing task by transferring data in packets to different neighbor
nodes. Each node then chooses other neighbor nodes and so on.
Thus, paths towards the base station are formed and each routing
operation supplies some information about optimum paths for
the subsequent routing tasks.
While performing this operation, the ACO algorithm is used to
achieve efficient routing. Once a node has data to be transferred to
the destination node, ants are launched from the source node and
move through repeater nodes to reach the destination. The rule
determining the probability distribution is shown in
$$p_k(r,s) = \begin{cases} \dfrac{[\tau(r,s)]^{\alpha}\,[\eta(r,s)]^{\beta}}{\sum_{r \in R_s} [\tau(r,s)]^{\alpha}\,[\eta(r,s)]^{\beta}} & \text{if } k \notin tabu^r \\ 0 & \text{otherwise} \end{cases} \qquad (27)$$
where $\eta(r,s)$ represents the factor related to energy, which is
defined in Eq. (28), $\tau(r,s)$ indicates the pheromone trail on link
(r,s), which is updated according to Eq. (30), and $\alpha$ and $\beta$ are the
parameters that control the relative importance of these two
factors.
$$\eta(r,s) = \frac{(I - e_r)^{-1}}{\sum_{n \in R_s} (I - e_n)^{-1}} \qquad (28)$$
where I is the initial energy, and $e_r$ is the current energy level of
receiver node r. It means that a node with a lower energy level
has a lower probability of being chosen.
After all ants have completed their tour, each backward ant
deposits a quantity of pheromone on the traversed path, which is
calculated in
$$\Delta\tau^k(t) = \frac{1}{J_w^k(t)} \qquad (29)$$
The above equation reveals that the amount of pheromone released
at each tour is inversely proportional to the tour
length $J_w^k(t)$.
Then, the corresponding pheromone trail is updated according to
$$\tau(r,s)(t) \leftarrow \tau(r,s)(t) + \Delta\tau(r,s)(t), \quad \forall\, l(r,s) \in w^k(t),\ k = 1, \ldots, m \qquad (30)$$
where $\Delta\tau(r,s)(t)$ denotes the amount of pheromone released by
backward ants to link (r, s) at iteration t.
In addition, as a negative feedback, the operation of pheromone
evaporation after the tour is accomplished in
$$\tau_{ij}(t) \leftarrow (1 - \rho)\,\tau_{ij}(t) \qquad (31)$$
ACORC has been compared to EEABR using an event-based
simulator. The results show that ACORC offers significant reductions
in energy consumption, which can further prolong the
network lifetime in terms of the first definition. In addition, this
ACO approach has been tested running on a router chip, and
performance results including the response times of the chip were
obtained. As future work, the case of a mobile sink or multiple
sinks should be taken into account.
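The ACORC selection rule and pheromone bookkeeping can be sketched in Python as follows. The sketch assumes the forms of Eqs. (27) to (31) as read from the text, with $\eta$ the energy factor and $\tau$ the pheromone trail. Reading the tabu condition as "skip neighbors that have already handled this ant" is an interpretation, and the function names, the sample values, and the small constant `eps` (which only guards against division by zero for a node at full energy) are assumptions of this sketch.

```python
def energy_factor(initial_energy, energies, eps=1e-9):
    """Eq. (28): normalized inverse of the energy consumed by each candidate."""
    inv = {n: 1.0 / (initial_energy - e + eps) for n, e in energies.items()}
    total = sum(inv.values())
    return {n: v / total for n, v in inv.items()}


def selection_probs(tau, eta, tabu, alpha=1.0, beta=1.0):
    """Eq. (27): probability of forwarding ant k to each candidate node."""
    weights = {n: (tau[n] ** alpha) * (eta[n] ** beta)
               for n in tau if n not in tabu}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def deposit(trails, tour_links, tour_length):
    """Eqs. (29)-(30): add 1 / tour length to every link of the tour."""
    for link in tour_links:
        trails[link] = trails.get(link, 0.0) + 1.0 / tour_length


def evaporate(trails, rho):
    """Eq. (31): evaporation applied to every link after the tour."""
    for link in trails:
        trails[link] *= (1 - rho)


eta = energy_factor(10.0, {"a": 8.0, "b": 5.0})
print(selection_probs({"a": 1.0, "b": 1.0}, eta, tabu=set()))

trails = {("s", "a"): 1.0}
deposit(trails, [("s", "a"), ("a", "d")], tour_length=2)
evaporate(trails, rho=0.1)
print(trails)
```

Node "a" has consumed less energy than node "b", so with equal pheromone it is the more likely next hop, in line with the remark that low-energy nodes are chosen less often.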
3.4.4. Applications
ACORC provides an effective multi-path data transmission
method to achieve reliable communications in the case of node
faults. This protocol has also been implemented on a small-sized
hardware component and tested on the router chip. Such a
protocol is suitable for applications in which transmission speed
is not essential but transmission reliability is important. In addition,
the idea of this protocol and its hardware implementation
seem to be a promising solution for node designers.
3.5. Summary
The ACO-based routing protocols choose the routing path according
to the probability distribution, which can make nodes keep
good connectivity to the base station. Thus, such protocols contribute
to the optimization of the network lifetime in terms of the
second definition. Moreover, these protocols also solve other
problems. SC, FF and FP address the initial pheromone settings
of ACO to achieve a good start-up. They take measures to make the
initial probability distribution non-uniform. SC stores the
energy-related cost of each neighbor to the destination. At the
beginning, good links are chosen with a much higher probability.
Compared to BAR, SC provides better energy efficiency, which
contributes to the optimization of network lifetime in terms of the
first definition. FF floods forward ants to the destination. Once the
forward ants arrive at the destination, the backward ants are
created to traverse back to the source. When a shorter path is
traversed, the rate of releasing flooding ants is decreased.
Compared to BAR and SC, FF has shorter delays, which can indirectly
optimize the network lifetime in terms of the last definition.
FP adopts the flooding mechanism to release ants, and combines
forward ants with data ants. The data ants not only pass the data
to the destination, but also remember the traversed paths, along
which the backward ants update the corresponding pheromone trail.
The probability distribution constrains the flooding towards the
destination for future data ants. Such a protocol provides higher
success rates of data delivery, which leads to the optimization of
network lifetime in terms of the third definition. EEABR takes the
energy levels of sensor nodes and the lengths of routed paths into
account to update the pheromone trail. It strives to reduce the
communication load related to the ants and the energy spent on
communications. This protocol minimizes the communication
load and maximizes the energy saving, which emphasizes the
enhancement of network lifetime in terms of the first definition.
ACORC has been shown to offer significant reductions in energy
consumption compared to EEABR. Thus, it can further prolong the
network lifetime in terms of the first definition.
However, there are some problems. For the BAR protocol, the
forward ants have no sense of where the destination is at the beginning.
The energy levels of nodes and the pheromone trails on links are all the
same. Such a condition is equivalent to a random walk in a maze
with no hint. In addition, the energy level of the path is not considered
when updating the pheromone trail. SC, FF and FP bring about a
good beginning, but SC is weak in terms of latency.
The success of FF depends significantly on an appropriate
frequency of flooding ants. FP consumes more energy than SC and FF.
EEABR and ACORC have much better performance, but they
share the shortcoming of all ACO-based protocols. The ACO algorithm
incurs high communication overhead by sending ants
separately to manage the routes and sending ants back to the
source. Thus, it would be better to change the ACO model to accommodate
the requirements of WSNs, but this has not been done so far.
4. Fuzzy logic based routing protocols

Fuzzy logic (FL) is a mathematical discipline invented to
express approximate human reasoning. Different from classical
set theory, which allows elements to be either included in a set or
not, fuzzy logic can establish intermediate values based on
linguistic variables and inference rules. That is to say, in a fuzzy
set, a certain element is allowed to have partial membership,
which is in the range [0, 1]. A linguistic variable is a variable whose
values are words or sentences in a natural or artificial language, and
inference rules are used to govern the approximate reasoning. By
using hedges like "more", "many", "few", and connectors like AND,
OR, NOT with linguistic variables, an expert can form inference
rules (Minhas et al., 2008). There are two important parts in FL.
First, a fuzzy membership function is defined to compute the
membership corresponding to a given value of a linguistic variable.
Based on various needs, the membership function can be designed
in a flexible way to reflect the desired goodness behavior of an
objective.
In addition, FL offers a fuzzy aggregation operator, Ordered
Weighted Averaging (OWA), as an alternative to weighted sum, for
designing a multi-objective cost function. Usually, the "Or-like"
and "And-like" OWA operators are used in FL, which are implemented
in Eqs. (32) and (33), respectively.

$$\mu_{A \cup B}(x) = \beta \cdot \max(\mu_A, \mu_B) + (1 - \beta) \cdot \frac{1}{2}(\mu_A + \mu_B) \tag{32}$$

$$\mu_{A \cap B}(x) = \beta \cdot \min(\mu_A, \mu_B) + (1 - \beta) \cdot \frac{1}{2}(\mu_A + \mu_B) \tag{33}$$

where A, B are the two fuzzy sets, $\mu$ is a fuzzy membership
function, and $\beta$ is a constant.
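As a minimal sketch of the two OWA operators in Eqs. (32) and (33), the following Python functions blend the max (or min) of two membership values with their arithmetic mean. The function names and the parameter name `beta` for the weighting constant are chosen here for illustration and are not taken from the cited work.

```python
def or_like_owa(mu_a: float, mu_b: float, beta: float) -> float:
    """Or-like OWA, Eq. (32): weighted blend of max and arithmetic mean."""
    return beta * max(mu_a, mu_b) + (1.0 - beta) * 0.5 * (mu_a + mu_b)


def and_like_owa(mu_a: float, mu_b: float, beta: float) -> float:
    """And-like OWA, Eq. (33): weighted blend of min and arithmetic mean."""
    return beta * min(mu_a, mu_b) + (1.0 - beta) * 0.5 * (mu_a + mu_b)
```

With `beta = 1` the operators reduce to the pure max and min; with `beta = 0` both reduce to the plain average, so `beta` controls how strictly the worse (or better) objective dominates.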
FL has been applied successfully in digital image processing,
pattern recognition, and control systems such as vehicle subsystems,
power systems, home appliances, and elevators. Moreover, FL is suited
for implementing clustering heuristics and routing optimization to
simultaneously achieve multiple objectives. However, this algorithm
generates non-optimal solutions, and fuzzy rules need to be re-learnt
upon topology changes.
4.1. FCH
A fuzzy logic approach FCH proposed in Gupta et al. (2005)
addresses cluster-head election for WSNs. In this fuzzy based
system model, cluster-heads are elected by the base station in
each round. For each node, energy, concentration and centrality
are considered as three linguistic variables to determine the
chance to become the cluster-head. Node concentration is the
number of nodes present in the vicinity, and node centrality is a
value which classifies the nodes based on how central the node is
to the cluster. The linguistic variables node energy and node
concentration are divided into three levels: low, medium and
high, and node centrality is divided into three levels: close,
adequate and far. The outcome representing the node's cluster-head
election chance is divided into seven levels: very small,
small, rather small, medium, rather large, large, and very large.
The fuzzy rule base includes rules like the following: if
the energy is high and the concentration is high and the centrality
is close then the node's cluster-head election chance is very large.
Thus there are 3^3 = 27 rules in the fuzzy rule base. In addition,
triangle membership functions are used to represent the medium and
adequate fuzzy sets, and trapezoid membership functions to
represent the low, high, close and far fuzzy sets.
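The triangle and trapezoid membership functions mentioned above can be sketched as follows. This is only an illustration: the breakpoints used for the energy variable (a 0 to 100 scale with knees at 25, 50 and 75) are assumptions made here and are not the values used by Gupta et al. (2005).

```python
def triangle(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify_energy(e):
    """Map a crisp energy value to the three levels (hypothetical breakpoints)."""
    return {
        "low": trapezoid(e, 0, 0, 25, 50),       # left shoulder
        "medium": triangle(e, 25, 50, 75),
        "high": trapezoid(e, 50, 75, 100, 100),  # right shoulder
    }
```

For example, an energy of 60 on this assumed scale belongs partly to "medium" and partly to "high", which is what lets several of the 27 rules fire at once with different strengths.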
The operation of this fuzzy cluster-head election scheme is
divided into rounds, each consisting of a setup phase and a
steady-state phase. In the setup phase, all the nodes are compared
on the basis of their chances, and the node with the maximum chance
is elected as the cluster-head. If multiple nodes have the maximum
chance, the node having more energy is selected. Each node
in the cluster associates itself to the cluster-head and starts
transmitting data. The data transmission phase is similar to the
LEACH steady-state phase. This approach rotates the cluster-head
on the basis of the defuzzified chance value, and incorporates
different parameters like energy, concentration and centrality to
determine the chance.
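The election rule of the setup phase (maximum chance, ties broken by energy) can be written in a few lines. The node record layout with `id`, `chance` and `energy` keys is a hypothetical structure used only for this sketch.

```python
def elect_cluster_head(nodes):
    """Return the id of the node with the highest chance; ties go to higher energy.

    nodes: list of dicts with keys 'id', 'chance' (defuzzified value), 'energy'.
    """
    best = max(nodes, key=lambda n: (n["chance"], n["energy"]))
    return best["id"]
```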
Compared to the LEACH protocol, FCH gains a substantial
increase in network lifetime. Simulation results in Gupta et al.
(2005) show that the number of rounds before first-node-death in
case of the proposed method is on average about 1.8 times greater
than in LEACH, which optimizes the network lifetime in terms of
the first definition.
4.1.4. Applications
FCH is used to select cluster-heads in WSNs. However, this
protocol is a central control approach. It is only suitable for a
small-scale WSN.
4.2. FMO
The fuzzy multi-objective routing algorithm (FMO) proposed in
Minhas et al. (2008) simultaneously optimizes two routing
objectives for WSNs. For the routing objectives, it uses fuzzy
membership functions and rules in the design of cost functions.
In FMO, a static WSN deployment is modeled as a directed graph
G(V, E), where V is the set of nodes and E is the set of links.
This algorithm operates as follows:
When a routing request $r_h(s_m, t_n)$ is initiated, a fuzzy lifetime
membership and a fuzzy minimum energy membership are
computed for each edge using Eqs. (34) and (36), respectively.
Then, the corresponding multi-objective membership is calculated
by Eq. (38). Once the multi-objective membership is obtained, each
edge is assigned a weight according to Eq. (39). Following the
weight assignment, the multi-objective path $p_h$ between $s_m$ and
$t_n$ is found using Dijkstra's shortest path algorithm (Dijkstra, 1959).
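$$
\mu_{ij}^{lt} =
\begin{cases}
1 - \dfrac{1-\gamma}{1-\alpha}\left(1 - \dfrac{re(v_i)}{\sigma}\right) & \text{if } \alpha\sigma < re(v_i) \le \sigma \\[2ex]
\dfrac{\gamma}{\alpha\sigma - TX_{ij}}\left(re(v_i) - TX_{ij}\right) & \text{if } TX_{ij} < re(v_i) \le \alpha\sigma \\[2ex]
0 & \text{if } re(v_i) \le TX_{ij}
\end{cases}
\tag{34}
$$

$$re(v_i) = ce(v_i) - TX_{ij} \tag{35}$$

where $re(v_i)$ and $ce(v_i)$ denote the residual energy and current
energy of node $v_i$ respectively, $\sigma$ is the initial energy of nodes,
$TX_{ij}$ represents the consumed energy in transmission from node $v_i$ to
node $v_j$, and $\alpha$, $\gamma$ are the algorithmic parameters.

$$\mu_{ij}^{me} = 1 - (1 - \delta)\,\frac{TX_{ij}}{maxTX_{ij}} \tag{36}$$

$$maxTX_{ij} = \max_{j \ \text{s.t.}\ v_j \in N_i} TX_{ij} \tag{37}$$

where $\delta$ is an algorithmic parameter which can be adjusted to
alter the lowest membership. The above function assigns the
lowest membership value to an edge requiring the maximum
transmission energy among all the neighboring edges, which
encourages the selection of such edges that require lesser
transmission energy.

$$\mu_{ij} = \beta \cdot \min\left(\mu_{ij}^{lt}, \mu_{ij}^{me}\right) + (1 - \beta)\,\frac{\mu_{ij}^{lt} + \mu_{ij}^{me}}{2} \tag{38}$$

where $\mu_{ij}$ is the fuzzy multi-objective membership of the edge
$e(v_i, v_j)$, and $\beta$ is a constant.

$$w_{ij} = 1 - \mu_{ij} \tag{39}$$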
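The FMO procedure (memberships per edge, And-like aggregation, weight assignment, then Dijkstra) can be sketched in Python as below. This is a sketch under stated assumptions, not the implementation of Minhas et al. (2008): the piecewise-linear form assumed for Eq. (34), the parameter names `alpha`, `gamma`, `delta` and `beta`, their default values, and the dictionary-based graph layout are all choices made here for illustration.

```python
import heapq


def lifetime_membership(re, tx, sigma, alpha=0.2, gamma=0.9):
    """Assumed reading of Eq. (34): piecewise-linear in the residual energy re."""
    if re <= tx:
        return 0.0
    if re <= alpha * sigma:
        return gamma * (re - tx) / (alpha * sigma - tx)
    return 1.0 - (1.0 - gamma) / (1.0 - alpha) * (1.0 - re / sigma)


def energy_membership(tx, max_tx, delta=0.2):
    """Eq. (36): the costliest neighbouring edge gets the lowest value, delta."""
    return 1.0 - (1.0 - delta) * tx / max_tx


def edge_weight(mu_lt, mu_me, beta=0.5):
    """Eqs. (38)-(39): And-like OWA of both memberships, then w = 1 - mu."""
    mu = beta * min(mu_lt, mu_me) + (1.0 - beta) * 0.5 * (mu_lt + mu_me)
    return 1.0 - mu


def fmo_route(tx, current_energy, sigma, source, target):
    """tx[i][j] is the transmission energy of edge (i, j).

    Returns the minimum-weight path as a list of nodes, or None if unreachable.
    """
    weights = {}
    for i, nbrs in tx.items():
        if not nbrs:
            continue
        max_tx = max(nbrs.values())                   # Eq. (37)
        for j, t in nbrs.items():
            re = current_energy[i] - t                # Eq. (35)
            mu_lt = lifetime_membership(re, t, sigma)
            mu_me = energy_membership(t, max_tx)
            weights[(i, j)] = edge_weight(mu_lt, mu_me)

    # Standard Dijkstra over the non-negative weights w_ij.
    dist, prev, done = {source: 0.0}, {}, set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v in tx.get(u, {}):
            nd = d + weights[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]
```

Because every membership lies in [0, 1], the weights are non-negative, which is the condition Dijkstra's algorithm needs. Edges that are cheap to use and start from well-charged nodes get weights near 0 and are therefore preferred.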

FMO makes use of the FL algorithm to simultaneously optimize
multiple objectives. Simulation results show that this approach is
superior to a number of other well-known online routing heuristics
in the performance of network lifetime in terms of the first
definition.
4.2.4. Applications
Because of the characteristic of FMO, this protocol can be
applied to simultaneously achieve multiple routing objectives
for WSNs.
4.3. Summary
The FL-based protocols implement clustering heuristics or
routing optimization to simultaneously achieve multiple objectives.
FCH considers energy, concentration and centrality as three
linguistic variables to determine the chance of becoming cluster
head. It has been validated to gain a substantial increase in
network lifetime in terms of the first definition. In this aspect,
FMO is also superior to a number of other well-known online
routing heuristics. It uses fuzzy membership functions and rules in
the design of cost functions to simultaneously optimize multiple
objectives. But many of these protocols lack explicit comparison to
traditional or to other intelligent routing protocols, and only a few
protocols have been validated under real WSN environments like
test-beds or deployments. In addition, FL generates non-optimal
solutions, and fuzzy rules need to be re-learnt upon topology
changes.
5. Genetic algorithm based routing protocols

Genetic algorithm (GA), which models natural evolution, performs
fitness tests on new structures to select the best population.
A population is composed of a group of chromosomes. In the
application of GA, a chromosome represents a complete solution
to a defined problem, and fitness reveals the quality of a
chromosome on the basis of concrete needs.
Initially, the population is randomly generated as a set of
chromosomes. Then, the fitness of each chromosome is evaluated
according to the defined fitness function. For a particular
chromosome, the better the fitness value, the higher the chance of being
selected to create new chromosomes by crossover. The probability
of crossover taking place depends on the crossover rate which is
usually around 80–95%. In nature, crossover is a recombination of
component materials due to mating. It is a binary genetic operator
acting on two chromosomes which have been chosen in the
selection process. Different methods for crossover have been developed,
and the simplest is the single-point crossover in which a point is
randomly chosen and the two parents exchange genes after that
point. By means of crossover, the offspring chromosomes only
inherit the traits of the parent chromosomes, which leads to the
problem that no new genetic material is introduced in the next
generation. Therefore, mutation, which allows new genetic patterns
to be introduced in the new generation, is needed. As with
crossover, the mutation rate is defined to control how often
mutation is applied, which is around 0.5–1%. However, unlike
crossover, mutation is a unary genetic operator that affects only a
single chromosome. In the chromosome which has been
selected for mutation, a random bit is selected to change from 0 to
1, or vice versa. In this way, a new sequence of genes is
introduced into a chromosome. The mutated chromosome is retained
only if its fitness, evaluated in the selection process, is higher
than that of the general population. The procedures presented
above are repeated generation after generation until either a
fit-enough solution is found or a given limit is reached.
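The generic GA loop described above can be sketched for bit-string chromosomes as follows. Several choices here are assumptions of this sketch rather than part of the surveyed protocols: fitness-proportional (roulette-wheel) selection, keeping the single best chromosome each generation (elitism), a non-negative fitness function, and the default rates of 0.9 and 0.01. For brevity, mutated chromosomes are always kept instead of being filtered by fitness.

```python
import random


def single_point_crossover(p1, p2, rng):
    """Pick one cut point; the parents exchange all genes after it."""
    point = rng.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def bit_flip_mutation(chrom, rng):
    """Flip one randomly chosen bit (0 -> 1 or 1 -> 0)."""
    i = rng.randrange(len(chrom))
    return chrom[:i] + [1 - chrom[i]] + chrom[i + 1:]


def evolve(fitness, n_bits, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.01, seed=0):
    """Return the best chromosome found; fitness must be non-negative."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # A small epsilon keeps every selection weight strictly positive.
        scores = [fitness(c) + 1e-9 for c in pop]
        nxt = [max(pop, key=fitness)]  # elitism: carry the best one over
        while len(nxt) < pop_size:
            a, b = rng.choices(pop, weights=scores, k=2)
            if rng.random() < crossover_rate:
                a, b = single_point_crossover(a, b, rng)
            for child in (a, b):
                if rng.random() < mutation_rate:
                    child = bit_flip_mutation(child, rng)
                nxt.append(child)
        pop = nxt[:pop_size]
    return max(pop, key=fitness)
```

A call such as `evolve(sum, n_bits=8)` maximizes the number of 1 bits in an 8-bit chromosome, a standard toy problem. In a WSN setting the fitness function would instead score a routing tree or a clustering.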
The GA algorithm is able to explore the search space efficiently
through parallel evaluation of fitness and mixing of partial solutions
through crossover. It maintains a search frontier to seek global
optima, and solve multi-criterion optimization problems. In addition,
a more specific advantage of GA is its ability to represent rule-based,
permutation-based, and constructive solutions to many pattern
recognition and machine learning problems (Hsu, 2008). This
algorithm has been applied to address design, control, classification,
clustering, and performance tuning. In WSNs, GA is suited for
clustering when the clustering schemes can be pre-deployed. But it
has very high processing demands and is usually a centralized
solution. In addition, the definition of a good fitness measure is the
most critical challenge in a genetic approach.
5.1. GA-Routing
GA-Routing presented in Islam and Hussain (2006) is a GA-based
multi-hop routing protocol. In the protocol, the GA technique is
used to generate an aggregation tree which spans all the
sensor nodes.
In the application of WSNs, a sensor node aggregates data
received from neighboring nodes with its own, and forwards the
aggregated packet in the direction of the base station. An
aggregation tree indicates the path to transmit data, and the best one is
the most efficient path. However, if the best path is continuously
used, a few nodes in that path may die earlier so that the network
lifetime is shortened. Therefore, a sequence of routing paths is
generated. In GA-Routing, a chromosome represents a spanning
tree. By means of initiation, selection, crossover, and mutation, the
final solution is obtained. It strives to construct an aggregation tree
and to find the number of times a particular tree is used.
Simulation results show that GA-Routing prolongs the network
lifetime in terms of the first definition compared to the single best
tree algorithm. And for a small network, it can achieve the
same lifetime as the clustering-based maximum lifetime data
aggregation algorithm (Dasgupta et al., 2003). However, GA-Routing
is a centralized protocol which is not suited for large-scale
networks. In addition, there is extra cost to disseminate the
optimal routing paths to sensor nodes.
5.1.4. Applications
GA-Routing is proposed for a homogeneous network to maximize the network lifetime. But it is not suited for large-scale
network since it adopts the mechanism of central control.
5.2. GA-EECP
The GA-based energy-efficient clustering protocol (GA-EECP)
proposed in Hussain et al. (2007) makes use of the GA approach to
create energy efficient clusters for data dissemination in WSNs.
In this protocol, sensor nodes are represented as bits of a
chromosome. Head and member nodes are represented as 1 and 0,
respectively. The fitness of a chromosome is determined by several
parameters such as the cluster distance, the direct distance to the
base station, the transfer energy, the standard deviation of cluster
distance, and the number of transmissions, which is shown in

$$F = \sum_i w_i f_i, \quad \forall f_i \in \{C, DD, E, SD, T\} \tag{40}$$

where C represents the cluster distance. It is the sum of the
distances from nodes to the cluster head and the distance from
the head to the base station. For a cluster with k member nodes,
the cluster distance C is defined as follows:

$$C = \sum_{i=1}^{k} d_{ih} + d_{hs} \tag{41}$$

where $d_{ih}$ is the distance from node i to the cluster head h, and
$d_{hs}$ is the distance from the cluster head h to the base station.
DD, direct distance to the base station, is the sum of all
distances from sensor nodes to the base station.

$$DD = \sum_{i=1}^{m} d_{is} \tag{42}$$

where $d_{is}$ is the distance from node i to the base station. E, the
transfer energy, represents the energy consumed to transfer the
aggregated message from the cluster to the base station. For a
cluster with k member nodes, E is defined as follows:

$$E = \sum_{j=1}^{k} ET_{jh} + k \cdot ER + ET_{hs} \tag{43}$$

The three terms reveal the energy consumption in the transmission
from member nodes to the cluster head, the cluster head
receiving messages from the member nodes, and the transmission
from the cluster head to the base station, respectively.
SD represents the standard deviation of cluster distance. With the
mean cluster distance $\mu$ over the h clusters, SD can be computed as
follows:

$$\mu = \frac{1}{h}\sum_{i=1}^{h} d_{cluster_i} \tag{44}$$

$$SD = \sqrt{\frac{1}{h}\sum_{i=1}^{h} \left(\mu - d_{cluster_i}\right)^2} \tag{45}$$

T denotes the number of transmissions, which is assigned by
the base station according to the network conditions and current
energy levels.
Initially, the fitness parameters can be assigned arbitrary
weights w. Then, after every generation, the best fit chromosome
is evaluated and the weights for fitness parameters are updated as
follows:

$$w_i = w_{i-1} + c_i \cdot \Delta f_i \tag{46}$$

where $\Delta f$ represents the change in the fitness parameter value:

$$\Delta f_i = f_i - f_{i-1} \tag{47}$$

$$c_i = \frac{1}{1 + e^{-f_{i-1}}} \tag{48}$$

Based on this fitness function, the quality of each chromosome
in the current population is evaluated, and the best one is
introduced to the future generation.
Just as Fig. 2 shows, GA-EECP uses GA to determine the initial
set of hierarchical clusters. Based on GA, all the living nodes are
organized into clusters. Then, the cluster optimizer uses the GA's
suggested clusters, query type, and the current network condition
to create optimized clusters that provide the query execution plan
and the transmission schedules. The sensor nodes are configured
according to the optimized cluster information, which is followed
by the data transfer phase. Once it has received the required
transmissions, the base station creates a new set of clusters by GA
according to the current condition.

Fig. 2. Cluster optimization in GA-EECP.

The simulation results in Hussain et al. (2007) indicate that this
GA-based protocol performs better than the traditional cluster-based
protocols in terms of network lifetime defined by the first aspect.
However, in such a protocol, there is an additional cost caused by
the base station gathering information about the whole network
to determine the clusters.
5.2.4. Applications
This protocol can extend network lifetime for different network
deployment environments. Such a GA-based hierarchical clustering
protocol is suitable for large scale WSNs which can be used for
various pervasive and ubiquitous applications such as security,
health-care, industry automation, agriculture, environment and
habitat monitoring.
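As a rough illustration of the GA-EECP fitness evaluation described in Section 5.2, the sketch below decodes a chromosome into head and member nodes, computes the cluster distance of Eq. (41), forms the weighted sum of Eq. (40), and applies the weight update of Eqs. (46) to (48). The logistic form used for $c_i$ is an assumption about how Eq. (48) reads, and the function names and dictionary layout are hypothetical.

```python
import math


def decode(chromosome):
    """Split node indices into cluster heads (bit 1) and members (bit 0)."""
    heads = [i for i, bit in enumerate(chromosome) if bit == 1]
    members = [i for i, bit in enumerate(chromosome) if bit == 0]
    return heads, members


def cluster_distance(d_members_to_head, d_head_to_bs):
    """Eq. (41): member-to-head distances plus the head-to-base-station distance."""
    return sum(d_members_to_head) + d_head_to_bs


def fitness(weights, params):
    """Eq. (40): weighted sum over the fitness parameters (C, DD, E, SD, T)."""
    return sum(weights[name] * params[name] for name in params)


def update_weights(weights, prev_params, curr_params):
    """Eqs. (46)-(48): w_i = w_{i-1} + c_i * (f_i - f_{i-1})."""
    updated = {}
    for name, w in weights.items():
        delta_f = curr_params[name] - prev_params[name]       # Eq. (47)
        c = 1.0 / (1.0 + math.exp(-prev_params[name]))        # assumed Eq. (48)
        updated[name] = w + c * delta_f                       # Eq. (46)
    return updated
```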
5.3. Summary
For WSNs in medium or large size, GA can be used to build
clusters. GA-Routing uses GA technique to generate an aggregation
tree which spans all the sensor nodes. It prolongs the network
lifetime in terms of the first definition compared to the single best
tree algorithm, and achieves the same lifetime as the
clustering-based maximum lifetime data aggregation algorithm for network
in small size. GA-EECP considers cluster distance, direct distance to
the base station, transfer energy, standard deviation of cluster
distance, and number of transmissions to create energy efficient
clusters. It performs better than the traditional cluster-based
protocols in terms of network lifetime defined by the first aspect.
However, GA-Routing has an extra cost to disseminate the optimal
routing paths to sensor nodes. In GA-EECP, there is also an
additional cost caused by the base station gathering information
about the whole network to determine the clusters. In addition,
GA has very high processing demands and is usually a centralized
solution.
6. Neural networks based routing protocols

Neural networks (NNs) (Haykin, 1994) are models of
biological neural systems. The human brain contains a large
number of neurons. Each neuron, connected to a great many other
neurons, receives signals through synapses. These synaptic
connections play an important role in the behavior of the brain. Such a
structure is similar to a dense network. A NN consists of a network
of neurons organized in input, hidden and output layers.
The architecture of a NN is shown in Fig. 3, where the context layer is
a copy of the hidden layer output. For different types of NNs, there are
some differences. For example, in a feed-forward NN, the outputs
of a layer are connected as the inputs to the next layer, while in a
recurrent network of Jordan type, a copy of the output of the neurons
in the output layer is presented as the input to the hidden layer,
and in a recurrent network of Elman type, the context layer is
presented as the input to the hidden layer (Kulkarni et al., 2011).
NNs learn the patterns and determine their inter-relationships,
in which the weights of a NN are updated to discover patterns in
the input data. Once the learning phase has been successfully
accomplished, the performance of the NN model needs to be
validated using an independent testing set.
NNs have been applied in power system stabilization, pattern
classification, speech recognition, image processing, robotics,
prediction, and so on. To improve the performance of NN models,
some issues need to be addressed. First, there should be a
systematic approach which deals with the choice of network
architecture, determination of adequate inputs, data division and
preprocessing, selection of internal parameters, stopping criteria,
and model validation. Second, effective approaches need
to be developed that deal with uncertainty,
ensure the development of robust models, increase model
transparency, and improve extrapolation ability.
6.1. SIR
Sensor intelligence routing (SIR) proposed in Barbancho et al.
(2006) is a QoS driven routing algorithm for automatic reading of
public utility meters. A NN is introduced in every node to manage
the routing. In the structure, the first layer has 4 neurons and the
second layer has 12 neurons in a 3 × 4 matrix. Inputs are latency,
throughput, error-rate, and duty-cycle.

Fig. 3. Architecture of NN.

The samples presented as inputs form groups in such a way
that all the samples in a group have similar characteristics. Then, a
map is formed by clusters, where every cluster corresponds with a
specific QoS and is assigned a neuron of the output layer. After a
node has collected a set of input samples, it runs the winning
neuron election algorithm. Having elected the winning neuron, the
node uses the output function to assign an estimation of QoS.
In this way, each node pings its neighbors to find out the quality
of link based on such factors as latency, error rate, duty cycle and
throughput. Then, the minimum cost paths from the base station to
each node can be found by a modification of the Dijkstra algorithm.
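The winning neuron election can be illustrated with a short sketch. It assumes the usual competitive-learning rule in which the winner is the output neuron whose weight vector is closest (in Euclidean distance) to the input sample; the source does not spell out this rule, and the `qos_of_neuron` lookup standing in for the output function is hypothetical.

```python
import math


def winning_neuron(sample, weights):
    """Return the index of the output neuron closest to the input sample.

    sample:  (latency, throughput, error_rate, duty_cycle)
    weights: one 4-element weight vector per output neuron (12 in SIR).
    """
    def distance(w):
        return math.sqrt(sum((s - wi) ** 2 for s, wi in zip(sample, w)))

    return min(range(len(weights)), key=lambda k: distance(weights[k]))


def estimate_qos(sample, weights, qos_of_neuron):
    """Map the winning neuron to a QoS estimate via a lookup table."""
    return qos_of_neuron[winning_neuron(sample, weights)]
```

The resulting per-link QoS estimates would then serve as edge costs for the shortest-path computation from the base station.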
SIR has been evaluated in Barbancho et al. (2006) under two
cases: one in which all the nodes are working, and one in which 20%
of the nodes are dead. Simulation results show that SIR achieves superior
performance in terms of average latency and energy consumption
over EAR and Directed Diffusion. Thus, compared to EAR and
Directed Diffusion, SIR further optimizes network lifetime in terms
of the first and third definitions. Especially when the percentage of
dead nodes is high, SIR has much greater superiority. However, SIR
is costly. First, the implementation of a NN on each node
entails computational cost. In addition, there is additional cost for
each node pinging neighbors to learn the quality of links.
Table 2
Intelligent routing protocols optimizing network lifetime in terms of Definition 1.

| Routing protocol | Intelligence used in routing | Characteristic |
|---|---|---|
| AdaR | RL | Learns an optimal routing strategy considering hop count, residual energy, aggregated ratio. |
| ATP | RL | Makes use of RL-based strategies to build an adaptive spanning tree, which considers new routing metrics for energy-aware load balancing. |
| FROMS | RL | Makes use of the RL algorithm to minimize the energy dissipation while delivering packets to many sinks simultaneously. |
| QELAR | RL | Learns the environment effectively to reduce networking overhead. Considers the residual energy of each node and the energy distribution among a group of nodes in the reward function. |
| SC | ACO | For the initial pheromone settings in ACO, stores the energy-correlative cost to the destination from each neighbor; good links are chosen with a much higher probability. |
| EEABR | ACO | Strives to reduce the communication load related to the ants and the energy spent on communications. Takes the energy levels of sensor nodes and the lengths of routed paths into account to update the pheromone trail. Has been proved to minimize the communication load and maximize the energy saving. |
| ACORC | ACO | Has been simulated using an event-based simulator and tested running on a router chip in (Okdem and Karaboga, 2009). Has been shown to offer significant reductions of energy consumption compared to EEABR. |
| FCH | FL | Considers energy, concentration and centrality as three linguistic variables to determine the chance of becoming cluster-head. Has been validated in (Gupta et al., 2005) to be on average 1.8 times greater than LEACH in terms of first-node-death. |
| FMO | FL | Makes use of FL to simultaneously optimize multiple objectives, which uses fuzzy membership functions and rules in the design of cost functions. Has been validated to be superior to a number of other well-known online routing heuristics in the performance of first-node-death by experiment in (Minhas et al., 2008). |
| GA-Routing | GA | Uses GA technique to generate an aggregation tree which spans all the sensor nodes. Has been proved to prolong network lifetime in terms of the first definition compared to the single best tree algorithm, and achieve the same lifetime as the clustering-based maximum lifetime data aggregation algorithm for network in small size. |
| GA-EECP | GA | Uses GA technique to create energy efficient clusters. Considers cluster distance, direct distance to the base station, transfer energy, standard deviation of cluster distance, and number of transmissions as factors of influence. |
| SIR | NNs | Introduces a NN in each node to manage the routing. As a QoS driven routing algorithm, considers latency, error rate, duty cycle and throughput to determine the quality of link. Has been shown in (Barbancho et al., 2006) to achieve superior performance in terms of energy consumption over EAR and Directed Diffusion. |

Effect (entries span several rows): Balances energy consumption (AdaR, ATP); Decreases energy consumption; Decreases energy consumption, provides better energy efficiency; Postpones the time of first-node-death.
Table 3

| Routing protocol | Intelligence used in routing | Characteristic | Effect |
|---|---|---|---|
| ATP | RL | Has been proved to be robust for unpredictable link failures and mobile sinks by experiments in (Zhang and Huang, 2006). | Has much better connectivity |
| FROMS | RL | Has been shown in (Forster and Murphy, 2007) to perform well in case of node failure and sink mobility. | Has much better connectivity |
| BAR | ACO | Chooses routing path according to the probability distribution. | Builds multiple paths |
| SC/FF/FP | ACO | Address the initial pheromone settings in ACO to lead to a good start-up. | Build multiple paths |
| EEABR | ACO | Has the feature of ACO to choose routing path according to the probability distribution. | Builds multiple paths |
| ACORC | ACO | Makes use of ACO to provide an effective multi-path data transmission to achieve reliable communications in the case of node faults. | Has much better connectivity |
6.1.4. Applications
SIR achieves superior performance in terms of average latency
and energy consumption. Such a protocol is well suited for the
real-time application of WSNs.
6.2. Summary
NNs can be utilized to improve performance of WSNs. SIR
considers latency, error rate, duty cycle and throughput to
determine the quality of link. This QoS driven routing protocol achieves
superior performance in terms of average latency and energy
consumption. Thus, it optimizes network lifetime in terms of the
first and third definitions. Especially when the percentage of dead
nodes is high, SIR has much greater superiority. However, there is
an additional cost caused by each node pinging neighbors to learn
the quality of link. In addition, the implementation of a NN on
each node entails computational cost.

Table 4

| Routing protocol | Intelligence used in routing | Characteristic | Effect |
|---|---|---|---|
| Q-Routing | RL | Takes the minimal delivery times into account to learn the best paths. | Reduces latency |
| AdaR | RL | Learns an optimal routing strategy also taking the factor of link reliability into account. | Improves link reliability |
| ATP | RL | Makes use of RL-based strategies to build an adaptive spanning tree, which considers new routing metrics for congestion-aware routing. | Reduces latency and improves link reliability |
| QELAR | RL | Has been proved in (Hu et al., 2010) to achieve high delivery rate and energy efficiency even in a moderately sparse network. | Increases packet delivery |
| FF | ACO | At the beginning, floods forward ants to the destination. Once the forward ants arrive at the destination, creates backward ants to traverse back to the source. When a shorter path is traversed, the rate of releasing flooding ants is decreased. | Reduces latency |
| FP | ACO | Adopts the flooding mechanism to release ants, and combines forward ants with data ants. Data ants not only pass the data to the destination, but also remember the traversed paths, by which the backward ants update the correlative pheromone trail. The probability distribution constrains the flooding towards the destination for future data ants. | Increases packet delivery |
| SIR | NNs | Introduces a NN in each node to manage the routing. As a QoS driven routing algorithm, considers latency, error rate, duty cycle and throughput to determine the quality of link. Has been shown in (Barbancho et al., 2006) to achieve superior performance in terms of average latency over EAR and Directed Diffusion. | Reduces latency and increases packet delivery |
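Table 5
Intelligent routing protocols in WSNs.

| Routing protocol | RL | ACO | FL | GA | NNs | Lifetime optimization |
|---|---|---|---|---|---|---|
| Q-Routing | Y | | | | | Definition 3 |
| AdaR | Y | | | | | Definitions 1 and 3 |
| ATP | Y | | | | | Definitions 1–3 |
| FROMS | Y | | | | | Definitions 1 and 2 |
| QELAR | Y | | | | | Definitions 1 and 3 |
| BAR | | Y | | | | Definition 2 |
| SC/FF/FP | | Y | | | | Definitions 1–3 |
| EEABR | | Y | | | | Definitions 1 and 2 |
| ACORC | | Y | | | | Definitions 1 and 2 |
| FCH | | | Y | | | Definition 1 |
| FMO | | | Y | | | Definition 1 |
| GA-Routing | | | | Y | | Definition 1 |
| GA-EECP | | | | Y | | Definition 1 |
| SIR | | | | | Y | Definitions 1 and 3 |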
7. Analysis of network lifetime

This section further analyzes the performance of these
intelligent routing protocols in terms of the three definitions of
network lifetime.
7.1. Definition 1 of network lifetime
Definition 1 denotes the time of first-node-death. The major
measures to extend such time are to decrease and balance energy
consumption. Among all of the intelligent routing protocols
covered in this survey, the protocols listed in Table 2
contribute to the optimization of network lifetime in terms of
Definition 1.
7.2. Definition 2 of network lifetime
Definition 2 focuses on the time until there is a node which
still has energy but no longer has a path to the base station. The
major measure to postpone such time is to build multiple paths to
guarantee the connectivity to the base station. Table 3 lists
the intelligent routing protocols which take measures to prolong
network lifetime in terms of Definition 2.
7.3. Definition 3 of network lifetime
Definition 3 pays close attention to the ratio of packet delivery
when no data can be transmitted to the base station. Routing
protocols covered in Table 4 adopt the corresponding measures to
reduce latency or improve link reliability to achieve more efficient
packet delivery.
8. Conclusion

Lifetime optimization has been a hot issue in WSNs. In recent
years, several routing protocols based on such intelligent
algorithms as RL, ACO, FL, GA, and NNs have been proposed for WSNs
to achieve this goal. This paper first defines network lifetime in
three aspects. Then, under each category of intelligent algorithms,
it discusses some representative routing protocols which
contribute to the optimization of network lifetime. Table 5
summarizes the intelligent routing protocols covered in this
survey, and indicates which aspects of network lifetime have been
optimized.
These intelligent algorithms are not equally suited to
routing in WSNs. RL is the best option to deal with the routing issue for
WSNs. This algorithm is flexible, fully distributed, and robust against
node failures. Moreover, its communication requirements are nearly
zero, and it can maintain data delivery even in case of topology
changes. ACO is also popular for addressing routing in WSNs.
However, it requires high communication overhead by sending ants
separately to manage the routes and sending ants back to the source.
Thus, it is better to change the ACO model to accommodate the
requirements of WSNs, but this has not been done so far. In addition,
FL is suited for implementing clustering heuristics and routing
optimization to simultaneously achieve multiple objectives. However,
this algorithm generates non-optimal solutions, and fuzzy rules need
to be re-learnt upon topology changes. Finally, GA and NNs can also
be used to improve performance of WSNs. But they have very
high processing demands and are usually centralized solutions. These
two approaches are slightly better suited for clustering when the
clustering schemes can be pre-deployed. On the basis of diverse
demands, one or multiple intelligent algorithms can be utilized to
optimize the performance.
Routing protocols based on intelligent algorithms look promising
since they have superiority under uncertain environments and
severe limitations. However, many of them lack explicit comparison
to traditional or to other intelligent algorithms. In addition, only a
few algorithms have been validated under real WSN environments
like test-beds or deployments; most of them have been evaluated
only in simulation. Therefore, this paper intends to
provide new ideas and incentives for addressing the routing issue in
WSNs, and there are still many challenges to be solved.
Furthermore, since the current definition of network lifetime is
limited to the time until the first dead node appears, this paper
brings forward a comprehensive evaluation indicator for network
lifetime. In future work, we intend to evaluate and compare these
routing protocols for WSNs following the above opinion.
Acknowledgments
This study was supported by the Networking Research Lab at
East China Normal University. The authors are grateful to all the
members of the lab.
References
Baranidharan B, Shanti BA. Survey on energy efcient protocols for wireless sensor
networks. International Journal of Computer Applications 2010;11(10):3540.
Barbancho J, Len C, Molina J, Barbancho A..Giving neurons to sensors: QoS
management in wireless sensors networks. In: Leon C, editor. Proceedings of
the IEEE conference on emerging technologies and factory automation ETFA;
2006. p. 594597.
Boyan JA, Littman ML. Packet routing in dynamically changing networks: a
reinforcement learning approach. Advances Neural Information Processing
Systems, vol. 6; 1994.
Camilo Tiago, Carreto Carlos, S Silva Jorge, Boavida Fernando. An energy-efcient
ant-based routing algorithm for wireless sensor networks. Ant Colony Optimization and Swarm Intelligence; 2006. p. 4959.
Celik F, Zengin A, Tuncel S. A survey on swarm intelligence based routing protocols
in wireless sensor networks. International Journal of the Physical Sciences
2010;5(14):211826.
Dijkstra EWA. Note on two problems in connexion with graphs. Numerische
Mathematik 1959;1:26971.
Dasgupta K, Kalpakis K, Namjoshi P. An efcient clustering-based heuristic for data
gathering and aggregation in sensor networks. In: Proceedings of IEEE wireless
communication and networking WCNC, 3; 2003. p. 19481953.
201
Di Caro G, Dorigo M. AntNet: distributed stigmergetic control for communications

networks. Journal of Articial Intelligence Research (JAIR) 1998;9:31765.
Ellabib Issmail, Calamai Paul, Basir Otman. Exchange strategies for multiple ant
colony system. Information Science 2007;177(5):124864.
Forster A, Murphy AL. FROMS: Feedback routing for optimizing multiple sinks in
WSN with reinforcement learning. In: Proceedings of the 3rd International
Conference on Intelligent Sensors, Sensor Networks and Information Processing. (ISSNIP); 2007.
Gupta I, Riordan D, Sampalli S. Cluster-head election using fuzzy logic for wireless
sensor networks. In: Riordan, D, editor. Proceedings of the 3rd Annual
Communications Networks and Services Research Conference; 2005. p. 255
260.
Halawani Sami, Khan Abdul Waheed. Sensors lifetime enhancement techniques in
wireless sensor networksa survey. Journal of Computing, 2010;2(5):3447.
Haykin S. Neural networks: a comprehensive foundation. Prentice Hall; 1994.
Hsu William H. Genetic algorithms. Department of Computing and Information
Sciences, Kansas State University; 2008.
Hu Tiansi, et al. QELAR: a machine-learning-based adaptive routing protocol for
energy-efcient and lifetime-extended underwater sensor networks. IEEE
Transactions on Mobile Computing 2010;9(6):796809.
Hussain Sajid, Matin Abdul Wasey, Islam Obidul. Genetic algorithm for hierarchical
wireless sensor networks. Journal of Networks 2007;2(5):8797.
Islam O, Hussain S. An intelligent multi-hop routing for wireless sensor networks.
In: Proceedings of WI-IAT Workshops Web Intelligence and Intelligent Agent
Technology Workshops; 2006. p. 239242.
Kaelbling LP, Littman ML, Moore AP. Reinforcement learning: a survey. Journal of
Articial Intelligence Research 1996;4:23785.
Karaki JN, Kamal AE. Routing techniques in wireless sensor networks: a survey.
Wireless Communications 2004;11(6):628.
Kulkarni Raghavendra V, Forster Anna, Kumar Venayagamoorthy Ganesh. Computational intelligence in wireless sensor networks: a survey. IEEE Communications Surveys & Tutorials 2011;13(1):6896.
Minhas Mahmood R, Gopalakrishnan Sathish, Leung Victor CM. Fuzzy algorithms
for maximum lifetime routing in wireless sensor networks. Global telecommunications conference; 2008. p. 16.
Okdem Selcuk, Karaboga Dervis. Routing in wireless sensor networks using an Ant
Colony Optimization (ACO) router chip. Sensors 2009;9(2009):90921.
Saleem M, Di Caro GA, Farooq M. Swarm intelligence based routing protocol for
wireless sensor networks: survey and future directions. Information Sciences
2011;181(20):4597624.
Singh SK, Singh MP, Singh DK. Routing protocols in wireless sensor networksa
survey. International Journal of Computer Science and Engineering Survey
(IJCSES) 2010;1(2):6383.
Subramanian L, Katz RH. An architecture for building self congurable systems. In:
Proceedings of IEEE/ACM workshop on mobile ad hoc networking and
computing, Boston, MA; August 2000.
Sutton RS, Barto AG. Reinforcement learning: an introduction. The MIT Press; 1998.
Villalba LJG, Orozco ALS, Cabrera AT, Abbas CJB. Routing protocols in wireless sensor
networks. Sensors 2009;9:8399421.
Wang P, Wang T. Adaptive routing for sensor networks using reinforcement
learning. In: Proceedings of the 6th IEEE international conference on computer
and information technology (CIT). Washington, DC, USA. IEEE Computer
Society; 2006.
Wei G. Study on immunized ant colony optimization. In: Proceedings of the third
international conference on natural computation (ICNC 2007); 2007.
Ye F, et al. A scalable solution to minimum cost forwarding in large scale sensor
networks. In: Proceedings of international conference on computer communications and networks (ICCCN), Dallas, TX; October 2001.
Zhang Ying, Huang Qingfeng. A learning-based adaptive routing tree for wireless
sensor networks. Journal of Communications 2006;1(2).
Zhang Y, Kuhn L, Fromherz M. Improvements on ant routing for sensor networks.
In: Ants 2004, Workshop on Ant Colony Optimization and Swarm Intelligence;
2004. p. 154165.
Zungeru AM, Ang LM, Seng KP. Classical and swarm intelligence based routing
protocols for wireless sensor networks: a survey and comparison. Journal of
Network and Computer Applications 2012;35(5):150836.
