
INTELLIGENT GAIT CONTROL OF A MULTILEGGED ROBOT USED

IN RESCUE OPERATIONS

A THESIS SUBMITTED TO
THE GRADUATE SCHOOL OF NATURAL AND APPLIED SCIENCES
OF
THE MIDDLE EAST TECHNICAL UNIVERSITY

BY

EMRE KARALARLI

IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE


DEGREE OF
MASTER OF SCIENCE
IN
THE DEPARTMENT OF ELECTRICAL AND ELECTRONICS
ENGINEERING

DECEMBER 2003
Approval of the Graduate School of Natural and Applied Sciences

——————————————–
Prof. Dr. Canan Özgen
Director

I certify that this thesis satisfies all the requirements as a thesis for the degree
of Master of Science.

——————————————–
Prof. Dr. Mübeccel Demirekler
Head of Department

This is to certify that we have read this thesis and that in our opinion it is
fully adequate, in scope and quality, as a thesis for the degree of Master of
Science.

——————————————– ——————————————–
Prof. Dr. İsmet Erkmen Assoc. Prof. Dr. Aydan Erkmen
Co-Supervisor Supervisor

Examining Committee Members


Prof. Dr. Erol Kocaoğlan ———————————–

Prof. Dr. Aydın Ersak ———————————–

Prof. Dr. İsmet Erkmen ———————————–

Assoc. Prof. Dr. Aydan Erkmen ———————————–

Asst. Prof. Dr. İlhan Konukseven ———————————–


ABSTRACT

INTELLIGENT GAIT CONTROL OF A MULTILEGGED ROBOT USED

IN RESCUE OPERATIONS

Karalarlı, Emre
M.S., Department of Electrical and Electronics Engineering

Supervisor: Assoc. Prof. Dr. Aydan Erkmen

Co-Supervisor: Prof. Dr. İsmet Erkmen

December 2003, 97 pages

In this thesis work an intelligent controller based on a gait synthesizer


for a hexapod robot used in rescue operations is developed. The gait synthe-

sizer adapts decisions drawn from insect-inspired gait patterns to the changing
needs of the terrain and of the rescue task. It is composed of three modules responsible

for selecting a new gait, evaluating the current gait, and modifying the rec-

ommended gait according to the internal reinforcements of past time steps. A


Fuzzy Logic Controller is implemented for selecting the new gaits.

Key words: Hexapod Walking Rescue Robots, Insect-inspired Gaits, Gait Syn-

thesizer, GARIC.

ÖZ

ÇOK BACAKLI KURTARMA ROBOTLARININ AKILLI YÜRÜYÜŞ

DENETİMİ

Karalarlı, Emre

Yüksek Lisans, Elektrik ve Elektronik Mühendisliği Bölümü


Tez Yöneticisi: Doç. Dr. Aydan Erkmen

Ortak Tez Yöneticisi: Prof. Dr. İsmet Erkmen

Aralık 2003, 97 sayfa

Bu tez çalışmasında kurtarma robotlarının akıllı yürüyüş denetimi için

bir yürüyüş şekli sentezleyicisi geliştirilmiştir. Sentezleyici değişen zemin


özelliklerine ve farklı kurtarma çalışmalarına cevap verebilmek için böceklerden

ilham alınan yürüyüş şekillerine göre karar vermektedir. Sentezleyici, yürüyüş

şekli belirleyici, değerlendirici ve değiştirici olmak üzere üç bölümden oluşur.


Belirleyici, bir bulanık mantık denetleyicisidir.

Anahtar Sözcükler: Altı Bacaklı Kurtarma Robotları, Böceklerden İlham

Alınmış Yürüyüş Şekilleri, Yürüyüş Şekli Sentezleyicisi, GARIC.

ACKNOWLEDGMENTS

I would like to express my gratitude to my supervisor Assoc. Professor

Dr. Aydan Erkmen and co-supervisor Prof. Dr. İsmet Erkmen for their

motivation, guidance, patience, and encouragement through the preparation

of this thesis. I also thank all my friends, especially Engür and Aslı Pişirici,

Mehmetçik and Semra Pamuk, Bora Sağdıçoğlu, and Sedat Ilgaz for their

invaluable comments and suggestions throughout the study. Finally, I express

my gratitude to my family for their endless support.

TABLE OF CONTENTS

ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

ÖZ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
TABLE OF CONTENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .vi

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .viii

CHAPTER

1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2. SURVEY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1 Search and Rescue Robotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.2 Legged Locomotion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8


2.2.1 Walking Mechanisms in Animals . . . . . . . . . . . . . . . . . . . . . . . 8

2.2.2 Control of Legged Robots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13


2.2.3 Gait Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2.4 Gait Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19


2.3 Mathematical Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.3.1 Neural-Fuzzy Controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.3.2 GARIC Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.3.3 Fuzzy Sets and Fuzzy Logic Controllers . . . . . . . . . . . . . . . 31


2.3.4 Reinforcement Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3. LEGGED ROBOT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .37

3.1 Dynamics and Coordinated Control of Legged Robots . . . . . . . 37

3.1.1 Motion Dynamics of Legged Robots . . . . . . . . . . . . . . . . . . . 38

3.1.2 Coordinated Control of Legged Robots . . . . . . . . . . . . . . . . 45

3.2 Gait Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50


3.2.1 Encoding the Gaits for a Multilegged Robot . . . . . . . . . . . 50
3.2.2 Gait Selection Module (GSM) . . . . . . . . . . . . . . . . . . . . . . . . . 57

3.2.3 Gait Evaluation Module (GEM) . . . . . . . . . . . . . . . . . . . . . . . 59


3.2.4 Gait Modifier Module (GMM) . . . . . . . . . . . . . . . . . . . . . . . . 60
3.2.5 The Complete Control Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . 62

4. HEXAPOD ROBOT SIMULATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4.1 Hexapod Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .64

4.2 Sensor System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.3 Kinematics of the Hexapod Robot . . . . . . . . . . . . . . . . . . . . . . . . . . 68


4.4 Uneven Terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

5. SIMULATION RESULTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

5.1 Exploration and Exploitation Dilemma in Reinforcement


Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

5.2 Smooth Terrain Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74


5.3 Performance on Rough Terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.4 Task Shapability: A Must for SAR Operations . . . . . . . . . . . . . . 79

6. CONCLUSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .87

6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .89

REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
APPENDICES

A. SIMULATION PROGRAM CD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

LIST OF FIGURES

FIGURES
2.1 Tripod (A) and tetrapod (B) support patterns (or support
polygons) formed by contact points of the supporting legs . . . . . . . 9

2.2 Wave gait patterns. Bold lines represent swing phase. L1


signifies the left front leg and R3 indicates the right hind leg . . . 16
2.3 Hexapod model. Dashed legs are in swing phase . . . . . . . . . . . . . . . .17

2.4 Summary of coordination mechanisms in the stick insect. The


pattern of coordinating influences among the step generators
for the six legs is shown at the left; the arrows indicate the
direction of the influence. The mechanisms are described
briefly at the right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.5 The GARIC architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26


2.6 The action evaluation network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.7 The action selection network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.8 General model of a fuzzy logic controller . . . . . . . . . . . . . . . . . . . . . . . 33

3.1 Coordinate frames defined for the legged robot. The coordinate
frame Cci is assigned such that the unit vector ẑ is normal to
the contact surface at the point of contact . . . . . . . . . . . . . . . . . . . . . 40

3.2 Architecture of Gait Synthesizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3.3 Summary of terminology used in gait analysis . . . . . . . . . . . . . . . . . . 52

3.4 Wave gait patterns. Bold lines represent swing phase. L1


signifies the left front leg and R3 indicates the right hind leg . . . 53

3.5 Antecedent Labels, fuzzification of individual leg position . . . . . . 54


3.6 Consequent Labels: task share based on operation modes . . . . . . .56

3.7 Gait Selection Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.8 Complete control cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.1 The hexapod robot used in simulation . . . . . . . . . . . . . . . . . . . . . . . . . .65

4.2 Hexapod model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.3 Each leg is identical and composed of three links. Pink legs are
in swing phase whereas blue ones are in stance . . . . . . . . . . . . . . . . . 67
4.4 Two different postures of the robot. Body level of the robot in
B is lowered in order to increase reachable space of the legs . . . . 68

4.5 The modelled uneven terrain. Different surface segments can be


seen in the figure. The holes on uneven terrain are modelled
by surface segments which are deeper than the legs can reach.
Notice that the pink leg (swinging) falls into such a segment . . . 70

5.1 Body speed versus time graphs for different scale factor and
threshold values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5.2 Comparison of resultant gaits when training is done according to


two different reinforcement for speed (first row) and critical
margin (second row). The first column gives the resultant
gaits, second one body speed versus time, and last column
shows critical margin in the direction of motion versus time . . . . 75
5.3 Internal reinforcement versus time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

5.4 Critical margin, Cm(t), versus time . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

5.5 Leg tip positions on x direction versus time. In order to increase


the critical margin gait synthesizer applies smaller step sizes . . . 78

5.6 Leg tip trajectories of the hexapod on x-z plane with a fixed
tripod gait on the defined terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

5.7 Leg tip trajectories of the hexapod on x-z plane with gait
synthesizer on the defined terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

5.8 Gait of the hexapod robot on uneven terrain. The robot


recovers tripod gait pattern after some time reaching the
smooth terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

5.9 Gait of the hexapod robot on uneven terrain. The robot


recovers tripod gait faster than the previous one . . . . . . . . . . . . . . . 84

5.10 Critical margin versus time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84


5.11 Leg tip positions on x direction versus time . . . . . . . . . . . . . . . . . . . . 85

5.12 Gait generated by gait synthesizer when leg R1 is missing . . . . . . 85

5.13 Gait generated by gait synthesizer in sudden lack of leg R1 . . . . 86

CHAPTER 1

INTRODUCTION

Recent experiences of natural disasters (earthquakes, tornados, floods)

and man-made catastrophes (e.g. urban terrorism) have brought attention to the
area of search and rescue (SAR) and emergency management. Horrible devastations
and losses have dramatically illustrated the damage that today's modern
industrialized countries can expect despite technological progress in
construction techniques [1]. Moreover, these experiences have shown that the
preparedness and emergency response of governments are inadequate to deal with
such devastation. As a result, the people who have died due to the lack of an
immediate response have inevitably forced us to find better solutions for
search and rescue.

The utilization of autonomous intelligent robots in search and rescue

(SAR) is a new and challenging field of robotics, dealing with tasks in ex-
tremely hazardous and complex disaster environments [2]. Autonomy, high

mobility, robustness, and reconfigurability are critical design issues of res-

cue robotics, requiring dexterous devices equipped with the ability to learn
from prior rescue experience, adaptable to variable types of usage with a wide

enough functionality under different sensing modules, and compliant to en-

vironmental and victim conditions. Intelligent, biologically inspired mobile
robots and, in particular, hexapod robots have turned out to be widely used
robot types besides serpentine mechanisms [3], providing effective, immediate,
and reliable responses to many SAR operations. Aiming at enhancing the
quality of rescue and life after rescue, the field of rescue robotics is seeking
shape-changing and, moreover, task-shapable intelligent dexterous devices.

The objective of this thesis is to design a gait synthesizer for 6-legged

walking robots with shape-shifting gaits that provide the necessary flexibility
and adaptability needed in the difficult workspaces of rescue missions. The gait

synthesizer is responsible for the locomotion of the robot providing a compro-

mise between mobility and speed while allowing task shapability to use some

legs as manipulators when the need arises during rescue. Legged robots are chosen

due to their advantage on rough terrains over their wheeled mobile counter-

parts [4], [5].

Wheeled locomotion is well suited for fast transportation. Wheels change


their point of support continuously and use friction to move forward in an
efficient way. Because of this, however, they require a continuous path and
hence a pre-constructed terrain, which restricts their mobility.

On the other hand, legged locomotion offers a significant potential for

mobility over natural rough terrains in comparison to wheeled or tracked lo-


comotion. Because legs can choose footholds to improve traction, minimize
lurching, and step over obstacles, they can cope with the softness and
unevenness of the terrain [4]. Legs can provide the capability of maneuvering
within confined areas of space. Unlike wheels, legs change their point of
support all at once and so do not need a continuous path. Also, as seen in
nature, legs are not used only for walking. Besides their main function, they
take part in almost every external activity of animals (as tactile sensors,
as manipulators, etc.).

However, legged locomotion possesses additional complexity in the coor-

dination control of the legs [6]. The control of a legged robot is a sophisticated

job due to the high number of degrees of freedom offered by the articulated

legs. In the design of a control structure of a legged robot on difficult rough

terrain there are many aspects that have to be dealt with simultaneously and

that also interfere with each other. For example, the movements of the legs
must be carefully coordinated in order to advance the body without causing
foot slippage; at each step, an appropriate foothold has to be found; the body
attitude must be set according to the terrain profile; stability must be
maintained; a navigation task must be accomplished; and so on. Here, a body
movement for terrain adaptation may change the operation space of a leg so
that the leg cannot reach a chosen foothold that was within range beforehand,
or, inversely, a decision to modify


the gait may solve a stability problem. So, while coordinating the movements

of body and legs, the control structure of the legged robot must also handle

such interferences.

In this thesis we focus on the gait control and leg coordination and em-

phasize the potential of redundancy of legs for handling irregularity on terrains

as well as their use as manipulators. In walking robots, coordinating the move-

ments of individual legs in order to maintain a stable gait is one of the main

control tasks. Observations on insect gaits (cockroaches, stick insects) show

that insects produce sequential movements starting with hind leg protraction

and followed by the middle and front legs, which is called metachronal wave

or wave gait [7]. Among the numerous periodic gaits, the class of wave gaits
is the most important because these gaits provide good stability [8]. The tripod
gait, which is a member of the wave gaits, involves an alternation between right-sided

and left-sided metachronal waves and it is the fastest gait. Gaits arise from
the interaction of individual leg oscillators (step pattern generators) which
govern the stepping of each leg by exchanging the influences of the legs [9].

The information transmitted from the step pattern generator depends upon
the leg's state (swing or stance, position, and velocity). Here, the position
information plays a particularly central role in coordination. Several researchers
have implemented insect-like controllers for leg coordination ([10], [11]), most
of which are oriented toward preserving the regularity of a fixed gait pattern
against perturbations.

In this thesis, we work on biologically inspired wave gait patterns. Gait

patterns are patterns of leg coordination which represent the relative phases
(swing or stance) of the legs in statically stable locomotion. These gaits have
different properties from the mobility and speed points of view. In our method
we encode gait pattern cycles from the relative positions of the legs and find
the individual legs' tasks within those gait patterns. The method enables
exploring among many different gait patterns and selecting gait patterns
according to different needs, in order to adapt online to terrain conditions.
This is the point where features of intelligent control are required.

Generalized approximate reasoning-based intelligent control (GARIC)

architecture [12] is one of the realizations of the fusion of fuzzy and neural
technologies guided by feedback from the environment. It presents a method
for learning and tuning fuzzy logic controllers (FLC) through reinforcement
signals. The basic idea behind fuzzy logic controllers is to incorporate
the "expert experience" of a human operator into the design of a controller
for a process whose input-output relationship is described by a collection of
fuzzy control rules (IF-THEN rules) involving linguistic variables rather
than a complicated dynamic model [13].

Our gait synthesizer adapts the GARIC architecture to our objective. The gait
synthesizer that we developed for serpentine locomotion [3], and here for
hexapod walking, consists of three modules. The Gait Evaluation Module (GEM)
acts as a critic and provides advice to the main controller, based on a
multilayer artificial neural network. The Gait Selection Module (GSM) offers a
new gait to be taken by the robot according to a fuzzy con-
troller with rules for different gait patterns in the knowledge base. The Gait

Modifier Module (GMM) changes the gait recommended by the GSM based on
internal reinforcement. This change in the recommended gait is more
significant for a state if that state does not receive high internal
reinforcements (i.e., the probability of failure is high). On the other hand,
if a state receives
high reinforcements, GMM administers small changes to the action selected

by the fuzzy controller embedded in the GSM. This reveals that the action is

performing well so that the GMM recommendation dictates no or only minor


changes to that gait.
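As a rough illustration only, the following sketch shows how such a three-module
cycle could be wired together in code; every function here is a toy placeholder
(the actual GSM, GEM, and GMM are described in Chapter 3), and the state, gait
encoding, and plant response are invented purely for the example.

import math
import random

# Toy stand-ins for the three modules; the real GSM is a fuzzy controller,
# the real GEM is a neural critic, and the real GMM perturbs the recommended gait.
def gsm_select_gait(state):                 # recommend a gait from the state
    return [0.5 * s for s in state]

def gem_evaluate(state):                    # critic: score the current state
    return -sum(abs(s) for s in state)

def gmm_modify(gait, internal_r):           # perturb more when reinforcement is low
    sigma = math.exp(-internal_r)
    return [g + random.gauss(0.0, sigma) for g in gait]

state, prev_v = [0.2, -0.1, 0.4], 0.0
for step in range(5):
    v = gem_evaluate(state)                 # evaluate the state reached so far
    internal_r = v - prev_v                 # crude internal reinforcement
    gait = gmm_modify(gsm_select_gait(state), internal_r)
    state = [s - 0.1 * g for s, g in zip(state, gait)]   # fake plant response
    prev_v = v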

The basic contribution of this thesis is the development of an intelligent

task shapable control, based on a gait synthesizer for a hexapod robot upon its
traversal of unstructured workspaces in rescue missions within disaster areas.

The gait synthesizer adapts decisions drawn from insect-inspired gait patterns
to the changing needs of the terrain and of the rescue task. The method provides
exploration among different gait patterns using the redundancy in multi-legged

structures.

The thesis is organized as follows: Chapter 2 covers a survey on legged

locomotion and gait analysis, and gives information about basic notions needed
throughout the thesis. Chapter 3 includes the dynamics and control of legged
robots, and the detailed description of the gait synthesizer. Chapter 4
introduces the simulation and Chapter 5 presents and discusses the results of

simulation. Chapter 6 covers the conclusion.

CHAPTER 2

SURVEY

2.1 Search and Rescue Robotics

The contribution of robotics technology to today's sophisticated tasks is an
inevitable development, leading to a gradual reduction of the human share,
mostly due to the saturation of improvements in human abilities or to the
complementation of human activities. Education and training are insufficient
for dealing with such complex and exhausting tasks [1]. Thus, from the
robotics point of view, the trend is to provide an intelligent, versatile tool
that can completely substitute for humans in risky operations and complement
human operations when auxiliary intelligent dynamics are required for extra
dexterity. As part of this progress, search and rescue (SAR) is one of the
most crucial fields that needs a robotics contribution.

Search and rescue (SAR) robotics can be defined as the utilization of

robotics technology for human assistance in any phase of SAR operations [2].
Robotic SAR devices have to work in extremely unstructured and technically

challenging areas shaped by natural forces. One of the major requirements


of rescue robot design is the flexibility of the design for different rescue us-

age in disaster areas of varying properties. Rescue robotic devices should be

adaptable, robust, and predictive in control when facing different and chang-

ing needs. Intelligent, biologically inspired mobile robots and, in particular,
hexapod walking robots have turned out to be widely used robot types besides
serpentine mechanisms, providing effective, immediate, and reliable responses
to many SAR operations.

2.2 Legged Locomotion

Legged locomotion offers a significant potential for mobility over highly

irregular natural rough terrains that are cut with ditches and highly
unpredictable, in comparison to wheeled or tracked locomotion [4], [5]. Legs
provide the capabilities of stepping over obstacles or ditches and of
maneuvering within confined areas of space. They can cope with the softness
and unevenness of the terrain. Besides their main function in locomotion, legs
take part in almost every external activity of animals. The articulated
structures of legs serve as manipulators to pull, push, hold, etc., or as
tactile sensors to explore the environment.

2.2.1 Walking Mechanisms in Animals

Millions of years of evolution have resulted in a large number of locomo-

tory designs for efficient, rapid, adjustable and reliable movement of the ani-
mals [15]. The major variations are observed in the number of legs (from two in

humans to about two hundred in a millipede), the length and shape (some spi-
ders possess extremely long and slender legs whereas hedgehogs have compara-

tively short legs), the positioning of the legs (insects carry their body between

the legs, whereas mammals tuck their legs underneath), and the type of skele-

ton (arthropods use an exoskeleton made of chitin-protein cuticle, whereas

vertebrates use an endoskeleton composed of bone). Despite this diversity,

legged locomotion in animals has some basic similarities in terms of
mechanics and control.

At its fundamental level, legs work in a cyclic manner to locomote. The

step cycle for an individual leg consists of two basic phases: the swing phase,

when the foot is off the ground and moving forward, and the stance phase,

when the foot is on the ground and the leg is moving backward with respect to

the body. The propulsive force for progression is developed during the stance

phase. A common feature of the step cycle in most animals (including man) is
that the duration of the swing phase remains comparatively constant
as walking speed varies. Accordingly, changes in the speed of progression are

produced primarily by changes in the time it takes for the legs to be retracted

during the stance phase [21].

Figure 2.1: Tripod (A) and tetrapod (B) support patterns (or support polygons)
formed by contact points of the supporting legs.

Animal locomotion can be classified into two categories according to the gait
they use [23]. The first type is the one exhibited by insects. Insects are
arthropods and have a hard exoskeletal system with jointed limbs. They use their

legs as struts and levers and the legs must always support the body during

walking, in addition to providing propulsion. In other words, the sequential

pattern of steps must ensure static stability. The vertical projection of the

center of gravity must therefore always be within the support pattern (the two
dimensional convex polygon formed by the contact points (Fig. 2.1)). This
kind of locomotion has been described as crawling, and the legs have to provide
at least a tripod of support at all times. Another kind of locomotion may be
observed in humans, horses, dogs, cheetahs, and kangaroos, which have a more
flexible structure. These animals require dynamic balance, which is a less
stringent restriction on the posture and the gait of the animal. The animal

may not be in static equilibrium. On the contrary, there may be periods of

time when none of the support legs are on the ground as is observed in trotting

horses, running humans, and hopping kangaroos.

The mechanism by which the nervous system generates the cyclic move-
ments of the legs during walking is basically the same in animals [23], [21]. The

first significant efforts analyzing the nervous system came at the beginning of
the 1900s with the work of two British physiologists, C. S. Sherrington and T.
Graham Brown [21]. Sherrington first showed that rhythmic movements could

be elicited from the hind legs of cats and dogs some weeks after their spinal

cord had been severed. Since the operation had isolated the nervous center that
controls the movement of the hind legs from the rest of the nervous system, he
showed that the

higher levels of the nervous system are not necessary for the organization of

stepping movements. He explained the generation of rhythmic leg movements

by a series of "chain reflexes" (a reflex being a stereotyped movement elicited
by the stimulation of a specific group of sensory receptors). Thus he conceived
that the sensory input generated during any part of the step cycle elicits the
next part of the cycle by a reflex action, producing in turn another sensory
signal that elicits the next part of the cycle, and so on. Graham Brown, on the
other hand,
demonstrated that rhythmic contractions of leg muscles, similar to those that
occur during walking, could be induced immediately following transection of

the spinal cord even in animals in which all input from sensory nerves in the
legs had been eliminated. So, Graham Brown claimed that mechanisms located en-
tirely within the spinal cord are responsible for generating the basic rhythm

for stepping in each leg.

Actually these two concepts are not incompatible, but neither provides a

complete explanation by itself [21]. Further experiments in a number of labo-


ratories have yielded results that strongly support the dual view of the nervous

mechanisms involved in walking. Both approaches have attractive features as


models for understanding how neural systems produce behavior. If walking is

the consequence of complete motions (central pattern), then it is much easier


to see how phase coordination of multiple legs is possible. On the other hand,
it is more difficult to see how adaptation to details of the terrain is possible

when walking is composed of complete motions. This state of affairs is re-

versed when the model is based on reflexes. The consensus that evolved was
that aspects of both models are important to the control of locomotion and

that neither was completely correct by itself [21]. Thus, our gait synthesizer
combines both: sensory effects and environmental task performance serve as
reinforcement, while a simple neural structure provides phase coordination of
the multiple legs of our robot. Moreover, the resulting system is reflexive
enough to adapt to the sud-

den unevenness of the terrain in rescue operations.

The process that gives rise to locomotion involves a complicated control
system [16]. Motor output is constantly modified by both neural and mechanical
feedback. Specialized circuits within the nervous system, called central
pattern generators (CPGs), produce the rhythmic oscillations that drive motor

neurons of limb and body muscles in animals as diverse as leeches, slugs, lam-

preys, turtles, insects, birds, cats, and rats. Although CPGs may not require
sensory feedback for their basic oscillatory behavior, such feedback is essen-

tial in structuring motor patterns as animals move. This influence may be so

strong that certain sensory neurones should be viewed not as modulators but
as integral members of a distributed pattern-generating network that comprises
both central and peripheral neurones. This is the main motivation behind our
gait synthesizer learning to select gait patterns while other parts of the
synthesizer learn to evaluate performance based on sensory data and modify
these patterns when necessary. More specifically, the Gait Selection Module
(GSM, Section 3.2.2) in our architecture acts as the CPG of real animals.

As a result of studies on animal locomotion, a few themes emerge. First,

the dynamics of locomotion is complicated but built on a few common principles,
including common mechanisms of energy exchange and the use of force for
propulsion, stability, and maneuverability. Second, the locomotory performance
of animals in natural habitats reflects trade-offs between different
ecologically important aspects of behavior and is affected by the physical

properties of the environment. Third, the control of locomotion is not a lin-

ear cascade, but a distributed organization requiring both feedforward motor


patterns and neural and mechanical feedback. Fourth, muscles perform many

different functions in locomotion, a view expanded by the integration of muscle

physiology with whole-animal mechanics (muscles can act as motors, brakes,


springs, and struts).

Because machines face the same physical laws and environmental con-
straints that biological systems face when they perform similar tasks, the solu-

tions they use may embrace similar principles. Legged machines have a lot to
learn from nature. But the evolutionary pressures that dictate the morphology
and physiology of animals do not always give suitable results for our tasks.
For example, 40% of the body mass of a shrimp is devoted to the large, tasty
abdominal muscles that produce a powerful tail flick during rare, but critical,
escape behaviors [16]. The imitation of such a body design will surely result
in an inefficient machine. The consequence is that the information taken from
nature must be processed and the fundamental principles

must be defined. That is why we concentrated on redundant legged robots


and, more specifically, on six-legged ones.

2.2.2 Control of Legged Robots

The main challenge for legged robots is the control system. A system

that controls such a robot accomplishes several tasks [5]. First, it regulates
the robot's gait, that is, the sequence and way in which the legs share the task
of locomotion. For example, six-legged robots work with gaits that elevate a
single leg at a time or two or three legs simultaneously. A gait that elevates

several legs at once generally makes it possible to travel faster but offers less

stability than a gait that keeps more legs on the ground.

A second task is to keep the robot from tipping over. For vehicles using
static stability, if the center of gravity of the robot moves beyond the base
of support provided by the legs, the robot will tip. So, the location of the
center of gravity with respect to the placement of the feet must be
continuously monitored by the robot. In our control structure, static
stability is provided by ensuring safety margins from physical limits, such as
the distance of the center of gravity from the boundary of the support polygon
and the distance of the legs from their reach limits during the support phase.
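As an aside, the static-stability check described here reduces to a small
geometric computation. The helper below is a minimal sketch (assuming the
stance feet are given as the counter-clockwise vertices of a convex support
polygon in the horizontal plane); it is illustrative only and is not the
controller code used in this thesis.

from math import hypot

def stability_margin(cg_xy, feet_xy):
    """Shortest distance from the CG's vertical projection to the edges of the
    support polygon; negative if the projection falls outside (robot tips).
    Assumes feet_xy lists the stance-foot contacts in counter-clockwise order."""
    cx, cy = cg_xy
    margin = float("inf")
    inside = True
    n = len(feet_xy)
    for i in range(n):
        (x1, y1), (x2, y2) = feet_xy[i], feet_xy[(i + 1) % n]
        ex, ey = x2 - x1, y2 - y1
        # signed distance of the CG from this edge (positive = inside for CCW order)
        d = (ex * (cy - y1) - ey * (cx - x1)) / hypot(ex, ey)
        inside = inside and d >= 0.0
        margin = min(margin, abs(d))
    return margin if inside else -margin

# tripod support: three stance-foot contacts (arbitrary example coordinates)
print(stability_margin((0.0, 0.0), [(0.3, 0.2), (-0.3, 0.2), (0.0, -0.25)]))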

Since many legs share the support of the body, a third task is to dis-

tribute the support load and the lateral forces among the legs. Smoothness of
the ride and minimal disturbance of the ground are the main objectives during
this task. In this thesis work, the smoothness of the legged robot is provided
by applying the periodic wave gait patterns of insects. Perturbations from the
ground to the robot are compensated for by choosing proper gaits during
locomotion.

A fourth task is to make sure the legs are not driven past their limits
during their travel. The geometry of the legs may make it possible for one leg
to bump into another. The control system must take into account the limits of

the leg’s motion and the expected motion of the robot during that leg’s stance

period. In our robot the legs’ operation areas are restricted such that they do

not overlap.

A fifth task is to choose places for stepping that will give adequate sup-

port. For this task, a sensor system that scans the ground ahead of the robot
is required. This system builds an internal digital model of the terrain and
processes it to find suitable footholds. Here, the softness of the terrain may
cause problems. In the gait synthesizer we developed, a task-oriented internal
model is learned during the learning process of gait evaluation.

We perform these five tasks (which are related to locomotion) on a hexa-

pod robot by focusing on gait control. In other words, our solutions to the

problems in the overall control of the hexapod robot are based on gait con-

trol. For the rest of the tasks, which depend on the application, we just show
the potential of the gait synthesizer. Specifically, we will show that the gait
synthesizer is capable of adapting to rescue operations where a leg of the
hexapod is used as a manipulator while the rest provide mobility. However, the
key challenge in legged robots is to control individual components (legs) for
cooperative manipulation, while obtaining their cooperation for walking as an
integrated whole. This is the motivation behind this thesis work.
integrated whole. This is behind the motivation of this thesis work.

2.2.3 Gait Analysis

In this thesis we focus on gaits of legged robots. A gait is a sequence

of leg motions coordinated with a sequence of body motions for the purpose

of transporting the body of the legged system from one place to another [8].

Gait analysis is one of the fundamental areas in the study of walking robots.
It is important because it is the major factor that affects the geometric and

control design of a walking robot [30]. In general, there are two types of gaits:

periodic and non-periodic gaits [8].

Figure 2.2: Wave gait patterns. Bold lines represent swing phase. L1 signifies the
left front leg and R3 indicates the right hind leg [7].

Periodic gaits are those in which a specific pattern of leg movement is

imposed. Observations on insect gaits (cockroaches, stick insects) show that

insects produce sequential movements starting with hind leg protraction and
followed by the middle and front legs, which is called metachronal wave or
wave gait [7]. The slowest gait involves an alternation between right-sided and

left-sided metachronal waves (Fig. 2.2A). As these waves overlap (Fig. 2.2B

to 2.2E), tetrapod gaits (Fig. 2.2C, 2.2D) and the typical tripod gait (Fig.

2.2E) are generated. The tripod gait (observed in hexapod insects such as
cockroaches) is the fastest statically stable gait that a six-legged mechanism
can use. In the tripod gait, three legs that enclose the center of gravity

support the body while the other legs simultaneously lift and recover. Peri-
odic gaits offer good mobility over smooth terrain since they possess optimum

stability. However, the terrain irregularities that can be dealt with by these
gaits are relatively limited. If the terrain irregularity is severe, such as in
natural disaster areas, periodic gaits become ineffective, and special gaits
need to be developed. These are non-periodic gaits. Work in this area comprises
studies on free gaits [32] and large obstacle gaits [30]. Free gaits are gaits
in which any leg is permitted to move at any time [31]. In the free gait
approach, a finite set of gait states is defined and control is done on a
rule-based principle, resulting in simple motions lacking smoothness. Our gait
control approach takes advantage of these gaits in order to achieve smooth and
adaptive locomotion over unpredictable terrain roughness.

Figure 2.3: Hexapod model. Dashed legs are in swing phase.

Fig. 2.3 shows a hexapod model. The leg order as labelled in Fig. 2.3
is adopted throughout our thesis work. Below are some terms used in gait

analysis [8], [10], [30]:

1. Protraction: The leg moves towards the front of the body.

2. Retraction: The leg moves towards the rear of the body.

3. Stance phase: The leg is on the ground where it supports and propels

the body. In forward walking, the leg retracts during this phase. Also

called power stroke or support phase.

4. Swing phase: The leg lifts and swings to the starting position of the next

stance phase. In forward walking, the leg protracts during this phase.
Also called the return stroke or recovery phase.

5. Cycle time: The time for a complete cycle of leg locomotion of a periodic
gait.

6. Duty factor of a leg: The time fraction of a cycle time in which the leg

is in the support phase.

7. Phase of a leg: The fraction of a cycle period by which the placement of the

leg lags behind the placement of the reference leg.

8. Support Polygon: Two dimensional point set in a horizontal plane con-

sisting of the convex hull of the vertical projection of all foot points in
support phase (Fig. 2.3).

9. Stability Margin (Sm): The shortest distance of the vertical projection


of center of gravity to the boundaries of the support pattern in the hor-

izontal plane.

10. Front and Rear Boundary: The boundaries of the support polygon that are

respectively ahead of and behind the projection of the center of gravity in

forward walking and that intersect the longitudinal body axis.

11. Front and Rear Stability Margin (Front and Rear Sm): The distances
from the vertical projection of the center of gravity to the front and rear

boundaries of the support polygon respectively, in forward walking.

12. Kinematic Margin (Km): The distance from the current foothold of a
stance leg to the border of its reachable area in the opposite direction of

body motion (Fig. 2.3).

13. Anterior Extreme Position (AEP): In forward walking, this is the target
position of the advance degree of freedom during recovery phase. It is

the foremost position a leg reaches during a cycle.

14. Posterior Extreme Position (PEP): In forward walking, this is the target

position of the swing degree of freedom during support phase. It is the

backmost position a leg reaches during a cycle.

15. The Stroke distance (Sd): The distance between Anterior Extreme Point
(AEP) and Posterior Extreme Point (PEP).
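To make the notions of phase and duty factor above concrete, the following
sketch prints a support/swing diagram in the spirit of Fig. 2.2 for a tripod
gait, assuming a simple phase-offset model in which each leg is in stance for
the first duty-factor fraction of its own (shifted) cycle; the numeric values
are illustrative only.

def leg_state(t, phase, duty_factor, cycle_time=1.0):
    """Return 'stance' or 'swing' for a leg with the given phase offset.
    The leg is in stance for the first duty_factor fraction of its own cycle."""
    local = ((t / cycle_time) + phase) % 1.0
    return "stance" if local < duty_factor else "swing"

# Tripod gait: {L1, R2, L3} in antiphase with {R1, L2, R3}, duty factor 1/2.
phases = {"L1": 0.0, "R2": 0.0, "L3": 0.0, "R1": 0.5, "L2": 0.5, "R3": 0.5}
beta = 0.5
for leg, ph in phases.items():
    diagram = "".join("-" if leg_state(k / 20.0, ph, beta) == "stance" else "#"
                      for k in range(40))   # two cycles, '#' marks swing
    print(f"{leg}: {diagram}")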

2.2.4 Gait Control

One of the aspects related to the control of legged robots is the generation
of stable gaits [31]. The task of the gait generation mechanism can be defined
as selecting an appropriate coordination sequence of leg and body movements
so that the robot advances with a desired speed and direction. Gait generation
for robots with six or more legs has been addressed in several research
efforts, which we overview here.

The principle stated by experimental studies of walking in insects is that

gaits arise from the interaction of individual leg oscillators (step pattern gen-
erators) which govern the stepping of each leg by exchanging the influences
of the legs [9]. The information transmitted from the step pattern generator

depends upon the leg’s state (either swing or stance, position and velocity).
Here the position information plays a particularly central role in coordination.
We also take this into consideration in our work.

Several versions of this interleg coordination principle have been investigated and

implemented in insect-inspired walking robots. Pearson [21] proposed that

modification of the walking coordination may occur through load sensors in


the leg’s chordotonal organ and position information from the campaniform

sensillae. This model formed the basis of Beer’s simulation of cockroach be-
haviors [18] where the effect of load and position sensors was simulated by

forward and backward angle sensor "neurons" as well as ground contact and
stance and swing "neurons" within a distributed neural network control archi-
tecture. This basic model was then implemented on a walking robot with two

degrees of freedom per leg [19].

A more complex interleg coordination model is proposed by [20] and


[24]. Together they identify at least six mechanisms that work between legs in

a stick insect. A summary of the coordination mechanisms in the stick insect is

shown in Fig. 2.4. The arrows indicate the direction of influences which estab-
lish the coordination of the legs providing stability. In [24], [25] most of these

mechanisms are simulated and some of them have also been implemented on a

robot with two dof per leg [26] and two robots with three dof per leg [27]. In

Figure 2.4: Summary of coordination mechanisms in the stick insect. The pattern
of coordinating influences among the step generators for the six legs is shown at
the left; the arrows indicate the direction of the influence. The mechanisms are
described briefly at the right [24].

the implementations, interleg coordination mechanisms operate by modifying

the PEP (the AEP and PEP are used as the switching points between swing and
stance phases, and the AEP is set to a constant value) of a receiving leg depending

upon the state of a sending leg.

In [10], Ferrell compares different insect-inspired gait controllers. The

most important feature of these implementations is that they are highly dis-

tributed. But, much is still unknown about the general dynamical behavior

of the models and dependence of this behavior on parameters [34]. So, pa-
rameters associated with the model must be tuned heuristically to achieve a

desired behavior. However, one of the major requirements of rescue robot de-

sign is the flexibility of the design for different rescue usage in disaster areas

of varying properties [2]. Our work on gait control offers such flexibility by
adapting decisions drawn from insect-inspired gait patterns to the changing
needs of the terrain and of the rescue task.

In the literature, some of the complete walking robot designs do not offer
remarkable approaches for gait control [37], [11]. They usually apply fixed
gait patterns (especially the tripod). But some research still focuses on the
subject. In [30], Choi and Song deal with obstacle-crossing gaits. Their study
presents fully automated gaits that can be used to cross four types of
simplified obstacles: grade, ditch, step, and isolated wall. After the type
and dimensions of an obstacle are entered, the system generates a series of
pre-programmed movements that enable a hexapod to cross over the obstacle in


a fully automated mode. Our approach provides obstacle crossing by trying

different gaits rather than imposing pre-programmed movements.

In [36] a gait state definition is presented as a function of the last steps

executed. They identify several classes of gait states and transitions between

them. They show that, independently of the initial posture of the robot, by
executing a sequence of gait states the robot ends up in one of four situations
according to the number of legs in contact, and the tripod gait can be
obtained.

Yang and Kim focus on robustness to leg damage in walking machines and deal
with fault-tolerant gaits [28]. These are gaits that maintain stability in
static walking against a fault event that prevents a leg from reaching the
support state. In [29], they successfully implement a fault-tolerant gait over
uneven terrain. In our gait control approach we do not distinguish the gaits
according to their fault tolerance, but we enable the controller to search

for the gait that will solve the problem.

In [6], Celaya and Porta present a complete control structure for the lo-

comotion of a legged robot on uneven terrain. In the gait generation they use
two rules by which different gaits, including the complete family of wave
gaits, can be obtained with a proper initial state. The first rule, 'never
have two neighboring legs raised from the ground at the same time', guarantees
static stability. The second rule, 'a leg should perform a step when this is
allowed by the first rule and its neighboring legs have stepped more recently
than it has', forces the alternation of the steps of any pair of neighboring
legs. These two rules are local, so no central synchronization

is required.
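One possible reading of these two local rules in code is sketched below; the
leg adjacency table and the "time of last step" bookkeeping are assumptions
made for the illustration, not the authors' implementation.

# Leg labels as in Fig. 2.3; the neighbours of a leg are the adjacent
# ipsilateral and contralateral legs of the hexapod.
NEIGHBOURS = {
    "L1": ["L2", "R1"], "L2": ["L1", "L3", "R2"], "L3": ["L2", "R3"],
    "R1": ["R2", "L1"], "R2": ["R1", "R3", "L2"], "R3": ["R2", "L3"],
}

def may_step(leg, raised, last_step):
    """Rule 1: never have two neighbouring legs raised at the same time.
    Rule 2: step only if every neighbour has stepped more recently."""
    rule1 = all(n not in raised for n in NEIGHBOURS[leg])
    rule2 = all(last_step[n] > last_step[leg] for n in NEIGHBOURS[leg])
    return rule1 and rule2

last_step = {"L1": 3, "L2": 5, "L3": 4, "R1": 6, "R2": 2, "R3": 5}
raised = set()
print([leg for leg in NEIGHBOURS if may_step(leg, raised, last_step)])

With these example values, the legs allowed to step happen to form one of the
tripod sets, which hints at how the tripod gait can emerge from purely local
rules.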

In [33], a modified version of the Q-learning approach is used for the
decentralized control of the robot Kafka. Each leg, which can be in one of a
finite number of states, has its own look-up table and can communicate with
the others. Based on the states of the legs and of those to which they are
coupled, actions are chosen according to these look-up tables. The modified
Q-learning approach is

employed to search for a set of actions resulting in successful walking gaits.

Parker et al. utilize the Cyclic Genetic Algorithm (CGA) to produce

gaits for a hexapod robot [40]. The approach to generate a gait is to develop
a model capable of representing all states of the robot and use a cyclic genetic

algorithm to train this model to walk forward. CGA is developed as a modi-

fication of the standard genetic algorithm. The CGA incorporates time into
the chromosome structure by assigning each gene a task to be accomplished
in a set amount of time. Also, some portions of the chromosome (tasks) are
repeated, creating a cycle. This allows the chromosome to represent a program
that has a start section and an iterative section. In [40] it is shown that with

only minimal a priori knowledge the optimal tripod gait for a hexapod robot
can be produced.

In [11], a survey of different approaches for gait generation can be found.

Among the methods in the literature, our gait synthesizer has a hybrid
structure. The interaction of legs (mutual inhibitions and excitations) in
biological systems results in the observed gait patterns. Without implementing
the described interleg mechanisms, we work on these patterns so that we make
use of this biological background. At the same time, our system enables the
flexibility of non-periodic gaits by allowing any leg to move out of the
pattern when needed. In our approach, both the terrain conditions and
performance criteria determine the gait to be applied.

2.3 Mathematical Background

2.3.1 Neural-Fuzzy Controllers

Neural Fuzzy Controllers (NFCs), based on a fusion of ideas from fuzzy


control and neural networks, possess the advantages of both neural networks

(e.g., learning abilities, optimization abilities, and connectionist structures)

and fuzzy control systems (e.g., humanlike IF-THEN rule thinking and ease

of incorporating expert knowledge) [13]. Fuzzy systems and neural networks

share the common ability to improve the intelligence of systems working in

an uncertain, imprecise, and noisy environment. The main purpose of a neural
fuzzy control system is to apply neural learning techniques to find and tune
the parameters and/or structure of the neuro-fuzzy control system. Some of the

works in this area are Generalized Approximate Reasoning based Intelligent


Control (GARIC) [12], Fuzzy Adaptive Learning Control Network (FALCON)
[52], Adaptive Neuro Fuzzy Inference System (ANFIS) [53], and Neuro-Fuzzy

Control (NEFCON) [54]. In our work we adopted the GARIC architecture in

order to develop the gait synthesizer for our multilegged robot.

2.3.2 GARIC Architecture

Generalized approximate reasoning-based intelligent control (GARIC),


introduced by Berenji and Khedkar [12], is a neural fuzzy control system with

reinforcement learning capability. GARIC presents a method for learning and

tuning fuzzy logic controllers (FLC) through reinforcement signals. It consists

of three modules (Fig. 2.5): an action evaluation network (AEN) that maps
a state vector and a failure signal into a scalar score (internal reinforcement)
indicating the goodness of the state, an action selection network (ASN) that

maps a state vector into a recommended action using fuzzy inference, and a

stochastic action modifier that produces the actual action based on internal re-

inforcement. Learning occurs by fine-tuning the free parameters in the two

networks: in the AEN, the weights are adjusted; in the ASN, the parameters
describing the fuzzy membership functions are changed.

Figure 2.5: The GARIC architecture [12].

Action Evaluation Network

The AEN constantly predicts reinforcements associated with different

input states. It is a two-layer feedforward network with direct interconnections

from the input nodes to the output node (Fig. 2.6). The input to the AEN is the state
of the plant, and the output is an evaluation of the state (or equivalently, a

prediction of the external reinforcement signal) denoted by v(t). The output


of each node in the AEN is calculated by the following equations

y_i(t) = g\left( \sum_{j=1}^{n} a_{ij}(t)\, x_j(t) \right) \qquad (2.1)

v(t) = \sum_{i=1}^{n} b_i(t)\, x_i(t) + \sum_{i=1}^{n} c_i(t)\, y_i(t) \qquad (2.2)

where

g(s) = \frac{1}{1 + e^{-s}} \qquad (2.3)
is the sigmoid function, v is the prediction of the reinforcement signal, and
a_{ij}, b_i, and c_i are the corresponding link weights, shown as A, B, and C in Fig. 2.6.

Figure 2.6: The action evaluation network.

This network evaluates the action recommended by the action network

as a function of the failure signal and the change in state evaluation based on
the state of the system at time t:

\hat{r}(t) = \begin{cases} 0 & \text{start state} \\ r(t) - v(t-1) & \text{failure state} \\ r(t) + \gamma\, v(t) - v(t-1) & \text{otherwise} \end{cases} \qquad (2.4)
where 0 ≤ γ ≤ 1 is the discount rate. In other words, the change in the value

of v plus the value of the external reinforcement constitutes the heuristic or

internal reinforcement, r̂, where the future values of v are discounted more the
further they are from the current state of the system.

Learning in AEN is based on internal reinforcement, r̂(t). If r is positive,

the weights are altered so as to increase the output v for positive input, and

vice versa. Therefore, the equations for updating the weights are as follows:

b_i(t) = b_i(t-1) + \beta\, \hat{r}(t)\, x_i(t-1) \qquad (2.5)

c_i(t) = c_i(t-1) + \beta\, \hat{r}(t)\, y_i(t-1) \qquad (2.6)

a_{ij}(t) = a_{ij}(t-1) + \beta_h\, \hat{r}(t)\, y_i(t-1)\,\big(1 - y_i(t-1)\big)\, \mathrm{sgn}\big(c_i(t-1)\big)\, x_j(t-1) \qquad (2.7)

where \beta > 0 and \beta_h > 0 are constant learning rates.
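To make equations (2.1)–(2.7) concrete, here is a minimal numerical sketch of
one AEN evaluation-and-update step; the network sizes, learning rates, state
vectors, and external reinforcement are arbitrary illustration values, and the
internal reinforcement is taken from the non-failure case of (2.4).

import numpy as np

n, h = 4, 5                        # state size and hidden-unit count (arbitrary)
rng = np.random.default_rng(0)
A, B, C = rng.normal(size=(h, n)), rng.normal(size=n), rng.normal(size=h)
beta, beta_h, gamma = 0.1, 0.05, 0.9

def aen_forward(x):
    y = 1.0 / (1.0 + np.exp(-A @ x))   # eq. (2.1): sigmoid hidden layer
    v = B @ x + C @ y                  # eq. (2.2): state evaluation
    return y, v

x_prev = rng.normal(size=n)
y_prev, v_prev = aen_forward(x_prev)
x, r = rng.normal(size=n), 0.0         # next state and external reinforcement
_, v = aen_forward(x)
r_hat = r + gamma * v - v_prev         # eq. (2.4), non-failure case

# eqs. (2.5)-(2.7): updates driven by the internal reinforcement
# (A is updated first so that sgn(C) uses the previous value of C, as in (2.7))
A += beta_h * r_hat * np.outer(y_prev * (1 - y_prev) * np.sign(C), x_prev)
B += beta * r_hat * x_prev
C += beta * r_hat * y_prev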

Action Selection Network

As shown in Fig. 2.7, the ASN is a five layer network with each layer
performing one stage of the fuzzy inference process. The functions of each
layer are briefly described here.

• Layer 1: An input layer that just passes input data to the next layer.

• Layer 2: Each node in this layer functions as an input membership func-

tion. Here triangular membership functions are used:







\mu_V(x) = \begin{cases} 1 - |x - c|/s_L & x \in [c - s_L,\, c] \\ 1 - |x - c|/s_R & x \in [c,\, c + s_R] \\ 0 & \text{otherwise} \end{cases} \qquad (2.8)

where V = (c, s_L, s_R) indicates an input linguistic value, and c, s_L, s_R
correspond to the center, left spread, and right spread of the triangular
membership function \mu_V, respectively.

Figure 2.7: The action selection network.

• Layer 3: Each node in this layer represents a fuzzy rule and imple-
ments the conjunction of all the preconditions in the rule. Its output

w_r, indicating the firing strength of this rule, is calculated by the following

continuous, differentiable softmin operation:

w_r = \frac{\sum_i \mu_i\, e^{-k\mu_i}}{\sum_i e^{-k\mu_i}} \qquad (2.9)

where µi is the output of a layer 2 node, which is the degree of matching

between a fuzzy label occurring as one of the preconditions of rule r

and the corresponding input variable. The parameter k controls the


hardness of the softmin operation, and as k → ∞ we recover the usual

min operator. However, for k finite, we get a differentiable function of


the inputs, which makes it convenient for calculating gradients during

the learning process. The choice of k is not critical.

• Layer 4: Each node in this layer corresponds to a consequent label.

For each of the wr supplied to it, this node computes the corresponding
output action as suggested by rule r. This mapping is written as \mu^{-1}(w_r),
where the inverse is taken to mean a suitable defuzzification procedure

applicable to an individual rule. For triangular functions,

\mu_Y^{-1}(w_r) = c + 0.5\,(s_R - s_L)\,(1 - w_r) \qquad (2.10)

where Y = (c, s_L, s_R) indicates a consequent linguistic value.

• Layer 5: Each node in this layer is an output node that combines the

recommendations from all the fuzzy control rules using the following
weighted sum:

F = \frac{\sum_r w_r\, \mu^{-1}(w_r)}{\sum_r w_r} \qquad (2.11)

In the ASN, adjustable weights are present only on the input links of
layers 2 and 4. The other weights are fixed at unity.
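The following sketch walks through layers 2–5 numerically: triangular
memberships as in (2.8), the softmin of (2.9), rule-local defuzzification as in
(2.10), and the weighted combination of (2.11). The two rules, their labels,
and all parameter values are invented purely for illustration.

import numpy as np

def tri(x, c, sL, sR):                      # eq. (2.8): triangular membership
    if c - sL <= x <= c:
        return 1.0 - abs(x - c) / sL
    if c < x <= c + sR:
        return 1.0 - abs(x - c) / sR
    return 0.0

def softmin(mus, k=10.0):                   # eq. (2.9): differentiable "min"
    mus = np.asarray(mus)
    w = np.exp(-k * mus)
    return float(np.sum(mus * w) / np.sum(w))

def defuzz(w, c, sL, sR):                   # eq. (2.10): rule-local defuzzification
    return c + 0.5 * (sR - sL) * (1.0 - w)

x1, x2 = 0.3, -0.2
# Each rule: (antecedent memberships over x1 and x2, consequent label (c, sL, sR)).
rules = [
    ([tri(x1, 0.0, 1.0, 0.5), tri(x2, 0.0, 0.5, 0.5)], (-1.0, 0.3, 0.6)),
    ([tri(x1, 1.0, 0.7, 1.0), tri(x2, -0.5, 0.5, 0.5)], (1.0, 0.6, 0.3)),
]
ws = [softmin(mus) for mus, _ in rules]     # layer 3: rule firing strengths
outs = [defuzz(w, *lab) for w, lab in rules]  # layer 4: per-rule actions
F = sum(w * o for w, o in zip(ws, outs)) / sum(ws)   # eq. (2.11): layer 5 output
print(round(F, 3))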

Stochastic Action Modifier

In the GARIC architecture, the output of the ASN is not applied to the
environment directly. The Stochastic Action Modifier (SAM) uses the value of r̂
from the previous time step and the action F recommended by the ASN to
stochastically generate an action, F′, which is a Gaussian random variable with
mean F and standard deviation σ(r̂(t − 1)). This σ(·) is some nonnegative,
monotonically decreasing function, e.g. exp(−r̂). The action F′ is what is actually

applied to the plant. The stochastic perturbation in the suggested action leads

to a better exploration of state space and better generalization ability. When


r̂(t − 1) is low, meaning the last action performed is bad, the magnitude of
the deviation |F′ − F| is large, whereas the controller remains consistent with

the fuzzy control rules when r̂(t − 1) is high. The actual form of the function
σ(·), especially its scale and rate of decrease, should take the units and range
of variation of the output variable into account.
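A minimal sketch of this perturbation step, taking σ(r̂) = exp(−r̂) as suggested
in the text (the numeric values are only illustrative):

import math
import random

def sam(F, r_hat_prev):
    """Perturb the ASN recommendation F with noise that shrinks as the
    previous internal reinforcement grows (sigma = exp(-r_hat))."""
    sigma = math.exp(-r_hat_prev)
    return random.gauss(F, sigma)

random.seed(1)
print(sam(0.4, r_hat_prev=2.0))    # high reinforcement: stays close to F
print(sam(0.4, r_hat_prev=-1.0))   # low reinforcement: explores more widely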

In GARIC, the goal of calculating F values in ASN is to maximize

the evaluation of the gait, v, determined by AEN. The gradient information

∆p = δv/δp (p is the vector of all adjustable weights in ASN) is estimated by


stochastic exploration in the Stochastic Action Modifier (SAM). The modifi-

cation implemented in t − 1 by SAM is judged by r̂(t). If r̂ > 0, meaning the


modified F 0 (t − 1) is better than expected, then F (t − 1) is moved closer to

the modified one, and vice versa.

2.3.3 Fuzzy Sets and Fuzzy Logic Controllers

Fuzzy sets, introduced by Zadeh in 1965 as a mathematical way to rep-

resent vagueness in linguistics, can be considered a generalization of classical

set theory [47]. In a classical set, the membership of an element is crisp; it is

either yes (in the set) or no (not in the set). A crisp set can be defined by the

so-called characteristic function (or membership function). The characteristic


function \mu_A(x) of a crisp set A is given as

\mu_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}
Fuzzy set theory extends this concept by defining partial memberships,

which can take any value ranging from 0 to 1:

µA (x) : U → [0, 1]

where U refers to the universal set defined in a specific problem.

Fuzzy logic was one of the major developments of Fuzzy Set Theory and

was primarily designed to represent and reason with knowledge that cannot

be expressed by quantitative measures. The main idea of algorithms based on

fuzzy logic is to imitate the human reasoning process to control ill-defined or


hard-to-model plants. Fuzzy inference systems model the qualitative aspects
of human knowledge through linguistic if-then rules. Every rule has two parts:

an antecedent part (premise), expressed by if..., which is the description of the

state of the system, and a consequent part, expressed by then..., which is the
action that the operator who controls the system must take.

We can use fuzzy sets to represent linguistic variables. Linguistic vari-

ables represent the process states and control variables in a fuzzy controller.

Their values are defined in linguistic terms and they can be words or sentences

in a natural or artificial language.

The most important operators in classical set theory with crisp sets are
complement, intersection, and union. These operations are defined in fuzzy

logic via membership functions. The membership values in a complement

subset Ā are

µĀ (x) = 1 − µA (x)

which corresponds to the same operation in the classical theory. For the inter-

section of two fuzzy sets various operators have been proposed (min operator,
algebraic product, bounded product,...). The min operator for two fuzzy sets
A and B is given as

\mu_A(x) \text{ and } \mu_B(x) = \min\{\mu_A(x), \mu_B(x)\}

For the union of two fuzzy sets, there is a class of operators named t-conorms

or s-norms. One of the most used in the literature is the max operator:

\mu_A(x) \text{ or } \mu_B(x) = \max\{\mu_A(x), \mu_B(x)\}
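As a small illustration (a sketch, not part of the thesis software), these three operations reduce to elementary arithmetic on membership values:

```python
def mu_not(mu_a):            # complement: 1 - mu_A(x)
    return 1.0 - mu_a

def mu_and(mu_a, mu_b):      # intersection with the min operator (a t-norm)
    return min(mu_a, mu_b)

def mu_or(mu_a, mu_b):       # union with the max operator (an s-norm)
    return max(mu_a, mu_b)

# With mu_A(x) = 0.7 and mu_B(x) = 0.4:
# mu_not(0.7) -> 0.3, mu_and(0.7, 0.4) -> 0.4, mu_or(0.7, 0.4) -> 0.7
```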

Figure 2.8: General model of a fuzzy logic controller.

A typical FLC architecture comprises four principal components (Fig. 2.8): a fuzzifier, a fuzzy rule base, an inference engine, and a defuzzifier. The

fuzzifier performs the fuzzification that converts the data from the sensor mea-

surements into proper linguistic values of fuzzy sets through predefined input

membership functions. In Fuzzy Rule Base, fuzzy control rules are character-

ized by a collection of fuzzy IF-THEN rules in which the preconditions and


consequents involve linguistic variables. This collection of fuzzy control rules

(or fuzzy control statements) characterizes the simple input-output relation of
the system. The inference engine matches the output of the fuzzifier with

the fuzzy logic rules and perform fuzzy implication and approximate reasoning

to decide a fuzzy control action. Finally, the defuzzifier performs the function
of defuzzification to yield a nonfuzzy (crisp) control action from an inferred
fuzzy control action through predefined output membership functions.

The principal elements of designing a FLC include defining input and out-

put variables, deciding on the fuzzy partition of the input and output spaces

and choosing the membership functions for the input and output linguistic

variables, deciding on the types and derivation of fuzzy control rules, design-

ing the inference mechanism, and choosing a defuzzification operator [13].

2.3.4 Reinforcement Learning

Reinforcement learning is an approach to artificial intelligence that em-

phasizes learning by the individual from its interaction with its environment

[13]. The environment supplies a time varying vector of input to the system,

receives its time varying vector of output or action and then provides a time
varying scalar reinforcement signal. Here, the reinforcement signal r(t) can be

one of the following forms: a two-valued number r(t) ∈ {-1, 1} or {-1, 0} such that r(t) = 1 (or 0) means "success" and r(t) = -1 means "failure"; a multi-valued discrete number in the range [-1, 1] or [-1, 0], for example r(t) ∈ {-1, -0.5, 0, 0.5, 1}; or a real number r(t) ∈ [-1, 1] or [-1, 0], which represents a more detailed and continuous degree of failure or success. We also assume that r(t) is the
reinforcement signal available at time step t and is caused by the inputs and

actions at time step (t-1) or even affected by earlier inputs and actions.

A challenging problem in reinforcement learning is that there may be a

long time delay between a reinforcement signal and the actions that caused it.

In such cases a temporal credit assignment problem results because we need to

assign credit or blame, for an eventual success or failure, to each step individ-

ually in a long sequence. An approach to solve such problem is based on the


temporal difference methods [41]. TD methods consist of a class of incremental

learning procedures specialized for prediction problems. TD methods assign

credit based on the difference between temporally successive predictions. The

main characteristic of these methods is that it is not required to wait until the

actual outcome is known.

The object of learning is to construct an action selection policy that optimizes the system's performance. A natural measure of performance is the discounted cumulative reinforcement (utility [38])

V_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k}    (2.12)

where Vt is the discounted cumulative reinforcement starting from time t

throughout the future, rt is the reinforcement received after the transition

from time t − 1 to t, and 0 ≤ γ ≤ 1 is a discount factor, which adjusts the

importance of long term consequences of actions. In the approach to solve the

temporal credit assignment problem, the aim is to learn an evaluation func-

tion to predict the discounted cumulative reinforcement. The evaluation function

V_x^\pi is the expected discounted cumulative reinforcement that will be received


starting from state x, or simply the utility of state x. The evaluation function

is represented using connectionist networks (evaluation network or critic) and
learned using a combination of temporal difference methods and error back-

propagation algorithm. TD methods compute the error called the TD error

between temporally successive predictions, and the backpropagation algorithm


minimizes the error by modifying the weights of the networks.

Let p_t be the output of the evaluation network, which denotes the estimate at time step t for the evaluation function V_x^\pi, given the state x_t, and let r_t be the actual cost incurred between time steps t − 1 and t. Then p_{t−1} predicts

\sum_{k=0}^{\infty} \gamma^k r_{t+k} = r_t + \gamma p_t    (2.13)

In this case the prediction error (TD error) which is the difference between

estimated evaluation and actual evaluation would be

(rt + γpt ) − pt−1 (2.14)

This method is used for prediction problems in which exact success or

failure may never become completely known.
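The prediction error of eq. 2.14 and the resulting update of a critic can be written compactly. The linear critic below is only an illustrative assumption (the gait synthesizer of chapter 3 uses a two-layer network trained by backpropagation).

```python
import numpy as np

def td_error(r_t, p_t, p_prev, gamma=0.9):
    """TD error between temporally successive predictions (eq. 2.14)."""
    return r_t + gamma * p_t - p_prev

def td0_step(w, x_prev, x_t, r_t, lr=0.05, gamma=0.9):
    """One TD(0) update of a linear evaluation p(x) = w . x."""
    p_prev, p_t = float(w @ x_prev), float(w @ x_t)
    delta = td_error(r_t, p_t, p_prev, gamma)
    return w + lr * delta * x_prev   # move the old prediction toward r_t + gamma * p_t
```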

CHAPTER 3

LEGGED ROBOT

3.1 Dynamics and Coordinated Control of Legged Robots

The dynamics of a robotic system play a central role in both its control

and simulation. When studying the control of robots, the primary problem,

which must be solved, is known as Inverse Dynamics. Solution of the in-


verse dynamics problem requires the calculation of the actuator torques and/or

forces that will produce a prescribed motion in the robotic system. In the area of simulation, on the other hand, the fundamental problem to be solved is called Forward or Direct Dynamics. Solution of this problem requires the determination
ward or Direct Dynamics. Solution of this problem requires the determination


of the joint motion, which results from a given set of applied joint torques
and/or forces.

The overall mechanism of a legged robot is a closed-chain comprised of a

body with supporting legs. The kinematic relations between the leg joint mo-

tion and the body motion are complicated. The additional complexity arises
because the chains (legs) of the system are coupled through the body.

In the approach presented here the resemblance between the control of

legged robots and the manipulation of objects by multi-fingered robot hands

is considered. The dynamics and control of grasping are developed in various

prior works [48], [51]. We adapt these concepts here to legged robots. The

basics of the mathematical background given in this section can be found in


[42], [44], [50]. Note that these analyses are valid for legged robots using static balance, where the body is continuously supported by at least three legs constituting a support polygon.

The dynamics and control algorithm presented here must be considered within a complete control system for a legged robot, including navigation, terrain adaptation, etc. Because these concepts are outside the scope of this thesis work, we only give the algorithm; simulations for the gait synthesizer will be implemented with the simpler model described in chapter 4.

3.1.1 Motion Dynamics of Legged Robots

We first derive equations concerned with moving coordinate frames. Let C1 and C2 be two coordinate frames. We denote by p12 ∈ R^3 and R12 ∈ SO(3) (a 3 × 3 orthogonal matrix, R^{-1} = R^T) the position and orientation of C2 relative to C1. Besides, we denote by v12 = \dot{p}_{12} and w12 = S^{-1}(\dot{R}_{12} R_{12}^T) (or \dot{R}_{12} = S(w_{12}) R_{12}) the translational and rotational velocity of C2 relative to C1,

where S is an operator defined by

w = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}, \quad S(w) = \begin{bmatrix} 0 & -w_3 & w_2 \\ w_3 & 0 & -w_1 \\ -w_2 & w_1 & 0 \end{bmatrix}

which clearly satisfies

S(w) f = w \times f \quad \text{and} \quad A S(w) A^T = S(Aw) \quad \text{for all } A \in SO(3),\ w, f \in R^3.

Now consider three coordinate frames C1 , C2 , and C3 . The position and


orientation of C3 relative to C1 is given by [50]

p13 = p12 + R12 p23 (3.1)

R13 = R12 R23 (3.2)

Then translational velocity of C3 relative to C1 is obtained by

v13 = ṗ13 = ṗ12 + Ṙ12 p23 + R12 ṗ23 (3.3)

which is

v13 = v12 − S(R12 p23 )w12 + R12 v23 (3.4)

To see this, we observe that

\dot{R}_{12} p_{23} = S(w_{12}) R_{12} p_{23}
= (R_{12} R_{12}^T) S(w_{12}) R_{12} p_{23}
= R_{12} S(R_{12}^T w_{12}) p_{23}
= R_{12} (R_{12}^T w_{12}) \times p_{23}
= R_{12} (-p_{23}) \times (R_{12}^T w_{12})
= -R_{12} S(p_{23}) R_{12}^T w_{12}
= -S(R_{12} p_{23}) w_{12}
By differentiating both sides of equation 3.2, we also obtain rotational
velocity of C3 relative to C1 :

Ṙ13 = Ṙ12 R23 + R12 Ṙ23 (3.5)

S(w13 )R13 = S(w12 )R12 R23 + R12 S(w23 )R23 (3.6)

S(w13 )R13 = S(w12 )R13 + S(R12 w23 )R13 (3.7)

w13 = w12 + R12 w23 (3.8)

by the transformation

R_{12} S(w_{23}) R_{23} = R_{12} S(w_{23}) (R_{12}^T R_{12}) R_{23} = S(R_{12} w_{23}) R_{13}
Then the generalized velocity of C3 relative to C1 is given in matrix form by

\begin{bmatrix} v_{13} \\ w_{13} \end{bmatrix} = \begin{bmatrix} I & -S(R_{12} p_{23}) \\ 0 & I \end{bmatrix} \begin{bmatrix} v_{12} \\ w_{12} \end{bmatrix} + \begin{bmatrix} R_{12} & 0 \\ 0 & R_{12} \end{bmatrix} \begin{bmatrix} v_{23} \\ w_{23} \end{bmatrix}    (3.9)

Figure 3.1: Coordinate frames defined for the legged robot. The coordinate frame
Cci is assigned such that the unit vector ẑ is normal to the contact surface at the
point of contact.

In Fig. 3.1 the coordinate frames Cw , CB , Cbi , Cti , and Cci denote

respectively the inertial base frame, the body coordinate frame attached to

the center of mass of the body, the leg base frame of leg i, the leg tip frame

of leg i, and the local frame at the contact point of leg i. For the relations of

these coordinate frames we know that ptc = 0, and Cc and Cb are fixed with

respect to Cw and CB , respectively (vwc = wwc = vBb = wBb = 0). Besides,


according to equation 3.9 the following relations exist:

\begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix} = \begin{bmatrix} I & -S(R_{bt} p_{tc}) \\ 0 & I \end{bmatrix} \begin{bmatrix} v_{bt} \\ w_{bt} \end{bmatrix} + \begin{bmatrix} R_{bt} & 0 \\ 0 & R_{bt} \end{bmatrix} \begin{bmatrix} v_{tc} \\ w_{tc} \end{bmatrix} = \begin{bmatrix} v_{bt} \\ w_{bt} \end{bmatrix} + \begin{bmatrix} R_{bt} & 0 \\ 0 & R_{bt} \end{bmatrix} \begin{bmatrix} v_{tc} \\ w_{tc} \end{bmatrix}    (3.10)

\begin{bmatrix} v_{Bc} \\ w_{Bc} \end{bmatrix} = \begin{bmatrix} R_{Bb} & 0 \\ 0 & R_{Bb} \end{bmatrix} \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.11)

\begin{bmatrix} v_{wc} \\ w_{wc} \end{bmatrix} = \begin{bmatrix} I & -S(R_{wB} p_{Bc}) \\ 0 & I \end{bmatrix} \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} + \begin{bmatrix} R_{wB} & 0 \\ 0 & R_{wB} \end{bmatrix} \begin{bmatrix} v_{Bc} \\ w_{Bc} \end{bmatrix} = 0    (3.12)

-\begin{bmatrix} I & -S(R_{wB} p_{Bc}) \\ 0 & I \end{bmatrix} \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} R_{wb} & 0 \\ 0 & R_{wb} \end{bmatrix} \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.13)

-\begin{bmatrix} R_{wb}^T & -R_{wb}^T S(R_{wB} p_{Bc}) \\ 0 & R_{wb}^T \end{bmatrix} \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.14)

-T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.15)

where T denotes the transformation matrix appearing in equation 3.14. Moreover, the velocity of the leg tip frame, Ct, is related to the velocity of the leg joints, \dot{q}, by the leg Jacobian,

\begin{bmatrix} v_{bt} \\ w_{bt} \end{bmatrix} = J(q) \dot{q}    (3.16)
In this analysis, we consider the following contact models for the leg tip-

terrain interactions: a) a point contact without friction, b) a point contact

with friction, c) a soft contact, d) a rigid contact. These contact models give

rise to contact constraints specified by

• vzi = 0, for a point contact without friction.

• vxi = vyi = vzi = 0, for a point contact with friction.

• vxi = vyi = vzi = 0 and wzi = 0, for a soft contact.

• vxi = vyi = vzi = 0 and wxi = wyi = wzi = 0, for a rigid contact.

For each of the contact models, substituting the above contact constraints and equation 3.16 into equation 3.10 we have

B^T \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix} = B^T J(q) \dot{q}    (3.17)

where B is the basis matrix defined in [49] representing the model contact constraints. For example, for a point contact with friction

B^T = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix}    (3.18)
Substituting equation 3.17 into equation 3.15 we have

-B^T T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = B^T J(q) \dot{q}    (3.19)

-G^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = J_{leg}(q) \dot{q}    (3.20)
Dual to generalized velocity, a generalized force (or wrench) can be writ-

ten as

F_{13} = \begin{bmatrix} f_{13} \\ \tau_{13} \end{bmatrix}    (3.21)
where τ13 ∈ R3 and f13 ∈ R3 are the torque and the linear force about the

origin of C3 relative to coordinate frame C1 , respectively.

Generalized force can be defined by examining the work produced by a


virtual displacement. A virtual displacement is an instantaneous infinitesimal
displacement du. The work produced by a virtual displacement, virtual work,

is denoted by δW , where δW = F · du. We use the principle of virtual work


to find generalized force relations. The work performed, which has units of

energy, must be the same regardless of the coordinate system within which
it is measured or expressed [45]. The virtual work done by an infinitesimal

displacement of the body with respect to Cw is

\delta W = \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} \cdot \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix}
where we have represented the dot product in the virtual work equation using

the transpose operation. Alternatively, the virtual work done by the corre-

sponding infinitesimal displacement of the Cc with respect to Cb is

\delta W = \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix} \cdot \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix} = \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix}^T \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}
By the principle of virtual work, these two formulations of the work performed are equal:

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix}^T \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.22)

and substituting equation 3.15 into 3.22 we have

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}^T = \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix}^T (-T)    (3.23)

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} = (-T^T) \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix}    (3.24)
For a given contact model, let ni denote the total number of independent

contact wrenches that leg i can apply to the terrain. For example, ni = 1 for a

point contact without friction (i.e., a force in the normal direction), and ni = 3

for a point contact with friction (i.e., a force in the normal direction plus two
components of frictional forces). Note that ni is just the number of contact

constraints corresponding to the contact model. According to equation 3.24


the resulting generalized force from applied contact force of the leg i can be

expressed as

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} = -T^T B x_i    (3.25)

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} = -G x_i    (3.26)

where x_i ∈ R^{n_i} is the magnitude vector of applied contact forces (generalized) along the basis directions of B. Equations 3.20 and 3.26 provide valid relations
if the leg remains in contact with the surface and there is no slipping. A com-
mon way to guarantee no slipping is to ensure that the contact forces lie within

the friction cone at the point of contact-that is, the tangential component of
the contact force is less than or equal to the coefficient of friction µ times the

normal component of the contact force.

Finally, for n supporting legs (i = 1, \cdots, n) we define

Q = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix}, \quad F_T = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad J_T = \begin{bmatrix} J_{leg1} & J_{leg2} & \cdots & J_{legn} \end{bmatrix}, \quad G_T = \begin{bmatrix} G_1 & G_2 & \cdots & G_n \end{bmatrix}

Then equations 3.20 and 3.26 can be concatenated for i = 1, \cdots, n to give

J_T(Q) \dot{Q} = -G_T^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix}    (3.27)

-G_T F_T = \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}    (3.28)

We have derived the force, torque, and velocity relations from legs to leg-tips and leg-tips to body.

3.1.2 Coordinated Control of Legged Robots

In this section, we develop the control algorithm for the coordinated


control of the robot legs. The goal of the control scheme is to specify a set of

control inputs for the leg motors so that the body undergoes a desired motion.

The control scheme we develop in this section is based on the computed torque
methodology.

By differentiating equation 3.27 we have

J_T(Q) \ddot{Q} + \dot{J}_T(Q) \dot{Q} = -\dot{G}_T^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} - G_T^T \begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix}    (3.29)

\ddot{Q} = J_T^+(Q) \left( -\dot{G}_T^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} - G_T^T \begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} - \dot{J}_T(Q) \dot{Q} \right) + \ddot{Q}_o    (3.30)

Here J_T^+(Q) is the pseudo-inverse satisfying J^+ = J^T (J J^T)^{-1}, and \ddot{Q}_o \in N(J_T) is the internal motion of redundant joints not affecting the body motion.

The dynamics of the body expressed in the inertial base frame Cw is given by the Newton-Euler equation as [51]

\begin{bmatrix} I_m & 0 \\ 0 & I_w \end{bmatrix} \begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} + \begin{bmatrix} 0 \\ w_{wB} \times I_w w_{wB} \end{bmatrix} = \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}    (3.31)

Here

I_m = \begin{bmatrix} m_B & 0 & 0 \\ 0 & m_B & 0 \\ 0 & 0 & m_B \end{bmatrix}

where m_B is the body mass, I_w = R_{wB} I_o R_{wB}^T is the body inertia matrix expressed in Cw, and I_o is the body inertia matrix expressed in CB. Also from equation 3.28 we have

F_T = -G_T^+ \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} + F_{To}    (3.32)

where G_T^+ is the pseudo-inverse of G_T and F_{To} is the internal leg force not affecting the body motion. Combining equation 3.31 and equation 3.32 yields

F_T = -G_T^+ \left( \begin{bmatrix} I_m & 0 \\ 0 & I_w \end{bmatrix} \begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} + \begin{bmatrix} 0 \\ w_{wB} \times I_w w_{wB} \end{bmatrix} \right) + F_{To}    (3.33)
In order to specify an orientation trajectory in terms of the rotation

matrix RwB (t) we parameterize SO(3) so that RwB = RwB (Υ) where Υ ∈ R3

is taken as yaw α(t), pitch β(t), and roll γ(t) coordinates of the body. Given

this parametrization, there exists a linear transformation p(\Upsilon) such that [42]:

w = \begin{bmatrix} w_x \\ w_y \\ w_z \end{bmatrix} = \begin{bmatrix} c\gamma c\beta & -s\gamma & 0 \\ s\gamma s\beta & c\gamma & 0 \\ -s\beta & 0 & 1 \end{bmatrix} \begin{bmatrix} \dot{\alpha} \\ \dot{\beta} \\ \dot{\gamma} \end{bmatrix} = p(\Upsilon) \dot{\Upsilon}    (3.34)

where

\Upsilon = \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix}

So the acceleration of the body is given as

\begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} = P \begin{bmatrix} \ddot{p}_{wB} \\ \ddot{\Upsilon}_{wB} \end{bmatrix} + \dot{P} \begin{bmatrix} \dot{p}_{wB} \\ \dot{\Upsilon}_{wB} \end{bmatrix}    (3.35)

where

P = \begin{bmatrix} I & 0 \\ 0 & p(\Upsilon) \end{bmatrix}
We define the position error ep ∈ R6 of the body in a given desired
trajectory as

e_p = \begin{bmatrix} p_{wB}^d \\ \Upsilon_{wB}^d \end{bmatrix} - \begin{bmatrix} p_{wB} \\ \Upsilon_{wB} \end{bmatrix}

where

\begin{bmatrix} p_{wB}^d(t) \\ \Upsilon_{wB}^d(t) \end{bmatrix}

is the desired body trajectory.

In order to reduce the position error, we apply joint torques of the legs to make the acceleration of the body satisfy the equation

\begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} = P \left( \begin{bmatrix} \ddot{p}_{wB}^d \\ \ddot{\Upsilon}_{wB}^d \end{bmatrix} + k_v \dot{e}_p + k_p e_p \right) + \dot{P} \begin{bmatrix} \dot{p}_{wB} \\ \dot{\Upsilon}_{wB} \end{bmatrix}    (3.36)
where kv and kp are scalars chosen such that the characteristic roots of

ëp + kv ėp + kp ep = 0 have negative real parts.

The dynamics of the ith leg manipulator with l links is given by

H_i(q) \ddot{q} + C_i(q, \dot{q}) \dot{q} = \tau_i - J^T(q) B x_i

where

\tau = l × 1 vector of joint torques,
q, \dot{q}, \ddot{q} = l × 1 vectors of joint positions, velocities, and accelerations,
H(q) = l × l joint space inertia matrix, both symmetric and positive definite,
C(q, \dot{q}) = l × l matrix of Coriolis and centripetal force terms.
We define

H = \begin{bmatrix} H_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & H_n \end{bmatrix}, \quad C = \begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_n \end{bmatrix}, \quad \tau = \begin{bmatrix} \tau_1 \\ \tau_2 \\ \vdots \\ \tau_n \end{bmatrix}

Then the leg dynamics can be grouped for i = 1, \cdots, n to yield

H(Q) \ddot{Q} + C(Q, \dot{Q}) \dot{Q} = \tau - J_T^T(Q) F_T    (3.37)

Thus the resultant control law is specified by substituting equations 3.30, 3.33, 3.36, and 3.37:

\tau = D P \left( \begin{bmatrix} \ddot{p}_{wB}^d \\ \ddot{\Upsilon}_{wB}^d \end{bmatrix} + k_v \dot{e}_p + k_p e_p \right) + D \dot{P} \begin{bmatrix} \dot{p}_{wB} \\ \dot{\Upsilon}_{wB} \end{bmatrix} + E    (3.38)

where

D = -H J_T^+ G_T^T - J_T^T G_T^+ \begin{bmatrix} I_m & 0 \\ 0 & I_w \end{bmatrix}

and

E = -H J_T^+ \dot{G}_T^T P \begin{bmatrix} \dot{p}_{wB} \\ \dot{\Upsilon}_{wB} \end{bmatrix} - H J_T^+ \dot{J}_T \dot{Q} + C \dot{Q} - J_T^T G_T^+ \begin{bmatrix} 0 \\ w_{wB} \times I_w w_{wB} \end{bmatrix}

All the terms in equation 3.38 are functions of state variables Q, Q̇, pwB ,

ΥwB , vwB , and Υ̇wB .
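For readability, the control law of equation 3.38 can be assembled numerically as in the following sketch. Every argument is a hypothetical, consistently dimensioned quantity evaluated at the current state; the function is illustrative, not the controller implemented in this thesis.

```python
import numpy as np

def computed_torque(H, C, Qdot, J_T, Jdot_T, G_T, Gdot_T, I_m, I_w, w_wB,
                    P, Pdot, rate6, acc_des6, e_p, edot_p, kv, kp):
    """Evaluate tau of eq. 3.38 from the quantities defined in section 3.1."""
    Jp = np.linalg.pinv(J_T)                         # J_T^+
    Gp = np.linalg.pinv(G_T)                         # G_T^+
    M = np.block([[I_m, np.zeros((3, 3))],
                  [np.zeros((3, 3)), I_w]])          # body inertia of eq. 3.31
    bias = np.concatenate([np.zeros(3), np.cross(w_wB, I_w @ w_wB)])
    D = -H @ Jp @ G_T.T - J_T.T @ Gp @ M
    E = (-H @ Jp @ Gdot_T.T @ P @ rate6
         - H @ Jp @ Jdot_T @ Qdot
         + C @ Qdot
         - J_T.T @ Gp @ bias)
    # desired body acceleration plus PD correction of the tracking error (eq. 3.36)
    return D @ (P @ (acc_des6 + kv * edot_p + kp * e_p) + Pdot @ rate6) + E
```

Here rate6 stands for the stacked vector of \dot{p}_{wB} and \dot{\Upsilon}_{wB}; the pseudo-inverses correspond to J_T^+ and G_T^+ above.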

3.2 Gait Controller

Our gait controller is based on a gait synthesizer which is adapted from the Generalized Approximate Reasoning-based Intelligent Control (GARIC) architecture [12] to our objective. GARIC presents a method for learning and tuning fuzzy logic controllers (FLC) through reinforcement signals. The gait

synthesizer (Fig. 3.2) consists of three modules. The Gait Evaluation Module (GEM) acts as a critic and provides advice to the main controller, based on a multilayer artificial neural network. The Gait Selection Module (GSM) decides on a new gait to be undertaken by the robot according to an ANN representation of a fuzzy controller with as many hidden units as there are rules in the knowledge base. The Gait Modifier Module (GMM) changes the gait recommended by the GSM based on internal reinforcement. This change in the recommended gait is more significant for a state if that state does not receive high internal reinforcements (i.e., the probability of failure is high). On


the other hand, if a state receives high reinforcements, GMM administers small

changes to the action selected by the fuzzy controller embedded in the GSM.

This reveals that the action is performing well so that the GMM recommen-
dation dictates no or only minor changes to that gait. The actions for the gait

synthesizer are the gaits recommending an operation mode (defined in section


3.3.1) for each leg.

3.2.1 Encoding the Gaits for a Multilegged Robot

Our gait synthesizer works on gait patterns that need to be coded.

Gait patterns are patterns of leg coordination which represents relative phases

50
Figure 3.2: Architecture of Gait Synthesizer.

(swing phase or stance phase) of legs. For legged robots using static balance,
the typical feature of these gait patterns is that in any phase of the pattern
the robot ensures static stability. In the gait synthesizer we work on wave gait

patterns which are observed in insect walking. As stated in chapter 2 these


gaits consist of metachronal waves on both sides of the robot and differ from each other in the amount of overlapping. So different wave gait patterns can

be derived by changing this amount. Among numerous gait patterns we choose

the ones including groups of legs which are in phase. For instance, the tripod

gait which is special in these patterns (an alternation between right-sided and
left-sided metachronal waves) naturally have two group of legs involving three

legs in phase.

In the encoding of the gaits our goal is to find a modelling method for

Figure 3.3: Summary of terminology used in gait analysis.

all gait patterns from which a leg task can be obtained. In other words for a

given state (which at least includes the phase, position and velocity of each leg

for proprioceptive level of control) we want to find both which gait pattern the current state belongs to and which phase of that pattern it is in. We make use of the position information of the legs to recognize the gait patterns.

In the encoding process we divide the stroke distance (Fig. 3.3) of a leg into overlapping grids for both swing and stance phases as in Fig. 3.5. Here the linguistic values {A, B, . . . , L, M} are "author-defined" fuzzy partitions of the stroke distance with triangular member-

ship functions. The tripod gait of Fig. 3.4E can now be coded

with the sequence: (F, A, F, A, F, A) →(G, B, G, B, G, B) → ... →

(E, J, E, J, E, J)→ (F, A, F, A, F, A) → . . ., or the gait pattern in Fig.


3.4D with the sequence: (K, C, A, C, A, K) → (L, D, B, D, B, L) →

{(M, E, C, E, C, M )or(A, K, C, K, C, A)} → (B, L, D, L, D, B) →

. . . → (D, B, L, B, L, D) → {(E, C, M, C, M, E)or(K, C, A, C, A, K)} →

(L, D, B, D, B, L) → . . .. In all gait-sequence-encoding the fraction of cy-

Figure 3.4: Wave gait patterns. Bold lines represent swing phase. L1 signifies the
left front leg and R3 indicates the right hind leg [7].

cle periods for stance and swing must be incorporated in the model. As in Fig.

3.4, in the tripod gait, a stance phase is half of a whole leg cycle whereas in tetrapod gaits it is two thirds of the leg cycle (the so-called duty factor
described in chapter 2). Leg sequences defining gait patterns have to be also

modelled by leg cycles. For a portion of a cycle, a leg is either in stance, swing

or in transition (end of swing or end of stance). Thus we construct rules as:


if leg R3 is in E, and, R2 in C, R1 in M , L3 in C, L2 in M , L1 in E, then

R3 is in transition, R2 in stance, R1 in transition, L3 in stance, L2 in transi-

tion, and L1 in transition. Here being in A, for example, means that the leg is in stance in the current state and has partial belonging to the fuzzy linguistic value A, whereas the consequent (or then part) of the rule prescribes the legs' next "state". With the given partitioning, 10 rules cover a tripod gait pattern and

9 for tetrapod gait patterns. The significance of this fuzzy modelling is that

Figure 3.5: Antecedent Labels, fuzzification of individual leg position.

individual leg phases are found from a gait pattern cycle which is determined

from relative positions of the legs.

For uneven terrain conditions, we define four ”operation modes” of a leg:

1. First mode labelled as -2: The leg is responsible for supporting the body.

2. Second mode labelled as -1: The leg switches to the third mode provided that the legs in the first mode alone provide static stability; otherwise the leg participates in the supporting legs. These legs are candidates for

swing phase among stance legs.

3. Third mode labelled as 2: The leg is responsible for full recovery, such that if it encounters an obstacle it will try to handle it.

4. Fourth mode labelled as 1: The leg tip will descend to the ground until

the tip touches the terrain and switches to the first mode.

In both mode labels 2 and 1 (modes will be mentioned with their labels from

now on), the leg will go on recovery if it is within the limits of its operation

space. These four modes constitute leg states from control point of view that

we need to distinguish for a leg within the cooperative action of walking. At

Anterior Extreme Point (AEP), mode 2 automatically switches to mode 1.

Furthermore, the binary data from static stability check for mode -2 legs and

tip contact (a protracting leg switches to retraction when it finds a foothold


that it can safely support the body) clearly determine the switching from mode

-1 to mode 2 and from mode 1 to mode -2, respectively.

Beside the leg/leg coordination, leg/body coordination is required for a


regular gait. Movement of each leg can be characterized by a position p ∈ R
and a velocity ṗ ∈ {vstance (vst ), vswing (vsw )} according to direction of body mo-

tion in leg centered coordinates. When a leg is in protraction, it is lifted from

the ground and swings forward relative to the body with a constant velocity
vsw > 0. When a leg is in retraction, it is on the ground, providing support and

swinging backward relative to the body with a velocity vst < 0 (for straight
line walking this velocity is equal to minus body velocity with respect to the

ground, vB ). As in many walking animals, vsw is relatively constant while

vst varies according to walking speed. In other words considering Fig 3.4, the

body or retraction velocity is a fraction of protraction velocity and the fraction

is directly proportional to number of support legs over number of swing legs.

For instance in tripod gait this ratio is one and the velocities are equal. So in
our controller the body velocity for a time step is taken as:

v_B(t) = \left( \sum v_{st}(t-1) \cdot \Delta t \right) / (nost)    (3.39)
where nost is the number of stance legs. There are two parameters to be con-

sidered concerning velocity: static stability margin and kinematic margin of

stance legs (Fig. 3.3). The minimum of these margins (let us call critical mar-

gin, Cm) determines the distance that the robot can travel without violating

a physical constraint. So additionally vB is set to zero when Cm is zero in


speed control.
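A sketch of the speed policy implied by equation 3.39 together with the critical-margin check (variable names are illustrative, not taken from the thesis code):

```python
def body_speed(stance_tip_speeds, dt, critical_margin):
    """Body speed for one time step (eq. 3.39), forced to zero when the
    critical margin Cm (min of stability and kinematic margins) is exhausted."""
    nost = len(stance_tip_speeds)
    if nost == 0 or critical_margin <= 0.0:
        return 0.0
    return sum(v * dt for v in stance_tip_speeds) / nost
```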

Figure 3.6: Consequent Labels: task share based on operation modes.

For the gait synthesizer, a gait is the ”task sharing” of the legs accom-

plishing a coordinated body movement. For instance, if a leg is on AEP, it

is clear that it can only be used for stance (no share for swing) such that if

it is presently in swing phase it must take a transition to stance. However in

uneven terrain conditions where there is no fixed leg cycle for individual legs,
it is difficult to assign in a deterministic way a leg share within the limits.

In our controller, we introduce a linguistic variable task share, Mleg (t), taking
linguistic values {Stance (St), Swing (Sw), Transition (Tr)} with triangular

membership functions shown in Fig. 3.6. The values (−2, −1, 0, 1, 2) are cho-

sen according to labels of operation modes which consider cyclic behavior of


the legs. By changing the overlapping areas and phase difference of the left-
and right-sided metachronal waves we form 9 tetrapod gaits. According to

the method mentioned above, we construct 91 (9 × 9 + 10) rules for all gaits
belonging to the wave gait class. With the membership functions in Figs. 3.5,

3.6 we constitute the fuzzy rules for the rule base of the GSM of the gait syn-
thesizer where triggered rules recommend a value for task share of each leg.

3.2.2 Gait Selection Module (GSM)

GSM determines the recommended task share for each leg, Mleg (t), in a
fuzzy decision process where inferencing is done based on the fuzzy rule base.

Mleg values define a measure to distinguish the two switching points between modes -2 and -1 and modes 2 and 1 during walking. Two thresholds T2,1 and T−2,−1 determine the mode of the legs. For the legs in stance, legs with Mleg(t) < T−2,−1 are determined as mode -2 legs and legs with Mleg(t) ≥ T−2,−1 as mode -1. Likewise, for the legs in swing, legs with Mleg(t) > T2,1 are determined as mode 2 legs and those with Mleg(t) ≤ T2,1 as mode 1. The effect of these threshold

values on the decision process is analyzed in simulation.
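The clustering into operation modes can be written as a simple threshold test; the function below is a sketch, and the default thresholds are the near-optimum values reported later in chapter 5 (T2,1 = 0.1, T−2,−1 = −0.1), used here purely for illustration.

```python
def assign_mode(m_leg, in_stance, t_21=0.1, t_m21=-0.1):
    """Map a task-share value M_leg(t) to one of the four operation modes."""
    if in_stance:
        return -2 if m_leg < t_m21 else -1   # full support vs. swing candidate
    return 2 if m_leg > t_21 else 1          # full recovery vs. descend to ground
```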

As shown in Fig. 3.7, GSM is a fuzzy logic controller represented as a five-

layer feedforward network with each layer performing one stage of the fuzzy
inference process. GSM takes the current legs’ positions and phases (swing

Figure 3.7: Gait Selection Module

or stance) as input. The nodes in the second layer correspond individually


to possible values of each linguistic variables of the inputs (Fig. 3.5) with

triangular membership functions, µV (x), where the input linguistic value V =


(c, sL , sR ) is represented by c, sL , sR corresponding respectively to the center,

left spread, and right spread of the triangular membership function µV . Each

node in this layer feeds the rules using the linguistic value in their antecedent

parts (”if” part). The conjunction of all the antecedent conditions in a rule is

calculated in the third layer. The output of the layer is the firing strength of

the rules which is calculated by softmin operation described in section 2.5.2.

The nodes in the fourth layer correspond to consequent labels (Fig. 3.6).

Their inputs come from all the rules which use this particular consequent label.

For each input supplied by a rule, nodes compute the corresponding output
suggested by that rule by a defuzzification procedure \mu_{Y_{leg}}^{-1}(w_r) = c + 0.5(s_R - s_L)(1 - w_r), where Y_{leg} = (c, s_L, s_R) indicates a consequent linguistic value of
a leg. In the last layer there are six output nodes, one for each leg, which compute

Mleg (t) by combining the recommendations from all the fuzzy control rules in

the rule base, using weighted sum in which the weights are the rule strengths:

M_{leg} = \left( \sum_r w_r \mu^{-1}(w_r) \right) / \sum_r w_r    (3.40)
The goal of calculating Mleg values in the GSM is to maximize the evaluation

of the gait, v, determined by GEM where, within its learning process the vector

of all parameters of Yleg (centers and spreads) are adjusted; that is,

\Delta p_Y \propto \frac{\delta v}{\delta p_Y}    (3.41)
where pY is the vector of Yleg = (c, sL , sR ). But, there is no explicit gradi-

ent information provided by the reinforcement signal and the gradient δv/δp
can only be estimated. To estimate the gradient information in reinforcement
learning, there needs to be some randomness in how output gaits are chosen

by GSM so that the range of possible outputs can be explored to find a correct

value. This is provided by the stochastic exploration in Gait Modifier Module

(GMM).

3.2.3 Gait Evaluation Module (GEM)

GEM is a standard two-layer feedforward neural network, which takes

the state of the system as input. The state data includes leg-tip positions,
and velocities in leg centered coordinate systems and legs’ operation mode

(-2,-1,1,2). To assign credit to the individual actions of the action sequence


preceding a reinforcement signal, an evaluation function of the states is learned.

The output is an evaluation of the state denoted by v. Changes in v due to

state transitions are further combined with a reinforcement signal to produce


an internal reinforcement r̂:

r̂(t) = r(t) + γv(t) − v(t − 1) (3.42)

where 0 ≤ γ ≤ 1 is the discount rate. The internal reinforcement plays the role
of an error measure in the learning of the GEM. If r̂ is positive, the weights of

the network are altered through the backpropagation algorithm so as to increase the output v for positive input, and vice versa. The main reinforcement signal is obtained from the critical margin (Cm) and vB. If Cm = 0 or vB = 0 (vB may be zero if there are no legs in swing), a reinforcement signal r(t) = −1 is re-


turned. Otherwise a value is returned according to design goal. This value
can be simply r(t) = 0 or can be a real number to represent a more detailed

and continuous degree of success. Different reinforcement signals are tested in

simulations in order to optimize speed and mobility.
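As a compact restatement (a sketch with illustrative names, not the GEM network itself), the internal reinforcement of eq. 3.42 and one possible shaping of the external signal, in the spirit of the forms tested in chapter 5, look as follows:

```python
def internal_reinforcement(r_t, v_t, v_prev, gamma=0.9):
    """Internal reinforcement of eq. 3.42."""
    return r_t + gamma * v_t - v_prev

def external_reinforcement(cm, v_body, rho, swing_leg_on_aep=False):
    """One possible shaping of r(t): punish loss of mobility, otherwise reward
    speed (r = v_wBx / rho); a stability-oriented variant would use Cm(t)/Cm_max."""
    if (cm <= 0.0 or v_body <= 0.0) and not swing_leg_on_aep:
        return -1.0
    return v_body / rho
```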

3.2.4 Gait Modifier Module (GMM)

One of the features of the Gait Synthesizer architecture is to modify the


output of the GSM according to internal reinforcement from previous time

steps. GMM creates a Gaussian random distribution with a mean which is set

as the recommended Mleg value, and with a standard deviation αexp(−r̂(t−1)),

a non negative, monotonically decreasing function with a scale factor α, where

α ∈ R+ . When r̂(t − 1) is low, meaning the last action performed is bad,


the deviation is large, whereas the controller remains consistent with the fuzzy

control rules when r̂(t − 1) is high. This deviation provides adaptation to

current conditions or to solving a sudden problem of leg entrapment. Also

the exploration of the state space increases the system's experience, which is
provided by the learning in the GEM and GSM. The gradient information
∆pY = δv/δpY , which is within GSM, is estimated by stochastic exploration

in the GMM. The modification implemented in t − 1 by GMM is judged by


r̂(t). If r̂ > 0, meaning the modified M (t − 1) is better than expected, then
M(t − 1) is moved closer to the modified one, and vice versa. That is,

\frac{\delta v}{\delta p_Y} \approx \hat{r}(t) \left[ \frac{M_{mod}(t-1) - M_{rec}(t-1)}{\alpha \exp(-\hat{r}(t-1))} \right]    (3.43)
where Mrec denotes the M value recommended by GSM, and Mmod denotes

the M value modified by stochastic perturbation in GMM.

Due to changes in Mleg values, four different transitions may occur: If a leg's state is -2 (-1) and Mleg(t) > T−2,−1 (Mleg(t) < T−2,−1), then the leg becomes -1 (-2). If a leg's state is 2 (1) and Mleg(t) < T2,1 (Mleg(t) > T2,1), then the leg becomes 1 (2).

So stochastic exploration on Mleg values which does not result in a modified transition has no contribution to learning. We can define the minimum

deviation, ∆dm , as the minimum perturbation added by GMM required to


change the state of a leg. \Delta d_m(t) values can be given as

\Delta d_m(t) = \begin{cases} |M_{rec}(t) - T_{2,1}| & \text{if the leg is in state 2 or 1} \\ |M_{rec}(t) - T_{-2,-1}| & \text{if the leg is in state -2 or -1} \end{cases}
So the modification of an M (t) depends on the deviation function and

the ∆dm (t). The effect of the values α, T2,1 , and T−2,−1 will be analyzed in

simulations.
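A sketch of the GMM perturbation and the gradient estimate of eq. 3.43 (names are illustrative; the default α = 0.15 is the near-optimum value found in chapter 5):

```python
import numpy as np

def modify_task_share(m_rec, r_hat_prev, alpha=0.15):
    """Perturb the recommended task share with a Gaussian whose spread
    shrinks as the previous internal reinforcement improves."""
    sigma = alpha * np.exp(-r_hat_prev)
    return float(np.random.normal(m_rec, sigma)), sigma

def grad_estimate(r_hat_t, m_mod_prev, m_rec_prev, sigma_prev):
    """Stochastic estimate of dv/dp_Y used to tune the consequent labels (eq. 3.43)."""
    return r_hat_t * (m_mod_prev - m_rec_prev) / sigma_prev
```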

3.2.5 The Complete Control Cycle

The control cycle in Fig. 3.8 is executed in each time step. Firstly, re-
inforcement signal and legs’ states are taken by the gait synthesizer and the
legs are clustered to suitable operation modes according to their calculated M

values. Then, further modifications depending on physical checks are applied


and the resultant operation modes which will be valid for the rest of control
layers are obtained until the next cycle. In the figure we only consider velocity

controller, but different control modules (such as navigation, terrain adapta-

tion) can be implemented. Lastly, the desired velocities of the legs and the body are calculated and applied by the robot.

Figure 3.8: Complete Control Cycle

CHAPTER 4

HEXAPOD ROBOT SIMULATION

We develop the hexapod robot shown in Fig. 4.1 to be used in our sim-

ulations. Our simulation program consists of two subprograms. The first one

constitutes the main body (main program) which includes the controller ar-
chitecture and the hexapod model. All simulation tests and training sessions

are implemented in this subprogram which is written in Matlab 6.5. The sec-
ond subprogram is responsible for visualization (rendering) of the simulation
results. The main program saves the state data of the hexapod for each time cycle to a file named simvars.bsd. The state data are fed as an input to the rendering program. The rendering program is written in Borland C Builder with OpenGL as a graphics tool. The reason for using two separate programs in the simulation is to decrease the computation time spent in the tests of the

the simulation is to decrease the computation time spent in the tests of the
hexapod. The source code of the programs and simulation results can be found

in the CD attached to this thesis as an appendix.

4.1 Hexapod Model

The simulations are implemented in a kinematic model. Such kinematic

models are commonly implemented in gait analysis [24], [34] and gait control

[30], [28] for simulation purposes. A simplified model of the hexapod robot

considered in this thesis is shown in Fig. 4.2. Each leg is identical and com-
posed of three rigid links (Fig. 4.3). All the links are connected to each other

via a revolute joint. Hence the foot point or the leg tip has three degrees of

Figure 4.1: The hexapod robot used in simulation.

freedom with respect to the body. The legs are represented by labels R1, R2,

R3, and L1, L2, L3. Here, for example, L1 signifies the left front leg and R3

indicates the right hind leg.

The body coordinate (CB ) is attached on the hexapod body with the
origin at the center of gravity while leg base coordinates (Cb ) are attached

on the bases of the legs (Fig. 4.2). Cw is the inertial base frame. Dashed

rectangles represent working spaces for the legs (pbtipx ∈ [−Sd/2, Sd/2] and
pbtipz ∈ [−Rz/2, Rz/2]). The joint angles are calculated by inverse kinematics

[45] given a desired position and orientation. The dimensions of the links and

the body level from the ground are assigned such that the leg tips can reach

all points in their working spaces (existence of solution of inverse kinematics)

and there exists only one joint angle vector (uniqueness of solution of inverse

kinematics). Fig. 4.4 shows two postures of the hexapod model. As can be
seen, the hexapod body in Fig. 4.4B is lower compared to the one in Fig. 4.4A

Figure 4.2: Hexapod model

in order to increase the reachable space of the legs. The hexapod in Fig. 4.4B

is especially used in uneven terrain simulations where some legs fall into holes

on the terrain. Also notice that reachable space by the legs do not overlap

(Fig. 4.2).

4.2 Sensor System

As indicated, joint angles of the legs are calculated by inverse kinematics

from given leg tip trajectories. In real robots these angles are measured by Joint

Angle Sensors [42]. These are potentiometers that measure the joint angle for
each DOF of the leg. In our simulation these angles are used in the rendering

Figure 4.3: Each leg is identical and composed of three links. Pink legs are in swing
phase whereas blue ones are in stance.

program. In the gait synthesizer (so in the main program), leg tip coordinates

and velocities in their own coordinate systems are used.

The leg tip-terrain interactions are determined by modelling ground con-


tact sensors. In real robots, these are linear potentiometers on the tip of all
legs that measure the deflection of the foot as it presses against the ground.

In our experiments, this is an on-off sensor with output of ’1’ when contact
occurs, and ’0’ for noncontact.

In real robots, several additional sensors are used such as inclinometer

which senses the body orientation with respect to the direction of gravity. In

our simulations, we implement straight line walking in x-direction (Fig. 4.2).

So body orientation does not change. Also we did not need to model sensors

for terrain sensing (such as optical sensors), because the gait synthesizer is
capable of making its decisions without explicitly needing such data since it

develops gradually an internal world of the environment for gait adaptation.

Figure 4.4: Two different postures of the robot. Body level of the robot in B is
lowered in order to increase the reachable space of the legs.

4.3 Kinematics of the Hexapod Robot

Assumptions on kinematics and dynamics of the hexapod are given as

follows for simplicity of the analysis and are adapted from [28].

1. The contact between a foot and the ground is a point.

2. There is no slipping between a foot and the ground.

3. All the mass of the six legs is lumped into the body, and the center of

gravity is assumed to be at the centroid of the body.

4. There is no displacement in y-directions and body level (pwBz ) and ori-

entation is constant with respect to the inertial base frame.

5. The body speed with respect to inertial frame in x direction is equal to

minus leg tip speed in x direction of stance legs with respect to Cb (i.e.,
vwBx = −vbtipst x ).

In our simulations v_{btip_{sw}x} is set to a constant positive value ρ from which v_{wBx} (and so v_{btip_{st}x}) is calculated. Also |v_{btip_{sw}z}| = ϱ for swinging legs. For different states (operation modes) the velocities of the legs are calculated as follows:

v_{btip_x} = \begin{cases} \rho & \text{if the leg is in state 2 or 1 and } p_{btip_x}(t-1) < Sd/2 \text{ (AEP)} \\ 0 & \text{if the leg is in state 2 or 1 and } p_{btip_x}(t-1) > Sd/2 \text{ (AEP)} \\ \nu & \text{if the leg is in state -2 or -1} \end{cases}    (4.1)

where \nu = \left( \sum v_{btip_{sw}x}(t-1) \Delta t \right) / (nost) and nost is the number of stance legs. And

v_{btip_z} = \begin{cases} \varrho & \text{if the leg is in state 2 and } p_{btip_z}(t-1) < Rz/2 \\ -\varrho & \text{if the leg is in state 1 and } p_{btip_z}(t-1) > -Rz/2 \\ 0 & \text{otherwise} \end{cases}    (4.2)
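A sketch of equations 4.1 and 4.2 as a per-leg velocity assignment (variable names are illustrative; nu is the stance velocity computed from the swinging legs of the previous time step):

```python
def leg_tip_velocity(mode, p_x, p_z, rho, varrho, nu, sd, rz):
    """Leg-tip velocity components for one control step (eqs. 4.1 and 4.2)."""
    if mode in (2, 1):                        # swing phase
        v_x = rho if p_x < sd / 2 else 0.0    # stop protraction at the AEP
    else:                                     # stance phase (modes -2 and -1)
        v_x = nu
    if mode == 2 and p_z < rz / 2:
        v_z = varrho                          # lift the tip
    elif mode == 1 and p_z > -rz / 2:
        v_z = -varrho                         # descend until ground contact
    else:
        v_z = 0.0
    return v_x, v_z
```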

4.4 Uneven Terrain

The main challenge of the gait synthesizer is for uneven terrain locomo-

tion. The test path for uneven terrain is modelled such that a smooth surface succeeds a part with randomly placed hills and holes, some of which are

deeper than the legs can reach (Fig. 4.5). A function in the main program

named as TTerrainmaker.m (refer to the CD in the appendix) creates terrains


by randomly placing 7 different surface segments which have dimensions such

that only leg tips can collide with them. In other words the other parts of the

hexapod robot (links or the body) do not have collision with the terrain. The
tests conducted on uneven terrain, where a leg hits an obstacle (probably with
the link part of the leg) and can not go on swinging, are modelled by temporary

disabling of the corresponding leg. The effect of the disabling is same as the
obstacle collision from the gait synthesizer point of view (temporarily the leg
will not participate in the gait of the hexapod). Again such simulations can

be found in the CD and are discussed in details in chapter 5 under simulation

results.

Figure 4.5: The modelled uneven terrain. Different surface segments can be seen in
the figure. The holes on uneven terrain are modelled by surface segments which are
deeper than the legs can reach. Notice that the pink leg (swinging) falls into such a
segment.

CHAPTER 5

SIMULATION RESULTS

The hexapod robot simulation developed in chapter 4 is used to gen-

erate simulation results that clearly demonstrate the capabilities of our gait

synthesizer. The simulations are implemented in a kinematic model (chapter


4) rather than the dynamic model described in chapter 3.1. A control system

in such a dynamic model has to include many control modules besides a gait
controller, such as control algorithms related to navigation, speed, body level
(terrain adaptation), which affect only the low-level execution of a gait of the robot rather than the gait formulation level. Consequently the simulation omits these effects and analyzes just the gait synthesizer in the gait
simulation omits these effects and analyze just the gait synthesizer in the gait
control of a hexapod robot based on its kinematic model.

In the first two sections of the simulations we first analyze the control parameters and different choices of reinforcement signals that are significant
in the performance of the gait synthesizer. These tests will be implemented

on smooth terrains in order to focus on comparisons under similar environ-

mental effects. In the rest of the simulations we will show the capabilities of
the gait synthesizer for search and rescue (SAR) by testing its performance on

modelled uneven terrains expected in SAR operations and when a leg is used

as a manipulator. Before these tests are implemented, the gait synthesizer

was trained with different initial conditions and with different terrains (for the

tests applied on uneven terrain). The results presented here are chosen among
the ones which are impressive enough to clearly demonstrate the advantages

of the gait synthesizer and the potential it offers for SAR. All the results are

included in a CD which is attached to the thesis as an appendix. The reader


is referred to this CD in which the results discussed here can be examined
visually.

5.1 Exploration and Exploitation Dilemma in Reinforcement

Learning

As indicated in the Gait Modifier Module (GMM) the deviation func-

tion αexp(−r̂(t − 1)) is scaled by α, and two threshold values, T−2,−1 and T2,1

must be properly selected for the controller. The effects of these values are
tested first for a simple learning problem. The legs begin from random initial
positions (all the legs are in state −2) such that this initial configuration does

not belong to any gait pattern in the fuzzy rule base. Since the reinforcement

signals aim at optimizing the speed of the hexapod with a maximized static
stability, we expect that from this random initialization the gait will converge

to the optimum one in terms of speed which is the tripod gait. In order to test

the sensitivity of the gait synthesizer to changes in α and thresholds we test

the GSM in the same manner for each α and threshold values. Within each

training session, repeated 10 times maximum, the gait synthesizer is trained


for 2000 time steps for a given parameter set (α, T2,1 , and T−2,−1 ). We ini-

tialize the weights in learning, change the parameters and apply the training
again for a new parameter set. Fig. 5.1 shows the resultant speed vs time

graphs of the hexapod. In the first test the parameters are chosen as; α = 0.5,

T2,1 = 0.5, and T−2,−1 = −0.5. If the magnitude of the scale factor (α) is

high, we find that the exploration of different gaits is also high. In other

words the gait synthesizer tries plenty of gaits for different states, causing a
very slow learning. Fig. 5.1A shows the resultant speed vs time graph at the
10th training session. The synthesizer is found not to be able to converge to

a periodic movement or capture a gait pattern. On the other hand, when the
scale factor is too small as in a second test taking α = 0.01, T2,1 = 0.3, and
T−2,−1 = −0.3, learning is slow, exploration is low, and moreover there is

a chance of getting stuck. Fig. 5.1B (the second row) is the resultant speed

vs time at the second training session. The legs’ state vector at the end of

this training is observed as [−1, −1, −2, −1, −2, −1]. Here, because no static

stability is provided by the legs in state −2, no swinging leg exist and the body
stands in a still position. In such states (most severe being the case of the state

[−1, −1, −1, −1, −1, −1]) the synthesizer has to try different combinations of

leg states in order to continue its movement. But low scale factor tightens
the deviation from the recommended M values and recovery from the present

state is low and limited.

In the third and fourth tests (Fig. 5.1C, 5.1D) we set the scale factor to

0.15 and consider two threshold pair; T2,1 = 0.5, T−2,−1 = −0.1 (Fig. 5.1C),
and T2,1 = 0.1, T−2,−1 = −0.5 (Fig. 5.1D). These speed vs time graphs are

obtained in the 10th training session. When T2,1 is high the legs can not stay

at state 2 for a long time and change into state 1. This creates very small step sizes. Whereas, when T−2,−1 is too small a similar problem as in the second test arises where too many legs fall into the state −1 and the hexapod

robot get stuck in a still position without the gait synthesizer being able to

Figure 5.1: Body speed versus time graphs for different scale factor and threshold
values.

restart its motion, although the gait synthesizer tries many new gaits in order
to escape from such states. The robot loses time: notice long delays with zero speed such as between times 1300 and 1400. The last row represents results for parameters α = 0.15, T2,1 = 0.1, T−2,−1 = −0.1. This speed vs time graph shows a tripod gait and is obtained in the third training session, giving rise

to values that can be considered as near optimum.

5.2 Smooth Terrain Tests

In this section simulations demonstrate the learning capability of the

gait synthesizer on smooth flat terrain. Learning is aiming at increasing the

Figure 5.2: Comparison of resultant gaits when training is done according to two
different reinforcement for speed (first row) and critical margin (second row). The
first column gives the resultant gaits, second one body speed versus time, and last
column shows critical margin in the direction of motion versus time.

static stability margin while maximizing speed. As indicated in section 3.2.3,

a reinforcement signal r(t) = −1 is returned when the critical margin, Cm, or


body speed vwBx is zero, except for states in which there exists a swinging leg

on AEP. Otherwise, the controller is rewarded towards its optimization of the

speed and critical margin. Reinforcement signals leading to such rewards are
of the form

r(t) = vwBx /ρ (5.1)

and

r(t) = Cm(t)/Cmmax

respectively. Here Cmmax is the maximum critical margin which is the stroke

distance (Sd), and ρ is the maximum speed of the body according to the speed
policy which can be obtained in tripod gait. The first row of Fig. 5.2 shows the

results of speed optimized gait. The first column gives the resultant gait, sec-

ond one vwBx versus time and last column shows critical margin versus time in
the direction of motion. As expected a tripod gait is obtained because it is the
fastest gait in the rule base of the gait synthesizer and this is where naturally

gait decision has converged to. Maximum speed in second column corresponds
to ρ. The results in the second row corresponds to the gait synthesizer trained

to optimize Cm. As can be seen, a tetrapod gait is obtained which generates


steps to prevent the critical margin from getting smaller (graph in the third
column of second row). The drawback here is on the speed as seen in the

second column.


Figure 5.3: Internal reinforcement versus time.

Another example demonstrates a compromise the gait synthesizer under-


goes in its performance in the case of a tripod gait with small step sizes. The


Figure 5.4: Critical margin, Cm(t), versus time.

robot is trained for speed with an additional reinforcement signal r(t) = −1


when critical margin, Cm, which is the minimum of stability margin and kine-

matic margin, is below a positive value. When the robot starts with a tripod
gait it is punished several times due to this reinforcement signal and the internal reinforcement decreases, as seen in Fig. 5.3. Fig. 5.4 shows the Cm

versus time graph of this simulation. The decrease in the internal reinforcement, signalling a performance problem, causes the gait synthesizer to decide on

new gaits. As can be seen, the gait synthesizer adapts the gait after a cer-
tain amount of time to increase the internal reinforcements without losing the
periodicity. Fig. 5.5 shows the leg tip positions in x direction where one can

observe that leg step sizes decreased. This simulation clearly shows that an

adaptation of the gait synthesizer is achieved for both speed and mobility (in
terms of critical margin) by an appropriate choice of reinforcement signals.


Figure 5.5: Leg tip positions on x direction versus time. In order to increase the
critical margin gait synthesizer applies smaller step sizes.

5.3 Performance on Rough Terrain

Next, the robot is tested on uneven terrain, modelled such that a smooth
surface succeeds a part with randomly placed hills and holes, some of which are deeper than the legs can reach. We conduct a comparative analysis of per-

formance of the hexapod robot with or without the gait synthesizer but with
fixed gait approaches on the defined terrain. Fig. 5.6 shows tip trajectories of
the legs in classical fixed tripod gait. The legs swing in their operation space

and Anterior Extreme Point is taken to be the fixed switching point for mode
2 to 1. When the left front leg (L1) falls in a hole, the robot is stuck and can no

longer move. There are mainly two reasons for such a failure. Firstly the gait

pattern is defined for six legs and can not be implemented if any one is missing.

Secondly as shown in Fig. 5.2, critical margin for the tripod gait approaches

zero when swinging legs are descending. This is because stance legs reach

the Posterior Extreme Point (PEP), so there exists no margin for body movement
to handle the hole. Fig. 5.7 shows tip trajectories and Fig. 5.8 shows the

resultant gait when the gait synthesizer is implemented on the same terrain. The

gait synthesizer successfully handles the terrain irregularities. When the robot
first enters the uneven portion of the terrain, the evaluation of the gait gives
lower reinforcements (due to unexpected bad performance in the robot state)

and new gaits are recommended by the gait synthesizer. When a leg falls in
a hole the synthesizer generates very small steps as ripples in the trajectories.

These hesitations are actually trials of new gaits by Gait Modifier Module and
are also seen on the trajectories of the legs’ tips while they are swinging. One
can argue that a different fixed gait (for instance a tetrapod gait) can tackle

such terrain. This is true from a mobility point of view. However, for search and rescue tasks, speed (or response time) is as important as mobility, and a fixed tetrapod gait has a slower performance that is quite inadequate for a time-pressing SAR operation. A compromise is needed between the two concepts. Fig. 5.8 also
shows that after some time the robot reaches the smooth terrain where it re-
covers a tripod gait. Gait trials for a better evaluation of the gait can be seen

from these results where recoveries occur. The results of another example for a similar terrain are given in Figs. 5.9, 5.10, and 5.11, where a faster recovery

of the tripod gait is achieved.

5.4 Task Shapability: A Must for SAR Operations

In search and rescue (SAR) operations a leg of the hexapod can be re-
quired to be used for tasks such as carrying debris or any equipment while

the robot is in motion, so that it can not participate in the gait of the hexa-

pod. Such a task shapability may be vital in hazardous environment of SAR.


Fig. 5.12 and 5.13 represent such a situation where leg R1 is involved in a
manipulation task and is eliminated from the gait pattern. The leg involved

in a manipulation task is shown here as fixed in a position in swing phase as


if it is holding something. Although the gait synthesizer is seen not being able
to find right away a periodic gait, it provides the mobility in sudden lack of

a leg using the redundancy in multi-legged locomotion. Simulations clearly

indicate the advantageous characteristics of the gait synthesizer for mobility

and robustness required in search and rescue (refer to CD in the appendix).
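
A minimal sketch of how a leg assigned to a manipulation task might be masked out of gait coordination is given below; the leg labels match those used in the figures, but the data structure and function are purely illustrative and not the gait synthesizer's actual mechanism.

# Hypothetical sketch of removing a leg from gait coordination when it is
# assigned a manipulation task; 'LEGS' and 'available_legs' are illustrative
# names, not structures taken from the thesis.
LEGS = ["R1", "R2", "R3", "L1", "L2", "L3"]

def available_legs(busy=()):
    """Legs that may take part in the gait; 'busy' legs are held for other tasks."""
    return [leg for leg in LEGS if leg not in busy]

# Leg R1 is holding a piece of debris, so gaits are synthesized over the
# remaining five legs, relying on the redundancy of the hexapod.
print(available_legs(busy={"R1"}))   # ['R2', 'R3', 'L1', 'L2', 'L3']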

Figure 5.6: Leg tip trajectories of the hexapod in the x-z plane with a fixed tripod gait
on the defined terrain.

Figure 5.7: Leg tip trajectories of the hexapod in the x-z plane with the gait synthesizer
on the defined terrain.

Figure 5.8: Gait of the hexapod robot on uneven terrain. The robot recovers the tripod
gait pattern some time after reaching the smooth terrain.

Figure 5.9: Gait of the hexapod robot on uneven terrain. The robot recovers the tripod
gait faster than in the previous example.

[Single panel: critical margin (0 to 0.35) on the vertical axis versus time (0 to 2500) on the horizontal axis.]

Figure 5.10: Critical margin versus time.

[Six panels: R1, R2, R3 (top row) and L1, L2, L3 (bottom row); vertical axes: leg tip x position (−0.2 to 0.2); horizontal axes: time (500 to 2000).]

Figure 5.11: Leg tip positions in the x direction versus time.

Figure 5.12: Gait generated by the gait synthesizer when leg R1 is missing.

Figure 5.13: Gait generated by the gait synthesizer upon the sudden loss of leg R1.
CHAPTER 6

CONCLUSION

6.1 General

In this thesis work we developed an intelligent, task-shapable controller based
on a gait synthesizer for a hexapod robot traversing unstructured workspaces in
rescue missions within disaster areas. The gait synthesizer adapts decisions drawn
from insect-inspired gait patterns to the changing needs of the terrain and of the
rescue tasks. It is composed of three modules responsible for selecting a new gait,
evaluating the current gait, and modifying the recommended gait according to the
internal reinforcements of previous execution performances. Simulation results show
the potential of the gait synthesizer for search and rescue operations: it adapts to
uneven terrain by shaping gaits, frees the robot from entrapment of some of its legs,
and modifies gaits when some legs are used as manipulators in tasks very different
in nature from locomotion.

The contribution of this thesis work can be analyzed from several points
of view. Towards gait analysis, we introduce a modelling method for insect-inspired
gait patterns. We form fuzzy rules for the different phases of gait pattern cycles from
the relative positions of the legs. The fuzzy rules provide a method to distinguish the
tasks of individual legs in the coordinated movement of hexapod robots. This
modelling and fuzzification process is valid for all legged robots using static stability.
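
To illustrate the flavour of such fuzzy rules, the sketch below encodes one hypothetical antecedent built from relative leg-tip positions with triangular membership functions; the AEP/PEP values, spreads, and the rule itself are assumptions introduced for illustration, not entries of the thesis's rule base.

# Illustrative (assumed) fuzzy antecedents over leg-tip x positions.
def triangular(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def near_aep(leg_tip_x, aep=0.15, spread=0.05):
    """Degree to which a leg tip is near its Anterior Extreme Point."""
    return triangular(leg_tip_x, aep - spread, aep, aep + spread)

def near_pep(leg_tip_x, pep=-0.15, spread=0.05):
    """Degree to which a leg tip is near its Posterior Extreme Point."""
    return triangular(leg_tip_x, pep - spread, pep, pep + spread)

# Rule sketch: IF R2 is near its AEP AND L1 is near its PEP THEN recommend
# that L1 lifts off (min is used as the fuzzy AND).
firing_strength = min(near_aep(0.13), near_pep(-0.14))
print(firing_strength)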

For legged robots, there are two parts to gait generation: the cyclic action
of the individual legs and the coordination of all the legs to make effective use of
their cycles. Periodic gaits offer this coordination within a fast-executing pattern,
though each pattern exhibits a different degree of weakness to irregularities of the
environment. By utilizing a control structure, namely the novel gait synthesizer
architecture, that exhibits intelligent control features such as learning and adaptability
in unstructured environments, we provide exploration among such periodic gait
patterns so as to be both mobile and rapid on uneven terrain. In addition, the control
architecture generates gaits that free trapped legs, owing to the modifier module, one
of the three main modules of the gait synthesizer.
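
For concreteness, the sketch below lists textbook phase offsets for two periodic hexapod gaits and derives which legs are in swing at a given cycle time; the numbers are standard illustrative values, not the gait parameterization actually used by the synthesizer.

# Illustrative phase offsets (fraction of the gait cycle at which each leg
# lifts off) for two commonly cited periodic hexapod gaits; textbook values,
# not the thesis's exact gait parameterization.
TRIPOD = {"R1": 0.0, "L2": 0.0, "R3": 0.0,   # first tripod swings together
          "L1": 0.5, "R2": 0.5, "L3": 0.5}   # second tripod half a cycle later

WAVE = {"R3": 0.0, "R2": 1/6, "R1": 2/6,     # one illustrative metachronal ordering:
        "L3": 3/6, "L2": 4/6, "L1": 5/6}     # roughly one leg in swing at a time

def in_swing(phase_offsets, t, duty_factor):
    """Legs whose swing window contains normalized cycle time t in [0, 1)."""
    swing_fraction = 1.0 - duty_factor
    return [leg for leg, off in phase_offsets.items()
            if (t - off) % 1.0 < swing_fraction]

print(in_swing(TRIPOD, t=0.25, duty_factor=0.5))   # first tripod still swinging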

The dynamics of legged robots is complicated because the individual legs are
coupled through the dynamics of the body. We established its similarity to the
grasping and manipulation of objects by multi-fingered robot hands, and we considered
locomotion as grasping onto an infinitely large, rough, arbitrarily textured terrain.
We made use of the multitude of existing works on grasping models with multi-fingered
robot hands to generate the locomotion dynamical equations for legged robots. Again,
the derived equations are general enough to be applied to all legged robots under
static stability.
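
As a hedged illustration of the kind of equations this analogy yields (a generic form from the multi-fingered grasping literature, with leg-body inertial coupling omitted for brevity; it is not necessarily the exact set derived in the thesis), the body and leg dynamics under stance contacts can be sketched as

\begin{align}
  M_b(x)\,\ddot{x} + C_b(x,\dot{x})\,\dot{x} + N_b(x) &= G(x)\,f_c , \\
  M_i(q_i)\,\ddot{q}_i + C_i(q_i,\dot{q}_i)\,\dot{q}_i + N_i(q_i) &= \tau_i - J_i^{T}(q_i)\,f_{c,i},
  \qquad i = 1,\ldots,6,
\end{align}

where $x$ is the body pose, $q_i$ and $\tau_i$ are the joint variables and torques of leg $i$, $f_{c,i}$ is the contact force at the foot of leg $i$ (zero for a swinging leg), $f_c$ stacks the stance contact forces, $G$ is the grasp-like support map, and $J_i$ is the contact Jacobian of leg $i$.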

Finally, this thesis contributes to the literature on the feasibility of
autonomous intelligent robots for search and rescue (SAR). We have developed a
coordination control of the legs, based on gait patterns, for the fast and secure
mobility that SAR environments demand of legged robots. Fast mobility is ensured
by optimizing speed. Secure mobility is achieved by optimizing the static stability
margin and also by the gait synthesizer modifying its gait to extract the robot from
any motion entrapment. This deadlock-free locomotion in the presence of terrain
entrapment or leg failure is due to the ability of our gait synthesizer to exploit the
redundancy in multi-legged robots.

6.2 Future Work

In this thesis we restrict the subject to gait control. In a complete control
structure of a legged robot, the gait synthesizer should undertake more responsibilities
than the ones addressed in this thesis work, and it can be further expanded in several
ways. By a different choice of reinforcement signals the synthesizer can be trained
and adapted to different tasks; we gave only one example of such an adaptation, in
terms of speed and mobility, which are the main concerns of locomotion in our case.

Moreover, for terrain irregularities that are routinely faced in a search
and rescue (SAR) operation (such as specific obstacle types), dedicated control
modules can be added to the Gait Modifier Module. When such situations are
encountered, the gait synthesizer lets these modules take control of the modifications
held in the GMM while still recommending gaits for locomotion.

Also, new rule bases can be added to the system for five-legged locomotion
so that, upon the permanent loss of a leg, the corresponding rule base can be put
into action. Although we showed that such situations can still be handled by the
gait synthesizer with rules for six legs, the addition of such rules would provide more
functionality to the gait synthesizer at the sole cost of additional memory usage.

The analyses in the thesis are made for a two-dimensional model of hexapod
robot locomotion, i.e., straight-line walking. The gait synthesizer can be adapted to
a real robot by adding rules for the lateral positions of the leg tips, so that locomotion
can take place over a planar x-y terrain. The working space of the legs must also be
adapted when the orientation of the body changes. These foreseen changes to the
system would not affect the performance of the gait synthesizer because the main
concept, namely adapting decisions drawn from gait patterns to the needs of
locomotion, would not change with these modifications.

An important property of the gait synthesizer is the set of generated M values.
These values carry information about the relative functionality of the legs. Although
we use them only to distinguish the operation modes of the legs, other control modules
could also make use of them. For instance, in navigation control the M vector of the
legs could be taken as an input indicating the feasibility of a manoeuvre. For such
uses the learning algorithm needs to be changed, because this time not only the
comparison with threshold values but also the magnitude of M becomes meaningful.
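
A hypothetical sketch of these two usages is given below; the threshold, the mode encoding, and the reduction of the M vector to a feasibility score are all assumptions introduced for illustration, not the thesis's formulation.

# Assumed illustration only: how M values might be thresholded into operation
# modes today, and how their magnitudes might later feed a navigation module.
def leg_modes(m_values, threshold=0.5):
    """Compare each M value against a threshold to pick an operation mode."""
    return [1 if m > threshold else 2 for m in m_values]

def manoeuvre_feasibility(m_values):
    """Hypothetical extension: use the magnitudes themselves as a rough
    indicator of how much freedom the legs leave for a manoeuvre."""
    return sum(m_values) / len(m_values)

print(leg_modes([0.8, 0.2, 0.7, 0.1, 0.9, 0.3]))
print(manoeuvre_feasibility([0.8, 0.2, 0.7, 0.1, 0.9, 0.3]))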

