
INTELLIGENT GAIT CONTROL OF A MULTILEGGED ROBOT USED

IN RESCUE OPERATIONS

A THESIS SUBMITTED TO
THE GRADUATE SCHOOL OF NATURAL AND APPLIED SCIENCES
OF
THE MIDDLE EAST TECHNICAL UNIVERSITY

BY

EMRE KARALARLI

IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE


DEGREE OF
MASTER OF SCIENCE
IN
THE DEPARTMENT OF ELECTRICAL AND ELECTRONICS
ENGINEERING

DECEMBER 2003
Approval of the Graduate School of Natural and Applied Sciences

——————————————–
Prof. Dr. Canan Özgen
Director

I certify that this thesis satisfies all the requirements as a thesis for the degree
of Master of Science.

——————————————–
Prof. Dr. Mübeccel Demirekler
Head of Department

This is to certify that we have read this thesis and that in our opinion it is
fully adequate, in scope and quality, as a thesis for the degree of Master of
Science.

——————————————– ——————————————–
Prof. Dr. İsmet Erkmen Assoc. Prof. Dr. Aydan Erkmen
Co-Supervisor Supervisor

Examining Committee Members


Prof. Dr. Erol Kocaoğlan ———————————–

Prof. Dr. Aydın Ersak ———————————–

Prof. Dr. İsmet Erkmen ———————————–

Assoc. Prof. Dr. Aydan Erkmen ———————————–

Asst. Prof. Dr. İlhan Konukseven ———————————–


ABSTRACT

INTELLIGENT GAIT CONTROL OF A MULTILEGGED ROBOT USED

IN RESCUE OPERATIONS

Karalarlı, Emre
M.S., Department of Electrical and Electronics Engineering

Supervisor: Assoc. Prof. Dr. Aydan Erkmen

Co-Supervisor: Prof. Dr. İsmet Erkmen

December 2003, 97 pages

In this thesis work an intelligent controller based on a gait synthesizer


for a hexapod robot used in rescue operations is developed. The gait synthe-

sizer adapts decisions drawn from insect-inspired gait patterns to the changing
needs of the terrain and of the rescue task. It is composed of three modules responsible

for selecting a new gait, evaluating the current gait, and modifying the rec-

ommended gait according to the internal reinforcements of past time steps. A


Fuzzy Logic Controller is implemented for selecting the new gaits.

Key words: Hexapod Walking Rescue Robots, Insect-inspired Gaits, Gait Syn-

thesizer, GARIC.

ÖZ

ÇOK BACAKLI KURTARMA ROBOTLARININ AKILLI YÜRÜYÜŞ

DENETİMİ

Karalarlı, Emre

Yüksek Lisans, Elektrik ve Elektronik Mühendisliği Bölümü


Tez Yöneticisi: Doç. Dr. Aydan Erkmen

Ortak Tez Yöneticisi: Prof. Dr. İsmet Erkmen

Aralık 2003, 97 sayfa

Bu tez çalışmasında kurtarma robotlarının akıllı yürüyüş denetimi için

bir yürüyüş şekli sentezleyicisi geliştirilmiştir. Sentezleyici değişen zemin


özelliklerine ve farklı kurtarma çalışmalarına cevap verebilmek için böceklerden

ilham alınan yürüyüş şekillerine göre karar vermektedir. Sentezleyici, yürüyüş

şekli belirleyici, değerlendirici ve değiştirici olmak üzere üç bölümden oluşur.


Belirleyici, bir bulanık mantık denetleyicisidir.

Anahtar Sözcükler: Altı Bacaklı Kurtarma Robotları, Böceklerden İlham

Alınmış Yürüyüş Şekilleri, Yürüyüş Şekli Sentezleyicisi, GARIC.

ACKNOWLEDGMENTS

I would like to express my gratitude to my supervisor Assoc. Professor

Dr. Aydan Erkmen and co-supervisor Prof. Dr. İsmet Erkmen for their

motivation, guidance, patience, and encouragement through the preparation

of this thesis. I also thank all my friends, especially Engür and Aslı Pişirici,

Mehmetçik and Semra Pamuk, Bora Sağdıçoğlu, and Sedat Ilgaz for their

invaluable comments and suggestions throughout the study. Finally, I express

my gratitude to my family for their endless support.

TABLE OF CONTENTS

ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

ÖZ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
TABLE OF CONTENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .vi

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .viii

CHAPTER

1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2. SURVEY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1 Search and Rescue Robotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.2 Legged Locomotion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8


2.2.1 Walking Mechanisms in Animals . . . . . . . . . . . . . . . . . . . . . . . 8

2.2.2 Control of Legged Robots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13


2.2.3 Gait Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2.4 Gait Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19


2.3 Mathematical Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.3.1 Neural-Fuzzy Controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.3.2 GARIC Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.3.3 Fuzzy Sets and Fuzzy Logic Controllers . . . . . . . . . . . . . . . 31


2.3.4 Reinforcement Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3. LEGGED ROBOT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .37

3.1 Dynamics and Coordinated Control of Legged Robots . . . . . . . 37

3.1.1 Motion Dynamics of Legged Robots . . . . . . . . . . . . . . . . . . . 38

3.1.2 Coordinated Control of Legged Robots . . . . . . . . . . . . . . . . 45

3.2 Gait Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50


3.2.1 Encoding the Gaits for a Multilegged Robot . . . . . . . . . . . 50
3.2.2 Gait Selection Module (GSM) . . . . . . . . . . . . . . . . . . . . . . . . . 57

3.2.3 Gait Evaluation Module (GEM) . . . . . . . . . . . . . . . . . . . . . . . 59


3.2.4 Gait Modifier Module (GMM) . . . . . . . . . . . . . . . . . . . . . . . . 60
3.2.5 The Complete Control Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . 62

4. HEXAPOD ROBOT SIMULATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4.1 Hexapod Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .64

4.2 Sensor System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.3 Kinematics of the Hexapod Robot . . . . . . . . . . . . . . . . . . . . . . . . . . 68


4.4 Uneven Terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

5. SIMULATION RESULTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

5.1 Exploration and Exploitation Dilemma in Reinforcement


Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

5.2 Smooth Terrain Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74


5.3 Performance on Rough Terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.4 Task Shapability: A Must for SAR Operations . . . . . . . . . . . . . . 79

6. CONCLUSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .87

6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .89

REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
APPENDICES

A. SIMULATION PROGRAM CD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

LIST OF FIGURES

FIGURES
2.1 Tripod (A) and tetrapod (B) support patterns (or support
polygons) formed by contact points of the supporting legs . . . . . . . 9

2.2 Wave gait patterns. Bold lines represent swing phase. L1


signifies the left front leg and R3 indicates the right hind leg . . . 16
2.3 Hexapod model. Dashed legs are in swing phase . . . . . . . . . . . . . . . .17

2.4 Summary of coordination mechanisms in the stick insect. The


pattern of coordinating influences among the step generators
for the six legs is shown at the left; the arrows indicate the
direction of the influence. The mechanisms are described
briefly at the right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.5 The GARIC architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26


2.6 The action evaluation network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.7 The action selection network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.8 General model of a fuzzy logic controller . . . . . . . . . . . . . . . . . . . . . . . 33

3.1 Coordinate frames defined for the legged robot. The coordinate
frame Cci is assigned such that the unit vector ẑ is normal to
the contact surface at the point of contact . . . . . . . . . . . . . . . . . . . . . 40

3.2 Architecture of Gait Synthesizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3.3 Summary of terminology used in gait analysis . . . . . . . . . . . . . . . . . . 52

3.4 Wave gait patterns. Bold lines represent swing phase. L1


signifies the left front leg and R3 indicates the right hind leg . . . 53

3.5 Antecedent Labels, fuzzification of individual leg position . . . . . . 54


3.6 Consequent Labels: task share based on operation modes . . . . . . .56

3.7 Gait Selection Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.8 Complete control cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.1 The hexapod robot used in simulation . . . . . . . . . . . . . . . . . . . . . . . . . .65

4.2 Hexapod model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.3 Each leg is identical and composed of three links. Pink legs are
in swing phase whereas blue ones are in stance . . . . . . . . . . . . . . . . . 67
4.4 Two different postures of the robot. Body level of the robot in
B is lowered in order to increase reachable space of the legs . . . . 68

4.5 The modelled uneven terrain. Different surface segments can be


seen in the figure. The holes on uneven terrain are modelled
by surface segments which are deeper than the legs can reach.
Notice that the pink leg (swinging) falls into such a segment . . . 70

5.1 Body speed versus time graphs for different scale factor and
threshold values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5.2 Comparison of resultant gaits when training is done according to


two different reinforcement for speed (first row) and critical
margin (second row). The first column gives the resultant
gaits, second one body speed versus time, and last column
shows critical margin in the direction of motion versus time . . . . 75
5.3 Internal reinforcement versus time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

5.4 Critical margin, Cm(t), versus time . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

5.5 Leg tip positions on x direction versus time. In order to increase


the critical margin gait synthesizer applies smaller step sizes . . . 78

5.6 Leg tip trajectories of the hexapod on x-z plane with a fixed
tripod gait on the defined terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

5.7 Leg tip trajectories of the hexapod on x-z plane with gait
synthesizer on the defined terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

5.8 Gait of the hexapod robot on uneven terrain. The robot


recovers tripod gait pattern after some time reaching the
smooth terrain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

5.9 Gait of the hexapod robot on uneven terrain. The robot


recovers tripod gait faster than the previous one . . . . . . . . . . . . . . . 84

5.10 Critical margin versus time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84


5.11 Leg tip positions on x direction versus time . . . . . . . . . . . . . . . . . . . . 85

5.12 Gait generated by gait synthesizer when leg R1 is missing . . . . . . 85

5.13 Gait generated by gait synthesizer in sudden lack of leg R1 . . . . 86

CHAPTER 1

INTRODUCTION

Recent experiences of natural disasters (earthquakes, tornados, floods)

and man-made catastrophes (e.g. urban terrorism) have brought attention to the
area of search and rescue (SAR) and emergency management. Horrible devastations
and losses have dramatically illustrated the damage that today's modern
industrialized countries can expect despite technological progress in
construction techniques [1]. Moreover, these experiences have shown that the
preparedness and emergency response of governments are inadequate to deal with
such devastation. As a result, the people who have died due to the lack of an
immediate response have inevitably forced us to find better solutions for
search and rescue.

The utilization of autonomous intelligent robots in search and rescue

(SAR) is a new and challenging field of robotics, dealing with tasks in ex-
tremely hazardous and complex disaster environments [2]. Autonomy, high

mobility, robustness, and reconfigurability are critical design issues of res-

cue robotics, requiring dexterous devices equipped with the ability to learn
from prior rescue experience, adaptable to variable types of usage with a wide

enough functionality under different sensing modules, and compliant to en-

vironmental and victim conditions. Intelligent, biologically inspired mobile
robots and, in particular, hexapod robots have turned out to be widely used
robot types besides serpentine mechanisms [3], providing effective, immediate,
and reliable responses to many SAR operations. Aiming at enhancing the
quality of rescue and life after rescue, the field of rescue robotics is seeking
shape-changing and, moreover, task-shapable intelligent dexterous devices.

The objective of this thesis is to design a gait synthesizer for 6-legged

walking robots with shape-shifting gaits that provide the necessary flexibility
and adaptability needed in the difficult workspaces of rescue missions. The gait

synthesizer is responsible for the locomotion of the robot providing a compro-

mise between mobility and speed while allowing task shapability to use some

legs as manipulators when the need arises during rescue. Legged robots are chosen

due to their advantage on rough terrains over their wheeled mobile counter-

parts [4], [5].

Wheeled locomotion is well suited for fast transportation. Wheels change


their point of support continuously and use friction to move forward in an
efficient way. Because of this, however, they require a continuous path and
hence a pre-constructed terrain, which restricts their mobility.

On the other hand, legged locomotion offers a significant potential for

mobility over natural rough terrains in comparison to wheeled or tracked lo-


comotion. Because legs can choose footholds to improve traction, minimize
lurching, and step over obstacles, they can cope with the softness and
unevenness of the terrain [4]. Legs can provide the capability of maneuvering
within confined areas of space. Unlike wheels, legs change their point of
support all at once and so do not need a continuous path. Also, as seen in
nature, legs are not used only for walking. Besides their main function, they
take part in almost every external activity of animals (as tactile sensors,
as manipulators, etc.).

However, legged locomotion possesses additional complexity in the coor-

dination control of the legs [6]. The control of a legged robot is a sophisticated

job due to the high number of degrees of freedom offered by the articulated

legs. In the design of a control structure of a legged robot on difficult rough

terrain there are many aspects that have to be dealt with simultaneously and

that also interfere with each other. For example, the movements of the legs
must be carefully coordinated in order to advance the body without causing
foot slippage; at each step, an appropriate foothold has to be found; the body
attitude must be set according to the terrain profile; stability must be
maintained; a navigation task must be accomplished; and so on. Here, a body
movement for terrain adaptation may change the operation space of a leg so
that the leg cannot reach a chosen foothold that was within range beforehand,
or, inversely, a decision to modify


the gait may solve a stability problem. So, while coordinating the movements

of body and legs, the control structure of the legged robot must also handle

such interferences.

In this thesis we focus on the gait control and leg coordination and em-

phasize the potential of redundancy of legs for handling irregularity on terrains

as well as their use as manipulators. In walking robots, coordinating the move-

ments of individual legs in order to maintain a stable gait is one of the main

control tasks. Observations on insect gaits (cockroaches, stick insects) show

that insects produce sequential movements starting with hind leg protraction

and followed by the middle and front legs, which is called metachronal wave

or wave gait [7]. Among the numerous periodic gaits, the class of wave gaits
is the most important because these gaits provide good stability [8]. The tripod
gait, which is a member of the wave gaits, involves an alternation between right-sided

and left-sided metachronal waves and it is the fastest gait. Gaits arise from
the interaction of individual leg oscillators (step pattern generators) which
govern the stepping of each leg by exchanging the influences of the legs [9].

The information transmitted from the step pattern generator depends upon
the leg's state (swing or stance, position, and velocity). Here, the position
information plays a particularly central role in coordination. Several researchers
have implemented insect-like controllers for leg coordination ([10], [11]), most
of which are oriented toward preserving the regularity of a fixed gait pattern
against perturbations.

In this thesis, we work on biologically inspired wave gait patterns. Gait

patterns are patterns of leg coordination which represent the relative phases
(swing or stance) of the legs in statically stable locomotion. These gaits have
different properties from the mobility and speed points of view. In our method
we encode gait pattern cycles from the relative positions of the legs and find
the individual legs' tasks within those gait patterns. The method enables
exploring among many different gait patterns and selecting gait patterns
according to different needs, in order to adapt online to terrain conditions.
This is the point where features of intelligent control are required.

Generalized approximate reasoning-based intelligent control (GARIC)

architecture [12] is one of the realizations of the fusion of fuzzy and neural
technologies guided by feedback from the environment. It presents a method
for learning and tuning fuzzy logic controllers (FLC) through reinforcement
signals. The basic idea behind fuzzy logic controllers is to incorporate
the "expert experience" of a human operator into the design of a controller
for a process whose input-output relationship is described by a collection of
fuzzy control rules (IF-THEN rules) involving linguistic variables rather
than a complicated dynamic model [13].

Our gait synthesizer adapts the GARIC architecture to our objective. The gait
synthesizer that we developed for serpentine locomotion [3], and here for
hexapod walking, consists of three modules. The Gait Evaluation Module (GEM)
acts as a critic and provides advice to the main controller, based on a
multilayer artificial neural network. The Gait Selection Module (GSM) offers a
new gait to be taken by the robot according to a fuzzy con-
troller with rules for different gait patterns in the knowledge base. The Gait

Modifier Module (GMM) changes the gait recommended by the GSM based on
internal reinforcement. This change in the recommended gait is more
significant for a state if that state does not receive high internal
reinforcements (i.e., the probability of failure is high). On the other hand,
if a state receives
high reinforcements, GMM administers small changes to the action selected

by the fuzzy controller embedded in the GSM. This reveals that the action is

performing well so that the GMM recommendation dictates no or only minor


changes to that gait.
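As a rough illustration only, the following sketch shows how such a three-module
cycle could be wired together in code; every function here is a toy placeholder
(the actual GSM, GEM, and GMM are described in Chapter 3), and the state, gait
encoding, and plant response are invented purely for the example.

import math
import random

# Toy stand-ins for the three modules; the real GSM is a fuzzy controller,
# the real GEM is a neural critic, and the real GMM perturbs the recommended gait.
def gsm_select_gait(state):                 # recommend a gait from the state
    return [0.5 * s for s in state]

def gem_evaluate(state):                    # critic: score the current state
    return -sum(abs(s) for s in state)

def gmm_modify(gait, internal_r):           # perturb more when reinforcement is low
    sigma = math.exp(-internal_r)
    return [g + random.gauss(0.0, sigma) for g in gait]

state, prev_v = [0.2, -0.1, 0.4], 0.0
for step in range(5):
    v = gem_evaluate(state)                 # evaluate the state reached so far
    internal_r = v - prev_v                 # crude internal reinforcement
    gait = gmm_modify(gsm_select_gait(state), internal_r)
    state = [s - 0.1 * g for s, g in zip(state, gait)]   # fake plant response
    prev_v = v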

The basic contribution of this thesis is the development of an intelligent

task shapable control, based on a gait synthesizer for a hexapod robot upon its
traversal of unstructured workspaces in rescue missions within disaster areas.

The gait synthesizer adapts decisions drawn from insect-inspired gait patterns
to the changing needs of the terrain and of the rescue task. The method provides
exploration among different gait patterns using the redundancy in multi-legged

structures.

The thesis is organized as follows: Chapter 2 covers a survey on legged

locomotion and gait analysis, and gives information about basic notions needed
throughout the thesis. Chapter 3 includes the dynamics and control of legged
robots, and the detailed description of the gait synthesizer. Chapter 4
introduces the simulation and Chapter 5 presents and discusses the results of

simulation. Chapter 6 covers the conclusion.

CHAPTER 2

SURVEY

2.1 Search and Rescue Robotics

The contribution of robotics technology to today's sophisticated tasks is an
inevitable development, leading to a gradual reduction of the human share,
mostly due to the saturation of improvements in human abilities or to the
complementation of human activities. Education and training are insufficient
for dealing with such complex and exhausting tasks [1]. Thus, from the
robotics point of view, the trend is to provide an intelligent, versatile tool
that can completely substitute for humans in risky operations and complement
human operations when auxiliary intelligent dynamics are required for extra
dexterity. As part of this progress, search and rescue (SAR) is one of the
most crucial fields that needs a robotics contribution.

Search and rescue (SAR) robotics can be defined as the utilization of

robotics technology for human assistance in any phase of SAR operations [2].
Robotic SAR devices have to work in extremely unstructured and technically

challenging areas shaped by natural forces. One of the major requirements


of rescue robot design is the flexibility of the design for different rescue us-

age in disaster areas of varying properties. Rescue robotic devices should be

adaptable, robust, and predictive in control when facing different and chang-

ing needs. Intelligent, biologically inspired mobile robots and, in particular,
hexapod walking robots have turned out to be widely used robot types besides
serpentine mechanisms, providing effective, immediate, and reliable responses
to many SAR operations.

2.2 Legged Locomotion

Legged locomotion offers a significant potential for mobility over highly

irregular natural rough terrains that are cut with ditches and highly
unpredictable, in comparison to wheeled or tracked locomotion [4], [5]. Legs
provide the capabilities of stepping over obstacles or ditches and of
maneuvering within confined areas of space. They can cope with the softness
and unevenness of the terrain. Besides their main function in locomotion, legs
take part in almost every external activity of animals. The articulated
structures of legs serve as manipulators to pull, push, hold, etc., or as
tactile sensors to explore the environment.

2.2.1 Walking Mechanisms in Animals

Millions of years of evolution have resulted in a large number of locomo-

tory designs for efficient, rapid, adjustable and reliable movement of the ani-
mals [15]. The major variations are observed in the number of legs (from two in

humans to about two hundred in a millipede), the length and shape (some spi-
ders possess extremely long and slender legs whereas hedgehogs have compara-

tively short legs), the positioning of the legs (insects carry their body between

the legs, whereas mammals tuck their legs underneath), and the type of skele-

ton (arthropods use an exoskeleton made of chitin-protein cuticle, whereas

vertebrates use an endoskeleton composed of bone). Despite this diversity,

legged locomotion in animals has some basic similarities in terms of
mechanics and control.

At its fundamental level, legs work in a cyclic manner to locomote. The

step cycle for an individual leg consists of two basic phases: the swing phase,

when the foot is off the ground and moving forward, and the stance phase,

when the foot is on the ground and the leg is moving backward with respect to

the body. The propulsive force for progression is developed during the stance

phase. A common feature of the step cycle in most animals (including man) is
that the duration of the swing phase remains comparatively constant
as walking speed varies. Accordingly, changes in the speed of progression are

produced primarily by changes in the time it takes for the legs to be retracted

during the stance phase [21].

Figure 2.1: Tripod (A) and tetrapod (B) support patterns (or support polygons)
formed by contact points of the supporting legs.

Animal locomotion can be classified into two categories according to the gait
they use [23]. The first type is the one exhibited by insects. Insects are
arthropods and have a hard exoskeletal system with jointed limbs. They use their

legs as struts and levers and the legs must always support the body during

walking, in addition to providing propulsion. In other words, the sequential

pattern of steps must ensure static stability. The vertical projection of the

center of gravity must therefore always be within the support pattern (the two
dimensional convex polygon formed by the contact points (Fig. 2.1)). This
kind of locomotion has been described as crawling, and the legs have to provide
at least a tripod of support at all times. Another kind of locomotion may be
observed in humans, horses, dogs, cheetahs, and kangaroos, which have a more
flexible structure. These animals require dynamic balance, which is a less
stringent restriction on the posture and the gait of the animal. The animal

may not be in static equilibrium. On the contrary, there may be periods of

time when none of the support legs are on the ground as is observed in trotting

horses, running humans, and hopping kangaroos.

The mechanism by which the nervous system generates the cyclic move-
ments of the legs during walking is basically the same in animals [23], [21]. The

first significant efforts analyzing the nervous system came at the beginning of
the 1900s with the work of two British physiologists, C. S. Sherrington and T.
Graham Brown [21]. Sherrington first showed that rhythmic movements could

be elicited from the hind legs of cats and dogs some weeks after their spinal

cord had been severed. Since the operation had isolated the nervous center that
controls the movement of the hind legs from the rest of the nervous system, he
showed that the

higher levels of the nervous system are not necessary for the organization of

stepping movements. He explained the generation of rhythmic leg movements

by a series of "chain reflexes" (a reflex being a stereotyped movement elicited
by the stimulation of a specific group of sensory receptors). Thus he conceived
that the sensory input generated during any part of the step cycle elicits the
next part of the cycle by a reflex action, producing in turn another sensory
signal that elicits the next part of the cycle, and so on. Graham Brown, on the
other hand,
demonstrated that rhythmic contractions of leg muscles, similar to those that
occur during walking, could be induced immediately following transection of

the spinal cord even in animals in which all input from sensory nerves in the
legs had been eliminated. So, Graham Brown claimed that mechanisms located en-
tirely within the spinal cord are responsible for generating the basic rhythm

for stepping in each leg.

Actually these two concepts are not incompatible, but neither provides a

complete explanation by itself [21]. Further experiments in a number of labo-


ratories have yielded results that strongly support the dual view of the nervous

mechanisms involved in walking. Both approaches have attractive features as


models for understanding how neural systems produce behavior. If walking is

the consequence of complete motions (central pattern), then it is much easier


to see how phase coordination of multiple legs is possible. On the other hand,
it is more difficult to see how adaptation to details of the terrain is possible

when walking is composed of complete motions. This state of affairs is re-

versed when the model is based on reflexes. The consensus that evolved was
that aspects of both models are important to the control of locomotion and

that neither was completely correct by itself [21]. Thus, our gait synthesizer
combines both: sensory effects and environmental task performance serve as
reinforcement, while a simple neural structure provides phase coordination of
the multiple legs of our robot. Moreover, the resulting system is reflexive
enough to adapt to the sud-

den unevenness of the terrain in rescue operations.

The process that gives rise to locomotion involves a complicated control
system [16]. Motor output is constantly modified by both neural and mechanical
feedback. Specialized circuits within the nervous system, called central
pattern generators (CPGs), produce the rhythmic oscillations that drive motor

neurons of limb and body muscles in animals as diverse as leeches, slugs, lam-

preys, turtles, insects, birds, cats, and rats. Although CPGs may not require
sensory feedback for their basic oscillatory behavior, such feedback is essen-

tial in structuring motor patterns as animals move. This influence may be so

strong that certain sensory neurones should be viewed not as modulators but
as integral members of a distributed pattern-generating network that comprises
both central and peripheral neurones. This is the main motivation behind our
gait synthesizer learning to select gait patterns while other parts of the
synthesizer learn to evaluate performance based on sensory data and modify
these patterns when necessary. More specifically, the Gait Selection Module
(GSM, Section 3.2.2) in our architecture acts as the CPG of real animals.

As a result of studies on animal locomotion, a few themes emerge. First,

the dynamics of locomotion is complicated but built on a few common principles,
including common mechanisms of energy exchange and the use of force for
propulsion, stability, and maneuverability. Second, the locomotory performance
of animals in natural habitats reflects trade-offs between different
ecologically important aspects of behavior and is affected by the physical

properties of the environment. Third, the control of locomotion is not a lin-

ear cascade, but a distributed organization requiring both feedforward motor


patterns and neural and mechanical feedback. Fourth, muscles perform many

different functions in locomotion, a view expanded by the integration of muscle

physiology with whole-animal mechanics (muscles can act as motors, brakes,


springs, and struts).

Because machines face the same physical laws and environmental con-
straints that biological systems face when they perform similar tasks, the solu-

tions they use may embrace similar principles. Legged machines have a lot to
learn from nature. But the evolutionary pressures that dictate the morphology
and physiology of animals do not always give suitable results for our tasks.
For example, 40% of the body mass of a shrimp is devoted to the large, tasty
abdominal muscles that produce a powerful tail flick during rare, but critical,
escape behaviors [16]. The imitation of such a body design will surely result
in an inefficient machine. The consequence is that the information taken from
nature must be processed and the fundamental principles

must be defined. That is why we concentrated on redundant legged robots


and, more specifically, on six-legged ones.

2.2.2 Control of Legged Robots

The main challenge for legged robots is the control system. A system

that controls such a robot accomplishes several tasks [5]. First, it regulates
the robot's gait, that is, the sequence and way in which the legs share the task
of locomotion. For example, six-legged robots work with gaits that elevate a
single leg at a time or two or three legs simultaneously. A gait that elevates

several legs at once generally makes it possible to travel faster but offers less

stability than a gait that keeps more legs on the ground.

A second task is to keep the robot from tipping over. For vehicles using
static stability, if the center of gravity of the robot moves beyond the base
of support provided by the legs, the robot will tip. So, the location of the
center of gravity with respect to the placement of the feet must be
continuously monitored by the robot. In our control structure, static
stability is provided by ensuring safety margins from physical limits, such as
the distance of the center of gravity from the boundary of the support polygon
and the distance of the legs from their reach limits during the support phase.
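As an aside, the static-stability check described here reduces to a small
geometric computation. The helper below is a minimal sketch (assuming the
stance feet are given as the counter-clockwise vertices of a convex support
polygon in the horizontal plane); it is illustrative only and is not the
controller code used in this thesis.

from math import hypot

def stability_margin(cg_xy, feet_xy):
    """Shortest distance from the CG's vertical projection to the edges of the
    support polygon; negative if the projection falls outside (robot tips).
    Assumes feet_xy lists the stance-foot contacts in counter-clockwise order."""
    cx, cy = cg_xy
    margin = float("inf")
    inside = True
    n = len(feet_xy)
    for i in range(n):
        (x1, y1), (x2, y2) = feet_xy[i], feet_xy[(i + 1) % n]
        ex, ey = x2 - x1, y2 - y1
        # signed distance of the CG from this edge (positive = inside for CCW order)
        d = (ex * (cy - y1) - ey * (cx - x1)) / hypot(ex, ey)
        inside = inside and d >= 0.0
        margin = min(margin, abs(d))
    return margin if inside else -margin

# tripod support: three stance-foot contacts (arbitrary example coordinates)
print(stability_margin((0.0, 0.0), [(0.3, 0.2), (-0.3, 0.2), (0.0, -0.25)]))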

Since many legs share the support of the body, a third task is to dis-

tribute the support load and the lateral forces among the legs. Smoothness of
the ride and minimal disturbance of the ground are the main objectives during
this task. In this thesis work, the smoothness of the legged robot is provided
by applying the periodic wave gait patterns of insects. Perturbations from the
ground to the robot are compensated for by choosing proper gaits during
locomotion.

A fourth task is to make sure the legs are not driven past their limits
during their travel. The geometry of the legs may make it possible for one leg
to bump into another. The control system must take into account the limits of

the leg’s motion and the expected motion of the robot during that leg’s stance

period. In our robot the legs’ operation areas are restricted such that they do

not overlap.

A fifth task is to choose places for stepping that will give adequate sup-

port. For this task, a sensor system that scans the ground ahead of the robot
is required. This system builds an internal digital model of the terrain and
processes it to find suitable footholds. Here, the softness of the terrain may
cause problems. In the gait synthesizer we developed, a task-oriented internal
model is learned during the learning process of gait evaluation.

We perform these five tasks (which are related to locomotion) on a hexa-

pod robot by focusing on gait control. In other words, our solutions to the

problems in the overall control of the hexapod robot are based on gait con-

trol. For the rest of the tasks, which depend on the application, we just show
the potential of the gait synthesizer. Specifically, we will show that the gait
synthesizer is capable of adapting to rescue operations where a leg of the
hexapod is used as a manipulator while the rest provide mobility. However, the
key challenge in legged robots is to control individual components (legs) for
cooperative manipulation, while obtaining their cooperation for walking as an
integrated whole. This is the motivation behind this thesis work.
integrated whole. This is behind the motivation of this thesis work.

2.2.3 Gait Analysis

In this thesis we focus on gaits of legged robots. A gait is a sequence

of leg motions coordinated with a sequence of body motions for the purpose

of transporting the body of the legged system from one place to another [8].

Gait analysis is one of the fundamental areas in the study of walking robots.
It is important because it is the major factor that affects the geometric and

control design of a walking robot [30]. In general, there are two types of gaits:

periodic and non-periodic gaits [8].

Figure 2.2: Wave gait patterns. Bold lines represent swing phase. L1 signifies the
left front leg and R3 indicates the right hind leg [7].

Periodic gaits are those in which a specific pattern of leg movement is

imposed. Observations on insect gaits (cockroaches, stick insects) show that

insects produce sequential movements starting with hind leg protraction and
followed by the middle and front legs, which is called metachronal wave or
wave gait [7]. The slowest gait involves an alternation between right-sided and

left-sided metachronal waves (Fig. 2.2A). As these waves overlap (Fig. 2.2B

to 2.2E), tetrapod gaits (Fig. 2.2C, 2.2D) and the typical tripod gait (Fig.

2.2E) are generated. The tripod gait (observed in hexapod insects such as
cockroaches) is the fastest statically stable gait that a six-legged mechanism
can use. In the tripod gait, three legs that enclose the center of gravity

support the body while the other legs simultaneously lift and recover. Peri-
odic gaits offer good mobility over smooth terrain since they possess optimum

stability. However, the terrain irregularities that can be dealt with by these
gaits are relatively limited. If the terrain irregularity is severe, such as in
natural disaster areas, periodic gaits become ineffective, and special gaits
need to be developed. These are non-periodic gaits. Work in this area comprises
studies on free gaits [32] and large obstacle gaits [30]. Free gaits are gaits
in which any leg is permitted to move at any time [31]. In the free gait
approach, a finite set of gait states is defined and control is done on a
rule-based principle, resulting in simple motions lacking smoothness. Our gait
control approach takes advantage of these gaits in order to achieve smooth and
adaptive locomotion over unpredictable terrain roughness.

Figure 2.3: Hexapod model. Dashed legs are in swing phase.

Fig. 2.3 shows a hexapod model. The leg order as labelled in Fig. 2.3
is adopted throughout our thesis work. Below are some terms used in gait

analysis [8], [10], [30]:

1. Protraction: The leg moves towards the front of the body.

2. Retraction: The leg moves towards the rear of the body.

3. Stance phase: The leg is on the ground where it supports and propels

the body. In forward walking, the leg retracts during this phase. Also

called power stroke or support phase.

4. Swing phase: The leg lifts and swings to the starting position of the next

stance phase. In forward walking, the leg protracts during this phase.
Also called the return stroke or recovery phase.

5. Cycle time: The time for a complete cycle of leg locomotion of a periodic
gait.

6. Duty factor of a leg: The time fraction of a cycle time in which the leg

is in the support phase.

7. Phase of a leg: The fraction of a cycle period by which the placement of the

leg lags behind the placement of the reference leg.

8. Support Polygon: Two dimensional point set in a horizontal plane con-

sisting of the convex hull of the vertical projection of all foot points in
support phase (Fig. 2.3).

9. Stability Margin (Sm): The shortest distance of the vertical projection


of center of gravity to the boundaries of the support pattern in the hor-

izontal plane.

10. Front and Rear Boundary: The boundaries of the support polygon that are

respectively ahead of and behind the projection of the center of gravity in

forward walking and that intersect the longitudinal body axis.

11. Front and Rear Stability Margin (Front and Rear Sm): The distances
from the vertical projection of the center of gravity to the front and rear

boundaries of the support polygon respectively, in forward walking.

12. Kinematic Margin (Km): The distance from the current foothold of a
stance leg to the border of its reachable area in the opposite direction of

body motion (Fig. 2.3).

13. Anterior Extreme Position (AEP): In forward walking, this is the target
position of the advance degree of freedom during recovery phase. It is

the foremost position a leg reaches during a cycle.

14. Posterior Extreme Position (PEP): In forward walking, this is the target

position of the swing degree of freedom during support phase. It is the

backmost position a leg reaches during a cycle.

15. The Stroke distance (Sd): The distance between Anterior Extreme Point
(AEP) and Posterior Extreme Point (PEP).
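To make the notions of phase and duty factor above concrete, the following
sketch prints a support/swing diagram in the spirit of Fig. 2.2 for a tripod
gait, assuming a simple phase-offset model in which each leg is in stance for
the first duty-factor fraction of its own (shifted) cycle; the numeric values
are illustrative only.

def leg_state(t, phase, duty_factor, cycle_time=1.0):
    """Return 'stance' or 'swing' for a leg with the given phase offset.
    The leg is in stance for the first duty_factor fraction of its own cycle."""
    local = ((t / cycle_time) + phase) % 1.0
    return "stance" if local < duty_factor else "swing"

# Tripod gait: {L1, R2, L3} in antiphase with {R1, L2, R3}, duty factor 1/2.
phases = {"L1": 0.0, "R2": 0.0, "L3": 0.0, "R1": 0.5, "L2": 0.5, "R3": 0.5}
beta = 0.5
for leg, ph in phases.items():
    diagram = "".join("-" if leg_state(k / 20.0, ph, beta) == "stance" else "#"
                      for k in range(40))   # two cycles, '#' marks swing
    print(f"{leg}: {diagram}")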

2.2.4 Gait Control

One of the aspects related to the control of legged robots is the generation
of stable gaits [31]. The task of the gait generation mechanism can be defined
as selecting an appropriate coordination sequence of leg and body movements
so that the robot advances with a desired speed and direction. Gait generation
for robots with six or more legs has been addressed in several research
efforts, which we overview here.

The principle stated by experimental studies of walking in insects is that

gaits arise from the interaction of individual leg oscillators (step pattern gen-
erators) which govern the stepping of each leg by exchanging the influences
of the legs [9]. The information transmitted from the step pattern generator

depends upon the leg’s state (either swing or stance, position and velocity).
Here the position information plays a particularly central role in coordination.
We also take this into consideration in our work.

Several versions of this interleg coordination principle have been investigated and

implemented in insect-inspired walking robots. Pearson [21] proposed that

modification of the walking coordination may occur through load sensors in


the leg’s chordotonal organ and position information from the campaniform

sensillae. This model formed the basis of Beer’s simulation of cockroach be-
haviors [18] where the effect of load and position sensors was simulated by

forward and backward angle sensor "neurons" as well as ground contact and
stance and swing "neurons" within a distributed neural network control archi-
tecture. This basic model was then implemented on a walking robot with two

degrees of freedom per leg [19].

A more complex interleg coordination model is proposed by [20] and


[24]. Together they identify at least six mechanisms that work between legs in

a stick insect. A summary of the coordination mechanisms in the stick insect is

shown in Fig. 2.4. The arrows indicate the direction of influences which estab-
lish the coordination of the legs providing stability. In [24], [25] most of these

mechanisms are simulated and some of them have also been implemented on a

robot with two dof per leg [26] and two robots with three dof per leg [27]. In

Figure 2.4: Summary of coordination mechanisms in the stick insect. The pattern
of coordinating influences among the step generators for the six legs is shown at
the left; the arrows indicate the direction of the influence. The mechanisms are
described briefly at the right [24].

the implementations, interleg coordination mechanisms operate by modifying

the PEP (the AEP and PEP are used as the switching points between swing and
stance phases, and the AEP is set to a constant value) of a receiving leg depending

upon the state of a sending leg.

In [10], Ferrell compares different insect-inspired gait controllers. The

most important feature of these implementations is that they are highly dis-

tributed. But, much is still unknown about the general dynamical behavior

of the models and dependence of this behavior on parameters [34]. So, pa-
rameters associated with the model must be tuned heuristically to achieve a

desired behavior. However, one of the major requirements of rescue robot de-

sign is the flexibility of the design for different rescue usage in disaster areas

of varying properties [2]. Our work on gait control offers such flexibility by
adapting decisions drawn from insect-inspired gait patterns to the changing
needs of the terrain and of the rescue task.

In the literature, some of the complete walking robot designs do not offer
remarkable approaches for gait control [37], [11]. They usually apply fixed
gait patterns (especially the tripod). But some research still focuses on the
subject. In [30], Choi and Song deal with obstacle-crossing gaits. Their study
presents fully automated gaits that can be used to cross four types of
simplified obstacles: grade, ditch, step, and isolated wall. After the type
and dimensions of an obstacle are entered, the system generates a series of
pre-programmed movements that enable a hexapod to cross over the obstacle in


a fully automated mode. Our approach provides obstacle crossing by trying

different gaits rather than imposing pre-programmed movements.

In [36] a gait state definition is presented as a function of the last steps

executed. They identify several classes of gait states and transitions between

them. They show that, independently of the initial posture of the robot, by
executing a sequence of gait states the robot ends up in one of four situations
according to the number of legs in contact, and the tripod gait can be
obtained.

Yang and Kim focus on robustness to leg damage in walking machines and deal
with fault-tolerant gaits [28]. These are gaits that maintain stability in
static walking against a fault event that prevents a leg from reaching the
support state. In [29], they successfully implement a fault-tolerant gait over
uneven terrain. In our gait control approach we do not distinguish the gaits
according to their fault tolerance, but we enable the controller to search

for the gait that will solve the problem.

In [6], Celaya and Porta present a complete control structure for the lo-

comotion of a legged robot on uneven terrain. In the gait generation they use
two rules by which different gaits, including the complete family of wave
gaits, can be obtained with a proper initial state. The first rule, 'never
have two neighboring legs raised from the ground at the same time', guarantees
static stability. The second rule, 'a leg should perform a step when this is
allowed by the first rule and its neighboring legs have stepped more recently
than it has', forces the alternation of the steps of any pair of neighboring
legs. These two rules are local, so no central synchronization

is required.
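One possible reading of these two local rules in code is sketched below; the
leg adjacency table and the "time of last step" bookkeeping are assumptions
made for the illustration, not the authors' implementation.

# Leg labels as in Fig. 2.3; the neighbours of a leg are the adjacent
# ipsilateral and contralateral legs of the hexapod.
NEIGHBOURS = {
    "L1": ["L2", "R1"], "L2": ["L1", "L3", "R2"], "L3": ["L2", "R3"],
    "R1": ["R2", "L1"], "R2": ["R1", "R3", "L2"], "R3": ["R2", "L3"],
}

def may_step(leg, raised, last_step):
    """Rule 1: never have two neighbouring legs raised at the same time.
    Rule 2: step only if every neighbour has stepped more recently."""
    rule1 = all(n not in raised for n in NEIGHBOURS[leg])
    rule2 = all(last_step[n] > last_step[leg] for n in NEIGHBOURS[leg])
    return rule1 and rule2

last_step = {"L1": 3, "L2": 5, "L3": 4, "R1": 6, "R2": 2, "R3": 5}
raised = set()
print([leg for leg in NEIGHBOURS if may_step(leg, raised, last_step)])

With these example values, the legs allowed to step happen to form one of the
tripod sets, which hints at how the tripod gait can emerge from purely local
rules.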

In [33], a modified version of the Q-learning approach is used for the
decentralized control of the robot Kafka. Each leg, which can be in one of a
finite number of states, has its own look-up table and can communicate with
the others. Based on the states of the legs and of those to which they are
coupled, actions are chosen according to these look-up tables. The modified
Q-learning approach is

employed to search for a set of actions resulting in successful walking gaits.

Parker et al. utilize the Cyclic Genetic Algorithm (CGA) to produce

gaits for a hexapod robot [40]. The approach to generate a gait is to develop
a model capable of representing all states of the robot and use a cyclic genetic

algorithm to train this model to walk forward. CGA is developed as a modi-

fication of the standard genetic algorithm. The CGA incorporates time into
the chromosome structure by assigning each gene a task to be accomplished
in a set amount of time. Also, some portions of the chromosome (tasks) are
repeated, creating a cycle. This allows the chromosome to represent a program
that has a start section and an iterative section. In [40] it is shown that with

only minimal a priori knowledge the optimal tripod gait for a hexapod robot
can be produced.

In [11], a survey of different approaches for gait generation can be found.

Among the methods in the literature, our gait synthesizer has a hybrid
structure. The interaction of legs (mutual inhibitions and excitations) in
biological systems results in the observed gait patterns. Without implementing
the described interleg mechanisms, we work on these patterns so that we make
use of this biological background. At the same time, our system enables the
flexibility of non-periodic gaits by allowing any leg to move out of the
pattern when needed. In our approach, both the terrain conditions and
performance criteria determine the gait to be applied.

2.3 Mathematical Background

2.3.1 Neural-Fuzzy Controllers

Neural Fuzzy Controllers (NFCs), based on a fusion of ideas from fuzzy


control and neural networks, possess the advantages of both neural networks

(e.g., learning abilities, optimization abilities, and connectionist structures)

and fuzzy control systems (e.g., humanlike IF-THEN rule thinking and ease

of incorporating expert knowledge) [13]. Fuzzy systems and neural networks

share the common ability to improve the intelligence of systems working in

an uncertain, imprecise, and noisy environment. The main purpose of a neural
fuzzy control system is to apply neural learning techniques to find and tune
the parameters and/or structure of the neuro-fuzzy control system. Some of the

works in this area are Generalized Approximate Reasoning based Intelligent


Control (GARIC) [12], Fuzzy Adaptive Learning Control Network (FALCON)
[52], Adaptive Neuro Fuzzy Inference System (ANFIS) [53], and Neuro-Fuzzy

Control (NEFCON) [54]. In our work we adopted the GARIC architecture in

order to develop the gait synthesizer for our multilegged robot.

2.3.2 GARIC Architecture

Generalized approximate reasoning-based intelligent control (GARIC),


introduced by Berenji and Khedkar [12], is a neural fuzzy control system with

reinforcement learning capability. GARIC presents a method for learning and

tuning fuzzy logic controllers (FLC) through reinforcement signals. It consists

of three modules (Fig. 2.5): an action evaluation network (AEN) that maps
a state vector and a failure signal into a scalar score (internal reinforcement)
indicating the goodness of the state, an action selection network (ASN) that

maps a state vector into a recommended action using fuzzy inference, and a

stochastic action modifier that produces the actual action based on internal re-

inforcement. Learning occurs by fine-tuning the free parameters in the two

networks: in the AEN, the weights are adjusted; in the ASN, the parameters
describing the fuzzy membership functions are changed.

Figure 2.5: The GARIC architecture [12].

Action Evaluation Network

The AEN constantly predicts reinforcements associated with different

input states. It is a two-layer feedforward network with direct interconnections

from the input nodes to the output node (Fig. 2.6). The input to the AEN is the state
of the plant, and the output is an evaluation of the state (or equivalently, a

prediction of the external reinforcement signal) denoted by v(t). The output


of each node in the AEN is calculated by the following equations

y_i(t) = g\left( \sum_{j=1}^{n} a_{ij}(t)\, x_j(t) \right) \qquad (2.1)

v(t) = \sum_{i=1}^{n} b_i(t)\, x_i(t) + \sum_{i=1}^{n} c_i(t)\, y_i(t) \qquad (2.2)

where

g(s) = \frac{1}{1 + e^{-s}} \qquad (2.3)
is the sigmoid function, v is the prediction of the reinforcement signal, and
a_{ij}, b_i, and c_i are the corresponding link weights, shown as A, B, and C in Fig. 2.6.

Figure 2.6: The action evaluation network.

This network evaluates the action recommended by the action network

as a function of the failure signal and the change in state evaluation based on
the state of the system at time t:

\hat{r}(t) = \begin{cases} 0 & \text{start state} \\ r(t) - v(t-1) & \text{failure state} \\ r(t) + \gamma\, v(t) - v(t-1) & \text{otherwise} \end{cases} \qquad (2.4)
where 0 ≤ γ ≤ 1 is the discount rate. In other words, the change in the value

of v plus the value of the external reinforcement constitutes the heuristic or

internal reinforcement, r̂, where the future values of v are discounted more the
further they are from the current state of the system.

Learning in AEN is based on internal reinforcement, r̂(t). If r is positive,

the weights are altered so as to increase the output v for positive input, and

vice versa. Therefore, the equations for updating the weights are as follows:

b_i(t) = b_i(t-1) + \beta\, \hat{r}(t)\, x_i(t-1) \qquad (2.5)

c_i(t) = c_i(t-1) + \beta\, \hat{r}(t)\, y_i(t-1) \qquad (2.6)

a_{ij}(t) = a_{ij}(t-1) + \beta_h\, \hat{r}(t)\, y_i(t-1)\,\big(1 - y_i(t-1)\big)\, \mathrm{sgn}\big(c_i(t-1)\big)\, x_j(t-1) \qquad (2.7)

where \beta > 0 and \beta_h > 0 are constant learning rates.
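To make equations (2.1)–(2.7) concrete, here is a minimal numerical sketch of
one AEN evaluation-and-update step; the network sizes, learning rates, state
vectors, and external reinforcement are arbitrary illustration values, and the
internal reinforcement is taken from the non-failure case of (2.4).

import numpy as np

n, h = 4, 5                        # state size and hidden-unit count (arbitrary)
rng = np.random.default_rng(0)
A, B, C = rng.normal(size=(h, n)), rng.normal(size=n), rng.normal(size=h)
beta, beta_h, gamma = 0.1, 0.05, 0.9

def aen_forward(x):
    y = 1.0 / (1.0 + np.exp(-A @ x))   # eq. (2.1): sigmoid hidden layer
    v = B @ x + C @ y                  # eq. (2.2): state evaluation
    return y, v

x_prev = rng.normal(size=n)
y_prev, v_prev = aen_forward(x_prev)
x, r = rng.normal(size=n), 0.0         # next state and external reinforcement
_, v = aen_forward(x)
r_hat = r + gamma * v - v_prev         # eq. (2.4), non-failure case

# eqs. (2.5)-(2.7): updates driven by the internal reinforcement
# (A is updated first so that sgn(C) uses the previous value of C, as in (2.7))
A += beta_h * r_hat * np.outer(y_prev * (1 - y_prev) * np.sign(C), x_prev)
B += beta * r_hat * x_prev
C += beta * r_hat * y_prev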

Action Selection Network

As shown in Fig. 2.7, the ASN is a five layer network with each layer
performing one stage of the fuzzy inference process. The functions of each
layer are briefly described here.

• Layer 1: An input layer that just passes input data to the next layer.

• Layer 2: Each node in this layer functions as an input membership func-

tion. Here triangular membership functions are used:







\mu_V(x) = \begin{cases} 1 - |x - c|/s_L & x \in [c - s_L,\, c] \\ 1 - |x - c|/s_R & x \in [c,\, c + s_R] \\ 0 & \text{otherwise} \end{cases} \qquad (2.8)

where V = (c, s_L, s_R) indicates an input linguistic value, and c, s_L, s_R
correspond to the center, left spread, and right spread of the triangular
membership function \mu_V, respectively.

Figure 2.7: The action selection network.

• Layer 3: Each node in this layer represents a fuzzy rule and imple-
ments the conjunction of all the preconditions in the rule. Its output

w_r, indicating the firing strength of this rule, is calculated by the following

continuous, differentiable softmin operation:

w_r = \frac{\sum_i \mu_i\, e^{-k\mu_i}}{\sum_i e^{-k\mu_i}} \qquad (2.9)

where µi is the output of a layer 2 node, which is the degree of matching

between a fuzzy label occurring as one of the preconditions of rule r

and the corresponding input variable. The parameter k controls the


hardness of the softmin operation, and as k → ∞ we recover the usual

min operator. However, for k finite, we get a differentiable function of


the inputs, which makes it convenient for calculating gradients during

the learning process. The choice of k is not critical.

• Layer 4: Each node in this layer corresponds to a consequent label.

For each of the wr supplied to it, this node computes the corresponding
output action as suggested by rule r. This mapping is written as \mu^{-1}(w_r),
where the inverse is taken to mean a suitable defuzzification procedure

applicable to an individual rule. For triangular functions,

\mu_Y^{-1}(w_r) = c + 0.5\,(s_R - s_L)\,(1 - w_r) \qquad (2.10)

where Y = (c, s_L, s_R) indicates a consequent linguistic value.

• Layer 5: Each node in this layer is an output node that combines the

recommendations from all the fuzzy control rules using the following
weighted sum:

F = \frac{\sum_r w_r\, \mu^{-1}(w_r)}{\sum_r w_r} \qquad (2.11)

In the ASN, adjustable weights are present only on the input links of
layers 2 and 4. The other weights are fixed at unity.
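The following sketch walks through layers 2–5 numerically: triangular
memberships as in (2.8), the softmin of (2.9), rule-local defuzzification as in
(2.10), and the weighted combination of (2.11). The two rules, their labels,
and all parameter values are invented purely for illustration.

import numpy as np

def tri(x, c, sL, sR):                      # eq. (2.8): triangular membership
    if c - sL <= x <= c:
        return 1.0 - abs(x - c) / sL
    if c < x <= c + sR:
        return 1.0 - abs(x - c) / sR
    return 0.0

def softmin(mus, k=10.0):                   # eq. (2.9): differentiable "min"
    mus = np.asarray(mus)
    w = np.exp(-k * mus)
    return float(np.sum(mus * w) / np.sum(w))

def defuzz(w, c, sL, sR):                   # eq. (2.10): rule-local defuzzification
    return c + 0.5 * (sR - sL) * (1.0 - w)

x1, x2 = 0.3, -0.2
# Each rule: (antecedent memberships over x1 and x2, consequent label (c, sL, sR)).
rules = [
    ([tri(x1, 0.0, 1.0, 0.5), tri(x2, 0.0, 0.5, 0.5)], (-1.0, 0.3, 0.6)),
    ([tri(x1, 1.0, 0.7, 1.0), tri(x2, -0.5, 0.5, 0.5)], (1.0, 0.6, 0.3)),
]
ws = [softmin(mus) for mus, _ in rules]     # layer 3: rule firing strengths
outs = [defuzz(w, *lab) for w, lab in rules]  # layer 4: per-rule actions
F = sum(w * o for w, o in zip(ws, outs)) / sum(ws)   # eq. (2.11): layer 5 output
print(round(F, 3))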

Stochastic Action Modifier

In the GARIC architecture, the output of the ASN is not applied to the
environment directly. The Stochastic Action Modifier (SAM) uses the value of r̂
from the previous time step and the action F recommended by the ASN to
stochastically generate an action, F′, which is a Gaussian random variable with
mean F and standard deviation σ(r̂(t − 1)). This σ(·) is some nonnegative,
monotonically decreasing function, e.g. exp(−r̂). The action F′ is what is actually

applied to the plant. The stochastic perturbation in the suggested action leads

to a better exploration of state space and better generalization ability. When


r̂(t − 1) is low, meaning the last action performed is bad, the magnitude of
the deviation |F′ − F| is large, whereas the controller remains consistent with

the fuzzy control rules when r̂(t − 1) is high. The actual form of the function
σ(·), especially its scale and rate of decrease, should take the units and range
of variation of the output variable into account.
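A minimal sketch of this perturbation step, taking σ(r̂) = exp(−r̂) as suggested
in the text (the numeric values are only illustrative):

import math
import random

def sam(F, r_hat_prev):
    """Perturb the ASN recommendation F with noise that shrinks as the
    previous internal reinforcement grows (sigma = exp(-r_hat))."""
    sigma = math.exp(-r_hat_prev)
    return random.gauss(F, sigma)

random.seed(1)
print(sam(0.4, r_hat_prev=2.0))    # high reinforcement: stays close to F
print(sam(0.4, r_hat_prev=-1.0))   # low reinforcement: explores more widely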

In GARIC, the goal of calculating F values in ASN is to maximize

the evaluation of the gait, v, determined by AEN. The gradient information

∆p = δv/δp (p is the vector of all adjustable weights in ASN) is estimated by


stochastic exploration in the Stochastic Action Modifier (SAM). The modifi-

cation implemented in t − 1 by SAM is judged by r̂(t). If r̂ > 0, meaning the


modified F 0 (t − 1) is better than expected, then F (t − 1) is moved closer to

the modified one, and vice versa.

2.3.3 Fuzzy Sets and Fuzzy Logic Controllers

Fuzzy sets, introduced by Zadeh in 1965 as a mathematical way to rep-

resent vagueness in linguistics, can be considered a generalization of classical

set theory [47]. In a classical set, the membership of an element is crisp; it is

either yes (in the set) or no (not in the set). A crisp set can be defined by the

so-called characteristic function (or membership function). The characteristic


function \mu_A(x) of a crisp set A is given as

\mu_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}
Fuzzy set theory extends this concept by defining partial memberships,

which can take any value ranging from 0 to 1:

µA (x) : U → [0, 1]

where U refers to the universal set defined in a specific problem.

Fuzzy logic was one of the major developments of Fuzzy Set Theory and

was primarily designed to represent and reason with knowledge that cannot

be expressed by quantitative measures. The main idea of algorithms based on

fuzzy logic is to imitate the human reasoning process to control ill-defined or


hard-to-model plants. Fuzzy inference systems model the qualitative aspects
of human knowledge through linguistic if-then rules. Every rule has two parts:

an antecedent part (premise), expressed by if..., which is the description of the

state of the system, and a consequent part, expressed by then..., which is the
action that the operator who controls the system must take.

We can use fuzzy sets to represent linguistic variables. Linguistic vari-

ables represent the process states and control variables in a fuzzy controller.

Their values are defined in linguistic terms and they can be words or sentences

in a natural or artificial language.

The most important operators in classical set theory with crisp sets are
complement, intersection, and union. These operations are defined in fuzzy

logic via membership functions. The membership values in a complement

subset Ā are

µĀ (x) = 1 − µA (x)

which corresponds to the same operation in the classical theory. For the inter-

section of two fuzzy sets various operators have been proposed (min operator,
algebraic product, bounded product,...). The min operator for two fuzzy sets
A and B is given as

\mu_A(x) \text{ and } \mu_B(x) = \min\{\mu_A(x), \mu_B(x)\}

For the union of two fuzzy sets, there is a class of operators named t-conorms

or s-norms. One of the most used in the literature is the max operator:

\mu_A(x) \text{ or } \mu_B(x) = \max\{\mu_A(x), \mu_B(x)\}
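As a small illustration (a sketch, not part of the thesis software), these three operations reduce to elementary arithmetic on membership values:

```python
def mu_not(mu_a):            # complement: 1 - mu_A(x)
    return 1.0 - mu_a

def mu_and(mu_a, mu_b):      # intersection with the min operator (a t-norm)
    return min(mu_a, mu_b)

def mu_or(mu_a, mu_b):       # union with the max operator (an s-norm)
    return max(mu_a, mu_b)

# With mu_A(x) = 0.7 and mu_B(x) = 0.4:
# mu_not(0.7) -> 0.3, mu_and(0.7, 0.4) -> 0.4, mu_or(0.7, 0.4) -> 0.7
```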

Figure 2.8: General model of a fuzzy logic controller.

A typical FLC architecture comprises four principal components (Fig. 2.8): a fuzzifier, a fuzzy rule base, an inference engine, and a defuzzifier. The

fuzzifier performs the fuzzification that converts the data from the sensor mea-

surements into proper linguistic values of fuzzy sets through predefined input

membership functions. In Fuzzy Rule Base, fuzzy control rules are character-

ized by a collection of fuzzy IF-THEN rules in which the preconditions and


consequents involve linguistic variables. This collection of fuzzy control rules

(or fuzzy control statements) characterizes the simple input-output relation of
the system. The inference engine matches the output of the fuzzifier with

the fuzzy logic rules and perform fuzzy implication and approximate reasoning

to decide a fuzzy control action. Finally, the defuzzifier performs the function
of defuzzification to yield a nonfuzzy (crisp) control action from an inferred
fuzzy control action through predefined output membership functions.

The principal elements of designing a FLC include defining input and out-

put variables, deciding on the fuzzy partition of the input and output spaces

and choosing the membership functions for the input and output linguistic

variables, deciding on the types and derivation of fuzzy control rules, design-

ing the inference mechanism, and choosing a defuzzification operator [13].

2.3.4 Reinforcement Learning

Reinforcement learning is an approach to artificial intelligence that em-

phasizes learning by the individual from its interaction with its environment

[13]. The environment supplies a time varying vector of input to the system,

receives its time varying vector of output or action and then provides a time
varying scalar reinforcement signal. Here, the reinforcement signal r(t) can be

one of the following forms: a two-valued number r(t) ∈ {-1, 1} or {-1, 0} such that r(t) = 1 (or 0) means "success" and r(t) = -1 means "failure"; a multi-valued discrete number in the range [-1, 1] or [-1, 0], for example r(t) ∈ {-1, -0.5, 0, 0.5, 1}; or a real number r(t) ∈ [-1, 1] or [-1, 0], which represents a more detailed and continuous degree of failure or success. We also assume that r(t) is the
reinforcement signal available at time step t and is caused by the inputs and

actions at time step (t-1) or even affected by earlier inputs and actions.

A challenging problem in reinforcement learning is that there may be a

long time delay between a reinforcement signal and the actions that caused it.

In such cases a temporal credit assignment problem results because we need to

assign credit or blame, for an eventual success or failure, to each step individ-

ually in a long sequence. An approach to solve such problem is based on the


temporal difference methods [41]. TD methods consist of a class of incremental

learning procedures specialized for prediction problems. TD methods assign

credit based on the difference between temporally successive predictions. The

main characteristic of these methods is that it is not required to wait until the

actual outcome is known.

The object of learning is to construct an action selection policy that optimizes the system's performance. A natural measure of performance is the discounted cumulative reinforcement (utility [38])

V_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k}    (2.12)

where Vt is the discounted cumulative reinforcement starting from time t

throughout the future, rt is the reinforcement received after the transition

from time t − 1 to t, and 0 ≤ γ ≤ 1 is a discount factor, which adjusts the

importance of long term consequences of actions. In the approach to solve the

temporal credit assignment problem, the aim is to learn an evaluation func-

tion to predict the discounted cumulative reinforcement. The evaluation function

V_x^\pi is the expected discounted cumulative reinforcement that will be received


starting from state x, or simply the utility of state x. The evaluation function

is represented using connectionist networks (evaluation network or critic) and
learned using a combination of temporal difference methods and error back-

propagation algorithm. TD methods compute the error called the TD error

between temporally successive predictions, and the backpropagation algorithm


minimizes the error by modifying the weights of the networks.

Let p_t be the output of the evaluation network, which denotes the estimate at time step t for the evaluation function V_x^\pi, given the state x_t, and let r_t be the actual cost incurred between time steps t − 1 and t. Then p_{t−1} predicts

\sum_{k=0}^{\infty} \gamma^k r_{t+k} = r_t + \gamma p_t    (2.13)

In this case the prediction error (TD error) which is the difference between

estimated evaluation and actual evaluation would be

(rt + γpt ) − pt−1 (2.14)

This method is used for prediction problems in which exact success or

failure may never become completely known.
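The prediction error of eq. 2.14 and the resulting update of a critic can be written compactly. The linear critic below is only an illustrative assumption (the gait synthesizer of chapter 3 uses a two-layer network trained by backpropagation).

```python
import numpy as np

def td_error(r_t, p_t, p_prev, gamma=0.9):
    """TD error between temporally successive predictions (eq. 2.14)."""
    return r_t + gamma * p_t - p_prev

def td0_step(w, x_prev, x_t, r_t, lr=0.05, gamma=0.9):
    """One TD(0) update of a linear evaluation p(x) = w . x."""
    p_prev, p_t = float(w @ x_prev), float(w @ x_t)
    delta = td_error(r_t, p_t, p_prev, gamma)
    return w + lr * delta * x_prev   # move the old prediction toward r_t + gamma * p_t
```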

CHAPTER 3

LEGGED ROBOT

3.1 Dynamics and Coordinated Control of Legged Robots

The dynamics of a robotic system play a central role in both its control

and simulation. When studying the control of robots, the primary problem,

which must be solved, is known as Inverse Dynamics. Solution of the in-


verse dynamics problem requires the calculation of the actuator torques and/or

forces that will produce a prescribed motion in the robotic system. In the area of simulation, on the other hand, the fundamental problem to be solved is called Forward or Direct Dynamics. Solution of this problem requires the determination
ward or Direct Dynamics. Solution of this problem requires the determination


of the joint motion, which results from a given set of applied joint torques
and/or forces.

The overall mechanism of a legged robot is a closed-chain comprised of a

body with supporting legs. The kinematic relations between the leg joint mo-

tion and the body motion are complicated. The additional complexity arises
because the chains (legs) of the system are coupled through the body.

In the approach presented here the resemblance between the control of

legged robots and the manipulation of objects by multi-fingered robot hands

is considered. The dynamics and control of grasping are developed in various

prior works [48], [51]. We adapt these concepts here to legged robots. The

basics of the mathematical background given in this section can be found in


[42], [44], [50]. Note that these analyses are valid for legged robots using static balance, where the body is continuously supported by at least three legs constituting a support polygon.

The dynamics and control algorithm presented here must be considered within a complete control system for a legged robot, including navigation, terrain adaptation, etc. Because these concepts are outside the scope of this thesis work, we only give the algorithm; simulations for the gait synthesizer will be implemented with the simpler model described in chapter 4.

3.1.1 Motion Dynamics of Legged Robots

We first derive equations concerned with moving coordinate frames. Let C1 and C2 be two coordinate frames. We denote by p12 ∈ R^3 and R12 ∈ SO(3) (a 3 × 3 orthogonal matrix, R^{-1} = R^T) the position and orientation of C2 relative to C1. Besides, we denote by v12 = \dot{p}_{12} and w12 = S^{-1}(\dot{R}_{12} R_{12}^T) (or \dot{R}_{12} = S(w_{12}) R_{12}) the translational and rotational velocity of C2 relative to C1,

where S is an operator defined by

w = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}, \quad S(w) = \begin{bmatrix} 0 & -w_3 & w_2 \\ w_3 & 0 & -w_1 \\ -w_2 & w_1 & 0 \end{bmatrix}

which clearly satisfies

S(w) f = w \times f \quad \text{and} \quad A S(w) A^T = S(Aw) \quad \text{for all } A \in SO(3),\ w, f \in R^3.

Now consider three coordinate frames C1 , C2 , and C3 . The position and


orientation of C3 relative to C1 is given by [50]

p13 = p12 + R12 p23 (3.1)

R13 = R12 R23 (3.2)

Then translational velocity of C3 relative to C1 is obtained by

v13 = ṗ13 = ṗ12 + Ṙ12 p23 + R12 ṗ23 (3.3)

which is

v13 = v12 − S(R12 p23 )w12 + R12 v23 (3.4)

To see this, we observe that

\dot{R}_{12} p_{23} = S(w_{12}) R_{12} p_{23}
= (R_{12} R_{12}^T) S(w_{12}) R_{12} p_{23}
= R_{12} S(R_{12}^T w_{12}) p_{23}
= R_{12} (R_{12}^T w_{12}) \times p_{23}
= R_{12} (-p_{23}) \times (R_{12}^T w_{12})
= -R_{12} S(p_{23}) R_{12}^T w_{12}
= -S(R_{12} p_{23}) w_{12}
By differentiating both sides of equation 3.2, we also obtain rotational
velocity of C3 relative to C1 :

Ṙ13 = Ṙ12 R23 + R12 Ṙ23 (3.5)

S(w13 )R13 = S(w12 )R12 R23 + R12 S(w23 )R23 (3.6)

S(w13 )R13 = S(w12 )R13 + S(R12 w23 )R13 (3.7)

w13 = w12 + R12 w23 (3.8)

by the transformation

R_{12} S(w_{23}) R_{23} = R_{12} S(w_{23}) (R_{12}^T R_{12}) R_{23} = S(R_{12} w_{23}) R_{13}
Then the generalized velocity of C3 relative to C1 is given in matrix form by

\begin{bmatrix} v_{13} \\ w_{13} \end{bmatrix} = \begin{bmatrix} I & -S(R_{12} p_{23}) \\ 0 & I \end{bmatrix} \begin{bmatrix} v_{12} \\ w_{12} \end{bmatrix} + \begin{bmatrix} R_{12} & 0 \\ 0 & R_{12} \end{bmatrix} \begin{bmatrix} v_{23} \\ w_{23} \end{bmatrix}    (3.9)

Figure 3.1: Coordinate frames defined for the legged robot. The coordinate frame
Cci is assigned such that the unit vector ẑ is normal to the contact surface at the
point of contact.

In Fig. 3.1 the coordinate frames Cw , CB , Cbi , Cti , and Cci denote

respectively the inertial base frame, the body coordinate frame attached to

the center of mass of the body, the leg base frame of leg i, the leg tip frame

of leg i, and the local frame at the contact point of leg i. For the relations of

these coordinate frames we know that ptc = 0, and Cc and Cb are fixed with

respect to Cw and CB , respectively (vwc = wwc = vBb = wBb = 0). Besides,


according to equation 3.9 the following relations exist:

\begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix} = \begin{bmatrix} I & -S(R_{bt} p_{tc}) \\ 0 & I \end{bmatrix} \begin{bmatrix} v_{bt} \\ w_{bt} \end{bmatrix} + \begin{bmatrix} R_{bt} & 0 \\ 0 & R_{bt} \end{bmatrix} \begin{bmatrix} v_{tc} \\ w_{tc} \end{bmatrix} = \begin{bmatrix} v_{bt} \\ w_{bt} \end{bmatrix} + \begin{bmatrix} R_{bt} & 0 \\ 0 & R_{bt} \end{bmatrix} \begin{bmatrix} v_{tc} \\ w_{tc} \end{bmatrix}    (3.10)

\begin{bmatrix} v_{Bc} \\ w_{Bc} \end{bmatrix} = \begin{bmatrix} R_{Bb} & 0 \\ 0 & R_{Bb} \end{bmatrix} \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.11)

\begin{bmatrix} v_{wc} \\ w_{wc} \end{bmatrix} = \begin{bmatrix} I & -S(R_{wB} p_{Bc}) \\ 0 & I \end{bmatrix} \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} + \begin{bmatrix} R_{wB} & 0 \\ 0 & R_{wB} \end{bmatrix} \begin{bmatrix} v_{Bc} \\ w_{Bc} \end{bmatrix} = 0    (3.12)

-\begin{bmatrix} I & -S(R_{wB} p_{Bc}) \\ 0 & I \end{bmatrix} \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} R_{wb} & 0 \\ 0 & R_{wb} \end{bmatrix} \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.13)

-\begin{bmatrix} R_{wb}^T & -R_{wb}^T S(R_{wB} p_{Bc}) \\ 0 & R_{wb}^T \end{bmatrix} \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.14)

-T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.15)

where T denotes the transformation matrix appearing in equation 3.14. Moreover, the velocity of the leg tip frame, Ct, is related to the velocity of the leg joints, \dot{q}, by the leg Jacobian,

\begin{bmatrix} v_{bt} \\ w_{bt} \end{bmatrix} = J(q) \dot{q}    (3.16)
In this analysis, we consider the following contact models for the leg tip-

terrain interactions: a) a point contact without friction, b) a point contact

with friction, c) a soft contact, d) a rigid contact. These contact models give

rise to contact constraints specified by

• vzi = 0, for a point contact without friction.

• vxi = vyi = vzi = 0, for a point contact with friction.

• vxi = vyi = vzi = 0 and wzi = 0, for a soft contact.

• vxi = vyi = vzi = 0 and wxi = wyi = wzi = 0, for a rigid contact.

For each of the contact models, substituting the above contact constraints and equation 3.16 into equation 3.10 we have

B^T \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix} = B^T J(q) \dot{q}    (3.17)

where B is the basis matrix defined in [49] representing the model contact constraints. For example, for a point contact with friction

B^T = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix}    (3.18)
Substituting equation 3.17 into equation 3.15 we have

-B^T T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = B^T J(q) \dot{q}    (3.19)

-G^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = J_{leg}(q) \dot{q}    (3.20)
Dual to generalized velocity, a generalized force (or wrench) can be writ-

ten as

F_{13} = \begin{bmatrix} f_{13} \\ \tau_{13} \end{bmatrix}    (3.21)
where τ13 ∈ R3 and f13 ∈ R3 are the torque and the linear force about the

origin of C3 relative to coordinate frame C1 , respectively.

Generalized force can be defined by examining the work produced by a


virtual displacement. A virtual displacement is an instantaneous infinitesimal
displacement du. The work produced by a virtual displacement, virtual work,

is denoted by δW , where δW = F · du. We use the principle of virtual work


to find generalized force relations. The work performed, which has units of

energy, must be the same regardless of the coordinate system within which
it is measured or expressed [45]. The virtual work done by an infinitesimal

displacement of the body with respect to Cw is

\delta W = \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} \cdot \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix}
where we have represented the dot product in the virtual work equation using

the transpose operation. Alternatively, the virtual work done by the corre-

sponding infinitesimal displacement of the Cc with respect to Cb is

\delta W = \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix} \cdot \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix} = \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix}^T \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}
By the principle of virtual work, these two formulations of the work performed are equal:

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} = \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix}^T \begin{bmatrix} v_{bc} \\ w_{bc} \end{bmatrix}    (3.22)

and substituting equation 3.15 into 3.22 we have

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}^T = \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix}^T (-T)    (3.23)

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} = (-T^T) \begin{bmatrix} f_{bc} \\ \tau_{bc} \end{bmatrix}    (3.24)
For a given contact model, let ni denote the total number of independent

contact wrenches that leg i can apply to the terrain. For example, ni = 1 for a

point contact without friction (i.e., a force in the normal direction), and ni = 3

for a point contact with friction (i.e., a force in the normal direction plus two
components of frictional forces). Note that ni is just the number of contact

constraints corresponding to the contact model. According to equation 3.24


the resulting generalized force from applied contact force of the leg i can be

expressed as

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} = -T^T B x_i    (3.25)

\begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} = -G x_i    (3.26)

where x_i ∈ R^{n_i} is the magnitude vector of applied contact forces (generalized) along the basis directions of B. Equations 3.20 and 3.26 provide valid relations
if the leg remains in contact with the surface and there is no slipping. A com-
mon way to guarantee no slipping is to ensure that the contact forces lie within

the friction cone at the point of contact-that is, the tangential component of
the contact force is less than or equal to the coefficient of friction µ times the

normal component of the contact force.

Finally, for n supporting legs (i = 1, \cdots, n) we define

Q = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix}, \quad F_T = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad J_T = \begin{bmatrix} J_{leg1} & J_{leg2} & \cdots & J_{legn} \end{bmatrix}, \quad G_T = \begin{bmatrix} G_1 & G_2 & \cdots & G_n \end{bmatrix}

Then equations 3.20 and 3.26 can be concatenated for i = 1, \cdots, n to give

J_T(Q) \dot{Q} = -G_T^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix}    (3.27)

-G_T F_T = \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}    (3.28)

We have derived the force, torque, and velocity relations from legs to leg-tips and leg-tips to body.

3.1.2 Coordinated Control of Legged Robots

In this section, we develop the control algorithm for the coordinated


control of the robot legs. The goal of the control scheme is to specify a set of

control inputs for the leg motors so that the body undergoes a desired motion.

The control scheme we develop in this section is based on the computed torque
methodology.

By differentiating equation 3.27 we have

J_T(Q) \ddot{Q} + \dot{J}_T(Q) \dot{Q} = -\dot{G}_T^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} - G_T^T \begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix}    (3.29)

\ddot{Q} = J_T^+(Q) \left( -\dot{G}_T^T \begin{bmatrix} v_{wB} \\ w_{wB} \end{bmatrix} - G_T^T \begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} - \dot{J}_T(Q) \dot{Q} \right) + \ddot{Q}_o    (3.30)

Here J_T^+(Q) is the pseudo-inverse satisfying J^+ = J^T (J J^T)^{-1}, and \ddot{Q}_o \in N(J_T) is the internal motion of redundant joints not affecting the body motion.

The dynamics of the body expressed in the inertial base frame Cw is given by the Newton-Euler equation as [51]

\begin{bmatrix} I_m & 0 \\ 0 & I_w \end{bmatrix} \begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} + \begin{bmatrix} 0 \\ w_{wB} \times I_w w_{wB} \end{bmatrix} = \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix}    (3.31)

Here

I_m = \begin{bmatrix} m_B & 0 & 0 \\ 0 & m_B & 0 \\ 0 & 0 & m_B \end{bmatrix}

where m_B is the body mass, I_w = R_{wB} I_o R_{wB}^T is the body inertia matrix expressed in Cw, and I_o is the body inertia matrix expressed in CB. Also from equation 3.28 we have

F_T = -G_T^+ \begin{bmatrix} f_{wB} \\ \tau_{wB} \end{bmatrix} + F_{To}    (3.32)

where G_T^+ is the pseudo-inverse of G_T and F_{To} is the internal leg force not affecting the body motion. Combining equation 3.31 and equation 3.32 yields

F_T = -G_T^+ \left( \begin{bmatrix} I_m & 0 \\ 0 & I_w \end{bmatrix} \begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} + \begin{bmatrix} 0 \\ w_{wB} \times I_w w_{wB} \end{bmatrix} \right) + F_{To}    (3.33)
In order to specify an orientation trajectory in terms of the rotation

matrix RwB (t) we parameterize SO(3) so that RwB = RwB (Υ) where Υ ∈ R3

is taken as yaw α(t), pitch β(t), and roll γ(t) coordinates of the body. Given

this parametrization, there exists a linear transformation p(\Upsilon) such that [42]:

w = \begin{bmatrix} w_x \\ w_y \\ w_z \end{bmatrix} = \begin{bmatrix} c\gamma c\beta & -s\gamma & 0 \\ s\gamma s\beta & c\gamma & 0 \\ -s\beta & 0 & 1 \end{bmatrix} \begin{bmatrix} \dot{\alpha} \\ \dot{\beta} \\ \dot{\gamma} \end{bmatrix} = p(\Upsilon) \dot{\Upsilon}    (3.34)

where

\Upsilon = \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix}

So the acceleration of the body is given as

\begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} = P \begin{bmatrix} \ddot{p}_{wB} \\ \ddot{\Upsilon}_{wB} \end{bmatrix} + \dot{P} \begin{bmatrix} \dot{p}_{wB} \\ \dot{\Upsilon}_{wB} \end{bmatrix}    (3.35)

where

P = \begin{bmatrix} I & 0 \\ 0 & p(\Upsilon) \end{bmatrix}
We define the position error ep ∈ R6 of the body in a given desired
trajectory as

e_p = \begin{bmatrix} p_{wB}^d \\ \Upsilon_{wB}^d \end{bmatrix} - \begin{bmatrix} p_{wB} \\ \Upsilon_{wB} \end{bmatrix}

where

\begin{bmatrix} p_{wB}^d(t) \\ \Upsilon_{wB}^d(t) \end{bmatrix}

is the desired body trajectory.

In order to reduce the position error, we apply joint torques of the legs to make the acceleration of the body satisfy the equation

\begin{bmatrix} \dot{v}_{wB} \\ \dot{w}_{wB} \end{bmatrix} = P \left( \begin{bmatrix} \ddot{p}_{wB}^d \\ \ddot{\Upsilon}_{wB}^d \end{bmatrix} + k_v \dot{e}_p + k_p e_p \right) + \dot{P} \begin{bmatrix} \dot{p}_{wB} \\ \dot{\Upsilon}_{wB} \end{bmatrix}    (3.36)
where kv and kp are scalars chosen such that the characteristic roots of

ëp + kv ėp + kp ep = 0 have negative real parts.

The dynamics of the ith leg manipulator with l links is given by

H_i(q) \ddot{q} + C_i(q, \dot{q}) \dot{q} = \tau_i - J^T(q) B x_i

where

\tau = l × 1 vector of joint torques,
q, \dot{q}, \ddot{q} = l × 1 vectors of joint positions, velocities, and accelerations,
H(q) = l × l joint space inertia matrix, both symmetric and positive definite,
C(q, \dot{q}) = l × l matrix of Coriolis and centripetal force terms.
We define

H = \begin{bmatrix} H_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & H_n \end{bmatrix}, \quad C = \begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_n \end{bmatrix}, \quad \tau = \begin{bmatrix} \tau_1 \\ \tau_2 \\ \vdots \\ \tau_n \end{bmatrix}

Then the leg dynamics can be grouped for i = 1, \cdots, n to yield

H(Q) \ddot{Q} + C(Q, \dot{Q}) \dot{Q} = \tau - J_T^T(Q) F_T    (3.37)

Thus the resultant control law is specified by substituting equations 3.30, 3.33, 3.36, and 3.37:

\tau = D P \left( \begin{bmatrix} \ddot{p}_{wB}^d \\ \ddot{\Upsilon}_{wB}^d \end{bmatrix} + k_v \dot{e}_p + k_p e_p \right) + D \dot{P} \begin{bmatrix} \dot{p}_{wB} \\ \dot{\Upsilon}_{wB} \end{bmatrix} + E    (3.38)

where

D = -H J_T^+ G_T^T - J_T^T G_T^+ \begin{bmatrix} I_m & 0 \\ 0 & I_w \end{bmatrix}

and

E = -H J_T^+ \dot{G}_T^T P \begin{bmatrix} \dot{p}_{wB} \\ \dot{\Upsilon}_{wB} \end{bmatrix} - H J_T^+ \dot{J}_T \dot{Q} + C \dot{Q} - J_T^T G_T^+ \begin{bmatrix} 0 \\ w_{wB} \times I_w w_{wB} \end{bmatrix}

All the terms in equation 3.38 are functions of state variables Q, Q̇, pwB ,

ΥwB , vwB , and Υ̇wB .
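For readability, the control law of equation 3.38 can be assembled numerically as in the following sketch. Every argument is a hypothetical, consistently dimensioned quantity evaluated at the current state; the function is illustrative, not the controller implemented in this thesis.

```python
import numpy as np

def computed_torque(H, C, Qdot, J_T, Jdot_T, G_T, Gdot_T, I_m, I_w, w_wB,
                    P, Pdot, rate6, acc_des6, e_p, edot_p, kv, kp):
    """Evaluate tau of eq. 3.38 from the quantities defined in section 3.1."""
    Jp = np.linalg.pinv(J_T)                         # J_T^+
    Gp = np.linalg.pinv(G_T)                         # G_T^+
    M = np.block([[I_m, np.zeros((3, 3))],
                  [np.zeros((3, 3)), I_w]])          # body inertia of eq. 3.31
    bias = np.concatenate([np.zeros(3), np.cross(w_wB, I_w @ w_wB)])
    D = -H @ Jp @ G_T.T - J_T.T @ Gp @ M
    E = (-H @ Jp @ Gdot_T.T @ P @ rate6
         - H @ Jp @ Jdot_T @ Qdot
         + C @ Qdot
         - J_T.T @ Gp @ bias)
    # desired body acceleration plus PD correction of the tracking error (eq. 3.36)
    return D @ (P @ (acc_des6 + kv * edot_p + kp * e_p) + Pdot @ rate6) + E
```

Here rate6 stands for the stacked vector of \dot{p}_{wB} and \dot{\Upsilon}_{wB}; the pseudo-inverses correspond to J_T^+ and G_T^+ above.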

3.2 Gait Controller

Our gait controller is based on a gait synthesizer which is adapted from the Generalized Approximate Reasoning-based Intelligent Control (GARIC) architecture [12] to our objective. GARIC presents a method for learning and tuning fuzzy logic controllers (FLC) through reinforcement signals. The gait

synthesizer (Fig. 3.2) consists of three modules. The Gait Evaluation Module (GEM) acts as a critic and provides advice to the main controller, based on a multilayer artificial neural network. The Gait Selection Module (GSM) decides on a new gait to be undertaken by the robot according to an ANN representation of a fuzzy controller with as many hidden units as there are rules in the knowledge base. The Gait Modifier Module (GMM) changes the gait recommended by the GSM based on internal reinforcement. This change in the recommended gait is more significant for a state if that state does not receive high internal reinforcements (i.e., the probability of failure is high). On


the other hand, if a state receives high reinforcements, GMM administers small

changes to the action selected by the fuzzy controller embedded in the GSM.

This reveals that the action is performing well so that the GMM recommen-
dation dictates no or only minor changes to that gait. The actions for the gait

synthesizer are the gaits recommending an operation mode (defined in section


3.3.1) for each leg.

3.2.1 Encoding the Gaits for a Multilegged Robot

Our gait synthesizer works on gait patterns that need to be coded.

Gait patterns are patterns of leg coordination which represents relative phases

50
Figure 3.2: Architecture of Gait Synthesizer.

(swing phase or stance phase) of legs. For legged robots using static balance,
the typical feature of these gait patterns is that in any phase of the pattern
the robot ensures static stability. In the gait synthesizer we work on wave gait

patterns which are observed in insect walking. As stated in chapter 2 these


gaits consist of metachronal waves on both sides of the robot and differ from each other in the amount of overlapping. So different wave gait patterns can

be derived by changing this amount. Among numerous gait patterns we choose

the ones including groups of legs which are in phase. For instance, the tripod

gait which is special in these patterns (an alternation between right-sided and
left-sided metachronal waves) naturally have two group of legs involving three

legs in phase.

In the encoding of the gaits our goal is to find a modelling method for

Figure 3.3: Summary of terminology used in gait analysis.

all gait patterns from which a leg task can be obtained. In other words for a

given state (which at least includes the phase, position and velocity of each leg

for proprioceptive level of control) we want to find both which gait pattern the current state belongs to and which phase of that pattern it is in. We make use of the position information of the legs to recognize the gait patterns.

In the encoding process we divide the stroke distance (Fig. 3.3) of a leg into overlapping grids for both swing and stance phases as in Fig. 3.5. Here the linguistic values {A, B, . . . , L, M} are "author-defined" fuzzy partitions of the stroke distance with triangular member-

ship functions. The tripod gait of Fig. 3.4E can now be coded

with the sequence: (F, A, F, A, F, A) →(G, B, G, B, G, B) → ... →

(E, J, E, J, E, J)→ (F, A, F, A, F, A) → . . ., or the gait pattern in Fig.


3.4D with the sequence: (K, C, A, C, A, K) → (L, D, B, D, B, L) →

{(M, E, C, E, C, M )or(A, K, C, K, C, A)} → (B, L, D, L, D, B) →

. . . → (D, B, L, B, L, D) → {(E, C, M, C, M, E)or(K, C, A, C, A, K)} →

(L, D, B, D, B, L) → . . .. In all gait-sequence-encoding the fraction of cy-

Figure 3.4: Wave gait patterns. Bold lines represent swing phase. L1 signifies the
left front leg and R3 indicates the right hind leg [7].

cle periods for stance and swing must be incorporated in the model. As in Fig.

3.4, in the tripod gait, a stance phase is half of a whole leg cycle whereas in tetrapod gaits it is two thirds of the leg cycle (the so-called duty factor
described in chapter 2). Leg sequences defining gait patterns have to be also

modelled by leg cycles. For a portion of a cycle, a leg is either in stance, swing

or in transition (end of swing or end of stance). Thus we construct rules as:


if leg R3 is in E, and, R2 in C, R1 in M , L3 in C, L2 in M , L1 in E, then

R3 is in transition, R2 in stance, R1 in transition, L3 in stance, L2 in transi-

tion, and L1 in transition. Here being in A, for example, means that the leg is in stance in the current state and has partial belonging to the fuzzy linguistic value A, whereas the consequent (or then part) of the rule prescribes the legs' next "state". With the given partitioning, 10 rules cover a tripod gait pattern and

9 for tetrapod gait patterns. The significance of this fuzzy modelling is that

Figure 3.5: Antecedent Labels, fuzzification of individual leg position.

individual leg phases are found from a gait pattern cycle which is determined

from relative positions of the legs.

For uneven terrain conditions, we define four ”operation modes” of a leg:

1. First mode labelled as -2: The leg is responsible for supporting the body.

2. Second mode labelled as -1: The leg switches to the third mode provided that the legs in the first mode alone provide static stability; otherwise the leg participates in the supporting legs. These legs are candidates for

swing phase among stance legs.

3. Third mode labelled as 2: The leg is responsible for full recovery, such that if it encounters an obstacle it will try to handle it.

4. Fourth mode labelled as 1: The leg tip will descend to the ground until

the tip touches the terrain and switches to the first mode.

In both mode labels 2 and 1 (modes will be mentioned with their labels from

now on), the leg will go on recovery if it is within the limits of its operation

space. These four modes constitute leg states from control point of view that

we need to distinguish for a leg within the cooperative action of walking. At

Anterior Extreme Point (AEP), mode 2 automatically switches to mode 1.

Furthermore, the binary data from static stability check for mode -2 legs and

tip contact (a protracting leg switches to retraction when it finds a foothold


that it can safely support the body) clearly determine the switching from mode

-1 to mode 2 and from mode 1 to mode -2, respectively.

Beside the leg/leg coordination, leg/body coordination is required for a


regular gait. Movement of each leg can be characterized by a position p ∈ R
and a velocity ṗ ∈ {vstance (vst ), vswing (vsw )} according to direction of body mo-

tion in leg centered coordinates. When a leg is in protraction, it is lifted from

the ground and swings forward relative to the body with a constant velocity
vsw > 0. When a leg is in retraction, it is on the ground, providing support and

swinging backward relative to the body with a velocity vst < 0 (for straight
line walking this velocity is equal to minus body velocity with respect to the

ground, vB ). As in many walking animals, vsw is relatively constant while

vst varies according to walking speed. In other words considering Fig 3.4, the

body or retraction velocity is a fraction of protraction velocity and the fraction

is directly proportional to number of support legs over number of swing legs.

For instance in tripod gait this ratio is one and the velocities are equal. So in
our controller the body velocity for a time step is taken as:

v_B(t) = \left( \sum v_{st}(t-1) \cdot \Delta t \right) / (nost)    (3.39)
where nost is the number of stance legs. There are two parameters to be con-

sidered concerning velocity: static stability margin and kinematic margin of

stance legs (Fig. 3.3). The minimum of these margins (let us call critical mar-

gin, Cm) determines the distance that the robot can travel without violating

a physical constraint. So additionally vB is set to zero when Cm is zero in


speed control.
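A sketch of the speed policy implied by equation 3.39 together with the critical-margin check (variable names are illustrative, not taken from the thesis code):

```python
def body_speed(stance_tip_speeds, dt, critical_margin):
    """Body speed for one time step (eq. 3.39), forced to zero when the
    critical margin Cm (min of stability and kinematic margins) is exhausted."""
    nost = len(stance_tip_speeds)
    if nost == 0 or critical_margin <= 0.0:
        return 0.0
    return sum(v * dt for v in stance_tip_speeds) / nost
```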

Figure 3.6: Consequent Labels: task share based on operation modes.

For the gait synthesizer, a gait is the ”task sharing” of the legs accom-

plishing a coordinated body movement. For instance, if a leg is on AEP, it

is clear that it can only be used for stance (no share for swing) such that if

it is presently in swing phase it must take a transition to stance. However in

uneven terrain conditions where there is no fixed leg cycle for individual legs,
it is difficult to assign in a deterministic way a leg share within the limits.

In our controller, we introduce a linguistic variable task share, Mleg (t), taking
linguistic values {Stance (St), Swing (Sw), Transition (Tr)} with triangular

membership functions shown in Fig. 3.6. The values (−2, −1, 0, 1, 2) are cho-

sen according to labels of operation modes which consider cyclic behavior of


the legs. By changing the overlapping areas and phase difference of the left-
and right-sided metachronal waves we form 9 tetrapod gaits. According to

the method mentioned above, we construct 91 (9 × 9 + 10) rules for all gaits
belonging to the wave gait class. With the membership functions in Figs. 3.5,

3.6 we constitute the fuzzy rules for the rule base of the GSM of the gait syn-
thesizer where triggered rules recommend a value for task share of each leg.

3.2.2 Gait Selection Module (GSM)

GSM determines the recommended task share for each leg, Mleg (t), in a
fuzzy decision process where inferencing is done based on the fuzzy rule base.

Mleg values define a measure to distinguish the two switching points between modes -2 and -1 and modes 2 and 1 during walking. Two thresholds T2,1 and T−2,−1 determine the mode of the legs. For the legs in stance, legs with Mleg(t) < T−2,−1 are determined as mode -2 legs and legs with Mleg(t) ≥ T−2,−1 as mode -1. Likewise, for the legs in swing, legs with Mleg(t) > T2,1 are determined as mode 2 legs and those with Mleg(t) ≤ T2,1 as mode 1. The effect of these threshold

values on the decision process is analyzed in simulation.
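The clustering into operation modes can be written as a simple threshold test; the function below is a sketch, and the default thresholds are the near-optimum values reported later in chapter 5 (T2,1 = 0.1, T−2,−1 = −0.1), used here purely for illustration.

```python
def assign_mode(m_leg, in_stance, t_21=0.1, t_m21=-0.1):
    """Map a task-share value M_leg(t) to one of the four operation modes."""
    if in_stance:
        return -2 if m_leg < t_m21 else -1   # full support vs. swing candidate
    return 2 if m_leg > t_21 else 1          # full recovery vs. descend to ground
```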

As shown in Fig. 3.7, GSM is a fuzzy logic controller represented as a five-

layer feedforward network with each layer performing one stage of the fuzzy
inference process. GSM takes the current legs’ positions and phases (swing

Figure 3.7: Gait Selection Module

or stance) as input. The nodes in the second layer correspond individually


to possible values of each linguistic variables of the inputs (Fig. 3.5) with

triangular membership functions, µV (x), where the input linguistic value V =


(c, sL , sR ) is represented by c, sL , sR corresponding respectively to the center,

left spread, and right spread of the triangular membership function µV . Each

node in this layer feeds the rules using the linguistic value in their antecedent

parts (”if” part). The conjunction of all the antecedent conditions in a rule is

calculated in the third layer. The output of the layer is the firing strength of

the rules which is calculated by softmin operation described in section 2.5.2.

The nodes in the fourth layer correspond to consequent labels (Fig. 3.6).

Their inputs come from all the rules which use this particular consequent label.

For each input supplied by a rule, nodes compute the corresponding output
suggested by that rule by a defuzzification procedure \mu_{Y_{leg}}^{-1}(w_r) = c + 0.5(s_R - s_L)(1 - w_r), where Y_{leg} = (c, s_L, s_R) indicates a consequent linguistic value of
a leg. In the last layer there are six output nodes, one for each leg, which compute

Mleg (t) by combining the recommendations from all the fuzzy control rules in

the rule base, using weighted sum in which the weights are the rule strengths:

M_{leg} = \left( \sum_r w_r \mu^{-1}(w_r) \right) / \sum_r w_r    (3.40)
The goal of calculating Mleg values in the GSM is to maximize the evaluation

of the gait, v, determined by GEM where, within its learning process the vector

of all parameters of Yleg (centers and spreads) are adjusted; that is,

\Delta p_Y \propto \frac{\delta v}{\delta p_Y}    (3.41)
where pY is the vector of Yleg = (c, sL , sR ). But, there is no explicit gradi-

ent information provided by the reinforcement signal and the gradient δv/δp
can only be estimated. To estimate the gradient information in reinforcement
learning, there needs to be some randomness in how output gaits are chosen

by GSM so that the range of possible outputs can be explored to find a correct

value. This is provided by the stochastic exploration in Gait Modifier Module

(GMM).

3.2.3 Gait Evaluation Module (GEM)

GEM is a standard two-layer feedforward neural network, which takes

the state of the system as input. The state data includes leg-tip positions,
and velocities in leg centered coordinate systems and legs’ operation mode

(-2,-1,1,2). To assign credit to the individual actions of the action sequence


preceding a reinforcement signal, an evaluation function of the states is learned.

The output is an evaluation of the state denoted by v. Changes in v due to

state transitions are further combined with a reinforcement signal to produce


an internal reinforcement r̂:

r̂(t) = r(t) + γv(t) − v(t − 1) (3.42)

where 0 ≤ γ ≤ 1 is the discount rate. The internal reinforcement plays the role
of an error measure in the learning of the GEM. If r̂ is positive, the weights of

the network are altered through the backpropagation algorithm so as to increase the output v for positive input, and vice versa. The main reinforcement signal is obtained from the critical margin (Cm) and vB. If Cm = 0 or vB = 0 (vB may be zero if there are no legs in swing), a reinforcement signal r(t) = −1 is re-


turned. Otherwise a value is returned according to design goal. This value
can be simply r(t) = 0 or can be a real number to represent a more detailed

and continuous degree of success. Different reinforcement signals are tested in

simulations in order to optimize speed and mobility.
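As a compact restatement (a sketch with illustrative names, not the GEM network itself), the internal reinforcement of eq. 3.42 and one possible shaping of the external signal, in the spirit of the forms tested in chapter 5, look as follows:

```python
def internal_reinforcement(r_t, v_t, v_prev, gamma=0.9):
    """Internal reinforcement of eq. 3.42."""
    return r_t + gamma * v_t - v_prev

def external_reinforcement(cm, v_body, rho, swing_leg_on_aep=False):
    """One possible shaping of r(t): punish loss of mobility, otherwise reward
    speed (r = v_wBx / rho); a stability-oriented variant would use Cm(t)/Cm_max."""
    if (cm <= 0.0 or v_body <= 0.0) and not swing_leg_on_aep:
        return -1.0
    return v_body / rho
```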

3.2.4 Gait Modifier Module (GMM)

One of the features of the Gait Synthesizer architecture is to modify the


output of the GSM according to internal reinforcement from previous time

steps. GMM creates a Gaussian random distribution with a mean which is set

as the recommended Mleg value, and with a standard deviation αexp(−r̂(t−1)),

a non negative, monotonically decreasing function with a scale factor α, where

α ∈ R+ . When r̂(t − 1) is low, meaning the last action performed is bad,


the deviation is large, whereas the controller remains consistent with the fuzzy

control rules when r̂(t − 1) is high. This deviation provides adaptation to

current conditions or to solving a sudden problem of leg entrapment. Also

the exploration of the state space increases the system's experience, which is
provided by the learning in the GEM and GSM. The gradient information
∆pY = δv/δpY , which is within GSM, is estimated by stochastic exploration

in the GMM. The modification implemented in t − 1 by GMM is judged by


r̂(t). If r̂ > 0, meaning the modified M (t − 1) is better than expected, then
M(t − 1) is moved closer to the modified one, and vice versa. That is,

\frac{\delta v}{\delta p_Y} \approx \hat{r}(t) \left[ \frac{M_{mod}(t-1) - M_{rec}(t-1)}{\alpha \exp(-\hat{r}(t-1))} \right]    (3.43)
where Mrec denotes the M value recommended by GSM, and Mmod denotes

the M value modified by stochastic perturbation in GMM.

Due to changes in Mleg values, four different transitions may occur: If a leg's state is -2 (-1) and Mleg(t) > T−2,−1 (Mleg(t) < T−2,−1), then the leg becomes -1 (-2). If a leg's state is 2 (1) and Mleg(t) < T2,1 (Mleg(t) > T2,1), then the leg becomes 1 (2).

So stochastic exploration on Mleg values which does not result in a modified transition has no contribution to learning. We can define the minimum

deviation, ∆dm , as the minimum perturbation added by GMM required to


change the state of a leg. \Delta d_m(t) values can be given as

\Delta d_m(t) = \begin{cases} |M_{rec}(t) - T_{2,1}| & \text{if the leg is in state 2 or 1} \\ |M_{rec}(t) - T_{-2,-1}| & \text{if the leg is in state -2 or -1} \end{cases}
So the modification of an M (t) depends on the deviation function and

the ∆dm (t). The effect of the values α, T2,1 , and T−2,−1 will be analyzed in

simulations.
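A sketch of the GMM perturbation and the gradient estimate of eq. 3.43 (names are illustrative; the default α = 0.15 is the near-optimum value found in chapter 5):

```python
import numpy as np

def modify_task_share(m_rec, r_hat_prev, alpha=0.15):
    """Perturb the recommended task share with a Gaussian whose spread
    shrinks as the previous internal reinforcement improves."""
    sigma = alpha * np.exp(-r_hat_prev)
    return float(np.random.normal(m_rec, sigma)), sigma

def grad_estimate(r_hat_t, m_mod_prev, m_rec_prev, sigma_prev):
    """Stochastic estimate of dv/dp_Y used to tune the consequent labels (eq. 3.43)."""
    return r_hat_t * (m_mod_prev - m_rec_prev) / sigma_prev
```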

3.2.5 The Complete Control Cycle

The control cycle in Fig. 3.8 is executed in each time step. Firstly, re-
inforcement signal and legs’ states are taken by the gait synthesizer and the
legs are clustered to suitable operation modes according to their calculated M

values. Then, further modifications depending on physical checks are applied


and the resultant operation modes which will be valid for the rest of control
layers are obtained until the next cycle. In the figure we only consider velocity

controller, but different control modules (such as navigation, terrain adapta-

tion) can be implemented. Lastly, the desired velocities of the legs and the body are calculated and applied by the robot.

Figure 3.8: Complete Control Cycle

CHAPTER 4

HEXAPOD ROBOT SIMULATION

We develop the hexapod robot shown in Fig. 4.1 to be used in our sim-

ulations. Our simulation program consists of two subprograms. The first one

constitutes the main body (main program) which includes the controller ar-
chitecture and the hexapod model. All simulation tests and training sessions

are implemented in this subprogram which is written in Matlab 6.5. The sec-
ond subprogram is responsible for visualization (rendering) of the simulation
results. The main program saves the state data of the hexapod for each time cycle to a file named simvars.bsd. The state data are fed as an input to the rendering program. The rendering program is written in Borland C Builder with OpenGL as a graphics tool. The reason for using two separate programs in the simulation is to decrease the computation time spent in the tests of the

the simulation is to decrease the computation time spent in the tests of the
hexapod. The source code of the programs and simulation results can be found

in the CD attached to this thesis as an appendix.

4.1 Hexapod Model

The simulations are implemented in a kinematic model. Such kinematic

models are commonly implemented in gait analysis [24], [34] and gait control

[30], [28] for simulation purposes. A simplified model of the hexapod robot

considered in this thesis is shown in Fig. 4.2. Each leg is identical and com-
posed of three rigid links (Fig. 4.3). All the links are connected to each other

via a revolute joint. Hence the foot point or the leg tip has three degrees of

Figure 4.1: The hexapod robot used in simulation.

freedom with respect to the body. The legs are represented by labels R1, R2,

R3, and L1, L2, L3. Here, for example, L1 signifies the left front leg and R3

indicates the right hind leg.

The body coordinate (CB ) is attached on the hexapod body with the
origin at the center of gravity while leg base coordinates (Cb ) are attached

on the bases of the legs (Fig. 4.2). Cw is the inertial base frame. Dashed

rectangles represent working spaces for the legs (pbtipx ∈ [−Sd/2, Sd/2] and
pbtipz ∈ [−Rz/2, Rz/2]). The joint angles are calculated by inverse kinematics

[45] given a desired position and orientation. The dimensions of the links and

the body level from the ground are assigned such that the leg tips can reach

all points in their working spaces (existence of solution of inverse kinematics)

and there exists only one joint angle vector (uniqueness of solution of inverse

kinematics). Fig. 4.4 shows two postures of the hexapod model. As can be
seen, the hexapod body in Fig. 4.4B is lower compared to the one in Fig. 4.4A

Figure 4.2: Hexapod model

in order to increase the reachable space of the legs. The hexapod in Fig. 4.4B

is especially used in uneven terrain simulations where some legs fall into holes

on the terrain. Also notice that reachable space by the legs do not overlap

(Fig. 4.2).

4.2 Sensor System

As indicated, joint angles of the legs are calculated by inverse kinematics

from given leg tip trajectories. In real robots these angles are measured by Joint

Angle Sensors [42]. These are potentiometers that measure the joint angle for
each DOF of the leg. In our simulation these angles are used in the rendering

Figure 4.3: Each leg is identical and composed of three links. Pink legs are in swing
phase whereas blue ones are in stance.

program. In the gait synthesizer (so in the main program), leg tip coordinates

and velocities in their own coordinate systems are used.

The leg tip-terrain interactions are determined by modelling ground con-


tact sensors. In real robots, these are linear potentiometers on the tip of all
legs that measure the deflection of the foot as it presses against the ground.

In our experiments, this is an on-off sensor with output of ’1’ when contact
occurs, and ’0’ for noncontact.

In real robots, several additional sensors are used such as inclinometer

which senses the body orientation with respect to the direction of gravity. In

our simulations, we implement straight line walking in x-direction (Fig. 4.2).

So body orientation does not change. Also we did not need to model sensors

for terrain sensing (such as optical sensors), because the gait synthesizer is
capable of making its decisions without explicitly needing such data since it

develops gradually an internal world of the environment for gait adaptation.

Figure 4.4: Two different postures of the robot. Body level of the robot in B is
lowered in order to increase the reachable space of the legs.

4.3 Kinematics of the Hexapod Robot

Assumptions on kinematics and dynamics of the hexapod are given as

follows for simplicity of the analysis and are adapted from [28].

1. The contact between a foot and the ground is a point.

2. There is no slipping between a foot and the ground.

3. All the mass of the six legs is lumped into the body, and the center of

gravity is assumed to be at the centroid of the body.

4. There is no displacement in y-directions and body level (pwBz ) and ori-

entation is constant with respect to the inertial base frame.

5. The body speed with respect to inertial frame in x direction is equal to

minus leg tip speed in x direction of stance legs with respect to Cb (i.e.,
vwBx = −vbtipst x ).

In our simulations v_{btip_{sw}x} is set to a constant positive value ρ from which v_{wBx} (and so v_{btip_{st}x}) is calculated. Also |v_{btip_{sw}z}| = ϱ for swinging legs. For different states (operation modes) the velocities of the legs are calculated as follows:

v_{btip_x} = \begin{cases} \rho & \text{if the leg is in state 2 or 1 and } p_{btip_x}(t-1) < Sd/2 \text{ (AEP)} \\ 0 & \text{if the leg is in state 2 or 1 and } p_{btip_x}(t-1) > Sd/2 \text{ (AEP)} \\ \nu & \text{if the leg is in state -2 or -1} \end{cases}    (4.1)

where \nu = \left( \sum v_{btip_{sw}x}(t-1) \Delta t \right) / (nost) and nost is the number of stance legs. And

v_{btip_z} = \begin{cases} \varrho & \text{if the leg is in state 2 and } p_{btip_z}(t-1) < Rz/2 \\ -\varrho & \text{if the leg is in state 1 and } p_{btip_z}(t-1) > -Rz/2 \\ 0 & \text{otherwise} \end{cases}    (4.2)
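A sketch of equations 4.1 and 4.2 as a per-leg velocity assignment (variable names are illustrative; nu is the stance velocity computed from the swinging legs of the previous time step):

```python
def leg_tip_velocity(mode, p_x, p_z, rho, varrho, nu, sd, rz):
    """Leg-tip velocity components for one control step (eqs. 4.1 and 4.2)."""
    if mode in (2, 1):                        # swing phase
        v_x = rho if p_x < sd / 2 else 0.0    # stop protraction at the AEP
    else:                                     # stance phase (modes -2 and -1)
        v_x = nu
    if mode == 2 and p_z < rz / 2:
        v_z = varrho                          # lift the tip
    elif mode == 1 and p_z > -rz / 2:
        v_z = -varrho                         # descend until ground contact
    else:
        v_z = 0.0
    return v_x, v_z
```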

4.4 Uneven Terrain

The main challenge of the gait synthesizer is for uneven terrain locomo-

tion. The test path for uneven terrain is modelled such that a smooth surface succeeds a part with randomly placed hills and holes, some of which are

deeper than the legs can reach (Fig. 4.5). A function in the main program

named as TTerrainmaker.m (refer to the CD in the appendix) creates terrains


by randomly placing 7 different surface segments which have dimensions such

that only leg tips can collide with them. In other words the other parts of the

hexapod robot (links or the body) do not have collision with the terrain. The
tests conducted on uneven terrain, where a leg hits an obstacle (probably with
the link part of the leg) and can not go on swinging, are modelled by temporary

disabling of the corresponding leg. The effect of the disabling is same as the
obstacle collision from the gait synthesizer point of view (temporarily the leg
will not participate in the gait of the hexapod). Again such simulations can

be found in the CD and are discussed in details in chapter 5 under simulation

results.

Figure 4.5: The modelled uneven terrain. Different surface segments can be seen in
the figure. The holes on uneven terrain are modelled by surface segments which are
deeper than the legs can reach. Notice that the pink leg (swinging) falls into such a
segment.

CHAPTER 5

SIMULATION RESULTS

The hexapod robot simulation developed in chapter 4 is used to gen-

erate simulation results that clearly demonstrate the capabilities of our gait

synthesizer. The simulations are implemented in a kinematic model (chapter


4) rather than the dynamic model described in chapter 3.1. A control system

in such a dynamic model has to include many control modules besides a gait
controller, such as control algorithms related to navigation, speed, body level
(terrain adaptation), which affect only the low-level execution of a gait of the robot rather than the gait formulation level. Consequently the simulation omits these effects and analyzes just the gait synthesizer in the gait
simulation omits these effects and analyze just the gait synthesizer in the gait
control of a hexapod robot based on its kinematic model.

In the first two sections of the simulations we first analyze the control parameters and different choices of reinforcement signals that are significant
in the performance of the gait synthesizer. These tests will be implemented

on smooth terrains in order to focus on comparisons under similar environ-

mental effects. In the rest of the simulations we will show the capabilities of
the gait synthesizer for search and rescue (SAR) by testing its performance on

modelled uneven terrains expected in SAR operations and when a leg is used

as a manipulator. Before these tests are implemented, the gait synthesizer

was trained with different initial conditions and with different terrains (for the

tests applied on uneven terrain). The results presented here are chosen among
the ones which are impressive enough to clearly demonstrate the advantages

of the gait synthesizer and the potential it offers for SAR. All the results are

included in a CD which is attached to the thesis as an appendix. The reader


is referred to this CD in which the results discussed here can be examined
visually.

5.1 Exploration and Exploitation Dilemma in Reinforcement

Learning

As indicated in the Gait Modifier Module (GMM) the deviation func-

tion αexp(−r̂(t − 1)) is scaled by α, and two threshold values, T−2,−1 and T2,1

must be properly selected for the controller. The effects of these values are
tested first for a simple learning problem. The legs begin from random initial
positions (all the legs are in state −2) such that this initial configuration does

not belong to any gait pattern in the fuzzy rule base. Since the reinforcement

signals aim at optimizing the speed of the hexapod with a maximized static
stability, we expect that from this random initialization the gait will converge

to the optimum one in terms of speed which is the tripod gait. In order to test

the sensitivity of the gait synthesizer to changes in α and thresholds we test

the GSM in the same manner for each α and threshold values. Within each

training session, repeated 10 times maximum, the gait synthesizer is trained


for 2000 time steps for a given parameter set (α, T2,1 , and T−2,−1 ). We ini-

tialize the weights in learning, change the parameters and apply the training
again for a new parameter set. Fig. 5.1 shows the resultant speed vs time

graphs of the hexapod. In the first test the parameters are chosen as; α = 0.5,

T2,1 = 0.5, and T−2,−1 = −0.5. If the magnitude of the scale factor (α) is

high, we find that the exploration of different gaits is also high. In other

words the gait synthesizer tries plenty of gaits for different states, causing a
very slow learning. Fig. 5.1A shows the resultant speed vs time graph at the
10th training session. The synthesizer is found not to be able to converge to

a periodic movement or capture a gait pattern. On the other hand, when the
scale factor is too small as in a second test taking α = 0.01, T2,1 = 0.3, and
T−2,−1 = −0.3, learning is slow, exploration is low, and moreover there is

a chance of getting stuck. Fig. 5.1B (the second row) is the resultant speed

vs time at the second training session. The legs’ state vector at the end of

this training is observed as [−1, −1, −2, −1, −2, −1]. Here, because no static

stability is provided by the legs in state −2, no swinging leg exist and the body
stands in a still position. In such states (most severe being the case of the state

[−1, −1, −1, −1, −1, −1]) the synthesizer has to try different combinations of

leg states in order to continue its movement. But low scale factor tightens
the deviation from the recommended M values and recovery from the present

state is low and limited.

In the third and fourth tests (Fig. 5.1C, 5.1D) we set the scale factor to

0.15 and consider two threshold pair; T2,1 = 0.5, T−2,−1 = −0.1 (Fig. 5.1C),
and T2,1 = 0.1, T−2,−1 = −0.5 (Fig. 5.1D). These speed vs time graphs are

obtained in the 10th training session. When T2,1 is high the legs can not stay

at state 2 for a long time and change into state 1. This creates very small step sizes. Whereas, when T−2,−1 is too small a similar problem as in the second test arises where too many legs fall into the state −1 and the hexapod

robot get stuck in a still position without the gait synthesizer being able to

Figure 5.1: Body speed versus time graphs for different scale factor and threshold
values.

restart its motion, although the gait synthesizer tries many new gaits in order
to escape from such states. The robot loses time: notice long delays with zero speed such as between times 1300 and 1400. The last row represents results for parameters α = 0.15, T2,1 = 0.1, T−2,−1 = −0.1. This speed vs time graph shows a tripod gait and is obtained in the third training session, giving rise

to values that can be considered as near optimum.

5.2 Smooth Terrain Tests

In this section simulations demonstrate the learning capability of the

gait synthesizer on smooth flat terrain. Learning is aiming at increasing the

Figure 5.2: Comparison of resultant gaits when training is done according to two
different reinforcement for speed (first row) and critical margin (second row). The
first column gives the resultant gaits, second one body speed versus time, and last
column shows critical margin in the direction of motion versus time.

static stability margin while maximizing speed. As indicated in section 3.2.3,

a reinforcement signal r(t) = −1 is returned when the critical margin, Cm, or


body speed vwBx is zero, except for states in which there exists a swinging leg

on AEP. Otherwise, the controller is rewarded towards its optimization of the

speed and critical margin. Reinforcement signals leading to such rewards are
of the form

r(t) = vwBx /ρ (5.1)

and

r(t) = Cm(t)/Cmmax

respectively. Here Cmmax is the maximum critical margin which is the stroke

distance (Sd), and ρ is the maximum speed of the body according to the speed
policy which can be obtained in tripod gait. The first row of Fig. 5.2 shows the

results of speed optimized gait. The first column gives the resultant gait, sec-

ond one vwBx versus time and last column shows critical margin versus time in
the direction of motion. As expected a tripod gait is obtained because it is the
fastest gait in the rule base of the gait synthesizer and this is where naturally

gait decision has converged to. Maximum speed in second column corresponds
to ρ. The results in the second row corresponds to the gait synthesizer trained

to optimize Cm. As can be seen, a tetrapod gait is obtained which generates


steps to prevent the critical margin from getting smaller (graph in the third
column of second row). The drawback here is on the speed as seen in the

second column.


Figure 5.3: Internal reinforcement versus time.

Another example demonstrates a compromise the gait synthesizer under-


goes in its performance in the case of a tripod gait with small step sizes. The


Figure 5.4: Critical margin, Cm(t), versus time.

robot is trained for speed with an additional reinforcement signal r(t) = −1


when critical margin, Cm, which is the minimum of stability margin and kine-

matic margin, is below a positive value. When the robot starts with a tripod
gait it is punished several times due to this reinforcement signal and the internal reinforcement decreases, as seen in Fig. 5.3. Fig. 5.4 shows the Cm

versus time graph of this simulation. The decrease in the internal reinforcement, signalling a performance problem, causes the gait synthesizer to decide on

new gaits. As can be seen, the gait synthesizer adapts the gait after a cer-
tain amount of time to increase the internal reinforcements without losing the
periodicity. Fig. 5.5 shows the leg tip positions in x direction where one can

observe that leg step sizes decreased. This simulation clearly shows that an

adaptation of the gait synthesizer is achieved for both speed and mobility (in
terms of critical margin) by an appropriate choice of reinforcement signals.


Figure 5.5: Leg tip positions on x direction versus time. In order to increase the
critical margin gait synthesizer applies smaller step sizes.

5.3 Performance on Rough Terrain

Next, the robot is tested on uneven terrain, modelled such that a smooth
surface succeeds a part with randomly placed hills and holes, some of which are deeper than the legs can reach. We conduct a comparative analysis of per-

formance of the hexapod robot with or without the gait synthesizer but with
fixed gait approaches on the defined terrain. Fig. 5.6 shows tip trajectories of
the legs in classical fixed tripod gait. The legs swing in their operation space

and Anterior Extreme Point is taken to be the fixed switching point for mode
2 to 1. When the left front leg (L1) falls in a hole, the robot is stuck and can no

longer move. There are mainly two reasons for such a failure. Firstly the gait

pattern is defined for six legs and can not be implemented if any one is missing.

Secondly as shown in Fig. 5.2, critical margin for the tripod gait approaches

zero when swinging legs are descending. This is because stance legs reach

the Posterior Extreme Point (PEP), so there exists no margin for body movement
to handle the hole. Fig. 5.7 shows tip trajectories and Fig. 5.8 shows the

resultant gait when the gait synthesizer is implemented on the same terrain. The

gait synthesizer successfully handles the terrain irregularities. When the robot
first enters the uneven portion of the terrain, the evaluation of the gait gives
lower reinforcements (due to unexpected bad performance in the robot state)

and new gaits are recommended by the gait synthesizer. When a leg falls in
a hole the synthesizer generates very small steps as ripples in the trajectories.

These hesitations are actually trials of new gaits by Gait Modifier Module and
are also seen on the trajectories of the legs’ tips while they are swinging. One
can argue that a different fixed gait (for instance a tetrapod gait) can tackle

such terrain. This is true from a mobility point of view. However, for search and rescue tasks, speed (or response time) is as important as mobility, and a fixed tetrapod gait has a slower performance that is quite inadequate for a time-pressing SAR operation. A compromise is needed between the two concepts. Fig. 5.8 also
shows that after some time the robot reaches the smooth terrain where it re-
covers a tripod gait. Gait trials for a better evaluation of the gait can be seen

from these results where recoveries occur. The results of another example for a similar terrain are given in Figs. 5.9, 5.10, and 5.11, where a faster recovery

of the tripod gait is achieved.

5.4 Task Shapability: A Must for SAR Operations

In search and rescue (SAR) operations a leg of the hexapod can be re-
quired to be used for tasks such as carrying debris or any equipment while

the robot is in motion, so that it can not participate in the gait of the hexa-

pod. Such a task shapability may be vital in hazardous environment of SAR.


Fig. 5.12 and 5.13 represent such a situation where leg R1 is involved in a
manipulation task and is eliminated from the gait pattern. The leg involved

in a manipulation task is shown here as fixed in a position in swing phase as


if it is holding something. Although the gait synthesizer is seen not being able
to find right away a periodic gait, it provides the mobility in sudden lack of

a leg using the redundancy in multi-legged locomotion. Simulations clearly

indicate the advantageous characteristics of the gait synthesizer for mobility

and robustness required in search and rescue (refer to CD in the appendix).
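
A minimal sketch of how a leg assigned to a manipulation task might be masked out of gait coordination is given below; the leg labels match those used in the figures, but the data structure and function are purely illustrative and not the gait synthesizer's actual mechanism.

# Hypothetical sketch of removing a leg from gait coordination when it is
# assigned a manipulation task; 'LEGS' and 'available_legs' are illustrative
# names, not structures taken from the thesis.
LEGS = ["R1", "R2", "R3", "L1", "L2", "L3"]

def available_legs(busy=()):
    """Legs that may take part in the gait; 'busy' legs are held for other tasks."""
    return [leg for leg in LEGS if leg not in busy]

# Leg R1 is holding a piece of debris, so gaits are synthesized over the
# remaining five legs, relying on the redundancy of the hexapod.
print(available_legs(busy={"R1"}))   # ['R2', 'R3', 'L1', 'L2', 'L3']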

Figure 5.6: Leg tip trajectories of the hexapod in the x-z plane with a fixed tripod gait
on the defined terrain.

Figure 5.7: Leg tip trajectories of the hexapod in the x-z plane with the gait synthesizer
on the defined terrain.

Figure 5.8: Gait of the hexapod robot on uneven terrain. The robot recovers the tripod
gait pattern some time after reaching the smooth terrain.

Figure 5.9: Gait of the hexapod robot on uneven terrain. The robot recovers the tripod
gait faster than in the previous example.

[Single panel: critical margin (0 to 0.35) on the vertical axis versus time (0 to 2500) on the horizontal axis.]

Figure 5.10: Critical margin versus time.

[Six panels: R1, R2, R3 (top row) and L1, L2, L3 (bottom row); vertical axes: leg tip x position (−0.2 to 0.2); horizontal axes: time (500 to 2000).]

Figure 5.11: Leg tip positions in the x direction versus time.

Figure 5.12: Gait generated by the gait synthesizer when leg R1 is missing.

Figure 5.13: Gait generated by the gait synthesizer upon the sudden loss of leg R1.
CHAPTER 6

CONCLUSION

6.1 General

In this thesis work we developed an intelligent, task-shapable controller based
on a gait synthesizer for a hexapod robot traversing unstructured workspaces in
rescue missions within disaster areas. The gait synthesizer adapts decisions drawn
from insect-inspired gait patterns to the changing needs of the terrain and of the
rescue tasks. It is composed of three modules responsible for selecting a new gait,
evaluating the current gait, and modifying the recommended gait according to the
internal reinforcements of previous execution performances. Simulation results show
the potential of the gait synthesizer for search and rescue operations: it adapts to
uneven terrain by shaping gaits, frees the robot from entrapment of some of its legs,
and modifies gaits when some legs are used as manipulators in tasks very different
in nature from locomotion.

The contribution of this thesis work can be analyzed from several points
of view. Towards gait analysis, we introduce a modelling method for insect-inspired
gait patterns. We form fuzzy rules for the different phases of gait pattern cycles from
the relative positions of the legs. The fuzzy rules provide a method to distinguish the
tasks of individual legs in the coordinated movement of hexapod robots. This
modelling and fuzzification process is valid for all legged robots using static stability.
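
To illustrate the flavour of such fuzzy rules, the sketch below encodes one hypothetical antecedent built from relative leg-tip positions with triangular membership functions; the AEP/PEP values, spreads, and the rule itself are assumptions introduced for illustration, not entries of the thesis's rule base.

# Illustrative (assumed) fuzzy antecedents over leg-tip x positions.
def triangular(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def near_aep(leg_tip_x, aep=0.15, spread=0.05):
    """Degree to which a leg tip is near its Anterior Extreme Point."""
    return triangular(leg_tip_x, aep - spread, aep, aep + spread)

def near_pep(leg_tip_x, pep=-0.15, spread=0.05):
    """Degree to which a leg tip is near its Posterior Extreme Point."""
    return triangular(leg_tip_x, pep - spread, pep, pep + spread)

# Rule sketch: IF R2 is near its AEP AND L1 is near its PEP THEN recommend
# that L1 lifts off (min is used as the fuzzy AND).
firing_strength = min(near_aep(0.13), near_pep(-0.14))
print(firing_strength)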

For legged robots, there are two parts to gait generation: the cyclic action
of the individual legs and the coordination of all the legs to make effective use of
their cycles. Periodic gaits offer this coordination within a fast-executing pattern,
though each pattern exhibits a different degree of weakness to irregularities of the
environment. By utilizing a control structure, namely the novel gait synthesizer
architecture, that exhibits intelligent control features such as learning and adaptability
in unstructured environments, we provide exploration among such periodic gait
patterns so as to be both mobile and rapid on uneven terrain. In addition, the control
architecture generates gaits that free trapped legs, owing to the modifier module, one
of the three main modules of the gait synthesizer.
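
For concreteness, the sketch below lists textbook phase offsets for two periodic hexapod gaits and derives which legs are in swing at a given cycle time; the numbers are standard illustrative values, not the gait parameterization actually used by the synthesizer.

# Illustrative phase offsets (fraction of the gait cycle at which each leg
# lifts off) for two commonly cited periodic hexapod gaits; textbook values,
# not the thesis's exact gait parameterization.
TRIPOD = {"R1": 0.0, "L2": 0.0, "R3": 0.0,   # first tripod swings together
          "L1": 0.5, "R2": 0.5, "L3": 0.5}   # second tripod half a cycle later

WAVE = {"R3": 0.0, "R2": 1/6, "R1": 2/6,     # one illustrative metachronal ordering:
        "L3": 3/6, "L2": 4/6, "L1": 5/6}     # roughly one leg in swing at a time

def in_swing(phase_offsets, t, duty_factor):
    """Legs whose swing window contains normalized cycle time t in [0, 1)."""
    swing_fraction = 1.0 - duty_factor
    return [leg for leg, off in phase_offsets.items()
            if (t - off) % 1.0 < swing_fraction]

print(in_swing(TRIPOD, t=0.25, duty_factor=0.5))   # first tripod still swinging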

The dynamics of legged robots is complicated because the individual legs are
coupled through the dynamics of the body. We established its similarity to the
grasping and manipulation of objects by multi-fingered robot hands, and we considered
locomotion as grasping onto an infinitely large, rough, arbitrarily textured terrain.
We made use of the multitude of existing works on grasping models with multi-fingered
robot hands to generate the locomotion dynamical equations for legged robots. Again,
the derived equations are general enough to be applied to all legged robots under
static stability.
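
As a hedged illustration of the kind of equations this analogy yields (a generic form from the multi-fingered grasping literature, with leg-body inertial coupling omitted for brevity; it is not necessarily the exact set derived in the thesis), the body and leg dynamics under stance contacts can be sketched as

\begin{align}
  M_b(x)\,\ddot{x} + C_b(x,\dot{x})\,\dot{x} + N_b(x) &= G(x)\,f_c , \\
  M_i(q_i)\,\ddot{q}_i + C_i(q_i,\dot{q}_i)\,\dot{q}_i + N_i(q_i) &= \tau_i - J_i^{T}(q_i)\,f_{c,i},
  \qquad i = 1,\ldots,6,
\end{align}

where $x$ is the body pose, $q_i$ and $\tau_i$ are the joint variables and torques of leg $i$, $f_{c,i}$ is the contact force at the foot of leg $i$ (zero for a swinging leg), $f_c$ stacks the stance contact forces, $G$ is the grasp-like support map, and $J_i$ is the contact Jacobian of leg $i$.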

Finally, this thesis contributes to the literature on the feasibility of
autonomous intelligent robots for search and rescue (SAR). We have developed a
coordination control of the legs, based on gait patterns, for the fast and secure
mobility that SAR environments demand of legged robots. Fast mobility is ensured
by optimizing speed. Secure mobility is achieved by optimizing the static stability
margin and also by the gait synthesizer modifying its gait to extract the robot from
any motion entrapment. This deadlock-free locomotion in the presence of terrain
entrapment or leg failure is due to the ability of our gait synthesizer to exploit the
redundancy in multi-legged robots.

6.2 Future Work

In this thesis we restrict the subject to gait control. In a complete control
structure of a legged robot, the gait synthesizer should undertake more responsibilities
than the ones addressed in this thesis work, and it can be further expanded in several
ways. By a different choice of reinforcement signals the synthesizer can be trained
and adapted to different tasks; we gave only one example of such an adaptation, in
terms of speed and mobility, which are the main concerns of locomotion in our case.

Moreover, for terrain irregularities that are routinely faced in a search
and rescue (SAR) operation (such as specific obstacle types), dedicated control
modules can be added to the Gait Modifier Module. When such situations are
encountered, the gait synthesizer lets these modules take control of the modifications
held in the GMM while still recommending gaits for locomotion.

Also, new rule bases can be added to the system for five-legged locomotion
so that, upon the permanent loss of a leg, the corresponding rule base can be put
into action. Although we showed that such situations can still be handled by the
gait synthesizer with rules for six legs, the addition of such rules would provide more
functionality to the gait synthesizer at the sole cost of additional memory usage.

The analyses in the thesis are made for a two-dimensional model of hexapod
robot locomotion, i.e., straight-line walking. The gait synthesizer can be adapted to
a real robot by adding rules for the lateral positions of the leg tips, so that locomotion
can take place over a planar x-y terrain. The working space of the legs must also be
adapted when the orientation of the body changes. These foreseen changes to the
system would not affect the performance of the gait synthesizer because the main
concept, namely adapting decisions drawn from gait patterns to the needs of
locomotion, would not change with these modifications.

An important property of the gait synthesizer is the set of generated M values.
These values carry information about the relative functionality of the legs. Although
we use them only to distinguish the operation modes of the legs, other control modules
could also make use of them. For instance, in navigation control the M vector of the
legs could be taken as an input indicating the feasibility of a manoeuvre. For such
uses the learning algorithm needs to be changed, because this time not only the
comparison with threshold values but also the magnitude of M becomes meaningful.
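
A hypothetical sketch of these two usages is given below; the threshold, the mode encoding, and the reduction of the M vector to a feasibility score are all assumptions introduced for illustration, not the thesis's formulation.

# Assumed illustration only: how M values might be thresholded into operation
# modes today, and how their magnitudes might later feed a navigation module.
def leg_modes(m_values, threshold=0.5):
    """Compare each M value against a threshold to pick an operation mode."""
    return [1 if m > threshold else 2 for m in m_values]

def manoeuvre_feasibility(m_values):
    """Hypothetical extension: use the magnitudes themselves as a rough
    indicator of how much freedom the legs leave for a manoeuvre."""
    return sum(m_values) / len(m_values)

print(leg_modes([0.8, 0.2, 0.7, 0.1, 0.9, 0.3]))
print(manoeuvre_feasibility([0.8, 0.2, 0.7, 0.1, 0.9, 0.3]))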

