Optimal Control of Inverted Pendulum Using Ant Colony System Algorithm2

Optimal Control of Inverted Pendulum using
Ant Colony System Algorithm

Shekhar Yadav
J.P.Tiwari
S.K.Nagar
Dept. of Electrical Engineering,

IT.BHU, Varanasi (UP), India.

Email: jptiwari.eee@itbhu.ac.in

Email: sknagar.eee@itbhu.ac.in
Email: yadavshekhar4@gmail.com
Abstract- In this paper, a pole-placement technique is used to design a state feedback controller that stabilizes an inverted pendulum-cart system. The parameters of the feedback gain matrix are optimized using a modified ant colony system (ACS) algorithm. The proposed control strategy has been derived by eliminating the internal signals of the feedback control system by executing row operations. The effectiveness of the proposed technique is validated through experimental results obtained on a simple digital inverted pendulum (33-936IC).
Keywords- Pole Placement design, Ant Colony
System (ACS), LQR, ITAE, Inverted Pendulum.
I. INTRODUCTION
The inverted pendulum on a cart is a well-established test-bed for the design of a wide range of classical and contemporary control techniques. Inverted pendulum systems are widely used in robotics, space rocket guidance systems, fast-moving ground vehicles and anti-seismic control of buildings. The inverted pendulum is a multivariable, nonlinear, fast-reacting, unstable, higher-order system [1,2]. A double rod inverted pendulum system, shown in Fig.1, is mounted on a cart which is driven by a DC motor. The aim of the control strategy is to oscillate the pendulum from its initial position until it reaches the upright equilibrium point [3,7]. Stabilization of the inverted pendulum system is proposed using the pole-placement technique, and the parameters of the state feedback gain matrix are tuned using the ACS algorithm. The ant colony algorithm was first introduced by M. Dorigo [5] and is inspired by the foraging behavior of real ants.
Fig.1 Digital Inverted Pendulum (33-936IC)

The basic idea of the ant colony algorithm is that a set of cooperating artificial ants searches the solution space in parallel, simulating real ants searching their environment for food [4,5]. In the pheromone updating rule, the main modification is that the pheromone increments of different routes are assigned different weight values. To keep a suitable balance between the two contradictory aims of exploring the search space and accelerating convergence, a modified ant colony algorithm is proposed that adjusts the pheromone evaporation factor when solving continuous optimization problems. A quadratic performance index is selected as the objective function, and all parameters of the state feedback gain matrix are tuned by the ACS algorithm.
A Linear Quadratic Regulator (LQR) is also designed for optimal control of the inverted pendulum.
II. POLE-PLACEMENT DESIGN TECHNIQUE
Consider a linear dynamic system in state-space form
$$\dot{x} = Ax + Bu \tag{1}$$
$$y = Cx \tag{2}$$
where
$x$ = state vector of the plant ($n$-vector)
$u$ = control signal (scalar)
$y$ = output signal (scalar)
$A$ = $n \times n$ constant matrix
$B$ = $n \times 1$ constant matrix
and the control signal is given by
$$u = -Kx \tag{3}$$
The $1 \times n$ matrix $K$ is called the state feedback gain matrix. The closed-loop control system, when the state is fed back to the control signal, is given by
$$\dot{x} = (A - BK)x \tag{4}$$
Assuming that the pair (A,B) is completely controllable, there exists a feedback matrix K such that the closed-loop system eigenvalues can be placed at arbitrary locations. The state feedback gain matrix can also be obtained by minimizing the quadratic cost function
$$J = \frac{1}{2}\int_0^{\infty} \left(x^T Q x + u^T R u\right) dt \tag{5}$$
The errors are minimized as,
= [ ]
(6)
where $x$ is the state vector, $x = [x, \dot{x}, \theta, \dot{\theta}]^T$.
The matrix $K$ is expressed through the solution $P$ of the Riccati equation
$$A^T P + P A + Q - P B R^{-1} B^T P = 0 \tag{7}$$
$$K = R^{-1} B^T P \tag{8}$$
III. ANT COLONY SYSTEM
The ant colony system was developed in the early 1990s by Dorigo et al. [5]. The ACS technique is a metaheuristic optimization method inspired by the ability of real ants to establish the shortest path from a food source to their nest. Ants lay a chemical substance, the pheromone trail, on the ground as they move along paths. Each individual ant decides its direction of movement based on the strength of the pheromone trails. The better path is the one with the higher amount of pheromone on the ground. As more and more ants travel to the food source, the shorter path accumulates more pheromone. Thus most of the ants are attracted to the shorter path, and this path-selection behavior produces a positive feedback effect. As a result, the ants eventually find the shortest path [6].
A. GENERATION OF NODES AND PATHS
Let the state feedback gain matrix parameters $K_1, K_2, K_3, K_4$ be the variables to be optimized, and assume that each of them has four valid digits: two digits before the decimal point and two digits after it. The ACS algorithm needs a discrete search space, because the path choices of an ant at each step are limited. To use the ACS algorithm conveniently, the values of $K_1, K_2, K_3$ and $K_4$ are therefore represented on the X-Y plane. As shown in Fig.2, we first draw sixteen lines $L_1, L_2, \ldots, L_{16}$ which have equal length and equal separation and are perpendicular to the X axis. Lines $L_1$~$L_4$, $L_5$~$L_8$, $L_9$~$L_{12}$ and $L_{13}$~$L_{16}$ represent the first to fourth digits of $K_1, K_2, K_3$ and $K_4$ respectively. The X coordinates of these lines are the numbers 1~16 respectively. Each line is then divided into ten portions, so that eleven nodes are generated on each line. The eleven nodes on each line represent the numbers 0~10, which are the possible values of the digit corresponding to that line.
Fig.2 Diagram of Generating Nodes and Paths
Let an ant depart from the origin O of the X-Y plane. When it reaches any node of line $L_{16}$, it completes a tour. Its path can be represented by Path = {O, Node$(x_1, y_{1,j})$, Node$(x_2, y_{2,j})$, ..., Node$(x_{16}, y_{16,j})$}. The values of $K_1, K_2, K_3$ and $K_4$ represented by the path are computed by the following formulas:
$$\begin{aligned}
K_1 &= y_{1,j}\times 10^{1} + y_{2,j}\times 10^{0} + y_{3,j}\times 10^{-1} + y_{4,j}\times 10^{-2}\\
K_2 &= y_{5,j}\times 10^{1} + y_{6,j}\times 10^{0} + y_{7,j}\times 10^{-1} + y_{8,j}\times 10^{-2}\\
K_3 &= y_{9,j}\times 10^{1} + y_{10,j}\times 10^{0} + y_{11,j}\times 10^{-1} + y_{12,j}\times 10^{-2}\\
K_4 &= y_{13,j}\times 10^{1} + y_{14,j}\times 10^{0} + y_{15,j}\times 10^{-1} + y_{16,j}\times 10^{-2}
\end{aligned} \tag{9}$$
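The decoding in Eq. (9) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a tour is stored as a list of sixteen node ordinates, and the function name `decode_gains` and this list layout are hypothetical choices, not part of the original method description.

```python
def decode_gains(path):
    """Map the 16 node ordinates of one tour to the gains K1..K4 (Eq. 9).

    `path` is an assumed representation: path[i] is the ordinate chosen
    on line L_(i+1). Each group of four ordinates forms one gain with
    two digits before and two digits after the decimal point.
    """
    if len(path) != 16:
        raise ValueError("a tour must visit exactly 16 lines")
    weights = (10.0, 1.0, 0.1, 0.01)  # 10^1, 10^0, 10^-1, 10^-2
    gains = []
    for k in range(4):
        digits = path[4 * k:4 * k + 4]
        gains.append(sum(w * y for w, y in zip(weights, digits)))
    return gains


# Example tour whose digits spell out 48.78, 28.63, 51.65 and 71.76.
example_path = [4, 8, 7, 8, 2, 8, 6, 3, 5, 1, 6, 5, 7, 1, 7, 6]
print([round(k, 2) for k in decode_gains(example_path)])
```

The example tour decodes to the gain values reported for the ACS-tuned controller in the results section, which makes it a convenient sanity check for the digit ordering.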
B. TRANSITION RULE
When all ants have moved to one line, say line $L_i$, let $m_{ij}$ ($j$ = 0~10) be the number of ants at node $j$ of line $L_i$; then the total number of ants is $m = \sum_{j=0}^{10} m_{ij}$. Let $\tau(x_i, y_{ij})$ be the concentration of pheromone at Node$(x_i, y_{ij})$, and assume that initially all the nodes have the same amount of pheromone $\tau_0$. While moving, an ant $k$ ($k$ = 1~$m$) on line $L_{i-1}$ ($i$ = 1~16) selects a node $j$ from the eleven nodes of the next line $L_i$ according to the following transition rule:
$$j = \arg\max_{u \in N_i} \left\{ \tau(x_i, y_{iu}) \cdot \eta(x_i, y_{iu}) \right\}, \quad \text{if } q \le q_0 \tag{10}$$
and $j = J$, if $q > q_0$
where $q$ is a random variable uniformly distributed over [0,1], $q_0$ is a tunable parameter, $N_i$ contains all of the nodes on line $L_i$, and $J$ is a node that is randomly selected according to the probability
$$p(x_i, y_{ij}) = \frac{\tau(x_i, y_{ij}) \cdot \eta(x_i, y_{ij})}{\sum_{u \in N_i} \tau(x_i, y_{iu}) \cdot \eta(x_i, y_{iu})} \tag{11}$$
In Eqns. (10) and (11), $\eta(x_i, y_{ij})$ is the visibility of Node$(x_i, y_{ij})$, computed as
$$\eta(x_i, y_{ij}) = \frac{11 - \left| y_{ij} - y^*_{ij} \right|}{11} \tag{12}$$
where the values of $y^*_{ij}$ ($i$ = 1~16, $j$ = 0~10) are set in the following way. In the first iteration of the ACS algorithm, the values of $y^*_{ij}$ are set to the vertical coordinates of the sixteen nodes obtained by mapping the state feedback gain parameters $K_{10}, K_{20}, K_{30}$ and $K_{40}$ onto Fig.2, where $K_{10}, K_{20}, K_{30}$ and $K_{40}$ are obtained using the pole-placement technique. In each of the following iterations, they are set by mapping the gains $K_1^*, K_2^*, K_3^*$ and $K_4^*$ onto Fig.2, where $K_1^*, K_2^*, K_3^*$ and $K_4^*$ are the state feedback gain parameters corresponding to the best tour generated since the beginning of the trial.
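The transition rule described above can be sketched as follows. Several details are assumptions made for illustration: the visibility is taken as (11 - |y - y*|)/11, pheromone and visibility are combined as a plain product with no exponent, and the random node J is drawn by roulette-wheel selection. The names `visibility` and `choose_node` are hypothetical.

```python
import random


def visibility(y, y_star, n_nodes=11):
    """Assumed visibility: 1.0 at the reference ordinate y_star,
    decreasing linearly with the distance |y - y_star|."""
    return (n_nodes - abs(y - y_star)) / n_nodes


def choose_node(tau_line, y_star, q0, rng):
    """Pick a node on the next line with the pseudo-random proportional rule.

    tau_line : pheromone values of the nodes on the next line
    y_star   : reference ordinate for that line (from the best tour so far)
    q0       : exploitation threshold in [0, 1]
    rng      : random.Random instance, passed in for reproducibility
    """
    scores = [tau_line[y] * visibility(y, y_star, len(tau_line))
              for y in range(len(tau_line))]
    if rng.random() <= q0:
        # Exploitation: take the node with the largest score.
        return max(range(len(scores)), key=scores.__getitem__)
    # Exploration: roulette-wheel draw proportional to the scores.
    threshold = rng.random() * sum(scores)
    cumulative = 0.0
    for node, score in enumerate(scores):
        cumulative += score
        if threshold <= cumulative:
            return node
    return len(scores) - 1  # guard against floating-point round-off


rng = random.Random(0)
tau_line = [1.0] * 11  # uniform initial pheromone on one line
print(choose_node(tau_line, y_star=4, q0=0.9, rng=rng))
```

A larger q0 makes the colony follow the current best score more often, while a smaller q0 spends more moves on the probabilistic draw and therefore on exploration.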
C. GLOBAL UPDATE OF PHEROMONE CONCENTRATION
When all of the ants in the colony have completed their tours once in the modified ant colony system algorithm, i.e. when they have arrived on line $L_{16}$, the pheromone concentration of each node belonging to the best tour since the beginning of the trial is updated by the following formulas:
$$\tau(x_i, y_{ij}) \leftarrow (1 - \alpha) \cdot \tau(x_i, y_{ij}) + \alpha \cdot \Delta\tau(x_i, y_{ij}) \tag{13}$$
$$\Delta\tau(x_i, y_{ij}) = Q / \mathrm{ITAE}^* \tag{14}$$
where the Node$(x_i, y_{ij})$ are the nodes belonging to the best tour since the beginning of the trial; $\alpha$ is the parameter which governs the pheromone decay; ITAE* is the value of the ITAE performance criterion corresponding to the best tour since the beginning of the trial; and Q is a positive constant determined in the following way: for a given control system, the state feedback gain matrix is first obtained through the pole-placement technique, the ITAE performance criterion of the system is then computed for these gains and denoted ITAE0, and Q is set equal to ITAE0. As the value of ITAE* becomes smaller, the value of Q/ITAE* becomes greater, which increases the pheromone concentration of the nodes on the best tour since the beginning of the trial and helps to find the best solution within the maximum number of iterations allowed.
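A minimal sketch of the global update is given below. It assumes the pheromone is stored as a 16 x 11 table `tau[line][node]` and that the decay parameter is passed in as `alpha`; the function name, the argument names and the storage layout are illustrative choices rather than details given in the text.

```python
def global_update(tau, best_path, itae_best, q_const, alpha):
    """Reinforce only the nodes on the best tour found so far.

    tau       : list of 16 lists, pheromone per node on each line
    best_path : node index chosen on each line by the best tour
    itae_best : ITAE value of the best tour (ITAE*)
    q_const   : positive constant Q (set to ITAE0 in the text)
    alpha     : pheromone decay parameter, assumed to lie in (0, 1)
    """
    deposit = q_const / itae_best  # grows as the best ITAE shrinks
    for line, node in enumerate(best_path):
        tau[line][node] = (1.0 - alpha) * tau[line][node] + alpha * deposit
    return tau


tau = [[1.0] * 11 for _ in range(16)]
best_path = [4, 8, 7, 8, 2, 8, 6, 3, 5, 1, 6, 5, 7, 1, 7, 6]
global_update(tau, best_path, itae_best=0.5, q_const=1.0, alpha=0.1)
print(tau[0][4], tau[0][5])
```

Because Q equals the ITAE of the pole-placement design, the deposit Q/ITAE* exceeds 1 exactly when the best tour outperforms that initial design.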
D. LOCAL UPDATE OF PHEROMONE CONCENTRATION
The local update is performed as follows: while performing a tour, when ant $k$ on line $L_{i-1}$ selects node $j$ on line $L_i$, the pheromone concentration of Node$(x_i, y_{ij})$ is updated by the following formula:
$$\tau(x_i, y_{ij}) \leftarrow (1 - \rho) \cdot \tau(x_i, y_{ij}) + \rho \cdot \tau_0 \tag{15}$$
The value $\tau_0$ is the same as the initial value of the pheromone concentration. When an ant visits a node, the application of the local update rule makes the pheromone level of the node diminish. This makes the visited nodes less and less attractive to other ants, thus indirectly favoring the exploration of nodes not yet visited.
To optimize the performance of an inverted pendulum-cart system, the gains of the state feedback system are adjusted to maximize or minimize a certain performance index. The objective of the performance index is to encompass in a single number a quality measure for the performance of the system. Various objective functions can be written based on error performance criteria. The performance index is calculated over a time interval T, normally in the region $0 \le T \le t_s$, where $t_s$ is the settling time of the system. To emphasize the effectiveness of the proposed method, the ITAE performance criterion given below is adopted in this paper.
$$\mathrm{ITAE} = \int_0^{\infty} t \, |e(t)| \, dt \tag{16}$$
IV. MODELING OF INVERTED PENDULUM
The inverted pendulum-cart system is usually presented as a pole balancing task. The system to be controlled consists of a cart and a rigid pole hinged to the top of the cart. The cart is moved by a belt that is pulled in either direction by the DC motor attached at the end of the rail. The force with which the cart is pulled is controlled by the voltage applied to the motor, so the value of the force depends on the value of the control voltage. The cart can move left or right on a one-dimensional bounded track, whereas the pole can swing in the vertical plane determined by the track. The linearized system equations around $\theta = \pi$ in state space are:
$$\begin{bmatrix} \dot{x} \\ \ddot{x} \\ \dot{\theta} \\ \ddot{\theta} \end{bmatrix} =
\begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & \dfrac{-(I + ml^2)b}{I(M+m) + Mml^2} & \dfrac{m^2 g l^2}{I(M+m) + Mml^2} & 0 \\
0 & 0 & 0 & 1 \\
0 & \dfrac{-mlb}{I(M+m) + Mml^2} & \dfrac{mgl(M+m)}{I(M+m) + Mml^2} & 0
\end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix} +
\begin{bmatrix} 0 \\ \dfrac{I + ml^2}{I(M+m) + Mml^2} \\ 0 \\ \dfrac{ml}{I(M+m) + Mml^2} \end{bmatrix} u \tag{17}$$
$$y = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix} +
\begin{bmatrix} 0 \\ 0 \end{bmatrix} u \tag{18}$$
where,
Parameter                            Value
M (mass of cart)                     2.4 kg
m (mass of pendulum)                 0.23 kg
b (friction of cart)                 0.05 N/m/sec
I (moment of inertia of pendulum)    0.099 kg·m^2
l (length of pendulum)               0.4 m
g (acceleration due to gravity)      9.8 m/sec^2
The state of the system is defined by the values of four system variables $x$, $\dot{x}$, $\theta$, $\dot{\theta}$: the cart position, the cart velocity, the pendulum angle and the angular velocity of the pendulum pole respectively. A control force is applied to the system to prevent the pole from falling while keeping the cart within the specified limits. The inverted pendulum-cart system used here is the Digital Pendulum (33-936IC).
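For readers who want numbers, the sketch below evaluates the entries of the linearized model from the parameter values listed in this section. It assumes the standard linearized cart-pendulum form with the common denominator I(M+m) + Mml^2; the function name `pendulum_state_space` and the plain nested-list representation are illustrative choices.

```python
def pendulum_state_space(M=2.4, m=0.23, b=0.05, I=0.099, l=0.4, g=9.8):
    """Return (A, B) of the linearized cart-pendulum model.

    State order: cart position, cart velocity, pendulum angle,
    pendulum angular velocity. Defaults are the values quoted in the text.
    """
    p = I * (M + m) + M * m * l ** 2  # common denominator
    A = [
        [0.0, 1.0, 0.0, 0.0],
        [0.0, -(I + m * l ** 2) * b / p, (m ** 2) * g * (l ** 2) / p, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, -m * l * b / p, m * g * l * (M + m) / p, 0.0],
    ]
    B = [0.0, (I + m * l ** 2) / p, 0.0, m * l / p]
    return A, B


A, B = pendulum_state_space()
for row in A:
    print([round(v, 4) for v in row])
print([round(v, 4) for v in B])
```

The positive entry in the fourth row, third column of A is what makes the upright equilibrium unstable in open loop, which is why a stabilizing state feedback gain is needed.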
V. RESULTS AND DISCUSSION
The performance of the proposed controller is discussed in this section. The closed-loop poles of the system are located at $s = \mu_i$ ($i$ = 1,2,3,4), where $\mu_1 = -2 + j2\sqrt{3}$, $\mu_2 = -2 - j2\sqrt{3}$, $\mu_3 = -10$, $\mu_4 = -10$. The closed-loop poles $\mu_1$ and $\mu_2$ are a pair of dominant closed-loop poles with $\zeta = 0.5$ and $\omega_n = 4$. The LQR method finds the optimal control matrix that results in some balance between system errors and control effort. The control weighting matrix (R) and the state-cost matrix (Q) are initially set as: R = 1 and Q = [1 0 0 0; 0 0 0 0; 0 0 1 0; 0 0 0 0]. The weighting factors are chosen by trial and error. The state feedback gain matrix found through MATLAB commands is:
$$K = [0.9701 \;\; 3.0259 \;\; 70.5683 \;\; 27.1358] \tag{19}$$
The pendulum's and cart's overshoot appear fine, but their settling times need improvement and the cart's rise time needs to be decreased. Also, the cart has, in fact, moved in the opposite direction. For now, we will concentrate on improving the settling times and the rise times.
Fig.3 Step response of inverted pendulum system
The settling time and rise time can be improved by updating the matrices Q and R by the trial and error method. The updated matrices Q and R are given as
$$Q = \begin{bmatrix} 4500 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 100 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad R = 1$$
The step response generated with the updated matrices Q and R is shown in Fig.4, and the state feedback gain matrix is given as
$$K = [63.05 \;\; 66.78 \;\; 372.28 \;\; 144.12] \tag{20}$$
Fig.4 Step response using LQR method
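The pole specification used for the pole-placement design in this section (a dominant pair at -2 ± j2√3 and a double pole at -10) can be checked numerically. The sketch below expands the desired characteristic polynomial from these poles and recovers the damping ratio and natural frequency of the dominant pair; the helper name `poly_from_roots` is an illustrative choice.

```python
import math


def poly_from_roots(roots):
    """Expand prod(s - r) and return real coefficients, highest power first."""
    coeffs = [1.0 + 0.0j]
    for r in roots:
        shifted = coeffs + [0.0j]      # multiply the polynomial by s
        for k, c in enumerate(coeffs):
            shifted[k + 1] -= r * c    # subtract r times the polynomial
        coeffs = shifted
    # Imaginary parts cancel for conjugate pairs, so only real parts remain.
    return [c.real for c in coeffs]


poles = [complex(-2.0, 2.0 * math.sqrt(3.0)),
         complex(-2.0, -2.0 * math.sqrt(3.0)),
         -10.0,
         -10.0]

coefficients = poly_from_roots(poles)
omega_n = abs(poles[0])            # natural frequency of the dominant pair
zeta = -poles[0].real / omega_n    # damping ratio of the dominant pair

print([round(c, 6) for c in coefficients])
print(round(omega_n, 6), round(zeta, 6))
```

The expansion corresponds to (s^2 + 4s + 16)(s^2 + 20s + 100) = s^4 + 24s^3 + 196s^2 + 720s + 1600, and the dominant pair gives a natural frequency of 4 and a damping ratio of 0.5, in agreement with the values quoted in the text.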
From Fig.4, we see that all design requirements are satisfied except the steady-state error of the cart position (x). However, with the LQR method the system responds very slowly because the values of the state feedback gains become larger. Therefore, the gains of the state feedback are tuned through the modified ant colony system algorithm. The performance index (PI) is optimized for the position of the cart because, in the real system, the length of the track on which the cart moves is limited, so care has to be taken to restrict the motion of the cart within these limits. This is analyzed on the basis of ITAE, with the aim of maintaining the pendulum position at 0° for any disturbance given to the cart. The feedback gain matrix obtained using the ant colony system (ACS) is
$$K = [48.78 \;\; 28.63 \;\; 51.65 \;\; 71.76] \tag{21}$$
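Since the tuning is judged on the ITAE criterion, a small numerical sketch may help. It assumes the usual definition of ITAE as the integral of t·|e(t)| and approximates it with the trapezoidal rule on sampled data; the function name `itae` and the sample signal are illustrative and are not taken from the experiments.

```python
import math


def itae(times, errors):
    """Approximate the integral of t * |e(t)| with the trapezoidal rule.

    times  : increasing sample instants
    errors : error samples e(t) at those instants
    """
    if len(times) != len(errors):
        raise ValueError("times and errors must have the same length")
    total = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        left = times[k - 1] * abs(errors[k - 1])
        right = times[k] * abs(errors[k])
        total += 0.5 * (left + right) * dt
    return total


# Illustrative error signal: a decaying exponential sampled every 1 ms.
times = [k * 0.001 for k in range(20001)]
errors = [math.exp(-t) for t in times]
print(itae(times, errors))
```

Because the integrand is weighted by time, errors that persist late in the response are penalized more heavily than the initial transient, which suits a controller that must settle the cart within a limited track length.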
Fig.5 Step response of ITAE using ACS algorithm
VI. CONCLUSION
In this paper, an Ant Colony System (ACS) algorithm is used to stabilize an inverted pendulum-cart system. By using the modified ACS algorithm, the calculation time can be reduced and the accuracy can be increased in comparison with the pole-placement design technique. This concept gives a new alternative procedure in time-varying feedback control to improve the stability performance. The technique is implemented on an inverted pendulum-cart system, which is a highly nonlinear system.
VII. REFERENCES
[1] C. C. Chung and J. Hauser, Nonlinear control of a swinging pendulum, Automatica, vol. 31, no. 6, pp. 851-862, Jun. 1995.
[2] Q. Wei, W. P. Dayawansa, and W. S. Levine, Nonlinear controller for an inverted pendulum having restricted travel, Automatica, vol. 31, no. 6, pp. 841-850, 1995.
[3] Hamid R. P., M. R. Jaheh-Motlagh, Ali-Akbar J., Optimal feedback control design using genetic algorithm applied to inverted pendulum, IEEE International Symposium on Industrial Electronics, pp. 263-268, June 2007.
[4] C. Grosan and A. Abraham, Hybrid Evolutionary Algorithms: Methodologies, Architectures, and Reviews, Studies in Computational Intelligence (SCI) 75, 1-17 (2007), www.springerlink.com, Springer-Verlag Berlin Heidelberg.
[5] M. Dorigo, M. Birattari, and T. Stützle, Ant Colony Optimization: Artificial Ants as a Computational Intelligence Technique, IEEE Computational Intelligence Magazine, November 2006.
[6] M. Dorigo, L. M. Gambardella, Ant colony system: a cooperative learning approach to the traveling salesman problem, IEEE Trans. on Evolutionary Computation, vol. 1, no. 1, pp. 53-66, 1997.
[7] Katsuhiko Ogata, Modern Control Engineering, Prentice Hall, New Jersey, 3rd edition, 1997.