
Optimal Control of Inverted Pendulum using Ant Colony System Algorithm


Shekhar Yadav
Dept. of Electrical Engineering, IT.BHU, Varanasi (UP), India.
Email: yadavshekhar4@gmail.com

J.P. Tiwari
Dept. of Electrical Engineering, IT.BHU, Varanasi (UP), India.
Email: jptiwari.eee@itbhu.ac.in

S.K. Nagar
Dept. of Electrical Engineering, IT.BHU, Varanasi (UP), India.
Email: sknagar.eee@itbhu.ac.in

Abstract- In this paper, a pole-placement technique is used to design a feedback controller for an inverted pendulum system. A state feedback method is proposed for the control and stabilization of an inverted pendulum-cart system. The parameters of the feedback gain matrix are optimized using a modified ant colony system (ACS) algorithm. The proposed control strategy has been derived by eliminating the internal signals of the feedback control system by executing row operations. The effectiveness of the proposed technique is validated through experimental results obtained on a simple digital inverted pendulum (33-936IC).
Keywords- Pole Placement design, Ant Colony System (ACS), LQR, ITAE, Inverted Pendulum.
I. INTRODUCTION

The inverted pendulum on a cart is a well-established test-bed for the design of a wide range of classical and contemporary control techniques. Inverted pendulum systems are widely used in robotics, space rocket guidance systems, fast-moving ground vehicles, and anti-seismic control of buildings. The inverted pendulum is a multivariable, nonlinear, fast-reacting, unstable, higher-order system [1,2]. A double rod inverted pendulum system, as shown in Fig.1, is mounted on a cart which is driven by a dc motor. The aim of the control strategy is to swing the pendulum from its initial position until it reaches the upright equilibrium point [3,7]. Stabilization of the inverted pendulum system is achieved using the pole-placement technique, and the parameters of the state feedback gain matrix are tuned using the ACS algorithm. The ant colony algorithm was first introduced by M. Dorigo [5] and is inspired by the foraging behavior of real ants.

Fig.1 Digital Inverted Pendulum (33-936IC)


The basic idea of the ant colony algorithm is that a set of cooperating artificial ants searches the solution space in parallel, simulating real ants searching their environment for food [4,5]. In the pheromone updating rule, the main modification is that the pheromone trail increments of different routes are assigned different weight values. To keep a suitable balance between the two contradictory aims of exploring the search space and accelerating convergence, a modified ant colony algorithm for continuous optimization problems is proposed, based on adjusting the pheromone evaporation factor. The quadratic performance index is selected as the objective function, and all parameters of the state feedback gain matrix are tuned by the ACS algorithm.

A Linear Quadratic Regulator (LQR) is also designed for optimal control of the inverted pendulum.

II. POLE-PLACEMENT DESIGN TECHNIQUE
Consider a linear dynamic system in state-space form

$$\dot{x} = Ax + Bu \tag{1}$$
$$y = Cx \tag{2}$$

where
$x$ = state vector of the plant ($n$-vector)
$u$ = control signal (scalar)
$y$ = output signal (scalar)
$A$ = $n \times n$ constant matrix
$B$ = $n \times 1$ constant matrix
and the control signal is given by

$$u = -Kx \tag{3}$$

The $1 \times n$ matrix $K$ is called the state feedback gain matrix. The closed-loop control system, when the state is fed back to the control signal, is given by

$$\dot{x} = (A - BK)x \tag{4}$$

Assuming that the pair (A,B) is completely controllable, there exists a feedback matrix $K$ such that the closed-loop system eigenvalues can be placed in arbitrary locations. The state feedback gain matrix can also be obtained through minimization of the quadratic cost function

$$J = \frac{1}{2}\int_0^{\infty} \left( x^T Q x + u^T R u \right) dt \tag{5}$$

The errors are minimized by the state feedback law

$$u = -Kx \tag{6}$$

where $x$ is the state vector,
$$x = [x, \dot{x}, \theta, \dot{\theta}]^T$$
The matrix $K$ is expressed through the solution $P$ of the Riccati equation.

$$A^T P + P A + Q - P B R^{-1} B^T P = 0 \tag{7}$$
$$K = R^{-1} B^T P \tag{8}$$

III. ANT COLONY SYSTEM

The ant colony system was developed in the early 1990s by Dorigo et al. [5]. The ACS technique is a metaheuristic optimization method inspired by the ability of real ants to establish the shortest path from a food source to their nest. Ants lay a chemical substance, pheromone, on the ground as they move along paths. Each individual ant decides its direction of movement based on the strength of the pheromone trails, so the better path is the one that carries the higher amount of pheromone. As more and more ants travel to the food source, the shorter path accumulates more pheromone, because ants complete it faster and reinforce it more often. Thus, most of the ants are attracted to the shorter path, and this path-selection behavior produces a positive feedback effect. As a result, the ants eventually find the shortest path [6].
A. GENERATION OF NODES AND PATHS
Let the state feedback gain matrix parameters $K_1, K_2, K_3, K_4$ be the variables to be optimized, and assume that each of them has four valid digits: two digits before the decimal point and two digits after it. The ACS algorithm needs a discrete solution space, because the path choices of an ant at each step are limited. For convenience, the values of $K_1, K_2, K_3$ and $K_4$ are therefore expressed on the X-Y plane. As shown in Fig.2, we first draw sixteen lines $L_1, L_2, \ldots, L_{16}$ which have equal length and equal separation and are perpendicular to the X axis. Lines $L_1 \sim L_4$, $L_5 \sim L_8$, $L_9 \sim L_{12}$ and $L_{13} \sim L_{16}$ represent the first to fourth digits of $K_1, K_2, K_3$ and $K_4$ respectively. The X coordinates of these lines are the numbers 1~16 respectively. Each line is then divided into ten portions, so that eleven nodes are generated on each line. The eleven nodes on each line represent the numbers 0~10 respectively, which are the possible values of the digit corresponding to that line. Let an ant depart from the origin O of the X-Y plane. When it reaches any node of line $L_{16}$, it completes a tour. Its path can be represented by Path = {O, Node$(x_1, y_{1j})$, Node$(x_2, y_{2j})$, ..., Node$(x_{16}, y_{16j})$}, where Node$(x_i, y_{ij})$ denotes the node chosen on line $L_i$. The values of $K_1, K_2, K_3$ and $K_4$ represented by the path are computed by the following formulas:

Fig.2 Diagram of Generating Nodes and Paths

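$$
\begin{aligned}
K_1 &= y_{1j}\times 10^{1} + y_{2j}\times 10^{0} + y_{3j}\times 10^{-1} + y_{4j}\times 10^{-2} \\
K_2 &= y_{5j}\times 10^{1} + y_{6j}\times 10^{0} + y_{7j}\times 10^{-1} + y_{8j}\times 10^{-2} \\
K_3 &= y_{9j}\times 10^{1} + y_{10j}\times 10^{0} + y_{11j}\times 10^{-1} + y_{12j}\times 10^{-2} \\
K_4 &= y_{13j}\times 10^{1} + y_{14j}\times 10^{0} + y_{15j}\times 10^{-1} + y_{16j}\times 10^{-2}
\end{aligned} \tag{9}
$$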
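The mapping of Eq. (9) from a tour to the four gains can be sketched in a few lines of Python. This is only an illustration: the function name `decode_tour` is hypothetical, and the example tour is chosen so that it decodes to the ACS gains reported in Eq. (21).

```python
def decode_tour(digits):
    """Map the 16 node values of one tour to [K1, K2, K3, K4], following Eq. (9)."""
    if len(digits) != 16:
        raise ValueError("a tour must visit exactly 16 lines")
    if any(d < 0 or d > 10 for d in digits):
        raise ValueError("node values must lie in 0..10")
    gains = []
    for g in range(4):
        d1, d2, d3, d4 = digits[4 * g: 4 * g + 4]
        # Work in integer hundredths to avoid accumulating floating-point error.
        hundredths = 1000 * d1 + 100 * d2 + 10 * d3 + d4
        gains.append(hundredths / 100)
    return gains


# Node values visited on lines L1..L16 (illustrative tour).
tour = [4, 8, 7, 8,  2, 8, 6, 3,  5, 1, 6, 5,  7, 1, 7, 6]
print(decode_tour(tour))  # → [48.78, 28.63, 51.65, 71.76]
```

Each group of four consecutive lines contributes one gain, so the search space of the ants is the set of all sixteen-digit tours.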

B. TRANSITION RULE
When all ants have moved to one line, say line $L_i$, let $b_j$ ($j = 0 \sim 10$) be the number of ants at node $j$ of line $L_i$; the total number of ants is then $m = \sum_{j=0}^{10} b_j$. Let $\tau(x_i, y_{ij})$ be the concentration of pheromone at Node$(x_i, y_{ij})$, and assume that initially all nodes have the same amount of pheromone $\tau_0$. While moving, an ant $k$ ($k = 1 \sim m$) on line $L_{i-1}$ ($i = 1 \sim 16$) selects a node $j$ from the eleven nodes of the next line $L_i$ according to the following transition rule:

$$j = \arg\max_{u \in A_i} \left\{ \tau(x_i, y_{iu}) \cdot \eta(x_i, y_{iu}) \right\}, \quad \text{if } q \le q_0 \tag{10}$$

and $j = J$ if $q > q_0$, where $q$ is a random variable uniformly distributed over [0,1], $q_0$ is a tunable parameter, $A_i$ contains all of the nodes on line $L_i$, and $J$ is a node that is randomly selected according to the probability

$$p(x_i, y_{ij}) = \frac{\tau(x_i, y_{ij}) \cdot \eta(x_i, y_{ij})}{\sum_{u \in A_i} \tau(x_i, y_{iu}) \cdot \eta(x_i, y_{iu})} \tag{11}$$

In Eqn. (10) and (11), $\eta(x_i, y_{ij})$ is the visibility of Node$(x_i, y_{ij})$, which is computed as

$$\eta(x_i, y_{ij}) = \frac{11 - \left| y_{ij} - y^*_{ij} \right|}{11} \tag{12}$$

so that nodes closer to the reference value have a higher visibility. The values of $y^*_{ij}$ ($i = 1 \sim 16$, $j = 0 \sim 10$) are set in the following way. In the first iteration of the ACS algorithm, the values of $y^*_{ij}$ are set to the vertical coordinates of the sixteen nodes obtained by mapping the state feedback gain matrix parameters $K_{10}, K_{20}, K_{30}$ and $K_{40}$ onto Fig.2, where $K_{10}, K_{20}, K_{30}$ and $K_{40}$ are obtained by the pole-placement technique. In each of the following iterations, they are set to the vertical coordinates of the nodes obtained by mapping $K_1^*, K_2^*, K_3^*$ and $K_4^*$ onto Fig.2, where $K_1^*, K_2^*, K_3^*$ and $K_4^*$ are the state feedback gain parameters corresponding to the best tour generated since the beginning of the trial.
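A minimal Python sketch of this node selection for a single line is given below. It assumes the pheromone-visibility product of Eqs. (10) and (11) and the visibility of Eq. (12); the function names, the pheromone value 0.01 and the reference digit 4 are illustrative assumptions, not values from the paper.

```python
import random

N_NODES = 11  # node values 0..10 on each line, as in the text


def visibility(y, y_ref):
    """Eq. (12): nodes closer to the reference digit y_ref are more visible."""
    return (N_NODES - abs(y - y_ref)) / N_NODES


def select_node(tau_line, y_ref, q0, rng=random):
    """Choose a node on one line following Eqs. (10) and (11).

    tau_line: pheromone of the 11 nodes on the line.
    y_ref:    reference digit for this line (from the best gains so far).
    q0:       probability of exploiting the best-scoring node.
    """
    scores = [tau_line[y] * visibility(y, y_ref) for y in range(N_NODES)]
    if rng.random() <= q0:
        # Exploitation: node with the largest pheromone-visibility product.
        return max(range(N_NODES), key=lambda y: scores[y])
    # Exploration: draw J with probability proportional to its score, Eq. (11).
    return rng.choices(range(N_NODES), weights=scores, k=1)[0]


tau_line = [0.01] * N_NODES          # uniform initial pheromone (assumed value)
print(select_node(tau_line, y_ref=4, q0=1.0))  # q0 = 1 forces exploitation → 4
```

With uniform pheromone the choice is driven by visibility alone, so the first iteration searches around the pole-placement gains; as pheromone accumulates, the product shifts the search toward the best tour found so far.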
C. GLOBAL UPDATE OF PHEROMONE CONCENTRATION
When all of the ants in the colony have completed their tours once in the modified ant colony system algorithm, i.e. when they have arrived on line $L_{16}$, the pheromone concentration of each node belonging to the best tour since the beginning of the trial is updated by the following formulas:

$$\tau(x_i, y_{ij}) \leftarrow (1 - \alpha)\cdot\tau(x_i, y_{ij}) + \alpha\cdot\Delta\tau(x_i, y_{ij}) \tag{13}$$
$$\Delta\tau(x_i, y_{ij}) = Q/\mathrm{ITAE}^* \tag{14}$$

where the nodes Node$(x_i, y_{ij})$ are those belonging to the best tour since the beginning of the trial; $\alpha$ is the parameter which governs the pheromone decay; ITAE* is the value of the ITAE performance criterion corresponding to the best tour since the beginning of the trial; and Q is a positive constant determined as follows. For a given control system, the state feedback gain matrix is first obtained through the pole-placement technique. The ITAE performance criterion of the system is then computed for these gains and denoted ITAE0, and Q is set equal to ITAE0. As the value of ITAE* becomes smaller, the value of Q/ITAE* becomes greater, which increases the pheromone concentration of the nodes on the best tour since the beginning of the trial and helps to find the best solution within the maximum number of iterations allowed.
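The global update of Eqs. (13) and (14) can be sketched as follows. The pheromone table layout (16 lines by 11 nodes), the function name and all numerical values in the usage lines are assumptions made for illustration only.

```python
def global_update(tau, best_tour, itae_best, itae0, alpha):
    """Apply Eqs. (13)-(14) to the nodes of the best tour found so far.

    tau:       list of 16 lists (one per line), each with 11 pheromone values.
    best_tour: the 16 node values of the best tour since the start of the trial.
    itae_best: ITAE* of that tour; itae0: ITAE of the pole-placement gains (Q).
    """
    delta = itae0 / itae_best  # Eq. (14) with Q = ITAE0
    for line, node in enumerate(best_tour):
        tau[line][node] = (1 - alpha) * tau[line][node] + alpha * delta
    return tau


# Illustrative numbers only: tau0 = 0.01, alpha = 0.1, ITAE0 = 1.0, ITAE* = 0.5.
tau = [[0.01] * 11 for _ in range(16)]
best_tour = [4, 8, 7, 8, 2, 8, 6, 3, 5, 1, 6, 5, 7, 1, 7, 6]
global_update(tau, best_tour, itae_best=0.5, itae0=1.0, alpha=0.1)
print(tau[0][4], tau[0][3])  # reinforced node vs. untouched node on line L1
```

Only the sixteen nodes on the best tour are reinforced, and the reinforcement grows as ITAE* falls below ITAE0.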
D. LOCAL UPDATE OF PHEROMONE CONCENTRATION
The local update is performed as follows: while performing a tour, when ant $k$ is on line $L_{i-1}$ and selects node $j$ on line $L_i$, the pheromone concentration of Node$(x_i, y_{ij})$ is updated by the following formula:

$$\tau(x_i, y_{ij}) \leftarrow (1-\rho)\cdot\tau(x_i, y_{ij}) + \rho\cdot\tau_0 \tag{15}$$

where $\rho$ is the local pheromone decay parameter and the value $\tau_0$ is the same as the initial value of the pheromone concentration. When an ant visits a node, the application of the local update rule makes the pheromone level of the node diminish. This makes the visited nodes less and less attractive to other ants, thus indirectly favoring the exploration of nodes not yet visited.

To optimize the performance of an inverted pendulum-cart system, the gains of the state feedback system are adjusted to maximize or minimize a certain performance index. The objective of the performance index is to encompass in a single number a quality measure for the performance of the system. Various objective functions have been defined based on error performance criteria. The performance index is calculated over a time interval $T$, normally in the region $0 \le T \le t_s$, where $t_s$ is the settling time of the system. To emphasize the effectiveness of the proposed method, the ITAE performance criterion given below is adopted in this paper.

$$\mathrm{ITAE} = \int_0^{T} t\,|e(t)|\,dt \tag{16}$$

IV. MODELING OF INVERTED PENDULUM
The inverted pendulum-cart system is usually presented as a pole-balancing task. The system to be controlled consists of a cart and a rigid pole hinged to the top of the cart. The cart is moved by a belt that is pulled in either direction by the DC motor attached at the end of the rail. The force with which the cart is pulled is controlled by the voltage applied to the motor, and its value depends on the value of the control voltage. The cart can move left or right on a one-dimensional bounded track, whereas the pole can swing in the vertical plane determined by the track. The linearized system equations around $\theta = \pi$ in state space are:

$$
\begin{bmatrix} \dot{x} \\ \ddot{x} \\ \dot{\theta} \\ \ddot{\theta} \end{bmatrix} =
\begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & \dfrac{-(I+ml^2)b}{I(M+m)+Mml^2} & \dfrac{m^2 g l^2}{I(M+m)+Mml^2} & 0 \\
0 & 0 & 0 & 1 \\
0 & \dfrac{-mlb}{I(M+m)+Mml^2} & \dfrac{mgl(M+m)}{I(M+m)+Mml^2} & 0
\end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix} +
\begin{bmatrix} 0 \\ \dfrac{I+ml^2}{I(M+m)+Mml^2} \\ 0 \\ \dfrac{ml}{I(M+m)+Mml^2} \end{bmatrix} u
\tag{17}
$$

$$
y = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix} +
\begin{bmatrix} 0 \\ 0 \end{bmatrix} u
\tag{18}
$$

where the parameter values are:
M (mass of cart)                      2.4 kg
m (mass of pendulum)                  0.23 kg
b (friction of cart)                  0.05 N/m/sec
I (moment of inertia of pendulum)     0.099 kg m^2
l (length of pendulum)                0.4 m
g (acceleration due to gravity)       9.8 m/sec^2

The state of the system is defined by the values of four system variables, $x$, $\dot{x}$, $\theta$ and $\dot{\theta}$: the cart position, cart velocity, pendulum angle and angular velocity of the pendulum pole respectively. A control force is applied to the system to prevent the pole from falling while keeping the cart within the specified limits. The inverted pendulum-cart system used here is the Digital Pendulum (33-936IC).
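As a numerical check of the model, the matrices of Eqs. (17) and (18) can be built from the tabulated parameters with NumPy. This is a sketch under one stated assumption: the signs of the matrix entries follow the standard linearization of the cart-pendulum about the upright position. The function name `pendulum_state_space` is hypothetical.

```python
import numpy as np

# Parameter values from the table in this section.
M, m, b, I, l, g = 2.4, 0.23, 0.05, 0.099, 0.4, 9.8


def pendulum_state_space():
    """Return (A, B, C, D) of the linearized cart-pendulum model, Eqs. (17)-(18)."""
    den = I * (M + m) + M * m * l**2  # common denominator of all fractions
    A = np.array([
        [0.0, 1.0, 0.0, 0.0],
        [0.0, -(I + m * l**2) * b / den, m**2 * g * l**2 / den, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, -m * l * b / den, m * g * l * (M + m) / den, 0.0],
    ])
    B = np.array([[0.0], [(I + m * l**2) / den], [0.0], [m * l / den]])
    C = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]])
    D = np.zeros((2, 1))
    return A, B, C, D


A, B, C, D = pendulum_state_space()
# Controllability matrix [B, AB, A^2 B, A^3 B]; full rank means poles can be placed freely.
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(4)])
print("open-loop poles:", np.linalg.eigvals(A))
print("controllable:", np.linalg.matrix_rank(ctrb) == 4)
```

One open-loop pole lies in the right half-plane, which reflects the instability of the upright position, while the full-rank controllability matrix is the condition required by the pole-placement design of Section II.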

V. RESULTS AND DISCUSSION
The performance of the proposed controller is discussed in this section. The closed-loop poles of the system are located at $s = \mu_i$ ($i = 1,2,3,4$), where $\mu_1 = -2 + j2\sqrt{3}$, $\mu_2 = -2 - j2\sqrt{3}$, $\mu_3 = -10$, $\mu_4 = -10$. The closed-loop poles $\mu_1$ and $\mu_2$ are a pair of dominant closed-loop poles with $\zeta = 0.5$ and $\omega_n = 4$. The LQR method finds the optimal control matrix that results in a balance between system errors and control effort. The performance index matrix (R) and the state-cost matrix (Q) are initially set as R = 1 and Q = [1 0 0 0; 0 0 0 0; 0 0 1 0; 0 0 0 0]. The weighting factors are chosen by trial and error. The state feedback gain matrix found through MATLAB commands is:

$$K = [0.9701 \quad 3.0259 \quad 70.5683 \quad 27.1358] \tag{19}$$

The pendulum's and cart's overshoot appear fine, but their settling times need improvement and the cart's rise time needs to be decreased. The cart has also, in fact, moved in the opposite direction. For now, we concentrate on improving the settling times and the rise times.

Fig.3 Step response of inverted pendulum system

The settling time and rise time can be improved by updating the matrices Q and R by trial and error. The updated matrices Q and R are given as

Q = [4500 0 0 0; 0 0 0 0; 0 0 100 0; 0 0 0 0], R = 1

The step response generated with the updated matrices Q and R is shown in Fig.4, and the state feedback gain matrix is given as

$$K = [63.05 \quad 66.78 \quad 372.28 \quad 144.12] \tag{20}$$

Fig.4 Step response using LQR method
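The two gain designs discussed so far can be reproduced in outline with NumPy and SciPy. The sketch below is not the authors' MATLAB code: it rebuilds the linearized model with the standard sign convention (an assumption), uses Ackermann's formula for pole placement, and solves the Riccati equation (7) for the LQR gain of Eq. (8). The function names are hypothetical, and the printed gains need not coincide with Eqs. (19) and (20).

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Linearized model rebuilt here so the block is self-contained (signs are an assumption).
M, m, b, I, l, g = 2.4, 0.23, 0.05, 0.099, 0.4, 9.8
den = I * (M + m) + M * m * l**2
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, -(I + m * l**2) * b / den, m**2 * g * l**2 / den, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -m * l * b / den, m * g * l * (M + m) / den, 0.0],
])
B = np.array([[0.0], [(I + m * l**2) / den], [0.0], [m * l / den]])


def ackermann_gain(A, B, poles):
    """Pole placement for a single-input system via Ackermann's formula."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    coeffs = np.real(np.poly(poles))  # desired characteristic polynomial
    # phi(A) = A^n + c1 A^(n-1) + ... + cn I
    phi = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
    last_row = np.zeros((1, n))
    last_row[0, -1] = 1.0
    return last_row @ np.linalg.solve(ctrb, phi)


def lqr_gain(A, B, Q, R):
    """Continuous-time LQR gain K = R^-1 B^T P, with P solving Eq. (7)."""
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)


poles = [-2 + 2j * np.sqrt(3), -2 - 2j * np.sqrt(3), -10, -10]
K_pp = ackermann_gain(A, B, poles)
K_lqr = lqr_gain(A, B, np.diag([4500.0, 0.0, 100.0, 0.0]), np.array([[1.0]]))
print("pole-placement K:", K_pp)
print("LQR K:", K_lqr)
print("LQR closed-loop poles:", np.linalg.eigvals(A - B @ K_lqr))
```

Ackermann's formula is used here because the desired pole set contains a repeated pole at -10, which some general-purpose placement routines do not accept for single-input systems.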

From Fig.4, we see that all design requirements are satisfied except the steady-state error of the cart position (x). However, with the LQR method the system responds very slowly, because the values of the state feedback gains become larger. Therefore, the state feedback gains are tuned through the modified ant colony system algorithm. The performance index (PI) is optimized for the position of the cart because, in the real system, the length of the track on which the cart moves is limited, so care has to be taken to keep the motion of the cart within these limits. The response is analyzed on the basis of ITAE, with the aim of maintaining the pendulum position at 0° for any disturbance given to the cart. The feedback gain matrix obtained using the ant colony system (ACS) is

$$K = [48.78 \quad 28.63 \quad 51.65 \quad 71.76] \tag{21}$$
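The ITAE criterion of Eq. (16), which ranks the candidate gains, can be approximated from a sampled error signal. The sketch below uses the trapezoidal rule; the function name `itae` and the test signal $e(t) = e^{-t}$ are illustrative assumptions, chosen because the exact integral of $t\,e^{-t}$ over $[0, \infty)$ is 1.

```python
import math


def itae(times, errors):
    """Approximate ITAE = integral of t*|e(t)| dt with the trapezoidal rule."""
    total = 0.0
    for k in range(1, len(times)):
        f_prev = times[k - 1] * abs(errors[k - 1])
        f_curr = times[k] * abs(errors[k])
        total += 0.5 * (f_prev + f_curr) * (times[k] - times[k - 1])
    return total


# Hypothetical error signal e(t) = exp(-t), sampled every 1 ms over 0..20 s.
dt = 0.001
times = [k * dt for k in range(20001)]
errors = [math.exp(-t) for t in times]
print(round(itae(times, errors), 3))  # → 1.0
```

In the tuning loop, `errors` would be the cart position error of the closed-loop response for the gains decoded from one tour, and the value obtained for the pole-placement gains would serve as ITAE0.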

Fig.5 Step response of ITAE using ACS algorithm

VI. CONCLUSION
In this paper, an Ant Colony System (ACS) algorithm is used to stabilize an inverted pendulum-cart system. By using the modified ACS algorithm, the calculation time can be reduced and the accuracy can be increased in comparison with the pole-placement design technique. This concept gives a new alternative procedure in time-varying feedback control to improve the stability performance. The technique is implemented on an inverted pendulum-cart system, which is a highly nonlinear system.

VII. REFERENCES
[1] C. C. Chung and J. Hauser, "Nonlinear control of a swinging pendulum," Automatica, vol. 31, no. 6, pp. 851-862, Jun. 1995.
[2] Q. Wei, W. P. Dayawansa, and W. S. Levine, "Nonlinear controller for an inverted pendulum having restricted travel," Automatica, vol. 31, no. 6, pp. 841-850, 1995.
[3] Hamid R. P., M. R. Jahed-Motlagh, Ali-Akbar J., "Optimal feedback control design using genetic algorithm applied to inverted pendulum," IEEE International Symposium on Industrial Electronics, pp. 263-268, June 2007.
[4] C. Grosan and A. Abraham, "Hybrid Evolutionary Algorithms: Methodologies, Architectures, and Reviews," Studies in Computational Intelligence (SCI) 75, 1-17 (2007), www.springerlink.com, © Springer-Verlag Berlin Heidelberg.
[5] M. Dorigo, M. Birattari, and T. Stützle, "Ant Colony Optimization: Artificial Ants as a Computational Intelligence Technique," IEEE Computational Intelligence Magazine, November 2006.
[6] M. Dorigo and L. M. Gambardella, "Ant colony system: a cooperative learning approach to the traveling salesman problem," IEEE Trans. on Evolutionary Computation, vol. 1, no. 1, pp. 53-66, 1997.
[7] Katsuhiko Ogata, Modern Control Engineering, Prentice Hall, New Jersey, 3rd edition, 1997.