
Non-Linear Optimal Swing-Up of an Inverted Pendulum on a Cart

using Pontryagin's Principle with Fixed Final Time


Guillerme Phillips Furtado, Matteo Rocchi, Riccardo Romagnoni
Politecnico di Milano, Dipartimento di Meccanica

Abstract
The inverted pendulum is frequently used to benchmark control strategies. This paper applies the Euler-Lagrange equations supplemented by Pontryagin's Principle in order to optimally swing up an inverted pendulum on a cart in an arbitrarily fixed final time. The two-point boundary value problem generated by the Euler-Lagrange equations is then solved numerically, yielding the optimal trajectory. The advantage of this method is that it is a direct approach to finding an optimal trajectory for a given dynamical system.

1. Introduction
The inverted pendulum is one of the most commonly studied systems in the control area. It is quite
popular because the system is an excellent test bed for learning and testing various control techniques. It is
an under-actuated mechanical system and inherently open loop unstable with highly non-linear dynamics.
The range of applications of the inverted pendulum is wide: from robotics to space rocket guidance systems. The inverted pendulum can be used to model the yaw and the pitch of a rocket when the center of drag is ahead of the center of gravity, causing aerodynamic instability. Originally, these systems were used to illustrate ideas in linear control theory, such as the control of linear unstable systems. Their inherently non-linear nature helped them maintain their usefulness over the years, and they are now used to illustrate several ideas emerging in the field of modern non-linear control [1].
In this paper, the swing up of a single pendulum on a cart and its stabilization at the unstable
equilibrium position are addressed: the swing up is done by finding an optimal trajectory that minimizes the
control effort required to bring the pendulum from the downward stable equilibrium position to the upward
unstable equilibrium position. The trajectory is based on the numerical solution of Euler-Lagrange equations,
supplemented by Pontryagin's Principle, a necessary condition for an optimal control law in a non-linear
system [2]. The solution obtained for the control law has the drawback of being open loop and therefore is
very sensitive to modeling errors and noise, so it should be restricted to situations where the modeled system
is very close to the real system.
After the pendulum is successfully brought to the upward position, it is stabilized by a linear
quadratic regulator, which gives a full state feedback optimal control for the linearized system around the
equilibrium position [3].
2. Inverted Pendulum System
The inverted pendulum on a cart system is depicted in Figure 2.1.

Figure 2.1: the inverted pendulum on a cart.

The equations of motion for the system were derived using Lagrange equations

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} + \frac{\partial D}{\partial \dot{q}_i} = Q_i    (2.1)
where L = T - V is the Lagrangian, D is the dissipation function and Q is a vector of generalized forces acting along the generalized coordinates that are not accounted for in the formulation of the kinetic energy T, potential energy V and dissipation function D. The kinetic and potential energies are given by the sum of the energies of the individual components (the cart and the pendulum), which are written in terms of the coordinates x and \theta, representing the displacement of the cart and the angle of the pendulum with respect to the vertical axis, respectively:

V_0 = 0
V_1 = m_1 g l \cos\theta
V = V_0 + V_1

T_0 = \tfrac{1}{2} m_0 \dot{x}^2
T_1 = \tfrac{1}{2} m_1 \left[ (\dot{x} + l \dot{\theta}\cos\theta)^2 + (l \dot{\theta}\sin\theta)^2 \right] + \tfrac{1}{2} I \dot{\theta}^2
    = \tfrac{1}{2} m_1 \dot{x}^2 + \tfrac{1}{2}(m_1 l^2 + I)\dot{\theta}^2 + m_1 l \dot{x}\dot{\theta}\cos\theta
T = T_0 + T_1

D = D_0 + D_1
D_0 = \tfrac{1}{2} c_0 \dot{x}^2
D_1 = \tfrac{1}{2} c_1 \dot{\theta}^2

where
m_0   mass of the cart;
m_1   mass of the rod;
I     moment of inertia of the rod;
c_0   damping coefficient of the track;
c_1   damping coefficient of the joint;
g     gravity constant;
l     length of the rod.

It is assumed that the only dissipative forces are due to the viscous damping c_0 and c_1. Moreover, the center of mass of the pendulum is considered to coincide with its geometrical center, with the mass uniformly distributed along its length.
From the Lagrange scalar equations with respect to x and \theta,

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) - \frac{\partial L}{\partial x} + \frac{\partial D}{\partial \dot{x}} = u(t)    (2.2)

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} + \frac{\partial D}{\partial \dot{\theta}} = 0    (2.3)
the non-linear equations of motion for the system are obtained:

\ddot{x} = \frac{-3 g l m_1 \cos\theta \sin\theta + 4 l u - 4 c_0 l \dot{x} + 6 c_1 \dot{\theta}\cos\theta + 2 l^2 m_1 \dot{\theta}^2 \sin\theta}{l \left( 4(m_0 + m_1) - 3 m_1 \cos^2\theta \right)}    (2.4)

\ddot{\theta} = \frac{3\left( 2 g l m_0 m_1 \sin\theta + 2 g l m_1^2 \sin\theta - 2 l m_1 u \cos\theta + 2 c_0 l m_1 \dot{x}\cos\theta - 4 c_1 m_0 \dot{\theta} - 4 c_1 m_1 \dot{\theta} - l^2 m_1^2 \dot{\theta}^2 \cos\theta \sin\theta \right)}{l^2 m_1 \left( 4(m_0 + m_1) - 3 m_1 \cos^2\theta \right)}    (2.5)
Those equations can then be rewritten as a system of first order differential equations:

\dot{x} = f(x) + g(x)\, u(t)    (2.6)


The four states are declared as x_1 = x, x_2 = \theta, x_3 = \dot{x}, x_4 = \dot{\theta}, so that x = [x_1 \; x_2 \; x_3 \; x_4]^T. We have therefore:

\dot{x}_1 = x_3
\dot{x}_2 = x_4
\dot{x}_3 = \frac{-3 g l m_1 \cos x_2 \sin x_2 + 4 l u - 4 c_0 l x_3 + 6 c_1 x_4 \cos x_2 + 2 l^2 m_1 x_4^2 \sin x_2}{l \left( 4(m_0 + m_1) - 3 m_1 \cos^2 x_2 \right)}
\dot{x}_4 = \frac{3\left( 2 g l m_0 m_1 \sin x_2 + 2 g l m_1^2 \sin x_2 - 2 l m_1 u \cos x_2 + 2 c_0 l m_1 x_3 \cos x_2 - 4 c_1 m_0 x_4 - 4 c_1 m_1 x_4 - l^2 m_1^2 x_4^2 \cos x_2 \sin x_2 \right)}{l^2 m_1 \left( 4(m_0 + m_1) - 3 m_1 \cos^2 x_2 \right)}
(2.7)
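For reference, a minimal MATLAB sketch of the state equations (2.7) is given below. The function name and its interface are illustrative choices (they do not come from the paper), the sign conventions follow the reconstruction of (2.7) above, and the parameter struct p holds placeholder values rather than the Quanser IP02 data.

    function dx = pendulum_cart_dynamics(~, x, u, p)
    % State equations (2.7). x = [cart position; pendulum angle; cart velocity;
    % angular velocity], u is the force on the cart, p is a struct with fields
    % m0, m1, l, c0, c1, g.
        s = sin(x(2));  c = cos(x(2));
        den = 4*(p.m0 + p.m1) - 3*p.m1*c^2;                  % common denominator of (2.7)
        dx3 = (-3*p.g*p.l*p.m1*c*s + 4*p.l*u - 4*p.c0*p.l*x(3) ...
               + 6*p.c1*c*x(4) + 2*p.l^2*p.m1*s*x(4)^2) / (p.l*den);
        dx4 = 3*(2*p.g*p.l*p.m1*(p.m0 + p.m1)*s - 2*p.l*p.m1*c*u ...
               + 2*p.c0*p.l*p.m1*c*x(3) - 4*p.c1*(p.m0 + p.m1)*x(4) ...
               - p.l^2*p.m1^2*c*s*x(4)^2) / (p.l^2*p.m1*den);
        dx  = [x(3); x(4); dx3; dx4];
    end

The function can be used directly with a standard integrator, for example [t, x] = ode45(@(t,x) pendulum_cart_dynamics(t, x, 0, p), [0 5], [0; pi-0.1; 0; 0]), where p is a parameter struct such as the placeholder one defined in the linearization sketch below.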

The equations are linearized around the unstable equilibrium position x_1 = 0, x_2 = 0, which is then used to obtain a linear feedback control law that renders the system stable at that position:

\dot{x} = A x + B u    (2.8)

with:

A = \frac{\partial \left( f(x) + g(x) u(t) \right)}{\partial x}, \qquad B = \frac{\partial \left( f(x) + g(x) u(t) \right)}{\partial u}    (2.9)

both evaluated at the equilibrium.
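One way to obtain A and B numerically is sketched below; the paper does not state how the Jacobians (2.9) were evaluated, and the parameter values are assumed placeholders, not the Quanser IP02 data. Central differences are applied to the dynamics function from Section 2 at the upright equilibrium x = 0, u = 0.

    % Numerical Jacobians (2.9) at the upright equilibrium by central differences.
    p = struct('m0', 0.5, 'm1', 0.2, 'l', 0.3, 'c0', 0.1, 'c1', 0.005, 'g', 9.81);  % placeholder data
    x_eq = zeros(4,1);  u_eq = 0;  h = 1e-6;
    A = zeros(4,4);
    for i = 1:4
        e = zeros(4,1);  e(i) = h;
        A(:,i) = (pendulum_cart_dynamics(0, x_eq + e, u_eq, p) ...
                - pendulum_cart_dynamics(0, x_eq - e, u_eq, p)) / (2*h);
    end
    B = (pendulum_cart_dynamics(0, x_eq, u_eq + h, p) ...
       - pendulum_cart_dynamics(0, x_eq, u_eq - h, p)) / (2*h);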

3. Control equations
The control of the pendulum is divided into two parts: first, the control effort required to bring it from the downward position to the upward position, obtained from the Euler-Lagrange equations supplemented by Pontryagin's Principle [3], acts on the system; second, when the position of the pendulum is close enough to the upward position, a feedback control law stabilizes the pendulum by means of a linear quadratic regulator.
3.1. Swing up Using Trajectory Optimization
In order to obtain a control action that generates an optimal trajectory in a specified time, the
following finite horizon optimization problem is considered:

\min_{u} \int_0^{t_f} \left( q(x) + u(t)^T u(t) \right) dt + \phi(x(t_f))    (3.1)

subject to

\dot{x} = f(x) + g(x)\, u(t),

where \phi(x(t_f)) is a terminal weight function.


We desire to find a control action u(t) capable of minimizing the cost function given above. We now introduce H, referred to as the Hamiltonian, also called the Pontryagin H function [3]:

H(x(t), u(t), \lambda(t)) = q(x(t)) + u^T(t) u(t) + \lambda^T(t) \left[ f(x(t)) + g(x(t)) u(t) \right]    (3.2)


The control action can be found by using a calculus of variations approach [3] that results in the following necessary conditions, known as the Euler-Lagrange equations:

\dot{x} = f(x(t)) + g(x(t))\, u(t)    (3.3)

\dot{\lambda}(t) = -\left( \frac{\partial H}{\partial x} \right)^T    (3.4)

\frac{\partial H}{\partial u} = 0    (3.5)

subject to the boundary conditions:

x(0) = x_0    (3.6a)

\lambda^T(t_f) = \left. \frac{\partial \phi}{\partial x} \right|_{t_f}    (3.6b)

Since we want to bring the system from an initial condition x(0) to a final condition x(t_f), the following boundary condition is introduced:

\phi(t, x(t)) = \nu^T \psi(x(t))    (3.6c)

with

\psi_i(x(t)) = x_i(t) - x_i(t_f)    (3.6d)

which is accounted for in the terminal weight function by means of an algebraic Lagrange multiplier \nu. The function \lambda(t) is called the costate, and must be solved for simultaneously with the equations of the dynamic system. The control action u^*(t) is given by:

u^*(t) = \arg\min_u H(x^*(t), u(t), \lambda^*(t))    (3.7)
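Because the control appears quadratically in the cost and affinely in the dynamics, condition (3.5) can be made explicit. The short derivation below assumes the Hamiltonian (3.2) and an unconstrained control (no saturation):

\frac{\partial H}{\partial u} = 2 u^T + \lambda^T g(x) = 0 \;\;\Longrightarrow\;\; u^*(t) = -\tfrac{1}{2}\, g\!\left(x^*(t)\right)^T \lambda^*(t)

With a control weight R in the cost, the same step would give u^*(t) = -\tfrac{1}{2} R^{-1} g(x)^T \lambda(t). Substituting u^* back into (3.3) and (3.4) eliminates the control and leaves a boundary value problem in x and \lambda alone.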

The differential equations (3.3) and (3.4), together with the optimality condition (3.5) and the boundary conditions (3.6), constitute a non-linear two-point boundary value problem. In contrast with an initial value problem (IVP), a two-point boundary value problem (TPBVP) might have no solution or multiple solutions [4]. The most common approaches for solving it numerically are the shooting method, the finite difference method and the collocation method. Unfortunately, all of these methods may fail to converge, even if a solution exists. Due to these characteristics, solving a TPBVP is considerably more difficult than solving an IVP.
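For illustration only, a minimal single-shooting skeleton in MATLAB is sketched below; it is not the approach ultimately used in Section 5. The helper handles rhs and term_res (for instance, the augmented right-hand side and terminal residual sketched in Section 5) are assumed to be supplied by the user, and fsolve requires the Optimization Toolbox.

    function lam0 = shoot_tpbvp(rhs, term_res, x0, lam0_guess, tf)
    % Single-shooting skeleton for a TPBVP of this form (illustrative only).
    % rhs(t, y) is the augmented state/costate right-hand side, term_res(y_tf)
    % the vector of terminal conditions, x0 the fixed initial state and
    % lam0_guess an initial guess for the unknown initial costate. fsolve
    % adjusts lambda(0) until the terminal residual vanishes; for unstable
    % dynamics such as this one, convergence is not guaranteed.
        lam0 = fsolve(@(z) residual(rhs, term_res, x0, z, tf), lam0_guess);
    end

    function r = residual(rhs, term_res, x0, lam0, tf)
        [~, y] = ode45(rhs, [0 tf], [x0; lam0]);   % forward integration of states and costates
        r = term_res(y(end, :).');                 % mismatch in the terminal conditions
    end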
Unfortunately, the obtained control action cannot take into account possible disturbances acting on the system or deviations from the model.
3.2 Stabilization of the Pendulum Using Linear Quadratic Regulator
Consider the following linear time-invariant system:

\dot{x} = A x(t) + B u(t)    (3.8)

and the cost function:

J = \frac{1}{2} \int_0^{\infty} \left[ x^T(t) Q x(t) + u^T(t) R u(t) \right] dt    (3.9)

We have to determine the optimal feedback control law u^*(t) = -K^* x(t) that minimizes J. From optimal control theory, the optimal control law that minimizes this functional is given by [3]:

u^*(t) = -R^{-1} B^T P x^*(t)    (3.10)

where P, an n \times n constant, positive definite, symmetric matrix, is the solution of the non-linear matrix algebraic Riccati equation (ARE):

P A + A^T P - P B R^{-1} B^T P + Q = 0    (3.11)


The optimal trajectory is the solution of:

\dot{x}^*(t) = \left[ A - B R^{-1} B^T P \right] x^*(t)    (3.12)

and the optimal cost is given by:

J^* = \frac{1}{2}\, x^{*T}(t) P x^*(t)    (3.13)

The matrices Q and R were chosen heuristically.
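A minimal MATLAB sketch of this stabilizing controller is shown below (it requires the Control System Toolbox). A and B are the linearized matrices from (2.9); the weights are illustrative placeholders, not the heuristic values actually used in the paper.

    % LQR design for the linearized system (2.8).
    Q = diag([10, 100, 1, 1]);    % heavier penalty on cart position and pendulum angle (assumed values)
    R = 1;                        % penalty on the control effort (assumed value)
    [K, P] = lqr(A, B, Q, R);     % P solves the ARE (3.11), K = R^(-1) B' P as in (3.10)
    u_fb = @(x) -K * x;           % full-state feedback law applied near the upright position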


4. Experimental Apparatus
The swing up and the stabilizing control developed in this project are applied to the Quanser short inverted pendulum system with the IP02 cart.
The cart is driven by a 6-Volt DC motor which applies torque to a pinion, moving the cart along a rack mounted on the track and ensuring consistent and continuous traction. A simple pendulum is mounted on a free-spinning shaft subject to the viscous friction c_1. The IP02 is instrumented with two quadrature optical encoders, one for the cart position and one for the pendulum angle. The shaft to which the pendulum is attached allows the pendulum to be suspended in front of the cart, clear of the mechanical stops, which permits additional configurations with unrestricted movement of the pendulum. The cart position (x) and pendulum angle (\theta) are measured by the two quadrature encoders with a resolution of 4096 counts per revolution. The cart position encoder is connected to a DAQ board which communicates with a stand-alone computer performing the control of the cart.
The DAQ board is also responsible for the digital-to-analog conversion of the control value given to the cart DC motor. The voltage has a 6 V limit, which induces an ideal saturation. Disturbances in the system are caused by forces applied to the cart and pendulum and also by the measurement noise created by encoder quantization.

5. Simulation and Results


In order to solve the two-point boundary value problem, the finite difference code bvp4c available in
MATLAB was used.
For the swing-up, the boundary conditions are:

x(0) = [0 \;\; \pi \;\; 0 \;\; 0]^T
\lambda_1(t_f) = \lambda_3(t_f) = \lambda_4(t_f) = 0
x_2(t_f) = 0
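A minimal MATLAB sketch of how such a problem can be set up for bvp4c is given below. It is a reconstruction under stated assumptions, not the authors' code: the function name swingup_bvp_sketch, the choice q(x) = 0 (only the control effort is penalised), the constant initial guess and the finite-difference evaluation of the costate equation are all illustrative, and the naive initial guess may well fail to converge without a continuation strategy.

    function [t_ol, u_ol, x_ol] = swingup_bvp_sketch(p, tf)
    % Swing-up TPBVP of Section 3.1 solved with bvp4c. Augmented unknown
    % y = [x; lambda]; the control follows from the stationarity condition.
        gvec = @(x) pendulum_cart_dynamics(0, x, 1, p) - pendulum_cart_dynamics(0, x, 0, p); % g(x), dynamics affine in u
        uopt = @(x, lam) -0.5 * (gvec(x).' * lam);         % optimal control from dH/du = 0

        solinit = bvpinit(linspace(0, tf, 40), [0; pi; 0; 0; 0.1*ones(4,1)]);
        sol     = bvp4c(@rhs, @bc, solinit);               % may need a better guess / continuation

        t_ol = linspace(0, tf, 200);
        ysol = deval(sol, t_ol);
        x_ol = ysol(1:4, :);
        u_ol = arrayfun(@(k) uopt(ysol(1:4,k), ysol(5:8,k)), 1:numel(t_ol));  % open-loop swing-up input

        function dy = rhs(~, yy)
            xk = yy(1:4);  lam = yy(5:8);  uk = uopt(xk, lam);
            fx = pendulum_cart_dynamics(0, xk, uk, p);
            J = zeros(4);  h = 1e-6;                       % finite-difference Jacobian of the dynamics w.r.t. x
            for k = 1:4
                e = zeros(4,1);  e(k) = h;
                J(:,k) = (pendulum_cart_dynamics(0, xk+e, uk, p) - ...
                          pendulum_cart_dynamics(0, xk-e, uk, p)) / (2*h);
            end
            dy = [fx; -J.' * lam];                         % state and costate equations (3.3)-(3.4)
        end

        function res = bc(ya, yb)
            % x(0) = [0 pi 0 0]', x2(tf) = 0, lambda_{1,3,4}(tf) = 0
            res = [ya(1:4) - [0; pi; 0; 0]; yb(2); yb(5); yb(7); yb(8)];
        end
    end

With the placeholder parameter struct p defined earlier, the call [t_ol, u_ol] = swingup_bvp_sketch(p, 1.5) attempts to reproduce the set-up described above.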

Figure 5.1 shows the numerical solution obtained for u(t) with t_f = 1.5 seconds. Figures 5.2 and 5.3 show the comparison between the simulated and the experimental results. In figure 5.2 we can see that the results obtained were very close to each other. This can be attributed to the fact that the chosen swing-up time was short enough to reduce the effects of uncertainties in the system that could significantly change the final result. It must be noted that those results might not be satisfactorily close for an open-loop controller acting over a larger time span.

Figure 5.1: numerical solution obtained for u(t) with t_f = 1.5 s.

Figure 5.2: comparison between simulated and experimental results.

Figure 5.3: comparison between simulated and experimental results.
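The complete manoeuvre can be reproduced in simulation with the short sketch below. It assumes pendulum_cart_dynamics, the placeholder parameters p, the LQR gain K and the open-loop input returned by swingup_bvp_sketch from the earlier sketches; the 0.3 rad switching threshold is an assumed value, not taken from the paper. The open-loop input performs the swing-up, and once the angle is close to upright the LQR feedback takes over.

    [t_ol, u_ol] = swingup_bvp_sketch(p, 1.5);
    wrap  = @(a) mod(a + pi, 2*pi) - pi;                         % map the angle into (-pi, pi]
    u_sw  = @(t) interp1(t_ol, u_ol, min(max(t, 0), t_ol(end))); % hold the last sample after t_f
    u_ctl = @(t, x) (abs(wrap(x(2))) >  0.3) * u_sw(t) ...
                  + (abs(wrap(x(2))) <= 0.3) * (-K * [x(1); wrap(x(2)); x(3); x(4)]);
    [ts, xs] = ode45(@(t, x) pendulum_cart_dynamics(t, x, u_ctl(t, x), p), [0 4], [0; pi; 0; 0]);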

6. Conclusions
This paper showed that using the Euler-Lagrange equations to obtain a trajectory for a highly non-linear unstable system like the pendulum on a cart can be viable. It offers the advantage of being a direct approach to find an optimal trajectory for a given system, while it has the drawback of being considerably difficult to solve numerically. It should also be noted that the obtained solution is in open-loop form, and thus would require either a very accurate model of the system or a compensator acting over the trajectory.
Bibliography
[1] T. Holzhüter, "Optimal regulator for the inverted pendulum via Euler-Lagrange backward integration," Automatica 40 (2004) 1613-1620.
[2] B. Conway, Spacecraft Trajectory Optimization, Cambridge University Press, 2010.
[3] D. S. Naidu, Optimal Control Systems, CRC Press, 2003.
[4] U. M. Ascher and L. R. Petzold, Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations, SIAM, Philadelphia, 1998.
