
Direct Adaptive Control, Direct Optimization

F Pait and Rodrigo Romano


December 2015

Steve Morse Workshop in Osaka

How do we design a direct adaptive controller? First we need to decide on the underlying control design methodology. The adjustable control parameters, the shape of the error equation, the ever-present adaptive observer, all follow from this initial choice. This contrasts with indirect adaptive control systems. Indirect controllers typically comprise a parameterized observer which generates an identification error; a certainty-equivalence feedback regulator; and a tuner or adaptive law. These components can be designed in a modular fashion, more or less independently, provided each possesses some properties which are indeed satisfied by typical control and estimation algorithms. Not so with direct adaptive control!

Reference models are just one possibility in indirect adaptive control, and they are used sparingly, if at all, outside adaptive control. In this paper we explore the feasibility of another design technique: linear-quadratic optimal control. Design using a quadratic objective is perhaps the most transparent and best understood paradigm that can be applied to detectable, stabilizable linear dynamical systems in general.
We wish to control a plant with measured output y ∈ ℝ^{n_y} and control input u ∈ ℝ^{n_u} using the n_O-dimensional direct adaptive controller

    ẋ = A_O x + B_O u + D_O y    (1)
    u = K(t) x,    (2)

in which A_O, B_O, and D_O are fixed, and K is a tunable feedback parameter. The Morse observer is constructed, as fully spelled out in¹, together with the state and output estimation equations

    x_D = E_O(σ) x    (3)
    ŷ = C_O(σ) x + G_O(σ) y.    (4)

The state vector x plays the role of the so-called regressor used in parameter estimation and system identification².

If the plant is SISO (n_y = n_u = 1) and has dimension n, one could pick n_O = 2n, choose a controllable pair (A ∈ ℝ^{n×n}, b ∈ ℝ^{n×1}) with A stable, and build the observer with

    A_O = [ A  0 ]      B_O = [ 0 ]      D_O = [ b ]
          [ 0  A ],           [ b ],           [ 0 ],

C_O an n_O-dimensional row vector of parameters, and G_O = 0.
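A minimal numerical sketch of this margin construction, with an illustrative stable, controllable pair (A, b) of my own choosing; the assignment of b to the input and output blocks follows the reconstructed layout above, and either ordering gives a structurally identical observer:

```python
import numpy as np

# Hypothetical 2nd-order SISO example: a stable, controllable pair (A, b).
n = 2
A = np.array([[-1.0, 1.0],
              [0.0, -2.0]])
b = np.array([[0.0],
              [1.0]])

# Observer matrices per the margin note: n_O = 2n, A_O block-diagonal.
AO = np.block([[A, np.zeros((n, n))],
               [np.zeros((n, n)), A]])
BO = np.vstack([np.zeros((n, 1)), b])   # input channel drives the lower block
DO = np.vstack([b, np.zeros((n, 1))])   # output channel drives the upper block

# A_O inherits stability from A: all eigenvalues in the open left half-plane.
assert AO.shape == (2 * n, 2 * n)
assert np.all(np.linalg.eigvals(AO).real < 0)
```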
¹ A. S. Morse and F. M. Pait. MIMO design models and internal regulators for cyclicly switched parameter adaptive control systems. IEEE Transactions on Automatic Control, 39(9):1809–1818, 1994.

² R. A. Romano and F. Pait. Matchable-observable linear models and direct filter tuning: An approach to multivariable identification. Submitted to the IEEE Transactions on Automatic Control, 2015.

To construct Steve's observer (1) we pick a design model

    ẋ_D = (A_D + D(σ)(I − G(σ))⁻¹ C_D) x_D + B_D(σ) u_D    (5)
    y_D = (I − G(σ))⁻¹ C_D x_D,

where σ are model parameters, I is an identity matrix, and (C_D, A_D) is a stable, observable, parameter-independent pair. The parameter matrix G(σ) ∈ ℝ^{n_y×n_y} is strictly lower triangular, and the parameter matrices D(σ) and B_D(σ) take values in ℝ^{n_x×n_y} and ℝ^{n_x×n_u}.

The overwhelming majority of the direct adaptive control literature uses reference models as the design paradigm. This is because the control error between a plant's output and that of a suitably defined reference model can be expressed in a convenient form in which the control parameters appear linearly, provided, of course, that a number of restrictive hypotheses are satisfied. Another class of direct controllers are non-identifier-based universal controllers.


Matchability. Let C_P(ℓ_O) be the class of stabilizable, detectable linear systems with coefficient matrix triple (C_P, A_P, B_P) of dimensions n_y × n_P, n_P × n_P, and n_P × n_u respectively, and whose list of observability indices is ℓ_O = {n_1, n_2, …, n_{n_y}}. Then there exists a value of the parameter σ such that the transfer function

    H_σ(s) = (I − G(σ))⁻¹ C_D (sI − A_D − D(σ)(I − G(σ))⁻¹ C_D)⁻¹ B_D(σ)

of (5) matches the transfer function C_P(sI − A_P)⁻¹ B_P.

The design model (5) is linked to (1) by the equations:

    E_O(σ) A_O = A_D E_O(σ)
    E_O(σ) B_O = B_D(σ)
    E_O(σ) D_O = D_D(σ)
    C_O(σ) = C_D E_O(σ)
    G_O(σ) = G(σ).

Existence of E_O(σ) is fundamental for the argument ahead, but its construction will not be needed.

An indirect adaptive control scheme might be designed as follows:
- Use the expression ŷ_σ = C_O(σ)x + G_O(σ)y to define an identification error ŷ_σ − y for each value of σ;

The number n_1 + n_2 + ⋯ + n_{n_y} is the McMillan degree of the transfer functions of the systems in the class.
If the value of σ is such that (5) is stabilizable, there exists a state feedback matrix K_D(σ) that stabilizes the design model. It can be shown that the observer state feedback u = K_D(σ)x_D renders the signals y and u bounded, and stability of A_O guarantees that x is bounded as well. Therefore there exists a matrix K(σ) = K_D(σ)E_O(σ) such that A(σ) = A_O + B_O K(σ) is Hurwitz. Relabeling B_O = B, we now work with the system

    ẋ = A(σ)x + Bu.    (6)

In this construction B is fixed, but A depends on the parameters σ, which in adaptive control are unknown.

Theorem 1. The pair (A(σ), B) as constructed above is stabilizable.
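For any concrete pair, the property asserted by Theorem 1 can be checked numerically with the standard PBH rank test; the matrices below are stand-ins of my own choosing, not from the paper:

```python
import numpy as np

# PBH test: (A, B) is stabilizable iff [lam*I - A, B] has full row rank
# for every eigenvalue lam of A with nonnegative real part.
def is_stabilizable(A, B, tol=1e-9):
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= -tol:  # only modes not in the open left half-plane matter
            M = np.hstack([lam * np.eye(n) - A, B])
            if np.linalg.matrix_rank(M, tol) < n:
                return False
    return True

A = np.array([[1.0, 0.0], [0.0, -1.0]])   # one unstable mode
B = np.array([[1.0], [0.0]])              # actuates the unstable mode
assert is_stabilizable(A, B)

B_bad = np.array([[0.0], [1.0]])          # cannot reach the unstable mode
assert not is_stabilizable(A, B_bad)
```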
To obtain a quadratic error equation, choose constant, symmetric, positive-definite matrices Q and R so that one can compute the quantity

    z² = yᵀQy + uᵀRu    (7)

- Taking advantage of linearity of the error in σ, obtain the estimate σ̂ at each instant t using a recursive gradient-type or least-squares method;
- At each t plug the estimate σ̂ into the model, design a feedback control K_D(σ̂), and set the observer-based state feedback control signal u(t) = K_D(σ̂)E_O(σ̂)x according to the certainty-equivalence principle.
We are not at the moment interested in pursuing parameter estimation or indirect adaptive control, so we shall directly employ a controller given by (2), without the intermediate steps of obtaining parameter estimates and computing E_O(σ) or K_D(σ). The reason is that the parameter space may contain values of σ for which (5) loses stabilizability, and the certainty-equivalence feedback control laws exhibit singularities. Moreover, the topologies of the spaces where the parameters σ and the control parameters contained in K live are incompatible; one is neither stronger nor weaker than the other, and this leads to well-known complications which we wish to avoid via the use of direct adaptive control.

using only process input and output data. We write the quantity on the left-hand side as a magnitude squared to emphasize that it takes positive values only. The algebraic Riccati equation

    AᵀP + PA − PBR⁻¹BᵀP + C_Oᵀ(I − G)⁻ᵀ Q (I − G)⁻¹ C_O = 0,

which also appears³ when trying to avoid loss of stabilizability in indirect control, has a unique positive-definite solution P due to stabilizability of (A, B), so along the trajectories of (6)

³ D. Colon and F. M. Pait. Geometry of adaptive control: optimization and geodesics. International Journal of Adaptive Control and Signal Processing, 18(4):381–392, 2004.

    d/dt (xᵀPx) = xᵀAᵀPx + xᵀPAx + uᵀBᵀPx + xᵀPBu
        = −xᵀC_Oᵀ(I − G)⁻ᵀ Q (I − G)⁻¹ C_O x + xᵀPBR⁻¹BᵀPx + xᵀKᵀBᵀPx + xᵀPBKx
        = xᵀ(K + R⁻¹BᵀP)ᵀ R (K + R⁻¹BᵀP) x − yᵀQy − uᵀRu.
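As a numerical sketch of this step (with illustrative matrices, and SciPy assumed available; C below stands in for the fixed output map (I − G)⁻¹C_O at one parameter value), the Riccati equation can be solved and the stabilizing gain K* = −R⁻¹BᵀP extracted:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Toy stand-in for (6): a stabilizable pair (A, B) and an output map C.
# Concrete numbers are illustrative, not from the paper.
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = np.eye(1)   # output weight
R = np.eye(1)   # control weight

# Solve A'P + PA - P B R^{-1} B' P + C'QC = 0 for the unique P > 0.
P = solve_continuous_are(A, B, C.T @ Q @ C, R)

# The (in the adaptive setting, unknown) optimal gain K* = -R^{-1} B' P.
Kstar = -np.linalg.solve(R, B.T @ P)

# A + B K* is Hurwitz, and P is symmetric positive definite.
assert np.all(np.linalg.eigvals(A + B @ Kstar).real < 0)
assert np.all(np.linalg.eigvals((P + P.T) / 2) > 0)
```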


Integrating on the interval [t_i − T, t_i] gives the identity

    ∫_{t_i−T}^{t_i} z² dt = ∫_{t_i−T}^{t_i} xᵀ(K − K*)ᵀR(K − K*)x dt + x(t_i−T)ᵀPx(t_i−T) − x(t_i)ᵀPx(t_i),    (8)

where the stabilizing feedback gain matrix K* = −R⁻¹BᵀP is unknown, as is the matrix P. We now assume that there exists a known matrix P̄ such that P̄ − P is positive definite, and write

    z_K(t_i) = x(t_i)ᵀP̄x(t_i) + ∫_{t_i−T}^{t_i} z² dt.

The term in K looks almost like the error in a traditional parameter estimation problem. It would be amenable to treatment by a least-squares algorithm, except that in the present case the error is known in magnitude only⁴. The need arises for tuners with capabilities comparable to those of traditional least-squares or gradient-type algorithms, which can function with information on the magnitude of the error only.
Direct tuning for direct control. We consider that the feedback control K is linearly parametrized by a vector of parameters θ. Our task then is to tune θ in a manner that keeps

    f(θ, t_i) = z_{K(θ)}(t_i) / (e + x(t_i−T)ᵀx(t_i−T))

small.

We are ignoring all the noise and initial condition terms in the formulas as written here.
1. Choose a sequence of instants t_i. We shall fix t_0 = 0 and use t_{i+1} = t_i + T; however, different durations of the intervals t_{i+1} − t_i might be algorithmically advantageous.
2. During each interval [t_i, t_{i+1}) apply a constant feedback K(θ) to obtain the control cost f(θ, t_i). The controlled process plays the role of a 0th-order oracle: when queried, it supplies the value of f, with no hint on how to use it.

⁴ F. M. Pait. On the design of direct adaptive controllers. In Proceedings of the 40th IEEE Conference on Decision and Control, pages 734–738, 2001.

When an explicit formula for a function F(x) is available, a garden-variety finite difference approximation for its derivative uses the 1st term of the Taylor series:

    ∂F/∂x ≈ (F(x+h) − F(x)) / h.

A better-behaved approximation which doesn't need the difference term uses a complex step:

    ∂F/∂x ≈ (Im[F(x+ih)] − Im[F(x)]) / Im[ih] = Im[F(x+ih)] / h.

We shall not have the opportunity to use this approximation here because there is no experimental manner to compute f(θ, t_i) for complex arguments, but I find the formula neat and wrote it down for inspiration.
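The two formulas in this margin note are easy to compare numerically; a minimal sketch, with F(x) = eˣ sin x as my own illustrative choice:

```python
import cmath
import math

# F extended to complex arguments so the complex-step formula applies.
def F(x):
    return cmath.exp(x) * cmath.sin(x)

x0 = 1.0
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))  # d/dx e^x sin x

# Garden-variety forward difference: h cannot be taken too small,
# because the subtraction cancels significant digits.
h_fd = 1e-8
fd = (F(x0 + h_fd).real - F(x0).real) / h_fd

# Complex step: no subtraction, so h can be made absurdly small.
h_cs = 1e-200
cs = F(complex(x0, h_cs)).imag / h_cs

assert abs(cs - exact) < 1e-12   # machine-precision accurate
assert abs(fd - exact) < 1e-5    # finite difference is noticeably worse
```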

3. At each instant t_i use the barycenter formula to compute θ̄(t_i).
4. Pick θ(t_i) as a random variable with mean θ̄(t_i) and, during the interval [t_i, t_{i+1}), use the feedback K(θ(t_i)).
The barycenter method can be used to tune the parameters θ:

    m_i = m_{i−1} + e^{−λ f(θ_i, t_i)}    (10)
    θ̄_i = (1/m_i) (m_{i−1} θ̄_{i−1} + e^{−λ f(θ_i, t_i)} θ_i)    (11)
    θ_i = θ̄_{i−1} + ζ_i    (12)

Here m_0 = 0, θ̄_0 = 0, θ_i is the sequence of test values of the controller parameters, and λ is a positive real constant. The rationale behind the method is that points where f is large receive low weight in comparison with those for which f is small.
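A minimal sketch of the loop (10)–(12) on a stand-in cost; the quadratic f, the constant λ = 1, and the Gaussian curiosity with σ = 0.5 are illustrative choices of mine, not values from the paper:

```python
import math
import random

random.seed(0)

# Stand-in cost whose minimum, at theta = 2, the tuner should approach.
def f(theta):
    return (theta - 2.0) ** 2

lambda_ = 1.0          # weight constant
sigma = 0.5            # curiosity spread
m, bary = 0.0, 0.0     # m_0 = 0, theta_bar_0 = 0

for i in range(1500):
    zeta = random.gauss(0.0, sigma)      # curiosity
    theta = bary + zeta                  # test point, (12)
    w = math.exp(-lambda_ * f(theta))    # weight of this test point
    # recursive barycenter update, (10)-(11)
    m, bary = m + w, (m * bary + w * theta) / (m + w)

# Points with small cost dominate the average, so bary drifts toward 2.
assert abs(bary - 2.0) < 0.5
```

Test points far from the minimum receive weight e^{−λf} close to zero, so they barely move the barycenter; this is the rationale stated above, made concrete.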

The update rules (10) for m_i and (11) for θ̄_i constitute a recursive version of the expression

    θ̄_i = ( Σ_{j=1}^{i} θ_j e^{−λ f(θ_j, t_j)} ) / ( Σ_{j=1}^{i} e^{−λ f(θ_j, t_j)} )    (9)

for the barycenter, or center of mass, of a distribution of weights e^{−λ f(θ_i, t_i)} placed at points θ_i, which is the point θ ∈ ℝⁿ that minimizes the sum of weighted distances

    Σ_{j=1}^{i} (θ − θ_j)² e^{−λ f(θ_j, t_j)}.
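Since the recursion and the batch expression should agree exactly, a quick check with made-up test points and costs of my own choosing:

```python
import math

lambda_ = 0.7
thetas = [0.3, 1.1, -0.4, 2.2, 0.9]      # arbitrary test points
costs  = [2.0, 0.5,  3.1, 0.1, 0.8]      # arbitrary values of f(theta_i, t_i)

# Recursive form, (10)-(11).
m, bary = 0.0, 0.0
for th, c in zip(thetas, costs):
    w = math.exp(-lambda_ * c)
    m, bary = m + w, (m * bary + w * th) / (m + w)

# Batch form, (9): weighted center of mass of all test points.
weights = [math.exp(-lambda_ * c) for c in costs]
batch = sum(w * th for w, th in zip(weights, thetas)) / sum(weights)

assert abs(bary - batch) < 1e-12
```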


In system identification, this method has been employed successfully for filter tuning⁵. It will prove useful to consider the sequence of test points θ_i as defined by the sum of the barycenter θ̄_{i−1} of the previous points and a curiosity ζ_i, as spelled out in (12). Then (11) reads

    θ̄_i − θ̄_{i−1} = ( e^{−λ f(θ_i, t_i)} / (m_{i−1} + e^{−λ f(θ_i, t_i)}) ) (θ_i − θ̄_{i−1}) = F_i(ζ_i) ζ_i.

⁵ R. A. Romano and F. Pait. Direct filter tuning and optimization in multivariable identification. In Proceedings of the 53rd IEEE Conference on Decision and Control, pages 1798–1803, Los Angeles, USA, Dec 2014.

Assume that f(θ) is continuously differentiable with respect to the argument θ, and write

    F̃(ζ) = m_{i−1} e^{λ f(θ̄_{i−1}+ζ, t_i)} / (m_{i−1} e^{λ f(θ̄_{i−1}+ζ, t_i)} + 1)²,

so that ∂F/∂ζ = −λ F̃ ∂f/∂ζ (here and in the computations that follow the subscript indicating dependence on the interval i is omitted if there is no ambiguity).

A randomized version of the barycenter algorithm has the required gradient-like properties in the case where ζ_i has a Gaussian (normal) distribution. A continuous-time version of the barycenter algorithm was analyzed in⁶.

Claim 2. The expected value of Δθ̄_i = θ̄_i − θ̄_{i−1} with respect to the random variable ζ is proportional to the average value of the gradient of f(θ̄_{i−1} + ζ_i, t_i) in the support of the distribution of ζ_i.

Consider the probability density function

    p(ζ) = (1/√((2π)ⁿ det Σ)) e^{−½ ζᵀΣ⁻¹ζ}.

Then ∂p/∂ζ^μ = −Σ⁻¹_{μν} ζ^ν p, so ζ^ν p = −Σ^{νμ} ∂p/∂ζ^μ. Einstein's convention of implicit summation over components with equal upper and lower indices is in force, with upper Greek indices for the components of ζ and of θ̄, and lower indices for the components of Σ's inverse. With X the n-dimensional set where the curiosity ζ takes its values, for each component ζ^ν of the vector ζ we have

    E[F(ζ)ζ^ν] = ∫_X F(ζ) ζ^ν p(ζ) dζ = −Σ^{νμ} ∫_X F(ζ) (∂p/∂ζ^μ) dζ.

Using the integration-by-parts formula,

    ∫_X F(ζ) (∂p/∂ζ^μ) dζ + ∫_X (∂F/∂ζ^μ) p(ζ) dζ = ∮_{∂X} F(ζ)p(ζ) dς,

where dς is an (n−1)-form which can be integrated at the boundary ∂X of the n-dimensional set X. The right-hand side is zero because F is bounded and p(ζ) vanishes at the border of X, which is at infinity, hence

    E[F(ζ)ζ^ν] = Σ^{νμ} ∫_X (∂F/∂ζ^μ) p(ζ) dζ = Σ^{νμ} E[∂F/∂ζ^μ],

and therefore

    E[Δθ̄_i] = Σ E[∇_ζ F_i(ζ)] = −λ Σ E[F̃_i(ζ) ∇f(θ̄_{i−1} + ζ, t_i)].    (13)

If we consider θ̄_i to be our best guess, on the basis of the information provided by the test point θ_i, of where the minimum of f(θ) might be found, then in the absence of any extra knowledge it makes sense to pick the curiosity ζ_i as a random variable with some judiciously chosen probability distribution.

The mean of the curiosity can be employed to incorporate extra knowledge in several manners. For example, if at each step we take it equal to the previous increment Δθ̄_{i−1}, then the gradient term is responsible for the rate of change, or acceleration, of the tuning process. Here the author cannot resist citing his seldom-read paper discussing tuners that set the 2nd derivative, or difference, of the adjusted parameters, rather than the 1st derivative, as is more often done in the literature, as well as an application to filtering theory: Felipe M. Pait. A tuner that accelerates parameters. Systems & Control Letters, 35(1):65–68, August 1998; and M. Gerken, F. M. Pait, and P. E. Jojoa. An adaptive filtering algorithm with parameter acceleration. In IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 17–20, 2000.

⁶ Felipe Pait. Reading Wiener in Rio. In IEEE Conference on Norbert Wiener in the 21st Century, pages 1–4. IEEE, 2014.

Formula (13) is the main result concerning the barycenter method. It shows that, roughly speaking, the search performed by the barycenter algorithm follows the direction of the negative average gradient of the function to be minimized, the weighted average being taken over the domain where the search is performed.
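The Gaussian identity E[F(ζ)ζ] = Σ E[∇F(ζ)] that drives the derivation of (13) can be sanity-checked numerically in one dimension (scalar Σ = σ²); the bounded F and the value of σ below are my own illustrative choices:

```python
import math

sigma = 0.7

def F(z):
    return math.tanh(z)          # smooth, bounded, as the argument requires

def dF(z):
    return 1.0 / math.cosh(z) ** 2

def gauss_expect(g, n=4001, span=10.0):
    # Trapezoid-rule expectation against the N(0, sigma^2) density;
    # the integrand decays fast, so truncating at +-span*sigma is harmless.
    h = 2 * span * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        z = -span * sigma + k * h
        p = math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        wt = 0.5 if k in (0, n - 1) else 1.0
        total += wt * g(z) * p * h
    return total

lhs = gauss_expect(lambda z: F(z) * z)       # E[F(zeta) zeta]
rhs = sigma ** 2 * gauss_expect(dF)          # sigma^2 E[dF/dzeta]

assert abs(lhs - rhs) < 1e-9
```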
