
Seminar on Intelligent Methods in Mechatronics

Real Time Estimation of Driving Resistance Under Lack of Excitation
Echtzeitfähige Fahrwiderstandsschätzung bei schlechter Anregung

Jieqing Shi
November 4, 2015

Examiner: Univ.-Prof. Dr.-Ing. Christian Endisch
Supervisor: Dipl.-Ing. Simon Altmannshofer

1 Introduction
With recent advancements in vehicle automation, advanced driver assistance systems have emerged as an important tool to facilitate energy-efficient driving. For these driver assistance systems to function, an accurate model of the vehicle's longitudinal dynamics is needed, for which vehicle parameters such as the mass and the driving resistances are required. In general, a straightforward approach is to use sensors to measure these parameters. This, however, is not always applicable, or it can be costly [1]. Furthermore, vehicle parameters such as mass and driving resistances can vary depending on the load, the attachment of trailers, road conditions, etc. In this respect, an efficient alternative to a sensor-based approach is to estimate these parameters adaptively using a model-based approach [22]. For this, methods from data fusion can be applied to available vehicle data retrieved from existing sensors and, using a mathematical model of the vehicle dynamics, the unknown parameters can then be reconstructed. The estimation algorithm is developed as a new software function based on various rapid prototyping development tools and can be continuously validated on existing vehicle data. To guarantee reliable system performance, the estimation algorithm has to be robust and accurate.
Many estimators have been proposed in the literature, amongst which the recursive least squares (RLS) estimator is one of the most popular algorithms. However, in situations where the system excitation is poor (e.g. on highways where the vehicle travels at nearly constant speed), a problem called estimator windup can occur in the RLS estimator. As a result, the estimator is unable to produce accurate estimates of the unknown parameters, which can severely inhibit the system's performance. This is why the RLS algorithm is not well suited for the estimation of vehicle parameters.
The objective of this work is to find modifications of and alternatives to the RLS estimator which show better performance regarding the estimation of vehicle parameters during periods of poor excitation. More specifically, the following tasks are targeted in this work:
- Analysis of estimator windup including its manifestation and consequences
- Research of alternative/modified estimators which target the problem of estimator windup
- Selection of suitable estimator candidates for the estimation of parameters in the vehicle dynamics model
- Implementation of the estimators in MATLAB/Simulink
- Validation and evaluation of the algorithms based on data obtained from various test drives
The remainder of this work is organized as follows: Chapter 2 discusses the basics of real-time parameter identification. The algorithm of the RLS estimator is introduced and the problem of estimator windup is explained. Moreover, the mathematical model of a vehicle's longitudinal dynamics based on [1] is described. Chapter 3 introduces several estimation techniques which can be classified by their defining principles. The estimators' algorithms as well as their key properties are stated and discussed. Subsequently, a computational study is performed. The introduced estimators are implemented in MATLAB/Simulink and, based on the data obtained from test drives, the performance of the algorithms is evaluated regarding the quality of the parameter estimates of the vehicle's longitudinal dynamics model. The results of the simulation as well as key observations are described in chapter 4. Finally, this work is concluded with a summary and discussion in chapter 5.

2 Real time parameter identification


2.1 Basics of parameter identification
In many adaptive systems the direct measurement of the unknown parameters is not possible because the application of extra sensors is either impractical or not viable [1]. As a reasonable alternative, the unknown parameters can be estimated using available data. One of the most popular estimation schemes is the so-called recursive least squares method, which is described in the following.

2.1.1 Recursive least squares estimation


The least squares method for parameter estimation is based on a linear mathematical model that can be formulated in the so-called regressor form [2]:

y(k) = \varphi_1(k)\theta_1 + \varphi_2(k)\theta_2 + \dots + \varphi_n(k)\theta_n + e(k) \quad (2.1)

where y(k) denotes the output/observed variable, \varphi^T(k) = [\varphi_1(k) \;\; \varphi_2(k) \;\; \dots \;\; \varphi_n(k)] denotes the vector of regressors and \theta = [\theta_1 \;\; \dots \;\; \theta_n]^T represents the vector of n parameters to be estimated. Denoting \hat{y}(k) as the output of the estimation model of the above process and \hat{\theta} as the estimate of the unknown parameters \theta, the error between the predicted output \hat{y}(k) and the actual output y(k) can be formulated as

e(k) = y(k) - \hat{y}(k) = y(k) - \varphi^T(k)\hat{\theta}
The objective of the least squares estimation is to choose the parameters \hat{\theta} such that the estimated output \hat{y}(k) follows the real output y(k) as closely as possible. This can be expressed as estimating the parameters such that a loss function, which describes the deviation between measured and predicted output, is minimized. Introducing the following matrix notations [2],

Y(k) = [y(1) \;\; y(2) \;\; \dots \;\; y(k)]^T

\Phi(k) = \begin{bmatrix} \varphi^T(1) \\ \varphi^T(2) \\ \vdots \\ \varphi^T(k) \end{bmatrix}

E(k) = [e(1) \;\; e(2) \;\; \dots \;\; e(k)]^T = Y(k) - \Phi(k)\hat{\theta}

with Y \in \mathbb{R}^{N\times 1}, E \in \mathbb{R}^{N\times 1} and \Phi \in \mathbb{R}^{N\times n} (N denoting the number of observations), the least-squares loss function can be formulated as

V(\hat{\theta}, k) = \frac{1}{2}\sum_{i=1}^{k}\big(y(i) - \varphi^T(i)\hat{\theta}\big)^2 = \frac{1}{2}\big(Y(k) - \Phi(k)\hat{\theta}\big)^T\big(Y(k) - \Phi(k)\hat{\theta}\big) = \frac{1}{2}E^T(k)E(k) \quad (2.2)

Choosing the \hat{\theta} that minimizes the loss function, i.e.

\hat{\theta} = \arg\min_{\theta}\, V(\theta, k)

thus describes an unconstrained optimization problem with a quadratic loss function. Additionally taking into account that y is linear in the parameters, an analytic solution can be obtained simply by computing the gradient \frac{\partial V}{\partial \theta} and setting it to 0. This leads to the following unique solution for \hat{\theta} if the matrix \Phi^T\Phi is regular [2]:

\Phi^T\Phi\,\hat{\theta} = \Phi^T Y \;\Rightarrow\; \hat{\theta} = \big(\Phi^T\Phi\big)^{-1}\Phi^T Y \quad (2.3)

The matrix \Phi^T\Phi is often denoted as the information matrix R; its inverse \big(\Phi^T\Phi\big)^{-1} is called the covariance matrix P [2]:

P(k) = \big(\Phi^T(k)\Phi(k)\big)^{-1} = \left(\sum_{i=1}^{k}\varphi(i)\varphi^T(i)\right)^{-1} \quad (2.4)
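As a brief numerical illustration of the batch solution (2.3) and the covariance matrix (2.4), consider the following MATLAB sketch; the data here is synthetic and the variable names are chosen for illustration only:

```matlab
% Batch least squares via the normal equations (2.3).
rng(0);                                   % reproducible synthetic data
N = 200; n = 3;                           % number of observations / parameters
theta_true = [2; -1; 0.5];                % example "true" parameter vector
Phi = randn(N, n);                        % regressor matrix, one row per sample
Y = Phi * theta_true + 0.1 * randn(N, 1); % noisy observations
theta_hat = (Phi' * Phi) \ (Phi' * Y);    % solve Phi'*Phi*theta = Phi'*Y
P = inv(Phi' * Phi);                      % covariance matrix (2.4)
```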

A geometric interpretation of the least squares estimate can be obtained when considering a two-dimensional case where two parameters \theta_1 and \theta_2 are estimated (see fig. 2.1). With the regression variables spanning a subspace in which the predicted output \hat{Y} lies, the least squares estimation can be seen as finding the parameters such that the distance between the real output Y and its best approximation \hat{Y} is minimal. The minimal distance is only achieved when [2]

E = Y - \hat{Y} \perp \operatorname{span}(\varphi_1, \varphi_2, \dots, \varphi_n)
In order to enable online parameter estimation, the least-squares algorithm can also be formulated in a recursive manner, i.e. the results obtained until time instance k-1 are used to determine the estimates at the current time instance k. For this purpose, it is assumed that the information matrix R(k) = \Phi^T(k)\Phi(k) is regular for all k. Using the fact that R(k) can be decomposed as

R(k) = P^{-1}(k) = \Phi^T(k)\Phi(k) = \sum_{i=1}^{k-1}\varphi(i)\varphi^T(i) + \varphi(k)\varphi^T(k) = P^{-1}(k-1) + \varphi(k)\varphi^T(k) \quad (2.5)


Figure 2.1 Geometric interpretation of the least squares estimate [2]



the least squares estimate (2.3) becomes

\hat{\theta}(k) = P(k)\Phi^T(k)Y(k) = P(k)\sum_{i=1}^{k}\varphi(i)y(i) = P(k)\left(\sum_{i=1}^{k-1}\varphi(i)y(i) + \varphi(k)y(k)\right) \quad (2.6)

where

\sum_{i=1}^{k-1}\varphi(i)y(i) = P^{-1}(k-1)\,\hat{\theta}(k-1)

From (2.5) it can be deduced that P^{-1}(k-1) = P^{-1}(k) - \varphi(k)\varphi^T(k). Plugging this expression into (2.6) yields the following formula for \hat{\theta}(k) [2]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k)

K(k) = P(k)\varphi(k)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1)
After some algebraic reformulations using the matrix inversion lemma (A + BCD)^{-1} = A^{-1} - A^{-1}B\big(C^{-1} + DA^{-1}B\big)^{-1}DA^{-1} on P(k) = \big(P^{-1}(k-1) + \varphi(k)\varphi^T(k)\big)^{-1}, one obtains the recursive least squares (RLS) algorithm [2]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k) \quad (2.7)

K(k) = P(k-1)\varphi(k)\big(1 + \varphi^T(k)P(k-1)\varphi(k)\big)^{-1} \quad (2.8)

P(k) = \big(I - K(k)\varphi^T(k)\big)P(k-1) \quad (2.9)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1) \quad (2.10)

with K(k) = P(k)\varphi(k) denoting a correction factor. Interpreting \varepsilon(k) as the error which occurs by predicting y(k) one step ahead based on \hat{\theta}(k-1), the estimate \hat{\theta}(k) at time k is derived by adding a correction term K(k)\varepsilon(k) to the previous estimate \hat{\theta}(k-1). The correction term is proportional to the difference between the measured output and the predicted output based on the previous estimate [2].
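Expressed as code, one update step of (2.7)-(2.10) amounts to only a few lines. The following MATLAB sketch is illustrative (function and variable names are not from this work):

```matlab
function [theta, P] = rls_step(theta, P, phi, y)
% One update of the standard RLS algorithm (2.7)-(2.10).
% theta: estimate (n x 1), P: covariance (n x n),
% phi: regressor (n x 1), y: measured output (scalar).
K = P * phi / (1 + phi' * P * phi);       % correction gain (2.8)
eps_k = y - phi' * theta;                 % prediction error (2.10)
theta = theta + K * eps_k;                % parameter update (2.7)
P = (eye(length(theta)) - K * phi') * P;  % covariance update (2.9)
end
```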

In order to initialize the RLS algorithm, initial values for \hat{\theta}(0) and P(0) must be known. These initial values can be obtained in a variety of ways. For instance, if a priori knowledge is available regarding the parameters and their covariances, these values can be used to instantiate the initial estimates. Furthermore, it is possible to start with an off-line estimation to obtain the initial estimates of \hat{\theta}(0) and P(0) or to simply assume appropriate initial values [14].
An interesting observation is the similarity between the RLS and the standard Kalman filter
recursive algorithm. The Kalman filter algorithm is usually associated with a random walk
parameter variation model and a linear regression model that can be described by [8], [20]:
\theta(k) = \theta(k-1) + w(k) \quad (2.11)

y(k) = \varphi^T(k)\theta(k) + v(k) \quad (2.12)

with w(k) and v(k) denoting the sequences of random vectors which are responsible for the parameter changes and the measurement noise, respectively. Generally, it is assumed that w(k), v(k) are Gaussian processes with zero mean and the variances E\big[w(k)w^T(k)\big] = Q(k) and E\big[v(k)^2\big] = r(k). Applying the standard Kalman filter to said model yields [20]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k)

K(k) = \frac{P(k-1)\varphi(k)}{r(k) + \varphi^T(k)P(k-1)\varphi(k)} \quad (2.13)

P(k) = \big(I - K(k)\varphi^T(k)\big)P(k-1) + Q(k) \quad (2.14)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1)

The similarity between the standard RLS and the Kalman filter estimator becomes apparent when comparing the algorithm equations. In fact, it can be shown that the RLS estimator is a special case of the Kalman filter if specific assumptions about Q(k) and r(k) are made [8].

2.1.2 Exponential forgetting


In many real-world systems, the parameters of the system do not remain constant but can vary with time. In that case, the proposed RLS algorithm is not suitable to estimate these time-varying parameters. The reason is as follows: since the exponential convergence of the RLS algorithm (as proven in various studies such as [13], [14]) implies that the covariance matrix converges to 0 with increasing time horizon k, it follows from (2.7) that \hat{\theta}(k) = \hat{\theta}(k-1) as the correction gain K(k) vanishes [2]. Therefore, if the parameters are time-varying, the standard RLS algorithm is not able to track parameter changes. In this case, one intuitive approach is to modify the algorithm such that older data is continuously discarded, i.e. is assigned less weight, while newer incoming data is considered with higher weight. This is achieved by modifying the least-squares loss function in (2.2) [2]:

V(\hat{\theta}, k) = \frac{1}{2}\sum_{i=1}^{k} \lambda^{k-i}\big(y(i) - \varphi^T(i)\hat{\theta}\big)^2 \quad (2.15)

The constant 0 \le \lambda \le 1 is called the forgetting factor. Evidently, the modified loss function assigns exponentially less weight to data that is far away from the current time instance k, while new incoming data is considered with more weight. This way, parameter estimation using a forgetting factor can simply be interpreted as averaging the data over a certain number of data points, with the forgetting factor setting the memory length of the algorithm [3], [20]. Repeating the calculations of the previous subsection for the modified loss function leads to the RLS algorithm with exponential forgetting [2]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k) \quad (2.16)

K(k) = P(k-1)\varphi(k)\big(\lambda + \varphi^T(k)P(k-1)\varphi(k)\big)^{-1} \quad (2.17)

P(k) = \big(I - K(k)\varphi^T(k)\big)P(k-1)\,\lambda^{-1} \quad (2.18)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1) \quad (2.19)
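Compared to the plain RLS step, only the gain and covariance updates change. A minimal MATLAB sketch of (2.16)-(2.19) with forgetting factor lambda (names illustrative):

```matlab
function [theta, P] = rls_forget_step(theta, P, phi, y, lambda)
% One update of the RLS algorithm with exponential forgetting (2.16)-(2.19).
K = P * phi / (lambda + phi' * P * phi);           % gain with forgetting (2.17)
eps_k = y - phi' * theta;                          % prediction error (2.19)
theta = theta + K * eps_k;                         % parameter update (2.16)
P = (eye(length(theta)) - K * phi') * P / lambda;  % inflated covariance (2.18)
end
```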



In some literature the RLS estimator is formulated using the information matrix instead of the covariance matrix. In that case

R(k) = \lambda R(k-1) + \varphi(k)\varphi^T(k) \quad (2.20)

is used instead of (2.9) [6]. However, from the perspective of implementation, it is often computationally more efficient to use the covariance matrix in the update equations so as to avoid a matrix inversion operation at each update.
The quality of the estimates depends directly on the choice of the forgetting factor. Choosing \lambda \approx 1 leads to robust, smooth trajectories; however, the algorithm loses its capability to track parameter changes since old data is discarded at a relatively slow rate. Conversely, a small value for \lambda enables fast tracking of parameter changes, but the estimates become more sensitive to noise, ultimately causing fluctuating trajectories and decreased robustness. Therefore, the trade-off between adaption rate and robustness needs to be taken into account when deciding on the forgetting factor [14]. This relationship is illustrated in fig. 2.2. The left figure shows the estimation of four parameters using \lambda = 0.9 while the right figure displays the same estimation using \lambda = 0.95. It is apparent that a smaller value of \lambda causes more noise-sensitive estimates and larger measurement variances; however, changes in the parameters can be tracked relatively fast. In comparison, a larger \lambda leads to smoother, less noise-sensitive estimates at the cost of slower parameter tracking [14].

Figure 2.2 Comparison of different forgetting factors \lambda = 0.9 (left) and \lambda = 0.95 (right) in the RLS estimator with exponential forgetting (true values: dashed lines, estimated values: solid lines) [14]


2.2 Lack of excitation


The RLS algorithm with exponential forgetting is a widely used technique for parameter estimation
in adaptive systems. However, certain circumstances can cause the RLS estimator to produce
inaccurate estimates. These situations arise when the input is insufficiently excited which leads
to a phenomenon denoted as estimator windup.

2.2.1 Estimator windup


For illustration of the windup phenomenon, we consider the extreme case of a process with no excitation, i.e. \varphi(k) = 0 for a long period of time. In this case, one can observe from (2.18) that applying an estimation technique with exponential forgetting leads to [14]

P(k) = \big(I - \underbrace{K(k)\varphi^T(k)}_{=0}\big)P(k-1)\,\lambda^{-1} = P(k-1)\,\lambda^{-1}

As \lambda < 1, P(k) grows exponentially with increasing time horizon, which leads to a blow-up or windup of the covariance matrix [2]. Then, as soon as the process becomes properly exciting again, the correction gain K(k) = P(k)\varphi(k) becomes very large due to the large covariance matrix. This causes \hat{\theta}(k) to become very sensitive to disturbances or numerical errors, causing sudden fluctuations of the parameter estimates [22]. Similarly, estimator windup can also be caused by a regression vector that is constant but different from zero [2]. In this case, as the covariance matrix grows exponentially, K(k) also becomes large. Thus, small signal fluctuations due to noise can already cause the parameters to drift away from their true values.
The effects of estimator windup are illustrated in fig. 2.3. In this example, the system input (regressor) u has phases where it is nearly constant, such as the period t = 50 s - 100 s. During this period of insufficient excitation, the covariance matrix (illustrated by the diagonal element p_{11}) grows exponentially. Therefore, small signal variations caused by noise result in the estimates \hat{a} and \hat{b} (solid lines) drifting away from their true values (dashed lines). As soon as proper excitation occurs (such as after t \approx 130 s), the covariance matrix decreases to 0, resulting in the estimates converging to the true parameters [2].

2.2.2 Persistent excitation


The condition of insufficient excitation is characterized by a lack of incoming information regarding the parameters \theta. In that case, the windup phenomenon causes the estimator to become unreliable and noise sensitive. In fact, it has been shown that if the condition of persistent excitation is fulfilled, the RLS with forgetting is exponentially convergent, i.e. the error between actual and estimated parameters tends to 0 exponentially fast, which is needed to guarantee the robustness of the algorithm in the presence of measurement noise [15]. On the contrary, if the excitation is not persistent, convergence of the parameter error cannot be ensured and


Figure 2.3 Effects of estimator windup caused by a constant regressor: control variable (top left), covariance matrix element (top right), trajectories of estimates (bottom) [2]

the robustness of the algorithm cannot be guaranteed. Thus, as estimator windup causes an unbounded increase of the covariance matrix, the estimates become unreliable [7]. This is why estimation algorithms relying on constant forgetting factors are only suitable for persistently excited processes [22].
The concept of persistent excitation can be defined by the following expression. A sequence of regressors is called persistently exciting in m steps if there exist constants c, C and m such that [9]

c\,I \le \sum_{i=k+1}^{k+m} \varphi(i)\varphi^T(i) \le C\,I \quad (2.21)

for any m > n. This condition implies that if \varphi(k) is persistently exciting, the entire \mathbb{R}^n space can be spanned by \varphi(k) uniformly in m steps. On the contrary, if the input is not sufficiently exciting, only a subspace with a dimension smaller than n is spanned [9].
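Condition (2.21) can be checked numerically over a sliding window of regressors. The following MATLAB sketch illustrates this; the window length m and the bounds c, C are application-specific assumptions that have to be chosen by the user:

```matlab
function pe = is_persistently_exciting(PhiWin, c, C)
% Check condition (2.21) for a window of m regressors.
% PhiWin: n x m matrix whose columns are phi(k+1) ... phi(k+m).
S = PhiWin * PhiWin';                   % sum of phi(i)*phi(i)' over the window
ev = eig((S + S') / 2);                 % symmetrize for numerical robustness
pe = (min(ev) >= c) && (max(ev) <= C);  % cI <= S <= CI
end
```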
To decide if a signal is exciting or not, it can be useful to exploit the fact that a lack of excitation is reflected in the behavior of the covariance matrix in several ways. Since the covariance matrix is a symmetric, real-valued, positive semidefinite matrix which possesses orthogonal eigenvectors, it can be diagonalized using an eigenvalue decomposition. In that way, a covariance matrix can be transformed into canonical form, i.e. factorized as

P = U\Lambda U^{-1}

where U is a square matrix whose i-th column is the eigenvector q_i of P and \Lambda is a diagonal matrix with the eigenvalues of P as its diagonal elements. As a result, each covariance matrix can be fully represented in terms of its eigenvalues and eigenvectors.
Adopting a statistical interpretation, the eigenvalues of a covariance matrix represent the magnitude of the data spread in the direction of the respective eigenvectors. This implies that the largest eigenvector of a covariance matrix points into the direction of the largest variance of the data, i.e. the direction in which the data has the largest uncertainty, while the second largest eigenvector points into the direction of the second largest variance, which is orthogonal to the largest eigenvector, and so on. The length of the eigenvectors is represented by the magnitude of the respective eigenvalues. This way, covariance matrices can be represented as ellipsoids in \mathbb{R}^n with n as the number of parameters [20]. For instance, considering a process with two parameters, the covariance matrix can be represented in two-dimensional space as an ellipse (see fig. 2.4). In this case, the two perpendicular axes of the ellipse point into the directions of the eigenvectors of the P matrix while the lengths of the axes are determined by the eigenvalues. Assuming that the first eigenvalue at time instance k is larger than the second, the ellipse is more elongated in the direction of eigenvector 1 [20].
Figure 2.4 Snapshot representation of a 2 \times 2 covariance matrix as an ellipse at time k (axes along eigenvectors EV 1 and EV 2)

As already established, estimator windup is characterized by a blowup of the covariance matrix and is caused by a lack of excitation in the input signal, i.e. no new information is incoming regarding the parameters or, more precisely, the incoming data does not contain enough information along all directions of the parameter space [5]. Since an estimator such as the RLS discards old information, the uncertainty grows. Because each covariance matrix can be represented in terms of its eigenvalues and eigenvectors, the windup phenomenon caused by insufficient excitation can therefore be detected by an unbounded increase of its eigenvalues. Referring to fig. 2.4, a lack of excitation would be represented by an increase of the eigenvalues, thus changing the shape of the ellipse during the period of poor excitation [20]. Furthermore, using the fact that the trace of a matrix is defined as the sum of its eigenvalues, another possibility to detect a lack of excitation is simply to monitor the trace of the P matrix: an increase of the covariance matrix eigenvalues is then reflected by an increase of the matrix trace.
Similarly, another method of detection is to interpret the windup phenomenon from the perspective of the information matrix. As seen from (2.20), poor excitation will lead to [6]

R(k) = \lambda R(k-1)

As \lambda < 1, information is discarded in every computation step, which leads to the information matrix tending to 0. In this case, the information matrix becomes almost singular and P as its inverse can become ill-conditioned, i.e. P^{-1}(k) = \Phi^T(k)\Phi(k) can become non-invertible. In this respect, poor excitation is also reflected by an increase of the condition number of P.
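Both indicators are cheap to evaluate at every update, so windup can be monitored online. A minimal MATLAB sketch (the thresholds are illustrative assumptions, not values from this work):

```matlab
function windup = detect_windup(P, traceMax, condMax)
% Flag potential estimator windup via the trace and the condition
% number of the covariance matrix P.
windup = trace(P) > traceMax || cond(P) > condMax;
end
```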

2.2.3 Parameter identification of vehicle longitudinal dynamics

In today's driver assistance systems, accurate models of the vehicle's longitudinal dynamics are required to enable automated control schemes, e.g. for fuel-efficient driving, vehicle following or range prediction of electric vehicles. Since the longitudinal dynamics are mainly characterized by the mass of the vehicle as well as the rolling resistance and the air resistance, obtaining an accurate model depends mostly on good estimates of these parameters. Since a sensor-based measurement is often not possible, an adaptive, real-time estimation of these parameters can be considered a rational alternative.
In order to derive a system for online parameter identification, a physical model of a vehicle's longitudinal dynamics is required. The dynamics can be modeled as [1], [22]:

(m + m_{rot})\,\dot{v} = F_A - \tfrac{1}{2}\rho_{air} A c_w v^2 - mg\sin(\alpha) - mg\cos(\alpha) f_R \quad (2.22)

In the above equation m is the vehicle mass, m_{rot} is the equivalent mass of rotating components and \dot{v} the vehicle acceleration. F_A is the driving force and is computed as F_A = \frac{T_{wheel}}{r_{dyn}}, i.e. the wheel torque divided by the dynamic rolling radius of the tire. The resisting forces consist of the aerodynamic resistance, slope resistance and rolling resistance. The air resistance is defined as F_{air} = \tfrac{1}{2}\rho_{air} A c_w v^2 with \rho_{air} denoting the air density, A being the frontal area of the vehicle, c_w being the drag coefficient and v representing the vehicle speed. The slope resistance is defined as F_S = mg\sin(\alpha) with g denoting the gravitational constant and \alpha being the slope of the road, where \alpha > 0 and \alpha < 0 correspond to uphill and downhill grades, respectively, and \alpha = 0 means no inclination. Lastly, the rolling resistance is computed as F_R = mg\cos(\alpha) f_R with f_R denoting the rolling resistance coefficient of the road [1], [22].
The unknown parameters in the dynamic model to be estimated are the vehicle mass, the rolling resistance coefficient and the drag coefficient. For the estimation of these parameters, (2.22) can be further simplified. Using the small angle approximation \cos(\alpha) \approx 1 for small road slopes, the rolling resistance can be approximated as F_R \approx mg f_R. Furthermore, in order to obtain a linear representation, the rolling resistance coefficient is to be estimated together with the vehicle mass (m f_R). Likewise, the drag coefficient is to be estimated in combination with the frontal surface area of the vehicle (A c_w). Lastly, the vehicle acceleration is combined with the term g\sin(\alpha) of the slope resistance, so that \dot{v} + g\sin(\alpha) = a_x is used. With these simplifications, (2.22) can be expressed using the regressor notation [1]:

\underbrace{F_A - m_{rot}\dot{v}}_{y} = \underbrace{\begin{bmatrix} a_x & g & \tfrac{1}{2}\rho_{air} v^2 \end{bmatrix}}_{\varphi^T}\,\underbrace{\begin{bmatrix} m \\ m f_R \\ A c_w \end{bmatrix}}_{\theta} \quad (2.23)

where F_A - m_{rot}\dot{v} describes the system output and \varphi^T = \big[\,a_x \;\; g \;\; \tfrac{1}{2}\rho_{air} v^2\,\big] represents the system input. Thus, the parameter vector to be estimated is \theta = \big[\,m \;\; m f_R \;\; A c_w\,\big]^T.
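To make the regressor notation concrete, the output y and regressor \varphi of (2.23) could be assembled from the vehicle signals as follows. This MATLAB sketch uses illustrative example values; the signal names and the value of the equivalent rotating mass are assumptions, not data from this work:

```matlab
% Example signal values at one sample (illustrative, not measured data):
T_wheel = 900;      % wheel torque [Nm]
r_dyn = 0.31;       % dynamic rolling radius [m]
v = 25;             % vehicle speed [m/s]
v_dot = 0.4;        % vehicle acceleration [m/s^2]
a_x = 0.5;          % longitudinal acceleration incl. slope component [m/s^2]
m_rot = 60;         % equivalent mass of rotating components [kg] (assumed)
rho_air = 1.2041;   % air density [kg/m^3]
g = 9.81;           % gravitational constant [m/s^2]

F_A = T_wheel / r_dyn;                % driving force
y = F_A - m_rot * v_dot;              % system output of (2.23)
phi = [a_x; g; 0.5 * rho_air * v^2];  % regressor vector of (2.23)
% theta = [m; m*f_R; A*c_w] is then estimated recursively from (y, phi).
```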


It must be noted that the longitudinal dynamics model described in (2.23) is only valid under specific circumstances. In fact, the following conditions must be fulfilled so that the parameters can be estimated reliably [1]:
- The vehicle must move at a minimum speed v \ge v_{min} and must have an acceleration larger than a_{min}.
- The vehicle is moving straight forward.
- The driver or the driver assistance system is not performing a braking maneuver.
- The powertrain is either fully open or fully closed, because the wheel torque cannot be calculated when the clutch is slipping.
If these validity conditions are not satisfied, the estimation algorithm is not activated.


3 Techniques for parameter identification under lack of excitation
With the notation defined in (2.23), the parameters \theta can be identified in real time using estimation algorithms such as the RLS with exponential forgetting. However, as already described, windup can occur in the RLS estimator during periods of poor excitation. Ultimately, this leads to inaccurate and unreliable estimates and can result in the algorithm becoming unstable. This is a serious drawback, since for the parameter estimation in many adaptive systems, reliability and robustness are particularly important. As a result, the RLS algorithm with exponential forgetting is not the most suitable estimator candidate.
Various researchers have identified the problems related to RLS estimators and have proposed a wide variety of modifications and alternatives to the algorithm in order to target the windup problem. These can be broadly categorized based on their utilized technique:
- Regularization of the least squares problem
- Variation of the forgetting factor
- Manipulation of the covariance matrix
- Limitation or scaling of the trace of the covariance matrix
In the following, some of the proposed estimation algorithms from the literature are illustrated.

3.1 Estimation algorithms based on regularization mechanisms


The concept of the estimators which rely on a regularization technique is based on the notion of well-posed/ill-posed problems. A problem is defined as well-posed if it is solvable and has a unique solution which depends continuously on the system parameters, i.e. small changes of the data cannot cause arbitrarily large changes of the solution. Conversely, when these conditions are not fulfilled, a problem is called ill-posed [21], [19].
Applying the concept to recursive parameter estimation, one recalls that the least squares estimation problem is described as Y = \Phi\theta, which is solved by finding the solution \hat{\theta} that minimizes the Euclidean norm (Y - \Phi\theta)^T(Y - \Phi\theta) [10]. This yields the so-called normal equation

\Phi^T\Phi\,\hat{\theta} = \Phi^T Y \quad (3.1)

from which the solution can be obtained (see (2.3)).
However, when windup occurs, the information matrix R = \Phi^T\Phi can become ill-conditioned. As a result, R can become singular, so that the covariance matrix P as the inverse of the information matrix does not exist. Hence, the regularization based estimators aim at preventing the poor conditioning of the information matrix.

3.1.1 Tikhonov regularization


One possibility to prevent the ill-conditioning of \Phi^T\Phi is based on the regularization technique proposed by Tikhonov. This so-called Tikhonov regularization method proposes to add a positive term to (3.1). Consequently, the optimization problem becomes [10]

\min_{\theta}\,\left\{ \big\|Y - \Phi\theta\big\|_2^2 + \mu_{TR}\,\big\|L\theta\big\|_2^2 \right\} \quad (3.2)

with \mu_{TR} denoting an adjustable parameter and L being a weighting matrix.
In the above equation, \mu_{TR}\|L\theta\|_2^2 can be seen as a penalty term, with which the optimization problem can be ensured to stay well-posed at the price of biasing the obtained solution. Thus, the regularization parameter \mu_{TR} has to be chosen considering the trade-off that a small value does not effectively prevent the ill-conditioning of the problem while a large value leads to a larger bias of the obtained solution. Ideally, \mu_{TR} can be chosen such that the residual is small and the penalty is moderate [10].
Since in many cases L is chosen as I (the so-called standard form), the solution of (3.2) is obtained as:

\hat{\theta} = \big(\Phi^T\Phi + \mu_{TR} I\big)^{-1}\Phi^T Y \quad (3.3)

Similar to the standard RLS, the derivation of the solution can be formulated in recursive form. Thus, one obtains the Tikhonov regularization estimator (TR) [23]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k) - P(k)(1-\lambda)\mu_{TR}\,\hat{\theta}(k-1) \quad (3.4)

K(k) = P(k)\varphi(k) \quad (3.5)

R(k) = \lambda R(k-1) + \varphi(k)\varphi^T(k) + (1-\lambda)\mu_{TR} I \quad (3.6)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1) \quad (3.7)

Note that while with the RLS estimator it is possible to formulate the algorithm entirely without the notion of the information matrix, due to the additional term \mu_{TR} I it is not possible to apply the matrix inversion lemma to obtain a recursive expression for P(k). Hence, the algorithm is formulated using R(k) = P^{-1}(k).
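Since the TR recursion propagates the information matrix, the covariance matrix has to be recovered by inversion at each step. A minimal MATLAB sketch of (3.4)-(3.7); the name mu_TR for the regularization parameter is a reconstruction, and the variable names are illustrative:

```matlab
function [theta, R] = tr_step(theta, R, phi, y, lambda, mu_TR)
% One update of the Tikhonov regularization estimator (3.4)-(3.7).
n = length(theta);
R = lambda * R + phi * phi' + (1 - lambda) * mu_TR * eye(n);   % (3.6)
P = inv(R);                       % covariance from the information matrix
K = P * phi;                      % correction gain (3.5)
eps_k = y - phi' * theta;         % prediction error (3.7)
theta = theta + K * eps_k - P * (1 - lambda) * mu_TR * theta;  % (3.4)
end
```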

3.1.2 Levenberg-Marquardt regularization


Another regularization based estimator, called the Levenberg-Marquardt regularization, has similar properties to the Tikhonov regularization. Here, the ill-conditioning of the information matrix is prevented simply by adding a positive definite matrix [11]. This way, it can be ensured that the information matrix is always invertible. Hence, the Levenberg-Marquardt regularization estimator (LMR) can be obtained by e.g. adding a scaled identity matrix \mu_{LMR} I to the update equation of the information matrix [23]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k) \quad (3.8)

K(k) = P(k)\varphi(k) \quad (3.9)

R(k) = \lambda R(k-1) + \varphi(k)\varphi^T(k) + (1-\lambda)\mu_{LMR} I \quad (3.10)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1) \quad (3.11)

The LMR is formally very similar to the TR algorithm. The only difference is that, unlike the LMR, the TR algorithm includes the covariance matrix in the parameter update equation. Similar to TR, the adjustable parameter in the LMR is \mu_{LMR}, which is to be chosen as a positive constant while considering the same trade-off as in the TR estimator.

3.2 Estimation algorithms based on variation of the forgetting factor

A different category of estimators targets the windup problem from the perspective of the forgetting factor. Since the standard RLS uses a constant, time-invariant forgetting factor, old data is discarded uniformly in each iteration step. This means that the same forgetting factor is applied to the covariance matrix at all times, regardless of the level of excitation of the input or the variation rate of the parameters. Therefore, the windup problem can be targeted by designing an estimator which uses a variable forgetting factor.

3.2.1 RLS with variable forgetting factor

An estimation technique which is based on a variable forgetting factor has been proposed by Fortescue et al., 1981. The basic idea of the method is to enforce time-variant forgetting where the forgetting factor is chosen approximately as 1 when the process is not properly excited, in order to avoid windup, and to decrease the forgetting factor during periods of rich excitation in order to enable parameter tracking. This way, during periods of poor excitation the large forgetting factor ensures that old data is not discarded and, conversely, when the excitation is rich, \lambda can be varied [14].
In the proposed algorithm, the variable forgetting factor is computed as a function of the noise variance level and the current estimation error. The idea is to choose \lambda(k) such that the a posteriori error remains constant in time (E(k) = E(k-1) = E(0)). In [14] this condition is achieved by using

\lambda(k) = 1 - \big(1 - \varphi^T(k)K(k-1)\big)\,\frac{\varepsilon(k)^2}{E(0)} \quad (3.12)

where the initial a posteriori error E(0) is defined as a function of the noise variance \sigma_n^2:

E(0) = \sigma_n^2\,N_0, \qquad N_0 = \frac{1}{1 - \lambda_0}

with \lambda_0 being an adjustable parameter.
To achieve better performance, a modification of the algorithm has been proposed in subsequent studies where the noise variance \sigma_n^2(k) is calculated recursively as the weighted sum of the previous noise variance value and the current prediction error \varepsilon(k) [14]. Furthermore, two thresholds are defined to detect if the parameters have changed. In case the thresholds are exceeded (i.e. a change has occurred), E(0) is chosen as a small value, which according to (3.12) leads to a smaller forgetting factor that enables good parameter tracking. Otherwise E(0) is chosen as a large value, which leads to a larger \lambda that ensures robust estimates during poor excitation. The so-called RLS with variable forgetting algorithm (VF) is obtained by adding the following equations to the standard RLS given by (2.16)-(2.19):

N_{01} = \frac{1}{1 - \lambda_0} \quad (3.13)

N_{02} = \frac{N_{01}}{10} \quad (3.14)

\sigma_n^2(k) = \beta\,\sigma_n^2(k-1) + (1-\beta)\,\varepsilon^2(k) \quad (3.15)

E(0) = \begin{cases} \sigma_n^2(k)\,N_{02} & \text{if } \sigma_n^2(k) \ge \sigma_n^2(0) \;\wedge\; \big|\sigma_n^2(k) - \sigma_n^2(k-1)\big| \ge \Delta\sigma_n^2 \\ \sigma_n^2(k)\,N_{01} & \text{else} \end{cases} \quad (3.16)

\lambda(k) = 1 - \big(1 - \varphi^T(k)K(k-1)\big)\,\frac{\varepsilon(k)^2}{E(0)}

In the above algorithm, \sigma_n^2(0) and \Delta\sigma_n^2 are the threshold parameters which need to be chosen. Equation (3.16) states that if the noise variance exceeds its initial level and if there is a significant increase or decrease of the noise variance level compared to the previous sample time, it can be deduced that the process is sufficiently exciting. Thus, to enable good adaptation to parameter changes, \lambda can be decreased, which is achieved by choosing the smaller N_{02} to compute the forgetting factor. Otherwise, when the two thresholds are not exceeded, \lambda should be chosen as a large value in order to restrict the effects of estimator windup. In this case, the forgetting factor is computed using the larger N_{01} so that \lambda \approx 1. Another parameter to be tuned is \beta, which determines how the previous noise variance level and the current estimation error are weighted in the recursive computation of \sigma_n^2(k) [14].
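The threshold logic of (3.16) together with (3.12) can be condensed into a small helper that returns the current forgetting factor. A MATLAB sketch under the reconstructed symbol names above (the clamping of lambda to [0, 1] is an added safeguard, not part of the cited algorithm):

```matlab
function lambda = vf_lambda(phi, K_prev, eps_k, s2, s2_prev, s2_0, dS2, N01, N02)
% Variable forgetting factor per (3.12) and (3.16).
% s2, s2_prev: current/previous noise variance; s2_0, dS2: thresholds.
if s2 >= s2_0 && abs(s2 - s2_prev) >= dS2
    E0 = s2 * N02;    % excitation detected: small E(0), smaller lambda
else
    E0 = s2 * N01;    % poor excitation: large E(0), lambda close to 1
end
lambda = 1 - (1 - phi' * K_prev) * eps_k^2 / E0;
lambda = min(max(lambda, 0), 1);   % safeguard (assumption)
end
```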

3.2.2 RLS with multiple forgetting factors


A different approach to an estimator based on variable forgetting is proposed by Vahidi et al. in [22]. The key principle of the algorithm is based on the observation that in many adaptive systems, parameters often vary at different rates. For instance, regarding the parameters in the longitudinal dynamics model of vehicles, the vehicle mass is a parameter that changes rather abruptly (e.g. when passengers get into or out of the vehicle) while the air resistance barely varies. Since the standard RLS assumes that all parameters have the same variation rate, windup can occur when multiple parameters are estimated that actually have different variation rates [22].
A solution to this problem can be obtained when one considers not a single but multiple forgetting factors that are individually customized to each parameter. For this, a new loss function is defined which separates the error caused by each parameter. For an estimation of two parameters, e.g., the loss function can be stated as

V(\theta_1(k), \theta_2(k)) = \frac{1}{2}\sum_{i=1}^{k}\lambda_1^{k-i}\big(y(i) - \varphi_1^T(i)\theta_1(k) - \varphi_2^T(i)\theta_2(i)\big)^2 + \frac{1}{2}\sum_{i=1}^{k}\lambda_2^{k-i}\big(y(i) - \varphi_1^T(i)\theta_1(i) - \varphi_2^T(i)\theta_2(k)\big)^2

With the above expression, the error term can distinguish between the error caused by the first estimate and the error caused by the second estimate.
Similar to the standard RLS, the least squares estimates can be calculated using the individual gradients \frac{\partial V}{\partial \theta_i(k)}, i = 1, 2. Therefore, repeating the calculations for the standard RLS, the update equations for each individual parameter can be obtained (here presented for two parameters, i = 1, 2):

\hat{\theta}_i(k) = \hat{\theta}_i(k-1) + k_i(k)\varepsilon(k)

k_i(k) = p_i(k-1)\varphi_i(k)\big(\lambda_i + \varphi_i^T(k)p_i(k-1)\varphi_i(k)\big)^{-1}

p_i(k) = \big(I - k_i(k)\varphi_i^T(k)\big)\,p_i(k-1)\,\frac{1}{\lambda_i}

\varepsilon(k) = \begin{cases} y(k) - \varphi_1^T(k)\hat{\theta}_1(k-1) - \varphi_2^T(k)\hat{\theta}_2(k) & \text{for } i = 1 \\ y(k) - \varphi_1^T(k)\hat{\theta}_1(k) - \varphi_2^T(k)\hat{\theta}_2(k-1) & \text{for } i = 2 \end{cases}
Applying some additional algebraic rearrangements, the equations can be obtained in vector form and the Multiple forgetting algorithm (MF) can be stated as [22]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K_{new}(k)\varepsilon(k) \quad (3.17)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1) \quad (3.18)

K_{new}(k) = \frac{1}{1 + \frac{1}{\lambda_1}p_1(k-1)\varphi_1(k)^2 + \frac{1}{\lambda_2}p_2(k-1)\varphi_2(k)^2} \begin{bmatrix} \frac{1}{\lambda_1}p_1(k-1)\varphi_1(k) \\[1mm] \frac{1}{\lambda_2}p_2(k-1)\varphi_2(k) \end{bmatrix} \quad (3.19)

p_i(k) = \big(1 - K_{new,i}(k)\varphi_i(k)\big)\,p_i(k-1)\,\frac{1}{\lambda_i}, \quad i = 1, 2 \quad (3.20)

P(k) = \begin{bmatrix} p_1(k) & 0 \\ 0 & p_2(k) \end{bmatrix} \quad (3.21)

with K_{new,i} denoting the i-th row of K_{new}. In this case, the covariance matrix is a diagonal matrix with p_1 and p_2 as the diagonal elements. Thus, for each diagonal element an individual forgetting factor is applied and each element of the covariance matrix is updated separately.
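For the two-parameter case with scalar regressor entries, one MF update (3.17)-(3.21) might be sketched in MATLAB as follows (names illustrative):

```matlab
function [theta, p] = mf_step(theta, p, phi, y, lambda)
% One update of the multiple-forgetting RLS (3.17)-(3.21) for two
% parameters. theta, p, phi, lambda are all 2 x 1 vectors.
den = 1 + p(1)/lambda(1)*phi(1)^2 + p(2)/lambda(2)*phi(2)^2;
K = [p(1)/lambda(1)*phi(1); p(2)/lambda(2)*phi(2)] / den;  % gain (3.19)
eps_k = y - phi' * theta;            % prediction error (3.18)
theta = theta + K * eps_k;           % parameter update (3.17)
p = (1 - K .* phi) .* p ./ lambda;   % per-parameter covariance update (3.20)
end
```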



3.3 Estimation algorithms based on covariance manipulation


While the above-mentioned algorithms such as the VF estimator rely on non-uniform forgetting in time, there are alternative estimation techniques which are based on non-uniform forgetting in the parameter space. The basic idea comes from the observation that the incoming data is often not uniformly distributed in the parameter space [7]. As a matter of fact, the standard RLS with constant forgetting is based on the misconception that old data can be forgotten uniformly since it is assumed to be obsolete. However, old data should only be forgotten when the new incoming data contains enough new information about the parameters [6]. This is often not the case, since the incoming data is not distributed uniformly in the parameter space [7]. This way, when data is discarded using a constant forgetting factor, information related to non-excited directions will be lost, causing the unlimited growth of some elements of the covariance matrix. Therefore, a variety of algorithms have been proposed where forgetting is applied only in certain directions. This type of technique, termed directional forgetting, aims at discarding data only in those directions where the incoming information is sufficiently exciting and at restricting the dismissal of data in the non-excited directions [17], [6].

3.3.1 Directional Forgetting (Bittanti)


There exist many variations of directional forgetting based algorithms. For instance, Bittanti et al. in [4] proposed a version which is henceforth denoted as Directional Forgetting by Bittanti (DFB). The algorithm can be described by modifying the standard RLS as follows [12]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k) \quad (3.22)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1)

a(k) = \varphi^T(k)P(k-1)\varphi(k)

K(k) = P(k-1)\varphi(k)\big(1 + a(k)\big)^{-1} \quad (3.23)

\gamma(k) = \begin{cases} \lambda - \dfrac{1-\lambda}{a(k)} & \text{if } a(k) > 0 \\[2mm] 1 & \text{if } a(k) = 0 \end{cases} \quad (3.24)

P(k) = P(k-1) - \frac{P(k-1)\varphi(k)\varphi^T(k)P(k-1)}{\gamma^{-1}(k) + a(k)} \quad (3.25)

In case windup is caused by a regression vector of 0, the term a(k) becomes 0 as well and the equation for the correction gain (3.23) is the same as the corresponding equation in the standard RLS without forgetting (2.8). Furthermore, the update equation of the covariance matrix is almost the same as in (2.9) except for the difference in sign of the denominator expression. Consequently, the influence of the forgetting factor is eliminated from the update equations and the effects of estimator windup can be limited. On the other hand, if the regression vector is constant but different from 0, a(k) > 0 applies and \gamma(k) is set to \lambda - \frac{1-\lambda}{a(k)}, which can be either positive or negative. Thus, the denominator \gamma^{-1}(k) + a(k) in the update equation of P(k) can shift in sign, depending on the chosen forgetting factor and the magnitude of a(k). As a result, when windup occurs due to a constant regression vector, the covariance matrix can either increase or decrease (i.e. P(k) stays bounded). This is in contrast to the RLS with constant forgetting, where P(k) is unbounded and can thus only increase [12].
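A compact MATLAB sketch of one DFB update (3.22)-(3.25); the symbol gamma corresponds to the directional forgetting factor of (3.24), whose name is a reconstruction:

```matlab
function [theta, P] = dfb_step(theta, P, phi, y, lambda)
% One update of Directional Forgetting by Bittanti (3.22)-(3.25).
a = phi' * P * phi;                       % excitation measure
K = P * phi / (1 + a);                    % correction gain (3.23)
theta = theta + K * (y - phi' * theta);   % parameter update (3.22)
if a > 0
    gamma = lambda - (1 - lambda) / a;    % directional factor (3.24)
else
    gamma = 1;                            % no excitation: forgetting-free update
end
P = P - (P * (phi * phi') * P) / (1/gamma + a);   % (3.25)
end
```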

3.3.2 Directional Forgetting (Cao)

An alternative directional forgetting based algorithm has been proposed by [7], [6]. The fundamental idea can be explained by examining the update equation of the information matrix of the standard RLS in situations of poor excitation, R(k) = \lambda R(k-1). It can be observed that in this case the entire matrix R(k) will tend to 0 because information is forgotten uniformly. However, a better performance can be achieved when the information content of the regression vector \varphi(k) is taken into account, i.e. forgetting is only applied to the specific part of R(k) which is affected by the new information [7].
This leads to the modification of the information matrix update equation from (2.20) to the more generalized form

R(k) = F(k)R(k-1) + \varphi(k)\varphi^T(k) \quad (3.26)

where F(k) denotes the forgetting matrix. As stated in [6], the forgetting matrix should be designed to apply forgetting only on the excited subspace of the parameter space.¹ By introducing

\bar{R}(k) = \bar{R}(k-1) + \varphi(k)\varphi^T(k) \quad (3.27)

\bar{R}(k-1) = F(k-1)R(k-1) \quad (3.28)

F(k) is to be chosen such that \bar{R}(k-1) is positive definite and \bar{R}(k-1) \le R(k-1). This means that R(k) can be bounded from below, thus preventing R(k) from becoming zero, which causes the windup phenomenon [6].
It is up to debate how to choose F(k). In the algorithm proposed by [6], the forgetting matrix is derived based on an orthogonal decomposition of R(k) along the excitation direction. Namely, the information matrix is decomposed into two parts

R(k-1) = R_1(k-1) + R_2(k-1) \quad (3.29)

where R_2(k-1) is the part to which forgetting is applied. This way it can be stated that

R_1(k-1)\varphi(k) = 0 \quad (3.30)

which establishes an orthogonal relationship between R_1(k-1) and \varphi(k), and (3.29) becomes

R(k-1)\varphi(k) = R_2(k-1)\varphi(k) \quad (3.31)

Specifying that the new incoming data \varphi(k)\varphi^T(k) has rank one and that R_2(k-1) must have the same rank while R_1(k-1) must have rank n-1, a unique solution is given by

R_2(k-1) = \alpha(k)\,R(k-1)\varphi(k)\varphi^T(k)R(k-1) \quad (3.32)

\alpha(k) = \begin{cases} \dfrac{1}{\varphi^T(k)R(k-1)\varphi(k)} & \text{if } |\varphi(k)| > \nu \\[2mm] 0 & \text{if } |\varphi(k)| \le \nu \end{cases} \quad (3.33)

R_1(k-1) = R(k-1) - \alpha(k)\,R(k-1)\varphi(k)\varphi^T(k)R(k-1) \quad (3.34)

where a dead zone \nu for \varphi(k) is introduced in which the decomposition is not performed (i.e. R_2(k-1) = 0, R_1(k-1) = R(k-1)) [6].
If forgetting is only applied to R_2(k-1), the recursive update equation of the information matrix can be expressed as

R(k) = R_1(k-1) + \lambda R_2(k-1) + \varphi(k)\varphi^T(k) \quad (3.35)

where R_1(k-1) refers to the part that is orthogonal to the regression vector and carries information not to be discarded, \lambda R_2(k-1) is the part of the information matrix to which forgetting is applied and \varphi(k)\varphi^T(k) denotes the new incoming information.

¹ As one can see, by setting F(k) = \lambda I the update equation (2.20) of the standard RLS is obtained.
After some reformulations and application of the matrix inversion lemma, the estimation algorithm, which is henceforth referred to as Directional Forgetting by Cao (DFC), can be formulated [7], [6]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1)

K(k) = P(k)\varphi(k) = \frac{\bar{P}(k-1)\varphi(k)}{1 + \varphi^T(k)\bar{P}(k-1)\varphi(k)} \quad (3.36)

\bar{P}(k-1) = \begin{cases} P(k-1) + \dfrac{1-\lambda}{\lambda}\,\dfrac{\varphi(k)\varphi^T(k)}{\varphi^T(k)R(k-1)\varphi(k)} & \text{if } |\varphi(k)| > \nu \\[2mm] P(k-1) & \text{if } |\varphi(k)| \le \nu \end{cases} \quad (3.37)

P(k) = \bar{P}(k-1) - \frac{\bar{P}(k-1)\varphi(k)\varphi^T(k)\bar{P}(k-1)}{1 + \varphi^T(k)\bar{P}(k-1)\varphi(k)} \quad (3.38)

where \nu is the threshold parameter for the dead zone of the covariance matrix update, which needs to be adjusted. Therefore, when the excitation is poor (i.e. the threshold is not exceeded), the covariance matrix is not inflated, i.e. \bar{P}(k-1) = P(k-1), thus preventing the blowup of the covariance matrix. In this case the update equations are exactly the same as the RLS without forgetting and the effects of windup can be restricted since old data is not discarded [6].
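Since the DFC update needs both P(k-1) and its inverse R(k-1), a direct implementation carries both matrices along. A MATLAB sketch (the dead-zone threshold nu is the adjustable parameter of (3.33)/(3.37)):

```matlab
function [theta, P, R] = dfc_step(theta, P, R, phi, y, lambda, nu)
% One update of Directional Forgetting by Cao (3.36)-(3.38),
% with the information matrix updated per (3.35).
if norm(phi) > nu
    a = phi' * R * phi;
    Pbar = P + (1 - lambda)/lambda * (phi * phi') / a;               % (3.37)
    R = R - (1 - lambda) * (R * phi) * (phi' * R) / a + phi * phi';  % (3.35)
else
    Pbar = P;                    % dead zone: no directional forgetting
    R = R + phi * phi';
end
K = Pbar * phi / (1 + phi' * Pbar * phi);                            % (3.36)
theta = theta + K * (y - phi' * theta);
P = Pbar - (Pbar * (phi * phi') * Pbar) / (1 + phi' * Pbar * phi);   % (3.38)
end
```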

3.3.3 Kalman filter based algorithm (I)


In a subsequent study by Cao and Schwartz, a further modification of the DFC algorithm has been proposed. The proposal of a modified version is motivated by the fact that in the DFC algorithm the update of the covariance matrix requires both the value of the covariance matrix P(k-1) of the previous sample as well as its inverse R(k-1). In order to improve computational efficiency, a simplified version of the algorithm has been designed where \varphi^T(k)R(k-1)\varphi(k) is replaced by an expression which does not contain R(k-1). The simplified version can be expressed as follows [9]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1)

K(k) = \frac{P(k-1)\varphi(k)}{r + \varphi^T(k)P(k-1)\varphi(k)}

P(k) = P(k-1) - \frac{P(k-1)\varphi(k)\varphi^T(k)P(k-1)}{r + \varphi^T(k)P(k-1)\varphi(k)} + \beta\,\frac{\varphi(k)\varphi^T(k)}{\mu + \varphi^T(k)\varphi(k)} \quad (3.39)

The estimator is denoted by the authors as the Kalman Filter based algorithm (KFB-I) since it has very similar properties to a standard Kalman filter (see (2.7), (2.10), (2.13), (2.14)). Here r, \beta and \mu are adjustable parameters. For instance, \beta is a parameter that determines the tracking speed of the algorithm, while \mu can often be chosen as a very small value to ensure that the covariance matrix is well-defined. Interpreting the algorithm as a modification of the standard Kalman filter, r represents the variance of the measurement noise, which can e.g. be assumed as Gaussian and known, i.e. r(k) = r [8].
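One KFB-I update (3.39) in MATLAB might be sketched as follows; the parameter names beta and mu are reconstructions:

```matlab
function [theta, P] = kfb1_step(theta, P, phi, y, r, beta, mu)
% One update of the Kalman Filter based algorithm KFB-I (3.39).
den = r + phi' * P * phi;
K = P * phi / den;                        % Kalman-like correction gain
theta = theta + K * (y - phi' * theta);   % parameter update
P = P - (P * (phi * phi') * P) / den ...
      + beta * (phi * phi') / (mu + phi' * phi);  % directional inflation term
end
```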

3.3.4 Kalman filter based algorithm (II)


Another modified version of the above algorithm is developed in [8]. The derivation is motivated by the similar properties between the described Kalman Filter based algorithm KFB-I and the standard Kalman filter for parameter estimation. In the standard Kalman filter, the covariance matrix update is given by [8] (see (2.14))

P(k) = P(k-1) - \frac{P(k-1)\varphi(k)\varphi^T(k)P(k-1)}{r(k) + \varphi^T(k)P(k-1)\varphi(k)} + Q(k)

where Q(k) is the covariance matrix of the random walk sequence vector w(k). Since in real applications Q(k) is never known exactly, it is possible to compute Q(k) recursively. Thus, the so-called Modified Kalman Filter based algorithm (KFB-II) can be obtained by simply modifying (3.39) as [8]

P(k) = P(k-1) - \frac{P(k-1)\varphi(k)\varphi^T(k)P(k-1)}{r + \varphi^T(k)P(k-1)\varphi(k)} + Q(k) \quad (3.40)

Q(k) = \alpha\,Q(k-1) + (1-\alpha)\,\beta\,\frac{\varphi(k)\varphi^T(k)}{\mu + \varphi^T(k)\varphi(k)} \quad (3.41)

It can be observed that the modified version differs from the original version simply in the choice of Q(k). That is, in the KFB-I estimator the variance matrix is chosen directly as Q(k) = \beta\,\frac{\varphi(k)\varphi^T(k)}{\mu + \varphi^T(k)\varphi(k)}, while in the modified version Q(k) is obtained recursively as the weighted sum of the previous value at k-1 and \beta\,\frac{\varphi(k)\varphi^T(k)}{\mu + \varphi^T(k)\varphi(k)}.

In summary, it can be proven that the properties of both Kalman Filter based algorithms as well as DFC ensure that the covariance matrix is bounded from both below and above [9], [8]. This represents a desirable property of any estimation algorithm, since the boundedness from below ensures good tracking abilities (since P(k) does not tend to zero), while the boundedness from above restricts the effects of estimator windup, indicating that the covariance matrix cannot increase infinitely. In contrast, DFB only shows upper boundedness of the covariance matrix [6]. Hence, although all directional forgetting based algorithms should be able to restrict the effects of windup, one should expect that the latter three DF based algorithms provide better tracking abilities than the first algorithm.

3.3.5 Stenlund-Gustafsson Anti-Windup algorithm


Another directional forgetting based algorithm with properties similar to a Kalman filter is proposed by Stenlund and Gustafsson. The starting point of the algorithm is equation (3.40) of a Kalman filter for parameter estimation. In [20], it is proposed to choose the variance matrix Q(k) as:

Q(k) = \frac{P_d\,\varphi(k)\varphi^T(k)\,P_d}{r + \varphi^T(k)P_d\,\varphi(k)} \quad (3.42)

where r denotes the error covariance and P_d \in \mathbb{R}^{n\times n} is a matrix to be adjusted. This way, by adding Q(k) to the update equation, P_d becomes the matrix to which P(k) converges in periods of poor excitation [1]. This indicates that the covariance matrix stays bounded. As a result, the algorithm, which is henceforth denoted as Stenlund-Gustafsson Anti-windup (SG), can be obtained by adding (3.42) to the standard Kalman Filter estimator given by (2.7), (2.10), (2.13) [20].

3.4 Estimation algorithms based on limiting or scaling the covariance matrix trace
Various studies have observed that in order to design a well-behaved estimator which is capable of
avoiding windup, the boundedness of the covariance matrix from above is of particular importance
[8]. While the above mentioned directional forgetting estimators bound the covariance matrix
indirectly through the application of non-uniform forgetting, there exist a variety of algorithms
which are based on the direct bounding of the covariance matrix through limiting or scaling the
matrix trace.

3.4.1 Constant trace algorithm


The idea of the constant trace algorithm is to scale the P matrix in each iteration such that its trace remains constant. This way the eigenvalues of the covariance matrix cannot increase infinitely, as the trace is kept at a constant value. The algorithm can be obtained by introducing a recursively calculated matrix \bar{P}(k) and by calculating P(k) as a function of \bar{P}(k). The Constant trace algorithm (CT) can be described by the following equations [2]:

\hat{\theta}(k) = \hat{\theta}(k-1) + K(k)\varepsilon(k)

K(k) = P(k-1)\varphi(k)\big(\lambda + \varphi^T(k)P(k-1)\varphi(k)\big)^{-1}

\bar{P}(k) = \frac{1}{\lambda}\left(P(k-1) - \frac{P(k-1)\varphi(k)\varphi^T(k)P(k-1)}{1 + \varphi^T(k)P(k-1)\varphi(k)}\right) \quad (3.43)

P(k) = c_1\,\frac{\bar{P}(k)}{\operatorname{tr}\{\bar{P}(k)\}} + c_2 I \quad (3.44)

\varepsilon(k) = y(k) - \varphi^T(k)\hat{\theta}(k-1)

The key principle of the estimator can be explained through (3.43) and (3.44). Assuming that excitation is poor, \varphi(k) = 0 leads to the exponential increase of \bar{P}(k) due to \bar{P}(k) = \frac{1}{\lambda}P(k-1). However, by dividing the matrix by its trace, the result \frac{\bar{P}(k)}{\operatorname{tr}\{\bar{P}(k)\}} is scaled such that its trace remains constant, no matter how large \bar{P}(k) becomes. Therefore, the covariance matrix P(k) stays bounded even in periods of poor excitation. The optional term c_2 I is added as a regularization mechanism [2], [18].
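A MATLAB sketch of one constant trace update (3.43)-(3.44); c1 and c2 are the tuning constants from (3.44), and the names are illustrative:

```matlab
function [theta, P] = ct_step(theta, P, phi, y, lambda, c1, c2)
% One update of the Constant trace algorithm (3.43)-(3.44).
n = length(theta);
K = P * phi / (lambda + phi' * P * phi);  % gain as in the forgetting RLS
theta = theta + K * (y - phi' * theta);
Pbar = (P - (P * (phi * phi') * P) / (1 + phi' * P * phi)) / lambda; % (3.43)
P = c1 * Pbar / trace(Pbar) + c2 * eye(n);  % constant-trace scaling (3.44)
end
```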

3.4.2 Maximum trace algorithm


Another method to achieve boundedness of the covariance matrix is to limit its trace to a maximum value. In [16], this is achieved by modifying the forgetting factor according to:

\lambda(k) = 1 - (1 - \lambda_0)\left(1 - \frac{\operatorname{tr}\{P(k)\}}{\operatorname{tr}\{P_{max}\}}\right) \quad (3.45)

Thus, the Maximum trace algorithm (MT) is given by substituting the constant forgetting factor of the standard RLS algorithm by (3.45). Using said expression, \lambda tends to 1 once the trace of the matrix P approaches the predefined maximum value \operatorname{tr}\{P_{max}\}, since 1 - \frac{\operatorname{tr}\{P(k)\}}{\operatorname{tr}\{P_{max}\}} = 0. Conversely, when the covariance matrix converges to 0, \lambda tends to the specified lower bound \lambda_0, which ensures the algorithm's adaptability to parameter changes.


4 Implementation and comparison of parameter identification algorithms
In the following, the estimation algorithms described in chapter 3 shall be implemented for the estimation of the vehicle mass and driving resistances as described by the longitudinal dynamics model in section 2.2.3. The algorithms are subsequently evaluated in terms of performance using numerical data obtained from various test drives. A simulation model implemented in Simulink is used to test the estimators, which are subsequently compared in terms of various quality criteria.

4.1 Experimental setup


The starting point for the evaluation of the proposed estimators is a model of a vehicle's longitudinal dynamics (see (2.23)) implemented in MATLAB/Simulink. The experimental data is obtained from recorded test drives which have been performed under different circumstances, such as different vehicle types and/or different loads. In general, the test drives reflect the environment of different landscapes. For instance, the vehicles travel both through flat landscapes, such as highways where the velocity is often constant and the acceleration is minimal, and through city traffic, which is characterized by varying velocity and acceleration. Thus, the performance of the estimators can be evaluated under different circumstances. The relevant signals from the test drives are transferred from a vehicle bus system to the simulation model [1].
For the evaluation of the proposed estimators, some preliminary preparations are made to improve the estimators' performances. For instance, all algorithms need a learning phase at the beginning. This means that before they start to converge, the estimates can take on values which are not physically meaningful (i.e. too large or too small). Consequently, upon computation the parameters are limited to stay within a physically meaningful range [\theta_{min}, \theta_{max}]. Furthermore, signals needed to formulate the longitudinal dynamics model, such as the driving force F_A and the longitudinal acceleration a_x, can be noisy. This is why, upon obtaining these signals from the bus network system, a PT1 filter is used to reduce the noise. Finally, the parameters to be estimated have different scales, which can lead to numerical problems during computation. Therefore, the estimated variables are scaled to the same magnitude using a transformation

\bar{\theta} = T\theta \quad (4.1)

where the transformation matrix T is chosen so that the transformed variables lie within the range of -1 to 1.
In order to compare and evaluate the estimators, the reference values of the vehicle mass and driving resistances are required. For this purpose, the first parameter m, i.e. the mass of the vehicle including its load, is measured before a test drive using a scale. The reference values of the coefficients needed for the estimation of the second and third parameters, mf_R and Ac_w, are determined through a specific experimental setup. During these experiments, the vehicle speed is measured and the coefficients are computed through non-linear optimization such that the predicted speed using these coefficients is close to the actual measured speed. For the sake of simplicity, the second and third parameters are henceforth denoted as f_0 = mf_R and f_2 = Ac_w, respectively.

4.2 Simulation model


A simulation model has been implemented in Simulink and is shown in fig. 4.1.

Figure 4.1 Simulink model of the test study

In the model, a block named Parameters contains all parameters needed for the adjustment or initialization of the algorithms, such as the values of forgetting factors or initial values for estimates or covariance matrices. Another block, CAN2input, processes all data obtained from the vehicle bus system. The various data are subsequently sent to the orange-colored estimator blocks as well as to another block called A/B Generator 3P, which contains the model of the longitudinal dynamics. Finally, an Evaluation block aggregates the simulation results regarding some quality measures and sends the data to the MATLAB workspace for further processing.
All algorithm blocks have the same structure, which is displayed in fig. 4.2 for the RLS algorithm. The estimators are implemented as enabled subsystems since they should only compute when the conditions stated in section 2.2.3 are fulfilled. The fulfillment of the conditions is evaluated in a specific subsystem and the evaluation result is transmitted to each algorithm block using a Goto block Valid. This way, the estimators are activated only if Valid = 1. Each algorithm is implemented as a MATLAB function. Upon simulation, the results are sent to the workspace for further processing. Furthermore, a number of quality measures are used to evaluate the algorithms' performances.

Figure 4.2 Structure of algorithm blocks

4.3 Quality measures for estimator evaluation


For the comparison of the various estimators, different quality measures are used in the simulation, which all involve the computation of the quadratic error between the estimated and the actual values. The quality measures used for the evaluation are:
- RMSE: the root mean square error between the measured output y and the estimated output \varphi^T\hat{\theta}.
- RMSE_m: the root mean square error between the actual vehicle mass m_{ref} and the estimated mass \hat{m}.
- pRMSE_m and nRMSE_m: the positive and negative RMSE_m values, respectively.
- \Delta m_{max}: the maximum deviation of the estimated mass from its true value.
- RMSE_v: the root mean square error between the predicted speed and the measured speed from the vehicle bus. Basically, the obtained parameter estimates are used for an ahead-prediction of the vehicle speed. The prediction model is implemented for each algorithm (see fig. 4.2). The obtained result is compared to the measured vehicle speed from the test drives and the root mean square error is computed.
- pRMSE_v and nRMSE_v: the positive and negative values of RMSE_v, respectively.




All quality measures are calculated recursively. If the update conditions are fulfilled and the starting phase of an algorithm has passed, the residual sum of squares (RSoS) is calculated as:

k = k' + 1 \quad (4.2)

RSoS(k) = \frac{1}{k}\big(k'\,RSoS(k') + E^2\big) \quad (4.3)

with k and k' denoting the current and previous sample time, respectively, and E denoting the error between the estimated value and the actual/reference value. Otherwise, when the update conditions are not fulfilled, k = k' and RSoS(k) = RSoS(k'). The root mean square error is subsequently computed as the square root of the RSoS value.
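Implemented recursively, (4.2)-(4.3) reduce to a running mean of squared errors. A MATLAB sketch (names illustrative):

```matlab
function [rsos, k] = rsos_update(rsos, k, E)
% Recursive residual sum of squares per (4.2)-(4.3); call only while
% the validity conditions hold. The RMSE is sqrt(rsos).
kprev = k;
k = kprev + 1;                     % (4.2)
rsos = (kprev * rsos + E^2) / k;   % (4.3)
end
```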
Regarding the calculation of the quality measures, a timer is used which counts the time instances during which the validity conditions of 2.2.3 are fulfilled and stops when the requirements are violated. When the duration of the valid time instances passes a predefined threshold, the calculation of the quality measures is activated. The reason for setting such a threshold is that in the beginning of a test drive an estimator is still in its learning phase. Thus, the obtained estimates may fluctuate drastically in the beginning and may also show large deviations from the actual values. This would lead to large error values which should not be counted in the evaluation of the estimators. Therefore, by setting a duration threshold it is ensured that the quality measures are only calculated once the learning phase has passed. However, the duration threshold must be set to an appropriate value, since a threshold too large can cause the error calculation not to be activated at all, while a threshold too small can lead to biased results in the evaluation.
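A minimal MATLAB sketch of this gated, recursive error computation is given below; the state struct and all names are illustrative and not taken from the actual Simulink model:

function [rmse, s] = updateQualityMeasure(s, E, valid, dt, tThresh)
%UPDATEQUALITYMEASURE Recursive RMSE according to (4.2) and (4.3),
%   gated by the validity timer. s is a state struct with fields
%   k (sample count), RSoS and timer; all names are placeholders.
if valid
    s.timer = s.timer + dt;              % count only valid time instances
end
if valid && s.timer > tThresh            % learning phase has passed
    kPrev  = s.k;
    s.k    = kPrev + 1;                            % (4.2)
    s.RSoS = (kPrev*s.RSoS + E^2) / s.k;           % (4.3)
end                                      % otherwise k and RSoS are held
rmse = sqrt(s.RSoS);                     % RMSE as square root of RSoS
end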

4.4 Discussion of simulation results


First, the implementation as well as the general performance of the algorithms introduced in chapter 3 shall be discussed. Subsequently, the estimation quality is studied in terms of the achieved quality measures. For the sake of simplicity, it is assumed that the true/reference values of the parameters remain constant. All estimators are initialized with the same covariance matrix and the same parameter vector.

4.4.1 Standard Recursive Least Squares


The standard RLS with constant forgetting is implemented according to (2.16) - (2.19). For the initialization of the algorithm, the values

θ̂(0) = I_{3×1}
P(0) = 10⁻⁵ · I_{3×3}

are chosen, where θ̂(0) is multiplied element-wise with θᵀ = [2100 20 0.6] for scaling. Moreover, a forgetting factor of λ = 0.9999 is used. The algorithms are evaluated based on the data obtained from
a total of 30 test drives. During some of the test drives the vehicle travels mostly through
inner cities. Thus, due to the acceleration and braking maneuvers in inner city traffic, the input signals from ψᵀ = [ax, g, ½·ρ_air·v²] can be seen as persistently exciting and, consequently, the RLS estimator performs reasonably well. However, during test drives on highways or freeways, where the vehicle speed and acceleration remain constant for long periods of time, the insufficient excitation of the regressor can lead to noticeable estimator windup (see fig. 4.3).
Figure 4.3 Estimator windup during t = 100s - 800s in the RLS algorithm (test drive 9); velocity v; acceleration ax

For instance, test drive no. 9 takes place mostly on a free- or highway, which can be deduced from the vehicle speeds of v > 100 km/h. Due to the poorly exciting signals of vehicle speed v and acceleration ax, which is most noticeable during t = 100s - 800s, a significant drift in the parameter estimates can be observed. This is especially noticeable in the estimates of f0, whose values reach the bottom thresholds defined by the saturation limits. However, once the signals are properly exciting again (t > 800s), the estimates converge again to their reference values. This is also reflected in the eigenvalues as well as the trace of the covariance matrix. Fig. 4.4 shows the covariance matrix eigenvalues and the trace in logarithmic scale. The observed peaks in the eigenvalues (such as around t ≈ 800s) occur when the eigenvalues switch orders, i.e. the largest eigenvalue becomes the second largest or the smallest. Evidently, the inaccurate parameter estimates correspond to the relatively large eigenvalue/trace values of the covariance matrix during t = 100s - 800s. Once the input is sufficiently exciting again, the eigenvalues and thus the trace decrease and the estimates become more accurate (see figs. 4.3, 4.4).

Figure 4.4 Eigenvalues and trace of the covariance matrix for RLS in logarithmic scale (test drive 9)
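For reference, a minimal MATLAB sketch of one RLS step with constant exponential forgetting, consistent with the standard update form (variable names illustrative), reads:

function [theta, P] = rlsUpdate(theta, P, psi, y, lambda)
%RLSUPDATE One recursive least squares step with exponential
%   forgetting; psi is the regressor column vector, y the measured
%   output and lambda the forgetting factor.
e     = y - psi'*theta;                  % a-priori estimation error
K     = P*psi / (lambda + psi'*P*psi);   % estimator gain
theta = theta + K*e;                     % parameter update
P     = (P - K*(psi'*P)) / lambda;       % covariance update
end

Note that for λ < 1 and a poorly exciting regressor, the division by λ in the covariance update lets P grow steadily, which is precisely the windup effect visible in figs. 4.3 and 4.4.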

4.4.2 Regularization based algorithms


Aside from the initialization of the information matrix R(0), the two regularization based algorithms have two adjustable parameters, λ and δ_TR / δ_LMR. In the model, λ is chosen as the same value as for the RLS estimator, while δ_TR is chosen as 2·10⁻⁶ and δ_LMR as 10⁻⁸.
As already illustrated, the regularization based algorithms TR and LMR aim at making the least squares problem well-conditioned so as to prevent the information matrix R from becoming singular. As a direct consequence, it is to be expected that the covariance matrices of the regularization based algorithms have better condition numbers than that of the regular RLS. This is shown in fig. 4.5 for test drive no. 6, where one can see that the condition numbers of TR and, especially, LMR are significantly lower than those of RLS. This is also reflected in the parameter estimates for test drive no. 6, shown in fig. 4.6. One can observe that during said test drive poor excitation occurs noticeably during t = 100s - 200s as well as t = 300s - 400s. During this time, the TR and especially the LMR estimator manage to keep the condition number lower, which is directly reflected in more robust and accurate estimates compared to the regular RLS.

Figure 4.5 Condition numbers of TR and LMR in logarithmic scale (test drive 6)

Figure 4.6 Estimates of TR and LMR; velocity v; acceleration ax (test drive 6)
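To make the principle concrete, the following MATLAB sketch shows a Levenberg-Marquardt type regularized step; the exact update equations of the implemented TR and LMR algorithms may differ in detail, so this is an assumed illustrative form only:

function [theta, R, f] = lmrUpdate(R, f, psi, y, lambda, delta)
%LMRUPDATE Sketch of a regularized least squares step working on the
%   information matrix R (instead of P). The solve is regularized by
%   delta*I so that the problem stays well-conditioned even when R
%   becomes nearly singular under poor excitation.
R     = lambda*R + psi*psi';               % information matrix update
f     = lambda*f + psi*y;                  % information vector update
theta = (R + delta*eye(size(R,1))) \ f;    % regularized solve per step
end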

4.4.3 Variable forgetting based algorithms

The implementations of the VF and MF algorithms are described in 3.2.1 and 3.2.2. As already stated, the VF algorithm is basically a modified RLS with a noise variance based detection mechanism for windup, based on which λ is varied. In the described model, the algorithm has
4 adjustable parameters (not counting the initial covariance matrix). Through trial and error, these have been chosen as

λ₀ = 0.999
γ = 0.88
σn²(0) = 5·10⁻⁵
Δσn² = 10⁻⁶

Furthermore, a lower bound for λ has been set at 0.99, below which the forgetting factor is not allowed to fall.
The key concept of the VF estimator is best illustrated in fig. 4.7 for test drive no. 9, where one can clearly see the variation of λ.

Figure 4.7 Variation of the forgetting factor in the VF estimator (test drive 9)

Upon further examination of the simulation results it can be concluded that, with the given parameters, the VF estimator behaves very similarly to the RLS estimator. In fact, for almost all test drives the estimation signals of the VF and RLS estimators overlap almost completely. A small difference can only be observed on a more detailed scale of the signals (see fig. 4.8).

Figure 4.8 Excerpt of the parameter estimates of VF; velocity v; acceleration ax (test drive 9)
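For illustration, a classical variable forgetting rule of the constant-information (Fortescue) type is sketched below; it is not necessarily the exact detection rule of the VF estimator used here, but it shows how λ can be varied with the error and excitation level:

function lambda = variableForgetting(e, psi, P, Sigma0, lambdaMin)
%VARIABLEFORGETTING Constant-information type forgetting factor: the
%   larger the prediction error e relative to the excitation, the
%   smaller lambda (faster forgetting); with small errors and poor
%   excitation lambda tends to 1, i.e. no information is discarded.
lambda = 1 - e^2 / (Sigma0*(1 + psi'*P*psi));
lambda = max(lambda, lambdaMin);    % respect the lower bound on lambda
end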
Regarding the MF estimator, the algorithm adopted from [22] only covers the case of estimating two parameters. Therefore, in order to implement the algorithm for the estimation of three parameters, it is adapted accordingly to feature an additional forgetting factor.


Thus, there are three parameters which need to be adjusted. These have been determined as λm = 0.9999, λf0 = 0.9999 and λf2 = 0.99999, respectively.
The performance of the MF algorithm with the given parameters is somewhat unique, as it is capable of providing accurate estimates for some test drives while for other test drives the estimates are noise sensitive. This is best illustrated in fig. 4.9, which shows the MF estimator's performance for test drives no. 13, 9 and 6 (from left to right), all three of which are cases with a noticeable lack of excitation. One can deduce that for test drive 13 the MF estimator performs better than the RLS with regard to the f0 estimation but slightly worse with regard to the m estimation (larger deviations during t = 500s - 1000s). For test drive 9, MF provides significantly more accurate estimates for all three parameters. For test drive 6, however, the MF estimates of f2 are very noisy, although the estimates for m and f0 are fairly accurate.

Figure 4.9 Estimates of MF; velocity v; acceleration ax: left = test drive 13, middle = test drive 9, right = test drive 6

4.4.4 Covariance matrix manipulation based algorithms


The algorithms DFB, DFC, KFB-I, KFB-II and SG are all based on manipulating the covariance matrix so that forgetting is only applied to certain directions in the parameter space.
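The underlying idea can be sketched as follows; the simple projection-based decomposition shown here is an illustrative assumption and not the exact DFB, DFC, KFB or SG update:

function R = directionalForgetting(R, psi, lambda)
%DIRECTIONALFORGETTING Illustrative sketch: old information is
%   discarded only along the currently excited direction psi, so the
%   information in unexcited directions is preserved and the
%   covariance P = inv(R) cannot blow up there.
if norm(psi) == 0
    return;                     % no excitation: nothing to forget or add
end
u  = psi / norm(psi);           % unit vector of the excited direction
Ru = (u'*R*u) * (u*u');         % information aligned with psi
R  = R - (1 - lambda)*Ru;       % forget only in that direction
R  = R + psi*psi';              % add the newly arrived information
end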
Just like the RLS estimator, DFB only has λ as an adjustable parameter, which is chosen as λ = 0.9999 in the simulations. Taking e.g. test drive 13, the DFB estimator performs better than the standard RLS during the period of poor excitation (from 600s to 1000s); however, the difference is insignificant. Only with an increasing horizon (from t = 1000s onwards) do the DFB estimates show better convergence to the reference values than the RLS estimates. From the trace of the covariance matrix, one can clearly see that while the trace of the RLS covariance matrix
grows during the period of poor excitation, the trace of the DFB covariance matrix actually decreases (see fig. 4.10). In fact, starting from t = 600s (where the poor excitation begins), the trace of the DFB covariance matrix is always smaller than that of the RLS. This indicates that the windup effects are indeed limited in the DFB estimator, which also explains why, for the given test drive, the parameter estimates are more accurate than those of the RLS estimator (see fig. 4.11).

Figure 4.10 Trace of the covariance matrix of the DFB estimator in logarithmic scale (test drive 13)

Figure 4.11 Estimates of DFB; velocity v; acceleration ax (test drive 13)

Next, the DFC and KFB-I algorithms are analyzed. The DFC algorithm has two adjustable parameters, while the KFB-I has three. For the simulation these are determined as λ = 0.9999 together with a second tuning parameter of 1000 for DFC, and as 10⁻¹³, 10⁻¹⁰ and r = 1 for KFB-I.
Figure 4.12 Estimates of DFC and KFB-I; velocity v; acceleration ax (test drive 13)

Figure 4.13 Trace of the covariance matrix of DFC and KFB-I in logarithmic scale (test drive 13)
With the given parameters it has been found that DFC and KFB-I behave very similarly. For instance, in test drive no. 13 the signals of both estimators overlap almost entirely (see for instance the f0 estimates in fig. 4.12). A small difference can only be seen on a more detailed scale of the estimates (see the m estimates in fig. 4.12). Furthermore, analyzing the trace of the P-matrix for the given test drive, it can be deduced that both DFC and KFB-I have similar properties to the DFB algorithm, as both estimators restrict the growth of the covariance matrix during phases of insufficient excitation (see fig. 4.13). In fact, a general observation is that both the DFC and KFB-I estimators behave very similarly to DFB for almost all test drives.
Subsequently, the KFB-II estimator is illustrated, for which the four adjustable parameters are set as λ = 0.9999, 10⁻¹⁵, 10⁻¹⁰ and r = 1. It has been found that with the given parameters the KFB-II estimator always produces sensible estimates for f0 and f2; the estimates of m, however, can be very noise sensitive. In fact, the sensitivity varies between test drives. For instance, fig. 4.14, which displays the KFB-II estimates for test drive no. 10, shows that the mass estimates for said test drive are very noisy while the estimates of the other two parameters are reasonably smooth. In general, it has been found that the accuracy of the algorithm, especially regarding the estimation of m, depends on the properties of the considered test drives.

Figure 4.14 Estimates of the KFB-II algorithm; velocity v; acceleration ax (test drive 10)

Lastly, in the category of directional forgetting based algorithms, the behavior of the SG anti-windup estimator is discussed. For this algorithm only the convergence matrix Pd needs to be specified, which is chosen as Pd = I_{3×3} · [10⁻¹⁰, 10⁻¹², 10⁻¹²]ᵀ. In summary, it has been observed that the SG estimator behaves similarly to the above mentioned directional forgetting based algorithms (with the exception of KFB-II). Using the example of test drive no. 13, the SG estimator achieves similar estimates for the given parameters as the other directional forgetting based estimators. This is illustrated in fig. 4.15, where the estimates of DFB, DFC, KFB1 and SG are displayed. Evidently, the similarity of the estimates results in an almost complete overlap of the signals, which is why only the estimates of SG can be seen.

Figure 4.15 Estimates of DFB, DFC, KFB1 and SG; velocity v; acceleration ax (test drive 13)

4.4.5 Trace manipulation based algorithms

Finally, the CT and MT algorithms, which are based on a limitation or scaling of the covariance matrix, are evaluated.
The core concept of the CT algorithm is to keep the trace of the covariance matrix constant at all times. This is illustrated by the plot of the traces of the P-matrices of the RLS and CT algorithms for test drive no. 9 (fig. 4.16), where one can see the varying trace of the RLS covariance matrix and, in contrast, the constant trace of the CT covariance matrix.

Figure 4.16 Covariance matrix trace for CT in comparison to RLS in logarithmic scale (test drive 9)

Aside from P(0) and
the introduced P̄(0) (which is chosen as P̄(0) = P(0) in the simulations), the CT estimator has three adjustable parameters:

λ = 0.9999
c1 = 10⁻⁷
c2 = 10⁻¹¹
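A minimal sketch of the constant trace idea, assuming the common rescaling form P ← c1·P/tr(P) + c2·I after each covariance update (the exact CT equations may differ in detail):

function P = constantTraceScaling(P, c1, c2)
%CONSTANTTRACESCALING Rescale the covariance matrix after its update
%   so that its trace is fixed at c1 + n*c2, which bounds the matrix
%   and thereby limits windup. Assumed illustrative form only.
n = size(P, 1);
P = c1*P/trace(P) + c2*eye(n);
end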
The behavior of the CT algorithm is remarkable in the sense that, depending on the given data, it can produce very accurate estimates for some test drives while for others the estimates are inaccurate and noisy. For instance, for test drive no. 9 the estimates obtained with CT are very close to the actual values. Compared to the standard RLS or e.g. DFB, the CT algorithm manages to restrict the windup effects almost completely for the given test drive (see fig. 4.17).

Figure 4.17 Estimates of CT in comparison to DFB and RLS; velocity v; acceleration ax (test drive 9)

Nevertheless, the estimates of the CT algorithm are not always accurate. E.g. for test drive no. 6 the CT algorithm produces rather inaccurate estimates of the vehicle mass, while the estimates for f0 and f2 remain smooth and accurate. Although CT does not exhibit a noticeable parameter drift of the f0 and f2 signals like RLS during the poorly excited periods, the mass estimates are noisy right from the start (see fig. 4.18). Upon investigation of other test drives, noisy m but accurate f0 and f2 estimates turn out to be a defining characteristic of the CT algorithm for many test drives.

Figure 4.18 Estimates of CT; velocity v; acceleration ax (test drive 6)

Finally, the MT algorithm is examined, which has two adjustable parameters that are determined as λ₀ = 0.999 and Pmax = 10⁻⁸ · I_{3×3}. As already described, the defining principle of the MT estimator is to bound the trace of the covariance matrix from above. This way, once the trace of the P-matrix approaches the specified upper bound, the forgetting factor is set to 1 to avoid a further increase of the matrix trace. In other words, the forgetting factor varies with the development of the covariance matrix trace. This is demonstrated in fig. 4.19, where one can see that the trajectory of λ varies according to the changes of the covariance matrix trace.
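The trace-dependent choice of λ can be sketched as follows (an illustrative rule, assuming the bound is enforced simply by suspending forgetting):

function lambda = mtForgettingFactor(P, lambda0, traceMax)
%MTFORGETTINGFACTOR Forget at the nominal rate lambda0 only while
%   the covariance trace is below its upper bound; otherwise set
%   lambda = 1 so that the trace cannot grow any further.
if trace(P) < traceMax
    lambda = lambda0;
else
    lambda = 1;
end
end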
Generally, it has been observed that the MT algorithm shows similar properties to CT in the sense that for both algorithms the accuracy of the estimates depends on the characteristics of the considered test drive. A noticeable difference, however, is that the estimates of the MT algorithm are generally less noisy than those of CT. For instance, the side-by-side comparison of test drives no. 13 and no. 9 in fig. 4.20 shows that the MT algorithm can provide fairly exact estimates for test drive no. 9 (right), where the effects of windup are restricted such that the estimates are very close to the actual values. On the other hand, the left part of the figure shows that the MT algorithm can also produce inaccurate results, since the estimates of m for the given test drive 13 are too small while the f2 estimates are constantly too large over the entire simulation horizon.

Figure 4.19 Covariance matrix trace in logarithmic scale (left) and λ(k) (right) for MT

Figure 4.20 Estimates of MT; velocity v; acceleration ax: left = test drive 13, right = test drive 9

4.5 Quality criteria of estimators


Having examined the general behavior of the algorithms for some chosen test drives, the estimators are subsequently evaluated based on the proposed quality criteria. For this, the estimators' performances are simulated for all 30 test drives. Regarding the time threshold for enabling the error measure calculations, the validity duration is set to 60s for all test drives, except for test drive no. 7, for which it is set to 40s (since otherwise the calculation would not be enabled at all).
To compare the estimators, the achieved quality measures over all 30 test drives are averaged. Tab. 4.1 shows the obtained results; an illustration of the achieved quality criteria as a spider chart is represented in fig. 4.21.

Algorithms   RMSE     RMSEm    pRMSEm   nRMSEm   Δm_max   RMSEv   pRMSEv   nRMSEv

CT           108.82   134.14    99.71    67.91   263.40    0.75     0.52     0.59
DFB          125.13   100.44    52.18    55.67   133.87    0.82     0.31     0.84
DFC          126.08   100.11    51.46    55.32   132.84    0.84     0.30     0.86
KFB-I        125.09    99.87    53.54    54.59   142.14    0.82     0.30     0.84
KFB-II       119.98   108.83    72.58    63.89   191.46    0.78     0.32     0.76
LMR          123.10    99.00    49.97    57.44   136.96    0.79     0.35     0.80
MF           119.69   108.96    50.05    63.82   143.47    1.40     1.07     0.87
MT           124.99   109.28    49.99    64.82   143.25    0.81     0.33     0.84
RLS          124.19   101.87    53.59    56.90   142.26    0.80     0.32     0.82
SG           126.03   100.43    51.94    55.60   134.68    0.83     0.30     0.86
TR           126.61   101.97    55.24    56.53   143.30    0.86     0.32     0.89
VF           124.28   101.78    53.49    56.84   141.53    0.80     0.32     0.00

Table 4.1 Means of achieved quality measures of all estimators

The columns in Tab. 4.1 display the different quality measures. For each criterion, the top three values are shown in bold.

Figure 4.21 Spider chart of achieved quality measures for all estimators


Several findings can be deduced from the table as well as the spider chart. For instance, the CT, KFB-II and MF algorithms achieve the best overall RMSE values but the worst RMSEm values. Interestingly, these are the estimators for which the general observation was that the accuracy of their estimates (especially the estimates of the vehicle mass) depends on the data of the considered test drive. This has already been illustrated in fig. 4.17, which shows that the CT estimator can behave very accurately, while fig. 4.18 indicates that especially the mass estimates of CT can be very inaccurate. Therefore, despite the overall high RMSEm values, one can conclude that the good values for the RMSE measure of the CT, KFB-II and MF algorithms are mainly achieved through the relatively accurate estimates of f0 and f2.
Another observation is that the directional forgetting based algorithms (with the exception of KFB-II) all achieve good mass estimates (as indicated by the low RMSEm values) while their overall RMSE values are among the highest (even higher than the achieved score of the regular RLS). From this, one can conclude that the directional forgetting based algorithms basically show the reverse behavior of the above mentioned CT, KFB-II and MF, i.e. they produce fairly accurate mass estimates but inaccurate estimates for f0 and f2. This has already been observed in fig. 4.17, where the parameter drift of the DFB algorithm for f0 is indeed noticeable. Moreover, it should be noted that the mass estimates of those estimators are generally only slightly more accurate than the ones generated by the standard RLS. In fact, the differences regarding the mass estimates among the directional forgetting based estimators (except KFB-II) are negligible. Furthermore, it can be observed that the estimates of the directional forgetting based algorithms have among the lowest maximum deviations regarding the mass estimates. On the contrary, the Δm_max values of CT and KFB-II are among the highest, which again shows that the estimates produced by said estimators are generally the most noise sensitive.
Another observation from the table is that the differences among all estimators regarding the mass estimation are rather small. It is evident that on average, the estimators are able to predict the vehicle mass with an offset of 100 kg. Furthermore, from the positive and negative RMSEm columns one can see that, with the exception of CT and KFB-II, all algorithms tend to underestimate rather than overestimate the vehicle mass.
Regarding the achieved quality measures it can be concluded that the majority of the estimators perform similarly to the regular RLS on average. For instance, the VF algorithm displays almost the same values for all quality measures. Some estimators even perform worse with regard to specific quality measures; e.g. on average, the MF and MT algorithms achieve among the lowest scores for the mass estimates. However, further investigation has shown that when focusing solely on the test drives with noticeable windup, those estimators generate among the most accurate mass estimates. In fact, it has been found that e.g. for test drives 6, 8, 9 and 29 the MT algorithm achieves the lowest RMSEm scores, while for test drive 10 its RMSEm value is the second lowest. Similarly, MF achieves the best mass estimates on drive 27, the second best on drives 6 and 8 and the third best on drives 9 and 10. Nevertheless, the mentioned algorithms' performances on test drives which show significant excitation are often well below those of the standard RLS.
Finally, the last three columns show the obtained quality measures regarding the speed predictions, including the positive- and negative-only values. Generally, it is evident that the best speed predictions are obtained by the algorithms with the lowest RMSE scores, which are CT and KFB-II. As an exception, the MF algorithm, which has the second best RMSE score, achieves the largest errors regarding the predicted vehicle speed. Nevertheless, the differences between the individual estimators regarding the speed quality measures are negligible, since the differences in the speed predictions are in the range of 10⁻² m/s. Similar to the mass predictions, it can be deduced that the estimators tend to under-predict rather than over-predict the speed.


5 Summary and discussion


In the study, the on-line estimation of vehicle parameters such as the mass and the driving resistances (denoted as m, f0 and f2, respectively) for use in advanced driver assistance systems has been investigated. The inability to measure said parameters via sensors drives the necessity for a model-based estimation approach. For this purpose, a recursive on-line estimation method is applied.
One of the most popular estimation techniques is the recursive least squares method. Under
normal circumstances, i.e. with sufficient excitation of the process, the RLS is capable of generating fairly accurate estimates of the required parameters. However, when excitation is poor, the
accuracy of the estimates is compromised due to a phenomenon known as estimator windup. It
has been shown that windup occurs because the covariance matrix of the estimates increases
continuously when the process is not properly excited. This ultimately causes the estimates to
become very noise sensitive and can lead to a drift of the parameters from their true values.
Since the accuracy of the estimates is of high importance in driver assistance systems, alternatives to the RLS estimator are investigated. It has been indicated in various studies that the main principle of windup avoidance is to introduce a safety mechanism in order to prevent the unbounded increase of the covariance matrix [16]. This can be achieved in many ways, which can be categorized based on the used core principle.
The first category of estimators investigated in the present study are the regularization based
techniques. These estimators utilize an additional regularization term in the recursive update
in order to achieve the well-conditioning of the information matrix. This way, the information
matrix stays invertible and the covariance matrix as its inverse is prevented from becoming
infinitely large. Contrary to other estimators, both regularization based algorithms TR and LMR
utilize R in their update equations instead of P . Generally, this can be seen as a drawback since
the information matrix is mainly used when deriving an algorithm and analyzing its performance.
For the implementation, it is often more computationally efficient to use the covariance matrix
in order to avoid a matrix inversion operation at each update, which in the case of TR and
LMR is not possible. Nevertheless, since only three parameters need to be estimated in the
underlying model the matrix inversion should not pose a serious computational problem. From
the simulations it can be summarized that while the TR algorithm often performs worse than
the regular RLS, the LMR shows decent results with regards to most quality measures.
Subsequently, the variable forgetting based algorithms have been investigated. In this category, the VF algorithm performs a time-variant forgetting operation in which poor excitation and a change of parameters, respectively, are detected based on the measured noise variance level and the estimation error. This way, the forgetting factor can be varied accordingly. This estimator has four adjustable parameters, which on the one hand offers a large degree of freedom in adjusting the estimator's performance but on the other hand leads to the necessity of parameter fine-tuning. In the simulations the VF estimator has shown a behavior very similar to that of the standard RLS.



On the other hand, the MF algorithm uses an individual forgetting factor for each parameter to
be estimated. This way, different rates of parameter variation can be taken into account and
windup can be avoided. In the evaluation it has been found that although the overall achieved
RMSE score is among the best, the MF estimator often generates very noise sensitive estimates of the vehicle mass. Thus, the mass estimates of MF are among the least accurate in an RMSE sense.
Another category of estimation techniques is based on a manipulation of the covariance matrix.
These estimation methods are based on the directional forgetting of data, i.e. old information is
discarded non-uniformly only in excited directions of the parameter space which prevents the
unbounded increase of the covariance matrix. Therefore, while algorithms such as VF apply
non-uniform forgetting in time, the algorithms of this category apply non-uniform forgetting
in the parameter space. In the experimental study it has been found that with the exception
of KFB-II, the directional forgetting based algorithms behave very similarly. In general, these
algorithms achieve the most accurate mass estimations, however, the estimates of the other two
parameters are rarely more accurate than those of the standard RLS. On the other hand, the
KFB-II estimator exhibits the reverse behavior as it is capable of generating accurate estimates
for f0 and f2 (as indicated by the overall RMSE) while the mass estimates are often too noise sensitive. Nevertheless, it should be possible to improve the algorithm's performance through more detailed parameter fine-tuning.
Alternatively, the covariance matrix trace manipulating algorithms also restrict the increase
of the covariance matrix but from a different perspective than the directional forgetting based
estimators. For instance, the CT algorithm is based on scaling the trace of the P-matrix such that a bounding of the matrix is achieved through keeping its trace constant. Similar to MF or KFB-II, the mass estimates of the constant trace algorithm are dependent on the used experimental data. The overall lowest RMSE and RMSEv values indicate that the CT estimates of f0 and f2 are among the most accurate, while the highest RMSEm scores indicate that the mass estimates are too noisy.
A similar bounding principle is used by the MT estimator, which limits the trace of the P -matrix
to a maximum value. Since the MT algorithm bounds the covariance matrix trace indirectly
through variation of the forgetting factor, it would have also been sensible to categorize said
algorithm as a variable forgetting factor based one. In an overall RMSE sense, the MT algorithm has been observed to generally perform worse than the regular RLS. Interestingly though, with regard to the mass estimation, the algorithm is capable of generating very accurate estimates for m, especially for test drives which show a noticeable lack of excitation.
As already shown, during the comparison and evaluation of the proposed estimators it has become evident that some algorithms demonstrate a unique behavior in the sense that the accuracy of the obtained estimates depends on the experimental data considered. For instance, the CT algorithm is capable of generating very accurate estimates in some cases, but noisy and inaccurate estimates in others. A possible explanation for this phenomenon, which is also observed for other algorithms such as KFB-II and MF, can be found in the regression vector ψᵀ. Namely, apart from the constant variable g, the other regressor variables ax and v are time-varying. Especially ax, which is computed as v̇ + g·sin(α), can be noisy since v̇ is obtained through differentiation of the vehicle speed. Thus, depending on the circumstances of a test drive, the noise levels of said regressors can differ. Therefore, it is possible that with the defined set of parameters, different noise levels in the regressors can cause different kinds of noise sensitivity in the estimator. In this respect, it should be possible to improve the behavior of said estimators through further parameter fine-tuning.
This, however, leads to another problem. All estimation algorithms introduced are adopted from the literature; however, almost none of them offer any recommendation on how to adjust the tunable parameters. Therefore, in the simulation study almost all parameters have been adjusted based on a trial and error approach. Since some estimators have three or even four parameters, it can be assumed that not all estimators are tuned appropriately to generate the best estimates. This indicates that the results obtained in the study are not definite, since the performance of an estimator is indeed strongly dependent on the chosen parameters. This is best illustrated by the side-by-side comparison of the mass estimation of the CT algorithm using two different sets of parameters. The left and right plots of fig. 5.1 show the obtained estimates for test drives no. 9 and no. 6, respectively, with the blue line representing the original parameter settings used in the study (c1 = 10⁻⁷, c2 = 10⁻¹¹) and the red line corresponding to the new settings (c1 = 10⁻⁸, c2 = 10⁻¹²).
Figure 5.1 Comparison of the mass estimates of CT using two different sets of parameters; left = test drive 9, right = test drive 6

Evidently, using the new parameter settings leads to improved
mass estimates for test drive no. 6. However, the parameters are now too insensitive given the data of test drive no. 9. As a consequence, the mass is barely tracked and the algorithm constantly overestimates it. This is also reflected in the quality measures: it has been found that for test drive no. 6 the overall RMSEm score is significantly improved from 103 to 62 when the new parameter settings are used, whereas for test drive no. 9 the results are significantly worsened as the error measure increases from 35 to 59. This example illustrates that there is no optimal setting of the parameters which yields the best results under all possible circumstances.
In a similar fashion, the initial covariance matrix as well as the initial regressor can influence an estimator's performance. Since e.g. a large initial matrix indicates a great uncertainty of the data, the estimator's accuracy can be significantly worsened when the initial matrix is chosen too large. Similarly, the forgetting factor also plays a significant role with regard to the estimation quality, as already discussed in 2.1.2. Therefore, repeating the simulations with different parameter settings could lead to different results.
Giving a recommendation regarding the best or most robust estimator based on the simulation results is difficult for the above mentioned reasons. For instance, the directional forgetting based algorithms such as DFB, DFC, KFB-I and SG show on average the most accurate mass estimates; however, the differences to the RLS estimates are almost negligible. Furthermore, their estimates of the other two parameters still show a significant drift, especially during periods of poor excitation. In this regard, other algorithms such as CT or KFB-II perform significantly better. In summary, it can be concluded from the study that even if an algorithm performs reasonably well for one or a variety of circumstances, it can perform badly in others. Therefore, it is difficult to give an exact recommendation regarding which algorithm performs best. However, one observation from the 30 test drives is that the most robust algorithm appears to be the LMR estimator: it achieves among the best scores on the mass and speed estimates, while the results in an overall RMSE sense as well as regarding the maximum deviations in the mass estimates are decent, as it achieves the 4th lowest score on the RMSE and Δm_max measures.
For a continuation of the study it is recommended to repeat the simulations with different parameter adjustments, additional test drives and/or additional estimators. For instance, the similarity of the results observed for a majority of the estimators in comparison to the regular RLS can likely be avoided when a lower forgetting factor is chosen. Furthermore, the algorithms proposed in this study only target the problem of estimator windup specifically. In fact, there are other estimators which target different shortcomings of the RLS estimator. For instance, an algorithm called the recursive generalized total least squares (RGTLS) targets errors in the regressor vector, unlike the regular RLS which only considers the error of the output vector but not of the regressor. Moreover, the RLS is derived based on a quadratic error function which penalizes large estimation errors disproportionately. Alternatively, there are algorithms such as the robust M-estimator which utilize a non-quadratic error function [1].


Bibliography

[1] S. Altmannshofer: Robuste, Onlinefähige Schätzung von Fahrzeugmasse und Fahrwiderständen. In: AUTOREG; Baden-Baden, 2015.
[2] K. J. Åström and B. Wittenmark: Adaptive Control. Second Edition. Dover, 2008; pp. 1-574.
[3] S. Bittanti and M. Campi: Adaptive RLS Algorithms under Stochastic Excitation - Strong Consistency Analysis. In: Systems & Control Letters, vol. 17, no. 1, 1991; pp. 3-8.
[4] S. Bittanti, P. Bolzern, and M. Campi: Convergence and Exponential Convergence of Identification Algorithms with Directional Forgetting Factor. In: Automatica, vol. 26, no. 5, 1990; pp. 929-932.
[5] S. Bittanti, P. Bolzern, and M. Campi: Recursive Least Squares Identification Algorithms with Incomplete Excitation: Convergence Analysis and Application to Adaptive Control. In: IEEE Transactions on Automatic Control, vol. 35, no. 12, 1990; pp. 1371-1373.
[6] L. Cao and H. Schwartz: A Directional Forgetting Algorithm Based on the Decomposition of the Information Matrix. In: Automatica, vol. 36, no. 11, 2000; pp. 1725-1731.
[7] L. Cao and H. Schwartz: A Novel Recursive Algorithm for Directional Forgetting. In: Proceedings of the 1999 American Control Conference, vol. 2, 1999; pp. 1334-1338.
[8] L. Cao and H. Schwartz: Analysis of the Kalman Filter Based Estimation Algorithm: An Orthogonal Decomposition Approach. In: Automatica, vol. 40, no. 1, 2004; pp. 5-19.
[9] L. Cao and H. Schwartz: The Kalman Filter Based Recursive Algorithm - Windup and Its Avoidance. In: Proceedings of the American Control Conference; Arlington, 2001; pp. 3606-3611.
[10] G. Golub, P. C. Hansen, and D. O'Leary: Tikhonov Regularization and Total Least Squares. In: SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 1, 1999; pp. 185-194.
[11] S. Gunnarsson: Combining Tracking and Regularization in Recursive Least Squares Identification. In: Linköping University Electronic Press, 1996.
[12] L. Gustafsson and M. Olsson: Robust On-line Estimation. Master Thesis. Lund Institute of Technology, 1999; pp. 1-78.
[13] X. Hu and L. Ljung: New Convergence Results for Least Squares Identification Algorithm. In: The International Federation of Automatic Control (ed.): Proceedings of the 17th IFAC World Congress; Seoul, 2008; pp. 5030-5035.
[14] R. Isermann and M. Münchhof: Identification of Dynamic Systems. Springer Verlag, 2011; pp. 1-705.
[15] R. Johnstone et al.: Exponential Convergence of Recursive Least Squares with Exponential Forgetting Factor. In: 21st IEEE Conference on Decision and Control; Orlando, 1982; pp. 994-997.
[16] P. Krus and S. Gunnarsson: Adaptive Control of a Hydraulic Crane Using Online Identification. In: Linköping University Electronic Press, 1993.
[17] J. Parkum, N. K. Poulsen, and J. Holst: Recursive Forgetting Algorithms. In: International Journal of Control, vol. 55, no. 1, 1992; pp. 109-128.
[18] M. Salgado, G. Goodwin, and R. Middleton: Modified Least Squares Algorithm Incorporating Exponential Resetting and Forgetting. In: International Journal of Control, vol. 47, no. 2, 1988; pp. 477-491.
[19] T. Seidman and C. Vogel: Well Posedness and Convergence of Some Regularisation Methods for Non-linear Ill Posed Problems. In: Inverse Problems, vol. 5, no. 2, 1989; pp. 227-241.
[20] B. Stenlund and F. Gustafsson: Avoiding Windup in Recursive Parameter Estimation. In: Preprints of Reglermöte, 2002; pp. 148-153.
[21] A. Tikhonov and V. Arsenin: Solutions of Ill-Posed Problems. In: Mathematics of Computation, vol. 32, no. 144, 1978; pp. 1320-1322.
[22] A. Vahidi, A. Stefanopoulou, and H. Peng: Recursive Least Squares with Forgetting for Online Estimation of Vehicle Mass and Road Grade: Theory and Experiments. In: International Journal of Vehicle Mechanics and Mobility, vol. 43, no. 1, 2005; pp. 31-55.
[23] T. Van Waterschoot, G. Rombouts, and M. Moonen: Optimally Regularized Recursive Least Squares for Acoustic Echo Cancellation. In: Proceedings of the Second Annual IEEE BENELUX/DSP Valley Signal Processing Symposium (SPS-DARTS 2006); Antwerp, Belgium, 2005; pp. 28-29.


List of Figures

2.1 Geometric interpretation of the least squares estimate [2]
2.2 Comparison of different forgetting factors λ = 0.9 (left) and λ = 0.95 (right) in the RLS estimator with exponential forgetting (true values - dashed lines, estimated values - solid lines) [14]
2.3 Effects of estimator windup caused by a constant regressor: control variable (top left), covariance matrix element (top right), trajectories of estimates (bottom) [2]
2.4 Snapshot representation of a 2×2 covariance matrix as an ellipse at time k
4.1 Simulink Model of test study
4.2 Structure of algorithm blocks
4.3 Estimator windup during t = 100s - 800s in the RLS algorithm (test drive 9); velocity v; acceleration ax
4.4 Eigenvalues and trace of the covariance matrix for RLS in logarithmic scale (test drive 9)
4.5 Condition numbers of TR and LMR in logarithmic scale (test drive 6)
4.6 Estimates of TR and LMR; velocity v; acceleration ax (test drive 6)
4.7 Variation of the forgetting factor in the VF estimator (test drive 9)
4.8 Excerpt of the parameter estimates of VF; velocity v; acceleration ax (test drive 9)
4.9 Estimates of MF; velocity v; acceleration ax: left = test drive 13, middle = test drive 9, right = test drive 6
4.10 Trace of the covariance matrix of the DFB estimator in logarithmic scale (test drive 13)
4.11 Estimates of DFB; velocity v; acceleration ax (test drive 13)
4.12 Estimates of DFC and KFB-I; velocity v; acceleration ax (test drive 13)
4.13 Trace of the covariance matrix of DFC and KFB-I in logarithmic scale (test drive 13)
4.14 Estimates of the KFB-II algorithm; velocity v; acceleration ax (test drive 10)
4.15 Estimates of DFB, DFC, KFB1 and SG; velocity v; acceleration ax (test drive 13)
4.16 Covariance matrix trace for CT in comparison to RLS in logarithmic scale (test drive 9)
4.17 Estimates of CT in comparison to DFB and RLS; velocity v; acceleration ax (test drive 9)
4.18 Estimates of CT; velocity v; acceleration ax (test drive 6)
4.19 Covariance matrix trace in logarithmic scale (left) and λ(k) (right) for MT
4.20 Estimates of MT; velocity v; acceleration ax: left = test drive 13, right = test drive 9
4.21 Spider chart of achieved quality measures for all estimators
5.1 Comparison of the mass estimates of CT using two different sets of parameters; left = test drive 9, right = test drive 6

List of Tables

4.1 Means of achieved quality measures of all estimators