Optimum Filters
FIR Wiener Filter
Non-causal IIR Wiener Filter
Causal IIR Wiener Filter
Introduction to Adaptive Filters
Filtering: estimation of one signal from another (a desired signal that is noisy or distorted, e.g., a speech signal, a radar signal, an image).
In a very simple and idealized environment, it may be possible to design a classical filter such as a lowpass, highpass, bandpass, or bandstop filter to restore the desired signal from the measured data.
But will these filters be optimum in the sense of producing the best estimate of the signal?
Optimal filter theory was developed to provide structure to the process of selecting
the most appropriate frequency characteristics.
If a representation of the desired signal is available, then a well-developed and
popular class of filters known as Wiener Filters can be applied.
The Wiener filter solves the signal estimation problem for stationary signals.
The basic concept behind Wiener filter theory is to minimize the difference between the filtered output and some desired output. This minimization is based on the least mean square approach, which adjusts the filter coefficients to reduce the square of the difference between the desired and actual waveforms after filtering.
Consider a linear discrete-time filter with impulse response h(0), h(1), h(2), … (IIR, or FIR, which is inherently stable).
y(n) is the estimate of the desired response d(n).
e(n) is the estimation error, i.e., the difference between the filter output and the desired response.
As the error signal e(n) → 0, the output of the filter y(n) → d(n).
Now, the goal is to find the impulse response coefficients h(n) of this filter which
would result in the smallest possible estimation error e(n).
Hence the question arises: how do we minimize the estimation error? In other words, which quantity or function associated with the estimation error are we attempting to minimize?
The performance function of this estimation error is also known as cost function or
objective function. The cost function can be defined in a statistical or deterministic
framework.
Two examples of cost functions are as follows:
In the statistical approach, the cost function is the mean-square value of the error signal. For stationary input and desired signals, minimizing the mean-square error results in linear filters known as Wiener filters, which are optimum in the sense of the minimum mean-square error (MMSE).
In the deterministic approach, the cost function is the sum of squared errors, which results in least-squares filters.
Here, optimum means the best under the given set of assumptions and conditions.
The observed signal is
$$x(n) = s(n) + w(n)$$
where
x(n) = observed or received signal (input signal to the optimum filter)
s(n) = signal of interest
w(n) = additive random noise (of known statistics)
The error signal is
$$e(n) = d(n) - y(n)$$
where
d(n) = desired signal
y(n) = filter output
Filter length: M. The filter output is the convolution
$$y(n) = h(n) * x(n) = \sum_{l=0}^{M-1} h(l)\,x(n-l) \qquad (4)$$
The Wiener filter design problem requires that we find the filter coefficients, h(n), that minimize the mean-square error
$$\xi = E\{|e(n)|^2\} \qquad (5)$$
For a set of filter coefficients to minimize ξ, it is necessary and sufficient that the derivative of ξ with respect to h*(k) be equal to zero for k = 0, 1, …, M−1:
$$\frac{\partial \xi}{\partial h^*(k)} = 0 \qquad (6)$$
$$\frac{\partial \xi}{\partial h^*(k)} = \frac{\partial}{\partial h^*(k)}\,E\{e(n)\,e^*(n)\} = E\left\{e(n)\,\frac{\partial e^*(n)}{\partial h^*(k)}\right\} = 0 \qquad (7)$$
with
$$e^*(n) = d^*(n) - h^*(0)\,x^*(n) - h^*(1)\,x^*(n-1) - \cdots - h^*(M-1)\,x^*(n-M+1)$$
FIR Wiener Filter (contd.)
Therefore,
$$\frac{\partial e^*(n)}{\partial h^*(0)} = -x^*(n)$$
$$\frac{\partial e^*(n)}{\partial h^*(1)} = -x^*(n-1)$$
$$\frac{\partial e^*(n)}{\partial h^*(2)} = -x^*(n-2)$$
$$\vdots$$
$$\frac{\partial e^*(n)}{\partial h^*(k)} = -x^*(n-k) \qquad (8)$$
Substituting into (7),
$$\frac{\partial \xi}{\partial h^*(k)} = -E\{e(n)\,x^*(n-k)\} = 0 \qquad (9)$$
which is known as the orthogonality principle or the projection theorem (the error at the minimum is uncorrelated with the filter input).
(" M 1
# )
X
E d(n) h(l)x(n l) x (n k) =0
l=0
( M 1
)
X
E d(n)x (n k) h(l)x(n l)x (n k) =0
l=0
M
X 1
E {d(n)x (n k)} h(l)E {x(n l)x (n k)} = 0 (10)
l=0
M
X 1
rdx (k) h(l)rxx (k l) = 0
l=0
or equivalently
M
X 1
h(l)rxx (k l) = rdx (k); k = 0, 1, 2, , M 1 (11)
l=0
In matrix form,
$$\begin{bmatrix} r_{xx}(0) & r_{xx}^*(1) & \cdots & r_{xx}^*(M-1) \\ r_{xx}(1) & r_{xx}(0) & \cdots & r_{xx}^*(M-2) \\ \vdots & \vdots & \ddots & \vdots \\ r_{xx}(M-1) & r_{xx}(M-2) & \cdots & r_{xx}(0) \end{bmatrix}\begin{bmatrix} h(0) \\ h(1) \\ \vdots \\ h(M-1) \end{bmatrix} = \begin{bmatrix} r_{dx}(0) \\ r_{dx}(1) \\ \vdots \\ r_{dx}(M-1) \end{bmatrix} \qquad (12)$$
or, compactly,
$$R_x h = r_{dx} \qquad (13)$$
where
R_x is an M × M Hermitian Toeplitz matrix of autocorrelations,
h is the vector of filter coefficients, and
r_dx is the vector of cross-correlations between the desired signal d(n) and the observed signal x(n).
The solution for the optimum filter coefficients is
$$h_{\mathrm{opt}} = R_x^{-1}\,r_{dx} \qquad (14)$$
(" M 1
# )
X
min = E{e(n)d (n)} = E d(n) hopt (l)x(n l) d (n)
l=0
M
X 1
= E{d(n)d (n)} hopt (l) E{x(n l)d (n)}
l=0
| {z }
=rdx
(l)
M
X 1
min = rd (0) hopt (l)rdx (l) (15)
l=0
min = rd (0) rH
dx hopt (16)
Alternatively, since h_opt = R_x⁻¹ r_dx, the MMSE ξ_min may also be written explicitly in terms of the autocorrelation matrix R_x and the cross-correlation vector r_dx as follows:
$$\xi_{\min} = r_d(0) - r_{dx}^H\,R_x^{-1}\,r_{dx} \qquad (17)$$
Consider two special cases: (I) filtering, d(n) = s(n), and (II) prediction, d(n) = s(n + D).
Case I: If d(n) = s(n), the linear estimation problem is referred to as filtering. In the filtering problem, s(n) is to be estimated from the noise-corrupted observation x(n) = s(n) + w(n).
Now consider the required correlations. Assuming the noise is zero mean and uncorrelated with s(n),
$$r_{xx}(k) = r_{ss}(k) + r_{ww}(k) \qquad (18)$$
$$r_{dx}(k) = E\{s(n)\,x^*(n-k)\} = r_{ss}(k) \qquad (19)$$
In matrix form,
$$R_x = R_s + R_w \qquad (20)$$
and the Wiener-Hopf equations become
$$\sum_{l=0}^{M-1} h(l)\left[r_{ss}(k-l) + r_{ww}(k-l)\right] = r_{ss}(k); \quad k = 0, 1, 2, \ldots, M-1 \qquad (21)$$
or, equivalently,
$$[R_s + R_w]\,h = r_s \qquad (22)$$
Case II: If d(n) = s(n + D), where D > 0, the linear estimation problem is referred to as signal prediction.
Assumption: the noise has zero mean and is uncorrelated with s(n), i.e.,
$$E\{s(n+D)\,w^*(n-k)\} = 0 \qquad (23)$$
Therefore,
$$r_{dx}(k) = E\{s(n+D)\,x^*(n-k)\} = r_{ss}(k+D) \qquad (24)$$
and the Wiener-Hopf equations become
$$\sum_{l=0}^{M-1} h(l)\left[r_{ss}(k-l) + r_{ww}(k-l)\right] = r_{ss}(k+D); \quad k = 0, 1, 2, \ldots, M-1 \qquad (25)$$
Note: In all cases, the correlation matrix to be inverted is Toeplitz. Hence the
Levinson-Durbin algorithm may be used to solve for the optimum filter coefficients.
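As an illustration, the sketch below solves the Wiener-Hopf equations (13) numerically. It assumes NumPy and SciPy are available; `fir_wiener` is just an illustrative helper name, and `scipy.linalg.solve_toeplitz` exploits the Toeplitz structure via a Levinson-Durbin-type recursion.

```python
# Sketch: solve the Wiener-Hopf equations R_x h = r_dx for a length-M FIR
# Wiener filter, exploiting the Toeplitz structure of R_x.
import numpy as np
from scipy.linalg import solve_toeplitz

def fir_wiener(r_xx, r_dx, r_d0):
    """Return the optimum FIR Wiener filter coefficients and the MMSE.

    r_xx : first column of R_x, i.e. r_xx(0), ..., r_xx(M-1)
    r_dx : cross-correlations r_dx(0), ..., r_dx(M-1)
    r_d0 : r_d(0), the power of the desired signal
    """
    h = solve_toeplitz(r_xx, r_dx)        # solves R_x h = r_dx, eq. (13)
    mmse = r_d0 - np.conj(r_dx) @ h       # eq. (16): r_d(0) - r_dx^H h_opt
    return h, mmse

# Values from the example below: r_xx(k) = r_ss(k) + r_ww(k) with
# r_ss(k) = 0.6^{|k|} and r_ww(k) = delta(k), so r_xx = [2, 0.6].
h, mmse = fir_wiener([2.0, 0.6], [1.0, 0.6], 1.0)
print(h, mmse)                            # ~ [0.451, 0.165], ~ 0.45
```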
Let us consider a signal x(n) = s(n) + w(n), where s(n) is an AR(1) process that satisfies the difference equation
$$s(n) = 0.6\,s(n-1) + v(n)$$
where v(n) is a white noise sequence with variance σ_v² = 0.64, and w(n) is a white noise sequence with variance σ_w² = 1. We shall design a filter of length M = 2 to estimate s(n).
An AR(1) process s(n) is obtained by exciting a single-pole filter by white noise v(n). The frequency response of the single-pole filter is given by
$$H_1(e^{j\omega}) = \frac{1}{1 - 0.6\,e^{-j\omega}}$$
The autocorrelation of s(n) is therefore
$$r_{ss}(k) = \frac{\sigma_v^2}{1 - (0.6)^2}\,(0.6)^{|k|} = (0.6)^{|k|}$$
so that r_ss(0) = 1 and r_ss(1) = 0.6, while r_ww(k) = δ(k).
The Wiener-Hopf equations (21) are
$$\sum_{l=0}^{1} h(l)\left[r_{ss}(k-l) + r_{ww}(k-l)\right] = r_{ss}(k); \quad k = 0, 1.$$
For k = 0:
$$\sum_{l=0}^{1} h(l)\left[r_{ss}(-l) + r_{ww}(-l)\right] = r_{ss}(0)$$
$$h(0)\left[r_{ss}(0) + r_{ww}(0)\right] + h(1)\left[r_{ss}(1) + r_{ww}(1)\right] = r_{ss}(0)$$
$$2h(0) + 0.6h(1) = 1$$
For k = 1:
$$\sum_{l=0}^{1} h(l)\left[r_{ss}(1-l) + r_{ww}(1-l)\right] = r_{ss}(1)$$
$$h(0)\left[r_{ss}(1) + r_{ww}(1)\right] + h(1)\left[r_{ss}(0) + r_{ww}(0)\right] = r_{ss}(1)$$
$$0.6h(0) + 2h(1) = 0.6$$
Solving these two equations gives h(0) = 0.451 and h(1) = 0.165.
MMSE:
$$\xi_{\min} = r_d(0) - \sum_{l=0}^{M-1} h_{\mathrm{opt}}(l)\,r_{dx}^*(l)$$
$$\xi_{\min} = r_s(0) - \sum_{l=0}^{1} h_{\mathrm{opt}}(l)\,r_{ss}(l)$$
$$= 1 - h(0)\,r_{ss}(0) - h(1)\,r_{ss}(1) = 0.45$$
The system function of the designed filter is
$$H(z) = \sum_{n=0}^{1} h(n)z^{-n} = h(0) + h(1)z^{-1}$$
$$H(z) = 0.451 + 0.165z^{-1}$$
$$H(e^{j\omega}) = 0.451 + 0.165\,e^{-j\omega}$$
The power spectral density of s(n) is
$$R_{ss}(e^{j\omega}) = \frac{0.64}{|1 - 0.6\,e^{-j\omega}|^2} = \frac{0.64}{1.36 - 1.2\cos\omega}$$
[Figure: the magnitude of the frequency response of the Wiener filter (left) and the PSD of the process s(n) (right), each plotted against frequency ω from 0 to π.]
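The two plots can be reproduced with the short sketch below, assuming NumPy and matplotlib are available; the expressions are the ones derived above.

```python
# Sketch: plot |H(e^jw)| = |0.451 + 0.165 e^{-jw}| and the PSD
# R_ss(e^jw) = 0.64 / (1.36 - 1.2 cos w) over 0 <= w <= pi.
import numpy as np
import matplotlib.pyplot as plt

w = np.linspace(0, np.pi, 512)
H = np.abs(0.451 + 0.165 * np.exp(-1j * w))  # magnitude response of the filter
Rss = 0.64 / (1.36 - 1.2 * np.cos(w))        # PSD of the AR(1) process s(n)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3))
ax1.plot(w, H)
ax1.set_title("Wiener filter magnitude response")
ax2.plot(w, Rss)
ax2.set_title("PSD of s(n)")
for ax in (ax1, ax2):
    ax.set_xlabel("frequency (rad/sample)")
    ax.set_ylabel("magnitude")
plt.tight_layout()
plt.show()
```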
Since the power spectrum of the noise is constant for all ω, the SNR decreases with increasing ω. Thus, it follows that the Wiener filter should have a frequency response whose magnitude decreases with increasing ω.
The input SNR is
$$\mathrm{SNR} = \frac{\sigma_s^2}{\sigma_w^2} = 1$$
or SNR = 0 dB.
The power of the signal component at the output y(n) of the Wiener filter is given by
$$\sigma_{ys}^2 = \sum_{l=-\infty}^{\infty}\sum_{m=-\infty}^{\infty} h(l)\,r_{ss}(m-l)\,h^*(m) = \sum_{l=0}^{M-1}\sum_{m=0}^{M-1} h(l)\,r_{ss}(m-l)\,h^*(m)$$
In matrix form,
$$\sigma_{ys}^2 = \begin{bmatrix} h(0) & h(1) \end{bmatrix}\begin{bmatrix} r_{ss}(0) & r_{ss}(1) \\ r_{ss}(1) & r_{ss}(0) \end{bmatrix}\begin{bmatrix} h(0) \\ h(1) \end{bmatrix}$$
$$\sigma_{ys}^2 = h^H R_s\,h$$
Similarly, the noise power at the output is
$$\sigma_{yn}^2 = h^H R_w\,h = \begin{bmatrix} h(0) & h(1) \end{bmatrix}\begin{bmatrix} r_{ww}(0) & r_{ww}(1) \\ r_{ww}(1) & r_{ww}(0) \end{bmatrix}\begin{bmatrix} h(0) \\ h(1) \end{bmatrix}$$
$$\sigma_{yn}^2 = \begin{bmatrix} 0.451 & 0.165 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0.451 \\ 0.165 \end{bmatrix} = 0.2306$$
The output SNR is therefore
$$\mathrm{SNR} = 10\log_{10}\frac{\sigma_{ys}^2}{\sigma_{yn}^2} = 10\log_{10}\frac{0.3199}{0.2306} = 1.4215\ \mathrm{dB}$$
Thus, the Wiener filter increases the SNR by more than 1 dB.
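The numbers in this example are easy to verify numerically; the following sketch (assuming NumPy) reproduces the filter coefficients, the MMSE, and the output SNR derived above.

```python
# Numerical check of the example: r_ss(k) = 0.6^{|k|}, r_ww(k) = delta(k), M = 2.
import numpy as np

Rs = np.array([[1.0, 0.6], [0.6, 1.0]])  # signal autocorrelation matrix R_s
Rw = np.eye(2)                           # noise autocorrelation matrix R_w
rs = np.array([1.0, 0.6])                # r_dx(k) = r_ss(k) for filtering

h = np.linalg.solve(Rs + Rw, rs)         # eq. (22): [R_s + R_w] h = r_s
mmse = 1.0 - h @ rs                      # eq. (15) with r_d(0) = r_ss(0) = 1

sig_pow = h @ Rs @ h                     # sigma_ys^2 = h^H R_s h
noise_pow = h @ Rw @ h                   # sigma_yn^2 = h^H R_w h
snr_db = 10 * np.log10(sig_pow / noise_pow)

print(h, mmse, snr_db)                   # ~ [0.4505, 0.1648], ~0.45, ~1.42 dB
```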
Non-causal IIR Wiener Filter
The problem formulation is the same for both the FIR and the IIR Wiener filter.
For the FIR Wiener filter there are only a finite number of coefficients that must be determined, whereas for the IIR Wiener filter there are an infinite number of unknowns, i.e., the values of h(n) for all n.
Problem formulation: find the unit sample response, h(n), of the IIR filter
$$H(z) = \sum_{n=-\infty}^{\infty} h(n)z^{-n}$$
that minimizes the mean-square error
$$\xi = E\{|e(n)|^2\} \qquad (26)$$
where e(n) is the difference between the desired process d(n) and the output of the Wiener filter, y(n):
$$e(n) = d(n) - y(n) = d(n) - \sum_{l=-\infty}^{\infty} h(l)\,x(n-l) \qquad (27)$$
This problem may be solved in exactly the same way as the FIR Wiener filtering problem, i.e., by differentiating ξ with respect to h*(k) for each k and setting the derivatives equal to zero:
$$\frac{\partial \xi}{\partial h^*(k)} = -E\{e(n)\,x^*(n-k)\} = 0; \quad -\infty < k < \infty \qquad (28)$$
$$E\left\{\left[d(n) - \sum_{l=-\infty}^{\infty} h(l)\,x(n-l)\right]x^*(n-k)\right\} = 0$$
$$E\left\{d(n)\,x^*(n-k) - \sum_{l=-\infty}^{\infty} h(l)\,x(n-l)\,x^*(n-k)\right\} = 0$$
$$E\{d(n)\,x^*(n-k)\} - \sum_{l=-\infty}^{\infty} h(l)\,E\{x(n-l)\,x^*(n-k)\} = 0 \qquad (29)$$
$$r_{dx}(k) - \sum_{l=-\infty}^{\infty} h(l)\,r_{xx}(k-l) = 0$$
or equivalently
$$\sum_{l=-\infty}^{\infty} h(l)\,r_{xx}(k-l) = r_{dx}(k); \quad -\infty < k < \infty \qquad (30)$$
which are the Wiener-Hopf equations of the non-causal IIR Wiener filter.
Since Eq. (30) holds for all k, it can be written as the convolution
$$h(k) * r_{xx}(k) = r_{dx}(k) \qquad (31)$$
Taking the discrete-time Fourier transform of both sides,
$$H(e^{j\omega})\,R_{xx}(e^{j\omega}) = R_{dx}(e^{j\omega}) \qquad (32)$$
so that
$$H(e^{j\omega}) = \frac{R_{dx}(e^{j\omega})}{R_{xx}(e^{j\omega})} \qquad (33)$$
or, in terms of z-transforms,
$$H(z) = \frac{R_{dx}(z)}{R_{xx}(z)} \qquad (34)$$
The MMSE ξ_min is
$$\xi_{\min} = E\{e(n)\,d^*(n)\} = E\left\{\left[d(n) - \sum_{l=-\infty}^{\infty} h(l)\,x(n-l)\right]d^*(n)\right\}$$
$$= E\{d(n)\,d^*(n)\} - \sum_{l=-\infty}^{\infty} h(l)\,\underbrace{E\{x(n-l)\,d^*(n)\}}_{=\,r_{dx}^*(l)}$$
$$\xi_{\min} = r_d(0) - \sum_{l=-\infty}^{\infty} h(l)\,r_{dx}^*(l) \qquad (35)$$
Using Parseval's theorem, this error may be expressed in the frequency domain as follows:
$$\xi_{\min} = r_d(0) - \frac{1}{2\pi}\int_{-\pi}^{\pi} H(e^{j\omega})\,R_{dx}^*(e^{j\omega})\,d\omega \qquad (36)$$
Since
$$r_d(0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} R_d(e^{j\omega})\,d\omega$$
this becomes
$$\xi_{\min} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[R_d(e^{j\omega}) - H(e^{j\omega})\,R_{dx}^*(e^{j\omega})\right]d\omega \qquad (37)$$
Wiener smoothing filter: the goal is to produce the minimum mean-square estimate of s(n) using the noisy observations
$$x(n) = s(n) + w(n), \qquad d(n) = s(n)$$
Assuming that s(n) and w(n) are uncorrelated zero-mean random processes, the autocorrelation of x(n) becomes
$$r_{xx}(k) = r_{ss}(k) + r_{ww}(k)$$
Furthermore, r_dx(k) = r_ss(k), so
$$H(e^{j\omega}) = \frac{R_{ss}(e^{j\omega})}{R_{ss}(e^{j\omega}) + R_{ww}(e^{j\omega})} \qquad (38)$$
Case I: For those values of ω for which R_ss(e^{jω}) ≫ R_ww(e^{jω}), the SNR is high and |H(e^{jω})| ≈ 1.
Therefore, over those frequency bands where the signal dominates, the filter passes the signal with little attenuation.
Case II: For those values of ω for which R_ss(e^{jω}) ≪ R_ww(e^{jω}), the SNR is low and |H(e^{jω})| ≈ 0.
Therefore, over those frequency bands where the noise dominates, H(e^{jω}) is small in order to filter out, or suppress, the noise.
MMSE: with d(n) = s(n), we get r_d(n) = r_ss(n), R_d(e^{jω}) = R_ss(e^{jω}), and R_dx(e^{jω}) = R_ss(e^{jω}), which is real, so the conjugate in (37) may be dropped:
$$\xi_{\min} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[R_d(e^{j\omega}) - H(e^{j\omega})\,R_{dx}(e^{j\omega})\right]d\omega$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[R_{ss}(e^{j\omega}) - H(e^{j\omega})\,R_{ss}(e^{j\omega})\right]d\omega$$
$$\xi_{\min} = \frac{1}{2\pi}\int_{-\pi}^{\pi} R_{ss}(e^{j\omega})\left[1 - H(e^{j\omega})\right]d\omega$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi} R_{ss}(e^{j\omega})\left[1 - \frac{R_{ss}(e^{j\omega})}{R_{ss}(e^{j\omega}) + R_{ww}(e^{j\omega})}\right]d\omega$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{R_{ww}(e^{j\omega})\,R_{ss}(e^{j\omega})}{R_{ss}(e^{j\omega}) + R_{ww}(e^{j\omega})}\,d\omega$$
$$\xi_{\min} = \frac{1}{2\pi}\int_{-\pi}^{\pi} R_{ww}(e^{j\omega})\,H(e^{j\omega})\,d\omega \qquad (39)$$
Consider now a process s(n) with power spectrum
$$R_s(z) = \frac{0.25}{(1 - 0.5z^{-1})(1 - 0.5z)}$$
and suppose that s(n) is observed in the presence of zero-mean white noise with variance σ_w² = 0.25.
Assuming that s(n) is uncorrelated with w(n), design a non-causal IIR Wiener smoothing filter for estimating s(n) from x(n), and find the MMSE.
System function:
$$H(z) = \frac{R_{ss}(z)}{R_{ss}(z) + R_{ww}(z)}$$
With R_ww(z) = σ_w² = 0.25, we have
$$H(z) = \frac{2(0.2344)}{(1 - 0.2344z^{-1})(1 - 0.2344z)}$$
and, taking the inverse z-transform,
$$h(n) = 0.4960\,(0.2344)^{|n|}$$
MMSE:
$$\xi_{\min} = \frac{1}{2\pi}\int_{-\pi}^{\pi} R_{ww}(e^{j\omega})\,H(e^{j\omega})\,d\omega = \frac{\sigma_w^2}{2\pi}\int_{-\pi}^{\pi} H(e^{j\omega})\,d\omega$$
$$\xi_{\min} = \sigma_w^2\,h(0) = 0.25 \times 0.4960 = 0.1240$$
How much is the error reduced as a result of filtering x(n) with a Wiener filter? Without a filter, y(n) = x(n), and therefore
$$E\{|e(n)|^2\} = E\{|w(n)|^2\} = \sigma_w^2 = 0.25$$
Thus, the non-causal Wiener filter reduces the MSE approximately by a factor of two.
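As a sanity check, the sketch below (assuming NumPy) evaluates H(e^{jω}) = R_ss/(R_ss + R_ww) on a dense frequency grid and averages it to recover h(0) and ξ_min; the mean over a uniform grid on [−π, π) approximates (1/2π) times the integral.

```python
# Numerical check of the smoothing example via eq. (38) and eq. (39).
import numpy as np

w = np.linspace(-np.pi, np.pi, 200_000, endpoint=False)
Rss = 0.25 / (1.25 - np.cos(w))    # PSD of s(n): 0.25 / |1 - 0.5 e^{-jw}|^2
Rww = 0.25 * np.ones_like(w)       # white-noise PSD, sigma_w^2 = 0.25
H = Rss / (Rss + Rww)              # eq. (38)

h0 = H.mean()                      # (1/2pi) * integral of H(e^jw) dw
mmse = (Rww * H).mean()            # eq. (39)

print(h0, mmse)                    # ~ 0.4960, ~ 0.1240
```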
Causal IIR Wiener Filter
The causal IIR Wiener filter has a system function of the form
$$H(z) = \sum_{n=0}^{\infty} h(n)z^{-n}$$
and output
$$y(n) = \sum_{\ell=0}^{\infty} h(\ell)\,x(n-\ell)$$
Mean-square error:
$$\xi = E\{|e(n)|^2\} \qquad (40)$$
where e(n) is the difference between the desired process d(n) and the output of the Wiener filter, y(n):
$$e(n) = d(n) - y(n) = d(n) - \sum_{\ell=0}^{\infty} h(\ell)\,x(n-\ell) \qquad (41)$$
Minimizing ξ leads, as before, to the Wiener-Hopf equations
$$\sum_{\ell=0}^{\infty} h(\ell)\,r_{xx}(k-\ell) = r_{dx}(k); \quad k \ge 0$$
Here, however, r_dx(k) cannot be expressed as the convolution of h(k) and r_xx(k), because the equation holds only for k ≥ 0, so we cannot simply solve in the transform domain as in the non-causal case. To solve the Wiener-Hopf equations we shall use the innovations representation of the WSS random process {x(n)}.
A stationary process {x(n)} with autocorrelation r_xx(k) and PSD R_xx(e^{jω}) can be represented by an equivalent innovations process, i(n), obtained by passing {x(n)} through a noise-whitening filter with system function F_I(z) = 1/F(z), where F(z) is the minimum-phase part obtained from the spectral factorization
$$R_{xx}(z) = \sigma_i^2\,F(z)\,F(z^{-1})$$
Writing the filter output in terms of the innovations,
$$y(n) = \sum_{\ell=0}^{\infty} g(\ell)\,i(n-\ell)$$
and e(n) = d(n) − y(n), application of the orthogonality principle yields the new Wiener-Hopf equations
$$\sum_{\ell=0}^{\infty} g(\ell)\,r_{ii}(k-\ell) = r_{di}(k); \quad k \ge 0$$
Since the innovations are white, r_ii(k−ℓ) = σ_i² δ(k−ℓ), so g(k) = r_di(k)/σ_i² for k ≥ 0. Thus, we obtain
$$G(z) = \sum_{k=0}^{\infty} g(k)z^{-k} = \frac{1}{\sigma_i^2}\sum_{k=0}^{\infty} r_{di}(k)z^{-k} = \frac{1}{\sigma_i^2}\left[R_{di}(z)\right]_+$$
where [·]_+ denotes the causal part of the sequence (the terms with k ≥ 0).
Writing the innovations as
$$i(n) = \sum_{m=0}^{\infty} f_I(m)\,x(n-m), \qquad F_I(z) = \frac{1}{F(z)} = \sum_{m=0}^{\infty} f_I(m)z^{-m}$$
we then have
$$r_{di}(k) = E\{d(n)\,i^*(n-k)\} = \sum_{m=0}^{\infty} f_I(m)\,E\{d(n)\,x^*(n-m-k)\}$$
$$r_{di}(k) = \sum_{m=0}^{\infty} f_I(m)\,r_{dx}(m+k) \qquad (42)$$
Taking z-transforms,
$$R_{di}(z) = \sum_{k=-\infty}^{\infty} r_{di}(k)z^{-k} = \sum_{k=-\infty}^{\infty}\left[\sum_{m=0}^{\infty} f_I(m)\,r_{dx}(m+k)\right]z^{-k}$$
$$R_{di}(z) = \sum_{m=0}^{\infty} f_I(m)\sum_{k=-\infty}^{\infty} r_{dx}(m+k)z^{-k} = \underbrace{\sum_{m=0}^{\infty} f_I(m)z^{m}}_{F_I(z^{-1})}\;\underbrace{\sum_{k=-\infty}^{\infty} r_{dx}(k)z^{-k}}_{R_{dx}(z)}$$
$$R_{di}(z) = F_I(z^{-1})\,R_{dx}(z) = \frac{R_{dx}(z)}{F(z^{-1})}$$
Therefore,
$$G(z) = \frac{1}{\sigma_i^2}\left[R_{di}(z)\right]_+ = \frac{1}{\sigma_i^2}\left[\frac{R_{dx}(z)}{F(z^{-1})}\right]_+$$
Finally, since the optimum filter is the cascade of the whitening filter F_I(z) = 1/F(z) and G(z), it has the system function
$$H_{\mathrm{opt}}(z) = \frac{G(z)}{F(z)} = \frac{1}{\sigma_i^2\,F(z)}\left[\frac{R_{dx}(z)}{F(z^{-1})}\right]_+ \qquad (43)$$
MMSE:
$$\xi_{\min} = r_d(0) - \sum_{\ell=0}^{\infty} h(\ell)\,r_{dx}^*(\ell)$$
$$\xi_{\min} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[R_d(e^{j\omega}) - H(e^{j\omega})\,R_{dx}^*(e^{j\omega})\right]d\omega$$
Consider again the signal x(n) = s(n) + w(n) from the FIR example, where s(n) is the AR(1) process s(n) = 0.6s(n−1) + v(n) driven by white noise v(n) of variance σ_v² = 0.64, and where w(n) is unit-variance white noise (σ_w² = 1) that is uncorrelated with s(n). The signal s(n) is generated by the single-pole filter
$$H_1(e^{j\omega}) = \frac{1}{1 - 0.6\,e^{-j\omega}}$$
We now design the causal IIR Wiener filter for estimating s(n), so that d(n) = s(n). In the z-domain,
$$R_{ss}(z) = \frac{0.64}{(1 - 0.6z^{-1})(1 - 0.6z)}$$
$$R_{dx}(z) = R_{ss}(z) = \frac{0.64}{(1 - 0.6z^{-1})(1 - 0.6z)}$$
$$R_{xx}(z) = R_{ss}(z) + R_{ww}(z) = \frac{0.64}{(1 - 0.6z^{-1})(1 - 0.6z)} + 1$$
$$R_{xx}(z) = \frac{1.8\left(1 - \tfrac{1}{3}z^{-1}\right)\left(1 - \tfrac{1}{3}z\right)}{(1 - 0.6z^{-1})(1 - 0.6z)} = \sigma_i^2\,F(z)\,F(z^{-1})$$
where σ_i² = 1.8 and
$$F(z) = \frac{1 - \tfrac{1}{3}z^{-1}}{1 - 0.6z^{-1}}$$
Hence,
$$\left[\frac{R_{dx}(z)}{F(z^{-1})}\right]_+ = \left[\frac{0.64}{(1 - 0.6z^{-1})(1 - 0.6z)}\cdot\frac{1 - 0.6z}{1 - \tfrac{1}{3}z}\right]_+ = \left[\frac{0.8}{1 - 0.6z^{-1}} + \frac{0.266z}{1 - \tfrac{1}{3}z}\right]_+$$
The second term contains only positive powers of z (it is strictly anticausal), so
$$\left[\frac{R_{dx}(z)}{F(z^{-1})}\right]_+ = \frac{0.8}{1 - 0.6z^{-1}}$$
Therefore,
$$H_{\mathrm{opt}}(z) = \frac{1}{\sigma_i^2\,F(z)}\left[\frac{R_{dx}(z)}{F(z^{-1})}\right]_+ = \frac{1}{1.8}\cdot\frac{1 - 0.6z^{-1}}{1 - \tfrac{1}{3}z^{-1}}\cdot\frac{0.8}{1 - 0.6z^{-1}} = \frac{0.444}{1 - \tfrac{1}{3}z^{-1}}$$
The impulse response is
$$h_{\mathrm{opt}}(n) = \frac{4}{9}\left(\frac{1}{3}\right)^n u(n)$$
Since H_opt(z) = Y(z)/X(z), the estimate of s(n) may be computed recursively as follows:
$$y(n) = \frac{1}{3}\,y(n-1) + 0.444\,x(n)$$
Finally, the MMSE is
$$\xi_{\min} = r_d(0) - \sum_{\ell=0}^{\infty} h(\ell)\,r_{dx}(\ell)$$
$$\xi_{\min} = 1 - \frac{4}{9}\sum_{n=0}^{\infty}\left(\frac{1}{3}\right)^n(0.6)^n = 1 - \frac{4}{9}\sum_{n=0}^{\infty}(0.2)^n$$
$$\xi_{\min} = 1 - \frac{4}{9}\cdot\frac{1}{1 - 0.2} = \frac{4}{9} = 0.444$$
Note that, for the second-order FIR Wiener filter, the mean-square error was ξ_min = 0.45. Therefore, using all the previous observations of x(n) only slightly improves the performance of the Wiener filter.
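A Monte-Carlo check of this design is straightforward. The sketch below (assuming NumPy and SciPy) simulates the signal model and applies the recursion above; the empirical MSE should come out close to the theoretical ξ_min = 4/9.

```python
# Simulate s(n) = 0.6 s(n-1) + v(n) in unit-variance white noise and apply
# the causal Wiener filter y(n) = (1/3) y(n-1) + (4/9) x(n).
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
N = 500_000
v = rng.normal(scale=np.sqrt(0.64), size=N)  # driving noise, sigma_v^2 = 0.64
w = rng.normal(scale=1.0, size=N)            # measurement noise, sigma_w^2 = 1

s = lfilter([1.0], [1.0, -0.6], v)           # AR(1) signal s(n)
x = s + w                                    # noisy observation
y = lfilter([4 / 9], [1.0, -1 / 3], x)       # causal IIR Wiener filter

print(np.mean((s - y) ** 2))                 # ~ 0.444
```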
So far, the signals being analyzed were assumed to be stationary (the statistical parameters of a stationary signal do not change over time).
However, signals that arise in almost every application are non-stationary.
For such signals, the Wiener filtering approach is no longer appropriate.
A better approach is to start over and begin with a non-stationary assumption at the outset.
Consider, therefore, the Wiener filtering problem within the context of non-stationary processes.
Recall the FIR Wiener filter output
$$y(n) = \sum_{k=0}^{M-1} h(k)\,x(n-k)$$
If x(n) and d(n) are jointly WSS processes, with e(n) = d(n) − y(n), then the filter coefficients that minimize the mean-square error E{|e(n)|²} are found by solving the Wiener-Hopf equations
$$R_x h = r_{dx}$$
However, if d(n) and x(n) are non-stationary, then the filter coefficients that minimize E{|e(n)|²} will depend on n, and the filter will be shift-varying, i.e.,
$$y(n) = \sum_{k=0}^{M-1} h_n(k)\,x(n-k) = h_n^T\,x(n)$$
where
$$h_n = [h_n(0), h_n(1), \ldots, h_n(M-1)]^T$$
and x(n) = [x(n), x(n−1), …, x(n−M+1)]^T.
In a shift-varying (adaptive) filter, for each value of n it is necessary to find the set of optimum filter coefficients h_n(k), for k = 0, 1, …, M−1.
However, the problem may be simplified considerably if we relax the requirement that h_n minimize the mean-square error at each time n and consider, instead, a coefficient update equation of the form
$$h_{n+1} = h_n + \Delta h_n$$
The key component of an adaptive filter is the set of rules, or algorithm, that defines how the correction Δh_n is to be formed.
The sequence of corrections should decrease the mean-square error.
In fact, whatever algorithm is used, the adaptive filter should have the following
properties:
1) In a stationary environment, the adaptive filter should produce a sequence of corrections Δh_n in such a way that h_n converges to the solution of the Wiener-Hopf equations,
$$\lim_{n\to\infty} h_n = R_x^{-1}\,r_{dx}$$
2) It should not be necessary to know the signal statistics r_dx(k) and r_xx(k) in order to compute Δh_n; the estimation of these statistics should be built into the adaptive filter.
3) For non-stationary signals, the filter should be able to adapt to the changing
statistics and track the solution as it evolves in time.
An adaptive filter is thus a filter that self-adjusts its transfer function (or its parameters or coefficients) according to an optimization algorithm driven by an error signal.
The adaptive filter uses feedback in the form of an error signal to refine its transfer function to match the changing parameters.
Common adaptation algorithms include LMS (least mean squares), RLS (recursive least squares), etc.; a minimal LMS sketch is given after the list of applications below.
Applications: 1) noise cancellation, 2) signal prediction, 3) echo cancellation, 4) adaptive channel equalization.
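Finally, the minimal LMS sketch mentioned above. The update h ← h + μ e(n) x(n) plays the role of the correction Δh_n; the function name, step size μ, and filter length M are illustrative choices, not values from the text.

```python
# Minimal LMS adaptive filter (real-valued signals).
import numpy as np

def lms(x, d, M=8, mu=0.01):
    """Adapt an M-tap filter so that y(n) = h_n^T x(n) tracks d(n).

    Returns the error sequence e(n) and the final coefficient vector h.
    """
    h = np.zeros(M)                    # initial coefficients h_0 = 0
    e = np.zeros(len(x))
    for n in range(M - 1, len(x)):
        xn = x[n - M + 1:n + 1][::-1]  # [x(n), x(n-1), ..., x(n-M+1)]
        y = h @ xn                     # filter output y(n) = h_n^T x(n)
        e[n] = d[n] - y                # estimation error e(n)
        h = h + mu * e[n] * xn         # update h_{n+1} = h_n + Delta h_n
    return e, h
```

With a suitably small step size μ and stationary inputs, h_n converges in the mean toward R_x⁻¹ r_dx, consistent with property 1) above.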