Sensorics Script PDF

Sensorics
by Prof. Dr.-Ing. Oliver Nelles
University of Siegen
Version: 1 April 2019
Contents
A: Measurement Techniques
1. Introduction to Measurement Techniques
2. Measurement of Electrical Quantities
End B
3. Measurement of Non-Electrical Quantities
4. Digital Measurement Techniques
5. Measurement Errors and Statistics
End A
6. Static and Dynamic Behavior of Sensors
B: Signal Processing
7. Introduction to Signal Processing
8. Time-Discrete Systems and Signals
9. Transformation Into the Frequency Domain (Discrete Fourier Transform)
10. Filters
11. Selected Methods in Signal Processing
A: Measurement Techniques
1. Introduction to
Measurement Techniques
Contents of Chapter 1
1. Introduction to Measurement Techniques
1.1 Historical Issues

1.2 SI: International System of Units
1.3 Relevance of Measurement Techniques
1.4 Basics
1.5 Literature
1.1 Historical Issues
Historical Milestones
• The first descriptions of length and weight measures date from around 3000 B.C.
• During the Middle Ages, trade and jurisdiction were concentrated around churches. The first accepted standards were established there, like the Freiburger “Elle”.
• Each trade center defined its own standards. At the end of the 18th century, 118 different definitions of an “Elle” and 80 different definitions of a “Pound” were in common use.
• In 1791 world-wide valid and accepted standards were established, and in 1875 the metric system was established in Paris. Some countries still do not use it today (e.g. USA).
• Since 1889 measurement standards made from platinum and iridium for the original “m” and “kg” have been displayed in Paris.
[Figure: A stainless metal bar is built into the tower of the Freiburger Münster. Its length is one “Elle” (54 cm); 1/20 of an “Elle” equals one inch.]
MKS System
• In 1791 the units Meter, Kilogram and Second were established in Paris, world-wide and for the next 200 years: the so-called MKS system. Many important units could be derived from this system.
• The Meter was defined as one 40-millionth of the circumference of the earth (measured orthogonal to the equator).
• The Kilogram was defined as the mass of 1000 cubic centimeters (cm³) of water at maximal density (at 4 °C).
To complete the MKS system and to improve the accuracy and generality of the units with the help of modern physics, the SI system (Système International d'Unités), consisting of 7 units, was founded in 1960.
• All units can be derived from the 7 basic SI units.
• The definitions are mainly based on physical constants.
• In principle, these units could be understood by aliens!
SI System
From the 7 basic SI units, for example, the following important units can be derived:
Speed: m/s
Acceleration: m/s²
Force: N = kg·m/s²
Torque: N·m = kg·m²/s²
Energy: J = N·m = kg·m²/s²
Power: W = J/s = kg·m²/s³
Magnetic Field:
Electric Voltage: V = W/A = kg·m²/(A·s³)
Torque and Energy have identical units! Does this mean they are the same?
Note: Torque is denoted by M throughout this script, not by T as usual in English, because M is the common German abbreviation for “Moment”.
1.3 Relevance of Measurements
Measurement Techniques are the Foundation of Science
• The foundation of science is observation. Science comes up with theories that aim to explain existing observations and predict future ones. If theory and observations contradict each other, either the observations are flawed or the theory is wrong (falsification). The more independent observations support a theory, the more likely it is true. But, in principle, it can never be proven (verification)!
• Measurements are very concrete and quantitative observations. Science, in particular natural science, progresses mainly with their help.
• Patterns discovered in the measurements often lead towards a theory that is consistent with them.
• New technological possibilities have often supported or refuted theories. Example: The measured spectrum of black body radiation contradicted the classical theory. By introducing the quantization of the emitted energy in 1900, Planck could align theory and observations. This was the birth of quantum mechanics (which, ironically, Planck never accepted).
Example: Interferometer Experiment by Michelson and Morley
The experiment by Michelson in 1881 and a refined version by Michelson and Morley in 1887 tried to prove that some “stuff” exists in vacuum (German: “Äther”) that transmits light waves. The theory at this time stated that any wave needs a medium in order to propagate its energy. Examples are water waves, or sound with air as the medium. That vacuum is simply empty was unimaginable, because light can travel through space. So what is the medium that light needs?
This obscure stuff had never been found and was called “Äther”. The interferometer experiment was designed to find out whether it exists.
If the earth travels through the “Äther”, then turning the interferometer changes the direction of travel relative to the “Äther” and should lead to a phase shift, because light should be faster or slower depending on the relative “Äther” speed. But nothing happened! The speed of light in vacuum is always c. No “Äther” exists. This was explained by Einstein’s theory of special relativity in 1905.
Measurement for Feedback Control
Control is based on measuring the quantity that shall be controlled. Without the measurement no feedback is possible, i.e., no comparison between desired and actual value.
[Block diagram: desired value → comparison → controller → manipulated variable → plant → controlled value; the sensor feeds the controlled value back to the comparison]
In many signal processing applications a delay is not critical. If you see a football goal 100 ms late because of computations in your digital TV, this is no significant drawback.
This is different in feedback control! The controlled variable must be fed back immediately to the comparison of desired and actual value. Any delay due to a slow sensor, filtering or other signal processing techniques deteriorates the control performance.
You can never make up for a delay in a subsequent step!
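The effect of a sensor delay on a control loop can be illustrated with a small simulation. This is only a sketch under assumptions that are not part of the slides: a first-order discrete plant y[k+1] = a·y[k] + b·u[k], a purely proportional controller, and the hypothetical names `simulate_step`, `kp` and `delay`.

```python
def simulate_step(kp, delay, steps=60, a=0.9, b=0.1, setpoint=1.0):
    """Closed loop with plant y[k+1] = a*y[k] + b*u[k] and a P controller.

    The controller only sees the sensor value from `delay` samples ago.
    All numbers are illustrative assumptions.
    """
    y = [0.0]
    for k in range(steps):
        measured = y[k - delay] if k >= delay else 0.0  # delayed sensor reading
        u = kp * (setpoint - measured)                   # proportional control law
        y.append(a * y[k] + b * u)                       # plant update
    return y

fast = simulate_step(kp=5.0, delay=0)   # sensor without delay
slow = simulate_step(kp=5.0, delay=3)   # sensor delayed by 3 samples
print("peak output without delay:", round(max(fast), 3))
print("peak output with delay:   ", round(max(slow), 3))
```

With the same controller gain, the loop without delay approaches its final value without overshoot, while the delayed loop overshoots the desired value of 1 within the first few samples.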
1.4 Basics
Measuring
Definition: Measuring means comparing with an agreed unit.
A measurement consists of a number and a unit. The number describes which multiple of the
unit is assigned:
measurement = number · unit
Examples: Speed = 3 m/s = 3 m·s⁻¹, Mass = 4 kg, Force = 5 kg·m/s² = 5 N
Requirements:
1. The quantity to be measured must be qualitatively uniquely determined.
2. The standard unit must be defined by a convention.
These requirements are not met by many quantities in our everyday lives, like
wellness, beauty, intelligence.
Measurement Setup
A measurement setup typically consists of 3 blocks:
1. A sensor converts the quantity to be measured into an electrical signal. Recently the term smart sensor has become popular. It refers to sensors that incorporate intelligent signal processing which carries out tasks inside the sensor, such as filtering, data reduction, feature extraction, combining different physical principles, …
2. The electrical signal is converted into another electrical signal which is, e.g., of higher power and/or digital.
3. The amplified and possibly digitized signal is sent to a display, printer or plotter, or is only saved.
[Block diagram: Process → Sensor (probe and transducer; non-electrical measurement in, electrical signal out) → converter, amplifier, ... (output e.g. digital) → output device; support energy is supplied from outside]
Measurement Method [1]
• Deflection Method: The measured quantity is directly converted into the output, e.g. a
display. No support energy is needed from outside. The required energy for the
conversion is taken from the medium or the environment (e.g. gravitation).
Examples: spring balance, expansion thermometer.
• Difference Method: The measured quantity is compared with a quantity from outside.
This quantity for comparison stays constant during the measurement. The difference
between both is the output. Example: volume measurement (displaced liquid).
• Compensation Method: A quantity opposed to the measured quantity is applied. A zero indicator determines whether both quantities are equal. If so, the compensation quantity is a measure for the original one. The compensation quantity can be of a different kind than the original one.
Examples: Equal-armed balance with weights as compensation quantity (same kind) or with a force induced by an electromagnet (different kind).
Image source: http://en.wikipedia.org/wiki/File:Balance_scale_IMGP9755.jpg
Measuring Technique [1]
• Direct Measurement: Comparison with a gauge. The most fundamental technique.
Example: Length measurement with a ruler.
• Indirect Measurement: The quantity to be measured is determined by other relevant
quantities. Examples: Determination of pressure by measuring force and dividing by the
area. Determination of power by measuring voltage and current and multiplying them.
Determination of speed by measuring distance and time and dividing them.
Determination of acceleration by measuring speed and differentiating.
• Incremental Measurement: From a reference point, increments (= smallest change) are
added or subtracted to determine the actual value. Typically, equidistant markings are
scanned (optically or magnetically or otherwise). Examples: Measuring angles or
displacements.
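The incremental principle can be sketched in a few lines. The function name `angle_from_increments` and the example disc with 360 markings per revolution are assumptions chosen for illustration only.

```python
def angle_from_increments(directions, marks_per_rev, start_deg=0.0):
    """Add or subtract one increment per scanned marking.

    directions: sequence of +1 (forward) or -1 (backward), one entry per
    detected marking. marks_per_rev is a hypothetical disc parameter.
    """
    increment = 360.0 / marks_per_rev   # smallest detectable change in degrees
    count = sum(directions)             # net number of increments
    return start_deg + count * increment

pulses = [+1] * 10 + [-1] * 3           # 10 markings forward, 3 backward
print(angle_from_increments(pulses, marks_per_rev=360))
```

The result is only valid relative to the reference point `start_deg`; if a marking is missed, the error stays in all later values.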
Analog and Digital Measurement Processing
[Figure: 2 x 2 grid of plots of one signal over 0 to 10. Top left: time continuous, amplitude continuous. Top right (after sampling): time discrete, amplitude continuous. Bottom left (after quantization): time continuous, amplitude discrete. Bottom right (after sampling and quantization): time discrete, amplitude discrete.]
1.5 Literature
These books are the main basis for these lecture notes:
1. J. Hoffmann: “Taschenbuch der Messtechnik“, 4. Aufl., Hanser, 2004
2. J. Niebuhr, G. Lindner: “Physikalische Messtechnik mit Sensoren“, 5. Aufl.,
Oldenbourg, 2005.
3. E. Schrüfer: “Elektrische Messtechnik: Messung elektrischer und nichtelektrischer
Größen“, 7. Aufl., Hanser, 2001
4. U. Kiencke, R. Eger: “Messtechnik“, 6. Aufl., Springer, 2005.
A reference:
Mayer, J.R. Rene: “Measurement, Instrumentation and Sensors Handbook“, CRC
Press, 1999
A good book in English:
Morris A.S., Langari, R.: “Measurement and Instrumentation: Theory and
Application”, Academic Press, 2012
4. Digital Measurement
Techniques
4. Digital Measurement Techniques
4.1 Discretization of Amplitude and Time

4.2 Sampling Theorem
4.3 Quantization
4.4 A/D and D/A Converters
4.5 Measurement of Frequency
4.1 Discretization of Amplitude and Time
Advantages of Digital Measurement Techniques
• Digital electronics is insensitive with respect to environmental influences (temperature).
• Digital electronics becomes more powerful, cheaper, smaller, more robust. It can be
integrated together with the sensor (smart sensor).
• Documentation and archiving purposes require or favor a digital form.
• Digital signal processing is much more powerful and flexible than analogue electronic
circuits:
– Digital and adaptive filtering.
– Nonlinear transformation/inversion.
– Transformation of signals into the frequency domain (fast Fourier transform, FFT).
– Parameter estimation, supervision, diagnosis.
– Sensor fusion.
– Storage of data on a digital storage medium (hard disk, flash).
– Transmission of data without any information loss.
– Powerful display technologies.
Here: Focus on digital signals
digital = time-discrete (sampled) & quantized (amplitude in, e.g., 8 or 16 bits)
• Difference equations and sums are simpler to manage and understand than differential equations and integrals.
• Digital realizations replace analogue circuits because they are
- cheaper in most cases (especially for high quantities),
- easy to implement,
- more flexible: faster and cheaper to change (even afterwards with updates),
- more robust and durable with respect to environmental influences (wear, temperature, humidity).
Focus of this lecture
• Development of an understanding for the methods and their potential applications.
• No implementation details and tricks.
• No programming of digital signal processors (DSPs).
• More breadth than depth.
Analog/Digital and Digital/Analog Conversion
Abbreviation: u(k) = uc(kT0), y(k) = yc(kT0)
A/D Converter: Sensor → uc(t) → sampler with T0 → us(t) → A/D → u(k) → Computer
• The sampling time T0 can be between µs (signal processing) and hours (thermal, biological processes).
• Amplitude resolution of 8, 12 or 16 bit.
D/A Converter: Computer → y(k) → D/A → y*(t) → Hold → yc(t)
• The computer handles time-discrete series.
• A zero-order hold generates piece-wise constant signals.
[Figure: the signals uc(t), us(t), u(k) and y(k), y*(t), yc(t) plotted over t and k]
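The relation u(k) = uc(kT0) and the zero-order hold can be sketched as follows. The function names, the sine test signal and the value T0 = 0.125 are assumptions for illustration; quantization of the amplitude is left out here.

```python
import math

def sample(uc, t0, n):
    """Sampling: u(k) = uc(k * T0) for k = 0 .. n-1."""
    return [uc(k * t0) for k in range(n)]

def zero_order_hold(y, t0, t):
    """Zero-order hold: output y(k) for k*T0 <= t < (k+1)*T0 (t >= 0)."""
    k = min(int(t // t0), len(y) - 1)   # index of the most recent sample
    return y[k]

T0 = 0.125                               # assumed sampling time in seconds
u = sample(lambda t: math.sin(2 * math.pi * t), T0, 8)
held = [zero_order_hold(u, T0, t) for t in (0.0, 0.1, 0.13, 0.3)]
print(held)                              # piece-wise constant values
```

Between two sampling instants the held output does not change, which is exactly the piece-wise constant behavior described for the D/A side.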
4.2 Sampling Theorem
Sampling of a Continuous-Time Signal
Everybody has seen the spoked wheels of a starting carriage or car, at least in the movies. At first the accelerating wheel can be observed. At a certain speed or angular velocity of the wheel, it suddenly changes direction and seems to turn the other way round although the carriage accelerates further. Later on the wheel slows down before it finally seems to stand still (the wheel seems to freeze!). This is in obvious contradiction to the faster and faster carriage.
This strange effect can be explained by so-called aliasing. It exists for all time-discrete and therefore sampled systems. Obviously problems occur if the signal is sampled too slowly for its velocity (or, more precisely, frequency). The effect becomes prominent as the signal frequency approaches half of the sampling frequency. The movie plays the role of the sampler with a sampling frequency of f0 = 24 Hz or 25 Hz, i.e., the frame rate.
What happens if we sample a signal of frequency f = 1 Hz with f0 = 1 Hz?
[Figure: sine of f = 1 Hz (left) and its samples taken with f0 = 1 Hz (right), axis from 0 to 6]
Aliasing
Obviously the oscillation is completely gone! We get a signal of frequency zero (a dc value). This happens independently of the phase of the sampler (only the dc value itself depends on it). For illustration, some further examples with f = 0.9 Hz, 0.7 Hz, 0.5 Hz and 0.3 Hz, each sampled with f0 = 1 Hz.
[Figure: four panels for f = 0.9 Hz, 0.7 Hz, 0.5 Hz and 0.3 Hz, each showing the sine and its samples, axis from 0 to 6]
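These sampling examples can be reproduced numerically. The sketch below uses the standard folding rule for the apparent frequency, which is not written out on the slide; the function names and the phase value 0.5 rad are assumptions.

```python
import math

def alias_frequency(f, f0):
    """Apparent frequency after sampling a sine of frequency f with rate f0."""
    return abs(f - round(f / f0) * f0)   # distance to the nearest multiple of f0

def sample_sine(f, f0, n, phase=0.5):
    """n samples of sin(2*pi*f*t + phase) taken at t = k / f0."""
    return [math.sin(2 * math.pi * f * k / f0 + phase) for k in range(n)]

for f in (1.0, 0.9, 0.7, 0.5, 0.3):
    print(f, "Hz appears as", round(alias_frequency(f, 1.0), 3), "Hz")

samples = sample_sine(1.0, 1.0, 6)       # f = f0: every sample hits the same phase
print(max(samples) - min(samples))       # practically zero: a dc value
```

For f = 0.9 Hz and f = 0.7 Hz the samples look like slower sines, which is the mirrored (alias) frequency.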
Sampling Theorem
[Photo: Claude Elwood Shannon, 1916-2001 (source: www.wikepedia.org)]
From the examples on the previous slide we see that a sampling frequency of at least twice the signal frequency is required to reconstruct the original signal from its sampled version (limiting case: f = 0.5 Hz sampled with f0 = 1 Hz). Real signals consist of many (typically infinitely many) frequencies. Then this requirement relates to the highest contained frequency fmax.
Shannon's Sampling Theorem
The signal x(t) shall be sampled. The highest significant frequency component of x(t) is at fmax. Then the sampling frequency has to be at least twice this highest frequency component of x(t):
f0 ≥ 2 fmax
If this theorem is violated, aliasing occurs, i.e., frequency components above half the sampling frequency (f > ½ f0) are mirrored into a lower frequency range. Through this effect, high-frequency noise can disturb the signal in any frequency range. Thus aliasing should be avoided or at least kept to a minimum.
In practice it is common to choose f0 ≈ 5…10 fmax.
Illustration of the Sampling Theorem and the Aliasing Effect
If the sampling theorem is met, it is possible to reconstruct the original signal from its
sampled version, i.e., no information loss takes place. However, in reality most signals are
not bandlimited. This means they have frequency components up to infinity, i.e., no upper
bound exists (fmax = ∞). Typical signals like steps, ramps, rectangular shapes stretch their
spectrum between zero and infinity. Such signals cannot be reconstructed perfectly.
[Figure: a signal and its spectrum]
[Figure: spectrum of the continuous bandlimited signal, and three panels showing the spectrum of the sampled signal]
Aliasing When Sampling a Sine Signal With Angular Frequency ω1
Each signal component of frequency ω1 is mirrored through the sampling process to:
ω = ±ω1 + k·ω0, k = 0, ±1, ±2, …
As long as ω1 lies inside the red area (solid), i.e., the sampling theorem is not violated, the mirrored components (dashed) keep lying outside the red area (left figure).
As soon as ω1 lies outside the red area (solid), i.e., the sampling theorem is violated, the mirrored components (dashed) lie inside the red area (right figure). Aliasing occurs!
If a component moves from ω1 to ω0, a mirrored alias component at ω = 0 is created.
Aliasing in Image Processing
Signal processing is relevant not only for signals over time. It is also important for signals over space like pictures/photos (2-D: columns & rows) or a combination of both in videos (3-D). For such spatial signals the same laws and relationships hold. Signals over time can be filtered, and so can signals over space.
Image processing therefore also has to deal with the aliasing effect. A high spatial frequency corresponds to alternating points of black and white (or of different colors). Without a special so-called anti-aliasing filter, such high-frequency components can significantly disturb the picture. This is particularly prominent for tiny checkered patterns and is known as the Moiré effect. A low-pass anti-aliasing filter prevents such destructive effects. Every digital photo and video camera has such a filter built in.
[Figure: the same picture without anti-aliasing and with anti-aliasing]
4.3 Quantization
Quantization Error
Any digital value is quantized in its amplitude. A continuous value has to be mapped to a
discrete value via the A/D converter. This means that each interval in the continuous range
corresponds to some integer number. All values inside of such an interval are
indistinguishable after the A/D conversion.
If we quantize a continuous value in the range from xmin to xmax into n bits, 2^n intervals or quantization levels exist. In such a quantization the maximum error can be calculated as
eQ max = (xmax − xmin) / 2^n
because this is the interval width. The quantization error with this approach is always positive because the green line (dashed) always is above the blue one (solid).
Example: xmin = 0, xmax = 10, n = 3 bits
eQ max = 10 / 8 = 1.25
In practice 8, 12, 16 bit A/D converters are standard.
[Figure: staircase characteristic xQ (levels 0 to 7) over x from 0 to 10 with steps every 1.25; eQ max is marked]
It is possible to improve this quantization error by almost a factor of 2. Instead of always rounding down, we can draw the green (dashed) line through the average by shifting it eQ/2 to the right. Now xmin and xmax lie in the middle of intervals, not at their limits. This sacrifices one interval. The maximum quantization error is:
eQ max = ½ · (xmax − xmin) / (2^n − 1)
[Figure: staircase characteristic xQ (levels 0 to 7) over x from 0 to 10 with axis marks every 10/7 ≈ 1.43; eQ max is marked]
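Both quantization schemes can be compared numerically for the example xmin = 0, xmax = 10, n = 3. This is a sketch with assumed function names; the error is evaluated on a grid of 1001 test values, so the printed maxima are grid maxima.

```python
def quantize_floor(x, xmin, xmax, n):
    """Always round down: 2**n intervals of width q."""
    q = (xmax - xmin) / 2 ** n
    level = min(int((x - xmin) / q), 2 ** n - 1)   # clamp x = xmax to the top level
    return xmin + level * q

def quantize_round(x, xmin, xmax, n):
    """Round to the nearest of 2**n levels that span xmin .. xmax."""
    q = (xmax - xmin) / (2 ** n - 1)
    level = round((x - xmin) / q)
    return xmin + level * q

xs = [i * 10 / 1000 for i in range(1001)]          # test values 0, 0.01, ..., 10
err_floor = max(abs(x - quantize_floor(x, 0, 10, 3)) for x in xs)
err_round = max(abs(x - quantize_round(x, 0, 10, 3)) for x in xs)
print(err_floor, err_round)
```

The second maximum stays below 10/14, i.e. half of the interval width 10/7, which matches the improvement by almost a factor of 2.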
Quantization Noise
Although the quantization error is caused systematically, it appears to be of random nature. Thus, one speaks of quantization noise, which any A/D conversion creates in principle. Since all error values are equally probable, it can be modeled by a uniform probability distribution. In old synthesizers or CD players, quantization noise could be heard for low-volume sounds.
A/D Converters: Fundamentals
The three main characteristics of A/D converters are:
• Resolution
• Speed
• Realization effort / price
These characteristics are in conflict with each other. E.g. a high resolution implies a low
speed or high effort/price (or both).
By resolution we mean the number of bits n, which results in 2^n quantization levels. It is not reasonable to request a much finer resolution from the A/D converter than the mean amplitude of the measurement noise or other disturbances, since the accuracy of the signal is limited to this value anyway. Otherwise the least significant bits are determined by noise and carry no information.
The speed (bandwidth) determines how fast the A/D conversion is performed and therefore
how fast the sampling is possible (maximum sampling frequency). The effort typically
shows directly in the price.
A low sensitivity with respect to environment conditions is also an important criterion.
A/D Converter: Parallel Principle or Flash Converter [1]
The voltage UE to be converted is compared directly and simultaneously with 2^n − 1 different reference values. One comparator is required for each of these 2^n − 1 thresholds between the quantization levels.
Properties: Very fast (10 MHz), low resolution (8 bit).
Application Field: Video.
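The parallel principle can be sketched as a list of comparators followed by an encoder. Equally spaced thresholds between 0 and a reference voltage `u_ref` are an assumption, as are all names in the example.

```python
def flash_adc(u_in, u_ref, n):
    """Compare u_in with all 2**n - 1 thresholds at once and count the hits."""
    levels = 2 ** n
    thresholds = [u_ref * i / levels for i in range(1, levels)]  # one per comparator
    comparator_outputs = [u_in >= t for t in thresholds]         # thermometer code
    return sum(comparator_outputs)                               # encoder: number of ones

print(flash_adc(3.3, u_ref=10.0, n=3))   # 3 bit: thresholds 1.25, 2.5, ..., 8.75
```

All comparisons are independent of each other, which is why the hardware can evaluate them in a single step; the price is the number of comparators, which doubles with every additional bit.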
A/D Converter: Successive Approximation or Weighting Method [1]
The procedure is identical to weighing with a beam balance where the available weights are 1, ½, ¼, …, 1/2^n. A combination of these weights represents the quantization levels. One starts with the heaviest weight and adds or removes weights in descending order to balance the beam. At the end we have n steps (n times a weight is added and possibly removed). The remaining weights represent “1”, the removed ones “0” in the converted value. Weights are realized by voltages, the beam balance is realized by comparators.
Properties: Medium speed (1 MHz), medium-high resolution: 12, 16, even 24 bit.
Application Field: Computer plug-in A/D converter cards for measuring signals.
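The weighing procedure translates directly into a loop over the bits. The sketch assumes an input range from 0 to `u_ref` and uses hypothetical names; the weights are u_ref/2, u_ref/4, ... expressed as bit values of the output code.

```python
def sar_adc(u_in, u_ref, n):
    """Successive approximation with n trial steps, heaviest weight first."""
    code = 0
    for bit in range(n - 1, -1, -1):
        trial = code | (1 << bit)              # put the next smaller weight on the scale
        if trial * u_ref / 2 ** n <= u_in:     # comparator: not too heavy?
            code = trial                       # keep the weight (bit = 1)
        # otherwise the weight is removed again (bit = 0)
    return code

print(format(sar_adc(3.3, u_ref=10.0, n=3), "03b"))
```

The number of steps grows only linearly with the number of bits, in contrast to the 2^n − 1 comparators of the parallel principle.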
A/D Converter: Servo Principle [1]
The difference between the voltage UE to be converted and the output of the A/D converter, converted back into an analogue signal, is evaluated constantly, like in a control system. If this difference is equal to zero, then the A/D conversion is correct. A positive or negative difference triggers a counter which counts up or down (feedback!). Because the counter has a certain speed, the conversion needs time that depends on the size of the difference; this is similar to an integral controller. However, if the difference is small because the signal hardly changes (no steps or impulses), the converted voltage follows closely.
Properties: Speed depends on the size of steps.
Application Field: continuous conversion, slowly changing signals.
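The dependence of the conversion time on the step size can be seen in a small simulation of an up/down counter. One count per clock cycle, the 8 bit resolution and all names are assumptions for this sketch.

```python
def tracking_adc(u_samples, u_ref, n, start=0):
    """Up/down counter that moves one step per clock cycle towards the input."""
    q = u_ref / 2 ** n                     # voltage per count
    count, history = start, []
    for u_in in u_samples:
        diff = u_in - count * q            # comparison with the fed-back D/A output
        if diff >= q and count < 2 ** n - 1:
            count += 1                     # count up
        elif diff < 0 and count > 0:
            count -= 1                     # count down
        history.append(count)
    return history

step = tracking_adc([5.0] * 200, u_ref=10.0, n=8)   # input jumps from 0 V to 5 V
print(step.index(128))                               # clock cycles until the step is followed
```

A step of half the input range keeps the counter busy for well over a hundred clock cycles, whereas a slowly drifting input would be followed within one or two cycles.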
A/D Converter: Dual Slope Principle [1]
The dual slope converter uses an extended ramp method. The input voltage UE is integrated over a fixed period of time t by an integrator circuit. Subsequently, the integrated voltage is integrated down again to zero by a reference voltage Uref of opposite sign. During the latter time period a counter runs, whose count is then proportional to the original input voltage UE.
Properties: Excellent quality and suppression of unwanted influences. It is almost independent of material properties, temperature changes, etc. because those effects cancel each other during up- and down-integration. Slow speed since integration takes a lot of time.
Application Field: Digital voltmeter.
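The cancellation during up- and down-integration can be illustrated with an idealized, clocked model. The parameter `gain` stands for an unknown integrator constant (e.g. from component tolerances); it and all other names are assumptions, and the sketch expects 0 ≤ u_in and u_ref > 0.

```python
def dual_slope_adc(u_in, u_ref, n_up=1000, gain=1.0):
    """Integrate u_in for n_up clock ticks, then count ticks until u_ref
    has brought the integrator back to zero."""
    integral = gain * u_in * n_up      # up-integration over the fixed time
    count = 0
    while integral > 0:                # down-integration with the reference
        integral -= gain * u_ref
        count += 1
    return count                       # approximately n_up * u_in / u_ref

print(dual_slope_adc(2.5, u_ref=10.0))
print(dual_slope_adc(2.5, u_ref=10.0, gain=2.0))   # different integrator constant
```

Both calls give the same count because the integrator constant enters the up- and the down-integration in the same way.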
A/D Converter: Sigma-Delta- or Charge-Balance- or 1-Bit-Method [1]
In the first part of a sigma-delta converter a bit stream is generated whose average value is proportional to the input voltage UE that shall be converted. This is achieved through a control loop in which the difference between UE and a positive and negative reference voltage is fed to a comparator. For UE = 0 V the up- and down-integration phases are equally long.
In the second part the bit series in the bit stream is counted and converted into a digital value.
Properties: very high resolution (24 bit), medium speed.
Application Field: audio, instrumentation.
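Both parts of the converter can be sketched with a first-order loop. The concrete loop structure (an integrator followed by a 1 bit decision), the simple averaging decoder and all names are assumptions for illustration; the input is assumed to lie between −u_ref and +u_ref.

```python
def sigma_delta_bits(u_in, u_ref, n_bits):
    """First part: bit stream whose average follows the input voltage."""
    integrator, bits = 0.0, []
    for _ in range(n_bits):
        bit = 1 if integrator >= 0 else 0          # comparator decision
        feedback = u_ref if bit else -u_ref        # positive or negative reference
        integrator += u_in - feedback              # integrate the difference
        bits.append(bit)
    return bits

def decode(bits, u_ref):
    """Second part: count the ones and scale the average back to a voltage."""
    mean = sum(bits) / len(bits)
    return (2 * mean - 1) * u_ref

bits = sigma_delta_bits(0.25, u_ref=1.0, n_bits=1000)
print(decode(bits, 1.0))
```

The longer the bit stream that is averaged, the finer the resolution, which reflects the trade-off between resolution and speed.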
D/A Converter: Current Weighted Principle [1]
Compared to A/D conversion, the way back is quite simple. One possibility is to drive a constant current through a number of resistors with geometrically ordered resistances, i.e., R, 2R, 4R, 8R, … The voltage drop over each resistor corresponds to a bit in the digital value (“1” for “on” and “0” for “off”). The sum of these voltages then corresponds to the overall value, e.g., the bit series
0001 1010
gives the analogue voltage
U = (16R + 8R + 2R)I = 26RI
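The example can be checked with a short calculation. The function name and the normalization R = 1, I = 1 are assumptions; the result is then the factor in front of R·I.

```python
def current_weighted_dac(bits, r=1.0, i=1.0):
    """Sum the voltage drops I * R * 2**k of all resistors whose bit is 1."""
    bits = bits.replace(" ", "")
    n = len(bits)
    # the leftmost bit belongs to the largest resistor, the rightmost to R
    return sum(i * r * 2 ** (n - 1 - pos) for pos, b in enumerate(bits) if b == "1")

print(current_weighted_dac("0001 1010"))   # 16 + 8 + 2 in units of R*I
```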
D/A Converter: R-2R Principle [1]
The R-2R converter divides the current at each node into 2 halves (factor 2). One half drives a
resistor with resistance 2R and thereby creates a proportional voltage drop. The other half is
again divided into 2 halves etc. The main advantage compared to the method explained on
the last slide is that only two kinds of resistors R and 2R are required. They are much easier
and cheaper to manufacture in high quality (low temperature dependence) than all the
different kinds for the current weighted principle R, 2R, …, 1024R (for a 10 bit converter).
Fundamentals of Frequency Measurement
In the discussion of velocity and angular velocity measurements in Chapter 3.3 it is explained how such a measurement can be transformed into a voltage signal of the same frequency. The last step that is still open is to determine this frequency! The reason is that frequency measurement is typically done digitally; thus it has been postponed up to here.
The task here is therefore to determine the frequency f of a given voltage signal u(t).
[Block diagram: u(t) → frequency measurement → frequency f]
Two alternative approaches are presented:
• Measurement of the cycle duration (period): For signals of low frequency it makes sense to measure the time TP for one (or even half of an) oscillation and calculate the frequency from f = 1/TP.
• Counting the number of cycles within one time interval: For signals of high frequency it makes sense to count the number of oscillations within a given time interval. The frequency can be determined by f = number of oscillations / time interval.
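Both approaches reduce to a counter reading and one division. In this sketch the period is measured by counting ticks of a reference clock; the function names and the numbers (a 1 MHz reference, a gate time of 0.5 s) are assumptions.

```python
def frequency_from_period(ticks_in_one_period, f_reference):
    """Period method: reference ticks counted during one signal period TP."""
    t_p = ticks_in_one_period / f_reference    # period TP in seconds
    return 1.0 / t_p                           # f = 1 / TP

def frequency_from_count(cycles_counted, gate_time):
    """Counting method: signal cycles counted within a fixed gate time."""
    return cycles_counted / gate_time

print(frequency_from_period(200000, f_reference=1e6))   # slow signal
print(frequency_from_count(125000, gate_time=0.5))      # fast signal
```

In both cases the counter reading is an integer, so the reading should be large for a small relative error: many reference ticks per period for slow signals, many signal cycles per gate time for fast ones.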
Measurement of a Period [4]
• Well-suited for low-frequency signals.
• Gate time is one or one half of an oscillation of the original signal.
• Reference frequency is artificially generated.
Counting of Many Periods [4]
• Well-suited for high-frequency signals.
• Frequency comes from the original signal.
• Gate time is generated arbitrarily (the longer, the more accurate but slower).
B: Signal Processing
7.1 What for?
7.2 Deterministic and Stochastic Signals
7.3 Application Examples
7.4 Literature
7.1 What For?
What are Signals?
• Signals transfer information.
• Signals are functions, typically of time.
• Signals are measured with sensors and can be available in every physical form like
pressure, temperature, voltage, …
Some Typical Signals
• Speech, music
• Pictures, videos
• ECG, EEG, signals of CT, MRI or PET → image processing, conversion into pictures, ...
• Distance measurement with laser, ultrasound or radar, echo sounder, GPS, seismic signals, …
• Data streams via telephone lines, cable TV, satellite, cell phone, Bluetooth, internet, …
• All kinds of measurements on machines in factories, …
• Pressure in cylinders of a combustion engine
• Stock prices, number of unemployed people, development of populations, …
What is signal processing?
The analysis, manipulation and integration of signals
Application areas of signal processing?

• Storage, reconstruction
• Separation of desired signal and disturbance (signal-to-noise ratio)
• Compression
• Feature extraction (pre-stage of every classification)
Method/Tools of signal processing

• Transformation, correlation
• Filtering, disturbance suppression
• Detection, classification, pattern recognition
• Identification, estimation
• Compression, integration, fusion
Applications
[Image collage: Camera, Video, Cell Phone, MRI, Driver Assistance, Measurement Techniques, Radar, Integration of Sensorics/Control Units, Night Vision, Internet, GPS]

Deterministic Signals Do Not Depend on Randomness:
• Dirac impulse
• Step
• Ramp
• Periodic signals: sine, rectangular, ...
[Figure: deterministic signal, amplitude −4 to 4 over time t [s] from 0 to 200]
Stochastic Signals Depend on Randomness:
• Noise
• Distribution of amplitudes:
- Gaussian,
- uniform, ...
• Frequency characteristics:
- white: all frequencies have the same power,
- band limited: only a certain frequency range is present, ...
[Figure: stochastic signal = random signal, amplitude −4 to 4 over time t [s] from 0 to 200]
Motivation for Using Stochastic Signals
• Some physical effects are truly random (e.g. radioactive decay).
• Many tiny disturbances appear random, although each of them is of deterministic nature if we look at it in close detail (which needs time and dedication).
→ In both cases: Modeling the effects as a stochastic signal makes sense!
[Figures over time t [s] from 0 to 200: White Noise; Violet Noise (high frequency); Brown Noise (low frequency)]
Acoustic Echo Compensation (Hands-Free Talking)
• Adaptive (automatically self-adjusting) filters eliminate disturbing and annoying feedback by modeling the transfer characteristics between speaker and microphone and subtracting this signal part from the overall signal.
[Block diagram: the adaptive filter is adjusted to minimize the power of the error signal e]
Active Noise Cancellation
• By directly measuring the noise and generating a signal of opposite phase (180° phase shift), destructive interference annihilates the noise or at least parts of it.
• Works well in the low-frequency range up to 1000 Hz.
• Damping (active + passive) up to –30 dB possible!
Active Noise Cancellation with Sony Xperia Z2
• Works via the cell phone, not with headphones alone.
• Uses the processor and battery of the cell phone.
Source: https://www.theguardian.com/technology/2014/apr/17/sony-xperia-z2-review-phone-android
Source: http://www.techradar.com/news/phone-and-communications/mobile-phones/background-noise-reduction-one-of-your-smartphone-s-greatest-tools-1228924
Face Detection
• By calculating the gradients in x- and y-direction, a vertical and a horizontal edge image can be generated.
• From these edge images the features can be extracted more easily.
• This software extracts 22 features per face:
- vertical position of nose and its width,
- vertical position of mouth, its width, and its height,
- vertical position and heights of eyebrows over eye center,
- 11 radii that describe the form of the chin,
- width of face at nose bottom edge,
- width of face at center of eyes and nose.
[Figures: vertical and horizontal edge image; the 22 features used for face detection]
Source: www.markus-hofmann.de
University

Oliver Nelles
of Siegen
Industrial Image Processing
• Quality supervision of a welding line
• Component measurement to supervise tolerances
[Figures: camera-based inspection setups for both applications.]
Image Compression
Image 279 x 356 pixels, file size depending on format and quality setting:
*.tif (without loss): 394 kB
*.jpg (100%): 119 kB
*.jpg (60%): 22 kB
*.jpg (20%): 10 kB
*.jpg (10%): 5.4 kB
*.jpg (2%): 2.1 kB
Process Automation for Waste Water Plants
• Graphic description of the plant
• Measurement of many process quantities
- temperatures
- flow rates
- concentrations
• Measurement of disturbances
• Logging of all values of measured,
manipulated, and controlled variables
• Control of many quantities
• Supervision of limits
• Sensor fault diagnosis
• Optimization of profiles for desired values
• Manual fine tuning via control system
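Suppression of Disturbances
Example: Engine speed measurement on a test stand with a 50 Hz disturbance through the power line.
Goal: The desired signal "speed" can pass (almost) unchanged but the disturbance is suppressed, so that the filter output is as close as possible to "speed".
[Figure: engine test stand → speed measurement with 50 Hz disturbance → filter → output.]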
How to design a filter that fulfills its task (disturbance suppression) well?
• What does “well” mean? → Criterion needed!
• Structure of the filters: linear/nonlinear, FIR/IIR, order, ... to be determined.
• Parameters of the filter to be determined.
• Prior knowledge about the disturbance is required:
- kind: stochastic or deterministic
- frequency range: single frequencies, certain frequency bands, ...
Detection of Damages in Bearings by Analysis of Structure-Borne Sound
• Humans/experts often are able to detect faults in machines by their sound. Even emerging
faults can be detected early.
• Characteristic features can be found in the spectrum of the sound signal.
• Automatic methods for calculating and analyzing the sound spectrum are required!
[Figure: sound analysis for bearing damage; sound spectra over frequency for three cases: no damage, emerging damage, advanced damage.]
Notch Filter in Position Control in Aeronautics
Notch filters are band-stop filters that address a very narrow frequency range. They are often used to
remove frequencies that would otherwise harm the system, e.g.:
• Ship control: Elimination of disturbances caused by periodic waves.
• Control of planes, solar panels, and other weakly damped structures (lightweight construction
becomes more important in almost every application).
• TV and radio receivers: Interfering and disturbing frequencies are filtered out.
Control System With Incorporated Notch Filter for Damping of Resonances
[Figure: block diagram with controller, notch filter, and plant.]
7.3 Application Examples
Example: Control of a Read/Write Head of a Hard Disk
• Improvement of the frequency characteristics of the open loop.
• Notch filter at 4 kHz
Strong damping at 4 kHz pushes the amplitude response down and increases the amplitude margin!
[Figure: open-loop frequency responses without and with notch filter; marked frequencies: 339 Hz, 628 Hz, and 4 kHz.]
Example: Control of Space Shuttle
Resonances in the dynamics of the shuttle: a notch filter suppresses these frequencies.
Source: "Flight Control Overview of STS-88, the First Space Station Assembly Flight"
by R. Hall, K. Kirchwey, M. Martin, G. Rosch, D. Zimpfer, AAS-99-371
7.4 Literature
In German
Wendemuth A.: „Grundlagen der digitalen Signalverarbeitung“, Springer, 2004, 268 S.
Werner M.: „Digitale Signalverarbeitung mit MATLAB: Grundkurs mit 16 ausführlichen
Versuchen“, 10. Ed., Vieweg + Teubner, 2008, 294 S.
Oppenheim A.V., Schafer R.W., Buck J.R.: „Zeitdiskrete Signalverarbeitung“, Pearson,
8. Ed., 2004, 1040 S.
In English
Oppenheim A.V., Schafer R.W., Buck J.R.: „Discrete-Time Signal Processing“, Prentice-
Hall, 9. Ed., 2008, 950 p.
Ifeachor E., Jervis B.: „Digital Signal Processing: A Practical Approach“, Prentice-Hall,
8. Ed., 2001, 960 p.
8. Time-Discrete Systems and Signals
8. Time-Discrete Systems and Signals (Fundamentals: Mainly Home Study)
8.1 Time-Discrete Signals
8.2 Difference Equations
8.3 Z-Transform
8.4 Transfer Functions
Equidistant Sampling of a Time-Continuous Signal With Sampling Time T0
[Figure: continuous-time signal u(t) over continuous time t [sec] from –15 to 30, with the sampling time T0 marked.]
Sampling at the time steps t = kT0 with k = ..., -2, -1, 0, 1, 2, 3, ... and T0 = 3 sec
Sequence: {u(k)} = {..., 0, 0, 0, 0.26, 0.45, 0.59, ...}
[Figure: sampled sequence u(k) over discrete time k from –5 to 10, with the samples u(–1), u(0), u(1), u(2) labeled.]
Unit Impulse and Unit Step
[Photos: Leopold Kronecker, 1823-1891, and Paul Dirac, 1902-1984 (www.wikepedia.org)]
The unit impulse in discrete time is called Kronecker delta and has height 1. This is in
contrast to the continuous-time Dirac impulse which has infinite height. Therefore the
Kronecker delta can indeed be realized in practice, while the Dirac impulse is only a
theoretical idealization (or limit). If a Kronecker delta is fed to a D/A converter the output’s
length is 1 sampling interval and its energy is proportional to T0.
[Figure: a Kronecker delta of height 1 at k = 0 passed through a D/A converter with hold yields a rectangular pulse of height 1 and length T0; area = energy = T0.]
The discrete-time unit step simply corresponds to the continuous-time unit step sampled
with T0. At the first sample (k = 0) the unit step and the delta impulse are identical!
[Figure: unit step sequence of height 1 for k ≥ 0.]
Connection between the unit impulse δK(k) and the unit step σ(k):
$$\delta_K(k) = \sigma(k) - \sigma(k-1)$$
This corresponds to the relationship δ(t) = dσ(t)/dt in continuous time.
Time Shift Operator
• Backward shift: i steps of delay
• No delay
• Forward shift: i steps of prediction
[Figure: input sequence u(k) over discrete time k from –5 to 10, with the samples u(–1), u(0), u(1), u(2) labeled.]
Delay of 2 steps: y(k) = u(k–2)
[Figure: output sequence y(k), shifted to the right by 2 sampling points, e.g. y(3) = u(1).]
Up-Sampling and Down-Sampling
If the sampling rate of an already sampled (time-discrete) signal is to be changed, the
following operations are required:
• Down-Sampling (↓M): Increase of the sampling time by a factor of M.
• Up-Sampling (↑M): Decrease of the sampling time by a factor of M.
These operations are needed to work with differently sampled signals (multi-rate systems), in
order to synchronize them or to compress the data. Commonly the sampling time is chosen very
small to make sure that the sampling theorem is not violated. However, such an approach
creates huge amounts of data and causes problems with numerical accuracy, particularly in
control. Therefore, in a second step, these signals can be down-sampled.
Down-sampling (↓M):
$$y(k) = u(kM)$$
Up-sampling (↑M):
$$y(k) = \begin{cases} u(k/M) & k = 0, \pm M, \pm 2M, \ldots \\ 0 & \text{otherwise} \end{cases}$$
[Figure: example sequences for M = 2 before and after down-sampling and up-sampling.]
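The two operations can be tried out in a few lines of Python. This is only a minimal sketch: the function names `downsample` and `upsample` are chosen for illustration, and the sequences are finite lists that start at k = 0.

```python
def downsample(u, M):
    """Keep every M-th sample: y(k) = u(kM)."""
    return u[::M]


def upsample(u, M):
    """Insert M-1 zeros after each sample: y(k) = u(k/M) if k is a multiple of M, else 0."""
    y = [0] * (len(u) * M)
    y[::M] = u
    return y


u = [1, 2, 3, 4, 5, 6]
print(downsample(u, 2))        # [1, 3, 5]
print(upsample([1, 2, 3], 2))  # [1, 0, 2, 0, 3, 0]
```

Down-sampling an up-sampled sequence with the same M returns the original sequence; the reverse order does not, because down-sampling discards samples.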
Aliasing With Down-Sampling
During sampling of a continuous-time signal, aliasing arises if the sampling theorem is
violated, i.e., if the sampling frequency f0 is not larger than twice the maximal signal frequency fmax. The
same is true for sampling an already sampled signal, i.e., down-sampling. Thus, before
down-sampling it is important to apply an anti-aliasing filter that ensures that no frequency
component above f0/2 (with the new f0) remains in the signal. In this case, the anti-aliasing filter needs
to be a digital filter (see Chapter 10)!
[Figure: 2-fold down-sampling of a signal (k = 0 ... 100 reduced to k = 0 ... 50). Without anti-aliasing filter: aliasing! With anti-aliasing filter at the (new) f0/2: no aliasing!]
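The effect can be checked numerically. The following sketch uses an illustrative test tone (the frequencies 0.4 and 0.2 cycles per sample are assumptions, not values from the slide): a cosine at 0.4 cycles per sample is down-sampled by 2 without any anti-aliasing filter. The new limit is 0.25 cycles per old sample, so the tone violates it and becomes indistinguishable from a tone at 0.2 cycles per new sample.

```python
import math


def cosine(freq, n):
    """cos(2*pi*freq*k) for k = 0..n-1, with freq in cycles per sample."""
    return [math.cos(2 * math.pi * freq * k) for k in range(n)]


fast = cosine(0.4, 40)     # tone above the new limit of 0.25 cycles per old sample
decimated = fast[::2]      # 2-fold down-sampling without anti-aliasing filter
alias = cosine(0.2, 20)    # tone at 0.2 cycles per new sample

# Largest difference between the decimated tone and its alias (numerically zero).
max_diff = max(abs(a - b) for a, b in zip(decimated, alias))
print(max_diff)
```

A digital low-pass filter applied before the decimation would have to remove the 0.4 tone, otherwise it reappears at the wrong frequency.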
Differential Equations → Difference Equations
For small sampling times T0 → 0 a differential equation can be approximated by a difference
equation (discretization) by approximating a differential quotient by a difference quotient:
$$\frac{dy(t)}{dt} \approx \frac{y(k) - y(k-1)}{T_0}$$
This approximation has significant drawbacks for T0 ≫ 0. A differential equation of
order n (with m ≤ n) corresponds to a difference equation of order n (here normalized such that the coefficient of y(k) is 1):
$$y(k) + a_1 y(k-1) + \ldots + a_n y(k-n) = b_0 u(k) + b_1 u(k-1) + \ldots + b_m u(k-m)$$
While the simulation of continuous-time systems requires integrations, a discrete-time
system “only” needs the solution of algebraic equations, i.e., simply the isolation of y(k):
$$y(k) = b_0 u(k) + \ldots + b_m u(k-m) - a_1 y(k-1) - \ldots - a_n y(k-n)$$
Knowledge about the previous time steps k−1, k−2, ..., k−n is required.
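Isolating y(k) translates directly into a loop. The sketch below assumes the normalized form y(k) + a1 y(k−1) + ... + an y(k−n) = b0 u(k) + ... + bm u(k−m) and zero values for all samples before k = 0; the function name `simulate` and the example coefficients are chosen for illustration.

```python
def simulate(b, a, u):
    """Solve y(k) = sum_i b[i] u(k-i) - sum_{j>=1} a[j] y(k-j), with a[0] = 1.

    Samples before k = 0 are taken as zero (zero initial conditions).
    """
    y = []
    for k in range(len(u)):
        acc = sum(b[i] * u[k - i] for i in range(len(b)) if k - i >= 0)
        acc -= sum(a[j] * y[k - j] for j in range(1, len(a)) if k - j >= 0)
        y.append(acc)
    return y


# First-order example: y(k) - 0.5 y(k-1) = u(k), driven by a unit step.
print(simulate([1.0], [1.0, -0.5], [1.0] * 4))  # [1.0, 1.5, 1.75, 1.875]
```

Each output value needs only the stored previous inputs and outputs, no integration routine.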
Moving Average (MA) System
The output is a weighted average of the current and previous input values:
$$y(k) = b_0 u(k) + b_1 u(k-1) + \ldots + b_m u(k-m)$$
Such a system is also called FIR (finite impulse response) because its response to an impulse
input is exactly zero after m steps.
Autoregressive (AR) System

The output is a weighted average of the previous output values (plus the current input):
$$y(k) = b_0 u(k) - a_1 y(k-1) - \ldots - a_n y(k-n)$$
Such a system is also called IIR (infinite impulse response) because its response to an impulse
input never becomes exactly zero.
Autoregressive Moving Average (ARMA) System

A combination of an MA and an AR system. This corresponds to the general linear form:
$$y(k) = b_0 u(k) + \ldots + b_m u(k-m) - a_1 y(k-1) - \ldots - a_n y(k-n)$$
Because it includes AR terms it possesses an IIR.
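The difference between a finite and an infinite impulse response shows up immediately in a small experiment. The sketch assumes the sign convention y(k) = b0 u(k) − a1 y(k−1) for the AR part; function names and coefficient values are illustrative.

```python
def impulse_response_ma(b, n):
    """MA/FIR system y(k) = b[0] u(k) + ... + b[m] u(k-m): the response equals the coefficients."""
    return [b[k] if k < len(b) else 0.0 for k in range(n)]


def impulse_response_ar1(b0, a1, n):
    """First-order AR/IIR system y(k) = b0 u(k) - a1 y(k-1) with zero initial condition."""
    g, y_prev = [], 0.0
    for k in range(n):
        y = b0 * (1.0 if k == 0 else 0.0) - a1 * y_prev
        g.append(y)
        y_prev = y
    return g


print(impulse_response_ma([0.5, 0.3, 0.2], 6))  # [0.5, 0.3, 0.2, 0.0, 0.0, 0.0]
print(impulse_response_ar1(1.0, -0.5, 6))       # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
```

The MA response is exactly zero after its last coefficient, while the AR response keeps halving and never reaches zero exactly.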
Homogeneous Solution: Simulation for u(k) = 0
If the input is u(k) = 0 then the output depends only on the initial values. The simplest
example is the following first-order difference equation with b1 = 0:
$$y(k) + a_1 y(k-1) = b_0 u(k) = 0$$
If the initial condition y(–1) is known the output y(k) can be calculated for all times k:
$$y(0) = -a_1 y(-1), \quad y(1) = (-a_1)^2 y(-1), \quad \ldots, \quad y(k) = (-a_1)^{k+1} y(-1)$$
Stable: |a1| < 1
Unstable: |a1| > 1
Marginally stable: |a1| = 1
For difference equations of order n with n > 1 the solution can be calculated correspondingly. However,
in the general case n initial values y(–1), y(–2), ..., y(–n) are required because y(k) depends on
y(k–1), y(k–2), ..., y(k–n).
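The three stability cases can be reproduced by iterating the first-order equation. This sketch assumes the convention y(k) = −a1 y(k−1) for u(k) = 0, which matches the stable, non-alternating case a1 = –0.8 and the alternating case a1 > 0 described in these slides; the function names are illustrative.

```python
def homogeneous_solution(a1, y_init, n):
    """Iterate y(k) = -a1 * y(k-1) for k = 0..n-1, starting from y(-1) = y_init."""
    y, y_prev = [], y_init
    for _ in range(n):
        y_prev = -a1 * y_prev
        y.append(y_prev)
    return y


def classify(a1):
    """Stability of the first-order difference equation from |a1|."""
    if abs(a1) < 1:
        return "stable"
    if abs(a1) > 1:
        return "unstable"
    return "marginally stable"


for a1 in (-0.8, -1.2, -1.0, 0.8):
    print(a1, classify(a1), homogeneous_solution(a1, 1.0, 4))
```

For positive a1 the printed values change sign from step to step.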
Stability of a First-Order Difference Equation
We can distinguish between three cases:
[Figure: homogeneous solutions over k = 0 ... 10 for a1 = –0.8 (stable), a1 = –1.2 (unstable), and a1 = –1 (marginally stable).]
If a1 > 0 we obtain alternating (in turn positive and negative) solutions. No analogous
behavior exists for time-continuous systems:
[Figure: alternating solutions over k = 0 ... 10 for a1 = 0.8 (stable), a1 = 1.2 (unstable), and a1 = 1 (marginally stable).]
8.2 Difference Equations
Impulse Response
For u(k) = δK(k), with δK(k) = 1 for k = 0 and δK(k) = 0 otherwise, the generated output y(k) is
called the impulse response. Like for time-continuous systems the impulse response completely
characterizes the dynamic behavior of any linear system because the impulse contains all
frequencies with equal power. In contrast to the continuous-time case, it is a sequence, not a
continuous function. For simplicity, we assume all initial conditions are zero, thus the
homogeneous solution part is zero. For a first-order difference equation with b1 = 0 we get:
$$y(k) + a_1 y(k-1) = b_0 \delta_K(k)$$
With zero initial conditions we have y(–1) = 0 and thus the output y(k) for all times k becomes:
$$y(0) = b_0, \quad y(1) = -a_1 b_0, \quad \ldots, \quad y(k) = b_0 (-a_1)^k$$
We obtain the same power law as in the homogeneous case.
Step Response
For u(k) = σ(k) the generated output y(k) is called the step response. Like for time-
continuous systems the step response is the most intuitive way to picture the
dynamics. For simplicity, we assume all initial conditions are zero, thus the homogeneous
solution part is zero. For a first-order difference equation with b1 = 0 we get:
$$y(k) + a_1 y(k-1) = b_0 \sigma(k)$$
With zero initial conditions we have y(–1) = 0 and thus the output y(k) for all times k becomes:
$$y(0) = b_0 \ \text{(identical with the impulse response)}, \quad y(1) = b_0 (1 - a_1), \quad \ldots, \quad y(k) = b_0 \sum_{i=0}^{k} (-a_1)^i$$
Relationship Between Impulse and Step Responses
Remember: In continuous time the following relationship holds between the impulse
response g(t) and the step response h(t):
$$g(t) = \frac{dh(t)}{dt}, \qquad h(t) = \int_0^t g(\tau)\, d\tau$$
In discrete time the relationships are correspondingly:
$$g(k) = h(k) - h(k-1), \qquad h(k) = \sum_{i=0}^{k} g(i)$$
Differences replace derivatives, sums replace integrals. In discrete time the handling is much
simpler with the help of a computer. However, in this form, the number of sum terms
(summands) increases with k! Therefore we look for some other way to calculate the output
of a discrete-time system.
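Both directions of the relationship fit into two short functions. The sketch uses the relations h(k) = g(0) + ... + g(k) and g(k) = h(k) − h(k−1) with h(−1) = 0; the function names and the example sequence are illustrative. A running sum (`itertools.accumulate`) avoids re-adding all earlier terms for every k.

```python
from itertools import accumulate


def step_from_impulse(g):
    """h(k) = g(0) + g(1) + ... + g(k), computed as a running sum."""
    return list(accumulate(g))


def impulse_from_step(h):
    """g(k) = h(k) - h(k-1), with h(-1) = 0."""
    return [h[k] - (h[k - 1] if k > 0 else 0) for k in range(len(h))]


g = [1.0, 0.5, 0.25, 0.125]
h = step_from_impulse(g)
print(h)                     # [1.0, 1.5, 1.75, 1.875]
print(impulse_from_step(h))  # [1.0, 0.5, 0.25, 0.125]
```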
Convolution Sum
The impulse response sequence contains all properties of a linear dynamic system in discrete
time. For the time-continuous case the output in response to an arbitrary input signal u(t) can
be calculated by the convolution integral:
$$y(t) = \int_{-\infty}^{\infty} g(\tau)\, u(t - \tau)\, d\tau = \int_{-\infty}^{\infty} u(\tau)\, g(t - \tau)\, d\tau$$
[Block diagram: u(t) → g(t) → y(t)]
In discrete time the corresponding expression is the convolution sum. With it the output y(k)
to every input signal u(k) can be calculated:
$$y(k) = \sum_{i=-\infty}^{\infty} g(i)\, u(k - i) = \sum_{i=-\infty}^{\infty} u(i)\, g(k - i)$$
[Block diagram: u(k) → g(k) → y(k)]
Usually we assume that for negative times the input is equal to zero, i.e., u(k) = 0 for k < 0.
This means that the first sum must be calculated only up to i = k or alternatively the second
sum has to start at i = 0. Additionally, if the system is causal, i.e., g(k) = 0 for k < 0, then the
first sum can start at i = 0 and the second sum run up to i = k.
Convolution Sum (simplified)
With these simplifications the first sum can be written as:
$$y(k) = \sum_{i=0}^{k} g(i)\, u(k - i) = g(0) u(k) + g(1) u(k-1) + \ldots + g(k) u(0)$$
In the second sum the order is reversed:
$$y(k) = \sum_{i=0}^{k} u(i)\, g(k - i) = u(0) g(k) + u(1) g(k-1) + \ldots + u(k) g(0)$$
Obviously, both sums are identical! With the help of a computer the sums are very fast and
easy to calculate. It is much easier than the convolution integral in the continuous-time case.
WARNING: With increasing simulation times k → ∞ the number of terms in the sum
increases linearly. If the impulse response g(k) is of infinite length (IIR) then the
computational and storage demand increases without limits! This means that we have to find
a way to calculate the output of IIR systems in a more practical and efficient
manner. For systems with finite impulse responses of length L (FIR) the number of terms in
the sum is limited to L.
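A direct implementation of the simplified convolution sum could look as follows. It is a sketch for a causal impulse response g and an input with u(k) = 0 for k < 0; the function name `convolve_causal` and the example values are illustrative. For an FIR system of length L the inner sum never has more than L terms.

```python
def convolve_causal(g, u):
    """y(k) = sum_{i=0}^{k} g(i) u(k-i) for a causal g and u(k) = 0 for k < 0."""
    y = []
    for k in range(len(u)):
        last = min(k, len(g) - 1)  # g(i) = 0 beyond the stored length of g
        y.append(sum(g[i] * u[k - i] for i in range(last + 1)))
    return y


g = [1.0, 0.5, 0.25]      # FIR impulse response of length L = 3
u = [1.0, 1.0, 1.0, 1.0]  # unit step
print(convolve_causal(g, u))          # [1.0, 1.5, 1.75, 1.75]
print(convolve_causal(u, g + [0.0]))  # [1.0, 1.5, 1.75, 1.75], roles of g and u swapped
```

Swapping the two sequences gives the same output, which is the statement that both sums are identical.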
Convolution with an Impulse
In continuous time the impulse δ(t) has the sifting property, i.e., a convolution with a Dirac
impulse yields the signal itself. The Dirac impulse is the neutral element in a convolution like
“0” in addition or “1” in multiplication. For the calculation of the impulse response we
choose u(t) = δ(t) and this yields:
$$y(t) = \int_{-\infty}^{\infty} g(\tau)\, \delta(t - \tau)\, d\tau = g(t)$$
In discrete time we choose u(k) = δK(k) and calculate with the convolution sum:
$$y(k) = \sum_{i=0}^{k} g(i)\, \delta_K(k - i) = g(k) \quad \text{because } \delta_K(k - i) = 1 \text{ only for } k = i$$
This is exactly the result that corresponds to the time-continuous case.
Hilbert's Hotel
This hotel has an infinite number of rooms. It illustrates the understanding of infinite sets.
If all rooms are taken, is a room available for one additional guest, or for 2, or for ∞?
Source: http://www.mathcs.org/analysis/reals/infinity/graphics/hilberts_hotel.jpg
Exponential Relationships: Not intuitive!
Q: If we fold a piece of paper (thickness = 0.1 mm)
50 times, doubling the thickness with each fold:
How high is the stack?
A: About the distance from Earth to Mars = 100 million km.
Source: http://www.wdr.de/tv/kopfball/sendungsbeitraege/2011/1120/papier-falten.jsp
Q: If we stack coins,
one stack on each field of a chess board:
1 coin on chess field 1,
8 coins on chess field 4, ...
How high is the stack on chess field 64?
A: Up to α-Centauri = 4 light years.
Source: https://www.youtube.com/watch?v=0mOZZLJZwpw
A human can calculate these numbers but cannot guess them! Human intuition fails with
exponential relationships. That makes them potentially dangerous (extinction of species).
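The paper-folding number is quickly verified. The sketch below only uses the stated thickness of 0.1 mm and the doubling rule; the function names are illustrative. The coin example is limited to the number of coins, because the coin thickness is not given.

```python
PAPER_THICKNESS_M = 0.1e-3  # 0.1 mm, as stated in the question


def folded_thickness_m(folds):
    """Thickness in meters after doubling the stack `folds` times."""
    return PAPER_THICKNESS_M * 2 ** folds


def coins_on_field(field):
    """Coins on chess field 1..64 when the count doubles from field to field."""
    return 2 ** (field - 1)


print(folded_thickness_m(50) / 1e3)           # thickness in km, roughly 1.1e8
print(coins_on_field(4), coins_on_field(64))  # 8 9223372036854775808
```

After 50 folds the stack is roughly 1.1 · 10^11 m high, i.e. on the order of 100 million km.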
Geometric Series
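In the previous slides geometric sequences or series play an important role. A geometric
series is a sum of exponentially staged numbers:
$$S = \sum_{i=0}^{\infty} x^i = 1 + x + x^2 + x^3 + \ldots$$
The following trick allows us to calculate this infinite sum exactly:
$$x S = x + x^2 + x^3 + \ldots = S - 1$$
Thus, for |x| < 1 (for |x| ≥ 1 the series diverges):
$$S = \frac{1}{1 - x}$$
An extended formula can be derived for finite sums:
$$\sum_{i=0}^{n} x^i = \frac{1 - x^{n+1}}{1 - x}$$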
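The closed-form expressions can be compared with a term-by-term summation. The sketch assumes the finite sum runs from i = 0 to n, i.e. 1 + x + ... + x^n = (1 − x^(n+1))/(1 − x); the function names and the value x = 0.82 are chosen for illustration.

```python
def geometric_sum(x, n):
    """Finite sum 1 + x + x**2 + ... + x**n, added term by term."""
    return sum(x ** i for i in range(n + 1))


def geometric_sum_closed(x, n):
    """Closed form (1 - x**(n+1)) / (1 - x), valid for x != 1."""
    return (1 - x ** (n + 1)) / (1 - x)


x = 0.82
print(geometric_sum(x, 10), geometric_sum_closed(x, 10))
print(geometric_sum(x, 200), 1 / (1 - x))  # the finite sum approaches 1/(1-x) for |x| < 1
```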

8.3 Z-Transform
Description of Sampled Signals
An A/D converter samples a continuous-time signal uc(t) and thereby creates a time-discrete
signal u(k) = uc(kT0). A single sampling is performed at time T1. It can be mathematically
modeled as a multiplication of uc(t) with a Dirac impulse at time T1, i.e., δ(t–T1):
$$u_c(t)\, \delta(t - T_1) = u_c(T_1)\, \delta(t - T_1)$$
If this sampling is performed periodically at the time steps kT0
then the continuous signal uc(t) must be multiplied (modulated)
with a train of impulses:
$$u_s(t) = u_c(t) \sum_{k=-\infty}^{\infty} \delta(t - kT_0) = \sum_{k=-\infty}^{\infty} u(k)\, \delta(t - kT_0)$$
with the abbreviation u(k) = uc(kT0).
[Figure: Sensor → sampler with T0 → A/D → Computer, with the signals uc(t) (continuous), us(t) (impulse train), and u(k) (sequence).]
Interpretation of the Train of Dirac Impulses
The continuous-time description of a sampled signal as modulated impulse train is given by:
$$u_s(t) = \sum_{k=-\infty}^{\infty} u(k)\, \delta(t - kT_0)$$
or, if u(k) = 0 for k < 0:
$$u_s(t) = \sum_{k=0}^{\infty} u(k)\, \delta(t - kT_0)$$
These formulas represent only an idealized model because in reality the impulses are not of
infinite height, of course. These Dirac impulses do not exist in reality. But they associate a
finite energy to each sampled signal point. Thus, also the multiplication with u(k) makes sense.
[Figure: mathematical model of the sampling: uc(t) is multiplied with the impulse train ..., δ(t+2T0), ..., δ(t–3T0), ... to give us(t).]
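Laplace Transform of the Sampled Signal
If we apply the Laplace transform to a sampled signal we obtain the so-called z-transform.
The Laplace transform of a continuous-time signal u(t) is defined as:
$$U(s) = \mathcal{L}\{u(t)\} = \int_0^{\infty} u(t)\, e^{-st}\, dt$$
If we choose for u(t) a sampled signal, i.e., u(t) = us(t) then we obtain:
$$U_s(s) = \int_0^{\infty} \sum_{k=0}^{\infty} u(k)\, \delta(t - kT_0)\, e^{-st}\, dt$$
Remember the sifting property of the Dirac impulse:
$$\int_{-\infty}^{\infty} f(t)\, \delta(t - kT_0)\, dt = f(kT_0)$$
This gives us:
$$U_s(s) = \sum_{k=0}^{\infty} u(k)\, e^{-kT_0 s}$$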
Laplace Transform → z-Transform
With the abbreviation
$$z = e^{T_0 s}$$
the Laplace transform of a sampled signal is called the z-transform (the index “s” can be
skipped because it is clear from the variable name “z” that we deal with discrete time):
$$U(z) = \sum_{k=0}^{\infty} u(k)\, z^{-k}$$
Frequency Response
To calculate the frequency response of a continuous-time system the Laplace variable s is
evaluated on the imaginary axis in the s-plane by setting s = iω for ω = 0 ... ∞. The frequency
response of a discrete-time system can be calculated in the same way. Correspondingly, the
z-variable becomes z = e^{iωT0}. For ω = 0 ... ∞ we run along the unit circle in the z-plane. It
would be circled infinitely many times. Thus the frequency response is periodic, which is caused
by the sampling! But according to the sampling theorem the frequency has to be limited to
ωT0 = π. So we circle only once! (Symmetry with respect to ±ω!)
Derivation of Periodicity of the Frequency Response
We want to consider the periodicity of the frequency response in more detail. The frequency
response of a discrete-time system is:
$$G(z)\big|_{z = e^{i\omega T_0}} = G(e^{i\omega T_0})$$
With the facts e^{i2πn} = 1 for n = 0, ±1, ±2, ... and ω0 = 2π/T0 we can show
$$G(e^{i(\omega + n\omega_0) T_0}) = G(e^{i\omega T_0} e^{i 2\pi n}) = G(e^{i\omega T_0})$$
that the frequency response repeats at all multiples of ω0 (each time we circle around the unit
circle in the z-plane). This means the frequency response is a periodic function. It is identical
for: ω, ω ± ω0, ω ± 2ω0, ω ± 3ω0, ...
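The periodicity can be confirmed numerically by evaluating a transfer function on the unit circle. The sketch assumes z = exp(iωT0) and a sampling frequency ω0 = 2π/T0; the function name `frequency_response` and the example system G(z) = 1/(1 − 0.82 z^-1) are chosen for illustration.

```python
import cmath
import math


def frequency_response(b, a, omega, T0):
    """Evaluate G(z) = (sum b[i] z^-i) / (sum a[j] z^-j) at z = exp(i*omega*T0)."""
    z = cmath.exp(1j * omega * T0)
    num = sum(bi * z ** (-i) for i, bi in enumerate(b))
    den = sum(aj * z ** (-j) for j, aj in enumerate(a))
    return num / den


T0 = 1.0
omega0 = 2 * math.pi / T0   # sampling frequency in rad/s
b, a = [1.0], [1.0, -0.82]  # example system, chosen for illustration
for n in (0, 1, 2):
    # The magnitude is the same for omega, omega + omega0, omega + 2*omega0.
    print(abs(frequency_response(b, a, 0.3 + n * omega0, T0)))
```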
Illustration of the Periodicity of the Frequency Response
[Figure: main spectrum and shadow spectra; s-plane and z-plane with unit circle.]
• The shadow spectra around the multiples of ω0 are created by the sampling with frequency ω0.
• The Im-axis between –iω0/2 and iω0/2 in the s-plane is mapped onto the unit circle in the z-plane.
• The whole information in a time-discrete system is contained in the frequency
response along the unit circle between the frequencies ω = 0 and ω = ω0/2;
in the part of the unit circle ω = –ω0/2 ... 0 it is symmetrical!
Sampling Theorem and Aliasing
• If the maximal signal frequency ωmax is smaller than half the sampling frequency ω0/2, the
continuous-time signal can be reconstructed perfectly from the sampled one. No information is
lost because main and shadow spectra do not overlap. We have no aliasing.
• If ωmax > ω0/2 the main and shadow spectra overlap. We consequently
have aliasing which deteriorates the original signal. A perfect
reconstruction is impossible.
[Figure: spectra with ωmax marked for three cases: sampling fast enough (no aliasing), limit case, and sampling too slow (aliasing).]
Z-Transform of Impulse and Step
The impulse u(k) = δK(k) has the following z-transform:
u(0) = 1, u(1) = 0, u(2) = 0, ... → $U(z) = 1$
An impulse delayed by one time step u(k) = δK(k–1) has the following z-transform:
u(0) = 0, u(1) = 1, u(2) = 0, ... → $U(z) = z^{-1}$
An impulse delayed by d time steps u(k) = δK(k–d) has the following z-transform:
u(0) = 0, ..., u(d–1) = 0, u(d) = 1, u(d+1) = 0, ... → $U(z) = z^{-d}$
The unit step u(k) = σ(k) has the following z-transform:
u(0) = 1, u(1) = 1, u(2) = 1, ... → $U(z) = 1 + z^{-1} + z^{-2} + \ldots = \frac{1}{1 - z^{-1}}$
A unit step delayed by d time steps u(k) = σ(k–d) has the following z-transform:
u(0) = 0, ..., u(d–1) = 0, u(d) = 1, u(d+1) = 1, ... → $U(z) = \frac{z^{-d}}{1 - z^{-1}}$
The following expressions are identical:
$$\frac{1}{1 - z^{-1}} = \frac{z}{z - 1}$$
Z-Transform of Geometric Sequences
The geometric sequence u(k) = a^k with any number a commonly occurs because it describes
an exponential behavior. This sequence has the following z-transform:
u(0) = a^0, u(1) = a^1, u(2) = a^2, u(3) = a^3, ... → $U(z) = 1 + a z^{-1} + a^2 z^{-2} + a^3 z^{-3} + \ldots$
Further conversions lead to the standard form of a geometric series:
$$U(z) = \sum_{k=0}^{\infty} (a z^{-1})^k$$
This infinite geometric series can be expressed simply by (for |a z^{-1}| < 1):
$$U(z) = \frac{1}{1 - a z^{-1}} = \frac{z}{z - a}$$
This allows us to formulate an infinite series as one simple expression. The way back can be carried
out by long division.
Long Division:
z : (z–a) = 1 + a z^{–1} + a^2 z^{–2} + ...
–(z – a)
    a
  –(a – a^2 z^{–1})
      a^2 z^{–1}
    –(a^2 z^{–1} – a^3 z^{–2})
        a^3 z^{–2}
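The long division can be automated. The sketch below expands a rational function in powers of z^-1; it assumes that numerator and denominator are given as coefficient lists of z^0, z^-1, z^-2, ... (so z/(z − a) is entered as 1/(1 − a z^-1)). The function name `long_division` and the value a = 0.5 are illustrative.

```python
def long_division(num, den, n_terms):
    """Expand num(z)/den(z) in powers of z^-1.

    num and den hold the coefficients of z^0, z^-1, z^-2, ...; den[0] must be nonzero.
    """
    rest = list(num) + [0.0] * (n_terms + len(den))  # working copy of the remainder
    series = []
    for k in range(n_terms):
        q = rest[k] / den[0]       # next series coefficient
        series.append(q)
        for j, dj in enumerate(den):
            rest[k + j] -= q * dj  # subtract q * den, shifted by k steps
    return series


a = 0.5
# z / (z - a) = 1 / (1 - a z^-1)
print(long_division([1.0], [1.0, -a], 5))  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```

The returned coefficients are the samples u(0), u(1), u(2), ... of the geometric sequence a^k.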
Important Properties of the z-Transform
For limit considerations the cases t → 0 (k → 0) or
t → ∞ (k → ∞) are evaluated. In the frequency domain (s or z) this means:
t → 0: s → ∞     t → ∞: s → 0
k → 0: z → ∞     k → ∞: z → 1
Start Value
The start value of a sequence can be calculated from its z-transform by:
$$u(0) = \lim_{z \to \infty} U(z)$$
End Value
The end value (if it exists!) of a sequence can be calculated from its z-transform by:
$$\lim_{k \to \infty} u(k) = \lim_{z \to 1} (z - 1)\, U(z)$$
Backward Shift (To the Right)
A dead time Tt = dT0 is equivalent to a backward shift (shift to the right) by d samples. This
operation corresponds to the Laplace transform e^{–Tt s}. In the z-domain this means:
$$u(k - d) \;\leftrightarrow\; z^{-d}\, U(z)$$
[Figure: sequence u(k) and the sequence u(k–1) shifted one step to the right.]
Forward Shift (To the Left)
A prediction of time Tp = dT0 is equivalent to a forward shift (shift to the left) by d samples.
This operation corresponds to the Laplace transform e^{Tp s}. In the z-domain this means (if u(0) = ... = u(d–1) = 0):
$$u(k + d) \;\leftrightarrow\; z^{d}\, U(z)$$
[Figure: sequence u(k) and the sequence u(k+1) shifted one step to the left.]
Difference / Differentiation
The difference of two subsequently sampled values divided by the sampling time (that passed
between their measurement) is called the difference of first order and corresponds approximately
to a differentiation. In the s-domain it is realized by
a multiplication with s. In the z-domain this is given by:
$$\frac{u(k) - u(k-1)}{T_0} \;\leftrightarrow\; \frac{1 - z^{-1}}{T_0}\, U(z)$$
[Figure: slope between the samples u(k–1) at (k–1)T0 and u(k) at kT0.]
Summation / Integration
The sum of all sampled values starting from time 0 multiplied by the sampling time is equal
to the lower sum approximation of the area below the samples. That approximately equals the
integration. In the s-domain this is realized by a division by s. In the z-domain this
corresponds to:
$$T_0 \sum_{i=0}^{k} u(i) \;\leftrightarrow\; \frac{T_0}{1 - z^{-1}}\, U(z)$$
[Figure: rectangles of width T0 below the samples u(i) at t = 0, T0, 2T0, ..., kT0.]
Transfer Function and Impulse Response
The same relationship exists for discrete-time systems between a transfer function in the z-
domain and the impulse response sequence as for continuous-time systems between a
transfer function in the s-domain and the impulse response function:
$$G(z) = \mathcal{Z}\{g(k)\}, \qquad G(s) = \mathcal{L}\{g(t)\}$$
In G(z) as in g(k) all properties of a linear dynamic system are contained. For calculation of
the system output over time only the system input over time and either G(z) or g(k) are
required.
Multiplication: $Y(z) = G(z)\, U(z)$
Convolution: $y(k) = \sum_{i=0}^{k} g(i)\, u(k - i)$
The multiplication in the z-domain corresponds to the convolution sum in the discrete time
domain, just as the multiplication in the s-domain corresponds to the convolution integral in
the continuous time domain.
Transfer Function and Impulse Response
We choose a Kronecker delta impulse as input u(k) = δK(k) or U(z) = 1, respectively. This
yields the impulse response as output:
$$Y(z) = G(z) \cdot 1 = G(z) \quad \text{or} \quad y(k) = g(k)$$
For a general impulse response sequence
$$g(k) = \{g(0), g(1), g(2), \ldots\}$$
the corresponding transfer function is:
$$G(z) = g(0) + g(1) z^{-1} + g(2) z^{-2} + \ldots = \sum_{k=0}^{\infty} g(k)\, z^{-k}$$
If the impulse response sequence g(k) is of finite length the same is true for the number of
terms in G(z). If g(k) is of infinite length, however, the same is also true for G(z) and an
easier-to-handle alternative has to be found to avoid an infinite sum.
Example: Transformation Via Impulse Response Invariance
A common method for the transformation from the continuous to the discrete world is to
demand identical impulse responses. This is popular for digital filter design. We demand that
the discrete impulse response sequence is identical to the sampled continuous impulse
response function.
[Figure: continuous-time impulse response g(t) over t [sec] from 0 to 10 and the identical sampled sequence g(k) in discrete time over k from 0 to 10.]
For a first order system this requires an impulse response of:
$$G(s) = \frac{K}{1 + Ts} \quad \Rightarrow \quad g(t) = \frac{K}{T}\, e^{-t/T}$$
For K = 5 and T = 5 sec this results in: g(t) = e^{–t/5}. If the sampling time is chosen as
T0 = 1 sec then the demand for an impulse response invariance yields:
$$g(k) = e^{-k/5} = 0.82^k$$
Note that this is a geometric sequence!
This can also be written with the help of delayed delta impulses:
$$g(k) = \delta_K(k) + 0.82\, \delta_K(k-1) + 0.82^2\, \delta_K(k-2) + \ldots$$
We can easily obtain the corresponding transfer function in the z-domain:
$$G(z) = 1 + 0.82 z^{-1} + 0.82^2 z^{-2} + \ldots$$
Because this infinite series is difficult to handle we compute the explicit sum with the formula
for infinite geometric series with x = 0.82 z^{–1}:
$$G(z) = \frac{1}{1 - 0.82 z^{-1}} = \frac{z}{z - 0.82}$$
Gain: $G(z = 1) = \frac{1}{1 - 0.82} \approx 5.56$
Therefore this G(z) corresponds to the G(s) in the sense of impulse response invariance.
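The numbers of this example can be reproduced in a few lines. The sketch assumes that the first-order system is the lag G(s) = K/(1 + Ts) with impulse response g(t) = (K/T) e^(−t/T); the variable and function names are illustrative.

```python
import math

K, T, T0 = 5.0, 5.0, 1.0  # values from the example


def g_continuous(t):
    """Impulse response of the first-order lag K / (1 + T s) (assumed system)."""
    return K / T * math.exp(-t / T)


pole = math.exp(-T0 / T)  # base of the geometric sequence, about 0.82
g_sampled = [g_continuous(k * T0) for k in range(5)]
g_geometric = [pole ** k for k in range(5)]
gain = 1.0 / (1.0 - pole)  # G(z = 1) for G(z) = 1 / (1 - pole * z^-1)

print(round(pole, 4))  # 0.8187
print(round(gain, 2))  # 5.52
```

With the unrounded pole the gain is about 5.52; with the rounded value 0.82 it is 1/(1 − 0.82) ≈ 5.56. In both cases it differs from the continuous-time gain K = 5.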
Example: Transformation Via Step Response Invariance
Another popular method for transformation from continuous to discrete time is the step
response invariance. It yields a different result than impulse response invariance. The
denominators (and thus poles) are identical but the numerators (and thus zeros) and the gains
are different. For K = 5, T = 5 sec, and T0 = 1 sec:
$$G(z) = \frac{K (1 - e^{-T_0/T})\, z^{-1}}{1 - e^{-T_0/T} z^{-1}} \approx \frac{0.9\, z^{-1}}{1 - 0.82 z^{-1}}$$
Gain: $G(z = 1) = K = 5$
The choice of the criterion distinguishes the different types of such transformations. An invariance of
the impulse responses accounts for all frequencies in the same way because all frequencies
are weighted equally (constant spectrum of an impulse). Therefore it is commonly applied
for filter design.
An invariance of the step response, however, weights lower frequencies more strongly and is the
appropriate choice for control applications where the manipulated variable typically is of
stepwise character. It also ensures a correct transformation of the gain.
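Transfer Function → Difference Equation
Consider a general transfer function of numerator degree m and denominator degree n:
$$G(z) = \frac{Y(z)}{U(z)} = \frac{b_0 + b_1 z^{-1} + \ldots + b_m z^{-m}}{a_0 + a_1 z^{-1} + \ldots + a_n z^{-n}}$$
The coefficient a0 can be set to 1 by dividing numerator and denominator by a0. This yields the
following difference equation in the time domain:
$$y(k) + a_1 y(k-1) + \ldots + a_n y(k-n) = b_0 u(k) + b_1 u(k-1) + \ldots + b_m u(k-m)$$
A dead time of Tt = dT0 causes a backward shift by d steps:
$$G(z) = \frac{b_0 + b_1 z^{-1} + \ldots + b_m z^{-m}}{1 + a_1 z^{-1} + \ldots + a_n z^{-n}}\, z^{-d}$$
In contrast to the s-domain, a dead time in the z-domain still keeps the transfer function of
rational type (numerator / denominator)!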
If the transfer function is written in the form of positive powers of z, it can first be converted into a
form with negative powers, i.e., z^{–1}, and afterwards it can be transformed into the time
domain.
$$G(z) = \frac{b'_m z^m + \ldots + b'_1 z + b'_0}{a'_n z^n + \ldots + a'_1 z + a'_0}$$
WARNING: These are different n and m values compared to the previous slide.
For n = m this transfer function is identical to the one on the previous slide. For n > m a dead
time can be factored out in the numerator:
$$G(z) = \frac{b'_m + b'_{m-1} z^{-1} + \ldots + b'_0 z^{-m}}{a'_n + a'_{n-1} z^{-1} + \ldots + a'_0 z^{-n}}\, z^{-d}$$
with d = n – m. The case m > n does not occur (negative dead time → non-causal)!
Causality and Properness
A transfer function of the form
$$G(z) = \frac{b_0 + b_1 z^{-1} + \ldots + b_m z^{-m}}{a_0 + a_1 z^{-1} + \ldots + a_n z^{-n}}$$
has numerator degree m and denominator degree n which are positive integers. G(z) is causal.
A transfer function of the form
$$G(z) = \frac{b'_m z^m + \ldots + b'_1 z + b'_0}{a'_n z^n + \ldots + a'_1 z + a'_0}$$
requires: denominator degree ≥ numerator degree or n ≥ m. If this requirement is met then
G(z) is causal. However, if m > n, then G(z) is non-causal: negative dead times arise, i.e.,
values in the future would have to be predicted.
The condition denominator degree ≥ numerator degree is known from the s-domain. There it is
a condition for properness or realizability, i.e., avoiding pure differentiators! For time-
discrete systems such limitations do not exist. Every causal system can be realized.
Proper / Strictly Proper
For continuous-time systems the difference between a proper and strictly proper system can
be directly seen in the transfer function.
• Proper: numerator degree ≤ denominator degree: m ≤ n
• Strictly proper: numerator degree < denominator degree: m < n
In discrete time a system is proper but not strictly proper (= “sprungfähig”)
• for transfer functions in z-form (only positive powers of z):
numerator degree m = denominator degree n
• for transfer functions in z–1-form (only negative powers of z): b0 ≠ 0
Only if b0 ≠ 0 does the input u(k) directly influence the output y(k). If b0 = 0 then a change in
the input is delayed by one or more steps, i.e., only u(k–1) or earlier inputs affect the output y(k).
Terminology: A system follows the difference equation:
This can be interpreted either as a dead time of 1 or as a not strictly proper system:
Difference Equation → Transfer Function
In order to transform a difference equation into the z-domain, first the equation is rewritten
such that y(k) is the newest output value. Then the transformation into the z-domain requires
only operators with negative powers like z–i:
Example:
1.) New starting time step:
2.) Time transformation such that this value is mapped to y(k): k := k–3
3.) Transformation into the z-domain, separation of Y(z) and U(z), division to obtain
transfer function:
IIR (Infinite Impulse Response)
All impulse response functions g(t) in continuous time are of infinite length. Typically they
decay to zero with exponential behavior. By sampling a sequence g(k) of infinite length
results. Such systems are named IIR (infinite impulse response).
IIR systems have a transfer function with a non-trivial denominator, i.e., the denominator is
more complex than z^n. This yields at least two differently delayed versions of y(k) in the
corresponding difference equation. A consequence is that this difference equation can only
be calculated recursively!
Examples:
$$G(z) = \frac{0.4 + 0.5 z^{-1} + 0.6 z^{-2} + 0.7 z^{-3} + 0.8 z^{-4}}{(1 - 0.8 z^{-1})^2\, (2 - z^{-1} + 0.3 z^{-2} + 0.5 z^{-3})}$$
[Further example transfer functions, two of them marked “non-causal!”, are not recoverable.]
FIR (Finite Impulse Response)
Systems with impulse sequences g(k) of finite length are called FIR systems (finite impulse
response). They only exist in discrete time! They have no (exact) equivalent in continuous
time. However, if the length of an FIR system is allowed to be very long it might be possible
to approximate a stable IIR system by a long FIR system. Marginally stable or unstable IIR
systems, in principle, cannot be approximated by an FIR system because their impulse
response does not converge to 0.
FIR systems have a transfer function without denominator or with a denominator of type z^m.
A consequence is only one y-term in the difference equation (feedforward).
Examples:
[Example transfer functions, two of them marked “non-causal!”, are not recoverable.]
Pole-Zero-Form of a Transfer Function
Up to here we have considered transfer functions in explicit polynomial form. However, a
factorized form is often useful because the poles and zeros appear directly in the denominator
and numerator. It is simpler to write it in positive powers of z, with a constant factor c:
$G(z) = c \, \frac{(z - n_1)(z - n_2) \cdots}{(z - p_1)(z - p_2) \cdots}$
The gain of G(z) can be calculated according to the final value theorem of the
z-transform by letting z = 1:
Gain: $G(z = 1)$
The poles $p_i$ and zeros $n_i$ can be transformed into the s-domain via $z = e^{sT_0}$ and can be
interpreted accordingly.
Conditions for stability and phase minimality of poles and zeros follow immediately in the
z-domain.
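As a sketch of both steps, the following Python snippet evaluates the gain at z = 1 and maps poles back to the s-domain by inverting $z = e^{sT_0}$. The numerical poles, the zero, the constant factor and the sampling time are hypothetical values chosen only for illustration.

```python
import cmath

def gain(c, zeros, poles):
    """Static gain G(1) of G(z) = c * prod(z - n_i) / prod(z - p_i)."""
    num = c
    for n in zeros:
        num *= (1 - n)
    den = 1.0
    for p in poles:
        den *= (1 - p)
    return num / den

def z_to_s(z, T0):
    """Invert z = exp(s*T0) on the principal branch of the logarithm."""
    return cmath.log(z) / T0

T0 = 0.1                      # assumed sampling time in seconds
poles = [0.5, 0.8]            # assumed poles inside the unit circle
zeros = [-0.2]                # assumed zero
K = gain(1.0, zeros, poles)
s_poles = [z_to_s(p, T0) for p in poles]

print(K)         # 1.2 / (0.5 * 0.2), i.e. about 12
print(s_poles)   # negative real parts: stable in the s-domain
```

Poles inside the unit circle map to the left half s-plane, so the usual s-domain interpretation (time constants, damping) applies.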
Relation Between s-Plane and z-Plane
• The stability region “left half s-plane” is mapped to the inner region inside the unit circle
in the z-plane.
• The imaginary axis of the s-plane is mapped to the unit circle in the z-plane.
• The unstable region “right half s-plane” is mapped to the region outside the unit
circle in the z-plane.
[Figure: s-plane and z-plane (axes Re, Im); the marked frequency is the maximum possible frequency before aliasing occurs]
Stability
• A transfer function in the z-domain is stable if all poles are inside the unit circle.
• If one or more poles are on the unit circle (no multiple poles!) and all
other poles are inside the unit circle, the system is marginally stable.
• If at least one pole exists outside the unit circle or a multiple pole is on the unit
circle, then the system is unstable.
• The stability properties of a transfer function in the s-domain remain valid after
transformation into the z-domain because the poles transform according to $z = e^{sT_0}$.
Phase Minimality
• A system has minimum phase if it has only stable and marginally stable poles and zeros.
The location of the zeros typically changes during the transformation from the s-domain into
the z-domain. Therefore the property “minimum phase” generally is not preserved during the
transformation.
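The stability rules above can be written as a small Python function. This is a sketch; the function name and the numerical tolerance used to decide whether a pole lies on the unit circle are assumptions.

```python
def classify_stability(poles, tol=1e-9):
    """Classify a discrete-time system by the location of its poles."""
    on_circle = []
    for p in poles:
        r = abs(p)
        if r > 1 + tol:
            return "unstable"            # pole outside the unit circle
        if abs(r - 1) <= tol:
            on_circle.append(p)
    # a multiple pole on the unit circle also makes the system unstable
    for i in range(len(on_circle)):
        for j in range(i + 1, len(on_circle)):
            if abs(on_circle[i] - on_circle[j]) <= tol:
                return "unstable"
    return "marginally stable" if on_circle else "stable"

print(classify_stability([0.5, 0.9j]))   # all poles inside
print(classify_stability([1.0, 0.5]))    # single pole on the circle
print(classify_stability([1.0, 1.0]))    # double pole on the circle
```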
Example: All-Pass in z-Domain
An all-pass is characterized by an amplitude response equal to 1 for all frequencies. Because
pole and zero have the same absolute value and just opposite signs, they cancel in the
magnitude. Of course the phase is affected. A simple first-order all-pass in the s-domain has
a stable pole and an unstable zero.
The corresponding all-pass in the z-domain has a stable pole and the inverse zero (unstable)
mirrored at the unit circle. It is not the direct transformation from s to z!
The amplitude response follows by evaluating |G(z)| on the unit circle.
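The unit amplitude response can be checked numerically. The sketch below assumes the form $G(z) = (1 - a z)/(z - a)$ with real |a| < 1, which is one common way to write a first-order discrete all-pass and not necessarily the form used on the slide. Its pole lies at z = a and its zero at z = 1/a.

```python
import cmath
import math

def allpass(z, a):
    """Assumed first-order all-pass: pole at z = a, zero at z = 1/a."""
    return (1 - a * z) / (z - a)

a = 0.5
omegas = [i * math.pi / 8 for i in range(9)]   # normalized frequencies 0 ... pi
magnitudes = [abs(allpass(cmath.exp(1j * w), a)) for w in omegas]
phases = [cmath.phase(allpass(cmath.exp(1j * w), a)) for w in omegas]

print(magnitudes)   # all equal to 1 up to rounding
print(phases)       # the phase changes with frequency
```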
Chapter 8: Relevant MATLAB Commands
Change of Sampling Rate:

decimate(x,r);     % [1] Reduces the sampling rate of signal x
                   % by a factor of r with the help of a low-pass
                   % filter.
upsample(x,n);     % [1] Increases the sampling rate by a factor of n
                   % by inserting zeros between the sample
                   % points.
                   % E.g.: x = [1 2 3];
                   %       y = upsample(x,3);
                   %       y = [1 0 0 2 0 0 3 0 0]
downsample(x,n);   % [1] Reduces the sampling rate. Only every n-th
                   % sample is carried over.
                   % E.g.: x = [1 2 3 4 5 6 7 8 9 10];
                   %       y = downsample(x,3);
                   %       y = [1 4 7 10]
resample(x,p,q);   % [1] Changes the sampling rate of signal vector x
                   % by the rational factor p/q.
Impulse Response and Step Response:

impulse;           % [2] Calculates the impulse response of a linear
                   % system.
step;              % [2] Calculates the step response of a linear
                   % system.
Partial Fraction Expansion:

[r,p,k] = residuez(b,a);   % [1] Performs a partial fraction expansion
                           % of the ratio of numerator b(z)
                           % and denominator a(z).
                           % The inverse operation is also
                           % possible.
[1]: Signal Processing Toolbox
[2]: Control System Toolbox
9. Transformation Into the
Frequency Domain
9. Transformation of Signals in the Frequency Domain
[Portrait: Joseph Fourier, 1768-1830 (www.wikepedia.org)]
9.1 Discrete Fourier Transform (DFT)
9.2 Extension: Fast Fourier Transform (FFT)
9.3 Frequency Analysis Via DFT
9.4 Leakage Effect and Windowing
9.5 Non-Stationary Signals and Short-Time-DFT
9.6 Outlook: Time-Frequency-Analysis
9.7 Outlook: Parametric Frequency Analysis
Fourier Series
[Figure] Source: ftp://ftp.ifn-magdeburg.de/pub/MBLehre/sv06_130509-ftp.pdf
[Figure] Source: http://eitidaten.fh-pforzheim.de/daten/mitarbeiter/blankenbach/vorlesungen/mathe_2/Fourier_Trafo_kurz_Folien.pdf
Standard concert pitch A4: f0 = 440 Hz on different music instruments
Fourier Series
• Decomposition of a periodic signal into its frequency components.
• The signal can be decomposed into an infinite sum of sine and cosine terms.
• The amplitude for each frequency indicates how strongly this frequency
is contained in the signal.
• If non-periodic signals are to be dealt with: period length → ∞,
basic oscillation → 0.
[Figure: approximation of a signal with 1, 2 and 3 terms over time t; amplitudes of the 1st, 2nd and 3rd harmonic over frequency ω]
Fourier Transform
• Extension of the Fourier series to non-periodic signals.
• Period length T → ∞, basic oscillation ω0 → 0.
• The spectrum is no longer composed of discrete frequencies n·ω0 (i.e., multiples of the
basic oscillation). Rather it consists of a continuum of frequencies (ω is a real number):
the so-called amplitude density spectrum (similarly for the phase).
[Figure: amplitudes of the 1st, 2nd and 3rd harmonic at ω0, 2ω0, 3ω0, ... (Fourier series) versus amplitude density |F(iω)| over ω (Fourier transform)]
9.1 Discrete Fourier Transform (DFT)
Signals contain many different frequencies.
A transformation from the time domain to the frequency domain allows us to examine
how strongly each frequency is contained in the signal.
This is a powerful tool for the analysis and further processing of signals.
[Figure: four time signals over time t; annotation: share of high frequencies in the signal]
Remember:
Fourier Transform
• Time continuous
• Frequency continuous
Time-Discrete Fourier Transform

• Time discrete: t = kT0 (T0: sampling time, f0 = 1/T0: sampling frequency)
• Frequency continuous
Discrete Fourier Transform

• Time discrete in N samples: t = kT0 for k = 0, 1, ..., N-1
• Frequency discrete in N samples: f = n·f0/N for n = 0, 1, ..., N-1
Properties
• Periodicity in the frequency range (see sampling theorem).
Because the exp-function is periodic with i2π: $X(n + N) = X(n)$
Interpretation of time and frequency axes:
E.g. for f0 = 50 Hz → T0 = 0.02 s and N = 9:
k = 4 → t = 4·0.02 s = 0.08 s
n = 4 → f = 4/N·50 Hz = 22.2 Hz
[Figure: signal over discrete time k (N = 9) and amplitudes per discrete frequency n, periodically continued from n = -9 to n = 17]
Properties
• Periodicity in the time range.
Because the 2π-periodic exp-function also occurs in the backward transformation, the time
signal of the DFT appears to be periodic, in contrast to the continuous-time transform.
The discretization of the frequency axis causes this effect.
[Figure: amplitudes per discrete frequency n (N = 9) and signal over discrete time k, periodically continued from k = -5 to k = 25]
Properties
• If x(k) is real (normal case) then the amplitude response is an even function and the
phase response is an odd function, i.e., both are determined completely by half of the
points; the other half can be generated by mirroring:
- N is even: N/2+1 points are required.
- N is odd: (N+1)/2 points are required.
Reason: The time signal x(k) contains only frequencies up to f0/2 (sampling theorem!);
otherwise we would get aliasing. Therefore it only makes sense to display the frequency
plot in the range 0 ≤ f ≤ f0/2 or 0 ≤ n ≤ N/2. The part f0/2 < f < f0 or N/2 < n < N,
respectively -f0/2 < f < 0 or -N/2 < n < 0, is redundant!
[Figure: amplitude and phase of X(n) over discrete frequency n for N = 9, (N+1)/2 = 5. The right range contains no new information and can be generated by mirroring. Therefore commonly only the left range is displayed!]
Further properties of the DFT are already known from the continuous
Fourier Transform:
• Linearity:
• Time shift:
• Frequency shift:
• Convolution:
• Multiplication:
Inverse DFT
For completeness, here the formula for the transformation back into the time-domain:
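In the usual notation, and assuming the DFT is defined as $X(n) = \sum_{k=0}^{N-1} x(k)\, e^{-i 2\pi k n / N}$ (the convention also used by MATLAB's fft()), the properties listed above and the inverse DFT can be sketched as follows. The symbols a, b, k_0, n_0 and the circular convolution sign ⊛ are assumptions introduced here; all shifts and convolutions are understood modulo N.

```latex
\begin{align*}
\text{Linearity:} \quad & a\,x_1(k) + b\,x_2(k) \;\longleftrightarrow\; a\,X_1(n) + b\,X_2(n) \\
\text{Time shift:} \quad & x(k - k_0) \;\longleftrightarrow\; X(n)\, e^{-i 2\pi k_0 n / N} \\
\text{Frequency shift:} \quad & x(k)\, e^{\,i 2\pi n_0 k / N} \;\longleftrightarrow\; X(n - n_0) \\
\text{Convolution:} \quad & x_1(k) \circledast x_2(k) \;\longleftrightarrow\; X_1(n)\, X_2(n) \\
\text{Multiplication:} \quad & x_1(k)\, x_2(k) \;\longleftrightarrow\; \tfrac{1}{N}\, X_1(n) \circledast X_2(n) \\
\text{Inverse DFT:} \quad & x(k) = \frac{1}{N} \sum_{n=0}^{N-1} X(n)\, e^{\,i 2\pi k n / N}
\end{align*}
```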
Implementation of the DFT
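Abbreviation:
$X(n) = \sum_{k=0}^{N-1} x(k) \, W_N^{kn}$ with $W_N = e^{-i 2\pi / N}$
This can be written for n = 0, 1, 2, ..., N-1 as the following equation system:
$\begin{bmatrix} X(0) \\ X(1) \\ \vdots \\ X(N-1) \end{bmatrix} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & W_N & \cdots & W_N^{N-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & W_N^{N-1} & \cdots & W_N^{(N-1)^2} \end{bmatrix} \begin{bmatrix} x(0) \\ x(1) \\ \vdots \\ x(N-1) \end{bmatrix}$
To carry out this matrix-vector multiplication, the following amount of computation is
necessary:
• $N^2$ complex multiplications
• $N^2$ complex additions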
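A direct implementation of this matrix-vector product can be sketched in Python. It assumes the definition $X(n) = \sum_k x(k)\, W_N^{kn}$ with $W_N = e^{-i2\pi/N}$ (the sign convention of MATLAB's fft()); the function name is illustrative.

```python
import cmath

def dft(x):
    """Direct DFT: N*N complex multiplications and additions."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)   # W_N
    return [sum(x[k] * W ** (k * n) for k in range(N)) for n in range(N)]

X = dft([1.0, 2.0, 3.0, 4.0])
print(X)   # approximately [10, -2+2j, -2, -2-2j]
```

The two nested loops (one over n, one over k) make the quadratic effort directly visible.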
9.2 Extension: Fast Fourier Transform (FFT)
Idea for the Fast Fourier Transform (FFT)
• Efficient implementation of the DFT with identical result.
• Split a DFT of size N (number of data points) into 2 DFTs of size N/2 by a trick.
• Further split the 2 DFTs of size N/2 into 4 DFTs of size N/4, etc.
• These splits are continued up to $N/2^s = 1$; s represents the number of splits necessary.
• Works only if $N = 2^s$, i.e., a power of 2. If this is not the case, the signal x(k) is filled with
zeros such that the number of points is equal to $2^s$ (zero padding).
Complexity of the FFT

• Only about N ld(N) complex multiplications and additions are required.
• Example: N = 1024
- computational demand DFT ~ $N^2$ ≈ 1,000,000
- computational demand FFT ~ N ld(N) = 1024·10 ≈ 10,000 → Factor 100 savings!
Info: ld() is the logarithm to base 2
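The splitting idea can be sketched as a recursive radix-2 (decimation-in-time) FFT in Python. This is one textbook variant under the sign convention $e^{-i2\pi kn/N}$, not necessarily the algorithm used inside MATLAB; the helper zero_pad is a hypothetical name.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of 2")
    even = fft(x[0::2])                 # DFT of size N/2 over even samples
    odd = fft(x[1::2])                  # DFT of size N/2 over odd samples
    out = [0j] * N
    for n in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * n / N) * odd[n]
        out[n] = even[n] + t
        out[n + N // 2] = even[n] - t
    return out

def zero_pad(x):
    """Append zeros until the length is the next power of 2."""
    N = 1
    while N < len(x):
        N *= 2
    return list(x) + [0.0] * (N - len(x))

print(fft([1.0, 2.0, 3.0, 4.0]))           # same result as the direct DFT
print(len(zero_pad([1, 2, 3, 4, 5])))      # padded from 5 to 8 points
```

Each recursion level costs on the order of N operations and there are ld(N) levels, which gives the N ld(N) effort stated above.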
Choice for the amount of data N
[Figure: time signal x(k) over discrete time k and amplitude of X(n) over discrete frequency n for N = 20, N = 32 and N = 40]
Observations:
• All time signals are identical; only the number of zeros filled in varies, so that the
total number of points is N = 20, 32, 40.
• The resolution in the frequency domain depends on N. The frequency axes are scaled as
follows:
- N = 20: f = n·f0/20
- N = 32: f = n·f0/32
- N = 40: f = n·f0/40
• A clever choice of N by zero padding can achieve frequency intervals of the desired
size even if the original signal is shorter than N values.
If a certain frequency is of interest and its amplitude must be known with high accuracy,
it should be contained exactly in the frequency discretization
by an appropriate choice of N (see picket fence effect)!
Observation:
• Doubling the number of points N = 20 → 40 doubles the frequency resolution.
The DFT for N = 20 yields values identical to every second point of the DFT for
N = 40.
[Figure: amplitude of X(n) for N = 20 and N = 40 over discrete frequency n (N = 40); identical values at every second point]
Remark:
• The phase of X(n) is sometimes interesting as well. We focus on the amplitudes, but an
analysis of the phase can also be important.
• MATLAB creates the plots shown in these lecture notes.
fft() yields X(n) in the frequency range 0 to f0.
• Commonly the upper half of the spectrum is omitted because it does not carry any
additional information. Also a symmetric plot around the origin from –f0 /2 to +f0 /2 is
popular.
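The effect of zero padding from N = 20 to N = 40 can be reproduced with a short Python sketch. The decaying test signal is an assumption; the signal used for the plots in the lecture notes is not known.

```python
import cmath

def dft(x):
    """Direct DFT with the convention exp(-i*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]

x20 = [0.9 ** k for k in range(20)]   # assumed test signal with 20 points
x40 = x20 + [0.0] * 20                # zero padding to 40 points

X20 = dft(x20)
X40 = dft(x40)

# every second value of the N = 40 spectrum coincides with the N = 20 spectrum
max_dev = max(abs(X40[2 * n] - X20[n]) for n in range(20))
print(max_dev)
```

Zero padding therefore adds intermediate frequency points but does not change the values that were already there.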
Equivalent Types of Plots for the Spectrum
[Figure: three equivalent plots of the amplitude of X(n): n = 0 ... 40 (upper half redundant), n = -20 ... 20 (one half redundant) and n = 0 ... 20]
9.3 Frequency Analysis Via DFT
Choice of Sampling Time T0 / Sampling Frequency f0
• The faster the signal is sampled, the wider is its frequency range.
• In practice, the amplitudes typically become smaller at higher frequencies.
• As the sampling theorem tells us, the sampling frequency should be chosen such that the
highest significant signal frequency lies below f0/2. Otherwise we get aliasing!
[Figure: time signal x(k) over discrete time k and amplitude of X(n) over discrete frequency n for T0 = 1 ms, f0 = 1 kHz, N = 20 and for T0 = 0.5 ms, f0 = 2 kHz, N = 40. Annotation: the additional frequency range is generated by the increase of the sampling frequency.]
DFT of Sin- or Cos-Type Signals (Complete Periods)
Example: cos-type signals with 1, 2 and 5 complete periods, N = 32.
[Figure: time signal x(k) over discrete time k and amplitude of X(n) over discrete frequency n for the three signals]
Observations:
• The amplitude obtained from the DFT for the signal frequency is N/2 if the original
signal had amplitude 1. It is clear that this number is proportional to the number of data
points N because so many points have to be summed up.
• Thus, the amplitude axes of the frequency response are commonly scaled with a
factor 2/N to make the axes in the plot independent of N.
[Figure: amplitude of X(n) over n for N = 32, N = 64 and N = 128, unscaled (top) and scaled with the factor 2/N (bottom)]
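The peak value N/2 and the effect of the 2/N scaling can be checked with a short Python sketch. The choice N = 32 with M = 5 complete periods follows the example above; the function name is illustrative.

```python
import cmath
import math

def dft(x):
    """Direct DFT with the convention exp(-i*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]

N, M = 32, 5                                   # 5 complete periods in 32 samples
x = [math.cos(2 * math.pi * M * k / N) for k in range(N)]
X = dft(x)

peak = abs(X[M])                               # N/2 for amplitude 1
scaled_peak = 2 / N * peak                     # independent of N
others = max(abs(X[n]) for n in range(N) if n not in (M, N - M))

print(peak, scaled_peak, others)
```

All bins except n = M and its mirror image n = N − M are zero up to rounding errors.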
Observations:
• If the length N of the time signal is an exact multiple of the period length of an oscillation
then the DFT reveals the amplitude of this oscillation exactly in the spectrum:
- The complete energy is concentrated on one peak (if we have just one oscillation).
- This peak lies exactly at the correct frequency.
• Due to the linearity property of the DFT these facts are valid for an additive mixture of
oscillations, as well.
Example: Superposition of two oscillations at f2 = 2/32 f0 and f3 = 5/32 f0 :
[Figure: time signal x(k) and amplitude of X(n) for the superposition of the two oscillations, N = 32]
Reason for the Exact Frequency Representation
1. If a time signal x(k) with k = 0, 1, ..., N-1 contains exactly M periods of a sin- or
cos-signal with period T1, then $N T_0 = M T_1$ holds.
Thus the frequency f1 is automatically exactly equal to one of the discrete frequencies of the DFT:
$f_1 = 1/T_1 = M \cdot f_0 / N$
2. Due to the periodicity of the complex exp-function the DFT “thinks” the signal repeats itself
infinitely often, i.e., the original signal for k = 0, 1, ..., N-1 is repeated for
k = N, N+1, ..., 2N-1 and k = 2N, 2N+1, ..., 3N-1, etc. Because the
oscillations consist of full periods, they fit together
exactly at the points N, 2N, etc. (continuity).
[Figure: time signal with complete periods and its periodic continuation]
DFT of Incomplete Sine-Type Signals
• Typically it is not possible to choose N such that all included oscillations exhibit an
integer number of periods. Reasons:
- The period length of the oscillation of interest is not known.
- Many oscillations of various period lengths are of interest and it is impossible to find a
reasonable value for N that fulfills all conditions concurrently.
What happens if an oscillation is not present for an integer number of periods?
[Figure: time signal x(k) and amplitude of X(n) for N = 32; the signal frequency lies between n = 2 and n = 3 (left and right neighbor)]
Observations:
• The frequency f1 of the periodic signal does not exist exactly in the frequency
discretization! Therefore the amplitude belonging to 2.5/32 f0 is split between 2/32 f0 and
3/32 f0.
→ Picket Fence Effect
• Additionally the spectrum “smears” (leaks) across the whole frequency range. This is a
direct consequence of the discontinuities in the time signal that induce disturbing “steps”
in the (imagined) periodically continued signal.
→ Leakage Effect
[Figure: periodically continued time signal x(k) over discrete time k with a discontinuity at the period boundary]
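Both effects can be reproduced with a cosine of 2.5 periods in N = 32 samples, as in the example above. The Python sketch below is illustrative; the variable names are assumptions.

```python
import cmath
import math

def dft(x):
    """Direct DFT with the convention exp(-i*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]

N = 32
x = [math.cos(2 * math.pi * 2.5 * k / N) for k in range(N)]   # 2.5 periods
mag = [abs(v) for v in dft(x)]

half = range(N // 2 + 1)                        # non-redundant half of the spectrum
two_largest = sorted(half, key=lambda n: mag[n], reverse=True)[:2]
smallest = min(mag[n] for n in half)

print(sorted(two_largest))   # the two neighbors of 2.5 (picket fence effect)
print(smallest)              # no bin is zero (leakage effect)
```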
Identical Example for Different Resolutions, i.e., of Different Lengths N
[Figure: time signal x(k) over k and amplitude |X(n)| over n for N = 32, N = 64, N = 128 and N = 256]
Observations:
• For N → ∞ the DFT result converges to the amplitude and phase response.
• We have to distinguish two negative effects that can occur: (i) the maximum amplitude
is split between its neighbors due to discretization (picket fence effect) and (ii) the spectrum is
smeared across the frequency range (leakage effect).
• A rectangular window has a sinc-function as its Fourier transform.
The original signal can be thought of as multiplied with the rectangle in the time domain,
which corresponds to a convolution with sinc() in the frequency domain.
[Figure: rectangular window w(t) from -T/2 to T/2 and its Fourier transform |W(ω)| (sinc-function) for T = 1]
Explanation of the Leakage Effect
• A band-limited time signal x(k) of length L can be created from a signal $x_p(k)$ of
infinite length or large N by multiplication with a rectangular window $w_L(k)$ of length L:
$x(k) = x_p(k) \cdot w_L(k)$
• This multiplication in the time-domain corresponds to a convolution in the frequency-domain
(up to a constant factor):
$X(n) = X_p(n) * W(n)$
Here W(n) is the DFT of the rectangular window $w_L(k)$.
[Figure for L = 32, N = 64: $x_p(k)$, $w_L(k)$ and x(k) over k]
The DFT of a rectangular window of length L looks like the sinc-function. In practice,
usually L = N. Zero-padding is equivalent to L < N since w(k) = x(k) = 0 for k ≥ L anyway:
[Figure: |W(n)| for N = 32 over n = 0 ... N-1; n = 1 corresponds to f = f0/N]
The zeros of the DFT of the rectangular window of length N lie at multiples of f0/N. If the
time signal is an oscillation of frequency M f0/N, then the zeros are at integer values of n.
This means that in this case the convolution is trivial and no leakage effect
results.
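The location of the zeros can be checked numerically. The Python sketch below computes the DFT of a rectangular window of length L, zero-padded to N points; the function name is an assumption.

```python
import cmath

def rect_window_dft(L, N):
    """Magnitude of the DFT of a rectangular window of length L, padded to N points."""
    w = [1.0] * L + [0.0] * (N - L)
    return [abs(sum(w[k] * cmath.exp(-2j * cmath.pi * k * n / N) for k in range(N)))
            for n in range(N)]

W_full = rect_window_dft(32, 32)     # L = N: all zeros fall on the DFT grid
W_padded = rect_window_dft(32, 64)   # L < N: the grid also samples the sidelobes

print(W_full[0], max(W_full[1:]))    # N and (numerically) zero
print(W_padded[1], W_padded[2])      # sidelobe value and a zero
```

For L = N only the bin n = 0 is non-zero, which is why an oscillation with complete periods shows no leakage.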
Summary: Rectangular Window
• Every finite real signal of length N can thus be thought of as constructed by
multiplying an infinite-length signal with period length N by a rectangular window of
length N.
• The rectangular window leads to discontinuities, i.e., abrupt changes. This means high
frequencies are induced.
• The errors caused by windowing with a rectangle or not windowing at all (which is the
same thing!) are thus extremely large (picket fence and leakage effects).
Room for Improvement

• A smoother window shape would induce fewer high frequencies.
• Many alternative windows are commonly used, see next slide.
• All these windows are similar. They reduce the leakage effect. However, they necessarily
distort the signal by their smooth transition at the beginning and end.
[Figure: window functions Uniform/Rectangular, Blackman, Hann, Bartlett, Hamming and Gauss. Source: en.wikipedia.org]
Example: Windowing with Uniform/Rectangular and Hann Window
• Using the uniform/rectangular window is like using no window at all.
• The Hann window (and similar alternatives) reduces the leakage effect significantly. Due to
the smoother transitions at the window edges, fewer disturbing high frequencies are
induced.
• Hann window of length L (usually L = N).
[Figure: x(k) over k and |X(n)| over n for N = 256 with uniform/rectangular window and with Hann window]
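The reduction of leakage can be sketched in Python for a cosine with 10.5 periods in N = 256 samples. The window formula w(k) = 0.5·(1 − cos(2πk/N)) is the periodic form of the Hann window and is an assumption here; the slide may use a slightly different variant such as division by L − 1. The evaluation bin n = 40 is chosen arbitrarily far away from the peak.

```python
import cmath
import math

def dft(x):
    """Direct DFT with the convention exp(-i*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]

N = 256
x = [math.cos(2 * math.pi * 10.5 * k / N) for k in range(N)]    # 10.5 periods
hann = [0.5 * (1 - math.cos(2 * math.pi * k / N)) for k in range(N)]

X_rect = dft(x)                                   # no window = rectangular window
X_hann = dft([x[k] * hann[k] for k in range(N)])  # Hann-windowed signal

far = 40                                          # bin far away from the peak
leak_rect = abs(X_rect[far])
leak_hann = abs(X_hann[far])
print(leak_rect, leak_hann)
```

Far away from the signal frequency, the Hann-windowed spectrum is smaller by orders of magnitude.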
[Figure (zoom, n = 0 ... 20): |X(n)| with uniform/rectangular window (significant leakage into high frequencies) and with Hann window (less leakage into high frequencies); signal frequency f1 = 10.5 Hz]
Observations:
• Hann window reduces leakage effect significantly.
• Since the Hann window has a smaller area than the rectangular window, signal energy is
lost and the amplitudes in the spectrum are smaller. It makes sense to normalize with
respect to the window area in order to compensate for this influence.
Correction of Signal Damping with Windowing
Windowing distorts the original signal in 2 ways:
- Amplitude: The signal amplitude is reduced.
- Energy: The signal energy (effective value RMS, “area under the signal”) is reduced.
One of these effects can be corrected by multiplying the DFT with a correction factor (≥ 1):

Window Type            Correction Amplitude   Correction Energy
Uniform/Rectangular    1                      1
Hann                   2                      1.63
Hamming                1.85                   1.59
Blackman               2.8                    1.97
Source: https://community.plm.automation.siemens.com/t5/Testing-Knowledge-Base/Window-Correction-Factors/ta-p/431775
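The Hann and Hamming entries of the table can be reproduced from the window samples. The Python sketch below assumes the common definitions amplitude correction = N / Σ w(k) and energy correction = sqrt(N / Σ w(k)²), together with the periodic window forms 0.5 − 0.5·cos(2πk/N) and 0.54 − 0.46·cos(2πk/N); these definitions are assumptions and are not stated in the lecture notes.

```python
import math

def correction_factors(w):
    """Amplitude and energy correction factors of a window (assumed definitions)."""
    N = len(w)
    amplitude = N / sum(w)
    energy = math.sqrt(N / sum(v * v for v in w))
    return amplitude, energy

N = 1024
hann = [0.5 - 0.5 * math.cos(2 * math.pi * k / N) for k in range(N)]
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * k / N) for k in range(N)]
rect = [1.0] * N

hann_amp, hann_energy = correction_factors(hann)
hamm_amp, hamm_energy = correction_factors(hamming)
rect_amp, rect_energy = correction_factors(rect)

print(round(hann_amp, 2), round(hann_energy, 2))
print(round(hamm_amp, 2), round(hamm_energy, 2))
```

The Blackman values depend on which Blackman variant is meant and are therefore not recomputed here.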
9.5 Non-Stationary Signals and Short-Time-DFT
Stationary Signals:
• Signals that do not change their characteristics / properties over time.
• Up to this point we implicitly assumed that all signals are stationary.
Non-stationary Signals:
• Signals that do change their characteristics / properties over time.
• In practice most signals are non-stationary. However, for a short time interval they can be
considered, at least approximately, stationary. Examples:
o Signals with trends, i.e., with a slowly changing mean. This is typical for larger time
scales, e.g. stock indices over years (not days!). A varying mean changes the
d.c. value of the spectrum at n = 0 or f = 0 Hz.
o Due to wear, the properties of components change over time. Certain signals of
machines (rotation speed, sound, ...) might change their characteristics, e.g. the
frequency of their peak value.
o Instead of wear, a failure can also be the cause of such changes. However, this happens
much faster!
Problem When Applying a Fourier Transform or DFT to Non-Stationary Signals:
• The transform averages by integration or summation over the complete signal. If the spectrum
changes over time, its frequency components are weighted according to their relevance (time dominance).
• The transform reveals no information about when which frequency occurs how strongly
in the signal!
Solving this Problem

1. Transform only short intervals of the signal into the frequency-domain. Within the short
intervals the signal can be assumed to be approximately stationary:
→ Short-time Fourier transform or short-time DFT.
2. Modify the Fourier transform such that it does not look for oscillations of
infinite length (like the original transform) but rather for wave packets that are active
only in certain time intervals:
→ Wavelet transform.
Illustration of the Difficulties When Applying a DFT to Non-Stationary Signals
• The order in which the frequencies occur is irrelevant.
• The result (spectrum) is affected by the frequencies according to their time dominance.
• The DFT is not meaningful!
[Figure: chirp signal 0 ... 20 Hz and chirp signal 60 ... 0 Hz, x(k) over k and |X(n)| over n; the spectra show frequencies of 0 Hz to 20 Hz and of 0 Hz to 60 Hz, respectively]
Short-Time Discrete Fourier Transform (STDFT)
• Windowed DFT
• The width of the window determines the time resolution and also the frequency resolution.
The width is a parameter defined by the user. It should be guided by the expected rate of
change in the spectrum:
- Signal changes its frequency properties quickly → narrow window.
- Signal changes its frequency properties slowly → wide window.
• The DFT does not only depend on the frequency f or n but also on a second variable: the
time shift of the window t0. It indicates the time t0 around which the DFT is valid.
Windowed Fourier Transform with Window w(t):
Windowed DFT with Window w(k):
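A windowed DFT can be sketched in Python as $X(n, k_0) = \sum_{k=0}^{N-1} x(k)\, w(k - k_0)\, e^{-i2\pi kn/N}$. This standard form, the Gaussian window parameterized by a width in samples, and the test signal (frequency bin 20 in the first half, bin 60 in the second half) are assumptions for illustration and are not taken from the slides.

```python
import cmath
import math

def gauss_window(k, k0, width):
    """Gaussian window centered at k0; width in samples (assumed parameterization)."""
    return math.exp(-0.5 * ((k - k0) / width) ** 2)

def short_time_dft(x, k0, width):
    """Windowed DFT around k0 for n = 0 ... N/2."""
    N = len(x)
    xw = [x[k] * gauss_window(k, k0, width) for k in range(N)]
    return [sum(xw[k] * cmath.exp(-2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N // 2 + 1)]

def peak_bin(X):
    """Index of the largest magnitude."""
    return max(range(len(X)), key=lambda n: abs(X[n]))

N = 512
x = [math.cos(2 * math.pi * (20 if k < N // 2 else 60) * k / N) for k in range(N)]

early = peak_bin(short_time_dft(x, k0=128, width=30))
late = peak_bin(short_time_dft(x, k0=384, width=30))
print(early, late)   # dominant frequency bin in the first and in the second half
```

Shifting k0 selects the part of the signal that is analyzed, so the two halves yield different dominant frequencies, whereas a single DFT over all 512 samples would show both peaks without any time information.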
Gaussian as Window
• Strongly decreasing from the center towards the outer regions.
• Symmetrical.
• The Fourier transform of a Gaussian is again a Gaussian, i.e., it is symmetrical in its
time-frequency properties.
[Figure: Gauss windows w(k) for the DFT over k, with k0 = 255, σ = 1/3 and with k0 = 150, σ = 1/6]
[Figure: short-time DFTs of the 1st chirp signal with a Gauss window]
[Figure: short-time DFTs of the 2nd chirp signal with a Gauss window]
Observations:
• By shifting the window via k0, the time range to be analyzed can be selected.
• Because the window is not infinitesimally narrow, all signal properties inside the window are mixed.
• A window that is too wide with respect to the rate of change of the signal spectrum (σ = 1/3) yields an
unnecessarily large averaging effect over time.
• At k0 = 500 the leakage effect is easy to see. The reason is as follows: The Gauss
window is close to the end of the data at 511 and it has significant values where the data
stops. This induces high frequencies similar to those of a uniform/rectangular window.
• For the 2nd chirp signal even the width σ = 1/6 is a bit too wide. This can be seen in the
low quality of the bottom plot. The reason is that the 2nd chirp signal changes its frequency
3 times as fast as the 1st.
• The window should not be chosen too narrow, in order to ensure a certain robustness, see next
slides.
Effect of window width in short-time DFTs of noisy signals
[Figure: noisy original signal, Gauss windows with σ = 1/3, σ = 1/6 and σ = 1/12, and the correspondingly windowed signals over k = 0 ... 500]
Short-time DFTs of the noisy 1st chirp signal with a Gauss window (normalized w.r.t. window area!)
[Figure: |X(n)| over n = 0 ... 50 at k0 = 255 for σ = 1/3, σ = 1/6 and σ = 1/12, with weak noise (top) and strong noise (bottom)]
Observations:
• The window width determines the theoretically maximal possible resolution. The wider a
window is, the more accurately the frequencies can be determined.
• The window width determines the robustness with respect to the noise in the original
signal. The wider a window is, the less the noise deteriorates the result.
Wider windows mean more data is utilized!
• The optimal window width is a compromise between both goals:
- Signal characteristics change slowly → wide window; signal characteristics change quickly → narrow window.
- Weak noise → narrow window possible; strong noise → wide window.
Goals of Time-Frequency Analysis
• Good overview of which frequencies are present at what times.
• Illustration: strength of a frequency component by grey tones:
white = frequency does not exist / black = frequency is strongly present
• The best possible resolutions in time Δt and frequency (or energy) Δf are coupled by
Heisenberg’s uncertainty principle and are thus inversely proportional:
$\Delta t \cdot \Delta f = \text{const.}$ or Area = const.
• The width of the window in the short-time DFT fixes not only the time resolution Δt
but implicitly also the frequency resolution Δf.
[Figure: tilings of the time-frequency plane (t, f) into cells of width Δt and height Δf]
Example 1: Analysis of a periodic signal with varying frequency over time
(f = 10 Hz, 25 Hz, 50 Hz, 100 Hz)
Goal of a short-time DFT:
Frequency analysis of the signal as a function of time. We want to know
when which frequency occurs.
Choice of window width:
• Determines the time resolution Δt.
• Implicitly also determines the frequency resolution Δf because both are
inversely proportional.
[Figure: signal amplitude over time t [s] and windows with Δt = 0.025 s, 0.125 s, 0.375 s and 1.000 s]
[Figure. Source: Wikipedia]
Example 2: Detection and sensitivity with respect to a short peak disturbance
[Figure for Δt = 0.02 and Δt = 0.10. Source: lecture notes „time-frequency-Analyse und Wavelettransformationen“ by M. Clausen and M. Müller, Universität Bonn]
Fourier Transform
• Looks for periodic signals of infinite length and of all frequencies.
• Ad-hoc fix: focus on a certain time range by windowing.
• The window possesses a certain width → determines time and frequency resolution.
Wavelet Transform
• Looks for wave packets of different lengths and frequencies.
• Long wave packets are of low frequency → high frequency resolution but low time resolution.
• Short wave packets are of high frequency → low frequency resolution but high time resolution.
• Idea: High frequencies commonly occur briefly and thus should be resolved more
accurately in time than low frequencies, which typically are present for long time intervals.
9.6 Outlook: Time-Frequency-Analysis
Construction of Wavelets
• Basis wavelet (mother wavelet) as master copy.
• All wavelets are derived from the mother wavelet by time shifts and time scalings
(typically with factors $2^{-n}$).
• Time shift t0 for localization of a certain part of the signal.
• Time scaling by a factor for a certain frequency component.
Properties of Wavelets
• Through the time shift the signal can be analyzed around t = t0.
• Through the time scaling various frequency components can be analyzed.
• In contrast to the Fourier transform, where sin-signals of infinite length are analyzed,
the length of a wavelet is coupled to its frequency (scaling)!
[Figure: wavelets over t = -2 ... 2 for scaling 1 with t0 = 0, scaling 1 with t0 = 0.5, scaling 2 with t0 = 0 and scaling 2 with t0 = -1]
can be compared
9.6 Outlook: Time-Frequency-Analysis to a frequency f
Fourier Transform Wavelet Transform
1 1
0.5
0
0
-1 -0.5
-2 -1.5 -1 -0.5 0 0.5 1 1.5 2 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2
1 1
0 0.5
0
-1 -0.5
-2 -1.5 -1 -0.5 0 0.5 1 1.5 2 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2
1 1
0.5
0
0
-1 -0.5
-2 -1.5 -1 -0.5 0 0.5 1 1.5 2 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2
Windowed Fourier Transform Windowing is not necessary since

wavelets are local themselves!
f !
t t
Parametric Methods
• A large number of data samples N is modeled by estimating a small number of n parameters (n ≪ N).
• These parameters typically result from structural considerations and not only as a means for model accuracy.
• These parameters are interpretable physically or by other first principles, or are easy to convert into interpretable quantities.
• Examples: IIR or transfer function models, AR or ARMA models, ...
Non-parametric Methods
• A large number of data samples N is described with a large number of n parameters. Often n = N, i.e., no averaging or noise suppression in the statistical sense takes place.
• The parameters themselves and their number have no direct physical motivation. They just reflect such issues as accuracy, resolution, variance, etc.
• The parameters have no direct physical meaning or interpretation.
• Examples: FIR models (= impulse response models), DFT, ...
Idea of a Parametric Frequency Analysis
• Signal model: Impulse response of a parametric transfer function.
• Estimation of the parameters of this transfer function.
Example: Autoregressive model of 2nd order (AR(2)):
• Modeling of one damped oscillation.
• Pole locations determine frequency and damping.
• 2 parameters are required for each oscillation and can be estimated by least squares.
[Figure: amplitude spectrum |X(Ω)| in dB over a logarithmic frequency axis, gain = 1]
Time signal: N = 256 samples containing one oscillation at f1 = 5.5 Hz.
DFT
• From 256 samples of the time signal 256 frequency values are computed. → High sensitivity with respect to noise!
• Leakage and picket fence effect distort the spectrum from a peak at f1 = 5.5 Hz to a broader bump.
[Figure: |X(n)| over n for the DFT with N = 256]
Parametric AR(2) Estimation
• From 256 samples of the time signal 2 parameters of the AR model are estimated. → Very insensitive with respect to noise!
• An exact frequency (a real number, not discretized!) is computed.
Estimated parameters: a1 = –1.9818, a0 = 1.0000
Poles: p1/2 = 0.9909 ± i0.1346, |p1/2| = 1 → not damped
[Figure: |X(n)| over n for the AR(2) model; the peak → ∞]
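How the estimated AR(2) parameters translate into a frequency can be checked with a short Python sketch. Two things are assumptions here and not taken from the script: the denominator is written as z^2 + a1*z + a0, and the sampling frequency is set to 256 Hz (256 samples recorded in one second), which is the value that makes the pole angle match f1 = 5.5 Hz.

```python
import cmath
import math

a1, a0 = -1.9818, 1.0000   # estimated AR(2) parameters from the example
fs = 256.0                 # assumed sampling frequency in Hz (not stated in the text)

# Poles are the roots of z^2 + a1*z + a0 (assumed form of the AR(2) denominator)
disc = cmath.sqrt(a1 * a1 - 4.0 * a0)
p1 = (-a1 + disc) / 2.0
p2 = (-a1 - disc) / 2.0

radius = abs(p1)                                        # 1 means: not damped
freq_hz = abs(cmath.phase(p1)) * fs / (2.0 * math.pi)   # pole angle -> frequency
```

The pole radius carries the damping and the pole angle carries the frequency, so the result is a real number and not tied to the frequency grid of the DFT.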
Y = fft(X);          % Discrete Fourier transform (1-D).
                     % The algorithm uses an FFT.
y = ifft(X);         % Inverse discrete Fourier transform
A = dftmtx(n);       % Matrix of the discrete Fourier transform
                     % (DFT). The matrix product with a vector
                     % calculates the DFT of this vector.
spectrum             % Different methods for estimating the
                     % spectrum (see MATLAB help)
window               % Function to perform windowing of signals
                     % (e.g. gausswin, hamming, etc.)
S = spectrogram(x);  % Calculates the short-time Fourier transform
                     % (STFT) of a signal.
Pxx = pcov(x,p);     % Calculates the spectral density function
                     % of the vector x by means of the
                     % covariance method.
                     % p is the order of the predictor (AR).
10. Filters
Contents of Chapter 10
10.1 Requirements
10.2 FIR and IIR Filters
10.3 Design of FIR Filters
- Window Method
- Optimization Method (Parks-McClellan)
10.4 Design of IIR Filters
- Method of Bilinear Transformation
- Overview of Analog Filter Types
10.5 Implementation of Filters
10.6 Nonlinear Filters
10.7 Non-Causal Filters
10.8 Adaptive Filters
10.1 Requirements
What is a Filter?
A filter is a system that modifies certain properties (characteristics) of a signal, e.g., it suppresses or enhances them. Typical filters are dynamic systems and frequency selective, i.e., they block certain frequencies or frequency ranges or let them pass.
Digital Filter
We focus on digital filters, i.e., filters that are discrete in time and can be described by difference equations. They can be implemented directly in digital electronic circuits (hardware) but usually are implemented by programs on a computer (software).
Time ↔ Frequency
Usually we consider signals as functions of continuous or discrete time t or k: x(t) or x(k). In a lot of applications, however, the signals rather depend on other variables like location. This is the case for the vast field of image processing. “Frequency” then means the inverse of space (like normally frequency is the inverse of time).
[Figure: image regions of low frequency and high frequency]
Different Filters
• Coffee filter: Lets only liquids pass!
• Soot filter: Lets only small particles pass!
• Air filter: Lets only small particles pass!
• Optical filter: Lets only certain colors pass!
• Analog electronic filter: Lets certain frequencies pass! Realized as R-L-C-circuit.
• Digital filter: Lets certain frequencies pass! Realized in software on a computer.
Three Steps for Deriving a Filter
1. Specification: What should the filter do and under what restrictions?
2. Design: Which filter fulfills these specifications?
3. Implementation (realization): How is the filter built (in hardware) or programmed (in software)?
Four Steps for Designing a Filter
a) Choice of a system class: E.g. linear, stable, causal, time-invariant, dynamic systems.
b) Choice of a filter structure: e.g. FIR (finite impulse response) or IIR (infinite impulse response).
c) Determination of the filter order.
d) Determination of the filter parameters.
Signal-to-Noise Ratio
The signal-to-noise ratio, abbreviated SNR, is the ratio between the averaged power of a signal (meaningful information) and the averaged power of noise (disturbance):
$$\mathrm{SNR} = \frac{P_\mathrm{Signal}}{P_\mathrm{Noise}}$$
Often it is given in logarithmic scale, in decibel:
$$\mathrm{SNR} = 10 \lg\left(\frac{P_\mathrm{Signal}}{P_\mathrm{Noise}}\right)\ \mathrm{dB}$$
Since it relates powers (~ squares of amplitudes), the 3 dB corner frequency marks the drop of the power to 1/2. This corresponds to the drop of the amplitude to $1/\sqrt{2}$ that is known from the magnitude Bode diagram used in control theory, which shows amplitudes!
The task of a filter is to improve (i.e., increase) the SNR. This is typically possible because signal and noise are dominant in different frequency ranges. The corner frequency of the filter represents the boundary between the signal frequency range and the noise frequency range.
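The relation between power ratio, amplitude ratio and decibel can be made concrete with a small Python sketch. The helper names `mean_power` and `snr_db` are hypothetical and only serve this illustration.

```python
import math

def mean_power(samples):
    """Averaged power of a sampled signal: mean of the squared amplitudes."""
    return sum(x * x for x in samples) / len(samples)

def snr_db(p_signal, p_noise):
    """Signal-to-noise ratio in decibel: 10 * lg(P_signal / P_noise)."""
    return 10.0 * math.log10(p_signal / p_noise)

# A power ratio of 100 corresponds to 20 dB.
snr_100 = snr_db(100.0, 1.0)

# Scaling the amplitude by 1/sqrt(2) halves the power, i.e. about -3 dB.
signal = [math.sin(2.0 * math.pi * k / 50.0) for k in range(500)]
scaled = [x / math.sqrt(2.0) for x in signal]
drop_db = snr_db(mean_power(scaled), mean_power(signal))
```

Here `drop_db` is close to -3.01 dB, which is the same point on the frequency axis whether it is read as a power drop to 1/2 or as an amplitude drop to 1/√2.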
[Figure: signals u(t) over t for SNR = 100 (left) and SNR = 10 (right): unfiltered, filtered with PT1, f = 0.2, and filtered with PT1, f = 1]
Limit (Cut-off) Frequency
A frequency selective filter can only be useful if the desired signal and the disturbance are in different frequency ranges. Then it is possible to place the limit (cut-off) frequency $\omega_g$ in such a way that a significant part of the desired signal can pass while a significant part of the disturbance cannot.
[Figure: amplitude over frequency with desired signal (pass range) and disturbance (suppressed range)]
Filter Types
If, as in the above example, the desired signal lies mostly in the low-frequency range while the disturbance lies mostly in the high-frequency range, a low-pass filter can improve the signal quality a lot. A low-pass filter lets all low-frequency components pass but suppresses all high-frequency components. That is the most commonly used filter type. In many applications, however, the desired signal and disturbance are in other frequency ranges.
[Figure: amplitude responses of low-pass, high-pass, band-pass and band-stop filters]
Application Examples for Different Filter Characteristics
• Low-pass: Suppression of high-frequency noise to improve the quality of the signal.
• High-pass: Suppression of slowly changing signal components like offsets (frequency 0), trends or drifts.
• Band-pass: Extraction of a frequency band. Typical for radio or TV receivers. The signal is modulated on a high-frequency carrier and needs to be extracted from it before further processing.
• Band-stop: Suppression of certain (typically narrow) frequency ranges. Commonly applied to actuation signals in the aerospace industry to avoid damage due to an excitation of resonances (weakly damped modes). Also called a notch filter.
Ideal Filter
• Perfect output of the signal in the passband, i.e., $|G(i\omega)| = 1$.
• Perfect suppression of the signal in the stopband, i.e., $|G(i\omega)| = 0$.
• Infinitely steep transition from passband to stopband, i.e., steepness = ∞.
• No phase shift (no delay) of the signal, i.e., $\varphi(\omega) = 0$.
Real Filters
In the real world, these properties cannot be exactly realized. The demands for an ideal filter can never be achieved. Thus we relax the requirements and accept tolerances.
Specification of Real Low-pass Filters
• Pass-band: Gain close to 1, between $1-\delta_1$ and $1+\delta_1$.
• Stop-band: Gain smaller than $\delta_2$.
• Pass-band: $\omega < \omega_p$, transition-band: $\omega_p < \omega < \omega_s$, stop-band: $\omega > \omega_s$ (p = pass, s = stop).
• No requirements on the phase. Sometimes linear phase is demanded, see later.
Remarks:
• The closer $\omega_p$ and $\omega_s$ lie together and the smaller $\delta_1$ and $\delta_2$ are chosen, the more extreme are the requirements.
• More extreme requirements necessarily lead to more complex filters.
[Figure: tolerance scheme with pass-band, transition-band and stop-band]
Restriction to Filters With the Following Properties (see Chapter 8)
• Stable
• Linear (for nonlinear filters see Section 10.6)
• Causal (for non-causal filters, see Section 10.7)
• Time-invariant (for time-variant filters, see Section 10.8)
Furthermore it is sometimes desirable, particularly in communications:
• linear in its phase
This means that every oscillation is delayed by the filter by the same time, independent of the oscillation frequency. This property is important in acoustic environments (audio components) because the ears are very sensitive to frequency-dependent phase differences. If the linear phase property is not at least approximately fulfilled, low- and high-frequency sounds arrive at the ear at different times! This would disturb any acoustic sensation. In control engineering, systems with linear phase have a different name: they are called systems with a pure dead time and no other phase delay.
Property: Linear Phase
Mathematically, “linear phase” more precisely means that the phase is a linear function of the frequency:
$$\varphi(\omega) = -\omega T_t \quad \text{with a real } T_t$$
A filter u → G → y with such a transfer function responds to an input oscillation u(t) with amplitude $A_1$, frequency $\omega_1$ and phase $\varphi_1$, after transients have decayed, with the output:
$$y(t) = A_1\,|G(i\omega_1)|\,\sin(\omega_1 t + \varphi_1 - \omega_1 T_t)$$
Here $|G(i\omega_1)|$ is the amplitude gain and $-\omega_1 T_t$ is the phase shift. Because the phase shift is linear in the frequency, this can be written as:
$$y(t) = A_1\,|G(i\omega_1)|\,\sin(\omega_1 (t - T_t) + \varphi_1)$$
The phase $\varphi_1$ of the input signal u(t) is not changed by the filter; the signal is only shifted in time by the dead time $T_t$. And this is the case independent of the frequency $\omega_1$ of the signal.
The dead time $T_t$ is commonly also called group delay:
$$T_t = -\frac{d\varphi(\omega)}{d\omega}$$
Property: Linear Phase
• Can exactly only be achieved by FIR filters.
• For IIR filters the phase can only be approximately linear in a certain frequency band.
• Especially in the audio and communications fields, linear phase is a commonly requested property of great importance.
[Figure: linear phase plotted over a linearly scaled frequency axis and over a logarithmically scaled frequency axis]
Property: Zero Phase
A particularly simple special case of a filter with linear phase is a filter with zero phase, i.e., with a phase response = 0 for all frequencies. This is the case for transfer functions whose frequency response is purely real and non-negative. Such a transfer function F(z) can be constructed from an arbitrary transfer function G(z) with arbitrary phase as follows:
$$F(z) = G(z)\,G(z^{-1})$$
This leads to a purely real frequency response:
$$F(e^{i\omega T_0}) = G(e^{i\omega T_0})\,G(e^{-i\omega T_0}) = |G(e^{i\omega T_0})|^2$$
This means: F(z) has for every zero $z_n$ a mirrored zero at $z_n^{-1} = 1/z_n$ and for every pole $z_p$ a mirrored pole at $z_p^{-1} = 1/z_p$. If $z_n$ and $z_p$ are inside the unit circle (stable!), then $1/z_n$ and $1/z_p$ automatically are outside the unit circle (unstable!). Consequently, zero phase filters have the following properties:
• FIR: non-causal.
• IIR: unstable and non-causal.
Implementation of Zero Phase Filters
Because they are necessarily non-causal, zero phase filters can only be implemented offline. A simple possibility is to filter the data with a causal G(z) and subsequently filter the outcome in backward direction again with G(z). The phase delay induced by the first filter is exactly reversed by the second, backward filtering process.
[Figure: block diagram u(k) → G(z) → x(k) → time reverse → x(N–k) → G(z) → y(N–k) → time reverse → y(k), which is equivalent to u(k) → |G(z)|² → y(k); signal plots over k = 0 ... 50]
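The forward-backward procedure can be tried out with a minimal Python sketch. The causal filter G(z) used here, a first-order low-pass y(k) = a·y(k–1) + (1–a)·u(k), is an assumption chosen for illustration; the function names are hypothetical.

```python
def lowpass(u, a=0.5):
    """Causal first-order low-pass y(k) = a*y(k-1) + (1-a)*u(k) (assumed G(z))."""
    y, prev = [], 0.0
    for uk in u:
        prev = a * prev + (1.0 - a) * uk
        y.append(prev)
    return y

def zero_phase(u, a=0.5):
    """Filter forward, reverse in time, filter again, reverse back."""
    x = lowpass(u, a)              # forward pass with G(z)
    y_rev = lowpass(x[::-1], a)    # same G(z) applied to the time-reversed signal
    return y_rev[::-1]             # undo the time reversal

# Unit impulse in the middle of the record
N = 101
u = [0.0] * N
u[50] = 1.0

y_causal = lowpass(u)     # one-sided response: delayed, asymmetric
y_zero = zero_phase(u)    # two-sided response: symmetric around k = 50
```

The symmetric response around the impulse position shows that no net delay remains, at the price of needing the whole record in advance (offline processing).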
Equivalent descriptions of a linear dynamic system in discrete time:
Difference equation of order n
→ Can be implemented directly (m ≤ n). Usually m = n. If m < n we can assume m = n with $b_i = 0$ for i > m.
Properties:
• Order n is small: e.g. n = 2, 3, 4, ...
• Feedforward: $b_i u(k-i)$
• Feedback: $a_i y(k-i)$
• Infinite impulse response (IIR)
Impulse response
→ Cannot be implemented directly! Approximation with m+1 terms:
$$y(k) = \sum_{i=0}^{\infty} g_i\,u(k-i) \approx \sum_{i=0}^{m} b_i\,u(k-i)$$
Properties:
• Order m is large: m = 10, 20, 30, ...
• Feedforward: $b_i u(k-i)$
• No feedback!
• Finite impulse response (FIR)
Difference in the Order
• IIR filters usually have significantly fewer parameters ($a_i$ & $b_i$) than FIR filters ($b_i$).
• IIR filters need less memory for storage of previous data.
• IIR impulse response usually decays asymptotically and exponentially towards zero. FIR impulse response is exactly equal to zero for time steps k > m.
• IIR filters can be unstable. FIR filters are inherently stable (no feedback).
• IIR filters have an analog correspondence. FIR filters exist only in the digital world.
[Figure: impulse responses over k. IIR, n = 2: constrained flexibility (5 degrees of freedom). FIR, m = 15: arbitrary form possible (16 degrees of freedom), zero from k = 16 on.]
Transfer Functions
IIR Filter
• m zeros at arbitrary locations
• n poles at arbitrary locations
• $b_0 = 0$ for strictly proper systems
• Complex relationship between parameters and impulse response
• Not well suited for adaptation
– feedback structure
– stability problems
FIR Filter
• m zeros at arbitrary locations
• m poles at 0
• $b_0 = 0$ for strictly proper systems
• $b_i = g_i$ are the first m+1 steps of the impulse response, all subsequent ones = 0
• Well suited for adaptation
– feedforward structure
– inherent stability
Example:
A system with impulse response $g(k) = a^k$ can be exactly realized by an IIR filter of 1st order:
$$G(z) = \sum_{k=0}^{\infty} a^k z^{-k}$$
This infinite geometric series can exactly be written as:
$$G(z) = \frac{1}{1 - a z^{-1}} = \frac{z}{z - a}$$
The gain of this IIR filter is:
$$G(z = 1) = \frac{1}{1 - a}$$
The marginally stable case (integral behavior) is achieved for a = 1. Then the gain is no longer finite.
[Figure: impulse response over k for a = 0.8]
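As a quick plausibility check, the first-order recursion can be simulated in Python. The sketch assumes the difference equation y(k) = a·y(k–1) + u(k), which is the recursion whose impulse response is the geometric sequence a^k; the function name `impulse_response` is hypothetical.

```python
def impulse_response(a, n_steps):
    """Simulate y(k) = a*y(k-1) + u(k) for a unit impulse u(k)."""
    g, y_prev = [], 0.0
    for k in range(n_steps):
        u = 1.0 if k == 0 else 0.0
        y_prev = a * y_prev + u
        g.append(y_prev)
    return g

a = 0.8
g = impulse_response(a, 200)

gain_numeric = sum(g)            # gain = sum of the impulse response
gain_formula = 1.0 / (1.0 - a)   # closed form of the geometric series
```

A single feedback parameter thus produces an infinitely long impulse response whose sum matches the gain 1/(1 – a) = 5 for a = 0.8.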
An FIR filter can approximately represent every stable impulse response. For the first m+1 terms, an FIR filter of order m can exactly describe every sequence:
$$G(z) = \sum_{k=0}^{m} b_k z^{-k}$$
A natural choice for the filter parameters would be:
$$b_k = g(k) = a^k \quad \text{for } k = 0, 1, \ldots, m$$
However, this would yield a wrong (too small) gain because all summands for k > m are simply missing.
Alternatively, the last parameter (summand) $b_m$ can be adjusted in order to make the gain correct, i.e., for ω → 0 / z → 1:
$$\sum_{k=0}^{m} b_k = \frac{1}{1-a} \quad\Rightarrow\quad b_m = \frac{1}{1-a} - \sum_{k=0}^{m-1} a^k$$
This is a reasonable approach for low-pass filters. For high-pass filters an alternative could be to require identical gains for ω → ∞ / z → ∞.
[Figure: impulse response over k for a = 0.8 and m = 9; $b_m$ is chosen such that the gain stays correct]
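The effect of cropping and of the gain correction can be illustrated numerically. This Python sketch uses the values of the example (a = 0.8, m = 9) and takes 1/(1 – a) as the gain of the system with impulse response a^k; the variable names are hypothetical.

```python
a, m = 0.8, 9
target_gain = 1.0 / (1.0 - a)          # gain of the full (infinite) impulse response

# Natural choice: simply crop the impulse response after m+1 terms.
b_cropped = [a ** k for k in range(m + 1)]
gain_cropped = sum(b_cropped)          # too small, the tail is missing

# Alternative: adjust only the last coefficient so that the gain is correct.
b_corrected = b_cropped[:-1] + [target_gain - sum(b_cropped[:-1])]
gain_corrected = sum(b_corrected)
```

The corrected last coefficient collects the whole missing tail of the geometric series, so it is larger than the cropped value a^m.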
Properties of FIR filters:
• stable,
• can realize linear phase,
• very flexible because many degrees of freedom (parameters) → frequency response can
be shaped as desired,
• only forward path → simple to implement,
• easy to adapt.
Properties of IIR filters:

• can become unstable,
• no exactly linear phase possible (only approximately in a certain frequency band),
• with the help of a few parameters significant effects can be realized,
• high steepness even for low orders,
• feedback path → more complex to implement,
• complex to adapt (stability problems, not linear in its parameters).
General Remarks About FIR Filter Design
The design of digital filters commonly is based on the mature field of design of analog filters. Because FIR filters only exist in the digital world, no analog correspondence is available. New design methods had to be developed. The three standard approaches are:
• Window method: A simple approach that can be pursued by hand. The desired amplitude response is established. Then it is transformed by the inverse Fourier transform to the impulse response in the time domain. Since the impulse response is usually of infinite length, it must be cropped to a realizable filter order m. This causes an approximation error and thus the method is not very accurate.
• Frequency sampling method: This is a very universal approach and also possible for recursive filters. The desired frequency response is sampled and transformed with the inverse DFT to the impulse response.
• Optimal filter method: With support of a software tool this is the most powerful and flexible approach. A minimax optimization problem is solved via a Chebyshev approximation that minimizes the maximal deviation between the frequency response of the filter and the desired frequency response. This is carried out with the algorithm proposed by Parks and McClellan and implemented in the MATLAB Signal Processing Toolbox.
Example: A simple FIR filter of 1st order
For FIR filters the output is calculated as a weighted average of the current and previous inputs (moving average, MA). Even a first-order filter of this kind acts as a simple low-pass filter.
[Figure: Bode diagram (magnitude and phase over a logarithmic frequency axis) and noisy step response with u(k) and y(k) over k]
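A first-order moving-average filter can be sketched in Python as follows. The coefficients b = [0.5, 0.5], i.e. y(k) = 0.5·u(k) + 0.5·u(k–1), are an assumption for illustration and not necessarily the ones used in the script; the helper names are hypothetical.

```python
import cmath
import math

b = [0.5, 0.5]   # assumed coefficients of a first-order FIR low-pass

def fir_filter(b, u):
    """y(k) = sum_i b[i] * u(k-i), with u(k) = 0 for k < 0."""
    y = []
    for k in range(len(u)):
        y.append(sum(b[i] * u[k - i] for i in range(len(b)) if k - i >= 0))
    return y

def freq_response(b, omega):
    """G(e^{i*omega}) for sampling time T0 = 1."""
    return sum(bi * cmath.exp(-1j * omega * i) for i, bi in enumerate(b))

step = fir_filter(b, [1.0, 1.0, 1.0])          # step response
gain_dc = abs(freq_response(b, 0.0))           # low frequencies pass
gain_nyquist = abs(freq_response(b, math.pi))  # highest frequency is blocked
```

With these coefficients the gain falls from 1 at frequency 0 to 0 at half the sampling frequency, which is the low-pass behavior described above.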
Remarks on FIR Filters With Linear Phase
FIR filters have to fulfill certain conditions in order to have a linear (or affine) phase:
• Linear phase, i.e., $\varphi(\omega) = -\omega T_t$: symmetrical impulse response.
• Affine phase, i.e., $\varphi(\omega) = \beta - \omega T_t$: centrosymmetrical impulse response.
Remember:
Addition of two conjugate complex numbers:
$$z_1 + z_2 = (a + ib) + (a - ib) = 2a$$
→ purely real!
Same numbers written in absolute value and phase form:
$$r e^{i\varphi} + r e^{-i\varphi} = 2r\cos\varphi$$
→ purely real!
→ Sum of two conjugate complex numbers is purely real!
[Figure: complex plane with $z_1 = a + ib$ and $z_2 = a - ib$]
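Example: Symmetrical FIR Filter of Length L = 9 (Order m = 8)
Transfer function of the filter:
$$G(z) = b_0 + b_1 z^{-1} + b_2 z^{-2} + \ldots + b_8 z^{-8}$$
Because of symmetry we have:
$$b_0 = b_8,\quad b_1 = b_7,\quad b_2 = b_6,\quad b_3 = b_5$$
Factoring $z^{-4}$ out yields:
$$G(z) = z^{-4}\left\{ b_4 + b_3 (z + z^{-1}) + b_2 (z^2 + z^{-2}) + b_1 (z^3 + z^{-3}) + b_0 (z^4 + z^{-4}) \right\}$$
The frequency response is obtained for $z = e^{i\omega T_0}$. Expressions of the form
$$e^{i n \omega T_0} + e^{-i n \omega T_0} = 2\cos(n \omega T_0)$$
are purely real and therefore have phase = 0. Thus the phase of this filter finally is:
$$\varphi(\omega) = -4\,\omega T_0$$
(+ π if the sign of “{...}” is negative!)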
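The linear-phase property of a symmetrical FIR filter can be verified numerically. In this Python sketch the coefficients are hypothetical example values (any symmetrical set of 9 values would do), and the sampling time is assumed to be T0 = 1.

```python
import cmath

# Hypothetical symmetrical coefficients b0 ... b8 (b_i = b_{8-i})
b = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0]

def freq_response(b, omega):
    """G(e^{i*omega}) for sampling time T0 = 1."""
    return sum(bi * cmath.exp(-1j * omega * i) for i, bi in enumerate(b))

# Removing the factor e^{-i*4*omega} must leave a purely real expression,
# which means the phase is -4*omega (plus pi where that expression is negative).
max_imag = 0.0
for j in range(1, 32):
    omega = 0.1 * j
    braces = freq_response(b, omega) * cmath.exp(1j * 4.0 * omega)
    max_imag = max(max_imag, abs(braces.imag))
```

The imaginary part stays at rounding-error level for all tested frequencies, so the only phase contribution is the delay of 4 sampling steps.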
The 4 Types of FIR Filters With Linear Phase
• Type 1: Symmetry, odd length (L = 9)
• Type 2: Symmetry, even length (L = 8)
• Type 3: Centrosymmetry, odd length (L = 9)
• Type 4: Centrosymmetry, even length (L = 8)
Symmetry:
• $b_i = b_{m-i}$
• Phase is linear, i.e. $\varphi(\omega) = -\omega T_t$
Centrosymmetry:
• $b_i = -b_{m-i}$
• Phase is affine, i.e. $\varphi(\omega) = \beta - \omega T_t$ with $\beta = \pi/2$
[Figure: impulse responses over k for the four types]
Window Method
The idea behind this approach is to design a filter that has a desired frequency response $G_D(i\omega)$ (D = desired). Subsequently the impulse response g(k) can be calculated via the inverse Fourier transform as follows:
$$g(k) = \frac{T_0}{2\pi} \int_{-\pi/T_0}^{\pi/T_0} G_D(i\omega)\, e^{i\omega k T_0}\, d\omega$$
This impulse response typically is non-causal and of infinite length. We have to shift it and crop it at a certain finite order m to make the FIR filter realizable.
Example: Low-pass with cut-off frequency $\omega_e$ (sampling time T0 = 1 s)
[Figure: impulse response g(k) over k from -30 to 30]
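The shift-and-crop step can be sketched in Python for an ideal low-pass, whose impulse response is the well-known sinc sequence g(k) = sin(ω_e·k)/(π·k) with g(0) = ω_e/π. The cut-off frequency ω_e = 0.2π and the order m = 60 are assumed example values; the function names are hypothetical.

```python
import math

def ideal_lowpass(k, omega_e):
    """Impulse response of the ideal low-pass (T0 = 1), non-causal, infinite."""
    if k == 0:
        return omega_e / math.pi
    return math.sin(omega_e * k) / (math.pi * k)

def cropped_lowpass(m, omega_e):
    """Shift by m/2 steps and crop to m+1 coefficients (m even)."""
    half = m // 2
    return [ideal_lowpass(k - half, omega_e) for k in range(m + 1)]

omega_e = 0.2 * math.pi           # assumed cut-off frequency
b = cropped_lowpass(60, omega_e)  # FIR filter with 61 coefficients

gain_dc = sum(b)                  # close to 1, but not exact because of cropping
```

The coefficients stay symmetrical after cropping, so the resulting FIR filter has linear phase; the deviation of `gain_dc` from 1 is a first sign of the approximation error caused by cropping.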
Approximation Error Through Cropping the Impulse Response
[Figure: amplitude responses for ∞, 61, 31 and 11 coefficients; the cropped responses show unwanted ripple (Gibbs phenomenon)]
Consequences From Unwanted Behavior of FIR Filters
The ripple in the amplitude response of the FIR filter can easily be explained. The cropping of the impulse response g(k) is identical to a windowing with a uniform/rectangular window w(k). In the frequency domain this corresponds to a convolution with the Fourier transform of the rectangular window $W(i\omega)$, the sinc function:
$$G(i\omega) = G_D(i\omega) * W(i\omega)$$
This explains the ripples. Unfortunately their maximum height does not become smaller if more coefficients are spent to describe the impulse response more accurately. This is the so-called Gibbs phenomenon (see math, “Fourier series”).
In order to reduce this undesirable effect, the impulse response is multiplied with a smoother window like in the DFT context. Such a window can reduce high frequencies by letting the impulse response slowly decay towards zero. For FIR filter design the so-called Kaiser window is commonly applied.
Optimal Filter Design Method
With most optimization methods the quadratic error $|E(i\omega)|^2$ between the desired filter characteristics $H_D(i\omega)$ and the real filter characteristics $H(i\omega)$ is minimized:
$$\min \int |E(i\omega)|^2\, d\omega \quad \text{with} \quad E(i\omega) = H_D(i\omega) - H(i\omega)$$
However, the algorithm according to Parks and McClellan minimizes the maximal (not squared) error because it has yielded more reliable results:
$$\min \max_{\omega} |E(i\omega)|$$
The minimization of the maximal absolute value ensures that the ripples are equally distributed over all frequencies, which led to the name equiripple filter. The criterion is also important in many other approaches to robust optimization and control.
Because the absolute value of the error is magnitudes larger in the pass-band than in the stop-band, it is important to multiply the errors with a normalization weight that guarantees no frequency ranges are preferred:
$$\min \max_{\omega} |W(\omega)\, E(i\omega)|$$
To achieve a filter with equally large (small) weighted ripples in the pass- and stop-band, the following frequency weight must be chosen for a low-pass filter:
$$W(\omega) = \begin{cases} 1/\delta_1 & \text{pass-band} \\ 1/\delta_2 & \text{stop-band} \end{cases} \quad \text{or} \quad W(\omega) = \begin{cases} \delta_2/\delta_1 & \text{pass-band} \\ 1 & \text{stop-band} \end{cases}$$
MATLAB offers the Parks/McClellan minimax algorithm and least-squares optimization tools:
The default mode of operation of firls and firpm is to design type I or type II linear phase filters, depending on whether the order you desire is even or odd, respectively. A lowpass example with approximate amplitude 1 from 0 to 0.4 Hz, and approximate amplitude 0 from 0.5 to 1.0 Hz is
n = 20;             % Filter order
f = [0 0.4 0.5 1];  % Frequency band edges
a = [1 1 0 0];      % Desired amplitudes
b = firpm(n,f,a);   % Parks-McClellan FIR design
From 0.4 to 0.5 Hz, firpm performs no error minimization; this is a transition band or "don't care" region. A transition band minimizes the error more in the bands that you do care about, at the expense of a slower transition rate. In this way, these types of filters have an inherent trade-off similar to FIR design by windowing. To compare least squares to equiripple filter design, use firls to create a similar filter.
10.3 Design of FIR Filters (MATLAB)
Type
bb = firls(n,f,a);
and compare their frequency responses using FVTool:
fvtool(b,1,bb,1)
Note that the y-axis shown in the figure below is in Magnitude Squared. You can set this by right-clicking on the axis label and selecting Magnitude Squared from the menu.
The filter designed with firpm exhibits equiripple behavior. Also note that the firls filter has a better response over most of the passband and stopband, but at the band edges (f = 0.4 and f = 0.5), the response is further away from the ideal than the firpm filter. This shows that the firpm filter's maximum error over the passband and stopband is smaller and, in fact, it is the smallest possible for this band edge configuration and filter length.
Removing Periodic Signals of Known Frequency (High-pass Approach)
With an FIR filter an arbitrary periodic signal of known frequency can be removed perfectly.
Typical applications:
• Carrier frequency of a radio signal
• Hum (50 Hz and its multiples as higher harmonics)
The following FIR filter removes all periodic signals with period length $T_p = m \cdot T_0$ or frequency $\omega_p = \omega_0 / m$, respectively:
$$y(k) = u(k) - u(k-m) \quad\Leftrightarrow\quad G(z) = 1 - z^{-m}$$
[Figure: u(k) → G(z) → y(k) for m = 7; after the transient the periodic signal is suppressed perfectly]
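The perfect suppression after the transient can be reproduced with a few lines of Python. The sketch assumes the filter form y(k) = u(k) – u(k–m), which subtracts the signal value one period earlier; the periodic test pattern and the function name `comb_highpass` are hypothetical.

```python
def comb_highpass(u, m):
    """y(k) = u(k) - u(k-m), with u(k) = 0 for k < 0 (assumed filter form)."""
    return [u[k] - (u[k - m] if k >= m else 0.0) for k in range(len(u))]

# Arbitrarily shaped periodic signal with period length m = 7 samples
pattern = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]
m = len(pattern)
u = [pattern[k % m] for k in range(50)]

y = comb_highpass(u, m)

# A constant signal (d.c. value) is removed as well: high-pass behavior.
y_dc = comb_highpass([1.0] * 20, m)
```

After the first m samples the output is exactly zero, independent of the shape of the periodic signal; only the period length matters.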
The special properties of such a filter are:
• Independence of the shape of the signal (depends only on the period length).
• Removes all multiples of $\omega_p$ perfectly.
• Perfect damping with –∞ dB (infinite steepness!).
• High-pass! Removes all low frequencies (and d.c. values) as well.
[Figure: frequency response (magnitude and phase) over the normalized frequency from 0 to 0.5, and noisy step response with u(k) and y(k) over k]
Removing Periodic Signals of Known Frequency (Low-pass Approach)
If the same task as before is to be solved with a low-pass filter instead of a high-pass, the following two possibilities with gain = 1 suggest themselves:
First possibility:
• Positive and negative half waves have to be symmetrical in order to cancel each other.
• m has to be even.
• Little distortion for other frequencies.
• Removes only multiples of $2\omega_p$.
Second possibility:
• Positive and negative half waves must accumulate to zero.
• Strong averaging (low-pass effect).
• Removes only multiples of $\omega_p$.
[Figure: u(k) over k for both possibilities]
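Both low-pass ideas can be tried out in Python. The two filter forms used here are assumptions derived from the descriptions above, not formulas taken from the script: the first adds the value half a period earlier, y(k) = 0.5·[u(k) + u(k–m/2)], the second averages over one full period, y(k) = (1/m)·Σ u(k–i) for i = 0 ... m–1. The function names are hypothetical.

```python
import math

def half_period_filter(u, m):
    """Assumed form y(k) = 0.5*(u(k) + u(k - m/2)); m has to be even."""
    h = m // 2
    return [0.5 * (u[k] + (u[k - h] if k >= h else 0.0)) for k in range(len(u))]

def moving_average(u, m):
    """Assumed form y(k) = (1/m) * sum of the last m inputs."""
    return [sum(u[k - i] for i in range(m) if k - i >= 0) / m
            for k in range(len(u))]

m = 8
u = [math.sin(2.0 * math.pi * k / m) for k in range(48)]   # period of m samples

y1 = half_period_filter(u, m)   # half waves cancel each other
y2 = moving_average(u, m)       # half waves accumulate to zero

# Both filters have gain 1 for constant signals (low-pass behavior).
dc1 = half_period_filter([1.0] * 20, m)[-1]
dc2 = moving_average([1.0] * 20, m)[-1]
```

After the transient both outputs vanish for the sinusoid with period m, while a constant input passes unchanged.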
[Figure: Bode diagrams (magnitude and phase over a logarithmic frequency axis) and noisy step responses with u(k) and y(k) over k for the two low-pass approaches]
Example: A Simple IIR Filter of 1st Order
For IIR filters the output is a weighted average of the current and previous inputs (moving average, MA) and previous outputs (autoregressive, AR) → ARMA. The simplest first-order IIR filter is a PT1 system.
In comparison to FIR filters, here implicitly infinitely old inputs u(k–i) influence the output!
[Figure: Bode diagram (magnitude and phase over a logarithmic frequency axis) and noisy step response with u(k) and y(k) over k]
Transformation from Analog to Digital
Typically IIR filters are designed with traditional methods in the analog world. In a second step they are transformed from the analog to the digital world. For this transformation various approaches are common, dependent on the application area:
• Impulse invariance method: Demand identical impulse responses in the analog and digital world.
• Bilinear transformation (also called: Tustin formula): The s-variable in the analog frequency domain is approximated by a rational fractional function in z such that a numerator / denominator expression in the s-domain becomes a numerator / denominator expression in the z-domain (and vice versa).
Furthermore there exist other methods that are more popular in digital control than in filter design:
• Identical time signals with zero or first order hold.
• Identical poles and zeros.
In the following, we focus on the bilinear transformation approach.
Bilinear Transformation (Tustin Formula)
The exact transformation between s and z,
$$z = e^{sT_0} \quad\Leftrightarrow\quad s = \frac{1}{T_0}\ln z,$$
is nonlinear and would destroy the fractional rational function form. Linear system theory would not apply anymore.
Via the bilinear transformation
$$s \approx \frac{2}{T_0}\,\frac{z-1}{z+1}$$
this form is preserved. The stability properties stay identical as well.
[Figure: mapping from the s-plane to the z-plane (axes Re and Im); marked is the maximum possible frequency before aliasing starts]
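That the bilinear transformation keeps the stability properties can be checked numerically. The Python sketch below uses the Tustin formula s = (2/T0)·(z–1)/(z+1) solved for z, i.e. z = (1 + s·T0/2)/(1 – s·T0/2); the sampling time and the test points in the s-plane are hypothetical example values.

```python
def tustin_s_to_z(s, t0):
    """Bilinear map z = (1 + s*T0/2) / (1 - s*T0/2)."""
    return (1.0 + s * t0 / 2.0) / (1.0 - s * t0 / 2.0)

T0 = 0.1   # assumed sampling time

z_stable = tustin_s_to_z(-2.0 + 5.0j, T0)    # pole in the left half-plane
z_axis = tustin_s_to_z(40.0j, T0)            # point on the imaginary axis
z_unstable = tustin_s_to_z(0.5 + 1.0j, T0)   # pole in the right half-plane
```

The left half of the s-plane lands inside the unit circle, the imaginary axis on the unit circle, and the right half outside of it.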
Comparison: Frequency Response in the Analog and Digital World
A transfer function $G_a(s)$ in the s-domain can approximately be transformed by the bilinear transformation into the z-domain:
$$G(z) = G_a(s)\Big|_{s = \frac{2}{T_0}\frac{z-1}{z+1}}$$
The frequency response in the analog world is obtained for $s = i\omega_a$ and correspondingly in the digital world by going along the unit circle in the z-domain, $z = e^{i\omega_d T_0}$. Since the bilinear transformation is just an approximation, the analog frequency $\omega_a$ differs from the digital frequency $\omega_d$:
$$\omega_a = \frac{2}{T_0}\tan\left(\frac{\omega_d T_0}{2}\right)$$
The upper bound for the digital frequency is given by half the sampling frequency according to Shannon:
$$\omega_d < \frac{\pi}{T_0}$$
[Figure: tangent relation between the analog and the digital frequency]
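The frequency distortion of the bilinear transformation follows from inserting z = e^{iω_d·T0} into s = (2/T0)·(z–1)/(z+1), which gives ω_a = (2/T0)·tan(ω_d·T0/2). A minimal Python sketch; the names ω_a, ω_d and `analog_frequency` are labels chosen here for illustration.

```python
import math

def analog_frequency(omega_d, t0):
    """Analog frequency that the bilinear transformation assigns to omega_d."""
    return 2.0 / t0 * math.tan(omega_d * t0 / 2.0)

T0 = 1.0
low = analog_frequency(0.01, T0)              # almost no distortion
mid = analog_frequency(math.pi / 2.0, T0)     # 2*tan(pi/4) = 2 instead of 1.57
high = analog_frequency(0.99 * math.pi, T0)   # grows without bound near pi/T0
```

For low frequencies both axes agree, while the whole analog frequency axis up to infinity is compressed into the digital range below π/T0. This is why cut-off frequencies are usually converted with this formula before the analog design.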
Bilinear Transformation (Tustin Formula) = Trapezoidal Rule for Integration
In discrete time a continuous integration can be approximated in different ways. More accurate than calculating the lower or upper sum (see next slide) is the trapezoidal rule (width × mean height):
$$y(k) = y(k-1) + T_0 \cdot \frac{u(k) + u(k-1)}{2}$$
In the z-domain this results in:
$$\frac{Y(z)}{U(z)} = \frac{T_0}{2}\,\frac{z+1}{z-1}$$
This formula shall correspond to an integration in the s-domain:
$$\frac{Y(s)}{U(s)} = \frac{1}{s}$$
This exactly yields the bilinear transformation:
$$s = \frac{2}{T_0}\,\frac{z-1}{z+1}$$
[Figure: u(t) over t with the trapezoid between (k–1)T0 and kT0]
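The accuracy difference between the three integration rules can be seen in a small Python experiment that integrates the ramp u(t) = t from 0 to 1 (exact result 0.5). The step size and the function names are hypothetical example choices.

```python
def integrate(u, t0, rule):
    """Discrete integration y(k) = y(k-1) + T0 * height(k)."""
    y = 0.0
    for k in range(1, len(u)):
        if rule == "lower":          # uses the old sample u(k-1)
            height = u[k - 1]
        elif rule == "upper":        # uses the new sample u(k)
            height = u[k]
        else:                        # trapezoidal: mean of both samples
            height = 0.5 * (u[k] + u[k - 1])
        y += t0 * height
    return y

T0 = 0.1
u = [k * T0 for k in range(11)]      # samples of u(t) = t on [0, 1]

y_lower = integrate(u, T0, "lower")
y_upper = integrate(u, T0, "upper")
y_trapez = integrate(u, T0, "trapezoidal")
```

For this increasing signal the lower sum underestimates and the upper sum overestimates the integral by T0/2 each, while the trapezoidal rule is exact for a ramp.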
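Integration with Lower and Upper Sum
• Lower sum: $y(k) = y(k-1) + T_0\,u(k-1)$
• Upper sum: $y(k) = y(k-1) + T_0\,u(k)$
[Figure: u(t) over t with the rectangles between (k–1)T0, kT0 and (k+1)T0 for both sums]
Differentiation with Forward and Backward Differences
• Forward difference: $\dot u(kT_0) \approx \dfrac{u(k+1) - u(k)}{T_0}$
• Backward difference: $\dot u(kT_0) \approx \dfrac{u(k) - u(k-1)}{T_0}$
[Figure: u(t) over t with the secants between (k–1)T0, kT0 and (k+1)T0 for both differences]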
10.4 Design of IIR Filters
Comparison
• Bilinear Transformation (Trapezoidal Integration): the stability area is mapped correctly.
• Forward Differences (Lower Sum Integration): the stability area is mapped too large.
• Backward Differences (Upper Sum Integration): the stability area is mapped too small.
[Figure: stability areas in the s-plane and their images in the z-plane for the three methods]
Procedure for Filter Design Via Bilinear Transformation
1. Specification is either directly made in the analog world or it is transformed from the digital to the analog world.
2. Filter design in the analog world.
3. Transformation of the final analog filter to the digital world.
[Photo: Friedrich Bessel, 1784-1846 (www.wikepedia.org)]
The Following IIR Filters are Common:
• Bessel filter: Approximately linear phase in the pass-band
• Butterworth filter: Monotone amplitude response
• Chebyshev filter type 1: Ripple in the pass-band
• Chebyshev filter type 2: Ripple in the stop-band
• Cauer filter (elliptic filter): Ripple in the pass- and stop-band
For the steepness of filters of identical orders (i.e., comparable complexity):
Bessel < Butterworth < Chebyshev < Cauer
Overview on Analog Filters
[Figure: amplitude responses (0 to 1) over normalized frequency (0 to 1) of Butterworth, Chebyshev Type I, Chebyshev Type II and Cauer (elliptic) filters]
University

Oliver Nelles
of Siegen
Butterworth Filter
[Photo: Stephen Butterworth, 1885-1958 (www.wikipedia.org)]
• Design with focus on maximal flatness of the amplitude response close to the limit
frequency $\omega_g$.
• Monotone amplitude response, i.e., no ripples.
• Fast drop-off in the amplitude response at the limit frequency.
• Strong overshoot of the impulse response.
• Relatively low steepness with 20·n dB / decade (n = filter order).
Amplitude Response:
$$|G(i\omega)| = \frac{1}{\sqrt{1 + (\omega/\omega_g)^{2n}}}$$
Transfer function:
$$G(s) = \frac{\omega_g^n}{\prod_{i=1}^{n} (s - s_i)}$$
where $s_i$ are the n stable poles among the 2n-th roots of $(-1)^{n+1}\,\omega_g^{2n}$.
[Figure: pole locations (×) in the s-plane for n = 3 and n = 4, arranged on a circle: poles of G(s) in the left half-plane, poles of G(–s) in the right half-plane]
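The steepness of 20·n dB per decade can be verified numerically. The sketch below assumes the standard Butterworth amplitude response |G(iω)| = 1/√(1 + (ω/ωg)^(2n)); the function names are chosen for illustration only.

```python
import math


def butterworth_gain(omega, n, omega_g=1.0):
    """Amplitude response of an analog Butterworth low-pass of order n."""
    return 1.0 / math.sqrt(1.0 + (omega / omega_g) ** (2 * n))


def gain_db(omega, n, omega_g=1.0):
    """Amplitude response in dB."""
    return 20.0 * math.log10(butterworth_gain(omega, n, omega_g))


for n in (2, 4, 6):
    # Slope over one decade, far above the limit frequency omega_g = 1 rad/s
    slope = gain_db(100.0, n) - gain_db(10.0, n)
    print(f"n = {n}: gain at omega_g = {gain_db(1.0, n):.2f} dB, "
          f"slope = {slope:.2f} dB/decade")
```

Independent of the order, the gain at the limit frequency is 1/√2 (about –3 dB); only the slope beyond it changes with n.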
Butterworth Filter
[Figure: Bode diagram (magnitude in dB and phase in ° over a logarithmic frequency axis from 10^-2 to 10^2) for n = 2, 4, 6]
[Figure: step responses over time in s (0 to 50) for n = 2, 4, 6]
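Chebyshev Filter
[Photo: Pafnuti Chebyshev, 1821-1894]
• Steeper than the Butterworth filter.
• Ripples in the pass-band (type I) or stop-band (type II) of the amplitude response.
The ripple drawback is accepted for the benefit of higher steepness.
• Step response oscillates more than for the Butterworth filter.
• Turns into a Butterworth filter if the allowed ripple factor ε → 0!
• Design parameters: limit frequency $\omega_g$, order n, allowed ripple factor ε.
Amplitude response (type I), with ε as ripple factor:
$$|G(i\omega)| = \frac{1}{\sqrt{1 + \varepsilon^2\, T_n^2(\omega/\omega_g)}}$$
Chebyshev Polynomial of Order n:
$$T_n(x) = \begin{cases} \cos(n \arccos x) & |x| \le 1 \\ \cosh(n\, \mathrm{arcosh}\, x) & |x| > 1 \end{cases}$$
Because the squared Chebyshev polynomial varies in the pass-band between 0 and 1, a lower limit
on the gain is given by:
$$|G(i\omega)| \ge \frac{1}{\sqrt{1 + \varepsilon^2}}$$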
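The lower limit on the pass-band gain can be checked with a short sketch. It assumes the standard type I amplitude response |G(iω)| = 1/√(1 + ε²·Tn²(ω/ωg)) and evaluates it for ω ≥ 0 only; the function names and the values n = 4, ε = 0.5 are illustrative assumptions.

```python
import math


def chebyshev_poly(n, x):
    """Chebyshev polynomial T_n(x) for x >= 0."""
    if x <= 1.0:
        return math.cos(n * math.acos(x))
    return math.cosh(n * math.acosh(x))


def cheby1_gain(omega, n, eps, omega_g=1.0):
    """Amplitude response of an analog Chebyshev type I low-pass."""
    t = chebyshev_poly(n, omega / omega_g)
    return 1.0 / math.sqrt(1.0 + (eps * t) ** 2)


n, eps = 4, 0.5
lower_limit = 1.0 / math.sqrt(1.0 + eps ** 2)

# Sample the pass-band 0 <= omega <= omega_g = 1 rad/s
passband = [cheby1_gain(k / 100.0, n, eps) for k in range(101)]
print(f"pass-band gain between {min(passband):.4f} and {max(passband):.4f}, "
      f"lower limit {lower_limit:.4f}")
```

Inside the pass-band the gain oscillates between the lower limit and 1; beyond the limit frequency the cosh branch makes the gain drop monotonically.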
Source: https://en.wikipedia.org/wiki/Chebyshev_filter
Chebyshev Filter Type I
[Figure: Bode diagram (magnitude in dB and phase in ° over a logarithmic frequency axis from 10^-2 to 10^2) for n = 2, 4, 6]
[Figure: step responses over time in s (0 to 50) for n = 2, 4, 6]
Chebyshev Filter Type I (n = 4)
[Figure: Bode diagram (magnitude in dB and phase in deg over frequency in rad/s from 10^-1 to 10^1)]
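Cauer Filter (Elliptic Filter)
[Photo: Wilhelm Cauer, 1900-1945]
• Steeper than the Chebyshev filter, even the steepest possible (for linear filters).
• Ripples in the pass-band and stop-band of the amplitude response.
• Step response oscillates more strongly than for the Chebyshev filter.
• Turns into a Chebyshev filter of type I if the steepness factor ξ → ∞!
• Design parameters: limit frequency $\omega_g$, order n, ripple ε and steepness ξ.
Amplitude response:
$$|G(i\omega)| = \frac{1}{\sqrt{1 + \varepsilon^2\, R_n^2(\xi, \omega/\omega_g)}}$$
ε: ripple in the pass-band
ξ: steepness (selectivity factor)
Elliptic functions of order n:
$$R_n(\xi, x) = r_0 \frac{\prod_{i=1}^{n} (x - x_{ni})}{\prod_{i=1}^{n} (x - x_{pi})} \quad \text{for even } n$$
$$R_n(\xi, x) = r_0\, x\, \frac{\prod_{i=1}^{n-1} (x - x_{ni})}{\prod_{i=1}^{n-1} (x - x_{pi})} \quad \text{for odd } n$$
$x_{ni}$: zeros and $x_{pi}$: poles, calculated according to a complex formula in dependence on ξ.
Maximal steepness at x = 1.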
Source: https://en.wikipedia.org/wiki/Elliptic_filter
Cauer Filter
[Figure: Bode diagram (magnitude in dB and phase in ° over a logarithmic frequency axis from 10^-2 to 10^2) for n = 2, 4, 6]
[Figure: step responses over time in s (0 to 50) for n = 2, 4, 6]
10.4 Design of IIR Filters: MATLAB Examples
% For data sampled at 1000 Hz, design a 9th-order highpass
% Butterworth filter with cutoff frequency of 300Hz.
Wn = 300/500; % Normalized cutoff frequency (500 Hz = Shannon frequency)
[z,p,k] = butter(9,Wn,'high'); % Butterworth filter
[sos] = zp2sos(z,p,k); % Convert to SOS form
h = fvtool(sos); % Plot magnitude response
% Design a 4th-order Butterworth band-pass filter which passes
% frequencies between 0.15 and 0.3.
[b,a] = butter(2,[.15,.3]); % Bandpass digital filter design (order 2n = 4)
h = fvtool(b,a); % Visualize filter
% For data sampled at 1000 Hz, design a 9th-order lowpass Chebyshev
% Type I filter with 5 dB of ripple in the passband, and a passband
% edge frequency of 300Hz.
Wn = 300/500; % Normalized passband edge frequency
[z,p,k] = cheby1(9,5,Wn); % Chebyshev type I filter
[sos] = zp2sos(z,p,k); % Convert to SOS form
h = fvtool(sos); % Plot magnitude response
% Design a 4th-order Chebyshev Type I band-pass filter which passes
% frequencies between 0.2 and 0.5 with 3 dB of ripple in the
% passband.
[b,a] = cheby1(2,3,[.2,.5]); % Bandpass digital filter design (order 2n = 4)
% For data sampled at 1000 Hz, design a ninth-order lowpass
% Chebyshev Type II filter with stopband attenuation 40 dB down from
% the passband and a stopband edge frequency of 300Hz.
Wn = 300/500; % Normalized stopband edge frequency
[z,p,k] = cheby2(9,40,Wn);
% Design a 12th-order Chebyshev Type II band-pass filter which passes
% frequencies between 0.2 and 0.5 and with stopband attenuation 80 dB
% down from the passband.
[b,a] = cheby2(6,80,[.2,.5]); % Bandpass digital filter design (order 2n = 12)
% For data sampled at 1000 Hz, design a sixth-order lowpass
% elliptic filter with a passband edge frequency of 300Hz, 3 dB of
% ripple in the passband, and 50 dB of attenuation in the stopband.
Wn = 300/500; % Normalized passband edge frequency
[z,p,k] = ellip(6,3,50,Wn);
% Design a 12th-order elliptic band-pass filter which passes
% frequencies between 0.2 and 0.5, and with 5 dB of ripple in the
% passband, and 80 dB of attenuation in the stopband.
[b,a] = ellip(6,5,80,[.2,.5]); % Bandpass digital filter design (order 2n = 12)
Normalization and Transformation
Up to here we have focused on low-pass filters. With the help of simple transformations,
this knowledge can be carried over to all kinds of filters.
Starting point is the design of a low-pass filter with normalized limit frequency $\omega_g$ = 1 rad/s.
All other filters can be easily derived from it by substituting s:
Low-pass with limit frequency $\omega_g$: $s \to \dfrac{s}{\omega_g}$
High-pass with limit frequency $\omega_g$: $s \to \dfrac{\omega_g}{s}$
Band-pass with limit frequencies $\omega_{g1}$ and $\omega_{g2}$: $s \to \dfrac{s^2 + \omega_{g1}\omega_{g2}}{s\,(\omega_{g2} - \omega_{g1})}$
Band-stop with limit frequencies $\omega_{g1}$ and $\omega_{g2}$: $s \to \dfrac{s\,(\omega_{g2} - \omega_{g1})}{s^2 + \omega_{g1}\omega_{g2}}$
Block Diagram of Digital Filters
Delay of one sampling time step: block $z^{-1}$ (symbolic representation!)
WARNING: Formally such a block diagram is wrong because it mixes time and frequency
domain. However, such a sloppy representation is commonly found and easy to read. More
strictly, the following time delay is meant:
$$y(k) = u(k-1)$$
Multiplication with a factor: $y(k) = b \cdot u(k)$
Addition: $y(k) = u_1(k) + u_2(k)$
Subtraction: $y(k) = u_1(k) - u_2(k)$
FIR Filter
• m memory elements
• m+1 multiplications and m additions
• No feedback
• For a symmetrical filter with $b_i = b_{m-i}$ or $b_i = -b_{m-i}$,
half of the multiplications can be saved by first adding u(k) with u(k–m),
u(k–1) with u(k–m+1), etc.
[Figure: tapped delay line]
Efficient Realization of a Tapped Delay Line in Software
Example for m = 3:
The pointer moves up one memory block in each time step. When it moves out at the
top it jumps back to the bottom. This can be implemented with the modulo operator:
adr := (adr + 1) mod m. In each time step only one memory block has to be overwritten
instead of moving all of them one step further!
[Figure: successive snapshots of the memory blocks 1. to 4., with the pointer advancing by one block per time step and wrapping around at the top]
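The pointer trick can be sketched as a small ring buffer. This is a minimal illustration, not the script's implementation: the class `TappedDelayLine`, its methods and the helper `fir_filter` are hypothetical names, and the buffer here stores the current sample plus the m delayed ones (m+1 blocks in total), so the modulo is taken over the buffer length.

```python
class TappedDelayLine:
    """Ring buffer: one write per time step instead of shifting all blocks."""

    def __init__(self, size):
        self.buf = [0.0] * size
        self.adr = 0                      # pointer to the newest sample

    def push(self, u):
        # Advance the pointer and wrap around with the modulo operator
        self.adr = (self.adr + 1) % len(self.buf)
        self.buf[self.adr] = u            # overwrite only one memory block

    def tap(self, i):
        """Return u(k-i) for 0 <= i < size."""
        return self.buf[(self.adr - i) % len(self.buf)]


def fir_filter(b, u_seq):
    """FIR filter y(k) = b0 u(k) + ... + bm u(k-m) on top of the ring buffer."""
    line = TappedDelayLine(len(b))
    y_seq = []
    for u in u_seq:
        line.push(u)
        y_seq.append(sum(b[i] * line.tap(i) for i in range(len(b))))
    return y_seq


# The impulse response of an FIR filter reproduces its coefficients
print(fir_filter([1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 0.0]))
```

The cost of storing a new sample stays constant regardless of the filter length; only the index arithmetic in `tap` changes.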
IIR Filter
An IIR filter of order n can be written as
$$G(z) = \frac{Y(z)}{U(z)} = \frac{b_0 + b_1 z^{-1} + \dots + b_n z^{-n}}{1 + a_1 z^{-1} + \dots + a_n z^{-n}} = \frac{B(z)}{A(z)}$$
If the order of the numerator is smaller than the order of the denominator (m < n), then simply
the lacking $b_i$ = 0 for i > m. This transfer function can be split into two parts in two ways:
Direct Form I: u → B(z) → x → 1/A(z) → y
Direct Form II: u → 1/A(z) → x → B(z) → y
IIR Filter in Direct Form I
• 2n memory blocks
• 2n+1 multiplications
and 2n additions
• n feedback paths
IIR Filter in Direct Form II (Redundant Variant)
• 2n memory blocks (the two delay chains hold identical values!)
• 2n+1 multiplications
and 2n additions
IIR Filter in Direct Form II (Non-Redundant Variant)
• n memory blocks
• 2n+1 multiplications
and 2n additions
• The n memory blocks correspond
to the n states of the filter
(see state space in control)
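The non-redundant direct form II can be written down in a few lines. The following is a sketch assuming the transfer function G(z) = (b0 + b1 z^-1 + ... + bn z^-n) / (1 + a1 z^-1 + ... + an z^-n) with n ≥ 1; the function name and the argument layout (b with n+1 entries, a with n entries, a0 = 1 implied) are assumptions for illustration.

```python
def iir_direct_form_2(b, a, u_seq):
    """Direct form II with n states.

    b = [b0, b1, ..., bn], a = [a1, ..., an] (a0 = 1 is implied), n >= 1.
    """
    n = len(a)
    assert len(b) == n + 1, "pad b with zeros if the numerator order is lower"
    x = [0.0] * n                      # x[i] holds the intermediate signal x(k-1-i)
    y_seq = []
    for u in u_seq:
        # Recursive part 1/A(z): x(k) = u(k) - a1 x(k-1) - ... - an x(k-n)
        x_new = u - sum(a[i] * x[i] for i in range(n))
        # Non-recursive part B(z): y(k) = b0 x(k) + b1 x(k-1) + ... + bn x(k-n)
        y = b[0] * x_new + sum(b[i + 1] * x[i] for i in range(n))
        x = [x_new] + x[:-1]           # shift the single delay chain
        y_seq.append(y)
    return y_seq


# First-order low-pass y(k) = 0.5 y(k-1) + 0.5 u(k): impulse response
print(iir_direct_form_2([0.5, 0.0], [-0.5], [1.0, 0.0, 0.0]))
```

For clarity the delay chain is shifted as a list here; combined with a ring buffer, only one memory block would have to be overwritten per step.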
Cascade Form
Consists of a series connection of second-order IIR filters in direct form II:
$$G(z) = \prod_{i=1}^{l} \frac{b_{i0} + b_{i1} z^{-1} + b_{i2} z^{-2}}{1 + a_{i1} z^{-1} + a_{i2} z^{-2}}$$
In this product each factor represents a second-order system with two conjugate complex or
two real poles. For an even order n of the complete filter, l = n/2.
For an odd n we have l = (n+1)/2 and $b_{l2} = a_{l2} = 0$.
Parallel Form
Consists of a parallel connection of filters derived from a partial fraction expansion.
This means that filters with poles at 0, with real poles at –ai and with conjugate complex pole
pairs at –fi and –fi* are run in parallel.
Ladder Form and Lattice Form
Representations in the form of continued fractions or lattice structures are sophisticated filter
forms that possess advantages with respect to robustness against round-off errors.
10.6 Non-Causal Filters
Causal Filters
For a causal filter, the output y(k) depends only on the current and previous inputs u(k–i)
with i ≥ 0. This automatically means that the impulse response g(k) is equal to zero for negative
times:
$$y(k) = \sum_{i=0}^{\infty} g(i)\, u(k-i)$$
since g(i) = 0 for i < 0; otherwise the future inputs u(k–i) would influence the now.
Non-Causal Filters
For a non-causal filter, the output y(k) also depends on the future inputs u(k–i) with i < 0.
This automatically means that the impulse response is not equal to zero for negative times:
$$y(k) = \sum_{i=-\infty}^{\infty} g(i)\, u(k-i)$$
since g(i) ≠ 0 for i < 0, because future inputs u(k–i) are relevant. The impulse response is
commonly symmetrical, but this is not necessary.
[Figure: input u(k), impulse response g(k) and output y(k) for a causal and a non-causal filter]
How Is the Future Known When Calculating a Non-Causal Filter?
• Offline data processing: The data set is available from start to end in the computer. Then
the “now” can be chosen arbitrarily by the user.
• Buffers in online data processing: Data is stored in a buffer for a couple of sampling time
steps, say D steps, before being processed further. The whole processing is therefore
delayed by D steps. Relative to this delayed “now” it is possible to look D
steps into the future, i.e., up to g(–D). It is important to note that in order to look D steps into
the future with a non-causal filter, we have to buffer D steps of the signal, thus introducing
a dead time of D steps.
[Figure: measurement signal over k = 0 ... 80, divided into past and future. Offline, “now” can be defined arbitrarily. Online (measurement signal → buffer → filter), signal processing is based on buffered signals that are D steps back with respect to real time, and thus as many steps can be predicted.]
Advantages of Non-Causal Filters
• An impulse or step response that is symmetric to k = 0 has a real frequency response, i.e.,
no phase delay (see green dashed filter response)!
Symmetry implies: $g(k) = g(-k)$
Rearrange:
$$G(i\omega) = \sum_{k=-\infty}^{\infty} g(k)\, e^{-i\omega T_0 k} = g(0) + \sum_{k=1}^{\infty} g(k) \left( e^{-i\omega T_0 k} + e^{i\omega T_0 k} \right) = g(0) + 2 \sum_{k=1}^{\infty} g(k) \cos(\omega T_0 k)$$
Each bracket is a conjugate complex pair and thus real, so $G(i\omega)$ is real for all ω.
• By forward and backward filtering of the data (which is possible only offline), every phase
delay introduced by the forward filtering is exactly compensated by the backward
filtering. This fact is independent of the nature of the linear filter and
thus is true for every type (FIR, IIR).
However, the data is filtered twice. This means we effectively have
the amplitude response $|G(i\omega)|^2$.
• Because a non-causal filter can “react” to a step input before it
actually happens, such a filter is much faster!
[Figure: step responses of a 4th-order Butterworth filter, non-causal versus causal]
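Forward and backward filtering can be tried out with a very simple filter. The sketch below uses a first-order IIR low-pass as a stand-in (an assumption; the script's figure uses a 4th-order Butterworth filter) and checks that the resulting overall impulse response is symmetric around the impulse, i.e., non-causal with zero phase. The function names and the value alpha = 0.5 are illustrative.

```python
def lowpass(u_seq, alpha=0.5):
    """Causal first-order IIR low-pass: y(k) = alpha*y(k-1) + (1-alpha)*u(k)."""
    y, out = 0.0, []
    for u in u_seq:
        y = alpha * y + (1.0 - alpha) * u
        out.append(y)
    return out


def forward_backward(u_seq, alpha=0.5):
    """Filter forward in time, then filter the result backward in time."""
    forward = lowpass(u_seq, alpha)
    backward = lowpass(forward[::-1], alpha)   # run over the reversed sequence
    return backward[::-1]                      # restore the original time order


# Impulse at k = 20 inside a record of 41 samples (offline processing)
impulse = [0.0] * 41
impulse[20] = 1.0
g = forward_backward(impulse)

print("response before the impulse:", [round(v, 4) for v in g[17:20]])
print("response after the impulse: ", [round(v, 4) for v in g[21:24]])
```

The response already rises before the impulse occurs, which is only possible because the whole record is available. Near the ends of a finite record the symmetry is slightly disturbed by the truncated filter tails.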
Drawbacks of Non-Causal Filters
• Can hardly be applied for applications with strict real-time requirements such as feedback
control because any delays deteriorate the performance significantly.
In communication systems, however, delays introduced by buffers usually are
– irrelevant/unimportant since communication is unidirectional (radio, TV),
– negligible when communication is bidirectional (telephone) because signal run times
introduce the major part of the delay anyway.
In feedback control a buffer would introduce an additional dead time. This has severe
consequences for the control quality (reduced phase margin, danger of instability). These
drawbacks are typically more important than the achievable improvements in signal quality.
[Figure: impulse responses over k of a causal filter gk(k), a buffer with D = 3 gp(k), a non-causal filter gak(k), and buffer + non-causal filter = causal overall system gp+ak(k)]
Non-Causal Filters in Feedback Control
Feedback control gives nice examples for non-causal filters:
1. Reference input filter: Commonly the future course of the reference value is known a
priori. Non-causal filters can easily exploit this knowledge.
2. Feedback filter: The comparison between desired and controlled value requires the controlled
value as fast as possible. A non-causal filter with buffer would introduce a dead time
which deteriorates the control performance because it causes phase lag. Here a non-causal
filter would be counterproductive. A “truly” non-causal filter cannot be employed
because the future controlled variable is unknown.
[Block diagram: reference input → Filter 1 → controller → plant, with Filter 2 in the feedback path]
Nonlinear filters are seldom applied due to the additional complexity in their handling and
design. In the field of image processing, however, they are more common. Most frequently,
simple nonlinear operators like max, min or other order/sorting operators can be found.
Median Filter
Probably the most important and frequently used nonlinear filter is the median filter. It is
helpful in eliminating outliers. In contrast to the arithmetic average, the median gives the
number that is right in the middle of a sorted sequence, i.e., half of the numbers are larger,
half of the numbers are smaller.
Example:
Sequence: 4, 7, 20, 21, 30 → median = 20, arithmetic average = 16.4
Sequence: 4, 7, 20, 21, 1000 → median = 20, arithmetic average = 210.4
The median is commonly used to suppress the influence of outliers, e.g. in statistics where the
arithmetic average does not represent the “typical” case, like study program duration, house prices, etc.
Median Filter for Elimination of Outliers
A median filter of order n has an output y(k) that is calculated as the median of the last
n data samples u(k), u(k–1), ..., u(k–n+1). With a median filter of order n, (n–1)/2 outliers
in series can be filtered out of n subsequent data samples and removed without distorting the
signal very much.
Example: Median filter of order 3 versus linear average FIR filter
Median filter: $y(k) = \mathrm{median}\{u(k),\, u(k-1),\, u(k-2)\}$
Linear FIR filter: $y(k) = \frac{1}{3}\left(u(k) + u(k-1) + u(k-2)\right)$
[Figure: input u(k) with single and double outliers (one of them off-scale at 7.3), output y(k) of the median filter and output y(k) of the linear FIR filter, each over time k = 0 ... 30]
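The comparison between the median filter and the averaging FIR filter can be reproduced with a short sketch. The function names, the handling of the first samples (shorter windows for the median, zero initial conditions for the average) and the test signal are assumptions for illustration; only the outlier amplitude 7.3 is borrowed from the figure.

```python
def median_filter(u_seq, n=3):
    """Median of the last n samples u(k), ..., u(k-n+1)."""
    y_seq = []
    for k in range(len(u_seq)):
        window = sorted(u_seq[max(0, k - n + 1):k + 1])  # shorter at start-up
        y_seq.append(window[len(window) // 2])
    return y_seq


def mean_filter(u_seq, n=3):
    """Linear average FIR filter over the last n samples (zero initial state)."""
    y_seq = []
    for k in range(len(u_seq)):
        window = u_seq[max(0, k - n + 1):k + 1]
        y_seq.append(sum(window) / n)
    return y_seq


u = [0.0, 0.0, 0.0, 7.3, 0.0, 0.0]      # constant signal with one outlier
print("median:", median_filter(u))
print("mean:  ", [round(v, 3) for v in mean_filter(u)])
```

The median filter removes the single outlier completely, whereas the linear filter only smears it over three samples with one third of its amplitude.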
10.8 Outlook: Adaptive Filters
What is an Adaptive Filter?
An adaptive filter has no fixed parameters but they change over time in order to meet
changing requirements. The time-varying parameters are typically changed according to some
adaptation law in order to improve the performance of the filter. Typical applications are:
• Online system identification: A time-variant process shall be identified (modeled by
measurement data). Because the process behavior changes over time the filter has to track
these changes.
• Channel equalization: A signal is distorted from sender to receiver by the dynamic
channel in between (obstacles, reflections, ...). This distortion must be compensated
(canceled) at the receiver to improve the quality. E.g. built into cell phones!
• Echo compensation: To avoid (or weaken) acoustic feedback distortions, adaptive filters
are applied to eliminate the part of the sound signal that travels back from the speaker to the
microphone.
• Active noise suppression: An adaptive filter can model a measured disturbance in order to
actively compensate it by adding it to the signal with 180° phase shift (destructive
interference).
Principles of an Adaptive Filter
• Comparison between desired filter output yd(k) and actual filter output y(k).
• Calculation of the error e(k).
• In the adaptation law the change of the filter parameters is computed from the error. This
usually is done by an update of the filter parameters according to:
$$\theta(k) = \theta(k-1) + \Delta\theta(k)$$
• Different adaptation laws differ in how they calculate this
parameter update Δθ(k). The following goals are pursued and for each application a suitable
compromise must be sought:
– convergence speed
– tracking speed
– computational demand in each update step
– numerical robustness (round-off errors!)
• Typical adaptation laws are:
– least mean squares (LMS): gradient method
– recursive least squares (RLS): Newton’s method
[Block diagram: adaptive filter with parameters θ(k), adjusted by the adaptation law]
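Online Adaptation
The gradient method tries to minimize the quadratic error e²(k) by changing the parameter
vector in the direction opposite to the steepest ascent (gradient) by a step proportional to the step
size or length η:
$$\theta(k) = \theta(k-1) - \eta\, \frac{\partial e^2(k)}{\partial \theta(k-1)}$$
Commonly adaptive filters are of FIR type, i.e.:
$$y(k) = \theta_0\, u(k) + \theta_1\, u(k-1) + \dots + \theta_m\, u(k-m)$$
with
$$\frac{\partial y(k)}{\partial \theta_i} = u(k-i)$$
Thus the parameter update becomes (Remember: e(k) = yd(k) – y(k)):
$$\theta_i(k) = \theta_i(k-1) + \eta'\, e(k)\, u(k-i) \quad \text{with } \eta' = 2\eta$$
This means the update is proportional to the (new) step size η´, to the error e(k) and to the
“excitation” (regressor) of the corresponding parameter θi by u(k–i).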
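The LMS update can be tried out on a small identification task. The following sketch assumes an FIR-type adaptive filter with the update θi(k) = θi(k–1) + η´·e(k)·u(k–i); the function name `lms_identify`, the "unknown" system [0.5, –0.3], the noise-free data and the step size 0.1 are illustrative assumptions.

```python
import random


def lms_identify(u_seq, yd_seq, m, eta):
    """Adapt the m+1 parameters of an FIR filter with the LMS law."""
    theta = [0.0] * (m + 1)
    for k in range(len(u_seq)):
        # Regressor: u(k), u(k-1), ..., u(k-m) (zero before the record starts)
        phi = [u_seq[k - i] if k - i >= 0 else 0.0 for i in range(m + 1)]
        y = sum(t * p for t, p in zip(theta, phi))       # actual filter output
        e = yd_seq[k] - y                                # error e(k) = yd(k) - y(k)
        theta = [t + eta * e * p for t, p in zip(theta, phi)]
    return theta


rng = random.Random(0)                                   # reproducible excitation
u = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

true_theta = [0.5, -0.3]                                 # "unknown" FIR system
yd = [true_theta[0] * u[k] + (true_theta[1] * u[k - 1] if k > 0 else 0.0)
      for k in range(len(u))]

theta = lms_identify(u, yd, m=1, eta=0.1)
print("identified parameters:", [round(t, 6) for t in theta])
```

A larger step size speeds up convergence and tracking but brings the update closer to instability, which is the compromise between the goals named above.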
Discrete-time transfer function:
sys = filt(num,den); % [2] Assigning a discrete-time transfer function
FIR filter:
fir1 % [1] FIR filter using the window method
firls % [1] FIR filter using least squares optimization
firpm % [1] FIR filter using Parks-McClellan optimization
IIR filter:
besself % [1] Bessel filter
butter % [1] Butterworth filter
cheby1 % [1] Chebyshev filter type 1
cheby2 % [1] Chebyshev filter type 2
ellip % [1] Cauer filter (elliptic filter)
y = filter(b,a,X); % Digital IIR filter (direct form II)
y = filtfilt(b,a,X); % [1] Corresponding non-causal filter
% with forward and backward path
% WARNING: The amplitude response has
% the squared (twice) effect
[b,a] = yulewalk(n,f,m); % [1] Digital, recursive IIR filter.
% Uses least squares to model the
% frequency response
H = dfilt.structure(in1,...); % [1] Yields discrete-time filter according
% to the method 'structure', see
% MATLAB help
[b,a] = prony(h,n,m); % [1] Filter design in the time-domain
% according to the “Prony” method
Filter parameter identification:
[b,a] = invfreqz(h,w,n,m); % [1] Identifies a discrete-time filter from a given
% amplitude and phase response (continuous-time:
% “invfreqs”)
[1]: Signal Processing Toolbox
[2]: Control System Toolbox
11. Selected Methods in
Signal Processing
Contents of Chapter 11
11.1 Principal Component Analysis (PCA)
11.2 Clustering
11.1 Principal Component Analysis
Data Preprocessing
Complex tasks in signal processing are often partitioned into two or more steps that can each
be handled more simply on their own. Typically, the first step is called signal preprocessing.
Depending on the specific task, signal preprocessing can be:
• Filtering, smoothing, interpolation
• Transformation of data into a new coordinate system
• Dimension reduction, data compression
• Transformation into the frequency domain
• Feature extraction
• Nonlinearity transform
Some of the most common and important data preprocessing approaches will be discussed in
the following.
[Figure: original inputs u1, u2, u3, ..., up → data preprocessing → transformed inputs x1, x2, x3, ..., xq → further data processing → outputs y1, y2, y3, ..., yr]
Supervised versus Unsupervised Learning
Two approaches to learning can be distinguished:
• Supervised learning: The desired output y is known and is compared with the result $\hat{y}$ of the
used method. A loss function that measures the quality of the method and depends on y
and $\hat{y}$ is calculated and often optimized. Frequently the mean squared error (MSE) is used
for that purpose.
• Unsupervised learning: The desired output y is unknown or at least not used. Rather, an
interim goal is defined which can be calculated solely from the input data {ui(k)}, i = 1, 2,
..., p and k = 1, 2, ..., N. Frequently the distribution of data in the input space plays an
important role.
Unsupervised learning is much simpler to realize than supervised learning. The interim goal
is easier to achieve than the final one. However, the risk exists that the interim goal is not as
helpful as assumed. Therefore the success of unsupervised learning is not always guaranteed.
The methods presented here are unsupervised and require little computational effort.
Projection of Vectors
In order to keep the absolute value of u constant,
the vectors describing the coordinate axes have
to be normalized to one, i.e.:
$$\|x_i\| = \sqrt{x_i^T x_i} = 1$$
Transformation of the Coordinate System
With a principal component analysis (PCA), data is transformed from one coordinate system
into a new one. The first new axis shall point in the direction of the highest variance of the
data. The second new axis shall be orthogonal to the first and again point in the direction of the
highest remaining data variance, and so on. The idea behind this procedure is that data can often be
described best in directions of high variance and can often be neglected in directions of low
variance. The low variance directions typically represent just noise.
The example in the figure illustrates this idea. The data distribution shows a strong correlation
between u1 and u2. It can be assumed that u1 and u2 depend on each other,
e.g. u2 = a u1 + n with a ≈ 0.7 and noise n. A PCA orients the
first axis in the direction of the highest variance, i.e., x1 = u1 + a u2,
and the second axis orthogonally, i.e., x2 = u2 – a u1.
If the assumed relationship between u1 and u2 is indeed
true, then x2 = n, i.e., x2 describes only noise and thus
contains no information and can be removed (dimensionality
reduction).
[Figure: scatter plot of the data in the u1-u2 plane (both axes from –1 to 1) with the new axes x1 and x2]
Derivation of Principal Component Analysis (PCA)
WARNING: The data needs to have zero mean!
Start with a p-dimensional space. The task of a PCA is to find new axes $x_i = [x_{i1}\ x_{i2}\ \dots\ x_{ip}]^T$
for i = 1, 2, ..., p, where the first axis points in the direction of the highest data variance, the
second axis in the direction of the second highest, and so on. All axes shall be orthogonal to each
other.
In the N×p data matrix U all data is stored with respect to the original coordinate system:
$$U = \begin{bmatrix} u_1(1) & u_2(1) & \cdots & u_p(1) \\ u_1(2) & u_2(2) & \cdots & u_p(2) \\ \vdots & \vdots & & \vdots \\ u_1(N) & u_2(N) & \cdots & u_p(N) \end{bmatrix}$$
Each row is one data point (the second row is the second data point, N data points in total) and
each column belongs to one old axis (the second column to the second old axis, p dimensions in total).
The scalar products $u^T(k)\,x$ are the projections of the k = 1, 2, ..., N data points onto an
arbitrary axis $x = [x_1\ x_2\ \dots\ x_p]^T$. If the data has zero mean (if not, the mean has to be
subtracted first), then the following expression corresponds to the squared distance to the
mean (which is equal to 0): $(u^T(k)\,x)^2$.
If we calculate this squared distance for each data point and sum them up, we get (up to the
factor 1/N) the variance of the whole data along the new axis x:
$$\sum_{k=1}^{N} \left( u^T(k)\, x \right)^2 = (U x)^T (U x) = x^T U^T U\, x$$
We want to maximize this expression. However, we must prevent the variance from becoming
large just by rescaling the axis (and thereby generating large numbers). Thus the scaling of the
axes is restricted to a norm of 1:
$$x^T x = 1$$
This constraint is included in the optimization. With λ as Lagrange multiplier we obtain the
following optimization problem:
$$\max_x \; x^T U^T U\, x + \lambda \left( 1 - x^T x \right)$$
The solution of this maximization yields the eigenvalue problem:
$$U^T U\, x = \lambda\, x$$
The eigenvector corresponding to the highest eigenvalue λ1 is the first axis x1, the eigenvector
corresponding to the second highest eigenvalue λ2 is the second axis x2, and so on up to the
smallest eigenvalue λp with the p-th axis xp. The eigenvalues of $U^TU$ are the squared singular
values of U and thus can be computed with a singular value decomposition (SVD). This can
be done to an extremely high accuracy without explicitly squaring the matrix U. These
eigenvalues are all non-negative and the associated eigenvectors are orthogonal to each other.
For fun...
[Photos: Gene H. Golub, 1932-2007 (www.wikipedia.org), and Gene Golub's licence plate (photo: Professor Kroonenberg, Leiden University)]
Gene Golub was a computer
scientist at Stanford University.
He contributed more than
anyone else to making the SVD the
most powerful and common tool
of modern linear algebra (matrix
computation).
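For two dimensions the eigenvalue problem can be solved in closed form, which allows a compact illustration of the derivation. This is a sketch under that restriction (p = 2); the function name `pca_2d` and the noise-free test data on the line u2 = 0.7·u1 are assumptions chosen to mirror the earlier example. For real data of higher dimension an SVD routine would be used instead.

```python
import math


def pca_2d(points):
    """PCA for 2-D data: returns the axes (x1, x2) and eigenvalues (lam1, lam2)."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    centered = [(p[0] - m1, p[1] - m2) for p in points]   # enforce zero mean

    # Entries of the symmetric 2x2 matrix U^T U
    s11 = sum(c[0] * c[0] for c in centered)
    s12 = sum(c[0] * c[1] for c in centered)
    s22 = sum(c[1] * c[1] for c in centered)

    # Direction that maximizes x^T U^T U x subject to x^T x = 1
    angle = 0.5 * math.atan2(2.0 * s12, s11 - s22)
    c, s = math.cos(angle), math.sin(angle)
    x1, x2 = (c, s), (-s, c)                              # orthonormal axes

    lam1 = s11 * c * c + 2.0 * s12 * c * s + s22 * s * s  # largest eigenvalue
    lam2 = s11 * s * s - 2.0 * s12 * c * s + s22 * c * c  # smallest eigenvalue
    return (x1, x2), (lam1, lam2)


points = [(t / 10.0, 0.7 * t / 10.0) for t in range(-10, 11)]  # u2 = 0.7 u1
(x1, x2), (lam1, lam2) = pca_2d(points)
print("first axis:", x1, "eigenvalues:", lam1, lam2)
```

Because the test data contains no noise, the second eigenvalue is (numerically) zero and the second axis could be dropped without any loss.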
Singular Value Decomposition (SVD)
SVD computes the following matrix decomposition of an m×n matrix U:
$$U = W\, S\, V^T$$
If U has more rows than columns (m > n), V is an n×n matrix and S contains the n×n square
block diag{s1, s2, ..., sn}. This square block carries the singular values of U on
its diagonal. They are identical to the square roots of the eigenvalues of $U^TU$.
They are sorted from large to small.
Therefore the matrix U can be decomposed into a sum of n outer products (each has rank 1),
whose influence becomes smaller through the decreasing singular values:
$$U = \sum_{i=1}^{n} s_i\, w_i\, v_i^T \quad \text{with } s_1 \ge s_2 \ge \dots \ge s_n \ge 0$$
where $w_i$ and $v_i$ are the i-th columns of W and V. The maximal rank is n. If the rank of U
is r < n, then $s_{r+1} = \dots = s_n = 0$.
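Example:
$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 10 & 11 & 12 \\ 13 & 14 & 15 \end{bmatrix} = \begin{bmatrix} -0.1013 & 0.7679 & -0.0183 \\ -0.2486 & 0.4881 & 0.5367 \\ -0.3958 & 0.2082 & -0.8133 \\ -0.5430 & -0.0717 & 0.0896 \\ -0.6902 & -0.3515 & 0.2053 \end{bmatrix} \cdot \begin{bmatrix} 35.1826 & 0 & 0 \\ 0 & 1.4769 & 0 \\ 0 & 0 & 0.0000 \end{bmatrix} \cdot \begin{bmatrix} -0.5193 & -0.5755 & -0.6318 \\ -0.7508 & -0.0459 & 0.6589 \\ -0.4082 & 0.8165 & -0.4082 \end{bmatrix}$$
$$= 35.1826 \begin{bmatrix} -0.1013 \\ -0.2486 \\ -0.3958 \\ -0.5430 \\ -0.6902 \end{bmatrix} \cdot \begin{bmatrix} -0.5193 & -0.5755 & -0.6318 \end{bmatrix} + 1.4769 \begin{bmatrix} 0.7679 \\ 0.4881 \\ 0.2082 \\ -0.0717 \\ -0.3515 \end{bmatrix} \cdot \begin{bmatrix} -0.7508 & -0.0459 & 0.6589 \end{bmatrix} + 0$$
U has only rank 2 since s3 = 0, and thus the third singular value does not contribute to the rank.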
Dimension Reduction by PCA
The PCA transforms data from one p-dimensional space into another p-dimensional space.
This in itself can be an advantage because the new data distribution can be numerically
better or easier to interpret. One step further is dimensionality reduction by PCA. Here all
axes with low variance (below some threshold) are removed. The underlying (implicit)
assumption is that these axes represent just noise. This is especially appropriate for extremely
high-dimensional spaces where supervised techniques would be too complicated.
Transformation
The columns of the matrix V contain the eigenvectors of $U^TU$. They are also called the right
singular vectors of U. Correspondingly, the left singular vectors of U are in the columns of the
matrix W and are identical to the eigenvectors of $UU^T$. The data contained in the matrix U
can be transformed linearly into the new space by:
$$X = U\, V$$
For the transformation back from X to U we have to calculate:
$$U = X\, V^{-1} = X\, V^T$$
The last equality holds because V is orthogonal (unitary), i.e., $V^TV = I$ and $VV^T = I$ and thus $V^T = V^{-1}$.
In the case of dimensionality reduction only the most important axes are selected. They
belong to the largest eigenvalues of $U^TU$ or to the largest singular values of U, respectively.
Because an SVD sorts the singular values in descending order, this corresponds to
the first singular values.
Example: Compression of a picture
• A picture with 128×45 pixels is represented as a 128×45-dimensional
matrix where “0” stands for “black” up to “255” for “white” with
many grey shades in between.
• The most important 5-10 axes from a PCA already represent the
picture quite well. The singular values quickly decline to 0.
• Computational effort is high. This method is not used in practice.
[Figure: size of the singular values over the dimensions 1 to 45, on a linear and on a logarithmic scale; the first axes cover 97% of the variance]
[Figure: original picture (45 axes) and dimensionality reduction to 1, 2, 5, 10 and 20 axes]
Example: Character Recognition (Source: http://www.cs.mcgill.ca/~sqrt/dimr/dimreduction.html)
• Characters A-Z with 5×5 pixels with
“0” = “black” and “1” = “white”.
• Each pixel corresponds to one
axis u1, u2, ..., u25.
• On each axis the pixel values (“0” or “1”)
are entered, i.e., in each dimension only
the values 0 and 1 appear.
• The 25-dimensional input space corresponds
to a unit hypercube. Data only appears at the
corners.
• PCA with dimensionality reduction to 2 axes
x1 and x2 explains 44% of the data variance!
• “A”/“R” and “W”/“N”/“M” lie closely
together. They are hard to distinguish from
the 2 features alone. For “X”/“O”, “T”/“H”
and “A”/“Y” the distinction is much easier!
[Figure: the characters A-Z plotted in the plane of the first two principal axes x1 and x2]
12 50 120
Figure 2. Rank 12, 50, and 120 approximations to a rank 598 color photo of Gene Golub.

computational work than Gaussian elimina- the components, and the scaled left singular So far in this column I have hardly men-
tion, but it has impeccable numerical prop- vectors, σkuk, are the scores. PCAs are usually tioned eigenvalues. I wanted to show that it
erties. You can judge whether the singular described in terms of the eigenvalues and ei- is possible to discuss singular values without
values are small enough to be regarded as genvectors of the covariance matrix, AAT, but discussing eigenvalues—but, of course, the
negligible, and if they are, analyze the rel- the SVD approach sometimes has better nu- two are closely related. In fact, if A is square,
evant singular system. merical properties. symmetric, and positive definite, its singular
Let Ek denote the outer product of the SVD and matrix approximation are often values and eigenvalues are equal, and its left
k-th left and right singular vectors, that is

$$E_k = u_k v_k^T$$

Then A can be expressed as a sum of rank-1 matrices,

$$A = \sum_{k=1}^{n} \sigma_k E_k$$

If you order the singular values in decreasing order, $\sigma_1 > \sigma_2 > \dots > \sigma_n$, and truncate the sum after r terms, the result is a rank-r approximation to the original matrix. The error in the approximation depends upon the magnitude of the neglected singular values. When you do this with a matrix of data that has been centered, by subtracting the mean of each column from the entire column, the process is known as principal component analysis (PCA). The right singular vectors, $v_k$, are

illustrated by approximating images. Our example starts with the photo on Gene Golub’s Web page (Figure 2). The image is 897-by-598 pixels. We stack the red, green, and blue JPEG components vertically to produce a 2691-by-598 matrix. We then do just one SVD computation. After computing a low-rank approximation, we repartition the matrix into RGB components. With just rank 12, the colors are accurately reproduced and Gene is recognizable, especially if you squint at the picture to allow your eyes to reconstruct the original image. With rank 50, you can begin to read the mathematics on the white board behind Gene. With rank 120, the image is almost indistinguishable from the full rank 598. (This is not a particularly effective image compression technique. In fact, my friends in image processing call it “image degradation.”)

and right singular vectors are equal to each other and to its eigenvectors. More generally, the singular values of A are the square roots of the eigenvalues of $A^T A$ or $A A^T$. Singular values are relevant when the matrix is regarded as a transformation from one space to a different space with possibly different dimensions. Eigenvalues are relevant when the matrix is regarded as a transformation from one space into itself, as, for example, in linear ordinary differential equations.

Google finds over 3,000,000 Web pages that mention “singular value decomposition” and almost 200,000 pages that mention “SVD MATLAB.” I knew about a few of these pages before I started to write this column. I came across some other interesting ones as I surfed around.

Professor SVD made all of this, and much more, possible. Thanks, Gene.
Reprinted from TheMathWorks News&Notes | October 2006 | www.mathworks.com

A Few Search Results for “Singular Value Decomposition”

■ The Wikipedia pages on SVD and PCA are quite good and contain a number of useful links, although not to each other.
en.wikipedia.org/wiki/Singular_value_decomposition
en.wikipedia.org/wiki/Principal_component_analysis

■ Rasmus Bro, a professor at the Royal Veterinary and Agricultural University in Denmark, and Barry Wise, head of Eigenvector Research in Wenatchee, Washington, both do chemometrics using SVD and PCA. One example involves the analysis of the absorption spectrum of water samples from a lake to identify upstream sources of pollution.
www.models.kvl.dk/users/rasmus
www.eigenvector.com

■ Tammy Kolda and Brett Bader, at Sandia National Labs in Livermore, CA, developed the Tensor Toolbox for MATLAB, which provides generalizations of PCA to multidimensional data sets.
csmr.ca.sandia.gov/~tgkolda/TensorToolbox

■ In 2003, Lawrence Sirovich of the Mount Sinai School of Medicine published “A pattern analysis of the second Rehnquist U.S. Supreme Court” in the Proceedings of the US National Academy of Sciences. His paper led to articles in the New York Times and the Washington Post because it provides a nonpolitical, phenomenological model of court decisions. Between 1994 and 2002, the court heard 468 cases. Since there are nine justices, each of whom takes a majority or minority position on each case, the data is a 468-by-9 matrix of +1s and -1s. If the judges had made their decisions by flipping coins, this matrix would almost certainly have rank 9. But Sirovich found that the third singular value is an order of magnitude smaller than the first one, so the matrix is well approximated by a matrix of rank 2. In other words, most of the court’s decisions are close to being in a two-dimensional subspace of all possible decisions.
www.pnas.org/cgi/reprint/100/13/7432

■ Latent Semantic Indexing involves the use of SVD with term-document matrices to perform document retrieval. For example, should a search for “singular value” also look for “eigenvalue”? See a 1999 SIAM Review paper by Michael Berry, Zlato Drmac, and Liz Jessup, “Matrices, Vector Spaces, and Information Retrieval.”
epubs.siam.org/SIREV/volume-41/art_34703.html

■ The first Google hit on “protein svd” is “Protein Substate Modeling and Identification Using the SVD,” by Tod Romo at Rice University. The site provides an electronic exposition of the use of SVD in the analysis of the structure and motion of proteins, and includes some gorgeous graphics.
bioc.rice.edu/~tromo/Sprez/toc.html

■ Los Alamos biophysicists Michael Wall, Andreas Rechsteiner, and Luis Rocha provide a good online reference about SVD and PCA, phrased in terms of applications to gene expression analysis.
public.lanl.gov/mewall/kluwer2002.html

■ “Representing cyclic human motion using functional analysis” (2005), by Dirk Ormoneit, Michael Black, Trevor Hastie, and Hedvig Kjellstrom, describes techniques involving Fourier analysis and principal component analysis for analyzing and modeling motion-capture data from activities such as walking.
www.csc.kth.se/~hedvig/publications/ivc_05.pdf

■ A related paper is “Decomposing biological motion: a framework for analysis and synthesis of human gait patterns” (2002), by Nicholaus Troje. Troje’s work is the basis for an “eigenwalker” demo.
www.journalofvision.org/2/5/2
www.mathworks.com/moler/ncm/walker.m

■ A search at the US Patent and Trademark Office Web page lists 1,197 U.S. patents that mention “singular value decomposition.” The oldest, issued in 1987, is for “A fiber optic inspection system for use in the inspection of sandwiched solder bonds in integrated circuit packages”. Other titles include “Compression of surface light fields”, “Method of seismic surveying”, “Semantic querying of a peer-to-peer network”, “Biochemical markers of brain function”, and “Diabetes management.”
www.uspto.gov/patft

RESOURCES

On the Early History of the Singular Value Decomposition
locus.siam.org/SIREV/volume-35/art_1035134.html

Cleve’s Corner Collection
www.mathworks.com/res/cleve
©1994-2006 by The MathWorks, Inc. MATLAB, Simulink, Stateﬂow, Handle Graphics, Real-Time Workshop, and xPC TargetBox are registered trademarks and SimBiology, SimEvents,
and SimHydraulics are trademarks of The MathWorks, Inc. Other product or brand names are trademarks or registered trademarks of their respective holders.
91425v00 10/06
Difficulties with Dimensionality Reduction
The assumption that low-variance axes are redundant and can be removed can be wrong! A small variance points towards a possible linear dependency, but this is not necessarily the case. An analysis based on input space distributions only can never ensure this with certainty. The output has to be considered in order to be sure.
For example, for dynamic processes a strong correlation of two subsequent outputs y(k–1) and y(k–2) occurs. However, they are not redundant if the process is, for example, of AR(2) type, that is, it follows the equation:
y(k) = –a1y(k–1) – a2y(k–2) + v(k)
Although y(k–1) and y(k–2) are highly correlated (the smaller the sampling time, the higher the correlation), both carry important information and are not redundant.
[Figure: top, y(k) over time k = 0 ... 100; bottom, scatter plot of y(k–2) over y(k–1), annotated "Strongly correlated, but important information!"]
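The AR(2) argument can be checked numerically. The following is a minimal sketch, not taken from the lecture: the coefficients a1 = -1.8 and a2 = 0.81, the noise level, and all function names are assumptions chosen for illustration. It simulates the process, measures the correlation between y(k–1) and y(k–2), and compares the one-step prediction error with both regressors against the error when y(k–2) is dropped.

```python
import random

def simulate_ar2(a1, a2, n, noise_std=0.1, seed=0):
    """Simulate y(k) = -a1*y(k-1) - a2*y(k-2) + v(k) with Gaussian noise v(k)."""
    rng = random.Random(seed)
    y = [0.0, 0.0]
    for _ in range(n):
        y.append(-a1 * y[-1] - a2 * y[-2] + rng.gauss(0.0, noise_std))
    return y[2:]

def correlation(a, b):
    """Empirical correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((z - mb) ** 2 for z in b)
    return cov / (va * vb) ** 0.5

def mse_one_regressor(target, regressor):
    """Best least-squares fit target ~ c*regressor and its mean squared error."""
    c = sum(t * r for t, r in zip(target, regressor)) / sum(r * r for r in regressor)
    return sum((t - c * r) ** 2 for t, r in zip(target, regressor)) / len(target)

a1, a2 = -1.8, 0.81          # assumed example values (stable double pole at 0.9)
y = simulate_ar2(a1, a2, n=5000)
y_k, y_km1, y_km2 = y[2:], y[1:-1], y[:-2]

rho = correlation(y_km1, y_km2)
# Prediction error with both regressors (true coefficients): only the noise remains.
mse_both = sum((yk + a1 * p1 + a2 * p2) ** 2
               for yk, p1, p2 in zip(y_k, y_km1, y_km2)) / len(y_k)
# Prediction error if y(k-2) is discarded as "redundant".
mse_only_km1 = mse_one_regressor(y_k, y_km1)

print(f"correlation of y(k-1) and y(k-2): {rho:.3f}")
print(f"MSE with both regressors: {mse_both:.4f}, with y(k-1) only: {mse_only_km1:.4f}")
```

Under these assumptions the two regressors are almost perfectly correlated, yet removing one of them clearly increases the prediction error.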
Feature Selection versus Feature Extraction
A dimensionality reduction with PCA yields a feature extraction. This means that from many original inputs, say p, a smaller number of features, say q, are generated. However, each feature may depend on all original inputs. Therefore the next processing step requires a smaller number of inputs/features and is simpler to perform. But none of the original p measurements can be discarded.
A more radical approach is feature selection. Here the task is not only to reduce the
dimensionality but also to remove inputs so that they don’t have to be measured anymore.
This simplifies not only the further processing but also the overall effort by requiring fewer
sensors.
[Figure: Feature extraction maps the inputs u1, u2, ..., up to the features x1, x2, ..., xq. Each output xi can depend on all inputs uj! Feature selection maps the inputs u1, u2, ..., up to a subset of q inputs (e.g. u2, u5, ...). Each output is identical to one input!]
Application: Classification
A frequent application of PCA is data pre-processing, especially for dimensionality reduction in classification. The task is to correctly map measurements to r different classes. This can be done with the original measurements u1, u2, ..., up or with features x1, x2, ..., xq extracted from these measurements. Usually q << p, which means that the classification problem becomes of much lower dimensionality.
In the A-Z character recognition example we have r = 26 classes, p = 5×5 = 25 original inputs, and only q = 2 features (although for a classification accuracy higher than 44% we would realistically require 3-5 features).
For a coin-operated machine we would have to distinguish between r = 9 classes (1c, 2c, 5c,
10c, 20c, 50c, 1€, 2€, “no €-coin”). Possible inputs are
• Weight, color, diameter, thickness, reflectance, ...
[Figure: The inputs u1, u2, ..., up pass through the feature extraction, which yields the features x1, x2, ..., xq. The classification maps these features to the outputs y1, y2, ..., yr.]
11.2 Clustering
Basics of Clustering
Like PCA, clustering operates on the input data. The task is to find groups (clusters) of data points. These groups can be of different shapes and sizes. Depending on the method, a special prototype is defined that specifies what a cluster should look like. In two dimensions examples are: hollow or filled circles or ellipsoids, lines, ...
[Figure: the same data in the u1-u2 plane, clustered with K = 3 and with K = 4.]
A similarity measure is defined as a loss function. The similarity of each cluster is evaluated with this similarity measure. The famous K-means clustering, for example, utilizes the following type of loss function:

$$J = \sum_{j=1}^{K} \sum_{i \in S_j} \left\| u(i) - c_j \right\|^2$$

where K is the number of clusters, $c_j$ is the center of gravity of cluster j, and i runs over the set $S_j$ of data points u(i) belonging to the cluster j, i.e., the cluster whose center of gravity is closest (in the Euclidean sense).
K-means clustering tries to find K filled circles (or spheres) by minimizing the quadratic distances of all data points to the center of their associated cluster.
Instead of looking for circles (spheres), it can easily be extended to ellipses (ellipsoids) of a certain shape, i.e., a given covariance matrix Σ. This can be done by replacing the Euclidean distance metric with the so-called Mahalanobis distance.
[Figure: lines with identical Mahalanobis distance in the u1-u2 plane for three cases: Σ ~ unit matrix (circles), Σ = diagonal matrix (axis-parallel ellipses), Σ = symmetric positive definite matrix (rotated ellipses).]
An extension to higher dimensions is easily possible.

It is also possible to look for ellipses (ellipsoids) of variable covariance matrix (shape). However, this requires more complex algorithms such as those designed by Gustafson and Kessel or Gath and Geva.
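As an illustration of the three covariance cases, the squared Mahalanobis distance d² = (u − c)ᵀ Σ⁻¹ (u − c) can be written out for two dimensions. This is only a sketch; the function name and the example matrices are assumptions, not part of the lecture.

```python
def mahalanobis_sq_2d(u, c, cov):
    """Squared Mahalanobis distance (u - c)^T cov^{-1} (u - c) for 2-D points.

    cov is a 2x2 covariance matrix given as ((s11, s12), (s21, s22)).
    """
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    dx, dy = u[0] - c[0], u[1] - c[1]
    # The inverse of a 2x2 matrix is 1/det * [[s22, -s12], [-s21, s11]].
    return (s22 * dx * dx - (s12 + s21) * dx * dy + s11 * dy * dy) / det

center = (0.0, 0.0)

# Unit matrix: identical to the squared Euclidean distance (circles).
d_unit = mahalanobis_sq_2d((3.0, 4.0), center, ((1.0, 0.0), (0.0, 1.0)))

# Diagonal matrix: axis-parallel ellipses. Both points lie on the same contour.
diag = ((4.0, 0.0), (0.0, 1.0))
d_diag_a = mahalanobis_sq_2d((2.0, 0.0), center, diag)
d_diag_b = mahalanobis_sq_2d((0.0, 1.0), center, diag)

# Symmetric positive definite matrix: rotated ellipses.
spd = ((2.0, 1.0), (1.0, 2.0))
d_spd = mahalanobis_sq_2d((1.0, 1.0), center, spd)

print(d_unit, d_diag_a, d_diag_b, d_spd)
```

Using this distance in the assignment step of K-means, instead of the Euclidean one, yields clusters of the given elliptical shape.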
K-means Clustering
The K-means algorithm works as follows:
1. Choose the number of clusters K.
2. Initialize the cluster centers with randomly selected data points.
3. Assign each data sample to the cluster with the closest center (according to the chosen distance metric).
4. Calculate the center of gravity for each cluster (averaging the associated data points).
5. Place the new cluster centers at those centers of gravity.
6. If (at least) one cluster center has moved, then go to step 3; otherwise STOP.
It can be shown that this algorithm decreases the loss function (on the previous slide) in every iteration until it stops. However, it can converge to a local optimum. Because the initialization is random, different initializations can be tried out and the best result can be selected.
A difficult “tuning factor” is the choice for the number of clusters K.
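The six steps above can be sketched in plain Python. This is a minimal illustration under assumptions (Euclidean distance, data given as tuples, an empty cluster keeps its old center, hypothetical function names), not the implementation used for the figures in the lecture. Several random initializations are tried and the result with the lowest loss is kept, as recommended in the text.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(data, centers):
    """Step 3: index of the closest center for every data point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in data]

def kmeans(data, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = rng.sample(data, k)                 # step 2: random data points
    for _ in range(max_iter):
        labels = assign(data, centers)            # step 3
        new_centers = []
        for j in range(k):                        # steps 4 and 5
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])    # assumption: keep empty cluster
        if new_centers == centers:                # step 6: nothing moved, stop
            break
        centers = new_centers
    labels = assign(data, centers)
    loss = sum(sq_dist(p, centers[lab]) for p, lab in zip(data, labels))
    return centers, labels, loss

# Two well-separated groups of three points each (made-up example data).
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Try several initializations and keep the result with the smallest loss.
best_centers, best_labels, best_loss = min(
    (kmeans(data, k=2, seed=s) for s in range(5)), key=lambda result: result[2])
print(sorted(best_centers), best_loss)
```

The number of clusters k remains a free choice here, exactly as in step 1.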
Examples for K-means Clustering
Interpretation of the figures:
• Data points are marked by dots.
• The old cluster centers are marked by circles.
• The new cluster centers are marked by crosses.
• The color of the data points represents the association to the cluster of the same color.
Observations:
• Convergence is very fast; only a few iterations are needed.
• The global minimum of the loss function is reached in most cases.
• The sensitivity with respect to the initialization is low.
• For reasonable results the number of clusters has to be chosen in the right manner.
• Normalization of data is important because some dimensions can be dominant
(and others almost irrelevant) if axes are scaled differently.
[Figure: K = 3, 5 iterations until convergence. Six panels, the last five labeled 1. Iteration to 5. Iteration.]
[Figure: K = 3, 3 iterations until convergence. Fast convergence due to lucky initialization.]
[Figure: K = 3, 3 iterations until convergence. Bad result due to unlucky initialization.]
[Figure: K = 3, 6 iterations until convergence. Scaling of the y-axis is a factor of 100 larger.]
[Figure: K = 3, 10 iterations until convergence. Scaling of the x-axis is a factor of 100 larger.]
[Figure: K = 2, 3 iterations until convergence. Solution is stable, almost independent of the initialization.]
[Figure: K = 4. Many different solutions depending on the initialization. Three panels: 1. Solution, 2. Solution, 3. Solution (last iteration each).]
Fuzzy Clustering
The loss function known from K-means clustering can be re-written (extended):

$$J = \sum_{j=1}^{K} \sum_{i=1}^{N} \mu_{ij} \left\| u(i) - c_j \right\|^2$$

The second sum runs over all N data points (not only those belonging to a single cluster j). K-means is a special case of fuzzy K-means with

$$\mu_{ij} = \begin{cases} 1 & \text{if data point } i \text{ belongs to cluster } j \\ 0 & \text{if data point } i \text{ does not belong to cluster } j \end{cases}$$

The variable $\mu_{ij}$ denotes the degree of membership to a cluster. A value of “1” means this point fully belongs to that cluster. A value of “0” means it doesn’t. The degree of membership $\mu_{ij}$ can be extended from a binary value to a real value between 0 and 1. Each point then belongs to each cluster to a certain degree. For each data point, the degrees of membership have to sum up to 1. A degree of membership of 0.51 to cluster A is similar to 0.49 to cluster B and would yield similar results. In classical K-means the membership is binary and the point would be fully associated with cluster A and not at all with cluster B. Therefore, fuzzy clustering is less prone to bad initialization.
Clustering for Classification
Like PCA, clustering is suitable for data pre-processing. It is often utilized for solving classification problems. Instead of directly feeding the input features to the (supervised) classifier, they are clustered first. With the help of these clusters, the classifier has an easier task to perform the classification.
The underlying idea is that a certain distribution of the data reflects the associated classes. Often this is the case. However, this is not guaranteed. Therefore an unsupervised method can go astray.
[Figure: two scatter plots with data from class 1 and class 2. Top: cluster = classes. Bottom: cluster ≠ classes (class 1 forms two separate groups).]
PCA:
[COEFF,SCORE] = princomp(X);              % requires the Statistics Toolbox
Singular Value Decomposition:
[U,S,V] = svd(X);
Fuzzy K-means Clustering:
[center,U,obj_fcn] = fcm(data,cluster_n); % requires the Fuzzy Logic Toolbox
5. Measurement Errors and Statistics
5.1 Measurement Errors
5.2 Accuracy Rating
5.3 Error Propagation
5.4 Histograms and Probability Density
5.5 Estimation of Mean and Variance
5.6 Confidence Intervals
Error Definitions
The absolute error e of a measurement is the difference between the displayed or output value y and the (typically unknown) true value $y_w$:

$$e = y - y_w$$

The relative error $e_r$ is the absolute error divided by the true value $y_w$ and is commonly given as a percentage:

$$e_r = \frac{e}{y_w} = \frac{y - y_w}{y_w}$$

The true value $y_w$ is unknown in practice (otherwise no measurement would be necessary).

With additional effort it can be determined with high accuracy:
• Measurement with a precision instrument.
• Comparison with a measuring standard.
Often the quadratic error $e^2$ (absolute or relative) is utilized as a criterion for optimization. Many reasons for this exist. An important one is that the resulting optimization is particularly easy to solve and manage (least squares).
Systematic and Random Errors
Two error classes have to be distinguished:
• Systematic errors: The cause and the type of the error are known. With a higher effort in the measurement system an improvement and/or compensation would be possible, at least in principle.
Examples: Temperature influence with strain gauges. Nonlinear characteristics.
• Random or stochastic errors: Repeated measurements under identical conditions yield different results. Typically the errors differ in size and sign (not necessarily, see quantization errors). The measurement values scatter! In contrast to systematic errors, random errors cannot be predicted or compensated. With averaging (calculating the mean value), however, their influence can be reduced. The scatter of the result typically decreases with $1/\sqrt{N}$, where N is the number of trials that are averaged.
Examples: Brownian Motion. Fluctuations in material composition.
If we look very closely, most/all errors are of systematic nature. We have limited resources
and cannot afford an arbitrary effort; we do not have infinite insights. Therefore we treat all
errors that seem to be random as random! Typically many independent small systematic
influences seem to be of random nature.
Error Causes
• Disturbances: Internal and external disturbances have to be distinguished:
− Internal disturbances affect the sensor itself, e.g. wear.
− External disturbances come from the outside world, e.g. temperature influences.
By accepting a high effort in the choice of a precision instrument and by changing the
environment (e.g. climate chamber), disturbances can be kept to a minimum but they can
never be annihilated.
• Observation errors: Error induced by the observer himself, e.g. by making a mistake
during the measurement, wrongly reading the display, … With care such errors can be
avoided.
• Feedback error: Influence of the sensor on the object to be measured, e.g. the temperature
of the thermometer changes the temperature of the body that shall be measured. The
amount of such feedback depends on the measurement method. Radiation-based
temperature measurement avoids such an unwanted feedback. Physics tells us some effect
can never be completely eliminated (Heisenberg’s uncertainty principle) but on a
macroscopic level it can be negligible with the appropriate method.
• Non-ideal characteristics: The measurement system can possess static and dynamic
errors and with a digital output it possesses quantization errors as well.
Non-ideal Sensor Characteristics
• Static errors: In the ideal case, the characteristic of the sensor is linear/affine. In practice nonlinearities distort the result.
Example: quantity = temperature, output = voltage:
T [°C]   –100   –50    0     50    100
U [V]    1      1.7    3     6     10
[Figure: U [V] (1 to 10) over T [°C] (–100 to 100); the nonlinear characteristic compared with a linear/affine one.]
• Dynamic errors: If the measured quantity changes over time, the sensor follows with a time constant and delay. If we do not wait long enough until the measurement values reach steady state (settling time), a dynamic error occurs.
[Figure: temperature T [°C] and sensor output U over time t.]
• Quantization errors: During the A/D conversion the discretization causes errors in time (through sampling) and in amplitude (through quantization). The latter corresponds to a stepwise characteristic with step height eQ. The maximum error is eQ/2.
[Figure: quantized voltage UQ [V] over U [V], a staircase with step height eQ.]
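The bound eQ/2 on the quantization error can be verified with a short sketch. It assumes a rounding quantizer with step height e_q = 0.1 V and sweeps a 0 V to 10 V range; the names and numbers are illustrative only.

```python
def quantize(u, e_q):
    """Round the voltage u to the nearest multiple of the step height e_q."""
    return e_q * round(u / e_q)

e_q = 0.1                                          # assumed step height in volts
voltages = [i * 0.001 for i in range(0, 10001)]    # sweep from 0 V to 10 V
max_err = max(abs(quantize(u, e_q) - u) for u in voltages)

print(f"largest quantization error: {max_err:.4f} V (bound e_q/2 = {e_q / 2} V)")
```

A truncating quantizer (always rounding down) would instead produce errors of up to almost a full step eQ.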
5.2 Accuracy Rating
The quality of measurement devices in practice is often characterized by their accuracy rating or guaranteed minimum accuracy. With this declaration a manufacturer guarantees that possible measurement errors within the specified conditions are limited to a certain interval.
The accuracy rating declares the maximum error to be expected as a percentage of the instrument range.
Typical accuracy ratings: 0.1; 0.2; 0.5; 1; 1.5; 2.5
Example: Voltage measurement, accuracy rating = 0.5

a) Range: 0 V – 100 V. Display: 7 V.
max. error = 0.5% · 100 V = 0.5 V. Guaranteed interval = 7 V ± 0.5 V.
b) Range: 0 V – 10 V. Display: 7 V.
max. error = 0.5% · 10 V = 0.05 V. Guaranteed interval = 7 V ± 0.05 V.
Recommendation: Always measure in the upper third of the instrument range!
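The example can be reproduced with a few lines. The sketch below uses hypothetical function names and additionally computes the error relative to the displayed value, which shows why a reading in the upper part of the range is preferable.

```python
def max_error(accuracy_rating_percent, range_span):
    """Guaranteed maximum absolute error: rating in percent of the range."""
    return accuracy_rating_percent / 100.0 * range_span

def max_relative_error_percent(accuracy_rating_percent, range_span, display):
    """Maximum error relative to the displayed value, in percent."""
    return 100.0 * max_error(accuracy_rating_percent, range_span) / display

err_a = max_error(0.5, 100.0)     # a) range 0 V to 100 V
err_b = max_error(0.5, 10.0)      # b) range 0 V to 10 V
rel_a = max_relative_error_percent(0.5, 100.0, 7.0)
rel_b = max_relative_error_percent(0.5, 10.0, 7.0)

print(f"a) 7 V +/- {err_a} V, relative to the reading: {rel_a:.2f} %")
print(f"b) 7 V +/- {err_b} V, relative to the reading: {rel_b:.2f} %")
```

For the same displayed 7 V, the error relative to the reading is about 7.1 % in the 100 V range but only about 0.71 % in the 10 V range.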
Problem
Commonly the quantity of interest cannot be measured directly but has to be calculated from other measurements.
Examples:
a) Determination of electrical power from voltage and current: $P = U \cdot I$
b) Determination of speed or velocity from distance and time interval: $v = s / t$
c) Determination of force via a resistance change that depends on length, area, and specific resistance (resistivity): $R = \rho \, l / A$ with $A = \pi r^2$
How do errors in the measurement of U, I, s, t, l, A (or r), ρ affect the final results?
Gaussian Error Propagation for Systematic Errors
The requested quantity y can be deduced from the measurement values $x_i$, i = 1, ..., n, as follows:

$$y = f(x_1, x_2, \dots, x_n)$$

The errors of the single measurements $x_i$ are denoted by $\Delta x_i$. This yields the following systematic error accumulation for the final output y:

$$\Delta y = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \, \Delta x_i$$

This equation is obtained directly from the Taylor series expansion of the function f, in which all terms of higher than first order (linear) are neglected. Thus it is approximately correct if the errors are small, i.e., $\Delta x_i$ is close to zero.
In the above equation, measurement errors can cancel or attenuate each other because they might be of opposite sign. Of course this requires knowledge about the right sign of $\Delta x_i$ and the slope of f(), and therefore about the systematic over- or underestimation.
A different situation exists if just the maximal magnitude of the errors can be assessed. The following worst case assessment is obtained.
Gaussian Error Propagation for Maximal Errors:

$$\Delta y_{max} = \sum_{i=1}^{n} \left| \frac{\partial f}{\partial x_i} \right| \, \left| \Delta x_i \right|$$

Examples:
a) Power measurement: $P = U \cdot I$

$$\Delta P = I \, \Delta U + U \, \Delta I, \qquad \Delta P_{max} = |I| \, |\Delta U| + |U| \, |\Delta I|$$

If, for example, the voltage is measured too small ($\Delta U < 0$) and the current too large ($\Delta I > 0$) (and U > 0, I > 0), then these errors can (partly) compensate each other. If nothing is known about the sign of the errors and only their magnitude can be assessed, then a maximal error assessment has to be made in which the individual errors accumulate.
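For the power example, the signed (systematic) and the worst-case error assessment can be compared numerically. The values U = 10 V, I = 2 A, ΔU = −0.1 V and ΔI = +0.02 A are made up for this sketch, and the function names are hypothetical.

```python
def power_error_signed(u, i, du, di):
    """First-order systematic error of P = U*I: dP = I*dU + U*dI."""
    return i * du + u * di

def power_error_max(u, i, du, di):
    """Worst-case error of P = U*I: |I|*|dU| + |U|*|dI|."""
    return abs(i) * abs(du) + abs(u) * abs(di)

U, I = 10.0, 2.0          # assumed true values in V and A
dU, dI = -0.1, 0.02       # voltage measured too small, current too large

dP_signed = power_error_signed(U, I, dU, dI)
dP_max = power_error_max(U, I, dU, dI)
dP_exact = (U + dU) * (I + dI) - U * I   # exact error including the dU*dI term

print(f"signed: {dP_signed:+.4f} W, worst case: {dP_max:.4f} W, exact: {dP_exact:+.4f} W")
```

With these numbers the two first-order contributions cancel, the worst-case bound is 0.4 W, and the exact error of −0.002 W consists only of the neglected second-order term ΔU·ΔI.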
Examples:
b) Speed measurement: $v = s / t$

$$\Delta v = \frac{1}{t} \, \Delta s - \frac{s}{t^2} \, \Delta t$$

In this example a (partial) compensation happens if both the distance and the time interval are over- or underestimated, because of the “−” sign. Notice that the second term can become extremely large if the time interval t is chosen very small, i.e., the speed measurement is then very sensitive with respect to measurement errors in time.
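c) Force measurement with strain gauges: $R = \rho \, l / A$ with $A = \pi r^2$

$$\Delta R = \frac{l}{A} \, \Delta \rho + \frac{\rho}{A} \, \Delta l - \frac{\rho \, l}{A^2} \, \Delta A \quad \text{with} \quad \Delta A = 2 \pi r \, \Delta r$$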
Gaussian Error Propagation for Random Errors
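The quantity y to be measured depends on the input quantities $x_i$, i = 1, ..., n, as follows:

$$y = f(x_1, x_2, \dots, x_n)$$

The standard deviation of the individual input factors $x_i$ shall be given by $s_{x_i}$. Then the standard deviation of the output quantity y becomes:

$$s_y = \sqrt{\sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 s_{x_i}^2}$$

Example: Averaging of N measurements with equal standard deviations $s_x$

$$y = \frac{1}{N} \sum_{i=1}^{N} x_i \quad \Rightarrow \quad s_y = \sqrt{N \cdot \frac{1}{N^2} \, s_x^2} = \frac{s_x}{\sqrt{N}}$$

This is a universal statistical law! 100 times more measurement values improve the quality by a factor of 10 by reducing the standard deviation of the output y correspondingly.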
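The 1/√N law can be observed in a small simulation. The sketch below assumes Gaussian noise with standard deviation 1 and repeats the averaging 2000 times for each N to estimate the scatter of the mean; the function name and the parameters are illustrative.

```python
import random
import statistics

def std_of_mean(n, trials=2000, sigma=1.0, seed=0):
    """Empirical standard deviation of the mean of n noisy measurements."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
             for _ in range(trials)]
    return statistics.stdev(means)

s_1 = std_of_mean(1)      # a single measurement: about sigma
s_4 = std_of_mean(4)      # about sigma / 2
s_100 = std_of_mean(100)  # about sigma / 10

print(f"N=1: {s_1:.3f}, N=4: {s_4:.3f}, N=100: {s_100:.3f}")
```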
Approximation for Random Errors in Practice
Because it is difficult to estimate the standard deviations $s_{x_i}$ for all quantities $x_i$, the following formula allows a rough assessment of the mean error of the output (strictly speaking this formula is not exact):

$$\Delta y \approx \sqrt{\sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \, \Delta x_i \right)^2}$$

The standard deviations $s_{x_i}$ are roughly approximated by $|\Delta x_i|$!
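Difference of the effect of systematic and random errors

For a sum $y = x_1 + \dots + x_N$, systematic errors $\Delta x_i = \Delta x$, i = 1, …, N, add up:

$$\Delta y = \sum_{i=1}^{N} \Delta x_i = N \, \Delta x$$

Random errors $\Delta x_i = \Delta x$, i = 1, …, N, partly compensate each other:

$$\Delta y = \sqrt{\sum_{i=1}^{N} \Delta x_i^2} = \sqrt{N} \, \Delta x$$

Therefore averaging yields benefits for random errors (smaller scattering)!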

Histograms
If we measure the same quantity N times under identical conditions, each outcome will be different due to random errors. In order to get an overview of the quality of the measurements and the size of the random errors, it makes sense to plot a histogram. This divides the measurements into intervals of size $\Delta x$. The number of measurement values that fall into interval i is called the frequency of the observation (German: “absolute Häufigkeit”) $H_i$. Each measurement falls into exactly one interval (with $n_I$ intervals):

$$\sum_{i=1}^{n_I} H_i = N$$

Recommendation for the number of intervals:
The relative frequency $h_i$ (German: “relative Häufigkeit”) describes the fraction of all N measurements that falls into interval i:

$$h_i = \frac{H_i}{N}$$

The relative frequencies of observations sum up to 1:

$$\sum_{i=1}^{n_I} h_i = 1$$
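The counting rules can be made concrete with a short sketch. It assumes equally wide intervals between the smallest and largest measurement and uses simulated Gaussian measurements; the number of intervals (10) is an arbitrary choice here and all names are illustrative.

```python
import random

def histogram(values, n_intervals):
    """Absolute frequencies H_i, relative frequencies h_i, and interval width.

    Assumes that the values are not all identical (otherwise the width is 0).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    counts = [0] * n_intervals
    for v in values:
        # The largest value would land in interval n_intervals, so clamp it.
        i = min(int((v - lo) / width), n_intervals - 1)
        counts[i] += 1
    rel = [c / len(values) for c in counts]
    return counts, rel, width

rng = random.Random(0)
measurements = [rng.gauss(5.0, 0.2) for _ in range(1000)]  # simulated data
H, h, dx = histogram(measurements, n_intervals=10)

print("H_i:", H)
print("sum of H_i:", sum(H), " sum of h_i:", round(sum(h), 12))
```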
Probability Density Function (PDF)
With a histogram it is easy to see how the measurements are distributed, e.g. how strongly they scatter around their mean value $\bar{x}$. If we increase the number of measurements N and at the same time increase the resolution by using more and more intervals $n_I$ of smaller and smaller size $\Delta x$, then the histogram converges to the probability density function (pdf):

$$p(x) = \lim_{N \to \infty, \, \Delta x \to 0} \frac{h_i}{\Delta x}$$

It is:

$$\int_{-\infty}^{\infty} p(x) \, dx = 1$$

The density p(x) is a continuous function, not a stepwise one. We can calculate the probability of a measurement falling into a certain interval $(x_1, x_2]$ by:

$$P(x_1 < x \le x_2) = \int_{x_1}^{x_2} p(x) \, dx$$

The true density p(x) according to which the measurements are distributed is usually unknown. Typically, realistic assumptions are made from insights into the first principles and from a histogram. In most cases a Gaussian distribution is assumed if nothing to the contrary is known. Here is why... (see next slide)
Normal Distribution (Gaussian)
A normal distribution with mean $\mu_x$ and variance $\sigma_x^2$ is defined as follows:

$$p(x) = \frac{1}{\sqrt{2\pi} \, \sigma_x} \exp\left( -\frac{(x - \mu_x)^2}{2 \sigma_x^2} \right)$$

It is of highest theoretical and practical importance. On the one hand, many other distributions can be approximated by the Gaussian (e.g. the binomial and Student's t distribution). On the other hand, the central limit theorem of statistics is the key foundation for the importance of the normal distribution. It says that the sum of many independent random variables follows approximately a normal distribution. This is truly remarkable because it makes (almost, there are some minor exceptions) no restrictions on the distribution of each random variable!
In practice, most random errors are caused by many tiny effects that sum up. Therefore, almost all random errors are nearly Gaussian distributed. This explains why the Gaussian appears so often and is so well known.
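The central limit theorem can be illustrated with variables that are clearly not Gaussian. In the following sketch (an assumed example, not from the lecture) each sample is the sum of 12 independent, uniformly distributed numbers. Theory gives a mean of 6 and a standard deviation of 1 for this sum, and for a normal distribution about 68.27% of the samples should lie within one standard deviation of the mean.

```python
import random
import statistics

def sum_of_uniforms(n_terms, rng):
    """Sum of n_terms independent random numbers, each uniform on [0, 1)."""
    return sum(rng.random() for _ in range(n_terms))

rng = random.Random(0)
n_terms = 12
samples = [sum_of_uniforms(n_terms, rng) for _ in range(20000)]

mean = statistics.fmean(samples)    # theory: n_terms / 2 = 6
std = statistics.pstdev(samples)    # theory: sqrt(n_terms / 12) = 1
frac_1sigma = sum(abs(s - mean) < std for s in samples) / len(samples)

print(f"mean {mean:.3f}, std {std:.3f}, fraction within 1 sigma {frac_1sigma:.3f}")
```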
Fundamentals of Estimation
An estimation in the statistical sense is the determination of one or many, in general n, quantities (parameters) by utilizing N measurement data. Typically the number of estimated parameters n is significantly smaller than the number of available data N.
[Figure: the estimator maps N data to n parameters.]
Therefore an estimation can often be interpreted as a type of data reduction or compression.

Common examples are the estimation of the:
• mean value of the measurement data (n = 1).
• standard deviation (scattering) of the measurement data (n = 1).
• auto- or cross-correlation function of a time signal (n = large).
• coefficients of a regression line (n = 2) or polynomial (n = 3, …).
The estimation results depend on the actual measurement data. If the same quantity is
measured twice (even under identical conditions) we obtain different results and thus
different estimates, because the random disturbances (noise) have different values.
Properties of an Estimator: Variance
The estimation result depends on the random fluctuations of the disturbances, which are modeled as random variables. Thus the estimation will yield different results for each data set. The estimation result is distributed according to an (unknown) probability density, e.g. a Gaussian normal distribution.
The quality of an estimation obviously is high if the estimated values are close to each other. This is the case if the pdf is narrow, i.e. has a small variance. The smaller, the better.
A further demand on the properties of a good estimator is that the pdf becomes narrower the larger the amount of data N becomes. For many estimators the variance indeed follows the law:

$$\sigma_{estimator}^2 \sim \frac{1}{N}, \qquad \sigma_{estimator} \sim \frac{1}{\sqrt{N}}$$

A data set 4 times the size reduces the scattering by a factor of 2!
5.5 Estimation of Mean and Variance
Properties of an Estimator: Bias
Notation: $\hat{\theta}$: estimated parameter, $\theta_0$: true parameter
On the previous slide it was assumed that the mean value of the pdf is identical to the true (but unknown) value $\theta_0$ of the estimated parameters. If this is the case, the estimation is without bias (unbiased):

$$E\{\hat{\theta}\} = \theta_0$$

This is a desirable but not a necessary property. Furthermore it is often traded for other advantages like a low variance!
If the estimation is not unbiased, it possesses a bias (systematic estimation error) b:

$$b = E\{\hat{\theta}\} - \theta_0$$

If the bias (and the variance) tend to 0 for N → ∞, then we call this a consistent estimation.
Estimation of the Mean
Random errors can be reduced by averaging, i.e., calculating the mean value of several individual measurements. This is the simplest and most straightforward way to effectively lower scattering and noise influence. The estimation of the mean value thus plays an important role. We clearly distinguish between the true (but unknown) mean $\mu_x$ and the estimated mean value (also called sample mean or empirical mean) $\bar{x}$.
sample mean:

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x(i)$$

It can be shown that the sample mean is unbiased and approaches the true value if N becomes large:

$$E\{\bar{x}\} = \mu_x$$

It can also be shown that for statistically independent data the variance of the sample mean estimation decreases with an increasing number of data N as follows [4]:

$$\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{N}$$
Estimation of the Variance
The variance $\sigma_x^2$ of the data is also an important quantity. It determines how widely the data is spread or scattered. The estimation of the data variance (sample variance or empirical variance) can be performed by:
sample variance:

$$s_x^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( x(i) - \bar{x} \right)^2$$

($\mu_x$ is unknown! $\bar{x}$ is its estimate!)
The true mean $\mu_x$ is usually unknown and is replaced by its best estimate $\bar{x}$. Because of this the sum is divided by N–1 and not by N. One degree of freedom (dof) was already exploited or exhausted (figuratively speaking) for the estimation of this mean value and is not available anymore for the variance estimation. Only N–1 dof remain. It can also be shown theoretically that due to the denominator N–1 we have an unbiased estimation [4]:

$$E\{s_x^2\} = \sigma_x^2 \quad \rightarrow \text{unbiased!}$$

The variance of an estimate can be used for assessing the reliability of the estimate itself. It is required, for example, for the determination of the confidence intervals that indicate the reliability of the estimate.
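The effect of the denominator N–1 can be observed in a simulation. The sketch below draws many small data sets (N = 5, true variance 1, both assumed for illustration) and averages the variance estimates obtained with the denominators N–1 and N.

```python
import random

def variance_estimates(values):
    """Return the variance estimates with denominator N-1 and with N."""
    n = len(values)
    mean = sum(values) / n                       # sample mean
    ss = sum((v - mean) ** 2 for v in values)    # sum of squared deviations
    return ss / (n - 1), ss / n

rng = random.Random(0)
N, trials = 5, 20000
sum_unbiased = sum_biased = 0.0
for _ in range(trials):
    sample = [rng.gauss(0.0, 1.0) for _ in range(N)]   # true variance = 1
    unbiased, biased = variance_estimates(sample)
    sum_unbiased += unbiased
    sum_biased += biased

avg_unbiased = sum_unbiased / trials   # close to 1
avg_biased = sum_biased / trials       # close to (N-1)/N = 0.8

print(f"denominator N-1: {avg_unbiased:.3f}, denominator N: {avg_biased:.3f}")
```

On average the estimate with denominator N underestimates the true variance by the factor (N–1)/N, which is noticeable for small data sets.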
Trust in a Measurement
A measurement or an estimated mean from many measurements is practically almost useless
if its reliability is unknown. If its reliability is low then we cannot trust any information.
Different information sources can be obtained with different reliabilities. A prerequisite for
sensor fusion, for example, is some knowledge about their reliability. How can we quantify
this?
Confidence Interval
The trust or confidence in an estimate can be quantified based on its probability density function (pdf). The pdf allows calculating the probability that the true value lies within some interval. Typically a symmetric interval around the mean is considered. Most pdfs also have their maximal value at their mean. The probability that the deviation from the mean is smaller than ±δ is:

$$P(|x - \mu_x| < \delta) = \int_{\mu_x - \delta}^{\mu_x + \delta} p(x) \, dx$$

For any interval size (width) δ we can calculate the associated probability. The interval is called a confidence interval.
Confidence Interval for Normal Distributions
The “width” of a pdf is determined by its standard deviation. Therefore it makes sense to
measure the width of confidence intervals ±δ in terms of multiples of the standard deviation σ.
For normal distributions the following confidence intervals are common:
Interval    Probability (1–α)
±1σ         68.27%
±2σ         95.45%
±3σ         99.73%
±4σ         99.99%
[Figure: pdf of the standard normal distribution, plotted from −3 to 3.]
The associated probability values 1–α are called confidence levels. The probability of error is denoted by
α and is typically chosen as a small value like 5%, 1%, or
even 0.1%. The less risk can be accepted, the more multiples of the standard deviation must
be accounted for. Such considerations are also part of any quality control system where error
rates like 1 in 10,000 directly correspond to a multiple of σ.
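The relation between a multiple c of the standard deviation and the confidence level can be checked numerically. The following is a minimal sketch for the normal distribution only; the function name `confidence_level` is chosen here for illustration and does not come from the script.

```python
import math

def confidence_level(c: float) -> float:
    """Probability that a normally distributed value lies within
    +/- c standard deviations of its mean."""
    return math.erf(c / math.sqrt(2.0))

# Print the confidence levels for the common multiples of sigma.
for c in (1, 2, 3, 4):
    print(f"+/-{c} sigma: {100 * confidence_level(c):.2f} %")
```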
Decreasing the Standard Deviation
The quality of the estimator depends on the standard deviation that can be decreased by:
• Improvement of the quality of the measurement: Because we need to reduce random
errors this is usually a complex and expensive task. Typical approaches are based on the
isolation of environmental disturbances coming from temperature, air pressure, vibrations,
radiation, etc.
• Averaging over many measurements: This is the typical approach to reduce random
errors. The measurement is carried out several times and its average result is utilized.
We know already that calculating the mean of N measurement values reduces the original
standard deviation of the individual measurements σx as follows:
$\sigma_{\bar{x}} = \sigma_x / \sqrt{N}$
This means it is possible, in principle, to make the standard deviation of the mean
arbitrarily small. We just have to measure often enough! To double the accuracy we
have to measure 4 times as many values. In the end, this is just a matter of cost and time.
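The square-root law can be turned around to estimate how many measurements are needed for a desired accuracy. This is a small sketch under the assumption of independent measurements with identical standard deviation; both function names are hypothetical.

```python
import math

def std_of_mean(sigma_x: float, n: int) -> float:
    """Standard deviation of the mean of n independent measurements."""
    return sigma_x / math.sqrt(n)

def measurements_needed(sigma_x: float, target_sigma: float) -> int:
    """Smallest n for which the standard deviation of the mean
    does not exceed target_sigma."""
    return math.ceil((sigma_x / target_sigma) ** 2)

# Halving the standard deviation requires 4 times as many values.
print(std_of_mean(1.0, 1), std_of_mean(1.0, 4))
print(measurements_needed(1.0, 0.25))
```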
Confidence Intervals for Sample Mean With Known Standard Deviations
For random variables following a normal distribution, the confidence interval is
$x - c\,\sigma_x \le \mu \le x + c\,\sigma_x$
where the factor c corresponds to the requested confidence level 1–α or error probability α,
e.g. c = 1 for a confidence level of 68.27% or c = 3 for a confidence level of 99.73%.
Instead of measuring the value x a single time, the mean can be calculated from N measurements.
Then we replace x with $\bar{x}$ and its standard deviation decreases according to $1/\sqrt{N}$:
$\bar{x} - c\,\sigma_x/\sqrt{N} \le \mu \le \bar{x} + c\,\sigma_x/\sqrt{N}$
But this formula typically cannot be applied directly because the standard deviation σx is
unknown. The next best thing to do is to approximate it with the square root of the estimated
sample variance $s_x^2$. However, by using this approximation we make a (usually tiny) error.
Confidence Intervals for Sample Mean With Unknown Standard Deviations
Because the estimated sample standard deviation sx is only an approximation of the (unknown) true
standard deviation σx, the original confidence interval discussed above is not exactly accurate.
In order to take this uncertainty into account the formula for the confidence interval has to be
corrected. This can be done by replacing the normal distribution by the slightly wider
Student’s t-distribution. The t-distribution accounts for the additional uncertainty caused by
using the estimated instead of the true standard deviation. It thus
depends on the number of measurements N, which determines the so-called
degrees of freedom (dof). If the data set is huge
(N → ∞), the estimation error for sx tends to zero and
Student’s t-distribution converges to the
normal distribution. However, for only a few
measurements it becomes fatter at the outside,
making room for more uncertainty (fat tail!). This
yields wider confidence intervals.
[Figure: pdfs of the normal distribution and of the t-distributions for N = 10 and N = 3, plotted from −3 to 3.]
Confidence Intervals for Sample Mean With Unknown Standard Deviations
For random variables that follow a t-distribution the formula for the confidence interval is
basically unchanged:
$\bar{x} - c\,s_x/\sqrt{N} \le \mu \le \bar{x} + c\,s_x/\sqrt{N}$ (with $s_x$ as the estimate for $\sigma_x$)
but the factor c is larger than for a normal distribution
(see table). For large N the factor c is hardly
changed. But for small data sets (small N)
it becomes significantly bigger.
The standard deviation is not known as for
the normal distribution but estimated as follows:
$s_x = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2}$
Factor c for a t-distribution
N      1–α = 68.27%   1–α = 95.45%   1–α = 99.73%
5      1.11           2.65           5.51
10     1.05           2.28           3.96
20     1.03           2.13           3.42
50     1.01           2.05           3.16
100    1.00           2.03           3.08
200    1.00           2.01           3.04
∞      1.00           2.00           3.00   (= Gaussian distribution)
Example: Confidence Intervals
A voltage meter yields measurement values that are corrupted by random errors. These errors
come from an accumulation of many small disturbances which are not known in detail and
whose sources are not studied. Therefore we can assume the overall error follows a normal
distribution. From a long history of this voltage meter its behavior and accuracy are well
known. The variance of the disturbance is determined to be $\sigma_x^2 = 0.01\,\mathrm{V}^2$ or $\sigma_x = 0.1\,\mathrm{V}$.
a) The voltage meter displays: U = 7 V.
In which range will the true voltage be if we accept an error probability of at most 0.3%?
→ Requested confidence level = 99.7%. For a normal distribution this corresponds to c = 3.
The formula for known standard deviation is used, i.e., the confidence interval is
calculated from the normal distribution because the standard deviation is well-known
from the previous history of the instrument. (Or we assume N → ∞ for the estimate.)
$7\,\mathrm{V} - 3 \cdot 0.1\,\mathrm{V} \le U \le 7\,\mathrm{V} + 3 \cdot 0.1\,\mathrm{V}$, i.e., $6.7\,\mathrm{V} \le U \le 7.3\,\mathrm{V}$
b) The result in example a) does not fulfill our accuracy requirements. Therefore we decide
to carry out 10 separate measurements and calculate their mean (average). This should get
us closer to the true value than the above interval:
U [V]: 7.1 7.0 7.2 6.7 6.9 7.0 6.6 7.2 7.1 7.1
Sample mean: $\bar{U} = 6.99\,\mathrm{V}$
Standard deviation of the sample mean: $\sigma_{\bar{U}} = \sigma_x/\sqrt{10} \approx 0.032\,\mathrm{V}$
Confidence interval: $6.99\,\mathrm{V} \pm 3 \cdot 0.032\,\mathrm{V}$, i.e., $6.895\,\mathrm{V} \le U \le 7.085\,\mathrm{V}$
This result is more accurate by a factor of 3.16 (= √10) for the same error probability of 0.3%.
Even more measurements would improve the accuracy further.
c) We repeat the experimental setup from b) with a new instrument because the old one is
broken. Thus a long history of the instrument’s accuracy is not available. We no longer know
(as before) that the variance is 0.01. Therefore we have to estimate the instrument’s
accuracy by calculating the standard deviation of the 10 measurement values.
Sample standard deviation of the measurements: $s_x \approx 0.2\,\mathrm{V}$
Sample standard deviation of the mean: $s_x/\sqrt{10} \approx 0.064\,\mathrm{V}$
Factor c for the t-distribution with the confidence level of 1–α = 99.7%: c = 3.96
Confidence interval: $6.99\,\mathrm{V} \pm 3.96 \cdot 0.064\,\mathrm{V}$, i.e., approximately $6.74\,\mathrm{V} \le U \le 7.24\,\mathrm{V}$
The larger interval has two reasons:
(i) factor 2 bigger standard deviation of the measurements (instrument is worse),
(ii) factor 1.32 (3.96/3) bigger c-factor, because we need the t- instead of the normal
distribution since the instrument quality is only estimated.
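The numbers of examples b) and c) can be reproduced with a few lines of code. This sketch assumes that the same ten readings are evaluated in both cases; the factor c = 3.96 is taken from the table for N = 10, and all variable names are illustrative.

```python
import math
import statistics

u = [7.1, 7.0, 7.2, 6.7, 6.9, 7.0, 6.6, 7.2, 7.1, 7.1]  # readings in V
n = len(u)
mean_u = statistics.mean(u)

# b) Known standard deviation (0.1 V): normal distribution with c = 3.
sigma_x = 0.1
half_b = 3.0 * sigma_x / math.sqrt(n)

# c) Unknown standard deviation: estimate it with the denominator N-1
#    and use the larger factor c = 3.96 of the t-distribution.
s_x = statistics.stdev(u)
half_c = 3.96 * s_x / math.sqrt(n)

print(f"mean = {mean_u:.2f} V")
print(f"b) interval: {mean_u - half_b:.3f} V ... {mean_u + half_b:.3f} V")
print(f"c) s_x = {s_x:.3f} V, interval: {mean_u - half_c:.2f} V ... {mean_u + half_c:.2f} V")
```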
“Six Sigma (6σ)” Quality Management System
This quality control management system was introduced in the mid 1980s by Motorola and
has since been adopted by many companies. It became particularly famous due to its
introduction within General Electric (GE) by its CEO Jack Welch who made it a great
success, and the name “Six Sigma” became quite well-known.
The idea of Six Sigma is to reduce tolerances in such a way that the short-term standard deviation
becomes so small that the failure rate corresponds only to 6σ = quality of 1 ppb (parts per
billion). According to expert knowledge, long-term influences (mean changes slowly over
time due to wear etc.) already cause approximately ±1.5σ. Thus the final quality will be in
the range of 4.5σ = quality of 3.4 ppm (parts per million).
The implementation of “Six Sigma” is not only done in manufacturing. Rather all areas of a
company are required to deliver a high quality level. An important feature of “Six Sigma” is
an inherent feedback control. Quality is permanently measured and deviations from the
required numbers cause control actions. The five main steps in “Six Sigma” are:
Define. Measure. Analyze. Improve. Control. (DMAIC).
The statistical evaluation plays an important role in “Six Sigma”.
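The quoted failure rates can be related to the normal distribution. The sketch below assumes that only one tail of the distribution counts as a failure, which is the convention that reproduces the figures of about 1 ppb and 3.4 ppm; the function name is hypothetical.

```python
import math

def one_sided_failure_rate(k: float) -> float:
    """Probability that a normally distributed value exceeds
    its mean by more than k standard deviations."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

print(f"6.0 sigma: {one_sided_failure_rate(6.0) * 1e9:.2f} ppb")
print(f"4.5 sigma: {one_sided_failure_rate(4.5) * 1e6:.2f} ppm")
```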
6. Static and Dynamic Behavior
of Sensors
6. Static and Dynamic Behavior of Sensors
6.1 Overview
6.2 Static Behavior of Sensors
6.3 Dynamic Behavior of Sensors
6.4 Filtering of Sensor Signals
6.1 Overview
Measurement errors are commonly caused by one or more of the following
issues:
1. Nonlinear static characteristics of the instrument.
2. Dynamic transfer behavior of the instrument.
3. Noise superimposed on the desired signal.
Countermeasures can be taken against these error sources that eliminate or at least
reduce the error:
1. Compensation of the nonlinear distortion.
2. Compensation of the dynamic lag or waiting for the signal to settle (dynamics have faded).
3. Filtering to suppress noise.
Even if these countermeasures are not completely successful or sufficient it is important to
understand their effects. Only this allows one to assess the errors appropriately.
Linear Characteristics
The static characteristics between the input x and the output y can be described by a function:
$y = f(x)$
In sensorics we are primarily interested in the relationship between a measured quantity x,
e.g. temperature, pressure, or displacement, and the yielded or displayed output y of the
instrument, e.g. a voltage between 0 V and 10 V.
In the ideal case, this characteristic is linear, i.e., there exists a proportional relationship
between input and output:
$y = k\,x$
For converting between input and output (or back) only the
proportionality constant k is necessary. It is independent of
the operating point (OP). This is also true for the almost as
simple affine relationship that includes an additional offset:
$y = k\,x + k_0$
By a simple transformation of the axis
$\tilde{y} = y - k_0$
it can be transformed into the linear form $\tilde{y} = k\,x$.
Advantages of a Linear (Affine) Characteristics
• Easy to understand and to handle.
• Described by one (two) parameters: k (and k0).
• Identical sensitivities (slopes) in all operating points.
Live and Dead Zero
In measurement techniques the representation of the origin is practically important:
• Dead zero: If the output y = f(x) = 0 for x = 0, i.e., the characteristic goes exactly
through the origin of the coordinate system, as is the case for linear systems.
• Live zero: If the output y = f(x) ≠ 0 for x = 0, i.e., the characteristic does not go
through the origin of the coordinate system, as is the case for affine systems.
A live zero offers an important practical advantage. It allows one to distinguish between a zero
measurement x = 0 with y = k0 and a disconnection or other wire breakage (y = 0).
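How an evaluation unit might exploit a non-zero offset can be sketched as follows. The affine characteristic y = k·x + k0 is taken from the text; the numerical values (offset 2 V, slope 0.08 V per unit) and the tolerance are purely hypothetical.

```python
def interpret_output(y: float, k: float, k0: float, tol: float = 0.05):
    """Convert the sensor output y back to the measured quantity x.
    Returns None if y is (almost) zero, which for an affine
    characteristic with offset k0 != 0 indicates a wire break."""
    if abs(y) < tol:
        return None
    return (y - k0) / k

print(interpret_output(2.0, k=0.08, k0=2.0))   # zero measurement, x = 0
print(interpret_output(10.0, k=0.08, k0=2.0))  # x at the upper range end
print(interpret_output(0.0, k=0.08, k0=2.0))   # None: suspected disconnection
```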
Linearization
In reality every instrument will possess a nonlinear characteristic.
It is possible to approximate this relationship by a linear
or affine characteristic. Two alternative approaches exist:
1. Global approximation: The complete nonlinear
characteristic in the whole range is approximated
by a line (blue dashed).
2. Linearization around an operating point (OP): The nonlinear characteristic in a small
range around some operating point (OP) is approximated by a line (blue solid). Such an
approximation is superior to the first approach as long as the system stays close to the OP
(x0, y0). Each OP requires an individual line since the slope and offset depend on the OP.
The line follows the equation:
$y = y_0 + \left.\frac{df}{dx}\right|_{x_0} (x - x_0)$
[Figure: nonlinear characteristic with global approximation (blue dashed) and linearization around the OP (blue solid).]
Method 2 is better if x changes slowly and it is possible to adjust the line as the OP changes.
If the behavior is rapidly time-variant, method 1 might be better.
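The linearization around an operating point can be sketched numerically. The example assumes a square-root characteristic, which is chosen only for illustration, and approximates the slope with a central difference; the function names are hypothetical.

```python
import math

def linearize(f, x0: float, h: float = 1e-6):
    """Return slope and offset point (slope, y0) of the tangent
    of f at the operating point x0 (central difference)."""
    slope = (f(x0 + h) - f(x0 - h)) / (2.0 * h)
    return slope, f(x0)

def tangent(f, x0: float):
    """Return the linearized characteristic y0 + slope * (x - x0)."""
    slope, y0 = linearize(f, x0)
    return lambda x: y0 + slope * (x - x0)

line = tangent(math.sqrt, 4.0)
# Close to the OP the line is a good approximation, far away it is not.
print(line(4.1), math.sqrt(4.1))
print(line(9.0), math.sqrt(9.0))
```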
Sensitivity
The sensitivity S of an instrument is determined by the slope of its characteristic in the
considered OP:
$S = \left.\frac{dy}{dx}\right|_{x_0} = \left.\frac{df(x)}{dx}\right|_{x_0}$
If the sensitivity is low, a change in the measured value x
hardly affects the output y of the instrument!
In general, the sensitivity of a nonlinear characteristic is operating point dependent,
i.e., S = S(x0). For linear or affine characteristics the sensitivity is constant over the whole
operating range because the slope never changes, i.e., S = k.
Common nonlinear characteristics possess a monotonically
increasing or decreasing sensitivity (in absolute value).
The first behavior is called progressive, the latter is called
degressive.
Of course, more complicated characteristics with inflection
point(s) are possible as well. But the four main
characteristics shown in the figure cover at least 90% of all cases.
[Figure: the four main characteristics: progressively increasing, progressively decreasing, degressively increasing, degressively decreasing.]
Compensation of Nonlinear Behavior
If the nonlinear characteristic of a sensor is known (from manufacturer’s data or thorough
measurements) it can be compensated at least partially. Two alternatives exist:
• Differential principle: This is a popular approach for inductive and capacitive sensors and
utilizes a bridge circuit. The nonlinearity often cannot be fully compensated but the
approximation is commonly of high quality.
• Inversion of characteristics: By connecting the sensor and its inverted static
characteristic in series, both cancel each other. Theoretically, this is possible
if the characteristic is strictly monotonic.
However, practical problems occur if the sensitivity is extremely small or large. The latter
implies that the sensitivity of the inverted characteristic is extremely small.
This is also a standard method in control. Smart sensors commonly include such a
compensation as well. Together with such a compensation they offer (almost) linear
behavior which makes them very user-friendly.
Compensation Via Difference Calculation
The key idea is to calculate the difference between two signals that are caused by counteracting
(e.g. opposite) effects. For inductive (or capacitive) displacement sensors, e.g., one
signal shows a positive and the other a negative influence. Calculating the difference yields:
$y_d = f(x_0 + \Delta x) - f(x_0 - \Delta x)$
From a Taylor series expansion of the function f that gives
$f(x_0 \pm \Delta x) = f(x_0) \pm f'(x_0)\,\Delta x + \tfrac{1}{2} f''(x_0)\,\Delta x^2 \pm \tfrac{1}{6} f'''(x_0)\,\Delta x^3 + \dots$
we recognize that the quadratic terms (and all terms of even powers) are eliminated in the
difference calculation:
$y_d = 2 f'(x_0)\,\Delta x + \tfrac{1}{3} f'''(x_0)\,\Delta x^3 + \dots$
By eliminating the quadratic terms the characteristic
between the deviation Δx and yd becomes closer to linear in a
wider range. For any purely quadratic relationship the
difference even yields an exactly linear characteristic.
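The cancellation of the even terms can be verified numerically. The sketch assumes a purely quadratic characteristic with hypothetical coefficients and an operating point x0 = 1, for which the difference signal should be exactly proportional to the deviation.

```python
def f(x: float) -> float:
    """Hypothetical purely quadratic sensor characteristic."""
    return 1.0 + 0.5 * x + 0.2 * x ** 2

def difference_output(func, x0: float, dx: float) -> float:
    """Difference of two counteracting signals around x0."""
    return func(x0 + dx) - func(x0 - dx)

x0 = 1.0
slope = 2.0 * (0.5 + 2.0 * 0.2 * x0)  # 2 * f'(x0) = 1.8
for dx in (0.1, 0.5, 1.0):
    # Both printed values agree: the difference signal is linear in dx.
    print(dx, difference_output(f, x0, dx), slope * dx)
```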
Compensation Via Inversion
The key idea is to isolate x as a function of y (inversion):
$\tilde{x} = f^{-1}(y)$
The inverse function only exists if f(x) is one-to-one (bijective), i.e., if for every y from the physically
reasonable range exactly one x exists. If f(x) does not fulfill this property (most do), then
the inversion can be carried out in intervals in which this property holds. By such an inversion,
the electronics can compensate for all (or at least most) nonlinearities in the sensor. The “~”
indicates that an exact inversion is never possible in practice.
A prerequisite for an inversion is that the function f(x) is known accurately. Special care is
necessary for very small or very large sensitivities (where the sensitivity of the inverse is very small) because tiny
errors cause huge deviations.
[Figure: block diagram of the sensor followed by the evaluation unit.]
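If no closed-form inverse is available, the inversion can be carried out numerically within an interval where the characteristic is strictly monotonic. The following sketch uses bisection for a strictly increasing function; the quadratic example characteristic and the function name `invert` are assumptions for illustration.

```python
def invert(f, y: float, x_lo: float, x_hi: float, tol: float = 1e-10) -> float:
    """Solve f(x) = y for a strictly increasing f on [x_lo, x_hi]."""
    if not (f(x_lo) <= y <= f(x_hi)):
        raise ValueError("y is outside the invertible range")
    while x_hi - x_lo > tol:
        mid = 0.5 * (x_lo + x_hi)
        if f(mid) < y:
            x_lo = mid
        else:
            x_hi = mid
    return 0.5 * (x_lo + x_hi)

# Hypothetical characteristic y = x^2, invertible for x >= 0.
x_tilde = invert(lambda x: x ** 2, 9.0, 0.0, 10.0)
print(x_tilde)
```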
Determination of the Static Characteristics
• The input signal must be held constant long enough that the output signal has time to
settle. Then one point on the x-y-characteristics can be read out.
→ The time required for measuring through the entire characteristics is high!
• Characteristics are typically stored in a look-up table with
linear interpolation (red dashed). Alternatives: polynomials, neural networks, ...
• Characteristics for more than 1 input are called characteristic maps. They are commonly
measured on a grid, e.g. 8 × 8 combinations for 2 inputs.
[Figure: step-wise input x(t) and settling output y(t) over time t (left); resulting points (x01, y01) to (x06, y06) of the characteristic y over x (right).]
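A look-up table with linear interpolation can be sketched in a few lines. The table values below are hypothetical, and holding the boundary value outside the measured range is one possible design choice, not necessarily the one used in practice.

```python
from bisect import bisect_right

def lookup(xs, ys, x: float) -> float:
    """Linear interpolation in a look-up table with ascending xs.
    Outside the table range the boundary value is held."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1          # left neighbor of x
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

xs = [0.0, 1.0, 2.0, 3.0]   # measured inputs (hypothetical)
ys = [0.0, 1.0, 4.0, 9.0]   # settled outputs (hypothetical)
print(lookup(xs, ys, 1.5))
```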
Characteristics in Lookup Tables
If a quantity depends in a nonlinear way on several other quantities, a characteristic map is
required to describe such a behavior. For more than 2 input dimensions, however, only slices
can be graphically illustrated. Therefore a 2-D example:
[Figure: characteristic map of the engine torque [Nm] (−100 to 250) over the throttle angle [degree] (0 to 80) and the engine speed [rpm] (0 to 7000).]
A typical characteristic map from the automotive area: the control of
combustion engines. The engine torque
depends decisively on the engine speed
and the throttle angle (for gasoline engines)
or injection mass (for Diesel engines).
Dynamic Errors
The output y of an ideal instrument follows the input x instantaneously, i.e., without any time
lag. In reality such an ideal behavior cannot be realized. Masses have to be accelerated,
capacitors have to be charged, temperature must adjust, electric/magnetic fields have to build
up, signals need to be processed. Such delays or lags cause a so-called dynamic error.
Dynamic errors only show if the input signal changes. The faster these
changes are, the larger the errors become. Examples of really fast input signals are impulses or steps.
To compare the dynamic behavior of sensors it makes sense to relate to a common scenario
where the input changes step-wise and the deviation of the response y from a perfect step is
measured. The response can be partitioned into 3 parts:
1. 0 … Tt: y(t) does not react at all.
2. Tt … Tset: y(t) reacts (transient).
3. Tset … ∞: y(t) has settled (almost) to its final value.
For filters see Chapter 10.
[Figure: step input x(t) and response y(t) over time t with dead time Tt, transient, and settling time Tset.]
The smaller the dead time Tt and the settling time Tset are, the faster the sensor behaves and
the smaller the dynamic error becomes. An ideal sensor has Tt = 0 and Tset = 0, but of course
this is not possible.
Overshoot and Damping
Unfortunately the output does not always approach its final value as nicely and asymptotically
as shown on the last slide. Often the dynamic behavior (at least approximately) follows a
second-order differential equation:
$\frac{1}{\omega_0^2}\,\ddot{y}(t) + \frac{2D}{\omega_0}\,\dot{y}(t) + y(t) = k\,x(t)$
where D is the damping and ω0 is the resonance frequency
given by the physics of the sensor, and k is the static gain. The equation can, e.g.,
describe a mass-spring-damper system as it occurs in every
instrument needle/pointer. If the damping D is too low (D < 1)
oscillations will occur; if the damping D is too high (D > 1)
the settling time will be too long. Therefore the best
compromise is the so-called aperiodic limit case with D = 1.
[Figure: step responses y(t) over time t for D = 0.25, D = 0.5, D = 1, and D = 2.]
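The influence of the damping on the step response can be explored with a simple simulation. The sketch assumes the standard second-order form with unit static gain and ω0 = 1 and integrates it with a small fixed time step; step size, duration, and function names are illustrative choices.

```python
def step_response(D: float, omega0: float = 1.0, t_end: float = 40.0,
                  dt: float = 1e-3):
    """Response y(t) of a second-order system (static gain 1) to a
    unit step, integrated with the semi-implicit Euler method."""
    y, v = 0.0, 0.0          # output and its time derivative
    ys = []
    for _ in range(int(t_end / dt)):
        a = omega0 ** 2 * (1.0 - y) - 2.0 * D * omega0 * v
        v += a * dt
        y += v * dt
        ys.append(y)
    return ys

def overshoot(ys, final: float = 1.0) -> float:
    """Largest excursion above the final value (0 if none)."""
    return max(0.0, max(ys) - final)

for D in (0.25, 0.5, 1.0, 2.0):
    print(f"D = {D}: overshoot = {overshoot(step_response(D)):.3f}")
```

For D < 1 the simulated output overshoots its final value, while for D ≥ 1 it approaches the final value without oscillation.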
How to Avoid or Reduce Dynamic Errors?
1. Wait after a change in the measured quantity x until settling is reached after the time period
Tt + Tset and then read the output value y or process it further, respectively.
2. In a post-processing step the delayed and time-lagged output y(t) is predicted into the
future (non-causal system).
3. Reduce the time lag with a dynamic filter with differential character.
The price to be paid is a higher sensitivity to noise.
Methods 1 and 2 can only work if the output y is not needed at once! Method 1 additionally
requires that the changes are step-wise and not continuous.
Methods 1 and 2 thus cannot be used for feedback control systems! In feedback control
it is crucial that the controlled variable x is fed back at once to the comparison with the desired
value. The controller must act as quickly as possible with respect to deviations. Any
additional delay will deteriorate the control performance.
That leaves us with method 3 where it is important to find a
good trade-off between noise sensitivity and the reduction of
dynamic errors.
[Figure: feedback control loop with the signals w, e, u, x and the blocks controller, process, and sensor (output y).]
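The trade-off of method 3 can be illustrated with a sensor that behaves like a first-order lag. This model, the time constant, and the backward-difference filter are assumptions made for this sketch; they are not taken from the script.

```python
def simulate_lag(x, T: float, dt: float):
    """Sensor modeled as first-order lag T*dy/dt + y = x (explicit Euler)."""
    y = [0.0]
    for xk in x[:-1]:
        y.append(y[-1] + dt / T * (xk - y[-1]))
    return y

def compensate(y, T: float, dt: float):
    """Differential filter x_hat = y + T*dy/dt (backward difference)."""
    x_hat = [y[0]]
    for k in range(1, len(y)):
        x_hat.append(y[k] + T * (y[k] - y[k - 1]) / dt)
    return x_hat

dt, T = 0.01, 1.0
x = [0.0] * 10 + [1.0] * 200        # step at sample 10
y = simulate_lag(x, T, dt)
x_hat = compensate(y, T, dt)
print(y[12], x_hat[12])             # lagging output vs. compensated estimate

# Price to pay: noise alternating by +/- s from sample to sample
# appears in x_hat amplified by up to this factor.
noise_gain = 1.0 + 2.0 * T / dt
print(noise_gain)
```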
2. Measurement of
Electrical Quantities
2. Measurement of Electrical Quantities
2.1 Moving Coil Mechanism

2.2 Measurement of Current
2.3 Measurement of Voltage
2.4 Measurement of Power and Energy
2.5 Measurement of AC Quantities
2.6 Measurement Methods and Amplifier Circuits
Why is the measurement of electrical quantities so important?
Electrical current possesses many advantages over alternative physical means of transporting
energy and information, such as air pressure or hydraulics. Electricity is:
• Easy to measure with high efficiency.
• Easy to transform with high efficiency into other quantities with motors (torque,
speed), electric heating (heat) or air conditioning (cold), lamps or LEDs (light).
• Easy to control well.
• Efficient to transport over long distances.
• Available almost everywhere.
• A standard means to transmit information.
• Easy to convert into digital signals and to process in a computer.
Because of these advantages electricity plays a dominant role in measuring things (sensorics)
and manipulating things (actuation). At least the last part in sensors and the first part in
actuators is often of electrical nature to exploit the good controllability properties.
First Principles
A magnetic field of flux density B generates a force
on a wire of length l that is orthogonal to the field
and carries an electrical current I. The generated
Lorentz force is calculated by
$F = B\,l\,I$
This force is proportional to the current and can be used to indicate
its value. If this force is in balance with a spring, a pointer can
display the size of the current.
More accurately, the force is generated in N windings of a coil.
Because it acts on each side of the coil, the actual torque is twice
this force times the distance r (diameter of the coil: d = 2r).
This gives the torque:
$M = 2\,r\,N\,F = d\,N\,B\,l\,I$
Acting on a torsion spring with torque $M = c\,\alpha$, this results in a displayed angle $\alpha = \frac{d\,N\,B\,l}{c}\,I$.
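The proportionality between current and pointer deflection can be sketched as follows. The sketch assumes that the torque is the product of coil diameter, number of windings, flux density, wire length, and current, balanced by a linear torsion spring; all numerical values are hypothetical.

```python
def deflection_angle(current: float, n_turns: int, flux_density: float,
                     length: float, diameter: float,
                     spring_const: float) -> float:
    """Deflection angle (rad) of a moving coil mechanism where the
    coil torque is balanced by a torsion spring."""
    torque = diameter * n_turns * flux_density * length * current
    return torque / spring_const

# Hypothetical coil: 100 windings, 0.2 T, 2 cm x 2 cm, spring 1e-4 Nm/rad.
a1 = deflection_angle(1e-3, 100, 0.2, 0.02, 0.02, 1e-4)
a2 = deflection_angle(2e-3, 100, 0.2, 0.02, 0.02, 1e-4)
print(a1, a2)   # doubling the current doubles the angle
```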

2.1 Moving Coil Mechanism
[Figure: moving coil mechanism. Source: http://de.wikipedia.org/wiki/Drehspulmesswerk]
(1) soft iron core, (2) permanent magnet, (3) pole shoes, (4) scale, (5) mirror scale, (6) restoring spring, (7) moving coil, (8)
rest position, (9) maximum deflection, (10) coil former, (11) adjustment screw, (12) pointer, (13) south pole, (14) north pole
Some Facts on the Moving Coil Mechanism Meter
• Most frequently applied analog way to measure currents.
• Range: $10^{-6}$ A to 100 A. Accuracy: 0.1% to 1.5%. Settling time: 0.5 s to 1 s.
• With resistors in parallel the range can be changed.
• By coupling it with an AC/DC converter (rectifier) it can be used to measure an AC current.
• With an auxiliary resistor and Ohm’s law, it can be used to measure voltage.
• By replacing the permanent magnet creating B with an electromagnet, the meter can be used
for the measurement of power.
Change of Range:
[Figure: three current-meter circuits (each with meter resistance RM) whose ranges differ by factors of 10 (: 10, × 10); the circuits are labelled with the internal resistances RM, 10RM, and 100RM.]
Systematic Error of Current Measurement
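[Figure: circuit without meter (source U0, resistor R, current I0) and circuit with meter (current meter with internal resistance RM in series, current IM). The internal resistance distorts the measurement!]
True current: $I_0 = \frac{U_0}{R}$    Measured current: $I_M = \frac{U_0}{R + R_M}$
This leads to a relative error in the current measurement of:
$\frac{I_M - I_0}{I_0} = -\frac{R_M}{R + R_M} \approx -\frac{R_M}{R}$ for RM << R, and → 0 for RM → 0
The current is measured too small!
Current meters should have an internal resistance as small as possible!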
Using a Current Meter for Measuring a Voltage
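[Figure: circuit without meter (current I0 through resistor R, voltage U0) and circuit with meter (voltage meter with finite internal resistance RM in parallel to R, meter current IM, voltage UM). The finite internal resistance distorts the measurement!]
True voltage: $U_0 = R\,I_0$    Measured voltage: $U_M = \frac{R\,R_M}{R + R_M}\,I_0$
This leads to a relative error in the voltage measurement of:
$\frac{U_M - U_0}{U_0} = -\frac{R}{R + R_M} \approx -\frac{R}{R_M}$ for RM >> R, and → 0 for RM → ∞
The voltage is measured too small!
Voltage meters should have an internal resistance as large as possible!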
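The systematic errors of both meter types can be compared numerically. The sketch assumes an ideal voltage source for the current measurement (meter in series) and an ideal current source for the voltage measurement (meter in parallel); the resistor values are hypothetical.

```python
def current_measurement_error(r: float, r_m: float) -> float:
    """Relative error (I_M - I_0) / I_0, meter resistance r_m in series."""
    i0 = 1.0 / r              # true current for U0 = 1 V
    i_m = 1.0 / (r + r_m)     # current with the meter inserted
    return (i_m - i0) / i0

def voltage_measurement_error(r: float, r_m: float) -> float:
    """Relative error (U_M - U_0) / U_0, meter resistance r_m in parallel."""
    u0 = r * 1.0                        # true voltage for I0 = 1 A
    u_m = r * r_m / (r + r_m) * 1.0     # voltage with the meter attached
    return (u_m - u0) / u0

print(current_measurement_error(100.0, 1.0))   # about -1 %
print(voltage_measurement_error(100.0, 1e6))   # about -0.01 %
```

Both errors are negative, i.e., the measured value is too small in both cases.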
Change of Range
[Figure: range extension of a voltage meter (meter resistance RM, meter current IM) by series resistors: meter alone (internal resistance RM), with 9RM in series (internal resistance 10RM, range × 10), and with 99RM in series (internal resistance 100RM, range × 10 again).]
Nomenclature of Voltage Meters
The internal resistance is given in relation to the upper range value and in Ω/V.
E.g. “1 kΩ/V” means:
• 100 kΩ internal resistance within the range 0…100 V.
• 10 kΩ internal resistance within the range 0…10 V, etc.
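The Ω/V nomenclature and the series resistors for a range extension can be computed directly. This is a minimal sketch; both function names are hypothetical.

```python
def internal_resistance(ohm_per_volt: float, full_scale_volts: float) -> float:
    """Internal resistance of a voltage meter specified in ohm per volt."""
    return ohm_per_volt * full_scale_volts

def series_resistor(r_m: float, factor: float) -> float:
    """Series resistor that extends the voltage range of a meter
    with resistance r_m by the given factor."""
    return (factor - 1.0) * r_m

print(internal_resistance(1000.0, 100.0))   # 1 kOhm/V in the 100 V range
print(series_resistor(1000.0, 10.0))        # 9 * RM for a factor of 10
print(series_resistor(1000.0, 100.0))       # 99 * RM for a factor of 100
```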
Considerations About the Systematic Errors in Current and Voltage Measurements:
• To reduce the deterioration in current measurements, we want to have a small internal
resistance, in the ideal case RM = 0.
• To reduce the deterioration in voltage measurements, we want to have a large internal
resistance, in the ideal case RM = ∞.
• The demand for a small internal resistance is much more difficult to fulfill than the
demand for a large internal resistance, because
− the coil of the moving coil mechanism naturally has a finite resistance, in particular if
N is high,
− also the connections/contacts where the meter is attached have a resistance,
− amplifier circuits easily can generate a resistance close to RM = ∞ (see Chapter 2.6).
These arguments show that a voltage measurement can be performed more accurately
than a current measurement.
Therefore we can apply a trick to use voltage measurements for determining currents.
Indirect Current Measurement With a Shunt
A shunt is a measurement resistor that has been manufactured with care (expensive!) to
ensure a low resistance with great accuracy, almost independent of disturbing influences like
temperature. The voltage drop over such a shunt is measured and by Ohm’s law the flowing
current is determined. Compared to a direct current measurement, which incorporates the
meter in series within the circuit, the following advantages are obtained:
• The resistance of the shunt is more accurate than the internal
resistance of the meter.
→ Smaller measurement error.
• The resistance of the shunt can be chosen to be smaller than
the internal resistance of the meter.
→ Smaller measurement error.
• The wires and connections to the meter lead to a voltage drop
and are sources of measurement errors. Because the current
through the voltage meter is tiny (<< I), these are insignificant
compared to the direct method.
[Figure: direct current measurement (current meter with RM << 1 in series with the current I) and indirect current measurement via a shunt (voltage meter with RM >> 1 in parallel to the shunt).]
First Principles
Electrical power is the product of voltage and current:
$P = U\,I$
Replacing the permanent magnet of the moving coil mechanism that creates the magnetic field B
by an electromagnet yields the electrodynamic instrument. It can measure power. If
the electromagnet is fed with voltage U this creates a current and subsequently a magnetic
field proportional to U:
$B \sim U$
With the formula for the moving coil mechanism we obtain:
$M \sim B\,I \sim U\,I = P$
The generated torque is proportional to the power P.
Measuring Electrical Energy
Measuring energy is based on the measurement of power. Energy is power integrated over
time:
$E = \int_0^t P(\tau)\,d\tau$
If the power is constant over time, energy is simply power times time:
$E = P\,t$
Otherwise the power signal can be fed to an integration circuit (see Chapter 2.6) and the energy can be computed in an analog
manner. Alternatively it can be measured (counted) by a motor meter. A motor meter
basically is an induction measuring system (see Chapter 2.5) in which the electromagnets are
replaced with an electromotor whose torque is proportional to the power. The number of
revolutions of the disk is proportional to the energy.
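In a digital evaluation the integration of the power over time can be approximated from sampled values. The sketch below uses the trapezoidal rule, which is one possible choice and not prescribed by the script; the sample data are hypothetical.

```python
def energy(times, powers) -> float:
    """Energy in J from sampled power in W over time in s
    (trapezoidal rule)."""
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (powers[k - 1] + powers[k]) * (times[k] - times[k - 1])
    return total

# Constant power: 100 W for one hour gives 360 kJ (= 0.1 kWh).
print(energy([0.0, 3600.0], [100.0, 100.0]))
# Linearly increasing power from 0 W to 100 W within 10 s.
print(energy([0.0, 5.0, 10.0], [0.0, 50.0, 100.0]))
```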
Mean, Peak, Rectified, and Root Mean Square (RMS) Values
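AC quantities are periodic signals x(t) with a period (cycle duration) of T. The following
measures of “size” have to be distinguished:
Mean: $\bar{x} = \frac{1}{T}\int_0^T x(t)\,dt$    Peak: $\hat{x} = \max_t |x(t)|$
Rectified: $\overline{|x|} = \frac{1}{T}\int_0^T |x(t)|\,dt$    RMS: $x_{\mathrm{RMS}} = \sqrt{\frac{1}{T}\int_0^T x^2(t)\,dt}$
By far the most important periodic signal type is a sine or cosine signal. A sine oscillation
with amplitude A has the following characteristic values:
Mean: $\bar{x} = 0$    Peak: $\hat{x} = A$
Rectified: $\overline{|x|} = \frac{2}{\pi} A \approx 0.637\,A$    RMS: $x_{\mathrm{RMS}} = \frac{A}{\sqrt{2}} \approx 0.707\,A$
For a symmetric rectangular oscillation the peak, rectified, and RMS values are all identical to its
amplitude A (its mean is 0). The rectified value is the mean of the absolute value. The RMS value is a
measure for the signal power or energy.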
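The four characteristic values can be approximated numerically for any periodic signal by sampling one period. The following sketch checks the values of a sine oscillation; the amplitude, period, and number of samples are arbitrary choices.

```python
import math

def characteristic_values(x, period: float, n: int = 100_000):
    """Mean, peak, rectified, and RMS value of the periodic signal x(t),
    approximated from n equidistant samples of one period."""
    dt = period / n
    samples = [x(i * dt) for i in range(n)]
    mean = sum(samples) / n
    peak = max(abs(s) for s in samples)
    rectified = sum(abs(s) for s in samples) / n
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return mean, peak, rectified, rms

A, T = 2.0, 0.02   # amplitude and period of a 50 Hz sine
mean, peak, rectified, rms = characteristic_values(
    lambda t: A * math.sin(2.0 * math.pi * t / T), T)
print(mean, peak, rectified, rms)
```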
Measuring Mean Values
The mean value of an electrical AC quantity can be directly measured with a moving coil
mechanism if the frequency of the signal is high enough. Frequently occurring frequencies around
50 or 60 Hz (mains frequency) are so much higher than the bandwidth of the moving coil
mechanism (around 1 Hz) that the instrument shows only the mean value, i.e., only the offset
value of the AC signal is displayed.
Measuring Peak Values
A diode lets only the positive half of an oscillating
signal u(t) pass. A capacitor C stores the highest occurring
value of this voltage. Since the voltage meter has a very
high internal resistance RM, the capacitor will hardly be
discharged before it is charged again in the
next period T. A circuit with an additional diode manages to halve the refresh time.
[Figure: peak-value circuit with diode, capacitor C, and voltage meter; the slow discharge of the capacitor is shown as a dashed line.]
Measuring Rectified Values
The most straightforward way to rectify a signal is to
let only the positive halves of the oscillation pass through a diode.
The negative halves are blocked. In contrast to the
definition, this approach measures on average only ½
of the rectified value. Therefore the result has to be
multiplied by 2.
More advanced is the Graetz circuit, which requires 4
diodes that let the positive halves pass and
let the negative halves pass in the other direction. Thus
the full rectified value is determined.
Because for sinusoidal oscillations the relations between
the rectified value and both the peak value and the RMS
value are known, both values can be calculated from the
rectified one:
$\hat{x} = \frac{\pi}{2}\,\overline{|x|} \approx 1.57\,\overline{|x|}$,    $x_{\mathrm{RMS}} = \frac{\pi}{2\sqrt{2}}\,\overline{|x|} \approx 1.11\,\overline{|x|}$
[Figure: half-wave rectifier with one diode and Graetz circuit with four diodes, each feeding a voltage meter.]
Apparent, Active, and Reactive Power
In coils and capacitors, where inductance and capacitance are the dominant factors, AC voltage
and current are phase shifted by +90° and –90°, respectively. Thus, if impedances are not purely
ohmic, phase shifts φ between voltage and current have to be taken into
account in any AC circuit in general. The apparent power PS in such an impedance is simply
the product of the RMS values (called “effective” in German) of voltage and current:
$P_S = U_{\mathrm{RMS}}\,I_{\mathrm{RMS}}$
But the entire apparent power cannot perform work. One part of it just oscillates around the
mean value 0. The really useful part of it is called active power (“Wirkleistung” in German).
This part can perform work and is calculated by:
$P_W = U_{\mathrm{RMS}}\,I_{\mathrm{RMS}} \cos\varphi$
The part that cannot perform any work is called reactive power (“Blindleistung” in German)
and calculated by:
$P_B = U_{\mathrm{RMS}}\,I_{\mathrm{RMS}} \sin\varphi$
If voltage and current are not phase-shifted (φ = 0), then the
reactive power = 0 and apparent power = active power.
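The three power quantities follow directly from the RMS values and the phase shift. A small sketch with hypothetical mains values (230 V, 10 A):

```python
import math

def powers(u_rms: float, i_rms: float, phi: float):
    """Apparent, active, and reactive power for the phase shift phi (rad)."""
    apparent = u_rms * i_rms
    active = apparent * math.cos(phi)
    reactive = apparent * math.sin(phi)
    return apparent, active, reactive

print(powers(230.0, 10.0, 0.0))                 # purely ohmic load
print(powers(230.0, 10.0, math.radians(60.0)))  # partly inductive load
```

The squares of active and reactive power always add up to the square of the apparent power.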
[Figure: four plots over time (0 to 10): Voltage and Current, Apparent Power, Active Power, Reactive Power.]
Power Measurement
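What happens if we feed AC quantities to a moving coil mechanism instrument in the same way as
DC quantities?
The displayed deflection is proportional to the product of voltage and current:
$u(t)\,i(t) = \hat{U}\cos(\omega t)\cdot\hat{I}\cos(\omega t - \varphi) = \frac{\hat{U}\hat{I}}{2}\left[\cos\varphi + \cos(2\omega t - \varphi)\right]$
The second cos term is averaged out to 0, because we can assume a high frequency of the AC
quantities (e.g. 50 Hz) compared to the bandwidth of the instrument (around 1 Hz). This
gives the mean value of the time-dependent power pS(t), which is identical to
the active power:
$\overline{p_S(t)} = \frac{\hat{U}\hat{I}}{2}\cos\varphi = U_{\mathrm{RMS}}\,I_{\mathrm{RMS}}\cos\varphi$
The reactive power can be measured by shifting the voltage by –90° before feeding it to the
instrument. The displayed value is proportional to the reactive power:
$\frac{\hat{U}\hat{I}}{2}\cos(\varphi - 90°) = U_{\mathrm{RMS}}\,I_{\mathrm{RMS}}\sin\varphi$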
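That the slow instrument effectively displays the active power can be checked by averaging the product of voltage and current over one period. The sketch assumes cosine signals with hypothetical peak values and a phase shift of 60°.

```python
import math

def mean_instantaneous_power(u_peak: float, i_peak: float, phi: float,
                             n: int = 10_000) -> float:
    """Average of u(t) * i(t) over one period for cosine signals
    with the phase shift phi (rad) between voltage and current."""
    total = 0.0
    for k in range(n):
        wt = 2.0 * math.pi * k / n
        total += u_peak * math.cos(wt) * i_peak * math.cos(wt - phi)
    return total / n

phi = math.pi / 3.0
p_mean = mean_instantaneous_power(325.0, 10.0, phi)
p_active = 0.5 * 325.0 * 10.0 * math.cos(phi)   # U_RMS * I_RMS * cos(phi)
print(p_mean, p_active)
```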
Measuring the Apparent Power
One way to measure apparent power is to measure the RMS values of voltage and current separately
and subsequently multiply them:
$P_S = U_{\mathrm{RMS}}\,I_{\mathrm{RMS}}$
An alternative is to let this multiplication happen in a moving coil mechanism instrument by
physical law. To do this, the instrument has to be fed with the rectified values of voltage and
current. The scale must then consider the quadratic nature of the result and the conversion
factor between rectified and RMS values.
Measuring the Phase Shift
There are instruments to measure the phase shift between voltage and current. If this is
determined, the active and reactive powers can be calculated from the apparent power.
Besides these possibilities there are some tricky measurement circuits for three-phase
systems that are beyond the scope of this chapter.
Energy Measurement
Because only active power can perform work, the energy (work)
can be calculated by integration:
If the power is constant over time this gives:
In practice, energy can be measured with an induction-based system. Such a reliable measurement system is very common, e.g. in any household for measuring the consumed electricity (“Stromzähler”, electricity meter).
An electromagnet generates a field that creates
eddy currents in the revolving disk. These cause
a torque which is proportional to the product of
voltage and current, i.e., the active power.
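The integration of active power into energy can be sketched numerically. The example assumes the standard relation W = ∫ P dt (and W = P t for constant power). The sampled power values are invented for illustration.

```python
import numpy as np

# Assumed measurement: 1000 W constant active power, sampled once per second for 1 h
t = np.linspace(0.0, 3600.0, 3601)      # time in s
p = np.full_like(t, 1000.0)             # active power in W

# Trapezoidal rule for W = integral of P dt
w_joule = float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(t)))
w_kwh = w_joule / 3.6e6                 # 1 kWh = 3.6e6 J
```

For constant power the result reduces to W = P t, i.e., 1 kWh in this example.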
Operational Amplifier
An operational amplifier (OpAmp) is an active component. This means that it needs an external energy source, which is provided by a supply voltage UV. An OpAmp is a multi-stage amplifier circuit that incorporates many transistors. It has been available as an integrated circuit on a chip since 1962. Practically all measurement circuits are realized with the help of OpAmps. It is easy to build filters, integrators, differentiators, and many other kinds of circuits. Analog computers are based on OpAmp circuits and allow differential equations to be simulated in a straightforward manner. They can be seen as the predecessor of Simulink.
A real OpAmp has the following properties:
• 2 inputs U+ and U–, whose difference Ue is amplified and generates the output Ua = V Ue.
• Input resistance Re is in the megaohm range.
• Output resistance Ra is only a few ohms.
• Gain V is in the range 10,000 to 100,000.
Ideal Operational Amplifier
In idealized form, an OpAmp can be described by the following approximations:
• Input resistance Re = ∞.
• Output resistance Ra = 0.
• Gain V = ∞.
Amplifier with Feedback

An OpAmp is either used as a switch (comparator) or, most frequently, operated with feedback, typically negative feedback (as in feedback control), i.e., the output is fed back to the “–” input. This ensures that the input voltage Ud becomes very small, since Ud = Ua/V with V = ∞. Furthermore, the current into the OpAmp is insignificant since the input resistance is huge (Re = ∞). Therefore, all OpAmps with feedback are assumed to follow these important simplifications:
• OpAmp input voltage Ud = 0.
• OpAmp input current Ie = 0.
Voltage Amplification (Non Inverting)
A voltage amplifier has the task of converting an input voltage Ue into an output voltage Ua = K Ue. Moreover, the load on the input voltage should be as small as possible, i.e., only a tiny current should be drawn from the circuit at the input. On the other hand, the output should be capable of driving significant currents.
It should be easy to adjust the gain of the voltage amplification via the components within the circuit.
Ue can be measured across the resistor R2, because almost no voltage drops between the “+” and “–” inputs of the OpAmp. Ua splits across both resistors according to the standard voltage divider rule, since almost no current flows into the OpAmp. Therefore the transfer function becomes:
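The transfer function itself is not reproduced in the text. From the voltage divider argument (Ue across R2, Ua across R1 + R2) one would expect K = Ua/Ue = 1 + R1/R2. The sketch below uses this reconstruction and, as an additional assumption, the standard feedback formula for a finite open-loop gain V to show how close a real OpAmp comes to the ideal case. Function name and component values are hypothetical.

```python
def noninverting_gain(r1, r2, v=None):
    """Closed-loop gain Ua/Ue of the non-inverting amplifier.

    r1: feedback resistor, r2: resistor across which Ue is measured.
    v:  open-loop gain of the OpAmp; None means the ideal case V = infinity.
    """
    beta = r2 / (r1 + r2)            # fraction of Ua fed back to the "-" input
    if v is None:
        return 1.0 / beta            # ideal: K = 1 + R1/R2
    return v / (1.0 + v * beta)      # finite gain (standard feedback formula)

k_ideal = noninverting_gain(9e3, 1e3)           # ideal OpAmp
k_real = noninverting_gain(9e3, 1e3, v=1e5)     # gain at the upper end of the quoted range
```

With V in the range quoted for real OpAmps, the deviation from the ideal gain stays well below one percent for moderate gains K.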
Application of a Voltage Amplifier
• Voltage Measurement: The voltage to be measured is connected to the input. At the output, any circuit can draw a high current without influencing the measurement circuit. The evaluation circuit itself does not need to possess a very high resistance.
• Constant Voltage Source: If a voltage source is connected to the input, large currents can be drawn from the OpAmp output without putting any load on the input. The voltage source is then in no danger of breaking down.
• Voltage Amplification: With an appropriate choice of R1 and R2 almost any desired gain K > 1 can be created.
Voltage Follower / Impedance Converter
An interesting special case is R1 = 0 (short circuit) and R2 = ∞ (open circuit). Such a circuit just converts the resistance/impedance. The transfer function is unity:
Voltage Amplification (Inverting)
The inverting voltage amplification circuit has a small input resistance. Furthermore, it changes the sign (inverting). Ue drops across the resistor R1, because almost no voltage drops between the “+” and “–” inputs. By the same argument, the output voltage Ua drops across R2. No current flows into the OpAmp. This means:
It is also possible to add further inputs in parallel. This can be used to build more complex addition or subtraction circuits, e.g.:
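The resulting formulas are not reproduced in the text. Since the same current flows through R1 and R2 (Ue/R1 = –Ua/R2), one would expect Ua = –(R2/R1) Ue. The sketch below uses this reconstruction and, as an assumption about the circuit, the usual summing amplifier with one input resistor per input. All names and values are illustrative.

```python
def inverting_output(ue, r1, r2):
    """Output of the ideal inverting amplifier: Ua = -(R2/R1) * Ue."""
    return -(r2 / r1) * ue

def summing_output(inputs, r2):
    """Ideal inverting adder.

    inputs: list of (voltage, input resistor) pairs connected in parallel
    to the "-" input; r2 is the feedback resistor.
    """
    return -r2 * sum(ue / r for ue, r in inputs)

ua_single = inverting_output(1.0, 1e3, 1e4)                 # gain of -10
ua_sum = summing_output([(1.0, 1e3), (2.0, 1e3)], 1e3)      # -(1 V + 2 V)
```

Choosing different input resistors weights the individual inputs, which is how weighted sums can be realized.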
Creation of Desired Dynamic Behavior
With OpAmps, any dynamic behavior can be achieved by using not only ohmic impedances but also frequency-dependent components like capacitors and coils.
With a sinusoidal current we get:
At a resistor with resistance R the voltage becomes:
At a coil with inductivity L:
At a capacitor with capacity C:
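The equations of this slide are not reproduced in the text. Assuming a current of the form i(t) = Î sin(ωt) and the standard component laws (the notation Î, u_R, u_L, u_C is an assumption), the voltages would read:

```latex
\begin{align}
  i(t)   &= \hat{I} \sin(\omega t) \\
  u_R(t) &= R \, i(t) = R \, \hat{I} \sin(\omega t) \\
  u_L(t) &= L \, \frac{\mathrm{d}i}{\mathrm{d}t} = \omega L \, \hat{I} \cos(\omega t) \\
  u_C(t) &= \frac{1}{C} \int i \, \mathrm{d}t = -\frac{1}{\omega C} \, \hat{I} \cos(\omega t)
\end{align}
```

The voltage at the coil leads the current by 90° and grows with frequency, while the voltage at the capacitor lags by 90° and decreases with frequency (integration constant taken as zero).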
Integrator
An integrator circuit is needed e.g. for the simulation of
differential equations. It is also required for computing
energy from power, speed from acceleration, distance
from speed, electrical charge from current etc.
Differentiator
From the OpAmp circuit it is obvious that this is the exact counterpart of the integrator shown above.
With R and C the proportionality (time) constant can be adjusted.
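The circuit equations are not reproduced in the text. The following sketch assumes the standard inverting OpAmp circuits, i.e., Ua = –(1/RC) ∫ Ue dt for the integrator and Ua = –RC dUe/dt for the differentiator, and imitates them numerically. Component values and signals are made up.

```python
import numpy as np

r, c = 10e3, 1e-6                     # assumed values, RC = 10 ms
t = np.linspace(0.0, 0.1, 10_001)     # 100 ms of signal
dt = t[1] - t[0]

# Integrator: a constant 1 V input produces a falling ramp at the output
ue_step = np.ones_like(t)
ua_int = -np.cumsum(ue_step) * dt / (r * c)

# Differentiator: a 10 Hz sine becomes a scaled (negative) cosine
ue_sine = np.sin(2 * np.pi * 10.0 * t)
ua_diff = -r * c * np.gradient(ue_sine, t)
```

After 100 ms the integrator output has reached roughly –10 V, which illustrates why the time constant RC has to be matched to the expected signal range.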
Low Pass Filter
This circuit simulates a first order differential
equation. It is a simple low pass filter (PT1)
to suppress high frequency disturbances like
noise.
The factor –C1/C2 is a gain factor, i.e., it determines the static gain of the transfer function. For a filter it is thus reasonable to choose C1 = C2. A subsequent inverter should be used to get rid of the “–” sign. R1C1 is the time constant and 1/(R1C1) is called the corner frequency, which determines the filter bandwidth.
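The behavior of such a first order low pass can be sketched without knowing the circuit itself. The example assumes a transfer function of the form G(s) = K / (1 + T s) with K = –C1/C2 and T = R1C1 as stated above. The component values are hypothetical.

```python
import numpy as np

def pt1_magnitude(gain, tau, omega):
    """Magnitude of G(j*omega) = gain / (1 + j*omega*tau)."""
    return abs(gain) / np.sqrt(1.0 + (omega * tau) ** 2)

def pt1_step_response(gain, tau, t):
    """Response to a unit step at t = 0."""
    return gain * (1.0 - np.exp(-t / tau))

r1, c1, c2 = 10e3, 100e-9, 100e-9     # assumed values with C1 = C2
gain = -c1 / c2                       # static gain, here -1
tau = r1 * c1                         # time constant, here 1 ms
omega_c = 1.0 / tau                   # corner frequency in rad/s

mag_at_corner = pt1_magnitude(gain, tau, omega_c)
step_at_tau = pt1_step_response(gain, tau, tau)
```

At the corner frequency the magnitude has dropped to 1/√2 of the static gain, and after one time constant the step response has reached about 63 % of its final value.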
PID Control
This OpAmp circuit realizes a PID controller, which
is the most widely used controller type. The
(P) part realizes the proportional, the (I) part
realizes the integrative, and the (D) part
realizes the derivative action. The respective
values can be adjusted by the corresponding
resistors and capacitors.
With the help of nonlinear components like diodes, e.g. an exponential characteristic can be constructed. It is even possible to construct circuits that calculate the logarithm. Based on these, multiplication and division are easy to build.
Bridge Circuit
Measurement of impedances (purely ohmic or frequency-dependent) can be reduced to a simple voltage and current measurement and a subsequent division. However, direct measurements via a bridge circuit are very powerful and widely used. For simplicity, the procedure is explained for resistances, but the extension to any kind of impedance is straightforward.
There are 2 alternative approaches:
1. The unknown resistance is compared to an adjustable resistance. The adjustable resistance is tuned until the bridge circuit is balanced.
2. The unknown resistance deviates only insignificantly from its (known) nominal value. In this case, it is possible to calculate the resistance from the diagonal bridge voltage.
Method 1 has the advantage that the diagonal bridge voltage has to be measured only for very small (positive or negative) values around 0. It is not necessary to have an instrument that can handle large amplitudes. High accuracy can be achieved with simple instruments. On the other hand, the tuning can be tedious.
Method 2 is fast and effective but works only around an operating point, i.e., if the resistance is close to its nominal value.
Balance the Bridge
This bridge circuit was invented and first applied by Wheatstone
in 1843. Under the following condition this bridge is balanced,
i.e., the diagonal voltage is zero (Ud = 0):
According to the voltage divider rule this means:
If the resistance R2 is unknown, we can tune one resistor (in principle, any one or more than
one) until the diagonal voltage is zero: Ud = 0. The bridge then is balanced. The unknown
resistance thus can be calculated from:
Advantage: Independent of quality of the voltage source U0. Only measurement of Ud around
zero is necessary.
Drawback: Tedious tuning of the comparing resistance.
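The balancing procedure can be sketched as follows. Since the circuit diagram and its formulas are not reproduced here, the resistor labeling is an assumption: U0 feeds two voltage dividers R1–R2 and R3–R4, and Ud is measured between their midpoints. Under this assumption the bridge is balanced for R1/R2 = R3/R4.

```python
def bridge_voltage(u0, r1, r2, r3, r4):
    """Diagonal voltage between the midpoints of the dividers R1-R2 and R3-R4."""
    return u0 * (r2 / (r1 + r2) - r4 / (r3 + r4))

def unknown_r2(r1, r3, r4):
    """Unknown resistance R2 of the balanced bridge (R1/R2 = R3/R4)."""
    return r1 * r4 / r3

# Hypothetical values of the known resistors after tuning to Ud = 0
r2 = unknown_r2(100.0, 200.0, 300.0)
ud = bridge_voltage(5.0, 100.0, r2, 200.0, 300.0)
```

The supply voltage U0 does not enter `unknown_r2`, which reflects the stated advantage that the result is independent of the quality of the voltage source.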
Bridge Voltage
If the unknown resistance deviates only slightly from its nominal value, the diagonal voltage can be used as a measure of this resistance:
If the resistance deviation ΔR is small compared to R, we approximately have:
However, the relation between ΔR and Ud is only approximately linear:
[Figure: nonlinear bridge characteristic; horizontal axis from -1 to 4, vertical axis from -0.5 to 0.5]
Increase of Sensitivity
Half Bridge
The sensitivity of the measurement can be doubled by utilizing 2 measurement resistors (red) instead of 1:
$U_d \approx \frac{\Delta R}{R} \, \frac{U_0}{2}$
Full Bridge
A further increase of sensitivity can be achieved by utilizing 2 positively (red, R + ΔR) and 2 negatively (green, R – ΔR) changed resistances. This is e.g. a common approach for resistance strain gauges. Typically the strain gauges are attached on opposite sides of a bar.
$U_d = \frac{\Delta R}{R} \, U_0$
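The sensitivities of the three bridge types can be compared numerically. The resistor labeling is an assumption because the circuit diagrams are not reproduced: U0 feeds the dividers R1–R2 and R3–R4, and Ud is taken between their midpoints. The placement of the changed resistors is likewise one possible choice.

```python
def bridge_voltage(u0, r1, r2, r3, r4):
    """Diagonal voltage between the midpoints of the dividers R1-R2 and R3-R4."""
    return u0 * (r2 / (r1 + r2) - r4 / (r3 + r4))

u0, r, dr = 10.0, 1000.0, 1.0      # illustrative values, dR << R

# One measurement resistor: Ud ~ (dR/R) * U0/4
quarter = bridge_voltage(u0, r, r + dr, r, r)
# Two measurement resistors changed in the same direction: Ud ~ (dR/R) * U0/2
half = bridge_voltage(u0, r, r + dr, r + dr, r)
# Two positively and two negatively changed resistors: Ud = (dR/R) * U0 exactly
full = bridge_voltage(u0, r - dr, r + dr, r + dr, r - dr)
```

In this arrangement only the full bridge is exactly linear in ΔR. The other two are linear only as long as ΔR is small compared to R.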
Oscillators
Electrical oscillators consist of a capacitor with capacity C, a coil with inductivity L, and a resistor with (relatively small) resistance R.
Such an oscillator is the equivalent of a mass-damper-spring system in mechanics. Energy is lost (strictly speaking, converted to heat) only in the resistor or the damper, respectively. Without these dissipative elements, the systems would oscillate forever at their resonance frequency ω0. This resonance frequency depends on C and L (or the spring constant c and the mass m, respectively). Therefore, it can be utilized to measure capacities and/or inductivities in an indirect manner.
Electrical oscillators follow the relationship between voltage and current given by:
With a sinusoidal current:
Resonance in Oscillators
In the case of resonance, the voltages at the capacitor and the coil cancel each other exactly. Resonance happens for:
Then, the impedance of the oscillator is purely ohmic. In the ideal case of no energy loss (R → 0 or, in the mechanical case, damper constant d → 0) the current would be of infinite amplitude and oscillate at the resonance frequency of:
or for the mechanical counterpart:
The resonance frequency ω0 can be used to determine:
• the inductivity L if C is known,
• the capacity C if L is known.
Coils, capacitors, and whole oscillators can naturally be built in OpAmp circuits.
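The indirect measurement of L or C can be sketched as follows. The resonance formulas are not reproduced in the text, so the example assumes the standard relations ω0 = 1/√(LC) for the electrical oscillator and ω0 = √(c/m) for the mechanical one. All numerical values are invented.

```python
import math

def resonance_omega(l, c):
    """Resonance frequency (rad/s) of an LC oscillator."""
    return 1.0 / math.sqrt(l * c)

def inductance_from_resonance(omega0, c):
    """Inductivity L from the measured resonance frequency and a known C."""
    return 1.0 / (omega0 ** 2 * c)

def capacitance_from_resonance(omega0, l):
    """Capacity C from the measured resonance frequency and a known L."""
    return 1.0 / (omega0 ** 2 * l)

def mechanical_resonance_omega(c_spring, m):
    """Resonance frequency (rad/s) of a spring-mass system."""
    return math.sqrt(c_spring / m)

omega0 = resonance_omega(10e-3, 100e-9)            # 10 mH and 100 nF
l_est = inductance_from_resonance(omega0, 100e-9)  # recovers 10 mH
c_est = capacitance_from_resonance(omega0, 10e-3)  # recovers 100 nF
```

In a real measurement `omega0` would come from a frequency measurement, which can be done very accurately.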
3. Measurement of
Non-Electrical Quantities
3. Measurement of Non-Electrical Quantities
3.1 Sensors and Sensor Systems

3.2 Displacement and Angles
3.3 Speed
3.4 Acceleration
3.5 Force, Torque, Pressure, and Mass
3.6 Temperature
3.7 Flow
3.8 Miscellaneous
Desired Properties for Sensors
• Conversion of a physical measurement quantity into a signal that is suitable for further
processing. Typically, this is an electrical signal because it is especially well suited for
this task.
• Sensitivity: Reaction as strong as possible to the quantity to be measured.
• Selectivity: Reaction as weak as possible to everything else.
• Stability: Behavior as constant as possible with respect to all environmental changes and influences like temperature and aging.
Sensor Systems
• Sensors integrated with intelligent components such as micro-controllers with software
(also called smart sensor).
• Combination of many identical or different sensors.
• Integration of sensors, actuators, and appropriate control equipment.
Sensor Fusion
• Information of many sensors is combined in a clever way to achieve advantages.
• Stochastic measurement errors can be reduced by averaging.
• Different principles can be combined to reduce their weaknesses and
gain strengths from synergy effects.
Examples for Sensor Fusion:

• Stereo Vision: 2 cameras build up a 3D picture or video.
• Navigation System: Modern navigation systems for planes, ships, and cars make use of the satellite-based GPS and combine it with local sensing of speed, steering angle, etc.
• Driver Assistance: Adaptive cruise control (ACC), lane detection, night vision, lane changing assistant (blind spot detection), etc. are based on a variety of different sensors like radar, laser, CCD camera, ultrasonic sensors, navigation maps, …
• Smart Dust: See the next slide.
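The reduction of stochastic measurement errors by averaging can be illustrated with simulated data. The sketch assumes independent, identically distributed Gaussian noise on all sensors, which is an idealization. The true value, the noise level, and the number of sensors are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 20.0       # quantity to be measured (assumed)
sigma = 0.5             # standard deviation of a single sensor (assumed)
n_sensors = 16

# 10,000 repeated measurements, each taken by all sensors at once
readings = true_value + sigma * rng.standard_normal((10_000, n_sensors))

single_std = readings[:, 0].std()         # scatter of one sensor alone
fused_std = readings.mean(axis=1).std()   # scatter of the averaged value
```

Under these assumptions the scatter of the average decreases with 1/√N, i.e., 16 sensors reduce it to about a quarter. Systematic errors common to all sensors are not reduced by averaging.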
Smart Dust
Intelligent sensor systems only a few cm³ in size communicate over a wireless network with a base station and possibly with each other. This is done by means of laser beams. These concepts are currently being developed at UC Berkeley by Pister; some of the ideas and problems are known from the novel “Prey” by Crichton. Maybe it will become reality!
Integration of Different Technologies:
• Ultra energy-efficient micro-electronics.
• MEMS: micro-electro-mechanical systems.
• Wireless laser-based communication (1 kB/s).
• Management of huge distributed networks.
• Possible sensors: camera, microphone, acceleration sensor, temperature, humidity.
• Extremely cheap.
Resistive Measurement Methods
Many of the techniques to measure displacement and angles can also be used for the determination of force, torque, and pressure. It is just necessary to have a spring whose displacement is proportional to these quantities.
Principle of Resistive Displacement Measurement
The ohmic resistance of an electric wire depends on its length l, its cross-section area, and its specific resistance ρ, which in turn depends on the material:
If the wire is pulled apart with a force F, this influences the relative resistance:
The factor K summarizes the influence of the length and area change and the variation of the specific resistance.
[Figure: wire pulled apart by the force F at both ends]
Resistive Measurement Method: Strain Gauge
Resistive strain gauges utilize the resistance change caused by a relative length change (strain) ε. They are commonly manufactured as an elastic foil and glued onto the body to be measured. Different material types can be distinguished:
• Metal: Typical sensitivity is around K = 2. The resistance change is mainly based on the length and area change. The specific resistance changes only insignificantly.
• Semiconductor: Typical sensitivity is very high in absolute value, either around K = –100 or around K = 100 for n- or p-doped semiconductors, respectively. The piezoresistive effect is utilized, i.e., the change of the specific resistance of the material under mechanical stress, which is significant here. This extremely high sensitivity must be paid for by an undesirably high temperature dependency.
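The effect of the sensitivity factor K can be illustrated with a short calculation. The sketch assumes the usual strain gauge relation ΔR/R = K ε, where ε is the strain. The nominal resistance of 120 Ω and the strain value are hypothetical.

```python
def resistance_change(r0, k, strain):
    """Absolute resistance change dR = R0 * K * strain."""
    return r0 * k * strain

r0 = 120.0          # assumed nominal resistance in ohm
strain = 1e-3       # assumed strain of 0.1 %

dr_metal = resistance_change(r0, 2.0, strain)      # metal gauge, K = 2
dr_semi = resistance_change(r0, 100.0, strain)     # p-doped semiconductor, K = 100
dr_semi_n = resistance_change(r0, -100.0, strain)  # n-doped semiconductor, K = -100
```

The metal gauge changes by only a fraction of an ohm, which is why its resistance change is typically evaluated in a bridge circuit.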
Resistive Method:
Strain Gauge Embodiments [3]
Resistive Measurement Method: Placement of Strain Gauges
Applying multiple strain gauges can improve the sensitivity of the measurement. As shown below, in a bridge circuit the sensitivity can be quadrupled (4x). The higher selectivity of such an approach is desirable. Most important, however, is the robustness against temperature changes, because the temperature effects (and others) cancel each other. If all resistances change by the same relative amount, the bridge voltage is not affected at all.
[Figure: the upper strain gauges are stretched, the lower strain gauges are compressed]
Resistive Measurement Method: Effect of Magnetic Field
• Hall sensors: A magnetic field orthogonal to an electrical current leads to a Lorentz force on the electrons. This causes a Hall voltage orthogonal to both the magnetic field and the current. For currents around 100…500 mA the voltage is typically around 50…400 mV at reasonable field strengths. Such Hall sensors are commonly used as limit switches.
• Field plates: The Hall effect deflects the current and forces it to take a detour instead of the direct path. As a consequence, the resistance increases (magnetoresistive effect). A quadratic characteristic results. It can be compensated by a differential bridge circuit.
[Figure: charge separation (+ / –) in a conductor within the field of a magnet (N)]
Inductive Measurement Method: Inductivity of a Coil [3]
The inductivity of a coil can be calculated from its number of windings N and its magnetic resistance Rm:
$L = \frac{N^2}{R_m}$ with $R_m = \frac{s}{\mu_0 \mu_r A}$
where s is the length of the flux lines, A is the area through which the flux lines pass, and μr is the relative magnetic permeability of the material. For the coil, three such parts add up: a) the part inside the coil that is filled with iron (μr >> 1), b) the part inside the coil that is filled with air or nothing (μr ≈ 1), and c) the part outside the coil, which usually also consists of air or nothing (μr ≈ 1):
$R_m = R_{m,a} + R_{m,b} + R_{m,c} = \frac{s_{iron}}{\mu_0 \mu_r A} + \frac{s}{\mu_0 A} + \frac{s_{outside}}{\mu_0 A_{outside}} \approx \frac{s}{\mu_0 A}$
The first term can be neglected due to the very high value of μr. The third term can be neglected due to the large area Aoutside. This leaves the second term. Therefore the inductivity is inversely proportional to the length s of the part inside the coil which is not filled:
$L \approx \frac{\mu_0 N^2 A}{s}$
Inductive Measurement Method: Plunger and Differential Plunger
A small displacement Δs of the armature from the operating point s influences the inductivity in a nonlinear way as follows:
This means that only for tiny displacements the change of the inductivity L is roughly proportional to the displacement Δs (with negative sign, i.e., Δs > 0 → ΔL < 0). To enlarge the roughly linear range, the differential approach was developed. The idea is to introduce a second coil whose inductivity changes in the opposite direction. The displacement moves the armature such that a displacement Δs leads to a decrease of the inductivity in the first coil but an increase in the second coil, or the other way round:
A clever combination of both inductivities in a circuit creates a linear measurement characteristic. The linear behavior is achieved exactly, not only by approximation or linearization, which would be valid only for small displacements Δs.
[Figures: Plunger, Differential Plunger]
Inductive Measurement Method: Differential Measurement Principle
Such a bridge circuit can be used to create a linear characteristic. The diagonal bridge voltage Ud is equal to the difference between the voltage drops across the upper resistance and the upper inductivity:
Introducing the dependency on the displacement gives:
The differential principle together with the bridge circuit results in an exact proportionality between displacement and diagonal voltage. This type of “physical linearization” is widely applied in many circumstances (also with capacitors, etc.).
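The exact linearity of the differential principle can be verified numerically. The sketch assumes the air gap approximation L ≈ μ0 N² A / s for both coils, an ideal bridge with two equal resistors in one branch and the two coils in the other, and negligible ohmic resistance of the coils. The sign of Ud depends on the wiring. Coil data are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi   # magnetic field constant in Vs/(Am)

def coil_inductance(n, area, gap):
    """Inductivity of a coil whose magnetic resistance is dominated by the air gap."""
    return n ** 2 * MU0 * area / gap

def differential_bridge_ratio(s, ds, n=1000, area=1e-4):
    """Ud/U0 of the differential plunger in a bridge with two equal resistors."""
    l1 = coil_inductance(n, area, s + ds)   # inductivity decreases for ds > 0
    l2 = coil_inductance(n, area, s - ds)   # inductivity increases for ds > 0
    return l2 / (l1 + l2) - 0.5             # inductive divider minus resistive divider

ratio_small = differential_bridge_ratio(1e-3, 2e-4)
ratio_large = differential_bridge_ratio(1e-3, 4e-4)
```

Under these assumptions the ratio equals Δs / (2 s), so doubling the displacement exactly doubles the diagonal voltage, even for large displacements.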
Inductive Sensors [3]
Capacitive Measurement Method
The capacity C of a plate capacitor depends on the distance d between the plates, the area A of the plates, and the permittivity εr determined by the material between the plates:
Change of Capacitor Plate
A change in the distance between both plates has the same nonlinear effect as the displacement in the inductivity discussed above:
Similar to the inductivity change, the capacitor can be built according to the differential principle. Again, together with a bridge circuit a linear characteristic can be created.
[Figures: Capacitor, Differential Capacitor]
Capacitive Measurement Method: Change of Capacitor Plate
A change of the plate area directly (without any tricks) affects the capacity in a linear way:
With an original plate area of A = b s (depth b) this yields a change of that area of ΔA = b Δs. Thus, the capacity changes linearly with the displacement of the plates against each other:
This approach is commonly applied for displacement and angle measurement as well as fill level measurement in tanks and other types of reservoirs. It is important to note that the liquid must be electrically conducting. The reservoir together with the conducting contents acts as one capacitor plate, the electrode as the other. The insulation layer acts as the dielectric medium. The effective plate area (to which the capacity is proportional) is proportional to the fill level.
[Figure: reservoir with conducting liquid, electrode, and insulation layer]
Capacitive Measurement Method: Change of Dielectric Medium
With the approach shown, the thickness of layers can be measured if their permittivity εr2 is known. The material layer is applied on one capacitor plate; the remaining part is typically filled just with air, i.e., εr1 = 1. The capacity of the capacitor is influenced in a nonlinear way by the thickness d2 of the material layer. According to the rule for a series connection of two capacitors, we get the following overall capacity (d1 = d – d2):
Displacement, angles, and fill levels even of non-conducting materials (as long as εr2 is significantly different from εr1) can be measured with the approach shown to the right. Here, the relationship between the displacement s2 and the overall capacity follows the rule for two capacitors in parallel, which yields a linear relationship (s1 = s – s2):
[Figure: reservoir with non-conducting contents]
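Both arrangements can be sketched with the plate capacitor formula. The formulas of the slide are not reproduced in the text, so the example assumes C = ε0 εr A / d together with the standard rules for series and parallel connection. Function names, geometry, and material values are made up.

```python
EPS0 = 8.8541878128e-12   # electric field constant in As/(Vm)

def plate_capacitance(eps_r, area, distance):
    """Capacity of an ideal plate capacitor."""
    return EPS0 * eps_r * area / distance

def layered_capacitance(eps_r2, d2, d, area, eps_r1=1.0):
    """Layer of thickness d2 on one plate: two capacitors in series (d1 = d - d2)."""
    c1 = plate_capacitance(eps_r1, area, d - d2)
    c2 = plate_capacitance(eps_r2, area, d2)
    return c1 * c2 / (c1 + c2)

def side_by_side_capacitance(eps_r2, s2, s, b, d, eps_r1=1.0):
    """Material inserted over the length s2: two capacitors in parallel (s1 = s - s2)."""
    c1 = plate_capacitance(eps_r1, b * (s - s2), d)
    c2 = plate_capacitance(eps_r2, b * s2, d)
    return c1 + c2

# Fill level example: capacity at 0 %, 50 %, and 100 % insertion
c_empty = side_by_side_capacitance(4.0, 0.0, 0.1, 0.02, 1e-3)
c_half = side_by_side_capacitance(4.0, 0.05, 0.1, 0.02, 1e-3)
c_full = side_by_side_capacitance(4.0, 0.1, 0.1, 0.02, 1e-3)
```

The parallel arrangement yields equal capacity steps for equal displacement steps, whereas the series arrangement depends on the layer thickness in a nonlinear way.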
Optical Measurement Techniques
• Incremental Displacement Measurement: The distance is divided into equidistant
intervals whose width determines the resolution of the measurement. The intervals are
counted and the measurement is always relative to a starting point.
• Coded Displacement Measurement: Coding of the position makes it possible to determine the absolute, not only the relative, position.
• Interferometric Displacement Measurement: Highly accurate measurement based on the interference of laser beams. Displacements around λ/8 can be determined (λ ≈ 600 nm).
Miscellaneous
Displacement measurement techniques applied in modern driver assistance systems:
• Infrared: Based on the emission and reflection of laser pulses and the measurement of their time delay (ns range!). Can be used to measure the distance to the car driving ahead for adaptive cruise control systems. Good visibility is required, but then good signal quality can be expected. Quite low price.
• Radar: An alternative to infrared technology. Typically realized with a radar frequency of 77 GHz. Using the Doppler principle, the relative velocity to the next car can be measured in addition to the distance. Bad visibility is no handicap. Relatively expensive. In the short range, 24 GHz radar is used for parking sensors.
• Ultrasound: Used for parking sensors (only short distances!). High importance for nondestructive material testing.
• CCD camera: Together with powerful but expensive and complex image data processing, this can support the other sensors. It is necessary for lane and blind spot detection. Very flexible but complicated. Not very robust.
3.3 Speed
Possibilities for Speed Measurement
Three main alternatives are available for speed measurement:
1. Measurement of a time interval Δt in which a certain distance Δs is covered. Subsequently the speed can be calculated by v = Δs / Δt. Speed measurement is thus done by measuring distance and time.
2. Measurement of a rotational speed ω and conversion into the translational speed with v = ω r (r = radius).
3. Direct measurement of speed by the use of:

- Doppler effect of acoustic waves.
- Doppler effect of electromagnetic waves with radar or light.
- Combination of 2 cameras and correlation analysis
(strictly speaking based on method 1, but only used for speeds).
Doppler Effect for Acoustic Waves
The Doppler effect describes the frequency shift caused by the relative velocity v between the object that emits the waves and the object that reflects the waves. The acoustic Doppler effect is typically used in the ultrasonic range.
For departing objects the frequency shift becomes (c = speed of sound):
For approaching objects:
Doppler Effect for Electromagnetic Waves (Radar, Light)
According to the theory of special relativity (c = speed of light):
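The Doppler formulas of this slide are not reproduced in the text. As a rough sketch, the following code uses the textbook one-way relations for a moving source and a resting receiver (acoustic case) and the longitudinal relativistic Doppler formula (electromagnetic case). Both are assumptions that may differ from the form used on the slide, in particular for reflected waves, where the shift roughly doubles for v << c.

```python
import math

def acoustic_doppler(f0, v, c=343.0):
    """Received frequency for a source moving with speed v (v > 0: departing)."""
    return f0 / (1.0 + v / c)

def relativistic_doppler(f0, v, c=299_792_458.0):
    """Received frequency for electromagnetic waves (v > 0: departing)."""
    beta = v / c
    return f0 * math.sqrt((1.0 - beta) / (1.0 + beta))

f_departing = acoustic_doppler(40e3, 10.0)       # 40 kHz ultrasound, 10 m/s away
f_approaching = acoustic_doppler(40e3, -10.0)    # same speed, approaching
f_radar = relativistic_doppler(77e9, 30.0)       # 77 GHz radar, 30 m/s away
```

Departing objects lower the received frequency and approaching objects raise it. The speed follows from the measured frequency shift.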
Speed Measurement with 2 Cameras and Correlation Analysis
With well-structured surfaces like bulk material on a conveyor belt or the road below a car, the surface patterns can be recorded with 2 cameras mounted at a distance from each other.
Comparing both camera signals with the help of correlation analysis yields the time interval Δt between both signals. With known camera distance d, the speed can be calculated from v = d / Δt.
[Figure: signals x(t) of camera 1 and y(t) of camera 2 over t [s] from 0 to 100, and the correlation function rxy over Δt [s] with its maximum at Δt = 10 s]
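The correlation analysis can be sketched with synthetic data. The example assumes that camera 2 records the same random surface pattern as camera 1, delayed by 10 s and disturbed by independent noise. Sampling interval, camera distance, and noise level are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 0.1                  # sampling interval in s (assumed)
camera_distance = 1.0     # distance d between the cameras in m (assumed)
true_delay = 10.0         # delay between both camera signals in s
n = 1000                  # 100 s of data
shift = int(round(true_delay / dt))

# Random surface pattern; camera 2 sees it `shift` samples later than camera 1
surface = rng.standard_normal(n + shift)
x = surface[shift:] + 0.1 * rng.standard_normal(n)   # camera 1
y = surface[:n] + 0.1 * rng.standard_normal(n)       # camera 2

# Cross-correlation for all lags; its maximum marks the delay
r_xy = np.correlate(y, x, mode="full")
lags = np.arange(-(n - 1), n)
delay = lags[np.argmax(r_xy)] * dt
speed = camera_distance / delay
```

The resolution of the estimated delay is limited by the sampling interval, so fast movements require a correspondingly high sampling rate.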
Speed Measurement: Optical Methods
A disc as shown to the right can be mounted on an axle and illuminated by a light source. The reflected light can be detected by a photodiode.
The discs can be marked incrementally or coded. Typically they have a marking of the initial point, which gives an absolute reference for the incremental disc. The speed range that can be covered by this kind of approach is typically around 0 to 12,000 min⁻¹.
Speed Measurement: Tachogenerators

A generator can be used for speed measurement.
DC motors/generators yield a DC voltage
proportional to the speed. AC motors/generators
yield an AC voltage that has to be rectified
before its amplitude is proportional to the
speed. However the direction information
(sign) is lost by this procedure.
Speed Measurement: Inductive Method
The inductivity of a coil depends on the relative magnetic permeability μr of the material through which the field lines pass. Therefore, teeth and gaps can be detected if the cog wheel is built of ferromagnetic material. In contrast to optical speed sensors, this approach is very robust against dirt and other environmental disturbances. Thus, such sensors are commonly used in the automotive industry. A double gap marks the initial point.
Speed Measurement: Magneto Resistive Method
It is similar to the inductive method. With a field plate, the dependency of the electrical resistance of a resistor on the strength of a magnetic field is utilized (see Hall effect). By using nonsymmetrical teeth-gap sequences, even the speed direction can be recovered.
[Figure: permanent magnet, field plate, crown gear]
Yaw/Angular Velocity Measurement: Coriolis Principle (e.g., for ESP)
With micromechanics it is possible to realize an equivalent of a tuning fork that is excited to permanent oscillations (in the left/right direction). Due to these oscillations, the ends of the fork move with speed v. An angular velocity ω applied from outside about an axis orthogonal to this movement creates a Coriolis force, orthogonal to both, which is proportional to the angular velocity ω.
[Figure: movement of tuning fork v, spring, acceleration sensor, seismic mass]
3.4 Acceleration
Measurement Principles
For the measurement of accelerations (translational or rotational) the following two approaches are important:
• The derivative of speed signals (attention: derivatives amplify the noise!).
• Measurement of the force F or torque M at a body with mass m or moment of inertia Θ and determination of the acceleration via:
$a = \frac{F}{m}$ or $\dot{\omega} = \frac{M}{\Theta}$
The first approach leads back to the two previous sections. Therefore only the second approach is pursued here. Here, the inertia of a mechanical resonator acts on a seismic mass. The equations of motion are those of a standard spring-damper-mass system:
Measurement of Acceleration with a Seismic Mass
With the usual notations for the damping D and the resonance frequency ω0,
$\omega_0 = \sqrt{\frac{c}{m}}, \quad D = \frac{d}{2\sqrt{m c}}$
a seismic mass follows the equation:
If ω0 is chosen to be large (via a stiff spring and a small mass), the third term dominates the left-hand side of this equation, which yields approximately:
Acceleration measurement: c >> 1, m << 1, D << 1, ω0 >> 1
Velocity measurement: c << 1, m << 1, D >> 1
Displacement measurement: c << 1, m >> 1, D << 1, ω0 << 1
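The tuning rules above can be explored with a small helper that evaluates the definitions ω0 = √(c/m) and D = d / (2 √(m c)). The parameter values for the two example sensors are hypothetical and only meant to show the tendency.

```python
import math

def natural_frequency(c, m):
    """Resonance frequency omega_0 = sqrt(c / m) in rad/s."""
    return math.sqrt(c / m)

def damping_ratio(d, m, c):
    """Damping D = d / (2 * sqrt(m * c))."""
    return d / (2.0 * math.sqrt(m * c))

# Stiff spring and small mass: high resonance frequency (acceleration sensor)
omega_acc = natural_frequency(c=1e5, m=1e-3)
d_acc = damping_ratio(d=0.2, m=1e-3, c=1e5)

# Soft spring and large mass: low resonance frequency (displacement sensor)
omega_disp = natural_frequency(c=10.0, m=1.0)
d_disp = damping_ratio(d=0.5, m=1.0, c=10.0)
```

Both example sensors are weakly damped (D << 1) but differ in their resonance frequency by a factor of several thousand.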
Frequency Response of a Seismic Mass
The frequency response depends strongly on the damping D around the resonance frequency ω0. Therefore, either the low frequency range (sensor tuned to a high resonance frequency) or the high frequency range (sensor tuned to a low resonance frequency) is utilized. Tuning to a high resonance frequency is used for measuring accelerations (previous slide); tuning to a low resonance frequency is used for measuring the displacement of oscillations.
The frequency response shown on the right is given by the relationship:
[Figure: frequency response for low and high resonance frequency, with the ranges for acceleration measurement and displacement measurement marked]
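The relationship of this slide is not reproduced in the text. As an illustration of the strong influence of the damping near resonance, the sketch below evaluates the normalized magnitude of a standard second order system, 1 / √((1 – η²)² + (2 D η)²) with η = ω/ω0. Using this particular normalization is an assumption.

```python
import numpy as np

def second_order_magnitude(eta, damping):
    """Normalized magnitude of a standard second order system, eta = omega / omega_0."""
    return 1.0 / np.sqrt((1.0 - eta ** 2) ** 2 + (2.0 * damping * eta) ** 2)

# Far below resonance the response is flat and almost independent of D
low_weak = second_order_magnitude(0.01, 0.1)
low_strong = second_order_magnitude(0.01, 1.0)

# At resonance the magnitude equals 1 / (2 D) and depends strongly on D
res_weak = second_order_magnitude(1.0, 0.1)
res_strong = second_order_magnitude(1.0, 1.0)
```

This is why the measurement range is placed well away from ω0, where the sensor behavior is practically independent of the damping.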
Force Measurement
Measurement of force is typically achieved via measurement of displacements. The following
principles are the most common ones:
• Strain Gauges: In elastic deformation the force is proportional to the change of length
which in turn results in a change of electric resistance (see Chapter 3.2).
• Piezoelectric Effect: A force or stress applied to a crystal generates an electric charge
(“piezo” means “squeeze” or “press” in Greek). This principle is well-suited to measure
highly dynamic (fast and/or oscillating) forces.
This effect is reversible, i.e., a mechanical force is generated if an electrical field is applied to the crystal. The force → field effect can be used for sensors; the field → force effect can be used to build actuators. The latter is e.g. used to generate ultrasound or for injection valve control in modern Diesel engines.
• Magnetoelastic Effect: The dependency of the magnetic properties of certain alloys with
respect to an external force can be used to measure this force. The caused displacement is
minimal.
Load Cell
A load cell consists of an elastic, cylindrical body that is compressed or elongated by an external force. Strain gauges glued onto this body measure the resulting strain.
• Range: 50 N … 5 MN.
• Uncertainty ~ 0.05 %.
• Applications, e.g. electromechanical scales (balances):
- Commercial balances.
- Horizontal containers.
- Weighbridges.
- Rail scales.
- Belt scales.
[Figure: load cell with strain gauges]
Piezoelectric Force Measurement
Certain types of crystal, e.g. SiO2, generate an electric field in response to mechanical force or stress. Depending on the polarization direction, an electric charge gathers on the stressed areas (longitudinal effect, “Längseffekt”), in the orthogonal direction (transverse effect, “Quereffekt”), or as a result of a shear force (shear effect, “Schereffekt”).
The amount of electric charge Q is proportional to the causing force F:
Q = k F with k = 2.3·10⁻¹² As/N
In order to increase this tiny amount of charge, the crystals are typically built as stacks, i.e., many crystals are placed in series.
Shortly after their generation the charges try to balance each other. Thus, the effect is only temporary. The electric charge has to be stored somehow after its generation.
[Figure: longitudinal, transverse, and shear effect]
Piezoelectric Force Measurement: Dynamic Behavior
The crystal can be described as a current source with an internal resistance Rq and a
capacitance Cq (see figure b) below). If a force appears suddenly (step input), a
charge Q0 is generated quickly. This charge then fades away exponentially with the time constant RqCq
although the force continues to act, because the capacitor discharges via the internal resistance. If
static forces are to be measured, it is therefore necessary to feed the voltage to an integrator
OpAmp circuit.
This transient behavior of the piezoelectric effect is a drawback for stationary measurements,
but it makes the sensor well suited for fast dynamic measurements because it possesses a high bandwidth.
The voltage generated as a result of the electric charge can be calculated as:
$U_q = Q / C_q$
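The step response described on this slide can be illustrated numerically. The following is only a minimal sketch: the function name `piezo_voltage` and the example values for Rq and Cq are assumptions for illustration, while Q = kF, the time constant RqCq, and Uq = Q / Cq are taken from the text.

```python
import math

K_QUARTZ = 2.3e-12  # As/N, charge constant k quoted in the text for SiO2


def piezo_voltage(force_n, t_s, r_q_ohm, c_q_farad, k=K_QUARTZ):
    """Voltage Uq at time t_s after a force step of force_n newtons."""
    q0 = k * force_n                  # initial charge Q0 = k * F
    tau = r_q_ohm * c_q_farad         # time constant Rq * Cq
    q_t = q0 * math.exp(-t_s / tau)   # charge fades although the force persists
    return q_t / c_q_farad            # Uq = Q / Cq


if __name__ == "__main__":
    # Assumed example values: Rq = 1e12 ohm, Cq = 100 pF, i.e. tau = 100 s
    for t in (0.0, 100.0, 500.0):
        print(t, piezo_voltage(1000.0, t, 1e12, 100e-12))
```

With these assumed values the voltage drops to about 37 % of its initial value after one time constant, which is why a constant force cannot be read off directly.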
Reversed Piezoelectric Effect: Principle of Actuators
The piezoelectric effect offers new possibilities in actuation because of its high bandwidth.
High injection pressures of 2000 bar spray the Diesel fuel very accurately and smoothly into
the cylinder. The fast actuator makes it possible to split the injection into several small injections in order to shape the
combustion profile. In this way, the combustion can be made more efficient while
other properties, such as its acoustics, are optimized at the same time.
Piezoelectric Injector for Diesel Engines [Siemens VDO]
Torque Measurement
For the measurement of torque, the techniques discussed for force measurement can be applied. Strain gauges can
be attached to an axle to measure the torsional strain. The
change in resistance can be evaluated in a bridge circuit.
A different possibility is to measure the torsional displacement between a flange-mounted disc and a pipe
mounted at some distance. The displacement measurement can be performed inductively or capacitively.
[Figure: axle with strain gauges (DMS), signal processing, pipe, and disc. Source: http://www.telemetrie-world.de/fachartikel/7._Drehmomentmessung_mit_Telemetrie.pdf]
One difficulty in measuring torque is the transmission
of the measurement signals from the rotating axle to the
stationary system around it. This can be solved via slip rings.
A more robust technique uses a transformer.
Modern systems are based on infrared or radio transmission.
Pressure Measurement
Pressure measurement is typically based on the measurement of force. The force acts on a
defined area, normally a membrane. Strictly speaking, pressure differences are measured, i.e., the
deviation between a pressure and some reference pressure:
• If the reference is equal to the atmospheric pressure, the measured value is called
excess pressure (overpressure) or underpressure. Example: tire of a car.
• Sometimes the reference pressure is zero (vacuum). Then, the
measured value is called absolute pressure.
The pressure difference lifts or lowers the membrane.
In this way, the pressure difference is converted into a displacement. This can either be
displayed directly (see figure) or be converted further into an electric signal with the principles
discussed in Chapter 3.2 (resistive, inductive, capacitive).
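The relation between force, membrane area, and the two reference conventions can be sketched as follows. This is only an illustration: the function names and the value of the standard atmosphere used as reference are assumptions and do not appear in the lecture.

```python
ATMOSPHERE_PA = 101_325.0  # Pa, assumed reference: standard atmosphere


def pressure_from_force(force_n, area_m2):
    """Pressure difference acting on a membrane of the given area."""
    return force_n / area_m2


def absolute_from_excess(excess_pa, atmosphere_pa=ATMOSPHERE_PA):
    """Convert excess pressure (reference: atmosphere) to absolute pressure."""
    return excess_pa + atmosphere_pa


if __name__ == "__main__":
    # A tire gauge showing 2.5 bar excess pressure
    print(absolute_from_excess(2.5e5))
```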
Mass Measurement
A mass m can be determined via its weight force F, which is proportional to the mass. The proportionality constant
is the acceleration due to gravity g:
$F = mg$
A counter force is created that balances the weight force. If the counter force is also
generated by masses, the acceleration g cancels out. If, on the other hand, the counter force is
generated by springs, magnetic or electric fields, or similar, the scale has to be calibrated
for its location because g depends on the location on earth (which is not a perfect,
homogeneous sphere!) and also on the altitude.
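The location dependency can be made concrete with a small sketch. The function names and the numerical values of g are assumptions chosen for illustration (approximate values near the equator and near the poles); only F = mg and the cancellation argument come from the text.

```python
G_CALIBRATION = 9.81  # m/s^2, assumed g at the calibration site


def weight_force(mass_kg, g=G_CALIBRATION):
    """Weight force F = m * g."""
    return mass_kg * g


def spring_scale_reading(mass_kg, g_local, g_calibration=G_CALIBRATION):
    """Mass displayed by a spring scale that was calibrated at g_calibration."""
    return weight_force(mass_kg, g_local) / g_calibration


def beam_balance_reading(mass_kg, g_local):
    """Counter force from reference masses: g acts on both sides and cancels."""
    return weight_force(mass_kg, g_local) / g_local


if __name__ == "__main__":
    for g_local in (9.780, 9.832):  # approx. equator and poles (assumed values)
        print(g_local, spring_scale_reading(1.0, g_local), beam_balance_reading(1.0, g_local))
```

The spring scale deviates by a few tenths of a percent between the two sites, whereas the balance with reference masses shows the same value everywhere.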
3.6 Temperature
A) Thermocouples
• Two wires made of different materials A and B (usually metal alloys) produce a
voltage proportional to the temperature difference between the two ends of the pair of
conductors. Thermocouples are a widely used type of temperature sensor for the
measurement and control of a temperature T.
• At the ends of the wires a circuit is connected. These connections have the temperature T0. As a
reference, these connections can be put into ice water.
• The voltage generated by the thermocouple consisting
of wires A and B is given by:
$U_{AB} = k_{AB} (T - T_0)$
The proportionality constant kAB and the reference
temperature T0 have to be known a priori!
[Figure: thermocouple wires A and B connected via copper (Cu) leads to the evaluation unit; measured quantity T − T0]
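A minimal sketch of the evaluation, assuming the linear relationship between voltage and temperature difference described on this slide, i.e. U = kAB (T − T0). The value of kAB is an assumption (roughly the sensitivity of a type K pair) and the function names are hypothetical.

```python
K_AB = 41e-6  # V/K, assumed proportionality constant kAB (roughly type K)


def thermocouple_voltage(t_c, t0_c=0.0, k_ab=K_AB):
    """Voltage for measuring temperature t_c and reference temperature t0_c."""
    return k_ab * (t_c - t0_c)


def temperature_from_voltage(u_v, t0_c=0.0, k_ab=K_AB):
    """Invert the linear relation; kAB and T0 must be known a priori."""
    return t0_c + u_v / k_ab


if __name__ == "__main__":
    u = thermocouple_voltage(100.0)            # reference junction in ice water
    print(u, temperature_from_voltage(u))
```

If the reference junction is not held at 0 °C, the actual T0 has to be passed in; otherwise the result is shifted by exactly that offset.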
B) Resistance Thermometer (PTC, Positive Temperature Coefficient) − Metal
The ohmic resistance of a metal wire depends on the temperature T approximately as follows:
$R(T) = R_0 \left[ 1 + \alpha (T - T_0) + \beta (T - T_0)^2 \right]$
The coefficients α and β are material dependent; R0 denotes the resistance at a reference
temperature T0 (also material dependent). Because β is much smaller than α, the quadratic
term can be neglected, at least for small and moderate temperature changes.
Typically the reference temperature is chosen as T0 = 0°C:
$R(\vartheta) = R_0 (1 + \alpha \vartheta)$
where ϑ denotes the measured temperature in °C.
The temperature coefficient α describes the relative
change of the resistance with the temperature:
$\alpha = \frac{1}{R_0} \frac{dR}{d\vartheta}$
α > 0 for PTC
α < 0 for NTC
From the measured value R we obtain:
$\vartheta = \frac{R - R_0}{\alpha R_0}$
For the Pt-100, R0 is standardized at 100 Ω.
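The conversion from a measured resistance to a temperature can be sketched as follows, assuming the linear characteristic R = R0 (1 + αϑ) with the quadratic term neglected. The value of α for platinum is an assumption that does not appear in the lecture; the function names are hypothetical.

```python
R0_PT100 = 100.0    # ohm at 0 degC (standardized value named on the slide)
ALPHA_PT = 3.85e-3  # 1/K, assumed temperature coefficient of platinum


def resistance(theta_c, r0=R0_PT100, alpha=ALPHA_PT):
    """Linear characteristic R = R0 * (1 + alpha * theta)."""
    return r0 * (1.0 + alpha * theta_c)


def temperature(r_ohm, r0=R0_PT100, alpha=ALPHA_PT):
    """Temperature in degC from the measured resistance."""
    return (r_ohm - r0) / (alpha * r0)


if __name__ == "__main__":
    for r in (100.0, 119.25, 138.5):
        print(r, temperature(r))
```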
Signal Processing for Resistance Thermometer
• The resistance change is evaluated with bridge circuits.
• Direct voltage measurement is possible if the current is impressed by a constant current source.
• CAUTION: The current through the measurement resistor must be small enough that the
power loss (dissipation) is negligible. Otherwise, the self-heating can distort the temperature
measurement.
• For the Pt-100 resistance thermometer two accuracy classes are standardized:
Class A: ± (0.15 + 0.002 |ϑ|) °C
Class B: ± (0.30 + 0.005 |ϑ|) °C

PTC Resistance Thermometer (Metal)    | Thermocouples
more accurate                         | less accurate
up to max. 850°C                      | even for higher temperatures
slower (large time constant)          | faster (small time constant)
no point-wise measurement             | point-wise measurement
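The two Pt-100 accuracy classes and the self-heating caution given on this slide can be turned into a short sketch. The function names are hypothetical, and the measurement current in the example is an assumed value; the class formulas ± (0.15 + 0.002 |ϑ|) °C and ± (0.30 + 0.005 |ϑ|) °C are those quoted in the text.

```python
def pt100_tolerance(theta_c, accuracy_class="A"):
    """Permitted deviation in degC for Pt-100 class A or B."""
    base, slope = {"A": (0.15, 0.002), "B": (0.30, 0.005)}[accuracy_class]
    return base + slope * abs(theta_c)


def self_heating_power(current_a, resistance_ohm):
    """Power dissipated in the measurement resistor, P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


if __name__ == "__main__":
    print(pt100_tolerance(100.0, "A"), pt100_tolerance(100.0, "B"))
    print(self_heating_power(1e-3, 100.0))  # assumed 1 mA measurement current
```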
C) Resistance Thermometer (NTC, Negative Temperature Coefficient) − Semiconductor
In semiconductors the number of free electrons grows significantly with the temperature. The
intrinsic conductivity increases and the resistance decreases. With the material constant b and the
resistance R0 at temperature T0 the following relationship holds (T and T0 are absolute temperatures in K):
$R(T) = R_0 \, e^{b \left( \frac{1}{T} - \frac{1}{T_0} \right)}$
With the constant $a = R_0 \, e^{-b/T_0}$ this yields:
$R(T) = a \, e^{b/T}$
Thus, the sensitivity becomes:
$\frac{dR}{dT} = -\frac{b}{T^2} R(T)$
The temperature coefficient is:
$\alpha = \frac{1}{R} \frac{dR}{dT} = -\frac{b}{T^2}$
α < 0: negative temperature coefficient!
Applications: car, appliances.
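A minimal sketch of the NTC characteristic, assuming the common exponential form R(T) = R0 exp(b (1/T − 1/T0)) with absolute temperatures in kelvin. The numerical values of b, R0, and T0 are assumed typical values and not taken from the lecture; the function names are hypothetical.

```python
import math

B_CONST = 3950.0  # K, assumed material constant b
R0 = 10_000.0     # ohm, assumed resistance at T0
T0 = 298.15       # K, assumed reference temperature (25 degC)


def ntc_resistance(t_k, r0=R0, t0=T0, b=B_CONST):
    """Resistance at absolute temperature t_k."""
    return r0 * math.exp(b * (1.0 / t_k - 1.0 / t0))


def ntc_temperature(r_ohm, r0=R0, t0=T0, b=B_CONST):
    """Absolute temperature from the measured resistance (inverse relation)."""
    return 1.0 / (1.0 / t0 + math.log(r_ohm / r0) / b)


def ntc_alpha(t_k, b=B_CONST):
    """Temperature coefficient alpha = -b / T^2, strongly temperature dependent."""
    return -b / t_k ** 2


if __name__ == "__main__":
    for t in (273.15, 298.15, 350.0):
        print(t, ntc_resistance(t), ntc_alpha(t))
```

The printed coefficients are negative and change noticeably over the range, which reflects the strongly nonlinear characteristic of the NTC.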
PTC Resistance Thermometer                              | NTC Resistance Thermometer
α is positive and small                                 | α is negative and has a large absolute value
α is almost constant (~ linear characteristics)         | α is strongly temperature dependent (strongly nonlinear characteristics)
resistance is small; calibration of the wires is necessary | resistance is so large that no calibration of the wires is necessary
extensive in space                                      | manufactured in tiny sizes
no point-wise measurement                               | point-wise measurement possible
slow                                                    | fast
high accuracy                                           | medium accuracy
high long-term stability                                | little long-term stability
D) PTC Resistance Thermometer – Semiconductor
PTC thermometers of this type consist of semiconducting, ferroelectric material and not of metal.
In the low temperature range they have a small resistance with a negative temperature coefficient.
Above a material-dependent critical temperature TA, the Curie temperature, the uniform
orientation dissolves. This leads to an exponential increase of the resistance in a small
temperature band (TN – TE). In this range the following approximate relationship holds, where RN denotes the resistance at TN and b a material constant:
$R(T) = R_N \, e^{b (T - T_N)}$
Sensitivity and temperature coefficient are:
$\frac{dR}{dT} = b \, R(T), \qquad \alpha = \frac{1}{R} \frac{dR}{dT} = b$
The temperature coefficient is about 5 times higher than for an NTC.
Drawbacks are the widely scattered material properties and
the poor stability. Thus, only a low precision is possible.
Source: http://de.wikipedia.org/wiki/Temperaturmessung
E) Miscellaneous
Besides the temperature measurement approaches discussed so far, there exist
many alternatives that also work according to a contact principle. The
following points have to be considered:
• First of all, the sensors measure their own temperature.
• The instrumentation engineer has to ensure that the sensor adopts
the temperature of the medium to be measured.
• The sensors affect the medium to be measured: they can introduce heat into
or draw heat from the medium. This means that the measurement influences the measured quantity!
Alternatively, there exist sensors which work according to the radiation principle.
This is a common approach especially for high temperatures. The sensors do not have any
contact with the measured medium. They evaluate its radiation, e.g.:
• Thermopile: Series connection of thermocouples that are sensitive to heat radiation.
• Pyroelectric temperature sensor (see picture): Based on the change of polarization of
certain dielectric materials, whose surface charge density is measured.
• Radiation pyrometer: Based on the measurement of the radiation power density $\sim \sigma T^4$.
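The T^4 dependency used by the radiation pyrometer can be illustrated with a short sketch. It assumes that the proportionality factor is the Stefan-Boltzmann constant σ and adds an emissivity parameter that is not mentioned in the lecture (default 1, i.e. an ideal black body); the function names are hypothetical.

```python
SIGMA = 5.670374419e-8  # W/(m^2 K^4), Stefan-Boltzmann constant


def radiated_power_density(t_k, emissivity=1.0):
    """Radiation power density of a surface at absolute temperature t_k."""
    return emissivity * SIGMA * t_k ** 4


def temperature_from_radiation(m_w_per_m2, emissivity=1.0):
    """Invert the T^4 law to obtain the absolute temperature in K."""
    return (m_w_per_m2 / (emissivity * SIGMA)) ** 0.25


if __name__ == "__main__":
    m = radiated_power_density(1500.0)
    print(m, temperature_from_radiation(m))
```

Because of the fourth power, doubling the temperature increases the radiated power density by a factor of 16, which makes the method attractive at high temperatures.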
3.7 Flow
Volume Flow and Mass Flow
The volume flow is defined as:
$\dot{V} = \frac{dV}{dt}$
The mass flow is defined as:
$\dot{m} = \frac{dm}{dt}$
Both quantities are related via the density ρ of the fluid:
$\dot{m} = \rho \, \dot{V}$
If the density is known theoretically (commonly the case for incompressible fluids) or can be
measured, then it is possible to convert volume flow into mass flow and vice versa.
Mass flow as a quantity has the advantage that it is constant in closed systems, while the volume
flow of compressible fluids depends on their density and thus also on pressure and
temperature. On the other hand, the measurement of volume flows is cheaper, simpler, and
more widely used.
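The conversion between the two flow quantities via the density can be sketched as follows. The function names and the density values in the example are assumptions for illustration.

```python
def mass_flow(volume_flow_m3_s, density_kg_m3):
    """Mass flow in kg/s from volume flow and density."""
    return density_kg_m3 * volume_flow_m3_s


def volume_flow(mass_flow_kg_s, density_kg_m3):
    """Volume flow in m^3/s from mass flow and density."""
    return mass_flow_kg_s / density_kg_m3


if __name__ == "__main__":
    # Same mass flow of a gas at two assumed densities (e.g. before and after compression)
    for rho in (1.2, 2.4):
        print(rho, volume_flow(1.2, rho))
```

The example shows the point made above: the mass flow stays the same while the volume flow halves when the density doubles.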
A) Differential Pressure Method
The flow measurement is performed indirectly by measuring pressures. A narrowing pipe
increases the flow velocity due to the decreasing cross-section. With index 1 for the wide and index 2 for the
narrowed cross-section, the flow velocity increases accordingly:
$v_2 = v_1 \frac{A_1}{A_2}$
Following Bernoulli, the pressure drop therefore becomes:
$\Delta p = p_1 - p_2 = \frac{\rho}{2} \left( v_2^2 - v_1^2 \right)$
The volume flow can be calculated from the square root of the pressure difference:
$\dot{V} = A_2 v_2 = A_2 \sqrt{\frac{2 \, \Delta p}{\rho \left( 1 - (A_2 / A_1)^2 \right)}}$
Depending on the kind of narrowing (orifice, nozzle, venturi), an additional pressure drop of
9% – 60% has to be considered due to turbulence (energy loss). This has to be taken into
account with a proportionality factor k.
Such pressure differences can be measured with a Pitot tube, which became well known through Prandtl.
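The square-root relation between pressure difference and volume flow can be sketched numerically. The sketch assumes an ideal incompressible flow through a narrowing from cross-section A1 to A2 (continuity plus Bernoulli); the symbols A1, A2, the function name, and the way the factor k enters as a simple multiplier are assumptions for illustration.

```python
import math


def volume_flow_from_dp(dp_pa, rho, a1_m2, a2_m2, k=1.0):
    """Volume flow in m^3/s from the pressure difference across a narrowing.

    k is a hypothetical correction factor for the losses of the real device.
    """
    ratio = a2_m2 / a1_m2
    # Bernoulli + continuity: dp = rho/2 * v2^2 * (1 - (A2/A1)^2)
    v2 = math.sqrt(2.0 * dp_pa / (rho * (1.0 - ratio ** 2)))
    return k * a2_m2 * v2


if __name__ == "__main__":
    for dp in (1500.0, 6000.0):
        print(dp, volume_flow_from_dp(dp, 1000.0, 2e-3, 1e-3))
```

Quadrupling the pressure difference only doubles the flow, which is the reason for the limited measurement range of this method.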
[Figures: different kinds of narrowing; Pitot tube mounted on an Airbus A380. Source: http://en.wikipedia.org/wiki/Pitot_tube]
Properties of Flow Measurement with the Differential Pressure Method
• Robust, simple, and resistant (endures harsh environmental conditions).
• No moving parts.
• Limited measurement range due to the quadratic pressure dependency.
• Most commonly used and standardized approach.
B) Volume Counter Measurement
Volume counter with metering chamber
Transports the fluid in chambers and thus counts its amount and therefore the flow.
[Figure: oval gear meter]
Volume counter with hydrometric vane
A wheel with vanes (or blades) is turned by the fluid flow. Strictly speaking,
the flow velocity is measured, but a multiplication with the cross-section yields the volume flow.
[Figures: meter with axial wings; meter with vertical wings]
Modern method: The energy for turning the wheel is not taken from the fluid flow but
is supplied from outside. The pressure drop is feedback controlled to zero.
Properties: Large measurement range, independent of viscosity, sensitive to
contamination of the fluid because of moving parts.
C) Float Measurement
A floating body with large cross-section AK is placed inside the fluid flow. It is lifted to a
height where the force caused by the flow exactly balances the force caused by its weight.
In this force balance, v is the flow velocity inside the ring-shaped opening A – AK between the tube and the
floating body. According to the continuity equation, the flow is proportional to the square
of the height h (~ diameter):
$\dot{V} \sim h^2$
In order to not only display the height but also transmit
the signal to the outside world, it is reasonable to
convert it into an electrical signal. An effective way
to realize this is to use a ferromagnetic floating
body as the coupling between two coils, which works like in a
transformer.
[Figure: flow meter with floating body: measuring tube, guidance, floating body, primary coil, secondary coil]
D) Magnetic Inductive Measurement
For all conducting fluids, the flow can be measured in a contact-free
manner based on Faraday’s law. A magnetic field with flux density B is generated orthogonally to the flow. Thus, a voltage is
induced in the moving conductor (the fluid can be interpreted as such) orthogonal to the field.
This voltage u is generated orthogonal to the magnetic field and to the flow direction
and, with the electrode distance d, amounts to:
$u = B \, d \, v \quad \Rightarrow \quad v = \frac{u}{B d}$
The flow can be calculated by multiplication of the
velocity v with the cross-section A.
[Figure: magnetic inductive flow meter with magnetic field B and electrodes]
Properties:
– Very good linearity, large measurement range.
– Independent of density, viscosity, pressure, temperature.
– Also suitable for corrosive fluids and fluids that contain solids.
– No internal constructions necessary.
– Minimum conductivity is necessary.
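A minimal sketch of the evaluation, assuming the induced voltage follows u = B d v and that the pipe is circular with the electrode distance d equal to the inner diameter. Both the geometry and the function names are assumptions for illustration.

```python
import math


def flow_velocity(u_v, b_tesla, d_m):
    """Flow velocity from the induced voltage, v = u / (B * d)."""
    return u_v / (b_tesla * d_m)


def mid_volume_flow(u_v, b_tesla, d_m):
    """Volume flow v * A for a circular pipe of inner diameter d (assumed geometry)."""
    area = math.pi * d_m ** 2 / 4.0
    return flow_velocity(u_v, b_tesla, d_m) * area


if __name__ == "__main__":
    # Assumed example: B = 0.1 T, d = 0.1 m, measured voltage 10 mV
    print(flow_velocity(0.01, 0.1, 0.1), mid_volume_flow(0.01, 0.1, 0.1))
```

The result depends linearly on the measured voltage and contains no fluid property, which matches the listed properties of the method.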
D) Remark: Reversal of the Sensor Principle as Actuator
In the movie “The Hunt for Red October” [Sean Connery, Alec Baldwin] a new and silent
drive system plays an important role. This is no science fiction! The movie refers to a so-
called magneto-hydrodynamic drive, which is constructed without any moving parts.
However, it works only in salt water because it is based on Faraday’s law and requires a
conducting medium.
The magnetic field is generated by a
superconductive generator. Orthogonal to the
field, an electric current is sent through the
water. Together, the current and the magnetic
field exert a force on the water, which is
accelerated orthogonal to both field and current.
This causes the water to shoot out of the
ship without any propeller!
The picture shows the first ship of this type
with superconductive magneto-
hydrodynamic drive [Mitsubishi, 1998].
E) Coriolis-based Measurement
A body of mass m that rotates with the angular velocity ω
and moves with a speed v orthogonal to the
axis of rotation experiences a Coriolis force
orthogonal to this axis and to the direction of motion
of magnitude
$F = 2 m \omega v$
This force bends the U-pipe by an angle.
With a sinusoidal excitation, a phase lag exists
between points A and B. This phase lag is
proportional to the mass flow.
[Figures: Coriolis flow meter in U-pipe configuration (forces F/2 acting on the two legs); Coriolis flow meter in straight configuration]
Properties:
– No constructions inside necessary.
– Robust with respect to all fluid properties.
– Suitable for liquids and gases.
F) Hot Wire Measurement
A hot wire or a hot foil is heated by an electric
current from a constant voltage or current source.
The fluid flowing around the wire or foil
decreases its temperature. This temperature drop causes a change in the electric resistance
that is measured (typically by a bridge circuit).
Here the mass flow is measured directly because the cooling is proportional to the
temperature difference between wire/foil and fluid and proportional to the number of
molecules that hit the wire or foil. Corrections with respect to density or pressure changes are
therefore unnecessary.
[Figure: hot wire meter with bridge circuit]
Properties:
– Especially well suited for low velocities.
– Sensitive with respect to dirt and burn-out.
– Because of aging, frequent calibrations are necessary.
G) Miscellaneous
Besides the approaches discussed in more detail above, many alternatives are worth at least a
brief mention:
• Vortex method: The frequency of the vortex shedding (Karman vortex street) behind a
body immersed in the flow is proportional to the velocity of the fluid.
• Transit time method: A short injection of a solution is carried out into a pipe.
The velocity of the fluid is determined from the time the solution cloud needs to travel
between 2 points and the distance between these points.
• Laser Doppler flow measurement: The frequency shift of laser light that is scattered by
particles inside the fluid yields a point-wise velocity measurement.
• Ultrasound flow measurement: a) Transit time method: A sound wave travels inside the
medium, i.e., the speeds add up (wave + medium). This speed minus the wave speed in
the resting medium yields the medium (fluid) speed. b) Doppler method: The frequency
shift of reflected sound waves is used. It depends on the speed of the medium.
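The ultrasound transit time method from the list above can be sketched as follows. The first function implements the subtraction described in the text; the second is a common two-path variant (one wave with and one against the flow) that is not described in the lecture and is added here as an assumption because it does not need the sound speed. All names and example values are hypothetical.

```python
def fluid_speed_one_way(path_m, transit_time_s, c_rest_m_s):
    """Measured wave speed (path / time) minus the wave speed in the resting medium."""
    return path_m / transit_time_s - c_rest_m_s


def fluid_speed_two_way(path_m, t_down_s, t_up_s):
    """Two opposite paths: 1/t_down - 1/t_up = 2 v / L, so the sound speed cancels."""
    return 0.5 * path_m * (1.0 / t_down_s - 1.0 / t_up_s)


if __name__ == "__main__":
    # Assumed example: path 0.1 m, sound speed 1480 m/s, fluid speed 2 m/s
    length, c, v = 0.1, 1480.0, 2.0
    t_down = length / (c + v)
    t_up = length / (c - v)
    print(fluid_speed_one_way(length, t_down, c), fluid_speed_two_way(length, t_down, t_up))
```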
3.8 Miscellaneous
Many other quantities can be measured which are not discussed here. The most prominent
certainly are:
• Density: Weighing methods determine the mass, and the volume is determined via displacement. The
density can be calculated by division. For solid materials, the buoyancy in liquids or gases can
be used. For liquids, the hydrostatic pressure difference can be used. For gases, Bunsen’s
law, which describes the relationship between volume flow and density for gas escaping
through a hole, can be used.
• Concentration: A huge number of special methods exist depending on the aggregate
state of the studied material. Frequently these methods are based on absorption, emission,
or reflection of radiation. Chromatography uses the different delays of different
components inside a mixture. Spectroscopy uses different properties of
atoms or molecules (mass, spin, …) for their separation.
Refractometry uses changes in the optical refractive index and polarimetry uses
changes in the polarization level.
• Concentration: Changes in the thermal conductivity can be utilized. Of particular
importance is the measurement of:
1. Humidity: Many approaches exist based on changes in the evaporation rate, the
conductivity, or the permittivity εr inside a capacitor.
2. pH value: Between electrodes within different liquids a voltage occurs, an effect
known from a galvanic cell (battery). A diaphragm enables the exchange of ions but
prevents the mixing of the liquids.
3. Particles: E.g., the particle-induced cloudiness or scattering of light is measured.
• Light: Photoresistors are resistors whose resistance depends on the amount of light they
receive. Photodiodes and CCDs (charge coupled devices) convert light (point-, row-, or
matrix-wise) into electrical current. The sensitivity depends strongly on the wavelength of
the light.
• Sound: A dynamic microphone works according to Faraday’s law: a
membrane is coupled with a wire that moves through a magnetic field. The voltage induced
in this wire is proportional to the velocity of the membrane movement. Capacitor
microphones, in contrast, are based on a capacitance change that depends on the membrane movement.