Doktor-Ingenieur
vorgelegt von
5.10.2005
31.01.2006
Prof. Dr.-Ing. Alfred Leipertz
Priv.-Doz. Dr.-Ing. habil. Rudolf Rabenstein
Prof. Dr.-Ing. Reinhard Lerch
Acknowledgements

My special thanks go to my supervisor Dr.-Ing. habil. Rudolf Rabenstein for many fruitful and interesting scientific and non-scientific discussions, for his qualified supervision of my thesis, and for diverting after-meeting and after-conference evenings. I would further like to thank Prof. Peter Steffen for long hours of elaborate discussions on various theoretical aspects of my work. Their advice improved my understanding and the quality of this thesis considerably.

I would like to thank Prof. Reinhard Lerch, Prof. Jörn Thielecke and Prof. Joachim Hornegger for their interest in my work, for reviewing this thesis, and for finding the time to participate in the defense.

I am very thankful to all members of the Telecommunications Laboratory at the University Erlangen-Nuremberg for their friendship and support, and for making my stay there so enjoyable. I would especially like to thank Herbert Buchner for many discussions on adaptive filters, for managing all the paperwork for our patents on wave domain adaptive filtering, and for his comments on my thesis. I would also like to thank my colleague Heinz Teutsch for entertaining office hours and business trips, and Rüdiger Nagel and Manfred Lindner for the design and construction of outstanding hardware for the wave field synthesis systems. Additionally, I would like to thank Wolfgang Herbordt for fruitful discussions on wave domain adaptive filtering, for our very enjoyable business trips, and for his comments on my thesis.

A very interesting side aspect of my work was the collaboration with the artists Michael Amman and Heijko Bauer. This collaboration resulted in the production of several compositions for WFS which have been presented with great success at the Hörkunstfestival in Erlangen. I would like to thank them for this great opportunity to gain insight into this non-technical field.

The major portion of this work has been carried out within the scope of the EC-funded IST project CARROUSO and a joint project with Airbus Deutschland GmbH funded by the German Bundesministerium für Wirtschaft und Arbeit. I would like to thank all the members of these projects for their support, open-minded discussions, and entertaining social evenings during the project meetings.

Finally, my very special thanks go to Claudia for her outstanding patience and support, and to my son Lennard for being so tame during the first weeks of his life. I would also like to thank my family, Claudia's family, and all of my friends for all the support and relaxed moments during the last years.
Danksagungen

Mein besonderer Dank gilt meinem Doktorvater Dr.-Ing. habil. Rudolf Rabenstein für die vielen fruchtbaren und interessanten wissenschaftlichen und nichtwissenschaftlichen Diskussionen, für die qualifizierte Betreuung meiner Arbeit und unterhaltende Abende im Anschluss an Versammlungen und Konferenzen. Ich möchte weiterhin Prof. Peter Steffen für die langen Stunden danken, in denen wir verschiedene theoretische Aspekte meiner Arbeit erörterten. Beider Anregungen trugen in beträchtlichem Ausmaß dazu bei, mein Verständnis für das Themengebiet und die Qualität dieser Arbeit zu steigern.

Ferner möchte ich Prof. Reinhard Lerch, Prof. Jörn Thielecke und Prof. Joachim Hornegger für ihr Interesse an meiner Arbeit danken, für die Rezension dieses Manuskripts und dafür, dass sie die Zeit fanden, an der Verteidigung teilzunehmen.

Ich bin allen Mitgliedern des Laboratoriums für Nachrichtentechnik an der Universität Erlangen-Nürnberg sehr dankbar für ihre Freundschaft, für ihre fachliche Unterstützung und dafür, dass sie meinen Aufenthalt so angenehm gestaltet haben. Ganz besonders danke ich Herbert Buchner für viele Gespräche über die adaptive Filterung, für die Organisation des Papierverkehrs bei der Anmeldung unseres Patents zur adaptiven Filterung im Wellenbereich und für seine Kommentare zu meiner Arbeit. Ich möchte zudem meinem Kollegen Heinz Teutsch für die kurzweiligen Bürostunden und Geschäftsreisen danken sowie Rüdiger Nagel und Manfred Lindner für die Entwicklung und Konstruktion hervorragender Geräte und Aufbauten für die Wellenfeldsynthese-Systeme. Überdies möchte ich Wolfgang Herbordt für die ertragreichen Diskussionen zur adaptiven Filterung im Wellenbereich danken, für unsere angenehmen Geschäftsreisen und für seine Anmerkungen zu meiner Arbeit.

Ein sehr reizvoller Teil meiner Arbeit bestand in der Zusammenarbeit mit den Künstlern Michael Amman und Heijko Bauer. Unsere Kooperation mündete in der Produktion mehrerer Kompositionen für Wellenfeldsynthese, die mit großem Erfolg auf dem Hörkunstfestival in Erlangen präsentiert wurden. Ich möchte den beiden dafür danken, dass sie mir einen Einblick in dieses nichttechnische Feld ermöglichten.

Der Hauptteil meiner Arbeit entstand im Rahmen des von der EU finanzierten IST-Projekts CARROUSO und innerhalb des gemeinsamen Projekts mit Airbus Deutschland GmbH, das vom deutschen Bundesministerium für Wirtschaft und Arbeit gefördert wurde. Ich möchte allen Beteiligten dieser Projekte für ihre Unterstützung, für offene Debatten und abwechslungsreiche Abende während der Projekt-Zusammentreffen danken.

Schließlich geht mein spezieller Dank an meine Freundin Claudia für ihre außerordentliche Geduld und Hilfe und an meinen Sohn Lennard, weil er während der ersten Wochen seines Lebens so friedlich war. Ich möchte auch meiner Familie, der Familie von Claudia und allen meinen Freunden für deren Beistand und die entspannenden Momente in den letzten Jahren danken.
Contents

1 Introduction
   1.1 The Influence of the Listening Room
   1.2 Listening Room Compensation Systems
   1.3 Overview of this Work

2 Fundamentals of Sound Propagation
   ...
   2.3.2 Point Source
   2.4.2 Green's Functions
   2.4.3 Line Source
   2.4.4 Planar Sources
   2.5 Boundary Conditions
   ...

A Notations
   A.1 Conventions
   A.2 Abbreviations and Acronyms
   A.3 Mathematical Symbols

B Coordinate Systems
   B.1 Cartesian Coordinate System
   B.2 Spherical Coordinate System
   B.3 Cylindrical Coordinate System
   B.4 Polar Coordinate System

C Mathematical Preliminaries
   C.1 Green's Second Integral Theorem
   C.2 The Stationary Phase Method
      C.2.1 Approximation of a Linear Distribution of Point Sources
   C.3 Spatio-temporal Spectrum of the Two- and Three-dimensional Free-field Green's Functions
      C.3.1 Two-dimensional Green's Function
      C.3.2 Three-dimensional Green's Function

E.3 Einleitung
   E.3.1 Der Einfluss des Wiedergaberaumes
   E.3.2 Systeme zur Kompensation des Wiedergaberaumes
   E.3.3 Überblick über diese Arbeit
E.4 Zusammenfassung und Schlussfolgerungen

Bibliography
Chapter 1
Introduction
Among the various human senses, hearing and vision are the ones most present in everyday situations. Vision is the sense that responds to light, and hearing is the sense that responds to sound. From a strictly physical viewpoint, sound could be regarded merely as compression waves traveling in air. To humans, however, sound is much more than this physical definition suggests. Sound transmits a wide variety of information, impressions and sensations and may be interpreted, e.g., as noise, speech or music by the receiver. Due to its importance to humans, it is desirable to have techniques for the recreation of acoustic events. Sound reproduction aims at recreating a (virtual) acoustic scene at a remote place or at a later time. When realized properly, a perfect auditory illusion of the original scene is created. Thus, the goal of sound reproduction is to create the perfect acoustic illusion. The human auditory sense, however, is very sensitive and is able to detect even minor differences between the original scene and the reproduced one.
The perfect reproduction of recorded or synthetic acoustic scenes has been an active research topic for decades. Several generations of engineers have invented a variety of sound reproduction systems [Tor98, Ste96, Gri00, KTH99, Pol00]. However, the perfect acoustic illusion has not been realized by any of them. Nevertheless, sound reproduction has improved considerably in terms of quality and spatial impression over the last decades. In the following, a brief overview of sound reproduction systems is given.
The history of sound reproduction dates back to the invention of the telephone in the late 19th century. Johann Philipp Reis constructed a first prototype of a telephone in 1860 [Wik05c], which was able to cover a distance of about 100 m. Alexander Graham Bell improved this first prototype and patented the telephone in 1876 [Wik05a]. The telephone can be regarded as one of the first sound transmission systems. It was mainly designed for speech communication and provided rather poor quality for the transmission of music. It also did not account for the binaural nature of human hearing. Some years later, researchers recognized that binaural reproduction may improve the spatial impression of sound reproduction considerably [Tor98]. Starting from these first approaches, a wide variety of reproduction systems have been developed up to now. This work will mainly focus on systems designed for the high-quality reproduction of sound and music.
The first reproduction systems in this context consisted of only one loudspeaker and are therefore termed monophonic systems. A single-loudspeaker system can reproduce the spatial impression of the original scene only to a limited extent. To improve the situation, stereophonic reproduction uses two loudspeakers. These should be placed equidistantly with respect to the listener. Typically, a stereophonic system is designed to have an angle of 30° between the loudspeakers as viewed from the listener's position. Most stereophonic systems aim at recreating a horizontal-only (pantophonic) sound field. Stereophony relies on stereophonic reproduction principles, like amplitude panning [HW98], derived from psychoacoustic research. As a result, the correct spatial impression of the original scene is perceived only at one particular listening position. This position is frequently termed the sweet spot. To improve the situation, stereophony has been extended to surround reproduction techniques. The main driving force behind the further development of surround techniques was the cinema industry. The first surround systems, consisting of three loudspeakers in the front and two in the rear, were presented in 1940 [Tor98]. They were also based on stereophonic principles and to some extent shared their limitations. However, it took quite a long time until surround techniques had their commercial breakthrough. Today, five-channel surround systems are state of the art in home cinema, and even more channels are typically used in movie theaters. Reproduction techniques have also been extended to three-dimensional (periphonic) reproduction. Vector base amplitude panning (VBAP) [Pul97, Pul99] is an example of a three-dimensional stereophonic reproduction system.
Advanced surround reproduction systems that overcome the sweet spot and other limitations of the stereophonic surround techniques have been developed, e.g. [Cam67, KNOBH96, KN93, WA01]. The systems to be explicitly mentioned here are wave field synthesis (WFS) [Ber88, Hul04] and (higher-order) Ambisonics [Dan00, Ger85]. Both are based on a solid physical foundation. WFS will be discussed in detail in Section 5.1. In general, the driving signals for the loudspeakers are generated from signals captured from the original sources, from geometrical information on the source locations, and from information on the room acoustics of the recording room. This information may have been recorded in an existing recording room (e.g. a concert hall) or it may have been created artificially. Today's music recordings typically use hybrid approaches combining both. The signal processing algorithms for the generation of the loudspeaker signals are derived from fundamental psychoacoustic and acoustic principles. While the quality of reproduction has increased considerably, a number of problems remain open. One of these, common to all systems discussed above, is the influence of the room where the reproduction takes place, the listening room. The following section illustrates the influence of the listening room on spatial sound reproduction.
Figure 1.1: Simplified example showing the effect of the listening room on the auralized wave field. The dashed lines from one virtual source to one exemplary listening position show the acoustic rays for the direct sound and one reflection off the side wall of the virtual recording room. The solid line from one loudspeaker to the listening position shows a possible reflection of the loudspeaker wave field off the wall of the listening room.
1.1 The Influence of the Listening Room
The influence of the listening room on a sound reproduction system and the reproduced scene will first be illustrated in an intuitive fashion. For this purpose, a simple reproduction scenario is considered in the sequel; Figure 1.1 illustrates this simplified example. It shows, as an example, the mapping of an acoustic scene in a church (e.g. a singer performing in the choir) into the listening room. For simplicity, the propagation of sound waves is illustrated by acoustic rays in Fig. 1.1. The dashed lines in Fig. 1.1 from the virtual source to one exemplary listening position show the acoustic rays for the direct sound and several reflections off the side walls of the recording room. The loudspeaker system in the listening room reproduces the direct sound and reflections in order to create the desired spatial impression of the original scene. The theory behind nearly all of the deployed methods assumes an anechoic listening room which does not exhibit any reflections. However, this idealistic assumption is rarely met by typical listening rooms. The solid line in Fig. 1.1 from one loudspeaker in the upper row to the listening position illustrates a possible reflection of the wave field produced by one loudspeaker off the wall of the listening room. These additional reflections caused by the listening room may impair the desired spatial impression, as this simplified example intuitively illustrates.
The influence of the listening room on surround reproduction systems is a topic of active research [Gri98, DZJR05, KS03, Vol98, VTB02, Vol96]. Since the acoustic properties of the listening room and the reproduction system used may vary over a wide range, no generic conclusion can be drawn regarding the perceptual influence of the listening room. However, the reflections introduced by the listening room will influence psychoacoustic properties of the reproduced scene, e.g. by degrading the directional localization or by coloring the sound. Especially dominant early reflections seem to impair the desired spatial impression. Another effect of the listening room on low-frequency reproduction may be the build-up of low-frequency resonances. These resonances negatively influence the perception of short sound events. In general, a reverberant listening room superimposes its characteristics on the desired impression of the recorded room. Listening room compensation aims at eliminating or reducing the effect of the listening room on the auralized scene. The following section introduces listening room compensation systems and briefly reviews common approaches.
1.2 Listening Room Compensation Systems
There are two basic classes of approaches to listening room compensation: (1) passive and (2) active listening room compensation. Passive listening room compensation applies acoustic insulation materials to the listening room as a countermeasure against its reflections. However, it is well known that acoustic insulation becomes impractical and costly beyond even a rather modest level of sound absorption. This holds especially for low frequencies. Additionally, the effect of this countermeasure is limited by cost and room design considerations. Thus, in practical setups, passive room compensation alone cannot provide sufficient suppression of listening room reflections. A second possibility for compensating listening room reflections is to use concepts from active control. Among the many variations, two basic approaches are mentioned here: (1) active control of the acoustic impedance at the walls of the listening room and (2) utilization of the reproduction system itself. The first class of approaches tries to actively influence the impedance of a wall in order to approximate free-field conditions [GKR85, OM53]. The second class exploits synergies with the reproduction system in order to control the wave field within the listening area. Approaches of the second class are briefly reviewed in the following.
Overviews of classical single-channel room compensation approaches and their limitations can be found in [Fie01, Fie03, HM04, Mou94]. The single-channel approaches analyze the wave field reproduced by one loudspeaker at only one position. Common to most active room compensation approaches is the basic idea of pre-equalizing the loudspeaker driving signal using a suitable compensation filter computed from an analysis of the reproduced wave field. However, there are three fundamental problems. The first is that the compensation filters have to be computed adaptively due to the time-varying nature of the room characteristics. For example, a temperature change in the listening room alters the speed of sound and hence the acoustic properties [OYS+99]. A wide variety of problems are related to the algorithms used for the adaptation of the compensation filters. The second problem is that the optimal compensation filter is the inverse of the impulse response from the loudspeaker to the measured position, but room impulse responses are, in general, non-minimum phase [NA79], which prohibits the calculation of an exact inverse filter. The third problem is that the compensation filter is optimal only for the measured position. As a result, the performance in terms of achieved compensation decreases with increasing distance from the measured position [TW02, TW03, BHK03, NOBH95].
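To make the single-channel idea concrete, the following is a minimal numerical sketch (not taken from the thesis): since an exact, stable inverse of a non-minimum-phase room response does not exist, a small regularization term is commonly added in the frequency domain. The function name, the toy impulse response, and the regularization value are illustrative assumptions.

```python
import numpy as np

def inverse_filter(h, n_fft=1024, beta=1e-3):
    """Regularized frequency-domain inverse of an impulse response h.

    Exact inversion fails for non-minimum-phase responses, so a small
    regularization term beta keeps the filter bounded and stable.
    """
    H = np.fft.rfft(h, n_fft)
    # Regularized inverse: H* / (|H|^2 + beta) instead of 1 / H
    G = np.conj(H) / (np.abs(H) ** 2 + beta)
    return np.fft.irfft(G, n_fft)

# Toy "room" impulse response: direct sound plus one reflection
h = np.zeros(64)
h[0], h[20] = 1.0, 0.6

g = inverse_filter(h)
equalized = np.convolve(h, g)[: len(h)]
# The pre-equalized response is close to a unit impulse
print(np.round(equalized[:4], 3))
```

The regularization trades exactness of the inversion against boundedness of the filter; with `beta = 0` the scheme reduces to plain (and potentially unstable) spectral division.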
In order to overcome the latter two problems, multichannel active room compensation systems have been proposed by several authors. Here, the acoustic properties of the listening room are measured from one or more loudspeakers to one or more microphone positions. In the multichannel case, the calculation of exact inverse filters is possible in most practical situations [MK88], and the undesired position dependency of active room compensation may also be reduced. However, most classical multichannel approaches utilize only a limited number of loudspeakers and analysis positions. Hence, they can neither provide sufficient control over the wave field nor analyze the reproduced wave field sufficiently. As a result, the influence of the listening room is compensated mainly at the analyzed positions, with the potential of severe artifacts away from these positions. These approaches will therefore be termed multi-point compensation approaches.
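Why exact inversion becomes possible with more than one channel can be sketched as a least-squares problem. This is an illustrative toy example, not the algorithm of [MK88]: with two channels whose (assumed coprime) impulse responses h1, h2 are short, filters g1, g2 exist such that h1*g1 + h2*g2 equals a unit impulse exactly, even though neither h1 nor h2 is invertible on its own.

```python
import numpy as np

def conv_matrix(h, n):
    """Toeplitz matrix C such that C @ g == np.convolve(h, g) for len(g) == n."""
    C = np.zeros((len(h) + n - 1, n))
    for i in range(n):
        C[i:i + len(h), i] = h
    return C

# Two toy room responses (hypothetical values, chosen to be coprime)
h1 = np.array([1.0, 0.5, 0.25])
h2 = np.array([1.0, -0.4, 0.1])

n = 8                                    # inverse filter length per channel
C = np.hstack([conv_matrix(h1, n), conv_matrix(h2, n)])
d = np.zeros(C.shape[0]); d[0] = 1.0     # target: a unit impulse

# Least-squares solve; for coprime responses the system is consistent
g, *_ = np.linalg.lstsq(C, d, rcond=None)
g1, g2 = g[:n], g[n:]
result = np.convolve(h1, g1) + np.convolve(h2, g2)
print(np.round(result[:3], 3))
```

The single-channel version of the same least-squares problem has no exact solution for non-minimum-phase responses; the extra channel provides the degrees of freedom that make the target reachable.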
Advanced reproduction systems like WFS and higher-order Ambisonics may provide an improvement in terms of control over the reproduced wave field. However, they also require a large number of reproduction channels, and a sufficient analysis of the reproduced wave field additionally requires a large number of analysis channels. Unfortunately, the adaptation of the compensation filters is subject to fundamental problems in scenarios with many playback and analysis channels [SMH95]. The following requirements for an advanced listening room compensation system can be deduced from the problems of classical room compensation systems discussed above.

Advanced room compensation methods should:
1. derive analysis signals from the entire listening area, not only from selected points,
2. be based on a spatial reproduction system which provides control over the acoustic wave field inside the entire listening area, and
1.3 Overview of this Work
This work is organized as follows: Chapter 2 introduces the fundamentals of sound propagation, which serve as preliminaries for the remaining chapters. In detail, the wave equation, its solutions, and orthogonal wave field decompositions are discussed. Chapter 3 deals with the first requirement mentioned above and introduces Fourier-based analysis of acoustic wave fields. The Fourier analysis of generic multidimensional signals is specialized to acoustic fields. This results in the plane wave decomposition of acoustic wave fields, which provides a powerful tool for the wave field analysis (WFA) of the reproduced wave field. Chapter 4 discusses the fundamentals of listening room compensation. First, a generic theory of sound reproduction systems providing sufficient control over the reproduced wave field is developed. Then the influence of the listening room and the adaptive computation of the compensation filters are illustrated. This is followed by an analysis of the fundamental problems of classic adaptation algorithms applied to a large number of reproduction and analysis channels. Spatio-temporal signal and system transformations are then proposed as a solution to these fundamental problems. This leads to an improved listening room compensation system that fulfills all of the requirements stated above. Chapter 5 presents WFS as a particular implementation of a spatial sound reproduction system, discusses the artifacts of WFS and WFA, and introduces an improved listening room compensation system for WFS. Results from simulated and measured acoustic environments are presented. Chapter 6 finally summarizes this work and draws some conclusions.
Chapter 2
Fundamentals of Sound Propagation
This chapter introduces the fundamentals of sound propagation. These fundamentals serve as prerequisites for the discussion of wave field analysis, sound reproduction, and listening room compensation in the remainder of this work. The chapter is organized as follows: First, the wave equation and its homogeneous solutions are derived, followed by a discussion of the inhomogeneous wave equation and reasonable choices for boundary conditions. Then the solution of the inhomogeneous wave equation with respect to arbitrary boundary conditions is presented. Finally, the influence of boundaries with simple geometries is discussed.
2.1 The Acoustic Wave Equation
The acoustic wave equation provides the mathematical foundation of sound propagation through fluids. This section briefly reviews the acoustic wave equation. For an in-depth discussion, please refer to [Wil99, Pie91, Bla00, MF53a, MF53b].
2.1.1
In order to derive the lossless acoustic wave equation, the following (typical) basic assumptions are made:
1. the propagation medium is homogeneous,
2. the propagation medium is quiescent,
3. the propagation medium can be characterized as an ideal gas,
4. the state changes in the gas can be modeled as adiabatic processes, and
5. the pressure and density perturbations due to wave propagation are small compared to the static pressure $p_0$ and the static density $\rho_0$.
The first condition ensures that the relevant parameters of the medium are independent of position; the second ensures that the parameters are independent of time and that there is no gross movement of the medium. As a result of assumption (3), the laws for ideal gases apply to the medium. Assumption (4) ensures that there is no energy exchange in the form of heat conduction within the medium (no propagation losses), and assumption (5) ensures that the field variables and medium characteristics can be linearized around an operating point. Assumptions (1)-(5) are reasonable for typical scenarios where acoustic wave propagation in air is considered.
The wave equation can be derived from two fundamental physical principles: (I) the conservation of mass and (II) the momentum equation. The first principle describes the mass balance in an infinitesimal volume element. Its mathematical formulation is given as follows [Bla00]
\[
\frac{\partial \rho}{\partial t} + \nabla \cdot \bigl( \rho \, \mathbf{v}(\mathbf{x},t) \bigr) = 0 \;, \tag{2.1}
\]
where $\rho$ denotes the density of the propagation medium, $\nabla$ the nabla operator, and $\mathbf{v}(\mathbf{x},t)$ the acoustic particle velocity at the position $\mathbf{x}$ and the time $t$. The time derivative of the density in Eq. (2.1) will be expressed by the acoustic pressure $p(\mathbf{x},t)$ in the following.
However, this requires consideration of the characteristics of the propagation medium. Utilizing assumptions (3)-(5), the desired relation between the time derivative of the density and the acoustic pressure can be found as [Bla00]
\[
\frac{\partial p(\mathbf{x},t)}{\partial t} = c^2 \, \frac{\partial \rho}{\partial t} \;, \tag{2.2}
\]
where $c$ denotes the speed of sound. The speed of sound is in general dependent on the characteristics of the propagation medium. For air, it mainly depends on the temperature and the humidity. A value which reflects typical conditions ($T = 20\,^\circ$C, 50% relative humidity) for wave propagation in air is $c = 343$ m/s. Using Eq. (2.2) to eliminate the temporal derivative of the density in Eq. (2.1) and applying assumption (5) yields
\[
\frac{\partial p(\mathbf{x},t)}{\partial t} = -\rho_0 \, c^2 \, \nabla \cdot \mathbf{v}(\mathbf{x},t) \;. \tag{2.3}
\]
The second principle, the conservation of momentum, leads to Euler's equation [Bla00]
\[
\rho_0 \, \frac{\partial \mathbf{v}(\mathbf{x},t)}{\partial t} + \nabla p(\mathbf{x},t) = 0 \;. \tag{2.4}
\]
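As a side note, the quoted value $c = 343$ m/s follows from the ideal-gas assumption (3) and the adiabatic assumption (4) via the standard relation $c = \sqrt{\gamma R_s T}$. The following sketch is not part of the thesis; it neglects the humidity dependence mentioned in the text and uses textbook constants for dry air.

```python
import math

def speed_of_sound(temperature_celsius):
    """Speed of sound in dry air from the ideal-gas relation c = sqrt(gamma * R_s * T)."""
    gamma = 1.4    # adiabatic index of air (assumption 4: adiabatic state changes)
    R_s = 287.05   # specific gas constant of dry air in J/(kg K)
    T = temperature_celsius + 273.15
    return math.sqrt(gamma * R_s * T)

print(round(speed_of_sound(20.0), 1))  # close to the c = 343 m/s quoted above
```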
Euler's equation together with Eq. (2.3) comprises the mathematical formulation of acoustic wave propagation in air, provided assumptions (1)-(5) are met. Equation (2.3) can be used to eliminate the acoustic particle velocity $\mathbf{v}(\mathbf{x},t)$ from Euler's equation (2.4). The result is the well-known homogeneous acoustic wave equation
\[
\nabla^2 p(\mathbf{x},t) - \frac{1}{c^2} \frac{\partial^2 p(\mathbf{x},t)}{\partial t^2} = 0 \;. \tag{2.5}
\]
Although various other forms of the acoustic wave equation exist in the literature, this is the most common formulation. The $\nabla^2$ operator is also referred to as the Laplace operator, $\Delta = \nabla^2$. The wave equation, as introduced by Eq. (2.5), covers all effects that are inherent to the wave nature of sound, e.g. diffraction. As noted before, assumptions (1)-(5) have to hold reasonably well in order for the wave equation to be a valid mathematical model of the physical process of wave propagation. If one or more of these assumptions do not hold, other forms of the wave equation have to be used; examples are given, e.g., in [Pie91, Bla00, Zio95]. The formulation of the wave equation given by Eq. (2.5) is independent of the particular coordinate system used for the position vector $\mathbf{x}$. However, the Laplace operator in the wave equation (2.5) has to be specialized to the particular coordinate system used. Appendix B introduces the coordinate systems and the respective operators employed within this thesis.
An alternative form of the wave equation can be derived by applying a Fourier transformation [GRS01, HV99] with respect to the time $t$ to the acoustic pressure $p(\mathbf{x},t)$ and the particle velocity $\mathbf{v}(\mathbf{x},t)$. The Fourier transform pair of the acoustic pressure is given as
\[
P(\mathbf{x},\omega) = \mathcal{F}_t \{ p(\mathbf{x},t) \} = \int_{-\infty}^{\infty} p(\mathbf{x},t) \, e^{-j\omega t} \, dt \;, \tag{2.6a}
\]
\[
p(\mathbf{x},t) = \mathcal{F}_t^{-1} \{ P(\mathbf{x},\omega) \} = \frac{1}{2\pi} \int_{-\infty}^{\infty} P(\mathbf{x},\omega) \, e^{j\omega t} \, d\omega \;, \tag{2.6b}
\]
where $\omega = 2\pi f$ denotes the temporal (radial) frequency and $\mathcal{F}_t\{\cdot\}$ the Fourier transformation with respect to the time $t$. Time-domain Fourier transformed signals are denoted by capital letters within this thesis. Please refer to Section 3.1 for a more detailed discussion of the Fourier transformation. Introducing $P(\mathbf{x},\omega)$ into the wave equation (2.5) and applying the differentiation theorem [GRS01] of the Fourier transformation yields the wave equation formulated in the frequency domain
\[
\nabla^2 P(\mathbf{x},\omega) + \underbrace{\left(\frac{\omega}{c}\right)^2}_{k^2} P(\mathbf{x},\omega) = 0 \;. \tag{2.7}
\]
This form is known as the Helmholtz equation. In general, the term $\omega/c$ is abbreviated by the acoustic wavenumber $k$
\[
k^2 = k(\omega)^2 = \left(\frac{\omega}{c}\right)^2 \;, \tag{2.8}
\]
where the wavenumber $k$ is assumed to be non-negative ($k \ge 0$) within this work. Equation (2.8) is also termed the dispersion relation. It provides the connection between the acoustic wavenumber $k$ and the temporal frequency $\omega$. It will be assumed in the sequel that the wavenumber can be expressed by the temporal frequency using Eq. (2.8) whenever appropriate. This holds especially for variables depending on the temporal frequency which are defined by equations including only the wavenumber $k = k(\omega)$. It will be shown in Section 2.2 that the dispersion relation allows one to derive a connection between the temporal and spatial frequencies of a monochromatic plane wave.
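The dispersion relation can be illustrated with a small numerical sketch (not from the thesis): a monochromatic plane wave whose wavenumber is chosen as $k = \omega/c$ satisfies the one-dimensional wave equation, which a finite-difference check confirms. The frequency and grid values below are arbitrary choices.

```python
import numpy as np

# A monochromatic plane wave p(x, t) = cos(w*t - k*x) with k = w / c
# (the dispersion relation, Eq. (2.8)) solves the 1-D wave equation.
c = 343.0            # speed of sound in m/s
f = 1000.0           # temporal frequency in Hz
w = 2 * np.pi * f    # radial frequency
k = w / c            # wavenumber from the dispersion relation

x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
t = 0.3e-3           # an arbitrary fixed time instant

p = np.cos(w * t - k * x)
# Second spatial derivative by central differences
d2p_dx2 = (p[2:] - 2 * p[1:-1] + p[:-2]) / dx**2
# For this wave, d2p/dt2 = -w**2 * p, so the wave-equation residual is
residual = d2p_dx2 - (-(w**2) / c**2) * p[1:-1]
print(np.max(np.abs(residual)))  # small compared to |d2p/dx2| ~ k**2
```

Choosing any $k \neq \omega/c$ leaves a residual proportional to $(k^2 - \omega^2/c^2)\,p$, which is why the dispersion relation ties the spatial to the temporal frequency.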
2.1.2

Real-world acoustic wave fields depend on three spatial coordinates and one temporal coordinate. However, acoustic wave fields depending on only two spatial coordinates instead of three are suitable for many problems within this work. The transition from the three-dimensional description of a wave field $P(\mathbf{x},\omega)$ to a two-dimensional one is performed by assuming that the wave field exhibits no dependency on one of the three spatial coordinates. A wave field that fulfills this condition will be termed a two-dimensional wave field. Please note that this term covers three-dimensional wave fields that are independent of one spatial variable as well as truly two-dimensional wave fields [MF53a, Wil99]. So far, the choice of the coordinate on which a two-dimensional wave field is assumed not to depend was left open. It will be assumed in the remainder of this work that a two-dimensional wave field does not depend on the $z$-coordinate. This leads to the following condition in Cartesian coordinates
\[
P_C(x, y, z, \omega) = P_C(x, y, \omega) \;. \tag{2.9}
\]

2.1.3
The homogeneous acoustic wave equation (2.5) describes acoustic wave propagation for the free-field case (no boundaries present) in a source-free volume. In order to calculate the wave field for generic scenarios, further information about the boundaries and the sources is required. Thus, the exact solution of the wave equation depends on
1. the initial conditions,
2. the acoustic sources present, and
3. the boundary conditions.
The discussion of initial conditions will be neglected within this thesis; it will be assumed that the acoustic pressure and velocity can be set to zero as initial conditions. The derivation of the wave equation (2.5) was not bound to a specific coordinate system for the description of the wave fields. Depending on the geometry of the problem, it is convenient to use representations of acoustic wave fields in different coordinate systems.
2.2 Wave Fields in Cartesian Coordinates
This section will consider the free-field solutions of the homogeneous wave equation (2.5)
formulated in Cartesian coordinates. Section B.1 introduces the Cartesian coordinate
system. All vectors and functions evaluated in this coordinate system will be denoted by
the index C attached to the respective variables.
A well known solution to the wave equation formulated in Cartesian coordinates is the
solution of d'Alembert [MF53a]

p_C(x_C, t) = f(ct - n_{C,0}^T x_C) ,    (2.10)

where f(·) denotes an arbitrary function and n_{C,0} a normal vector with ||n_{C,0}|| = 1. The
proof that the above equation provides a solution of the wave equation is straightforward:
introduce Eq. (2.10) into Eq. (2.5) and use the definition of the Laplace
operator in Cartesian coordinates. In order to interpret Eq. (2.10), the substitution
ζ(x_C, t) = ct - n_{C,0}^T x_C is used in the following. If the position vector x_C is constant,
e.g. ζ(0, t) = ct, then ζ(t) is proportional to the time t by the speed of sound. If the
time is constant, e.g. ζ(x_C, 0) = -n_{C,0}^T x_C, then ζ(x_C) describes a plane. The vector n_{C,0} is
the normal vector of this plane. If ζ is constant, e.g. ζ(x_C, t) = 0, then ζ(x_C, t)
describes for every time instant t a plane moving with the speed c into the direction given
by the vector n_{C,0}. Thus, Eq. (2.10) describes propagating wavefronts with the shape of
f(·) that travel with the speed of sound into the direction given by n_{C,0}. This type of
wave is termed a plane wave. Figure 2.1 illustrates the result for an arbitrarily shaped
plane wave traveling in two-dimensional space.
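The propagation behavior described by Eq. (2.10) can also be illustrated numerically: the field sampled at a later time instant equals the earlier field shifted by cΔt along n_{C,0}. The following is a minimal sketch only; the Gaussian pulse shape f and all parameter values are arbitrary choices for illustration and are not taken from this thesis.

```python
import numpy as np

c = 343.0                                  # speed of sound in m/s
n0 = np.array([1.0, 0.0, 0.0])             # unit propagation direction n_C,0
f = lambda zeta: np.exp(-zeta**2 / 0.01)   # arbitrary pulse shape f(.)

def p(x, t):
    """d'Alembert solution p_C(x_C, t) = f(c t - n0^T x)."""
    return f(c * t - np.asarray(x) @ n0)

# sample the field along the propagation direction at two time instants
x_axis = np.linspace(-2.0, 2.0, 2001)
p0 = np.array([p([xi, 0.0, 0.0], 0.0) for xi in x_axis])
p1 = np.array([p([xi, 0.0, 0.0], 1e-3) for xi in x_axis])

# after dt = 1 ms the wavefront has moved by c*dt = 0.343 m along n0
shift = c * 1e-3
assert np.allclose(p1, np.interp(x_axis - shift, x_axis, p0), atol=1e-3)
```

The final assertion checks the shift property that characterizes a traveling wave: the field at t = 1 ms matches the interpolated field at t = 0 displaced by cΔt.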
Performing a temporal Fourier transformation of Eq. (2.10) according to Eq. (2.6a) yields

P_C(x_C, ω) = (1/c) F(ω/c) e^{-j (ω/c) n_{C,0}^T x_C} .    (2.11)

Using the abbreviation P(ω) for the frequency dependent part (1/c) F(ω/c), and
k_{C,0} = (ω/c) n_{C,0}, allows to derive a more compact form of the frequency-domain d'Alembert
solution as

P_C(x_C, ω) = P(ω) e^{-j k_{C,0}^T x_C} .    (2.12)
Figure 2.1: Illustration of d'Alembert's solution of the wave equation in Cartesian
coordinates. An arbitrarily shaped plane wave f(·) traveling in two-dimensional space is
shown for a fixed time t. The gray plane denotes a plane of constant value ζ = 0 that
moves with the speed of sound c into the direction given by n_{C,0}.
Introducing Eq. (2.12) into the homogeneous Helmholtz equation (2.7) yields

k^2 = k_{x,0}^2 + k_{y,0}^2 + k_{z,0}^2 = ||k_{C,0}||^2 .    (2.13)
Equation (2.13) states that the acoustic wavenumber k is equal to the length of the vector
k_{C,0}. Thus, k_{C,0} will be denoted as the wave vector of a plane wave in the following. The
acoustic dispersion relation (2.8) relates the acoustic wavenumber k to the frequency ω.
Hence, Eq. (2.13) relates the length of the wave vector of a plane wave to its (temporal)
frequency. Thus, each wave vector k_{C,0} belongs to a specific (temporal) frequency ω_0. A
signal in the time domain of the form e^{jω_0 t} is called a monofrequent or monochromatic
signal. The constant ω_0 = 2πf_0 denotes the angular frequency, where f_0 is the number
of cycles per second the signal exhibits. Due to these considerations, the term e^{-j k_{C,0}^T x_C}
will be denoted as a monochromatic plane wave. The wave vector k_{C,0} of a plane wave can
be interpreted as a vector consisting of the spatial frequencies k_{C,0} = [k_{x,0} k_{y,0} k_{z,0}]^T,
where each spatial frequency denotes 2π times the number of cycles per meter of the
monochromatic plane wave in the x-, y- and z-direction.
The elements of the wave vector are not independent of each other for a fixed frequency
ω_0 (monochromatic plane wave). Three parameters of the set (k_{x,0}, k_{y,0}, k_{z,0}, ω_0) instead
of all four are sufficient to characterize a monochromatic plane wave. Equation (2.13)
ensures that plane waves, as described by Eq. (2.12), are a solution to the wave equation
formulated in Cartesian coordinates.

Figure 2.2: Pressure field of a monochromatic plane wave as given by Eq. (2.12). The
illustrated plane wave has a frequency of f_0 = 1000 Hz. The gray level denotes the
amplitude. The plane wave is traveling in the xy-plane for ease of illustration.

Figure 2.2 shows the pressure field of a monochromatic plane wave traveling in the xy-plane
of a Cartesian coordinate system. The plane wave has a frequency of f_0 = 1000 Hz.
The wave vector, which points into the direction of wave propagation, is also shown. As
stated by the inverse Fourier transformation (2.6b), arbitrarily shaped plane waves can
be generated by a superposition of monochromatic plane waves with different frequency
dependent weights P(ω). Of special interest in the remainder of this work are plane waves
with the shape of a Dirac pulse, f(ct - n_C^T x_C) = δ(ct - n_C^T x_C). As the temporal Fourier
transformation of a Dirac pulse equals 1, the weights P(ω) for this special case have to
be chosen as P(ω) = 1/c.
Please note that the condition (2.13) also includes the special case of evanescent waves.
For a detailed treatment of evanescent waves please refer to [Wil99].
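The dispersion relation (2.13) and the periodicity of a monochromatic plane wave can be checked with a few lines of code. This is an illustrative sketch; the direction vector and frequency are arbitrary example values.

```python
import numpy as np

c = 343.0                        # speed of sound in m/s
f0 = 1000.0                      # temporal frequency in Hz
w0 = 2 * np.pi * f0              # angular frequency omega_0
n0 = np.array([0.6, 0.8, 0.0])   # unit propagation direction (arbitrary)
k0 = (w0 / c) * n0               # wave vector k_C,0 = (omega/c) n_C,0

# dispersion relation (2.13): ||k0||^2 = k^2 = (omega/c)^2
assert np.isclose(k0 @ k0, (w0 / c) ** 2)

def plane_wave(x):
    """monochromatic plane wave exp(-j k0^T x) with unit amplitude P(w) = 1"""
    return np.exp(-1j * (k0 @ x))

# the field is periodic along n0 with the wavelength lambda = c / f0
lam = c / f0
x = np.array([0.1, -0.3, 0.2])
assert np.isclose(plane_wave(x + lam * n0), plane_wave(x))
```

The second assertion makes the meaning of the spatial frequencies concrete: moving one wavelength along the propagation direction advances the phase by exactly 2π.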
2.2.1 Plane Wave Expansion
It was shown in the previous section that the homogeneous wave equation is fulfilled by
d'Alembert's solution (2.10) for all possible choices of n_{C,0}. Thus, arbitrary solutions of
the wave equation can be expressed as a superposition of plane waves traveling into all
possible directions in three-dimensional space. Using the frequency domain formulation
of a plane wave (2.12), an arbitrary wave field can be expressed as a superposition of all
possible wave vectors in Eq. (2.12). However, as derived in the previous section, not all
elements of the wave vector can be chosen freely for one particular frequency ω. Hence,
an arbitrary wave field can be expressed as follows [Wil99]
P_C(x_C, ω) = 1/(2π)^2 ∫_{-∞}^{∞} ∫_{-∞}^{∞} P_C(k_x, k_y, ω) e^{-j(k_x x + k_y y + k_z z)} dk_x dk_y ,    (2.14)

where P_C(k_x, k_y, ω) denotes the amplitude and phase of the plane waves and

k_z^2 = k_z^2(k_x, k_y, ω) = k^2 - k_x^2 - k_y^2 .    (2.15)
Equation (2.14) will be termed the plane wave expansion and the coefficients P_C(k_x, k_y, ω)
the plane wave expansion coefficients in the following. The plane wave expansion coefficients
are typically derived by introducing the expansion (2.14) into the wave equation
considering the particular problem. A more generic approach for their derivation, which
is based on the concept of a transformation, will be given in Section 3.3.
As the sign of the wavenumber k_z is ambiguous due to the squares in Eq. (2.15), the
traveling direction of the plane waves in the z-direction is not included in the plane wave
expansion coefficients P_C(k_x, k_y, ω). There are two possible approaches to overcome this
problem. The first approach is to assume that k_z is positive. As a result, Eq. (2.14) has to
be limited to the upper half space z > 0 to be unambiguous. The second approach is to
include the sign in the expansion coefficients. Within this thesis this will be done by using
two sets of expansion coefficients P_C^{(1)}(k_x, k_y, ω) and P_C^{(2)}(k_x, k_y, ω), where the upper index
(1) denotes positive and (2) negative k_z values. The plane wave contributions can be
regarded as incoming or outgoing plane waves with respect to the plane z = 0. If k_z is
positive, then the waves enter the half space z > 0 through the plane z = 0 and they
will be termed incoming waves. Otherwise, if k_z is negative, they will be termed
outgoing waves. Accordingly, the expansion coefficients are denoted as incoming (1)
or outgoing (2) plane wave expansion coefficients.
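Computing k_z from Eq. (2.15) requires a branch choice. The following sketch (parameter values arbitrary) picks the incoming branch such that e^{-j k_z z} either propagates or, for k_x^2 + k_y^2 > k^2, decays exponentially for z > 0 (the evanescent case mentioned above):

```python
import numpy as np

def kz_squared(kx, ky, w, c=343.0):
    """Eq. (2.15): kz^2 = k^2 - kx^2 - ky^2 with k = w/c."""
    return (w / c) ** 2 - kx ** 2 - ky ** 2

def kz_incoming(kx, ky, w, c=343.0):
    """kz of the incoming (positive kz) plane wave contribution.
    For kx^2 + ky^2 > k^2 the root becomes imaginary: the contribution is
    evanescent; the branch is chosen so exp(-j kz z) decays for z > 0."""
    kz2 = kz_squared(kx, ky, w, c)
    if kz2 >= 0:
        return np.sqrt(kz2)          # propagating contribution
    return -1j * np.sqrt(-kz2)       # evanescent contribution

w = 2 * np.pi * 1000.0
k = w / 343.0
# propagating: (kx, ky) inside the radiation circle kx^2 + ky^2 <= k^2
assert np.isclose(kz_incoming(0.0, 0.0, w), k)
# evanescent: kx outside the radiation circle
kz = kz_incoming(2 * k, 0.0, w)
assert kz.imag < 0 and np.isclose(abs(kz), np.sqrt(3) * k)
```

With kz = -j|κ| the term e^{-j k_z z} becomes e^{-|κ| z}, i.e. a field decaying away from the plane z = 0, which matches the sign convention of the expansion (2.14).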
2.3 Wave Fields in Cylindrical Coordinates
The following section considers the free-field solutions of the homogeneous wave
equation (2.5) formulated in cylindrical coordinates. Section B.3 introduces the cylindrical
coordinate system as used within this thesis. All vectors and functions evaluated in this
coordinate system will be denoted by the index Y. This section is mainly based on the
work of [Wil99].
The wave equation, as given by Eq. (2.5), can be specialized straightforwardly to the case
of cylindrical coordinates

∇^2 p_Y(x_Y, t) - (1/c^2) ∂^2 p_Y(x_Y, t)/∂t^2 = 0 .    (2.16)
A solution of Eq. (2.16) can be found by separation of variables using the ansatz

p_Y(x_Y, t) = p_r(r) p_φ(φ) p_z(z) p_t(t) .    (2.17)
Introducing this solution into the wave equation (2.16) results in three ordinary differential
equations of second order for p_φ(φ), p_z(z) and p_t(t). The solutions to these are given as
follows [Wil99]

p_φ(φ) = p_{φ,1}(ν) e^{-jνφ} + p_{φ,2}(ν) e^{jνφ} ,    (2.18a)
p_z(z) = p_{z,1}(k_z) e^{-jk_z z} + p_{z,2}(k_z) e^{jk_z z} ,    (2.18b)
p_t(t) = P(ω) e^{jωt} ,    (2.18c)

where p_{φ,i}(ν), p_{z,i}(k_z) and P(ω) denote arbitrary constants. As the angular part p_φ(φ) is
periodic in φ with a period of 2π, the constant ν has to be an integer (ν ∈ Z). The
solution to the radial part p_r(r) is given by Bessel's differential equation
d^2 p_r(r)/dr^2 + (1/r) dp_r(r)/dr + (k_r^2 - ν^2/r^2) p_r(r) = 0 ,    (2.19)
where k_r^2 = k^2 - k_z^2 (see Appendix B.3). The solutions to Bessel's differential equation
are given by the Bessel functions of first, second and third kind [AS72]. A traveling
wave solution is given in terms of Hankel functions (Bessel functions of third kind) as
follows [Wil99]

p_r(r) = p_{r,1}(k_r) H_ν^{(1)}(k_r r) + p_{r,2}(k_r) H_ν^{(2)}(k_r r) ,    (2.20)
where H_ν^{(1),(2)}(·) denotes the νth order Hankel function of first/second kind. The general
solution to the wave equation in cylindrical coordinates can be derived by combining the
solutions given by Eq. (2.18) and Eq. (2.20) as required by Eq. (2.17). Discarding
the constants, the solution can be written to be proportional to

p_Y(x_Y, t) ∝ e^{-jνφ} H_ν^{(1),(2)}(k_r r) e^{-jk_z z} e^{jωt} .    (2.21)
(2.21)
This result consists of a product with three exponential parts and one Hankel function.
Each exponential part depends on one parameter only, where can be interpreted as the
angular frequency, kz as the wavenumber (or spacial frequency) in the zdirection and as
the temporal frequency. The sign of the angular frequency denotes the rotation direction,
while the sign of kz denotes the propagation direction of waves in the zdirection. In order
16
to find a similar interpretation for the radial part, the properties of Hankel functions have
to be investigated. In the far-field (k_r r ≫ 1) the Hankel functions can be approximated
as follows [AS72]

H_ν^{(1)}(k_r r) ≈ √(2/(π k_r r)) e^{j(k_r r - νπ/2 - π/4)} ,    (2.22a)
H_ν^{(2)}(k_r r) ≈ √(2/(π k_r r)) e^{-j(k_r r - νπ/2 - π/4)} .    (2.22b)
These approximations of the Hankel functions can be interpreted as incoming/outgoing
radial waves. Thus, the Hankel function of first kind H_ν^{(1)}(k_r r) can be interpreted as
an incoming radial wave contribution and the Hankel function of second kind H_ν^{(2)}(k_r r) as
an outgoing radial wave contribution. The parameter k_r can then be understood as a radial
wavenumber. Figure 2.3 illustrates the incoming/outgoing cylindrical waves for ν = 0
and k_z = 0 for different time instants. The traveling direction of the radial waves can be
seen clearly when following the wave fronts over time.
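The quality of the far-field approximation (2.22) can be checked against the exact Hankel function, here using SciPy's `hankel1` (the argument value is an arbitrary large-argument example):

```python
import numpy as np
from scipy.special import hankel1

def hankel1_farfield(nu, x):
    """far-field approximation (2.22a) of the Hankel function of first kind"""
    return np.sqrt(2 / (np.pi * x)) * np.exp(1j * (x - nu * np.pi / 2 - np.pi / 4))

# the approximation converges to the exact function for k_r * r >> 1
x = 1000.0
for nu in (0, 1, 2):
    exact = hankel1(nu, x)
    approx = hankel1_farfield(nu, x)
    assert abs(exact - approx) / abs(exact) < 1e-2
```

The relative error of the leading-order approximation shrinks roughly like 1/(k_r r) and grows with the order ν, which is why the check uses a large argument and small orders.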
2.3.1 Cylindrical Harmonics Expansion
It was shown for the wave equation formulated in Cartesian coordinates that it is possible
to express arbitrary solutions as a superposition of elementary solutions. In that case these
solutions were plane waves. The same principle can be applied when using cylindrical
coordinates. The elementary solutions using these coordinates are given by Eq. (2.21).
According to the term spherical harmonics used for the elementary solutions of the
wave equation in spherical coordinates [Wil99, Pie91, Bla00], the solutions in cylindrical
coordinates will be termed cylindrical harmonics in the remainder of this thesis. The
general solution of the wave equation (2.16) is then given as a superposition of elementary
solutions for all possible parameters ν and k_z. Thus, in the frequency domain the general
solution can be given as
P_Y(x_Y, ω) = (1/2π) Σ_{ν=-∞}^{∞} e^{-jνφ} ∫_{-∞}^{∞} P̆^{(1)}(ν, k_z, ω) H_ν^{(1)}(k_r r) e^{-jk_z z} dk_z
            + (1/2π) Σ_{ν=-∞}^{∞} e^{-jνφ} ∫_{-∞}^{∞} P̆^{(2)}(ν, k_z, ω) H_ν^{(2)}(k_r r) e^{-jk_z z} dk_z .    (2.23)
The infinite integral over k_z can be interpreted as a spatial Fourier integral, the infinite
sum over ν as a Fourier series. Please refer to Section 3.1.1 and Section 3.1.4 for details
on spatial Fourier transformations and Fourier series. The coefficients P̆^{(1),(2)}(ν, k_z, ω)
are termed cylindrical harmonics expansion coefficients in the following and will be
denoted by a breve over the respective variable. It was shown in the previous section
(see e.g. Fig. 2.3) that H_ν^{(1)}(k_r r) belongs to an incoming (converging) and H_ν^{(2)}(k_r r)
to an outgoing (diverging) radial wave contribution.
Figure 2.3: Incoming and outgoing cylindrical waves for ν = 0 and k_z = 0, shown for the
time instants t = 0 ms, t = 1.6 ms and t = 2.8 ms.
2.3.2 Circular Harmonics Expansion
This section will derive the cylindrical harmonics expansion for a two-dimensional wave
field. This specialized decomposition will be termed the circular harmonics expansion in
the following.
If the wave field to be expanded exhibits no dependence on the z-coordinate it is convenient
to use polar coordinates instead of cylindrical coordinates for the expansion into
circular harmonics (P_Y(φ, r, z, ω) = P_P(φ, r, ω)). As a consequence of condition (2.9) the
integration over k_z in Eq. (2.23) becomes equal to a spatial Dirac pulse 2π δ(z). This
Dirac pulse will be discarded in the following. The circular harmonics expansion is then
given as follows
P_P(x_P, ω) = Σ_{ν=-∞}^{∞} [ P̆^{(1)}(ν, ω) H_ν^{(1)}(kr) e^{-jνφ} + P̆^{(2)}(ν, ω) H_ν^{(2)}(kr) e^{-jνφ} ] .    (2.25)
Figure 2.4: Illustration of the angular basis functions e^{-jνφ} of the cylindrical harmonics.
The plots show the absolute value of the real part (|ℜ{e^{-jνφ}}|) for the angular
frequencies ν = 0, 1, 2, 3.

This specialized expansion can be used to describe two-dimensional wave fields. Please
note that this formulation does not allow to expand a generic three-dimensional wave
field observed only at the plane z = 0. The correct condition for this case would be to
introduce z = 0 into Eq. (2.23). However, this will not result in the circular harmonics
decomposition of the wave field, as given by Eq. (2.25).
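As a numerical consistency check of the circular harmonics expansion (2.25), note that choosing equal incoming and outgoing coefficients combines the Hankel functions into Bessel functions, J_ν = (H_ν^{(1)} + H_ν^{(2)})/2; the Jacobi-Anger identity then expresses a plane wave exactly as such a circular harmonics sum. The truncation order N below is an ad-hoc choice, not a value from this thesis:

```python
import numpy as np
from scipy.special import jv

k = 2 * np.pi * 1000.0 / 343.0     # wavenumber at f0 = 1000 Hz

def plane_wave_polar(r, phi):
    """plane wave with tilt angle 0 in polar coordinates: exp(-j k r cos(phi))"""
    return np.exp(-1j * k * r * np.cos(phi))

def circular_harmonics_sum(r, phi, N=50):
    """truncated sum over nu = -N..N of (-j)^nu J_nu(kr) exp(-j nu phi);
    since J_nu = (H1_nu + H2_nu)/2 this is the expansion (2.25) with equal
    incoming and outgoing coefficients (Jacobi-Anger identity)"""
    nu = np.arange(-N, N + 1)
    return np.sum((-1j) ** nu * jv(nu, k * r) * np.exp(-1j * nu * phi))

for r, phi in [(0.1, 0.0), (0.5, 1.2), (1.0, -2.5)]:
    assert np.isclose(circular_harmonics_sum(r, phi),
                      plane_wave_polar(r, phi), atol=1e-6)
```

The sum converges rapidly once N exceeds kr, because J_ν(kr) decays super-exponentially for orders ν larger than the argument.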
The expansion into spherical/cylindrical harmonics is closely related to the multipole
expansion of a wave field [Wil99, Pie91]. The multipole expansion expands an outgoing
wave field into the fields of multiple acoustic point sources located in the vicinity of the
origin with equal amplitudes and opposite phases. The same principle can be applied to
the incoming wave field by expanding it into acoustic drains. The wave field is thus
decomposed into the contributions belonging to monopoles, dipoles, quadrupoles, etc.
located at the origin. The expansion into cylindrical harmonics for the angular and
radial part can be understood as a multipole expansion. The angular frequency ν is equal
to the order of the multipole. For the circular harmonics, as given by Eq. (2.25), the
sources/drains of the multipole expansion are line sources or drains due to the
condition P_Y(φ, r, z, ω) = P_P(φ, r, ω).
Figure 2.5: Illustration of the radial basis functions H_ν^{(1)}(k_r r) of the cylindrical
harmonics. The upper plot shows the real part of H_ν^{(1)}(k_r r), the lower the imaginary
part, for different angular frequencies ν.
2.4 Acoustic Sources
The following section will introduce solutions of the inhomogeneous wave equation

∇^2 p(x, t) - (1/c^2) ∂^2 p(x, t)/∂t^2 = -q(x, t) .    (2.26)

These solutions comprise acoustic sources, as will be shown in the following sections.
2.4.1 Point Source
The basis for the model of a point source is a radially oscillating sphere that generates
an outgoing wave field. Due to the symmetry, the wave field is angle independent
(omnidirectional). The model of a point source is derived when considering the limiting case
for which the radius of the sphere becomes progressively smaller. The sphere then
degenerates to a single point in space. However, the same principle can be applied to a
source of nearly arbitrary shape (with oscillating mass of fluid) if the dimensions of the source
are small compared to the considered wavelength and the wave field is observed at a large
distance compared to the source dimensions. As these assumptions hold for several types
of real-world sources, the model of a point source is a frequently used idealization for
acoustic sources. One example for the application of the point source model is the acoustic
field of a loudspeaker mounted in a cabinet (closed loudspeaker). The field observed
at some distance to the loudspeaker has approximately the properties of a wave field
generated by a point source.
Due to the omnidirectional nature of the radiated pressure field it is convenient to use
a spherical coordinate system (see Appendix B.2) to describe the wave field of a point
source. The acoustic pressure field P_H(x_H, ω) of a monochromatic point source placed at
the origin is given as follows [Pie91]

P_H^{(2)}(x_H, ω) = P_H^{(2)}(ρ, ω) = P(ω) (1/ρ) e^{-jkρ} ,    (2.27)

where k denotes the wavenumber, ρ the radius and P(ω) a frequency dependent pressure
amplitude. Transforming Eq. (2.27) back into the time domain using the inverse Fourier
transformation (2.6b) yields

p_H^{(2)}(x_H, t) = (1/ρ) p(t - ρ/c) .    (2.28)
This result proves that Eq. (2.27) describes an outgoing spherical wave. The shape of
the spherical wave in radial direction is given by p(t), which can be computed by an
inverse Fourier transformation of the frequency dependent pressure amplitude P(ω). The
amplitude of a point source exhibits a 1/ρ decay and has a pole at ρ = 0. The point
source is therefore also termed an acoustic monopole. The pole at ρ = 0 does not fit well
to physical reality. However, the point source model provides a reasonable approximation
well outside of this pole.
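The two defining properties of the monochromatic point source field (2.27), the 1/ρ amplitude decay and the phase corresponding to a propagation delay of ρ/c, can be verified directly (frequency and distances are arbitrary example values):

```python
import numpy as np

c = 343.0
w = 2 * np.pi * 1000.0
k = w / c

def point_source(rho, P=1.0):
    """outgoing spherical wave, Eq. (2.27): P(w)/rho * exp(-j k rho)"""
    return P / rho * np.exp(-1j * k * rho)

# the amplitude exhibits a 1/rho decay ...
assert np.isclose(abs(point_source(2.0)), abs(point_source(1.0)) / 2)
# ... and the phase corresponds to a propagation delay of rho/c:
rho = 3.0
assert np.isclose(np.angle(point_source(rho) * np.exp(1j * k * rho)), 0.0)
```

Removing the phase factor e^{-jkρ} leaves a real, positive amplitude, which is the frequency-domain picture of the delayed pulse p(t - ρ/c) in Eq. (2.28).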
The acoustic particle velocity of a point source can be derived from Euler's equation (2.4).
The particle velocity V_H(x_H, ω) has only nonzero contributions in the radial
direction due to the omnidirectional nature of the wave field. The radial component
V_{H,ρ}(ρ, ω) can be computed as

V_{H,ρ}(ρ, ω) = (1/Z(k, ρ)) P(ρ, ω) ,    (2.29)

where the acoustic impedance Z(k, ρ) is given as

Z(k, ρ) = ρ_0 c (jkρ)/(1 + jkρ) .    (2.30)

The acoustic impedance is in general complex, which reveals that the point source has a
reactive part. However, in the far-field (kρ ≫ 1) the acoustic impedance can be
approximated as Z ≈ ρ_0 c, which exhibits no reactive contributions.
Figure 2.6: Acoustic pressure P(ρ, ω) and normalized velocity V_{H,ρ}(ρ, ω) for a
monochromatic point source with frequency f_0 = 1000 Hz plotted over the distance ρ. The plot
shows the real part of the functions and the 1/ρ decay curve.
Figure 2.6 shows the real part of the acoustic pressure P(ρ, ω) and the normalized velocity
V_{H,ρ}(ρ, ω) of a point source with monochromatic excitation. The frequency of the
excitation was f_0 = 1000 Hz. The 1/ρ decay is shown additionally. It can be seen that for
this example the phase difference between the acoustic pressure and velocity due to the
complex acoustic impedance is only considerable for relatively small distances ρ < 1 m.
If the sign of the argument of the exponential term in Eq. (2.27) is reversed, an incoming
spherical wave is obtained

P_H^{(1)}(x_H, ω) = P_H^{(1)}(ρ, ω) = P(ω) (1/ρ) e^{jkρ} .    (2.31)

This wave field can be interpreted as the result of an acoustic monopole drain placed at
the origin.
2.4.2 Green's Functions
The concept of Green's functions provides a convenient way to compute arbitrary solutions
of the inhomogeneous wave equation. The following section briefly reviews the
relevant parts of the background on Green's functions presented in [MF53a, Zio95].
The inhomogeneous wave equation belonging to the point source model, given by
Eq. (2.27), can be derived by plugging this special solution into the left hand side of
the Helmholtz equation (2.7). After evaluating the nabla operator, the result can be
found as [Zio95]

∇^2 P(x, ω) + k^2 P(x, ω) = -4π P(ω) δ(x) ,    (2.32)
where δ(x) denotes a spatial Dirac pulse at the origin. Equation (2.32) states that the left
hand side of the wave equation is equal to a spatial Dirac pulse multiplied by a frequency
dependent factor. Thus, this proves that the excitation for the point source model (2.27)
is an infinitesimally small point in space. This result can be generalized to the case of a
point source placed at an arbitrary point x_0. The acoustic pressure field of the shifted
point source can be derived straightforwardly from Eq. (2.27) with ρ = ||x - x_0|| as

P(x, ω) = P(ω) e^{-jk ||x - x_0||} / ||x - x_0|| .    (2.33)

For the special choice P(ω) = 1/(4π) this becomes

G_{0,3D}(x|x_0, ω) = (1/(4π)) e^{-jk ||x - x_0||} / ||x - x_0|| .    (2.34)

This special solution of the inhomogeneous wave equation is known as the free-field Green's
function G_{0,3D}(x|x_0, ω).
In general, Green's functions are the solutions to inhomogeneous differential equations
when excited by a (multidimensional) Dirac pulse [Wei03]. In the context of linear
systems, Green's functions can be interpreted as the spatio-temporal impulse response of
the inhomogeneous wave equation. The vector x is also termed the observation point and
the vector x_0 the source point. The notation G(x|x_0, ω) for a generic Green's function
highlights this interpretation.
One important property of Green's functions is the reciprocity principle [MF53a]. This
principle states that the Green's function remains unchanged if the observation and source
positions are interchanged

G(x|x_0, ω) = G(x_0|x, ω) .    (2.35)

Obviously, Eq. (2.35) holds for the free-field Green's function given by Eq. (2.34). The
reciprocity principle will be utilized during the derivation of the general solution to an
inhomogeneous wave equation in Section 2.6.
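The reciprocity principle can be illustrated directly with the free-field Green's function; the positions below are arbitrary example points:

```python
import numpy as np

def G_free(x, x0, k):
    """free-field Green's function, Eq. (2.34):
    exp(-j k ||x - x0||) / (4 pi ||x - x0||)"""
    d = np.linalg.norm(np.asarray(x) - np.asarray(x0))
    return np.exp(-1j * k * d) / (4 * np.pi * d)

k = 2 * np.pi * 1000.0 / 343.0
x = np.array([1.0, 2.0, 0.5])
x0 = np.array([-0.3, 0.7, 1.1])
# reciprocity (2.35): interchanging source and observation point
assert np.isclose(G_free(x, x0, k), G_free(x0, x, k))
```

For the free-field case reciprocity is evident, since the Green's function depends only on the distance ||x - x_0||, which is symmetric in its two arguments.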
The free-field solution to the inhomogeneous wave equation (2.26) for an arbitrary
excitation Q(x, ω) = F_t{q(x, t)} can be formulated in terms of the free-field Green's function
as follows [Wil99]

P(x, ω) = ∫_V Q(x', ω) G_{0,3D}(x|x', ω) dV' .    (2.36)

This principle will be exploited in the following two sections to derive the wave fields
generated by line and planar sources.
The Green's function is typically defined in the (temporal) frequency domain. However,
for some applications its inverse Fourier transformation is of use

g(x|x_0, t) = F_t^{-1}{G(x|x_0, ω)} .    (2.37)
2.4.3 Line Source
The concept of an acoustic line source is strongly related to that of a point source.
Section 2.1.2 stated suitable conditions to reduce a three-dimensional description of a
wave field to a two-dimensional description. In order to derive the transition from three to
two dimensions it was assumed that the acoustic field exhibits no dependence on the
z-coordinate (e.g. P_Y(φ, r, z, ω) = P_P(φ, r, ω)). The analogon in this two-dimensional space
to a point source in a three-dimensional space is then a thin line with infinite length in
the z-direction and time-varying mass of fluid in radial direction. This type of source will
be termed a line source in the following. The concept of a line source is equal
to that of a two-dimensional point source for truly two-dimensional wave propagation.
The wave field of a line source can be derived by calculating the field of an infinitely long,
radially oscillating cylinder [Bla00]. As for the point source, reducing the radius of the
cylinder until it has degenerated to a line yields the wave field of the line source. As a
result, the pressure field exhibits a pole at r = 0. However, as for the point source, it can
be shown that a line source is a reasonable model for real-world line sources.
Although it is possible to derive the wave field of a line source in the way depicted
above, an alternative derivation is chosen here. As indicated in the previous section, it
is convenient to use the free-field Green's function (2.34) to calculate arbitrary solutions
of the inhomogeneous wave equation. The inhomogeneous wave equation for a line source
placed at the origin, whose axis is perpendicular to the xy-plane, is given as follows

∇^2 P_C(x_C, ω) + k^2 P_C(x_C, ω) = -P(ω) δ(x) δ(y) ,    (2.38)

where the right hand side constitutes the excitation Q(x, ω) = P(ω) δ(x) δ(y).
The solution to Eq. (2.38) is given in terms of the free-field Green's function by Eq. (2.36).
Introducing the line source excitation into (2.36) and exploiting the sifting property of the
Dirac function yields

P_C(x_C, ω) = P(ω) ∫_V δ(x') δ(y') G_{0,3D}(x_C|x'_C, ω) dV'
            = P(ω) ∫_{-∞}^{∞} (1/4π) e^{-jk √(x^2 + y^2 + (z-z')^2)} / √(x^2 + y^2 + (z-z')^2) dz' ,    (2.39)
where dV' = dx' dy' dz' denotes the volume element used for the first integral. The
second integral can be solved using an integral definition of the Hankel function [GR65].
Due to the symmetry of the problem the solution does not depend on the z-coordinate but
only on the distance r = √(x^2 + y^2) to the source position. Thus, it is convenient
to use a cylindrical coordinate system (see Appendix B.3). The pressure field of a line
source is given as follows

P_Y^{(2)}(x_Y, ω) = P_Y^{(2)}(r, ω) = -(j/4) P(ω) H_0^{(2)}(kr) .    (2.40)
The particle velocity of a line source can be computed using Euler's equation (2.4). Due
to the geometry of the problem, the particle velocity only has a contribution in the
radial direction. Evaluation of Euler's equation in the radial direction of a cylindrical
coordinate system yields

∂P_Y(x_Y, ω)/∂r = -jωρ_0 V_{Y,r}(r, ω) .    (2.41)

Introducing Eq. (2.40) into Eq. (2.41) yields the acoustic particle velocity of a line
source in radial direction as

V_{Y,r}(r, ω) = (1/(4ρ_0 c)) P(ω) H_0^{(2)'}(kr) = -(1/(4ρ_0 c)) P(ω) H_1^{(2)}(kr) ,    (2.42)
where H_ν^{(i)'}(·) denotes the first derivative of the Hankel function with respect to its
argument. The acoustic impedance Z(k, r) of a line source can be calculated according to
Eq. (2.29) using Eq. (2.42)

Z(k, r) = jρ_0 c H_0^{(2)}(kr) / H_1^{(2)}(kr) .    (2.43)
As for the point source, the acoustic impedance is in general complex. However, in the
far-field (kr ≫ 1) the acoustic impedance can be approximated as Z ≈ ρ_0 c, which exhibits
no reactive contributions. The same result was obtained for the point source.
For the point source the amplitude decay was 1/ρ. In order to derive a similar result
for the amplitude decay of the line source, a closer look at the properties of the zeroth-order
Hankel function has to be taken. In the far-field (kr ≫ 1) the Hankel function can
be approximated as given by Eq. (2.22). This approximation states that the amplitude
decay of a line source in the far-field is 1/√r. In the near-field the amplitude decay is
strongly dependent on the small-argument properties of the Hankel function; no simple
conclusion can be drawn here.
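The consistency of the pressure (2.40) and velocity (2.42) of the line source with Euler's equation (2.41) can be checked numerically by approximating the radial derivative with a central finite difference. This sketch assumes the e^{jωt} time convention used above; all parameter values are arbitrary:

```python
import numpy as np
from scipy.special import hankel2

c, rho0 = 343.0, 1.2
w = 2 * np.pi * 1000.0
k = w / c

def P_line(r, P=1.0):
    """pressure of a line source, Eq. (2.40): -j/4 * P(w) * H0^(2)(kr)"""
    return -1j / 4 * P * hankel2(0, k * r)

def V_line(r, P=1.0):
    """radial particle velocity, Eq. (2.42): -1/(4 rho0 c) * P(w) * H1^(2)(kr)"""
    return -1 / (4 * rho0 * c) * P * hankel2(1, k * r)

# Euler's equation (2.41): dP/dr = -j w rho0 V_r, checked by central difference
r, h = 0.7, 1e-6
dP_dr = (P_line(r + h) - P_line(r - h)) / (2 * h)
assert np.isclose(dP_dr, -1j * w * rho0 * V_line(r), rtol=1e-5)
```

The check exploits H_0^{(2)'} = -H_1^{(2)}, which is exactly the step that leads from Eq. (2.41) to Eq. (2.42).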
Figure 2.7 shows the real part of the acoustic pressure P_Y(r, ω) and the normalized velocity
4ρ_0 c V_{Y,r}(r, ω) of a monochromatic line source. The frequency of the excitation was
f_0 = 1000 Hz.

Figure 2.7: Acoustic pressure P_Y(r, ω) and normalized velocity 4ρ_0 c V_{Y,r}(r, ω) for a
monochromatic line source with frequency f_0 = 1000 Hz plotted over the distance r.
The plot shows the real part of the functions and the 1/√r decay curve.
Repeating the above derivation with the Green's function of an incoming spherical wave
yields the wave field of a line drain placed at the origin as

P_Y^{(1)}(x_Y, ω) = P_Y^{(1)}(r, ω) = (j/4) P(ω) H_0^{(1)}(kr) .    (2.44)

This result is also evident when taking the far-field approximation (2.22) of the Hankel
functions into account.
2.4.4 Planar Sources
The previous two sections introduced source models for point and line sources. The
traveling waves outside the excited point/line are the eigensolutions of the wave equation
formulated in spherical and cylindrical coordinates, respectively. As was already shown
in Section 2.2, plane waves are eigensolutions of the acoustic wave equation formulated in
Cartesian coordinates. This section will introduce solutions of the inhomogeneous wave
equation that result in plane waves.
A plane wave can be excited by an infinitesimally thin plate of infinite size that vibrates
uniformly [JF86]. The simplest case is to place the vibrating plate in the xy-plane (z = 0).
This excitation results in a plane wave traveling parallel to the z-axis. Without
loss of generality it will be assumed in the following that the plate includes the origin of the
coordinate system and that its orientation is denoted by its surface normal n_{C,0}. The
inhomogeneous wave equation for such a planar source is given as follows

∇^2 P_C(x_C, ω) + k^2 P_C(x_C, ω) = -2jk P(ω) δ(n_{C,0}^T x_C) ,    (2.45)
(2.45)
where P () denotes a frequency dependent factor. As for the derivation of the line source
it is convenient to use the free space Greens function together with Eq. (2.36) to calculate
the wave field of a planar source. Introduction of the right hand side of Eq. (2.45) into
Eq. (2.36) yields the following integral
Z
jk (xx )2 +(yy )2 +(zz )2
e
p
PC (xC , ) = 2jk P ()
(nTC,0 xC ) dV . (2.46)
2
(x x ) + (y y ) + (z z )
V 4
(2.47)
where k_{C,0} = (ω/c) n_{C,0} denotes the wave vector of the plane wave. This solution is, as
desired, equal to the homogeneous solution of the wave equation in Cartesian coordinates
given by Eq. (2.12). Plane waves with an arbitrary frequency spectrum can be excited
by using suitable frequency weights P(ω). The frequency weights can be derived from a
temporal Fourier transformation of the desired shape p(t).
Plane waves are often used to model the far-field of a point or line source. The curvature
of the wavefronts of a line/point source decays with increasing distance to the source.
In the far-field (kr ≫ 1) the wave fronts are approximately equal to those of a plane
wave.
The particle velocity of a plane wave can be found by introducing Eq. (2.47) into Euler's
equation (2.4). This results in

V_C(x_C, ω) = (1/(ρ_0 c)) P_C(x_C, ω) n_{C,0} ,    (2.48)

where 1/(ρ_0 c) = 1/Z_0. The above result states that the acoustic impedance Z_0 of a plane
wave is independent of position and frequency. The impedance Z_0 = ρ_0 c is also known
as the characteristic acoustic impedance. The same impedance as for a plane wave is
obtained for a point/line source in the far-field (kr ≫ 1). This again highlights the
connections between point/line sources and plane waves in the far-field. Equation (2.48)
states that the particle movement is parallel to the direction of propagation. Hence, this
proves that acoustic waves, as described by the wave equation (2.5), are longitudinal waves.
In the remainder of this work two-dimensional wave fields will be used frequently. The
reduction to two dimensions for a plane wave can be done by simply omitting the
z-components of the position vector x_C and the wave vector k_{C,0}.

Figure 2.8: Illustration of the parameters used for a two-dimensional plane wave in
Cartesian and polar coordinates.

The resulting plane wave propagates in the xy-plane and can be characterized by its
frequency weights P(ω) and its tilt angle θ_0. Figure 2.8 illustrates the parameters of a
two-dimensional plane wave in Cartesian and polar coordinates. The connection between
the wave vector k_{C,0} and the tilt angle can be found by changing the underlying coordinate
system from the Cartesian coordinate system used for Eq. (2.47) to the polar coordinate
system (see Appendix B.4). The product of wave and position vector is then given as

k_{C,0}^T x_C = kr cos(φ - θ_0) .    (2.49)

Thus, for a plane wave traveling in two-dimensional space with tilt angle θ_0 the acoustic
pressure can be reformulated in polar coordinates as

P_P(x_P, ω) = P(ω) e^{-jkr cos(φ - θ_0)} .    (2.50)
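The equivalence of the Cartesian and polar formulations of a two-dimensional plane wave follows from the trigonometric identity cos θ_0 cos φ + sin θ_0 sin φ = cos(φ - θ_0) and can be checked numerically; the tilt angle and evaluation point below are arbitrary examples:

```python
import numpy as np

k = 2 * np.pi * 1000.0 / 343.0
theta0 = 0.3                                          # tilt angle of the plane wave
k0 = k * np.array([np.cos(theta0), np.sin(theta0)])   # 2D wave vector k_C,0

# Cartesian form (2.47) and polar form (2.50) describe the same field
r, phi = 1.5, 2.0
x = r * np.array([np.cos(phi), np.sin(phi)])
cartesian = np.exp(-1j * (k0 @ x))
polar = np.exp(-1j * k * r * np.cos(phi - theta0))
assert np.isclose(cartesian, polar)
```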
2.5 Boundary Conditions
In the foregoing sections only free-field propagation of acoustic waves was considered.
However, in the context of this thesis wave propagation inside enclosures is of special
interest in order to model wave propagation inside rooms. As stated in Section 2.1, the
overall solution of the wave equation inside an enclosure must meet the acoustic conditions
at the boundaries of the enclosure. The following section will introduce typical boundary
conditions.

Figure 2.9: Volume V, surface ∂V and inward pointing surface normal n used to formulate
boundary conditions for the wave equation.

The geometry used to formulate the boundary conditions inside an enclosure is depicted
by Fig. 2.9: a compact volume V which is enclosed by a closed surface ∂V. The orientation
of the surface is given by the inward pointing surface normal n. The surface may
represent a real existing or a virtual boundary.
The entire range of possible boundary conditions can be classified into two basic classes: homogeneous boundary conditions and inhomogeneous boundary conditions. Problems involving mixtures of these classes can be solved by a superposition of the corresponding solutions. Inhomogeneous boundary conditions are typically used for radiation problems (e. g. vibrating bodies), homogeneous boundary conditions when the boundaries are stationary. The points on the boundary will be denoted by x_s in the following (x_s ∈ ∂V). The total wave field inside the enclosure can be divided into two components
P(x, ω) = P_s(x, ω) + P_b(x, ω) ,  (2.51)
where P_s(x, ω) denotes the wave field generated by sources present inside the volume V and P_b(x, ω) the wave field generated by the boundaries. In the following, a commonly used classification of boundary conditions is reviewed briefly.
2.5.1 Classification of Boundary Conditions
Boundary conditions can be formulated in terms of the acoustic pressure, the acoustic particle velocity or both. Typically continuous pressure or particle velocity is assumed at the
boundary. Based on these principles three types of boundary conditions can be formulated
for the description of the acoustic properties of boundaries [JF86, Pie91, MF53a]:
1. Dirichlet boundary condition

P(x_s, ω) = f(x_s, ω) ,  (2.52)

where f(x_s, ω) denotes an arbitrary function. The above condition states that the acoustic pressure at the boundary is equal to f(x_s, ω). For the homogeneous case (f(x_s, ω) = 0) this condition models a pressure release boundary. An example for the application of the homogeneous condition is the boundary condition at the open end of a duct: it can be shown that the pressure near the open end of a duct is zero. Such boundaries are often termed acoustically soft surfaces.
2. Neumann boundary condition

∂P(x_s, ω)/∂n = f(x_s, ω) .  (2.53)

The partial derivative in the above equation is an abbreviation for the gradient in direction of the normal vector n (see also Appendix C.1). The above condition can be rewritten in terms of the particle velocity using Euler's equation (2.4) as follows

V_n(x_s, ω) = −1/(jωρ₀) ∂P(x_s, ω)/∂n ,  (2.54)

where V_n(x_s, ω) denotes the particle velocity in direction of the surface normal n. Thus, the Neumann boundary condition fixes the particle velocity in normal direction on the boundary ∂V. For the homogeneous case (f(x_s, ω) = 0) this condition models a rigid boundary. This homogeneous boundary condition is used to model acoustically impenetrable surfaces. A typical example would be a structural wall with smooth surface, e. g. a wall made of concrete. Such boundaries are also termed acoustically hard surfaces.
3. Robin boundary condition

The third kind of boundary condition introduced here is a linear combination of the first two kinds

∂P(x_s, ω)/∂n + jβ(x_s, ω) P(x_s, ω) = f(x_s, ω) .  (2.55)

For the homogeneous case (f(x_s, ω) = 0) the above condition can be related to the concept of the specific acoustic impedance [Pie91]. The specific acoustic impedance at the boundary, Z_s(x_s, ω), is defined as follows

Z_s(x_s, ω) = P(x_s, ω) / V_n(x_s, ω) .  (2.56)

Introducing Eq. (2.54) into Eq. (2.56) yields equality between Eq. (2.56) and Eq. (2.55) for β(x_s, ω) = ωρ₀ / Z_s(x_s, ω). The specific acoustic impedance is typically used to describe porous surfaces that are not necessarily impenetrable. The Robin boundary condition naturally includes the boundary conditions of the first and second kind: if Z_s is chosen such that Z_s ≫ 1, a rigid or hard boundary is described.
Otherwise, if Z_s is chosen such that 1/Z_s ≫ 1, a pressure release or soft boundary is described. If Z_s is equal to the characteristic acoustic impedance Z₀ = ρ₀c, free-space propagation is modeled.

Figure 2.10: Geometry used for the general solution of the inhomogeneous wave equation for a bounded region V with prescribed boundary conditions on the closed boundary ∂V.
2.6 Solution of the Inhomogeneous Wave Equation
The solution of the inhomogeneous wave equation with respect to arbitrary boundary
conditions is of special interest in the context of this thesis. It can be used e. g. to analyze
the reproduction of an acoustic scene in the listening room by the reproduction system.
The loudspeakers can be modeled by suitable source models and the characteristics of the
walls by suitable boundary conditions. In the sequel the solution of the inhomogeneous
wave equation will be derived for a bounded region in the presence of inhomogeneous
boundary conditions. The derivation, as given in this section, is based upon [MF53a].
Figure 2.10 illustrates the underlying geometry of the considered interior problem. A source Q(x, ω) generates a wave field within the bounded region V. The closed boundary ∂V surrounding the region V may impose arbitrary homogeneous or inhomogeneous boundary conditions. Please note that the region V may be two- or three-dimensional: in the first case V describes a plane and ∂V the closed contour surrounding it, in the second case V describes a volume and ∂V the closed surface surrounding it. The derived solution is based upon the concept of Green's functions.
The acoustic pressure field P(x, ω) within the region V obeys the inhomogeneous Helmholtz equation

∇²P(x, ω) + k² P(x, ω) = −Q(x, ω) .  (2.57)

The Green's function G(x₀|x, ω) for this problem is defined as the solution of

∇²G(x₀|x, ω) + k² G(x₀|x, ω) = −δ(x₀ − x) ,  (2.58)

with respect to the same but homogeneous boundary conditions as used for Eq. (2.57). Multiplying both sides of Eq. (2.57) with G(x₀|x, ω) and both sides of Eq. (2.58) with P(x, ω), and subtracting the resulting equations from each other, results in
G(x₀|x, ω) ∇²P(x, ω) − P(x, ω) ∇²G(x₀|x, ω) = −Q(x, ω) G(x₀|x, ω) + P(x, ω) δ(x₀ − x) .  (2.59)
Integration of both sides of Eq. (2.59) over the entire region V gives

∫_V [ G(x₀|x, ω) ∇²P(x, ω) − P(x, ω) ∇²G(x₀|x, ω) ] dV = −∫_V Q(x, ω) G(x₀|x, ω) dV + { P(x₀, ω)  for x₀ ∈ V∖∂V ;  0  otherwise } ,  (2.60)

where the sifting property of the Dirac delta function was exploited and dV denotes a suitably chosen volume element in V. In the above equation x constitutes a source point and
x₀ a receiver point. However, it is desired to have x as the receiver point. This can be achieved by interchanging the source and receiver points x and x₀. The second volume integral can be simplified to a boundary integral using Green's second integral theorem (see Appendix C.1): comparison of the above result with the left hand side of Eq. (C.3) shows that v can be identified with G(x₀|x, ω) and u with P(x, ω). Applying the steps described above to Eq. (2.60) yields
P(x, ω) = ∫_V Q(x₀, ω) G(x|x₀, ω) dV₀ + ∮_{∂V} [ ∂G(x|x₀, ω)/∂n P(x₀, ω) − ∂P(x₀, ω)/∂n G(x|x₀, ω) ] dS₀ ,  (2.61)
where dS₀ denotes a suitably chosen surface element on ∂V. Please note that the Green's function used in Eq. (2.61) is equal to the one defined by Eq. (2.58) due to the reciprocity
theorem (2.35). Equation (2.61) constitutes the solution to the inhomogeneous Helmholtz
equation (2.57) with respect to arbitrary inhomogeneous boundary conditions. It consists
of a volume integral involving the source terms and a boundary integral involving the
boundary conditions. In order to interpret Eq. (2.61) two special cases will be discussed
in the following:
1. Inhomogeneous wave equation, homogeneous boundary conditions
The pressure field and the Green's function have to obey the homogeneous boundary conditions. As a result, the surface integral over ∂V in Eq. (2.61) vanishes. Thus, the solution for this case is given as

P(x, ω) = ∫_V Q(x₀, ω) G(x|x₀, ω) dV₀ .  (2.62)
2. Homogeneous wave equation, inhomogeneous boundary conditions
If no sources Q(x, ω) are present within the region V, the volume integral in Eq. (2.61) vanishes and the solution reduces to

P(x, ω) = ∮_{∂V} [ ∂G(x|x₀, ω)/∂n P(x₀, ω) − ∂P(x₀, ω)/∂n G(x|x₀, ω) ] dS₀ .  (2.63)
This integral is known as the Kirchhoff-Helmholtz integral [Pie91, Wil99] (or Helmholtz integral equation). The Kirchhoff-Helmholtz integral states that at any point within the source-free region V the sound pressure P(x, ω) can be calculated if both the sound pressure and its normal derivative are known on the boundary ∂V.

Figure 2.11: Parameters used for the three-dimensional free-space Kirchhoff-Helmholtz integral (2.64).
2.6.1 Three-dimensional Free-space Kirchhoff-Helmholtz Integral
The Green's function for a point source in free space was already derived in Section 2.4.2. Introducing the Green's function as given by Eq. (2.33) into the Kirchhoff-Helmholtz integral (2.63) yields

P(x, ω) = −(1/4π) ∮_{∂V} [ ∂P(x₀, ω)/∂n · e^{−jk|x−x₀|}/|x−x₀| − P(x₀, ω) ∂/∂n ( e^{−jk|x−x₀|}/|x−x₀| ) ] dS₀ .  (2.64)
The three-dimensional free-space Kirchhoff-Helmholtz integral states that at any point within the source-free volume V the sound pressure P(x, ω) can be calculated if both the sound pressure and the particle velocity are known on the surface enclosing the volume. This principle will be exploited in Section 3.4 for efficient wave field analysis. However, as stated before, the Kirchhoff-Helmholtz integral also provides the basis for sound reproduction: if a monopole and corresponding dipole source distribution is placed on the surface ∂V, the acoustic pressure inside that surface can be controlled. The application of this principle to sound reproduction will be illustrated in Section 4.1.
2.6.2 Two-dimensional Free-space Kirchhoff-Helmholtz Integral

In two spatial dimensions, the Kirchhoff-Helmholtz integral (2.63) takes the form

P(x, ω) = ∮_{∂S} [ ∂G_{0,2D}(x|x₀, ω)/∂n P(x₀, ω) − ∂P(x₀, ω)/∂n G_{0,2D}(x|x₀, ω) ] dL₀ ,  (2.67)
where dL₀ denotes a suitably chosen line element on ∂S. Figure 2.12 illustrates the geometry used for the two-dimensional free-space Kirchhoff-Helmholtz integral. The specialization of the Kirchhoff-Helmholtz integral to the two-dimensional case requires an appropriate Green's function. As derived in Section 2.4.3, the two-dimensional analog of a point source is a line source. Thus, the free-space Green's function for a line source placed at the origin is given by Eq. (2.40). Generalization of this result to an arbitrary source position x and observation position x₀ yields the two-dimensional free-space Green's function as

G_{0,2D}(x|x₀, ω) = G_{0,2D}(x₀|x, ω) = −(j/4) H₀^{(2)}(k |x − x₀|) .  (2.68)
The directional gradient of the Green's function is given as

∂G_{0,2D}(x|x₀, ω)/∂n = ⟨∇G_{0,2D}(x|x₀, ω), n⟩ = −(jk/4) H₁^{(2)}(k |x − x₀|) cos φ ,  (2.69)
Figure 2.12: Parameters used for the two-dimensional free-space Kirchhoff-Helmholtz integral (2.70).
where φ denotes the angle between the inward pointing normal vector n of the closed contour ∂S and the vector x − x₀. Equation (2.69) can be interpreted as the field of a dipole line source whose axis lies parallel to the normal vector n. Introduction of Eq. (2.68) and Eq. (2.69) into Eq. (2.67), together with Eq. (2.65), yields the two-dimensional free-space Kirchhoff-Helmholtz integral as
P(x, ω) = −(jk/4) ∮_{∂S} [ jρ₀c V_n(x₀, ω) H₀^{(2)}(k |x − x₀|) + P(x₀, ω) H₁^{(2)}(k |x − x₀|) cos φ ] dL₀ .  (2.70)
The two-dimensional Kirchhoff-Helmholtz integral states that any two-dimensional pressure distribution on the surface S can be reconstructed from a distribution of monopole line sources and dipole line sources on the closed contour ∂S surrounding the surface S. The strength of the monopole line sources is given by the acoustic velocity V_n(x₀, ω), the strength of the dipole line sources by the acoustic pressure P(x₀, ω) on the closed contour ∂S.
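Since the boundary integral (2.70) is exact for interior points, it can be checked numerically with simple quadrature on a circular contour. The sketch below is based on assumptions stated in this chapter and on my reconstruction of the stripped signs: e^{+jωt} time convention, G_{0,2D} = −(j/4)H₀^{(2)}, and V_n derived from Euler's equation; all parameter values are arbitrary illustrative choices:

```python
import numpy as np
from scipy.special import hankel2

rho0, c = 1.2, 343.0                 # medium parameters (illustrative values)
k = 2 * np.pi * 500.0 / c            # wavenumber at 500 Hz
khat = np.array([np.cos(0.4), np.sin(0.4)])   # propagation direction of the test field

def P(x):
    """Interior test field: plane wave exp(-j k <khat, x>) under the e^{+j omega t} convention."""
    return np.exp(-1j * k * (x @ khat))

# Closed contour: circle of radius R with inward pointing normal n = -x0 / R
R, N = 1.0, 400
phi = 2 * np.pi * np.arange(N) / N
x0 = R * np.stack([np.cos(phi), np.sin(phi)], axis=1)
n = -x0 / R
dL = 2 * np.pi * R / N

P0 = P(x0)
Vn = (n @ khat) * P0 / (rho0 * c)    # normal particle velocity of the plane wave

# Evaluate the boundary integral (2.70) at an interior point x
x = np.array([0.3, -0.2])
d = x - x0
r = np.linalg.norm(d, axis=1)
cos_phi = np.sum(d * n, axis=1) / r  # cosine of the angle between n and (x - x0)
P_rec = -1j * k / 4 * np.sum(
    1j * rho0 * c * Vn * hankel2(0, k * r)
    + P0 * hankel2(1, k * r) * cos_phi) * dL

print(abs(P_rec - P(x)))             # quadrature error, close to zero
```

The trapezoidal rule converges very quickly here because the integrand is smooth and periodic along the circle.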
2.7
The solution of the (inhomogeneous) wave equation depends on the geometry of the boundary and on the boundary conditions imposed there. Suitable boundary conditions have already been introduced in Section 2.5. Closed-form solutions of the (inhomogeneous) wave equation can only be given for simple geometries and boundary conditions. The following section will derive solutions for the reflection of plane waves at planar boundaries and the principle of acoustic mode expansion for rectangular rooms.
37
Medium 1
Z1 = 0,1 c1
Pi
Pr
Medium 2
Z2 = 0,2 c2
Pt
Figure 2.13: Illustration of the geometry and the parameters used in the discussion of
plane wave reflection and transmission at the planar boundary between two different fluid
media.
2.7.1 Reflection of Plane Waves at Planar Boundaries
In the following, the influence of the interface between two fluid media with possibly different acoustic properties will be considered. This model is useful to calculate the acoustic properties of penetrable materials; nevertheless, the case of impenetrable surfaces is also included inherently. The walls of a room can be modeled as plane boundaries in a first approximation. However, real walls will always have finite extent. If the wavelength of the considered waves is small compared to the extent of the wall, it is reasonable to model the wall by an infinitely large plane. Every wave field can be expanded into plane waves, as was shown in Section 2.2.1. Thus, the interaction of plane waves with planar boundaries is of special interest. The following section will summarize the results given in [Bla00]. To simplify the discussion, only the two-dimensional case will be considered here. However, the results can be generalized straightforwardly to the three-dimensional case.
Figure 2.13 illustrates the underlying geometry. The two media are described by their densities ρ₀,₁, ρ₀,₂ and their propagation velocities c₁, c₂. The boundary is illuminated by an incident plane wave with incidence angle θ_i. The incident plane wave produces a reflected and a transmitted plane wave with the angles θ_r and θ_t, respectively. In order to derive the relations between these wave fields it is assumed that the pressure
and the acoustic particle velocity in the direction normal to the interface are continuous. As a result, the angles of the incident and the reflected wave field are equal

θ_i = θ_r .  (2.71)

Please note that this result is independent of the acoustic properties of the media. However, the angle of the transmitted wave field does depend on the properties of the two media. It is given as follows

sin θ_t / sin θ_i = c₂ / c₁ .  (2.72)
This relation is known as Snell's law. The acoustic pressure of the reflected and the transmitted wave field can be formulated in terms of reflection and transmission factors. These relate the pressure amplitudes of the reflected wave field P_r and the transmitted wave field P_t to the pressure amplitude of the incident wave field P_i. These factors are given as

R_pw = P_r / P_i = (Z₂ cos θ_i − Z₁ cos θ_t) / (Z₂ cos θ_i + Z₁ cos θ_t) ,  (2.73a)
T_pw = P_t / P_i = 2 Z₂ cos θ_i / (Z₂ cos θ_i + Z₁ cos θ_t) .  (2.73b)
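The reflection and transmission factors (2.73) together with Snell's law (2.72) can be evaluated directly. The following sketch (helper name and material values are illustrative, not from the thesis) also exposes the pressure-continuity identity 1 + R_pw = T_pw that follows algebraically from the two factors:

```python
import numpy as np

def reflection_transmission(rho1, c1, rho2, c2, theta_i):
    """Pressure reflection/transmission factors, Eqs. (2.72)-(2.73)."""
    Z1, Z2 = rho1 * c1, rho2 * c2
    # Snell's law (2.72); assumes no total internal reflection (c2 <= c1 is safe)
    theta_t = np.arcsin(np.sin(theta_i) * c2 / c1)
    den = Z2 * np.cos(theta_i) + Z1 * np.cos(theta_t)
    R = (Z2 * np.cos(theta_i) - Z1 * np.cos(theta_t)) / den
    T = 2 * Z2 * np.cos(theta_i) / den
    return R, T

# Pressure continuity at the interface implies 1 + R = T
R, T = reflection_transmission(1.2, 343.0, 1000.0, 330.0, np.deg2rad(30))
print(np.isclose(1 + R, T))   # True
```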
Depending on the angle of incidence θ_i and the properties of the two media some special cases arise. Only the case of normal incidence (θ_i = 0) will be discussed briefly; please refer to [Bla00] for the other cases. In the case of normal incidence the reflection and transmission factors simplify to
R_{0,pw} = (Z₂ − Z₁) / (Z₂ + Z₁) ,  (2.74a)
T_{0,pw} = 2 Z₂ / (Z₂ + Z₁) .  (2.74b)

2.7.2 Acoustic Modes in Rectangular Rooms
One of the simplest models that can be used for a room is a box with rectangular shape. Although such a simplified model neglects the complex structure of a real room (e. g. the furniture), it allows the wave field produced by sources placed inside the room to be calculated with feasible complexity. Thus, it can be used to gain insight into the structure of wave fields propagating in rooms. The rather broad approximation by a rectangular shape is especially reasonable for low frequencies. In the following section the theory of acoustic modes in rectangular rooms, as derived e. g. in [Pie91, Mec02], will be briefly reviewed.
The underlying geometry of the problem is illustrated in Fig. 2.14: a rectangular enclosure with the lengths L_x, L_y and L_z in the x-, y- and z-direction, respectively.

Figure 2.14: Parameters used to describe the rectangular room.

The acoustic
properties of the walls will be described by their specific acoustic impedance as given by Eq. (2.56). Due to the geometry of the problem, it is convenient to use Cartesian coordinates in the following.
The solution of the homogeneous wave equation inside the room can be calculated by assuming separation of variables in the three spatial dimensions. It can be shown that the solution of a partial differential equation on a finite domain can be expressed by a set of basis functions with discrete spectra [MF53a]. The fundamental solution for a rectangular room is then given as follows

P_m(x, ω) = A_m(ω) e^{−j k_m^T x} + B_m(ω) e^{j k_m^T x} ,  (2.75)
where k_m = [k_{x,m} k_{y,m} k_{z,m}]^T denotes the modal wave vector, m = [m_x m_y m_z]^T the vector of (integer) modal orders and A_m(ω), B_m(ω) arbitrary complex constants. The above (fundamental) solutions are also known as the modes of the room. Each particular mode has a specific order m_x, m_y, m_z in each spatial direction. The fundamental solution comprises a combination of two plane waves for each spatial coordinate. The solution of the wave equation in a rectangular room is then a weighted superposition of all possible modes at one particular point x inside the room
P(x, ω) = Σ_m [ A_m(ω) e^{−j k_m^T x} + B_m(ω) e^{j k_m^T x} ] ,  (2.76)
where Σ_m denotes the summation over all permutations of the modal orders m_x, m_y and m_z. The constants A_m(ω) and B_m(ω) depend on the acoustic excitation present in the
room. The modal wavenumbers kx,m , ky,m and kz,m depend on the geometry and the
boundary conditions.
It will be assumed in the following that the acoustic properties of the walls are characterized by their impedance (2.56). One of the simplest solutions can be found by assuming that all walls of the room are rigid. This assumption will serve as starting point for a, to some extent, generalized model at the end of this section. Introducing the rigid boundary condition into Eq. (2.76) yields
P(x, ω) = Σ_m A_m(ω) cos(k_{x,m} x) cos(k_{y,m} y) cos(k_{z,m} z) ,  (2.77)

where the product of the cosine terms constitutes the mode shape Φ_m(x), and the wavenumbers of the (cosine shaped) plane waves can be derived as
k_{{x,y,z},m} = m_{x,y,z} π / L_{x,y,z} .  (2.78)
It will further be assumed that the rigid room is excited by a point source placed at the position x₀; Figure 2.14 illustrates the configuration. For this purpose a solution of the inhomogeneous wave equation (2.32) under the given boundary conditions has to be found. Since it can be shown that the modes Φ_m(x) form a set of mutually orthogonal functions, it is possible to perform a modal expansion of the inhomogeneous part of the wave equation. The expansion coefficients A_m(ω) can be derived by comparison of the expansion coefficients for the homogeneous solution and the excitation. The resulting wave field for a point source and rigid walls is
P(x, ω) = (4π/V) P̂(ω) Σ_m Φ_m(x) Φ_m(x₀) / (k² − k_m²) ,  (2.79)
where k_m = |k_m| denotes the modal wavenumber, P̂(ω) the spectrum of the point source and V = L_x L_y L_z the volume of the room. A consequence of Eq. (2.79) is that a room with rigid boundaries will exhibit resonances at the discrete frequencies ω_m = k_m c.
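For rigid walls, the resonance frequencies follow directly from the modal wavenumbers of Eq. (2.78). A small sketch (room dimensions and function name are illustrative, not from the thesis):

```python
import numpy as np
from itertools import product

def mode_frequencies(Lx, Ly, Lz, c=343.0, max_order=4):
    """Resonance frequencies f_m = k_m * c / (2*pi) of a rigid rectangular room,
    using the rigid-wall wavenumbers k_{x,m} = m_x*pi/Lx etc. from Eq. (2.78)."""
    freqs = []
    for mx, my, mz in product(range(max_order + 1), repeat=3):
        if mx == my == mz == 0:
            continue  # skip the trivial constant mode
        km = np.pi * np.sqrt((mx / Lx) ** 2 + (my / Ly) ** 2 + (mz / Lz) ** 2)
        freqs.append((km * c / (2 * np.pi), (mx, my, mz)))
    return sorted(freqs)

# The lowest mode of a 5 x 4 x 3 m room is the axial mode along the longest dimension
for f, m in mode_frequencies(5.0, 4.0, 3.0)[:3]:
    print(f"{f:6.1f} Hz  mode {m}")
```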
The results derived so far can be generalized to the case of boundary conditions of the
third kind (see Section 2.5). The acoustic properties of the walls are then described by
their specific acoustic impedance Zs . It can be shown [Mec02] that the fundamental
solution for this generalized case is still given by Eq. (2.75). Unfortunately, it is not
possible to derive the wave vectors km (spatial eigenfrequencies) of the modes explicitly.
In the general case, they have to be calculated numerically. However, for a room with
nearly rigid walls
Zs 
1,
(2.80)
0 c
it can be shown [Pie91] that a reasonable approximation is given by replacing the modal wavenumber k_m² in Eq. (2.79) by a complex wavenumber whose imaginary part is proportional to the real part of the wall admittance. This approximated solution is given as follows

P(x, ω) = (4π/V) P̂(ω) Σ_m Φ_m(x) Φ_m(x₀) / (k² − k_m² + j k δ_m / c) ,  (2.81)

where δ_m denotes a damping constant proportional to the real part of the wall admittance.
Chapter 3
Fourier Analysis of Wave Fields
Fourier analysis has been used in many fields of engineering over the last decades. Its success is, among other reasons, due to efficient implementations, like the fast Fourier transform (FFT), which are widely available. The following chapter will first review Fourier analysis of spatio-temporal signals. The specialization to acoustic wave fields yields the plane wave decomposition, which will be introduced and discussed in the remainder of this chapter.
3.1
The Fourier transformation of signals has proven to be a powerful tool for system and signal analysis. Acoustic pressure fields have, in general, four degrees of freedom: three spatial dimensions and one temporal dimension. They can be understood as four-dimensional signals. To keep the notation brief, this special type of multidimensional signal will be denoted as multidimensional or spatio-temporal signal within the remainder of this thesis. The most commonly used form of the Fourier transformation is the time-domain Fourier transformation of signals. This transformation was already introduced in Section 2.1. The Fourier transformation of one-dimensional signals can be extended straightforwardly to multidimensional signals, as will be shown in the sequel. This section proceeds as follows: First, the multidimensional Fourier transformation of generic signals which depend on one temporal and up to three spatial dimensions will be introduced. As shown in Chapter 2, it is convenient to use different coordinate systems for the description of wave fields. Therefore, the multidimensional Fourier transformation will be specialized to the Cartesian and cylindrical coordinate systems in a next step.
3.1.1
The temporal Fourier transform pair of a signal p(t) is given as

P(ω) = F_t{p(t)} = ∫_{−∞}^{∞} p(t) e^{−jωt} dt ,  (3.1a)
p(t) = F_t^{−1}{P(ω)} = (1/2π) ∫_{−∞}^{∞} P(ω) e^{jωt} dω ,  (3.1b)

where ω denotes the temporal frequency and F_t{·} the Fourier transformation with respect to the time t. Time-domain Fourier transformed signals are denoted by capital letters within this thesis. The properties and theorems of the time-domain Fourier transformation will not be discussed here; please refer to the literature, e. g. [GRS01, HV99].
The temporal Fourier transformation can be extended straightforwardly to spatio-temporal signals p(x, t). The Fourier transform pair for a multidimensional signal p(x, t) is then given as follows [JD93, Zio95]

P̃(k, ω) = F_{x,t}{p(x, t)} = ∫_{x∈R^D} ∫_{−∞}^{∞} p(x, t) e^{j⟨k,x⟩ − jωt} dt dV ,  (3.2a)
p(x, t) = F_{x,t}^{−1}{P̃(k, ω)} = 1/(2π)^{D+1} ∫_{k∈R^D} ∫_{−∞}^{∞} P̃(k, ω) e^{−j⟨k,x⟩ + jωt} dω dK ,  (3.2b)

where k denotes the wave vector, D = {1, 2, 3} the spatial dimensionality of the signal, ⟨k, x⟩ the inner product of k and x, and dV and dK volume elements of the position and wave vector space, respectively. Spatially Fourier transformed signals will be denoted by a tilde over the variable within this thesis. The spatial part of the transformation, as formulated above, is independent of the particular coordinate system used; the inner product ⟨k, x⟩ and the volume elements dV and dK have to be specialized to the coordinate system used. Please note that the exponential terms for the spatial and the temporal part have opposite signs. This choice of signs accounts for the spatio-temporal propagation of plane waves, as will be shown in Section 3.3.1. However, in the literature different choices for the signs of the temporal and spatial part may be found. The Fourier transformation and its inverse form a complete set of transformations: a signal can be transformed into the wavenumber-frequency domain with the Fourier transformation (3.2a) and afterwards back into the space-time domain using the inverse Fourier transformation (3.2b) without any information loss.
Examination of the spatio-temporal Fourier integrals (3.2) shows that the integration can be split into a spatial and a temporal part. The spatio-temporal Fourier transformation (3.2a) can be expressed as a temporal Fourier transformation of the signal followed by a spatial Fourier transformation

P̃(k, ω) = F_x{ F_t{ p(x, t) } } = F_x{P(x, ω)} .  (3.3)
Figure 3.1: The spatio-temporal Fourier transformation of a signal p(x, t): applying F_t and F_x in either order (via P(x, ω) or p̃(k, t)) yields P̃(k, ω).
Alternatively, the spatial transformation can be applied to the temporally transformed signal P(x, ω)

P̃(k, ω) = F_x{P(x, ω)} = ∫_{x∈R^D} P(x, ω) e^{j⟨k,x⟩} dV ,  (3.4a)
P(x, ω) = F_x^{−1}{P̃(k, ω)} = 1/(2π)^D ∫_{k∈R^D} P̃(k, ω) e^{−j⟨k,x⟩} dK .  (3.4b)
These transformations will be referred to as (spatial) Fourier transformations of a multidimensional signal in the following. Figure 3.1 gives an overview of the spatio-temporal Fourier transformations of signals as introduced in this section. Up to now, this formulation is independent of the particular coordinate system used for the spatial variables. The following two sections will specialize the transformations above to the Cartesian and cylindrical coordinate systems.
3.1.2 Fourier Transformation in Cartesian Coordinates

In Cartesian coordinates the inner product of wave and position vector is given as

⟨k_C, x_C⟩ = k_C^T x_C = k_x x + k_y y + k_z z .  (3.5)
Introducing this definition into the Fourier transformation (3.4) yields the Fourier transformation of a signal whose spatial dependency is given in Cartesian coordinates

P̃_C(k_C, ω) = F_x{P_C(x_C, ω)} = ∫_{R³} P_C(x_C, ω) e^{j k_C^T x_C} dV ,  (3.6a)
P_C(x_C, ω) = F_x^{−1}{P̃_C(k_C, ω)} = 1/(2π)³ ∫_{R³} P̃_C(k_C, ω) e^{−j k_C^T x_C} dK ,  (3.6b)
where dV = dx dy dz and dK = dk_x dk_y dk_z denote the space and wavenumber volume elements used for integration. The multidimensional Fourier transformation formulated in Cartesian coordinates is normally referred to as the multidimensional Fourier transformation in the literature. The Fourier transformation exhibits several properties and theorems which are of interest within this thesis. These will be reviewed briefly in the following. For a detailed discussion of the properties of multidimensional Fourier transformations please refer to the literature, e. g. [Bam89, Bra78].
Separability

The exponential terms in the Fourier integrals (3.6) can be split into a product of exponential terms involving only one spatial dimension each. Thus, the integration can be performed independently for each spatial dimension. Since the Fourier transformation is also separable in the temporal dimension, the spatio-temporal Fourier transformation formulated in Cartesian coordinates is fully separable in all variables. This property is often exploited in practical implementations.
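This separability is exactly what FFT-based implementations exploit: the multidimensional DFT of a sampled signal factors into one-dimensional DFTs along each axis. A quick numpy check (array sizes are arbitrary):

```python
import numpy as np

# A random two-dimensional spatial signal sampled on a grid
rng = np.random.default_rng(0)
p = rng.standard_normal((32, 48))

# Full multidimensional DFT ...
P_full = np.fft.fftn(p)

# ... equals one-dimensional DFTs applied separately along each axis
P_sep = np.fft.fft(np.fft.fft(p, axis=0), axis=1)

print(np.allclose(P_full, P_sep))   # True
```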
Scaling theorem

The scaling theorem of the Fourier transformation relates the Fourier transformation of a signal whose spatial variable has been scaled to its unscaled Fourier transformation. Let the Fourier transformation of a spatio-temporal signal be given as P̃_C(k_x, k_y, k_z, ω) = F_x{P_C(x, y, z, ω)}. The Fourier transformation of this signal when scaling the x-coordinate by the constant factor a is given as

F_x{P_C(a x, y, z, ω)} = (1/|a|) P̃_C(k_x/a, k_y, k_z, ω) .  (3.7)
The same result applies to a scaling of the other spatial variables, since the Fourier
transformation in Cartesian coordinates is fully separable for all spatial variables.
Convolution theorem

The convolution theorem relates the convolution of two signals to their Fourier transformations. The convolution of two spatio-temporal signals is defined in Appendix B.1 by Eq. (B.8). Introducing this definition into the Fourier transformation (3.6a) yields

F_{x,t}{f_C(x_C, t) ∗_{x,t} h_C(x_C, t)} = F̃_C(k_C, ω) H̃_C(k_C, ω) .  (3.8)

Thus, the Fourier transformation of a (spatial) convolution of two signals is given as the multiplication of the Fourier transformations of the two signals.
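On a sampled grid, the continuous theorem (3.8) corresponds to the DFT convolution theorem, which holds for circular (periodic) convolution and can be checked directly (array sizes and the sample index are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal((16, 16))
h = rng.standard_normal((16, 16))

# Circular convolution computed through the Fourier domain: F * H, then inverse DFT
conv = np.real(np.fft.ifftn(np.fft.fftn(f) * np.fft.fftn(h)))

# Direct evaluation of the periodic convolution sum for one output sample
i, j = 5, 11
s = sum(f[m, n] * h[(i - m) % 16, (j - n) % 16]
        for m in range(16) for n in range(16))
print(np.isclose(conv[i, j], s))   # True
```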
3.1.3 Fourier Transformation in Cylindrical Coordinates
The Fourier transformation (3.6), as introduced in the previous section, is based on the Cartesian coordinate system. However, problems with cylindrical symmetry can often be described more conveniently in cylindrical coordinates. This section will introduce the Fourier transformation of signals whose spatial dependency is given in cylindrical coordinates.
The cylindrical coordinate system, as used within this work, is introduced in Section B.3. Using the relations given there, the inner product of k_Y and x_Y in cylindrical coordinates becomes

⟨k_Y, x_Y⟩ = k_r r cos(α − θ) + k_z z ,  (3.9)

where x_Y = [α r z]^T denotes the position vector and k_Y = [θ k_r k_z]^T the wave vector in cylindrical coordinates. The volume elements dV and dK in Eq. (3.4) can be expressed as dV = r dα dr dz and dK = k_r dθ dk_r dk_z, respectively. The Fourier transformation and its inverse in cylindrical coordinates are then derived as follows
P̃_Y(k_Y, ω) = F_x{P_Y(x_Y, ω)} = ∫_{−∞}^{∞} ∫_0^{∞} ∫_0^{2π} P_Y(x_Y, ω) e^{j k_r r cos(α−θ) + j k_z z} r dα dr dz ,  (3.10a)
P_Y(x_Y, ω) = F_x^{−1}{P̃_Y(k_Y, ω)} = 1/(2π)³ ∫_{−∞}^{∞} ∫_0^{∞} ∫_0^{2π} P̃_Y(k_Y, ω) e^{−j k_r r cos(α−θ) − j k_z z} k_r dθ dk_r dk_z .  (3.10b)
Figure 3.2 illustrates the connections between the Fourier transformations formulated in Cartesian and cylindrical coordinates. The interconnection of both is given by the conversion of the position vector and the wave vector into Cartesian or cylindrical coordinates, respectively (x = r cos α, y = r sin α, z = z and k_x = k_r cos θ, k_y = k_r sin θ, k_z = k_z).

Figure 3.2: Illustration of the connections between the Fourier transformations formulated in Cartesian and cylindrical coordinates.

The transformations (3.10), as introduced above, assume three spatial dimensions. However, they can be specialized straightforwardly to the two-dimensional case by assuming that the signals do not depend on the z-coordinate (see Section 2.1.2). The resulting polar Fourier transformation can be derived from Eq. (3.10) by discarding the
z-dependent parts and replacing the radial wavenumber k_r by the wavenumber k

P̃_P(k_P, ω) = ∫_0^{∞} ∫_0^{2π} P_P(x_P, ω) e^{j k r cos(α−θ)} r dα dr .  (3.12)

Scaling theorem

For the angle and height variables the
same scaling theorem as for the Cartesian Fourier transformation applies here. For the
radial variable the scaling theorem will be derived in the following. Lets assume a signal
given in cylindrical coordinates PY (xY , ) has the Fourier transformation PY (kY , ). The
problem is to find the Fourier transformation of the radially scaled signal PY (, a r, z, )
in terms of the Fourier transformation of the unscaled signal. By substitution of a r into
the definition of the cylindrical Fourier transformation (3.10a) the desired relation can be
found as
1
kr
Fx {PY (, a r, z, )} = 2 PY (, , z, ) .
(3.13)
a
a
Thus, scaling in the radial direction only influences the radial wavenumber k_r of the Fourier transformed signal. This result is evident when considering the circular basis of the cylindrical Fourier transformation: scaling of the radial variable is a natural operation on this basis and will not influence the angle or height dependent parts.
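The radial scaling theorem (3.13) can be checked by simple quadrature of the Hankel transform definition introduced below in this section (a radial integral against a Bessel kernel); the Gaussian test signal, the quadrature grid and the helper name are arbitrary choices of this sketch:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.special import j0

def hankel0(f_vals, r, k):
    """Zeroth-order Hankel transform, integral of f(r) * J0(k r) * r dr, by quadrature."""
    return trapezoid(f_vals * j0(k * r) * r, r)

r = np.linspace(0.0, 12.0, 40001)
a, k = 1.5, 2.0
f = lambda x: np.exp(-x**2)            # radially symmetric test signal

lhs = hankel0(f(a * r), r, k)          # transform of the radially scaled signal
rhs = hankel0(f(r), r, k / a) / a**2   # (1/a**2) * F(k/a), as stated by Eq. (3.13)
print(abs(lhs - rhs))                  # close to zero
```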
Convolution theorem

The convolution theorem of the Fourier transformation in Cartesian coordinates (3.8) gives a simple relation between the convolution of two signals and their Fourier transformations. It would be desirable to find a similar relationship for the cylindrical Fourier transformation. This relationship can be found by transforming both sides of Eq. (3.8) from Cartesian to cylindrical coordinates. However, the convolution operation in Cartesian coordinates is then transformed into its representation in cylindrical coordinates as given by Eq. (B.23). Since this representation is quite different from the definition of the convolution in Cartesian coordinates, it will be termed cylindrical convolution in the following. The overall result is then given as

F_x{G_Y(x_Y, ω) ⊛_x H_Y(x_Y, ω)} = G̃_Y(k_Y, ω) H̃_Y(k_Y, ω) ,  (3.14)

where ⊛_x denotes the cylindrical convolution.
3.1.4

The angular dependency of a signal given in cylindrical coordinates is periodic with period 2π and can therefore be expanded into a Fourier series

P_Y(α, r, z, ω) = Σ_{ν=−∞}^{∞} P̊(ν, r, z, ω) e^{jνα} ,  (3.15)
where ν ∈ Z denotes the angular frequency. The expansion coefficients P̊(ν, r, z, ω) are given as

P̊(ν, r, z, ω) = F_S{P_Y(α, r, z, ω)} = (1/2π) ∫_0^{2π} P_Y(α, r, z, ω) e^{−jνα} dα .  (3.16)
The Fourier series transform pair will be denoted as Fourier series (FS) in the following, the expansion coefficients by a circle over the respective variable. The series expansion (3.15) will now be plugged into the definition of the cylindrical Fourier transformation (3.10a). Since the z-coordinate can be transformed independently from the angular and radial coordinates, it is sufficient to consider only the latter two; in order to simplify the notation, the Fourier transformation of the z-coordinate will not be mentioned explicitly. Plugging Eq. (3.15) into the definition of the cylindrical Fourier transformation (3.10a) yields
P̃_Y(θ, k_r, z, ω) = Σ_{ν=−∞}^{∞} ∫_0^{∞} P̊(ν, r, z, ω) r h_ν(θ, r, k_r) dr ,  (3.17)

where h_ν(θ, r, k_r) abbreviates the angular integral. It can be solved by using the substitution α = β + θ − π/2 [Wei03]

h_ν(θ, r, k_r) = ∫_0^{2π} e^{jνβ} e^{jν(θ−π/2)} e^{j k_r r sin β} dβ = 2π j^ν J_ν(k_r r) e^{jνθ} ,  (3.18)

where J_ν(·) denotes the ν-th order Bessel function of the first kind [AS72]. Substitution of
this result into Eq. (3.17) results in

P̃_Y(θ, k_r, z, ω) = 2π Σ_{ν=−∞}^{∞} j^ν e^{jνθ} ∫_0^{∞} P̊(ν, r, z, ω) J_ν(k_r r) r dr .  (3.19)
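The closed-form result for the angular integral h_ν in Eq. (3.18) can be verified numerically; the trapezoidal rule is very accurate here because the integrand is smooth and periodic (the test values below are arbitrary):

```python
import numpy as np
from scipy.special import jv

nu, z, theta = 3, 2.5, 0.7            # order nu, argument z = k_r * r, angle theta
N = 4096
alpha = 2 * np.pi * np.arange(N) / N
integrand = np.exp(1j * nu * alpha) * np.exp(1j * z * np.cos(alpha - theta))
h_num = np.sum(integrand) * 2 * np.pi / N    # trapezoidal rule on a periodic integrand

h_ref = 2 * np.pi * 1j**nu * jv(nu, z) * np.exp(1j * nu * theta)
print(np.isclose(h_num, h_ref))   # True
```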
The remaining radial integral equals the definition of the ν-th order Hankel transformation [Gas78, Sne72]. This transformation will be denoted in the following as

H_{ν,r}{f(r)} = ∫_0^{∞} f(r) J_ν(k_r r) r dr .  (3.20)
Thus, the cylindrical Fourier transformation can be expressed in terms of a discrete Fourier series whose coefficients are given by Hankel transformations of the Fourier series coefficients of the pressure field. Applying the same steps to the inverse cylindrical Fourier transformation yields the Fourier transformations (3.10) in terms of Fourier series as follows

P̃_Y(k_Y, ω) = 2π Σ_{ν=−∞}^{∞} j^ν H_{ν,r}{P̊(ν, r, k_z, ω)} e^{jνθ} ,  (3.21a)
P_Y(x_Y, ω) = (1/2π) Σ_{ν=−∞}^{∞} (−j)^ν H_{ν,k_r}{P̊̃_Y(ν, k_r, z, ω)} e^{jνα} ,  (3.21b)

where P̊̃_Y denotes the Fourier series coefficients of P̃_Y.
[Diagram: the Fourier series F_S maps P_Y(x_Y, ω) to the coefficients P̊(ν, r, z, ω); the operator 2π j^ν H_{ν,r} maps these to the Fourier series coefficients of P̃_Y(k_Y, ω); F_S^{−1} completes the inverse path.]
3.2
The concept of the Fourier transformation is strongly related to the theory of linear systems. For example, the time-domain Fourier transformation is widely used for the analysis of linear systems in the frequency domain. The following section will discuss the relation of multidimensional Fourier transformations to linear systems and will illustrate that the wave equation can be understood as a linear system. As for the multidimensional Fourier transformation, the discussion will be limited to spatio-temporal signals which depend on one temporal and up to three spatial dimensions.
A one-dimensional system maps a one-dimensional input signal to a one-dimensional output signal. For a time-domain signal this mapping is typically denoted as follows

p(t) = S{q(t)} ,    (3.22)

where S denotes the system, q(t) the input signal and p(t) the output signal. An example of such a system is a wireless transmission system for a speech signal. The concept of one-dimensional systems can be extended straightforwardly to spatio-temporal signals. The mapping between the input q(x, t) and the output signal p(x, t) can then be written as

p(x, t) = S{q(x, t)} .    (3.23)
In general, a system may map several input signals to several output signals

p(x, t) = S{q(x, t)} ,    (3.24)

where q(x, t) and p(x, t) denote vectors of input and output signals, respectively. Such a system is termed a multiple-input/multiple-output (MIMO) system. If there is only a single input or output, the system will be termed a single-input/multiple-output (SIMO) or multiple-input/single-output (MISO) system, and consequently a system with one input and one output a single-input/single-output (SISO) system.
Systems can be characterized by considering certain properties of the connection between
the input and output signal. The next section will introduce a common classification of
(multidimensional) systems which is based on their properties.
3.2.1 Classification of Systems
This section gives a brief overview of the properties of systems which are useful within the context of this thesis. For a more in-depth discussion of the properties of systems, please refer to the literature [GRS01, Bam89]. Multidimensional systems can be classified as follows:
Linear systems
A system is said to be linear if a superposition of scaled input signals leads to the superposition of the correspondingly scaled output signals, where each output signal is the response to the particular input signal. This principle can be formulated as follows

S_L{A1 q1(x, t) + A2 q2(x, t)} = A1 p1(x, t) + A2 p2(x, t) ,    (3.25)

where A1, A2 denote arbitrary constants and q_i(x, t) the input signal that produces the output p_i(x, t) = S{q_i(x, t)}. If only one input signal is present, the response of a linear system to a scaled input signal is an output signal which is scaled accordingly. The discussion of systems will be limited to linear systems within this thesis. It will be assumed in the following that the linearity condition given by Eq. (3.25) is always met.
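As a concrete illustration (a discrete sketch, not taken from the thesis), a convolution system is linear by construction and satisfies the superposition principle (3.25) exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.standard_normal(16)        # impulse response of an example discrete system

def system(q):
    """An example linear system: discrete convolution of the input with h."""
    return np.convolve(q, h)

q1, q2 = rng.standard_normal(64), rng.standard_normal(64)
A1, A2 = 2.5, -0.7
lhs = system(A1 * q1 + A2 * q2)            # response to the superposed input
rhs = A1 * system(q1) + A2 * system(q2)    # superposition of the individual responses
assert np.allclose(lhs, rhs)               # the linearity condition (3.25) holds
```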
Shift-invariant systems
A system is termed time-shift invariant if a temporal shift of the input signal results only in the same temporal shift of the output signal

S_TI{q(x, t − t0)} = p(x, t − t0) .    (3.26)

Accordingly, a system is termed space-shift invariant if

S_SI{q(x − x0, t)} = p(x − x0, t) ,    (3.27)

and time- and space-shift invariant if both properties hold jointly

S_TSI{q(x − x0, t − t0)} = p(x − x0, t − t0) ,    (3.28)

where t0 denotes an arbitrary delay, x0 an arbitrary shift vector and p(x, t) = S{q(x, t)}. A system which is linear and both time- and space-shift invariant will be referred to as a linear time- and space-shift invariant (LTSI) system.
3.2.2 The Wave Equation as Multidimensional System
System property                   Required conditions
linear system                     linear medium
                                  linear boundary conditions
                                  homogeneous boundary conditions
time-shift invariant system       medium characteristics are constant over time
                                  boundary conditions are constant over time
space-shift invariant system      unlimited domain
                                  no boundary conditions are present
                                  homogeneous medium
Table 3.1: Relations between system properties, medium characteristics and boundary conditions for the inhomogeneous wave equation.
result, the inhomogeneous wave equation (2.26) consists of a linear superposition of differential operators in time and space. Since these operators are linear, the inhomogeneous wave equation can be interpreted as a linear system when discarding the boundary conditions. If boundary conditions are present, then these also have to be linear and homogeneous in order for the system to be linear. Summarizing, wave propagation can be understood as a linear system if the medium characteristics are linear and the boundary conditions are linear and homogeneous.
The differential operators in the wave equation (2.5) are time- and space-invariant. However, for the wave equation to represent a time-shift invariant system, the medium characteristics and boundary conditions have to be constant over time. As boundary conditions define boundaries in the spatial domain, space-shift invariance can only be obtained when no boundary conditions are present. This implies that the domain on which the inhomogeneous wave equation is defined is unlimited. Additionally, the medium characteristics have to be constant over space. Thus, space-shift invariance can only be derived for free-field propagation in a homogeneous medium. Table 3.1 summarizes the results. If the process of wave propagation is to represent an LTSI system, then all of the requirements on the right hand side of Table 3.1 have to hold.
3.2.3 Fourier Analysis of LTSI Systems
The time-domain Fourier transformation provides a powerful tool for the analysis of LTI systems. The same applies to multidimensional LTSI systems and the multidimensional Fourier transformation, as will be shown in the following.
Using the sifting property of the Dirac function, the response p(x, t) of a generic linear system S_L can be expressed as

p(x, t) = S_L{ ∫_V ∫ q(x′, τ) δ(x − x′, t − τ) dτ dV′ } = ∫_V ∫ q(x′, τ) S_L{δ(x − x′, t − τ)} dτ dV′ ,    (3.29)

where dV′ denotes a suitably chosen volume element for the integration. Equation (3.29) states that the response of the system S_L can be expressed as an integral involving the input signal q(x, t) and the response of the system to shifted Dirac pulses. The response
of a system to a Dirac pulse is typically denoted as the impulse response of the system. In general, the impulse response depends on the time t, the position vector x and their shifts

S_L{δ(x − x′, t − τ)} = h(x, x′, t, τ) ,    (3.30)

where h(x, x′, t, τ) denotes the impulse response. However, if the system under consideration is an LTSI system, then h(x, x′, t, τ) reduces to a spatio-temporally shifted impulse response

S_LTSI{δ(x − x′, t − τ)} = h(x − x′, t − τ) .    (3.31)

In this case, the integral (3.29) represents a multidimensional convolution. Thus, the system output p(x, t) can be expressed as the spatio-temporal convolution of the input signal q(x, t) with the impulse response h(x, t) of the system

p(x, t) = S_LTSI{q(x, t)} = q(x, t) ∗_{x,t} h(x, t) .    (3.32)
An interesting class of input signals are the spatio-temporal complex exponential functions

e(x, t) = e^{j(ω0 t − k0^T x)} .    (3.33)

Introducing e(x, t) into Eq. (3.32) and using the definition of the multidimensional Fourier transformation (3.2a) yields

p(x, t) = e(x, t) ∗_{x,t} h(x, t) = H̃(k0, ω0) e(x, t) .    (3.34)

Since the output signal is simply a scaled version of the input signal, e(x, t) represents an eigenfunction of an LTSI system. The corresponding eigenvalue is given by the multidimensional transfer function H̃(k, ω), the Fourier transformation of the impulse response h(x, t). Transforming Eq. (3.32) into the frequency-wave vector domain by applying the convolution theorem yields

P̃(k, ω) = Q̃(k, ω) H̃(k, ω) .    (3.35)

Thus, in the spatio-temporal Fourier transformed domain the system response is derived by multiplication of the input signal with the multidimensional transfer function H̃(k, ω).
Equation (3.35) can be interpreted as follows: the input signal and the impulse response of the system are decomposed into the eigenfunctions of an LTSI system, which provide a suitable basis for this purpose. The response of the system in this eigenspace is given as a simple multiplication by the corresponding eigenvalues. Since the kernel of the Fourier transformation (3.2) is equivalent to the eigenfunctions of an LTSI system, it provides a suitable transformation for the desired decomposition.
The time- and space-shift invariance of the system and its impulse response is explicitly required in order to express the system response as a multidimensional convolution. If the system is time- and/or space-shift variant, then the impulse response is not invariant to temporal and/or spatial shifts. As a result, the output of the system cannot be expressed as a multidimensional convolution in the time-space domain or as a product in the frequency-wave vector domain.
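For sampled signals, the correspondence between the spatio-temporal convolution (3.32) and the multiplication with the transfer function (3.35) can be reproduced with the discrete Fourier transformation. A minimal sketch (assuming NumPy; the DFT implies periodicity, so the discrete convolution below is circular):

```python
import numpy as np

rng = np.random.default_rng(1)
Nx, Nt = 16, 32
q = rng.standard_normal((Nx, Nt))   # sampled input signal q(x, t)
h = rng.standard_normal((Nx, Nt))   # sampled impulse response h(x, t)

# system response via the transfer function: multiplication in the
# wavenumber-frequency domain, cf. Eq. (3.35)
p = np.fft.ifft2(np.fft.fft2(q) * np.fft.fft2(h)).real

# direct circular spatio-temporal convolution, cf. Eq. (3.32)
p_direct = np.zeros((Nx, Nt))
for m in range(Nx):
    for n in range(Nt):
        for i in range(Nx):
            for j in range(Nt):
                p_direct[m, n] += q[i, j] * h[(m - i) % Nx, (n - j) % Nt]
assert np.allclose(p, p_direct)
```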
3.3 The Plane Wave Decomposition
Up to now, the signals and systems to be analyzed were treated as generic four-dimensional spatio-temporal signals and systems. The process of wave propagation, as covered by the acoustic wave equation, can be understood as a spatio-temporal system and the corresponding wave fields as spatio-temporal signals. Hence, the multidimensional Fourier transformation provides a useful tool for system and signal analysis in this context. Since acoustic wave fields exhibit characteristic properties, it is reasonable to specialize the Fourier transformation to the case of acoustic wave fields.
It was derived in Section 2.2 that plane waves are eigensolutions of the acoustic wave equation formulated in Cartesian coordinates. Thus, by decomposing a wave field into its plane wave contributions it can be represented as a superposition of plane waves. However, this requires a specialized transformation that derives the expansion coefficients in terms of plane waves. Because of this foundation, the transformation will be termed plane wave decomposition.
This section will introduce the continuous plane wave decomposition. Furthermore, the properties and theorems of the plane wave decomposition and its representations in terms of other transformations will be derived.
Figure 3.5: Pressure field of a monochromatic plane wave traveling in the xy-plane and its spatial Fourier transformation. The shown plane wave has an incidence angle of θ0 = 45° and a frequency of f0 = 1000 Hz. The circle denotes the position of the spatial Dirac pulse.
3.3.1 Spatial Fourier Transformation of a Plane Wave
In order to gain more insight into the connections between the Fourier transformation of a wave field and its expansion into plane waves, the spatial Fourier transformation of a monochromatic plane wave is investigated in the following.
Introducing the pressure field of a monochromatic plane wave, as given by Eq. (2.47), into the definition of the spatial Fourier transformation (3.6a) yields

P̃_C(k_C, ω) = F_x{ δ(ω − ω0) e^{−j k_{C,0}^T x_C} } = (2π)³ δ(ω − ω0) δ(k_C − k_{C,0}) ,    (3.36)

where k_{C,0} denotes the wave vector of the plane wave. Equation (3.36) states that the spatial part of the Fourier transformation of a plane wave equals a multidimensional Dirac pulse at the position k_C = k_{C,0}. Figure 3.5 illustrates this result for a monochromatic plane wave traveling in the xy-plane. The wave vector k_{C,0} of a plane wave is orthogonal to its isophase planes; its angle represents the incidence angle of the plane wave. As given by Eq. (2.13) and Eq. (2.8), the absolute value of the wave vector of a monochromatic plane wave is linked to its temporal frequency by

k0² = |k_{C,0}|² = ω0²/c² .    (3.37)
Equation (3.37) states that the absolute value of the wave vector k_{C,0} is related to the temporal frequency by the speed of sound. Thus, the position of the Dirac pulse resulting from the spatial Fourier transformation of a monochromatic plane wave will lie on the surface of a sphere with the radius ω0/c. The angle under which the Dirac pulse is seen from the origin denotes the incidence angle of the plane wave, since the wave vector of a plane wave is orthogonal to its isophase planes. As stated before, complex pressure fields can be expressed as a superposition of plane waves. Each plane wave has an individual incidence angle and frequency spectrum; these parameters are captured by the expansion coefficients. The spectrum of one particular plane wave can be extracted from the spatial Fourier transformation by evaluating the coefficients along a line under the incidence angle of the considered plane wave. In conclusion, the plane wave expansion coefficients can be derived from a spatial Fourier transformation of the acoustic pressure field.
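This extraction can be reproduced numerically: sampling a monochromatic plane wave on a grid and taking its two-dimensional spatial DFT yields a single spectral peak at the wave vector of the plane wave. A sketch (grid size, extent and frequency are assumptions of this example; the sign conventions follow NumPy's DFT):

```python
import numpy as np

c, f0 = 343.0, 1000.0                  # speed of sound and plane wave frequency
theta0 = np.deg2rad(45.0)              # incidence angle of the plane wave
k0 = 2 * np.pi * f0 / c                # |k| = omega/c (dispersion relation)
N, L = 128, 8.0                        # samples per axis, spatial extent in metres
x = np.linspace(0.0, L, N, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")

# monochromatic plane wave sampled at one time instant
p = np.exp(1j * k0 * (np.cos(theta0) * X + np.sin(theta0) * Y))

P = np.fft.fft2(p)
kgrid = 2 * np.pi * np.fft.fftfreq(N, d=L / N)   # wavenumber axis of the DFT
ix, iy = np.unravel_index(np.argmax(np.abs(P)), P.shape)

# the peak lies on the circle |k| = omega/c under the incidence angle theta0
half_bin = np.pi / L
assert abs(kgrid[ix] - k0 * np.cos(theta0)) <= half_bin
assert abs(kgrid[iy] - k0 * np.sin(theta0)) <= half_bin
```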
3.3.2 Definition of the Plane Wave Decomposition
The previous section revealed that cylindrical coordinates suggest themselves for deriving the plane wave expansion coefficients. In order to further illustrate the benefits of this change of coordinate system, Fig. 3.6 illustrates the geometric relations between the wave vector of a monochromatic plane wave and the position vector, expressed in Cartesian and cylindrical coordinates respectively. The wave vector formulated in cylindrical coordinates k_Y = [θ k_r k_z]^T directly contains the incidence angle θ with respect to the xy-plane, the radial wavenumber k_r and the wavenumber k_z in the z-direction. The inclination of the plane wave against the xy-plane is implicitly given by the wavenumber k_z. Thus, the plane wave expansion coefficients can be derived by applying a Fourier transformation formulated in cylindrical coordinates to the wave field. The radial wavenumber k_r is related to the wavenumber k and the wavenumber k_z in the z-direction by

k_r(k_z, ω) = √(k² − k_z²) = √((ω/c)² − k_z²) ,    (3.38)

where the acoustic dispersion relation (2.8) was introduced to derive the second equality. The dependence of the expansion coefficients on the radial wavenumber k_r can therefore be dropped in the following. Based on these considerations, the plane wave expansion coefficients can be calculated from a specialization of the Fourier transformation formulated in cylindrical coordinates (3.10a) as follows

P̄(θ, k_z, ω) = P_3{ P_Y(x_Y, ω) } = ∫_{−∞}^{∞} ∫_0^{∞} ∫_0^{2π} P_Y(x_Y, ω) e^{j(k_r r cos(α−θ) + k_z z)} r dα dr dz .    (3.39)

Please note that the left hand side of Eq. (3.39) does not depend explicitly on the radial wavenumber k_r due to Eq. (3.38). This specialized transformation decomposes the wave
Figure 3.6: Illustration of the position and wave vector of a monochromatic plane wave in Cartesian and cylindrical coordinates. Only the xy-plane is shown for simplicity. The parallel lines illustrate the isograms of equal phase for a plane wave propagating in the xy-plane.
field P_Y(x_Y, ω) into its plane wave expansion coefficients P̄(θ, k_z, ω). This transformation is termed plane wave decomposition [HdVB01]. The operator performing the plane wave decomposition is denoted by P_D, and plane wave decomposed signals are marked by a bar over the signal throughout this thesis; the index D of the operator denotes the dimensionality of the decomposition. Figure 3.7 illustrates the derivation of the plane wave decomposition from the spatial Fourier transformation. It can be concluded from Eq. (3.38) and Fig. 3.7 that the plane wave expansion components P̄(θ, k_z, ω) of a wave field can be derived from its cylindrical Fourier transformation P̃_Y(θ, k_r, k_z, ω) as follows

P̄(θ, k_z, ω) = P̃_Y(θ, k_r, k_z, ω) δ(k_r − √((ω/c)² − k_z²)) .    (3.40)
For most applications in the context of this thesis, it is not feasible to investigate the wave field within the entire three-dimensional space. However, the analysis of a two-dimensional plane is feasible and sufficient in most situations. It is assumed in the following that the pressure field exhibits no dependence on the z-coordinate, thus P_Y(ω, r, z, α) = P_P(ω, r, α). The plane wave decomposition is then given by specializing the cylindrical Fourier transformation (3.10) under the assumption of independence from the z-coordinate and using the dispersion relation (2.8) as
Figure 3.7: Derivation of the plane wave decomposition from a spatial Fourier transformation of the wave field: the Fourier transformation P̃_C(k_C, ω) = F_x{P_C(x_C, ω)} is expressed in cylindrical coordinates (x_C → x_Y, k_C → k_Y) and evaluated on the dispersion relation (ω/c)² = k_r² + k_z².
P̄(θ, ω) = P_2{ P_P(x_P, ω) } = ∫_0^{∞} ∫_0^{2π} P_P(x_P, ω) e^{jkr cos(α−θ)} r dα dr ,    (3.41)

with k = ω/c. The two-dimensional decomposition (3.41) is linked to the three-dimensional one for z-independent fields by

P̄(θ, k_z, ω) = 2π δ(k_z) P̄(θ, ω) .    (3.42)

Hence, in most cases the results derived within this thesis for the two-dimensional plane wave decomposition can be generalized straightforwardly to three dimensions using Eq. (3.42). The two-dimensional plane wave decomposition is also referred to as polar or radial Fourier transformation in the literature [ACD+03] due to the underlying coordinate system. Figure 3.8 illustrates the relation between the (spatial) Fourier transformation and the plane wave decomposition of a wave field.
Figure 3.8: Illustration of the connections between the spatial Fourier transformation and the plane wave decomposition of a two-dimensional wave field: P_C(x_C, ω) is mapped by F_x to P̃_C(k_C, ω); the coordinate transformations x = r cos α, y = r sin α and k_x = k cos θ, k_y = k sin θ together with the dispersion relation k² = (ω/c)² connect P_P(x_P, ω) and P̄(θ, ω).
3.3.2.1
(3.43)
3.3.3 Inverse Plane Wave Decomposition
The inverse three-dimensional plane wave decomposition is derived analogously to the plane wave decomposition. Using the inverse Fourier transformation formulated in cylindrical coordinates (3.10b) and discarding the integral over dk_r due to Eq. (3.38) and Eq. (3.40) yields the inverse plane wave decomposition as

P_Y(x_Y, ω) = P_3^{−1}{ P̄(θ, k_z, ω) } = 1/(2π)³ ∫_{−∞}^{∞} ∫_0^{2π} k_r P̄(θ, k_z, ω) e^{−j(k_r r cos(α−θ) + k_z z)} dθ dk_z ,    (3.44)

where k_r = k_r(k_z, ω) according to Eq. (3.38).
The two-dimensional inverse plane wave decomposition follows accordingly as

P_P(x_P, ω) = P_2^{−1}{ P̄(θ, ω) } = k/(2π)² ∫_0^{2π} P̄(θ, ω) e^{−jkr cos(α−θ)} dθ .    (3.45)

Unless explicitly mentioned otherwise, the two-dimensional inverse plane wave decomposition will be used and referred to as inverse plane wave decomposition in the remainder of this work. Introducing Eq. (3.45) into Eq. (3.41) proves that a wave field can be transformed into its plane wave expansion coefficients and back into its spatial representation without loss of information.
The inverse plane wave decomposition can be interpreted as the superposition of plane waves from all directions θ. The frequency spectrum of a particular plane wave is given by its plane wave decomposition P̄(θ, ω). The k term present in the inverse plane wave decomposition can be interpreted as a time-domain filtering process with the frequency characteristic ω/c. This filtering has to be performed when reconstructing the wave field from its plane wave decomposition. As a consequence, the plane wave decomposed signals P̄(θ, k_z, ω) must exhibit a frequency-domain lowpass characteristic. The desired ω/c highpass response in Eq. (3.45) can be realized only approximately in practical implementations. One approach to approximate the desired filter response is to split the required filter onto the forward and the inverse plane wave decomposition using the relation |ω| = √(jω) √(−jω).
3.3.4 Relations to Other Transformations
The following section will derive connections between the plane wave decomposition and other well-known transformations. Additionally, some useful representations and variants of the plane wave decomposition will be introduced.
3.3.4.1 Representation as Hankel Transformation
It was shown in Section 3.1.4 that the cylindrical Fourier transformation can be expressed in terms of the Hankel transformation. The same applies to the plane wave decomposition when introducing the dispersion relation into Eq. (3.21a) and discarding the z-dependent parts. The plane wave decomposition in terms of the Hankel transformation is given as

P̄(θ, ω) = 2π ∑_{ν=−∞}^{∞} j^ν H_{ν,r}{P̊_ν(ω, r)} e^{jνθ} ,    (3.46)

where the radial wavenumber in the Hankel transformation is fixed to k = ω/c.
In the following, the special case of a pressure field that exhibits radial symmetry, P_P(ω, r, α) = P(r, ω), will be considered. The Fourier series expansion coefficients are then given as P̊_ν(ω, r) = P(r, ω) δ[ν], where δ[ν] denotes the discrete unit pulse. Hence, the summation over ν in Eq. (3.46) degenerates to the fixed value ν = 0. The plane wave decomposition for a radially symmetric field is then given as

P̄(ω) = 2π H_{0,r}{P(r, ω)} = 2π ∫_0^∞ P(r, ω) J_0(kr) r dr .    (3.47)

This specialized plane wave decomposition for radially symmetric wave fields is well known as the Fourier-Bessel or zeroth-order Hankel transformation [Pap68]. Please note that the plane wave decomposition P̄(ω) of a radially symmetric field P(r, ω) is also radially symmetric and thus depends only on the frequency ω.
3.3.4.2 Decomposition into Incoming and Outgoing Contributions
The Bessel function J_ν(kr) can be expressed as a sum of Hankel functions [AS72]

J_ν(kr) = ½ [ H^{(1)}_ν(kr) + H^{(2)}_ν(kr) ] ,    (3.48)

where H^{(1)}_ν(·) and H^{(2)}_ν(·) denote the ν-th order Hankel functions of first and second kind. The Hankel functions of first and second kind play an important role in the context of traveling wave solutions of the wave equation formulated in cylindrical coordinates. The physical interpretation of the Hankel functions gained in Section 2.3 is that H^{(1)}_ν(kr) corresponds to a converging incoming wave and H^{(2)}_ν(kr) to a diverging outgoing wave. Thus, Eq. (3.48) can be used to decompose a wave field into incoming and outgoing wave contributions

P(x, ω) = P^{(1)}(x, ω) + P^{(2)}(x, ω) ,    (3.49)

where P^{(1)}(x, ω) denotes the incoming part and P^{(2)}(x, ω) the outgoing part, respectively.
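The identity (3.48) behind this split can be checked directly against SciPy's implementations of the Bessel and Hankel functions:

```python
import numpy as np
from scipy.special import hankel1, hankel2, jv

x = np.linspace(0.5, 20.0, 400)
for nu in range(5):
    # J_nu(x) = ( H^(1)_nu(x) + H^(2)_nu(x) ) / 2, cf. Eq. (3.48)
    assert np.allclose(jv(nu, x), 0.5 * (hankel1(nu, x) + hankel2(nu, x)))
```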
Introducing Eq. (3.48) into Eq. (3.46) and using the definition of the Hankel transformation (3.20) yields

P̄(θ, ω) = π ∑_{ν=−∞}^{∞} j^ν e^{jνθ} ∫_0^∞ P̊_ν(ω, r) H^{(1)}_ν(kr) r dr + π ∑_{ν=−∞}^{∞} j^ν e^{jνθ} ∫_0^∞ P̊_ν(ω, r) H^{(2)}_ν(kr) r dr .    (3.50)
Figure 3.9: Plane wave decomposition of a wave field into incoming and outgoing contributions as given by Eq. (3.52). The decomposition is given as a Fourier series, where the expansion coefficients are computed from the transformation (3.53) of the angular expansion coefficients of the wave field.
Equation (3.50) can be divided into incoming and outgoing plane wave contributions similar to Eq. (3.49)

P̄(θ, ω) = P̄^{(1)}(θ, ω) + P̄^{(2)}(θ, ω) .    (3.51)

This allows the decomposition of a wave field into incoming and outgoing plane wave contributions as follows

P̄^{(1)}(θ, ω) = π ∑_{ν=−∞}^{∞} j^ν H^{(1)}_{ν,r}{P̊_ν(ω, r)} e^{jνθ} ,    (3.52a)

P̄^{(2)}(θ, ω) = π ∑_{ν=−∞}^{∞} j^ν H^{(2)}_{ν,r}{P̊_ν(ω, r)} e^{jνθ} ,    (3.52b)

where the notation H^{(1)}_{ν,r}{·} and H^{(2)}_{ν,r}{·} is chosen in analogy to the definition of the Hankel transformation (3.20). The transformations H^{(1)}_{ν,r}{·} and H^{(2)}_{ν,r}{·} are defined as

H^{(1),(2)}_{ν,r}{f(r)} = ∫_0^∞ f(r) H^{(1),(2)}_ν(kr) r dr .    (3.53)
3.3.4.3 Relation to the Circular Harmonics Expansion
As discussed in Sections 2.2 and 2.3, plane waves and cylindrical waves are eigensolutions of the acoustic wave equation in Cartesian and cylindrical coordinates, respectively. Both eigenfunction bases allow the decomposition of a wave field, into plane waves or into cylindrical harmonics. However, their relation to each other is not obvious. This section will derive the interrelation between the decomposition into plane waves and the decomposition in terms of circular harmonics for two-dimensional wave fields.
The expansion of a wave field into circular harmonics was already introduced in Section 2.3.2. It is given as follows (see Eq. (2.25))

P_P(x_P, ω) = ∑_{ν=−∞}^{∞} P̆^{(2)}_ν(ω) H^{(2)}_ν(kr) e^{jνα} .    (3.54)

This expansion can be interpreted as a Fourier series with respect to the angle α. In order to derive a relation between plane waves and circular harmonics, the angular Fourier series of a plane wave is derived by calculating its expansion coefficients using Eq. (3.16). The result is the following identity

e^{−jkr cos(α−θ)} = ∑_{ν=−∞}^{∞} j^{−ν} J_ν(kr) e^{jν(α−θ)} ,    (3.55)
which is also known as the Jacobi-Anger expansion [Wei03, GR65]. Equation (3.55) expands the pressure field of a plane wave into a series of Bessel functions. This representation of a plane wave can be split into incoming and outgoing cylindrical contributions using the relation between the Bessel and the Hankel functions given by Eq. (3.48), as shown in the previous section. In the following, only the incoming part of the wave field will be considered. Introducing the incoming part of Eq. (3.55) into the inverse plane wave decomposition (3.45) yields
P^{(1)}(x_P, ω) = k/(2π)² ∫_0^{2π} P̄^{(1)}(θ, ω) ½ ∑_{ν=−∞}^{∞} j^{−ν} H^{(1)}_ν(kr) e^{jν(α−θ)} dθ
= k/(2(2π)²) ∑_{ν=−∞}^{∞} [ ∫_0^{2π} j^{−ν} P̄^{(1)}(θ, ω) e^{−jνθ} dθ ] H^{(1)}_ν(kr) e^{jνα} .    (3.56)
Comparing the above result with the decomposition of the acoustic field into cylindrical harmonics (3.54), restricted to the incoming part of the field, yields a relation between the plane wave and circular harmonics expansion coefficients

P̆^{(1)}_ν(ω) = k/(2(2π)²) j^{−ν} ∫_0^{2π} P̄^{(1)}(θ, ω) e^{−jνθ} dθ .    (3.57)
Equation (3.57) relates the circular harmonics expansion coefficients to the plane wave expansion coefficients. However, it was desired to derive a relation between the plane wave
Figure 3.10: Relationship between the expansion into circular harmonics and the plane wave decomposition as given by Eq. (3.58). The frequency dependence is omitted in this diagram.
expansion coefficients and the circular harmonics expansion coefficients. Equation (3.57) bears strong resemblance to the calculation of the Fourier series expansion coefficients given by Eq. (3.16). The desired relation can be found by identifying P̆^{(1)}_ν(ω) as the expansion coefficients of a Fourier series. Performing the steps outlined above also for the outgoing part of the wave field then yields the desired relations as

P̄^{(1)}(θ, ω) = (4π/k) ∑_{ν=−∞}^{∞} j^ν P̆^{(1)}_ν(ω) e^{jνθ} ,    (3.58a)

P̄^{(2)}(θ, ω) = (4π/k) ∑_{ν=−∞}^{∞} j^ν P̆^{(2)}_ν(ω) e^{jνθ} .    (3.58b)

Equation (3.58) states that the plane wave decomposition of a wave field is, up to the factor j^ν, given by the discrete Fourier series of the expansion coefficients in terms of circular harmonics. Figure 3.10 illustrates the derived relationship. Please note that these results are similar to those obtained in [HdVB01], although the derivations differ.
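This link between angular Fourier series coefficients and the plane wave decomposition underlies practical processing with circular microphone arrays: the angular coefficients of a field sampled on a circle are obtained by an FFT, and for a single plane wave they are known in closed form from the Jacobi-Anger expansion (3.55). The following sketch verifies this (radius, frequency and array size are assumptions of this example):

```python
import numpy as np
from scipy.special import jv

c, f0 = 343.0, 1000.0
k = 2 * np.pi * f0 / c
R = 0.5                                   # radius of the sampling circle
theta0 = np.deg2rad(60.0)                 # incidence angle of the plane wave
N = 64                                    # angular sampling points ("microphones")
alpha = 2 * np.pi * np.arange(N) / N

# plane wave e^{-j k R cos(alpha - theta0)} sampled on the circle
p = np.exp(-1j * k * R * np.cos(alpha - theta0))

# angular Fourier series coefficients via the FFT
coeff = np.fft.fft(p) / N
nu = np.arange(-10, 11)

# the Jacobi-Anger expansion (3.55) predicts j^{-nu} J_nu(kR) e^{-j nu theta0}
expected = (-1j) ** nu * jv(nu, k * R) * np.exp(-1j * nu * theta0)
assert np.allclose(coeff[nu % N], expected, atol=1e-10)
```

With N well above 2kR the spatial aliasing contributions are negligible, which is why the sampled coefficients match the continuous ones to machine precision here.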
3.3.4.4 Links between the Representations
The previous sections introduced different representations of the plane wave decomposition. Figure 3.11 illustrates their links and the assumptions made to derive them. Comparing the decomposition in terms of circular harmonics (3.58) with the one in terms of the Hankel transformation (3.52) shows that the circular harmonics expansion coefficients can be derived from a complex Hankel transformation

P̆^{(1),(2)}_ν(ω) = (k/4) H^{(1),(2)}_{ν,r}{P̊_ν(ω, r)} .    (3.59)
Figure 3.11: Different representations of the plane wave decomposition (PWD) and the assumptions used to derive them: the three-dimensional PWD (3.39), the two-dimensional PWD (3.41) for z-independent fields, the PWD as Hankel transformation (3.46), its split into incoming and outgoing contributions (3.52) via J_ν(kr) = ½[H^{(1)}_ν(kr) + H^{(2)}_ν(kr)], and the expansion into circular harmonics (3.58).
3.3.5 Properties and Theorems
The following section will derive useful properties and theorems of the two-dimensional plane wave decomposition. Most of them have counterparts known from the Fourier transformation. Since the plane wave decomposition is a specialization of the cylindrical Fourier transformation to acoustic wave fields, some of the properties of Fourier transformations also apply directly to the plane wave decomposition.
3.3.5.1 Properties
The following properties have been derived from the definition of the plane wave decomposition and the properties of the cylindrical Fourier transformation (see Section 3.1.3).
Linearity
The plane wave transformation is a linear transform, as can easily be proven. As a consequence, the superposition principle applies, which is desirable since this principle also applies to (linear) acoustic fields.
Separability
As for the Fourier transformation in cylindrical coordinates, the plane wave transformation is not separable in the angular and radial directions (α, r). However, the three-dimensional plane wave decomposition is separable in the z-coordinate. As a consequence, the properties and theorems will be derived only for the two-dimensional plane wave decomposition in the following; the well-known theorems of the one-dimensional Fourier transformation then apply for the z-coordinate.
Time-domain filtering of a wave field
Let a wave field p_P(x_P, t) and its plane wave decomposition p̄(θ, t) = P{p_P(x_P, t)} be given. If the wave field is filtered in the time domain by a filter h(t), then its plane wave decomposition is the plane wave decomposition of the unfiltered field, filtered by the same filter. This result can be derived by subsequently carrying out the transformations in the following equality

p̄_h(θ, t) = F_t^{−1}{ P{ F_t{ p_P(x_P, t) ∗_t h(t) }}} = p̄(θ, t) ∗_t h(t) ,    (3.60)

where p̄_h(θ, t) denotes the plane wave decomposition of the filtered wave field. This property is useful, e.g., when measuring the spatio-temporal impulse response of a source: the plane wave decomposed wave field for an arbitrary source excitation is given by a time-domain convolution of the plane wave decomposed impulse response with the desired source signal.
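Since the plane wave decomposition acts only on the spatial variables, it commutes with any time-domain filter. A discrete sketch of this commutation (the matrix T is an arbitrary stand-in for the linear spatial transformation, an assumption of this example):

```python
import numpy as np

rng = np.random.default_rng(2)
Nx, Nt = 8, 128
p = rng.standard_normal((Nx, Nt))   # field sampled at Nx positions over Nt instants
h = rng.standard_normal(17)         # a time-domain filter
T = rng.standard_normal((Nx, Nx))   # stand-in for the linear spatial transformation

def filt(field):
    """Convolve every spatial channel of the field with h in the time domain."""
    return np.array([np.convolve(row, h) for row in field])

a = T @ filt(p)      # filter first, then transform ...
b = filt(T @ p)      # ... or transform first, then filter
assert np.allclose(a, b)   # both orders agree, cf. Eq. (3.60)
```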
Real-valued wave field
A real-valued wave field p_P(x_P, t) is the natural case for acoustic pressure fields. The plane wave transformation of a real-valued pressure field is again real-valued. Proof: if the pressure field is real-valued, then its frequency-domain representation exhibits conjugate complex symmetry, P_P(x_P, −ω) = P_P*(x_P, ω). The same symmetry applies to the exponential term in the plane wave decomposition (3.41). The product of these terms is again of conjugate complex symmetry, and thus P̄(θ, −ω) = P̄*(θ, ω). As a result, the plane wave decomposition p̄(θ, t) of a real-valued wave field p_P(x_P, t) is also real-valued.
Spatial symmetries
The plane wave transformation of a radially symmetric field P_P(r, ω) is also radially symmetric, P̄(ω) = P{P_P(r, ω)}. This result was already derived in Section 3.3.4.1.
Duality
The plane wave decomposition depends on one variable less than the wave field it was computed from, as a result of exploiting the acoustic dispersion relation. Hence, it is not straightforwardly possible to formulate a duality principle as for the Fourier transformation.
3.3.5.2 Scaling Theorem
The cylindrical Fourier transformation of a signal scaled in the radial variable r was already derived in Section 3.1.3. Adapting this principle to the plane wave decomposition yields

P{P_P(α, a r, ω)} = (1/a²) P̄(θ, ω/a) .    (3.61)

Hence, scaling in the radial direction only influences the frequency characteristics of the plane wave decomposed field. This result is evident when considering the polar basis of the plane wave decomposition: scaling of the radial variable is a natural operation on this basis and will not influence the angle-dependent parts.
3.3.5.3 Rotation Theorem
In the following, the plane wave decomposition of a rotated acoustic pressure field will be calculated. The plane wave decomposition of a pressure field rotated by the angle α0 is given as

P{P_P(α − α0, r, ω)} = ∫_0^∞ ∫_0^{2π} P_P(α − α0, r, ω) e^{jkr cos(α−θ)} r dα dr ,    (3.62)

which, after substituting the rotated angle, results in

P{P_P(α − α0, r, ω)} = P̄(θ − α0, ω) .    (3.63)

Thus, a rotation of a wave field results in a rotation of the plane wave decomposed wave field. This result is evident when considering the polar basis of the plane wave decomposition: rotation is a natural operation on this basis.
3.3.5.4 Multiplication Theorem
In the following, a relation between the multiplication of two wave fields and their plane wave decompositions is derived. Consider the multiplication of two fields

C_P(x_P, ω) = A_P(x_P, ω) B_P(x_P, ω) .    (3.64)

As each wave field has a periodicity of 2π in α, it can be developed into a Fourier series, e.g. for the field A_P(x_P, ω) as follows

A_P(x_P, ω) = ∑_{ν=−∞}^{∞} Å_ν(ω, r) e^{jνα} .    (3.65)

Introducing the Fourier series of both fields into Eq. (3.64) yields

C_P(x_P, ω) = ∑_{ν=−∞}^{∞} ∑_{μ=−∞}^{∞} Å_ν(ω, r) e^{jνα} B̊_μ(ω, r) e^{jμα} .    (3.66)

Combining the two exponential terms and using the substitution η = ν + μ allows rewriting Eq. (3.66) such that the series coefficients are given by the discrete convolution ∗_η of the expansion coefficients Å_ν(ω, r) and B̊_ν(ω, r) of the two fields

C_P(x_P, ω) = ∑_{η=−∞}^{∞} ( Å_η(ω, r) ∗_η B̊_η(ω, r) ) e^{jηα} .    (3.67)

Developing C_P(x_P, ω) into a Fourier series and comparing the expansion coefficients with Eq. (3.67), it is easy to see that

C̊_ν(ω, r) = Å_ν(ω, r) ∗_ν B̊_ν(ω, r) .    (3.68)
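For sampled angular data, relation (3.68) can be verified with the FFT: the Fourier series coefficients of the product of two fields on a circle equal the circular convolution of their coefficient sequences. A sketch with random data at one fixed radius (an assumption of this example):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 32   # angular samples on a circle of some fixed radius r

# samples of two complex fields over the angle alpha
A = rng.standard_normal(N) + 1j * rng.standard_normal(N)
B = rng.standard_normal(N) + 1j * rng.standard_normal(N)

a = np.fft.fft(A) / N          # Fourier series coefficients of A
b = np.fft.fft(B) / N          # Fourier series coefficients of B
c = np.fft.fft(A * B) / N      # coefficients of the product field

# circular convolution of the coefficient sequences, cf. Eq. (3.68)
conv = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))
assert np.allclose(c, conv)
```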
This result states that the multiplication of two wave fields for one particular radius r corresponds to a convolution of the Fourier series expansion coefficients for this particular radius. The plane wave decomposition in terms of the Fourier series expansion coefficients was derived in Section 3.3.4.1: it is given by a Fourier series of the Hankel transformation of the Fourier series expansion coefficients. Introducing Eq. (3.68) into Eq. (3.46) yields the plane wave decomposition of a multiplication of two wave fields as
C̄(θ, ω) = 2π ∑_{ν=−∞}^{∞} j^ν H_{ν,r}{ Å_ν(ω, r) ∗_ν B̊_ν(ω, r) } e^{jνθ} .    (3.69)

3.3.5.5 Convolution Theorem
Consider the spatial convolution of two wave fields A_P(x_P, ω) and B_P(x_P, ω)

C_P(x_P, ω) = A_P(x_P, ω) ∗_x B_P(x_P, ω) .    (3.70)

Performing a multidimensional spatial Fourier transformation and applying the convolution theorem of the Fourier transformation (3.14) allows representing the spatial convolution as a multiplication of the spatially Fourier transformed signals

C̃_P(k_P, ω) = Ã_P(k_P, ω) B̃_P(k_P, ω) .    (3.71)

It would be desirable to find a similar relationship for the plane wave decomposition of the two wave fields. However, the plane wave decomposed signals depend on one variable less than the cylindrical Fourier transformation of the same signals. Hence, Eq. (3.71) does not hold for plane wave decomposed signals, and it seems that no similar relationship exists for the plane wave decomposition. In order to confirm this, it is instructive to introduce the definition of the polar convolution (as given by Eq. (B.34)) into the plane wave decomposition

C̄(θ, ω) = ∫_0^∞ ∫_0^{2π} C_P(x_P, ω) e^{jkr cos(α−θ)} r dα dr
= ∫_0^∞ ∫_0^{2π} ∫_0^∞ ∫_0^{2π} A_P(α′, r′, ω) B_P(ᾱ, r̄, ω) r′ dα′ dr′ e^{jkr cos(α−θ)} r dα dr ,    (3.72)

where ᾱ = ᾱ(α, α′, r, r′) and r̄ = r̄(α, α′, r, r′) as defined by Eq. (B.35). Interchanging the order of integration yields

C̄(θ, ω) = ∫_0^∞ ∫_0^{2π} A_P(α′, r′, ω) r′ [ ∫_0^∞ ∫_0^{2π} B_P(ᾱ, r̄, ω) e^{jkr cos(α−θ)} r dα dr ] dα′ dr′ .    (3.73)
Using geometric relations, the exponential term in the inner integrals can be expressed as

e^{jkr cos(α−θ)} = e^{jk r̄ cos(ᾱ−θ)} e^{jk r′ cos(α′−θ)} ,    (3.74)

which yields

C̄(θ, ω) = ∫_0^∞ ∫_0^{2π} A_P(α′, r′, ω) r′ [ ∫_0^∞ ∫_0^{2π} B_P(ᾱ, r̄, ω) e^{jk r̄ cos(ᾱ−θ)} r dα dr ] e^{jk r′ cos(α′−θ)} dα′ dr′ .    (3.75)

Inspection of the inner integral shows that it involves not only a dependence on ᾱ and r̄, but also on r directly. Thus, it cannot be expressed as the plane wave decomposition of B_P(α, r, ω), which would be necessary in order to formulate a meaningful convolution theorem.
3.3.5.6
Parseval's Theorem

Parseval's theorem relates the energy of a signal to the energy of its Fourier transformation [Bam89, GRS01]. A similar theorem for acoustic wave fields and their plane wave decompositions will be derived in the following.
Parseval's theorem for spatio-temporal signals given in polar coordinates can be derived from its formulation in Cartesian coordinates [Bam89] by introducing the coordinate and wave vector transformations given by Eq. (B.28) and Eq. (B.29), respectively. Parseval's theorem in polar coordinates is then given as
\int_{-\infty}^{\infty} \int_0^{\infty} \int_0^{2\pi} a_P(x_P, t) \, b_P^*(x_P, t) \, r \, d\alpha \, dr \, dt = \frac{1}{(2\pi)^3} \int_{-\infty}^{\infty} \int_0^{\infty} \int_0^{2\pi} A_P(k_P, \omega) \, B_P^*(k_P, \omega) \, k \, d\theta \, dk \, d\omega ,   (3.76)

where b_P^*(x_P, t) denotes the complex conjugate field to b_P(x_P, t). For the special choice a_P(x_P, t) = b_P(x_P, t) Eq. (3.76) simplifies to

\underbrace{\int_{-\infty}^{\infty} \int_0^{\infty} \int_0^{2\pi} |a_P(x_P, t)|^2 \, r \, d\alpha \, dr \, dt}_{E_a} = \frac{1}{(2\pi)^3} \int_{-\infty}^{\infty} \int_0^{\infty} \int_0^{2\pi} |A_P(k_P, \omega)|^2 \, k \, d\theta \, dk \, d\omega .   (3.77)

The left hand side will be defined as the energy E_a of the field a_P(x_P, t). Equation (3.77) relates the energy of a field to the energy of its Fourier transformed representation. Parseval's theorem for the plane wave decomposition can be derived from Eq. (3.77) by introducing the dispersion relation (2.8) into Eq. (3.77)
\int_{-\infty}^{\infty} \int_0^{\infty} \int_0^{2\pi} |a_P(x_P, t)|^2 \, r \, d\alpha \, dr \, dt = \frac{1}{(2\pi)^3} \int_{-\infty}^{\infty} \int_0^{2\pi} |\bar{A}(\theta, \omega)|^2 \, \frac{\omega}{c} \, d\theta \, d\omega .   (3.78)
Equation (3.78) represents Parseval's theorem for the plane wave decomposition.
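The structure of Parseval's theorem can be made tangible with its discrete-time counterpart. The following sketch is merely illustrative and not part of the derivation above; it verifies numerically that, with NumPy's FFT normalization, the signal energy equals the spectral energy up to a factor 1/N, in analogy to the 1/(2\pi)^3 factor in Eq. (3.77):

```python
import numpy as np

# Parseval's theorem for the DFT: sum |x[n]|^2 = (1/N) * sum |X[k]|^2.
rng = np.random.default_rng(0)
N = 1024
x = rng.standard_normal(N)

X = np.fft.fft(x)
energy_time = np.sum(np.abs(x) ** 2)
energy_freq = np.sum(np.abs(X) ** 2) / N

print(energy_time, energy_freq)
```

The same bookkeeping of normalization constants is what produces the \omega/c weight when the dispersion relation is introduced into the continuous formulation.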
3.3.5.7
Let us first assume that the spatial Fourier transformation P_{C,0}(k_C, \omega) of a plane wave traveling in three-dimensional space is given for the plane z = 0. For free-field propagation the spatial Fourier transformation of the acoustic pressure at an arbitrary height z is given as [Ber87, Wil99]
P_C(k_C, \omega) = P_{C,0}(k_x, k_y, \omega) \, e^{jk_z z} .   (3.79)
This principle is denoted as plane wave extrapolation [Ber87] and can also be applied to plane wave decomposed signals, as will be illustrated in the following. The plane wave expansion coefficient \bar{P}(\theta, \omega) for any fixed \theta denotes the spectrum of a plane wave with incidence angle \theta. This spectrum is given with respect to the origin of the coordinate system. However, for some applications it is desirable to spatially shift the origin of the plane wave decomposition to a point x_{P,0} = (\alpha_0, r_0). This shifted plane wave decomposition will be denoted as \bar{P}_{x_{P,0}}(\theta, \omega). Generalization of Eq. (3.79) allows expressing the shifted plane wave decomposition \bar{P}_{x_{P,0}}(\theta, \omega) in terms of the unshifted plane wave decomposition \bar{P}(\theta, \omega) as

\bar{P}_{x_{P,0}}(\theta, \omega) = \bar{P}(\theta, \omega) \, e^{jkr_0 \cos(\theta - \alpha_0)} .   (3.80)
The wave field extrapolation technique for plane wave decomposed wave fields, as described by Eq. (3.80), is a powerful tool since it allows extrapolating a given (e.g. measured) plane wave decomposition to other positions. In this context, the inverse plane wave decomposition (3.45) can be understood as the extrapolation of a plane wave decomposed field \bar{P}(\theta, \omega) to the position x_P, followed by a superposition of all extrapolated plane wave components. The superposition of all plane wave components accounts for the omnidirectional nature of the acoustic pressure P_P(x_P, \omega).
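For a single plane wave the extrapolation phase factor of Eq. (3.80) can be checked directly: shifting the expansion origin to x_{P,0} must reproduce the phase the plane wave itself accumulates between the two origins. The following minimal sketch assumes the incidence-angle convention used above, i.e. a wave arriving from \theta_0 propagates in direction \theta_0 + \pi; all numbers are arbitrary illustration choices:

```python
import numpy as np

c = 343.0                 # speed of sound in m/s
f = 1000.0                # temporal frequency in Hz
k = 2 * np.pi * f / c     # wavenumber

theta0 = np.deg2rad(30.0)           # incidence angle of the plane wave
alpha0, r0 = np.deg2rad(75.0), 0.8  # polar coordinates of the new origin x_{P,0}

# Plane wave arriving from theta0: propagation direction is theta0 + pi,
# so the field reads p(x) = exp(-j k n^T x) with n the propagation direction.
n = np.array([np.cos(theta0 + np.pi), np.sin(theta0 + np.pi)])
x0 = r0 * np.array([np.cos(alpha0), np.sin(alpha0)])

# Phase of the field at the shifted origin relative to the old origin:
direct = np.exp(-1j * k * n @ x0)

# Extrapolation factor of Eq. (3.80) evaluated at theta = theta0:
extrapolated = np.exp(1j * k * r0 * np.cos(theta0 - alpha0))

print(direct, extrapolated)
```

Both factors coincide, which is exactly the statement of Eq. (3.80) for a field consisting of a single plane wave component.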
3.3.5.8
Summary
The properties and theorems of the plane wave decomposition that were derived in the previous sections are summarized in Table 3.2. As stated before, they were derived for the two-dimensional plane wave decomposition. A generalization to the three-dimensional plane wave decomposition can be performed easily for most of the results by using Eq. (3.42) and the theorems of the one-dimensional Fourier transformation. Some of the derived relationships exhibit a strong resemblance to theorems of the Fourier transformation.
Table 3.2: Properties and theorems of the two-dimensional plane wave decomposition.

  transformation:   P_P(x_P, \omega) = \mathcal{P}^{-1}\{\bar{P}(\theta, \omega)\} ,   \bar{P}(\theta, \omega) = \mathcal{P}\{P_P(x_P, \omega)\}
  linearity:        a A_P(x_P, \omega) + b B_P(x_P, \omega)   <->   a \bar{A}(\theta, \omega) + b \bar{B}(\theta, \omega)
  scaling:          P_P(\alpha, a r, \omega)   <->   \frac{1}{a^2} \bar{P}(\theta, \frac{\omega}{a})
  rotation:         P_P(\alpha - \alpha_0, r, \omega)   <->   \bar{P}(\theta - \alpha_0, \omega)
  multiplication:   A_P(x_P, \omega) \, B_P(x_P, \omega)   <->   2\pi \sum_{\nu} j^{-\nu} \mathcal{H}_{\nu,r}\{\tilde{A}(\nu, r, \omega) * \tilde{B}(\nu, r, \omega)\} e^{j\nu\theta}
  convolution:      A_P(x_P, \omega) *_x B_P(x_P, \omega)   <->   no closed-form theorem (see Section 3.3.5.5)
  Parseval:         \int_{-\infty}^{\infty}\int_0^{\infty}\int_0^{2\pi} |a_P(x_P, t)|^2 \, r \, d\alpha \, dr \, dt = \frac{1}{(2\pi)^3} \int_{-\infty}^{\infty}\int_0^{2\pi} |\bar{A}(\theta, \omega)|^2 \, \frac{\omega}{c} \, d\theta \, d\omega
3.3.6
This section briefly outlines relationships between the plane wave decomposition and other methods and transformations used for the analysis of multidimensional signals and wave fields.
Slant stack / Radon transformation
Slant stacking is a method well known from seismic data processing. The concept of
slant stacking is based on the Radon transformation [Rad17, Tof96, Dea93]. One method
can be derived from the other as shown in [Tof96]. Both transformations are frequently
used in seismics, computerized tomography (CT), medical imaging and inverse scattering
problems. The link between slant stacking and the plane wave decomposition will be
illustrated in the following. The slant stack integral, generalized to a two-dimensional acoustic pressure field p_C(x_C, t), is defined as

u_C(s_C, \tau) = \int_{\mathbb{R}^2} p_C(x_C, \tau + s_C^T x_C) \, dx ,   (3.81)

where u_C(s_C, \tau) denotes the Radon transformation of p_C(x_C, t) and dx = dx \, dy. Introducing
the coordinate system transformation from Cartesian to polar coordinates, as given by Eq. (B.28) and Eq. (B.29) (substituting k with s in Eq. (B.29)), into Eq. (3.81) yields

u_P(\theta, s, \tau) = \int_0^{\infty} \int_0^{2\pi} p_P(\alpha, r, \tau + sr \cos(\theta - \alpha)) \, r \, d\alpha \, dr .   (3.82)
75
Comparing Eq. (3.82) with the timedomain plane wave decomposition (3.43) yields equality for
uP (, s, )
= p(, ) .
(3.83)
s=1/c
Thus, the plane wave decomposition as defined within this work, can be interpreted as
a slant stack or Radon transformation specialized to the case s = 1/c. A similar result
was derived in [Sca97]. This result is quite evident when looking at the applications of
the Radon transformation and the slant stack. The Radon transformation is often used
in image processing. The Radon transformation maps straight lines in the image domain
into Dirac peaks in the Radon domain. It is therefore typically used for edge detection in
digital image processing [Tof96]. The same principle can be applied to acoustic fields: A
Dirac shaped plane wave can be understood as an edge in the pressure field pC (xC , t).
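This edge-detection view can be reproduced numerically: slant stacking a plane wave pulse over candidate slownesses of magnitude s = 1/c concentrates the energy at the true propagation angle, just as the Radon transformation maps a straight line to a peak. The sketch below is a rough discretization of Eq. (3.81); the grid sizes and the Gaussian pulse (standing in for an ideal Dirac excitation) are arbitrary illustration choices:

```python
import numpy as np

c = 343.0
theta0 = np.deg2rad(120.0)                            # true propagation angle
s0 = np.array([np.cos(theta0), np.sin(theta0)]) / c   # slowness vector of the wave

# Spatial grid over which the slant stack integral is approximated.
x = np.linspace(-2.0, 2.0, 81)
X, Y = np.meshgrid(x, x, indexing="ij")

def pulse(t):
    # Gaussian time-domain excitation instead of an ideal Dirac pulse.
    return np.exp(-(t / 1e-3) ** 2)

# p_C(x_C, t) = g(t - s0^T x_C). Slant stack at tau = 0 for candidate
# slowness s_C with |s_C| = 1/c:  u = sum over grid of p_C(x_C, s_C^T x_C).
angles = np.deg2rad(np.arange(0, 360, 2))
u = np.empty(angles.size)
for i, th in enumerate(angles):
    sC = np.array([np.cos(th), np.sin(th)]) / c
    tau_shift = sC[0] * X + sC[1] * Y                 # s_C^T x_C
    u[i] = np.sum(pulse(tau_shift - (s0[0] * X + s0[1] * Y)))

best = np.rad2deg(angles[np.argmax(u)])
print(best)  # peaks at the true angle
```

When the candidate slowness matches the slowness of the wave, the shifted pulse argument vanishes everywhere on the grid and the stack attains its maximum.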
Beamforming
The output of a filter-and-sum beamformer for a monochromatic plane wave with wave vector k_C, captured by M+1 microphones at the positions x_{C,m}, can be expressed as

Y(k_C, k_{C,s}, \omega) = \sum_{m=0}^{M} W_m(\omega) \, e^{j(k_C - k_{C,s})^T x_{C,m}} ,   (3.84)
where the W_m(\omega) denote complex filter coefficients. These are often used to optimize the beamformer response. The steering vector k_{C,s} defines the look direction of the beamformer, which in typical applications is equal to the desired source direction. Figure 3.12 shows a block diagram of the filter-and-sum beamformer. The microphone signals are multiplied by a phase factor (delay), filtered and summed up to create the beamformer output signal. However, the above formulation assumes discrete microphone positions and a plane wave as input signal. Equation (3.84) can be generalized to an arbitrary wave field X_C(x_C, \omega) as input signal and a continuous distribution of the microphones. Specializing to the case of W_m = 1, the following continuous representation can be derived in a straightforward manner from Eq. (3.84):
Y_C(k_{C,s}, \omega) = \int_{\mathbb{R}^2} X_C(x_C, \omega) \, e^{-jk_{C,s}^T x_C} \, dx ,   (3.85)

where dx = dx \, dy. Comparing Eq. (3.85) with Eq. (3.6a) shows that the delay-and-sum beamformer can be interpreted as a spatial Fourier transformation, where the steering vector k_{C,s} takes the role of the wave vector of the spatial Fourier transformation.
[Figure 3.12: Block diagram of the filter-and-sum beamformer. The microphone signals captured at the positions x_{C,m} are multiplied by the phase factors e^{-jk_{C,s}^T x_{C,m}}, filtered by W_m(\omega) and summed up to the output Y(k_C, k_{C,s}, \omega).]
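The interpretation above can be made concrete with a small sketch of a delay-and-sum beamformer (W_m = 1): scanning the steering vector over all directions, the output magnitude of Eq. (3.84) peaks when the steering direction matches the wave vector of the impinging plane wave. Array geometry and frequency below are arbitrary illustration choices:

```python
import numpy as np

c = 343.0
f = 800.0
k = 2 * np.pi * f / c

# Microphone positions x_{C,m}: a circular array of 16 microphones, radius 0.5 m.
M = 16
phi = 2 * np.pi * np.arange(M) / M
mics = 0.5 * np.column_stack([np.cos(phi), np.sin(phi)])

theta_src = np.deg2rad(60.0)   # propagation direction of the impinging plane wave
kC = k * np.array([np.cos(theta_src), np.sin(theta_src)])

# Y(k_C, k_{C,s}, w) = sum_m exp(j (k_C - k_{C,s})^T x_{C,m})  with W_m = 1.
angles = np.deg2rad(np.arange(0, 360, 1))
response = np.empty(angles.size)
for i, th in enumerate(angles):
    kS = k * np.array([np.cos(th), np.sin(th)])
    response[i] = np.abs(np.sum(np.exp(1j * (kC - kS) @ mics.T)))

look = np.rad2deg(angles[np.argmax(response)])
print(look)
```

When k_{C,s} = k_C all summands equal one, so the response attains its maximum M at the source direction, which is the discrete analogue of evaluating the spatial Fourier transformation at the wave vector of the field.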
3.4
The plane wave decomposition, as introduced in Section 3.3.2, requires access to the acoustic pressure on the entire two-dimensional plane. For practical reasons the area to be analyzed has to be limited. However, if the analyzed area is limited, then the Kirchhoff-Helmholtz integral is applicable. Knowledge of the acoustic field at the boundary of the area to be analyzed allows extrapolating the field within this boundary. This extrapolated field can then be used to calculate the plane wave decomposition from measurements on the boundary. The following sections will derive the effects of limiting the analyzed area and the plane wave decomposition of a wave field based on boundary measurements.
3.4.1
The two-dimensional plane wave decomposition as given by Eq. (3.41) assumes that the acoustic pressure field can be measured on an infinitely large plane. As stated before, in practical applications the measurement area will be finite. The artifacts caused by this limitation of the measurement aperture will be derived in the following. Due to the underlying polar geometry of the plane wave decomposition, only a limitation in the radial direction will be considered. The considered measurement aperture is a circular area

circ_C(x, y) = \begin{cases} 1 , & \text{for } \sqrt{x^2 + y^2} \le R , \\ 0 , & \text{otherwise,} \end{cases}   (3.86)
where R denotes the radius of the considered circular area. The limitation of the aperture can be analyzed by multiplying the wave field with a circular window function and
introducing this into the (infinite) plane wave decomposition (3.41)
\bar{P}_R(\theta, \omega) = \mathcal{P}\{\text{circ}_P(\tfrac{r}{R}) \, P_P(x_P, \omega)\} ,   (3.87)

where circ_P(r) denotes the circular window function, which is defined as follows

\text{circ}_P(r) = \text{circ}_C(x, y) = \begin{cases} 1 , & \text{for } r = \sqrt{x^2 + y^2} \le 1 , \\ 0 , & \text{otherwise.} \end{cases}   (3.88)
Figure 3.13 illustrates the circ function. The plane wave decomposition for the multiplication of two fields was derived in Section 3.3.5.4. Its calculation involves the knowledge
of the Fourier series expansion coefficients of the two fields. The expansion coefficients of
the window function can be calculated by utilizing Eq. (3.16)
\tilde{C}(\nu, r, \omega) = \mathcal{FS}^{-1}\{\text{circ}_P(\tfrac{r}{R})\} = \frac{1}{2\pi} \int_0^{2\pi} \text{circ}_P(\tfrac{r}{R}) \, e^{-j\nu\alpha} \, d\alpha = \begin{cases} \delta[\nu] , & \text{for } r \le R , \\ 0 , & \text{otherwise.} \end{cases}   (3.89)
Introducing Eq. (3.89) into the multiplication theorem (3.69) yields

\bar{P}_R(\theta, \omega) = 2\pi \sum_{\nu=-\infty}^{\infty} j^{-\nu} \, \mathcal{H}_{\nu,r}\{\tilde{C}(\nu, r, \omega) * \tilde{P}(\nu, r, \omega)\} \, e^{j\nu\theta} .   (3.90)

The convolution with the expansion coefficients (3.89) of the window function sifts out \tilde{P}(\nu, r, \omega) for r \le R, so that the Hankel transformation reduces to a finite Hankel transformation

\mathcal{H}_{\nu,R}\{f(r)\} = \int_0^R f(r) \, J_\nu(kr) \, r \, dr .   (3.91)
Thus, the finite aperture plane wave decomposition is given by a Fourier series whose
coefficients are given by the finite Hankel transformation of the Fourier series coefficients
of the wave field
\bar{P}_R(\theta, \omega) = 2\pi \sum_{\nu=-\infty}^{\infty} j^{-\nu} \, \mathcal{H}_{\nu,R}\{\tilde{P}(\nu, r, \omega)\} \, e^{j\nu\theta} .   (3.92)
This result is quite illustrative, since the representation of the plane wave decomposition as a Hankel transformation decouples the angular and the radial parts. The radially symmetric finite aperture thus influences only the radial part and therefore only the Hankel transformation.
3.4.2
Figure 3.14: Geometry used for the KirchhoffHelmholtz integral for a circular boundary.
where P_{P,e}(x_P, \omega) denotes the extrapolated pressure field and r = x_l - x. The extrapolated pressure field can be introduced into the definition of the plane wave decomposition in order to derive the plane wave decomposed wave field

\bar{P}(\theta, \omega) = \mathcal{P}\{P_{P,e}(x_P, \omega)\} .   (3.94)

Unfortunately, the above specialization of the Kirchhoff-Helmholtz integral, and thus the plane wave decomposition, cannot be evaluated conveniently. The next section will utilize the concept of circular harmonics to provide an elegant solution for circular boundaries.
3.4.3
It is promising to express the measured wave field in terms of circular harmonics, since the chosen circular boundary geometry and the plane wave decomposition are both formulated conveniently in polar coordinates. It was shown in Section 2.3.2 that the acoustic pressure can be expressed in terms of circular harmonics (see Eq. (2.25))

P_P(x_P, \omega) = \sum_{\nu=-\infty}^{\infty} \left[ P^{(1)}(\nu, \omega) \, H_\nu^{(1)}(kr) + P^{(2)}(\nu, \omega) \, H_\nu^{(2)}(kr) \right] e^{j\nu\alpha} .   (3.95)

Alternatively, it is convenient to express the acoustic pressure field as a Fourier series with respect to the angle \alpha

P_P(x_P, \omega) = \sum_{\nu=-\infty}^{\infty} \tilde{P}(\nu, r, \omega) \, e^{j\nu\alpha} .   (3.96)
Comparing Eq. (3.95) with Eq. (3.96) on the boundary r = R relates the Fourier series coefficients to the circular harmonics expansion coefficients

\tilde{P}(\nu, R, \omega) = P^{(1)}(\nu, \omega) \, H_\nu^{(1)}(kR) + P^{(2)}(\nu, \omega) \, H_\nu^{(2)}(kR) .   (3.97)
Please note that the expansion coefficients P^{(1)}(\nu, \omega) and P^{(2)}(\nu, \omega) in terms of circular harmonics are independent of the radius R. They can be used to calculate the plane wave decomposition, as was shown in Section 3.3.4.3. Unfortunately, Eq. (3.97) does not provide a one-to-one relation between the Fourier series coefficients of the acoustic pressure and the expansion coefficients in terms of circular harmonics. Thus, it cannot be solved for the circular harmonics coefficients. This conclusion is not surprising, since the Kirchhoff-Helmholtz integral (2.70) states that the acoustic pressure and its gradient are required on the boundary to describe the wave field within the boundary. Hence, the gradient of the acoustic pressure has to be taken into account additionally.
Euler's equation (2.4) gives a relation between the gradient of the acoustic pressure and the particle velocity. Its frequency domain representation is given as

\nabla P_P(x_P, \omega) = j\omega\rho_0 \, V_P(x_P, \omega) ,   (3.98)

where V_P(x_P, \omega) denotes the acoustic particle velocity vector. However, only one component of the particle velocity vector V_P(x_P, \omega) is sufficient to solve Eq. (3.97). Introducing Eq. (3.95) into Euler's equation (3.98) and evaluating only the inward directed radial component derives

\langle \vec{e}_r, \nabla P_P(x_P, \omega) \rangle = k \sum_{\nu=-\infty}^{\infty} \left[ P^{(1)}(\nu, \omega) \, H_\nu^{\prime(1)}(kr) + P^{(2)}(\nu, \omega) \, H_\nu^{\prime(2)}(kr) \right] e^{j\nu\alpha} ,   (3.99)
where \vec{e}_r denotes the outward pointing radial normal vector and H_\nu^{\prime(1),(2)}(kr) the derivatives of the Hankel functions with respect to their argument. Expressing the radial component of the acoustic velocity V_{P,r}(x_P, \omega) as a Fourier series and introducing this series into Eq. (3.98) together with Eq. (3.99) for r = R derives a relationship similar to Eq. (3.97) for the Fourier series expansion coefficients of the radial particle velocity

j\rho_0 c \, \tilde{V}_r(\nu, R, \omega) = P^{(1)}(\nu, \omega) \, H_\nu^{\prime(1)}(kR) + P^{(2)}(\nu, \omega) \, H_\nu^{\prime(2)}(kR) .   (3.100)
Equations (3.97) and (3.100) can be combined into matrix form as

\begin{bmatrix} \tilde{P}(\nu, R, \omega) \\ j\rho_0 c \, \tilde{V}_r(\nu, R, \omega) \end{bmatrix} = \underbrace{\begin{bmatrix} H_\nu^{(1)}(kR) & H_\nu^{(2)}(kR) \\ H_\nu^{\prime(1)}(kR) & H_\nu^{\prime(2)}(kR) \end{bmatrix}}_{M(kR)} \begin{bmatrix} P^{(1)}(\nu, \omega) \\ P^{(2)}(\nu, \omega) \end{bmatrix} .   (3.101)
Figure 3.15: Block diagram of the computation of the plane wave decomposition from measurements of the acoustic pressure and velocity on a circular boundary with radius R.
Solving Eq. (3.101) by inverting M(kR) yields the circular harmonics expansion coefficients P^{(1)}(\nu, \omega) and P^{(2)}(\nu, \omega) as

P^{(1)}(\nu, \omega) = \frac{H_\nu^{\prime(2)}(kR) \, \tilde{P}(\nu, R, \omega) - H_\nu^{(2)}(kR) \, j\rho_0 c \, \tilde{V}_r(\nu, R, \omega)}{H_\nu^{\prime(2)}(kR) H_\nu^{(1)}(kR) - H_\nu^{\prime(1)}(kR) H_\nu^{(2)}(kR)} ,   (3.102a)

P^{(2)}(\nu, \omega) = \frac{H_\nu^{\prime(1)}(kR) \, \tilde{P}(\nu, R, \omega) - H_\nu^{(1)}(kR) \, j\rho_0 c \, \tilde{V}_r(\nu, R, \omega)}{H_\nu^{\prime(1)}(kR) H_\nu^{(2)}(kR) - H_\nu^{\prime(2)}(kR) H_\nu^{(1)}(kR)} .   (3.102b)
The derived result allows calculating the plane wave decomposition from pressure and pressure gradient measurements performed on a circular boundary. It can be interpreted as a multidimensional filtering process where the Fourier series expansion coefficients \tilde{P}(\nu, R, \omega) and \tilde{V}_r(\nu, R, \omega) are filtered by the filter M(kR). The plane wave decomposition of the incoming and outgoing part of the wave field is then given as the Fourier series (3.58) of the circular harmonics expansion coefficients. Figure 3.15 illustrates the entire computation of the plane wave decomposition based on boundary measurements by combining Eq. (3.102) with Eq. (3.58). This result is similar to the one derived in [HdVB01, Hul04]. The practical implementation of the presented plane wave decomposition using microphone arrays will be discussed in Section 5.2.2.
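The filtering step amounts to inverting a 2x2 matrix of Hankel functions and their derivatives for each order \nu and frequency. A small numerical sketch (the matrix layout follows Eqs. (3.97) and (3.100) above; SciPy's hankel1/hankel2 and the derivative routines h1vp/h2vp are used) also confirms that the determinant equals the Hankel function Wronskian -4j/(\pi kR), so the inversion is well behaved as long as kR is not too small:

```python
import numpy as np
from scipy.special import hankel1, hankel2, h1vp, h2vp

def M_matrix(nu, kR):
    # Rows: pressure relation (Hankel functions) and velocity relation
    # (their derivatives), cf. Eqs. (3.97) and (3.100).
    return np.array([
        [hankel1(nu, kR), hankel2(nu, kR)],
        [h1vp(nu, kR),    h2vp(nu, kR)],
    ])

nu, kR = 3, 5.0
M = M_matrix(nu, kR)
M_inv = np.linalg.inv(M)   # the "filter" applied to the measured coefficients

det = np.linalg.det(M)
wronskian = -4j / (np.pi * kR)
print(det, wronskian)
```

Since the Wronskian decays only like 1/(kR), the denominators in Eq. (3.102) never vanish for kR > 0; numerical problems arise instead from the growth of the Hankel functions themselves at high orders.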
3.5
The following section will derive the plane wave decomposition of two analytic spatial source models. The models used for this purpose are plane waves and line sources. These source types play an important role for the analysis of wave fields on the one hand and for auralization purposes on the other hand. It will be assumed that the time-domain excitation of the sources is a Dirac pulse. This assumption poses no restriction, as arbitrary time-domain excitations can be considered by time-domain convolution of the plane wave decomposed wave fields. This was shown in Section 3.3.5.1. The plane wave decompositions will be derived for the ideal case of an unlimited aperture as well as for the more practical case of a finite aperture.
3.5.1
The plane wave decomposition of a plane wave is derived in the following. The starting point is the pressure field of a plane wave as given by Eq. (2.47)

P_C(x_C, \omega) = e^{jk_{C,0}^T x_C} ,   (3.103)
where k_{C,0} denotes the wave vector of the plane wave and implicitly its incidence angle \theta_0. First, no limitation of the aperture will be assumed in order to derive a result under ideal conditions. Introducing the wave field of a plane wave into the two-dimensional spatial Fourier transformation in Cartesian coordinates as given by Eq. (3.6a) yields

P_C(k_C, \omega) = \int_{\mathbb{R}^2} e^{-j(k_C - k_{C,0})^T x} \, dx = (2\pi)^2 \, \delta(k_C - k_{C,0}) .   (3.104)

Transformed to polar coordinates, this result reads

P_P(k_P, \omega) = (2\pi)^2 \, \frac{1}{k} \, \delta(k - k_0) \, \delta(\theta - \theta_0) ,   (3.105)

where \theta_0 denotes the incidence angle of the plane wave. The Dirac pulse \delta(k - k_0) expresses the dispersion relation (2.8) and can be discarded for the derivation of the desired plane wave decomposition. Hence, the plane wave decomposition of a plane wave is given as

\bar{P}(\theta, \omega) = (2\pi)^2 \, \frac{c}{\omega} \, \delta(\theta - \theta_0) .   (3.106)
The obtained result is evident, since the plane wave decomposition was designed to decompose a wave field into plane waves. Hence, the plane wave decomposition of a plane wave should have the derived form of a spatial Dirac pulse at the incidence angle of the plane wave. It was stated in Section 3.3.3 that the plane wave decomposition exhibits a low-pass character, a fact which can also be seen in the frequency dependent factor c/\omega in Eq. (3.106). The following paragraph will consider the case of a finite aperture as discussed in Section 3.4.1.
Finite aperture
The finite aperture plane wave decomposition of a plane wave can be calculated by utilizing Eq. (3.92). However, this requires calculating the finite Hankel transformation of the angular Fourier series expansion coefficients of a plane wave. The Fourier series expansion coefficients of a plane wave with incidence angle \theta_0 can be derived from the Jacobi-Anger expansion (3.55) as

\tilde{P}(\nu, r, \omega) = j^\nu J_\nu(kr) \, e^{-j\nu\theta_0} .   (3.107)

The finite Hankel transformation of the Fourier series expansion coefficients \tilde{P}(\nu, r, \omega) is given as [GR65]

\mathcal{H}_{\nu,R}\{\tilde{P}(\nu, r, \omega)\} = j^\nu e^{-j\nu\theta_0} \int_0^R J_\nu(kr) J_\nu(kr) \, r \, dr = \frac{1}{2} R^2 \, j^\nu e^{-j\nu\theta_0} \, J_{\nu+1}^2(kR) ,   (3.108)

where R denotes the radius of the circular aperture. The finite aperture plane wave decomposition of a plane wave becomes

\bar{P}_R(\theta, \omega) = \pi R^2 \sum_{\nu=-\infty}^{\infty} J_{\nu+1}^2(kR) \, e^{j\nu(\theta - \theta_0)} .   (3.109)
Figure 3.16 shows the finite aperture plane wave decomposition of a plane wave with an incidence angle \theta_0 = 180° and an aperture of R = 1 m. Figure 3.16(a) shows the frequency-angle response \bar{P}_R(\theta, \omega) and Fig. 3.16(b) the time-angle response \bar{p}_R(\theta, t). The time-angle response was calculated utilizing an inverse (time-domain) Fourier transformation \bar{p}_R(\theta, t) = \mathcal{F}_t^{-1}\{\bar{P}_R(\theta, \omega)\}. The widening of the frequency response for low frequencies due to the finite aperture can be seen clearly. This is a consequence of the properties of the Bessel functions included in the angular Fourier series given by Eq. (3.109). For \nu = 0 this series exhibits no dependence on the angle of the plane wave contributions. Thus, \nu = 0 represents the omnidirectional case. The zeroth-order Bessel function has its maximum at a lower frequency compared to the higher orders, which results in the dominant omnidirectional response for low frequencies. As expected, the time-domain response consists of a peak at the incidence angle of the plane wave. However, there are also some artifacts present due to the finite aperture of the decomposition. This ringing effect is also known from the Fourier transformation as Gibbs phenomenon.
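The low-frequency widening can also be reproduced without the Bessel series by evaluating the finite aperture plane wave decomposition of a plane wave directly as a two-dimensional integral over the disc of radius R. In the sketch below the quadrature resolution is chosen ad hoc and the sign of the analysis kernel is taken conjugate to the plane wave phase so that the response peaks at \theta_0; it compares the relative width of the angular main lobe at two frequencies:

```python
import numpy as np

c, R = 343.0, 1.0
theta0 = np.pi                      # incidence angle of the plane wave (180 deg)

r = np.linspace(1e-3, R, 60)        # radial quadrature nodes
alpha = np.linspace(0, 2 * np.pi, 128, endpoint=False)
theta = np.linspace(0, 2 * np.pi, 180, endpoint=False)

def finite_aperture_pwd(f):
    k = 2 * np.pi * f / c
    # Plane wave on the disc and analysis kernel; integrate r dalpha dr.
    Rg, Ag = np.meshgrid(r, alpha, indexing="ij")
    field = np.exp(1j * k * Rg * np.cos(Ag - theta0))
    out = np.empty(theta.size, dtype=complex)
    for i, th in enumerate(theta):
        kern = np.exp(-1j * k * Rg * np.cos(th - Ag))
        out[i] = np.sum(field * kern * Rg) * (r[1] - r[0]) * (alpha[1] - alpha[0])
    return np.abs(out)

def lobe_fraction(mag):
    # Fraction of angles where the response exceeds half its maximum.
    return np.mean(mag > 0.5 * mag.max())

width_low = lobe_fraction(finite_aperture_pwd(200.0))
width_high = lobe_fraction(finite_aperture_pwd(2000.0))
print(width_low, width_high)
```

At \theta = \theta_0 the two phase terms cancel and the integral equals \pi R^2 independently of frequency, while away from \theta_0 the cancellation improves with increasing kR, which is exactly the widening seen in Fig. 3.16(a).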
3.5.2
This section derives the finite aperture plane wave decomposition of a line source. The
acoustic pressure of a line source placed at the origin is given by Eq. (2.40). The wave
Figure 3.16: Finite aperture plane wave decomposition of a (time-domain) Dirac shaped plane wave. The incidence angle of the plane wave was chosen as \theta_0 = 180°, the radius of the circular aperture as R = 1 m. The frequency-angle \bar{P}_R(\theta, \omega) and time-angle \bar{p}_R(\theta, t) responses are shown.
field produced by a line source placed at an arbitrary position x_{P,0} can be derived from the shift theorem of the Hankel functions. This theorem is given as follows [Wil99]

H_0^{(2)}(k |x_P - x_{P,0}|) = \begin{cases} \sum_{\nu=-\infty}^{\infty} J_\nu(kr) \, H_\nu^{(2)}(kr_0) \, e^{j\nu(\alpha - \alpha_0)} , & \text{for } r \le r_0 , \\ \sum_{\nu=-\infty}^{\infty} J_\nu(kr_0) \, H_\nu^{(2)}(kr) \, e^{j\nu(\alpha - \alpha_0)} , & \text{for } r > r_0 . \end{cases}   (3.110)
It will be considered in the following that the line source is placed outside of the circular aperture, r_0 > R. The acoustic pressure within the circular aperture produced by a line source placed at the position x_{P,0} can be derived from Eq. (3.110) as

P_P(x_P, \omega) = -\frac{j}{4} \sum_{\nu=-\infty}^{\infty} J_\nu(kr) \, H_\nu^{(2)}(kr_0) \, e^{j\nu(\alpha - \alpha_0)} ,   (3.111)
where r_0 > R \ge r. As for the plane wave in the previous section, the limited aperture plane wave decomposition of a line source will be derived by applying Eq. (3.92). The required Fourier series coefficients of the line source can be deduced from Eq. (3.111) as

\tilde{P}(\nu, r, \omega) = -\frac{j}{4} \, J_\nu(kr) \, H_\nu^{(2)}(kr_0) \, e^{-j\nu\alpha_0} .   (3.112)
The finite Hankel transformation of the Fourier series expansion coefficients \tilde{P}(\nu, r, \omega) is then given as [GR65]

\mathcal{H}_{\nu,R}\{\tilde{P}(\nu, r, \omega)\} = -\frac{j}{4} \, e^{-j\nu\alpha_0} \, H_\nu^{(2)}(kr_0) \int_0^R J_\nu(kr) J_\nu(kr) \, r \, dr = -\frac{j}{8} R^2 \, e^{-j\nu\alpha_0} \, H_\nu^{(2)}(kr_0) \, J_{\nu+1}^2(kR) .   (3.113)
Introducing this result into Eq. (3.92) yields the finite aperture plane wave decomposition of a line source as

\bar{P}_R(\theta, \omega) = \frac{\pi R^2}{4} \sum_{\nu=-\infty}^{\infty} j^{-(\nu+1)} \, H_\nu^{(2)}(kr_0) \, J_{\nu+1}^2(kR) \, e^{j\nu(\theta - \alpha_0)} .   (3.114)
Figure 3.17 shows the finite aperture plane wave decomposition of a line source placed at the position \alpha_0 = 180°, r_0 = 3 m for an aperture of R = 1 m. Figure 3.17(a) shows the frequency-angle response \bar{P}_R(\theta, \omega) and Fig. 3.17(b) the time-angle response \bar{p}_R(\theta, t). The time-angle response was calculated utilizing an inverse (time-domain) Fourier transformation. The widening of the frequency response for low frequencies due to the finite aperture can be seen clearly.
3.6
3.6.1
This section will derive the discrete plane wave decomposition. For this purpose a procedure similar to the derivation of the discrete Fourier transformation will be applied [OS99].
3.6.1.1
Due to practical aspects it is not feasible to measure the wave field to be analyzed in a continuous manner. In practice, the wave field will be measured only at a limited number of discrete points in space. As for time-domain sampling, this spatial discretization will be modeled by a spatial sampling grid consisting of spatial Dirac pulses. Sampling is best
Figure 3.17: Finite aperture plane wave decomposition of the wave field produced by a line source excited with a (time-domain) Dirac pulse. The position of the source was chosen as \alpha_0 = 180°, r_0 = 3 m, the circular aperture has a radius of R = 1 m. The frequency-angle \bar{P}_R(\theta, \omega) and time-angle \bar{p}_R(\theta, t) responses are shown.
performed in the angular and radial directions due to the underlying polar geometry of the plane wave decomposition. Thus, a pulse train consisting of spatial Diracs placed on a polar grid is the natural choice. Within this thesis, the two-dimensional polar pulse train \delta_P(\alpha, r) will be defined as follows

\delta_P(\alpha, r) = \sum_{\mu=0}^{M-1} \sum_{\eta=0}^{\infty} \frac{1}{r} \, \delta(r - \eta\Delta r) \, \delta\!\left(\alpha - \mu \frac{2\pi}{M}\right) ,   (3.115)
where M denotes the number of angular discretization points and \Delta r the radial sampling interval. Figure 3.18 illustrates the sampling grid for M = 24 angular discretization points. Although the number of sampling points is finite in the angular coordinate, it is infinite in the radial coordinate. Limiting the number of radial sampling points limits the aperture of the analyzed area. Please note that the sampling grid does not depend on the temporal frequency, since no time-domain sampling will be considered in the following.
3.6.1.2
The polar pulse train, as introduced in the previous section, can be used to model the
spatial sampling of a wave field. Due to the sifting property of the Dirac function, multiplication of the continuous wave field with the pulse train yields the values of the wave
87
Figure 3.18: Two-dimensional polar pulse train \delta_P(\alpha, r) for M = 24 angular discretization points. The circles mark the positions of the spatial Dirac pulses.
field at the discrete sampling positions. Introducing the polar pulse train into the definition of the plane wave decomposition (3.41) yields the plane wave decomposition of a
sampled field as follows
\bar{P}_*(\theta, \omega) = \mathcal{P}_*\{P_P(x_P, \omega)\} = \mathcal{P}\{P_P(x_P, \omega) \, \delta_P(x_P)\}
= \sum_{\mu=0}^{M-1} \sum_{\eta=0}^{\infty} P_P\!\left(\mu \frac{2\pi}{M}, \eta\Delta r, \omega\right) e^{jk\eta\Delta r \cos(\theta - \mu \frac{2\pi}{M})} .   (3.116)
In accordance with the discrete-time Fourier transformation (DTFT) [OS99], this transformation will be termed the discrete space plane wave decomposition. It will be denoted in the following by \mathcal{P}_* and the transformed signals by an asterisk in the index. The spectral properties of the discrete space plane wave decomposition will be analyzed in the next section.
3.6.1.3
The spectral characteristics of the discrete space plane wave decomposition can be computed using the multiplication theorem (3.69). However, this requires calculating the Fourier series expansion coefficients of the polar pulse train (3.115) using Eq. (3.16)
\tilde{\delta}(\nu, r) = \frac{1}{2\pi} \int_0^{2\pi} \delta_P(\alpha, r) \, e^{-j\nu\alpha} \, d\alpha = \frac{M}{2\pi} \sum_{\lambda=-\infty}^{\infty} \delta[\nu - \lambda M] \sum_{\eta=0}^{\infty} \frac{1}{r} \, \delta(r - \eta\Delta r) .   (3.117)
According to Section 3.3.5.4, these coefficients have to be convolved with the Fourier series expansion coefficients \tilde{P}(\nu, r, \omega) of the wave field and the result is then Hankel transformed. The entire procedure results in

\mathcal{H}_{\nu,r}\{\tilde{\delta}(\nu, r) * \tilde{P}(\nu, r, \omega)\} = \int_0^{\infty} \left[\tilde{\delta}(\nu, r) * \tilde{P}(\nu, r, \omega)\right] J_\nu(kr) \, r \, dr = \frac{M}{2\pi} \sum_{\lambda=-\infty}^{\infty} \sum_{\eta=0}^{\infty} \tilde{P}(\nu - \lambda M, \eta\Delta r, \omega) \, J_\nu(k\eta\Delta r) ,   (3.118)

so that the discrete space plane wave decomposition can be expressed as

\bar{P}_*(\theta, \omega) = M \sum_{\nu=-\infty}^{\infty} j^{-\nu} \sum_{\lambda=-\infty}^{\infty} \text{DFBT}_\nu\{\tilde{P}(\nu - \lambda M, r, \omega)\} \, e^{j\nu\theta} ,   (3.119)
where \text{DFBT}_\nu\{\cdot\} denotes the discrete Fourier-Bessel transformation of order \nu. The summation over \lambda consists of the DFBTs of repetitions of the angular spectrum \tilde{P}(\nu, r, \omega) of the pressure field at the positions \nu - \lambda M. These repetitions result from the angular discretization of the wave field. Thus, aliasing may occur if the number of angular sampling points M is finite and the repetitions of the angular spectrum overlap. In order to avoid spatial aliasing, the bandwidth of \tilde{P}(\nu, r, \omega) has to be limited. The above considerations result in the following anti-aliasing condition for an even number of angular sampling positions
\tilde{P}(\nu, r, \omega) = \begin{cases} \tilde{P}(\nu, r, \omega) , & \text{for } -\frac{M}{2} + 1 \le \nu \le \frac{M}{2} , \\ 0 , & \text{otherwise.} \end{cases}   (3.120)

If the anti-aliasing condition is met, the discrete space plane wave decomposition (3.119) simplifies to

\bar{P}_*(\theta, \omega) = M \sum_{\nu=-M/2+1}^{M/2} j^{-\nu} \, \text{DFBT}_\nu\{\tilde{P}(\nu, r, \omega)\} \, e^{j\nu\theta} .   (3.121)
The anti-aliasing condition (3.120) requires a limitation of the bandwidth in the angular frequency domain. However, this comes down to a filtering process performed on the angular coordinate of the pressure field P_P(\alpha, r, \omega). Thus, the anti-aliasing condition requires a spatial filtering of the wave field.
3.6.1.4
Equation (3.121) reveals that the discrete space plane wave decomposition can be expressed as a Fourier series with respect to the angle \theta. Thus, it is possible to express the series in terms of a discrete Fourier transformation (DFT) when the angle \theta is also discretized. However, it must be assured that the anti-aliasing condition (3.120) is met. In the following it is assumed that the discrete version of the wave field is given as

P_*(\mu, \eta, \omega) = P_P(\alpha, r, \omega) \Big|_{\alpha = \mu \frac{2\pi}{M}, \, r = \eta\Delta r}   (3.122)

and that the angle \theta is discretized as

\theta_\lambda = \lambda \frac{2\pi}{M} .   (3.123)
Introducing these definitions into the discrete space plane wave decomposition (3.121) derives the following

\bar{P}_*(\lambda, \omega) = \sum_{\mu=0}^{M-1} \sum_{\eta=0}^{\infty} P_*(\mu, \eta, \omega) \, e^{jk\eta\Delta r \cos\left((\lambda - \mu)\frac{2\pi}{M}\right)} .   (3.124)
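Equation (3.124) can be evaluated directly as a finite double sum. The sketch below feeds it a sampled plane wave; the grid parameters are arbitrary, and, as above, the analysis kernel is taken conjugate to the plane wave phase so that the decomposition peaks at the discretized incidence angle:

```python
import numpy as np

c = 343.0
f = 1000.0
k = 2 * np.pi * f / c

M = 36                      # angular sampling points
eta_max = 40                # radial samples
dr = 0.05                   # radial sampling interval (Delta r)
theta0_idx = 12             # incidence angle theta0 = 12 * 2*pi/M (= 120 deg)

mu = np.arange(M)
eta = np.arange(eta_max)
alpha = mu * 2 * np.pi / M
r = eta * dr
Rg, Ag = np.meshgrid(r, alpha, indexing="ij")

# Sampled plane wave with incidence angle theta0 (cf. Eq. (3.128)).
theta0 = theta0_idx * 2 * np.pi / M
P_star_field = np.exp(1j * k * Rg * np.cos(Ag - theta0))

# Discrete plane wave decomposition as a double sum over mu and eta.
pwd = np.empty(M, dtype=complex)
for lam in range(M):
    kern = np.exp(-1j * k * Rg * np.cos(lam * 2 * np.pi / M - Ag))
    pwd[lam] = np.sum(P_star_field * kern)

peak = int(np.argmax(np.abs(pwd)))
print(peak)
```

Note that the kernel depends on \lambda and \mu only through their difference, so the angular part of Eq. (3.124) is a circular convolution over the M grid angles and could equivalently be evaluated with FFTs.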
3.6.2
Consider a windowing of the Fourier series expansion coefficients of the wave field in the angular frequency domain

\tilde{P}_W(\nu, r, \omega) = \tilde{P}(\nu, r, \omega) \, \tilde{W}(\nu, r, \omega) ,   (3.125)

where \tilde{W}(\nu, r, \omega) denotes the window function. The above equation matches the anti-aliasing condition (3.120) if the window function is chosen as a rectangular window with a width of M in the angular frequency coordinate. Transforming Eq. (3.125) into the spatio-temporal domain using Eq. (3.16) yields
P_{P,W}(\alpha, r, \omega) = \frac{1}{2\pi} \int_0^{2\pi} P_P(\alpha', r, \omega) \, W_P(\alpha - \alpha', r, \omega) \, d\alpha' ,   (3.126)

where W_P(\alpha, r, \omega) denotes the Fourier series expansion of \tilde{W}(\nu, r, \omega). Equation (3.126) can be interpreted as a cyclic convolution of the pressure field with the window function. Hence, the anti-aliasing condition can be realized by spatially filtering the pressure field in the angular coordinate. The spatial filtering of the wave field has to take place before
Figure 3.19: Angular pulse train for M = 24 angular discretization points. The circles
mark the sampling positions. Measurements are taken only at these positions.
sampling the field, in order to avoid spatial aliasing. One possible realization of this spatial
antialiasing filter would be to use directive microphones for the recording. Unfortunately,
this would require directive microphones of very high order which are not available yet.
In the following the aliasing artifacts will be analyzed when no spatial antialiasing filter
is applied, thus the antialiasing condition is not met.
3.6.2.1
It was shown in Section 3.4 that measurements taken on a boundary surrounding the area
of interest are sufficient to derive the plane wave decomposition of the wave field enclosed
by this boundary. However, taking measurements on a circular boundary requires no
discretization in radial direction. It is sufficient to consider a discretization in the angular
coordinate only in order to derive the sampling artifacts. Without loss of generality it is
assumed in the following that the plane wave decomposition is derived from measurements
that are taken on a circular boundary with radius R. The polar pulse train (3.115)
simplifies to an angular pulse train in this case
P () =
M
1
X
=0
2) .
M
(3.127)
Figure 3.19 illustrates the angular pulse train for M = 24. The plane wave decomposition based on measurements taken on a circular boundary requires measuring the acoustic pressure and the radial particle velocity (see Section 3.4.3). Both will be measured at discrete positions in a practical implementation.
[Figure 3.20: Absolute value of the angular spectrum \tilde{P}(\nu, R, \omega) of a plane wave for R = 1 m, shown over the angular frequency \nu and the temporal frequency.]
It was shown before that arbitrary wave fields can be decomposed into plane waves. Thus, it is sufficient to discuss the case of a plane wave in order to derive the sampling artifacts for arbitrary wave fields. The derived results for this special choice of field can be generalized straightforwardly to arbitrary wave fields. Due to the radial symmetry of the plane wave decomposition, the incidence angle of the plane wave will play no role for these considerations. In the following, the sampling artifacts of a Dirac shaped plane wave will be derived.
The pressure field of a plane wave in polar coordinates can be expressed in terms of the Jacobi-Anger expansion (3.55). Thus, the acoustic pressure of a plane wave on a circular boundary with radius R is given as

P_P(\alpha, R, \omega) = e^{jkR\cos(\alpha - \theta_0)} = \sum_{\nu=-\infty}^{\infty} \underbrace{j^\nu J_\nu(kR) \, e^{-j\nu\theta_0}}_{\tilde{P}(\nu, R, \omega)} \, e^{j\nu\alpha} .   (3.128)
The calculation of the plane wave decomposition based on boundary measurements using Eq. (3.102) requires calculating the Fourier series expansion coefficients of the acoustic pressure P_P(\alpha, R, \omega). Equation (3.128) can be interpreted as a Fourier series with respect to the angle \alpha, where \tilde{P}(\nu, R, \omega) denote the expansion coefficients of the series. Figure 3.20 illustrates the angular spectrum \tilde{P}(\nu, R, \omega) of a plane wave for R = 1 m. The triangular structure of \tilde{P}(\nu, R, \omega) becomes clearer when taking a closer look at the properties of the
Figure 3.21: Bessel functions J_\nu(kr) for the orders \nu = [0, 1, 2, 5, 10, 15].
involved Bessel functions J_\nu(kR). Bessel functions of order \nu \ge 1 start with small values that increase monotonically until the maximum value of the Bessel function is reached, and then decay oscillating with 1/\sqrt{kr} for large arguments. Hence, they exhibit a kind of spatial band-pass character. Figure 3.21 shows the Bessel functions for some particular orders \nu. As a result of these properties, \tilde{P}(\nu, R, \omega) in Fig. 3.20 exhibits a kind of triangular structure.
However, due to the discretization only a limited number of M positions will be measured on the circular boundary. The effects of the angular sampling can be derived analogously to the case of polar sampling shown in Section 3.6.1.3. The sampled field \tilde{P}_s(\nu, R, \omega) can be derived by multiplying the field with the angular sampling grid \delta_P(\alpha). For the Fourier series expansion coefficients this results in the following convolution

\tilde{P}_s(\nu, R, \omega) = \tilde{P}(\nu, R, \omega) * \tilde{\delta}(\nu) = \sum_{\lambda=-\infty}^{\infty} \tilde{P}(\nu + \lambda M, R, \omega) ,   (3.129)
Figure 3.22: Illustration of the repetitions of the angular spectrum \tilde{P}(\nu, R, \omega) due to angular sampling.
where \tilde{\delta}(\nu) denotes the Fourier series expansion coefficients of the angular pulse train. The anti-aliasing condition follows analogously to Section 3.6.1.3 and is given by Eq. (3.120). However, in the following it will be assumed that the angular spectrum is not bandlimited.
Figure 3.23 shows the absolute value of the angular spectrum \tilde{P}_s(\nu, R, \omega) of a plane wave with incidence angle \theta_0 = 0° sampled at M = 24 angular sampling points with a radius of R = 1 m. The aliasing in the baseband |\nu| \le 12 caused by the repetitions of the baseband can be seen clearly. As a result of these repetitions and the properties of the Bessel functions discussed before, aliasing contributions will always be present in the angular spectrum of a sampled wave field. Thus, the discrete plane wave decomposition will always be an approximate representation of a wave field. Please note that the continuous plane wave decomposition is exact in this sense.
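The folding of the angular spectrum can be reproduced with a plain FFT: sampling the plane wave of Eq. (3.128) at M points on the circle yields DFT coefficients that equal the sum of all Jacobi-Anger coefficients shifted by multiples of M. For kR well below M the folded orders are negligible and the DFT recovers j^\nu J_\nu(kR); as kR approaches M the aliased terms grow. A small sketch with \theta_0 = 0 for simplicity:

```python
import numpy as np
from scipy.special import jv

M, kR = 32, 5.0

def angular_spectrum(kR):
    # Sample the plane wave exp(jkR cos(alpha)) (incidence angle 0) at M
    # points and return its DFT coefficients (Fourier series coefficients).
    alpha = 2 * np.pi * np.arange(M) / M
    p = np.exp(1j * kR * np.cos(alpha))
    return np.fft.fft(p) / M

coeffs = angular_spectrum(kR)

# Continuous Jacobi-Anger coefficients j^nu J_nu(kR) for nu = 0 .. M/2.
nu = np.arange(M // 2 + 1)
exact = (1j ** nu) * jv(nu, kR)

# For kR = 5 the folded orders nu +/- M are tiny, so the sampled spectrum
# matches the continuous one; the residual is the aliasing error.
err = np.max(np.abs(coeffs[: M // 2 + 1] - exact))
print(err)
```

The error is dominated by the highest resolvable order \nu = M/2, where the band-pass character of J_\nu makes the folded contribution J_{\nu-M} comparable in size to the coefficient itself, exactly the overlap sketched in Fig. 3.22.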
The wave field can be reconstructed from its angular spectrum P̄(ν, R, ω) using Eq. (3.128). However, as illustrated by Fig. 3.22, it is necessary to limit the angular frequencies used for the reconstruction, due to the aliasing contributions and the symmetry of the repeated spectra. Thus, the reconstruction of the sampled field is given as

$$P_P(\mathbf{x}_P, \omega) = \sum_{\nu=-M/2+1}^{M/2} \bar{P}_s(\nu, R, \omega)\, e^{j\nu\alpha} \,. \tag{3.130}$$
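The quality of this band-limited reconstruction can be observed numerically. The sketch below (Python; all parameters are illustration choices) evaluates the reconstruction (3.130) of a sampled plane wave between the sampling points: the error stays small while kR < M/2, i.e. while all significant Bessel orders fit into the baseband, and becomes large otherwise.

```python
import numpy as np

def reconstruction_error(kR, M):
    """Max. error of the band-limited reconstruction Eq. (3.130) of a plane wave."""
    alpha = 2 * np.pi * np.arange(M) / M
    coeffs = np.fft.fft(np.exp(1j * kR * np.cos(alpha))) / M   # sampled spectrum
    nus = np.arange(-M // 2 + 1, M // 2 + 1)                   # baseband orders
    alpha_fine = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
    p_rec = sum(coeffs[nu % M] * np.exp(1j * nu * alpha_fine) for nu in nus)
    return np.max(np.abs(p_rec - np.exp(1j * kR * np.cos(alpha_fine))))

err_low = reconstruction_error(kR=5.0, M=24)    # kR well below M/2
err_high = reconstruction_error(kR=20.0, M=24)  # kR above M/2
```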
The limitation of the order ν, however, introduces additional artifacts when reconstructing the wave field.

Summarizing, the discrete plane wave decomposition, when considering no bandlimitation of the angular spectrum, exhibits two types of artifacts:

1. aliasing and
Figure 3.23: Absolute value of the angular spectrum of a sampled plane wave signal (M = 24, R = 1 m).
2. truncation artifacts.
The following sections perform a quantitative analysis of the aliasing artifacts utilizing two different criteria, and analyze the error caused by the truncation of the angular frequencies.
3.6.2.3
In the previous section the origin of the aliasing contributions was derived in a rather qualitative fashion. Unfortunately, these aliasing contributions will lead to artifacts in the plane wave decomposition, and hence to artifacts when reconstructing a wave field from its plane wave decomposition. It was also shown that aliasing errors will always be present when assuming no bandlimitation of the angular spectrum. Figure 3.23 further illustrates that the aliasing error depends on the angular frequency ν and the temporal frequency ω. In order to perform a quantitative analysis of the error introduced, a suitable error criterion has to be chosen and evaluated. The following section will derive two different criteria for the aliasing error and their upper bounds.
Energy of the Aliasing Contributions
The interfering aliasing contributions in the baseband |ν| < M/2 are caused by the repetitions of the spectrum P̄(ν, R, ω). Thus, the energy E_al(ν, R, ω) of these repeated contributions constitutes a suitable error measure; it is given by Eq. (3.131).
Figure 3.24: Energy E_al(ν, R, ω) of the aliasing contributions for M = 24 angular sampling points (R = 1 m), together with the frequency limit f_al(ν, M, ε_al) for ε_al = −30 dB and ε_al = −60 dB. The gray levels denote the energy in dB.
Figure 3.24 shows the energy E_al(ν, R, ω) of the aliasing contributions for M = 24 angular discretization points at a radius of R = 1 m. It can be seen clearly that the energy depends on the order ν: higher orders exhibit more aliasing at lower frequencies than lower orders. This observation is also consistent with Fig. 3.23.
The calculation of E_al(ν, R, ω) requires a numerical evaluation of Eq. (3.131). In the sequel a closed-form upper bound for the aliasing error E_al will be derived. An upper bound for the absolute value of the aliasing contributions is given as

$$\bigl|\bar{P}_{al}(\nu, R, \omega)\bigr| \le \sum_{|\eta| = 1}^{\infty} \bigl|J_{\nu+\eta M}(kR)\bigr| \,. \tag{3.133}$$
When taking the properties of the Bessel functions into consideration (see also Fig. 3.21), it can be concluded that for large M the sum above can be approximated quite well by its first term. Taking additionally the symmetry of the problem into account, and noting that J_ν(·) ≥ 0 in the area of interest, an upper bound for the aliasing error is derived as follows

$$E_{al}(\nu, R, \omega) \lessapprox 2 \int_0^{\omega} J_{\nu+M}\!\left(\frac{R}{c}\,\omega'\right)^{\!2} d\omega' \,, \tag{3.134}$$

where |ν| ≤ M/2.
Using the upper bound for the Bessel functions given in [JKA02], an upper bound for the integral in Eq. (3.134) can be derived as

$$E_{al}(\nu, R, \omega) \lessapprox \frac{2c}{R\, u(\nu, M)}\, (kR)^{2\nu+2M+1} \,, \tag{3.135}$$

where

$$u(\nu, M) = 2^{2\nu+2M}\, \bigl((\nu+M)!\bigr)^2\, (2\nu+2M+1) \,. \tag{3.136}$$
This result can be used to derive a frequency limit for a given order and allowable aliasing error. If the allowable aliasing error is chosen equal for all angular frequencies, this results in

$$f_{al}(\nu, M, \epsilon_{al}) \le \frac{c}{2\pi R}\left(\frac{\epsilon_{al}\, R\, u(\nu, M)}{2c}\right)^{\!\frac{1}{2\nu+2M+1}} \,, \tag{3.137}$$

where ε_al denotes the allowable aliasing error. Equation (3.137) can be understood as a kind of anti-aliasing condition in the temporal-frequency domain. Please note that Eq. (3.137) does not provide an exact anti-aliasing condition, as is well known from time-domain sampling of one-dimensional signals [OS99, GRS01]. Without a bandlimitation of the angular spectrum, there will always be aliasing contributions in the analyzed field.
Figure 3.24 shows the derived frequency limit f_al(ν, M, ε_al) for two choices of ε_al. It can be seen clearly that the derived closed-form expression indeed provides an upper bound on the actual aliasing energy.
Energy Ratio of the Aliasing to Signal Contributions

Up to now, only the energy of the aliasing contributions was taken into account; the energy of the signal itself was discarded in these considerations. Since the aliasing contributions interfere with the desired signal components, it is reasonable to weight the energy of the aliasing contributions by the desired signal energy. The aliasing-to-signal ratio (ASR) will be defined as follows

$$\mathrm{ASR}(\nu, R, \omega) = \frac{E_{al}(\nu, R, \omega)}{\displaystyle\int_0^{\omega} J_{\nu}\!\left(\frac{R}{c}\,\omega'\right)^{\!2} d\omega'} \,. \tag{3.138}$$
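A simplified, single-frequency variant of this ratio can be measured directly from sampled data. The sketch below (Python with SciPy; kR and the compared orders are illustration choices, and the pointwise ratio replaces the frequency integrals of Eq. (3.138)) confirms that the higher orders in the baseband carry more aliasing than the lower ones:

```python
import numpy as np
from scipy.special import jv

kR = 10.0   # fixed temporal frequency (times R/c)
M = 24      # number of angular sampling points

alpha = 2 * np.pi * np.arange(M) / M
coeffs = np.fft.fft(np.exp(1j * kR * np.cos(alpha))) / M  # sampled coefficients

def ratio(nu):
    """Pointwise aliasing-to-signal ratio of the coefficient of order nu."""
    exact = 1j**nu * jv(nu, kR)          # continuous coefficient (Jacobi-Anger)
    return abs(coeffs[nu % M] - exact) / abs(exact)

asr_low, asr_high = ratio(1), ratio(11)  # low vs. high order in the baseband
```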
In the following, results obtained from a numerical evaluation of Eq. (3.138) will be shown. Figure 3.25 shows the ASR for a plane wave signal sampled at M = 24 angular points with
Figure 3.25: Aliasing-to-signal ratio ASR(ν, R, ω) for a sampled plane wave signal (M = 24, R = 1 m). The gray levels denote the ASR in dB.
a radius of R = 1 m. It can be seen that the aliasing contributions become more dominant for higher angular frequencies ν. This result indicates that it is not meaningful to use the higher angular and temporal frequencies of the analyzed wave field, since these are distorted by spatial aliasing contributions.
3.6.2.4
As stated before, it is necessary to limit the number of Fourier series expansion coefficients (angular frequencies ν) used for the reconstruction of a wave field. This truncation is necessary due to the aliasing present, and it leads to artifacts when reconstructing the wave field. In the following, these artifacts will be analyzed quantitatively.

As before, the basis for the following considerations will be a Dirac shaped plane wave. The wave field of a plane wave on a circular boundary with radius R can be expressed as the series (3.128). Truncation of this series leads to

$$P_{P,M_{tr}}(\alpha, R, \omega) = \sum_{\nu=-M_{tr}+1}^{M_{tr}} j^{\nu}\, J_{\nu}(kR)\, e^{j\nu(\alpha-\theta_0)} \,, \tag{3.139}$$
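The truncation behaviour of the series (3.139) can be observed directly: the partial sums converge quickly once M_tr exceeds kR, because the Bessel functions J_ν(kR) become negligible for ν > kR. A numerical sketch (Python with SciPy; the parameter values are illustration choices):

```python
import numpy as np
from scipy.special import jv

def truncation_error(kR, M_tr, theta0=0.0):
    """Max. deviation of the truncated series Eq. (3.139) from the plane wave."""
    alpha = np.linspace(0, 2 * np.pi, 720, endpoint=False)
    p_trunc = np.zeros_like(alpha, dtype=complex)
    for nu in range(-M_tr + 1, M_tr + 1):
        p_trunc += 1j**nu * jv(nu, kR) * np.exp(1j * nu * (alpha - theta0))
    return np.max(np.abs(p_trunc - np.exp(1j * kR * np.cos(alpha - theta0))))

kR = 10.0
err_short = truncation_error(kR, M_tr=5)    # truncation well below kR
err_long = truncation_error(kR, M_tr=20)    # truncation above kR
```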
Figure 3.26: Truncation error E_tr(M_tr, R, ω) for a plane wave (R = 1 m) and upper bound for the truncation error as given by Eq. (3.141) for different upper bounds ε_tr. The gray levels denote the signal energy in dB.
where M_tr ≤ M/2. Equation (3.139) can be used to formulate an error measure between the wave field of a plane wave and its truncated series representation

$$E_{tr}(M_{tr}, R, \omega) = \bigl\| P_P(\alpha, R, \omega) - P_{P,M_{tr}}(\alpha, R, \omega) \bigr\|_2 \,. \tag{3.140}$$

An upper bound for this truncation error is given as

$$E_{tr}(M_{tr}, R, \omega) \le \frac{\beta_{tr}(M_{tr}, R)^{M_{tr}+1}}{(M_{tr}+1)\bigl(1 - \beta_{tr}(M_{tr}, R)\bigr)} \,, \tag{3.141}$$

where

$$\beta_{tr}(M_{tr}, R) = \frac{k e R}{2(M_{tr}+1)} \,. \tag{3.142}$$
Figure 3.26 additionally shows the derived upper bound of the truncation error for two different maximum truncation errors ε_tr.

The plane wave decomposition, as given by Eq. (3.46), can be expressed as a Fourier series. The expansion coefficients of this series are given by the Hankel transformations of the angular expansion coefficients of the pressure field. However, due to the foregoing considerations, this series representation will have to be truncated as well. As a result, the discussed truncation artifacts will also be present in the plane wave decomposed wave field. Please note that truncation artifacts are also well known from the Fourier transformation [OS99]. In practice, a field representation derived this way will always be truncated. Hence, the derived results state that it is in practice not possible to exactly reconstruct a plane wave which was captured on a circular boundary. The authors of [JKA02] use this fact to define the dimensionality of a wave field. They conclude that, for a given reconstruction error, a finite number of components is sufficient to characterize a bandlimited wave field within a circular area of finite size.
3.6.3
Summary
Chapter 4
Listening Room Compensation
The problem of listening room compensation was discussed on a qualitative level in Section 1.1. This chapter will analyze the influence of the listening room on sound reproduction quantitatively, and will propose active listening room compensation as a countermeasure for its influence. For this purpose, the fundamentals of sound reproduction systems will be introduced first, followed by an analysis of the wave field generated by the sound reproduction system in the listening room. An active approach to compensate for the influence of the listening room will be developed on the basis of the foregoing analysis. This approach is first derived in continuous space and time, and then discretized. The application of traditional adaptive filtering schemes provides a promising solution at first glance. However, their application is not advisable due to fundamental problems related to the high number of (correlated) channels used for massive multichannel reproduction systems. Based on the analysis of the fundamental problems of these algorithms, an improved approach to active listening room compensation is proposed in the last part of this chapter.
4.1
Sound Reproduction
It was stated in Chapter 1 that sound reproduction systems aim at perfectly reconstructing an acoustic scene. This section will derive the fundamentals of sound reproduction systems that are capable of fulfilling this goal. For this purpose, the scenario depicted in Figure 4.1 will be considered. The wave field emitted by an arbitrary virtual source S should be reproduced in the bounded region V. This region will be termed the listening region in the following, since the listeners reside there. The virtual source S must not have contributions within V. The limitation to one virtual source poses no constraints on the wave field to be reproduced, since this source may have an arbitrary shape and frequency characteristics. Additionally, multiple sources can be reproduced on the basis of the principle of superposition.
The basic principle of sound reproduction can be illustrated with the principle of Huygens [MF53a, Wik05b].

Figure 4.1: Reproduction of the wave field emitted by a virtual source S(x, ω) inside the bounded region V and the parameters used for the Kirchhoff-Helmholtz integral (4.1).

Huygens stated that any point of a propagating wave front at
any time instant conforms to the envelope of spherical waves emanating from every point on the wavefront at the prior instant. This principle can be used to synthesize acoustic wavefronts of arbitrary shape. Spherical waves are generated by point sources (see Section 2.4.1), or approximately by closed loudspeakers. According to Huygens' principle, these loudspeakers would have to be placed on a wave front. However, it is not very practical to position the acoustic sources on the wavefronts for synthesis. By placing the loudspeakers on an arbitrary fixed curve, and by weighting and delaying the driving signals, an acoustic wavefront can be synthesized with a loudspeaker array. Figure 4.2 illustrates this principle.

The mathematical foundation of this illustrative description of sound reproduction is given by the Kirchhoff-Helmholtz integral. This principle was introduced in Section 2.6 and will be utilized in the following to derive a generic theory of sound reproduction systems.
4.1.1
The solution of the inhomogeneous wave equation for a bounded region with respect to arbitrary boundary conditions was derived in Section 2.6. It will be assumed in the following that no acoustic sources and obstacles are present within the listening region V. Under these assumptions, the Kirchhoff-Helmholtz integral (2.63) specializes to

$$P(\mathbf{x}, \omega) = -\oint_{\partial V} \left( G_0(\mathbf{x}|\mathbf{x}_0, \omega)\, \frac{\partial}{\partial n} S(\mathbf{x}_0, \omega) - S(\mathbf{x}_0, \omega)\, \frac{\partial}{\partial n} G_0(\mathbf{x}|\mathbf{x}_0, \omega) \right) dS_0 \,, \tag{4.1}$$
where S(x₀, ω) denotes the wave field produced by the virtual source S on the boundary ∂V. The underlying geometry is illustrated in Figure 4.1. Equation (4.1) is valid for all points within V (x ∈ V); outside of V the acoustic pressure P(x, ω) equals zero. In order to derive an interpretation of Eq. (4.1), the involved Green's functions have to be specified. As stated in Section 2.6.1 and Section 2.6.2, the free-field Green's function G₀(x|x₀, ω) and its directional gradient can be understood as the fields emitted by sources placed on ∂V. These sources will be termed secondary sources in the following. The strength of these sources is determined by the pressure and the directional pressure gradient of the virtual source field S(x₀, ω) on ∂V.

Thus, this specialized Kirchhoff-Helmholtz integral can be interpreted as follows: if appropriately chosen secondary sources are driven by the sound pressure and the directional pressure gradient of the wave field emitted by the virtual source S on the boundary ∂V, then the wave field within the region V is equivalent to the wave field which would have been produced by the virtual source inside V. Thus, the theoretical basis of sound reproduction is given by the Kirchhoff-Helmholtz integral (2.63).
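This reproduction property can be verified numerically. The following sketch (Python with SciPy) evaluates a two-dimensional Kirchhoff-Helmholtz integral for a plane wave on a circular boundary. Note that it uses an outward-pointing normal and the corresponding sign, which may differ from the normal orientation underlying Eq. (4.1); all geometric parameters are illustration choices.

```python
import numpy as np
from scipy.special import hankel2

k = 5.0   # wave number
R = 1.0   # radius of the circular boundary
N = 400   # quadrature points on the boundary

phi = 2 * np.pi * np.arange(N) / N
x0 = R * np.column_stack((np.cos(phi), np.sin(phi)))  # boundary points
n = x0 / R                                            # outward unit normals
ds = 2 * np.pi * R / N                                # arc length element

d = np.array([np.cos(0.3), np.sin(0.3)])              # propagation direction
S = lambda x: np.exp(1j * k * (x @ d))                # virtual source: plane wave
dS_dn = 1j * k * (n @ d) * S(x0)                      # its normal derivative on the boundary

def reproduce(x):
    """Evaluate the 2D Kirchhoff-Helmholtz integral at the point x."""
    diff = x - x0
    r = np.linalg.norm(diff, axis=1)
    G = -0.25j * hankel2(0, k * r)                    # 2D free-field Green's function
    dG_dn = -0.25j * k * hankel2(1, k * r) * np.sum(diff * n, axis=1) / r
    return ds * np.sum(G * dS_dn - S(x0) * dG_dn)

x_in = np.array([0.3, 0.2])
err_inside = abs(reproduce(x_in) - S(x_in))           # field is reproduced inside V
p_outside = abs(reproduce(np.array([1.5, 0.0])))      # pressure vanishes outside V
```

The trapezoidal quadrature over the smooth periodic integrand converges very quickly, so the interior field is reproduced essentially to machine precision.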
Please note that the space V in Eq. (4.1) may be two- or three-dimensional. In the first case, V describes a plane and ∂V the closed curve surrounding it; in the second case, V describes a volume and ∂V the closed surface surrounding it. However, the Green's function, and thus the secondary sources used in the Kirchhoff-Helmholtz integral, depend on the dimensionality of the problem. The free-field Kirchhoff-Helmholtz integral for three- and two-dimensional regions V was already derived in Sections 2.6.2 and 2.6.1. The next two sections will illustrate the application of this principle to sound reproduction in two and three dimensions.
4.1.2
The three-dimensional free-field Green's function was already derived in Section 2.4.2 and is given by Eq. (2.33). In the context of sound reproduction, it can be interpreted as the field of a monopole point source distribution on the surface ∂V. The Kirchhoff-Helmholtz integral (4.1) also involves the directional gradient of the Green's function. The directional gradient of the three-dimensional free-field Green's function, as given by Eq. (2.66), can be interpreted as the field of a dipole source whose main axis lies in the direction of the normal vector n. Thus, the Kirchhoff-Helmholtz integral states, in the three-dimensional case, that the acoustic pressure inside the volume V can be controlled by a monopole and a dipole point source distribution on the surface ∂V enclosing the volume V.
4.1.3
In general it will not be feasible to control the pressure and its gradient on the entire two-dimensional surface of a three-dimensional volume. Typical reproduction systems are restricted to reproduction in a plane only. This reduction of dimensionality is reasonable for most scenarios due to the spatial characteristics of human hearing [Bla96]. The two-dimensional Kirchhoff-Helmholtz integral has been derived in Section 2.6.2. Please note that within this work (see Section 2.1.2) the term two-dimensional refers to truly two-dimensional wave fields, or to fields that are independent of one of the three spatial coordinates. The required two-dimensional free-field Green's function for the Kirchhoff-Helmholtz integral is given by Eq. (2.68). It can be interpreted as the field of a monopole line source which intersects the reproduction plane at the position x₀. The directional gradient of the two-dimensional free-field Green's function, as given by Eq. (2.69), can be interpreted as the field of a dipole line source whose main axis lies in the direction of the normal vector n. Thus, the Kirchhoff-Helmholtz integral states, in this case, that the acoustic pressure in the plane V can be controlled by a monopole and a dipole line source distribution on the closed curve ∂V surrounding the plane.
4.1.4
The Kirchhoff-Helmholtz integral states that a sound reproduction system may be realized with secondary monopole and dipole sources. In practice it is desirable to utilize only one of these two source types. Thus, one of the two secondary source terms in the Kirchhoff-Helmholtz integral (4.1) has to be eliminated. Since monopole sources can be realized, as a first approximation, by closed loudspeakers, it is reasonable to drop the dipole sources. However, similar principles as shown in the following can be applied to drop the monopole contributions instead.

The second term in the Kirchhoff-Helmholtz integral (4.1), belonging to the dipole secondary sources, can be eliminated by assuming homogeneous Neumann boundary conditions on ∂V. As a result, the boundary ∂V will be modeled as an acoustically rigid boundary. In order to fulfill this requirement, the Green's function used in the Kirchhoff-Helmholtz integral has to be modified. As a consequence of the desired boundary condition, this modified Green's function G(x|x₀, ω) has to obey the following condition

$$\frac{\partial}{\partial n}\, G(\mathbf{x}|\mathbf{x}_0, \omega)\,\bigg|_{\mathbf{x}_0 \in \partial V} = 0 \,. \tag{4.2}$$
The desired Green's function can be derived by adding a suitable homogeneous solution (with respect to the region V) to the free-field Green's function

$$G(\mathbf{x}|\mathbf{x}_0, \omega) = G_0(\mathbf{x}|\mathbf{x}_0, \omega) + G_{0,m}(\mathbf{x}_m(\mathbf{x})|\mathbf{x}_0, \omega) \,, \tag{4.3}$$

where G_{0,m}(x_m(x)|x₀, ω) denotes a suitable free-field Green's function with the source point x₀ and the receiver point x_m(x). The receiver point x_m(x) has to be chosen such that G(x|x₀, ω) fulfills the presumed Neumann boundary condition. In the following, the Green's function G_{0,m}(x_m(x)|x₀, ω) will be derived for the three-dimensional case. However, the same principles can also be applied to the two-dimensional case, as will be shown later in this section.
The directional gradient of the modified Green's function (4.3) is given as

$$\frac{\partial}{\partial n}\, G(\mathbf{x}|\mathbf{x}_0, \omega) = \frac{\partial}{\partial n}\left( \frac{1}{4\pi}\, \frac{e^{-jk|\mathbf{x}-\mathbf{x}_0|}}{|\mathbf{x}-\mathbf{x}_0|} + \frac{1}{4\pi}\, \frac{e^{-jk|\mathbf{x}_m(\mathbf{x})-\mathbf{x}_0|}}{|\mathbf{x}_m(\mathbf{x})-\mathbf{x}_0|} \right)$$

$$= -\frac{1+jk|\mathbf{x}-\mathbf{x}_0|}{4\pi|\mathbf{x}-\mathbf{x}_0|}\, \frac{e^{-jk|\mathbf{x}-\mathbf{x}_0|}}{|\mathbf{x}-\mathbf{x}_0|}\, \cos\varphi \,-\, \frac{1+jk|\mathbf{x}_m(\mathbf{x})-\mathbf{x}_0|}{4\pi|\mathbf{x}_m(\mathbf{x})-\mathbf{x}_0|}\, \frac{e^{-jk|\mathbf{x}_m(\mathbf{x})-\mathbf{x}_0|}}{|\mathbf{x}_m(\mathbf{x})-\mathbf{x}_0|}\, \cos\psi = 0 \,, \tag{4.4}$$

where φ and ψ denote the angles between the normal vector n and the vectors x − x₀ and x_m(x) − x₀, respectively. A solution of Eq. (4.4) is given by choosing the receiver
Figure 4.3: Illustration of the geometry used for the derivation of the modified Green's function for a sound reproduction system using monopole secondary sources only.
point x_m(x), according to Fig. 4.3, as the point x mirrored at the tangent to the curve ∂V at the position x₀. This way, the directional gradient of G_{0,m}(x_m(x)|x₀, ω) is equal to the directional gradient of G₀(x|x₀, ω) but with opposite sign, since cos φ and cos ψ have the same magnitude but opposite signs for this special geometry. Thus, G_{0,m}(x_m(x)|x₀, ω) is given as

$$G_{0,m}(\mathbf{x}_m(\mathbf{x})|\mathbf{x}_0, \omega) = \frac{1}{4\pi}\, \frac{e^{-jk|\mathbf{x}_m(\mathbf{x})-\mathbf{x}_0|}}{|\mathbf{x}_m(\mathbf{x})-\mathbf{x}_0|} \,. \tag{4.5}$$
Please note that the point x_m(x) always lies outside of the region V (x_m ∈ ℝ³ \ V). Introducing Eq. (4.5) into Eq. (4.4), with the geometry depicted in Fig. 4.3, shows that G_{0,m}(x_m(x)|x₀, ω) fulfills the requirement prescribed by Eq. (4.2). Thus, the reproduction of arbitrary wave fields inside the volume V, using a distribution of monopole sources only on the surface ∂V, is possible in principle. However, two problems remain:

1. the field outside the volume V will not vanish, and

2. the unmodified reproduction system would produce undesired reflections due to the homogeneous Neumann boundary condition assumed to discard the dipoles.

In the sequel, these two problems will be addressed, as well as two-dimensional sound reproduction.
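The mirror construction can be checked numerically: after reflecting the receiver point at the tangent plane, the normal derivative of the combined Green's function vanishes. A sketch (Python; the boundary point, receiver point and wave number are arbitrary illustration choices):

```python
import numpy as np

k = 10.0  # wave number

def G0(a, b):
    """3D free-field Green's function e^{-jk|a-b|} / (4 pi |a-b|)."""
    r = np.linalg.norm(a - b)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

# a boundary point with unit outward normal n; here taken on a cylinder of radius 1
x0 = np.array([np.cos(0.7), np.sin(0.7), 0.0])
n = x0.copy()

x = np.array([0.2, -0.1, 0.3])          # receiver point inside V
xm = x - 2 * np.dot(x - x0, n) * n      # x mirrored at the tangent plane through x0

h = 1e-6                                # step for the central difference along n
g = lambda t: G0(x, x0 + t * n) + G0(xm, x0 + t * n)  # modified Green's function
g0 = lambda t: G0(x, x0 + t * n)                      # free-field part alone

dG_dn = abs((g(h) - g(-h)) / (2 * h))   # vanishes, cf. Eq. (4.2)
dG0_dn = abs((g0(h) - g0(-h)) / (2 * h))  # does not vanish
```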
Consequences of the wave field produced outside of the listening area

The field outside the volume V will not vanish, as would be the case for reproduction based on the full Kirchhoff-Helmholtz integral. Equation (4.5), together with Eq. (4.3), states that the field outside the region V is a mirrored version of the field within V. As a consequence, ∂V has to be concave in order to avoid deteriorations of the wave field within the listening region V by its mirrored version.
Both Green's functions G₀(x|x₀, ω) and G_{0,m}(x_m(x)|x₀, ω) can be interpreted as point sources placed at the position x₀. Their wave fields produced inside V are equivalent, since by construction |x − x₀| = |x_m(x) − x₀|. The Green's function G(x|x₀, ω) used for reproduction is then given by introducing Eq. (4.5) into Eq. (4.3) as

$$G(\mathbf{x}|\mathbf{x}_0, \omega) = 2\, G_0(\mathbf{x}|\mathbf{x}_0, \omega) \,. \tag{4.6}$$

Hence, the strength of the virtual source has to be doubled to cope with the additional secondary source at x₀ represented by G_{0,m}(x_m(x)|x₀, ω). The wave field at arbitrary receiver points x is then given as

$$P(\mathbf{x}, \omega) = -\frac{1}{4\pi} \oint_{\partial V} 2\, \frac{\partial}{\partial n} S(\mathbf{x}_0, \omega)\, \frac{e^{-jk|\mathbf{x}-\mathbf{x}_0|}}{|\mathbf{x}-\mathbf{x}_0|}\, dS_0 \,. \tag{4.7}$$
Consequences of the imposed Neumann boundary condition

The homogeneous Neumann boundary condition chosen in the derivation above implicitly models the surface ∂V as a rigid surface. As a result, the reproduction of the desired wave field using Eq. (4.7) will additionally reproduce reflections at the (virtual) boundary ∂V. These reflections, however, are not desired, since the reproduction system should model a free-field space V. The reflections at the border ∂V only take place for those components of the wave field where the local propagation direction of the wave field to be reproduced does not coincide with the normal vector n. Thus, these undesired reflections are avoided by exciting only those secondary sources whose normal vector n coincides with the local propagation direction of the wave field to be reproduced. This selection can be performed by introducing a window function a(x₀) into Eq. (4.7)

$$P(\mathbf{x}, \omega) = -\frac{1}{4\pi} \oint_{\partial V} 2\, a(\mathbf{x}_0)\, \frac{\partial}{\partial n} S(\mathbf{x}_0, \omega)\, \frac{e^{-jk|\mathbf{x}-\mathbf{x}_0|}}{|\mathbf{x}-\mathbf{x}_0|}\, dS_0 \,, \tag{4.8}$$
where the window function selects the active secondary sources

$$a(\mathbf{x}_0) = \begin{cases} 1 \,, & \text{if the local propagation direction of } S \text{ at } \mathbf{x}_0 \text{ coincides with } \mathbf{n} \,, \\ 0 \,, & \text{otherwise} \,. \end{cases} \tag{4.9}$$
Equation (4.8), together with Eq. (4.9), comprises the basis of sound reproduction systems utilizing secondary monopole sources only. If the distribution of monopole sources on ∂V is driven by the directional pressure gradient of the virtual source, weighted by the window function a(x₀), then the wave field of this source is reconstructed perfectly inside V. The terms involved in driving the secondary sources in Eq. (4.8) can be combined into the driving function D(x₀, ω)

$$D(\mathbf{x}_0, \omega) = -2\, a(\mathbf{x}_0)\, \frac{\partial}{\partial n} S(\mathbf{x}_0, \omega) = 2\, a(\mathbf{x}_0)\, j\omega\rho_0\, V_{n,S}(\mathbf{x}_0, \omega) \,, \tag{4.10}$$

where V_{n,S}(x₀, ω) denotes the particle velocity of the virtual source in the direction of the surface normal n.
Two-dimensional sound reproduction

Similar considerations as performed above lead to the wave field created by a distribution of monopole line sources on the closed curve ∂V

$$P(\mathbf{x}, \omega) = -\frac{j}{4} \oint_{\partial V} D(\mathbf{x}_0, \omega)\, H_0^{(2)}(k\,|\mathbf{x}-\mathbf{x}_0|)\, dS_0 \,, \tag{4.11}$$

where the driving function D(x₀, ω) is given by Eq. (4.10). Figure 4.4 shows the reproduced wave field when using a two-dimensional circular distribution of secondary monopole sources. The radius of the circular region was chosen as R = 1.50 m. Two cases were evaluated: (1) the left column shows the results when the window function a(x₀) is discarded; (2) the right column shows the results when incorporating the window function. These results were calculated by numerical evaluation of Eq. (4.11). It can be clearly seen that the window function eliminates the reflections introduced by the Neumann boundary conditions. The wave field is reproduced correctly within the circular distribution of secondary monopole sources. The wave field outside of that region does not vanish, as it would when using secondary monopole and dipole sources.
Linear secondary source distributions

The closed contour ∂V can be degenerated to an infinite line for two-dimensional reproduction, or to an infinite plane for three-dimensional reproduction. The line/plane will then divide the two-/three-dimensional space into two regions, one of which can be chosen as the listening area. However, only virtual source fields whose local propagation direction at the secondary source distribution coincides with the normal vector n can be reproduced. Specializing Eq. (4.11) to the case that the secondary source contour ∂V degenerates to an infinite line located on the x-axis yields

$$P_C(\mathbf{x}_C, \omega) = -\frac{j}{4} \int_{-\infty}^{\infty} D_C(\mathbf{x}_{C,0}, \omega)\, H_0^{(2)}(k\,|\mathbf{x}_C-\mathbf{x}_{C,0}|)\, dx_0 \,, \tag{4.12}$$

where x_{C,0} = [x₀ 0]ᵀ. The above formulation can also be derived from the two-dimensional Rayleigh integral [Sta97]. Equation (4.12) can be used to describe linear secondary source distributions. These are frequently used for practical implementations of spatial sound reproduction systems, e.g. for wave field synthesis systems (see Section 5.1).
Focused virtual sources
The theory of sound reproduction introduced so far assumed that the virtual source
S(x, ) has no contributions within the listening area. The theory can be extended to
Figure 4.4: Reproduction of a bandlimited (sinc shaped) plane wave using a two-dimensional circular distribution of secondary monopole sources (R = 1.50 m), shown at the time instants t = 0.0 ms, t = 2.0 ms and t = 5.0 ms. The left column shows the resulting wave field when discarding the window function a(x₀), the right column when taking it into account.
focused virtual sources, i.e. virtual sources with contributions in the listening area. This extension will not be considered here, but can be found in [Ver97, TAG+01, YTF03].
4.1.5
It was shown in the previous section that the local traveling direction of the virtual source wave field has to be taken into account for reproduction using a monopole-only secondary source distribution. The plane wave decomposition, as introduced in Section 3.3.2, inherently includes this information. In the following, the loudspeaker driving signals will be derived from the plane wave decomposed wave field of the virtual source. Without loss of generality, the discussion will be limited to the two-dimensional case.

The representation of a wave field by its plane wave decomposition is given by the inverse plane wave decomposition (3.45). Hence, the wave field of the virtual source S(x, ω) can be represented as follows

$$S_P(\mathbf{x}_P, \omega) = \frac{k}{(2\pi)^2} \int_0^{2\pi} \bar{S}(\theta, \omega)\, e^{jkr\cos(\theta-\alpha)}\, d\theta \,, \tag{4.13}$$

where S̄(θ, ω) denotes the plane wave expansion coefficients of the virtual source field S_P(x_P, ω), as given by Eq. (3.41). The calculation of the monopole driving signal (4.10) includes the calculation of the directional gradient of the virtual source field. The gradient of the exponential term in Eq. (4.13) can be derived as follows

$$\nabla\, e^{jkr\cos(\theta-\alpha)} = jk\, e^{jkr\cos(\theta-\alpha)} \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} \,. \tag{4.14}$$
Using the above result, the gradient of Eq. (4.13) follows as

$$\nabla S_P(\mathbf{x}_P, \omega) = \frac{jk^2}{(2\pi)^2} \int_0^{2\pi} \bar{S}(\theta, \omega)\, e^{jkr\cos(\theta-\alpha)} \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} d\theta \,, \tag{4.15}$$
where α and r denote the coordinates of the spatial polar coordinate system. The directional gradient of the plane wave decomposed virtual source field takes the local geometry of the virtual boundary ∂V into account. Figure 4.5 illustrates the underlying geometry for one particular plane wave with the wave vector k (∠k = θ). The directional gradient of the plane wave decomposed virtual source field can be expressed as

$$\frac{\partial}{\partial n} S_P(\mathbf{x}_P, \omega) = \langle \mathbf{n},\, \nabla S_P(\mathbf{x}_P, \omega) \rangle = \frac{jk^2}{(2\pi)^2} \int_0^{2\pi} \bar{S}(\theta, \omega)\, \cos(\theta-\beta)\, e^{jkr\cos(\theta-\alpha)}\, d\theta \,, \tag{4.16}$$

where β denotes the angle of the surface normal n (∠n = β). The driving signal D(x₀, ω) can be derived from Eq. (4.16) by using Eq. (4.10) together with a suitably chosen window function a(x₀). Considering the geometry depicted in Fig. 4.5, the generic definition (4.9)
Figure 4.5: Geometric parameters used to derive the driving signals for the reproduction of a plane wave decomposed wave field. The geometry is illustrated for one particular plane wave, with k denoting the wave vector of this plane wave.
of the window function can be formulated more precisely in terms of the angles of the individual plane wave contributions

$$a(\mathbf{x}_0) = \begin{cases} 1 \,, & \text{if } |\theta - \beta| \le \pi/2 \,, \\ 0 \,, & \text{otherwise} \,. \end{cases} \tag{4.17}$$

The influence of the window function is illustrated by the gray wedge in Fig. 4.5. Introducing Eq. (4.16) and Eq. (4.17) into Eq. (4.10) yields the driving signal at the location x_{P,0} = [α₀ r₀]ᵀ as

$$D_P(\mathbf{x}_{P,0}, \omega) = -\frac{2jk^2}{(2\pi)^2} \int_{\beta-\pi/2}^{\beta+\pi/2} \bar{S}(\theta, \omega)\, \cos(\theta-\beta)\, e^{jkr_0\cos(\theta-\alpha_0)}\, d\theta \,. \tag{4.18}$$

The presented results reveal that a plane wave decomposition of the virtual source S(x, ω) allows one to conveniently incorporate the effect of the window function a(x₀) into the formulation of the driving signal D(x₀, ω).
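For a discrete set of secondary sources, the window (4.17) simply activates the sources whose normal direction deviates from the plane wave propagation direction by at most π/2. A minimal sketch (Python; the circular contour and the inward-pointing normals are assumptions made for this example):

```python
import numpy as np

M = 8                                  # secondary sources on a circular contour
phi = 2 * np.pi * np.arange(M) / M     # angular source positions
beta = phi + np.pi                     # inward-pointing normal angles (assumption)

def window(theta, beta):
    """Window function Eq. (4.17): 1 if |theta - beta| <= pi/2 (angles wrapped)."""
    diff = np.angle(np.exp(1j * (theta - beta)))   # wrap to (-pi, pi]
    return (np.abs(diff) <= np.pi / 2).astype(float)

theta_pw = 0.1                         # propagation direction of the plane wave
a = window(theta_pw, beta)
active = int(a.sum())                  # only the half facing the incoming wave
```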
4.1.6
Practical implementations of sound reproduction systems can only utilize a limited number of secondary sources placed at discrete positions. This spatial sampling of the secondary source distribution may lead to spatial aliasing. The following section will derive the effects of spatial sampling and suitable anti-aliasing conditions. The theory will be presented for the two-dimensional case; however, it can also be extended in a straightforward way to the three-dimensional case using the presented techniques.

The reproduced wave field within the listening area V was derived in Section 4.1.4. It is given by Eq. (4.11), which can be generalized as follows

$$P(\mathbf{x}, \omega) = \oint_{\partial V} D(\mathbf{x}_0, \omega)\, V(\mathbf{x}-\mathbf{x}_0, \omega)\, dS_0 \,, \tag{4.19}$$

where V(x − x₀, ω) denotes the wave field of the secondary sources, including the negative sign. In the two-dimensional case these secondary sources constitute line sources. The wave field of the secondary line sources is given by the two-dimensional free-field Green's function (2.68)

$$V_{2D}(\mathbf{x}-\mathbf{x}_0, \omega) = -\frac{j}{4}\, H_0^{(2)}(k\,|\mathbf{x}-\mathbf{x}_0|) \,. \tag{4.20}$$

Sampling of a one-dimensional signal in the time domain leads to repetitions of the spectrum of this signal [GRS01, OS99]. Due to these repetitions of the spectrum, a frequency domain analysis of temporal sampling is most convenient. Aliasing artifacts will be present in a time-domain sampled signal if the signal is not bandlimited, or if the bandlimited repeated spectra overlap. The same principles as used for time-domain signals can be applied to the spatial sampling of multidimensional signals.

Equation (4.19) can be understood as a generalized multidimensional convolution integral. The convolution is performed on the contour ∂V with the listening position x as parameter. This generalized convolution can be derived from the multidimensional convolution (3.32) by a suitable parameterization of the boundary ∂V. For the derivation of the sampling artifacts, a spatio-temporal frequency domain description of the reproduced wave field is desired. Due to the convolutional structure of Eq. (4.19), this will lead to a multiplication of the respective spatio-temporal spectra.
Linear Arrays

Figure 4.6: Geometry used to derive the sampling artifacts for linear loudspeaker arrays. The dots denote the sampling positions of the driving function D_{C,S}(x, ω).
However, no detailed analysis of the aliasing artifacts has been performed so far. This section analyzes the spatial aliasing artifacts of linear secondary source distributions and derives an anti-aliasing condition.

It will be assumed that the secondary source distribution is located on the x-axis (y = 0) of a Cartesian coordinate system, and in a first step has infinite length. Figure 4.6 illustrates the geometry of the line array. The reproduced wave field is given by specializing Eq. (4.19) to the geometry depicted in Fig. 4.6, as follows

$$P_C(\mathbf{x}_C, \omega) = \int_{-\infty}^{\infty} D_C(\mathbf{x}_{C,0}, \omega)\, V_C(\mathbf{x}_C - \mathbf{x}_{C,0}, \omega)\, dx_0 \,, \tag{4.21}$$

where for a line array placed on the x-axis x_{C,0} = [x₀ 0]ᵀ. Equation (4.21) exhibits the form of a convolution integral along the x-axis. Applying a two-dimensional spatial Fourier transformation and the convolution theorem (3.8) yields the pressure field in the spatio-temporal frequency domain as

$$\tilde{P}_C(\mathbf{k}_C, \omega) = \tilde{D}_C(k_x, \omega)\, \tilde{V}_C(\mathbf{k}_C, \omega) \,, \tag{4.22}$$

where D̃_C(k_x, ω) = F_x{D_C(x, ω)}. In order to derive the wave field reproduced by a discrete distribution of secondary sources, it is assumed that the driving function D_C(x, ω) is sampled at equidistant discrete positions. The process of sampling can be modeled by a multiplication of the continuous driving function with a series of Dirac functions
$$D_{C,S}(x, \omega) = D_C(x, \omega) \sum_{\mu=-\infty}^{\infty} \delta(x - \mu\,\Delta x) = \sum_{\mu=-\infty}^{\infty} D_C(\mu\,\Delta x, \omega)\, \delta(x - \mu\,\Delta x) \,, \tag{4.23}$$

where D_{C,S}(x, ω) denotes the sampled driving function and Δx the distance (sampling period) between the sampling positions. The sampling positions are indicated in Fig. 4.6
by the dots. The result of sampling is a series of weighted Dirac pulses at the sampling positions. The spatio-temporal spectrum of the sampled driving function can be calculated by applying a spatial Fourier transformation to Eq. (4.23) with respect to the x-coordinate

$$\tilde{D}_{C,S}(k_x, \omega) = \frac{1}{2\pi}\, \tilde{D}_C(k_x, \omega) *_{k_x} \frac{2\pi}{\Delta x} \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \eta\,\frac{2\pi}{\Delta x}\right) = \frac{1}{\Delta x} \sum_{\eta=-\infty}^{\infty} \tilde{D}_C\!\left(k_x - \eta\,\frac{2\pi}{\Delta x},\, \omega\right) \,. \tag{4.24}$$
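The repetition of the spectrum in Eq. (4.24) can be confirmed for a driving function with a known transform. The sketch below (Python; a Gaussian driving function is an arbitrary illustration choice) compares the transform of the Dirac-train sampled function with the sum of shifted continuous spectra:

```python
import numpy as np

sigma = 1.0   # width of the Gaussian driving function
dx = 0.5      # spatial sampling period
kx = 3.0      # spatial frequency under test

# spectrum of the sampled driving function: sum over the sample positions
m = np.arange(-200, 201)
D_sampled = np.sum(np.exp(-(m * dx) ** 2 / (2 * sigma**2)) * np.exp(-1j * kx * m * dx))

def D_cont(k):
    """Closed-form continuous spectrum of the Gaussian (kernel e^{-jkx})."""
    return np.sqrt(2 * np.pi) * sigma * np.exp(-(sigma * k) ** 2 / 2)

# Eq. (4.24): repetitions of the continuous spectrum, scaled by 1/dx
eta = np.arange(-20, 21)
D_repeated = np.sum(D_cont(kx - eta * 2 * np.pi / dx)) / dx
```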
As for time-domain sampling, the spatial sampling results in a repetition of the spectrum of the continuous driving function D̃_C(k_x, ω) along the spatial frequency axis k_x. Introducing D̃_{C,S}(k_x, ω) into Eq. (4.22) yields the spectrum of the wave field reproduced by a sampled secondary source distribution as

$$\tilde{P}_{C,S}(\mathbf{k}_C, \omega) = \frac{1}{\Delta x} \sum_{\eta=-\infty}^{\infty} \tilde{D}_C\!\left(k_x - \eta\,\frac{2\pi}{\Delta x},\, \omega\right) \tilde{V}_C(\mathbf{k}_C, \omega) \,. \tag{4.25}$$
In order to derive the effects of spatial sampling and a sampling theorem, the spatio-temporal spectra of the driving function $\tilde{D}_C(k_x, \omega)$ and of the secondary sources $\tilde{V}_C(k_C, \omega)$ have to be specialized. The Fourier transformation of a secondary line source was derived in Appendix C.3.1 and is given by Eq. (C.16). The spectrum of the driving function $\tilde{D}_C(k_x, \omega)$ depends on the wave field S(x, ω) of the virtual source to be reproduced. Since arbitrary wave fields can be decomposed into plane waves, the following paragraph will derive the sampling artifacts for the reproduction of a plane wave. The driving function for a plane wave is given as

$$D_{C,\mathrm{pw}}(x, \omega) = 2 j \frac{\omega}{c} \sin(\theta_{\mathrm{pw}})\, e^{-j \frac{\omega}{c} x \cos \theta_{\mathrm{pw}}} \;, \qquad (4.26)$$
where θ_pw denotes the incidence angle of the plane wave. For the upper half plane (y > 0), the secondary source distribution is only capable of reproducing plane waves traveling into the positive y-direction. Thus, it is reasonable to limit the incidence angle of the virtual plane waves to 0 < θ_pw < π. As a consequence of this limitation, the window function becomes a constant a(x_0) = 1. Calculating the spectrum of the driving signal D_{C,pw}(x, ω)
and introducing this result into Eq. (4.25) yields the reproduced wave field as

$$\tilde{P}_{C,S}(k_C, \omega) = 4\pi j \frac{\omega}{c} \sin\theta_{\mathrm{pw}}\, \frac{2\pi}{\Delta x} \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \eta \frac{2\pi}{\Delta x} - \frac{\omega}{c} \cos\theta_{\mathrm{pw}}\right) \tilde{V}_C\!\left(\eta \frac{2\pi}{\Delta x} + \frac{\omega}{c} \cos\theta_{\mathrm{pw}},\, k_y,\, \omega\right) . \qquad (4.27)$$
Thus, the spectrum of the reproduced wave field for a discrete distribution of secondary sources is given as a series of Diracs weighted by the spectrum of the secondary sources evaluated at the positions of these Diracs. The term for η = 0 comprises the desired plane wave. The other terms in the sum, for η ≠ 0, are potential aliasing contributions. Their strength depends on the weighting given by $\tilde{V}_C(k_C, \omega)$. In the ideal case, the spatio-temporal spectrum of the secondary sources would have to be chosen such that it filters out the aliasing contributions. A suitable choice would be a spatio-temporal lowpass filter.
However, for sound reproduction the spectrum $\tilde{V}_C(k_C, \omega)$ is given by the secondary sources and their underlying physics. The spectrum $\tilde{V}_C(k_C, \omega)$ for line sources as secondary sources was derived in Appendix C.3.1 and is given by Eq. (C.16). Introducing this result into Eq. (4.27) yields

$$\tilde{P}_{C,S}(k_C, \omega) = 4\pi j \frac{\omega}{c} \sin\theta_{\mathrm{pw}}\, \frac{2\pi}{\Delta x} \left( \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \eta \frac{2\pi}{\Delta x} - \frac{\omega}{c} \cos\theta_{\mathrm{pw}}\right) \frac{1}{k}\, \delta\!\left(\sqrt{k_x^2 + k_y^2} - \frac{\omega}{c}\right) + j \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \eta \frac{2\pi}{\Delta x} - \frac{\omega}{c} \cos\theta_{\mathrm{pw}}\right) \frac{1}{k_x^2 + k_y^2 - (\omega/c)^2} \right) . \qquad (4.28)$$
The reproduced spectrum consists of a real and an imaginary part. The imaginary part can be identified as being produced by the near-field of the secondary sources. Hence, this part will not be considered further for the sampling considerations derived in the following. Figure 4.7 illustrates the real part of $\tilde{P}_{C,S}(k_C, \omega)$ in the spatial (k_x, k_y) frequency plane. For a fixed temporal frequency ω, the first Delta function in the real part of Eq. (4.28) can be interpreted as a series of Dirac lines perpendicular to the k_x axis in the spatial frequency plane at the positions k_x = η 2π/Δx + (ω/c) cos θ_pw. The second Delta function can be interpreted as a circular Dirac pulse with the radius ω/c. Due to the sifting property of Dirac functions, the result of the multiplication of the two Dirac functions is given by their crossings in the spatial frequency plane. For the situation shown in Fig. 4.7 the result will be two Diracs at the positions indicated by the dots. In this particular example, these two Diracs represent the desired wave field of a plane wave traveling into the positive y-direction for the upper half plane (y > 0) and into the negative y-direction for the lower half plane (y < 0). This symmetry results from the reproduction using only
Figure 4.7: Illustration of the real part of the spectrum $\tilde{P}_{C,S}(k_C, \omega)$ reproduced by a discrete secondary monopole source distribution for the reproduction of a plane wave with incidence angle θ_pw. The resulting spectrum is given by the intersection of the two Dirac functions at the positions indicated by the dots.
secondary monopoles as discussed in Section 4.1.4.
However, for an increasing distance Δx between the secondary sources there may also be additional contributions in the reproduced wave field. The repetitions of the first Delta function in the real part of Eq. (4.28) for η ≠ 0 move towards the circular Delta function for an increasing distance Δx. If these repetitions overlap with the circular Delta function, additional plane wave contributions will result due to the sifting property. These contributions constitute the spatial aliasing due to the spatial sampling of the secondary source distribution. They are avoided if the frequency of the reproduced plane wave is limited. The anti-aliasing condition for the driving function can be derived from Fig. 4.7 and Eq. (4.28) as

$$f \leq \frac{c}{\Delta x \left(1 + |\cos \theta_{\mathrm{pw}}|\right)} \;. \qquad (4.29)$$
Thus, a reduction of the temporal bandwidth of the reproduced plane wave avoids aliasing contributions in the reproduced wave field. For arbitrary wave fields, condition (4.29) has to be fulfilled for the minimum and maximum incidence angles of their plane wave contributions.
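As a numerical illustration, the anti-aliasing condition (4.29) can be evaluated for typical parameters. This is a sketch only; the speed of sound c = 343 m/s is an assumption, as no value is fixed in the text above.

```python
import math

def f_alias(dx, theta_pw, c=343.0):
    """Maximum alias-free temporal frequency according to Eq. (4.29):
    f <= c / (dx * (1 + |cos(theta_pw)|))."""
    return c / (dx * (1.0 + abs(math.cos(theta_pw))))

# Secondary source distance of 0.15 m, as in the example of Fig. 4.8
for deg in (0, 45, 90):
    print(f"theta_pw = {deg:3d} deg -> f <= {f_alias(0.15, math.radians(deg)):7.1f} Hz")
```

The condition is least restrictive for θ_pw = 90° (broadside incidence), where the alias-free bandwidth reaches c/Δx, roughly 2.3 kHz for Δx = 0.15 m.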
If the anti-aliasing condition given by Eq. (4.29) is not fulfilled, aliasing artifacts will be present in the reproduced wave field. According to Fig. 4.7 and Eq. (4.28), these artifacts will be a superposition of plane waves (for η ≠ 0) with different incidence angles. However, only those spectral repetitions will result in spatial aliasing contributions where the circular Dirac pulse and the Dirac lines in Fig. 4.7 intersect.
Figure 4.8: Incidence angle of the desired plane wave θ_pw (dashed line) and its aliasing contributions θ_pw,al (solid lines). The desired monochromatic plane wave has a frequency of f_0 = 10 kHz and an incidence angle of θ_pw = 90°; the secondary source distance was chosen as Δx = 0.15 m.
Hence, only a subset of all possible spectral repetitions will be present in the reproduced wave field for a particular incidence angle and frequency of the desired plane wave. This subset includes all η_al ∈ ℤ\{0} for which the following condition holds

$$\left| \eta_{\mathrm{al}}\, \frac{2\pi}{\Delta x} + \frac{\omega}{c} \cos\theta_{\mathrm{pw}} \right| \leq \frac{\omega}{c} \;. \qquad (4.30)$$

Using this subset, the incidence angles θ_pw,al of the plane waves produced by aliasing can then be derived from Eq. (4.28) as

$$\cos\theta_{\mathrm{pw,al}} = \eta_{\mathrm{al}}\, \frac{2\pi c}{\omega\, \Delta x} + \cos\theta_{\mathrm{pw}} \;. \qquad (4.31)$$
Figure 4.8 shows the incidence angles of the desired plane wave and its aliasing contributions for the reproduction of a monochromatic plane wave with a frequency of f_0 = 10 kHz and a secondary source distance of Δx = 0.15 m. The incidence angle of the plane wave was θ_pw = 90°.
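The angles shown in Fig. 4.8 can be reproduced numerically from conditions (4.30) and (4.31). The following sketch assumes c = 343 m/s:

```python
import math

def aliasing_angles(f0, dx, theta_pw_deg, c=343.0):
    """Incidence angles (in degrees) of the aliased plane waves
    according to Eqs. (4.30) and (4.31)."""
    k = 2.0 * math.pi * f0 / c                 # wave number omega/c
    cos_pw = math.cos(math.radians(theta_pw_deg))
    angles = []
    for eta in range(1, 10**6):
        # cos(theta_pw_al) for the repetitions +eta and -eta, Eq. (4.31)
        cands = [s * eta * 2.0 * math.pi / (dx * k) + cos_pw for s in (1, -1)]
        # Eq. (4.30): only repetitions intersecting the circle of radius omega/c remain
        valid = [ca for ca in cands if abs(ca) <= 1.0]
        if not valid:
            break
        angles.extend(math.degrees(math.acos(ca)) for ca in valid)
    return sorted(angles)

al = aliasing_angles(10e3, 0.15, 90.0)
print(len(al), "aliasing contributions at", [round(a, 1) for a in al], "deg")
```

For the parameters of Fig. 4.8 this yields eight aliased plane waves, symmetric about the desired 90° incidence.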
Up to now, only the real part of the reproduced spectrum has been considered, since the imaginary part is related to near-field effects of the secondary sources. The poles of the imaginary part are also located on the circle shown in Fig. 4.7. Applying the sifting property of the Delta function, the spectrum of these contributions is given by evaluating the
Figure 4.9: Illustration of the effects caused by truncation of an infinite linear array. The gray area illustrates approximately the area where the wave field of a plane wave with the incidence angle θ_pw is reproduced.
imaginary part at k_x = η 2π/Δx + (ω/c) cos θ_pw. The result is not bandlimited in the k_y but in the k_x direction. Hence, the anti-aliasing condition (4.29) applies also to the imaginary part. The aliasing contributions of the imaginary part have the form of evanescent plane waves.
Figure 4.10: Geometry used to derive the sampling artifacts of circular loudspeaker arrays. The dots denote the spatial sampling positions of the driving function D_{P,S}(α_0, R, ω).
Since the aliasing artifacts of reproduced plane waves will be plane waves themselves, these aliasing artifacts will not be present at all listener positions. Those plane waves that are relevant at a given listener position can be found easily by the geometric approximation discussed above (see also Fig. 4.9). A special case is represented by a plane wave incidence angle of θ_pw = 90° and listener positions far away from the array: no aliasing artifacts will be present. The aliasing frequency is infinite in this case.
Since typical listening rooms have a rectangular shape, rectangular arrays are frequently used to build spatial auralization systems. These consist of four truncated linear arrays, one on each side. Thus, rectangular arrays can be regarded as a superposition of truncated linear arrays, and the sampling theory introduced so far can be applied. However, care has to be taken that only those linear arrays are considered which are active for a particular plane wave to be reproduced. This selection can be performed on the basis of the window function a(x_0).
4.1.6.2 Circular Arrays
In the following, the spatial sampling of a circularly shaped secondary source distribution with radius R will be investigated. Figure 4.10 illustrates the geometry of the considered circular array. Due to the underlying circular geometry of the problem it is convenient to use polar coordinates for the description of the reproduced wave field. The reproduced wave field P_P(x_P, ω) can be derived by specializing Eq. (4.19) to the geometry depicted in
Fig. 4.10

$$P_P(x_P, \omega) = -\frac{j}{4} \int_0^{2\pi} D_P(\alpha_0, R, \omega)\, H_0^{(2)}(kr)\, R\, d\alpha_0 \;, \qquad (4.32)$$

where r = |x_P − x_{P,0}| and V_{2D}(x_P − x_{P,0}, ω) as given by Eq. (4.20) was introduced. The Hankel function $H_0^{(2)}(kr)$ in Eq. (4.32) can be expressed by Bessel and Hankel functions which depend only on one of the positions x_P and x_{P,0} by using the shift theorem of the Hankel functions given by Eq. (3.110). Introducing Eq. (3.110) for r_0 = R and r < R into
Eq. (4.32) yields the reproduced wave field inside the circular boundary V as
Z 2
j X
(2)
j
PP (xP , ) =
J (kr) H (kR) R e
DP (0 , R, ) ej0 d0 =
4 =
0
X
R, ) ej ,
= j R
J (kr) H(2) (kR) D(,
2 =
(4.33)
where for the second equality the definition of the Fourier series, as given by Eq. (3.16), was introduced. Equation (4.33) states that the reproduced wave field is given by a Fourier series with respect to the angle α. The coefficients of this Fourier series are given by the Fourier series coefficients $\mathring{D}(\nu, R, \omega)$ of the driving function weighted by a Bessel and a Hankel function.
The effect of discretizing the secondary source distribution is modeled by sampling the loudspeaker driving function D_P(α_0, R, ω) at equidistant angles, resulting in a total of N sampled secondary source positions. The sampled driving function D_{P,S}(α_0, R, ω) is given as

$$D_{P,S}(\alpha_0, R, \omega) = D_P(\alpha_0, R, \omega)\, \delta_P(\alpha_0) \;, \qquad (4.34)$$
where δ_P(α_0) denotes the angular pulse train, as defined by Eq. (3.127). The effects of angular sampling were discussed in Section 3.6.2.2. Sampling will result in repetitions of the angular spectrum, as illustrated by Eq. (3.129). Applying this principle to the sampled driving function D_{P,S}(α_0, R, ω) results in the Fourier series coefficients $\mathring{D}_S(\nu, R, \omega)$ of the sampled driving function

$$\mathring{D}_S(\nu, R, \omega) = \sum_{\eta=-\infty}^{\infty} \mathring{D}(\nu + \eta N, R, \omega) \;. \qquad (4.35)$$
Introducing Eq. (4.35) into Eq. (4.33) yields the wave field P_{P,S}(x_P, ω) reproduced by a discrete secondary source distribution as

$$P_{P,S}(x_P, \omega) = -j \frac{\pi R}{2} \sum_{\nu=-\infty}^{\infty} \sum_{\eta=-\infty}^{\infty} J_\nu(kr)\, H_\nu^{(2)}(kR)\, \mathring{D}(\nu + \eta N, R, \omega)\, e^{j\nu\alpha} \;. \qquad (4.36)$$
The terms for η = 0 constitute the desired wave field. Please note that the effects of the limited aperture of the array are inherently included in Eq. (4.36) by the Hankel function $H_\nu^{(2)}(kR)$. The terms for η ≠ 0 are potential aliasing contributions. Their energy should be zero in the ideal case, or at least be minimized in practical applications.
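The repetition of the angular spectrum in Eq. (4.35) can be illustrated with a small numerical sketch: sampling a toy driving function at N angles makes its Fourier series coefficients indistinguishable modulo N. The particular function and the choice N = 8 are assumptions for illustration.

```python
import cmath

N = 8                                   # number of secondary sources on the circle
def D(alpha):
    """Toy driving function with Fourier coefficients at nu = 3 and nu = -6."""
    return cmath.exp(3j * alpha) + 0.5 * cmath.exp(-6j * alpha)

samples = [D(2 * cmath.pi * n / N) for n in range(N)]
# Fourier coefficients computed from the N angular samples (a plain DFT):
coeff = [sum(s * cmath.exp(-2j * cmath.pi * k * n / N)
             for n, s in enumerate(samples)) / N
         for k in range(N)]

# Eq. (4.35): the coefficient at nu = -6 aliases onto nu = 2 (= -6 + N),
# indistinguishable from a true nu = 2 component after sampling.
print([round(abs(c), 3) for c in coeff])
```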
The formulation of the reproduced wave field in terms of angular frequencies given by Eq. (4.36) can be used to split the reproduced wave field P_{P,S}(x_P, ω) into the wave field P_{P,S,0}(x_P, ω) containing no aliasing contributions and its aliasing contributions P_{P,S,al}(x_P, ω). The wave field P_{P,S,0}(x_P, ω) would have been reproduced by a continuous secondary source distribution and is given as

$$P_{P,S,0}(x_P, \omega) = -j \frac{\pi R}{2} \sum_{\nu=-\infty}^{\infty} J_\nu(kr)\, H_\nu^{(2)}(kR)\, \mathring{D}(\nu, R, \omega)\, e^{j\nu\alpha} \;. \qquad (4.37)$$
The aliasing contributions P_{P,S,al}(x_P, ω) reproduced by a discretized secondary source distribution can be derived from the spectral repetitions present in Eq. (4.36)

$$P_{P,S,\mathrm{al}}(x_P, \omega) = -j \frac{\pi R}{2} \sum_{\nu=-\infty}^{\infty} \sum_{\eta=-\infty,\ \eta \neq 0}^{\infty} J_\nu(kr)\, H_\nu^{(2)}(kR)\, \mathring{D}(\nu + \eta N, R, \omega)\, e^{j\nu\alpha} \;. \qquad (4.38)$$
This split-up of the reproduced wave field can be used to calculate the energy of the aliasing contributions with respect to the desired wave field. The reproduced aliasing-to-signal ratio RASR is defined according to the aliasing-to-signal ratio ASR derived in Section 3.6.2.3 as follows

$$\mathrm{RASR}(x_P) = \frac{\int_0^{\infty} |P_{P,S,\mathrm{al}}(x_P, \omega)|^2 \, d\omega}{\int_0^{\infty} |P_{P,S,0}(x_P, \omega)|^2 \, d\omega} \;. \qquad (4.39)$$
In general, the RASR will depend on the desired wave field and the listener position. As for the linear arrays, the reproduction of a plane wave will be considered in the following.
Sampling artifacts for the reproduction of plane waves
The driving function for the reproduction of a plane wave can be derived according to Eq. (4.10) by considering the window function a(x_0) and calculating the directional gradient of the wave field of a plane wave; the resulting continuous driving function D_{P,pw}(α_0, R, ω) is given by Eq. (4.40). The window function for a plane wave is given as

$$a_{\mathrm{pw}}(\alpha_0) = \begin{cases} 1 & \text{for } \theta_{\mathrm{pw}} + \frac{\pi}{2} \leq \alpha_0 \leq \theta_{\mathrm{pw}} + \frac{3\pi}{2} \;, \\ 0 & \text{otherwise.} \end{cases} \qquad (4.41)$$
Introducing Eq. (4.41) into Eq. (4.40) allows calculating the Fourier series coefficients $\mathring{D}_{S,\mathrm{pw}}(\nu, R, \omega)$ of the sampled driving function for the reproduction of a plane wave. These can then be used to calculate the wave field reproduced by a discrete secondary source distribution using Eq. (4.36). It was shown in Section 3.6.2.2 that a plane wave exhibits an infinite bandwidth in the angular frequency domain. As a result, no exact anti-aliasing condition can be given for the reproduction of plane waves on circular arrays. In the following, results derived by numerical evaluation of Eq. (4.36) will be shown.
The reproduction of a bandlimited (sinc-shaped) plane wave with an incidence angle of θ_pw = 3π/2 on a circular array was numerically evaluated for this purpose. The circular array consists of 48 secondary line sources placed on a circle with a radius of R = 1.50 m. The aliasing artifacts will depend on the bandwidth of the desired plane wave. Figure 4.11(a) shows a snapshot of the reproduced wave field P_{P,S}(x_P, ω) for a bandwidth of 1 kHz. The desired plane wave as well as the aliasing contributions can be seen clearly. Figure 4.11(b) additionally illustrates the extracted aliasing contributions P_{P,S,al}(x_P, ω) of Fig. 4.11(a). Figure 4.12 shows the RASR(x_P) for different maximum frequencies. The presented results show that the RASR depends on the listener position and the bandwidth of the reproduced plane wave. Two conclusions can be drawn from Fig. 4.12: (1) the higher the bandwidth of the plane wave, the more energy is contained in the aliasing contributions of the reproduced field, and (2) the farther the listener position is from the active secondary sources, the lower the energy of the aliasing contributions. The latter conclusion was also derived for the truncated linear arrays discussed in the previous section.
4.1.6.3 Arbitrarily Shaped Arrays

The derivation of sampling artifacts for circular arrays, as given in the previous section, can be generalized to the case of arbitrarily shaped arrays. This will be illustrated briefly in the following.
The basic idea is to map an arbitrarily shaped listening area V and its boundary ∂V to a circular listening area V′ and its boundary ∂V′. Figure 4.13 illustrates the desired mapping. The Riemann mapping theorem [Wei03] states that every simply connected region can be mapped with a one-to-one transformation to a unit circular region (with radius R = 1) using an analytic function. The desired mapping can be performed using a conformal mapping [Wei03, SS67] M which maps the Cartesian coordinate system of the arbitrary listening region V shown on the left side of Fig. 4.13 into the circular region V′ shown on the right side of Fig. 4.13. A benefit of using a conformal mapping is that the local right angles of the Cartesian coordinate system are preserved in the transformed domain. It remains to find a suitable analytic function for this purpose. Tables for a wide variety of geometries can be found in [SS67]. Since conformal mappings
Figure 4.12: RASR(x_P) for different maximum frequencies of the reproduced plane wave: (a) f = 500 Hz, (b) f = 650 Hz, (c) f = 800 Hz, (d) f = 1000 Hz.
Figure 4.13: Derivation of the sampling artifacts of an arbitrarily shaped array by performing a conformal mapping to a circular array.
can also be combined, these tables of conformal mappings allow handling or approximating nearly all possible reproduction geometries. Once a suitable conformal mapping has been found for the coordinates, it can be introduced into the relations for circular arrays derived in the previous section. The mapping of an equidistantly sampled arbitrarily shaped array onto a circular array may result in a non-equidistant angular sampling on the circle, as illustrated by Fig. 4.13. As a consequence of this irregular angular sampling, the effects of angular sampling will not be a simple repetition of the Fourier series expansion coefficients $\mathring{D}_S(\nu, R, \omega)$ of the loudspeaker driving function as given by Eq. (4.35). Instead, the sampling will result in a convolution with a complex function in the angular frequency domain. As a result, it will not be possible to derive a generic anti-aliasing theorem.
4.1.6.4 Secondary Point Sources
Secondary line sources are required for the reproduction of a wave field in a plane only. However, typical implementations of two-dimensional reproduction systems use point sources as secondary sources, since point sources can be approximated quite well by closed loudspeakers. In this case the secondary source field is given as

$$V_{3D}(x - x_0, \omega) = \frac{1}{4\pi} \frac{e^{-j \frac{\omega}{c} |x - x_0|}}{|x - x_0|} \;. \qquad (4.42)$$
The concept of wave field synthesis introduced in Section 5.1 is an example of a sound reproduction system utilizing secondary point sources.

The spatial Fourier transformation of V_{3D}(x − x_0, ω) has been derived in Section C.3.2. The poles of the spatial Fourier transformation $\tilde{V}_{C,3D}(k_C, \omega)$ are located on a circle with
Figure 4.14: Illustration of the geometry used to derive the characteristics of a sound
reproduction system located in the listening room.
radius ω/c in the (k_x, k_y) domain. Comparison with the imaginary part of the spectrum derived for a sampled linear secondary line source distribution, as given by Eq. (4.28), shows a strong similarity. For the reproduction of a plane wave, the resulting spectrum for point sources as secondary sources is not bandlimited in the k_y but in the k_x direction. Hence, the sampling theory presented so far can also be applied with minor changes to the case of secondary point sources.
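The different physics of the two secondary source types can be made concrete by comparing amplitude decays: the point source of Eq. (4.42) decays as 1/r, while the line source, represented here by the large-argument asymptotic of the Hankel function, decays as 1/√r. The frequency and speed of sound below are assumptions for illustration.

```python
import math

def v3d_amp(r):
    """Amplitude of the secondary point source, Eq. (4.42): 1/(4*pi*r)."""
    return 1.0 / (4.0 * math.pi * r)

def v2d_amp(r, f=1000.0, c=343.0):
    """Far-field amplitude of the secondary line source -j/4 * H0^(2)(kr),
    using the large-argument asymptotic |H0^(2)(kr)| ~ sqrt(2/(pi*k*r))."""
    k = 2.0 * math.pi * f / c
    return 0.25 * math.sqrt(2.0 / (math.pi * k * r))

# Doubling the distance halves the point-source amplitude,
# but reduces the line-source amplitude only by a factor of sqrt(2):
print(v3d_amp(1.0) / v3d_amp(2.0), v2d_amp(1.0) / v2d_amp(2.0))
```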
4.2
Figure 4.15: Response of the reproduction system located in the listening room illustrated as a space shift-variant system.
The listening region V is located inside the listening room, which is modeled by an arbitrarily shaped bounded region R surrounded by the boundary ∂R. In order to derive the influence of the listening room, the acoustic conditions at the boundary ∂R have to be considered. In the following it will be assumed that the walls of the listening room are not actively vibrating. Hence, their characteristics can be modeled by homogeneous boundary conditions formulated on ∂R. The general solution of the inhomogeneous wave equation subject to homogeneous boundary conditions was derived in Section 2.6. For the geometry depicted in Fig. 4.14, the wave field L(x, ω) within R (x ∈ R) is given by specializing Eq. (2.62) as follows
specializing Eq. (2.62) as follows
Z
L(x, ) = D(x0 , ) G(xx0, ) dV0 ,
(4.43)
R
where G(x|x_0, ω) denotes a suitably chosen Green's function which conforms to the homogeneous boundary conditions imposed on ∂R, and D(x_0, ω) the driving function of the monopole distribution. The negative sign of Eq. (4.43) in comparison to Eq. (2.62) is chosen in accordance with the negative sign of the inhomogeneous part of the inhomogeneous wave equation (2.57). It was derived in the previous section that the wave field within V can be controlled by a distribution of monopole secondary sources on ∂V. Thus, D(x_0, ω) will only have contributions on the boundary ∂V. Hence, the integral above can be rewritten in terms of the virtual boundary ∂V

$$L(x, \omega) = -\oint_{\partial V} D(x_0, \omega)\, G(x|x_0, \omega)\, dS_0 \;. \qquad (4.44)$$
Please note that the integral (4.44) is still valid for the entire listening room. The response of the listening room to the auralized wave field, as given by Eq. (4.44), can be understood as a linear space shift-variant system (H(x, x_0, ω) = G(x|x_0, ω)). Figure 4.15 illustrates this interpretation of Eq. (4.44).
4.3
The review of classical approaches to room compensation presented in Section 1.2 stated fundamental requirements for an improved active listening room compensation system. In particular, the first two were: (1) sufficient observation and (2) control of the wave field within the listening region. The required observational capabilities of the room compensation system can be realized by the techniques derived in Section 3. The techniques described there can be utilized to analyze the wave field within a bounded region by measurements on the boundary of that region. However, for room compensation it is also required to gain control over the wave field within the listening region. The methods for sound reproduction introduced in Section 4.1 allow controlling the wave field within a bounded region by a distribution of monopoles on the boundary of that region. Thus, destructive interference can be used to compensate for the reflections of the listening room in the listening area. The following section will derive the fundamentals of active listening room compensation using destructive interference.
4.3.1
It will be assumed in the following that the compensation of the listening room influence is performed only within the listening region V. It was derived in Section 4.1.4 that a distribution of monopoles is suitable to control the acoustic pressure field within the enclosed region. In the context of room compensation this principle can also be interpreted as follows: if the influence of the listening room is compensated on the boundary surrounding the listening region, then the reproduction within that region will be free of artifacts emerging from a non-ideal listening room. This basic idea is utilized in the following.
The reproduced wave field within the listening region (x ∈ V) is given by Eq. (4.44)

$$L(x, \omega) = -\oint_{\partial V} D(x_0, \omega)\, G(x|x_0, \omega)\, dS_0 \;, \qquad (4.45)$$

where L(x, ω) denotes the reproduced wave field within the listening area and G(x|x_0, ω) a suitably chosen Green's function that fulfills the boundary conditions imposed at the walls of the listening room. The underlying geometry is illustrated by Fig. 4.14. The Green's function G(x|x_0, ω) inherently includes the effects of the listening room on the auralized wave field, e. g. the acoustic reflections. However, these effects are not desired, as shown in Section 1.1. Desired are free-field propagation conditions for the reproduction.
The desired wave field A(x, ω) within the listening area can be derived from Eq. (4.44) as

$$A(x, \omega) = -\oint_{\partial V} D(x_0, \omega)\, G_0(x|x_0, \omega)\, dS_0 \;, \qquad (4.46)$$

where G_0(x|x_0, ω) denotes a suitably chosen free-field Green's function. The desired wave field, as given by Eq. (4.46), can be interpreted as the response of an LTSI system. Figure 4.16 illustrates the computation of the desired wave field. The error between the desired and the reproduced wave field is defined as follows

$$E(x, \omega) = A(x, \omega) - L(x, \omega) \;. \qquad (4.47)$$
Figure 4.16: Calculation of the desired wave field illustrated as an LTSI system.
Figure 4.17: Pre-equalization of the driving function: the driving signal D(x_0, ω) is filtered by the compensation filter C(x_0'|x_0, ω), yielding the pre-equalized signal W(x_0, ω), which is reproduced through the listening room response G(x|x_0, ω), resulting in L(x, ω).

The influence of the listening room is compensated by pre-equalizing the driving signal with a compensation filter C(x_0'|x_0, ω), as illustrated by Fig. 4.17,

$$W(x_0, \omega) = \oint_{\partial V} C(x_0'|x_0, \omega)\, D(x_0', \omega)\, dS_0' \;. \qquad (4.48)$$

The reproduced wave field L(x, ω) when pre-equalizing the driving signal is then derived by combining Eq. (4.45) and Eq. (4.48) as

$$L(x, \omega) = -\oint_{\partial V} W(x_0, \omega)\, G(x|x_0, \omega)\, dS_0 = -\oint_{\partial V} \oint_{\partial V} C(x_0'|x_0, \omega)\, D(x_0', \omega)\, G(x|x_0, \omega)\, dS_0'\, dS_0 \;. \qquad (4.49)$$
Perfect listening room compensation is achieved if the reproduced wave field is as close as possible to the desired (free-field) wave field. Thus, the compensation filter has to ensure that L(x, ω) is close to A(x, ω). Under the assumption of arbitrary excitation by D(x_0, ω), a suitable compensation filter C(x_0|x_0', ω) can be derived by comparing the integrand of Eq. (4.49) to the integrand of Eq. (4.46)

$$\oint_{\partial V} C(x_0|x_0', \omega)\, G(x|x_0', \omega)\, dS_0' = G_0(x|x_0, \omega) \;, \qquad (4.50)$$
where x_0 and x_0' have been interchanged in Eq. (4.49) to derive Eq. (4.50). The above result states that the optimal room compensation filter C(x_0|x_0', ω) can be found by solving the integral equation (4.50). In case of an anechoic listening room (free-field propagation) the compensation filter is given as

$$C(x_0|x_0', \omega) = \delta(x_0 - x_0') \;. \qquad (4.51)$$

This solution is evident, since it states that no filtering is required in this simplified case. However, for generic listening rooms a solution of Eq. (4.50) can become quite complex. The integral on the left-hand side of this equation can be interpreted as a generalized convolution operation. In contrast to the standard convolution operation, this formulation takes the space-variance of the problem explicitly into account. The fundamental problem given by Eq. (4.50) is termed the deconvolution or inverse-filtering problem in the context of LTSI systems. Hence, the process of calculating an appropriate compensation filter C(x_0|x_0', ω) for active room compensation is similar to solving a space-variant deconvolution problem.
In order to calculate the compensation filters, explicit knowledge of the Green's function G(x|x_0, ω) is required. In general, this function will not be known a priori and will have to be derived from acoustic measurements taken in the listening room. Additionally, the acoustic characteristics of the listening room may change due to, e. g., persons entering the room or temperature variations. The active listening room compensation system has to cope with these changes. These requirements call for an adaptive solution to room compensation.

Summarizing, active listening room compensation exhibits two fundamental problems:

1. the complex solution of the space-variant deconvolution problem given by Eq. (4.50), and

2. the listening room transfer function G(x|x_0, ω) is not known a priori.

The following two sections will briefly address these two problems.
4.3.2
As stated in the previous section, the inverse filtering problem for a spatio-temporal LTSI system leads to a result similar to Eq. (4.50). However, in this case the integral on the left-hand side of Eq. (4.50) degenerates to a (space-invariant) convolution integral. A solution to this shift-invariant problem is given by transforming the signals into the spatio-temporal frequency domain by performing a multidimensional Fourier transformation (3.2a). The compensation filter can then be derived by dividing the transfer function of the desired system response by that of the actual system response for each spatial and temporal frequency. This results from the convolution theorem (3.8) of the multidimensional Fourier transformation. Unfortunately, this procedure is not applicable to shift-variant systems, since the convolution theorem does not hold. In the following, a generic solution to the shift-variant deconvolution problem is presented which is based upon the functional transformation method (FTM) [TR03].
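The shift-invariant special case described above reduces to a pointwise spectral division. The following self-contained sketch demonstrates this with a hand-rolled DFT and toy discrete responses (both assumptions for illustration); the actual response adds a single "reflection" to the desired free-field response:

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

N = 16
g0 = [0.0] * N; g0[0] = 1.0                 # desired (free-field) response: identity
g  = [0.0] * N; g[0] = 1.0; g[3] = 0.5      # actual response: one added reflection

# Shift-invariant deconvolution: divide the spectra bin by bin
C = [a / b for a, b in zip(dft(g0), dft(g))]
c = idft(C)                                  # compensation filter (circular)

# Verify: the compensated response c * g equals the desired g0
check = idft([ck * gk for ck, gk in zip(dft(c), dft(g))])
print([round(v, 6) for v in check[:6]])
```

The division is well defined here because the toy room spectrum has no zeros; for measured responses a regularization of near-zero bins would be required.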
Equation (4.50) has the form of a Fredholm integral equation of the first kind with a symmetric Green's function as kernel. The solution to such a problem may be found by expanding the Green's function G(x|x_0, ω) into an eigenfunction series [MF53a]. A wide variety of expansions can be used for this purpose. One expansion that could be used was already introduced in Section 2.7.2 by the concept of the modal expansion (2.79). However, this expansion is only suitable for rectangular rooms with nearly rigid walls. The functional transformation method provides a versatile framework for the solution of partial differential equations in bounded domains by means of a series expansion. In the following, this concept will be utilized to provide a generic solution to the deconvolution problem given by Eq. (4.50).
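The flavor of such a series expansion can be conveyed by a one-dimensional modal sketch in the spirit of the modal expansion (2.79). This is not the FTM expansion itself, and the "room" length, damping, and rigid-end eigenfunctions are assumptions for illustration:

```python
import math

c_snd = 343.0                  # speed of sound (assumed)
Lx = 5.0                       # length of a 1D "room" (assumed)
sigma = 5.0                    # modal damping constant (assumed)

def G(x, x0, omega, n_modes=200):
    """Modal series for a 1D Green's function with rigid ends:
    eigenfunctions cos(n*pi*x/Lx), eigenfrequencies omega_n = n*pi*c/Lx."""
    s = 1j * omega
    total = 0.0
    for n in range(n_modes):
        w_n = n * math.pi * c_snd / Lx
        K = math.cos(n * math.pi * x / Lx) * math.cos(n * math.pi * x0 / Lx)
        N_n = Lx if n == 0 else Lx / 2.0          # mode normalization
        total += K / (N_n * (s * s + 2.0 * sigma * s + w_n * w_n))
    return total

w1 = math.pi * c_snd / Lx                          # first eigenfrequency
on, off = abs(G(1.0, 2.0, w1)), abs(G(1.0, 2.0, 1.5 * w1))
print(f"|G| at the first mode: {on:.2e}, between modes: {off:.2e}")
```

Near an eigenfrequency a single resonant term dominates the series; it is exactly such resonant coefficients that a compensation filter has to invert.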
Using the FTM, the Green's function G(x|x_0, ω) on a bounded domain with arbitrary boundary conditions can be expanded into the following series [Pet04]

$$G(x|x_0, \omega) = \sum_{\nu} \frac{1}{N_\nu} \frac{\tilde{K}(x_0, \nu)\, K(x, \nu)}{j\omega\, (j\omega + \sigma_\nu)} \;, \qquad (4.52)$$
Introducing the expansions of the Green's function and the compensation filter into Eq. (4.50) then yields

$$G_0(x|x_0, \omega) = \oint_{\partial V} C(x_0|x_0', \omega)\, G(x|x_0', \omega)\, dS_0' = \sum_{\nu} \frac{c(\nu)}{j\omega\, (j\omega + \sigma_\nu)}\, \tilde{K}(x_0, \nu)\, K(x, \nu) \;, \qquad (4.54)$$
where the biorthogonality property of the kernels was exploited [PR04] to derive the second equality. The above equation can then be solved for the coefficients c(ν) by using the biorthogonality property again

$$c(\nu) = \frac{j\omega\, (j\omega + \sigma_\nu)}{N_\nu\, \tilde{K}(x_0, \nu)} \oint_{\partial V} G_0(x|x_0, \omega)\, \tilde{K}(x, \nu)\, dS \;. \qquad (4.55)$$
Introducing this into Eq. (4.53) yields an explicit expression for the compensation filters

$$C(x_0|x_0', \omega) = \sum_{\nu} \frac{j\omega\, (j\omega + \sigma_\nu)}{N_\nu}\, K(x_0, \nu) \oint_{\partial V} G_0(x|x_0', \omega)\, \tilde{K}(x, \nu)\, dS \;. \qquad (4.56)$$
The integral on the right-hand side of Eq. (4.56) can be interpreted as the expansion of the free-field Green's function G_0(x|x_0, ω) into the eigenfunctions K(x, ν) and the adjoint eigenfunctions $\tilde{K}(x, \nu)$ of the wave equation bounded by the listening room. Division of these by the expansion coefficients of the Green's function of the room G(x|x_0, ω) then yields the expansion coefficients of the compensation filter C(x_0|x_0', ω). Thus, by expanding the related functions into a series with respect to the kernels K(x, ν) and $\tilde{K}(x, \nu)$, a closed-form solution is found for the deconvolution problem given by Eq. (4.50). For the space-invariant case, the eigenfunctions of the underlying systems are exponential functions (see Section 3.2.3). A transformation of the functions using the Fourier transformation provides an equivalent solution for the (space-invariant) deconvolution problem, since the Fourier transformation has exponential functions as kernels. However, the presented solution still requires knowledge of the Green's function of the room. A solution to this problem will be discussed in the next section.
4.3.3
Figure 4.18: Block diagram of a system for the adaptation of the room compensation
filter.
The room compensation problem, as defined within this thesis, is supervised, since the monopole driving function D(x_0, ω) is assumed to be known and thus can be utilized to adapt the compensation filters. The equivalent unsupervised problem would have to derive the room compensation filters from measurements only. Inverse filtering is one of various cases where adaptive filters provide a convenient solution. Adaptive filtering schemes in general utilize the error between the desired and the actual system response in order to adapt the filter. For a system given in continuous time and space, this filter has to be parameterized by a finite number of parameters [Son67, SK92]. If operating optimally, the adaptive filter is adapted such that the error between the desired and the actual system response is minimized. A wide variety of algorithms have been developed in the past decades to tackle this fundamental problem [Hay96].

Adapting the structure of a generic adaptive inverse filtering scheme to the problem of active room compensation results in the block diagram given by Fig. 4.18. The monopole driving signal D(x_0, ω) is fed through the room compensation filter C(x_0|x_0', ω), resulting in the filtered driving signal W(x_0, ω). The wave field L(x, ω) within the listening region is then determined by the room transfer function G(x|x_0, ω). Since the room transfer function is not known in general, the wave field L(x, ω) within the listening region will be measured in a generic room compensation system. The desired wave field A(x, ω) within the listening region is given by the monopole driving signal D(x_0, ω) and the free-field transfer function G_0(x|x_0, ω). The reproduction error E(x, ω) between the desired wave field A(x, ω) and the reproduced wave field L(x, ω) is then used to adapt the compensation filters C(x_0|x_0', ω). The relations between the particular signals shown in Fig. 4.18 were already derived in Section 4.3.
The continuous space and time representation used so far is not appropriate for a practical
realization of room compensation. The following section will develop a generic framework
for adaptive active room compensation which is based on a discrete time and space representation of the involved signals and systems.
4.4
The previous sections illustrated the solution to active room compensation, in principle, in
continuous time and space. For a practical system, both time and space have to be
sampled appropriately. Additionally, the formulations used in the previous sections were
independent of the dimensionality of the problem. Due to complexity constraints, practical reproduction systems are typically limited to reproduction in a plane only. In the
following discussions, the case of two-dimensional reproduction and analysis will be considered. However, the principles derived in the sequel can be generalized straightforwardly
to three-dimensional sound reproduction systems. Since the room transfer function is not
known a priori or may change over time, an adaptive solution for calculating the compensation filters is favorable. The need for an adaptive calculation of the room compensation
filters was e. g. illustrated in [PSR05] by simulating a non-adaptive room compensation
system operating under varying acoustic conditions.
The following section presents a discrete framework for an adaptive active room compensation system, derives an adaptation algorithm and highlights the fundamental problems
of adaptive filtering for massive multichannel reproduction systems.
4.4.1
In order to derive a discrete realization of the active listening room compensation system,
spatial sampling, temporal sampling and a frequency domain description of the signals
and systems are discussed in the sequel.
4.4.1.1
Spatial Discretization
A practical realization of room compensation requires sampling the monopole line source
distribution on the line ∂V surrounding the listening area V. The effects of spatial
sampling of the monopole distribution were already discussed in Section 4.1.6. The spatial sampling may result in spatial aliasing if the anti-aliasing conditions derived in Section 4.1.6 are not met reasonably well. In particular, the temporal bandwidth of the reproduced
wave field has to be limited in order to avoid spatial aliasing. Thus, if the temporal bandwidth is reduced, then the sampled version of the monopole line distribution with an
135
111111111111111111111111111111111111111
000000000000000000000000000000000000000
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
N
1
discrete
2
000000000000000000000000000000000000000
111111111111111111111111111111111111111
synthesis
000000000000000000000000000000000000000
111111111111111111111111111111111111111
and analysis
000000000000000000000000000000000000000
111111111111111111111111111111111111111
positions
000000000000000000000000000000000000000
111111111111111111111111111111111111111
M
1
000000000000000000000000000000000000000
111111111111111111111111111111111111111
2
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
x
000000000000000000000000000000000000000
111111111111111111111111111111111111111
listening area
000000000000000000000000000000000000000
111111111111111111111111111111111111111
x
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
0
000000000000000000000000000000000000000
111111111111111111111111111111111111111
V
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
listening room
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
n
Figure 4.19: Discretized version of the active room compensation problem. The denote
the discrete sampling positions.
appropriate driving function provides full control over the wave field within V .
In order to adapt the room compensation filter, the reproduced wave field within the
listening area has to be analyzed. It was shown in Section 3.4 that acoustic measurements taken on the boundary of the listening area are suitable to characterize the wave
field within the listening area. In general, these measurements will be taken using microphones. However, in a practical implementation it will not be possible to place the
secondary acoustic sources and the microphones at the same positions. On the one hand
mechanical limitations will prohibit this. On the other hand the acoustic pressure level
in the vicinity of the secondary sources will be quite high, leading potentially to overload
of the microphones. For these reasons it is advisable to separate the points where the line
sources for synthesis and the microphones for analysis are placed. Figure 4.19 illustrates
the scenario that is considered in the following. The line source distribution is sampled at
a total of N positions; the reproduced wave field is analyzed at a total of M positions. The
analysis points are located within the area spanned by the sampled contour ∂V. Since the
room compensation filter is calculated based on the measurements taken at these points,
perfect room compensation can only be achieved within the area spanned by the analysis
points. This area comprises the listening area, as depicted in Fig. 4.19.
The following discussion will assume that the anti-aliasing conditions for the synthesis
and analysis of the wave field are met reasonably well. As derived in Section 3.6.2.3 and Section 4.1.6, a limitation of the temporal bandwidth is necessary when assuming no spatial
aliasing.
4.4.1.2
Temporal Discretization
All signals are sampled synchronously in the time domain at uniform time steps kTs,
where k denotes the discrete time index and Ts the temporal sampling interval. The
sampling frequency is then given as fs = 1/Ts . It has to be chosen according to the
temporal sampling theorem [OS99] and the upper frequency limit provided by the sound
reproduction and analysis system. Temporal sampling will be illustrated in the following
for the monopole driving signal d(x, t). The same procedure as illustrated for this signal
applies also to the other signals used for room compensation.
The discretized version of the monopole driving signal d(x, t) for the nth synthesis position is defined as

d_n(k) := d(x_n, kTs) ,    (4.57)

where x_n denotes the nth spatially sampled position on ∂V. For the description of
adaptation algorithms it is convenient to consider subsequent temporal samples or spatially discrete positions together. Subsequent temporal samples are considered together
by capturing K samples of d_n(k) into a K × 1 vector d_n(k) as follows

d_n(k) = [ d_n(k)  d_n(k−1)  ⋯  d_n(k−K+1) ]^T .    (4.58)
Equivalent definitions apply to the signals w(x, t), l(x, t), a(x, t) and e(x, t), resulting in
their discrete counterparts w_n(k), l_m(k), a_m(k) and e_m(k). Please note that the dimension
of all these vectors is K × 1. In order to capture the spatial information, all spatial samples
at one time instant k are combined into a vector. For the loudspeaker driving
signals this vector is defined as follows

d^(N)(k) = [ d_1(k)  d_2(k)  ⋯  d_N(k) ]^T .    (4.59)
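As a small illustration of the two stacking conventions (4.58) and (4.59), the sketch below builds both vectors for a toy multichannel signal (the array shapes and values are arbitrary choices for illustration only):

```python
import numpy as np

N, K = 3, 4     # number of channels and length of the time history
d = np.arange(24, dtype=float).reshape(N, -1)   # d[n, k]: channel n, time k

k = 5
# cf. Eq. (4.58): time-history vector of one channel at instant k, newest first
d_hist = d[0, k:k - K:-1]
# cf. Eq. (4.59): spatial snapshot across all N channels at instant k
d_spat = d[:, k]

print(d_hist)   # [5. 4. 3. 2.]
print(d_spat)   # [ 5. 13. 21.]
```

The time-history vector feeds the temporal convolutions, while the spatial snapshot collects all channels at one instant; both orderings reappear in the stacked quantities defined later in this section.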
Equivalent definitions apply again to the other signals used for room compensation.
The Green's function G(x|x0, ω) characterizing the listening room can be regarded as the
acoustic transfer function from the excitation point x0 to the measurement point x. Thus,
its counterpart in the temporal domain g(x|x0, t) = F_t^{−1}{G(x|x0, ω)} can be interpreted
as the corresponding Green's impulse response. In general, this impulse response may be of
infinite length. For practical purposes, this impulse response is truncated at a reasonable
time (e. g. when its energy has decayed below a suitably chosen threshold), resulting in a
finite impulse response (FIR). The discretized impulse response is then defined as follows

r_{m,n}(k) := g(x_m|x_n, kTs) ,    (4.60)
where r_{m,n}(k) will be denoted as room impulse response. As for the signals, Nr subsequent
temporal samples can be captured into an Nr × 1 vector

r_{m,n} = [ r_{m,n}(0)  r_{m,n}(1)  ⋯  r_{m,n}(Nr−1) ]^T ,    (4.61)
where Nr denotes the finite number of temporal samples. The impulse responses from all
excitation to all analysis points can alternatively be combined into the matrix

         ⎡ r_{1,1}(k)  ⋯  r_{1,N}(k) ⎤
R(k) =   ⎢     ⋮       ⋱      ⋮      ⎥ ,    (4.62)
         ⎣ r_{M,1}(k)  ⋯  r_{M,N}(k) ⎦
where R(k) exhibits the dimensions M × N. The matrix R(k) describes the sound propagation in the listening room from all synthesis to all analysis points. The matrix R(k) will
be termed as matrix of room impulse responses. It can be interpreted as a spatio-temporally
sampled version of the Green's impulse response. If the respective impulse responses in
R(k) are actually measured, then these will not only contain the acoustic transfer functions but also the influence of the employed hardware [Gar00]. These influences will be
neglected in the following, since this section aims at deriving the principal problems of
adaptive filtering for multichannel systems.
Analogous definitions, as given above for r_{m,n} and R(k), apply also to the functions
g0(x|x0, t) and c(x0|x'0, t), resulting in their discrete counterparts f_{m,n}, c_{n',n}, F(k) and
C(k). The dimensions of f_{m,n} and c_{n',n} are Nf × 1 and Nc × 1, and the dimensions of F(k) and C(k)
are M × N and N × N. The matrix F(k) is termed as the matrix of free-field impulse responses
and C(k) as the compensation filter.
4.4.1.3
For the temporal frequency domain description of the signals and systems used for room
compensation, the discrete-time Fourier transformation (DTFT) [OS99] will be used in the
following. The DTFT and its inverse for the spatially discrete loudspeaker
driving signal d_n(k) are given as follows

D_n(ω) = Σ_{k=−∞}^{∞} d_n(k) e^{−jωkTs} ,    (4.63a)

d_n(k) = (Ts/2π) ∫_{2π/Ts} D_n(ω) e^{jωkTs} dω .    (4.63b)
Analogous definitions apply to the other signals w_n(k), l_m(k), a_m(k) and e_m(k). The
vector of spatially combined loudspeaker driving signals d^(N)(k) will be transformed into
the temporal frequency domain by transforming each element d_n(k) separately using the
DTFT (4.63). The resulting vector will be denoted by d^(N)(ω). Vectors and matrices of
Figure 4.20: Block diagram illustrating the adaptive MIMO inverse filtering approach
to room compensation. (The driving signals d^(N)(k) pass through the compensation filter
C(k) and the room R(k) to yield the reproduced signals l^(M)(k); the desired signals a^(M)(k)
result from the free-field system F(k); the error is e^(M)(k) = a^(M)(k) − l^(M)(k).)
frequency domain signals will be underlined in the sequel. Analogous definitions as for
d^(N)(k) apply to the other signals used.
The matrix of room impulse responses R(k) will be transformed into the temporal frequency domain by transforming each element r_{m,n}(k) separately using the DTFT (4.63).
The resulting matrix in the frequency domain will be termed as room transfer matrix and
denoted by R(ω). The matrices F(k) and C(k) will be transformed analogously. The
transfer matrix F(ω) will be termed as free-field transfer matrix.
4.4.1.4
Using the foregoing definitions, a discrete counterpart of the adaptive framework given by
Fig. 4.18 is derived in this section. Figure 4.20 illustrates the resulting discrete time and
space block diagram. The matrices of impulse responses R(k), F(k) and C(k) describe
discrete linear multiple-input/multiple-output (MIMO) systems (see Section 3.2). Since
the respective impulse responses are finite, they will be termed as MIMO FIR systems
in the following. Thus, room compensation can be understood as an inverse MIMO FIR
filtering problem. For few synthesis and analysis channels, numerous solutions to
this problem have been developed in the past [Gar00, TW02, TW03, BHK03, NOBH95].
However, algorithms for massive multichannel systems still remain a challenge, as will be
shown in the following. In the next section, the case of perfect knowledge of the room
transfer matrix will be considered first in order to derive useful results for the solvability
of such inverse problems.
4.4.2
The desired signal a_m(k) at the mth analysis position is given by the monopole driving
signals filtered with the free-field impulse responses

a_m(k) = Σ_{n=1}^{N} f_{m,n}(k) ∗ d_n(k) ,    (4.64)

where ∗ denotes the discrete convolution in this context. Due to the MIMO structure of
the system, the desired signal a_m(k) is given as a sum of the filtered monopole driving
signals d_n(k).
The reproduced signal l_m(k) at the mth analysis position is derived by considering the
room transfer function and the compensation filters. It is given as

l_m(k) = Σ_{n'=1}^{N} r_{m,n'}(k) ∗ w_{n'}(k) = Σ_{n=1}^{N} Σ_{n'=1}^{N} r_{m,n'}(k) ∗ c_{n',n}(k) ∗ d_n(k) .    (4.65)
For perfect compensation of the room influence, the desired and the reproduced signal
should be equal for all analysis positions (a_m(k) = l_m(k) for m = 1 ... M). Comparing
the terms inside the summations of Eq. (4.64) and Eq. (4.65), for sufficient excitation d_n(k),
yields the following relation for the computation of the room compensation filters

Σ_{n'=1}^{N} r_{m,n'}(k) ∗ c_{n',n}(k) = f_{m,n}(k)   for m = 1 ... M .    (4.66)
Equation (4.66) can be regarded as the discrete counterpart of Eq. (4.50). It constitutes
a spatio-temporally discrete, space-variant deconvolution problem.
For the single-channel case (N = M = 1), Eq. (4.66) states that the exact solution would
be given by computing an inverse filter to r_{m,n}(k). In general, room impulse responses
are not minimum-phase and thus cannot be inverted exactly. Typical approaches to
this problem calculate a minimum-phase representation of the room impulse response and
invert only this [NA79]. However, in the MIMO case the situation improves. The multiple-input/output inverse theorem (MINT)
states that an exact solution can be found under the following assumptions:
1. the number of synthesis positions is higher than the number of analysis points
(N > M),
2. the nth input d_n(k) can be equalized jointly for all M analysis positions independently of the other inputs, and
3. the transfer functions in the z-domain of R(k) and F(k) do not exhibit common
zeros.
The first condition states that at least one more synthesis position than analysis positions
is required for an exact solution. This condition is also fundamental for the exact synthesis
of sound fields, as shown in [Fli02]. The second condition holds in the context of this
work, since in linear acoustics the superposition principle applies. The third condition
states that the inversion problem can be solved much more easily in the MIMO case, since only
common zeros of the transfer matrices in the z-domain of R(k) and F(k) pose a problem
(see also Eq. (4.66)). In most practical cases this condition will be fulfilled. The solution
to the MIMO inverse filtering problem is found by setting up a system of linear equations.
This is performed by representing the convolution in Eq. (4.66) by a matrix multiplication
using Toeplitz matrices and formulating the problem jointly for all analysis points. This
results in the following system of linear equations
⎡ T_{1,1}  ⋯  T_{1,N} ⎤ ⎡ c_{1,n} ⎤   ⎡ f_{1,n} ⎤
⎢    ⋮     ⋱     ⋮    ⎥ ⎢    ⋮    ⎥ = ⎢    ⋮    ⎥ ,    (4.67)
⎣ T_{M,1}  ⋯  T_{M,N} ⎦ ⎣ c_{N,n} ⎦   ⎣ f_{M,n} ⎦

where the block matrix built from the Toeplitz matrices T_{m,n'} is denoted by T in the
following, and c_{n',n} and f_{m,n} are defined equivalently to
Eq. (4.61) with the lengths Nc and Nf respectively. The Toeplitz matrices T_{m,n'} are given
as follows
           ⎡ r_{m,n'}(0)         0          ⋯        0          ⎤
           ⎢ r_{m,n'}(1)      r_{m,n'}(0)             ⋮         ⎥
           ⎢     ⋮            r_{m,n'}(1)   ⋱        0          ⎥
T_{m,n'} = ⎢ r_{m,n'}(Nr−1)       ⋮         ⋱    r_{m,n'}(0)    ⎥ .    (4.68)
           ⎢     0          r_{m,n'}(Nr−1)  ⋱    r_{m,n'}(1)    ⎥
           ⎢     ⋮                ⋱         ⋱        ⋮          ⎥
           ⎣     0                0         ⋯  r_{m,n'}(Nr−1)   ⎦
An exact solution to the system of equations (4.67) can be found when Nf = Nr + Nc − 1
and the matrix T is square. This leads to the following condition for the length of the
inverse filters

Nc = M(Nr − 1)/(N − M) .    (4.69)
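The MINT conditions above can be checked numerically. The following sketch uses random impulse responses as stand-ins for a real room (an assumption for illustration; with probability one, random FIR responses share no common zeros). With N = 2 synthesis and M = 1 analysis positions, the matrix T of Eq. (4.67) becomes square for the filter length Nc from Eq. (4.69), and the stacked Toeplitz system can be solved exactly:

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(0)

Nr = 8                        # length of the room impulse responses
N, M = 2, 1                   # synthesis / analysis channels (N > M)
Nc = M * (Nr - 1) // (N - M)  # inverse filter length from Eq. (4.69): 7
Nf = Nr + Nc - 1              # length of the target responses

# random FIR stand-ins for the room responses
r = [rng.standard_normal(Nr) for _ in range(N)]

def conv_matrix(h, ncols):
    """(len(h)+ncols-1) x ncols convolution (Toeplitz) matrix, cf. Eq. (4.68)."""
    col = np.concatenate([h, np.zeros(ncols - 1)])
    row = np.zeros(ncols)
    row[0] = h[0]
    return toeplitz(col, row)

# stack the Toeplitz blocks as in Eq. (4.67); here T is square (14 x 14)
T = np.hstack([conv_matrix(rn, Nc) for rn in r])

f = np.zeros(Nf)
f[0] = 1.0                    # target response: a unit impulse

c = np.linalg.solve(T, f)     # exact MINT solution
c1, c2 = c[:Nc], c[Nc:]

# verify: the two parallel channels sum to the target response exactly
total = np.convolve(r[0], c1) + np.convolve(r[1], c2)
print(np.allclose(total, f))
```

Note that even though neither single channel is invertible by a finite filter, the joint system is, which is the essence of the MINT.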
Figure 4.21: Relative filter length of the room compensation filters Nc/Nr for a fixed
number of excitation points N = 10 and a varying number of analysis (equalization)
points M (Nr ≫ 1).
Equation (4.69) states that the length Nc of the inverse filters increases with the number
of analysis points (= equalized points). If M is chosen near N, then the inverse filters
will be longer than the room impulse responses. If M is
chosen quite small, then the inverse filters will be shorter than
the room impulse responses. The length of the inverse filters is approximately equal
to the length of the room impulse responses for M = N/2. Figure 4.21 illustrates the
relative length of the compensation filters for N = 10 synthesis positions.
The calculation of the compensation filter coefficients by solution of Eq. (4.67) requires
inverting the matrix T. For a high number of synthesis and analysis positions, or many coefficients Nr of the room impulse responses, this matrix becomes very large and the inversion
becomes computationally very expensive. Thus, a direct implementation of the MINT is infeasible for a high number of synthesis and analysis points and typical lengths of room
impulse responses, due to the required matrix inversion. However, the derived conditions
for an exact solution give implications for the solvability and the required length of the
compensation filters. This length may become quite long compared to the room response
if the number of synthesis points is chosen near the number of analysis points. Unfortunately, this is the desired case for equalization of the entire listening area. The following
section will introduce an adaptive solution to room compensation that will overcome the
problems of the MINT.
4.4.3
The previous section derived an exact solution for the computation of the room compensation filters when certain assumptions are fulfilled. Besides its computational complexity,
the presented solution requires that the listening room transfer matrix is known a priori
and additionally does not change over time. Since both of these requirements are not
fulfilled in typical listening room scenarios, an alternative approach will be proposed. As
stated before, an adaptive algorithm may provide a solution to these problems. This
section will introduce the concept of linear inverse optimum least-squares error adaptive
filtering and will point out the fundamental problems when applied to room compensation
in the context of this thesis. The following discussion will first derive the normal equation
of the inverse least-squares error adaptive filtering problem and will then give some remarks on the derivation of the filtered-x recursive least-squares (XRLS) algorithm [BQ00].
The filtered-x RLS algorithm deviates from the standard recursive least-squares (RLS)
algorithm by using a filtered version of the input signal for the adaptation. The results
derived for the XRLS algorithm are generic, since most of the frequently used adaptive
filtering schemes (e. g. the filtered-x least-mean-squares (XLMS) algorithm) can be understood as specializations of the presented XRLS algorithm [Hay96].
Perfect listening room compensation is achieved when the reproduced wave field matches the
desired wave field. A measure for this condition is given by the error e_m(k) = a_m(k) − l_m(k)
between the desired wave field a_m(k) and the reproduced wave field l_m(k). If this measure is zero for all M analysis points, perfect listening room compensation is achieved. In
the context of adaptive filtering, the error e_m(k) is used as cost function for the inverse
filter optimization problem. The compensation filters should be adapted such that they
minimize the error e_m(k). The weighted least-squares (WLS) estimate uses a weighted
sum of the time-averaged error as cost function [Hay96]

ξ(ĉ, k) = Σ_{κ=0}^{k} W_WLS(k, κ) Σ_{m=1}^{M} |e_m(κ)|² .    (4.70)

Choosing an exponential weighting W_WLS(k, κ) = λ^{k−κ} with forgetting factor 0 < λ ≤ 1 yields

ξ(ĉ, k) = Σ_{κ=0}^{k} λ^{k−κ} Σ_{m=1}^{M} |e_m(κ)|² .    (4.71)
Combining all M error signals at one time instant k into the vector e^(M)(k) =
[ e_1(k)  e_2(k)  ⋯  e_M(k) ]^T allows writing the cost function more conveniently as

ξ(ĉ, k) = Σ_{κ=0}^{k} λ^{k−κ} e^(M)T(κ) e^(M)(κ) .    (4.72)

The optimal filter coefficients in the mean-squared error (MSE) sense are found by setting
the gradient of the cost function with respect to the estimated filter coefficients ĉ to zero

∂ξ(ĉ, k)/∂ĉ = 0 .    (4.73)
It remains in the following to express the error e^(M)(k) in terms of the filter coefficients.
The reproduced signal l_m(k) at the mth analysis point is given by Eq. (4.65). By rearranging the time-domain convolutions, Eq. (4.65) can be rewritten as

l_m(k) = Σ_{n=1}^{N} Σ_{n'=1}^{N} c_{n',n}(k) ∗ d_{m,n',n}(k) = Σ_{n=1}^{N} Σ_{n'=1}^{N} c_{n',n}^T(k) d_{m,n',n}(k) ,    (4.74)

where d_{m,n',n}(k) denote the driving signals d_n(k) which have been filtered by the room
impulse responses r_{m,n'}(k), as defined above. The vectors d_{m,n',n}(k) and c_{n',n}(k) contain
the time history of the filtered driving signals and the coefficients of the room compensation filter. They are defined according to Eq. (4.58) and Eq. (4.61). The length of both
vectors is chosen equal to the length Nc of the inverse filters, which should be longer than
the room impulse response (Nc > Nr) for typical room compensation scenarios, as stated in
Section 4.4.2. Introducing the following definitions
c^(N)_n(k) = [ c^T_{n,1}(k)  c^T_{n,2}(k)  ⋯  c^T_{n,N}(k) ]^T ,    (4.75)

c(k) = [ c^(N)T_1(k)  c^(N)T_2(k)  ⋯  c^(N)T_N(k) ]^T ,    (4.76)

d^(N)_{m,n}(k) = [ d^T_{m,n,1}(k)  d^T_{m,n,2}(k)  ⋯  d^T_{m,n,N}(k) ]^T ,    (4.77)

d^(N,N)_m(k) = [ d^(N)T_{m,1}(k)  d^(N)T_{m,2}(k)  ⋯  d^(N)T_{m,N}(k) ]^T ,    (4.78)

D_R(k) = [ d^(N,N)_1(k)  d^(N,N)_2(k)  ⋯  d^(N,N)_M(k) ] ,    (4.79)
where c(k) and D_R(k) have the sizes N_c N² × 1 and N_c N² × M respectively, allows
expressing the reproduced wave field jointly at all M analysis positions as

l^(M)(k) = D_R^T(k) c(k) .    (4.80)

Using Eq. (4.80), the error at all M analysis positions can be expressed by the desired
signals a^(M)(k), the filtered loudspeaker driving signals D_R(k) and the filter coefficients
c(k)

e^(M)(k) = a^(M)(k) − D_R^T(k) c(k) .    (4.81)
As stated before, the optimal filter coefficients that minimize the error e^(M)(k) can be
derived according to Eq. (4.73). Evaluation of the gradient of the
cost function with respect to ĉ(k) yields

∂ξ(ĉ, k)/∂ĉ = 2 Σ_{κ=0}^{k} λ^{k−κ} [ D_R(κ) D_R^T(κ) ĉ(k) − D_R(κ) a^(M)(κ) ] = 0_{N_c N² × 1} .    (4.82)
Rearranging Eq. (4.82) yields the normal equation of the multichannel inverse filtering
problem as

Φ_dd(k) ĉ(k) = Φ_da(k) ,    (4.83)

where Φ_dd(k) and Φ_da(k) abbreviate the terms involving the signals D_R(k) and a^(M)(k)
in the sums of Eq. (4.82). The matrix Φ_dd can be interpreted as the time and analysis
position averaged autocorrelation matrix of the filtered loudspeaker driving signals, which
is defined as

Φ_dd(k) = Σ_{κ=0}^{k} λ^{k−κ} D_R(κ) D_R^T(κ) .    (4.84)

The vector Φ_da can be interpreted as the time and analysis position averaged cross-correlation vector between the filtered loudspeaker driving signals and the desired signals,
which is defined as

Φ_da(k) = Σ_{κ=0}^{k} λ^{k−κ} D_R(κ) a^(M)(κ) .    (4.85)
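For a single reproduction and analysis channel, the normal equation (4.83) with the correlation sums (4.84) and (4.85) can be sketched as follows. To guarantee that an exact solution exists (which the single-channel case in general does not, as discussed in Section 4.4.2), the desired signal is generated here from a known filter c_true; the signal lengths and the forgetting factor are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

Nr, Nc, K = 6, 8, 4000        # room IR length, filter length, signal length
lam = 0.999                   # forgetting factor (assumed value)

r = rng.standard_normal(Nr)          # known room impulse response
c_true = rng.standard_normal(Nc)     # filter the adaptation should recover
d = rng.standard_normal(K)           # white driving signal (rich excitation)

d_filt = np.convolve(d, r)[:K]       # filtered-x signal: d filtered by r
a = np.convolve(d_filt, c_true)[:K]  # desired signal, so an exact c exists

Phi_dd = np.zeros((Nc, Nc))          # cf. Eq. (4.84), single channel
phi_da = np.zeros(Nc)                # cf. Eq. (4.85)
for k in range(Nc, K):
    x = d_filt[k:k - Nc:-1]          # time-history vector, cf. Eq. (4.58)
    Phi_dd = lam * Phi_dd + np.outer(x, x)
    phi_da = lam * phi_da + a[k] * x

c_hat = np.linalg.solve(Phi_dd, phi_da)   # normal equation, cf. Eq. (4.83)
print(np.allclose(c_hat, c_true, atol=1e-6))
```

The RLS algorithm avoids the explicit solve in the last line by updating the inverse of Φ_dd recursively via the matrix inversion lemma, as discussed next.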
The filtered-x RLS algorithm can be derived from the normal equation (4.83) and the above
definitions of the correlation matrices by computing the sums in a recursive fashion and
by applying the matrix inversion lemma. The derivation of the filtered-x RLS algorithm
using this procedure can be found e. g. in [BQ00]. The filtered-x RLS algorithm deviates
from the standard RLS algorithm by using a filtered version of the input signal for adaptation. The coefficients of the room compensation filter c(k) are typically not computed
for every time step k; they are updated at a lower rate than the sampling rate due to
complexity constraints.
The following discussion will focus on the fundamental problems of the adaptive inverse
filtering problem given by Eq. (4.83), and not on the particular algorithms used for its
realization and their problems.
4.4.4
Three fundamental problems of adaptive inverse filtering can be concluded from the
derivation of the normal equation (4.83) in the previous section. These are:
1. the required a-priori knowledge of the room transfer function,
2. the non-uniqueness problem when minimizing the cost function ξ(ĉ, k), and
3. the ill-conditioning of the autocorrelation matrix Φ_dd(k).
It was stated before that in general the room transfer impulse responses r_{m,n}(k) will not
be known a priori and are potentially time-varying. Thus it seems that not much has been
gained by using an adaptive filtering scheme, since this still requires a-priori knowledge to
calculate the filtered driving signals D_R(k). While some amount of deviation between the
actual room transfer functions and a-priori measured ones can be tolerated [BQ00], large
changes in the listening room will likely cause poor convergence of the adapted filters. In
order to cope with changing environments, the listening room transfer functions have to
be identified additionally. There are various online identification methods for this task. It
will be assumed in the following that a suitable algorithm for this purpose can be applied
within the context of room compensation to solve the first fundamental problem. An
overview of possible methods can be found in [Gar00, HS97]. However, most of these
algorithms are not capable of handling the massive multichannel case [SMH95, BMS98].
The second problem is related to the minimization of the cost function ξ(ĉ, k). The optimal room compensation filters are given by calculating the inverse filters to the room
impulse responses, as shown by Eq. (4.66). However, minimization of the cost function ξ(ĉ, k) may not provide the optimal solution in these terms. Depending on the
driving signals d^(N)(k), there may be multiple possible solutions for c(k) that minimize
ξ(ĉ, k) [SMH95, BMS98]. This problem will be termed as the non-uniqueness problem in
the following. It likely occurs if the room transfer functions are not excited in all spatial
and temporal frequencies. For example, if only one wall in a room is excited by a plane
wave emitted by the reproduction system, the reflections caused by the other walls for
other excitations are not included in the compensation filters. In general, a MIMO system can only be identified perfectly if it is excited in all of its spatio-temporal degrees
of freedom. As a consequence of the non-uniqueness problem, changing the spatial and
temporal characteristics of the driving signals may temporarily result in a rapid increase
of the error e^(M)(k). This makes a renewed convergence of the compensation filters necessary.
Thus, the non-uniqueness problem is mainly an issue for highly temporally and spatially
non-stationary driving signals. A typical situation is the synthesis of a moving virtual
source.
The third fundamental problem is related to the solution of the normal equation (4.83).
The normal equation has to be solved with respect to the coefficients of the room compensation filter. However, due to the dimensionality and the potential ill-conditioning of
the autocorrelation matrix Φ_dd(k), this may become an infeasible task for a large number
of reproduction channels and analysis points. Additionally, an exact solution may not
always exist. Some conditions for the solvability of the normal equation in the context of
room compensation are given in [Gar00].
The following section derives a generic framework for room compensation which explicitly
solves the last problem by utilizing spatio-temporal signal and system transformations.
It will be shown additionally that the first two problems are substantially mitigated by this
approach.
4.5
The major conclusion of the previous section was that the adaptation of the room compensation filters using conventional filtered-x algorithms is not feasible for massive multichannel systems, due to its complexity. The following section will derive a generic framework
for an improved room compensation system. Improved means in this context overcoming
most of the problems discussed in the previous section.
4.5.1
Analysis of the Auto-Correlation Matrix Φ_dd
The main shortcomings of the XRLS algorithm introduced in the previous section
emerge from the correlation matrix Φ_dd of the filtered driving signals. The structure
and contents of this matrix are analyzed in detail in order to propose a solution to these
shortcomings. The correlation matrix Φ_dd, as given by Eq. (4.84), can be expressed in
form of the following block matrix
form of the following block matrix
(1,1),(1,1) (k)
(1,1),(N,N ) (k)
..
..
..
dd (k) =
(4.86)
.
,
.
.
(N,N ),(1,1) (k)
(N,N ),(N,N ) (k)
where (n , n) denotes an index consisting of all combinations of the two variables n and
(n ,n1 ),(n ,n2 ) (k) of dimension Nc Nc are defined as
n. The matrices
1
2
(n ,n1 ),(n ,n2 ) (k) =
1
2
k
X
=0
M
X
(4.87)
m=1
where n1 , n1 , n2 , n2 {1, 2, , N}. Using above definition (4.87), the correlation matri (n ,n ),(n ,n ) (k) can be interpreted as the
ces
1
147
The permutations are a result of the MIMO structure of the room transfer function and
the room compensation filter. Thus, if the room transfer matrix is diagonalized, then
Φ_dd will become a block-diagonal matrix. As a result, the MIMO system representing
the listening room, and with it the MIMO FIR inverse filtering problem, will be decoupled.
This fundamental idea will be used in the following sections to derive an improved room
compensation system.
4.5.2
The previous section stated that a decoupling of the listening room transfer matrix yields
a decoupling of the autocorrelation matrix Φ_dd(k). This section will show how the desired
decoupling can be obtained using the concept of the singular value decomposition (SVD).
Performing a DTFT of Eq. (4.65) yields the signal at the M analysis points in the frequency domain as

l^(M)(ω) = R(ω) w^(N)(ω) ,    (4.88)

where R(ω) denotes the DTFT-transformed matrix of impulse responses from each synthesis to each analysis position, and l^(M)(ω) and w^(N)(ω) the DTFT-transformed signals at the
analysis positions and the filtered loudspeaker driving signals, respectively. They are defined
according to Section 4.4.1.3. The following two sections will introduce the SVD for the
decomposition of the room transfer matrix and its application to the decomposition of
Eq. (4.88).
4.5.2.1
The singular value decomposition states that any matrix R(ω) can be decomposed into
two unitary matrices U(ω) and V(ω), and a diagonal matrix Σ_R(ω)
[Hay96, GL89, Dep88].
It will be assumed in the following that R(ω) has the dimensions M × N with N ≥ M.
The SVD of the room transfer matrix is then given as follows

R(ω) = U(ω) Σ_R(ω) V^H(ω) ,    (4.89)

where U(ω) and V(ω) have the dimensions M × M and N × M respectively, and Σ_R(ω)
the dimensions M × M. Since U(ω) and V(ω) have unitary columns, U(ω)U^H(ω) = V^H(ω)V(ω) =
I_{M×M}. The columns of the matrix V(ω) are constructed from the right singular vectors
v_b(ω), the matrix U(ω) from the left singular vectors u_b(ω) of R(ω). The diagonal matrix
Σ_R(ω) is defined as follows

Σ_R(ω) = diag{ [σ_1, σ_2, ⋯, σ_M] } ,    (4.90)

where σ_b denote the singular values of R(ω). Alternatively, R(ω) can be represented as the
series

R(ω) = Σ_{b=1}^{B} σ_b u_b(ω) v_b^H(ω) .    (4.91)
Hence, the matrix $\mathbf{R}(\omega)$ can be represented as a finite series constructed from the left and right singular vectors weighted by the corresponding singular values. The above series representation may also be used to calculate an approximation of $\mathbf{R}(\omega)$ by using only a subset of the singular values and their corresponding left and right singular vectors. For this purpose, the $B_{tr} < B$ largest singular values and corresponding singular vectors are typically used.
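As an illustration, this truncated series can be sketched numerically (a minimal example; the matrix dimensions, the random matrix standing in for $\mathbf{R}(\omega)$ at a single frequency bin, and the choice $B_{tr} = 2$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 8  # illustrative sizes: M analysis points, N synthesis positions
R = rng.standard_normal((M, N))  # stand-in for R(omega) at one frequency bin

# Full SVD: R = U @ diag(s) @ Vh with singular values s in descending order
U, s, Vh = np.linalg.svd(R, full_matrices=False)

# Approximation from the B_tr largest singular values and singular vectors
B_tr = 2
R_tr = U[:, :B_tr] @ np.diag(s[:B_tr]) @ Vh[:B_tr, :]

# By the Eckart-Young theorem, the spectral-norm error of this truncation
# equals the largest discarded singular value
err = np.linalg.norm(R - R_tr, ord=2)
print(np.isclose(err, s[B_tr]))  # True
```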
From a formal point of view, the above expansion exhibits similarities to the expansion (4.52) of the Green's function using the FTM. The difference is that the FTM expansion requires an infinite number of components. The listening room transfer function is constructed from measurements at a limited number of discrete positions. It was shown in Section 3.6.2 that this limited number of positions inherently imposes a truncation. However, the FTM expansion may also be limited to a reasonable number of components.
The relation given by Eq. (4.89) can be inverted by exploiting the unitary property of the left and right singular matrices. This results in

$$\tilde{\mathbf{R}}(\omega) = \mathbf{U}^H(\omega) \, \mathbf{R}(\omega) \, \mathbf{V}(\omega) \, . \qquad (4.92)$$

Hence, each matrix $\mathbf{R}(\omega)$ can be transformed into a diagonal matrix using the left and right singular matrices.
The generalized singular value decomposition (GSVD) [GL89] of the matrices $\mathbf{R}(\omega)$ and $\mathbf{F}(\omega)$ is given as follows

$$\mathbf{R}(\omega) = \mathbf{X}(\omega) \, \tilde{\mathbf{R}}(\omega) \, \mathbf{V}^H(\omega) \, , \qquad (4.94a)$$
$$\mathbf{F}(\omega) = \mathbf{X}(\omega) \, \tilde{\mathbf{F}}(\omega) \, \mathbf{U}^H(\omega) \, , \qquad (4.94b)$$

where $\mathbf{F}(\omega)$ has the dimension $M \times N$. The matrices $\mathbf{X}(\omega)$, $\mathbf{V}(\omega)$ and $\mathbf{U}(\omega)$ are unitary matrices with the dimensions $M \times M$, $N \times M$ and $N \times M$ respectively. The matrix $\mathbf{X}(\omega)$ is the generalized singular matrix of $\mathbf{R}(\omega)$ and $\mathbf{F}(\omega)$. As for the SVD, the matrices $\tilde{\mathbf{R}}(\omega)$ and $\tilde{\mathbf{F}}(\omega)$ are diagonal matrices constructed from the singular values of $\mathbf{R}(\omega)$ and $\mathbf{F}(\omega)$. The GSVD transforms $\mathbf{R}(\omega)$ and $\mathbf{F}(\omega)$ into their joint eigenspace using the singular matrices $\mathbf{X}(\omega)$, $\mathbf{V}(\omega)$ and $\mathbf{U}(\omega)$.

Equation (4.94) and Eq. (4.93) can be combined to derive the following result

$$\mathbf{R}^+(\omega) \, \mathbf{F}(\omega) = \mathbf{V}(\omega) \, \tilde{\mathbf{R}}^{-1}(\omega) \, \tilde{\mathbf{F}}(\omega) \, \mathbf{U}^H(\omega) \, , \qquad (4.95)$$
where it is assumed that $\mathbf{R}(\omega)$ and $\mathbf{F}(\omega)$ both have full rank. Equation (4.95) will be used in Section 4.5.3 to derive the desired decoupling of the MIMO adaptive system.
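The least-squares compensation filter underlying Eq. (4.95) can be checked numerically without an explicit GSVD, using the pseudoinverse (a sketch for a single frequency bin; the dimensions are arbitrary and $\mathbf{R}(\omega)$ is assumed to have full row rank):

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 3, 5  # illustrative: M analysis points, N loudspeakers
R = rng.standard_normal((M, N))  # room transfer matrix at one frequency bin
F = rng.standard_normal((M, N))  # free-field (desired) transfer matrix

# Least-squares compensation filter C = R^+ F (pseudoinverse solution)
C = np.linalg.pinv(R) @ F

# With full row rank R, the compensated system reproduces F exactly: R C = F
print(np.allclose(R @ C, F))  # True
```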
4.5.2.2
The SVD, as introduced in the previous section, can be used to transform the listening room transfer matrix into a diagonal matrix. Equation (4.92) together with the unitary property of the left and right singular matrices can be used to reformulate Eq. (4.88) as follows

$$\underbrace{\mathbf{U}^H(\omega) \, \mathbf{l}^{(M)}(\omega)}_{\tilde{\mathbf{l}}^{(M)}(\omega)} = \tilde{\mathbf{R}}(\omega) \, \underbrace{\mathbf{V}^H(\omega) \, \mathbf{w}^{(N)}(\omega)}_{\tilde{\mathbf{w}}^{(M)}(\omega)} \, , \qquad (4.96)$$

where the $M \times 1$ vectors $\tilde{\mathbf{l}}^{(M)}(\omega)$ and $\tilde{\mathbf{w}}^{(M)}(\omega)$ denote the transformed signal at the analysis positions and the transformed driving signal respectively. In the context of signals
and systems, the SVD can be understood as a transformation. The left and right singular matrices $\mathbf{U}(\omega)$ and $\mathbf{V}(\omega)$ constitute the kernels of this transformation. The transformation of the MIMO system representing the listening room can be performed by pre- and post-filtering the room transfer matrix with $\mathbf{V}(\omega)$ and $\mathbf{U}^H(\omega)$. The pre- and post-filters constitute MIMO systems themselves. Thus, Eq. (4.88) can be expressed entirely in this transformed domain. This principle is illustrated by Fig. 4.22. A similar matrix formulation as illustrated by Fig. 4.22 can also be given for the fast convolution technique using the discrete Fourier transformation (DFT) of a signal [OS99]. In this special case the transformation matrices are given by sampled exponential functions.
The benefit of using this transform domain description of the listening room transfer matrix is the diagonal structure of $\tilde{\mathbf{R}}(\omega)$.

Figure 4.22: Transformation of the MIMO listening room response using the singular value decomposition.

Due to its diagonal structure, the signals at the analysis positions in the transformed domain $\tilde{\mathbf{l}}^{(M)}(\omega)$ can be computed by scalar multiplication of the main diagonal elements $\tilde{R}_m(\omega)$ of $\tilde{\mathbf{R}}(\omega)$ with the transformed loudspeaker driving signals $\tilde{\mathbf{w}}^{(M)}(\omega)$

$$\tilde{L}_m(\omega) = \tilde{R}_m(\omega) \, \tilde{W}_m(\omega) \, . \qquad (4.97)$$
If $\mathbf{R}(\omega)$ does not have full rank ($B < M$), then the above equation only has to be evaluated for $m = 1, \dots, B$. Hence, the transformation of the signals and systems using the SVD decomposes the MIMO system given by $\mathbf{R}(\omega)$ into $B$ SISO systems. The space-variant system $\mathbf{R}(\omega)$ is transformed into a space-invariant representation in a transformed domain by the SVD.
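The decoupling stated by Eq. (4.96) and Eq. (4.97) can be verified numerically (a sketch for a single frequency bin; the dimensions and the random system matrix are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 3, 6  # illustrative: M analysis points, N loudspeakers
R = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
w = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # driving signals

# SVD-based transformation (cf. Eq. 4.89): R = U @ diag(s) @ Vh
U, s, Vh = np.linalg.svd(R, full_matrices=False)

l = R @ w             # signals at the analysis points (cf. Eq. 4.88)
l_t = U.conj().T @ l  # transformed analysis signals (cf. Eq. 4.96)
w_t = Vh @ w          # transformed driving signals

# In the transformed domain the MIMO system reduces to M scalar products
print(np.allclose(l_t, s * w_t))  # True (cf. Eq. 4.97)
```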
In general, the computation of the SVD will be too complex to benefit from the complexity reduction given by this decomposition of the MIMO system. However, an efficient transformation with properties equivalent to the SVD-based transformation of the systems and signals may result in a highly reduced complexity. A prominent example of such an efficient realization is the fast Fourier transformation (FFT), which provides a very efficient implementation of the DFT for signals and systems. In the context of adaptive filtering of MIMO systems, the introduced transformation results in a decoupling of the entire adaptive system, as will be shown in the following.
4.5.3
The previous section derived a decomposition of the MIMO system R(), representing
the listening room, into several SISO systems by using a transformation which was based
on the SVD. This section will derive a decoupling of the entire adaptive system depicted
by Fig. 4.20, by jointly decoupling the systems F() and R() using the GSVD.
The error between the desired a(M) () and the reproduced l(M) () wave field at the M
analysis positions can be derived by transforming Eq. (4.64) and Eq. (4.65) into the
frequency domain

$$\mathbf{e}^{(M)}(\omega) = \mathbf{a}^{(M)}(\omega) - \mathbf{l}^{(M)}(\omega) = \mathbf{F}(\omega) \, \mathbf{d}^{(N)}(\omega) - \mathbf{R}(\omega) \, \mathbf{C}(\omega) \, \mathbf{d}^{(N)}(\omega) \, . \qquad (4.98)$$
In the sequel, a decoupling of Eq. (4.98) will be derived, resulting in a decoupling of the MIMO adaptive inverse filtering problem. The basic idea is to diagonalize the listening room transfer matrix $\mathbf{R}(\omega)$ and the free-field transfer matrix $\mathbf{F}(\omega)$ using the GSVD. It will be assumed first that both transfer matrices have full rank. However, the results can be generalized straightforwardly to the case that $\mathbf{F}(\omega)$ and/or $\mathbf{R}(\omega)$ do not have full rank. The decompositions of the transfer matrices $\mathbf{F}(\omega)$ and $\mathbf{R}(\omega)$ are given by Eq. (4.94). It remains to choose a suitable decomposition of the compensation filter $\mathbf{C}(\omega)$. In general, the compensation filter is derived by solving the deconvolution problem given by Eq. (4.66). The least-squares solution of Eq. (4.66) in the frequency domain is given by $\mathbf{C}(\omega) = \mathbf{R}^+(\omega) \, \mathbf{F}(\omega)$. An eigenspace expansion of $\mathbf{C}(\omega)$ using the GSVD is given by Eq. (4.95). However, the room transfer function $\mathbf{R}(\omega)$ and its pseudoinverse $\mathbf{R}^+(\omega)$ will not be known a priori. This problem can be solved by expanding the room compensation filter as

$$\mathbf{C}(\omega) = \mathbf{V}(\omega) \, \tilde{\mathbf{C}}(\omega) \, \mathbf{U}^H(\omega) \, , \qquad (4.99)$$

where $\tilde{\mathbf{C}}(\omega)$ denotes a diagonal matrix, where some diagonal elements may be zero. Using
Eq. (4.99) together with Eq. (4.94a) yields the transformed signal $\tilde{\mathbf{l}}^{(M)}(\omega)$ at the analysis points

$$\tilde{\mathbf{l}}^{(M)}(\omega) = \tilde{\mathbf{R}}(\omega) \, \tilde{\mathbf{C}}(\omega) \, \tilde{\mathbf{d}}^{(M)}(\omega) \, , \qquad (4.100)$$

where $\tilde{\mathbf{l}}^{(M)}(\omega) = \mathbf{X}^H(\omega) \, \mathbf{l}^{(M)}(\omega)$ and $\tilde{\mathbf{d}}^{(M)}(\omega) = \mathbf{U}^H(\omega) \, \mathbf{d}^{(N)}(\omega)$. Decomposition of the free-field transfer matrix, according to Eq. (4.94b), yields the desired signal in the transformed domain as

$$\tilde{\mathbf{a}}^{(M)}(\omega) = \tilde{\mathbf{F}}(\omega) \, \tilde{\mathbf{d}}^{(M)}(\omega) \, , \qquad (4.101)$$

where $\tilde{\mathbf{a}}^{(M)}(\omega) = \mathbf{X}^H(\omega) \, \mathbf{a}^{(M)}(\omega)$. Using Eq. (4.100) and Eq. (4.101) allows to decouple Eq. (4.98) in the transformed domain

$$\tilde{\mathbf{e}}^{(M)}(\omega) = \tilde{\mathbf{F}}(\omega) \, \tilde{\mathbf{d}}^{(M)}(\omega) - \tilde{\mathbf{R}}(\omega) \, \tilde{\mathbf{C}}(\omega) \, \tilde{\mathbf{d}}^{(M)}(\omega) \, , \qquad (4.102)$$
where $\tilde{\mathbf{e}}^{(M)}(\omega)$ denotes the error signal for all $M$ components in the transformed domain. Since $\tilde{\mathbf{R}}(\omega)$, $\tilde{\mathbf{C}}(\omega)$ and $\tilde{\mathbf{F}}(\omega)$ are diagonal matrices, the $m$-th component of the error signal is given as

$$\tilde{E}_m(\omega) = \tilde{F}_m(\omega) \, \tilde{D}_m(\omega) - \tilde{R}_m(\omega) \, \tilde{C}_m(\omega) \, \tilde{D}_m(\omega) \, . \qquad (4.103)$$
Figure 4.23: Block diagram illustrating the eigenspace adaptive inverse filtering approach to room compensation.
Here, the subscript $m$ denotes the $m$-th component of the respective signals and systems. Thus, Eq. (4.103) states that the MIMO adaptive inverse filtering problem can be decomposed into $M$ SISO adaptive inverse filtering problems using the GSVD. The computation of the room compensation filters can be performed independently for each of the $M$ transformed components. The transformation of the systems and signals is performed by transforming them into the joint eigenspace of $\mathbf{R}(\omega)$ and $\mathbf{F}(\omega)$ using the GSVD. Therefore, this approach will be termed eigenspace inverse adaptive filtering in the following. Please note that the transformation does not depend on the driving signals. Figure 4.23 illustrates the eigenspace inverse adaptive filtering approach. Up to now it was assumed that $\mathbf{R}(\omega)$ and $\mathbf{F}(\omega)$ have full rank. However, if the transfer matrix $\mathbf{R}(\omega)$ or $\mathbf{F}(\omega)$ does not have full rank, then Eq. (4.103) does not have to be evaluated for all $M$ transformed components.
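A toy sketch of the resulting SISO problems, assuming the transformed diagonal responses $\tilde{R}_m(\omega)$ and $\tilde{F}_m(\omega)$ are already available and nonzero (all values random, one frequency bin):

```python
import numpy as np

rng = np.random.default_rng(2)
M = 4
# Stand-ins for the transformed (diagonal) room and free-field responses
R_t = rng.standard_normal(M) + 1j * rng.standard_normal(M)
F_t = rng.standard_normal(M) + 1j * rng.standard_normal(M)
d_t = rng.standard_normal(M) + 1j * rng.standard_normal(M)  # transformed driving signals

# Each component is a decoupled SISO inverse filtering problem (cf. Eq. 4.103):
# choose C~_m so that F~_m D~_m - R~_m C~_m D~_m vanishes
C_t = F_t / R_t

e_t = F_t * d_t - R_t * C_t * d_t  # per-component error
print(np.allclose(e_t, 0))  # True
```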
In the following, the theory of multichannel adaptive inverse filtering presented in Section 4.4.3 will be specialized to the derived system decoupling. Due to the decoupling, the cost function given by Eq. (4.71) can be minimized independently for each component $m$. The error signal for the $m$-th component in the transformed domain is given by Eq. (4.103). Applying the procedure outlined in Section 4.4.3 yields the normal equation in the transformed domain as

$$\tilde{\mathbf{\Phi}}_{dd,m}(k) \, \tilde{\mathbf{c}}_m(k) = \tilde{\boldsymbol{\varphi}}_{da,m}(k) \, , \qquad (4.104)$$

where $\tilde{\mathbf{\Phi}}_{dd,m}(k)$ denotes the time-averaged autocorrelation matrix of the $m$-th component of the transformed filtered loudspeaker driving signal, $\tilde{\boldsymbol{\varphi}}_{da,m}(k)$ the corresponding cross-correlation between the filtered loudspeaker driving signal and the desired signal, and $\tilde{\mathbf{c}}_m(k)$ the filter coefficients of the $m$-th compensation filter. The autocorrelation matrix $\tilde{\mathbf{\Phi}}_{dd,m}(k)$ has the dimensions $N_c \times N_c$. Due to this reduction in dimensionality, the solution of the $M$ equations (4.104) is much more efficient than the adaptation using the original (not transformed) signals. Equation (4.104) corresponds to
the well-known single-channel normal equation. There may still be time-domain correlations present in the filtered input signals which may cause problems when solving the normal equation (4.104). However, there are numerous approaches known in the literature on single-channel adaptive inverse filtering to overcome these problems [Hay96]. Please note that the spatial correlations present in $\mathbf{\Phi}_{dd}(k)$ have been removed in the transformed domain by the spatial decoupling of the MIMO systems. Thus, the non-uniqueness problem discussed in Section 4.4.4 is additionally alleviated.
There exist approaches to additionally diagonalize the single-channel autocorrelation matrices, e.g. frequency-domain adaptive filtering (FDAF) using the discrete Fourier transformation [Hay96, BBK03]. Since the eigenfunctions of LTI systems are exponential functions, the FDAF approach is based on an idea equivalent to the one presented here for the spatial decoupling of a MIMO system.
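The FDAF idea can be illustrated with a toy example: for an idealized circulant autocorrelation matrix, the DFT removes all remaining temporal correlations, in analogy to the spatial decoupling derived above (matrix size and entries are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
Nc = 8
# Circulant matrix built from a random first column -- an idealized
# model of the autocorrelation matrix of an LTI system
r = rng.standard_normal(Nc)
Phi = np.array([np.roll(r, i) for i in range(Nc)]).T

# The DFT matrix diagonalizes every circulant matrix
F = np.fft.fft(np.eye(Nc), axis=0)
D = F @ Phi @ np.linalg.inv(F)

off_diag = D - np.diag(np.diag(D))
print(np.allclose(off_diag, 0))  # True: only the main diagonal remains
```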
4.5.4
In the previous sections, an eigenspace approach to adaptive inverse filtering was developed. Its main benefit is the decoupling of the MIMO adaptive inverse filtering problem into a series of single-channel adaptive inverse filtering problems. This way, the complexity of computing room compensation filters for massive multichannel systems was significantly reduced. Thus, the second and third fundamental problem formulated at the beginning of Section 4.4.4 have been solved. However, the first problem still remains. More precisely, the major drawbacks of the presented eigenspace adaptive filtering approach are:
1. The computation of the joint and right singular matrices $\mathbf{X}(\omega)$ and $\mathbf{V}(\omega)$ requires a-priori knowledge of the room transfer matrix $\mathbf{R}(\omega)$,
2. the computation of the joint and right singular matrices is quite complex, and
3. the transformation of the loudspeaker driving signals, the filtered loudspeaker driving signals and the measured signals may become complex without an efficient algorithm.
These problems emanate mainly from the fact that the GSVD is a data-dependent transformation. In general, no optimizations can be performed without placing restrictions on the structure of the transfer matrices $\mathbf{F}(\omega)$ and $\mathbf{R}(\omega)$. However, it is known that the listening room transfer matrix is constructed from a sampled version of the Green's function. The Green's function itself has to fulfill the wave equation and the homogeneous boundary conditions imposed by the room. Thus, it should be possible to construct an analytic transformation from this knowledge. Comparing the series representation of the
Figure 4.24: Block diagram of the wave domain adaptive inverse filtering approach to active room compensation.
Green's function of the listening room given by the FTM (4.52) with the series representation of the listening room transfer matrix given in terms of its SVD (4.91) shows that both expansions exhibit very similar structures. Both expand the Green's function and its sampled representation into a series of kernels and adjoint kernels in the case of the FTM, and of left and right singular vectors in the case of the SVD. Hence, this similarity can be used to construct analytic transformations with the potential of incorporating optimizations. However, this will still not solve the first problem mentioned above.
The basic idea to overcome this problem is to give up the goal of a perfect diagonalization of the MIMO system in favor of a generic transformation which is, to some degree, independent of the listening room characteristics. In order to still benefit from the transformation of the MIMO system, the generic transformation should compact the MIMO system to its main diagonal elements and as few off-diagonal elements as possible. Choices for suitable transformations include e.g. free-field expansions of the Green's functions or statistics-based transformations like the Karhunen-Loève transformation (KLT) [JN84] that analyze several listening rooms in order to derive a suitable transformation. Since these generic transformations inherently have to account for the wave nature of sound in order to perform well, this approach will be termed wave domain adaptive (inverse) filtering (WDAF) and the transformed domain the wave domain in the following.
Based on the approach of eigenspace adaptive filtering and the above considerations, a generic block diagram of the WDAF approach can be developed. Figure 4.24 displays this generic block diagram. The signal and system transformations are performed by three generic transformations. Their structure is not limited to the MIMO FIR systems derived from
the GSVD. Transformation $\mathbf{T}_1$ transforms the driving signals into the wave domain, $\mathbf{T}_2$ inversely transforms the filtered loudspeaker driving signals from the wave domain, and $\mathbf{T}_3$ transforms the signals at the analysis points into the wave domain. The signals and transfer functions in the wave domain are denoted by a tilde over the respective variable, since suitable transforms will be based on the idea of a transformation into the eigenspace of the respective systems. The adaptation is then performed entirely in the wave domain. If the transformations perfectly decouple the MIMO system, then a series of single-channel inverse filtering problems in the wave domain results.

Please note that the generic block diagram depicted by Fig. 4.24 also includes the eigenspace adaptive inverse filtering and multi-point equalization approaches. In the former case the transformations are given as the following MIMO FIR systems (see Section 4.5.3): $\mathbf{T}_1 = \mathbf{U}^H(\omega)$, $\mathbf{T}_2 = \mathbf{V}(\omega)$ and $\mathbf{T}_3 = \mathbf{X}^H(\omega)$. In the latter case the transformations equal unit matrices.
The concept of WDAF as introduced in this section exhibits strong similarities to the theoretical concepts used for the coding of digital signals. Especially the frequently utilized approach of transform coding [JN84] is based on the same fundamental principle. A transformation of the digital signal into a transformed domain should yield a more compact representation than in the original domain. Typically, this transformed representation is then quantized to further compact the data. The optimal transformation in the sense of the fewest coefficients in the transformed domain is also provided by the SVD. However, the transformation matrices then depend on the data to be coded, which is undesirable. As a solution to this problem, suboptimal transformations are used which perform well in terms of decorrelation and energy compaction. The optimal transformation in these terms is given by the Karhunen-Loève transformation [JN84]. For natural still images, the discrete cosine transformation (DCT) has proven to provide a well-suited transformation for a wide variety of images [JN84, AR75, SGP+95]. In audio coding, the modified discrete cosine transformation (MDCT) is often employed due to its energy compaction properties for typical audio signals [WVY00].
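The energy-compaction principle can be illustrated with a small numerical sketch (the test signal is arbitrary; any sufficiently smooth, correlated signal behaves similarly):

```python
import numpy as np

n = 64
t = np.linspace(0, np.pi, n)
x = np.cos(t) + 0.5 * np.cos(3 * t)  # smooth, strongly correlated test signal

# Orthonormal DCT-II basis matrix (rows: sample index j, columns: order k)
j, k = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
C = np.sqrt(2.0 / n) * np.cos(np.pi * (j + 0.5) * k / n)
C[:, 0] /= np.sqrt(2.0)  # normalization of the DC basis vector

X = C.T @ x  # DCT coefficients; orthonormality preserves the signal energy

# Fraction of the total energy captured by the 8 largest-magnitude coefficients
idx = np.argsort(np.abs(X))[::-1][:8]
compaction = np.sum(X[idx] ** 2) / np.sum(X ** 2)
print(compaction > 0.95)  # nearly all energy sits in a few coefficients
```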
The next section will introduce suitable transformations for the approximate decoupling
of the listening room transfer matrix.
4.5.5
The previous section introduced WDAF. The basic idea behind this concept, in contrast to eigenspace adaptive filtering, is to utilize a data-independent transformation to transform the MIMO adaptation problem into a more compact representation. This section will introduce two analytic data-independent transformations for this purpose. They are based on free-field wave field representations and thus neglect the influence of the listening room. Hence, they will provide a decoupling of the free-field transfer matrix $\mathbf{F}(\omega)$ but not a perfect decoupling of the listening room transfer matrix $\mathbf{R}(\omega)$. However, both transformations can be implemented quite efficiently for special analysis geometries. Their performance in the context of WDAF-based listening room compensation will be evaluated in Section 5.3. Especially the decomposition into circular harmonics has proven its suitability for active listening room compensation in typical rectangular listening rooms.
4.5.5.1
The Green's function $G(\mathbf{x}|\mathbf{x}_0, \omega)$ can be interpreted as the transfer function from a source point $\mathbf{x}_0$ to a receiver point $\mathbf{x}$ with respect to the boundary conditions imposed by the listening room. Section 2.2 stated that plane waves are eigensolutions of the free-field wave equation formulated in Cartesian coordinates. It was further shown in Section 3.3 that arbitrary wave fields can be decomposed into plane waves by the plane wave decomposition. In order to describe the room characteristics for the propagation of plane waves, it is reasonable to define a transfer function for the propagation of plane waves similar to the Green's function $G(\mathbf{x}|\mathbf{x}_0, \omega)$. This function will be termed the plane wave Green's function $\tilde{G}(\theta|\theta_0, \omega)$ in the sequel. It can be interpreted as the transfer function from an excitation resulting in a plane wave with incidence angle $\theta_0$ to a resulting plane wave with the incidence angle $\theta$.
It was derived in Section 2.4.4 that the wave field emitted by a planar source under free-field conditions is a plane wave. Hence, the model of a planar source can be used to generate a plane wave in the listening room. The wave field produced by a planar source inside the listening room is given by introducing the inhomogeneous part belonging to a planar source (see Eq. (2.45)) into Eq. (4.43)

$$P_{pw,\theta_0}(\mathbf{x}, \omega) = 2jk \int \delta(\mathbf{n}_0^T \mathbf{x}_0) \, G(\mathbf{x}|\mathbf{x}_0, \omega) \, dV_0 \, , \qquad (4.105)$$

where $\mathbf{n}_0$ denotes the normal vector of the planar source (plane wave) and $G(\mathbf{x}|\mathbf{x}_0, \omega)$ a suitably chosen Green's function which conforms to the homogeneous boundary conditions of the listening room. The normal vector $\mathbf{n}_0$ depends on the incidence angle of the produced plane wave, $\mathbf{n}_0 = \mathbf{n}_0(\theta_0)$. The plane wave Green's function $\tilde{G}(\theta|\theta_0, \omega)$ can be derived by performing a plane wave decomposition of the wave field produced by the planar source in the listening room

$$\tilde{G}(\theta|\theta_0, \omega) = \mathcal{P}\{ P_{pw,\theta_0}(\mathbf{x}, \omega) \} \, . \qquad (4.106)$$
In general, $\tilde{G}(\theta|\theta_0, \omega)$ will depend on the wave equation and its boundary conditions. The plane wave Green's function describes the propagation of plane waves under the given boundary conditions. For free-field propagation conditions, the excited plane wave will
be the only one present. Thus, the plane wave representation of the free-field Green's function $G_0(\mathbf{x}|\mathbf{x}_0, \omega)$ is given as

$$\tilde{G}_0(\theta|\theta_0, \omega) = \delta(\theta - \theta_0) \, . \qquad (4.107)$$

Figure 4.25: Transformation of the listening room response $\mathbf{R}(\omega)$ into its plane wave representation $\tilde{\mathbf{R}}(\omega)$ using the plane wave decomposition $\mathcal{P}$.
It was stated in Section 4.4.1 that the listening room transfer matrix $\mathbf{R}(\omega)$ can be interpreted as a spatially sampled version of the Green's function from the synthesis points to the analysis points. Thus, it can be transformed into its plane wave representation $\tilde{\mathbf{R}}(\omega)$ using suitable transformations. The discrete plane wave decomposition allows to decompose the wave field captured at the analysis points into its plane wave representation. However, care has to be taken that the reproduction system generates a plane wave with incidence angle $\theta_0$ for the measurement of the plane wave room transfer matrix $\tilde{\mathbf{R}}(\omega)$. For a continuous secondary source distribution, the driving signals for the reproduction of a plane wave decomposed wave field were derived in Section 4.1.5 and are given by Eq. (4.18). For the special choice $P(\theta, \omega) = \delta(\theta - \theta_0)$, Eq. (4.18) generates the driving signals for a Dirac-shaped plane wave with incidence angle $\theta_0$. A suitably discretized version of Eq. (4.18) can then be used for the computation of $\tilde{\mathbf{R}}(\omega)$.
Figure 4.25 illustrates the transformation of the listening room transfer matrix into its plane wave representation. The number of plane wave components is denoted by $M$ and the number of secondary sources by $N$. The transformation $\mathbf{D}_{pw}$ denotes the calculation of suitable driving signals from the plane wave decomposed desired field $\tilde{\mathbf{w}}^{(M)}(\omega)$ using a discrete formulation of Eq. (4.18). The plane wave transfer matrix $\tilde{\mathbf{R}}(\omega)$ describes the influence of the listening room on the desired field $\tilde{\mathbf{w}}^{(M)}(\omega)$ in terms of plane waves.
4.5.5.2
It was shown in Section 2.3.2 that arbitrary wave fields can be decomposed into circular harmonics. Circular harmonics are the eigensolutions of the wave equation formulated in polar coordinates.

Figure 4.26: Transformation of the listening room response $\mathbf{R}(\omega)$ into its circular harmonics representation $\tilde{\mathbf{R}}(\omega)$ using the circular harmonics decomposition.

The circular harmonics Green's function $\tilde{G}(\nu|\nu_0, \omega)$ can be interpreted as the transfer function from an excitation resulting in a circular harmonic (or multipole) with order $\nu_0$ to a resulting circular harmonic (or multipole) with the order $\nu$. It can be computed similarly to the plane wave Green's function by introducing a suitable excitation into Eq. (4.43) and calculating the circular harmonics expansion coefficients of the resulting wave field using Eq. (3.59). In general, $\tilde{G}(\nu|\nu_0, \omega)$ will depend on the wave equation and its boundary conditions.
The transformations required to derive the room transfer function $\tilde{\mathbf{R}}(\omega)$ in terms of circular harmonics will be summarized briefly in the following for a circular geometry of the analysis positions. The expansion coefficients in terms of circular harmonics of the captured wave field are then given by Eq. (3.102). However, care has to be taken that the reproduction system generates a circular harmonic with order $\nu_0$ for the measurement of $\tilde{\mathbf{R}}(\omega)$.
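For uniformly spaced microphones on a circle, the angular part of the circular harmonics decomposition reduces to a DFT over the microphone positions. A minimal sketch of this idea (array size and test order chosen arbitrarily; the radial terms of Eq. (3.102) are omitted):

```python
import numpy as np

M = 16  # number of microphones on the circle
phi = 2 * np.pi * np.arange(M) / M  # uniform angular positions

# Test field on the circle: a single circular harmonic of order nu0 = 3
nu0 = 3
p = np.exp(1j * nu0 * phi)

# Angular DFT yields the circular harmonics coefficients
coeffs = np.fft.fft(p) / M

# Only the bin corresponding to order nu0 is nonzero
expected = np.zeros(M, dtype=complex)
expected[nu0] = 1.0
print(np.allclose(coeffs, expected))  # True
```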
4.6 Summary
This chapter derived an improved listening room compensation system which overcomes
the limitations of the traditional multipoint compensation algorithms by fulfilling the
requirements stated in Section 1.2.
The first requirement stated in Section 1.2 called for a proper analysis of the wave field reproduced within the listening area. The wave field analysis techniques introduced in Chapter 3 allow the analysis of the entire wave field within the listening area by measurements taken on its boundary. The analysis can be performed by measuring at a limited number of discrete positions if the bandwidth of the analyzed wave field is limited.
The second requirement stated in Section 1.2 called for a spatial reproduction system
which provides control over the reproduced wave field within the listening area. Section 4.1
introduced a generic theory of sound reproduction based on the Kirchhoff-Helmholtz integral. It was further shown in Section 4.1.6 that a spatially discrete distribution of monopoles surrounding the listening area allows control of the wave field within the listening area if the (temporal) bandwidth of the reproduced wave field is limited.
Both WFA and sound reproduction presume proper spatial sampling in order to be able to analyze and control a bandlimited wave field within the listening area. The number of analysis and reproduction channels required for proper control and analysis is quite high even for a relatively moderate bandwidth of the wave fields concerned. As a result, the application of traditional adaptive multichannel inverse modeling algorithms becomes unfeasible due to the dimensionality of the normal equation (4.83) of the adaptation problem. Additionally, the autocorrelation matrix $\mathbf{\Phi}_{dd}(k)$ may be ill-conditioned due to spatio-temporal correlations.
The third requirement stated in Section 1.2 called for an improved multichannel adaptation algorithm that overcomes these problems. Section 4.5.3 and Section 4.5.4 proposed the use of eigenspace adaptive inverse filtering or WDAF in order to fulfill the third requirement. Eigenspace adaptive inverse filtering uses the GSVD to spatially decouple the MIMO adaptation problem. This decoupling results in a series of single-channel inverse adaptation problems and in a significant complexity reduction, since the dimensionality of the single-channel normal equation (4.104) is reduced by a factor of $N^4$ (where $N$ equals the number of loudspeakers) compared to the multichannel case (see Section 4.5.1). Another effect of the decoupling is that the identification of the room transfer matrix required for the filtered-x adaptation algorithms is also decoupled into single-channel identification problems [BSK04b, BSK04c, BSK04a]. The complexity of eigenspace adaptive filtering can be reduced further by discarding components with low significance in the transformed domain.
However, the drawback of using the GSVD as transformation for the decoupling is its dependence on the room transfer matrix. The concept of WDAF proposes the use of a generic transformation which is, to some extent, independent of the room characteristics. The transformation used for WDAF should compact the MIMO system to as few paths as possible. In the optimal case these paths should be decoupled or should only be weakly coupled. Two analytic transformations which may be used for this purpose were introduced in the previous section.
The following chapter will illustrate the application of WDAF to active listening room
compensation for wave field synthesis systems.
Chapter 5

Room Compensation Applied to Spatial Sound Systems
The previous chapter derived the theoretical basis of active listening room compensation. For a practical implementation of the presented techniques and algorithms, sound
reproduction and WFA have to be realized by properly designed systems. The following
section discusses the practical realization of sound reproduction using the concept of wave
field synthesis, the practical realization of WFA by circular microphone arrays and active
listening room compensation using WDAF.
5.1
Wave field synthesis (WFS) is a sound reproduction technique which is essentially based on
the Kirchhoff-Helmholtz integral. The concept of sound reproduction using this physical foundation was discussed in Section 4.1. However, in order to arrive at a realizable system, several theoretical and practical problems have to be solved. Their solutions constitute the concept of WFS, which was initially developed at the Technical University of Delft [Ber88] and has been developed further by a vital WFS research community during the past two decades [Sta97, Ver97, Vog93, dVSV94, SdVL98, dB04, Hul04, TWR03, PEBL05, Spo04, WW+04, WKRT04, STR02].
It was stated in Section 4.1.4 that a two-dimensional sound reproduction system can be realized by appropriately driving a monopole line source distribution surrounding the listening area. In theory, these secondary line sources would have to be of infinite length. In practice however, a finite line source from the ceiling to the floor is sufficient when assuming an acoustically rigid floor and ceiling [Hul04]. Although it was shown by [Hul04] that such line sources can be realized by electrostatic loudspeakers, this solution is costly and impractical. The concept of WFS therefore utilizes a distribution of monopole point sources as secondary sources. The benefit of this choice is that closed loudspeakers
constitute a reasonable approximation of point sources and thus can be used for the realization of such a system. The mismatch of secondary source types may lead to the artifacts discussed in detail in the following section. Without compensation of these artifacts, the reproduced wave field is given by Eq. (4.11) by exchanging the secondary source term constituting a line source by the one constituting a monopole point source

$$P(\mathbf{x}, \omega) = \frac{1}{4\pi} \oint_{\partial V} D(\mathbf{x}_0, \omega) \, \frac{e^{-jk \|\mathbf{x} - \mathbf{x}_0\|}}{\|\mathbf{x} - \mathbf{x}_0\|} \, dL_0 \, , \qquad (5.1)$$

where the contour $\partial V$ delimits the listening area $V$ (as illustrated by Fig. 4.1) and $dL_0$ denotes a suitably chosen line element on $\partial V$. The driving signal $D(\mathbf{x}_0, \omega)$ is defined by Eq. (4.10).
For a practical system, the secondary source distribution has to be discretized. The discretization of the secondary source distribution and its consequences on the reproduced wave field were discussed in Section 4.1.6. It is assumed in the following that the anti-aliasing conditions derived in Section 4.1.6 are reasonably fulfilled. These conditions are not fulfilled for the entire auditory frequency range of humans for a typical WFS system. Thus, aliasing will be present in the reproduced wave field. However, for reproduction purposes aliasing artifacts do not play a dominant role, since the human auditory system does not seem to be too sensitive to spatial aliasing [dB04, Sta97, Wit05]. A distance of $\Delta x = 10 \dots 30$ cm between the secondary source positions has proven to be suitable in practice. Unfortunately, aliasing poses limits for active listening room compensation, since no control can be gained within the listening area above the aliasing frequency.
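The practical impact of the spacing can be sketched with the common rule of thumb $f_{al} \approx c/(2 \Delta x)$; note that this rule is an assumption here, as the exact condition derived in Section 4.1.6 also depends on the incidence angle:

```python
c = 343.0  # speed of sound in m/s at room temperature

def aliasing_frequency(dx_m: float) -> float:
    """Rule-of-thumb spatial aliasing frequency for loudspeaker spacing dx_m."""
    return c / (2.0 * dx_m)

# Typical WFS loudspeaker spacings of 10...30 cm
for dx in (0.10, 0.20, 0.30):
    print(f"dx = {dx:.2f} m -> f_al ~ {aliasing_frequency(dx):.0f} Hz")
```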
Concluding the above considerations, a WFS system can be realized by using closed loudspeakers which surround the listening area. These loudspeakers should be leveled with the listener's ears for optimal auralization results. The listening area and the surrounding loudspeakers may have arbitrary shapes. Examples of WFS systems will be shown in Section 5.1.5. However, care has to be taken to compensate for the use of monopole point sources instead of line sources as secondary sources. The following section will investigate this in detail.
5.1.1
In order to analyze and correct the artifacts caused by using secondary point sources instead of line sources for two-dimensional sound reproduction, the geometry depicted in Fig. 5.1 is considered. The listening area $V$ and its surrounding contour $\partial V$ are located in the $z = 0$ plane. The surface is generated by extending $\partial V$ infinitely into both $z$-directions. This special choice allows to degenerate the reproduction geometry from the three-dimensional case to the desired two-dimensional case. By comparing the resulting solution with Eq. (5.1), the artifacts of the point source usage will be derived.
Figure 5.1: Illustration of the geometry used to derive the artifacts of using secondary monopole point sources for two-dimensional reproduction.

The wave field generated by a distribution of monopole point sources placed on this surface is covered by Eq. (4.8). Its specialization to the geometry illustrated in Fig. 5.1 is
given as
I
1
ejkxCxC,0 
PC (xC , ) =
DC (xC,0 , )
dS0 ,
(5.2)
4 V
xC xC,0 
where the monopole driving function is defined according to Eq. (4.10) and dS_0 denotes
a suitably chosen surface element on ∂V̂. For this specialized geometry, the vector \hat{x}_{C,0}
on ∂V̂ can be expressed by the vector x_{C,0} on ∂V and an offset in the z-direction

\hat{x}_{C,0}(x_{C,0}, z_0) = x_{C,0} + [\,0 \;\; 0 \;\; z_0\,]^T .    (5.3)
It was shown in Section 3.3 that arbitrary wave fields can be decomposed into plane
waves. Thus, it is sufficient to derive the artifacts for the reproduction of a plane wave.
The driving function for the reproduction of a plane wave is given as
D_{C,pw}(x_{C,0}, \omega) = 2\, jk\, P(\omega)\, a(x_{C,0}) \cos\theta \; e^{-j k_{C,0}^T x_{C,0}} ,    (5.4)
where θ denotes the angle between the surface normal n on ∂V and the wave vector k_{C,0}
of the plane wave (θ = ∠(n, k_{C,0})), and P(ω) denotes the spectrum of the plane wave. Due to the
special geometry it is possible to split the integration path of Eq. (5.2) into an integration
along the closed curve ∂V and an integration along the z-direction. Please note that the
surface normal n is independent of z_0. Introducing D_{C,pw}(x_{C,0}, ω) into Eq. (5.2) and
splitting the integration accordingly yields

P_C(x_C, \omega) = \frac{jk}{2\pi}\, P(\omega) \oint_{\partial V} a(x_{C,0}) \cos\theta \int_{-\infty}^{\infty} \frac{e^{-j \left( k |x_C - \hat{x}_{C,0}(x_{C,0}, z_0)| + k_{C,0}^T x_{C,0} \right)}}{|x_C - \hat{x}_{C,0}(x_{C,0}, z_0)|} \, dz_0 \, dL_0 .    (5.5)
The inner integral can be approximated, for not too small wave numbers k and distances
|x_C − \hat{x}_{C,0}(x_{C,0}, z_0)|, using the stationary phase method. Details on this method and its
application to the inner integral can be found in Appendix C.2. This approximation of
the inner integral leads to the following result
P_C(x_C, \omega) \approx \frac{1}{4\pi} \sqrt{\frac{1}{jk}} \oint_{\partial V} \sqrt{2\pi |x_C - x_{C,0}|}\; D_{C,pw}(x_{C,0}, \omega)\, \frac{e^{-jk |x_C - x_{C,0}|}}{|x_C - x_{C,0}|} \, dL_0 .    (5.6)
Equation (5.6) states that the contributions of all secondary sources on a line parallel
to the z-axis through the point x_{C,0} can be approximated by one secondary point source
located at x_{C,0}, provided that its driving function is corrected as

D_{C,pw}^{2D}(x_{C,0}, \omega) = \sqrt{\frac{2\pi |x_C - x_{C,0}|}{jk}}\; D_{C,pw}(x_{C,0}, \omega) .    (5.7)

Equation (5.7) states that the driving signal has to be modified in its amplitude and frequency
characteristics. The amplitude correction depends on the secondary source position
x_0 and on the receiver position x within the listening area. Thus, a compensation of the
amplitude artifacts seems to be possible only for one particular point within V. This is
not desirable, since an optimal spatial audio reproduction system should reproduce the
desired wave field without any artifacts at all positions within the listening area. An
amplitude correction for a reference line instead of a reference point is also possible for
certain geometries of the secondary sources and the reference line. For the reproduction
of virtual point sources this was shown in [Sta97]. The amplitude correction included in
Eq. (5.6) can be used to estimate the amplitude error of an arbitrarily shaped WFS system.
This will be shown in Section 5.1.3.
The results also indicate that no phase errors are present in the reproduced wave field using
WFS when applying the necessary compensation given by Eq. (5.7). Especially for active
room compensation this is an important result. Please note that the derived correction
is only valid for the reproduction of plane waves. Similar results for the reproduction of
virtual point sources can be found in [Sta97, Ver97, Vog93].
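The accuracy of the stationary phase step leading to Eq. (5.6) can be checked numerically for a single receiver: the inner z_0-integral over a line of point sources should approach √(2π/(jkρ)) e^{−jkρ}, with ρ the in-plane distance. A sketch of this check (the truncation length and sampling are illustrative choices):

```python
import numpy as np

def line_source_integral(k, rho, z_max=200.0, n=200001):
    # Truncated inner integral of Eq. (5.5) for one receiver at in-plane
    # distance rho from a line of secondary point sources along z.
    z = np.linspace(-z_max, z_max, n)
    d = np.sqrt(rho**2 + z**2)
    return np.sum(np.exp(-1j * k * d) / d) * (z[1] - z[0])

def stationary_phase_result(k, rho):
    # Closed-form stationary phase approximation used to arrive at Eq. (5.6).
    return np.sqrt(2.0 * np.pi / (1j * k * rho)) * np.exp(-1j * k * rho)
```

For k and ρ of the order used later in this chapter (f = 400 Hz, ρ = 1.5 m) the two results agree to within a few percent, which illustrates why the approximation is adequate for not too small kρ.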
5.1.2
The previous section showed that WFS is not capable of correctly reproducing the amplitude of the desired wave field throughout the entire listening area. In general, the
amplitude of the desired wave field can only be reproduced correctly at one reference
position (or reference line). A WFS system may also exhibit other artifacts besides these
amplitude errors. This section will briefly discuss four types of WFS artifacts and
their impact on active listening room compensation. The artifacts of WFS are:
1. Spatial aliasing
In a practical realization the continuous distribution of secondary point sources will
be realized by point sources placed at discrete positions. This spatial sampling of
the secondary source distribution was discussed in detail in Section 4.1.6. Spatial
sampling will result in spatial aliasing artifacts present in the reproduced wave
field if the anti-aliasing conditions are not met. Most WFS systems are designed
for wideband reproduction of sound. However, due to complexity considerations the
distance between the sampled secondary sources is typically chosen such that it does
not fulfill the anti-aliasing conditions for the entire frequency range. As a result, spatial
aliasing artifacts are present in the reproduced wave field. Fortunately, the human
auditory system does not seem to be very sensitive to these artifacts if the sampling of
the secondary sources is performed such that the resulting minimum temporal
aliasing frequency is about 1 kHz [dB04, Sta97, Wit05].
2. Truncation and diffraction
Practical realizations of WFS systems will have finite dimensions and may have bends
in their secondary source contours or non-closed contours. This may lead to truncation and diffraction artifacts in the reproduced wave field. The impact
of these artifacts can be decreased to some extent by applying spatial tapering to
the driving signals. For a detailed discussion of these artifacts and countermeasures
please refer to [Sta97, Ver97].

1. spatial aliasing: no control above the aliasing frequency
2. truncation and diffraction: artifacts in the compensating wave field
3. restriction to two-dimensional reproduction: elevated reflections cannot be compensated
4. amplitude errors: suppression limited outside the reference position (line)

Table 5.1: Overview of the artifacts of WFS and their impact on active room compensation.
3. Restriction to two-dimensional reproduction
Typical WFS systems are limited to reproduction in two dimensions. In addition to the artifacts discussed above, this limitation has consequences for the control a
WFS system has over the reproduced wave field. A two-dimensional WFS system
is only capable of acoustically controlling the plane in which the secondary sources are
placed. The wave field above and below this plane will exhibit artifacts [Sta97].
4. Amplitude errors
The secondary source type mismatch of a two-dimensional WFS system results
in the spectral and amplitude errors derived in Section 5.1.1. The spectral errors
can be corrected for all listener positions within the listening area. However, the
amplitude errors can only be corrected for one listener position (or reference line)
in general. As a result, the reproduced wave field will exhibit position dependent
amplitude errors.
Table 5.1 summarizes the artifacts of WFS. For a properly designed WFS system and
typical auralization scenarios these effects play no dominant role. However, for active
room compensation they pose limits on the achievable performance. Spatial aliasing (1)
limits the frequency up to which an application of room compensation is possible, since
above of this frequency WFS gains no proper control over the reproduced wave field.
The truncation and diffraction errors (2) will pose limits to room compensation in that
167
sense that they introduce artifacts into the wave field reproduced for destructive interference. The limitation to twodimensional reproduction (3) constricts the suppression
of reflections twofold. On the one side, reflections emerging from boundaries outside the
reproduction plane (elevated reflections) cannot be compensated for the entire listening
area. On the other side, the performance of active room compensation will decrease for
listener positions above or below the listening area. The amplitude errors (4) limit the
suppression a WFS system can gain by destructive interference. Perfect compensation is
only possible at the reference position (line), outside of this position (line) the listening
room reflections cannot be compensated perfectly. These amplitude errors have to be
taken into account when prescribing a desired wave field for the adaptation process.
In the following section, the amplitude errors and the suppression of elevated reflections
will be analyzed in more detail for active room compensation using a circular loudspeaker
array.
5.1.3
The previous section introduced the artifacts of WFS systems on a qualitative level. The
following section will analyze the artifacts of circular WFS systems and their impact
on room compensation in a more quantitative fashion. The reason for considering this
specialized geometry is that a circular array was used for the experimental validation of
the proposed methods in Section 5.3. In the sequel the reproduction of plane waves will
be considered. It is sufficient to derive the artifacts of a circular WFS system for one
fixed incidence angle of the reproduced plane wave only, since circular arrays are radially
symmetric. The resulting characteristics for arbitrary wave fields can then be derived
easily from the presented results. Some of the results shown in the following have been
presented in [SRR05].
Figure 5.2 illustrates the geometry of a circular WFS system. Specialization of Eq. (5.1)
to the desired geometry together with the corrected plane wave driving function (5.7)
yields the wave field P_P(x_P, ω) reproduced by the circular WFS system as

P_P(x_P, \omega) = \sqrt{\frac{R^3}{8\pi jk}} \int_{\theta_0 + \pi/2}^{\theta_0 + 3\pi/2} D_{P,pw}(\alpha, R, \omega)\, \frac{e^{-jk \hat r}}{\hat r} \, d\alpha ,    (5.8)

where θ_0 denotes the incidence angle of the plane wave, R the radius of the loudspeaker
array and D_{P,pw} the uncorrected driving function for a plane wave. The integration limits
in Eq. (5.8) are chosen in accordance with the effect of the window function a(α) given by
Eq. (4.9). The uncorrected driving function D_{P,pw} can be derived from Eq. (4.40) as

D_{P,pw}(\alpha, R, \omega) = -2\, jk\, P(\omega) \cos(\alpha - \theta_0)\, e^{-jkR \cos(\alpha - \theta_0)} ,    (5.9)
Figure 5.2: Illustration of the geometric parameters used to describe the wave field
reproduced by a circular WFS system.
where θ_0 denotes the incidence angle of the plane wave and P(ω) its spectrum. The
amplitude correction included in Eq. (5.8) was chosen so as to obtain the correct amplitude
at the center of the circular array. The distance \hat r between a secondary source and a
listener position at polar coordinates (r, φ) depends on the integration variable and is given as

\hat r(\alpha, \varphi, r, R) = \sqrt{R^2 + r^2 - 2 R r \cos(\alpha - \varphi)} .    (5.10)
Equation (5.8) together with Eq. (5.9) and Eq. (5.10) constitutes the mathematical description of the wave field reproduced by a circular WFS system. The effects of secondary
source sampling for a circular reproduction system have been discussed in Section 4.1.6.2.
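The description given by Eqs. (5.8)–(5.10) can be evaluated numerically, as is done for the simulations discussed below. A minimal sketch of such an evaluation (the discretization, the unit spectrum P(ω) = 1 and the sign conventions of the driving function are assumptions of this illustration):

```python
import numpy as np

def reproduced_field(r, phi, f, theta0=np.pi / 2, R=1.5, c=343.0, n=4000):
    # Numerical evaluation of Eq. (5.8) for a listener at polar position (r, phi).
    k = 2.0 * np.pi * f / c
    alpha = np.linspace(theta0 + np.pi / 2, theta0 + 3 * np.pi / 2, n)
    # Eq. (5.10): distance between the secondary source at angle alpha and the listener
    r_hat = np.sqrt(R**2 + r**2 - 2.0 * R * r * np.cos(alpha - phi))
    # Eq. (5.9): uncorrected plane wave driving function with P(omega) = 1
    D = -2.0j * k * np.cos(alpha - theta0) * np.exp(-1j * k * R * np.cos(alpha - theta0))
    integrand = D * np.exp(-1j * k * r_hat) / r_hat
    return np.sqrt(R**3 / (8.0 * np.pi * 1j * k)) * np.sum(integrand) * (alpha[1] - alpha[0])
```

By construction of the amplitude correction, the magnitude of the field at the array center should be close to one.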
The following two sections will quantitatively analyze the effects of the amplitude errors
and of the restriction to two-dimensional reproduction on room compensation using a circular WFS system.
5.1.3.1 Amplitude Errors
Section 5.1.1 illustrated the application of the stationary phase approximation to compensate for the secondary source type mismatch of WFS. This approximation can further
be used to derive an upper bound on the amplitude error for the reproduction of a plane
wave. The reproduced wave field is given by introducing Eq. (5.4) into Eq. (5.6)

P(x, \omega) = \sqrt{\frac{jk}{2\pi}}\, P(\omega) \oint_{\partial V} a(x_0)\, \frac{\cos\theta}{\sqrt{|x - x_0|}}\; e^{-j k_0^T x_0}\, e^{-jk |x - x_0|} \, dV_0 ,    (5.11)
where θ denotes the angle between the surface normal n on ∂V and the wave vector k_0
of the plane wave to be reproduced. An upper bound on the amplitude error is derived
by shifting the calculation of the absolute value into the integral and noting that the
exponential functions have unit amplitude. This results in the following upper bound

|P(x, \omega)| \leq |P(\omega)| \sqrt{\frac{k}{2\pi}} \oint_{\partial V} a(x_0)\, \frac{|\cos\theta|}{\sqrt{|x - x_0|}} \, dV_0 .    (5.12)
Equation (5.12) can be specialized straightforwardly to the geometry depicted in Fig. 5.2.
The limiting effect of the amplitude errors on room compensation can be estimated by
calculating the error between the reproduced wave field P(x, ω) and the wave field of the
desired plane wave P_{pw}(x, ω) as follows

E(x, \omega) = P(x, \omega) - P_{pw}(x, \omega) ,    (5.13)

where P_{pw}(x, ω) is given by Eq. (2.47). For room compensation the desired wave field
P_{pw}(x, ω) can be interpreted as the reflections caused by the listening room which should
be canceled by destructive interference.
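For later reference, the error of Eq. (5.13) is conveniently expressed as a suppression level in dB relative to the desired field; a small helper (the function name is illustrative):

```python
import numpy as np

def suppression_db(p_reproduced, p_desired):
    # Eq. (5.13): E = P - P_pw, expressed in dB relative to the desired field.
    # Large negative values mean good suppression by destructive interference.
    err = np.abs(np.asarray(p_reproduced) - np.asarray(p_desired))
    return 20.0 * np.log10(err / np.abs(np.asarray(p_desired)))
```

An amplitude mismatch of 10 percent, for example, limits the achievable suppression to about −20 dB.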
In the following, results for one particular reproduction scenario will be shown. For this
purpose the reproduction of a monochromatic plane wave with a frequency of f_0 = 400 Hz
and incidence angle θ_0 = 90° by a circular array with radius R = 1.50 m is considered.
The reproduced wave field and the upper bound for the amplitude were calculated by
numeric evaluation of Eq. (5.8) and Eq. (5.12) for the chosen geometry and desired plane
wave. The amplitude of the reproduced wave field was adjusted such that it has unit
amplitude in the center. The numeric evaluation of the reproduced wave field by Eq. (5.8)
also includes all other reproduction artifacts mentioned in Section 5.1.2.
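The upper bound of Eq. (5.12) can be specialized to the circular geometry of Fig. 5.2 with dV_0 → R dα and the window a(x_0) realized by the integration limits; a sketch of the numeric evaluation (discretization and sign conventions are illustrative assumptions):

```python
import numpy as np

def amplitude_bound(r, phi, f, theta0=np.pi / 2, R=1.5, c=343.0, n=4000):
    # Eq. (5.12) for the circular geometry of Fig. 5.2; the integration limits
    # realize the window function a(x0) as in Eq. (5.8).
    k = 2.0 * np.pi * f / c
    alpha = np.linspace(theta0 + np.pi / 2, theta0 + 3 * np.pi / 2, n)
    dist = np.sqrt(R**2 + r**2 - 2.0 * R * r * np.cos(alpha - phi))
    cos_theta = -np.cos(alpha - theta0)     # angle between inward normal and wave vector
    integrand = np.abs(cos_theta) / np.sqrt(dist)
    return np.sqrt(k / (2.0 * np.pi)) * R * np.sum(integrand) * (alpha[1] - alpha[0])
```

The bound inherits the mirror symmetry of the geometry about the incidence direction, which is one simple consistency check of the implementation.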
Figure 5.3 illustrates the derived results for a 2 m × 2 m area in the center of the listening
area. Figure 5.3(a) shows a snapshot of the reproduced wave field: a plane wave with
incidence angle θ_0 = 90° and frequency f_0 = 400 Hz. At first sight, the circular system
described by Eq. (5.8) seems to be capable of reproducing a plane wave without major
artifacts. However, some slight deviations in the amplitude are visible. Figure 5.3(b)
shows the amplitude of the reproduced plane wave. The equi-amplitude contours illustrate
the amplitude variations. For the region shown, the overall amplitude variation is about
8 dB. Figure 5.3(c) shows the upper bound of the amplitude error as given by Eq. (5.12)
and its equi-amplitude contours. It can be seen clearly that the upper bound provides a
reasonable approximation of the amplitude of the simulated wave field for the upper half
(y > 0) of the region shown. For the lower half (y < 0) the error is overestimated by
about 3 dB.
Figure 5.3(d) shows the amplitude of the averaged error E(x, ω). The error was averaged
over one signal period of the monochromatic plane wave in order to eliminate numerical
artifacts. The error is small in the vicinity of the center due to the amplitude adjustment
there. As predicted by Figure 5.3(b), the error is smaller above the center. This is
due to the fact that the secondary sources above the center are used for the reproduction
of this particular plane wave.

Figure 5.3: Results when reproducing a monochromatic plane wave with a circular WFS
system: (a) snapshot of the reproduced wave field, (b) amplitude of the reproduced wave
field, (c) upper bound of the amplitude error, (d) averaged error between the desired and
the reproduced wave field. The desired plane wave has a frequency of f_0 = 400 Hz and an
incidence angle of θ_0 = 90°. The radius of the simulated WFS system is R = 1.50 m.

Figure 5.3(d) shows the maximum achievable position-
dependent suppression that can be reached for the compensation of a plane wave by
destructive interference. The results also include truncation errors due to the numerical
evaluation of Eq. (5.8).
5.1.3.2
The previous section investigated the amplitude errors of circular WFS systems and their
impact on active room compensation. As stated before, a two-dimensional WFS system
only has full control over the wave field within the reproduction plane. Reflections caused
by boundaries outside of that plane (e.g. by the ceiling) will result in contributions that are
elevated with respect to the reproduction plane. In the following, some particular results for the
suppression of elevated plane waves will be shown.
The pressure field of an elevated plane wave in the reproduction plane (z = 0) is given as

P_{P,pw,\beta_0}(\varphi, r, \omega) = P(\omega)\, e^{-jkr \cos(\varphi - \theta_0) \cos\beta_0} ,    (5.14)

where β_0 denotes the elevation angle with respect to the reproduction plane. An elevation
angle of β_0 = 0° denotes no elevation. As in the previous section, Eq. (5.8) was evaluated
numerically. For this purpose, the driving function belonging to an elevated plane wave was
computed using Eq. (4.10) and introduced into Eq. (5.8). The absolute value of the
error E(x, ω) defined by Eq. (5.13) is used as performance measure for the suppression of
elevated contributions. Figure 5.4 shows the suppression of the incident wave field by the
circular WFS system for different elevation angles of the incident plane wave. Figure 5.4(a)
with an elevation angle of β_0 = 0° is shown for reference; it is equal to Figure 5.3(d). As
expected, an increasing elevation angle lowers the suppression of the incident field that
can be achieved by WFS. For an elevation angle of β_0 = 30° the averaged error is even
higher than without room compensation. As a consequence of these results, proper
damping of the ceiling and the floor in order to avoid elevated reflections seems to be
mandatory for an active room compensation system using two-dimensional reproduction
techniques.
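The elevated plane wave model of Eq. (5.14) can be sketched as follows (unit spectrum P(ω) = 1 assumed; the function name is illustrative):

```python
import numpy as np

def elevated_plane_wave(phi, r, f, theta0=np.pi / 2, beta0=0.0, c=343.0):
    # Eq. (5.14): trace of an elevated plane wave in the z = 0 plane.
    # beta0 = 0 means no elevation; increasing beta0 shrinks the trace wavenumber.
    k = 2.0 * np.pi * f / c
    return np.exp(-1j * k * r * np.cos(phi - theta0) * np.cos(beta0))
```

The cos β_0 factor in the exponent is exactly what a two-dimensional WFS system cannot reproduce with an in-plane plane wave, which explains the decreasing suppression for growing elevation angles.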
5.1.4 Rendering Techniques
The practical realization of a WFS system requires generating the individual driving
signals for each loudspeaker. The process of generating the driving signals and reproducing
the desired wave field is referred to as acoustic rendering in the following. This term is
chosen in accordance with the term rendering frequently used in computer graphics for the
creation of (virtual) visual scenes. The following section will introduce two rendering
techniques for WFS.
Figure 5.4: Average error between the desired and the reproduced wave field for a
plane wave with a frequency of f_0 = 400 Hz, an incidence angle of θ_0 = 90° and varying
elevation angles β_0. The radius of the simulated WFS system is R = 1.50 m.
5.1.4.1 Data-based Rendering
The technique of data-based rendering auralizes a recorded or synthetically created (virtual) acoustic scene from knowledge of its acoustic wave field on the border of the listening region. Section 4.1.4 illustrated the reproduction of arbitrary virtual source wave
fields using secondary monopole sources. The presented technique requires the
local propagation direction of the virtual source wave field at the loudspeaker positions.
It was shown in [BdVV93] that one possibility to obtain the desired loudspeaker driving
signals is to place directional microphones (cardioids) at or near the loudspeaker positions. One major drawback of this approach is that the loudspeaker and the microphone
setups have to match exactly.
A more convenient way is to measure or virtually create a plane wave decomposition of
the wave field generated by the virtual source. Suitable techniques for this purpose were
introduced in Section 3.4. The loudspeaker driving signals can be obtained by extrapolation of the plane wave decomposed signals to the loudspeaker positions, as shown in
Section 4.1.5. The area analyzed in order to derive the plane wave decomposition should
include at least the listening area for optimal results.
Typically, the plane wave decomposition of the spatial impulse response from one virtual
source to the desired listening area is measured or simulated [HdVB02, Hul04] due to
complexity restrictions. The plane wave decomposition of the virtual source wave field
for arbitrary excitations is obtained by (time-domain) convolution of the virtual source
signal and the plane wave decomposition of the spatial impulse response, as given by
Eq. (3.60). The loudspeaker driving signals for arbitrary virtual source signals can be
computed as follows

d(x_n, t) = d_0(x_n, t) * s(t) ,    (5.15)
where d(x_n, t) denotes the driving signal of the n-th loudspeaker, d_0(x_n, t)
the impulse response obtained from the (extrapolated) measurements and s(t) the virtual
source signal. The impulse response d_0(x_n, t) for WFS can be obtained from the
plane wave decomposition of the measured wave field (spatial impulse response) by using
Eq. (4.18) together with the secondary source correction (5.7). The convolution (5.15) has to be performed for each loudspeaker. Thus, reproduction of a virtual
source using data-based rendering requires a multichannel convolution of the virtual source
signal with the impulse responses d_0(x_n, t). For a large number of loudspeakers and/or
long impulse responses d_0(x_n, t) this process may become computationally very complex.
Data-based rendering of acoustic scenes is typically used for the high-quality
reproduction of complex static acoustic scenes. Since data-based rendering is capable
of reproducing arbitrary wave fields, it can be used to generate the wave field for the
compensation of the listening room reflections.
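The multichannel convolution of Eq. (5.15) can be sketched as follows (a direct per-channel convolution; a real-time system would rather use fast partitioned convolution, as provided e.g. by BruteFIR):

```python
import numpy as np

def render_databased(s, d0):
    # Eq. (5.15): driving signal of each loudspeaker n as the convolution
    # d(x_n, t) = d0(x_n, t) * s(t) of the source signal with its impulse response.
    # s: virtual source signal; d0: one impulse response per loudspeaker.
    return np.array([np.convolve(h, s) for h in d0])
```

The cost grows with the number of loudspeakers and the impulse response length, which is the complexity issue noted above.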
5.1.4.2 Model-based Rendering
Model-based rendering uses analytic spatial models of the virtual sources to calculate
the appropriate driving signals for the loudspeakers. Point sources and plane waves are
the most common models used for this purpose. For a plane wave, the driving signal of
the n-th loudspeaker is given by combining Eq. (5.4) and Eq. (5.7) as
D_{pw}(x_n, \omega) = 2\, a(x_n) \cos\theta_n \sqrt{2\pi |x - x_n|}\, \sqrt{jk}\; S(\omega)\, e^{-j \frac{\omega}{c} n_0^T x_n} ,    (5.16)
where n_0 denotes the normal vector (incidence direction) of the plane wave to be reproduced,
θ_n the angle between the vector n_0 and the surface normal n at the point x_n, and S(ω) the
spectrum of the plane wave (virtual source). By transforming this equation back into the
time domain, the loudspeaker driving signals can be computed from the source signal by
delaying, weighting and filtering,
d(x_n, t) = 2\, a(x_n) \cos\theta_n \sqrt{2\pi |x - x_n|} \, \big( f(t) * s(t) \big) * \delta(t - \tau_n) ,    (5.17)
where the delay is given by τ_n = n_0^T x_n / c and f(t) = F^{-1}{\sqrt{jk}} denotes the inverse Fourier
transform of \sqrt{jk}. Equation (5.17) can be realized very efficiently by weighting and delaying
blocks of the filtered virtual source signal s(t). Only one single-channel convolution per
virtual source is necessary. It can be shown that similar relations can be derived for the
driving functions of a virtual point source [Ver97, dB04, Hul04]. Multiple virtual sources
(e.g. plane waves) can be synthesized by superimposing the loudspeaker signals from each
virtual source.
The main benefit of model-based rendering of virtual point sources and plane waves is its
computational efficiency. Besides delaying and weighting of the virtual source signal, only
one filtering process per virtual source is required.
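The weighting and delaying of Eq. (5.17) can be sketched as follows for a circular array; the pre-filtering f(t) * s(t) is assumed to have been applied already, delays are rounded to integer samples, and the reference point x (here x_ref) as well as all names are illustrative assumptions:

```python
import numpy as np

def render_plane_wave(s_filt, fs, speaker_pos, n0, x_ref, c=343.0):
    # Sketch of Eq. (5.17): weight and delay the pre-filtered source signal.
    # A circular array is assumed, so the inward normal is -x_n / |x_n| and
    # the window a(x_n) keeps only loudspeakers with cos(theta_n) > 0.
    taus = np.array([float(n0 @ x_n) for x_n in speaker_pos]) / c  # tau_n = n0^T x_n / c
    dly = np.round((taus - taus.min()) * fs).astype(int)           # common offset -> causal
    out = np.zeros((len(speaker_pos), len(s_filt) + int(dly.max())))
    for i, x_n in enumerate(speaker_pos):
        n_in = -x_n / np.linalg.norm(x_n)
        cos_th = float(n_in @ n0)
        if cos_th <= 0.0:                       # inactive half of the array stays silent
            continue
        w = 2.0 * cos_th * np.sqrt(2.0 * np.pi * np.linalg.norm(x_ref - x_n))
        out[i, dly[i]:dly[i] + len(s_filt)] = w * np.asarray(s_filt)
    return out
```

Only the shared pre-filtering is a convolution; everything per loudspeaker is a weight and a delay, which is the efficiency argument made above.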
Plane waves and point sources can further be used to simulate classical loudspeaker setups, like stereo and 5.1 setups. Thus WFS is backward compatible with existing sound
reproduction systems and can even improve them by optimal loudspeaker positioning in
small listening rooms. It is possible to place the required virtual speakers at positions
outside the listening room. This allows proper two- or five-channel reproduction in small
listening spaces that are not compatible with the recommendations for correct loudspeaker placement, e.g. [ITU97]. Besides these two types of virtual sources, other source models
have also been implemented successfully [CCW03, Baa05].
5.1.5
Up to now, the theoretical aspects of sound reproduction in general and of WFS have been
discussed. The theory presented implied possible solutions for a practical implementation of WFS. The main outcome was that a discrete distribution of closed-box loudspeakers
(monopoles) driven by appropriate driving signals can be used to reproduce the wave field
of a virtual source within the listening area. The following section briefly discusses realization aspects of WFS systems by the example of the implementations realized within
the course of this work at Multimedia Communications and Signal Processing of the
University Erlangen-Nuremberg [LMS].
5.1.5.1 Hardware
Figure 5.5: U-shaped WFS system with 24 one-way loudspeakers. The size of the
listening area is approximately 1.50 m × 1.50 m.

Figure 5.6: Circular WFS system with 48 two-way loudspeakers. The listening area has
a radius of R = 1.50 m.

Figure 5.7: U-shaped WFS system consisting of four MEPs. The system has 32 individual channels.
ADAT interfaces were utilized. The loudspeaker driving signals are generated entirely by
the software discussed in the next section.
5.1.5.2 Software
The software used to generate the loudspeaker driving signals for the laboratory WFS
system runs on the Linux operating system. The software interface to the soundcards
is provided by the Advanced Linux Sound Architecture (ALSA) [ALS] together with the
JACK Audio Connection Kit [JAC]. The JACK server acts as a real-time low-latency
patchbay for all applications that access the soundcards. A wide variety of audio software
exists for the JACK/ALSA bundle. One of the most useful applications for data-based
rendering is BruteFIR [Tor], a very efficient real-time convolution engine. Once the WFS
filters have been derived according to Section 5.1.4.1, the virtual source signals can be
convolved with BruteFIR for auralization. Using a multiprocessor workstation, the computationally complex convolutions can be performed in real time for typical scenarios.
As stated in Section 5.1.4.2, model-based rendering mainly requires weighting and delaying
the virtual source signals. Thus, an approach based on convolution of the source signals
with appropriate impulse responses would not be computationally efficient here. A dedicated application for model-based rendering was developed that explicitly exploits the
weighting and delay approach. The model-based rendering software provides the following
features:
synthesis of point sources and plane waves,
synthesis of moving point sources with arbitrary trajectories,
interactive graphical user interface for loudspeaker and source setup,
room effects using a mirror image model,
source input from files or live input from ADAT/SPDIF interfaces, and
simulation of conventional loudspeaker setups (e. g. 5.1 surround setup [ITU97]).
Figure 5.8 shows a snapshot of the graphical user interface of the model-based real-time rendering software (wfsapp). The upper half of the application window shows the
loudspeaker and source setup. Sources can be moved intuitively in real time by clicking
on a source and dragging it with the mouse. The lower half of the application
window controls the synthesis and application parameters and the setup of the virtual
room used for the mirror image model. All source and room parameters can be changed
in real time during playback.
5.2
It was shown in Section 4.3 that adaptive room compensation requires analyzing the
wave field reproduced in the listening area by the reproduction system. Chapter 3 introduced analysis techniques for this purpose that are based upon orthogonal wave field
expansions and measurements taken on the boundary of the region of interest. The plane
wave expansion and the expansion into circular harmonics have proven to provide suitable
bases. So far, only the theoretical aspects of wave field analysis using these techniques
have been discussed. This section will discuss some practical aspects of two-dimensional
wave field analysis in the context of room compensation using linear and circular microphone arrays.
5.2.1
reproduced by a rectangular WFS system without scaling down the listening area by a circular array. Linear microphone arrays have been used widely in the past for the analysis
of acoustic scenes on the basis of beamforming techniques [Tre02, JD93, BW01]. Section 3.3.6 derived the relation between the plane wave decomposition and beamforming
techniques. Hence, these techniques can be used to calculate the plane wave decomposition from linear microphone array measurements. Similar analysis techniques have
also been derived in [HdVB01, HdVB02, Hul04] on the basis of the Kirchhoff-Helmholtz and
Rayleigh integrals. In the following, some of these techniques and their properties will be
reviewed briefly.
The Kirchhoff-Helmholtz integral implies that acoustic pressure and velocity have to be
measured on an arbitrarily shaped closed curve in order to characterize the wave field
within the region enclosed by that curve (see Section 2.6.2). This curve can be degenerated to a line that extends to infinity on both sides in order to derive the relations for
linear analysis geometries. It was shown in [HdVB01] that the plane wave decomposition
based on this principle can be understood as a spatial Fourier transformation. This result is
evident due to the similarity between the plane wave decomposition and the Fourier transformation shown in Section 3.3. The data used for the spatial Fourier transformation
is combined from omnidirectional (pressure) and dipole (velocity) microphones in such a
way that the combination forms a hypercardioid microphone. A technique based on
pressure and velocity measurements is capable of distinguishing the wave field
arriving from in front of the array from the wave field arriving from behind it. Using only
pressure microphones, as is often done in beamforming techniques, does not allow this
front/back discrimination. The analysis capabilities of a finite-length (truncated) linear
array for the analysis of plane waves depend on the incidence angle of the plane
wave. The angular resolution gets coarser for plane waves that travel nearly parallel to
the array. Thus, linear arrays are not suitable for the analysis of arbitrary wave fields
with plane wave contributions coming from all directions.
To overcome this problem, [HdVB01] proposed to use two linear arrays intersecting each
other at an angle of 90 degrees to form a cross-shaped array. This way, the limited
angular resolution can be mitigated by using the more appropriate of the two
arrays for the analysis of a particular plane wave contribution. The combined array is then capable of analyzing contributions from all directions without severe artifacts. It was shown
in [SKR03] that a cross-shaped array is suitable for the analysis of the reproduced wave
field required for listening room compensation.
Linear arrays can be used to build rectangular analysis arrays which may fit rectangular
loudspeaker setups better. For this purpose, the techniques discussed in [HdVB01]
could be extended to the rectangular geometry, or beamforming approaches could be used.
Especially constant-directivity beamforming techniques [BW01] might be of use in this
context to reduce geometry-dependent artifacts. Linear microphone arrays and their
variations will not be discussed further in this thesis. Please refer to the
literature, e.g. [Tre02, JD93, BW01], for details. The following section will concentrate
on circular microphone arrays.
5.2.2
As stated before, circular arrays are the natural choice for the wave field analysis techniques introduced in Chapter 3. This is due to the underlying polar geometry of the plane
wave decomposition and the decomposition into circular harmonics. The following section
will briefly discuss the discrete realization of the decomposition into circular harmonics
using circular microphone arrays. Section 3.4.3 introduced this decomposition on the basis
of continuous pressure and pressure gradient measurements performed on a circle with
radius R. The method depicted in Fig. 3.15 first calculates the Fourier series expansion
coefficients of the pressure and pressure gradient measurements and then carries out a
multidimensional filtering process in order to derive the circular harmonics expansion coefficients.
In a practical implementation, the continuous pressure and pressure gradient measurements have to be replaced by measurements taken at discrete positions on the circle. The
artifacts resulting from this spatial sampling process were derived in Section 3.6. One
major result is that, for arbitrary wave fields, no exact anti-aliasing condition can be given
on the basis of the (temporal) bandwidth of the captured wave field.
The number of sampling positions on the circle has to be determined by prescribing a maximum allowable aliasing-to-signal ratio, as shown in Section 3.6.2.3. In the following
it is assumed that the angular sampling has been performed reasonably
for the required aliasing-to-signal ratio. The pressure and pressure gradient measurements
can be performed by placing pressure (omnidirectional) and pressure gradient (figure-of-eight) microphones at equiangular positions on the circle. The main axis of each pressure
gradient microphone has to coincide with the normal vector in radial direction at the
microphone position.
The calculation of the Fourier series expansion coefficients P̂(ν, R, ω) and V̂_r(ν, R, ω) is carried out by performing a discrete Fourier transformation (DFT) [OS99] with respect to the discretized angle α. The filtering process given by Eq. (3.102) is not affected by the angular discretization, since the angular frequency ν is already discrete in the continuous formulation, due to the periodicity of the measurements in the angular coordinate. Figure 5.9 shows a block diagram of the discrete angle circular harmonics decomposition. If
the aliasing and truncation errors are reasonably small, then the captured wave field is described with minor deviations by the circular harmonics expansion coefficients P^(1)(ν, ω) and P^(2)(ν, ω). Please note that in practice an exact description is not possible when using a finite number of angular sampling positions, due to the aliasing and truncation
[Figure 5.9 block diagram: the pressure signals P_P(α, R, ω) and gradient signals V_{P,r}(α, R, ω) are each transformed by a DFT into P̂(ν, R, ω) and V̂_r(ν, R, ω), filtered by M(kR), and yield the coefficients P^(1)(ν, ω) and P^(2)(ν, ω).]
Figure 5.9: Block diagram of the discrete space circular harmonics decomposition for a
circular microphone array using the discrete Fourier transformation (DFT).
errors introduced.
For the continuous case, the plane wave decomposition can be derived from the circular
harmonics expansion coefficients by the Fourier series given by Eq. (3.58). In the discrete
angle case this Fourier series can be realized by the inverse discrete Fourier transformation (IDFT) [OS99] performed on the angular frequency ν. This results in the discrete plane
wave decomposition of the analyzed wave field.
Up to now, no time discretization has been considered. If performed properly, the above considerations also hold for a time discretization of the signals. The transformation into the frequency domain can then be performed by the DFT. Hence, for the spatio-temporally discrete microphone signals p_P(α, R, k) and v_{P,r}(α, R, k) a temporal and an angular DFT have to be performed. In total, this results in a two-dimensional DFT of the microphone signals. The filter M(kR) has to be discretized according to the time-domain discretization. Both the DFT and the IDFT can be realized very efficiently by the fast Fourier transformation (FFT) [OS99] in a practical implementation.
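The angular DFT step can be sketched in a few lines of NumPy. As a minimal illustration (the test field, the number of sampling positions N and the harmonic order are assumptions of this sketch, and the filtering step with M(kR) is omitted), a field consisting of a single circular harmonic concentrates in one angular DFT bin, and the inverse DFT over the angular frequency recovers the angular samples:

```python
import numpy as np

# Equiangular sampling positions on the circle (N is an assumption).
N = 16
alpha = 2 * np.pi * np.arange(N) / N

# Test field: a single circular harmonic of order nu0 = 3.
nu0 = 3
p = np.exp(1j * nu0 * alpha)

# Angular DFT (normalized) yields the Fourier series coefficients:
# all energy is concentrated in bin nu0.
P_nu = np.fft.fft(p) / N

# The Fourier series is realized by the inverse DFT over the angular
# frequency; it recovers the angular samples.
p_back = np.fft.ifft(P_nu * N)
```

For measured data, the temporal DFT of each channel would be performed first, so that both transforms together realize the two-dimensional DFT of the microphone signals.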
5.2.3 Artifacts of wave field analysis
Section 5.1.2 derived several artifacts of WFS and their impact on listening room compensation. Most of these artifacts are caused by using a two-dimensional reproduction technique in a three-dimensional environment. The theoretical basis of both sound reproduction using WFS and the WFA techniques introduced in this work is given by the two-dimensional Kirchhoff-Helmholtz integral. As a result, WFA will exhibit similar artifacts as derived for WFS. However, the main difference between WFS and WFA in this context is that the Green's functions used in the Kirchhoff-Helmholtz integral (2.70) act as virtual secondary line sources used for the extrapolation of the boundary measurements taken on ∂V into the region V. WFS realizes these secondary line sources physically by
loudspeakers. This section will briefly discuss four types of artifacts of WFA and their
impact on active listening room compensation. These artifacts of WFA are:
1. Spatial aliasing
As for WFS, the discretization of the underlying physical and mathematical relations may result in spatial aliasing due to spatial sampling. For the plane wave
decomposition this spatial sampling and its artifacts were discussed in Section 3.6.
It was shown that anti-aliasing conditions can be formulated in terms of the (temporal) bandwidth of the analyzed wave field. The analyzed wave field will exhibit aliasing artifacts if the anti-aliasing conditions derived for the discrete plane wave decomposition are not reasonably fulfilled.
2. Truncation and diffraction
Practical implementations of microphone arrays used for WFA will have finite extent and may be based on non-closed contours. The result will be truncation and
diffraction errors present in the captured wave field. Truncation will limit the effective spatial resolution for low frequencies. This was illustrated for different analytic
source models in Section 3.5.
3. Restriction to two-dimensional analysis
As for WFS, the restriction to two dimensions for WFA leads to limited analysis capabilities for three-dimensional wave fields. Two-dimensional methods cannot fully distinguish between reflections emitted in the analysis plane and elevated reflections. Contributions from elevated reflections will be mixed into the contributions of sources located in the analysis plane. However, the results presented in [Gro05] indicate that these limited capabilities can be improved to some extent.
4. Amplitude errors
The virtual line sources used for the extrapolation process in two-dimensional WFA are only capable of correctly extrapolating wave fields that can be expressed by the circular harmonics decomposition (2.25). Since point sources cannot be expressed by this two-dimensional decomposition, captured wave fields of point sources will exhibit amplitude and spectral errors for the reasons explained in Section 5.1.2. In general, the analyzed wave field will not be known a priori and thus no compensation of these artifacts is possible.
Table 5.2 summarizes the artifacts of two-dimensional WFA systems. The discussed artifacts play no dominant role for a well-designed WFA system when its results are used for auralization purposes [HdVB01, HdVB02, Hul04]. However, these artifacts will limit the performance of an active room compensation system, since an exact analysis of the reproduced wave field is mandatory. Spatial aliasing (1) will effectively limit the frequency up to which the reproduced wave field is analyzed correctly and hence room compensation can be applied. The truncation errors (2) limit the low-frequency analysis capabilities and the diffraction errors (2) may introduce unwanted artifacts into the compensation signals. The two-dimensional restriction (3) and its implication on the analysis of elevated wave field contributions may have severe consequences for active room compensation. Since
WFA artifacts
1. spatial aliasing
2. truncation and diffraction errors
3. 2D restriction
4. amplitude errors
Table 5.2: Overview of the artifacts of two-dimensional WFA and their impact on active room compensation.
elevated contributions may be present in the analysis data of the reproduction plane,
the compensation filters may also contain contributions to compensate for them. However, the elevated contributions can only be compensated at the microphone positions, as
WFS has very limited capabilities to compensate for elevated contributions (see e. g. Section 5.1.3.2). Thus, the compensation signals for the elevated contributions may cause
artifacts at positions other than the microphone positions. A countermeasure for this problem
is a proper damping of the ceiling and the floor of the listening room in order to avoid
elevated reflections as far as possible. If extrapolation techniques are used to generate the
compensation signals at the loudspeaker positions, then these signals will exhibit amplitude errors (4).
In the following sections, the amplitude errors of circular microphone arrays and their capabilities for the analysis of elevated reflections will be studied more quantitatively.
5.2.4 Artifacts of circular WFA systems
The previous section introduced the artifacts of two-dimensional WFA systems on a qualitative level. This section will analyze the artifacts of circular WFA systems and their
impact on room compensation in a more quantitative fashion. The reason for considering
this specialized geometry is that a circular microphone array is used for the experimental
validation of the proposed room compensation algorithms in Section 5.3. In the sequel
the reproduction of plane waves will be considered. It is sufficient to derive the artifacts
of a circular WFA system for one fixed incidence angle of the reproduced plane wave.
Section 3.4.3 introduced an efficient method for the computation of the circular harmonics expansion coefficients of wave fields analyzed by circular microphone arrays. In the
following, simulations of circular WFA systems based on this algorithm will be presented.
Some of the results have been presented in [SRR05].
5.2.4.1 Amplitude errors
The following section will analyze the amplitude errors when using circular harmonics
for the extrapolation of boundary measurements. The results also hold for the extrapolation of wave fields based on the two-dimensional Kirchhoff-Helmholtz integral (2.70),
since the expansion into circular harmonics can be used for arbitrary wave fields. The
Hankel functions used as basis for the decomposition (2.25) exhibit a far-field amplitude decay (see Eq. (2.22)) which is inversely proportional to the square root of the extrapolation
radius. The extrapolation of the wave field of a point source using Eq. (2.25) will exhibit
amplitude errors, due to this property of the Hankel functions. As a result of the radial
symmetry of the circular harmonics expansion, these amplitude errors will be radially
symmetric. Thus, it is sufficient to analyze the error for one particular angle of the
extrapolated wave field. In the following, simulations based on a numerical evaluation of the algorithm described in Section 5.2.2 will be presented.
As stated before, the two-dimensional analysis and extrapolation techniques are not capable of correctly reproducing the amplitude of three-dimensional point sources. To illustrate this, the extrapolation using the circular harmonics expansion of the wave field emitted by a point source in a plane was calculated. The wave field of a point source P_{P,ps}(r, ω) is given by Eq. (2.27). Its circular harmonics decomposition for a circular microphone array was calculated according to Eq. (3.102). The expansion coefficients were then used for the extrapolation using Eq. (2.25). This procedure results in the extrapolated wave field P_{P,e}(r, ω) of a point source using circular harmonics.
Figure 5.10 illustrates the results for a point source located at a distance d = 3 m and an array with a radius of R = 0.75 m. Figure 5.10(a) shows the amplitude decays of a point source and its wave field extrapolated from the circular harmonics. The amplitudes were normalized to the radius of the array. The deviation from the decay of a point source after extrapolation of the field is clearly visible. Figure 5.10(b) shows the absolute value of the amplitude error E_P(r, ω) = P_{P,ps}(r, ω) − P_{P,e}(r, ω) between the wave field of a point source P_{P,ps}(r, ω) and its extrapolation P_{P,e}(r, ω).
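The two decay laws behind this error can be checked numerically. The sketch below only compares the asymptotic 1/ρ (point source) and 1/√ρ (two-dimensional, line-source-like) far-field laws with an arbitrarily chosen normalization distance; it is not the full Hankel-function extrapolation:

```python
import numpy as np

# Propagation distances, normalized at rho0 = 1 m (illustrative choice).
rho0 = 1.0
rho = np.array([1.0, 2.0, 3.0, 4.0])

A_3d = rho0 / rho              # 3D point source: amplitude ~ 1/rho
A_2d = np.sqrt(rho0 / rho)     # 2D far field:    amplitude ~ 1/sqrt(rho)

# Doubling the distance lowers the point source level by about 6 dB,
# but the two-dimensional far-field level by only about 3 dB.
level_3d = 20 * np.log10(A_3d[1])
level_2d = 20 * np.log10(A_2d[1])
```

This 3 dB-per-doubling mismatch is exactly the amplitude deviation visible after extrapolating the measured field of a point source with the two-dimensional basis.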
Point sources are widely used approximations for real-world sources (e. g. for loudspeakers). The mirror image model indicates that the typical room response of a point source
can be understood as a combination of the primary point source and its reflections. These
reflections are again modeled as point sources. Hence, it would be desirable to derive a
two-dimensional extrapolation technique which is capable of correctly extrapolating the
wave field of a point source. In principle it is possible to modify the circular harmonics
decomposition and extrapolation technique to fulfill these requirements. A drawback of
such a modification would then be that the extrapolation of plane waves would exhibit
amplitude errors (this effect is similar to the amplitude error discussed for WFS). One
possibility for modification of Eq. (2.25) could be to use spherical Hankel functions instead
of Hankel functions. Another possibility proposed by [HdVB01], which only works for the
[Figure 5.10 plots: (a) amplitude of a point source P_{P,ps}(r, ω) and of its extrapolated field P_{P,e}(r, ω) over the radius r; (b) absolute value of the amplitude error E_P(r, ω).]
Figure 5.10: Amplitude decay of a point source compared to the decay of the extrapolation of its measured field. The point source is located at a distance of d = 3 m, the radius of the WFA array is R = 0.75 m.
measurement of spatio-temporal impulse responses from point sources to the listening
area, is to modify measured impulse responses according to the desired decay.
5.2.4.2 Analysis of elevated wave fields
It was stated before that two-dimensional analysis techniques have limited capabilities for the analysis of three-dimensional wave fields. This section will show some results in order to illustrate this drawback. For this purpose the plane wave decomposition of a plane wave with incidence angle θ₀ = 180° and varying elevation angle β₀ is computed using a circular array

P̄_{β₀}(θ, ω) = P{P_{P,pw,β₀}(α, r, ω)} ,    (5.18)

where P_{P,pw,β₀}(α, r, ω) denotes the wave field of an elevated plane wave, as given by Eq. (5.14). Figure 5.11 shows the absolute value of the plane wave decomposition P̄_{β₀}(θ, ω). The plane wave decompositions were computed by simulating a Dirac shaped plane wave with varying elevation angle on a circular array with a radius R = 0.75 m. Ideally, the plane wave decomposition of a Dirac shaped plane wave would be a Dirac line along the incidence angle of the plane wave. Due to the finite aperture of the array and aliasing errors, the plane wave decomposition of a plane wave differs from this theoretical result, as shown in Fig. 5.11(a). However, the results differ even more with increasing
[Figure 5.11 plots: magnitude of the plane wave decomposition in dB over incidence angle (0° to 270°) and frequency (0 to 700 Hz), one panel per elevation angle.]
Figure 5.11: Plane wave decomposition of a Dirac shaped plane wave with an incidence angle of θ₀ = 180° and varying elevation angles β₀ using a circular microphone array. The plots show the magnitude of the frequency response P̄_{β₀}(θ, ω) for different elevation angles β₀ of the plane wave. The radius of the simulated array is R = 0.75 m.
[Figure 5.12 plot: energy in dB over the incidence angle (90° to 270°) for elevation angles β₀ = 0°, 30°, 60°, 90°.]
Figure 5.12: Energy E_{β₀}(θ) of the plane wave contributions P̄_{β₀}(θ, ω) (see Fig. 5.11) for different elevation angles β₀ of the analyzed elevated plane wave.
elevation angle, as shown in Fig. 5.11(b) to Fig. 5.11(d). The plane wave decomposition exhibits no directionality at all in the extreme case of Fig. 5.11(d) (elevation angle β₀ = 90°). Figure 5.12 shows the energy E_{β₀}(θ) of the plane wave components illustrated in Fig. 5.11. The energy E_{β₀}(θ) is defined as follows

E_{β₀}(θ) = (1/2π) ∫ |P̄_{β₀}(θ, ω)|² dω .    (5.19)

This measure gives insight into the energy distribution with respect to the incidence angle of the plane wave contributions derived by the plane wave decomposition. This way the decreasing directionality with increasing elevation angle of the plane wave can be seen clearly in Fig. 5.12.
It can be concluded from the presented results that elevated plane waves leak into all
components of the decomposed field. Similar results have been reported by [HSdVB03].
The presented results are also valid for other types of elevated sources, since arbitrary
wave fields can be expressed as superposition of plane waves.
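The energy measure defined in Eq. (5.19) can be approximated on a discrete frequency grid as follows; the decomposition spectra and the grid spacing in this sketch are placeholder assumptions, not data from the simulations above:

```python
import numpy as np

# Placeholder plane wave decomposition spectra over (angle, frequency).
rng = np.random.default_rng(0)
N_theta, N_omega = 8, 64
P = rng.standard_normal((N_theta, N_omega))   # stand-in spectra
d_omega = 0.1                                 # frequency grid spacing (assumed)

# Riemann-sum approximation of (1/2pi) * integral |P|^2 d(omega),
# evaluated for every incidence angle.
E_theta = np.sum(np.abs(P) ** 2, axis=1) * d_omega / (2 * np.pi)
```

A sharply peaked result over the incidence angle indicates high directionality; for an elevation angle of 90° the distribution becomes flat.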
5.2.5 Practical realization of circular microphone arrays
The following section will discuss some hardware-related aspects of the practical realization of a circular WFA system. Section 3.4.3 and Section 5.2.2 introduced a basic
algorithm and its discrete realization for a circular harmonics decomposition using a circular microphone array. The theory behind this approach states that the measurement of
the acoustic pressure and its gradient at discrete positions on a closed circular contour are
suitable to derive the circular harmonics decomposition coefficients of the entire analyzed
wave field within this contour. The number of spatial sampling positions on the circle and
thus the number of microphones depends on the anti-aliasing conditions derived in
Section 3.6.2. The total number of microphones is twice the number of angular sampling
positions, since the acoustic pressure and its gradient have to be measured. The spatial
aliasing frequency of the WFA system should be at least as large as the spatial aliasing
frequency of the WFS system when used for active room compensation. Hence, the number of sampling positions on the circle will be quite high in typical applications. The WFA
system is typically realized by sequentially measuring the different sampling positions on
the circle, due to the complexity of recording all signals in real time. However, one
drawback of this sequential approach is that it is not possible to record live sound events.
It is only possible to record the spatiotemporal impulse response from an acoustic source
(e. g. a loudspeaker) to the microphones. The circular harmonics decomposition of the
wave field emitted by this source, when excited by a temporal Dirac pulse, can be calculated by applying Eq. (3.102) to the measurements.
A stepper motor drive that can be controlled from a PC was used to perform the sequential measurements within the course of this work. A rod was mounted on the stepper
motor and microphones were placed at the end(s) of the rod. Different microphone types
can be used for the measurements. High-quality pressure and pressure gradient (figure-of-eight) microphones are commercially available and can be used without modifications
for this purpose. However, one problem occurs when using pressure and pressure gradient
microphones for the measurements: they cannot be placed at the same spatial position.
This problem can be overcome by placing the pressure and the pressure gradient microphone at the two opposite ends of the rod. Their opposite positions then have to be taken
into account when calculating the decomposition. Figure 5.13 and Fig. 5.14 illustrate
a practical implementation using this principle. There are also microphones available
which capture pressure and pressure gradient at the same spatial position. The soundfield
microphone [Jag84] or velocity probes are examples of such microphones.
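Accounting for the opposite rod ends amounts to re-indexing the gradient channels by half a revolution before the angular DFT. The sketch below uses toy channel data, assumes an even number N of equiangular steps, and ignores any gain or orientation corrections of the real microphones:

```python
import numpy as np

N = 8                                   # angular sampling positions (even)
pressure = np.arange(N)                 # channel n recorded at angle alpha_n

# The gradient microphone on the opposite rod end records channel n at
# angle alpha_n + pi, i.e. its data are circularly shifted by N/2 steps
# relative to the pressure channels (toy values, not measurements).
gradient = np.roll(np.arange(N), -N // 2)

# Undo the angular offset so both data sets refer to the same positions.
gradient_aligned = np.roll(gradient, N // 2)
```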
Figure 5.15 shows the plane wave decomposition p̄(θ, t) computed from the spatio-temporal room impulse response produced by a loudspeaker, measured by a circular microphone array. The direct part of the wave field at θ = 180° and the reflections caused by the room can be seen clearly. The plane wave decomposition is
a powerful tool for the analysis of acoustic scenes.
Besides the fundamental artifacts discussed in Section 5.2.3, other artifacts also have to be considered in a practical implementation. For example, if multiple microphones are used for a practical realization of a circular array, then their different characteristics (microphone mismatch) and position errors have to be taken into account. These and other practical issues are discussed e. g. in [HSdVB03, Teu05, BW01].
Figure 5.13: Rod with pressure (AKG CK92 [AKG]) and pressure gradient (AKG
CK94 [AKG]) microphone mounted on a stepper motor drive. The total length of the rod
is l = 1.50 m.
5.3 Active listening room compensation for WFS
Section 4.4 derived the concept of WDAF for improved active listening room compensation. The basic idea of WDAF is to perform a spatio-temporal transformation of the multidimensional signals and systems in order to decouple the MIMO inverse system identification problem. The complexity of the identification problem is reduced significantly this way. Up to now, the application of WDAF to a particular spatial sound reproduction system was not considered. This section will illustrate the application of WDAF based listening room compensation to WFS systems. Results based on simulated and measured acoustic environments will demonstrate the performance of the proposed method. The relevance of the discussed WFS and WFA artifacts, some extensions to the base algorithm and the application to other spatial sound reproduction systems will be discussed additionally.
Active listening room compensation requires, on the one hand, the ability to control the wave field within the listening area and, on the other hand, the ability to analyze the reproduced wave field
within the listening area. A closed contour WFS system is capable of controlling the wave
field within its contour below the spatial aliasing frequency. This was illustrated in the
Sections 4.1 and 5.1. A circular microphone array is capable of analyzing the wave field
within its contour below the spatial aliasing frequency. This was shown in Section 3.4.3.
Both techniques will be utilized in the following for active listening room compensation.
The concept of WDAF requires to find a suitable wave domain transformation. This
transformation should approximately diagonalize the listening room transfer matrix, as
stated in Section 4.5.4. In general, the optimal transformation will be dependent on the
acoustic boundary conditions imposed by the listening room. Hence, the transformation
Figure 5.14: Setup used for the sequential circular microphone array measurements.
depends on the geometry of the listening room and the acoustic boundary conditions
present at its walls. The following section proposes a wave domain transformation for
typical listening rooms.
5.3.1 Wave domain transformation
The geometry of the listening room has to be taken into account in order to derive a
suitable wave domain transformation. Typical listening rooms will exhibit a rectangular
shape as a first approximation. Hence, the boundary conditions imposed by the listening
room can be modeled approximately by a rectangular box with homogeneous boundary
conditions at its sides. In the following this simplified model of a listening room will be
used.
Two different wave field representations have been introduced in Chapters 2 and 3: the
plane wave decomposition and the decomposition into circular harmonics. Both are candidates for the wave domain transformation. Due to the assumed simplified geometry of
the listening room and the relatively low frequencies considered (due to spatial aliasing)
[Figure 5.15 plot: plane wave decomposition over the incidence angle (90° to 270°) and time (10 to 35 ms).]
Figure 5.15: Plane wave decomposition of the spatio-temporal impulse response emitted
by a loudspeaker placed in a room. The distance of the loudspeaker to the center of the
array was d = 2.50 m, the radius of the microphone array R = 0.50 m. The circle was
sampled at 128 positions.
circular harmonics seem to provide a more suitable basis for the wave domain transformation than plane waves. In the free-field case, each circular harmonics component emitted
by the WFS system will result only in contributions of the same component after analysis.
However, in a reverberant listening room each circular harmonics component emitted will
result in additional components other than the emitted one. The energy of these components should be as low as possible. There are some physical indications for the energy
compaction performance of circular harmonics [Oes56, DGZD01, GD04] in rectangular
rooms.
The following section will specialize the generic concept of wave domain adaptive filtering to the use of circular harmonics for the signal representation in the wave domain.
The performance of this special choice for the wave domain transformation, on the basis of
simulated and measured acoustic environments, will be shown in the Sections 5.3.4 and
5.3.5.
[Figure 5.16 block diagram: the virtual source signal S(ω) is transformed by T₁ into the wave domain driving signals d̃^(M); the filtered signals w̃^(M) are transformed by T₂ into the N loudspeaker driving signals w^(N) of the listening room; T₃ transforms the microphone signals l^(M) into the wave domain, where they are compared with the desired signals ã^(M) obtained via the free-field model F̃ to form the error ẽ^(M).]
Figure 5.16: Block diagram of the proposed wave domain adaptive inverse filtering
based active room compensation system for WFS.
5.3.2 Algorithm
The concept of WDAF was derived in Section 4.5.4 and is illustrated by Fig. 4.24. This
generic concept will be specialized in the sequel to the reproduction using a WFS system
and the circular harmonics decomposition as wave domain transformation. Two rendering
techniques have been introduced in Section 5.1.4 for WFS based rendering of acoustic
scenes: model-based and data-based rendering. The two techniques differ in the generation of the loudspeaker driving signals. Model-based rendering uses spatial models for the virtual sources, while data-based rendering allows any desired virtual wave field to be prescribed at the cost of an increased complexity. For the derivation of the room compensation algorithm, the model-based approach will be considered in the following. However, the algorithm can be generalized straightforwardly to the data-based approach, as will be indicated later in this section.
Figure 5.16 illustrates the proposed WDAF based active room compensation algorithm
for WFS systems. The notation of the signals and transfer matrices is similar to the one
used in Section 4.5; the virtual source signal is denoted by S(ω). Only one virtual source is covered in Fig. 5.16; multiple virtual sources can be considered by applying the principle
of superposition. The transformations T1 through T3 are based on the decomposition into
circular harmonics of the respective wave fields. For active room compensation only the
incoming parts (see Section 2.3.1) of the respective wave fields are of interest. For the
particular scenario considered these generic transformations are specialized as:
Transformation T₁:
This transformation transforms the virtual source signal S(ω) using a spatial model into the loudspeaker driving signals d̃^(M) in the wave domain. Suitable models for the virtual source characteristics are line sources or plane waves. The plane wave decompositions for these two source types were derived in Section 3.5 and are given by Eq. (3.114) and Eq. (3.109), respectively. The circular harmonics components can be derived easily from these analytic plane wave decompositions, since they are given in terms of a Fourier series equal to the expansion of a wave field into circular harmonics (2.25).
It is desirable to include a parametric room model into the analytic source models for the creation of virtual acoustic scenes. The mirror image model [AB79] is widely used for this purpose. Efficient algorithms for the mirror image model [DGZD01, GD04] can be included conveniently into transformation T₁. These algorithms are based on a decomposition into spherical harmonics, since a three-dimensional wave field was considered for their derivation. However, they can be reformulated straightforwardly in terms of circular harmonics for the two-dimensional case considered here.
Transformation T₂:
This transformation generates the loudspeaker driving signals from the filtered driving signals w̃^(M). It was shown in Section 4.1.5 that the loudspeaker driving signals can be calculated conveniently from a plane wave decomposition of the virtual source wave field. Thus, transformation T₂ can be realized by calculating the plane wave decomposition from the circular harmonics representation using Eq. (3.58) and applying Eq. (4.18) together with the secondary source correction (5.7) to derive the loudspeaker driving signals.
Transformation T₃:
This transformation calculates the circular harmonics decomposition coefficients of the wave field within the listening area from the microphone array measurements. Equation (3.102) can be used for this purpose. However, this requires measuring the acoustic pressure and additionally its gradient at the analysis positions.
The signal representation in terms of circular harmonics should be based on the aperture of the loudspeaker array. In this case the transfer matrix F̃ models the free-field propagation in terms of circular harmonics from the loudspeaker to the microphone array. In the ideal case this matrix would only model the propagation delay. However, it is advisable to additionally incorporate the artifacts of WFS systems (see Section 5.1.2) in order to prescribe a desired wave field ã^(M) that can actually be achieved. Besides these artifacts, the transfer matrix F̃ should also include an additional delay to ensure the computation of causal room compensation filters. This delay is also known as modeling delay [Hay96, Gar00].
So far only the model-based approach to auralization with WFS has been considered. As stated before, this specialized approach can be extended straightforwardly to the data-based approach. In this case the transformed loudspeaker driving signals d̃^(M) are given by the circular harmonics decomposition coefficients of the virtual wave field to be reproduced. For circular apertures these coefficients can be calculated from Eq. (3.102).
It was proposed by the author in previous papers [BSK04b, SBR04c, BSK04c, SBR04a,
BSK04a, SBR04b] to perform a Fourier transformation with respect to the angular variable of the plane wave decomposition to derive the signals in the wave domain. This
Fourier transformation of the plane wave decomposed signals is, up to a factor j^ν (and a frequency correction), equivalent to calculating the circular harmonics expansion coefficients, since the plane wave decomposition in terms of circular harmonics is given by the
Fourier series (3.58).
An online implementation of the room compensation algorithm would require the use of a filtered-x algorithm for the adaptation of the compensation filters. The proposed active listening room compensation algorithm was implemented in an offline fashion using MATLAB [MAT]. Therefore, no filtered-x algorithm was used for the implementation of the active listening room compensation algorithm depicted in Fig. 5.16. This way the performance of the proposed decoupling can be evaluated without the shortcomings of the filtered-x algorithms [TBB00, SH94, Bja95]. The frequency domain adaptive filtering (FDAF) algorithm introduced in [BBK03] was utilized for the adaptation process in the implementation. This algorithm can be extended straightforwardly to the filtered-x scheme.
5.3.3 Performance measures
The following measures will be used to evaluate the performance of the
proposed transformations and the resulting active listening room compensation system.
Energy compaction
The ability of the different transformed signal and system representations to compact the room characteristics into fewer coefficients than using the microphone signals directly will be evaluated by two measures: (1) the energy of the elements of the room transfer matrix and (2) the energy compaction performance. For the room transfer matrix in its different representations, the first measure is defined by calculating the energy of each spatial transmission path. For its representation in the pressure domain (pressure microphones) the energy of the elements of the room transfer matrix is defined as

E(m, n) = (1/2π) ∫ |R_{m,n}(ω)|² dω .    (5.20)
Similar definitions apply to the room transfer matrix in its plane wave and circular harmonics representation (see Section 4.5.5), yielding the energy representations Ē(θ, θ₀) and Ĕ(ν, ν₀).
In transform coding the energy compaction performance of different transformations is
typically evaluated by calculating the energy compaction EC(i) [AR75]. For one particular room transfer matrix this measure is defined by calculating the ratio between the
energy of the first i dominant elements and all elements. For this purpose the energies
E(m, n) are sorted in descending order, yielding the sorted elements E_sort(μ). Then the ratio between the energy of the first i sorted elements and all elements is calculated as follows

EC(i) = ( Σ_{μ=0}^{i} E_sort(μ) ) / ( Σ_{μ=0}^{MN−1} E_sort(μ) ) ,    (5.21)

where 0 ≤ EC(i) ≤ 1. Equivalent definitions apply for the transformed representations.
The more energy is captured by the first i elements the better the performance in terms
of energy compaction.
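The energy compaction ratio of Eq. (5.21) translates directly into code; the element energies below are toy values chosen to make one transmission path dominant:

```python
import numpy as np

def energy_compaction(E, i):
    """EC(i): fraction of the total energy captured by the i+1 largest
    element energies E(m, n), cf. Eq. (5.21)."""
    E_sorted = np.sort(np.ravel(E))[::-1]      # descending order
    return E_sorted[: i + 1].sum() / E_sorted.sum()

# Toy element energies: one dominant path, three weak ones.
E = np.array([[8.0, 1.0],
              [0.5, 0.5]])
ec0 = energy_compaction(E, 0)   # dominant element alone: 8 / 10
ec3 = energy_compaction(E, 3)   # all elements
```

The faster the ratio approaches one, the better the transformation concentrates the room characteristics on few coefficients.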
Both the energy of the matrix elements E(m, n) and the energy compaction ratio EC(i) are
typically evaluated in a logarithmic scale. The energy of the elements E(m, n) illustrates
the distribution of the energy in the different transformed domains, while EC(i) illustrates
the ability of a transformation to compact the energy on few coefficients. The concept of
eigenspace adaptive filtering, as introduced in Section 4.5.3, compacts the entire energy
on the main diagonal by using the GSVD as signal and system transformation.
The energy of the plane wave components is defined accordingly as

\bar{E}(\theta) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \bar{P}(\theta, \omega) \right|^2 \, d\omega . \qquad (5.22)
Adaptation error

The error signals e_m(k) used to adapt the room compensation filters give insight into the performance of the adaptation process. The mean squared error E_adapt(k) will be used as measure for the adaptation error in the sequel. It is defined as follows

E_{\mathrm{adapt}}(k) = \sum_{m=1}^{M} \left| e_m(k) \right|^2 . \qquad (5.23)
Of special interest is the convergence speed, its lower limit for stationary scenarios and its behavior for spatio-temporal nonstationary scenarios. In the ideal case the adaptation error should decrease rapidly to its lower bound and should stay low for nonstationary scenarios.
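The measure of Eq. (5.23) and the ideal behavior described above can be sketched as follows (the exponentially decaying toy error signals merely mimic a converging adaptation; M, the signal length and the decay constant are arbitrary assumptions):

```python
import numpy as np

def adaptation_error(e):
    """Mean squared adaptation error after Eq. (5.23).
    e has shape (M, K): one row per error signal m, one column per time instant k."""
    return np.sum(np.abs(e) ** 2, axis=0)

rng = np.random.default_rng(2)
M, K = 48, 1000
# toy error signals with exponentially decaying envelope, as for a converging adaptation
e = rng.standard_normal((M, K)) * np.exp(-np.arange(K) / 200.0)

E_adapt = adaptation_error(e)
print(E_adapt.shape)    # one error value per time instant k
```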
5.3.4
The following section will illustrate the performance of active listening room compensation for WFS using the proposed WDAF approach on the basis of simulated acoustical environments. The main benefits of using simulated rather than real acoustical environments are that the parameters of the simulated environment can be changed easily in order to simulate different scenarios and that practical aspects such as noise, transducer mismatch and misplacement can be excluded for a first proof of the WDAF concept.
A wide variety of methods can be used to simulate the wave propagation between the loudspeakers and the microphones in the listening room. One of the most common methods used for this purpose is the mirror image method [AB79]. It assumes that the walls of the listening room act as acoustic mirrors creating mirror sources. However, this assumption is only accurate for high frequencies and walls with large extents [FHLB99]. Due to the relatively low spatial aliasing frequency of typical WFS and WFA systems, room compensation will typically be applied to the lower frequencies only. In order to have an accurate simulation of the acoustic environment, an FTM based simulation method of the wave equation was used [PR05]. Using this method, no spatial discretization is required and thus the microphone and loudspeaker placement can be performed exactly in space. This is not possible when using methods which perform a spatial discretization, like e.g. the finite element method (FEM) [Red05]. The particular implementation used (wave2d) simulates the wave propagation in two dimensions. The acoustic properties of the walls are characterized by the frequency-independent plane wave reflection factor Rpw. A reflection factor of Rpw = 0 models free-field propagation, a reflection factor of Rpw = 1 perfectly reflecting walls. The plane wave reflection factor can be linked to the reverberation time T60 of a room [Pie91].
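The link between Rpw and T60 can be sketched, for example, via Eyring's formula with the energy absorption coefficient α = 1 − Rpw² (a common approximation chosen here for illustration; [Pie91] may use a different relation, and the room dimensions below are arbitrary):

```python
import numpy as np

def t60_eyring(lx, ly, lz, r_pw, c=343.0):
    """Approximate reverberation time T60 of a cuboid room (Eyring's formula),
    assuming uniform walls with plane wave reflection factor r_pw."""
    volume = lx * ly * lz
    surface = 2 * (lx * ly + ly * lz + lx * lz)
    alpha = 1.0 - r_pw ** 2               # energy absorption coefficient of the walls
    return 24 * np.log(10) / c * volume / (-surface * np.log(1.0 - alpha))

print(t60_eyring(5.9, 5.8, 3.0, 0.8))    # reflective walls -> longer reverberation
print(t60_eyring(5.9, 5.8, 3.0, 0.2))    # absorbing walls  -> shorter reverberation
```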
The geometry of the simulated acoustical environment is a simplification of the real environment described in Appendix D.1. The real environment was simplified for the simulation to a two-dimensional rectangular plane with the dimensions 5.90 m × 5.80 m, due to complexity considerations. The simulated geometry is described in detail in Appendix D.2 and illustrated by Fig. D.3.
For the evaluation of the algorithms, the impulse responses from all loudspeakers to all (pressure/pressure-gradient) microphones were simulated in order to derive the required room transfer matrix. Two scenarios were simulated: a free-field scenario with Rpw = 0 and a reverberant scenario with Rpw = 0.8. The latter approximates the acoustic characteristics of the real room described in Appendix D.1 (with all curtains opened).
The following two sections will show results for a circular and a rectangular WFS system.
5.3.4.1
The first setup that is evaluated in the sequel consists of a circular WFS and WFA system. The reason for the chosen setup is that an equivalent WFS system has been built at the laboratory (see Fig. 5.6). The exact geometry of the laboratory setup is described in Appendix D.1. The size and position of the loudspeaker and microphone arrays were simulated according to Fig. D.3; the listening room was approximated by a rectangular shape for the simulations. The simulated WFS system consists of 48 loudspeakers placed equidistantly on a circle with a radius of RLS = 1.50 m. The loudspeakers were approximated by point sources. According to Section 4.1.6.2 and Fig. 4.12(b), this particular angular sampling results in a reproduced aliasing-to-signal ratio of RASR(0, 650 Hz) ≈ −32 dB at a frequency of 650 Hz in the center of the circular array. The simulated WFA system consists of 48 angular sampling positions at a radius of Rmic = 0.75 m. Both virtual pressure and pressure-gradient microphones were simulated at the sampling positions. According to Section 3.6.2.3 and Eq. (3.138), this particular angular sampling of the WFA system results in an aliasing-to-signal ratio of ASR(23, Rmic, 650 Hz) ≈ −44 dB for the ν = 23 circular harmonics component at a frequency of fal = 650 Hz. For the results, all signals were lowpass filtered to a frequency of fLP = 650 Hz in order to keep the aliasing artifacts reasonably small. The remainder of this section will show and discuss the results obtained from the simulated setup.
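The band limitation to fLP = 650 Hz applied to all signals can be sketched with a windowed-sinc FIR lowpass (the sampling rate, filter length and test signal are illustrative assumptions, not parameters of the original system):

```python
import numpy as np

fs = 48000.0          # assumed sampling rate
f_lp = 650.0          # lowpass cutoff frequency

# windowed-sinc FIR lowpass filter, DC gain normalized to one
n_taps = 501
n = np.arange(n_taps) - (n_taps - 1) / 2
h = 2 * f_lp / fs * np.sinc(2 * f_lp / fs * n) * np.hamming(n_taps)
h /= h.sum()

# test signal: 200 Hz component (passband) plus 5 kHz component (stopband)
t = np.arange(int(fs)) / fs
x = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 5000 * t)
y = np.convolve(x, h, mode="same")

print(np.mean(y ** 2))    # ~0.5: only the 200 Hz component survives
```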
Singular Vectors
It was stated in Section 5.3.1 that circular harmonics provide a reasonable basis for the WDAF approach. An indication for the suitability of this choice can be found by calculating the right singular vectors v_b(ω) of the room transfer matrix R(ω) (for the pressure microphones) using the SVD. Figure 5.17(a) shows the absolute value of the first eight right singular vectors v_b(ω) for the free-field case Rpw = 0 and a frequency of f = 80 Hz. The singular vectors have been sorted by their descending singular values and thus by their energy. The results are similar to the angular basis functions of the circular
Figure 5.17: Absolute value of the first eight right singular vectors (f = 80 Hz) for the
simulated circular WFS/WFA system sorted by their descending singular values (top left
to bottom right). The singular vectors for two different plane wave reflection factors Rpw
at the walls of the simulated room are shown.
harmonics (see Fig. 2.4). However, this result is expected since free-field propagation has been simulated and circular harmonics match the geometry of the circular WFA system optimally in this case. In order to prove the suitability of the circular harmonics decomposition for a typical listening room, the singular vectors were also computed for the reverberant case Rpw = 0.8. These singular vectors are shown in Fig. 5.17(b). It can be seen that they are deteriorated to some extent compared to the circular harmonics; however, their coarse structure still matches the circular harmonics quite well. The presented results indicate that circular harmonics provide a suitable transformation for typical listening rooms.
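The free-field observation can be reproduced with a small numerical experiment: for a circular geometry the free-field transfer matrix is (approximately) circulant, and every circulant matrix is diagonalized by the discrete harmonic (DFT) basis, so the harmonic vectors act as its singular vectors. A sketch with an arbitrary circulant matrix:

```python
import numpy as np

N = 16
rng = np.random.default_rng(3)
c = rng.standard_normal(N)                          # first column of the matrix
C = np.array([np.roll(c, k) for k in range(N)]).T   # circulant "transfer matrix"

F = np.fft.fft(np.eye(N)) / np.sqrt(N)              # unitary DFT (harmonic) basis
D = F.conj().T @ C @ F                              # change of basis to harmonics

# the harmonic basis diagonalizes the circulant matrix: all energy on the diagonal
off = D - np.diag(np.diag(D))
print(np.linalg.norm(off) / np.linalg.norm(D))      # ~0 up to numerical precision
```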
Energy Compaction
The concept of WDAF based active room compensation relies, among other properties,
on the energy compaction performance of the chosen transformation. Two measures were
introduced in Section 5.3.3 to investigate the energy compaction in the different signal
representations: (1) the energy of the elements of the room transfer matrix and (2) the
energy compaction performance. These measures are evaluated in the following for the
free-field and reverberant case.
Figure 5.18 shows the energy of the elements of the room transfer matrix for the pressure microphones and its circular harmonics representation for the two different simulated cases. Figure 5.18(a) shows E(m, n) for the pressure microphones in the free-field case (Rpw = 0). The direct path from the loudspeakers to the microphones can be seen clearly. Figure 5.18(b) shows Ĕ(ν, ν₀) for the free-field case. Almost all energy is compacted onto the main diagonal elements by the circular harmonics representation of the room transfer matrix. Please note the different scales used for the results shown. For the free-field case all energy in the circular harmonics domain should be compacted onto the main diagonal. However, some off-diagonal elements with very low energy are also present in Fig. 5.18(b) due to aliasing and truncation artifacts present in the analyzed wave field and numerical errors in the free-field simulation. Figures 5.18(c) and 5.18(d) show the energy distributions for the reverberant case. The reflections of the loudspeaker wave fields at the walls of the listening room can be seen clearly in Fig. 5.18(c). Figure 5.18(d) shows the energy of the room transfer matrix represented in circular harmonics. As desired, the main diagonal elements represent a major portion of the energy. The off-diagonal elements are a result of the reverberant enclosure.
Not only the performance of the circular harmonics in terms of energy compaction towards the main diagonal is of interest, but also its ability to represent the MIMO system by as few coefficients as possible. This can be measured by the energy compaction of the different representations. Results for the reverberant case are shown in Fig. 5.19. This figure illustrates the energy compaction according to Eq. (5.21) for the room transfer matrix in the pressure domain, in the plane wave decomposed domain and in the circular harmonics domain; the energy compaction of the SVD representation is shown as reference.
Figure 5.18: Energy of the room transfer matrix of the signals captured by the pressure microphones E(m, n) and in the circular harmonics domain Ĕ(ν, ν₀). The two rows show the results for different plane wave reflection factors Rpw at the walls of the simulated room.
Figure 5.19: Energy compaction performance EC(i) for the different representations of the room transfer matrix. Shown are the results for Rpw = 0.8.
Figure 5.20: Results obtained by the proposed WDAF based active room compensation algorithm using a circular WFS/WFA system placed in a rectangular room with Rpw = 0.8 at the walls. The results for converged compensation filters are shown. (a) Energy of the elements of the plane wave decomposed room transfer matrix; (b) plane wave decomposed target wave field: plane wave with θ₀ = 180°; (c) plane wave decomposed reproduced wave field without room compensation; (d) plane wave decomposed reproduced wave field with room compensation.
Figure 5.20(a) shows the energy of the elements of the plane wave decomposed room transfer matrix. In the free-field case an incident plane wave would only produce a contribution at its incidence angle (see Eq. (4.107)). However, for the shown reverberant case contributions at angles other than the incidence angle are also present due to the reflections caused by the listening room. Results for one particular incident plane wave are shown in Figures 5.20(b) to 5.20(d). The figures show the plane wave decompositions of the desired wave field and the resulting wave fields without and with applying room compensation. The compensation filters were adapted using white noise. The results derived this way can be reproduced easily. However, the performance of the adaptation when using signals like speech or music will typically be lower than for white noise. The desired wave field for the results shown is a bandlimited (sinc shaped) plane wave with an incidence angle of θ₀ = 180°. The plane wave decomposition of the desired wave field a^(M)(θ) is shown in Fig. 5.20(b). The plane wave decomposition of the wave field reproduced within the listening area l_0^(M)(θ) without compensation is illustrated in Fig. 5.20(c). The reflections caused by the listening room are clearly visible. Figure 5.20(d) shows the resulting wave field l^(M)(θ) when applying the proposed room compensation algorithm. There are only some slight variations from the target wave field visible. Figure 5.21(a) shows the energy of the plane wave components shown in Figures 5.20(b) to 5.20(d). The results prove that the compensation filters are capable of compensating the undesired contributions from directions other than the incidence angle of the target plane wave. Figure 5.21(b) shows the adaptation error for this particular simulated scenario. Note that, due to the limited bandwidth of the room compensation system, only very limited information is available to adapt the coefficients of the room compensation filters. However, the results prove that fast and stable adaptation is possible even in this scenario using the WDAF concept.
The room compensation filters for the shown results were fed back into the simulation of the listening room in order to visualize the impact of room compensation [PSR05]. Figure 5.22 shows the resulting wave fields without and with room compensation for three different time instants. The left column of Figure 5.22 shows the resulting wave fields for the simulated reverberant room without applying room compensation. The upper wave field for t = 0 ms shows the desired plane wave in the center of the array, the middle wave field for t = 6.8 ms shortly before it reaches the upper wall, and the lower one for t = 18.1 ms when the reflected plane wave enters the listening area again traveling in the opposite direction. The disturbances caused by the reflections off the walls of the listening room can be seen clearly. The right column of Figure 5.22 shows the results when applying the WDAF based room compensation algorithm. The wave fields after convergence of the compensation filters are shown. It can be seen clearly that the room compensation algorithm is capable of actively compensating the listening room reflections within the listening area. Only the desired bandlimited plane wave is present inside the circular array.
205
0
w/o room compensation
with room compensation
desired wave field
5
10
15
20
25
30
35
90
180
angle [o] >
270
360
5
10
15
20
25
30
35
3
time > [s]
Figure 5.21: Results of room compensation for a bandlimited (sinc shaped) plane wave with incidence angle θ₀ = 180° (simulated circular WFS/WFA system). (a) Energy of the plane wave components; (b) mean squared adaptation error.
Figure 5.22: The left column shows the wave field without room compensation for different time instants. The right column shows the results when applying room compensation for converged room compensation filters.
5.3.4.2
The second setup that will be evaluated in the sequel consists of a rectangular WFS and a circular WFA system. The geometry of a rectangular WFS system is a better match to the geometry of typical listening rooms. The size and position of the loudspeaker and microphone arrays were simulated according to Fig. D.3. The simulated WFS system consists of 60 loudspeakers placed equidistantly at a distance of Δx ≈ 0.27 m on a rectangular contour with the dimensions 3.50 m × 3.50 m. The distance of the loudspeakers is chosen such that it results in approximately the same aliasing properties as for the circular WFS system discussed in the previous section. According to Section 4.1.6.1 and Eq. (4.29), this particular loudspeaker distance results in an aliasing frequency of fal ≈ 625…1250 Hz, depending on the incidence angle of a reproduced plane wave. As for the setup used in the previous section, the simulated WFA system consists of 48 angular sampling positions with a radius of Rmic = 0.75 m. All signals were lowpass filtered to a frequency of fLP = 650 Hz in order to keep the aliasing artifacts reasonably small.
The remainder of this section will only show results for the singular vectors and the energy compaction performance in order to prove the applicability of the circular harmonics decomposition as transformation for the WDAF concept. However, further simulations have been performed. They revealed a performance of active room compensation similar to that of the circular WFS system discussed in the previous section.
Singular Vectors
Figure 5.23(a) shows the absolute value of the first eight right singular vectors for the free-field case (Rpw = 0) and a frequency of f = 80 Hz. The singular vectors have been sorted by their descending singular values and thus by their energy. As for the circular WFS system, the results are similar to the angular basis functions of the circular harmonics (see Fig. 2.4). The singular vectors for the reverberant case (Rpw = 0.8) are shown in Fig. 5.23(b). It can be seen that they are deteriorated to some extent in comparison with the circular harmonics; however, their coarse structure still matches the circular harmonics. As expected, the presented results indicate that circular harmonics provide a suitable transformation for the rectangular WFS system as well.
Energy Compaction
Figure 5.24 shows the energy of the elements of the room transfer matrix for the pressure microphones and its circular harmonics representation for the reverberant case. Please note the different scales used for the results shown. Figure 5.24(a) shows E(m, n) for the pressure microphones and Fig. 5.24(b) shows the energy Ĕ(ν, ν₀) of the room transfer matrix represented in circular harmonics. As desired, the main diagonal elements in the
circular harmonics domain represent a major portion of the energy.

Figure 5.23: Absolute value of the first eight right singular vectors (f = 80 Hz) for the simulated rectangular WFS/circular WFA system sorted by descending singular values (top left to bottom right). The singular vectors for two different plane wave reflection factors Rpw at the walls of the simulated room are shown.

Figure 5.24: Energy of the room transfer matrix for the signals captured by the pressure microphones E(m, n) and in the circular harmonics domain Ĕ(ν, ν₀). Shown are the results for Rpw = 0.8.

The energy compaction of the different representations for the reverberant case is shown in Fig. 5.25. It
can be seen clearly that the circular harmonics representation of the room transfer matrix
compacts the energy much better than its pressure or plane wave representation.
As for the circular WFS system, the results prove that the circular harmonics decomposition of the room transfer matrix provides the desired properties. This result is expected, since an optimally designed WFS system should have no influence on the optimal transformation used for WDAF.
5.3.5
Figure 5.25: Energy compaction performance EC(i) for the different representations
of the room transfer matrix for the rectangular WFS system. Shown are the results for
Rpw = 0.8.
The exact geometry of the laboratory setup is described in Appendix D.1. The listening room is equipped with removable curtains at all sides in order to control its acoustic properties. These curtains have been removed for the measurements. The measured setup consists of a 48 channel circular WFS system with a radius of RLS = 1.50 m. The system is realized with 48 two-way high-quality loudspeakers driven individually by multichannel amplifiers. For a detailed description of the WFS system see Section 5.1.5 and Appendix D.1. The WFA system used for the measurements consists of 48 angular sampling positions with a radius of Rmic = 0.75 m. The measurements were taken at the sampling positions in a sequential fashion using a stepper motor unit with a rod mounted onto it. Both a pressure and a pressure-gradient microphone were used for the measurements. They were mounted at opposite ends of the rod and the measurements were postprocessed to compensate for the opposite positions. See Section 5.2.5 and Appendix D.1 for a detailed description of the circular WFA system. As stated in Section 5.1.3.2 and Section 5.2.4.2, elevated reflections can neither be compensated nor analyzed properly by two-dimensional WFS and WFA systems. The ceiling of the measured listening room is damped properly,
the floor was damped by damping material placed below the WFS system for the measurements. Figure 5.26 illustrates the measurement setup.

Figure 5.26: Setup used for the measurement of the listening room transfer matrix. The gray curtains were removed for the measurements used in Section 5.3.5.

As for the simulated acoustic
environment, all signals were lowpass filtered to a frequency of fLP = 650 Hz in order
to keep the aliasing artifacts reasonably small. The properties of the transformed signal
representations and the room compensation performance will be discussed in the sequel.
Singular Vectors
Figure 5.27 shows the absolute value of the first eight right singular vectors derived from an SVD of the measured room transfer matrix. Shown are the singular vectors for a frequency of f = 80 Hz. The singular vectors are quite deteriorated compared to the simulation results shown in Fig. 5.17(b). However, their coarse structure is still equivalent to the circular harmonics. Possible reasons for the deteriorations are measurement and ambient noise, variations in the directivity and frequency response of the microphones used compared to ideal omnidirectional and figure-of-eight microphones, and elevated reflections.
Figure 5.27: Absolute value of the first eight right singular vectors (f = 80 Hz) of the
measured circular WFS/WFA system sorted by descending singular values (top left to
bottom right).
Energy Compaction
Figure 5.28 shows the energy of the elements of the room transfer matrix for the pressure microphones and its circular harmonics representation for the measured setup. Please note the different scales used. Figure 5.28(b) shows the energy of the room transfer matrix represented by its circular harmonics decomposition. The main diagonal elements represent a major portion of the energy. The off-diagonal elements are a result of the reverberation of the listening room and the measurement errors mentioned before. The structure of the energy representation in the circular harmonics domain is quite similar to the simulation results shown in Fig. 5.18(d). This shows that the simulations can be used quite well to predict the real-world performance.
The energy compaction of the different representations for the measured room is shown in
Fig. 5.29. The energy compaction performance of the circular harmonics for the measured
setup is only slightly lower than for the simulated setup shown in Fig. 5.19.
Summarizing, the results obtained from the measurement of the circular WFS/WFA setup
prove that the circular harmonics decomposition of the room transfer matrix provides the
desired properties for the WDAF concept. Hence, the circular harmonics decomposition
is a reasonable transformation for WDAF based active room compensation in real-world applications.
Figure 5.28: Energy of the room transfer matrix for the signals captured by the pressure microphones E(m, n) and in the circular harmonics domain Ĕ(ν, ν₀) for the measured circular WFS/WFA setup.
Room Compensation Performance
The investigation of the singular vectors and the energy compaction for the measured setup showed results similar to those of the simulated acoustic environment used in Section 5.3.4.1. Simulations similar to those shown in Fig. 5.20 and Fig. 5.21 have also been performed for the measured setup; the results were similar. However, these simulations only prove the performance at the reference (microphone) positions. For the two-dimensional environments these results are also valid for positions inside the microphone array, as was shown for the simulated setup in Fig. 5.22. For three-dimensional environments and two-dimensional WFS systems, amplitude errors and elevated reflections will limit the achievable room compensation performance as discussed in Section 5.1.3. The influence of these and other artifacts will be discussed briefly in the following section.

The results of active room compensation shown for the simulated environments included only stationary acoustic scenes. In order to illustrate the tracking capabilities of the proposed algorithm, a nonstationary scenario was additionally simulated. For this purpose a point source at a distance of r = 5 m to the center of the WFS system was chosen
as desired virtual source wave field. The angle of the virtual point source was changed every T = 5 s, starting from θ₁ = 0° to θ₂ = 90°. The adaptation was performed using white noise as virtual source signal.

Figure 5.29: Energy compaction performance EC(i) for the different representations of the room transfer matrix for the measured circular WFS/WFA setup.

Figure 5.30 shows the resulting adaptation error. The error decays fast for the first position, indicating a fast convergence of the room compensation filters. The error increases slightly at the times the virtual point source changes its position. This reveals that the room compensation filters have to be adapted slightly to the changed source position. The reason for this renewed adaptation is that the new virtual source position excites portions of the room that have not been excited before. However, the fact that the increase is only slight indicates that the major characteristics of the room are already captured from the previous position. The results for the moving point source prove that the proposed algorithm is capable of handling nonstationary virtual scenes without major degradations, by capturing the major room characteristics with only a few coefficients. Traditional multichannel adaptation algorithms would not exhibit the superior tracking performance illustrated in Fig. 5.30 for this reason. An equivalent behavior to the tracking of a point source discussed above is also known from single-channel adaptive filtering algorithms, if not all frequencies of the concerned system are excited by the input signal used for the adaptation process [SMH95, BMS98]. In this case the adaptation error may increase if new frequency
components become active in the input signal.

Figure 5.30: Mean squared adaptation error E_adapt(k) for a moving point source for the measured circular WFS/WFA setup.
5.3.6
Using two-dimensional reproduction and analysis techniques in a three-dimensional environment may cause artifacts during reproduction and analysis of the wave field in the listening area. This section will briefly discuss the impact of these and other artifacts on the performance of active listening room compensation.
For WFS the major limiting artifacts for room compensation are the amplitude errors and the limited suppression capabilities for elevated reflections discussed in Section 5.1.2. For WFA the major limiting artifact is the limited analysis capability for elevated reflections as discussed in Section 5.2.3. The results derived in Sections 5.3.4 and 5.3.5 proved that a (perfect) compensation of the listening room reflections is, in principle, possible using the proposed active room compensation algorithm. However, these results only show the performance at the microphone positions, since they are based on the two-dimensional wave field decompositions derived from the microphone measurements. For the two-dimensional simulated environments these results are also valid within the entire listening area, as illustrated by Fig. 5.22. For three-dimensional environments, like the measured setup, the artifacts of WFS and WFA have to be taken into account to predict the performance within the listening area. It was already stated in Section 5.1.3.2 and Section 5.2.4.2 that elevated reflections should be suppressed by passive damping methods due to the limited suppression and analysis capabilities of two-dimensional WFS and WFA systems. The quantitative influence of elevated reflections for the measured circular setup has been derived in Section 5.1.3.2 and Section 5.2.4.2. Assuming proper passive damping of elevated reflections, the major remaining artifacts are the amplitude errors of two-dimensional WFS. This artifact will effectively limit the achievable performance of active listening room compensation depending on the listener position. A quantitative analysis of the influence of the amplitude error for a circular WFS system that matches the measured setup was given in Section 5.1.3.1. Up to now, the only countermeasure for these amplitude errors is to use line sources instead of point sources for the secondary sources.
Besides the artifacts discussed above, other effects will also influence the performance of an active listening room compensation system. These will only be mentioned briefly here. On the analysis side, additional limiting effects are, e.g., microphone mismatch and position errors, microphone and preamplifier noise, and ambient noise. On the reproduction side, limiting effects are, e.g., the directivity and nonlinear characteristics of the loudspeakers used.
5.3.7
The WDAF based active listening room compensation algorithm for WFS, as introduced
in Section 5.3.2, minimizes the mean squared error between the desired wave field and the
actually reproduced wave field. The complexity reduction for the adaptation of the room
compensation filters in the WDAF concept is based on a spatiotemporal transformation
of the MIMO system represented by the listening room. In the sequel some possible
modifications which may further decrease the complexity of the proposed algorithm will
be briefly discussed on a qualitative level. Possible modifications could be based on:
1. modification of the wave domain transformation,
2. performing spatiotemporal selective compensation of reflections, and
3. modification of the (singlechannel) adaptation algorithm used.
The results shown for the simulated and measured acoustic environments in the previous
sections illustrated that the circular harmonics decomposition provides a quite reasonable
choice for the wave domain transformation for circular analysis arrays and typical listening
rooms. However, there may exist other analytic transformations which perform better.
A possible candidate could be to use elliptical harmonics [MF53a] as basis for the wave
domain transformation for rectangular listening rooms.
The presented algorithm compensates for all reflections of the listening room. The complexity of the active room compensation algorithm can be reduced further by focusing on specific reflections of the listening room. Since the reproduction using WFS and the analysis using WFA are performed in a spatio-temporal fashion, this concentration could be performed in the temporal domain, the spatial domain or both. Examples for these cases could be to compensate only early reflections, reflections emerging from a rigid wall, or early reflections emerging from a rigid wall. The focusing plane wave decomposition [Hul04] may be used for this purpose. The particular choice for the spatio-temporal selectivity of active room compensation should be based on psychoacoustic criteria. This would allow concentrating on the removal of only the reflections that are relevant from a psychoacoustic viewpoint. The benefit of spatio-temporal selectivity is that it may result in shorter or fewer room compensation filters.
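A purely temporal form of such selectivity can be sketched by windowing an impulse response so that the compensation only targets the early reflections. The sketch below is illustrative only; the function name, window shape and time limits are assumptions, not part of this work:

```python
import numpy as np

def early_reflection_window(h, fs, direct_delay_ms, early_ms, taper_len=32):
    """Keep only the early-reflection part of an impulse response h
    (sampled at fs): zero before the end of the direct sound and after
    early_ms, with a short raised-cosine fade-out to soften truncation."""
    n0 = int(direct_delay_ms * 1e-3 * fs)
    n1 = int(early_ms * 1e-3 * fs)
    w = np.zeros(len(h))
    w[n0:n1] = 1.0
    # fade out at the late edge to avoid spectral artifacts
    w[n1 - taper_len:n1] *= 0.5 * (1 + np.cos(np.linspace(0, np.pi, taper_len)))
    return h * w

fs = 48000
h = np.ones(4800)                 # stand-in for a measured response
h_early = early_reflection_window(h, fs, direct_delay_ms=5, early_ms=50)
```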
The basic concept of WDAF is to decouple the room transfer matrix and hence the MIMO adaptive filter algorithm into several SISO adaptive filter algorithms. The particular choice of the SISO adaptive filter algorithm used is independent of this decoupling. Efficient implementations or approximations of the filtered-x RLS algorithm may be used for this purpose (e. g. the XLMS).
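To illustrate the kind of single-channel algorithm that could operate on each decoupled subsystem, the following is a minimal filtered-x LMS (XLMS) sketch. It assumes a known secondary-path impulse response and is an illustrative toy, not the adaptation algorithm implemented in this work:

```python
import numpy as np

def xlms(x, d, sec_path, n_taps, mu):
    """Minimal single-channel filtered-x LMS.
    x: reference signal, d: desired signal at the error sensor,
    sec_path: assumed-known secondary-path impulse response,
    n_taps: adaptive filter length, mu: step size."""
    w = np.zeros(n_taps)
    xf = np.convolve(x, sec_path)[:len(x)]   # reference filtered by the secondary path
    y_buf = np.zeros(len(sec_path))          # recent adaptive-filter outputs
    e = np.zeros(len(x))
    for n in range(n_taps - 1, len(x)):
        xv = x[n - n_taps + 1:n + 1][::-1]   # x[n], x[n-1], ...
        y_buf = np.r_[w @ xv, y_buf[:-1]]    # newest output first
        e[n] = d[n] - sec_path @ y_buf       # residual after the secondary path
        xfv = xf[n - n_taps + 1:n + 1][::-1]
        w += mu * e[n] * xfv                 # LMS update on the filtered reference
    return w, e

# toy check: the adaptive filter should converge to the target response t
rng = np.random.default_rng(0)
x = rng.standard_normal(20000)
t = np.array([0.5, -0.3, 0.1])               # target response (toy)
sec = np.array([0.0, 1.0])                   # secondary path: one-sample delay
d = np.convolve(np.convolve(x, t), sec)[:len(x)]
w, e = xlms(x, d, sec, n_taps=3, mu=0.01)
```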
5.3.8
Active compensation of the listening room reflections within the entire listening area is in principle only possible below the spatial aliasing frequency of the reproduction system used. Above the aliasing frequency, no control over the wave field within the listening area is gained. As a consequence, destructive interference to cancel the listening room reflections will fail. Above the aliasing frequency, perfect compensation of the listening room reflections will only be possible at the microphone positions. Hence, above the aliasing frequency the proposed active listening room compensation system will exhibit performance similar to the traditional multi-point approaches reviewed in Section 1.2.
From a theoretical viewpoint, nothing can be done to overcome the limitations imposed on active room compensation by spatial sampling. However, these limitations can be mitigated to some extent in a more practical sense. This could be done, for example, by (1) incorporating the knowledge of the listening room reflections into the virtual scene during reproduction or (2) utilizing the properties of human spatial perception in order to hide the listening room reflections. In the first approach, equivalent reflections present in the virtual scene and in the listening room could be replaced by the ones of the listening room. In the second approach, reflections of the listening room could be hidden by modifying the virtual scene to be reproduced according to psychoacoustic spatial masking effects. However, no solutions or algorithms that successfully demonstrate the applicability of these ideas were known to the author at the time this work was written. A broad overview of some of these ideas can be found in [CN03].
5.4
The application of WDAF-based active listening room compensation to WFS systems was illustrated in Section 5.3. The results shown there demonstrate that active listening room compensation is capable of removing the reflections caused by a reverberant listening room. However, the formulation of the generic framework for room compensation developed in Section 4.5 is not based on a specific reproduction or analysis system. It is also not limited to two-dimensional reproduction systems; it can be extended straightforwardly to three-dimensional reproduction systems.
The particular reproduction and analysis systems used to assemble an active room compensation system have to fulfill two requirements in order to be able to compensate for the listening room reflections in the entire listening area. These are: (1) the reproduction system must be capable of controlling the wave field within the entire listening area and (2) the analysis system must be capable of analyzing the wave field reproduced within the entire listening area. Both requirements are fulfilled by the different WFS systems and the circular WFA system used for the room compensation setups presented in Sections 5.3.4 and 5.3.5. Any reproduction and analysis system that fulfills the requirements stated above can be used to construct an advanced active listening room compensation system which results in a large compensated area. It remains to find a suitable transformation in order to decouple the MIMO system represented by the room transfer matrix. A suitable data-dependent transformation that can always be applied is given by the GSVD. This was shown in Section 4.5.3. No generic data-independent transformation for the WDAF concept can be given, since this transformation depends on the geometry of the particular problem. Hence, it has to be derived specifically for each problem. Two examples for the application of the proposed techniques to other reproduction scenarios are given briefly in the following.
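The effect of such a decoupling transformation can be illustrated at a single frequency bin. The GSVD jointly diagonalizes two matrices; as a simplified stand-in with only one matrix, the ordinary SVD already shows how unitary pre- and post-transforms turn a coupled MIMO matrix into independent SISO channels (toy data, not a GSVD implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
# MIMO room frequency response at a single frequency bin (toy data)
R = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

U, s, Vh = np.linalg.svd(R)
V = Vh.conj().T
R_tilde = U.conj().T @ R @ V   # transformed system: diagonal
# the 4x4 coupled system becomes 4 independent SISO channels with gains s[i]
```

Unlike this data-dependent factorization, the wave domain transformation of WDAF is fixed in advance, at the cost of only approximate diagonalization.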
Higher-order ambisonics [Dan00, Ger85] is an alternative spatial reproduction system to WFS. It has been shown that ambisonics is equivalent to WFS when discarding the effects of spatial sampling [NE98, NE99, DNM03]. It was also shown in [DNM03] that WFS and ambisonics exhibit different characteristics for the aliasing artifacts present above the spatial aliasing frequency. There are indications that the spatial aliasing artifacts of an ambisonics system scale inversely with the distance to the center of the system, while for WFS the aliasing artifacts are spread over the entire listening area. For an ambisonics system, the virtual source wave fields are represented by their ambisonics representation. This representation is equivalent to the circular harmonics decomposition of wave fields. Thus, if a circular WFA system is used to analyze the reproduced wave field, then the ambisonics representation may be used as the basis of a WDAF-based active listening room compensation system for ambisonics.
Chapter 6
Summary and Conclusions
Since its early days, sound reproduction has aimed at creating the perfect acoustic illusion. Many reproduction systems have emerged over the past decades with the ambition to fulfill this goal. However, the perfect acoustic illusion has not been realized by any of them. Nevertheless, sound reproduction has improved considerably in terms of quality and spatial impression compared to the first monophonic systems [Tor98, Ste96, Gri00].
The theory behind most reproduction systems assumes free-field propagation from the loudspeakers to the listener. Unfortunately, this assumption does not hold for reproduction systems placed in a listening room. The listening room is typically a compromise between design, cost and acoustic requirements. As a consequence, most listening rooms are more or less reverberant and will impose their reflections onto the reproduced wave field. This work developed an improved framework for active compensation of the listening room acoustics.
The basic idea behind this framework is to utilize the existing sound reproduction system in order to cancel the reflections imposed by the listening room with destructive interference. This additionally requires analyzing the influence of the listening room on the reproduced wave field. In order to be able to compensate the reflections for the entire listening area, two basic requirements were derived for the reproduction and analysis systems. These are: (1) the analysis system should be able to (perfectly) analyze the reproduced wave field within the listening area, and (2) the reproduction system should provide (perfect) control over the reproduced wave field within the listening area.
Chapter 3 introduced WFA based on the plane wave and circular harmonics decompositions as a solution to the first requirement. In particular, circular microphone arrays have proven to provide many desirable properties for WFA. Section 4.1 derived a generic theoretical framework for sound reproduction systems. Wave field synthesis was introduced in Section 5.1 as a particular implementation of a spatial sound reproduction system that fulfills the desired control capabilities.
Both WFA and WFS systems are realized by densely placed microphones and loudspeakers. A large number of analysis and reproduction channels is required in order to provide sufficient analysis and control capabilities for even a modest upper frequency limit and size of the listening area. Note that most of the multiple-point equalization schemes use only a relatively low number of channels and thus do not provide sufficient acoustic analysis and control. It was derived in Section 4.4.4 that traditional approaches to adaptive inverse filtering are not applicable in the context of active room compensation for massive multichannel reproduction systems. As a solution, a spatial decoupling of the MIMO adaptive inverse filtering problem on the basis of signal and system transformations was proposed. It was shown that the optimal transformation is given on the basis of the GSVD. However, the GSVD depends on the acoustic characteristics of the listening room. In general, these characteristics are unknown and may change over time. In order to resolve these problems, the concept of WDAF proposes to use a transformation which is independent of the particular characteristics of the listening room. As a drawback, this transformation will not result in an optimal spatial decoupling of the multichannel adaptive system.
Section 5.3.2 used WFA, WFS and WDAF as building blocks for an improved listening room compensation system. The results derived from offline simulations of this system showed that the proposed methods are able to provide room compensation for the entire listening area. Additionally, the computational complexity has been lowered significantly by WDAF.
It was shown by [Hay96] that the adaptive filter building block can be used for four basic classes of applications: (1) inverse modeling, (2) identification, (3) interference canceling, and (4) prediction. Thus, the presented theoretical framework of WDAF can also be applied to other massive multichannel adaptation problems. The first application class has been discussed in detail within this work. The second class is typically used for acoustic echo cancellation (AEC) systems, the third for acoustic human-machine interfaces and the fourth for the coding of signals. The concept of WDAF was also successfully applied to massive multichannel AEC by [BSK04b, BSK04c, BSK04a] and massive multichannel adaptive beamforming by [HNS+05]. An approach that uses spatial transformations for the coding of massive multichannel audio was published by [TGAA+03]. Thus, the concept of WDAF provides a unified framework for massive multichannel adaptive filtering. The application of this framework is not necessarily limited to acoustic wave fields; it may also be very useful for non-acoustic applications. Due to its generality, the WDAF approach for massive multichannel adaptation problems has also been patented [BSHK03a, BSHK03b].
This work focused mainly on the theoretical foundations of WFA, sound reproduction, eigenspace adaptive filtering and WDAF. However, for a practical implementation of these techniques, practical aspects have to be considered as well. Some of them have already been mentioned in Sections 5.2.5 and 5.3.6, e. g. transducer mismatch and misplacement, and transducer and amplifier noise. These issues have been discussed for arbitrary microphone arrays e. g. by [Tre02, JD93, BW01] and for circular microphone arrays e. g. by [Teu05]. If not handled properly, the resulting artifacts may reduce the achievable room compensation gain considerably.
Active listening room compensation, as introduced in this work, aims at perfectly canceling the reflections imposed by the listening room. From a psychoacoustic viewpoint, this may not be necessary or desired in all situations [CNC04]. For a particular scenario it may be sufficient to, e. g., cancel only the early reflections produced by one wall of the listening room. The presented framework can be extended straightforwardly to handle such cases since it provides full control over the spatio-temporal structure of the reproduced field. As a benefit, this may further reduce the complexity of active listening room compensation algorithms. However, the author is not aware of research results from the field of psychoacoustics that could be applied straightforwardly for this purpose.
Appendix A
Notations
The following section introduces the notations used throughout this thesis.
A.1 Conventions
The following conventions are used in this thesis: for scalar variables, lower case denotes the time domain and upper case the temporal frequency domain. Vectors are denoted by lower-case boldface and matrices by upper-case boldface variables. The temporal frequency domain for vectors and matrices is denoted by underlining the respective variables. The spatial frequency domain is denoted by a tilde placed over the symbol.
The following table summarizes the conventions using the variable p as an example:

domain                                 scalar      vector      matrix
space-time                             p(x, t)     p(x, t)     P(x, t)
space-temporal frequency               P(x, ω)     p(x, ω)     P(x, ω)
spatial frequency-time                 p̃(k, t)     p̃(k, t)     P̃(k, t)
spatial frequency-temporal frequency   P̃(k, ω)     p̃(k, ω)     P̃(k, ω)

The additional decorations attached to a transformed quantity denote the following:
1. The domain into which a quantity has been transformed is denoted by a decoration over the respective quantity, e.g. for a quantity transformed by FS_ν^{-1}{·}.

2. The traveling direction of a wave is denoted by a superscript:

   traveling direction    decoration
   incoming wave          (1)
   outgoing wave          (2)

3. The coordinate system is denoted by a subscript:

   coordinate system    symbol
   Cartesian            P_C(ω)
   spherical            P_H(ω)
   cylindrical          P_Y(ω)
   polar                P_P(ω)

A.2 Acronyms
ASR     aliasing-to-signal ratio
DFBT    discrete Fourier-Bessel transformation
DFT     discrete Fourier transform
DTFT    discrete-time Fourier transform
FFT     fast Fourier transformation
FIR     finite impulse response
FTM     functional transformation method
LMS     least mean squares
LSE     least squares error
LSI     linear space-invariant
LTI     linear time-invariant
LTSI    linear time- and space-invariant
MIMO    multiple-input multiple-output
MISO    multiple-input single-output
MMSE    minimum mean squared error
MSE     mean squared error
PWD     plane wave decomposition
RASR    reproduced aliasing-to-signal ratio
RLS     recursive least squares
SIMO    single-input multiple-output
SISO    single-input single-output
VBAP    vector base amplitude panning
WFA     wave field analysis
WFS     wave field synthesis
XLMS    filtered-x least mean squares
XRLS    filtered-x recursive least squares
A.3 Mathematical Symbols
(·)^{-1}    inverse operation of (·)
rk{·}       rank of a matrix
(·)^H       Hermitian (conjugate transpose) of (·)
(·)^T       transpose of (·)
(·)^*       complex conjugate of (·)
(·)^+       pseudoinverse of a matrix
∇           nabla operator
∀           for all
Δ           Laplace operator
δ(·)        Dirac pulse
(x)
[x]
∝           proportional to
ℑ{·}        imaginary part
ℜ{·}        real part
⟨·, ·⟩      inner product of two vectors
Transformations

F_t{·}, F_t^{-1}{·}        temporal Fourier transformation and its inverse
F_x{·}, F_x^{-1}{·}        spatial Fourier transformation with respect to x and its inverse
FS_ν{·}, FS_ν^{-1}{·}      Fourier series expansion and its inverse
H_ν,r{·}, H_ν,R{·}         ν-th order Hankel transformation with respect to the variable r (R)
H^(1)_ν,r{·}               ν-th order complex Hankel transformation of first kind with respect to the variable r
H^(2)_ν,r{·}               ν-th order complex Hankel transformation of second kind with respect to the variable r
T_i{·}                     response of system i
P{·}, P_ν{·}, P^{-1}{·}, P_ν^{-1}{·}    plane wave decomposition and its inverse
M{·}                       conformal mapping
Symbols
0mn
P (, r)
P ()
[rad]
az [rad]
pw [rad]
al (, R, )
tr (Mtr , R, )
truncation error
(xC , t)
al
[rad]
az [rad]
angular frequency
(
c, k)
[kg/m ]
0 [kg/m3 ]
(xS , )
m (x)
[rad/s]
da (k)
dd (k)
dd,m (k)
da,m (k)
cross-correlation matrix of the m-th component of the transformed filtered secondary source driving signals and the desired signals
m (x)
A(x, )
ASR(, R, )
aliasingtosignal ratio
am (k)
a(M ) (k)
a(x)
am (k)
C(k)
C(xx , )
cn,n (k)
c(k)
c [m/s]
speed of sound
cn,n (k)
circ()
DR (k)
D(x, )
Dpw (x0 , )
Dcorr,pw (x0 , )
DS (x, )
dn (k)
d(N ) (k)
d0 (xn , t)
dn (k)
E(x, )
E()
E(m, n)
0 )
E(,
0 )
E(,
Eadapt (k)
EC(i)
em (k)
e(M ) (k)
e()
exponential function
em (k)
F(k)
fm,n
f [1/s]
fs [1/s]
sampling frequency
f ()
arbitrary function
fm,n (k)
fal
G(xx0 , )
G(
0 , )
Green's function
G(
0 , )
G0 (xx0 , )
G0,2D (xx0 , )
G0,3D (xx0 , )
H_ν^(1)(·)    Hankel function of first kind and order ν
H_ν^(2)(·)    Hankel function of second kind and order ν
J_ν(·)        Bessel function of first kind and order ν
j             square root of −1
K(x, )
K(x,
)
k [rad/m]
wave vector
km
k [rad/m]
wavenumber
kr [rad/m]
radial wavenumber
kx [rad/m]
ky [rad/m]
kz [rad/m]
L(x, )
lm (k)
l(M ) (k)
lm (k)
M(kR)
Nc
Nf
Nr
surface normal
n0
normal vector
P (x, )
P ()
acoustic pressure
PP,pw,0 (, r, )
E0 ()
po [N/m2 ]
static pressure
Q(x, )
R(k)
matrix consisting of the impulse responses from all synthesis positions to all analysis positions of the discretized listening room
fixed radius
RLS
Rmic
RASR(xP , )
rm,n
r [m]
rm,n (k)
S(x, )
Tm,n
Ts [s]
sampling period
T60 [s]
reverberation time
t [s]
continuous time
U()
ub ()
uC (sC , )
V()
Vn (x, ) [m/s]
Vr (x, ) [m/s]
V (x x0 , )
V2D (x x0 , )
V2D (x x0 , )
v(x, t) [m/s]
vb ()
W (x, )
wn (k)
w(N ) (k)
wn (k)
X()
position vector
x [m]
YC (kC , kC,S , )
YC (kC,S , )
y [m]
Z()
Z0
z [m]
Appendix B
Coordinate Systems
In the following, the different coordinate systems used throughout this work are defined. Additionally, their interrelations, some operators and special functions are introduced.
Since the position vector x and the wave vector k will be affected by a coordinate system
change, most of the results will be given for both quantities.
B.1 Cartesian Coordinates
Figure B.1 illustrates the Cartesian coordinate system for the example of the position vector x_C. However, this illustration also applies to the wave vector k_C.

Figure B.1: Illustration of the Cartesian coordinate system for the position vector x_C.

The three-dimensional case will be considered in the following. Nevertheless, the introduced relations also hold for the two-dimensional case when neglecting the z-components. The position vector x_C and the wave vector k_C in Cartesian coordinates are defined as follows:

x_C = [x  y  z]^T ,   (B.1)
k_C = [k_x  k_y  k_z]^T .   (B.2)

The volume element used for integration in Cartesian coordinates is given as

dV = dx dy dz .   (B.3)
Operators

The gradient of a scalar variable p_C(x_C) defined in Cartesian coordinates is given as [AW01]

∇ p_C(x_C) = [ ∂p_C(x_C)/∂x   ∂p_C(x_C)/∂y   ∂p_C(x_C)/∂z ]^T ,   (B.4)

its Laplacian as

Δ p_C(x_C) = ∇² p_C(x_C) = ∂²p_C(x_C)/∂x² + ∂²p_C(x_C)/∂y² + ∂²p_C(x_C)/∂z² .   (B.5)
Special Functions

A Dirac pulse placed at the origin of the Cartesian coordinate system is given in Cartesian coordinates as [AW01, Bra03]

δ_C(x, y, z) = δ_C(x_C) = δ(x) δ(y) δ(z) ,   (B.6)

a Dirac pulse with an offset x_C,0 as

δ_C(x_C − x_C,0) = δ(x − x_0) δ(y − y_0) δ(z − z_0) .   (B.7)
Convolution

The spatial convolution of two three-dimensional functions in Cartesian coordinates is given as

g_C(x_C) = f_C(x_C) *_x h_C(x_C)
         = ∫∫∫ f_C(x', y', z') h_C(x − x', y − y', z − z') dx' dy' dz' .   (B.8)
237
Figure B.2: Illustration of the spherical coordinate system for the position vector x_H.
B.2 Spherical Coordinates
Fig. B.2 illustrates the spherical coordinate system for the example of the position vector
xH . However, this illustration applies also to the wave vector kH . The position and wave
vector in spherical coordinates are defined as follows:
x_H = [α  α_az  r]^T ,   (B.9)
k_H = [α  α_az  k]^T .   (B.10)

The volume element used for integration in spherical coordinates is given as

dV = r² sin(α_az) dα dα_az dr .   (B.11)

The relations between the position vector x in spherical x_H and Cartesian x_C coordinates are given as

x = r sin α_az cos α ,   α = tan⁻¹(y/x) ,   (B.12a)
y = r sin α_az sin α ,   α_az = cos⁻¹( z / √(x² + y² + z²) ) ,   (B.12b)
z = r cos α_az ,         r = √(x² + y² + z²) ,   (B.12c)

where α, α_az and r denote the polar angle, the azimuth angle and the radius, respectively. These variables are subject to the following ranges: 0 ≤ α ≤ 2π, 0 ≤ α_az < π, 0 ≤ r < ∞. The relations between the wave vector k in spherical k_H and Cartesian k_C coordinates are given in accordance with the position vector as

k_x = k sin α_az cos α ,   α = tan⁻¹(k_y/k_x) ,   (B.13a)
k_y = k sin α_az sin α ,   α_az = cos⁻¹( k_z / √(k_x² + k_y² + k_z²) ) ,   (B.13b)
k_z = k cos α_az ,         k = √(k_x² + k_y² + k_z²) .   (B.13c)

Figure B.3: Illustration of the cylindrical coordinate system for the position vector x_Y.
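The conversions of Eqs. (B.12)/(B.13) can be sketched as follows; the helper functions are hypothetical illustrations, not part of this work, and the angle names follow the conventions used above:

```python
import numpy as np

def sph_to_cart(alpha, alpha_az, r):
    """Spherical -> Cartesian following Eq. (B.12): alpha measured in the
    x-y plane, alpha_az measured from the z axis."""
    return (r * np.sin(alpha_az) * np.cos(alpha),
            r * np.sin(alpha_az) * np.sin(alpha),
            r * np.cos(alpha_az))

def cart_to_sph(x, y, z):
    """Cartesian -> spherical (inverse relations of Eq. (B.12));
    arctan2 resolves the quadrant that tan^-1(y/x) leaves ambiguous."""
    r = np.sqrt(x**2 + y**2 + z**2)
    return np.arctan2(y, x), np.arccos(z / r), r

# round trip
x, y, z = sph_to_cart(0.7, 1.1, 2.5)
alpha, alpha_az, r = cart_to_sph(x, y, z)
```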
B.3 Cylindrical Coordinates
Fig. B.3 illustrates the cylindrical coordinate system for the example of the position vector x_Y. However, this illustration also applies to the wave vector k_Y. The position and wave vector in cylindrical coordinates are defined as follows:

x_Y = [α  r  z]^T ,   (B.14)
k_Y = [α  k_r  k_z]^T .   (B.15)
Figure B.4: Illustration of the position vector x and the wave vector k expressed in Cartesian and cylindrical coordinates as given by Eq. (B.17) and Eq. (B.18).
The volume element used for integration in cylindrical coordinates is given as

dV = r dα dr dz .   (B.16)

The relations between the position vector x in cylindrical x_Y and Cartesian x_C coordinates are given as

x = r cos α ,   α = tan⁻¹(y/x) ,   (B.17a)
y = r sin α ,   r = √(x² + y²) ,   (B.17b)
z = z ,         z = z ,   (B.17c)

where α, r, z denote the angle, radius, and height, respectively. These variables are subject to the following ranges: 0 ≤ α ≤ 2π, 0 ≤ r < ∞, −∞ < z < ∞. The relations between the wave vector k in cylindrical k_Y and Cartesian k_C coordinates are, similarly to the position vector, given as

k_x = k_r cos α ,   α = tan⁻¹(k_y/k_x) ,   (B.18a)
k_y = k_r sin α ,   k_r = √(k_x² + k_y²) ,   (B.18b)
k_z = k_z ,         k_z = k_z .   (B.18c)
Operators

The gradient of a scalar variable p_Y(x_Y) defined in cylindrical coordinates is given as [AW01]

∇ p_Y(x_Y) = [ ∂p_Y(x_Y)/∂r   (1/r) ∂p_Y(x_Y)/∂α   ∂p_Y(x_Y)/∂z ]^T ,   (B.19)

its Laplacian as

Δ p_Y(x_Y) = ∇² p_Y(x_Y) = (1/r) ∂/∂r ( r ∂p_Y(x_Y)/∂r ) + (1/r²) ∂²p_Y(x_Y)/∂α² + ∂²p_Y(x_Y)/∂z² .   (B.20)
Special Functions

A Dirac pulse placed at the origin of the cylindrical coordinate system is given in cylindrical coordinates as [Gas78]

δ_Y(x_Y) = δ(r)/(π r) · δ(z) ,   (B.21)

a Dirac pulse with an offset x_Y,0 as [Gas78]

δ_Y(x_Y − x_Y,0) = (1/r_0) δ(r − r_0) δ(α − α_0) δ(z − z_0) .   (B.22)
Convolution

Using the definition of the spatial convolution in Cartesian coordinates (B.8) and introducing the coordinate system change to cylindrical coordinates yields

g_Y(x_Y) = f_Y(x_Y) * h_Y(x_Y)
         = ∫_{−∞}^{∞} ∫_{0}^{∞} ∫_{0}^{2π} f_Y(α', r', z') h_Y(ᾱ, r̄, z − z') r' dα' dr' dz' ,   (B.23)

where

ᾱ(α, α', r, r') = tan⁻¹( (r sin α − r' sin α') / (r cos α − r' cos α') ) ,   (B.24a)
r̄(α, α', r, r') = √( r² + r'² − 2 r r' cos(α − α') ) .   (B.24b)

B.4 Polar Coordinates
The polar coordinate system can be derived from the cylindrical coordinate system by
setting z = 0 and kz = 0. Figure B.5 illustrates the polar coordinate system for the
Figure B.5: Illustration of the polar coordinate system for the position vector x_P.
position vector x_P. The position and wave vector in polar coordinates are defined as follows:

x_P = [α  r]^T ,   (B.25)
k_P = [α  k]^T .   (B.26)

The surface element used for integration in polar coordinates is given as

dS = r dα dr .   (B.27)

The relations between the position vector x in polar x_P and Cartesian x_C coordinates are given as

x = r cos α ,   α = tan⁻¹(y/x) ,   (B.28a)
y = r sin α ,   r = √(x² + y²) ,   (B.28b)

and for the wave vector as

k_x = k cos α ,   α = tan⁻¹(k_y/k_x) ,   (B.29a)
k_y = k sin α ,   k = √(k_x² + k_y²) .   (B.29b)
Operators

The gradient of a scalar variable p_P(x_P) defined in polar coordinates is given as [AW01]

∇ p_P(x_P) = [ ∂p_P(x_P)/∂r   (1/r) ∂p_P(x_P)/∂α ]^T ,   (B.30)

its Laplacian as

Δ p_P(x_P) = ∇² p_P(x_P) = (1/r) ∂/∂r ( r ∂p_P(x_P)/∂r ) + (1/r²) ∂²p_P(x_P)/∂α² .   (B.31)
Special Functions

A Dirac pulse placed at the origin of the polar coordinate system is given in polar coordinates as [Gas78]

δ_P(x_P) = δ(r)/(π r) ,   (B.32)

a Dirac pulse with an offset x_P,0 as [Gas78]

δ_P(x_P − x_P,0) = (1/r_0) δ(r − r_0) δ(α − α_0) .   (B.33)
Convolution

Using the definition of the spatial convolution in two-dimensional Cartesian coordinates (B.8) and introducing the coordinate system change to polar coordinates yields

g_P(x_P) = f_P(x_P) * h_P(x_P)
         = ∫_{0}^{∞} ∫_{0}^{2π} f_P(α', r') h_P(ᾱ, r̄) r' dα' dr' ,   (B.34)

where

ᾱ(α, α', r, r') = tan⁻¹( (r sin α − r' sin α') / (r cos α − r' cos α') ) ,   (B.35a)
r̄(α, α', r, r') = √( r² + r'² − 2 r r' cos(α − α') ) .   (B.35b)
Appendix C
Mathematical Preliminaries
The following section summarizes mathematical preliminaries for reference within this
thesis.
C.1 Green's Second Integral Theorem

Green's second integral theorem is given as

∫_V ( v Δu − u Δv ) dV = − ∮_{∂V} ( v ⟨∇u, n⟩ − u ⟨∇v, n⟩ ) dS ,   (C.1)
where v and u denote scalar fields whose second derivatives exist and are integrable, n denotes the inward pointing surface normal, dV and dS denote volume and surface elements, respectively, and ⟨a, b⟩ denotes the inner product of two vectors a and b. Figure C.1 illustrates the geometry used for Green's second integral theorem. The parts of the inner products on the right hand side of Green's theorem (C.1) involving the surface normal n and the gradients of the fields v and u can be understood as directional gradients. This operation calculates the gradient in the direction of the inward pointing surface normal.
Figure C.1: Bounded region V, surface ∂V, and inward pointing surface normal n used for Green's second integral theorem (C.1).
The directional gradient of a field u in the direction of n is written as

∂u/∂n = ⟨∇u, n⟩ .   (C.2)

Thus, Green's theorem yields a relation between the directional gradients of the scalar fields u, v on the boundary ∂V and the fields and their Laplacians inside the bounded region V. Introducing Eq. (C.2) into Eq. (C.1) results in
∫_V ( v Δu − u Δv ) dV = − ∮_{∂V} ( v ∂u/∂n − u ∂v/∂n ) dS .   (C.3)

This form of Green's second integral theorem will be used in this work.
C.2 The Stationary Phase Method
The stationary phase method [Ble84, Wil99] considers the approximate calculation of integrals that exhibit the following form:

F = ∫ f(z) e^{jφ(z)} dz .   (C.4)

For a rapidly varying phase φ(z), the integral can be approximated as

F ≈ √( 2πj / φ''(z_s) ) f(z_s) e^{jφ(z_s)} ,   (C.5)

where φ''(z_s) denotes the second derivative of φ(z) with respect to z, evaluated at the stationary phase point z_s. The stationary phase point z_s is found by setting the first derivative φ'(z) of φ(z) to zero:

φ'(z_s) = 0 .   (C.6)
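As a numeric illustration of the stationary phase approximation, consider a Gaussian amplitude with a quadratic phase, for which the exact integral is known in closed form. The choice of f, φ and ω below is an assumption for illustration only:

```python
import numpy as np

# F = int exp(-z^2) exp(j*omega*z^2) dz has the closed form
#     sqrt(pi / (1 - j*omega)).
# Stationary phase: phi(z) = omega*z^2, z_s = 0, phi''(z_s) = 2*omega,
#     F approx sqrt(2*pi*j / phi''(z_s)) * f(z_s) = sqrt(pi*j / omega).
omega = 200.0
exact = np.sqrt(np.pi / (1 - 1j * omega))
approx = np.sqrt(np.pi * 1j / omega)
rel_err = abs(approx - exact) / abs(exact)   # shrinks as omega grows
```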
C.2.1
Section 5.1.1 discusses the correction of the secondary source type mismatch for a two-dimensional WFS system. For this purpose, Eq. (5.5) is approximated using the stationary phase method outlined in the previous section.

Comparing the inner integral of Eq. (5.5) to Eq. (C.4) yields

φ(z_0) = −k ‖x − x_0(x_0, z_0)‖ − k_0^T x_0 ,   (C.7)
f(z_0) = j k cos α / ‖x − x_0(x_0, z_0)‖ ,   (C.8)
where the integration in Eq. (5.5) is performed over the variable z_0. The first derivative of φ(z_0) with respect to z_0 is given as

φ'(z_0) = − k z_0 / ‖x − x_0(x_0, z_0)‖ .   (C.9)
Setting Eq. (C.9) to zero yields the stationary phase point z_s = 0. Evaluating the above quantities at z_s = 0 gives

φ(0) = −k ‖x − x_0‖ − k_0^T x_0 ,   (C.10)
φ''(0) = − k / ‖x − x_0‖ ,   (C.11)
f(0) = j k cos α / ‖x − x_0‖ .   (C.12)
Introducing Eq. (C.5) together with the above results into Eq. (5.5) yields Eq. (5.6).
C.3 Spatio-temporal Spectrum of the Two- and Three-dimensional Free-field Green's Functions
The following section derives the spatio-temporal spectra of the two- and three-dimensional free-field Green's functions.
C.3.1 Two-dimensional Free-field Green's Function

The two-dimensional free-field Green's function is given in Cartesian coordinates as

G_C,2D(x_C − x_C,0, ω) = − (j/4) H_0^(2)( (ω/c) ‖x_C − x_C,0‖ ) .   (C.13)

Centered at the origin and expressed in polar coordinates, it reads

G_P,2D(x_P | 0, ω) = − (j/4) H_0^(2)( (ω/c) r ) .   (C.14)
Due to the radial symmetry of the Green's function G_P,2D(x_P | 0, ω) in polar coordinates, its spatial Fourier transformation is given by its Hankel transformation [Pap68]

G̃_P,2D(k_P | 0, ω) = − (j/4) ∫_0^∞ H_0^(2)( (ω/c) r ) J_0(k r) r dr
                    = 1 / ( 4 ( k² − (ω/c)² ) ) + ( j / (4k) ) δ( k − ω/c ) ,   (C.15)
where [GR65] and [Gas78] were utilized to derive the solution of the above integral. Transforming this result back into Cartesian coordinates and applying the shift theorem of the spatial Fourier transformation yields the spectrum of the two-dimensional free-field Green's function as

G̃_C,2D(k_C | x_C,0, ω) = [ 1 / ( 4 ( k_x² + k_y² − (ω/c)² ) ) + j / ( 4 √(k_x² + k_y²) ) · δ( √(k_x² + k_y²) − ω/c ) ] e^{−j k_C^T x_C,0} .   (C.16)

The spatio-temporal spectrum of the two-dimensional free-field Green's function describes the complex valued pressure field of a line source placed at the position x_C,0.
C.3.2 Three-dimensional Free-field Green's Function

The three-dimensional free-field Green's function is given in Cartesian coordinates as

G_C,3D(x_C − x_C,0, ω) = (1/(4π)) · e^{−j (ω/c) ‖x_C − x_C,0‖} / ‖x_C − x_C,0‖ .   (C.17)

Centered at the origin and expressed in spherical coordinates, it reads

G_H,3D(x_H | 0, ω) = (1/(4π)) · e^{−j (ω/c) r} / r .   (C.18)
Due to the radial symmetry of the Green's function G_H,3D(x_H | 0, ω) in spherical coordinates, its spatial Fourier transformation is given by its Hankel transformation [Pap68], where [GR65] was utilized to derive the solution of the resulting integral. Transforming this result back into Cartesian coordinates and applying the shift theorem of the spatial Fourier transformation yields the spectrum of the three-dimensional free-field Green's function as

G̃_C,3D(k_C | x_C,0, ω) = (1/4) · 1 / √( k_x² + k_y² − (ω/c)² ) · e^{−j k_C^T x_C,0} .   (C.20)
Appendix D
Measured and Simulated Acoustic Environments
In this section, the acoustic environments and experimental setups are described which were used to evaluate the performance of the proposed WDAF-based room compensation approach. In particular, the acoustic environments considered in this work consisted of a laboratory environment and a simulated environment. The acoustical properties of the laboratory environment were measured as described in the next section.
D.1
The acoustic environment that was used for the measurements was the multimedia laboratory of the Chair of Multimedia Communications and Signal Processing at the University of Erlangen-Nuremberg [LMS]. The laboratory has the dimensions 5.90 × 5.80 × 3.10 m (w × l × h). The room is equipped with carpet, a damped ceiling and sound absorbing curtains on all sides. The curtains can be removed partly or fully in order to change the reverberation characteristics of the room. The reverberation time for closed curtains is approximately T60 ≈ 250 ms, for opened curtains T60 ≈ 400 ms. For opened curtains, the plane wave reflection factor of the walls can be approximated by Rpw = 0.8. Figure D.1
illustrates the geometry of the multimedia laboratory. A circular loudspeaker array and a circular microphone array were used for the measurement of the room transfer matrix. Their geometries and positions are also depicted in Fig. D.1.
The circular loudspeaker array with a radius of R_LS = 1.50 m consists of 48 equidistantly mounted two-way loudspeakers (ELAC type 301 [ELA]). All loudspeakers are mounted at a height of 1.60 m. Figure 5.6 shows the circular loudspeaker array. The circular microphone array with a radius of R_Mic = 0.75 m was realized in a sequential fashion using a stepper motor drive. The setup shown in Fig. 5.13 and Fig. 5.14 was used for this purpose. The acoustic pressure and the velocity in normal direction were measured at 48 equidistant
Figure D.1: Geometry of the multimedia laboratory, loudspeaker and microphone array.
The dimensions are given in centimeters [cm].
(Plot legend: microphones, circular LS array, rectangular LS array.)
Figure D.2: Exact positions of the loudspeakers and microphones used for the measured/simulated setups (LS = loudspeaker). The coordinates are given with respect to
the lower left corner of the multimedia laboratory illustrated by Fig. D.1.
positions. Figure D.2 additionally illustrates the exact positions of the loudspeakers and
microphones with respect to the lower left corner of the multimedia laboratory.
Using the described setup, the impulse responses from each loudspeaker to each microphone were measured sequentially. The entire measurement procedure was controlled by a personal computer. The measurements were taken at a sampling rate of 48 kHz using maximum-length sequences (MLS) [RV89]. The measurements taken from each loudspeaker to each microphone were then used to compose the listening room transfer matrix R(ω) used for the results presented in this work.
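The MLS technique can be sketched as follows: the periodic cross-correlation of the measured output with the excitation sequence recovers the impulse response, since the periodic autocorrelation of an MLS is nearly a unit pulse. This is an illustrative toy (LFSR taps, sequence length and the simulated "room" are assumptions), not the measurement software used here:

```python
import numpy as np

def mls(m=10, taps=(10, 3)):
    """Maximum-length sequence of period 2**m - 1 from a Fibonacci LFSR;
    taps (10, 3) correspond to the primitive polynomial x^10 + x^3 + 1."""
    state = [1] * m
    seq = []
    for _ in range(2**m - 1):
        seq.append(state[-1])
        fb = state[taps[0] - 1] ^ state[taps[1] - 1]
        state = [fb] + state[:-1]
    return np.where(np.array(seq) > 0, 1.0, -1.0)   # map bits to +/-1

N = 1023
s = mls()
# toy "room": direct sound at 5 samples, one reflection at 20 samples
h_true = np.zeros(N)
h_true[5], h_true[20] = 1.0, 0.5
# periodic response of the room to the MLS excitation
y = np.real(np.fft.ifft(np.fft.fft(h_true) * np.fft.fft(s)))
# periodic cross-correlation with the MLS recovers the impulse response
h_est = np.real(np.fft.ifft(np.fft.fft(y) * np.conj(np.fft.fft(s)))) / (N + 1)
```

The recovered response matches the true one up to a small DC bias of order 1/(N+1), which is the well-known residual of the MLS autocorrelation.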
D.2
Figure D.3: Simulated setups of loudspeaker and microphone array. Dimensions are
given in centimeters [cm].
Appendix E
Titel, Inhaltsverzeichnis, Einleitung und Zusammenfassung
The following German translations of the title (Section E.1), the table of contents (Section E.2), the introduction (Section E.3), and the summary (Section E.4) are a mandatory requirement for a doctoral thesis at the Faculty of Engineering of the University of Erlangen-Nuremberg.
E.1 Titel
E.2 Inhaltsverzeichnis
1 Einleitung
1.1 Der Einfluss des Wiedergaberaumes
1.2 Systeme zur Kompensation des Wiedergaberaumes
1.3 Überblick über diese Arbeit
2 Grundlagen der akustischen Wellenausbreitung
2.1 Die akustische Wellengleichung
2.1.1 Herleitung der homogenen akustischen Wellengleichung
2.1.2 Zweidimensionale Wellenfelder
2.1.3 Allgemeine Lösung der Wellengleichung
2.2 Lösungen der homogenen Wellengleichung in kartesischen Koordinaten
2.2.1 Expansion in ebene Wellen
2.3 Lösungen der homogenen Wellengleichung in Zylinderkoordinaten
2.4
2.5
2.6
2.7
3.3.4.4 Überblick über die verschiedenen Repräsentationen der Zerlegung in ebene Wellen
3.3.5 Eigenschaften und Theoreme der Zerlegung in ebene Wellen
3.3.5.1 Eigenschaften
3.3.5.2 Skalierungstheorem
3.3.5.3 Rotationstheorem
3.3.5.4 Multiplikationstheorem
3.3.5.5 Faltungstheorem
3.3.5.6 Parsevalsches Theorem
3.3.5.7 Extrapolation von ebenen Wellen
3.3.5.8 Zusammenfassung
3.3.6 Beziehungen zu anderen Methoden und Transformationen
3.4 Die Zerlegung in ebene Wellen unter der Benutzung von Randmessungen
3.4.1 Die Zerlegung in ebene Wellen mit beschränkter Apertur
3.4.2 Die Zerlegung in ebene Wellen auf der Basis von Kirchhoff-Helmholtz-Extrapolation
3.4.3 Die Zerlegung in ebene Wellen auf der Basis von zylindrischen Harmonischen
3.5 Die Zerlegung in ebene Wellen von analytischen Quellenmodellen
3.5.1 Die Zerlegung in ebene Wellen einer ebenen Welle
3.5.2 Die Zerlegung in ebene Wellen einer Linienquelle
3.6 Die diskrete Zerlegung in ebene Wellen
3.6.1 Die Herleitung der diskreten Zerlegung in ebene Wellen
3.6.1.1 Der zweidimensionale polare Impulskamm
3.6.1.2 Definition der raumdiskreten Zerlegung in ebene Wellen
3.6.1.3 Spektrale Eigenschaften der raumdiskreten Zerlegung in ebene Wellen
3.6.1.4 Definition der diskreten Zerlegung in ebene Wellen
3.6.2 Abtastungs- und Endlichkeitsartefakte
3.6.2.1 Angulare Abtastung von Randmessungen
3.6.2.2 Abtastung einer ebenen Welle auf einem zirkulären Rand
3.6.2.3 Quantitative Analyse der Aliasingartefakte
3.6.2.4 Quantitative Analyse der Endlichkeitsartefakte
3.6.3 Zusammenfassung
4 Raumkompensation
4.1 Schallwiedergabe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.1 Schallwiedergabe basierend auf dem KirchhoffHelmholtz Integral
4.1.2 Dreidimensionale Schallwiedergabe . . . . . . . . . . . . . . . . .
66
68
68
69
70
70
71
72
73
73
74
76
76
78
79
81
82
83
85
85
85
86
87
88
89
90
91
94
97
99
101
. 101
. 102
. 104
254
4.1.3
4.1.4
4.1.5
4.1.6
4.2
4.3
4.4
4.5
4.6
Zweidimensionale Schallwiedergabe . . . . . . . . . . . . . . . . . .
Schallwiedergabe mit Monopolen als Sekundarquellen . . . . . . . .
Wiedergabe eines in ebene Wellen zerlegten Wellenfeldes . . . . . .
Raumliche Abtastung der Verteilung der Monopol Sekundarquellen
4.1.6.1 Lineare Arrays . . . . . . . . . . . . . . . . . . . . . . . .
4.1.6.2 Zirkulare Arrays . . . . . . . . . . . . . . . . . . . . . . .
4.1.6.3 Beliebig geformte Arrays . . . . . . . . . . . . . . . . . . .
4.1.6.4 Punktquellen als Sekundarquellen . . . . . . . . . . . . . .
Schallwiedergabe in Raumen . . . . . . . . . . . . . . . . . . . . . . . . . .
Grundlagen der Raumkompensation . . . . . . . . . . . . . . . . . . . . . .
4.3.1 Raumkompensation als Entfaltungsproblem . . . . . . . . . . . . .
4.3
Allgemeine Losung des Entfaltungsproblems f
ur Raume . . . . . . .
4.3.3 Adaption der Raumkompensationsfilter . . . . . . . . . . . . . . . .
Raumkompensation f
ur Reproduktionssysteme mit vielen Kanalen . . . . .
4.4.1 Diskrete Realisierung der Raumkompensation . . . . . . . . . . . .
4.4.1.1 Raumliche Diskretisierung . . . . . . . . . . . . . . . . . .
4.4.1.2 Zeitliche Diskretisierung . . . . . . . . . . . . . . . . . . .
4.4.1.3 Frequenzbereichsbeschreibung der Signale und Systeme . .
4.4.1.4 Adaption der Raumkompensationsfilter . . . . . . . . . . .
4.4.2 Exakte inverse Filterung mit dem MINT . . . . . . . . . . . . . . .
4.4.3 Adaption der Raumkompensationsfilter mittels des minimalen quadratischen Fehlers . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4.4 Fundamentale Probleme der adaptiven inversen Filterung . . . . . .
Allgemeines Konzept f
ur ein verbessertes Raumkompensationssystem . . .
dd . . . . . . . . . . . . . . .
4.5.1 Analyse der Autokorrelationsmatrix
4.5.2 Entkoppelung der Raumtransfermatrix des Wiedergaberaumes . . .
4.5.2.1 Singulawertzerlegung . . . . . . . . . . . . . . . . . . . . .
4.5.2.2 Entkoppelung der Raumtransfermatrix des Wiedergaberaumes . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.5.3 Adaptive inverse Filterung im Eigenraum . . . . . . . . . . . . . . .
4.5.3 Adaptive inverse Filterung im Wellenbereich . . . . . . . . . . . . .
4.5.5 Angenaherte Entkoppelung der Raumtransfermatrix der Wiedergaberaumes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.5.5.1 Zerlegung in ebene Wellen . . . . . . . . . . . . . . . . . .
4.5.5.2 Zerlegung in zirkulare Harmonische . . . . . . . . . . . . .
Zusammenfassung . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
104
105
110
111
112
119
122
125
126
127
128
131
132
134
134
134
136
137
138
139
142
145
146
146
147
147
149
150
153
155
156
157
159
5 Raumkompensation fu
161
r Raumklangsysteme
5.1 Wellenfeldsynthese . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
E.2. Inhaltsverzeichnis
5.1.1
5.1.2
5.1.3
5.2
5.3
5.4
255
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
162
165
167
168
171
171
173
174
174
175
177
178
178
181
182
184
185
186
188
190
191
193
195
197
198
207
209
215
216
. 217
. 218
221
225
. 225
. 226
. 227
256
B Koordinatensysteme
B.1 Kartesisches Koordinatensystem .
B.2 Spharisches Koordinatensystem .
B.3 Zylindrisches Koordinatensystem
B.4 Polarkoordinatensystem . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
C Mathematische Grundlagen
C.1 Greens zweiter Integralsatz . . . . . . . . . . . . . . . . . . . . . .
C.2 Die Methode der stationare Phase . . . . . . . . . . . . . . . . . .
C.2.1 Approximation einer linearen Verteilung von Punktquellen
C.3 Raumzeitliches Spektrum der zwei und dreidimensionalen freifeld
schen Funktionen . . . . . . . . . . . . . . . . . . . . . . . . . . .
C.3.1 Zweidimensionale Greensche Funktion . . . . . . . . . . .
C.3.1 Dreidimensionale Greensche Funktion . . . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
235
. 235
. 237
. 238
. 240
243
. . . . . 243
. . . . . 244
. . . . . 244
Green. . . . . 245
. . . . . 245
. . . . . 246
E.3.3 Uberblick
u
ber diese Arbeit . . . . . . . . . . . . . . .
E.4 Zusammenfassung und Schlussfolgerungen . . . . . . . . . . .
Literaturverzeichnis
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
251
. 251
. 251
. 257
. 259
. 260
. 261
. 262
265
E.3 Introduction

Among the various human senses, hearing and vision are the most prominent in everyday situations. Hearing is the sense that is sensitive to sound waves. From a strictly physical point of view, sound waves could simply be regarded as compression waves propagating through the air. However, for humans, sound is much more than this physical definition suggests. Sound carries a wide range of information, impressions, and emotions, and may, for example, be interpreted by the receiver as noise, speech, or music. This significance for humans makes it desirable to possess techniques for the re-creation of acoustic events. Sound reproduction aims at re-creating a (virtual) acoustic scene at a different place or at a different time. Performed correctly, a perfect acoustic illusion of the original scene should be evoked. Accordingly, the goal of sound reproduction is to create the perfect acoustic illusion. However, the human sense of hearing is very sensitive and is capable of detecting minimal differences between the original scene and the reproduced one. The perfect reproduction of recorded or synthetic acoustic scenes has been an active field of research in recent decades. Generations of engineers have invented a wide variety of sound reproduction systems [Tor98, Ste96, Gri00, KTH99, Pol00]. The perfect acoustic illusion, however, has not been realized by any of these systems. Nevertheless, sound reproduction has improved considerably over the last decades in terms of reproduction quality and spatial perception. In the following, a brief overview of sound reproduction systems is given.

The history of sound reproduction dates back to the invention of the telephone in the late 19th century. Johann Philipp Reis developed a first prototype of the
to a certain degree. To improve the situation, stereophonic reproduction uses two loudspeakers. Typically, a stereophonic system is designed to have an angle of 30 degrees between the loudspeakers, as seen from the listener, and equal distances from the loudspeakers to the listener. Most stereophonic systems aim only at re-creating the sound field in the horizontal plane. Stereophony is based on principles for stereophonic reproduction, such as amplitude panning [HW98], which were derived from psychoacoustic research. As a consequence, the correct spatial impression of the original scene can only be perceived at one particular listening position. This position is often termed the sweet spot. To improve the situation, stereophony was extended by surround sound reproduction techniques. The driving force behind the further development of surround sound techniques was the film industry. First surround sound systems, consisting of three loudspeakers in the front and two in the rear, were presented in 1940 [Tor98]. These were likewise based on stereophonic principles and partly exhibited the same limitations. However, it took some time before surround sound techniques achieved their commercial breakthrough. Today, five-channel surround sound systems are state of the art in home theater systems, and cinemas typically use even more channels. The reproduction techniques have also been extended towards three-dimensional reproduction. Vector base amplitude panning [Pul97, Pul99] may serve as an example of a three-dimensional stereophonic reproduction system.
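The amplitude panning principle mentioned above can be sketched numerically. The following is a minimal illustration of the tangent panning law for a standard two-loudspeaker setup with the 30-degree base angle described above; the function name and the constant-power normalization are illustrative choices, not taken from this thesis.

```python
import numpy as np

def stereo_pan_gains(phi_deg, base_deg=30.0):
    """Tangent-law amplitude panning gains for a standard stereo setup.

    phi_deg:  intended source angle (positive towards the left speaker),
              with |phi_deg| <= base_deg.
    base_deg: angle of each loudspeaker as seen from the listener
              (30 degrees in the standard stereophonic layout).
    Returns (g_left, g_right), normalized to constant power.
    """
    # tangent law: tan(phi)/tan(base) = (g_l - g_r) / (g_l + g_r)
    r = np.tan(np.radians(phi_deg)) / np.tan(np.radians(base_deg))
    g_l, g_r = 1.0 + r, 1.0 - r
    norm = np.hypot(g_l, g_r)
    return float(g_l / norm), float(g_r / norm)

# A centered source gets equal gains; a source at the loudspeaker angle
# is reproduced by that loudspeaker alone.
print(stereo_pan_gains(0.0))    # equal gains
print(stereo_pan_gains(30.0))   # prints (1.0, 0.0)
```

Outside the sweet spot the level and time differences at the ears no longer match the panning law, which is the geometric reason for the localization errors discussed above.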
Advanced spatial reproduction systems that overcome the sweet spot and other limitations of stereophonic systems have also been developed [Cam67, KNOBH96, KN93, WA01]. The systems worth mentioning here are wave field synthesis (WFS) [Ber88, Hul04] and higher-order Ambisonics [Dan00]. Both are based on a solid physical foundation. WFS is discussed in detail in Section 5.1. In general, the loudspeaker driving signals are derived from the recorded signals of the original sources, from geometric information about the source positions, and from information about the acoustics of the recording room. This information may have been recorded in an existing recording room (for example, a concert hall) or may have been created artificially. The signal processing algorithms for generating the loudspeaker signals are derived from fundamental psychoacoustic or physical principles. Although the quality of reproduction has already improved considerably, a number of open problems remain. One of them, common to all of the systems mentioned above, is the influence of the room in which the reproduction takes place, the listening room. The following section illustrates the influence of the listening room on spatial sound reproduction.
E.3.1

The influence of the listening room on a reproduction system and on the reproduced scene is first shown in an intuitive way. For this purpose, we consider a simple reproduction scenario in the following. Figure 1.1 shows this simplified scenario. The transmission of an acoustic scene from a church (e.g., a singer performing in the choir) into the listening room serves as an example. For simplicity, Figure 1.1 illustrates the propagation of acoustic waves by means of acoustic rays. The dashed lines in Figure 1.1, from the virtual source to an exemplary listening position, show the acoustic rays for the direct path and for several reflections off the side walls of the recording room. The loudspeaker system in the listening room reproduces the direct path and the reflections in order to re-create the desired spatial impression of the original scene. The theory behind almost all applied methods assumes a reflection-free listening room. However, this idealized assumption is rarely met by typical listening rooms. The solid line in Figure 1.1, from a loudspeaker in the upper row to the listening position, shows a possible reflection of the wave field produced by the loudspeaker off a wall of the listening room. As this simplified example intuitively shows, these additional reflections caused by the listening room can distort the desired spatial impression.
The influence of the listening room on spatial sound reproduction is a topic of current research [Gri98, DZJR05, KS03, Vol98, VTB02, Vol96]. Since the acoustic properties of the listening room and of the applied reproduction system vary over a wide range, no general statements about the perceptual influence of the listening room can be made. However, the reflections added by the listening room will influence the psychoacoustic properties of the reproduced scene. These influences could be, for example, a degradation of localization and sound coloration. Especially dominant first reflections seem to affect the desired spatial impression. Another effect of the listening room on low-frequency reproduction could be the occurrence of low-frequency resonances. These resonances negatively affect the perception of short sound events.

In general, a reverberant listening room will impose its characteristics on the desired impression of the recorded room. Listening room compensation aims at eliminating or reducing the influence of the listening room on the reproduced scene. The following section introduces systems for room compensation and gives a brief overview of known approaches.
E.3.2

Various approaches to the compensation of the listening room are possible. A first measure against room reflections is the application of acoustic damping materials, referred to as passive room compensation in this thesis. However, it is well known that acoustic damping is impractical and expensive and achieves relatively little attenuation. This applies especially to low frequencies. In addition, cost and design considerations limit the effect of this countermeasure. As a consequence, passive room compensation alone cannot provide sufficient suppression of the listening room reflections in a practical application. The basic idea of active compensation of the listening room is to use concepts of active control to perform the desired compensation. Among the various variants, two fundamental approaches can be named here: (1) active control of the acoustic impedance at the walls of the listening room, and (2) use of the reproduction system itself. The first approach attempts to actively influence the impedance of a wall in order to obtain free-field conditions [GKR85, OM53]. The second exploits synergies with the reproduction system to gain control over the wave field in the listening area. The approaches based on the latter are briefly summarized in the following.
An overview of classical single-channel approaches to room compensation and their limitations can be found in [Fie01, Fie03, HM04, Mou94]. The single-channel approaches analyze the wave field reproduced by one loudspeaker at only one position. Common to most approaches to active room compensation is the basic idea of pre-filtering the loudspeaker driving signals with suitable compensation filters computed from an analysis of the reproduced wave field. However, these approaches suffer from three fundamental problems. The first problem is that the compensation filters have to be computed adaptively due to the time-varying characteristics of the listening room. A temperature change in the listening room, for example, results in a change of the speed of sound and thus of the acoustic properties [OYS+99]. A wide range of problems is associated with the algorithms used to adapt the compensation filter. The second problem is that room impulse responses are in general not minimum-phase, and it is therefore not possible to compute an exact inverse filter. The optimal compensation filter, however, is an inverse filter to the impulse response from the loudspeaker to the measured position. The third problem is that the compensation filter is only optimal for the measured position. As a consequence, the performance in terms of compensation will decrease with increasing distance from the measured point [TW02, TW03, BHK03, NOBH95].
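The second problem, the non-invertibility of non-minimum-phase responses, can be illustrated with a small numerical experiment. The sketch below uses illustrative two-tap toy responses rather than measured room impulse responses: it designs a causal least-squares inverse filter, and the residual error is negligible for a minimum-phase response while a substantial error remains for a non-minimum-phase one.

```python
import numpy as np

def ls_inverse(h, n_taps, delay=0):
    """Causal least-squares inverse filter of length n_taps.

    Finds c minimizing ||conv(h, c) - delta(delay)||_2 and returns the
    filter together with the residual error norm.
    """
    n_out = len(h) + n_taps - 1
    H = np.zeros((n_out, n_taps))          # convolution matrix of h
    for k in range(n_taps):
        H[k:k + len(h), k] = h
    d = np.zeros(n_out)
    d[delay] = 1.0                         # desired: a (delayed) impulse
    c, *_ = np.linalg.lstsq(H, d, rcond=None)
    return c, float(np.linalg.norm(H @ c - d))

h_min = np.array([1.0, -0.5])   # zero inside the unit circle (minimum-phase)
h_max = np.array([1.0, -2.0])   # zero outside the unit circle

_, err_min = ls_inverse(h_min, 64)
_, err_max = ls_inverse(h_max, 64)
# err_min is practically zero, while err_max stays large (about 0.87):
# without a modeling delay, the non-minimum-phase part cannot be inverted
# by any causal filter, however long.
```

Introducing a modeling delay (the delay argument above) reduces the error for the non-minimum-phase case, but only approximately and at the cost of latency.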
To solve the latter two of these problems, various authors have proposed multichannel active room compensation systems. Here, the acoustic properties

E.3.3 Overview of this Thesis

E.4 Summary and Conclusions
From its beginnings, sound reproduction has aimed at creating the perfect acoustic illusion. Many reproduction systems with the ambition to reach this goal have emerged in recent decades. The perfect acoustic illusion, however, has not been realized by any of these systems. Nevertheless, reproduction has improved considerably in terms of quality and spatial impression compared to the first monophonic systems [Tor98, Ste96, Gri00].

The theory behind most reproduction systems assumes free-field propagation from the loudspeakers to the listener. Unfortunately, this assumption is not met by reproduction systems placed in a listening room. The listening room is typically a compromise between design, cost, and acoustic requirements. As a consequence, most listening rooms are more or less reverberant and impose their reflections on the reproduced wave field. This thesis develops a theoretical framework for the active compensation of the reflections in the listening room. The basic idea is to use the already existing sound reproduction system to create destructive interference that suppresses the reflections. However, this requires an additional analysis of the influence of the listening room. In order to suppress the reflections in a large listening area, two fundamental requirements on the reproduction and analysis systems were derived. These are: (1) the analysis system should allow the (perfect) analysis of the reproduced wave field within the listening area, and (2)
channel beamforming [HNS+05]. Consequently, the concept of wave-domain adaptive filtering offers a general framework for adaptive filtering with very many channels. The application of this framework is not necessarily limited to acoustic wave fields; it could also be very useful for non-acoustic applications.
This thesis has focused mainly on the theoretical foundations of WFA, sound reproduction, adaptive filtering in the eigenspace, and adaptive filtering in the wave domain. For a practical implementation of these techniques, however, additional practical aspects have to be considered. Some of them have already been named in Sections 5.2.5 and 5.3.6, for example transducer variations and mispositioning, as well as transducer and amplifier noise. These points are discussed for arbitrary microphone arrays, for example, in [Tre02, JD93, BW01] and for circular microphone arrays, e.g., in [Teu05]. If they are not considered adequately, the resulting artifacts can considerably reduce the achievable benefit of room compensation.
As discussed in this thesis, the goal of active listening room compensation is to perfectly suppress the reflections added by the listening room. From a psychoacoustic point of view, this might not be necessary in all situations. For a specific scenario it might, for example, be sufficient to suppress the first reflections produced by a particular wall. The presented framework can be extended directly by this possibility, since it provides full control over the spatio-temporal structure of the reproduced wave field. The benefit would be a further reduction of the complexity of active room compensation algorithms. However, the author is not aware of any findings from psychoacoustic research that could be applied directly for this purpose.
Bibliography

[AB79] J.B. Allen and D.A. Berkley. Image method for efficiently simulating small-room acoustics. Journal of the Acoustical Society of America, 65(4):943–950, 1979.

[ACD+03]

[AKG]

[ALE] Alesis. http://www.alesis.com/.

[ALS] The Advanced Linux Sound Architecture (ALSA). http://www.alsa-project.org/.

[AR75] N. Ahmed and K.R. Rao. Orthogonal Transforms for Digital Signal Processing. Springer, 1975.

[AS72]

[AW01] G.B. Arfken and H.J. Weber. Mathematical Methods for Physicists. Academic Press, 2001.

[BA05a] T. Betlehem and T.D. Abhayapala. A modal approach to soundfield reproduction in reverberant rooms. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages III-289–292, Philadelphia, PA, USA, 2005.

[BA05b] T. Betlehem and T.D. Abhayapala. Theory and design of sound field reproduction in reverberant rooms. Journal of the Acoustical Society of America, 117(4):2100–2111, April 2005.

[Baa05]
[Bam89]

[BBK03]

[BdB00]

[BdVV93]

[Ber87]

[Ber88]

[BHK03]

[Bja95]

[Bla96] J. Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization. MIT Press, 1996.

[Bla00]

[Ble84]

[BMS98]

[BQ00] M. Bouchard and S. Quednau. Multichannel recursive-least-squares algorithms and fast-transversal-filter algorithms for active noise control and sound reproduction systems. IEEE Transactions on Speech and Audio Processing, 8(5):606–618, September 2000.
[Bra78]

[Bra03]

[BSK04b]

[BSK04c] H. Buchner, S. Spors, and W. Kellermann. Wave-domain adaptive filtering for acoustic human-machine interfaces based on wave-field analysis and synthesis. In European Signal Processing Conference (EUSIPCO), 2004.

[BSP01]

[BW01] M. Brandstein and D. Ward. Microphone Arrays: Signal Processing Techniques and Applications. Springer, 2001.

[Cam67]

[CAR]

[CCW03]

[CHP02]

[CN03] E. Corteel and R. Nicol. Listening room compensation for wave field synthesis. What can be done? In 23rd AES International Conference, Copenhagen, Denmark, May 2003. Audio Engineering Society (AES).
[CNC04] D. Cabrera, A. Nguyen, and Y.J. Choi. Auditory versus visual spatial impression: A study of two auditoria. In 10th Meeting of the International Conference on Auditory Display (ICAD), Sydney, Australia, July 2004.

[Dan00]

[dB04]

[Dea93] S.R. Deans. The Radon Transform and Some of its Applications. Krieger Publishing Company, 1993.

[Dep88] Ed.F. Deprettere, editor. SVD and Signal Processing: Algorithms, Applications and Architectures. North-Holland, 1988.

[DGZD01] R. Duraiswami, N.A. Gumerov, D.N. Zotkin, and L.S. Davis. Efficient evaluation of reverberant sound fields. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pages 203–206, New Paltz, USA, Oct. 2001.

[DNM03] J. Daniel, R. Nicol, and S. Moreau. Further investigations of high order ambisonics and wavefield synthesis for holophonic sound imaging. In 114th AES Convention, Amsterdam, The Netherlands, March 2003. Audio Engineering Society (AES).

[dVSV94] D. de Vries, E.W. Start, and V.G. Valstar. The Wave Field Synthesis concept applied to sound reinforcement: Restrictions and solutions. In 96th AES Convention, Amsterdam, Netherlands, February 1994. Audio Engineering Society (AES).

[DZJR05]
[FHLB99]

[Fie01] L.D. Fielder. Practical limits for room equalization. In 111th AES Convention, New York, NY, USA, September 2001. Audio Engineering Society (AES).

[Fie03]

[Fli02]

[Fur01]

[Gar00]

[Gas78] J.D. Gaskill. Linear Systems, Fourier Transforms, and Optics. John Wiley & Sons, 1978.

[GD04] N.A. Gumerov and R. Duraiswami. Fast Multipole Methods for the Helmholtz Equation in three Dimensions. Elsevier, 2004.

[Ger85]

[GKR85]

[GL89] G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press, 1989.

[GR65] I.S. Gradshteyn and I.M. Ryzhik. Tables of Integrals, Series, and Products. Academic Press, 1965.
[Gri98]

[Gri00]

[Gro05] P. Grond. Extraction of 3D information from 2D array measurements. Master's thesis, Laboratory of Acoustical Imaging and Sound Control, Delft University of Technology, 2005.

[GRS01]

[Hay96]

[HdVB01] E. Hulsebos, D. de Vries, and E. Bourdillat. Improved microphone array configurations for auralization of sound fields by Wave Field Synthesis. In 110th AES Convention, Amsterdam, Netherlands, May 2001. Audio Engineering Society (AES).

[HdVB02] E. Hulsebos, D. de Vries, and E. Bourdillat. Improved microphone array configurations for auralization of sound fields by Wave Field Synthesis. Journal of the Audio Engineering Society (AES), 50(10), Oct. 2002.

[HM04]

[HNS+05]

[HS97] C.H. Hansen and S.D. Snyder. Active Control of Noise and Vibration. E&FN Spon, 1997.

[HSdVB03] E. Hulsebos, T. Schuurmanns, D. de Vries, and R. Boone. Circular microphone array recording for discrete multichannel audio recording. In 114th AES Convention, Amsterdam, Netherlands, March 2003. Audio Engineering Society (AES).

[Hul04]
[HV99]

[HW98]

[ITU97]

[JAC]

[Jag84] D.S. Jagger. Recent developments and improvements in soundfield microphone technology. In 75th AES Convention, Paris, France, March 1984. Audio Engineering Society (AES).

[JD93] D.H. Johnson and D.E. Dudgeon. Array Signal Processing: Concepts and Techniques. Prentice-Hall, 1993.

[JF86] M.C. Junger and D. Feit. Sound, Structures, and Their Interaction. Acoustical Society of America, 1986.

[JKA02] H.M. Jones, A. Kennedy, and T.D. Abhayapala. On dimensionality of multipath fields: Spatial extent and richness. In Proc. Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP 02), Orlando, USA, May 2002.

[JN84]

[KdVBB05] M. Kuster, D. de Vries, D. Beer, and S. Brix. Structural and acoustic analysis of multi-actuator panels. In 118th AES Convention, Barcelona, Spain, May 2005. Audio Engineering Society (AES).

[KN93] O. Kirkeby and P.A. Nelson. Reproduction of plane wave sound fields. Journal of the Acoustical Society of America, 94(5):2992–3000, Nov. 1993.

[KNOBH96] O. Kirkeby, P.A. Nelson, F. Orduna-Bustamante, and H. Hamada. Local sound field reproduction using digital signal processing. Journal of the Acoustical Society of America, 100(3):1584–1593, Sept. 1996.

[KS03] B. Klehs and T. Sporer. Wave field synthesis in the real world: Part 1 - In the living room. In 114th AES Convention, Amsterdam, The Netherlands, March 2003. Audio Engineering Society (AES).

[KTH99]
[LB05]

[LMS]

[MAT]

[Mec02]

[Mey04] J. Meyer. Spherical microphone arrays for 3D sound reproduction. In Y. Huang and J. Benesty, editors, Audio Signal Processing for Next-Generation Multimedia Communication Systems. Kluwer Academic Publishers, 2004.

[MF53a]

[MF53b]

[MK88] M. Miyoshi and Y. Kaneda. Inverse filtering of room acoustics. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36(2):145–152, February 1988.

[Mou94]

[NA79] S.T. Neely and J.B. Allen. Invertibility of a room impulse response. Journal of the Acoustical Society of America, 66:165–169, July 1979.

[NE98]

[NE99]

[NOBH95]

[Oes56]

[OM53] H.F. Olson and E.G. May. Electronic sound absorber. Journal of the Acoustical Society of America, 25(6):1130–1136, November 1953.

[OS99]

[OYS+99]

[Pap68]

[PEBL05] B. Pueo, J. Escolano, S. Bleda, and J.J. Lopez. An approach for wave field synthesis high power applications. In 118th AES Convention, Barcelona, Spain, May 2005. Audio Engineering Society (AES).

[Pet04] S. Petrausch. Solution of the wave equation using the functional transformation method. LMS Internal Report, 19th April 2004.

[Pie91] A.D. Pierce. Acoustics. An Introduction to its Physical Principles and Applications. Acoustical Society of America, 1991.

[Pol00]

[PR04]

[PR05] S. Petrausch and R. Rabenstein. Highly efficient simulation and visualization of acoustic wave fields with the functional transformation method. In Simulation and Visualization, pages 279–290, Magdeburg, March 2005. Otto von Guericke Universität.

[PSR05]
[Pul97]

[Pul99] V. Pulkki. Uniform spreading of amplitude panned virtual sources. In Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, USA, Oct. 1999.

[Rad17] J. Radon. Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten. Berichte Sächsische Akademie der Wissenschaften, 69:262–267, 1917.

[Red05] J.N. Reddy. Introduction to the Finite Element Method. McGraw-Hill, 2005.

[RME]

[RV89]

[SBR04a] S. Spors, H. Buchner, and R. Rabenstein. Adaptive listening room compensation for spatial audio systems. In European Signal Processing Conference (EUSIPCO), 2004.

[SBR04b]

[SBR04c] S. Spors, H. Buchner, and R. Rabenstein. A novel approach to active listening room compensation for wave field synthesis using wave-domain adaptive filtering. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Montreal, Canada, 2004.

[Sca97]

[SdVL98]

[SGP+95] V. Sanchez, P. Garcia, A.M. Peinado, J.C. Segura, and A.J. Rubino. Diagonalizing properties of the discrete cosine transforms. IEEE Transactions on Signal Processing, 43(11), Nov. 1995.

[SH94] S.D. Snyder and C.H. Hansen. The effect of transfer function estimation errors on the filtered-X LMS algorithm. IEEE Transactions on Signal Processing, 42(4):950–953, April 1994.
[SH00]

[SK92]

[SKR03]

[SMH95] M.M. Sondhi, D.R. Morgan, and J.L. Hall. Stereophonic acoustic echo cancellation - an overview of the fundamental problem. IEEE Signal Processing Letters, 2(8):148–151, August 1995.

[Sne72]

[Son67] M.M. Sondhi. An adaptive echo canceller. The Bell System Technical Journal, 46(3):497–511, March 1967.

[Spo04]

[SRR05] S. Spors, M. Renk, and R. Rabenstein. Limiting effects of active room compensation using wave field synthesis. In 118th AES Convention, Barcelona, Spain, May 2005. Audio Engineering Society (AES).

[SS67]

[SSR05]

[Sta97]

[Ste96]

[STR02]

[TAG+01]

[TBB00] O.J. Tobias, J.C. Bermudez, and N.J. Bershad. Mean weight behavior of the filtered-X LMS algorithm. IEEE Transactions on Signal Processing, 48(4):1061–1075, April 2000.

[Teu05] H. Teutsch. Wavefield Decomposition using Microphone Arrays and its Application to Acoustic Scene Analysis. PhD thesis, University of Erlangen-Nuremberg, 2005. http://www.lnt.de/lms/publications (last viewed on 4/16/2007).

[TGAA+03] S. Torres-Guijarro, J. Ander, B. Alava, F.J. Casajus-Quiros, and L.I. Ortiz-Berenguer. Multichannel audio decorrelation for coding. In 6th Int. Conference on Digital Audio Effects (DAFX-03), London, UK, Sept. 2003.

[Tof96]

[Tor]

[Tor98]

[TR03] L. Trautmann and R. Rabenstein. Digital Sound Synthesis by Physical Modeling using the Functional Transformation Method. Kluwer Academic/Plenum Publishers, New York, 2003.

[Tre02]

[TW02] F. Talantzis and D.B. Ward. Multichannel equalization in an acoustic reverberant environment: Establishment of robustness measures. In Institute of Acoustics Spring Conference, Salford, UK, March 2002.

[TW03]
[TWR03] G. Theile, H. Wittek, and M. Reisinger. Potential wavefield synthesis applications in the multichannel stereophonic world. In AES 24th International Conference on Multichannel Audio, Banff, Canada, June 2003. Audio Engineering Society (AES).

[Ver97]

[Vog93]

[Vol96]

[Vol98]

[VTB02] E.J. Volker, W. Teuber, and A. Bob. 5.1 in the living room - on acoustics of multichannel reproduction. In Proc. of the Tonmeistertagung, Hannover, Germany, 2002.

[WA01]

[Wei03]

[Wik05a]

[Wik05b]

[Wik05c]

[Wil99]

[Wit05]

[WW+04]

[YTF03]

[Zio95] L.J. Ziomek. Fundamentals of Acoustic Field Theory and Space-Time Signal Processing. CRC Press, 1995.
Curriculum Vitae

Name:
Birth:
Nationality:

School Education
1977
September 1978 – July 1980
September 1980 – July 1982
September 1982 – June 1992
June 1992

Alternative Service
July 1992 – October 1993   Nuremberg, Germany.

University Education
October 1994 – October 2000   Student of electrical engineering at the University of Erlangen-Nuremberg, Germany.
October 2000   Reception of the Dipl.-Ing. degree.

Professional Life
January 2001 – October 2005   Scientific assistant, Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg.
November 2005   Senior Scientist, Deutsche Telekom Laboratories, Deutsche Telekom AG, Technical University of Berlin.