
Sascha Spors

Active Listening Room Compensation for
Spatial Sound Reproduction Systems

Der Technischen Fakultät der
Friedrich-Alexander-Universität Erlangen-Nürnberg
zur Erlangung des Grades

Doktor-Ingenieur

vorgelegt von

Sascha Michael Spors

Erlangen, 2005

Als Dissertation genehmigt von
der Technischen Fakultät der
Friedrich-Alexander-Universität Erlangen-Nürnberg

Tag der Einreichung:  5.10.2005
Tag der Promotion:    31.01.2006
Dekan:                Prof. Dr.-Ing. Alfred Leipertz
Berichterstatter:     Priv.-Doz. Dr.-Ing. habil. Rudolf Rabenstein
                      Prof. Dr.-Ing. Reinhard Lerch

Acknowledgements

My special thanks go to my supervisor Dr.-Ing. habil. Rudolf Rabenstein for many fruitful and interesting scientific and non-scientific discussions, for his qualified supervision of my thesis and for diverting after-meeting and after-conference evenings. I would further like to thank Prof. Peter Steffen for long hours of elaborate discussions on various theoretical aspects of my work. Their advice improved my understanding and the quality of this thesis considerably.

I would like to thank Prof. Reinhard Lerch, Prof. Jörn Thielecke and Prof. Joachim Hornegger for their interest in my work, for reviewing this thesis, and for finding the time to participate in the defense.

I am very thankful to all members of the Telecommunications Laboratory at the University of Erlangen-Nuremberg for their friendship, support and for making my stay there so enjoyable. I would especially like to thank Herbert Buchner for many discussions on adaptive filters, for managing all the paperwork for our patents on wave domain adaptive filtering and for his comments on my thesis. I would also like to thank my colleague Heinz Teutsch for entertaining office hours and business trips, and Rüdiger Nagel and Manfred Lindner for the design and construction of outstanding hardware for the wave field synthesis systems. Additionally, I would like to thank Wolfgang Herbordt for fruitful discussions on wave domain adaptive filtering, for our very enjoyable business trips and for his comments on my thesis.

A very interesting side aspect of my work was the collaboration with the artists Michael Amman and Heijko Bauer. This collaboration resulted in the production of several compositions for WFS, which have been presented with great success at the Hörkunstfestival in Erlangen. I would like to thank them for this great opportunity to get insight into this non-technical field.

The major portion of this work has been carried out within the scope of the EC-funded IST project CARROUSO and a joint project with Airbus Deutschland GmbH funded by the German Bundesministerium für Wirtschaft und Arbeit. I would like to thank all the members of these projects for their support, open-minded discussions and entertaining social evenings during the project meetings.

Finally, my very special thanks go to Claudia for her outstanding patience and support, and to my son Lennard for being so tame during the first weeks of his life. I would also like to thank my family, the family of Claudia and all of my friends for all the support and relaxed moments during the last years.

Danksagungen

Mein besonderer Dank gilt meinem Doktorvater Dr.-Ing. habil. Rudolf Rabenstein für die vielen fruchtbaren und interessanten wissenschaftlichen und nicht-wissenschaftlichen Diskussionen, für die qualifizierte Betreuung meiner Arbeit und unterhaltende Abende im Anschluss an Versammlungen und Konferenzen. Ich möchte weiterhin Prof. Peter Steffen für die langen Stunden danken, in denen wir verschiedene theoretische Aspekte meiner Arbeit erörterten. Beider Anregungen trugen in beträchtlichem Ausmaß dazu bei, mein Verständnis für das Themengebiet und die Qualität dieser Arbeit zu steigern.

Ferner möchte ich Prof. Reinhard Lerch, Prof. Jörn Thielecke und Prof. Joachim Hornegger für ihr Interesse an meiner Arbeit danken, für die Rezension dieses Manuskripts und dafür, dass sie die Zeit fanden, an der Verteidigung teilzunehmen.

Ich bin allen Mitgliedern des Laboratoriums für Nachrichtentechnik an der Universität Erlangen-Nürnberg sehr dankbar für ihre Freundschaft, für ihre fachliche Unterstützung und dafür, dass sie meinen Aufenthalt so angenehm gestaltet haben. Ganz besonders danke ich Herbert Buchner für viele Gespräche über die adaptive Filterung, für die Organisation des Papierverkehrs bei der Anmeldung unseres Patents zur adaptiven Filterung im Wellenbereich und für seine Kommentare zu meiner Arbeit. Ich möchte zudem meinem Kollegen Heinz Teutsch für die kurzweiligen Bürostunden und Geschäftsreisen danken sowie Rüdiger Nagel und Manfred Lindner für die Entwicklung und Konstruktion hervorragender Geräte und Aufbauten für die Wellenfeldsynthese-Systeme. Überdies möchte ich Wolfgang Herbordt für die ertragreichen Diskussionen zur adaptiven Filterung im Wellenbereich danken, für unsere angenehmen Geschäftsreisen und für seine Anmerkungen zu meiner Arbeit.

Ein sehr reizvoller Teil meiner Arbeit bestand in der Zusammenarbeit mit den Künstlern Michael Amman und Heijko Bauer. Unsere Kooperation mündete in der Produktion mehrerer Kompositionen für Wellenfeldsynthese, die mit großem Erfolg auf dem Hörkunstfestival in Erlangen präsentiert wurden. Ich möchte den beiden dafür danken, dass sie mir einen Einblick in dieses nicht-technische Feld ermöglichten.

Der Hauptteil meiner Arbeit entstand im Rahmen des von der EU finanzierten IST-Projekts CARROUSO und innerhalb des gemeinsamen Projekts mit Airbus Deutschland GmbH, das vom deutschen Bundesministerium für Wirtschaft und Arbeit gefördert wurde. Ich möchte allen Beteiligten dieser Projekte für ihre Unterstützung, für offene Debatten und abwechslungsreiche Abende während der Projekt-Zusammentreffen danken.

Schließlich geht mein spezieller Dank an meine Freundin Claudia für ihre außerordentliche Geduld und Hilfe und an meinen Sohn Lennard, weil er während der ersten Wochen seines Lebens so friedlich war. Ich möchte auch meiner Familie, der Familie von Claudia und allen meinen Freunden für deren Beistand und die entspannenden Momente in den letzten Jahren danken.

Contents

1 Introduction
  1.1 The Influence of the Listening Room
  1.2 Listening Room Compensation Systems
  1.3 Overview of this thesis

2 Fundamentals of Sound Propagation
  2.1 The Acoustic Wave Equation
    2.1.1 Derivation of the Homogeneous Acoustic Wave Equation
    2.1.2 Two-dimensional Wave Fields  10
    2.1.3 Generic Solution of the Wave Equation  10
  2.2 Solutions of the Homogeneous Wave Equation in Cartesian Coordinates  11
    2.2.1 Plane Wave Expansion  13
  2.3 Solutions of the Homogeneous Wave Equation in Cylindrical Coordinates  14
    2.3.1 Cylindrical Harmonics Expansion  16
    2.3.2 Circular Harmonics Expansion  18
  2.4 Solutions of the Inhomogeneous Wave Equation  20
    2.4.1 Point Source  20
    2.4.2 Green's Functions  22
    2.4.3 Line Source  24
    2.4.4 Planar Sources  26
  2.5 Boundary Conditions  28
    2.5.1 Classification of Boundary Conditions  29
  2.6 Solution of the Inhomogeneous Wave Equation for a Bounded Region  31
    2.6.1 Three-dimensional Free-Space Kirchhoff-Helmholtz Integral  34
    2.6.2 Two-dimensional Free-Space Kirchhoff-Helmholtz Integral  35
  2.7 The Effect of Boundaries  36
    2.7.1 Plane Wave Reflection and Transmission at a Planar Boundary  37
    2.7.2 Acoustic Modes in a Rectangular Room  38

3 Fourier Analysis of Wave Fields  43
  3.1 Fourier Analysis of Multidimensional Signals  43
    3.1.1 Multidimensional Fourier Transformation  44
    3.1.2 Multidimensional Fourier Transformation in Cartesian Coordinates  45
    3.1.3 Multidimensional Fourier Transformation in Cylindrical Coordinates  47
    3.1.4 Cylindrical Fourier Transformation Expressed as Hankel Transformation  49
  3.2 Linear Systems and Fourier Analysis  51
    3.2.1 Classification of Multidimensional Systems  52
    3.2.2 The Wave Equation as Multidimensional System  53
    3.2.3 Linear Systems and the Fourier Transformation  54
  3.3 The Continuous Plane Wave Decomposition  56
    3.3.1 Fourier Analysis of Plane Waves  57
    3.3.2 Definition of the Plane Wave Decomposition  58
      3.3.2.1 Time-domain Plane Wave Decomposition  61
    3.3.3 Definition of the Inverse Plane Wave Decomposition  61
    3.3.4 Representations of the Plane Wave Decomposition  62
      3.3.4.1 Plane Wave Decomposition as Hankel Transformation  62
      3.3.4.2 The Decomposition into Incoming/Outgoing Plane Waves  63
      3.3.4.3 The Plane Wave Decomposition in Terms of Circular Harmonics  65
      3.3.4.4 Overview on the Different Representations of the Plane Wave Decomposition  66
    3.3.5 Properties and Theorems of the Plane Wave Decomposition  68
      3.3.5.1 Properties  68
      3.3.5.2 Scaling Theorem  69
      3.3.5.3 Rotation Theorem  70
      3.3.5.4 Multiplication Theorem  70
      3.3.5.5 Convolution Theorem  71
      3.3.5.6 Parseval's Theorem  72
      3.3.5.7 Plane Wave Extrapolation  73
      3.3.5.8 Summary  73
    3.3.6 Relations to other Methods and Transformations  74
  3.4 The Plane Wave Decomposition using Boundary Measurements  76
    3.4.1 The Finite Aperture Plane Wave Decomposition  76
    3.4.2 Plane Wave Decomposition based on Kirchhoff-Helmholtz Extrapolation  78
    3.4.3 Plane Wave Decomposition based on Circular Harmonics  79
  3.5 Plane Wave Decomposition of Analytic Source Models  81
    3.5.1 Plane Wave Decomposition of a Plane Wave  82
    3.5.2 Plane Wave Decomposition of a Line Source  83
  3.6 The Discrete Plane Wave Decomposition  85
    3.6.1 Derivation of the Discrete Plane Wave Decomposition  85
      3.6.1.1 Definition of the Two-dimensional Polar Pulse Train  85
      3.6.1.2 Definition of the Discrete Space Plane Wave Decomposition  86
      3.6.1.3 Spectral Properties of the Discrete Space Plane Wave Decomposition  87
      3.6.1.4 Definition of the Discrete Plane Wave Decomposition  88
    3.6.2 Sampling and Truncation Artifacts  89
      3.6.2.1 Angular Sampling of Boundary Measurements  90
      3.6.2.2 Sampling of a Plane Wave on a Circular Boundary  91
      3.6.2.3 Quantitative Analysis of Aliasing Artifacts  94
      3.6.2.4 Quantitative Analysis of the Truncation Error  97
    3.6.3 Summary  99

4 Listening Room Compensation  101
  4.1 Sound Reproduction  101
    4.1.1 Sound Reproduction based on the Kirchhoff-Helmholtz Integral  102
    4.1.2 Three-dimensional Sound Reproduction  104
    4.1.3 Two-dimensional Sound Reproduction  104
    4.1.4 Sound Reproduction with Monopole Secondary Sources  105
    4.1.5 Reproduction of a Plane Wave Decomposed Field  110
    4.1.6 Spatial Sampling of the Secondary Monopole Source Distribution  111
      4.1.6.1 Linear Arrays  112
      4.1.6.2 Circular Arrays  119
      4.1.6.3 Arbitrary Shaped Arrays  122
      4.1.6.4 Point Sources as Secondary Sources  125
  4.2 Sound Reproduction in Rooms  126
  4.3 Fundamentals of Listening Room Compensation  127
    4.3.1 Room Compensation as Deconvolution Problem  128
    4.3.2 Generic Solution to the Room Deconvolution Problem  131
    4.3.3 Adaptation of the Room Compensation Filters  132
  4.4 Listening Room Compensation for Massive Multichannel Reproduction Systems  134
    4.4.1 Discrete Realization of Room Compensation  134
      4.4.1.1 Spatial Discretization  134
      4.4.1.2 Temporal Discretization  136
      4.4.1.3 Frequency Domain Description of Signals and Systems  137
      4.4.1.4 Adaptation of Room Compensation Filters  138
    4.4.2 Exact Inverse Filtering using the MINT  139
    4.4.3 Least-Squares Error Adaptation of the Room Compensation Filters  142
    4.4.4 Fundamental Problems of Adaptive Inverse Filtering  145
  4.5 Generic Framework of an Improved Listening Room Compensation System  146
    4.5.1 Analysis of the Auto-Correlation Matrix  146
    4.5.2 Decoupling of the Listening Room Transfer Matrix  147
      4.5.2.1 Singular Value Decomposition  147
      4.5.2.2 Decoupling of the Listening Room Transfer Matrix  149
    4.5.3 Eigenspace Adaptive Inverse Filtering  150
    4.5.4 Wave Domain Adaptive Inverse Filtering  153
    4.5.5 Approximate Decoupling of the Listening Room Transfer Matrix  155
      4.5.5.1 Decomposition into Plane Waves  156
      4.5.5.2 Decomposition into Circular Harmonics  157
  4.6 Summary  159

5 Room Compensation Applied to Spatial Sound Systems  161
  5.1 Wave Field Synthesis  161
    5.1.1 Correction of Secondary Source Type Mismatch  162
    5.1.2 Artifacts of WFS and their Impact on Room Compensation  165
    5.1.3 Quantitative Analysis of the Artifacts for Circular WFS Systems  167
      5.1.3.1 Amplitude Errors  168
      5.1.3.2 Suppression of Elevated Reflections  171
    5.1.4 Rendering Techniques  171
      5.1.4.1 Data-based Rendering  173
      5.1.4.2 Model-based Rendering  174
    5.1.5 Practical Implementation of a WFS System  174
      5.1.5.1 Hardware  175
      5.1.5.2 Software  177
  5.2 Practical Implementation of Wave Field Analysis  178
    5.2.1 Linear Microphone Arrays  178
    5.2.2 Circular Microphone Arrays  181
    5.2.3 Artifacts of Two-dimensional Wave Field Analysis and Extrapolation  182
    5.2.4 Quantitative Analysis of the Artifacts of Circular WFA Systems  184
      5.2.4.1 Amplitude Errors due to Extrapolation  185
      5.2.4.2 Analysis of Elevated Reflections  186
    5.2.5 Practical Realization of a Circular WFA System  188
  5.3 Listening Room Compensation for Wave Field Synthesis  190
    5.3.1 Decoupling of the Listening Room Transfer Matrix  191
    5.3.2 WDAF based Room Compensation for WFS Systems  193
    5.3.3 Performance Measures for Active Room Compensation  195
    5.3.4 Results based on Simulated Acoustic Environments  197
      5.3.4.1 Circular WFS System  198
      5.3.4.2 Rectangular WFS System  207
    5.3.5 Results based on Measurement of the Acoustic Environment  209
    5.3.6 The Influence of WFS and WFA Artifacts  215
    5.3.7 Possible Modifications of the Proposed Algorithm  216
    5.3.8 Compensation of Listening Room Reflections above the Spatial Aliasing Frequency  217
  5.4 Room Compensation for other Spatial Reproduction Systems  218

6 Summary and Conclusions  221

A Notations  225
  A.1 Conventions  225
  A.2 Abbreviations and Acronyms  226
  A.3 Mathematical Symbols  227

B Coordinate Systems  235
  B.1 Cartesian Coordinate System  235
  B.2 Spherical Coordinate System  237
  B.3 Cylindrical Coordinate System  238
  B.4 Polar Coordinate System  240

C Mathematical Preliminaries  243
  C.1 Green's Second Integral Theorem  243
  C.2 The Stationary Phase Method  244
    C.2.1 Approximation of a Linear Distribution of Point Sources  244
  C.3 Spatio-temporal Spectrum of the Two- and Three-dimensional Free-field Green's Functions  245
    C.3.1 Two-dimensional Green's Function  245
    C.3.2 Three-dimensional Green's Function  246

D Measured and Simulated Acoustic Environments  247
  D.1 Measured Acoustic Environment  247
  D.2 Simulated Acoustic Environments  249

E Titel, Inhaltsverzeichnis, Einleitung und Zusammenfassung  251
  E.1 Titel  251
  E.2 Inhaltsverzeichnis  251
  E.3 Einleitung  257
    E.3.1 Der Einfluss des Wiedergaberaumes  259
    E.3.2 Systeme zur Kompensation des Wiedergaberaumes  260
    E.3.3 Überblick über diese Arbeit  261
  E.4 Zusammenfassung und Schlussfolgerungen  262

Bibliography  265

Chapter 1
Introduction
Among the various human senses, hearing and vision are the ones most present in
everyday situations. Vision is the sense that is sensitive to light, hearing the sense
that is sensitive to sound. From a strictly physical viewpoint, sound could be regarded just
as compression waves traveling in air. However, sound is much more to humans than just
this physical definition. Sound transmits a wide variety of information, impressions and
sensations and may be interpreted e. g. as noise, speech or music by the receiver. Due to
its importance to humans, it is desirable to have techniques for the recreation of acoustic
events. Sound reproduction aims at recreating a (virtual) acoustic scene at a remote
place or at a later time. When realized properly, a perfect auditory illusion of the original
scene is created. Thus, the goal of sound reproduction is to create the perfect acoustic
illusion. However, the human auditory sense is very sensitive and is able to detect minor
differences between the original scene and the reproduced one.
The perfect reproduction of recorded or synthetic acoustic scenes has been an active
research topic for decades. Several generations of engineers have invented various sound
reproduction systems [Tor98, Ste96, Gri00, KTH99, Pol00]. However, the perfect acoustic
illusion has not been realized by any of them. Nevertheless, sound reproduction has
been improved considerably in terms of quality and spatial impression in the last decades.
In the following, a brief overview of sound reproduction systems will be given.
The history of sound reproduction dates back to the invention of the telephone in the
late 19th century. Johann Philipp Reis constructed a first prototype of a telephone in
1860 [Wik05c], which was able to cover a distance of about 100 m. Alexander Graham
Bell improved this first prototype and patented the telephone in 1876 [Wik05a]. The
telephone can be regarded as one of the first sound transmission systems. It was mainly
designed for speech communication and provided rather poor quality for the transmission
of music. It also did not account for the binaural nature of human hearing. Some years later,
researchers recognized that binaural reproduction may improve the spatial impression
of sound reproduction considerably [Tor98]. Starting from these first approaches, a wide
variety of reproduction systems have been developed up to now. This work will mainly
focus on systems that are designed for the high-quality reproduction of sound and music.
The first reproduction systems in this context consisted of only one loudspeaker and
are therefore termed monophonic systems. A one-loudspeaker system is able to reproduce
the spatial impression of the original scene only to a limited extent. To
improve the situation, stereophonic reproduction uses two loudspeakers. These should
be placed equidistantly with respect to the listener. Typically, a stereophonic system is
designed to have an angle of 30° between the loudspeakers as viewed from the listener's
position. Most stereophonic systems aim at recreating the horizontal-only (pantophonic)
sound field. Stereophony relies on stereophonic reproduction principles, like amplitude
panning [HW98], derived from psychoacoustic research. As a result, the correct spatial
impression of the original scene is only perceived at one particular listening position. This
position is frequently termed the sweet-spot. To improve the situation, stereophony
has been extended to the surround reproduction techniques. The main driving force for the
further development of surround techniques was the cinema industry. The first surround
systems, consisting of three loudspeakers in the front and two in the rear, were presented
in 1940 [Tor98]. They were also based on stereophonic principles and to some extent
shared their limitations. However, it took quite a long time until surround techniques
had their commercial breakthrough. Today, five-channel surround systems are state of the
art in home cinema systems, and even more channels are typically used in movie theaters.
The reproduction techniques have also been extended to three-dimensional (periphonic)
reproduction. Vector base amplitude panning (VBAP) [Pul97, Pul99] is an example of
a three-dimensional stereophonic reproduction system.
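The amplitude panning principle mentioned above can be sketched in a few lines. The following is an illustrative implementation of the classical tangent panning law for the standard ±30° stereo setup; it is not code from this thesis, and the function name and angle convention are chosen here for illustration:

```python
import math

def tangent_law_gains(pan_deg, base_deg=30.0):
    """Stereo amplitude panning gains via the tangent law.

    pan_deg is the intended source angle (positive towards the left
    loudspeaker), base_deg half the loudspeaker base angle, i.e. 30
    degrees for the standard +/-30 degree stereo setup.  Returns
    (g_left, g_right), normalized to constant total power.
    """
    # Tangent law: tan(pan) / tan(base) = (gL - gR) / (gL + gR)
    t = math.tan(math.radians(pan_deg)) / math.tan(math.radians(base_deg))
    g_l, g_r = (1.0 + t) / 2.0, (1.0 - t) / 2.0
    norm = math.hypot(g_l, g_r)  # constant-power normalization
    return g_l / norm, g_r / norm

# A centered source feeds both loudspeakers equally ...
print(tangent_law_gains(0.0))   # -> (0.707..., 0.707...)
# ... while a source panned fully left only feeds the left loudspeaker.
print(tangent_law_gains(30.0))  # -> (1.0, 0.0)
```

The sweet-spot limitation follows directly from this construction: the gains only produce the intended phantom-source direction for a listener placed symmetrically between the loudspeakers.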
Advanced surround reproduction systems which overcome the sweet-spot and other
limitations of the stereophonic surround techniques have been developed, e. g. [Cam67,
KNOBH96, KN93, WA01]. The systems to be explicitly mentioned here are wave field
synthesis (WFS) [Ber88, Hul04] and (higher-order) ambisonics [Dan00, Ger85]. Both are
based on a solid physical foundation. WFS will be discussed in detail in Section 5.1.
In general, the driving signals for the loudspeakers are generated from signals captured
from the original sources, from geometrical information on the source locations, and from
information on the room acoustics of the recording room. This information may have
been recorded in an existing recording room (e. g. a concert hall) or it may have been
created artificially. Typically, hybrid approaches combining both are used in today's music
recording. The signal processing algorithms for the generation of the loudspeaker signals
are derived from fundamental psychoacoustic and acoustic principles. While the quality
of reproduction has increased considerably, there are still a number of open problems.
One of these, common to all systems discussed above, is the influence of the room where
the reproduction takes place, the listening room. The following section will illustrate the
influence of the listening room on spatial sound reproduction.

Figure 1.1: Simplified example that shows the effect of the listening room on the auralized wave field. The dashed lines from one virtual source to one exemplary listening position show the acoustic rays for the direct sound and one reflection off the side wall of the virtual recording room. The solid line from one loudspeaker to the listening position shows a possible reflection of the loudspeaker wave field off the wall of the listening room.

1.1 The Influence of the Listening Room

The influence of the listening room on a sound reproduction system and the reproduced
scene will first be illustrated in an intuitive fashion. For this purpose, a simple
reproduction scenario is considered in the sequel. Figure 1.1 illustrates this simplified example.
The mapping of an acoustic scene in a church (e. g. a singer performing in the choir) into
the listening room is shown exemplarily. For simplicity, the propagation of sound waves
is illustrated by acoustic rays in Fig. 1.1. The dashed lines in Fig. 1.1 from the
virtual source to one exemplary listening position show the acoustic rays for the direct
sound and several reflections off the side walls of the recording room. The loudspeaker
system in the listening room reproduces the direct sound and reflections in order to create
the desired spatial impression of the original scene. The theory behind nearly all of the
deployed methods assumes an anechoic listening room which does not exhibit any reflections.
However, this idealistic assumption is rarely met by typical listening rooms. The
solid line in Fig. 1.1 from one loudspeaker in the upper row to the listening position
illustrates a possible reflection of the wave field produced by one loudspeaker off the wall of
the listening room. These additional reflections caused by the listening room may impair
the desired spatial impression, as this simplified example intuitively illustrates.
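The ray picture of Fig. 1.1 can be made concrete with the image source model: mirroring a source at a rigid wall yields an image source whose direct path to the listener equals the length of the reflected path. The following minimal sketch uses made-up positions and is an illustration for this discussion, not code from the thesis:

```python
import math

def first_reflection(src, listener, wall_x, c=343.0):
    """Direct path and one side-wall reflection via the image source model.

    The wall is the plane x = wall_x; mirroring the source at that plane
    gives the image source whose direct path to the listener equals the
    reflected path.  Positions are (x, y) tuples in metres.  Returns the
    propagation delays (direct_s, reflection_s).
    """
    image = (2.0 * wall_x - src[0], src[1])  # mirror source at the wall
    r_direct = math.dist(src, listener)
    r_refl = math.dist(image, listener)
    return r_direct / c, r_refl / c

# Hypothetical loudspeaker at (1, 2), listener at (4, 2), wall at x = 0:
t_dir, t_ref = first_reflection((1.0, 2.0), (4.0, 2.0), 0.0)
print(f"direct: {1000 * t_dir:.2f} ms, reflection: {1000 * t_ref:.2f} ms")
```

The reflection always arrives later than the direct sound, and from a different direction; it is exactly this superimposed, delayed copy that may impair the intended spatial impression.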
The influence of the listening room on surround reproduction systems is a topic of active
research [Gri98, DZJR05, KS03, Vol98, VTB02, Vol96]. Since the acoustic properties of
the listening room and the reproduction system used may vary over a wide range, no generic
conclusion can be given for the perceptual influence of the listening room. However, the
reflections imposed by the listening room will influence psychoacoustic properties
of the reproduced scene, e. g. by degrading the directional localization or by coloring
the sound. Especially dominant early reflections seem to impair the
desired spatial impression. Another effect of the listening room on low-frequency
reproduction is the build-up of low-frequency resonances. These resonances negatively
influence the perception of short sound events.
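The build-up of these resonances is governed by the eigenfrequencies of the room, which for an idealized rectangular room with rigid walls (treated in Section 2.7.2) follow a closed-form expression. A brief sketch, with made-up room dimensions, written for illustration here:

```python
import itertools
import math

def room_mode_frequencies(lx, ly, lz, f_max, c=343.0):
    """Eigenfrequencies of a rectangular room with rigid walls.

    f = (c/2) * sqrt((nx/lx)^2 + (ny/ly)^2 + (nz/lz)^2)
    for non-negative integer mode orders (nx, ny, nz) != (0, 0, 0).
    Returns the modes below f_max, sorted by frequency.
    """
    n_max = int(2.0 * f_max * max(lx, ly, lz) / c) + 1
    modes = []
    for nx, ny, nz in itertools.product(range(n_max + 1), repeat=3):
        if (nx, ny, nz) == (0, 0, 0):
            continue
        f = 0.5 * c * math.sqrt((nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2)
        if f <= f_max:
            modes.append((f, (nx, ny, nz)))
    return sorted(modes)

# Lowest modes of a hypothetical 5 m x 4 m x 3 m listening room:
for f, n in room_mode_frequencies(5.0, 4.0, 3.0, 60.0):
    print(f"{f:6.1f} Hz  {n}")
```

At low frequencies these modes are sparse and well separated, which is why individual resonances are clearly audible there, whereas at higher frequencies the modal density increases rapidly.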
In general, a reverberant listening room superimposes its characteristics on the desired
impression of the recorded room. Listening room compensation aims at eliminating or
reducing the effect of the listening room on the auralized scene. The following section
will introduce listening room compensation systems and briefly review common approaches.

1.2 Listening Room Compensation Systems

There are two basic classes of approaches to listening room compensation: (1) passive and
(2) active listening room compensation. Passive listening room compensation applies
acoustic insulation materials to the listening room as a countermeasure against its reflections.
However, it is well known that acoustic insulation becomes impractical and costly beyond
even a rather modest level of sound absorption. This holds especially for low frequencies.
Additionally, the effect of this countermeasure is limited by cost and room design
considerations. Thus, in practical setups, passive room compensation alone cannot provide
sufficient suppression of listening room reflections. A second possibility for compensating
listening room reflections is to use concepts from active control to perform the desired
compensation. Among various variations, there are two basic approaches to be mentioned
here: (1) active control of the acoustic impedance at the walls of the listening room and
(2) utilization of the reproduction system itself. The first class of approaches tries to actively
influence the impedance of a wall in order to achieve free-field conditions [GKR85, OM53].
The second class exploits synergies with the reproduction system in order to control the
wave field within the listening area. Approaches of the second class will be
reviewed briefly in the following.
Overviews of classical single-channel room compensation approaches and their limitations
can be found in [Fie01, Fie03, HM04, Mou94]. The single-channel approaches analyze the
wave field reproduced by one loudspeaker at only one position. Common to most active
room compensation approaches is the basic idea to pre-equalize the loudspeaker driving
signal using a suitable compensation filter computed from an analysis of the reproduced
wave field. However, there are three fundamental problems. The first problem is that
the compensation filters have to be computed adaptively due to the time-varying nature
of the room characteristics. For example, a temperature change in the listening
room alters the speed of sound and hence the acoustic properties [OYS+ 99]. A
wide variety of problems are related to the algorithms used for the adaptation of the
compensation filters. The second problem concerns the optimal compensation filter itself:
it is the inverse of the impulse response from the loudspeaker to the measured position,
but room impulse responses are, in general, non-minimum phase [NA79], which prohibits
the calculation of an exact causal and stable inverse filter. The third problem is that the
compensation filter is optimal only for the measured position. As a result, the performance
in terms of achieved compensation will decrease with increasing distance from the measured
position [TW02, TW03, BHK03, NOBH95].
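The second problem can be illustrated numerically. The sketch below is an illustration written for this overview rather than an algorithm from the thesis: it computes a least-squares inverse filter for a toy non-minimum-phase response (a single zero outside the unit circle) and shows that the achievable error depends strongly on an artificial modeling delay, while an exact causal inverse (delay 0) is out of reach:

```python
import numpy as np

def ls_inverse(h, inv_len, delay):
    """Least-squares inverse filter g for the impulse response h.

    Minimizes || conv(h, g) - d ||^2, where d is a unit impulse delayed
    by `delay` samples (the modeling delay).  Returns (g, residual).
    """
    n = len(h) + inv_len - 1
    # Convolution matrix: H @ g == np.convolve(h, g)
    H = np.zeros((n, inv_len))
    for i in range(inv_len):
        H[i:i + len(h), i] = h
    d = np.zeros(n)
    d[delay] = 1.0
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return g, np.linalg.norm(H @ g - d)

# Toy non-minimum-phase response: its zero lies at z = -2, outside the
# unit circle, so no causal, stable exact inverse exists.
h = np.array([0.5, 1.0])
for delay in (0, 8, 15):
    _, err = ls_inverse(h, 16, delay)
    print(f"modeling delay {delay:2d}: residual {err:.6f}")
```

With zero delay the residual stays large no matter how long the inverse filter is, while a sufficient modeling delay lets the filter approximate the anticausal exact inverse and drives the error towards zero.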
In order to overcome the last two problems, multichannel active room compensation
systems have been proposed by several authors. Here, the acoustic properties of the
listening room are measured from one or more loudspeakers to one or more microphone
positions. In the multichannel case, the calculation of exact inverse filters is possible in
most practical situations [MK88], and the undesired position dependency of active room
compensation may be alleviated as well. However, most classical multichannel approaches
utilize only a limited number of loudspeakers and analysis positions. Hence, they are
neither able to provide sufficient control over the wave field nor to sufficiently analyze
the reproduced wave field. As a result, the influence of the listening room will be
compensated mainly at the analyzed positions, with the potential of severe artifacts away from
these positions. These approaches will therefore be termed multi-point compensation
approaches.
Advanced reproduction systems like WFS and higher-order Ambisonics may provide an
improvement in terms of control over the reproduced wave field. However, they also
require a large number of reproduction channels. A sufficient analysis of the reproduced
wave field will additionally require a large number of analysis channels. Unfortunately,
the adaptation of the compensation filters is subject to fundamental problems for a
scenario with many playback and analysis channels [SMH95]. The following requirements
for an advanced listening room compensation system can be deduced from the problems
of classical room compensation systems discussed above.
Advanced room compensation methods should:
1. derive analysis signals from the entire listening area, not only from selected points,
2. be based on a spatial reproduction system which provides control over the acoustic
wave field inside the entire listening area, and


3. use an advanced multichannel adaptation algorithm that overcomes the fundamental
problems of massive multichannel adaptation.
This work will derive, discuss and evaluate a novel efficient approach to active listening
room compensation for spatial audio systems that yields an extended compensated area
compared to the multi-point approaches. In the sequel a short overview of this work will
be given.

1.3 Overview of this thesis

This work is organized as follows: Chapter 2 introduces the fundamentals of sound
propagation. These fundamentals will serve as preliminaries for the remaining chapters. In
detail, the wave equation, its solutions and orthogonal wave field decompositions will be
discussed. Chapter 3 deals with the first requirement mentioned above and introduces
Fourier based analysis of acoustic wave fields. The Fourier analysis of generic
multidimensional signals will be specialized to acoustic fields. This results in the plane wave
decomposition of acoustic wave fields, which provides a powerful tool for the wave field
analysis (WFA) of the reproduced wave field. Chapter 4 discusses the fundamentals of
listening room compensation. First, a generic theory of sound reproduction systems will
be developed that provides sufficient control over the reproduced wave field. Then the
influence of the listening room and the adaptive computation of the compensation filters
will be illustrated. This will be followed by an analysis of the fundamental problems
of classic adaptation algorithms applied to a large number of reproduction and analysis
channels. Spatio-temporal signal and system transformations will then be proposed as a
solution to these fundamental problems. This leads to an improved listening room
compensation system that fulfills all of the above stated requirements. Chapter 5 presents
WFS as a particular implementation of a spatial sound reproduction system, discusses
the artifacts of WFS and WFA and introduces an improved listening room compensation
system for WFS. Results from simulated and measured acoustic environments will
be presented. Chapter 6 will finally give a summary of this work and will draw some
conclusions.

Chapter 2
Fundamentals of Sound Propagation
In the following chapter the fundamentals of sound propagation will be introduced. These
fundamentals will serve as prerequisites for the discussion of wave field analysis, sound
reproduction and listening room compensation in the remainder of this work. The chapter
is organized as follows: First the wave equation and its homogeneous solutions are derived,
followed by a discussion of the inhomogeneous wave equation and reasonable choices for
boundary conditions. Then the solution to the inhomogeneous wave equation with respect
to arbitrary boundary conditions is presented. Finally, the influence of boundaries with
simple geometries is discussed.

2.1 The Acoustic Wave Equation

The acoustic wave equation provides the mathematical foundation of sound propagation
through fluids. This section briefly reviews the acoustic wave equation. For an in-depth
discussion please refer to [Wil99, Pie91, Bla00, MF53a, MF53b].

2.1.1 Derivation of the Homogeneous Acoustic Wave Equation

In order to derive the lossless acoustic wave equation the following (typical) basic
assumptions are made:
1. the propagation medium is homogeneous,
2. the propagation medium is quiescent,
3. the propagation medium can be characterized as an ideal gas,
4. the state changes in the gas can be modeled as adiabatic processes, and
5. the pressure and density perturbations due to wave propagation are small compared
to the static pressure p_0 and the static density \rho_0.


The first condition assures that the relevant parameters of the medium are independent of
the position, the second condition assures that the parameters are independent of time and
that there is no gross movement of the medium. As a result of assumption (3) the laws for
ideal gases apply to the medium, assumption (4) ensures that there is no energy exchange
in form of heat conduction with the medium (no propagation losses), and assumption (5)
ensures that the field variables and medium characteristics can be linearized around an
operating point. The assumptions (1)-(5) are reasonable for typical scenarios
where acoustic wave propagation in air is considered.
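To get a feeling for the smallness stated in assumption (5), consider a quick back-of-the-envelope check. The 94 dB SPL reference value and the standard atmospheric pressure used below are common textbook values, not taken from this thesis:

```python
# Loudness of an everyday sound vs. the static pressure: even a fairly
# loud sound of 94 dB SPL corresponds to an RMS sound pressure of only
# about 1 Pa, while the static pressure is about p0 = 101325 Pa.
p_ref = 20e-6                       # reference sound pressure in Pa (0 dB SPL)
spl = 94.0                          # sound pressure level in dB
p_rms = p_ref * 10 ** (spl / 20)    # about 1 Pa
p0 = 101325.0                       # standard atmospheric pressure in Pa

print(round(p_rms, 3))              # 1.002
print(p_rms / p0 < 1e-4)            # True: the perturbation is tiny
```

The pressure perturbation is roughly five orders of magnitude below the static pressure, which is why the linearization in assumption (5) is justified for typical acoustic scenarios.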
The wave equation can be derived from two fundamental physical principles, (I) the
conservation of mass and (II) the momentum equation. The first principle describes the
mass balance in an infinitesimal volume element. Its mathematical formulation is given
as follows [Bla00]

    \frac{\partial \rho}{\partial t} + \nabla \cdot \bigl( \rho \, v(x, t) \bigr) = 0 ,    (2.1)

where \rho denotes the density of the propagation medium, \nabla the nabla operator and v(x, t)
the acoustic particle velocity at the position x and the time t. The time derivative of the
density in Eq. (2.1) will be expressed by the acoustic pressure p(x, t) in the following.
However, this requires consideration of the characteristics of the propagation medium. Utilizing
assumptions (3)-(5) the desired relation between the time derivative of the density and
the acoustic pressure can be found as [Bla00]
    \frac{\partial p(x, t)}{\partial t} = c^2 \, \frac{\partial \rho}{\partial t} ,    (2.2)

where c denotes the speed of sound. The speed of sound is in general dependent on the
characteristics of the propagation medium. For air, it mainly depends on the temperature
and the humidity. A value which reflects typical conditions (T = 20 °C, 50% relative
humidity) for wave propagation in air is c = 343 m/s. Using Eq. (2.2) to eliminate the temporal
derivative of the density in Eq. (2.1) and applying assumption (5) yields

    \frac{\partial p(x, t)}{\partial t} = -\rho_0 \, c^2 \, \nabla \cdot v(x, t) ,    (2.3)

where \rho_0 denotes the static density of air.
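The dependence of the speed of sound on the medium mentioned above can be sketched with the ideal-gas estimate c = \sqrt{\gamma R_s T}, which follows from assumptions (3) and (4). The numeric constants below are standard values for dry air and are not taken from this thesis:

```python
import math

def speed_of_sound(temp_celsius):
    """Ideal-gas estimate of the speed of sound in dry air.

    c = sqrt(gamma * R_s * T) with gamma = 1.4 (adiabatic index,
    cf. assumption (4)) and R_s = 287.05 J/(kg K) (specific gas
    constant of dry air, cf. assumption (3)).
    """
    gamma = 1.4
    R_s = 287.05
    T = temp_celsius + 273.15   # absolute temperature in Kelvin
    return math.sqrt(gamma * R_s * T)

print(round(speed_of_sound(20.0), 1))   # about 343 m/s at 20 deg C
```

The estimate reproduces the value c = 343 m/s quoted in the text for typical conditions; the (weaker) humidity dependence is not modeled by this sketch.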


The second underlying physical principle that forms the wave equation is the momentum
equation. The momentum equation relates the force applied to an infinitesimal volume
(mass) element to the acceleration of that volume element as a result of the force. The
resulting equation is also known as Euler's equation. It is given as follows [Bla00]
    \rho_0 \, \frac{\partial v(x, t)}{\partial t} + \nabla p(x, t) = 0 .    (2.4)

Euler's equation together with Eq. (2.3) comprises the mathematical formulation of
acoustic wave propagation in air, if assumptions (1)-(5) are met. Equation (2.3) can be used

to eliminate the acoustic particle velocity v(x, t) from Euler's equation (2.4). The result
is the well-known homogeneous acoustic wave equation

    \nabla^2 p(x, t) - \frac{1}{c^2} \, \frac{\partial^2 p(x, t)}{\partial t^2} = 0 .    (2.5)

Although various other forms of the acoustic wave equation exist in the literature, this is
the most common formulation used. The \nabla^2 operator is also referred to as the Laplace
operator \Delta = \nabla^2. The wave equation, as introduced by Eq. (2.5), covers all effects that are
inherent to the wave nature of sound, e. g. diffraction. As noted before, assumptions (1)-(5)
have to hold reasonably well in order for the wave equation to be a mathematical model for
the physical process of wave propagation. If one or more of these assumptions do not hold,
other forms of the wave equation have to be used. Examples are given e. g. in [Pie91, Bla00, Zio95].
The formulation of the wave equation given by Eq. (2.5) is independent of the particular
coordinate system used for the position vector x. The Laplace operator in the wave
equation (2.5) has to be specialized to the particular coordinate system used. Appendix B
introduces the coordinate systems and the respective operators employed within this thesis.
An alternative form of the wave equation can be derived by applying a Fourier
transformation [GRS01, HV99] with respect to the time t of the acoustic pressure p(x, t) and the
particle velocity v(x, t). The Fourier transform pair of the acoustic pressure is given as

    P(x, \omega) = F_t\{p(x, t)\} = \int_{-\infty}^{\infty} p(x, t) \, e^{-j\omega t} \, dt ,    (2.6a)

    p(x, t) = F_t^{-1}\{P(x, \omega)\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} P(x, \omega) \, e^{j\omega t} \, d\omega ,    (2.6b)

where \omega = 2\pi f denotes the temporal (radial) frequency and F_t\{\cdot\} the Fourier
transformation with respect to the time t. Time-domain Fourier transformed signals are denoted
by capital letters within this thesis. Please refer to Section 3.1 for a more detailed
discussion of the Fourier transformation. Introducing P(x, \omega) into the wave equation (2.5)
and applying the differentiation theorem [GRS01] of the Fourier transformation yields
the wave equation formulated in the frequency domain
    \nabla^2 P(x, \omega) + \underbrace{\left(\frac{\omega}{c}\right)^2}_{k^2} P(x, \omega) = 0 .    (2.7)

This form is known as the Helmholtz equation. In general, the term \omega/c is abbreviated
by the acoustic wavenumber k

    k^2 = k^2(\omega) = \left(\frac{\omega}{c}\right)^2 ,    (2.8)


where the wavenumber k is assumed to be nonnegative (k \ge 0) within this work.
Equation (2.8) is also termed the dispersion relation. It gives a connection between the acoustic
wavenumber k and the temporal frequency \omega. It will be assumed in the sequel that
the wavenumber can be expressed by the temporal frequency using Eq. (2.8) whenever
appropriate. This holds especially for variables depending on the temporal frequency
which are defined by equations including only the wavenumber k = k(\omega). It will be
shown in Section 2.2 that the dispersion relation allows to derive a connection between
the temporal and spatial frequency of a monochromatic plane wave.
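As a minimal numeric illustration of the dispersion relation (2.8), the wavenumber and the corresponding spatial period of a monochromatic wave can be computed directly from the temporal frequency. The chosen frequency of 1000 Hz is an arbitrary example value:

```python
import math

c = 343.0                    # speed of sound in m/s
f = 1000.0                   # temporal frequency in Hz
omega = 2 * math.pi * f      # temporal (radial) frequency
k = omega / c                # acoustic wavenumber, Eq. (2.8)
lam = 2 * math.pi / k        # spatial period (wavelength)

print(round(k, 2))           # about 18.32 rad/m
print(round(lam, 3))         # 0.343 m, i.e. lambda = c / f
```

The wavelength \lambda = 2\pi/k = c/f is the spatial counterpart of the temporal period 1/f, which is exactly the temporal-to-spatial coupling the dispersion relation expresses.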

2.1.2 Two-dimensional Wave Fields

Real-world acoustic wave fields depend on three spatial and one temporal coordinate.
However, acoustic wave fields depending only on two spatial coordinates instead of three
are often suitable for many problems within this work. The transition from the
three-dimensional description of a wave field P(x, \omega) to a two-dimensional one is performed by
assuming that the wave field exhibits no dependency on one of the three spatial
coordinates. A wave field that fulfills this condition will be termed a two-dimensional wave
field. Please note that this term covers three-dimensional wave fields that are independent
of one spatial variable as well as truly two-dimensional wave fields [MF53a, Wil99].
So far the choice of the coordinate from which a two-dimensional wave field is assumed
to be independent was left open. It will be assumed in the remainder of this work that
a two-dimensional wave field does not depend on the z-coordinate. This leads to the
following condition in Cartesian coordinates
    P_C(x, y, z, \omega) = P_C(x, y, \omega) .    (2.9)

2.1.3 Generic Solution of the Wave Equation

The homogeneous acoustic wave equation (2.5) describes the acoustic wave propagation
for the free-field case (no boundaries are present) of a source-free volume. In order to
calculate the wave field for generic scenarios, further information about the boundaries
and the sources is required. Thus, the exact solution of the wave equation depends on
1. the initial conditions,
2. the acoustic sources present, and
3. the boundary conditions.
The discussion of initial conditions will be neglected within this thesis. It will be assumed
that the acoustic pressure and velocity can be set to zero as initial condition. The
derivation of the wave equation (2.5) was not bound to a specific coordinate system used for the


description of the wave fields. Depending on the geometry of the problem it is convenient
to use different representations of acoustic wave fields in different coordinate systems.

2.2 Solutions of the Homogeneous Wave Equation in Cartesian Coordinates

This section will consider the free-field solutions of the homogeneous wave equation (2.5)
formulated in Cartesian coordinates. Section B.1 introduces the Cartesian coordinate
system. All vectors and functions evaluated in this coordinate system will be denoted by
the index C attached to the respective variables.
A well-known solution to the wave equation formulated in Cartesian coordinates is the
solution of d'Alembert [MF53a]

    p_C(x_C, t) = f(ct - n_{C,0}^T x_C) ,    (2.10)

where f(\cdot) denotes an arbitrary function and n_{C,0} a normal vector with |n_{C,0}| = 1. The
proof that the above equation provides a solution of the wave equation can be done
straightforwardly by introducing Eq. (2.10) into Eq. (2.5) and using the definition of the Laplace
operator in Cartesian coordinates. In order to interpret Eq. (2.10) the substitution
\xi(x_C, t) = ct - n_{C,0}^T x_C is used in the following. If the position vector x_C is constant,
e. g. \xi(0, t) = ct, then \xi(t) is proportional to the time t by the speed of sound. If the
time is constant, e. g. \xi(x_C, 0) = -n_{C,0}^T x_C, then \xi(x_C) describes a plane. The vector n_{C,0} is
the normal vector of this plane. However, if \xi is constant, e. g. \xi(x_C, t) = 0, then \xi(x_C, t)
describes for every time instant t a plane moving with the speed c into the direction given
by the vector n_{C,0}. Thus, Eq. (2.10) describes propagating wavefronts with the shape of
f(\cdot) that propagate with the speed of sound into the direction given by n_{C,0}. This type of
wave is termed a plane wave. Figure 2.1 illustrates the result for an arbitrarily shaped
plane wave traveling in two-dimensional space.
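The claim that Eq. (2.10) solves the wave equation for any pulse shape f(\cdot) can also be checked numerically. The following sketch uses normalized units (c = 1), a Gaussian pulse and central finite differences; all parameter values are arbitrary choices for illustration:

```python
import numpy as np

c = 1.0                      # normalized speed of sound
n = np.array([0.6, 0.8])     # unit normal vector (|n| = 1)
sigma = 0.2                  # width of the Gaussian pulse

def f(xi):
    # arbitrary smooth pulse shape
    return np.exp(-xi**2 / (2 * sigma**2))

def p(x, y, t):
    # d'Alembert solution, Eq. (2.10): f(ct - n^T x)
    return f(c * t - (n[0] * x + n[1] * y))

h = 1e-3                     # finite-difference step
x, y, t = 0.05, -0.1, 0.0

# central second differences in x, y and t
pxx = (p(x + h, y, t) - 2 * p(x, y, t) + p(x - h, y, t)) / h**2
pyy = (p(x, y + h, t) - 2 * p(x, y, t) + p(x, y - h, t)) / h**2
ptt = (p(x, y, t + h) - 2 * p(x, y, t) + p(x, y, t - h)) / h**2

# left-hand side of the wave equation (2.5); vanishes up to
# discretization error for the d'Alembert solution
residual = pxx + pyy - ptt / c**2
print(abs(residual) < 1e-3)  # True
```

The residual is limited only by the O(h^2) accuracy of the finite differences, consistent with Eq. (2.10) being an exact solution.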
Performing a temporal Fourier transformation of Eq. (2.10) according to Eq. (2.6a) yields

    P_C(x_C, \omega) = \underbrace{\frac{1}{c} F\!\left(\frac{\omega}{c}\right)}_{\hat{P}(\omega)} e^{-j \frac{\omega}{c} n_{C,0}^T x_C} .    (2.11)

Using the abbreviation \hat{P}(\omega) for the frequency dependent parts, as denoted above, and
k_{C,0} = (\omega/c) \, n_{C,0}, allows to derive a more compact form of the frequency-domain
d'Alembert solution as

    P_C(x_C, \omega) = \hat{P}(\omega) \, e^{-j k_{C,0}^T x_C} .    (2.12)


Figure 2.1: Illustration of d'Alembert's solution of the wave equation in Cartesian
coordinates. An arbitrarily shaped plane wave f(\cdot) traveling in two-dimensional space is
shown for a fixed time t. The gray plane denotes a plane of constant value \xi = 0 that
moves with the speed of sound c into the direction given by n_{C,0}.
Introducing Eq. (2.12) into the homogeneous Helmholtz equation (2.7) yields

    k^2 = k_{x,0}^2 + k_{y,0}^2 + k_{z,0}^2 = |k_{C,0}|^2 .    (2.13)

Equation (2.13) states that the acoustic wavenumber k is equal to the length of the vector
k_{C,0}. Thus, k_{C,0} will be denoted as the wave vector of a plane wave in the following. The
acoustic dispersion relation (2.8) relates the acoustic wavenumber k to the frequency \omega.
Hence, Eq. (2.13) relates the length of the wave vector of a plane wave to its (temporal)
frequency. Thus, each wave vector k_{C,0} belongs to a specific (temporal) frequency \omega_0. A
signal in the time domain of the form e^{j\omega_0 t} is called a monofrequent or monochromatic
signal. The constant \omega_0 = 2\pi f_0 denotes the angular frequency, where f_0 is the number
of cycles per second the signal exhibits. Due to these considerations, the term e^{-j k_{C,0}^T x_C}
will be denoted as a monochromatic plane wave. The wave vector k_{C,0} of a plane wave can
be interpreted as a vector consisting of the spatial frequencies k_{C,0} = [k_{x,0} \; k_{y,0} \; k_{z,0}]^T,
where each spatial frequency denotes 2\pi times the number of cycles per meter of the
monochromatic plane wave in x-, y- and z-direction.
The elements of the wave vector are not independent from each other for a fixed frequency
\omega_0 (monochromatic plane wave). Three parameters of the set (k_{x,0}, k_{y,0}, k_{z,0}, \omega_0) instead
of all four are sufficient to characterize a monochromatic plane wave. Equation (2.13)
ensures that plane waves, as described by Eq. (2.12), are a solution to the wave equation


Figure 2.2: Pressure field of a monochromatic plane wave as given by Eq. (2.12). The
illustrated plane wave has a frequency of f_0 = 1000 Hz. The gray level denotes the
amplitude. The plane wave is traveling in the x-y-plane for ease of illustration.
formulated in Cartesian coordinates.
Figure 2.2 shows the pressure field of a monochromatic plane wave traveling in the
x-y-plane of a Cartesian coordinate system. The plane wave has a frequency of f_0 = 1000 Hz.
The wave vector, which points into the direction of wave propagation, is also shown. As
stated by the inverse Fourier transformation (2.6b), arbitrarily shaped plane waves can
be generated by a superposition of monochromatic plane waves with different frequency
dependent weights \hat{P}(\omega). Of special interest in the remainder of this work are plane waves
with the shape of a Dirac pulse f(ct - n_C^T x_C) = \delta(ct - n_C^T x_C). As the temporal Fourier
transformation of a Dirac pulse equals 1, the weights \hat{P}(\omega) for this special case have to
be chosen as \hat{P}(\omega) = 1/c.
Please note that the condition (2.13) also includes the special case of evanescent waves.
For a detailed treatment on evanescent waves please refer to [Wil99].

2.2.1 Plane Wave Expansion

It was shown in the previous section that the homogeneous wave equation is fulfilled by
d'Alembert's solution (2.10) for all possible choices of n_{C,0}. Thus, arbitrary solutions of
the wave equation can be expressed as a superposition of plane waves traveling into all
possible directions in three-dimensional space. Using the frequency domain formulation
of a plane wave (2.12), an arbitrary wave field can be expressed as a superposition of all

possible wave vectors in Eq. (2.12). However, as derived in the previous section, not all
elements of the wave vector can be chosen freely for one particular frequency \omega. Hence,
an arbitrary wave field can be expressed as follows [Wil99]

    P_C(x_C, \omega) = \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \bar{P}_C(k_x, k_y, \omega) \, e^{-j(k_x x + k_y y + k_z z)} \, dk_x \, dk_y ,    (2.14)

where \bar{P}_C(k_x, k_y, \omega) denotes the amplitude and phase of the plane waves and

    k_z^2 = k_z^2(k_x, k_y, \omega) = k^2 - k_x^2 - k_y^2 .    (2.15)

Equation (2.14) will be termed the plane wave expansion and the coefficients \bar{P}_C(k_x, k_y, \omega)
the plane wave expansion coefficients in the following. The plane wave expansion
coefficients are typically derived by introducing the expansion (2.14) into the wave equation
considering the particular problem. A more generic approach for their derivation, which
is based on the concept of a transformation, will be given in Section 3.3.
As the sign of the wavenumber k_z is ambiguous due to the square in Eq. (2.15), the
traveling direction of the plane waves in the z-direction is not included in the plane wave
expansion coefficients \bar{P}_C(k_x, k_y, \omega). There are two possible approaches to overcome this
problem. The first approach is to assume that k_z is positive. As a result Eq. (2.14) has to
be limited to the upper half space z > 0 to be unambiguous. The second approach is to
include the sign in the expansion coefficients. Within this thesis this will be done by using
two sets of expansion coefficients \bar{P}_C^{(1)}(k_x, k_y, \omega) and \bar{P}_C^{(2)}(k_x, k_y, \omega), where the upper index
(1) denotes positive k_z and (2) negative k_z values. The plane wave contributions can be
regarded as incoming or outgoing plane waves with respect to the plane z = 0. If k_z is
positive, then the waves enter the half space z > 0 through the plane z = 0 and they
will be termed incoming waves. Otherwise, if k_z is negative, they will be termed
outgoing waves. Accordingly, the expansion coefficients are denoted as incoming (1)
or outgoing (2) plane wave expansion coefficients.

2.3 Solutions of the Homogeneous Wave Equation in Cylindrical Coordinates

The following section considers the free-field solutions of the homogeneous wave
equation (2.5) formulated in cylindrical coordinates. Section B.3 introduces the cylindrical
coordinate system as used within this thesis. All vectors and functions evaluated in this
coordinate system will be denoted by the index Y. This section is mainly based on the
work of [Wil99].
The wave equation, as given by Eq. (2.5), can be specialized straightforwardly to the case
of cylindrical coordinates

    \nabla^2 p_Y(x_Y, t) - \frac{1}{c^2} \, \frac{\partial^2 p_Y(x_Y, t)}{\partial t^2} = 0 ,    (2.16)


where x_Y = [\alpha \; r \; z]^T denotes the position in cylindrical coordinates. A standard technique
often utilized to solve partial differential equations (PDE) of this type is the separation
of variables. Here it is assumed that the solution of a PDE can be written in terms of a
product of functions which are only dependent on one variable. Applying this principle to
the wave equation (2.16) states that the solution can be written as a product of functions
which each depend on only one of the three spatial variables \alpha, r, z and the time t

    p_Y(\alpha, r, z, t) = p_\alpha(\alpha) \, p_r(r) \, p_z(z) \, p_t(t) .    (2.17)

Introducing this solution into the wave equation (2.16) results in three ordinary differential
equations of second order for p_\alpha(\alpha), p_z(z) and p_t(t). The solutions to these are given as
follows [Wil99]

    p_\alpha(\alpha) = p_{\alpha,1}(\nu) \, e^{j\nu\alpha} + p_{\alpha,2}(\nu) \, e^{-j\nu\alpha} ,    (2.18a)

    p_z(z) = p_{z,1}(k_z) \, e^{-j k_z z} + p_{z,2}(k_z) \, e^{j k_z z} ,    (2.18b)

    p_t(t) = \hat{P}(\omega) \, e^{j\omega t} ,    (2.18c)

where p_{\alpha,i}(\nu), p_{z,i}(k_z) and \hat{P}(\omega) denote arbitrary constants. As the angular part p_\alpha(\alpha) is
periodic in \alpha with a period of 2\pi, the constant \nu has to be an integer (\nu \in \mathbb{Z}). The
solution to the radial part p_r(r) is given by Bessel's differential equation
    \frac{d^2 p_r(r)}{dr^2} + \frac{1}{r} \, \frac{d p_r(r)}{dr} + \left( k_r^2 - \frac{\nu^2}{r^2} \right) p_r(r) = 0 ,    (2.19)

where k_r^2 = k^2 - k_z^2 (see Appendix B.3). The solutions to Bessel's differential equation
are given by the Bessel functions of first, second and third kind [AS72]. A traveling
wave solution is given in terms of Hankel functions (Bessel functions of third kind) as
follows [Wil99]

    p_r(r) = p_{r,1}(k_r) \, H_\nu^{(1)}(k_r r) + p_{r,2}(k_r) \, H_\nu^{(2)}(k_r r) ,    (2.20)

where H_\nu^{(1),(2)}(\cdot) denotes the \nu-th order Hankel function of first/second kind. The general
solution to the wave equation in cylindrical coordinates can be derived by combining the
solutions given by Eq. (2.18) and Eq. (2.20) together as required by Eq. (2.17). Discarding
the constants, the solution can be written to be proportional to

    p_Y(x_Y, t) \sim e^{j\nu\alpha} \, H_\nu^{(1),(2)}(k_r r) \, e^{j k_z z} \, e^{j\omega t} .    (2.21)

This result consists of a product with three exponential parts and one Hankel function.
Each exponential part depends on one parameter only, where \nu can be interpreted as the
angular frequency, k_z as the wavenumber (or spatial frequency) in the z-direction and \omega as
the temporal frequency. The sign of the angular frequency denotes the rotation direction,
while the sign of k_z denotes the propagation direction of waves in the z-direction. In order


to find a similar interpretation for the radial part, the properties of Hankel functions have
to be investigated. In the far-field (k_r r \gg 1) the Hankel functions can be approximated
as follows [AS72]

    H_\nu^{(1)}(k_r r) \approx \sqrt{\frac{2}{\pi k_r r}} \; e^{j(k_r r - \nu\frac{\pi}{2} - \frac{\pi}{4})} ,    (2.22a)

    H_\nu^{(2)}(k_r r) \approx \sqrt{\frac{2}{\pi k_r r}} \; e^{-j(k_r r - \nu\frac{\pi}{2} - \frac{\pi}{4})} .    (2.22b)
These approximations of the Hankel functions can be interpreted as incoming/outgoing
radial waves. Thus, the Hankel function of first kind H_\nu^{(1)}(k_r r) can be interpreted as
an incoming radial wave contribution and the Hankel function of second kind H_\nu^{(2)}(k_r r) as
an outgoing radial wave contribution. The parameter k_r can then be understood as a radial
wavenumber. Figure 2.3 illustrates the incoming/outgoing cylindrical waves for \nu = 0
and k_z = 0 for different time instants. The traveling direction of the radial waves can be
seen clearly when following the wave fronts over time.
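Both properties of the radial basis functions, that H_\nu^{(1)}(k_r r) solves Bessel's differential equation (2.19) and that the far-field approximation (2.22a) becomes accurate for k_r r \gg 1, can be verified with scipy.special; the parameter values below are arbitrary:

```python
import numpy as np
from scipy.special import hankel1, h1vp

nu = 2            # angular order
kr = 18.0         # radial wavenumber in rad/m
r = 1.5           # radius in m
z = kr * r

# Bessel's differential equation (2.19), using the chain rule d/dr = kr d/dz
H = hankel1(nu, z)
dH = kr * h1vp(nu, z, 1)          # first derivative with respect to r
d2H = kr**2 * h1vp(nu, z, 2)      # second derivative with respect to r
residual = d2H + dH / r + (kr**2 - nu**2 / r**2) * H
print(abs(residual) < 1e-10)      # True: H_nu^(1)(kr r) solves Eq. (2.19)

# far-field approximation (2.22a) for kr r >> 1
z_far = 1000.0
exact = hankel1(nu, z_far)
approx = np.sqrt(2 / (np.pi * z_far)) \
    * np.exp(1j * (z_far - nu * np.pi / 2 - np.pi / 4))
print(abs(exact - approx) / abs(exact) < 0.01)  # relative error below 1 %
```

The far-field error shrinks roughly like 1/(k_r r), so the plane-wave-like interpretation of the radial basis functions only holds sufficiently far from the origin.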

2.3.1 Cylindrical Harmonics Expansion

It was shown for the wave equation formulated in Cartesian coordinates that it is possible
to express arbitrary solutions as a superposition of elementary solutions. In this case these
solutions were plane waves. The same principle can be applied when using cylindrical
coordinates. The elementary solutions using these coordinates are given by Eq. (2.21).
According to the term spherical harmonics used for the elementary solutions of the
wave equation in spherical coordinates [Wil99, Pie91, Bla00], the solutions in cylindrical
coordinates will be termed cylindrical harmonics in the remainder of this thesis. The
general solution of the wave equation (2.16) is then given as a superposition of elementary
solutions for all possible parameters \nu and k_z. Thus, in the frequency domain the general
solution can be given as
solution can be given as
Z

1 X j (1)
PY (xY , ) =
e
P (, kz , ) H(1)(kr r) ejkz z dkz
2 =

1 X j (2)
+
e
P (, kz , ) H(2)(kr r) ejkz z dkz .
2 =

(2.23)

The infinite integral over k_z can be interpreted as a spatial Fourier integral, the infinite
sum over \nu as a Fourier series. Please refer to Section 3.1.1 and Section 3.1.4 for details
on spatial Fourier transformations and Fourier series. The coefficients \breve{P}^{(1),(2)}(\nu, k_z, \omega)
are termed cylindrical harmonics expansion coefficients in the following and will be
denoted by a breve over the respective variable. It was shown in the previous section
(see e. g. Fig. 2.3) that H_\nu^{(1)}(k_r r) belongs to an incoming (converging) and H_\nu^{(2)}(k_r r)


Figure 2.3: Illustration of an incoming/outgoing cylindrical Dirac shaped wave as given
by Eq. (2.21) for \nu = 0 and k_z = 0, shown at the time instants t = 0 ms, t = 1.6 ms and
t = 2.8 ms. The left column shows an incoming wave, the right column an outgoing wave.
Only the x-y-plane is shown for ease of illustration.


to an outgoing (diverging) cylindrical wave. Thus, the expansion coefficient \breve{P}^{(1)}(\nu, k_z, \omega)
describes the incoming wave field, whereas \breve{P}^{(2)}(\nu, k_z, \omega) describes the outgoing wave field.
The total wave field is given by Eq. (2.23) as a superposition of incoming and outgoing
contributions

    P_Y(x_Y, \omega) = P_Y^{(1)}(x_Y, \omega) + P_Y^{(2)}(x_Y, \omega) .    (2.24)

The cylindrical harmonics expansion coefficients are typically derived by introducing the
expansion (2.23) into the wave equation considering the particular problem. However, a
generic formula for the derivation of the cylindrical harmonics expansion coefficients is
derived later within this work. It is given by Eq. (3.59).
In order to get more insight into the cylindrical harmonics expansion a closer look at the
basis functions is taken. As a result of Eq. (2.17) each spatial variable has its own basis
function. The z-coordinate has an exponential function as basis function. Its properties
are well known from the Fourier transformation and will not be discussed further here.
The \alpha-coordinate also has an exponential function as basis, however the angular frequency
variable \nu is limited to integer values. Figure 2.4 illustrates the angular basis functions for
different angular frequencies. For the sake of illustration the plots only show the absolute
value of the real part. It can be seen clearly that the angular basis functions exhibit
a spatial selectivity in the angular coordinate and thus can be interpreted as directivity
patterns.
Hankel functions are the basis functions of the radial coordinate r. The Hankel functions
of first and second kind are complex conjugates of each other [AS72]. Hence, only the
properties of one kind have to be studied. Figure 2.5 illustrates the Hankel functions of
first kind (belonging to incoming waves).

2.3.2 Circular Harmonics Expansion

This section will derive the cylindrical harmonics expansion for a two-dimensional wave
field. This specialized decomposition will be termed the circular harmonics expansion in
the following.
If the wave field to be expanded exhibits no dependence on the z-coordinate it is
convenient to use polar coordinates instead of cylindrical coordinates for the expansion into
circular harmonics (P_Y(\alpha, r, z, \omega) = P_P(\alpha, r, \omega)). As a consequence of condition (2.9) the
k_z-dependence in Eq. (2.23) reduces to a spatial Dirac pulse 2\pi \, \delta(k_z) in the expansion
coefficients. This Dirac pulse will be discarded in the following. The circular harmonics
expansion is then given as follows

    P_P(x_P, \omega) = \sum_{\nu=-\infty}^{\infty} \left[ \breve{P}^{(1)}(\nu, \omega) \, H_\nu^{(1)}(kr) \, e^{j\nu\alpha} + \breve{P}^{(2)}(\nu, \omega) \, H_\nu^{(2)}(kr) \, e^{j\nu\alpha} \right] .    (2.25)
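One wave field whose circular harmonics coefficients are known in closed form is a plane wave: the Jacobi-Anger expansion e^{j k r \cos\alpha} = \sum_\nu j^\nu J_\nu(kr) \, e^{j\nu\alpha} expresses it in the angular basis used above, and since J_\nu = (H_\nu^{(1)} + H_\nu^{(2)})/2 it has equal incoming and outgoing coefficients. This identity is standard (see e. g. [Wil99]) rather than derived here; a truncated numerical check:

```python
import numpy as np
from scipy.special import jv

k = 18.0          # acoustic wavenumber
r = 0.5           # radius in m
alpha = 0.7       # angle in rad
N = 40            # truncation order of the circular harmonics sum

# Jacobi-Anger expansion: a plane wave in circular harmonics,
# e^{j k r cos(alpha)} = sum_nu j^nu J_nu(k r) e^{j nu alpha}
direct = np.exp(1j * k * r * np.cos(alpha))
nu = np.arange(-N, N + 1)
series = np.sum((1j ** nu) * jv(nu, k * r) * np.exp(1j * nu * alpha))

print(np.isclose(series, direct))  # True for sufficiently large N
```

Because J_\nu(kr) decays rapidly once |\nu| exceeds kr, the sum converges quickly; this is also why truncated angular expansions work well in a region of limited radius.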

This specialized expansion can be used to describe two-dimensional wave fields. Please
note, that this formulation will not allow to expand a generic three-dimensional wave

Figure 2.4: Illustration of the angular basis functions e^{j\nu\alpha} of the cylindrical
harmonics. The plots show the absolute value of the real part (|\Re\{e^{j\nu\alpha}\}|) for the
angular frequencies \nu = 0, 1, 2, 3.

field observed only at the plane z = 0. The correct condition for this case would be to
introduce z = 0 into Eq. (2.23). However, this will not result in the circular harmonics
decomposition of the wave field, as given by Eq. (2.25).
The expansion into spherical/cylindrical harmonics is closely related to the multipole
expansion of a wave field [Wil99, Pie91]. The multipole expansion of a wave field expands
an outgoing wave field into the fields of multiple acoustic point sources located in the
vicinity of the origin with equal amplitudes and opposite phases. The same principle can
be applied to the incoming wave field by expanding it into acoustic drains. The wave field
is thus decomposed into the contributions belonging to monopoles, dipoles, quadrupoles,
etc. located at the origin. The expansion into cylindrical harmonics for the angular and
radial part can be understood as a multipole expansion. The angular frequency \nu is equal
to the order of the multipole. For the circular harmonics, as given by Eq. (2.25), the
multipole sources/drains of the multipole expansion are line sources or drains due to the
condition P_Y(\alpha, r, z, \omega) = P_P(\alpha, r, \omega).



Figure 2.5: Illustration of the radial basis functions H_\nu^{(1)}(k_r r) of the cylindrical
harmonics. The upper plot shows the real part of H_\nu^{(1)}(k_r r), the lower plot the imaginary
part, for different angular frequencies \nu.

2.4 Solutions of the Inhomogeneous Wave Equation

The following section will introduce solutions of the inhomogeneous wave equation

    \nabla^2 p(x, t) - \frac{1}{c^2} \, \frac{\partial^2 p(x, t)}{\partial t^2} = -q(x, t) ,    (2.26)

where q(x, t) denotes a source term. These solutions comprise acoustic sources, as will be
shown in the following sections.

2.4.1 Point Source

The basis for the model of a point source is a radially oscillating sphere that generates
an outgoing wave field. Due to the symmetry the wave field is angle independent
(omnidirectional). The model of a point source is derived by considering the limiting case
for which the radius of the sphere becomes progressively smaller. The sphere will then
degenerate to a single point in space. However, the same principle can be applied to nearly
any arbitrarily shaped source (with an oscillating mass of fluid) if the dimensions of the source

are small compared to the considered wavelength and the wave field is observed at a large
distance compared to the source dimensions. As these assumptions hold for several types
of real-world sources, the model of a point source is a frequently used idealization for
acoustic sources. One example for the application of the point source model is the acoustic
field of a loudspeaker mounted in a cabinet (closed-loudspeaker). The field observed
at some distance to the loudspeaker has approximately the properties of a wave field
generated by a point source.
Due to the omni-directional nature of the radiated pressure field it is convenient to use
a spherical coordinate system (see Appendix B.2) to describe the wave field of a point
source. The acoustic pressure field $P_H(\mathbf{x}_H, \omega)$ of a monochromatic point source placed at
the origin is given as follows [Pie91]
\[
P_H^{(2)}(\mathbf{x}_H, \omega) = P_H^{(2)}(\rho, \omega) = P(\omega) \, \frac{1}{\rho} \, e^{-jk\rho} \; , \tag{2.27}
\]
where $k$ denotes the wavenumber, $\rho$ the radius and $P(\omega)$ a frequency dependent pressure
amplitude. Transforming Eq. (2.27) back into the time-domain using the inverse Fourier
transformation (2.6b) yields
\[
p_H^{(2)}(\mathbf{x}_H, t) = \frac{1}{\rho} \, p\Big(t - \frac{\rho}{c}\Big) \; . \tag{2.28}
\]
This result proves that Eq. (2.27) describes an outgoing spherical wave. The shape of
the spherical wave in radial direction is given by $p(t)$, which can be computed by an
inverse Fourier transformation of the frequency dependent pressure amplitude $P(\omega)$. The
amplitude of a point source exhibits a $1/\rho$ decay and has a pole at $\rho = 0$. The point
source is therefore also termed an acoustic monopole. The pole at $\rho = 0$ does not fit well
to physical reality. However, the point source model provides a reasonable approximation
well outside of this pole.
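The delay and attenuation predicted by Eqs. (2.27) and (2.28) can be illustrated numerically. The following Python sketch (not part of the original text; the Gaussian pulse, sampling rate and distance are arbitrary illustrative choices) applies the point source transfer function to a pulse spectrum and checks that the pulse arrives $\rho/c$ seconds later with amplitude scaled by $1/\rho$:

```python
import numpy as np

c = 343.0                  # speed of sound [m/s] (assumed value)
rho = 2.0                  # observation distance [m] (arbitrary)
fs = 16000                 # sampling rate [Hz] (arbitrary)
N = 2048
t = np.arange(N) / fs
p = np.exp(-0.5 * ((t - 0.01) / 0.001) ** 2)   # Gaussian pulse shape p(t)

# Eq. (2.27): the observed spectrum is P(omega) * exp(-j*k*rho) / rho
f = np.fft.rfftfreq(N, 1 / fs)
k = 2 * np.pi * f / c
p_obs = np.fft.irfft(np.fft.rfft(p) * np.exp(-1j * k * rho) / rho, N)

# Eq. (2.28) predicts p(t - rho/c) / rho: the pulse arrives rho/c later
# with its amplitude scaled by 1/rho
peak_shift = t[np.argmax(p_obs)] - 0.01
print(peak_shift, rho / c)
```

For $\rho = 2$ m the peak of the received pulse shifts by $\rho/c \approx 5.8$ ms and its amplitude halves, in agreement with Eq. (2.28).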
The acoustic particle velocity of a point source can be derived from Euler's equation (2.4).
The particle velocity $\mathbf{V}_H(\mathbf{x}_H, \omega)$ has only a non-zero contribution in the radial
direction due to the omni-directional nature of the wave field. The radial component
$V_{H,\rho}(\rho, \omega)$ can be computed as
\[
V_{H,\rho}(\rho, \omega) = \frac{1}{Z(k, \rho)} \, P(\rho, \omega) \; , \tag{2.29}
\]
where $Z(k, \rho)$ denotes the acoustic impedance of the point source
\[
Z(k, \rho) = \rho_0 c \, \frac{jk\rho}{1 + jk\rho} \; . \tag{2.30}
\]
The acoustic impedance is in general complex, which reveals that the point source has a
reactive part. However, in the far-field ($k\rho \gg 1$) the acoustic impedance can be approximated
as $Z \approx \rho_0 c$, which exhibits no reactive contributions.


Figure 2.6: Acoustic pressure $P(\rho, \omega)$ and normalized velocity $V_{H,\rho}(\rho, \omega)$ for a monochromatic point source with frequency $f_0 = 1000$ Hz plotted over the distance $\rho$. The plot shows the real part of the functions and the $1/\rho$ decay curve.
Figure 2.6 shows the real part of the acoustic pressure $P(\rho, \omega)$ and the normalized
velocity $V_{H,\rho}(\rho, \omega)$ of a point source with monochromatic excitation. The frequency of the
excitation was $f_0 = 1000$ Hz. The $1/\rho$ decay is shown additionally. It can be seen that for
this example the phase difference between the acoustic pressure and velocity due to the
complex acoustic impedance is only considerable for relatively small distances $\rho < 1$ m.
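As a numerical sanity check of Eqs. (2.27), (2.29) and (2.30), the following sketch evaluates the point source impedance for the parameters of Figure 2.6. The air parameters $\rho_0 = 1.2$ kg/m³ and $c = 343$ m/s are assumed values, not taken from the text:

```python
import numpy as np

rho0, c = 1.2, 343.0       # air density [kg/m^3], speed of sound [m/s] (assumed)
f0 = 1000.0                # excitation frequency [Hz]
k = 2 * np.pi * f0 / c     # wavenumber

def pressure(rho, P0=1.0):
    # Eq. (2.27): outgoing spherical wave with 1/rho amplitude decay
    return P0 / rho * np.exp(-1j * k * rho)

def impedance(rho):
    # Eq. (2.30): acoustic impedance of a point source
    return rho0 * c * 1j * k * rho / (1 + 1j * k * rho)

def radial_velocity(rho):
    # Eq. (2.29): radial particle velocity via the impedance
    return pressure(rho) / impedance(rho)

# far field (k*rho >> 1): the impedance approaches rho0*c (no reactive part)
print(abs(impedance(10.0) - rho0 * c) / (rho0 * c))
# near field: noticeable phase shift between pressure and velocity
print(np.degrees(np.angle(pressure(0.02) / radial_velocity(0.02))))
```

At $\rho = 0.02$ m the pressure leads the velocity by roughly 70 degrees, while at $\rho = 10$ m the impedance deviates from $\rho_0 c$ by well under one percent, matching the discussion of Figure 2.6.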
If the sign of the argument of the exponential term in Eq. (2.27) is reversed, an incoming
spherical wave is produced
\[
P_H^{(1)}(\mathbf{x}_H, \omega) = P_H^{(1)}(\rho, \omega) = P(\omega) \, \frac{1}{\rho} \, e^{jk\rho} \; . \tag{2.31}
\]
This wave field can be interpreted as the result of an acoustic monopole drain placed at
the origin.

2.4.2 Green's Functions

The concept of the Green's function provides a convenient way to compute arbitrary solutions
of the inhomogeneous wave equation. The following section will briefly review the
relevant parts of the background on Green's functions presented in [MF53a, Zio95].
The inhomogeneous wave equation appendant to the point source model, given by
Eq. (2.27), can be derived by plugging this special solution into the left hand side of
the Helmholtz equation (2.7). After evaluating the nabla operator, the result can be
found as [Zio95]
\[
\nabla^2 P(\mathbf{x}, \omega) + k^2 P(\mathbf{x}, \omega) = -4\pi \, P(\omega) \, \delta(\mathbf{x}) \; , \tag{2.32}
\]
where $\delta(\mathbf{x})$ denotes a spatial Dirac pulse at the origin. Equation (2.32) states that the left
hand side of the wave equation is equal to a spatial Dirac pulse multiplied by a frequency
dependent factor. Thus, this proves that the excitation of the point source model (2.27)
is an infinitesimally small point in space. This result can be generalized to the case of a
point source placed at an arbitrary point $\mathbf{x}_0$. The acoustic pressure field of the shifted
point source can be derived straightforwardly from Eq. (2.27) with $\rho = |\mathbf{x} - \mathbf{x}_0|$ as
\[
G_{0,\mathrm{3D}}(\mathbf{x}|\mathbf{x}_0, \omega) = \frac{1}{4\pi} \, \frac{e^{-jk|\mathbf{x} - \mathbf{x}_0|}}{|\mathbf{x} - \mathbf{x}_0|} \; . \tag{2.33}
\]

Introducing Eq. (2.33) into the Helmholtz equation (2.7) yields
\[
\nabla^2 G_{0,\mathrm{3D}}(\mathbf{x}|\mathbf{x}_0, \omega) + k^2 G_{0,\mathrm{3D}}(\mathbf{x}|\mathbf{x}_0, \omega) = -\delta(\mathbf{x} - \mathbf{x}_0) \; . \tag{2.34}
\]
This special solution of the inhomogeneous wave equation is known as the free-field Green's
function $G_{0,\mathrm{3D}}(\mathbf{x}|\mathbf{x}_0, \omega)$.
In general, Green's functions are the solutions of inhomogeneous differential equations
when excited by a (multi-dimensional) Dirac pulse [Wei03]. In the context of linear
systems, Green's functions can be interpreted as the spatio-temporal impulse response of
the inhomogeneous wave equation. The vector $\mathbf{x}$ is also termed the observation point and
the vector $\mathbf{x}_0$ the source point. The notation $G(\mathbf{x}|\mathbf{x}_0, \omega)$ for a generic Green's function
highlights this interpretation.
One important property of Green's functions is the reciprocity principle [MF53a]. This
principle states that the Green's function remains unchanged if the observation and source
positions are interchanged
\[
G(\mathbf{x}|\mathbf{x}_0, \omega) = G(\mathbf{x}_0|\mathbf{x}, \omega) \; . \tag{2.35}
\]
Obviously Eq. (2.35) holds for the free-field Green's function given by Eq. (2.33). The
reciprocity principle will be utilized during the derivation of the general solution to an
inhomogeneous wave equation in Section 2.6.
The free-field solutions of the inhomogeneous wave equation (2.26) for arbitrary excitations
$Q(\mathbf{x}, \omega) = \mathcal{F}_t\{q(\mathbf{x}, t)\}$ can be formulated in terms of the free-field Green's function
as follows [Wil99]
\[
P(\mathbf{x}, \omega) = \int_V Q(\mathbf{x}', \omega) \, G_{0,\mathrm{3D}}(\mathbf{x}|\mathbf{x}', \omega) \, dV' \; . \tag{2.36}
\]
This principle will be exploited in the following two sections to derive the wave fields
generated by line and planar sources.
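The Green's function property (2.34) can be verified numerically away from the source point: a second-order finite-difference Laplacian applied to Eq. (2.33) should cancel the $k^2$ term there. The following sketch (with arbitrarily chosen frequency and evaluation point) is illustrative only:

```python
import numpy as np

c = 343.0
k = 2 * np.pi * 500.0 / c      # wavenumber for f = 500 Hz (arbitrary)

def G(x, x0):
    # free-field Green's function of Eq. (2.33)
    r = np.linalg.norm(np.asarray(x, float) - np.asarray(x0, float))
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def helmholtz_residual(x, x0, h=1e-3):
    # finite-difference Laplacian of G plus k^2 * G; away from the
    # source point x0 this residual should vanish, cf. Eq. (2.34)
    x = np.asarray(x, float)
    lap = -6.0 * G(x, x0)
    for d in range(3):
        e = np.zeros(3)
        e[d] = h
        lap += G(x + e, x0) + G(x - e, x0)
    return lap / h**2 + k**2 * G(x, x0)

x_eval, x_src = [1.0, 0.5, 0.2], [0.0, 0.0, 0.0]
rel = abs(helmholtz_residual(x_eval, x_src)) / abs(k**2 * G(x_eval, x_src))
print(rel)   # small relative residual
```

The residual is small relative to $k^2 |G|$, confirming that (2.33) solves the homogeneous Helmholtz equation everywhere except at the excitation point.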


The Green's function is typically defined in the (temporal) frequency domain. However,
for some applications its inverse Fourier transformation is of use
\[
g(\mathbf{x}|\mathbf{x}_0, t) = \mathcal{F}_t^{-1}\{G(\mathbf{x}|\mathbf{x}_0, \omega)\} \; . \tag{2.37}
\]
The time-domain Green's function $g(\mathbf{x}|\mathbf{x}_0, t)$ can be interpreted as the spatio-temporal
impulse response of an acoustic source placed at the position $\mathbf{x}_0$ evaluated at the position
$\mathbf{x}$.

2.4.3 Line Source

The concept of an acoustic line source is strongly related to that of a point source.
Section 2.1.2 stated suitable conditions to reduce a three-dimensional description of a
wave field to a two-dimensional description. In order to derive the transition from three to
two dimensions it was assumed that the acoustic field exhibits no dependence on the
z-coordinate (e.g. $P_Y(\alpha, r, z, \omega) = P_P(\alpha, r, \omega)$). The analogon in this two-dimensional space
to a point source in three-dimensional space is then a thin line with infinite length in
the z-direction and time-varying mass of fluid in radial direction. This type of source will
be termed a line source in the following. Note that the concept of a line source is equal
to that of a two-dimensional point source for truly two-dimensional wave propagation.
The wave field of a line source can be derived by calculating the field of an infinitely long
radially oscillating cylinder [Bla00]. As for the point source, reducing the radius of the
cylinder until it has degenerated to a line yields the wave field of the line source. As a
result the pressure field exhibits a pole at r = 0. However, as for the point source, it can
be shown that a line source is a reasonable model for real-world line sources.
Although it is possible to derive the wave field of a line source in the way depicted
above, an alternative derivation is chosen here. As indicated in the previous section, it
is convenient to use the free-space Green's function (2.33) to calculate arbitrary solutions
of the inhomogeneous wave equation. The inhomogeneous wave equation for a line source
placed at the origin, whose axis is perpendicular to the xy-plane, is given as follows
\[
\nabla^2 P_C(\mathbf{x}_C, \omega) + k^2 P_C(\mathbf{x}_C, \omega) = -\underbrace{P(\omega)\, \delta(x)\, \delta(y)}_{Q(\mathbf{x}, \omega)} \; . \tag{2.38}
\]
The solution of Eq. (2.38) is given in terms of the free-field Green's function by Eq. (2.36).
Introduction of the line source excitation into (2.36) and exploitation of the sifting property
of the Dirac function yields
\[
P_C(\mathbf{x}_C, \omega) = \int_V P(\omega)\, \delta(x')\, \delta(y')\, G_{0,\mathrm{3D}}(\mathbf{x}_C|\mathbf{x}_C', \omega) \, dV'
= P(\omega) \int_{-\infty}^{\infty} \frac{e^{-jk\sqrt{x^2 + y^2 + (z - z')^2}}}{4\pi \sqrt{x^2 + y^2 + (z - z')^2}} \, dz' \; , \tag{2.39}
\]


where $dV' = dx'\,dy'\,dz'$ denotes the volume element used for the first integral. The
second integral can be solved using an integral definition of the Hankel function [GR65].
Due to the symmetry of the problem the solution does not depend on the z-coordinate but
only on the distance $r = \sqrt{x^2 + y^2}$ to the source position. Thus, it is convenient
to use a cylindrical coordinate system (see Appendix B.3). The pressure field of a line
source is given as follows
\[
P_Y^{(2)}(\mathbf{x}_Y, \omega) = P_Y^{(2)}(r, \omega) = -\frac{j}{4} \, P(\omega) \, H_0^{(2)}(kr) \; . \tag{2.40}
\]

The particle velocity of a line source can be computed using Euler's equation (2.4). Due
to the geometry of the problem, the particle velocity only has a contribution in the
radial direction. Evaluation of Euler's equation in the radial direction of a cylindrical
coordinate system yields
\[
-\frac{\partial}{\partial r} P_Y(\mathbf{x}_Y, \omega) = j\omega\rho_0 \, V_{Y,r}(r, \omega) \; . \tag{2.41}
\]
Introduction of Eq. (2.40) into Eq. (2.41) yields the acoustic particle velocity of a line
source in radial direction as
\[
V_{Y,r}(r, \omega) = \frac{1}{4\rho_0 c} \, P(\omega) \, {H_0^{(2)}}'(kr) = -\frac{1}{4\rho_0 c} \, P(\omega) \, H_1^{(2)}(kr) \; , \tag{2.42}
\]
where ${H_\nu^{(i)}}'(\cdot)$ denotes the first derivative of the Hankel function with respect to its
argument. The acoustic impedance $Z(k, r)$ of a line source can be calculated according to
Eq. (2.29) using Eq. (2.42)
\[
Z(k, r) = j\rho_0 c \, \frac{H_0^{(2)}(kr)}{H_1^{(2)}(kr)} \; . \tag{2.43}
\]
As for the point source, the acoustic impedance is in general complex. However, in the
far-field ($kr \gg 1$) the acoustic impedance can be approximated as $Z \approx \rho_0 c$, which exhibits
no reactive contributions. The same result was obtained for the point source.
For the point source the amplitude decay was $1/\rho$. In order to derive a similar result
for the amplitude decay of the line source, a closer look at the properties of the zeroth-order
Hankel function has to be taken. In the far-field ($kr \gg 1$) the Hankel function can
be approximated as given by Eq. (2.22). This approximation states that the amplitude
decay of a line source in the far-field is $1/\sqrt{r}$. In the near-field the amplitude decay is
strongly dependent on the small-argument properties of the Hankel function; no simple
conclusion can be drawn here.
Figure 2.7 shows the real part of the acoustic pressure $P_Y(r, \omega)$ and normalized velocity
$4\rho_0 c \, V_{Y,r}(r, \omega)$ of a monochromatic line source. The frequency of the excitation was $f_0 =
1000$ Hz. The $1/\sqrt{r}$ decay is shown additionally.
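The far-field statements above can be checked numerically with SciPy's Hankel functions. In the following sketch (the air parameters and frequency are assumed values), quadrupling the distance approximately halves the magnitude of Eq. (2.40), and the impedance (2.43) approaches $\rho_0 c$:

```python
import numpy as np
from scipy.special import hankel2

rho0, c = 1.2, 343.0          # assumed air parameters
k = 2 * np.pi * 1000.0 / c    # wavenumber for f0 = 1000 Hz

def line_pressure(r, P0=1.0):
    # Eq. (2.40): pressure field of a line source at the origin
    return -0.25j * P0 * hankel2(0, k * r)

def line_impedance(r):
    # Eq. (2.43): acoustic impedance of a line source
    return 1j * rho0 * c * hankel2(0, k * r) / hankel2(1, k * r)

# far-field amplitude decay: quadrupling the distance halves the amplitude
ratio = abs(line_pressure(4.0)) / abs(line_pressure(1.0))
print(ratio)                       # close to 1/sqrt(4) = 0.5
# far-field impedance approaches the characteristic impedance rho0*c
print(line_impedance(2.0))
```

Both checks illustrate the $1/\sqrt{r}$ far-field decay and the vanishing reactive part discussed above.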


Figure 2.7: Acoustic pressure $P_Y(r, \omega)$ and normalized velocity $4\rho_0 c \, V_{Y,r}(r, \omega)$ for a monochromatic line source with frequency $f_0 = 1000$ Hz plotted over the distance $r$. The plot shows the real part of the functions and the $1/\sqrt{r}$ decay curve.
Repetition of the above derivation with the Green's function of an incoming spherical wave
yields the wave field of a line drain placed at the origin as
\[
P_Y^{(1)}(\mathbf{x}_Y, \omega) = P_Y^{(1)}(r, \omega) = \frac{j}{4} \, P(\omega) \, H_0^{(1)}(kr) \; . \tag{2.44}
\]
This result is also evident when taking the far-field approximation (2.22) of the Hankel
functions into account.

2.4.4 Planar Sources

The previous two sections introduced source models for point and line sources. The
traveling waves outside the excited point/line are the eigensolutions of the wave equation
formulated in spherical and cylindrical coordinates, respectively. As was already shown
in Section 2.2, plane waves are eigensolutions of the acoustic wave equation formulated in
Cartesian coordinates. This section will introduce solutions of the inhomogeneous wave
equation that result in plane waves.
A plane wave can be excited by an infinitesimally thin plate of infinite size that vibrates
uniformly [JF86]. The simplest case is to place the vibrating plate in the xy-plane (z = 0).
This excitation results in a plane wave traveling parallel to the z-axis. Without
loss of generality it will be assumed in the following that the plate includes the origin of the


coordinate system and its orientation is denoted by its surface normal $\mathbf{n}_0$. The inhomogeneous
wave equation for such a planar source is given as follows
\[
\nabla^2 P_C(\mathbf{x}_C, \omega) + k^2 P_C(\mathbf{x}_C, \omega) = -P(\omega) \, 2jk \, \delta(\mathbf{n}_{C,0}^T \mathbf{x}_C) \; , \tag{2.45}
\]
where $P(\omega)$ denotes a frequency dependent factor. As for the derivation of the line source,
it is convenient to use the free-space Green's function together with Eq. (2.36) to calculate
the wave field of a planar source. Introduction of the right hand side of Eq. (2.45) into
Eq. (2.36) yields the following integral
\[
P_C(\mathbf{x}_C, \omega) = 2jk \, P(\omega) \int_V \frac{e^{-jk\sqrt{(x-x')^2 + (y-y')^2 + (z-z')^2}}}{4\pi \sqrt{(x-x')^2 + (y-y')^2 + (z-z')^2}} \, \delta(\mathbf{n}_{C,0}^T \mathbf{x}_C') \, dV' \; , \tag{2.46}
\]
where $dV' = dx'\,dy'\,dz'$. Evaluation of this integral yields the solution as
\[
P_C(\mathbf{x}_C, \omega) = P(\omega) \, e^{-j\mathbf{k}_{C,0}^T \mathbf{x}_C} \; , \tag{2.47}
\]

where $\mathbf{k}_{C,0} = \omega/c \; \mathbf{n}_{C,0}$ denotes the wave vector of the plane wave. This solution is, as
desired, equal to the homogeneous solution of the wave equation in Cartesian coordinates
given by Eq. (2.12). Plane waves with an arbitrary frequency spectrum can be excited
by using suitable frequency weights $P(\omega)$. The frequency weights can be derived from a
temporal Fourier transformation of the desired shape $p(t)$.
Plane waves are often used to model the far-field of a point or line source. The curvature
of the wavefronts of a line/point source decreases with increasing distance to the source.
In the far-field ($kr \gg 1$) the wave fronts are approximately equal to those of a plane
wave.
The particle velocity of a plane wave can be found by introducing Eq. (2.47) into Euler's
equation (2.4). This results in
\[
\mathbf{V}_C(\mathbf{x}_C, \omega) = \underbrace{\frac{1}{\rho_0 c}}_{1/Z_0} \, P_C(\mathbf{x}_C, \omega) \, \mathbf{n}_{C,0} \; . \tag{2.48}
\]
The above result states that the acoustic impedance $Z_0$ of a plane wave is independent of
position and frequency. The impedance $Z_0 = \rho_0 c$ is also known as the characteristic
acoustic impedance. The same impedance as for a plane wave is obtained for a point/line
source in the far-field ($kr \gg 1$). This again highlights the connection between point/line
sources and plane waves in the far-field. Equation (2.48) states that the particle movement
is parallel to the direction of propagation. Hence, this proves that acoustic waves, as
described by the wave equation (2.5), are longitudinal waves.
In the remainder of this work two-dimensional wave fields will be used frequently. The
reduction to two dimensions for a plane wave can be done by simply omitting the
z-components of the position vector $\mathbf{x}_C$ and the wave vector $\mathbf{k}_{C,0}$. The resulting plane wave


Figure 2.8: Illustration of the parameters used for a two-dimensional plane wave in Cartesian and polar coordinates.
propagates in the xy-plane and can be characterized by its frequency weights $P(\omega)$ and
its tilt angle $\theta_0$. Figure 2.8 illustrates the parameters of a two-dimensional plane wave in
Cartesian and polar coordinates. The connection between the wave vector $\mathbf{k}_{C,0}$ and the
tilt angle can be found by changing the underlying coordinate system from the Cartesian
coordinate system used for Eq. (2.47) to the polar coordinate system (see Appendix B.4).
The product of wave and position vector is then given as
\[
\mathbf{k}_{C,0}^T \mathbf{x}_C = kr \cos(\alpha - \theta_0) \; . \tag{2.49}
\]

Thus, for a plane wave traveling in two-dimensional space with tilt angle $\theta_0$, the acoustic
pressure can be reformulated in polar coordinates as
\[
P_P(\mathbf{x}_P, \omega) = P(\omega) \, e^{-jkr \cos(\alpha - \theta_0)} \; . \tag{2.50}
\]
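The equivalence of the Cartesian form (2.47) and the polar form (2.50) follows from Eq. (2.49) and can be confirmed numerically; the frequency, tilt angle and evaluation point below are arbitrary choices:

```python
import numpy as np

c = 343.0
k = 2 * np.pi * 1000.0 / c               # wavenumber (arbitrary frequency)
theta0 = np.radians(30.0)                # tilt angle of the plane wave
k_vec = k * np.array([np.cos(theta0), np.sin(theta0)])   # wave vector

# evaluation point given in polar and in Cartesian coordinates
r, alpha = 1.5, np.radians(110.0)
x = r * np.array([np.cos(alpha), np.sin(alpha)])

P_cart = np.exp(-1j * (k_vec @ x))                       # Eq. (2.47)
P_polar = np.exp(-1j * k * r * np.cos(alpha - theta0))   # Eq. (2.50)
print(abs(P_cart - P_polar))   # identical up to rounding
```

The two expressions agree to machine precision, since $\mathbf{k}_{C,0}^T \mathbf{x}_C = kr\cos(\alpha - \theta_0)$ holds exactly.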

2.5 Boundary Conditions

In the foregoing sections only free-field propagation of acoustic waves was considered.
However, in the context of this thesis wave propagation inside enclosures is of special
interest to model wave propagation inside rooms. As stated in Section 2.1, the overall
solution of the wave equation inside an enclosure must meet the acoustic conditions at
the boundaries of the enclosure. The following section will introduce typical boundary
conditions.
The geometry used to formulate the boundary conditions inside an enclosure is depicted


Figure 2.9: Volume $V$, surface $\partial V$ and inward pointing surface normal $\mathbf{n}$ used to formulate boundary conditions for the wave equation.
by Fig. 2.9: a compact volume $V$ which is enclosed by a closed surface $\partial V$. The orientation
of the surface is given by the inward pointing surface normal $\mathbf{n}$. The surface may
represent a really existing or a virtual boundary.
The entire range of possible boundary conditions can be classified into two basic classes:
homogeneous boundary conditions and inhomogeneous boundary conditions. Problems involving
mixtures of these classes can be solved by a superposition of the corresponding
solutions. Inhomogeneous boundary conditions are typically used for radiation problems
(e.g. vibrating bodies), homogeneous boundary conditions when the boundaries are stationary.
The points on the boundary will be denoted by $\mathbf{x}_s$ in the following ($\mathbf{x}_s \in \partial V$).
The total wave field inside the enclosure can be divided into two components
\[
P(\mathbf{x}, \omega) = P_s(\mathbf{x}, \omega) + P_b(\mathbf{x}, \omega) \; , \tag{2.51}
\]
where $P_s(\mathbf{x}, \omega)$ denotes the wave field generated by sources present inside the volume $V$
and $P_b(\mathbf{x}, \omega)$ the wave field generated by the boundaries. In the following a commonly
used classification of boundary conditions is reviewed briefly.

2.5.1 Classification of Boundary Conditions

Boundary conditions can be formulated in terms of the acoustic pressure, the acoustic particle
velocity or both. Typically, continuous pressure or particle velocity is assumed at the
boundary. Based on these principles, three types of boundary conditions can be formulated
for the description of the acoustic properties of boundaries [JF86, Pie91, MF53a]:
1. Dirichlet boundary condition
\[
P(\mathbf{x}_s, \omega) = f(\mathbf{x}_s, \omega) \; , \tag{2.52}
\]


where $f(\mathbf{x}_s, \omega)$ denotes an arbitrary function. The above condition assumes that the
acoustic pressure at the boundary is equal to $f(\mathbf{x}_s, \omega)$. For the homogeneous case
($f(\mathbf{x}_s, \omega) = 0$) this condition models a pressure release boundary. An example for
the application of the homogeneous condition is the boundary condition at the open
end of a duct. It can be shown that the pressure near the open end of a duct is
zero. Such boundaries are also often termed acoustically soft surfaces.
2. Neumann boundary condition
\[
\frac{\partial}{\partial \mathbf{n}} P(\mathbf{x}_s, \omega) = f(\mathbf{x}_s, \omega) \; . \tag{2.53}
\]
The partial derivative in the above equation is an abbreviation for the gradient in direction
of the normal vector $\mathbf{n}$ (see also Appendix C.1). The above condition can be
rewritten in terms of the particle velocity using Euler's equation (2.4) as follows
\[
\frac{\partial}{\partial \mathbf{n}} P(\mathbf{x}_s, \omega) = \langle \nabla P(\mathbf{x}_s, \omega), \mathbf{n} \rangle = -j\omega\rho_0 \, V_n(\mathbf{x}_s, \omega) = f(\mathbf{x}_s, \omega) \; , \tag{2.54}
\]
where $V_n(\mathbf{x}_s, \omega)$ denotes the particle velocity in direction of the surface normal $\mathbf{n}$.
Thus, the Neumann boundary condition assumes that the particle velocity in normal
direction is prescribed by $f(\mathbf{x}_s, \omega)$ on the boundary $\partial V$.
For the homogeneous case ($f(\mathbf{x}_s, \omega) = 0$) this condition models a rigid boundary.
This homogeneous boundary condition is used to model acoustically impenetrable
surfaces. A typical example would be a structural wall with a smooth surface, e.g. a
wall made of concrete. Such boundaries are also termed acoustically hard surfaces.
3. Robin boundary condition
The third kind of boundary condition introduced here is a linear combination of the
first two kinds
\[
\frac{\partial}{\partial \mathbf{n}} P(\mathbf{x}_s, \omega) + j\beta(\mathbf{x}_s, \omega) \, P(\mathbf{x}_s, \omega) = f(\mathbf{x}_s, \omega) \; . \tag{2.55}
\]
For the homogeneous case ($f(\mathbf{x}_s, \omega) = 0$) the above condition can be related to the
concept of the specific acoustic impedance [Pie91]. The specific acoustic impedance
at the boundary, $Z_s(\mathbf{x}_s, \omega)$, is defined as follows
\[
Z_s(\mathbf{x}_s, \omega) = \frac{P(\mathbf{x}_s, \omega)}{V_n(\mathbf{x}_s, \omega)} \; . \tag{2.56}
\]
Introducing Eq. (2.54) into Eq. (2.56) yields equality between Eq. (2.56) and
Eq. (2.55) for $\beta(\mathbf{x}_s, \omega) = \omega\rho_0 / Z_s(\mathbf{x}_s, \omega)$. The specific acoustic impedance is typically
used to describe porous surfaces that are not necessarily impenetrable.
The Robin boundary condition naturally includes the boundary conditions of the first
and second kind. If $Z_s$ is chosen such that $|Z_s| \gg 1$, a rigid or hard boundary is described.


Figure 2.10: Geometry used for the general solution of the inhomogeneous wave equation for a bounded region $V$ with prescribed boundary conditions on the closed boundary $\partial V$.
Otherwise, if $Z_s$ is chosen such that $1/|Z_s| \gg 1$, a pressure release or soft boundary is
described. If $Z_s$ is equal to the characteristic acoustic impedance $Z_0 = \rho_0 c$, free-space
propagation is modeled.
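The limiting cases above can be made concrete with the relation $\beta = \omega\rho_0/Z_s$ for the homogeneous Robin condition. The following sketch (frequency and impedance values are arbitrary illustrative choices) shows the rigid, soft and matched limits; note that the matched case $Z_s = Z_0$ gives $\beta = \omega/c = k$:

```python
import numpy as np

rho0, c = 1.2, 343.0
omega = 2 * np.pi * 1000.0     # angular frequency (arbitrary)

def robin_beta(Z_s):
    # homogeneous Robin condition: beta = omega * rho0 / Z_s
    return omega * rho0 / Z_s

print(robin_beta(1e12))        # rigid boundary:  Z_s -> inf, beta -> 0 (Neumann)
print(robin_beta(1e-9))        # soft boundary:   Z_s -> 0, beta -> inf (Dirichlet)
print(robin_beta(rho0 * c))    # matched:         Z_s = Z_0 gives beta = omega/c = k
```

The rigid limit recovers the homogeneous Neumann condition, the soft limit the homogeneous Dirichlet condition, and the matched case the wavenumber of free-space propagation.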

2.6 Solution of the Inhomogeneous Wave Equation for a Bounded Region

The solution of the inhomogeneous wave equation with respect to arbitrary boundary
conditions is of special interest in the context of this thesis. It can be used e.g. to analyze
the reproduction of an acoustic scene in the listening room by the reproduction system.
The loudspeakers can be modeled by suitable source models and the characteristics of the
walls by suitable boundary conditions. In the sequel the solution of the inhomogeneous
wave equation will be derived for a bounded region in the presence of inhomogeneous
boundary conditions. The derivation, as given in this section, is based upon [MF53a].
Figure 2.10 illustrates the underlying geometry of the considered interior problem. A
source $Q(\mathbf{x}, \omega)$ generates a wave field within the bounded region $V$. The closed boundary
$\partial V$ surrounding the region $V$ may impose arbitrary homogeneous or inhomogeneous
boundary conditions. Please note that the region $V$ may be two- or three-dimensional.
In the first case $V$ describes a plane and $\partial V$ the closed contour surrounding it, in the
second case $V$ describes a volume and $\partial V$ the closed surface surrounding it. The derived
solution is based upon the concept of the Green's function.
The acoustic pressure field $P(\mathbf{x}, \omega)$ within the region $V$ obeys the inhomogeneous


Helmholtz equation (2.7)
\[
\nabla^2 P(\mathbf{x}, \omega) + k^2 P(\mathbf{x}, \omega) = -Q(\mathbf{x}, \omega) \; , \tag{2.57}
\]
with respect to arbitrary boundary conditions formulated on the boundary $\partial V$. It is
further assumed that a point source is placed at the point $\mathbf{x}$ within the volume $V$. The
wave field generated on the boundary $\partial V$ by the point source can be described in terms
of the Green's function $G(\mathbf{x}_0|\mathbf{x}, \omega)$ (see Section 2.4.2). The Green's function $G(\mathbf{x}_0|\mathbf{x}, \omega)$
has to fulfill the inhomogeneous wave equation (2.34)
\[
\nabla^2 G(\mathbf{x}_0|\mathbf{x}, \omega) + k^2 G(\mathbf{x}_0|\mathbf{x}, \omega) = -\delta(\mathbf{x}_0 - \mathbf{x}) \; , \tag{2.58}
\]
with respect to the same, but homogeneous, boundary conditions as used for Eq. (2.57).
Multiplying both sides of Eq. (2.57) with $G(\mathbf{x}_0|\mathbf{x}, \omega)$ and both sides of Eq. (2.58) with
$P(\mathbf{x}, \omega)$ and subtracting the resulting equations from each other yields
\[
G(\mathbf{x}_0|\mathbf{x}, \omega) \nabla^2 P(\mathbf{x}, \omega) - P(\mathbf{x}, \omega) \nabla^2 G(\mathbf{x}_0|\mathbf{x}, \omega) =
\delta(\mathbf{x}_0 - \mathbf{x}) P(\mathbf{x}, \omega) - Q(\mathbf{x}, \omega) G(\mathbf{x}_0|\mathbf{x}, \omega) \; . \tag{2.59}
\]

Integration of both sides of Eq. (2.59) over the entire region $V$ gives
\[
\int_V Q(\mathbf{x}, \omega) G(\mathbf{x}_0|\mathbf{x}, \omega) \, dV
+ \int_V \Big( G(\mathbf{x}_0|\mathbf{x}, \omega) \nabla^2 P(\mathbf{x}, \omega) - P(\mathbf{x}, \omega) \nabla^2 G(\mathbf{x}_0|\mathbf{x}, \omega) \Big) \, dV =
\begin{cases}
P(\mathbf{x}_0, \omega) & \text{for } \mathbf{x}_0 \in V \setminus \partial V, \\
0 & \text{otherwise,}
\end{cases} \tag{2.60}
\]
where the sifting property of the Dirac delta function was exploited and $dV$ denotes a
suitably chosen volume element in $V$. In the above equation $\mathbf{x}$ constitutes a source point and
$\mathbf{x}_0$ a receiver point. However, it is desired to have $\mathbf{x}$ as receiver point. This can be
reached by interchanging the source and receiver points $\mathbf{x}$ and $\mathbf{x}_0$. The second volume
integral can be simplified to a boundary integral using Green's second integral theorem
(see Appendix C.1). Comparison of the above result with the left hand side of Eq. (C.3)
shows that $v$ can be identified with $G(\mathbf{x}_0|\mathbf{x}, \omega)$ and $u$ with $P(\mathbf{x}, \omega)$. Applying the steps
described above to (2.60) yields
\[
P(\mathbf{x}, \omega) = \int_V Q(\mathbf{x}_0, \omega) \, G(\mathbf{x}|\mathbf{x}_0, \omega) \, dV_0
- \oint_{\partial V} \Big( G(\mathbf{x}|\mathbf{x}_0, \omega) \frac{\partial}{\partial \mathbf{n}} P(\mathbf{x}_0, \omega)
- P(\mathbf{x}_0, \omega) \frac{\partial}{\partial \mathbf{n}} G(\mathbf{x}|\mathbf{x}_0, \omega) \Big) \, dS_0 \; , \tag{2.61}
\]
where $dS_0$ denotes a suitably chosen surface element on $\partial V$. Please note that the Green's
function used in Eq. (2.61) is equal to the one defined by Eq. (2.58) due to the reciprocity


theorem (2.35). Equation (2.61) constitutes the solution of the inhomogeneous Helmholtz
equation (2.57) with respect to arbitrary inhomogeneous boundary conditions. It consists
of a volume integral involving the source terms and a boundary integral involving the
boundary conditions. In order to interpret Eq. (2.61), two special cases will be discussed
in the following:
1. Inhomogeneous wave equation, homogeneous boundary conditions
The pressure field and the Green's function have to obey the homogeneous boundary
conditions. As a result, the surface integral over $\partial V$ in (2.61) will vanish. Thus, the
solution for this case is given as
\[
P(\mathbf{x}, \omega) = \int_V Q(\mathbf{x}_0, \omega) \, G(\mathbf{x}|\mathbf{x}_0, \omega) \, dV_0 \; , \tag{2.62}
\]
which for the free-field case is equal to Eq. (2.36).
2. Homogeneous wave equation, inhomogeneous boundary conditions
The volume integral over $V$ in (2.61) will vanish in this case, since no source terms
are active ($Q(\mathbf{x}, \omega) = 0$). Thus, the solution for this case is given as
\[
P(\mathbf{x}, \omega) = -\oint_{\partial V} \Big( G(\mathbf{x}|\mathbf{x}_0, \omega) \frac{\partial}{\partial \mathbf{n}} P(\mathbf{x}_0, \omega)
- P(\mathbf{x}_0, \omega) \frac{\partial}{\partial \mathbf{n}} G(\mathbf{x}|\mathbf{x}_0, \omega) \Big) \, dS_0 \; . \tag{2.63}
\]
This integral is known as the Kirchhoff-Helmholtz integral [Pie91, Wil99] (or Helmholtz
integral equation). The Kirchhoff-Helmholtz integral states that at any point within
the source-free region $V$ the sound pressure $P(\mathbf{x}, \omega)$ can be calculated if both the
sound pressure $P(\mathbf{x}_0, \omega)$ and its directional gradient $\frac{\partial}{\partial \mathbf{n}} P(\mathbf{x}_0, \omega)$ are known on the
boundary $\partial V$ enclosing the volume. The boundary $\partial V$ does not necessarily have to
be a really physically existing surface.
The Kirchhoff-Helmholtz integral is typically used in three areas: (1) the calculation
of the sound field emitted by a vibrating surface into a region, (2) the calculation
of the sound field inside a finite region produced by a source outside the volume
from measurements on the surface, and (3) the acoustic control over the sound field
within a volume. The first application area will not be considered within this thesis.
The second application area will be utilized for wave field analysis, the third for
sound reproduction. The second application area is often termed wave field
extrapolation.
Thus, the general solution of the inhomogeneous wave equation with inhomogeneous
boundary conditions is given by a superposition of a volume integral of type (2.62) and
the Kirchhoff-Helmholtz integral (2.63). In order to evaluate the Kirchhoff-Helmholtz
integral a suitable Green's function $G(\mathbf{x}|\mathbf{x}_0, \omega)$ is required. The following sections will
introduce specialized Kirchhoff-Helmholtz integrals for three- and two-dimensional free-space
wave fields.


Figure 2.11: Parameters used for the three-dimensional free-space Kirchhoff-Helmholtz integral (2.64).

2.6.1 Three-dimensional Free-Space Kirchhoff-Helmholtz Integral

The Green's function for a point source in free-space was already derived in Section 2.4.2.
Introduction of the Green's function as given by Eq. (2.33) into the Kirchhoff-Helmholtz
integral (2.63) yields
\[
P(\mathbf{x}, \omega) = -\frac{1}{4\pi} \oint_{\partial V} \left( \frac{\partial}{\partial \mathbf{n}} \big( P(\mathbf{x}_0, \omega) \big) \,
\frac{e^{-jk|\mathbf{x} - \mathbf{x}_0|}}{|\mathbf{x} - \mathbf{x}_0|}
- P(\mathbf{x}_0, \omega) \, \frac{\partial}{\partial \mathbf{n}} \frac{e^{-jk|\mathbf{x} - \mathbf{x}_0|}}{|\mathbf{x} - \mathbf{x}_0|} \right) dS_0 \; , \tag{2.64}
\]
which will be denoted as the three-dimensional free-field Kirchhoff-Helmholtz integral in the
following. Figure 2.11 illustrates the parameters used. In order to interpret Eq. (2.64),
a closer look at its different contributions will be taken. Using Eq. (C.2) and Euler's
equation (2.4) allows to rewrite the first term as follows
\[
\frac{\partial}{\partial \mathbf{n}} P(\mathbf{x}, \omega) = \langle \nabla P(\mathbf{x}, \omega), \mathbf{n} \rangle = -j\omega\rho_0 \, V_n(\mathbf{x}, \omega) \; , \tag{2.65}
\]
where $V_n(\mathbf{x}, \omega)$ denotes the particle velocity in the direction of the surface normal $\mathbf{n}$.
Thus, the first term in Eq. (2.64) can be interpreted as the particle velocity on the surface
$\partial V$ in direction of the surface normal. The second term represents a monopole source
distribution on the surface $\partial V$ and the third term the acoustic pressure on the surface.
The fourth term constitutes the directional gradient of a monopole source. The gradient
is given as follows
\[
\frac{\partial}{\partial \mathbf{n}} \frac{e^{-jk|\mathbf{x} - \mathbf{x}_0|}}{|\mathbf{x} - \mathbf{x}_0|}
= \frac{1 + jk|\mathbf{x} - \mathbf{x}_0|}{|\mathbf{x} - \mathbf{x}_0|} \, \cos\varphi \,
\frac{e^{-jk|\mathbf{x} - \mathbf{x}_0|}}{|\mathbf{x} - \mathbf{x}_0|} \; . \tag{2.66}
\]
It can be interpreted as the field of a dipole source placed on the surface, whose dipole
axis lies in the direction of the surface normal $\mathbf{n}$ [Pie91]. Hence, the Kirchhoff-Helmholtz


integral states that at any point within the source-free volume $V$ the sound pressure
$P(\mathbf{x}, \omega)$ can be calculated if both the sound pressure and velocity are known on the
surface enclosing the volume. This principle will be exploited in Section 3.4 for efficient
wave field analysis. However, as stated before, the Kirchhoff-Helmholtz integral also provides
the basis for sound reproduction. If a monopole and corresponding dipole source
distribution is placed on the surface $\partial V$, the acoustic pressure inside that surface can be
controlled. The application of this principle to sound reproduction will be illustrated in
Section 4.1.

2.6.2 Two-dimensional Free-Space Kirchhoff-Helmholtz Integral

This section will specialize the three-dimensional Kirchhoff-Helmholtz integral derived in
the previous section to two-dimensional wave fields. As stated in Section 2.1.2, this transition
in dimensionality can be performed by assuming that the wave field is independent
of the z-coordinate. In the case of the Kirchhoff-Helmholtz integral the surface $\partial V$ degenerates
to a prism with arbitrarily shaped base. Due to the symmetry the field inside
the prism is equal in each z-plane. Thus, without loss of generality it can be assumed
that the Kirchhoff-Helmholtz integral is evaluated only at z = 0. As a consequence the
surface $\partial V$ degenerates to a closed contour $\partial S$ and the volume $V$ to a surface $S$. The
closed contour $\partial S$ surrounds the surface $S$. The specialization of the three-dimensional
free-space Kirchhoff-Helmholtz integral to this two-dimensional geometry is thus given as
\[
P(\mathbf{x}, \omega) = -\oint_{\partial S} \Big( G_{0,\mathrm{2D}}(\mathbf{x}|\mathbf{x}_0, \omega) \frac{\partial}{\partial \mathbf{n}} P(\mathbf{x}_0, \omega)
- P(\mathbf{x}_0, \omega) \frac{\partial}{\partial \mathbf{n}} G_{0,\mathrm{2D}}(\mathbf{x}|\mathbf{x}_0, \omega) \Big) \, dL_0 \; , \tag{2.67}
\]
where $dL_0$ denotes a suitably chosen line element on $\partial S$. Figure 2.12 illustrates the
geometry used for the two-dimensional free-space Kirchhoff-Helmholtz integral. The specialization
of the Kirchhoff-Helmholtz integral to the two-dimensional case requires an
appropriate Green's function. As derived in Section 2.4.3, the two-dimensional analogon
to a point source is a line source. Thus, the free-space Green's function for a line
source placed at the origin is given by Eq. (2.40). Generalization of this result to an arbitrary
source position $\mathbf{x}$ and observation position $\mathbf{x}_0$ yields the two-dimensional free-space
Green's function as
\[
G_{0,\mathrm{2D}}(\mathbf{x}|\mathbf{x}_0, \omega) = G_{0,\mathrm{2D}}(\mathbf{x}_0|\mathbf{x}, \omega) = -\frac{j}{4} \, H_0^{(2)}(k |\mathbf{x} - \mathbf{x}_0|) \; . \tag{2.68}
\]
The directional gradient of Eq. (2.68) can be expressed as
\[
\frac{\partial}{\partial \mathbf{n}} G_{0,\mathrm{2D}}(\mathbf{x}|\mathbf{x}_0, \omega) = \langle \nabla G_{0,\mathrm{2D}}(\mathbf{x}|\mathbf{x}_0, \omega), \mathbf{n} \rangle
= -\frac{jk}{4} \, H_1^{(2)}(k |\mathbf{x} - \mathbf{x}_0|) \cos\varphi \; , \tag{2.69}
\]


Figure 2.12: Parameters used for the two-dimensional free-space Kirchhoff-Helmholtz integral (2.70).
where φ denotes the angle between the inward pointing normal vector n of the closed
contour ∂S and the vector x − x0. Equation (2.69) can be interpreted as the field of a
dipole line source whose axis lies parallel to the normal vector n. Introduction of Eq. (2.68)
and Eq. (2.69) into Eq. (2.67), together with Eq. (2.65), yields the two-dimensional free-space Kirchhoff-Helmholtz integral as
\[
P(\mathbf{x},\omega) = -\frac{jk}{4} \oint_{\partial S} \Bigl( j \rho_0 c \, V_n(\mathbf{x}_0,\omega)\, H_0^{(2)}(k\,|\mathbf{x}-\mathbf{x}_0|) - P(\mathbf{x}_0,\omega)\, H_1^{(2)}(k\,|\mathbf{x}-\mathbf{x}_0|) \cos\varphi \Bigr) \, dL_0 \, . \tag{2.70}
\]
The two-dimensional Kirchhoff-Helmholtz integral states that any two-dimensional pressure distribution on the surface S can be reconstructed from a distribution of monopole
line sources and dipole line sources on the closed contour ∂S surrounding the surface S.
The strength of the monopole line sources is given by the acoustic velocity Vn(x0, ω), the strength
of the dipole line sources by the acoustic pressure P(x0, ω) on the closed contour ∂S.
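The monopole and dipole building blocks of (2.70) can be evaluated numerically with standard special-function routines. The following sketch (an illustration, not part of the original text; it assumes the time convention used above, under which the 2D free-space Green's function is −(j/4)H₀⁽²⁾(k|x − x0|)) checks the reciprocity stated in Eq. (2.68) and the far-field 1/√r amplitude decay of the line source:

```python
import numpy as np
from scipy.special import hankel2

def g0_2d(x, x0, k):
    """2D free-space Green's function: -j/4 * H0^(2)(k |x - x0|)."""
    r = np.linalg.norm(np.asarray(x) - np.asarray(x0))
    return -1j / 4 * hankel2(0, k * r)

def dg0_2d_dn(x, x0, k, n):
    """Directional gradient, Eq. (2.69): jk/4 * H1^(2)(k |x - x0|) cos(phi)."""
    d = np.asarray(x) - np.asarray(x0)
    cos_phi = np.dot(d, n) / np.linalg.norm(d)   # angle between n and x - x0
    return 1j * k / 4 * hankel2(1, k * np.linalg.norm(d)) * cos_phi

k = 2 * np.pi * 1000 / 343                       # wavenumber at 1 kHz, c = 343 m/s
x, x0 = np.array([2.0, 1.0]), np.array([0.0, 0.0])
# Reciprocity: G(x|x0) = G(x0|x)
assert np.isclose(g0_2d(x, x0, k), g0_2d(x0, x, k))
# The dipole term changes sign when the normal vector is flipped
n = np.array([0.0, 1.0])
assert np.isclose(dg0_2d_dn(x, x0, k, n), -dg0_2d_dn(x, x0, k, -n))
# Far field: |G| ~ 1/sqrt(r), so doubling the distance scales |G| by ~1/sqrt(2)
g1, g2 = abs(g0_2d([10, 0], x0, k)), abs(g0_2d([20, 0], x0, k))
print(g2 / g1)                                   # close to 1/sqrt(2)
```

The 1/√r decay is the two-dimensional counterpart of the 1/r decay of a point source, and is the reason why line-source arrays exhibit a different amplitude roll-off than point-source arrays.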

2.7 The Effect of Boundaries

The solution of the (inhomogeneous) wave equation depends on the geometry of the
boundary and on the boundary conditions imposed there. Suitable boundary conditions
have already been introduced in Section 2.5. Closed-form solutions of the (inhomogeneous)
wave equation can only be given for simple geometries and boundary conditions. The
following section will derive solutions for the reflection of plane waves at planar boundaries
and the principle of acoustic mode expansion for rectangular rooms.


Figure 2.13: Illustration of the geometry and the parameters used in the discussion of
plane wave reflection and transmission at the planar boundary between two different fluid
media (medium 1 with impedance Z1 = ρ0,1 c1 carrying the incident wave Pi and the
reflected wave Pr, medium 2 with impedance Z2 = ρ0,2 c2 carrying the transmitted wave Pt).

2.7.1 Plane Wave Reflection and Transmission at a Planar Boundary

In the following, the influence of the interface between two fluid media with possibly different acoustic
properties will be considered. This model is useful to calculate the acoustic
properties of penetrable materials. Nevertheless, the case of impenetrable surfaces is also
included inherently. The walls of a room can be modeled as plane boundaries in a first approximation. However, real walls will always have finite extent. If the wavelengths of the
considered waves are small compared to the extent of a wall, it is reasonable to model
it by an infinitely large plane. Every wave field can be expanded into plane waves, as
was shown in Section 2.2.1. Thus, the interaction of plane waves with planar boundaries
is of special interest. The following section will summarize the results given in [Bla00]. To
simplify the discussion, only the two-dimensional case will be considered here. However,
the results can be generalized straightforwardly to the three-dimensional case.
Figure 2.13 illustrates the underlying geometry. The two media are described by their
densities ρ0,1, ρ0,2 and their sound velocities c1, c2. The boundary is illuminated by an
incident plane wave with incidence angle θi. The incident plane wave gives rise to a reflected
and a transmitted plane wave with the angles θr and θt, respectively.
In order to derive the relations between these wave fields it is assumed that the pressure


and the acoustic particle velocity in normal direction to the interface are continuous. As
a result, the incidence angle of the incident and the reflected wave field will be equal
\[
\theta_i = \theta_r \, . \tag{2.71}
\]

Please note that this result is independent of the acoustic properties of the media. However, the angle of the transmitted wave field does depend on the
properties of the two media. It is given as follows

\[
\frac{\sin\theta_t}{\sin\theta_i} = \frac{c_2}{c_1} \, . \tag{2.72}
\]

This relation is known as Snell's law. The acoustic pressure of the reflected and the
transmitted wave field can be formulated in terms of reflection and transmission factors.
These relate the pressure amplitudes of the reflected wave field Pr and the transmitted wave field
Pt to the pressure amplitude of the incident wave field Pi. These factors are given as

\[
R_{\mathrm{pw}} = \frac{P_r}{P_i} = \frac{Z_2 \cos\theta_i - Z_1 \cos\theta_t}{Z_2 \cos\theta_i + Z_1 \cos\theta_t} \, , \tag{2.73a}
\]
\[
T_{\mathrm{pw}} = \frac{P_t}{P_i} = \frac{2 Z_2 \cos\theta_i}{Z_2 \cos\theta_i + Z_1 \cos\theta_t} \, . \tag{2.73b}
\]

Depending on the angle of incidence θi and the properties of the two materials some
special cases arise. Only the case of normal incidence (θi = 0) will be discussed briefly;
please refer to [Bla00] for the other cases. In the case of normal incidence the reflection and
transmission factors simplify to

\[
R_{0,\mathrm{pw}} = \frac{Z_2 - Z_1}{Z_2 + Z_1} \, , \tag{2.74a}
\]
\[
T_{0,\mathrm{pw}} = \frac{2 Z_2}{Z_2 + Z_1} \, . \tag{2.74b}
\]
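The relations (2.71)–(2.74) translate directly into a few lines of code. The sketch below is illustrative only: the air and water material constants are assumed round numbers, and the incidence angle is kept below the critical angle so that θt stays real. It evaluates Snell's law and the reflection and transmission factors, and checks the pressure-continuity relation 1 + R_pw = T_pw that follows from Eq. (2.73):

```python
import numpy as np

def snell_theta_t(theta_i, c1, c2):
    # sin(theta_t)/sin(theta_i) = c2/c1, Eq. (2.72); valid below the critical angle
    return np.arcsin(c2 / c1 * np.sin(theta_i))

def reflection_transmission(theta_i, rho1, c1, rho2, c2):
    """Plane wave reflection and transmission factors, Eq. (2.73)."""
    Z1, Z2 = rho1 * c1, rho2 * c2
    theta_t = snell_theta_t(theta_i, c1, c2)
    D = Z2 * np.cos(theta_i) + Z1 * np.cos(theta_t)
    R = (Z2 * np.cos(theta_i) - Z1 * np.cos(theta_t)) / D
    T = 2 * Z2 * np.cos(theta_i) / D
    return R, T

# Illustrative media: air (rho = 1.2 kg/m^3, c = 343 m/s) onto water (1000, 1480)
R, T = reflection_transmission(np.deg2rad(10), 1.2, 343, 1000, 1480)
# Pressure continuity at the interface implies 1 + R = T
assert np.isclose(1 + R, T)
# Normal incidence reduces to Eq. (2.74)
R0, T0 = reflection_transmission(0.0, 1.2, 343, 1000, 1480)
Z1, Z2 = 1.2 * 343, 1000 * 1480
assert np.isclose(R0, (Z2 - Z1) / (Z2 + Z1)) and np.isclose(T0, 2 * Z2 / (Z2 + Z1))
```

For this impedance contrast R0 is close to +1, i.e. an air-borne wave is almost totally reflected at a water surface, which is the physical justification for the rigid-wall approximation used in the next section.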

2.7.2 Acoustic Modes in a Rectangular Room

One of the simplest models that can be used for a room is a box with rectangular shape.
Although such a simplified model neglects the complex structure of a real room (e. g. the
furniture), it allows one to calculate, with feasible complexity, the wave field produced by
sources placed inside the room. Thus, it can be used to gain insight into the structure
of wave fields propagating in rooms. The rather broad approximation by a rectangular
shape is especially reasonable for low frequencies. In the following section the theory of
acoustic modes in rectangular rooms, as derived e. g. in [Pie91, Mec02], will be briefly
reviewed.
The underlying geometry of the problem is illustrated in Fig. 2.14: a rectangular enclosure
with the lengths Lx, Ly and Lz in the x-, y- and z-direction, respectively.

Figure 2.14: Parameters used to describe the rectangular room.

The acoustic
properties of the walls will be described by their specific acoustic impedance as given
by Eq. (2.56). Due to the geometry of the problem, it is convenient to use Cartesian
coordinates in the following.
The solution of the homogeneous wave equation inside the room can be calculated by
assuming separation of the variables in the three spatial dimensions. It can be shown
that the solution of a partial differential equation on a finite domain can be expressed by
a set of basis functions with discrete spectra [MF53a]. The fundamental solution for a
rectangular room is then given as follows
\[
\psi_{\mathbf{m}}(\mathbf{x}) = A_{\mathbf{m}}(\omega)\, e^{-j \mathbf{k}_{\mathbf{m}}^T \mathbf{x}} + B_{\mathbf{m}}(\omega)\, e^{j \mathbf{k}_{\mathbf{m}}^T \mathbf{x}} \, , \tag{2.75}
\]

where k_m = [kx,m ky,m kz,m]^T denotes the modal wave vector, m = [mx my mz]^T the
vector of (integer) modal orders and A_m(ω), B_m(ω) arbitrary complex constants. The above
(fundamental) solutions are also known as the modes of the room. Each particular mode
has a specific order mx, my, mz in each spatial direction. The fundamental solution
comprises a combination of two plane waves for each spatial coordinate. The solution
of the wave equation in a rectangular room is then a weighted superposition of all possible
modes at one particular point x inside the room
\[
P(\mathbf{x},\omega) = \sum_{\mathbf{m}} \left( A_{\mathbf{m}}(\omega)\, e^{-j \mathbf{k}_{\mathbf{m}}^T \mathbf{x}} + B_{\mathbf{m}}(\omega)\, e^{j \mathbf{k}_{\mathbf{m}}^T \mathbf{x}} \right) , \tag{2.76}
\]

where the sum over m denotes the summation over all permutations of the modal orders mx, my and
mz. The constants A_m(ω) and B_m(ω) depend on the acoustic excitation present in the
room. The modal wavenumbers kx,m, ky,m and kz,m depend on the geometry and the
boundary conditions.
It will be assumed in the following that the acoustic properties of the walls are characterized by their impedance (2.56). One of the simplest solutions can be found by assuming
that all walls of the room are rigid. This assumption will serve as starting point for a, to
some extent, generalized model at the end of this section. Introducing the rigid boundary
condition into Eq. (2.76) yields

\[
P(\mathbf{x},\omega) = \sum_{\mathbf{m}} A_{\mathbf{m}}(\omega) \underbrace{\cos(k_{x,m}\,x)\cos(k_{y,m}\,y)\cos(k_{z,m}\,z)}_{\psi_{\mathbf{m}}(\mathbf{x})} \, , \tag{2.77}
\]

where the wavenumbers of the (cosine shaped) plane waves can be derived as
\[
k_{\{x,y,z\},m} = \frac{m_{\{x,y,z\}}\,\pi}{L_{\{x,y,z\}}} \, . \tag{2.78}
\]

It will be assumed further that the rigid room is excited by a point source placed at
the position x0. Figure 2.14 illustrates this configuration. For this purpose, a solution of
the inhomogeneous wave equation (2.32) under the given boundary conditions has to be
found. Since it can be shown that the modes ψ_m(x) form a set of mutually orthogonal
functions, it is possible to perform a modal expansion of the inhomogeneous part of the
wave equation. The expansion coefficients A_m(ω) can be derived by comparison of the
expansion coefficients for the homogeneous solution and the excitation. The resulting
wave field for a point source and rigid walls is
\[
P(\mathbf{x},\omega) = \frac{4\pi}{V}\, P(\omega) \sum_{\mathbf{m}} \frac{\psi_{\mathbf{m}}(\mathbf{x})\,\psi_{\mathbf{m}}(\mathbf{x}_0)}{k^2 - k_{\mathbf{m}}^2} \, , \tag{2.79}
\]

where k_m = |k_m| denotes the modal wavenumber, P(ω) the spectrum of the point source
and V = Lx Ly Lz the volume of the room. A consequence of Eq. (2.79) is that a room
with rigid boundaries will exhibit resonances at the discrete frequencies ωm = km c.
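For a rigid rectangular room, Eq. (2.78) together with ωm = km c gives these resonance frequencies in closed form, fm = (c/2)·sqrt((mx/Lx)² + (my/Ly)² + (mz/Lz)²). A minimal sketch for a hypothetical room of 5 m × 4 m × 3 m (the dimensions are an assumption for illustration):

```python
import numpy as np
from itertools import product

def mode_frequencies(L, c=343.0, max_order=3):
    """Rigid-wall eigenfrequencies f_m = (c/2)*sqrt(sum((m_i/L_i)^2)), from Eq. (2.78)."""
    modes = []
    for m in product(range(max_order + 1), repeat=3):
        if m == (0, 0, 0):
            continue                      # skip the static (0,0,0) solution
        f = c / 2 * np.sqrt(sum((mi / Li) ** 2 for mi, Li in zip(m, L)))
        modes.append((f, m))
    return sorted(modes)

# Hypothetical room of 5 m x 4 m x 3 m
modes = mode_frequencies((5.0, 4.0, 3.0))
for f, m in modes[:3]:
    print(f"{m}: {f:.1f} Hz")             # lowest modes of the room
```

The lowest resonance is the axial mode (1, 0, 0) along the longest dimension at c/(2 Lx) ≈ 34.3 Hz, which illustrates why the modal structure of typical rooms matters most at low frequencies.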
The results derived so far can be generalized to the case of boundary conditions of the
third kind (see Section 2.5). The acoustic properties of the walls are then described by
their specific acoustic impedance Zs. It can be shown [Mec02] that the fundamental
solution for this generalized case is still given by Eq. (2.75). Unfortunately, it is not
possible to derive the wave vectors k_m (spatial eigenfrequencies) of the modes explicitly.
In the general case, they have to be calculated numerically. However, for a room with
nearly rigid walls

\[
\frac{|Z_s|}{\rho_0 c} \gg 1 \, , \tag{2.80}
\]
it can be shown [Pie91] that a reasonable approximation is given by replacing the modal
wavenumber k_m² in Eq. (2.79) by a complex wavenumber whose imaginary part is proportional to the real part of the wall admittance. This approximated solution is given as
follows

\[
P(\mathbf{x},\omega) = \frac{4\pi}{V}\, P(\omega) \sum_{\mathbf{m}} \frac{\psi_{\mathbf{m}}(\mathbf{x})\,\psi_{\mathbf{m}}(\mathbf{x}_0)}{k^2 - k_{\mathbf{m}}^2 + j \frac{k}{c\,\tau}} \, , \tag{2.81}
\]

where τ denotes the characteristic time of the Sabine-Franklin-Jaeger model. It is related
to the reverberation time by T60 = (6 ln 10) τ.
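A truncated version of the modal sum (2.81) can be evaluated directly. The sketch below rests on illustrative assumptions (room dimensions, source and receiver positions, a reverberation time of 0.6 s, truncation at modal order 6); it converts T60 to τ and locates the strongest resonance in a low-frequency band, which at the chosen positions turns out to be the (2, 0, 0) mode near c/Lx ≈ 68.6 Hz:

```python
import numpy as np
from itertools import product

def room_response(x, x0, L, f, T60, c=343.0, max_order=6):
    """Truncated modal sum of Eq. (2.81) for a nearly rigid rectangular room."""
    tau = T60 / (6 * np.log(10))                 # from T60 = (6 ln 10) tau
    k = 2 * np.pi * f / c
    x, x0, L = map(np.asarray, (x, x0, L))
    P = 0j
    for m in product(range(max_order + 1), repeat=3):
        km_vec = np.pi * np.array(m) / L         # modal wavenumbers, Eq. (2.78)
        km2 = np.dot(km_vec, km_vec)
        psi = np.prod(np.cos(km_vec * x)) * np.prod(np.cos(km_vec * x0))
        P += psi / (k**2 - km2 + 1j * k / (c * tau))
    return 4 * np.pi / np.prod(L) * P

L, x0, x = (5.0, 4.0, 3.0), (1.0, 1.0, 1.0), (2.5, 2.0, 1.5)
freqs = np.linspace(40.0, 100.0, 121)
H = np.abs([room_response(x, x0, L, fi, T60=0.6) for fi in freqs])
f_peak = freqs[np.argmax(H)]
print(f_peak)                                    # near the (2,0,0) resonance
```

The damping term j k/(c τ) turns the sharp rigid-wall poles of Eq. (2.79) into resonances of finite width, so the magnitude response remains bounded at the modal frequencies.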


Chapter 3
Fourier Analysis of Wave Fields
Fourier analysis has been used in many different fields of engineering in the last decades. Its
success is, amongst others, due to efficient implementations, like the fast Fourier
transformation (FFT), which are widely available. The following chapter will first review Fourier analysis of spatio-temporal signals. The specialization to acoustic wave
fields yields the plane wave decomposition, which will be introduced and discussed in the
remainder of this chapter.

3.1 Fourier Analysis of Multidimensional Signals

The Fourier transformation of signals has proven to be a powerful tool for system and
signal analysis. Acoustic pressure fields have, in general, four degrees of freedom: three
spatial dimensions and one temporal dimension. They can be understood as four-dimensional signals. To keep the notation brief, this special type of multidimensional signals will be denoted as multidimensional or spatio-temporal signals within the remainder of this thesis.
The most commonly used form of the Fourier transformation is the time-domain Fourier
transformation of signals. This transformation was already introduced in Section 2.1.
The Fourier transformation of one-dimensional signals can be extended straightforwardly
to multidimensional signals, as will be shown in the sequel. This section proceeds as follows: First, the multidimensional Fourier transformation of generic signals which depend
on one temporal and up to three spatial dimensions will be introduced. As shown in
Chapter 2, it is convenient to use different coordinate systems for the description of wave
fields. Therefore, the multidimensional Fourier transformation will be specialized to the
Cartesian and cylindrical coordinate systems in a next step.

3.1.1 Multidimensional Fourier Transformation

The time-domain Fourier transform pair of a one-dimensional time-dependent signal p(t)
was already introduced in Section 2.1 (see Eq. (2.6)) as

\[
P(\omega) = \mathcal{F}_t\{p(t)\} = \int_{-\infty}^{\infty} p(t)\, e^{-j\omega t}\, dt \, , \tag{3.1a}
\]
\[
p(t) = \mathcal{F}_t^{-1}\{P(\omega)\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} P(\omega)\, e^{j\omega t}\, d\omega \, , \tag{3.1b}
\]

where ω denotes the temporal frequency and F_t{·} the Fourier transformation with respect
to the time t. Time-domain Fourier transformed signals are denoted by capital letters
within this thesis. The properties and theorems of the time-domain Fourier transformation
will not be discussed within this thesis; please refer to the literature, e. g. [GRS01, HV99].
The temporal Fourier transformation can be extended straightforwardly to spatio-temporal
signals p(x, t). The Fourier transform pair for a multidimensional signal p(x, t) is then
given as follows [JD93, Zio95]

\[
\tilde{P}(\mathbf{k},\omega) = \mathcal{F}_{\mathbf{x},t}\{p(\mathbf{x},t)\} = \int_{\mathbb{R}^D} \int_{-\infty}^{\infty} p(\mathbf{x},t)\, e^{j\langle \mathbf{k},\mathbf{x}\rangle - j\omega t}\, dt\, dV \, , \tag{3.2a}
\]
\[
p(\mathbf{x},t) = \mathcal{F}_{\mathbf{x},t}^{-1}\{\tilde{P}(\mathbf{k},\omega)\} = \frac{1}{(2\pi)^{D+1}} \int_{\mathbb{R}^D} \int_{-\infty}^{\infty} \tilde{P}(\mathbf{k},\omega)\, e^{-j\langle \mathbf{k},\mathbf{x}\rangle + j\omega t}\, d\omega\, dK \, , \tag{3.2b}
\]

where k denotes the wave vector, D ∈ {1, 2, 3} the spatial dimensionality of the signal,
⟨k, x⟩ the inner product of k and x, and dV and dK volume elements of the position and wave
vector space, respectively. Spatially Fourier transformed signals will be denoted by a tilde
over the variable within this thesis. The spatial part of the transformation, as formulated
above, is independent of the particular coordinate system used. The inner product
⟨k, x⟩ and the volume elements dV and dK have to be specialized to the coordinate system used. Please note that the exponential terms for the spatial and the temporal part
have opposite signs. This choice of signs accounts for the spatio-temporal propagation of
plane waves, as will be shown in Section 3.3.1. However, in the literature different choices
for the signs of the temporal and spatial part may be found. The Fourier transformation
and its inverse form a complete set of transformations. A signal can be transformed into
the wavenumber-frequency domain with the Fourier transformation (3.2a) and afterwards
back into the space-time domain using the inverse Fourier transformation (3.2b) without
any information loss.
Examination of the spatio-temporal Fourier integrals (3.2) shows that the integration
can be split into a spatial and a temporal part. The spatio-temporal Fourier transformation (3.2a) can be expressed as a temporal Fourier transformation of the signal followed
by a spatial Fourier transformation

\[
\tilde{P}(\mathbf{k},\omega) = \mathcal{F}_{\mathbf{x}}\{\mathcal{F}_t\{p(\mathbf{x},t)\}\} = \mathcal{F}_{\mathbf{x}}\{P(\mathbf{x},\omega)\} \, . \tag{3.3}
\]

Figure 3.1: Separability of the spatio-temporal Fourier transformation into a temporal
and spatial part as given by Eq. (3.3).
This property is referred to as separability. The spatio-temporal Fourier transformation as
given by Eq. (3.2a) is thus separable in the temporal and spatial dimensions. The same
principle applies also to the inverse spatio-temporal Fourier transformation (3.2b). Within
this thesis, the temporal Fourier transformation will often not be mentioned explicitly.
The respective variables will be given directly in the (temporal) frequency domain. In this
case, the spatio-temporal Fourier transformations (3.2) simplify to a spatial Fourier
transformation of the spatio-temporal signal given in the (temporal) frequency domain as
follows

\[
\tilde{P}(\mathbf{k},\omega) = \mathcal{F}_{\mathbf{x}}\{P(\mathbf{x},\omega)\} = \int_{\mathbb{R}^D} P(\mathbf{x},\omega)\, e^{j\langle \mathbf{k},\mathbf{x}\rangle}\, dV \, , \tag{3.4a}
\]
\[
P(\mathbf{x},\omega) = \mathcal{F}_{\mathbf{x}}^{-1}\{\tilde{P}(\mathbf{k},\omega)\} = \frac{1}{(2\pi)^D} \int_{\mathbb{R}^D} \tilde{P}(\mathbf{k},\omega)\, e^{-j\langle \mathbf{k},\mathbf{x}\rangle}\, dK \, . \tag{3.4b}
\]
These transformations will be referred to as (spatial) Fourier transformations of a multidimensional signal in the following. Figure 3.1 gives an overview of the spatio-temporal
Fourier transformations of signals as introduced in this section. Up to now, this formulation is independent of the particular coordinate system used for the spatial variables.
The following two sections will specialize the transformations above to the Cartesian and
cylindrical coordinate systems.

3.1.2 Multidimensional Fourier Transformation in Cartesian Coordinates

If the spatial variable of the multidimensional signal is given in Cartesian coordinates
x_C = [x y z]^T, then the natural choice for the coordinate system of the wave vector is to
also use the Cartesian coordinate system k_C = [kx ky kz]^T. The inner product of k_C and
x_C in Cartesian coordinates is defined as

\[
\langle \mathbf{k}_C, \mathbf{x}_C \rangle = \mathbf{k}_C^T \mathbf{x}_C = k_x x + k_y y + k_z z \, . \tag{3.5}
\]

Introducing this definition into the Fourier transformation (3.4) yields the Fourier transformation of a signal whose spatial dependency is given in Cartesian coordinates

\[
\tilde{P}_C(\mathbf{k}_C,\omega) = \mathcal{F}_{\mathbf{x}}\{P_C(\mathbf{x}_C,\omega)\} = \int_{\mathbb{R}^3} P_C(\mathbf{x}_C,\omega)\, e^{j \mathbf{k}_C^T \mathbf{x}_C}\, dV \, , \tag{3.6a}
\]
\[
P_C(\mathbf{x}_C,\omega) = \mathcal{F}_{\mathbf{x}}^{-1}\{\tilde{P}_C(\mathbf{k}_C,\omega)\} = \frac{1}{(2\pi)^3} \int_{\mathbb{R}^3} \tilde{P}_C(\mathbf{k}_C,\omega)\, e^{-j \mathbf{k}_C^T \mathbf{x}_C}\, dK \, , \tag{3.6b}
\]

where dV = dx dy dz and dK = dkx dky dkz denote the space and wavenumber volume
elements used for integration. The multidimensional Fourier transformation formulated in
Cartesian coordinates is normally referred to as the multidimensional Fourier transformation
in the literature.
The Fourier transformation exhibits several properties and theorems which are of interest
within this thesis. These will be reviewed shortly in the following. For a detailed discussion
of the properties of multidimensional Fourier transformations please refer to the literature,
e.g. [Bam89, Bra78].

Separability
The exponential terms in the Fourier integrals (3.6) can be split into a multiplication
of exponential terms involving only one spatial dimension. Thus, the integration can be
performed independently for each spatial dimension. Since the Fourier transformation
is also separable in the temporal dimension, the spatio-temporal Fourier transformation
formulated in Cartesian coordinates is fully separable in all variables. This property is
often exploited in practical implementations.

Scaling theorem
The scaling theorem of the Fourier transformation relates the Fourier transformation of a
signal whose spatial variable has been scaled to its unscaled Fourier transformation. Let us
consider a spatio-temporal signal with the Fourier transformation P̃_C(kx, ky, kz, ω) =
F_x{P_C(x, y, z, ω)}. The Fourier transformation of this signal when scaling the x-coordinate
by the constant factor a is given as

\[
\mathcal{F}_{\mathbf{x}}\{P_C(a\,x, y, z, \omega)\} = \frac{1}{|a|}\, \tilde{P}_C\!\left(\frac{k_x}{a}, k_y, k_z, \omega\right) . \tag{3.7}
\]

The same result applies to a scaling of the other spatial variables, since the Fourier
transformation in Cartesian coordinates is fully separable for all spatial variables.


Convolution theorem
The convolution theorem relates the convolution of two signals to their Fourier transformations. The convolution of two spatial signals is defined in Appendix B.1 by Eq. (B.8).
Introducing this definition into the Fourier transformation (3.6a) yields

\[
\mathcal{F}_{\mathbf{x}}\{f_C(\mathbf{x}_C,t) \ast_{\mathbf{x},t} h_C(\mathbf{x}_C,t)\} = \tilde{F}_C(\mathbf{k}_C,\omega)\, \tilde{H}_C(\mathbf{k}_C,\omega) \, . \tag{3.8}
\]

Thus, the Fourier transformation of a (spatial) convolution of two signals is given as the
multiplication of the Fourier transformations of the two signals.
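A discrete, circular analogue of the convolution theorem (3.8) in one dimension (the same identity holds axis-by-axis in higher dimensions):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal(64)
h = rng.standard_normal(64)

# Circular (periodic) convolution evaluated directly from the definition ...
conv = np.array([sum(f[m] * h[(n - m) % 64] for m in range(64)) for n in range(64)])
# ... and via the convolution theorem: elementwise product of the DFTs
conv_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(h)).real
assert np.allclose(conv, conv_fft)
```

This is the basis of fast convolution: the O(N²) sum is replaced by three O(N log N) FFTs and an elementwise product.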

3.1.3 Multidimensional Fourier Transformation in Cylindrical Coordinates

The Fourier transformation (3.6), as introduced in the previous section, is based on the
Cartesian coordinate system. However, problems with cylindrical symmetry can often
be described more conveniently in cylindrical coordinates. This section will introduce the
Fourier transformation of signals whose spatial dependency is given in cylindrical coordinates.
The cylindrical coordinate system, as used within this work, is introduced in Section B.3.
Using the relations given there, the inner product of k_Y and x_Y in cylindrical coordinates
becomes

\[
\langle \mathbf{k}_Y, \mathbf{x}_Y \rangle = k_r\, r \cos(\alpha - \theta) + k_z z \, , \tag{3.9}
\]

where x_Y = [α r z]^T denotes the position and k_Y = [θ kr kz]^T the wave vector in cylindrical
coordinates. The volume elements dV and dK in Eq. (3.4) can be expressed as dV =
r dα dr dz and dK = kr dθ dkr dkz, respectively. The Fourier transformation and its
inverse in cylindrical coordinates are then derived as follows

\[
\tilde{P}_Y(\mathbf{k}_Y,\omega) = \mathcal{F}_{\mathbf{x}}\{P_Y(\mathbf{x}_Y,\omega)\} = \int_{-\infty}^{\infty}\!\int_0^{\infty}\!\int_0^{2\pi} P_Y(\mathbf{x}_Y,\omega)\, e^{j k_r r \cos(\alpha-\theta) + j k_z z}\, r\, d\alpha\, dr\, dz \, , \tag{3.10a}
\]
\[
P_Y(\mathbf{x}_Y,\omega) = \mathcal{F}_{\mathbf{x}}^{-1}\{\tilde{P}_Y(\mathbf{k}_Y,\omega)\} = \frac{1}{(2\pi)^3} \int_{-\infty}^{\infty}\!\int_0^{\infty}\!\int_0^{2\pi} \tilde{P}_Y(\mathbf{k}_Y,\omega)\, e^{-j k_r r \cos(\alpha-\theta) - j k_z z}\, k_r\, d\theta\, dk_r\, dk_z \, . \tag{3.10b}
\]

Figure 3.2 illustrates the connections between the Fourier transformation formulated in
Cartesian and cylindrical coordinates. The interconnection of both is given by the conversion of the position vector and the wave vector into Cartesian or cylindrical coordinates,
respectively.

Figure 3.2: Illustration of the connections between the Fourier transformation formulated in Cartesian and cylindrical coordinates (position: x = r cos α, y = r sin α, z = z; wave vector: kx = kr cos θ, ky = kr sin θ, kz = kz).

The transformations (3.10), as introduced above, assume three spatial dimensions. However, they can be specialized straightforwardly to the two-dimensional case by assuming that the signals do not depend on the z-coordinate (see Section 2.1.2). The resulting polar Fourier transformation can be derived from Eq. (3.10) by discarding the
z-dependent parts and replacing the radial wavenumber kr by the wavenumber k

\[
\tilde{P}_P(\mathbf{k}_P,\omega) = \mathcal{F}_{\mathbf{x}}\{P_P(\mathbf{x}_P,\omega)\} = \int_0^{\infty}\!\int_0^{2\pi} P_P(\mathbf{x}_P,\omega)\, e^{j k r \cos(\alpha-\theta)}\, r\, d\alpha\, dr \, , \tag{3.11a}
\]
\[
P_P(\mathbf{x}_P,\omega) = \mathcal{F}_{\mathbf{x}}^{-1}\{\tilde{P}_P(\mathbf{k}_P,\omega)\} = \frac{1}{(2\pi)^2} \int_0^{\infty}\!\int_0^{2\pi} \tilde{P}_P(\mathbf{k}_P,\omega)\, e^{-j k r \cos(\alpha-\theta)}\, k\, d\theta\, dk \, . \tag{3.11b}
\]
The cylindrical Fourier transformation exhibits several properties and theorems which are
of interest within this thesis. These will be reviewed shortly in the following:
Separability
As a result of the cylindrical basis, the cylindrical Fourier transformation is not fully
separable. However, as can be deduced from Eq. (3.10a), it is still separable in the z-variable, but not in the angular and radial variables. Thus, the cylindrical Fourier
transformation can be separated as follows

\[
\tilde{P}_Y(\mathbf{k}_Y,\omega) = \mathcal{F}_{\mathbf{x}}\{P_Y(\mathbf{x}_Y,\omega)\} = \mathcal{F}_z\{\mathcal{F}_{\alpha,r}\{P_Y(\alpha, r, z, \omega)\}\} = \mathcal{F}_z\{\tilde{P}_Y(\theta, k_r, z, \omega)\} \, . \tag{3.12}
\]

The transformation of the z-coordinate is then a one-dimensional Fourier transformation.


Scaling theorem
In the following, only a scaling of the radial variable r and the height variable z is considered. A
scaling of the angular variable α does not make much sense due to the periodicity of the
angle in cylindrical coordinates. Since the cylindrical Fourier transformation is separable
in the z-variable by using a one-dimensional Fourier transformation for the z-variable, the
same scaling theorem as for the Cartesian Fourier transformation applies here. For the
radial variable the scaling theorem will be derived in the following. Let us assume a signal
given in cylindrical coordinates P_Y(x_Y, ω) has the Fourier transformation P̃_Y(k_Y, ω). The
problem is to find the Fourier transformation of the radially scaled signal P_Y(α, a r, z, ω)
in terms of the Fourier transformation of the unscaled signal. By substituting a r into
the definition of the cylindrical Fourier transformation (3.10a) the desired relation can be
found as

\[
\mathcal{F}_{\mathbf{x}}\{P_Y(\alpha, a\,r, z, \omega)\} = \frac{1}{a^2}\, \tilde{P}_Y\!\left(\theta, \frac{k_r}{a}, z, \omega\right) . \tag{3.13}
\]

Thus, scaling in the radial direction only influences the radial wavenumber kr of the
Fourier transformed signal. This result is evident when considering the circular basis
of the cylindrical Fourier transformation. Scaling of the radial variable is a natural
operation on this basis and will not influence the angle- or height-dependent parts.
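For a radially symmetric field (only the ν = 0 term of the angular expansion introduced below), the radial part of the cylindrical Fourier transformation reduces to a zeroth-order Hankel transform, and Eq. (3.13) becomes a one-dimensional statement that can be checked numerically. The sketch below uses the Gaussian e^(−r²/2), whose Hankel transform is known in closed form, as a test profile (an illustration under these assumptions, not part of the original text):

```python
import numpy as np
from scipy.special import j0
from scipy.integrate import quad

def hankel0(f, kr):
    """Zeroth-order Hankel transform: integral of f(r) J0(kr r) r dr over [0, inf)."""
    return quad(lambda r: f(r) * j0(kr * r) * r, 0, np.inf)[0]

a, kr = 2.0, 1.5
g = lambda r: np.exp(-r**2 / 2)          # unscaled radial profile
H_scaled = hankel0(lambda r: g(a * r), kr)
# Radially symmetric form of Eq. (3.13): H{g(a r)}(kr) = (1/a^2) * H{g}(kr / a)
assert np.isclose(H_scaled, hankel0(g, kr / a) / a**2, rtol=1e-6)
```

Stretching the field radially thus compresses its radial wavenumber content, just as in the Cartesian case (3.7), but with the amplitude factor 1/a² reflecting the two-dimensional area element.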
Convolution theorem
The convolution theorem of the Fourier transformation in Cartesian coordinates (3.8)
gives a simple relation between the convolution of two signals and their Fourier transformations. It would be desirable to find a similar relationship for the cylindrical Fourier
transformation. This relationship can be found by transforming both sides of Eq. (3.8)
from Cartesian to cylindrical coordinates. However, the convolution operation in Cartesian coordinates is then transformed into its representation in cylindrical coordinates as
given by Eq. (B.23). Since this representation is quite different from the definition of the
convolution in Cartesian coordinates, it will be termed cylindrical convolution in the
following. The overall result is then given as

\[
\mathcal{F}_{\mathbf{x}}\{G_Y(\mathbf{x}_Y,\omega) \circledast_{\mathbf{x}} H_Y(\mathbf{x}_Y,\omega)\} = \tilde{G}_Y(\mathbf{k}_Y,\omega)\, \tilde{H}_Y(\mathbf{k}_Y,\omega) \, , \tag{3.14}
\]

where ⊛_x denotes the cylindrical convolution with respect to x_Y.

3.1.4 Cylindrical Fourier Transformation Expressed as Hankel Transformation

As P_Y(x_Y, ω) is periodic in α with a period of 2π for any fixed r and z, it is possible to
develop P_Y(x_Y, ω) into a Fourier series with respect to the angle α

\[
P_Y(\alpha, r, z, \omega) = \mathcal{FS}\{\mathring{P}(\nu, r, z, \omega)\} = \sum_{\nu=-\infty}^{\infty} \mathring{P}(\nu, r, z, \omega)\, e^{j\nu\alpha} \, , \tag{3.15}
\]

where ν ∈ Z denotes the angular frequency. The expansion coefficients P̊(ν, r, z, ω) are
given as

\[
\mathring{P}(\nu, r, z, \omega) = \mathcal{FS}^{-1}\{P_Y(\alpha, r, z, \omega)\} = \frac{1}{2\pi} \int_0^{2\pi} P_Y(\alpha, r, z, \omega)\, e^{-j\nu\alpha}\, d\alpha \, . \tag{3.16}
\]


The Fourier series transform pair will be denoted as Fourier series (FS) in the following, the
expansion coefficients by a circle over the respective variable. The series expansion (3.15)
will now be plugged into the definition of the cylindrical Fourier transformation (3.10a).
Since the z-coordinate can be transformed independently from the angular and
radial coordinates, it is sufficient to consider only the latter two. In order to simplify the
notation, the Fourier transformation of the z-coordinate will not be mentioned explicitly.
Plugging Eq. (3.15) into the definition of the cylindrical Fourier transformation (3.10a)
yields

\[
\tilde{P}_Y(\theta, k_r, z, \omega) = \sum_{\nu=-\infty}^{\infty} \int_0^{\infty} \mathring{P}(\nu, r, z, \omega)\, r \underbrace{\int_0^{2\pi} e^{j\nu\alpha}\, e^{j k_r r \cos(\alpha-\theta)}\, d\alpha}_{h_\nu(\theta, r, k_r)}\, dr \, , \tag{3.17}
\]

where h_ν(θ, r, kr) abbreviates the angular integral. This can be solved by using the substitution α′ = α − θ − π/2 [Wei03]

\[
h_\nu(\theta, r, k_r) = e^{j\nu\theta}\, e^{j\nu\frac{\pi}{2}} \int_0^{2\pi} e^{j(\nu\alpha' - k_r r \sin\alpha')}\, d\alpha' = 2\pi\, j^\nu J_\nu(k_r r)\, e^{j\nu\theta} \, , \tag{3.18}
\]
where J_ν(·) denotes the ν-th order Bessel function of the first kind [AS72]. Substitution of
this result into Eq. (3.17) results in

\[
\tilde{P}_Y(\theta, k_r, z, \omega) = 2\pi \sum_{\nu=-\infty}^{\infty} j^\nu e^{j\nu\theta} \int_0^{\infty} \mathring{P}(\nu, r, z, \omega)\, J_\nu(k_r r)\, r\, dr \, . \tag{3.19}
\]

The remaining radial integral equals the definition of the ν-th order Hankel transformation [Gas78, Sne72]. This transformation will be denoted in the following as

\[
\mathcal{H}_{\nu,r}\{f(r)\} = \int_0^{\infty} f(r)\, J_\nu(k_r r)\, r\, dr \, . \tag{3.20}
\]

Thus, the cylindrical Fourier transformation can be expressed in terms of a discrete Fourier
series. The series coefficients are given by the Hankel transformation of the Fourier series
coefficients of the pressure field. Applying the same steps to the inverse cylindrical Fourier
transformation yields the Fourier transformations (3.10) in terms of a Fourier series as
follows

\[
\tilde{P}_Y(\mathbf{k}_Y,\omega) = 2\pi \sum_{\nu=-\infty}^{\infty} j^\nu\, \mathcal{H}_{\nu,r}\{\mathring{P}(\nu, r, k_z, \omega)\}\, e^{j\nu\theta} \, , \tag{3.21a}
\]
\[
P_Y(\mathbf{x}_Y,\omega) = \frac{1}{2\pi} \sum_{\nu=-\infty}^{\infty} j^{-\nu}\, \mathcal{H}_{\nu,k_r}\{\mathring{\tilde{P}}_Y(\nu, k_r, z, \omega)\}\, e^{j\nu\alpha} \, . \tag{3.21b}
\]

Figure 3.3: Representation of the cylindrical Fourier transformation as Fourier series of
the Hankel transformed angular expansion coefficients.
The cylindrical Fourier transformation is decomposed into a Fourier series with respect
to the angle θ. The expansion coefficients of the series are given by the Hankel transformation of the angular expansion coefficients of the wave field, multiplied by the factor j^ν.
The Hankel transformation itself operates only on the radial axis. As a result, the computation of the cylindrical Fourier transformation is split into an angular and a radial
part. Figure 3.3 illustrates this result. Similar considerations also apply to the inverse
cylindrical Fourier transformation.

3.2 Linear Systems and Fourier Analysis

The concept of the Fourier transformation is strongly related to the theory of linear
systems. For example, the time-domain Fourier transformation is widely used for the
analysis of linear systems in the frequency domain. The following section will discuss the
relation of multidimensional Fourier transformations to linear systems and will illustrate
that the wave equation can be understood as a linear system. As for the multidimensional
Fourier transformation, the discussion will be limited to spatio-temporal signals which
depend on one temporal and up to three spatial dimensions.
A one-dimensional system maps a one-dimensional input signal to a one-dimensional
output signal. For a time-domain signal this mapping is typically denoted as follows

\[
p(t) = \mathcal{S}\{q(t)\} \, , \tag{3.22}
\]

where S denotes the system, q(t) the input signal and p(t) the output signal. An example
for such a system could be a wireless transmission system for a speech signal. The concept
of one-dimensional systems can be extended straightforwardly to spatio-temporal signals.
The mapping between the input q(x, t) and the output signal p(x, t) can then be written
as

\[
p(\mathbf{x},t) = \mathcal{S}\{q(\mathbf{x},t)\} \, . \tag{3.23}
\]

Such a system will be denoted as a multidimensional system in the following or, if the
context is clear, simply as a system. Figure 3.4 gives a graphical representation of a generic
multidimensional system.

Figure 3.4: Graphical representation of a multidimensional system mapping the input
q(x, t) to the output p(x, t).

If a system involves multiple inputs and multiple outputs it
is termed a multiple-input/multiple-output (MIMO) system. Combining all inputs and
outputs into vectors q(x, t) and p(x, t), respectively, yields

\[
\mathbf{p}(\mathbf{x},t) = \mathcal{S}\{\mathbf{q}(\mathbf{x},t)\} \, . \tag{3.24}
\]

If there is only a single input or output, the system will be termed a single-input/multiple-output (SIMO) or multiple-input/single-output (MISO) system, and consequently a system with one input and one output a single-input/single-output (SISO) system.
Systems can be characterized by considering certain properties of the connection between
the input and output signals. The next section will introduce a common classification of
(multidimensional) systems which is based on their properties.

3.2.1 Classification of Multidimensional Systems

This section gives a brief overview of the properties of systems which are useful within
the context of this thesis. For a more in-depth discussion of the properties of systems
please refer to the literature [GRS01, Bam89]. Multidimensional systems can be classified
as follows:
Linear systems
A system is said to be linear if a superposition of scaled input signals leads to a superposition
of correspondingly scaled output signals, where each output signal is the response to the particular
input signal. This principle can be formulated as follows

\[
\mathcal{S}_{\mathrm{L}}\{A_1 q_1(\mathbf{x},t) + A_2 q_2(\mathbf{x},t)\} = A_1 p_1(\mathbf{x},t) + A_2 p_2(\mathbf{x},t) \, , \tag{3.25}
\]

where A1, A2 denote arbitrary constants and q_i(x, t) the input signal that produces the
output p_i(x, t) = S{q_i(x, t)}. If only one input signal is present, then the response of a
linear system to a scaled input signal is an output signal which is scaled according to the
scaling of the input signal. The discussion of systems will be limited to linear systems
within this thesis. It will be assumed in the following that the linearity condition given
by Eq. (3.25) is always met.

Linear time-shift invariant systems (LTI)
A system is said to be time-shift invariant if the response to a delayed input signal is
the corresponding output signal delayed by the same amount as the input signal. This
principle can be formulated as follows

\[
\mathcal{S}_{\mathrm{LTI}}\{q(\mathbf{x}, t - t_0)\} = p(\mathbf{x}, t - t_0) \, , \tag{3.26}
\]

where t0 denotes an arbitrary delay, and p(x, t) = S{q(x, t)}.


Linear space-shift invariant (LSI)
Space-shift invariance involves the same principle as time-shift invariance but applied to
the spacial coordinates. A system is said to be space-shift invariant if the response to a
spacial shifted input signal is the corresponding output signal shifted by the same amount
as the input signal. This principle can be formulated as follows
SLSI {q(x x0 , t)} = p(x x0 , t) ,

(3.27)

where x0 denotes an arbitrary vector, and p(x, t) = S{q(x, t)}.


Linear time- and space-shift invariant (LTSI)
The combination of all characteristics mentioned so far yields the concept of a linear
space- and time-shift invariant system. Its characteristics can be formulated as
SLTSI {q(x x0 , t t0 )} = p(x x0 , t t0 ) ,

(3.28)

where t0 denotes an arbitrary delay, x0 an arbitrary vector and p(x, t) = S{q(x, t)}.
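On a sampled, periodic domain, any system realized as a convolution with a fixed space-time kernel is LTSI, and property (3.28) can be checked directly. The following is a discrete sketch with an arbitrary random kernel, not a model of any particular acoustic system:

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(2)
kernel = rng.standard_normal((3, 5))                  # fixed space-time kernel
system = lambda q: convolve(q, kernel, mode='wrap')   # convolution on a periodic domain

q = rng.standard_normal((16, 32))                     # input q[x, t]
x0, t0 = 4, 7
q_shifted = np.roll(q, (x0, t0), axis=(0, 1))

# LTSI, Eq. (3.28): the response to the shifted input is the shifted response
assert np.allclose(system(q_shifted), np.roll(system(q), (x0, t0), axis=(0, 1)))
```

Conversely, a kernel that varied with position or time would break this identity, which mirrors the requirement below that the medium and boundary conditions be constant over space and time.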

3.2.2 The Wave Equation as Multidimensional System

The process of wave propagation can be understood as a multidimensional system. The
input signal to the system is the acoustic source q(x, t), the output signal is the acoustic pressure p(x, t). The system in between these two signals can be characterized by
the properties of the propagation medium and the boundary conditions. In the following discussion, the properties of the multidimensional system representing the process of
wave propagation will be linked to the characteristics of the propagation medium and the
boundary conditions imposed by a boundary.
The process of wave propagation is described by the wave equation. The homogeneous
wave equation, however, has no source terms. The inhomogeneous wave equation has
source terms and thus can be understood as a system. The derivation of the wave equation,
as reviewed within this thesis, implied that the medium characteristics are linear. As a

3. Fourier Analysis of Wave Fields

System property                     | Required conditions
------------------------------------|-----------------------------------------------
linear system                       | linear medium; linear boundary conditions; homogeneous boundary conditions
linear time-shift invariant system  | medium characteristics are constant over time; boundary conditions are constant over time
linear space-shift invariant system | unlimited domain (no boundary conditions are present); homogeneous medium

Table 3.1: Relations between system properties, medium characteristics and boundary conditions for the inhomogeneous wave equation.

result, the inhomogeneous wave equation (2.26) consists of a linear superposition of differential operators in time and space. Since these operators are linear, the inhomogeneous wave equation can be interpreted as a linear system when discarding the boundary conditions. If boundary conditions are present, then these also have to be linear and homogeneous in order for the system to be linear. Summarizing, wave propagation can be understood as a linear system if the medium characteristics are linear and the boundary conditions are linear and homogeneous.
The differential operators in the wave equation (2.5) are time- and space-invariant. However, for the wave equation to represent a time-shift invariant system, the medium characteristics and boundary conditions have to be constant over time. As boundary conditions define boundaries in the spatial domain, space-shift invariance can only be obtained when no boundary conditions are present. This implies that the domain on which the inhomogeneous wave equation is defined is unlimited. Additionally, the medium characteristics have to be constant over space. Thus, space-shift invariance can only be derived for free-field propagation in a homogeneous medium. Table 3.1 summarizes the results. If the process of wave propagation is to represent an LTSI system, then all of the requirements on the right hand side of Table 3.1 have to hold.

3.2.3 Linear Systems and the Fourier Transformation

The time-domain Fourier transformation provides a powerful tool for the analysis of LTI systems. The same applies to multidimensional LTSI systems and the multidimensional Fourier transformation, as will be shown in the following.
Using the sifting property of the Dirac function, the response p(x, t) of a generic linear


multidimensional system $\mathcal{S}^L$ to an input signal $q(\mathbf{x}, t)$ can be expressed as follows

$p(\mathbf{x}, t) = \mathcal{S}^L\{q(\mathbf{x}, t)\} = \int_{-\infty}^{\infty} \int_{\mathbb{R}^D} q(\mathbf{x}', \tau)\, \mathcal{S}^L\{\delta(\mathbf{x} - \mathbf{x}', t - \tau)\}\, dV'\, d\tau ,$   (3.29)

where $dV'$ denotes a suitably chosen volume element for the integration. Equation (3.29) states that the response of the system $\mathcal{S}^L$ can be expressed as an integral involving the input signal $q(\mathbf{x}, t)$ and the response of the system to shifted Dirac pulses. The response of a system to a Dirac pulse is typically denoted as the impulse response of the system. In general, the impulse response depends on the time $t$, the position vector $\mathbf{x}$ and their shifts

$\mathcal{S}^L\{\delta(\mathbf{x} - \mathbf{x}', t - \tau)\} = h(\mathbf{x}, \mathbf{x}', t, \tau) ,$   (3.30)

where $h(\mathbf{x}, \mathbf{x}', t, \tau)$ denotes the impulse response. However, if the system under consideration is an LTSI system, then $h(\mathbf{x}, \mathbf{x}', t, \tau)$ reduces to a spatio-temporally shifted impulse response

$\mathcal{S}^{\mathrm{LTSI}}\{\delta(\mathbf{x} - \mathbf{x}', t - \tau)\} = h(\mathbf{x} - \mathbf{x}', t - \tau) .$   (3.31)

In this case, the integral (3.29) represents a multidimensional convolution. Thus, the system output $p(\mathbf{x}, t)$ can be expressed as the spatio-temporal convolution of the input signal $q(\mathbf{x}, t)$ with the impulse response $h(\mathbf{x}, t)$ of the system

$p(\mathbf{x}, t) = \mathcal{S}^{\mathrm{LTSI}}\{q(\mathbf{x}, t)\} = q(\mathbf{x}, t) \ast_{\mathbf{x},t} h(\mathbf{x}, t) ,$   (3.32)

where $h(\mathbf{x}, t) = \mathcal{S}^{\mathrm{LTSI}}\{\delta(\mathbf{x}, t)\}$.


Of special interest is the response of an LTSI system to a multidimensional exponential signal

$q(\mathbf{x}, t) = e(\mathbf{x}, t) = e^{\,j\langle \mathbf{k}, \mathbf{x}\rangle + j\omega t} .$   (3.33)

Introducing $e(\mathbf{x}, t)$ into Eq. (3.32) and using the definition of the multidimensional Fourier transformation (3.2a) yields

$p(\mathbf{x}, t) = \mathcal{S}^{\mathrm{LTSI}}\{e(\mathbf{x}, t)\} = H(\mathbf{k}, \omega)\, e^{\,j\langle \mathbf{k}, \mathbf{x}\rangle + j\omega t} .$   (3.34)

Since the output signal is simply a scaled version of the input signal, $e(\mathbf{x}, t)$ represents an eigenfunction of an LTSI system and $H(\mathbf{k}, \omega)$ the corresponding eigenvalue. The (complex) scaling of the eigenfunction is given by the eigenvalue $H(\mathbf{k}, \omega)$, which is, in general, dependent on the wave vector $\mathbf{k}$ and the temporal frequency $\omega$. As $H(\mathbf{k}, \omega)$ describes the system transfer characteristics for the eigenfunctions $e(\mathbf{x}, t)$, it is also denoted as the transfer function of the system. The eigenvalues $H(\mathbf{k}, \omega)$ can be derived from a multidimensional Fourier transformation of the impulse response $h(\mathbf{x}, t)$. Using the convolution theorem of the Fourier transformation (3.8), the system response is given as


$P(\mathbf{k}, \omega) = Q(\mathbf{k}, \omega)\, H(\mathbf{k}, \omega) .$   (3.35)

Thus, in the spatio-temporal Fourier transformed domain the system response is derived by multiplication of the input signal with the multidimensional transfer function $H(\mathbf{k}, \omega)$. Equation (3.35) can be interpreted as follows: the input signal and the impulse response of the system are decomposed into the eigenfunctions of an LTSI system, which provide a suitable basis for this purpose. The response of the system in this eigenspace is given as a simple multiplication of the corresponding eigenvalues. Since the kernel of the Fourier transformation (3.2) is equivalent to the eigenfunctions of an LTSI system, it provides a suitable transformation for the desired decomposition.
The space- and time-shift invariance of the system and its impulse response is explicitly required in order to express the system response as a multidimensional convolution. If the system is time- and/or space-shift variant, then the impulse response is not invariant to temporal and/or spatial shifts. As a result, the output of the system cannot be expressed as a multidimensional convolution in the time-space domain or as a product in the frequency-wave vector domain.
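The eigenfunction relation (3.34) and the multiplication property (3.35) can be illustrated numerically in one dimension. The following sketch (an illustrative check, not part of the thesis; all parameters are arbitrary choices) realizes a shift-invariant system as a circular convolution and verifies that a complex exponential is reproduced up to a scaling by the corresponding transfer function value:

```python
import numpy as np

# A 1D shift-invariant system realized as a circular convolution with an
# arbitrary impulse response h, probed with e[n] = exp(j*2*pi*k0*n/N).
N = 64
k0 = 5                                   # discrete frequency of the probe
n = np.arange(N)
e = np.exp(1j * 2 * np.pi * k0 * n / N)  # eigenfunction candidate

rng = np.random.default_rng(0)
h = np.zeros(N)
h[:8] = rng.standard_normal(8)           # arbitrary impulse response

# System response via circular convolution (computed in the DFT domain,
# where convolution becomes multiplication, cf. Eq. (3.35))
p = np.fft.ifft(np.fft.fft(h) * np.fft.fft(e))

# Eigenvalue = transfer function evaluated at the probe frequency
H_k0 = np.fft.fft(h)[k0]

# The output is the input scaled by the eigenvalue H(k0), cf. Eq. (3.34)
assert np.allclose(p, H_k0 * e)
```

The same check carries over to the multidimensional case by replacing the 1D DFT with a multidimensional one.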

3.3 The Continuous Plane Wave Decomposition

Up to now, the signals and systems to be analyzed were treated as generic four-dimensional spatio-temporal signals and systems. The process of wave propagation, as covered by the acoustic wave equation, can be understood as a spatio-temporal system and the corresponding wave fields as spatio-temporal signals. Hence, the multidimensional Fourier transformation provides a useful tool for system and signal analysis in this context. Since acoustic wave fields exhibit characteristic properties, it is reasonable to specialize the Fourier transformation to the case of acoustic wave fields.
It was derived in Section 2.2 that plane waves are eigensolutions of the acoustic wave equation formulated in Cartesian coordinates. Thus, by decomposing a wave field into its plane wave contributions it can be represented as a superposition of plane waves. However, this requires a specialized transformation that derives the expansion coefficients in terms of plane waves. Because of this foundation, the transformation will be termed the plane wave decomposition.
This section will introduce the continuous plane wave decomposition. Furthermore, the properties and theorems of the plane wave decomposition and its representations in terms of other transformations will be derived.


Figure 3.5: Pressure field (a) and its spatial Fourier transformation (b) for a monochromatic plane wave traveling in the x-y-plane. The shown plane wave has an incidence angle of $\theta_0 = 45^\circ$ and a frequency of $f_0 = 1000$ Hz. The circle denotes the position of a spatial Dirac pulse.

3.3.1 Fourier Analysis of Plane Waves

In order to derive more insight into the connections between the Fourier transformation
of a wave field and the expansion into plane waves, the spatial Fourier transformation of
a monochromatic plane wave is investigated in the following.
Introducing the pressure field of a monochromatic plane wave, as given by Eq. (2.47), into
the definition of the spatial Fourier transformation (3.6a) yields
$P_C(\mathbf{k}_C, \omega) = \mathcal{F}_x\{\delta(\omega - \omega_0)\, e^{-j\mathbf{k}_{C,0}^T \mathbf{x}_C}\} = (2\pi)^3\, \delta(\omega - \omega_0)\, \delta(\mathbf{k}_C - \mathbf{k}_{C,0}) ,$   (3.36)

where $\mathbf{k}_{C,0}$ denotes the wave vector of the plane wave. Equation (3.36) states that the spatial part of the Fourier transformation of a plane wave equals a multidimensional Dirac pulse at the position $\mathbf{k}_C = \mathbf{k}_{C,0}$. Figure 3.5 illustrates this result for a monochromatic plane wave traveling in the x-y-plane. The wave vector $\mathbf{k}_{C,0}$ of a plane wave is orthogonal to its isophase planes. Its angle represents the incidence angle of the plane wave. As given by Eq. (2.13) and Eq. (2.8), there is a link between the absolute value of the wave vector of a monochromatic plane wave and its temporal frequency
$k_0^2 = |\mathbf{k}_{C,0}|^2 = \left(\frac{\omega_0}{c}\right)^2 .$   (3.37)


Equation (3.37) states that the absolute value of the wave vector $\mathbf{k}_{C,0}$ is related to its temporal frequency by the speed of sound. Thus, the position of the Dirac pulse resulting from the spatial Fourier transformation of a monochromatic plane wave will lie on the surface of a sphere with the radius $\omega_0/c$. The angle under which the Dirac pulse is seen from the origin denotes the incidence angle of the plane wave, since the wave vector of a plane wave is orthogonal to its isophase planes. As stated before, complex pressure fields can be expressed as a superposition of plane waves. Each plane wave has an individual incidence angle and frequency spectrum, parameters which are captured by the expansion coefficients. The spectrum of one particular plane wave can be extracted from the spatial Fourier transformation by computing the coefficients along a line with the incidence angle of the considered plane wave. Concluding the discussed results, the plane wave expansion coefficients can be derived from a spatial Fourier transformation of the acoustic pressure field.
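This behavior is easily reproduced numerically. The following sketch (an illustrative check with arbitrary sampling parameters, not taken from the thesis) samples a monochromatic plane wave on a 2D grid and verifies that its spatial DFT concentrates at a wave vector of magnitude $\omega_0/c$, as stated by Eq. (3.37):

```python
import numpy as np

# The 2D spatial DFT of a sampled monochromatic plane wave concentrates
# at a wave vector of magnitude |k_0| = omega_0 / c.
c = 343.0                          # speed of sound in m/s
f0 = 1000.0                        # temporal frequency in Hz
theta0 = np.deg2rad(45.0)          # incidence angle
k0 = 2 * np.pi * f0 / c            # wavenumber omega_0 / c

N, dx = 128, 0.01                  # spatial samples and spacing in m
x = np.arange(N) * dx
X, Y = np.meshgrid(x, x, indexing="ij")

# Complex pressure field of the plane wave at the single frequency f0
p = np.exp(-1j * k0 * (np.cos(theta0) * X + np.sin(theta0) * Y))

# 2D spatial DFT; the dominant bin yields the wave vector estimate
P = np.fft.fft2(p)
k_axis = 2 * np.pi * np.fft.fftfreq(N, d=dx)
ix, iy = np.unravel_index(np.argmax(np.abs(P)), P.shape)
k_est = np.hypot(k_axis[ix], k_axis[iy])

# The peak lies on the circle of radius omega_0 / c, up to the
# wavenumber resolution of the finite aperture
assert abs(k_est - k0) < 2 * np.pi / (N * dx)
```

The finite spatial aperture smears the ideal Dirac pulse into a sinc-like main lobe, which is why the check only demands agreement within one wavenumber bin.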

3.3.2 Definition of the Plane Wave Decomposition

The previous section revealed that cylindrical coordinates suggest themselves for deriving the plane wave expansion coefficients. In order to further illustrate the benefits of this change of coordinate system, Fig. 3.6 illustrates the geometric relations between the wave vector of a monochromatic plane wave and the position vector, expressed in Cartesian and cylindrical coordinates respectively. The wave vector formulated in cylindrical coordinates, $\mathbf{k}_Y = [\,\theta \;\; k_r \;\; k_z\,]^T$, consists directly of the incidence angle $\theta$ with respect to the x-y-plane, the radial wavenumber $k_r$ and the wavenumber $k_z$ in the z-direction. The azimuth angle of the plane wave is implicitly given by the wavenumber $k_z$. Thus, the plane wave expansion coefficients can be derived by applying a Fourier transformation formulated in cylindrical coordinates to the wave field. The radial wavenumber $k_r$ is related to the wavenumber $k$ and the wavenumber $k_z$ in the z-direction by

$k_r(k_z, \omega) = \sqrt{k^2 - k_z^2} = \sqrt{(\omega/c)^2 - k_z^2} ,$   (3.38)

where the acoustic dispersion relation (2.8) was introduced to derive the second equality. Thus, the dependency on the radial wavenumber $k_r$ can be dropped for the expansion coefficients in the following. Based on these considerations, the plane wave expansion coefficients can be calculated from a specialization of the Fourier transformation formulated in cylindrical coordinates (3.10a) as follows

$\bar{P}(\theta, k_z, \omega) = \mathcal{P}_3\{ P_Y(\mathbf{x}_Y, \omega) \} = \int_{-\infty}^{\infty} \int_{0}^{\infty} \int_{0}^{2\pi} P_Y(\mathbf{x}_Y, \omega)\, e^{-j( k_r r \cos(\alpha - \theta) + k_z z )}\, r \, d\alpha \, dr \, dz .$   (3.39)

Please note that the left hand side of Eq. (3.39) does not depend explicitly on the radial wavenumber $k_r$ due to Eq. (3.38). This specialized transformation decomposes the wave


Figure 3.6: Illustration of the position and wave vector of a monochromatic plane wave in Cartesian and cylindrical coordinates. Only the x-y-plane is shown for simplicity. The parallel lines illustrate the isograms of equal phase for a plane wave propagating in the x-y-plane.
field $P_Y(\mathbf{x}_Y, \omega)$ into its plane wave expansion coefficients $\bar{P}(\theta, k_z, \omega)$. This transformation is termed the plane wave decomposition [HdVB01]. The operator that performs the plane wave decomposition is denoted by $\mathcal{P}_D$; plane wave decomposed signals are denoted by a bar over the signals throughout this thesis. The index $D$ of the plane wave decomposition operator denotes the dimensionality of the decomposition. Figure 3.7 illustrates the derivation of the plane wave decomposition from the spatial Fourier transformation. It can be concluded from Eq. (3.38) and Fig. 3.7 that the plane wave expansion coefficients $\bar{P}(\theta, k_z, \omega)$ of a wave field can be derived from its cylindrical Fourier transformation $P_Y(\theta, k_r, k_z, \omega)$ as follows

$\bar{P}(\theta, k_z, \omega) = P_Y(\theta, k_r, k_z, \omega)\, \delta\!\left(k_r - \sqrt{(\omega/c)^2 - k_z^2}\,\right) .$   (3.40)

For most applications in the context of this thesis, it is not feasible to investigate the wave field within the entire three-dimensional space. However, the analysis of a two-dimensional plane is feasible and sufficient in most situations. It is assumed in the following that the pressure field exhibits no dependence on the z-direction, thus $P_Y(\alpha, r, z, \omega) = P_P(\alpha, r, \omega)$. The plane wave decomposition is then given by specializing the cylindrical Fourier transformation (3.10) assuming independence from the z-coordinate and using the dispersion relation (2.8) as


Figure 3.7: Derivation of the plane wave decomposition from a spatial Fourier transformation of the wave field. [Diagram: the Fourier transformation $P_C(\mathbf{k}_C, \omega) = \mathcal{F}_x\{P_C(\mathbf{x}_C, \omega)\}$ is first expressed in cylindrical coordinates ($\mathbf{x}_C \to \mathbf{x}_Y$, $\mathbf{k}_C \to \mathbf{k}_Y$), yielding the cylindrical Fourier transformation $P_Y(\mathbf{k}_Y, \omega) = \mathcal{F}_x\{P_Y(\mathbf{x}_Y, \omega)\}$; introducing the plane wave dispersion relation $k^2 = (\omega/c)^2 = k_r^2 + k_z^2$ then yields the plane wave decomposition $\bar{P}(\theta, k_z, \omega) = \mathcal{P}_3\{P_Y(\mathbf{x}_Y, \omega)\}$.]

$\bar{P}(\theta, \omega) = \mathcal{P}_2\{ P_P(\mathbf{x}_P, \omega) \} = \int_{0}^{\infty} \int_{0}^{2\pi} P_P(\mathbf{x}_P, \omega)\, e^{-jkr\cos(\alpha - \theta)}\, r \, d\alpha \, dr .$   (3.41)

This transformation will be referred to as the two-dimensional plane wave decomposition in the following and explicitly denoted by $\mathcal{P}_2$. Unless explicitly mentioned otherwise, the two-dimensional plane wave decomposition will be used and referred to as the plane wave decomposition in the remainder of this work. In order to simplify the nomenclature, the two-dimensional plane wave decomposition will then be denoted simply by $\mathcal{P}$.
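As a numerical illustration, Eq. (3.41) can be evaluated by a Riemann sum on a polar grid. The sketch below (quadrature grid, parameters and the sign convention of the kernel are illustrative assumptions, not thesis code) verifies that the plane wave decomposition of a monochromatic plane wave concentrates at its incidence angle:

```python
import numpy as np

# Discretized sketch of the 2D plane wave decomposition, Eq. (3.41)
c, f = 343.0, 1000.0
k = 2 * np.pi * f / c
theta0 = np.deg2rad(45.0)                      # incidence angle of the wave

r = np.linspace(0.0, 1.0, 200)                 # radial quadrature nodes
alpha = np.linspace(0.0, 2 * np.pi, 128, endpoint=False)
R, A = np.meshgrid(r, alpha, indexing="ij")
dr, dalpha = r[1] - r[0], alpha[1] - alpha[0]

# Monochromatic plane wave expressed in polar coordinates
P = np.exp(1j * k * R * np.cos(A - theta0))

# Riemann-sum evaluation of Eq. (3.41) for every analysis angle theta
theta = alpha
Pbar = np.array([
    np.sum(P * np.exp(-1j * k * R * np.cos(A - t)) * R) * dr * dalpha
    for t in theta
])

# |Pbar| attains its maximum exactly at theta0: only there do all
# phase terms of the integrand align
assert np.isclose(theta[np.argmax(np.abs(Pbar))], theta0)
```

In the continuous case the decomposition of a single plane wave would be a Dirac pulse in $\theta$; the finite radial aperture broadens it into a finite main lobe.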
Since the cylindrical Fourier transformation is separable in the z-coordinate, the same applies to the three-dimensional plane wave decomposition. Thus, the three-dimensional plane wave decomposition can be computed by performing a two-dimensional plane wave decomposition followed by a spatial Fourier transformation operating on the z-coordinate only

$\bar{P}(\theta, k_z, \omega) = \mathcal{P}_3\{ P(\mathbf{x}_Y, \omega) \} = \mathcal{F}_z\{ \mathcal{P}_2\{P(\mathbf{x}_Y, \omega)\} \} .$   (3.42)

Hence, in most cases the results derived within this thesis for the two-dimensional plane wave decomposition can be generalized straightforwardly to three dimensions using Eq. (3.42). The two-dimensional plane wave decomposition is also referred to as the polar or radial Fourier transformation in the literature [ACD+03] due to the underlying coordinate system. Figure 3.8 illustrates the relation between the (spatial) Fourier transformation and the plane wave decomposition of a wave field.

Figure 3.8: Illustration of the connections between the spatial Fourier transformation and the plane wave decomposition of a two-dimensional wave field. [Diagram: $P_C(\mathbf{x}_C, \omega) \xrightarrow{\;\mathcal{F}_x\;} P_C(\mathbf{k}_C, \omega)$; the substitutions $x = r\cos(\alpha)$, $y = r\sin(\alpha)$ and $k_x = k\cos(\theta)$, $k_y = k\sin(\theta)$ with $k^2 = (\omega/c)^2$ connect this to $P_P(\mathbf{x}_P, \omega) \xrightarrow{\;\mathcal{P}\;} \bar{P}(\theta, \omega)$.]
3.3.2.1 Time-domain Plane Wave Decomposition

The plane wave decomposition, as introduced in the previous section, is based on a frequency-domain representation of the acoustic pressure field. The frequency-domain formulation was beneficial because it directly allowed the use of the plane wave dispersion relation. For some applications, however, a time-domain formulation of the plane wave decomposition might be useful. This will be derived in the following. Inverse (time-domain) Fourier transformation of the $\omega$-dependent parts in the integrand of the plane wave decomposition (3.41) and application of the Fourier multiplication theorem [GRS01] allows to formulate the plane wave decomposition in the time domain as

$\bar{p}(\theta, t) = \frac{1}{2\pi} \int_{0}^{\infty} \int_{0}^{2\pi} p_P(\mathbf{x}_P, t) \ast_t \delta\!\left(t + \frac{r}{c}\cos(\alpha - \theta)\right) r \, d\alpha \, dr$
$\phantom{\bar{p}(\theta, t)} = \frac{1}{2\pi} \int_{0}^{\infty} \int_{0}^{2\pi} p_P\!\left(\mathbf{x}_P, t + \frac{r}{c}\cos(\alpha - \theta)\right) r \, d\alpha \, dr .$   (3.43)

The plane wave decomposition can be interpreted as an integration over time-shifted versions of the acoustic field $p_P(\mathbf{x}_P, t)$. The time-shifts correspond to the planes of constant phase for a plane wave with incidence angle $\theta$.
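This delay-and-sum interpretation can be illustrated with a small numerical sketch (the single observation ring, the Gaussian pulse and the sign convention of the propagation delay are illustrative assumptions, not thesis choices): a broadband plane wave is time-aligned, and therefore maximally coherent, exactly when the analysis angle matches its incidence angle.

```python
import numpy as np

# Time-domain PWD, Eq. (3.43), as a delay-and-sum operation on one ring
c, r = 343.0, 0.5
theta0 = np.deg2rad(30.0)                    # incidence angle of the wave
fs = 48000.0
t = np.arange(4096) / fs
s = lambda tau: np.exp(-0.5 * ((tau - 0.02) / 0.0005) ** 2)  # pulse s(t)

alpha = np.linspace(0.0, 2 * np.pi, 72, endpoint=False)

def pwd_time(theta):
    # Evaluate the field p(alpha, r, t) = s(t - (r/c) cos(alpha - theta0))
    # at the shifted times t + (r/c) cos(alpha - theta), cf. Eq. (3.43)
    out = np.zeros_like(t)
    for a in alpha:
        out += s(t - r / c * np.cos(a - theta0) + r / c * np.cos(a - theta))
    return out / len(alpha)

# At theta = theta0 all contributions align and reproduce the pulse s(t);
# for a mismatched analysis angle the contributions smear out
assert np.allclose(pwd_time(theta0), s(t))
assert pwd_time(theta0 + np.pi / 2).max() < 0.5
```

The normalization by the number of ring points stands in for the quadrature weights of the continuous integral.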

3.3.3 Definition of the Inverse Plane Wave Decomposition

The inverse three-dimensional plane wave decomposition is derived analogously to the plane wave decomposition. Using the inverse Fourier transformation formulated in cylindrical coordinates (3.10b) and discarding the integral over $dk_r$ due to Eq. (3.38) and Eq. (3.40) yields the inverse plane wave decomposition as

$P_Y(\mathbf{x}_Y, \omega) = \mathcal{P}_3^{-1}\{ \bar{P}(\theta, k_z, \omega) \} = \frac{k}{(2\pi)^3} \int_{-\infty}^{\infty} \int_{0}^{2\pi} \bar{P}(\theta, k_z, \omega)\, e^{\,j( k_r r \cos(\alpha - \theta) + k_z z )} \, d\theta \, dk_z .$   (3.44)


As for the forward transformation (3.39), a restriction to two-dimensional wave fields ($P_Y(\alpha, r, z, \omega) = P_P(\alpha, r, \omega)$) is beneficial. The two-dimensional inverse plane wave decomposition can be derived from the inverse polar Fourier transformation and (the polar variants of) Eq. (3.38) and Eq. (3.40) as follows

$P_P(\mathbf{x}_P, \omega) = \mathcal{P}^{-1}\{ \bar{P}(\theta, \omega) \} = \frac{k}{(2\pi)^2} \int_{0}^{2\pi} \bar{P}(\theta, \omega)\, e^{\,jkr\cos(\alpha - \theta)} \, d\theta .$   (3.45)

Unless explicitly mentioned, the two-dimensional inverse plane wave decomposition will be used and referred to as the inverse plane wave decomposition in the remainder of this work. Introducing Eq. (3.45) into Eq. (3.41) proves that a wave field can be transformed into its plane wave expansion coefficients and afterwards back into its spatial representation without information loss.
The inverse plane wave decomposition can be interpreted as the superposition of plane waves from all directions $\theta$. The frequency spectrum of a particular plane wave is given by its plane wave decomposition $\bar{P}(\theta, \omega)$. The $k$ term present in the inverse plane wave decomposition can be interpreted as a time-domain filtering process with the frequency characteristic $|\omega/c|$. This filtering has to be performed when reconstructing the wave field from its plane wave decomposition. As a consequence, the plane wave decomposed signals $\bar{P}(\theta, k_z, \omega)$ must exhibit a frequency-domain low-pass characteristic. The desired $|\omega/c|$ high-pass response in Eq. (3.45) can be realized only approximately in practical implementations. One approach to approximate the desired filter response is to split the required filter onto the forward and the inverse plane wave decomposition using the relation $|\omega| = \sqrt{j\omega}\,\sqrt{-j\omega}$.
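The factorization used for this split can be verified numerically (a quick sanity check with an arbitrary frequency axis, not thesis code):

```python
import numpy as np

# |w| factors into sqrt(jw) * sqrt(-jw), so the |w/c| equalization can be
# distributed as half-order differentiators onto the forward and inverse
# plane wave decomposition
w = np.linspace(-2 * np.pi * 2000, 2 * np.pi * 2000, 1001)
split = np.sqrt(1j * w) * np.sqrt(-1j * w)
assert np.allclose(split, np.abs(w))
```

The two factors are complex conjugates of each other for real $\omega$, so their product is real and non-negative, as required.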

3.3.4 Representations of the Plane Wave Decomposition

The following section will derive connections between the plane wave decomposition and other well-known transformations. Additionally, some useful representations and variants of the plane wave decomposition will be introduced.

3.3.4.1 Plane Wave Decomposition as Hankel Transformation

It was shown in Section 3.1.4 that the cylindrical Fourier transformation can be expressed in terms of the Hankel transformation. The same applies to the plane wave decomposition when introducing the dispersion relation into Eq. (3.21a) and discarding the z-dependent parts. The plane wave decomposition in terms of the Hankel transformation is given as

$\bar{P}(\theta, \omega) = 2\pi \sum_{\nu=-\infty}^{\infty} j^{-\nu}\, \mathcal{H}_{\nu,r}\{\breve{P}(\nu, r, \omega)\}\, e^{\,j\nu\theta} ,$   (3.46)

where $\breve{P}(\nu, r, \omega)$ denotes the angular Fourier series expansion coefficients of the wave field.

In the following, the special case of a pressure field that exhibits radial symmetry, $P_P(\alpha, r, \omega) = P(r, \omega)$, will be considered. The Fourier series expansion coefficients are then given as $\breve{P}(\nu, r, \omega) = P(r, \omega)\, \delta[\nu]$, where $\delta[\nu]$ denotes the discrete unit pulse. Hence, the summation over $\nu$ in Eq. (3.46) degenerates to the fixed value $\nu = 0$. The plane wave decomposition of a radially symmetric field is then given as

$\bar{P}(\omega) = 2\pi\, \mathcal{H}_{0,r}\{P(r, \omega)\} = 2\pi \int_{0}^{\infty} P(r, \omega)\, J_0(kr)\, r \, dr .$   (3.47)

This specialized plane wave decomposition for radially symmetric wave fields is well known as the Fourier-Bessel or zeroth-order Hankel transformation [Pap68]. Please note that the plane wave decomposition $\bar{P}(\omega)$ of a radially symmetric field $P(r, \omega)$ is also radially symmetric and thus depends only on the frequency $\omega$.
3.3.4.2 The Decomposition into Incoming/Outgoing Plane Waves

The Bessel function $J_\nu(kr)$ can be expressed as a sum of Hankel functions [AS72]

$J_\nu(kr) = \frac{1}{2}\left[ H_\nu^{(1)}(kr) + H_\nu^{(2)}(kr) \right] ,$   (3.48)

where $H_\nu^{(1)}(\cdot)$ and $H_\nu^{(2)}(\cdot)$ denote the $\nu$-th order Hankel functions of first and second kind. The Hankel functions of first and second kind play an important role in the context of traveling wave solutions of the wave equation formulated in cylindrical coordinates. The physical interpretation of the Hankel functions gained in Section 2.3 is that $H_\nu^{(1)}(kr)$ corresponds to a converging incoming wave and $H_\nu^{(2)}(kr)$ to a diverging outgoing wave. Thus, Eq. (3.48) can be used to decompose a wave field into incoming and outgoing wave contributions

$P(\mathbf{x}, \omega) = P^{(1)}(\mathbf{x}, \omega) + P^{(2)}(\mathbf{x}, \omega) ,$   (3.49)

where $P^{(1)}(\mathbf{x}, \omega)$ denotes the incoming part and $P^{(2)}(\mathbf{x}, \omega)$ the outgoing part, respectively.
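The identity (3.48) is easily confirmed numerically. The following sketch (an illustrative check using SciPy's Bessel and Hankel routines; orders and argument range are arbitrary choices) verifies it for several orders:

```python
import numpy as np
from scipy.special import jv, hankel1, hankel2

# Eq. (3.48): the standing-wave solution J_nu is the average of the
# converging (H^(1)) and diverging (H^(2)) traveling-wave solutions
x = np.linspace(0.1, 20.0, 200)
for nu in range(4):
    assert np.allclose(jv(nu, x), 0.5 * (hankel1(nu, x) + hankel2(nu, x)))
```

Since $H_\nu^{(2)}(x) = \overline{H_\nu^{(1)}(x)}$ for real arguments, the imaginary parts of the two Hankel functions cancel in the sum, leaving the real-valued Bessel function.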
Introducing Eq. (3.48) into Eq. (3.46) and using the definition of the Hankel transformation (3.20) yields

$\bar{P}(\theta, \omega) = \pi \sum_{\nu=-\infty}^{\infty} j^{-\nu}\, e^{\,j\nu\theta} \int_{0}^{\infty} \breve{P}(\nu, r, \omega)\, H_\nu^{(1)}(kr)\, r \, dr + \pi \sum_{\nu=-\infty}^{\infty} j^{-\nu}\, e^{\,j\nu\theta} \int_{0}^{\infty} \breve{P}(\nu, r, \omega)\, H_\nu^{(2)}(kr)\, r \, dr .$   (3.50)


Figure 3.9: Plane wave decomposition of a wave field into incoming and outgoing contributions as given by Eq. (3.52). The decomposition is given as a Fourier series, where the expansion coefficients are computed from the transformation (3.53) of the angular expansion coefficients of the wave field. [Diagram: $P_P(\mathbf{x}_P, \omega) \xrightarrow{\;\mathcal{FS}^{-1}\;} \breve{P}(\nu, r, \omega)$, which is passed through $\mathcal{H}^{(1)}_{\nu,r}$ and $\mathcal{H}^{(2)}_{\nu,r}$, weighted by $j^{-\nu}$, and recombined by Fourier series $\mathcal{FS}$ into $\bar{P}^{(1)}(\theta, \omega)$ and $\bar{P}^{(2)}(\theta, \omega)$.]
Equation (3.50) can be divided into incoming and outgoing plane wave contributions similar to Eq. (3.49)

$\bar{P}(\theta, \omega) = \bar{P}^{(1)}(\theta, \omega) + \bar{P}^{(2)}(\theta, \omega) .$   (3.51)

This allows to decompose a wave field into incoming and outgoing plane wave contributions as follows

$\bar{P}^{(1)}(\theta, \omega) = \pi \sum_{\nu=-\infty}^{\infty} j^{-\nu}\, \mathcal{H}^{(1)}_{\nu,r}\{\breve{P}(\nu, r, \omega)\}\, e^{\,j\nu\theta} ,$   (3.52a)

$\bar{P}^{(2)}(\theta, \omega) = \pi \sum_{\nu=-\infty}^{\infty} j^{-\nu}\, \mathcal{H}^{(2)}_{\nu,r}\{\breve{P}(\nu, r, \omega)\}\, e^{\,j\nu\theta} ,$   (3.52b)

where the notation $\mathcal{H}^{(1)}_{\nu,r}\{\cdot\}$ and $\mathcal{H}^{(2)}_{\nu,r}\{\cdot\}$ is chosen in analogy to the definition of the Hankel transformation (3.20). The transformations $\mathcal{H}^{(1)}_{\nu,r}\{\cdot\}$ and $\mathcal{H}^{(2)}_{\nu,r}\{\cdot\}$ are defined as follows

$\mathcal{H}^{(1),(2)}_{\nu,r}\{f(\cdot)\} = \int_{0}^{\infty} f(\cdot)\, H_\nu^{(1),(2)}(kr)\, r \, dr .$   (3.53)

This transformation will be denoted as the complex Hankel transformation of first/second kind. Figure 3.9 illustrates the derived relationship. The plane wave decomposition is split into a Fourier series for the incoming and outgoing parts respectively. The expansion coefficients of the two series are given by the complex Hankel transformations of the angular expansion coefficients of the wave field, multiplied by the factor $j^{-\nu}$.
The incoming/outgoing parts of a wave field $P^{(1),(2)}(\mathbf{x}_P, \omega)$ can be derived by performing a plane wave decomposition according to Eq. (3.52), followed by an inverse plane wave decomposition applied to the incoming and outgoing plane wave contributions $\bar{P}^{(1)}(\theta, \omega)$ and $\bar{P}^{(2)}(\theta, \omega)$ independently.

3.3.4.3 The Plane Wave Decomposition in Terms of Circular Harmonics

As discussed in Sections 2.2 and 2.3, plane waves and cylindrical waves are eigensolutions of the acoustic wave equation in Cartesian and cylindrical coordinates, respectively. Both eigenfunction bases allow to decompose a wave field into plane waves or cylindrical harmonics. However, their relations to each other are not obvious. This section will derive the interrelation between the decomposition into plane waves and the decomposition in terms of circular harmonics for two-dimensional wave fields.
The expansion of a wave field into circular harmonics was already introduced in Section 2.3.2. It is given as follows (see Eq. (2.25))

$P_P(\mathbf{x}_P, \omega) = \sum_{\nu=-\infty}^{\infty} \mathring{P}^{(1)}(\nu, \omega)\, H_\nu^{(1)}(kr)\, e^{\,j\nu\alpha} + \sum_{\nu=-\infty}^{\infty} \mathring{P}^{(2)}(\nu, \omega)\, H_\nu^{(2)}(kr)\, e^{\,j\nu\alpha} .$   (3.54)

This expansion can be interpreted as a Fourier series with respect to the angle $\alpha$. In order to derive a relation between plane waves and circular harmonics, the angular Fourier series of a plane wave is derived by calculating its expansion coefficients using Eq. (3.16). The result is the following identity

$e^{\,jkr\cos(\alpha - \theta)} = \sum_{\nu=-\infty}^{\infty} j^{\nu}\, J_\nu(kr)\, e^{\,j\nu(\alpha - \theta)} ,$   (3.55)

which is also known as the Jacobi-Anger expansion [Wei03, GR65]. Equation (3.55) expands the pressure field of a plane wave into a series of Bessel functions. This representation of a plane wave can be split into incoming/outgoing cylindrical contributions using the relation between the Bessel and the Hankel functions given by Eq. (3.48), as was shown in the previous section. In the following, only the incoming part of the wave field will be considered. Introducing the incoming part of Eq. (3.55) into the inverse plane wave decomposition (3.45) yields

$P^{(1)}(\mathbf{x}_P, \omega) = \frac{k}{(2\pi)^2} \int_{0}^{2\pi} \bar{P}^{(1)}(\theta, \omega)\, \frac{1}{2} \sum_{\nu=-\infty}^{\infty} j^{\nu}\, H_\nu^{(1)}(kr)\, e^{\,j\nu(\alpha - \theta)} \, d\theta$
$\phantom{P^{(1)}(\mathbf{x}_P, \omega)} = \sum_{\nu=-\infty}^{\infty} \left[ \frac{k}{2(2\pi)^2} \int_{0}^{2\pi} j^{\nu}\, \bar{P}^{(1)}(\theta, \omega)\, e^{-j\nu\theta} \, d\theta \right] H_\nu^{(1)}(kr)\, e^{\,j\nu\alpha} .$   (3.56)
Comparing the above result with the decomposition of the acoustic field into cylindrical harmonics (3.54) for the incoming part of the field yields a relation between the plane wave and circular harmonics expansion coefficients

$\mathring{P}^{(1)}(\nu, \omega) = \frac{k}{2(2\pi)^2} \int_{0}^{2\pi} j^{\nu}\, \bar{P}^{(1)}(\theta, \omega)\, e^{-j\nu\theta} \, d\theta .$   (3.57)
Equation (3.57) relates the circular harmonics expansion coefficients to the plane wave expansion coefficients.

Figure 3.10: Relationship between the expansion into circular harmonics and the plane wave decomposition as given by Eq. (3.58). The frequency dependence was discarded in this diagram. [Diagram: $\mathring{P}^{(1)}(\nu, \omega) \xrightarrow{\;\mathcal{FS}\;} \bar{P}^{(1)}(\theta, \omega)$ and $\mathring{P}^{(2)}(\nu, \omega) \xrightarrow{\;\mathcal{FS}\;} \bar{P}^{(2)}(\theta, \omega)$.]

However, the desired relation runs in the opposite direction, expressing the plane wave expansion coefficients through the circular harmonics expansion coefficients. Equation (3.57) bears strong resemblance to the calculation of the Fourier series expansion coefficients given by Eq. (3.16). The desired relation can thus be found by interpreting Eq. (3.57) as the computation of the Fourier series expansion coefficients of $\bar{P}^{(1)}(\theta, \omega)$. Performing the steps outlined above also for the outgoing part of the wave field then yields the desired relations as

4 X (1)
(1)

j P (, ) ej ,
P (, ) =
k =

4 X (2)
P (2) (, ) =
j P (, ) ej .
k =

(3.58a)
(3.58b)

Equation (3.58) states that the plane wave decomposition of a wave field is, up to the factors $j^{-\nu}$, given by the discrete Fourier series of the expansion coefficients in terms of circular harmonics. Figure 3.10 illustrates the derived relationship. Please note that these results are similar to those obtained in [HdVB01], although the derivations differ.
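The Jacobi-Anger expansion (3.55), which underlies the derivations of this section, can be checked numerically. The sketch below (illustrative truncation order and argument, using SciPy's Bessel routine; the sign convention follows the form used above) verifies that the truncated series reproduces the exponential:

```python
import numpy as np
from scipy.special import jv

# Truncated Jacobi-Anger expansion, Eq. (3.55); the series converges
# rapidly once the order |nu| exceeds the argument k*r
kr = 10.0
phi = np.linspace(0.0, 2 * np.pi, 50)   # phi stands for (alpha - theta)
N = 40                                  # truncation order
series = sum((1j ** n) * jv(n, kr) * np.exp(1j * n * phi)
             for n in range(-N, N + 1))
assert np.allclose(series, np.exp(1j * kr * np.cos(phi)))
```

For the given argument $kr = 10$, the Bessel coefficients $J_\nu(kr)$ are already negligible well before the chosen truncation order, so the residual is at rounding level.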

3.3.4.4 Overview of the Different Representations of the Plane Wave Decomposition

The previous sections introduced different representations of the plane wave decomposition. Figure 3.11 illustrates their links and the assumptions made to derive them. Comparing the decomposition in terms of circular harmonics (3.58) with the one in terms of the Hankel transformation (3.52) shows that the circular harmonics expansion coefficients can be derived from a complex Hankel transformation

$\mathring{P}^{(1),(2)}(\nu, \omega) = \frac{k}{4}\, \mathcal{H}^{(1),(2)}_{\nu,r}\{\breve{P}(\nu, r, \omega)\} .$   (3.59)

Figure 3.11: Different representations of the plane wave decomposition (PWD) and the assumptions used to derive them. [Diagram: the 3D PWD (3.39) specializes to the 2D PWD (3.41) under the assumption $P_Y(\alpha, r, z, \omega) = P_P(\alpha, r, \omega)$; the 2D PWD is expressed as a Hankel transformation (3.46), which splits via $J_\nu(kr) = \frac{1}{2}[H_\nu^{(1)}(kr) + H_\nu^{(2)}(kr)]$ into the incoming/outgoing representation (3.52) and connects to the circular harmonics representation (3.58).]
3.3.5 Properties and Theorems of the Plane Wave Decomposition

The following section will derive useful properties and theorems of the two-dimensional plane wave decomposition. Most of them have been derived in accordance with those known from the Fourier transformation. Since the plane wave decomposition is a specialization of the cylindrical Fourier transformation for acoustic wave fields, some of the properties of Fourier transformations directly apply to the plane wave decomposition as well.
3.3.5.1 Properties

The following properties have been derived from the definition of the plane wave decomposition and the properties of the cylindrical Fourier transformation (see Section 3.1.3).
Linearity
The plane wave transformation is a linear transform as can be proven easily. As a consequence, the superposition principle applies, which is desirable since this principle applies
also to (linear) acoustic fields.
Separability
As for the Fourier transformation in cylindrical coordinates, the plane wave decomposition is not separable in the angular and radial directions $(\theta, r)$. However, the three-dimensional plane wave decomposition is separable in the z-coordinate. As a consequence, the properties and theorems will be derived only for the two-dimensional plane wave decomposition in the following. The well-known theorems of the one-dimensional Fourier transformation then apply for the z-coordinate.
Time-domain filtering of a wave field
Let us assume that a wave field $p_P(\mathbf{x}_P, t)$ and its plane wave decomposition $\bar{p}(\theta, t) = \mathcal{P}\{p_P(\mathbf{x}_P, t)\}$ are given. If the wave field is filtered in the time domain by a filter $h(t)$, then its plane wave decomposition will be a filtered version of the plane wave decomposition of the unfiltered field, filtered by the same filter as the field. This result can be derived from the following equality by subsequently carrying out the transformations

$\bar{p}_h(\theta, t) = \mathcal{F}_t^{-1}\{ \mathcal{P}\{ \mathcal{F}_t\{ p_P(\mathbf{x}_P, t) \ast_t h(t) \}\}\} = \bar{p}(\theta, t) \ast_t h(t) ,$   (3.60)

where $\bar{p}_h(\theta, t)$ denotes the plane wave decomposition of the filtered wave field. This property is useful, e. g., when measuring the spatio-temporal impulse response of a source.


The plane wave decomposed wave field for an arbitrary source excitation is then given by a time-domain convolution of the plane wave decomposed impulse response with the desired source signal.
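The commutation of temporal filtering and plane wave decomposition expressed by Eq. (3.60) can be illustrated with a discrete delay-and-sum PWD evaluated on a single ring of observation points. The signals, the geometry and the DFT-based implementation below are illustrative assumptions, not thesis code:

```python
import numpy as np

# Both the delay-and-sum PWD and the temporal filter are linear
# time-invariant operations, so they commute, cf. Eq. (3.60)
rng = np.random.default_rng(1)
Nt, Nc = 256, 16                      # time samples, ring points
c, r, fs = 343.0, 0.2, 8000.0
alpha = np.linspace(0.0, 2 * np.pi, Nc, endpoint=False)
w = 2 * np.pi * np.fft.fftfreq(Nt, 1 / fs)

p = rng.standard_normal((Nc, Nt))     # observed signals on the ring
h = rng.standard_normal(Nt)           # arbitrary filter impulse response

def pwd0(sig):
    # Delay-and-sum PWD for analysis angle theta = 0, delays applied
    # as phase shifts in the DFT domain, cf. Eq. (3.43)
    S = np.fft.fft(sig, axis=1)
    shifts = np.exp(1j * w[None, :] * (r / c) * np.cos(alpha)[:, None])
    return np.fft.ifft((S * shifts).sum(axis=0)).real

# Filtering before the PWD ...
lhs = pwd0(np.fft.ifft(np.fft.fft(p, axis=1) * np.fft.fft(h)).real)
# ... equals filtering after the PWD
rhs = np.fft.ifft(np.fft.fft(pwd0(p)) * np.fft.fft(h)).real
assert np.allclose(lhs, rhs)
```

The circular (DFT-based) convolution used here stands in for the linear convolution of the continuous formulation; the commutation argument is identical in both cases.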
Real-valued wave field
A real-valued wave field $p_P(\mathbf{x}_P, t)$ is the natural case for acoustic pressure fields. The plane wave decomposition of a real-valued pressure field is again real-valued. Proof: if the pressure field is real-valued, then its frequency-domain representation exhibits conjugate complex symmetry, $P_P(\mathbf{x}_P, -\omega) = P_P^*(\mathbf{x}_P, \omega)$. The same symmetry applies to the exponential term in the plane wave decomposition (3.41). The product of these terms is again of conjugate complex symmetry, and thus the plane wave decomposition satisfies $\bar{P}(\theta, -\omega) = \bar{P}^*(\theta, \omega)$. As a result, the plane wave decomposition $\bar{p}(\theta, t)$ of a real-valued wave field $p_P(\mathbf{x}_P, t)$ is also real-valued.
Spatial symmetries
The plane wave decomposition of a radially symmetric field $P_P(r, \omega)$ is also radially symmetric, $\bar{P}(\omega) = \mathcal{P}\{P_P(r, \omega)\}$. This result was already derived in Section 3.3.4.1.
Duality
The plane wave decomposition depends on one variable less than the wave field it was computed from. This results from exploiting the acoustic dispersion relation. Hence, it is not straightforwardly possible to formulate a duality principle as for the Fourier transformation.
3.3.5.2 Scaling Theorem

The cylindrical Fourier transformation of a signal scaled in the radial variable r was
already derived in Section 3.1.3. Adopting this principle to the plane wave decomposition
yields
P{PP (, a r, )} =

1
P (, ) .
a
|a|2

(3.61)

Hence, scaling in the radial direction only influences the frequency characteristics of the
plane wave decomposed field. This result is evident when considering the polar basis of
the plane wave decomposition. Scaling of the radial variable is a natural operation on
this basis and will not influence the angle dependent parts.

3. Fourier Analysis of Wave Fields

3.3.5.3 Rotation Theorem

In the following the plane wave decomposition of a rotated acoustic pressure field will be calculated. The plane wave decomposition of a pressure field rotated by the angle α₀ is given as

P̄(θ, ω) = ∫₀^∞ ∫₀^{2π} P_P(α − α₀, r, ω) e^{jkr cos(α−θ)} r dα dr .   (3.62)

Using the substitutions α′ = α − α₀ and θ′ = θ − α₀ yields

P{P_P(α − α₀, r, ω)} = P̄(θ − α₀, ω) .   (3.63)

Thus, a rotation of a wave field results in a rotation of the plane wave decomposed
wave field. This result is evident when considering the polar basis of the plane wave
decomposition. Rotation is a natural operation on this basis.
3.3.5.4 Multiplication Theorem

In the following a relation between the multiplication of two wave fields and their plane wave decompositions is derived. Consider the multiplication of two fields

C_P(x_P, ω) = A_P(x_P, ω) · B_P(x_P, ω) .   (3.64)

As each wave field has a periodicity of 2π in the angle α it can be developed into a Fourier series, e.g. for the field A_P(x_P, ω) as follows

A_P(x_P, ω) = Σ_{ν=−∞}^{∞} Ã(ν, r, ω) e^{jνα} ,   (3.65)

where the expansion coefficients Ã(ν, r, ω) can be calculated according to Eq. (3.16). The multiplication of the two fields (3.64) can then be rewritten in terms of a Fourier series as

C_P(x_P, ω) = Σ_{ν=−∞}^{∞} Σ_{μ=−∞}^{∞} Ã(ν, r, ω) e^{jνα} B̃(μ, r, ω) e^{jμα} .   (3.66)

Combining the two exponential terms and using the substitution η = ν + μ allows to rewrite Eq. (3.66) such that the series coefficients are given by the convolution of the expansion coefficients Ã(ν, r, ω) and B̃(ν, r, ω) of the two fields

C_P(x_P, ω) = Σ_{η=−∞}^{∞} ( Σ_{ν=−∞}^{∞} Ã(ν, r, ω) B̃(η − ν, r, ω) ) e^{jηα} = Σ_{ν=−∞}^{∞} (Ã(ν, r, ω) ∗_ν B̃(ν, r, ω)) e^{jνα} ,   (3.67)

where ∗_ν denotes (discrete) convolution with respect to the Fourier series index ν. Developing C_P(x_P, ω) into a Fourier series and comparing the expansion coefficients with Eq. (3.67) it is easy to see that

C̃(ν, r, ω) = Ã(ν, r, ω) ∗_ν B̃(ν, r, ω) .   (3.68)


This result states that the multiplication of two wave fields for one particular radius r corresponds to a convolution of the Fourier series expansion coefficients for this particular radius. The plane wave decomposition in terms of the Fourier series expansion coefficients was derived in Section 3.3.4.1. It is given by a Fourier series of the Hankel transformation of the Fourier series expansion coefficients. Introducing Eq. (3.68) into Eq. (3.46) yields the plane wave decomposition of a multiplication of two wave fields as

C̄(θ, ω) = 2π Σ_{ν=−∞}^{∞} j^ν H_{ν,r}{Ã(ν, r, ω) ∗_ν B̃(ν, r, ω)} e^{jνθ} .   (3.69)
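The coefficient relation (3.68) can be checked numerically: for band-limited 2π-periodic functions, the Fourier series coefficients of a pointwise product equal the convolution of the individual coefficient sequences (circular when computed with a sufficiently long FFT). A small sketch with arbitrary example fields at a fixed radius:

```python
import numpy as np

N = 32
alpha = 2 * np.pi * np.arange(N) / N

# two band-limited, 2*pi-periodic example "fields" at a fixed radius r
A = 1.0 + 0.5 * np.cos(2 * alpha) + 0.3 * np.sin(3 * alpha)
B = 0.7 - 0.2 * np.cos(alpha) + 0.4 * np.cos(4 * alpha)

# Fourier series coefficients (of e^{j nu alpha}) via the FFT
a = np.fft.fft(A) / N
b = np.fft.fft(B) / N

# coefficients of the product A*B = convolution of the coefficients
c_conv = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))   # circular convolution
c_direct = np.fft.fft(A * B) / N

assert np.allclose(c_conv, c_direct)
```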

3.3.5.5 Convolution Theorem

Consider the spatial convolution of two wave fields A_P(x_P, ω) and B_P(x_P, ω)

C_P(x_P, ω) = A_P(x_P, ω) ∗_x B_P(x_P, ω) .   (3.70)

Performing a multidimensional spatial Fourier transformation and applying the convolution theorem of the Fourier transformation (3.14) allows to represent the spatial convolution as a multiplication of the spatially Fourier transformed signals

C_P(k_P, ω) = A_P(k_P, ω) · B_P(k_P, ω) .   (3.71)

It would be desirable to find a similar relationship for the plane wave decomposition of the two wave fields. However, the plane wave decomposed signals depend on one variable less than the cylindrical Fourier transformation of the same signals. Hence, Eq. (3.71) does not hold for plane wave decomposed signals. It seems that no similar relationship exists for the plane wave decomposition. In order to confirm this it is instructive to introduce the definition of the polar convolution (as given by Eq. (B.34)) into the plane wave decomposition

C̄(θ, ω) = ∫₀^∞ ∫₀^{2π} C_P(x_P, ω) e^{jkr cos(α−θ)} r dα dr
         = ∫₀^∞ ∫₀^{2π} ∫₀^∞ ∫₀^{2π} A_P(α′, r′, ω) B_P(ᾱ, r̄, ω) r′ dα′ dr′ e^{jkr cos(α−θ)} r dα dr ,   (3.72)

where ᾱ = ᾱ(α, α′, r, r′) and r̄ = r̄(α, α′, r, r′) as defined by Eq. (B.35). Interchanging the order of integration yields

C̄(θ, ω) = ∫₀^∞ ∫₀^{2π} A_P(α′, r′, ω) r′ ( ∫₀^∞ ∫₀^{2π} B_P(ᾱ, r̄, ω) e^{jkr cos(α−θ)} r dα dr ) dα′ dr′ .   (3.73)


Using geometric relations, the exponential term in the inner integrals can be expressed as

e^{jkr cos(α−θ)} = e^{jk r̄ cos(ᾱ−θ)} e^{jk r′ cos(α′−θ)} .   (3.74)

Introducing Eq. (3.74) into Eq. (3.73) yields

C̄(θ, ω) = ∫₀^∞ ∫₀^{2π} A_P(α′, r′, ω) r′ ( ∫₀^∞ ∫₀^{2π} B_P(ᾱ, r̄, ω) e^{jk r̄ cos(ᾱ−θ)} r dα dr ) e^{jk r′ cos(α′−θ)} dα′ dr′ .   (3.75)

Inspection of the inner integral shows that it does not only involve a dependence on ᾱ and r̄, but also on r directly. Thus, it cannot be expressed as the plane wave decomposition of B_P(ᾱ, r̄, ω), which would be necessary in order to formulate a meaningful convolution theorem.
3.3.5.6 Parseval's Theorem

Parseval's theorem relates the energy of a signal to the energy of its Fourier transformation [Bam89, GRS01]. A similar theorem for acoustic wave fields and their plane wave decompositions will be derived in the following.
Parseval's theorem for spatio-temporal signals given in polar coordinates can be derived from its formulation in Cartesian coordinates [Bam89] by introducing the coordinate and wave vector transformations given by Eq. (B.28) and Eq. (B.29), respectively. Parseval's theorem in polar coordinates is then given as

∫_{−∞}^{∞} ∫₀^∞ ∫₀^{2π} a_P(x_P, t) b_P^*(x_P, t) r dα dr dt = (1/(2π)³) ∫_{−∞}^{∞} ∫₀^∞ ∫₀^{2π} A_P(k_P, ω) B_P^*(k_P, ω) k dθ dk dω ,   (3.76)
where b_P^*(x_P, t) denotes the complex conjugate of the field b_P(x_P, t). For the special choice a_P(x_P, t) = b_P(x_P, t) Eq. (3.76) simplifies to

E_a = ∫_{−∞}^{∞} ∫₀^∞ ∫₀^{2π} |a_P(x_P, t)|² r dα dr dt = (1/(2π)³) ∫_{−∞}^{∞} ∫₀^∞ ∫₀^{2π} |A_P(k_P, ω)|² k dθ dk dω .   (3.77)

The left hand side will be defined as the energy E_a of the field a_P(x_P, t). Equation (3.77) relates the energy of a field to the energy of its Fourier transformed representation. Parseval's theorem for the plane wave decomposition can be derived from Eq. (3.77) by introducing the dispersion relation (2.8) into Eq. (3.77)

∫_{−∞}^{∞} ∫₀^∞ ∫₀^{2π} |a_P(x_P, t)|² r dα dr dt = (1/(2π)³) ∫_{−∞}^{∞} ∫₀^{2π} |Ā(θ, ω)|² (ω/c²) dθ dω .   (3.78)

Equation (3.78) represents Parseval's theorem for the plane wave decomposition.
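The one-dimensional analogue of this energy relation is easy to verify with the discrete Fourier transform, where the continuous factor 1/(2π) becomes 1/N (a toy numerical check, not specific to wave fields):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(256)

E_time = np.sum(np.abs(a) ** 2)                        # energy of the signal
E_freq = np.sum(np.abs(np.fft.fft(a)) ** 2) / len(a)   # energy of its spectrum

assert np.allclose(E_time, E_freq)
```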
3.3.5.7 Plane Wave Extrapolation

Let us first assume that the spatial Fourier transformation P_{C,0}(k_C, ω) of a plane wave traveling in three-dimensional space is given for the plane z = 0. For free-field propagation the spatial Fourier transformation of the acoustic pressure at an arbitrary height z is given as [Ber87, Wil99]

P_C(k_C, ω) = P_{C,0}(k_x, k_y, ω) e^{jk_z z} .   (3.79)

This principle is denoted as plane wave extrapolation [Ber87] and can also be applied to plane wave decomposed signals, as will be illustrated in the following. The plane wave expansion coefficient P̄(θ, ω) for any fixed θ denotes the spectrum of a plane wave with incidence angle θ. This spectrum is given with respect to the origin of the coordinate system. However, for some applications it is desirable to spatially shift the origin of the plane wave decomposition to a point x_{P,0} = (α₀, r₀). This shifted plane wave decomposition will be denoted as P̄_{x_{P,0}}(θ, ω). Generalization of Eq. (3.79) allows to express the shifted plane wave decomposition P̄_{x_{P,0}}(θ, ω) in terms of the unshifted plane wave decomposition P̄(θ, ω) as

P̄_{x_{P,0}}(θ, ω) = P̄(θ, ω) e^{jk r₀ cos(θ − α₀)} .   (3.80)

The wave field extrapolation technique for plane wave decomposed wave fields, as described by Eq. (3.80), is a powerful tool since it allows to extrapolate a given (e.g. measured) plane wave decomposition to other positions. In this context, the inverse plane wave decomposition (3.45) can be understood as the extrapolation of a plane wave decomposed field P̄(θ, ω) to the position x_P, followed by a superposition of all extrapolated plane wave components. The superposition of all plane wave components accounts for the omnidirectional nature of the acoustic pressure P_P(x_P, ω).
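A sketch of Eq. (3.80): the shift of the expansion origin is a pure phase factor per plane wave component, so the magnitude of the decomposition is unchanged. The decomposition below is arbitrary toy data, and c, f, r₀ and α₀ are assumed example values:

```python
import numpy as np

c, f = 343.0, 1000.0               # assumed speed of sound and frequency
k = 2 * np.pi * f / c              # wavenumber

theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)
P_bar = np.exp(-((theta - np.pi) ** 2) / 0.1)   # toy decomposition P_bar(theta, omega)

# shift of the expansion origin to (alpha0, r0), Eq. (3.80)
r0, alpha0 = 0.5, np.pi / 4
P_shift = P_bar * np.exp(1j * k * r0 * np.cos(theta - alpha0))

# the shift is a pure phase factor: magnitudes are unchanged
assert np.allclose(np.abs(P_shift), np.abs(P_bar))
```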
3.3.5.8 Summary

The properties and theorems of the plane wave decomposition that were derived in the previous sections are summarized in Table 3.2. As stated before, they were derived for the two-dimensional plane wave decomposition. A generalization to the three-dimensional plane wave decomposition can be done easily for most of the results by using Eq. (3.42) and the theorems of the one-dimensional Fourier transformation. Some of the derived relationships exhibit a strong resemblance with theorems of the Fourier transformation formulated in cylindrical coordinates. This holds especially for the radial scaling theorem.

    wave field P_P(x_P, ω) = P⁻¹{P̄(θ, ω)}  |  plane wave decomposition P̄(θ, ω) = P{P_P(x_P, ω)}
    ----------------------------------------+----------------------------------------------------
    a A_P(x_P, ω) + b B_P(x_P, ω)           |  a Ā(θ, ω) + b B̄(θ, ω)
    P_P(α, a·r, ω)                          |  (1/|a|²) P̄(θ, ω/a)
    P_P(α − α₀, r, ω)                       |  P̄(θ − α₀, ω)
    A_P(x_P, ω) · B_P(x_P, ω)               |  2π Σ_ν j^ν H_{ν,r}{Ã(ν, r, ω) ∗_ν B̃(ν, r, ω)} e^{jνθ}
    A_P(x_P, ω) ∗_x B_P(x_P, ω)             |  (no corresponding theorem, cf. Section 3.3.5.5)
    ∫∫∫ |a_P(x_P, t)|² r dα dr dt           |  (1/(2π)³) ∫∫ |Ā(θ, ω)|² (ω/c²) dθ dω

    Table 3.2: Properties and theorems of the plane wave decomposition.

3.3.6 Relations to other Methods and Transformations

This section briefly outlines relationships between the plane wave decomposition and other
methods/transformations used for the analysis of multidimensional signals and wave fields.
Slant stack / Radon transformation
Slant stacking is a method well known from seismic data processing. The concept of
slant stacking is based on the Radon transformation [Rad17, Tof96, Dea93]. One method
can be derived from the other as shown in [Tof96]. Both transformations are frequently
used in seismics, computerized tomography (CT), medical imaging and inverse scattering
problems. The link between slant stacking and the plane wave decomposition will be
illustrated in the following. The slant stack integral, generalized to a two-dimensional
acoustic pressure field pC (xC , t) is defined as
uC (sC , ) =

pC (xC , + sTC xC ) d ,

(3.81)

R2

where uC (sC , ) denotes the radon transformation of pC (xC , t) and d = dx dy. Introducing
the coordinate system transformation from Cartesian to polar coordinates, as given by
Eq. (B.28) and Eq. (B.29) (substituting k with s in Eq. (B.29)), into Eq. (3.81) yields
uP (, s, ) =

2
0

pP (, r, + sr cos( )) r ddr .

(3.82)


Comparing Eq. (3.82) with the time-domain plane wave decomposition (3.43) yields equality for

u_P(θ, s, τ)|_{s=1/c} = p̄(θ, τ) .   (3.83)

Thus, the plane wave decomposition as defined within this work can be interpreted as a slant stack or Radon transformation specialized to the case s = 1/c. A similar result was derived in [Sca97]. This result is quite evident when looking at the applications of
was derived in [Sca97]. This result is quite evident when looking at the applications of
the Radon transformation and the slant stack. The Radon transformation is often used
in image processing. The Radon transformation maps straight lines in the image domain
into Dirac peaks in the Radon domain. It is therefore typically used for edge detection in
digital image processing [Tof96]. The same principle can be applied to acoustic fields: A
Dirac shaped plane wave can be understood as an edge in the pressure field pC (xC , t).
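The specialization s = 1/c can be illustrated with a tiny discrete slant stack: an impulsive plane wave recorded along a line of sensors stacks coherently exactly at the slowness of the wave. All array parameters below (spacing, sampling rate, sensor count) are arbitrary example values:

```python
import numpy as np

c = 343.0                          # propagation speed (m/s)
fs = 8000                          # temporal sampling rate (Hz)
x = 0.1 * np.arange(16)            # 16 sensors along a line, 0.1 m spacing

# impulsive plane wave travelling along the array: arrival time t = x / c
p = np.zeros((len(x), 128))
p[np.arange(len(x)), np.round(x / c * fs).astype(int)] = 1.0

# slant stack at tau = 0:  u(s) = sum_m p(x_m, s * x_m)
slowness = np.linspace(0, 2 / c, 81)
u = np.array([p[np.arange(len(x)), np.round(s * x * fs).astype(int)].sum()
              for s in slowness])

# the stack is maximal at the true slowness s = 1/c of the wave
assert np.isclose(slowness[np.argmax(u)], 1 / c)
```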
Beamforming

Beamforming is a method to perform spatial filtering of acoustic fields using microphone arrays. It is often applied in the area of acoustic signal processing for communications, e.g. to suppress unwanted (interfering) acoustic sources in acoustic hands-free front-ends. It will be shown in the following that the plane wave decomposition can be understood as a delay-and-sum beamformer.
The frequency response of a filter-and-sum beamformer built of microphones placed at arbitrary discrete microphone positions x_{C,m} to an incident plane wave with the wave vector k_C and the steering vector k_{C,s} is given as follows [Tre02]

Y_C(k_C, k_{C,s}, ω) = Σ_{m=0}^{M} W_m(ω) e^{−j(k_C − k_{C,s})^T x_{C,m}} ,   (3.84)

where the W_m(ω) denote complex filter coefficients. These are often used to optimize the beamformer response. The steering vector k_{C,s} defines the look direction of the beamformer, which in typical applications is equal to the desired source direction. Figure 3.12 shows a block diagram of the filter-and-sum beamformer. The microphone signals are multiplied by a phase factor (delay), filtered and summed up to create the beamformer output signal. However, the above formulation assumes discrete microphone positions and a plane wave as input signal. Equation (3.84) can be generalized to an arbitrary wave field X_C(x_C, ω) as input signal and a continuous distribution of the microphones. Specializing to the case W_m(ω) = 1, the following continuous representation can be derived in a straightforward manner from Eq. (3.84):

Y_C(k_{C,s}, ω) = ∫_{ℝ²} X_C(x_C, ω) e^{jk_{C,s}^T x_C} dx ,   (3.85)

where dx = dx dy. Comparing Eq. (3.85) with Eq. (3.6a) yields that the delay-and-sum beamformer can be interpreted as a spatial Fourier transformation. Hence, if the

Figure 3.12: Block diagram of a filter-and-sum beamformer as given by Eq. (3.84).


steering vector kC,s undergoes the same transformations as the wave vector of the spatial
Fourier transformation when deriving the plane wave decomposition, the delay-and-sum
beamformer can be interpreted as plane wave decomposition. This result is apparent
when considering that a beamformer is typically designed for the far-field case, thus for
the analysis of plane waves.
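A minimal numerical sketch of the delay-and-sum interpretation, Eq. (3.84) with W_m(ω) = 1: for a circular array the response is maximal when the steering direction coincides with the incidence direction of the plane wave. Array radius, frequency and the number of microphones are assumed example values:

```python
import numpy as np

c, f = 343.0, 1000.0
k = 2 * np.pi * f / c
M = 32                                                    # microphones on a circle
phi = 2 * np.pi * np.arange(M) / M
xm = 0.5 * np.column_stack((np.cos(phi), np.sin(phi)))    # radius 0.5 m

def wavevec(theta):
    return k * np.array([np.cos(theta), np.sin(theta)])

def dsb(theta_in, theta_s):
    """Delay-and-sum response, Eq. (3.84) with W_m = 1."""
    return np.abs(np.sum(np.exp(-1j * xm @ (wavevec(theta_in) - wavevec(theta_s)))))

theta_in = np.pi / 3                                      # incidence angle
steer = np.linspace(0, 2 * np.pi, 180, endpoint=False)
response = np.array([dsb(theta_in, t) for t in steer])

# maximum response when steered towards the incidence angle
assert np.isclose(steer[np.argmax(response)], theta_in)
```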

3.4 The Plane Wave Decomposition using Boundary Measurements

The plane wave decomposition, as introduced in Section 3.3.2, requires to have access
to the acoustic pressure for the entire two-dimensional plane. For practical reasons the
area to be analyzed has to be limited. However, if the analyzed area is limited, then the
Kirchhoff-Helmholtz integral is applicable. Having knowledge of the acoustic field at the
boundary of the area to be analyzed allows to extrapolate the field within this boundary.
This extrapolated field can then be used to calculate the plane wave decomposition from
measurements on the boundary. The following section will derive the effects of limiting
the analyzed area and the plane wave decomposition of a wave field based on boundary
measurements.

3.4.1 The Finite Aperture Plane Wave Decomposition

The two-dimensional plane wave decomposition as given by Eq. (3.41) assumes that the acoustic pressure field can be measured on an infinitely large plane. As stated before, in practical applications the measurement area will be finite. The artifacts caused by this limitation of the measurement aperture will be derived in the following. Due to the underlying polar geometry of the plane wave decomposition only a limitation in the


Figure 3.13: Illustration of the circ function as defined by Eq. (3.88).


radial direction to a fixed radius R will be considered. The finite aperture plane wave decomposition P̄_R(θ, ω) can be derived from the definition of the plane wave decomposition by limiting the radial integral

P̄_R(θ, ω) = ∫₀^R ∫₀^{2π} P_P(x_P, ω) e^{jkr cos(α−θ)} r dα dr ,   (3.86)

where R denotes the radius of the considered circular area. The limitation of the aperture can be analyzed by multiplying the wave field with a circular window function and introducing this into the (infinite) plane wave decomposition (3.41)

P̄_R(θ, ω) = P{circ_P(r/R) · P_P(x_P, ω)} ,   (3.87)

where circ_P(r) denotes the circular window function which is defined as follows

circ_P(r) = circ_C(x, y) = { 1   for r = √(x² + y²) ≤ 1,
                             0   otherwise.                  (3.88)

Figure 3.13 illustrates the circ function. The plane wave decomposition for the multiplication of two fields was derived in Section 3.3.5.4. Its calculation involves the knowledge of the Fourier series expansion coefficients of the two fields. The expansion coefficients of the window function can be calculated by utilizing Eq. (3.16)

C̃(ν, r, ω) = FS⁻¹{circ_P(r/R)} = (1/2π) ∫₀^{2π} circ_P(r/R) e^{−jνα} dα = { δ[ν]   for r ≤ R,
                                                                            0       otherwise,   (3.89)

where δ[ν] denotes the Kronecker delta.


The finite aperture plane wave decomposition P̄_R(θ, ω) is then derived by applying Eq. (3.69)

P̄_R(θ, ω) = 2π Σ_{ν=−∞}^{∞} j^ν H_{ν,r}{C̃(ν, r, ω) ∗_ν P̃(ν, r, ω)} e^{jνθ}
          = 2π Σ_{ν=−∞}^{∞} j^ν ( ∫₀^R P̃(ν, r, ω) J_ν(kr) r dr ) e^{jνθ} ,   (3.90)

where P̃(ν, r, ω) denotes the Fourier series expansion coefficients of the acoustic pressure field. The integral over dr can be interpreted as a finite Hankel transformation [Sne72]

H_{ν,R}{f(r)} = ∫₀^R f(r) J_ν(kr) r dr .   (3.91)

Thus, the finite aperture plane wave decomposition is given by a Fourier series whose coefficients are given by the finite Hankel transformation of the Fourier series coefficients of the wave field

P̄_R(θ, ω) = 2π Σ_{ν=−∞}^{∞} j^ν H_{ν,R}{P̃(ν, r, ω)} e^{jνθ} .   (3.92)
(3.92)

This result is quite illustrative, since the representation of the plane wave decomposition as a Hankel transformation decouples the angular and the radial parts. The radially symmetric finite aperture thus only influences the radial part and therefore only the Hankel transformation.
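The finite Hankel transformation (3.91) is straightforward to evaluate numerically. The sketch below implements it by quadrature and checks it against the exact closed form ∫₀^R J₀(kr) r dr = (R/k) J₁(kR), which follows from the standard Bessel identity d/dz [z J₁(z)] = z J₀(z); the values of k and R are arbitrary:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jv

def finite_hankel(f, nu, k, R):
    """Finite Hankel transformation H_{nu,R}{f}, Eq. (3.91), by quadrature."""
    return quad(lambda r: f(r) * jv(nu, k * r) * r, 0, R, limit=200)[0]

k, R = 18.3, 1.0
val = finite_hankel(lambda r: 1.0, 0, k, R)

# exact result for f(r) = 1:  (R/k) * J_1(kR)
assert np.isclose(val, R / k * jv(1, k * R), atol=1e-10)
```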

3.4.2 Plane Wave Decomposition based on Kirchhoff-Helmholtz Extrapolation

The two-dimensional Kirchhoff-Helmholtz integral (2.70) can be used to extrapolate a pressure field inside an area S from the acoustic pressure and its gradient on the contour L enclosing the area S. The extrapolated wave field can then be used to calculate the plane wave decomposition. As the plane wave decomposition is based on a polar coordinate system, it is natural to choose a circular boundary as the boundary to be measured on. The following discussion will be limited to the case of a circular boundary with fixed radius R. Specialization of the two-dimensional Kirchhoff-Helmholtz integral (2.70) to the geometry depicted in Fig. 3.14 yields

P_{P,e}(x_P, ω) = (jk/4) ∫₀^{2π} ( jρ₀c V_{P,r}(α_l, R, ω) H₀^{(2)}(kr̃) − P_P(α_l, R, ω) H₁^{(2)}(kr̃) cos ϕ ) R dα_l ,   (3.93)


Figure 3.14: Geometry used for the Kirchhoff-Helmholtz integral for a circular boundary.
where P_{P,e}(x_P, ω) denotes the extrapolated pressure field, r̃ = |x_l − x| the distance between the boundary point x_l and the evaluation point x, and ϕ the angle between the boundary normal and x_l − x. The extrapolated pressure field can be introduced into the definition of the plane wave decomposition in order to derive the plane wave decomposed wave field

P̄(θ, ω) = P{P_{P,e}(x_P, ω)} .   (3.94)

Unfortunately, the above specialization of the Kirchhoff-Helmholtz integral, and thus the plane wave decomposition, cannot be solved conveniently. The next section will utilize the concept of circular harmonics to provide an elegant solution for circular boundaries.

3.4.3 Plane Wave Decomposition based on Circular Harmonics

It is promising to express the measured wave field by circular harmonics, since the chosen circular boundary geometry and the plane wave decomposition are both formulated conveniently in polar coordinates. It was shown in Section 2.3.2 that the acoustic pressure can be expressed in terms of circular harmonics (see Eq. (2.25))

P_P(x_P, ω) = Σ_{ν=−∞}^{∞} ( P̄^{(1)}(ν, ω) H_ν^{(1)}(kr) + P̄^{(2)}(ν, ω) H_ν^{(2)}(kr) ) e^{jνα} .   (3.95)

Alternatively, it is convenient to express the acoustic pressure field as a Fourier series with respect to the angle α

P_P(x_P, ω) = Σ_{ν=−∞}^{∞} P̃(ν, r, ω) e^{jνα} ,   (3.96)


where the expansion coefficients P̃(ν, r, ω) are given by Eq. (3.16). Comparing these two Fourier series representations at the boundary (r = R) derives a relation between the expansion coefficients of these two representations

P̃(ν, R, ω) = P̄^{(1)}(ν, ω) H_ν^{(1)}(kR) + P̄^{(2)}(ν, ω) H_ν^{(2)}(kR) .   (3.97)

Please note that the expansion coefficients P̄^{(1)}(ν, ω) and P̄^{(2)}(ν, ω) in terms of circular harmonics are independent of the radius R. They can be used to calculate the plane wave decomposition, as was shown in Section 3.3.4.3. Unfortunately, Eq. (3.97) does not provide a one-to-one relation between the Fourier series coefficients of the acoustic pressure and the expansion coefficients in terms of circular harmonics. Thus, it cannot be solved to derive the circular harmonics coefficients. This conclusion is not surprising, since the Kirchhoff-Helmholtz integral (2.70) states that the acoustic pressure and its gradient are required on the boundary to describe the wave field within the boundary. Hence, the gradient of the acoustic pressure has to be taken into account additionally. Euler's equation (2.4) gives a relation between the gradient of the acoustic pressure and the particle velocity. Its frequency domain representation is given as

∇P_P(x_P, ω) = −jωρ₀ V_P(x_P, ω) ,   (3.98)

where V_P(x_P, ω) denotes the acoustic particle velocity vector. However, only one component of the particle velocity vector V_P(x_P, ω) is sufficient to solve Eq. (3.97). Introducing Eq. (3.95) into Euler's equation (3.98) and evaluation of the radial component only derives

⟨e_r, ∇P_P(x_P, ω)⟩ = k Σ_{ν=−∞}^{∞} ( P̄^{(1)}(ν, ω) H_ν^{(1)′}(kr) + P̄^{(2)}(ν, ω) H_ν^{(2)′}(kr) ) e^{jνα} ,   (3.99)

where e_r denotes the outward pointing radial normal vector and H_ν^{(1)′}(kr), H_ν^{(2)′}(kr) the derivatives of the Hankel functions with respect to their argument. Expressing the inward directed radial component of the acoustic particle velocity V_{P,r}(x_P, ω) as a Fourier series and introducing this series into Eq. (3.98) together with Eq. (3.99) for r = R derives a similar relationship as Eq. (3.97) for the Fourier series expansion coefficients of the radial particle velocity

jρ₀c Ṽ_r(ν, R, ω) = P̄^{(1)}(ν, ω) H_ν^{(1)′}(kR) + P̄^{(2)}(ν, ω) H_ν^{(2)′}(kR) ,   (3.100)

where Ṽ_r(ν, R, ω) denotes the Fourier series expansion coefficients of the acoustic particle velocity in radial direction. Combining Eq. (3.97) and Eq. (3.100) into a matrix equation yields

[ P̃(ν, R, ω)  ]     [ H_ν^{(1)}(kR)             H_ν^{(2)}(kR)            ]   [ P̄^{(1)}(ν, ω) ]
[ Ṽ_r(ν, R, ω) ]  =  [ H_ν^{(1)′}(kR)/(jρ₀c)    H_ν^{(2)′}(kR)/(jρ₀c)   ] · [ P̄^{(2)}(ν, ω) ] ,   (3.101)

where the matrix is denoted as M⁻¹(kR).

Figure 3.15: Block diagram of the plane wave decomposition based on the measurement of the acoustic pressure and velocity on a circular boundary with radius R only.
Solving Eq. (3.101) by inverting M⁻¹(kR) yields the circular harmonics expansion coefficients P̄^{(1)}(ν, ω) and P̄^{(2)}(ν, ω) as

P̄^{(1)}(ν, ω) = ( H_ν^{(2)′}(kR) P̃(ν, R, ω) − H_ν^{(2)}(kR) jρ₀c Ṽ_r(ν, R, ω) ) / ( H_ν^{(1)}(kR) H_ν^{(2)′}(kR) − H_ν^{(1)′}(kR) H_ν^{(2)}(kR) ) ,   (3.102a)

P̄^{(2)}(ν, ω) = ( H_ν^{(1)′}(kR) P̃(ν, R, ω) − H_ν^{(1)}(kR) jρ₀c Ṽ_r(ν, R, ω) ) / ( H_ν^{(1)′}(kR) H_ν^{(2)}(kR) − H_ν^{(1)}(kR) H_ν^{(2)′}(kR) ) .   (3.102b)

The derived result allows to calculate the plane wave decomposition using pressure and pressure gradient measurements performed on a circular boundary. It can be interpreted as a multidimensional filtering process where the Fourier series expansion coefficients P̃(ν, R, ω) and Ṽ_r(ν, R, ω) are filtered by the filter M(kR). The plane wave decomposition of the incoming and outgoing parts of the wave field is then given as the Fourier series (3.58) of the circular harmonics expansion coefficients. Figure 3.15 illustrates the entire computation of the plane wave decomposition based on boundary measurements by combining Eq. (3.102) with Eq. (3.58). This result is similar to the one derived in [HdVB01, Hul04]. The practical implementation of the presented plane wave decomposition using microphone arrays will be discussed in Section 5.2.2.
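The inversion (3.102) can be verified numerically with scipy's Hankel functions and their derivatives: synthesize boundary coefficients from known circular harmonics expansion coefficients via Eqs. (3.97) and (3.100), then recover them. All physical parameters below are assumed example values:

```python
import numpy as np
from scipy.special import hankel1, hankel2, h1vp, h2vp

rho0, c, f, R = 1.2, 343.0, 1000.0, 1.5
kR = 2 * np.pi * f / c * R
nu = np.arange(-10, 11)

rng = np.random.default_rng(7)
P1 = rng.standard_normal(nu.size) + 1j * rng.standard_normal(nu.size)  # ground truth
P2 = rng.standard_normal(nu.size) + 1j * rng.standard_normal(nu.size)

# synthetic boundary measurements, Eqs. (3.97) and (3.100)
Pt = P1 * hankel1(nu, kR) + P2 * hankel2(nu, kR)
Vt = (P1 * h1vp(nu, kR) + P2 * h2vp(nu, kR)) / (1j * rho0 * c)

# inversion according to Eq. (3.102)
den = hankel1(nu, kR) * h2vp(nu, kR) - h1vp(nu, kR) * hankel2(nu, kR)
P1_hat = (h2vp(nu, kR) * Pt - hankel2(nu, kR) * 1j * rho0 * c * Vt) / den
P2_hat = (h1vp(nu, kR) * Pt - hankel1(nu, kR) * 1j * rho0 * c * Vt) / -den

assert np.allclose(P1_hat, P1) and np.allclose(P2_hat, P2)
```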

3.5 Plane Wave Decomposition of Analytic Source Models

The following section will derive the plane wave decomposition of two analytic spatial
source models. The models used for this purpose are plane waves and line sources. These
source types play an important role for the analysis of wave fields on the one hand and
for auralization purposes on the other hand. It will be assumed that the time-domain excitation of the sources is a Dirac pulse. This assumption poses no restriction, as arbitrary
time-domain excitations can be considered by time-domain convolution of the plane wave


decomposed wave fields. This was shown in Section 3.3.5.1. The plane wave decompositions will be derived for the ideal case of an unlimited aperture as well as for the more
practical case of a finite aperture.

3.5.1 Plane Wave Decomposition of a Plane Wave

The plane wave decomposition of a plane wave is derived in the following. Starting point is the pressure field of a plane wave as given by Eq. (2.47)

P_C(x_C, ω) = e^{−jk_{C,0}^T x_C} ,   (3.103)

where k_{C,0} denotes the wave vector of the plane wave and implicitly its incidence angle θ₀. First, no limitation of the aperture will be assumed in order to derive a result under ideal conditions. Introducing the wave field of a plane wave into the two-dimensional spatial Fourier transformation in Cartesian coordinates as given by Eq. (3.6a) yields

P_C(k_C, ω) = ∫_{ℝ²} e^{j(k_C − k_{C,0})^T x} dx = (2π)² δ_C(k_C − k_{C,0}) ,   (3.104)


where d = dxdy. In order to derive the plane wave decomposition of a plane wave
the coordinate system will be changed from the Cartesian coordinate system to polar
coordinates. The Dirac pulse C (kC kC,0 ) can be expressed in polar coordinates as (see
Appendix B.4)
C (kC kC,0 ) = P (kP kP,0 ) =

1
(k k0 ) ( 0 ) ,
k

(3.105)

where 0 denotes the incidence angle of the plane wave. The Dirac pulse (k k0 )
compromises the dispersion relation (2.8) and can be discarded for the derivation of the
desired plane wave decomposition. Hence, the plane wave decomposition of a plane wave
is given as
c
2

P (, ) = (2)
( 0 ) .
(3.106)

The obtained result is evident, since the plane wave decomposition was designed to decompose a wave field into plane waves. Hence, the plane wave decomposition of a plane wave should have the derived form of a spatial Dirac pulse at the incidence angle of the plane wave. It was stated in Section 3.3.3 that the plane wave decomposition exhibits a low-pass character, a fact which can also be seen from the frequency dependent factor c/ω in Eq. (3.106). The following paragraph will consider the case of a finite aperture as discussed in Section 3.4.1.


Finite aperture
The finite aperture plane wave decomposition of a plane wave can be calculated by utilizing Eq. (3.92). However, this requires to calculate the finite Hankel transformation of the angular Fourier series expansion coefficients of a plane wave. The Fourier series expansion coefficients of a plane wave with incidence angle θ₀ can be derived from the Jacobi-Anger expansion (3.55) as

P̃(ν, r, ω) = j^{−ν} J_ν(kr) e^{−jνθ₀} .   (3.107)

The finite Hankel transformation of the Fourier series expansion coefficients P̃(ν, r, ω) is given as [GR65]

H_{ν,R}{P̃(ν, r, ω)} = j^{−ν} e^{−jνθ₀} ∫₀^R J_ν(kr) J_ν(kr) r dr = (1/2) R² j^{−ν} e^{−jνθ₀} J_{ν+1}^2(kR) ,   (3.108)

where R denotes the radius of the circular aperture. The finite aperture plane wave decomposition of a plane wave becomes

P̄_R(θ, ω) = π R² Σ_{ν=−∞}^{∞} J_{ν+1}^2(kR) e^{jν(θ−θ₀)} .   (3.109)

Figure 3.16 shows the finite aperture plane wave decomposition of a plane wave with an incidence angle θ₀ = 180° and an aperture of R = 1 m. Figure 3.16(a) shows the frequency-angle response P̄_R(θ, ω) and Fig. 3.16(b) the time-angle response p̄_R(θ, t). The time-angle response was calculated utilizing an inverse (time-domain) Fourier transformation p̄_R(θ, t) = F_t⁻¹{P̄_R(θ, ω)}. The widening of the frequency response for low frequencies due to the finite aperture can be seen clearly. This is a consequence of the properties of the Bessel functions included in the angular Fourier series given by Eq. (3.109). For ν = 0 this series exhibits no dependence on the angle of the plane wave contributions. Thus, ν = 0 represents the omni-directional case. The zeroth-order Bessel function has its maximum at a lower frequency compared to its higher orders, and this results in the dominant omnidirectional response at low frequencies. As expected, the time-domain response consists of a peak at the incidence angle of the plane wave. However, there are also some artifacts present due to the finite aperture of the decomposition. This ringing effect is also known from the Fourier transformation as Gibbs's phenomenon.
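Equation (3.109) is easy to evaluate with scipy's Bessel functions. The sketch below (truncating the series at |ν| ≤ 60) confirms that the magnitude of the finite aperture decomposition peaks at the incidence angle; evaluating it at several frequencies also reproduces the low-frequency widening discussed above:

```python
import numpy as np
from scipy.special import jv

c, R = 343.0, 1.0
theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)
theta0 = np.pi                         # incidence angle (180 degrees)
nu = np.arange(-60, 61)                # truncated series index

def pwd_finite_aperture(f):
    """Finite aperture PWD of a plane wave, Eq. (3.109), truncated."""
    k = 2 * np.pi * f / c
    coeff = np.pi * R**2 * jv(nu + 1, k * R) ** 2
    return np.abs(coeff @ np.exp(1j * np.outer(nu, theta - theta0)))

for f in (200.0, 2000.0):
    P = pwd_finite_aperture(f)
    assert np.isclose(theta[np.argmax(P)], theta0)   # peak at incidence angle
```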

3.5.2 Plane Wave Decomposition of a Line Source

This section derives the finite aperture plane wave decomposition of a line source. The
acoustic pressure of a line source placed at the origin is given by Eq. (2.40). The wave


Figure 3.16: Finite aperture plane wave decomposition of a (time-domain) Dirac shaped plane wave. The incidence angle of the plane wave was chosen as θ₀ = 180°, the radius of the circular aperture as R = 1 m. Shown are (a) the frequency-angle response P̄_R(θ, ω) (magnitude) and (b) the time-angle response p̄_R(θ, t).
field produced by a line source placed at an arbitrary position x_{P,0} can be derived from the shift theorem of the Hankel functions. This theorem is given as follows [Wil99]

H₀^{(2)}(k|x_P − x_{P,0}|) = { Σ_{ν=−∞}^{∞} J_ν(kr) H_ν^{(2)}(kr₀) e^{jν(α−α₀)}   for r ≤ r₀,
                              Σ_{ν=−∞}^{∞} J_ν(kr₀) H_ν^{(2)}(kr) e^{jν(α−α₀)}   for r > r₀.   (3.110)

It will be considered in the following that the line source is placed outside of the circular aperture, r₀ > R. The acoustic pressure within the circular aperture produced by a line source placed at the position x_{P,0} can be derived from Eq. (3.110) as

P_P(x_P, ω) = −(j/4) Σ_{ν=−∞}^{∞} J_ν(kr) H_ν^{(2)}(kr₀) e^{jν(α−α₀)} ,   (3.111)

where r₀ > R ≥ r. As for the plane wave in the previous section, the limited aperture plane wave decomposition of a line source will be derived by applying Eq. (3.92). The required Fourier series coefficients of the line source can be deduced from Eq. (3.111) as

P̃(ν, r, ω) = −(j/4) J_ν(kr) H_ν^{(2)}(kr₀) e^{−jνα₀} .   (3.112)
(3.112)


The finite Hankel transformation of the Fourier series expansion coefficients P̃(ν, r, ω) is then given as [GR65]

H_{ν,R}{P̃(ν, r, ω)} = −(j/4) e^{−jνα₀} H_ν^{(2)}(kr₀) ∫₀^R J_ν(kr) J_ν(kr) r dr = −(j/8) R² e^{−jνα₀} H_ν^{(2)}(kr₀) J_{ν+1}^2(kR) .   (3.113)

Introducing this result into Eq. (3.92) yields the finite aperture plane wave decomposition of a line source as

P̄_R(θ, ω) = −(π R²/4) Σ_{ν=−∞}^{∞} j^{ν+1} H_ν^{(2)}(kr₀) J_{ν+1}^2(kR) e^{jν(θ−α₀)} .   (3.114)

Figure 3.17 shows the finite aperture plane wave decomposition of a line source placed at the position α₀ = 180°, r₀ = 3 m for an aperture of R = 1 m. Figure 3.17(a) shows the frequency-angle response P̄_R(θ, ω) and Fig. 3.17(b) the time-angle response p̄_R(θ, t). The time-angle response was calculated utilizing an inverse (time-domain) Fourier transformation. The widening of the frequency response for low frequencies due to the finite aperture can be seen clearly.

3.6 The Discrete Plane Wave Decomposition

Up to now the plane wave decomposition was based on a continuous representation of the wave field. The calculation of the (continuous) plane wave decomposition requires to have access to the acoustic pressure at each position within the entire continuous space. However, in practical implementations the wave field will be accessible by measurements only at a limited number of discrete positions. As for other transformations, this requires a discrete formulation of the plane wave decomposition.

3.6.1 Derivation of the Discrete Plane Wave Decomposition

This section will derive the discrete plane wave decomposition. For this purpose a procedure similar to the derivation of the discrete Fourier transformation will be applied [OS99].
3.6.1.1 Definition of the Two-dimensional Polar Pulse Train

Due to practical aspects it is not feasible to measure the wave field to be analyzed in a
continuous manner. In practice, the wave field will be measured only at a limited number
of discrete points in space. As for time-domain sampling, this spatial discretization will
be modeled by a spatial sampling grid consisting of spatial Dirac pulses. Sampling is best


Figure 3.17: Finite aperture plane wave decomposition of the wave field produced by a line source excited with a (time-domain) Dirac pulse. The position of the source was chosen as α₀ = 180°, r₀ = 3 m; the circular aperture has a radius of R = 1 m. Shown are (a) the frequency-angle response P̄_R(θ, ω) (magnitude) and (b) the time-angle response p̄_R(θ, t).
performed in the angular and radial direction due to the underlying polar geometry of the
plane wave decomposition. Thus, a pulse train consisting of spatial Diracs placed on a
polar grid is the natural choice. Within this thesis, the two-dimensional polar pulse train
P (, r) will be defined as follows
\delta_P(\alpha, r) = \sum_{\mu=0}^{M-1} \sum_{\eta=0}^{\infty} \frac{1}{r}\, \delta(r - \eta \Delta r)\, \delta\!\left(\alpha - \mu \frac{2\pi}{M}\right) ,    (3.115)

where M denotes the number of angular discretization points and Δr the radial sampling interval. Figure 3.18 illustrates the sampling grid for M = 24 angular discretization points. Although the number of sampling points is finite in the angular coordinate, it is infinite in the radial coordinate. Limiting the number of radial sampling points limits the aperture of the analyzed area. Please note that the sampling grid does not depend on the temporal frequency, since no time-domain sampling will be considered in the following.
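For illustration, the sampling positions implied by such a polar pulse train can be generated in a few lines. This is only a sketch: the number of radial points kept and the radial spacing are free illustrative choices (the train itself extends to infinity in the radial coordinate).

```python
import numpy as np

def polar_grid(M, n_radial, delta_r):
    """Sampling positions (alpha_mu, r_eta) of a polar pulse train.

    M        -- number of angular discretization points
    n_radial -- number of radial points kept (illustrative truncation)
    delta_r  -- radial spacing (illustrative choice)
    """
    alpha = 2 * np.pi * np.arange(M) / M   # angles alpha_mu = mu * 2*pi / M
    r = delta_r * np.arange(n_radial)      # radii r_eta = eta * delta_r
    return alpha, r

alpha, r = polar_grid(M=24, n_radial=10, delta_r=0.1)
```

The angular spacing is uniform (2π/M), matching the grid shown in Figure 3.18.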
3.6.1.2 Definition of the Discrete Space Plane Wave Decomposition

The polar pulse train, as introduced in the previous section, can be used to model the spatial sampling of a wave field. Due to the sifting property of the Dirac function, multiplication of the continuous wave field with the pulse train yields the values of the wave field at the discrete sampling positions. Introducing the polar pulse train into the definition of the plane wave decomposition (3.41) yields the plane wave decomposition of a sampled field as follows

\bar{P}_*(\theta, \omega) = \mathcal{P}_*\{P_P(\mathbf{x}_P, \omega)\} = \mathcal{P}\{P_P(\mathbf{x}_P, \omega)\, \delta_P(\mathbf{x}_P)\} = \sum_{\mu=0}^{M-1} \sum_{\eta=0}^{\infty} P_P\!\left(\mu \frac{2\pi}{M}, \eta \Delta r, \omega\right) e^{j k \eta \Delta r \cos(\mu \frac{2\pi}{M} - \theta)} .    (3.116)

Figure 3.18: Two-dimensional polar pulse train δ_P(α, r) for M = 24 angular discretization points. The circles mark the positions of the spatial Dirac pulses.

In accordance with the discrete time Fourier transformation (DTFT) [OS99], this transformation will be termed the discrete space plane wave decomposition. The transformation will be denoted in the following by P_* and transformed signals with an asterisk in the index. The spectral properties of the discrete space plane wave decomposition will be analyzed in the next section.

3.6.1.3 Spectral Properties of the Discrete Space Plane Wave Decomposition

The spectral characteristics of the discrete space plane wave decomposition can be computed using the multiplication theorem (3.69). However, this requires calculating the Fourier series expansion coefficients of the polar pulse train (3.115) using Eq. (3.16). Carrying out this calculation yields

\mathring{\delta}(\nu, r) = \frac{1}{2\pi} \int_0^{2\pi} \delta_P(\mathbf{x}_P)\, e^{-j\nu\alpha}\, d\alpha = \frac{M}{2\pi} \sum_{\eta=0}^{\infty} \sum_{\lambda=-\infty}^{\infty} \frac{1}{r}\, \delta[\nu - \lambda M]\, \delta(r - \eta \Delta r) .    (3.117)

According to Section 3.3.5.4 these coefficients have to be convolved with the Fourier series expansion coefficients P̊(ν, r, ω) of the wave field, and the result is then Hankel transformed. The entire procedure results in

\int_0^{\infty} \left\{ \mathring{\delta}(\nu, r) \ast \mathring{P}(\nu, r, \omega) \right\} J_{\nu}(kr)\, r\, dr = \sum_{\lambda=-\infty}^{\infty} \sum_{\eta=0}^{\infty} \mathring{P}(\nu - \lambda M, \eta \Delta r, \omega)\, J_{\nu}(k \eta \Delta r) .    (3.118)

The summation over η constitutes a discrete Fourier-Bessel (or Hankel) transformation (DFBT). Thus, the discrete space plane wave decomposition can be formulated in terms of the DFBT

\bar{P}_*(\theta, \omega) = \mathcal{P}_*\{P_P(\mathbf{x}_P, \omega)\} = 2\pi \sum_{\nu=-\infty}^{\infty} \sum_{\lambda=-\infty}^{\infty} j^{\nu}\, \mathrm{DFBT}_{\nu}\{\mathring{P}(\nu - \lambda M, r, \omega)\}\, e^{j\nu\theta} ,    (3.119)

where DFBT_ν{·} denotes the discrete Fourier-Bessel transformation of order ν. The summation over λ consists of the DFBTs of repetitions of the angular spectrum P̊(ν, r, ω) of the pressure field at the positions ν = λM. These repetitions result from the angular discretization of the wave field. Thus, aliasing may occur if the number of angular sampling points M is finite and the repetitions of the angular spectrum overlap. In order to avoid spatial aliasing, the bandwidth of P̊(ν, r, ω) has to be limited. The above considerations result in the following anti-aliasing condition for an even number of angular sampling positions

\mathring{P}(\nu, r, \omega) = \begin{cases} \mathring{P}(\nu, r, \omega) , & \text{for } -\frac{M}{2} + 1 \le \nu \le \frac{M}{2} \\ 0 , & \text{otherwise.} \end{cases}    (3.120)

If the above anti-aliasing condition is met, the P_* transformation can be rewritten as

\bar{P}_*(\theta, \omega) = \mathcal{P}_*\{P_P(\mathbf{x}_P, \omega)\} = 2\pi \sum_{\nu=-M/2+1}^{M/2} j^{\nu}\, \mathrm{DFBT}_{\nu}\{\mathring{P}(\nu, r, \omega)\}\, e^{j\nu\theta} .    (3.121)

The anti-aliasing condition (3.120) requires a limitation of the bandwidth in the angular frequency domain. This comes down to a filtering process performed on the angle coordinate of the pressure field P_P(α, r, ω). Thus, the anti-aliasing condition requires a spatial filtering of the wave field.
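Such an angular band-limitation can be sketched numerically as a filtering operation on the angle coordinate: transform the field on a circle into its angular Fourier coefficients, apply a rectangular window of width M, and transform back (a circular convolution in the angle). The grid size and the test orders below are illustrative choices, not values from the text.

```python
import numpy as np

def angular_lowpass(p_samples, M):
    """Zero all angular Fourier coefficients outside -M/2+1 <= nu <= M/2."""
    N = len(p_samples)
    coeffs = np.fft.fft(p_samples) / N           # angular Fourier coefficients
    nu = np.fft.fftfreq(N, d=1.0 / N)            # coefficient indices
    window = (nu >= -M / 2 + 1) & (nu <= M / 2)  # rectangular window of width M
    return np.fft.ifft(coeffs * window) * N

# example: band-limit a field containing an order-20 component to M = 24
alpha = 2 * np.pi * np.arange(256) / 256
p = np.exp(1j * 5 * alpha) + np.exp(1j * 20 * alpha)
p_filtered = angular_lowpass(p, M=24)
```

The order-5 component lies inside the band and survives; the order-20 component exceeds M/2 = 12 and is removed.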
3.6.1.4 Definition of the Discrete Plane Wave Decomposition

Equation (3.121) reveals that the discrete space plane wave decomposition can be expressed as a Fourier series with respect to the angle θ. Thus, it is possible to express the series in terms of a discrete Fourier transformation (DFT) when the angle is also discretized. However, it must be assured that the anti-aliasing condition (3.120) is met. In the following it is assumed that the discrete version of the wave field is given as
P(\mu, \eta, \omega) = P_P(\alpha, r, \omega)\, \frac{1}{r}\, \delta(r - \eta \Delta r)\, \delta\!\left(\alpha - \mu \frac{2\pi}{M}\right)    (3.122)

and accordingly the plane wave decomposed field as

\bar{P}(\mu, \omega) = \bar{P}_*(\theta, \omega)\, \delta\!\left(\theta - \mu \frac{2\pi}{M}\right) .    (3.123)

Introducing these definitions into the discrete space plane wave decomposition (3.121) yields

\bar{P}(\mu, \omega) = \sum_{\mu'=0}^{M-1} \sum_{\eta=0}^{\infty} P(\mu', \eta, \omega)\, e^{j k \eta \Delta r \cos\left((\mu' - \mu) \frac{2\pi}{M}\right)} .    (3.124)

A complete discretization of the spatio-temporal domain would also require discretizing the time. However, time-domain discretization will not be considered here, since the discrete time Fourier transformation provides a well-developed framework for this purpose. Please refer to the literature [OS99] for a detailed discussion of this topic.

3.6.2 Sampling and Truncation Artifacts

The anti-aliasing condition, as given by Eq. (3.120), requires a band-limitation of the angular spectrum P̊(ν, r, ω). In order to get more insight into the meaning of this condition, it is instructive to formulate it more generally by means of a window function. The windowed angular Fourier series coefficients P̊_W(ν, r, ω) are given as

\mathring{P}_W(\nu, r, \omega) = \mathring{P}(\nu, r, \omega)\, \mathring{W}(\nu, r, \omega) ,    (3.125)

where W̊(ν, r, ω) denotes the window function. The above equation matches the anti-aliasing condition (3.120) if the window function is chosen as a rectangular window with a width of M in the angular frequency coordinate. Transforming Eq. (3.125) into the spatio-temporal domain using Eq. (3.16) yields

P_{P,W}(\alpha, r, \omega) = \frac{1}{2\pi} \int_0^{2\pi} P_P(\alpha', r, \omega)\, W_P(\alpha - \alpha', r, \omega)\, d\alpha' ,    (3.126)

where W_P(α, r, ω) denotes the Fourier series expansion of W̊(ν, r, ω). Equation (3.126)
can be interpreted as a cyclic convolution of the pressure field with the window function. Hence, the anti-aliasing condition can be realized by spatially filtering the pressure field in the angular coordinate. The spatial filtering of the wave field has to take place before sampling the field, in order to avoid spatial aliasing. One possible realization of this spatial anti-aliasing filter would be to use directive microphones for the recording. Unfortunately, this would require directive microphones of very high order, which are not available yet. In the following, the aliasing artifacts will be analyzed when no spatial anti-aliasing filter is applied and the anti-aliasing condition is thus not met.

Figure 3.19: Angular pulse train for M = 24 angular discretization points. The circles mark the sampling positions. Measurements are taken only at these positions.
3.6.2.1 Angular Sampling of Boundary Measurements

It was shown in Section 3.4 that measurements taken on a boundary surrounding the area of interest are sufficient to derive the plane wave decomposition of the wave field enclosed by this boundary. Taking measurements on a circular boundary requires no discretization in the radial direction; it is sufficient to consider a discretization in the angular coordinate only in order to derive the sampling artifacts. Without loss of generality it is assumed in the following that the plane wave decomposition is derived from measurements taken on a circular boundary with radius R. The polar pulse train (3.115) simplifies to an angular pulse train in this case

\delta_P(\alpha) = \sum_{\mu=0}^{M-1} \delta\!\left(\alpha - \mu \frac{2\pi}{M}\right) .    (3.127)

Figure 3.19 illustrates the angular pulse train for M = 24. The plane wave decomposition based on measurements taken on a circular boundary requires measuring the acoustic pressure and the radial particle velocity (see Section 3.4.3). Both will be measured at discrete positions in a practical implementation.

Figure 3.20: Absolute value of the angular spectrum P̊(ν, R, ω) of a plane wave for R = 1 m. (Axes: temporal frequency in Hz versus angular frequency ν; the gray levels denote the magnitude.)
3.6.2.2 Sampling of a Plane Wave on a Circular Boundary

It was shown before that arbitrary wave fields can be decomposed into plane waves. Thus, it is sufficient to discuss the case of a plane wave in order to derive the sampling artifacts for arbitrary wave fields. The results derived for this special choice of field can be generalized straightforwardly to arbitrary wave fields. Due to the radial symmetry of the plane wave decomposition, the incidence angle of the plane wave will play no role for these considerations. In the following the sampling artifacts of a Dirac shaped plane wave will be derived.
The pressure field of a plane wave in polar coordinates can be expressed in terms of the Jacobi-Anger expansion (3.55). Thus, the acoustic pressure of a plane wave on a circular boundary with radius R is given as

P_P(\alpha, R, \omega) = e^{j k R \cos(\alpha - \theta_0)} = \sum_{\nu=-\infty}^{\infty} \underbrace{j^{\nu}\, J_{\nu}(kR)\, e^{-j\nu\theta_0}}_{\mathring{P}(\nu, R, \omega)}\, e^{j\nu\alpha} .    (3.128)
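The Jacobi-Anger expansion used here is easy to verify numerically; in the sketch below the values of kR, the incidence angle and the truncation order are illustrative choices.

```python
import numpy as np
from scipy.special import jv

k_R = 18.3      # kR, e.g. f = 1 kHz, c = 343 m/s, R = 1 m (illustrative)
theta0 = 0.7    # incidence angle of the plane wave (illustrative)
alpha = np.linspace(0, 2 * np.pi, 181)

# left-hand side: plane wave evaluated on the circular boundary
lhs = np.exp(1j * k_R * np.cos(alpha - theta0))

# right-hand side: truncated series with coefficients j^nu J_nu(kR) e^{-j nu theta0}
nu = np.arange(-60, 61)
coeffs = (1j ** nu) * jv(nu, k_R) * np.exp(-1j * nu * theta0)
rhs = (coeffs[None, :] * np.exp(1j * np.outer(alpha, nu))).sum(axis=1)

max_err = np.max(np.abs(lhs - rhs))
```

Since J_ν(kR) decays rapidly for |ν| well above kR, truncating at |ν| = 60 already reproduces the plane wave to machine precision.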

The calculation of the plane wave decomposition based on boundary measurements using Eq. (3.102) requires calculating the Fourier series expansion coefficients of the acoustic pressure P_P(α, R, ω). Equation (3.128) can be interpreted as a Fourier series with respect to the angle α, where P̊(ν, R, ω) denote the expansion coefficients of the series. Figure 3.20 illustrates the angular spectrum P̊(ν, R, ω) of a plane wave for R = 1 m. The triangular structure of P̊(ν, R, ω) becomes clearer when taking a closer look at the properties of the involved Bessel functions J_ν(kR). Bessel functions of order |ν| ≥ 1 start with small values that increase monotonically until the maximum value of the Bessel function is reached, and then decay oscillating with 1/√(kr) for large arguments. Hence, they exhibit a kind of spatial band-pass character. Figure 3.21 shows the Bessel functions for some particular orders ν. As a result of these properties, P̊(ν, R, ω) in Fig. 3.20 exhibits a kind of triangular structure.

Figure 3.21: Bessel functions J_ν(kr) for the orders ν = [0, 1, 2, 5, 10, 15].
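This band-pass behaviour can be checked directly with scipy's Bessel routines; the order and evaluation points below are illustrative.

```python
from scipy.special import jv

# order nu = 15: negligible response for arguments well below the order ...
below_cuton = abs(jv(15, 5.0))    # x = 5 << nu = 15
# ... and a significant response once the argument approaches the order
near_cuton = abs(jv(15, 17.0))    # x near the first maximum of J_15
```

The first value is practically zero, while the second is of order 0.2; this is the "cut-on" that produces the triangular structure of Fig. 3.20.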
However, due to the discretization only a limited number of M positions will be measured on the circular boundary. The effects of the angular sampling can be derived analogously to the case of polar sampling shown in Section 3.6.1.3. The sampled field P̊_s(ν, R, ω) can be derived by multiplying the field with the angular sampling grid δ_P(α). For the Fourier series expansion coefficients this results in the following convolution

\mathring{P}_s(\nu, R, \omega) = \mathring{P}(\nu, R, \omega) \ast \mathring{\delta}(\nu) = \sum_{\lambda=-\infty}^{\infty} \mathring{P}(\nu + \lambda M, R, \omega) ,    (3.129)

where δ̊(ν) denote the Fourier series expansion coefficients of the angular sampling grid (3.127). Thus, the angular sampling results in repetitions of the angular spectrum P̊(ν, R, ω). Figure 3.22 illustrates this principle. The overlapping of the repeated angular spectra, when these are not band-limited, can be seen clearly. As stated before, these repetitions of the baseband should not overlap in order to avoid aliasing. A suitable anti-aliasing condition for the spatio-temporal sampling has already been derived in Section 3.6.1.3 and is given by Eq. (3.120). However, in the following it will be assumed that the angular spectrum is not band-limited.

Figure 3.22: Illustration of the repetitions of the angular spectrum P̊(ν, R, ω) due to angular sampling.
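The repetition structure can be verified numerically: the angular DFT coefficients of a plane wave sampled at M points equal the sum of the continuous coefficients over all orders congruent modulo M. The sketch assumes the Jacobi-Anger coefficients j^ν J_ν(kR) e^{−jνθ_0}; kR and θ_0 are illustrative choices.

```python
import numpy as np
from scipy.special import jv

M, k_R, theta0 = 24, 18.3, 0.0
alpha = 2 * np.pi * np.arange(M) / M

# DFT coefficients of the plane wave sampled at M angular positions
samples = np.exp(1j * k_R * np.cos(alpha - theta0))
dft = np.fft.fft(samples) / M

# continuous-angle coefficients, folded over orders nu + lambda*M
def folded(nu, lambdas=range(-6, 7)):
    return sum((1j ** (nu + l * M)) * jv(nu + l * M, k_R)
               * np.exp(-1j * (nu + l * M) * theta0) for l in lambdas)

max_err = max(abs(dft[nu] - folded(nu)) for nu in range(M))
```

The agreement is exact up to the (negligible) truncation of the λ sum, which is the discrete counterpart of Eq. (3.129).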
Figure 3.23 shows the absolute value of the angular spectrum P̊_s(ν, R, ω) for a plane wave with incidence angle θ_0 = 0° sampled at M = 24 angular sampling points with a radius of R = 1 m. The aliasing in the baseband |ν| ≤ 12 caused by the repetitions of the baseband can be seen clearly. As a result of these repetitions and the properties of the Bessel functions discussed before, aliasing contributions will always be present in the angular spectrum of a sampled wave field. Thus, the discrete plane wave decomposition will always be an approximate representation of a wave field. Please note that the continuous plane wave decomposition is exact in this sense.
The wave field can be reconstructed from its angular spectrum P̊_s(ν, R, ω) using Eq. (3.128). However, as illustrated by Fig. 3.22, it is necessary to limit the angular frequencies used for the reconstruction, due to the aliasing contributions and the symmetry of the repeated spectra. Thus, the reconstruction of the sampled field is given as

P_P(\mathbf{x}_P, \omega) = \sum_{\nu=-M/2+1}^{M/2} \mathring{P}_s(\nu, R, \omega)\, e^{j\nu\alpha} .    (3.130)
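A numerical sketch of this reconstruction (illustrative parameters, with kR deliberately larger than M/2): reconstruction from the retained coefficients is exact at the sampling positions, since it is a trigonometric interpolant, while aliasing and truncation cause errors in between.

```python
import numpy as np

M, k_R = 24, 18.3                  # kR > M/2: the field is undersampled
alpha = 2 * np.pi * np.arange(M) / M
samples = np.exp(1j * k_R * np.cos(alpha))

coeffs = np.fft.fft(samples) / M
nu = np.rint(np.fft.fftfreq(M, d=1.0 / M)).astype(int)
nu[nu == -M // 2] = M // 2         # use the band -M/2+1 .. M/2 as in the text

def reconstruct(a):
    return (coeffs * np.exp(1j * nu * a)).sum()

# exact at the sampling positions ...
err_on_grid = max(abs(reconstruct(a) - np.exp(1j * k_R * np.cos(a))) for a in alpha)
# ... but not at the midpoints between them
mid = alpha + np.pi / M
err_between = max(abs(reconstruct(a) - np.exp(1j * k_R * np.cos(a))) for a in mid)
```

The on-grid error is at machine precision; between samples the combined aliasing and truncation error is of order one for this choice of kR.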

The limitation of the order ν, however, introduces additional artifacts when reconstructing the wave field.
Summarizing, the discrete plane wave decomposition, when considering no band-limitation of the angular spectrum, exhibits two types of artifacts:
1. aliasing and
2. truncation artifacts.
The following sections perform a quantitative analysis of the aliasing artifacts utilizing two different criteria, followed by an analysis of the error caused by truncation of the angular frequencies.

Figure 3.23: Absolute value of the angular spectrum of a sampled plane wave signal (M = 24, R = 1 m).
3.6.2.3 Quantitative Analysis of Aliasing Artifacts

In the previous section the origin of the aliasing contributions was derived in a qualitative fashion. Unfortunately, these aliasing contributions will lead to artifacts in the plane wave decomposition, and hence to artifacts when reconstructing a wave field from its plane wave decomposition. It was also shown that aliasing errors will always be present when no band-limitation of the angular spectrum is assumed. Figure 3.23 further illustrates that the aliasing error depends on the angular frequency ν and the temporal frequency ω. In order to perform a quantitative analysis of the introduced error, a suitable error criterion has to be chosen and evaluated. The following section will derive two different criteria for the aliasing error and their upper bounds.
Figure 3.24: Energy E_al(ν, R, ω) of the aliasing contributions as defined by Eq. (3.131) and the lower frequency limit f_al for two choices of the allowable aliasing error (−60 dB and −30 dB; M = 24, R = 1 m). The gray levels denote the signal energy in dB.

Energy of the Aliasing Contributions

The interfering aliasing contributions in the baseband |ν| < M/2 are caused by the repetitions of the spectrum P̊(ν, R, ω). Thus, the energy of these repeated contributions in the baseband can be used as an error measure. This leads to the following definition of the energy of the aliasing error E_al(ν, R, ω)

E_{al}(\nu, R, \omega) = \frac{1}{\omega} \int_0^{\omega} \left| \mathring{P}_{al}(\nu, R, \omega') \right|^2 d\omega' ,    (3.131)
where P̊_al(ν, R, ω) denotes the angular spectrum of the aliased contributions. These can be derived from Eq. (3.128) and Eq. (3.129) as

\mathring{P}_{al}(\nu, R, \omega) = \sum_{\substack{\lambda = -\infty \\ |\lambda| \ge 1}}^{\infty} j^{\nu + \lambda M}\, J_{\nu + \lambda M}(kR)\, e^{-j(\nu + \lambda M)\theta_0} .    (3.132)

Figure 3.24 shows the energy E_al(ν, R, ω) of the aliasing contributions for M = 24 angular discretization points at a radius of R = 1 m. It can be seen clearly that the energy depends on the order ν: higher orders exhibit more aliasing at lower frequencies than lower orders. This observation is consistent with Fig. 3.23.
The calculation of E_al(ν, R, ω) requires a numerical evaluation of Eq. (3.131). In the sequel a closed-form upper bound for the aliasing error E_al will be derived. An upper bound for the absolute value of the aliasing contributions is given as

\left| \mathring{P}_{al}(\nu, R, \omega) \right| \le \sum_{\substack{\lambda = -\infty \\ |\lambda| \ge 1}}^{\infty} \left| J_{\nu + \lambda M}(kR) \right| .    (3.133)


When taking the properties of the Bessel functions into consideration (see also Fig. 3.21), it can be concluded that for large M the sum above can be approximated quite well by its first term. Taking additionally the symmetry of the problem into account, and noting that J_ν(·) ≥ 0 in the area of interest, an upper bound for the aliasing error is derived as follows

E_{al}(\nu, R, \omega) \le \frac{1}{\omega} \int_0^{\omega} J_{\nu + M}^2\!\left( \frac{\omega' R}{c} \right) d\omega' ,    (3.134)

where ν ≤ M/2. Using the upper bound for the Bessel functions given in [JKA02], an upper bound for the integral in Eq. (3.134) can be derived as
E_{al}(\nu, R, \omega) \le \frac{c}{R\, u(\nu, M)}\, (kR)^{2\nu + 2M + 1} ,    (3.135)

where

u(\nu, M) = 2^{2\nu + 2M}\, \left((\nu + M)!\right)^2 (2\nu + 2M + 1) .    (3.136)

This result can be used to derive a frequency limit for a given order and allowable aliasing error. If the allowable aliasing error is chosen equal for all angular frequencies, this results in

f_{al}(\nu, M, \bar{E}_{al}) \le \frac{c}{2\pi R} \left( \frac{\bar{E}_{al}\, R\, u(\nu, M)}{c} \right)^{\frac{1}{2\nu + 2M + 1}} ,    (3.137)

where Ē_al denotes the allowable aliasing error. Equation (3.137) can be understood as a kind of anti-aliasing condition in the temporal-frequency domain. Please note that Eq. (3.137) does not provide an exact anti-aliasing condition, as is well known from time-domain sampling of one-dimensional signals [OS99, GRS01]: without a band-limitation of the angular spectrum, there will always be aliasing contributions in the analyzed field. Figure 3.24 shows the derived lower frequency limit f_al(ν, M, Ē_al) for two choices of Ē_al. It can be seen clearly that the derived expression indeed provides an upper bound.
Energy Ratio of the Aliasing to Signal Contributions

Up to now only the energy of the aliasing contributions was taken into account; the energy of the signal without the aliasing contributions was discarded in these considerations. It is reasonable to weight the energy of the aliasing contributions by the desired signal energy, since the aliasing contributions interfere with the desired signal components. The aliasing-to-signal ratio (ASR) will be defined as follows

\mathrm{ASR}(\nu, R, \omega) = \frac{E_{al}(\nu, R, \omega)}{\frac{1}{\omega} \int_0^{\omega} \left| J_{\nu}\!\left( \frac{\omega' R}{c} \right) \right|^2 d\omega'} .    (3.138)

In the following, results obtained from a numerical evaluation of Eq. (3.138) will be shown. Figure 3.25 shows the ASR for a plane wave signal sampled at M = 24 angular points with a radius of R = 1 m. It can be seen that the aliasing contributions become more dominant for higher angular frequencies ν. This result indicates that it is not meaningful to use the higher angular and temporal frequencies of the analyzed wave field, since these are distorted by spatial aliasing contributions.

Figure 3.25: Aliasing-to-signal ratio ASR(ν, R, ω) between the aliasing contributions and the desired plane wave signal (M = 24, R = 1 m). The gray levels denote the signal energy in dB.
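This trend can be reproduced with a simplified numerical stand-in for the ASR: band-summed energies of the folded Bessel orders relative to the baseband order. The parameters are illustrative, and a crude Riemann sum replaces the integral of Eq. (3.138).

```python
import numpy as np
from scipy.special import jv

M, R, c, nu = 24, 1.0, 343.0, 10

def alias_to_signal(f_lo, f_hi, n=200):
    """Band-summed ratio of aliased to desired energy for order nu."""
    f = np.linspace(f_lo, f_hi, n)
    kR = 2 * np.pi * f * R / c
    signal = np.sum(jv(nu, kR) ** 2)
    alias = sum(np.sum(jv(nu + l * M, kR) ** 2) for l in (-2, -1, 1, 2))
    return alias / signal

low = alias_to_signal(100.0, 400.0)     # little aliasing at low frequencies
high = alias_to_signal(1500.0, 1800.0)  # strong aliasing towards higher kR
```

For order ν = 10 the ratio is tiny in the low band and grows by orders of magnitude towards the band where the folded orders (e.g. ν − M = −14) cut on.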

3.6.2.4 Quantitative Analysis of the Truncation Error

As stated before, it is necessary to limit the number of Fourier series expansion coefficients (angular frequencies ν) used for the reconstruction of a wave field. This truncation is necessary due to the aliasing present, and it leads to artifacts when reconstructing the wave field. In the following these artifacts will be analyzed quantitatively.
As before, the basis for the following considerations will be a Dirac shaped plane wave. The wave field of a plane wave on a circular boundary with radius R can be expressed as the series (3.128). Truncation of this series leads to

P_{P,M_{tr}}(\alpha, R, \omega) = \sum_{\nu = -M_{tr} + 1}^{M_{tr}} j^{\nu}\, J_{\nu}(kR)\, e^{j\nu(\alpha - \theta_0)} ,    (3.139)

where M_tr ≤ M/2.

Figure 3.26: Truncation error ε_tr(M_tr, R, ω) for a plane wave (R = 1 m) and the upper bound for the truncation error as given by Eq. (3.141) for two different upper bounds (ε_tr < −60 dB and ε_tr < −30 dB). The gray levels denote the signal energy in dB.

Equation (3.139) can be used to formulate an error measure between the wave field of a plane wave and its truncated series representation
\epsilon_{tr}(M_{tr}, R, \omega) = \left| P_P(\alpha, R, \omega) - P_{P,M_{tr}}(\alpha, R, \omega) \right| ,    (3.140)

where ε_tr(M_tr, R, ω) denotes the truncation error. Due to its radial symmetry, ε_tr(M_tr, R, ω) will not depend on the angle α for a plane wave. Figure 3.26 illustrates the truncation error for different truncations M_tr for a Dirac shaped plane wave reconstructed at a radius of R = 1 m. It can be seen that the error decreases when using more angular frequency components for the reconstruction, and increases for higher temporal frequencies. This result is evident when considering that a wave field may contain finer spatial structures at higher frequencies, and these have to be modeled by higher angular frequencies. As for the aliasing artifacts, it is desirable to derive an upper bound for the truncation error. The chosen definition of the truncation error is equivalent to the one used in [JKA02]; thus, the error bound given there is also valid within our context. It is given as follows
\epsilon_{tr}(M_{tr}, R, \omega) \le \frac{2\, \beta_{tr}(M_{tr}, R)^{M_{tr}+1}}{(M_{tr} + 1)\left( 1 - \beta_{tr}(M_{tr}, R) \right)} ,    (3.141)

where

\beta_{tr}(M_{tr}, R) = \frac{k e R}{2 (M_{tr} + 1)} .    (3.142)

Figure 3.26 additionally shows the derived upper bound of the truncation error for two different maximum truncation errors ε_tr.
The plane wave decomposition, as given by Eq. (3.46), can be expressed as a Fourier series. The expansion coefficients of this series are given as the Hankel transformations of the angular expansion coefficients of the pressure field. However, due to the foregoing considerations, this series representation will also have to be truncated. As a result, the discussed truncation artifacts will also be present in the plane wave decomposed wave field. Please note that truncation artifacts are also well known from the Fourier transformation [OS99]. In practice, a field representation derived in this way will always be truncated. Hence, the derived results state that it is in practice not possible to exactly reconstruct a plane wave which was captured on a circular boundary. The authors of [JKA02] use this fact to define the dimensionality of a wave field. They conclude that for a given reconstruction error a finite number of components is sufficient to characterize a band-limited wave field within a circular area of finite size.
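The qualitative behaviour of the truncation error is easy to reproduce numerically; in the sketch below the value of kR and the truncation orders are illustrative choices.

```python
import numpy as np
from scipy.special import jv

k_R = 18.3     # e.g. kR for f = 1 kHz, c = 343 m/s, R = 1 m (illustrative)
alpha = np.linspace(0.0, 2 * np.pi, 361)
exact = np.exp(1j * k_R * np.cos(alpha))

def truncation_error(M_tr):
    """Maximum deviation of the truncated series from the plane wave."""
    nu = np.arange(-M_tr + 1, M_tr + 1)
    series = ((1j ** nu) * jv(nu, k_R) * np.exp(1j * np.outer(alpha, nu))).sum(axis=1)
    return np.max(np.abs(exact - series))

err_10, err_40 = truncation_error(10), truncation_error(40)
```

Truncating below kR leaves an error of order one; once the truncation order comfortably exceeds kR the error collapses, in line with the dimensionality argument of [JKA02].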

3.6.3 Summary

The continuous plane wave decomposition has to be discretized in real-world applications, since measurements can only be taken at a limited number of positions. This section derived the discrete counterpart to the continuous plane wave decomposition. The discrete plane wave decomposition gives the same results as the continuous plane wave decomposition when the angular spectrum of the wave field is band-limited in the angular frequency domain; the necessary condition is given by Eq. (3.120). However, this condition requires angular filtering of the wave field. Since this is not feasible in practical applications, aliasing contributions will be present in the captured wave field representation. The corresponding aliasing error was analyzed in the previous sections. When reconstructing the wave field from a finite number of measured points, these aliasing artifacts will additionally be superimposed by truncation artifacts. While both types of artifacts are present in a reconstruction of the wave field, only the aliasing artifacts are present in the angular Fourier series expansion coefficients of the field. Since the plane wave decomposition is derived from these by a Fourier series, it also contains both artifacts.


Chapter 4
Listening Room Compensation

The problem of listening room compensation was discussed on a qualitative level in Section 1.1. This chapter will analyze the influence of the listening room on sound reproduction quantitatively, and will propose active listening room compensation as a countermeasure to this influence. For this purpose, the fundamentals of sound reproduction systems will be introduced first, followed by an analysis of the wave field generated by the sound reproduction system in the listening room. An active approach to compensate for the influence of the listening room will be developed on the basis of the foregoing analysis. This approach is first derived in continuous space and time and then discretized. At first glance, the application of traditional adaptive filtering schemes provides a promising solution. However, their application is not advisable due to fundamental problems related to the high number of (correlated) channels used in massive multichannel reproduction systems. Based on the analysis of the fundamental problems of these algorithms, an improved approach to active listening room compensation is proposed in the last part of this chapter.

4.1 Sound Reproduction

It was stated in Chapter 1 that sound reproduction systems aim at perfectly reconstructing an acoustic scene. This section will derive the fundamentals of sound reproduction systems that are capable of fulfilling this goal. For this purpose, the scenario depicted in Figure 4.1 will be considered. The wave field emitted by an arbitrary virtual source S is to be reproduced in the bounded region V. This region will be termed the listening region in the following, since the listeners reside there. The virtual source S may not have contributions in V. The limitation to one virtual source poses no constraints on the wave field to be reproduced, since this source may have arbitrary shape and frequency characteristics. Additionally, multiple sources can be reproduced on the basis of the principle of superposition.
Figure 4.1: Reproduction of the wave field emitted by a virtual source S(x, ω) inside the bounded region V, and the parameters used for the Kirchhoff-Helmholtz integral (4.1).

The basic principle of sound reproduction can be illustrated with the principle of Huygens [MF53a, Wik05b]. Huygens stated that any point of a propagating wavefront at any time instant conforms to the envelope of spherical waves emanating from every point on the wavefront at the prior instant. This principle can be used to synthesize acoustic wavefronts of arbitrary shape. Spherical waves are generated by point sources (see Section 2.4.1), or approximately by closed loudspeakers. According to Huygens' principle these loudspeakers have to be placed on a wavefront. However, it is not very practical to position the acoustic sources on the wavefronts for synthesis. By placing the loudspeakers on an arbitrary fixed curve and by weighting and delaying the driving signals, an acoustic wavefront can be synthesized with a loudspeaker array. Figure 4.2 illustrates this principle. The mathematical foundation of this illustrative description of sound reproduction is given by the Kirchhoff-Helmholtz integral. This principle was introduced in Section 2.6 and will be utilized in the following to derive a generic theory of sound reproduction systems.
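The "weighting and delaying" of the driving signals can be made concrete with a small sketch. Note that the simple 1/r weight and the geometry below are illustrative simplifications, not the exact driving functions derived later.

```python
import numpy as np

c = 343.0   # speed of sound in m/s

def driving_delays_weights(src, speaker_positions):
    """Per-loudspeaker delay and weight approximating a virtual point source."""
    d = np.linalg.norm(speaker_positions - src, axis=1)  # distances to the source
    delays = d / c          # propagation delay from the virtual source position
    weights = 1.0 / d       # simple spherical-spreading weight (illustrative)
    return delays, weights

# linear array of 8 loudspeakers on the x-axis, virtual source 2 m behind it
speakers = np.stack([np.linspace(-1.5, 1.5, 8), np.zeros(8)], axis=1)
delays, weights = driving_delays_weights(np.array([0.0, -2.0]), speakers)
```

The loudspeakers closest to the virtual source fire first, so the superposed spherical waves form an approximately spherical wavefront centered on the virtual source, as in Figure 4.2.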

4.1.1 Sound Reproduction based on the Kirchhoff-Helmholtz Integral

The solution of the inhomogeneous wave equation for a bounded region with respect
to arbitrary boundary conditions was derived in Section 2.6. It will be assumed in the
following that no acoustic sources and obstacles are present within the listening region V .

Figure 4.2: Application of Huygens' principle to sound reproduction.


Hence, the homogeneous wave equation is valid within V. It remains to choose suitable inhomogeneous boundary conditions on ∂V for the reproduction of the virtual source S. The solution of the homogeneous wave equation subject to inhomogeneous boundary conditions, as discussed in Section 2.6, is given by the Kirchhoff-Helmholtz integral (2.63). This basic principle states that the sound pressure within a source-free bounded region V is fully determined by the inhomogeneous boundary conditions imposed on the closed surface ∂V surrounding V. In general, a suitable choice of the desired boundary conditions for sound reproduction is given by a combination of the pressure and the directional pressure gradient of the wave field produced by the virtual source S(x, ω) on ∂V. For sound reproduction this combination is specified by the Kirchhoff-Helmholtz integral. Thus, the wave field within V is given as

P(\mathbf{x}, \omega) = \oint_{\partial V} \left( G_0(\mathbf{x}|\mathbf{x}_0, \omega)\, \frac{\partial}{\partial n} S(\mathbf{x}_0, \omega) - S(\mathbf{x}_0, \omega)\, \frac{\partial}{\partial n} G_0(\mathbf{x}|\mathbf{x}_0, \omega) \right) dS_0 ,    (4.1)
where S(x_0, ω) denotes the wave field produced by the virtual source S on the boundary ∂V. The underlying geometry is illustrated in Figure 4.1. Equation (4.1) is valid for all points within V (x ∈ V); outside of V the acoustic pressure P(x, ω) equals zero. In order to derive an interpretation of Eq. (4.1), the involved Green's functions have to be specified. As stated in Section 2.6.1 and Section 2.6.2, the free-field Green's function G_0(x|x_0, ω) and its directional gradient can be understood as the fields emitted by sources placed on ∂V. These sources will be termed secondary sources in the following. The strength of these sources is determined by the pressure and the directional pressure gradient of the virtual source field S(x_0, ω) on ∂V.
Thus, this specialized Kirchhoff-Helmholtz integral can be interpreted as follows: if appropriately chosen secondary sources are driven by the sound pressure and the directional pressure gradient of the wave field emitted by the virtual source S on the boundary ∂V, then the wave field within the region V is equivalent to the wave field which would have been produced by the virtual source inside V. Thus, the theoretical basis of sound reproduction is given by the Kirchhoff-Helmholtz integral (2.63).
Please note that the space V in Eq. (4.1) may be two- or three-dimensional. In the first case V describes a plane and ∂V the closed curve surrounding it; in the second case V describes a volume and ∂V the closed surface surrounding it. However, the Green's function, and thus the secondary sources used in the Kirchhoff-Helmholtz integral, depend on the dimensionality of the problem. The free-field Kirchhoff-Helmholtz integral for three- and two-dimensional regions V was already derived in Sections 2.6.2 and 2.6.1. The next two sections will illustrate the application of this principle to sound reproduction in two and three dimensions.

4.1.2 Three-dimensional Sound Reproduction

The three-dimensional free-field Green's function was already derived in Section 2.4.2 and is given by Eq. (2.33). In the context of sound reproduction, it can be interpreted as the field of a monopole point source distribution on the surface ∂V. The Kirchhoff-Helmholtz integral (4.1) also involves the directional gradient of the Green's function. The directional gradient of the three-dimensional free-field Green's function, as given by Eq. (2.66), can be interpreted as the field of a dipole source whose main axis lies in the direction of the normal vector n. Thus, in the three-dimensional case the Kirchhoff-Helmholtz integral states that the acoustic pressure inside the volume V can be controlled by a monopole and a dipole point source distribution on the surface ∂V enclosing the volume V.

4.1.3 Two-dimensional Sound Reproduction

In general it will not be feasible to control the pressure and its gradient on the entire two-dimensional surface of a three-dimensional volume. Typical reproduction systems are restricted to reproduction in a plane only. This reduction of dimensionality is reasonable for most scenarios due to the spatial characteristics of human hearing [Bla96]. The two-dimensional Kirchhoff-Helmholtz integral has been derived in Section 2.6.2. Please note that within this work (see Section 2.1.2) the term two-dimensional refers to truly two-dimensional wave fields or fields that are independent of one of the three spatial coordinates. The required two-dimensional free-field Green's function for the Kirchhoff-Helmholtz integral is given by Eq. (2.68). It can be interpreted as the field of a monopole line source which intersects the reproduction plane at the position x0. The directional gradient of the two-dimensional free-field Green's function, as given by Eq. (2.69), can be interpreted as the field of a dipole line source whose main axis lies in direction of the normal vector n. Thus, the Kirchhoff-Helmholtz integral states in this


case that the acoustic pressure on the plane V can be controlled by a monopole and a dipole line source distribution on the closed curve ∂V surrounding the plane.

4.1.4 Sound Reproduction with Monopole Secondary Sources

The Kirchhoff-Helmholtz integral states that a sound reproduction system may be realized with secondary monopole and dipole sources. In practice it is desirable to utilize only one of these two source types. Thus, one of the two secondary source terms in the Kirchhoff-Helmholtz integral (4.1) has to be eliminated. Since monopole sources can be realized as a first approximation by closed loudspeakers, it is reasonable to drop the dipole sources. However, similar principles as shown in the following can be applied to drop the monopole contributions.
The second term in the Kirchhoff-Helmholtz integral (4.1), belonging to the dipole secondary sources, can be eliminated by assuming homogeneous Neumann boundary conditions on ∂V. As a result, the boundary ∂V will be modeled as an acoustically rigid boundary. In order to fulfill this requirement, the Green's function used in the Kirchhoff-Helmholtz integral has to be modified. As a consequence of the desired boundary condition, this modified Green's function G(x|x0, ω) has to obey the following condition


$$ \left. \frac{\partial}{\partial n}\, G(x|x_0,\omega) \right|_{x_0 \in \partial V} = 0 . \qquad (4.2) $$
The desired Green's function can be derived by adding a suitable homogeneous solution (with respect to the region V) to the free-field Green's function

$$ G(x|x_0,\omega) = G_0(x|x_0,\omega) + G_{0,m}(x_m(x)|x_0,\omega) , \qquad (4.3) $$

where G0,m(xm(x)|x0, ω) denotes a suitable free-field Green's function with the source point x0 and the receiver point xm(x). The receiver point xm(x) has to be chosen such that G(x|x0, ω) fulfills the presumed Neumann boundary condition. In the following, the Green's function G0,m(xm(x)|x0, ω) will be derived for the three-dimensional case. However, the same principles can also be applied to the two-dimensional case, as will be shown later in this section.
The directional gradient of the modified Green's function (4.3) is given as

$$ \frac{\partial}{\partial n} G(x|x_0,\omega) = \frac{\partial}{\partial n}\left( \frac{1}{4\pi}\frac{e^{-jk|x-x_0|}}{|x-x_0|} + \frac{1}{4\pi}\frac{e^{-jk|x_m(x)-x_0|}}{|x_m(x)-x_0|} \right) = \frac{1+jk|x-x_0|}{4\pi|x-x_0|^2}\, e^{-jk|x-x_0|} \cos\varphi + \frac{1+jk|x_m(x)-x_0|}{4\pi|x_m(x)-x_0|^2}\, e^{-jk|x_m(x)-x_0|} \cos\psi \overset{!}{=} 0 , \qquad (4.4) $$

where φ and ψ denote the angles between the normal vector n and the vectors x − x0 and xm(x) − x0, respectively. A solution of Eq. (4.4) is given by choosing the receiver point xm(x) according to Fig. 4.3 as the point x mirrored at the tangent to the curve ∂V at the position x0. This way the directional gradient of G0,m(xm(x)|x0, ω) is equal to the directional gradient of G0(x|x0, ω) but with opposite sign, since cos φ and cos ψ have the same magnitude but opposite signs for this special geometry. Thus, G0,m(xm(x)|x0, ω) is given as

$$ G_{0,m}(x_m(x)|x_0,\omega) = \frac{1}{4\pi}\, \frac{e^{-jk|x_m(x)-x_0|}}{|x_m(x)-x_0|} . \qquad (4.5) $$

Figure 4.3: Illustration of the geometry used for the derivation of the modified Green's function for a sound reproduction system using monopole secondary sources only.

Please note that the point xm(x) is always outside the region V (xm ∈ R³\V). Introducing Eq. (4.5) into Eq. (4.4), with the geometry depicted in Fig. 4.3, yields that G0,m(xm(x)|x0, ω) fulfills the requirement prescribed by Eq. (4.2). Thus, the reproduction of arbitrary wave fields inside the volume V using a distribution of monopole sources only on the surface ∂V is possible in principle. However, two problems remain:
1. the field outside the volume V will not vanish, and
2. the unmodified reproduction system would produce undesired reflections due to the homogeneous Neumann boundary condition assumed to discard the dipoles.
In the sequel these two problems will be addressed, as well as two-dimensional sound reproduction.
Consequences of the wave field produced outside of the listening area
The field outside the volume V will not vanish, as would be the case for reproduction based on the full Kirchhoff-Helmholtz integral. Equation (4.5) together with Eq. (4.3) states that the field outside the region V is a mirrored version of the field within V. As a consequence, V has to be convex in order to avoid deteriorations of the wave field within the listening region V by its mirrored version.


Both the Green's function G0(x|x0, ω) and G0,m(xm(x)|x0, ω) can be interpreted as point sources placed at the position x0. Their wave fields produced inside V will be equivalent, since by construction |x − x0| = |xm(x) − x0|. The Green's function G(x|x0, ω) used for reproduction is then given by introducing Eq. (4.5) into Eq. (4.3) as

$$ G(x|x_0,\omega) = 2\, G_0(x|x_0,\omega) . \qquad (4.6) $$

Hence, the strength of the virtual source has to be doubled to cope with the additional secondary source at x0 represented by G0,m(xm(x)|x0, ω). The wave field at arbitrary receiver points x is then given as

$$ P(x,\omega) = -\frac{1}{4\pi} \oint_{\partial V} 2\, \frac{\partial}{\partial n} S(x_0,\omega)\, \frac{e^{-jk|x-x_0|}}{|x-x_0|}\, dS_0 . \qquad (4.7) $$
Consequences of the imposed Neumann boundary condition
The homogeneous Neumann boundary condition chosen in the derivation above implicitly models the surface ∂V as a rigid surface. As a result, the reproduction of the desired wave field using Eq. (4.7) will additionally reproduce reflections at the (virtual) boundary ∂V. These reflections, however, are not desired since the reproduction system should model a free-field space V. The reflections at the border ∂V only take place for those components of the wave field where the local propagation direction of the wave field to be reproduced does not coincide with the normal vector n. Thus, these undesired reflections are avoided by exciting only those secondary sources whose normal vector n coincides with the local propagation direction of the wave field to be reproduced. This selection can be performed by introducing a window function a(x0) into Eq. (4.7)

$$ P(x,\omega) = -\frac{1}{4\pi} \oint_{\partial V} 2\, a(x_0)\, \frac{\partial}{\partial n} S(x_0,\omega)\, \frac{e^{-jk|x-x_0|}}{|x-x_0|}\, dS_0 , \qquad (4.8) $$

where a(x0) is defined as

$$ a(x_0) = \begin{cases} 1 , & \text{if the local propagation direction of } S(x_0,\omega) \text{ coincides with } n, \\ 0 , & \text{otherwise.} \end{cases} \qquad (4.9) $$

Equation (4.8) together with Eq. (4.9) comprises the basis of sound reproduction systems utilizing secondary monopole sources only. If the distribution of monopole sources on ∂V is driven by the directional pressure gradient of the virtual source weighted by the window function a(x0), then the wave field of this source is reconstructed perfectly inside V. The terms involved to drive the secondary sources in Eq. (4.8) can be combined into the driving function D(x0, ω)

$$ D(x_0,\omega) = -2\, a(x_0)\, \frac{\partial}{\partial n} S(x_0,\omega) = 2\, a(x_0)\, j\omega\rho_0\, V_{n,S}(x_0,\omega) , \qquad (4.10) $$


where Vn,S(x0, ω) denotes the particle velocity of the virtual source in direction of the surface normal n.
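As a small numerical illustration of the driving function (4.10) and the window (4.9), the sketch below (illustrative code, not taken from the thesis; the circular contour, the source count and all function names are assumptions) evaluates the monopole driving signals for a virtual plane wave on a circular secondary source contour. The window reduces to selecting those secondary sources whose inward normal has a positive component in the propagation direction of the plane wave:

```python
import numpy as np

C = 343.0  # speed of sound in m/s

def plane_wave_driving_signals(alpha0, R, f, theta_pw):
    """Evaluate D(x0, w) = -2 a(x0) dS(x0, w)/dn, cf. Eq. (4.10), for a
    virtual plane wave S(x, w) = exp(-j (w/c) <k_hat, x>) with incidence
    angle theta_pw, on a circle of radius R with inward normal n."""
    w = 2 * np.pi * f
    k_hat = np.array([np.cos(theta_pw), np.sin(theta_pw)])
    x0 = R * np.stack([np.cos(alpha0), np.sin(alpha0)], axis=-1)
    n = -x0 / R                                  # inward normal of the circle
    S = np.exp(-1j * (w / C) * (x0 @ k_hat))     # virtual source field on the contour
    dSdn = -1j * (w / C) * (n @ k_hat) * S       # directional pressure gradient
    a = ((n @ k_hat) > 0).astype(float)          # window function, Eq. (4.9)
    return -2 * a * dSdn, a

# 64 secondary sources on a circle of radius 1.5 m, plane wave from 90 deg
alpha0 = (np.arange(64) + 0.5) * 2 * np.pi / 64
D, a = plane_wave_driving_signals(alpha0, R=1.5, f=1000.0, theta_pw=np.pi / 2)
print(int(a.sum()))   # -> 32: exactly half of the sources are active
```

As expected from the discussion above, only the secondary sources on the half of the contour where the plane wave enters the listening area are driven.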
Two-dimensional sound reproduction
Similar considerations as performed above lead to the wave field created by a distribution of monopole line sources on the closed curve ∂V

$$ P(x,\omega) = -\frac{j}{4} \oint_{\partial V} D(x_0,\omega)\, H_0^{(2)}(k\,|x-x_0|)\, dS_0 , \qquad (4.11) $$
where the driving function D(x0, ω) is given by Eq. (4.10). Figure 4.4 shows the reproduced wave field when using a two-dimensional circular distribution of secondary monopole sources. The radius of the circular region was chosen as R = 1.50 m. Two cases were evaluated: (1) the left row shows the results when the window function a(x0) is discarded, (2) the right row shows the results when incorporating the window function. These results were calculated by numerical evaluation of Eq. (4.11). It can be clearly seen that the window function eliminates the reflections introduced by the Neumann boundary conditions. The wave field is reproduced correctly within the circular distribution of secondary monopole sources. The wave field outside of that region does not vanish, as would be the case when using secondary monopole and dipole sources.
Linear secondary source distributions
The closed contour ∂V can be degenerated to an infinite line for two-dimensional reproduction or an infinite plane for three-dimensional reproduction. The line/plane will then divide the two-/three-dimensional space into two regions. One of these can be chosen as the listening area. However, only virtual source fields whose local propagation direction at the secondary source distribution coincides with the normal vector n can be reproduced. Specializing Eq. (4.11) to the case that the secondary source contour ∂V degenerates to an infinite line located on the x-axis yields

$$ P_C(x_C,\omega) = -\frac{j}{4} \int_{-\infty}^{\infty} D_C(x_{C,0},\omega)\, H_0^{(2)}\!\left(k\,|x_C - x_{C,0}|\right) dx_0 , \qquad (4.12) $$

where xC,0 = [x0 0]ᵀ. The above formulation can also be derived from the two-dimensional Rayleigh integral [Sta97]. Equation (4.12) can be used to describe linear secondary source distributions. These are frequently used for practical implementations of spatial sound reproduction systems, e.g. for wave field synthesis systems (see Section 5.1).
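The synthesis described by Eq. (4.12) can be checked numerically. The sketch below (an illustrative test, not taken from the thesis; the dense source grid, the aperture of ±30 m and the sign conventions follow the derivation in this section) drives a truncated line of monopole line sources with the plane wave driving function derived later in Eq. (4.26) and evaluates the field at a single listener position:

```python
import numpy as np
from scipy.special import hankel2

C = 343.0  # speed of sound in m/s

def reproduced_pressure(x, y, f, theta_pw, x0, dx0):
    """Numerically evaluate Eq. (4.12): secondary monopole line sources on
    the x-axis, driven for a virtual plane wave with incidence angle
    theta_pw, evaluated at the listener position (x, y)."""
    k = 2 * np.pi * f / C
    # driving function for a plane wave, cf. Eq. (4.26)
    D = 2j * k * np.sin(theta_pw) * np.exp(-1j * k * np.cos(theta_pw) * x0)
    r = np.hypot(x - x0, y)                      # |x_C - x_C,0|
    return -1j / 4 * np.sum(D * hankel2(0, k * r)) * dx0

# 500 Hz plane wave with normal incidence; listener 2 m in front of the array
f, theta = 500.0, np.pi / 2
x0 = np.linspace(-30.0, 30.0, 6001)              # dense, truncated source grid
P = reproduced_pressure(0.0, 2.0, f, theta, x0, x0[1] - x0[0])
print(abs(P))   # close to 1, the amplitude of the desired plane wave
```

Up to truncation ripple, the magnitude and phase of P match the desired plane wave exp(-jky) at the listener position, as predicted by the theory above.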
Focused virtual sources

The theory of sound reproduction introduced so far assumed that the virtual source S(x, ω) has no contributions within the listening area. The theory can be extended to



Figure 4.4: Reproduction of a band-limited (sinc shaped) plane wave using a two-dimensional circular distribution of secondary monopole sources (R = 1.50 m). The left row shows the resulting wave field when discarding the window function a(x0), the right row when taking it into account.


focused virtual sources, i.e. virtual sources with contributions inside the listening area. This extension will not be considered here; it can be found in [Ver97, TAG+01, YTF03].

4.1.5 Reproduction of a Plane Wave Decomposed Field

It was shown in the previous section that the local traveling direction of the virtual source wave field has to be taken into account for reproduction using a monopole-only secondary source distribution. The plane wave decomposition, as introduced in Section 3.3.2, inherently includes this information. In the following, the loudspeaker driving signals will be derived from the plane wave decomposed wave field of the virtual source. Without loss of generality the discussion will be limited to the two-dimensional case.
The representation of a wave field by its plane wave decomposition is given by the inverse plane wave decomposition (3.45). Hence, the wave field of the virtual source S(x, ω) can be represented as follows
$$ S_P(x_P,\omega) = \frac{k}{(2\pi)^2} \int_0^{2\pi} \bar{S}(\theta,\omega)\, e^{-jkr\cos(\alpha-\theta)}\, d\theta , \qquad (4.13) $$

where S̄(θ, ω) denotes the plane wave expansion coefficients of the virtual source field SP(xP, ω), as given by Eq. (3.41). The calculation of the monopole driving signal (4.10) includes the calculation of the directional gradient of the virtual source field. The gradient of the exponential term in Eq. (4.13) can be derived as follows

$$ \nabla\, e^{-jkr\cos(\alpha-\theta)} = -jk\, e^{-jkr\cos(\alpha-\theta)} \begin{bmatrix} \cos(\alpha-\theta) \\ -\sin(\alpha-\theta) \end{bmatrix} . \qquad (4.14) $$
Using the above result, the gradient of Eq. (4.13) follows as

$$ \nabla S_P(x_P,\omega) = \frac{-jk^2}{(2\pi)^2} \int_0^{2\pi} \bar{S}(\theta,\omega)\, e^{-jkr\cos(\alpha-\theta)} \begin{bmatrix} \cos(\alpha-\theta) \\ -\sin(\alpha-\theta) \end{bmatrix} d\theta , \qquad (4.15) $$

where α and r denote the coordinates of the spatial polar coordinate system. The directional gradient of the plane wave decomposed virtual source field takes the local geometry of the virtual boundary ∂V into account. Figure 4.5 illustrates the underlying geometry for one particular plane wave with the wave vector k (∠(k) = θ). The directional gradient of the plane wave decomposed virtual source field can be expressed as

$$ \frac{\partial}{\partial n}\, S_P(x_P,\omega) = \langle n, \nabla S_P(x_P,\omega) \rangle = \frac{-jk^2}{(2\pi)^2} \int_0^{2\pi} \bar{S}(\theta,\omega)\, \cos(\theta-\beta)\, e^{-jkr\cos(\alpha-\theta)}\, d\theta , \qquad (4.16) $$
where β denotes the angle of the surface normal n (∠(n) = β). The driving signal D(x0, ω) can be derived from Eq. (4.16) by using Eq. (4.10) together with a suitably chosen window function a(x0). Considering the geometry depicted in Fig. 4.5, the generic definition (4.9) of the window function can be formulated more precisely in terms of the angles of the individual plane wave contributions

$$ a(x_0) = \begin{cases} 1 , & \text{if } |\theta - \beta| \le \pi/2 , \\ 0 , & \text{otherwise.} \end{cases} \qquad (4.17) $$

Figure 4.5: Geometric parameters used to derive the driving signals for the reproduction of a plane wave decomposed wave field. The geometry is illustrated for one particular plane wave, with k denoting the wave vector of this plane wave.

The influence of the window function is illustrated by the gray wedge in Fig. 4.5. Introducing Eq. (4.16) and Eq. (4.17) into Eq. (4.10) yields the driving signal at the location xP,0 = [α0 r0]ᵀ as

$$ D_P(x_{P,0},\omega) = \frac{2jk^2}{(2\pi)^2} \int_{\beta-\pi/2}^{\beta+\pi/2} \bar{S}(\theta,\omega)\, \cos(\theta-\beta)\, e^{-jkr_0\cos(\alpha_0-\theta)}\, d\theta . \qquad (4.18) $$

The presented results reveal that a plane wave decomposition of the virtual source S(x, ω) allows to conveniently incorporate the effect of the window function a(x0) into the formulation of the driving signal D(x0, ω).

4.1.6 Spatial Sampling of the Secondary Monopole Source Distribution

The theory of sound reproduction, as presented so far, assumed a continuous distribution of secondary sources placed on the boundary ∂V surrounding the listening region V. In practical implementations, the secondary source distribution will be realized by a finite number of secondary sources placed at discrete positions. This spatial sampling of the secondary source distribution may lead to spatial aliasing. The following section will derive the effects of spatial sampling and suitable anti-aliasing conditions. The theory will be presented for the two-dimensional case; however, it can be extended in a straightforward way to the three-dimensional case using the presented techniques.
The reproduced wave field within the listening area V was derived in Section 4.1.4. It is given by Eq. (4.11), which can be generalized as follows

$$ P(x,\omega) = \oint_{\partial V} D(x_0,\omega)\, V(x-x_0,\omega)\, dS_0 , \qquad (4.19) $$

where V(x − x0, ω) denotes the wave field of the secondary sources, including the negative sign. In the two-dimensional case these secondary sources constitute line sources. The wave field of the secondary line sources is given by the two-dimensional free-field Green's function (2.68)

$$ V_{2D}(x-x_0,\omega) = -\frac{j}{4}\, H_0^{(2)}(k\,|x-x_0|) . \qquad (4.20) $$
Sampling of a one-dimensional signal in the time domain leads to repetitions of the spectrum of this signal [GRS01, OS99]. Due to these repetitions of the spectrum, a frequency domain analysis of temporal sampling is most convenient. Aliasing artifacts will be present in a time-domain sampled signal if the signal is not band-limited or if the band-limited repeated spectra overlap. The same principles as used for time-domain signals can be applied to spatial sampling of multidimensional signals.
Equation (4.19) can be understood as a generalized multidimensional convolution integral. The convolution is performed on the contour ∂V with the listening position x as parameter. This generalized convolution can be derived from the multidimensional convolution (3.32) by a suitable parameterization of the boundary ∂V. For the derivation of the sampling artifacts, a spatio-temporal frequency domain description of the reproduced wave field is desired. Due to the convolutional structure of Eq. (4.19), this will lead to a multiplication of the driving function D̃(k, ω) and the secondary source term Ṽ(k, ω) in the spatio-temporal frequency domain for special secondary source geometries. In the following two sections this will be illustrated for linear and circular contours ∂V. The results will then be generalized to arbitrarily shaped contours. For the following spatial sampling considerations secondary line sources will be assumed. However, the results also hold for secondary point sources used in two-dimensional reproduction. Since the secondary sources will be realized by loudspeakers, the discretized secondary source distribution will be termed a (loudspeaker) array.
4.1.6.1 Linear Arrays

In the following a linear secondary source distribution will be considered. Anti-aliasing conditions for linear secondary source distributions were already derived in [Sta97, LB05].



Figure 4.6: Geometry used to derive the sampling artifacts for linear loudspeaker arrays. The dots denote the sampling positions of the driving function DC,S(x, ω).
However, no detailed analysis of the aliasing artifacts has been performed so far. This section analyzes the spatial aliasing artifacts of linear secondary source distributions and derives an anti-aliasing condition.
It will be assumed that the secondary source distribution is located on the x-axis (y = 0) of a Cartesian coordinate system and in a first step has infinite length. Figure 4.6 illustrates the geometry of the line array. The reproduced wave field is given by specializing Eq. (4.19) to the geometry depicted in Fig. 4.6, as follows

$$ P_C(x_C,\omega) = \int_{-\infty}^{\infty} D_C(x_{C,0},\omega)\, V_C(x_C-x_{C,0},\omega)\, dx_0 , \qquad (4.21) $$

where for a line array placed on the x-axis xC,0 = [x0 0]ᵀ. Equation (4.21) exhibits the form of a convolution integral along the x-axis. Applying a two-dimensional spatial Fourier transformation and the convolution theorem (3.8) yields the pressure field in the spatio-temporal frequency domain as

$$ \tilde{P}_C(k_C,\omega) = \tilde{D}_C(k_x,\omega)\, \tilde{V}_C(k_C,\omega) , \qquad (4.22) $$

where D̃C(kx, ω) = Fx{DC(x, ω)}. In order to derive the wave field reproduced by a discrete distribution of secondary sources, it is assumed that the driving function DC(x, ω) is sampled at equidistant discrete positions. The process of sampling can be modeled by a multiplication of the continuous driving function with a series of Dirac functions

$$ D_{C,S}(x,\omega) = D_C(x,\omega)\, \frac{1}{\Delta x} \sum_{\eta=-\infty}^{\infty} \delta(x-\eta\,\Delta x) = \frac{1}{\Delta x} \sum_{\eta=-\infty}^{\infty} D_C(x,\omega)\, \delta(x-\eta\,\Delta x) , \qquad (4.23) $$

where DC,S(x, ω) denotes the sampled driving function and Δx the distance (sampling period) between the sampling positions. The sampling positions are indicated in Fig. 4.6


by the dots. The result of sampling is a series of weighted Dirac pulses at the sampling positions. The spatio-temporal spectrum of the sampled driving function can be calculated by applying a spatial Fourier transformation to Eq. (4.23) with respect to the x-coordinate
$$ \tilde{D}_{C,S}(k_x,\omega) = \frac{1}{2\pi}\, \tilde{D}_C(k_x,\omega) *_{k_x} \frac{2\pi}{\Delta x^2} \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \frac{2\pi}{\Delta x}\eta\right) = \frac{1}{\Delta x^2} \sum_{\eta=-\infty}^{\infty} \tilde{D}_C\!\left(k_x - \frac{2\pi}{\Delta x}\eta,\, \omega\right) . \qquad (4.24) $$

As for time-domain sampling, the spatial sampling results in a repetition of the spectrum of the continuous driving function D̃C(kx, ω) on the spatial frequency kx-axis. Introducing D̃C,S(kx, ω) into Eq. (4.22) yields the spectrum of the wave field reproduced by a sampled secondary source distribution as

$$ \tilde{P}_{C,S}(k_C,\omega) = \frac{1}{\Delta x^2} \sum_{\eta=-\infty}^{\infty} \tilde{D}_C\!\left(k_x - \frac{2\pi}{\Delta x}\eta,\, \omega\right) \tilde{V}_C(k_C,\omega) . \qquad (4.25) $$
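The spectral repetitions described by Eq. (4.24) can be observed numerically with a DFT. The sketch below is illustrative (the numerical values are assumptions that match the plane wave example used later in this section): the continuous driving function for a plane wave oscillates spatially with k0 = (ω/c) cos θpw; after sampling, only a repetition of k0 folded into the base band is observable:

```python
import numpy as np

C, f, theta, dx, M = 343.0, 10e3, np.pi / 3, 0.15, 512   # assumed parameters
k0 = 2 * np.pi * f / C * np.cos(theta)           # ~91.6 rad/m, spatial frequency of D_C
x = np.arange(M) * dx                            # sampling positions of the array
samples = np.exp(1j * k0 * x)                    # spatial phase of the driving function
spectrum = np.abs(np.fft.fft(samples))
k_axis = 2 * np.pi * np.fft.fftfreq(M, d=dx)     # base band [-pi/dx, pi/dx)
k_peak = k_axis[np.argmax(spectrum)]
print(k_peak)   # ~7.85 rad/m: a spectral repetition of k0, not k0 itself
```

The peak appears at k0 shifted by an integer multiple of 2π/Δx, exactly the repetition structure of Eq. (4.24).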

In order to derive the effects of spatial sampling and a sampling theorem, the spatio-temporal spectra of the driving function D̃C(kx, ω) and the secondary sources ṼC(kC, ω) have to be specialized. The Fourier transformation of a secondary line source was derived in Appendix C.3.1 and is given by Eq. (C.16). The spectrum of the driving function D̃C(kx, ω) depends on the wave field of the virtual source S(x, ω) to be reproduced. Since arbitrary wave fields can be decomposed into plane waves, the following paragraph will derive the sampling artifacts for the reproduction of a plane wave.

Sampling artifacts for the reproduction of plane waves


In the following it will be assumed that the wave field to be reproduced is a monochromatic plane wave, as given by (2.47). The driving signal is given according to Eq. (4.10) by the window function a(x0) and the directional gradient of the wave field of a plane wave as

$$ D_{C,pw}(x,\omega) = 2j\, a(x)\, \frac{\omega}{c}\, \sin\theta_{pw}\, e^{-j\frac{\omega}{c} x \cos\theta_{pw}} , \qquad (4.26) $$

where θpw denotes the incidence angle of the plane wave. For the upper half plane (y > 0), the secondary source distribution is only capable of reproducing plane waves traveling into the positive y-direction. Thus, it is reasonable to limit the incidence angle of the virtual plane waves to 0 ≤ θpw < π. As a consequence of this limitation, the window function becomes a constant a(x0) = 1. Calculating the spectrum of the driving signal DC,pw(x, ω)


and introducing this result into Eq. (4.25) yields the reproduced wave field as

$$ \tilde{P}_{C,S}(k_C,\omega) = \frac{4\pi j\omega}{c\,\Delta x^2}\,\sin\theta_{pw} \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \frac{2\pi}{\Delta x}\eta - \frac{\omega}{c}\cos\theta_{pw}\right) \tilde{V}_C(k_C,\omega) = \frac{4\pi j\omega}{c\,\Delta x^2}\,\sin\theta_{pw} \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \frac{2\pi}{\Delta x}\eta - \frac{\omega}{c}\cos\theta_{pw}\right) \tilde{V}_C\!\left(\frac{2\pi}{\Delta x}\eta + \frac{\omega}{c}\cos\theta_{pw},\, k_y,\, \omega\right) . \qquad (4.27) $$

Thus, the spectrum of the reproduced wave field for a discrete distribution of secondary
sources is given as a series of Diracs weighted by the spectrum of the secondary sources
evaluated at the positions of these Diracs. The result for = 0 comprises the desired
plane wave. The other terms in the sum for 6= 0 are potential aliasing contributions.
Their strength depends on the weighting given by VC (kC , ). In the ideal case the spatiotemporal spectrum of the secondary sources would have to be chosen such to filter out
the aliasing contributions. A suitable choice would be a spatial-temporal lowpass filter.
However, for sound reproduction the spectrum VC (kC , ) is given by the secondary sources
and their underlying physics. The spectrum VC (kC , ) for line sources as secondary sources
was derived in Appendix C.3.1 and is given by Eq. (C.16). Introducing this result into
Eq. (4.27) yields

$$ \tilde{P}_{C,S}(k_C,\omega) = \frac{2\pi^2}{\Delta x^2}\,\sin\theta_{pw} \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \frac{2\pi}{\Delta x}\eta - \frac{\omega}{c}\cos\theta_{pw}\right) \delta\!\left(\sqrt{k_x^2+k_y^2} - \frac{\omega}{c}\right) + j\, \frac{4\pi\omega}{c\,\Delta x^2}\,\sin\theta_{pw} \sum_{\eta=-\infty}^{\infty} \delta\!\left(k_x - \frac{2\pi}{\Delta x}\eta - \frac{\omega}{c}\cos\theta_{pw}\right) \frac{1}{k_x^2+k_y^2-\left(\frac{\omega}{c}\right)^2} . \qquad (4.28) $$

The reproduced spectrum consists of a real and an imaginary part. The imaginary part
can be identified as being produced by the near-field of the secondary sources. Hence,
this part will not be considered further for the sampling considerations derived in the
following. Figure 4.7 illustrates the real part of PC,S (kC , ) in the spatial kx -ky -frequency
plane. For a fixed temporal frequency , the first Delta function in the real part of
Eq. (4.28) can be interpreted as a series of Dirac lines perpendicular to the ky -axis in
2
+ c cos pw . The second Delta
the spatial frequency plane at the positions kx = x
function can be interpreted as a circular Dirac pulse with the radius c . Due to the sifting
property of Dirac functions, the result of the multiplication of the two Dirac functions is
given by their crossings in the spatial frequency plane. For the situation shown in Fig. 4.7
the result will be two Diracs at the positions indicated by the dots . In this particular
example, these two Diracs represent the desired wave field of a plane wave traveling into
the positive y-direction for the upper half plane y > 0 and into the negative y-direction
for the lower half plane y < 0. This symmetry results from the reproduction using only

116

4. Listening Room Compensation

ky
(kx +

2
x

(kx

cos pw )

cos pw )

(kx

2
x

2
x

cos pw )

pw
2
x

(
= 1

=0

kx

kx2 + ky2 c )
=1

Figure 4.7: Illustration of the real part of the spectrum PC,S (kC , ) reproduced by a
discrete secondary monopole source distribution for the reproduction of a plane wave
with incidence angle pw . The resulting spectrum is given by the intersection of the two
Dirac functions at the positions indicated by the dots .
secondary monopoles as discussed in Section 4.1.4.
However, for an increasing distance Δx between the secondary sources there may also be additional contributions in the reproduced wave field. The repetitions of the first Delta function in the real part of Eq. (4.28) for η ≠ 0 move towards the circular Delta function for an increasing distance Δx. If these repetitions overlap with the circular Delta function, additional plane wave contributions will result due to the sifting property. These contributions constitute the spatial aliasing due to the spatial sampling of the secondary source distribution. They are avoided if the frequency of the reproduced plane wave is limited. The anti-aliasing condition for the driving function can be derived from Fig. 4.7 and Eq. (4.28) as

$$ f \le \frac{c}{\Delta x\,(1+|\cos\theta_{pw}|)} . \qquad (4.29) $$

Thus, a reduction of the temporal bandwidth of the reproduced plane wave avoids aliasing contributions in the reproduced wave field. For arbitrary wave fields the condition (4.29) has to be fulfilled for the minimum and maximum incidence angle of their plane wave contributions.
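The anti-aliasing condition (4.29) is easily evaluated numerically; a small helper sketch (illustrative, the spacing value is an assumption typical for wave field synthesis arrays):

```python
import numpy as np

def aliasing_frequency(dx, theta_pw, c=343.0):
    """Maximum alias-free temporal frequency according to Eq. (4.29) for a
    linear array with secondary source distance dx and a plane wave with
    incidence angle theta_pw."""
    return c / (dx * (1.0 + abs(np.cos(theta_pw))))

# secondary source distance of 0.15 m, as in the example of this section
print(round(aliasing_frequency(0.15, np.pi / 2)))   # -> 2287 (Hz, broadside)
print(round(aliasing_frequency(0.15, 0.0)))         # -> 1143 (Hz, endfire)
```

The broadside case (θpw = 90°) is the most benign; grazing incidence halves the alias-free bandwidth.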
If the anti-aliasing condition given by Eq. (4.29) is not fulfilled, then aliasing artifacts will be present in the reproduced wave field. According to Fig. 4.7 and Eq. (4.28), these artifacts will be a superposition of plane waves (for η ≠ 0) with different incidence angles. However, only those spectral repetitions will result in spatial aliasing contributions where
the circular Dirac pulse and the Dirac lines in Fig. 4.7 intersect. Hence, only a subset ηal of all possible spectral repetitions will be present in the reproduced wave field for a particular incidence angle and frequency of the desired plane wave. This subset includes all ηal ∈ Z\{0} for which the following condition holds

$$ \left| \frac{2\pi}{\Delta x}\,\eta_{al} + \frac{\omega}{c}\cos\theta_{pw} \right| \le \frac{\omega}{c} . \qquad (4.30) $$

Using this subset, the incidence angles θpw,al of the plane waves produced by aliasing can then be derived from Eq. (4.28) as

$$ \cos\theta_{pw,al} = \frac{2\pi c}{\omega\,\Delta x}\,\eta_{al} + \cos\theta_{pw} . \qquad (4.31) $$

Figure 4.8 shows the incidence angles of the desired plane wave and its aliasing contributions for the reproduction of a monochromatic plane wave with a frequency of f0 = 10 kHz and a secondary source distance of Δx = 0.15 m. The incidence angle of the plane wave was θpw = 90°.

Figure 4.8: Incidence angle of the desired plane wave θpw (dashed line) and its aliasing contributions θpw,al (solid lines). The desired monochromatic plane wave has a frequency of f0 = 10 kHz and an incidence angle of θpw = 90°; the secondary source distance was chosen as Δx = 0.15 m.
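Equations (4.30) and (4.31) can be evaluated directly. The sketch below (illustrative code, not from the thesis) reproduces the configuration of Fig. 4.8; each returned value of cos θpw,al corresponds to a pair of mirrored propagation directions in the polar plot:

```python
import numpy as np

def aliased_plane_wave_cosines(f, dx, theta_pw, c=343.0):
    """Evaluate condition (4.30) and Eq. (4.31): return cos(theta_pw,al)
    for all spectral repetitions eta_al that intersect the radiation
    circle. Note 2*pi*c/(omega*dx) = c/(f*dx)."""
    out = []
    for eta in range(-1000, 1001):
        if eta == 0:
            continue
        cos_al = c / (f * dx) * eta + np.cos(theta_pw)   # Eq. (4.31)
        if abs(cos_al) <= 1.0:                           # condition (4.30)
            out.append(cos_al)
    return out

# configuration of Fig. 4.8: f0 = 10 kHz, dx = 0.15 m, theta_pw = 90 deg
cos_al = aliased_plane_wave_cosines(10e3, 0.15, np.pi / 2)
print(len(cos_al))   # -> 8 aliased plane wave contributions (eta_al = +-1..+-4)
```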
Up to now, only the real part of the reproduced spectrum was considered, since the imaginary part is related to near-field effects of the secondary sources. The poles of the imaginary part are also located on the circle shown in Fig. 4.7. Applying the sifting property of the Delta function, the spectrum of these contributions is given by evaluating the imaginary part at kx = (2π/Δx)η + (ω/c) cos θpw. The result is not band-limited in the ky direction but in the kx direction. Hence, the anti-aliasing condition (4.29) also applies to the imaginary part. The aliasing contributions of the imaginary part have the form of evanescent plane waves.

Figure 4.9: Illustration of the effects caused by truncation of an infinite linear array. The gray area approximately illustrates the area where the wave field of a plane wave with the incidence angle θpw is reproduced.

Truncated Loudspeaker Arrays


Up to now, the linear secondary source distribution was assumed to be of infinite length. Practical implementations of linear loudspeaker arrays, however, will always have finite length. The truncation of the infinite secondary source distribution can be modeled by multiplying the loudspeaker driving function DC(x, ω) with a rectangular window function. The multiplication with this window function will lead to a convolution in the spatial frequency domain. Please note that this procedure is similar to the modeling of the sampling effects illustrated by Eq. (4.23) and Eq. (4.24). The effects of truncation of linear arrays have been discussed in detail by [Sta97]. For simplicity, these effects will only be approximated in the following.
For the reproduction of plane waves, the effect of truncation can be approximated quite well by simple geometric means, as illustrated in Fig. 4.9. This approximation states that a plane wave will be reproduced only in a tilted rectangular area in front of the array, whose width is equivalent to the aperture of the array and whose length is infinite. The area is tilted by the incidence angle θpw of the plane wave to be reproduced. Outside of this area the energy of the reproduced wave field will be quite low. Inside of this area the reproduced wave field will match the desired virtual source wave field. However, some additional aperture artifacts [Sta97] will be present.
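The geometric approximation above can be turned into a simple visibility test. The sketch below is illustrative (the placement of the array along the x-axis and the aperture values are assumptions): a listener position lies inside the tilted strip exactly when tracing the propagation direction backwards from the listener hits the array aperture:

```python
import numpy as np

def inside_reproduction_area(px, py, theta_pw, x_min=0.0, x_max=3.0):
    """Geometric approximation of Fig. 4.9: a truncated linear array on the
    x-axis with aperture [x_min, x_max] reproduces a plane wave with
    incidence angle theta_pw only in the tilted strip whose back-projection
    along the propagation direction hits the aperture."""
    if py <= 0.0 or np.sin(theta_pw) <= 0.0:
        return False
    # trace the propagation direction back to the array line y = 0
    x_hit = px - py * np.cos(theta_pw) / np.sin(theta_pw)
    return bool(x_min <= x_hit <= x_max)

# 45 deg plane wave from a 3 m array
print(inside_reproduction_area(2.5, 1.0, np.pi / 4))    # -> True
print(inside_reproduction_area(-2.0, 1.0, np.pi / 4))   # -> False
```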
As a consequence of this limited reproduction area, the aliasing effects discussed above will depend on the listener position. This is due to the fact that not all plane waves with different incidence angles will be reproduced at all listener positions. Since the spatial aliasing artifacts of reproduced plane waves will be plane waves themselves, these aliasing artifacts will not be present at all listener positions. Those plane waves which are relevant at a given listener position can be found easily by the geometric approximation discussed above (see also Fig. 4.9). A special case is represented by a plane wave incidence angle of θpw = 90° and listener positions far away from the array: no aliasing artifacts will be present. The aliasing frequency will be infinite in this case.
Since typical listening rooms have a rectangular shape, rectangular arrays are frequently used to build spatial auralization systems. These consist of four truncated linear arrays, one at each side. Thus, rectangular arrays can be regarded as a superposition of truncated linear arrays, and the sampling theory introduced so far can be applied. However, care has to be taken that only those linear arrays are considered which are active for a particular plane wave to be reproduced. This selection can be performed on the basis of the window function a(x0).

Figure 4.10: Geometry used to derive the sampling artifacts of circular loudspeaker arrays. The dots denote the spatial sampling positions of the driving function DC,S(x, ω).

4.1.6.2 Circular Arrays

In the following the spatial sampling of a circularly shaped secondary source distribution with radius R will be investigated. Figure 4.10 illustrates the geometry of the considered circular array. Due to the underlying circular geometry of the problem, it is convenient to use polar coordinates for the description of the reproduced wave field. The reproduced wave field PP(xP, ω) can be derived by specializing Eq. (4.19) to the geometry depicted in


Fig. 4.10

$$ P_P(x_P,\omega) = -\frac{j}{4} \int_0^{2\pi} D_P(\alpha_0, R, \omega)\, H_0^{(2)}(kr)\, R\, d\alpha_0 , \qquad (4.32) $$

where r = |xP − xP,0| and V2D(xP − xP,0, ω) as given by Eq. (4.20) was introduced. The Hankel function H0⁽²⁾(kr) in Eq. (4.32) can be expressed by Bessel and Hankel functions which depend only on one of the positions xP and xP,0, using the shift theorem of the Hankel functions given by Eq. (3.110). Introducing Eq. (3.110) for r0 = R and r < R into Eq. (4.32) yields the reproduced wave field inside the circular boundary ∂V as
$$ P_P(x_P,\omega) = -\frac{j}{4} \sum_{\nu=-\infty}^{\infty} J_\nu(kr)\, H_\nu^{(2)}(kR)\, R\, e^{j\nu\alpha} \int_0^{2\pi} D_P(\alpha_0,R,\omega)\, e^{-j\nu\alpha_0}\, d\alpha_0 = -\frac{j\pi R}{2} \sum_{\nu=-\infty}^{\infty} J_\nu(kr)\, H_\nu^{(2)}(kR)\, \bar{D}(\nu,R,\omega)\, e^{j\nu\alpha} , \qquad (4.33) $$

where for the second equality the definition of the Fourier series, as given by Eq. (3.16),
was introduced. Equation (4.33) states that the reproduced wave field is given by a Fourier
series with respect to the angle . The coefficients of this Fourier series are given by the
R, ) of the driving function weighted by a Bessel and a
Fourier series coefficients D(,
Hankel function.
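The series form of the reproduced field can be checked numerically against direct integration over the secondary sources. The sketch below is an illustration under our own parameter choices (radius, frequency and listener position are arbitrary): for a constant driving function only the ν = 0 Fourier coefficient is nonzero, so the series collapses to a single Bessel/Hankel term, which is compared with a dense numerical evaluation of the integral over the circular source distribution.

```python
import numpy as np
from scipy.special import jv, hankel2

c = 343.0          # speed of sound in m/s
R = 1.5            # array radius in m
f = 500.0          # frequency in Hz
k = 2 * np.pi * f / c

# listener position inside the array (polar coordinates r, alpha)
r, alpha = 0.7, 0.3
xp = np.array([r * np.cos(alpha), r * np.sin(alpha)])

# direct numerical integration over the secondary sources with a
# constant driving function D(alpha_0) = 1
a0 = np.linspace(0.0, 2 * np.pi, 20000, endpoint=False)
x0 = np.stack([R * np.cos(a0), R * np.sin(a0)], axis=1)
dist = np.linalg.norm(xp - x0, axis=1)
P_direct = -1j / 4 * np.sum(hankel2(0, k * dist)) * R * (2 * np.pi / len(a0))

# series form: a constant driving function has only the nu = 0 Fourier
# coefficient, so a single term of the sum remains
P_series = -1j * np.pi * R / 2 * jv(0, k * r) * hankel2(0, k * R)

print(abs(P_direct - P_series))  # close to zero
```

The agreement confirms that the shift theorem of the Hankel functions converts the boundary integral into the angular Fourier series.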
The effect of discretizing the secondary source distribution is modeled by sampling the
loudspeaker driving function D_P(α_0, R, ω) at equidistant angles, resulting in a total of N
sampled secondary source positions. The sampled driving function D_{P,S}(α_0, R, ω) is given
as

    D_{P,S}(\alpha_0, R, \omega) = D_P(\alpha_0, R, \omega) \cdot p(\alpha_0) ,   (4.34)

where p(α_0) denotes the angular pulse train, as defined by Eq. (3.127). The effects of
angular sampling were discussed in Section 3.6.2.2. Sampling will result in repetitions of
the angular spectrum, as illustrated by Eq. (3.129). Applying this principle to the sampled
driving function D_{P,S}(α_0, R, ω) results in the Fourier series coefficients D̊_S(ν, R, ω) of the
sampled driving function

    \mathring{D}_S(\nu, R, \omega) = \sum_{\eta=-\infty}^{\infty} \mathring{D}(\nu + \eta N, R, \omega) .   (4.35)
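The spectral repetitions described above can be illustrated by a short numpy experiment (a didactic sketch, not code from the thesis): sampling a single angular harmonic e^{jν₀α} at N equidistant angles makes it indistinguishable from the harmonic with index ν₀ − N, exactly the aliasing of Fourier series coefficients expressed by the sum over η.

```python
import numpy as np

N = 48                      # number of sampled secondary source positions
nu0 = 50                    # angular frequency of the driving harmonic (> N/2)
alpha = 2 * np.pi * np.arange(N) / N

# samples of the harmonic exp(j*nu0*alpha) taken at the N array positions
d_sampled = np.exp(1j * nu0 * alpha)

# Fourier coefficients recoverable from the N samples
D = np.fft.fft(d_sampled) / N

# all energy shows up at index nu0 - N = 2: the harmonic has aliased
print(np.argmax(np.abs(D)))  # → 2
```

Any angular bandwidth of the driving function beyond ±N/2 therefore folds back into the reproducible range, which is the mechanism behind the aliasing terms η ≠ 0.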

Introducing Eq. (4.35) into Eq. (4.33) yields the wave field P_{P,S}(x_P, ω) reproduced by a
discrete secondary source distribution as

    P_{P,S}(x_P, \omega) = -j \frac{\pi R}{2} \sum_{\nu=-\infty}^{\infty} \sum_{\eta=-\infty}^{\infty} J_\nu(kr) \, H_\nu^{(2)}(kR) \, \mathring{D}(\nu + \eta N, R, \omega) \, e^{j\nu\alpha} .   (4.36)

The result for η = 0 constitutes the desired wave field. Please note that the effects of the
limited aperture of the array are inherently included in Eq. (4.36) by the Hankel function
H_ν^{(2)}(kR). The terms for η ≠ 0 are potential aliasing contributions. Their energy should
be zero in the ideal case, or at least be minimized in practical applications.
The formulation of the reproduced wave field in terms of angular frequencies given by
Eq. (4.36) can be used to split the reproduced wave field P_{P,S}(x_P, ω) into the wave
field P_{P,S,0}(x_P, ω) containing no aliasing contributions and its aliasing contributions
P_{P,S,al}(x_P, ω). The wave field P_{P,S,0}(x_P, ω) would have been reproduced by a continuous
secondary source distribution and is given as

    P_{P,S,0}(x_P, \omega) = -j \frac{\pi R}{2} \sum_{\nu=-\infty}^{\infty} J_\nu(kr) \, H_\nu^{(2)}(kR) \, \mathring{D}(\nu, R, \omega) \, e^{j\nu\alpha} .   (4.37)

The aliasing contributions P_{P,S,al}(x_P, ω) reproduced by a discretized secondary source
distribution can be derived from the spectral repetitions present in Eq. (4.36)

    P_{P,S,al}(x_P, \omega) = -j \frac{\pi R}{2} \sum_{\nu=-\infty}^{\infty} \sum_{|\eta| \ge 1} J_\nu(kr) \, H_\nu^{(2)}(kR) \, \mathring{D}(\nu + \eta N, R, \omega) \, e^{j\nu\alpha} .   (4.38)

This split-up of the reproduced wave field can be used to calculate the energy of the
aliasing contributions with respect to the desired wave field. The reproduced aliasing-to-signal
ratio RASR is defined according to the aliasing-to-signal ratio ASR derived in
Section 3.6.2.3 as follows

    RASR(x_P) = \frac{\int_0^{\infty} |P_{P,S,al}(x_P, \omega)|^2 \, d\omega}{\int_0^{\infty} |P_{P,S,0}(x_P, \omega)|^2 \, d\omega} .   (4.39)
In general, the RASR will be dependent on the desired wave field and the listener position.
As for the linear arrays, the reproduction of a plane wave will be considered in the following.
Sampling artifacts for the reproduction of plane waves
The driving function for the reproduction of a plane wave can be derived according to
Eq. (4.10) by considering the window function a(x_0) and calculating the directional gradient
of the wave field of a plane wave. The continuous driving function D_{P,pw}(α_0, R, ω)
for a plane wave is given as

    D_{P,pw}(\alpha_0, R, \omega) = 2j \frac{\omega}{c} \, a(\alpha_0) \cos(\alpha_0 - \theta_{pw}) \, e^{j \frac{\omega}{c} R \cos(\alpha_0 - \theta_{pw})} ,   (4.40)

where θ_pw denotes the incidence angle of the plane wave. The window function a(α_0) for
the circular array will select those secondary sources which are relevant for the reproduction
of a plane wave with incidence angle θ_pw. For the geometry depicted in Fig. 4.10 the
window function is given as

    a_{pw}(\alpha_0) = \begin{cases} 1 & \text{for } \theta_{pw} + \frac{\pi}{2} \le \alpha_0 \le \theta_{pw} + \frac{3\pi}{2} , \\ 0 & \text{otherwise.} \end{cases}   (4.41)


Introducing Eq. (4.41) into Eq. (4.40) allows calculation of the Fourier series expansion
coefficients D̊_{S,pw}(ν, R, ω) of the sampled driving function for the reproduction of a plane
wave. These can then be used to calculate the wave field reproduced by a discrete secondary
source distribution using Eq. (4.36). It was shown in Section 3.6.2.2 that a plane
wave exhibits an infinite bandwidth in the angular frequency domain. As a result, no
exact anti-aliasing condition can be given for the reproduction of plane waves on circular
arrays. In the following, results derived by numerical evaluation of Eq. (4.36) will be
shown.
The reproduction of a band-limited (sinc-shaped) plane wave with an incidence angle of
θ_pw = 3π/2 on a circular array was numerically evaluated for this purpose. The circular
array consists of 48 secondary line sources placed on a circle with a radius of R_LS = 1.50 m.
The aliasing artifacts will depend on the bandwidth of the desired plane wave. Figure 4.11(a)
shows a snapshot of the reproduced wave field P_{P,S}(x_P, ω) for a bandwidth of
1 kHz. The desired plane wave as well as the aliasing contributions can be clearly seen.
Figure 4.11(b) additionally illustrates the extracted aliasing contributions P_{P,S,al}(x_P, ω)
of Fig. 4.11(a). Figure 4.12 shows the RASR(x_P) for different maximum frequencies.
The presented results show that the RASR depends on the listener position and on the
bandwidth of the reproduced plane wave. Two conclusions can be drawn from Fig. 4.12:
(1) the higher the bandwidth of the plane wave, the more energy is contained in the
aliasing contributions of the reproduced field, and (2) the farther the listener position is
from the active secondary sources, the lower the energy of the aliasing contributions. The
latter conclusion was also derived for the truncated linear arrays discussed in the previous
section.
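A numerical evaluation of this kind can be sketched as follows. The code is a simplified, hypothetical re-implementation (function names, truncation limits nu_max and eta_max, the dense grid size M, and the listener position are our own choices, and a single-frequency ratio is computed instead of the frequency-integrated RASR): the Fourier series coefficients of the windowed plane-wave driving function are obtained by an FFT on a dense angular grid, and the truncated sums for the alias-free part and the aliasing part are evaluated at one listener position.

```python
import numpy as np
from scipy.special import jv, hankel2

c, R, N = 343.0, 1.5, 48          # speed of sound, array radius, loudspeakers
theta_pw = 3 * np.pi / 2          # plane-wave incidence angle

def field_split(f, r, alpha, nu_max=60, eta_max=3, M=4096):
    """Return (P0, Pal): the alias-free part and the aliasing part of the
    reproduced field at the listener position (r, alpha)."""
    k = 2 * np.pi * f / c
    # continuous windowed driving function on a dense angular grid
    a0 = 2 * np.pi * np.arange(M) / M
    rel = (a0 - theta_pw) % (2 * np.pi)
    win = (rel >= np.pi / 2) & (rel <= 3 * np.pi / 2)
    D = 2j * k * win * np.cos(a0 - theta_pw) \
        * np.exp(1j * k * R * np.cos(a0 - theta_pw))
    Dnu = np.fft.fft(D) / M                 # Fourier series coefficients
    coeff = lambda nu: Dnu[nu % M]
    P0, Pal = 0j, 0j
    for nu in range(-nu_max, nu_max + 1):
        base = -1j * np.pi * R / 2 * jv(nu, k * r) * hankel2(nu, k * R) \
               * np.exp(1j * nu * alpha)
        P0 += base * coeff(nu)              # eta = 0 term
        for eta in range(-eta_max, eta_max + 1):
            if eta != 0:                    # spectral repetitions
                Pal += base * coeff(nu + eta * N)
    return P0, Pal

rasr = {}
for f in (500.0, 1500.0):
    P0, Pal = field_split(f, r=0.5, alpha=0.0)
    rasr[f] = abs(Pal) ** 2 / abs(P0) ** 2
    print(f, rasr[f])
```

Consistent with conclusion (1) above, the single-frequency aliasing-to-signal ratio grows markedly once the frequency exceeds the aliasing limit of the 48-element array.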
4.1.6.3 Arbitrarily Shaped Arrays

The derivation of sampling artifacts for circular arrays, as given in the previous section,
can be generalized to the case of arbitrarily shaped arrays. This will be illustrated briefly
in the following.
The basic idea is to map an arbitrarily shaped listening area V and its boundary ∂V onto
a circular listening area Ṽ and its boundary ∂Ṽ. Figure 4.13 illustrates the desired
mapping. The Riemann mapping theorem [Wei03] states that every simply connected
region can be mapped with a one-to-one transformation onto a unit circular region (with
radius R = 1) using an analytic function. The desired mapping can be performed using a
conformal mapping [Wei03, SS67] M which maps the Cartesian coordinate system of the
arbitrarily shaped listening region V shown on the left side of Fig. 4.13 into the circular
region Ṽ shown on the right side of Fig. 4.13. A benefit of using a conformal mapping
is that the local right angles of the Cartesian coordinate system are preserved in the
transformed domain. It remains to find a suitable analytic function for this purpose.
Tables for a wide variety of geometries can be found in [SS67]. Since conformal mappings
can also be combined, these tables allow nearly all possible reproduction geometries to
be handled or approximated. Once a suitable conformal mapping has been found for the
coordinates, it can be introduced into the relations for circular arrays derived in the
previous section. The mapping of an equidistantly sampled arbitrarily shaped array onto
a circular array may result in a non-equidistant angular sampling on the circle, as
illustrated by Fig. 4.13. As a consequence of this irregular angular sampling, the effect of
sampling will not be a simple repetition of the Fourier series expansion coefficients
D̊_S(ν, R, ω) of the loudspeaker driving function as given by Eq. (4.35). Instead, the
sampling will result in a convolution with a complex function in the angular frequency
domain ν. As a result, it will not be possible to derive a generic anti-aliasing theorem.

Figure 4.11: Reproduction of a band-limited plane wave with θ_pw = 3π/2 on a loudspeaker
array with N = 48 loudspeakers and a radius of R = 1.50 m. The plane wave has a
bandwidth of 1 kHz. (a) shows the reproduced wave field P_{P,S}(x_P, ω), (b) its aliasing
contributions P_{P,S,al}(x_P, ω).

Figure 4.12: RASR(x_P) for a circular loudspeaker array with N = 48 loudspeakers and a
radius of R = 1.50 m when reproducing a band-limited Dirac shaped plane wave, for
maximum frequencies (a) f = 500 Hz, (b) f = 650 Hz, (c) f = 800 Hz, (d) f = 1000 Hz.
The gray levels denote the level in dB.

Figure 4.13: Derivation of the sampling artifacts of an arbitrarily shaped array by
performing a conformal mapping to a circular array.
4.1.6.4 Point Sources as Secondary Sources

Secondary line sources are required for the reproduction of a wave field in a plane only.
However, typical implementations of two-dimensional reproduction systems use point
sources as secondary sources, since point sources can be approximated quite well by
closed-box loudspeakers. In this case the secondary source field is given as

    V_{3D}(x - x_0, \omega) = \frac{1}{4\pi} \frac{e^{-j \frac{\omega}{c} |x - x_0|}}{|x - x_0|} .   (4.42)
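The different decay behavior of the two secondary source types can be verified numerically. This sketch (our own illustration, with arbitrary frequency and distances) compares the 1/r amplitude decay of the point-source field above with the line-source field, whose 2D free-field Green's function −(j/4)H₀⁽²⁾(kr) decays approximately like 1/√r in the far field.

```python
import numpy as np
from scipy.special import hankel2

c, f = 343.0, 1000.0
k = 2 * np.pi * f / c

def v3d(r):
    # point-source (3D free-field) secondary source field
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def v2d(r):
    # line-source (2D free-field) secondary source field
    return -1j / 4 * hankel2(0, k * r)

r1, r2 = 1.0, 4.0
print(abs(v3d(r1)) / abs(v3d(r2)))  # ≈ 4  (1/r decay)
print(abs(v2d(r1)) / abs(v2d(r2)))  # ≈ 2  (≈ 1/sqrt(r) decay in the far field)
```

This amplitude mismatch is one reason why using point sources for two-dimensional reproduction introduces deviations from the ideal line-source theory.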

The concept of wave field synthesis introduced in Section 5.1 is an example of a sound
reproduction system utilizing secondary point sources.
The spatial Fourier transformation of V_{3D}(x - x_0, ω) has been derived in Section C.3.2.
The poles of the spatial Fourier transformation V_{C,3D}(k_C, ω) are located on a circle with
radius ω/c in the k_x-k_y domain. Comparison with the imaginary part of the spectrum
derived for a sampled linear secondary line source distribution, as illustrated by Eq. (4.7),
yields a strong similarity. For the reproduction of a plane wave the resulting spectrum for
point sources as secondary sources is band-limited not in the k_y but in the k_x direction.
Hence, the sampling theory presented so far can also be applied, with minor changes, to
the case of secondary point sources.

Figure 4.14: Illustration of the geometry used to derive the characteristics of a sound
reproduction system located in the listening room.

4.2 Sound Reproduction in Rooms

The mathematical description of sound reproduction systems presented in the previous
section is based on the use of free-field Green's functions. Thus, it is implicitly assumed
that the sound propagation from the secondary sources to the listening area conforms to
free-field conditions. This requirement is in general not fulfilled when placing the sound
reproduction system in a listening room. Typically these rooms will exhibit some amount
of reverberation, as already qualitatively discussed in Section 1.1. In the following, the
influence of the listening room on sound reproduction using monopole secondary sources
will be analyzed quantitatively.
Figure 4.14 illustrates the underlying geometry. An arbitrarily shaped bounded listening
region V is surrounded by a distribution of monopole sources located on the boundary
∂V. The listening region V is located inside the listening room, which is modeled by
an arbitrarily shaped bounded region R surrounded by the boundary ∂R. In order to
derive the influence of the listening room, the acoustic conditions at the boundary ∂R
have to be considered. In the following it will be assumed that the walls of the listening
room are not actively vibrating. Hence, their characteristics can be modeled by homogeneous
boundary conditions formulated on ∂R. The general solution of the inhomogeneous wave
equation subject to homogeneous boundary conditions was derived in Section 2.6. For
the geometry depicted in Fig. 4.14 the wave field L(x, ω) within R (x ∈ R) is given by
specializing Eq. (2.62) as follows

    L(x, \omega) = -\int_R D(x_0, \omega) \, G(x|x_0, \omega) \, dV_0 ,   (4.43)

where G(x|x_0, ω) denotes a suitably chosen Green's function which conforms to the
homogeneous boundary conditions imposed on ∂R, and D(x_0, ω) the driving function of the
monopole distribution. The negative sign of Eq. (4.43) in comparison to Eq. (2.62) is
chosen in accordance with the negative sign of the inhomogeneous part of the inhomogeneous
wave equation (2.57). It was derived in the previous section that the wave field
within V can be controlled by a distribution of monopole secondary sources on ∂V. Thus,
D(x_0, ω) will only have contributions on the boundary ∂V. Hence, the integral above can
be rewritten in terms of the virtual boundary ∂V

    L(x, \omega) = -\oint_{\partial V} D(x_0, \omega) \, G(x|x_0, \omega) \, dS_0 .   (4.44)

Please note that the integral (4.44) is still valid for the entire listening room. The response
of the listening room to the auralized wave field, as given by Eq. (4.44), can be understood
as a linear space shift-variant system (H(x, x_0, ω) = G(x|x_0, ω)). Figure 4.15 illustrates
this interpretation of Eq. (4.44).

Figure 4.15: Response of the reproduction system located in the listening room, illustrated
as a space shift-variant system.
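After spatial discretization, the space shift-variant system becomes, at each frequency, a full matrix mapping driving signals to listening-area pressures, whereas a shift-invariant system would reduce to a Toeplitz (convolution) matrix. The following sketch is a hypothetical illustration with made-up dimensions and random stand-in data, not thesis code:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 6, 4   # M analysis points, N secondary sources (at one frequency)

# shift-variant room response: a full, unstructured matrix G
G = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

# shift-invariant (free-field-like) response: a Toeplitz convolution matrix
h = np.array([1.0, 0.5, 0.25])                  # impulse response
T = np.zeros((N + len(h) - 1, N))
for i in range(N):
    T[i:i + len(h), i] = h                      # column-shifted copies of h

d = rng.standard_normal(N)                      # driving signal
# multiplying by the Toeplitz matrix is ordinary convolution ...
assert np.allclose(T @ d, np.convolve(h, d))
# ... while the room matrix G has no such structure: each listening
# position sees its own combination of the driving signals
p_listen = G @ d
print(p_listen.shape)  # → (6,)
```

The loss of Toeplitz structure is precisely what prevents a simple frequency-domain (convolution theorem) solution for the room compensation filters.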

4.3 Fundamentals of Listening Room Compensation

The review of classical approaches to room compensation presented in Section 1.2 stated
fundamental requirements for an improved active listening room compensation system.


In particular, the first two were: (1) sufficient observation and (2) control of the wave
field within the listening region. The required observational capabilities of the room
compensation system can be realized by the techniques derived in Section 3. The techniques
described there can be utilized to analyze the wave field within a bounded region by
measurements on the boundary of that region. However, for room compensation it is also
required to gain control over the wave field within the listening region. The methods
for sound reproduction introduced in Section 4.1 allow control of the wave field within
a bounded region by a distribution of monopoles on the boundary of that region. Thus,
destructive interference can be used to compensate for the reflections of the listening room
in the listening area. The following section will derive the fundamentals of active listening
room compensation using destructive interference.

4.3.1 Room Compensation as Deconvolution Problem

It will be assumed in the following that the compensation of the listening room influence
is only performed within the listening region V. It was derived in Section 4.1.4 that
a distribution of monopoles is suitable to control the acoustic pressure field within the
enclosed region. In the context of room compensation this principle can also be interpreted
as follows: if the influence of the listening room is compensated on the boundary
surrounding the listening region, then the reproduction within that region will be free
of artifacts emerging from a non-ideal listening room. This basic idea is utilized in the
following.
The reproduced wave field within the listening region (x ∈ V) is given by Eq. (4.44)

    L(x, \omega) = -\oint_{\partial V} D(x_0, \omega) \, G(x|x_0, \omega) \, dS_0 ,   (4.45)

where L(x, ω) denotes the reproduced wave field within the listening area and G(x|x_0, ω)
a suitably chosen Green's function that fulfills the boundary conditions imposed at the
walls of the listening room. The underlying geometry is illustrated by Fig. 4.14. The
Green's function G(x|x_0, ω) inherently includes the effects of the listening room on the
auralized wave field, e.g. the acoustic reflections. However, these effects are not desired,
as shown in Section 1.1; desired are free-field propagation conditions for the reproduction.
The desired wave field A(x, ω) within the listening area can be derived from Eq. (4.44) as

    A(x, \omega) = -\oint_{\partial V} D(x_0, \omega) \, G_0(x|x_0, \omega) \, dS_0 ,   (4.46)

where G_0(x|x_0, ω) denotes a suitably chosen free-field Green's function. The desired wave
field, as given by Eq. (4.46), can be interpreted as the response of an LTSI system.
Figure 4.16 illustrates the computation of the desired wave field. The error between the
desired and the reproduced wave field is defined as follows

    E(x, \omega) = A(x, \omega) - L(x, \omega) .   (4.47)


Figure 4.16: Calculation of the desired wave field, illustrated as an LTSI system.

Figure 4.17: Illustration of the pre-equalization approach to room compensation. The
monopole distribution driving signal D(x_0, ω) is filtered by a suitable filter C(x_0|x'_0, ω) in
order to perform active room compensation.
The goal of listening room compensation is to minimize this error. An obvious solution
is to modify the Green's function of the listening room to match the free-field Green's
function. This could be achieved, e.g., in a passive way by acoustic damping of the room.
However, due to the reasons explained in Section 1.1, an active solution to room compensation
within the listening area is favorable. For this purpose the driving signals D(x_0, ω)
have to be modified in order to arrive at the desired free-field propagation conditions. Hence,
the problem of room compensation is to modify the system response such that it matches a
desired one. In signal and system theory this requirement is covered by the fundamental
problem of inverse filtering (pre-equalization) of a system [GRS01]. The solution is given
by pre-equalization of the system input using a suitable filter.
In the context of room compensation this can be realized by pre-equalization of the
monopole distribution driving signal D(x_0, ω). Figure 4.17 illustrates this basic idea. In
the following, the filter C(x_0|x'_0, ω) will be denoted as the room compensation filter, since it
compensates for the influence of the listening room. The pre-filtering of the driving signal
D(x_0, ω) is performed on the boundary ∂V of the listening region. The filtered driving
signal W(x_0, ω) is given as follows

    W(x_0, \omega) = \oint_{\partial V} D(x'_0, \omega) \, C(x_0|x'_0, \omega) \, dS'_0 .   (4.48)

The reproduced wave field L(x, ω) when pre-equalizing the driving signal is then derived
by combining Eq. (4.45) and Eq. (4.48) as

    L(x, \omega) = -\oint_{\partial V} W(x_0, \omega) \, G(x|x_0, \omega) \, dS_0
                 = -\oint_{\partial V} \left( \oint_{\partial V} D(x'_0, \omega) \, C(x_0|x'_0, \omega) \, dS'_0 \right) G(x|x_0, \omega) \, dS_0 .   (4.49)


Perfect listening room compensation is attained if the reproduced wave field matches
the desired (free-field) wave field. Thus, the compensation filter has to ensure
that L(x, ω) is close to A(x, ω). Under the assumption of arbitrary excitation by D(x_0, ω),
a suitable compensation filter C(x_0|x'_0, ω) can be derived by comparing the integrands of
Eq. (4.49) with the integrand of Eq. (4.46)

    \oint_{\partial V} C(x'_0|x_0, \omega) \, G(x|x'_0, \omega) \, dS'_0 = G_0(x|x_0, \omega) ,   (4.50)

where x_0 and x'_0 have been interchanged in Eq. (4.49) to derive Eq. (4.50). The above result
states that the optimal room compensation filter C(x'_0|x_0, ω) can be found by solving the
integral equation (4.50). In the case of an anechoic listening room (free-field propagation)
the compensation filter is given as

    C(x'_0|x_0, \omega) = \delta(x'_0 - x_0) .   (4.51)
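A discretized counterpart of the integral equation can make this concrete. In the sketch below (a hypothetical illustration with random stand-in matrices, not a method from the thesis), G and G0 are the room and free-field transfer functions sampled at one frequency between N source positions and M control points, and the compensation matrix C is found by a least-squares solve of G C ≈ G0:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 8, 8    # control points on the boundary, secondary sources

# room and free-field transfer matrices at one frequency (random stand-ins)
G = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
G0 = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

# discrete counterpart of the integral equation: find C with G @ C = G0
C = np.linalg.lstsq(G, G0, rcond=None)[0]

# the compensated system response matches the desired free-field response
print(np.linalg.norm(G @ C - G0))  # ≈ 0

# anechoic special case: if the room already behaves like free field,
# the compensation filter is the identity (a discrete Dirac)
C_id = np.linalg.lstsq(G0, G0, rcond=None)[0]
assert np.allclose(C_id, np.eye(N))
```

The identity solution in the anechoic case mirrors the Dirac solution of Eq. (4.51).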

This solution is evident, since it states that no filtering is required in this simplified case.
However, for generic listening rooms a solution of Eq. (4.50) can become quite complex.
The integral on the left-hand side of this equation can be interpreted as a generalized
convolution operation. In contrast to the standard convolution operation, this formulation
takes the space-variance of the problem explicitly into account. The fundamental
problem given by Eq. (4.50) is termed a deconvolution or inverse-filtering problem in the
context of LTSI systems. Hence, the process of calculating an appropriate compensation
filter C(x'_0|x_0, ω) for active room compensation amounts to solving a space-variant
deconvolution problem.
In order to calculate the compensation filters, explicit knowledge of the Green's function
G(x|x_0, ω) is required. In general, this function will not be known a priori and will have
to be derived from acoustic measurements taken in the listening room. Additionally, the
acoustic characteristics of the listening room may change due to, e.g., persons entering
the room or temperature variations. The active listening room compensation system has
to cope with these changes. These requirements call for an adaptive solution to room
compensation.
Summarizing, active listening room compensation exhibits two fundamental problems:
1. the complex solution of the space-variant deconvolution problem given by Eq. (4.50),
and
2. the listening room transfer function G(x|x_0, ω) is not known a priori.
The following two sections will briefly address these two problems.

4.3.2 Generic Solution to the Room Deconvolution Problem

As stated in the previous section, the inverse filtering problem for a spatio-temporal LTSI
system leads to a result similar to Eq. (4.50). However, in this case the integral on the
left-hand side of Eq. (4.50) degenerates to a (space-invariant) convolution
integral. A solution to this shift-invariant problem is given by transforming the signals
into the spatio-temporal frequency domain by performing a multidimensional Fourier
transformation (3.2a). The compensation filter can then be derived by calculating the
quotient of the desired and the actual system transfer function for each spatial and
temporal frequency. This results from the convolution theorem (3.8) of the multidimensional
Fourier transformation. Unfortunately, this procedure is not applicable to shift-variant
systems, since the convolution theorem does not hold. In the following, a generic solution
to the shift-variant deconvolution problem is presented which is based upon the functional
transformation method (FTM) [TR03].
Equation (4.50) has the form of a Fredholm integral equation of the first kind with a
symmetric Green's function as kernel. The solution to such a problem may be found by
expanding the Green's function G(x|x_0, ω) into an eigenfunction series [MF53a]. A wide
variety of expansions can be used for this purpose. One expansion that could be used
was already introduced in Section 2.7.2 by the concept of the modal expansion (2.79).
However, this expansion is only suitable for rectangular rooms with nearly rigid walls.
The functional transformation method provides a versatile framework for the solution of
partial differential equations in bounded domains by means of a series expansion. In the
following, this concept will be utilized to provide a generic solution to the deconvolution
problem given by Eq. (4.50).
Using the FTM, the Green's function G(x|x_0, ω) on a bounded domain with arbitrary
boundary conditions can be expanded into the following series [Pet04]

    G(x|x_0, \omega) = \sum_{\mu} \frac{1}{N_\mu} \frac{1}{j\omega(j\omega + s_\mu)} \tilde{K}(x_0, \mu) \, K(x, \mu) ,   (4.52)

where N_μ denotes a suitably chosen constant, K(x, μ) the transformation kernel,
K̃(x_0, μ) the adjoint transformation kernel and s_μ the corresponding discrete
eigenvalues. The kernels may be interpreted as the eigenfunctions of the differential equation
considered. The transformation kernels and their eigenvalues can be found by solving
the underlying Sturm-Liouville problem given by the wave equation and the boundary
conditions of the listening room. Details on this procedure for the wave equation can be
found in [PR04, Pet04]. The next step is to expand the compensation filter C(x_0|x'_0, ω)
into an equivalent series as G(x|x_0, ω), but with the discrete eigenvalues and unknown
expansion coefficients c(μ)

    C(x_0|x'_0, \omega) = \sum_{\mu} c(\mu) \, \tilde{K}(x'_0, \mu) \, K(x_0, \mu) .   (4.53)


Introducing the expansions of the Green's function and the compensation filter into
Eq. (4.50) then yields

    G_0(x|x'_0, \omega) = \oint_{\partial V} C(x_0|x'_0, \omega) \, G(x|x_0, \omega) \, dS_0 = \sum_{\mu} \frac{c(\mu)}{j\omega(j\omega + s_\mu)} \tilde{K}(x'_0, \mu) \, K(x, \mu) ,   (4.54)

where the biorthogonality property of the kernels was exploited [PR04] to derive the
second equality. The above equation can then be solved for the coefficients c(μ) by using the
biorthogonality property again

    c(\mu) = \frac{j\omega(j\omega + s_\mu)}{N_\mu \tilde{K}(x'_0, \mu)} \oint_{\partial V} G_0(x|x'_0, \omega) \, \tilde{K}(x, \mu) \, dS .   (4.55)

Introducing this into Eq. (4.53) yields an explicit expression for the compensation filters

    C(x_0|x'_0, \omega) = \sum_{\mu} \frac{j\omega(j\omega + s_\mu)}{N_\mu} K(x_0, \mu) \oint_{\partial V} G_0(x|x'_0, \omega) \, \tilde{K}(x, \mu) \, dS .   (4.56)

The integral on the right-hand side of Eq. (4.56) can be interpreted as the expansion of
the free-field Green's function G_0(x|x'_0, ω) into the eigenfunctions K(x, μ) and adjoint
eigenfunctions K̃(x, μ) of the wave equation bounded by the listening room. Division of
these expansion coefficients by the expansion coefficients of the Green's function of the room
G(x|x_0, ω) then yields the expansion coefficients of the compensation filter C(x_0|x'_0, ω).
Thus, by expanding the related functions into a series with respect to the kernels K(x, μ)
and K̃(x, μ), a closed-form solution is found for the deconvolution problem given by
Eq. (4.50). For the space-invariant case the eigenfunctions of the underlying systems are
exponential functions (see Section 3.2.3). A transformation of the functions using the
Fourier transformation provides an equivalent solution for the (space-invariant)
deconvolution problem, since the Fourier transformation has exponential functions as kernels.
However, the presented solution still requires knowledge of the Green's function of the
room. A solution to this problem will be discussed in the next section.
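For the space-invariant special case just mentioned, the eigenfunction expansion reduces to the DFT: a circulant "room" operator is diagonalized by the FFT, and the compensation filter follows by dividing the desired expansion coefficients by those of the room. The sketch below is a hypothetical one-dimensional illustration of this coefficient division, with random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 32

# circulant (shift-invariant) room response built from an impulse response g;
# the eigenvalues of a circulant operator are the DFT of its first column,
# its eigenfunctions are the exponentials exp(j*2*pi*k*n/N)
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)
lam_room = np.fft.fft(g)

# desired (free-field-like) response: a pure delay of 3 samples
g0 = np.zeros(N, dtype=complex)
g0[3] = 1.0
lam_free = np.fft.fft(g0)

# compensation filter: division of the expansion coefficients
lam_c = lam_free / lam_room
c = np.fft.ifft(lam_c)

# check: room response (*) compensation filter equals the desired response
# (circular convolution computed via the FFT)
out = np.fft.ifft(np.fft.fft(g) * np.fft.fft(c))
print(np.allclose(out, g0))  # → True
```

In the shift-variant room case the exponentials must be replaced by the kernels K and K̃ of the FTM, but the division of expansion coefficients proceeds in the same way.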

4.3.3 Adaptation of the Room Compensation Filters

In an abstract sense, room compensation can be understood as pre-equalizing a system
(the listening room) such that the overall system response matches a desired one. However,
the system to be equalized is in general not known a priori. Thus, the pre-equalization
filter has to be determined without explicit knowledge of the system. A solution to
such problems is provided by adaptive filters. In the context of adaptive filter theory,
room compensation can be understood as a supervised inverse filtering problem [Hay96].

Figure 4.18: Block diagram of a system for the adaptation of the room compensation
filter.

The room compensation problem, as defined within this thesis, is supervised since the
monopole driving function D(x_0, ω) is assumed to be known and thus can be utilized
to adapt the compensation filters. The equivalent unsupervised problem would have to
derive the room compensation filters from measurements only. Inverse filtering is one
of various cases where adaptive filters provide a convenient solution. Adaptive filtering
schemes in general utilize the error between the desired and the actual system response
in order to adapt the filter. For a system given in continuous time and space, this filter
has to be parameterized by a finite number of parameters [Son67, SK92]. If operating
optimally, the adaptive filter is adapted such that the error between the desired and the
actual system response is minimized. A wide variety of algorithms has been developed
in the past decades to tackle this fundamental problem [Hay96].
Adapting the structure of a generic adaptive inverse filtering scheme to the problem of
active room compensation results in the block diagram given by Fig. 4.18. The monopole
driving signal D(x_0, ω) is fed through the room compensation filter C(x_0|x'_0, ω), resulting
in the filtered driving signal W(x_0, ω). The wave field L(x, ω) within the listening region
is then determined by the room transfer function G(x|x_0, ω). Since the room transfer
function is in general not known, the wave field L(x, ω) within the listening region will be
measured in a generic room compensation system. The desired wave field A(x, ω) within
the listening region is given by the monopole driving signal D(x_0, ω) and the free-field
transfer function G_0(x|x_0, ω). The reproduction error E(x, ω) between the desired wave
field A(x, ω) and the reproduced wave field L(x, ω) is then used to adapt the compensation
filters C(x_0|x'_0, ω). The relations between the particular signals shown in Fig. 4.18
were already derived in Section 4.3.
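The adaptation loop can be sketched in its simplest single-channel form with the LMS algorithm [Hay96]. The code below is a hypothetical illustration, not thesis code: the "room" is a known FIR stand-in g, and the adaptive filter is driven so that the overall cascade approximates the free-field (identity) response. For simplicity the adaptive filter here operates on the room output; in the shift-invariant single-channel case the cascade order is irrelevant.

```python
import numpy as np

rng = np.random.default_rng(3)

g = np.array([1.0, 0.5])        # "listening room" impulse response (stand-in)
L, mu = 32, 0.01                # adaptive filter length and LMS step size
c = np.zeros(L)                 # compensation filter (to be adapted)

d = rng.choice([-1.0, 1.0], size=20000)   # known driving signal (white)
y = np.convolve(d, g)[: len(d)]           # signal observed after the room

err_hist = []
buf = np.zeros(L)
for n in range(len(d)):
    buf = np.roll(buf, 1)
    buf[0] = y[n]               # most recent room-output samples
    e = d[n] - c @ buf          # error w.r.t. the desired (free-field) response
    c += mu * e * buf           # LMS update driven by the error
    err_hist.append(e)

print(np.mean(np.abs(err_hist[-500:])))   # small residual after adaptation
```

The same supervised structure carries over to the massive multichannel case of Fig. 4.18, where scalar signals become vectors of driving and microphone signals and the filter becomes a matrix of impulse responses.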


The continuous space and time representation used so far is not appropriate for a practical
realization of room compensation. The following section will develop a generic framework
for adaptive active room compensation which is based on a discrete time and space representation of the involved signals and systems.

4.4 Listening Room Compensation for Massive Multichannel Reproduction Systems

The previous sections illustrated the principal solution to active room compensation in
continuous time and space. For a practical system, both time and space have to be
sampled appropriately. Additionally, the formulations used in the previous sections were
independent of the dimensionality of the problem. Due to complexity constraints, practical
reproduction systems are typically limited to reproduction in a plane only. In the
following discussions the case of two-dimensional reproduction and analysis will be considered.
However, the principles derived in the sequel can be generalized straightforwardly
to three-dimensional sound reproduction systems. Since the room transfer function is not
known a priori, or may change over time, an adaptive solution for calculating the compensation
filters is favorable. The need for an adaptive calculation of the room compensation
filters was illustrated, e.g., in [PSR05] by simulating a non-adaptive room compensation
system operating under varying acoustic conditions.
The following section presents a discrete framework for an adaptive active room compensation
system, derives an adaptation algorithm, and highlights the fundamental problems
of adaptive filtering for massive multichannel reproduction systems.

4.4.1 Discrete Realization of Room Compensation

In order to derive a discrete realization of the active listening room compensation system,
spatial sampling, temporal sampling and a frequency domain description of the signals
and systems are discussed in the sequel.
4.4.1.1 Spatial Discretization

A practical realization of room compensation requires sampling of the monopole line source
distribution on the line ∂V surrounding the listening area V. The effects of spatial
sampling of the monopole distribution were already discussed in Section 4.1.6. The spatial
sampling may result in spatial aliasing if the anti-aliasing conditions derived in Section 4.1.6
are not met. In particular, the temporal bandwidth of the reproduced wave field has to be
limited in order to avoid spatial aliasing. Thus, if the temporal bandwidth is reduced, then
the sampled version of the monopole line distribution with an

4.4. Listening Room Compensation for Massive Multichannel Reproduction Systems

135

111111111111111111111111111111111111111
000000000000000000000000000000000000000
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
N
1
discrete
2
000000000000000000000000000000000000000
111111111111111111111111111111111111111
synthesis
000000000000000000000000000000000000000
111111111111111111111111111111111111111
and analysis
000000000000000000000000000000000000000
111111111111111111111111111111111111111
positions
000000000000000000000000000000000000000
111111111111111111111111111111111111111
M
1
000000000000000000000000000000000000000
111111111111111111111111111111111111111
2
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
x
000000000000000000000000000000000000000
111111111111111111111111111111111111111
listening area
000000000000000000000000000000000000000
111111111111111111111111111111111111111
x
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
0
000000000000000000000000000000000000000
111111111111111111111111111111111111111
V
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
listening room
000000000000000000000000000000000000000
111111111111111111111111111111111111111
000000000000000000000000000000000000000
111111111111111111111111111111111111111
n

Figure 4.19: Discretized version of the active room compensation problem. The denote
the discrete sampling positions.
appropriate driving function provides full control over the wave field within V .
In order to adapt the room compensation filter, the reproduced wave field within the
listening area has to be analyzed. It was shown in Section 3.4 that acoustic measurements taken on the boundary of the listening area are suitable to characterize the wave
field within the listening area. In general, these measurements will be taken using microphones. However, in a practical implementation it will not be possible to place the
secondary acoustic sources and the microphones at the same positions. On the one hand
mechanical limitations will prohibit this. On the other hand the acoustic pressure level
in the vicinity of the secondary sources will be quite high, leading potentially to overload
of the microphones. For these reasons it is advisable to separate the points where the line
sources for synthesis and the microphones for analysis are placed. Figure 4.19 illustrates
the scenario that is considered in the following. The line source distribution is sampled at
a total of N positions; the reproduced wave field is analyzed at a total of M positions. The
analysis points are located within the area spanned by the sampled contour ∂V. Since the
room compensation filter is calculated based on the measurements taken at these points,
perfect room compensation can only be achieved within the area spanned by the analysis
points. This area comprises the listening area as depicted in Fig. 4.19.
The following discussion will assume that the anti-aliasing conditions for the synthesis
and analysis of the wave field are met. As derived in Section 3.6.2.3 and Section 4.1.6,
a limitation of the temporal bandwidth is necessary when assuming no spatial filtering
at the actuators and sensors.


4.4.1.2 Temporal Discretization

All signals are sampled synchronously in the time-domain at uniform time steps kTs ,
where k denotes the discrete time index and Ts the temporal sampling interval. The
sampling frequency is then given as fs = 1/Ts . It has to be chosen according to the
temporal sampling theorem [OS99] and the upper frequency limit provided by the sound
reproduction and analysis system. Temporal sampling will be illustrated in the following
for the monopole driving signal d(x, t). The same procedure as illustrated for this signal
applies also to the other signals used for room compensation.
The discretized version of the monopole driving signal d(x, t) for the n-th synthesis position is defined as
d_n(k) := d(x_n, k T_s) ,    (4.57)

where x_n denotes the n-th spatially sampled position on ∂V. For the description of
adaptation algorithms it is convenient to consider subsequent temporal samples or spatially
discrete positions together. Subsequent temporal samples are considered together
by capturing K samples of d_n(k) into a K × 1 vector d_n(k) as follows

d_n(k) = [ d_n(k)  d_n(k−1)  ⋯  d_n(k−K+1) ]^T .    (4.58)

Equivalent definitions apply to the signals w(x, t), l(x, t), a(x, t) and e(x, t), resulting in
their discrete counterparts w_n(k), l_m(k), a_m(k) and e_m(k). Please note that the dimension
of all these vectors is K × 1. In order to capture the spatial information, all spatial samples
at one time instant k are combined into a vector. For the loudspeaker driving
signals this vector is defined as follows

d^(N)(k) = [ d_1(k)  d_2(k)  ⋯  d_N(k) ]^T .    (4.59)

Equivalent definitions apply again to the other signals used for room compensation.
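As a sketch of these stacking conventions (the helper names and the toy buffer are illustrative assumptions, not notation from this chapter), the time-history vector of Eq. (4.58) and the spatial vector of Eq. (4.59) can be formed in NumPy as follows:

```python
import numpy as np

def time_history(d, k, K):
    """K x 1 time-history vector [d(k), d(k-1), ..., d(k-K+1)]^T, cf. Eq. (4.58)."""
    return d[k - K + 1 : k + 1][::-1]

def spatial_stack(D, k):
    """N x 1 vector of all N channel samples at one time instant k, cf. Eq. (4.59)."""
    return D[:, k]

# toy buffer: N = 3 channels, 10 samples each
rng = np.random.default_rng(0)
D = rng.standard_normal((3, 10))

d1_k = time_history(D[0], k=9, K=4)   # [d_1(9), d_1(8), d_1(7), d_1(6)]
dN_k = spatial_stack(D, k=9)          # [d_1(9), d_2(9), d_3(9)]
```

The same two stacking patterns recur for all other signals used in the adaptation below.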
The Green's function G(x|x₀, ω) characterizing the listening room can be regarded as the
acoustic transfer function from the excitation point x₀ to the measurement point x. Thus,
its counterpart in the temporal domain g(x|x₀, t) = F_t⁻¹{G(x|x₀, ω)} can be interpreted
as the corresponding Green's impulse response. In general, this impulse response may be of
infinite length. For practical purposes it is truncated at a reasonable time (e.g. when its
energy has decayed below a suitably chosen threshold), resulting in a finite impulse
response (FIR). The discretized impulse response is then defined as follows

r_{m,n}(k) := g(x_m | x_n, k T_s) ,    (4.60)


where r_{m,n}(k) will be denoted as room impulse response. As for the signals, N_r subsequent
temporal samples can be captured into an N_r × 1 vector

r_{m,n} = [ r_{m,n}(0)  r_{m,n}(1)  ⋯  r_{m,n}(N_r − 1) ]^T ,    (4.61)

where N_r denotes the finite number of temporal samples. The impulse responses from all
excitation to all analysis points can alternatively be combined into the matrix

         ⎡ r_{1,1}(k)  ⋯  r_{1,N}(k) ⎤
R(k) =   ⎢     ⋮       ⋱      ⋮      ⎥ ,    (4.62)
         ⎣ r_{M,1}(k)  ⋯  r_{M,N}(k) ⎦
where R(k) has the dimensions M × N. The matrix R(k) describes the sound propagation in the listening room from all synthesis to all analysis points and will
be termed the matrix of room impulse responses. It can be interpreted as a spatio-temporally
sampled version of the Green's impulse response. If the respective impulse responses in
R(k) are actually measured, then they will not only contain the acoustic transfer functions but also the influence of the employed hardware [Gar00]. These influences will be
neglected in the following, since this section aims at deriving the principal problems of
adaptive filtering for multichannel systems.
Analogous definitions, as given above for r_{m,n} and R(k), apply also to the functions
g₀(x|x₀, t) and c(x₀|x₀′, t), resulting in their discrete counterparts f_{m,n}, c_{n,n′}, F(k) and
C(k). The dimensions of f_{m,n} and c_{n,n′} are N_f × 1 and N_c × 1, the dimensions of F(k) and C(k)
are M × N and N × N. The matrix F(k) is termed the matrix of free-field impulse responses
and C(k) the compensation filter.
4.4.1.3 Frequency Domain Description of Signals and Systems

For the temporal frequency domain description of the signals and systems used for room
compensation, the discrete time Fourier transformation (DTFT) [OS99] will be used in the
following. The DTFT transformation and its inverse for the spatially discrete loudspeaker
driving signal d_n(k) are given as

D_n(ω) = Σ_{k=−∞}^{+∞} d_n(kT_s) e^{−jωkT_s} ,    (4.63a)

d_n(kT_s) = (T_s / 2π) ∫_{2π/T_s} D_n(ω) e^{jωkT_s} dω .    (4.63b)

Analogous definitions apply to the other signals w_n(k), l_m(k), a_m(k) and e_m(k). The
vector of spatially combined loudspeaker driving signals d^(N)(k) will be transformed into
the temporal frequency domain by transforming each element d_n(k) separately using the
DTFT (4.63). The resulting vector will be denoted by d^(N)(ω). Vectors and matrices of
frequency domain signals will be underlined in the sequel. Analogous definitions as for
d^(N)(k) apply to the other signals used.

[Figure 4.20: Block diagram illustrating the adaptive MIMO inverse filtering approach to room compensation. The driving signals d^(N)(k) pass through the compensation filter C(k) to give the loudspeaker signals w^(N)(k), which are filtered by the listening room R(k) to yield the reproduced signals l^(M)(k); in parallel, d^(N)(k) filtered by the free-field model F(k) yields the desired signals a^(M)(k), and the error is e^(M)(k) = a^(M)(k) − l^(M)(k).]
The matrix of room impulse responses R(k) will be transformed into the temporal frequency domain by transforming each element r_{m,n}(k) separately using the DTFT (4.63).
The resulting matrix in the frequency domain will be termed the room transfer matrix and
denoted by R(ω). The matrices F(k) and C(k) will be transformed analogously. The
transfer matrix F(ω) will be termed the free-field transfer matrix.
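In a practical implementation the DTFT is typically evaluated at a discrete set of frequencies, i.e. replaced by a DFT along the time axis. A minimal sketch (the M × N × N_r array layout and the sizes are illustrative assumptions) of transforming the matrix of room impulse responses element-wise:

```python
import numpy as np

M, N, Nr = 2, 3, 64          # analysis points, synthesis points, response length
rng = np.random.default_rng(1)
R_k = rng.standard_normal((M, N, Nr))   # r_{m,n}(k) stored along the last axis

# element-wise temporal transform: one M x N transfer matrix per frequency bin
R_w = np.fft.rfft(R_k, axis=-1)         # shape (M, N, Nr//2 + 1)

# the room transfer matrix at a single frequency bin is an M x N complex matrix
R_bin = R_w[:, :, 5]
```

This per-bin matrix is the quantity that the SVD-based decoupling in Section 4.5 operates on.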
4.4.1.4 Adaptation of Room Compensation Filters

Using the foregoing definitions a discrete counterpart of the adaptive framework given by
Fig. 4.18 is derived in this section. Figure 4.20 illustrates the resulting discrete time and
space block diagram. The matrices of impulse responses R(k), F(k) and C(k) describe
discrete linear multiple-input/multiple-output (MIMO) systems (see Section 3.2). Since
the respective impulse responses are finite, they will be termed MIMO FIR systems
in the following. Thus, room compensation can be understood as an inverse MIMO FIR
filtering problem. For a low number of synthesis and analysis channels, numerous solutions to
this problem have been developed in the past [Gar00, TW02, TW03, BHK03, NOBH95].
However, algorithms for massive multichannel systems still remain a challenge as will be
shown in the following. In the next section the case of perfect knowledge of the room
transfer matrix will be considered first in order to derive useful results for the solvability
of such inverse problems.

4.4.2 Exact Inverse Filtering using the MINT

It is possible, under certain reasonable assumptions, to derive an exact solution of the
MIMO inverse filtering problem if the room transfer matrix is given. This solution is
provided by the multiple-input/output inverse theorem (MINT). The following section
briefly reviews this approach, as presented in [MK88, Fur01], and its implications for
active room compensation.
The fundamental problem of inverse filtering will be derived similarly to Section 4.3 by
comparing the desired signal a_m(k) to the reproduced signal l_m(k) for all analysis positions
m = 1...M. The desired signal a_m(k) for the m-th analysis position is given as

a_m(k) = Σ_{n=1}^{N} d_n(k) ∗ f_{m,n}(k) ,    (4.64)

where ∗ denotes discrete convolution in this context. Due to the MIMO structure of
the system, the desired signal a_m(k) is given as a sum of the filtered monopole driving
signals d_n(k).
The reproduced signal l_m(k) at the m-th analysis position is derived by considering the
room transfer function and the compensation filters. It is given as

l_m(k) = Σ_{n=1}^{N} w_n(k) ∗ r_{m,n}(k)
       = Σ_{n=1}^{N} Σ_{n′=1}^{N} d_{n′}(k) ∗ c_{n,n′}(k) ∗ r_{m,n}(k) .    (4.65)
For perfect compensation of the room influence, the desired and the reproduced signal
should be equivalent at all analysis positions (a_m(k) = l_m(k) for m = 1...M). Comparing
the terms inside the summations of Eq. (4.64) and Eq. (4.65) for sufficient excitation d_n(k)
yields the following relation for the computation of the room compensation filters

Σ_{n′=1}^{N} r_{m,n′}(k) ∗ c_{n′,n}(k) = f_{m,n}(k)    for m = 1...M .    (4.66)

Equation (4.66) can be regarded as the discrete counterpart of Eq. (4.50). It constitutes
a spatio-temporally discrete, space-variant deconvolution problem.
For the single channel case (N = M = 1) Eq. (4.66) states that the exact solution would
be given by computing an inverse filter to rm,n (k). In general, room impulse responses
are not minimum phase and thus cannot be inverted exactly. Typical approaches towards
this problem calculate a minimum-phase representation of the room impulse response and
invert only this [NA79]. However, in the MIMO case the situation improves. The MINT
states that an exact solution can be found under the following assumptions:


1. the number of synthesis positions is higher than the number of analysis points
(N > M),

2. the n-th input d_n(k) can be equalized jointly for all M analysis positions independently of the other inputs, and

3. the transfer functions in the z-domain of R(k) and F(k) do not exhibit common
zeros.
The first condition states that at least one synthesis position more than analysis positions
is required for an exact solution. This condition is also fundamental for exact synthesis
of sound fields, as shown in [Fli02]. The second condition holds in the context of this
work, since in linear acoustics the superposition principle applies. The third condition
implies that the inversion problem is much easier to solve in the MIMO case, since only
common zeros of the transfer matrices in the z-domain of R(k) and F(k) pose a problem
(see also Eq. (4.66)). In most practical cases this condition will be fulfilled. The solution
to the MIMO inverse filtering problem is found by setting up a system of linear equations.
This is performed by representing the convolution in Eq. (4.66) by a matrix multiplication
using Toeplitz matrices and formulating the problem jointly for all analysis points. This
results in the following system of linear equations

⎡ T_{1,1}  ⋯  T_{1,N} ⎤ ⎡ c_{1,n} ⎤   ⎡ f_{1,n} ⎤
⎢    ⋮     ⋱     ⋮    ⎥ ⎢    ⋮    ⎥ = ⎢    ⋮    ⎥ ,    (4.67)
⎣ T_{M,1}  ⋯  T_{M,N} ⎦ ⎣ c_{N,n} ⎦   ⎣ f_{M,n} ⎦
└──────── T ─────────┘

where the block matrix of Toeplitz matrices is abbreviated as T, the T_{m,n} denote Toeplitz
matrices, and c_{n′,n} and f_{m,n} are defined equivalently to Eq. (4.61) with the lengths N_c
and N_f respectively. The Toeplitz matrices T_{m,n} are given as follows

          ⎡ r_{m,n}(0)          0            ⋯          0           ⎤
          ⎢ r_{m,n}(1)      r_{m,n}(0)                  ⋮           ⎥
          ⎢     ⋮           r_{m,n}(1)       ⋱          0           ⎥
T_{m,n} = ⎢ r_{m,n}(N_r−1)      ⋮            ⋱      r_{m,n}(0)     ⎥ .  (4.68)
          ⎢     0          r_{m,n}(N_r−1)    ⋱      r_{m,n}(1)     ⎥
          ⎢     ⋮               ⋱            ⋱          ⋮           ⎥
          ⎣     0               0            0      r_{m,n}(N_r−1) ⎦
An exact solution to the system of equations (4.67) can be found when N_f = N_r + N_c − 1
and the matrix T is square. This leads to the following condition for the length of the
inverse filters

N_c = M (N_r − 1) / (N − M) .    (4.69)

[Figure 4.21: Relative filter length of the room compensation filters N_c/N_r for a fixed number of excitation points N = 10 and a varying number of analysis (equalization) points M (N_r ≫ 1).]

Equation (4.69) states that the length Nc of the inverse filters increases with the number
of analysis points (= equalized points). If M is chosen to be near N then the length of
the inverse filters will be longer than the length of the room impulse responses. If M is
chosen quite small then the length of the inverse filters will be smaller than the length of
the room impulse responses. The length of the inverse filters is approximately equivalent
to the length of the room impulse responses for M = N/2. Figure 4.21 illustrates the
relative length of the compensation filters for N = 10 synthesis positions.
The calculation of the compensation filter coefficients by solution of Eq. (4.67) requires
inverting the matrix T. For a high number of synthesis and analysis positions, or a high
number of coefficients N_r of the room impulse responses, this matrix becomes very large
and the inversion becomes computationally very complex. Thus, a direct implementation
of the MINT is infeasible for a high number of synthesis and analysis points and typical lengths of room
impulse responses due to the matrix inversion required. However, the derived conditions
for an exact solution give implications for the solvability and the required length of the
compensation filters. This length may become quite long compared to the room response,
if the number of synthesis points is chosen near the number of analysis points. Unfortunately, this is the desired case for equalization of the entire listening area. The following
section will introduce an adaptive solution to room compensation that will overcome the
problems of the MINT.

4.4.3 Least-Squares Error Adaptation of the Room Compensation Filters

The previous section derived an exact solution for the computation of the room compensation filters when certain assumptions are fulfilled. Besides its computational complexity,
the presented solution requires that the listening room transfer matrix is known a-priori
and additionally does not change over time. Since both of these requirements are not
fulfilled in typical listening room scenarios, an alternative approach will be proposed. As
stated before, an adaptive algorithm may provide a solution to these problems. This
section will introduce the concept of linear inverse optimum least-squares error adaptive
filtering and will point out the fundamental problems when applied to room compensation
in the context of this thesis. The following discussion will first derive the normal equation
of the inverse least-squares error adaptive filtering problem and will then give some remarks on the derivation of the filtered-x recursive least-squares (X-RLS) algorithm [BQ00].
The filtered-x RLS algorithm deviates from the standard recursive least-squares (RLS)
algorithm by using a filtered version of the input signal for the adaptation. The results
derived for the X-RLS algorithm are generic, since most of the frequently used adaptive
filtering schemes (e.g. the filtered-x least-mean-squares (X-LMS) algorithm) can be understood as specializations of the presented X-RLS algorithm [Hay96].
Perfect listening room compensation is achieved when the reproduced wave field matches the
desired wave field. A measure for this condition is given by the error e_m(k) = a_m(k) − l_m(k)
between the desired wave field a_m(k) and the reproduced wave field l_m(k). If this measure is zero for all M analysis points, perfect listening room compensation is achieved. In
the context of adaptive filtering the error e_m(k) is used as cost function for the inverse
filter optimization problem. The compensation filters should be adapted such that they
minimize the error e_m(k). The weighted least-squares (WLS) estimate uses a weighted
sum of the time-averaged error as cost function [Hay96]

ξ(ĉ, k) = Σ_{κ=0}^{k} W_WLS(k, κ) Σ_{m=1}^{M} |e_m(κ)|² ,    (4.70)

where 0 < W_WLS(k, κ) ≤ 1 denotes the weighting factor and ĉ = ĉ(k) the coefficients of the
estimated room compensation filter at the time instant k. In the following, the weighting
factor is chosen as the exponential function W_WLS(k, κ) = λ^{k−κ} with 0 < λ ≤ 1. This way
more recent samples have greater influence on the cost function, allowing for better tracking
of a time-varying room response. The special case λ = 1 takes all past measurements into
account and corresponds to an infinite memory. Using this specialized weighting factor
results in the cost function of the filtered-x RLS algorithm, which is given as

ξ(ĉ, k) = Σ_{κ=0}^{k} λ^{k−κ} Σ_{m=1}^{M} |e_m(κ)|² .    (4.71)


Combining all M error signals at one time instant k into the vector e^(M)(k) =
[ e_1(k)  e_2(k)  ⋯  e_M(k) ]^T allows the cost function to be written more conveniently as

ξ(ĉ, k) = Σ_{κ=0}^{k} λ^{k−κ} e^(M)(κ)^T e^(M)(κ) .    (4.72)

The optimal filter coefficients in the mean-squared error (MSE) sense are found by setting
the gradient of the cost function with respect to the estimated filter coefficients ĉ to zero

∇_ĉ ξ(ĉ, k) = 0 .    (4.73)

It remains in the following to express the error e^(M)(k) in terms of the filter coefficients.
The reproduced signal lm (k) at the m-th analysis point is given by Eq. (4.65). By rearranging the time-domain convolutions, Eq. (4.65) can be rewritten as
l_m(k) = Σ_{n=1}^{N} Σ_{n′=1}^{N} ( d_{n′}(k) ∗ r_{m,n}(k) ) ∗ c_{n,n′}(k)
       = Σ_{n=1}^{N} Σ_{n′=1}^{N} d_{m,n,n′}^T(k) c_{n,n′}(k) ,    (4.74)

where d_{m,n,n′}(k) denotes the driving signal d_{n′}(k) filtered by the room impulse
response r_{m,n}(k), as defined above. The vectors d_{m,n,n′}(k) and c_{n,n′}(k) contain
the time history of the filtered driving signals and the coefficients of the room compensation
filter, respectively. They are defined according to Eq. (4.58) and Eq. (4.61). The length of both
vectors is chosen equal to the length of the inverse filters N_c, which should be longer than
the room impulse response (N_c > N_r) for typical room compensation scenarios, as stated in
Section 4.4.2. Introducing the following definitions
ĉ_n^(N)(k) = [ ĉ_{n,1}^T(k)  ĉ_{n,2}^T(k)  ⋯  ĉ_{n,N}^T(k) ]^T ,    (4.75)

ĉ(k) = [ ĉ_1^(N)(k)^T  ĉ_2^(N)(k)^T  ⋯  ĉ_N^(N)(k)^T ]^T ,    (4.76)

d_{m,n}^(N)(k) = [ d_{m,n,1}^T(k)  d_{m,n,2}^T(k)  ⋯  d_{m,n,N}^T(k) ]^T ,    (4.77)

d_m^(N,N)(k) = [ d_{m,1}^(N)(k)^T  d_{m,2}^(N)(k)^T  ⋯  d_{m,N}^(N)(k)^T ]^T ,    (4.78)

D_R(k) = [ d_1^(N,N)(k)  d_2^(N,N)(k)  ⋯  d_M^(N,N)(k) ] ,    (4.79)

where ĉ(k) and D_R(k) have the sizes N²N_c × 1 and N²N_c × M respectively, allows to
express the reproduced wave field jointly at all M analysis positions as

l^(M)(k) = D_R^T(k) ĉ(k) .    (4.80)

Using Eq. (4.80) allows to express the error at all M analysis positions in terms of the desired
signals a^(M)(k), the filtered loudspeaker driving signals D_R(k) and the filter coefficients
ĉ(k)

e^(M)(k) = a^(M)(k) − D_R^T(k) ĉ(k) .    (4.81)


As stated before, the optimal filter coefficients that minimize the error e^(M)(k) can be
derived according to Eq. (4.73). Evaluation of the gradient of the cost function with
respect to ĉ(k) yields

∇_ĉ ξ(ĉ, k) = ∂ξ(ĉ, k)/∂ĉ = ∂/∂ĉ [ Σ_{κ=0}^{k} λ^{k−κ} e^(M)(κ)^T e^(M)(κ) ]
            = 2 Σ_{κ=0}^{k} λ^{k−κ} [ D_R(κ) D_R^T(κ) ĉ(k) − D_R(κ) a^(M)(κ) ] = 0_{N²N_c × 1} .    (4.82)

Rearranging Eq. (4.82) yields the normal equation of the multichannel inverse filtering
problem as

Φ_dd(k) ĉ(k) = φ_da(k) ,    (4.83)

where Φ_dd(k) and φ_da(k) abbreviate the terms involving the signals D_R(k) and a^(M)(k)
in the sums of Eq. (4.82). The matrix Φ_dd(k) can be interpreted as the time and analysis
position averaged auto-correlation matrix of the filtered loudspeaker driving signals, which
is defined as

Φ_dd(k) = Σ_{κ=0}^{k} λ^{k−κ} D_R(κ) D_R^T(κ) .    (4.84)

The vector φ_da(k) can be interpreted as the time and analysis position averaged cross-correlation vector between the filtered loudspeaker driving signals and the desired signals,
which is defined as

φ_da(k) = Σ_{κ=0}^{k} λ^{k−κ} D_R(κ) a^(M)(κ) .    (4.85)

The filtered-x RLS algorithm can be derived from the normal equation (4.83) and above
definitions of the correlation matrices by computing the sums in a recursive fashion and
by applying the matrix inversion lemma. The derivation of the filtered-x RLS algorithms
using this procedure can be found e. g. in [BQ00]. The filtered-x RLS algorithm deviates
from the standard RLS algorithm by using a filtered version of the input signal for adaptation. The coefficients of the room compensation filter ĉ(k) are typically not computed
for every time step k; they are updated at a lower rate than the sampling rate due to
complexity constraints.
The following discussion will focus on the fundamental problems of the adaptive inverse
filtering problem given by Eq. (4.83) and not on particular algorithms and their problems
used for realization.
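The recursive accumulation behind Eqs. (4.83)–(4.85) can be sketched for a single channel (N = M = 1). In this illustrative setup the target response is chosen equal to the room response, so the optimal compensation filter is a unit impulse, which the normal-equation solution recovers (all variable names are assumptions for illustration):

```python
import numpy as np

Nr, Nc, lam = 6, 8, 0.999
rng = np.random.default_rng(3)
r = rng.standard_normal(Nr)             # room impulse response
d = rng.standard_normal(4000)           # driving signal (white noise excitation)

x = np.convolve(d, r)[:len(d)]          # filtered driving signal, cf. D_R(k)
a = x.copy()                            # desired signal: here f = r, so a = d*r

Phi = np.zeros((Nc, Nc))                # Phi_dd, Eq. (4.84)
phi = np.zeros(Nc)                      # phi_da, Eq. (4.85)
for k in range(Nc, len(d)):
    xk = x[k - Nc + 1 : k + 1][::-1]    # time-history vector of the filtered input
    Phi = lam * Phi + np.outer(xk, xk)  # recursive exponentially weighted update
    phi = lam * phi + xk * a[k]

c = np.linalg.solve(Phi, phi)           # normal equation Phi c = phi, Eq. (4.83)
```

The full X-RLS algorithm avoids the explicit solve by updating the inverse of Phi via the matrix inversion lemma; the direct solve above only illustrates the quantities involved.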

4.4.4 Fundamental Problems of Adaptive Inverse Filtering

Three fundamental problems of adaptive inverse filtering can be concluded from the
derivation of the normal equation (4.83) in the previous section. These are:

1. required a-priori knowledge of the room transfer function,

2. the non-uniqueness problem when minimizing the cost function ξ(ĉ, k), and

3. ill-conditioning of the auto-correlation matrix Φ_dd(k).
It was stated before that in general the room transfer impulse responses rm,n (k) will not
be known a-priori and are potentially time-varying. Thus it seems that not much has been
gained by using an adaptive filtering scheme since this still requires a-priori knowledge to
calculate the filtered driving signals DR (k). While some amount of deviation between the
actual room transfer functions and a-priori measured ones can be tolerated [BQ00], large
changes in the listening room will likely cause poor convergence of the adapted filters. In
order to cope with changing environments, the listening room transfer functions have to
be identified additionally. There are various on-line identification methods for this task. It
will be assumed in the following that a suitable algorithm for this purpose can be applied
within the context of room compensation to solve the first fundamental problem. An
overview of possible methods can be found in [Gar00, HS97]. However, most of these
algorithms are not capable of handling the massive multichannel case [SMH95, BMS98].
The second problem is related to the minimization of the cost function ξ(ĉ, k). The optimal room compensation filters are given by calculating the inverse filters to the room
impulse responses, as shown by Eq. (4.66). However, minimization of the cost function ξ(ĉ, k) may not provide the optimal solution in these terms. Depending on the
driving signals d^(N)(k), there may be multiple possible solutions for ĉ(k) that minimize
ξ(ĉ, k) [SMH95, BMS98]. This problem will be termed the non-uniqueness problem in
the following. It likely occurs if the room transfer functions are not excited at all spatial
and temporal frequencies. For example, if only one wall in a room is excited by a plane
wave emitted by the reproduction system, the reflections caused by the other walls for
other excitations are not included in the compensation filters. In general, a MIMO system can only be identified perfectly if it is excited in all of its spatio-temporal degrees
of freedom. As a consequence of the non-uniqueness problem, changing the spatial and
temporal characteristics of the driving signals may temporarily result in a rapid increase
of the error e^(M)(k). This makes a renewed convergence of the compensation filters necessary.
Thus, the non-uniqueness problem is mainly an issue for highly temporally and spatially
non-stationary driving signals. A typical situation is the synthesis of a moving virtual
source.
The third fundamental problem is related to the solution of the normal equation (4.83).


The normal equation has to be solved with respect to the coefficients of the room compensation filter. However, due to the dimensionality and the potential ill-conditioning of
the auto-correlation matrix Φ_dd(k), this may become an infeasible task for a large number
of reproduction channels and analysis points. Additionally, an exact solution may not
always exist. Some conditions for the solvability of the normal equation in the context of
room compensation are given in [Gar00].
The following section derives a generic framework for room compensation which explicitly
solves the last problem by utilizing spatio-temporal signal and system transformations.
It will additionally be shown that the first two problems are considerably alleviated by
this approach.

4.5 Generic Framework of an Improved Listening Room Compensation System

The major conclusion of the previous section was that the adaptation of the room compensation filters using conventional filtered-x algorithms is not feasible for massive multichannel
systems due to its complexity. The following section will derive a generic framework
for an improved room compensation system. Improved means in this context overcoming
most of the problems discussed in the previous section.

4.5.1 Analysis of the Auto-Correlation Matrix Φ_dd

The main shortcomings of the X-RLS algorithm introduced in the previous section
emerge from the correlation matrix Φ_dd of the filtered driving signals. The structure
and contents of this matrix are analyzed in detail in order to propose a solution to these
shortcomings. The correlation matrix Φ_dd(k), as given by Eq. (4.84), can be expressed in
form of the following block matrix

          ⎡ Φ_{(1,1),(1,1)}(k)  ⋯  Φ_{(1,1),(N,N)}(k) ⎤
Φ_dd(k) = ⎢         ⋮           ⋱          ⋮          ⎥ ,    (4.86)
          ⎣ Φ_{(N,N),(1,1)}(k)  ⋯  Φ_{(N,N),(N,N)}(k) ⎦

where (n′, n) denotes an index consisting of all combinations of the two variables n′ and
n. The matrices Φ_{(n′₁,n₁),(n′₂,n₂)}(k) of dimension N_c × N_c are defined as

Φ_{(n′₁,n₁),(n′₂,n₂)}(k) = Σ_{κ=0}^{k} λ^{k−κ} Σ_{m=1}^{M} d_{m,n′₁,n₁}(κ) d_{m,n′₂,n₂}^T(κ) ,    (4.87)

where n₁, n′₁, n₂, n′₂ ∈ {1, 2, ..., N}. Using the above definition (4.87), the correlation matrices Φ_{(n′₁,n₁),(n′₂,n₂)}(k) can be interpreted as the


1. weighted time-average and

2. average over all M analysis positions

of the cross-correlation matrix between the filtered driving signals d_{m,n′₁,n₁}(k) and
d_{m,n′₂,n₂}(k). The filtered driving signal d_{m,n′₁,n₁}(k) is computed by connecting the driving
signal d_{n₁}(k) to the n′₁-th input of the system constituting the room transfer function. The
same applies to d_{m,n′₂,n₂}(k). The spatial and temporal averaging of the cross-correlation
matrices is a result of the chosen cost function (4.71). The correlation matrix Φ_dd is
constructed from the cross-correlation matrices Φ_{(n′₁,n₁),(n′₂,n₂)}(k) for all permutations of
n₁, n′₁, n₂, n′₂. Their total number is N⁴. Hence, the total size of the correlation matrix
Φ_dd of the filtered driving signals is N²N_c × N²N_c.

The permutations are a result of the MIMO structure of the room transfer function and
the room compensation filter. Thus, if the room transfer matrix is diagonalized, then
Φ_dd will become a block-diagonal matrix. As a result, the MIMO system representing
the listening room and also the MIMO FIR inverse filtering problem will be decoupled.
This fundamental idea will be used in the following sections to derive an improved room
compensation system.
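The computational benefit of a block-diagonal Φ_dd can be sketched directly: a block-diagonal normal equation splits into independent small solves, one per block (the block size, block count and matrix construction below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
Nc, B = 4, 3                         # block size and number of decoupled blocks

# build a block-diagonal symmetric positive definite correlation matrix
blocks = []
for _ in range(B):
    A = rng.standard_normal((Nc, Nc))
    blocks.append(A @ A.T + Nc * np.eye(Nc))
Phi = np.zeros((B * Nc, B * Nc))
for b, Ab in enumerate(blocks):
    Phi[b * Nc:(b + 1) * Nc, b * Nc:(b + 1) * Nc] = Ab
phi = rng.standard_normal(B * Nc)

# one joint solve versus B independent per-block solves: same result
c_joint = np.linalg.solve(Phi, phi)
c_blocks = np.concatenate(
    [np.linalg.solve(Ab, phi[b * Nc:(b + 1) * Nc]) for b, Ab in enumerate(blocks)]
)
```

The per-block solves are both far cheaper and independently parallelizable, which is exactly what makes the decoupling attractive for massive multichannel systems.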

4.5.2 Decoupling of the Listening Room Transfer Matrix

The previous section stated that a decoupling of the listening room transfer matrix yields
a decoupling of the auto-correlation matrix Φ_dd(k). This section will show how the desired
decoupling can be obtained using the concept of the singular value decomposition (SVD).
Performing a DTFT of Eq. (4.65) yields the signal at the M analysis points in the frequency
domain as

l^(M)(ω) = R(ω) w^(N)(ω) ,    (4.88)

where R(ω) denotes the DTFT-transformed matrix of impulse responses from each synthesis to each analysis position, and l^(M)(ω) and w^(N)(ω) the DTFT-transformed signals at the
analysis positions and the filtered loudspeaker driving signals, respectively. They are defined
according to Section 4.4.1.3. The following two sections will introduce the SVD for the
decomposition of the room transfer matrix and its application to the decomposition of
Eq. (4.88).
4.5.2.1 Singular Value Decomposition

The singular value decomposition states that any matrix R(ω) can be decomposed into
two unitary matrices U(ω) and V(ω), and a diagonal matrix R̃(ω) [Hay96, GL89, Dep88].
It will be assumed in the following that R(ω) has the dimensions M × N with N ≥ M.
The SVD of the room transfer matrix is then given as follows

R(ω) = U(ω) R̃(ω) V^H(ω) ,    (4.89)

4. Listening Room Compensation

where U(ω) and V(ω) have the dimensions M × M and N × M respectively, and R̃(ω) the dimension M × M. Since U(ω) and V(ω) are unitary, U^H(ω)U(ω) = V^H(ω)V(ω) = I_{M×M}. The columns of the matrix V(ω) are constructed from the right singular vectors v_b(ω), the columns of the matrix U(ω) from the left singular vectors u_b(ω) of R(ω). The diagonal matrix R̃(ω) is defined as follows

R̃(ω) = diag{ [σ_1, σ_2, ..., σ_M] } ,   (4.90)

where σ_1 ≥ σ_2 ≥ ... ≥ σ_B > 0 denote the B nonzero singular values σ_b of R(ω). Their total number B is given by the rank of the matrix R(ω) as B = rk{R(ω)} with 1 ≤ B ≤ M. For B < M the remaining singular values σ_{B+1}, σ_{B+2}, ..., σ_M are zero.
Equation (4.89) can be rewritten as the following series

R(ω) = Σ_{b=1}^{B} σ_b u_b(ω) v_b^H(ω) .   (4.91)

Hence, the matrix R(ω) can be represented as a finite series constructed from the left and right singular vectors weighted by the corresponding singular values. The above series representation may also be used to calculate an approximation of R(ω) by using only a subset of the singular values and their corresponding left and right singular vectors. For this purpose, the B_tr < B largest singular values and the corresponding singular vectors are typically used.
From a formal point of view, the above expansion exhibits similarities to the expansion (4.52) of the Green's function using the FTM. The difference is that the FTM expansion requires an infinite number of components, whereas the listening room transfer matrix is constructed from measurements at a limited number of discrete positions. It was shown in Section 3.6.2 that this limited number of positions inherently imposes a truncation. However, the FTM expansion may also be limited to a reasonable number of components.
The relation given by Eq. (4.89) can be inverted by exploiting the unitary property of the left and right singular matrices. This results in

R̃(ω) = U^H(ω) R(ω) V(ω) .   (4.92)

Hence, each matrix R(ω) can be transformed into a diagonal matrix R̃(ω) using the left and right singular matrices U(ω) and V(ω). In general, these singular matrices depend on the matrix R(ω); due to this dependency the SVD is a data-dependent transformation. The SVD can be used to define the pseudoinverse R^+(ω) of the matrix R(ω) as follows [Hay96]

R^+(ω) = V(ω) R̃^{-1}(ω) U^H(ω) .   (4.93)
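As a numerical sanity check, the diagonalization (4.92) and the SVD-based pseudoinverse (4.93) can be reproduced for a single frequency bin. The random complex matrix below is a hypothetical stand-in for R(ω), not measured data; dimensions are arbitrary:

```python
import numpy as np

# A random M x N "room transfer matrix" at one frequency bin
# (hypothetical stand-in for R(w); M analysis points, N loudspeakers).
M, N = 4, 6
rng = np.random.default_rng(0)
R = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

# Economy-size SVD: U is M x M, s holds the M singular values,
# Vh is the Hermitian transpose of the N x M matrix V.
U, s, Vh = np.linalg.svd(R, full_matrices=False)

# Eq. (4.92): transforming R with the singular matrices diagonalizes it.
R_diag = U.conj().T @ R @ Vh.conj().T
assert np.allclose(R_diag, np.diag(s))

# Eq. (4.93): the pseudoinverse built from the SVD factors matches
# the directly computed Moore-Penrose pseudoinverse.
R_pinv = Vh.conj().T @ np.diag(1.0 / s) @ U.conj().T
assert np.allclose(R_pinv, np.linalg.pinv(R))
```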
4.5. Generic Framework of an Improved Listening Room Compensation System

The concept of the SVD can be generalized to the diagonalization of a pair of matrices. This decomposition is known as the generalized singular value decomposition (GSVD) [Dep88, GL89]. The GSVD for the matrices R(ω) and F(ω) is given as follows

R(ω) = X(ω) R̃(ω) V^H(ω) ,   (4.94a)
F(ω) = X(ω) F̃(ω) U^H(ω) ,   (4.94b)

where F(ω) has the dimension M × N. The matrices X(ω), V(ω) and U(ω) are unitary matrices with the dimensions M × M, N × M and N × M respectively. The matrix X(ω) is the generalized singular matrix of R(ω) and F(ω). As for the SVD, the matrices R̃(ω) and F̃(ω) are diagonal matrices constructed from the singular values of R(ω) and F(ω). The GSVD transforms R(ω) and F(ω) into their joint eigenspace using the singular matrices X(ω), V(ω) and U(ω).
Equation (4.94) and Eq. (4.93) can be combined to derive the following result

R^+(ω) F(ω) = V(ω) R̃^{-1}(ω) F̃(ω) U^H(ω) ,   (4.95)

where it is assumed that R(ω) and F(ω) both have full rank. Equation (4.95) will be used in Section 4.5.3 to derive the desired decoupling of the MIMO adaptive system.
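Equation (4.95) can be illustrated numerically by constructing R(ω) and F(ω) from synthetic GSVD factors (random unitary and diagonal matrices, all assumed purely for illustration) and comparing against a directly computed pseudoinverse:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 3, 5

def semi_unitary(rows, cols, rng):
    """Random matrix with orthonormal columns (rows >= cols)."""
    A = rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))
    Q, _ = np.linalg.qr(A)
    return Q

# Synthetic GSVD factors as in Eq. (4.94): shared left matrix X,
# right matrices V, U with orthonormal columns, diagonal R~, F~.
X = semi_unitary(M, M, rng)             # M x M unitary
V = semi_unitary(N, M, rng)             # N x M
U = semi_unitary(N, M, rng)             # N x M
Rt = np.diag(rng.uniform(0.5, 2.0, M))  # diagonal, full rank
Ft = np.diag(rng.uniform(0.5, 2.0, M))

R = X @ Rt @ V.conj().T                 # Eq. (4.94a)
F = X @ Ft @ U.conj().T                 # Eq. (4.94b)

# Eq. (4.95): the least-squares compensation filter R^+ F reduces to
# diagonal (scalar) operations between the transformation matrices.
C = V @ np.linalg.inv(Rt) @ Ft @ U.conj().T
assert np.allclose(C, np.linalg.pinv(R) @ F)
```

The point of the sketch is that the only coupling between channels is carried by the fixed transformation matrices; the frequency-dependent part is purely diagonal.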
4.5.2.2 Decoupling of the Listening Room Transfer Matrix

The SVD, as introduced in the previous section, can be used to transform the listening room transfer matrix into a diagonal matrix. Equation (4.92) together with the unitary property of the left and right singular matrices can be used to reformulate Eq. (4.88) as follows

U^H(ω) l^(M)(ω) = R̃(ω) V^H(ω) w^(N)(ω) ,   (4.96)

where the M × 1 vectors l̃^(M)(ω) = U^H(ω) l^(M)(ω) and w̃^(M)(ω) = V^H(ω) w^(N)(ω) denote the transformed signals at the analysis positions and the transformed driving signals respectively. In the context of signals and systems, the SVD can be understood as a transformation whose kernels are given by the left and right singular matrices U(ω) and V(ω). The transformation of the MIMO system representing the listening room can be performed by pre- and post-filtering the room transfer matrix with V(ω) and U^H(ω). The pre- and post-filters constitute MIMO systems themselves. Thus, Eq. (4.88) can be expressed entirely in this transformed domain. This principle is illustrated by Fig. 4.22. A similar matrix formulation can also be given for the fast convolution technique using the discrete Fourier transformation (DFT) of a signal [OS99]; in this special case the transformation matrices are given by sampled exponential functions.
The benefit of using this transform domain description of the listening room transfer matrix lies in the simplified structure of R̃(ω). As stated in the previous section, R̃(ω) denotes the diagonal matrix composed of the singular values (some possibly equal to zero) of R(ω).

Figure 4.22: Transformation of the MIMO listening room response using the singular value decomposition.

Due to the diagonal structure of R̃(ω), the signals at the analysis positions in the transformed domain l̃^(M)(ω) can be computed by a scalar multiplication of the main diagonal elements R̃_m(ω) of R̃(ω) with the transformed loudspeaker driving signals w̃^(M)(ω)

L̃_m(ω) = R̃_m(ω) W̃_m(ω) .   (4.97)

If R(ω) does not have full rank (B < M), then the above equation only has to be evaluated for m = 1, ..., B. Hence, the transformation of the signals and systems using the SVD decomposes the MIMO system given by R(ω) into B SISO systems. The space-variant system R(ω) is transformed into a space-invariant representation in a transformed domain by the SVD.
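Equations (4.96) and (4.97) can be sketched for one frequency bin (random stand-ins for the matrices and signals): after transforming with the singular matrices, the MIMO propagation reduces to independent scalar multiplications:

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 4, 6
R = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
w = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # driving signals

U, s, Vh = np.linalg.svd(R, full_matrices=False)

l = R @ w                  # Eq. (4.88): full MIMO propagation
l_t = U.conj().T @ l       # transformed analysis signals, Eq. (4.96)
w_t = Vh @ w               # transformed driving signals

# Eq. (4.97): in the transformed domain, each component propagates
# through an independent scalar (SISO) path.
assert np.allclose(l_t, s * w_t)
```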
In general, the computation of the SVD will be too complex to benefit from the complexity reduction given by this decomposition of the MIMO system. However, an efficient transformation of R(ω) with properties equivalent to those of the SVD-based transformation of the systems and signals may result in a highly reduced complexity. A prominent example of such an efficient realization is the fast Fourier transform (FFT), which provides a very efficient implementation of the DFT for signals and systems. In the context of adaptive filtering of MIMO systems, the introduced transformation results in a decoupling of the entire adaptive system, as will be shown in the following.

4.5.3 Eigenspace Adaptive Inverse Filtering

The previous section derived a decomposition of the MIMO system R(ω), representing the listening room, into several SISO systems by using a transformation based on the SVD. This section will derive a decoupling of the entire adaptive system depicted in Fig. 4.20 by jointly decoupling the systems F(ω) and R(ω) using the GSVD.
The error between the desired wave field a^(M)(ω) and the reproduced wave field l^(M)(ω) at the M analysis positions can be derived by transforming Eq. (4.64) and Eq. (4.65) into the frequency domain

e^(M)(ω) = a^(M)(ω) − l^(M)(ω) = F(ω) d^(N)(ω) − R(ω) C(ω) d^(N)(ω) .   (4.98)

In the sequel, a decoupling of Eq. (4.98) will be derived, resulting in a decoupling of the MIMO adaptive inverse filtering problem. The basic idea is to diagonalize the listening room transfer matrix R(ω) and the free-field transfer matrix F(ω) using the GSVD. It will be assumed first that both transfer matrices have full rank. However, the results can be generalized straightforwardly to the case that F(ω) and/or R(ω) do not have full rank.
The decompositions of the transfer matrices F(ω) and R(ω) are given by Eq. (4.94). It remains to choose a suitable decomposition of the compensation filter C(ω). In general, the compensation filter is derived by solving the deconvolution problem given by Eq. (4.66). The least-squares solution of Eq. (4.66) in the frequency domain is given by C(ω) = R^+(ω) F(ω). An eigenspace expansion of C(ω) using the GSVD is given by Eq. (4.95). However, the room transfer matrix R(ω) and its pseudoinverse R^+(ω) will not be known a priori. This problem can be solved by expanding the room compensation filter as given by Eq. (4.95), but with unknown expansion coefficients C̃(ω)

C(ω) = V(ω) C̃(ω) U^H(ω) ,   (4.99)

where C̃(ω) denotes a diagonal matrix, where some diagonal elements may be zero. Using Eq. (4.99) together with Eq. (4.94a) yields the transformed signal l̃^(M)(ω) at the analysis points

l̃^(M)(ω) = R̃(ω) C̃(ω) d̃^(M)(ω) ,   (4.100)

where l̃^(M)(ω) = X^H(ω) l^(M)(ω) and d̃^(M)(ω) = U^H(ω) d^(N)(ω). Decomposition of the free-field transfer matrix, according to Eq. (4.94b), yields the desired signal in the transformed domain as

ã^(M)(ω) = F̃(ω) d̃^(M)(ω) ,   (4.101)

where ã^(M)(ω) = X^H(ω) a^(M)(ω). Using Eq. (4.100) and Eq. (4.101) allows Eq. (4.98) to be decoupled in the transformed domain

ẽ^(M)(ω) = F̃(ω) d̃^(M)(ω) − R̃(ω) C̃(ω) d̃^(M)(ω) ,   (4.102)

where ẽ^(M)(ω) denotes the error signal for all M components in the transformed domain. Since R̃(ω), C̃(ω) and F̃(ω) are diagonal matrices, the m-th component Ẽ_m(ω) of the error signal in the transformed domain is given by the following relation

Ẽ_m(ω) = F̃_m(ω) D̃_m(ω) − R̃_m(ω) C̃_m(ω) D̃_m(ω) ,   (4.103)

where R̃_m(ω), C̃_m(ω) and F̃_m(ω) denote the m-th components of the main diagonals of R̃(ω), C̃(ω) and F̃(ω), respectively.


Figure 4.23: Block diagram illustrating the eigenspace adaptive inverse filtering approach to room compensation.
The error Ẽ_m(ω) only depends on the m-th component of the respective signals and systems. Thus, Eq. (4.103) states that the MIMO adaptive inverse filtering problem can be decomposed into M SISO adaptive inverse filtering problems using the GSVD. The computation of the room compensation filters can be performed independently for each of the M transformed components. The transformation of the systems and signals is performed by transforming them into the joint eigenspace of R(ω) and F(ω) using the GSVD. Therefore, this approach will be termed eigenspace adaptive inverse filtering in the following. Please note that the transformation does not depend on the driving signals. Figure 4.23 illustrates the eigenspace adaptive inverse filtering approach. Up to now it was assumed that R(ω) and F(ω) have full rank. However, if the transfer matrix R(ω) or F(ω) does not have full rank, then Eq. (4.103) does not have to be evaluated for all M transformed components.
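The perfect-compensation case of Eq. (4.103) can be verified with synthetic GSVD factors (assumed, square full-rank case): choosing C̃_m(ω) = F̃_m(ω)/R̃_m(ω) component-wise zeroes the error for any driving signal:

```python
import numpy as np

rng = np.random.default_rng(4)
M = 4

def unitary(n, rng):
    """Random n x n unitary matrix via QR of a complex Gaussian matrix."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, _ = np.linalg.qr(A)
    return Q

# Synthetic full-rank GSVD factors (assumed for illustration).
X, V, U = unitary(M, rng), unitary(M, rng), unitary(M, rng)
Rt = np.diag(rng.uniform(0.5, 2.0, M))       # diagonal R~(w)
Ft = np.diag(rng.uniform(0.5, 2.0, M))       # diagonal F~(w)
R = X @ Rt @ V.conj().T                      # room transfer matrix
F = X @ Ft @ U.conj().T                      # free-field transfer matrix

# Eq. (4.103) component-wise: C~_m = F~_m / R~_m zeroes each SISO error.
Ct = np.diag(np.diag(Ft) / np.diag(Rt))
C = V @ Ct @ U.conj().T                      # Eq. (4.99)

d = rng.standard_normal(M) + 1j * rng.standard_normal(M)
e = F @ d - R @ C @ d                        # Eq. (4.98)
assert np.allclose(e, 0.0)
```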
In the following, the theory of multichannel adaptive inverse filtering presented in Section 4.4.3 will be specialized to the derived system decoupling. Due to the decoupling, the cost function ξ(c̃, k) given by Eq. (4.71) can be minimized independently for each component m. The error signal for the m-th component in the transformed domain is given by Eq. (4.103). Applying the procedure outlined in Section 4.4.3 yields the normal equation in the transformed domain as

Φ̃_dd,m(k) c̃_m(k) = Φ̃_da,m(k) ,   (4.104)

where Φ̃_dd,m(k) denotes the time-averaged auto-correlation matrix of the m-th component of the transformed filtered loudspeaker driving signal, Φ̃_da,m(k) the corresponding cross-correlation matrix between the filtered loudspeaker driving signal and the desired signal, and c̃_m(k) the filter coefficients of the m-th compensation filter. The auto-correlation matrix Φ̃_dd,m(k) has the dimensions Nc × Nc. Due to this reduction in dimensionality, the solution of the M equations (4.104) is much more efficient than the adaptation using the original (not transformed) signals. Equation (4.104) corresponds to the well-known single-channel normal equation. There may still be time-domain correlations present in the filtered input signals which may cause problems when solving the normal equation (4.104). However, there are numerous approaches known in the literature on single-channel adaptive inverse filtering to overcome these problems [Hay96]. Please note that the spatial correlations present in Φ_dd(k) have been removed in the transformed domain by the spatial decoupling of the MIMO systems. Thus, the non-uniqueness problem discussed in Section 4.4.4 is alleviated as well.
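For one transformed component, the normal equation (4.104) reduces to an ordinary single-channel least-squares inverse filtering problem. The sketch below uses hypothetical SISO responses (a short minimum-phase sequence for R̃_m, a pure delay as the desired response F̃_m) and a white driving signal:

```python
import numpy as np

rng = np.random.default_rng(3)
Nc = 32                               # compensation filter length
d = rng.standard_normal(256)          # transformed driving signal (one component)
r = np.array([1.0, 0.5, 0.25])        # hypothetical SISO room response R~_m
f = np.array([0.0, 0.0, 1.0])         # desired free-field response F~_m: pure delay

x = np.convolve(d, r)[:len(d)]        # filtered driving signal
a = np.convolve(d, f)[:len(d)]        # desired signal

# Build the Nc x Nc time-averaged auto-correlation matrix and the
# cross-correlation vector, then solve the single-channel normal
# equation (4.104) for this component.
X = np.column_stack([np.roll(x, i) for i in range(Nc)])
for i in range(Nc):
    X[:i, i] = 0.0                    # remove wrap-around samples
Phi_dd = X.T @ X / len(d)
phi_da = X.T @ a / len(d)
c = np.linalg.solve(Phi_dd, phi_da)

e = a - X @ c                         # residual error after compensation
assert np.mean(e**2) < 1e-3 * np.mean(a**2)
```

Since the assumed room response is minimum-phase, a short FIR filter nearly inverts it; in general a modeling delay and regularization would be needed.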
There exist approaches to additionally diagonalize the single-channel auto-correlation matrices, e.g. frequency-domain adaptive filtering (FDAF) using the discrete Fourier transformation [Hay96, BBK03]. Since the eigenfunctions of LTI systems are exponential functions, the FDAF approach is based on an idea equivalent to the one presented here for the spatial decoupling of a MIMO system.

4.5.4 Wave Domain Adaptive Inverse Filtering

In the previous sections an eigenspace approach to adaptive inverse filtering was developed. Its main benefit is the decoupling of the MIMO adaptive inverse filtering problem into a series of single-channel adaptive inverse filtering problems. This way, the complexity of computing room compensation filters for massive multichannel systems is reduced significantly. Thus, the second and third fundamental problems formulated at the beginning of Section 4.4.4 have been solved. However, the first problem still remains. More precisely, the major drawbacks of the presented eigenspace adaptive filtering approach are:
1. The computation of the joint and right singular matrices X(ω) and V(ω) requires a-priori knowledge of the room transfer matrix R(ω),
2. the computation of the joint and right singular matrices is quite complex, and
3. the transformation of the loudspeaker driving signals, the filtered loudspeaker driving signals and the measured signals may become complex without an efficient algorithm.
These problems emanate mainly from the fact that the GSVD is a data-dependent transformation. In general, no optimizations can be performed without placing restrictions on the structure of the transfer matrices F(ω) and R(ω). However, it is known that the listening room transfer matrix is constructed from a sampled version of the Green's function. The Green's function itself has to fulfill the wave equation and the homogeneous boundary conditions imposed by the room. Thus, it should be possible to construct an analytic transformation from this knowledge.

Figure 4.24: Block diagram of the wave domain adaptive inverse filtering approach to active room compensation.

Comparing the series representation of the Green's function of the listening room given by the FTM (4.52) with the series representation of the listening room transfer matrix given in terms of its SVD (4.91) shows that both expansions exhibit very similar structures. Both expand the Green's function and its sampled representation into a series of kernels and adjoint kernels in the case of the FTM, and of left and right singular vectors in the case of the SVD. Hence, this similarity can be used to construct analytic transformations with the potential of incorporating optimizations. But this will still not solve the first problem mentioned above.
The basic idea to overcome this problem is to give up the idea of a perfect diagonalization of the MIMO system in favor of a generic transformation which is, to some degree, independent of the listening room characteristics. In order to still benefit from the transformation of the MIMO system, the generic transformation should compact the MIMO system to its main diagonal elements and as few off-diagonal elements as possible. Choices for suitable transformations include, e.g., free-field expansions of the Green's functions or statistics-based transformations like the Karhunen-Loève transformation (KLT) [JN84] that analyze several listening rooms in order to derive a suitable transformation. Since these generic transformations inherently have to account for the wave nature of sound in order to perform well, this approach will be termed wave domain adaptive (inverse) filtering (WDAF), and the transformed domain the wave domain, in the following.
Based on the approach of eigenspace adaptive filtering and the above considerations, a generic block diagram of the WDAF approach can be developed. Figure 4.24 displays this generic block diagram. The signal and system transformations are performed by three generic transformations. Their structure is not limited to the MIMO FIR systems derived from the GSVD. Transformation T1 transforms the driving signals into the wave domain, T2 inversely transforms the filtered loudspeaker driving signals from the wave domain, and T3 transforms the signals at the analysis points into the wave domain. The signals and transfer functions in the wave domain are denoted by a tilde over the respective variable, since suitable transforms will be based on the idea of a transformation into the eigenspace of the respective systems. The adaptation is then performed entirely in the wave domain. If the transformations perfectly decouple the MIMO system, then a series of single-channel inverse filtering problems in the wave domain results.
Please note that the generic block diagram depicted in Fig. 4.24 also includes the eigenspace adaptive inverse filtering and multi-point equalization approaches. In the former case, the transformations are given as the following MIMO FIR systems (see Section 4.5.3): T1 = U^H(ω), T2 = V(ω) and T3 = X^H(ω). In the latter case, the transformations equal unit matrices.
The concept of WDAF as introduced in this section exhibits strong similarities to the theoretical concepts used for the coding of digital signals. Especially the frequently utilized approach of transform coding [JN84] is based on the same fundamental principle: a transformation of the digital signal into a transformed domain should yield a more compact representation than in the original domain. Typically, this transformed representation is then quantized to further compact the data. The optimal transformation in the sense of the fewest coefficients in the transformed domain is also provided by the SVD. However, the transformation matrices then depend on the data to be coded, which is undesirable. As a solution to this problem, suboptimal transformations are used which perform well in terms of decorrelation and energy compaction. The optimal transformation in these terms is given by the Karhunen-Loève transformation [JN84]. For natural still images, the discrete cosine transformation (DCT) has proven to provide a well-suited transformation for a wide variety of images [JN84, AR75, SGP+95]. In audio coding, the modified discrete cosine transformation (MDCT) is often employed due to its energy compaction properties for typical audio signals [WVY00].
The next section will introduce suitable transformations for the approximate decoupling
of the listening room transfer matrix.

4.5.5 Approximate Decoupling of the Listening Room Transfer Matrix

The previous section introduced WDAF. The basic idea behind this concept, in contrast to eigenspace adaptive filtering, is to utilize a data-independent transformation to transform the MIMO adaptation problem into a more compact representation. This section will introduce two analytic data-independent transformations for this purpose. They are based on free-field wave field representations and thus neglect the influence of the listening room. Hence, they will provide a decoupling of the free-field transfer matrix F(ω), but not a perfect decoupling of the listening room transfer matrix R(ω). However, both transformations can be implemented quite efficiently for special analysis geometries. Their performance in the context of WDAF-based listening room compensation will be evaluated in Section 5.3. Especially the decomposition into circular harmonics has proven its suitability for active listening room compensation in typical rectangular listening rooms.
4.5.5.1 Decomposition into Plane Waves

The Green's function G(x|x0, ω) can be interpreted as the transfer function from a source point x0 to a receiver point x with respect to the boundary conditions imposed by the listening room. Section 2.2 stated that plane waves are eigensolutions of the free-field wave equation formulated in Cartesian coordinates. It was further shown in Section 3.3 that arbitrary wave fields can be decomposed into plane waves by the plane wave decomposition. In order to describe the room characteristics for the propagation of plane waves, it is reasonable to define a transfer function for the propagation of plane waves similar to the Green's function G(x|x0, ω). This function will be termed the plane wave Green's function G̃(θ|θ0, ω) in the sequel. The plane wave Green's function can be interpreted as the transfer function from an excitation resulting in a plane wave with incidence angle θ0 to a resulting plane wave with the incidence angle θ.
It was derived in Section 2.4.4 that the wave field emitted by a planar source under free-field conditions is a plane wave. Hence, the model of a planar source can be used to generate a plane wave in the listening room. The wave field produced by a planar source inside the listening room is given by introducing the inhomogeneous part belonging to a planar source (see Eq. (2.45)) into Eq. (4.43)

P_pw,θ0(x, ω) = 2jk ∫ δ(n_0^T x_0) G(x|x_0, ω) dV_0 ,   (4.105)

where n_0 denotes the normal vector of the planar source (plane wave) and G(x|x0, ω) a suitably chosen Green's function which conforms to the homogeneous boundary conditions of the listening room. The normal vector n_0 depends on the incidence angle θ_0 of the produced plane wave, n_0 = n_0(θ_0). The plane wave Green's function G̃(θ|θ0, ω) can be derived by performing a plane wave decomposition of the wave field produced by the planar source in the listening room

G̃(θ|θ0, ω) = P{P_pw,θ0(x, ω)} .   (4.106)

In general, G̃(θ|θ0, ω) will depend on the wave equation and its boundary conditions. The plane wave Green's function describes the propagation of plane waves under the given boundary conditions. For free-field propagation conditions, the excited plane wave will

Figure 4.25: Transformation of the listening room response R(ω) into its plane wave representation R̃(ω) using the plane wave decomposition P.
be the only one present. Thus, the plane wave representation of the free-field Green's function G_0(x|x_0, ω) is given as

G̃_0(θ|θ_0, ω) = δ(θ − θ_0) .   (4.107)

It was stated in Section 4.4.1 that the listening room transfer matrix R(ω) can be interpreted as a spatially sampled version of the Green's function from the synthesis points to the analysis points. Thus, it can be transformed into its plane wave representation R̃(ω) using suitable transformations. The discrete plane wave decomposition allows to decompose the wave field captured at the analysis points into its plane wave representation. However, care has to be taken that the reproduction system generates a plane wave with incidence angle θ_0 for the measurement of the plane wave room transfer matrix R̃(ω). For a continuous secondary source distribution, the driving signals for the reproduction of a plane wave decomposed wave field were derived in Section 4.1.5 and are given by Eq. (4.18). For the special choice P̄(θ, ω) = δ(θ − θ_0), Eq. (4.18) generates the driving signals for a Dirac-shaped plane wave with incidence angle θ_0. A suitably discretized version of Eq. (4.18) can then be used for the computation of R̃(ω).
Figure 4.25 illustrates the transformation of the listening room transfer matrix into its plane wave representation. The number of plane wave components is denoted by M and the number of secondary sources by N. The transformation D_pw denotes the calculation of suitable driving signals from the plane wave decomposed desired field w̃^(M)(ω) using a discrete formulation of Eq. (4.18). The plane wave transfer matrix R̃(ω) describes the influence of the listening room on the desired field w̃^(M)(ω) in terms of plane waves.
4.5.5.2 Decomposition into Circular Harmonics

It was shown in Section 2.3.2 that arbitrary wave fields can be decomposed into circular harmonics. Circular harmonics are the eigensolutions of the wave equation formulated in polar coordinates. Similar to the plane wave Green's function G̃(θ|θ_0, ω), it is reasonable to define a transfer function for the propagation of circular harmonics. This function will be termed the circular harmonics Green's function G̃(ν|ν_0, ω). The circular harmonics


Figure 4.26: Transformation of the listening room response R(ω) into its circular harmonics representation R̃(ω) using the circular harmonics decomposition.

Green's function can be interpreted as the transfer function from an excitation resulting in a circular harmonic (or multipole) of order ν_0 to a resulting circular harmonic (or multipole) of order ν. It can be computed similarly to the plane wave Green's function by introducing a suitable excitation into Eq. (4.43) and calculating the circular harmonics expansion coefficients of the resulting wave field using Eq. (3.59). In general, G̃(ν|ν_0, ω) will depend on the wave equation and its boundary conditions.

The transformations required to derive the room transfer matrix R̃(ω) in terms of circular harmonics will be summarized briefly in the following for a circular geometry of the analysis positions. The expansion coefficients in terms of circular harmonics of the captured wave field are then given by Eq. (3.102). However, care has to be taken that the reproduction system generates a circular harmonic of order ν_0 for the measurement of the circular harmonics room transfer matrix R̃(ω). Due to the secondary source selection criterion (4.9), the driving signals of the secondary sources are most conveniently computed on the basis of the plane wave decomposition. The plane wave decomposition and the decomposition into circular harmonics are connected to each other by the Fourier series (3.58). Thus, the plane wave decomposition of the desired wave field can be derived by a Fourier transformation from its circular harmonics decomposition. As for the plane wave room transfer matrix, application of a discrete formulation of Eq. (4.18) allows to conveniently compute the driving signals from the plane wave decomposition of the desired wave field.
Figure 4.26 illustrates the transformation of the listening room transfer matrix into its circular harmonics representation. The number of circular harmonics components is denoted by M. The transformation D_ch denotes the calculation of suitable driving signals from the circular harmonics decomposition of the desired field w̃^(M)(ω) using the procedure outlined above. The circular harmonics transfer matrix R̃(ω) describes the influence of the listening room on the desired field w̃^(M)(ω) in terms of circular harmonics.
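For a circular analysis geometry, the circular harmonics decomposition amounts to a spatial DFT over the array angle. The sketch below (hypothetical array parameters; a numpy-only Bessel evaluation via the integral representation) checks the sampled coefficients of a plane wave against the Jacobi-Anger expansion:

```python
import numpy as np

def bessel_j(nu, z, terms=20000):
    """Integer-order Bessel function J_nu(z) via its integral
    representation (midpoint rule), to keep the sketch numpy-only."""
    tau = (np.arange(terms) + 0.5) * np.pi / terms
    return np.mean(np.cos(nu * tau - z * np.sin(tau)))

# Sample a unit plane wave with incidence angle theta0 on a circular
# analysis contour of Q points with radius r0 (hypothetical geometry).
Q, r0, k, theta0 = 64, 1.5, 4.0, np.pi / 3
phi = 2 * np.pi * np.arange(Q) / Q
p = np.exp(1j * k * r0 * np.cos(phi - theta0))   # pressure on the circle

# Circular harmonics decomposition = spatial DFT over the angle phi.
P = np.fft.fft(p) / Q

# Jacobi-Anger expansion: the nu-th circular harmonics coefficient of
# a plane wave is j^nu * J_nu(k*r0) * exp(-j*nu*theta0).
for nu in range(-5, 6):
    analytic = 1j**nu * bessel_j(nu, k * r0) * np.exp(-1j * nu * theta0)
    assert np.allclose(P[nu % Q], analytic, atol=1e-5)
```

Spatial aliasing from orders ν ± Q is negligible here because J_ν(kr0) decays extremely fast for |ν| ≫ kr0.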

4.6 Summary

This chapter derived an improved listening room compensation system which overcomes the limitations of the traditional multi-point compensation algorithms by fulfilling the requirements stated in Section 1.2.
The first requirement stated in Section 1.2 called for a proper analysis of the wave field reproduced within the listening area. The wave field analysis techniques introduced in Chapter 3 allow the analysis of the entire wave field within the listening area by measurements taken on its boundary. The analysis can be performed by measuring at a limited number of discrete positions if the bandwidth of the analyzed wave field is limited.
The second requirement stated in Section 1.2 called for a spatial reproduction system which provides control over the reproduced wave field within the listening area. Section 4.1 introduced a generic theory of sound reproduction based on the Kirchhoff-Helmholtz integral. It was further shown in Section 4.1.6 that a spatially discrete distribution of monopoles surrounding the listening area allows to control the wave field within the listening area if the (temporal) bandwidth of the reproduced wave field is limited.
Both WFA and sound reproduction presume proper spatial sampling in order to be able to analyze and control a band-limited wave field within the listening area. The number of analysis and reproduction channels required for proper control and analysis is quite high, even for a relatively moderate bandwidth of the wave fields concerned. As a result, the application of traditional adaptive multichannel inverse modeling algorithms becomes unfeasible due to the dimensionality of the normal equation (4.83) of the adaptation problem. Additionally, the auto-correlation matrix Φ_dd(k) may be ill-conditioned due to spatio-temporal correlations.
The third requirement stated in Section 1.2 called for an improved multichannel adaptation algorithm that overcomes these problems. Section 4.5.3 and Section 4.5.4 proposed the use of eigenspace adaptive inverse filtering and WDAF in order to fulfill the third requirement. Eigenspace adaptive inverse filtering uses the GSVD to spatially decouple the MIMO adaptation problem. This decoupling results in a series of single-channel inverse adaptation problems and thus in a significant complexity reduction, since the dimensionality of the single-channel normal equation (4.104) is reduced by a factor of N^4 (where N equals the number of loudspeakers) compared to the multichannel case (see Section 4.5.1). Another effect of the decoupling is that the identification of the room transfer matrix required for the filtered-x adaptation algorithms is also decoupled into single-channel identification problems [BSK04b, BSK04c, BSK04a]. The complexity of eigenspace adaptive filtering can be reduced further by discarding components in the transformed domain with low significance.
However, the drawback of using the GSVD as the transformation for the decoupling is its dependence on the room transfer matrix. The concept of WDAF proposes the use of a generic transformation which is, to some extent, independent of the room characteristics. The transformation used for WDAF should compact the MIMO system to as few paths as possible. In the optimal case, these paths should be decoupled or should only have weak couplings in between. Two analytic transformations which may be used for this purpose were introduced in the previous section.
The following chapter will illustrate the application of WDAF to active listening room
compensation for wave field synthesis systems.


Chapter 5

Room Compensation Applied to Spatial Sound Systems
The previous chapter derived the theoretical basis of active listening room compensation. For a practical implementation of the presented techniques and algorithms, sound reproduction and WFA have to be realized by properly designed systems. The following sections discuss the practical realization of sound reproduction using the concept of wave field synthesis, the practical realization of WFA by circular microphone arrays, and active listening room compensation using WDAF.

5.1 Wave Field Synthesis

Wave field synthesis (WFS) is a sound reproduction technique which is essentially based on the Kirchhoff-Helmholtz integral. The concept of sound reproduction on this physical foundation was discussed in Section 4.1. However, in order to arrive at a realizable system, several theoretical and practical problems have to be solved. Their solutions constitute the concept of WFS, which was initially developed at the Technical University of Delft [Ber88] and has been developed further by a vital WFS research community during the past two decades [Sta97, Ver97, Vog93, dVSV94, SdVL98, dB04, Hul04, TWR03, PEBL05, Spo04, WW+04, WKRT04, STR02].
It was stated in Section 4.1.4 that a two-dimensional sound reproduction system can be realized by appropriately driving a monopole line source distribution surrounding the listening area. In theory, these secondary line sources would have to be of infinite length. In practice, however, a finite line source from the ceiling to the floor is sufficient when assuming an acoustically rigid floor and ceiling [Hul04]. Although it was shown in [Hul04] that such line sources can be realized by electrostatic loudspeakers, this solution is costly and impractical. The concept of WFS therefore utilizes a distribution of monopole point sources as secondary sources. The benefit of this choice is that closed loudspeakers constitute a reasonable approximation of point sources and thus can be used for the realization of such a system. The mismatch of secondary source types may lead to the artifacts discussed in detail in the following section. Without compensation of these artifacts, the reproduced wave field is given by Eq. (4.11) by exchanging the secondary source term constituting a line source by the one constituting a monopole point source

P(\mathbf{x}, \omega) = \frac{1}{4\pi} \oint_{\partial V} D(\mathbf{x}_0, \omega) \, \frac{e^{-jk|\mathbf{x} - \mathbf{x}_0|}}{|\mathbf{x} - \mathbf{x}_0|} \, dL_0 ,    (5.1)

where the contour \partial V delimits the listening area V (as illustrated by Fig. 4.1) and dL_0 denotes a suitably chosen line element on \partial V. The driving signal D(\mathbf{x}_0, \omega) is defined by Eq. (4.10).
For a practical system the secondary source distribution has to be discretized. The discretization of the secondary source distribution and its consequences on the reproduced wave field were discussed in Section 4.1.6. It is assumed in the following that the anti-aliasing conditions derived in Section 4.1.6 are reasonably fulfilled. For a typical WFS system these conditions are not fulfilled over the entire auditory frequency range of humans. Thus, aliasing will be present in the reproduced wave field. However, for reproduction purposes aliasing artifacts do not play a dominant role since the human auditory system does not seem to be very sensitive to spatial aliasing [dB04, Sta97, Wit05]. A distance of \Delta x = 10 \ldots 30 cm between the secondary source positions has proven to be suitable in practice. Unfortunately, aliasing poses limits for active listening room compensation since no control can be gained within the listening area above the aliasing frequency.
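As an illustration, a frequently cited rule of thumb estimates the worst-case spatial aliasing frequency of a sampled secondary source distribution as f_al = c/(2 \Delta x). The following sketch evaluates it for the spacing range quoted above; the rule of thumb and the constant c = 343 m/s are assumptions taken from the general WFS literature, not derived in this chapter:

```python
# Worst-case spatial aliasing frequency f_al = c / (2 * dx) for a
# discretized secondary source distribution (rule-of-thumb estimate).
C = 343.0  # speed of sound in m/s (assumed)

def aliasing_frequency(dx: float) -> float:
    """Return the worst-case aliasing frequency in Hz for a spacing dx in m."""
    return C / (2.0 * dx)

for dx in (0.10, 0.20, 0.30):
    print(f"dx = {dx:.2f} m -> f_al = {aliasing_frequency(dx):.0f} Hz")
```

For \Delta x = 10 \ldots 30 cm this yields roughly 1.7 kHz down to about 570 Hz, consistent with the approximately 1 kHz figure quoted in Section 5.1.2.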
Concluding the above considerations, a WFS system can be realized by using closed loudspeakers which surround the listening area. These loudspeakers should be leveled with the listener's ears for optimal auralization results. The listening area and the surrounding loudspeakers may have arbitrary shapes. Examples of WFS systems will be shown in Section 5.1.5. However, care has to be taken to compensate for the use of monopole point sources instead of line sources as secondary sources. The following section will investigate this in detail.

5.1.1 Correction of Secondary Source Type Mismatch

In order to analyze and correct the artifacts caused by using secondary point sources instead of line sources for two-dimensional sound reproduction, the geometry depicted in Fig. 5.1 is considered. The listening area V and its surrounding contour \partial V are located in the z = 0 plane. The surface \partial V_\infty is generated by extending \partial V infinitely into both z-directions. This special choice allows the reproduction geometry to degenerate from the three-dimensional case to the desired two-dimensional case. By comparing the resulting solution with Eq. (5.1), the artifacts of using point sources will be derived.
The wave field generated by a distribution of monopole point sources placed on the surface \partial V_\infty is covered by Eq. (4.8).

Figure 5.1: Illustration of the geometry used to derive the artifacts of using secondary monopole point sources for two-dimensional reproduction.

Its specialization to the geometry illustrated in Fig. 5.1 is given as

P_C(\mathbf{x}_C, \omega) = \frac{1}{4\pi} \oint_{\partial V_\infty} D_C(\mathbf{x}_{C,0}, \omega) \, \frac{e^{-jk|\mathbf{x}_C - \mathbf{x}_{C,0}|}}{|\mathbf{x}_C - \mathbf{x}_{C,0}|} \, dS_0 ,    (5.2)

where the monopole driving function is defined according to Eq. (4.10) and dS_0 denotes a suitably chosen surface element on \partial V_\infty. For this specialized geometry, the vector \mathbf{x}_{C,0} on the surface can be expressed by a vector \bar{\mathbf{x}}_{C,0} on the contour \partial V and an offset in z-direction

\mathbf{x}_{C,0}(\bar{\mathbf{x}}_{C,0}, z_0) = \bar{\mathbf{x}}_{C,0} + [\,0 \;\; 0 \;\; z_0\,]^T .    (5.3)

It was shown in Section 3.3 that arbitrary wave fields can be decomposed into plane waves. Thus, it is sufficient to derive the artifacts for the reproduction of a plane wave. The driving function for the reproduction of a plane wave is given as

D_{C,pw}(\mathbf{x}_{C,0}, \omega) = 2\, jk\, P(\omega)\, a(\mathbf{x}_{C,0}) \cos\varphi \; e^{-j \mathbf{k}_{C,0}^T \mathbf{x}_{C,0}} ,    (5.4)

where \varphi denotes the angle between the surface normal \mathbf{n} on \partial V_\infty and the wave vector \mathbf{k}_0 of the plane wave (\varphi = \angle(\mathbf{n}, \mathbf{k}_0)) and P(\omega) the spectrum of the plane wave. Due to the special geometry it is possible to split the integration path of Eq. (5.2) into an integration along the closed curve \partial V and an integration along the z-direction. Please note that the surface normal \mathbf{n} is independent of z_0. Introducing D_{C,pw}(\mathbf{x}_{C,0}, \omega) into Eq. (5.2) and splitting the integration path yields

P_C(\mathbf{x}_C, \omega) = \frac{1}{2\pi} \oint_{\partial V} P(\omega)\, a(\bar{\mathbf{x}}_{C,0})\, jk \cos\varphi \int_{-\infty}^{\infty} \frac{e^{-j\left( k|\mathbf{x}_C - \mathbf{x}_{C,0}(\bar{\mathbf{x}}_{C,0}, z_0)| + \mathbf{k}_{C,0}^T \bar{\mathbf{x}}_{C,0} \right)}}{|\mathbf{x}_C - \mathbf{x}_{C,0}(\bar{\mathbf{x}}_{C,0}, z_0)|} \, dz_0 \, dL_0 .    (5.5)

The inner integral can be approximated, for not too small wave numbers k and distances |\mathbf{x}_C - \mathbf{x}_{C,0}(\bar{\mathbf{x}}_{C,0}, z_0)|, using the stationary phase method. Details on this method and its application to the inner integral can be found in Appendix C.2. This approximation of the inner integral leads to the following result

P_C(\mathbf{x}_C, \omega) \approx \frac{1}{4\pi} \oint_{\partial V} \sqrt{\frac{2\pi\, |\mathbf{x}_C - \bar{\mathbf{x}}_{C,0}|}{jk}} \; D_{C,pw}(\bar{\mathbf{x}}_{C,0}, \omega) \, \frac{e^{-jk|\mathbf{x}_C - \bar{\mathbf{x}}_{C,0}|}}{|\mathbf{x}_C - \bar{\mathbf{x}}_{C,0}|} \, dL_0 .    (5.6)

Equation (5.6) states that the contributions of all secondary sources on a line parallel to the z-axis through the point \bar{\mathbf{x}}_{C,0} can be approximated by one secondary point source placed at the stationary phase point z_{0,s} = 0. Comparing Eq. (5.6) with Eq. (5.1) shows that the choice of secondary point sources for WFS leads to amplitude and spectral artifacts in the reproduced wave field. These artifacts can be corrected to some extent, as will be shown later in this section. Performing the same steps as above additionally for the dipole contributions in the Kirchhoff-Helmholtz integral yields the 2.5D Kirchhoff-Helmholtz integral [Sta97].
The amplitude artifacts result from the mismatch in secondary source types occurring in WFS. The fields emitted by point and line sources are given by Eq. (2.27) and Eq. (2.40), respectively. Point sources exhibit an amplitude decay which is inversely proportional to the distance. The amplitude decay of a line source can be derived from the far-field (kr \gg 1) approximations of the Hankel functions as given by Eq. (2.22). Please note that the approximations (2.22) can be derived in a straightforward manner by applying the stationary phase method to the Hankel functions. They state that line sources exhibit an amplitude decay which is inversely proportional to the square root of the distance. Hence, both source types exhibit different amplitude decays over distance. Additionally, point sources exhibit a flat frequency response, whereas line sources exhibit a frequency response which is inversely proportional to the square root of the frequency in the far-field (kr \gg 1).
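The different decay laws can be made concrete by a short numerical check (a sketch under the stated decay laws; the function names are illustrative): doubling the distance attenuates a point source by about 6 dB, a line source by only about 3 dB.

```python
import math

def point_source_level_drop(r1: float, r2: float) -> float:
    """Level difference in dB of a 1/r point-source decay between r1 and r2."""
    return 20.0 * math.log10(r2 / r1)

def line_source_level_drop(r1: float, r2: float) -> float:
    """Level difference in dB of a 1/sqrt(r) line-source decay between r1 and r2."""
    return 10.0 * math.log10(r2 / r1)

# Doubling the distance:
print(point_source_level_drop(1.0, 2.0))  # ~6.02 dB
print(line_source_level_drop(1.0, 2.0))   # ~3.01 dB
```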
The amplitude and spectral artifacts can be compensated at a point \mathbf{x} by using a modified driving function D_{corr}(\mathbf{x}_0, \omega) derived from Eq. (5.6)

D_{corr,pw}(\mathbf{x}_0, \omega) = \sqrt{\frac{2\pi\, |\mathbf{x} - \mathbf{x}_0|}{jk}} \; D_{pw}(\mathbf{x}_0, \omega) .    (5.7)

Equation (5.7) states that the driving signal has to be modified in its amplitude and frequency characteristics. The amplitude correction depends on the secondary source position \mathbf{x}_0 and the receiver position \mathbf{x} within the listening area. Thus, a compensation of the amplitude artifacts seems to be possible only for one particular point within V. This is not desirable, since an optimal spatial audio reproduction system should reproduce the desired wave field without any artifacts at all positions within the listening area. An amplitude correction for a reference line instead of a reference point is also possible for certain geometries of the secondary sources and the reference line. For the reproduction of virtual point sources this was shown in [Sta97]. The amplitude correction included in Eq. (5.6) can be used to estimate the amplitude error of an arbitrarily shaped WFS system. This will be shown in Section 5.1.3.
The results also indicate that no phase errors are present in the reproduced wave field using WFS when applying the necessary compensation given by Eq. (5.7). Especially for active room compensation this is an important result. Please note that the derived correction is only valid for the reproduction of plane waves. Similar results for the reproduction of virtual point sources can be found in [Sta97, Ver97, Vog93].
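The correction in Eq. (5.7) splits into a position-dependent gain \sqrt{2\pi|\mathbf{x} - \mathbf{x}_0|} and a spectral pre-filter 1/\sqrt{jk}, i.e. a -3 dB per octave magnitude response with a constant -45° phase. A minimal frequency-domain sketch of this correction (the function and variable names are illustrative and c = 343 m/s is an assumed constant, not values taken from the thesis):

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def corrected_driving_spectrum(D_pw: np.ndarray, f: np.ndarray,
                               dist_ref: float) -> np.ndarray:
    """Apply the 2.5D correction sqrt(2*pi*dist_ref / (j*k)) of Eq. (5.7)
    to an uncorrected driving spectrum D_pw sampled at frequencies f."""
    k = 2.0 * np.pi * f / C          # wave numbers
    k[k == 0] = np.finfo(float).eps  # avoid division by zero at DC
    return np.sqrt(2.0 * np.pi * dist_ref / (1j * k)) * D_pw

# Example: flat unit spectrum, reference distance 1.5 m (e.g. array center)
f = np.array([200.0, 400.0, 800.0])
D_corr = corrected_driving_spectrum(np.ones_like(f, dtype=complex), f, 1.5)
```

Doubling the frequency reduces the magnitude by the factor \sqrt{2} (i.e. 3 dB), and the phase of each sample is -45°, as expected from 1/\sqrt{jk}.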

5.1.2 Artifacts of WFS and their Impact on Room Compensation

The previous section derived that WFS is not capable of correctly reproducing the amplitude of the desired wave field throughout the entire listening area. In general, the amplitude of the desired wave field can only be reproduced correctly at one reference position (or reference line). A WFS system may also exhibit other artifacts besides these amplitude errors. This section will briefly discuss four types of artifacts of WFS and their impact on active listening room compensation. The artifacts of WFS are:
1. Spatial aliasing
In a practical realization the continuous distribution of secondary point sources will be realized by point sources placed at discrete positions. This spatial sampling of the secondary source distribution was discussed in detail in Section 4.1.6. Spatial sampling will result in spatial aliasing artifacts in the reproduced wave field if the anti-aliasing conditions are not met. Most WFS systems are designed for wide-band reproduction of sound. However, due to complexity considerations the distance between the sampled secondary sources is typically chosen such that the anti-aliasing conditions are not fulfilled for the entire frequency range. As a result, spatial aliasing artifacts are present in the reproduced wave field. Fortunately, the human auditory system does not seem to be very sensitive to these artifacts if the sampling of the secondary sources is performed in accordance with a resulting minimum temporal aliasing frequency of about 1 kHz [dB04, Sta97, Wit05].
2. Truncation and diffraction
Practical realizations of WFS systems will have finite dimensions and may have bends in their secondary source contours or non-closed contours. This may lead to truncation and diffraction artifacts in the reproduced wave field. The impact of these artifacts can be decreased to some extent by applying spatial tapering to the driving signals. For a detailed discussion of these artifacts and countermeasures please refer to [Sta97, Ver97].

WFS artifact                           Impact on room compensation
1. spatial aliasing                    provides upper frequency limit of active room compensation
2. truncation and diffraction errors   limits achievable performance
3. 2D restriction                      limits achievable suppression of elevated reflections and performance for out-of-plane listeners
4. amplitude errors                    limits achievable performance for different listener positions

Table 5.1: Overview of the artifacts of WFS and their impact on active room compensation.
3. Restriction to two-dimensional reproduction
Typical WFS systems are limited to reproduction in two dimensions. In addition to the discussed artifacts, this limitation has consequences for the control a WFS system has over the reproduced wave field. A two-dimensional WFS system is only capable of acoustically controlling the plane in which the secondary sources are placed. The wave field above and below this plane will exhibit artifacts [Sta97].
4. Amplitude errors
The secondary source type mismatch of a two-dimensional WFS system results in the spectral and amplitude errors derived in Section 5.1.1. The spectral errors can be corrected for all listener positions within the listening area. However, the amplitude errors can in general only be corrected for one listener position (or reference line). As a result, the reproduced wave field will exhibit position-dependent amplitude errors.
Table 5.1 summarizes the artifacts of WFS. For a properly designed WFS system and typical auralization scenarios these effects play no dominant role. However, for active room compensation they pose limits on the achievable performance. Spatial aliasing (1) limits the frequency up to which an application of room compensation is possible, since above this frequency WFS gains no proper control over the reproduced wave field. The truncation and diffraction errors (2) pose limits to room compensation in the sense that they introduce artifacts into the wave field reproduced for destructive interference. The limitation to two-dimensional reproduction (3) constricts the suppression of reflections in two ways. On the one hand, reflections emerging from boundaries outside the reproduction plane (elevated reflections) cannot be compensated for the entire listening area. On the other hand, the performance of active room compensation will decrease for listener positions above or below the listening area. The amplitude errors (4) limit the suppression a WFS system can gain by destructive interference. Perfect compensation is only possible at the reference position (line); outside of this position (line) the listening room reflections cannot be compensated perfectly. These amplitude errors have to be taken into account when prescribing a desired wave field for the adaptation process.
In the following section, the amplitude errors and the suppression of elevated reflections will be analyzed in more detail for active room compensation using a circular loudspeaker array.

5.1.3 Quantitative Analysis of the Artifacts for Circular WFS Systems

The previous section introduced the artifacts of WFS systems on a qualitative level. The
following section will analyze the artifacts of circular WFS systems and their impact
on room compensation in a more quantitative fashion. The reason for considering this
specialized geometry is that a circular array was used for the experimental validation of
the proposed methods in Section 5.3. In the sequel the reproduction of plane waves will
be considered. It is sufficient to derive the artifacts of a circular WFS system for one
fixed incidence angle of the reproduced plane wave only, since circular arrays are radially
symmetric. The resulting characteristics for arbitrary wave fields can then be derived
easily from the presented results. Some of the results shown in the following have been
presented in [SRR05].
Figure 5.2 illustrates the geometry of a circular WFS system. Specialization of Eq. (5.1) to the desired geometry, together with the corrected plane wave driving function (5.7), yields the wave field P_P(\mathbf{x}_P, \omega) reproduced by the circular WFS system as

P_P(\mathbf{x}_P, \omega) = \sqrt{\frac{R^3}{8\pi\, jk}} \int_{\theta_0 + \pi/2}^{\theta_0 + 3\pi/2} D_{P,pw}(\alpha_0, R, \omega) \, \frac{e^{-jk\, r'}}{r'} \, d\alpha_0 ,    (5.8)

where \theta_0 denotes the incidence angle of the plane wave, R the radius of the loudspeaker array and D_{P,pw} the uncorrected driving function for a plane wave. The integration limits in Eq. (5.8) are chosen in accordance with the effect of the window function a(\alpha_0) given by Eq. (4.9). The uncorrected driving function D_{P,pw} can be derived from Eq. (4.40) as

D_{P,pw}(\alpha_0, R, \omega) = 2\, jk\, P(\omega) \cos(\alpha_0 - \theta_0) \, e^{-jkR\cos(\alpha_0 - \theta_0)} ,    (5.9)

where \theta_0 denotes the incidence angle of the plane wave and P(\omega) its spectrum. The amplitude correction included in Eq. (5.8) was chosen such that a correct amplitude is obtained at the center of the circular array. The distance r' between a secondary source at angle \alpha_0 and a listener position (r, \theta) depends on the integration variable and is given as

r'(\alpha_0, \theta, r, R) = \sqrt{R^2 + r^2 - 2Rr\cos(\alpha_0 - \theta)} .    (5.10)

Figure 5.2: Illustration of the geometric parameters used to describe the wave field reproduced by a circular WFS system.
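A direct numerical evaluation of Eqs. (5.8)-(5.10) can be sketched as follows. This is a simplified illustration, not the simulation code used for the thesis; the constant c = 343 m/s, the symbol names and the simple Riemann-sum quadrature are assumptions:

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def circular_wfs_field(r, theta, f=400.0, theta0=np.pi / 2, R=1.5, n=2000):
    """Numerically evaluate Eq. (5.8) at the listening position (r, theta)
    for a plane wave with incidence angle theta0 reproduced by a circular
    WFS system of radius R. A unit plane-wave spectrum P(omega) = 1 is used."""
    k = 2.0 * np.pi * f / C
    alpha0 = np.linspace(theta0 + np.pi / 2, theta0 + 3 * np.pi / 2, n)
    # Eq. (5.9): uncorrected plane-wave driving function on the circle
    D = 2.0 * 1j * k * np.cos(alpha0 - theta0) \
        * np.exp(-1j * k * R * np.cos(alpha0 - theta0))
    # Eq. (5.10): distance between secondary source and listening position
    rp = np.sqrt(R**2 + r**2 - 2.0 * R * r * np.cos(alpha0 - theta))
    integrand = D * np.exp(-1j * k * rp) / rp
    dalpha = alpha0[1] - alpha0[0]
    return np.sqrt(R**3 / (8.0 * np.pi * 1j * k)) * np.sum(integrand) * dalpha

# At the array center the amplitude correction is exact, so the magnitude
# should be close to one (up to the stationary phase approximation error):
print(abs(circular_wfs_field(r=0.0, theta=0.0)))
```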

Equation (5.8) together with Eq. (5.9) and Eq. (5.10) constitutes the mathematical description of the wave field reproduced by a circular WFS system. The effects of secondary
source sampling for a circular reproduction system have been discussed in Section 4.1.6.2.
The following two sections will quantitatively analyze the effects of the amplitude errors
and the two-dimensional restriction on room compensation using a circular WFS system.
5.1.3.1 Amplitude Errors

Section 5.1.1 illustrated the application of the stationary phase approximation to compensate for the secondary source type mismatch of WFS. This approximation can further be used to derive an upper bound for the amplitude error for the reproduction of a plane wave. The amplitude of the reproduced wave field is given by introducing Eq. (5.4) into Eq. (5.6)

|P(\mathbf{x}, \omega)| = |P(\omega)| \left| \sqrt{\frac{jk}{2\pi}} \oint_{\partial V} a(\mathbf{x}_0) \, \frac{\cos\varphi}{\sqrt{|\mathbf{x} - \mathbf{x}_0|}} \, e^{-j\mathbf{k}_0^T \mathbf{x}_0} \, e^{-jk|\mathbf{x} - \mathbf{x}_0|} \, dL_0 \right| ,    (5.11)

where \varphi denotes the angle between the surface normal \mathbf{n} on \partial V and the wave vector \mathbf{k}_0 of the plane wave to be reproduced. An upper bound for the amplitude error is derived by shifting the calculation of the absolute value into the integral and noting that the exponential functions have unit magnitude. This results in the following upper bound

|P(\mathbf{x}, \omega)| \le |P(\omega)| \, \sqrt{\frac{k}{2\pi}} \oint_{\partial V} a(\mathbf{x}_0) \, \frac{|\cos\varphi|}{\sqrt{|\mathbf{x} - \mathbf{x}_0|}} \, dL_0 .    (5.12)

Equation (5.12) can be specialized in a straightforward manner to the geometry depicted in Fig. 5.2. The limiting effect of the amplitude errors on room compensation can be estimated by calculating the error between the reproduced wave field P(\mathbf{x}, \omega) and the wave field of the desired plane wave P_{pw}(\mathbf{x}, \omega) as follows

E(\mathbf{x}, \omega) = P(\mathbf{x}, \omega) - P_{pw}(\mathbf{x}, \omega) ,    (5.13)

where P_{pw}(\mathbf{x}, \omega) is given by Eq. (2.47). For room compensation the desired wave field P_{pw}(\mathbf{x}, \omega) can be interpreted as the reflections caused by the listening room which should be canceled by destructive interference.
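Given sampled fields P and P_pw on a grid of positions, the error measure of Eq. (5.13) and its level in dB can be computed directly. A small illustrative helper (the function name and the -240 dB clipping floor are assumptions, not from the thesis):

```python
import numpy as np

def error_level_db(P: np.ndarray, P_pw: np.ndarray) -> np.ndarray:
    """Return 20*log10|E(x, omega)| with E = P - P_pw as in Eq. (5.13),
    clipped at a small floor to avoid log of zero."""
    E = P - P_pw
    return 20.0 * np.log10(np.maximum(np.abs(E), 1e-12))

# Example: perfect reproduction at one point, 10 % amplitude error at another
P = np.array([1.0 + 0j, 1.1 + 0j])
P_pw = np.array([1.0 + 0j, 1.0 + 0j])
print(error_level_db(P, P_pw))  # second entry is -20 dB
```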
In the following, results for one particular reproduction scenario will be shown. For this purpose the reproduction of a monochromatic plane wave with a frequency of f_0 = 400 Hz and incidence angle \theta_0 = 90° on a circular array with radius R = 1.50 m is considered. The reproduced wave field and the upper bound for the amplitude were calculated by numerical evaluation of Eq. (5.8) and Eq. (5.12) for the chosen geometry and desired plane wave. The amplitude of the reproduced wave field was adjusted such that it had unit amplitude in the center. The numerical evaluation of the reproduced wave field by Eq. (5.8) also includes all other reproduction artifacts mentioned in Section 5.1.2.
Figure 5.3 illustrates the derived results for a 2 × 2 m area in the center of the listening area. Figure 5.3(a) shows a snapshot of the reproduced wave field: a plane wave with incidence angle \theta_0 = 90° and frequency f_0 = 400 Hz. At first sight, the circular system described by Eq. (5.8) seems to be capable of reproducing a plane wave without major artifacts. However, some slight deviations in the amplitude are visible. Figure 5.3(b) shows the amplitude of the reproduced plane wave. The equi-amplitude contours illustrate the amplitude variations. For the region shown the overall amplitude variation is about 8 dB. Figure 5.3(c) shows the upper bound of the amplitude error as given by Eq. (5.12) and its equi-amplitude contours. It can be seen clearly that the upper bound provides a reasonable approximation of the amplitude of the simulated wave field for the upper half (y > 0) of the region shown. For the lower half (y < 0) the error is overestimated by about 3 dB.
Figure 5.3(d) shows the amplitude of the averaged error E(\mathbf{x}, \omega). The error was averaged over one signal period of the monochromatic plane wave in order to eliminate numerical artifacts. The error is small in the vicinity of the center due to the amplitude adjustment in the center. As predicted by Figure 5.3(b), the error is smaller above the center. This is due to the fact that the secondary sources above the center are used for the reproduction of this particular plane wave. Figure 5.3(d) shows the maximum achievable position-dependent suppression that can be reached for the compensation of a plane wave by destructive interference. The results also include truncation errors due to the numerical evaluation of Eq. (5.8).

Figure 5.3: Results when reproducing a monochromatic plane wave with a circular WFS system. The desired plane wave has a frequency of f_0 = 400 Hz and an incidence angle of \theta_0 = 90°. The radius of the simulated WFS system is R = 1.50 m. (a) snapshot of the reproduced wave field; (b) amplitude of the reproduced plane wave; (c) upper bound for the amplitude error; (d) averaged error between the desired and the reproduced wave field.
5.1.3.2 Suppression of Elevated Reflections

The previous section investigated the amplitude errors of circular WFS systems and their impact on active room compensation. As stated before, a two-dimensional WFS system only has full control over the wave field within the reproduction plane. Reflections caused by boundaries outside of that plane (e.g. by the ceiling) will result in contributions which are elevated with respect to the reproduction plane. In the following, some particular results for the suppression of elevated plane waves will be shown.
The pressure field of an elevated plane wave in the reproduction plane (z = 0) is given as

P_{P,pw,\vartheta_0}(\theta, r, \omega) = P(\omega) \, e^{-jkr\cos(\theta - \theta_0)\cos(\vartheta_0)} ,    (5.14)

where \vartheta_0 denotes the elevation angle with respect to the reproduction plane. An elevation angle of \vartheta_0 = 0° denotes no elevation. As in the previous section, Eq. (5.8) was numerically evaluated. For this purpose the driving function belonging to an elevated plane wave was computed using Eq. (4.10) and introduced into Eq. (5.8). The absolute value of the error E(\mathbf{x}, \omega) defined by Eq. (5.13) is used as performance measure for the suppression of elevated contributions. Figure 5.4 shows the suppression of the incident wave field by the circular WFS system for different elevation angles of the incident plane wave. Figure 5.4(a) with an elevation angle \vartheta_0 = 0° is shown for reference. It is equal to Figure 5.3(d). As expected, an increasing elevation angle lowers the suppression of the incident field which can be achieved by WFS. For an elevation angle of \vartheta_0 = 30° the averaged error is even higher than without room compensation. As a consequence of these results, a proper damping of the ceiling and the floor in order to avoid elevated reflections seems to be mandatory for an active room compensation system using two-dimensional reproduction techniques.
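Equation (5.14) indicates why a two-dimensional system cannot cancel elevated reflections everywhere: in the reproduction plane an elevated plane wave appears with the reduced trace wavenumber k cos(\vartheta_0), i.e. with a longer apparent wavelength than any in-plane plane wave of the same frequency. A short numerical sketch (illustrative names, assumed c = 343 m/s):

```python
import math

C = 343.0  # speed of sound in m/s (assumed)

def trace_wavelength(f: float, elevation_deg: float) -> float:
    """Apparent in-plane wavelength of an elevated plane wave, cf. Eq. (5.14):
    the trace wavenumber is k*cos(elevation), so the apparent wavelength
    grows by the factor 1/cos(elevation)."""
    k = 2.0 * math.pi * f / C
    k_trace = k * math.cos(math.radians(elevation_deg))
    return 2.0 * math.pi / k_trace

f0 = 400.0
for ang in (0.0, 10.0, 20.0, 30.0):
    print(f"elevation {ang:4.1f} deg -> trace wavelength "
          f"{trace_wavelength(f0, ang):.4f} m")
```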

5.1.4 Rendering Techniques

The practical realization of a WFS system requires generating the individual driving signals for each loudspeaker. The process of generating the driving signals and reproducing the desired wave field is referred to as acoustic rendering in the following. This term is chosen in accordance with the frequently used term rendering in computer graphics for the creation of (virtual) visual scenes. The following sections will introduce two rendering techniques for WFS.

Figure 5.4: Average error between the desired and the reproduced wave field for a plane wave with a frequency of f_0 = 400 Hz, an incidence angle of \theta_0 = 90° and varying elevation angles \vartheta_0. The radius of the simulated WFS system is R = 1.50 m. (a) elevation angle \vartheta_0 = 0°; (b) elevation angle \vartheta_0 = 10°; (c) elevation angle \vartheta_0 = 20°; (d) elevation angle \vartheta_0 = 30°.

5.1.4.1 Data-based Rendering

The technique of data-based rendering auralizes a recorded or synthetically created (virtual) acoustic scene based on the knowledge of its acoustic wave field on the border of the listening region. Section 4.1.4 illustrated the reproduction of arbitrary virtual source wave fields using secondary monopole sources. The presented technique required obtaining the local propagation direction of the virtual source wave field at the loudspeaker positions. It was shown in [BdVV93] that one possibility to obtain the desired loudspeaker driving signals is to place directional microphones (cardioids) at or near the loudspeaker positions. One major drawback of this approach is that the loudspeaker and the microphone setup have to match exactly.
A more convenient way is to measure or virtually create a plane wave decomposition of the wave field generated by the virtual source. Suitable techniques for this purpose were introduced in Section 3.4. The loudspeaker driving signals can be obtained by extrapolation of the plane wave decomposed signals to the loudspeaker positions, as shown in Section 4.1.5. For optimal results, the area analyzed in order to derive the plane wave decomposition should include at least the listening area.
Typically, due to complexity restrictions, the plane wave decomposition of the spatial impulse response from one virtual source to the desired listening area is measured or simulated [HdVB02, Hul04]. The plane wave decomposition of the virtual source wave field for arbitrary excitations is obtained by (time-domain) convolution of the virtual source signal and the plane wave decomposition of the spatial impulse response, as given by Eq. (3.60). The loudspeaker driving signals for arbitrary virtual source signals can be computed as follows

d(\mathbf{x}_n, t) = d_0(\mathbf{x}_n, t) * s(t) ,    (5.15)

where d(\mathbf{x}_n, t) denotes the loudspeaker driving signal of the n-th loudspeaker, d_0(\mathbf{x}_n, t) the impulse response obtained from the (extrapolated) measurements and s(t) the virtual source signal. The impulse response d_0(\mathbf{x}_n, t) for WFS can be obtained from the plane wave decomposition of the measured wave field (spatial impulse response) by using Eq. (4.18) together with the secondary source correction (5.7). The convolution (5.15) has to be performed for each loudspeaker. Thus, reproduction of a virtual source using data-based rendering requires a multichannel convolution of the virtual source signal with the impulse responses d_0(\mathbf{x}_n, t). For a high number of loudspeakers and/or long impulse responses d_0(\mathbf{x}_n, t) this process may become computationally very complex.
Data-based rendering of acoustic scenes is a technique typically used for the high-quality reproduction of complex static acoustic scenes. Since data-based rendering is capable of reproducing arbitrary wave fields, it can also be used to generate the wave field for the compensation of the listening room reflections.
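The per-loudspeaker convolution of Eq. (5.15) can be sketched as follows. This is an illustrative NumPy fragment, not the real-time engine discussed later in this chapter; the function and variable names are assumptions:

```python
import numpy as np

def data_based_rendering(d0: np.ndarray, s: np.ndarray) -> np.ndarray:
    """Eq. (5.15): convolve the virtual source signal s with the rendering
    impulse response d0[n] of every loudspeaker n.
    d0 has shape (n_loudspeakers, ir_length), s has shape (sig_length,)."""
    n_ls, ir_len = d0.shape
    out = np.zeros((n_ls, ir_len + len(s) - 1))
    for n in range(n_ls):  # one convolution per loudspeaker channel
        out[n] = np.convolve(d0[n], s)
    return out

# Toy example: 3 loudspeakers with pure-delay impulse responses
d0 = np.eye(3)  # loudspeaker n is delayed by n samples
driving = data_based_rendering(d0, np.array([1.0, 0.5]))
```

The cost of this loop grows with the number of loudspeakers and the impulse response length, which is exactly the complexity concern noted above; practical systems use fast (block-based FFT) convolution instead of the direct form shown here.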

5.1.4.2 Model-based Rendering

Model-based rendering uses analytic spatial models for the virtual sources to calculate the appropriate driving signals for the loudspeakers. Point sources and plane waves are the most common models used for this purpose. For a plane wave, the driving signal of the n-th loudspeaker is given by combining Eq. (5.4) and Eq. (5.7) as

D_{pw}(\mathbf{x}_n, \omega) = 2\, a(\mathbf{x}_n) \cos(\varphi) \, \sqrt{2\pi\, |\mathbf{x} - \mathbf{x}_n|} \, \sqrt{jk} \; S(\omega) \, e^{-j\frac{\omega}{c} \mathbf{n}_0^T \mathbf{x}_n} ,    (5.16)

where \mathbf{n}_0 denotes the normal vector (incidence direction) of the plane wave to be reproduced, \varphi the angle between the vector \mathbf{n}_0 and the surface normal \mathbf{n} at the point \mathbf{x}_n, and S(\omega) the spectrum of the plane wave (virtual source). By transforming this equation back into the time-domain, the loudspeaker driving signals can be computed from the source signals by delaying, weighting and filtering,

d(\mathbf{x}_n, t) = 2\, a(\mathbf{x}_n) \cos(\varphi) \, \sqrt{2\pi\, |\mathbf{x} - \mathbf{x}_n|} \; \big( f(t) * s(t) \big) * \delta(t - \tau_n) ,    (5.17)

where the delay is \tau_n = \mathbf{n}_0^T \mathbf{x}_n / c and f(t) = \mathcal{F}^{-1}\{\sqrt{jk}\} denotes the inverse Fourier transform of \sqrt{jk}. Equation (5.17) can be realized very efficiently by weighting and delaying blocks of the filtered virtual source signal s(t). Only one single-channel convolution per virtual source is necessary. It can be shown that similar relations can be derived for the driving functions of a virtual point source [Ver97, dB04, Hul04]. Multiple virtual sources (e.g. plane waves) can be synthesized by superimposing the loudspeaker signals of the individual virtual sources.
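A discrete-time sketch of Eq. (5.17) might look as follows. It is an illustrative fragment under several assumptions not taken from the thesis: a sample rate of 48 kHz, c = 343 m/s, integer-sample delays realized by a circular shift, and the \sqrt{jk} pre-filter approximated crudely in the FFT domain:

```python
import numpy as np

FS = 48000.0  # sample rate in Hz (assumed)
C = 343.0     # speed of sound in m/s (assumed)

def prefilter_sqrt_jk(s: np.ndarray) -> np.ndarray:
    """Approximate f(t) * s(t) with f(t) = F^-1{sqrt(jk)} by multiplying the
    spectrum of s with sqrt(j*omega/c) (crude, ignores block windowing)."""
    S = np.fft.rfft(s)
    omega = 2.0 * np.pi * np.fft.rfftfreq(len(s), 1.0 / FS)
    return np.fft.irfft(np.sqrt(1j * omega / C) * S, n=len(s))

def plane_wave_driving_signal(s, x_n, n0, a_n, cos_phi, x_ref):
    """Eq. (5.17): weight, filter and delay the virtual source signal s for
    the loudspeaker at position x_n (positions as 2D NumPy vectors)."""
    weight = 2.0 * a_n * cos_phi * np.sqrt(2.0 * np.pi * np.linalg.norm(x_ref - x_n))
    delay = int(round(float(np.dot(n0, x_n)) / C * FS))  # tau_n in samples
    filtered = prefilter_sqrt_jk(s)
    return weight * np.roll(filtered, delay)  # circular delay for simplicity

# Toy usage: 1 kHz tone, one loudspeaker at (1.5, 0), plane wave from +x
t = np.arange(1024) / FS
d = plane_wave_driving_signal(np.sin(2 * np.pi * 1000.0 * t),
                              np.array([1.5, 0.0]), np.array([1.0, 0.0]),
                              a_n=1.0, cos_phi=1.0, x_ref=np.array([0.0, 0.0]))
```

Note that only the filtering step involves a convolution; delay and weight are per-loudspeaker scalars, which is the source of the efficiency claimed above.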
The main benefit of model-based rendering of virtual point sources and plane waves is its computational efficiency. Besides delaying and weighting of the virtual source signal, only one filtering process per virtual source is required.
Plane waves and point sources can furthermore be used to simulate classical loudspeaker setups, like stereo and 5.1 setups. Thus, WFS is backward compatible to existing sound reproduction systems and can even improve them through optimal loudspeaker positioning in small listening rooms. It is possible to place the required virtual loudspeakers at positions outside the listening room. This creates proper two- or five-channel reproduction in small listening spaces not compatible with the recommendations for correct loudspeaker placement, e.g. [ITU97]. Besides these two types of virtual sources, other source models have also been implemented successfully [CCW03, Baa05].

5.1.5 Practical Implementation of a WFS System

Up to now, the theoretical aspects of sound reproduction in general and of WFS have been discussed. The presented theory implies possible solutions for a practical implementation of WFS. The main outcome was that a discrete distribution of closed loudspeakers (monopoles), driven by appropriate driving signals, can be used to reproduce the wave field of a virtual source within the listening area. The following section briefly discusses realization aspects of WFS systems using the example of the implementations realized within the course of this work at the Chair of Multimedia Communications and Signal Processing of the University Erlangen-Nuremberg [LMS].
5.1.5.1 Hardware

The hardware of a WFS system consists of three main building blocks:

1. loudspeakers,
2. multichannel amplifiers, and
3. signal processing units (e.g. personal computers).
The loudspeakers used for WFS should, on the one hand, have rather small dimensions due to the spatial sampling requirements for the secondary sources. On the other hand, they should provide a high sound quality, since the reproduction quality of the WFS system depends on the quality of the individual loudspeakers. Three types of loudspeakers were investigated: one-way loudspeakers, two-way loudspeakers and multi-exciter panels (MEPs) [dB04, KdVBB05, BdB00, SSR05, CHP02]. The first two types are classical cone loudspeakers with closed housings. The third is a new type of multichannel loudspeaker that has been developed within the EC IST project CARROUSO [CAR, BSP01]. MEPs have a relatively simple mechanical construction. They consist of a foam board with both sides covered by a thin plastic layer. Multiple electro-mechanical exciters are glued equidistantly in a line on the back side of the board. The board is fixed on the sides and placed in a damped housing. The benefits of MEPs are their simple construction and the possibility of seamless integration into walls. Due to their multiple exciters, MEPs are designed for use with WFS. However, because of their simple mechanical construction their characteristics are not comparable to high-quality loudspeakers.
The choice of the loudspeakers used for a WFS system depends mainly on quality, cost,
and design considerations. For use as a laboratory WFS system, the loudspeakers were
mounted on a flexible mounting system consisting of several linear and circular bars. Figure 5.5 shows a U-shaped loudspeaker array consisting of 24 one-way speakers, Fig. 5.6
shows a circular WFS system consisting of 48 two-way speakers, and Fig. 5.7 a system consisting of four MEP panels (32 individual channels). Equations (5.15) and (5.17) state
that each loudspeaker channel has to be driven individually. Thus, each loudspeaker
needs its own digital/analog converter and power amplifier. In order to drive the loudspeakers, 16-channel amplifiers including digital/analog conversion have been developed.
The amplifiers use the Alesis digital audio interface (ADAT) [ALE] as input interface.
The laboratory system used multiprocessor personal computers to compute the driving
signals. For this purpose, 24-channel soundcards (RME Hammerfall series [RME]) with
Figure 5.5: U-shaped WFS system with 24 one-way loudspeakers. The size of the
listening area is approximately 1.50 m × 1.50 m.

Figure 5.6: Circular WFS system with 48 two-way loudspeakers. The listening area has
a radius of R = 1.50 m.
Figure 5.7: U-shaped WFS system consisting of four MEPs. The system has 32 individual channels.
ADAT interface were utilized. The loudspeaker driving signals are generated entirely by
the software discussed in the next section.

5.1.5.2 Software

The software utilized to generate the loudspeaker driving signals for the laboratory WFS
system runs on the Linux operating system. The software interface to the soundcards
is provided by the Advanced Linux Sound Architecture (ALSA) [ALS] together with the
JACK audio connection kit [JAC]. The JACK server acts as a real-time low-latency
patchbay for all applications that access the soundcards. A wide variety of audio software
exists for the JACK/ALSA bundle. One of the most useful applications for data-based
rendering is BruteFIR [Tor], a very efficient real-time convolution engine. Once the WFS
filters have been derived according to Section 5.1.4.1, the virtual source signals can be
convolved with BruteFIR for auralization. Using a multiprocessor workstation, the computationally complex convolutions can be performed in real-time for typical scenarios.
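Convolution engines such as BruteFIR obtain their efficiency from FFT-based block convolution rather than direct time-domain filtering. The following is a minimal overlap-save sketch of that idea (a simplification for illustration; BruteFIR itself uses a more elaborate partitioned scheme, and all names below are hypothetical):

```python
import numpy as np

def overlap_save(x, h, block=256):
    """Convolve signal x with FIR filter h using overlap-save block FFT
    processing. Returns the first len(x) samples of the convolution."""
    M = len(h)
    assert M <= block, "filter must fit into one block"
    step = block - M + 1               # new input samples consumed per block
    nfft = block
    H = np.fft.rfft(h, nfft)
    # prepend M-1 zeros so the first block has a valid input history
    xp = np.concatenate([np.zeros(M - 1), x])
    y = np.zeros(len(x))
    for n in range(0, len(x), step):
        seg = xp[n:n + nfft]
        if len(seg) < nfft:
            seg = np.pad(seg, (0, nfft - len(seg)))
        Y = np.fft.irfft(np.fft.rfft(seg) * H, nfft)
        valid = Y[M - 1:]              # discard the M-1 circularly aliased samples
        take = min(step, len(x) - n)
        y[n:n + take] = valid[:take]
    return y
```

For long WFS filters, the per-sample cost drops from O(M) to roughly O(log N), which is what makes real-time auralization with many loudspeaker channels feasible.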
As stated in Section 5.1.4.2, model-based rendering mainly requires weighting and delaying
the virtual source signals. Thus, an approach based on convolution of the source signals
with appropriate impulse responses would not be computationally efficient here. A dedicated application for model-based rendering was developed that explicitly exploits the
weighting and delay approach. The model-based rendering software provides the following
features:
- synthesis of point sources and plane waves,
- synthesis of moving point sources with arbitrary trajectories,
- interactive graphical user interface for loudspeaker and source setup,
- room effects using a mirror image model,
- source input from files or live input from ADAT/SPDIF interfaces, and
- simulation of conventional loudspeaker setups (e. g. 5.1 surround setup [ITU97]).
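The weight-and-delay rendering underlying this software can be sketched in a few lines: the delay corresponds to the propagation time from the virtual point source to each loudspeaker, and the weight here follows a simplified 1/√distance amplitude law. This is an illustrative sketch only — the actual driving functions of Eqs. (5.15) and (5.17) contain an additional spectral correction and tapering — and all names are hypothetical:

```python
import numpy as np

C = 343.0  # speed of sound in m/s

def delays_weights(speaker_pos, source_pos, fs):
    """Per-loudspeaker delay (in samples) and weight for a virtual point
    source. Simplification: weight ~ 1/sqrt(distance), no WFS prefilter."""
    dist = np.linalg.norm(speaker_pos - source_pos, axis=1)
    delays = np.round(dist / C * fs).astype(int)
    weights = 1.0 / np.sqrt(dist)
    return delays, weights

def render(signal, delays, weights, n_out):
    """One driving signal per loudspeaker: a delayed, weighted source signal."""
    out = np.zeros((len(delays), n_out))
    for ch, (k, w) in enumerate(zip(delays, weights)):
        n = min(len(signal), n_out - k)
        if n > 0:
            out[ch, k:k + n] = w * signal[:n]
    return out
```

Moving sources then amount to updating delays and weights per block, which is far cheaper than re-deriving convolution filters.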
Figure 5.8 shows a snapshot of the graphical user interface of the model-based real-time rendering software (wfsapp). The upper half of the application window shows the
loudspeaker and source setup. Sources can be moved intuitively in real-time by clicking
on the source and dragging using the computer mouse. The lower half of the application
window controls the synthesis and application parameters and the setup for the virtual
room used for the mirror image model. All source and room parameters can be changed
in real-time during playback operation.

5.2 Practical Implementation of Wave Field Analysis

It was shown in Section 4.3 that adaptive room compensation requires analyzing the
wave field reproduced in the listening area by the reproduction system. Chapter 3 introduced analysis techniques for this purpose that were based upon orthogonal wave field
expansions and measurements taken on the boundary of the region of interest. The plane
wave expansion and the expansion into circular harmonics have proven to provide suitable
bases. So far, only the theoretical aspects of wave field analysis using these techniques
have been discussed. This section will discuss some practical aspects of two-dimensional
wave field analysis in the context of room compensation using linear and circular microphone arrays.

5.2.1 Linear Microphone Arrays

The plane wave decomposition, as introduced in Section 3.3, is based on a description
of the wave field in polar coordinates. Thus, the natural choice for the microphone array geometry using boundary measurements would be a circular one. However, for some
applications linear microphone arrays may be more useful, e. g. to capture the wave field
Figure 5.8: Screenshot of the model-based WFS rendering application (wfsapp).
reproduced by a rectangular WFS system without the reduction of the listening area that
an inscribed circular array would entail. Linear microphone arrays have been widely used in the past for the analysis
of acoustic scenes on the basis of beamforming techniques [Tre02, JD93, BW01]. Section 3.3.6 derived the relation between the plane wave decomposition and beamforming
techniques. Hence, these techniques can be used to calculate the plane wave decomposition from linear microphone array measurements. Similar analysis techniques have
also been derived in [HdVB01, HdVB02, Hul04] on the basis of the Kirchhoff-Helmholtz and
Rayleigh integrals. In the following, some of these techniques and their properties will be
reviewed briefly.
The Kirchhoff-Helmholtz integral implies that acoustic pressure and velocity have to be
measured on an arbitrarily shaped closed curve in order to characterize the wave field
within the region enclosed by the curve (see Section 2.6.2). This curve can be degenerated into a line that extends to infinity on both sides in order to derive the relations for
linear analysis geometries. It was shown in [HdVB01] that the plane wave decomposition
based on this principle can be understood as a spatial Fourier transformation. This result is
evident due to the similarity between the plane wave decomposition and the Fourier transformation shown in Section 3.3. The data used for the spatial Fourier transformation
is combined from omnidirectional (pressure) and dipole (velocity) microphone signals in
such a way that the combination forms a hypercardioid characteristic. A technique based on
pressure and velocity measurements is capable of distinguishing the wave field
arriving from in front of the array from the wave field arriving from behind it. Using only
pressure microphones, as is often done in beamforming techniques, does not allow this
front/back discrimination. The analysis capabilities of a finite-length (truncated) linear
array for the analysis of plane waves depend on the incidence angle of the plane
wave. The angular resolution gets coarser for plane waves that travel nearly parallel to
the array. Thus, linear arrays are not suitable for the analysis of arbitrary wave fields
with plane wave contributions coming from all directions.
To overcome this problem, [HdVB01] proposed using two linear arrays intersecting each
other at an angle of 90 degrees to form a cross-shaped array. This way, the problem of
limited angular resolution can be mitigated by using the more appropriate of the two
arrays for the analysis of a plane wave contribution. The combined array is then capable of analyzing contributions from all directions without severe artifacts. It was shown
in [SKR03] that a cross-shaped array is suitable for the analysis of the reproduced wave
field required for listening room compensation.
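The front/back discrimination gained by combining pressure and velocity signals, mentioned above, can be illustrated with a minimal monochromatic example. With the velocity normalized by the characteristic impedance ρc, a plane wave arriving from the front yields v = p while one from behind yields v = −p, so the combination (p + v)/2 passes the front wave and cancels the rear one. This is a cardioid-family combination chosen for simplicity; the actual weighting used in [HdVB01] differs, and all values below are illustrative:

```python
import numpy as np

fs, f = 8000, 250.0
t = np.arange(1024) / fs
omega = 2 * np.pi * f

def mic_signals(direction):
    """Pressure and rho*c-normalized velocity at the sensor position for a
    monochromatic plane wave. direction=+1: arrives from the front;
    direction=-1: arrives from behind (the velocity sign flips)."""
    p = np.cos(omega * t)
    v = direction * p
    return p, v

def combined(p, v):
    """First-order omni/dipole combination with front/back discrimination."""
    return 0.5 * (p + v)
```

An array of pressure-only microphones cannot form this combination, which is why it cannot tell the two travel directions apart.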
Linear arrays can be used to build rectangular analysis arrays, which may fit rectangular loudspeaker setups better. For this purpose, the techniques discussed in [HdVB01]
could be extended to the rectangular geometry, or beamforming approaches could be used.
In particular, constant-directivity beamforming techniques [BW01] might be of use in this
context to reduce geometry-dependent artifacts. Linear microphone arrays and their
variations will not be discussed further in the context of this thesis. Please refer to the
literature, e. g. [Tre02, JD93, BW01], for details. The following section will concentrate
on circular microphone arrays.

5.2.2 Circular Microphone Arrays

As stated before, circular arrays are the natural choice for the wave field analysis techniques introduced in Chapter 3. This is due to the underlying polar geometry of the plane
wave decomposition and the decomposition into circular harmonics. The following section
will briefly discuss the discrete realization of the decomposition into circular harmonics
using circular microphone arrays. Section 3.4.3 introduced this decomposition on the basis
of continuous pressure and pressure gradient measurements performed on a circle with
radius R. The method depicted in Fig. 3.15 first calculates the Fourier series expansion
coefficients of the pressure and pressure gradient measurements and then carries out a
multidimensional filtering process in order to derive the circular harmonics expansion coefficients.
In a practical implementation, the continuous pressure and pressure gradient measurements have to be replaced by measurements taken at discrete positions on the circle. The
artifacts resulting from this spatial sampling process were derived in Section 3.6. One
major result of the angular sampling is that no exact anti-aliasing condition on the basis of
the (temporal) bandwidth of the captured wave field can be given for arbitrary wave fields.
The number of sampling positions necessary on the circle has to be determined by prescribing a maximum allowable aliasing-to-signal ratio, as shown in Section 3.6.2.3. In the
following, it is assumed that the angular sampling has been performed adequately
for the required aliasing-to-signal ratio. The pressure and pressure gradient measurements
can be performed by placing pressure (omnidirectional) and pressure gradient (figure-of-eight)
microphones at equi-angular positions on the circle. The main axis of the pressure
gradient microphone has to coincide with the radial normal vector at the
microphone position.
The calculation of the Fourier series expansion coefficients P(ω, R, ν) and Vr(ω, R, ν) is
carried out by performing a discrete Fourier transformation (DFT) [OS99] with respect
to the discretized angle α. The filtering process given by Eq. (3.102) is not affected by the
angular discretization, since the angular frequency ν is already discrete in the continuous
formulation, due to the periodicity of the measurements in the angular coordinate. Figure 5.9 shows a block diagram of the discrete-angle circular harmonics decomposition. If
the aliasing and truncation errors are reasonably small, then the captured wave field is described with minor deviations by the circular harmonics expansion coefficients P(1)(ω, ν)
and P(2)(ω, ν). Please note that in practice an exact description is not possible when
using a finite number of angular sampling positions, due to the aliasing and truncation
Figure 5.9: Block diagram of the discrete space circular harmonics decomposition for a
circular microphone array using the discrete Fourier transformation (DFT).
errors introduced.
For the continuous case, the plane wave decomposition can be derived from the circular
harmonics expansion coefficients by the Fourier series given by Eq. (3.58). In the discrete-angle
case, this Fourier series can be realized by the inverse discrete Fourier transformation
(IDFT) [OS99] performed over the angular frequency ν. This results in the discrete plane
wave decomposition of the analyzed wave field.
Up to now, no time discretization has been considered. If performed properly, the above
considerations also hold for a time discretization of the signals. The transformation into
the frequency domain can then be performed by the DFT. Hence, for the spatio-temporally
discrete microphone signals pP(α, R, k) and vP,r(α, R, k), a temporal and an angular DFT
have to be performed. In total, this results in a two-dimensional DFT of the microphone
signals. The filter M(kR) has to be discretized according to the time-domain discretization. Both the DFT and the IDFT can be realized very efficiently by the fast Fourier
transformation (FFT) [OS99] in a practical implementation.
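The angular DFT stage of this procedure can be sketched in a few lines: sampling the pressure at N equi-angular positions and applying an FFT over the angle yields the discrete Fourier series coefficients over the circular harmonic orders ν. The snapshot below is a hypothetical second-order angular mode; the temporal DFT and the filtering by M(kR) are omitted:

```python
import numpy as np

N = 32                                   # angular sampling positions
alpha = 2 * np.pi * np.arange(N) / N     # microphone angles on the circle

# Hypothetical pressure snapshot on the circle: a pure cos(2*alpha) mode
p = np.cos(2 * alpha)

# DFT over the angle: coefficients of exp(j*nu*alpha) for orders nu = 0..N-1
P = np.fft.fft(p) / N

# The energy concentrates in the orders nu = +2 and nu = -2 (index N - 2)
dominant = np.argsort(np.abs(P))[-2:]
```

Because a real angular mode cos(να) splits into the complex exponentials of orders +ν and −ν, both corresponding DFT bins carry half the amplitude each.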

5.2.3 Artifacts of Two-dimensional Wave Field Analysis and Extrapolation

Section 5.1.2 derived several artifacts of WFS and their impact on listening room compensation. Most of these artifacts are caused by using a two-dimensional reproduction
technique in a three-dimensional environment. The theoretical basis of both sound reproduction using WFS and the WFA techniques introduced in this work is given by the
two-dimensional Kirchhoff-Helmholtz integral. As a result, WFA will exhibit similar artifacts to those derived for WFS. However, the main difference between WFS and WFA in this
context is that the Green's functions used in the Kirchhoff-Helmholtz integral (2.70) act
as virtual secondary line sources used for the extrapolation of the boundary measurements
taken on ∂V into the region V. WFS realizes these secondary line sources physically by
loudspeakers. This section will briefly discuss four types of artifacts of WFA and their
impact on active listening room compensation. These artifacts of WFA are:
1. Spatial aliasing
As for WFS, the discretization of the underlying physical and mathematical relations may result in spatial aliasing due to spatial sampling. For the plane wave
decomposition this spatial sampling and its artifacts were discussed in Section 3.6.
It was shown that anti-aliasing conditions can be formulated in terms of the (temporal) bandwidth of the analyzed wave field. The analyzed wave field will exhibit
aliasing artifacts if the anti-aliasing conditions derived for the discrete plane wave
decomposition are not reasonably fulfilled.
2. Truncation and diffraction
Practical implementations of microphone arrays used for WFA will have finite extents and may be based on non-closed contours. The result will be truncation and
diffraction errors present in the captured wave field. Truncation will limit the effective spatial resolution for low frequencies. This was illustrated for different analytic
source models in Section 3.5.
3. Restriction to two-dimensional analysis
As for WFS, the restriction to two dimensions for WFA leads to limited analysis capabilities for three-dimensional wave fields. Two-dimensional methods cannot fully
distinguish between reflections emitted in the analysis plane and elevated reflections. Contributions from elevated reflections will be mixed into the contributions
of sources located in the analysis plane. However, the results presented in [Gro05]
indicate that these limited capabilities can be improved to some extent.
4. Amplitude errors
The virtual line sources used for the extrapolation process in two-dimensional WFA
are only capable of correctly extrapolating wave fields that can be expressed
by the circular harmonics decomposition (2.25). Since point sources cannot be expressed by this two-dimensional decomposition, wave fields captured from point sources
will exhibit amplitude and spectral errors due to the reasons explained in Section 5.1.2. The analyzed wave field will in general not be known a priori, and thus
no compensation of these artifacts is possible.
Table 5.2 summarizes the artifacts of two-dimensional WFA systems. The discussed artifacts play no dominant role for a well-designed WFA system when its results are used for
auralization purposes [HdVB01, HdVB02, Hul04]. However, these artifacts will limit the
performance of an active room compensation system, since an exact analysis of the reproduced wave field is mandatory. Spatial aliasing (1) will effectively limit the frequency up
to which the reproduced wave field is analyzed correctly and hence room compensation
can be applied. The truncation errors (2) limit the low-frequency analysis capabilities and
the diffraction errors (2) may introduce unwanted artifacts into the compensation signals.
The two-dimensional restriction (3) and its implications for the analysis of elevated wave
field contributions may have severe consequences for active room compensation. Since
WFA artifact                          | Impact on room compensation
1. spatial aliasing                   | provides upper frequency limit of WFA
2. truncation and diffraction errors  | limits spatial resolution for low frequencies
                                      | and introduces diffraction artifacts
3. 2D restriction                     | limited capability to analyze elevated sources
4. amplitude errors                   | amplitude errors in extrapolated wave field

Table 5.2: Overview of the artifacts of two-dimensional WFA and their impact on
active room compensation.

elevated contributions may be present in the analysis data of the reproduction plane,
the compensation filters may also contain contributions to compensate for them. However, the elevated contributions can only be compensated at the microphone positions, as
WFS has very limited capabilities to compensate for elevated contributions (see e. g. Section 5.1.3.2). Thus, the compensation signals for the elevated contributions may cause
artifacts at positions away from the microphones. A countermeasure for this problem
is a proper damping of the ceiling and the floor of the listening room in order to avoid
elevated reflections as far as possible. If extrapolation techniques are used to generate the
compensation signals at the loudspeaker positions, then these signals will exhibit amplitude errors (4).
In the following sections, the amplitude errors and the capability of circular microphone
arrays to analyze elevated reflections will be studied more quantitatively.

5.2.4 Quantitative Analysis of the Artifacts of Circular WFA Systems

The previous section introduced the artifacts of two-dimensional WFA systems on a qualitative level. This section will analyze the artifacts of circular WFA systems and their
impact on room compensation in a more quantitative fashion. The reason for considering
this specialized geometry is that a circular microphone array is used for the experimental
validation of the proposed room compensation algorithms in Section 5.3. In the sequel,
the reproduction of plane waves will be considered. It is sufficient to derive the artifacts
of a circular WFA system for one fixed incidence angle of the reproduced plane wave.
Section 3.4.3 introduced an efficient method for the computation of the circular harmonics expansion coefficients of wave fields analyzed by circular microphone arrays. In the
following, simulations of circular WFA systems based on this algorithm will be presented.
Some of the results have been presented in [SRR05].

5.2.4.1 Amplitude Errors due to Extrapolation

The following section will analyze the amplitude errors when using circular harmonics
for the extrapolation of boundary measurements. The results also hold for the extrapolation of wave fields based on the two-dimensional Kirchhoff-Helmholtz integral (2.70),
since the expansion into circular harmonics can be used for arbitrary wave fields. The
Hankel functions used as basis for the decomposition (2.25) exhibit a far-field amplitude
decay (see Eq. (2.22)) which is inversely proportional to the square root of the extrapolation
radius. The extrapolation of the wave field of a point source using Eq. (2.25) will exhibit
amplitude errors due to this property of the Hankel functions. As a result of the radial
symmetry of the circular harmonics expansion, these amplitude errors will be radially
symmetric. Thus, it is sufficient to analyze the error for one particular angle of the
extrapolated wave field. In the following, simulations based on a numerical evaluation of the
algorithm illustrated in Section 5.2.2 will be presented.
As stated before, the two-dimensional analysis and extrapolation techniques will not be
capable of correctly reproducing the amplitude of three-dimensional point sources. To
illustrate this, the extrapolation using the circular harmonics expansion of the wave field
emitted by a point source in a plane was calculated. The wave field of a point
source PP,ps(r, ω) is given by Eq. (2.27). Its circular harmonics decomposition for a circular microphone array was calculated according to Eq. (3.102). The expansion coefficients
were then used for the extrapolation using Eq. (2.25). This procedure results in the extrapolated wave field PP,e(r, ω) of a point source using circular harmonics.
Figure 5.10 illustrates the results for a point source located at a distance d = 3 m and an
array with a radius of R = 0.75 m. Figure 5.10(a) shows the amplitude decays of a point
source and its wave field extrapolated from the circular harmonics. The amplitudes were
normalized to the radius of the array. The deviation from the decay of a point source
after extrapolation of the field is clearly visible. Figure 5.10(b) shows the absolute value
of the amplitude error EP(r, ω) = PP,ps(r, ω) − PP,e(r, ω) between the wave field of a point
source PP,ps(r, ω) and its extrapolation PP,e(r, ω).
Point sources are widely used approximations for real-world sources (e. g. for loudspeakers). The mirror image model indicates that the typical room response of a point source
can be understood as a combination of the primary point source and its reflections. These
reflections are again modeled as point sources. Hence, it would be desirable to derive a
two-dimensional extrapolation technique which is capable of correctly extrapolating the
wave field of a point source. In principle it is possible to modify the circular harmonics
decomposition and extrapolation technique to fulfill these requirements. A drawback of
such a modification would then be that the extrapolation of plane waves would exhibit
amplitude errors (this effect is similar to the amplitude error discussed for WFS). One
possibility for modification of Eq. (2.25) could be to use spherical Hankel functions instead
of Hankel functions. Another possibility proposed by [HdVB01], which only works for the
(a) amplitude of a point source PP,ps(r, ω) and of its extrapolated field PP,e(r, ω);
(b) absolute value of the amplitude error EP(r, ω)

Figure 5.10: Amplitude decay of a point source compared to the decay of the extrapolation of its measured field. The point source is located at a distance of d = 3 m; the
radius of the WFA array is R = 0.75 m.
measurement of spatio-temporal impulse responses from point sources to the listening
area, is to modify measured impulse responses according to the desired decay.
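The size of this mismatch can be estimated with a crude far-field sketch: along the ray towards the source, the true point-source amplitude follows a 1/distance law, while the 2D extrapolation imposes a 1/√r decay with respect to the array center; matching both at the array radius R shows the error growing towards the center. The 1/√r model and the single-ray simplification are assumptions for illustration, not the exact simulation behind Fig. 5.10:

```python
import numpy as np

d, R = 3.0, 0.75                    # source distance and array radius (as in Fig. 5.10)
r = np.linspace(0.25, 0.75, 11)     # evaluation radii inside the array

# True point-source amplitude along the ray towards the source (1/distance law)
a_true = 1.0 / (d - r)

# Far-field 2D extrapolation: 1/sqrt(r) decay, matched to the true field at r = R
a_ext = (1.0 / (d - R)) * np.sqrt(R / r)

err_db = 20 * np.log10(a_true / a_ext)   # amplitude error in dB
```

Under these assumptions the error vanishes at the matching radius and amounts to several dB near the array center, on the order of the deviations visible in Fig. 5.10.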
5.2.4.2 Analysis of Elevated Reflections

It was stated before that two-dimensional analysis techniques have limited capabilities
for the analysis of three-dimensional wave fields. This section will show some results in
order to illustrate this drawback. For this purpose, the plane wave decomposition of a
plane wave with incidence angle θ0 = 180° and varying elevation angle β0 is computed
using a circular array

    Pβ0(ω, θ) = P{PP,pw,β0(ω, r, α)} ,    (5.18)

where PP,pw,β0(ω, r, α) denotes the wave field of an elevated plane wave, as given by
Eq. (5.14). Figure 5.11 shows the absolute value of the plane wave decomposition
Pβ0(ω, θ). The plane wave decompositions were computed by simulating a Dirac-shaped
plane wave with varying elevation angle on a circular array with a radius of R = 0.75 m.
Ideally, the plane wave decomposition of a Dirac-shaped plane wave would be a Dirac line
along the incidence angle of the plane wave. The plane wave decomposition of a plane
wave differs from this theoretical result, as shown in Fig. 5.11(a), due to the finite aperture
of the array and aliasing errors. However, the results differ even more with increasing
(a) elevation angle β0 = 0°; (b) elevation angle β0 = 30°;
(c) elevation angle β0 = 60°; (d) elevation angle β0 = 90°

Figure 5.11: Plane wave decomposition of a Dirac-shaped plane wave with an incidence
angle of θ0 = 180° and varying elevation angles β0 using a circular microphone array.
The plots show the magnitude of the frequency response Pβ0(ω, θ) for different elevation
angles β0 of the plane wave. The radius of the simulated array is R = 0.75 m.
Figure 5.12: Energy Eβ0(θ) of the plane wave contributions Pβ0(ω, θ) (see Fig. 5.11) for
different elevation angles β0 of the analyzed elevated plane wave.
elevation angle, as shown in Fig. 5.11(b) to Fig. 5.11(d). In the extreme case of
Fig. 5.11(d) (elevation angle β0 = 90°), the plane wave decomposition exhibits no
directionality at all. Figure 5.12 shows the energy Eβ0(θ) of the plane wave components illustrated
in Fig. 5.11. The energy Eβ0(θ) is defined as follows

    Eβ0(θ) = (1/2π) ∫ |Pβ0(ω, θ)|² dω .    (5.19)

This measure gives insight into the energy distribution with respect to the incidence angle
of the plane wave contributions derived by the plane wave decomposition. This way, the
decreasing directionality with increasing elevation angle of the plane wave can be seen
clearly in Fig. 5.12.
It can be concluded from the presented results that elevated plane waves spread into all
components of the decomposed field. Similar results have been reported in [HSdVB03].
The presented results are also valid for other types of elevated sources, since arbitrary
wave fields can be expressed as a superposition of plane waves.
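The loss of directionality with elevation can be reproduced in a few lines: in the horizontal array plane, a plane wave with elevation angle β only contributes its projected wavenumber k·cos β, so at β = 90° the pressure on the circle is constant over the angle and only the zeroth circular harmonic survives. The parameters and helper names below are illustrative:

```python
import numpy as np

c, f, R, N = 343.0, 500.0, 0.75, 64
k = 2 * np.pi * f / c                    # wavenumber
alpha = 2 * np.pi * np.arange(N) / N     # microphone angles on the circle
theta0 = np.pi                           # azimuthal incidence angle (180 degrees)

def circle_pressure(beta):
    """Unit-amplitude plane wave with elevation beta sampled on the horizontal
    array circle: only the projected wavenumber k*cos(beta) acts in the plane."""
    return np.exp(1j * k * np.cos(beta) * R * np.cos(alpha - theta0))

def active_orders(beta, rel_thresh=1e-3):
    """Number of circular harmonic orders carrying significant energy."""
    P = np.abs(np.fft.fft(circle_pressure(beta))) / N
    return int(np.sum(P > rel_thresh * P.max()))
```

active_orders shrinks towards 1 as the elevation approaches 90°, mirroring the vanishing directionality in Fig. 5.11(d) and Fig. 5.12.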

5.2.5 Practical Realization of a Circular WFA System

The following section will discuss some hardware-related aspects of the practical realization of a circular WFA system. Section 3.4.3 and Section 5.2.2 introduced a basic
algorithm and its discrete realization for a circular harmonics decomposition using a circular microphone array. The theory behind this approach states that the measurement of
the acoustic pressure and its gradient at discrete positions on a closed circular contour
suffices to derive the circular harmonics decomposition coefficients of the entire analyzed
wave field within this contour. The number of spatial sampling positions on the circle, and
thus the number of microphones, depends on the anti-aliasing conditions derived in
Section 3.6.2. The total number of microphones is twice the number of angular sampling
positions, since the acoustic pressure and its gradient have to be measured. When used
for active room compensation, the spatial aliasing frequency of the WFA system should be
at least as large as the spatial aliasing frequency of the WFS system. Hence, the number of sampling positions on the circle will be quite high in typical applications. The WFA
system is typically realized by sequentially measuring the different sampling positions on
the circle, due to the complexity of recording all signals in real time. However, one
drawback of this sequential approach is that it is not possible to record live sound events.
It is only possible to record the spatio-temporal impulse response from an acoustic source
(e. g. a loudspeaker) to the microphones. The circular harmonics decomposition of the
wave field emitted by this source, when excited by a temporal Dirac pulse, can be calculated by applying Eq. (3.102) to the measurements.
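A common rule of thumb for dimensioning such an array (an assumption for illustration, not the exact aliasing-to-signal criterion of Section 3.6.2) band-limits the field on the circle to circular harmonic orders |ν| ≲ kR, which requires roughly N ≥ 2kR + 1 angular positions, and twice that many microphones, since pressure and gradient are both sampled:

```python
import numpy as np

C = 343.0  # speed of sound in m/s

def min_positions(f_max, R):
    """Rule-of-thumb number of angular sampling positions on a circle of
    radius R for temporal bandwidth f_max (orders |nu| <= kR assumed)."""
    k = 2 * np.pi * f_max / C
    return int(np.ceil(2 * k * R)) + 1

def alias_frequency(n_pos, R):
    """Approximate aliasing frequency for n_pos positions on radius R."""
    return (n_pos - 1) * C / (4 * np.pi * R)
```

For R = 0.75 m, 48 positions give an aliasing frequency of roughly 1.7 kHz under this rule, illustrating why the microphone count grows quickly with the required bandwidth.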
A stepper motor drive that can be controlled from a PC was used to perform the sequential measurements in the course of this work. A rod was mounted on the stepper
motor, and microphones were placed at the end(s) of the rod. Different microphone types
can be used for the measurements. High-quality pressure and pressure gradient (figure-of-eight)
microphones are commercially available and can be used without modifications
for this purpose. However, one problem occurs when using pressure and pressure gradient
microphones for the measurements: they cannot be placed at the same spatial position.
This problem can be overcome by placing the pressure and the pressure gradient microphone at the two opposite ends of the rod. Their opposite positions then have to be taken
into account when calculating the decomposition. Figure 5.13 and Fig. 5.14 illustrate
a practical implementation using this principle. There are also microphones available
which capture pressure and pressure gradient at the same spatial position. The soundfield
microphone [Jag84] and velocity probes are examples of such microphones.
Figure 5.15 shows the plane wave decomposition p(θ, t) computed from the spatio-temporal
impulse response of a loudspeaker placed in a room, measured by a circular
microphone array. The direct part of the wave field at θ = 180° and
the reflections caused by the room can be seen clearly. The plane wave decomposition is
a powerful tool for the analysis of acoustic scenes.
Besides the fundamental artifacts discussed in Section 5.2.3, other artifacts also have to
be considered in a practical implementation. For example, if multiple microphones are used for
a practical realization of a circular array, then their different characteristics (microphone
mismatch) and position errors have to be taken into account. These and other practical
issues are discussed e. g. in [HSdVB03, Teu05, BW01].

Figure 5.13: Rod with pressure (AKG CK92 [AKG]) and pressure gradient (AKG
CK94 [AKG]) microphone mounted on a stepper motor drive. The total length of the rod
is l = 1.50 m.

5.3 Listening Room Compensation for Wave Field Synthesis

Section 4.4 derived the concept of WDAF for improved active listening room compensation. The basic idea of WDAF is to perform a spatio-temporal transformation of the
multidimensional signals and systems in order to decouple the MIMO inverse system
identification problem. The complexity of the identification problem is reduced significantly this way. Up to now, the application of WDAF to a particular spatial sound
reproduction system has not been considered. This section will illustrate the application of
WDAF-based listening room compensation to WFS systems. Results based on simulated
and measured acoustic environments will demonstrate the performance of the proposed method.
The relevance of the discussed WFS and WFA artifacts, some extensions to the base algorithm, and the application to other spatial sound reproduction systems will be discussed
as well.
Active listening room compensation requires the ability to control the wave field within the listening area on the one hand, and to analyze the reproduced wave field within the listening area on the other. A closed contour WFS system is capable of controlling the wave field within its contour below the spatial aliasing frequency. This was illustrated in Sections 4.1 and 5.1. A circular microphone array is capable of analyzing the wave field
within its contour below the spatial aliasing frequency. This was shown in Section 3.4.3.
Both techniques will be utilized in the following for active listening room compensation.
The concept of WDAF requires finding a suitable wave domain transformation. This
transformation should approximately diagonalize the listening room transfer matrix, as
stated in Section 4.5.4. In general, the optimal transformation will be dependent on the
acoustic boundary conditions imposed by the listening room. Hence, the transformation


Figure 5.14: Setup used for the sequential circular microphone array measurements.
depends on the geometry of the listening room and the acoustic boundary conditions
present at its walls. The following section proposes a wave domain transformation for
typical listening rooms.

5.3.1 Decoupling of the Listening Room Transfer Matrix

The geometry of the listening room has to be taken into account in order to derive a
suitable wave domain transformation. Typical listening rooms exhibit a rectangular shape as a first approximation. Hence, the boundary conditions imposed by the listening
room can be modeled approximately by a rectangular box with homogeneous boundary
conditions at its sides. In the following this simplified model of a listening room will be
used.
Two different wave field representations have been introduced in Chapters 2 and 3: the
plane wave decomposition and the decomposition into circular harmonics. Both are candidates for the wave domain transformation. Due to the assumed simplified geometry of
the listening room and the relatively low frequencies considered (due to spatial aliasing)

Figure 5.15: Plane wave decomposition of the spatio-temporal impulse response emitted
by a loudspeaker placed in a room. The distance of the loudspeaker to the center of the
array was d = 2.50 m, the radius of the microphone array R = 0.50 m. The circle was
sampled at 128 positions.

circular harmonics seem to provide a more suitable basis for the wave domain transformation than plane waves. In the free-field case, each circular harmonics component emitted
by the WFS system will result only in contributions of the same component after analysis.
However, in a reverberant listening room each emitted circular harmonics component will result in additional components other than the emitted one. The energy of these components should be as low as possible. There are some physical indications for the energy
compaction performance of circular harmonics [Oes56, DGZD01, GD04] in rectangular
rooms.
The following section will specialize the generic concept of wave domain adaptive filtering to the use of circular harmonics for the signal representation in the wave domain.
The performance of this special choice for the wave domain transformation, on the basis of simulated and measured acoustic environments, will be shown in Sections 5.3.4 and
5.3.5.

Figure 5.16: Block diagram of the proposed wave domain adaptive inverse filtering
based active room compensation system for WFS.

5.3.2 WDAF based Room Compensation for WFS Systems

The concept of WDAF was derived in Section 4.5.4 and is illustrated by Fig. 4.24. This
generic concept will be specialized in the sequel to the reproduction using a WFS system
and the circular harmonics decomposition as wave domain transformation. Two rendering
techniques have been introduced in Section 5.1.4 for WFS based rendering of acoustic
scenes: model-based and data-based rendering. Both techniques differ in the generation of
the loudspeaker driving signals. Model-based rendering uses spatial models for the virtual
sources, while data-based rendering allows to prescribe any desired virtual wave field at the
cost of an increased complexity. For the derivation of the room compensation algorithm,
the model-based approach will be considered in the following. However, the algorithm can be generalized straightforwardly to the data-based approach, as will be indicated later
in this section.
Figure 5.16 illustrates the proposed WDAF based active room compensation algorithm
for WFS systems. The notation of the signals and transfer matrices is similar to the one used in Section 4.5; the virtual source signal is denoted by S(ω). Only one virtual source is covered in Fig. 5.16; multiple virtual sources can be considered by applying the principle
of superposition. The transformations T1 through T3 are based on the decomposition into
circular harmonics of the respective wave fields. For active room compensation only the
incoming parts (see Section 2.3.1) of the respective wave fields are of interest. For the
particular scenario considered these generic transformations are specialized as:


Transformation T1 :
This transformation maps the virtual source signal S(ω) into the loudspeaker driving signals d(M)(ω) in the wave domain using a spatial model. Suitable models for the virtual source characteristics are line sources or plane waves. The plane wave
decompositions for these two source types were derived in Section 3.5 and are given
by Eq. (3.114) and Eq. (3.109), respectively. The circular harmonics components
can be derived easily from these analytic plane wave decompositions since they are
given in terms of a Fourier series equal to the expansion of a wave field into circular
harmonics (2.25).
It is desirable to include a parametric room model into the analytic source models for the creation of virtual acoustic scenes. The mirror image model [AB79]
is widely used for this purpose. Efficient algorithms for the mirror image
model [DGZD01, GD04] can be included conveniently into transformation T1 . These
algorithms are based on a decomposition into spherical harmonics, since a three-dimensional wave field was considered for their derivation. However, they can be reformulated straightforwardly in terms of circular harmonics for the two-dimensional case considered here.
Transformation T2 :
This transformation generates the loudspeaker driving signals from the filtered driving signals w(M). It was shown in Section 4.1.5 that the loudspeaker driving signals
can be calculated conveniently from a plane wave decomposition of the virtual source
wave field. Thus, transformation T2 can be realized by calculating the plane wave
decomposition from the circular harmonics representation using Eq. (3.58) and applying Eq. (4.18) together with the secondary source correction (5.7) to derive the
loudspeaker driving signals.
Transformation T3 :
This transformation calculates the circular harmonics decomposition coefficients of
the wave field within the listening area from the microphone array measurements.
Equation (3.102) can be used for this purpose. However, this requires measuring the acoustic pressure and additionally its gradient at the analysis positions.
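As a rough sketch of the angular part of this analysis, the order-ν components follow from a DFT of the pressure samples over the angle. The function below is a hypothetical illustration; the full Eq. (3.102), which additionally uses the pressure gradient and a frequency-dependent radial term to extract the incoming part, is not reproduced here.

```python
import cmath
import math

def angular_components(pressure_samples):
    """Order-nu angular components of a field sampled at M equiangular
    positions on a circle (a plain DFT over the angle).

    Note: this is only the angular part of transformation T3. Separating
    the incoming part of the wave field as in Eq. (3.102) additionally
    requires the pressure gradient and a frequency-dependent radial
    correction, both omitted in this sketch.
    """
    M = len(pressure_samples)
    # Angular sampling with M positions resolves orders |nu| < M/2.
    return {nu: sum(p * cmath.exp(-1j * nu * 2.0 * math.pi * m / M)
                    for m, p in enumerate(pressure_samples)) / M
            for nu in range(-(M // 2 - 1), M // 2)}

# Example: a field exp(j*3*alpha) sampled at 48 positions concentrates all
# of its energy in the nu = 3 component.
samples = [cmath.exp(1j * 3 * 2.0 * math.pi * m / 48) for m in range(48)]
comps = angular_components(samples)
```

The choice of 48 sampling positions mirrors the microphone arrays used later in this chapter.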
The signal representation in terms of circular harmonics should be based on the aperture of the loudspeaker array. In this case the transfer matrix F models the free-field propagation in terms of circular harmonics from the loudspeaker to the microphone array. In the ideal case this matrix would only model the propagation delay. However, it is advisable to additionally incorporate the artifacts of WFS systems (see Section 5.1.2) in order to prescribe a desired wave field a(M) that can be achieved. Besides these artifacts, the transfer matrix F should also include an additional delay to ensure the computation of causal room compensation filters. This delay is also known as modeling delay [Hay96, Gar00].


So far only the model-based approach to auralization with WFS has been considered. As stated before, this specialized approach can be extended straightforwardly to the data-based approach. In this case the transformed loudspeaker driving signals d(M) are given by the circular harmonics decomposition coefficients of the virtual wave field to be reproduced. For circular apertures these coefficients can be calculated from Eq. (3.102).
It was proposed by the author in previous papers [BSK04b, SBR04c, BSK04c, SBR04a,
BSK04a, SBR04b] to perform a Fourier transformation with respect to the angular variable of the plane wave decomposition to derive the signals in the wave domain. This
Fourier transformation of the plane wave decomposed signals is up to the factor j (and
a frequency correction) equivalent to calculating the circular harmonics expansion coefficients, since the plane wave decomposition in terms of circular harmonics is given by the
Fourier series (3.58).
An on-line implementation of the room compensation algorithm would require the use of a filtered-x algorithm for the adaptation of the compensation filters. The proposed active listening room compensation algorithm was implemented in an off-line fashion using MATLAB [MAT]. Therefore, no filtered-x algorithm was used for the implementation of the active listening room compensation algorithm depicted by Fig. 5.16. This way the performance of the proposed decoupling can be evaluated without the shortcomings of the filtered-x algorithms [TBB00, SH94, Bja95]. The frequency domain adaptive filtering (FDAF) algorithm introduced in [BBK03] was utilized for the adaptation process in the implementation. This algorithm can be extended straightforwardly to the filtered-x scheme.

5.3.3 Performance Measures for Active Room Compensation

The following measures will be used to evaluate the performance of the proposed transformations and the resulting active listening room compensation system.
Energy compaction
The ability of the different transformed signal and system representations to compact the room characteristics into fewer coefficients than using the microphone signals directly will
be evaluated by two measures: (1) the energy of the elements of the room transfer matrix and (2) the energy compaction performance. For the room transfer matrix in its different representations, the first measure is defined by calculating the energy of each spatial transmission path. For its representation in the pressure domain (pressure microphones) the energy of the elements of the room transfer matrix is defined as

    E(m, n) = \frac{1}{2\pi} \int |R_{m,n}(\omega)|^2 \, d\omega .    (5.20)
Similar definitions apply to the room transfer matrix in its plane wave and circular harmonics representation (see Section 4.5.5), yielding the energy representations \bar{E}(\theta, \theta_0) and \mathring{E}(\nu, \nu_0).
In transform coding the energy compaction performance of different transformations is
typically evaluated by calculating the energy compaction EC(i) [AR75]. For one particular room transfer matrix this measure is defined by calculating the ratio between the
energy of the first i dominant elements and all elements. For this purpose the energies
E(m, n) are sorted in descending order, yielding the sorted elements E_{\mathrm{sort}}(\mu). Then the ratio between the energy of the first i sorted elements and all elements is calculated as follows

    EC(i) = \frac{\sum_{\mu=0}^{i} E_{\mathrm{sort}}(\mu)}{\sum_{\mu=0}^{MN-1} E_{\mathrm{sort}}(\mu)} ,    (5.21)

where 0 \le EC(i) \le 1. Equivalent definitions apply for the transformed representations.
The more energy is captured by the first i elements the better the performance in terms
of energy compaction.
Both the energy of the matrix elements E(m, n) and the energy compaction ratio EC(i) are
typically evaluated in a logarithmic scale. The energy of the elements E(m, n) illustrates
the distribution of the energy in the different transformed domains, while EC(i) illustrates
the ability of a transformation to compact the energy on few coefficients. The concept of
eigenspace adaptive filtering, as introduced in Section 4.5.3, compacts the entire energy
on the main diagonal by using the GSVD as signal and system transformation.
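Both measures are simple to compute. Below is a minimal sketch of Eqs. (5.20) and (5.21), assuming each transfer path is given by sampled frequency-response values and approximating the integral by an average over the samples; the function names are illustrative, not from the thesis.

```python
def element_energy(R_mn):
    """Energy of one element of the room transfer matrix, Eq. (5.20), with
    the frequency integral approximated by an average over sampled
    frequency-response values R_mn."""
    return sum(abs(v) ** 2 for v in R_mn) / len(R_mn)

def energy_compaction(energies, i):
    """Energy compaction EC(i), Eq. (5.21): ratio of the energy of the
    i + 1 largest elements to the total energy, 0 <= EC(i) <= 1."""
    e_sort = sorted(energies, reverse=True)
    return sum(e_sort[: i + 1]) / sum(e_sort)

# Example: if one transmission path carries 90 of 100 energy units, the
# first sorted element already captures 90 % of the total energy.
ec0 = energy_compaction([5.0, 90.0, 3.0, 2.0], 0)
```

The same two functions apply unchanged to the plane wave and circular harmonics representations, since only the indexing of the matrix elements differs.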

Plane wave decomposition


The plane wave decompositions of the different wave fields used for the room compensation
algorithm give insight into their spatio-temporal structure. The plane wave decomposition
provides a powerful tool to analyze the reflections caused by the listening room and
the spatio-temporal performance of the room compensation algorithm. The plane wave
decomposition can be calculated from the circular harmonics expansion coefficients by
applying Eq. (3.58).

Energy of the plane wave components


The plane wave decomposition can be used to calculate the energy of the different plane
wave contributions. This measure gives insight into the energy of the potentially disturbing contributions added by the listening room and their suppression by active listening
room compensation. The energy of the plane wave components is defined as follows

    \bar{E}(\theta) = \frac{1}{2\pi} \int |\bar{P}(\theta, \omega)|^2 \, d\omega .    (5.22)
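A corresponding sketch of Eq. (5.22), under the assumption that the plane wave decomposition is available as sampled frequency-response values per incidence angle:

```python
def plane_wave_energy(pwd_rows):
    """Energy of the plane wave components, Eq. (5.22): pwd_rows[k] holds
    the sampled frequency-response values of the plane wave component for
    the k-th incidence angle; the integral over frequency is approximated
    by an average over the samples."""
    return [sum(abs(v) ** 2 for v in row) / len(row) for row in pwd_rows]

# Example: a direction with no contribution has zero energy.
energies = plane_wave_energy([[0.0, 0.0], [1.0, 1j]])
```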


Adaptation error
The error e(M) used to adapt the room compensation filters gives insight into the performance of the adaptation process. The mean squared error E_{\mathrm{adapt}}(k) will be used as measure for the adaptation error in the sequel. It is defined as follows

    E_{\mathrm{adapt}}(k) = \sum_{m=1}^{M} | e_m(k) |^2 .    (5.23)

Of special interest are the convergence speed, its lower limit for stationary scenarios and its behavior for spatio-temporally non-stationary scenarios. In the ideal case the adaptation error should decrease quickly to its lower bound and should stay low for non-stationary scenarios.
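To make the role of Eq. (5.23) and of the decoupling concrete, the sketch below runs a hypothetical scalar-per-component LMS update, i.e. the idealized case in which the wave domain transformation has fully diagonalized the system. This is a simplification for illustration, not the FDAF algorithm of [BBK03].

```python
def decoupled_lms_step(W, X, D, mu=0.5):
    """One step of a decoupled (fully diagonalized) adaptation: each wave
    domain component m has its own scalar weight, input and desired value,
    so the MIMO problem reduces to independent scalar LMS updates.
    Returns the updated weights and the squared error per component."""
    errors = [d - w * x for w, x, d in zip(W, X, D)]
    W_new = [w + mu * e * x.conjugate() for w, e, x in zip(W, errors, X)]
    return W_new, [abs(e) ** 2 for e in errors]

def mean_squared_error(squared_errors):
    """Mean squared adaptation error E_adapt(k) of Eq. (5.23): the sum of
    the squared error magnitudes over all M components."""
    return sum(squared_errors)

# Example: with unit-magnitude inputs and mu = 0.5 each scalar error is
# halved per step, so E_adapt(k) decreases monotonically.
W, X, D = [0.0j, 0.0j], [1.0 + 0j, 1.0 + 0j], [2.0 + 0j, -1.0 + 0j]
W, e1 = decoupled_lms_step(W, X, D)
W, e2 = decoupled_lms_step(W, X, D)
```

In the actual system the off-diagonal residue discussed in Section 5.3.1 prevents this fully decoupled ideal, but the per-component structure of the adaptation remains the same.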

5.3.4 Results based on Simulated Acoustic Environments

The following section will illustrate the performance of active listening room compensation for WFS using the proposed WDAF approach on the basis of simulated acoustical
environments. The main benefits of using simulated versus real acoustical environments
are that the parameters of the simulated environment can be changed easily in order to
simulate different scenarios, and that practical aspects such as noise, transducer mismatch and misplacement can be excluded for a first proof of the WDAF concept.
A wide variety of methods can be used to simulate the wave propagation between the
loudspeakers and the microphones in the listening room. One of the most common methods used for this purpose is the mirror image method [AB79]. It assumes that the walls of
the listening room act as acoustic mirrors creating mirror sources. However, this assumption is only accurate for high frequencies and walls with large extents [FHLB99]. Due to the relatively low spatial aliasing frequency of typical WFS and WFA systems, room compensation will typically be applied to the lower frequencies only. In order to have an
accurate simulation of the acoustic environment, an FTM based simulation method of the wave equation was used [PR05]. Using this method, no spatial discretization is required and thus the microphone and loudspeaker placement can be done exactly in space. This is not
possible when using methods which perform a spatial discretization, like e. g. the finite
element method (FEM) [Red05]. The particular implementation used (wave2d) simulates
the wave propagation in two-dimensions. The acoustical properties of the walls are characterized by the frequency-independent plane wave reflection factor Rpw . A reflection
factor of Rpw = 0 models free-field propagation, a reflection factor of Rpw = 1 perfectly
reflecting walls. The plane wave reflection factor can be linked to the reverberation time
T60 of a room [Pie91].
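One common way to make this link explicit (not necessarily the relation used in [Pie91]) is Eyring's reverberation formula with the wall absorption coefficient taken as α = 1 − Rpw², i.e. the energy reflection Rpw². A hedged sketch:

```python
import math

def t60_eyring(volume, surface, R_pw):
    """Reverberation time T60 estimated from a frequency-independent plane
    wave reflection factor R_pw via Eyring's formula, assuming the wall
    absorption coefficient alpha = 1 - R_pw**2 on all surfaces. This is
    one common way to link R_pw and T60, not necessarily the relation
    used in [Pie91]."""
    alpha = 1.0 - R_pw ** 2
    return 0.161 * volume / (-surface * math.log(1.0 - alpha))

# Example: a 6 m x 6 m x 3 m room with R_pw = 0.8 at all surfaces.
t60 = t60_eyring(volume=6 * 6 * 3, surface=2 * (6 * 6 + 6 * 3 + 6 * 3), R_pw=0.8)
```

As expected, a higher reflection factor yields a longer reverberation time.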
The geometry of the simulated acoustical environment is a simplification of the real environment described in Appendix D.1. The real environment was simplified for the simulation to a two-dimensional rectangular plane with the dimensions 5.90 m × 5.80 m, due to
complexity considerations. The simulated geometry is described in detail in Appendix D.2
and illustrated by Fig. D.3.
For the evaluation of the algorithms the impulse responses from all loudspeakers to all
(pressure/pressure-gradient) microphones were simulated, in order to derive the required
room transfer matrix. Two scenarios were simulated: a free-field scenario with Rpw = 0
and a reverberant scenario with Rpw = 0.8. The latter approximates the acoustic
characteristics of the real room described in Appendix D.1 (with all curtains opened).
The following two sections will show results for a circular and a rectangular WFS system.
5.3.4.1 Circular WFS System

The first setup that is evaluated in the sequel consists of a circular WFS and WFA system.
The reason for the chosen setup is that an equivalent WFS system has been built at the laboratory (see Fig. 5.6). The exact geometry of the laboratory setup is described in Appendix D.1. The size and position of the loudspeaker and microphone array were simulated
according to Fig. D.3, the listening room was approximated by a rectangular shape for the
simulations. The simulated WFS system consists of 48 loudspeakers placed equidistant
on a circle with a radius of RLS = 1.50 m. The loudspeakers were approximated by point
sources. According to Section 4.1.6.2 and Fig. 4.12(b) this particular angular sampling
results in a reproduced aliasing-to-signal ratio of RASR(0, 650 Hz) ≈ −32 dB at a frequency of 650 Hz in the center of the circular array. The simulated WFA system consists
of 48 angular sampling positions at a radius of Rmic = 0.75 m. Both virtual pressure
and pressure-gradient microphones were simulated at the sampling positions. According
to Section 3.6.2.3 and Eq. (3.138) this particular angular sampling of the WFA system
results in an aliasing-to-signal ratio of ASR(23, Rmic, 650 Hz) ≈ −44 dB for the ν = 23 circular harmonics component at a frequency of fal = 650 Hz. For the results, all signals
were low-pass filtered to a frequency of fLP = 650 Hz in order to keep the aliasing artifacts
reasonably small. The remainder of this section will show and discuss the results obtained
from the simulated setup.
Singular Vectors
It was stated in Section 5.3.1 that circular harmonics provide a reasonable basis for the WDAF approach. An indication for the suitability of this choice can be found by calculating the right singular vectors v_b(ω) of the room transfer matrix R(ω) (for the pressure microphones) using the SVD. Figure 5.17(a) shows the absolute value of the first eight right singular vectors v_b(ω) for the free-field case Rpw = 0 and a frequency of
f = 80 Hz. The singular vectors have been sorted by their descending singular values and
thus by their energy. The results are similar to the angular basis functions of the circular

(a) reflection factor Rpw = 0
(b) reflection factor Rpw = 0.8

Figure 5.17: Absolute value of the first eight right singular vectors (f = 80 Hz) for the
simulated circular WFS/WFA system sorted by their descending singular values (top left
to bottom right). The singular vectors for two different plane wave reflection factors Rpw
at the walls of the simulated room are shown.


harmonics (see Fig. 2.4). However, this result is evident since free-field propagation has
been simulated and circular harmonics match optimally to the geometry of the circular
WFA system in this case. In order to prove the suitability of the circular harmonics
decomposition for a typical listening room the singular vectors were also computed for
the reverberant case Rpw = 0.8. These singular vectors are shown in Fig. 5.17(b). It can be
seen that they are deteriorated to some extent compared to the circular harmonics; however,
their coarse structure still matches the circular harmonics quite well. The presented results
indicate that circular harmonics provide a suitable transformation for typical listening
rooms.
Energy Compaction
The concept of WDAF based active room compensation relies, among other properties,
on the energy compaction performance of the chosen transformation. Two measures were
introduced in Section 5.3.3 to investigate the energy compaction in the different signal
representations: (1) the energy of the elements of the room transfer matrix and (2) the
energy compaction performance. These measures are evaluated in the following for the
free-field and reverberant case.
Figure 5.18 shows the energy of the elements of the room transfer matrix for the pressure microphones and its circular harmonics representation for the two different simulated
cases. Figure 5.18(a) shows E(m, n) for the pressure microphones in the free-field case
(Rpw = 0). The direct path from the loudspeakers to the microphones can be seen clearly.
Figure 5.18(b) shows the energy in the circular harmonics domain for the free-field case. Almost all energy is compacted onto
the main diagonal elements by the circular harmonics representation of the room transfer
matrix. Please note the different scales used for the results shown. For the free-field case
all energy in the circular harmonics domain should be compacted onto the main diagonal.
However, some off diagonal elements with very low energy are also present in Fig. 5.18(b)
due to aliasing and truncation artifacts present in the analyzed wave field and numerical
errors in the free-field simulation. Figures 5.18(c) and 5.18(d) show the energy distributions for the reverberant case. The reflections of the loudspeaker wave fields at the walls
of the listening room can be seen clearly in Fig. 5.18(c). Figure 5.18(d) shows the energy
of the room transfer matrix represented in circular harmonics. As desired, the main diagonal elements represent a major portion of the energy. The off-diagonal elements are a
result of the reverberant enclosure.
Not only the performance of the circular harmonics in terms of energy compaction towards the main diagonal is of interest, but also their ability to represent the MIMO system by as few coefficients as possible. This can be measured by the energy compaction
of the different representations. Results for the reverberant case are shown in Fig. 5.19.
This figure illustrates the energy compaction according to Eq. (5.21) for the room transfer matrix in the pressure domain, in the plane wave decomposed domain and in the circular harmonics domain.

(a) pressure domain Rpw = 0
(b) circular harmonics domain Rpw = 0
(c) pressure domain Rpw = 0.8
(d) circular harmonics domain Rpw = 0.8

Figure 5.18: Energy of the room transfer matrix for the signals captured by the pressure microphones, E(m, n), and in the circular harmonics domain. The two rows show the results for the different plane wave reflection factors Rpw at the walls of the simulated room.

Figure 5.19: Energy compaction performance EC(i) for the different representations of the room transfer matrix. Shown are the results for Rpw = 0.8.
The energy compaction of the SVD is shown
for reference. It can be seen clearly that the circular harmonics representation of the
room transfer matrix compacts the energy much better than its pressure and plane wave
representations. It can also be seen that the SVD provides the optimal transformation in
this sense.
The presented results indicate that the circular harmonics decomposition is not the optimal transformation for the reverberant case. However, the results prove that this transformation provides a quite reasonable approximation. The major benefit of using the
circular harmonics decomposition instead of the SVD for the WDAF concept is its data
independence.
Room Compensation Performance
The following results illustrate the performance of the proposed active room compensation
algorithm. Figure 5.20(a) shows the energy of the elements of the plane wave
decomposed room transfer matrix for the reverberant case. Only energy on the main
diagonal would be present in the free-field case, since each incident plane wave would


(a) energy of the elements of the plane wave decomposed room transfer matrix
(b) plane wave decomposed target wave field: plane wave with θ0 = 180°
(c) plane wave decomposed reproduced wave field without room compensation
(d) plane wave decomposed reproduced wave field with room compensation

Figure 5.20: Results obtained by the proposed WDAF based active room compensation
algorithm using a circular WFS/WFA system placed in a rectangular room with Rpw = 0.8
at the walls. The results for converged compensation filters are shown.


only produce a contribution at its incidence angle (see Eq. (4.107)). However, for the shown reverberant case, contributions at angles other than the incidence angle are also present due to the reflections caused by the listening room. Results for one particular
incident plane wave are shown in Figures 5.20(b) to 5.20(d). The figures show the plane
wave decompositions of the desired wave field and the resulting wave fields without and
with applying room compensation. The compensation filters were adapted using white
noise. This results derived this way can be reproduced easily. However, the performance
of the adaptation when using signals like speech or music will be typically lower than for
white noise. The desired wave field for the results shown is a band-limited (sinc shaped)
plane wave with an incidence angle of 0 = 180o . The plane wave decomposition of the
(M) () is shown in Fig. 5.20(b). The plane wave decomposition of
desired wave field a
(M)
the wave field reproduced within the listening area l0 () without compensation is illustrated in Fig. 5.20(c). The reflections caused by the listening room are clearly visible.
Figure 5.20(d) shows the resulting wave field l(M) () when applying the proposed room
compensation algorithm. There are only some slight variations from the target wave field
visible. Figure 5.21(a) shows the energy of the plane wave components shown in Figures 5.20(b) to 5.20(d). The results prove that the compensation filters are capable of
compensating the undesired contributions from directions other than the incidence angle
of the target plane wave. Figure 5.21(b) shows the adaptation error for this particular
simulated scenario. Please note that, due to the limited bandwidth of the room compensation system, only very limited information is available to adapt the coefficients of the room compensation filters. However, the results prove that fast and stable adaptation is possible even in this scenario using the WDAF concept.
The room compensation filters for the shown results were fed back into the simulation of
the listening room in order to visualize the impact of room compensation [PSR05]. Figure 5.22 shows the resulting wave fields without and with room compensation for three
different time-instants. The left column of Figure 5.22 shows the resulting wave fields for the
simulated reverberant room without applying room compensation. The upper wave field
for t = 0 ms shows the desired plane wave in the center of the array, the middle wave
field for t = 6.8 ms shortly before it reaches the upper wall, and the lower one for t = 18.1 ms when
the reflected plane wave enters the listening area again traveling in the opposite direction.
The disturbances caused by the reflections off the walls of the listening room can be seen clearly. The right column of Figure 5.22 shows the results when applying the WDAF based
room compensation algorithm. The wave fields after convergence of the compensation
filters are shown. It can be seen clearly that the room compensation algorithm is capable
of actively compensating the listening room reflections within the listening area. Only
the desired band-limited plane wave is present inside the circular array.


(a) energy of plane wave components for converged compensation filters
(b) mean squared adaptation error

Figure 5.21: Results of room compensation for a band-limited (sinc shaped) plane wave with incidence angle θ0 = 180° (simulated circular WFS/WFA system).

t = 0 ms
t = 6.8 ms
t = 18.1 ms

Figure 5.22: The left column shows the wave field without room compensation for different time-instants. The right column shows the results when applying room compensation for converged room compensation filters.

5.3.4.2 Rectangular WFS System

The second setup that will be evaluated in the sequel consists of a rectangular WFS and
a circular WFA system. The geometry of a rectangular WFS system is a better match to
the geometry of typical listening rooms. The size and position of the loudspeaker and microphone array was simulated according to Fig. D.3. The simulated WFS system consists
of 60 loudspeakers placed equidistantly at a spacing of Δx ≈ 0.27 m on a rectangular contour with the dimensions 3.50 m × 3.50 m. The loudspeaker spacing is chosen such as to result in approximately the same aliasing properties as for the circular WFS system discussed in the previous section. According to Section 4.1.6.1 and Eq. (4.29), this particular loudspeaker spacing results in an aliasing frequency of fal ≈ 625 . . . 1250 Hz, depending on the incidence angle of a reproduced plane wave. As for the setup used in
the previous section, the simulated WFA system consists of 48 angular sampling positions with a radius of Rmic = 0.75 m. All signals were low-pass filtered to a frequency of
fLP = 650 Hz in order to keep the aliasing artifacts reasonably small.
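The dependence of the aliasing frequency on the incidence angle quoted above can be sketched numerically. The following Python fragment assumes the common form fal = c/(Δx(1 + |sin θpw|)) for the spatial aliasing frequency; Eq. (4.29) itself is not reproduced in this excerpt, but this assumed form is consistent with the quoted range of 625…1250 Hz for Δx ≈ 0.27 m.

```python
import math

C = 343.0  # speed of sound in m/s

def wfs_aliasing_frequency(dx, theta_pw, c=C):
    # Assumed textbook form of the spatial aliasing frequency for a
    # loudspeaker spacing dx and a plane wave incidence angle theta_pw (rad).
    return c / (dx * (1.0 + abs(math.sin(theta_pw))))

dx = 0.27                                        # loudspeaker spacing in m
f_max = wfs_aliasing_frequency(dx, 0.0)          # most benign incidence
f_min = wfs_aliasing_frequency(dx, math.pi / 2)  # worst-case incidence
```

For Δx = 0.27 m this yields roughly 635…1270 Hz, which motivates low-pass filtering all signals to fLP = 650 Hz as done above.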
The remainder of this section will only show results for the singular vectors and the energy compaction performance in order to demonstrate the applicability of the circular harmonics decomposition as transformation for the WDAF concept. However, further simulations have been performed; they revealed a performance of active room compensation similar to that for the circular WFS system discussed in the previous section.
Singular Vectors

Figure 5.23(a) shows the absolute value of the first eight right singular vectors for the free-field case (Rpw = 0) and a frequency of f = 80 Hz. The singular vectors have been sorted by their descending singular values and thus by their energy. As for the circular WFS system, the results are similar to the angular basis functions of the circular harmonics (see Fig. 2.4). The singular vectors for the reverberant case (Rpw = 0.8) are shown in Fig. 5.23(b). It can be seen that they are deteriorated to some extent in comparison with the circular harmonics; however, their coarse structure still matches the circular harmonics. As expected, the presented results indicate that circular harmonics provide a suitable transformation for the rectangular WFS system as well.
Energy Compaction
Figure 5.24 shows the energy of the elements of the room transfer matrix for the pressure microphones and its circular harmonics representation for the reverberant case. Please note the different scales used for the results shown. Figure 5.24(a) shows E(m, n) for the pressure microphones and Fig. 5.24(b) shows the energy of the room transfer matrix represented in circular harmonics. As desired, the main diagonal elements in the circular harmonics domain represent a major portion of the energy.

(a) reflection factor Rpw = 0.0
(b) reflection factor Rpw = 0.8

Figure 5.23: Absolute value of the first eight right singular vectors (f = 80 Hz) for the simulated rectangular WFS/circular WFA system sorted by descending singular values (top left to bottom right). The singular vectors for two different plane wave reflection factors Rpw at the walls of the simulated room are shown.

(a) pressure domain
(b) circular harmonics domain

Figure 5.24: Energy of the room transfer matrix for the signals captured by the pressure microphones E(m, n) and in the circular harmonics domain. Shown are the results for Rpw = 0.8.

The energy compaction of the different representations for the reverberant case is shown in Fig. 5.25. It can be seen clearly that the circular harmonics representation of the room transfer matrix compacts the energy much better than its pressure or plane wave representation. As for the circular WFS system, the results prove that the circular harmonics decomposition of the room transfer matrix provides the desired properties. This result is evident, since an optimally designed WFS system should have no influence on the optimal transformation used for WDAF.
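The diagonalizing effect of the circular harmonics decomposition discussed above can be illustrated with a small numerical sketch. It assumes an idealized, perfectly rotationally symmetric setup, in which the room transfer matrix becomes circulant and a spatial DFT over the angular index (the discrete counterpart of the circular harmonics decomposition on a circle) diagonalizes it exactly; the coupling function g is invented for this illustration and is not the thesis' simulated room.

```python
import numpy as np

M = 48  # number of loudspeakers = number of angular sampling positions

# Toy rotationally symmetric transfer matrix: the coupling between
# loudspeaker n and microphone m depends only on their angular distance,
# so H is circulant with H[m, n] = g[(n - m) mod M].
idx = np.arange(M)
g = np.exp(-0.5 * np.minimum(idx, M - idx))      # decaying angular coupling
H = np.array([np.roll(g, m) for m in range(M)])

# Unitary spatial DFT over the angular index.
F = np.fft.fft(np.eye(M)) / np.sqrt(M)
H_wave = F @ H @ F.conj().T                      # wave domain representation

diag_energy = np.sum(np.abs(np.diag(H_wave)) ** 2)
total_energy = np.sum(np.abs(H_wave) ** 2)
```

In this idealized case all energy ends up on the main diagonal; reverberation and array imperfections break the rotational symmetry and produce the off-diagonal energy visible in Fig. 5.24(b).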

5.3.5 Results based on Measurement of the Acoustic Environment

Up to now, two-dimensional simulations of acoustic environments were used to assess the performance of the proposed active listening room compensation algorithm. To gain insight into its performance in real-world applications, measurements have been taken from a laboratory WFS/WFA system. These measurements were then used to evaluate the WDAF based room compensation algorithm depicted in Fig. 5.16. The derived results are discussed in the following.

(curves: SVD, circular harmonics, plane wave decomposed, pressure; energy compaction EC(i) [dB] over index i)

Figure 5.25: Energy compaction performance EC(i) for the different representations
of the room transfer matrix for the rectangular WFS system. Shown are the results for
Rpw = 0.8.

The exact geometry of the laboratory setup is described in Appendix D.1. The listening room is equipped with removable curtains on all sides in order to control its acoustic properties. These curtains were removed for the measurements. The measured setup consists of a 48 channel circular WFS system with a radius of RLS = 1.50 m. The system is realized with 48 two-way high-quality loudspeakers driven individually by multichannel amplifiers. For a detailed description of the WFS system see Section 5.1.5 and Appendix D.1. The WFA system used for the measurements consists of 48 angular sampling positions with a radius of Rmic = 0.75 m. The measurements were taken at the sampling positions in a sequential fashion using a stepper motor unit with a rod mounted onto it. Both a pressure and a pressure-gradient microphone were used for the measurements. They were mounted at opposite ends of the rod, and the measurements were post-processed to compensate for the opposite positions. See Section 5.2.5 and Appendix D.1 for a detailed description of the circular WFA system. As stated in Section 5.1.3.2 and Section 5.2.4.2, elevated reflections can neither be compensated nor analyzed properly by two-dimensional WFS and WFA systems. The ceiling of the measured listening room is properly damped; the floor was damped by damping material placed below the WFS system for the measurements.

Figure 5.26: Setup used for the measurement of the listening room transfer matrix. The gray curtains were removed for the measurements used in Section 5.3.5.

Figure 5.26 illustrates the measurement setup. As for the simulated acoustic environment, all signals were low-pass filtered to a frequency of fLP = 650 Hz in order to keep the aliasing artifacts reasonably small. The properties of the transformed signal representations and the room compensation performance will be discussed in the sequel.

Singular Vectors

Figure 5.27 shows the absolute value of the first eight right singular vectors derived from an SVD of the measured room transfer matrix. Shown are the singular vectors for a frequency of f = 80 Hz. The singular vectors are quite deteriorated compared to the simulation results shown in Fig. 5.17(b). However, their coarse structure is still equivalent to the circular harmonics. Possible reasons for the deterioration are measurement and ambient noise, deviations of the directivity and frequency response of the microphones used from ideal omnidirectional and figure-of-eight characteristics, and elevated reflections.


Figure 5.27: Absolute value of the first eight right singular vectors (f = 80 Hz) of the
measured circular WFS/WFA system sorted by descending singular values (top left to
bottom right).

Energy Compaction

Figure 5.28 shows the energy of the elements of the room transfer matrix for the pressure microphones and its circular harmonics representation for the measured setup. Please note the different scales used. Figure 5.28(b) shows the energy of the room transfer matrix represented by its circular harmonics decomposition. The main diagonal elements represent a major portion of the energy. The off-diagonal elements are a result of the reverberation of the listening room and the measurement errors mentioned before. The structure of the energy representation in the circular harmonics domain is quite similar to the simulation results shown in Fig. 5.18(d). This shows that the simulations can be used quite well to predict the real-world performance.
The energy compaction of the different representations for the measured room is shown in
Fig. 5.29. The energy compaction performance of the circular harmonics for the measured
setup is only slightly lower than for the simulated setup shown in Fig. 5.19.
Summarizing, the results obtained from the measurement of the circular WFS/WFA setup
prove that the circular harmonics decomposition of the room transfer matrix provides the
desired properties for the WDAF concept. Hence, the circular harmonics decomposition
is a reasonable transformation for WDAF based active room compensation in real-world
applications.

(a) pressure domain

(b) circular harmonics domain

Figure 5.28: Energy of the room transfer matrix for the signals captured by the pressure microphones E(m, n) and in the circular harmonics domain for the measured circular WFS/WFA setup.
Room Compensation Performance
The investigation of the singular vectors and the energy compaction for the measured
setup showed results similar to those for the simulated acoustic environment used in Section 5.3.4.1. Simulations like those shown in Fig. 5.20 and Fig. 5.21 have also been performed for the measured setup, with similar results. However, these simulations only
prove the performance at the reference (microphone) positions. For the two-dimensional
environments these results are also valid for positions inside the microphone array. This
was shown for the simulated setup in Fig. 5.22. For three-dimensional environments and
two-dimensional WFS systems amplitude errors and elevated reflections will limit the
achievable room compensation performance as discussed in Section 5.1.3. The influence
of these and other artifacts will be discussed briefly in the following section.
The results of active room compensation shown for the simulated environments included only stationary acoustic scenes. In order to illustrate the tracking capabilities of the proposed algorithm, a non-stationary scenario was additionally simulated. For this purpose a point source at a distance of r = 5 m to the center of the WFS system was chosen as desired virtual source wave field. The angle of the virtual point source was changed every T = 5 s, starting from θ1 = 0° and moving to θ2 = 90°. The adaptation was performed using white noise as the virtual source signal.

(curves: SVD, circular harmonics, plane wave decomposed, pressure)

Figure 5.29: Energy compaction performance EC(i) for the different representations of the room transfer matrix for the measured circular WFS/WFA setup.

Figure 5.30 shows the resulting adaptation error. The error decays quickly for the first position, indicating fast convergence of the room compensation filters. The error increases slightly at the times the virtual point source changes its position, revealing that the room compensation filters have to be re-adapted slightly to the changed source position. The reason for this renewed adaptation is that the new virtual source position excites portions of the room that have not been excited before. However, the fact that the increase is only slight indicates that the major characteristics of the room are already captured from the previous position. The results for the moving point source prove that the proposed algorithm is capable of handling non-stationary virtual scenes without major degradations, since the major room characteristics are captured by only a few coefficients. Traditional multi-channel adaptation algorithms would for this reason not exhibit the superior tracking performance illustrated in Fig. 5.30. An equivalent behavior to the tracking of a point source discussed above is also known from single-channel adaptive filtering algorithms when not all frequencies of the system concerned are excited by the input signal used for the adaptation
process [SMH95, BMS98]. In this case the adaptation error may increase when new frequency components become active in the input signal.

(curves: mean squared adaptation error [dB] and virtual source angle [°] over time [s])

Figure 5.30: Mean squared adaptation error Eadapt (k) for a moving point source for the measured circular WFS/WFA setup.
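The single-channel effect described above can be reproduced with a few lines of code. The sketch below is an illustration only: a normalized LMS filter identifies an invented three-tap system while the input contains a single sinusoid, and the error jumps once a second frequency component becomes active at sample 2000, before the filter reconverges.

```python
import numpy as np

N, L = 4000, 3
h = np.array([1.0, -0.5, 0.25])      # unknown system (made up)
n = np.arange(N)
x = np.sin(0.3 * n) + np.sin(1.1 * n) * (n >= 2000)  # 2nd tone from n = 2000
d = np.convolve(x, h)[:N]            # observed system output
xp = np.concatenate((np.zeros(L - 1), x))
w = np.zeros(L)                      # adaptive filter
e = np.zeros(N)
mu = 0.5
for k in range(N):
    xvec = xp[k:k + L][::-1]         # [x[k], x[k-1], x[k-2]]
    e[k] = d[k] - w @ xvec
    w += mu * e[k] * xvec / (xvec @ xvec + 1e-8)

mse_before = np.mean(e[1950:2000] ** 2)  # converged, one tone active
mse_after = np.mean(e[2000:2050] ** 2)   # error bump when new tone appears
```

With only one tone active, the filter matches the system response at that frequency alone; the newly excited frequency reveals the mismatch, just as a new virtual source position excites previously unexcited portions of the room.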

5.3.6 The Influence of WFS and WFA Artifacts

Using two-dimensional reproduction and analysis techniques in a three-dimensional environment may cause artifacts during reproduction and analysis of the wave field in the listening area. This section briefly discusses the impact of these and other artifacts on the performance of active listening room compensation.
For WFS the major limiting artifacts for room compensation are the amplitude errors and the limited suppression capabilities for elevated reflections discussed in Section 5.1.2. For WFA the major limiting artifact is the limited analysis capability for elevated reflections, as discussed in Section 5.2.3. The results derived in Sections 5.3.4 and 5.3.5 proved that a (perfect) compensation of the listening room reflections is, in principle, possible using the proposed active room compensation algorithm. However, these results only show the performance at the microphone positions, since they are based on the two-dimensional wave field decompositions derived from the microphone measurements. For the two-dimensional simulated environments these results are also valid within the entire listening area, as illustrated by Fig. 5.22. For three-dimensional environments, like the measured setup, the artifacts of WFS and WFA have to be taken into account to predict
the performance within the listening area. It was already stated in Section 5.1.3.2 and Section 5.2.4.2 that elevated reflections should be suppressed by passive damping methods due to the limited suppression and analysis capabilities of two-dimensional WFS and
WFA systems. The quantitative influence of elevated reflections for the measured circular setup was derived in Section 5.1.3.2 and Section 5.2.4.2. Assuming proper passive damping of elevated reflections, the major remaining artifacts are the amplitude errors of two-dimensional WFS. These will effectively limit the achievable performance of active listening room compensation, depending on the listener position. A quantitative analysis of the influence of the amplitude error for a circular WFS system that matches the measured setup was given in Section 5.1.3.1. Up to now, the only countermeasure against these amplitude errors is to use line sources instead of point sources as secondary sources.
Besides the artifacts discussed above, other effects will also influence the performance of an active listening room compensation system. These will be mentioned only briefly here. On the analysis side, additional limiting effects include, e.g., microphone mismatch and position errors, microphone and pre-amplifier noise, and ambient noise. On the reproduction side, limiting effects include, e.g., the directivity and nonlinear characteristics of the loudspeakers used.

5.3.7 Possible Modifications of the Proposed Algorithm

The WDAF based active listening room compensation algorithm for WFS, as introduced
in Section 5.3.2, minimizes the mean squared error between the desired wave field and the
actually reproduced wave field. The complexity reduction for the adaptation of the room
compensation filters in the WDAF concept is based on a spatio-temporal transformation
of the MIMO system represented by the listening room. In the sequel some possible
modifications which may further decrease the complexity of the proposed algorithm will
be briefly discussed on a qualitative level. Possible modifications could be based on:
1. modification of the wave domain transformation,
2. performing spatio-temporal selective compensation of reflections, and
3. modification of the (single-channel) adaptation algorithm used.
The results shown for the simulated and measured acoustic environments in the previous
sections illustrated that the circular harmonics decomposition provides a quite reasonable
choice for the wave domain transformation for circular analysis arrays and typical listening
rooms. However, there may exist other analytic transformations which perform better.
A possible candidate is the use of elliptical harmonics [MF53a] as the basis for the wave domain transformation for rectangular listening rooms.
The presented algorithm compensates all reflections of the listening room. The complexity of the active room compensation algorithm can be reduced further by focusing on specific reflections of the listening room. Since the reproduction using WFS and the analysis using WFA is performed in a spatio-temporal fashion, this concentration could be performed in the temporal domain, the spatial domain or both. Examples for these cases could be to compensate only early reflections, reflections emerging from a rigid wall, or early reflections emerging from a rigid wall. The focusing plane wave decomposition [Hul04] may be used for this purpose. The particular choice of the spatio-temporal selectivity of active room compensation should be based on psychoacoustic criteria. This would allow concentrating on the removal of only those reflections that are relevant from a psychoacoustic viewpoint. The benefit of spatio-temporal selectivity is that it may result in shorter or fewer room compensation filters.
The basic concept of WDAF is to decouple the room transfer matrix and hence the MIMO adaptive filter algorithm into several SISO adaptive filter algorithms. The particular choice of the SISO adaptive filter algorithm used is independent of this decoupling. Efficient implementations or approximations of the filtered-x RLS algorithm may be used for this purpose (e.g. the X-LMS).
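After the wave domain transformation, each decoupled branch is an ordinary SISO adaptive inverse filtering problem. The fragment below sketches one such branch with a normalized filtered-x LMS update, a cheap stand-in for the filtered-x RLS algorithm named above; the two-tap secondary path, the pure-delay target and all lengths are invented for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 4000, 8
x = rng.standard_normal(N)                   # transformed source signal
s = np.array([1.0, 0.5])                     # toy secondary-path model
d = np.concatenate((np.zeros(2), x[:-2]))    # desired: pure 2-sample delay
xf = np.convolve(x, s)[:N]                   # reference filtered by s
xp = np.concatenate((np.zeros(L - 1), x))
xfp = np.concatenate((np.zeros(L - 1), xf))
w = np.zeros(L)                              # compensation filter
y = np.zeros(N)
e = np.zeros(N)
mu = 0.1
for k in range(N):
    y[k] = w @ xp[k:k + L][::-1]             # driving signal
    r = s[0] * y[k] + (s[1] * y[k - 1] if k > 0 else 0.0)  # reproduced sample
    e[k] = d[k] - r                          # wave domain error
    xfvec = xfp[k:k + L][::-1]
    w += mu * e[k] * xfvec / (xfvec @ xfvec + 1e-8)  # filtered-x NLMS step

mse_start = np.mean(e[:200] ** 2)
mse_end = np.mean(e[-200:] ** 2)
```

The filter converges towards the (truncated) delayed inverse of s; running one such loop per transformed component replaces the joint MIMO adaptation.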

5.3.8 Compensation of Listening Room Reflections above the Spatial Aliasing Frequency

Active compensation of the listening room reflections within the entire listening area is
in principle only possible below the spatial aliasing frequency of the reproduction system
used. Above the aliasing frequency no control over the wave field within the listening
area is gained. As a consequence, destructive interference to cancel the listening room
reflections will fail. Perfect compensation of the listening room reflections will only be
possible at the microphone positions above the aliasing frequency. Hence, above the
aliasing frequency the proposed active listening room compensation system will exhibit
performance similar to that of the traditional multi-point approaches reviewed in Section 1.2.
From a theoretical viewpoint nothing can be done to overcome the limitations imposed on active room compensation by spatial sampling. However, these limitations can be relaxed to some extent in a more practical sense. This could be done, for example, by (1) incorporating the knowledge of the listening room reflections into the virtual scene during reproduction or (2) utilizing the properties of human spatial perception in order to hide the listening room reflections. In the first approach, equivalent reflections present in the virtual scene and in the listening room could be replaced by the ones of the listening room. In the second approach, reflections of the listening room could be hidden by modifying the virtual scene to be reproduced according to psychoacoustic spatial masking effects. However, no solutions or algorithms that successfully demonstrate the applicability of these ideas were known to the author at the time this work was written. A broad overview of some of these ideas can be found in [CN03].

5.4 Room Compensation for other Spatial Reproduction Systems

The application of WDAF based active listening room compensation to WFS systems was illustrated in Section 5.3. The results shown there prove that active listening room compensation is capable of removing the reflections caused by a reverberant listening room. However, the formulation of the generic framework for room compensation developed in Section 4.5 does not depend on the specific reproduction or analysis system used. It is also not limited to two-dimensional reproduction systems; it can be extended straightforwardly to three-dimensional reproduction systems.
The particular reproduction and analysis systems used to assemble an active room compensation system have to fulfill two requirements in order to be able to compensate for the listening room reflections in the entire listening area. These are: (1) the reproduction system must be capable of controlling the wave field within the entire listening area, and (2) the analysis system must be capable of analyzing the wave field reproduced within the entire listening area. Both requirements are fulfilled by the different WFS systems and the circular WFA system used for the room compensation setups presented in Sections 5.3.4 and 5.3.5. Any reproduction and analysis system that fulfills the requirements stated above can be used to construct an advanced active listening room compensation system which results in a large compensated area. It remains to find a suitable transformation in order to decouple the MIMO system represented by the room transfer matrix. A suitable data-dependent transformation that can always be applied is given by the GSVD, as was shown in Section 4.5.3. No generic data-independent transformation for the WDAF concept can be given, since this transformation depends on the geometry of the particular problem. Hence, it has to be derived for each specific problem. Two examples for the application of the proposed techniques to other reproduction scenarios are briefly given in the following.
Higher-order ambisonics [Dan00, Ger85] is an alternative spatial reproduction system to WFS. It has been shown that ambisonics is equivalent to WFS when discarding the effects of spatial sampling [NE98, NE99, DNM03]. It was also shown in [DNM03] that WFS and ambisonics exhibit different characteristics for the aliasing artifacts present above the spatial aliasing frequency. There are indications that the spatial aliasing artifacts of an ambisonics system scale inversely with the distance to the center of the system, while for WFS the aliasing artifacts are spread over the entire listening area. For an ambisonics system the virtual source wave fields are represented by their ambisonics representation. This representation is equivalent to the circular harmonics decomposition of wave fields. Thus, if a circular WFA system is used to analyze the reproduced wave field, the ambisonics representation may be used for a WDAF based active room compensation system for ambisonics. Active listening room compensation systems based on the ambisonics representation have been introduced, e.g., by [BA05b, BA05a, SH00]. However, the concept of WDAF is not explicitly exploited there.
As indicated above, the concept of WDAF based active room compensation could also be extended to the case of three-dimensional reproduction systems. The theory of three-dimensional reproduction based on the Kirchhoff-Helmholtz integral was introduced in Section 4.1.1. The basic idea is to surround the listening volume by loudspeakers. A suitable system for the analysis of the reproduced wave field could be based on a spherical microphone array [Mey04]. In this case spherical harmonics [Wil99] may provide a suitable basis for WDAF, as these are the three-dimensional counterpart of circular harmonics.


Chapter 6
Summary and Conclusions
Since its early days, sound reproduction has aimed at creating the perfect acoustic illusion. Many reproduction systems have emerged over the past decades with the ambition to fulfill this goal. However, the perfect acoustic illusion has never been realized by any of them. Nevertheless, sound reproduction has improved considerably in terms of quality and spatial impression compared to the first monophonic systems [Tor98, Ste96, Gri00].
The theory behind most reproduction systems assumes free-field propagation from the loudspeakers to the listener. Unfortunately, this assumption does not hold for reproduction systems placed in a listening room. The listening room is typically a compromise between design, cost and acoustic requirements. As a consequence, most listening rooms are more or less reverberant and will impose their reflections onto the reproduced wave field. This work developed an improved framework for active compensation of the listening room acoustics.
The basic idea behind this framework is to utilize the existing sound reproduction system in order to cancel the reflections imposed by the listening room through destructive interference. This additionally requires analyzing the influence of the listening room on the reproduced wave field. In order to be able to compensate the reflections for the entire listening area, two basic requirements were derived for the reproduction and analysis system. These are: (1) the analysis system should be able to (perfectly) analyze the reproduced wave field within the listening area, and (2) the reproduction system should provide (perfect) control over the reproduced wave field within the listening area.
Chapter 3 introduced WFA based on the plane wave and circular harmonics decomposition as a solution to the first requirement. In particular, circular microphone arrays have proven to provide many desirable properties for WFA. Section 4.1 derived a generic theoretical framework for sound reproduction systems. Wave field synthesis was introduced in Section 5.1 as a particular implementation of a spatial sound reproduction system that fulfills the desired control capabilities.
Both WFA and WFS systems are realized by densely placed microphones and loudspeakers. A large number of analysis and reproduction channels is required in order to provide sufficient analysis and control capabilities for even a modest upper frequency limit and size of the listening area. Please note that most of the multiple-point equalization schemes use only a relatively low number of channels and thus do not provide sufficient acoustic analysis and control. It was derived in Section 4.4.4 that traditional approaches to adaptive inverse filtering are not applicable in the context of active room compensation for massive multichannel reproduction systems. As a solution, a spatial decoupling of the MIMO adaptive inverse filtering problem on the basis of signal and system transformations was proposed. It was shown that the optimal transformation is given on the basis of the GSVD. However, the GSVD depends on the acoustic characteristics of the listening room. In general these characteristics are unknown and may change over time. In order to resolve these problems, the concept of WDAF proposes to use a transformation which is independent of the particular characteristics of the listening room. As a drawback, this transformation will not result in an optimal spatial decoupling of the multichannel adaptive system.
Section 5.3.2 used WFA, WFS and WDAF as building blocks for an improved listening
room compensation system. The results derived from off-line simulations of this system
proved that the proposed methods are able to provide room compensation for the entire
listening area. Additionally, the computational complexity has been lowered significantly
by WDAF.
It was shown by [Hay96] that the adaptive filter building block can be used for four basic classes of applications: (1) inverse modeling, (2) identification, (3) interference canceling, and (4) prediction. Thus, the presented theoretical framework of WDAF can also be applied to other massive multichannel adaptation problems. The first application class has been discussed in detail within this work. The second class is typically used for acoustic echo cancellation (AEC) systems, the third for acoustic human-machine interfaces, and the fourth for coding of signals. The concept of WDAF was also successfully applied to massive multichannel AEC by [BSK04b, BSK04c, BSK04a] and massive multichannel adaptive beamforming by [HNS+05]. An approach that uses spatial transformations for the coding of massive multichannel audio was published in [TGAA+03]. Thus, the concept of WDAF provides a unified framework for massive multichannel adaptive filtering. The application of this framework is not necessarily limited to acoustic wave fields; it may also be very useful for non-acoustic applications. Due to its generality, the WDAF approach for massive multichannel adaptation problems has also been patented [BSHK03a, BSHK03b].
This work focused mainly on the theoretical foundation of WFA, sound reproduction, eigenspace adaptive filtering and WDAF. However, for a practical implementation of these techniques, further practical aspects have to be considered. Some of them have already been mentioned in Sections 5.2.5 and 5.3.6, e.g. transducer mismatch and misplacement, and transducer and amplifier noise. These issues have been discussed for arbitrary microphone arrays, e.g., by [Tre02, JD93, BW01] and for circular microphone arrays, e.g., by [Teu05]. If not handled properly, the resulting artifacts may reduce the achievable room compensation gain considerably.
Active listening room compensation, as introduced in this work, aims at perfectly canceling the reflections imposed by the listening room. From a psychoacoustic viewpoint this may not be necessary or desired in all situations [CNC04]. For a particular scenario it may be sufficient to, e.g., cancel only the early reflections produced by one wall of the listening room. The presented framework can be extended straightforwardly to handle such cases, since it provides full control over the spatio-temporal structure of the reproduced field. As a benefit, this may further reduce the complexity of active listening room compensation algorithms. However, the author is not aware of research results from the field of psychoacoustics that could be applied straightforwardly for this purpose.


Appendix A
Notations
The following section introduces the notations used throughout this thesis.

A.1 Conventions

The following conventions are used in this thesis: for scalar variables, lower case denotes the time domain and upper case the temporal frequency domain. Vectors are denoted by lower case boldface and matrices by upper case boldface variables. The temporal frequency domain for vectors and matrices is denoted by underlining the respective variables. The spatial frequency domain is denoted by a tilde placed over the symbol. The following table summarizes the conventions using the example of the variable p:

domain

scalar

vector

matrix

space-time domain

p(x, t)

p(x, t)

P(x, t)

temporal frequency domain

P (x, )

p(x, )

P(x, )

spatial frequency domain

p(k, t)

(k, t)
p

P(k,
t)

spatio-temporal frequency domain

P (k, )

(k, )
p

P(k,
)

The quantities may additionally have the following decorations, shown here schematically for the quantity P(ω): a symbol placed above the quantity (1), a symbol placed in the exponent (2), and a symbol placed in the index (3). The symbols (1)-(3) represent the following:


(1) The domain into which a quantity has been transformed is denoted by the following symbols over the respective quantity:

    domain                               decoration
    spatial frequency domain             P̃(ω)
    Fourier series coefficient           P̊(ω)
    plane wave decomposition             P̄(ω)
    cylindrical harmonics coefficient    P̆(ω)

(2) The traveling direction of a traveling wave is denoted by the following symbols placed in the exponent of the respective quantity:

    traveling direction    decoration
    incoming wave          (1)
    outgoing wave          (2)

(3) The coordinate system in which a quantity is defined is denoted by the following symbols used in the index of the respective quantity:

    coordinate system    symbol    decoration
    Cartesian            C         P_C(·)
    spherical            H         P_H(·)
    cylindrical          Y         P_Y(·)
    polar                P         P_P(·)

A.2  Abbreviations and Acronyms


ASR     aliasing-to-signal ratio
DFBT    discrete Fourier-Bessel transformation
DFT     discrete Fourier transform
DTFT    discrete-time Fourier transform
FFT     fast Fourier transformation
FIR     finite impulse response
FTM     functional transformation method
LMS     least mean squares
LSE     least-squares error
LSI     linear space-shift invariant
LTI     linear time-shift invariant
LTSI    linear time- and space-shift invariant
MIMO    multiple-input/multiple-output
MISO    multiple-input/single-output
MMSE    minimum mean squared error
MSE     mean squared error
PWD     plane wave decomposition
RASR    reproduced aliasing-to-signal ratio
RLS     recursive least-squares
SIMO    single-input/multiple-output
SISO    single-input/single-output
VBAP    vector base amplitude panning
WDAF    wave domain adaptive filtering
WFA     wave field analysis
WFS     wave field synthesis
X-LMS   filtered-x least mean squares
X-RLS   filtered-x recursive least-squares

A.3  Mathematical Symbols
(·)^{-1}      inverse operation of (·)
diag{·}       a diagonal matrix formed by the listed entries
min{·}        minimum of the scalars given in the argument
rk{·}         rank of a matrix
(·)^H         Hermitian (conjugate transpose) of (·)
(·)^T         transpose of (·)
(·)^*         complex conjugate of (·)
(·)^+         pseudoinverse of a matrix
∇             nabla operator
∗             convolution operator (time domain)
∗_𝐱           multi-dimensional convolution operator with respect to 𝐱
∀             for all
∗_x           convolution operator with respect to x
⊛             cylindrical/polar convolution with respect to x
Δ             Laplace operator
δ(·)          Dirac pulse
δ(x)          multi-dimensional Dirac pulse
δ[x]          discrete unit pulse
∝             proportional to
ℑ{·}          imaginary part
ℜ{·}          real part
∂(·)/∂n       gradient in direction of the vector n
⟨·, ·⟩        inner product of two vectors

Transformations

F_t{·}            Fourier transformation with respect to the time t
F_t^{-1}{·}       inverse Fourier transformation with respect to the time t
F_x{·}            spatial Fourier transformation with respect to the x-coordinate
F_x^{-1}{·}       inverse spatial Fourier transformation with respect to the x-coordinate
F_𝐱{·}            spatial Fourier transformation with respect to the position vector 𝐱
F_𝐱^{-1}{·}       inverse spatial Fourier transformation with respect to the position vector 𝐱
FS_ν{·}           Fourier series with respect to the angular variable
FS_ν^{-1}{·}      inverse Fourier series with respect to the angular variable
H_{ν,r}{·}        ν-th order Hankel transformation with respect to the variable r
H_{ν,R}{·}        ν-th order finite Hankel transformation
H^{(1)}_{ν,r}{·}  ν-th order complex Hankel transformation of first kind with respect to the variable r
H^{(2)}_{ν,r}{·}  ν-th order complex Hankel transformation of second kind with respect to the variable r
DFBT_ν{·}         ν-th order discrete Fourier-Bessel (Hankel) transformation
S{·}              response of a system
T_i{·}            generic WDAF transformation
M{·}              conformal mapping
P{·}              plane wave decomposition
P^{-1}{·}         inverse plane wave decomposition
P_S{·}            discrete space plane wave decomposition
P_S^{-1}{·}       inverse discrete space plane wave decomposition


Symbols

0_{m×n}              vector or matrix of size m × n with zeroes
Ш_P(α, r)            two-dimensional polar pulse train
Ш_P(α)               angular pulse train
α [rad]              polar coordinate of the cylindrical/polar coordinate system
α_az [rad]           azimuth angle of the spherical coordinate system
θ_pw [rad]           plane wave incidence angle
β_μ                  discrete eigenvalues of the FTM
ε_al(θ, R, ω)        energy of aliasing contributions
ε_tr(M_tr, R, ω)     truncation error
ζ(x_C, t)            argument of the d'Alembert solution
η                    index for general purpose
η_al                 spectral repetitions constituting aliased plane waves
θ [rad]              polar angle of the wave vector in cylindrical/polar coordinates, or angle of the plane wave decomposition
θ_az [rad]           azimuth angle of the wave vector in spherical coordinates
θ_μ                  angle of the discrete plane wave decomposition
κ                    index for general purpose
λ                    weighting factor for the least-squares cost function
μ                    index for general purpose
ν                    index for general purpose
ω [rad/s]            temporal radial (angular) frequency
ξ(c̃, k)              weighted least-squares cost function
ρ                    radial variable of the spherical coordinate system
ϱ [kg/m³]            density of the propagation medium
ϱ_0 [kg/m³]          static density of the propagation medium
ς(x_S, ω)            constant of proportionality used for Robin boundary conditions
σ_b                  singular values of the room transfer matrix
τ                    characteristic time of the Sabine-Franklin-Jaeger model
φ_m(x)               modes of a rectangular room
ψ_m(x)               complex modes of a rectangular room
Φ_dd(k)              auto-correlation matrix of the filtered secondary source driving signals
Φ_da(k)              cross-correlation matrix of the filtered secondary source driving signals and the desired signals
Φ_{(n_1,n_1'),(n_2,n_2')}(k)   blocks of the auto-correlation matrix of the filtered secondary source driving signals
Φ_dd,m(k)            auto-correlation matrix of the m-th component of the transformed filtered secondary source driving signals
Φ_da,m(k)            cross-correlation matrix of the m-th component of the transformed filtered secondary source driving signals and the desired signals

A(x, ω)              desired wave field in the listening area
ASR(θ, R, ω)         aliasing-to-signal ratio
a_m(k)               vector consisting of temporal samples of the discretized desired wave field in the listening area
a^{(M)}(k)           vector consisting of spatial samples of the discretized desired wave field in the listening area
a                    constant for general purpose
a(x)                 window function for the selection of secondary sources
a_m(k)               discretized desired wave field in the listening area
B                    number of non-zero singular values of the room transfer matrix
C(k)                 matrix consisting of the coefficients of all discretized room compensation filters
C(x|x', ω)           room compensation filter
c_{n,n'}(k)          vector consisting of temporal samples of the discretized room compensation filter
c(k)                 vector consisting of all estimated coefficients of the discretized room compensation filters
c [m/s]              speed of sound
c_{n,n'}(k)          discretized room compensation filter
circ(·)              circular window function
D_R(k)               matrix of filtered secondary source driving signals
D(x, ω)              secondary source driving signal
D_pw(x_0, ω)         secondary source driving signal for a plane wave
D_corr,pw(x_0, ω)    corrected secondary source driving signal for a plane wave
D_S(x, ω)            sampled secondary source driving signal
d_n(k)               vector consisting of temporal samples of the discretized secondary source driving signal
d^{(N)}(k)           vector consisting of spatial samples of the discretized secondary source driving signal
d_0(x_n, t)          loudspeaker driving impulse response
d_n(k)               discretized secondary source driving signal
E(x, ω)              error between the desired and the reproduced wave field
Ē(θ)                 energy of plane wave components
E(m, n)              energy of the elements of the room transfer matrix
Ē(θ, θ_0)            energy of the elements of the room transfer matrix in the plane wave decomposed domain
E̊(ν, ν_0)            energy of the elements of the room transfer matrix in the circular harmonics domain
E_adapt(k)           mean-squared adaptation error
EC(i)                energy compaction of the room transfer matrix
e_m(k)               vector consisting of temporal samples of the discretized error between the desired and the reproduced wave field
e^{(M)}(k)           vector consisting of spatial samples of the discretized error between the desired and the reproduced wave field
e^{(·)}              exponential function
e_m(k)               discretized error between the desired and the reproduced wave field

F(k)                 matrix consisting of the impulse responses from all synthesis to all analysis positions of the discretized free-field impulse responses
f_{m,n}              vector consisting of temporal samples of the discretized free-field impulse response
f [1/s]              temporal frequency variable
f_s [1/s]            sampling frequency
f(·)                 arbitrary function
f_{m,n}(k)           discretized free-field impulse response
f_al                 lower frequency limit for aliasing contributions of the plane wave decomposition
G(x|x_0, ω)          Green's function
Ḡ(θ|θ_0, ω)          plane wave Green's function
G̊(ν|ν_0, ω)          circular harmonics Green's function
G_0(x|x_0, ω)        free-field Green's function
G_{0,2D}(x|x_0, ω)   two-dimensional free-field Green's function
G_{0,3D}(x|x_0, ω)   three-dimensional free-field Green's function
H_ν^{(1)}(·)         ν-th order Hankel function of first kind
H_ν^{(2)}(·)         ν-th order Hankel function of second kind
J_ν(·)               ν-th order Bessel function of first kind
j                    square root of −1
K                    number of spatial samples combined in vectors
K(x, ω)              transformation kernel of the FTM
K̃(x, ω)              adjoint transformation kernel of the FTM
k [rad/m]            wave vector (boldface)
k_m                  modal wave vector (boldface)
k [rad/m]            wavenumber
k                    discrete time index
k_r [rad/m]          radial wavenumber
k_x [rad/m]          wavenumber in x-direction of the Cartesian coordinate system
k_y [rad/m]          wavenumber in y-direction of the Cartesian coordinate system
k_z [rad/m]          wavenumber in z-direction of the Cartesian/cylindrical coordinate system
L(x, ω)              reproduced wave field in the listening area
l_m(k)               vector consisting of temporal samples of the discretized reproduced wave field in the listening area
l^{(M)}(k)           vector consisting of spatial samples of the discretized reproduced wave field in the listening area
l_m(k)               discretized reproduced wave field in the listening area
M(kR)                filter matrix for the computation of circular harmonics for a circular microphone array
M                    number of discrete analysis positions
N                    number of discrete secondary sources
N_c                  length of the room compensation filter
N_f                  length of the free-field impulse response
N_r                  length of the listening room impulse response
N_m                  normalization factor used for the FTM
n                    surface normal
n_0                  normal vector
P(x, ω)              acoustic pressure
P̂(ω)                 frequency dependent amplitude
P_{P,pw,0}(α, r, ω)  acoustic pressure of an elevated plane wave
Ē_0(θ)               energy of the plane wave components of the plane wave decomposition of an elevated plane wave

p_0 [N/m²]           static pressure
Q(x, ω)              inhomogeneous part of the wave equation
R(k)                 matrix consisting of the impulse responses from all synthesis to all analysis positions of the discretized listening room impulse responses
R                    fixed radius
R_LS                 radius of the loudspeaker array
R_mic                radius of the microphone array
RASR(x_P, ω)         reproduced aliasing-to-signal ratio
r_{m,n}              vector consisting of temporal samples of the discretized listening room impulse response
r [m]                radial coordinate of the polar/cylindrical coordinate system
r_{m,n}(k)           discretized listening room impulse response
S(x, ω)              acoustic pressure of a virtual source
T                    matrix built from the Toeplitz matrices used for the MINT
T_{m,n}              Toeplitz matrices used for the MINT
T_s [s]              sampling period
T_60 [s]             reverberation time
t [s]                continuous time
U(ω)                 left singular matrix
u_b(ω)               left singular vector
u_C(s_C, ω)          slant stack/Radon transform of the pressure field
V(ω)                 right singular matrix
V_n(x, ω) [m/s]      acoustic particle velocity in direction of the normal vector n
V_r(x, ω) [m/s]      acoustic particle velocity in radial direction
V(x − x_0, ω)        acoustic pressure of the secondary sources (including sign)
V_2D(x − x_0, ω)     acoustic pressure of secondary line sources (including sign)
V_3D(x − x_0, ω)     acoustic pressure of secondary point sources (including sign)
v(x, t) [m/s]        acoustic particle velocity
v_b(ω)               right singular vector
W(x, ω)              filtered loudspeaker driving signal
w_n(k)               vector consisting of temporal samples of the discretized filtered loudspeaker driving signal
w^{(N)}(k)           vector consisting of spatial samples of the discretized filtered loudspeaker driving signal
w_n(k)               discretized filtered loudspeaker driving signal
X(ω)                 joint singular matrix
x                    position vector (boldface)
x [m]                x-coordinate of the Cartesian coordinate system
Y_C(k_C, k_{C,S}, ω) response of a filter-and-sum beamformer to a plane wave
Y_C(k_{C,S}, ω)      response of a filter-and-sum beamformer
y [m]                y-coordinate of the Cartesian coordinate system
Z(ω)                 (specific) acoustic impedance
Z_0                  characteristic acoustic impedance
z [m]                z-coordinate of the Cartesian/cylindrical coordinate system


Appendix B
Coordinate Systems
In the sequel, the different coordinate systems used throughout this work are defined. Additionally, their interrelations, some operators and special functions are introduced.
Since the position vector x and the wave vector k will be affected by a coordinate system
change, most of the results will be given for both quantities.

B.1  Cartesian Coordinate System

Figure B.1 illustrates the Cartesian coordinate system exemplarily for the position vector
xC . However, this illustration also applies to the wave vector kC . The three-dimensional

Figure B.1: Illustration of the Cartesian coordinate system for the position vector x_C.
case will be considered in the following. Nevertheless, the introduced relations hold also
for the two-dimensional case when neglecting the z-components. The position vector xC


in Cartesian coordinates is defined as follows

    x_C = [x  y  z]^T ,                                         (B.1)

and the wave vector as

    k_C = [k_x  k_y  k_z]^T .                                   (B.2)

The volume element used for integration in Cartesian coordinates is given as

    dV = dx dy dz .                                             (B.3)

Operators

The gradient of a scalar variable p_C(x_C) defined in Cartesian coordinates is given as [AW01]

    ∇p_C(x_C) = [ ∂p_C(x_C)/∂x   ∂p_C(x_C)/∂y   ∂p_C(x_C)/∂z ]^T ,      (B.4)

its Laplacian as

    Δp_C(x_C) = ∇²p_C(x_C) = ∂²p_C(x_C)/∂x² + ∂²p_C(x_C)/∂y² + ∂²p_C(x_C)/∂z² .    (B.5)

Special Functions

A Dirac pulse placed at the origin of the Cartesian coordinate system is given in Cartesian coordinates as [AW01, Bra03]

    δ_C(x, y, z) = δ_C(x_C) = δ(x) δ(y) δ(z) ,                  (B.6)

a Dirac pulse with an offset x_{C,0} as

    δ_C(x_C − x_{C,0}) = δ(x − x_0) δ(y − y_0) δ(z − z_0) .     (B.7)

Convolution

The spatial convolution of two three-dimensional functions in Cartesian coordinates is given as

    g_C(x_C) = f_C(x_C) ∗_𝐱 h_C(x_C)
             = ∫∫∫ f_C(x', y', z') h_C(x − x', y − y', z − z') dx' dy' dz' .    (B.8)


Figure B.2: Illustration of the spherical coordinate system for the position vector x_H.

B.2  Spherical Coordinate System

Figure B.2 illustrates the spherical coordinate system exemplarily for the position vector x_H. However, this illustration also applies to the wave vector k_H. The position and wave vector in spherical coordinates are defined as follows:

    x_H = [α  α_az  ρ]^T ,                                      (B.9)

    k_H = [θ  θ_az  k]^T .                                      (B.10)

The volume element used for integration in spherical coordinates is given as

    dV = ρ² sin α_az dρ dα_az dα .                              (B.11)

Relations between Cartesian and spherical coordinates

The relations between the position vector x given in spherical (x_H) and Cartesian (x_C) coordinates are given as follows

    x = ρ sin α_az cos α ,    α = tan⁻¹(y/x) ,                          (B.12a)
    y = ρ sin α_az sin α ,    α_az = cos⁻¹( z / √(x² + y² + z²) ) ,     (B.12b)
    z = ρ cos α_az ,          ρ = √(x² + y² + z²) ,                     (B.12c)


Figure B.3: Illustration of the cylindrical coordinate system for the position vector x_Y.
where α, α_az and ρ denote the polar angle, the azimuth angle and the radius, respectively. These variables are subject to the following ranges: 0 ≤ α < 2π, 0 ≤ α_az < π, 0 ≤ ρ < ∞. The relations between the wave vector k in spherical (k_H) and Cartesian (k_C) coordinates are given in accordance with the position vector as

    k_x = k sin θ_az cos θ ,    θ = tan⁻¹(k_y/k_x) ,                            (B.13a)
    k_y = k sin θ_az sin θ ,    θ_az = cos⁻¹( k_z / √(k_x² + k_y² + k_z²) ) ,   (B.13b)
    k_z = k cos θ_az ,          k = √(k_x² + k_y² + k_z²) ,                     (B.13c)

where 0 ≤ θ < 2π, 0 ≤ θ_az < π, 0 ≤ k < ∞.
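The relations (B.12) can be checked numerically; a minimal round-trip sketch, where the function names are illustrative only and arctan2 replaces tan⁻¹ to keep the correct quadrant:

```python
import numpy as np

# Round-trip check of the Cartesian <-> spherical relations (B.12).
# Convention as above: alpha is the polar angle in the x-y plane, alpha_az
# the angle measured from the z-axis, rho the radius.

def cart2sph(x, y, z):
    rho = np.sqrt(x**2 + y**2 + z**2)
    alpha = np.arctan2(y, x)          # quadrant-safe tan^-1(y/x)
    alpha_az = np.arccos(z / rho)
    return alpha, alpha_az, rho

def sph2cart(alpha, alpha_az, rho):
    x = rho * np.sin(alpha_az) * np.cos(alpha)
    y = rho * np.sin(alpha_az) * np.sin(alpha)
    z = rho * np.cos(alpha_az)
    return x, y, z

pts = np.random.default_rng(0).normal(size=(3, 100))
back = np.array(sph2cart(*cart2sph(*pts)))
```

The round trip recovers the original Cartesian coordinates up to floating point accuracy.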

B.3  Cylindrical Coordinate System

Figure B.3 illustrates the cylindrical coordinate system exemplarily for the position vector x_Y. However, this illustration also applies to the wave vector k_Y. The position and wave vector in cylindrical coordinates are defined as follows:

    x_Y = [α  r  z]^T ,                                         (B.14)

    k_Y = [θ  k_r  k_z]^T .                                     (B.15)


Figure B.4: Illustration of the position vector x and the wave vector k expressed in Cartesian and cylindrical coordinates as given by Eq. (B.17) and Eq. (B.18).

The volume element used for integration in cylindrical coordinates is given as

    dV = r dα dr dz .                                           (B.16)

Relations between Cartesian and cylindrical coordinates

The relations between the position vector x given in cylindrical (x_Y) and Cartesian (x_C) coordinates are given as follows

    x = r cos α ,    α = tan⁻¹(y/x) ,                           (B.17a)
    y = r sin α ,    r = √(x² + y²) ,                           (B.17b)
    z = z ,          z = z ,                                    (B.17c)

where α, r and z denote the angle, radius and height, respectively. These variables are subject to the following ranges: 0 ≤ α < 2π, 0 ≤ r < ∞, −∞ < z < ∞. The relations between the wave vector k in cylindrical (k_Y) and Cartesian (k_C) coordinates are, similar to the position vector, given as

    k_x = k_r cos θ ,    θ = tan⁻¹(k_y/k_x) ,                   (B.18a)
    k_y = k_r sin θ ,    k_r = √(k_x² + k_y²) ,                 (B.18b)
    k_z = k_z ,          k_z = k_z ,                            (B.18c)

where 0 ≤ θ < 2π, 0 ≤ k_r < ∞, −∞ < k_z < ∞. Figure B.4 illustrates the relations between the cylindrical and Cartesian coordinate systems for the position vector and the wave vector.


Operators

The gradient of a scalar variable p_Y(x_Y) defined in cylindrical coordinates is given as [AW01]

    ∇p_Y(x_Y) = [ ∂p_Y(x_Y)/∂r   (1/r) ∂p_Y(x_Y)/∂α   ∂p_Y(x_Y)/∂z ]^T ,    (B.19)

its Laplacian as

    Δp_Y(x_Y) = ∇²p_Y(x_Y) = (1/r) ∂/∂r ( r ∂p_Y(x_Y)/∂r ) + (1/r²) ∂²p_Y(x_Y)/∂α² + ∂²p_Y(x_Y)/∂z² .    (B.20)

Special Functions

A Dirac pulse placed at the origin of the cylindrical coordinate system is given in cylindrical coordinates as [Gas78]

    δ_Y(x_Y) = δ(r) / (π r) ,                                   (B.21)

a Dirac pulse with an offset x_{Y,0} as [Gas78]

    δ_Y(x_Y − x_{Y,0}) = (1/r_0) δ(r − r_0) δ(α − α_0) δ(z − z_0) .    (B.22)

Convolution

Using the definition of the spatial convolution in Cartesian coordinates (B.8) and introducing the coordinate system change to cylindrical coordinates yields

    g_Y(x_Y) = f_Y(x_Y) ⊛ h_Y(x_Y)
             = ∫∫∫ f_Y(α', r', z') h_Y(ᾱ, r̄, z − z') r' dα' dr' dz' ,    (B.23)

where ᾱ and r̄ are given as

    ᾱ(α, α', r, r') = tan⁻¹( (r sin α − r' sin α') / (r cos α − r' cos α') ) ,    (B.24a)
    r̄(α, α', r, r') = √( r² + r'² − 2 r r' cos(α − α') ) .                        (B.24b)

B.4  Polar Coordinate System

The polar coordinate system can be derived from the cylindrical coordinate system by setting z = 0 and k_z = 0. Figure B.5 illustrates the polar coordinate system for the position vector x_P.

Figure B.5: Illustration of the polar coordinate system for the position vector x_P.

The position and wave vector in polar coordinates are defined as follows

    x_P = [α  r]^T ,                                            (B.25)

    k_P = [θ  k]^T .                                            (B.26)

The volume element in polar coordinates is then given as

    dV = r dα dr .                                              (B.27)

Relations between two-dimensional Cartesian and polar coordinates

The relations between two-dimensional Cartesian and polar coordinates can be derived from Section B.3 for z = 0:

    x = r cos α ,    α = tan⁻¹(y/x) ,                           (B.28a)
    y = r sin α ,    r = √(x² + y²) ,                           (B.28b)

and

    k_x = k cos θ ,    θ = tan⁻¹(k_y/k_x) ,                     (B.29a)
    k_y = k sin θ ,    k = √(k_x² + k_y²) .                     (B.29b)

Operators

The gradient of a scalar variable p_P(x_P) defined in polar coordinates is given as [AW01]

    ∇p_P(x_P) = [ ∂p_P(x_P)/∂r   (1/r) ∂p_P(x_P)/∂α ]^T ,      (B.30)

its Laplacian as

    Δp_P(x_P) = ∇²p_P(x_P) = (1/r) ∂/∂r ( r ∂p_P(x_P)/∂r ) + (1/r²) ∂²p_P(x_P)/∂α² .    (B.31)

Special Functions

A Dirac pulse placed at the origin of the polar coordinate system is given in polar coordinates as [Gas78]

    δ_P(x_P) = δ(r) / (π r) ,                                   (B.32)

a Dirac pulse with an offset x_{P,0} as [Gas78]

    δ_P(x_P − x_{P,0}) = (1/r_0) δ(r − r_0) δ(α − α_0) .        (B.33)

Convolution

Using the definition of the spatial convolution in two-dimensional Cartesian coordinates (B.8) and introducing the coordinate system change to polar coordinates yields

    g_P(x_P) = f_P(x_P) ⊛ h_P(x_P)
             = ∫₀^∞ ∫₀^{2π} f_P(α', r') h_P(ᾱ, r̄) r' dα' dr' ,    (B.34)

where ᾱ and r̄ are given as

    ᾱ(α, α', r, r') = tan⁻¹( (r sin α − r' sin α') / (r cos α − r' cos α') ) ,    (B.35a)
    r̄(α, α', r, r') = √( r² + r'² − 2 r r' cos(α − α') ) .                        (B.35b)
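The difference-vector relations (B.35) can be verified against a plain Cartesian subtraction; a minimal numerical check with illustrative variable names (arctan2 replaces tan⁻¹ to keep the quadrant):

```python
import numpy as np

# Verify (B.35): angle and radius of the difference of two polar points
# must match a Cartesian subtraction of the same points.

rng = np.random.default_rng(1)
alpha, alpha_p = rng.uniform(0, 2 * np.pi, size=2)
r, r_p = rng.uniform(0.1, 5.0, size=2)

# (B.35a): angle of the difference vector
alpha_bar = np.arctan2(r * np.sin(alpha) - r_p * np.sin(alpha_p),
                       r * np.cos(alpha) - r_p * np.cos(alpha_p))
# (B.35b): law-of-cosines radius of the difference vector
r_bar = np.sqrt(r**2 + r_p**2 - 2 * r * r_p * np.cos(alpha - alpha_p))

# Cartesian reference
dx = r * np.cos(alpha) - r_p * np.cos(alpha_p)
dy = r * np.sin(alpha) - r_p * np.sin(alpha_p)
```

Both the radius and the reconstructed Cartesian components agree with the direct subtraction.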


Appendix C
Mathematical Preliminaries
The following section summarizes mathematical preliminaries for reference within this
thesis.

C.1  Green's Second Integral Theorem

Green's second integral theorem is given as follows [AW01]

    ∫_V (v ∇²u − u ∇²v) dV = − ∮_{∂V} ⟨v∇u − u∇v, n⟩ dS ,      (C.1)

where v and u denote scalar fields whose second derivatives must exist and be integrable, n denotes the inward pointing surface normal, dV and dS denote volume and surface elements respectively, and ⟨a, b⟩ denotes the inner product of two vectors a and b. Figure C.1 illustrates the geometry used for Green's second integral theorem. The parts of the inner product on the right hand side of Green's theorem (C.1) involving the surface normal n and the gradients of the fields v and u can be understood as a directional gradient. This operation calculates the gradient in direction of the inward pointing surface normal.

Figure C.1: Bounded region V, surface ∂V, and inward pointing surface normal n used for Green's second integral theorem (C.1).


These terms will be abbreviated in the following, e. g. for u, as

    ⟨∇u, n⟩ = ∂u/∂n .                                           (C.2)

Thus, Green's theorem yields a relation between the directional gradients of the scalar fields u, v on the boundary ∂V with the fields and their Laplacians inside the bounded region V. Introducing Eq. (C.2) into Eq. (C.1) results in

    ∫_V (v ∇²u − u ∇²v) dV = − ∮_{∂V} ( v ∂u/∂n − u ∂v/∂n ) dS .    (C.3)

This form of Green's second integral theorem will be used in this work.
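The sign convention of (C.3) with the inward pointing normal can be checked numerically; a minimal sketch on the unit square with illustrative polynomial fields u = x² + y² (Laplacian 4) and v = x (Laplacian 0):

```python
import numpy as np

# Numerical check of (C.3) on the unit square [0,1]^2 with inward normals.
# u = x^2 + y^2 -> lap(u) = 4;  v = x -> lap(v) = 0.

n = 2000
t = (np.arange(n) + 0.5) / n                 # midpoint-rule nodes
X, Y = np.meshgrid(t, t, indexing="ij")

lhs = np.sum(X * 4.0) / n**2                 # int_V (v lap(u) - u lap(v)) dV

# boundary terms v du/dn - u dv/dn with INWARD normals on the four edges
e1 = np.sum(1.0 * (-2.0) - (1.0 + t**2) * (-1.0)) / n   # x = 1, n = (-1, 0)
e2 = np.sum(0.0 - (t**2) * 1.0) / n                     # x = 0, n = (+1, 0)
e3 = np.sum(t * (-2.0)) / n                             # y = 1, n = (0, -1)
e4 = 0.0                                                # y = 0, n = (0, +1)

rhs = -(e1 + e2 + e3 + e4)                   # right hand side of (C.3)
```

Both sides evaluate to 2 up to quadrature accuracy, confirming the minus sign that accompanies the inward pointing normal.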

C.2  The Stationary Phase Method

The stationary phase method [Ble84, Wil99] considers the approximate calculation of integrals that exhibit the following form

    F = ∫ f(z) e^{jφ(z)} dz .                                   (C.4)

An approximate solution of the above integral for φ(z) ≫ 1 can be given as

    F ≈ √( 2πj / φ''(z_s) ) f(z_s) e^{jφ(z_s)} ,                (C.5)

where φ''(z) denotes the second derivative of φ(z) with respect to z and z_s the stationary phase point. The stationary phase point z_s is found by setting the first derivative φ'(z) of φ(z) to zero:

    φ'(z_s) = 0 .                                               (C.6)
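The approximation (C.5) can be checked numerically; a minimal sketch with an illustrative choice of f and φ (not taken from this thesis), comparing a brute-force evaluation of (C.4) with the stationary phase result:

```python
import numpy as np

# Sanity check of the stationary phase approximation (C.5).
# lam plays the role of the large-phase assumption phi >> 1.

lam = 200.0
f = lambda z: 1.0 / (1.0 + z**2)
phi = lambda z: lam * (z**2 + 1.0)     # stationary point z_s = 0, phi'' = 2*lam

# brute-force evaluation of F = int f(z) exp(j phi(z)) dz  (C.4)
z = np.linspace(-20.0, 20.0, 4_000_001)
dz = z[1] - z[0]
F_num = np.sum(f(z) * np.exp(1j * phi(z))) * dz

# stationary phase approximation (C.5)
z_s = 0.0
ddphi = 2.0 * lam                      # phi''(z_s)
F_sp = np.sqrt(2j * np.pi / ddphi) * f(z_s) * np.exp(1j * phi(z_s))
```

For this example the relative deviation between the two results is on the order of 1/λ, i.e. well below a few percent.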

C.2.1  Approximation of a Linear Distribution of Point Sources

Section 5.1.1 discusses the correction of the secondary source type mismatch for a two-dimensional WFS system. For this purpose, Eq. (5.5) is approximated using the stationary phase method outlined in the previous section.
Comparing the inner integral of Eq. (5.5) to Eq. (C.4) yields

    φ(z_0) = −k |x − x_0(x_0, z_0)| − k_0^T x_0 ,               (C.7)
    f(z_0) = jk cos θ_pw / |x − x_0(x_0, z_0)| ,                (C.8)

where the integration in Eq. (5.5) is performed over the variable z_0. The first derivative of φ(z_0) with respect to z_0 is given as

    φ'(z_0) = −k z_0 / |x − x_0(x_0, z_0)| .                    (C.9)

Thus, the stationary phase point is given by z_{0,s} = 0. Considering the geometry depicted in Fig. 5.1, this choice results in the following functions required for the approximate solution of Eq. (5.5) using Eq. (C.5):

    φ(0) = −k |x − x_0| − k_0^T x_0 ,                           (C.10)
    φ''(0) = −k / |x − x_0| ,                                   (C.11)
    f(0) = jk cos θ_pw / |x − x_0| .                            (C.12)

Introducing Eq. (C.5) together with the above results into Eq. (5.5) yields Eq. (5.6).

C.3  Spatio-temporal Spectrum of the Two- and Three-dimensional Free-field Green's Functions

The following section derives the spatio-temporal spectra of the two- and three-dimensional free-field Green's functions.

C.3.1  Two-dimensional Green's Function

The two-dimensional free-field Green's function is given by Eq. (2.68) as

    G_{C,2D}(x_C | x_{C,0}, ω) = −(j/4) H_0^{(2)}( (ω/c) |x_C − x_{C,0}| ) .    (C.13)

Applying a coordinate transformation into polar coordinates assuming x_{C,0} = 0 yields

    G_{P,2D}(x_P | 0, ω) = −(j/4) H_0^{(2)}( (ω/c) r ) .        (C.14)

Due to the radial symmetry of the Green's function G_{P,2D}(x_P | 0, ω) in polar coordinates, its spatial Fourier transformation is given by its Hankel transformation [Pap68]

    G̃_{P,2D}(k_P | 0, ω) = −(j/4) 2π ∫₀^∞ H_0^{(2)}( (ω/c) r ) J_0(kr) r dr
                         = 1 / (k² − (ω/c)²) + (jπ / (2k)) δ( k − ω/c ) ,    (C.15)

where [GR65] and [Gas78] were utilized to derive the solution of the above integral. Transforming this result back into Cartesian coordinates and applying the shift theorem of


the spatial Fourier transformation yields the spectrum of the two-dimensional free-field Green's function as

    G̃_{C,2D}(k_C | x_{C,0}, ω) = [ 1 / (k_x² + k_y² − (ω/c)²)
                                 + (jπ / (2 √(k_x² + k_y²))) δ( √(k_x² + k_y²) − ω/c ) ] e^{j k_C^T x_{C,0}} .    (C.16)

The spatio-temporal spectrum of the two-dimensional free-field Green's function describes the complex valued pressure field of a line source placed at the position x_{C,0}.

C.3.2  Three-dimensional Green's Function

The three-dimensional free-field Green's function is given by Eq. (2.33) as

    G_{C,3D}(x_C | x_{C,0}, ω) = (1/(4π)) e^{−j(ω/c)|x_C − x_{C,0}|} / |x_C − x_{C,0}| .    (C.17)

Applying a coordinate transformation into spherical coordinates assuming x_{C,0} = 0 yields

    G_{H,3D}(x_H | 0, ω) = (1/(4π)) e^{−j(ω/c) r} / r .         (C.18)

Due to the radial symmetry of the Green's function G_{H,3D}(x_H | 0, ω) in spherical coordinates, its spatial Fourier transformation is given by its Hankel transformation [Pap68]

    G̃_{P,3D}(k_P | 0, ω) = (1/2) ∫₀^∞ e^{−j(ω/c) r} J_0(kr) dr = 1 / ( 2 √(k² − (ω/c)²) ) ,    (C.19)

where [GR65] was utilized to derive the solution of the above integral. Transforming this result back into Cartesian coordinates and applying the shift theorem of the spatial Fourier transformation yields the spectrum of the three-dimensional free-field Green's function as

    G̃_{C,3D}(k_C | x_{C,0}, ω) = 1 / ( 2 √(k_x² + k_y² − (ω/c)²) ) e^{j k_C^T x_{C,0}} .    (C.20)

This result can also be found in [Zio95].


Appendix D
Measured and Simulated Acoustic
Environments
In this section, the acoustic environments and experimental setups are described which were used to evaluate the performance of the proposed WDAF-based room compensation approach. In particular, the acoustic environments considered in this work consist of a laboratory environment and a simulated environment. The acoustical properties of the laboratory environment were measured as described in the next section.

D.1  Measured Acoustic Environment

The acoustic environment that was used for the measurements was the multimedia laboratory of the Chair of Multimedia Communications and Signal Processing at the University of Erlangen-Nuremberg [LMS]. The laboratory has the dimensions 5.90 × 5.80 × 3.10 m (w × l × h). The room is equipped with carpet, a damped ceiling and sound absorbing curtains at all sides. The curtains can be removed partly or fully in order to change the reverberation characteristics of the room. The reverberation time for closed curtains is approximately T60 ≈ 250 ms, for opened curtains T60 ≈ 400 ms. For opened curtains, the plane wave reflection factor of the walls can be approximated by R_pw = 0.8. Figure D.1 illustrates the geometry of the multimedia laboratory. A circular loudspeaker and microphone array was used for the measurement of the room transfer matrix. Their geometry and position are also depicted in Fig. D.1.
The circular loudspeaker array with a radius of R_LS = 1.50 m consists of 48 equidistantly mounted two-way loudspeakers (ELAC Type 301 [ELA]). All loudspeakers are mounted at a height of 1.60 m. Figure 5.6 shows the circular loudspeaker array. The circular microphone array with a radius of R_mic = 0.75 m was realized in a sequential fashion using a stepper motor drive. The setup shown in Fig. 5.13 and Fig. 5.14 was used for this purpose. The acoustic pressure and velocity in normal direction were measured at 48 equidistant


Figure D.1: Geometry of the multimedia laboratory, loudspeaker and microphone array.
The dimensions are given in centimeters [cm].



Figure D.2: Exact positions of the loudspeakers and microphones used for the measured/simulated setups (LS = loudspeaker). The coordinates are given with respect to
the lower left corner of the multimedia laboratory illustrated by Fig. D.1.
positions. Figure D.2 additionally illustrates the exact positions of the loudspeakers and microphones with respect to the lower left corner of the multimedia laboratory.
Using the described setup, the impulse responses from each loudspeaker to each microphone were measured sequentially. The entire measurement procedure was controlled by a personal computer. The measurements were taken at a sampling rate of 48 kHz using MLS sequences [RV89]. The measurements taken from each loudspeaker to each microphone were then used to compose the listening room transfer matrix R(ω) used for the results presented in this work.
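The MLS measurement principle can be sketched as follows; this is a toy simulation in which the degree-7 LFSR taps, the sequence length and the sparse "room" impulse response are illustrative assumptions, not the actual 48 kHz laboratory measurement:

```python
import numpy as np

# Toy simulation of MLS-based impulse response measurement (cf. [RV89]):
# a maximum-length sequence excites a simulated system, and circular
# cross-correlation with the MLS recovers its impulse response.

reg, bits = [1] * 7, []                # Fibonacci LFSR, taps (7, 6), period 127
for _ in range(127):
    bits.append(reg[-1])
    fb = reg[6] ^ reg[5]
    reg = [fb] + reg[:-1]
s = 1.0 - 2.0 * np.array(bits)         # map {0, 1} -> {+1, -1}
L = s.size

h = np.zeros(L)
h[0], h[3], h[9] = 1.0, -0.5, 0.25     # impulse response to be identified
y = np.fft.ifft(np.fft.fft(h) * np.fft.fft(s)).real   # periodic system response

# the periodic MLS autocorrelation is (L+1)*delta[n] - 1, so the
# cross-correlation needs a constant offset (the sum of h) removed
xc = np.fft.ifft(np.fft.fft(y) * np.conj(np.fft.fft(s))).real
h_hat = (xc + y.sum() / s.sum()) / (L + 1)
```

The two-valued autocorrelation of the m-sequence makes the cross-correlation collapse to the impulse response up to a known constant offset, which is the property exploited by MLS measurement systems.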

D.2  Simulated Acoustic Environments

The simulated acoustic environment is a simplified version of the multimedia laboratory described in the previous section. The laboratory was approximated by a rectangular enclosure with the same dimensions as the original room. Figure D.3 shows the setup used for the simulations. Two loudspeaker setups were simulated: (1) the circular array described in the previous section and (2) a rectangular array with the dimensions 3.50 × 3.50 m consisting of 60 equidistantly placed loudspeakers. Figure D.2 illustrates the exact positions of the loudspeakers and microphones. The impulse responses from each loudspeaker to each microphone position were simulated using the method described in [PR05]. The loudspeakers were modeled as point sources for this purpose.
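The simulation method itself follows [PR05] and is not reproduced here; as a minimal illustration of the underlying point-source model, the free-field impulse response between a loudspeaker and a microphone reduces to a delay of r/c samples with a 1/(4πr) amplitude. The sampling rate and positions below are assumed values:

```python
import numpy as np

# Free-field point-source model: delayed, attenuated impulse.
# (Fractional delays are ignored in this sketch.)

fs, c = 48000, 343.0                   # sampling rate [Hz], speed of sound [m/s]
x_ls = np.array([2.95, 4.05, 1.60])    # hypothetical loudspeaker position [m]
x_mic = np.array([2.95, 2.55, 1.60])   # hypothetical microphone position [m]

r = np.linalg.norm(x_ls - x_mic)       # source-receiver distance
n0 = int(round(r / c * fs))            # integer-sample propagation delay
g = np.zeros(256)
g[n0] = 1.0 / (4.0 * np.pi * r)        # 3D free-field Green's function gain
```

Summing such contributions over all mirror image sources of the rectangular enclosure yields the simulated room impulse responses.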


Figure D.3: Simulated setups of loudspeaker and microphone array. Dimensions are
given in centimeters [cm].


Appendix E
Titel, Inhaltsverzeichnis, Einleitung
und Zusammenfassung
The following German translations of the title (Section E.1), the table of contents (Section E.2), the introduction (Section E.3), and the summary (Section E.4) are a mandatory requirement for a doctoral thesis at the Faculty of Engineering of the University of Erlangen-Nuremberg.

E.1  Titel

Aktive Kompensation des Wiedergaberaumes für Raumklangwiedergabesysteme.

E.2  Inhaltsverzeichnis

1 Einleitung . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
  1.1 Der Einfluss des Wiedergaberaumes . . . . . . . . . . . . . . . . . . . . 3
  1.2 Systeme zur Kompensation des Wiedergaberaumes . . . . . . . . . . . . . . 4
  1.3 Überblick über diese Arbeit . . . . . . . . . . . . . . . . . . . . . . . 6
2 Grundlagen der akustischen Wellenausbreitung . . . . . . . . . . . . . . . . 7
  2.1 Die akustische Wellengleichung . . . . . . . . . . . . . . . . . . . . . 7
      2.1.1 Herleitung der homogenen akustischen Wellengleichung . . . . . . . 7
      2.1.2 Zweidimensionale Wellenfelder . . . . . . . . . . . . . . . . . . . 10
      2.1.3 Allgemeine Lösung der Wellengleichung . . . . . . . . . . . . . . . 10
  2.2 Lösungen der homogenen Wellengleichung in kartesischen Koordinaten . . . 11
      2.2.1 Expansion in ebene Wellen . . . . . . . . . . . . . . . . . . . . . 13
  2.3 Lösungen der homogenen Wellengleichung in Zylinderkoordinaten . . . . . . 14


      2.3.1 Expansion in zylindrische Harmonische . . . . . . . . . . . . . . . 16
      2.3.2 Expansion in zirkuläre Harmonische . . . . . . . . . . . . . . . . 18
  2.4 Lösungen der inhomogenen Wellengleichung . . . . . . . . . . . . . . . . 20
      2.4.1 Punktquelle . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
      2.4.2 Greensche Funktionen . . . . . . . . . . . . . . . . . . . . . . . 22
      2.4.3 Linienquelle . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
      2.4.4 Planare Quellen . . . . . . . . . . . . . . . . . . . . . . . . . . 26
  2.5 Randbedingungen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
      2.5.1 Klassifikation von Randbedingungen . . . . . . . . . . . . . . . . 29
  2.6 Lösung der inhomogenen Wellengleichung für eine begrenzte Region . . . . 31
      2.6.1 Dreidimensionales Kirchhoff-Helmholtz Integral . . . . . . . . . . 34
      2.6.2 Zweidimensionales Kirchhoff-Helmholtz Integral . . . . . . . . . . 35
  2.7 Der Einfluss von Rändern . . . . . . . . . . . . . . . . . . . . . . . . 36
      2.7.1 Reflektion und Transmission von ebenen Wellen an ebenen Flächen . . 37
      2.7.2 Akustische Moden in einem rechteckigen Raum . . . . . . . . . . . . 38

3 Fourier Analyse von Wellenfeldern


3.1 Fourier Analyse von mehrdimensionalen Signalen . . . . . . . . . . . . . .
3.1.1 Mehrdimensionale Fourier Transformation . . . . . . . . . . . . . .
3.1.2 Mehrdimensionale Fourier Transformation in kartesischen Koordinaten . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.1.3 Mehrdimensionale Fourier Transformation in zylindrischen Koordinaten . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.1.4 Zylindrische Fourier Transformation ausgedr
uckt als Hankel Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Lineare Systeme und Fourier Analyse . . . . . . . . . . . . . . . . . . . . .
3.2.1 Klassifikation mehrdimensionaler Systeme . . . . . . . . . . . . . .
3.2.2 Die Wellengleichung als mehrdimensionales System . . . . . . . . .
3.2.3 Lineare Systeme und die Fourier Transformation . . . . . . . . . . .
3.3 Die kontinuierliche Zerlegung in ebene Wellen . . . . . . . . . . . . . . . .
3.3.1 Fourier Analyse von ebenen Wellen . . . . . . . . . . . . . . . . . .
3.3.2 Definition der Zerlegung in ebene Wellen . . . . . . . . . . . . . . .
3.3.2.1 Zeitbereichsformulierung der Zerlegung in ebene Wellen . .
3.3.3 Definition der inversen Zerlegung in ebene Wellen . . . . . . . . . .
3.3.4 Reprasentationen der Zerlegung in ebene Wellen . . . . . . . . . . .
3.3.4.1 Die Zerlegung in ebene Wellen als Hankel Transformation
3.3.4.2 Die Zerlegung in einlaufende/auslaufende ebene Wellen . .
3.3.4.3 Die Zerlegung in ebene Wellen dargestellt in zylindrischen
Harmonischen . . . . . . . . . . . . . . . . . . . . . . . . .

43
43
44

2.4

2.5
2.6

2.7

45
47
49
51
52
53
54
56
57
58
61
61
62
62
63
65

E.2. Inhaltsverzeichnis

253

Uberblick
u
ber die verschiedenen Reprasentationen der
Zerlegung in ebene Wellen . . . . . . . . . . . . . . . . . .
3.3.5 Eigenschaften und Theoreme der Zerlegung in ebene Wellen . . . .
3.3.5.1 Eigenschaften . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.5.2 Skalierungstheorem . . . . . . . . . . . . . . . . . . . . . .
3.3.5.3 Rotationstheorem . . . . . . . . . . . . . . . . . . . . . . .
3.3.5.4 Multiplikationstheorem . . . . . . . . . . . . . . . . . . . .
3.3.5.5 Faltungstheorem . . . . . . . . . . . . . . . . . . . . . . .
3.3.5.6 Parsevalsches Theorem . . . . . . . . . . . . . . . . . . . .
3.3.5.7 Extrapolation von ebenen Wellen . . . . . . . . . . . . . .
3.3.5.8 Zusammenfassung . . . . . . . . . . . . . . . . . . . . . .
3.3.6 Beziehungen zu anderen Methoden und Transformationen . . . . .
Die Zerlegung in ebene Wellen unter der Benutzung von Randmessungen .
3.4.1 Die Zerlegung in ebene Wellen mit beschrankter Apertur . . . . . .
3.4.2 Die Zerlegung in ebene Wellen auf der Basis von KirchhoffHelmholtz Extrapolation . . . . . . . . . . . . . . . . . . . . . . . .
3.4.3 Die Zerlegung in ebene Wellen auf der Basis von zylindrischen Harmonischen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Die Zerlegung in ebene Wellen von analytischen Quellenmodellen . . . . .
3.5.1 Die Zerlegung in ebene Wellen einer ebenen Welle . . . . . . . . . .
3.5.2 Die Zerlegung in ebene Wellen einer Linienquelle . . . . . . . . . .
Die diskrete Zerlegung in ebene Wellen . . . . . . . . . . . . . . . . . . . .
3.6.1 Die Herleitung der diskrete Zerlegung in ebene Wellen . . . . . . . .
3.6.1.1 Der Zweidimensionale polare Impulskamm . . . . . . . . .
3.6.1.2 Definition der raumdiskreten Zerlegung in ebene Wellen .
3.6.1.3 Spektrale Eigenschaften der raumdiskreten Zerlegung in
ebene Wellen . . . . . . . . . . . . . . . . . . . . . . . . .
3.6.1.4 Definition der diskreten Zerlegung in ebene Wellen . . . .
3.6.2 Abtastungs- und Endlichkeitsartefakte . . . . . . . . . . . . . . . .
3.6.2.1 Angulare Abtastung von Randmessungen . . . . . . . . .
3.6.2.2 Abtastung einer ebenen Wellen auf einem zirkularen Rand
3.6.2.3 Quantitative Analyse der Aliasingartefakte . . . . . . . . .
3.6.2.4 Quantitative Analyse der Endlichkeitsartefakte . . . . . .
3.6.3 Zusammenfassung . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.4.4

3.4

3.5

3.6

4 Raumkompensation
4.1 Schallwiedergabe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.1 Schallwiedergabe basierend auf dem Kirchhoff-Helmholtz Integral
4.1.2 Dreidimensionale Schallwiedergabe . . . . . . . . . . . . . . . . .

66
68
68
69
70
70
71
72
73
73
74
76
76
78
79
81
82
83
85
85
85
86
87
88
89
90
91
94
97
99

101
. 101
. 102
. 104

254

E. Titel, Inhaltsverzeichnis, Einleitung und Zusammenfassung

4.1.3
4.1.4
4.1.5
4.1.6

4.2
4.3

4.4

4.5

4.6

Zweidimensionale Schallwiedergabe . . . . . . . . . . . . . . . . . .
Schallwiedergabe mit Monopolen als Sekundarquellen . . . . . . . .
Wiedergabe eines in ebene Wellen zerlegten Wellenfeldes . . . . . .
Raumliche Abtastung der Verteilung der Monopol Sekundarquellen
4.1.6.1 Lineare Arrays . . . . . . . . . . . . . . . . . . . . . . . .
4.1.6.2 Zirkulare Arrays . . . . . . . . . . . . . . . . . . . . . . .
4.1.6.3 Beliebig geformte Arrays . . . . . . . . . . . . . . . . . . .
4.1.6.4 Punktquellen als Sekundarquellen . . . . . . . . . . . . . .
Schallwiedergabe in Raumen . . . . . . . . . . . . . . . . . . . . . . . . . .
Grundlagen der Raumkompensation . . . . . . . . . . . . . . . . . . . . . .
4.3.1 Raumkompensation als Entfaltungsproblem . . . . . . . . . . . . .
4.3
Allgemeine Losung des Entfaltungsproblems f
ur Raume . . . . . . .
4.3.3 Adaption der Raumkompensationsfilter . . . . . . . . . . . . . . . .
Raumkompensation f
ur Reproduktionssysteme mit vielen Kanalen . . . . .
4.4.1 Diskrete Realisierung der Raumkompensation . . . . . . . . . . . .
4.4.1.1 Raumliche Diskretisierung . . . . . . . . . . . . . . . . . .
4.4.1.2 Zeitliche Diskretisierung . . . . . . . . . . . . . . . . . . .
4.4.1.3 Frequenzbereichsbeschreibung der Signale und Systeme . .
4.4.1.4 Adaption der Raumkompensationsfilter . . . . . . . . . . .
4.4.2 Exakte inverse Filterung mit dem MINT . . . . . . . . . . . . . . .
4.4.3 Adaption der Raumkompensationsfilter mittels des minimalen quadratischen Fehlers . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4.4 Fundamentale Probleme der adaptiven inversen Filterung . . . . . .
Allgemeines Konzept f
ur ein verbessertes Raumkompensationssystem . . .
dd . . . . . . . . . . . . . . .
4.5.1 Analyse der Autokorrelationsmatrix
4.5.2 Entkoppelung der Raumtransfermatrix des Wiedergaberaumes . . .
4.5.2.1 Singulawertzerlegung . . . . . . . . . . . . . . . . . . . . .
4.5.2.2 Entkoppelung der Raumtransfermatrix des Wiedergaberaumes . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.5.3 Adaptive inverse Filterung im Eigenraum . . . . . . . . . . . . . . .
4.5.3 Adaptive inverse Filterung im Wellenbereich . . . . . . . . . . . . .
4.5.5 Angenaherte Entkoppelung der Raumtransfermatrix der Wiedergaberaumes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.5.5.1 Zerlegung in ebene Wellen . . . . . . . . . . . . . . . . . .
4.5.5.2 Zerlegung in zirkulare Harmonische . . . . . . . . . . . . .
Zusammenfassung . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

104
105
110
111
112
119
122
125
126
127
128
131
132
134
134
134
136
137
138
139
142
145
146
146
147
147
149
150
153
155
156
157
159

5 Raumkompensation fu
161
r Raumklangsysteme
5.1 Wellenfeldsynthese . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

E.2. Inhaltsverzeichnis

5.1.1
5.1.2
5.1.3

5.2

5.3

5.4

Korrektur des unpassenden Sekundarquellentyps . . . . . . . . . .


Artefakte der WFS und deren Einfluss auf die Raumkompensation
Quantitative Analyse der Artefakte f
ur zirkulare WFS Systeme .
5.1.3.1 Amplitudenfehler . . . . . . . . . . . . . . . . . . . . . .
5.1.3.2 Unterdr
uckung von elevierten Reflektionen . . . . . . . .
5.1.4 Verfahren f
ur das Rendering . . . . . . . . . . . . . . . . . . . . .
5.1.4.1 Datenbasiertes Rendering . . . . . . . . . . . . . . . . .
5.1.4.2 Modellbasiertes Rendering . . . . . . . . . . . . . . . . .
5.1.4.2 Praktische Implementierung eines WFS Systems . . . . . . . . . .
5.1.5.1 Hardware . . . . . . . . . . . . . . . . . . . . . . . . . .
5.1.5.2 Software . . . . . . . . . . . . . . . . . . . . . . . . . . .
Praktische Implementierung der Wellenfeldanalyse . . . . . . . . . . . . .
5.2.1 Lineare Mikrophonarrays . . . . . . . . . . . . . . . . . . . . . . .
5.2.2 Zirkulare Mikrophonarrays . . . . . . . . . . . . . . . . . . . . . .
5.2.3 Artefakte der 2D Wellenfeldanalyse und Extrapolation . . . . . .
5.2.4 Quantitative Analyse der Artefakte von zirkularen WFA Systemen
5.2.4.1 Amplitudenfehler aufgrund der Extrapolation . . . . . .
5.2.4.2 Analyse von elevierten Reflektionen . . . . . . . . . . . .
5.2.5 Praktische Realisierung eines zirkularen WFA Systems . . . . . .
Raumkompensation f
ur Wellenfeldsynthese . . . . . . . . . . . . . . . . .
5.3.1 Entkoppelung der Raumtransfermatrix des Wiedergaberaumes . .
5.3.2 WDAF basierte Raumkompensation f
ur WFS Systeme . . . . . .
5.3.3 Bewertungsmae f
ur die aktive Raumkompensation . . . . . . . .
5.3.4 Ergebnisse basierend auf simulierten akustischen Umgebungen . .
5.3.4.1 Zirkulares WFS System . . . . . . . . . . . . . . . . . .
5.3.4.2 Rechteckiges WFS System . . . . . . . . . . . . . . . . .
5.3.5 Ergebnisse basierend auf einer gemessenen akustischen Umgebung
5.3.6 Der Einfluss der Artefakte von WFS und WFA . . . . . . . . . .
5.3.7 Mogliche Modifikationen der vorgeschlagenen Algorithmen . . . .
5.3.8 Kompensation der Reflektionen des Wiedergaberaumes u
ber der
raumliche Aliasingfrequenz . . . . . . . . . . . . . . . . . . . . . .
Raumkomensation f
ur andere raumliche Wiedergabesysteme . . . . . . .

6 Zusammenfassung und Schlussfolgerungen


A Notationen
A.1 Konventionen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
A.2 Abk
urzungen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
A.3 Liste der mathematischen Symbole . . . . . . . . . . . . . . . . . . . . .

255

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

162
165
167
168
171
171
173
174
174
175
177
178
178
181
182
184
185
186
188
190
191
193
195
197
198
207
209
215
216

. 217
. 218
221
225
. 225
. 226
. 227

256

E. Titel, Inhaltsverzeichnis, Einleitung und Zusammenfassung

B Koordinatensysteme
B.1 Kartesisches Koordinatensystem .
B.2 Spharisches Koordinatensystem .
B.3 Zylindrisches Koordinatensystem
B.4 Polarkoordinatensystem . . . . .

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

C Mathematische Grundlagen
C.1 Greens zweiter Integralsatz . . . . . . . . . . . . . . . . . . . . . .
C.2 Die Methode der stationare Phase . . . . . . . . . . . . . . . . . .
C.2.1 Approximation einer linearen Verteilung von Punktquellen
C.3 Raum-zeitliches Spektrum der zwei- und dreidimensionalen freifeld
schen Funktionen . . . . . . . . . . . . . . . . . . . . . . . . . . .
C.3.1 Zweidimensionale Greensche Funktion . . . . . . . . . . .
C.3.1 Dreidimensionale Greensche Funktion . . . . . . . . . . . .

.
.
.
.

.
.
.
.

.
.
.
.

.
.
.
.

235
. 235
. 237
. 238
. 240

243
. . . . . 243
. . . . . 244
. . . . . 244
Green. . . . . 245
. . . . . 245
. . . . . 246

D Gemessene und simulierte akustische Umgebungen


247
D.1 Gemessene akustische Umgebung . . . . . . . . . . . . . . . . . . . . . . . 247
D.2 Simulierte akustische Umgebungen . . . . . . . . . . . . . . . . . . . . . . 249
E Titel, Inhaltsverzeichnis, Einleitung und Zusammenfassung
E.1 Titel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
E.2 Inhaltsverzeichnis . . . . . . . . . . . . . . . . . . . . . . . . .
E.3 Einleitung . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
E.3.1 Der Einfluss des Wiedergaberaumes . . . . . . . . . . .
E.3.2 Systeme zur Kompensation des Wiedergaberaumes . .

E.3.3 Uberblick
u
ber diese Arbeit . . . . . . . . . . . . . . .
E.4 Zusammenfassung und Schlussfolgerungen . . . . . . . . . . .
Literaturverzeichnis

.
.
.
.
.
.
.

.
.
.
.
.
.
.

.
.
.
.
.
.
.

.
.
.
.
.
.
.

.
.
.
.
.
.
.

.
.
.
.
.
.
.

251
. 251
. 251
. 257
. 259
. 260
. 261
. 262
265

E.3 Introduction

Among the various human senses, hearing and vision are the most prominent in everyday situations. Hearing is the sense that responds to sound waves. From a strictly physical point of view, sound waves could be regarded simply as compression waves traveling through air. For humans, however, sound is much more than this physical definition suggests. Sound carries a wide range of information, impressions and emotions, and may be interpreted by the receiver as, for example, noise, speech or music. This significance for humans makes it desirable to have techniques for recreating acoustic events. Sound reproduction aims at recreating a (virtual) acoustic scene at a different place or at a different time. Done properly, a perfect acoustic illusion of the original scene should be evoked. The goal of sound reproduction is therefore to create a perfect acoustic illusion. However, the human sense of hearing is very sensitive and is capable of detecting minimal differences between the original scene and the reproduced one. The perfect reproduction of recorded or synthetic acoustic scenes has been an active field of research over the last decades. Generations of engineers have invented a wide variety of sound reproduction systems [Tor98, Ste96, Gri00, KTH99, Pol00]. None of these systems, however, has achieved the perfect acoustic illusion. Nevertheless, sound reproduction has improved considerably over the last decades in terms of reproduction quality and spatial perception. In the following, a brief overview of sound reproduction systems is given.
The history of sound reproduction dates back to the invention of the telephone in the late 19th century. Johann Philipp Reis developed a first prototype of the telephone in 1860, which bridged a distance of about 100 meters [Wik05c]. Alexander Graham Bell improved this first prototype and patented the telephone in 1876 [Wik05a]. The telephone can be regarded as one of the first systems for sound transmission. It was developed mainly for speech communication and offered rather poor quality for the transmission of music. It also did not take the binaural nature of human hearing into account. A few years later, researchers realized that binaural reproduction could improve the spatial impression considerably [Tor98]. From these first approaches up to the present day, a wide variety of reproduction systems has been developed. This thesis is mainly concerned with systems developed for the high-quality reproduction of sound and music.
The first reproduction systems in this context consisted of only one loudspeaker and are therefore termed monophonic systems. A system with one loudspeaker can convey the spatial impression of the original scene only to a certain degree. To improve the situation, stereophonic reproduction uses two loudspeakers. Typically, a stereophonic system is designed such that the loudspeakers span an angle of 30°, as seen from the listener, at equal distances from the listener. Most stereophonic systems aim only at recreating the sound field in the horizontal plane. Stereophony is based on principles of stereophonic reproduction, such as amplitude panning [HW98], derived from psychoacoustic research. As a consequence, the correct spatial impression of the original scene can be perceived only at one particular listening position. This position is often referred to as the sweet spot. To improve the situation, stereophony was extended by surround sound reproduction techniques. The driving force behind the further development of surround sound techniques was the film industry. First surround systems, consisting of three loudspeakers in the front and two in the rear, were presented in 1940 [Tor98]. These were likewise based on stereophonic principles and partly exhibited their limitations as well. However, it took some time before surround sound techniques achieved their commercial breakthrough. Today, five-channel surround systems are the state of the art in home theater systems, and cinemas typically use even more channels. Reproduction techniques have also been extended toward three-dimensional reproduction. Vector base amplitude panning [Pul97, Pul99] may serve as an example of a three-dimensional stereophonic reproduction system.

Advanced surround sound systems that overcome the sweet spot and other limitations of stereophonic systems have also been developed [Cam67, KNOBH96, KN93, WA01]. The systems to be named here are wave field synthesis (WFS) [Ber88, Hul04] and higher-order Ambisonics [Dan00]. Both rest on a solid physical foundation. WFS is discussed in detail in Section 5.1. In general, the loudspeaker driving signals are derived from the recorded signals of the original sources, from geometric information about the source positions, and from information about the acoustics of the recording room. This information may have been captured in an existing recording room (for example, a concert hall) or generated artificially. The signal processing algorithms used to compute the loudspeaker signals are derived from fundamental psychoacoustic or physical principles. Although the quality of reproduction has already improved considerably, a number of open problems remain. One of them, common to all systems mentioned above, is the influence of the room in which the reproduction takes place, the listening room. The following section illustrates the influence of the listening room on spatial sound reproduction.

E.3.1 The Influence of the Listening Room

The influence of the listening room on a reproduction system and the reproduced scene is first illustrated in an intuitive way. For this purpose, we consider a simple reproduction scenario in the following. Figure 1.1 shows this simplified scenario. The transmission of an acoustic scene from a church (e.g., a singer performing in a choir) into the listening room serves as an example. For simplicity, Figure 1.1 illustrates the propagation of acoustic waves by means of acoustic rays. The dashed lines in Figure 1.1, from the virtual source to an exemplary listener position, show the acoustic rays of the direct path and of several reflections off the side walls of the recording room. The loudspeaker system in the listening room reproduces the direct path and the reflections in order to recreate the desired spatial impression of the original scene. The theory behind almost all methods in use assumes a reflection-free listening room. However, this idealized assumption is rarely met by typical listening rooms. The solid line in Figure 1.1, from a loudspeaker in the upper row to the listener position, shows a possible reflection of the wave field produced by the loudspeaker off a wall of the listening room. As this simplified example intuitively shows, these additional reflections caused by the listening room can distort the desired spatial impression.

The influence of the listening room on spatial sound reproduction is a topic of current research [Gri98, DZJR05, KS03, Vol98, VTB02, Vol96]. Since the acoustic properties of the listening room and of the reproduction system in use vary over a wide range, no general statements can be made about the perceptual influence of the listening room. However, the reflections added by the listening room will affect the psychoacoustic properties of the reproduced scene. These effects may include, for example, degraded localization and sound coloration. Dominant early reflections in particular appear to affect the desired spatial impression. Another effect of the listening room on low-frequency reproduction may be the occurrence of low-frequency resonances. These resonances adversely affect the perception of short sound events.

In general, a reverberant listening room will imprint its characteristics on the desired impression of the recorded room. Listening room compensation aims at eliminating or reducing the influence of the listening room on the reproduced scene. The following section introduces room compensation systems and gives a brief overview of known approaches.
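The ray picture used above can be made concrete with the classic image-source model, in which each first-order wall reflection of a rectangular room is equivalent to the direct sound of a source mirrored at that wall. Below is a minimal sketch of this idea; the room geometry and positions are illustrative assumptions, not values from the thesis.

```python
import math

C = 343.0  # speed of sound in m/s

def first_order_reflections(src, lis, room):
    """Arrival times and 1/r amplitudes of the direct sound and the six
    first-order wall reflections (image-source model) for a point source
    `src` and listener `lis` in a rectangular room [0,Lx] x [0,Ly] x [0,Lz]."""
    images = [tuple(src)]                       # the direct sound
    for axis, L in enumerate(room):
        for wall in (0.0, L):
            img = list(src)
            img[axis] = 2.0 * wall - img[axis]  # mirror the source at the wall
            images.append(tuple(img))
    taps = [(math.dist(img, lis) / C * 1000.0,  # delay in ms
             1.0 / math.dist(img, lis))         # free-field 1/r decay
            for img in images]
    return sorted(taps)

# Hypothetical 6 m x 5 m x 3 m listening room:
taps = first_order_reflections((2.0, 2.5, 1.2), (4.0, 2.5, 1.2), (6.0, 5.0, 3.0))
```

The direct sound (2 m path, about 5.8 ms) is followed by six additional taps from the walls, floor and ceiling — exactly the kind of reflections that the free-field theory of the reproduction methods assumes to be absent.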

E.3.2 Listening Room Compensation Systems

Various approaches to listening room compensation are possible. A first countermeasure against room reflections is the application of acoustic damping materials, referred to as passive room compensation within this thesis. However, it is well known that acoustic damping is impractical and expensive, and achieves relatively little attenuation. This holds especially for low frequencies. In addition, cost and design considerations limit the effect of this countermeasure. As a consequence, passive room compensation alone cannot provide sufficient suppression of the listening room reflections in a practical application. The basic idea of active listening room compensation is to use concepts of active control to perform the desired compensation. Among the various options, two fundamental approaches can be named here: (1) active control of the acoustic impedance at the walls of the listening room, and (2) use of the reproduction system itself. The first approach attempts to actively influence the impedance of a wall in order to obtain free-field conditions [GKR85, OM53]. The second exploits synergies with the reproduction system to gain control over the wave field in the reproduction area. The approaches based on the latter are briefly summarized in the following.
zu erhalten. Die darauf wiederum basierenden Ansatze werden im Folgenden kurz zusammengefasst.

Ein Uberblicke
u
ber klassische einkanalige Ansatze zur Raumkompensation und deren
Limitierungen finden sich bei [Fie01, Fie03, HM04, Mou94]. Die einkanaligen Ansatze
analysieren das von einem Lautsprecher wiedergegebene Wellenfeld an nur einer Position.
Gemeinsam ist den meisten Ansatze zur aktiven Raumkompensation die grundsatzliche
Idee der Vorverzerrung von Lautsprecheransteuerungssignalen mittels geeigneter Kompensationsfilter, die aus der Analyse des reproduzierten Wellenfeldes berechnet wurden.
Jedoch leiden diese unter drei fundamentalen Problemen. Das erste Problem besteht darin,
dass die Kompensationsfilter aufgrund der zeitveranderlichen Charakteristik des Wiedergaberaumes adaptiv berechnet werden m
ussen. Eine Temperaturveranderung im Wie
dergaberaum resultiert zum Beispiel in einer Anderung
der Schallgeschwindigkeit und
+
damit der akustischen Eigenschaften [OYS 99]. Eine groe Bandbreite von Problemen
ist mit den Algorithmen verbunden, die zur Adaption des Kompensationsfilters genutzt
werden. Das zweite Problem besteht darin, dass Raumimpulsantworten im Allgemeinen
nicht minimalphasig sind und es daher nicht moglich ist, einen exakten inversen Filter
zu berechnen. Jedoch ist der optimale Kompensationsfilter ein inverser Filter zur Impulsantwort vom Lautsprecher zur gemessenen Position. Das dritte Problem besteht darin,
dass der Kompensationsfilter nur f
ur die gemessene Position optimal ist. In der Folge wird
die Performance, im Sinne der Kompensation, mit ansteigender Distanz zum gemessenen
Punkt [TW02, TW03, BHK03, NOBH95] niedriger ausfallen.
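The second problem can be illustrated numerically. A room response with a zero outside the unit circle has no stable causal inverse, but a least-squares FIR inverse combined with a modeling delay approximates one closely at the measured position. Below is a small numpy sketch with a toy two-tap "room response"; all values are illustrative assumptions, not thesis data.

```python
import numpy as np

def ls_inverse(h, g_len, delay):
    """Least-squares FIR inverse g of h: (h * g)[n] approximates a unit
    impulse delayed by `delay` samples."""
    n_out = len(h) + g_len - 1
    H = np.zeros((n_out, g_len))
    for k in range(g_len):              # convolution matrix: H @ g == h * g
        H[k:k + len(h), k] = h
    d = np.zeros(n_out)
    d[delay] = 1.0                      # delayed target impulse
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return g

h = np.array([0.5, 1.0])                # non-minimum-phase: zero at z = -2
g = ls_inverse(h, g_len=64, delay=32)   # modeling delay of 32 samples
err = np.convolve(h, g)
err[32] -= 1.0                          # residual after removing the target
```

With the modeling delay the residual drops to numerical noise; repeating the experiment with `delay=0` leaves a residual on the order of one — precisely the non-minimum-phase limitation described above.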
To overcome the latter two of these problems, various authors have proposed multichannel active room compensation systems. Here, the acoustic properties of the listening room are measured from one or more loudspeaker positions to one or more microphone positions. In the multichannel case, the computation of exact inverse filters is possible in most practical situations [MK88]. The undesirable position dependence of active room compensation improves as well. However, most classic multichannel methods use only a small number of loudspeakers and analysis positions. Accordingly, they provide neither sufficient control over the wave field nor sufficient analysis of the reproduced wave field. As a result, the influence of the listening room is compensated mainly at the analyzed positions, with the potential for severe artifacts besides these positions. These approaches are therefore referred to as multi-point compensation approaches. Advanced reproduction systems such as WFS and higher-order Ambisonics can offer an improvement in terms of control over the reproduced wave field. However, these also require a high number of reproduction channels. A sufficient analysis of the reproduced wave field additionally requires a high number of analysis channels. Unfortunately, the adaptation of the compensation filters for a scenario with many reproduction and analysis channels is subject to fundamental problems [SMH95]. The following requirements for an improved listening room compensation system can be derived from the problems of the classic methods discussed above.

Improved room compensation methods should

1. derive analysis signals from the entire listening area, not only from selected points,

2. be based on a spatial reproduction system that provides control over the acoustic wave field within the entire listening area, and

3. use an improved multichannel adaptation algorithm that copes with the fundamental problems of multichannel adaptation with many channels.
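The exact multichannel inversion cited above ([MK88], the MINT) can be sketched for the smallest case: two room responses h1 and h2 that share no common zeros admit FIR filters g1, g2 with h1*g1 + h2*g2 = delta — an exact, delay-free inverse that a single channel cannot provide. Below is a numpy illustration with toy responses; the values are illustrative and this is not the thesis implementation.

```python
import numpy as np

def conv_matrix(h, g_len):
    """Convolution matrix C with C @ g == np.convolve(h, g)."""
    C = np.zeros((len(h) + g_len - 1, g_len))
    for k in range(g_len):
        C[k:k + len(h), k] = h
    return C

def mint(h1, h2, g_len):
    """Solve h1*g1 + h2*g2 = unit impulse. For two channels of length L
    this is a square system once g_len = L - 1, and it is nonsingular
    exactly when h1 and h2 have no common zeros (the MINT condition)."""
    H = np.hstack([conv_matrix(h1, g_len), conv_matrix(h2, g_len)])
    d = np.zeros(H.shape[0])
    d[0] = 1.0                            # no modeling delay needed
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return g[:g_len], g[g_len:]

h1 = np.array([1.0, 0.6, 0.2])            # toy responses without common zeros
h2 = np.array([0.8, -0.5, 0.3])
g1, g2 = mint(h1, h2, g_len=2)
total = np.convolve(h1, g1) + np.convolve(h2, g2)   # ~ unit impulse
```

For longer responses the required filter lengths grow accordingly, and the fundamental adaptation problems mentioned above appear when this construction is scaled to many reproduction and analysis channels.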
This thesis derives, discusses and evaluates a novel, efficient approach to active listening room compensation that results in an enlarged compensated area. In the following, a brief overview of this thesis is given.

E.3.3 Overview of this Thesis

This thesis is structured as follows: Chapter 2 introduces the fundamentals of sound propagation. These serve as the basis for the remaining chapters. The wave equation, its solutions, and wave field decompositions are discussed. Chapter 3 addresses the first of the requirements stated above and introduces the Fourier-based analysis of wave fields. The Fourier analysis of arbitrary multidimensional signals is specialized to the case of acoustic fields, resulting in efficient methods for wave field analysis (WFA) of the reproduced wave field. Chapter 4 first discusses the fundamentals of advanced reproduction systems that provide sufficient control. Then the influence of the listening room and the adaptive computation of the compensation filters are analyzed, as well as the fundamental problems of classic adaptation algorithms for a high number of reproduction and analysis channels. Spatio-temporal signal and system transformations are then proposed as a solution to these fundamental problems. This leads to an improved listening room compensation system that meets all of the requirements stated above. Chapter 5 presents WFS as one possible implementation of a spatial sound reproduction system, discusses the artifacts of WFS and WFA, and introduces an improved listening room compensation system for WFS. Results from simulated and measured acoustic environments are presented. Finally, Chapter 6 gives a summary of this thesis and draws some conclusions.
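For the circular microphone arrays treated in Chapter 3, the core of WFA reduces to a Fourier series over the array angle: the pressure samples on the circle are transformed into circular harmonic coefficients. A minimal numpy sketch for a monochromatic plane wave follows; the array and wave parameters are illustrative assumptions.

```python
import numpy as np

def circular_harmonics(p_on_circle):
    """Circular harmonic coefficients of a pressure field sampled at N
    equiangular microphones on a circle: an FFT over the angle."""
    N = len(p_on_circle)
    return np.fft.fft(p_on_circle) / N        # coefficient of exp(1j*nu*theta)

N, R, k, theta0 = 64, 1.5, 10.0, np.pi / 3    # mics, radius, wavenumber, incidence
theta = 2.0 * np.pi * np.arange(N) / N
p = np.exp(1j * k * R * np.cos(theta - theta0))   # plane wave on the circle

c = circular_harmonics(p)
orders = np.fft.fftfreq(N, d=1.0 / N)         # harmonic order nu of each bin
```

By the Jacobi-Anger expansion the coefficients decay rapidly beyond |ν| ≈ kR = 15, so the field on the circle is effectively order-limited; this order limit is the reason why a finite number of microphones suffices below the spatial aliasing frequency.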

E.4 Summary and Conclusions

From its beginnings, sound reproduction has strived to create the perfect acoustic illusion. Many reproduction systems with the ambition of meeting this goal have emerged over the past decades. None of these systems, however, has achieved the perfect acoustic illusion. Nevertheless, reproduction has improved considerably in terms of quality and spatial impression compared to the first monophonic systems [Tor98, Ste96, Gri00].
Die Theorie hinter den meisten Reproduktionssystemen nimmt eine Freifeldausbreitung
von den Lautsprechern zum Zuhorer an. Ungl
ucklicherweise wird diese Annahme von
Reproduktionssystemen, die in einem Wiedergaberaum platziert sind nicht erf
ullt. Der
Wiedergaberaum gestaltet sich typischerweise als Kompromiss zwischen Design, Kosten
und akustischen Anforderungen. Als Folge sind die meisten Wiedergaberaume mehr oder
weniger hallend und pragen ihre Reflektionen auf das reproduzierte Wellenfeld auf. Vorliegende Arbeit entwickelt ein theoretisches Framework f
ur die aktive Kompensation der
reflections in the listening room. The basic idea is to use the already existing sound reproduction system to create destructive interference that suppresses these reflections. However, this requires an additional analysis of the influence of the listening room. In order to suppress the reflections within a large listening area, two fundamental requirements on the reproduction and analysis system were derived: (1) the analysis system should allow (perfect) analysis of the reproduced wave field within the listening area, and (2) the reproduction system should provide (perfect) control over the wave field within the listening area. Chapter 3 introduces wave field analysis (WFA) based on the decomposition into plane and circular harmonic waves as a solution to the first requirement. Circular microphone arrays, in particular, exhibit many desirable properties for WFA. Chapters 4.1 and 5.1 introduce sound reproduction systems with densely spaced loudspeakers and wave field synthesis (WFS) as possible solutions to the second requirement. Both WFA and WFS are realized with loudspeakers and microphones at discrete positions. A high number of analysis and reproduction channels provides sufficient analysis and control capabilities up to an upper frequency limit and size of the listening area. Note that the multi-point methods use only a relatively small number of channels and therefore do not provide sufficient acoustic analysis and control.

Section 4.4.4 shows that the traditional approaches to adaptive inverse filtering are not applicable to room compensation for reproduction systems with a very large number of channels. A spatial decoupling of the MIMO adaptive inverse filtering problem on the basis of signal and system transformations offers a solution. The optimal transformation is given by the GSVD. The GSVD, in turn, depends on the acoustic properties of the listening room. In general, these properties are unknown and may change over time. To overcome these problems, the concept of wave-domain adaptive filtering proposes a transformation that is independent of the properties of the listening room. Its drawback is that this transformation does not result in an optimal decoupling of the multichannel adaptive system. Section 5.3.2 uses WFA, WFS and wave-domain adaptive filtering to develop an improved listening room compensation system. Offline simulations with this system showed that the proposed methods allow compensation of the listening room within the entire listening area. In addition, wave-domain adaptive filtering reduces the complexity significantly: instead of 2304 (correlated) compensation filters, only 48 (uncorrelated) compensation filters were required.
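The decoupling idea can be sketched numerically. The following toy example is an illustration under idealized assumptions, not the setup used in the thesis: for a perfectly rotation-symmetric room and array, the single-frequency MIMO matrix coupling N loudspeakers to N microphones is circulant, and the spatial DFT, the discrete counterpart of a circular harmonics transform, diagonalizes it, so the N x N coupled inverse filtering problem collapses into N independent scalar ones.

```python
import numpy as np

def circulant(first_col):
    """Build a circulant matrix whose columns are cyclic shifts of first_col."""
    n = len(first_col)
    return np.array([np.roll(first_col, k) for k in range(n)]).T

# Toy MIMO room matrix at a single frequency: N loudspeakers -> N microphones.
# Rotational symmetry of the (idealized) setup makes the matrix circulant.
N = 8
rng = np.random.default_rng(0)
H = circulant(rng.standard_normal(N) + 1j * rng.standard_normal(N))

# Unitary spatial DFT matrix: the discrete analogue of the transform
# into circular harmonic components.
F = np.fft.fft(np.eye(N)) / np.sqrt(N)

# Transforming both loudspeaker and microphone signals diagonalizes the system.
H_wave = F @ H @ F.conj().T
off_diag = H_wave - np.diag(np.diag(H_wave))
print(np.max(np.abs(off_diag)))   # ~0: N^2 couplings collapse to N components

# In the transformed domain, inverse filtering is one scalar division per
# component instead of a joint N x N adaptation.
eq = np.diag(1.0 / np.diag(H_wave))
print(np.allclose(F.conj().T @ eq @ F @ H, np.eye(N)))  # True
```

For a real listening room the matrix is not exactly circulant, which is why a fixed, room-independent transform yields only an approximate decoupling, as noted above.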
It has been shown [Hay96] that the functional block of an adaptive filter can be applied to four fundamental classes of problems: (1) inverse modeling, (2) identification, (3) interference cancellation and (4) prediction. Consequently, the presented concept of wave-domain adaptive filtering can also be used for other adaptation problems with a very large number of channels. The first class is discussed in detail within this thesis. The second class is typically used for acoustic echo cancellation, the third for acoustic human-machine interfaces and the fourth for coding. The concept of wave-domain adaptive filtering has also been applied successfully to acoustic echo cancellation for systems with a very large number of channels [BSK04b, BSK04c, BSK04a] and to adaptive multichannel beamforming [HNS+05]. Thus, wave-domain adaptive filtering provides a generic framework for adaptive filtering with a very large number of channels. The application of this framework is not necessarily limited to acoustic wave fields; it could also prove very useful for non-acoustic applications.
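As a minimal single-channel example of the second class (identification), the sketch below identifies an unknown FIR path with an NLMS update, the core operation behind acoustic echo cancellation. The filter length, step size and white-noise excitation are arbitrary illustrative choices, not parameters from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
h_true = rng.standard_normal(16)      # unknown echo path (FIR coefficients)
L = len(h_true)

h_est = np.zeros(L)                   # adaptive filter coefficients
x_buf = np.zeros(L)                   # most recent input samples (newest first)
mu, eps = 0.5, 1e-6                   # NLMS step size and regularization

for n in range(20000):
    x = rng.standard_normal()         # white-noise excitation
    x_buf = np.roll(x_buf, 1)
    x_buf[0] = x
    d = h_true @ x_buf                # desired signal: output of unknown path
    e = d - h_est @ x_buf             # a-priori error
    h_est += mu * e * x_buf / (x_buf @ x_buf + eps)   # NLMS coefficient update

print(np.max(np.abs(h_est - h_true)))  # converges towards 0
```

In the multichannel case discussed above, the same scalar update would run independently on each decoupled wave-domain component.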
This thesis concentrated mainly on the theoretical foundations of WFA, sound reproduction, eigenspace adaptive filtering and wave-domain adaptive filtering. For a practical implementation of these techniques, however, additional practical aspects have to be considered. Some of them have already been named in Sections 5.2.5 and 5.3.6, for example transducer variations and misplacement as well as transducer and amplifier noise. These issues are discussed for arbitrary microphone arrays, e.g., in [Tre02, JD93, BW01], and for circular microphone arrays, e.g., in [Teu05]. If they are not taken into account appropriately, the resulting artifacts can considerably reduce the achievable benefit of room compensation.
As discussed in this thesis, the goal of active listening room compensation is to perfectly suppress the reflections added by the listening room. From a psychoacoustic point of view, this may not be necessary in all situations. For a specific scenario it could, for example, be sufficient to suppress the first reflections produced by a particular wall. The presented framework can be extended straightforwardly to this end, since it provides full control over the spatio-temporal structure of the reproduced wave field. The benefit would be a further reduction of the complexity of active room compensation algorithms. However, the author is not aware of results from psychoacoustic research that could be applied directly for this purpose.


Bibliography

[AB79] J.B. Allen and D.A. Berkley. Image method for efficiently simulating small-room acoustics. Journal of the Acoustical Society of America, 65(4):943–950, 1979.

[ACD+03] A. Averbuch, R.R. Coifman, D.L. Donoho, M. Elad, and M. Israeli. Accurate and fast polar Fourier transform. In 37th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, Nov 2003.

[AKG] AKG Acoustics. http://www.akg.com/.

[ALE] Alesis. http://www.alesis.com/.

[ALS] The Advanced Linux Sound Architecture (ALSA). http://www.alsa-project.org/.

[AR75] N. Ahmed and K.R. Rao. Orthogonal Transforms for Digital Signal Processing. Springer, 1975.

[AS72] M. Abramowitz and I.A. Stegun. Handbook of Mathematical Functions. Dover Publications, 1972.

[AW01] G.B. Arfken and H.J. Weber. Mathematical Methods for Physicists. Academic Press, 2001.

[BA05a] T. Betlehem and T.D. Abhayapala. A modal approach to soundfield reproduction in reverberant rooms. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages III 289–292, Philadelphia, PA, USA, 2005.

[BA05b] T. Betlehem and T.D. Abhayapala. Theory and design of sound field reproduction in reverberant rooms. Journal of the Acoustical Society of America, 117(4):2100–2111, April 2005.

[Baa05] M.A.J. Baalman. Discretization of complex sound sources for reproduction with wave field synthesis. In 31. Deutsche Jahrestagung für Akustik, Munich, Germany, 2005.

[Bam89] R. Bamler. Mehrdimensionale lineare Systeme. Springer, 1989.

[BBK03] H. Buchner, J. Benesty, and W. Kellermann. Multichannel frequency-domain adaptive algorithms with application to acoustic echo cancellation. In J. Benesty and Y. Huang, editors, Adaptive signal processing: Application to real-world problems. Springer, 2003.

[BdB00] M.M. Boone and W. de Bruijn. On the applicability of distributed mode loudspeaker panels for wave field synthesis based sound reproduction. In 108th AES Convention, Paris, France, Feb 2000. Audio Engineering Society (AES).

[BdVV93] A.J. Berkhout, D. de Vries, and P. Vogel. Acoustic control by wave field synthesis. Journal of the Acoustical Society of America, 93(5):2764–2778, May 1993.

[Ber87] A.J. Berkhout. Applied Seismic Wave Theory. Elsevier, 1987.

[Ber88] A.J. Berkhout. A holographic approach to acoustic control. Journal of the Audio Engineering Society, 36:977–995, December 1988.

[BHK03] S. Bharitkar, P. Hilmes, and C. Kyriakakis. Sensitivity of multichannel room equalization to listener position. In IEEE International Conference on Multimedia and Expo (ICME), Baltimore, USA, July 2003.

[Bja95] E. Bjarnason. Analysis of the filtered-X LMS algorithm. IEEE Transactions on Speech and Audio Processing, 3(6):504–514, November 1995.

[Bla96] J. Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization. MIT Press, 1996.

[Bla00] D.T. Blackstock. Fundamentals of Physical Acoustics. J. Wiley & Sons, 2000.

[Ble84] N. Bleistein. Mathematical methods for wave phenomena. Academic Press, 1984.

[BMS98] J. Benesty, D.R. Morgan, and M.M. Sondhi. A better understanding and an improved solution to the specific problems of stereophonic acoustic echo cancellation. IEEE Transactions on Speech and Audio Processing, 6(2):156–165, March 1998.

[BQ00] M. Bouchard and S. Quednau. Multichannel recursive-least-squares algorithms and fast-transversal-filter algorithms for active noise control and sound reproduction systems. IEEE Transactions on Speech and Audio Processing, 8(5):606–618, September 2000.

[Bra78] R.N. Bracewell. The Fourier Transform and its Applications. McGraw-Hill, 1978.

[Bra03] R.N. Bracewell. Fourier Analysis and Imaging. Kluwer, 2003.

[BSHK03a] H. Buchner, S. Spors, W. Herbordt, and W. Kellermann. Vorrichtung und Verfahren zum Verarbeiten eines akustischen Eingangssignals. Patent DE10351793, 2003.

[BSHK03b] H. Buchner, S. Spors, W. Herbordt, and W. Kellermann. Vorrichtung und Verfahren zum Verarbeiten eines Eingangssignals. Patent DE10362073, 2003.

[BSK04a] H. Buchner, S. Spors, and W. Kellermann. Full-duplex systems for sound field recording and auralization based on Wave Field Synthesis. In 116th AES Convention, Berlin, Germany, 2004. Audio Engineering Society (AES).

[BSK04b] H. Buchner, S. Spors, and W. Kellermann. Wave-domain adaptive filtering: Acoustic echo cancellation for full-duplex systems based on wave-field synthesis. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Montreal, Canada, 2004.

[BSK04c] H. Buchner, S. Spors, and W. Kellermann. Wave-domain adaptive filtering for acoustic human-machine interfaces based on wavefield analysis and synthesis. In European Signal Processing Conference (EUSIPCO), 2004.

[BSP01] S. Brix, T. Sporer, and J. Plogsties. CARROUSO – An European approach to 3D-audio. In 110th AES Convention. Audio Engineering Society (AES), May 2001.

[BW01] M. Brandstein and D. Ward. Microphone Arrays: Signal Processing Techniques and Applications. Springer, 2001.

[Cam67] M. Camras. Approach to recreating a sound field. Journal of the Acoustical Society of America, 43(6):1425–1431, Nov. 1967.

[CAR] The CARROUSO project. http://emt.iis.fhg.de/projects/carrouso.

[CCW03] T. Caulkins, E. Corteel, and O. Warusfel. Wave field synthesis interaction with the listening environment, improvements in the reproduction of virtual sources situated inside the reproduction room. In 6th International Conference on Digital Audio Effects (DAFx-03), London, UK, Sept. 2003.

[CHP02] E. Corteel, U. Horbach, and R.S. Pellegrini. Multichannel inverse filtering of multiexciter distributed mode loudspeakers for wave field synthesis. In 112th AES Convention, Munich, Germany, May 2002. Audio Engineering Society (AES).

[CN03] E. Corteel and R. Nicol. Listening room compensation for wave field synthesis. What can be done? In 23rd AES International Conference, Copenhagen, Denmark, May 2003. Audio Engineering Society (AES).

[CNC04] D. Cabrera, A. Nguyen, and Y.J. Choi. Auditory versus visual spatial impression: A study of two auditoria. In 10th Meeting of the International Conference on Auditory Display (ICAD), Sydney, Australia, July 2004.

[Dan00] J. Daniel. Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia. PhD thesis, Université Paris 6, 2000.

[dB04] W. de Bruijn. Application of Wave Field Synthesis in Videoconferencing. PhD thesis, Delft University of Technology, 2004.

[Dea93] S.R. Deans. The Radon Transform and Some of its Applications. Krieger Publishing Company, 1993.

[Dep88] Ed.F. Deprettere, editor. SVD and Signal Processing: Algorithms, Applications and Architectures. North-Holland, 1988.

[DGZD01] R. Duraiswami, N.A. Gumerov, D.N. Zotkin, and L.S. Davis. Efficient evaluation of reverberant sound fields. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pages 203–206, New Paltz, USA, Oct 2001.

[DNM03] J. Daniel, R. Nicol, and S. Moreau. Further investigations of high order ambisonics and wavefield synthesis for holophonic sound imaging. In 114th AES Convention, Amsterdam, The Netherlands, March 2003. Audio Engineering Society (AES).

[dVSV94] D. de Vries, E.W. Start, and V.G. Valstar. The Wave Field Synthesis concept applied to sound reinforcement: Restrictions and solutions. In 96th AES Convention, Amsterdam, Netherlands, February 1994. Audio Engineering Society (AES).

[DZJR05] M. Dewhirst, S. Zielinski, P. Jackson, and F. Rumsey. Objective assessment of spatial localization attributes of surround-sound reproduction systems. In 118th AES Convention, Barcelona, Spain, May 2005. Audio Engineering Society (AES).

[ELA] ELAC Electroacoustic GmbH. http://www.elac.com/.

[FHLB99] P. Filippi, D. Habault, J.P. Lefebvre, and A. Bergassoli. Acoustics: Basic physics, theory and methods. Academic Press, 1999.

[Fie01] L.D. Fielder. Practical limits for room equalization. In 111th AES Convention, New York, NY, USA, September 2001. Audio Engineering Society (AES).

[Fie03] L.D. Fielder. Analysis of traditional and reverberation-reducing methods of room equalization. Journal of the Audio Engineering Society, 51(1/2):3–26, Jan./Feb. 2003.

[Fli02] P.G. Flikkema. An algebraic theory of 3D sound synthesis with loudspeakers. In 22nd International Conference on Virtual, Synthetic and Entertainment Audio. Audio Engineering Society (AES), June 2002.

[Fur01] K. Furuya. Noise reduction and dereverberation using correlation matrix based on the multiple-input/output inverse-filtering theorem (MINT). In Int. Workshop on Hands-Free Speech Communication, Kyoto, Japan, April 2001.

[Gar00] J. Garas. Adaptive 3D Sound Systems. Kluwer Academic Publishers, 2000.

[Gas78] J.D. Gaskill. Linear Systems, Fourier Transforms, and Optics. John Wiley & Sons, 1978.

[GD04] N.A. Gumerov and R. Duraiswami. Fast Multipole Methods for the Helmholtz Equation in Three Dimensions. Elsevier, 2004.

[Ger85] M.A. Gerzon. Ambisonics in multichannel broadcasting and video. Journal of the Audio Engineering Society, 33(11):859–871, Nov. 1985.

[GKR85] D. Guicking, K. Karcher, and M. Rollwage. Coherent active methods for applications in room acoustics. Journal of the Acoustical Society of America, 78(4):1426–1434, October 1985.

[GL89] G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press, 1989.

[GR65] I.S. Gradshteyn and I.M. Ryzhik. Tables of Integrals, Series, and Products. Academic Press, 1965.

[Gri98] D. Griesinger. Multichannel sound systems and their interaction with the room. In 16th International Conference on Audio, Acoustics, and Small Spaces, pages 159–173. Audio Engineering Society (AES), Oct./Nov. 1998.

[Gri00] D. Griesinger. Surround: The current technological situation. In 108th AES Convention, Paris, France, Feb. 2000. Audio Engineering Society (AES).

[Gro05] P. Grond. Extraction of 3D information from 2D array measurements. Master's thesis, Laboratory of Acoustical Imaging and Sound Control, Delft University of Technology, 2005.

[GRS01] B. Girod, R. Rabenstein, and A. Stenger. Signals and Systems. J. Wiley & Sons, 2001.

[Hay96] S. Haykin. Adaptive Filter Theory. Prentice-Hall, 1996.

[HdVB01] E. Hulsebos, D. de Vries, and E. Bourdillat. Improved microphone array configurations for auralization of sound fields by Wave Field Synthesis. In 110th AES Convention, Amsterdam, Netherlands, May 2001. Audio Engineering Society (AES).

[HdVB02] E. Hulsebos, D. de Vries, and E. Bourdillat. Improved microphone array configurations for auralization of sound fields by Wave Field Synthesis. Journal of the Audio Engineering Society, 50(10), Oct. 2002.

[HM04] P. Hatziantoniou and J. Mourjopoulos. Errors in real-time room acoustics dereverberation. Journal of the Audio Engineering Society, 52(9):883–899, September 2004.

[HNS+05] W. Herbordt, S. Nakamura, S. Spors, H. Buchner, and W. Kellermann. Wave field cancellation using wave-domain adaptive filtering. In Hands-Free Speech Communication and Microphone Arrays, New Jersey, USA, 2005.

[HS97] C.H. Hansen and S.D. Snyder. Active Control of Noise and Vibration. E&FN Spon, 1997.

[HSdVB03] E. Hulsebos, T. Schuurmanns, D. de Vries, and R. Boone. Circular microphone array recording for discrete multichannel audio recording. In 114th AES Convention, Amsterdam, Netherlands, March 2003. Audio Engineering Society (AES).

[Hul04] E. Hulsebos. Auralization using Wave Field Synthesis. PhD thesis, Delft University of Technology, 2004.

[HV99] S. Haykin and B. Van Veen. Signals and Systems. Wiley, 1999.

[HW98] C. Hugonnet and P. Walder. Stereophonic Sound Recording: Theory and Practice. Wiley, 1998.

[ITU97] ITU. Recommendation ITU-R BS.1116-1, 1994–1997.

[JAC] The JACK audio connection kit. http://jackit.sourceforge.net/.

[Jag84] D.S. Jagger. Recent developments and improvements in soundfield microphone technology. In 75th AES Convention, Paris, France, March 1984. Audio Engineering Society (AES).

[JD93] D.H. Johnson and D.E. Dudgeon. Array Signal Processing: Concepts and Techniques. Prentice-Hall, 1993.

[JF86] M.C. Junger and D. Feit. Sound, Structures, and Their Interaction. Acoustical Society of America, 1986.

[JKA02] H.M. Jones, A. Kennedy, and T.D. Abhayapala. On dimensionality of multipath fields: Spatial extent and richness. In Proc. Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP '02), Orlando, USA, May 2002.

[JN84] N.S. Jayant and P. Noll. Digital Coding of Waveforms. Prentice-Hall, 1984.

[KdVBB05] M. Kuster, D. de Vries, D. Beer, and S. Brix. Structural and acoustic analysis of multi-actuator panels. In 118th AES Convention, Barcelona, Spain, May 2005. Audio Engineering Society (AES).

[KN93] O. Kirkeby and P.A. Nelson. Reproduction of plane wave sound fields. Journal of the Acoustical Society of America, 94(5):2992–3000, Nov. 1993.

[KNOBH96] O. Kirkeby, P.A. Nelson, F. Orduna-Bustamante, and H. Hamada. Local sound field reproduction using digital signal processing. Journal of the Acoustical Society of America, 100(3):1584–1593, Sept. 1996.

[KS03] B. Klehs and T. Sporer. Wave field synthesis in the real world: Part 1 – In the living room. In 114th AES Convention, Amsterdam, The Netherlands, March 2003. Audio Engineering Society (AES).

[KTH99] S. Kiriakakis, P. Tsakalides, and T. Holman. Surrounded by sound. IEEE Signal Processing Magazine, 16(1):55–66, Jan. 1999.

[LB05] D. Leckschat and M. Baumgartner. Wellenfeldsynthese: Untersuchungen zu Alias-Artefakten im Ortsfrequenzbereich und Realisierung eines praxistauglichen WFS-Systems. In 31. Deutsche Jahrestagung für Akustik, Munich, Germany, 2005.

[LMS] Multimedia Communications and Signal Processing at the University of Erlangen-Nuremberg. http://www.lnt.de/LMS (last viewed on 4/16/2007).

[MAT] The MathWorks. http://www.mathworks.com/.

[Mec02] F.P. Mechel. Formulas of Acoustics. Springer, 2002.

[Mey04] J. Meyer. Spherical microphone arrays for 3D sound reproduction. In Y. Huang and J. Benesty, editors, Audio Signal Processing for Next-Generation Multimedia Communication Systems. Kluwer Academic Publishers, 2004.

[MF53a] P.M. Morse and H. Feshbach. Methods of theoretical physics. Part I. McGraw-Hill, New York, 1953.

[MF53b] P.M. Morse and H. Feshbach. Methods of theoretical physics. Part II. McGraw-Hill, New York, 1953.

[MK88] M. Miyoshi and Y. Kaneda. Inverse filtering of room acoustics. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36(2):145–152, February 1988.

[Mou94] J.N. Mourjopoulos. Digital equalization of room acoustics. Journal of the Audio Engineering Society, 42(11):884–900, November 1994.

[NA79] S.T. Neely and J.B. Allen. Invertibility of a room impulse response. Journal of the Acoustical Society of America, 66:165–169, July 1979.

[NE98] R. Nicol and M. Emerit. Reproducing 3D-sound for videoconferencing: A comparison between holophony and ambisonic. In First COST-G6 Workshop on Digital Audio Effects (DAFX98), Barcelona, Spain, Nov. 1998.

[NE99] R. Nicol and M. Emerit. 3D-sound reproduction over an extensive listening area: A hybrid method derived from holophony and ambisonic. In 16th Int. Conference of the Audio Engineering Society (AES), Rovaniemi, Finland, April 1999.

[NOBH95] P.A. Nelson, F. Orduna-Bustamante, and H. Hamada. Inverse filter design and equalization zones in multichannel sound reproduction. IEEE Transactions on Speech and Audio Processing, 3(3):185–192, May 1995.

[Oes56] H.L. Oestreicher. Representation of the field of an acoustic source as a series of multipole fields. Journal of the Acoustical Society of America, 29(11):1219–1222, November 1956.

[OM53] H.F. Olson and E.G. May. Electronic sound absorber. Journal of the Acoustical Society of America, 25(6):1130–1136, November 1953.

[OS99] A.V. Oppenheim and R.W. Schafer. Discrete-time signal processing. Prentice-Hall, 1999.

[OYS+99] M. Omura, M. Yada, H. Saruwatari, S. Kajita, K. Takeda, and F. Itakura. Compensating of room acoustic transfer functions affected by change of room temperature. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Phoenix, USA, March 1999.

[Pap68] A. Papoulis. Systems and Transforms with Applications in Optics. McGraw-Hill, 1968.

[PEBL05] B. Pueo, J. Escolano, S. Bleda, and J.J. Lopez. An approach for wave field synthesis high power applications. In 118th AES Convention, Barcelona, Spain, May 2005. Audio Engineering Society (AES).

[Pet04] S. Petrausch. Solution of the wave equation using the functional transformation method. LMS Internal Report, 19th April 2004.

[Pie91] A.D. Pierce. Acoustics. An Introduction to its Physical Principles and Applications. Acoustical Society of America, 1991.

[Pol00] M.A. Poletti. A unified theory of horizontal holographic sound systems. Journal of the Audio Engineering Society, 48(12):1155–1182, December 2000.

[PR04] S. Petrausch and R. Rabenstein. A simplified design of multidimensional transfer function models. In International Workshop on Spectral Methods and Multirate Signal Processing (SMMSP), Vienna, Austria, September 2004.

[PR05] S. Petrausch and R. Rabenstein. Highly efficient simulation and visualization of acoustic wave fields with the functional transformation method. In Simulation and Visualization, pages 279–290, Magdeburg, March 2005. Otto von Guericke Universität.

[PSR05] S. Petrausch, S. Spors, and R. Rabenstein. Simulation and visualization of room compensation for wave field synthesis with the functional transformation method. In 119th AES Convention, New York, USA, 2005. Audio Engineering Society (AES).

[Pul97] V. Pulkki. Virtual sound source positioning using vector base amplitude panning. Journal of the Audio Engineering Society, 45(6):456–466, June 1997.

[Pul99] V. Pulkki. Uniform spreading of amplitude panned virtual sources. In Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, USA, Oct. 1999.

[Rad17] J. Radon. Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten. Berichte Sächsische Akademie der Wissenschaften, 69:262–267, 1917.

[Red05] J.N. Reddy. Introduction to the finite element method. McGraw-Hill, 2005.

[RME] RME Intelligent Audio Solutions. http://www.rme-audio.com/.

[RV89] D.D. Rife and J. Vanderkooy. Transfer-function measurement with maximum-length sequences. Journal of the Audio Engineering Society, 37(6):419–444, June 1989.

[SBR04a] S. Spors, H. Buchner, and R. Rabenstein. Adaptive listening room compensation for spatial audio systems. In European Signal Processing Conference (EUSIPCO), 2004.

[SBR04b] S. Spors, H. Buchner, and R. Rabenstein. Efficient active listening room compensation for Wave Field Synthesis. In 116th AES Convention, Berlin, Germany, 2004. Audio Engineering Society (AES).

[SBR04c] S. Spors, H. Buchner, and R. Rabenstein. A novel approach to active listening room compensation for wave field synthesis using wave-domain adaptive filtering. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Montreal, Canada, 2004.

[Sca97] J.A. Scales. Theory of Acoustic Imaging. Samizdat Press, 1997.

[SdVL98] J.-J. Sonke, D. de Vries, and J. Labeeuw. Variable acoustics by wave field synthesis: A closer look at amplitude effects. In 104th AES Convention, Amsterdam, Netherlands, May 1998. Audio Engineering Society (AES).

[SGP+95] V. Sanchez, P. Garcia, A.M. Peinado, J.C. Segura, and A.J. Rubino. Diagonalizing properties of the discrete cosine transforms. IEEE Transactions on Signal Processing, 43(11), Nov. 1995.

[SH94] S.D. Snyder and C.H. Hansen. The effect of transfer function estimation errors on the filtered-X LMS algorithm. IEEE Transactions on Signal Processing, 42(4):950–953, April 1994.

[SH00] A. Sontacchi and R. Hoeldrich. Enhanced 3D sound field synthesis and reproduction system by compensating interfering reflections. In Conference on Digital Audio Effects (DAFX-00), Verona, Italy, Dec. 2000.

[SK92] M.M. Sondhi and W. Kellermann. Adaptive echo cancellation for speech signals. In S. Furui and M.M. Sondhi, editors, Advances in Speech Signal Processing, chapter 11, pages 327–356. Marcel Dekker, 1992.

[SKR03] S. Spors, A. Kuntz, and R. Rabenstein. Listening room compensation for wave field synthesis. In IEEE International Conference on Multimedia and Expo (ICME), pages 725–728, Baltimore, USA, July 2003.

[SMH95] M.M. Sondhi, D.R. Morgan, and J.L. Hall. Stereophonic acoustic echo cancellation – an overview of the fundamental problem. IEEE Signal Processing Letters, 2(8):148–151, August 1995.

[Sne72] I.N. Sneddon. The Use of Integral Transforms. McGraw-Hill, New York, 1972.

[Son67] M.M. Sondhi. An adaptive echo canceller. The Bell System Technical Journal, 46(3):497–511, March 1967.

[Spo04] T. Sporer. Wave field synthesis – Generation and reproduction of natural sound environments. In 7th International Conference on Digital Audio Effects (DAFx-04), Naples, Italy, Oct. 2004.

[SRR05] S. Spors, M. Renk, and R. Rabenstein. Limiting effects of active room compensation using wave field synthesis. In 118th AES Convention, Barcelona, Spain, May 2005. Audio Engineering Society (AES).

[SS67] R. Sauer and I. Szabo. Mathematische Hilfsmittel des Ingenieurs. Teil I. Springer, 1967.

[SSR05] S. Spors, D. Seuberth, and R. Rabenstein. Multiexciter panel compensation for wave field synthesis. In 31st German Annual Conference on Acoustics (DAGA), 2005.

[Sta97] E.W. Start. Direct Sound Enhancement by Wave Field Synthesis. PhD thesis, Delft University of Technology, 1997.

[Ste96] G. Steinke. Surround sound – The new phase. An overview. In 100th AES Convention, Copenhagen, Denmark, May 1996. Audio Engineering Society (AES).

[STR02] S. Spors, H. Teutsch, and R. Rabenstein. High-quality acoustic rendering with wave field synthesis. In Vision, Modelling and Visualization (VMV), pages 101–108, November 2002.

[TAG+01] M. Tanter, J.-F. Aubry, J. Gerber, J.-L. Thomas, and M. Fink. Optimal focusing by spatio-temporal inverse filter. I. Basic principles. Journal of the Acoustical Society of America, 110(1):37–47, July 2001.

[TBB00] O.J. Tobias, J.C. Bermudez, and N.J. Bershad. Mean weight behavior of the filtered-X LMS algorithm. IEEE Transactions on Signal Processing, 48(4):1061–1075, April 2000.

[Teu05] H. Teutsch. Wavefield Decomposition using Microphone Arrays and its Application to Acoustic Scene Analysis. PhD thesis, University of Erlangen-Nuremberg, 2005. http://www.lnt.de/lms/publications (last viewed on 4/16/2007).

[TGAA+03] S. Torres-Guijarro, J. Ander, B. Alava, F.J. Casjus-Quiros, and L.I. Ortiz-Berenguer. Multichannel audio decorrelation for coding. In 6th Int. Conference on Digital Audio Effects (DAFX-03), London, UK, Sept. 2003.

[Tof96] P. Toft. The Radon Transform. Theory and Implementation. PhD thesis, Technical University of Denmark, 1996.

[Tor] A. Torger. BruteFIR – an open-source general-purpose audio convolver. http://www.ludd.luth.se/~torger/brutefir.html.

[Tor98] E. Torick. Highlights in the history of multichannel sound. Journal of the Audio Engineering Society, 46(1/2):27–31, Jan./Feb. 1998.

[TR03] L. Trautmann and R. Rabenstein. Digital Sound Synthesis by Physical Modeling using the Functional Transformation Method. Kluwer Academic/Plenum Publishers, New York, 2003.

[Tre02] H.L. Van Trees. Optimum Array Processing. Wiley-Interscience, 2002.

[TW02] F. Talantzis and D.B. Ward. Multi-channel equalization in an acoustic reverberant environment: Establishment of robustness measures. In Institute of Acoustics Spring Conference, Salford, UK, March 2002.

[TW03] F. Talantzis and D.B. Ward. Robustness of multichannel equalization in an acoustic reverberant environment. Journal of the Acoustical Society of America, 114(2):833–841, Aug. 2003.

[TWR03] G. Theile, H. Wittek, and M. Reisinger. Potential wavefield synthesis applications in the multichannel stereophonic world. In AES 24th International Conference on Multichannel Audio, Banff, Canada, June 2003. Audio Engineering Society (AES).

[Ver97] E.N.G. Verheijen. Sound Reproduction by Wave Field Synthesis. PhD thesis, Delft University of Technology, 1997.

[Vog93] P. Vogel. Application of Wave Field Synthesis in Room Acoustics. PhD thesis, Delft University of Technology, 1993.

[Vol96] E.J. Völker. Home cinema surround sound – Acoustics and neighbourhood. In 100th AES Convention, Copenhagen, Denmark, May 1996. Audio Engineering Society (AES).

[Vol98] E.J. Völker. To nearfield monitoring of multichannel reproduction – is the acoustics of the living room sufficient? In Proc. of the Tonmeistertagung, Hannover, Germany, 1998.

[VTB02] E.J. Völker, W. Teuber, and A. Bob. 5.1 in the living room – on acoustics of multichannel reproduction. In Proc. of the Tonmeistertagung, Hannover, Germany, 2002.

[WA01] D.B. Ward and T.D. Abhayapala. Reproduction of a plane-wave sound field using an array of loudspeakers. IEEE Transactions on Speech and Audio Processing, 9(6):697–707, Sept. 2001.

[Wei03] E.W. Weisstein. CRC Concise Encyclopedia of Mathematics. Chapman & Hall/CRC, 2003. http://mathworld.wolfram.com.

[Wik05a] Wikipedia. Alexander Graham Bell – Wikipedia, the free encyclopedia, 2005. [Online; accessed 31-August-2005].

[Wik05b] Wikipedia. Huygens' principle – Wikipedia, the free encyclopedia, 2005. [Online; accessed 22-September-2005].

[Wik05c] Wikipedia. Johann Philipp Reis – Wikipedia, the free encyclopedia, 2005. [Online; accessed 27-September-2005].

[Wil99] E.G. Williams. Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography. Academic Press, 1999.

[Wit05] H. Wittek. Räumliche Wahrnehmung von Wellenfeldsynthese – Der Einfluss von Alias-Effekten auf die Klangfarbe. In 31. Deutsche Jahrestagung für Akustik, Munich, Germany, 2005.

[WKRT04] H. Wittek, S. Kerber, F. Rumsey, and G. Theile. Spatial perception in wave field synthesis rendered sound fields: Distance of real and virtual nearby sources. In 116th AES Convention, Berlin, Germany, 2004. Audio Engineering Society (AES).

[WVY00] Y. Wang, M. Vilermo, and L. Yaroslavsky. Energy compaction property of the MDCT in comparison with other transforms. In 109th AES Convention, Los Angeles, USA, Sept. 2000. Audio Engineering Society (AES).

[WW+04] A. Wagner, A. Walther, F. Melchior, and M. Strauß. Generation of highly immersive atmospheres for wave field synthesis reproduction. In 116th AES Convention, Berlin, Germany, 2004. Audio Engineering Society (AES).

[YTF03] S. Yon, M. Tanter, and M. Fink. Sound focusing in rooms. II. The spatio-temporal inverse filter. Journal of the Acoustical Society of America, 114(6):3044–3052, Dec. 2003.

[Zio95] L.J. Ziomek. Fundamentals of Acoustic Field Theory and Space-Time Signal Processing. CRC Press, 1995.

Curriculum Vitae

Name: Sascha Michael Spors
Birth: 1st of July, 1972 in St. Ingbert, Germany
Nationality: German

School Education
1977: Buglawton County Primary School, Buxton, Congleton, UK.
September 1978 – July 1980: Grundschule Tennenlohe, Erlangen, Germany.
September 1980 – July 1982: Grundschule Soldnerstrasse, Fürth, Germany.
September 1982 – June 1992: Helene-Lange Gymnasium, Fürth, Germany.
June 1992: Abitur.

Alternative Service
July 1992 – October 1993: Nuremberg, Germany.

University Education
October 1994 – October 2000: Student of electrical engineering at the University of Erlangen-Nuremberg, Germany.
October 2000: Reception of the Dipl.-Ing. degree.

Professional Life
January 2001 – October 2005: Scientific assistant, Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg.
Since November 2005: Senior Scientist, Deutsche Telekom Laboratories, Deutsche Telekom AG, Technical University of Berlin.