Axel Schuler
January 3, 2007
Contents

1 Real and Complex Numbers
   Basics
      Notations
      Sums and Products
      Mathematical Induction
      Binomial Coefficients
   1.1 Real Numbers
      1.1.1 Ordered Sets
      1.1.2 Fields
      1.1.3 Ordered Fields
      1.1.4 Embedding of Natural Numbers into the Real Numbers
      1.1.5 The Completeness of R
      1.1.6 The Absolute Value
      1.1.7 Supremum and Infimum Revisited
      1.1.8 Powers of Real Numbers
      1.1.9 Logarithms
   1.2 Complex Numbers
      1.2.1 The Complex Plane and the Polar Form
      1.2.2 Roots of Complex Numbers
   1.3 Inequalities
      1.3.1 Monotony of the Power and Exponential Functions
      1.3.2 The Arithmetic-Geometric Mean Inequality
      1.3.3 The Cauchy-Schwarz Inequality
   1.4 Appendix A

2 Sequences and Series
   2.1 Convergent Sequences
      2.1.1 Algebraic Operations with Sequences
      2.1.2 Some Special Sequences
      2.1.3 Monotonic Sequences
      2.1.4 Subsequences
   2.2 Cauchy Sequences
   2.3 Series

5 Integration
   5.1 The Riemann-Stieltjes Integral
      5.1.1 Properties of the Integral

10 Surface Integrals
   10.1 Surfaces in R^3
      10.1.1 The Area of a Surface
   10.2 Scalar Surface Integrals
      10.2.1 Other Forms for dS
      10.2.2 Physical Application
   10.3 Surface Integrals
      10.3.1 Orientation
   10.4 Gauß' Divergence Theorem
   10.5 Stokes' Theorem
      10.5.1 Green's Theorem
      10.5.2 Stokes' Theorem
      10.5.3 Vector Potential and the Inverse Problem of Vector Analysis

11 Differential Forms on R^n
   11.1 The Exterior Algebra Λ(R^n)
      11.1.1 The Dual Vector Space V*
      11.1.2 The Pull-Back of k-Forms
      11.1.3 Orientation of R^n

16.2.3 Convergence and Limits of Distributions
   16.2.4 The Distribution P(1/x)
   16.2.5 Operations with Distributions
   16.3 Tensor Product and Convolution Product
      16.3.1 The Support of a Distribution
      16.3.2 Tensor Products
      16.3.3 Convolution Product
      16.3.4 Linear Change of Variables
      16.3.5 Fundamental Solutions
   16.4 Fourier Transformation in S(R^n) and S'(R^n)
      16.4.1 The Space S(R^n)
      16.4.2 The Space S'(R^n)
      16.4.3 Fourier Transformation in S'(R^n)
   16.5 Appendix: More about Convolutions
Chapter 1
Real and Complex Numbers
Basics

Notations

R                    real numbers
C                    complex numbers
Q                    rational numbers
N = {1, 2, ...}      positive integers (natural numbers)
Z                    integers
:=                   defining equation
⟹                   implication: if ..., then ...
⟺                   if and only if (equivalence)
∀                    for all
∃                    there exists
[a, b]               closed interval
(a, b)               open interval
[a, b)               half-open interval
(a, b]               half-open interval
[a, ∞)               closed half-line
(a, ∞)               open half-line
(−∞, a]              closed half-line
(−∞, a)              open half-line
Sums and Products

For integers m ≤ n and given numbers a_m, a_{m+1}, ..., a_n one defines

∑_{k=m}^{n} a_k := a_m + a_{m+1} + ··· + a_n,     ∏_{k=m}^{n} a_k := a_m · a_{m+1} ··· a_n.

In case m = n the sum and the product consist of one summand and one factor only, respectively. In case n < m it is customary to set

∑_{k=m}^{n} a_k := 0   (empty sum),     ∏_{k=m}^{n} a_k := 1   (empty product).

For m ≤ n < p we have

∑_{k=m}^{n} a_k + ∑_{k=n+1}^{p} a_k = ∑_{k=m}^{p} a_k,     ∑_{k=m}^{n} a_k = ∑_{k=m+d}^{n+d} a_{k−d}   (index shift).

We have for a ∈ R,

∑_{k=m}^{n} a = (n − m + 1) a.

Mathematical Induction

(a) For every nonnegative integer n we have

∑_{k=1}^{n} (2k − 1) = n^2.
Proof. We use induction over n. In case n = 0 we have an empty sum on the left-hand side (lhs) and 0^2 = 0 on the right-hand side (rhs). Hence, the statement is true for n = 0.
Suppose it is true for some fixed n. We shall prove it for n + 1. By the definition of the sum and by the induction hypothesis, ∑_{k=1}^{n} (2k − 1) = n^2, we have

∑_{k=1}^{n+1} (2k − 1) = ∑_{k=1}^{n} (2k − 1) + 2(n + 1) − 1 = n^2 + 2n + 1 = (n + 1)^2.
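The identity just proved is easy to check numerically; the following short Python snippet (an illustration, not part of the notes) compares the sum of the first n odd numbers with n^2:

```python
# Check sum_{k=1}^{n} (2k - 1) = n^2 for many n (n = 0 gives the empty sum).
def sum_of_odds(n):
    return sum(2 * k - 1 for k in range(1, n + 1))

for n in range(0, 100):
    assert sum_of_odds(n) == n ** 2
```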
(b) For all positive integers n ≥ 8 we have 2^n > 3n^2.
Proof. In case n = 8 we have

2^n = 2^8 = 256 > 192 = 3 · 64 = 3 · 8^2 = 3n^2;

and the statement is true in this case.
Suppose it is true for some fixed n ≥ 8, i.e. 2^n > 3n^2 (induction hypothesis). We will show that the statement is true for n + 1, i.e. 2^{n+1} > 3(n + 1)^2 (induction assertion). Note that n ≥ 8 implies

n − 1 ≥ 7 > 2  ⟹  (n − 1)^2 > 4 > 2  ⟹  n^2 − 2n − 1 > 0  ⟹  3n^2 − 6n − 3 > 0
⟹  6n^2 > 3n^2 + 6n + 3 = 3(n + 1)^2.     (1.1)

By the induction assumption, 2^{n+1} = 2 · 2^n > 2 · 3n^2 = 6n^2. This together with (1.1) yields 2^{n+1} > 3(n + 1)^2. Thus, we have shown the induction assertion. Hence the statement is true for all positive integers n ≥ 8.
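A quick numeric check of the claim and of the threshold n = 8 (illustrative only):

```python
# 2^n > 3 n^2 holds for all n >= 8 but fails at n = 7 (128 < 147).
def claim(n):
    return 2 ** n > 3 * n ** 2

assert all(claim(n) for n in range(8, 200))
assert not claim(7)
```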
Binomial Coefficients

For a positive integer n ∈ N we set

n! := ∏_{k=1}^{n} k     (read: n factorial),     0! = 1! = 1.

For integers n, k ≥ 0 put

\binom{n}{k} := n(n − 1) ··· (n − k + 1) / k!.

The numbers \binom{n}{k} (read: n choose k) are called binomial coefficients since they appear in the binomial theorem, see Proposition 1.4 below. It just follows from the definition that

\binom{n}{k} = 0   for k > n,     \binom{n}{k} = n! / (k! (n − k)!) = \binom{n}{n−k}   for 0 ≤ k ≤ n.
Lemma 1.2 For 0 ≤ k ≤ n we have

\binom{n}{k+1} + \binom{n}{k} = \binom{n+1}{k+1}.
We say that X is an n-set if X has exactly n elements. We write Card X = n (from cardinality) to denote the number of elements in X.
Lemma 1.3 The number of k-subsets of an n-set is \binom{n}{k}.
The lemma in particular shows that \binom{n}{k} is always an integer (which is not obvious from its definition).
Proof. We denote the number of k-subsets of an n-set X_n by C_k^n. It is clear that C_0^n = C_n^n = 1 since ∅ is the only 0-subset of X_n and X_n itself is the only n-subset of X_n. We use induction over n. The case n = 1 is obvious since C_0^1 = C_1^1 = \binom{1}{0} = \binom{1}{1} = 1. Suppose that the claim is true for some fixed n. We will show the statement for the (n + 1)-set X = {1, ..., n + 1} and all k with 1 ≤ k ≤ n. The family of (k + 1)-subsets of X splits into two disjoint classes. In the first class A_1 every subset contains n + 1; in the second class A_2, not. To form a subset in A_1 one has to choose another k elements out of {1, ..., n}. By the induction assumption this number is Card A_1 = C_k^n = \binom{n}{k}. To form a subset in A_2 one has to choose k + 1 elements out of {1, ..., n}. By the induction assumption this number is Card A_2 = C_{k+1}^n = \binom{n}{k+1}. By Lemma 1.2 we obtain

C_{k+1}^{n+1} = Card A_1 + Card A_2 = \binom{n}{k} + \binom{n}{k+1} = \binom{n+1}{k+1},

which proves the induction assertion.
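Lemma 1.3 and Pascal's rule (Lemma 1.2) can be verified by brute force for small n; the snippet below (illustrative only) counts k-subsets directly with `itertools.combinations` and compares against `math.comb`:

```python
from itertools import combinations
from math import comb

# Count the k-subsets of {0, ..., n-1} directly and compare with binom(n, k);
# also verify Pascal's rule: binom(n, k) + binom(n, k+1) = binom(n+1, k+1).
def num_k_subsets(n, k):
    return sum(1 for _ in combinations(range(n), k))

for n in range(8):
    for k in range(n + 1):
        assert num_k_subsets(n, k) == comb(n, k)
        assert comb(n, k) + comb(n, k + 1) == comb(n + 1, k + 1)
```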
Proof. We give a direct proof. Using the distributive law we find that each of the 2^n summands of the product (x + y)^n has the form x^{n−k} y^k for some k = 0, ..., n. We number the n factors as (x + y)^n = f_1 f_2 ··· f_n, f_1 = f_2 = ··· = f_n = x + y. Let us count how often the summand x^{n−k} y^k appears. We have to choose k factors y out of the n factors f_1, ..., f_n. The remaining n − k factors must be x. This gives a one-to-one correspondence between the k-subsets of {f_1, ..., f_n} and the different summands of the form x^{n−k} y^k. Hence, by Lemma 1.3 their number is C_k^n = \binom{n}{k}. This proves the proposition.
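The counting argument can be replayed mechanically: every summand of (x + y)^n corresponds to a choice of y from some of the n factors, so grouping the 2^n choices by how many y's they contain must reproduce the binomial coefficients. A small Python sketch (the value n = 6 is an arbitrary choice for illustration):

```python
from itertools import product
from collections import Counter
from math import comb

# Expand (x + y)^n over n factors: choice[i] = 1 means factor i contributes y,
# so sum(choice) = k and the corresponding summand is x^(n-k) y^k.
n = 6
counts = Counter(sum(choice) for choice in product((0, 1), repeat=n))
for k in range(n + 1):
    assert counts[k] == comb(n, k)
```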
tends to √2. But unless the irrational number √2 has been clearly defined, the question must arise: What is it that this sequence tends to?
This sort of question can be answered as soon as the so-called real number system is constructed.
Example 1.2 As shown in the exercise class, there is no rational number x with x^2 = 2. Set

A = {x ∈ Q_+ | x^2 < 2}   and   B = {x ∈ Q_+ | x^2 > 2}.

Then A ∪ B = Q_+ and A ∩ B = ∅. One can show that in the rational number system, A has no largest element and B has no smallest element; for details see Appendix A or Rudin's book [Rud76, Example 1.1, page 2]. This example shows that the system of rational numbers has certain gaps, in spite of the fact that between any two rationals there is another: if r < s then r < (r + s)/2 < s. The real number system fills these gaps. This is the principal reason for the fundamental role which it plays in analysis.
We start with a brief discussion of the general concepts of ordered set and field.

1.1.1 Ordered Sets

Definition 1.1 Let S be a set. An order on S is a relation < with the following two properties: for x, y ∈ S exactly one of the statements

x < y,   x = y,   y < x

is true (trichotomy); and x < y together with y < z implies x < z (transitivity).
Indeed, if a ∈ A and b ∈ B then a^2 < 2 < b^2. Taking the square root we have a < b. Since B contains no smallest member, A has no supremum in Q_+.
Similarly, B is bounded below by any element of A. Since A has no largest member, B has no infimum in Q.
Remarks 1.1 (a) It is clear from (ii) and the trichotomy of < that there is at most one such α. Indeed, suppose β also satisfies (i) and (ii); by (ii) we have α ≤ β and β ≤ α; hence α = β.
(b) If sup E exists and belongs to E, we call it the maximum of E and denote it by max E. Hence, max E = sup E and max E ∈ E. Similarly, if the infimum of E exists and belongs to E we call it the minimum and denote it by min E; min E = inf E, min E ∈ E.

[Figure: bounded subsets of Q; the interval [0, 1] has maximum 1, while [0, 1) and the set A above have no maximum.]

(c) Suppose that α is an upper bound of E and α ∈ E; then α = max E, that is, property (ii) in Definition 1.2 is automatically satisfied. Similarly, if α ∈ E is a lower bound, then α = min E.
(d) If E is a finite set, it always has a maximum and a minimum.
1.1.2 Fields
Definition 1.3 A field is a set F with two operations, called addition and multiplication, which satisfy the following so-called field axioms (A), (M), and (D):

(A) Axioms for addition
(A1) If x ∈ F and y ∈ F, then their sum x + y is in F.
(A2) Addition is commutative: x + y = y + x for all x, y ∈ F.
(A3) Addition is associative: (x + y) + z = x + (y + z) for all x, y, z ∈ F.
(A4) F contains an element 0 such that 0 + x = x for every x ∈ F.
(A5) To every x ∈ F corresponds an element −x ∈ F such that x + (−x) = 0.

(M) Axioms for multiplication
(M1) If x ∈ F and y ∈ F, then their product xy is in F.
(M2) Multiplication is commutative: xy = yx for all x, y ∈ F.
(M3) Multiplication is associative: (xy)z = x(yz) for all x, y, z ∈ F.
(M4) F contains an element 1 ≠ 0 such that 1x = x for every x ∈ F.
(M5) If x ∈ F and x ≠ 0, then there exists an element 1/x ∈ F such that x · (1/x) = 1.

(D) The distributive law

x(y + z) = xy + xz

holds for all x, y, z ∈ F.
Remarks 1.2 (a) One usually writes

x − y,   x/y,   x + y + z,   xyz,   x^2,   x^3,   2x, ...

in place of

x + (−y),   x · (1/y),   (x + y) + z,   (xy)z,   x · x,   x · x · x,   x + x, ...

(b) The field axioms clearly hold in Q if addition and multiplication have their customary meaning. Thus Q is a field. The integers Z do not form a field since 2 ∈ Z has no multiplicative inverse (axiom (M5) is not fulfilled).
(c) The smallest field is F_2 = {0, 1}, consisting of the neutral element 0 for addition and the neutral element 1 for multiplication. Addition and multiplication are defined as follows:

+ | 0 1        · | 0 1
--+-----       --+-----
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1

It is easy to check the field axioms (A), (M), and (D) directly.
(d) (A1) to (A5) and (M1) to (M5) mean that both (F, +) and (F \ {0}, ·) are commutative (or abelian) groups, respectively.
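The tables for F_2 coincide with the bitwise XOR and AND of 0 and 1, which makes the axioms easy to test exhaustively; a small sketch (illustrative, not part of the notes):

```python
# F2 = {0, 1}: addition is XOR, multiplication is AND.
def add(x, y):
    return x ^ y

def mul(x, y):
    return x & y

F2 = (0, 1)
for x in F2:
    assert add(x, 0) == x and mul(x, 1) == x   # neutral elements (A4), (M4)
    assert add(x, x) == 0                      # each element is its own inverse (A5)
    for y in F2:
        assert add(x, y) == add(y, x) and mul(x, y) == mul(y, x)   # (A2), (M2)
        for z in F2:
            assert mul(x, add(y, z)) == add(mul(x, y), mul(x, z))  # (D)
```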
Proposition 1.5 The axioms of addition imply the following statements.
(a) If x + y = x + z then y = z (cancellation law).
(b) If x + y = x then y = 0 (the element 0 is unique).
(c) If x + y = 0 then y = −x (the inverse −x is unique).
(d) −(−x) = x.
Proof. If x + y = x + z, the axioms (A) give

y = 0 + y = (−x + x) + y = −x + (x + y) = −x + (x + z) = (−x + x) + z = 0 + z = z,

using (A4), (A5), (A3), the assumption, (A3), (A5), and (A4) in turn.
This proves (a). Take z = 0 in (a) to obtain (b). Take z = −x in (a) to obtain (c). Since −x + x = 0, (c) with −x in place of x and x in place of y gives (d).
Proposition 1.6 The axioms for multiplication imply the following statements.
(a) If x ≠ 0 and xy = xz then y = z (cancellation law).
(b) If x ≠ 0 and xy = x then y = 1 (the element 1 is unique).
(c) If x ≠ 0 and xy = 1 then y = 1/x (the inverse 1/x is unique).
(d) If x ≠ 0 then 1/(1/x) = x.
The proof is so similar to that of Proposition 1.5 that we omit it.
Proposition 1.7 The field axioms imply the following statements, for any x, y, z ∈ F.
(a) 0x = 0.
(b) If xy = 0 then x = 0 or y = 0.
(c) (−x)y = −(xy) = x(−y).
(d) (−x)(−y) = xy.
Proof. 0x + 0x = (0 + 0)x = 0x. Hence 1.5 (b) implies that 0x = 0, and (a) holds.
Suppose, to the contrary, that both x ≠ 0 and y ≠ 0 but xy = 0; then (a) gives

1 = (1/y)(1/x) · xy = (1/y)(1/x) · 0 = 0,

a contradiction. Thus (b) holds.
The first equality in (c) comes from

(−x)y + xy = (−x + x)y = 0y = 0,

combined with 1.5 (c); the other half of (c) is proved in the same way. Finally,

(−x)(−y) = −[x(−y)] = −[−(xy)] = xy

by (c) and 1.5 (d).
Proposition 1.9 The following statements are true in every ordered field.
(a) If x > 0 then −x < 0, and if x < 0 then −x > 0.
(b) If x > 0 and y < z then xy < xz.
(c) If x < 0 and y < z then xy > xz.
(d) If x ≠ 0 then x^2 > 0. In particular, 1 > 0.
(e) If 0 < x < y then 0 < 1/y < 1/x.
Proof. (a) If x > 0 then 0 = −x + x > −x + 0 = −x, so that −x < 0. If x < 0 then 0 = −x + x < −x + 0 = −x, so that −x > 0. This proves (a).
(b) Since z > y, we have z − y > 0, hence x(z − y) > 0 by axiom (O2), and therefore

xz = x(z − y) + xy > 0 + xy = xy     (by Proposition 1.8).

Remarks 1.3 (a) The finite field F_2 = {0, 1}, see Remarks 1.2, is not an ordered field since 1 + 1 = 0, which contradicts 1 > 0.
(b) The field of complex numbers C (see below) is not an ordered field since i^2 = −1 contradicts Proposition 1.9 (a), (d).
For n ∈ N define n_F := 1_F + 1_F + ··· + 1_F (n times).

Lemma 1.10 In an ordered field F we have n_F > 0_F for all n ∈ N.
Proof. We use induction over n. By Proposition 1.9 (d) the statement is true for n = 1. Suppose it is true for a fixed n, i.e. n_F > 0_F. Moreover 1_F > 0_F. Using axiom (O2) we obtain (n + 1)_F = n_F + 1_F > 0_F.

From Lemma 1.10 it follows that m ≠ n implies n_F ≠ m_F. Indeed, let n be greater than m, say n = m + k for some k ∈ N; then n_F = m_F + k_F. Since k_F > 0 it follows from 1.8 (a) that n_F > m_F. In particular, n_F ≠ m_F. Hence, the mapping

N → F,   n ↦ n_F

is a one-to-one correspondence (injective). In this way the positive integers are embedded into the real numbers. Addition and multiplication of natural numbers and of their embeddings are the same:

n_F + m_F = (n + m)_F,     n_F · m_F = (nm)_F.
From now on we identify a natural number with the associated real number. We write n for n_F.
Definition 1.5 (The Archimedean Axiom) An ordered field F is called Archimedean if for all x, y ∈ F with x > 0 and y > 0 there exists n ∈ N such that nx > y.
An equivalent formulation is: the subset N ⊆ F of positive integers is not bounded above. Indeed, choose x = 1 in the above definition; then for any y ∈ F there is an n ∈ N such that n > y; hence N is not bounded above. Conversely, suppose N is not bounded above and x > 0, y > 0 are given. Then y/x is not an upper bound for N, that is, there is some n ∈ N with n > y/x, i.e. nx > y.
Using the axioms so far we are not yet able to prove the existence of irrational numbers. We
need the completeness axiom.
Definition 1.6 (Order Completeness) An ordered set S is said to be order complete if every non-empty subset E ⊆ S which is bounded above has a supremum sup E in S.

(C) Completeness Axiom
The real numbers are order complete, i.e. every non-empty subset E ⊆ R which is bounded above has a supremum.

The set Q of rational numbers is not order complete since, for example, the bounded set A = {x ∈ Q_+ | x^2 < 2} from Example 1.2 has no supremum in Q.
Remarks 1.4 (a) If x, y ∈ Q with x < y, then there exists z ∈ R \ Q with x < z < y; choose z = x + (y − x)/√2.
(b) (Exercise class) We shall show that inf {1/n | n ∈ N} = 0. Since n > 0 for all n ∈ N, 1/n > 0 by Proposition 1.9 (e), and 0 is a lower bound. Suppose ε > 0. Since R is Archimedean, we find m ∈ N such that 1 < mε or, equivalently, 1/m < ε. Hence, ε is not a lower bound for E, which proves the claim.
(c) Axiom (C) is equivalent to the Archimedean property together with the topological completeness (every Cauchy sequence in R is convergent, see Proposition 2.18).
(d) Axiom (C) is equivalent to the axiom of nested intervals, see Proposition 2.11 below:

Let I_n := [a_n, b_n] be a sequence of nested closed intervals (I_1 ⊇ I_2 ⊇ I_3 ⊇ ···) such that for all ε > 0 there exists n_0 with 0 ≤ b_n − a_n < ε for all n ≥ n_0. Then there exists a unique real number a ∈ R which is a member of all the intervals, i.e. {a} = ⋂_{n∈N} I_n.

1.1.6 The Absolute Value
For x ∈ R the absolute value is defined by

| x | := x   if x ≥ 0,     | x | := −x   if x < 0.

The absolute value has the following properties, for x, y, a ∈ R:
(a) | x | ≥ 0, and | x | = 0 if and only if x = 0; moreover | −x | = | x |.
(b) | x | = max{x, −x}; moreover | x | ≤ a if and only if (−a ≤ x ≤ a).
(c) | xy | = | x | | y |.
(d) | x + y | ≤ | x | + | y |   (triangle inequality).
(e) | x | − | y | ≤ | x + y |.
Proof. (a) By Proposition 1.9 (a), x < 0 implies | x | = −x > 0. Also, x > 0 implies | x | > 0. Putting both together we obtain: x ≠ 0 implies | x | > 0, and thus | x | = 0 implies x = 0. Moreover | 0 | = 0. This shows the first part.
The statement | −x | = | x | follows from (b) and −(−x) = x.
(b) Suppose first that x ≥ 0. Then x ≥ 0 ≥ −x and we have max{x, −x} = x = | x |. If x < 0 then −x > 0 > x and max{x, −x} = −x = | x |.
(c) follows from the definition by distinguishing the sign cases; for the quotient one uses multiplication with | 1/y |.
(d) By (b) we have x ≤ | x | and y ≤ | y |, as well as −x ≤ | x | and −y ≤ | y |. It follows from Proposition 1.8 (b) that ±(x + y) ≤ | x | + | y |. By the second part of (b) with a = | x | + | y |, we obtain | x + y | ≤ | x | + | y |.
(e) Inserting u := x + y and v := −y into | u + v | ≤ | u | + | v | one obtains

| x | ≤ | x + y | + | −y | = | x + y | + | y |.
1.1.7 Supremum and Infimum Revisited
If α = sup M, then for every n ∈ N there exists x ∈ M with α − 1/n < x; otherwise α − 1/n would be an upper bound of M smaller than α, contradicting α = sup M.
Proof. (a) For 0 < x < y we have

y^n − x^n = (y − x) ∑_{k=1}^{n} y^{n−k} x^{k−1} = c (y − x)     (1.2)

with c := ∑_{k=1}^{n} y^{n−k} x^{k−1} > 0 since x, y > 0. The claim follows.
(b) We have

y^n − x^n − n x^{n−1} (y − x) = (y − x) ∑_{k=1}^{n} (y^{n−k} x^{k−1} − x^{n−1})
                              = (y − x) ∑_{k=1}^{n} x^{k−1} (y^{n−k} − x^{n−k}) ≥ 0,

since by (a) y − x and y^{n−k} − x^{n−k} have the same sign. The proof of the second inequality is quite analogous.
Proposition 1.15 For every real x > 0 and every positive integer n ∈ N there is one and only one y > 0 such that y^n = x.
This number y is written ⁿ√x or x^{1/n}, and it is called the nth root of x.
Proof. The uniqueness is clear since by Lemma 1.14 (a), 0 < y_1 < y_2 implies y_1^n < y_2^n.
Set

E := {t ∈ R_+ | t^n < x}.

Observe that E ≠ ∅ since 0 ∈ E. We show that E is bounded above. By Bernoulli's inequality and since 0 < x < nx we have, for t ∈ E,

t^n < x < 1 + nx ≤ (1 + x)^n   ⟹   t < 1 + x     (by Lemma 1.14).

Hence, 1 + x is an upper bound for E. By the order completeness of R there exists y ∈ R such that y = sup E. We have to show that y^n = x. For this, we will show that each of the inequalities y^n > x and y^n < x leads to a contradiction.
Assume y^n < x and consider (y + h)^n with small h (0 < h < 1). Lemma 1.14 (b) implies

0 ≤ (y + h)^n − y^n ≤ n (y + h)^{n−1} (y + h − y) < h · n (y + 1)^{n−1}.

Choosing h small enough that h · n (y + 1)^{n−1} < x − y^n we may continue

(y + h)^n − y^n < x − y^n.

Consequently, (y + h)^n < x and therefore y + h ∈ E. Since y + h > y, this contradicts the fact that y is an upper bound of E.
Assume y^n > x and consider (y − h)^n with small h (0 < h < 1). Again by Lemma 1.14 (b) we have

0 ≤ y^n − (y − h)^n ≤ n y^{n−1} (y − y + h) < h · n y^{n−1}.

Choosing h small enough that h · n y^{n−1} < y^n − x we may continue

y^n − (y − h)^n < y^n − x.

Consequently, x < (y − h)^n and therefore t^n < x < (y − h)^n for all t ∈ E. Hence y − h is an upper bound for E smaller than y. This contradicts the fact that y is the least upper bound. Hence y^n = x, and the proof is complete.
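The existence proof suggests a numeric procedure: since t ↦ t^n is strictly increasing on [0, 1 + x] (Lemma 1.14 (a)) and 1 + x is an upper bound for E, the root can be approximated by bisection. A sketch, not part of the notes (the tolerance `eps` is an arbitrary choice):

```python
# Bisection on [0, 1 + x]: t**n is strictly increasing, and 1 + x bounds E.
def nth_root(x, n, eps=1e-12):
    lo, hi = 0.0, 1.0 + x
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if mid ** n < x:
            lo = mid          # mid lies in E, move the lower end up
        else:
            hi = mid          # mid is an upper bound for E
    return lo

assert abs(nth_root(2.0, 2) - 2.0 ** 0.5) < 1e-9
assert abs(nth_root(8.0, 3) - 2.0) < 1e-9
```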
Remarks 1.6 (a) If a and b are positive real numbers and n ∈ N then (ab)^{1/n} = a^{1/n} b^{1/n}.
Proof. Put α = a^{1/n} and β = b^{1/n}. Then ab = α^n β^n = (αβ)^n, since multiplication is commutative. The uniqueness assertion of Proposition 1.15 shows therefore that

(ab)^{1/n} = αβ = a^{1/n} b^{1/n}.

(b) Fix b > 0. If m, n, p, q ∈ Z with n > 0, q > 0, and r = m/n = p/q, then we have

(b^m)^{1/n} = (b^p)^{1/q}.     (1.3)

Hence it makes sense to define b^r := (b^m)^{1/n}.
(c) For x > 0 set

b^{−x} := 1 / b^x.     (1.4)

Without proof we give the familiar laws for powers and exponentials. Later we will redefine the power b^x with real exponent; then we are able to give easier proofs.
(d) If a, b > 0 and x, y ∈ R, then
(i) b^{x+y} = b^x b^y,  b^{x−y} = b^x / b^y;  (ii) b^{xy} = (b^x)^y;  (iii) (ab)^x = a^x b^x.
1.1.9 Logarithms
Fix b > 1 and y > 0. Similarly as in the preceding subsection, one can prove the existence of a unique real x such that b^x = y. This number x is called the logarithm of y to the base b, and we write x = log_b y. Knowing existence and uniqueness of the logarithm, it is not difficult to prove the following properties.
Lemma 1.16 For any a > 0, a ≠ 1 we have
(a) log_a (bc) = log_a b + log_a c if b, c > 0;
(b) log_a (b^c) = c log_a b if b > 0;
(c) log_a b = log_d b / log_d a if b, d > 0 and d ≠ 1.
Later we will give an alternative definition of the logarithm function.
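The three logarithm laws can be spot-checked numerically; the bases and arguments below are arbitrary choices for illustration:

```python
import math

# log_base(y) computed via natural logarithms (itself an instance of law (c)).
def log(base, y):
    return math.log(y) / math.log(base)

a, b, c, d = 2.0, 8.0, 32.0, 10.0
assert math.isclose(log(a, b * c), log(a, b) + log(a, c))   # (a)
assert math.isclose(log(a, b ** 3), 3 * log(a, b))          # (b)
assert math.isclose(log(a, b), log(d, b) / log(d, a))       # (c), change of base
```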
[Figure: right triangle with hypotenuse, opposite side, and adjacent side.]
Let α be any angle between 0° and 360°. Further let P be the point on the unit circle (with center (0, 0) and radius 1) such that the ray from P to the origin (0, 0) and the positive x-axis make an angle α. Then cos α and sin α are defined to be the x-coordinate and the y-coordinate of the point P, respectively.

[Figure: the point P on the unit circle in each of the four quadrants, with cos α and sin α as its coordinates.]

For an arbitrary angle α one sets cos α := cos(α + k · 360°) and sin α := sin(α + k · 360°), where k ∈ Z is chosen such that 0° ≤ α + k · 360° < 360°. Thinking of α as given in radians, cosine and sine are functions defined for all real α taking values in the closed interval [−1, 1].
If α ≠ π/2 + kπ, k ∈ Z, then cos α ≠ 0 and we define

tan α := sin α / cos α.

If α ≠ kπ, k ∈ Z, then sin α ≠ 0 and we define

cot α := cos α / sin α.

In this way we have defined cosine, sine, tangent, and cotangent for arbitrary angles.
(c) Special Values

x in degrees   30     45     60     90    120     135     150     180   270   360
x in radians   π/6    π/4    π/3    π/2   2π/3    3π/4    5π/6    π     3π/2  2π
sin x          1/2    √2/2   √3/2   1     √3/2    √2/2    1/2     0     −1    0
cos x          √3/2   √2/2   1/2    0     −1/2    −√2/2   −√3/2   −1    0     1
tan x          √3/3   1      √3     /     −√3     −1      −√3/3   0     /     0
Recall the addition formulas for cosine and sine and the trigonometric Pythagoras:

sin^2 x + cos^2 x = 1,     (1.5)
sin(x + y) = sin x cos y + cos x sin y,   cos(x + y) = cos x cos y − sin x sin y.     (1.6)
x_1 = 2 + √(−4)   and   x_2 = 2 − √(−4).

We will see that one can work with this notation.
Definition 1.7 A complex number is an ordered pair (a, b) of real numbers. Ordered means that (a, b) ≠ (b, a) if a ≠ b. Two complex numbers x = (a, b) and y = (c, d) are said to be equal if and only if a = c and b = d. We define

x + y := (a + c, b + d),     xy := (ac − bd, ad + bc).
Theorem 1.17 These definitions turn the set of all complex numbers into a field, with (0, 0) and (1, 0) in the role of 0 and 1.
Proof. We simply verify the field axioms as listed in Definition 1.3. Of course, we use the field structure of R.
Let x = (a, b), y = (c, d), and z = (e, f). (A1) is clear.
(A2) x + y = (a + c, b + d) = (c + a, d + b) = y + x.
(A3) (x + y) + z = (a + c, b + d) + (e, f) = (a + c + e, b + d + f) = (a, b) + (c + e, d + f) = x + (y + z).
(A4) x + 0 = (a, b) + (0, 0) = (a, b) = x.
(A5) Put −x := (−a, −b). Then x + (−x) = (a, b) + (−a, −b) = (0, 0) = 0.
(M1) is clear.
(M2) xy = (ac − bd, ad + bc) = (ca − db, da + cb) = yx.
(M3) (xy)z = (ac − bd, ad + bc)(e, f) = (ace − bde − adf − bcf, acf − bdf + ade + bce) = (a, b)(ce − df, cf + de) = x(yz).
(M4) x · 1 = (a, b)(1, 0) = (a, b) = x.
(M5) If x ≠ 0 then (a, b) ≠ (0, 0), which means that at least one of the real numbers a, b is different from 0. Hence a^2 + b^2 > 0 and we can define

1/x := ( a/(a^2 + b^2), −b/(a^2 + b^2) ).

Then

x · (1/x) = (a, b) ( a/(a^2 + b^2), −b/(a^2 + b^2) ) = (1, 0) = 1.

(D) x(y + z) = (a, b)(c + e, d + f) = (ac + ae − bd − bf, ad + af + bc + be) = (ac − bd, ad + bc) + (ae − bf, af + be) = xy + xz.
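Definition 1.7 translates directly into code; the sketch below (illustrative only) implements the two operations on pairs and checks i^2 = −1 and the inverse from axiom (M5), using exact fractions to avoid rounding:

```python
from fractions import Fraction as Fr

# Addition and multiplication of pairs (a, b), exactly as in Definition 1.7.
def c_add(x, y):
    (a, b), (c, d) = x, y
    return (a + c, b + d)

def c_mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

i = (0, 1)
assert c_mul(i, i) == (-1, 0)              # i^2 = -1 (Lemma 1.18)

x = (Fr(3), Fr(4))                         # a hypothetical nonzero element
inv = (Fr(3, 25), Fr(-4, 25))              # (a, -b) / (a^2 + b^2), axiom (M5)
assert c_mul(x, inv) == (1, 0)
```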
Remark 1.7 For any two real numbers a and b we have (a, 0) + (b, 0) = (a + b, 0) and
(a, 0)(b, 0) = (ab, 0). This shows that the complex numbers (a, 0) have the same arithmetic
properties as the corresponding real numbers a. We can therefore identify (a, 0) with a. This
gives us the real field as a subfield of the complex field.
Note that we have defined the complex numbers without any reference to the mysterious square root of −1. We now show that the notation (a, b) is equivalent to the more customary a + bi.
Definition 1.8 i := (0, 1).
Lemma 1.18 (a) i^2 = −1. (b) If a, b ∈ R then (a, b) = a + bi.
Proof. (a) i^2 = (0, 1)(0, 1) = (−1, 0) = −1.
(b) a + bi = (a, 0) + (b, 0)(0, 1) = (a, 0) + (0, b) = (a, b).
Definition 1.9 If a, b are real and z = a + bi, then the complex number z̄ := a − bi is called the conjugate of z. The numbers a and b are the real part and the imaginary part of z, respectively. We shall write a = Re z and b = Im z.
Proposition 1.19 If z and w are complex, then
(a) the conjugate of z + w is z̄ + w̄,
(b) the conjugate of zw is z̄ w̄,
(c) z + z̄ = 2 Re z,  z − z̄ = 2i Im z,
(d) z z̄ is positive real except when z = 0.
Proof. (a), (b), and (c) are quite trivial. To prove (d) write z = a + bi and note that z z̄ = a^2 + b^2.
Definition 1.10 If z is a complex number, its absolute value | z | is the (nonnegative) root of z z̄; that is, | z | := √(z z̄).
The existence (and uniqueness) of | z | follows from Proposition 1.19 (d). Note that when x is real, then x̄ = x, hence | x | = √(x^2). Thus | x | = x if x > 0 and | x | = −x if x < 0. We have recovered the definition of the absolute value for real numbers, see Subsection 1.1.6.
Proposition 1.20 Let z and w be complex numbers. Then
(a) | z | > 0 unless z = 0,
(b) | z̄ | = | z |,
(c) | zw | = | z | | w |,
(d) | Re z | ≤ | z |,
(e) | z + w | ≤ | z | + | w |.
Proof. (a) and (b) are trivial. Put z = a + bi and w = c + di, with a, b, c, d real. Then

| zw |^2 = (ac − bd)^2 + (ad + bc)^2 = (a^2 + b^2)(c^2 + d^2) = | z |^2 | w |^2,

or | zw |^2 = (| z | | w |)^2. Now (c) follows from the uniqueness assertion for roots.
To prove (d), note that a^2 ≤ a^2 + b^2, hence

| a | = √(a^2) ≤ √(a^2 + b^2) = | z |.

To prove (e), note that z̄ w is the conjugate of z w̄, so that z w̄ + z̄ w = 2 Re (z w̄). Hence

| z + w |^2 = (z + w)(z̄ + w̄) = z z̄ + z w̄ + z̄ w + w w̄
           = | z |^2 + 2 Re (z w̄) + | w |^2
           ≤ | z |^2 + 2 | z | | w | + | w |^2 = (| z | + | w |)^2.

Now (e) follows by taking square roots.
[Figure: z = a + bi in the complex plane, with r = | z | and the angle φ to the real axis.]

The angle φ satisfies

sin φ = b / | z |,     cos φ = a / | z |.

This gives, with r = | z |, a = r cos φ and b = r sin φ. Inserting these into the rectangular form of z yields

z = r (cos φ + i sin φ).     (1.7)
[Figure: the sum z + w and the difference z − w of two complex numbers z and w in the complex plane.]
(1.10)
Proof. (a) First let n > 0. We use induction over n to prove De Moivres formula. For n = 1
there is nothing to prove. Suppose (1.10) is true for some fixed n. We will show that the
assertion is true for n + 1. Using induction hypothesis and (1.8) we find
z n+1 = z n z = r n (cos(n)+i sin(n))r(cos +i sin ) = r n+1 (cos(n+)+i sin(n+)).
This proves the induction assertion.
(b) If n < 0, then $z^n = 1/z^{-n}$. Since $1 = 1(\cos 0 + i\sin 0)$, (1.9) and the result of (a) give
$$z^n = \frac{1}{z^{-n}} = \frac{1}{r^{-n}\big(\cos(-n\varphi) + i\sin(-n\varphi)\big)} = r^n(\cos n\varphi + i\sin n\varphi).$$
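De Moivre's formula (1.10) is easy to check numerically. The following sketch (illustrative only, not part of the original notes) compares the polar-form power with ordinary complex exponentiation:

```python
import cmath

def de_moivre(z, n):
    # z**n computed via the polar form: r^n (cos(n*phi) + i sin(n*phi))
    r, phi = cmath.polar(z)           # r = |z|, phi = arg z
    return cmath.rect(r**n, n * phi)  # back to rectangular form a + bi

z = 1 + 2j
# The polar-form power agrees with repeated multiplication, also for n < 0.
for n in (3, 7, -2):
    assert abs(de_moivre(z, n) - z**n) < 1e-9
```

The case n < 0 works because `cmath.polar` returns r > 0 and `r**n` handles negative integer exponents directly, mirroring part (b) of the proof.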
$$z^{15} = (2\sqrt3)^{15}\left(\cos\frac{15\pi}{3} + i\sin\frac{15\pi}{3}\right) = 2^{15}\,3^7\sqrt3\,(\cos 5\pi + i\sin 5\pi),$$
hence
$$z^{15} = -2^{15}\,3^7\sqrt3.$$
$$w_k = \sqrt[n]{r}\left(\cos\frac{\varphi + 2k\pi}{n} + i\sin\frac{\varphi + 2k\pi}{n}\right), \qquad k = 0, 1, \ldots, n-1,$$
are the n different nth roots of z.
Example 1.7 Compute the 4th roots of z = −1.
Since | z | = 1, every root has modulus $| w_k | = \sqrt[4]{1} = 1$; the argument of z is 180°, so
$$\varphi_0 = \frac{180°}{4} = 45°, \quad \varphi_1 = 45° + \frac{1\cdot360°}{4} = 135°, \quad \varphi_2 = 45° + \frac{2\cdot360°}{4} = 225°, \quad \varphi_3 = 45° + \frac{3\cdot360°}{4} = 315°.$$
We obtain
$$w_0 = \cos 45° + i\sin 45° = \tfrac12\sqrt2 + \tfrac12\sqrt2\,i, \qquad w_1 = \cos 135° + i\sin 135° = -\tfrac12\sqrt2 + \tfrac12\sqrt2\,i,$$
$$w_2 = \cos 225° + i\sin 225° = -\tfrac12\sqrt2 - \tfrac12\sqrt2\,i, \qquad w_3 = \cos 315° + i\sin 315° = \tfrac12\sqrt2 - \tfrac12\sqrt2\,i.$$
(Figure: the four roots w_0, w_1, w_2, w_3 on the unit circle; z = −1 lies on the negative real axis.)
Geometric interpretation of the nth roots. The nth roots of z ≠ 0 form a regular n-gon in the
complex plane with center 0. The vertices lie on a circle with center 0 and radius $\sqrt[n]{| z |}$.
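The root formula can be tested numerically. This sketch (illustrative, not part of the lecture) computes all nth roots of a complex number and verifies the two properties just stated:

```python
import cmath
import math

def nth_roots(z, n):
    # w_k = r^(1/n) * (cos((phi + 2k*pi)/n) + i sin((phi + 2k*pi)/n)), k = 0..n-1
    r, phi = cmath.polar(z)
    return [cmath.rect(r**(1.0/n), (phi + 2*math.pi*k)/n) for k in range(n)]

roots = nth_roots(-1, 4)
# Each 4th root of -1 satisfies w**4 = -1 ...
assert all(abs(w**4 - (-1)) < 1e-9 for w in roots)
# ... and all roots lie on the circle of radius |z|^(1/4) = 1.
assert all(abs(abs(w) - 1) < 1e-9 for w in roots)
```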
1.3 Inequalities
1.3.1 Monotony of the Power and Exponential Functions
Lemma 1.23 (a) For a, b > 0 and r ∈ Q we have
$$a < b \iff a^r < b^r \qquad\text{if } r > 0,$$
$$a < b \iff a^r > b^r \qquad\text{if } r < 0.$$
(b) For a > 0 and r, s ∈ Q we have
$$r < s \iff a^r < a^s \qquad\text{if } a > 1,$$
$$r < s \iff a^r > a^s \qquad\text{if } a < 1.$$
Proof. (a) Suppose that r > 0, r = m/n with integers m, n ∈ Z, n > 0. Using Lemma 1.14 (a)
twice we get
$$a < b \iff a^m < b^m \iff (a^m)^{\frac1n} < (b^m)^{\frac1n},$$
which proves the first claim. The second part, r < 0, can be obtained by setting −r in place of r
in the first part and using Proposition 1.9 (e).
(b) Suppose that s > r. Put x = s − r; then x ∈ Q and x > 0. By (a), 1 < a implies
$1 = 1^x < a^x$. Hence $1 < a^{s-r} = a^s/a^r$ (here we used Remark 1.6 (d)), and therefore $a^r < a^s$.
Interchanging the roles of r and s shows that s < r implies $a^s < a^r$, so that the converse direction
also holds.
The proof for a < 1 is similar.
$$\frac{x_1 + \cdots + x_n}{n} \ge \sqrt[n]{x_1\cdots x_n}. \qquad (1.11)$$
We have equality if and only if $x_1 = x_2 = \cdots = x_n$.
Proof. We use forward-backward induction over n. First we show (1.11) is true for all n which
are powers of 2. Then we prove that if (1.11) is true for some n + 1, then it is true for n. Hence,
it is true for all positive integers.
For the forward step, suppose (1.11) holds for some k; we show it for 2k. Given positive numbers $x_1, \ldots, x_k$ and $y_1, \ldots, y_k$,
$$\frac{1}{2k}\left(\sum_{i=1}^{k}x_i + \sum_{i=1}^{k}y_i\right) = \frac12\left(\frac1k\sum_{i=1}^{k}x_i + \frac1k\sum_{i=1}^{k}y_i\right) \ge \frac12\left(\Big(\prod_{i=1}^{k}x_i\Big)^{1/k} + \Big(\prod_{i=1}^{k}y_i\Big)^{1/k}\right) \ge \left(\prod_{i=1}^{k}x_i\,\prod_{i=1}^{k}y_i\right)^{\frac{1}{2k}},$$
where the first inequality is the induction hypothesis and the second is (1.11) for two numbers.
This completes the forward part. Assume now (1.11) is true for n + 1. We will show it for n.
Let $x_1, \ldots, x_n \in \mathbb{R}_+$ and set $A := \big(\sum_{i=1}^n x_i\big)/n$. By the induction assumption, applied to the n + 1 numbers $x_1, \ldots, x_n, A$, we have
$$A = \frac{1}{n+1}(x_1 + \cdots + x_n + A) = \frac{1}{n+1}(nA + A) \ge \left(\prod_{i=1}^{n} x_i \cdot A\right)^{\frac{1}{n+1}}.$$
Raising both sides to the power n + 1 and dividing by A gives
$$A^{n} \ge \prod_{i=1}^{n} x_i, \qquad\text{that is,}\qquad A \ge \left(\prod_{i=1}^{n} x_i\right)^{1/n}.$$
Finally, suppose equality holds in (1.11) but not all $x_i$ are equal, say $x_1 < x_2$. Replacing both $x_1$ and $x_2$ by their arithmetic mean $(x_1 + x_2)/2$ leaves the left-hand side unchanged but strictly increases the product, since $\left(\frac{x_1 + x_2}{2}\right)^2 > x_1 x_2$; the inequality would then be strict. This contradicts the choice $x_1 < x_2$. Hence, $x_1 = x_2 = \cdots = x_n$ is the only case where
equality holds. This completes the proof.
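The AM–GM inequality (1.11) can be spot-checked numerically. A small illustrative sketch (not part of the original notes):

```python
import math
import random

def am(xs):
    # arithmetic mean (x_1 + ... + x_n)/n
    return sum(xs) / len(xs)

def gm(xs):
    # geometric mean (x_1 * ... * x_n)^(1/n), computed via logs to avoid overflow
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

random.seed(0)
for _ in range(100):
    xs = [random.uniform(0.1, 10.0) for _ in range(5)]
    # AM >= GM for arbitrary positive numbers, cf. (1.11)
    assert am(xs) >= gm(xs) - 1e-12
# Equality holds exactly when all numbers coincide.
assert abs(am([3.0] * 4) - gm([3.0] * 4)) < 1e-12
```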
$$\left(\sum_{k=1}^n x_k y_k\right)^2 \le \sum_{k=1}^n x_k^2 \,\sum_{k=1}^n y_k^2. \qquad (1.12)$$
Equality holds if and only if there exists t ∈ R such that $y_k = t\,x_k$ for k = 1, …, n; that is, the
vector $y = (y_1, \ldots, y_n)$ is a scalar multiple of the vector $x = (x_1, \ldots, x_n)$.
Proof. Consider the quadratic function $f(t) = at^2 - 2bt + c$ where
$$a = \sum_{k=1}^n x_k^2, \qquad b = \sum_{k=1}^n x_k y_k, \qquad c = \sum_{k=1}^n y_k^2.$$
Then
$$f(t) = \sum_{k=1}^n x_k^2\, t^2 - \sum_{k=1}^n 2x_k y_k\, t + \sum_{k=1}^n y_k^2 = \sum_{k=1}^n \left(x_k^2 t^2 - 2x_k y_k t + y_k^2\right) = \sum_{k=1}^n (x_k t - y_k)^2 \ge 0.$$
Equality holds if and only if there is a t ∈ R with $y_k = t x_k$ for all k. Suppose now there is no
such t ∈ R. That is,
$$f(t) > 0 \quad\text{for all } t \in \mathbb{R}.$$
In other words, the polynomial $f(t) = at^2 - 2bt + c$ has no real zeros $t_{1,2} = \frac1a\big(b \pm \sqrt{b^2 - ac}\big)$.
That is, the discriminant $D = b^2 - ac$ is negative (only complex roots); hence $b^2 < ac$:
$$\left(\sum_{k=1}^n x_k y_k\right)^2 < \sum_{k=1}^n x_k^2 \,\sum_{k=1}^n y_k^2.$$
The analogous inequality for complex numbers reads
$$\left|\sum_{k=1}^n x_k \bar y_k\right|^2 \le \sum_{k=1}^n | x_k |^2 \,\sum_{k=1}^n | y_k |^2.$$
Equality holds if and only if there exists a λ ∈ C such that y = λx, where $y = (y_1, \ldots, y_n) \in \mathbb{C}^n$, $x = (x_1, \ldots, x_n) \in \mathbb{C}^n$.
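A quick numerical illustration of the Cauchy–Schwarz inequality (1.12) and of its equality case (this sketch is not part of the original notes):

```python
import random

def cauchy_schwarz_gap(x, y):
    # (sum x_k y_k)^2 <= (sum x_k^2)(sum y_k^2); returns RHS - LHS, which is >= 0
    lhs = sum(a * b for a, b in zip(x, y))**2
    rhs = sum(a * a for a in x) * sum(b * b for b in y)
    return rhs - lhs

random.seed(1)
x = [random.uniform(-5, 5) for _ in range(10)]
y = [random.uniform(-5, 5) for _ in range(10)]
assert cauchy_schwarz_gap(x, y) >= -1e-9                    # inequality (1.12)
assert abs(cauchy_schwarz_gap(x, [2*a for a in x])) < 1e-6  # equality for y = t x
```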
1.4 Appendix A
In this appendix we collect some additional facts which were not covered by the lecture.
We now show that the equation
$$x^2 = 2 \qquad (1.14)$$
has no rational solution. Suppose, to the contrary, that x = m/n with integers m and n which are not both even. Then (1.14) becomes
$$m^2 = 2n^2. \qquad (1.15)$$
This shows that m2 is even and hence m is even. Therefore m2 is divisible by 4. It follows that
the right hand side of (1.15) is divisible by 4, so that n2 is even, which implies that n is even.
But this contradicts our choice of m and n. Hence (1.14) is impossible for rational x.
We shall show that A contains no largest element and B contains no smallest. That is, for every
p ∈ A we can find a rational q ∈ A with p < q, and for every p ∈ B we can find a rational q ∈ B
such that q < p.
Suppose that p is in A. We associate with p > 0 the rational number
$$q = p + \frac{2 - p^2}{p + 2} = \frac{2p + 2}{p + 2}. \qquad (1.16)$$
Then
$$q^2 - 2 = \frac{4p^2 + 8p + 4 - 2p^2 - 8p - 8}{(p+2)^2} = \frac{2(p^2 - 2)}{(p + 2)^2}. \qquad (1.17)$$
If p is in A then 2 − p² > 0, (1.16) shows that q > p, and (1.17) shows that q² < 2. If p is in B
then 2 < p², (1.16) shows that q < p, and (1.17) shows that q² > 2.
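The map p ↦ q from (1.16) can be iterated: starting from a rational in A it produces larger and larger rationals whose squares stay below 2, approaching √2. An illustrative sketch using exact rational arithmetic (not part of the original notes):

```python
from fractions import Fraction

def step(p):
    # q = p + (2 - p^2)/(p + 2) = (2p + 2)/(p + 2), cf. (1.16)
    return p + (2 - p * p) / (p + 2)

p = Fraction(1)            # 1 is in A since 1^2 < 2
for _ in range(6):
    q = step(p)
    assert q > p           # by (1.16): the sequence increases ...
    assert q * q < 2       # ... and by (1.17) the squares stay below 2
    p = q

# After a few steps p^2 is already very close to 2 (but never reaches it).
assert 2 - p * p < Fraction(1, 10**4)
```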
A Non-Archimedean Ordered Field
The fields Q and R are Archimedean, see below. But there exist ordered fields without this
property. Let F := R(t) be the field of rational functions f(t) = p(t)/q(t) where p and q are
polynomials with real coefficients. Since p and q have only finitely many zeros, for large t,
f(t) is either positive or negative. In the first case we set f > 0. In this way R(t) becomes an
ordered field. But t > n for all n ∈ N, since the polynomial f(t) = t − n becomes positive for
large t (and fixed n). Hence the Archimedean property fails in R(t).
Our aim is to define $b^x$ for arbitrary real x.
Lemma 1.27 Let b, p be real numbers with b > 1 and p > 0. Set
$$M = \{b^r \mid r \in \mathbb{Q},\ r < p\}, \qquad \overline{M} = \{b^s \mid s \in \mathbb{Q},\ s > p\}.$$
Then
$$\sup M = \inf \overline{M}.$$
Proof. (a) M is bounded above by any $b^s$, s ∈ Q, with s > p, and $\overline{M}$ is bounded below by
any $b^r$, r ∈ Q, with r < p. Hence sup M and inf $\overline{M}$ both exist.
(b) Since r < p < s implies $b^r < b^s$ by Lemma 1.23, $\sup M \le b^s$ for all $b^s \in \overline{M}$. Taking the
infimum over all such $b^s$, $\sup M \le \inf \overline{M}$.
(c) Let $s = \sup M$ and ε > 0 be given. We want to show that $\inf \overline{M} < s + \varepsilon$. Choose n ∈ N
such that
$$\frac1n < \frac{\varepsilon}{s(b - 1)}. \qquad (1.18)$$
Further, choose rational numbers r and s′ with r < p < s′ and
$$s' - r < \frac{1}{n}. \qquad (1.19)$$
Using s′ − r < 1/n, Bernoulli's inequality (part 2), and (1.18), we compute
$$b^{s'} - b^r = b^r\big(b^{s'-r} - 1\big) \le s\big(b^{\frac1n} - 1\big) \le s\,\frac1n(b - 1) < \varepsilon.$$
Hence
$$\inf \overline{M} \le b^{s'} < b^r + \varepsilon \le \sup M + \varepsilon.$$
Since ε was arbitrary, $\inf \overline{M} \le \sup M$, and finally, with the result of (b), $\inf \overline{M} = \sup M$.
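Lemma 1.27 is what justifies defining $b^p := \sup M$ for irrational p. Numerically, squeezing p between rationals r < p < s shows how fast the gap $b^s - b^r$ closes. An illustrative sketch (the particular values of b and p are arbitrary examples):

```python
def b_pow_bounds(b, p, n):
    # rational approximations r = floor(p*n)/n <= p <= s = (floor(p*n)+1)/n
    m = int(p * n)
    r, s = m / n, (m + 1) / n
    return b**r, b**s          # b^r lies in M, b^s in the upper set

b, p = 3.0, 1.4142135623730951   # p ~ sqrt(2), an irrational exponent
lo, hi = b_pow_bounds(b, p, 10**6)
assert lo <= b**p <= hi          # b^p is squeezed between b^r and b^s
assert hi - lo < 1e-4            # the gap b^s - b^r shrinks as n grows, cf. (1.18)
```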
Inequalities
Now we extend Bernoulli's inequality to rational exponents.
Proposition 1.29 (Bernoulli's inequality) Let a ≥ −1 be real and r ∈ Q. Then
(a) $(1 + a)^r \ge 1 + ra$ if r ≥ 1,
(b) $(1 + a)^r \le 1 + ra$ if 0 ≤ r ≤ 1.
Equality holds if and only if a = 0 or r = 1.
Proof. (b) Let r = m/n with m ≤ n, m, n ∈ N. Apply (1.11) to $x_i := 1 + a$, i = 1, …, m and
$x_i := 1$ for i = m + 1, …, n. We obtain
$$\frac1n\big(m(1+a) + (n-m)\cdot 1\big) \ge \big((1+a)^m\, 1^{\,n-m}\big)^{\frac1n},$$
that is,
$$\frac mn\, a + 1 \ge (1+a)^{\frac mn},$$
which proves (b). Equality holds if n = 1 or if $x_1 = \cdots = x_n$, i.e. a = 0.
(a) Now let s ≥ 1, z ≥ −1. Setting r = 1/s and $a := (1+z)^{1/r} - 1 = (1+z)^s - 1$ we obtain r ≤ 1 and
a ≥ −1. Inserting this into (b) yields
$$(1 + a)^r = (1+z)^{s\cdot\frac1s} \le 1 + r\big((1+z)^s - 1\big),$$
that is,
$$z \le r\big((1+z)^s - 1\big) \quad\Longrightarrow\quad 1 + sz \le (1+z)^s.$$
This completes the proof of (a).
Proof. (a) First let a > 0. By Proposition 1.29 (a), $(1+a)^r \ge 1 + ra$ if r ∈ Q. Hence,
$$(1 + a)^x = \sup\{(1 + a)^r \mid r \in \mathbb{Q},\ r < x\} \ge \sup\{1 + ra \mid r \in \mathbb{Q},\ r < x\} = 1 + xa.$$
Now let −1 ≤ a < 0. Then r < x implies ra > xa, and Proposition 1.29 (a) implies
$$(1 + a)^r \ge 1 + ra > 1 + xa. \qquad (1.20)$$
In this case
$$(1+a)^x = \frac{1}{\sup\{(1/(a+1))^r \mid r \in \mathbb{Q},\ r < x\}}$$
HW 2.1 (Young's inequality) Let p, q > 1 with 1/p + 1/q = 1, and let a, b ≥ 0. Then
$$ab \le \frac1p\, a^p + \frac1q\, b^q. \qquad (1.21)$$
Proof. By Bernoulli's inequality (Proposition 1.29 (a)), $y^p \ge 1 + p(y - 1)$ for y ≥ 0, that is,
$$\frac1p(y^p - 1) + 1 \ge y \quad\Longleftrightarrow\quad \frac1p\, y^p + \frac1q \ge y.$$
If b = 0 the statement is always true. If b ≠ 0 insert $y := ab/b^q$ into the above inequality:
$$\frac1p\left(\frac{ab}{b^q}\right)^p + \frac1q \ge \frac{ab}{b^q} \qquad\Big|\ \cdot\, b^q$$
$$\frac1p\,\frac{a^p b^p}{b^{pq}}\, b^q + \frac1q\, b^q \ge ab$$
$$\frac1p\, a^p + \frac1q\, b^q \ge ab,$$
since $b^{p+q} = b^{pq}$. We have equality if y = 1 or p = 1. The latter is impossible by assumption.
y = 1 is equivalent to $b^q = ab$, or $b^{q-1} = a$, or $b^{(q-1)p} = a^p$ (b ≠ 0). If b = 0, equality holds if
and only if a = 0.
Proposition 1.32 (Hölder's inequality) Let p > 1, 1/p + 1/q = 1, and let $x_1, \ldots, x_n$ and
$y_1, \ldots, y_n$ be nonnegative real numbers. Then
$$\sum_{k=1}^n x_k y_k \le \left(\sum_{k=1}^n x_k^p\right)^{\frac1p} \left(\sum_{k=1}^n y_k^q\right)^{\frac1q}. \qquad (1.22)$$
We have equality if and only if there exists c ∈ R such that for all k = 1, …, n, $x_k^p/y_k^q = c$ (they
are proportional).
Proof. Set $A := \big(\sum_{k=1}^n x_k^p\big)^{\frac1p}$ and $B := \big(\sum_{k=1}^n y_k^q\big)^{\frac1q}$. The cases A = 0 and B = 0 are trivial. So
we assume A, B > 0. By Young's inequality we have
$$\frac{x_k}{A}\cdot\frac{y_k}{B} \le \frac1p\,\frac{x_k^p}{A^p} + \frac1q\,\frac{y_k^q}{B^q}.$$
Summing over k gives
$$\frac{1}{AB}\sum_{k=1}^n x_k y_k \le \frac{1}{pA^p}\sum_{k=1}^n x_k^p + \frac{1}{qB^q}\sum_{k=1}^n y_k^q = \frac1p + \frac1q = 1,$$
hence
$$\sum_{k=1}^n x_k y_k \le AB = \left(\sum_{k=1}^n x_k^p\right)^{\frac1p}\left(\sum_{k=1}^n y_k^q\right)^{\frac1q}.$$
Equality holds if and only if $x_k^p/A^p = y_k^q/B^q$ for all k = 1, …, n. Therefore, $x_k^p/y_k^q = \mathrm{const}$.
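Hölder's inequality (1.22) is easy to probe numerically for several exponents p. An illustrative sketch (not part of the original notes):

```python
import random

def holder_gap(x, y, p):
    # sum x_k y_k <= (sum x_k^p)^(1/p) * (sum y_k^q)^(1/q), where 1/p + 1/q = 1
    q = p / (p - 1)
    lhs = sum(a * b for a, b in zip(x, y))
    rhs = sum(a**p for a in x)**(1 / p) * sum(b**q for b in y)**(1 / q)
    return rhs - lhs  # nonnegative by (1.22)

random.seed(2)
x = [random.uniform(0, 3) for _ in range(8)]
y = [random.uniform(0, 3) for _ in range(8)]
for p in (1.5, 2.0, 3.0):
    assert holder_gap(x, y, p) >= -1e-9
```

For p = 2 this reduces to the Cauchy–Schwarz inequality of the previous subsection.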
Corollary 1.33 (Complex Hölder's inequality) Let p > 1, 1/p + 1/q = 1 and $x_k, y_k \in \mathbb{C}$,
k = 1, …, n. Then
$$\sum_{k=1}^n | x_k y_k | \le \left(\sum_{k=1}^n | x_k |^p\right)^{\frac1p} \left(\sum_{k=1}^n | y_k |^q\right)^{\frac1q}. \qquad (1.23)$$
Proposition 1.34 (Minkowski's inequality) Let p ≥ 1 and $x_k, y_k \ge 0$, k = 1, …, n. Then
$$\left(\sum_{k=1}^n (x_k + y_k)^p\right)^{\frac1p} \le \left(\sum_{k=1}^n x_k^p\right)^{\frac1p} + \left(\sum_{k=1}^n y_k^p\right)^{\frac1p}.$$
Proof. The case p = 1 is obvious. Let p > 1. As before let q > 0 be the unique positive number
with 1/p + 1/q = 1. We compute
$$\sum_{k=1}^n (x_k + y_k)^p = \sum_{k=1}^n (x_k + y_k)(x_k + y_k)^{p-1} = \sum_{k=1}^n x_k (x_k + y_k)^{p-1} + \sum_{k=1}^n y_k (x_k + y_k)^{p-1}$$
$$\overset{(1.22)}{\le} \left(\sum_k x_k^p\right)^{1/p}\left(\sum_k (x_k + y_k)^{(p-1)q}\right)^{1/q} + \left(\sum_k y_k^p\right)^{1/p}\left(\sum_k (x_k + y_k)^{(p-1)q}\right)^{1/q}.$$
Since (p − 1)q = p, the last factor in both terms equals $\big(\sum_k (x_k + y_k)^p\big)^{1/q}$.
We can assume that $\sum (x_k + y_k)^p > 0$. Using $1 - \frac1q = \frac1p$ and dividing the last
inequality by $\big(\sum (x_k + y_k)^p\big)^{1/q}$ we obtain the claim.
Equality holds if $x_k^p/(x_k + y_k)^{(p-1)q} = \mathrm{const.}$ and $y_k^p/(x_k + y_k)^{(p-1)q} = \mathrm{const.}$; that is,
$x_k/y_k = \mathrm{const.}$
The corresponding inequality for complex numbers reads
$$\left(\sum_{k=1}^n | x_k + y_k |^p\right)^{\frac1p} \le \left(\sum_{k=1}^n | x_k |^p\right)^{\frac1p} + \left(\sum_{k=1}^n | y_k |^p\right)^{\frac1p}. \qquad (1.24)$$
Chapter 2
Sequences and Series
This chapter will deal with one of the main notions of calculus, the limit of a sequence. Although we are concerned with real sequences, almost all notions make sense in arbitrary metric
spaces like Rn or Cn .
Given a ∈ R and ε > 0 we define the ε-neighborhood of a as
$$U_\varepsilon(a) := (a - \varepsilon,\, a + \varepsilon) = \{x \in \mathbb{R} \mid a - \varepsilon < x < a + \varepsilon\} = \{x \in \mathbb{R} \mid | x - a | < \varepsilon\}.$$
A sequence (xn) is said to converge to x ∈ R if for every ε > 0 there exists n0 ∈ N such that
n ≥ n0 implies $x_n \in U_\varepsilon(x)$. In this case we write $x = \lim_{n\to\infty} x_n$, or simply
$$x = \lim x_n \qquad\text{or}\qquad x_n \to x.$$
If there is no such x with the above property, the sequence (xn ) is said to be divergent.
In other words: (xn) converges to x if any neighborhood $U_\varepsilon(x)$, ε > 0, contains almost all
elements of the sequence (xn). "Almost all" means "all but finitely many." Sometimes we say
"for sufficiently large n," which means the same.
This is an equivalent formulation since $x_n \in U_\varepsilon(x)$ means $x - \varepsilon < x_n < x + \varepsilon$, hence $| x - x_n | < \varepsilon$.
The n0 in question need not be the smallest possible.
We write
$$\lim_{n\to\infty} x_n = +\infty \qquad (2.1)$$
if for all E > 0 there exists n0 ∈ N such that n ≥ n0 implies $x_n \ge E$. Similarly, we write
$$\lim_{n\to\infty} x_n = -\infty \qquad (2.2)$$
if for all E > 0 there exists n0 ∈ N such that n ≥ n0 implies $x_n \le -E$. In these cases we say
that +∞ and −∞ are improper limits of (xn). Note that in both cases (xn) is divergent.
Example 2.2 This is Example 2.1 continued.
(a) $\lim_{n\to\infty} \frac1n = 0$. Indeed, let ε > 0 be fixed. We are looking for some n0 with $\left|\frac1n - 0\right| < \varepsilon$
for all n ≥ n0. This is equivalent to 1/ε < n. Choose n0 > 1/ε (which is possible by the
Archimedean property). Then for all n ≥ n0 we have
$$n \ge n_0 > \frac1\varepsilon \implies \frac1n < \varepsilon \implies | x_n - 0 | < \varepsilon.$$
For example, $\lim_{n\to\infty}(2n - 1) = +\infty$: given E > 0, choose $n_0 > \frac E2 + 1$; then n ≥ n0 implies
$2n - 2 > E$, hence $x_n = 2n - 1 > 2n - 2 > E$.
This proves the claim. Similarly, one can show that lim n³ = +∞. But both $((-n)^n)$ and
(1, 2, 1, 3, 1, 4, 1, 5, …) have no improper limit. Indeed, the first one becomes arbitrarily large
for even n and arbitrarily small for odd n. The second one becomes large for even n but is
constant for odd n.
(e) $x_n = a^n$, (a ≥ 0).
$$\lim_{n\to\infty} a^n = \begin{cases} 1, & \text{if } a = 1,\\ 0, & \text{if } 0 \le a < 1.\end{cases}$$
(aⁿ) is divergent for a > 1. Moreover, lim aⁿ = +∞. To prove this let E > 0 be given. By
the Archimedean property of R and since a − 1 > 0 we find m ∈ N such that m(a − 1) > E.
Bernoulli's inequality gives
$$a^m \ge m(a - 1) + 1 > m(a - 1) > E.$$
For 0 < a < 1 write $a = \frac{1}{1 + b}$ with b > 0. By Bernoulli's inequality, $(1+b)^n \ge 1 + nb > nb$, hence
$$a^n = \frac{1}{(1+b)^n} < \frac{1}{nb}. \qquad (2.3)$$
Given ε > 0, choose $n_0 > \frac{1}{\varepsilon b}$. Then $\varepsilon > \frac{1}{n_0 b}$, and n ≥ n0 implies
$$| a^n - 0 | = a^n \overset{(2.3)}{<} \frac{1}{nb} \le \frac{1}{n_0 b} < \varepsilon.$$
Hence, $a^n \to 0$.
Proposition 2.1 The limit of a convergent sequence is uniquely determined.
Proof. Suppose that x = lim xn and y = lim xn and x ≠ y. Put ε := | x − y | /2 > 0. Then
there exists n1 ∈ N such that n ≥ n1 implies | x − xn | < ε, and
there exists n2 ∈ N such that n ≥ n2 implies | y − xn | < ε.
For n ≥ max{n1, n2} the triangle inequality gives $| x - y | \le | x - x_n | + | x_n - y | < 2\varepsilon = | x - y |$, a contradiction.
Definition. A sequence (xn) is said to be bounded if there exists C ≥ 0 such that | xn | ≤ C
for all n ∈ N.
Similarly, (xn) is said to be bounded above or bounded below if there exists C ∈ R such that
xn ≤ C or xn ≥ C, respectively, for all n ∈ N.
Proposition 2.2 If (xn ) is convergent, then (xn ) is bounded.
Proof. Let x = lim xn. To ε = 1 there exists n0 ∈ N such that | x − xn | < 1 for all n ≥ n0.
Then | xn | = | xn − x + x | ≤ | xn − x | + | x | < | x | + 1 for all n ≥ n0. Put
$$C := \max\{| x_1 |, \ldots, | x_{n_0 - 1} |, | x | + 1\}.$$
Then | xn | ≤ C for all n ∈ N.
The converse statement is not true; there are bounded sequences which are not convergent, see
Example 2.1 (b).
Ex Class: If (xn ) has an improper limit, then (xn ) is divergent.
Proof. Suppose to the contrary that (xn) is convergent; then it is bounded, say | xn | ≤ C for all
n. This contradicts xn ≥ E (respectively xn ≤ −E) for E = C and sufficiently large n. Hence,
(xn) has no improper limit.
Proposition 2.3 If (xn ) and (yn ) are convergent sequences and c R, then their sum, difference, product, quotient (provided yn 6= 0 and lim yn 6= 0), and their absolute values are also
convergent:
(a) lim(xn ± yn) = lim xn ± lim yn;
(b) lim(c xn) = c lim xn, lim(xn + c) = lim xn + c;
(c) lim(xn yn) = lim xn · lim yn;
(d) $\lim \dfrac{x_n}{y_n} = \dfrac{\lim x_n}{\lim y_n}$ if yn ≠ 0 for all n and lim yn ≠ 0;
(e) lim | xn | = | lim xn |.
Proof. Let xn x and yn y.
(a) Given > 0 then there exist integers n1 and n2 such that
n n1 implies | xn x | < /2 and n n2 implies | yn y | < /2.
If n0 := max{n1 , n2 }, then n n0 implies
$$| (x_n + y_n) - (x + y) | \le | x_n - x | + | y_n - y | < \varepsilon.$$
The proof for the difference is quite similar.
(b) follows from | cxn cx | = | c | | xn x | and | (xn + c) (x + c) | = | xn x |.
(c) We use the identity
$$x_n y_n - xy = (x_n - x)(y_n - y) + x(y_n - y) + y(x_n - x). \qquad (2.4)$$
Given ε > 0 there are integers n1 and n2 such that
n ≥ n1 implies $| x_n - x | < \sqrt\varepsilon$ and n ≥ n2 implies $| y_n - y | < \sqrt\varepsilon$. Then $| (x_n - x)(y_n - y) | < \varepsilon$, and the claim follows from (2.4) and part (a).
(d) First consider the special case xn = 1 for all n. Since lim yn = y ≠ 0, for almost all n we have $| y_n | > | y |/2$, hence
$$\left|\frac{1}{y_n} - \frac{1}{y}\right| = \frac{| y_n - y |}{| y_n |\,| y |} < \frac{2}{| y |^2}\,| y_n - y | \longrightarrow 0,$$
and we get $\lim\frac{1}{y_n} = \frac{1}{\lim y_n}$. The general case can be reduced to the above case using (c) and
$(x_n/y_n) = (x_n \cdot \frac{1}{y_n})$.
(e) By Lemma 1.12 (e) we have $\big|\,| x_n | - | x |\,\big| \le | x_n - x |$. Given ε > 0, there is n0 such that
n ≥ n0 implies | xn − x | < ε. By the above inequality, also $\big|\,| x_n | - | x |\,\big| < \varepsilon$ and we are
done.
Example. Compute $\lim_{n\to\infty} \dfrac{3n^2 + 13n}{n^2 - 2}$. Dividing numerator and denominator by n² gives
$$\frac{3n^2 + 13n}{n^2 - 2} = \frac{3 + \frac{13}{n}}{1 - \frac{2}{n^2}}.$$
Since lim 1/n = 0, by Proposition 2.3 we obtain lim 1/n² = 0 and lim 13/n = 0. Hence
lim 2/n² = 0 and lim(3 + 13/n) = 3. Finally,
$$\lim_{n\to\infty} \frac{3n^2 + 13n}{n^2 - 2} = \frac{\lim_{n\to\infty}\left(3 + \frac{13}{n}\right)}{\lim_{n\to\infty}\left(1 - \frac{2}{n^2}\right)} = \frac31 = 3.$$
A function $p(t) = a_n t^n + \cdots + a_1 t + a_0$ with real numbers $a_0, \ldots, a_n$ is called a (real) polynomial
p(t), and $a_0, \ldots, a_n$ are called the coefficients of p(t). The set of all real polynomials forms a
real vector space denoted by R[x].
Given two polynomials p and q, put D := {t ∈ R | q(t) ≠ 0}. Then r = p/q is called a rational
function, where r : D → R is defined by
$$r(t) := \frac{p(t)}{q(t)}.$$
Polynomials are special rational functions with q(t) ≡ 1. The set of rational functions with real
coefficients forms both a real vector space and a field. It is denoted by R(x).
Lemma 2.4 (a) Let an → 0 be a sequence tending to zero with an ≠ 0 for every n. Then
$$\lim_{n\to\infty} \frac{1}{a_n} = \begin{cases} +\infty, & \text{if } a_n > 0 \text{ for almost all } n;\\ -\infty, & \text{if } a_n < 0 \text{ for almost all } n.\end{cases}$$
(b) Let yn → a be a sequence converging to a and a > 0. Then yn > 0 for almost all n ∈ N.
Proof. (a) We will prove the case with −∞. Let ε > 0. By assumption there is a positive integer
n0 such that n ≥ n0 implies −ε < an < 0. This implies 0 < −an < ε and further $\frac{1}{a_n} < -\frac1\varepsilon < 0$.
Suppose E > 0 is given; choose ε = 1/E and n0 as above. Then by the previous argument,
n ≥ n0 implies
$$\frac{1}{a_n} < -\frac1\varepsilon = -E.$$
Let $p(t) = \sum_{k=0}^r a_k t^k$ and $q(t) = \sum_{k=0}^s b_k t^k$ be polynomials with $a_r \ne 0$ and $b_s \ne 0$. Then
$$\lim_{n\to\infty} \frac{p(n)}{q(n)} = \begin{cases} 0, & r < s,\\[2pt] \dfrac{a_r}{b_s}, & r = s,\\[2pt] +\infty, & r > s \ \text{and}\ \dfrac{a_r}{b_s} > 0,\\[2pt] -\infty, & r > s \ \text{and}\ \dfrac{a_r}{b_s} < 0.\end{cases}$$
Proof. We write
$$\frac{p(n)}{q(n)} = \frac{n^r\left(a_r + a_{r-1}\frac1n + \cdots + a_0\frac{1}{n^r}\right)}{n^s\left(b_s + b_{s-1}\frac1n + \cdots + b_0\frac{1}{n^s}\right)} = \frac{1}{n^{s-r}}\cdot \frac{a_r + a_{r-1}\frac1n + \cdots + a_0\frac{1}{n^r}}{b_s + b_{s-1}\frac1n + \cdots + b_0\frac{1}{n^s}} =: \frac{1}{n^{s-r}}\, c_n.$$
Suppose that r = s. By Proposition 2.3, $\frac{1}{n^k} \to 0$ for all k ∈ N. By the same proposition, the
limit of each summand in the numerator and denominator is 0 except for the first in each sum.
Hence,
$$\lim_{n\to\infty} \frac{p(n)}{q(n)} = \frac{\lim_{n\to\infty}\left(a_r + a_{r-1}\frac1n + \cdots + a_0\frac{1}{n^r}\right)}{\lim_{n\to\infty}\left(b_s + b_{s-1}\frac1n + \cdots + b_0\frac{1}{n^s}\right)} = \frac{a_r}{b_s}.$$
Suppose now that r < s. As in the previous case, the sequence (cn) tends to ar/bs, but the first
factor $\frac{1}{n^{s-r}}$ tends to 0. Hence, the product sequence tends to 0.
Suppose that r > s and $\frac{a_r}{b_s} > 0$. The sequence (cn) has a positive limit. By Lemma 2.4 (b),
almost all cn > 0. Hence,
$$d_n := \frac{q(n)}{p(n)} = \frac{1}{n^{r-s}}\cdot\frac{1}{c_n}$$
tends to 0 by the above part, and dn > 0 for almost all n. By Lemma 2.4 (a), the sequence
$\frac{1}{d_n} = \frac{p(n)}{q(n)}$ tends to +∞ as n → ∞, which proves the claim in the first case. The case
$a_r/b_s < 0$ can be obtained by multiplying with −1 and noting that $\lim_{n\to\infty} x_n = +\infty$ implies
$\lim_{n\to\infty}(-x_n) = -\infty$.
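The three cases of this limit can be watched numerically. An illustrative sketch (the particular polynomials are arbitrary examples, not from the notes):

```python
def ratio(p_coeffs, q_coeffs, n):
    # evaluate p(n)/q(n) for coefficient lists [a_0, a_1, ..., a_r]
    p = sum(a * n**k for k, a in enumerate(p_coeffs))
    q = sum(b * n**k for k, b in enumerate(q_coeffs))
    return p / q

n = 10**6
# r = s: the limit is a_r / b_s = 3/1 = 3.
assert abs(ratio([13, 0, 3], [-2, 0, 1], n) - 3) < 1e-3
# r < s: the limit is 0.
assert abs(ratio([1, 1], [0, 0, 1], n)) < 1e-3
# r > s with a_r/b_s > 0: p(n)/q(n) grows beyond any bound.
assert ratio([0, 0, 1], [1, 1], n) > 1e5
```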
In the German literature the next proposition is known as the Theorem of the two policemen.
Proposition 2.6 (Sandwich Theorem) Let an , bn and xn be real sequences with an xn bn
for all but finitely many n N. Further let limn an = limn bn = x. Then xn is also
convergent to x.
Proof. Let ε > 0. There exist n1, n2, and n3 ∈ N such that n ≥ n1 implies $a_n \in U_\varepsilon(x)$, n ≥ n2
implies $b_n \in U_\varepsilon(x)$, and n ≥ n3 implies $a_n \le x_n \le b_n$. Choosing n0 = max{n1, n2, n3},
n ≥ n0 implies $x_n \in U_\varepsilon(x)$. Hence, xn → x.
Remark. (a) If two sequences (an ) and (bn ) differ only in finitely many elements, then both
sequences converge to the same limit or both diverge.
(b) Define the shifted sequence bn := a_{n+k}, n ∈ N, where k is a fixed positive integer. Then
both sequences converge to the same limit or both diverge.
Proposition 2.7
(a) If p > 0, then $\lim_{n\to\infty} \frac{1}{n^p} = 0$.
(b) If p > 0, then $\lim_{n\to\infty} \sqrt[n]{p} = 1$.
(c) $\lim_{n\to\infty} \sqrt[n]{n} = 1$.
(d) If a > 1 and α ∈ R, then $\lim_{n\to\infty} \frac{n^\alpha}{a^n} = 0$.
Proof. (a) Let ε > 0. Take $n_0 > (1/\varepsilon)^{1/p}$ (note that the Archimedean property of the real
numbers is used here). Then n ≥ n0 implies $1/n^p < \varepsilon$.
(b) If p > 1, put $x_n = \sqrt[n]{p} - 1$. Then xn > 0, and by Bernoulli's inequality (that is, by homework
4.1) we have
$$p = (1 + x_n)^n \ge 1 + n\,x_n,$$
that is,
$$0 < x_n \le \frac{1}{n}(p - 1).$$
By Proposition 2.6, xn → 0. If p = 1, (b) is trivial, and if 0 < p < 1 the result is obtained by
taking reciprocals.
(c) Put $x_n = \sqrt[n]{n} - 1$. Then xn ≥ 0, and by the binomial theorem,
$$n = (1 + x_n)^n \ge \frac{n(n-1)}{2}\, x_n^2.$$
Hence
$$0 \le x_n \le \sqrt{\frac{2}{n-1}}, \qquad (n \ge 2).$$
By (a), $\sqrt{2/(n-1)} \to 0$. Applying the sandwich theorem again, xn → 0 and so $\sqrt[n]{n} \to 1$.
(d) Put p = a − 1; then p > 0. Let k be an integer such that k > α, k > 0. For n > 2k,
$$(1 + p)^n > \binom{n}{k} p^k = \frac{n(n-1)\cdots(n-k+1)}{k!}\, p^k > \frac{n^k p^k}{2^k\, k!}.$$
Hence,
$$0 < \frac{n^\alpha}{a^n} = \frac{n^\alpha}{(1+p)^n} < \frac{2^k\, k!}{p^k}\, n^{\alpha - k} \qquad (n > 2k).$$
Since α − k < 0, $n^{\alpha-k} \to 0$ by (a), and the claim follows from the sandwich theorem.
Q 2. Let (xn) be a convergent sequence of positive numbers with lim xn = x > 0. Then
$$\sqrt[n]{x_1 x_2 \cdots x_n} \to x.$$
Hint: Consider yn = log xn.
$$x_{n+1} = \frac{c}{n+1}\, x_n, \qquad (2.5)$$
we observe that (xn) is strictly decreasing for n ≥ c. Indeed, n ≥ c implies $x_{n+1} = \frac{c}{n+1}\, x_n < x_n$.
On the other hand, xn > 0 for all n, so that (xn) is bounded below by 0. By Proposition 2.8, (xn) converges to some x ∈ R. Taking the limit n → ∞ in (2.5), we have
$$x = \lim_{n\to\infty} \frac{c}{n+1}\, x_n = 0 \cdot x = 0.$$
2.1.4 Subsequences
Definition 2.4 Let (xn ) be a sequence and (nk )kN a strictly increasing sequence of positive
integers nk N. We call (xnk )kN a subsequence of (xn )nN . If (xnk ) converges, its limit is
called a subsequential limit of (xn ).
Example 2.5 (a) xn = 1/n, nk = 2^k; then $(x_{n_k}) = (1/2, 1/4, 1/8, \ldots)$.
(b) (xn) = (1, −1, 1, −1, …). $(x_{2k}) = (-1, -1, \ldots)$ has the subsequential limit −1; $(x_{2k+1}) =
(1, 1, 1, \ldots)$ has the subsequential limit 1.
Proposition 2.9 Subsequences of convergent sequences are convergent with the same limit.
Proof. Let lim xn = x and $(x_{n_k})$ be a subsequence. To ε > 0 there exists m0 ∈ N such that
n ≥ m0 implies | xn − x | < ε. Since nm ≥ m for all m, m ≥ m0 implies $| x_{n_m} - x | < \varepsilon$;
hence $\lim x_{n_m} = x$.
Definition 2.5 Let (xn ) be a sequence. We call x R a limit point of (xn ) if every neighborhood of x contains infinitely many elements of (xn ).
Proposition 2.10 The point x is limit point of the sequence (xn ) if and only if x is a subsequential limit.
Proof. If $\lim_k x_{n_k} = x$, then every neighborhood $U_\varepsilon(x)$ contains all but finitely many $x_{n_k}$; in
particular, it contains infinitely many elements xn. That is, x is a limit point of (xn).
Suppose x is a limit point of (xn). To ε = 1 there exists $x_{n_1} \in U_1(x)$. To ε = 1/k there exists
nk with $x_{n_k} \in U_{1/k}(x)$ and $n_k > n_{k-1}$. We have constructed a subsequence $(x_{n_k})$ of (xn) with
$$| x - x_{n_k} | < \frac{1}{k};$$
hence, $(x_{n_k})$ converges to x.
Question: Which sequences do have limit points? The answer is: Every bounded sequence has
limit points.
Proposition 2.11 (Principle of nested intervals) Let In := [an, bn] be a sequence of closed
nested intervals, $I_{n+1} \subseteq I_n$, such that their lengths bn − an tend to 0:
Given ε > 0 there exists n0 such that 0 ≤ bn − an < ε for all n ≥ n0.
For any such interval sequence {In} there exists a unique real number x ∈ R which is a member
of all intervals, i.e. $\{x\} = \bigcap_{n\in\mathbb{N}} I_n$.
Proof. Since the intervals are nested, (an) is an increasing sequence bounded above by each
of the bk, and (bn) is a decreasing sequence bounded below by each of the ak. Consequently,
by Proposition 2.8 we have
$$x = \lim_{n\to\infty} a_n = \sup\{a_n\} \le b_m, \qquad y = \lim_{n\to\infty} b_n = \inf\{b_n\} \ge a_m,$$
for every m, so that $[x, y] \subseteq I_n$ for all n.
We show the converse inclusion, namely that $\bigcap_{n\in\mathbb{N}} [a_n, b_n] \subseteq [x, y]$. Let p ∈ In for all n, that
is, an ≤ p ≤ bn for all n ∈ N. Hence $\sup_n a_n \le p \le \inf_n b_n$; that is, p ∈ [x, y]. Thus,
$[x, y] = \bigcap_{n\in\mathbb{N}} I_n$. We show uniqueness, that is x = y. Given ε > 0 we find n such that
$y - x \le b_n - a_n < \varepsilon$. Hence y − x ≤ 0; therefore x = y. The intersection contains a unique
point x.
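The principle of nested intervals is the backbone of the bisection method: repeated halving produces nested intervals whose unique common point is the number sought. An illustrative sketch locating √2 (not part of the original notes):

```python
def bisect_sqrt2(steps):
    # nested intervals [a_n, b_n] with a_n^2 < 2 <= b_n^2 at every step
    a, b = 1.0, 2.0            # sqrt(2) lies in [1, 2]
    for _ in range(steps):
        m = (a + b) / 2
        if m * m < 2:
            a = m              # keep the half that still contains sqrt(2)
        else:
            b = m
    return a, b

a, b = bisect_sqrt2(40)
assert a * a < 2 <= b * b      # sqrt(2) is in every interval ...
assert b - a == 1 / 2**40      # ... and the lengths b_n - a_n tend to 0
```

The invariant `a*a < 2 <= b*b` is exactly the nesting condition; since the lengths are halved in every step, the intersection of all the intervals is the single point √2.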
Example 2.6 (a) $x_n = (-1)^{n-1} + \frac1n$; the set of limit points is {−1, 1}. First note that −1 =
$\lim_{n\to\infty} x_{2n}$ and $1 = \lim_{n\to\infty} x_{2n+1}$ are subsequential limits of (xn). We show, for example,
that 1/3 is not a limit point. Indeed, for n ≥ 4, there exists a small neighborhood of 1/3 which has
no intersection with $U_{\frac14}(1)$ and $U_{\frac14}(-1)$. Hence, 1/3 is not a limit point.
(b) $x_n = n - 5\left[\frac n5\right]$, where [x] denotes the greatest integer less than or equal to x ([π] = [3] = 3,
[−2.8] = −3, [1/2] = 0). (xn) = (1, 2, 3, 4, 0, 1, 2, 3, 4, 0, …); the set of limit points is
{0, 1, 2, 3, 4}.
(c) One can enumerate the rational numbers in (0, 1) in the following way:
$$\underbrace{\frac12}_{x_1},\quad \underbrace{\frac13,\ \frac23}_{x_2,\,x_3},\quad \underbrace{\frac14,\ \frac24,\ \frac34}_{x_4,\,x_5,\,x_6},\quad \ldots$$
The set of limit points is the whole interval [0, 1] since in any neighborhood of any real number
there is a rational number, see Proposition 1.11 (b). Any rational number of (0, 1)
appears infinitely often in this sequence, namely as $\frac pq = \frac{2p}{2q} = \frac{3p}{3q} = \cdots$.
(d) xn = n has no limit point. Since (xn) is not bounded, the Bolzano–Weierstraß theorem fails to apply.
Definition 2.6 (a) Let (xn) be a bounded sequence and A its set of limit points. Then sup A is
called the upper limit of (xn) and inf A is called the lower limit of (xn). We write
$$\overline{\lim_{n\to\infty}}\, x_n := \sup A \qquad\text{and}\qquad \underline{\lim_{n\to\infty}}\, x_n := \inf A.$$
Choose a limit point x′ of (xn) with $x - \frac\varepsilon2 < x' \le x$.
Since x′ is a limit point, $U_{\frac\varepsilon2}(x')$ contains infinitely many elements xk. By construction,
$U_{\frac\varepsilon2}(x') \subseteq U_\varepsilon(x)$. Indeed, $x'' \in U_{\frac\varepsilon2}(x')$ implies $| x'' - x' | < \frac\varepsilon2$ and therefore
$$| x'' - x | = | x'' - x' + x' - x | \le | x'' - x' | + | x' - x | < \frac\varepsilon2 + \frac\varepsilon2 = \varepsilon.$$
$$\overline{\lim_{n\to\infty}}\, x_n \le b. \qquad (2.6)$$
Similarly, if xn ≥ b for all but finitely many n and (xn) is bounded below, then
$$\underline{\lim_{n\to\infty}}\, x_n \ge b. \qquad (2.7)$$
Proof. We prove only the first part, for $\overline{\lim}\, x_n$; proving the statement for $\underline{\lim}\, x_n$ is similar.
Let $t := \overline{\lim}\, x_n$. Suppose to the contrary that t > b. Set ε = (t − b)/2; then $U_\varepsilon(t)$ contains
infinitely many xn (t is a limit point) which are all greater than b; this contradicts xn ≤ b for
all but finitely many n. Hence $\overline{\lim}\, x_n \le b$.
Applying the first part to b = supₙ{xn} and noting that inf A ≤ sup A, we have
$$\inf_n\{x_n\} \le \underline{\lim_{n\to\infty}}\, x_n \le \overline{\lim_{n\to\infty}}\, x_n \le \sup_n\{x_n\}.$$
Proposition 2.16 Let (sn) and (tn) be bounded sequences with sn ≤ tn for all but finitely many n. Then
$$\overline{\lim_{n\to\infty}}\, s_n \le \overline{\lim_{n\to\infty}}\, t_n \qquad\text{and}\qquad \underline{\lim_{n\to\infty}}\, s_n \le \underline{\lim_{n\to\infty}}\, t_n.$$
Proof. (a) We keep the notations s and t for the upper limits of (sn) and (tn), respectively: set
$s = \overline{\lim}\, s_n$ and $t = \overline{\lim}\, t_n$. Let ε > 0. By homework 6.3 (a), $s - \varepsilon \le s_n$ for infinitely many n; by assumption $s - \varepsilon \le s_n \le t_n$ for infinitely many n; by Prp. 2.14, $s - \varepsilon \le t$. Hence
$$\sup\{s - \varepsilon \mid \varepsilon > 0\} \le t \quad\Longrightarrow\quad s \le t.$$
(b) The proof for the lower limit follows from (a) and sup E = −inf(−E).
Proposition 2.18 (Cauchy convergence criterion) A real sequence is convergent if and only
if it is a Cauchy sequence.
Proof. One direction is Lemma 2.17. We prove the other direction. Let (xn) be a Cauchy
sequence. First we show that (xn) is bounded. To ε = 1 there is a positive integer n0 such
that m, n ≥ n0 implies | xm − xn | < 1. In particular, $| x_n - x_{n_0} | < 1$ for all n ≥ n0; hence
$| x_n | < 1 + | x_{n_0} |$. Setting
$$C = \max\{| x_1 |, | x_2 |, \ldots, | x_{n_0-1} |, | x_{n_0} | + 1\},$$
we see that (xn) is bounded by C.
Example. The sequence of partial sums $x_n = \sum_{k=1}^n \frac1k = 1 + \frac12 + \frac13 + \cdots + \frac1n$ of the harmonic series is not a Cauchy sequence, since
$$x_{2m} - x_m = \sum_{k=m+1}^{2m} \frac1k \ge \sum_{k=m+1}^{2m} \frac{1}{2m} = m\cdot\frac{1}{2m} = \frac12.$$
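The failure of the Cauchy property is easy to observe numerically: doubling the index always adds at least ½ to the partial sum. An illustrative sketch (not part of the original notes):

```python
def harmonic(n):
    # n-th partial sum x_n = 1 + 1/2 + ... + 1/n of the harmonic series
    return sum(1.0 / k for k in range(1, n + 1))

for m in (10, 100, 1000):
    # the Cauchy property fails: x_{2m} - x_m >= 1/2 for every m
    assert harmonic(2 * m) - harmonic(m) >= 0.5
```

In fact the difference tends to log 2 ≈ 0.693, always comfortably above ½.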
On the other hand, the sequence of partial sums $x_n = \sum_{k=1}^n \frac{(-1)^{k+1}}{k}$ of the alternating harmonic series is a Cauchy sequence. Grouping the terms in pairs,
$$| x_{n+k} - x_n | = \frac{1}{n+1} - \frac{1}{n+2} + \frac{1}{n+3} - \cdots \pm \frac{1}{n+k} = \frac{1}{n+1} - \left(\frac{1}{n+2} - \frac{1}{n+3}\right) - \cdots < \frac{1}{n+1},$$
since all summands in parentheses are positive. Hence, (xn) is a Cauchy sequence and converges.
2.3 Series
Definition 2.8 Given a sequence (an), we associate with (an) a sequence (sn), where
$$s_n = \sum_{k=1}^n a_k = a_1 + a_2 + \cdots + a_n.$$
For (sn) we also use the symbol
$$\sum_{k=1}^\infty a_k, \qquad (2.8)$$
and we call it an infinite series or just a series. The numbers sn are called the partial sums of
the series. If (sn) converges to s, we say that the series converges, and write
$$\sum_{k=1}^\infty a_k = s.$$
(1) $\sum_{n=1}^\infty \frac1n$ is divergent. This is the harmonic series.
(2) $\sum_{n=1}^\infty (-1)^{n+1}\frac1n$ is convergent. It is an example of an alternating series (the summands are
changing their signs, and the absolute values of the summands form a sequence decreasing to 0).
(3) $\sum_{n=0}^\infty q^n$ is called the geometric series. It is convergent for | q | < 1 with $\sum_{n=0}^\infty q^n = \frac{1}{1-q}$. This
is seen from
$$\sum_{k=0}^n q^k = \frac{1 - q^{n+1}}{1 - q},$$
see proof of Lemma 1.14, first formula with y = 1, x = q.
The series diverges for | q | ≥ 1. The general formula in case | q | < 1 is
$$\sum_{n=n_0}^\infty c\,q^n = \frac{c\,q^{n_0}}{1 - q}. \qquad (2.9)$$
(3) If (an) is a sequence of nonnegative real numbers, then $\sum a_n$ converges if and only if the
partial sums are bounded.
Proof. (1) Suppose that $\sum_{n=1}^\infty a_n = s$; we show that $\sum_{n=m}^\infty a_n = s - (a_1 + a_2 + \cdots + a_{m-1})$.
Indeed, let (sn) and (tn) denote the nth partial sums of $\sum_{k=1}^\infty a_k$ and $\sum_{k=m}^\infty a_k$, respectively.
Then for n > m one has $t_n = s_n - \sum_{k=1}^{m-1} a_k$. Taking the limit n → ∞ proves the claim.
We prove (2). Suppose that $\sum_{n=1}^\infty a_n$ converges to s. By (1), $r_n = \sum_{k=n+1}^\infty a_k$ is also a convergent
series for all n. We have
$$\sum_{k=1}^\infty a_k = \sum_{k=1}^n a_k + \sum_{k=n+1}^\infty a_k \implies s = s_n + r_n \implies r_n = s - s_n \implies \lim_{n\to\infty} r_n = s - s = 0.$$
$$\left|\sum_{k=m}^n a_k\right| < \varepsilon \qquad (2.10)$$
if m, n ≥ n0.
Proof. Clear from Proposition 2.18. Consider the sequence of partial sums $s_n = \sum_{k=1}^n a_k$ and
note that for n ≥ m one has $| s_n - s_{m-1} | = \left|\sum_{k=m}^n a_k\right|$.
Corollary 2.21 If $\sum_{k=1}^\infty a_k$ converges, then $\lim_{k\to\infty} a_k = 0$.
Proposition 2.22 (Comparison test) (a) If | an | ≤ C bn for some C > 0 and for almost all
n ∈ N, and if $\sum b_n$ converges, then $\sum a_n$ converges.
(b) If $a_n \ge C d_n \ge 0$ for some C > 0 and for almost all n, and if $\sum d_n$ diverges, then $\sum a_n$
diverges.
Proof. (a) Suppose n ≥ n1 implies | an | ≤ C bn. Given ε > 0, there exists n0 ≥ n1 such that
m, n ≥ n0 implies
$$\sum_{k=m}^n b_k < \frac{\varepsilon}{C}$$
by the Cauchy criterion. Hence
$$\left|\sum_{k=m}^n a_k\right| \le \sum_{k=m}^n | a_k | \le \sum_{k=m}^n C b_k < \varepsilon,$$
and (a) follows, again by the Cauchy criterion.
Caution: the product series $\sum c_n$ need not be convergent. Indeed, let $a_n := b_n := (-1)^n/\sqrt n$.
One can show that $\sum a_n$ and $\sum b_n$ are convergent (see Proposition 2.29 below); however, $\sum c_n$
is not convergent, where $c_n = \sum_{k=1}^n a_k b_{n-k+1}$. Proof: By the arithmetic–geometric mean inequality,
$$| a_k b_{n-k+1} | = \frac{1}{\sqrt{k(n+1-k)}} \ge \frac{2}{k + (n+1-k)} = \frac{2}{n+1}.$$
Hence, $| c_n | \ge \sum_{k=1}^n \frac{2}{n+1} = \frac{2n}{n+1}$. Since cn doesn't
converge to 0 as n → ∞, $\sum_{n=0}^\infty c_n$ diverges by Corollary 2.21.
Proposition 2.23 (Cauchy condensation test) Suppose $a_1 \ge a_2 \ge a_3 \ge \cdots \ge 0$. Then the series
$\sum_{n=1}^\infty a_n$ converges if and only if the series
$$\sum_{k=0}^\infty 2^k a_{2^k} = a_1 + 2a_2 + 4a_4 + 8a_8 + \cdots$$
converges.
Proof. By Lemma 2.19 (3) it suffices to consider boundedness of the partial sums. Let
$$s_n = a_1 + \cdots + a_n, \qquad t_k = a_1 + 2a_2 + \cdots + 2^k a_{2^k}. \qquad (2.11)$$
60
For n < 2^k,
$$s_n \le a_1 + (a_2 + a_3) + \cdots + \big(a_{2^k} + \cdots + a_{2^{k+1}-1}\big) \le a_1 + 2a_2 + \cdots + 2^k a_{2^k} = t_k; \qquad (2.12)$$
for n > 2^k,
$$s_n \ge a_1 + a_2 + (a_3 + a_4) + \cdots + \big(a_{2^{k-1}+1} + \cdots + a_{2^k}\big) \ge \frac12 a_1 + a_2 + 2a_4 + \cdots + 2^{k-1} a_{2^k} = \frac12 t_k. \qquad (2.13)$$
By (2.12) and (2.13), the sequences (sn) and (tk) are either both bounded or both unbounded. This
completes the proof.
Example 2.10 (a) $\sum_{n=1}^\infty \frac{1}{n^p}$ converges if p > 1 and diverges if p ≤ 1.
If p ≤ 0, divergence follows from Corollary 2.21. If p > 0 Proposition 2.23 is applicable, and
we are led to the series
$$\sum_{k=0}^\infty 2^k\,\frac{1}{2^{kp}} = \sum_{k=0}^\infty \left(\frac{1}{2^{p-1}}\right)^k.$$
This is a geometric series with $q = \frac{1}{2^{p-1}}$. It converges if and only if $2^{p-1} > 1$, if and only if
p > 1.
(b) If p > 1,
$$\sum_{n=2}^\infty \frac{1}{n(\log n)^p} \qquad (2.14)$$
converges; if p ≤ 1, the series diverges. Here log n denotes the logarithm to the base e.
If p < 0, $\frac{1}{n(\log n)^p} > \frac1n$ and divergence follows by comparison with the harmonic series. Now let
p > 0. By Lemma 1.23 (b), log n < log(n + 1). Hence $(n(\log n)^p)$ increases and $1/(n(\log n)^p)$
decreases; we can apply Proposition 2.23 to (2.14). This leads us to the series
$$\sum_{k=1}^\infty 2^k\,\frac{1}{2^k (\log 2^k)^p} = \sum_{k=1}^\infty \frac{1}{(k\log 2)^p} = \frac{1}{(\log 2)^p}\sum_{k=1}^\infty \frac{1}{k^p},$$
and the claim follows from (a).
Similarly,
$$\sum_{n=3}^\infty \frac{1}{n\log n\,\log\log n} \quad\text{diverges, whereas}\quad \sum_{n=3}^\infty \frac{1}{n\log n\,(\log\log n)^2} \quad\text{converges.}$$
We define
$$e := \sum_{n=0}^\infty \frac{1}{n!}, \qquad (2.15)$$
where 0! = 1! = 1 by definition. Since
$$s_n = 1 + 1 + \frac{1}{1\cdot2} + \frac{1}{1\cdot2\cdot3} + \cdots + \frac{1}{1\cdot2\cdots n} < 1 + 1 + \frac12 + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}} < 3,$$
the series converges (by comparing it with the geometric series with q = ½) and the definition
makes sense. In fact, the series converges very rapidly and allows us to compute e with
great accuracy. It is of interest to note that e can also be defined by means of another limit process.
e is called the Euler number.
Proposition 2.24
$$e = \lim_{n\to\infty}\left(1 + \frac1n\right)^n. \qquad (2.16)$$
Proof. Let
$$s_n = \sum_{k=0}^n \frac{1}{k!}, \qquad t_n = \left(1 + \frac1n\right)^n.$$
Hence, tn ≤ sn, so that by Proposition 2.16
$$\overline{\lim_{n\to\infty}}\, t_n \le \overline{\lim_{n\to\infty}}\, s_n = \lim_{n\to\infty} s_n = e. \qquad (2.17)$$
Next, if n ≥ m,
$$t_n \ge 1 + 1 + \frac{1}{2!}\left(1 - \frac1n\right) + \cdots + \frac{1}{m!}\left(1 - \frac1n\right)\cdots\left(1 - \frac{m-1}{n}\right).$$
Letting n → ∞ while keeping m fixed, we get
$$\underline{\lim_{n\to\infty}}\, t_n \ge 1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{m!} = s_m. \qquad (2.18)$$
Letting m → ∞, finally $\underline{\lim}\, t_n \ge e$; together with (2.17) this proves (2.16).
Moreover,
$$e - s_n = \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \cdots < \frac{1}{(n+1)!}\left(1 + \frac{1}{n+1} + \frac{1}{(n+1)^2} + \cdots\right) = \frac{1}{(n+1)!}\cdot\frac{1}{1 - \frac{1}{n+1}} = \frac{1}{n!\,n},$$
so that
$$0 < e - s_n < \frac{1}{n!\,n}. \qquad (2.19)$$
For example,
$$s_9 = 1 + 1 + \frac12 + \frac16 + \frac1{24} + \frac1{120} + \frac1{720} + \frac1{5040} + \frac1{40320} + \frac1{362880} = 2.718281526\ldots \qquad (2.20)$$
By (2.19),
$$e - s_9 < \frac{1}{9!\cdot 9} \approx \frac{3.1}{10^7},$$
so that the first six digits of e in (2.20) are correct.
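The error bound (2.19) makes it easy to compute e to any desired accuracy and to verify the claim about s₉ directly. An illustrative sketch (not part of the original notes):

```python
import math

def e_partial(n):
    # s_n = sum_{k=0}^{n} 1/k!, computed incrementally to avoid large factorials
    s, term = 0.0, 1.0
    for k in range(n + 1):
        s += term
        term /= (k + 1)   # turns 1/k! into 1/(k+1)!
    return s

s9 = e_partial(9)
# By (2.19): 0 < e - s_9 < 1/(9! * 9) ~ 3.1e-7,
# so the first six digits 2.71828... of (2.20) are already correct.
assert 0 < math.e - s9 < 1 / (math.factorial(9) * 9) + 1e-12
assert abs(s9 - math.e) < 1e-6
```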
Example 2.11
(a)
$$\lim_{n\to\infty}\left(1 - \frac1n\right)^n = \lim_{n\to\infty}\left(\frac{n-1}{n}\right)^n = \lim_{n\to\infty}\frac{1}{\left(1 + \frac{1}{n-1}\right)^{n-1}\left(1 + \frac{1}{n-1}\right)} = \frac1e.$$
(b)
$$\lim_{n\to\infty}\left(\frac{3n+1}{3n-1}\right)^{4n} = \lim_{n\to\infty}\left(\frac{3n+1}{3n}\right)^{4n}\cdot \lim_{n\to\infty}\left(\frac{3n}{3n-1}\right)^{4n} = \lim_{n\to\infty}\left(\left(1 + \frac{1}{3n}\right)^{3n}\right)^{\frac43}\cdot \lim_{n\to\infty}\left(\left(1 + \frac{1}{3n-1}\right)^{3n}\right)^{\frac43} = e^{\frac43}\cdot e^{\frac43} = e^{\frac83}.$$
In place of (b) one can also use the (weaker) statement:
(b′) $\sum a_n$ diverges if $\underline{\lim_{n\to\infty}}\left|\dfrac{a_{n+1}}{a_n}\right| > 1$.
Indeed, if (b′) is satisfied, almost all elements of the sequence $\left(\left|\frac{a_{n+1}}{a_n}\right|\right)$ are ≥ 1.
In terms of upper and lower limits: the series $\sum a_n$
(a) converges if $\overline{\lim_{n\to\infty}}\left|\dfrac{a_{n+1}}{a_n}\right| < 1$;
(b) diverges if $\underline{\lim_{n\to\infty}}\left|\dfrac{a_{n+1}}{a_n}\right| > 1$.
Proof of Theorem 2.27. If condition (a) holds, we can find β < 1 and an integer m such that
n ≥ m implies
$$\left|\frac{a_{n+1}}{a_n}\right| < \beta.$$
In particular,
$$| a_{m+1} | < \beta\,| a_m |, \quad | a_{m+2} | < \beta^2 | a_m |, \quad \ldots, \quad | a_{m+p} | < \beta^p\, | a_m |.$$
That is,
$$| a_n | < \frac{| a_m |}{\beta^m}\,\beta^n$$
for n ≥ m, and (a) follows from the comparison test, since $\sum \beta^n$ converges. If $| a_{n+1} | \ge | a_n |$
for n ≥ n0, it is seen that the condition $a_n \to 0$ does not hold, and (b) follows.
Remark 2.2 Homework 7.5 shows that in (b) "all but finitely many" cannot be replaced by the
weaker assumption "infinitely many."
Example 2.12 (a) The series $\sum_{n=0}^\infty n^2/2^n$ converges since, if n ≥ 3,
$$\left|\frac{a_{n+1}}{a_n}\right| = \frac{(n+1)^2\, 2^n}{n^2\, 2^{n+1}} = \frac12\left(1 + \frac1n\right)^2 \le \frac12\left(1 + \frac13\right)^2 = \frac12\cdot\frac{16}{9} = \frac89 < 1.$$
(b) Consider the series
$$\frac12 + 1 + \frac18 + \frac14 + \frac{1}{32} + \frac{1}{16} + \frac{1}{128} + \frac{1}{64} + \cdots = \frac{1}{2} + \frac{1}{2^0} + \frac{1}{2^3} + \frac{1}{2^2} + \frac{1}{2^5} + \frac{1}{2^4} + \cdots,$$
where
$$\underline{\lim_{n\to\infty}}\, \frac{a_{n+1}}{a_n} = \frac18, \qquad \overline{\lim_{n\to\infty}}\, \frac{a_{n+1}}{a_n} = 2, \qquad\text{but}\qquad \lim_{n\to\infty} \sqrt[n]{a_n} = \frac12.$$
Indeed, $a_{2n} = 1/2^{2n-2}$ and $a_{2n+1} = 1/2^{2n+1}$ yield
$$\frac{a_{2n+1}}{a_{2n}} = \frac18, \qquad \frac{a_{2n}}{a_{2n-1}} = 2.$$
The root test indicates convergence; the ratio test does not apply.
(c) For $\sum \frac1n$ and $\sum \frac{1}{n^2}$ both the ratio and the root test do not apply, since both $(a_{n+1}/a_n)$ and
$(\sqrt[n]{a_n})$ converge to 1.
The ratio test is frequently easier to apply than the root test. However, the root test has wider
scope.
Remark 2.3 For any sequence (cn) of positive real numbers,
$$\underline{\lim_{n\to\infty}}\, \frac{c_{n+1}}{c_n} \le \underline{\lim_{n\to\infty}}\, \sqrt[n]{c_n} \le \overline{\lim_{n\to\infty}}\, \sqrt[n]{c_n} \le \overline{\lim_{n\to\infty}}\, \frac{c_{n+1}}{c_n}.$$
Proposition 2.29 (Leibniz criterion) Let $\sum b_n$ be an alternating series, that is,
$\sum b_n = \sum (-1)^{n+1} a_n$ with a decreasing sequence of positive numbers $a_1 \ge a_2 \ge \cdots \ge 0$. If lim an = 0,
then $\sum b_n$ converges.
Proof. The proof is quite the same as in Example 2.8 (b). We find for the partial sums sn of
$\sum b_n$
$$| s_n - s_m | \le a_{m+1}$$
if n ≥ m. Since (an) tends to 0, the Cauchy criterion applies to (sn). Hence, $\sum b_n$ is
convergent.
Proposition 2.30 If $\sum | a_n |$ converges, then $\sum a_n$ converges.
This follows from the Cauchy criterion and $\left|\sum_{k=m}^n a_k\right| \le \sum_{k=m}^n | a_k |$.
Remarks 2.4 For series with positive terms, absolute convergence is the same as convergence.
If $\sum a_n$ converges but $\sum | a_n |$ diverges, we say that $\sum a_n$ converges nonabsolutely. For instance,
$\sum (-1)^{n+1}/n$ converges nonabsolutely. The comparison test, as well as the root and
the ratio tests, is really a test for absolute convergence and cannot give any information about
nonabsolutely convergent series.
We shall see that we may operate with absolutely convergent series very much as with finite
sums. We may multiply them, and we may change the order in which the additions are carried out
without affecting the sum of the series. But for nonabsolutely convergent series this is no
longer true, and more care has to be taken when dealing with them.
Without proof we mention the fact that one can multiply absolutely convergent series; for the
proof, see [Rud76, Theorem 3.50].
Proposition 2.31 If $\sum a_n$ converges absolutely with $\sum_{n=0}^\infty a_n = A$, $\sum b_n$ converges with
$\sum_{n=0}^\infty b_n = B$, and $c_n = \sum_{k=0}^n a_k b_{n-k}$, n ∈ Z⁺, then
$$\sum_{n=0}^\infty c_n = AB.$$
Decimal expansion. With α ∈ [0, 1) we associate a series
$$\sum_{n=1}^\infty a_n 10^{-n}, \qquad a_n \in \{0, 1, \ldots, 9\}, \qquad (2.22)$$
with $\alpha = \sum_{n=1}^\infty a_n 10^{-n}$. Note that any such series converges, since
$$\sum_{n=1}^\infty a_n 10^{-n} \le 9\sum_{n=1}^\infty 10^{-n} = \frac{9}{10}\cdot\frac{1}{1 - 1/10} = 1.$$
The partial sums are
$$s_n = \sum_{k=1}^n a_k 10^{-k}.$$
2.3 Series
The induction step is complete. By construction, | α − s_n | < 10^{−n}; that is, lim s_n = α.
Remarks 2.5 (a) The proof shows that any real number α ∈ [0, 1) can be approximated by rational numbers.
(b) The construction avoids decimal expansions of the form α = … a999…, a < 9, and gives instead α = … (a + 1)000… . It gives a bijective correspondence between the real numbers of the interval [0, 1) and the sequences (a_n), a_n ∈ {0, 1, …, 9}, not ending with nines. However, the sequence (a_n) = (0, 1, 9, 9, …) corresponds to the real number 0.02.
(c) It is not difficult to see that α ∈ [0, 1) is rational if and only if there exist positive integers n_0 and p such that n ≥ n_0 implies a_n = a_{n+p}; the decimal expansion is periodic from n_0 on.
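The digit construction can be carried out directly: multiply by 10, take the integer part as the next digit, keep the fractional part. A small Python sketch (illustrative only; exact rationals are used so that the periodicity of Remark (c) is visible):

```python
from fractions import Fraction

def decimal_digits(alpha, n):
    """First n digits a_1, ..., a_n of alpha in [0, 1)."""
    digits, frac = [], alpha
    for _ in range(n):
        frac *= 10
        d = int(frac)      # next digit a_k
        digits.append(d)
        frac -= d          # remainder, again in [0, 1)
    return digits

# 1/7 = 0.142857 142857 ... : periodic with period 6, as in Remark 2.5 (c)
digits = decimal_digits(Fraction(1, 7), 12)
assert digits == [1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5, 7]
```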
Since | z_n |² = x_n² + y_n², | z_n |² → 0 as n → ∞; this implies z_n → 0.
Since the complex field ℂ is not an ordered field, all notions and propositions where the order is involved do not make sense for complex series, or they need modifications. The sandwich theorem does not hold; there is no notion of monotonic sequences or of upper and lower limits. But still there are bounded sequences (| z_n | ≤ C), limit points, subsequences, Cauchy sequences, series, and absolute convergence. The following theorems are true for complex sequences, too:
Proposition/Lemma/Theorem 1, 2, 3, 9, 10, 12, 15, 17, 18
The Bolzano–Weierstraß theorem for bounded complex sequences (z_n) can be proved by considering the real and the imaginary sequences (Re z_n) and (Im z_n) separately.
The comparison test for series now reads:
(a) If | a_n | ≤ C | b_n | for some C > 0 and for almost all n ∈ ℕ, and if Σ | b_n | converges, then Σ a_n converges (absolutely).
(b) If | a_n | ≥ C | d_n | for some C > 0 and for almost all n, and if Σ | d_n | diverges, then Σ | a_n | diverges.
The Cauchy criterion, the root, and the ratio tests are true for complex series as well. Propositions 19, 20, 26, 27, 28, 30, 31 are true for complex series.
A series of the form
  Σ_{n=0}^∞ c_n z^n   (2.23)
is called a power series. The numbers c_n are called the coefficients of the series; z is a complex number.
In general, the series will converge or diverge, depending on the choice of z. More precisely,
with every power series there is associated a circle with center 0, the circle of convergence, such
that (2.23) converges if z is in the interior of the circle and diverges if z is in the exterior. The
radius R of this disc of convergence is called the radius of convergence.
On the disc of convergence, a power series defines a function since it associates to each z with
| z | < R a complex number, namely the sum of the numerical series Σ_n c_n z^n. For example, Σ_{n=0}^∞ z^n defines the function f(z) = 1/(1 − z) for | z | < 1. If almost all coefficients c_n are 0, say c_n = 0 for all n ≥ m + 1, the power series is a finite sum and the corresponding function is a polynomial:
  Σ_{n=0}^∞ c_n z^n = Σ_{n=0}^{m} c_n z^n = c_0 + c_1 z + c_2 z² + ⋯ + c_m z^m.
Theorem 2.34 Given a power series Σ c_n z^n, put
  α = limsup_{n→∞} ⁿ√| c_n |,  R = 1/α.   (2.24)
(If α = 0, R = +∞; if α = +∞, R = 0.) Then Σ c_n z^n converges if | z | < R, and diverges if | z | > R.
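The quantity α = limsup ⁿ√| c_n | can be approximated by computing | c_n |^{1/n} for one large n; a Python sketch (illustrative, and only valid when the limit actually exists):

```python
import math

def root_test_alpha(c, n):
    """n-th root |c_n|^(1/n), a numerical proxy for limsup |c_n|^(1/n)."""
    return abs(c(n)) ** (1.0 / n)

# c_n = 2^n: alpha -> 2, so R = 1/2
assert abs(root_test_alpha(lambda n: 2.0 ** n, 500) - 2.0) < 1e-9
# c_n = n: alpha -> 1, so R = 1
assert abs(root_test_alpha(lambda n: float(n), 10 ** 6) - 1.0) < 1e-4
# c_n = 1/n!: alpha -> 0, so R = +infinity
assert root_test_alpha(lambda n: 1.0 / math.factorial(n), 170) < 0.1
```

Note that n^{1/n} tends to 1 very slowly, which is why the middle test needs such a large n.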
(d) We have
  Σ_{n=2}^∞ 2 (i/3)^n = 2 (i/3)² / (1 − i/3) = −(3 + i)/15.
P
(e) The series z n /n has R = 1. It diverges if z = 1. It converges for all other z with | z | = 1
(without proof).
P
(f) The series z n /n2 has R = 1. It converges for all z with | z | = 1 by the comparison test,
since | z n /n2 | = 1/n2 .
2.3.10 Rearrangements
The generalized associative law for finite sums says that we can insert brackets without affecting the sum, for example, ((a_1 + a_2) + (a_3 + a_4)) = (a_1 + (a_2 + (a_3 + a_4))). We will see that a similar statement holds for series.
Suppose that Σ_k a_k is a converging series and Σ_l b_l is a sum obtained from Σ_k a_k by inserting brackets, for example
  b_1 + b_2 + b_3 + ⋯ = (a_1 + a_2) + (a_3 + ⋯ + a_{10}) + (a_{11} + a_{12}) + ⋯
with b_1 = a_1 + a_2, b_2 = a_3 + ⋯ + a_{10}, b_3 = a_{11} + a_{12}, and so on.
Then Σ_l b_l converges and the sum is the same. If Σ_k a_k diverges to +∞, the same is true for Σ_l b_l. However, divergence of Σ a_k does not imply divergence of Σ b_l in general, since 1 − 1 + 1 − 1 + 1 − 1 + ⋯ diverges but (1 − 1) + (1 − 1) + ⋯ converges. For the proof let s_n = Σ_{k=1}^{n} a_k and t_m = Σ_{l=1}^{m} b_l. By construction, t_m = s_{n_m} for a suitable subsequence (s_{n_m}) of the partial sums of Σ_k a_k. Convergence (proper or improper) of (s_n) implies convergence (proper or improper) of any subsequence. Hence, Σ_l b_l converges.
For finite sums, the generalized commutative law holds:
  a_1 + a_2 + a_3 + a_4 = a_2 + a_4 + a_1 + a_3;
that is, a rearrangement of the summands does not affect the sum. We will see in Example 2.14 below that this is not true for arbitrary series, but it is true for absolutely converging ones (see Proposition 2.36 below).
Definition 2.11 Let σ: ℕ → ℕ be a bijective mapping, that is, in the sequence (σ(1), σ(2), …) every positive integer appears once and only once. Putting
  a′_n = a_{σ(n)},  (n = 1, 2, …),
we say that Σ a′_n is a rearrangement of Σ a_n.
If (s_n) and (s′_n) are the partial sums of Σ a_n and of a rearrangement Σ a′_n of Σ a_n, it is easily seen that, in general, these two sequences consist of entirely different numbers. We are led to the problem of determining under what conditions all rearrangements of a convergent series will converge and whether the sums are necessarily the same.
Example 2.14 (a) Consider the convergent series
  Σ_{n=1}^∞ (−1)^{n+1}/n = 1 − 1/2 + 1/3 − 1/4 + ⋯ = s   (2.25)
and its rearrangement
  1 − 1/2 − 1/4 + 1/3 − 1/6 − 1/8 + 1/5 − 1/10 − 1/12 + ⋯   (2.26)
in which each positive term 1/(2k − 1) is followed by the two negative terms 1/(4k − 2) and 1/(4k). We will show that (2.26) converges to s′ = s/2. Namely,
  s′ = Σ a′_n = (1 − 1/2) − 1/4 + (1/3 − 1/6) − 1/8 + (1/5 − 1/10) − 1/12 + ⋯
     = 1/2 − 1/4 + 1/6 − 1/8 + 1/10 − 1/12 + ⋯
     = (1/2)(1 − 1/2 + 1/3 − 1/4 + ⋯) = s/2.
Since s ≠ 0, s′ ≠ s. Hence, there exist rearrangements which converge, however to a different limit.
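The convergence of (2.26) to s/2 can be watched numerically. A Python sketch (illustrative; it uses the known value s = log 2 of the alternating harmonic series):

```python
import math

def rearranged_partial_sum(n_blocks):
    """Sum of n_blocks blocks 1/(2k-1) - 1/(4k-2) - 1/(4k) of rearrangement (2.26)."""
    s = 0.0
    for k in range(1, n_blocks + 1):
        s += 1.0 / (2 * k - 1) - 1.0 / (4 * k - 2) - 1.0 / (4 * k)
    return s

s = math.log(2)                       # sum of the original series (2.25)
s_prime = rearranged_partial_sum(200000)
assert abs(s_prime - s / 2) < 1e-5    # the rearrangement converges to s/2, not s
```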
(b) There are also rearrangements of (2.25) which diverge to +∞: take increasingly long blocks of positive terms, each block followed by a single negative term. The block of positive terms
  1/(2^n + 1) + 1/(2^n + 3) + ⋯ + 1/(2^{n+1} − 1)
consists of 2^{n−1} terms, each greater than 1/2^{n+1}, so its sum exceeds 2^{n−1} · 1/2^{n+1} = 1/4; for instance, 1/9 + 1/11 + 1/13 + 1/15 > 4/16 = 1/4. Since the subtracted terms 1/(2n + 2) tend to 0, each block together with its following negative term eventually contributes more than, say, 1/5; hence
the rearranged series diverges to +∞.
Without proof (see [Rud76, 3.54 Theorem]) we remark the following surprising theorem. It shows (together with Proposition 2.36) that the absolute convergence of a series is necessary and sufficient for every rearrangement to be convergent (to the same limit).
Proposition 2.35 Let Σ a_n be a series of real numbers which converges, but not absolutely. Suppose −∞ ≤ α ≤ β ≤ +∞. Then there exists a rearrangement Σ a′_n with partial sums s′_n such that
  liminf_{n→∞} s′_n = α,  limsup_{n→∞} s′_n = β.
Proposition 2.36 If Σ a_n is a series of complex numbers which converges absolutely, then every rearrangement of Σ a_n converges, and they all converge to the same sum.
Proof. Let Σ a_{σ(n)} be a rearrangement with partial sums s′_n. Given ε > 0, by the Cauchy criterion for the series Σ | a_n | there exists n_0 ∈ ℕ such that n ≥ m ≥ n_0 implies
  Σ_{k=m}^{n} | a_k | < ε.   (2.27)
Now choose p such that the integers 1, 2, …, n_0 are all contained in the set {σ(1), σ(2), …, σ(p)}. Then, if n ≥ p, the numbers a_1, a_2, …, a_{n_0} cancel in the difference s_n − s′_n, so that
  | s_n − s′_n | = | Σ_{k=1}^{n} a_k − Σ_{k=1}^{n} a_{σ(k)} | ≤ Σ_{k=n_0+1}^{N} | a_k | < ε
for a suitable N, by (2.27); hence (s′_n) converges to the same sum as (s_n).
Consider all products a_i b_k of the terms of two series Σ a_k and Σ b_k, arranged in a doubly infinite array, and let p_0, p_1, p_2, … be the diagonal enumeration of these products:
  p_0  p_1  p_3  …
  p_2  p_4  p_7  …
  p_5  p_8  p_{12} …
  ⋮    ⋮    ⋮
The question is: under which conditions on Σ a_n and Σ b_n does the product series converge, and when is its sum independent of the arrangement of the products a_i b_k?
Proposition 2.37 If both series Σ_{k=0}^∞ a_k and Σ_{k=0}^∞ b_k converge absolutely, with A = Σ_{k=0}^∞ a_k and B = Σ_{k=0}^∞ b_k, then any of their product series Σ p_k converges absolutely and Σ_{k=0}^∞ p_k = AB.
Proof. For the nth partial sum of any product series, Σ_{k=0}^{n} | p_k |, we have
  | p_0 | + | p_1 | + ⋯ + | p_n | ≤ (| a_0 | + ⋯ + | a_m |)(| b_0 | + ⋯ + | b_m |) ≤ Σ_{k=0}^∞ | a_k | · Σ_{k=0}^∞ | b_k |,
where m is so large that all factors of p_0, …, p_n occur among a_0, …, a_m and b_0, …, b_m. That is, any series Σ_{k=0}^∞ | p_k | is bounded and hence convergent by Lemma 2.19 (c). By Proposition 2.36 all product series converge to the same sum s = Σ_{k=0}^∞ p_k. Consider now the very special product series Σ q_k with partial sums consisting of the sums of the elements in the upper left squares of the array. Then
  q_1 + q_2 + ⋯ + q_{(n+1)²} = (a_0 + a_1 + ⋯ + a_n)(b_0 + ⋯ + b_n)
converges to s = AB.
Arranging the elements a_i b_j as above in a diagonal array and summing up the elements on the nth diagonal, c_n = a_0 b_n + a_1 b_{n−1} + ⋯ + a_n b_0, we obtain the Cauchy product
  Σ_{n=0}^∞ c_n = Σ_{n=0}^∞ (a_0 b_n + a_1 b_{n−1} + ⋯ + a_n b_0).
Corollary 2.38 If both series Σ_{k=0}^∞ a_k and Σ_{k=0}^∞ b_k converge absolutely, with A = Σ_{k=0}^∞ a_k and B = Σ_{k=0}^∞ b_k, their Cauchy product Σ_{k=0}^∞ c_k converges absolutely and Σ_{k=0}^∞ c_k = AB.
For | p | < 1 and | q | < 1, p ≠ q, the Cauchy product of the geometric series Σ p^n and Σ q^n has the terms c_n = Σ_{k=0}^{n} p^k q^{n−k} = (p^{n+1} − q^{n+1})/(p − q), and indeed
  (p − q)/(p − q) + (p² − q²)/(p − q) + (p³ − q³)/(p − q) + ⋯ = 1/(p − q) Σ_{n=1}^∞ (p^n − q^n)
  = 1/(p − q) · (p/(1 − p) − q/(1 − q)) = 1/(p − q) · (p(1 − q) − q(1 − p))/((1 − p)(1 − q))
  = 1/(1 − p) · 1/(1 − q).
For power series, the elements on the nth diagonal are
  (a_k z^k)(b_{n−k} z^{n−k}) = z^n a_k b_{n−k},  k = 0, …, n,
such that
  Σ_{k=0}^∞ a_k z^k · Σ_{k=0}^∞ b_k z^k = Σ_{n=0}^∞ (a_0 b_n + ⋯ + a_n b_0) z^n.
Corollary 2.39 Suppose that Σ_n a_n z^n and Σ_n b_n z^n are power series with positive radii of convergence R_1 and R_2, respectively. Let R = min{R_1, R_2}. Then the Cauchy product Σ_{n=0}^∞ c_n z^n, c_n = a_0 b_n + ⋯ + a_n b_0, converges absolutely for | z | < R and
  Σ_{n=0}^∞ a_n z^n · Σ_{n=0}^∞ b_n z^n = Σ_{n=0}^∞ c_n z^n,  | z | < R.
This follows from the previous corollary and the fact that both series converge absolutely for | z | < R.
Example 2.16 (a) We have
  Σ_{n=0}^∞ (n + 1) z^n = 1/(1 − z)²,  | z | < 1.
Indeed, consider the Cauchy product of Σ_{n=0}^∞ z^n = 1/(1 − z), | z | < 1, with itself. Since a_n = b_n = 1 and c_n = Σ_{k=0}^{n} a_k b_{n−k} = Σ_{k=0}^{n} 1 · 1 = n + 1, the claim follows.
(b) Similarly,
  (z + z²) Σ_{n=0}^∞ z^n = Σ_{n=0}^∞ (z^{n+1} + z^{n+2}) = Σ_{n=1}^∞ z^n + Σ_{n=2}^∞ z^n
  = z + 2 Σ_{n=2}^∞ z^n = z + 2z² + 2z³ + 2z⁴ + ⋯ = z + 2z²/(1 − z) = (z + z²)/(1 − z),
since Σ_{n=2}^∞ z^n = z²/(1 − z).
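Identity (a) is easy to confirm numerically for a complex z inside the unit disc (a Python sketch, added for illustration; the value of z is an arbitrary choice with | z | = 0.5):

```python
z = 0.3 + 0.4j                       # |z| = 0.5 < 1
N = 200                              # enough terms: tail ~ N * 0.5^N is negligible

series = sum((n + 1) * z ** n for n in range(N))
closed_form = 1.0 / (1.0 - z) ** 2   # claimed sum 1/(1-z)^2

assert abs(series - closed_form) < 1e-10
```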
Chapter 3
Functions and Continuity
This chapter is devoted to another central notion in analysis: the notion of a continuous function. We will see that sums, products, quotients, and compositions of continuous functions are continuous. If nothing else is specified, D will denote a finite union of intervals.
Example 3.1 (a) Power series (with radius of convergence R > 0), polynomials and rational
functions are the most important examples of functions.
Let c ∈ ℝ. Then f(x) = c, f: ℝ → ℝ, is called a constant function.
(b) The properties of a function change drastically if we change the domain or the image set. Let f: ℝ → ℝ, g: ℝ → ℝ_+, k: ℝ_+ → ℝ, h: ℝ_+ → ℝ_+ be the functions given by x ↦ x². Then g is surjective, k is injective, h is bijective, and f is neither injective nor surjective. Obviously, f↾ℝ_+ = k and g↾ℝ_+ = h.
(c) Let f(x) = Σ_{n=0}^∞ x^n, f: (−1, 1) → ℝ, and h(x) = 1/(1 − x), h: ℝ∖{1} → ℝ. Then h↾(−1, 1) = f.
[Figure: the graphs of the constant function f(x) = c, the identity f(x) = x, and the absolute value f(x) = | x |.]
[Figure: the ε–δ definition of the limit as x → x_0.]
  0 < | x − x_0 | < δ  and  | f(x) − A | ≤ ε.
Proposition 3.1 (sequences definition) Let f and x_0 be as above. Then lim_{x→x_0} f(x) = A if and only if for every sequence (x_n) with x_n ∈ (a, b), x_n ≠ x_0 for all n, and lim_{n→∞} x_n = x_0, we have lim_{n→∞} f(x_n) = A.
Proof. Suppose lim_{x→x_0} f(x) = A, and x_n → x_0 where x_n ≠ x_0 for all n. Given ε > 0 we find δ > 0 such that | f(x) − A | < ε if 0 < | x − x_0 | < δ. Since x_n → x_0, there is a positive integer n_0 such that n ≥ n_0 implies | x_n − x_0 | < δ. Therefore n ≥ n_0 implies | f(x_n) − A | < ε. That is, lim_{n→∞} f(x_n) = A.
Suppose to the contrary that the condition of the proposition is fulfilled but lim_{x→x_0} f(x) ≠ A. Then there is some ε > 0 such that for all δ = 1/n, n ∈ ℕ, there is an x_n ∈ (a, b) such that 0 < | x_n − x_0 | < 1/n, but | f(x_n) − A | ≥ ε. We have constructed a sequence (x_n), x_n ≠ x_0 and x_n → x_0 as n → ∞, such that lim_{n→∞} f(x_n) ≠ A, which contradicts our assumption. Hence lim_{x→x_0} f(x) = A.
Definition 3.3 (a) We write lim_{x→x_0+0} f(x) = A if for all sequences (x_n) with x_n > x_0 and lim_{n→∞} x_n = x_0, we have lim f(x_n) = A. Sometimes we use the notation f(x_0 + 0) in place of lim_{x→x_0+0} f(x). We call f(x_0 + 0) the right-hand limit of f at x_0; the left-hand limit f(x_0 − 0) is defined analogously.
(b) We write lim_{x→+∞} f(x) = A if for all sequences (x_n) with lim x_n = +∞ we have lim f(x_n) = A.
(c) Finally, the notions of (a), (b), and Definition 3.2 still make sense in case A = +∞ and A = −∞. For example,
  lim_{x→x_0−0} f(x) = −∞
if for all sequences (x_n) with x_n < x_0 and lim x_n = x_0 we have lim_{n→∞} f(x_n) = −∞.
Remark 3.1 All notions in the above definition can be given in ε–δ or ε–D or E–δ or E–D languages using inequalities. For example, lim_{x→x_0−0} f(x) = −∞ if and only if for every E > 0 there exists δ > 0 such that x_0 − δ < x < x_0 implies f(x) < −E.
For example, we show that lim_{x→0−0} 1/x = −∞. To E > 0 choose δ = 1/E. Then −δ < x < 0 implies 0 < −x < 1/E, hence 0 < E < −1/x, and therefore f(x) = 1/x < −E. This proves the claim.
Similarly, lim_{x→+∞} f(x) = +∞ if and only if for every E > 0 there exists D > 0 such that x > D implies f(x) > E.
Consider the floor function f(x) = [x]. For n ∈ ℤ we have
  lim_{x→n−0} f(x) = n − 1  and  lim_{x→n+0} f(x) = n.
Since the one-sided limits are different, lim_{x→n} f(x) does not exist.
Definition 3.4 Suppose we are given two functions f and g, both defined on (a, b)∖{x_0}. By f + g we mean the function which assigns to each point x ≠ x_0 of (a, b) the number f(x) + g(x). Similarly, we define the difference f − g, the product f g, and the quotient f/g, with the understanding that the quotient is defined only at those points x at which g(x) ≠ 0.
Proposition 3.2 Suppose that f and g are functions defined on (a, b)∖{x_0}, a < x_0 < b, and lim_{x→x_0} f(x) = A, lim_{x→x_0} g(x) = B, λ, μ ∈ ℝ. Then
(a) lim_{x→x_0} f(x) = A′ implies A′ = A (the limit is unique);
(b) lim_{x→x_0} (λf + μg)(x) = λA + μB;
(c) lim_{x→x_0} (f g)(x) = AB;
(d) lim_{x→x_0} (f/g)(x) = A/B, if B ≠ 0;
(e) lim_{x→x_0} | f(x) | = | A |.
Proof. In view of Proposition 3.1, all these assertions follow immediately from the analogous properties of sequences, see Proposition 2.3. As an example, we show (c). Let (x_n), x_n ≠ x_0, be a sequence tending to x_0. By assumption, lim_{n→∞} f(x_n) = A and lim_{n→∞} g(x_n) = B. By Proposition 2.3, lim_{n→∞} f(x_n) g(x_n) = AB, that is, lim_{n→∞} (f g)(x_n) = AB. By Proposition 3.1, the claim follows.
Remark 3.2 The proposition remains true if we replace (at the same time in all places) x → x_0 by x → x_0 + 0, x → x_0 − 0, x → +∞, or x → −∞. Moreover, we can replace A or B by +∞ or by −∞, provided the right members of (b), (c), (d), and (e) are defined.
Note that +∞ + (−∞), 0 · ∞, ∞/∞, and A/0 are not defined.
The extended real number system consists of the real field ℝ and two symbols, +∞ and −∞. We preserve the original order in ℝ and define
  −∞ < x < +∞
for every x ∈ ℝ.
It is then clear that +∞ is an upper bound of every subset of the extended real number system, and every nonempty subset has a least upper bound. If, for example, E is a set of real numbers which is not bounded above in ℝ, then sup E = +∞ in the extended real system. Exactly the same remarks apply to lower bounds.
The extended real system does not form a field, but it is customary to make the following conventions:
(a) If x is real, then
  x + ∞ = +∞,  x − ∞ = −∞,  x/(+∞) = x/(−∞) = 0.
Example 3.3 (a) For every polynomial p, lim_{x→a} p(x) = p(a). This immediately follows from lim_{x→a} x = a, lim_{x→a} c = c, and Proposition 3.2. Indeed, by (b) and (c), for p(x) = 3x³ − 4x + 7 we have lim_{x→a} (3x³ − 4x + 7) = 3 (lim_{x→a} x)³ − 4 lim_{x→a} x + 7 = 3a³ − 4a + 7 = p(a). This works for arbitrary polynomials. Suppose moreover that q(a) ≠ 0. Then by (d),
  lim_{x→a} p(x)/q(x) = p(a)/q(a).
Hence, the limit of a rational function f(x) as x approaches a point a of the domain of f is f(a).
(b) Let f(x) = p(x)/q(x) be a rational function with polynomials p(x) = Σ_{k=0}^{r} a_k x^k and q(x) = Σ_{k=0}^{s} b_k x^k with real coefficients a_k and b_k and of degrees r and s, respectively. Then
  lim_{x→+∞} f(x) = 0, if r < s;  a_r/b_s, if r = s;  +∞, if r > s and a_r/b_s > 0;  −∞, if r > s and a_r/b_s < 0.
The first two statements (r ≤ s) follow from Example 3.2 (b) together with Proposition 3.2: namely, a_k x^{k−r} → 0 as x → +∞ provided 0 ≤ k < r. The statements for r > s follow from x^{r−s} → +∞ as x → +∞ and the above remark.
Note that
  lim_{x→−∞} f(x) = (−1)^{r+s} lim_{x→+∞} f(x)
since
  p(−x)/q(−x) = ((−1)^r a_r x^r + ⋯)/((−1)^s b_s x^s + ⋯) = (−1)^{r+s} (a_r x^r + ⋯)/(b_s x^s + ⋯).   (3.1)
Example 3.4 (a) In Example 3.2 we have seen that every polynomial is continuous in ℝ and every rational function f is continuous on its domain D(f). Also, f(x) = | x | is continuous in ℝ.
(b) Continuity is a local property: If two functions f, g : D R coincide in a neighborhood
U (x0 ) D of some point x0 , then f is continuous at x0 if and only if g is continuous at x0 .
(c) f(x) = [x] is continuous in ℝ∖ℤ. If x_0 is not an integer, then n < x_0 < n + 1 for some n ∈ ℤ, and f(x) = n coincides with a constant function in a neighborhood U(x_0). By (b), f is continuous at x_0. If x_0 = n ∈ ℤ, lim_{x→n} [x] does not exist; hence f is not continuous at n.
(d) Let f(x) = (x² − 1)/(x − 1) if x ≠ 1 and f(1) = 1. Then f is not continuous at x_0 = 1 since
  lim_{x→1} (x² − 1)/(x − 1) = lim_{x→1} (x + 1) = 2 ≠ 1 = f(1).
There are two reasons for a function not being continuous at x_0: either lim_{x→x_0} f(x) does not exist, or f has a limit at x_0 but lim_{x→x_0} f(x) ≠ f(x_0).
Proposition 3.3 Suppose f, g : D R are continuous at x0 D. Then f + g and f g are also
continuous at x0 . If g(x0 ) 6= 0, then f /g is continuous at x0 .
The proof is obvious from Proposition 3.2.
The set C(D) of continuous functions on D ⊆ ℝ forms a commutative algebra with unit.
Proposition 3.4 Let f: D → ℝ and g: E → ℝ be functions with f(D) ⊆ E. Suppose f is continuous at a ∈ D, and g is continuous at b = f(a) ∈ E. Then the composite function g∘f: D → ℝ is continuous at a.
Proof. Let (xn ) be a sequence with xn D and limn xn = a. Since f is continuous
at a, limn f (xn ) = b. Since g is continuous at b, limn g(f (xn )) = g(b); hence
g f (xn ) g f (a). This completes the proof.
Example 3.5 f(x) = 1/x is continuous for x ≠ 0, and g(x) = sin x is continuous (see below); hence (g∘f)(x) = sin(1/x) is continuous on ℝ∖{0}.
The intermediate value theorem is clear from the graphical presentation. Nevertheless, it needs a proof, since pictures do not prove anything. The statement is wrong for rational numbers (see Remark 3.3 below).
Hence, γ = f(c).
Example 3.6 (a) We again show the existence of the nth root of a positive real number a > 0, n ∈ ℕ. By Example 3.2, the polynomial p(x) = x^n − a is continuous in ℝ. We find p(0) = −a < 0, and by Bernoulli's inequality
  p(1 + a) = (1 + a)^n − a ≥ 1 + na − a = 1 + (n − 1)a ≥ 1 > 0.
Theorem 3.5 shows that p has a root in the interval (0, 1 + a).
(b) A polynomial p of odd degree with real coefficients has a real zero. Namely, by Example 3.3, if the leading coefficient a_r of p is positive, lim_{x→−∞} p(x) = −∞ and lim_{x→+∞} p(x) = +∞. Hence there are a and b with a < b and p(a) < 0 < p(b). Therefore, there is a c ∈ (a, b) such that p(c) = 0.
There are polynomials of even degree having no real zeros, for example f(x) = x^{2k} + 1.
Remark 3.3 Theorem 3.5 is not true for continuous functions f: ℚ → ℝ. For example, f(x) = x² − 2 is continuous, f(0) = −2 < 0 < 2 = f(2). However, there is no r ∈ ℚ with f(r) = 0.
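The proof of the intermediate value theorem by repeated halving of the interval is also an algorithm. A Python sketch (illustrative; the polynomials are arbitrary choices) finds the roots from Example 3.6:

```python
def bisect_root(f, a, b, tol=1e-12):
    """Bisection for continuous f with f(a) < 0 < f(b): returns c with f(c) ~ 0."""
    while b - a > tol:
        m = 0.5 * (a + b)
        if f(m) < 0:
            a = m
        else:
            b = m
    return 0.5 * (a + b)

# (a) the n-th root of a > 0 is the zero of p(x) = x^n - a in (0, 1 + a)
a_val, n = 5.0, 3
root = bisect_root(lambda x: x ** n - a_val, 0.0, 1.0 + a_val)
assert abs(root ** n - a_val) < 1e-9

# (b) an odd-degree polynomial has a real zero, e.g. p(x) = x^3 - 4x + 1
p = lambda x: x ** 3 - 4.0 * x + 1.0          # p(-3) < 0 < p(3)
c = bisect_root(p, -3.0, 3.0)
assert abs(p(c)) < 1e-9
```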
3.2.2 Continuous Functions on Bounded and Closed Intervals: The Theorem about Maximum and Minimum
We say that f: [a, b] → ℝ is continuous if f is continuous on (a, b) and f(a + 0) = f(a) and f(b − 0) = f(b).
Theorem 3.6 (Theorem about Maximum and Minimum) Let f: [a, b] → ℝ be continuous. Then f is bounded and attains its maximum and its minimum; that is, there exists C > 0 with | f(x) | ≤ C for all x ∈ [a, b], and there exist p, q ∈ [a, b] with
  sup_{a≤x≤b} f(x) = max_{a≤x≤b} f(x) = f(p)  and  inf_{a≤x≤b} f(x) = min_{a≤x≤b} f(x) = f(q).
Remarks 3.4 (a) The theorem is not true in case of open, half-open, or infinite intervals. For example, f: (0, 1] → ℝ, f(x) = 1/x, is continuous but not bounded. The function f: (0, 1) → ℝ, f(x) = x, is continuous and bounded; however, it attains neither maximum nor minimum. Finally, f(x) = x² on ℝ_+ is continuous but not bounded.
(b) Put M := max_{x∈[a,b]} f(x) and m := min_{x∈[a,b]} f(x). By the Theorem about maximum and minimum and the intermediate value theorem, for all μ ∈ ℝ with m ≤ μ ≤ M there exists c ∈ [a, b] such that f(c) = μ; that is, f attains all values between m and M.
Proof. We give the proof in case of the maximum. Replacing f by −f yields the proof for the minimum. Let
  A = sup_{a≤x≤b} f(x) ∈ ℝ ∪ {+∞}.
(Note that A = +∞ is equivalent to f not being bounded above.) Then there exists a sequence (x_n) ⊆ [a, b] such that lim_{n→∞} f(x_n) = A. Since (x_n) is bounded, by the Theorem of Bolzano–Weierstraß there exists a convergent subsequence (x_{n_k}) with p = lim_{k→∞} x_{n_k} and a ≤ p ≤ b. Since f is continuous,
  f(p) = lim_{k→∞} f(x_{n_k}) = A.   (3.2)
In particular, A is a finite real number; that is, f is bounded above by A, and f attains its maximum A at the point p ∈ [a, b].
Proof. Suppose to the contrary that f is not uniformly continuous. Then there exists ε_0 > 0 without matching δ > 0; for every positive integer n ∈ ℕ there exists a pair of points x_n, x′_n with | x_n − x′_n | < 1/n but | f(x_n) − f(x′_n) | ≥ ε_0. Since [a, b] is bounded and closed, (x_n) has a subsequence (x_{n_k}) converging to some point p ∈ [a, b]. Since | x_n − x′_n | < 1/n, the sequence (x′_{n_k}) also converges to p. Hence
  lim_{k→∞} (f(x_{n_k}) − f(x′_{n_k})) = f(p) − f(p) = 0,
which contradicts | f(x_{n_k}) − f(x′_{n_k}) | ≥ ε_0 for all k.
There exists an example of a bounded continuous function f : [0, 1) R which is not uniformly continuous, see [Kon90, p. 91].
Discontinuities
If x is a point in the domain of a function f at which f is not continuous, we say f is discontinuous at x or f has a discontinuity at x. It is customary to divide discontinuities into two
types.
Definition 3.7 Let f : (a, b) R be a function which is discontinuous at a point x0 . If the
one-sided limits limxx0 +0 f (x) and limxx0 0 f (x) exist, then f is said to have a simple discontinuity or a discontinuity of the first kind. Otherwise the discontinuity is said to be of the
second kind.
Example 3.7 (a) f(x) = sign(x) is continuous on ℝ∖{0} since it is locally constant there. Moreover, f(0 + 0) = 1 and f(0 − 0) = −1. Hence, sign(x) has a simple discontinuity at x_0 = 0.
(b) Define f(x) = 0 if x is rational, and f(x) = 1 if x is irrational. Then f has a discontinuity of the second kind at every point x, since neither f(x + 0) nor f(x − 0) exists.
(c) Define
  f(x) = sin(1/x), if x ≠ 0;  f(x) = 0, if x = 0.
Choose, say, x_n = 1/(π/2 + 2πn) and y_n = 1/(πn). Then both sequences (x_n) and (y_n) approach 0 from above, but lim_{n→∞} f(x_n) = 1 and lim_{n→∞} f(y_n) = 0; hence f(0 + 0) does not exist. Therefore f has a discontinuity of the second kind at x = 0. We have not yet shown that sin x is a continuous function. This will be done in Section 3.5.2.
[Theorem 3.8, equations (3.3) and (3.4): a monotonically increasing function f: (a, b) → ℝ has one-sided limits at every point, namely f(x − 0) = sup_{a<t<x} f(t) and f(x + 0) = inf_{x<t<b} f(t); the proof is given in Appendix B.]
Hence x = g(y) = n y is continuous, too. This gives an alternative proof of homework 5.5.
The exponential function is defined by the series
  E(z) = Σ_{n=0}^∞ z^n/n! = 1 + z + z²/2 + z³/6 + ⋯ .   (3.5)
Note that E(0) = 1 and E(1) = e by the definition of e. The radius of convergence of the exponential series (3.5) is R = +∞, i.e. the series converges absolutely for all z ∈ ℂ, see Example 2.13 (c).
Applying Proposition 2.31 (Cauchy product) on multiplication of absolutely convergent series, we obtain
  E(z)E(w) = Σ_{n=0}^∞ z^n/n! · Σ_{m=0}^∞ w^m/m! = Σ_{n=0}^∞ Σ_{k=0}^{n} z^k w^{n−k}/(k! (n − k)!)
           = Σ_{n=0}^∞ (1/n!) Σ_{k=0}^{n} (n choose k) z^k w^{n−k} = Σ_{n=0}^∞ (z + w)^n/n! = E(z + w),  z, w ∈ ℂ.   (3.6)
In particular, E(z)E(−z) = E(0) = 1, so that
  E(−z) = 1/E(z),  z ∈ ℂ.   (3.7)
This shows that E(z) ≠ 0 for all z. By (3.5), E(x) > 0 if x > 0; hence (3.7) shows E(x) > 0 for all real x.
Iteration of (3.6) gives
  E(z_1 + ⋯ + z_n) = E(z_1) ⋯ E(z_n).   (3.8)
Taking z_1 = ⋯ = z_n = 1, it follows that
  E(n) = e^n,  n ∈ ℕ.   (3.9)
Similarly, for p = m/n with m, n ∈ ℕ,
  E(p)^n = E(np) = E(m) = e^m,   (3.10)
so that
  E(p) = e^p,  p ∈ ℚ_+.   (3.11)
It follows from (3.7) that E(−p) = e^{−p} if p is positive and rational. Thus (3.11) holds for all rational p. This justifies the redefinition
  e^x := E(x),  x ∈ ℂ.
The notation exp(x) is often used in place of e^x.
The remainder r_n(z) = Σ_{k=n}^∞ z^k/k! can be estimated as follows: we have
  | r_n(z) | ≤ (| z |^n/n!) (1 + | z |/(n + 1) + | z |²/((n + 1)(n + 2)) + ⋯ + | z |^k/((n + 1) ⋯ (n + k)) + ⋯)
           ≤ (| z |^n/n!) (1 + | z |/(n + 1) + | z |²/(n + 1)² + ⋯ + | z |^k/(n + 1)^k + ⋯),
and | z | ≤ (n + 1)/2 implies
  | r_n(z) | ≤ (| z |^n/n!) (1 + 1/2 + 1/4 + ⋯ + 1/2^k + ⋯) = 2 | z |^n/n!.   (3.12)
In particular, | e^z − 1 | = | r_1(z) | ≤ 2 | z | if | z | ≤ 1, and
  | (e^z − 1)/z − 1 | = | r_2(z) |/| z | ≤ | z | if | z | ≤ 3/2,
so that
  lim_{z→0} (e^z − 1)/z = 1.
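Bound (3.12) can be tested numerically against the library exponential; a Python sketch (added for illustration; z and n are arbitrary choices with | z | ≤ (n + 1)/2):

```python
import cmath
import math

def exp_partial(z, n):
    """s_n(z) = sum_{k<n} z^k/k!, so that exp(z) = s_n(z) + r_n(z)."""
    return sum(z ** k / math.factorial(k) for k in range(n))

z, n = 0.8 + 0.6j, 6          # |z| = 1 <= (n + 1)/2 = 3.5
r_n = cmath.exp(z) - exp_partial(z, n)

# remainder bound (3.12): |r_n(z)| <= 2 |z|^n / n!
assert abs(r_n) <= 2 * abs(z) ** n / math.factorial(n)
```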
By (3.5), lim_{x→+∞} E(x) = +∞; hence (3.7) shows that lim_{x→−∞} E(x) = 0. By (3.5), 0 < x < y implies that E(x) < E(y); by (3.7), it follows that E(−y) < E(−x); hence, E is strictly increasing on the whole real axis.
The addition formula also shows that
  lim_{h→0} (E(z + h) − E(z)) = E(z) lim_{h→0} (E(h) − 1) = E(z) · 0 = 0,   (3.13)
where lim_{h→0} E(h) = 1 directly follows from Example 3.9. Hence, E(z) is continuous for all z.
Proposition 3.11 Let e^x be defined on ℝ by the power series (3.5). Then
(a) e^x is continuous for all x;
(b) e^x is a strictly increasing function and e^x > 0;
(c) e^{x+y} = e^x e^y;
(d) lim_{x→+∞} e^x = +∞, lim_{x→−∞} e^x = 0;
(e) lim_{x→+∞} x^n/e^x = 0 for every n ∈ ℕ.
Proof. We have already proved (a) to (d); (3.5) shows that
  e^x > x^{n+1}/(n + 1)!  for x > 0,  hence  x^n/e^x < (n + 1)!/x,
and (e) follows. Part (e) shows that e^x tends faster to +∞ than any power of x, as x → +∞.
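The inequality used in this proof is easy to confirm for concrete values (a Python sketch, purely illustrative):

```python
import math

# e^x > x^(n+1)/(n+1)! for x > 0, hence x^n/e^x < (n+1)!/x  -> 0 as x -> +infinity
n = 10
for x in (50.0, 100.0, 200.0):
    assert math.exp(x) > x ** (n + 1) / math.factorial(n + 1)
    assert x ** n / math.exp(x) < math.factorial(n + 1) / x
```

For x = 200 the quotient x^10/e^x is already below 10^{−60}, illustrating how fast the decay in (e) is.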
The inverse function of e^x, the logarithm log: (0, +∞) → ℝ, is defined by
  e^{log y} = y,  y > 0,   (3.14)
or, equivalently, by
  log(e^x) = x,  x ∈ ℝ.   (3.15)
The addition formula translates into
  log(uv) = log u + log v,  u > 0, v > 0.   (3.16)
This shows that log has the familiar property which makes the logarithm useful for computations. Another customary notation for log x is ln x. Proposition 3.11 shows that
  lim_{x→+∞} log x = +∞,  lim_{x→0+0} log x = −∞.   (3.17)
For x > 0 and m ∈ ℤ we have
  x^m = e^{m log x},   (3.18)
and more generally
  x^α = e^{α log x}   (3.19)
for any rational α. We now define x^α, for any real α and x > 0, by (3.19). In the same way, we redefine the exponential function
  a^x = e^{x log a},  a > 0,  x ∈ ℝ.
It turns out that in case a ≠ 1, f(x) = a^x is strictly monotonic and continuous since e^x is so. Hence, f has a strictly monotonic continuous inverse function log_a: (0, +∞) → ℝ defined by
  log_a(a^x) = x,  x ∈ ℝ,  and  a^{log_a x} = x,  x > 0.
[Figure: the graphs of sine and cosine.]
We define
  cos z = (e^{iz} + e^{−iz})/2,  sin z = (e^{iz} − e^{−iz})/(2i),   (3.20)
such that
  e^{iz} = cos z + i sin z  (Euler formula).   (3.21)
Proposition 3.13 (a) The functions sin z and cos z can be written as power series which converge absolutely for all z ∈ ℂ:
  cos z = Σ_{n=0}^∞ (−1)^n z^{2n}/(2n)! = 1 − z²/2 + z⁴/4! − z⁶/6! + ⋯ ,
  sin z = Σ_{n=0}^∞ (−1)^n z^{2n+1}/(2n+1)! = z − z³/3! + z⁵/5! − + ⋯ .   (3.22)
(b) sin x and cos x are real valued and continuous on ℝ, where cos x is an even and sin x is an odd function, i.e. cos(−x) = cos x, sin(−x) = −sin x. We have
  sin² x + cos² x = 1,   (3.23)
and the addition laws
  cos(x + y) = cos x cos y − sin x sin y,  sin(x + y) = sin x cos y + cos x sin y.   (3.24)
Proof. (a) Inserting iz into (3.5) in place of z and using (i^n)_{n≥1} = (i, −1, −i, 1, i, −1, …), we have
  e^{iz} = Σ_{n=0}^∞ i^n z^n/n! = Σ_{k=0}^∞ (−1)^k z^{2k}/(2k)! + i Σ_{k=0}^∞ (−1)^k z^{2k+1}/(2k+1)!,
and similarly
  e^{−iz} = Σ_{n=0}^∞ (−i)^n z^n/n! = Σ_{k=0}^∞ (−1)^k z^{2k}/(2k)! − i Σ_{k=0}^∞ (−1)^k z^{2k+1}/(2k+1)!.
Inserting these two series into (3.20) gives (3.22).
(b) Since the coefficients in (3.22) are real, cos x and sin x are real valued for real x, and as power series with infinite radius of convergence they are continuous. For real x,
  | e^{ix} | = 1.   (3.25)
On the other hand, the Euler formula and the fact that cos x and sin x are real give
  1 = | e^{ix} |² = | cos x + i sin x |² = cos² x + sin² x.
Hence, e^{ix} = cos x + i sin x is a point on the unit circle in the complex plane, and cos x and sin x are its coordinates. This establishes the equivalence between the old definition of cos x as the length of the adjacent side in a right triangle with hypotenuse 1 and angle x · 180°/π, and the power series definition of cos x. The only missing link is: the length of the arc from 1 to e^{ix} is x.
It follows directly from the definition that cos(−z) = cos z and sin(−z) = −sin z for all z ∈ ℂ. The addition laws for sin x and cos x follow from (3.6) applied to e^{i(x+y)}. This completes the proof of (b).
Lemma 3.14 There exists a unique number γ ∈ (0, 2) such that cos γ = 0. We define the number π by
  π = 2γ.   (3.26)
Lemma 3.15 (a) For 0 < x ≤ √6 we have
  x − x³/6 < sin x < x  and  x cos x < sin x;   (3.27)–(3.28)
(b) 0 < x cos x < sin x implies
  cos² x < 1/(1 + x²);   (3.29)–(3.30)
(c) cos x is strictly decreasing on [0, π], whereas sin x is strictly increasing on [−π/2, π/2].
In particular, the sandwich theorem applied to statement (a), 1 − x²/6 < sin x/x < 1 as x → 0 + 0, gives lim_{x→0+0} sin x/x = 1. Since sin x/x is an even function, this implies lim_{x→0} sin x/x = 1.
The proof of the lemma is in the Appendix B to this chapter.
Proof of Lemma 3.14. cos 0 = 1. By Lemma 3.15, cos² 1 < 1/2. By the double angle formula for cosine, cos 2 = 2 cos² 1 − 1 < 0. By continuity of cos x and Theorem 3.5, cos has a zero in the interval (0, 2).
By the addition laws,
  cos x − cos y = −2 sin((x + y)/2) sin((x − y)/2).
So that, by Lemma 3.15, 0 < x < y < 2 implies 0 < sin((x + y)/2) and sin((x − y)/2) < 0; therefore cos x > cos y. Hence, cos x is strictly decreasing on (0, 2). The zero is therefore unique.
By definition, cos(π/2) = 0, and (3.23) shows sin(π/2) = ±1. By (3.27), sin(π/2) = 1. Thus e^{iπ/2} = i, and the addition formula for e^z gives
  e^{πi} = −1,  e^{2πi} = 1;   (3.31)
hence,
  e^{z+2πi} = e^z,  z ∈ ℂ.   (3.32)
Tangent and Cotangent
For x ≠ π/2 + kπ, k ∈ ℤ, define
  tan x = sin x/cos x.   (3.33)
For x ≠ kπ, k ∈ ℤ, define
  cot x = cos x/sin x.   (3.34)
Lemma 3.17 (a) tan x is continuous at x ∈ ℝ∖{π/2 + kπ | k ∈ ℤ}, and tan(x + π) = tan x;
(b) lim_{x→π/2−0} tan x = +∞, lim_{x→−π/2+0} tan x = −∞;
(c) tan x is strictly increasing on (−π/2, π/2).
Proof. (a) is clear by Proposition 3.3 since sin x and cos x are continuous. We show only (c) and leave (b) as an exercise. Let 0 < x < y < π/2. Then 0 < sin x < sin y and cos x > cos y > 0. Therefore
  tan x = sin x/cos x < sin y/cos y = tan y.
Hence, tan is strictly increasing on (0, π/2). Since tan(−x) = −tan(x), tan is strictly increasing on the whole interval (−π/2, π/2).
[Figure: the graphs of arctan and arccot; equations (3.35)–(3.38) define the inverse functions arctan: ℝ → (−π/2, π/2) and arccot: ℝ → (0, π).]
The functions
  sinh x = (e^x − e^{−x})/2,   (3.39)
  cosh x = (e^x + e^{−x})/2,   (3.40)
  tanh x = sinh x/cosh x = (e^x − e^{−x})/(e^x + e^{−x}),   (3.41)
  coth x = cosh x/sinh x = (e^x + e^{−x})/(e^x − e^{−x})   (3.42)
are called hyperbolic sine, hyperbolic cosine, hyperbolic tangent, and hyperbolic cotangent, respectively.
[Figure: the graphs of sinh x and cosh x.]
There are many analogies between these functions and their ordinary trigonometric counterparts.
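For instance, the identity cosh² x − sinh² x = 1 plays the role of (3.23). The analogies are easy to verify numerically with the library functions (a Python sketch, added for illustration):

```python
import math

for x in (-2.0, -0.5, 0.0, 0.7, 3.0):
    s, c = math.sinh(x), math.cosh(x)
    # the hyperbolic analogue of sin^2 + cos^2 = 1:
    assert abs(c * c - s * s - 1.0) < 1e-12
    # definitions (3.39)/(3.41) in terms of the exponential:
    assert abs(s - (math.exp(x) - math.exp(-x)) / 2) < 1e-12 * max(1.0, abs(s))
    assert abs(math.tanh(x) - s / c) < 1e-12
```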
[Figure: the graphs of the hyperbolic tangent and the hyperbolic cotangent.]
The functions sinh x and tanh x are strictly increasing with sinh(R) = R and tanh(R) =
(1, 1). Hence, their inverse functions are defined on R and on (1, 1), respectively, and are
also strictly increasing and continuous. The function
  arsinh: ℝ → ℝ   (3.43)
is the inverse of sinh; explicitly,
  arsinh x = log(x + √(x² + 1)),  artanh x = (1/2) log((1 + x)/(1 − x)),  −1 < x < 1.   (3.44)–(3.46)
3.6 Appendix B
3.6.1 Monotonic Functions have One-Sided Limits
Proof of Theorem 3.8. By hypothesis, the set {f(t) | a < t < x} is bounded above by f(x), and therefore has a least upper bound which we shall denote by A. Evidently A ≤ f(x). We have to show that f(x − 0) = A. Given ε > 0, there exists δ > 0 such that
  A − ε < f(t) ≤ A   (3.47)
if x − δ < t < x. Since f is monotonically increasing,
  A − ε < f(t) ≤ A  for all t with  x − δ < t < x.   (3.48)
Hence f(x − 0) = A.
The second half of (3.3) is proved in precisely the same way. Next, if a < x < y < b, we see from (3.3) that
  f(x + 0) = inf_{x<t<b} f(t) = inf_{x<t<y} f(t).   (3.49)
The last equality is obtained by applying (3.3) to (a, y) instead of (a, b). Similarly,
  f(y − 0) = sup_{a<t<y} f(t) = sup_{x<t<y} f(t).   (3.50)
0 < x < 2 implies 1 − x²/2 > 0 and, moreover, 1/(2n)! − x²/(2n + 2)! > 0 for all n ∈ ℕ; hence C(x) > 0.
By (3.22),
  sin x = x (1 − x²/3!) + x⁵ (1/5! − x²/7!) + ⋯ .
Now,
  1 − x²/3! > 0 ⟺ x² < 6,  1/5! − x²/7! > 0 ⟺ x² < 42,  … ,
and we obtain sin x < x if 0 < x² < 20. Finally we have to check whether sin x − x cos x > 0; equivalently, whether
  0 < x³ (1/2! − 1/3!) − x⁵ (1/4! − 1/5!) + x⁷ (1/6! − 1/7!) − + ⋯ ,
that is,
  0 < x³ · 2/3! − x⁵ · 4/5! + x⁷ · 6/7! − x⁹ · 8/9! + − ⋯ .
Grouping consecutive terms in pairs, this holds since
  x^{2n+1} (2n/(2n + 1)! − x² (2n + 2)/(2n + 3)!) > 0
for all n ∈ ℕ as long as x² < 2n(2n + 3), in particular for 0 < x² ≤ 10. This completes the proof of (a).
(b) Using (3.23), we get
  0 < x cos x < sin x ⟹ 0 < x² cos² x < sin² x ⟹ x² cos² x + cos² x < 1 ⟹ cos² x < 1/(1 + x²).
(c) In the proof of Lemma 3.14 we have seen that cos x is strictly decreasing in (0, π/2). By (3.23), sin x = √(1 − cos² x) is strictly increasing there. Since sin x is an odd function, sin x is strictly increasing on [−π/2, π/2]. Since cos x = −sin(x − π/2), the statement for cos x follows.
Proposition 3.21 For x ∈ ℝ,
  cos x = Σ_{k=0}^{n} (−1)^k x^{2k}/(2k)! + r_{2n+2}(x),   (3.51)
  sin x = Σ_{k=0}^{n} (−1)^k x^{2k+1}/(2k+1)! + r_{2n+3}(x),   (3.52)
where
  | r_{2n+2}(x) | ≤ | x |^{2n+2}/(2n + 2)!  if  | x | ≤ 2n + 3,   (3.53)
  | r_{2n+3}(x) | ≤ | x |^{2n+3}/(2n + 3)!  if  | x | ≤ 2n + 4.   (3.54)
Proof. Put
  a_k := x^{2k}/((2n + 3)(2n + 4) ⋯ (2n + 2(k + 1))).
Then we have, by definition,
  r_{2n+2}(x) = ± x^{2n+2}/(2n + 2)! · (1 − a_1 + a_2 − + ⋯).
Since
  a_k = a_{k−1} · x²/((2n + 2k + 1)(2n + 2k + 2)),
| x | ≤ 2n + 3 implies
  1 > a_1 > a_2 > ⋯ > 0,
and finally, as in the proof of the Leibniz criterion,
  0 ≤ 1 − a_1 + a_2 − a_3 + − ⋯ ≤ 1.
Hence, | r_{2n+2}(x) | ≤ | x |^{2n+2}/(2n + 2)!. The estimate for the remainder of the sine series is similar.
This is an application of Proposition 3.21. For numerical calculations it is convenient to use the following (nested) order of operations:
  cos x = ((⋯((−x²/(2n(2n − 1)) + 1)(−x²/((2n − 2)(2n − 3))) + 1)⋯)(−x²/2) + 1) + r_{2n+2}(x).
For n = 7 this reads
  cos x = ((((((−x²/182 + 1)(−x²/132) + 1)(−x²/90) + 1)(−x²/56) + 1)(−x²/30) + 1)(−x²/12) + 1)(−x²/2) + 1 + r_{16}(x).
By Proposition 3.21,
  | r_{16}(x) | ≤ | x |^{16}/16! ≤ 0.9 · 10^{−10}  if  | x | ≤ 1.6.
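The nested evaluation above translates directly into a loop; a Python sketch (illustrative), compared against the library cosine:

```python
import math

def cos_nested(x):
    """cos x via the nested form up to x^14; |r_16(x)| <= |x|^16/16! for |x| <= 1.6."""
    x2 = x * x
    t = -x2 / 182 + 1              # innermost factor: 182 = 14 * 13
    for d in (132, 90, 56, 30, 12, 2):
        t = t * (-x2 / d) + 1      # d = (2k)(2k - 1) for k = 6, 5, ..., 1
    return t

for x in (0.3, 1.0, 1.5, 1.6):
    assert abs(cos_nested(x) - math.cos(x)) < 1e-9
```

Note how the scheme needs only one multiplication and one division per term, and it sums the small terms first, which is favorable for rounding.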
Now we compute cos x for two values of x which are close to π/2 and obtain an approximation of π/2 by linear interpolation between x = 1.5 and x = 1.6:
  a = 1.5 + 0.1 · cos 1.5/(cos 1.5 − cos 1.6) = 1.57078… .
Chapter 4
Differentiation
4.1 The Derivative of a Function
We define the derivative of a function and prove the main properties like product, quotient and
chain rule. We relate the derivative of a function with the derivative of its inverse function. We
prove the mean value theorem and consider local extrema. Taylors theorem will be formulated.
Definition 4.1 Let f: (a, b) → ℝ be a function and x_0 ∈ (a, b). If the limit
  lim_{x→x_0} (f(x) − f(x_0))/(x − x_0)   (4.1)
exists, f is said to be differentiable at x_0, and the limit is denoted by
  f′(x_0) = df(x_0)/dx = (d/dx) f(x_0).
Remarks 4.1 (a) Replacing x − x_0 by h, we see that f′(x_0) = lim_{h→0} (f(x_0 + h) − f(x_0))/h.
(b) The limits
  lim_{h→0−0} (f(x_0 + h) − f(x_0))/h,  lim_{h→0+0} (f(x_0 + h) − f(x_0))/h
are called left-hand and right-hand derivatives of f at x_0, respectively. In particular, for f: [a, b] → ℝ we can consider the right-hand derivative at a and the left-hand derivative at b.
Example 4.1 (a) For f(x) = c, the constant function,

f′(x₀) = lim_{x→x₀} (f(x) − f(x₀))/(x − x₀) = lim_{x→x₀} (c − c)/(x − x₀) = 0.

(b) For f(x) = x,

f′(x₀) = lim_{x→x₀} (x − x₀)/(x − x₀) = 1.

(c) The slope of the tangent line. Given a function f : (a,b) → R which is differentiable at x₀, f′(x₀) is the slope of the tangent line to the graph of f through the point (x₀, f(x₀)). The slope of the secant line through (x₀, f(x₀)) and (x₁, f(x₁)) is

m = tan α₁ = (f(x₁) − f(x₀))/(x₁ − x₀).

[Figure: graph of f with the secant line through (x₀, f(x₀)) and (x₁, f(x₁)), making the angle α₁ with the x-axis.]
Every function which is differentiable at x₀ is also continuous at x₀. The converse is not true. For example, f(x) = |x| is continuous on R but differentiable only on R \ {0}, since

lim_{h→0+0} |h|/h = 1   whereas   lim_{h→0−0} |h|/h = −1.

Later we will become acquainted with a function which is continuous on the whole line without being differentiable at any point!
Proposition 4.2 Let f : (r,s) → R be a function and a ∈ (r,s). Then f is differentiable at a if and only if there exist a number c ∈ R and a function φ defined in a neighborhood of a such that

f(x) = f(a) + (x − a)c + φ(x),        (4.2)

where

lim_{x→a} φ(x)/(x − a) = 0.        (4.3)

In this case c = f′(a); indeed, for φ(x) = f(x) − f(a) − (x − a)f′(a),

lim_{x→a} φ(x)/(x − a) = lim_{x→a} ( (f(x) − f(a))/(x − a) − f′(a) ) = 0.

The proposition says that a function f differentiable at a can be approximated by a linear function, in our case by

y = f(a) + (x − a)f′(a).

The graph of this linear function is the tangent line to the graph of f at the point (a, f(a)). Later we will use this point of view to define differentiability of functions f : Rⁿ → Rᵐ.

[Figure: the tangent line to the graph of f at P₀ = (x₀, f(x₀)); for a point P(x,y) on the tangent line its slope is f′(x₀) = tan α = (y − y₀)/(x − x₀).]

This linear function is called the linearization of f at x₀. It is also the Taylor polynomial of degree 1 of f at x₀, see Section 4.5 below.
Proposition 4.3 Suppose f and g are defined on (a,b) and are differentiable at a point x ∈ (a,b). Then f + g, f·g, and f/g (the latter provided g(x) ≠ 0) are differentiable at x, and

(a) (f + g)′(x) = f′(x) + g′(x);
(b) (f·g)′(x) = f′(x)g(x) + f(x)g′(x);
(c) (f/g)′(x) = (f′(x)g(x) − f(x)g′(x))/g(x)².

Proof (of (b) and (c)). For (b) write

(f(t)g(t) − f(x)g(x))/(t − x) = f(t)·(g(t) − g(x))/(t − x) + g(x)·(f(t) − f(x))/(t − x).

Noting that f(t) → f(x) as t → x, (b) follows.

Next let h = f/g. Then

(h(t) − h(x))/(t − x) = (f(t)/g(t) − f(x)/g(x))/(t − x) = (f(t)g(x) − f(x)g(t))/(g(x)g(t)(t − x))
  = 1/(g(t)g(x)) · (f(t)g(x) − f(x)g(x) + f(x)g(x) − f(x)g(t))/(t − x)
  = 1/(g(t)g(x)) · ( g(x)·(f(t) − f(x))/(t − x) − f(x)·(g(t) − g(x))/(t − x) ).

Letting t → x, (c) follows.
Example 4.2 (a) For the exponential function,

(e^x)′ = lim_{h→0} (e^{x+h} − e^x)/h = e^x · lim_{h→0} (e^h − 1)/h = e^x.        (4.4)

(b) (sin x)′ = cos x. Since sin(x + h) − sin x = 2 cos(x + h/2) sin(h/2), cos x is continuous, and lim_{h→0} (sin h)/h = 1 (by the argument after Lemma 3.15), we obtain (sin x)′ = cos x.
(c) The proof of (cos x)′ = −sin x is analogous.
(d) (tan x)′ = 1/cos²x. Using the quotient rule for the function tan x = sin x/cos x we have

(tan x)′ = ((sin x)′·cos x − sin x·(cos x)′)/cos²x = (cos²x + sin²x)/cos²x = 1/cos²x.
The next proposition deals with composite functions and is probably the most important statement about derivatives.

Proposition 4.4 (Chain Rule) Let g be differentiable at x₀ and let f be differentiable at y₀ = g(x₀). Then f∘g is differentiable at x₀ with

(f∘g)′(x₀) = f′(g(x₀))·g′(x₀).        (4.5)

Proof. We have

(f(g(x)) − f(g(x₀)))/(x − x₀) = (f(g(x)) − f(g(x₀)))/(g(x) − g(x₀)) · (g(x) − g(x₀))/(x − x₀)
  → lim_{y→y₀} (f(y) − f(y₀))/(y − y₀) · g′(x₀) = f′(y₀)·g′(x₀)   as x → x₀.

Here we used that y = g(x) tends to y₀ = g(x₀) as x → x₀, since g is continuous at x₀.
Proposition 4.5 Let f : (a,b) → R be strictly monotonic and continuous. Suppose f is differentiable at x with f′(x) ≠ 0. Then the inverse function g = f⁻¹ : f((a,b)) → R is differentiable at y = f(x) with

g′(y) = 1/f′(x) = 1/f′(g(y)).        (4.6)

Proof. Let (y_n) ⊆ f((a,b)) be a sequence with y_n → y and y_n ≠ y for all n. Put x_n = g(y_n). Since g is continuous (by Corollary 3.9), lim_{n→∞} x_n = x. Since g is injective, x_n ≠ x for all n. We have

lim_{n→∞} (g(y_n) − g(y))/(y_n − y) = lim_{n→∞} (x_n − x)/(f(x_n) − f(x)) = lim_{n→∞} 1/((f(x_n) − f(x))/(x_n − x)) = 1/f′(x).

Example 4.3 (a) For y = e^x the inverse function is g(y) = log y, and

(log y)′ = 1/(e^x)′ = 1/e^x = 1/y.

(c) x^α = e^{α log x}. Hence, (x^α)′ = (e^{α log x})′ = e^{α log x}·α·(1/x) = α·x^{α−1}.
(d) Suppose f > 0 and g = log f. Then g′ = f′/f; hence f′ = f·g′.
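Formula (4.6) is easy to check numerically: a symmetric difference quotient of the inverse function should match 1/f′(g(y)). A small Python sketch (the helper name `num_deriv` is ours):

```python
import math

def num_deriv(f, x, h=1e-6):
    # symmetric difference quotient approximating f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

# f(x) = e^x with inverse g(y) = log y: Proposition 4.5 gives g'(y) = 1/y
y = 2.0
assert abs(num_deriv(math.log, y) - 1.0 / y) < 1e-8

# f(x) = sin x on (-pi/2, pi/2) with inverse g = arcsin:
# g'(y) = 1/cos(arcsin y)
y = 0.3
assert abs(num_deriv(math.asin, y) - 1.0 / math.cos(math.asin(y))) < 1e-8
```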
(e) arcsin : [−1, 1] → R is the inverse function to y = f(x) = sin x, x ∈ [−π/2, π/2]. For y ∈ (−1, 1),

(arcsin y)′ = 1/(sin x)′ = 1/cos x = 1/√(1 − sin²x) = 1/√(1 − y²),   −1 < y < 1.

(f) arctan is the inverse function to y = tan x, x ∈ (−π/2, π/2), and

(arctan y)′ = 1/(tan x)′ = cos²x.

Since y = tan x we have

y² = tan²x = sin²x/cos²x = (1 − cos²x)/cos²x = 1/cos²x − 1,

hence cos²x = 1/(1 + y²) and

(arctan y)′ = 1/(1 + y²).
function            derivative
const.              0
x^n (n ∈ N)         n·x^{n−1}
x^α (α ∈ R, x > 0)  α·x^{α−1}
e^x                 e^x
a^x (a > 0)         a^x·log a
log x               1/x
log_a x             1/(x·log a)
sin x               cos x
cos x               −sin x
tan x               1/cos²x
cot x               −1/sin²x
sinh x              cosh x
cosh x              sinh x
tanh x              1/cosh²x
coth x              −1/sinh²x
arcsin x            1/√(1 − x²)
arccos x            −1/√(1 − x²)
arctan x            1/(1 + x²)
arccot x            −1/(1 + x²)
arsinh x            1/√(x² + 1)
arcosh x            1/√(x² − 1)
artanh x            1/(1 − x²)
arcoth x            1/(1 − x²)
The higher derivatives are defined recursively; we write

f^{(k)}(x) = d^k f(x)/dx^k = (d/dx)^k f(x).

Definition 4.2 Let D ⊆ R and let k ∈ N be a positive integer. We denote by C^k(D) the set of all functions f : D → R such that f^{(k)}(x) exists for all x ∈ D and f^{(k)} is continuous. Obviously, C(D) ⊇ C¹(D) ⊇ C²(D) ⊇ ⋯. Further, we set

C^∞(D) = ⋂_{k∈N} C^k(D) = {f : D → R | f^{(k)}(x) exists for all k ∈ N, x ∈ D}.        (4.7)

f ∈ C^k(D) is called k times continuously differentiable. C(D) = C⁰(D) is the vector space of continuous functions on D.

Using induction over n, one proves the following proposition.

Proposition 4.6 (Leibniz formula) Let f and g be n times differentiable. Then f·g is n times differentiable with

(f(x)g(x))^{(n)} = Σ_{k=0}^n \binom{n}{k} f^{(k)}(x)·g^{(n−k)}(x).        (4.8)
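The Leibniz formula (4.8) can be verified exactly for polynomials, where all derivatives can be computed symbolically from coefficient lists. A minimal Python sketch (all helper names are ours):

```python
from math import comb

# Polynomials as coefficient lists [a0, a1, ...]; we check (4.8) for
# f(x) = x^3 + 2x and g(x) = x^2 - 1 with n = 3 at x = 1.5.

def deriv(p):                      # coefficients of p'
    return [k * p[k] for k in range(1, len(p))]

def nth_deriv(p, n):
    for _ in range(n):
        p = deriv(p)
    return p

def ev(p, x):                      # evaluate p at x
    return sum(c * x**k for k, c in enumerate(p))

def mul(p, q):                     # coefficients of p*q
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

f = [0, 2, 0, 1]                   # x^3 + 2x
g = [-1, 0, 1]                     # x^2 - 1
n, x = 3, 1.5
lhs = ev(nth_deriv(mul(f, g), n), x)
rhs = sum(comb(n, k) * ev(nth_deriv(f, k), x) * ev(nth_deriv(g, n - k), x)
          for k in range(n + 1))
assert abs(lhs - rhs) < 1e-9
```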
Proposition 4.7 Let f be defined on [a,b]. If f has a local extremum at a point ξ ∈ (a,b), and if f′(ξ) exists, then f′(ξ) = 0.

Proof. Suppose f has a local maximum at ξ. In accordance with the definition, choose δ > 0 such that

a < ξ − δ < ξ < ξ + δ < b

and f(x) ≤ f(ξ) for |x − ξ| < δ. If ξ − δ < x < ξ, then

(f(x) − f(ξ))/(x − ξ) ≥ 0,

hence the left-hand derivative satisfies f′(ξ) ≥ 0. If ξ < x < ξ + δ, then

(f(x) − f(ξ))/(x − ξ) ≤ 0,

hence f′(ξ) ≤ 0. Both inequalities together give f′(ξ) = 0.

Remarks 4.2 (a) f′(x) = 0 is a necessary but not a sufficient condition for a local extremum at x. For example, f(x) = x³ has f′(0) = 0, but x³ has no local extremum.
(b) If f attains its local extrema at the boundary, like f(x) = x on [0,1], we need not have f′(ξ) = 0.

Theorem 4.8 (Rolle's Theorem) Let f : [a,b] → R be continuous with f(a) = f(b) and let f be differentiable on (a,b). Then there exists a point ξ ∈ (a,b) with f′(ξ) = 0.

In particular, between two zeros of a differentiable function there is a zero of its derivative.

Proof. If f is the constant function, the theorem is trivial since f′(x) ≡ 0 on (a,b). Otherwise, there exists x₀ ∈ (a,b) such that f(x₀) > f(a) or f(x₀) < f(a). Then f attains its maximum or minimum, respectively, at a point ξ ∈ (a,b). By Proposition 4.7, f′(ξ) = 0.
Theorem 4.9 (Mean Value Theorem) Let f : [a,b] → R be continuous and differentiable on (a,b). Then there exists a point ξ ∈ (a,b) such that

f′(ξ) = (f(b) − f(a))/(b − a).        (4.9)

Theorem 4.10 (Generalized Mean Value Theorem) Let f and g be continuous functions on [a,b] which are differentiable on (a,b). Then there exists a point ξ ∈ (a,b) such that

(f(b) − f(a))·g′(ξ) = (g(b) − g(a))·f′(ξ).

Proof. Put

h(t) = (f(b) − f(a))·g(t) − (g(b) − g(a))·f(t).

Then h is continuous on [a,b] and differentiable on (a,b), and

h(a) = f(b)g(a) − f(a)g(b) = h(b).

Rolle's theorem shows that there exists ξ ∈ (a,b) such that

h′(ξ) = (f(b) − f(a))·g′(ξ) − (g(b) − g(a))·f′(ξ) = 0.

The theorem follows.

In case g′ is nonzero on (a,b) and g(b) − g(a) ≠ 0, the generalized MVT states the existence of some ξ ∈ (a,b) such that

(f(b) − f(a))/(g(b) − g(a)) = f′(ξ)/g′(ξ).

This is in particular true for g(x) = x and g′ = 1, which gives the assertion of the Mean Value Theorem.

Remark 4.3 Note that the MVT fails if f is complex-valued, continuous on [a,b], and differentiable on (a,b). Indeed, f(x) = e^{ix} on [0, 2π] is a counterexample: f is continuous on [0, 2π], differentiable on (0, 2π), and f(0) = f(2π) = 1. However, there is no ξ ∈ (0, 2π) such that

0 = (f(2π) − f(0))/2π = f′(ξ) = i·e^{iξ},

since the exponential function has no zero, see (3.7) (e^z·e^{−z} = 1) in Subsection 3.5.1.
Corollary 4.11 Suppose f is differentiable on (a,b).
If f′(x) ≥ 0 for all x ∈ (a,b), then f is monotonically increasing.
If f′(x) = 0 for all x ∈ (a,b), then f is constant.
If f′(x) ≤ 0 for all x ∈ (a,b), then f is monotonically decreasing.

Proof. All conclusions can be read off from the equality

f(x) − f(t) = (x − t)·f′(ξ),

which is valid for each pair x, t with a < t < x < b and for some ξ ∈ (t,x).
Proposition 4.12 Suppose f is differentiable on (a,b), ξ ∈ (a,b), f′(ξ) = 0, and f″(ξ) exists with f″(ξ) < 0. Then f has a local maximum at ξ. Indeed, since

f″(ξ) = lim_{x→ξ} (f′(x) − f′(ξ))/(x − ξ) < 0,

there is a δ > 0 such that

f′(x) > 0 if ξ − δ < x < ξ,   f′(x) < 0 if ξ < x < ξ + δ;

hence f increases on (ξ − δ, ξ) and decreases on (ξ, ξ + δ).

A function f : (a,b) → R is called convex if

f(λx + (1−λ)y) ≤ λ·f(x) + (1−λ)·f(y)        (4.10)

for all x, y ∈ (a,b) and λ ∈ [0,1].

[Figure: for a convex function the chord lies above the graph; the point λx + (1−λ)y on the axis corresponds to the value f(λx + (1−λ)y) below the chord.]

Proposition 4.13 Let f be twice differentiable on (a,b). Then f is convex if and only if f″(x) ≥ 0 for all x ∈ (a,b). (The proof is given in Appendix C.)

Theorem 4.14 (L'Hospital's Rule) Suppose f and g are differentiable on (a,b), −∞ ≤ a < b ≤ +∞, g′(x) ≠ 0 on (a,b), and

lim_{x→a+0} f′(x)/g′(x) = A.        (4.11)

If

(a)   lim_{x→a+0} f(x) = lim_{x→a+0} g(x) = 0,        (4.12)

or

(b)   lim_{x→a+0} f(x) = lim_{x→a+0} g(x) = +∞,        (4.13)

then

lim_{x→a+0} f(x)/g(x) = A.        (4.14)
Proof. First we consider the case of finite a ∈ R.

(a) One can extend the definitions of f and g via f(a) = g(a) = 0. Then f and g are continuous at a. By the generalized mean value theorem, for every x ∈ (a,b) there exists a ξ ∈ (a,x) such that

f(x)/g(x) = (f(x) − f(a))/(g(x) − g(a)) = f′(ξ)/g′(ξ).

If x approaches a then ξ also approaches a, and (a) follows.

(b) Now let f(a+0) = g(a+0) = +∞. Given ε > 0, choose δ > 0 such that

|f′(t)/g′(t) − A| < ε

if t ∈ (a, a+δ). By the generalized mean value theorem, for any x, y ∈ (a, a+δ) with x ≠ y,

|(f(x) − f(y))/(g(x) − g(y)) − A| < ε.

We have

f(x)/g(x) = (f(x) − f(y))/(g(x) − g(y)) · (1 − g(y)/g(x))/(1 − f(y)/f(x)).

For fixed y, the right factor tends to 1 as x approaches a; in particular, there exists δ₁ > 0 with δ₁ < δ such that x ∈ (a, a+δ₁) implies

|f(x)/g(x) − (f(x) − f(y))/(g(x) − g(y))| < ε.

Hence

|f(x)/g(x) − A| < 2ε,

and (b) follows since ε > 0 was arbitrary.
(b) lim_{x→0+0} √x/(1 − cos x) = lim_{x→0+0} (1/(2√x))/sin x = lim_{x→0+0} 1/(2√x·sin x) = +∞.

(c) lim_{x→0+0} x·log x = lim_{x→0+0} (log x)/(1/x) = lim_{x→0+0} (1/x)/(−1/x²) = lim_{x→0+0} (−x) = 0.

Remark 4.5 It is easy to transform other indeterminate expressions to 0/0 or ∞/∞ and then to apply l'Hospital's rule:

0·∞:   f·g = f/(1/g);
∞ − ∞:   f − g = (1/g − 1/f)/(1/(f·g));
0⁰:   f^g = e^{g·log f}.

Similarly, expressions of the form 1^∞ and ∞⁰ can be transformed.
For a polynomial p of degree n we have

p(x) = p(0) + (p′(0)/1!)·x + (p″(0)/2!)·x² + ⋯ + (p^{(n)}(0)/n!)·x^n.        (4.15)

Now, fix a ∈ R and let q(x) = p(x + a). Since q^{(k)}(0) = p^{(k)}(a), (4.15) gives

p(x + a) = q(x) = Σ_{k=0}^n (q^{(k)}(0)/k!)·x^k = Σ_{k=0}^n (p^{(k)}(a)/k!)·x^k.

Replacing x + a by x in the above equation yields

p(x) = Σ_{k=0}^n (p^{(k)}(a)/k!)·(x − a)^k.        (4.16)
Theorem 4.15 (Taylor's Theorem) Suppose f is a real function on [r,s], n ∈ N, f^{(n)} is continuous on [r,s], and f^{(n+1)}(t) exists for all t ∈ (r,s). Let a and x be distinct points of [r,s] and define

P_n(x) = Σ_{k=0}^n (f^{(k)}(a)/k!)·(x − a)^k.        (4.17)

Then there exists a point ξ between a and x such that

f(x) = P_n(x) + (f^{(n+1)}(ξ)/(n+1)!)·(x − a)^{n+1}.        (4.18)

For n = 0, this is just the mean value theorem. P_n(x) is called the nth Taylor polynomial of f at x = a, and the second summand of (4.18),

R_{n+1}(x, a) = (f^{(n+1)}(ξ)/(n+1)!)·(x − a)^{n+1},

is called the (Lagrangian form of the) remainder term.

Proof. Let M be the number defined by

f(x) = P_n(x) + M·(x − a)^{n+1}

and put

g(t) = f(t) − P_n(t) − M·(t − a)^{n+1},   for r ≤ t ≤ s.        (4.19)

We have to show that (n+1)!·M = f^{(n+1)}(ξ) for some ξ between a and x. By (4.17) and (4.19),

g^{(n+1)}(t) = f^{(n+1)}(t) − (n+1)!·M.        (4.20)

Hence the proof will be complete if we can show that g^{(n+1)}(ξ) = 0 for some ξ between a and x.

Since P_n^{(k)}(a) = f^{(k)}(a) for k = 0, 1, …, n, we have

g(a) = g′(a) = ⋯ = g^{(n)}(a) = 0.

Our choice of M shows that g(x) = 0, so that g′(ξ₁) = 0 for some ξ₁ between a and x, by Rolle's theorem. Since g′(a) = 0, we conclude similarly that g″(ξ₂) = 0 for some ξ₂ between a and ξ₁. After n + 1 steps we arrive at the conclusion that g^{(n+1)}(ξ_{n+1}) = 0 for some ξ_{n+1} between a and ξ_n, that is, between a and x.
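The remainder estimate in (4.18) is easy to observe numerically: for f = sin all derivatives are bounded by 1, so the error of the nth Taylor polynomial at a = 0 is at most |x|^{n+1}/(n+1)!. A short Python sketch (the helper name `sin_taylor` is ours):

```python
import math

def sin_taylor(x, n):
    # nth Taylor polynomial of sin at a = 0: only odd k contribute
    return sum((-1) ** (k // 2) * x**k / math.factorial(k)
               for k in range(1, n + 1, 2))

x, n = 1.2, 9
bound = abs(x) ** (n + 1) / math.factorial(n + 1)
# by (4.18) with |f^{(n+1)}| <= 1:
assert abs(sin_taylor(x, n) - math.sin(x)) <= bound
```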
Definition 4.5 Suppose that f is a real function defined on [r,s] such that f^{(n)}(t) exists for all t ∈ (r,s) and all n ∈ N. Let x and a be points of [r,s]. Then

T_f(x) = Σ_{k=0}^∞ (f^{(k)}(a)/k!)·(x − a)^k        (4.21)

is called the Taylor series of f at a. The Taylor series need not converge to f even where it converges: for

f(x) = e^{−1/x²} (x ≠ 0),   f(0) = 0,

every derivative of f at 0 vanishes, since each is a limit of the form

lim_{h→0} (1/h)·p_n(1/h)·e^{−1/h²} = lim_{x→±∞} x·p_n(x)·e^{−x²} = 0

with polynomials p_n; hence T_f ≡ 0 at a = 0 although f(x) ≠ 0 for x ≠ 0.

Examples where the Taylor series at a = 0 does converge to the function:

e^x = Σ_{n=0}^∞ x^n/n!,   x ∈ R,

Σ_{n=0}^∞ x^n = 1/(1 − x),   x ∈ (−1, 1).
(c) f(x) = (1 + x)^α, α ∈ R, a = 0. We have

f^{(k)}(x) = α(α−1)⋯(α−k+1)·(1 + x)^{α−k}.

Therefore, with \binom{α}{k} := α(α−1)⋯(α−k+1)/k!,

(1 + x)^α = Σ_{k=0}^n \binom{α}{k}·x^k + R_{n+1}(x).        (4.22)

The quotient test shows that the corresponding power series converges for |x| < 1. Consider the Lagrangian remainder term with 0 < ξ < x < 1 and n + 1 > α. Then

0 ≤ |R_{n+1}(x)| = |\binom{α}{n+1}|·(1 + ξ)^{α−n−1}·x^{n+1} ≤ |\binom{α}{n+1}|·x^{n+1} → 0

as n → ∞. Hence,

(1 + x)^α = Σ_{n=0}^∞ \binom{α}{n}·x^n,   0 < x < 1.        (4.23)

(4.23) is called the binomial series. Its radius of convergence is R = 1. Looking at other forms of the remainder term shows that (4.23) holds for −1 < x < 1.
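The partial sums of the binomial series can be computed with the term recursion \binom{α}{k} = \binom{α}{k−1}·(α−k+1)/k. A Python sketch (the helper name `binom_partial` is ours), checked against √(1+x) for α = 1/2:

```python
import math

def binom_partial(alpha, x, n):
    # partial sum sum_{k=0}^n binom(alpha, k) x^k of the series (4.23)
    term, s = 1.0, 1.0
    for k in range(1, n + 1):
        term *= (alpha - k + 1) / k * x   # binomial-coefficient recursion
        s += term
    return s

# alpha = 1/2 gives sqrt(1 + x) for |x| < 1
x = 0.5
assert abs(binom_partial(0.5, x, 60) - math.sqrt(1 + x)) < 1e-12
```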
(d) y = f(x) = arctan x. Since y′ = 1/(1 + x²) and hence y″ = −2x/(1 + x²)², we see that

y′·(1 + x²) = 1.

Differentiating this n times and using Leibniz's formula, Proposition 4.6, we have

Σ_{k=0}^n \binom{n}{k}·(y′)^{(k)}·((1 + x²))^{(n−k)} = 0   (n ≥ 1).

Since the derivatives of 1 + x² of order greater than 2 vanish, this reads

y^{(n+1)}·(1 + x²) + \binom{n}{n−1}·y^{(n)}·2x + \binom{n}{n−2}·y^{(n−1)}·2 = 0;

at x = 0:

y^{(n+1)}(0) = −n(n−1)·y^{(n−1)}(0).

This yields

y^{(n)}(0) = 0   if n = 2k,
y^{(n)}(0) = (−1)^k·(2k)!   if n = 2k+1.

Therefore,

arctan x = Σ_{k=0}^n ((−1)^k/(2k+1))·x^{2k+1} + R_{2n+2}(x).        (4.24)

One can prove that −1 < x ≤ 1 implies R_{2n+2}(x) → 0 as n → ∞. In particular, x = 1 gives

π/4 = 1 − 1/3 + 1/5 − ⋯.
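The series for π/4 converges, but very slowly: by the Leibniz criterion the error of a partial sum is at most the first omitted term. A quick Python sketch (the helper name `leibniz` is ours):

```python
import math

def leibniz(n):
    # partial sum 1 - 1/3 + 1/5 - ... through k = n
    return sum((-1) ** k / (2 * k + 1) for k in range(n + 1))

n = 10_000
err = abs(leibniz(n) - math.pi / 4)
assert err < 1 / (2 * n + 3)   # first omitted term bounds the error
assert err > 1e-6              # ...and convergence is indeed slow
```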
4.6 Appendix C

Corollary 4.16 (to the mean value theorem) Let f : R → R be a differentiable function with

f′(x) = c·f(x) for all x ∈ R        (4.25)

and f(0) = A. Then

f(x) = A·e^{cx} for all x ∈ R.        (4.26)

Proof. Consider F(x) = f(x)·e^{−cx}. Using the product rule for derivatives and (4.25) we obtain

F′(x) = f′(x)·e^{−cx} + f(x)·(−c)·e^{−cx} = (f′(x) − c·f(x))·e^{−cx} = 0.

By Corollary 4.11, F(x) is constant. Since F(0) = f(0) = A, F(x) = A for all x ∈ R; the statement follows.
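Corollary 4.16 is the uniqueness statement behind the exponential solution of f′ = c·f. As an illustration (not part of the text), a crude forward Euler discretization of the equation, y_{k+1} = y_k + h·c·y_k, approaches A·e^{cx} as the step h shrinks; all names here are ours:

```python
import math

def euler(c, A, x, n):
    # forward Euler for f' = c f, f(0) = A, with n steps up to x;
    # the discrete solution is A (1 + c x/n)^n -> A e^{c x}
    h = x / n
    y = A
    for _ in range(n):
        y += h * c * y
    return y

c, A, x = 0.7, 2.0, 1.0
exact = A * math.exp(c * x)
assert abs(euler(c, A, x, 100_000) - exact) < 1e-4
```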
Corollary 4.18 If f is differentiable on [a,b], then f′ cannot have discontinuities of the first kind.

Proof of Proposition 4.13. (a) Suppose first that f″(x) ≥ 0 for all x. By Corollary 4.11, f′ is increasing. Let a < x < y < b and λ ∈ (0,1). Put t = λx + (1−λ)y. Then x < t < y, and by the mean value theorem there exist ξ₁ ∈ (x,t) and ξ₂ ∈ (t,y) such that

(f(t) − f(x))/(t − x) = f′(ξ₁) ≤ f′(ξ₂) = (f(y) − f(t))/(y − t).

Since t − x = (1−λ)(y − x) and y − t = λ(y − x), it follows that

λ·(f(t) − f(x)) ≤ (1−λ)·(f(y) − f(t)),

that is, f(t) ≤ λ·f(x) + (1−λ)·f(y). Hence, f is convex.

(b) Let f : (a,b) → R be convex and twice differentiable. Suppose to the contrary that f″(x₀) < 0 for some x₀ ∈ (a,b). Let c = f′(x₀); put

φ(x) = f(x) − (x − x₀)·c.

Then φ : (a,b) → R is twice differentiable with φ′(x₀) = 0 and φ″(x₀) < 0. Hence, by Proposition 4.12, φ has a local maximum at x₀. By definition, there is a δ > 0 such that U_δ(x₀) ⊆ (a,b) and

φ(x₀ − δ) < φ(x₀),   φ(x₀ + δ) < φ(x₀).

It follows that

f(x₀) = φ(x₀) > ½·(φ(x₀ − δ) + φ(x₀ + δ)) = ½·(f(x₀ − δ) + f(x₀ + δ)),

which contradicts the convexity of f (take λ = ½, x = x₀ − δ, y = x₀ + δ).
Chapter 5
Integration
In the first section of this chapter derivatives will not appear! Roughly speaking, integration generalizes addition. The formula distance = velocity · time is only valid for constant velocity. The right formula is s = ∫_{t₀}^{t₁} v(t) dt. We need integrals to compute lengths of curves, areas of surfaces, and volumes.
The study of integrals requires a long preparation, but once this preliminary work has been
completed, integrals will be an invaluable tool for creating new functions, and the derivative
will reappear more powerful than ever. The relation between the integral and derivatives is
given in the Fundamental Theorem of Calculus.
Definition 5.1 A partition P of the interval [a,b] is a finite set of points x₀, x₁, …, x_n with

a = x₀ ≤ x₁ ≤ ⋯ ≤ x_{n−1} ≤ x_n = b.

We write Δx_i = x_i − x_{i−1}, i = 1, …, n.

[Figure: a partition a = x₀ < x₁ < x₂ < x₃ = b of the interval [a,b].]

Now suppose f is a bounded real function defined on [a,b]. Corresponding to each partition P of [a,b] we put

M_i = sup{f(x) | x ∈ [x_{i−1}, x_i]},   m_i = inf{f(x) | x ∈ [x_{i−1}, x_i]},        (5.1)

U(P, f) = Σ_{i=1}^n M_i·Δx_i,        (5.2)

L(P, f) = Σ_{i=1}^n m_i·Δx_i,        (5.3)

and finally

\overline∫_a^b f dx = inf U(P, f),        (5.4)

\underline∫_a^b f dx = sup L(P, f),        (5.5)

where the infimum and supremum are taken over all partitions P of [a,b]. The left members of (5.4) and (5.5) are called the upper and lower Riemann integrals of f over [a,b], respectively. If the upper and lower integrals are equal, we say that f is Riemann-integrable on [a,b], we write f ∈ R (that is, R denotes the set of Riemann-integrable functions), and we denote the common value of (5.4) and (5.5) by

∫_a^b f dx   or by   ∫_a^b f(x) dx.        (5.6)

Since f is bounded, there exist two numbers m and M such that m ≤ f(x) ≤ M for all x ∈ [a,b]. Hence for every partition P,

m·(b − a) ≤ L(P, f) ≤ U(P, f) ≤ M·(b − a),
so that the numbers L(P, f ) and U(P, f ) form a bounded set. This shows that the upper and the
lower integrals are defined for every bounded function f . The question of their equality, and
hence the question of the integrability of f , is a more delicate one. Instead of investigating it
separately for the Riemann integral, we shall immediately consider a more general situation.
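Upper and lower sums are easy to watch numerically. A minimal Python sketch (the helper name `sums` is ours) for f(x) = x² on [0,1] with equidistant partitions; since f is increasing, M_i = f(x_i) and m_i = f(x_{i−1}), and both sums squeeze toward 1/3:

```python
def sums(n):
    # lower and upper Riemann sums of f(x) = x^2 on [0, 1],
    # equidistant partition with n subintervals
    xs = [i / n for i in range(n + 1)]
    U = sum(xs[i] ** 2 * (1 / n) for i in range(1, n + 1))
    L = sum(xs[i - 1] ** 2 * (1 / n) for i in range(1, n + 1))
    return L, U

L, U = sums(1000)
assert L <= 1 / 3 <= U
assert U - L < 2 / 1000        # here U - L = (f(1) - f(0))/n = 1/n
```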
Definition 5.2 Let α be a monotonically increasing function on [a,b] (since α(a) and α(b) are finite, it follows that α is bounded on [a,b]). Corresponding to each partition P of [a,b], we write

Δα_i = α(x_i) − α(x_{i−1}).

It is clear that Δα_i ≥ 0. For any real function f which is bounded on [a,b] we put

U(P, f, α) = Σ_{i=1}^n M_i·Δα_i,        (5.7)

L(P, f, α) = Σ_{i=1}^n m_i·Δα_i,        (5.8)

where M_i and m_i have the same meaning as in Definition 5.1, and we define

\overline∫_a^b f dα = inf U(P, f, α),        (5.9)

\underline∫_a^b f dα = sup L(P, f, α),        (5.10)

where the infimum and the supremum are taken over all partitions P.

If the left members of (5.9) and (5.10) are equal, we denote their common value by

∫_a^b f dα   or sometimes by   ∫_a^b f(x) dα(x).        (5.11)

This is the Riemann–Stieltjes integral (or simply the Stieltjes integral) of f with respect to α over [a,b]. If (5.11) exists, we say that f is integrable with respect to α in the Riemann sense, and write f ∈ R(α).

By taking α(x) = x, the Riemann integral is seen to be a special case of the Riemann–Stieltjes integral. Let us mention explicitly that, in the general case, α need not even be continuous.

We shall now investigate the existence of the integral (5.11). Without saying so every time, f will be assumed real and bounded, and α increasing on [a,b].

Definition 5.3 We say that a partition P* is a refinement of the partition P if P* ⊇ P (that is, every point of P is a point of P*). Given two partitions P₁ and P₂, we say that P* is their common refinement if P* = P₁ ∪ P₂.

Lemma 5.1 If P* is a refinement of P, then

L(P, f, α) ≤ L(P*, f, α)   and   U(P, f, α) ≥ U(P*, f, α).        (5.12)

Proof. We only prove the first inequality of (5.12); the proof of the second one is analogous. Suppose first that P* contains just one point more than P. Let this extra point be x*, and suppose x_{i−1} ≤ x* ≤ x_i, where x_{i−1} and x_i are two consecutive points of P. Put

w₁ = inf{f(x) | x ∈ [x_{i−1}, x*]},   w₂ = inf{f(x) | x ∈ [x*, x_i]}.

Then w₁ ≥ m_i and w₂ ≥ m_i, so that

L(P*, f, α) − L(P, f, α) = w₁·(α(x*) − α(x_{i−1})) + w₂·(α(x_i) − α(x*)) − m_i·(α(x_i) − α(x_{i−1})) ≥ 0.

If P* contains k points more than P, we repeat this reasoning k times, and arrive at (5.12).
Proposition 5.2

\underline∫_a^b f dα ≤ \overline∫_a^b f dα.

Proof. Let P* be the common refinement of two partitions P₁ and P₂. By Lemma 5.1,

L(P₁, f, α) ≤ L(P*, f, α) ≤ U(P*, f, α) ≤ U(P₂, f, α).

Hence

L(P₁, f, α) ≤ U(P₂, f, α).        (5.13)

If P₂ is fixed and the supremum is taken over all P₁, (5.13) gives

\underline∫_a^b f dα ≤ U(P₂, f, α).        (5.14)

Taking the infimum over all P₂ in (5.14), the proposition follows.

Proposition 5.3 (Riemann Criterion) f ∈ R(α) on [a,b] if and only if for every ε > 0 there exists a partition P such that

U(P, f, α) − L(P, f, α) < ε.        (RC)

Proof. For every partition P we have

L(P, f, α) ≤ \underline∫_a^b f dα ≤ \overline∫_a^b f dα ≤ U(P, f, α).

Thus (RC) implies

0 ≤ \overline∫_a^b f dα − \underline∫_a^b f dα < ε.

Since the above inequality can be satisfied for every ε > 0, we have

\overline∫_a^b f dα = \underline∫_a^b f dα,

that is, f ∈ R(α).

Conversely, suppose f ∈ R(α), and let ε > 0 be given. Then there exist partitions P₁ and P₂ such that

U(P₂, f, α) − ∫_a^b f dα < ε/2,   ∫_a^b f dα − L(P₁, f, α) < ε/2.        (5.15)

We choose P to be the common refinement of P₁ and P₂. Then Lemma 5.1, together with (5.15), shows that

U(P, f, α) ≤ U(P₂, f, α) < ∫_a^b f dα + ε/2 < L(P₁, f, α) + ε ≤ L(P, f, α) + ε,

so that (RC) holds for this partition P.
Lemma 5.4 (a) If (RC) holds for P, then (RC) holds (with the same ε) for every refinement of P.
(b) If (RC) holds for P = {x₀, …, x_n} and if s_i, t_i are arbitrary points in [x_{i−1}, x_i], then

Σ_{i=1}^n |f(s_i) − f(t_i)|·Δα_i < ε.

(c) If f ∈ R(α) and the hypotheses of (b) hold, then

|Σ_{i=1}^n f(t_i)·Δα_i − ∫_a^b f dα| < ε.

Proof. Lemma 5.1 implies (a). Under the assumptions made in (b), both f(s_i) and f(t_i) lie in [m_i, M_i], so that |f(s_i) − f(t_i)| ≤ M_i − m_i. Thus

Σ_{i=1}^n |f(s_i) − f(t_i)|·Δα_i ≤ U(P, f, α) − L(P, f, α) < ε,

which proves (b). The obvious inequalities

L(P, f, α) ≤ Σ_{i=1}^n f(t_i)·Δα_i ≤ U(P, f, α)   and   L(P, f, α) ≤ ∫_a^b f dα ≤ U(P, f, α)

prove (c).

Proposition 5.5 If f is continuous on [a,b], then f ∈ R(α) on [a,b].

Proof. Let ε > 0 be given, and choose η > 0 so that (α(b) − α(a))·η < ε. Since f is uniformly continuous on [a,b], there exists δ > 0 such that

|f(x) − f(t)| < η        (5.16)

if x, t ∈ [a,b] and |x − t| < δ. If P is any partition of [a,b] such that Δx_i < δ for all i, then (5.16) implies that

M_i − m_i ≤ η,   i = 1, …, n,        (5.17)

and therefore

U(P, f, α) − L(P, f, α) = Σ_{i=1}^n (M_i − m_i)·Δα_i ≤ η·Σ_{i=1}^n Δα_i = η·(α(b) − α(a)) < ε.

By Proposition 5.3, f ∈ R(α).
Example 5.1 (a) We compute I = ∫_a^b sin x dx. Let ε > 0. Since sin x is continuous, f ∈ R. There exists δ > 0 such that |x − t| < δ implies

|sin x − sin t| < ε/(b − a).        (5.18)

Choose the equidistant partition x_i = a + i·h, i = 0, …, n, with h = (b − a)/n < δ. By Lemma 5.4 (c), |Σ_{i=1}^n sin(x_i)·Δx_i − I| < ε. Now

Σ_{i=1}^n sin(x_i)·Δx_i = Σ_{i=1}^n sin(a + ih)·h = h/(2 sin(h/2)) · Σ_{i=1}^n 2 sin(h/2)·sin(a + ih)
  = h/(2 sin(h/2)) · Σ_{i=1}^n (cos(a + (i − 1/2)h) − cos(a + (i + 1/2)h))
  = h/(2 sin(h/2)) · (cos(a + h/2) − cos(a + (n + 1/2)h))
  = (h/2)/sin(h/2) · (cos(a + h/2) − cos(b + h/2)).

Since lim_{h→0} sin h/h = 1 and cos x is continuous, we find that the above expression tends to cos a − cos b. Hence ∫_a^b sin x dx = cos a − cos b.

(b) For x ∈ [a,b] define

f(x) = 1 if x ∈ Q,   f(x) = 0 if x ∉ Q.

We will show f ∉ R. Let P be any partition of [a,b]. Since any interval contains rational as well as irrational points, m_i = 0 and M_i = 1 for all i. Hence L(P, f) = 0 whereas U(P, f) = Σ_{i=1}^n Δx_i = b − a. We conclude that the upper and lower Riemann integrals don't coincide; f ∉ R. A similar reasoning shows f ∉ R(α) if α(b) > α(a), since L(P, f, α) = 0 < U(P, f, α) = α(b) − α(a).
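The telescoping computation in (a) can be checked directly: the equidistant Riemann sums of sin do approach cos a − cos b. A quick Python sketch (the helper name `riemann_sin` is ours):

```python
import math

def riemann_sin(a, b, n):
    # right-endpoint Riemann sum of sin on [a, b] with n subintervals
    h = (b - a) / n
    return sum(math.sin(a + i * h) * h for i in range(1, n + 1))

a, b = 0.0, 2.0
exact = math.cos(a) - math.cos(b)
assert abs(riemann_sin(a, b, 200_000) - exact) < 1e-4
```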
Proposition 5.6 If f is monotonic on [a,b], and if α is continuous (and increasing) on [a,b], then f ∈ R(α).

Proof. Let ε > 0 and n ∈ N. Since α is continuous and increasing, the intermediate value theorem furnishes a partition P = {x₀, …, x_n} with

Δα_i = (α(b) − α(a))/n,   i = 1, …, n.

We suppose that f is monotonically increasing (the proof is analogous in the other case). Then

M_i = f(x_i),   m_i = f(x_{i−1}),   i = 1, …, n,

so that

U(P, f, α) − L(P, f, α) = (α(b) − α(a))/n · Σ_{i=1}^n (f(x_i) − f(x_{i−1})) = (α(b) − α(a))·(f(b) − f(a))/n < ε

if n is taken large enough. By Proposition 5.3, f ∈ R(α).

Proposition 5.7 If f is bounded on [a,b] with only finitely many points of discontinuity, then f ∈ R. The idea of the proof is to enclose the discontinuities in finitely many intervals [a′, b′] of small total length; the contribution of such an interval to U(P, f) − L(P, f) is at most (M₀′ − m₀′)(b′ − a′), where M₀′ and m₀′ are the supremum and infimum of f(x) on [a′, b′]. Outside these intervals f is uniformly continuous, and one argues as in the proof of Proposition 5.5. The Riemann criterion is then satisfied for f on [a,b]; f ∈ R.
Proposition 5.8 Suppose f ∈ R(α) on [a,b], m ≤ f(x) ≤ M, φ is continuous on [m, M], and h(x) = φ(f(x)) on [a,b]. Then h ∈ R(α) on [a,b].

Remark 5.1 (a) A bounded function f is Riemann-integrable on [a,b] if and only if f is continuous almost everywhere on [a,b]. (The proof of this fact can be found in [Rud76, Theorem 11.33].) "Almost everywhere" means that the discontinuities form a set of (Lebesgue) measure 0. A set M ⊆ R has measure 0 if for given ε > 0 there exist intervals I_n, n ∈ N, such that M ⊆ ⋃_{n∈N} I_n and Σ_{n∈N} |I_n| < ε. Here |I| denotes the length of the interval I. Examples of sets of measure 0 are finite sets, countable sets, and the Cantor set (which is uncountable).

(b) Note that such a chaotic function (at the point 0) as

f(x) = cos(1/x) if x ≠ 0,   f(0) = 0,

is integrable on [−π, π] since there is only one single discontinuity, at 0.

(c) If f ∈ R(α) on [a,b] and if a < c < b, then f ∈ R(α) on [a,c] and on [c,b], and

∫_a^b f dα = ∫_a^c f dα + ∫_c^b f dα.
Proposition 5.9 (a) If f₁, f₂ ∈ R(α), then f₁ + f₂ ∈ R(α) and c·f ∈ R(α) for every constant c, and

∫_a^b (f₁ + f₂) dα = ∫_a^b f₁ dα + ∫_a^b f₂ dα,   ∫_a^b c·f dα = c·∫_a^b f dα.

(b) If f₁(x) ≤ f₂(x) on [a,b], then ∫_a^b f₁ dα ≤ ∫_a^b f₂ dα.
(c) If f ∈ R(α) on [a,b] and a < c < b, then f ∈ R(α) on [a,c] and on [c,b], and the integral is additive as in Remark 5.1 (c).

Proof of (a). If f = f₁ + f₂ and I_i = [x_{i−1}, x_i], then

inf_{I_i} f₁ + inf_{I_i} f₂ ≤ inf_{I_i} f ≤ sup_{I_i} f ≤ sup_{I_i} f₁ + sup_{I_i} f₂.        (5.19)

If f₁ ∈ R(α) and f₂ ∈ R(α), let ε > 0 be given. There are partitions P_j, j = 1, 2, such that

U(P_j, f_j, α) − L(P_j, f_j, α) < ε.

These inequalities persist if P₁ and P₂ are replaced by their common refinement P. Then (5.19) implies

U(P, f, α) − L(P, f, α) < 2ε,

which proves that f ∈ R(α). With the same P we have

U(P, f_j, α) < ∫_a^b f_j dα + ε,   j = 1, 2;        (5.20)

hence (5.19) implies

∫_a^b f dα ≤ U(P, f, α) ≤ ∫_a^b f₁ dα + ∫_a^b f₂ dα + 2ε.

Since ε was arbitrary, ∫f dα ≤ ∫f₁ dα + ∫f₂ dα; replacing f₁ and f₂ by −f₁ and −f₂, the inequality is reversed, and equality follows.

Note that, conversely to (c), f ∈ R(α) on [a,c] and on [c,b] in general does not imply that f ∈ R(α) on [a,b]. For example, consider the interval [−1, 1] with

f(x) = α(x) = 0 if −1 ≤ x < 0,   f(x) = α(x) = 1 if 0 ≤ x ≤ 1.
Then ∫_0^1 f dα = 0; the integral vanishes since α is constant on [0,1]. Likewise ∫_{−1}^0 f dα = 0. However, f ∉ R(α) on [−1,1] since for any partition P including the point 0 we have U(P, f, α) = 1 and L(P, f, α) = 0.

Proposition 5.10 If f, g ∈ R(α) on [a,b], then
(a) f·g ∈ R(α);
(b) |f| ∈ R(α) and |∫_a^b f dα| ≤ ∫_a^b |f| dα.

Proof. If we take φ(t) = t², Proposition 5.8 shows that f² ∈ R(α) if f ∈ R(α). The identity

4·f·g = (f + g)² − (f − g)²

completes the proof of (a).

If we take φ(t) = |t|, Proposition 5.8 shows that |f| ∈ R(α). Choose c = ±1 so that c·∫f dα ≥ 0. Then

|∫ f dα| = c·∫ f dα = ∫ c·f dα ≤ ∫ |f| dα,

since c·f ≤ |f|.

The unit step function or Heaviside function H(x) is defined by H(x) = 0 if x < 0 and H(x) = 1 if x ≥ 0.

Example 5.2 (a) If a < s < b, f is bounded on [a,b], f is continuous at s, and α(x) = H(x − s), then

∫_a^b f dα = f(s).

For the proof, consider the partition P with n = 3; a = x₀ < x₁ < s = x₂ < x₃ = b. Then Δα₁ = Δα₃ = 0, Δα₂ = 1, and

U(P, f, α) = M₂,   L(P, f, α) = m₂,

where M₂ and m₂ are the supremum and infimum of f on [x₁, s]. Since f is continuous at s, M₂ and m₂ converge to f(s) as x₁ → s; hence ∫_a^b f dα = f(s).

(b) Likewise, if s₁ < ⋯ < s_N are points of (a,b), c₁, …, c_N ≥ 0, and α(x) = Σ_{n=1}^N c_n·H(x − s_n), then

∫_a^b f dα = Σ_{n=1}^N c_n·f(s_n).

Proposition 5.11 Suppose c_n ≥ 0 for n ∈ N, Σ c_n converges, (s_n) is a sequence of distinct points in (a,b), and

α(x) = Σ_{n=1}^∞ c_n·H(x − s_n).        (5.21)

Let f be continuous on [a,b]. Then

∫_a^b f dα = Σ_{n=1}^∞ c_n·f(s_n).        (5.22)

[Figure: a step function α with jumps c₁, c₂, c₃ at the points s₁, s₂, s₃, s₄.]
Proof. The comparison test shows that the series (5.21) converges for every x. Its sum α(x) is evidently an increasing function with α(a) = 0 and α(b) = Σ c_n. Let ε > 0 be given, and choose N so that

Σ_{n=N+1}^∞ c_n < ε.

Put

α₁(x) = Σ_{n=1}^N c_n·H(x − s_n),   α₂(x) = Σ_{n=N+1}^∞ c_n·H(x − s_n).

By Example 5.2,

∫_a^b f dα₁ = Σ_{n=1}^N c_n·f(s_n).

Since α₂(b) − α₂(a) < ε, we have |∫_a^b f dα₂| ≤ M·ε, where M = sup|f(x)|. Since α = α₁ + α₂, it follows that

|∫_a^b f dα − Σ_{n=1}^N c_n·f(s_n)| ≤ M·ε.

Letting N → ∞, we obtain (5.22).
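For a finite step function α the collapse of the Stieltjes sums to Σ c_n·f(s_n) can be observed directly by summing f(t_i)·Δα_i on a fine grid. A Python sketch (all names are ours):

```python
def H(x):
    # Heaviside function as defined above: H(x) = 1 for x >= 0
    return 1.0 if x >= 0 else 0.0

cs, ss = [0.5, 0.25, 0.125], [0.3, 0.6, 0.9]
alpha = lambda x: sum(c * H(x - s) for c, s in zip(cs, ss))
f = lambda x: x * x

n = 100_000
xs = [i / n for i in range(n + 1)]
# Riemann-Stieltjes sum with right endpoints t_i = x_i on [0, 1]
stieltjes = sum(f(xs[i]) * (alpha(xs[i]) - alpha(xs[i - 1]))
                for i in range(1, n + 1))
exact = sum(c * f(s) for c, s in zip(cs, ss))   # (5.22)
assert abs(stieltjes - exact) < 1e-3
```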
Proposition 5.12 Assume that α is increasing and α′ ∈ R on [a,b]. Let f be a bounded real function on [a,b]. Then f ∈ R(α) if and only if f·α′ ∈ R. In that case,

∫_a^b f dα = ∫_a^b f(x)·α′(x) dx.        (5.23)

The statement remains true if α is continuous on [a,b] and differentiable up to finitely many points c₁, c₂, …, c_n.

Proof. Let ε > 0 be given and apply the Riemann criterion, Proposition 5.3, to α′: there is a partition P = {x₀, …, x_n} of [a,b] such that

U(P, α′) − L(P, α′) < ε.        (5.24)

The mean value theorem furnishes points t_i ∈ [x_{i−1}, x_i] with

Δα_i = α′(t_i)·Δx_i   for i = 1, …, n.        (5.25)

If s_i ∈ [x_{i−1}, x_i], then by (5.24) and Lemma 5.4 (b), applied to α′,

Σ_{i=1}^n |α′(s_i) − α′(t_i)|·Δx_i < ε.

Put M = sup|f(x)|. Since

Σ_{i=1}^n f(s_i)·Δα_i = Σ_{i=1}^n f(s_i)·α′(t_i)·Δx_i,

it follows that

|Σ_{i=1}^n f(s_i)·Δα_i − Σ_{i=1}^n f(s_i)·α′(s_i)·Δx_i| ≤ M·ε.        (5.26)

In particular,

Σ_{i=1}^n f(s_i)·Δα_i ≤ U(P, f·α′) + M·ε   for all choices of s_i ∈ [x_{i−1}, x_i],

so that

U(P, f, α) ≤ U(P, f·α′) + M·ε.        (5.27)

The same argument leads from (5.26) to U(P, f·α′) ≤ U(P, f, α) + M·ε. Now (5.25) remains true if P is replaced by any refinement. Hence (5.26) also remains true. We conclude that

|\overline∫_a^b f dα − \overline∫_a^b f(x)·α′(x) dx| ≤ M·ε.

But ε is arbitrary. Hence

\overline∫_a^b f dα = \overline∫_a^b f(x)·α′(x) dx

for any bounded f. The equality for the lower integrals follows from (5.26) in exactly the same way. The proposition follows.
We now summarize the two cases.

Proposition 5.13 Let f be continuous on [a,b], and let α be increasing on [a,b] such that, except for finitely many points c₀, c₁, …, c_n with c₀ = a and c_n = b, the derivative α′(x) exists and is continuous and bounded on [a,b] \ {c₀, …, c_n}. Then f ∈ R(α) and

∫_a^b f dα = ∫_a^b f(x)·α′(x) dx + Σ_{i=0}^n (A_i⁺ + A_i⁻)·f(c_i),

where A_i⁺ = α(c_i + 0) − α(c_i) and A_i⁻ = α(c_i) − α(c_i − 0) denote the right and left jumps of α at c_i (A₀⁻ = A_n⁺ = 0).

For the proof one writes α = α₁ + α₂, where the step function

α₁(x) = A + Σ_i A_i⁺·H(x − c_i) − Σ_i A_i⁻·H(c_i − x)

collects the jumps of α, and α₂ = α − α₁ is continuous on [a,b] and differentiable with α₂′ = α′ up to the finitely many points c_i. By Proposition 5.11 (and its analogue for the left jumps),

∫_a^b f dα₁ = Σ_i (A_i⁺ + A_i⁻)·f(c_i),

and by Proposition 5.12,

∫_a^b f dα₂ = ∫_a^b f(x)·α′(x) dx.

Further,

∫_a^b f dα = ∫_a^b f d(α₁ + α₂) = ∫_a^b f dα₁ + ∫_a^b f dα₂.
Example 5.3 (a) The Fundamental Theorem of Calculus, see Theorem 5.15, yields

∫_0^2 3x³ dx = (3/4)·x⁴ |_0^2 = 12.

(b) f(x) = x² and

α(x) = x for 0 ≤ x < 1;   α(1) = 7;   α(x) = x² + 10 for 1 < x < 2;   α(2) = 64.

Here α′(x) = 1 on (0,1) and α′(x) = 2x on (1,2); the jumps at c₁ = 1 are A₁⁻ = 7 − 1 = 6 and A₁⁺ = 11 − 7 = 4, and at c₂ = 2, A₂⁻ = 64 − 14 = 50. By Proposition 5.13,

∫_0^2 f dα = ∫_0^1 x² dx + ∫_1^2 x²·2x dx + f(1)·(6 + 4) + f(2)·50 = 1/3 + 15/2 + 10 + 200 = 1307/6.

Remark 5.2 The three preceding propositions show the flexibility of the Stieltjes process of integration. If α is a pure step function, the integral reduces to an infinite series. If α has an integrable derivative, the integral reduces to the ordinary Riemann integral. This makes it possible to study series and integrals simultaneously, rather than separately.
Theorem 5.14 Let f ∈ R on [a,b]. For a ≤ x ≤ b put

F(x) = ∫_a^x f(t) dt.

Then F is continuous on [a,b]; furthermore, if f is continuous at x₀ ∈ [a,b], then F is differentiable at x₀ and

F′(x₀) = f(x₀).

Proof. Since f ∈ R, f is bounded. Suppose |f(t)| ≤ M on [a,b]. If a ≤ x < y ≤ b, then

|F(y) − F(x)| = |∫_x^y f(t) dt| ≤ M·(y − x),

which shows that F is continuous. Now suppose f is continuous at x₀. Given ε > 0, choose δ > 0 such that |f(u) − f(x₀)| < ε if |u − x₀| < δ and u ∈ [a,b]. Hence, if x₀ − δ < s ≤ x₀ ≤ t < x₀ + δ and a ≤ s < t ≤ b, then

|(F(t) − F(s))/(t − s) − f(x₀)| = |1/(t − s)·∫_s^t (f(u) − f(x₀)) du| < ε.

It follows that F′(x₀) = f(x₀).

A function F with F′ = f is called an antiderivative (or primitive) of f; it is unique up to an additive constant, and one writes ∫ f(x) dx = F(x) for the indefinite integral. The function f is called the integrand. Integration and differentiation are inverse to each other:

(d/dx) ∫ f(x) dx = f(x),   ∫ f′(x) dx = f(x).

Theorem 5.15 (Fundamental Theorem of Calculus) Let f : [a,b] → R be continuous.
(a) Then

F(x) = ∫_a^x f(t) dt

is an antiderivative of f.
(b) If G is any antiderivative of f, then

∫_a^b f(x) dx = G(b) − G(a).

Proof. (a) By Theorem 5.14, F(x) = ∫_a^x f(t) dt is differentiable at any point x₀ ∈ [a,b] with F′(x₀) = f(x₀).
(b) By the above remark, the antiderivative is unique up to a constant; hence F(x) − G(x) = C. Since F(a) = ∫_a^a f(x) dx = 0, we obtain

G(b) − G(a) = (F(b) − C) − (F(a) − C) = F(b) − F(a) = F(b) = ∫_a^b f(x) dx.

Note that part (b) of the FTC is also true if f ∈ R and G is an antiderivative of f on [a,b]. Indeed, let ε > 0 be given. By the Riemann criterion, Proposition 5.3, there exists a partition P = {x₀, …, x_n} of [a,b] such that U(P, f) − L(P, f) < ε. By the mean value theorem, there exist points t_i ∈ [x_{i−1}, x_i] such that

G(x_i) − G(x_{i−1}) = f(t_i)·(x_i − x_{i−1}),   i = 1, …, n.

Thus

G(b) − G(a) = Σ_{i=1}^n f(t_i)·Δx_i.

It follows from Lemma 5.4 (c) and the above equation that

|G(b) − G(a) − ∫_a^b f(x) dx| = |Σ_{i=1}^n f(t_i)·Δx_i − ∫_a^b f(x) dx| < ε.

Since ε was arbitrary, G(b) − G(a) = ∫_a^b f(x) dx.
function          domain                    antiderivative
x^α (α ∈ R\{−1})  x > 0                     x^{α+1}/(α+1)
1/x               x < 0 or x > 0            log |x|
e^x               R                         e^x
a^x (a > 0, a≠1)  R                         a^x/log a
sin x             R                         −cos x
cos x             R                         sin x
1/sin²x           R \ {kπ | k ∈ Z}          −cot x
1/cos²x           R \ {π/2 + kπ | k ∈ Z}    tan x
1/(1 + x²)        R                         arctan x
1/√(x² + 1)       R                         arsinh x = log(x + √(x² + 1))
1/√(1 − x²)       −1 < x < 1                arcsin x
1/√(x² − 1)       x < −1 or x > 1           log(x + √(x² − 1))
For continuously differentiable functions f and g on [a,b], integration by parts reads

∫_a^b f·g′ dx = f(x)g(x)|_a^b − ∫_a^b f′·g dx,        (5.29)

and for continuous f and continuously differentiable φ the change-of-variable formula reads

∫_a^b f(φ(t))·φ′(t) dt = ∫_{φ(a)}^{φ(b)} f(x) dx.

Both are proved using the fundamental theorem of calculus. For example, for the change-of-variable formula: if F is an antiderivative of f, then, by the chain rule, F(φ(t)) is an antiderivative of f(φ(t))·φ′(t). By the FTC we have

∫_a^b f(φ(t))·φ′(t) dt = F(φ(t))|_a^b = F(φ(b)) − F(φ(a)) = ∫_{φ(a)}^{φ(b)} f(x) dx.

(b) Put f(x) = e^x and g(x) = x; then f′(x) = e^x and g′(x) = 1, and we obtain

∫ x·e^x dx = x·e^x − ∫ 1·e^x dx = e^x·(x − 1).

(c) On I = (0, ∞):

∫ log x dx = ∫ 1·log x dx = x·log x − ∫ x·(1/x) dx = x·log x − x.

(d)

∫ arctan x dx = ∫ 1·arctan x dx = x·arctan x − ∫ x/(1 + x²) dx
  = x·arctan x − (1/2)·∫ (1 + x²)′/(1 + x²) dx = x·arctan x − (1/2)·log(1 + x²).

In the last equation we made use of (5.32).
(e) Recurrent computation of integrals. Put

I_n := ∫ dx/(1 + x²)^n,   n ∈ N;   I₁ = arctan x.

Then

I_n = ∫ ((1 + x²) − x²)/(1 + x²)^n dx = I_{n−1} − ∫ x²·dx/(1 + x²)^n.

Put u = x and v′ = x/(1 + x²)^n. Then u′ = 1 and

v = ∫ x·dx/(1 + x²)^n = (1/2)·(1 + x²)^{1−n}/(1 − n).

Integration by parts gives

I_n = I_{n−1} − ( (1/2)·x·(1 + x²)^{1−n}/(1 − n) − 1/(2(1 − n))·∫ (1 + x²)^{1−n} dx ),

and hence

I_n = x/((2n − 2)(1 + x²)^{n−1}) + ((2n − 3)/(2n − 2))·I_{n−1}.

In particular,

I₂ = x/(2(1 + x²)) + (1/2)·arctan x   and   I₃ = x/(4(1 + x²)²) + (3/4)·I₂.
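The recursion can be sanity-checked by differentiating the closed form of I₂ numerically; the result should be the integrand 1/(1 + x²)². A Python sketch (the helper name `I2` is ours):

```python
import math

def I2(x):
    # closed form obtained above: I_2 = x/(2(1+x^2)) + (1/2) arctan x
    return x / (2 * (1 + x * x)) + 0.5 * math.atan(x)

x, h = 0.8, 1e-6
deriv = (I2(x + h) - I2(x - h)) / (2 * h)   # symmetric difference quotient
assert abs(deriv - 1.0 / (1 + x * x) ** 2) < 1e-8
```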
Theorem 5.18 (Mean Value Theorem of Integration) Let f be continuous and let φ ∈ R with φ(x) ≥ 0 on [a,b]. Then there exists ξ ∈ [a,b] such that

∫_a^b f(x)·φ(x) dx = f(ξ)·∫_a^b φ(x) dx.

In particular, for φ ≡ 1,

∫_a^b f(x) dx = f(ξ)·(b − a).

Proof. Put m = inf{f(x) | x ∈ [a,b]} and M = sup{f(x) | x ∈ [a,b]}. Since φ ≥ 0, we obtain m·φ(x) ≤ f(x)·φ(x) ≤ M·φ(x). By Proposition 5.9 (a) and (b) we have

m·∫_a^b φ(x) dx ≤ ∫_a^b f(x)·φ(x) dx ≤ M·∫_a^b φ(x) dx.

Hence ∫ f·φ dx = μ·∫ φ dx for some μ ∈ [m, M]. Since f is continuous on [a,b], the intermediate value theorem, Theorem 3.5, ensures that there is a ξ with μ = f(ξ). The claim follows.

Example 5.5 The trapezoid rule. Let f : [0,1] → R be twice continuously differentiable. Then there exists ξ ∈ [0,1] such that

∫_0^1 f(x) dx = (1/2)·(f(0) + f(1)) − (1/12)·f″(ξ).        (5.34)

Proof. Let φ(x) = (1/2)·x(1 − x), such that φ(x) ≥ 0 for x ∈ [0,1], φ′(x) = 1/2 − x, and φ″(x) = −1. Using integration by parts twice as well as Theorem 5.18, we find

∫_0^1 f(x) dx = −∫_0^1 φ″(x)·f(x) dx = −φ′(x)f(x)|_0^1 + ∫_0^1 φ′(x)·f′(x) dx
  = (1/2)·(f(0) + f(1)) + φ(x)f′(x)|_0^1 − ∫_0^1 φ(x)·f″(x) dx
  = (1/2)·(f(0) + f(1)) − f″(ξ)·∫_0^1 φ(x) dx
  = (1/2)·(f(0) + f(1)) − (1/12)·f″(ξ).

Indeed,

∫_0^1 ((1/2)x − (1/2)x²) dx = ((1/4)x² − (1/6)x³)|_0^1 = 1/4 − 1/6 = 1/12.
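Formula (5.34) pins the trapezoid error between bounds of f″/12. For f = exp on [0,1] the error trap − ∫ equals f″(ξ)/12 for some ξ ∈ [0,1], hence lies between 1/12 and e/12. A quick Python check (variable names ours):

```python
import math

f = math.exp
integral = math.e - 1.0                # exact value of the integral of exp on [0,1]
trap = 0.5 * (f(0.0) + f(1.0))         # trapezoid value (1/2)(f(0)+f(1))
err = trap - integral                  # equals f''(xi)/12 by (5.34)
assert 1.0 / 12 <= err <= math.e / 12
```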
with a polynomial p₁(z) of degree n − 1. Applying the induction hypothesis to p₁, the statement follows.

A root α of p is said to be a root of multiplicity k, k ∈ N, if α appears exactly k times among the zeros z₁, z₂, …, z_n. In that case (z − α)^k divides p(z) but (z − α)^{k+1} does not.

If p is a real polynomial, i.e. a polynomial with real coefficients, and α is a root of multiplicity k of p, then ᾱ is also a root of multiplicity k of p. Indeed, taking the complex conjugate of the equation

p(z) = (z − α)^k·q(z)

and using that \overline{p(z)} = p(z̄) for real polynomials, we obtain

p(z̄) = (z̄ − ᾱ)^k·\overline{q(z)};

replacing z̄ by z shows that p(z) = (z − ᾱ)^k·q̃(z) with a polynomial q̃.

Note that the product of the two complex linear factors z − α and z − ᾱ yields a real quadratic factor,

(z − α)(z − ᾱ) = z² − (α + ᾱ)z + α·ᾱ = z² − 2·Re(α)·z + |α|².

Using this fact, the real version of Lemma 5.21 is as follows.

Lemma 5.22 Let q be a real polynomial of degree n with leading coefficient a_n. Then there exist real numbers α_i, β_j, γ_j and multiplicities r_i, s_j ∈ N, i = 1, …, k, j = 1, …, l, such that

q(x) = a_n·∏_{i=1}^k (x − α_i)^{r_i} · ∏_{j=1}^l (x² − 2β_j·x + γ_j)^{s_j}.

We assume that the quadratic factors cannot be factored further; this means

β_j² − γ_j < 0,   j = 1, …, l.

Of course, deg q = Σ_i r_i + 2·Σ_j s_j = n.

Example 5.6 (a) x⁴ − 4 = (x − √2)(x + √2)(x² + 2).

(b) x³ + x − 2. One can guess the first zero x₁ = 1. Using long division one gets x³ + x − 2 = (x − 1)(x² + x + 2):

  (x³      + x − 2) : (x − 1) = x² + x + 2
 −(x³ − x²)
       x²  + x − 2
     −(x²  − x)
           2x − 2
         −(2x − 2)
               0
Example 5.7 (a) Compute

∫ f(x) dx = ∫ x⁴/(x³ − 1) dx.

We use long division to obtain a rational function p/q with deg p < deg q:

f(x) = x + x/(x³ − 1).

To obtain the partial fraction decomposition (PFD), we need the factorization of the denominator polynomial q(x) = x³ − 1. One can guess the first real zero x₁ = 1 and divide q by x − 1; q(x) = (x − 1)(x² + x + 1). The PFD then reads

x/(x³ − 1) = a/(x − 1) + (bx + c)/(x² + x + 1).

We have to determine a, b, c. Multiplication by x³ − 1 gives

0·x² + 1·x + 0 = a(x² + x + 1) + (bx + c)(x − 1) = (a + b)x² + (a − b + c)x + (a − c).

The two polynomials on the left and on the right must coincide, that is, their coefficients must be equal:

0 = a + b,   1 = a − b + c,   0 = a − c;

hence a = 1/3, b = −1/3, c = 1/3, and

x/(x³ − 1) = (1/3)·1/(x − 1) − (1/3)·(x − 1)/(x² + x + 1).

We can integrate the first two terms, but we have to rewrite the last one:

(x − 1)/(x² + x + 1) = (1/2)·(2x + 1)/(x² + x + 1) − (3/2)·1/((x + 1/2)² + 3/4).

Recall that

∫ (2x − 2β)/(x² − 2βx + γ) dx = log|x² − 2βx + γ|,

∫ dx/((x + b)² + a²) = (1/a)·arctan((x + b)/a).

Therefore,

∫ x⁴/(x³ − 1) dx = x²/2 + (1/3)·log|x − 1| − (1/6)·log(x² + x + 1) + (1/√3)·arctan((2x + 1)/√3).
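The antiderivative obtained above is easy to verify numerically: its difference quotient should reproduce the integrand x⁴/(x³ − 1). A Python sketch (the helper names `F` and `f` are ours):

```python
import math

def F(x):
    # antiderivative of x^4/(x^3 - 1) obtained by the PFD above
    return (x * x / 2 + math.log(abs(x - 1)) / 3
            - math.log(x * x + x + 1) / 6
            + math.atan((2 * x + 1) / math.sqrt(3)) / math.sqrt(3))

def f(x):
    return x ** 4 / (x ** 3 - 1)

x, h = 2.0, 1e-6
deriv = (F(x + h) - F(x - h)) / (2 * h)   # symmetric difference quotient
assert abs(deriv - f(x)) < 1e-7
```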
(b) If q(x) = (x − 1)³(x + 2)(x² + 2)²(x² + 1) and p(x) is any polynomial with deg p < deg q = 10, then the partial fraction decomposition reads as

p(x)/q(x) = A₁₁/(x − 1) + A₁₂/(x − 1)² + A₁₃/(x − 1)³ + A₂₁/(x + 2)
  + (B₁₁x + C₁₁)/(x² + 2) + (B₁₂x + C₁₂)/(x² + 2)² + (B₂₁x + C₂₁)/(x² + 1).        (5.36)

Suppose now that p(x) ≡ 1. One can immediately compute A₁₃ and A₂₁. Multiplying (5.36) by (x − 1)³ yields

1/((x + 2)(x² + 2)²(x² + 1)) = A₁₃ + (x − 1)·p₁(x)

with a rational function p₁ not having (x − 1) in the denominator. Inserting x = 1 gives

A₁₃ = 1/(3·3²·2) = 1/54.

Similarly,

A₂₁ = 1/((x − 1)³(x² + 2)²(x² + 1)) |_{x=−2} = 1/((−3)³·6²·5) = −1/4860.
An elementary function is built up from powers, exponentials, logarithms, and trigonometric functions by means of algebraic operations and composition, for example

f(x) = e^{sin(x−1)}/(x + log x).

A function is called elementary integrable if it has an elementary antiderivative. Rational functions are elementary integrable. Most functions are not elementary integrable, for example the integrands of

∫ e^{−x²} dx (Gaussian integral),   ∫ e^x/x dx,   ∫ dx/log x (integral logarithm),   ∫ (sin x)/x dx,

and those of the elliptic integrals of the first and second kind.
∫ R(cos x, sin x) dx
Here R(u, v) = p(u, v)/q(u, v) is a rational function with polynomials p and q. The universal substitution t = tan(x/2) gives
  sin x = 2t/(1 + t²),  cos x = (1 − t²)/(1 + t²),  dx = 2 dt/(1 + t²),
so that
  ∫ R(cos x, sin x) dx = ∫ R((1 − t²)/(1 + t²), 2t/(1 + t²)) · 2 dt/(1 + t²) = ∫ R₁(t) dt
with a rational function R₁ of t.
In special cases simpler substitutions suffice:
(a) R(−u, v) = −R(u, v), R is odd in u. Substitute t = sin x.
(b) R(u, −v) = −R(u, v), R is odd in v. Substitute t = cos x.
(c) R(−u, −v) = R(u, v). Substitute t = tan x.
Example: ∫ sin³x dx. Here R(u, v) = v³ is an odd function in v, such that (b) applies; with t = cos x, dt = −sin x dx,
  ∫ sin³x dx = −∫ (1 − t²) dt = −t + t³/3 = −cos x + (1/3) cos³x.
Similarly, tan x = sin x/cos x is odd in u, so (a) applies with t = sin x, cos²x = 1 − t²:
  ∫ tan x dx = ∫ t/(1 − t²) dt = −(1/2) log(1 − t²) = −log|cos x|.
∫ R(x, ⁿ√(ax + b)) dx
The substitution t = ⁿ√(ax + b), x = (tⁿ − b)/a, dx = (n/a) tⁿ⁻¹ dt turns the integral into a rational one:
  ∫ R(x, ⁿ√(ax + b)) dx = ∫ R((tⁿ − b)/a, t) · (n/a) tⁿ⁻¹ dt.
∫ R(x, √(ax² + 2bx + c)) dx
Using the method of complete squares the above integral can be written in one of the three basic forms
  ∫ R(t, √(t² + 1)) dt,  ∫ R(t, √(t² − 1)) dt,  ∫ R(t, √(1 − t²)) dt.
Further substitutions:
  t = sinh u:  √(t² + 1) = cosh u,  dt = cosh u du,
  t = cosh u:  √(t² − 1) = sinh u,  dt = sinh u du,
  t = cos u:   √(1 − t²) = sin u,   dt = −sin u du.
Example 5.9 Compute I = ∫ dx/√(x² + 6x + 5). Hint: substitute t = √(x² + 6x + 5) − x.
Then (x + t)² = x² + 2tx + t² = x² + 6x + 5, such that t² + 2tx = 6x + 5 and therefore
  x = (t² − 5)/(6 − 2t)
and
  dx = (2t(6 − 2t) + 2(t² − 5))/(6 − 2t)² dt = (−2t² + 12t − 10)/(6 − 2t)² dt.
Hence, using t + x = t + (t² − 5)/(6 − 2t) = (−t² + 6t − 5)/(6 − 2t) = √(x² + 6x + 5) and −2t² + 12t − 10 = 2(−t² + 6t − 5),
  I = ∫ dx/(t + x) = ∫ (−2t² + 12t − 10)/(6 − 2t)² · (6 − 2t)/(−t² + 6t − 5) dt = 2 ∫ dt/(6 − 2t)
  = −log|6 − 2t| + const. = −log|6 − 2√(x² + 6x + 5) + 2x| + const.
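As a numerical sanity check of this Euler substitution (an illustrative sketch, not part of the text), one can differentiate the antiderivative just found and compare with the integrand:

```python
import math

def integrand(x):
    return 1 / math.sqrt(x * x + 6 * x + 5)

def antiderivative(x):
    # F(x) = -log|6 - 2t| with t = sqrt(x^2+6x+5) - x
    t = math.sqrt(x * x + 6 * x + 5) - x
    return -math.log(abs(6 - 2 * t))

def deriv(F, x, h=1e-6):
    # central difference quotient
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (1.0, 2.0, 10.0):
    assert abs(deriv(antiderivative, x) - integrand(x)) < 1e-5
```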
  ∫ₐ^{+∞} f(x) dx = lim_{b→+∞} ∫ₐᵇ f(x) dx   (5.37)
if this limit exists (and is finite). In that case, we say that the integral on the left converges. If it also converges if f has been replaced by |f|, it is said to converge absolutely.
If an integral converges absolutely, then it converges, see Example 5.11 below, where
  |∫ₐ^{+∞} f dx| ≤ ∫ₐ^{+∞} |f| dx.
Similarly one defines
  ∫₋∞^{+∞} f dx := ∫₋∞^a f dx + ∫ₐ^{+∞} f dx.
Example 5.10 (a) ∫₁^{+∞} dx/xˢ converges for s > 1 and diverges for 0 < s ≤ 1. Indeed, for s ≠ 1,
  ∫₁^R dx/xˢ = 1/(1 − s) · x^{1−s} |₁^R = 1/(s − 1) · (1 − 1/R^{s−1}).
Hence
  lim_{R→+∞} ∫₁^R dx/xˢ = { 1/(s − 1), if s > 1;  +∞, if 0 < s < 1 },
and for s = 1, ∫₁^R dx/x = log R → +∞ as well.
(b) ∫₀^{+∞} e^{−x} dx = 1. Since
  ∫₀^R e^{−x} dx = −e^{−x} |₀^R = 1 − e^{−R},
it follows that lim_{R→+∞} ∫₀^R e^{−x} dx = 1.
Proposition 5.24 (Cauchy criterion) The integral ∫ₐ^{+∞} f dx converges if and only if for every ε > 0 there exists some b > a such that for all c, d > b
  |∫_c^d f dx| < ε.   (5.38)
Proof. The following Cauchy criterion for limits of functions is easily proved using sequences: The limit lim_{x→+∞} F(x) exists if and only if for every ε > 0 there exists R > 0 such that x, y > R imply |F(x) − F(y)| < ε.
Indeed, suppose the condition holds, let (xₙ) be any sequence with xₙ → +∞, and choose n₀ such that n ≥ n₀ implies xₙ > R. Hence |F(xₙ) − F(xₘ)| < ε for m, n ≥ n₀. Thus (F(xₙ)) is a Cauchy sequence and therefore convergent. This proves one direction of the above criterion. The other direction is even simpler: Suppose that lim_{x→+∞} F(x) = A exists (and is finite!). We will show that the above criterion is satisfied. Let ε > 0. By definition of the limit there exists R > 0 such that x, y > R imply |F(x) − A| < ε/2 and |F(y) − A| < ε/2. By the triangle inequality,
  |F(x) − F(y)| = |F(x) − A − (F(y) − A)| ≤ |F(x) − A| + |F(y) − A| < ε/2 + ε/2 = ε.
Example 5.11 (a) If ∫ₐ^{+∞} f dx converges absolutely, then ∫ₐ^{+∞} f dx converges. Indeed, let ε > 0 and suppose ∫ₐ^{+∞} |f| dx converges. By the Cauchy criterion for the latter integral and by the triangle inequality, Proposition 5.10, there exists b > 0 such that for all c, d > b
  |∫_c^d f dx| ≤ ∫_c^d |f| dx < ε.   (5.39)
Hence, the Cauchy criterion is satisfied for f if it holds for |f|. Thus, ∫ₐ^{+∞} f dx converges.
(b) ∫₁^{+∞} (sin x)/x dx. Partial integration with u = 1/x and v′ = sin x yields u′ = −1/x², v = −cos x and
  ∫_c^d (sin x)/x dx = −(cos x)/x |_c^d − ∫_c^d (cos x)/x² dx.
Hence
  |∫_c^d (sin x)/x dx| ≤ 1/d + 1/c + ∫_c^d dx/x² = 1/d + 1/c + 1/c − 1/d = 2/c < ε
if c and d are sufficiently large. Hence, ∫₁^{+∞} (sin x)/x dx converges.
The integral does not converge absolutely. For non-negative integers n ∈ Z₊ we have
  ∫_{nπ}^{(n+1)π} |sin x|/x dx ≥ 1/((n + 1)π) ∫_{nπ}^{(n+1)π} |sin x| dx = 2/((n + 1)π);
hence
  ∫_π^{(n+1)π} |sin x|/x dx ≥ (2/π) Σ_{k=1}^n 1/(k + 1).
Since the harmonic series diverges, so does the integral ∫₁^{+∞} |sin x|/x dx.
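The contrast between conditional and absolute convergence can be seen numerically. The sketch below (illustrative only; a composite Simpson rule written from scratch) shows the signed integral staying bounded while the integral of |sin x|/x keeps growing roughly like (2/π) log b.

```python
import math

def simpson(f, a, b, n=20000):
    # composite Simpson rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

plain = lambda x: math.sin(x) / x
absval = lambda x: abs(math.sin(x)) / x

# the absolute integral keeps growing (divergence like a harmonic series) ...
assert simpson(absval, 1.0, 1000.0) > simpson(absval, 1.0, 100.0) + 1.0
# ... while the signed integral stays bounded
assert abs(simpson(plain, 1.0, 1000.0)) < 2.0
```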
Proposition 5.25 Suppose f ∈ R is nonnegative, f ≥ 0. Then ∫ₐ^{+∞} f dx converges if there exists C > 0 such that
  ∫ₐᵇ f dx < C  for all b > a.
The proof is similar to the proof of Lemma 2.19 (c); we omit it. Analogous propositions are true for integrals ∫₋∞^a f dx.
Proposition 5.26 (Integral criterion for series) Assume that f ∈ R is nonnegative, f ≥ 0, and decreasing on [1, +∞). Then ∫₁^{+∞} f dx converges if and only if the series Σ_{n=1}^∞ f(n) converges.
Proof. Since f(n) ≤ f(x) ≤ f(n − 1) for n − 1 ≤ x ≤ n,
  f(n) ≤ ∫_{n−1}^n f dx ≤ f(n − 1).
Summing up over n = 2, …, N gives
  Σ_{n=2}^N f(n) ≤ ∫₁^N f dx ≤ Σ_{n=1}^{N−1} f(n).
If ∫₁^{+∞} f dx converges, the series Σ_{n=1}^∞ f(n) is bounded and therefore convergent.
Conversely, if Σ_{n=1}^∞ f(n) converges, the integral ∫₁^R f dx is bounded as R → ∞, hence convergent by Proposition 5.25.
Example 5.12 Σ_{n=2}^∞ 1/(n(log n)^α) converges if and only if ∫₂^{+∞} dx/(x(log x)^α) converges. The substitution y = log x, dy = dx/x gives
  ∫₂^{+∞} dx/(x(log x)^α) = ∫_{log 2}^{+∞} dy/y^α
which converges if and only if α > 1 (see Example 5.10).
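For α = 2 the integral ∫₂^∞ dx/(x log²x) = 1/log 2 can be computed explicitly, and Proposition 5.26 squeezes the series between integrals. A small sketch (assumptions: plain Python, truncation at n = 200000):

```python
import math

def f(x):
    # the decreasing function of Example 5.12 with alpha = 2
    return 1 / (x * math.log(x)**2)

# explicit value of the improper integral: 1/log 2 - 1/log R -> 1/log 2
limit = 1 / math.log(2)

# partial sums of the series starting at n = 3
partial = sum(f(n) for n in range(3, 200000))

# by Proposition 5.26, sum_{n>=3} f(n) <= integral from 2, so the partial
# sums are bounded by the integral's value -- the series converges
assert partial < limit
assert partial > 0.9   # and it really accumulates a positive mass
```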
If f is unbounded at b, define
  ∫ₐᵇ f dx = lim_{t→b−0} ∫ₐᵗ f dx,
and if f is unbounded at a,
  ∫ₐᵇ f dx = lim_{t→a+0} ∫ₜᵇ f dx.
(a)
  ∫₀¹ dx/√(1 − x²) = lim_{t→1−0} ∫₀ᵗ dx/√(1 − x²) = lim_{t→1−0} arcsin x|₀ᵗ = lim_{t→1−0} arcsin t = arcsin 1 = π/2.
(b)
  ∫₀¹ dx/x^α = lim_{t→0+0} ∫ₜ¹ dx/x^α = lim_{t→0+0} { (1/(1 − α)) x^{1−α}|ₜ¹,  α ≠ 1;  log x|ₜ¹,  α = 1 }
    = { 1/(1 − α), if α < 1;  +∞, if α ≥ 1 }.
Remarks 5.4 (a) The analogous statements to Proposition 5.24 and Proposition 5.25 are true for improper integrals ∫ₐᵇ f dx.
For example, ∫₀¹ dx/(x(1 − x)) diverges since both improper integrals ∫₀^{1/2} f dx and ∫_{1/2}¹ f dx diverge; ∫₀¹ dx/(√x (1 − x)) diverges since it diverges at x = 1; finally I = ∫₀¹ dx/√(x(1 − x)) converges.
Indeed, the integral splits as
  ∫ₐᵇ f dx = ∫ₐᶜ f dx + ∫_cᵇ f dx
if c is between a and b and both improper integrals on the right side exist.
(c) Also, if f is unbounded at an interior point c ∈ (a, b), define
  ∫ₐᵇ f dx = ∫ₐᶜ f dx + ∫_cᵇ f dx
if the two improper integrals on the right side exist. For example,
  ∫₋₁¹ dx/√|x| = ∫₋₁⁰ dx/√|x| + ∫₀¹ dx/√|x| = lim_{t→0−0} ∫₋₁ᵗ dx/√|x| + lim_{t→0+0} ∫ₜ¹ dx/√|x|
    = lim_{t→0−0} (2 − 2√(−t)) + lim_{t→0+0} (2 − 2√t) = 4.
  Γ(x) = ∫₀^{+∞} t^{x−1} e^{−t} dt.   (5.40)
The integral converges at both ends: for 0 < t ≤ 1 we have t^{x−1} e^{−t} ≤ 1/t^{1−x} and ∫₀¹ dt/t^{1−x} converges since 1 − x < 1; for t ≥ 1,
  t^{x−1} e^{−t} ≤ C/t².
Note that lim_{t→∞} t^{x+1} e^{−t} = 0 by Proposition 3.11. Hence, Γ(x) is defined for every x > 0.
Proposition 5.27 For every positive x
  x Γ(x) = Γ(x + 1).   (5.41)
Proof. Integration by parts with u = tˣ and v′ = e^{−t} gives
  ∫_ε^R tˣ e^{−t} dt = −tˣ e^{−t} |_ε^R + x ∫_ε^R t^{x−1} e^{−t} dt.
Taking the limits ε → 0+0 and R → +∞ one has Γ(x + 1) = xΓ(x). Since by Example 5.10
  Γ(1) = ∫₀^{+∞} e^{−t} dt = 1,
it follows from (5.41) that
  Γ(n + 1) = nΓ(n) = ⋯ = n(n − 1)(n − 2) ⋯ Γ(1) = n!
The Gamma function interpolates the factorial function n! which is defined only for positive
integers n. However, this property alone is not sufficient for a complete characterization of the
Gamma function. We need another property. This will be discussed in more detail in the appendix
to this chapter.
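The recursion Γ(x+1) = xΓ(x) and the interpolation property Γ(n+1) = n! are easy to check numerically with the library Gamma function; the last lines also approximate the defining integral (5.40) by a crude Riemann sum.

```python
import math

# the functional equation Gamma(x+1) = x * Gamma(x)
for x in (0.5, 1.7, 3.2):
    assert math.isclose(math.gamma(x + 1), x * math.gamma(x), rel_tol=1e-12)

# Gamma interpolates the factorial: Gamma(n+1) = n!
for n in range(1, 10):
    assert math.isclose(math.gamma(n + 1), math.factorial(n), rel_tol=1e-12)

# crude Riemann sum for the defining integral with x = 5 (value 4! = 24);
# the tail beyond t = 60 is negligible
x, h = 5.0, 1e-3
approx = sum((k * h)**(x - 1) * math.exp(-k * h) * h for k in range(1, 60000))
assert abs(approx - math.factorial(4)) < 0.01
```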
Rb
Rb
In other words a f d is the point in Rk whose jth coordinate is a fj d. It is clear that parts
(a), (c), and (e) of Proposition 5.9 are valid for these vector valued integrals; we simply apply
the earlier results to each coordinate. The same is true for Proposition 5.12, Theorem 5.14, and
Theorem 5.15. To illustrate this, we state the analog of the fundamental theorem of calculus.
If f = (f₁, …, f_k) maps [a, b] into ℝᵏ and each f_j ∈ R(α), then
  ‖∫ₐᵇ f dα‖ ≤ ∫ₐᵇ ‖f‖ dα.   (5.42)
By Proposition 5.10 (a) each of the functions f_i² belongs to R(α); hence so does their sum f₁² + f₂² + ⋯ + f_k². Note that the square root is a continuous function on the positive half line, so ‖f‖ = (f₁² + ⋯ + f_k²)^{1/2} ∈ R(α). If we put y = (y₁, …, y_k) with y_j = ∫ₐᵇ f_j dα, then
  ‖y‖² = Σ_{j=1}^k y_j² = Σ_{j=1}^k y_j ∫ₐᵇ f_j dα = ∫ₐᵇ Σ_{j=1}^k (y_j f_j) dα.
By the Cauchy–Schwarz inequality,
  Σ_{j=1}^k y_j f_j(t) ≤ ‖y‖ ‖f(t)‖,  t ∈ [a, b].
Inserting this into the preceding equation, the monotony of the integral gives
  ‖y‖² ≤ ‖y‖ ∫ₐᵇ ‖f‖ dα.
If y = 0, (5.42) is trivial; otherwise division by ‖y‖ yields the claim.
For a complex-valued function φ = u + iv on [a, b] define
  ∫ₐᵇ φ dx = ∫ₐᵇ u dx + i ∫ₐᵇ v dx.
The fundamental theorem of calculus holds: If the complex function φ is Riemann integrable, φ ∈ R on [a, b], and F(x) is an antiderivative of φ, then
  ∫ₐᵇ φ dx = F(b) − F(a).
Similarly, if u and v are both continuous, F(x) = ∫ₐˣ φ(t) dt is an antiderivative of φ(x).
Proof. Let F = U + iV be the antiderivative of φ where U′ = u and V′ = v. By the fundamental theorem of calculus
  ∫ₐᵇ φ dx = ∫ₐᵇ u dx + i ∫ₐᵇ v dx = U(b) − U(a) + i(V(b) − V(a)) = F(b) − F(a).
Example:
  ∫ₐᵇ e^{λt} dt = (1/λ) e^{λt} |ₐᵇ,  λ ∈ ℂ \ {0}.
5.5 Inequalities
Besides the triangle inequality |∫ₐᵇ f dα| ≤ ∫ₐᵇ |f| dα which was shown in Proposition 5.10, we can formulate Hölder's, Minkowski's, and the Cauchy–Schwarz inequalities for Riemann–Stieltjes integrals. For, let p > 0 be a fixed positive real number and α an increasing function on [a, b]. For f ∈ R(α) define the Lᵖ-norm
  ‖f‖_p = (∫ₐᵇ |f|ᵖ dα)^{1/p}.   (5.43)
Cauchy–Schwarz Inequality
Proposition 5.30 Let f, g : [a, b] → ℂ be complex valued functions with f, g ∈ R on [a, b]. Then
  (∫ₐᵇ |fg| dx)² ≤ ∫ₐᵇ |f|² dx · ∫ₐᵇ |g|² dx.   (5.44)
Proof. Replacing f by |f| and g by |g|, it suffices to show (∫ fg dx)² ≤ ∫ f² dx · ∫ g² dx for nonnegative f, g. For, put A = ∫ₐᵇ g² dx, B = ∫ₐᵇ fg dx, and C = ∫ₐᵇ f² dx. Let λ ∈ ℝ be arbitrary. By the positivity and linearity of the integral,
  0 ≤ ∫ₐᵇ (λf + g)² dx = λ² ∫ₐᵇ f² dx + 2λ ∫ₐᵇ fg dx + ∫ₐᵇ g² dx = Cλ² + 2Bλ + A =: h(λ).   (5.45)
This is satisfied for all λ if and only if the discriminant of the quadratic polynomial h is non-positive, that is,
  B² ≤ CA,   (5.46)
which is the claim.
(b) Hölder's inequality. Let p and q be positive real numbers such that 1/p + 1/q = 1. If f, g ∈ R(α), then
  |∫ₐᵇ f g dα| ≤ ∫ₐᵇ |fg| dα ≤ ‖f‖_p ‖g‖_q.   (5.47)
(c) Minkowski's inequality. If f, g ∈ R(α), then
  ‖f + g‖_p ≤ ‖f‖_p + ‖g‖_p.   (5.48)
5.6 Appendix D
The composition of an integrable and a continuous function is integrable.
Proof of Proposition 5.8. Let ε > 0. Since φ is uniformly continuous on [m, M], there exists δ > 0 such that δ < ε and |φ(s) − φ(t)| < ε if |s − t| < δ and s, t ∈ [m, M].
Since f ∈ R(α), there exists a partition P = {x₀, x₁, …, xₙ} of [a, b] such that
  U(P, f, α) − L(P, f, α) < δ².   (5.49)
Let Mᵢ and mᵢ have the same meaning as in Definition 5.1, and let Mᵢ* and mᵢ* be the analogous numbers for h = φ∘f. Divide the numbers 1, 2, …, n into two classes: i ∈ A if Mᵢ − mᵢ < δ and i ∈ B if Mᵢ − mᵢ ≥ δ. For i ∈ A our choice of δ shows that Mᵢ* − mᵢ* ≤ ε. For i ∈ B, Mᵢ* − mᵢ* ≤ 2K where K = sup{|φ(t)| | m ≤ t ≤ M}. By (5.49), we have
  δ Σ_{i∈B} Δαᵢ ≤ Σ_{i∈B} (Mᵢ − mᵢ) Δαᵢ < δ²,   (5.50)
so that Σ_{i∈B} Δαᵢ < δ < ε. It follows that
  U(P, h, α) − L(P, h, α) = Σ_{i∈A} (Mᵢ* − mᵢ*) Δαᵢ + Σ_{i∈B} (Mᵢ* − mᵢ*) Δαᵢ
    ≤ ε(α(b) − α(a)) + 2Kε = ε(α(b) − α(a) + 2K).
Since ε was arbitrary, h ∈ R(α).
  f(x) ≥ f(x₁) + (f(x₃) − f(x₁))/(x₃ − x₁) · (x − x₁).
This means that f is bounded below on [x₃, x₂] by a linear function; hence f is bounded on [x₃, x₂], say |f(x)| ≤ C on [x₃, x₂].
The convexity implies
  f(x) = f((1/2)(x + h) + (1/2)(x − h)) ≤ (1/2)(f(x + h) + f(x − h))
  ⟹ f(x) − f(x − h) ≤ f(x + h) − f(x).
Iteration yields
  f(x − (ν − 1)h) − f(x − νh) ≤ f(x + h) − f(x) ≤ f(x + νh) − f(x + (ν − 1)h).
Summing up over ν = 1, …, n we have
  f(x) − f(x − nh) ≤ n(f(x + h) − f(x)) ≤ f(x + nh) − f(x)
  ⟹ (1/n)(f(x) − f(x − nh)) ≤ f(x + h) − f(x) ≤ (1/n)(f(x + nh) − f(x)).
Let ε > 0 be given; choose n ∈ ℕ such that 2C/n < ε and choose h such that x₃ < x − nh < x < x + nh < x₂. The above inequality then implies
  |f(x + h) − f(x)| ≤ 2C/n < ε.
Apply Hölder's inequality
  ∫ f(t)g(t) dt ≤ (∫ f(t)ᵖ dt)^{1/p} (∫ g(t)^q dt)^{1/q}
to the functions
  f(t) = t^{(x−1)/p} e^{−t/p},  g(t) = t^{(y−1)/q} e^{−t/q}.
Note that
  f(t)g(t) = t^{x/p + y/q − 1} e^{−t},  f(t)ᵖ = t^{x−1} e^{−t},  g(t)^q = t^{y−1} e^{−t}.
Hence Γ(x/p + y/q) ≤ Γ(x)^{1/p} Γ(y)^{1/q}; that is, log Γ is convex.
Remark 5.5 One can prove that a convex function (see Definition 4.4) is continuous, see Proposition 5.32. Also, an increasing convex function of a convex function f is convex, for example
ef is convex if f is. We conclude that (x) is continuous for x > 0.
Theorem 5.34 Let F : (0, +) (0, +) be a function with
(a) F (1) = 1,
(b) F (x + 1) = xF (x),
(c) F is logarithmic convex.
Then F (x) = (x) for all x > 0.
Proof. Since Γ(x) has the properties (a), (b), and (c), it suffices to prove that F is uniquely determined by (a), (b), and (c). By (b),
  F(x + n) = F(x) · x(x + 1) ⋯ (x + n − 1)
for every positive x and every positive integer n. In particular F(n + 1) = n! and it suffices to show that F(x) is uniquely determined for every x ∈ (0, 1). Since n + x = (1 − x)n + x(n + 1), from (c) it follows that
  F(n + x) ≤ F(n)^{1−x} F(n + 1)ˣ = F(n)^{1−x} F(n)ˣ nˣ = (n − 1)! nˣ.
Similarly, from n + 1 = x(n + x) + (1 − x)(n + 1 + x) it follows that
  n! = F(n + 1) ≤ F(n + x)ˣ F(n + 1 + x)^{1−x} = F(n + x)(n + x)^{1−x}.
Combining both inequalities,
  n!(n + x)^{x−1} ≤ F(n + x) ≤ (n − 1)! nˣ
and moreover, dividing by x(x + 1) ⋯ (x + n − 1),
  aₙ(x) := n!(n + x)^{x−1}/(x(x + 1) ⋯ (x + n − 1)) ≤ F(x) ≤ (n − 1)! nˣ/(x(x + 1) ⋯ (x + n − 1)) =: bₙ(x).
Since
  bₙ(x)/aₙ(x) = (n + x) nˣ/(n (n + x)ˣ)
converges to 1 as n → ∞,
  F(x) = lim_{n→∞} (n − 1)! nˣ/(x(x + 1) ⋯ (x + n − 1)).
Hence F is uniquely determined.
Stirling's Formula
We give an asymptotic formula for n! as n → ∞. We call two sequences (aₙ) and (bₙ) asymptotically equal if lim_{n→∞} aₙ/bₙ = 1, and we write aₙ ~ bₙ.
Proposition 5.35 (Stirling's Formula) The asymptotic behavior of n! is
  n! ~ √(2πn) (n/e)ⁿ.
Proof. Using the trapezoid rule (5.34) with f(x) = log x, f″(x) = −1/x², we have
  ∫ₖ^{k+1} log x dx = (1/2)(log k + log(k + 1)) + εₖ,  0 ≤ εₖ ≤ 1/(12k²).
Summing up over k = 1, …, n − 1 and using ∫₁ⁿ log x dx = n log n − n + 1, we obtain
  Σ_{k=1}^n log k = (n + 1/2) log n − n + γₙ,  γₙ := 1 − Σ_{k=1}^{n−1} εₖ.
Since Σ 1/k² converges, the limit γ := lim_{n→∞} γₙ exists. Exponentiating,
  n! = n^{n+1/2} e^{−n} cₙ,  cₙ := e^{γₙ},   (5.51)
with c := lim_{n→∞} cₙ = e^γ > 0. To determine c we use Wallis's product
  π/2 = Π_{k=1}^∞ 4k²/(4k² − 1) = lim_{n→∞} (2·2·4·4 ⋯ 2n·2n)/(1·3·3·5 ⋯ (2n − 1)(2n + 1)).   (5.52)
From
  (Π_{k=1}^n 4k²/(4k² − 1))^{1/2} = (2·4 ⋯ 2n)/(1·3·5 ⋯ (2n − 1)) · 1/√(2n + 1) = 2^{2n}(n!)²/((2n)! √(2n + 1))
we obtain
  √(π/2) = lim_{n→∞} 2^{2n}(n!)²/((2n)! √(2n + 1)) = lim_{n→∞} 2^{2n}(n!)²/((2n)! √(2n)).
Inserting n! = n^{n+1/2} e^{−n} cₙ and (2n)! = (2n)^{2n+1/2} e^{−2n} c₂ₙ gives
  2^{2n}(n!)²/((2n)! √(2n)) = 2^{2n} n^{2n+1} cₙ²/((2n)^{2n+1/2} c₂ₙ √(2n)) = (cₙ²/c₂ₙ) · (1/2) → c/2.
Hence √(π/2) = c/2, that is, c = √(2π), and (5.51) yields n! ~ √(2πn)(n/e)ⁿ.
Proof of Hölder's inequality for integrals. Let ε > 0. Choose a partition P = {x₀, …, xₙ} of [a, b] such that
  Σ_{i=1}^n f(tᵢ)ᵖ Δαᵢ < ∫ₐᵇ fᵖ dα + ε,   (5.54)
  Σ_{i=1}^n g(tᵢ)^q Δαᵢ < ∫ₐᵇ g^q dα + ε,   (5.55)
for any tᵢ ∈ [x_{i−1}, xᵢ]. Using the two preceding inequalities and Hölder's inequality (1.22) we have
  Σ_{i=1}^n f(tᵢ) g(tᵢ) Δαᵢ = Σ_{i=1}^n f(tᵢ) Δαᵢ^{1/p} · g(tᵢ) Δαᵢ^{1/q}
    ≤ (Σ_{i=1}^n f(tᵢ)ᵖ Δαᵢ)^{1/p} (Σ_{i=1}^n g(tᵢ)^q Δαᵢ)^{1/q}
    < (∫ₐᵇ fᵖ dα + ε)^{1/p} (∫ₐᵇ g^q dα + ε)^{1/q}.
By (5.53),
  ∫ₐᵇ f g dα < Σ_{i=1}^n (fg)(tᵢ) Δαᵢ + ε < (∫ₐᵇ fᵖ dα + ε)^{1/p} (∫ₐᵇ g^q dα + ε)^{1/q} + ε.
Since ε > 0 was arbitrary, the claim follows.
Chapter 6
Sequences of Functions and Basic
Topology
In the present chapter we turn our attention to complex-valued functions (including the real-valued ones), although many of the theorems and proofs which follow extend to vector-valued functions without difficulty and even to mappings into more general spaces. We stay within this
simple framework in order to focus attention on the most important aspects of the problem that
arise when limit processes are interchanged.
  f(x) = lim_{n→∞} fₙ(x),  x ∈ E.   (6.1)
Under these circumstances we say that (fₙ) converges on E and f is the limit (or the limit function) of (fₙ). Sometimes we say that (fₙ) converges pointwise to f on E if (6.1) holds.
Similarly, if Σ_{n=1}^∞ fₙ(x) converges for every x ∈ E, and if we define
  f(x) = Σ_{n=1}^∞ fₙ(x),  x ∈ E,   (6.2)
the function f is called the sum of the series Σ fₙ.
The main problem which arises is to determine whether important properties of the functions
fn are preserved under the limit operations (6.1) and (6.2). For instance, if the functions fn are
continuous, or differentiable, or integrable, is the same true of the limit function? What are the
relations between fn and f , say, or between the integrals of fn and that of f ? To say that f is
continuous at x means
  lim_{t→x} f(t) = f(x).
Hence, to ask whether the limit of a sequence of continuous functions is continuous is the same
as to ask whether
  lim_{t→x} lim_{n→∞} fₙ(t) = lim_{n→∞} lim_{t→x} fₙ(t),   (6.3)
i. e. whether the order in which limit processes are carried out is immaterial. We shall now
show by means of several examples that limit processes cannot in general be interchanged
without affecting the result. Afterwards, we shall prove that under certain conditions the order
in which limit operations are carried out is inessential.
Example 6.1 (a) Our first example, and the simplest one, concerns a double sequence. For positive integers m, n ∈ ℕ let
  s_{mn} = m/(m + n).
Then, for fixed n,
  lim_{m→∞} s_{mn} = 1,
so that lim_{n→∞} lim_{m→∞} s_{mn} = 1. On the other hand, for every fixed m,
  lim_{n→∞} s_{mn} = 0,
so that lim_{m→∞} lim_{n→∞} s_{mn} = 0.
After these examples, which show what can go wrong if limit processes are interchanged carelessly, we now define a new notion of convergence, stronger than pointwise convergence as defined in Definition 6.1, which will enable us to arrive at positive results.
Definition 6.2 We say that a sequence of functions (fₙ) converges uniformly on E to a function f if for every ε > 0 there is an integer n₀ such that n ≥ n₀ implies
  |fₙ(x) − f(x)| ≤ ε  for all x ∈ E.   (6.4)
[Figure: the graph of fₙ lies in the ε-tube between f(x) − ε and f(x) + ε.]
It is clear that every uniformly convergent sequence is pointwise convergent (to the same function). Quite explicitly, the difference between the two concepts is this: If (fn ) converges pointwise on E to a function f , for every > 0 and for every x E, there exists an integer n0
depending on both and x E such that (6.4) holds if n n0 . If (fn ) converges uniformly on
E it is possible, for each > 0 to find one integer n0 which will do for all x E.
We say that the series Σ_{k=1}^∞ f_k(x) converges uniformly on E if the sequence (sₙ(x)) of partial sums defined by
  sₙ(x) = Σ_{k=1}^n f_k(x)
converges uniformly on E.
Proposition 6.1 (Cauchy criterion) (a) The sequence of functions (fₙ) defined on E converges uniformly on E if and only if for every ε > 0 there is an integer n₀ such that n, m ≥ n₀ and x ∈ E imply
  |fₙ(x) − fₘ(x)| ≤ ε.   (6.5)
(b) The series of functions Σ_{k=1}^∞ g_k converges uniformly on E if and only if for every ε > 0 there is an integer n₀ such that n ≥ m ≥ n₀ and x ∈ E imply |Σ_{k=m}^n g_k(x)| ≤ ε.
Proof. Suppose (fn ) converges uniformly on E and let f be the limit function. Then there is an
integer n0 such that n n0 , x E implies
  |fₙ(x) − f(x)| ≤ ε/2,
so that
  |fₙ(x) − fₘ(x)| ≤ |fₙ(x) − f(x)| + |fₘ(x) − f(x)| ≤ ε
if m, n ≥ n₀, x ∈ E.
Conversely, suppose the Cauchy condition holds. By Proposition 2.18, the sequence (fn (x))
converges for every x to a limit which may we call f (x). Thus the sequence (fn ) converges
pointwise on E to f. We have to prove that the convergence is uniform. Let ε > 0 be given, and choose n₀ such that (6.5) holds. Fix n and let m → ∞ in (6.5). Since fₘ(x) → f(x) as m → ∞ this gives
  |fₙ(x) − f(x)| ≤ ε
for every n ≥ n₀ and x ∈ E.
(b) immediately follows from (a) with fₙ(x) = Σ_{k=1}^n g_k(x).
Remark 6.1 Suppose
  lim_{n→∞} fₙ(x) = f(x),  x ∈ E.
Put
  Mₙ = sup_{x∈E} |fₙ(x) − f(x)|.
Then fₙ → f uniformly on E if and only if Mₙ → 0 as n → ∞. (Prove!)
The following comparison test of a function series with a numerical series gives a sufficient
criterion for uniform convergence.
Theorem 6.2 (Weierstraß) Suppose (fₙ) is a sequence of functions defined on E, and suppose
  |fₙ(x)| ≤ Mₙ,  x ∈ E, n ∈ ℕ.   (6.6)
Then Σ_{n=1}^∞ fₙ converges uniformly on E if Σ_{n=1}^∞ Mₙ converges.
Proof. If Σ Mₙ converges, then, for arbitrary ε > 0 there exists n₀ such that n ≥ m ≥ n₀ implies Σ_{i=m}^n Mᵢ ≤ ε. Hence, by the triangle inequality and (6.6),
  |Σ_{i=m}^n fᵢ(x)| ≤ Σ_{i=m}^n |fᵢ(x)| ≤ Σ_{i=m}^n Mᵢ ≤ ε,  x ∈ E.
Uniform convergence now follows from Proposition 6.1 (b).
Proposition 6.3 (Comparison Test) If Σ_{n=1}^∞ gₙ(x) converges uniformly on E and |fₙ(x)| ≤ gₙ(x) for all sufficiently large n and all x ∈ E, then Σ_{n=1}^∞ fₙ(x) converges uniformly on E.
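The Weierstraß test can be made concrete numerically. For Σ sin(nx)/n², the majorant is Σ 1/n², whose tail beyond N is smaller than 1/N; hence the distance between two partial sums is below that bound for every x simultaneously — which is exactly uniformity. A sketch:

```python
import math

def partial(x, N):
    # partial sum of the series  sum_{n>=1} sin(n x)/n^2
    return sum(math.sin(n * x) / n**2 for n in range(1, N + 1))

N, M = 100, 5000
tail_bound = 1 / N   # since sum_{n>N} 1/n^2 < 1/N
for x in (0.1, 1.0, 2.5, 6.0):
    # the same error bound works for every x -- uniform convergence
    assert abs(partial(x, M) - partial(x, N)) < tail_bound
```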
Proposition 6.4 (a) Suppose the power series
  Σ_{n=0}^∞ aₙ zⁿ,  aₙ ∈ ℂ,   (6.7)
has radius of convergence R. Then it converges uniformly on every closed disc {z | |z| ≤ r} with 0 < r < R. Moreover, the term by term differentiated series
  Σ_{n=1}^∞ n aₙ z^{n−1}
has the same radius of convergence R as the series (6.7) and hence also converges uniformly on the closed disc {z | |z| ≤ r}.
Indeed, this simply follows from the fact that
  lim_{n→∞} ⁿ√((n + 1)|a_{n+1}|) = lim_{n→∞} ⁿ√(n + 1) · lim_{n→∞} ⁿ√|aₙ| = 1/R.
(b) Note that the power series in general does not converge uniformly on the whole open disc of convergence |z| < R. As an example, consider the geometric series
  f(z) = 1/(1 − z) = Σ_{k=0}^∞ zᵏ,  |z| < 1.
With zₙ = 1 − 1/n,
  |s_{n−1}(zₙ) − f(zₙ)| = |Σ_{k=n}^∞ zₙᵏ| = zₙⁿ/(1 − zₙ) = n(1 − 1/n)ⁿ ≥ 1   (6.8)
for all large n. The geometric series doesn't converge uniformly on the whole open unit disc.
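The failure of uniformity on the open disc is easy to see numerically: along the points zₙ = 1 − 1/n the error of the partial sum s_{n−1} grows roughly like n/e. A small sketch:

```python
# error |s_{n-1}(z) - 1/(1-z)| of the geometric series at z = 1 - 1/n;
# it equals z^n/(1-z) = n(1-1/n)^n and grows without bound
def error_at(n):
    z = 1 - 1 / n
    s = sum(z**k for k in range(n))      # partial sum s_{n-1}
    return abs(1 / (1 - z) - s)

errs = [error_at(n) for n in (10, 100, 1000)]
assert errs[0] < errs[1] < errs[2]       # the sup-error keeps growing
assert errs[2] > 100                     # already larger than any fixed epsilon
```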
Example 6.2 (a) A series of the form
  Σ_{n=0}^∞ aₙ cos(nx) + Σ_{n=1}^∞ bₙ sin(nx),  aₙ, bₙ, x ∈ ℝ,   (6.9)
is called a Fourier series (see Section 6.3 below). If both Σ_{n=0}^∞ |aₙ| and Σ_{n=1}^∞ |bₙ| converge then the series (6.9) converges uniformly on ℝ to a function F(x).
Indeed, since |aₙ cos(nx)| ≤ |aₙ| and |bₙ sin(nx)| ≤ |bₙ|, by Theorem 6.2, the series (6.9) converges uniformly on ℝ.
(b) Let f : ℝ → ℝ be the sum of the Fourier series
  f(x) = Σ_{n=1}^∞ (sin nx)/n.   (6.10)
Note that (a) does not apply since Σₙ |bₙ| = Σₙ 1/n diverges.
If f(x) exists, so does f(x + 2π) = f(x), and f(0) = 0. We will show that the series converges uniformly on [δ, 2π − δ] for every δ > 0. For, put
  sₙ(x) = Σ_{k=1}^n sin kx = Im(Σ_{k=1}^n e^{ikx}).
If δ ≤ x ≤ 2π − δ we have
  |sₙ(x)| ≤ |Σ_{k=1}^n e^{ikx}| = |(e^{i(n+1)x} − e^{ix})/(e^{ix} − 1)| ≤ 2/|e^{ix/2} − e^{−ix/2}| = 1/sin(x/2) ≤ 1/sin(δ/2).
Note that |Im z| ≤ |z| and |e^{ix}| = 1. Since sin(x/2) ≥ sin(δ/2) for δ/2 ≤ x/2 ≤ π − δ/2, we have for 0 < m < n (Abel summation)
  Σ_{k=m}^n (sin kx)/k = Σ_{k=m}^n (sₖ(x) − s_{k−1}(x))/k = Σ_{k=m}^{n−1} sₖ(x)(1/k − 1/(k + 1)) + sₙ(x)/n − s_{m−1}(x)/m,
hence
  |Σ_{k=m}^n (sin kx)/k| ≤ (1/sin(δ/2)) (Σ_{k=m}^{n−1} (1/k − 1/(k + 1)) + 1/n + 1/m)
    = (1/sin(δ/2)) (1/m − 1/n + 1/n + 1/m) = 2/(m sin(δ/2)).
The right side becomes arbitrarily small as m → ∞. Using Proposition 6.1 (b), uniform convergence of (6.10) on [δ, 2π − δ] follows.
  |f(t) − f(x)| ≤ |f(t) − fₙ(t)| + |fₙ(t) − fₙ(x)| + |fₙ(x) − f(x)| ≤ ε/3 + ε/3 + ε/3 = ε
for all x ∈ E.
Also, Example 6.1 (b) shows that the continuity of the fn (x) = xn alone is not sufficient for
the continuity of the limit function. On the other hand, the sequence of continuous functions
(xn ) on (0, 1) converges to the continuous function 0. However, the convergence is not uniform.
Prove!
Consider fₙ(x) = 2n²x e^{−n²x²} on [0, 1]. Then fₙ(x) → 0 pointwise, and
  ∫₀¹ fₙ(x) dx = −e^{−n²x²} |₀¹ = 1 − e^{−n²} → 1.
On the other hand ∫₀¹ lim_{n→∞} fₙ(x) dx = ∫₀¹ 0 dx = 0. Thus, lim_{n→∞} and integration cannot be interchanged. The reason: (fₙ) converges pointwise to 0 but not uniformly. Indeed,
  fₙ(1/n) = 2n² · (1/n) · e^{−1} = 2n/e → +∞.
Theorem 6.6 Let be an increasing function on [a, b]. Suppose fn R() on [a, b] for all
n N and suppose fn f uniformly on [a, b]. Then f R() on [a, b] and
Z b
Z b
f d = lim
fn d.
(6.11)
n
Proof. Put
  εₙ = sup_{x∈[a,b]} |fₙ(x) − f(x)|.
Then
  fₙ − εₙ ≤ f ≤ fₙ + εₙ,
so that the upper and the lower integrals of f satisfy
  ∫ₐᵇ (fₙ − εₙ) dα ≤ ∫̲ f dα ≤ ∫̄ f dα ≤ ∫ₐᵇ (fₙ + εₙ) dα.   (6.12)
Hence,
  0 ≤ ∫̄ f dα − ∫̲ f dα ≤ 2εₙ (α(b) − α(a)).
Since εₙ → 0 as n → ∞ (Remark 6.1), the upper and the lower integrals of f are equal. Thus f ∈ R(α). Another application of (6.12) yields
  ∫ₐᵇ (fₙ − εₙ) dα ≤ ∫ₐᵇ f dα ≤ ∫ₐᵇ (fₙ + εₙ) dα,
hence
  |∫ₐᵇ f dα − ∫ₐᵇ fₙ dα| ≤ εₙ (α(b) − α(a)).
This implies (6.11).
a
Corollary 6.7 If fₙ ∈ R(α) on [a, b] and if the series f(x) = Σ_{n=1}^∞ fₙ(x), a ≤ x ≤ b, converges uniformly on [a, b], then
  ∫ₐᵇ (Σ_{n=1}^∞ fₙ) dα = Σ_{n=1}^∞ ∫ₐᵇ fₙ dα.
Proof. The pointwise convergence of Fₙ follows from the above theorem with α(t) = t and a and b replaced by x₀ and x.
We show uniform convergence: Let ε > 0. Since fₙ ⇉ f on [a, b], there exists n₀ ∈ ℕ such that n ≥ n₀ implies |fₙ(t) − f(t)| ≤ ε/(b − a) for all t. Then
  |Fₙ(x) − F(x)| = |∫_{x₀}^x (fₙ(t) − f(t)) dt| ≤ ∫_{x₀}^x |fₙ(t) − f(t)| dt ≤ (ε/(b − a)) (b − a) = ε.
(a) For |t| < 1 we have
  log(1 + t) = t − t²/2 + t³/3 − ⋯ = Σ_{n=1}^∞ (−1)^{n−1} tⁿ/n.   (6.13)
Proof. In Homework 13.5 (a) the Taylor series
  T(x) = Σ_{n=1}^∞ (−1)^{n−1} xⁿ/n
was computed. By Proposition 6.4 the geometric series Σ_{n=0}^∞ (−1)ⁿ xⁿ converges uniformly to the function 1/(1 + x) on [−r, r] for all 0 < r < 1. By Corollary 6.7 we have for all t ∈ [−r, r]
  log(1 + t) = log(1 + x)|₀ᵗ = ∫₀ᵗ dx/(1 + x) = ∫₀ᵗ Σ_{n=0}^∞ (−1)ⁿ xⁿ dx
    = Σ_{n=0}^∞ ∫₀ᵗ (−1)ⁿ xⁿ dx = Σ_{n=0}^∞ (−1)ⁿ t^{n+1}/(n + 1) = Σ_{n=1}^∞ (−1)^{n−1} tⁿ/n.
(b) For |t| < 1 we have
  arctan t = t − t³/3 + t⁵/5 − ⋯ = Σ_{n=0}^∞ (−1)ⁿ t^{2n+1}/(2n + 1).   (6.14)
As in the previous example we use the uniform convergence of the geometric series on [−r, r] for every 0 < r < 1, which allows to exchange integration and summation:
  arctan t = ∫₀ᵗ dx/(1 + x²) = ∫₀ᵗ Σ_{n=0}^∞ (−1)ⁿ x^{2n} dx = Σ_{n=0}^∞ (−1)ⁿ ∫₀ᵗ x^{2n} dx = Σ_{n=0}^∞ (−1)ⁿ t^{2n+1}/(2n + 1).
Note that you are, in general, not allowed to insert t = 1 into the equations (6.13) and (6.14).
However, the following proposition (the proof is in the appendix to this chapter) fills this gap.
Proposition 6.9 (Abel's Limit Theorem) Let Σ_{n=0}^∞ aₙ be a convergent series of real numbers. Then the power series f(x) = Σ_{n=0}^∞ aₙ xⁿ converges uniformly on [0, 1]; in particular
  lim_{x→1−0} f(x) = Σ_{n=0}^∞ aₙ.
Applying this to (6.13) and (6.14) with t = 1 we obtain
  log 2 = 1 − 1/2 + 1/3 − 1/4 + ⋯ = Σ_{n=1}^∞ (−1)^{n−1}/n,
  π/4 = 1 − 1/3 + 1/5 − 1/7 + ⋯ = Σ_{n=0}^∞ (−1)ⁿ/(2n + 1).
(a) Let fₙ(x) = (1/n) e^{−x/n} on [0, +∞). Then fₙ ⇉ 0 on [0, +∞). Indeed, |fₙ(x) − 0| ≤ 1/n for all n and for all x ∈ ℝ₊. However,
  ∫₀^{+∞} fₙ(t) dt = −e^{−t/n} |₀^{+∞} = 1.
Hence
  lim_{n→∞} ∫₀^{+∞} fₙ(t) dt = 1 ≠ 0 = ∫₀^{+∞} lim_{n→∞} fₙ(t) dt.
On an unbounded interval, uniform convergence alone does not allow the exchange of limit and integration.
(b) Let
  fₙ(x) = (sin nx)/√n,  x ∈ ℝ, n ∈ ℕ.   (6.15)
Then fₙ ⇉ 0 on ℝ, but
  fₙ′(x) = √n cos(nx),  fₙ′(0) = √n → +∞ as n → ∞,
whereas the derivative of the limit function vanishes everywhere.
Theorem 6.10 Suppose (fₙ) is a sequence of continuously differentiable functions on [a, b] which converges pointwise to f, and suppose that (fₙ′) converges uniformly on [a, b]. Then f is differentiable with
  f′(x) = lim_{n→∞} fₙ′(x),  a ≤ x ≤ b.   (6.16)
Proof. Put g(x) = lim_{n→∞} fₙ′(x); then g is continuous by Theorem 6.5. By the Fundamental Theorem of Calculus, Theorem 5.14,
  fₙ(x) = fₙ(a) + ∫ₐˣ fₙ′(t) dt.
Taking the limit n → ∞ and using Theorem 6.6,
  f(x) = f(a) + ∫ₐˣ g(t) dt.
Since g is continuous, the right hand side defines a differentiable function, namely the antiderivative of g(x), by the FTC. Hence, f′(x) = g(x); since g is continuous the proof is now complete.
For a more general result (without the additional assumption of continuity of fn ) see [Rud76,
7.17 Theorem].
Corollary 6.11 Let f(x) = Σ_{n=0}^∞ aₙ xⁿ be a power series with radius of convergence R.
(a) Then f is differentiable on (−R, R) and we have
  f′(x) = Σ_{n=1}^∞ n aₙ x^{n−1},  x ∈ (−R, R).   (6.17)
(b) f is arbitrarily often differentiable on (−R, R) with
  f^{(k)}(x) = Σ_{n=k}^∞ n(n − 1) ⋯ (n − k + 1) aₙ x^{n−k},   (6.18)
and
  aₙ = (1/n!) f^{(n)}(0),  n ∈ ℕ₀.   (6.19)
(b) Iterated application of (a) yields that f^{(k−1)} is differentiable on (−R, R) with (6.18). In particular, inserting x = 0 into (6.18) we find
  f^{(k)}(0) = k! a_k,  that is,  a_k = f^{(k)}(0)/k!.
These are exactly the Taylor coefficients of f at a = 0. Hence, f coincides with its Taylor series.
For |x| < 1,
  Σ_{n=1}^∞ n xⁿ = x/(1 − x)².
Since the geometric series f(x) = Σ_{n=0}^∞ xⁿ equals 1/(1 − x) on (−1, 1), by Corollary 6.11 we have
  1/(1 − x)² = d/dx (1/(1 − x)) = d/dx Σ_{n=0}^∞ xⁿ = Σ_{n=1}^∞ n x^{n−1}.
Multiplying the preceding equation by x gives the result.
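A quick numerical comparison of the series with its closed form (truncating at N = 2000, where the tail is negligible for |x| ≤ 0.9):

```python
def series(x, N=2000):
    # partial sum of  sum_{n>=1} n * x^n
    return sum(n * x**n for n in range(1, N + 1))

for x in (0.5, -0.3, 0.9):
    closed = x / (1 - x)**2
    assert abs(series(x) - closed) < 1e-9

# e.g. at x = 1/2 the sum is 1/2 / (1/2)^2 = 2
```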
A trigonometric Fourier series is a series of the form
  f(x) = a₀/2 + Σ_{k=1}^∞ (a_k cos kx + b_k sin kx).   (6.20)
For a 2π-periodic integrable function f the coefficients are given by
  a_k = (1/π) ∫₀^{2π} f(x) cos kx dx,  b_k = (1/π) ∫₀^{2π} f(x) sin kx dx.   (6.21)
These formulas rely on the orthogonality relations
  ∫₀^{2π} cos kx sin mx dx = 0,
  ∫₀^{2π} cos kx cos mx dx = π δ_{km},  k, m ∈ ℕ,   (6.22)
  ∫₀^{2π} sin kx sin mx dx = π δ_{km}.
Using e^{ikx} = cos kx + i sin kx, the nth partial sum of (6.20) can be written as
  Σ_{k=−n}^n c_k e^{ikx},   (6.23)
where c₀ = a₀/2 and
  c_k = (1/2)(a_k − i b_k),  c_{−k} = (1/2)(a_k + i b_k),  k ≥ 1.
To obtain the coefficients c_k using integration we need the notion of an integral of a complex-valued function, see Section 5.5. If m ≠ 0 we have
  ∫ₐᵇ e^{imx} dx = (1/(im)) e^{imx} |ₐᵇ,
in particular
  ∫₀^{2π} e^{imx} dx = { 0, m ∈ ℤ \ {0};  2π, m = 0 }.
We conclude
  c_k = (1/2π) ∫₀^{2π} f(x) e^{−ikx} dx,  k = 0, ±1, …, ±n.   (6.24)
For f ∈ V define its Fourier coefficients
  c_k = (1/2π) ∫₀^{2π} f(x) e^{−ikx} dx,  k ∈ ℤ,   (6.25)
and its Fourier series
  Σ_{k=−∞}^∞ c_k e^{ikx}   (6.26)
with partial sums
  sₙ(x) = Σ_{k=−n}^n c_k e^{ikx},  n ∈ ℕ,
which can also be written in the real form
  a₀/2 + Σ_{k=1}^n (a_k cos kx + b_k sin kx),   (6.27)
k=1
where ak and bk are given by (6.21). One can ask whether the Fourier series of a function
converges to the function itself. It is easy to see: If the function f is the uniform limit of a series
of trigonometric polynomials
  f(x) = Σ_{k=−∞}^∞ γ_k e^{ikx},   (6.28)
then f coincides with its Fourier series. Indeed, since the series (6.28) converges uniformly, by
Proposition 6.6 we can change the order of summation and integration and obtain
  c_k = (1/2π) ∫₀^{2π} (Σ_{m=−∞}^∞ γ_m e^{imx}) e^{−ikx} dx = (1/2π) Σ_{m=−∞}^∞ γ_m ∫₀^{2π} e^{i(m−k)x} dx = γ_k.
In general, the Fourier series of f neither converges uniformly nor pointwise to f. For Fourier series, convergence with respect to the L²-norm
  ‖f‖₂ = ((1/2π) ∫₀^{2π} |f|² dx)^{1/2}   (6.29)
is the appropriate notion.
  f·(g + h) = f·g + f·h,  (λf)·g = λ(f·g),  f·g = (g·f)̄.
For every f ∈ V we have f·f = (1/2π) ∫₀^{2π} |f|² dx ≥ 0. However, f·f = 0 does not imply f = 0 (you can change f at finitely many points without any impact on f·f). For the functions e_k(x) = e^{ikx}, k ∈ ℤ, we have
  e_k·e_l = δ_{kl},  k, l ∈ ℤ.   (6.30)
Any such subset {e_k | k ∈ ℤ} of an inner product space V satisfying (6.30) is called an orthonormal system (ONS). Using e_k(x) = cos kx + i sin kx, the real orthogonality relations (6.22) immediately follow from (6.30).
The next lemma shows that the Fourier series of f is the best L2 -approximation of a periodic
function f V by trigonometric polynomials.
Lemma 6.12 (Least Square Approximation) Suppose f ∈ V has the Fourier coefficients c_k, k ∈ ℤ, and let λ_k ∈ ℂ be arbitrary. Then
  ‖f − Σ_{k=−n}^n c_k e_k‖₂ ≤ ‖f − Σ_{k=−n}^n λ_k e_k‖₂,   (6.31)
with equality if and only if λ_k = c_k for all k, and moreover
  ‖f − Σ_{k=−n}^n c_k e_k‖₂² = ‖f‖₂² − Σ_{k=−n}^n |c_k|².   (6.32)
Proof. Let Σ always denote Σ_{k=−n}^n. Put gₙ = Σ λ_k e_k. Then
  f·gₙ = Σ λ̄_k (f·e_k) = Σ c_k λ̄_k,  gₙ·gₙ = Σ |λ_k|².
Hence
  ‖f − gₙ‖₂² = f·f − f·gₙ − gₙ·f + gₙ·gₙ = ‖f‖₂² − Σ c_k λ̄_k − Σ c̄_k λ_k + Σ |λ_k|²
    = ‖f‖₂² − Σ |c_k|² + Σ |λ_k − c_k|²   (6.33)
which is evidently minimized if and only if λ_k = c_k. Inserting this into (6.33), equation (6.32) follows.
Corollary 6.13 (Bessel's Inequality) Under the assumptions of the above lemma we have
  Σ_{k=−∞}^∞ |c_k|² ≤ ‖f‖₂².   (6.34)
Indeed, by (6.32), for every n,
  Σ_{k=−n}^n |c_k|² ≤ ‖f‖₂².
We say that (fₙ) converges to f in L² if
  lim_{n→∞} ‖fₙ − f‖₂ = 0.
Explicitly,
  ∫₀^{2π} |fₙ(x) − f(x)|² dx → 0 as n → ∞.
Remarks 6.3 (a) Note that the L²-limit in V is not unique; changing f(x) at finitely many points of [0, 2π] does not change the integral ∫₀^{2π} |f − fₙ|² dx.
(b) If fₙ ⇉ f on ℝ then fₙ → f in L². Indeed, let ε > 0. Then there exists n₀ ∈ ℕ such that |fₙ(x) − f(x)| ≤ ε for all n ≥ n₀ and all x; hence ‖fₙ − f‖₂ ≤ ε. This shows ‖fₙ − f‖₂ → 0.
(c) The above Lemma, in particular (6.32), shows that the Fourier series converges in L² to f if and only if
  ‖f‖₂² = Σ_{k=−∞}^∞ |c_k|².   (6.35)
This is called Parseval's Completeness Relation. We will see that it holds for all f ∈ V.
Let us write
  f(x) ~ Σ_{k=−∞}^∞ c_k e^{ikx}
to express the fact that (c_k) are the (complex) Fourier coefficients of f. Further put
  sₙ(f) = sₙ(f; x) = Σ_{k=−n}^n c_k e^{ikx}.   (6.36)
Theorem 6.14 Suppose f, g ∈ V with
  f ~ Σ_{k=−∞}^∞ c_k e_k,  g ~ Σ_{k=−∞}^∞ γ_k e_k;
then
(i) lim_{n→∞} (1/2π) ∫₀^{2π} |f − sₙ(f)|² dx = 0,   (6.37)
(ii) (1/2π) ∫₀^{2π} f ḡ dx = Σ_{k=−∞}^∞ c_k γ̄_k,   (6.38)
(iii) (1/2π) ∫₀^{2π} |f|² dx = Σ_{k=−∞}^∞ |c_k|² = a₀²/4 + (1/2) Σ_{k=1}^∞ (a_k² + b_k²)  (Parseval's formula).   (6.39)
The proof is in Rudin's book, [Rud76, 8.16, p. 191]. It uses the Stone–Weierstraß theorem about the uniform approximation of a continuous function by polynomials. An elementary proof is in Forster's book [For01, §23].
(a) Let f ∈ V be given by f(x) = 1 for 0 ≤ x < π and f(x) = −1 for π ≤ x < 2π. Then a_k = 0 for all k and
  b_k = (1/π) ∫₀^{2π} f(x) sin kx dx = (2/(πk)) ((−1)^{k+1} + 1) = { 0, if k is even;  4/(πk), if k is odd }.
Hence
  f ~ (4/π) Σ_{n=0}^∞ sin((2n + 1)x)/(2n + 1).
Noting that
  ‖f‖₂² = (1/2π) ∫₀^{2π} |f|² dx = 1,
Parseval's formula (6.39) gives
  1 = Σ_{k∈ℤ} |c_k|² = a₀²/4 + (1/2) Σ_{n∈ℕ} bₙ² = (1/2) · (16/π²) Σ_{n=0}^∞ 1/(2n + 1)² =: (8/π²) s₁,
hence
  s₁ = Σ_{n=0}^∞ 1/(2n + 1)² = π²/8.
Now we can compute s = Σ_{n=1}^∞ 1/n². Since this series converges absolutely we are allowed to rearrange the elements in such a way that we first add all the odd terms, which gives s₁, and then all the even terms, which gives s₀. Using s₁ = π²/8 we find
  s = s₁ + s₀,  s₀ = 1/2² + 1/4² + 1/6² + ⋯ = (1/4) s,
hence
  (3/4) s = s₁ = π²/8  and therefore  s = Σ_{n=1}^∞ 1/n² = π²/6.
(b) Fix a ∈ [0, 2π] and consider f ∈ V with
  f(x) = { 1, 0 ≤ x ≤ a;  0, a < x < 2π }.
The Fourier coefficients of f are
  c₀ = (1/2π) ∫₀^a dx = a/(2π)
and
  c_k = f·e_k = (1/2π) ∫₀^a e^{−ikx} dx = (i/(2πk)) (e^{−ika} − 1),  k ≠ 0.
If k ≠ 0,
  |c_k|² = (1/(4π²k²)) |e^{−ika} − 1|² = (1 − cos ka)/(2π²k²).
By Parseval's formula and ‖f‖₂² = (1/2π) ∫₀^a dx = a/(2π),
  a/(2π) = Σ_{k=−∞}^∞ |c_k|² = a²/(4π²) + 2 Σ_{k=1}^∞ (1 − cos ak)/(2π²k²)
    = a²/(4π²) + (1/π²) Σ_{k=1}^∞ 1/k² − (1/π²) Σ_{k=1}^∞ (cos ak)/k²
    = a²/(4π²) + (1/π²) (s − Σ_{k=1}^∞ (cos ak)/k²),
where s = Σ_{k=1}^∞ 1/k² = π²/6. Solving for the cosine sum we obtain
  Σ_{k=1}^∞ (cos ka)/k² = π²/6 − πa/2 + a²/4 = (a − π)²/4 − π²/12.   (6.40)
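Formula (6.40) can be spot-checked numerically for several values of a ∈ [0, 2π]:

```python
import math

def cos_sum(a, N=100000):
    # truncated series  sum_{k>=1} cos(k a)/k^2 ; tail < 1/N
    return sum(math.cos(k * a) / k**2 for k in range(1, N + 1))

def closed_form(a):
    # the right hand side of (6.40)
    return (a - math.pi)**2 / 4 - math.pi**2 / 12

for a in (0.5, 1.0, math.pi, 5.0):
    assert abs(cos_sum(a) - closed_form(a)) < 1e-4
```

At a = π the series is Σ (−1)ᵏ/k² = −π²/12, matching the closed form.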
Replacing a by x we have found
  Σ_{k=1}^∞ (cos kx)/k² = (x − π)²/4 − π²/12,  x ∈ [0, 2π],   (6.41)
and the Fourier series converges uniformly on ℝ to the above function. Since the term by term differentiated series −Σ_{k=1}^∞ (sin kx)/k converges uniformly on [δ, 2π − δ], see Example 6.2, we obtain
  Σ_{k=1}^∞ (sin kx)/k = −(x − π)/2 = (π − x)/2,  x ∈ (0, 2π).
Integrating (6.41) term by term from 0 to x (Corollary 6.7),
  Σ_{k=1}^∞ (sin kx)/k³ = ∫₀ˣ Σ_{k=1}^∞ (cos kt)/k² dt.
On the other hand,
  ∫₀ˣ ((t − π)²/4 − π²/12) dt = (x − π)³/12 − (π²/12) x + π³/12.
By homework 19.5,
  f(x) = Σ_{k=1}^∞ (sin kx)/k³ = (x − π)³/12 − (π²/12) x + π³/12.
Theorem 6.15 Let f : ℝ → ℝ be a continuous 2π-periodic function which is piecewise continuously differentiable, i.e. there exists a partition {t₀, …, t_r} of [0, 2π] such that f|[t_{i−1}, t_i] is continuously differentiable.
Then the Fourier series of f converges uniformly to f.
Proof. Let φ be the function equal to f′ on each interval (t_{i−1}, t_i); up to finitely many points φ belongs to V. Let γ_k denote the Fourier coefficients of φ. By Bessel's inequality,
  Σ_{k=−∞}^∞ |γ_k|² ≤ ‖φ‖₂² < ∞.
If k ≠ 0 the Fourier coefficients c_k of f can be found using integration by parts from the Fourier coefficients of φ:
  ∫_{t_{i−1}}^{t_i} f(x) e^{−ikx} dx = (i/k) (f(x) e^{−ikx} |_{t_{i−1}}^{t_i} − ∫_{t_{i−1}}^{t_i} φ(x) e^{−ikx} dx).
Summing over i = 1, …, r, the boundary terms cancel by the continuity and periodicity of f, hence
  c_k = (1/2π) ∫₀^{2π} f(x) e^{−ikx} dx = (1/(ik)) · (1/2π) ∫₀^{2π} φ(x) e^{−ikx} dx = γ_k/(ik).
Since both Σ_{k=1}^∞ 1/k² and Σ_{k=−∞}^∞ |γ_k|² converge, the Cauchy–Schwarz inequality gives
  Σ_{k≠0} |c_k| = Σ_{k≠0} |γ_k|/|k| ≤ (Σ_{k≠0} 1/k²)^{1/2} (Σ_{k≠0} |γ_k|²)^{1/2} < ∞.
Thus, the Fourier series converges uniformly to a continuous function g (see Theorem 6.2 and Theorem 6.5). Since the Fourier series converges both to f and to g in the L² norm, ‖f − g‖₂ = 0. Since both f and g are continuous, they coincide. This completes the proof.
Note that for any f ∈ V, the series Σ_{k∈ℤ} |c_k|² converges, while the series Σ_{k∈ℤ} |c_k| converges only if the Fourier series converges uniformly to f.
(b) Countable sets represent the smallest infinite cardinality: No uncountable set can be a
subset of a countable set. Any countable set can be arranged in a sequence. In particular, Q is
contable, see Example 2.6 (c).
(c) The countable union of countable sets is a countable set; this is Cantors First Diagonal
Process:
  x11 x12 x13 x14 …
  x21 x22 x23 x24 …
  x31 x32 x33 x34 …
  x41 x42 x43 x44 …
  x51 …
(d) Let A = {(xn ) | xn {0, 1} n N} be the set of all sequences whose elements are 0
and 1. This set A is uncountable. In particular, R is uncountable.
Proof. Suppose to the contrary that A is countable and arrange the elements of A in a sequence
(sn )nN of distinct elements of A. We construct a sequence s as follows. If the nth element in
sn is 1 we let the nth digit of s be 0, and vice versa. Then the sequence s differs from every
member s1 , s2 , . . . at least in one place; hence s 6 Aa contradiction since s is indeed an
element of A. This proves, A is uncountable.
  d(x, y) = { 1, if x ≠ y;  0, if x = y }.
Then (X, d) becomes a metric space. It is called the discrete metric space.
Definition 6.9 Let E be a vector space over ℂ (or ℝ). Suppose on E there is given a function ‖·‖ : E → ℝ which associates to each x ∈ E a real number ‖x‖ such that the following three conditions are satisfied:
(i) ‖x‖ ≥ 0 for every x ∈ E, and ‖x‖ = 0 if and only if x = 0,
(ii) ‖λx‖ = |λ| ‖x‖ for all λ ∈ ℂ (λ ∈ ℝ, resp.),
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ E.
Then E is called a normed (vector) space and ‖x‖ is the norm of x.
‖x‖ generalizes the length of a vector x ∈ E. Every normed vector space E is a metric space if we put d(x, y) = ‖x − y‖. (Prove!)
However, there are metric spaces that are not normed spaces, for example (N, d(m, n) =
| n m |).
Example 6.12 (a) E = ℝᵏ or E = ℂᵏ. Let x = (x₁, …, x_k) ∈ E and define
  ‖x‖₂ = (Σ_{i=1}^k |xᵢ|²)^{1/2}
or
  ‖x‖₁ = Σ_{i=1}^k |xᵢ|.
(b) E = C([a, b]) with
  ‖f‖_p = (∫ₐᵇ |f(t)|ᵖ dt)^{1/p}.
(c) The square-summable sequences, E = ℓ². Then
  ‖x‖ = (Σ_{n=1}^∞ |xₙ|²)^{1/2}
defines a norm on ℓ².
(d) The bounded sequences, E = ℓ^∞ = {(xₙ) | sup_{n∈ℕ} |xₙ| < ∞}. Then
  ‖x‖ = sup_{n∈ℕ} |xₙ|
defines a norm on E.
(e) E = C([a, b]). Then
  ‖f‖₁ = ∫ₐᵇ |f(t)| dt
defines a norm on E.
Indeed, a is an accumulation point of both (a, b) and [a, b). This is true since every neighborhood U_ε(a), ε < b − a, contains the point a + ε/2 ∈ (a, b) (resp. in [a, b)), which is different from a. For any point x ∉ [a, b] we find a neighborhood U_ε(x) with U_ε(x) ∩ [a, b) = ∅; hence x is not an accumulation point of [a, b).
The set of rational numbers Q is dense in R. Indeed, every neighborhood U (r) of every real
number r contains a rational number, see Proposition 1.11 (b).
For the real line one can prove: Every open set is the at most countable union of disjoint open
(finite or infinite) intervals. A similar description for closed subsets of R is false. There is no
similar description of open subsets of Rk , k 2.
(b) For every metric space X, both the whole space X and the empty set are open as well as
closed.
(c) Let B = {x ∈ ℝᵏ | ‖x‖₂ < 1} be the open unit ball in ℝᵏ. B is open (see Lemma 6.16 below); B is not closed. For example, x₀ = (1, 0, …, 0) is an accumulation point of B since xₙ = (1 − 1/n, 0, …, 0) is a sequence of elements of B converging to x₀; however, x₀ ∉ B. The set of accumulation points of B is {x ∈ ℝᵏ | ‖x‖₂ ≤ 1}. This is also the closure of B in ℝᵏ.
[Figure: a function g whose graph lies between f − ε and f + ε for all x ∈ [a, b].]
Remarks 6.4 (a) If p is an accumulation point of a set E, then every neighborhood of p contains
infinitely many points of E.
(b) A finite set has no accumulation points; hence any finite set is closed.
Example 6.14 (a) The open complex unit disc, {z ∈ ℂ | |z| < 1}.
(b) The closed unit disc, {z ∈ ℂ | |z| ≤ 1}.
(c) A finite set.
(d) The set Z of all integers.
(e) {1/n | n N}.
(f) The set C of all complex numbers.
(g) The interval (a, b).
Here (d), (e), and (g) are regarded as subsets of ℝ. Some properties of these sets are tabulated below:

          Closed   Open   Bounded
    (a)   No       Yes    Yes
    (b)   Yes      No     Yes
    (c)   Yes      No     Yes
    (d)   Yes      No     No
    (e)   No       No     Yes
    (f)   Yes      Yes    No
    (g)   No       Yes    Yes
Proposition 6.17 A subset E ⊆ X of a metric space X is open if and only if its complement Eᶜ is closed.
Proof. First, suppose Eᶜ is closed. Choose x ∈ E. Then x ∉ Eᶜ, and x is not an accumulation point of Eᶜ. Hence there exists a neighborhood U of x such that U ∩ Eᶜ is empty, that is, U ⊆ E. Thus x is an interior point of E, and E is open.
Next, suppose that E is open. Let x be an accumulation point of Eᶜ. Then every neighborhood of x contains a point of Eᶜ, so that x is not an interior point of E. Since E is open, this means that x ∈ Eᶜ. It follows that Eᶜ is closed.
    c₁ ‖x‖₁ ≤ ‖x‖₂ ≤ c₂ ‖x‖₁    for all x ∈ E.    (6.42)
Proof. Condition (6.42) is obviously symmetric with respect to ‖·‖₁ and ‖·‖₂ since it implies ‖x‖₂/c₂ ≤ ‖x‖₁ ≤ ‖x‖₂/c₁. Therefore, it is sufficient to show the following: if xₙ → x w.r.t. ‖·‖₁ then xₙ → x w.r.t. ‖·‖₂. By (6.42),
    0 ≤ ‖xₙ − x‖₂ ≤ c₂ ‖xₙ − x‖₁,    n ∈ ℕ.
Since the first and the last expressions tend to 0 as n → ∞, the sandwich theorem shows that limₙ→∞ ‖xₙ − x‖₂ = 0, too. This proves xₙ → x w.r.t. ‖·‖₂.
Example 6.15 Let E = ℝᵏ or E = ℂᵏ with the norms
    ‖x‖ₚ = ( |x₁|ᵖ + ⋯ + |xₖ|ᵖ )^{1/p},  p ∈ [1, ∞),    ‖x‖_∞ = max_{1≤i≤k} |xᵢ|.
All these norms are equivalent. Indeed,
    ‖x‖_∞ ≤ ‖x‖ₚ ≤ ∑_{i=1}^{k} |xᵢ| = ‖x‖₁ ≤ k ‖x‖_∞ ≤ k ‖x‖ₚ.    (6.43)
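The inequality chain in (6.43) can be checked numerically on random vectors; a sketch (all names are illustrative):

```python
import math
import random

def norm_p(x, p):
    """p-norm for finite p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def norm_sup(x):
    """Sup-norm (the p -> infinity limit)."""
    return max(abs(t) for t in x)

random.seed(0)
k = 5
for p in (1, 2, 3):
    for _ in range(200):
        x = [random.uniform(-10.0, 10.0) for _ in range(k)]
        # ‖x‖_∞ ≤ ‖x‖_p ≤ k ‖x‖_∞ (small slack for rounding)
        assert norm_sup(x) <= norm_p(x, p) + 1e-9
        assert norm_p(x, p) <= k * norm_sup(x) + 1e-9
print("norm equivalence holds on all samples")
```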
The following Proposition is quite analogous to Proposition 2.33 with k = 2. Recall that a
complex sequence (zn ) converges if and only if both Re zn and Im zn converge.
Proposition 6.19 Let (xₙ) be a sequence of vectors of the euclidean space (ℝᵏ, ‖·‖₂), xₙ = (x_{n1}, …, x_{nk}). Then (xₙ) converges to a = (a₁, …, aₖ) ∈ ℝᵏ if and only if
    limₙ→∞ x_{ni} = aᵢ,    i = 1, …, k.
Proof. Suppose that lim xₙ = a. Given ε > 0 there is an n₀ ∈ ℕ such that n ≥ n₀ implies ‖xₙ − a‖₂ < ε; since |x_{ni} − aᵢ| ≤ ‖xₙ − a‖₂, coordinatewise convergence follows. Conversely, for every i choose n₀ᵢ ∈ ℕ such that n ≥ n₀ᵢ implies
    |x_{ni} − aᵢ| < ε/k.
For n ≥ max{n₀₁, …, n₀ₖ} we have (see (6.43)) ‖xₙ − a‖₂ ≤ ∑ᵢ |x_{ni} − aᵢ| < ε.
Proof. Since B is bounded, all coordinates of B are bounded; hence there is a subsequence (xₙ⁽¹⁾) of (xₙ) such that the first coordinate converges. Further, there is a subsequence (xₙ⁽²⁾) of (xₙ⁽¹⁾) such that the second coordinate converges. Finally there is a subsequence (xₙ⁽ᵏ⁾) of (xₙ⁽ᵏ⁻¹⁾) such that all coordinates converge. By the above proposition the subsequence (xₙ⁽ᵏ⁾) converges in ℝᵏ.
The same statement is true for subsets B ⊆ ℂᵏ.
Definition 6.12 A mapping f: X → Y from the metric space X into the metric space Y is said to be continuous at a ∈ X if one of the following equivalent conditions is satisfied.
(a) For every ε > 0 there exists δ > 0 such that for every x ∈ X
    d(x, a) < δ    implies    d(f(x), f(a)) < ε.    (6.44)
(b) For any sequence (xₙ), xₙ ∈ X, with limₙ→∞ xₙ = a it follows that limₙ→∞ f(xₙ) = f(a).
Remark 6.5 Since the complement of an open set is a closed set, it is obvious that the proposition holds if we replace open set by closed set.
In general, the image of an open set under a continuous function need not be open; consider for example f(x) = sin x and G = (0, 2π), which is open; however, f((0, 2π)) = [−1, 1] is not open.
    d(xₙ, xₘ) < ε    for all m, n ≥ n₀.
Remark 6.6 Further properties. (a) A compact subset of a metric space is closed and bounded.
(b) A closed subset of a compact set is compact.
(c) A subset K of ℝᵏ or ℂᵏ is compact if and only if K is bounded and closed.
Proof. Suppose K is closed and bounded. Let (xₙ) be a sequence in K. By Corollary 6.20, (xₙ) has a convergent subsequence. Since K is closed, the limit is in K. By the above proposition K is compact. The other direction follows from (a).
Proof. (a) Let {G_α} be an open covering of f(X). By Proposition 6.22, f⁻¹(G_α) is open for every α. Hence, {f⁻¹(G_α)} is an open cover of X. Since X is compact there is a finite subcover of X, say {f⁻¹(G_{α₁}), …, f⁻¹(G_{αₙ})}. Then {G_{α₁}, …, G_{αₙ}} is a finite subcover of {G_α} covering f(X). Hence, f(X) is compact. We skip (b).
As for real functions, we have the following proposition about uniform continuity. The proof is in the appendix.
With M ≥ max{1, |f(a)|, |g(a)|}, choose δ > 0 such that ‖x − a‖ < δ implies
    |f(x) − f(a)| < min{ √(ε/3), ε/(3M) }    and    |g(x) − g(a)| < min{ √(ε/3), ε/(3M) }.    (6.45)
Note that
    fg(x) − fg(a) = (f(x) − f(a))(g(x) − g(a)) + f(a)(g(x) − g(a)) + g(a)(f(x) − f(a)).
Taking the absolute value of the above identity, using the triangle inequality as well as (6.45), we have that ‖x − a‖ < δ implies
    |fg(x) − fg(a)| ≤ |f(x) − f(a)| |g(x) − g(a)| + |f(a)| |g(x) − g(a)| + |g(a)| |f(x) − f(a)|
                   < ε/3 + M · ε/(3M) + M · ε/(3M) = ε/3 + ε/3 + ε/3 = ε.
188
Example 6.17 Let f: ℝ³ → ℝ² be given by
    f(x, y, z) = ( sin( (x² + eᶻ)/(x² + y² + z² + 1) ),  log|x² + y² + z² + 1| ).
Then f is continuous on ℝ³. Indeed, since products, sums, and compositions of continuous functions are continuous, the polynomial x² + y² + z² + 1 is a continuous function on ℝ³. We also made use of Proposition 6.26 (a); the coordinate functions x, y, and z are continuous. Since the denominator is nonzero, f₁(x, y, z) = sin( (x² + eᶻ)/(x² + y² + z² + 1) ) is continuous. Since |x² + y² + z² + 1| > 0, the second component is continuous as well.
6.5 Appendix E
(a) A compact subset is closed
Proof. Let K be a compact subset of a metric space X. We shall prove that the complement of K is an open subset of X.
Suppose that p ∈ X, p ∉ K. If q ∈ K, let V_q and U_q be neighborhoods of p and q, respectively, of radius less than d(p, q)/2. Since K is compact, there are finitely many points q₁, …, qₙ in K such that
    K ⊆ U_{q₁} ∪ ⋯ ∪ U_{qₙ} =: U.
If V = V_{q₁} ∩ ⋯ ∩ V_{qₙ}, then V is a neighborhood of p which does not intersect U. Hence V ⊆ Kᶜ, so that p is an interior point of Kᶜ, and K is closed.
We show that K is bounded. Let ε > 0 be given. Since K is compact, the open cover {U_ε(x) | x ∈ K} of K has a finite subcover, say {U_ε(x₁), …, U_ε(xₙ)}. Let U = ⋃ᵢ₌₁ⁿ U_ε(xᵢ); then the maximal distance of two points x and y in U is bounded by
    2ε + max_{1≤i<j≤n} d(xᵢ, xⱼ).
If C_k = ∅ we have found a finite subcover, namely V₁, V₂, …, V_k. Suppose that all the Cₙ are nonempty, say xₙ ∈ Cₙ. Further, let x be the limit of a subsequence (x_{nᵢ}). Since x_{nᵢ} ∈ Cₘ for all nᵢ ≥ m and Cₘ is closed, x ∈ Cₘ for all m. Hence x ∈ ⋂_{m∈ℕ} Cₘ. However,
    ⋂_{m∈ℕ} Cₘ = K \ ⋃_{m∈ℕ} Vₘ = ∅,    (6.46)
a contradiction.
190
We put δ := ½ min{δ(p₁), …, δ(pₙ)}. Then δ > 0. Now let p and q be points of K with |p − q| < δ. By (6.46), there is an integer m, 1 ≤ m ≤ n, such that p ∈ J(pₘ); hence
    |p − pₘ| < ½ δ(pₘ),
and we also have
    |q − pₘ| ≤ |p − q| + |p − pₘ| < δ + ½ δ(pₘ) ≤ δ(pₘ).
Finally, continuity at pₘ gives
    |f(p) − f(q)| ≤ |f(p) − f(pₘ)| + |f(pₘ) − f(q)| < ε.
Proposition 6.27 There exists a real continuous function on the real line which is nowhere differentiable.
Proof. Define
    φ(x) = |x|,  x ∈ [−1, 1],    (6.47)
and extend φ to all of ℝ with period 2, i.e. φ(x + 2) = φ(x). Put
    f(x) = ∑_{n=0}^{∞} (3/4)ⁿ φ(4ⁿ x).    (6.48)
Since 0 ≤ φ ≤ 1, Theorem 6.2 shows that the series (6.48) converges uniformly on ℝ. By Theorem 6.5, f is continuous on ℝ.
Now fix a real number x and a positive integer m ∈ ℕ. Put
    δₘ = ± ½ · 4⁻ᵐ,
where the sign is chosen such that no integer lies between 4ᵐ x and 4ᵐ(x + δₘ). This can be done since 4ᵐ |δₘ| = ½. It follows that |φ(4ᵐ x) − φ(4ᵐ x + 4ᵐ δₘ)| = ½. Define
    γₙ = ( φ(4ⁿ(x + δₘ)) − φ(4ⁿ x) ) / δₘ.
Then |γₘ| = 4ᵐ·½/|δₘ|·4⁻ᵐ·… more precisely |γₘ| = 4ᵐ, |γₙ| ≤ 4ⁿ for n < m, and γₙ = 0 for n > m (since 4ⁿδₘ is then an even integer). Hence
    | (f(x + δₘ) − f(x)) / δₘ | = | ∑_{n=0}^{m} (3/4)ⁿ γₙ | ≥ 3ᵐ − ∑_{n=0}^{m−1} 3ⁿ = ½ (3ᵐ + 1).
As m → ∞, δₘ → 0 but the difference quotients are unbounded; hence f is not differentiable at x.
Proof of Abel's Limit Theorem, Proposition 6.9. By Proposition 6.4, the series converges on (−1, 1) and the limit function is continuous there since the radius of convergence is at least 1, by assumption. Hence it suffices to prove continuity at x = 1, i.e. that lim_{x→1−0} f(x) = f(1).
Put rₙ = ∑_{k=n}^{∞} aₖ; then r₀ = f(1), rₙ − rₙ₊₁ = aₙ for all nonnegative integers n, and limₙ→∞ rₙ = 0. Hence there is a constant C with |rₙ| ≤ C, and the series ∑_{n=0}^{∞} rₙ₊₁ xⁿ converges for |x| < 1 by the comparison test. We have
    (1 − x) ∑_{n=0}^{∞} rₙ₊₁ xⁿ = ∑_{n=0}^{∞} rₙ₊₁ xⁿ − ∑_{n=0}^{∞} rₙ₊₁ xⁿ⁺¹ = ∑_{n=0}^{∞} rₙ₊₁ xⁿ − ∑_{n=0}^{∞} rₙ xⁿ + r₀ = − ∑_{n=0}^{∞} aₙ xⁿ + f(1);
hence,
    f(1) − f(x) = (1 − x) ∑_{n=0}^{∞} rₙ₊₁ xⁿ.    (6.49)
Let ε > 0 be given. Choose N ∈ ℕ such that n ≥ N implies |rₙ| < ε. Put δ = ε/(CN); then x ∈ (1 − δ, 1) implies
    |f(1) − f(x)| ≤ (1 − x) ∑_{n=0}^{N−1} |rₙ₊₁| xⁿ + (1 − x) ∑_{n=N}^{∞} |rₙ₊₁| xⁿ
                 ≤ (1 − x) C N + ε (1 − x) ∑_{n=0}^{∞} xⁿ ≤ ε + ε = 2ε.
function (see the triangle inequality below), and the sum of continuous functions is a continuous function (see Proposition 6.26). We show that ‖f‖ = sup_{x∈X} |f(x)| is indeed a norm on C(X).
(i) Obviously, ‖f‖ ≥ 0 since the absolute value |f(x)| is nonnegative. Further, ‖0‖ = 0. Suppose now ‖f‖ = 0. This implies |f(x)| = 0 for all x; hence f = 0.
(ii) Clearly, for every (real or complex) number λ we have
    ‖λf‖ = sup_{x∈X} |λ f(x)| = |λ| sup_{x∈X} |f(x)| = |λ| ‖f‖.
(iii) If h = f + g then
    |h(x)| ≤ |f(x)| + |g(x)| ≤ ‖f‖ + ‖g‖,    x ∈ X;
hence
    ‖f + g‖ ≤ ‖f‖ + ‖g‖.
We have thus made C(X) into a normed vector space. Remark 6.1 can be rephrased as
A sequence (fn ) converges to f with respect to the norm in C(X) if and only if
fn f uniformly on X.
Accordingly, closed subsets of C(X) are sometimes called uniformly closed, the closure of a
set A C(X) is called the uniform closure, and so on.
Theorem 6.28 The above norm makes C(X) into a Banach space (a complete normed space).
Proof. Let (fₙ) be a Cauchy sequence in C(X). This means that to every ε > 0 corresponds an n₀ ∈ ℕ such that n, m ≥ n₀ implies ‖fₙ − fₘ‖ < ε. It follows by Proposition 6.1 that there is a function f with domain X to which (fₙ) converges uniformly. By Theorem 6.5, f is continuous. Moreover, f is bounded, since there is an n such that |f(x) − fₙ(x)| < 1 for all x ∈ X, and fₙ is bounded.
Thus f ∈ C(X), and since fₙ → f uniformly on X, we have ‖f − fₙ‖ → 0 as n → ∞.
Chapter 7
Calculus of Functions of Several Variables
In this chapter we consider functions f: U → ℝ or f: U → ℝᵐ where U ⊆ ℝⁿ is an open set. In Subsection 6.4.6 we collected the main properties of continuous functions f. Now we will study differentiation and integration of such functions in more detail.
The Norm of a Linear Mapping
Proposition 7.1 Let T ∈ L(ℝⁿ, ℝᵐ) be a linear mapping of the euclidean space ℝⁿ into ℝᵐ.
(a) Then there exists some C > 0 such that
    ‖T(x)‖₂ ≤ C ‖x‖₂    for all x ∈ ℝⁿ.    (7.1)
(b) T is uniformly continuous.
Proof. (a) Using the standard bases of ℝⁿ and ℝᵐ we identify T with its matrix T = (aᵢⱼ), T eⱼ = ∑_{i=1}^{m} aᵢⱼ eᵢ. For x = (x₁, …, xₙ) we have
    T(x) = ( ∑_{j=1}^{n} a₁ⱼ xⱼ, …, ∑_{j=1}^{n} aₘⱼ xⱼ );
hence, by the Cauchy–Schwarz inequality,
    ‖T(x)‖² = ∑_{i=1}^{m} ( ∑_{j=1}^{n} aᵢⱼ xⱼ )² ≤ ∑_{i=1}^{m} ( ∑_{j=1}^{n} aᵢⱼ² ) ( ∑_{j=1}^{n} |xⱼ|² ) = C² ‖x‖²,
where C = √( ∑_{i,j} aᵢⱼ² ). Consequently,
    ‖T x‖ ≤ C ‖x‖.
(b) Let ε > 0. Put δ = ε/C with the above C. Then ‖x − y‖ < δ implies
    ‖T x − T y‖ = ‖T(x − y)‖ ≤ C ‖x − y‖ < ε,
which proves (b).
Definition 7.1 Let V and W be normed vector spaces and A ∈ L(V, W). The smallest number C with (7.1) is called the norm of the linear map A and is denoted by ‖A‖:
    ‖A‖ = inf{ C | ‖Ax‖ ≤ C ‖x‖ for all x ∈ V }.    (7.2)
By definition,
    ‖Ax‖ ≤ ‖A‖ ‖x‖.    (7.3)
Equivalently,
    ‖A‖ = sup_{x≠0} ‖Ax‖/‖x‖ = sup_{‖x‖=1} ‖Ax‖ = sup_{‖x‖≤1} ‖Ax‖.
Partial derivatives are defined via the limit
    Dᵢf(a) = lim_{h→0} ( f(a₁, …, aᵢ + h, …, aₙ) − f(a₁, …, aₙ) ) / h,    (7.4)
which is required to exist, where h is real and sufficiently small (such that (a₁, …, aᵢ + h, …, aₙ) ∈ U).
Dᵢf(a) is called the ith partial derivative of f at a. We also use the notations
    Dᵢf(a) = ∂f/∂xᵢ (a) = ∂f(a)/∂xᵢ = f_{xᵢ}(a).
It is important that Dᵢf(a) is the ordinary derivative of a certain function; in fact, if g(x) = f(a₁, …, x, …, aₙ), then Dᵢf(a) = g′(aᵢ). That is, Dᵢf(a) is the slope of the tangent line at (a, f(a)) to the curve obtained by intersecting the graph of f with the plane xⱼ = aⱼ, j ≠ i. It also means that computation of Dᵢf(a) is a problem we can already solve.
Example 7.1 (a) f(x, y) = sin(xy²). Then D₁f(x, y) = y² cos(xy²) and D₂f(x, y) = 2xy cos(xy²).
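Because each partial derivative is an ordinary one-variable derivative, it can be checked against a central difference quotient; a sketch (helper names are ad hoc):

```python
import math

def f(x, y):
    return math.sin(x * y ** 2)

def d1(g, x, y, h=1e-6):
    """Central difference approximation of the partial derivative in x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d2(g, x, y, h=1e-6):
    """Central difference approximation of the partial derivative in y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y = 0.7, 1.3
print(d1(f, x, y), y ** 2 * math.cos(x * y ** 2))   # nearly equal
print(d2(f, x, y), 2 * x * y * math.cos(x * y ** 2))  # nearly equal
```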
(b) Consider the radius function r: ℝⁿ → ℝ,
    r(x) = ‖x‖₂ = √( x₁² + ⋯ + xₙ² ),
x = (x₁, …, xₙ) ∈ ℝⁿ. Then r is partial differentiable on ℝⁿ \ {0} with
    ∂r/∂xᵢ (x) = xᵢ / r(x),    x ≠ 0.    (7.5)
Indeed, the function
    f(ξ) = √( x₁² + ⋯ + ξ² + ⋯ + xₙ² )
is differentiable, where x₁, …, x_{i−1}, x_{i+1}, …, xₙ are considered to be constant. Using the chain rule one obtains (with ξ = xᵢ)
    ∂r/∂xᵢ (x) = f′(ξ) = 2ξ / ( 2 √( x₁² + ⋯ + ξ² + ⋯ + xₙ² ) ) = xᵢ / r.
(c) Let f: (0, +∞) → ℝ be differentiable. The composition x ↦ f(r(x)) (with the above radius function r) is denoted by f(r); it is partial differentiable on ℝⁿ \ {0}. The chain rule gives
    ∂/∂xᵢ f(r) = f′(r) ∂r/∂xᵢ = f′(r) xᵢ / r.
(d) Partial differentiability does not imply continuity. Define
    f(x, y) = xy / (x² + y²)² = xy / r⁴    for (x, y) ≠ (0, 0),    f(0, 0) = 0.
Obviously, f is partial differentiable on ℝ² \ {0}. At the origin, by definition of the partial derivative,
    ∂f/∂x (0, 0) = lim_{h→0} f(h, 0)/h = lim_{h→0} 0 = 0.
Since f is symmetric in x and y, ∂f/∂y (0, 0) = 0, too. However, f is not continuous at 0 since f(ε, ε) = 1/(4ε²) becomes large as ε tends to 0.
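The blow-up along the diagonal is easy to observe numerically, while both partial difference quotients at the origin stay 0; a sketch:

```python
def f(x, y):
    """xy / (x^2 + y^2)^2 away from the origin, 0 at the origin."""
    if (x, y) == (0.0, 0.0):
        return 0.0
    return x * y / (x ** 2 + y ** 2) ** 2

# along the diagonal, f(eps, eps) = 1/(4 eps^2) -> infinity
for eps in (0.1, 0.01, 0.001):
    print(eps, f(eps, eps))  # approx 25, 2.5e3, 2.5e5

# yet the partial difference quotients at the origin vanish
h = 1e-6
print(f(h, 0.0) / h, f(0.0, h) / h)  # 0.0 0.0
```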
Remark 7.1 In the next section we will become acquainted with a stronger notion of differentiability which implies continuity. In particular, a continuously partial differentiable function is continuous.
Definition 7.3 Let U ⊆ ℝⁿ be open and f: U → ℝ partial differentiable. The vector
    grad f(x) = ( ∂f/∂x₁ (x), …, ∂f/∂xₙ (x) )    (7.6)
is called the gradient of f at x.
The product rule for partial derivatives gives
    ∂(fg)/∂xᵢ = (∂f/∂xᵢ) g + f (∂g/∂xᵢ).    (7.7)
(c) f(x, y) = xʸ. Then grad f(x, y) = ( y x^{y−1}, xʸ log x ).
The nabla operator is
    ∇ = ( ∂/∂x₁, …, ∂/∂xₙ ).
Definition 7.4 Let U ⊆ ℝⁿ. A vector field on U is a mapping
    v = (v₁, …, vₙ): U → ℝⁿ.    (7.8)
The divergence of a partial differentiable vector field v is
    div v = ∑_{i=1}^{n} ∂vᵢ/∂xᵢ.    (7.9)
The product rule gives the following rule for the divergence. Let f: U → ℝ be a partial differentiable function and
    v = (v₁, …, vₙ): U → ℝⁿ
a partial differentiable vector field; then
    ∂(f vᵢ)/∂xᵢ = (∂f/∂xᵢ) vᵢ + f (∂vᵢ/∂xᵢ).
Summation over i gives
    div (f v) = grad f · v + f div v.
Using the nabla operator this can be rewritten as
    ∇·(f v) = ∇f · v + f ∇·v.
For the identity vector field x = (x₁, …, xₙ) we have
    div x = ∑_{i=1}^{n} ∂xᵢ/∂xᵢ = n    and    x·x = r².
Hence, for the field x/r,
    div (x/r) = grad (1/r) · x + (1/r) div x = − (1/r³) x·x + n/r = − 1/r + n/r = (n−1)/r.    (7.10)
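The identity div(x/r) = (n−1)/r can be verified numerically in ℝ³ with central difference quotients; a sketch (helper names are ad hoc):

```python
import math

def v(x):
    """The vector field x / ‖x‖ in R^3."""
    r = math.sqrt(sum(t * t for t in x))
    return [t / r for t in x]

def divergence(F, x, h=1e-6):
    """Central-difference divergence sum_i dF_i/dx_i at the point x."""
    total = 0.0
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        total += (F(xp)[i] - F(xm)[i]) / (2 * h)
    return total

p = [1.0, 2.0, 2.0]   # here r = 3, so (n-1)/r = 2/3
print(divergence(v, p))  # approx 0.6667
```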
Higher partial derivatives are defined iteratively:
    Dⱼ Dᵢ f = ∂²f/∂xⱼ∂xᵢ = f_{xᵢxⱼ},    Dᵢ Dᵢ f = ∂²f/∂xᵢ²,    D_{iₖ} ⋯ D_{i₁} f = ∂ᵏf / ∂x_{iₖ} ⋯ ∂x_{i₁}.    (7.11)
Consider F(x) = f(x, y) − f(x, 0). By the mean value theorem there is ξ with |ξ| ≤ |x| such that F(x) − F(0) = F′(ξ) x = (f_x(ξ, y) − f_x(ξ, 0)) x; applying the mean value theorem to y ↦ f_x(ξ, y) yields η with |η| ≤ |y| and
    f_x(ξ, y) − f_x(ξ, 0) = f_{xy}(ξ, η) y.
Altogether we have
    F(x) − F(0) = f(x, y) − f(x, 0) − f(0, y) + f(0, 0) = f_{xy}(ξ, η) x y.    (7.12)
The same arguments, but starting with the function G(y) = f(x, y) − f(0, y), show the existence of ξ′ and η′ with |ξ′| ≤ |x|, |η′| ≤ |y| and
    f(x, y) − f(x, 0) − f(0, y) + f(0, 0) = f_{yx}(ξ′, η′) x y.    (7.13)
Corollary 7.3 Let U ⊆ ℝⁿ be open and f: U → ℝ be k-times continuously partial differentiable. Then
    D_{iₖ} ⋯ D_{i₁} f = D_{i_{σ(k)}} ⋯ D_{i_{σ(1)}} f
for every permutation σ of 1, …, k.
Proof. The proof is by induction on k, using the fact that any permutation can be written as a product of transpositions (j, j+1).
Example 7.4 Let U ⊆ ℝ³ be open and let v: U → ℝ³ be a partial differentiable vector field. One defines a new vector field curl v: U → ℝ³, the curl of v, by
    curl v = ( ∂v₃/∂x₂ − ∂v₂/∂x₃,  ∂v₁/∂x₃ − ∂v₃/∂x₁,  ∂v₂/∂x₁ − ∂v₁/∂x₂ ).    (7.14)
Formally one can think of curl v as being the vector product of ∇ and v:
    curl v = ∇ × v = det ( e₁ e₂ e₃ ; ∂/∂x₁ ∂/∂x₂ ∂/∂x₃ ; v₁ v₂ v₃ ),
where e₁, e₂, and e₃ are the unit vectors in ℝ³. If f: U → ℝ has continuous second partial derivatives then, by Proposition 7.2,
    curl grad f = 0.    (7.15)
Indeed, the first component is
    ∂²f/∂x₂∂x₃ − ∂²f/∂x₃∂x₂ = 0.
The other two components are obtained by cyclic permutation of the indices.
We have found: curl v = 0 is a necessary condition for a continuously partial differentiable vector field v: U → ℝ³ to be the gradient of a function f: U → ℝ.
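The identity curl grad f = 0 can be observed numerically for any smooth sample function; the sketch below (all names are ad hoc) takes a numeric gradient and then a numeric curl, which should vanish up to finite-difference error:

```python
import math

def f(x, y, z):
    """A smooth sample scalar field (an arbitrary choice)."""
    return math.sin(x * y) + z ** 2 * x

def grad(g, p, h=1e-5):
    x, y, z = p
    return [(g(x + h, y, z) - g(x - h, y, z)) / (2 * h),
            (g(x, y + h, z) - g(x, y - h, z)) / (2 * h),
            (g(x, y, z + h) - g(x, y, z - h)) / (2 * h)]

def curl(F, p, h=1e-4):
    def D(i, j):  # dF_i/dx_j at p by central differences
        q1 = list(p); q1[j] += h
        q2 = list(p); q2[j] -= h
        return (F(*q1)[i] - F(*q2)[i]) / (2 * h)
    return [D(2, 1) - D(1, 2), D(0, 2) - D(2, 0), D(1, 0) - D(0, 1)]

p = [0.4, -0.7, 1.1]
G = lambda x, y, z: grad(f, [x, y, z])
print(curl(G, p))  # each component approx 0
```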
We call
    Δf = ∂²f/∂x₁² + ⋯ + ∂²f/∂xₙ²    (7.16)
the Laplacian of f and
    Δ = ∂²/∂x₁² + ⋯ + ∂²/∂xₙ²
the Laplace operator. The equation Δf = 0 is called the Laplace equation; its solutions are the harmonic functions. If f depends on an additional time variable t, f: U × I → ℝ, (x, t) ↦ f(x, t), one considers the so-called wave equation
    f_tt − a² Δf = 0,    (7.17)
and the heat equation
    f_t − k Δf = 0.    (7.18)
For a radially symmetric function f(r) (with the above radius function r) one has
    Δ f(r) = f″(r) + (n−1)/r · f′(r).
In particular, Δ (1/r^{n−2}) = 0 if n ≥ 3 and Δ log r = 0 if n = 2. (Prove!)
Note that the mapping h ↦ ah is linear from ℝ to ℝ, and any linear mapping ℝ → ℝ is of that form. This motivates the following definition: f: U → ℝᵐ, U ⊆ ℝⁿ open, is called differentiable at x ∈ U if there exists a linear map A ∈ L(ℝⁿ, ℝᵐ) such that
    lim_{h→0} ‖f(x + h) − f(x) − A(h)‖ / ‖h‖ = 0.    (7.19)
The linear map A ∈ L(ℝⁿ, ℝᵐ) is called the derivative of f at x and will be denoted by Df(x). In case n = m = 1 this notion coincides with the ordinary differentiability of a function.
Remark 7.2 We reformulate the definition of differentiability of f at a ∈ U: define a function φₐ: U − a ⊆ ℝⁿ → ℝᵐ (depending on both a and h) by
    f(a + h) = f(a) + A(h) + φₐ(h).    (7.20)
Then f is differentiable at a if and only if lim_{h→0} ‖φₐ(h)‖/‖h‖ = 0. Replacing the r.h.s. of (7.20) by f(a) + A(h) (forgetting about φₐ) and inserting x in place of a + h and Df(a) in place of A, we obtain the linearization L: ℝⁿ → ℝᵐ of f at a:
    L(x) = f(a) + Df(a)(x − a).    (7.21)
The derivative is unique: if A and A′ both satisfy (7.19) with remainders φ and φ′, then by the triangle inequality
    ‖A(h) − A′(h)‖/‖h‖ ≤ ‖φ(h)‖/‖h‖ + ‖φ′(h)‖/‖h‖.
Since the limit h → 0 on the right exists and equals 0, the l.h.s. also tends to 0 as h → 0, that is,
    lim_{h→0} ‖A(h) − A′(h)‖ / ‖h‖ = 0.
Inserting h = t h₀ with fixed h₀ ≠ 0 we find
    0 = lim_{t→0} ‖A(t h₀) − A′(t h₀)‖ / ‖t h₀‖ = lim_{t→0} |t| ‖A(h₀) − A′(h₀)‖ / ( |t| ‖h₀‖ ) = ‖A(h₀) − A′(h₀)‖ / ‖h₀‖;
hence A(h₀) = A′(h₀) for all h₀, i.e. A = A′.
Definition 7.6 The matrix (aᵢⱼ) ∈ ℝ^{m×n} of the linear map Df(x) with respect to the standard bases in ℝⁿ and ℝᵐ is called the Jacobi matrix of f at x. It is denoted by f′(x).
Remark 7.3 (a) Using a column vector h = (h₁, …, hₙ), the map Df(x)(h) is given by matrix multiplication:
    Df(x)(h) = f′(x) · h = ( a₁₁ … a₁ₙ ; … ; aₘ₁ … aₘₙ ) (h₁, …, hₙ)ᵀ = ( ∑_{j=1}^{n} a₁ⱼ hⱼ, …, ∑_{j=1}^{n} aₘⱼ hⱼ )ᵀ.
Once the standard basis in ℝᵐ is chosen, we can write f(x) = (f₁(x), …, fₘ(x)) as a vector of m scalar functions fᵢ: ℝⁿ → ℝ. By Proposition 6.19 the limit
    lim_{h→0} (1/‖h‖) ( f(x + h) − f(x) − Df(x)(h) )
exists and is equal to 0 if and only if the limit exists for every coordinate i = 1, …, m and is 0:
    lim_{h→0} (1/‖h‖) ( fᵢ(x + h) − fᵢ(x) − ∑_{j=1}^{n} aᵢⱼ hⱼ ) = 0,    i = 1, …, m.    (7.22)
We see: f is differentiable at x if and only if all fᵢ, i = 1, …, m, are. In this case the Jacobi matrix f′(x) is just the collection of the row vectors fᵢ′(x), i = 1, …, m:
    f′(x) = ( f₁′(x) ; … ; fₘ′(x) ).
For example, consider the quadratic form
    f(x) = ∑_{i,j=1}^{n} cᵢⱼ xᵢ xⱼ,    x = (x₁, …, xₙ) ∈ ℝⁿ.
If a, h ∈ ℝⁿ we have f(a + h) − f(a) = ∑_{i,j} cᵢⱼ (aᵢhⱼ + hᵢaⱼ) + ∑_{i,j} cᵢⱼ hᵢhⱼ; since the last sum is o(‖h‖) as h → 0, f is differentiable at a with Df(a)(h) = ∑_{i,j} cᵢⱼ (aᵢhⱼ + hᵢaⱼ).
The entries of the Jacobi matrix are the partial derivatives:
    (aᵢⱼ) = f′(x) = ( ∂fᵢ(x)/∂xⱼ )_{i=1,…,m; j=1,…,n} = ( ∂f₁/∂x₁ … ∂f₁/∂xₙ ; … ; ∂fₘ/∂x₁ … ∂fₘ/∂xₙ ).    (7.23)
Notation. (a) For the Jacobi matrix we also use the notation
    f′(x) = ∂(f₁, …, fₘ)/∂(x₁, …, xₙ) (x).
(b) In case n = m the determinant det(f′(x)) of the Jacobi matrix is called the Jacobian or functional determinant of f at x. It is denoted by
    det(f′(x)) = ∂(f₁, …, fₙ)/∂(x₁, …, xₙ) (x).
Proof. Inserting h = t eⱼ = (0, …, t, …, 0) into (7.22) (see Remark 7.3) we have, since ‖h‖ = |t| and hₖ = t δₖⱼ for all k,
    0 = lim_{t→0} | fᵢ(x + t eⱼ) − fᵢ(x) − ∑_{k=1}^{n} aᵢₖ hₖ | / ‖t eⱼ‖
      = lim_{t→0} | fᵢ(x₁, …, xⱼ + t, …, xₙ) − fᵢ(x) − t aᵢⱼ | / |t|
      = | lim_{t→0} ( fᵢ(x₁, …, xⱼ + t, …, xₙ) − fᵢ(x) ) / t − aᵢⱼ |
      = | ∂fᵢ(x)/∂xⱼ − aᵢⱼ |.
Hence aᵢⱼ = ∂fᵢ(x)/∂xⱼ.
Hyperplanes
A plane in ℝ³ is the set H = {(x₁, x₂, x₃) ∈ ℝ³ | a₁x₁ + a₂x₂ + a₃x₃ = a₄}, where aᵢ ∈ ℝ, i = 1, …, 4. The vector a = (a₁, a₂, a₃) is the normal vector to H; a is orthogonal to any vector x − x′, x, x′ ∈ H. Indeed, a·(x − x′) = a·x − a·x′ = a₄ − a₄ = 0.
The plane H is 2-dimensional since H can be written with two parameters λ₁, λ₂ ∈ ℝ as (x₁⁰, x₂⁰, x₃⁰) + λ₁v₁ + λ₂v₂, where (x₁⁰, x₂⁰, x₃⁰) is some point in H and v₁, v₂ ∈ ℝ³ are independent vectors spanning H.
This concept can be generalized to ℝⁿ. A hyperplane in ℝⁿ is the set of points
    H = {(x₁, …, xₙ) ∈ ℝⁿ | a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = a_{n+1}},
where a₁, …, a_{n+1} ∈ ℝ. The vector a = (a₁, …, aₙ) ∈ ℝⁿ is called the normal vector to the hyperplane H. Note that a is unique only up to scalar multiples. A hyperplane in ℝⁿ is of dimension n − 1 since there are n − 1 linearly independent vectors v₁, …, v_{n−1} ∈ ℝⁿ and a point h ∈ H such that
    H = {h + λ₁v₁ + ⋯ + λ_{n−1}v_{n−1} | λ₁, …, λ_{n−1} ∈ ℝ}.
Example 7.7 (a) Special case m = 1; let f: U → ℝ be differentiable. Then
    f′(x) = ( ∂f/∂x₁ (x), …, ∂f/∂xₙ (x) ) = grad f(x).
It is a row vector and gives a linear functional on ℝⁿ which linearly associates to each vector y = (y₁, …, yₙ) ∈ ℝⁿ a real number:
    Df(x)(y) = grad f(x) · y = ∑_{j=1}^{n} f_{xⱼ}(x) yⱼ.
In particular, by Remark 7.3 (b), the equation of the linearization of f at a (the touching hyperplane) is
    x_{n+1} = L(x) = f(a) + grad f(a)·(x − a) = f(a) + ∑_{j=1}^{n} f_{xⱼ}(a)(xⱼ − aⱼ),
or equivalently
    0 = ∑_{j=1}^{n} f_{xⱼ}(a)(xⱼ − aⱼ) − (x_{n+1} − f(a)),
a hyperplane through (a, f(a)) with normal vector n = (grad f(a), −1), i.e. 0 = n·((x, x_{n+1}) − (a, f(a))).
(b) Special case n = 1; a differentiable map f: (a, b) → ℝᵐ is a curve in ℝᵐ with initial point f(a) and end point f(b). f′(t) = (f₁′(t), …, fₘ′(t)) ∈ ℝ^{m×1} is the Jacobi matrix of f at t (a column vector). It is the tangent vector to the curve f at t ∈ (a, b).
(c) Let f: ℝ³ → ℝ² be given by
    f(x, y, z) = (f₁, f₂) = ( x³ − 3xy² + z,  sin(xyz²) ).
Then
    f′(x, y, z) = ∂(f₁, f₂)/∂(x, y, z)
                = ( 3x² − 3y²   −6xy   1 ;
                    yz² cos(xyz²)   xz² cos(xyz²)   2xyz cos(xyz²) ).    (7.24)
Let b = f(a), A = Df(a), B = Dg(b), and define the remainder functions
    φ(x) = f(x) − f(a) − A(x − a),    (7.25)
    ψ(y) = g(y) − g(b) − B(y − b),    (7.26)
    ρ(x) = g(f(x)) − g(f(a)) − BA(x − a);    (7.27)
by the differentiability of f at a and of g at b,
    lim_{x→a} ‖φ(x)‖/‖x − a‖ = 0,    lim_{y→b} ‖ψ(y)‖/‖y − b‖ = 0,    (7.29)
and we have to show that
    lim_{x→a} ‖ρ(x)‖/‖x − a‖ = 0.
Using (7.25) we rewrite ρ:
    ρ(x) = g(f(x)) − g(f(a)) − B( f(x) − f(a) − φ(x) )
         = [ g(f(x)) − g(f(a)) − B(f(x) − f(a)) ] + B(φ(x))
         = ψ(f(x)) + B(φ(x)).
With y = f(x) this gives
    ‖ρ(x)‖/‖x − a‖ ≤ ( ‖ψ(y)‖/‖y − b‖ ) · ( ‖y − b‖/‖x − a‖ ) + ‖B‖ ‖φ(x)‖/‖x − a‖.
Inserting (7.25) again into the above equation we continue with ‖y − b‖ = ‖φ(x) + A(x − a)‖ ≤ ‖φ(x)‖ + ‖A‖ ‖x − a‖, so that
    ‖ρ(x)‖/‖x − a‖ ≤ ( ‖ψ(y)‖/‖y − b‖ ) ( ‖φ(x)‖/‖x − a‖ + ‖A‖ ) + ‖B‖ ‖φ(x)‖/‖x − a‖.
All terms on the right side tend to 0 as x approaches a. This completes the proof.
Remarks 7.5 (a) The chain rule in coordinates. If A = f′(a), B = g′(f(a)), and C = k′(a) for the composition k = g∘f, then A ∈ ℝ^{m×n}, B ∈ ℝ^{p×m}, C ∈ ℝ^{p×n}, and
    ∂(k₁, …, k_p)/∂(x₁, …, xₙ) = ∂(g₁, …, g_p)/∂(y₁, …, yₘ) · ∂(f₁, …, fₘ)/∂(x₁, …, xₙ),    (7.30)
that is,
    ∂kᵣ/∂xⱼ (a) = ∑_{i=1}^{m} ∂gᵣ/∂yᵢ (f(a)) · ∂fᵢ/∂xⱼ (a),    r = 1, …, p,  j = 1, …, n.    (7.31)
(b) In particular, in case p = 1, k(x) = g(f(x)), we have
    ∂k/∂xⱼ = ∂g/∂y₁ · ∂f₁/∂xⱼ + ⋯ + ∂g/∂yₘ · ∂fₘ/∂xⱼ.
Example 7.8 (a) Let f(u, v) = uv, u = g(x, y) = x² + y², v = h(x, y) = xy, and z = f(g(x, y), h(x, y)) = (x² + y²)xy = x³y + xy³. Then
    ∂z/∂x = ∂f/∂u · ∂g/∂x + ∂f/∂v · ∂h/∂x = v · 2x + u · y = 2x²y + y(x² + y²) = 3x²y + y³.
(b) Let f(u, v) = uᵛ, u(t) = v(t) = t. Then F(t) = f(u(t), v(t)) = tᵗ and
    F′(t) = ∂f/∂u · u′(t) + ∂f/∂v · v′(t) = v u^{v−1} · 1 + uᵛ log u · 1 = t · t^{t−1} + tᵗ log t = tᵗ (log t + 1).
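The derivative F′(t) = tᵗ(log t + 1) obtained from the chain rule can be confirmed against a central difference quotient; a sketch:

```python
import math

def F(t):
    """F(t) = t^t for t > 0."""
    return t ** t

t = 1.7
numeric = (F(t + 1e-6) - F(t - 1e-6)) / 2e-6
exact = t ** t * (math.log(t) + 1.0)
print(numeric, exact)  # nearly equal
```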
The directional derivative of f at a in direction e is
    Dₑf(a) = lim_{t→0} ( f(a + t e) − f(a) ) / t.    (7.32)
For e = eⱼ this is the partial derivative ∂f/∂xⱼ.
Proposition 7.9 Let f: U → ℝ be continuously differentiable. Then for every a ∈ U and every unit vector e ∈ ℝⁿ, ‖e‖ = 1, we have
    Dₑf(a) = e · grad f(a).    (7.33)
Proof. Put k(t) = f(a + t e). By the chain rule,
    k′(0) = ∑_{j=1}^{n} f_{xⱼ}(a) eⱼ = e · grad f(a),    (7.34)
and k′(0) = lim_{t→0} ( f(a + t e) − f(a) ) / t = Dₑf(a). This completes the proof.
Remark 7.6 (Geometric meaning of grad f) Suppose that grad f(a) ≠ 0 and let e be a normed vector, ‖e‖ = 1. Varying e, Dₑf(a) = e · grad f(a) becomes maximal if and only if e and grad f(a) have the same direction. Hence the vector grad f(a) points in the direction of maximal slope of f at a. Similarly, −grad f(a) is the direction of maximal decline.
For example, f(x, y) = √(1 − x² − y²) has
    grad f(x, y) = ( −x/√(1 − x² − y²), −y/√(1 − x² − y²) ).
The maximal slope of f at (x, y) is in the direction e = −(x, y)/√(x² + y²). In this case the tangent line to the graph points towards the z-axis and has maximal slope.
Let f be k-times continuously differentiable on U, a ∈ U, x ∈ ℝⁿ, and put h(t) = f(a + t x). Then
    h^{(k)}(t) = ∑_{i₁,…,iₖ=1}^{n} D_{iₖ} ⋯ D_{i₁} f(a + t x) · x_{i₁} ⋯ x_{iₖ}.    (7.35)
In particular,
    h^{(k)}(0) = ∑_{i₁,…,iₖ=1}^{n} D_{iₖ} ⋯ D_{i₁} f(a) · x_{i₁} ⋯ x_{iₖ}.    (7.36)
Proof. The proof is by induction on k. For k = 1 it is exactly the statement of the proposition on directional derivatives. We demonstrate the step from k = 1 to k = 2. By (7.34),
    h″(t) = d/dt ∑_{i₁=1}^{n} f_{x_{i₁}}(a + t x) x_{i₁} = ∑_{i₁=1}^{n} ∑_{i₂=1}^{n} ∂²f/∂x_{i₂}∂x_{i₁} (a + t x) · x_{i₂} x_{i₁}.
In the second equality we applied the chain rule to t ↦ f_{x_{i₁}}(a + t x).
For brevity we use the following notation for the term on the right of (7.36):
    (x·∇)ᵏ f(a) = ∑_{i₁,…,iₖ=1}^{n} x_{i₁} ⋯ x_{iₖ} D_{iₖ} ⋯ D_{i₁} f(a).
In particular, (x·∇) f(a) = x₁ f_{x₁}(a) + x₂ f_{x₂}(a) + ⋯ + xₙ f_{xₙ}(a) and (x·∇)² f(a) = ∑_{i,j=1}^{n} xᵢ xⱼ ∂²f/∂xᵢ∂xⱼ (a).
Theorem 7.11 Let f ∈ C^{k+1}(U), a ∈ U, and x ∈ ℝⁿ such that a + t x ∈ U for all t ∈ [0, 1]. Then there exists θ ∈ [0, 1] such that
    f(a + x) = ∑_{m=0}^{k} (1/m!) (x·∇)ᵐ f(a) + (1/(k+1)!) (x·∇)^{k+1} f(a + θx).    (7.37)
Written out,
    f(a + x) = f(a) + ∑_{i=1}^{n} xᵢ f_{xᵢ}(a) + (1/2!) ∑_{i,j=1}^{n} xᵢ xⱼ f_{xᵢxⱼ}(a) + ⋯
             + (1/(k+1)!) ∑_{i₁,…,i_{k+1}} x_{i₁} ⋯ x_{i_{k+1}} f_{x_{i₁}⋯x_{i_{k+1}}}(a + θx).
Proof. Apply Taylor's theorem in one variable to h(t) = f(a + t x) on [0, 1] and use
    h^{(m)}(0)/m! = (1/m!) (x·∇)ᵐ f(a)    and    h^{(k+1)}(θ)/(k+1)! = (1/(k+1)!) (x·∇)^{k+1} f(a + θx).
Replacing x by x − a, Taylor's formula takes the form
    f(x) = ∑_{m=0}^{k} (1/m!) ((x−a)·∇)ᵐ f(a) + (1/(k+1)!) ((x−a)·∇)^{k+1} f(a + θ(x − a)),
that is,
    f(x) = f(a) + ∑_{i=1}^{n} (xᵢ − aᵢ) f_{xᵢ}(a) + (1/2!) ∑_{i,j=1}^{n} (xᵢ − aᵢ)(xⱼ − aⱼ) f_{xᵢxⱼ}(a) + ⋯
         + (1/(k+1)!) ∑_{i₁,…,i_{k+1}} (x_{i₁} − a_{i₁}) ⋯ (x_{i_{k+1}} − a_{i_{k+1}}) f_{x_{i₁}⋯x_{i_{k+1}}}(a + θ(x − a)).
The corresponding Taylor series is
    f(x) = ∑_{m=0}^{∞} (1/m!) ((x−a)·∇)ᵐ f(a).
For f(x, y) = cos x sin y at (0, 0) one computes
    f_x(0, 0) = 0,    f_y = cos x cos y,    f_y(0, 0) = 1,
    f_xx(0, 0) = 0,    f_xy(0, 0) = 0,    f_yy = −cos x sin y,    f_yy(0, 0) = 0,
    f_xxy(0, 0) = −1,    f_yyy(0, 0) = −1.
Hence
    f(x, y) = y − (1/3!) ( 3x²y + y³ ) + R₄(x, y; 0) = y − x²y/2 − y³/6 + R₄.
The same result can be obtained by multiplying the Taylor series for cos x and sin y:
    ( 1 − x²/2 + x⁴/4! ∓ ⋯ ) ( y − y³/3! ± ⋯ ) = y − x²y/2 − y³/6 + ⋯.
Another example:
    f(x, y) = e^{xy²} = ∑_{n=0}^{∞} (xy²)ⁿ / n! = 1 + xy² + ½ x²y⁴ + ⋯.    (7.38)
where
    lim_{x→0} ρ(x)/‖x‖ᵏ = 0.    (7.39)
Proof. By Taylor's theorem for f ∈ Cᵏ(U), there exists θ ∈ [0, 1] such that
    f(x + a) = ∑_{m=0}^{k−1} (1/m!) (x·∇)ᵐ f(a) + (1/k!) (x·∇)ᵏ f(a + θx) = ∑_{m=0}^{k} (1/m!) (x·∇)ᵐ f(a) + ρ(x).
This implies
    ρ(x) = (1/k!) ( (x·∇)ᵏ f(a + θx) − (x·∇)ᵏ f(a) ).
Since |x_{i₁} ⋯ x_{iₖ}| ≤ ‖x‖ᵏ, we obtain
    |ρ(x)| / ‖x‖ᵏ ≤ (1/k!) ∑_{i₁,…,iₖ} | D_{iₖ} ⋯ D_{i₁} ( f(a + θx) − f(a) ) | → 0  as x → 0,
by the continuity of the kth partial derivatives.
Replacing x by x − a we can write this as
    f(x) = ∑_{m=0}^{k} Pₘ(x) + ρ(x),    Pₘ(x) = (1/m!) ((x−a)·∇)ᵐ f(a),
with
    lim_{x→a} |ρ(x)| / ‖x − a‖ᵏ = 0.
Using Corollary 7.13, the first order approximation of a continuously differentiable function is
    f(x) = f(a) + grad f(a)·(x − a) + ρ(x),    lim_{x→a} ρ(x)/‖x − a‖ = 0.
As a special case, for f ∈ C²(U) the second order approximation reads
    f(a + x) = f(a) + grad f(a)·x + ½ xᵀ Hess f(a) x + ρ(x),    (7.40)
    lim_{x→0} ρ(x)/‖x‖² = 0,    (7.41)
where
    Hess f(a) = ( f_{xᵢxⱼ}(a) )_{i,j=1}^{n}    (7.42)
is the Hessian matrix of f at a. (Recall ∂f/∂xᵢ(x) = lim_{t→0} ( f(x + t eᵢ) − f(x) ) / t.)
Example 7.10 Let f(x, y) = √(1 − x² − y²) be defined on the open unit disc U = {(x, y) ∈ ℝ² | x² + y² < 1}. Then grad f(x, y) = ( −x/√(1 − x² − y²), −y/√(1 − x² − y²) ) = 0 if and only if x = y = 0. If f has an extremum in U, then it is at the origin. Obviously, f(x, y) = √(1 − x² − y²) ≤ 1 = f(0, 0) for all points in U, such that f attains its global (and local) maximum at (0, 0).
To obtain a sufficient criterion for the existence of local extrema we have to consider the Hessian matrix. Before that, we need some facts from linear algebra.
Definition 7.9 Let A ∈ ℝ^{n×n} be a real, symmetric n × n-matrix, that is, aᵢⱼ = aⱼᵢ for all i, j = 1, …, n. The associated quadratic form
    Q(x) = ∑_{i,j=1}^{n} aᵢⱼ xᵢ xⱼ = xᵀ A x
is called
    positive definite if Q(x) > 0 for all x ≠ 0,
    negative definite if Q(x) < 0 for all x ≠ 0,
    indefinite if Q takes both positive and negative values,
    positive semidefinite if Q(x) ≥ 0 for all x,
    negative semidefinite if Q(x) ≤ 0 for all x.
Also, we say that the corresponding matrix A is positive definite if Q(x) is.
Example 7.11 Let n = 2, Q(x) = Q(x₁, x₂). Then Q₁(x) = 3x₁² + 7x₂² is positive definite, Q₂(x) = −x₁² − 2x₂² is negative definite, Q₃(x) = x₁² − 2x₂² is indefinite, Q₄(x) = x₁² is positive semidefinite, and Q₅(x) = −x₂² is negative semidefinite.
Proposition 7.15 (Sylvester) Let A be a real symmetric n × n-matrix with eigenvalues λ₁, …, λₙ, and let Q(x) = xᵀAx be the corresponding quadratic form. For k = 1, …, n let Aₖ = (aᵢⱼ)_{i,j=1}^{k} be the leading k × k-submatrix and Dₖ = det Aₖ.
(a) Q is positive definite if and only if λ₁ > 0, λ₂ > 0, …, λₙ > 0. This is the case if and only if D₁ > 0, D₂ > 0, …, Dₙ > 0.
(b) Q(x) is negative definite if and only if λ₁ < 0, λ₂ < 0, …, λₙ < 0. This is the case if and only if (−1)ᵏ Dₖ > 0 for all k = 1, …, n.
(c) Q(x) is indefinite if and only if A has both positive and negative eigenvalues.
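Sylvester's minor criterion is directly computable; the sketch below (function names are ad hoc) evaluates the leading principal minors D₁, …, Dₙ with a small Laplace-expansion determinant:

```python
def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def leading_minors(A):
    """D_1, ..., D_n: determinants of the leading principal submatrices."""
    n = len(A)
    return [det([row[:k] for row in A[:k]]) for k in range(1, n + 1)]

def is_positive_definite(A):
    return all(d > 0 for d in leading_minors(A))

def is_negative_definite(A):
    # (-1)^k D_k > 0 for k = 1, ..., n
    return all((-1) ** (k + 1) * d > 0 for k, d in enumerate(leading_minors(A)))

A = [[2.0, 1.0], [1.0, 3.0]]   # D1 = 2 > 0, D2 = 5 > 0
print(is_positive_definite(A), is_negative_definite(A))  # True False
```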
Example 7.12 Case n = 2. Let A ∈ ℝ^{2×2}, A = ( a b ; b c ), be a symmetric matrix. By Sylvester's criterion, A is
(a) positive definite if and only if det A > 0 and a > 0,
Proof. Since grad f(a) = 0, the second order approximation (7.40) gives
    f(a + x) = f(a) + ½ Q(x) + ρ(x),    lim_{x→0} ρ(x)/‖x‖² = 0.    (7.43)
Let m be the minimum of Q on the unit sphere S = {y | ‖y‖ = 1}. Since Q(x) is positive definite and 0 ∉ S, m > 0. If x is nonzero, y = x/‖x‖ ∈ S and therefore
    m ≤ yᵀA(y) = ( x/‖x‖ )ᵀ A ( x/‖x‖ ) = xᵀA(x) / ‖x‖².
This implies Q(x) = xᵀA(x) ≥ m ‖x‖² for all x.
Since ρ(x)/‖x‖² → 0 as x → 0, there exists δ > 0 such that ‖x‖ < δ implies
    − (m/4) ‖x‖² ≤ ρ(x) ≤ (m/4) ‖x‖².
Hence, for 0 < ‖x‖ < δ,
    f(a + x) = f(a) + ½ Q(x) + ρ(x) ≥ f(a) + ½ m ‖x‖² − ¼ m ‖x‖² = f(a) + ¼ m ‖x‖²,
hence
    f(a + x) > f(a).
In the indefinite case, choose x ∈ ℝⁿ \ {0} with m := xᵀA(x) > 0. If t is small enough, |ρ(tx)| ≤ (m/4) t², hence
    f(a + t x) ≥ f(a) + ¼ m t² > f(a)    if 0 < |t| < δ.
Similarly, if y ∈ ℝⁿ \ {0} satisfies yᵀA(y) < 0, for sufficiently small t we have f(a + t y) < f(a).
To compute the global extremum of f on the boundary one has to find the local extrema at the interior points of the boundary and to compare them with the values on the boundary of the boundary.
This matrix is positive semidefinite in case y > 0, negative semidefinite in case y < 0, and 0 at (0, 0). Hence, the above criterion gives no answer. We have to apply the definition directly. In case y > 0 we have f(x, y) = x²y ≥ 0 for all x; in particular f(x, y) ≥ f(0, y) = 0, hence (0, y) is a local minimum. Similarly, in case y < 0, f(x, y) ≤ f(0, y) = 0 for all x; hence f has a local maximum at (0, y), y < 0. However, f takes both positive and negative values in every neighborhood of (0, 0), for example f(ε, ε) = ε³ and f(ε, −ε) = −ε³. Thus (0, 0) is not a local extremum.
We have to consider the boundary x² + y² = 1. Inserting x² = 1 − y² we obtain
    g(y) = f(x, y)|_{x²+y²=1} = x²y|_{x²+y²=1} = (1 − y²) y = y − y³,    |y| ≤ 1.
We compute the local extrema on the boundary x² + y² = 1 (note that the circle has no boundary, such that the local extrema are actually the global extrema):
    g′(y) = 1 − 3y² = 0,    |y| = 1/√3.
Since g″(1/√3) < 0 and g″(−1/√3) > 0, g attains its maximum (2/9)√3 at y = 1/√3. Since this is greater than the value 0 of f at the critical points (0, y), f attains its global maximum at the two points
    M₁,₂ = ( ±√(2/3), 1/√3 ),
with value f(M₁,₂) = x²y = (2/3)·(1/√3) = 2/(3√3) = (2/9)√3.
(b) Among all boxes with volume 1, find the one for which the sum of the lengths of the 12 edges is minimal.
Let x, y, and z denote the lengths of the three perpendicular edges at one vertex. By assumption xyz = 1; and g(x, y, z) = 4(x + y + z) is the function to minimize.
Eliminating z = 1/(xy), we minimize f(x, y) = 4(x + y + 1/(xy)) on the open quadrant U = {(x, y) | x, y > 0}. The second derivatives are
    f_xx = 8/(x³y),    f_yy = 8/(xy³),    f_xy = 4/(x²y²),
so
    det Hess f(1, 1) = det ( 8 4 ; 4 8 ) = 64 − 16 > 0;
hence f has an extremum at (1, 1). Since f_xx(1, 1) = 8 > 0, f has a local minimum at (1, 1).
Global extrema. We show that (1, 1) is even the global minimum on the first quadrant U. Consider N = {(x, y) | 1/25 ≤ x, y ≤ 5}. If (x, y) ∉ N, then
    f(x, y) ≥ 4(5 + 0 + 0) = 20.
Since f(1, 1) = 12 < 20, the global minimum of f on the right-upper quadrant is attained on the compact rectangle N. Inserting the four boundary edges of N, in all cases f(x, y) ≥ 20, such that the local minimum at (1, 1) is also the global minimum.
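The critical point and minimality can be checked numerically for f(x, y) = 4(x + y + 1/(xy)); a sketch:

```python
def f(x, y):
    """Edge-sum of a volume-1 box with base edges x, y: 4(x + y + 1/(xy))."""
    return 4.0 * (x + y + 1.0 / (x * y))

h = 1e-6
# the gradient vanishes at (1, 1), where f(1, 1) = 12
fx = (f(1 + h, 1) - f(1 - h, 1)) / (2 * h)
fy = (f(1, 1 + h) - f(1, 1 - h)) / (2 * h)
print(f(1.0, 1.0), fx, fy)  # 12.0 and two values near 0

# nearby points give strictly larger values
print(f(1.1, 0.9), f(0.8, 1.2))  # both > 12
```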
[Figure: the graph of a function f: U ⊆ ℝⁿ → ℝ over a neighborhood U of a point a, with the value f(a) marked.]
Theorem 7.17 (Inverse Mapping Theorem) Suppose that f: ℝⁿ → ℝⁿ is continuously differentiable on an open set U containing a, and det f′(a) ≠ 0. Then there is an open set V ⊆ U containing a and an open set W containing f(a) such that f: V → W has a continuous inverse g: W → V which is differentiable for all y ∈ W. For y = f(x) we have
    g′(y) = ( f′(x) )⁻¹.    (7.44)
Note that f may be invertible even when the Jacobian vanishes: the function f: ℝ → ℝ defined by f(x) = x³ has f′(0) = 0; however, g(y) = ∛y is inverse to f(x). One thing is certain: if det f′(a) = 0 then g cannot be differentiable at f(a). If g were differentiable at f(a), the chain rule applied to g(f(x)) = x would give
    g′(f(a)) f′(a) = id
and consequently f′(a) would be invertible, a contradiction.
(a) Polar coordinates. Consider f(r, φ) = (x, y) = (r cos φ, r sin φ). Then
    ∂(x, y)/∂(r, φ) = det ( x_r  x_φ ; y_r  y_φ ) = det ( cos φ  −r sin φ ; sin φ  r cos φ ) = r.
Let f(r₀, φ₀) = (x₀, y₀) ≠ (0, 0); then r₀ ≠ 0 and the Jacobian of f at (r₀, φ₀) is non-zero. Since all partial derivatives of f with respect to r and φ exist and are continuous on ℝ², the assumptions of the theorem are satisfied. Hence, in a neighborhood U of (x₀, y₀) there exists a continuously differentiable inverse function r = r(x, y), φ = φ(x, y). In this case the function can be given explicitly, r = √(x² + y²), φ = arg(x, y). We want to compute the Jacobi matrix of the inverse function. Since the inverse matrix is
    ( cos φ  −r sin φ ; sin φ  r cos φ )⁻¹ = ( cos φ  sin φ ; −(1/r) sin φ  (1/r) cos φ ),
we obtain by the theorem
    g′(x, y) = ∂(r, φ)/∂(x, y) = ( cos φ  sin φ ; −(1/r) sin φ  (1/r) cos φ )
             = ( x/√(x² + y²)  y/√(x² + y²) ; −y/(x² + y²)  x/(x² + y²) );
in particular, the second row gives the partial derivatives of the argument function with respect to x and y:
    ∂ arg(x, y)/∂x = −y/(x² + y²),    ∂ arg(x, y)/∂y = x/(x² + y²).
Note that we have not determined the explicit form of the argument function, which is not unique since f(r, φ + 2kπ) = f(r, φ) for all k ∈ ℤ. However, the gradient always takes the above form.
Note that det f (r, ) 6= 0 for all r 6= 0 is not sufficient for f to be injective on R2 \ {(0, 0)}.
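These formulas are easy to sanity-check numerically. The sketch below (plain Python; the sample point and step size are arbitrary choices, not from the text) compares central difference quotients of arg(x, y) = atan2(y, x) with the partials obtained from the inverse mapping theorem.

```python
import math

def arg(x, y):
    # one continuous branch of the argument function away from the negative x-axis
    return math.atan2(y, x)

def num_grad(f, x, y, h=1e-6):
    # central difference approximation of (f_x, f_y)
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

x, y = 0.8, 1.3
r2 = x * x + y * y
gx, gy = num_grad(arg, x, y)     # numerical partials of arg
ex, ey = -y / r2, x / r2         # partials from the inverse mapping theorem
```

The two pairs agree to roughly the accuracy of the difference quotient.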
(b) Let f : R² → R² be given by (u, v) = f(x, y) where
u(x, y) = sin x + cos y,  v(x, y) = cos x + sin y.
Since
∂(u, v)/∂(x, y) = det ( ux  uy ; vx  vy ) = det ( cos x  −sin y ; −sin x  cos y ) = cos x cos y − sin x sin y = cos(x + y),
the Jacobian is nonzero whenever x + y ≠ π/2 + kπ. Since f(−π/4, π/4) = (0, √2), the inverse function g(u, v) = (x, y) is defined in a neighborhood of (0, √2) with
g′(0, √2) = (f′(−π/4, π/4))⁻¹ = ( √2/2  −√2/2 ; √2/2  √2/2 )⁻¹ = ( √2/2  √2/2 ; −√2/2  √2/2 ).
Note that at the point (π/4, π/4) the Jacobian of f vanishes. There is indeed no neighborhood of (π/4, π/4) where f is injective since for all t ∈ R
f(π/4 + t, π/4 + t) = f(π/4 − t, π/4 − t).
Question: Is any hypersurface locally the graph of a differentiable function? More precisely, we may ask the following: Suppose that f : Rⁿ × R → R is differentiable and f(a1, …, an, b) = 0. Can we find for each (x1, …, xn) near (a1, …, an) a unique y near b such that f(x1, …, xn, y) = 0? The answer to this question is provided by the Implicit Function Theorem (IFT).
Consider the function f : R² → R defined by f(x, y) = x² + y² − 1. If we choose (a, b) with a, b > 0, there are open intervals A and B containing a and b with the following property: if x ∈ A, there is a unique y ∈ B with f(x, y) = 0. We can therefore define a function g : A → B by the conditions g(x) ∈ B and f(x, g(x)) = 0. If b > 0 then g(x) = √(1 − x²); if b < 0 then g(x) = −√(1 − x²). Both functions g are differentiable. These functions are said to be defined implicitly by the equation f(x, y) = 0.
On the other hand, there exists no neighborhood of (1, 0) such that f(x, y) = 0 can locally be solved for y. Note that fy(1, 0) = 0. However it can be solved for x = h(y) = √(1 − y²).
g′(x) = −(fy(x, g(x)))⁻¹ fx(x, g(x)),  (7.46)
that is, componentwise,
∂gk(x)/∂xj = −Σ_{l=1}^{m} ((fy(x, g(x)))⁻¹)_{kl} ∂fl(x, g(x))/∂xj,  k = 1, …, m, j = 1, …, n.
Idea of Proof. Define F : Rⁿ × Rᵐ → Rⁿ × Rᵐ by F(x, y) = (x, f(x, y)). Let M = fy(a, b). Then
F′(a, b) = ( 1n  0_{n,m} ; fx(a, b)  M ),  hence det F′(a, b) = det M ≠ 0.
By the inverse mapping theorem, Theorem 7.17, there exists an open set W ⊆ Rⁿ × Rᵐ containing F(a, b) = (a, 0) and an open set V ⊆ Rⁿ × Rᵐ containing (a, b), which may be taken of the form A × B, such that F : A × B → W has a differentiable inverse h : W → A × B.
Since g is differentiable, it is easy to find the Jacobi matrix. In fact, since fi(x, g(x)) = 0 for i = 1, …, m, taking the partial derivative ∂/∂xj on both sides gives by the chain rule
0 = ∂fi(x, g(x))/∂xj + Σ_{k=1}^{m} ∂fi/∂yk (x, g(x)) · ∂gk(x)/∂xj,
that is, in matrix form,
0 = fx(x, g(x)) + fy(x, g(x)) g′(x).
Since det fy(a, b) ≠ 0 and the entries of fy are continuous, det fy(x, y) ≠ 0 in a small neighborhood of (a, b). Hence fy(x, g(x)) is invertible and we can multiply the preceding equation from the left by (fy(x, g(x)))⁻¹, which gives (7.46).
Remarks 7.9 (a) The theorem gives a sufficient condition for locally solving the system of equations
0 = f1(x1, …, xn, y1, …, ym),
⋮
0 = fm(x1, …, xn, y1, …, ym).
(b) In the case m = 1, formula (7.46) reads
g′(x) = −fx(x, g(x)) / fy(x, g(x)),
which also follows directly from 0 = d/dx (f(x, g(x))) = fx(x, g(x)) + fy(x, g(x)) g′(x).
Example 7.16 (a) Let f(x, y) = sin(x + y) + e^{xy} − 1. Note that f(0, 0) = 0. Since
fy(0, 0) = (cos(x + y) + x e^{xy})|_{(0,0)} = cos 0 + 0 = 1 ≠ 0,
f(x, y) = 0 can uniquely be solved for y = g(x) in a neighborhood of x = 0, y = 0. Further
fx(0, 0) = (cos(x + y) + y e^{xy})|_{(0,0)} = 1.
By Remark 7.9 (b),
g′(x) = −fx(x, y)/fy(x, y)|_{y=g(x)} = −(cos(x + g(x)) + g(x) e^{x g(x)}) / (cos(x + g(x)) + x e^{x g(x)}).
In particular g′(0) = −1.
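The implicitly defined function g and the value g′(0) = −1 can be reproduced numerically. A minimal sketch (the bisection bracket [−1, 1] is an assumption that is valid for x near 0, where f(x, −1) < 0 < f(x, 1)):

```python
import math

def f(x, y):
    return math.sin(x + y) + math.exp(x * y) - 1.0

def g(x, lo=-1.0, hi=1.0):
    # solve f(x, y) = 0 for y by bisection; valid bracket for x near 0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f(x, lo) * f(x, mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

h = 1e-5
gprime0 = (g(h) - g(-h)) / (2 * h)   # difference quotient at x = 0
```

The difference quotient approximates g′(0) = −1, and f(x, g(x)) vanishes up to rounding.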
Remark. Differentiating the equation fx + fy g′ = 0 once more we obtain
g″ = −(fxx fy² − 2 fxy fx fy + fyy fx²) / fy³.
Since
fxx(0, 0) = (−sin(x + y) + y² e^{xy})|_{(0,0)} = 0,  fyy(0, 0) = (−sin(x + y) + x² e^{xy})|_{(0,0)} = 0.
Inserting a curve t ↦ (x(t), y(t)) into the equation y = g(x) we have y(t) = g(x(t)). Differentiation gives
y′ = g′ x′,  y″ = g″ x′² + g′ x″.
Thus
g′(x) = y′/x′,  g″(x) = (y″ − g′ x″)/x′² = (y″ x′ − x″ y′)/x′³.
In particular, the second-order Taylor polynomial of g at x0 is g(x0) + g′(x0)(x − x0) + (g″(x0)/2)(x − x0)².
The equation
xn − an = −(1/Fxn(a)) Σ_{j=1}^{n−1} Fxj(a)(xj − aj)  (7.47)
is the equation of the tangent hyperplane to the surface F(x) = 0 at the point a.
Proof. Indeed, since the gradient at a is nonzero we may assume without loss of generality that Fxn(a) ≠ 0. By the IFT, F(x1, …, x_{n−1}, xn) = 0 is locally solvable for xn = g(x1, …, x_{n−1}) in a neighborhood of a = (a1, …, an) with g(ã) = an, where ã = (a1, …, a_{n−1}) and x̃ = (x1, …, x_{n−1}). Define the tangent hyperplane to be the graph of the linearization of g at (a1, …, a_{n−1}, an). By Example 7.7 (a) the tangent hyperplane to the graph of g at ã is given by
xn = g(ã) + grad g(ã)·(x̃ − ã).
Since F(ã, g(ã)) = 0, by the implicit function theorem
∂g(ã)/∂xj = −Fxj(a)/Fxn(a),  j = 1, …, n − 1.
Inserting this into the equation of the hyperplane gives (7.47).
[Figure: level sets f = c of the objective together with the constraint set φ = 0 and neighborhoods U1, U0, U−1.]
Theorem 7.20 (Lagrange Multiplier Rule) Let f, φ : U → R, U ⊆ Rⁿ open, be continuously differentiable and suppose f has a local extremum at a ∈ U under the constraint φ(x) = 0. Suppose that grad φ(a) ≠ 0.
Then there exists a real number λ such that
grad f(a) = λ grad φ(a).
This number λ is called a Lagrange multiplier.
Proof. The idea is to solve the constraint φ(x) = 0 for one variable and to consider the free extremum problem with one variable less. Suppose without loss of generality that φxn(a) ≠ 0. By the implicit function theorem we can solve φ(x) = 0 for xn = g(x1, …, x_{n−1}) in a neighborhood of x = a. Differentiating φ(x̃, g(x̃)) = 0 and inserting a = (ã, an) as before we have
φxj(a) + φxn(a) gxj(ã) = 0,  j = 1, …, n − 1.  (7.48)
Since h(x̃) = f(x̃, g(x̃)) has a local extremum at ã, all partial derivatives of h vanish at ã:
fxj(a) + fxn(a) gxj(ã) = 0,  j = 1, …, n − 1.  (7.49)
Setting λ = fxn(a)/φxn(a) and comparing (7.48) and (7.49) we find
fxj(a) = λ φxj(a),  j = 1, …, n − 1.
Since by definition fxn(a) = λ φxn(a), we finally obtain grad f(a) = λ grad φ(a), which completes the proof.
Example 7.17 (a) Let A = (aij) be a real symmetric n × n matrix, and define f(x) = x·Ax = Σ_{i,j} aij xi xj. We ask for the local extrema of f on the unit sphere S^{n−1} = {x ∈ Rⁿ | ‖x‖ = 1}. This constraint can be written as φ(x) = ‖x‖² − 1 = Σ_{i=1}^{n} xi² − 1 = 0. Suppose that f attains a local minimum at a ∈ S^{n−1}. By Example 7.6 (b),
grad f(a) = 2A a.
On the other hand,
grad φ(a) = (2x1, …, 2xn)|_{x=a} = 2a.
By Theorem 7.20 there exists a real number λ1 such that
grad f(a) = 2A a = λ1 grad φ(a) = 2λ1 a.
Hence A a = λ1 a; that is, λ1 is an eigenvalue of A and a a corresponding eigenvector. In particular, A has a real eigenvalue. Since S^{n−1} has no boundary, the global minimum is also a local one. We find: if f(a) = a·A a = λ a·a = λ is the global minimum, λ is the smallest eigenvalue.
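For a concrete symmetric matrix the conclusion of this example can be tested directly. The sketch below uses an assumed 2×2 matrix (not from the text) and a brute-force grid over the unit circle in place of a real optimizer, comparing the minimum of f on the sphere with the smallest eigenvalue from the characteristic polynomial.

```python
import math

# an assumed concrete symmetric matrix
A = [[2.0, 1.0], [1.0, 3.0]]

def f(theta):
    # f(x) = x . A x on the unit circle, x = (cos t, sin t)
    x = (math.cos(theta), math.sin(theta))
    return sum(A[i][j] * x[i] * x[j] for i in range(2) for j in range(2))

# brute-force global minimum of f on the sphere S^1
fmin = min(f(2 * math.pi * k / 20000) for k in range(20000))

# smallest eigenvalue of A via the characteristic polynomial of a 2x2 matrix
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
lam_min = (tr - math.sqrt(tr * tr - 4 * det)) / 2
```

Both numbers agree, illustrating that the constrained minimum of x·Ax on the sphere is the smallest eigenvalue.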
(b) Let a be the point of a hypersurface M = {x | φ(x) = 0} with minimal distance to a given point b ∉ M. Then the line through a and b is orthogonal to M.
Indeed, the function f(x) = ‖x − b‖² attains its minimum under the condition φ(x) = 0 at a. By the theorem, there is a real number λ such that
grad f(a) = 2(a − b) = λ grad φ(a).
The assertion follows since by Example 7.16 (c), grad φ(a) is orthogonal to M at a, and b − a is a multiple of the normal vector grad φ(a).
Theorem 7.21 (Lagrange Multiplier Rule, extended version) Let f, φi : U → R, i = 1, …, m, m < n, be continuously differentiable functions. Let M = {x ∈ U | φ1(x) = ⋯ = φm(x) = 0} and suppose that f has a local extremum at a under the constraint x ∈ M. Suppose further that the Jacobi matrix φ′(a) ∈ R^{m×n} has maximal rank m.
Then there exist real numbers λ1, …, λm such that
grad f(a) − grad(λ1 φ1 + ⋯ + λm φm)(a) = 0.
Note that the rank condition ensures that there is a choice of m variables out of x1, …, xn such that the Jacobian of φ1, …, φm with respect to this set of variables is nonzero at a.
Proof. Let ε > 0. Since f is continuous on the compact set R, f is uniformly continuous on R (see Proposition 6.25). Hence, there is a δ > 0 such that |x − x′| < δ and |y − y′| < δ and (x, y), (x′, y′) ∈ R imply
|f(x, y) − f(x′, y′)| < ε.
Therefore, for |y − y0| < δ,
|I(y) − I(y0)| = |∫_a^b (f(x, y) − f(x, y0)) dx| ≤ ε (b − a),
which proves continuity of I at y0.
Remark 7.10 (a) Note that continuity at y0 means that we can interchange the limit and the integral:
lim_{y→y0} ∫_a^b f(x, y) dx = ∫_a^b lim_{y→y0} f(x, y) dx = ∫_a^b f(x, y0) dx.
(b) A similar statement holds for y → +∞: Suppose that f(x, y) is continuous on [a, b] × [c, +∞) and lim_{y→+∞} f(x, y) = φ(x) exists uniformly for all x ∈ [a, b], that is,
∀ ε > 0 ∃ R > 0 ∀ x ∈ [a, b], y ≥ R : |f(x, y) − φ(x)| < ε.
Then
lim_{y→+∞} ∫_a^b f(x, y) dx = ∫_a^b φ(x) dx.
Proposition 7.23 If fy is continuous on R = [a, b] × [c, d], then I(y) = ∫_a^b f(x, y) dx is differentiable with
I′(y) = ∫_a^b fy(x, y) dx.
Proof. Let ε > 0. Since fy(x, y) is continuous, it is uniformly continuous on R. Hence there exists δ > 0 such that |x − x′| < δ and |y − y′| < δ imply |fy(x, y) − fy(x′, y′)| < ε. We have for |h| < δ
| (I(y0 + h) − I(y0))/h − ∫_a^b fy(x, y0) dx | = | ∫_a^b ( (f(x, y0 + h) − f(x, y0))/h − fy(x, y0) ) dx |
= | ∫_a^b (fy(x, y0 + θh) − fy(x, y0)) dx | ≤ ∫_a^b |fy(x, y0 + θh) − fy(x, y0)| dx < ε (b − a)
for some θ ∈ (0, 1), by the mean value theorem. Since this inequality holds for all small h, it holds for the limit as h → 0, too. Thus,
| I′(y0) − ∫_a^b fy(x, y0) dx | ≤ ε (b − a).
Now let the limits of integration also depend on y, I(y) = ∫_{α(y)}^{β(y)} f(x, y) dx. Set F(y, u, v) = ∫_u^v f(x, y) dx; then I(y) = F(y, α(y), β(y)). The fundamental theorem of calculus gives
∂F/∂v (y, u, v) = f(v, y),  (7.50)
∂F/∂u (y, u, v) = −f(u, y),  (7.51)
and by the chain rule
I′(y) = ∫_{α(y)}^{β(y)} fy(x, y) dx + f(β(y), y) β′(y) − f(α(y), y) α′(y).
Example 7.18 (a) I(y) = ∫_3^4 sin(xy)/x dx is differentiable by Proposition 7.23 since fy(x, y) = cos(xy)·x/x = cos(xy) is continuous. Hence
I′(y) = ∫_3^4 cos(xy) dx = sin(xy)/y |_{x=3}^{4} = (sin 4y − sin 3y)/y.
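Differentiation under the integral sign is easy to verify numerically. A sketch using a composite trapezoidal rule and an arbitrarily chosen sample point y (both are ad-hoc choices, not from the text):

```python
import math

def quad(f, a, b, n=20000):
    # composite trapezoidal rule (a simple stand-in for a library integrator)
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def I(y):
    return quad(lambda x: math.sin(x * y) / x, 3.0, 4.0)

y, h = 1.7, 1e-5
dI_numeric = (I(y + h) - I(y - h)) / (2 * h)          # difference quotient of I
dI_formula = (math.sin(4 * y) - math.sin(3 * y)) / y  # the closed form above
```

The difference quotient of I matches the closed form obtained by differentiating under the integral.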
(b) I(y) = ∫_{log y}^{sin y} e^{x²y} dx is differentiable with
I′(y) = ∫_{log y}^{sin y} x² e^{x²y} dx + cos y · e^{y sin²y} − (1/y) e^{y (log y)²}.
Note that the Cauchy and Weierstraß criteria (see Proposition 6.1 and Theorem 6.2) for uniform convergence of series of functions also hold for improper parametric integrals. For example the theorem of Weierstraß now reads as follows.
Proposition 7.25 Suppose that ∫_a^A f(x, y) dx exists for all A ≥ a and y ∈ [c, d]. Suppose further that |f(x, y)| ≤ φ(x) for all x ≥ a and ∫_a^∞ φ(x) dx converges.
Then ∫_a^∞ f(x, y) dx converges uniformly with respect to y ∈ [c, d].
Example 7.19 I(y) = ∫_1^∞ e^{−xy} x^y y² dx converges uniformly on [2, 4] since
|f(x, y)| = e^{−xy} x^y y² ≤ e^{−2x} x⁴ 4² = φ(x)
and ∫_1^∞ e^{−2x} x⁴ 4² dx converges.
If we add the assumption of uniform convergence then the preceding theorems remain true for improper integrals.
Hence there is δ > 0 such that |x − x′| < δ and |y − y′| < δ imply
|f(x, y) − f(x′, y′)| < ε/(A − a).
Therefore,
∫_a^A |f(x, y) − f(x, y0)| dx < (ε/(A − a)) (A − a) = ε, for |y − y0| < δ.
Finally, |I(y) − I(y0)| ≤ ε for |y − y0| < δ.
Consider I(y) = ∫_0^∞ e^{−x²} cos(2yx) dx. Differentiating under the integral sign,
I′(y) = −∫_0^∞ 2x sin(2yx) e^{−x²} dx.
Integrating by parts over [0, A],
−∫_0^A 2x sin(2yx) e^{−x²} dx = [sin(2yx) e^{−x²}]_{x=0}^{A} − ∫_0^A 2y cos(2yx) e^{−x²} dx.
As A → ∞ the first summand on the right tends to 0; thus I(y) satisfies the ordinary differential equation
I′(y) = −2y I(y).
The associated ODE y′ = −2xy separates: dy/y = −2x dx. Integration yields log y = −x² + c, hence y = C e^{−x²}. The general solution is therefore I(y) = C e^{−y²}. We determine the constant C by inserting y = 0. Since I(0) = ∫_0^∞ e^{−x²} dx = √π/2, we find
I(y) = (√π/2) e^{−y²}.
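The closed form can be checked by truncating the improper integral; a sketch (the truncation point and step count are ad-hoc choices; e^{−x²} is negligible beyond x = 8):

```python
import math

def I(y, upper=8.0, n=40000):
    # trapezoidal rule for the truncated improper integral
    f = lambda x: math.exp(-x * x) * math.cos(2 * x * y)
    h = upper / n
    return h * (0.5 * (f(0.0) + f(upper)) + sum(f(i * h) for i in range(1, n)))

def closed_form(y):
    return 0.5 * math.sqrt(math.pi) * math.exp(-y * y)
```

For several values of y the quadrature and the closed form agree to high accuracy.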
(b) The Gamma function Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt is in C^∞(R₊). Let x > 0, say x ∈ [c, d]. Recall from Subsection 5.3.3 the definition and the proof of the convergence of the improper integrals Γ1 = ∫_0^1 f(x, t) dt and Γ2 = ∫_1^∞ f(x, t) dt, where f(x, t) = t^{x−1} e^{−t}. Note that Γ1(x) is an improper integral at t = 0 + 0. By L'Hospital's rule, lim_{t→0+0} t^ε log t = 0 for all ε > 0.
[Figure: graph of the Gamma function.]
Since e^{−t} < 1 and moreover t^{x−1} < t^{c−1} for t < t0, by Lemma 1.23 (b) we conclude that
|∂f/∂x (x, t)| = t^{x−1} |log t| e^{−t} ≤ t^{c−1} |log t| ≤ t^{−c/2} t^{c−1} = 1/t^{1−c/2}
for 0 < t < t0. Since φ(t) = 1/t^{1−c/2} is integrable over [0, 1], Γ1(x) is differentiable by the corollary with Γ1′(x) = ∫_0^1 t^{x−1} log t · e^{−t} dt. Similarly, Γ2(x) is an improper integral over the unbounded interval [1, +∞); for sufficiently large t ≥ t0 > 1 we have log t < t and t^x < t^d, such that the differentiated integrand is again dominated by an integrable function and
Γ2′(x) = ∫_1^∞ t^{x−1} log t · e^{−t} dt.
7.9 Appendix
Proof of Proposition 7.7. Let A = (Aij) = (∂fi/∂xj) be the matrix of partial derivatives considered as a linear map from Rⁿ to Rᵐ. Our aim is to show that
lim_{h→0} ‖f(a + h) − f(a) − A h‖ / ‖h‖ = 0.
For this, it suffices to prove the convergence to 0 for each coordinate i = 1, …, m by Proposition 6.26:
lim_{h→0} |fi(a + h) − fi(a) − Σ_{j=1}^{n} Aij hj| / ‖h‖ = 0.  (7.52)
Without loss of generality we assume m = 1 and f = f1. For simplicity, let n = 2, f = f(x, y), a = (a, b), and h = (h, k). Note first that by the mean value theorem we have
f(a + h, b + k) − f(a, b) = f(a + h, b + k) − f(a, b + k) + f(a, b + k) − f(a, b)
= ∂f/∂x (ξ, b + k) h + ∂f/∂y (a, η) k,
where ξ ∈ (a, a + h) and η ∈ (b, b + k). Using this, the expression in (7.52) reads
|f(a + h, b + k) − f(a, b) − fx(a, b) h − fy(a, b) k| / √(h² + k²)
= |(fx(ξ, b + k) − fx(a, b)) h + (fy(a, η) − fy(a, b)) k| / √(h² + k²).
Since both fx and fy are continuous at (a, b), for every ε > 0 there is δ > 0 such that (x, y) ∈ Uδ((a, b)) implies
|fx(x, y) − fx(a, b)| < ε,  |fy(x, y) − fy(a, b)| < ε.
This shows
|(fx(ξ, b + k) − fx(a, b)) h + (fy(a, η) − fy(a, b)) k| / √(h² + k²) ≤ ε (|h| + |k|) / √(h² + k²) ≤ 2ε,
hence f is differentiable at (a, b) with Jacobi matrix A = (fx(a, b), fy(a, b)).
Since both components of A (the partial derivatives) are continuous functions of (x, y), the assignment x ↦ f′(x) is continuous by Proposition 6.26.
Chapter 8
Curves and Line Integrals
8.1 Rectifiable Curves
8.1.1 Curves in Rᵏ
We consider curves in Rᵏ. We define the tangent vector, regular points, and the angle of intersection.
Definition 8.1 A curve in Rᵏ is a continuous mapping γ : I → Rᵏ, where I ⊆ R is a closed interval consisting of more than one point.
The interval can be I = [a, b], I = [a, +∞), or I = R. In the first case γ(a) and γ(b) are called the initial and end point of γ. These two points define a natural orientation of the curve from γ(a) to γ(b). Replacing γ(t) by γ(a + b − t) we obtain the curve from γ(b) to γ(a) with opposite orientation.
If γ(a) = γ(b), γ is said to be a closed curve. The curve is given by a k-tuple γ = (γ1, …, γk) of continuous real-valued functions. If γ is differentiable, the curve is said to be differentiable.
Note that we have defined the curve to be a mapping, not a set of points in Rᵏ. Of course, with each curve γ in Rᵏ there is associated a subset of Rᵏ, namely the image of γ,
C = γ(I) = {γ(t) ∈ Rᵏ | t ∈ I},
but different curves may have the same image C = γ(I). The curve is said to be simple if γ is injective on the inner points I° of I. A simple curve has no self-intersection.
Example 8.1 (a) A circle in R² of radius r > 0 with center (0, 0) is described by the curve
γ : [0, 2π] → R²,  γ(t) = (r cos t, r sin t).
Note that δ : [0, 4π] → R² with δ(t) = (r cos t, r sin t) has the same image but is different from γ; γ is a simple curve, δ is not.
(b) Let p, q ∈ Rᵏ be fixed points, p ≠ q. Then
γ1(t) = (1 − t) p + t q, t ∈ [0, 1],  and  γ2(t) = (1 − t) p + t q, t ∈ R,
are the segment pq from p to q and the line pq through p and q, respectively. If v ∈ Rᵏ is a vector, then γ3(t) = p + t v, t ∈ R, is the line through p with direction v.
(c) If f : [a, b] → R is a continuous function, the graph of f is a curve in R²:
γ : [a, b] → R²,  γ(t) = (t, f(t)).
If two differentiable curves γ1 and γ2 intersect at a common point γ1(t1) = γ2(t2), the angle of intersection α is given by
cos α = γ1′(t1)·γ2′(t2) / (‖γ1′(t1)‖ ‖γ2′(t2)‖),  α ∈ [0, π].
[Figure: Newton's knot, the image of γ(t) = (t² − 1, t³ − t), t ∈ R.]
Let us compute the angle of self-intersection. Since γ(1) = γ(−1) = (0, 0), the self-intersection angle α satisfies
cos α = (2, 2)·(−2, 2) / 8 = 0,
hence α = π/2.
For a partition P = {a = t0 < t1 < ⋯ < tn = b} of [a, b] put
ℓ(P, γ) = Σ_{i=1}^{n} ‖γ(ti) − γ(ti−1)‖.  (8.1)
The i-th term in this sum is the euclidean distance of the points xi−1 = γ(ti−1) and xi = γ(ti).
[Figure: the inscribed polygon x0, x1, x2, x3, … corresponding to the partition a < t1 < t2 < ⋯.]
Definition 8.3 A curve γ : [a, b] → Rᵏ is said to be rectifiable if the set of non-negative real numbers {ℓ(P, γ) | P is a partition of [a, b]} is bounded. In this case
ℓ(γ) = sup_P ℓ(P, γ),
where the supremum is taken over all partitions P of [a, b], is called the length of γ.
In certain cases, ℓ(γ) is given by a Riemann integral. We shall prove this for continuously differentiable curves, i.e. for curves whose derivative γ′ is continuous.
Proposition 8.1 If γ′ is continuous on [a, b], then γ is rectifiable, and
ℓ(γ) = ∫_a^b ‖γ′(t)‖ dt.
Proof. If a ≤ ti−1 < ti ≤ b, by Theorem 5.28, γ(ti) − γ(ti−1) = ∫_{ti−1}^{ti} γ′(t) dt. Applying Proposition 5.29 we have
‖γ(ti) − γ(ti−1)‖ = ‖∫_{ti−1}^{ti} γ′(t) dt‖ ≤ ∫_{ti−1}^{ti} ‖γ′(t)‖ dt.
Hence
ℓ(P, γ) ≤ ∫_a^b ‖γ′(t)‖ dt
for every partition P of [a, b]. Consequently,
ℓ(γ) ≤ ∫_a^b ‖γ′(t)‖ dt.
To prove the opposite inequality, let ε > 0 be given. Since γ′ is uniformly continuous on [a, b], there exists δ > 0 such that
‖γ′(s) − γ′(t)‖ < ε if |s − t| < δ.
Let P be a partition with mesh less than δ. Then for t ∈ [ti−1, ti],
∫_{ti−1}^{ti} ‖γ′(t)‖ dt ≤ ‖γ′(ti)‖ Δti + ε Δti
= ‖∫_{ti−1}^{ti} (γ′(t) + (γ′(ti) − γ′(t))) dt‖ + ε Δti
≤ ‖∫_{ti−1}^{ti} γ′(t) dt‖ + ‖∫_{ti−1}^{ti} (γ′(ti) − γ′(t)) dt‖ + ε Δti
≤ ‖γ(ti) − γ(ti−1)‖ + 2ε Δti.
Summing over i gives ∫_a^b ‖γ′(t)‖ dt ≤ ℓ(P, γ) + 2ε (b − a) ≤ ℓ(γ) + 2ε (b − a); since ε > 0 was arbitrary, ∫_a^b ‖γ′(t)‖ dt ≤ ℓ(γ).
Special Case k = 2
For k = 2, γ(t) = (x(t), y(t)), t ∈ [a, b]. Then
ℓ(γ) = ∫_a^b √(x′(t)² + y′(t)²) dt.
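The length formula agrees with the supremum of inscribed polygon lengths from Definition 8.3. A sketch comparing the polygonal approximation of a circle of radius r with 2πr (radius and partition size are arbitrary choices):

```python
import math

def curve_length(x, y, a, b, n=100000):
    # length of the inscribed polygon l(P, gamma) for a fine uniform partition
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    pts = [(x(t), y(t)) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

r = 2.0
L = curve_length(lambda t: r * math.cos(t), lambda t: r * math.sin(t),
                 0.0, 2 * math.pi)
```

L approximates the circumference ∫ ‖γ′‖ dt = 2πr.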
Example 8.3 Catenary Curve. Let f(t) = a cosh(t/a), t ∈ [0, b], b > 0. Then f′(t) = sinh(t/a) and therefore
ℓ(γ) = ∫_0^b √(1 + sinh²(t/a)) dt = ∫_0^b cosh(t/a) dt = a sinh(t/a) |_0^b = a sinh(b/a).
[Figure: cycloid together with its defining circle.]
Let the radius of the tire be a. It can be verified by plane trigonometry that
γ(θ) = (a(θ − sin θ), a(1 − cos θ)).
This curve is called a cycloid.
Find the distance travelled by the bulge for 0 ≤ θ ≤ 2π. We have γ′(θ) = (a(1 − cos θ), a sin θ) and
‖γ′(θ)‖ = a √((1 − cos θ)² + sin²θ) = a √2 √(1 − cos θ) = 2a sin(θ/2).
Therefore,
ℓ(γ) = 2a ∫_0^{2π} sin(θ/2) dθ = 2a [−2 cos(θ/2)]_0^{2π} = 4a (−cos π + cos 0) = 8a.
(In the arc length of an ellipse one meets the eccentricity ε = √(a² − b²)/a.)
Consider the curve γ(t) = (t, f(t)), t ∈ [0, 1], where
f(t) = t cos(π/(2t)) for 0 < t ≤ 1,  f(0) = 0.
Since lim_{t→0+0} f(t) = f(0) = 0, f is continuous and γ(t) is a curve. However, this curve is not rectifiable. Indeed, choose the partition Pk = {t0 = 0, 1/(4k), 1/(4k−2), …, 1/4, 1/2, t_{2k+1} = 1}, ti = 1/(4k − 2i + 2), i = 1, …, 2k. Note that t0 = 0 and t_{2k+1} = 1 play a special role and will be omitted in the calculations below. Then cos(π/(2ti)) = (−1, 1, −1, 1, …, −1, 1), i = 1, …, 2k, so that |f(ti) − f(ti−1)| = ti + ti−1. Thus
ℓ(Pk, γ) ≥ Σ_{i=2}^{2k} √((ti − ti−1)² + (f(ti) − f(ti−1))²) ≥ Σ_{i=2}^{2k} |f(ti) − f(ti−1)|
= (1/(4k−2) + 1/(4k)) + (1/(4k−4) + 1/(4k−2)) + ⋯ + (1/2 + 1/4)
= 1/(4k) + 2 (1/(4k−2) + ⋯ + 1/4) + 1/2,
which is unbounded as k → ∞ since the harmonic series is unbounded. Hence γ is not rectifiable.
For the kinetic energy K(t) = (m/2) ‖~v(t)‖² of a point mass m moving along the path ~x(t) with velocity ~v = ~x′ we have
K′(t) = (m/2) (~v′·~v + ~v·~v′) = m ~a·~v = F·~v.
The total change of the kinetic energy from time t1 to t2, denoted W, is called the work done by the force F along the path ~x(t):
W = ∫_{t1}^{t2} K′(t) dt = ∫_{t1}^{t2} F·~v dt = ∫_{t1}^{t2} F(t)·~x′(t) dt.
Let us now suppose that the force F at time t depends only on the position ~x(t). That is, we assume that there is a vector field F~(~x) such that F(t) = F~(~x(t)) (gravitational and electrostatic attraction are position-dependent while magnetic forces are velocity-dependent). Then we may rewrite the above integral as
W = ∫_{t1}^{t2} F~(~x(t))·~x′(t) dt.
Definition 8.4 Let Γ = {~x(t) | t ∈ [r, s]} be a continuously differentiable curve, ~x ∈ C¹([r, s]), in Rⁿ and f~ : Γ → Rⁿ a continuous vector field on Γ. The integral
∫_Γ f~(~x) d~x = ∫_r^s f~(~x(t))·~x′(t) dt
is called the line integral of the vector field f~ along the curve Γ.
Remark 8.2 (a) The definition of the line integral does not depend on the parametrization of Γ.
(b) If we take different curves between the same endpoints, the line integral may be different.
(c) If the vector field f~ is orthogonal to the tangent vector, then ∫_Γ f~ d~x = 0.
(d) Other notations. If f~ = (P, Q) is a vector field in R²,
∫_Γ f~ d~x = ∫_Γ P dx + Q dy,
where the right side is either a symbol or ∫_Γ P dx = ∫_Γ (P, 0) d~x.
Example 8.4 (a) Find the line integral ∫_{Γi} y dx + (x − y) dy, i = 1, 2, where Γ1(t) = (t, t²), t ∈ [0, 1], is the arc of the parabola from (0, 0) to (1, 1), and Γ2 = Γ3 ∪ Γ4 is the polygonal path from (0, 0) via (1, 0) to (1, 1). In the first case
∫_{Γ1} y dx + (x − y) dy = ∫_0^1 (t² · 1 + (t − t²) · 2t) dt = ∫_0^1 (3t² − 2t³) dt = 1/2.
In the second case ∫_{Γ2} f~ d~x = ∫_{Γ3} f~ d~x + ∫_{Γ4} f~ d~x. For the first part (dx, dy) = (dt, 0), for the second part (dx, dy) = (0, dt), such that
∫_{Γ2} y dx + (x − y) dy = ∫_0^1 (0 · 1 + (t − 0) · 0) dt + ∫_0^1 (1 · 0 + (1 − t)) dt = [t − t²/2]_0^1 = 1/2.
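Both computations can be repeated numerically. A sketch (trapezoidal rule along the two parametrizations above):

```python
def line_integral(P, Q, x, y, dx, dy, a, b, n=20000):
    # evaluate the line integral of P dx + Q dy along (x(t), y(t))
    h = (b - a) / n
    g = lambda t: P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t)
    return h * (0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n)))

P = lambda x, y: y
Q = lambda x, y: x - y

# parabola (t, t^2) from (0,0) to (1,1)
I1 = line_integral(P, Q, lambda t: t, lambda t: t * t,
                   lambda t: 1.0, lambda t: 2 * t, 0.0, 1.0)
# polygonal path (0,0) -> (1,0) -> (1,1)
I2 = line_integral(P, Q, lambda t: t, lambda t: 0.0,
                   lambda t: 1.0, lambda t: 0.0, 0.0, 1.0) \
   + line_integral(P, Q, lambda t: 1.0, lambda t: t,
                   lambda t: 0.0, lambda t: 1.0, 0.0, 1.0)
```

Both I1 and I2 come out close to 1/2, matching the hand computation.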
(b) Find the work done by the force field F~(x, y, z) = (−y, x, 1) as a particle moves from (1, 0, 0) to (1, 0, 1) along the following paths (σ = ±1):
~x(t) = (cos σt, sin σt, t/(2π)),  t ∈ [0, 2π].
We find
W = ∫_Γ F~ d~x = ∫_0^{2π} (σ sin²σt + σ cos²σt + 1/(2π)) dt = 2πσ + 1.
In case σ = 1 the motion is with the force, so the work is positive; for the path with σ = −1 the motion is against the force and the work is negative.
We can also define a scalar line integral in the following way. Let γ : [a, b] → Rⁿ be a continuously differentiable curve, Γ = γ([a, b]), and f : Γ → R a continuous function. The integral
∫_Γ f(x) ds := ∫_a^b f(γ(t)) ‖γ′(t)‖ dt
is called the scalar line integral of f along γ.
Remark 8.3 (a) For any equivalent (orientation-preserving) parametrization Γ̃ of Γ we have ∫_{Γ̃} f~ d~x = ∫_Γ f~ d~x.
(b) Change of orientation. If ~x(t), t ∈ [r, s], defines a curve Γ which goes from a = ~x(r) to b = ~x(s), then ~y(t) = ~x(r + s − t), t ∈ [r, s], defines the curve −Γ which goes in the opposite direction from b to a. It is easy to see that
∫_{−Γ} f~ d~x = −∫_Γ f~ d~x.
(c) Standard estimate:
|∫_Γ f~ d~x| ≤ ℓ(Γ) sup_{x ∈ Γ} ‖f~(x)‖.
Indeed, by the triangle inequality and the Cauchy-Schwarz inequality,
|∫_Γ f~ d~x| = |∫_{t0}^{t1} f~(~x(t))·~x′(t) dt| ≤ ∫_{t0}^{t1} ‖f~(~x(t))‖ ‖~x′(t)‖ dt
≤ sup_{~x ∈ Γ} ‖f~(~x)‖ ∫_{t0}^{t1} ‖~x′(t)‖ dt = sup_{~x ∈ Γ} ‖f~(~x)‖ ℓ(Γ).
(d) Splitting. If Γ1 and Γ2 are two curves such that the ending point of Γ1 equals the starting point of Γ2, then
∫_{Γ1 ∪ Γ2} f~ d~x = ∫_{Γ1} f~ d~x + ∫_{Γ2} f~ d~x.
Definition 8.6 A vector field f~ : G → Rⁿ is called a potential field or gradient vector field if there exists a continuously differentiable function U : G → R such that f~(x) = grad U(x) for x ∈ G. We call U the potential or antiderivative of f~.
Example 8.5 The gravitational force is given by
F~(x) = −γ x / ‖x‖³,
where γ = mM. It is a potential field with potential
U(x) = γ / ‖x‖.
This follows from Example 7.2 (a), grad f(‖x‖) = f′(‖x‖) x/‖x‖, with f(y) = 1/y and f′(y) = −1/y².
Remark 8.4 (a) A vector field f~ is conservative if and only if the line integral over any closed curve in G is 0. Indeed, suppose that f~ is conservative and Γ = Γ1 ∪ Γ2 is a closed curve, where Γ1 is a curve from a to b and Γ2 is a curve from b to a. By Remark 8.3 (b), changing the orientation of Γ2, the sign of the line integral changes and −Γ2 is again a curve from a to b:
∫_Γ f~ d~x = ∫_{Γ1} f~ d~x + ∫_{Γ2} f~ d~x = ∫_{Γ1} f~ d~x − ∫_{−Γ2} f~ d~x = 0.
[Figure: a region G with two different curves joining the points x and y.]
Inserting the above expression and applying the fundamental theorem of calculus, we find
∫_Γ f~ d~x = ∫_r^s φ′(t) dt = φ(s) − φ(r) = U(~x(s)) − U(~x(r)) = U(b) − U(a),
where φ(t) = U(~x(t)).
(ii) Choose h ∈ Rⁿ small such that x + th ∈ G for all t ∈ [0, 1]. By the path independence of the line integral,
U(x + h) − U(x) = ∫_a^{x+h} f~ d~y − ∫_a^{x} f~ d~y = ∫_x^{x+h} f~ d~y.
Consider the curve ~x(t) = x + th, t ∈ [0, 1], from x to x + h. Then ~x′(t) = h. By the mean value theorem of integration (Theorem 5.18 with φ = 1, a = 0 and b = 1) we have
∫_x^{x+h} f~ d~y = ∫_0^1 f~(~x(t))·h dt = f~(x + θh)·h,
where θ ∈ [0, 1]. We check grad U(x) = f~(x) using the definition of the derivative:
|U(x + h) − U(x) − f~(x)·h| / ‖h‖ = |(f~(x + θh) − f~(x))·h| / ‖h‖
≤ ‖f~(x + θh) − f~(x)‖ ‖h‖ / ‖h‖ = ‖f~(x + θh) − f~(x)‖ → 0 as h → 0,
by the Cauchy-Schwarz inequality and the continuity of f~. Hence U is differentiable with grad U = f~.
Remark 8.5 (a) In case n = 2, a simple path to compute the line integral (and so the potential
U) in (ii) consists of 2 segments: from (0, 0) via (x, 0) to (x, y). The line integral of P dx+Q dy
then reads as ordinary Riemann integrals
U(x, y) = ∫_0^x P(t, 0) dt + ∫_0^y Q(x, t) dt.
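This construction of U can be illustrated with a hypothetical exact field f~ = (P, Q) = (2xy, x²), not taken from the text; it satisfies Py = Qx = 2x, and the gradient of the reconstructed U should reproduce the field:

```python
# hypothetical exact field (P, Q) = (2xy, x^2) with P_y = Q_x
P = lambda x, y: 2 * x * y
Q = lambda x, y: x * x

def U(x, y, n=10000):
    # U(x, y) = integral of P(t, 0) over [0, x] + integral of Q(x, t) over [0, y]
    h1, h2 = x / n, y / n
    s1 = 0.5 * (P(0, 0) + P(x, 0)) + sum(P(i * h1, 0) for i in range(1, n))
    s2 = 0.5 * (Q(x, 0) + Q(x, y)) + sum(Q(x, i * h2) for i in range(1, n))
    return s1 * h1 + s2 * h2

x, y, h = 1.2, -0.7, 1e-5
Ux = (U(x + h, y) - U(x - h, y)) / (2 * h)
Uy = (U(x, y + h) - U(x, y - h)) / (2 * h)
```

Here (Ux, Uy) approximates (P(x, y), Q(x, y)); for this field U(x, y) = x²y.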
(b) Case n = 3. You can also use just one single segment from the origin to the endpoint (x, y, z). This path is parametrized by the curve
~x(t) = (tx, ty, tz), t ∈ [0, 1], with ~x′(t) = (x, y, z).  (8.2)
We obtain
U(x, y, z) = ∫_{(0,0,0)}^{(x,y,z)} f1 dx + f2 dy + f3 dz
= ∫_0^1 ( x f1(tx, ty, tz) + y f2(tx, ty, tz) + z f3(tx, ty, tz) ) dt.  (8.3)
(c) Although Theorem 8.2 gives a necessary and sufficient condition for a vector field to be conservative, we are missing an easy criterion.
Recall from Example 7.4 that a necessary condition for f~ = (f1, …, fn) to be a potential vector field is
∂fi/∂xj = ∂fj/∂xi,  1 ≤ i < j ≤ n,
which is a simple consequence of Schwarz's lemma: if fi = Uxi then
∂fi/∂xj = Uxi xj = Uxj xi = ∂fj/∂xi.
The condition ∂fi/∂xj = ∂fj/∂xi, 1 ≤ i < j ≤ n, is called the integrability condition for f~. It is a necessary but, in general, not sufficient condition, as the following example shows.
The vector field f~(x, y) = (−y/(x² + y²), x/(x² + y²)) on G = R² \ {(0, 0)} satisfies the integrability condition Py = Qx. However, it is not conservative. For, consider the unit circle γ(t) = (cos t, sin t), t ∈ [0, 2π]. Then γ′(t) = (−sin t, cos t) and
∫_γ f~ d~x = ∫_γ (−y dx)/(x² + y²) + (x dy)/(x² + y²) = ∫_0^{2π} (sin t sin t + cos t cos t) dt = ∫_0^{2π} dt = 2π.
This contradicts ∫_Γ f~ d~x = 0 for conservative vector fields. Hence, f~ is not conservative.
f~ fails to be conservative since G = R² \ {(0, 0)} has a hole.
For more details, see homework 30.1.
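A quick numerical check of this computation (Riemann sum along the parametrization of the unit circle):

```python
import math

def loop_integral(n=100000):
    # the field (-y, x)/(x^2 + y^2) integrated over the unit circle
    s = 0.0
    for i in range(n):
        t = 2 * math.pi * i / n
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)     # gamma'(t)
        r2 = x * x + y * y
        s += (-y / r2) * dx + (x / r2) * dy
    return s * (2 * math.pi / n)

val = loop_integral()
```

The result is close to 2π rather than 0, confirming that the field is not conservative.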
The next proposition shows that under one additional assumption this criterion is also sufficient. A connected open subset G (a region) of Rⁿ is called simply connected if every closed polygonal path inside G can be shrunk inside G to a single point.
Roughly speaking, simply connected sets do not have holes.
convex subset of Rⁿ: simply connected
1-torus S¹ = {z ∈ C | |z| = 1}: not simply connected
annulus {(x, y) ∈ R² | r² < x² + y² < R²}, 0 ≤ r < R: not simply connected
R² \ {(0, 0)}: not simply connected
R³ \ {(0, 0, 0)}: simply connected
In the case n = 3 the integrability conditions read
∂f3/∂x2 − ∂f2/∂x3 = 0,  ∂f1/∂x3 − ∂f3/∂x1 = 0,  ∂f2/∂x1 − ∂f1/∂x2 = 0.
(a) The necessity follows from Schwarz's Lemma: if f~ = grad U, then for instance
∂f3/∂x2 − ∂f2/∂x3 = ∂²U/∂x2∂x3 − ∂²U/∂x3∂x2 = 0.
(b) This will be an application of Stokes' theorem, see below.
Uz = Cz = 1.
= 3x2 y 2 + ex + z.
Chapter 9
Integration of Functions of Several
Variables
References for this chapter are [ON75, Section 4], which is quite elementary and easily accessible. Another elementary approach is [MW85, Chapter 17] (part III). A more advanced but still accessible treatment is [Spi65, Chapter 3]; this will be our main reference here. Rudin's book [Rud76] is not recommendable for an introduction to integration.
where the sum is taken over all subrectangles S of the partition P. Clearly, if f is bounded with m ≤ f(x) ≤ M for x in the rectangle R, then
m v(R) ≤ L(P, f) ≤ U(P, f) ≤ M v(R),
so that the numbers L(P, f) and U(P, f) form bounded sets. Lemma 5.1 remains true; the proof is completely the same.
Lemma 9.1 (a) Suppose the partition P* is a refinement of P (that is, each subrectangle of P* is contained in a subrectangle of P). Then
L(P, f) ≤ L(P*, f) and U(P*, f) ≤ U(P, f).
(b) If P and P′ are any two partitions, then L(P, f) ≤ U(P′, f).
It follows from the above lemma that all lower sums are bounded above by any upper sum and vice versa.
Definition 9.1 Let f : A → R be a bounded function. The function f is called Riemann integrable on the rectangle A if
sup_P L(P, f) = inf_P U(P, f),
where the supremum and the infimum are taken over all partitions P of A. This common number is the Riemann integral of f on A and is denoted by
∫_A f dx  or  ∫_A f(x1, …, xn) dx1⋯dxn.
The numbers sup_P L(P, f) and inf_P U(P, f) are called the lower and the upper integral of f on A, respectively. They always exist. The set of integrable functions on A is denoted by R(A).
As in the one-dimensional case we have the following criterion.
Example 9.1 (a) Let f : A → R be a constant function f(x) = c. Then for any partition P and any subrectangle S we have mS = MS = c, so that
L(P, f) = U(P, f) = Σ_S c v(S) = c Σ_S v(S) = c v(A).
Hence ∫_A c dx = c v(A).
(b) Let f : [0, 1] × [0, 1] → R be defined by
f(x, y) = 0 if x is rational, and f(x, y) = 1 if x is irrational.
If P is a partition, then every subrectangle S will contain points (x, y) with x rational, and also points (x, y) with x irrational. Hence mS = 0 and MS = 1, so
L(P, f) = Σ_S 0 · v(S) = 0  and  U(P, f) = Σ_S 1 · v(S) = v(A) = 1.
Therefore, the upper integral is 1 and the lower integral is 0; f is not integrable.
Moreover, if f ∈ R(A) then |f| ∈ R(A) with |∫_A f dx| ≤ ∫_A |f| dx.
(f) If f ∈ R(A), f(A) ⊆ [a, b], and g ∈ C[a, b], then g ∘ f ∈ R(A).
(g) If f ∈ R(A) and f = g except at finitely many points, then g ∈ R(A) and ∫_A f dx = ∫_A g dx.
Remark 9.2 (a) Any finite set {a1, …, am} ⊂ Rⁿ is of measure 0. Indeed, let ε > 0 and choose Ui to be a rectangle with midpoint ai and volume ε/m. Then {Ui | i = 1, …, m} covers the set and Σ_i v(Ui) ≤ ε.
(b) Any countable set is of measure 0.
(d) If each Ai, i ∈ N, has measure 0 then A = A1 ∪ A2 ∪ ⋯ has measure 0.
Proof. Let ε > 0. Since Ai has measure 0 there exist closed rectangles Uik, i ∈ N, k ∈ N, such that for fixed i the family {Uik | k ∈ N} covers Ai, i.e. ∪_{k∈N} Uik ⊇ Ai, and Σ_{k∈N} v(Uik) ≤ ε/2^{i−1}, i ∈ N. In this way we have constructed an infinite array {Uik} which covers A. Arranging those sets in a sequence (cf. Cantor's first diagonal process), we obtain a sequence of rectangles which covers A with
Σ_{i,k=1}^{∞} v(Uik) ≤ Σ_{i=1}^{∞} ε/2^{i−1} = 2ε.
(e) Let A = [a1, b1] × ⋯ × [an, bn] be a non-singular rectangle, that is ai < bi for all i = 1, …, n. Then A is not of measure 0. Indeed, we use the following two facts about the volume of finite unions of rectangles:
(i) v(U1 ∪ ⋯ ∪ Um) ≤ Σ_{i=1}^{m} v(Ui);
(ii) U ⊆ V implies v(U) ≤ v(V).
Now let ε = v(A)/2 = (b1 − a1)⋯(bn − an)/2 and suppose that open rectangles (Ui)_{i∈N} with Σ_i v(Ui) ≤ ε cover the compact set A. Then there exists a finite subcover U1 ∪ ⋯ ∪ Um ⊇ A. This and (i), (ii) imply
ε < v(A) ≤ v(∪_{i=1}^{m} Ui) ≤ Σ_{i=1}^{m} v(Ui).
This contradicts Σ_{i=1}^{∞} v(Ui) ≤ ε.
Let C ⊂ Rⁿ be a bounded set, and let χC denote its characteristic function: χC(x) = 1 for x ∈ C and χC(x) = 0 otherwise.
Definition 9.3 Let f : C → R be bounded and A a rectangle with C ⊆ A. We call f Riemann integrable on C if the product function f χC : A → R is Riemann integrable on A. In this case we define
∫_C f dx = ∫_A f χC dx.
[Figure: a bounded set C contained in a rectangle A.]
By the above discussion, A is the disjoint union of two open sets and a closed set:
A = C° ∪ ∂C ∪ (C^c)°.
Theorem 9.4 The characteristic function χC : A → R is integrable if and only if the boundary of C has measure 0.
Proof. Since the boundary ∂C is closed and contained in the bounded set A, ∂C is compact. Suppose first x is an inner point of C. Then there is an open set U ⊆ C containing x. Thus χC = 1 on U; clearly χC is continuous at x (since it is locally constant). Similarly, if x is an inner point of C^c, χC is locally constant, namely χC = 0 in a neighborhood of x. Hence χC is continuous at every point x ∉ ∂C.
Definition 9.4 A bounded set C is called Jordan measurable or simply a Jordan set if its boundary has measure 0. The integral v(C) = ∫_C 1 dx is called the n-dimensional Jordan measure of C or the n-dimensional volume of C; sometimes we write μ(C) in place of v(C).
Naturally, the one-dimensional volume is the length, and the two-dimensional volume is the area.
Typical Examples of Jordan Measurable Sets
Hyperplanes Σ_{i=1}^{n} ai xi = c and, more generally, hypersurfaces f(x1, …, xn) = c, f ∈ C¹(G), are sets of measure 0 in Rⁿ. Curves in Rⁿ have measure 0. Graphs of continuous functions, Γf = {(x, f(x)) ∈ R^{n+1} | x ∈ G}, are of measure 0 in R^{n+1}. If G is a bounded region in Rⁿ, the boundary ∂G has measure 0. If G ⊂ Rⁿ is a region, the cylinder over its boundary, ∂G × R = {(x, x_{n+1}) | x ∈ ∂G} ⊂ R^{n+1}, is a measure 0 set.
Let D ⊂ R^{n+1} be given by
D = {(x, x_{n+1}) | x ∈ K, 0 ≤ x_{n+1} ≤ f(x)}.
[Figure: the region D between the graph of f and the compact set K.]
For x ∈ A let gx(y) = f(x, y), and denote by L(x) and U(x) the lower and upper integrals of gx over B,
L(x) = lower ∫_B gx dy = lower ∫_B f(x, y) dy,  U(x) = upper ∫_B gx dy = upper ∫_B f(x, y) dy.
Fubini's theorem states that L and U are integrable on A with
∫_{A×B} f dxdy = ∫_A L(x) dx = ∫_A ( lower ∫_B f(x, y) dy ) dx = ∫_A U(x) dx = ∫_A ( upper ∫_B f(x, y) dy ) dx.
(d) If C ⊆ A × B, Fubini's theorem can be used to compute ∫_C f dx, since this is by definition ∫_{A×B} f χC dx. Here are two examples in case n = 2 and n = 3.
Let a < b and let φ(x) and ψ(x) be continuous real-valued functions on [a, b] with φ(x) < ψ(x) on [a, b]. Put
C = {(x, y) ∈ R² | a ≤ x ≤ b, φ(x) ≤ y ≤ ψ(x)}.
Then
∫∫_C f dxdy = ∫_a^b ( ∫_{φ(x)}^{ψ(x)} f(x, y) dy ) dx.
Let
G = {(x, y, z) ∈ R³ | a ≤ x ≤ b, φ(x) ≤ y ≤ ψ(x), α(x, y) ≤ z ≤ β(x, y)},
where all functions are sufficiently nice. Then
∫∫∫_G f(x, y, z) dxdydz = ∫_a^b ( ∫_{φ(x)}^{ψ(x)} ( ∫_{α(x,y)}^{β(x,y)} f(x, y, z) dz ) dy ) dx.
(e) Cavalieri's Principle. Let A and B be Jordan sets in R³ and let Ac = {(x, y) | (x, y, c) ∈ A} be the section of A with the plane z = c; Bc is defined similarly. Suppose each Ac and Bc is Jordan measurable (in R²) and they have the same area, v(Ac) = v(Bc), for all c ∈ R.
Then A and B have the same volume, v(A) = v(B).
Let $C$ be the region between the parabola $y = x^2$ and the line $y = x$,
$$C = \{(x,y) \in \mathbb{R}^2 \mid 0 \le y \le 1,\ y \le x \le \sqrt{y}\} = \{(x,y) \in \mathbb{R}^2 \mid 0 \le x \le 1,\ x^2 \le y \le x\}.$$
Then
$$\iint_C xy\,dx\,dy = \int_0^1 \left( \int_{x^2}^{x} xy\,dy \right) dx = \int_0^1 \frac{1}{2}\,x y^2 \Big|_{y=x^2}^{x}\,dx = \frac{1}{2}\int_0^1 (x^3 - x^5)\,dx = \left( \frac{x^4}{8} - \frac{x^6}{12} \right)\Big|_0^1 = \frac{1}{8} - \frac{1}{12} = \frac{1}{24}.$$
Integrating in the other order gives the same value:
$$\iint_C xy\,dx\,dy = \int_0^1 \left( \int_{y}^{\sqrt{y}} xy\,dx \right) dy = \int_0^1 \frac{1}{2}\,x^2 y \Big|_{x=y}^{\sqrt{y}}\,dy = \frac{1}{2}\int_0^1 (y^2 - y^3)\,dy = \left( \frac{y^3}{6} - \frac{y^4}{8} \right)\Big|_0^1 = \frac{1}{6} - \frac{1}{8} = \frac{1}{24}.$$
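A quick numerical cross-check of the value 1/24 (a midpoint Riemann sum; the grid size is an arbitrary choice for illustration):

```python
# Midpoint Riemann sum of f(x, y) = x*y over C = {0 <= x <= 1, x**2 <= y <= x}.
n = 400
hx = 1.0 / n
total = 0.0
for i in range(n):
    x = (i + 0.5) * hx
    lo, hi = x * x, x
    hy = (hi - lo) / n
    for j in range(n):
        y = lo + (j + 0.5) * hy
        total += x * y * hx * hy

print(total)  # close to 1/24 ~ 0.041667
```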
Next we compute a triple integral over the simplex $G = \{(x,y,z) \mid x,y,z \ge 0,\ x+y+z \le 1\}$:
$$\iiint_G \frac{dx\,dy\,dz}{(1+x+y+z)^3} = \int_0^1 \left( \int_0^{1-x} \left( \int_0^{1-x-y} \frac{dz}{(1+x+y+z)^3} \right) dy \right) dx$$
$$= \int_0^1 \left( \int_0^{1-x} -\frac{1}{2}\,\frac{1}{(1+x+y+z)^2}\Big|_{z=0}^{1-x-y}\,dy \right) dx = \int_0^1 \left( \int_0^{1-x} \frac{1}{2}\left( \frac{1}{(1+x+y)^2} - \frac{1}{4} \right) dy \right) dx$$
$$= \frac{1}{2}\int_0^1 \left( \frac{1}{x+1} + \frac{x-3}{4} \right) dx = \frac{1}{2}\left( \log 2 - \frac{5}{8} \right) = \frac{\log 2}{2} - \frac{5}{16}.$$
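A numerical sanity check of the result $\frac{\log 2}{2} - \frac{5}{16} \approx 0.0341$ (midpoint Riemann sum over the simplex; the grid size is an arbitrary choice):

```python
import math

# Midpoint Riemann sum of 1/(1+x+y+z)**3 over the simplex x, y, z >= 0, x + y + z <= 1.
n = 60
h = 1.0 / n
total = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if x + y >= 1.0:
            break
        for k in range(n):
            z = (k + 0.5) * h
            if x + y + z >= 1.0:
                break
            total += h ** 3 / (1.0 + x + y + z) ** 3

exact = math.log(2.0) / 2 - 5.0 / 16
print(total, exact)
```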
Another example: let $D = \{(x,y) \mid 1 \le x \le 2,\ x \le y \le 2x\}$ and $f(x,y) = e^{y/x}$. Integrating first over $y$,
$$\iint_D e^{y/x}\,dx\,dy = \int_1^2 \left( \int_x^{2x} e^{y/x}\,dy \right) dx = \int_1^2 x\,e^{y/x}\Big|_{y=x}^{2x}\,dx = \int_1^2 (e^2 x - e x)\,dx = \frac{3}{2}(e^2 - e).$$
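The value $\frac{3}{2}(e^2-e)$ can be confirmed numerically (midpoint Riemann sum; the grid size is an arbitrary choice):

```python
import math

# Midpoint Riemann sum of exp(y/x) over D = {1 <= x <= 2, x <= y <= 2*x}.
n = 300
hx = 1.0 / n
total = 0.0
for i in range(n):
    x = 1.0 + (i + 0.5) * hx
    hy = x / n                      # the inner interval [x, 2x] has length x
    for j in range(n):
        y = x + (j + 0.5) * hy
        total += math.exp(y / x) * hx * hy

exact = 1.5 * (math.e ** 2 - math.e)
print(total, exact)
```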
But trying to reverse the order of integration, we encounter two problems. First, we must break D into several regions:
$$\iint_D f\,dx\,dy = \int_1^2 dy \int_1^{y} e^{y/x}\,dx + \int_2^4 dy \int_{y/2}^{2} e^{y/x}\,dx.$$
This is not a serious problem. A greater problem is that $e^{y/x}$ has no elementary antiderivative with respect to $x$, so $\int_1^y e^{y/x}\,dx$ and $\int_{y/2}^2 e^{y/x}\,dx$ are very difficult to evaluate. In this example, there is a considerable advantage of one order of integration over the other.
Solids of revolution. Let $f$ be continuous and nonnegative on $[a,b]$ and let
$$G = \{(x,y,z) \mid a \le x \le b,\ y^2 + z^2 \le f(x)^2\}$$
be the solid obtained by revolving the region under the graph of $f$ around the x-axis. By Fubini's theorem,
$$\iiint_G dx\,dy\,dz = \int_a^b \left( \iint_{G_x} dy\,dz \right) dx, \qquad (9.1)$$
where $G_x = \{(y,z) \in \mathbb{R}^2 \mid y^2 + z^2 \le f(x)^2\}$ is the closed disc of radius $f(x)$ around $(0,0)$. For any fixed $x \in [a,b]$ its area is $v(G_x) = \iint_{G_x} dy\,dz = \pi f(x)^2$. Hence
$$v(G) = \pi \int_a^b f(x)^2\,dx. \qquad (9.2)$$
Example 9.3 We compute the volume of the ellipsoid obtained by revolving the graph of the ellipse
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
around the x-axis. We have $y^2 = f(x)^2 = b^2\left(1 - \frac{x^2}{a^2}\right)$; hence
$$v(G) = \pi b^2 \int_{-a}^{a} \left( 1 - \frac{x^2}{a^2} \right) dx = \pi b^2 \left( x - \frac{x^3}{3a^2} \right)\Big|_{-a}^{a} = \pi b^2 \left( 2a - \frac{2a^3}{3a^2} \right) = \frac{4}{3}\pi a b^2.$$
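The disc-method formula (9.2) and the value $\frac{4}{3}\pi a b^2$ can be checked numerically (the semi-axes a, b are arbitrary sample values):

```python
import math

# Disc method: v(G) = pi * int_{-a}^{a} f(x)**2 dx with f(x)**2 = b**2 * (1 - x**2/a**2).
a, b = 2.0, 0.7
n = 100000
h = 2.0 * a / n
v = 0.0
for i in range(n):
    x = -a + (i + 0.5) * h
    v += math.pi * b ** 2 * (1.0 - x ** 2 / a ** 2) * h

exact = 4.0 / 3.0 * math.pi * a * b ** 2
print(v, exact)
```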
Recall the one-dimensional substitution rule: if $g$ is continuously differentiable, then
$$\int_{g(a)}^{g(b)} f(x)\,dx = \int_a^b f(g(t))\,g'(t)\,dt.$$
We want to generalize this to higher dimensions.
If $B_R$ is the ball in $\mathbb{R}^3$ of radius R around the origin, we have in cartesian coordinates
$$\iiint_{B_R} f\,dx\,dy\,dz = \int_{-R}^{R} dx \int_{-\sqrt{R^2-x^2}}^{\sqrt{R^2-x^2}} dy \int_{-\sqrt{R^2-x^2-y^2}}^{\sqrt{R^2-x^2-y^2}} f\,dz.$$
Usually, the complicated limits yield hard computations. Here spherical coordinates are appropriate.
To motivate the formula, consider the area of a parallelogram D in the x-y-plane spanned by the two vectors $a = (a_1, a_2)$ and $b = (b_1, b_2)$:
$$D = \{\xi a + \eta b \mid \xi, \eta \in [0,1]\} = \left\{ \begin{pmatrix} g_1(\xi,\eta) \\ g_2(\xi,\eta) \end{pmatrix} \,\middle|\, \xi, \eta \in [0,1] \right\},$$
where $g_1(\xi,\eta) = \xi a_1 + \eta b_1$ and $g_2(\xi,\eta) = \xi a_2 + \eta b_2$. As known from linear algebra, the area of D equals the norm of the vector product
$$v(D) = \|a \times b\| = \left\| \det \begin{pmatrix} e_1 & e_2 & e_3 \\ a_1 & a_2 & 0 \\ b_1 & b_2 & 0 \end{pmatrix} \right\| = \|(0,0,a_1 b_2 - a_2 b_1)\| = |a_1 b_2 - a_2 b_1| =: d.$$
Under the linear map
$$x = \xi a_1 + \eta b_1, \qquad y = \xi a_2 + \eta b_2,$$
the parallelogram D in the x-y-plane corresponds to the unit square $C = [0,1] \times [0,1]$ in the ξ-η-plane, and $D = g(C)$. We want to compare the area d of D with the area 1 of C. Note that d is exactly the absolute value of the Jacobian $\frac{\partial(g_1,g_2)}{\partial(\xi,\eta)}$; indeed
$$\frac{\partial(g_1,g_2)}{\partial(\xi,\eta)} = \det \begin{pmatrix} \frac{\partial g_1}{\partial \xi} & \frac{\partial g_2}{\partial \xi} \\[2pt] \frac{\partial g_1}{\partial \eta} & \frac{\partial g_2}{\partial \eta} \end{pmatrix} = \det \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix} = a_1 b_2 - a_2 b_1.$$
Hence,
$$\iint_D dx\,dy = \iint_C \left| \frac{\partial(g_1,g_2)}{\partial(\xi,\eta)} \right| d\xi\,d\eta.$$
Theorem 9.6 (Change of variable) Let C and D be compact Jordan sets in $\mathbb{R}^n$ and let $M \subset C$ be a set of measure 0. Let $g \colon C \to D$ be continuously differentiable with the following properties:

(i) g is injective on $C \setminus M$;

(ii) $g'(x)$ is regular on $C \setminus M$.

Let $f \colon D \to \mathbb{R}$ be continuous. Then
$$\int_D f(y)\,dy = \int_C f(g(x)) \left| \frac{\partial(g_1,\dots,g_n)}{\partial(x_1,\dots,x_n)}(x) \right| dx. \qquad (9.3)$$
Remark 9.4 Why the absolute value of the Jacobian? In $\mathbb{R}^1$ we don't have the absolute value. But in contrast to $\mathbb{R}^n$, $n > 1$, in $\mathbb{R}^1$ we have an orientation of the integration set: $\int_a^b f\,dx = -\int_b^a f\,dx$.
For the proof see [Rud76, 10.9 Theorem]. The main steps of the proof are: 1) In a small open set, g can be written as the composition of finitely many flips and primitive mappings. A flip interchanges two variables $x_i$ and $x_k$, whereas a primitive mapping H is equal to the identity except for one variable, $H(x) = x + (h(x) - x_m)e_m$, where $h \colon U \to \mathbb{R}$.
2) If the statement is true for transformations S and T, then it is true for the composition $S \circ T$, which follows from $\det(AB) = \det A \det B$.
3) Use a partition of unity.
Example 9.4 (a) Polar coordinates. Let $A = \{(r,\varphi) \mid 0 \le r \le R,\ 0 \le \varphi < 2\pi\}$ be a rectangle in polar coordinates. The mapping $g(r,\varphi) = (x,y)$, $x = r\cos\varphi$, $y = r\sin\varphi$, maps this rectangle continuously differentiably onto the disc D of radius R. Let $M = \{(r,\varphi) \mid r = 0\}$. Since $\frac{\partial(x,y)}{\partial(r,\varphi)} = r$, the map g is bijective and regular on $A \setminus M$. The assumptions of the theorem are satisfied and we have, also using Fubini's theorem,
$$\iint_D f(x,y)\,dx\,dy = \iint_A f(r\cos\varphi, r\sin\varphi)\,r\,dr\,d\varphi = \int_0^R \int_0^{2\pi} f(r\cos\varphi, r\sin\varphi)\,r\,d\varphi\,dr.$$
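As an illustration of the polar-coordinates formula, the integral of $x^2+y^2$ over the disc of radius R is $\int_0^{2\pi}\int_0^R r^2\cdot r\,dr\,d\varphi = \pi R^4/2$; a numerical check (R is an arbitrary sample value):

```python
import math

# Integral of x**2 + y**2 over the disc of radius R, done in polar coordinates:
# int_0^{2pi} int_0^R r**2 * r dr dphi = pi * R**4 / 2.
R = 1.5
n = 20000
hr = R / n
radial = 0.0
for i in range(n):
    r = (i + 0.5) * hr
    radial += r ** 3 * hr           # integrand r**2 times the Jacobian r
total = 2.0 * math.pi * radial      # the phi-integration contributes the factor 2*pi
exact = math.pi * R ** 4 / 2
print(total, exact)
```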
(b) Spherical coordinates. Recall from the exercise class the spherical coordinates $r \in [0,\infty)$, $\varphi \in [0,2\pi]$, and $\theta \in [0,\pi]$:
$$x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta.$$
The Jacobian reads
$$\frac{\partial(x,y,z)}{\partial(r,\theta,\varphi)} = \begin{vmatrix} x_r & x_\theta & x_\varphi \\ y_r & y_\theta & y_\varphi \\ z_r & z_\theta & z_\varphi \end{vmatrix} = \begin{vmatrix} \sin\theta\cos\varphi & r\cos\theta\cos\varphi & -r\sin\theta\sin\varphi \\ \sin\theta\sin\varphi & r\cos\theta\sin\varphi & r\sin\theta\cos\varphi \\ \cos\theta & -r\sin\theta & 0 \end{vmatrix} = r^2\sin\theta.$$
Sometimes one simply writes $\frac{\partial(x,y,z)}{\partial(r,\theta,\varphi)} = r^2\sin\theta$ for the modulus of the Jacobian. Hence, for the unit ball $B_1$,
$$\iiint_{B_1} f(x,y,z)\,dx\,dy\,dz = \int_0^1 \int_0^\pi \int_0^{2\pi} f(r\sin\theta\cos\varphi,\, r\sin\theta\sin\varphi,\, r\cos\theta)\,r^2\sin\theta\,d\varphi\,d\theta\,dr.$$
This example was not covered in the lecture. Compute the volume of the ellipsoid E given by $u^2/a^2 + v^2/b^2 + w^2/c^2 = 1$. We use scaled spherical coordinates:
$$u = ar\sin\theta\cos\varphi, \qquad v = br\sin\theta\sin\varphi, \qquad w = cr\cos\theta,$$
where $r \in [0,1]$, $\theta \in [0,\pi]$, $\varphi \in [0,2\pi]$. Since the rows of the spherical Jacobian matrix $\frac{\partial(x,y,z)}{\partial(r,\theta,\varphi)}$ are simply multiplied by a, b, and c, respectively, we have
$$\frac{\partial(u,v,w)}{\partial(r,\theta,\varphi)} = abc\,r^2\sin\theta.$$
Hence, using iterated integrals over the parameter box,
$$v(E) = \iiint_E du\,dv\,dw = abc \int_0^1 r^2\,dr \int_0^\pi \sin\theta\,d\theta \int_0^{2\pi} d\varphi = abc \cdot \frac{1}{3} \cdot \left( -\cos\theta \right)\Big|_0^\pi \cdot 2\pi = \frac{4\pi}{3}\,abc.$$
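The scaled Jacobian $abc\,r^2\sin\theta$ can be tested directly by a Riemann sum over the parameter box (the semi-axes are arbitrary sample values):

```python
import math

# Volume of the ellipsoid via the parameter box with Jacobian a*b*c * r**2 * sin(theta).
a, b, c = 1.0, 2.0, 3.0
n = 2000
hr, hth = 1.0 / n, math.pi / n
s_r = sum(((i + 0.5) * hr) ** 2 for i in range(n)) * hr         # int_0^1 r**2 dr = 1/3
s_th = sum(math.sin((i + 0.5) * hth) for i in range(n)) * hth   # int_0^pi sin = 2
v = a * b * c * s_r * s_th * 2.0 * math.pi                      # phi contributes 2*pi
exact = 4.0 / 3.0 * math.pi * a * b * c
print(v, exact)
```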
(c) Compute $\iint_C (x^2+y^2)\,dx\,dy$, where C is the region in the first quadrant bounded by the curves
$$xy = 1, \qquad xy = 2, \qquad x^2 - y^2 = 1, \qquad x^2 - y^2 = 4.$$
We change coordinates, $g(x,y) = (u,v)$ with
$$u = xy, \qquad v = x^2 - y^2.$$
The Jacobian is
$$\frac{\partial(u,v)}{\partial(x,y)} = \begin{vmatrix} y & x \\ 2x & -2y \end{vmatrix} = -2(x^2+y^2), \qquad \text{hence} \qquad \left| \frac{\partial(x,y)}{\partial(u,v)} \right| = \frac{1}{2(x^2+y^2)}.$$
In the (u,v)-plane, the region is a rectangle $D = \{(u,v) \in \mathbb{R}^2 \mid 1 \le u \le 2,\ 1 \le v \le 4\}$. Hence,
$$\iint_C (x^2+y^2)\,dx\,dy = \iint_D (x^2+y^2) \left| \frac{\partial(x,y)}{\partial(u,v)} \right| du\,dv = \iint_D \frac{x^2+y^2}{2(x^2+y^2)}\,du\,dv = \frac{1}{2}\,v(D) = \frac{3}{2}.$$
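The value 3/2 can also be obtained by a brute-force Riemann sum over the original region, without the change of variables (the bounding box and step size are arbitrary choices; the indicator test makes this slow but simple):

```python
# Direct Riemann sum of x**2 + y**2 over {(x, y) : x, y > 0, 1 <= x*y <= 2,
# 1 <= x**2 - y**2 <= 4}; the change of variables predicts the value 3/2.
h = 0.002
total = 0.0
x = h / 2
while x < 2.5:                      # the region fits inside [0, 2.5] x [0, 1.6]
    y = h / 2
    while y < 1.6:
        u, v = x * y, x * x - y * y
        if 1.0 <= u <= 2.0 and 1.0 <= v <= 4.0:
            total += (x * x + y * y) * h * h
        y += h
    x += h
print(total)  # close to 1.5
```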
Physical Applications
If $\rho(x) = \rho(x_1,x_2,x_3)$ is a mass density of a solid $C \subset \mathbb{R}^3$, then
$$m = \iiint_C \rho(x)\,dx\,dy\,dz$$
is its total mass, and the coordinates of its center of mass are
$$\bar{x}_i = \frac{1}{m} \iiint_C x_i\,\rho(x)\,dx\,dy\,dz, \qquad i = 1,2,3.$$
Similarly, one defines the products of inertia
$$I_{xy} = \iiint_C xy\,\rho\,dx\,dy\,dz, \qquad I_{xz} = \iiint_C xz\,\rho\,dx\,dy\,dz, \qquad I_{yz} = \iiint_C yz\,\rho\,dx\,dy\,dz.$$
Here $I_{xx}$, $I_{yy}$, and $I_{zz}$ are the moments of inertia of the solid with respect to the x-axis, y-axis, and z-axis, respectively.
Example 9.5 Compute the mass center of a homogeneous half-plate of radius R, $C = \{(x,y) \mid x^2+y^2 \le R^2,\ y \ge 0\}$.
Solution. By the symmetry of C with respect to the y-axis, $\bar{x} = 0$. Using polar coordinates we find
$$\bar{y} = \frac{1}{m} \iint_C y\,dx\,dy = \frac{1}{m} \int_0^R \int_0^\pi r\sin\varphi\; r\,d\varphi\,dr = \frac{1}{m} \int_0^R r^2\,dr\;(-\cos\varphi)\Big|_0^\pi = \frac{1}{m}\,\frac{2R^3}{3}.$$
With unit density, $m = \pi R^2/2$, so $\bar{y} = \frac{4R}{3\pi}$.
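A numerical check of the centroid $\bar y = 4R/(3\pi)$ (R is an arbitrary sample value; the double integral factorizes, so the two one-dimensional sums are computed separately):

```python
import math

# Center of mass of the homogeneous half-disc of radius R (unit density):
# ybar = (1/m) * int_0^R int_0^pi (r*sin(phi)) * r dphi dr with m = pi*R**2/2.
R = 2.0
n = 100000
hr, hph = R / n, math.pi / n
s_r = sum(((i + 0.5) * hr) ** 2 for i in range(n)) * hr          # int_0^R r**2 dr = R**3/3
s_ph = sum(math.sin((i + 0.5) * hph) for i in range(n)) * hph    # int_0^pi sin = 2
m = math.pi * R ** 2 / 2
ybar = s_r * s_ph / m
print(ybar, 4 * R / (3 * math.pi))
```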
9.4 Appendix
Proof of Fubini's Theorem. Let $P_A$ be a partition of A and $P_B$ a partition of B. Together they give a partition P of $A \times B$ for which any subrectangle S is of the form $S_A \times S_B$, where $S_A$ is a subrectangle of the partition $P_A$ and $S_B$ is a subrectangle of the partition $P_B$. Thus
$$L(P,f) = \sum_S m_S(f)\,v(S) = \sum_{S_A,S_B} m_{S_A \times S_B}(f)\,v(S_A \times S_B) = \sum_{S_A} \left( \sum_{S_B} m_{S_A \times S_B}(f)\,v(S_B) \right) v(S_A).$$
Now, if $x \in S_A$, then clearly $m_{S_A \times S_B}(f) \le m_{S_B}(g_x)$, since the reference set $S_A \times S_B$ on the left is bigger than the reference set $\{x\} \times S_B$ on the right. Consequently, for $x \in S_A$ we have
$$\sum_{S_B} m_{S_A \times S_B}(f)\,v(S_B) \le \sum_{S_B} m_{S_B}(g_x)\,v(S_B) \le \underline{\int_B} g_x\,dy = L(x).$$
Therefore,
$$\sum_{S_A} \left( \sum_{S_B} m_{S_A \times S_B}(f)\,v(S_B) \right) v(S_A) \le L(P_A, L).$$
We thus obtain
$$L(P,f) \le L(P_A, L) \le U(P_A, L) \le U(P_A, U) \le U(P, f),$$
where the proof of the last inequality is entirely analogous to the proof of the first. Since f is integrable, $\sup\{L(P,f)\} = \inf\{U(P,f)\} = \int_{A\times B} f\,dx\,dy$. Hence,
$$\sup\{L(P_A, L)\} = \inf\{U(P_A, L)\} = \int_{A\times B} f\,dx\,dy.$$
In other words, L(x) is integrable on A and $\int_{A\times B} f\,dx\,dy = \int_A L(x)\,dx$. The assertion for U(x) follows similarly from the inequalities above.
Chapter 10
Surface Integrals
10.1 Surfaces in R3
Recall that a domain G is an open and connected subset of $\mathbb{R}^n$; connected means that for any two points x and y in G there exist points $x_0, x_1, \dots, x_k$ with $x_0 = x$ and $x_k = y$ such that every segment $\overline{x_{i-1} x_i}$, $i = 1, \dots, k$, is completely contained in G.
Definition 10.1 Let $G \subset \mathbb{R}^2$ be a domain and $F \colon G \to \mathbb{R}^3$ continuously differentiable. The mapping F, as well as the set $\mathcal{F} = F(G) = \{F(s,t) \mid (s,t) \in G\}$, is called an open regular surface if the Jacobian matrix $F'(s,t)$ has rank 2 for all $(s,t) \in G$.
If
$$F(s,t) = \begin{pmatrix} x(s,t) \\ y(s,t) \\ z(s,t) \end{pmatrix}, \qquad \text{then} \qquad F'(s,t) = \begin{pmatrix} x_s & x_t \\ y_s & y_t \\ z_s & z_t \end{pmatrix}.$$
The two column vectors of $F'(s,t)$ span the tangent plane to $\mathcal{F}$ at $(s,t)$:
$$D_1 F(s,t) = \left( \frac{\partial x}{\partial s}(s,t),\, \frac{\partial y}{\partial s}(s,t),\, \frac{\partial z}{\partial s}(s,t) \right), \qquad D_2 F(s,t) = \left( \frac{\partial x}{\partial t}(s,t),\, \frac{\partial y}{\partial t}(s,t),\, \frac{\partial z}{\partial t}(s,t) \right).$$
Justification: Suppose $(s, t_0) \in G$ where $t_0$ is fixed. Then $\gamma(s) = F(s, t_0)$ defines a curve in $\mathcal{F}$ with tangent vector $\gamma'(s) = D_1 F(s, t_0)$. Similarly, for fixed $s_0$ we obtain another curve $\delta(t) = F(s_0, t)$ with tangent vector $\delta'(t) = D_2 F(s_0, t)$. Since $F'(s,t)$ has rank 2 at every point of G, the vectors $D_1 F$ and $D_2 F$ are linearly independent; hence they span a plane.
The plane
$$E = \{F(s_0,t_0) + \lambda\,D_1 F(s_0,t_0) + \mu\,D_2 F(s_0,t_0) \mid \lambda, \mu \in \mathbb{R}\}$$
is called the tangent plane E to $\mathcal{F}$ at $F(s_0,t_0)$. The line through $F(s_0,t_0)$ which is orthogonal to E is called the normal line to $\mathcal{F}$ at $F(s_0,t_0)$.
Recall that the vector product $\vec{x} \times \vec{y}$ of vectors $\vec{x} = (x_1,x_2,x_3)$ and $\vec{y} = (y_1,y_2,y_3)$ from $\mathbb{R}^3$ is the vector
$$\vec{x} \times \vec{y} = \begin{vmatrix} e_1 & e_2 & e_3 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix} = (x_2 y_3 - y_2 x_3,\ x_3 y_1 - y_3 x_1,\ x_1 y_2 - y_1 x_2).$$
It is orthogonal to the plane spanned by the parallelogram P with edges $\vec{x}$ and $\vec{y}$. Its length is the area of the parallelogram P.
A vector which points in the direction of the normal line is
$$D_1 F(s_0,t_0) \times D_2 F(s_0,t_0) = \begin{vmatrix} e_1 & e_2 & e_3 \\ x_s & y_s & z_s \\ x_t & y_t & z_t \end{vmatrix}, \qquad (10.1)$$
and the corresponding unit normal vector is
$$\vec{n} = \frac{D_1 F \times D_2 F}{\|D_1 F \times D_2 F\|}. \qquad (10.2)$$
Special case: if $\mathcal{F}$ is the graph of a function f, $F(x,y) = (x, y, f(x,y))$, then $D_1 F = (1, 0, f_x)$ and $D_2 F = (0, 1, f_y)$, so that
$$D_1 F \times D_2 F = \begin{vmatrix} e_1 & e_2 & e_3 \\ 1 & 0 & f_x \\ 0 & 1 & f_y \end{vmatrix} = (-f_x, -f_y, 1).$$
The equation of the tangent plane at $(x_0,y_0,z_0)$, $z_0 = f(x_0,y_0)$, is therefore
$$-f_x\,(x - x_0) - f_y\,(y - y_0) + 1\cdot(z - z_0) = 0.$$
Further, the unit normal vector to the tangent plane is
$$\vec{n} = \frac{(-f_x, -f_y, 1)}{\sqrt{f_x^2 + f_y^2 + 1}}.$$
The area of the regular surface $\mathcal{F} = F(G)$ is defined by
$$|\mathcal{F}| = \iint_G \|D_1 F \times D_2 F\|\,ds\,dt. \qquad (10.3)$$
Example 10.2 Consider the helicoid $F(s,t) = (s\cos t,\, s\sin t,\, 2t)$, $s \in [-2,2]$, $t \in [0,2\pi]$. We have
$$D_1 F = (\cos t, \sin t, 0), \qquad D_2 F = (-s\sin t, s\cos t, 2),$$
$$D_1 F \times D_2 F = \begin{vmatrix} e_1 & e_2 & e_3 \\ \cos t & \sin t & 0 \\ -s\sin t & s\cos t & 2 \end{vmatrix} = (2\sin t, -2\cos t, s).$$
Therefore,
$$|\mathcal{F}| = \int_0^{2\pi} \int_{-2}^{2} \sqrt{4\cos^2 t + 4\sin^2 t + s^2}\,ds\,dt = 4\pi \int_0^2 \sqrt{4+s^2}\,ds = 8\pi\left( \sqrt{2} - \log(\sqrt{2}-1) \right).$$
Example 10.3 (Guldin's Rule (Paul Guldin, 1577–1643, Swiss mathematician)) Let f be a continuously differentiable function on [a,b] with $f(x) \ge 0$ for all $x \in [a,b]$. Let the graph of f revolve around the x-axis and let $\mathcal{F}$ be the corresponding surface. We have
$$|\mathcal{F}| = 2\pi \int_a^b f(x)\sqrt{1 + f'(x)^2}\,dx.$$
Indeed, parametrizing $F(x,\varphi) = (x,\, f(x)\cos\varphi,\, f(x)\sin\varphi)$, $x \in [a,b]$, $\varphi \in [0,2\pi]$, we have
$$D_1 F = (1, f'(x)\cos\varphi, f'(x)\sin\varphi), \qquad D_2 F = (0, -f(x)\sin\varphi, f(x)\cos\varphi),$$
$$D_1 F \times D_2 F = (f f', -f\cos\varphi, -f\sin\varphi),$$
so that $dS = f(x)\sqrt{1+f'(x)^2}\,dx\,d\varphi$. Hence
$$|\mathcal{F}| = \int_a^b \int_0^{2\pi} f(x)\sqrt{1+f'(x)^2}\,d\varphi\,dx = 2\pi \int_a^b f(x)\sqrt{1+f'(x)^2}\,dx.$$
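A check of Guldin's rule on the sphere, where the answer $4\pi R^2$ is known (R is an arbitrary sample value):

```python
import math

# Guldin's rule for f(x) = sqrt(R**2 - x**2): the integrand f * sqrt(1 + f'**2)
# simplifies to the constant R, so the result is the sphere area 4*pi*R**2.
R = 3.0
n = 10000
h = 2.0 * R / n
area = 0.0
for i in range(n):
    x = -R + (i + 0.5) * h          # midpoints avoid the endpoints x = +-R
    f = math.sqrt(R * R - x * x)
    fp = -x / f
    area += 2.0 * math.pi * f * math.sqrt(1.0 + fp * fp) * h
print(area, 4.0 * math.pi * R ** 2)
```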
(b) Let the surface be given implicitly as G(x,y,z) = 0, and suppose G is locally solvable for z in a neighborhood of some point $(x_0,y_0,z_0)$, so that locally $z = f(x,y)$ and the preceding formulas apply. In classical notation the surface element (up to the sign) is given by
$$dS = \sqrt{EG - H^2}\,ds\,dt,$$
where
$$E = x_s^2 + y_s^2 + z_s^2, \qquad G = x_t^2 + y_t^2 + z_t^2, \qquad H = x_s x_t + y_s y_t + z_s z_t.$$
Indeed, using $\|\vec{a} \times \vec{b}\| = \|\vec{a}\|\,\|\vec{b}\|\sin\gamma$ and $\sin^2\gamma = 1 - \cos^2\gamma$, where $\gamma$ is the angle spanned by $\vec{a}$ and $\vec{b}$, we get
$$\|D_1 F \times D_2 F\|^2 = \|D_1 F\|^2\,\|D_2 F\|^2 - (D_1 F \cdot D_2 F)^2 = EG - H^2.$$
If, for the sphere of radius R,
$$x = R\cos\varphi\sin\theta, \qquad y = R\sin\varphi\sin\theta, \qquad z = R\cos\theta,$$
we obtain
$$D_1 F = R(\cos\varphi\cos\theta,\, \sin\varphi\cos\theta,\, -\sin\theta), \qquad D_2 F = R(-\sin\varphi\sin\theta,\, \cos\varphi\sin\theta,\, 0),$$
so that $dS = R^2\sin\theta\,d\theta\,d\varphi$.
Integration over $\mathbb{R}^3$ can be reduced to integration over spheres:
$$\iiint f\,dx\,dy\,dz = \int_0^\infty \left( \iint_{S_r} f(\vec{x})\,dS \right) dr = \int_0^\infty r^2 \left( \iint_{S_1} f(r\vec{x})\,dS(\vec{x}) \right) dr.$$
Indeed, by the previous example and by our knowledge of spherical coordinates $(r,\theta,\varphi)$,
$$dx\,dy\,dz = r^2\sin\theta\,dr\,d\theta\,d\varphi = dr\,dS_r.$$
On the other hand, on the unit sphere $S_1$, $dS = \sin\theta\,d\theta\,d\varphi$, such that
$$dx\,dy\,dz = r^2\,dr\,dS,$$
which establishes the second formula.
Applications of the scalar surface integral include the area $\iint_{\mathcal{F}} 1\,dS$, the center of mass of a surface with mass density ρ,
$$\bar{x} = \frac{1}{m} \iint_{\mathcal{F}} x\,\rho(x,y,z)\,dS,$$
and the potential of a charged surface at a point $\vec{y}$,
$$U(\vec{y}) = \iint_{\mathcal{F}} \frac{\rho(\vec{x})}{\|\vec{y}-\vec{x}\|}\,dS(\vec{x}), \qquad \vec{y} \notin \mathcal{F}.$$
A unit normal field on a regular surface $\mathcal{F}$ is given by
$$\vec{n} = \varepsilon\,\frac{D_1 F \times D_2 F}{\|D_1 F \times D_2 F\|},$$
where $\varepsilon = +1$ or $\varepsilon = -1$ fixes the orientation of $\mathcal{F}$. It turns out that for a regular surface $\mathcal{F}$ there either exist exactly two continuous unit normal fields or there is no such field. If $\mathcal{F}$ is provided with an orientation, we write $\mathcal{F}^+$ for the pair $(\mathcal{F}, \vec{n})$. For $\mathcal{F}$ with the opposite orientation, we write $\mathcal{F}^-$.
Examples of non-orientable surfaces are the Möbius band and the real projective plane. Analytically, the Möbius band is given by
$$F(s,t) = \begin{pmatrix} \left(1 + t\cos\frac{s}{2}\right)\sin s \\[2pt] \left(1 + t\cos\frac{s}{2}\right)\cos s \\[2pt] t\sin\frac{s}{2} \end{pmatrix}, \qquad (s,t) \in [0,2\pi] \times \left[ -\frac{1}{2},\, \frac{1}{2} \right]. \qquad (10.4)$$
The surface integral does not depend on the parametrization: if $G(\xi,\eta) = F(s(\xi,\eta), t(\xi,\eta))$ is a reparametrization, then $D_1 G = D_1 F\,s_\xi + D_2 F\,t_\xi$ and $D_2 G = D_1 F\,s_\eta + D_2 F\,t_\eta$, so that, using $\vec{x} \times \vec{x} = 0$ and $\vec{x} \times \vec{y} = -\vec{y} \times \vec{x}$,
$$D_1 G \times D_2 G\,d\xi\,d\eta = (D_1 F\,s_\xi + D_2 F\,t_\xi) \times (D_1 F\,s_\eta + D_2 F\,t_\eta)\,d\xi\,d\eta = (s_\xi t_\eta - s_\eta t_\xi)\,D_1 F \times D_2 F\,d\xi\,d\eta = D_1 F \times D_2 F\,ds\,dt.$$
(b) The scalar surface integral is a special case of the surface integral, namely
$$\iint_{\mathcal{F}} f\,dS = \iint_{\mathcal{F}^+} (f\vec{n}) \cdot d\vec{S}.$$
(c) Special cases. Let $\mathcal{F}$ be the graph of a function f, $\mathcal{F} = \{(x,y,f(x,y)) \mid (x,y) \in C\}$; then
$$d\vec{S} = (-f_x, -f_y, 1)\,dx\,dy.$$
If the surface is given implicitly by F(x,y,z) = 0 and it is locally solvable for z, then
$$d\vec{S} = \frac{\operatorname{grad} F}{F_z}\,dx\,dy.$$
(d) Still another form of $d\vec{S}$:
$$\iint_{\mathcal{F}^+} \vec{f} \cdot d\vec{S} = \iint_G \vec{f}(F(s,t)) \cdot (D_1 F \times D_2 F)\,ds\,dt = \iint_G \begin{vmatrix} f_1(F(s,t)) & f_2(F(s,t)) & f_3(F(s,t)) \\ x_s(s,t) & y_s(s,t) & z_s(s,t) \\ x_t(s,t) & y_t(s,t) & z_t(s,t) \end{vmatrix}\,ds\,dt. \qquad (10.5)$$
(e) Again another notation. Computing the previous determinant, or the determinant (10.1), explicitly, we have
$$\vec{f} \cdot (D_1 F \times D_2 F) = f_1 \begin{vmatrix} y_s & z_s \\ y_t & z_t \end{vmatrix} + f_2 \begin{vmatrix} z_s & x_s \\ z_t & x_t \end{vmatrix} + f_3 \begin{vmatrix} x_s & y_s \\ x_t & y_t \end{vmatrix} = f_1\,\frac{\partial(y,z)}{\partial(s,t)} + f_2\,\frac{\partial(z,x)}{\partial(s,t)} + f_3\,\frac{\partial(x,y)}{\partial(s,t)}.$$
Hence,
$$d\vec{S} = D_1 F \times D_2 F\,ds\,dt = \left( \frac{\partial(y,z)}{\partial(s,t)},\, \frac{\partial(z,x)}{\partial(s,t)},\, \frac{\partial(x,y)}{\partial(s,t)} \right) ds\,dt,$$
and one writes
$$\iint_{\mathcal{F}^+} \vec{f} \cdot d\vec{S} = \iint_{\mathcal{F}^+} f_1\,dy\,dz + f_2\,dz\,dx + f_3\,dx\,dy.$$
In this setting,
$$\iint_{\mathcal{F}^+} f_1\,dy\,dz = \iint_{\mathcal{F}^+} (f_1, 0, 0) \cdot d\vec{S} = \iint_G f_1(F(s,t))\,\frac{\partial(y,z)}{\partial(s,t)}\,ds\,dt.$$
Similarly,
$$\iint_{\mathcal{F}^+} f\,dz\,dx = \iint_{\mathcal{F}^+} (0, f, 0) \cdot d\vec{S},$$
and a concrete evaluation of such an integral reduces to an ordinary double integral, for instance
$$\iint_{[0,1]^2} x^2 y\,(x^2 + y)\,dx\,dy = \int_0^1 dx \int_0^1 (x^4 y + x^2 y^2)\,dy = \int_0^1 \left( \frac{x^4}{2} + \frac{x^2}{3} \right) dx = \frac{1}{10} + \frac{1}{9} = \frac{19}{90}.$$
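The value 19/90 can be verified exactly with rational arithmetic:

```python
from fractions import Fraction

# Exact iterated integration of x**4*y + x**2*y**2 over the unit square:
# the inner y-integral gives x**4/2 + x**2/3; the outer x-integral gives 1/10 + 1/9.
value = Fraction(1, 2) * Fraction(1, 5) + Fraction(1, 3) * Fraction(1, 3)
print(value)  # 19/90
```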
Example 10.5 Let $\mathcal{F}$ be the boundary of the upper half-ball $\{(x,y,z) \mid x^2+y^2+z^2 \le R^2,\ z \ge 0\}$, consisting of the half-sphere
$$\mathcal{F}_1 = \{(x, y, \sqrt{R^2-x^2-y^2}) \mid x^2+y^2 \le R^2\}, \qquad z = g(x,y) = \sqrt{R^2-x^2-y^2},$$
with the upper orientation of the unit normal field, and of the disc $\mathcal{F}_2$ in the x-y-plane,
$$\mathcal{F}_2 = \{(x,y,0) \mid x^2+y^2 \le R^2\}, \qquad z = g(x,y) = 0,$$
with the downward directed normal. Let $\vec{f}(x,y,z) = (ax, by, cz)$. We want to compute
$$\iint_{\mathcal{F}^+} \vec{f} \cdot d\vec{S} = I_1 + I_2, \qquad I_1 = \iint_{\mathcal{F}_1^+} \vec{f} \cdot d\vec{S}, \quad I_2 = \iint_{\mathcal{F}_2^+} \vec{f} \cdot d\vec{S}.$$
On $\mathcal{F}_1$ we have $d\vec{S} = (-g_x, -g_y, 1)\,dx\,dy$. Using polar coordinates and $z = \sqrt{R^2-r^2}$, we get
$$I_1 = \int_0^{2\pi} \int_0^R \left( \frac{a r^2\cos^2\varphi}{\sqrt{R^2-r^2}} + \frac{b r^2\sin^2\varphi}{\sqrt{R^2-r^2}} + c\sqrt{R^2-r^2} \right) r\,dr\,d\varphi.$$
Noting $\int_0^{2\pi}\sin^2\varphi\,d\varphi = \int_0^{2\pi}\cos^2\varphi\,d\varphi = \pi$, we continue
$$I_1 = \pi \int_0^R \left( \frac{a r^3}{\sqrt{R^2-r^2}} + \frac{b r^3}{\sqrt{R^2-r^2}} + 2c\,r\sqrt{R^2-r^2} \right) dr.$$
Substituting $r = R\sin t$,
$$\int_0^R \frac{r^3}{\sqrt{R^2-r^2}}\,dr = R^3 \int_0^{\pi/2} \sin^3 t\,dt = \frac{2}{3}R^3.$$
Hence,
$$I_1 = \frac{2\pi}{3} R^3 (a+b) + c\pi \int_0^R (R^2-r^2)^{\frac{1}{2}}\,d(r^2) = \frac{2\pi}{3} R^3 (a+b) + c\pi\left( -\frac{2}{3}(R^2-r^2)^{\frac{3}{2}} \right)\Big|_0^R = \frac{2\pi}{3} R^3 (a+b+c).$$
On $\mathcal{F}_2$ the outer normal is $(0,0,-1)$, so $\vec{f} \cdot d\vec{S} = -cz\,dx\,dy = 0$ since $z = 0$ there; thus $I_2 = 0$. Hence,
$$\iint_{\mathcal{F}^+} (ax, by, cz) \cdot d\vec{S} = \frac{2\pi}{3} R^3 (a+b+c).$$
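A numerical verification of this flux, using the spherical parametrization of the half-sphere (a, b, c, R are arbitrary sample values):

```python
import math

# Flux of (a*x, b*y, c*z) through the upper half-sphere of radius R, computed with
# the spherical parametrization; the bottom disc contributes 0 since z = 0 there.
a, b, c, R = 1.0, 2.0, 3.0, 1.5
n = 400
hth, hph = (math.pi / 2) / n, 2 * math.pi / n
flux = 0.0
for i in range(n):
    th = (i + 0.5) * hth
    st, ct = math.sin(th), math.cos(th)
    for j in range(n):
        ph = (j + 0.5) * hph
        x, y, z = R * st * math.cos(ph), R * st * math.sin(ph), R * ct
        f_dot_n = (a * x * x + b * y * y + c * z * z) / R   # outward unit normal is (x,y,z)/R
        flux += f_dot_n * R * R * st * hth * hph            # dS = R**2 * sin(theta) dth dph
print(flux, 2 * math.pi / 3 * R ** 3 * (a + b + c))
```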
All three integral theorems generalize the fundamental theorem of calculus $\int_a^b g'(x)\,dx = g(b) - g(a)$; note that a and b form the boundary of the segment [a,b]. There are three possibilities to do this:
$$\iiint_G \operatorname{div}\vec{f}\,dx\,dy\,dz = \iint_{(\partial G)^+} \vec{f} \cdot d\vec{S} \qquad \text{Gauß' theorem in } \mathbb{R}^3,$$
$$\iint_G \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy = \int_{(\partial G)^+} \vec{f} \cdot d\vec{x} \qquad \text{Green's theorem in } \mathbb{R}^2,$$
$$\iint_{\mathcal{F}^+} \operatorname{curl}\vec{f} \cdot d\vec{S} = \int_{(\partial\mathcal{F})^+} \vec{f} \cdot d\vec{x} \qquad \text{Stokes' theorem}.$$
Let $G \subset \mathbb{R}^3$ be a bounded domain (open, connected) such that its boundary $\mathcal{F} = \partial G$ satisfies the following assumptions: G can be written as
$$G = \{(x,y,z) \mid (x,y) \in C,\ \varphi(x,y) \le z \le \psi(x,y)\}, \qquad (10.6)$$
where $C \subset \mathbb{R}^2$ is a domain and $\varphi, \psi \in C^1(C)$ define regular top and bottom surfaces $\mathcal{F}_1$ and $\mathcal{F}_2$ of $\mathcal{F}$, respectively, oriented by the outer normal. Under these assumptions Gauß' theorem holds:
$$\iiint_G \operatorname{div}\vec{f}\,dx\,dy\,dz = \iint_{\partial G^+} \vec{f} \cdot d\vec{S}. \qquad (10.7)$$
We prove only one part of (10.7), namely the case $\vec{f} = (0,0,f_3)$:
$$\iiint_G \frac{\partial f_3}{\partial z}\,dx\,dy\,dz = \iint_{\partial G^+} f_3\,dx\,dy. \qquad (10.8)$$
Proof. By Fubini's theorem and the fundamental theorem of calculus,
$$\iiint_G \frac{\partial f_3}{\partial z}\,dx\,dy\,dz = \iint_C \left( \int_{\varphi(x,y)}^{\psi(x,y)} \frac{\partial f_3}{\partial z}\,dz \right) dx\,dy = \iint_C \left( f_3(x,y,\psi(x,y)) - f_3(x,y,\varphi(x,y)) \right) dx\,dy. \qquad (10.9)$$
On the other hand, the top surface $\mathcal{F}_1$ is the graph of ψ with upward orientation, $d\vec{S} = (-\psi_x, -\psi_y, 1)\,dx\,dy$, so that
$$I_1 = \iint_{\mathcal{F}_1^+} f_3\,dx\,dy = \iint_C f_3(x,y,\psi(x,y))\,dx\,dy.$$
Since the bottom surface $\mathcal{F}_2$ is oriented downward, the outer normal points in the direction of $(\varphi_x, \varphi_y, -1)$, such that
$$I_2 = \iint_{\mathcal{F}_2^+} f_3\,dx\,dy = -\iint_C f_3(x,y,\varphi(x,y))\,dx\,dy.$$
Adding $I_1$ and $I_2$ gives the right-hand side of (10.9), which completes the proof.
Remarks 10.2 (a) Gauß' divergence theorem can be used to compute the volume of a domain $G \subset \mathbb{R}^3$. Suppose the boundary $\partial G$ of G has the orientation of the outer normal. Then
$$v(G) = \iint_{\partial G^+} x\,dy\,dz = \iint_{\partial G^+} y\,dz\,dx = \iint_{\partial G^+} z\,dx\,dy.$$
(b) Applying the mean value theorem to the left-hand side of Gauß' formula, we have for any bounded region G containing $x_0$
$$\iiint_G \operatorname{div}\vec{f}\,dx\,dy\,dz = \operatorname{div}\vec{f}(x_0 + h)\,v(G) = \iint_{\partial G^+} \vec{f} \cdot d\vec{S},$$
where h is a small vector. Hence
$$\operatorname{div}\vec{f}(x_0) = \lim_{G \to x_0} \frac{1}{v(G)} \iint_{\partial G^+} \vec{f} \cdot d\vec{S} = \lim_{\varepsilon \to 0} \frac{3}{4\pi\varepsilon^3} \iint_{S_\varepsilon(x_0)^+} \vec{f} \cdot d\vec{S},$$
where the region G tends to $x_0$. In the second formula, we have chosen $G = B_\varepsilon(x_0)$, the open ball of radius ε with center $x_0$. The right-hand side can be thought of as the source density of the field $\vec{f}$. In particular, the right side gives a basis-independent description of $\operatorname{div}\vec{f}$.
Example 10.6 We want to compute the surface integral from Example 10.5 (b) using Gauß' theorem. Since $\operatorname{div}\vec{f} = a+b+c$,
$$\iint_{\mathcal{F}^+} \vec{f} \cdot d\vec{S} = \iiint_{x^2+y^2+z^2 \le R^2,\ z \ge 0} \operatorname{div}\vec{f}\,dx\,dy\,dz = (a+b+c) \iiint dx\,dy\,dz = \frac{2\pi R^3}{3}(a+b+c).$$
Next we derive some consequences of Gauß' divergence theorem which play an important role in partial differential equations.
Recall (Proposition 7.9 (Prop. 8.9)) that the directional derivative of a function $v \colon U \to \mathbb{R}$, $U \subset \mathbb{R}^n$, at $x_0$ in the direction of the unit vector $\vec{n}$ is given by $D_{\vec{n}} v(x_0) = \operatorname{grad} v(x_0) \cdot \vec{n}$.
Notation. Let $U \subset \mathbb{R}^3$ be open and $\mathcal{F}^+ \subset U$ an oriented, regular open surface with the unit normal vector $\vec{n}(x_0)$ at $x_0 \in \mathcal{F}$. Let $g \colon U \to \mathbb{R}$ be differentiable. Then we write
$$\frac{\partial g}{\partial\vec{n}}(x_0) = \operatorname{grad} g(x_0) \cdot \vec{n}(x_0). \qquad (10.10)$$
Applying Gauß' theorem to the vector field $u\,\operatorname{grad} v$ and using $\operatorname{div}(u\,\operatorname{grad} v) = \operatorname{grad} u \cdot \operatorname{grad} v + u\,\Delta v$, we obtain
$$\iiint_G \left( \operatorname{grad} u \cdot \operatorname{grad} v + u\,\Delta v \right) dx\,dy\,dz = \iint_{\partial G} u\,\operatorname{grad} v \cdot \vec{n}\,dS = \iint_{\partial G} u\,\frac{\partial v}{\partial\vec{n}}\,dS. \qquad (10.11)$$
This proves Green's first identity. Changing the roles of u and v and taking the difference, we obtain the second formula,
$$\iiint_G \left( u\,\Delta v - v\,\Delta u \right) dx\,dy\,dz = \iint_{\partial G} \left( u\,\frac{\partial v}{\partial\vec{n}} - v\,\frac{\partial u}{\partial\vec{n}} \right) dS. \qquad (10.12)$$
Inserting $v = 1$ into (10.12) we get (10.13):
$$\iiint_G \Delta u\,dx\,dy\,dz = \iint_{\partial G} \frac{\partial u}{\partial\vec{n}}\,dS. \qquad (10.13)$$
Proof. Put $u = u_1 - u_2$ and apply Green's first formula (10.11) with $v = u$. Note that $\Delta u = \Delta u_1 - \Delta u_2 = 0$ ($u_1$ and $u_2$ are harmonic in G) and $u(x) = u_1(x) - u_2(x) = 0$ on the boundary $x \in \partial G$. Hence
$$\iiint_G (\operatorname{grad} u \cdot \operatorname{grad} u)\,dx\,dy\,dz = \iint_{\partial G} \underbrace{u}_{0,\ x \in \partial G}\,\frac{\partial u}{\partial\vec{n}}\,dS - \iiint_G u\,\underbrace{\Delta u}_{0}\,dx\,dy\,dz = 0,$$
so $\operatorname{grad} u = 0$, u is constant, and since $u = 0$ on $\partial G$, $u \equiv 0$. In other words, a harmonic function is uniquely determined by its boundary values.
Let G be a domain in $\mathbb{R}^2$ with piecewise smooth (differentiable) boundaries $\gamma_1, \gamma_2, \dots, \gamma_k$. We give an orientation to the boundary: the outer curve is oriented counterclockwise (mathematically positive), the inner boundaries are oriented in the opposite direction.
Theorem 10.3 (Green's Theorem) Let (P,Q) be a continuously differentiable vector field on $\bar{G}$ and let the boundary $\gamma = \partial G$ be oriented as above. Then
$$\iint_G \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy = \int_\gamma P\,dx + Q\,dy. \qquad (10.14)$$
Proof. (a) First, we consider a region G of type 1 in the plane, $G = \{(x,y) \mid a \le x \le b,\ \varphi(x) \le y \le \psi(x)\}$, and we will prove that
$$-\iint_G \frac{\partial P}{\partial y}\,dx\,dy = \int_\gamma P\,dx. \qquad (10.15)$$
By Fubini's theorem,
$$\iint_G \frac{\partial P}{\partial y}\,dx\,dy = \int_a^b \left( \int_{\varphi(x)}^{\psi(x)} \frac{\partial P}{\partial y}\,dy \right) dx = \int_a^b \left( P(x,\psi(x)) - P(x,\varphi(x)) \right) dx.$$
The latter equality is due to the fundamental theorem of calculus. To compute the line integral, we parametrize the four parts of γ in a natural way:
$$\gamma_1\colon\ x = a,\ y = t,\ t \in [\varphi(a), \psi(a)] \text{ (traversed downward)}, \quad dx = 0,\ dy = dt;$$
$$\gamma_2\colon\ x = t,\ y = \varphi(t),\ t \in [a,b], \quad dx = dt,\ dy = \varphi'(t)\,dt;$$
$$\gamma_3\colon\ x = b,\ y = t,\ t \in [\varphi(b), \psi(b)], \quad dx = 0,\ dy = dt;$$
$$\gamma_4\colon\ x = t,\ y = \psi(t),\ t \in [a,b] \text{ (traversed backwards)}, \quad dx = dt,\ dy = \psi'(t)\,dt.$$
Since $dx = 0$ on $\gamma_1$ and $\gamma_3$, we are left with the line integrals over $\gamma_2$ and $\gamma_4$:
$$\int_\gamma P\,dx = \int_a^b P(t,\varphi(t))\,dt - \int_a^b P(t,\psi(t))\,dt = -\iint_G \frac{\partial P}{\partial y}\,dx\,dy,$$
which proves (10.15).
For the Q-part on the same type-1 region, note that by the Leibniz rule for differentiation under the integral sign,
$$\frac{d}{dx} \int_{\varphi(x)}^{\psi(x)} Q(x,y)\,dy = \int_{\varphi(x)}^{\psi(x)} \frac{\partial Q}{\partial x}(x,y)\,dy + Q(x,\psi(x))\,\psi'(x) - Q(x,\varphi(x))\,\varphi'(x).$$
Integrating over $[a,b]$ and using $\iint_G \frac{\partial Q}{\partial x}\,dx\,dy = \int_a^b \int_{\varphi(x)}^{\psi(x)} \frac{\partial Q}{\partial x}\,dy\,dx$, we get
$$\iint_G \frac{\partial Q}{\partial x}\,dx\,dy = \int_{\varphi(b)}^{\psi(b)} Q(b,y)\,dy - \int_{\varphi(a)}^{\psi(a)} Q(a,y)\,dy - \int_a^b \left( Q(t,\psi(t))\,\psi'(t) - Q(t,\varphi(t))\,\varphi'(t) \right) dt. \qquad (10.17)$$
On the other hand,
$$\int_{\gamma_1} Q\,dy = -\int_{\varphi(a)}^{\psi(a)} Q(a,y)\,dy, \qquad \int_{\gamma_3} Q\,dy = \int_{\varphi(b)}^{\psi(b)} Q(b,y)\,dy,$$
$$\int_{\gamma_2} Q\,dy = \int_a^b Q(t,\varphi(t))\,\varphi'(t)\,dt, \qquad \int_{\gamma_4} Q\,dy = -\int_a^b Q(t,\psi(t))\,\psi'(t)\,dt.$$
Adding up these integrals and comparing the result with (10.17), the proof for type-1 regions is complete.
[Figure: a region G decomposed into subregions of type 1 and type 2.]
Exactly in the same way, we can prove that (10.14) holds if G is a type-2 region.
(b) Breaking a region G up into smaller regions, each of which is both of type 1 and type 2, Green's theorem is valid for G. The line integrals along the inner boundaries cancel, leaving the line integral around the boundary of G.
(c) If the region has a hole, one can split it into two simply connected regions, for which Green's theorem is valid by the arguments of (b).
Taking $P = -\lambda y$ and $Q = (1-\lambda)x$, so that $Q_x - P_y = 1$, the area of G is, for any λ,
$$A = \iint_G dx\,dy = -\lambda \int_\gamma y\,dx + (1-\lambda) \int_\gamma x\,dy. \qquad (10.18)$$
Inserting $\lambda = 0$, $\lambda = 1$, and $\lambda = \frac{1}{2}$ gives the most common forms $A = \int_\gamma x\,dy = -\int_\gamma y\,dx = \frac{1}{2}\int_\gamma x\,dy - y\,dx$.
Example 10.7 Find the area bounded by the ellipse $\gamma\colon \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$. We parametrize γ by $\vec{x}(t) = (a\cos t, b\sin t)$, $t \in [0,2\pi]$, $\vec{x}\,'(t) = (-a\sin t, b\cos t)$. Then (10.18) gives
$$A = \frac{1}{2} \int_0^{2\pi} \left( a\cos t \cdot b\cos t - b\sin t \cdot (-a\sin t) \right) dt = \frac{1}{2} \int_0^{2\pi} ab\,dt = \pi ab.$$
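The line-integral area formula can be verified numerically for the ellipse (a, b are arbitrary sample values):

```python
import math

# Area of the ellipse from A = (1/2) * closed integral of (x dy - y dx),
# with x = a*cos(t), y = b*sin(t); the integrand is the constant a*b/2.
a, b = 3.0, 1.0
n = 100000
h = 2.0 * math.pi / n
area = 0.0
for i in range(n):
    t = (i + 0.5) * h
    x, y = a * math.cos(t), b * math.sin(t)
    dx, dy = -a * math.sin(t) * h, b * math.cos(t) * h
    area += 0.5 * (x * dy - y * dx)
print(area, math.pi * a * b)
```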
Theorem 10.4 (Stokes' Theorem) Let $\mathcal{F}^+$ be an oriented regular surface and let $\gamma = \partial\mathcal{F}$ be the boundary with the induced orientation. Further, let $\vec{f}$ be a continuously differentiable vector field on $\mathcal{F}$. Then we have
$$\iint_{\mathcal{F}^+} \operatorname{curl}\vec{f} \cdot d\vec{S} = \int_{\gamma^+} \vec{f} \cdot d\vec{x}. \qquad (10.19)$$
Proof. Main idea: reduction to Green's theorem. Since both sides of the equation are additive with respect to the vector field $\vec{f}$, it suffices to prove the statement for the vector fields $(f_1,0,0)$, $(0,f_2,0)$, and $(0,0,f_3)$. We show the theorem for $\vec{f} = (f,0,0)$; the other cases are quite analogous:
$$\iint_{\mathcal{F}^+} \left( \frac{\partial f}{\partial z}\,dz\,dx - \frac{\partial f}{\partial y}\,dx\,dy \right) = \int_{\partial\mathcal{F}} f\,dx.$$
Let $F(u,v)$, $(u,v) \in G$, be a parametrization of $\mathcal{F}$. On the boundary,
$$dx = \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv,$$
such that the line integral on the right reads, with $P(u,v) = f(x(u,v), y(u,v), z(u,v))\,\frac{\partial x}{\partial u}$ and $Q(u,v) = f\,\frac{\partial x}{\partial v}$,
$$\int_{\partial\mathcal{F}} f\,dx = \int_{\partial G} f x_u\,du + f x_v\,dv = \int_{\partial G} P\,du + Q\,dv$$
$$\underset{\text{Green's th.}}{=} \iint_G \left( \frac{\partial Q}{\partial u} - \frac{\partial P}{\partial v} \right) du\,dv = \iint_G \left( f_u x_v - f_v x_u \right) du\,dv,$$
since the mixed second derivatives $f x_{uv}$ cancel. By the chain rule, $f_u = f_x x_u + f_y y_u + f_z z_u$ and $f_v = f_x x_v + f_y y_v + f_z z_v$, hence
$$f_u x_v - f_v x_u = -f_y\,(x_u y_v - x_v y_u) + f_z\,(z_u x_v - z_v x_u),$$
so that
$$\int_{\partial\mathcal{F}} f\,dx = \iint_G \left( -f_y\,\frac{\partial(x,y)}{\partial(u,v)} + f_z\,\frac{\partial(z,x)}{\partial(u,v)} \right) du\,dv = \iint_{\mathcal{F}^+} -f_y\,dx\,dy + f_z\,dz\,dx.$$
Remark 10.3 (a) Green's theorem is the special case with $\mathcal{F} = G \times \{0\}$, $\vec{n} = (0,0,1)$ (orientation), and $\vec{f} = (P,Q,0)$.
(b) The right side of (10.19) is called the circulation of the vector field $\vec{f}$ over the closed curve γ. Now let $\vec{x}_0 \in \mathcal{F}$ be fixed and consider smaller and smaller neighborhoods $\mathcal{F}_0^+$ of $\vec{x}_0$ with boundaries $\gamma_0$. By Stokes' theorem and by the mean value theorem of integration,
$$\int_{\gamma_0} \vec{f} \cdot d\vec{x} = \iint_{\mathcal{F}_0} \operatorname{curl}\vec{f} \cdot \vec{n}\,dS = \operatorname{curl}\vec{f}(x_0) \cdot \vec{n}(x_0)\,\operatorname{area}(\mathcal{F}_0) + o(\operatorname{area}(\mathcal{F}_0)).$$
Hence,
$$\operatorname{curl}\vec{f}(x_0) \cdot \vec{n}(x_0) = \lim_{\mathcal{F}_0 \to x_0} \frac{\int_{\partial\mathcal{F}_0} \vec{f} \cdot d\vec{x}}{|\mathcal{F}_0|}.$$
We call $\operatorname{curl}\vec{f}(x_0) \cdot \vec{n}(x_0)$ the infinitesimal circulation of the vector field $\vec{f}$ at $x_0$ corresponding to the unit normal vector $\vec{n}$.
(c) Stokes' theorem then says that the integral of the infinitesimal circulation of a vector field $\vec{f}$ corresponding to the unit normal vector $\vec{n}$ over $\mathcal{F}$ equals the circulation of the vector field along the boundary of $\mathcal{F}$.
Path Independence of Line Integrals
We are now going to complete the proof of Proposition 8.3 and show that for a simply connected region $G \subset \mathbb{R}^3$ and a twice continuously differentiable vector field $\vec{f}$ with
$$\operatorname{curl}\vec{f} = 0 \quad \text{for all } x \in G,$$
the vector field $\vec{f}$ is conservative.
Proof. Indeed, let γ be a closed, regular, piecewise differentiable curve in G, and let γ be the boundary of a smooth regular oriented surface $\mathcal{F}^+$, $\gamma = \partial\mathcal{F}^+$, such that γ has the induced orientation. Inserting $\operatorname{curl}\vec{f} = 0$ into Stokes' theorem gives
$$\iint_{\mathcal{F}^+} \operatorname{curl}\vec{f} \cdot d\vec{S} = 0 = \int_\gamma \vec{f} \cdot d\vec{x};$$
the line integral is path independent and hence $\vec{f}$ is conservative. Note that the region must be simply connected; otherwise it is in general impossible to find $\mathcal{F}$ with boundary γ.
277
.
f3 =
x
y
f1 =
where h(x, y) is the integration constant, not depending on z. Inserting this into the third equation, we obtain
Z z
Z z
f1
f2
g2 g1
=
(x, y, t) dt + hx (x, y)
(x, y, t) dt
x
y
z0 x
z0 y
Z z
f1 f2
=
dt + hx
+
x
y
z0
Z z
f3
(x, y, t) dt + hx
=
div f =0 z z
0
f3 (x, y, z) = f3 (x, y, z) f3 (x, y, z0 ) + hx (x, y).
Next consider the general inverse problem: given a vector field $\vec{a}$ on G, find $\vec{f}$ with
$$\operatorname{curl}\vec{f} = \vec{a}.$$
Proposition 10.6 The above problem has a solution if and only if $\operatorname{div}\vec{a} = 0$.
Proof. The condition is necessary since $\operatorname{div}\vec{a} = \operatorname{div}\operatorname{curl}\vec{f} = 0$. We skip the vector arrows in what follows.
For the other direction we use the ansatz $f = r + s$ with
$$\operatorname{curl} r = 0, \quad \operatorname{div} r = h, \qquad (10.20)$$
$$\operatorname{curl} s = a, \quad \operatorname{div} s = 0. \qquad (10.21)$$
Since $\operatorname{curl} r = 0$, by Proposition 8.3 there exists a potential U with $r = \operatorname{grad} U$. Then $\operatorname{curl} r = 0$ and $\operatorname{div} r = \operatorname{div}\operatorname{grad} U = \Delta U$. Hence (10.20) is satisfied if and only if $r = \operatorname{grad} U$ and $\Delta U = h$.
Since $\operatorname{div} a = 0$ by assumption, there exists a vector potential g such that $\operatorname{curl} g = a$. Let χ be twice continuously differentiable on G and set $s = g + \operatorname{grad}\chi$. Then $\operatorname{curl} s = \operatorname{curl} g = a$ and $\operatorname{div} s = \operatorname{div} g + \operatorname{div}\operatorname{grad}\chi = \operatorname{div} g + \Delta\chi$. Hence $\operatorname{div} s = 0$ if and only if $\Delta\chi = -\operatorname{div} g$.
Both equations $\Delta U = h$ and $\Delta\chi = -\operatorname{div} g$ are so-called Poisson equations, which can be solved within the theory of partial differential equations (PDE).
The inverse problem does not have a unique solution. Choose a harmonic function φ, $\Delta\varphi = 0$, and put $f_1 = f + \operatorname{grad}\varphi$. Then
$$\operatorname{div} f_1 = \operatorname{div} f + \operatorname{div}\operatorname{grad}\varphi = \operatorname{div} f + \Delta\varphi = \operatorname{div} f = h,$$
$$\operatorname{curl} f_1 = \operatorname{curl} f + \operatorname{curl}\operatorname{grad}\varphi = \operatorname{curl} f = a.$$
Chapter 11
Differential Forms on Rn
We show that the theorems of Gauß, Green, and Stokes are three cases of a general theorem which is also named after Stokes. The simple formula now reads $\int_c d\omega = \int_{\partial c} \omega$. The appearance of the Jacobian in the change of variable theorem will become clear. We formulate the Poincaré lemma.
Good references are [Spi65], [AF01], and [vW81].
11.1 The Exterior Algebra Λ(Rn)
Although we are working with the ground field R, all constructions make sense for arbitrary fields K, in particular K = C. Let $\{e_1, \dots, e_n\}$ be the standard basis of $\mathbb{R}^n$; for $h \in \mathbb{R}^n$ we write $h = (h_1, \dots, h_n)$ with respect to the standard basis, $h = \sum_i h_i e_i$.
Let V be a real vector space and $V^*$ the set of linear functionals on V. It turns out that $V^*$ is again a linear space if we introduce addition and scalar multiples in the natural way. For $f, g \in V^*$, $\lambda \in \mathbb{R}$, put
$$(f+g)(v) := f(v) + g(v), \qquad (\lambda f)(v) := \lambda f(v).$$
One also writes $\langle f, v \rangle = f(v)$; in this case, the brackets denote the dual pairing between $V^*$ and V. By definition, the pairing is linear in both components. That is, for all $v, w \in V$ and for all $\lambda, \mu \in \mathbb{R}$,
$$\langle \lambda f + \mu g,\, v \rangle = \lambda \langle f, v \rangle + \mu \langle g, v \rangle, \qquad \langle f,\, \lambda v + \mu w \rangle = \lambda \langle f, v \rangle + \mu \langle f, w \rangle.$$
Example 11.1 (a) Let $V = \mathbb{R}^n$ with the above standard basis. For $i = 1, \dots, n$ define the ith coordinate functional $dx_i \colon \mathbb{R}^n \to \mathbb{R}$ by
$$dx_i(h) = dx_i(h_1, \dots, h_n) = h_i, \qquad h \in \mathbb{R}^n.$$
The functional $dx_i$ associates to each vector $h \in \mathbb{R}^n$ its ith coordinate $h_i$. The functional $dx_i$ is indeed linear, since for all $v, w \in \mathbb{R}^n$ and $\lambda, \mu \in \mathbb{R}$, $dx_i(\lambda v + \mu w) = (\lambda v + \mu w)_i = \lambda v_i + \mu w_i = \lambda\,dx_i(v) + \mu\,dx_i(w)$.
The linear space $(\mathbb{R}^n)^*$ also has dimension n. We will show that $\{dx_1, dx_2, \dots, dx_n\}$ is a basis of $(\mathbb{R}^n)^*$. We call it the dual basis to $\{e_1, \dots, e_n\}$. Using the Kronecker symbol, the evaluation of $dx_i$ on $e_j$ reads as follows:
$$dx_i(e_j) = \delta_{ij}, \qquad i, j = 1, \dots, n.$$
The $dx_i$ generate $(\mathbb{R}^n)^*$: for $f \in (\mathbb{R}^n)^*$,
$$\left( \sum_{i=1}^n f(e_i)\,dx_i \right)(h) = \sum_{i=1}^n f(e_i)\,h_i \underset{f \text{ homog.}}{=} \sum_i f(h_i e_i) = f\left( \sum_i h_i e_i \right) = f(h).$$
In Proposition 11.1 below, we will see that $\{dx_1, \dots, dx_n\}$ is not only generating but linearly independent.
(b) If $V = C([0,1])$, the continuous functions on [0,1], and α is an increasing function on [0,1], then the Riemann–Stieltjes integral
$$\varphi(f) = \int_0^1 f\,d\alpha, \qquad f \in V,$$
is a linear functional on V, as is the evaluation functional
$$\delta_a(f) = f(a), \qquad f \in V.$$
281
( , vi , , vj , ) = ( , vj , , vi , ),
(11.1)
i, j = 1, . . . , k, i 6= j, (11.2)
We denote the linear space of all k-forms on Rn by k (Rn ) with the convention 0 (Rn ) = R.
In case k = 1 property (11.2) is an empty condition such that 1 (Rn ) = (Rn ) is just the dual
space.
Let f1 , . . . , fk (Rn ) be linear functionals on Rn . Then we define the k-form
f1 fk k (Rn ) (read: f1 wedge f2 . . . wedge fk ) as follows
f1 (h1 ) f1 (hk )
..
(11.3)
f1 fk (h1 , . . . , hk ) = ...
.
fk (h1 ) fk (hk )
In particular, let i1 , . . . , ik {1, . . . , n} be fixed and choose fj = dxij , j = 1, . . . , k. Then
h1i hki
1
1
..
dxi1 dxik (h1 , . . . , hk ) = ...
.
h1i hki
k
k
b a c
a b c
d e f = e d f .
h g i
g h i
h1 , . . . , hk , h Rn
Proposition 11.1 For $k \le n$ the k-forms $\{dx_{i_1} \wedge \dots \wedge dx_{i_k} \mid 1 \le i_1 < i_2 < \dots < i_k \le n\}$ form a basis of the vector space $\Lambda^k(\mathbb{R}^n)$. A k-form with $k > n$ is identically zero. We have
$$\dim \Lambda^k(\mathbb{R}^n) = \binom{n}{k}.$$
Proof. Any k-form ω is uniquely determined by its values on the k-tuples of vectors $(e_{i_1}, \dots, e_{i_k})$ with $1 \le i_1 < i_2 < \dots < i_k \le n$. Indeed, using skew-symmetry of ω, we know ω on all k-tuples of basis vectors; using linearity in each component, we get ω on all k-tuples of vectors. This shows that the $dx_{i_1} \wedge \dots \wedge dx_{i_k}$ with $1 \le i_1 < i_2 < \dots < i_k \le n$ generate the linear space $\Lambda^k(\mathbb{R}^n)$. We make this precise in case k = 2. With $y = \sum_i y_i e_i$, $z = \sum_j z_j e_j$ we have, by linearity and skew-symmetry of ω,
$$\omega(y,z) = \sum_{i,j=1}^n y_i z_j\,\omega(e_i,e_j) \underset{\text{skew}}{=} \sum_{1 \le i < j \le n} (y_i z_j - y_j z_i)\,\omega(e_i,e_j) = \sum_{i<j} \begin{vmatrix} y_i & z_i \\ y_j & z_j \end{vmatrix}\,\omega(e_i,e_j) = \sum_{i<j} \omega(e_i,e_j)\,dx_i \wedge dx_j\,(y,z).$$
This shows that the $\binom{n}{2}$ 2-forms $\{dx_i \wedge dx_j \mid i < j\}$ generate $\Lambda^2(\mathbb{R}^n)$.
We show their linear independence. Suppose that $\sum_{i<j} \alpha_{ij}\,dx_i \wedge dx_j = 0$ for some $\alpha_{ij} \in \mathbb{R}$. Evaluating this on $(e_r, e_s)$, $r < s$, gives
$$0 = \sum_{i<j} \alpha_{ij}\,dx_i \wedge dx_j\,(e_r, e_s) = \sum_{i<j} \alpha_{ij} \begin{vmatrix} \delta_{ri} & \delta_{si} \\ \delta_{rj} & \delta_{sj} \end{vmatrix} = \sum_{i<j} \alpha_{ij}\,(\delta_{ri}\delta_{sj} - \delta_{rj}\delta_{si}) = \alpha_{rs};$$
hence the above 2-forms are linearly independent. The arguments for general k are similar.
In general, let $\omega \in \Lambda^k(\mathbb{R}^n)$; then there exist unique numbers $a_{i_1 \dots i_k} = \omega(e_{i_1}, \dots, e_{i_k}) \in \mathbb{R}$, $i_1 < i_2 < \dots < i_k$, such that
$$\omega = \sum_{1 \le i_1 < \dots < i_k \le n} a_{i_1 \dots i_k}\,dx_{i_1} \wedge \dots \wedge dx_{i_k}.$$
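The determinant formula (11.3) and the dimension count can be illustrated in a few lines of code (n = 4 and k = 2 are arbitrary choices; `wedge2` is a hypothetical helper name, not notation from the text):

```python
from itertools import combinations
from math import comb

# dx_i ^ dx_j evaluated on a pair (h1, h2) is the 2x2 determinant of coordinates.
def wedge2(i, j, h1, h2):
    return h1[i] * h2[j] - h1[j] * h2[i]

n = 4
dim = len(list(combinations(range(n), 2)))   # number of basis 2-forms dx_i ^ dx_j, i < j
e = [[1.0 if r == s else 0.0 for s in range(n)] for r in range(n)]
val_12 = wedge2(0, 1, e[0], e[1])            # dual-basis property
val_21 = wedge2(0, 1, e[1], e[0])            # skew-symmetry
print(dim, comb(n, 2), val_12, val_21)
```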
Proposition 11.2 (i) $\Lambda(\mathbb{R}^n) = \bigoplus_{k=0}^n \Lambda^k(\mathbb{R}^n)$ is an R-algebra with unity 1 and product ∧ defined by
$$(dx_{i_1} \wedge \dots \wedge dx_{i_k}) \wedge (dx_{j_1} \wedge \dots \wedge dx_{j_l}) = dx_{i_1} \wedge \dots \wedge dx_{i_k} \wedge dx_{j_1} \wedge \dots \wedge dx_{j_l}.$$
(ii) If $\omega_k \in \Lambda^k(\mathbb{R}^n)$ and $\omega_l \in \Lambda^l(\mathbb{R}^n)$, then $\omega_k \wedge \omega_l \in \Lambda^{k+l}(\mathbb{R}^n)$ and
$$\omega_k \wedge \omega_l = (-1)^{kl}\,\omega_l \wedge \omega_k.$$
Proof. (i) Associativity is clear since concatenation of strings is associative. The distributive laws are used to extend the multiplication from the basis to the entire space $\Lambda(\mathbb{R}^n)$.
We show (ii) for $\omega_k = dx_{i_1} \wedge \dots \wedge dx_{i_k}$ and $\omega_l = dx_{j_1} \wedge \dots \wedge dx_{j_l}$. We already know $dx_i \wedge dx_j = -dx_j \wedge dx_i$. There are kl transpositions $dx_{i_r} \leftrightarrow dx_{j_s}$ necessary to transport all $dx_{j_s}$ from the right to the left of $\omega_k$. Hence the sign is $(-1)^{kl}$.
In particular, $dx_i \wedge dx_i = 0$. The formula $dx_i \wedge dx_j = -dx_j \wedge dx_i$ determines the product in $\Lambda(\mathbb{R}^n)$ uniquely.
We call $\Lambda(\mathbb{R}^n)$ the exterior algebra of the vector space $\mathbb{R}^n$.
The following formula will be used in the next subsection. Let $\omega \in \Lambda^k(\mathbb{R}^n)$ and $\eta \in \Lambda^l(\mathbb{R}^n)$; then for all $v_1, \dots, v_{k+l} \in \mathbb{R}^n$
$$(\omega \wedge \eta)(v_1, \dots, v_{k+l}) = \frac{1}{k!\,l!} \sum_{\sigma \in S_{k+l}} (\operatorname{sign}\sigma)\,\omega(v_{\sigma(1)}, \dots, v_{\sigma(k)})\,\eta(v_{\sigma(k+1)}, \dots, v_{\sigma(k+l)}). \qquad (11.4)$$
Formula (11.4) appears when expanding a determinant with respect to its last l rows. This can be done using the Laplace expansion:
$$|A| = \sum_{1 \le j_1 < \dots < j_k \le n} (-1)^{\sum_{m=1}^k (i_m + j_m)} \begin{vmatrix} a_{i_1 j_1} & \cdots & a_{i_1 j_k} \\ \vdots & & \vdots \\ a_{i_k j_1} & \cdots & a_{i_k j_k} \end{vmatrix} \cdot \begin{vmatrix} a_{i_{k+1} j_{k+1}} & \cdots & a_{i_{k+1} j_{k+l}} \\ \vdots & & \vdots \\ a_{i_{k+l} j_{k+1}} & \cdots & a_{i_{k+l} j_{k+l}} \end{vmatrix},$$
where $(i_1, \dots, i_k)$ is any fixed ordered multi-index and $(j_{k+1}, \dots, j_{k+l})$ is the complementary ordered multi-index to $(j_1, \dots, j_k)$, such that all integers $1, 2, \dots, k+l$ appear.
If $A \colon \mathbb{R}^n \to \mathbb{R}^m$ is linear and $\omega \in \Lambda^k(\mathbb{R}^m)$, the pull-back $A^*(\omega) \in \Lambda^k(\mathbb{R}^n)$ is defined by
$$(A^*\omega)(h_1, \dots, h_k) = \omega(Ah_1, \dots, Ah_k), \qquad h_1, \dots, h_k \in \mathbb{R}^n.$$
Example: for $A \colon \mathbb{R}^3 \to \mathbb{R}^2$ with matrix rows $(1,0,3)$ and $(2,1,0)$, i.e. $y_1 = x_1 + 3x_3$, $y_2 = 2x_1 + x_2$, we have
$$A^*(dy_1) = 1\,dx_1 + 0\,dx_2 + 3\,dx_3,$$
and, writing $A_i = Ae_i$ for the columns of A,
$$A^*(dy_2 \wedge dy_1)(e_i, e_j) = dy_2 \wedge dy_1(A_i, A_j) = \begin{vmatrix} a_{i2} & a_{j2} \\ a_{i1} & a_{j1} \end{vmatrix}.$$
In particular,
$$A^*(dy_2 \wedge dy_1)(e_1,e_2) = \begin{vmatrix} 2 & 1 \\ 1 & 0 \end{vmatrix} = -1, \qquad A^*(dy_2 \wedge dy_1)(e_2,e_3) = \begin{vmatrix} 1 & 0 \\ 0 & 3 \end{vmatrix} = 3, \qquad A^*(dy_2 \wedge dy_1)(e_1,e_3) = \begin{vmatrix} 2 & 0 \\ 1 & 3 \end{vmatrix} = 6.$$
Hence, $A^*(dy_2 \wedge dy_1) = -dx_1 \wedge dx_2 + 6\,dx_1 \wedge dx_3 + 3\,dx_2 \wedge dx_3$.
11.1.3 Orientation of $\mathbb{R}^n$

If $\{e_1,\dots,e_n\}$ and $\{f_1,\dots,f_n\}$ are two bases of $\mathbb{R}^n$, there exists a unique regular matrix $A=(a_{ij})$ ($\det A\ne0$) such that $e_i=\sum_j a_{ij}f_j$. We say that $\{e_1,\dots,e_n\}$ and $\{f_1,\dots,f_n\}$ are equivalent if and only if $\det A>0$. Since $\det A\ne0$, there are exactly two equivalence classes. We say that the two bases $\{e_i\mid i=1,\dots,n\}$ and $\{f_i\mid i=1,\dots,n\}$ define the same orientation if and only if $\det A>0$.

Definition 11.5 An orientation of $\mathbb{R}^n$ is given by fixing one of the two equivalence classes.

Example 11.4 (a) In $\mathbb{R}^2$ the bases $\{e_1,e_2\}$ and $\{e_2,e_1\}$ have different orientations since $A=\begin{pmatrix}0&1\\1&0\end{pmatrix}$ and $\det A=-1$.
(b) In $\mathbb{R}^3$ the bases $\{e_1,e_2,e_3\}$, $\{e_3,e_1,e_2\}$ and $\{e_2,e_3,e_1\}$ have the same orientation, whereas $\{e_1,e_3,e_2\}$, $\{e_2,e_1,e_3\}$, and $\{e_3,e_2,e_1\}$ have the opposite orientation.
(c) The standard basis $\{e_1,\dots,e_n\}$ and $\{e_2,e_1,e_3,\dots,e_n\}$ define different orientations.
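Example 11.4 can be verified by computing the sign of the determinant of the permutation matrix that maps the standard basis to the permuted one (a small illustrative check, names ours):

```python
# Orientation of a permuted basis = sign of det of its permutation matrix.
import numpy as np

def orientation(perm):
    n = len(perm)
    P = np.zeros((n, n))
    for i, p in enumerate(perm):
        P[i, p] = 1.0              # row i of P is e_{perm(i)}
    return np.sign(np.linalg.det(P))

assert orientation([1, 0]) == -1       # {e2, e1}: opposite orientation (a)
assert orientation([2, 0, 1]) == 1     # {e3, e1, e2}: same orientation (b)
assert orientation([0, 2, 1]) == -1    # {e1, e3, e2}: opposite orientation (b)
```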
(b) Let $\omega$ be a differential $k$-form on $U$. Since $\{dx_{i_1}\wedge\cdots\wedge dx_{i_k}\mid 1\le i_1<i_2<\cdots<i_k\le n\}$ forms a basis of $\Lambda^k(\mathbb{R}^n)$, there exist uniquely determined functions $a_{i_1\cdots i_k}$ on $U$ such that
$$\omega(p)=\sum_{1\le i_1<\cdots<i_k\le n}a_{i_1\cdots i_k}(p)\,dx_{i_1}\wedge\cdots\wedge dx_{i_k}.\qquad(11.5)$$
If all functions $a_{i_1\cdots i_k}$ are in $C^r(U)$, $r\in\mathbb{N}\cup\{\infty\}$, we say $\omega$ is an $r$ times continuously differentiable differential $k$-form on $U$. The set of those differential $k$-forms is denoted by $\Omega_r^k(U)$. We define
$$\Omega_r(U)=\bigoplus_{k=0}^{n}\Omega_r^k(U).$$
Multiplication by functions $\lambda\in\Omega_r^0(U)$ is defined pointwise in $\Omega_r(U)$:
$$(\lambda\omega)(x)=\lambda(x)\,\omega(x),\qquad x\in U;$$
for example, $\omega_2=(xy^2+3z^2)\,dx$.
11.2.2 Differentiation

$$d\omega(p)=\sum_{1\le i_1<\cdots<i_k\le n}da_{i_1\cdots i_k}(p)\wedge dx_{i_1}\wedge\cdots\wedge dx_{i_k}.\qquad(11.6)$$
Then $d\omega$ is a differential $(k+1)$-form. The linear operator $d\colon\Omega^k(U)\to\Omega^{k+1}(U)$ is called the exterior differential.
Remarks 11.1 (a) Note that for a function $f\colon U\to\mathbb{R}$, $Df\in L(\mathbb{R}^n,\mathbb{R})=\Lambda^1(\mathbb{R}^n)$. By Example 7.7 (a),
$$Df(x)(h)=\operatorname{grad}f(x)\cdot h=\sum_{i=1}^{n}\frac{\partial f}{\partial x_i}(x)\,h_i=\sum_{i=1}^{n}\frac{\partial f}{\partial x_i}(x)\,dx_i(h),$$
hence
$$df(x)=\sum_{i=1}^{n}\frac{\partial f}{\partial x_i}(x)\,dx_i.\qquad(11.7)$$
For a $1$-form $\omega=v_1\,dx+v_2\,dy+v_3\,dz$ on $U\subseteq\mathbb{R}^3$ one obtains
$$d\omega=\Big(\frac{\partial v_3}{\partial y}-\frac{\partial v_2}{\partial z}\Big)dy\wedge dz+\Big(\frac{\partial v_1}{\partial z}-\frac{\partial v_3}{\partial x}\Big)dz\wedge dx+\Big(\frac{\partial v_2}{\partial x}-\frac{\partial v_1}{\partial y}\Big)dx\wedge dy,$$
the coefficients being the components of $\operatorname{curl}v$.

Proposition (Leibniz rule and $d^2=0$) For $\omega\in\Omega_1^k(U)$ and $\eta\in\Omega_1^l(U)$,
$$d(\omega\wedge\eta)=d\omega\wedge\eta+(-1)^k\,\omega\wedge d\eta,\qquad d(d\omega)=0.$$
Proof. (i) We first prove the Leibniz rule for functions $f,g\in\Omega_1^0(U)$. By Remarks 11.1 (a),
$$d(fg)=\sum_i\frac{\partial(fg)}{\partial x_i}\,dx_i=\sum_i\Big(\frac{\partial f}{\partial x_i}\,g+f\,\frac{\partial g}{\partial x_i}\Big)dx_i=\Big(\sum_i\frac{\partial f}{\partial x_i}\,dx_i\Big)g+\Big(\sum_i\frac{\partial g}{\partial x_i}\,dx_i\Big)f=df\,g+f\,dg.$$
For $I=(i_1,\dots,i_k)$ and $J=(j_1,\dots,j_l)$ we abbreviate $dx_I=dx_{i_1}\wedge\cdots\wedge dx_{i_k}$ and $dx_J=dx_{j_1}\wedge\cdots\wedge dx_{j_l}$. Let $\omega=\sum_Ia_I\,dx_I$ and $\eta=\sum_Jb_J\,dx_J$. By definition,
$$d(\omega\wedge\eta)=d\Big(\sum_{I,J}a_Ib_J\,dx_I\wedge dx_J\Big)=\sum_{I,J}d(a_Ib_J)\wedge dx_I\wedge dx_J
=\sum_{I,J}\big(da_I\,b_J+a_I\,db_J\big)\wedge dx_I\wedge dx_J=d\omega\wedge\eta+(-1)^k\,\omega\wedge d\eta,$$
where in the third line we used $db_J\wedge dx_I=(-1)^k\,dx_I\wedge db_J$.
(ii) Again by the definition of $d$:
$$d(d\omega)=\sum_Id(da_I\wedge dx_I)=\sum_{I,j}d\Big(\frac{\partial a_I}{\partial x_j}\Big)\wedge dx_j\wedge dx_I=\sum_{I,i,j}\frac{\partial^2a_I}{\partial x_i\,\partial x_j}\,dx_i\wedge dx_j\wedge dx_I$$
$$\underset{\text{Schwarz's lemma}}{=}\sum_{I,i,j}\frac{\partial^2a_I}{\partial x_j\,\partial x_i}\,\big(-\,dx_j\wedge dx_i\wedge dx_I\big)=-\,d(d\omega),$$
hence $d(d\omega)=0$.
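The identity $d(d\omega)=0$ rests precisely on the symmetry of second partial derivatives. A symbolic spot check for a sample function (our choice of $f$; illustrative only):

```python
# The coefficient of dx_i ^ dx_j in d(df) is  d2f/dx_i dx_j - d2f/dx_j dx_i,
# which vanishes by Schwarz's lemma.
import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.exp(x * y) * sp.sin(z) + x**3 * z     # sample C^2 function

for u, v in [(x, y), (x, z), (y, z)]:
    coeff = sp.diff(f, u, v) - sp.diff(f, v, u)   # coefficient of du ^ dv
    assert sp.simplify(coeff) == 0
```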
11.2.3 Pull-Back

Definition 11.8 Let $f\colon U\to V$ be a differentiable function with open sets $U\subseteq\mathbb{R}^n$ and $V\subseteq\mathbb{R}^m$. Let $\omega\in\Omega^k(V)$ be a differential $k$-form. We define a differential $k$-form $f^*(\omega)\in\Omega^k(U)$ by
$$(f^*\omega)(p)=(Df(p))^*\big(\omega(f(p))\big),$$
that is,
$$(f^*\omega)(p;h_1,\dots,h_k)=\omega\big(f(p);Df(p)(h_1),\dots,Df(p)(h_k)\big),\qquad p\in U,\ h_1,\dots,h_k\in\mathbb{R}^n.$$
For a $0$-form $g$ this reads $f^*g=g\circ f$.
Proposition 11.4 Let $f$ be as above and $\omega,\eta\in\Omega(V)$. Let $\{dy_1,\dots,dy_m\}$ be the dual basis to the standard basis in $(\mathbb{R}^m)^*$. Then we have, with $f=(f_1,\dots,f_m)$:
(a) $\displaystyle f^*(dy_i)=\sum_{j=1}^{n}\frac{\partial f_i}{\partial x_j}\,dx_j=df_i,\quad i=1,\dots,m.$ (11.8)
(b) $f^*(d\omega)=d(f^*\omega).$ (11.9)
(c) $f^*(a\,\omega)=(a\circ f)\,f^*(\omega),\quad a\in C^\infty(V).$ (11.10)
(d) $f^*(\omega\wedge\eta)=f^*(\omega)\wedge f^*(\eta).$ (11.11)
If $n=m$, then
(e) $\displaystyle f^*(dy_1\wedge\cdots\wedge dy_n)=\frac{\partial(f_1,\dots,f_n)}{\partial(x_1,\dots,x_n)}\,dx_1\wedge\cdots\wedge dx_n.$ (11.12)
Proof. We show (a). Let $h\in\mathbb{R}^n$; by Definition 11.7 and the definition of the derivative we have
$$f^*(dy_i)(h)=dy_i\big(Df(p)(h)\big)=\Big\langle dy_i,\Big(\sum_{j=1}^{n}\frac{\partial f_k(p)}{\partial x_j}h_j\Big)_{k=1,\dots,m}\Big\rangle=\sum_{j=1}^{n}\frac{\partial f_i(p)}{\partial x_j}h_j=\sum_{j=1}^{n}\frac{\partial f_i(p)}{\partial x_j}\,dx_j(h).$$
This shows (a). Equation (11.10) is a special case of (11.11); we prove (d). Let $p\in U$. Using the pull-back formula for $k$-forms we obtain
$$f^*(\omega\wedge\eta)(p)=(Df(p))^*\big((\omega\wedge\eta)(f(p))\big)=(Df(p))^*\big(\omega(f(p))\wedge\eta(f(p))\big)$$
$$=(Df(p))^*\big(\omega(f(p))\big)\wedge(Df(p))^*\big(\eta(f(p))\big)=f^*(\omega)(p)\wedge f^*(\eta)(p).$$
To show (11.9) we start with a $0$-form $g$ and prove that $f^*(dg)=d(f^*g)$ for functions $g\colon V\to\mathbb{R}$. By (11.7) and (11.10) we have
$$f^*(dg)(p)=f^*\Big(\sum_{i=1}^{m}\frac{\partial g}{\partial y_i}\,dy_i\Big)(p)=\sum_{i=1}^{m}\frac{\partial g(f(p))}{\partial y_i}\,f^*(dy_i)=\sum_{i=1}^{m}\frac{\partial g(f(p))}{\partial y_i}\sum_{j=1}^{n}\frac{\partial f_i}{\partial x_j}(p)\,dx_j$$
$$=\sum_{j=1}^{n}\Big(\sum_{i=1}^{m}\frac{\partial g(f(p))}{\partial y_i}\,\frac{\partial f_i(p)}{\partial x_j}\Big)dx_j\underset{\text{chain rule}}{=}\sum_{j=1}^{n}\frac{\partial(g\circ f)}{\partial x_j}(p)\,dx_j=d(f^*g)(p).$$
For a general $k$-form $\omega=\sum_Ia_I\,dx_I$ we get by the Leibniz rule
$$d(f^*\omega)=d\Big(\sum_If^*(a_I)\,f^*(dx_I)\Big)=\sum_Id(f^*a_I)\wedge f^*(dx_I),$$
since $d(f^*(dx_I))=0$; on the other hand, $f^*(d\omega)=\sum_If^*(da_I)\wedge f^*(dx_I)$. By the first part of (b), both expressions coincide. This completes the proof of (b).
We finally prove (e). By (b) and (d) we have
$$f^*(dy_1\wedge\cdots\wedge dy_n)=f^*(dy_1)\wedge\cdots\wedge f^*(dy_n)=\Big(\sum_{i_1=1}^{n}\frac{\partial f_1}{\partial x_{i_1}}\,dx_{i_1}\Big)\wedge\cdots\wedge\Big(\sum_{i_n=1}^{n}\frac{\partial f_n}{\partial x_{i_n}}\,dx_{i_n}\Big)$$
$$=\sum_{i_1,\dots,i_n=1}^{n}\frac{\partial f_1}{\partial x_{i_1}}\cdots\frac{\partial f_n}{\partial x_{i_n}}\,dx_{i_1}\wedge\cdots\wedge dx_{i_n}.$$
Since the square of a $1$-form vanishes, the only non-vanishing terms in the above sum are the permutations $(i_1,\dots,i_n)$ of $(1,\dots,n)$. Using skew-symmetry to write $dx_{i_1}\wedge\cdots\wedge dx_{i_n}$ as a multiple of $dx_1\wedge\cdots\wedge dx_n$, we obtain the sign of the permutation $(i_1,\dots,i_n)$:
$$f^*(dy_1\wedge\cdots\wedge dy_n)=\sum_{I=(i_1,\dots,i_n)\in S_n}\operatorname{sign}(I)\,\frac{\partial f_1}{\partial x_{i_1}}\cdots\frac{\partial f_n}{\partial x_{i_n}}\,dx_1\wedge\cdots\wedge dx_n=\frac{\partial(f_1,\dots,f_n)}{\partial(x_1,\dots,x_n)}\,dx_1\wedge\cdots\wedge dx_n.$$
Example 11.6 (a) Let $f(r,\varphi)=(r\cos\varphi,\,r\sin\varphi)$ be given on $\mathbb{R}^2\setminus(\{0\}\times\mathbb{R})$ and let $\{dr,d\varphi\}$ and $\{dx,dy\}$ be the dual bases to $\{e_r,e_\varphi\}$ and $\{e_1,e_2\}$. We have
$$f^*(x)=r\cos\varphi,\qquad f^*(y)=r\sin\varphi,\qquad f^*(dx\wedge dy)=r\,dr\wedge d\varphi,$$
$$f^*\Big(\frac{-y}{x^2+y^2}\,dx+\frac{x}{x^2+y^2}\,dy\Big)=d\varphi.$$
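Both pull-back claims of Example 11.6 (a) can be verified symbolically (a minimal sketch; the coefficient extraction is ours):

```python
# Polar coordinates f(r, phi) = (r cos phi, r sin phi):
# the Jacobian determinant is r, and the winding form pulls back to dphi.
import sympy as sp

r, phi = sp.symbols('r phi', positive=True)
X = r * sp.cos(phi)
Y = r * sp.sin(phi)

J = sp.Matrix([[sp.diff(X, r), sp.diff(X, phi)],
               [sp.diff(Y, r), sp.diff(Y, phi)]])
assert sp.simplify(J.det()) == r             # f*(dx ^ dy) = r dr ^ dphi

P = (-Y) / (X**2 + Y**2)                     # coefficient of dx in omega
Q = X / (X**2 + Y**2)                        # coefficient of dy in omega
c_dr   = P * sp.diff(X, r)   + Q * sp.diff(Y, r)
c_dphi = P * sp.diff(X, phi) + Q * sp.diff(Y, phi)
assert sp.simplify(c_dr) == 0 and sp.simplify(c_dphi) == 1   # f*(omega) = dphi
```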
(b) Let $k\in\mathbb{N}$, $r\in\{1,\dots,k\}$, and $\alpha\in\mathbb{R}$. Define a mapping $I\colon\mathbb{R}^k\to\mathbb{R}^{k+1}$ and $\omega\in\Omega^k(\mathbb{R}^{k+1})$ by
$$I(x_1,\dots,x_k)=(x_1,\dots,x_{r-1},\alpha,x_r,\dots,x_k),$$
$$\omega(y_1,\dots,y_{k+1})=\sum_{i=1}^{k+1}f_i(y)\,dy_1\wedge\cdots\wedge\widehat{dy_i}\wedge\cdots\wedge dy_{k+1},$$
where $f_i\in C^\infty(\mathbb{R}^{k+1})$ for all $i$; the hat means omission of the factor $dy_i$. Then
$$I^*(\omega)(x)=f_r(x_1,\dots,x_{r-1},\alpha,x_r,\dots,x_k)\,dx_1\wedge\cdots\wedge dx_k.$$
This follows from
$$I^*(dy_i)=dx_i,\quad i=1,\dots,r-1,\qquad I^*(dy_r)=0,\qquad I^*(dy_{i+1})=dx_i,\quad i=r,\dots,k.$$
Recall the winding form
$$\omega=\frac{-y}{x^2+y^2}\,dx+\frac{x}{x^2+y^2}\,dy$$
on $\mathbb{R}^2\setminus\{(0,0)\}$.

Definition 11.10 An open set $U$ is called star-shaped if there exists an $x_0\in U$ such that for all $x\in U$ the segment from $x_0$ to $x$ is in $U$, i.e. $(1-t)x_0+tx\in U$ for all $t\in[0,1]$.
Convex sets $U$ are star-shaped (take any $x_0\in U$); any star-shaped set is connected and simply connected.
Lemma 11.5 Let $U\subseteq\mathbb{R}^n$ be star-shaped with respect to the origin. Let
$$\omega=\sum_{i_1<\cdots<i_k}a_{i_1\cdots i_k}\,dx_{i_1}\wedge\cdots\wedge dx_{i_k}\in\Omega^k(U).$$
Define
$$I(\omega)(x)=\sum_{i_1<\cdots<i_k}\sum_{r=1}^{k}(-1)^{r-1}\Big(\int_0^1t^{k-1}a_{i_1\cdots i_k}(tx)\,dt\Big)x_{i_r}\,dx_{i_1}\wedge\cdots\wedge\widehat{dx_{i_r}}\wedge\cdots\wedge dx_{i_k},\qquad(11.13)$$
where the hat means omission of the factor $dx_{i_r}$. Then we have
$$I(d\omega)+d(I\omega)=\omega.\qquad(11.14)$$
(Without proof.)
Example 11.7 (a) Let $k=1$, $n=3$, and $\omega=a_1\,dx_1+a_2\,dx_2+a_3\,dx_3$. Then
$$I(\omega)=x_1\int_0^1a_1(tx)\,dt+x_2\int_0^1a_2(tx)\,dt+x_3\int_0^1a_3(tx)\,dt.$$
Note that this is exactly the formula for the potential $U(x_1,x_2,x_3)$ from Remark 8.5 (b). Let $(a_1,a_2,a_3)$ be a vector field on $U$ with $d\omega=0$. This is equivalent to $\operatorname{curl}a=0$ by Example 11.5 (c). The above lemma shows $dU=\omega$ for $U=I(\omega)$; this means $\operatorname{grad}U=(a_1,a_2,a_3)$, i.e. $U$ is the potential of the vector field $(a_1,a_2,a_3)$.
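The homotopy formula can be exercised on a concrete gradient field (our sample field $a=\operatorname{grad}(x_1^2x_2+x_3^3)$; illustrative only):

```python
# I(omega) for omega = a1 dx1 + a2 dx2 + a3 dx3 with a = grad(x1^2 x2 + x3^3):
# the formula recovers a potential U with grad U = a.
import sympy as sp

x1, x2, x3, t = sp.symbols('x1 x2 x3 t')
a = [2*x1*x2, x1**2, 3*x3**2]               # a = grad(x1^2 x2 + x3^3)
xs = [x1, x2, x3]
tx = {x1: t*x1, x2: t*x2, x3: t*x3}

U = sum(xi * sp.integrate(ai.subs(tx), (t, 0, 1)) for xi, ai in zip(xs, a))
assert sp.simplify(U - (x1**2 * x2 + x3**3)) == 0
assert all(sp.simplify(sp.diff(U, xi) - ai) == 0 for xi, ai in zip(xs, a))
```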
(b) Let $k=2$, $n=3$, and $\omega=a_1\,dx_2\wedge dx_3+a_2\,dx_3\wedge dx_1+a_3\,dx_1\wedge dx_2$, where $a$ is a $C^1$-vector field on $U$. Then
$$I(\omega)=\Big(x_3\int_0^1ta_2(tx)\,dt-x_2\int_0^1ta_3(tx)\,dt\Big)dx_1+\Big(x_1\int_0^1ta_3(tx)\,dt-x_3\int_0^1ta_1(tx)\,dt\Big)dx_2$$
$$+\Big(x_2\int_0^1ta_1(tx)\,dt-x_1\int_0^1ta_2(tx)\,dt\Big)dx_3.$$
By Example 11.5 (d), $\omega$ is closed if and only if $\operatorname{div}(a)=0$ on $U$. Let $\eta=b_1\,dx_1+b_2\,dx_2+b_3\,dx_3$ such that $d\eta=\omega$. This means $\operatorname{curl}b=a$. The Poincaré lemma shows that $b$ with $\operatorname{curl}b=a$ exists if and only if $\operatorname{div}(a)=0$. Then $b$ is the vector potential to $a$. In case $d\omega=0$ we can choose $\vec b\cdot d\vec x=I(\omega)$.
Theorem 11.6 (Poincaré Lemma) Let $U$ be star-shaped. Then every closed differential form on $U$ is exact.

Proof. Without loss of generality let $U$ be star-shaped with respect to the origin and $d\omega=0$. By Lemma 11.5, $d(I\omega)=\omega$.
… if and only if $U$ has exactly $p$ connected components, $U=U_1\cup\cdots\cup U_p$ (disjoint union). Then the characteristic functions $\chi_{U_i}$, $i=1,\dots,p$, form a basis of the $0$-cycles $C^0(U)$ ($B^0(U)=0$).
A very nice treatment of the topics of this section is [Spi65, Chapter 4]. The set $[0,1]^k=[0,1]\times\cdots\times[0,1]=\{x\in\mathbb{R}^k\mid0\le x_i\le1,\ i=1,\dots,k\}$ is called the $k$-dimensional unit cube. Let $U\subseteq\mathbb{R}^n$ be open.

Definition 11.11 (a) A singular $k$-cube in $U\subseteq\mathbb{R}^n$ is a continuously differentiable mapping $c_k\colon[0,1]^k\to U$.
(b) A singular $k$-chain in $U$ is a formal sum
$$s_k=n_1c_{k,1}+\cdots+n_rc_{k,r}$$
with integer coefficients $n_i$ and singular $k$-cubes $c_{k,i}$ in $U$.
Let $I^k\colon[0,1]^k\to\mathbb{R}^k$ be the identity map, i.e. $I^k(x)=x$, $x\in[0,1]^k$. It is called the standard $k$-cube in $\mathbb{R}^k$. We are going to define the boundary $\partial s_k$ of a singular $k$-chain $s_k$. For $i=1,\dots,k$ define
$$I^k_{(i,0)}(x_1,\dots,x_{k-1})=(x_1,\dots,x_{i-1},0,x_i,\dots,x_{k-1}),\qquad I^k_{(i,1)}(x_1,\dots,x_{k-1})=(x_1,\dots,x_{i-1},1,x_i,\dots,x_{k-1}).$$
The boundary of the standard $k$-cube is the singular $(k-1)$-chain
$$\partial I^k=\sum_{i=1}^{k}(-1)^i\big(I^k_{(i,0)}-I^k_{(i,1)}\big),\qquad(11.15)$$
and for a singular $k$-cube $c_k$ we set
$$\partial c_k=\sum_{i=1}^{k}(-1)^i\big(c_k\circ I^k_{(i,0)}-c_k\circ I^k_{(i,1)}\big).\qquad(11.16)$$
Example 11.8 (a) For the standard $3$-cube,
$$\partial I^3=-I^3_{(1,0)}+I^3_{(1,1)}+I^3_{(2,0)}-I^3_{(2,1)}-I^3_{(3,0)}+I^3_{(3,1)},$$
where
$$-I^3_{(1,0)}(x_1,x_2)=-(0,x_1,x_2),\qquad+I^3_{(1,1)}(x_1,x_2)=+(1,x_1,x_2),$$
$$+I^3_{(2,0)}(x_1,x_2)=+(x_1,0,x_2),\qquad-I^3_{(2,1)}(x_1,x_2)=-(x_1,1,x_2),$$
$$-I^3_{(3,0)}(x_1,x_2)=-(x_1,x_2,0),\qquad+I^3_{(3,1)}(x_1,x_2)=+(x_1,x_2,1).$$
Note: if we take care of the signs in (11.15), all $6$ unit normal vectors $D_1I^k_{(i,j)}\times D_2I^k_{(i,j)}$ to the faces have the orientation of the outer normal with respect to the unit $3$-cube $[0,1]^3$. The above sum $\partial I^3$ is a formal sum of singular $2$-cubes. You are not allowed to add componentwise: $-(0,x_1,x_2)+(1,x_1,x_2)\ne(1,0,0)$.
(b) In case $k=2$ we have
$$\partial I^2=I^2_{(1,1)}-I^2_{(1,0)}+I^2_{(2,0)}-I^2_{(2,1)}.$$
Denoting the four corner points of the square by $E_1,\dots,E_4$, one computes for the second boundary
$$\partial(\partial I^2)=(E_3-E_2)-(E_4-E_1)+(E_2-E_1)-(E_3-E_4)=0.$$
Here we have $\partial c_2=\gamma_1+\gamma_2-\gamma_3-\gamma_4$ for the four edges $\gamma_i$ of a singular $2$-cube $c_2$.
(c) Let $c_2\colon[0,2\pi]\times[0,\pi]\to\mathbb{R}^3\setminus\{(0,0,0)\}$ be the singular $2$-cube
$$c_2(s,t)=(\cos s\,\sin t,\ \sin s\,\sin t,\ \cos t)$$
(a parametrization of the unit sphere). Then
$$\partial c_2=(\cos2\pi\sin x,\sin2\pi\sin x,\cos x)-(\cos0\sin x,\sin0\sin x,\cos x)+(\cos x\sin0,\sin x\sin0,\cos0)-(\cos x\sin\pi,\sin x\sin\pi,\cos\pi)$$
$$=(\sin x,0,\cos x)-(\sin x,0,\cos x)+(0,0,1)-(0,0,-1).$$
Hence the boundary $\partial c_2$ of the singular $2$-cube $c_2$ is a degenerate singular $1$-chain: it consists of two constant curves. We come back to this example.
11.3.2 Integration

Definition 11.12 Let $c_k\colon[0,1]^k\to U\subseteq\mathbb{R}^n$, $\vec x=c_k(t_1,\dots,t_k)$, be a singular $k$-cube and $\omega$ a $k$-form on $U$. Then $(c_k)^*(\omega)$ is a $k$-form on the unit cube $[0,1]^k$. Thus there exists a unique function $f(t)$, $t\in[0,1]^k$, such that
$$(c_k)^*(\omega)=f(t)\,dt_1\wedge\cdots\wedge dt_k.$$
Then
$$\int_{c_k}\omega:=\int_{I^k}(c_k)^*(\omega):=\int_{[0,1]^k}f(t)\,dt$$
is called the integral of $\omega$ over the singular cube $c_k$; on the right there is the $k$-dimensional Riemann integral.
If $s_k=\sum_{i=1}^{r}n_ic_{k,i}$ is a $k$-chain, set
$$\int_{s_k}\omega=\sum_{i=1}^{r}n_i\int_{c_{k,i}}\omega.$$
Example 11.9 (a) $k=1$. Let $c\colon[0,1]\to\mathbb{R}^n$ be an oriented, smooth curve $\Gamma=c([0,1])$. Let $\omega=f_1(x)\,dx_1+\cdots+f_n(x)\,dx_n$ be a $1$-form on $\mathbb{R}^n$; then
$$c^*(\omega)=\big(f_1(c(t))c_1'(t)+\cdots+f_n(c(t))c_n'(t)\big)\,dt$$
is a $1$-form on $[0,1]$ such that
$$\int_c\omega=\int_{[0,1]}c^*\omega=\int_0^1\big(f_1(c(t))c_1'(t)+\cdots+f_n(c(t))c_n'(t)\big)\,dt=\int_\Gamma\vec f\cdot d\vec x.$$
Obviously, $\int_c\omega$ is the line integral of $\vec f$ over $\Gamma$.
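A numeric instance of this formula (our choice of curve and form; a midpoint-rule sketch): the winding form integrated over the unit circle gives $2\pi$.

```python
# Line integral of omega = (-y dx + x dy)/(x^2+y^2) over the unit circle
# c(t) = (cos 2 pi t, sin 2 pi t), t in [0, 1], via the pulled-back 1-form.
import math

def line_integral(n_steps=10_000):
    total = 0.0
    h = 1.0 / n_steps
    for i in range(n_steps):
        t = (i + 0.5) * h                               # midpoint rule
        x, y = math.cos(2*math.pi*t), math.sin(2*math.pi*t)
        dx = -2 * math.pi * math.sin(2*math.pi*t)       # c1'(t)
        dy = 2 * math.pi * math.cos(2*math.pi*t)        # c2'(t)
        r2 = x*x + y*y
        total += ((-y/r2) * dx + (x/r2) * dy) * h
    return total

assert abs(line_integral() - 2*math.pi) < 1e-9
```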
(b) $k=n$. Let $c\colon[0,1]^k\to\mathbb{R}^k$ be continuously differentiable and let $x=c(t)$. Let $\omega=f(x)\,dx_1\wedge\cdots\wedge dx_k$ be a differential $k$-form on $\mathbb{R}^k$. By Proposition 11.4 (e),
$$c^*(\omega)=f(c(t))\,\frac{\partial(c_1,\dots,c_k)}{\partial(t_1,\dots,t_k)}\,dt_1\wedge\cdots\wedge dt_k.$$
Therefore,
$$\int_c\omega=\int_{[0,1]^k}f(c(t))\,\frac{\partial(c_1,\dots,c_k)}{\partial(t_1,\dots,t_k)}\,dt_1\cdots dt_k.\qquad(11.17)$$
We see that $\int_{I^k}\omega$ is an oriented Riemann integral. Note that in formula (11.17) we do not take the absolute value of the Jacobian.
Theorem (Stokes' theorem for chains) Let $\omega$ be a continuously differentiable $k$-form on $U$ and $s_{k+1}$ a singular $(k+1)$-chain in $U$. Then
$$\int_{s_{k+1}}d\omega=\int_{\partial s_{k+1}}\omega.$$
Proof. First let $\omega=\sum_{i=1}^{k+1}f_i(x)\,dx_1\wedge\cdots\wedge\widehat{dx_i}\wedge\cdots\wedge dx_{k+1}$ and integrate over the standard cube. Then
$$d\omega=\sum_{i=1}^{k+1}(-1)^{i+1}\,\frac{\partial f_i(x)}{\partial x_i}\,dx_1\wedge\cdots\wedge dx_{k+1},$$
hence by Example 11.6 (b), Fubini's theorem, and the fundamental theorem of calculus,
$$\int_{I^{k+1}}d\omega=\sum_{i=1}^{k+1}(-1)^{i+1}\int_{[0,1]^{k+1}}\frac{\partial f_i}{\partial x_i}\,dx_1\cdots dx_{k+1}$$
$$=\sum_{i=1}^{k+1}(-1)^{i+1}\int_{[0,1]^k}\Big(\int_0^1\frac{\partial f_i}{\partial x_i}(x_1,\dots,t,\dots,x_{k+1})\,dt\Big)dx_1\cdots\widehat{dx_i}\cdots dx_{k+1}$$
$$=\sum_{i=1}^{k+1}(-1)^{i+1}\Big(\int_{I^{k+1}_{(i,1)}}\omega-\int_{I^{k+1}_{(i,0)}}\omega\Big)=\int_{\partial I^{k+1}}\omega.$$
For a singular $(k+1)$-cube $c_{k+1}$ we use the pull-back and (11.9):
$$\int_{c_{k+1}}d\omega=\int_{I^{k+1}}(c_{k+1})^*(d\omega)=\int_{I^{k+1}}d\big((c_{k+1})^*\omega\big)=\int_{\partial I^{k+1}}(c_{k+1})^*\omega$$
$$=\sum_{i=1}^{k+1}(-1)^i\Big(\int_{c_{k+1}\circ I^{k+1}_{(i,0)}}\omega-\int_{c_{k+1}\circ I^{k+1}_{(i,1)}}\omega\Big)=\int_{\partial c_{k+1}}\omega.$$
Finally, for a chain $s_{k+1}=\sum_in_ic_{k+1,i}$,
$$\int_{s_{k+1}}d\omega=\sum_in_i\int_{c_{k+1,i}}d\omega=\sum_in_i\int_{\partial c_{k+1,i}}\omega=\int_{\partial s_{k+1}}\omega.$$
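The case $k=1$, $n=2$ on the standard square is Green's theorem. The following symbolic spot check (sample coefficients $P$, $Q$ are our choice) uses the boundary orientation of (11.15):

```python
# Green's theorem on [0,1]^2: int (dQ/dx - dP/dy) dx dy = int_{dI^2} (P dx + Q dy).
import sympy as sp

x, y, s = sp.symbols('x y s')
P = x**2 * y + sp.sin(y)     # omega = P dx + Q dy (sample 1-form)
Q = x * y**3

lhs = sp.integrate(sp.diff(Q, x) - sp.diff(P, y), (x, 0, 1), (y, 0, 1))

def edge(cx, cy):
    # integral of omega over a parametrized edge (cx(s), cy(s)), s in [0,1]
    integrand = (P.subs({x: cx, y: cy}) * sp.diff(cx, s)
                 + Q.subs({x: cx, y: cy}) * sp.diff(cy, s))
    return sp.integrate(integrand, (s, 0, 1))

# dI^2 = -[I_(1,0) - I_(1,1)] + [I_(2,0) - I_(2,1)]  per (11.15)
rhs = -(edge(sp.Integer(0), s) - edge(sp.Integer(1), s)) \
      + (edge(s, sp.Integer(0)) - edge(s, sp.Integer(1)))
assert sp.simplify(lhs - rhs) == 0
```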
Remark 11.4 Stokes' theorem is valid for arbitrary oriented compact differentiable $k$-dimensional manifolds $F$ and continuously differentiable $(k-1)$-forms $\omega$ on $F$.
Example 11.10 We come back to Example 11.8 (c). Let $\omega=(x\,dy\wedge dz+y\,dz\wedge dx+z\,dx\wedge dy)/r^3$ be a $2$-form on $\mathbb{R}^3\setminus\{(0,0,0)\}$. It is easy to show that $\omega$ is closed, $d\omega=0$. We compute $\int_{c_2}\omega$. First note that $c_2^*(r^{-3})=1$, $c_2^*(x)=\cos s\sin t$, $c_2^*(y)=\sin s\sin t$, $c_2^*(z)=\cos t$, such that
$$c_2^*(\omega)=-\sin t\,ds\wedge dt,$$
such that
$$\int_{c_2}\omega=\int_{[0,2\pi]\times[0,\pi]}c_2^*(\omega)=\int_{[0,2\pi]\times[0,\pi]}(-\sin t)\,ds\,dt=-4\pi.$$
Stokes' theorem shows that $\omega$ is not exact on $\mathbb{R}^3\setminus\{(0,0,0)\}$. Suppose to the contrary that $\omega=d\eta$ for some $\eta\in\Omega^1(\mathbb{R}^3\setminus\{(0,0,0)\})$. Since by Example 11.8 (c), $\partial c_2$ is a degenerate $1$-chain (it consists of two points), the pull-back $(\partial c_2)^*(\eta)$ is $0$ and so is the integral:
$$0=\int_{I^1}(\partial c_2)^*(\eta)=\int_{\partial c_2}\eta=\int_{c_2}d\eta=\int_{c_2}\omega=-4\pi,$$
a contradiction.
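The pull-back computation of Example 11.10 can be confirmed symbolically (an illustrative sketch; the helper `wedge` extracts the $ds\wedge dt$ coefficient):

```python
# Pull back omega = x dy^dz + y dz^dx + z dx^dy (r = 1 on the sphere)
# under c2(s,t) = (cos s sin t, sin s sin t, cos t).
import sympy as sp

s, t = sp.symbols('s t')
X = sp.cos(s) * sp.sin(t)
Y = sp.sin(s) * sp.sin(t)
Z = sp.cos(t)

def wedge(u, v):
    # coefficient of ds ^ dt in the pull-back of du ^ dv
    return sp.diff(u, s) * sp.diff(v, t) - sp.diff(u, t) * sp.diff(v, s)

pullback = sp.simplify(X * wedge(Y, Z) + Y * wedge(Z, X) + Z * wedge(X, Y))
assert sp.simplify(pullback + sp.sin(t)) == 0          # = -sin t

integral = sp.integrate(pullback, (s, 0, 2*sp.pi), (t, 0, sp.pi))
assert sp.simplify(integral + 4*sp.pi) == 0            # = -4 pi
```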
11.3.5 Applications

The following two applications were not covered by the lecture.

(a) Brouwer's Fixed Point Theorem

Proposition 11.8 (Retraction Theorem) Let $G\subseteq\mathbb{R}^n$ be a compact, connected, simply connected set with smooth boundary $\partial G$. There exists no vector field $f\colon G\to\mathbb{R}^n$, $f_i\in C^2(G)$, $i=1,\dots,n$, such that $f(G)\subseteq\partial G$ and $f(x)=x$ for all $x\in\partial G$.
Proof. Suppose to the contrary that such an $f$ exists; consider $\omega\in\Omega^{n-1}(U)$, $\omega=x_1\,dx_2\wedge dx_3\wedge\cdots\wedge dx_n$. First we show that $f^*(d\omega)=0$. By definition, for $v_1,\dots,v_n\in\mathbb{R}^n$ we have
$$f^*(d\omega)(p)(v_1,\dots,v_n)=d\omega(f(p))\big(Df(p)v_1,Df(p)v_2,\dots,Df(p)v_n\big).$$
Since $\dim f(G)=\dim\partial G=n-1$, the $n$ vectors $Df(p)v_1,\dots,Df(p)v_n$ can be thought of as being $n$ vectors in an $(n-1)$-dimensional linear space; hence they are linearly dependent. Consequently, any alternating $n$-form on those vectors is $0$. Thus $f^*(d\omega)=0$. By Stokes' theorem,
$$\int_{\partial G}f^*(\omega)=\int_Gf^*(d\omega)=0.$$
On the other hand, $f=\operatorname{id}$ on $\partial G$, such that
$$\int_{\partial G}f^*(\omega)=\int_{\partial G}\omega=\int_Gd\omega=\int_Gdx_1\wedge\cdots\wedge dx_n=|G|;$$
a contradiction.
(b) The Fundamental Theorem of Algebra. Let $f(z)=z^n+a_{n-1}z^{n-1}+\cdots+a_0$ be a monic polynomial of degree $n\ge1$. For fixed $R>0$ put $z=z(s)=R\,\mathrm e^{2\pi\mathrm is}$, $s\in[0,1]$, and let
$$c(s,t)=(1-t)f(z(s))+t\,z(s)^n,\qquad b(s,t)=f\big((1-t)z(s)\big)$$
be singular $2$-cubes in $\mathbb{R}^2\cong\mathbb{C}$; write $c_{R,f}(s)=f(z(s))$ and $c_{R,n}(s)=z(s)^n$.

Lemma 11.10 If $|z|=R$ is sufficiently large, then
$$|c(s,t)|\ge\frac{R^n}{2},\qquad s,t\in[0,1].$$
The only fact we need is $c(s,t)\ne0$ for sufficiently large $R$; hence $c$ maps the unit square into $\mathbb{R}^2\setminus\{(0,0)\}$.

Lemma 11.11 Let $\omega=\omega(x,y)=(-y\,dx+x\,dy)/(x^2+y^2)$ be the winding form on $\mathbb{R}^2\setminus\{(0,0)\}$. Then we have
(a) $\partial c=c_{R,f}-c_{R,n}$ and $\partial b=c_{R,f}-f(0)$ (the latter summand a constant curve).
(b) For sufficiently large $R$, $\partial c$, $c_{R,n}$, and $c_{R,f}$ are chains in $\mathbb{R}^2\setminus\{(0,0)\}$ and
$$\int_{c_{R,n}}\omega=\int_{c_{R,f}}\omega=2\pi n.$$
Proof. (a) Note that $z(0)=z(1)=R$. Since $\partial I^2(x)=(x,0)-(x,1)+(1,x)-(0,x)$ we have
$$\partial c(s)=c(s,0)-c(s,1)+c(1,s)-c(0,s)=f(z)-z^n-\big((1-s)f(R)+sR^n\big)+\big((1-s)f(R)+sR^n\big)=f(z)-z^n.$$
(b) By Lemma 11.10, $c$ is a singular $2$-chain in $\mathbb{R}^2\setminus\{(0,0)\}$ for sufficiently large $R$. Hence $\partial c$ is a $1$-chain in $\mathbb{R}^2\setminus\{(0,0)\}$. In particular, both $c_{R,n}$ and $c_{R,f}$ take values in $\mathbb{R}^2\setminus\{(0,0)\}$. Hence $(\partial c)^*(\omega)$ is well-defined. We compute $c_{R,n}^*(\omega)$ using the pull-backs of $dx$ and $dy$:
$$c_{R,n}^*(x^2+y^2)=R^{2n},\qquad c_{R,n}^*(dx)=-2\pi nR^n\sin(2\pi ns)\,ds,\qquad c_{R,n}^*(dy)=2\pi nR^n\cos(2\pi ns)\,ds,$$
$$c_{R,n}^*(\omega)=2\pi n\,ds.$$
Hence
$$\int_{c_{R,n}}\omega=\int_0^12\pi n\,ds=2\pi n.$$
Since $d\omega=0$, Stokes' theorem applied to $c$ gives $0=\int_cd\omega=\int_{\partial c}\omega=\int_{c_{R,f}}\omega-\int_{c_{R,n}}\omega$; hence
$$\int_{c_{R,f}}\omega=\int_{c_{R,n}}\omega=2\pi n.$$
We complete the proof of the fundamental theorem of algebra. Suppose to the contrary that the polynomial $f(z)$ is non-zero in $\mathbb{C}$; then $b(s,t)=f((1-t)z(s))$ as well as $\partial b$ are singular chains in $\mathbb{R}^2\setminus\{(0,0)\}$. By Lemma 11.11 (b) and again by Stokes' theorem we have
$$2\pi n=\int_{c_{R,f}}\omega=\int_{c_{R,f}-f(0)}\omega=\int_{\partial b}\omega=\int_bd\omega=0.$$
But this is a contradiction to Lemma 11.11 (b). Hence, $b$ is not a $2$-chain in $\mathbb{R}^2\setminus\{(0,0)\}$, that is, there exist $s,t\in[0,1]$ such that $b(s,t)=f((1-t)z(s))=0$. We have found that $(1-t)z(s)$ is a zero of $f$.
(The estimate behind Lemma 11.10: since $f(z)-z^n=\sum_{k=0}^{n-1}a_kz^k$, for $|z|=R$ large enough $|f(z)-z^n|\le\sum_{k=0}^{n-1}|a_k|R^k\le R^n/2$, and hence $|c(s,t)|=|z^n+(1-t)(f(z)-z^n)|\ge R^n-R^n/2=R^n/2$.)
Chapter 12
Measure Theory and Integration
12.1 Measure Theory
Citation from Rudin's book, [Rud66, Chapter 1]: Towards the end of the 19th century it became clear to many mathematicians that the Riemann integral should be replaced by some other type of integral, more general and more flexible, better suited for dealing with limit processes. Among the attempts made in this direction, the most notable ones were due to Jordan, Borel, W. H. Young, and Lebesgue. It was Lebesgue's construction which turned out to be the most successful.
In a brief outline, here is the main idea: The Riemann integral of a function $f$ over an interval $[a,b]$ can be approximated by sums of the form
$$\sum_{i=1}^{n}f(t_i)\,m(E_i),$$
where $E_1,\dots,E_n$ are disjoint intervals whose union is $[a,b]$, $m(E_i)$ denotes the length of $E_i$, and $t_i\in E_i$ for $i=1,\dots,n$. Lebesgue discovered that a completely satisfactory theory of integration results if the sets $E_i$ in the above sum are allowed to belong to a larger class of subsets of the line, the so-called measurable sets, and if the class of functions under consideration is enlarged to what we call measurable functions. The crucial set-theoretic properties involved are the following: The union and the intersection of any countable family of measurable sets are measurable; … the notion of length (now called measure) can be extended to them in such a way that
$$m(E_1\cup E_2\cup\cdots)=m(E_1)+m(E_2)+\cdots$$
for any countable collection $\{E_i\}$ of pairwise disjoint measurable sets. This property of $m$ is called countable additivity.
The passage from Riemann's theory of integration to that of Lebesgue is a process of completion. It is of the same fundamental importance in analysis as the construction of the real number system from the rationals.
By De Morgan's rule,
$$\bigcup_nA_n=\Big(\bigcap_nA_n^c\Big)^c.$$
(d) The family $\mathcal P(X)$ of all subsets of $X$ is both an algebra as well as a $\sigma$-algebra.
(e) Any $\sigma$-algebra is an algebra, but there are algebras which are not $\sigma$-algebras.
(f) The family of finite and cofinite subsets (the latter are the complements of finite sets) of an infinite set forms an algebra. Do they form a $\sigma$-algebra?
(b) Elementary Sets and Borel Sets in $\mathbb{R}^n$

Let $\overline{\mathbb{R}}$ be the extended real axis, $\overline{\mathbb{R}}=\mathbb{R}\cup\{+\infty\}\cup\{-\infty\}$. We use the old rules as introduced in Section 3.1.1 at page 79. The new rule, which is used in measure theory only, is
$$0\cdot\infty=\infty\cdot0=0.$$
The set
$$I=\{(x_1,\dots,x_n)\in\mathbb{R}^n\mid a_i\mathrel{\rho_i}x_i\mathrel{\rho_i'}b_i,\ i=1,\dots,n\}$$
is called a rectangle or a box in $\mathbb{R}^n$, where each of the relations $\rho_i,\rho_i'$ stands for $<$ or for $\le$, and $a_i,b_i\in\overline{\mathbb{R}}$. For example, $a_i=-\infty$ and $b_i=+\infty$ yields $I=\mathbb{R}^n$, whereas $a_1=2$, $b_1=1$ yields $I=\emptyset$. A subset of $\mathbb{R}^n$ is called an elementary set if it is the union of a finite number of rectangles in $\mathbb{R}^n$. Let $\mathcal E_n$ denote the set of elementary subsets of $\mathbb{R}^n$:
$$\mathcal E_n=\{I_1\cup I_2\cup\cdots\cup I_r\mid r\in\mathbb{N},\ I_j\text{ is a box in }\mathbb{R}^n\}.$$
Lemma 12.1 $\mathcal E_n$ is an algebra but not a $\sigma$-algebra.

Proof. The complement of a finite interval is the union of two intervals; the complement of an infinite interval is an infinite interval. Hence, the complement of a rectangle in $\mathbb{R}^n$ is a finite union of rectangles. The countable (disjoint) union $M=\bigcup_{n\in\mathbb{N}}[n,n+\frac12]$ is not an elementary set.
Note that any elementary set is the disjoint union of a finite number of rectangles.
Let $B$ be any (nonempty) family of subsets of $X$. Let $\sigma(B)$ denote the intersection of all $\sigma$-algebras containing $B$, i.e. $\sigma(B)=\bigcap_{i\in I}\mathcal A_i$, where $\{\mathcal A_i\mid i\in I\}$ is the family of all $\sigma$-algebras containing $B$.
Since $(a,b)\in I\subseteq J(x_0,y_0)$ for a suitable rectangle $J(x_0,y_0)$ with rational data, we have shown that $M$ is the union of a countable family of sets $J$. Since closed sets are the complements of open sets and complements are again in the $\sigma$-algebra, the assertion follows for closed sets.
Remarks 12.2 (a) We have proved that any open subset $M$ of $\mathbb{R}^n$ is the countable union of rectangles $I\subseteq M$.
(b) The Borel algebra $\mathcal B_n$ is also the $\sigma$-algebra generated by the family of open or closed sets in $\mathbb{R}^n$, $\mathcal B_n=\sigma(G_n)=\sigma(F_n)$. Countable unions and intersections of open or closed sets are in $\mathcal B_n$.
Let us look in more detail at some of the sets in $\sigma(\mathcal E_n)$. Let $G$ and $F$ be the families of all open and closed subsets of $\mathbb{R}^n$, respectively. Let $G_\delta$ be the collection of all intersections of sequences of open sets (from $G$), and let $F_\sigma$ be the collection of all unions of sequences of sets of $F$. One can prove that $F\subseteq G_\delta$ and $G\subseteq F_\sigma$. These inclusions are strict. Since countable intersections and unions of countable intersections and unions are still countable operations, $G_\delta,F_\sigma\subseteq\sigma(\mathcal E_n)$.
For an arbitrary family $S$ of sets let $S_\sigma$ be the collection of all unions of sequences of sets in $S$, and let $S_\delta$ be the collection of all intersections of sequences of sets in $S$. We can iterate the operations represented by $\sigma$ and $\delta$, obtaining from the class $G$ the classes $G_\delta$, $G_{\delta\sigma}$, $G_{\delta\sigma\delta}$, …, and from $F$ the classes $F_\sigma$, $F_{\sigma\delta}$, …. It turns out that we have inclusions
$$G\subseteq G_\delta\subseteq G_{\delta\sigma}\subseteq\cdots\subseteq\sigma(\mathcal E_n),\qquad F\subseteq F_\sigma\subseteq F_{\sigma\delta}\subseteq\cdots\subseteq\sigma(\mathcal E_n).$$
No two of these classes are equal. There are Borel sets that belong to none of them.
A measure is a countably additive function $\mu\colon\mathcal A\to[0,+\infty]$: for every disjoint family $\{A_n\mid n\in\mathbb{N}\}\subseteq\mathcal A$ with $\bigcup_nA_n\in\mathcal A$ we have
$$\mu\Big(\bigcup_{n=1}^{\infty}A_n\Big)=\sum_{n=1}^{\infty}\mu(A_n).$$
We say $\mu$ is finite if $\mu(X)<\infty$. If $\mu(X)=1$, we call $(X,\mathcal A,\mu)$ a probability space. We call $\mu$ $\sigma$-finite if there exist sets $A_n\in\mathcal A$ with $\mu(A_n)<\infty$ and $X=\bigcup_{n=1}^{\infty}A_n$.
Example 12.1 (a) Let $X$ be a set, $x_0\in X$ and $\mathcal A=\mathcal P(X)$. Then
$$\delta_{x_0}(A)=\begin{cases}1,&x_0\in A,\\0,&x_0\notin A,\end{cases}$$
defines a measure, the Dirac measure at $x_0$.
(b) On the same $\sigma$-algebra, the counting measure is
$$\mu(A)=\begin{cases}n,&\text{if }A\text{ has }n\text{ elements},\\+\infty,&\text{if }A\text{ is infinite}.\end{cases}$$
(c) Further,
$$\mu(A)=\begin{cases}0,&\text{if }A\text{ is finite},\\+\infty,&\text{if }A\text{ is infinite},\end{cases}$$
defines an additive set function.
(d) For an increasing, right-continuous function $\varphi\colon\mathbb{R}\to\mathbb{R}$ define on $\mathcal E_1$
$$\mu_\varphi((a,b])=\varphi(b)-\varphi(a),\qquad\mu_\varphi(A)=\sum_{i=1}^{n}\mu_\varphi(I_i)\quad\text{if }A=\bigsqcup_{i=1}^{n}I_i.$$
Conversely, an additive function $\mu$ on $\mathcal E_1$ yields such a $\varphi$ via
$$\varphi(x)=\begin{cases}\mu((0,x]),&x\ge0,\\-\mu((x,0]),&x<0.\end{cases}$$
(e)
$$\mu\Big(\bigcup_{k=1}^{n}A_k\Big)\le\sum_{k=1}^{n}\mu(A_k),$$
if $A_k\in\mathcal A$, $k=1,\dots,n$ ($\mu$ is finitely subadditive).
(f) If $\{A_k\mid k\in\mathbb{N}\}$ is a disjoint family in $\mathcal A$ and $\bigsqcup_{k=1}^{\infty}A_k\in\mathcal A$, then
$$\mu\Big(\bigsqcup_{k=1}^{\infty}A_k\Big)\ge\sum_{k=1}^{\infty}\mu(A_k).$$
Proof. (a) is by induction. (d), (c), and (b) are easy (cf. Homework 34.4).
(e) We can write $\bigcup_{k=1}^{n}A_k$ as the finite disjoint union of $n$ sets of $\mathcal A$:
$$\bigcup_{k=1}^{n}A_k=A_1\sqcup\big(A_2\setminus A_1\big)\sqcup\big(A_3\setminus(A_1\cup A_2)\big)\sqcup\cdots\sqcup\big(A_n\setminus(A_1\cup\cdots\cup A_{n-1})\big).$$
Since $\mu$ is additive and monotone,
$$\mu\Big(\bigcup_{k=1}^{n}A_k\Big)\le\sum_{k=1}^{n}\mu(A_k).$$
(f) For every $n\in\mathbb{N}$, by additivity and monotonicity,
$$\sum_{k=1}^{n}\mu(A_k)=\mu\Big(\bigsqcup_{k=1}^{n}A_k\Big)\le\mu\Big(\bigsqcup_{k=1}^{\infty}A_k\Big);$$
letting $n\to\infty$ proves the claim.
Proposition 12.4 Let $\mu$ be an additive function on the algebra $\mathcal A$. Consider the following statements:
(a) $\mu$ is countably additive.
(b) For any increasing sequence $A_n\subseteq A_{n+1}$, $A_n\in\mathcal A$, with $\bigcup_{n=1}^{\infty}A_n=A\in\mathcal A$ we have $\lim_{n\to\infty}\mu(A_n)=\mu(A)$.
(c) For any decreasing sequence $A_n\supseteq A_{n+1}$, $A_n\in\mathcal A$, $\mu(A_1)<\infty$, with $\bigcap_{n=1}^{\infty}A_n=A\in\mathcal A$ we have $\lim_{n\to\infty}\mu(A_n)=\mu(A)$.
(d) For any decreasing sequence $A_n\in\mathcal A$, $\mu(A_1)<\infty$, with $\bigcap_{n=1}^{\infty}A_n=\emptyset$ we have $\lim_{n\to\infty}\mu(A_n)=0$.
We have (a) $\Longleftrightarrow$ (b) $\Longrightarrow$ (c) $\Longrightarrow$ (d). In case $\mu(X)<\infty$ ($\mu$ is finite), all statements are equivalent.
Proof. (a) $\Longrightarrow$ (b): Put $B_1=A_1$ and $B_k=A_k\setminus A_{k-1}$ for $k\ge2$; the $B_k$ are disjoint with $\bigsqcup_{k=1}^{n}B_k=A_n$ and $\bigsqcup_{k=1}^{\infty}B_k=A$. Hence
$$\mu(A)=\sum_{k=1}^{\infty}\mu(B_k)=\lim_{n\to\infty}\sum_{k=1}^{n}\mu(B_k)=\lim_{n\to\infty}\mu\Big(\bigsqcup_{k=1}^{n}B_k\Big)=\lim_{n\to\infty}\mu(A_n).$$
(b) $\Longrightarrow$ (a): Let $\{A_n\}$ be a family of disjoint sets in $\mathcal A$ with $\bigsqcup_nA_n=A\in\mathcal A$; put $B_k=A_1\cup\cdots\cup A_k$. Then $(B_k)$ is a sequence increasing to $A$. By (b), and since $\mu$ is additive,
$$\mu(B_n)=\mu\Big(\bigsqcup_{k=1}^{n}A_k\Big)=\sum_{k=1}^{n}\mu(A_k)\ \longrightarrow\ \mu(A)=\mu\Big(\bigsqcup_{k=1}^{\infty}A_k\Big).$$
Thus $\sum_{k=1}^{\infty}\mu(A_k)=\mu(A)$, and $\mu$ is countably additive.
Consider now $\mu=\mu_\varphi$ on $\mathcal E_1$ and suppose
$$(a,b]=\bigsqcup_{k=1}^{\infty}(a_k,b_k]$$
with a disjoint family $(a_k,b_k]$ of intervals. By Proposition 12.3 (f) we already know
$$\mu((a,b])\ge\sum_{k=1}^{\infty}\mu((a_k,b_k]).\qquad(12.1)$$
We prove the opposite direction. Let $\varepsilon>0$. Since $\varphi$ is continuous from the right at $a$, there exists $a_0\in(a,b)$ such that $\varphi(a_0)-\varphi(a)<\varepsilon$ and, similarly, for every $k\in\mathbb{N}$ there exists $c_k>b_k$ such that $\varphi(c_k)-\varphi(b_k)<\varepsilon/2^k$. Hence the intervals $(a_k,c_k)$, $k\in\mathbb{N}$, form an open covering of the compact set $[a_0,b]$. By Heine–Borel (Definition 6.14) there exists a finite subcover:
$$[a_0,b]\subseteq\bigcup_{k=1}^{N}(a_k,c_k),\quad\text{hence}\quad(a_0,b]\subseteq\bigcup_{k=1}^{N}(a_k,c_k],$$
and therefore
$$\mu((a_0,b])\le\sum_{k=1}^{N}\mu((a_k,c_k])\le\sum_{k=1}^{N}\Big(\mu((a_k,b_k])+\frac{\varepsilon}{2^k}\Big)\le\sum_{k=1}^{\infty}\mu((a_k,b_k])+\varepsilon.$$
Since $\mu((a,b])\le\mu((a_0,b])+\varepsilon$, it follows that $\mu((a,b])\le\sum_{k=1}^{\infty}\mu((a_k,b_k])+2\varepsilon$; as $\varepsilon>0$ was arbitrary, equality holds in (12.1).
Corollary 12.6 The correspondence $\varphi\mapsto\mu_\varphi$ from Example 12.1 (d) defines a bijection between countably additive functions on $\mathcal E_1$ and the monotonically increasing, right-continuous functions on $\mathbb{R}$ (up to constant functions, i.e. $\varphi$ and $\varphi+c$ define the same additive function).
The extension theory is due to Carathéodory (1914). For a detailed treatment, see [Els02, Section II.4].
Theorem 12.7 (Extension and Uniqueness) Let $\mu$ be a countably additive function on the algebra $\mathcal A$.
(a) There exists an extension of $\mu$ to a measure on the $\sigma$-algebra $\sigma(\mathcal A)$ which coincides with $\mu$ on $\mathcal A$. We denote the measure on $\sigma(\mathcal A)$ also by $\mu$. It is defined as the restriction of the outer measure $\mu^*\colon\mathcal P(X)\to[0,\infty]$,
$$\mu^*(A)=\inf\Big\{\sum_{n=1}^{\infty}\mu(A_n)\ \Big|\ A\subseteq\bigcup_{n=1}^{\infty}A_n,\ A_n\in\mathcal A,\ n\in\mathbb{N}\Big\}.$$
Using the facts from the previous subsection we conclude that for any increasing, right-continuous function $\varphi$ on $\mathbb{R}$ there exists a measure on the $\sigma$-algebra of Borel sets. We call this measure the Lebesgue–Stieltjes measure on $\mathbb{R}$. In case $\varphi(x)=x$ we call it the Lebesgue measure. Extending the Lebesgue content on elementary sets of $\mathbb{R}^n$ to the Borel algebra $\mathcal B_n$, we obtain the $n$-dimensional Lebesgue measure $\lambda_n$ on $\mathbb{R}^n$.

Completeness

A measure $\mu\colon\mathcal A\to\overline{\mathbb{R}}_+$ on a $\sigma$-algebra $\mathcal A$ is said to be complete if $A\in\mathcal A$, $\mu(A)=0$, and $B\subseteq A$ imply $B\in\mathcal A$. It turns out that the Lebesgue measure $\lambda_n$ on the Borel sets of $\mathbb{R}^n$ is not complete. Adjoining to $\mathcal B_n$ the subsets of measure-zero sets, we obtain the $\sigma$-algebra $\mathcal A_n$ of Lebesgue measurable sets:
$$\mathcal A_n=\sigma\big(\mathcal B_n\cup\{X\subseteq\mathbb{R}^n\mid\exists\,B\in\mathcal B_n:\ X\subseteq B,\ \lambda_n(B)=0\}\big).$$
The Lebesgue measure $\lambda_n$ on $\mathcal A_n$ is now complete.
Remarks 12.4 (a) The Lebesgue measure is invariant under the motion group of $\mathbb{R}^n$. More precisely, let $O(n)=\{T\in\mathbb{R}^{n\times n}\mid T^{\top}T=TT^{\top}=E_n\}$ be the group of real orthogonal $n\times n$ matrices (motions); then
$$\lambda_n(T(A))=\lambda_n(A),\qquad A\in\mathcal A_n,\ T\in O(n).$$
(b) $\lambda_n$ is translation invariant, i.e. $\lambda_n(A)=\lambda_n(x+A)$ for all $x\in\mathbb{R}^n$. Moreover, the invariance of $\lambda_n$ under translations uniquely characterizes the Lebesgue measure $\lambda_n$: if $\mu$ is a translation invariant measure on $\mathcal B_n$, then $\mu=c\lambda_n$ for some $c\in\mathbb{R}_+$.
(c) There exist non-measurable subsets of $\mathbb{R}^n$. We construct a subset $E$ of $\mathbb{R}$ that is not Lebesgue measurable.
We write $x\sim y$ if $x-y$ is rational. This is an equivalence relation since $x\sim x$ for all $x\in\mathbb{R}$, $x\sim y$ implies $y\sim x$ for all $x$ and $y$, and $x\sim y$ and $y\sim z$ imply $x\sim z$. Let $E$ be a subset of $(0,1)$ that contains exactly one point in every equivalence class (the assertion that there is such a set $E$ is a direct application of the axiom of choice). We claim that $E$ is not measurable. Let $E+r=\{x+r\mid x\in E\}$. We need the following two properties of $E$:
(a) If $x\in(0,1)$, then $x\in E+r$ for some rational $r\in(-1,1)$.
(b) If $r$ and $s$ are distinct rationals, then $(E+r)\cap(E+s)=\emptyset$.
To prove (a), note that for every $x\in(0,1)$ there exists $y\in E$ with $x\sim y$. If $r=x-y$, then $x=y+r\in E+r$.
To prove (b), suppose that $x\in(E+r)\cap(E+s)$. Then $x=y+r=z+s$ for some $y,z\in E$. Since $y-z=s-r\ne0$, we have $y\sim z$, and $E$ contains two equivalent points, in contradiction to our choice of $E$.
Now assume that $E$ is Lebesgue measurable with $\lambda(E)=\alpha$. Define $S=\bigcup(E+r)$, where the union is over all rational $r\in(-1,1)$. By (b), the sets $E+r$ are pairwise disjoint; since $\lambda$ is translation invariant, $\lambda(E+r)=\lambda(E)=\alpha$ for all $r$. Since $S\subseteq(-1,2)$, $\lambda(S)\le3$. The countable additivity of $\lambda$ now forces $\alpha=0$ and hence $\lambda(S)=0$. But (a) implies $(0,1)\subseteq S$, hence $1\le\lambda(S)$, and we have a contradiction.
(d) Any countable set has Lebesgue measure zero. Indeed, every single point is a box with edges of length $0$; hence $\lambda(\{pt\})=0$. Since $\lambda$ is countably additive,
$$\lambda(\{x_1,x_2,\dots,x_n,\dots\})=\sum_{n=1}^{\infty}\lambda(\{x_n\})=0.$$
The Cantor set $C$ is obtained as $C=\bigcap_nC_n$, where $C_n$ consists of those $x=\sum_{i=1}^{\infty}a_i3^{-i}$ with $a_i\in\{0,2\}$ for $i=1,\dots,n$. Clearly,
$$\lambda(C_{n+1})=\frac23\,\lambda(C_n)=\cdots=\Big(\frac23\Big)^{n}\lambda(C_1)=\Big(\frac23\Big)^{n+1}.$$
By Proposition 12.4 (c), $\lambda(C)=\lim_{n\to\infty}\lambda(C_n)=0$. However, $C$ has the same cardinality as $\{0,2\}^{\mathbb N}\cong\{0,1\}^{\mathbb N}\cong\mathbb{R}$, which is uncountable.
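The construction can be carried out exactly in rational arithmetic (a small illustrative computation, not part of the lecture):

```python
# Middle-thirds construction: after n steps C_n is 2^n intervals of length
# 3^-n, so its measure is (2/3)^n, which tends to 0.
from fractions import Fraction

intervals = [(Fraction(0), Fraction(1))]          # C_0 = [0, 1]
for n in range(1, 11):
    intervals = [iv for (a, b) in intervals
                 for iv in ((a, a + (b - a)/3), (b - (b - a)/3, b))]
    measure = sum(b - a for a, b in intervals)
    assert measure == Fraction(2, 3) ** n         # lambda(C_n) = (2/3)^n
assert len(intervals) == 2 ** 10
```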
Remark 12.5 (a) Let $f,g\colon X\to\mathbb{R}$ be $\mathcal A$-measurable. Then $\{x\mid f(x)>g(x)\}$ and $\{x\mid f(x)=g(x)\}$ are in $\mathcal A$. Proof. Since
$$\{x\mid f(x)<g(x)\}=\bigcup_{q\in\mathbb{Q}}\big(\{x\mid f(x)<q\}\cap\{x\mid q<g(x)\}\big),$$
and all sets $\{f<q\}$ and $\{q<g\}$ on the right are in $\mathcal A$, and on the right there is a countable union, the right-hand side is in $\mathcal A$. A similar argument works for $\{f>g\}$. Note that the sets $\{f\ge g\}$ and $\{f\le g\}$ are the complements of $\{f<g\}$ and $\{f>g\}$, respectively; hence they belong to $\mathcal A$ as well. Finally, $\{f=g\}=\{f\ge g\}\cap\{f\le g\}$.
(b) It is not difficult to see that for any sequence $(a_n)$ of real numbers
$$\varlimsup_{n\to\infty}a_n=\inf_{n\in\mathbb{N}}\sup_{k\ge n}a_k\qquad\text{and}\qquad\varliminf_{n\to\infty}a_n=\sup_{n\in\mathbb{N}}\inf_{k\ge n}a_k.\qquad(12.2)$$
As a consequence we can construct new measurable functions using $\sup$ and $\lim_n$. Let $(f_n)$ be a sequence of $\mathcal A$-measurable real functions on $X$. Then $\sup_nf_n$, $\inf_nf_n$, $\varlimsup_nf_n$, and $\varliminf_nf_n$ are measurable. Since all $f_n$ are measurable, so is $\sup_nf_n$; a similar proof works for $\inf_nf_n$. By (12.2), $\varlimsup_nf_n$ and $\varliminf_nf_n$ are measurable, too.
Let $(X,\mathcal A,\mu)$ be a measure space and $f\colon X\to\mathbb{R}$ arbitrary. Let $f^+=\max\{f,0\}$ and $f^-=\max\{-f,0\}$ denote the positive and negative parts of $f$. We have $f=f^+-f^-$ and $|f|=f^++f^-$; moreover, $f^+,f^-\ge0$.

Corollary 12.10 $f$ is a Borel function if and only if both $f^+$ and $f^-$ are Borel.
For a simple function $f=\sum_{i=1}^{n}c_i\,\chi_{A_i}\in S^+$ define
$$\int_Xf\,d\mu=\sum_{i=1}^{n}c_i\,\mu(A_i).\qquad(12.3)$$
The convention $0\cdot(+\infty)=0$ is used here; it may happen that $c_i=0$ for some $i$ and $\mu(A_i)=+\infty$.
Remarks 12.6 (a) Since $c_i\ge0$ for all $i$, the right-hand side of (12.3) is well-defined in $\overline{\mathbb{R}}$.
(b) Given another presentation of $f$, say $f(x)=\sum_{j=1}^{m}d_j\,\chi_{B_j}$, then $\sum_{j=1}^{m}d_j\,\mu(B_j)$ gives the same value as (12.3).
For every measurable $f\colon X\to[0,+\infty]$ there is a sequence of non-negative simple functions $s_n$ with
(a) $0\le s_1\le s_2\le\cdots\le f$;
(b) $\lim_{n\to\infty}s_n(x)=f(x)$ for every $x\in X$.
For $n\in\mathbb{N}$ and $i=1,\dots,n2^n$ put
$$E_{ni}=f^{-1}\Big(\Big[\frac{i-1}{2^n},\frac{i}{2^n}\Big)\Big)\qquad\text{and}\qquad F_n=f^{-1}([n,+\infty]),$$
and put
$$s_n=\sum_{i=1}^{n2^n}\frac{i-1}{2^n}\,\chi_{E_{ni}}+n\,\chi_{F_n}.$$
Proposition 12.8 shows that $E_{ni}$ and $F_n$ are measurable sets. It is easily seen that the functions $s_n$ satisfy (a). If $x$ is such that $f(x)<+\infty$, then
$$0\le f(x)-s_n(x)\le\frac{1}{2^n}\qquad(12.4)$$
as soon as $n$ is large enough, that is, $x\in E_{ni}$ for some $i$ and not $x\in F_n$. If $f(x)=+\infty$, then $s_n(x)=n$; this proves (b).
From (12.4) it follows that $s_n\to f$ uniformly on $X$ if $f$ is bounded.
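The dyadic approximation can be implemented pointwise; the sketch below (sample function ours) checks inequality (12.4) and the monotonicity of $s_n$ in $n$:

```python
# s_n(x) = (i-1)/2^n on E_ni and n on F_n; pointwise this is
# floor(f(x) 2^n)/2^n capped at n.
import math

def s_n(fx, n):
    if fx >= n:
        return n                               # x in F_n
    return math.floor(fx * 2**n) / 2**n        # x in E_ni

f = lambda x: x * x + 0.3                      # sample measurable f >= 0
grid = [i / 100 for i in range(300)]           # grid on [0, 3)

for n in (1, 2, 3, 8):
    for x in grid:
        if f(x) < n:
            assert 0 <= f(x) - s_n(f(x), n) <= 2**-n    # inequality (12.4)
for x in grid:
    assert s_n(f(x), 3) <= s_n(f(x), 4)                 # s_n increasing in n
```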
Step 2: Positive Measurable Real Functions

Definition 12.6 (Lebesgue Integral) Let $f\colon X\to[0,+\infty]$ be measurable. Let $(s_n)$ be an increasing sequence of non-negative simple functions converging to $f(x)$ for all $x\in X$, $\lim_ns_n(x)=\sup_ns_n(x)=f(x)$. Define
$$\int_Xf\,d\mu=\lim_{n\to\infty}\int_Xs_n\,d\mu=\sup_n\int_Xs_n\,d\mu\qquad(12.5)$$
and call this number in $[0,+\infty]$ the Lebesgue integral of $f$ over $X$ with respect to the measure $\mu$, or the $\mu$-integral of $f$ over $X$.
The definition of the Lebesgue integral does not depend on the special choice of the increasing functions $s_n\nearrow f$. One can equivalently define
$$\int_Xf\,d\mu=\sup\Big\{\int_Xs\,d\mu\ \Big|\ s\le f,\ s\text{ is a simple function}\Big\}.$$
Observe that we apparently have two definitions of $\int_Xf\,d\mu$ if $f$ is a simple function. However, these assign the same value to the integral, since $f$ itself is then the largest simple function $s$ with $s\le f$.
Proposition 12.13 The properties (1) to (4) from Lemma 12.11 hold for any non-negative measurable functions $f,g\colon X\to[0,+\infty]$, $c\in\mathbb{R}_+$.
(Without proof.)
Step 3: Measurable Real Functions

Let $f\colon X\to\overline{\mathbb{R}}$ be measurable and $f^+(x)=\max(f(x),0)$, $f^-(x)=\max(-f(x),0)$. Then $f^+$ and $f^-$ are both positive and measurable. Define
$$\int_Xf\,d\mu=\int_Xf^+\,d\mu-\int_Xf^-\,d\mu$$
if at least one of the integrals on the right is finite. We say that $f$ is $\mu$-integrable if both are finite.
For a complex measurable function $f=u+\mathrm iv$ define
$$\int_Xf\,d\mu=\int_Xu^+\,d\mu-\int_Xu^-\,d\mu+\mathrm i\int_Xv^+\,d\mu-\mathrm i\int_Xv^-\,d\mu.\qquad(12.6)$$
These four functions $u^+$, $u^-$, $v^+$, and $v^-$ are measurable, real, and non-negative. Since $u^+\le|u|\le|f|$ etc., each of these four integrals is finite. Thus, (12.6) defines the integral on the left as a complex number.
We define $\mathcal L^1(X,\mu)$ to be the collection of all complex $\mu$-integrable functions $f$ on $X$. Note that for an integrable function $f$, $\int_Xf\,d\mu$ is a finite number.
Proposition 12.14 Let $f,g\colon X\to\mathbb{C}$ be measurable.
(d) If $f\le g$ on $X$ (for real-valued $f,g$), then $\int_Xf\,d\mu\le\int_Xg\,d\mu$.
In particular, $\int_X\chi_A\,d\mu=\mu(A)$.
Example 12.3 (a) On $\mathbb{Z}$ define $a\sim b$ if $2\mid(a-b)$; $a$ and $b$ are equivalent if both are odd or both are even. There are two equivalence classes, $\overline1=\overline5=2\mathbb{Z}+1$ (the odd numbers) and $\overline0=\overline{100}=2\mathbb{Z}$ (the even numbers).
(b) Let $W\subseteq V$ be a subspace of the linear space $V$. For $x,y\in V$ define $x\sim y$ if $x-y\in W$. This is an equivalence relation; indeed, the relation is reflexive since $x-x=0\in W$, it is symmetric since $x-y\in W$ implies $y-x=-(x-y)\in W$, and it is transitive since $x-y,y-z\in W$ implies that their sum $(x-y)+(y-z)=x-z\in W$, such that $x\sim z$. One has $\overline0=W$ and $\overline x=x+W:=\{x+w\mid w\in W\}$. The set of equivalence classes with respect to this equivalence relation is called the factor space or quotient space of $V$ with respect to $W$ and is denoted by $V/W$. The factor space becomes a linear space if we define $\overline x+\overline y:=\overline{x+y}$ and $\lambda\overline x:=\overline{\lambda x}$, $\lambda\in\mathbb{C}$. Addition is indeed well-defined since $x\sim x'$ and $y\sim y'$, say $x-x'=w_1$, $y-y'=w_2$ with $w_1,w_2\in W$, implies $x+y-(x'+y')=w_1+w_2\in W$, such that $\overline{x+y}=\overline{x'+y'}$.
(c) Similarly as in (a), for $m\in\mathbb{N}$ define the equivalence relation $a\equiv b\pmod m$ if $m\mid(a-b)$. We say $a$ is congruent to $b$ modulo $m$. This defines a partition of the integers into $m$ disjoint equivalence classes $\overline0,\overline1,\dots,\overline{m-1}$, where $\overline r=\{am+r\mid a\in\mathbb{Z}\}$.
(d) Two triangles in the plane are equivalent if
(1) there exists a translation mapping the first one onto the second one;
(2) there exists a rotation around $(0,0)$ mapping the first one onto the second one;
(3) there exists a motion (rotation, translation, reflection, or a composition of these) mapping the first one onto the second one.
Then (1)–(3) define different equivalence relations on triangles, or more generally on subsets of the plane.
(e) Having equal cardinality is an equivalence relation on sets.
If $f=g$ almost everywhere, say off a null set $N$ ($\mu(N)=0$), then
$$\int_X|f-g|\,d\mu=\int_N|f-g|\,d\mu+\int_{X\setminus N}|f-g|\,d\mu\le\mu(N)\cdot(+\infty)+\mu(X\setminus N)\cdot0=0.$$
For $p\ge1$ define
$$\|f\|_p=\Big(\int_X|f|^p\,d\mu\Big)^{1/p}.\qquad(12.7)$$
This number may be finite or $\infty$. In the first case, $|f|^p$ is integrable and we write $f\in\mathcal L^p(X,\mu)$.
Proposition 12.17 Let p, q 1 be given such that 1p + 1q = 1.
(a) Let f, g : X C be measurable functions such that f L p and g L q .
Then f g L 1 and
Z
| f g | d kf kp kgkq (Holder inequality).
(12.8)
(Minkowski inequality).
(12.9)
Idea of proof. Holder follows from Youngs inequality (Proposition 1.31, as in the calssical case
of Holders inequality in Rn , see Proposition 1.32
Minkowskis inequality follows from Holders inequality as in Propostion 1.34
Note that Minkowski implies that f, g L p yields kf + gk < such that f + g L p . In
particular, L p is a linaer space.
Let us check the properties of kkp . For all measurable f, g we have
kf kp 0,
kf kp = | | kf kp ,
kf + gk kf k + kgk .
All properties of a norm (see Definition 6.9 on page 179) are satisfied except for definiteness: $\|f\|_p = 0$ implies $\int_X | f |^p\, d\mu = 0$, which by Proposition 12.16 implies $| f |^p = 0$ a.e., that is, $f = 0$ a.e. However, it does not imply $f = 0$. To overcome this problem, we use the equivalence relation $f = g$ a.e. and consider from now on only equivalence classes of functions in $\mathscr{L}^p$; that is, we identify functions $f$ and $g$ which are equal a.e.
The space $N = \{ f : X \to \mathbb{C} \mid f \text{ is measurable and } f = 0 \text{ a.e.} \}$ is a linear subspace of $\mathscr{L}^p(X, \mu)$ for all $p$, and $f = g$ a.e. if and only if $f - g \in N$. Then the factor space $\mathscr{L}^p/N$ (see Example 12.3 (b)) is again a linear space.
Definition 12.11 Let $(X, \mathfrak{A}, \mu)$ be a measure space. $L^p(X, \mu)$ denotes the set of equivalence classes of functions of $\mathscr{L}^p(X, \mu)$ with respect to the equivalence relation $f = g$ a.e.; that is,
\[
L^p(X, \mu) = \mathscr{L}^p(X, \mu)/N
\]
is the quotient space. $(L^p(X, \mu), \|\cdot\|_p)$ is a normed space. With this norm $L^p(X, \mu)$ is complete.
\[
\lim_{n\to\infty} \int_X f_n\, d\mu = \int_X f\, d\mu = \int_X \lim_{n\to\infty} f_n\, d\mu.
\]
Corollary 12.19 Let $f_n : X \to [0, \infty]$ be measurable for all $n$ and put $f(x) = \sum_{n=1}^{\infty} f_n(x)$ for $x \in X$. Then
\[
\int_X \sum_{n=1}^{\infty} f_n\, d\mu = \sum_{n=1}^{\infty} \int_X f_n\, d\mu.
\]
Example 12.5 (a) Let $X = \mathbb{N}$, $\mathfrak{A} = \mathcal{P}(\mathbb{N})$ the $\sigma$-algebra of all subsets, and $\mu$ the counting measure on $\mathbb{N}$. The functions on $\mathbb{N}$ can be identified with the sequences $(x_n)$, $f(n) = x_n$. Trivially, any function is $\mathfrak{A}$-measurable.
What is $\int_{\mathbb{N}} f\, d\mu$? First, let $f \ge 0$. For the simple function $g_n = x_n \chi_{\{n\}}$ we obtain $\int g_n\, d\mu = x_n\, \mu(\{n\}) = x_n$. Note that $f = \sum_{n=1}^{\infty} g_n$ and $g_n \ge 0$ since $x_n \ge 0$. By Corollary 12.19,
\[
\int_{\mathbb{N}} f\, d\mu = \sum_{n=1}^{\infty} \int_{\mathbb{N}} g_n\, d\mu = \sum_{n=1}^{\infty} x_n.
\]
Now, let $f$ be arbitrary integrable, i.e. $\int_{\mathbb{N}} | f |\, d\mu < \infty$; thus $\sum_{n=1}^{\infty} | x_n | < \infty$. Therefore, $(x_n) \in \mathscr{L}^1(\mathbb{N}, \mu)$ if and only if $\sum x_n$ converges absolutely. The space of absolutely convergent series is denoted by $\ell^1$ or $\ell^1(\mathbb{N})$.
(b) Let $a_{mn} \ge 0$ for all $n, m \in \mathbb{N}$. Then
\[
\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn}.
\]
Proof. Consider the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ from (a). For $n \in \mathbb{N}$ define functions $f_n(m) = a_{mn}$ and put $f(m) = \sum_{n=1}^{\infty} f_n(m)$. By Corollary 12.19 we then have
\[
\int_{\mathbb{N}} \sum_{n=1}^{\infty} f_n(m)\, d\mu = \sum_{n=1}^{\infty} \int_{\mathbb{N}} f_n(m)\, d\mu = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn},
\]
while by (a),
\[
\int_{\mathbb{N}} f\, d\mu = \sum_{m=1}^{\infty} f(m) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn}.
\]
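The interchange of the two summations can also be watched numerically; the array entries below are an arbitrary nonnegative example, not taken from the text.

```python
# Interchanging the order of summation for nonnegative a_{mn}
# (finite truncation of the double series from Example 12.5 (b)).
N = 50
a = [[1.0 / ((m + 1) ** 2 * (n + 1) ** 2) for n in range(N)] for m in range(N)]

row_first = sum(sum(a[m][n] for n in range(N)) for m in range(N))
col_first = sum(sum(a[m][n] for m in range(N)) for n in range(N))
```

Both orders of summation give the same value, as the corollary guarantees for nonnegative terms.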
Let $f \ge 0$ be measurable. Then
\[
\nu(A) := \int_A f\, d\mu, \qquad A \in \mathfrak{A},
\]
defines a measure on $\mathfrak{A}$.
Proof. Since $f \ge 0$, $\nu(A) \ge 0$ for all $A \in \mathfrak{A}$. Let $(A_n)$ be a countable disjoint family of measurable sets $A_n \in \mathfrak{A}$ and let $A = \bigcup_{n=1}^{\infty} A_n$. By homework 40.1, $\chi_A = \sum_{n=1}^{\infty} \chi_{A_n}$ and therefore
\[
\nu(A) = \int_A f\, d\mu = \int_X \chi_A f\, d\mu = \int_X \sum_{n=1}^{\infty} \chi_{A_n} f\, d\mu
\overset{\text{B.\,Levi}}{=} \sum_{n=1}^{\infty} \int_X \chi_{A_n} f\, d\mu = \sum_{n=1}^{\infty} \int_{A_n} f\, d\mu = \sum_{n=1}^{\infty} \nu(A_n).
\]
Theorem (Lebesgue's Dominated Convergence Theorem). Let $f_n : X \to \mathbb{R}$ or $\mathbb{C}$ be measurable functions with $f_n \to f$ a.e. on $X$, and suppose there is an integrable function $g$ with $| f_n | \le g$ a.e. on $X$ for all $n$. Then $f$ is measurable and integrable, $\int_X | f |\, d\mu < \infty$, and
\[
\lim_{n\to\infty} \int_X f_n\, d\mu = \int_X f\, d\mu = \int_X \lim_{n\to\infty} f_n\, d\mu,
\qquad
\lim_{n\to\infty} \int_X | f_n - f |\, d\mu = 0. \tag{12.10}
\]
Note that (12.10) shows that $(f_n)$ converges to $f$ in the normed space $L^1(X, \mu)$.
Example 12.6 (a) Let $A_n \in \mathfrak{A}$, $n \in \mathbb{N}$, $A_1 \subseteq A_2 \subseteq \cdots$ be an increasing sequence with $\bigcup_{n=1}^{\infty} A_n = A$. If $f \in \mathscr{L}^1(A, \mu)$, then $f \in \mathscr{L}^1(A_n, \mu)$ for all $n$ and
\[
\lim_{n\to\infty} \int_{A_n} f\, d\mu = \int_A f\, d\mu. \tag{12.11}
\]
However, if we do not assume $f \in \mathscr{L}^1(A, \mu)$, the statement is not true (see Remark 12.7 below).
Exhaustion theorem. Let $(A_n)$ be an increasing sequence of measurable sets and $A = \bigcup_{n=1}^{\infty} A_n$. Suppose that $f$ is measurable and $\big( \int_{A_n} | f |\, d\mu \big)$ is a bounded sequence. Then $f \in \mathscr{L}^1(A, \mu)$ and (12.11) holds.
(b) Let $f_n(x) = (-1)^n x^n$ on $[0,1]$. The sequence is dominated by the integrable function $1 \ge | f_n(x) |$ for all $x \in [0,1]$, and $f_n \to 0$ a.e. Hence
\[
\lim_{n\to\infty} \int_{[0,1]} f_n\, d\mu = 0 = \int_{[0,1]} \lim_{n\to\infty} f_n\, d\mu.
\]
Then the function
\[
g(t) = \int_{\mathbb{R}^m} f(x, t)\, dx
\]
is continuous at $t_0$.
Proof. First we note that for any fixed $t \in U$, the function $f_t(x) = f(x, t)$ is integrable on $\mathbb{R}^m$ since it is dominated by the integrable function $F$. We have to show that for any sequence $t_j \to t_0$, $t_j \in U$, $g(t_j)$ tends to $g(t_0)$ as $j \to \infty$. We set $f_j(x) = f(x, t_j)$ and $f_0(x) = f(x, t_0)$ for all $j \in \mathbb{N}$. By (b) we have
\[
f_0(x) = \lim_{j\to\infty} f_j(x) \qquad \text{for a.e. } x \in \mathbb{R}^m.
\]
By (a) and (c), the assumptions of the dominated convergence theorem are satisfied, and thus
\[
\lim_{j\to\infty} g(t_j) = \lim_{j\to\infty} \int_{\mathbb{R}^m} f_j(x)\, dx = \int_{\mathbb{R}^m} \lim_{j\to\infty} f_j(x)\, dx = \int_{\mathbb{R}^m} f_0(x)\, dx = g(t_0).
\]
Proposition 12.23 (Differentiation under the Integral Sign) Let $I \subseteq \mathbb{R}$ be an open interval and $f : \mathbb{R}^m \times I \to \mathbb{R}$ be a function such that the partial derivative $\frac{\partial f}{\partial t}$ exists and is dominated by an integrable function. Then
\[
g'(t) = \int_{\mathbb{R}^m} \frac{\partial f}{\partial t}(x, t)\, dx.
\]
The proof uses the previous theorem about the continuity of the parametric integral. A detailed proof can be found in [Kon90, p. 283].
(b) Let $K \subseteq \mathbb{R}^3$ be a compact subset and $\rho : K \to \mathbb{R}$ integrable; the Newton potential (with mass density $\rho$) is given by
\[
u(t) = \int_K \frac{\rho(x)}{\|x - t\|}\, dx, \qquad t \notin K.
\]
Then $u(t)$ is a harmonic function on $\mathbb{R}^3 \setminus K$.
Similarly, if $K \subseteq \mathbb{R}^2$ is compact and $\rho \in \mathscr{L}^1(K)$, the Newton potential is given by
\[
u(t) = \int_K \rho(x) \log \|x - t\|\, dx, \qquad t \notin K.
\]
converges (see Example 5.11); however, the Lebesgue integral does not exist since the integral does not converge absolutely. Indeed, for integers $n \ge 1$ we have with some $c > 0$
\[
\int_{n\pi}^{(n+1)\pi} \frac{| \sin x |}{x}\, dx \ge \frac{1}{(n+1)\pi} \int_{n\pi}^{(n+1)\pi} | \sin x |\, dx = \frac{2}{(n+1)\pi} \ge \frac{c}{n+1};
\]
hence
\[
\int_{\pi}^{(n+1)\pi} \frac{| \sin x |}{x}\, dx \ge c \sum_{k=1}^{n} \frac{1}{k+1}.
\]
Since the harmonic series diverges, so does the integral $\int_1^{\infty} \frac{| \sin x |}{x}\, dx$.
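This divergence can be observed numerically; the midpoint-rule quadrature below is a rough illustration, not part of the original notes.

```python
import math

# Each piece int_{n pi}^{(n+1) pi} |sin x| / x dx dominates 2 / ((n+1) pi),
# so the partial integrals of |sin x| / x grow like the harmonic series.
def integral_abs_sinc(n, steps=2000):
    """Midpoint rule for the integral of |sin x|/x over [n*pi, (n+1)*pi]."""
    a, b = n * math.pi, (n + 1) * math.pi
    h = (b - a) / steps
    return sum(abs(math.sin(a + (k + 0.5) * h)) / (a + (k + 0.5) * h)
               for k in range(steps)) * h

pieces = [integral_abs_sinc(n) for n in range(1, 51)]
lower_bounds = [2.0 / ((n + 1) * math.pi) for n in range(1, 51)]
```

Summing `pieces` over ever more intervals grows without bound, mirroring the divergent harmonic lower bound.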
\[
\int_{X_1 \times X_2} f\, d(\mu_1 \otimes \mu_2)
= \int_{X_2} \Big( \int_{X_1} f(x_1, x_2)\, d\mu_1 \Big) d\mu_2
= \int_{X_1} \Big( \int_{X_2} f(x_1, x_2)\, d\mu_2 \Big) d\mu_1.
\]
Here $\mathfrak{A}_1 \otimes \mathfrak{A}_2$ denotes the smallest $\sigma$-algebra over $X_1 \times X_2$ which contains all sets $A \times B$, $A \in \mathfrak{A}_1$ and $B \in \mathfrak{A}_2$. Define $\mu(A \times B) = \mu_1(A)\,\mu_2(B)$ and extend $\mu$ to a measure $\mu_1 \otimes \mu_2$ on $\mathfrak{A}_1 \otimes \mathfrak{A}_2$.
Remark 12.8 In (a), as in Levi's theorem, we don't need any assumption on $f$ to change the order of integration since $f \ge 0$. In (b), $f$ is an arbitrary measurable function on $X_1 \times X_2$; however, the integral $\int_{X_1\times X_2} | f |\, d(\mu_1 \otimes \mu_2)$ needs to be finite.
Chapter 13
Hilbert Space
Functional analysis is a fruitful interplay between linear algebra and analysis. One defines function spaces with certain properties and certain topologies and considers linear operators between such spaces. The friendliest examples of such spaces are Hilbert spaces.
This chapter is divided into two parts: the first describes the geometry of a Hilbert space, the second is concerned with linear operators on the Hilbert space.
\[
\langle y, \lambda_1 x_1 + \lambda_2 x_2 \rangle = \bar\lambda_1 \langle y, x_1 \rangle + \bar\lambda_2 \langle y, x_2 \rangle.
\]
A form on $E \times E$ satisfying (a) and (d) is called a sesquilinear form. (a) implies $\langle 0, y \rangle = 0$ for all $y \in E$. The mapping $x \mapsto \langle x, y \rangle$ is a linear mapping into $\mathbb{K}$ (a linear functional) for all $y \in E$.
By (c), we may define $\|x\|$, the norm of the vector $x \in E$, to be the square root of $\langle x, x \rangle$; thus
\[
\|x\|^2 = \langle x, x \rangle. \tag{13.1}
\]
For $x, y \in E$,
\[
\|x + y\|^2 \le \|x\|^2 + \|y\|^2 + 2\, | \langle x, y \rangle |.
\]
means that the sequence $(\|x_n - x\|)$ of non-negative real numbers tends to 0. Recall from Definition 6.8 that a metric space is said to be complete if every Cauchy sequence converges.
Definition 13.2 A complete unitary space is called a Hilbert space.
Example 13.1 Let K = C.
(a) $E = \mathbb{C}^n$, $x = (x_1, \dots, x_n) \in \mathbb{C}^n$, $y = (y_1, \dots, y_n) \in \mathbb{C}^n$. Then
\[
\langle x, y \rangle = \sum_{k=1}^{n} x_k \bar y_k
\]
defines an inner product, with the euclidean norm $\|x\| = \big( \sum_{k=1}^{n} | x_k |^2 \big)^{\frac12}$. $(\mathbb{C}^n, \langle\cdot,\cdot\rangle)$ is a Hilbert space.
(b) $E = L^2(X, \mu)$ is a Hilbert space with the inner product $\langle f, g \rangle = \int_X f \bar g\, d\mu$.
By Proposition 12.17 with $p = q = 2$ we obtain the Cauchy–Schwarz inequality
\[
\Big| \int_X f \bar g\, d\mu \Big| \le \Big( \int_X | f |^2\, d\mu \Big)^{\frac12} \Big( \int_X | g |^2\, d\mu \Big)^{\frac12}.
\]
Using the CSI one can prove Minkowski's inequality; that is, $f, g \in L^2(X, \mu)$ implies $f + g \in L^2(X, \mu)$. Also, $\langle f, g \rangle$ is a finite complex number, since $f \bar g \in L^1(X, \mu)$.
Note that the inner product is positive definite since $\int_X | f |^2\, d\mu = 0$ implies (by Proposition 12.16) $| f | = 0$ a.e. and therefore $f = 0$ in $L^2(X, \mu)$. The proof of the completeness of $L^2(X, \mu)$ is more complicated; we skip it.
(c) $E = \ell^2$, i.e.
\[
\ell^2 = \Big\{ (x_n) \,\Big|\, x_n \in \mathbb{C},\ n \in \mathbb{N},\ \sum_{n=1}^{\infty} | x_n |^2 < \infty \Big\}.
\]
For finitely many terms the Cauchy–Schwarz inequality in $\mathbb{C}^k$ gives
\[
\Big| \sum_{n=1}^{k} x_n \bar y_n \Big|^2 \le \sum_{n=1}^{k} | x_n |^2\, \sum_{n=1}^{k} | y_n |^2 \le \sum_{n=1}^{\infty} | x_n |^2\, \sum_{n=1}^{\infty} | y_n |^2;
\]
hence
\[
\langle (x_n), (y_n) \rangle = \sum_{n=1}^{\infty} x_n \bar y_n
\]
converges absolutely and defines an inner product on $\ell^2$.
Definition 13.3 Let $H$ be a unitary space. We call $x$ and $y$ orthogonal to each other, and write $x \perp y$, if $\langle x, y \rangle = 0$. Two subsets $M, N \subseteq H$ are called orthogonal to each other if $x \perp y$ for all $x \in M$ and $y \in N$.
For a subset $M \subseteq H$ define the orthogonal complement $M^{\perp}$ of $M$ to be the set
\[
M^{\perp} = \{ x \in H \mid \langle x, m \rangle = 0 \ \text{for all } m \in M \}.
\]
For example, $E = \mathbb{R}^n$ with the standard inner product and $v = (v_1, \dots, v_n) \in \mathbb{R}^n$, $v \ne 0$, yields
\[
\{v\}^{\perp} = \Big\{ x \in \mathbb{R}^n \,\Big|\, \sum_{k=1}^{n} x_k v_k = 0 \Big\}.
\]
Proposition 13.5 (a) If $(E, \langle\cdot,\cdot\rangle)$ is a unitary space, then the parallelogram identity
\[
\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2, \qquad x, y \in E, \tag{13.2}
\]
is satisfied.
(b) If (13.2) is satisfied, the inner product $\langle\cdot,\cdot\rangle$ is given by (13.3) in the real case $\mathbb{K} = \mathbb{R}$ and by (13.4) in the complex case $\mathbb{K} = \mathbb{C}$:
\[
\langle x, y \rangle = \tfrac14 \big( \|x + y\|^2 - \|x - y\|^2 \big), \quad \text{if } \mathbb{K} = \mathbb{R}, \tag{13.3}
\]
\[
\langle x, y \rangle = \tfrac14 \big( \|x + y\|^2 - \|x - y\|^2 + i\, \|x + iy\|^2 - i\, \|x - iy\|^2 \big), \quad \text{if } \mathbb{K} = \mathbb{C}. \tag{13.4}
\]
These equations are called polarization identities.
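Both (13.2) and (13.4) can be checked numerically for the standard inner product on $\mathbb{C}^n$; the vectors below are random illustrative data, not from the text.

```python
import random

# Check the parallelogram law (13.2) and the complex polarization
# identity (13.4) for <x, y> = sum x_k * conj(y_k) on C^5.
random.seed(0)
n = 5
x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
y = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]

def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def nrm2(u):                       # squared norm ||u||^2
    return inner(u, u).real

def add(u, v, s=1):                # u + s*v
    return [a + s * b for a, b in zip(u, v)]

parallelogram_defect = abs(nrm2(add(x, y)) + nrm2(add(x, y, -1))
                           - 2 * nrm2(x) - 2 * nrm2(y))
pol = 0.25 * (nrm2(add(x, y)) - nrm2(add(x, y, -1))
              + 1j * nrm2(add(x, [1j * b for b in y]))
              - 1j * nrm2(add(x, [1j * b for b in y], -1)))
polarization_defect = abs(pol - inner(x, y))
```

Both defects vanish up to rounding error, confirming that the norm determines the inner product.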
335
Proof. We check the parallelogram identity and the polarization identity in the real case $\mathbb{K} = \mathbb{R}$.
\begin{align*}
\|x+y\|^2 + \|x-y\|^2 &= \langle x+y, x+y \rangle + \langle x-y, x-y \rangle\\
&= \langle x,x \rangle + \langle y,x \rangle + \langle x,y \rangle + \langle y,y \rangle + \big( \langle x,x \rangle - \langle y,x \rangle - \langle x,y \rangle + \langle y,y \rangle \big)\\
&= 2\|x\|^2 + 2\|y\|^2.
\end{align*}
Further,
\[
\tfrac14 \big( \|x+y\|^2 - \|x-y\|^2 \big) = \tfrac14 \big( 2\langle x,y \rangle + 2\langle x,y \rangle \big) = \langle x, y \rangle.
\]
The proof that the parallelogram identity is sufficient for E being a unitary space is in the
appendix to this section.
Example 13.2 We show that $L^1([0,2])$ with $\|f\|_1 = \int_0^2 | f |\, dx$ is not an inner product norm. Indeed, let $f = \chi_{[1,2]}$ and $g = \chi_{[0,1]}$. Then $f + g = 1$ and $| f - g | = 1$ on $[0,2]$, and $\|f\|_1 = \|g\|_1 = \int_0^1 dx = 1$, such that
\[
\|f+g\|_1^2 + \|f-g\|_1^2 = 2^2 + 2^2 = 8 \ne 4 = 2\big( \|f\|_1^2 + \|g\|_1^2 \big).
\]
The parallelogram identity is not satisfied for $\|\cdot\|_1$, so that $L^1([0,2])$ is not an inner product space.
(a) Riesz's First Theorem
Problem. Let $H_1$ be a closed linear subspace of $H$. Does there exist another closed linear subspace $H_2$ such that $H = H_1 \oplus H_2$?
Answer: yes.
Lemma 13.6 (Minimal Distance Lemma) Let $C$ be a convex and closed subset of the Hilbert space $H$. For $x \in H$ let
\[
\delta(x) = \inf\{ \|x - y\| \mid y \in C \}.
\]
Then there exists a unique $c \in C$ such that $\delta(x) = \|x - c\|$.
[Figure: a point $x$ outside the convex closed set $C$ and its unique nearest point $c \in C$.]
Proof. Existence. Since $\delta(x)$ is an infimum, there exists a sequence $(y_n)$, $y_n \in C$, which approximates the infimum: $\lim_{n\to\infty} \|x - y_n\| = \delta(x)$. We will show that $(y_n)$ is a Cauchy sequence. By the parallelogram law (see Proposition 13.5) we have
\[
\|y_n - y_m\|^2 = \|(y_n - x) + (x - y_m)\|^2 = 2\|y_n - x\|^2 + 2\|x - y_m\|^2 - 4 \Big\| \frac{y_n + y_m}{2} - x \Big\|^2
\le 2\|y_n - x\|^2 + 2\|x - y_m\|^2 - 4\delta(x)^2,
\]
where we used that $\frac{y_n + y_m}{2} \in C$ by convexity, so that $\|\frac{y_n + y_m}{2} - x\| \ge \delta(x)$. By the choice of $(y_n)$, the first two summands tend to $2\delta(x)^2$ each as $m, n \to \infty$. Thus,
\[
\limsup_{m,n\to\infty} \|y_n - y_m\|^2 \le 2\big( \delta(x)^2 + \delta(x)^2 \big) - 4\delta(x)^2 = 0,
\]
hence $(y_n)$ is a Cauchy sequence. Since $H$ is complete, there exists an element $c \in H$ such that $\lim_{n\to\infty} y_n = c$. Since $y_n \in C$ and $C$ is closed, $c \in C$. By construction, we have $\|y_n - x\| \to \delta(x)$. On the other hand, since $y_n \to c$ and the norm is continuous (see homework 42.1 (b)), we have
\[
\|y_n - x\| \to \|c - x\|.
\]
This implies $\delta(x) = \|c - x\|$.
Uniqueness. Let $c, c'$ be two such elements. Then, by the parallelogram law and the convexity of $C$,
\[
0 \le \|c - c'\|^2 = \|(c - x) + (x - c')\|^2 = 2\|c - x\|^2 + 2\|x - c'\|^2 - 4 \Big\| \frac{c + c'}{2} - x \Big\|^2
\le 2\big( \delta(x)^2 + \delta(x)^2 \big) - 4\delta(x)^2 = 0.
\]
Theorem 13.7 (Riesz's First Theorem) Let $H_1$ be a closed linear subspace of the Hilbert space $H$. Then every $x \in H$ has a unique representation
\[
x = x_1 + x_2, \qquad x_1 \in H_1,\ x_2 \in H_1^{\perp};
\]
that is, $H = H_1 \oplus H_1^{\perp}$.
[Figure: the orthogonal decomposition of $x$ into $x_1 \in H_1$ and $x_2 \in H_2 = H_1^{\perp}$.]
Proof. Existence. Apply Lemma 13.6 to the convex, closed set $H_1$. There exists a unique $x_1 \in H_1$ such that
\[
\delta(x) = \inf\{ \|x - y\| \mid y \in H_1 \} = \|x - x_1\| \le \|x - x_1 - t y_1\|
\]
for all $t \in \mathbb{K}$ and $y_1 \in H_1$. Homework 42.2 (c) now implies $x_2 = x - x_1 \perp y_1$ for all $y_1 \in H_1$. Hence $x_2 \in H_1^{\perp}$. Therefore $x = x_1 + x_2$, and the existence of such a representation is shown.
Uniqueness. Suppose that $x = x_1 + x_2 = x_1' + x_2'$ are two possibilities to write $x$ as a sum of elements $x_1, x_1' \in H_1$ and $x_2, x_2' \in H_1^{\perp}$. Then
\[
x_1 - x_1' = x_2' - x_2 =: u
\]
belongs to both $H_1$ and $H_1^{\perp}$ (by linearity of $H_1$ and $H_1^{\perp}$). Hence $\langle u, u \rangle = 0$, which implies $u = 0$. That is, $x_1 = x_1'$ and $x_2 = x_2'$.
Let x = x1 + x2 be as above. Then the mappings P1 (x) = x1 and P2 (x) = x2 are well-defined
on H. They are called orthogonal projections of H onto H1 and H2 , respectively. We will
consider projections in more detail later.
Example 13.3 Let $H$ be a Hilbert space, $z \in H$, $z \ne 0$, and $H_1 = \mathbb{K} z$ the one-dimensional linear subspace spanned by the single vector $z$. Since any finite dimensional subspace is closed, Riesz's first theorem applies. We want to compute the projections of $x \in H$ with respect to $H_1$ and $H_1^{\perp}$. Let $x_1 = \lambda z$; we have to determine $\lambda$ such that $\langle x - x_1, z \rangle = 0$, that is,
\[
\langle x - \lambda z, z \rangle = \langle x, z \rangle - \lambda \langle z, z \rangle = 0.
\]
Hence,
\[
\lambda = \frac{\langle x, z \rangle}{\langle z, z \rangle} = \frac{\langle x, z \rangle}{\|z\|^2}.
\]
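A small computation in $\mathbb{C}^3$ illustrates the projection formula; the vectors are arbitrary illustrative choices, not from the text.

```python
# Projection of x onto the line spanned by z (Example 13.3):
# x1 = lam * z with lam = <x, z> / ||z||^2, and x2 = x - x1 is orthogonal to z.
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

x = [2 + 1j, -1 + 0j, 0 + 3j]
z = [1 + 0j, 1 + 1j, 0 - 1j]

lam = inner(x, z) / inner(z, z)
x1 = [lam * c for c in z]
x2 = [a - b for a, b in zip(x, x1)]

residual = abs(inner(x2, z))       # <x - x1, z> should vanish
cross = abs(inner(x1, x2))         # x1 and x2 should be orthogonal
```

The vanishing residual confirms $x_2 \in H_1^{\perp}$, so $x = x_1 + x_2$ is exactly the Riesz decomposition.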
(b) Riesz's Representation Theorem
Recall from Section 11 that a linear functional on the vector space $E$ is a mapping $F : E \to \mathbb{K}$ such that $F(\lambda_1 x_1 + \lambda_2 x_2) = \lambda_1 F(x_1) + \lambda_2 F(x_2)$ for all $x_1, x_2 \in E$ and $\lambda_1, \lambda_2 \in \mathbb{K}$.
Let $(E, \|\cdot\|)$ be a normed linear space over $\mathbb{K}$. Recall that a linear functional $F : E \to \mathbb{K}$ is called continuous if $x_n \to x$ in $E$ implies $F(x_n) \to F(x)$.
The set of all continuous linear functionals $F$ on $E$ forms a linear space $E'$ with the pointwise linear operations.
Now let $(H, \langle\cdot,\cdot\rangle)$ be a Hilbert space. By Lemma 13.3, $F_y : H \to \mathbb{K}$, $F_y(x) = \langle x, y \rangle$, defines a continuous linear functional on $H$. Riesz's representation theorem states that any continuous linear functional on $H$ is of this form.
Theorem 13.8 (Riesz's Representation Theorem) Let $F$ be a continuous linear functional on the Hilbert space $H$. Then there exists a unique element $y \in H$ such that $F(x) = F_y(x) = \langle x, y \rangle$ for all $x \in H$.
Proof. Existence. Let $H_1 = \ker F$ be the null-space of the linear functional $F$. $H_1$ is a linear subspace (since $F$ is linear). $H_1$ is closed since $H_1 = F^{-1}(\{0\})$ is the preimage of the closed set $\{0\}$ under the continuous map $F$. By Riesz's first theorem, $H = H_1 \oplus H_1^{\perp}$.
Case 1. $H_1^{\perp} = \{0\}$. Then $H = H_1$ and $F(x) = 0$ for all $x$. We can choose $y = 0$: $F(x) = \langle x, 0 \rangle$.
Case 2. $H_1^{\perp} \ne \{0\}$. Suppose $u \in H_1^{\perp}$, $u \ne 0$. Then $F(u) \ne 0$ (otherwise $u \in H_1 \cap H_1^{\perp}$, so that $\langle u, u \rangle = 0$, which implies $u = 0$). We have
\[
F\Big( x - \frac{F(x)}{F(u)}\, u \Big) = F(x) - \frac{F(x)}{F(u)}\, F(u) = 0.
\]
Hence $x - \frac{F(x)}{F(u)}\, u \in H_1$. Since $u \in H_1^{\perp}$ we have
\[
0 = \Big\langle x - \frac{F(x)}{F(u)}\, u,\ u \Big\rangle = \langle x, u \rangle - \frac{F(x)}{F(u)}\, \langle u, u \rangle,
\]
so that
\[
F(x) = \frac{F(u)}{\|u\|^2}\, \langle x, u \rangle = \Big\langle x,\ \frac{\overline{F(u)}}{\|u\|^2}\, u \Big\rangle = F_y(x),
\qquad \text{where } y = \frac{\overline{F(u)}}{\|u\|^2}\, u.
\]
Uniqueness. Suppose that both $y_1, y_2 \in H$ give the same functional $F$, i.e. $F(x) = \langle x, y_1 \rangle = \langle x, y_2 \rangle$ for all $x$. This implies
\[
\langle y_1 - y_2, x \rangle = 0, \qquad x \in H.
\]
Inserting $x = y_1 - y_2$ gives $\|y_1 - y_2\|^2 = 0$; hence $y_1 = y_2$.
(c) Example
Any continuous linear functional on $L^2(X, \mu)$ is of the form $F(f) = \int_X f \bar g\, d\mu$ with some $g \in L^2(X, \mu)$. Any continuous linear functional on $\ell^2$ is given by
\[
F((x_n)) = \sum_{n=1}^{\infty} x_n \bar y_n
\]
with $(y_n) \in \ell^2$.
For example, in $\mathbb{K}^n$ with the standard basis $\{e_1, \dots, e_n\}$ we have
\[
x = \sum_{k=1}^{n} x_k e_k, \qquad \|x\| = \Big( \sum_{k=1}^{n} | x_k |^2 \Big)^{\frac12}, \qquad \langle x, y \rangle = \sum_{k=1}^{n} x_k \bar y_k,
\]
and $\{e_1, \dots, e_n\}$ is an orthonormal system (OS) in $H = \mathbb{K}^n$. In $H = L^2((0, 2\pi))$ one computes
\[
\Big\{ \frac{1}{\sqrt{2\pi}},\ \frac{\sin(nx)}{\sqrt{\pi}},\ \frac{\cos(nx)}{\sqrt{\pi}} \,\Big|\, n \in \mathbb{N} \Big\}
\qquad\text{and}\qquad
\Big\{ \frac{e^{inx}}{\sqrt{2\pi}} \,\Big|\, n \in \mathbb{Z} \Big\}
\]
to be orthonormal sets of $H$. Recall that for an OS $\{x_k\}$ the Pythagorean theorem holds:
\[
\Big\| \sum_{k=1}^{n} x_k \Big\|^2 = \sum_{k=1}^{n} \|x_k\|^2.
\]
(b) Fourier Expansion and Completeness
Throughout this paragraph let $\{x_n \mid n \in \mathbb{N}\}$ be an NOS in the Hilbert space $H$. For example, on $H = L^2((0, 2\pi))$, let $f \in H$. Then
\begin{align*}
\Big\langle f, \frac{\sin(nx)}{\sqrt{\pi}} \Big\rangle &= \frac{1}{\sqrt{\pi}} \int_0^{2\pi} f(t) \sin(nt)\, dt,\\
\Big\langle f, \frac{\cos(nx)}{\sqrt{\pi}} \Big\rangle &= \frac{1}{\sqrt{\pi}} \int_0^{2\pi} f(t) \cos(nt)\, dt,\\
\Big\langle f, \frac{1}{\sqrt{2\pi}} \Big\rangle &= \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} f(t)\, dt.
\end{align*}
These are the usual Fourier coefficients, up to a factor. Note that we have another normalization than in Definition 6.3 since the inner product there has the factor $1/(2\pi)$.
Proposition 13.11 (Bessel's Inequality) For $x \in H$ we have
\[
\sum_{k=1}^{\infty} | \langle x, x_k \rangle |^2 \le \|x\|^2. \tag{13.5}
\]
Proof. Put $y_n = x - \sum_{k=1}^{n} \langle x, x_k \rangle x_k$. Then for $m \le n$,
\[
\langle y_n, x_m \rangle = \langle x, x_m \rangle - \sum_{k=1}^{n} \langle x, x_k \rangle \langle x_k, x_m \rangle
= \langle x, x_m \rangle - \sum_{k=1}^{n} \langle x, x_k \rangle\, \delta_{km} = 0.
\]
By the Pythagorean theorem,
\[
\|x\|^2 = \|y_n\|^2 + \Big\| \sum_{k=1}^{n} \langle x, x_k \rangle x_k \Big\|^2 \ge \sum_{k=1}^{n} | \langle x, x_k \rangle |^2,
\]
since $\|x_k\|^2 = 1$ for all $k$. Taking the supremum over all $n$ on the right, the assertion follows.
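Bessel's inequality can be seen numerically for a concrete function; the choice $f(t) = t$ and the midpoint-rule quadrature below are an illustration, not from the text.

```python
import math

# Bessel's inequality (13.5) for f(t) = t on (0, 2*pi) against the
# NOS {sin(n t)/sqrt(pi) : n = 1..29}; coefficients via the midpoint rule.
def midpoint(g, a, b, steps=20000):
    h = (b - a) / steps
    return sum(g(a + (k + 0.5) * h) for k in range(steps)) * h

f = lambda t: t
norm_sq = midpoint(lambda t: f(t) ** 2, 0.0, 2 * math.pi)   # ||f||^2

coeff_sq_sum = 0.0
for n in range(1, 30):
    c = midpoint(lambda t: f(t) * math.sin(n * t),
                 0.0, 2 * math.pi) / math.sqrt(math.pi)     # <f, sin(nt)/sqrt(pi)>
    coeff_sq_sum += c * c
```

The partial sums of squared coefficients stay below $\|f\|^2$, as (13.5) requires.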
Corollary 13.12 For every $x \in H$ the series $\sum_{k=1}^{\infty} \langle x, x_k \rangle x_k$ converges in $H$.
Proof. Since $\{\langle x, x_k \rangle x_k\}$ is an OS, by Lemma 13.10 the series converges if and only if the series $\sum_{k=1}^{\infty} \| \langle x, x_k \rangle x_k \|^2 = \sum_{k=1}^{\infty} | \langle x, x_k \rangle |^2$ converges. By Bessel's inequality, this series converges.
We call $\sum_{k=1}^{\infty} \langle x, x_k \rangle x_k$ the Fourier series of $x$ with respect to the NOS $\{x_n\}$.
Remarks 13.1 (a) In general, the Fourier series of $x$ does not converge to $x$.
Proposition 13.13 Let $\{x_k \mid k \in \mathbb{N}\}$ be an NOS in $H$. The following are equivalent:
(a) For every $x \in H$, $x = \sum_{k=1}^{\infty} \langle x, x_k \rangle x_k$.
(b) If $\langle x, x_k \rangle = 0$ for all $k$, then $x = 0$.
(c) For every $x \in H$ we have $\|x\|^2 = \sum_{k=1}^{\infty} | \langle x, x_k \rangle |^2$.
In this case, moreover, $\langle x, y \rangle = \sum_{k=1}^{\infty} \langle x, x_k \rangle \langle x_k, y \rangle$, since
\[
\Big\langle \sum_{k=1}^{\infty} \langle x, x_k \rangle x_k,\ \sum_{n=1}^{\infty} \langle y, x_n \rangle x_n \Big\rangle
= \sum_{k,n=1}^{\infty} \langle x, x_k \rangle\, \overline{\langle y, x_n \rangle}\, \underbrace{\langle x_k, x_n \rangle}_{\delta_{kn}}
= \sum_{k=1}^{\infty} \langle x, x_k \rangle \langle x_k, y \rangle.
\]
Proof. (c) $\Rightarrow$ (b): If $\langle z, x_k \rangle = 0$ for all $k$, then
\[
\|z\|^2 = \sum_{k=1}^{\infty} | \langle z, x_k \rangle |^2 = 0; \qquad \text{hence} \quad z = 0.
\]
(b) $\Rightarrow$ (a): Fix $x \in H$ and put $y = \sum_{k=1}^{\infty} \langle x, x_k \rangle x_k$, which converges according to Corollary 13.12. With $z = x - y$ we have for all positive integers $n \in \mathbb{N}$
\[
\langle z, x_n \rangle = \langle x - y, x_n \rangle = \langle x, x_n \rangle - \Big\langle \sum_{k=1}^{\infty} \langle x, x_k \rangle x_k,\ x_n \Big\rangle
= \langle x, x_n \rangle - \sum_{k=1}^{\infty} \langle x, x_k \rangle \langle x_k, x_n \rangle = \langle x, x_n \rangle - \langle x, x_n \rangle = 0.
\]
By (b), $z = 0$; hence $x = y$.
Example 13.6 (a) $H = \ell^2$, $\{e_n \mid n \in \mathbb{N}\}$ is an NOS. We show that this NOS is complete. For, let $x = (x_n)$ be orthogonal to every $e_n$, $n \in \mathbb{N}$; that is, $0 = \langle x, e_n \rangle = x_n$. Hence $x = (0, 0, \dots) = 0$. By (b), $\{e_n\}$ is a CNOS. What does the Fourier series of $x$ look like? The Fourier coefficients of $x$ are $\langle x, e_n \rangle = x_n$, such that
\[
x = \sum_{n=1}^{\infty} x_n e_n.
\]
(b) The trigonometric systems $\big\{ \frac{1}{\sqrt{2\pi}}, \frac{\sin(nx)}{\sqrt{\pi}}, \frac{\cos(nx)}{\sqrt{\pi}} \mid n \in \mathbb{N} \big\}$ and $\big\{ \frac{e^{inx}}{\sqrt{2\pi}} \mid n \in \mathbb{Z} \big\}$ are both CNOSs in $H = L^2((0, 2\pi))$. This was stated in Theorem 6.14.
(c) Existence of CNOS in a Separable Hilbert Space
Definition 13.9 A metric space E is called separable if there exists a countable dense subset
of E.
Example 13.7 (a) $\mathbb{R}^n$ is separable: $M = \{(r_1, \dots, r_n) \mid r_1, \dots, r_n \in \mathbb{Q}\}$ is a countable dense set in $\mathbb{R}^n$.
(b) $\mathbb{C}^n$ is separable: $M = \{(r_1 + is_1, \dots, r_n + is_n) \mid r_1, \dots, r_n, s_1, \dots, s_n \in \mathbb{Q}\}$ is a countable dense subset of $\mathbb{C}^n$.
(c) $L^2([a,b])$ is separable. The polynomials $\{1, x, x^2, \dots\}$ are linearly independent in $L^2([a,b])$ and they can be orthonormalized via Schmidt's process. As a result we get a countable CNOS in $L^2([a,b])$ (the Legendre polynomials in case $a = -1$, $b = 1$). However, $L^2(\mathbb{R})$ contains no polynomials; in this case the Hermite functions, which are of the form $p_n(x)\, e^{-x^2}$ with polynomials $p_n$, form a countable CNOS.
More generally, $L^2(G, \lambda_n)$ is separable for any region $G \subseteq \mathbb{R}^n$ with respect to the Lebesgue measure.
(d) Any Hilbert space is isomorphic to some $L^2(X, \mu)$ where $\mu$ is the counting measure on $X$; $X = \mathbb{N}$ gives $\ell^2$. An uncountable $X$ gives a non-separable Hilbert space.
Proposition 13.14 (Schmidt's Orthogonalization Process) Let $\{y_k\}$ be an at most countable linearly independent subset of the Hilbert space $H$. Then there exists an NOS $\{x_k\}$ such that for every $n$
\[
\operatorname{lin}\{y_1, \dots, y_n\} = \operatorname{lin}\{x_1, \dots, x_n\}.
\]
The NOS can be computed recursively:
\[
x_1 := \frac{y_1}{\|y_1\|}, \qquad
x_{n+1} := \Big( y_{n+1} - \sum_{k=1}^{n} \langle y_{n+1}, x_k \rangle x_k \Big) \Big/ \Big\| y_{n+1} - \sum_{k=1}^{n} \langle y_{n+1}, x_k \rangle x_k \Big\|.
\]
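The recursion translates directly into a short program; the three vectors in $\mathbb{R}^3$ below are an arbitrary linearly independent example.

```python
# Schmidt orthonormalization (Proposition 13.14) in R^3 with the
# standard inner product.
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(ys):
    xs = []
    for y in ys:
        w = list(y)
        # subtract the projections onto the vectors found so far
        for x in xs:
            c = inner(y, x)
            w = [wi - c * xi for wi, xi in zip(w, x)]
        nrm = inner(w, w) ** 0.5
        xs.append([wi / nrm for wi in w])
    return xs

ys = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
xs = gram_schmidt(ys)
```

By construction `xs[0], ..., xs[n-1]` span the same subspace as `ys[0], ..., ys[n-1]` for each `n`.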
Proposition 13.16 (a) A Hilbert space H has an at most countable complete orthonormal system (CNOS) if and only if H is separable.
(b) Let H be a separable Hilbert space. Then H is either isomorphic to Kn for some n N or
to 2 .
13.1.5 Appendix
(a) The Inner Product constructed from an Inner Product Norm
Proof of Proposition 13.5. We consider only the case $\mathbb{K} = \mathbb{R}$. Assume that the parallelogram identity is satisfied. We will show that
\[
\langle x, y \rangle := \frac14 \big( \|x + y\|^2 - \|x - y\|^2 \big)
\]
defines an inner product. Applying the parallelogram identity to the pairs $(x_1 + y, x_2)$ and $(x_2 + y, x_1)$ and averaging, we obtain
\[
\|x_1 + x_2 + y\|^2 = \|x_1 + y\|^2 + \|x_2 + y\|^2 + \|x_1\|^2 + \|x_2\|^2 - \frac12 \big( \|x_1 - x_2 + y\|^2 + \|x_2 - x_1 + y\|^2 \big).
\]
Replacing $y$ by $-y$ leaves the last bracket unchanged (since $\|-v\| = \|v\|$), so subtracting the two equations yields
\begin{align*}
\langle x_1 + x_2, y \rangle &= \frac14 \big( \|x_1 + x_2 + y\|^2 - \|x_1 + x_2 - y\|^2 \big)\\
&= \frac14 \big( \|x_1 + y\|^2 - \|x_1 - y\|^2 + \|x_2 + y\|^2 - \|x_2 - y\|^2 \big)\\
&= \langle x_1, y \rangle + \langle x_2, y \rangle,
\end{align*}
that is, $\langle\cdot,\cdot\rangle$ is additive in the first variable. It is obviously symmetric and hence additive in the second variable, too.
(b) We show $\langle \lambda x, y \rangle = \lambda \langle x, y \rangle$ for all $\lambda \in \mathbb{R}$, $x, y \in E$. By (a), $\langle 2x, y \rangle = 2 \langle x, y \rangle$, and by induction on $n$, $\langle nx, y \rangle = n \langle x, y \rangle$ for all $n \in \mathbb{N}$. Now let $\lambda = \frac{m}{n}$, $m, n \in \mathbb{N}$. Then
\[
n \langle \lambda x, y \rangle = n \Big\langle \frac{m}{n}\, x, y \Big\rangle = \Big\langle n\, \frac{m}{n}\, x, y \Big\rangle = \langle m x, y \rangle = m \langle x, y \rangle,
\]
so that $\langle \lambda x, y \rangle = \frac{m}{n} \langle x, y \rangle = \lambda \langle x, y \rangle$. Hence $\langle \lambda x, y \rangle = \lambda \langle x, y \rangle$ holds for all positive rational numbers $\lambda$. Suppose now $\lambda \in \mathbb{Q}_+$; then
\[
0 = \langle \lambda x + (-\lambda x), y \rangle = \langle \lambda x, y \rangle + \langle -\lambda x, y \rangle,
\]
so that $\langle -\lambda x, y \rangle = -\lambda \langle x, y \rangle$.
Proof. By the above discussion, $\sum_{k=1}^{\infty} x_k$ converges if and only if $\big\| \sum_{k=m}^{n} x_k \big\|^2$ becomes small for sufficiently large $m, n \in \mathbb{N}$. By the Pythagorean theorem this term equals
\[
\sum_{k=m}^{n} \|x_k\|^2;
\]
hence the series converges if and only if $\sum_{k=1}^{\infty} \|x_k\|^2$ converges.
\[
\|T(x)\|_2 \le C\, \|x\|_1 \qquad \text{for all } x \in E_1. \tag{13.6}
\]
(b) Suppose that $T : E_1 \to E_2$ is a bounded linear map. Then the operator norm is the smallest number $C$ satisfying (13.6) for all $x \in E_1$, that is,
\[
\|T\| = \inf \{ C > 0 \mid \forall x \in E_1 : \|T(x)\|_2 \le C\, \|x\|_1 \}.
\]
One can show that
\[
\|T\| = \sup \Big\{ \frac{\|T(x)\|_2}{\|x\|_1} \,\Big|\, x \in E_1,\ x \ne 0 \Big\}
= \sup_{\|x\|_1 \le 1} \|T(x)\|_2 = \sup_{\|x\|_1 = 1} \|T(x)\|_2. \tag{13.7}
\]
Definition 13.11 Let E and F be normed linear spaces. Let L (E, F ) denote the set of all
bounded linear maps from E to F . In case E = F we simply write L (E) in place of L (E, F ).
Proposition 13.18 Let $E$ and $F$ be normed linear spaces. Then $L(E, F)$ is a normed linear space if we define the linear structure by
\[
(S + T)(x) = S(x) + T(x), \qquad (\lambda T)(x) = \lambda\, T(x)
\]
for $S, T \in L(E, F)$, $\lambda \in \mathbb{K}$. The operator norm $\|T\|$ makes $L(E, F)$ a normed linear space.
Note that $L(E, F)$ is complete if and only if $F$ is complete.
Example 13.8 (a) Recall that $L(\mathbb{K}^n, \mathbb{K}^m)$ is a normed vector space with $\|A\| \le \big( \sum_{i,j} | a_{ij} |^2 \big)^{\frac12}$, where $A = (a_{ij})$ is the matrix representation of the linear operator $A$; see Proposition 7.1.
(b) The space $E' = L(E, \mathbb{K})$ of continuous linear functionals on $E$.
(c) $H = L^2((0,1))$, $g \in C([0,1])$:
\[
T_g(f)(t) = g(t) f(t)
\]
defines a bounded linear operator on $H$ (see homework).
(d) $H = L^2((0,1))$, $k(s,t) \in L^2([0,1] \times [0,1])$. Then
\[
(Kf)(t) = \int_0^1 k(s,t)\, f(s)\, ds, \qquad f \in H = L^2([0,1]),
\]
defines a bounded linear operator. By the Cauchy–Schwarz inequality,
\[
| (Kf)(t) |^2 \le \int_0^1 | k(s,t) |^2\, ds \int_0^1 | f(s) |^2\, ds = \int_0^1 | k(s,t) |^2\, ds\ \|f\|_H^2.
\]
Hence,
\[
\|Kf\|_H^2 \le \int_0^1 \int_0^1 | k(s,t) |^2\, ds\, dt\ \|f\|_H^2.
\]
This shows $Kf \in H$ and further $\|K\| \le \|k\|_{L^2([0,1]^2)}$. $K$ is called an integral operator; $K$ is compact, i.e. it maps the unit ball into a set whose closure is compact.
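The Hilbert–Schmidt bound $\|Kf\| \le \|k\|_{L^2}\, \|f\|$ can be tested on a discretized integral operator; the kernel $k(s,t) = e^{st}$ and $f(s) = \cos 3s$ are arbitrary illustrative choices.

```python
import math

# Midpoint discretization of (K f)(t) = int_0^1 k(s,t) f(s) ds and a check
# of ||K f|| <= ||k||_{L^2([0,1]^2)} * ||f||_{L^2}.
n = 400
h = 1.0 / n
s = [(i + 0.5) * h for i in range(n)]          # quadrature midpoints

k = lambda a, b: math.exp(a * b)               # hypothetical kernel
f = [math.cos(3 * si) for si in s]

Kf = [sum(k(sj, ti) * fj for sj, fj in zip(s, f)) * h for ti in s]

norm_Kf = math.sqrt(sum(v * v for v in Kf) * h)
norm_k = math.sqrt(sum(k(sj, ti) ** 2 for sj in s for ti in s) * h * h)
norm_f = math.sqrt(sum(v * v for v in f) * h)
```

The computed `norm_Kf` stays below `norm_k * norm_f`, matching the estimate above up to discretization error.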
(e) $H = L^2(\mathbb{R})$, $a \in \mathbb{R}$,
\[
(V_a f)(t) = f(t - a), \qquad t \in \mathbb{R},
\]
defines a bounded linear operator called the shift operator. Indeed,
\[
\|V_a f\|_2^2 = \int_{\mathbb{R}} | f(t-a) |^2\, dt = \int_{\mathbb{R}} | f(t) |^2\, dt = \|f\|_2^2.
\]
Then $\|f_n\|_1 = 1$ and $T f_n(t) = n t^{n-1}$, such that $\|T f_n\|_2 = n$. Thus $\|T f_n\|_2 / \|f_n\|_1 = n \to +\infty$ as $n \to \infty$: $T$ is unbounded.
However, if we put $\|f\|_1 = \sup_{t\in[0,1]} | f(t) | + \sup_{t\in[0,1]} | f'(t) |$ and keep $\|f\|_2$ as before, then $T$ is bounded, since
\[
\|T f\|_2 = \sup_{t\in[0,1]} | f'(t) | \le \|f\|_1 \quad\Longrightarrow\quad \|T\| \le 1.
\]
For fixed $y \in H$, the mapping $x \mapsto \langle T(x), y \rangle$ is a continuous linear functional on $H$ with
\[
| \langle T(x), y \rangle | \le \|T\|\, \|x\|\, \|y\|, \tag{13.8}
\]
so by Riesz's representation theorem there is a unique $z \in H$ with $\langle T(x), y \rangle = \langle x, z \rangle$ for all $x \in H$.
Definition 13.12 The above correspondence $y \mapsto z$ is linear. Define the linear operator $T^*$ by $z = T^*(y)$. By definition,
\[
\langle T(x), y \rangle = \langle x, T^*(y) \rangle, \qquad x, y \in H. \tag{13.9}
\]
Proposition 13.19 Let $T, T_1, T_2 \in L(H)$. Then $T^*$ is a bounded linear operator with $\|T^*\| = \|T\|$. We have
(a) $(T_1 + T_2)^* = T_1^* + T_2^*$ and
(b) $(\lambda T)^* = \bar\lambda\, T^*$.
(c) $(T_1 T_2)^* = T_2^*\, T_1^*$.
(d) If $T$ is invertible in $L(H)$, so is $T^*$, and we have $(T^*)^{-1} = (T^{-1})^*$.
(e) $(T^*)^* = T$.
Proof. Inequality (13.8) shows that
\[
\|T^*(y)\| \le \|T\|\, \|y\|, \qquad y \in H.
\]
By definition, this implies $\|T^*\| \le \|T\|$, and $T^*$ is bounded. Since
\[
\langle T^*(x), y \rangle = \overline{\langle y, T^*(x) \rangle} = \overline{\langle T(y), x \rangle} = \langle x, T(y) \rangle,
\]
we get $(T^*)^* = T$. We conclude $\|T\| = \|(T^*)^*\| \le \|T^*\|$, so that $\|T^*\| = \|T\|$.
(a) For $x, y \in H$ we have
\begin{align*}
\langle (T_1 + T_2)(x), y \rangle &= \langle T_1(x) + T_2(x), y \rangle = \langle T_1(x), y \rangle + \langle T_2(x), y \rangle\\
&= \langle x, T_1^*(y) \rangle + \langle x, T_2^*(y) \rangle = \langle x, (T_1^* + T_2^*)(y) \rangle.
\end{align*}
A mapping $* : \mathcal{A} \to \mathcal{A}$ such that the above properties (a), (b), and (c) are satisfied is called an involution. An algebra with involution is called a $*$-algebra.
We have seen that $L(H)$ is a (non-commutative) $*$-algebra. An example of a commutative $*$-algebra is $C(K)$ with the involution $f^*(x) = \overline{f(x)}$.
Example 13.9 (Example 13.8 continued)
(a) $H = \mathbb{C}^n$, $A = (a_{ij}) \in M(n \times n, \mathbb{C})$. Then $A^* = (b_{ij})$ has the matrix elements $b_{ij} = \overline{a_{ji}}$.
(b) $H = L^2([0,1])$: $T_g^* = T_{\bar g}$.
(c) $H = L^2(\mathbb{R})$, $V_a(f)(t) = f(t - a)$ (shift operator): $V_a^* = V_{-a}$.
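Part (a), the conjugate transpose, can be verified directly from the defining relation (13.9); the matrix and vectors below are arbitrary.

```python
# The adjoint of a matrix operator on C^2 is the conjugate transpose:
# <A x, y> = <x, A* y>, where (A*)_{ij} = conj(a_{ji}).
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[1 + 2j, 3 - 1j],
     [0 + 1j, 2 + 0j]]
A_star = [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

x = [1 - 1j, 2 + 3j]
y = [0 + 2j, -1 + 1j]

lhs = inner(matvec(A, x), y)       # <A x, y>
rhs = inner(x, matvec(A_star, y))  # <x, A* y>
```

The two inner products agree exactly, which is the finite-dimensional content of (13.9).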
\[
\langle S(x), y \rangle = \sum_{n=2}^{\infty} x_{n-1} \bar y_n = \sum_{n=1}^{\infty} x_n \bar y_{n+1} = \langle (x_1, x_2, \dots),\ (y_2, y_3, \dots) \rangle.
\]
$A$ is normal, i.e. $A^* A = A A^*$, if and only if $\|A(x)\| = \|A^*(x)\|$ for all $x \in H$. Indeed, if $A$ is normal, then for all $x \in H$ we have $\langle A^*A(x), x \rangle = \langle AA^*(x), x \rangle$, which implies
\[
\|A(x)\|^2 = \langle A(x), A(x) \rangle = \langle A^*(x), A^*(x) \rangle = \|A^*(x)\|^2.
\]
On the other hand, $\|A(x)\| = \|A^*(x)\|$ means $\langle A^*A(x), x \rangle = \langle AA^*(x), x \rangle$ for all $x$; the polarization identity then implies $\langle (A^*A - AA^*)(x), x \rangle = 0$ for all $x$, hence $A^*A - AA^* = 0$, which proves the claim.
(b) Sums and real scalar multiples of self-adjoint operators are self-adjoint.
(c) The product AB of self-adjoint operators is self-adjoint if and only if A and B commute
with each other, AB = BA.
(d) $A$ is self-adjoint if and only if $\langle Ax, x \rangle$ is real for all $x \in H$.
Proof. Let $A^* = A$. Then $\langle Ax, x \rangle = \langle x, Ax \rangle = \overline{\langle Ax, x \rangle}$ is real. For the opposite direction, $\langle A(x), x \rangle = \langle x, A(x) \rangle$ and the polarization identity yields $\langle A(x), y \rangle = \langle x, A(y) \rangle$ for all $x, y$; hence $A^* = A$.
(b) Unitary and Isometric Operators
Definition 13.14 Let $T \in L(H)$. Then $T$ is called
(a) unitary, if $T T^* = I = T^* T$;
(b) isometric, if $\|T(x)\| = \|x\|$ for all $x \in H$.
Proposition 13.20 (a) $T$ is isometric if and only if $T^* T = I$, and if and only if $\langle T(x), T(y) \rangle = \langle x, y \rangle$ for all $x, y \in H$.
(b) $T$ is unitary if and only if $T$ is isometric and surjective.
(c) If $S, T$ are unitary, so are $ST$ and $T^{-1}$. The unitary operators of $L(H)$ form a group.
Proof. (a) $T$ isometric yields $\langle T(x), T(x) \rangle = \langle x, x \rangle$ and further $\langle (T^*T - I)(x), x \rangle = 0$ for all $x$. The polarization identity implies $T^*T = I$. This implies $\langle (T^*T - I)(x), y \rangle = 0$ for all $x, y \in H$; hence $\langle T(x), T(y) \rangle = \langle x, y \rangle$. Inserting $y = x$ shows $T$ is isometric.
(b) Suppose $T$ is unitary. $T^*T = I$ shows $T$ is isometric. Since $TT^* = I$, $T$ is surjective.
Suppose now $T$ is isometric and surjective. Since $T$ is isometric, $T(x) = 0$ implies $x = 0$; hence $T$ is bijective with an inverse operator $T^{-1}$. Insert $y = T^{-1}(z)$ into $\langle T(x), T(y) \rangle = \langle x, y \rangle$. This gives
\[
\langle T(x), z \rangle = \langle x, T^{-1}(z) \rangle, \qquad x, z \in H.
\]
Hence $T^{-1} = T^*$ and therefore $TT^* = T^*T = I$.
(c) is easy (see homework 45.4).
Note that an isometric operator is injective with norm 1 (since $\|T(x)\| / \|x\| = 1$ for all $x \ne 0$). In case $H = \mathbb{C}^n$, the unitary operators on $\mathbb{C}^n$ form the unitary group $U(n)$. In case $H = \mathbb{R}^n$, the unitary operators on $H$ form the orthogonal group $O(n)$.
Example 13.10 (a) $H = L^2(\mathbb{R})$. The shift operator $V_a$ is unitary since $V_a V_b = V_{a+b}$, so that $V_a V_a^* = V_a V_{-a} = I = V_a^* V_a$. The multiplication operator $T_g f = gf$ is unitary if and only if $| g | = 1$; $T_g$ is self-adjoint (resp. positive) if and only if $g$ is real (resp. positive).
(b) $H = \ell^2$: the right shift $S((x_n)) = (0, x_1, x_2, \dots)$ is isometric but not unitary since $S$ is not surjective. $S^*$ is not isometric since $S^*(1, 0, \dots) = 0$; hence $S^*$ is not injective.
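On finite truncations of sequences one can watch both phenomena; the sample sequence is arbitrary illustrative data.

```python
# Right shift S(x1, x2, ...) = (0, x1, x2, ...) on truncated sequences:
# S preserves the norm (isometric), while the left shift S* kills e1
# and hence is not isometric.
x = [1.0, -2.0, 3.0, 0.5]

S_x = [0.0] + x                                # right shift of x
e1 = [1.0, 0.0, 0.0, 0.0]
Sstar_e1 = e1[1:] + [0.0]                      # left shift applied to e1

norm = lambda v: sum(t * t for t in v) ** 0.5
```

Here `norm(S_x) == norm(x)`, while `Sstar_e1` is the zero vector even though `e1` is not.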
(c) Fourier transform. For $f \in L^1(\mathbb{R})$ define
\[
(\mathcal{F}f)(t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-itx} f(x)\, dx.
\]
Let $\mathcal{S}(\mathbb{R}) = \{ f \in C^{\infty}(\mathbb{R}) \mid \sup_{t\in\mathbb{R}} | t^n f^{(k)}(t) | < \infty,\ n, k \in \mathbb{Z}_+ \}$. $\mathcal{S}(\mathbb{R})$ is called the Schwartz space, after Laurent Schwartz. We have $\mathcal{S}(\mathbb{R}) \subseteq L^1(\mathbb{R}) \cap L^2(\mathbb{R})$; for example, $f(x) = e^{-x^2} \in \mathcal{S}(\mathbb{R})$. We will show later that $\mathcal{F} : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$ is bijective and norm preserving, $\|\mathcal{F}(f)\|_{L^2(\mathbb{R})} = \|f\|_{L^2(\mathbb{R})}$, $f \in \mathcal{S}(\mathbb{R})$. $\mathcal{F}$ has a unique extension to a unitary operator on $L^2(\mathbb{R})$. The inverse Fourier transform is
\[
(\mathcal{F}^{-1} f)(t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{itx} f(x)\, dx, \qquad f \in \mathcal{S}(\mathbb{R}).
\]
that is, $P^* = P$.
Conversely, suppose $P^2 = P = P^*$ and put $H_1 = \{ x \mid P(x) = x \}$. First note that for $P \ne 0$, $H_1 \ne \{0\}$ is non-trivial. Indeed, since $P(P(x)) = P(x)$, the image of $P$ is part of the eigenspace of $P$ for the eigenvalue 1: $P(H) \subseteq H_1$. Since for $z \in H_1$, $P(z) = z$, also $H_1 \subseteq P(H)$, and thus $H_1 = P(H)$.
Since $P$ is continuous and $\{0\}$ is closed, $H_1 = (P - I)^{-1}(\{0\})$ is a closed linear subspace of $H$. By Riesz's first theorem, $H = H_1 \oplus H_1^{\perp}$. We have to show that $P(x) = x_1$ for all $x$.
Since $P^2 = P$, $P(P(x)) = P(x)$ for all $x$; hence $P(x) \in H_1$. We show $x - P(x) \in H_1^{\perp}$, which completes the proof. For, let $z \in H_1$; then
\[
\langle x - P(x), z \rangle = \langle x, z \rangle - \langle P(x), z \rangle = \langle x, z \rangle - \langle x, P(z) \rangle = \langle x, z \rangle - \langle x, z \rangle = 0.
\]
Hence $x = P(x) + (I - P)(x)$ is the unique Riesz decomposition of $x$ with respect to $H_1$ and $H_1^{\perp}$.
\[
P(x) = \sum_{k=1}^{n} \langle x, x_k \rangle x_k, \qquad x \in H,
\]
defines the orthogonal projection $P : H \to H$ onto $\operatorname{lin}\{x_1, \dots, x_n\}$. Indeed, since $P(x_m) = \sum_{k=1}^{n} \langle x_m, x_k \rangle x_k = x_m$, we have $P^2 = P$, and since
\[
\langle P(x), y \rangle = \sum_{k=1}^{n} \langle x, x_k \rangle \langle x_k, y \rangle = \Big\langle x,\ \sum_{k=1}^{n} \langle y, x_k \rangle x_k \Big\rangle = \langle x, P(y) \rangle,
\]
$P$ is self-adjoint.
For orthogonal projections $P_1, P_2$ onto the closed subspaces $H_1, H_2$ the following are equivalent:
(a) $H_1 \subseteq H_2$;
(b) $P_1 P_2 = P_1$;
(c) $P_2 P_1 = P_1$;
(d) $P_1 \le P_2$;
(e) $P_2 - P_1$ is an orthogonal projection;
(f) $\|P_1(x)\| \le \|P_2(x)\|$ for all $x \in H$.
Proof. We show (d) $\Rightarrow$ (c). From $P_1 \le P_2$ we conclude that $I - P_2 \le I - P_1$. Note that both $I - P_1$ and $I - P_2$ are again orthogonal projections, onto $H_1^{\perp}$ and $H_2^{\perp}$, respectively. Thus for all $x \in H$:
\[
\|(I - P_2) P_1(x)\|^2 = \langle (I - P_2) P_1(x),\ (I - P_2) P_1(x) \rangle
\]
Let $\lambda \in \mathbb{C} \setminus [0,1]$; then $\lambda \in \rho(T)$, since
\[
(R_\lambda f)(x) = \frac{1}{x - \lambda}\, f(x)
\]
is a continuous (hence bounded) operator, $\frac{1}{x - \lambda}$ being continuous on $[0,1]$, and
\[
R_\lambda (T - \lambda I) = (T - \lambda I) R_\lambda = I. \tag{13.10}
\]
Now let $\lambda \in [0,1]$ and suppose $\lambda \in \rho(T)$, i.e. $R_\lambda = (T - \lambda I)^{-1}$ is bounded. By homework 39.5 (a), the norm of the multiplication operator $T_g$ is less than or equal to $\|g\|_{\infty}$ (the supremum norm of $g$). For $\varepsilon > 0$ choose $f = \chi_{(\lambda - \varepsilon, \lambda + \varepsilon)}$. Since $\chi_M^2 = \chi_M$,
\[
\|(T - \lambda I) f\| = \big\| (x - \lambda)\, \chi_{U_\varepsilon(\lambda)}(x)\, f(x) \big\|
\le \sup_{x\in[0,1]} \big| (x - \lambda)\, \chi_{U_\varepsilon(\lambda)}(x) \big|\ \|f\|,
\]
and
\[
\sup_{x\in[0,1]} \big| (x - \lambda)\, \chi_{U_\varepsilon(\lambda)}(x) \big| = \sup_{x \in U_\varepsilon(\lambda)} | x - \lambda | = \varepsilon.
\]
This shows $\|(T - \lambda I) f\| \le \varepsilon\, \|f\|$. Inserting $f$ into (13.10) we obtain
\[
\|f\| = \|R_\lambda (T - \lambda I) f\| \le \|R_\lambda\|\, \|(T - \lambda I) f\| \le \|R_\lambda\|\, \varepsilon\, \|f\|,
\]
which implies $\|R_\lambda\| \ge 1/\varepsilon$. This contradicts the boundedness of $R_\lambda$, since $\varepsilon > 0$ was arbitrary. Hence $\sigma(T) = [0,1]$.
(b) Properties of the Spectrum
Lemma 13.25 Let $T \in L(H)$. Then
\[
\sigma(T^*) = \overline{\sigma(T)} = \{ \bar\lambda \mid \lambda \in \sigma(T) \} \qquad \text{(complex conjugation)},
\]
and for $\lambda \in \rho(T)$,
\[
(T - \lambda I)\, R_\lambda(T) = R_\lambda(T)\, (T - \lambda I) = I.
\]
For $\lambda$ in a neighbourhood of $\lambda_0 \in \rho(T)$,
\[
R_\lambda(T) = \sum_{n=0}^{\infty} (\lambda - \lambda_0)^n\, R_{\lambda_0}(T)^{n+1},
\]
and for $| \lambda | > \|T\|$,
\[
R_\lambda(T) = -\sum_{n=0}^{\infty} \lambda^{-n-1}\, T^n.
\]
Proof. Put $q = | \lambda - \lambda_0 |\, \|R_{\lambda_0}\| < 1$. Then
\[
\sum_{n=0}^{\infty} | \lambda - \lambda_0 |^n\, \|R_{\lambda_0}\|^{n+1} = \sum_{n=0}^{\infty} q^n\, \|R_{\lambda_0}\| = \frac{\|R_{\lambda_0}\|}{1 - q}
\]
converges. By homework 38.4, $\sum x_n$ converges if $\sum \|x_n\|$ converges. Hence
\[
B = \sum_{n=0}^{\infty} (\lambda - \lambda_0)^n\, R_{\lambda_0}^{n+1}
\]
converges in $L(H)$, and since $T - \lambda I = (T - \lambda_0 I) - (\lambda - \lambda_0) I$,
\begin{align*}
(T - \lambda I)\, B &= \sum_{n=0}^{\infty} (\lambda - \lambda_0)^n (T - \lambda_0 I) R_{\lambda_0}^{n+1} - \sum_{n=0}^{\infty} (\lambda - \lambda_0)^{n+1} R_{\lambda_0}^{n+1}\\
&= \sum_{n=0}^{\infty} (\lambda - \lambda_0)^n R_{\lambda_0}^{n} - \sum_{n=0}^{\infty} (\lambda - \lambda_0)^{n+1} R_{\lambda_0}^{n+1} = I.
\end{align*}
Similarly, for $| \lambda | > \|T\|$ the series $-\sum_{n=0}^{\infty} \lambda^{-n-1} T^n$ converges, and
\[
(T - \lambda I) \Big( -\sum_{n=0}^{\infty} \lambda^{-n-1} T^n \Big) = -\sum_{n=0}^{\infty} \lambda^{-n-1} T^{n+1} + \sum_{n=0}^{\infty} \lambda^{-n} T^n = \lambda^0 T^0 = I.
\]
[Figure: the spectrum $\sigma(T)$ is contained in the closed disk of radius $r(T)$ centered at 0.]
Writing $T^n - \lambda^n I = (T - \lambda I) \sum_{k=0}^{n-1} T^k \lambda^{n-1-k} =: (T - \lambda I)\, C$ and putting $B = (T^n - \lambda^n I)^{-1}$, we obtain $(T - \lambda I)\, C B = I$; thus $\lambda \in \rho(T)$.
We shall refine the above statement and give a better upper bound for $\sup\{ | \lambda | \mid \lambda \in \sigma(T) \}$ than $\|T\|$.
Proposition 13.27 Let $T \in L(H)$ be a bounded linear operator. Then the spectral radius of $T$ is
\[
r(T) = \lim_{n\to\infty} \|T^n\|^{\frac1n}. \tag{13.11}
\]
In particular, for $\lambda \in \rho(T)$ the identity $R_\lambda(T)(T - \lambda I) = I$ implies
\[
\|(T - \lambda I)(x)\| \ge \frac{1}{\|R_\lambda(T)\|}\, \|x\|, \qquad x \in H.
\]
In particular, $\|x\| = \sup_{\|y\|\le 1} | \langle x, y \rangle |$, since $y = x/\|x\|$ yields the supremum and the CSI gives the reverse inequality. Hence
\[
\|T\| = \sup_{\|x\|\le 1,\ \|y\|\le 1} | \langle T(x), y \rangle |. \tag{13.12}
\]
Now let $T$ be self-adjoint and put $C = \sup_{\|x\|\le 1} | \langle T(x), x \rangle |$, such that $C \le \|T\|$.
For any real $\lambda > 0$ we have
\begin{align*}
\|T(x)\|^2 &= \langle T(x), T(x) \rangle = \langle T^2(x), x \rangle\\
&= \frac14 \Big( \big\langle T(\lambda x + \lambda^{-1} T(x)),\ \lambda x + \lambda^{-1} T(x) \big\rangle
- \big\langle T(\lambda x - \lambda^{-1} T(x)),\ \lambda x - \lambda^{-1} T(x) \big\rangle \Big)\\
&\le \frac{C}{4} \Big( \big\| \lambda x + \lambda^{-1} T(x) \big\|^2 + \big\| \lambda x - \lambda^{-1} T(x) \big\|^2 \Big)\\
&\overset{\text{P.I.}}{=} \frac{C}{4} \Big( 2 \| \lambda x \|^2 + 2 \| \lambda^{-1} T(x) \|^2 \Big)
= \frac{C}{2} \Big( \lambda^2 \|x\|^2 + \lambda^{-2} \|T(x)\|^2 \Big).
\end{align*}
Inserting $\lambda^2 = \|T(x)\| / \|x\|$ yields
\[
\|T(x)\|^2 \le \frac{C}{2} \big( \|T(x)\|\, \|x\| + \|x\|\, \|T(x)\| \big) = C\, \|T(x)\|\, \|x\|,
\]
hence $\|T(x)\| \le C\, \|x\|$ and $\|T\| \le C$.
Now put $m = \inf_{\|x\|=1} \langle T(x), x \rangle$ and $M = \sup_{\|x\|=1} \langle T(x), x \rangle$. Then we have
\[
m\, \|x\|^2 \le \langle T(x), x \rangle \le M\, \|x\|^2 \qquad \text{for all } x \in H.
\]
Let $\lambda_0 \notin [m, M]$ and put
\[
d = \inf_{\mu \in [m, M]} | \mu - \lambda_0 | > 0.
\]
For $\|x\| = 1$ we have, using the CSI and $\langle T(x), x \rangle \in [m, M]$,
\[
\|(T - \lambda_0 I)x\| = \underbrace{\|x\|}_{1}\, \|(T - \lambda_0 I)x\| \ge | \langle (T - \lambda_0 I)x, x \rangle | = | \langle T(x), x \rangle - \lambda_0 | \ge d.
\]
This implies
\[
\|(T - \lambda_0 I)x\| \ge d\, \|x\| \qquad \text{for all } x \in H.
\]
By Proposition 13.28, $\lambda_0 \in \rho(T)$.
Example 13.13 (a) Let $H = L^2[0,1]$, $g \in C[0,1]$ a real-valued function, and $(T_g f)(t) = g(t) f(t)$. Let $m = \inf_{t\in[0,1]} g(t)$, $M = \sup_{t\in[0,1]} g(t)$. One proves that $m$ and $M$ are the lower and upper bounds of $T_g$, such that $\sigma(T_g) \subseteq [m, M]$. Since $g$ is continuous, by the intermediate value theorem, $\sigma(T_g) = [m, M]$.
(b) Let $T = T^* \in L(H)$ be self-adjoint. Then all eigenvalues of $T$ are real, and eigenvectors for different eigenvalues are orthogonal to each other. Proof. The first statement is clear from Corollary 13.30. Suppose that $T(x) = \lambda x$ and $T(y) = \mu y$ with $\lambda \ne \mu$. Then
\[
\lambda \langle x, y \rangle = \langle T(x), y \rangle = \langle x, T(y) \rangle = \langle x, \mu y \rangle = \mu \langle x, y \rangle,
\]
since $\mu$ is real. Since $\lambda \ne \mu$, $\langle x, y \rangle = 0$.
The statement about orthogonality holds for arbitrary normal operators.
Consider the power series
\[
\sum_{n=0}^{\infty} \|T^n\|\, z^n \tag{13.13}
\]
with radius of convergence
\[
\Big( \limsup_{n\to\infty} \sqrt[n]{\|T^n\|} \Big)^{-1}. \tag{13.14}
\]
Since the resolvent $R_\lambda(T) = -\sum_{n=0}^{\infty} \lambda^{-n-1} T^n$ is analytic for $| \lambda | > r(T)$, this series converges there, so that $\limsup_{n\to\infty} \sqrt[n]{\|T^n\|} \le r(T)$.
On the other hand, by Remark 13.4 (d), $\lambda \in \sigma(T)$ implies $\lambda^n \in \sigma(T^n)$; hence, by Remark 13.4 (c),
\[
| \lambda |^n = | \lambda^n | \le \|T^n\|, \quad \text{i.e.} \quad | \lambda | \le \sqrt[n]{\|T^n\|}.
\]
Taking the supremum over all $\lambda \in \sigma(T)$ on the left and the $\liminf$ over all $n$ on the right, we have
\[
r(T) \le \liminf_{n\to\infty} \sqrt[n]{\|T^n\|} \le \limsup_{n\to\infty} \sqrt[n]{\|T^n\|} \le r(T);
\]
hence the limit in (13.11) exists and equals $r(T)$.
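Gelfand's formula (13.11) is easy to test for a matrix with known eigenvalues; the triangular matrix below (eigenvalues 2 and 1/2, so $r(T) = 2$) and the Frobenius norm are illustrative choices, since any submultiplicative matrix norm gives the same limit.

```python
# Gelfand's formula r(T) = lim ||T^n||^(1/n) for a 2x2 matrix with
# eigenvalues 2 and 0.5, using the Frobenius norm.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def fro(A):
    return sum(A[i][j] ** 2 for i in range(2) for j in range(2)) ** 0.5

T = [[2.0, 1.0], [0.0, 0.5]]
P = [[1.0, 0.0], [0.0, 1.0]]       # running power T^n
estimates = []
for n in range(1, 41):
    P = matmul(P, T)
    estimates.append(fro(P) ** (1.0 / n))
```

The estimates approach the spectral radius 2, even though $\|T\|_F \approx 2.29$ overshoots it.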
Compact operators generalize finite rank operators. Integral operators on compact sets are compact.
Definition 13.16 A linear operator $T \in L(H)$ is called compact if the closure $\overline{T(U_1)}$ of the image of the unit ball $U_1 = \{ x \mid \|x\| \le 1 \}$ is compact in $H$. In other words, for every sequence $(x_n)$, $x_n \in U_1$, there exists a subsequence $(x_{n_k})$ such that $T(x_{n_k})$ converges.
Proposition 13.31 For $T \in L(H)$ the following are equivalent:
(a) $T$ is compact.
(b) $T^*$ is compact.
(c) For every sequence $(x_n)$ such that $(\langle x_n, y \rangle)$ converges to $\langle x, y \rangle$ for all $y$, we have $T(x_n) \to T(x)$ in norm.
(d) There exists a sequence $(T_n)$ of operators of finite rank such that $\|T - T_n\| \to 0$.
\[
\langle T(x), y \rangle = \langle x, T^*(y) \rangle = \langle x, \bar\lambda y \rangle = \lambda \langle x, y \rangle = 0.
\]
Hence $\ker(T - \lambda I)^{\perp}$ is $T$-invariant, too.
(b) Let $T(x) = \lambda x$ and $T(y) = \mu y$. Then (a) and $T^*(y) = \bar\mu y$ imply
\[
\lambda \langle x, y \rangle = \langle T(x), y \rangle = \langle x, T^*(y) \rangle = \langle x, \bar\mu y \rangle = \mu \langle x, y \rangle.
\]
Thus $(\lambda - \mu) \langle x, y \rangle = 0$; since $\lambda \ne \mu$, $x \perp y$.
Theorem 13.33 (Spectral Theorem for Compact Self-Adjoint Operators) Let $H$ be an infinite dimensional separable Hilbert space and $T \in L(H)$ compact and self-adjoint.
Then there exists a real sequence $(\lambda_n)$ with $\lambda_n \to 0$ and a CNOS $\{e_n \mid n \in \mathbb{N}\} \cup \{f_k \mid k \in N' \subseteq \mathbb{N}\}$ such that
\[
T(e_n) = \lambda_n e_n, \quad n \in \mathbb{N}, \qquad T(f_k) = 0, \quad k \in N'.
\]
Moreover,
\[
T(x) = \sum_{n=1}^{\infty} \lambda_n \langle x, e_n \rangle e_n, \qquad x \in H. \tag{13.15}
\]
Remarks 13.5 (a) Since $\{e_n\} \cup \{f_k\}$ is a CNOS, any $x \in H$ can be written as its Fourier series
\[
x = \sum_{n=1}^{\infty} \langle x, e_n \rangle e_n + \sum_{k\in N'} \langle x, f_k \rangle f_k.
\]
Applying $T$ termwise gives
\[
T(x) = \sum_{n=1}^{\infty} \langle x, e_n \rangle\, \lambda_n e_n + \sum_{k\in N'} \langle x, f_k \rangle \underbrace{T(f_k)}_{=0},
\]
which establishes (13.15). The main point is the existence of a CNOS of eigenvectors $\{e_n\} \cup \{f_k\}$.
(b) In case $H = \mathbb{C}^n$ (resp. $\mathbb{R}^n$) the theorem says that any hermitian (resp. symmetric) matrix $A$ is diagonalizable with only real eigenvalues.
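The finite-dimensional content of (13.15) can be checked by hand for a small symmetric matrix; the matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ with its well-known eigenpairs is an illustrative example.

```python
# Finite-dimensional instance of (13.15): for the symmetric matrix A,
# with orthonormal eigenvectors e1 = (1,1)/sqrt(2) (lambda = 3) and
# e2 = (1,-1)/sqrt(2) (lambda = 1), A x = sum_n lambda_n <x, e_n> e_n.
s = 2 ** -0.5
eigs = [(3.0, [s, s]), (1.0, [s, -s])]
A = [[2.0, 1.0], [1.0, 2.0]]

x = [0.7, -1.3]
Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

spectral = [0.0, 0.0]
for lam, e in eigs:
    c = sum(a * b for a, b in zip(x, e))          # Fourier coefficient <x, e_n>
    spectral = [sv + lam * c * ei for sv, ei in zip(spectral, e)]
```

The vector `spectral` reproduces `Ax` exactly, i.e. the eigenexpansion diagonalizes $A$.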
Chapter 14
Complex Analysis
Here are some useful textbooks on Complex Analysis: [FL88] (in German), [Kno78] (in German), [Nee97], [Ruh83] (in German), [Hen88].
The main part of this chapter deals with holomorphic functions, which is another name for functions that are complex differentiable in an open set. On the one hand, we are already familiar with a huge class of holomorphic functions: polynomials, the exponential function, the sine and cosine functions. On the other hand, holomorphic functions possess quite amazing properties, completely unusual from the viewpoint of real analysis. The properties are very strong. For example, it is easy to construct a real function which is 17 times differentiable but not 18 times. A complex differentiable function (in a small region) is automatically infinitely often differentiable.
Good references are Ahlfors [Ahl78]; a little harder is Conway [Con78], easier is Howie [How03].
We use the following notation:
$$U_r = \{z \mid | z | < r\}, \qquad U_R(a) = \{z \mid | z - a | < R\}, \qquad \overline{U_r} = \{z \mid | z | \le r\},$$
$$\dot U_r = \{z \mid 0 < | z | < r\}, \qquad S_r = \{z \mid | z | = r\}.$$
exists, we call $f$ complex differentiable at $z_0$ and $f'(z_0)$ the derivative of $f$ at $z_0$.
$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}.$$
The usual differentiation rules hold:
$$(f + g)' = f' + g', \qquad (f g)' = f' g + f g', \qquad \left( \frac{f}{g} \right)' = \frac{f' g - f g'}{g^2}.$$
For example, consider $f(z) = | z |^2$ at $z_0 = 1$. Along the real axis,
$$\lim_{\varepsilon \to 0} \frac{| 1 + \varepsilon |^2 - 1}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{2\varepsilon + \varepsilon^2}{\varepsilon} = 2,$$
whereas along the imaginary axis
$$\lim_{\varepsilon \to 0} \frac{| 1 + \varepsilon i |^2 - 1}{\varepsilon i} = \lim_{\varepsilon \to 0} \frac{1 + \varepsilon^2 - 1}{\varepsilon i} = \lim_{\varepsilon \to 0} \frac{\varepsilon}{i} = 0.$$
This shows that $f'(1)$ does not exist.
365
lim
P
1
p
n
| cn |
That is, the series converges absolutely for all z with | z | < R; the series diverges for | z | > R,
the behaviour for | z | = R depends on the (cn ). Moreover, it converges uniformly on every
closed ball Ur with 0 < r < R, see Proposition 6.4.
We already know that a real power series can be differentiated elementwise, see Corollary 6.11.
We will see, that power series are holomorphic inside its radius of convergence.
Proposition 14.1 Let $a \in \mathbb{C}$ and
$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n \tag{14.1}$$
be a power series with radius of convergence $R$. Then $f \colon U_R(a) \to \mathbb{C}$ is holomorphic and the derivative is
$$f'(z) = \sum_{n=1}^{\infty} n c_n (z - a)^{n-1}. \tag{14.2}$$
Proof. If the series (14.1) converges in $U_R(a)$, the root test shows that the series (14.2) also converges there. Without loss of generality, take $a = 0$. Denote the sum of the series (14.2) by $g(z)$, fix $w \in U_R(0)$ and choose $r$ so that $| w | < r < R$. If $z \neq w$, we have
$$\frac{f(z) - f(w)}{z - w} - g(w) = \sum_{n=0}^{\infty} c_n \left[ \frac{z^n - w^n}{z - w} - n w^{n-1} \right].$$
The expression in the brackets is $0$ if $n = 1$. For $n \ge 2$ it is (by direct computation of the following term)
$$\frac{z^n - w^n}{z - w} - n w^{n-1} = (z - w) \sum_{k=1}^{n-1} k w^{k-1} z^{n-k-1} = \sum_{k=1}^{n-1} \left( k w^{k-1} z^{n-k} - k w^k z^{n-k-1} \right), \tag{14.3}$$
which gives a telescoping sum if we shift $k := k + 1$ in the first summand. If $| z | < r$ and $| w | < r$, the absolute value of the sum in (14.3) is less than
$$\frac{n(n-1)}{2}\, r^{n-2},$$
so
$$\left| \frac{f(z) - f(w)}{z - w} - g(w) \right| \le | z - w | \sum_{n=2}^{\infty} n^2 | c_n |\, r^{n-2}. \tag{14.4}$$
Since $r < R$, the last series converges. Hence the left side of (14.4) tends to $0$ as $z \to w$. This says that $f'(w) = g(w)$, and completes the proof.
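Proposition 14.1 can also be illustrated numerically. The sketch below (my own check, not part of the proof) uses the geometric series $f(z) = 1/(1-z) = \sum z^n$, whose termwise derivative $\sum n z^{n-1}$ should agree with the closed form $1/(1-z)^2$ inside the unit disc.

```python
# f(z) = 1/(1-z) = sum z^n has derivative f'(z) = 1/(1-z)^2 = sum n z^(n-1).
z = 0.3 + 0.2j       # a point well inside the disc of convergence |z| < 1
N = 200              # partial sum length; the tail is tiny since |z| < 1

g = sum(n * z ** (n - 1) for n in range(1, N))
closed = 1 / (1 - z) ** 2
```

The partial sum matches the closed form to high precision, as the proposition predicts.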
Corollary 14.2 Since $f'(z)$ is again a power series with the same radius of convergence $R$, the proposition can be applied to $f'(z)$. It follows that $f$ has derivatives of all orders and that each derivative has a power series expansion around $a$,
$$f^{(k)}(z) = \sum_{n=k}^{\infty} n(n-1)\cdots(n-k+1)\, c_n (z - a)^{n-k}. \tag{14.5}$$
Inserting $z = a$ implies
$$f^{(k)}(a) = k!\, c_k, \qquad k = 0, 1, \dots.$$
This shows that the coefficients $c_n$ in the power series expansion $f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n$ of $f$ with midpoint $a$ are unique.
For example,
$$\sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!}, \qquad \cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!}.$$
Definition 14.2 A complex function which is defined on C and which is holomorphic on the
entire complex plane is called an entire function.
Let $U \subseteq \mathbb{C}$ be open and $a \in U$. Then $f = u + i v$ is complex differentiable at $a$ if and only if $u$ and $v$ are real differentiable at $a$ and the Cauchy–Riemann equations
$$u_x(a) = v_y(a), \qquad u_y(a) = -v_x(a) \tag{14.6}$$
hold.
In this case,
$$f'(a) = u_x(a) + i v_x(a) = v_y(a) - i u_y(a).$$
Proof. (a) $\Rightarrow$ (b): Suppose that $z = h + i k$ is a complex number such that $a + z \in U$; put $f'(a) = b_1 + i b_2$. By assumption,
$$\lim_{z \to 0} \frac{| f(a + z) - f(a) - z f'(a) |}{| z |} = 0.$$
We shall write this in the real form with real variables $h$ and $k$. Note that
$$z f'(a) = (h + i k)(b_1 + i b_2) = h b_1 - k b_2 + i (h b_2 + k b_1) \;\widehat{=}\; \begin{pmatrix} h b_1 - k b_2 \\ h b_2 + k b_1 \end{pmatrix} = \begin{pmatrix} b_1 & -b_2 \\ b_2 & b_1 \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix}.$$
This implies, with the identification $z = (h, k)$,
$$\lim_{z \to 0} \frac{\left\| f(a + z) - f(a) - \begin{pmatrix} b_1 & -b_2 \\ b_2 & b_1 \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix} \right\|}{| z |} = 0.$$
That is (see Subsection 7.2), $f$ is real differentiable at $a$ with the Jacobian matrix
$$f'(a) = D f(a) = \begin{pmatrix} b_1 & -b_2 \\ b_2 & b_1 \end{pmatrix}. \tag{14.7}$$
By Proposition 7.6, the Jacobian matrix is exactly the matrix of the partial derivatives, that is
$$D f(a) = \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix}.$$
Comparing this with (14.7), we obtain $u_x(a) = v_y(a) = \operatorname{Re} f'(a)$ and $u_y(a) = -v_x(a) = -\operatorname{Im} f'(a)$. This completes the proof of the first direction.
(b) $\Rightarrow$ (a). Since $f = (u, v)$ is differentiable at $a \in U$ as a real function, there exists a linear mapping $D f(a) \in \mathcal{L}(\mathbb{R}^2)$ such that
$$\lim_{(h,k) \to 0} \frac{\left\| f(a + (h, k)) - f(a) - D f(a) \binom{h}{k} \right\|}{\| (h, k) \|} = 0.$$
By Proposition 7.6,
$$D f(a) = \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix} = \begin{pmatrix} b_1 & -b_2 \\ b_2 & b_1 \end{pmatrix},$$
where $u_x = b_1$ and $v_x = b_2$, using the Cauchy–Riemann equations. Writing
$$D f(a) \binom{h}{k} = \binom{h b_1 - k b_2}{h b_2 + k b_1} \;\widehat{=}\; z (b_1 + i b_2)$$
in the complex form with $z = h + i k$ gives: $f$ is complex differentiable at $a$ with $f'(a) = b_1 + i b_2$.
Example 14.3 (a) We already know that $f(z) = z^2$ is complex differentiable. Hence, the Cauchy–Riemann equations must be fulfilled. From
$$f(z) = z^2 = (x + i y)^2 = x^2 - y^2 + 2 i x y, \qquad u(x, y) = x^2 - y^2, \quad v(x, y) = 2 x y$$
we conclude
$$u_x = 2x = v_y, \qquad u_y = -2y = -v_x.$$
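One can also confirm (14.6) numerically. The sketch below (my own check, not from the text) approximates the partial derivatives of $u$ and $v$ for $f(z) = z^2$ by central difference quotients at an arbitrary point.

```python
def f(z):
    return z * z

def u(x, y):
    return f(complex(x, y)).real

def v(x, y):
    return f(complex(x, y)).imag

def partial(g, x, y, which, h=1e-6):
    # Central difference quotient in the x or y direction.
    if which == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.0, 2.0
ux = partial(u, x0, y0, "x")
uy = partial(u, x0, y0, "y")
vx = partial(v, x0, y0, "x")
vy = partial(v, x0, y0, "y")
```

At $(x_0, y_0) = (1, 2)$ one finds $u_x \approx 2 \approx v_y$ and $u_y \approx -4 \approx -v_x$, as computed above.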
Remarks 14.2 (a) We will see soon that the additional differentiability assumption in (c) is
superfluous.
(b) Note that an inverse statement to (c) is easily proved: if $Q = (a, b) \times (c, d)$ is an open rectangle and $u \colon Q \to \mathbb{R}$ is harmonic, then there exists a holomorphic function $f \colon Q \to \mathbb{C}$ such that $u = \operatorname{Re} f$.
[Figure: a path $\gamma$ in $\mathbb{C}$, approximated by a polygon with vertices $Z_1, \dots, Z_5$, mapped by $f$.]
$$\int_\gamma f(z)\, dz := \sum_{k=1}^{n} \int_{t_{k-1}}^{t_k} f(\gamma(t))\, \gamma'(t)\, dt.$$
(b) The integral is additive with respect to concatenation of paths:
$$\int_{\gamma_1 \oplus \gamma_2} f(z)\, dz = \int_{\gamma_1} f(z)\, dz + \int_{\gamma_2} f(z)\, dz.$$
(c) From the definition and the triangle inequality, it follows that for a continuously differentiable path $\gamma$
$$\left| \int_\gamma f(z)\, dz \right| \le M \ell,$$
where $| f(z) | \le M$ for all $z$ on $\gamma$ and $\ell$ is the length of $\gamma$, $\ell = \int_a^b | \gamma'(t) |\, dt$. Note that the integral on the right is the length of the curve $\gamma(t)$.
(d) The integral of $f$ over $\gamma$ generalizes the real integral $\int_a^b f(t)\, dt$. Indeed, let $\gamma(t) = t$, $t \in [a, b]$; then
$$\int_\gamma f(z)\, dz = \int_a^b f(t)\, dt.$$
(e) Let $\gamma$ be the circle $S_r(a)$ of radius $r$ with center $a$. We can parametrize the positively oriented circle as $\gamma(t) = a + r e^{i t}$, $t \in [0, 2\pi]$. Then
$$\int_\gamma f(z)\, dz = i r \int_0^{2\pi} f\!\left( a + r e^{i t} \right) e^{i t}\, dt.$$
Example 14.4 (a) Let $\gamma_1(t) = e^{i t}$, $t \in [0, \pi]$, be the half of the unit circle from $1$ to $-1$ via $i$, and $\gamma_2(t) = t$, $t \in [-1, 1]$, the segment from $-1$ to $1$. Then $\gamma_1'(t) = i e^{i t}$ and $\gamma_2'(t) = 1$. Hence,
$$\int_{\gamma_1} z^2\, dz = i \int_0^{\pi} e^{2 i t} e^{i t}\, dt = i \int_0^{\pi} e^{3 i t}\, dt = \frac{1}{3} e^{3 i t} \Big|_0^{\pi} = \frac{1}{3}(-1 - 1) = -\frac{2}{3},$$
$$\int_{\gamma_2} z^2\, dz = \int_{-1}^{1} t^2\, dt = \frac{2}{3}.$$
$$\int_\gamma f(z)\, dz = 0.$$
Proof. It suffices to prove the statement for a continuously differentiable curve $\gamma(t)$. Put $h(t) = F(\gamma(t))$. By the chain rule,
$$h'(t) = \frac{d}{dt} F(\gamma(t)) = F'(\gamma(t))\, \gamma'(t) = f(\gamma(t))\, \gamma'(t).$$
By definition of the integral and the fundamental theorem of calculus (see Subsection 5.5),
$$\int_\gamma f(z)\, dz = \int_a^b f(\gamma(t))\, \gamma'(t)\, dt = \int_a^b h'(t)\, dt = h(t)\big|_a^b = h(b) - h(a) = F(z_1) - F(z_0).$$
(b)
$$\int_{1-i}^{2+3i} z^3\, dz = \frac{(2 + 3i)^4 - (1 - i)^4}{4}, \qquad \int_1^{i\pi} e^z\, dz = e^{i\pi} - e = -1 - e.$$
Theorem 14.6 (Cauchy's Theorem) Let $U$ be a simply connected region in $\mathbb{C}$ and let $f(z)$ be holomorphic in $U$. Suppose that $\gamma(t)$ is a path in $U$ joining $z_0$ and $z_1$. Then $\int_\gamma f(z)\, dz$ depends on $z_0$ and $z_1$ only and not on the choice of the path. In particular, $\int_\gamma f(z)\, dz = 0$ for every closed path $\gamma$ in $U$.
Proof. We give the proof under the weak additional assumption that $f'$ not only exists but is continuous in $U$. In this case, the partial derivatives $u_x$, $u_y$, $v_x$, and $v_y$ are continuous and we can apply the integrability criterion Proposition 8.3, which was a consequence of Green's theorem, see Theorem 10.3. Note that we need $U$ to be simply connected, in contrast to Lemma 14.5.
Without this additional assumption (that $f'$ is continuous), the proof is lengthy (see [FB93, Lan89, Jan93]); it starts with triangular or rectangular paths and is then generalized to arbitrary paths. We have
$$\int_\gamma f(z)\, dz = \int_\gamma (u + i v)(dx + i\, dy) = \int_\gamma (u\, dx - v\, dy) + i \int_\gamma (v\, dx + u\, dy).$$
Remarks 14.4 (a) The proposition holds under the following weaker assumption: $f$ is continuous in the closure $\overline{U}$ and holomorphic in $U$, $U$ is a simply connected region, and $\gamma = \partial U$ is a path.
(b) The statement is wrong without the assumption that $U$ is simply connected. Indeed, consider the circle of radius $r$ with center $a$, that is $\gamma(t) = a + r e^{i t}$. Then $f(z) = 1/(z - a)$ is singular at $a$ and we have
$$\int_{S_r(a)} \frac{dz}{z - a} = i r \int_0^{2\pi} \frac{e^{i t}}{r e^{i t}}\, dt = i \int_0^{2\pi} dt = 2\pi i.$$
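The value $2\pi i$ in (b) is easy to confirm numerically. The sketch below (my own check) approximates the path integral by a Riemann sum over the parametrization $\gamma(t) = a + r e^{it}$; the center and radius are arbitrary.

```python
import cmath, math

def circle_integral(f, a, r, n=2000):
    # Riemann sum for the positively oriented circle S_r(a):
    # sum of f(gamma(t)) * gamma'(t) * dt with gamma(t) = a + r e^{it}.
    h = 2 * math.pi / n
    total = 0j
    for k in range(n):
        t = k * h
        z = a + r * cmath.exp(1j * t)
        dz = 1j * r * cmath.exp(1j * t)
        total += f(z) * dz * h
    return total

a = 1 + 2j
I = circle_integral(lambda z: 1 / (z - a), a, 0.5)
```

Every summand equals $i \cdot h$ exactly, so the sum reproduces $2\pi i$ up to floating point error.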
Since the integrals along the connecting segments $\gamma_i$, $i = 1, \dots, 4$, cancel, we have
$$\int_\gamma f(z)\, dz = 0.$$
[Figure: the circles $S_{r_1}(a)$ and $S_{r_2}(a)$ joined by two segments.]
Since the integral does not depend on the radius,
$$\int_{S_{\varepsilon'}(z_0)} f(z)\, dz = \int_{S_{\varepsilon}(z_0)} f(z)\, dz, \qquad \left| \int_{S_{\varepsilon}(z_0)} f(z)\, dz \right| \le 2\pi \varepsilon\, C \xrightarrow[\varepsilon \to 0]{} 0,$$
hence $\int_{S_{\varepsilon'}(z_0)} f(z)\, dz = 0$.
We will see soon that under the conditions of the proposition, $f$ can be made holomorphic at $z_0$, too.
Define
$$F(z) = \begin{cases} \dfrac{f(z) - f(a)}{z - a}, & z \neq a, \\[1mm] 0, & z = a. \end{cases}$$
Then $F(z)$ is holomorphic in $U \setminus \{a\}$ and bounded in a neighborhood of $a$, since $f'(a)$ exists and therefore
$$\left| \frac{f(z) - f(a)}{z - a} \right| < | f'(a) | + \varepsilon$$
as $z$ approaches $a$. Using Proposition 14.7 and Remark 14.4 (b) we have $\int_\gamma F(z)\, dz = 0$, that is,
$$\int_\gamma \frac{f(z) - f(a)}{z - a}\, dz = 0, \quad\text{such that}\quad \int_\gamma \frac{f(z)\, dz}{z - a} = \int_\gamma \frac{f(a)\, dz}{z - a} = f(a) \int_\gamma \frac{dz}{z - a} = 2\pi i\, f(a).$$
Remark 14.5 The values of a holomorphic function $f$ inside a closed path $\gamma$ are completely determined by the values of $f$ on $\gamma$.
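This remark invites a numerical experiment. The sketch below (my own illustration, with $f = \exp$ and an arbitrary interior point) recovers $f(a)$ from boundary values only, via a Riemann sum for the Cauchy integral formula $f(a) = \frac{1}{2\pi i}\int_{S_1(0)} \frac{f(z)}{z-a}\,dz$.

```python
import cmath, math

def cauchy_value(f, a, center=0j, r=1.0, n=4000):
    # f(a) = 1/(2 pi i) * integral over S_r(center) of f(z)/(z - a) dz,
    # approximated by an equispaced Riemann sum (very accurate for
    # periodic integrands).
    h = 2 * math.pi / n
    total = 0j
    for k in range(n):
        z = center + r * cmath.exp(1j * k * h)
        dz = 1j * r * cmath.exp(1j * k * h)
        total += f(z) / (z - a) * dz * h
    return total / (2j * math.pi)

a = 0.3 + 0.1j
approx = cauchy_value(cmath.exp, a)
```

Only values of $\exp$ on the unit circle enter, yet $\exp(a)$ is recovered to high precision.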
Example 14.6 Evaluate
$$I_r := \int_{S_r(a)} \frac{\sin z}{z^2 + 1}\, dz$$
in the cases $a = 1 + i$ and $r = \frac{1}{2}, 2, 3$.
Solution. We use the partial fraction decomposition of $1/(z^2 + 1)$ to obtain linear terms in the denominator:
$$\frac{1}{z^2 + 1} = \frac{1}{2i} \left( \frac{1}{z - i} - \frac{1}{z + i} \right).$$
Hence, with $f(z) = \sin z$ we have in case $r = 3$ (both $i$ and $-i$ lie inside $S_3(a)$)
$$I_3 = \frac{1}{2i} \int_{S_r(a)} \frac{\sin z}{z - i}\, dz - \frac{1}{2i} \int_{S_r(a)} \frac{\sin z}{z + i}\, dz = \frac{1}{2i} \left( 2\pi i \sin i - 2\pi i \sin(-i) \right) = 2\pi \sin i.$$
For $r = 2$ only $i$ lies inside the circle, so $I_2 = \pi \sin i$; for $r = \frac{1}{2}$ neither singularity lies inside, so $I_{1/2} = 0$ by Cauchy's theorem.
To evaluate the Fresnel integrals we integrate $e^{i z^2}$ along the boundary of the circular sector with vertices $0$, $R$, and $R e^{i\pi/4}$: the real segment contributes $I_1(R) = \int_0^R e^{i t^2}\, dt$, the arc contributes $I_2(R)$, and on the ray $\gamma_3(t) = e^{i\pi/4} t$, $t \in [0, R]$, we have $e^{i z^2} = e^{-t^2}$, so
$$I_3(R) = e^{i\pi/4} \int_0^R e^{-t^2}\, dt.$$
Note that $\sin t$ is a concave function on $[0, \pi/2]$, that is, the graph of the sine function is above the graph of the corresponding linear function through $(0, 0)$ and $(\pi/2, 1)$; thus, $\sin t \ge 2t/\pi$, $t \in [0, \pi/2]$. We have
$$| I_2(R) | \le R \int_0^{\pi/4} e^{-R^2 \sin 2t}\, dt \le R \int_0^{\pi/4} e^{-4 R^2 t/\pi}\, dt = \frac{\pi}{4R} \left( 1 - e^{-R^2} \right) \le \frac{\pi}{4R}.$$
We conclude that $| I_2(R) |$ tends to $0$ as $R \to \infty$. Since by Cauchy's Theorem $I_1 + I_2 - I_3 = 0$ for all $R$, we conclude
$$\lim_{R \to \infty} I_1(R) = \int_0^{\infty} e^{i t^2}\, dt = e^{i\pi/4} \int_0^{\infty} e^{-t^2}\, dt = \lim_{R \to \infty} I_3(R).$$
Comparing real and imaginary parts,
$$\int_0^{\infty} \cos(t^2)\, dt = \int_0^{\infty} \sin(t^2)\, dt = \frac{\sqrt{2\pi}}{4}.$$
These are the so-called Fresnel integrals. We show that $I = \int_0^{\infty} e^{-x^2}\, dx = \sqrt{\pi}/2$. (This was already done in Homework 41.) For this, we compute the double integral using Fubini's theorem:
$$\iint_{(\mathbb{R}_+)^2} e^{-x^2 - y^2}\, dx\, dy = \int_0^{\infty} e^{-x^2}\, dx \int_0^{\infty} e^{-y^2}\, dy = I^2.$$
On the other hand, in polar coordinates,
$$I^2 = \int_0^{\pi/2} d\varphi \int_0^{\infty} e^{-r^2}\, r\, dr = \frac{\pi}{2} \cdot \frac{1}{2} \int_0^{\infty} e^{-t}\, dt = \frac{\pi}{4} \implies I = \frac{\sqrt{\pi}}{2}.$$
This proves the claim. In addition, the change of variables $x = \sqrt{s}$ also yields
$$\int_0^{\infty} \frac{e^{-x}}{\sqrt{x}}\, dx = \sqrt{\pi}.$$
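The value $\int_0^\infty e^{-x^2}\,dx = \sqrt{\pi}/2$ used above is easy to check numerically. The sketch below (my own check) uses a midpoint Riemann sum on $[0, 10]$; the neglected tail is below $e^{-100}$.

```python
import math

# Midpoint rule on [0, 10]; the tail beyond 10 is negligible.
n = 100000
h = 10.0 / n
I = sum(math.exp(-((k + 0.5) * h) ** 2) for k in range(n)) * h
```

The result agrees with $\sqrt{\pi}/2 \approx 0.8862$ to many digits.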
Then $h$ is holomorphic on the complement of $\gamma$ in $U$ and has derivatives of all orders. They are given by
$$h^{(n)}(a) = n! \int_\gamma \frac{g(z)}{(z - a)^{n+1}}\, dz.$$
For $a$ near $b$ and $z$ on $\gamma$, the geometric series gives
$$\frac{1}{z - a} = \frac{1}{z - b - (a - b)} = \frac{1}{z - b} \cdot \frac{1}{1 - \frac{a - b}{z - b}} = \frac{1}{z - b} \left( 1 + \frac{a - b}{z - b} + \left( \frac{a - b}{z - b} \right)^2 + \cdots \right).$$
Since $g$ is continuous and $\gamma$ is a compact set, $g(z)$ is bounded on $\gamma$, such that by Theorem 6.6 the series $\sum_{n=0}^{\infty} g(z) \frac{(a - b)^n}{(z - b)^{n+1}}$ can be integrated term by term, and we find
$$h(a) = \int_\gamma \sum_{n=0}^{\infty} g(z) \frac{(a - b)^n}{(z - b)^{n+1}}\, dz = \sum_{n=0}^{\infty} \left( \int_\gamma \frac{g(z)}{(z - b)^{n+1}}\, dz \right) (a - b)^n = \sum_{n=0}^{\infty} c_n (a - b)^n,$$
where
$$c_n = \int_\gamma \frac{g(z)\, dz}{(z - b)^{n+1}}.$$
This proves that $h$ can be expanded into a power series in a neighborhood of $b$. By Proposition 14.1 and Corollary 14.2, $h$ has derivatives of all orders in a neighborhood of $b$. By the coefficient formula,
$$h^{(n)}(b) = n!\, c_n = n! \int_\gamma \frac{g(z)\, dz}{(z - b)^{n+1}}.$$
Remark 14.6 There is an easy way to deduce the formula. Formally, we can exchange the differentiation $\frac{d}{da}$ and $\int_\gamma$:
$$h'(a) = \frac{d}{da} \int_\gamma g(z)(z - a)^{-1}\, dz = \int_\gamma \frac{d}{da} \frac{g(z)}{z - a}\, dz = \int_\gamma \frac{g(z)}{(z - a)^2}\, dz,$$
$$h''(a) = \frac{d}{da} \int_\gamma g(z)(z - a)^{-2}\, dz = 2 \int_\gamma \frac{g(z)}{(z - a)^3}\, dz.$$
Theorem 14.10 Suppose that $f$ is holomorphic in $U$ and $\overline{U_r(a)} \subseteq U$. Then $f$ has a power series expansion in $U_r(a)$,
$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n.$$
In particular, $f$ has derivatives of all orders, and we have the following coefficient formula:
$$c_n = \frac{f^{(n)}(a)}{n!} = \frac{1}{2\pi i} \int_{S_r(a)} \frac{f(z)\, dz}{(z - a)^{n+1}}. \tag{14.10}$$
Inserting $g(z) = f(z)/(2\pi i)$ ($f$ is continuous) into Theorem 14.9, we see that $f$ can be expanded into a power series with center $a$ and, therefore, it has derivatives of all orders at $a$,
$$f^{(n)}(a) = \frac{n!}{2\pi i} \int_{S_R(a)} \frac{f(z)\, dz}{(z - a)^{n+1}}.$$
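The coefficient formula (14.10) can be tested numerically. The sketch below (my own check, with the arbitrary choice $f = \exp$, $a = 0$, $r = 1$) approximates $c_n$ by a Riemann sum and compares it with $1/n!$.

```python
import cmath, math

def coeff(f, n, r=1.0, m=512):
    # c_n = 1/(2 pi i) * int_{S_r(0)} f(z)/z^{n+1} dz
    #     = 1/(2 pi r^n) * int_0^{2 pi} f(r e^{it}) e^{-i n t} dt.
    h = 2 * math.pi / m
    s = sum(f(r * cmath.exp(1j * k * h)) * cmath.exp(-1j * n * k * h)
            for k in range(m))
    return s * h / (2 * math.pi * r ** n)

cs = [coeff(cmath.exp, n) for n in range(6)]
```

For $f = \exp$ the Taylor coefficients $1/n!$ are recovered essentially to machine precision.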
Proof. By the coefficient formula (14.10) and Remark 14.3 (c) we have, noting that $\left| \frac{f(z)}{(z - a)^{n+1}} \right| \le \frac{M}{r^{n+1}}$ for $z \in S_r(a)$,
$$| c_n | \le \frac{1}{2\pi} \cdot \frac{M}{r^{n+1}} \cdot \ell(S_r(a)) = \frac{1}{2\pi} \cdot \frac{M}{r^{n+1}} \cdot 2\pi r = \frac{M}{r^n}$$
for all $r > 0$. This shows $c_n = 0$ for all $n \neq 0$; hence $f(z) = c_0$ is constant.
Remarks 14.7 (a) Note that we explicitly assume $f$ to be holomorphic on the entire complex plane. For example, $f(z) = e^{1/z}$ is holomorphic and bounded outside every ball $U_\varepsilon(0)$. However, $f$ is not constant.
(b) Note that $f(z) = \sin z$ is an entire function which is not constant. Hence, $\sin z$ is unbounded as a complex function.
Theorem 14.13 (Fundamental Theorem of Algebra) A polynomial $p(z)$ with complex coefficients of degree $\deg p \ge 1$ has a complex root.
Proof. Suppose to the contrary that $p(z) \neq 0$ for all $z \in \mathbb{C}$. It is known, see Example 3.3, that $\lim_{| z | \to \infty} | p(z) | = +\infty$. In particular, there exists $R > 0$ such that
$$| z | \ge R \implies | p(z) | \ge 1.$$
That is, $f(z) = 1/p(z)$ is bounded by $1$ if $| z | \ge R$. On the other hand, $f$ is a continuous function and $\{z \mid | z | \le R\}$ is a compact subset of $\mathbb{C}$. Hence, $f(z) = 1/p(z)$ is bounded on $\overline{U_R}$, too. That is, $f$ is bounded on the entire plane. By Liouville's theorem, $f$ is constant and so is $p$. This contradicts our assumption $\deg p \ge 1$. Hence, $p$ has a root in $\mathbb{C}$.
Now, there is an inverse-like statement to Cauchys Theorem.
Theorem 14.14 (Moreras Theorem) Let f : U C be a continuous function where U C
is open. Suppose that the integral of f along each closed triangular path [z1 , z2 , z3 ] in U is 0.
Then f is holomorphic in U.
Fix $z_0 \in U$ and define $F(a) = \int_{z_0}^{a} f(z)\, dz$ along a polygonal path from $z_0$ to $a$. Note that $F(a)$ takes the same value for all polygonal paths from $z_0$ to $a$ by assumption of the theorem. We have
$$\left| \frac{F(a + h) - F(a)}{h} - f(a) \right| = \left| \frac{1}{h} \int_a^{a+h} (f(z) - f(a))\, dz \right| \le \frac{1}{| h |} \sup_{z \in U_{|h|}(a)} | f(z) - f(a) | \cdot | h | = \sup_{z \in U_{|h|}(a)} | f(z) - f(a) |.$$
Since $f$ is continuous, the above term tends to $0$ as $h$ tends to $0$. This shows that $F$ is differentiable at $a$ with $F'(a) = f(a)$. Since $F$ is holomorphic in $U$, by Theorem 14.10 it has derivatives of all orders; in particular $f$ is holomorphic.
$$\int_\Delta f(z)\, dz = \lim_{n \to \infty} \int_\Delta f_n(z)\, dz = \lim_{n \to \infty} 0 = 0.$$
Summary
Let $U$ be a region and $f \colon U \to \mathbb{C}$ be a function on $U$. The following are equivalent:
(a) $f$ is holomorphic in $U$.
(b) $f = u + i v$ is real differentiable and the Cauchy–Riemann equations $u_x = v_y$ and $u_y = -v_x$ are satisfied in $U$.
(c) If $U$ is simply connected: $f$ is continuous and for every closed triangular path $\gamma = \partial[z_1, z_2, z_3]$ in $U$, $\int_\gamma f(z)\, dz = 0$ (Morera condition).
(d) $f$ possesses locally an antiderivative, that is, for every $a \in U$ there is a ball $U_\varepsilon(a) \subseteq U$ and a holomorphic function $F$ such that $F'(z) = f(z)$ for all $z \in U_\varepsilon(a)$.
(e) $f$ is continuous and for every ball $U_r(a)$ with $\overline{U_r(a)} \subseteq U$ we have
$$f(b) = \frac{1}{2\pi i} \int_{S_r(a)} \frac{f(z)}{z - b}\, dz, \qquad b \in U_r(a).$$
(f) For every $a \in U$ there exists a ball with center $a$ such that $f$ can be expanded in that ball into a power series.
(g) For every ball $B$ which is completely contained in $U$, $f$ can be expanded into a power series in $B$.
Power series can be multiplied like polynomials (Cauchy product):
$$\left( \sum_{n=0}^{\infty} c_n z^n \right) \left( \sum_{n=0}^{\infty} b_n z^n \right) = \sum_{n=0}^{\infty} d_n z^n, \qquad | z | < r,$$
where $d_n = \sum_{k=0}^{n} c_{n-k} b_k$.
Suppose $f(z) = \sum_{n=0}^{\infty} c_n z^n$ with $c_0 \neq 0$. Then $f(0) = c_0 \neq 0$ and, by continuity of $f$, there exists $r > 0$ such that the power series converges in the ball $U_r(0)$ and is non-zero there. Hence, $1/f(z)$ is holomorphic in $U_r(0)$ and therefore it can be expanded into a converging power series in $U_r(0)$, see summary (f). Suppose $1/f(z) = \sum_{n=0}^{\infty} b_n z^n$. Comparing coefficients in $f(z) \cdot \bigl( 1/f(z) \bigr) = 1$ gives
$$1 = c_0 b_0, \qquad 0 = c_0 b_1 + c_1 b_0, \qquad 0 = c_0 b_2 + c_1 b_1 + c_2 b_0, \quad \dots$$
This system of equations can be solved recursively for $b_n$, $n \in \mathbb{N}_0$; for example, $b_0 = 1/c_0$, $b_1 = -c_1 b_0 / c_0$.
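The recursion above is immediate to implement. The sketch below (my own illustration) solves it for the first few coefficients and checks two cases where the answer is known: $1/(1-z) = \sum z^n$ and $1/e^z = e^{-z}$ with coefficients $(-1)^n/n!$.

```python
from math import factorial

def reciprocal_coeffs(c, N):
    # Solve 1 = c0*b0 and 0 = sum_{k=0..n} c_k b_{n-k} (n >= 1) recursively.
    b = [1 / c[0]]
    for n in range(1, N):
        s = sum(c[k] * b[n - k] for k in range(1, min(n, len(c) - 1) + 1))
        b.append(-s / c[0])
    return b

# 1/(1 - z) = 1 + z + z^2 + ...
b_geom = reciprocal_coeffs([1.0, -1.0], 6)
# 1/e^z = e^{-z}: coefficients (-1)^n / n!
b_exp = reciprocal_coeffs([1 / factorial(n) for n in range(10)], 6)
```

Both known coefficient sequences are reproduced exactly (up to floating point).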
(d) Double Series
Suppose that
$$f_k(z) = \sum_{n=0}^{\infty} c_{kn} (z - a)^n, \qquad k \in \mathbb{N},$$
are power series converging in $U_r(a)$. Suppose further that the series $\sum_{k=1}^{\infty} f_k(z)$ converges locally uniformly in $U_r(a)$. Then
$$\sum_{k=1}^{\infty} f_k(z) = \sum_{n=0}^{\infty} \left( \sum_{k=1}^{\infty} c_{kn} \right) (z - a)^n.$$
In particular, one can form the sum of a locally uniformly convergent series $\sum f_k(z)$ of power series coefficientwise. Note that a series of functions $\sum_{k=1}^{\infty} f_k(z)$ converges locally uniformly at $b$ if there exists $\varepsilon > 0$ such that the series converges uniformly in $U_\varepsilon(b)$.
Note that any locally uniformly converging series of holomorphic functions defines a holomorphic function (Theorem of Weierstraß). Indeed, since the series converges uniformly, line integral and summation can be exchanged: let $\gamma = \partial[z_0, z_1, z_2]$ be any closed triangular path inside $U$; then by Cauchy's theorem
$$\int_\gamma f(z)\, dz = \int_\gamma \sum_{k=1}^{\infty} f_k(z)\, dz = \sum_{k=1}^{\infty} \int_\gamma f_k(z)\, dz = \sum_{k=1}^{\infty} 0 = 0.$$
By Morera's theorem, $f(z) = \sum_{k=1}^{\infty} f_k(z)$ is holomorphic.
(e) Rearrangement: inside its disc of convergence, a power series can be rearranged around any new center $b$:
$$f(z) = \sum_{n=0}^{\infty} b_n (z - b)^n, \qquad b_n = \frac{f^{(n)}(b)}{n!}.$$
(f) Composition
We restrict ourselves to the case
$$f(z) = a_0 + a_1 z + a_2 z^2 + \cdots, \qquad g(z) = b_1 z + b_2 z^2 + \cdots,$$
where $g(0) = 0$ and therefore the image under $g$ of a small neighborhood of $0$ is a small neighborhood of $0$, and we assume that the first power series $f$ is defined there; thus, $f(g(z))$ is defined and holomorphic in a certain neighborhood of $0$, see Remark 14.1. Hence
$$h(z) = f(g(z)) = c_0 + c_1 z + c_2 z^2 + \cdots,$$
where the coefficients $c_n = h^{(n)}(0)/n!$ can be computed using the chain rule; for example, $c_0 = f(g(0)) = a_0$, $c_1 = f'(g(0))\, g'(0) = a_1 b_1$.
(g) The Composition Inverse $f^{-1}$
Suppose that $f(z) = \sum_{n=1}^{\infty} a_n z^n$, $a_1 \neq 0$, has radius of convergence $r > 0$. Then there exists a power series $g(z) = \sum_{n=1}^{\infty} b_n z^n$ converging on $U_\varepsilon(0)$ such that $f(g(z)) = z = g(f(z))$ for all $z \in U_\varepsilon(0)$. Using (f) and the uniqueness, the coefficients $b_n$ can be computed recursively.
Example 14.8 (a) The function
$$f(z) = \frac{1}{1 + z^2} + \frac{1}{3 - z}$$
is holomorphic in $\mathbb{C} \setminus \{i, -i, 3\}$. Expanding $f$ into a power series with center $\frac{1}{2}$, the closest singularity to $\frac{1}{2}$ is $i$. Since the disc of convergence cannot contain $i$, the radius of convergence is
$$r = \left| \tfrac{1}{2} - i \right| = \sqrt{1 + \tfrac{1}{4}} \cdot \tfrac{1}{\,1\,} = \frac{\sqrt{5}}{2}.$$
Similarly, consider the geometric series $1/(1 - z) = 1 + z + z^2 + \cdots$ and a center $b \in U_1(0)$:
$$\frac{1}{1 - z} = \frac{1}{1 - b - (z - b)} = \frac{1}{1 - b} \cdot \frac{1}{1 - \frac{z - b}{1 - b}} = \sum_{n=0}^{\infty} \frac{1}{(1 - b)^{n+1}} (z - b)^n = \tilde f(z).$$
Note that the power series $1 + z + z^2 + \cdots$ has radius of convergence $1$ and a priori defines an analytic (= holomorphic) function in the open unit ball. However, changing the center we obtain an analytic continuation of $f$ to a larger region. This example shows that (under certain assumptions) analytic functions can be extended into a larger region by changing the center of the series.
383
which contradicts the mean value property. Hence, | f (z) | = M is constant in any sufficiently
small neighborhood of 0. Let z1 U be any point in U. We connect 0 and z1 by a path in
U. Let d be its distance from the boundary U. Let z continuously moving from 0 to z1 and
cosider the chain of balls with center z and radius d/2. By the above, | f (z) | = M in any such
ball, hence | f (z) | = M in U. It follows from homework 47.2 that f is constant.
Remark 14.8 In other words, if f is holomorphic in G and U G, then sup | f (z) | is attained
zU
on the boundary U. Note that both theorems are not true in the real setting: The image of the
sine function of the open set (0, 2) is [1, 1] which is not open. The maximum of f (x) = 1x2
over (1, 1) is not attained on the boundary since f (1) = f (1) = 0 while f (0) = 1. However
| z 2 1 | on the complex unit ball attains its maximum in z = ion the boundary.
Recall from topology that $a$ is an accumulation point of a set $M$ if every neighborhood of $a$ contains a point of $M$ different from $a$. Write
$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n, \qquad | z - a | < r.$$
Since $a$ is an accumulation point of the zeros, there exists a sequence $(z_n)$ of zeros converging to $a$. Since $f$ is continuous at $a$, $\lim_{n \to \infty} f(z_n) = 0 = f(a)$. This shows $c_0 = 0$. The same argument works with the function
$$f_1(z) = \frac{f(z)}{z - a} = c_1 + c_2 (z - a) + c_3 (z - a)^2 + \cdots,$$
which is holomorphic in the same ball with center $a$ and has $a$ as an accumulation point of zeros. Hence, $c_1 = 0$. In the same way we conclude that $c_2 = c_3 = \cdots = c_n = \cdots = 0$. This shows that $f$ is identically $0$ on $U_r(a)$. That is, the set of zeros of $f$ contains a ball around $a$.
Theorem 14.19 (Uniqueness Theorem) Suppose that $f$ and $g$ are both holomorphic functions on $U$ and $U$ is a region. Then the following are equivalent:
(a) $f = g$.
(b) The set $D = \{z \in U \mid f(z) = g(z)\}$ where $f$ and $g$ are equal has an accumulation point in $U$.
(c) There exists $z_0 \in U$ such that $f^{(n)}(z_0) = g^{(n)}(z_0)$ for all non-negative integers $n \in \mathbb{N}_0$.
Proof. (b) $\Rightarrow$ (a). Apply the previous proposition to the function $f - g$.
(a) implies (c) trivially. Suppose that (c) is satisfied. Then the power series expansion of $f - g$ at $z_0$ is identically $0$. In particular, the set $Z(f - g)$ contains a ball $B_\varepsilon(z_0)$, which has an accumulation point. Hence, $f - g = 0$.
The following proposition is an immediate consequence of the uniqueness theorem.
Proposition 14.20 (Uniqueness of Analytic Continuation) Suppose that $M \subseteq U \subseteq \mathbb{C}$ where $U$ is a region and $M$ has an accumulation point in $U$. Let $g$ be a function on $M$ and suppose that $f$ is a holomorphic function on $U$ which extends $g$, that is, $f(z) = g(z)$ on $M$. Then $f$ is unique.
Remarks 14.9 (a) The previous proposition shows a quite amazing property of a holomorphic function: it is completely determined by very few values. This is in striking contrast to $C^\infty$-functions on the real line. For example, the hat function
$$h(x) = \begin{cases} e^{-\frac{1}{1 - x^2}}, & | x | < 1, \\ 0, & | x | \ge 1, \end{cases}$$
is identically $0$ on $[2, 3]$ (a set with accumulation points); however, $h$ is not identically $0$. This shows that $h$ is not holomorphic.
(b) For the uniqueness theorem, it is an essential point that $U$ is connected.
(c) It is now clear that the real functions $e^x$, $\sin x$, and $\cos x$ have a unique analytic continuation into the complex plane.
(d) The algebra $\mathcal{O}(U)$ of holomorphic functions on a region $U$ is a domain, that is, $f g = 0$ implies $f = 0$ or $g = 0$. Indeed, suppose that $f(z_0) \neq 0$; then $f(z) \neq 0$ in a certain neighborhood of $z_0$ (by continuity of $f$). Then $g = 0$ on that neighborhood. Since an open set always has an accumulation point in itself, $g = 0$.
14.4 Singularities
We consider functions which are holomorphic in a punctured ball $\dot U_r(a)$. From information about the behaviour of the function near the center $a$, a number of interesting and useful results will be derived. In particular, we will use these results to evaluate certain improper integrals over the real line which cannot be evaluated by methods of real calculus.
$$\frac{h(z)}{z^2} = c_2 + c_3 z + c_4 z^2 + \cdots.$$
The right side defines a holomorphic function in a neighborhood of $0$ which coincides with $f$ for $z \neq 0$. The setting $f(0) = c_2$ removes the singularity at $0$.
Definition 14.5 (a) An isolated singularity $a$ of $f$ is called a pole of $f$ if there exist a positive integer $m \in \mathbb{N}$ and a holomorphic function $g \colon U_r(a) \to \mathbb{C}$ such that
$$f(z) = \frac{g(z)}{(z - a)^m}.$$
The smallest number $m$ such that $(z - a)^m f(z)$ has a removable singularity at $a$ is called the order of the pole.
(b) An isolated singularity $a$ of $f$ which is neither removable nor a pole is called an essential singularity.
(c) If $f$ is holomorphic at $a$ and there exist a positive integer $m$ and a holomorphic function $g$ such that $f(z) = (z - a)^m g(z)$ and $g(a) \neq 0$, then $a$ is called a zero of order $m$ of $f$.
Note that $m = 0$ corresponds to removable singularities. If $f(z)$ has a zero of order $m$ at $a$, then $1/f(z)$ has a pole of order $m$ at $a$, and vice versa.
Example 14.11 The function $f(z) = 1/z^2$ has a pole of order $2$ at $z = 0$ since $z^2 f(z) = 1$ has a removable singularity at $0$ whereas $z f(z) = 1/z$ does not. The function $f(z) = (\cos z - 1)/z^3$ has a pole of order $1$ at $0$ since $(\cos z - 1)/z^3 = -\frac{1}{2z} + \frac{z}{4!} - \cdots$.
A Laurent series is a series of the form
$$f(z) = \sum_{n=-\infty}^{\infty} c_n (z - a)^n$$
with principal part and regular part
$$f_-(z) = \sum_{n=1}^{\infty} c_{-n} (z - a)^{-n} \quad\text{and}\quad f_+(z) = \sum_{n=0}^{\infty} c_n (z - a)^n.$$
[Figure: the annulus with inner radius $r$ and outer radius $R$ around $a$; $f_-$ converges outside $\overline{U_r(a)}$, $f_+$ converges inside $U_R(a)$.]
Remark 14.10 (a) $f_-(z)$ is a power series in $\frac{1}{z - a}$. Thus, we can derive facts about the convergence of Laurent series from the convergence of power series. In fact, suppose that $1/r$ is the radius of convergence of the power series $\sum_{n=1}^{\infty} c_{-n} w^n$ and $R$ is the radius of convergence of the series $\sum_{n=0}^{\infty} c_n z^n$; then the Laurent series $\sum_{n \in \mathbb{Z}} c_n z^n$ converges in the annulus $A_{r,R} = \{z \mid r < | z | < R\}$ and defines there a holomorphic function.
The power series $f_+(z) = \sum_{n \ge 0} c_n (z - a)^n$ converges in the inner part of the ball $U_R(a)$, whereas the series with negative powers, called the principal part of the Laurent series, $f_-(z) = \sum_{n < 0} c_n (z - a)^n$, converges in the exterior of the ball $U_r(a)$. Since both series must converge, $f(z)$ converges in the intersection of the two domains, which is the annulus $A_{r,R}(a)$.
The easiest way to determine the type of an isolated singularity is to use Laurent series, which are, roughly speaking, power series with both positive and negative powers $(z - a)^n$.
Proposition 14.22 Suppose that $f$ is holomorphic in the open annulus $A_{r,R}(a) = \{z \mid r < | z - a | < R\}$. Then $f(z)$ has an expansion in a convergent Laurent series for $z \in A_{r,R}(a)$,
$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n + \sum_{n=1}^{\infty} c_{-n} \frac{1}{(z - a)^n}, \tag{14.12}$$
with coefficients
$$c_n = \frac{1}{2\pi i} \int_{S_\rho(a)} \frac{f(z)}{(z - a)^{n+1}}\, dz, \qquad n \in \mathbb{Z}, \tag{14.13}$$
where $r < \rho < R$. The series converges uniformly on every annulus $\overline{A_{s_1,s_2}(a)}$ with $r < s_1 < s_2 < R$.
Proof. Fix $z \in A_{s_1,s_2}(a)$ and write $f(z) = f_1(z) + f_2(z)$ with
$$f_1(z) = \frac{1}{2\pi i} \int_{S_{s_2}(a)} \frac{f(w)}{w - z}\, dw, \qquad f_2(z) = -\frac{1}{2\pi i} \int_{S_{s_1}(a)} \frac{f(w)}{w - z}\, dw,$$
which we treat separately. In what follows, we will see that $f_1(z)$ is a power series $\sum_{n=0}^{\infty} c_n (z - a)^n$ and $f_2(z) = \sum_{n=1}^{\infty} c_{-n} \frac{1}{(z - a)^n}$. The first part is completely analogous to the proof of Theorem 14.9.
Case 1: $w \in S_{s_2}(a)$. Then $| z - a | < | w - a |$ and $| q | = \left| \frac{z - a}{w - a} \right| < 1$, such that
$$\frac{1}{w - z} = \frac{1}{w - a} \cdot \frac{1}{1 - \frac{z - a}{w - a}} = \frac{1}{w - a} \sum_{n=0}^{\infty} q^n = \sum_{n=0}^{\infty} \frac{(z - a)^n}{(w - a)^{n+1}}.$$
Since $f(w)$ is bounded on $S_{s_2}(a)$, the geometric series has a converging numerical upper bound. Hence, the series converges uniformly with respect to $w$; we can exchange integration and summation:
$$f_1(z) = \frac{1}{2\pi i} \int_{S_{s_2}(a)} f(w) \sum_{n=0}^{\infty} \frac{(z - a)^n}{(w - a)^{n+1}}\, dw = \sum_{n=0}^{\infty} \left( \frac{1}{2\pi i} \int_{S_{s_2}(a)} \frac{f(w)\, dw}{(w - a)^{n+1}} \right) (z - a)^n = \sum_{n=0}^{\infty} c_n (z - a)^n,$$
where $c_n = \frac{1}{2\pi i} \int_{S_{s_2}(a)} \frac{f(w)\, dw}{(w - a)^{n+1}}$.
Case 2: $w \in S_{s_1}(a)$. Then $| w - a | < | z - a |$ and, with $q = \frac{w - a}{z - a}$,
$$-\frac{1}{w - z} = \frac{1}{z - w} = \frac{1}{z - a} \cdot \frac{1}{1 - \frac{w - a}{z - a}} = \frac{1}{z - a} \sum_{n=0}^{\infty} q^n = \sum_{n=0}^{\infty} \frac{(w - a)^n}{(z - a)^{n+1}}.$$
Since $f(w)$ is bounded on $S_{s_1}(a)$, the geometric series has a converging numerical upper bound. Hence, the series converges uniformly with respect to $w$; we can exchange integration and summation:
$$f_2(z) = \frac{1}{2\pi i} \int_{S_{s_1}(a)} f(w) \sum_{n=0}^{\infty} \frac{(w - a)^n}{(z - a)^{n+1}}\, dw = \sum_{n=1}^{\infty} \left( \frac{1}{2\pi i} \int_{S_{s_1}(a)} f(w)(w - a)^{n-1}\, dw \right) \frac{1}{(z - a)^n} = \sum_{n=1}^{\infty} c_{-n} (z - a)^{-n},$$
where $c_{-n} = \frac{1}{2\pi i} \int_{S_{s_1}(a)} f(w)(w - a)^{n-1}\, dw$.
Since $\frac{f(w)}{(w - a)^k}$, $k \in \mathbb{Z}$, is holomorphic in both annuli $A_{s_1,\rho}$ and $A_{\rho,s_2}$, by Remark 14.4 we have
$$\int_{S_{s_2}(a)} \frac{f(w)\, dw}{(w - a)^k} = \int_{S_\rho(a)} \frac{f(w)\, dw}{(w - a)^k} \quad\text{and}\quad \int_{S_{s_1}(a)} \frac{f(w)\, dw}{(w - a)^k} = \int_{S_\rho(a)} \frac{f(w)\, dw}{(w - a)^k};$$
that is, in the coefficient formulas we can replace both circles $S_{s_1}(a)$ and $S_{s_2}(a)$ by a common circle $S_\rho(a)$. Since a power series converges uniformly on every compact subset of its disc of convergence, the last assertion follows.
Remark 14.11 The Laurent series of $f$ on $A_{r,R}(a)$ is unique. Its coefficients $c_n$, $n \in \mathbb{Z}$, are uniquely determined by (14.13). Another value of $\rho$ with $r < \rho < R$ yields the same values $c_n$ by Remark 14.4 (c).
Example 14.12 Find the Laurent expansion of $f(z) = \dfrac{1}{z^2 - 4z + 3}$ with midpoint $0$ in the three annuli
$$0 < | z | < 1, \qquad 1 < | z | < 3, \qquad 3 < | z |.$$
Since $f(z) = \frac{1}{2} \left( \frac{1}{1 - z} - \frac{1}{3 - z} \right)$, we find in the case $| z | < 1$
$$\frac{1}{1 - z} = \sum_{n=0}^{\infty} z^n, \qquad \frac{1}{3 - z} = \frac{1}{3} \cdot \frac{1}{1 - \frac{z}{3}} = \frac{1}{3} \sum_{n=0}^{\infty} \left( \frac{z}{3} \right)^n,$$
hence
$$f(z) = \frac{1}{2} \sum_{n=0}^{\infty} \left( 1 - \frac{1}{3^{n+1}} \right) z^n, \qquad | z | < 1.$$
In the case $1 < | z | < 3$,
$$\frac{1}{1 - z} = -\frac{1}{z} \cdot \frac{1}{1 - \frac{1}{z}} = -\sum_{n=0}^{\infty} \frac{1}{z^{n+1}},$$
while the expansion of $\frac{1}{3 - z}$ is as before, such that
$$f(z) = -\frac{1}{2} \left( \sum_{n=0}^{\infty} \frac{1}{z^{n+1}} + \sum_{n=0}^{\infty} \frac{z^n}{3^{n+1}} \right).$$
In the case $| z | > 3$,
$$\frac{1}{z - 3} = \frac{1}{z} \cdot \frac{1}{1 - \frac{3}{z}} = \sum_{n=0}^{\infty} \frac{3^n}{z^{n+1}},$$
such that
$$f(z) = \frac{1}{2} \sum_{n=1}^{\infty} \left( 3^{n-1} - 1 \right) \frac{1}{z^n}.$$
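The coefficient formula (14.13) gives an independent check of the middle-annulus expansion. The sketch below (my own check) evaluates the contour integral for $c_{-1}$ and $c_0$ numerically on the circle $\rho = 2$, where the expansion predicts $c_{-1} = -\frac{1}{2}$ and $c_0 = -\frac{1}{6}$.

```python
import cmath, math

def f(z):
    return 1 / (z * z - 4 * z + 3)

def laurent_coeff(n, rho=2.0, m=4096):
    # c_n = 1/(2 pi i) * int_{S_rho(0)} f(z)/z^{n+1} dz, via Riemann sum
    # with dz = i z dt on z = rho e^{it}.
    h = 2 * math.pi / m
    s = 0j
    for k in range(m):
        z = rho * cmath.exp(1j * k * h)
        s += f(z) * z ** (-n - 1) * (1j * z) * h
    return s / (2j * math.pi)

c_m1 = laurent_coeff(-1)
c_0 = laurent_coeff(0)
```

Both numerically computed coefficients agree with the series found above.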
Suppose, to the contrary, that there exist $w \in \mathbb{C}$ and $\varepsilon, \delta > 0$ such that $| f(z) - w | \ge \varepsilon$ for all $z \in \dot U_\delta(a)$. Then the function
$$g(z) = \frac{1}{f(z) - w}, \qquad z \in \dot U_\delta(a),$$
is holomorphic and bounded near $a$, so it has a removable singularity at $a$. Hence
$$f(z) = \frac{1}{g(z)} + w$$
has a removable singularity at $a$ if $g(a) \neq 0$. If, on the other hand, $g(z)$ has a zero at $a$ of order $m$, that is,
$$g(z) = \sum_{n=m}^{\infty} c_n (z - a)^n, \qquad c_m \neq 0,$$
then the function $(z - a)^m f(z)$ has a removable singularity at $a$. Thus, $f$ has a pole of order $m$ at $a$. Both conclusions contradict our assumption that $f$ has an essential singularity at $a$.
The Laurent expansion establishes an easy classification of the singularity of $f$ at $a$. We summarize the main facts about isolated singularities.
Proposition 14.25 Suppose that $f(z)$ is holomorphic in the punctured disc $\dot U = \dot U_R(a)$ and possesses there the Laurent expansion $f(z) = \sum_{n=-\infty}^{\infty} c_n (z - a)^n$. Then $a$ is removable if and only if $c_n = 0$ for all $n < 0$; a pole of order $m$ if and only if $c_{-m} \neq 0$ and $c_n = 0$ for all $n < -m$; and essential if and only if $c_n \neq 0$ for infinitely many negative $n$.
For example,
$$e^{1/z} = \sum_{n=0}^{\infty} \frac{1}{n!\, z^n}, \qquad | z | > 0,$$
shows that $0$ is an essential singularity of $e^{1/z}$.
14.5 Residues
Throughout, $U \subseteq \mathbb{C}$ is an open connected subset of $\mathbb{C}$.
Definition 14.7 Suppose that $f \colon \dot U_r(a) \to \mathbb{C}$ is holomorphic, $0 < r_1 < r$, and let
$$f(z) = \sum_{n \in \mathbb{Z}} c_n (z - a)^n, \qquad c_n = \frac{1}{2\pi i} \int_{S_{r_1}(a)} \frac{f(z)\, dz}{(z - a)^{n+1}},$$
be the Laurent expansion of $f$ in the annulus $\{z \mid 0 < | z - a | < r\}$. Then the coefficient
$$c_{-1} = \frac{1}{2\pi i} \int_{S_{r_1}(a)} f(z)\, dz$$
is called the residue of $f$ at $a$ and is denoted by $\operatorname{Res}_a f(z)$ or $\operatorname{Res}_a f$.
Proof.
[Figure: the closed path $\gamma$ with the singularities $a_1, \dots, a_m$ inside, each surrounded by a small circle.]
As in Remark 14.4, we can replace $\int_\gamma$ by the sum of integrals over small circles, one around each singularity. As before, we obtain
$$\int_\gamma f(z)\, dz = \sum_{k=1}^{m} \int_{S_\varepsilon(a_k)} f(z)\, dz,$$
where all circles are positively oriented. Applying the definition of the residue, we obtain the assertion.
Remarks 14.13 (a) The residue theorem generalizes Cauchy's Theorem, see Theorem 14.6. Indeed, if $f(z)$ possesses an analytic continuation to the points $a_1, \dots, a_m$, all the residues are zero and therefore $\int_\gamma f(z)\, dz = 0$.
(b) If $g(z)$ is holomorphic in the region $U$, $g(z) = \sum_{n=0}^{\infty} c_n (z - a)^n$, $c_0 = g(a)$, then
$$f(z) = \frac{g(z)}{z - a} = \frac{c_0}{z - a} + c_1 + c_2 (z - a) + \cdots, \qquad z \in U \setminus \{a\},$$
and
$$\int_{S_r(a)} \frac{g(z)}{z - a}\, dz = 2\pi i \operatorname{Res}_a f = 2\pi i\, c_0 = 2\pi i\, g(a).$$
If $f$ has a pole of order $1$ at $a$, then
$$\operatorname{Res}_a f(z) = \lim_{z \to a,\, z \neq a} (z - a) f(z). \tag{14.15}$$
More generally, if $f$ has a pole of order $m$ at $a$,
$$f(z) = \frac{c_{-m}}{(z - a)^m} + \frac{c_{-m+1}}{(z - a)^{m-1}} + \cdots + \frac{c_{-1}}{z - a} + c_0 + c_1 (z - a) + \cdots, \qquad 0 < | z - a | < r, \tag{14.16}$$
and
$$(z - a)^m f(z) = c_{-m} + c_{-m+1} (z - a) + \cdots, \qquad | z - a | < r.$$
Differentiating this $(m - 1)$ times, all terms having coefficients $c_{-m}, c_{-m+1}, \dots, c_{-2}$ vanish and we are left with the power series
$$\frac{d^{m-1}}{dz^{m-1}} \left( (z - a)^m f(z) \right) = (m - 1)!\, c_{-1} + m(m - 1) \cdots 2\; c_0 (z - a) + \cdots.$$
Inserting $z = a$ on the left, we obtain $c_{-1}$. However, on the left we have to take the limit $z \to a$ since $f$ is not defined at $a$. Thus, if $f$ has a pole of order $m$ at $a$,
$$\operatorname{Res}_a f(z) = \frac{1}{(m - 1)!} \lim_{z \to a} \frac{d^{m-1}}{dz^{m-1}} \left( (z - a)^m f(z) \right). \tag{14.17}$$
If $f = \frac{p}{q}$ where $p$ and $q$ are holomorphic at $a$, $p(a) \neq 0$, and $q$ has a simple zero at $a$, then
$$\operatorname{Res}_a \frac{p(z)}{q(z)} = \lim_{z \to a} \frac{p(z)}{\frac{q(z) - q(a)}{z - a}} = \frac{\lim_{z \to a} p(z)}{\lim_{z \to a} \frac{q(z) - q(a)}{z - a}} = \frac{p(a)}{q'(a)}. \tag{14.18}$$
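Formula (14.18) is one line of code. The sketch below (my own check, with the arbitrary example $1/(1 + z^4)$ at $a = e^{i\pi/4}$) compares it with the simple-pole limit (14.15).

```python
import cmath

def residue_simple(p, dq, a):
    # Res_a p/q = p(a) / q'(a) for a simple zero a of q, formula (14.18).
    return p(a) / dq(a)

a = cmath.exp(1j * cmath.pi / 4)
res = residue_simple(lambda z: 1, lambda z: 4 * z ** 3, a)

# Cross-check with the limit (z - a) f(z) for z close to a, formula (14.15).
z = a + 1e-6
limit_approx = (z - a) / (1 + z ** 4)
```

Since $a^4 = -1$, one has $1/(4a^3) = -a/4$, which both computations reproduce.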
Example 14.14 Compute $\int_{S_1(i)} \frac{dz}{1 + z^4}$. The only singularities of $f(z) = 1/(1 + z^4)$ inside the disc $\{z \mid | z - i | < 1\}$ are $a_1 = e^{i\pi/4}$ and $a_2 = e^{3i\pi/4}$.
[Figure: the circle $S_1(i)$ in the complex plane with the poles $a_1$ and $a_2$ inside.]
By the residue theorem and (14.18),
$$\int_{S_1(i)} \frac{dz}{1 + z^4} = 2\pi i \left( \operatorname{Res}_{a_1} f + \operatorname{Res}_{a_2} f \right) = 2\pi i \left( \frac{1}{4 a_1^3} + \frac{1}{4 a_2^3} \right) = 2\pi i \left( -\frac{a_1 + a_2}{4} \right) = 2\pi i \cdot \frac{-\sqrt{2}\, i}{4} = \frac{\pi}{\sqrt{2}},$$
using $a_k^4 = -1$, so that $1/a_k^3 = -a_k$.
Rational functions of $\cos t$ and $\sin t$ can be integrated by residues. Let $R(x, y)$ be a rational function without poles on the unit circle and put
$$f(z) = \frac{1}{i z}\, R\!\left( \frac{1}{2} \left( z + \frac{1}{z} \right), \frac{1}{2i} \left( z - \frac{1}{z} \right) \right).$$
Then
$$\int_0^{2\pi} R(\cos t, \sin t)\, dt = 2\pi i \sum_{a \in U_1(0)} \operatorname{Res}_a f,$$
where the sum is over all isolated singularities of $f(z)$ in the open unit ball.
Proof. By the residue theorem,
$$\int_{S_1(0)} f(z)\, dz = 2\pi i \sum_{a \in U_1(0)} \operatorname{Res}_a f.$$
Let $z = e^{i t}$ for $t \in [0, 2\pi]$. Rewriting the integral on the left using $dz = e^{i t} i\, dt = i z\, dt$,
$$\int_{S_1(0)} f(z)\, dz = \int_0^{2\pi} R(\cos t, \sin t)\, dt.$$
Example. For $| a | < 1$,
$$\int_0^{2\pi} \frac{dt}{1 - 2a \cos t + a^2} = \frac{2\pi}{1 - a^2}.$$
For $a = 0$, the statement is trivially true; suppose now $a \neq 0$. Indeed, the complex function corresponding to the integrand is
$$f(z) = \frac{1}{i z \left( 1 + a^2 - a z - a/z \right)} = \frac{1}{i \left( -a z^2 - a + (1 + a^2) z \right)} = \frac{i/a}{(z - a)\left( z - \frac{1}{a} \right)}.$$
In the unit disc, $f(z)$ has exactly one pole of order $1$, namely $z = a$. By (14.15), the formula in Subsection 14.5.1,
$$\operatorname{Res}_a f = \lim_{z \to a} (z - a) f(z) = \frac{i/a}{a - \frac{1}{a}} = \frac{i}{a^2 - 1};$$
the assertion follows from the proposition:
$$\int_0^{2\pi} \frac{dt}{1 - 2a \cos t + a^2} = 2\pi i \cdot \frac{i}{a^2 - 1} = \frac{2\pi}{1 - a^2}.$$
1 a2
f (x) dx
calculate limits
lim
f (x) dx,
(14.20)
which is called the principal value (or Cauchy mean value) of the integral over R and we denote
it by
Z
Vp
f (x) dx.
The existence of the coupled limit (14.20) in general does not imply the existence of the
improper integral
Z
f (x) dx = lim
f (x) dx + lim
f (x) dx.
0
R
R
R
For example, Vp x dx = 0 whereas x dx does not exist since 0 x dx = +. In
general, the existence of the improper integral implies the existence of the principal value. If f
is an even function or f (x) 0, the existence of the principal value implies the existence of the
improper integral.
(b) Rational Functions
The main idea to evaluate the integral $\int_{\mathbb{R}} f(x)\, dx$ is as follows. Let $H = \{z \mid \operatorname{Im}(z) > 0\}$ be the upper half-plane and $f \colon H \setminus \{a_1, \dots, a_m\} \to \mathbb{C}$ be holomorphic. Choose $R > 0$ large enough such that $| a_k | < R$ for all $k = 1, \dots, m$, that is, all isolated singularities of $f$ are in the upper half-plane half-disc of radius $R$ around $0$.
Consider the path which consists of the segment from $-R$ to $R$ on the real line and the half-circle $\Gamma_R$ of radius $R$.
[Figure: the closed semicircular contour in the upper half-plane enclosing the poles $a_1, \dots, a_m$.]
By the residue theorem,
$$\int_{-R}^{R} f(x)\, dx + \int_{\Gamma_R} f(z)\, dz = 2\pi i \sum_{k=1}^{m} \operatorname{Res}_{a_k} f(z).$$
If
$$\lim_{R \to \infty} \int_{\Gamma_R} f(z)\, dz = 0, \tag{14.21}$$
then
$$\mathrm{Vp} \int_{-\infty}^{\infty} f(x)\, dx = 2\pi i \sum_{k=1}^{m} \operatorname{Res}_{a_k} f.$$
Suppose that $f = \frac{p}{q}$ is a rational function such that $q$ has no real zeros and $\deg q \ge \deg p + 2$. Then (14.21) is satisfied. Indeed, since only the two leading terms of $p$ and $q$ determine the limit behaviour of $f(z)$ for $| z | \to \infty$, there exists $C > 0$ with $\left| \frac{p(z)}{q(z)} \right| \le \frac{C}{R^2}$ on $\Gamma_R$. Using the estimate $M \ell(\gamma)$ from Remark 14.3 (c) we get
$$\left| \int_{\Gamma_R} \frac{p(z)}{q(z)}\, dz \right| \le \frac{C}{R^2}\, (\pi R) = \frac{C \pi}{R} \xrightarrow[R \to \infty]{} 0.$$
For the same reason, namely $| p(x)/q(x) | \le C/x^2$ for large $x$, the improper real integral exists (comparison test) and converges absolutely. Thus, we have shown the following proposition.
Proposition 14.28 Suppose that $p$ and $q$ are polynomials with $\deg q \ge \deg p + 2$. Further, $q$ has no real zeros and $a_1, \dots, a_m$ are all poles of the rational function $f(z) = \frac{p(z)}{q(z)}$ in the open upper half-plane $H$. Then
$$\int_{-\infty}^{\infty} f(x)\, dx = 2\pi i \sum_{k=1}^{m} \operatorname{Res}_{a_k} f.$$
Example 14.16 (a) \int_{\mathbb R} \frac{dx}{1+x^2}. The only zero of q(z) = z^2 + 1 in H is a_1 = i, and \deg(1+z^2) = 2 \ge \deg(1) + 2, such that
\[
\int_{-\infty}^{\infty} \frac{dx}{1+x^2} = 2\pi i \operatorname*{Res}_{i}\,\frac{1}{1+z^2} = 2\pi i\,\frac{1}{2z}\Big|_{z=i} = 2\pi i\,\frac{1}{2i} = \pi.
\]
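The values produced by the residue calculus in these examples lend themselves to a quick numerical sanity check. The following sketch (plain Python; the `trapezoid` helper and all truncation parameters are my own choices, not part of the text) approximates two of the integrals of this section:

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule for f on [a, b] with n subintervals."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        s += f(a + k * h)
    return s * h

# Truncating R at +-L leaves a tail error of roughly 2/L for 1/(1+x^2),
# so L = 10^4 keeps it well below the tolerance we check.
I2 = trapezoid(lambda x: 1.0 / (1.0 + x * x), -1e4, 1e4, 1_000_000)

# 1/(1+t^6) decays much faster; [0, 100] is more than enough.
I6 = trapezoid(lambda t: 1.0 / (1.0 + t**6), 0.0, 100.0, 100_000)

print(abs(I2 - math.pi), abs(I6 - math.pi / 3))  # both small
```

The first value should agree with \pi (Example (a)) and the second with \pi/3 (Example (c) below), up to the stated truncation and discretization errors.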
14 Complex Analysis
(b) It follows from Example 14.14 that
\[
\int_{-\infty}^{\infty} \frac{dx}{1+x^4} = 2\pi i\left(\operatorname*{Res}_{a_1} f + \operatorname*{Res}_{a_2} f\right) = \frac{\pi\sqrt{2}}{2}.
\]
(c) We compute the integral
\[
\int_0^{\infty} \frac{dt}{1+t^6} = \frac12 \int_{-\infty}^{\infty} \frac{dt}{1+t^6}.
\]
The poles of f(z) = 1/(1+z^6) in the upper half-plane are a_1 = e^{i\pi/6}, a_2 = i, a_3 = e^{5i\pi/6}. Since each a_k is a simple zero of q(z) = 1+z^6,
\[
\operatorname*{Res}_{a_k} \frac{1}{q(z)} = \frac{1}{q'(a_k)} = \frac{1}{6a_k^5} = -\frac{a_k}{6},
\]
where we used a_k^6 = -1. Hence
\[
\int_{-\infty}^{\infty} \frac{dt}{1+t^6}
= 2\pi i\left(-\frac16\right)\left(e^{i\pi/6} + i + e^{5i\pi/6}\right)
= 2\pi i\left(-\frac{2i}{6}\right) = \frac{2\pi}{3},
\]
so that \int_0^{\infty} \frac{dt}{1+t^6} = \frac{\pi}{3}.
(c) Functions of Type g(z)\,e^{i\alpha z}
Proposition 14.29 Suppose that p and q are polynomials with \deg q \ge \deg p + 1. Further, q has no real zeros and a_1, \dots, a_m are all poles of the rational function g = p/q in the open upper half-plane H. Put f(z) = g(z)\,e^{i\alpha z}, where \alpha > 0. Then
\[
\int_{-\infty}^{\infty} f(x)\,dx = 2\pi i \sum_{k=1}^{m} \operatorname*{Res}_{a_k} f.
\]
Proof. Integrate f over the boundary of the square with vertices -r, r, r+ir, -r+ir. (Figure: the square contour with corners -r, r, r+ir, -r+ir.) By the residue theorem, for large r,
\[
\int_{-r}^{r} f(x)\,dx + I_2 + I_3 + I_4 = 2\pi i \sum_{k=1}^{m} \operatorname*{Res}_{a_k} f,
\]
where I_2, I_3, I_4 denote the integrals over the right, top, and left sides of the square. Since \deg q \ge \deg p + 1, \lim_{z\to\infty} p(z)/q(z) = 0. Thus
\[
s_r = \sup_{|z|\ge r} \left|\frac{p(z)}{q(z)}\right|
\]
exists and tends to 0 as r \to \infty.
Consider the second integral I_2 with z = r + it, t \in [0, r], dz = i\,dt. On this segment we have the estimate
\[
\left|\frac{p(z)}{q(z)}\,e^{i\alpha(r+it)}\right| \le s_r\,e^{-\alpha t},
\]
which implies
\[
|I_2| \le s_r \int_0^{r} e^{-\alpha t}\,dt = \frac{s_r}{\alpha}\bigl(1 - e^{-\alpha r}\bigr) \le \frac{s_r}{\alpha}.
\]
A similar estimate holds for the fourth integral from -r+ir to -r. In case of the third integral one has z = t + ir, t \in [-r, r], dz = dt, such that
\[
|I_3| \le \int_{-r}^{r} s_r\,e^{-\alpha r}\,dt = 2r s_r\,e^{-\alpha r}.
\]
All three integrals tend to 0 as r \to \infty, which proves the claim.
Example. We show
\[
\int_0^{\infty} \frac{\cos t}{t^2 + a^2}\,dt = \frac{\pi}{2a}\,e^{-a}, \qquad a > 0.
\]
Obviously,
\[
\int_0^{\infty} \frac{\cos t}{t^2 + a^2}\,dt = \frac12 \operatorname{Re} \int_{-\infty}^{\infty} \frac{e^{it}}{t^2 + a^2}\,dt.
\]
The function f(z) = \dfrac{e^{iz}}{z^2 + a^2} has a single pole of order 1 in the upper half-plane at z = ai. By Proposition 14.29,
\[
\operatorname*{Res}_{ai}\,\frac{e^{iz}}{z^2 + a^2} = \frac{e^{iz}}{2z}\Big|_{z=ai} = \frac{e^{-a}}{2ai},
\]
so that \int_{-\infty}^{\infty} \frac{e^{it}}{t^2+a^2}\,dt = 2\pi i\,\frac{e^{-a}}{2ai} = \frac{\pi}{a}\,e^{-a}, and the claim follows.
Example. For a \in \mathbb R,
\[
\int_{-\infty}^{\infty} e^{iax}\,e^{-\frac12 x^2}\,dx = \sqrt{2\pi}\,e^{-\frac12 a^2}.
\tag{14.22}
\]
Proof. Let f(z) = e^{-\frac12 z^2} and integrate over the boundary of the rectangle with vertices -R, R, R + ai, -R + ai. Since f is entire, the integral over the closed rectangle is 0.
On the right side, z = R + it, t \in [0, a], and since \bigl|e^{-\frac12(R+it)^2}\bigr| = e^{-\frac12(R^2 - t^2)},
\[
\left|\int_{[R,\,R+ai]} f(z)\,dz\right| = \left|\int_0^{a} e^{-\frac12(R+it)^2}\,i\,dt\right|
\le e^{-\frac12 R^2} \int_0^{a} e^{\frac12 t^2}\,dt = C\,e^{-\frac12 R^2} \xrightarrow{\;R\to\infty\;} 0,
\]
and similarly on the left side.
Letting R \to \infty we obtain
\[
\int_{\mathbb R} f(x)\,dx = \int_{\mathbb R} f(x + ai)\,dx.
\]
Using
\[
\int_{\mathbb R} e^{-\frac12 x^2}\,dx = \sqrt{2\pi},
\]
which follows from Example 14.7 or from homework 41.3, we have
\[
\sqrt{2\pi} = \int_{\mathbb R} e^{-\frac12 x^2}\,dx = \int_{\mathbb R} e^{-\frac12 (x+ai)^2}\,dx
= \int_{\mathbb R} e^{-\frac12 (x^2 + 2aix - a^2)}\,dx
= e^{\frac12 a^2} \int_{\mathbb R} e^{-\frac12 x^2 - iax}\,dx,
\]
hence
\[
\int_{\mathbb R} e^{-\frac12 x^2 - iax}\,dx = \sqrt{2\pi}\,e^{-\frac12 a^2}.
\]
Replacing a by -a gives (14.22).
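Formula (14.22) can likewise be checked numerically. Since the integrand decays like e^{-x^2/2}, truncating at \pm 10 loses essentially nothing. (A sketch; the `trapezoid` helper and the choice a = 1 are my own, not from the text.)

```python
import cmath
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule; works for complex-valued f as well."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        s += f(a + k * h)
    return s * h

a = 1.0
# exp(-x^2/2) is below 2e-22 at |x| = 10, so [-10, 10] suffices.
F = trapezoid(lambda x: cmath.exp(1j * a * x - 0.5 * x * x), -10.0, 10.0, 4000)
expected = math.sqrt(2 * math.pi) * math.exp(-0.5 * a * a)
print(abs(F - expected))  # tiny
```

The imaginary part of the integrand is odd, so it cancels on the symmetric grid; what remains matches \sqrt{2\pi}\,e^{-a^2/2}.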
Chapter 15
Partial Differential Equations: An Introduction
15.1 Classification of PDE
15.1.1 Introduction
There is no general theory known concerning the solvability of all PDE. Such a theory is extremely unlikely to exist, given the rich variety of physical, geometric, and probabilistic phenomena which can be modelled by PDE. Instead, research focuses on various particular PDEs that are important for applications in mathematics and physics.
Definition 15.1 A partial differential equation (abbreviated as PDE) is an equation of the form
\[
F(x, y, \dots, u, u_x, u_y, \dots, u_{xx}, u_{xy}, \dots) = 0
\tag{15.1}
\]
relating an unknown function u(x, y, \dots) and finitely many of its partial derivatives. The PDE is called linear if it can be written as
\[
G(u, u_x, u_y, \dots, u_{xx}, u_{xy}, \dots) = f(x, y, \dots),
\tag{15.2}
\]
where the function f on the right depends only on the variables x, y, \dots and G is linear in all components with coefficients depending on x, y, \dots. More precisely, the formal differential operator L(u) = G(u, u_x, u_y, \dots, u_{xx}, u_{xy}, \dots), which associates to each function u(x, y, \dots) a new function L(u)(x, y, \dots), is a linear operator. The linear PDE (15.2) (L(u) = f) is called homogeneous if f = 0 and inhomogeneous otherwise. For example, \cos(xy^2)\,u_{xxy} - y^2 u_x + \dots
15.1.2 Examples
(1) The Laplace equation in n dimensions for a function u(x_1, \dots, x_n) is the linear second order equation
\[
\Delta u = u_{x_1x_1} + \dots + u_{x_nx_n} = 0.
\]
The solutions u are called harmonic (or potential) functions. In case n = 2 we associate with a harmonic function u(x, y) its conjugate harmonic function v(x, y) such that the first-order system of Cauchy-Riemann equations
\[
u_x = v_y, \qquad u_y = -v_x
\]
is satisfied. A real solution (u, v) gives rise to the analytic function f(z) = u + iv. The Poisson equation is
\[
\Delta u = f.
\]
The Laplace equation models equilibrium states, while the Poisson equation is important in electrostatics. The Laplace and Poisson equations always describe stationary processes (there is no time dependence).
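The relationship between the Cauchy-Riemann system and harmonicity is easy to illustrate numerically. In the sketch below, u and v are the real and imaginary parts of z^3; the finite-difference setup and the sample point are my own illustrative choices:

```python
def u(x, y):
    return x**3 - 3 * x * y**2      # Re (x+iy)^3

def v(x, y):
    return 3 * x**2 * y - y**3      # Im (x+iy)^3

h = 1e-4
x0, y0 = 0.7, -0.3

# central first differences
ux = (u(x0 + h, y0) - u(x0 - h, y0)) / (2 * h)
uy = (u(x0, y0 + h) - u(x0, y0 - h)) / (2 * h)
vx = (v(x0 + h, y0) - v(x0 - h, y0)) / (2 * h)
vy = (v(x0, y0 + h) - v(x0, y0 - h)) / (2 * h)

# Cauchy-Riemann: u_x = v_y and u_y = -v_x; Laplace: u_xx + u_yy = 0
lap = (u(x0 + h, y0) + u(x0 - h, y0)
       + u(x0, y0 + h) + u(x0, y0 - h) - 4 * u(x0, y0)) / h**2
```

Up to finite-difference noise, the Cauchy-Riemann residuals and the discrete Laplacian all vanish.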
(2) The heat equation. Here one coordinate t is distinguished as the time coordinate, while the remaining coordinates x_1, \dots, x_n represent spatial coordinates. We consider
\[
u \colon \mathbb R_+ \times \Omega \to \mathbb R, \qquad \Omega \text{ open in } \mathbb R^n,
\]
where \mathbb R_+ = \{t \in \mathbb R \mid t > 0\} is the positive time axis, and pose the equation
\[
k\,u_t = \Delta u, \qquad \text{where } \Delta u = u_{x_1x_1} + \dots + u_{x_nx_n}.
\]
The heat equation models heat conduction and other diffusion processes.
(3) The wave equation. With the same notations as in (2), here we have the equation
\[
u_{tt} - a^2 \Delta u = 0.
\]
It models wave and oscillation phenomena.
(6) The Maxwell equations:
\[
\operatorname{div} B = 0 \quad \text{(magnetostatic law)}, \qquad
B_t + \operatorname{curl} E = 0 \quad \text{(magnetodynamic law)},
\]
\[
\operatorname{div} E = 4\pi\varrho, \qquad
E_t - \operatorname{curl} B = -4\pi j.
\]
(7) The Navier-Stokes equations for the velocity v(x, t) = (v^1, v^2, v^3) and the pressure p(x, t) of an incompressible fluid of density \varrho and viscosity \mu:
\[
\varrho\Bigl(v^j_t + \sum_{i=1}^{3} v^i\,v^j_{x_i}\Bigr) - \mu\,\Delta v^j = -p_{x_j}, \quad j = 1, 2, 3, \qquad \operatorname{div} v = 0.
\]
(c) A non-linear equation is (5). Naturally, linear equations are simpler than non-linear ones. We shall therefore mostly study linear equations.
(II) The order of the equation. The Cauchy-Riemann equations and the Maxwell equations are linear first order equations. (1), (2), (3), (5), (7), (8) are of second order; (4) is of third order. Equations of higher order rarely occur. The most important PDEs are second order PDEs.
(III) Elliptic, parabolic, hyperbolic. In particular, for second order equations the following classification turns out to be useful. Let x = (x_1, \dots, x_n) and let
\[
F(x, u, u_{x_i}, u_{x_ix_j}) = 0
\]
be a second-order PDE. We introduce auxiliary variables p_i, p_{ij}, i, j = 1, \dots, n, and study the function F(x, u, p_i, p_{ij}). The equation is called elliptic in \Omega if the matrix
\[
\bigl(F_{p_{ij}}(x, u(x), u_{x_i}(x), u_{x_ix_j}(x))\bigr)_{i,j=1,\dots,n}
\]
of the first derivatives of F with respect to the variables p_{ij} is positive definite or negative definite for all x \in \Omega. Note that this may depend on the function u. The Laplace equation is the prominent example of an elliptic equation. Example (5) is elliptic if f(x) > 0.
The equation is called hyperbolic if the above matrix has precisely one negative and n-1 positive eigenvalues (or conversely, depending on the choice of the sign). Example (3) is hyperbolic, and so is (5) if f(x) < 0.
Finally, the equation is parabolic if one eigenvalue of the above matrix is 0 and all the
other eigenvalues have the same sign. More precisely, the equation can be written in the
form
ut = F (t, x, u, uxi , uxixj )
with an elliptic F .
(IV) According to solvability. We consider the second-order PDE F (x, u, uxi , uxixj ) = 0 for
u : R, and wish to impose additional conditions upon the solution u, typically
prescribing the values of u or of certain first derivatives of u on the boundary or part
of it.
Ideally such a boundary problem satisfies the three conditions of Hadamard for a well-posed problem:
(a) Existence of a solution u for the given boundary values;
(b) Uniqueness of the solution;
(c) Stability, meaning continuous dependence on the boundary values.
Consider the equation u_x = f(x). Integration with respect to x yields u = \int f(x)\,dx + g(y) = F(x) + g(y), where F is an antiderivative of f and g is an arbitrary differentiable function of y.
We consider the first-order quasi-linear equation
\[
a(x,y,u)\,u_x + b(x,y,u)\,u_y = c(x,y,u)
\tag{15.3}
\]
and, in particular, the linear equation
\[
a(x,y)\,u_x + b(x,y)\,u_y = c_0(x,y)\,u + c_1(x,y).
\tag{15.4}
\]
We restrict ourselves to the linear equation with an initial condition given as a parametric curve in the xyu-space,
\[
\Gamma = \Gamma(s) = (x_0(s), y_0(s), u_0(s)), \qquad s \in (a, b) \subseteq \mathbb R.
\tag{15.5}
\]
The curve \Gamma will be called the initial curve. The initial condition then reads
\[
u(x_0(s), y_0(s)) = u_0(s), \qquad s \in (a, b).
\tag{15.6}
\]
(Figure: the integral surface as the union of the characteristic curves (x(t,s), y(t,s), u(t,s)) emanating from the initial points \Gamma(s) of the initial curve.)
Recall that (u_x, u_y, -1) is a normal vector to the surface (x, y, u(x, y)); that is, the tangent plane to u at (x_0, y_0, u_0) is
\[
u - u_0 = u_x(x - x_0) + u_y(y - y_0) \iff (x - x_0,\, y - y_0,\, u - u_0)\cdot(u_x, u_y, -1) = 0.
\]
It follows from (15.4) that (a, b, c_0 u + c_1) is a vector in the tangent plane. Finding a curve (x(t), y(t), u(t)) with exactly this tangent vector,
\[
\bigl(a(x(t), y(t)),\; b(x(t), y(t)),\; c_0(x(t), y(t))\,u(t) + c_1(x(t), y(t))\bigr),
\]
is equivalent to solving the ODE system
\[
x'(t) = a(x(t), y(t)), \tag{15.7}
\]
\[
y'(t) = b(x(t), y(t)), \tag{15.8}
\]
\[
u'(t) = c_0(x(t), y(t))\,u(t) + c_1(x(t), y(t)). \tag{15.9}
\]
This system is called the characteristic equations. The solutions are called characteristic curves of the equation. Note that the above system is autonomous, i.e. there is no explicit dependence on the parameter t.
In order to determine characteristic curves we need an initial condition. We shall require the initial point to lie on the initial curve \Gamma(s). Since each curve (x(t), y(t), u(t)) emanates from a different point \Gamma(s), we shall explicitly write the curves in the form (x(t,s), y(t,s), u(t,s)).
The initial conditions are written as
x(0, s) = x0 (s),
y(0, s) = y0 (s),
u(0, s) = u0 (s).
Notice that we selected the parameter t such that the characteristic curve lies on the initial curve at t = 0. Note further that the parametrization (x(t,s), y(t,s), u(t,s)) represents a surface in \mathbb R^3.
The method of characteristics also applies to quasi-linear equations.
To summarize the method: In the first step we identify the initial curve \Gamma. In the second step we select a point s on \Gamma as initial point and solve the characteristic equations using the point we selected on \Gamma as an initial point. After performing these steps for all points on \Gamma, we obtain a portion of the solution surface, also called the integral surface, which consists of the union of the characteristic curves.
Example 15.2 Solve the equation
\[
u_x + u_y = 2
\]
subject to the initial condition u(x, 0) = x^2. The characteristic equations and the parametric initial conditions are
\[
x_t(t,s) = 1, \quad y_t(t,s) = 1, \quad u_t(t,s) = 2,
\]
\[
x(0,s) = s, \quad y(0,s) = 0, \quad u(0,s) = s^2.
\]
Integration gives
\[
x(t,s) = t + f_1(s), \quad y(t,s) = t + f_2(s), \quad u(t,s) = 2t + f_3(s),
\]
and the initial conditions yield
\[
x(t,s) = t + s, \quad y(t,s) = t, \quad u(t,s) = 2t + s^2.
\]
We have obtained a parametric representation of the integral surface. To find an explicit representation we have to invert the transformation (x(t,s), y(t,s)) into the form (t(x,y), s(x,y)); namely, we have to solve for s and t. In the current example we find t = y, s = x - y. Thus the explicit formula for the integral surface is
\[
u(x, y) = 2y + (x - y)^2.
\]
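The explicit solution found by the method of characteristics can be sanity-checked with finite differences (an illustrative sketch; the sample point is arbitrary):

```python
def u(x, y):
    return 2 * y + (x - y)**2    # the solution found above

h = 1e-6
x0, y0 = 1.3, 0.4
ux = (u(x0 + h, y0) - u(x0 - h, y0)) / (2 * h)
uy = (u(x0, y0 + h) - u(x0, y0 - h)) / (2 * h)

residual = ux + uy - 2.0            # the PDE: u_x + u_y = 2
initial_ok = (u(2.0, 0.0) == 4.0)   # the initial condition: u(x, 0) = x^2
```

Both the PDE residual and the initial-condition check come out exact up to rounding, since u is a quadratic polynomial.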
Remark 15.1 (a) This simple example might lead us to think that each initial value problem for a first-order PDE possesses a unique solution. But this is not the case. Is the problem (15.3) together with the initial condition (15.5) well-posed? Under which conditions does there exist a unique integral surface that contains the initial curve?
(b) Notice that even if the PDE is linear, the characteristic equations are non-linear. It follows that one can expect at most a local existence theorem for a first order PDE.
(c) The inversion of the parametric representation of the integral surface might hide further difficulties. Recall that the implicit function theorem implies that the inversion locally exists if the Jacobian \frac{\partial(x,y)}{\partial(t,s)} \ne 0. An explicit computation of the Jacobian at a point s of the initial curve gives
\[
J = \frac{\partial(x,y)}{\partial(t,s)} = \begin{vmatrix} x_t & y_t \\ x_s & y_s \end{vmatrix} = \begin{vmatrix} a & b \\ x_0' & y_0' \end{vmatrix} = a\,y_0' - b\,x_0'.
\]
Thus, the Jacobian vanishes at some point if and only if the vectors (a, b) and (x_0', y_0') are linearly dependent. The geometrical meaning of J = 0 is that the projection of \Gamma into the xy plane is tangent to the projection of the characteristic curve into the xy plane. To ensure a unique solution near the initial curve we must have J \ne 0. This condition is called the transversality condition.
Example 15.3 (Well-posed and Ill-posed Problems) (a) Solve u_x = 1 subject to the initial condition u(0, y) = g(y). The characteristic equations and the initial conditions are given by
xt = 1,
yt (t, s) = 0,
ut (t, s) = 1,
x(0, s) = 0,
y(0, s) = s,
u(0, s) = g(s).
The parametric integral surface is (x(t, s), y(t, s), u(t, s)) = (t, s, t+g(s)) such that the explicit
solution is u(x, y) = x + g(y).
(b) If we keep u_x = 1 but modify the initial condition into u(x, 0) = h(x), the picture changes dramatically:
\[
x_t = 1, \quad y_t(t,s) = 0, \quad u_t(t,s) = 1,
\]
\[
x(0,s) = s, \quad y(0,s) = 0, \quad u(0,s) = h(s).
\]
Here the characteristic direction (a, b) = (1, 0) is parallel to the projection (x_0', y_0') = (1, 0) of the initial curve, so J = a\,y_0' - b\,x_0' = 0: the transversality condition fails, and the problem is ill-posed.
Remark 15.2 Because of the special role played by the projections of the characteristics on the xy plane, we also use the term characteristics to denote them. In case of the linear PDE (15.4) the ODE system for the projection is
\[
x'(t) = \frac{dx}{dt} = a(x(t), y(t)), \qquad y'(t) = \frac{dy}{dt} = b(x(t), y(t)),
\tag{15.10}
\]
which yields
\[
y'(x) = \frac{dy}{dx} = \frac{b(x, y)}{a(x, y)}.
\]
We now turn to second-order equations and consider the semi-linear PDE
\[
\sum_{i,j=1}^{n} a_{ij}(x)\,\frac{\partial^2 u}{\partial x_i\,\partial x_j} + F(x, u, u_{x_1}, \dots, u_{x_n}) = 0
\tag{15.11}
\]
with continuous coefficients a_{ij}(x). Since we assume u \in C^2(\Omega), by Schwarz's lemma we may assume without loss of generality that a_{ij} = a_{ji}. Using the terminology of the introduction (Classification (III)) we find that the matrix A(x) := (a_{ij}(x))_{i,j=1,\dots,n} coincides with the matrix (F_{p_{ij}})_{i,j} defined therein.
Definition 15.2 We call the PDE (15.11) elliptic at x0 if the matrix A(x0 ) is positive definite
or negative definite. We call it parabolic at x0 if A(x0 ) is positive or negative semidefinite with
exactly one eigenvalue 0. We call it hyperbolic at x_0 if A(x_0) has the signature (n-1, 1, 0), i.e. A is indefinite with n-1 positive eigenvalues, one negative eigenvalue, and no zero eigenvalue (or vice versa).
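For constant coefficients in two variables, Definition 15.2 reduces to inspecting the eigenvalues of the symmetric 2x2 coefficient matrix. A small sketch (the function names and the closed-form eigenvalue formula are my own):

```python
import math

def eigenvalues(a11, a12, a22):
    """Eigenvalues of the symmetric matrix [[a11, a12], [a12, a22]]."""
    tr, det = a11 + a22, a11 * a22 - a12 * a12
    d = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return tr / 2.0 - d, tr / 2.0 + d

def classify(a11, a12, a22):
    lo, hi = eigenvalues(a11, a12, a22)
    if lo > 0 or hi < 0:
        return "elliptic"       # definite
    if lo < 0 < hi:
        return "hyperbolic"     # one positive, one negative
    return "parabolic"          # one eigenvalue is 0

# Laplace  u_xx + u_yy  -> A = diag(1, 1)  : elliptic
# Wave     u_tt - u_xx  -> A = diag(1, -1) : hyperbolic
# Heat     u_t = u_xx   -> A = diag(1, 0)  : parabolic
```

The three model equations of Section 15.1.2 land in the three classes, matching the definition.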
Consider a change of coordinates
\[
y_l = \varphi_l(x_1, \dots, x_n), \qquad l = 1, \dots, n.
\]
The transformation is called non-singular if the Jacobian \frac{\partial(\varphi_1,\dots,\varphi_n)}{\partial(x_1,\dots,x_n)}(x_0) \ne 0 is non-zero at any point x_0 \in \Omega. By the Inverse Mapping Theorem, the transformation possesses locally an inverse transformation, denoted by x = \psi(y),
\[
x_l = \psi_l(y_1, \dots, y_n), \qquad l = 1, \dots, n.
\]
Putting \tilde u(y) := u(\psi(y)), the chain rule gives
\[
u_{x_i} = \sum_{l=1}^{n} \tilde u_{y_l}\,\frac{\partial\varphi_l}{\partial x_i}, \qquad
u_{x_ix_j} = \sum_{k,l=1}^{n} \tilde u_{y_ly_k}\,\frac{\partial\varphi_l}{\partial x_i}\,\frac{\partial\varphi_k}{\partial x_j}
+ \sum_{l=1}^{n} \tilde u_{y_l}\,\frac{\partial^2\varphi_l}{\partial x_i\,\partial x_j}.
\tag{15.12}
\]
Inserting this into (15.11) we obtain
\[
\sum_{k,l=1}^{n} \tilde u_{y_ly_k} \sum_{i,j=1}^{n} a_{ij}\,\frac{\partial\varphi_l}{\partial x_i}\,\frac{\partial\varphi_k}{\partial x_j}
+ \sum_{l=1}^{n} \tilde u_{y_l} \sum_{i,j=1}^{n} a_{ij}\,\frac{\partial^2\varphi_l}{\partial x_i\,\partial x_j}
+ \tilde F(y, \tilde u, \tilde u_{y_1}, \dots, \tilde u_{y_n}) = 0.
\tag{15.13}
\]
We denote by \tilde a_{lk} the new coefficients of the second partial derivatives of \tilde u,
\[
\tilde a_{lk} = \sum_{i,j=1}^{n} a_{ij}(x)\,\frac{\partial\varphi_l}{\partial x_i}\,\frac{\partial\varphi_k}{\partial x_j},
\]
so that (15.13) takes the form
\[
\sum_{k,l=1}^{n} \tilde a_{lk}(y)\,\tilde u_{y_ly_k} + \tilde F(y, \tilde u, \tilde u_{y_1}, \dots, \tilde u_{y_n}) = 0.
\tag{15.14}
\]
Equation (15.14) later plays a crucial role in simplifying PDE (15.11): we can try to choose the transformation \varphi so that some of the coefficients \tilde a_{lk} become 0. Writing
\[
b_{lj} = \frac{\partial\varphi_l}{\partial x_j}, \qquad l, j = 1, \dots, n, \qquad B = (b_{lj}),
\]
the coefficient matrices are related by \tilde A = B A B^{\mathsf T}. By Proposition 15.1, A and \tilde A have the same signature. We have shown the following proposition.
Proposition 15.2 The type of a semi-linear second order PDE is invariant under the change of
coordinates.
Notation. For the operator L with
\[
L(u) = \sum_{i,j=1}^{n} a_{ij}(x)\,\frac{\partial^2 u}{\partial x_i\,\partial x_j} + F(x, u, u_{x_1}, \dots, u_{x_n})
\]
we call
\[
L_2(u) = \sum_{i,j=1}^{n} a_{ij}(x)\,\frac{\partial^2 u}{\partial x_i\,\partial x_j}
\]
the sum of its highest order terms; L_2 is a linear operator.
Definition 15.3 The second-order PDE L(u) = 0 has normal form if
\[
L_2(u) = \sum_{j=1}^{m} u_{x_jx_j} - \sum_{j=m+1}^{r} u_{x_jx_j}
\]
for some 0 \le m \le r \le n.
15.3.4 Characteristics
Suppose we are given the semi-linear second-order PDE in \mathbb R^n
\[
\sum_{i,j=1}^{n} a_{ij}(x)\,\frac{\partial^2 u}{\partial x_i\,\partial x_j} + F(x, u, u_{x_1}, \dots, u_{x_n}) = 0.
\tag{15.15}
\]
A surface \varphi(x) = c with \operatorname{grad}\varphi \ne 0 is called a characteristic surface of (15.15) at x_0 if
\[
\sum_{i,j=1}^{n} a_{ij}(x_0)\,\frac{\partial\varphi}{\partial x_i}(x_0)\,\frac{\partial\varphi}{\partial x_j}(x_0) = 0.
\tag{15.16}
\]
In two variables we write \varphi(x, y), where \varphi_x and \varphi_y are not both 0.
In two variables with coefficients a, b, c, the characteristic equation a\varphi_x^2 + 2b\varphi_x\varphi_y + c\varphi_y^2 = 0 leads, along the level lines \varphi(x, y) = \mathrm{const} with y' = -\varphi_x/\varphi_y, to the ODE a(y')^2 - 2by' + c = 0 with solutions
\[
y' = \frac{b \pm \sqrt{b^2 - ac}}{a}, \qquad \text{if } a \ne 0.
\]
We can see that the elliptic equation (b^2 - ac < 0) has no (real) characteristic lines, the parabolic equation (b^2 - ac = 0) has one family of characteristics, and the hyperbolic equation (b^2 - ac > 0) has two families of characteristic lines.
Hyperbolic case. In general, if c_1 = \varphi_1(x, y) is the first family of characteristic lines and c_2 = \varphi_2(x, y) is the second family, the change of variables \xi = \varphi_1(x, y), \eta = \varphi_2(x, y) brings the equation into normal form.
(b) Consider the equation x^2 u_{xx} - y^2 u_{yy} = 0. The characteristic equation x^2(y')^2 - y^2 = 0 yields y' = \pm y/x. This yields
\[
\frac{dy}{y} = \pm\frac{dx}{x}, \qquad \log|y| = \pm\log|x| + c_0.
\]
We obtain the two families of characteristic lines
\[
y = c_1 x, \qquad y = \frac{c_2}{x}.
\]
The change of variables
\[
\xi = \frac{y}{x} = c_1, \qquad \eta = xy = c_2
\]
gives
\[
\xi_x = -\frac{y}{x^2}, \quad \xi_y = \frac1x, \quad \xi_{xx} = \frac{2y}{x^3}, \quad \xi_{yy} = 0, \quad \xi_{xy} = -\frac{1}{x^2},
\]
\[
\eta_x = y, \quad \eta_y = x, \quad \eta_{xx} = 0, \quad \eta_{yy} = 0, \quad \eta_{xy} = 1.
\]
Noting x^2 = \eta/\xi, y^2 = \xi\eta, and inserting the values of the partial derivatives of \xi and \eta, we get
\[
u_{xx} = \tilde u_{\xi\xi}\,\frac{y^2}{x^4} - 2\tilde u_{\xi\eta}\,\frac{y^2}{x^2} + \tilde u_{\eta\eta}\,y^2 + 2\,\frac{y}{x^3}\,\tilde u_\xi,
\qquad
u_{yy} = \tilde u_{\xi\xi}\,\frac{1}{x^2} + 2\tilde u_{\xi\eta} + \tilde u_{\eta\eta}\,x^2.
\]
Hence
\[
x^2 u_{xx} - y^2 u_{yy} = -4y^2\,\tilde u_{\xi\eta} + 2\,\frac{y}{x}\,\tilde u_\xi = 0
\iff
\tilde u_{\xi\eta} - \frac{1}{2xy}\,\tilde u_\xi = 0.
\]
Since \eta = xy, we obtain the characteristic form of the equation to be
\[
\tilde u_{\xi\eta} - \frac{1}{2\eta}\,\tilde u_\xi = 0.
\]
Using the substitution v = \tilde u_\xi, we obtain v_\eta - \frac{1}{2\eta}\,v = 0, which corresponds to the ODE v' - \frac{1}{2\eta}\,v = 0. Hence v(\xi, \eta) = c(\xi)\,\sqrt{\eta}. Integration with respect to \xi gives \tilde u(\xi, \eta) = A(\xi)\,\sqrt{\eta} + B(\eta). Transforming back to the variables x and y, the general solution is
\[
u(x, y) = A\!\left(\frac{y}{x}\right)\sqrt{xy} + B(xy).
\]
(c) The one-dimensional wave equation u_{tt} - a^2 u_{xx} = 0. The characteristic equation \varphi_t^2 = a^2\varphi_x^2 yields
\[
\frac{\varphi_t}{\varphi_x} = -\frac{dx}{dt} = \pm a,
\]
so the characteristics are the lines x \pm at = \mathrm{const}.
(d) The n-dimensional wave equation u_{tt} - a^2\Delta u = 0 has the characteristic equation
\[
\varphi_t^2 - a^2\sum_{i=1}^{n} \varphi_{x_i}^2 = 0.
\]
One family of characteristic surfaces is given by the cones
\[
\varphi(x, t) = a^2(t - t^{(0)})^2 - \sum_{i=1}^{n} (x_i - x_i^{(0)})^2 = 0,
\]
where the point (x^{(0)}, t^{(0)}) is the peak of the cone. Indeed, \varphi_t = 2a^2(t - t^{(0)}) and \varphi_{x_i} = -2(x_i - x_i^{(0)}) imply
\[
\varphi_t^2 - a^2\sum_{i=1}^{n}\varphi_{x_i}^2 = 4a^2\Bigl(a^2(t - t^{(0)})^2 - \sum_{i=1}^{n}(x_i - x_i^{(0)})^2\Bigr) = 0
\]
on the cone.
Another family of characteristic surfaces consists of the hyperplanes
\[
a\,t \pm \sum_{i=1}^{n} b_i x_i = \mathrm{const}, \qquad \text{where } \|b\| = 1.
\]
(e) The heat equation has characteristic equation \sum_{i=1}^{n} \varphi_{x_i}^2 = 0, which implies \varphi_{x_i} = 0 for all i = 1, \dots, n, such that t = c is the only family of characteristic surfaces (the hyperplanes t = \mathrm{const}).
(f) The Poisson and Laplace equations have the same characteristic equation; however, we have one variable less (no t) and obtain \operatorname{grad}\varphi = 0, which is impossible since \operatorname{grad}\varphi \ne 0 is required. The Poisson and Laplace equations don't have characteristic surfaces.
We consider the Cauchy problem for an infinite string (no boundary values):
\[
u_{tt} - a^2 u_{xx} = 0, \qquad u(x, 0) = u_0(x), \qquad u_t(x, 0) = u_1(x).
\]
Inserting the initial conditions into the general solution u(x, t) = f(x - at) + g(x + at) of the wave equation gives
\[
f(x) + g(x) = u_0(x), \qquad -a f'(x) + a g'(x) = u_1(x).
\]
Differentiating the first equation yields u_0'(x) = f'(x) + g'(x), such that
\[
f'(x) = \frac12 u_0'(x) - \frac{1}{2a}\,u_1(x), \qquad g'(x) = \frac12 u_0'(x) + \frac{1}{2a}\,u_1(x).
\]
Integrating these equations we obtain
\[
f(x) = \frac12 u_0(x) - \frac{1}{2a}\int_0^x u_1(y)\,dy + A, \qquad
g(x) = \frac12 u_0(x) + \frac{1}{2a}\int_0^x u_1(y)\,dy + B,
\]
where A and B are constants such that A + B = 0 (since f(x) + g(x) = u_0(x)). Finally we have
\[
\begin{aligned}
u(x, t) &= f(x - at) + g(x + at)\\
&= \frac12\bigl(u_0(x + at) + u_0(x - at)\bigr) + \frac{1}{2a}\int_0^{x+at} u_1(y)\,dy - \frac{1}{2a}\int_0^{x-at} u_1(y)\,dy\\
&= \frac12\bigl(u_0(x + at) + u_0(x - at)\bigr) + \frac{1}{2a}\int_{x-at}^{x+at} u_1(y)\,dy.
\end{aligned}
\tag{15.17}
\]
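d'Alembert's formula is easy to test in code. For u_1 = \cos the integral in (15.17) is elementary, so the quadrature-based evaluation can be compared with a closed form (a sketch; all names and the sample data a = 2, u_0 = \sin, u_1 = \cos are my own illustrative choices):

```python
import math

a = 2.0
u0, u1 = math.sin, math.cos      # sample initial data

def quad(f, lo, hi, n=2000):
    """Composite trapezoidal rule; handles lo > hi with the right sign."""
    h = (hi - lo) / n
    return h * (0.5 * f(lo) + 0.5 * f(hi) + sum(f(lo + k * h) for k in range(1, n)))

def u(x, t):
    """d'Alembert's formula (15.17)."""
    return (0.5 * (u0(x + a * t) + u0(x - a * t))
            + quad(u1, x - a * t, x + a * t) / (2 * a))

def u_exact(x, t):
    # For u1 = cos the integral is sin(x+at) - sin(x-at).
    return (0.5 * (math.sin(x + a * t) + math.sin(x - a * t))
            + (math.sin(x + a * t) - math.sin(x - a * t)) / (2 * a))
```

The formula reproduces the closed form, satisfies u(x, 0) = u_0(x) exactly, and a difference quotient in t recovers the initial velocity u_1.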
It is clear from (15.17) that u(x, t) is uniquely determined by the values of the initial functions u_0 and u_1 in the interval [x - at, x + at], whose end points are cut out by the characteristic lines through the point (x, t). This interval represents the domain of dependence for the solution at the point (x, t). (Figure: the characteristic triangle with apex (x, t) and base [x - at, x + at] on the x-axis.)
Conversely, the initial values at a point (\xi, 0) of the x-axis influence u(x, t) at points (x, t) in the wedge-shaped region bounded by the characteristics through (\xi, 0), i.e. for \xi - at < x < \xi + at. This indicates that our signal or disturbance moves with speed a.
We want to give some interpretation of the solution (15.17). Suppose u_1 = 0 and
\[
u_0(x) = \begin{cases} 1 - \dfrac{|x|}{a}, & |x| \le a,\\[2pt] 0, & |x| > a. \end{cases}
\]
(Figure: the hat-shaped initial profile u_0 at t = 0, with peak 1 at x = 0 and support [-a, a].)
In this example we consider the vibrating string which is plucked at time t = 0 as in the above picture (given u_0(x)). The initial velocity is zero (u_1 = 0).
(Figure: snapshots of the solution at t = 1/2, t = 1, and t = 2. The initial hat splits into two hats of height 1/2 travelling to the left and to the right with speed a; at t = 1/2 they still overlap on [-a/2, a/2], at t = 1 they just separate at the origin, and at t = 2 they are centred at \mp 2a.)
Formula (15.17) is due to d'Alembert (1746). Usually one assumes u_0 \in C^2(\mathbb R) and u_1 \in C^1(\mathbb R). In this case u \in C^2(\mathbb R^2), and we are able to evaluate the classical wave operator \square u = u_{tt} - a^2 u_{xx}, which gives a continuous function. On the other hand, the right hand side of (15.17) makes sense for an arbitrary continuous function u_1 and arbitrary u_0. If we want to call such a u(x, t) a generalized solution of the Cauchy problem, we have to alter the meaning of \square u. In particular, we need a more general notion of functions and derivatives. This is the main objective of the next section.
(b) The Finite String over [0, l]
We consider the initial boundary value problem (IBVP)
\[
u_{tt} = a^2 u_{xx}, \qquad 0 < x < l,\ t \in \mathbb R,
\]
\[
u(x, 0) = u_0(x), \quad u_t(x, 0) = u_1(x), \qquad u(0, t) = u(l, t) = 0.
\]
Suppose we are given functions u_0 \in C^2([0, l]) and u_1 \in C^1([0, l]) on [0, l] with
\[
u_0(0) = u_0(l) = 0, \qquad u_1(0) = u_1(l) = 0, \qquad u_0''(0) = u_0''(l) = 0.
\]
To solve the IBVP, we define new functions \tilde u_0 and \tilde u_1 on \mathbb R as follows: first extend both functions to [-l, l] as odd functions, that is, \tilde u_i(-x) = -u_i(x), i = 0, 1. Then extend \tilde u_i as a 2l-periodic function to the entire real line. The above assumptions ensure that \tilde u_0 \in C^2(\mathbb R) and \tilde u_1 \in C^1(\mathbb R). Put
\[
u(x, t) = \frac12\bigl(\tilde u_0(x + at) + \tilde u_0(x - at)\bigr) + \frac{1}{2a}\int_{x-at}^{x+at} \tilde u_1(y)\,dy.
\]
Then u(x, t) solves the IBVP.
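The odd 2l-periodic extension is a one-liner worth writing out (a sketch; `extend` and the sample data u_0(x) = \sin(\pi x / l) are my own choices):

```python
import math

l = 1.0

def u0(x):
    """Sample data on [0, l] with u0(0) = u0(l) = 0."""
    return math.sin(math.pi * x / l)

def extend(f, x):
    """Odd, 2l-periodic extension of f: [0, l] -> R to the whole line."""
    x = math.fmod(x, 2 * l)
    if x < 0:
        x += 2 * l               # now 0 <= x < 2l
    return f(x) if x <= l else -f(2 * l - x)
```

With these extensions, d'Alembert's formula automatically satisfies the boundary conditions u(0, t) = u(l, t) = 0, since the extended data are odd about both x = 0 and x = l.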
Chapter 16
Distributions
16.1 Introduction: Test Functions and Distributions
In this section we introduce the notion of distributions. Distributions are generalized functions. The class of distributions has a lot of very nice properties: they are differentiable up to arbitrary order, one can exchange limit procedures and differentiation, and Schwarz's lemma holds. Distributions play an important role in the theory of PDE; in particular, the notion of a fundamental solution of a differential operator can be made rigorous only within the theory of distributions.
Generalized functions were first used by P. Dirac to study quantum mechanical phenomena. He made systematic use of the so-called \delta-function (better: \delta-distribution). The mathematical foundations of this theory are due to S. L. Sobolev (1936) and L. Schwartz (1950; 1915-2002).
Since then many mathematicians have made progress in the theory of distributions. Motivation comes from problems in mathematical physics and in the theory of partial differential equations.
Good accessible (German) introductions are given in the books of W. Walter [Wal74] and O. Forster [For81, 17]. More detailed expositions of the theory are to be found in the books of H. Triebel (in English and German), V. S. Wladimirow (in Russian and German), and Gelfand/Schilow (in Russian and German, parts I, II, and III) [Tri92, Wla72, GS69, GS64].
16.1.1 Motivation
Distributions generalize the notion of a function. They are linear functionals on certain spaces of test functions. Using distributions one can express rigorously the density of a mass point, the charge density of a point charge, and the single-layer and double-layer potentials, see [Arn04, pp. 92]. Roughly speaking, a generalized function is given at a point by its mean values in neighborhoods of that point.
The main idea, to associate to each sufficiently nice function f a linear functional T_f (a distribution) on an appropriate function space \mathcal D, is described by the following formula:
\[
\langle T_f, \varphi\rangle = \int f(x)\,\varphi(x)\,dx, \qquad \varphi \in \mathcal D.
\tag{16.1}
\]
On the left we adopt the notation of a dual pairing of vector spaces from Definition 11.1. In
general the bracket hT , i denotes the evaluation of the functional T on the test function .
Sometimes it is also written as T (). It does not denote an inner product; the left and the right
arguments are from completely different spaces.
What we really want of T_f is:
(a) The correspondence f \mapsto T_f should be one-to-one, i.e., different functionals T_f and T_g correspond to different functions f and g. To achieve this, we need the function space \mathcal D to be sufficiently large.
(b) The class of functions f should contain at least the continuous functions. However, if f(x) = x^n, the function f(x)\varphi(x) must be integrable over \mathbb R, that is, x^n\varphi(x) \in L^1(\mathbb R). Since polynomials are not in L^1(\mathbb R), the functions \varphi must be very small for large |x|. Roughly speaking, there are two possibilities to this end. First, take only those functions \varphi which are identically zero outside a compact set (which depends on \varphi). This leads to the test functions \mathcal D(\mathbb R). Then T_f is well-defined if f is integrable over every compact subset of \mathbb R. These functions f are called locally integrable.
Secondly, we can take \varphi(x) to be rapidly decreasing as |x| tends to \infty. More precisely, we want
\[
\sup_{x\in\mathbb R} |x^n \varphi(x)| < \infty
\]
for all non-negative integers n \in \mathbb Z_+. This concept leads to the notion of the so-called Schwartz space \mathcal S(\mathbb R).
(c) We want to differentiate f arbitrarily often, even in case f has discontinuities. The only thing we have to do is to give the expression
\[
\int_{\mathbb R} f'(x)\,\varphi(x)\,dx
\]
a meaning. Using integration by parts and the fact that \varphi(+\infty) = \varphi(-\infty) = 0, the above expression equals -\int_{\mathbb R} f(x)\,\varphi'(x)\,dx. That is, instead of differentiating f, we differentiate the test function \varphi. In this way the functional T_{f'} makes sense as long as f is integrable. Since we want to differentiate f arbitrarily often, we need the test functions \varphi to be arbitrarily differentiable, \varphi \in C^\infty(\mathbb R).
Note that conditions (b) and (c) make the space of test functions sufficiently small.
(a) The Spaces \mathcal D(\mathbb R^n) and \mathcal D(\Omega)
We want f\varphi to be integrable for all polynomials f. We use the first approach and consider only functions \varphi which are 0 outside a bounded set. If nothing is stated otherwise, \Omega \subseteq \mathbb R^n denotes an open, connected subset of \mathbb R^n.
Definition 16.1
\[
\mathcal D(\mathbb R^n) = C_0^\infty(\mathbb R^n) = \{f \in C^\infty(\mathbb R^n) \mid \operatorname{supp} f \text{ is compact}\}.
\]
The standard example is the bump function
\[
h(t) = \begin{cases} c\,e^{-\frac{1}{1-t^2}}, & |t| < 1,\\ 0, & |t| \ge 1. \end{cases}
\]
(Figure: the graph of h, with maximum c/e at t = 0 and support [-1, 1].)
The constant c is chosen such that \int_{\mathbb R} h(t)\,dt = 1. The function h is continuous on \mathbb R. It was already shown in Example 4.5 that h^{(k)}(1) = h^{(k)}(-1) = 0 for all k \in \mathbb N. Hence h \in \mathcal D(\mathbb R) is a test function with \operatorname{supp} h = [-1, 1]. Accordingly, the function
\[
h(x) = \begin{cases} c_n\,e^{-\frac{1}{1-\|x\|^2}}, & \|x\| < 1,\\ 0, & \|x\| \ge 1 \end{cases}
\]
is a test function on \mathbb R^n.
Here c_n is chosen such that
\[
\int_{\mathbb R^n} h(x)\,dx = \int_{U_1(0)} h(x)\,dx = 1.
\]
For \varepsilon > 0 we define the scaled functions
\[
h_\varepsilon(x) = \frac{1}{\varepsilon^n}\,h\!\left(\frac{x}{\varepsilon}\right),
\]
with \operatorname{supp} h_\varepsilon = \overline{U_\varepsilon(0)} and \int_{\mathbb R^n} h_\varepsilon(x)\,dx = \int_{U_\varepsilon(0)} h_\varepsilon(x)\,dx = 1.
So far we have constructed only one function h(x) (as well as its scaled relatives h_\varepsilon(x)) which is C^\infty and has compact support. Using this single hat-function h we are able
(a) to restrict the support of an arbitrary integrable function f to a given domain, by replacing f by f\,h_\varepsilon(x - a), which has support in \overline{U_\varepsilon(a)},
(b) to make f smooth.
(b) Mollification
In this way we obtain a supply of C^\infty functions with compact support which is large enough for our purposes (especially, to recover the function f from the functional T_f). Using the function h_\varepsilon, S. L. Sobolev developed the following mollification method.
Definition 16.2 (a) Let f \in L^1(\mathbb R^n) and g \in \mathcal D(\mathbb R^n); define the convolution product f * g by
\[
(f * g)(x) = \int_{\mathbb R^n} f(y)\,g(x - y)\,dy = \int_{\mathbb R^n} f(x - y)\,g(y)\,dy = (g * f)(x).
\]
(b) The mollification of f is
\[
f_\varepsilon = f * h_\varepsilon.
\]
Note that
\[
f_\varepsilon(x) = \int_{\mathbb R^n} h_\varepsilon(x - y)\,f(y)\,dy = \int_{U_\varepsilon(x)} h_\varepsilon(x - y)\,f(y)\,dy.
\tag{16.2}
\]
For example, the mollification of the characteristic function f = \chi_{[1,2]} satisfies
\[
f_\varepsilon(x) = \begin{cases}
0, & x < 1 - \varepsilon,\\
\text{(a smooth transition from 0 to 1)}, & 1 - \varepsilon < x < 1 + \varepsilon,\\
1, & 1 + \varepsilon < x < 2 - \varepsilon,\\
\text{(a smooth transition from 1 to 0)}, & 2 - \varepsilon < x < 2 + \varepsilon,\\
0, & 2 + \varepsilon < x.
\end{cases}
\]
(b) Convergence in D
Notations. For x \in \mathbb R^n and a multi-index \alpha \in \mathbb N_0^n, \alpha = (\alpha_1, \dots, \alpha_n), we write
\[
|\alpha| = \alpha_1 + \alpha_2 + \dots + \alpha_n, \qquad \alpha! = \alpha_1!\cdots\alpha_n!, \qquad x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n},
\]
\[
D^\alpha u(x) = \frac{\partial^{|\alpha|} u(x)}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}.
\]
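The multi-index conventions translate directly into code (a sketch; the helper names are my own):

```python
import math

def abs_alpha(alpha):
    """|alpha| = alpha_1 + ... + alpha_n (the order of the derivative)."""
    return sum(alpha)

def fact_alpha(alpha):
    """alpha! = alpha_1! * ... * alpha_n!"""
    p = 1
    for a in alpha:
        p *= math.factorial(a)
    return p

def monomial(x, alpha):
    """x^alpha = x_1^{alpha_1} * ... * x_n^{alpha_n}"""
    p = 1.0
    for xi, ai in zip(x, alpha):
        p *= xi ** ai
    return p
```

For example, for alpha = (2, 0, 3) one has |alpha| = 5 and alpha! = 2!·0!·3! = 12.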
It is clear that \mathcal D(\mathbb R^n) is a linear space. We shall introduce an appropriate notion of convergence.
Definition 16.3 A sequence (\varphi_n(x)) of functions of \mathcal D(\mathbb R^n) converges to \varphi \in \mathcal D(\mathbb R^n) if there exists a compact set K \subset \mathbb R^n such that
(a) \operatorname{supp} \varphi_n \subseteq K for all n \in \mathbb N, and
(b) D^\alpha \varphi_n \to D^\alpha \varphi uniformly on K, for every multi-index \alpha.
Example 16.2 Let \varphi \in \mathcal D be a fixed test function and consider the following sequences (\varphi_n(x)):
(a) \varphi_n(x) = \dfrac{\varphi(x)}{n}. This sequence converges to 0 in \mathcal D since \operatorname{supp}\varphi_n = \operatorname{supp}\varphi for all n, and the convergence is uniform for all x \in \mathbb R^n (in fact, it suffices to consider x \in \operatorname{supp}\varphi).
(b) \varphi_n(x) = \dfrac{\varphi(x/n)}{n}. The sequence does not converge to 0 in \mathcal D since the supports \operatorname{supp}\varphi_n = n\,\operatorname{supp}\varphi, n \in \mathbb N, are not contained in any common compact subset.
(c) \varphi_n(x) = \dfrac{\varphi(nx)}{n} has no limit if \varphi \ne 0, see homework 49.2.
Note that D(Rn ) is not a metric space, more precisely, there is no metric on D(Rn ) such that
the metric convergence and the above convergence coincide.
(c) The Space \mathcal D'(\mathbb R^n) of Distributions
A distribution is a continuous linear functional T on \mathcal D(\mathbb R^n). Continuity can be characterized by the following criterion (Remark 16.3): for every compact set K \subset \mathbb R^n there exist C > 0 and l \in \mathbb Z_+ such that
\[
|\langle T, \varphi\rangle| \le C \sup_{x\in K,\ |\alpha|\le l} |D^\alpha \varphi(x)| \qquad \text{for all } \varphi \text{ with } \operatorname{supp}\varphi \subseteq K.
\tag{16.3}
\]
We show that the criterion (16.3) implies continuity of T. Indeed, let \varphi_n \to 0 in \mathcal D. Then there exists a compact subset K \subset \mathbb R^n such that \operatorname{supp}\varphi_n \subseteq K for all n. By the criterion, there are C > 0 and l \in \mathbb Z_+ with |\langle T, \varphi_n\rangle| \le C\,\sup|D^\alpha\varphi_n(x)|, where the supremum is taken over all x \in K and multi-indices \alpha with |\alpha| \le l. Since D^\alpha\varphi_n \to 0 uniformly on K for all \alpha, we particularly have \sup|D^\alpha\varphi_n(x)| \to 0 as n \to \infty. This shows \langle T, \varphi_n\rangle \to 0, and T is continuous. For the proof of the converse direction, see [Tri92, p. 52].
Definition 16.5 Let \Omega be an open subset of \mathbb R^n. A function f(x) on \Omega is said to be locally integrable over \Omega if f(x) is integrable over every compact subset K \subset \Omega; we write in this case f \in L^1_{loc}(\Omega).
Remark 16.4 The following are equivalent:
(a) f L1loc (Rn ).
(b) For any R > 0, f L1 (UR (0)).
(c) For any x0 Rn there exists > 0 such that f L1 (U (x0 )).
Lemma 16.1 If f is locally integrable, f \in L^1_{loc}(\Omega), then T_f is a distribution, T_f \in \mathcal D'(\Omega).
A distribution T which is of the form T = T_f with some locally integrable function f is called regular.
Proof. First, T_f is a linear functional on \mathcal D since integration is a linear operation. Secondly, if \varphi_m \to 0 in \mathcal D, then there exists a compact set K with \operatorname{supp}\varphi_m \subseteq K for all m. We have the following estimate:
\[
\left|\int_{\mathbb R^n} f(x)\,\varphi_m(x)\,dx\right| \le \sup_{x\in K} |\varphi_m(x)| \int_K |f(x)|\,dx = C \sup_{x\in K} |\varphi_m(x)|,
\]
where C = \int_K |f|\,dx exists since f \in L^1_{loc}. The expression on the right tends to 0 since \varphi_m tends uniformly to 0. Hence \langle T_f, \varphi_m\rangle \to 0 and T_f belongs to \mathcal D'.
Lemma. If f \in L^1_{loc}(\Omega) and T_f = 0, then f = 0 a.e. in \Omega.
Proof. For simplicity we consider the case n = 1, \Omega = (-\pi, \pi). Fix \delta with 0 < \delta < \pi. Let \varphi_n(x) = e^{-inx}\,h_\delta(x), n \in \mathbb Z. Then \operatorname{supp}\varphi_n \subseteq [-\delta, \delta]. Since both e^{-inx} and h_\delta are C^\infty functions, \varphi_n \in \mathcal D(\Omega) and
\[
c_n = \langle T_f, \varphi_n\rangle = \int_{-\pi}^{\pi} f(x)\,e^{-inx}\,h_\delta(x)\,dx = 0, \qquad n \in \mathbb Z;
\]
that is, all Fourier coefficients of f\,h_\delta \in L^2[-\pi, \pi] vanish. From Theorem 13.13 (b) it follows that f\,h_\delta is 0 in L^2(-\pi, \pi). By Proposition 12.16 it follows that f\,h_\delta is 0 a.e. in (-\pi, \pi). Since h_\delta > 0 on (-\delta, \delta), f = 0 a.e. on (-\delta, \delta), and since \delta < \pi was arbitrary, f = 0 a.e. in \Omega.
Remark 16.5 The previous lemma shows that if f_1 and f_2 are locally integrable and T_{f_1} = T_{f_2}, then f_1 = f_2 a.e.; that is, the correspondence f \mapsto T_f is one-to-one. In this way we can identify the locally integrable functions L^1_{loc}(\mathbb R^n) with a subspace of the distributions \mathcal D'(\mathbb R^n).
The most important example of a distribution which is not regular is the \delta-distribution,
\[
\langle \delta_a, \varphi\rangle = \varphi(a), \qquad \varphi \in \mathcal D.
\]
Suppose to the contrary that \delta = \delta_0 were regular, \delta = T_f with some f \in L^1_{loc}(\mathbb R^n). Choose \varepsilon > 0 such that
\[
\int_{U_\varepsilon(0)} |f(x)|\,dx < 1.
\]
Putting \varphi(x) = h(x/\varepsilon) with the bump function h, we have \operatorname{supp}\varphi = \overline{U_\varepsilon(0)} and \sup_{x\in\mathbb R^n} |\varphi(x)| = \varphi(0) > 0, such that
\[
\left|\int_{\mathbb R^n} f(x)\,\varphi(x)\,dx\right| \le \sup|\varphi(x)| \int_{U_\varepsilon(0)} |f(x)|\,dx < \varphi(0).
\]
This contradicts \int_{\mathbb R^n} f(x)\,\varphi(x)\,dx = \langle\delta, \varphi\rangle = \varphi(0).
In the same way one can show that the assignment
\[
\langle T, \varphi\rangle = D^\alpha\varphi(a), \qquad a \in \mathbb R^n,
\]
defines an element of \mathcal D' which is singular (not regular).
Lemma. Let f \in L^1_{loc}(\mathbb R^n) with \int_{\mathbb R^n} f(x)\,dx = 1, and put f_\varepsilon(x) = \varepsilon^{-n} f(x/\varepsilon). Then
\[
\lim_{\varepsilon\to 0+0} T_{f_\varepsilon} = \delta \quad \text{in } \mathcal D'(\mathbb R^n).
\]
Proof. By the change of variable theorem, \int_{\mathbb R^n} f_\varepsilon(x)\,dx = 1 for all \varepsilon > 0. To prove the claim we have to show that for all \varphi \in \mathcal D
\[
\int f_\varepsilon(x)\,\varphi(x)\,dx \longrightarrow \varphi(0) = \int f_\varepsilon(x)\,\varphi(0)\,dx \quad \text{as } \varepsilon \to 0,
\]
or, equivalently,
\[
\int f_\varepsilon(x)\bigl(\varphi(x) - \varphi(0)\bigr)\,dx \longrightarrow 0 \quad \text{as } \varepsilon \to 0.
\]
Substituting y = x/\varepsilon,
\[
\int f_\varepsilon(x)\bigl(\varphi(x) - \varphi(0)\bigr)\,dx = \int f(y)\bigl(\varphi(\varepsilon y) - \varphi(0)\bigr)\,dy.
\]
Since \varphi is continuous at 0, for every fixed y the family of functions (\varphi(\varepsilon y) - \varphi(0)) tends to 0 as \varepsilon \to 0. Hence the family of functions g_\varepsilon(y) = f(y)(\varphi(\varepsilon y) - \varphi(0)) tends pointwise to 0; it is dominated by the integrable function 2\|\varphi\|_\infty\,|f(y)|, so the integral tends to 0 by Lebesgue's dominated convergence theorem.
Example. The families
\[
f_\varepsilon(x) = \frac{1}{2\varepsilon\sqrt{\pi}}\,e^{-\frac{x^2}{4\varepsilon^2}}, \qquad
f_\varepsilon(x) = \frac{1}{\pi}\,\frac{\varepsilon}{x^2 + \varepsilon^2}, \qquad \dots, \qquad
f_\varepsilon(x) = \frac{1}{\pi x}\,\sin\frac{x}{\varepsilon}
\tag{16.4}
\]
all arise from functions f with \int_{\mathbb R} f(x)\,dx = 1. The first three functions satisfy the assumptions of the Lemma; the last one does not, since \frac{\sin x}{x} is not in L^1(\mathbb R). Later we will see that the above lemma even holds if \int f(x)\,dx = 1 only as an improper Riemann integral.
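The convergence T_{f_\varepsilon} \to \delta can be observed numerically. The sketch below pairs the Cauchy family with a sample test-function-like profile \varphi (the profile and all parameters are my own choices; \varphi is merely smooth and rapidly decaying, which suffices for the illustration):

```python
import math

def phi(x):
    """Sample smooth, rapidly decaying profile with phi(0) = 1."""
    return math.exp(-x * x) * math.cos(x)

def quad(f, lo, hi, n):
    h = (hi - lo) / n
    return h * (0.5 * f(lo) + 0.5 * f(hi) + sum(f(lo + k * h) for k in range(1, n)))

def pair(eps):
    """<T_{f_eps}, phi> for the Cauchy family f_eps(x) = eps/(pi(x^2+eps^2))."""
    return quad(lambda x: eps / (math.pi * (x * x + eps * eps)) * phi(x),
                -50.0, 50.0, 200_000)

print(pair(0.5), pair(0.05))   # the second is much closer to phi(0) = 1
```

As \varepsilon shrinks, the pairing approaches \varphi(0), as the Lemma predicts.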
The distribution P\,\frac1x. Since the function \frac1x is not locally integrable in a neighborhood of 0, 1/x is not a regular distribution. However, we can define a substitute that coincides with 1/x for all x \ne 0.
Recall that the principal value (or Cauchy mean value) of an improper Riemann integral is defined as follows. Suppose f(x) has a singularity at c \in [a, b]; then
\[
\mathrm{Vp}\int_a^b f(x)\,dx := \lim_{\varepsilon\to 0}\left(\int_a^{c-\varepsilon} + \int_{c+\varepsilon}^b\right) f(x)\,dx.
\]
For example, \mathrm{Vp}\int_{-1}^{1} \frac{dx}{x^{2n+1}} = 0, n \in \mathbb N.
For \varphi \in \mathcal D define
\[
F(\varphi) = \mathrm{Vp}\int_{\mathbb R} \frac{\varphi(x)}{x}\,dx = \lim_{\varepsilon\to 0}\left(\int_{-\infty}^{-\varepsilon} + \int_{\varepsilon}^{\infty}\right) \frac{\varphi(x)}{x}\,dx.
\]
Then F is obviously linear. We have to show that F(\varphi) is finite and that F is continuous on \mathcal D. Suppose that \operatorname{supp}\varphi \subseteq [-R, R]. Define the auxiliary function
\[
\psi(x) = \begin{cases} \dfrac{\varphi(x) - \varphi(0)}{x}, & x \ne 0,\\[2pt] \varphi'(0), & x = 0. \end{cases}
\]
Since \varphi is differentiable at 0, \psi \in C(\mathbb R). Since 1/x is odd, \left(\int_{-R}^{-\varepsilon} + \int_{\varepsilon}^{R}\right)\frac{dx}{x} = 0, and we get
\[
F(\varphi) = \lim_{\varepsilon\to 0}\left(\int_{-R}^{-\varepsilon} + \int_{\varepsilon}^{R}\right) \frac{\varphi(x)}{x}\,dx
= \lim_{\varepsilon\to 0}\left(\int_{-R}^{-\varepsilon} + \int_{\varepsilon}^{R}\right) \frac{\varphi(x) - \varphi(0)}{x}\,dx
= \lim_{\varepsilon\to 0}\left(\int_{-R}^{-\varepsilon} + \int_{\varepsilon}^{R}\right) \psi(x)\,dx
= \int_{-R}^{R} \psi(x)\,dx.
\]
By the mean value theorem, \psi(x) = \frac{\varphi(x) - \varphi(0)}{x} = \varphi'(\xi_x) for some \xi_x between 0 and x; hence
\[
|F(\varphi)| \le \int_{-R}^{R} |\varphi'(\xi_x)|\,dx \le 2R\,\sup_{|x|\le R} |\varphi'(x)|.
\]
This shows that the condition (16.3) in Remark 16.3 is satisfied with C = 2R and l = 1, such that F is a continuous linear functional on \mathcal D(\mathbb R), F \in \mathcal D'(\mathbb R). We denote this distribution by P\,\frac1x.
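The principal value can be approximated numerically by pairing x with -x, which makes the cancellation around the singularity explicit (a sketch; the helper name and defaults are my own):

```python
import math

def pv_integral(phi, R=10.0, eps=1e-6, n=200_000):
    """Approximate Vp int phi(x)/x dx over eps <= |x| <= R."""
    h = (R - eps) / n
    total = 0.0
    for k in range(n):
        x = eps + (k + 0.5) * h                # midpoint rule on [eps, R]
        total += (phi(x) - phi(-x)) / x * h    # pair x with -x
    return total
```

For an even \varphi every pair cancels exactly, giving 0; for \varphi(x) = x\,e^{-x^2} the pairing yields \int 2e^{-x^2}\,dx over the half-line, i.e. \sqrt{\pi}.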
In quantum physics one needs the so-called Sokhotsky formulas [Wla72, p. 76]:
\[
\lim_{\varepsilon\to 0+0} \frac{1}{x + i\varepsilon} = -\pi i\,\delta(x) + P\,\frac1x, \qquad
\lim_{\varepsilon\to 0+0} \frac{1}{x - i\varepsilon} = \pi i\,\delta(x) + P\,\frac1x.
\]
Idea of proof: show the sum and the difference of the above formulas instead.
Indeed,
\[
\lim_{\varepsilon\to 0+0} \frac{2x}{x^2 + \varepsilon^2} = 2\,P\,\frac1x, \qquad
\lim_{\varepsilon\to 0+0} \frac{-2i\varepsilon}{x^2 + \varepsilon^2} = -2\pi i\,\delta,
\]
where the second limit follows from the \delta-family \frac{\varepsilon}{\pi(x^2+\varepsilon^2)} considered above.
(a) Multiplication by a C^\infty function a. For T \in \mathcal D'(\mathbb R^n) and a \in C^\infty(\mathbb R^n) define
\[
\langle a\,T, \varphi\rangle := \langle T, a\varphi\rangle, \qquad \varphi \in \mathcal D(\mathbb R^n).
\tag{16.5}
\]
Obviously, a\varphi \in \mathcal D(\mathbb R^n) since a \in C^\infty(\mathbb R^n) and \varphi has compact support; thus a\varphi has compact support, too. Hence the right hand side of (16.5) defines a linear functional on \mathcal D(\mathbb R^n). We have to show continuity. Suppose that \varphi_n \to 0 in \mathcal D; then a\varphi_n \to 0 in \mathcal D. Then \langle T, a\varphi_n\rangle \to 0 since T is continuous.
Example. We compute x\,P\frac1x. For \varphi \in \mathcal D,
\[
\left\langle x\,P\frac1x, \varphi\right\rangle = \left\langle P\frac1x, x\varphi\right\rangle = \mathrm{Vp}\int_{\mathbb R} \frac{x\,\varphi(x)}{x}\,dx = \int_{\mathbb R} \varphi(x)\,dx = \langle 1, \varphi\rangle.
\]
Hence
\[
x\,P\frac1x = 1 = T_1.
\]
(b) Differentiation
Consider n = 1. Suppose that f \in L^1_{loc} is continuously differentiable. Suppose further that \varphi \in \mathcal D with \operatorname{supp}\varphi \subset (-a, a), so that \varphi(-a) = \varphi(a) = 0. We want to define (T_f)' to be T_{f'}. Using integration by parts we find
\[
\langle T_{f'}, \varphi\rangle = \int_{-a}^{a} f'(x)\,\varphi(x)\,dx = f(x)\varphi(x)\Big|_{-a}^{a} - \int_{-a}^{a} f(x)\,\varphi'(x)\,dx = -\int_{-a}^{a} f(x)\,\varphi'(x)\,dx = -\langle T_f, \varphi'\rangle,
\]
where we used that \varphi(-a) = \varphi(a) = 0. Hence it makes sense to define \langle T_f', \varphi\rangle = -\langle T_f, \varphi'\rangle. This can easily be generalized to arbitrary partial derivatives D^\alpha T_f.
Definition 16.9 For T \in \mathcal D'(\mathbb R^n) and a multi-index \alpha \in \mathbb N_0^n define D^\alpha T \in \mathcal D'(\mathbb R^n) by
\[
\langle D^\alpha T, \varphi\rangle = (-1)^{|\alpha|}\,\langle T, D^\alpha\varphi\rangle.
\]
We have to make sure that D^\alpha T is indeed a distribution. The linearity of D^\alpha T is obvious. To prove continuity, let \varphi_n \to 0 in \mathcal D. By definition, this implies D^\alpha\varphi_n \to 0 in \mathcal D. Since T is continuous, \langle T, D^\alpha\varphi_n\rangle \to 0; hence \langle D^\alpha T, \varphi_n\rangle \to 0.
429
Product rule: for a ∈ C∞(ℝⁿ),

    ∂/∂xᵢ (aT) = (∂a/∂xᵢ) T + a (∂T/∂xᵢ),   i = 1, …, n.

Indeed, for φ ∈ D,

    ⟨∂/∂xᵢ (aT), φ⟩ = −⟨aT, ∂φ/∂xᵢ⟩ = −⟨T, a ∂φ/∂xᵢ⟩
                   = −⟨T, ∂(aφ)/∂xᵢ − a_{xᵢ} φ⟩
                   = ⟨a_{xᵢ} T, φ⟩ + ⟨∂T/∂xᵢ, aφ⟩
                   = ⟨a_{xᵢ} T + a ∂T/∂xᵢ, φ⟩.

Since this holds for all test functions φ, the claim follows.

(c) The easy proof uses D^{α+β} φ = D^α (D^β φ) for φ ∈ D.
Example 16.6 (a) Let a ∈ ℝⁿ, f ∈ L¹_loc(ℝⁿ), φ ∈ D. Then

    ⟨D^α δ_a, φ⟩ = (−1)^{|α|} ⟨δ_a, D^α φ⟩ = (−1)^{|α|} (D^α φ)(a),
    ⟨D^α T_f, φ⟩ = (−1)^{|α|} ∫_{ℝⁿ} f · D^α φ dx.
(b) Recall that the so-called Heaviside function H(x) is defined as the characteristic function of the half-line (0, +∞). We compute its derivative in D′:

    ⟨T_H′, φ⟩ = −⟨T_H, φ′⟩ = −∫₀^∞ φ′(x) dx = −φ(x)|₀^∞ = φ(0) = ⟨δ, φ⟩.

This shows T_H′ = δ.
(c) Suppose that f(x) is continuously differentiable on ℝ \ {c} and has a jump at c, i.e. both one-sided limits f(c ± 0) exist. [Figure: graph of f(x) with a jump at x = c.] The derivative of T_f in D′ is

    (T_f)′ = T_{f′} + h δ_c,

where

    h = f(c + 0) − f(c − 0)

is the difference between the right-hand and left-hand limits of f at c, and f′ denotes the classical derivative away from c. Indeed, for φ ∈ D we have

    ⟨(T_f)′, φ⟩ = −∫_{−∞}^{c} f(x) φ′(x) dx − ∫_{c}^{∞} f(x) φ′(x) dx
               = −f(c − 0)φ(c) + f(c + 0)φ(c) + ∫_{ℝ\{c}} f′(x) φ(x) dx
               = (f(c + 0) − f(c − 0)) φ(c) + ⟨T_{f′}, φ⟩ = ⟨h δ_c + T_{f′}, φ⟩.
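The jump formula can be verified numerically. This is my own sketch; the jump location, jump height, and Gaussian test function are arbitrary choices:

```python
import numpy as np

c, h = 0.5, 2.0
x = np.linspace(-10.0, 10.0, 400_001)
f = np.sin(x) + h * (x >= c)                 # jump of height h at x = c
fprime = np.cos(x)                           # classical derivative away from c
phi = np.exp(-x**2)                          # smooth, negligible at the interval ends
dphi = -2.0 * x * phi

lhs = -np.trapz(f * dphi, x)                 # <(T_f)', phi> = -<T_f, phi'>
rhs = np.trapz(fprime * phi, x) + h * np.exp(-c**2)   # <T_{f'}, phi> + h*phi(c)
print(lhs, rhs)   # the two pairings agree
```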
(d) We prove that f(x) = log|x| is in L¹_loc(ℝ) (see homeworks 50.4 and 51.5) and compute its derivative in D′(ℝ).

Proof. Since f(x) is continuous on ℝ \ {0}, the only critical point is 0. Since the improper (Riemann or Lebesgue) integral ∫₀¹ log x dx = ∫_{−∞}^{0} t e^{t} dt = −1 exists, f is locally integrable at 0 and therefore defines a regular distribution. We will show that f′(x) = P(1/x). We use the fact that ∫_ℝ = ∫_{−∞}^{−ε} + ∫_{−ε}^{ε} + ∫_{ε}^{∞} for all positive ε > 0; the limit ε → 0 of the right-hand side gives ∫_ℝ. By definition of the derivative,

    ⟨(log|x|)′, φ⟩ = −⟨log|x|, φ′⟩ = −∫_ℝ log|x| φ′(x) dx
                  = −lim_{ε→0} ( ∫_{−∞}^{−ε} + ∫_{−ε}^{ε} + ∫_{ε}^{∞} ) log|x| φ′(x) dx.

Since ∫_{−1}^{1} |log|x| φ′(x)| dx < ∞, the middle integral ∫_{−ε}^{ε} log|x| φ′(x) dx tends to 0 as ε → 0 (apply Lebesgue's theorem to the family of functions g_ε(x) = χ_{[−ε,ε]}(x) log|x| φ′(x), which tends to 0 pointwise and is dominated by the integrable function log|x| φ′(x)). We consider the third integral. Integration by parts and φ(+∞) = 0 give

    ∫_{ε}^{∞} log x φ′(x) dx = log x φ(x)|_{ε}^{∞} − ∫_{ε}^{∞} φ(x)/x dx = −log ε φ(ε) − ∫_{ε}^{∞} φ(x)/x dx.

Similarly,

    ∫_{−∞}^{−ε} log|x| φ′(x) dx = log ε φ(−ε) − ∫_{−∞}^{−ε} φ(x)/x dx.

The sum of the two non-integral terms tends to 0 as ε → 0. Indeed,

    |log ε φ(−ε) − log ε φ(ε)| = |log ε| |φ(ε) − φ(−ε)| ≤ 2ε |log ε| sup|φ′| → 0.

Hence,

    ⟨f′, φ⟩ = lim_{ε→0} ( ∫_{−∞}^{−ε} + ∫_{ε}^{∞} ) φ(x)/x dx = ⟨P(1/x), φ⟩.
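The identity (log|x|)′ = P(1/x) can also be tested numerically. This sketch is my own illustration; the odd test function φ(x) = x e^{−x²} is chosen so that the principal-value pairing has the closed value √π:

```python
import numpy as np

# grid chosen so that x = 0 is not a node (log|x| is singular there but integrable)
x = np.linspace(-10.0, 10.0, 400_000)
assert not np.any(x == 0.0)

phi = x * np.exp(-x**2)                        # odd test function
dphi = (1.0 - 2.0 * x**2) * np.exp(-x**2)

lhs = -np.trapz(np.log(np.abs(x)) * dphi, x)   # <(log|x|)', phi>
rhs = np.sqrt(np.pi)                           # p.v. integral of phi/x = integral of e^{-x^2}
print(lhs, rhs)
```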
Proof. (a) Let K be a compact subset of ℝⁿ; we will show that f ∈ L¹(K). Since (fₙ) converges uniformly on K to f, by Theorem 6.6 f is integrable and limₙ ∫_K fₙ(x) dx = ∫_K f dx, such that f ∈ L¹_loc(ℝⁿ).

We show that T_{fₙ} → T_f in D′. Indeed, for any φ ∈ D with compact support K, again by Theorem 6.6 and uniform convergence of (fₙ) on K,

    lim_{n→∞} T_{fₙ}(φ) = lim_{n→∞} ∫_K fₙ(x)φ(x) dx = ∫_K lim_{n→∞} fₙ(x) φ(x) dx = ∫_K f(x)φ(x) dx = T_f(φ);

we are allowed to exchange limit and integration since (fₙ(x)φ(x)) converges uniformly on K. Since this is true for all φ ∈ D, it follows that T_{fₙ} → T_f.

(b) By Lemma 16.4 (a), differentiation is a continuous operation in D′. Thus D^α T_{fₙ} → D^α T_f.
Example 16.7 (a) Suppose that a, b > 0 and m ∈ ℕ are given such that |cₙ| ≤ a|n|^m + b for all n ∈ ℤ. Then the Fourier series

    Σ_{n∈ℤ} cₙ e^{inx}

converges in D′(ℝ). First consider the series

    c₀ x^{m+2}/(m+2)! + Σ_{n∈ℤ, n≠0} cₙ/(ni)^{m+2} e^{inx}.   (16.6)

By assumption,

    | cₙ/(ni)^{m+2} e^{inx} | = |cₙ|/|n|^{m+2} ≤ (a|n|^m + b)/|n|^{m+2} ≤ a′/|n|²

for some constant a′ > 0. Since Σ_{n≠0} a′/|n|² < ∞, the series (16.6) converges uniformly on ℝ by the criterion of Weierstraß (Theorem 6.2). By Lemma 16.5, the series (16.6) converges in D′, too, and can be differentiated term by term. The (m+2)nd derivative of (16.6) is exactly the given Fourier series.
(b) The 2π-periodic function f(x) = 1/2 − x/(2π), x ∈ [0, 2π), has discontinuities of the first kind at 2πn, n ∈ ℤ; the jump has height 1 since

    f(0 + 0) − f(0 − 0) = 1/2 − (−1/2) = 1.
Therefore, in D′,

    f′(x) = −1/(2π) + Σ_{n∈ℤ} δ(x − 2πn).

The Fourier series of f(x) is

    f(x) ∼ (1/(2πi)) Σ_{n≠0} (1/n) e^{inx}.

Note that f and the Fourier series g on the right are equal in L²(0, 2π). Hence ∫₀^{2π} |f − g|² = 0. This implies f = g a.e. on [0, 2π]; moreover, by periodicity, f = g a.e. on ℝ. Thus f = g in L¹_loc(ℝ) and f coincides with g in D′(ℝ):

    f(x) = (1/(2πi)) Σ_{n≠0} (1/n) e^{inx}   in D′(ℝ).
By Lemma 16.5 the series can be differentiated term by term up to arbitrary order. Applying Example 16.6 we obtain

    f′(x) = −1/(2π) + Σ_{n∈ℤ} δ(x − 2πn) = (1/(2π)) Σ_{n≠0} e^{inx},

that is,

    Σ_{n∈ℤ} δ(x − 2πn) = (1/(2π)) Σ_{n∈ℤ} e^{inx}   in D′(ℝ).
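The Dirac-comb identity can be made tangible numerically: pairing the partial sum (1/2π) Σ_{|n|≤N} e^{inx} (the Dirichlet kernel) with a test function supported essentially in (−π, π) should pick out φ(0). This is my own sketch; N and the Gaussian test function are arbitrary choices:

```python
import numpy as np

N = 200
x = np.linspace(-np.pi, np.pi, 200_001)
phi = np.exp(-x**2)                  # test function, negligible at +-pi

# partial sum (1/(2*pi)) * sum_{|n|<=N} e^{inx}  (Dirichlet kernel)
DN = np.ones_like(x)
for k in range(1, N + 1):
    DN += 2.0 * np.cos(k * x)
DN /= 2.0 * np.pi

val = np.trapz(DN * phi, x)          # only the n = 0 spike lies in (-pi, pi)
print(val)                           # tends to phi(0) = 1 as N grows
```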
Consider the equation x^m u = 0 in D′. Every distribution of the form

    u(x) = Σ_{n=0}^{m−1} cₙ δ^{(n)}(x),   cₙ ∈ ℂ,

is a solution: for φ ∈ D,

    ⟨x^m δ^{(n)}, φ⟩ = (−1)ⁿ ⟨δ, (x^m φ)^{(n)}⟩ = (−1)ⁿ (x^m φ)^{(n)}|_{x=0} = 0   for n ≤ m − 1;

thus the given u satisfies x^m u = 0. One can show that this is the general solution, see [Wla72, p. 84].
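The vanishing of the pairings ⟨x^m δ^{(n)}, φ⟩ for n < m can be checked symbolically. This is my own sketch; the Gaussian stands in for an arbitrary smooth test function:

```python
import sympy as sp

x = sp.symbols('x')
phi = sp.exp(-x**2)        # stand-in for a smooth test function
m = 3

# <x^m delta^{(n)}, phi> = (-1)^n * d^n/dx^n (x^m * phi) evaluated at x = 0
pairings = [(-1)**n * sp.diff(x**m * phi, x, n).subs(x, 0) for n in range(m)]
print(pairings)   # [0, 0, 0]
```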
(c) The general solution of the ODE u^{(m)} = 0 in D′ is a polynomial of degree ≤ m − 1.

Proof. We only prove that u′ = 0 implies u = c in D′; the general statement follows by induction on m. Suppose that u′ = 0, that is, for all φ ∈ D we have 0 = ⟨u′, φ⟩ = −⟨u, φ′⟩. Fix φ₁ ∈ D with ∫ φ₁ dt = 1; φ₁ plays an auxiliary role. For φ ∈ D, the function

    ψ(x) = ∫_{−∞}^{x} ( φ(t) − φ₁(t) I ) dt,   where I = ⟨1, φ⟩ = ∫ φ dt,

belongs to D since both φ and φ₁ do and since ∫ (φ − Iφ₁) dt = 0. Since ⟨u, ψ′⟩ = −⟨u′, ψ⟩ = 0 and ψ′ = φ − Iφ₁, we obtain

    0 = ⟨u, ψ′⟩ = ⟨u, φ − φ₁ I⟩ = ⟨u, φ⟩ − ⟨u, φ₁⟩ ⟨1, φ⟩
      = ⟨u − ⟨u, φ₁⟩·1, φ⟩ = ⟨u − c·1, φ⟩,

where c = ⟨u, φ₁⟩. Since this is true for all test functions φ ∈ D(ℝ), we obtain u − c = 0, i.e. u = c, which proves the assertion.
Indeed, T ⊗ S is linear on the span of product functions in D(ℝⁿ) × D(ℝᵐ), such that

    (T ⊗ S)( Σ_{k=1}^{r} φₖ ψₖ ) = Σ_{k=1}^{r} T(φₖ) S(ψₖ).

By continuity it is extended from D(ℝⁿ) ⊗ D(ℝᵐ) to D(ℝ^{n+m}). For example, if a ∈ ℝⁿ, b ∈ ℝᵐ, then δ_a ⊗ δ_b = δ_{(a,b)}. Indeed, for φ ∈ D(ℝⁿ) and ψ ∈ D(ℝᵐ) we have

    (δ_a ⊗ δ_b)(φψ) = φ(a)ψ(b) = (φ ⊗ ψ)(a, b) = δ_{(a,b)}(φ ⊗ ψ).

Lemma 16.6 Let F = T ⊗ S be the unique distribution in D′(ℝ^{n+m}), where T ∈ D′(ℝⁿ) and S ∈ D′(ℝᵐ), and let φ(x, y) ∈ D(ℝ^{n+m}). Then ⟨S(y), φ(x, y)⟩ is in D(ℝⁿ) as a function of x, ⟨T(x), φ(x, y)⟩ is in D(ℝᵐ) as a function of y, and we have

    ⟨T ⊗ S, φ⟩ = ⟨S, ⟨T, φ⟩⟩ = ⟨T, ⟨S, φ⟩⟩.
For the proof, see [Wla72, II.7].
Example 16.8 (a) Regular distributions. Let f ∈ L¹_loc(ℝⁿ) and g ∈ L¹_loc(ℝᵐ). Then f·g ∈ L¹_loc(ℝ^{n+m}) and T_f ⊗ T_g = T_{f·g}. Indeed, by Fubini's theorem, for test functions φ and ψ one has

    ⟨T_f ⊗ T_g, φψ⟩ = ⟨T_f, φ⟩ ⟨T_g, ψ⟩ = ∫_{ℝⁿ} f(x)φ(x) dx · ∫_{ℝᵐ} g(y)ψ(y) dy
                    = ∫_{ℝ^{n+m}} f(x)g(y) φ(x)ψ(y) d(x, y).
Differentiation acts on a tensor product factorwise; for a coordinate xᵢ of the first factor,

    ∂/∂xᵢ (T ⊗ S) = (∂T/∂xᵢ) ⊗ S.
For regular distributions the convolution is the ordinary one: substituting t = x − y,

    ⟨T_f * T_g, φ⟩ = ∫∫_{ℝ²} f(y) g(t) φ(y + t) dy dt = T_{f*g}(φ),   (16.7)

where ψ(y, t) = φ(y + t).
[Figure: the strip supp φ(x + y) in the (x, y)-plane, intersected with supp T × supp S.] The intersection of this strip with the supports is bounded for any φ ∈ D(ℝⁿ); then the integral (16.7) makes sense.
Definition 16.12 Let T, S ∈ D′(ℝⁿ) be distributions and assume that for every φ ∈ D(ℝⁿ) the set

    K_φ := {(x, y) ∈ ℝ²ⁿ | x + y ∈ supp φ, x ∈ supp T, y ∈ supp S}

is bounded. Define

    ⟨T * S, φ⟩ = lim_{k→∞} ⟨T ⊗ S, φ(x + y) ηₖ(x, y)⟩,   (16.8)

where (ηₖ) ⊂ D(ℝ²ⁿ) is a sequence converging to 1.
Differentiation of a convolution: (d/dx)(T * S) = (dT/dx) * S = T * (dS/dx). Indeed, by (16.8),

    ⟨(T*S)′, φ⟩ = −⟨T*S, φ′⟩ = −lim_{k} ⟨S ⊗ T, ηₖ(x, y) (∂/∂x) φ(x + y)⟩
               = −lim_{k} ⟨S ⊗ T, (∂/∂x)(ηₖ φ(x + y))⟩ + lim_{k} ⟨S ⊗ T, φ(x + y) ∂ηₖ/∂x⟩
               = lim_{k} ⟨S ⊗ T′, ηₖ φ(x + y)⟩ = ⟨S * T′, φ⟩,

since ∂ηₖ/∂x vanishes on the bounded set K_φ for large k, so the second limit is 0. The proof of the second equality uses commutativity of the convolution product.
(d) If supp S is compact and η ∈ D(ℝⁿ) is such that η(y) = 1 in a neighborhood of supp S, then

    (T * S)(φ) = ⟨T ⊗ S, φ(x + y) η(y)⟩,   φ ∈ D(ℝⁿ).
(e) If T₁, T₂, T₃ ∈ D′(ℝⁿ) all have compact support, then T₁ * (T₂ * T₃) and (T₁ * T₂) * T₃ exist and T₁ * (T₂ * T₃) = (T₁ * T₂) * T₃.
As usual, consider first the case of a regular distribution f(x) composed with an affine map given by a regular matrix. Let f̃(x) = f(Ax + b); with y = Ax + b, x = A⁻¹(y − b), dy = |det A| dx, we have

    ⟨T_{f̃}, φ⟩ = ∫ f(Ax + b) φ(x) dx = (1/|det A|) ∫ f(y) φ(A⁻¹(y − b)) dy = (1/|det A|) ⟨f(y), φ(A⁻¹(y − b))⟩.

Definition 16.13 Let T ∈ D′(ℝⁿ), A a regular n×n matrix, and b ∈ ℝⁿ. Then T(Ax + b) denotes the distribution

    ⟨T(Ax + b), φ(x)⟩ := (1/|det A|) ⟨T(y), φ(A⁻¹(y − b))⟩.
Example 16.9 (a) δ * S = S * δ = S for all S ∈ D′. The existence is clear since δ has compact support, and

    ⟨δ * S, φ⟩ = lim_{k} ⟨δ(x) ⊗ S(y), φ(x + y) ηₖ(x, y)⟩ = lim_{k} ⟨S(y), φ(y) ηₖ(0, y)⟩ = ⟨S, φ⟩.

(b) In particular δ_a * δ_b = δ_{a+b}.
(c) Let ρ ∈ L¹_loc(ℝⁿ) and suppose that supp T_ρ is compact.

Case n = 2: f(x) = log(1/‖x‖) ∈ L¹_loc(ℝ²). We call

    V(x) = (f * ρ)(x) = ∫∫_{ℝ²} ρ(y) log(1/‖x − y‖) dy

the logarithmic (surface) potential with density ρ.

Case n ≥ 3: f(x) = 1/‖x‖^{n−2} ∈ L¹_loc(ℝⁿ). We call

    V(x) = (f * ρ)(x) = ∫_{ℝⁿ} ρ(y)/‖x − y‖^{n−2} dy

the volume potential with density ρ.

(d) For the Gaussian densities f_σ(x) = (1/(√(2π) σ)) e^{−x²/(2σ²)} one has f_{σ₁} * f_{σ₂} = f_{√(σ₁² + σ₂²)}.
Consider a linear differential operator L = Σ_{|α|≤k} c_α D^α, where c_α ∈ C∞(ℝⁿ).

Definition 16.14 A distribution E ∈ D′(ℝⁿ) is said to be a fundamental solution of the differential operator L if

    L(E) = δ.

Note that E ∈ D′(ℝⁿ) need not be unique. It is a general result due to Malgrange and Ehrenpreis (1952) that any linear partial differential operator with constant coefficients possesses a fundamental solution.

(a) ODE

We start with an example from the theory of ordinary differential equations. Recall that H = χ_{(0,+∞)} denotes the Heaviside function.

Lemma 16.7 Suppose that u(t) is the solution of the following initial value problem for the ODE

    L[u] = u^{(m)} + a₁(t) u^{(m−1)} + ⋯ + a_m(t) u = 0,
    u(0) = u′(0) = ⋯ = u^{(m−2)}(0) = 0,   u^{(m−1)}(0) = 1.

Then E = T_{H(t)u(t)} is a fundamental solution of L. This yields

    L(E) = E^{(m)} + a₁(t) E^{(m−1)} + ⋯ + a_m(t) E = T_{H(t) L[u](t)} + δ = T₀ + δ = δ.

For example, y′ + ay = 0 with y(0) = 1 gives E = T_{H(x) e^{−ax}}, and y″ + a²y = 0 with y(0) = 0, y′(0) = 1 gives E = T_{H(x) sin(ax)/a}.
(b) PDE

Here is the main application of the convolution product: knowing a fundamental solution of a partial differential operator L, one immediately knows a weak solution of the inhomogeneous equation L(u) = f for f ∈ D′(ℝⁿ).

Theorem 16.8 Suppose that L[u] = Σ_{|α|≤k} c_α D^α u is a linear differential operator in ℝⁿ with constant coefficients c_α, and let E be a fundamental solution of L. If f ∈ D′(ℝⁿ) is such that the convolution E * f exists, then u = E * f is a solution of L(u) = f; it is unique in the class of distributions for which the convolution with E exists.

Indeed, suppose that S₁ and S₂ are both solutions of L(S) = f, i.e. L(S₁) = L(S₂) = f. Then

    S₁ − S₂ = (S₁ − S₂) * δ = (S₁ − S₂) * L(E) = Σ_{|α|≤k} c_α D^α(S₁ − S₂) * E
            = L(S₁ − S₂) * E = (f − f) * E = 0.   (16.9)
Fourier Transformation in S(ℝⁿ) and S′(ℝⁿ)

We want to define the Fourier transformation for test functions as well as for distributions. The problem with D(ℝⁿ) is that the Fourier transform

    F(φ)(ξ) = κₙ ∫_{ℝⁿ} e^{−iξ·x} φ(x) dx

of φ is an entire (analytic) function; in particular, Fφ does not have compact support, and the only test function in D which is analytic is 0. To overcome this problem, we enlarge the space of test functions, D ⊂ S, in such a way that S becomes invariant under the Fourier transformation, F(S) ⊆ S.

Lemma 16.9 Let φ ∈ D(ℝ). Then the Fourier transform g(z) = κ₁ ∫_ℝ e^{−itz} φ(t) dt is holomorphic in the whole complex plane and bounded in any strip H_a = {z ∈ ℂ | |Im(z)| ≤ a}.
Proof. (a) We show that the complex limit lim_{h→0} (g(z+h) − g(z))/h exists for all z ∈ ℂ. Indeed,

    (g(z+h) − g(z))/h = κ₁ ∫_ℝ e^{−izt} (e^{−iht} − 1)/h · φ(t) dt.

Since |e^{−izt} (e^{−iht} − 1)/h · φ(t)| ≤ C for all t ∈ supp(φ) and h ∈ ℂ, |h| ≤ 1, we can apply Lebesgue's dominated convergence theorem:

    lim_{h→0} (g(z+h) − g(z))/h = κ₁ ∫_ℝ e^{−izt} lim_{h→0} (e^{−iht} − 1)/h · φ(t) dt
                               = κ₁ ∫_ℝ e^{−izt} (−it) φ(t) dt = F(−it φ(t))(z).

(b) Suppose that |Im(z)| ≤ a and supp φ ⊆ K, K compact. Then

    |g(z)| ≤ κ₁ ∫_K |e^{−it Re(z)}| e^{t Im(z)} |φ(t)| dt ≤ κ₁ sup_{t∈K} |φ(t)| ∫_K e^{|t| a} dt < ∞.
Definition 16.15 S(ℝⁿ) is the set of all functions f ∈ C∞(ℝⁿ) such that for all multi-indices α and β

    p_{α,β}(f) = sup_{x∈ℝⁿ} | x^α D^β f(x) | < ∞.

Roughly speaking, a Schwartz-space function is a function decreasing to 0 (together with all its partial derivatives) faster than any rational function 1/P(x) as x → ∞. In place of p_{α,β} one can also use the norms

    p_{k,l}(φ) = Σ_{|α|≤k, |β|≤l} p_{α,β}(φ),   k, l ∈ ℤ₊,

to describe S(ℝⁿ).

The set S(ℝⁿ) is a linear space and the p_{α,β} are norms on S. For example, P(x) ∉ S for any non-zero polynomial P(x); however e^{−‖x‖²} ∈ S(ℝⁿ). S(ℝⁿ) is an algebra; indeed, the generalized Leibniz rule ensures p_{k,l}(φψ) < ∞. For example, f(x) = p(x) e^{−ax² + bx + c}, a > 0, belongs to S(ℝ) for any polynomial p; g(x) = e^{−|x|} is not differentiable at 0 and hence not in S(ℝ).
Convergence in S

Definition 16.16 Let φₙ, φ ∈ S. We say that the sequence (φₙ) converges in S to φ, abbreviated φₙ →_S φ, if one of the following equivalent conditions is satisfied for all multi-indices α and β:

    p_{α,β}(φₙ − φ) → 0;
    x^α D^β (φₙ − φ) → 0 uniformly on ℝⁿ;
    x^α D^β φₙ → x^α D^β φ uniformly on ℝⁿ.
Remarks 16.9 (a) In quantum mechanics one defines the position and momentum operators Q_k and P_k, k = 1, …, n, by

    (Q_k φ)(x) = x_k φ(x),   (P_k φ)(x) = −i ∂φ/∂x_k,

respectively. The space S is invariant under both operators Q_k and P_k; indeed x^α D^β φ(x) ∈ S(ℝⁿ) for all φ ∈ S(ℝⁿ).
(b) S(ℝⁿ) ⊂ L¹(ℝⁿ).

Recall that a rational function P(x)/Q(x) is integrable over [1, +∞) if and only if Q(x) ≠ 0 for x ≥ 1 and deg Q ≥ deg P + 2; indeed, C/x² is then an integrable upper bound. We want to find a condition on m such that

    ∫_{ℝⁿ} dx/(1 + ‖x‖²)^m < ∞.

For this, we use that any non-zero x ∈ ℝⁿ can uniquely be written as x = r·y, where r = ‖x‖ and y is on the unit sphere S^{n−1}. One can show that dx₁ dx₂ ⋯ dxₙ = r^{n−1} dr dS, where dS is the surface element of the unit sphere S^{n−1}. Using this and Fubini's theorem,

    ∫_{ℝⁿ} dx/(1 + ‖x‖²)^m = ∫_{S^{n−1}} ∫₀^∞ r^{n−1}/(1 + r²)^m dr dS = ω_{n−1} ∫₀^∞ r^{n−1}/(1 + r²)^m dr,

where ω_{n−1} is the (n−1)-dimensional measure of the unit sphere S^{n−1}. By the above criterion, the integral is finite if and only if 2m − n + 1 > 1, that is, if and only if m > n/2. In particular,

    ∫_{ℝⁿ} dx/(1 + ‖x‖^{n+1}) < ∞.

In case n = 1 the integral is π.
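A quick numerical illustration of the criterion (my own sketch, not part of the notes): for n = 1 the exponent m = 1 > 1/2 gives the finite value π, while at the borderline m = n/2 the truncated integrals grow without bound:

```python
import numpy as np
from scipy.integrate import quad

# m = 1 > n/2 = 1/2: convergent, value pi
val, err = quad(lambda x: 1.0 / (1.0 + x**2), -np.inf, np.inf)
print(val)   # pi

# borderline m = n/2 = 1/2: truncations grow (logarithmically)
trunc = [quad(lambda x: 1.0 / np.sqrt(1.0 + x**2), -L, L)[0] for L in (10, 100, 1000)]
print(trunc)
```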
By the above argument, for φ ∈ S(ℝⁿ),

    ∫_{ℝⁿ} |φ(x)| dx = ∫_{ℝⁿ} (1 + ‖x‖²ⁿ) |φ(x)| · dx/(1 + ‖x‖²ⁿ)
                     ≤ C p_{2n,0}(φ) ∫_{ℝⁿ} dx/(1 + ‖x‖²ⁿ) < ∞.
(c) D(ℝⁿ) ⊂ S(ℝⁿ); indeed, the supremum p_{α,β}(φ) of any test function φ ∈ D(ℝⁿ) is finite since the supremum of a continuous function over a compact set is finite. On the other hand, D ⊊ S, since e^{−‖x‖²} is in S but not in D.

(d) In contrast to D(ℝⁿ), the space S(ℝⁿ) of rapidly decreasing functions is a metric space. Indeed, S(ℝⁿ) is a locally convex space, that is, a linear space V whose topology is given by a countable family of norms (pₙ);

    d(φ, ψ) = Σ_{n=1}^{∞} (1/2ⁿ) · pₙ(φ − ψ)/(1 + pₙ(φ − ψ)),   φ, ψ ∈ V,

defines a metric on V describing the same topology. (In our case, use Cantor's first diagonal method to write the norms p_{k,l}, k, l ∈ ℕ, from the array into a sequence pₙ.)
Definition 16.17 Let f(x) ∈ L¹(ℝⁿ); then the Fourier transform Ff of the function f(x) is given by

    (Ff)(ξ) = κₙ ∫_{ℝⁿ} f(x) e^{−iξ·x} dx.

Let us abbreviate the normalization factor κₙ = 1/(√(2π))ⁿ. Caution: Wladimirow [Wla72] uses another convention, with e^{+iξ·x} under the integral and normalization factor 1 in place of κₙ. Note that Ff(0) = κₙ ∫_{ℝⁿ} f(x) dx.
We compute the Fourier transform of the Gaussian function φ(x) = e^{−‖x‖²/2}:

    F(φ)(ξ) = κₙ ∫_{ℝⁿ} e^{−½ Σ_{k=1}^{n} x_k²} e^{−i Σ_{k=1}^{n} x_k ξ_k} dx
            = Π_{k=1}^{n} κ₁ ∫_ℝ e^{−½ x_k² − i x_k ξ_k} dx_k
            = Π_{k=1}^{n} e^{−½ ξ_k²} = e^{−‖ξ‖²/2}.   (16.10)

Hence, the Fourier transform of e^{−x²/2} is the function itself. It follows via scaling x ↦ cx that

    F( e^{−c²x²/2} )(ξ) = (1/cⁿ) e^{−‖ξ‖²/(2c²)}.
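The fixed-point property can be verified numerically in one dimension. This sketch is my own illustration; the evaluation point ξ = 0.7 is an arbitrary choice:

```python
import numpy as np

kappa1 = 1.0 / np.sqrt(2.0 * np.pi)
x = np.linspace(-20.0, 20.0, 200_001)
phi = np.exp(-x**2 / 2.0)

xi = 0.7
F = kappa1 * np.trapz(phi * np.exp(-1j * xi * x), x)
print(F.real, np.exp(-xi**2 / 2.0))   # the two values agree
```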
Theorem 16.10 Let φ, ψ ∈ S(ℝⁿ). Then we have:

(i) F(D^α φ)(ξ) = (iξ)^α (Fφ)(ξ);
(ii) F(x^α φ)(ξ) = i^{|α|} D^α (Fφ)(ξ);
(iii) Fφ ∈ S(ℝⁿ);
(iv) F(φ * ψ) = κₙ⁻¹ Fφ · Fψ;
(v) F(φψ) = κₙ Fφ * Fψ;
(vi) F(φ(Ax + b))(ξ) = (1/|det A|) e^{i⟨A⁻¹b, ξ⟩} (Fφ)((A⁻¹)ᵀ ξ); in particular F(φ(cx))(ξ) = (1/|c|ⁿ) (Fφ)(ξ/c).

Proof. (i) We carry out the proof in case α = (1, 0, …, 0); the general case follows by iteration. Since e^{−iξ·x} φ(x) tends to 0 as |x₁| → ∞, we may integrate by parts (see Proposition 12.23):

    F(∂φ/∂x₁)(ξ) = κₙ ∫_{ℝⁿ} e^{−iξ·x} ∂φ/∂x₁ dx = −κₙ ∫_{ℝⁿ} ∂/∂x₁( e^{−iξ·x} ) φ(x) dx
                 = iξ₁ κₙ ∫_{ℝⁿ} e^{−iξ·x} φ(x) dx = iξ₁ (Fφ)(ξ).
(iii) follows from (i), (ii) and the estimate

    ∫_{ℝⁿ} (1 + ‖x‖^l) Σ_{|α|≤k} |D^α φ(x)| dx
    ≤ c₂ ∫_{ℝⁿ} (1 + ‖x‖^{l+n+1}) Σ_{|α|≤k} |D^α φ(x)| · dx/(1 + ‖x‖^{n+1})
    ≤ c₃ sup_{x} (1 + ‖x‖^{l+n+1}) Σ_{|α|≤k} |D^α φ(x)| ≤ c₄ p_{k, l+n+1}(φ),

which shows that every seminorm of Fφ is controlled by a seminorm of φ.
For f, g ∈ L¹(ℝⁿ), Fubini's theorem gives

    ∫_{ℝⁿ} ∫_{ℝⁿ} |f(y)| |g(x − y)| dx dy = ‖g‖_{L¹} ∫_{ℝⁿ} |f(y)| dy = ‖f‖_{L¹} ‖g‖_{L¹}.

This in particular shows that |(f * g)(x)| is finite a.e. on ℝⁿ. By definition and Fubini's theorem we have

    F(φ * ψ)(ξ) = κₙ ∫_{ℝⁿ} e^{−iξ·x} ∫_{ℝⁿ} φ(y) ψ(x − y) dy dx
               = κₙ ∫_{ℝⁿ} ( ∫_{ℝⁿ} e^{−iξ·(x−y)} ψ(x − y) dx ) e^{−iξ·y} φ(y) dy   [z = x − y]
               = κₙ⁻¹ (Fφ)(ξ) (Fψ)(ξ).

(vi) is straightforward using the substitution y = Ax + b.
Remark 16.10 The operator G, which is also defined on L¹(ℝⁿ), has properties similar to those of F:

    G(φ)(ξ) = κₙ ∫_{ℝⁿ} e^{+iξ·x} φ(x) dx,

and G(φ)(ξ) = F(φ)(−ξ) = F(φ(−x))(ξ).
Let φ ∈ S(ℝⁿ) and put, for ε > 0,

    ω_ε(x) = (1/(√(2π) ε))ⁿ e^{−‖x‖²/(2ε²)}.

By (16.10) and scaling, ∫_{ℝⁿ} ω_ε(x) dx = 1 for every ε. Further,

    ∫_{ℝⁿ} ω_ε(x) φ(x) dx = ∫_{ℝⁿ} ω_ε(x) (φ(x) − φ(0)) dx + φ(0) −→ φ(0)   as ε → 0.   (16.11)

In other words, (ω_ε(x)) is a δ-sequence.
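The δ-sequence property is easy to see numerically. This is my own sketch; the test function and the sequence of ε-values are arbitrary choices:

```python
import numpy as np

def pair_with_mollifier(eps, phi):
    x = np.linspace(-5.0, 5.0, 200_001)
    omega = np.exp(-x**2 / (2.0 * eps**2)) / (np.sqrt(2.0 * np.pi) * eps)
    return np.trapz(omega * phi(x), x)

phi = lambda x: np.cos(x) * np.exp(-x**2)
vals = [pair_with_mollifier(e, phi) for e in (0.5, 0.1, 0.02)]
print(vals)   # approaches phi(0) = 1 as eps shrinks
```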
We compute G(Fφ)(x). Using Fubini's theorem and the Gaussian regularization e^{−ε²‖ξ‖²/2},

    G(Fφ)(x) = lim_{ε→0} κₙ ∫_{ℝⁿ} (Fφ)(ξ) e^{−ε²‖ξ‖²/2} e^{iξ·x} dξ
             = lim_{ε→0} ∫_{ℝⁿ} φ(y) ( κₙ ∫_{ℝⁿ} e^{−ε²‖ξ‖²/2} e^{−iξ·(y−x)} dξ ) dy
             = lim_{ε→0} ∫_{ℝⁿ} φ(y) ω_ε(y − x) dy = φ(x),

where the inner integral was evaluated with the scaled Gaussian transform and the limit with (16.11). This proves the first part, G(Fφ) = φ. The second part, F(Gφ) = φ, follows from G(φ) = F(φ(−x)), F(φ) = G(φ(−x)), and the first part.
We are now going to complete the proof of Theorem 16.10 (v). For this, let φ = G(φ₁) and ψ = G(ψ₁) with φ₁, ψ₁ ∈ S. By (iv) we have

    F(φψ) = F( G(φ₁) G(ψ₁) ) = F( κₙ G(φ₁ * ψ₁) ) = κₙ φ₁ * ψ₁ = κₙ Fφ * Fψ.
Interchanging the order of integration (Fubini) also yields the multiplication formula: for f, φ ∈ L¹(ℝⁿ),

    ∫_{ℝⁿ} (Ff)(ξ) φ(ξ) dξ = ∫_{ℝⁿ} f(x) (Fφ)(x) dx.   (16.12)
Remark 16.11 S (Rn ) L2 (Rn ) is dense. Thus, the Fourier transformation has a unique
extension to a unitary operator F : L2 (Rn ) L2 (Rn ). (To a given f L2 choose a sequence
n S converging to f in the L2 -norm. Since F preserves the L2 -norm, kFn Fm k =
kn m k and (n ) is a Cauchy sequence in L2 , (Fn ) is a Cauchy sequence, too; hence it
converges to some g L2 . We define F(f ) = g.)
Tempered Distributions S′(ℝⁿ)

A tempered distribution is a continuous linear functional T on S(ℝⁿ): φₙ →_S 0 implies ⟨T, φₙ⟩ → 0. For φₙ ∈ D with φₙ →_D 0 it follows that φₙ →_S 0. So every continuous linear functional on S restricts to a continuous linear functional on D; that is, S′(ℝⁿ) ⊆ D′(ℝⁿ).

Remarks 16.12 (a) With the usual identification f ↔ T_f of functions and regular distributions, L¹(ℝⁿ) ⊂ S′(ℝⁿ) and L²(ℝⁿ) ⊂ S′(ℝⁿ).

(b) L¹_loc ⊄ S′; for example T_f ∉ S′(ℝ) for f(x) = e^{x²}, since T_f(φ) is not well-defined for all φ ∈ S.
Operations on S′

The operations are defined in the same way as in the case of D′. One has to show that the result is again in the (smaller) space S′. If T ∈ S′, then:

(a) D^α T ∈ S′ for all multi-indices α.
(b) fT ∈ S′ for all f ∈ C∞(ℝⁿ) such that D^α f grows at most polynomially at infinity for all multi-indices α (i.e. for all multi-indices α there exist C > 0 and k such that |D^α f(x)| ≤ C (1 + ‖x‖)^k).
(c) T(Ax + b) ∈ S′ for any regular real n×n matrix A and b ∈ ℝⁿ.
(d) T ∈ S′(ℝⁿ) and S ∈ S′(ℝᵐ) implies T ⊗ S ∈ S′(ℝ^{n+m}).
(e) Let T ∈ S′(ℝⁿ), ψ ∈ S(ℝⁿ). Define the convolution product

    ⟨ψ * T, φ⟩ = ⟨1(x) ⊗ T(y), ψ(x) φ(x + y)⟩ = ⟨T, ∫_{ℝⁿ} ψ(x) φ(x + y) dx⟩,   φ ∈ S(ℝⁿ).

Note that this definition coincides with the more general Definition 16.12 since

    lim_{k→∞} ψ(x) φ(x + y) ηₖ(x, y) = ψ(x) φ(x + y)   in S(ℝ²ⁿ).
Fourier Transformation in S′(ℝⁿ)

We follow our guiding principle to define the Fourier transform of a distribution T ∈ S′: first consider the case of a regular tempered distribution. We want to define F(T_f) := T_{Ff}. Suppose that f(x) ∈ L¹(ℝⁿ) is integrable. Then its Fourier transform Ff exists and is a bounded continuous function:

    |Ff(ξ)| ≤ κₙ ∫_{ℝⁿ} | e^{−iξ·x} f(x) | dx = κₙ ∫_{ℝⁿ} |f(x)| dx = κₙ ‖f‖_{L¹} < ∞.

Hence, by (16.12), ⟨T_{Ff}, φ⟩ = ⟨T_f, Fφ⟩. We take this equation as the definition of the Fourier transformation of a distribution T ∈ S′.
Definition 16.19 For T ∈ S′ and φ ∈ S define

    ⟨FT, φ⟩ = ⟨T, Fφ⟩.   (16.13)

We call FT the Fourier transform of the distribution T.
Remark 16.13 All the properties of the Fourier transformation stated in Theorem 16.10 (i), (ii), (iii), (iv), and (v) remain valid in the case of S′. In particular, F(x^α T) = i^{|α|} D^α (FT). Indeed, for φ ∈ S(ℝⁿ), by Theorem 16.10 (i) and (ii),

    ⟨F(x^α T), φ⟩ = ⟨x^α T, Fφ⟩ = ⟨T, x^α Fφ⟩ = ⟨T, (−i)^{|α|} F(D^α φ)⟩
                 = (−i)^{|α|} ⟨FT, D^α φ⟩ = (−i)^{|α|} (−1)^{|α|} ⟨D^α(FT), φ⟩ = ⟨i^{|α|} D^α(FT), φ⟩.

Example 16.12 (a) Let a ∈ ℝⁿ. We compute Fδ_a. For φ ∈ S(ℝⁿ),

    Fδ_a(φ) = δ_a(Fφ) = (Fφ)(a) = κₙ ∫_{ℝⁿ} e^{−ix·a} φ(x) dx = T_{κₙ e^{−ix·a}}(φ).
(c) The single-layer distribution. Suppose that S is a compact, regular, piecewise differentiable, non-self-intersecting surface in ℝ³ and ν(x) is an integrable density function on S (a "distribution" in the physical sense). We define the distribution νδ_S by the scalar surface integral

    ⟨νδ_S, φ⟩ = ∫∫_S ν(x) φ(x) dS.

The support of νδ_S is S, a set of measure zero with respect to the 3-dimensional Lebesgue measure. Hence, νδ_S is a singular distribution.
Similarly, one defines the double-layer distribution (which comes from dipoles) by

    ⟨ ∂/∂n⃗ (νδ_S), φ ⟩ = −∫∫_S ν(x) ∂φ(x)/∂n⃗ dS.
(b) The Fourier transform of the single layer δ_{S_r} of the sphere S_r of radius r centered at 0:

    ⟨Fδ_{S_r}, φ⟩ = ⟨δ_{S_r}, Fφ⟩ = ∫∫_{S_r} κ₃ ∫_{ℝ³} e^{−iξ·x} φ(x) dx dS(ξ)
                = κ₃ ∫_{ℝ³} φ(x) ∫∫_{S_r} e^{−iξ·x} dS(ξ) dx.

Using spherical coordinates on S_r, where x is fixed to be the z-axis and θ is the angle between x and ξ ∈ S_r, we have dS = r² sin θ dθ dφ and ξ·x = r‖x‖ cos θ. Hence the inner (surface) integral reads, with s = r‖x‖ cos θ, ds = −r‖x‖ sin θ dθ,

    ∫∫_{S_r} e^{−iξ·x} dS(ξ) = 2π r² ∫₀^π e^{−ir‖x‖ cos θ} sin θ dθ
                            = 2π (r/‖x‖) ∫_{−r‖x‖}^{r‖x‖} e^{−is} ds = 4π r sin(r‖x‖)/‖x‖.

Hence,

    ⟨Fδ_{S_r}, φ⟩ = √(2/π) ∫_{ℝ³} φ(x) · r sin(r‖x‖)/‖x‖ dx;   Fδ_{S_r}(x) = √(2/π) · r sin(r‖x‖)/‖x‖.
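The closed form for the surface integral can be confirmed numerically. This sketch is my own illustration; radius and frequency are arbitrary choices:

```python
import numpy as np

r, k = 1.5, 2.0          # sphere radius, |xi|
# surface integral of e^{-i xi.x} over S_r with xi along the z-axis:
# substituting s = cos(theta) gives 2*pi*r^2 * integral_{-1}^{1} e^{-i k r s} ds
s = np.linspace(-1.0, 1.0, 200_001)
surf = 2.0 * np.pi * r**2 * np.trapz(np.exp(-1j * k * r * s), s)

kappa3 = (2.0 * np.pi) ** -1.5
lhs = kappa3 * surf.real
rhs = np.sqrt(2.0 / np.pi) * r * np.sin(r * k) / k
print(lhs, rhs)   # the two values agree
```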
(d) The resolvent of the Laplacian −Δ. Consider the Hilbert space H = L²(ℝⁿ) and its dense subspace S(ℝⁿ). For φ ∈ S the Laplacian Δφ is defined. Recall that the resolvent of a linear operator A at λ is the bounded linear operator on H given by R_λ(A) = (A − λI)⁻¹. Given f ∈ H we are looking for u ∈ H with R_λ(−Δ) f = u. This is equivalent to solving

    −Δu − λu = f,   −F(Δu) − λFu = Ff,   ( Σ_{k=1}^{n} ξ_k² − λ ) Fu(ξ) = Ff(ξ),

hence

    Fu(ξ) = Ff(ξ)/(‖ξ‖² − λ)   and   R_λ(−Δ) = F⁻¹ ∘ 1/(‖ξ‖² − λ) ∘ F,   (16.14)

where in the middle is the multiplication operator by the function 1/(‖ξ‖² − λ). One can see that this operator is bounded in H if and only if λ ∈ ℂ \ ℝ₊, such that the spectrum of −Δ satisfies σ(−Δ) ⊆ ℝ₊.
for φ ∈ D(ℝ^{n+1}), where η ∈ D(ℝ) with η(t) = 1 for t > −ε and ε > 0 is any fixed positive number. The convolution (T * S)(x, t) vanishes for t < 0 and is continuous in both components, that is:

(a) If T_k → T in D′(ℝ^{n+1}) and supp T_k, supp T ⊆ ℝⁿ × ℝ₊, then T_k * S → T * S in D′(ℝ^{n+1}).
(b) If S_k → S in D′(ℝ^{n+1}) and supp S_k, supp S ⊆ Γ⁺(0, 0), then T * S_k → T * S in D′(ℝ^{n+1}).

Proof. Since η ∈ D(ℝ), there exists ε > 0 with η(x) = 0 for x < −ε. Let φ(x, t) ∈ D(ℝ^{n+1}) with supp φ ⊆ U_R(0) for some R > 0. Let η_K(x, t, y, s), K ∈ ℕ, be a sequence in D(ℝ^{2n+2}) converging to 1 in ℝ^{2n+2}, see before Definition 16.12. For sufficiently large K we then have

    χ_K := η(s) η(t) η(as − ‖y‖) η_K(x, t, y, s) φ(x + y, t + s)
         = η(s) η(t) η(as − ‖y‖) φ(x + y, t + s) =: χ.   (16.15)

To prove this it suffices to show that χ ∈ D(ℝ^{2n+2}). Indeed, χ is arbitrarily often differentiable and its support is contained in the bounded set

    {(x, t, y, s) | s, t ≥ −ε, as − ‖y‖ ≥ −ε, ‖x + y‖² + |t + s|² ≤ R²};

hence χ ∈ D(ℝ^{2n+2}). Moreover,

    χ_k := η(t) η(s) η(as − ‖y‖) φ_k(x + y, t + s) →_D χ   as k → ∞.

Hence,

    ⟨T ⊗ S, χ_k⟩ → ⟨T ⊗ S, χ⟩,

and T * S is continuous. We show that T * S vanishes for t < 0. For this, let φ ∈ D(ℝ^{n+1}) with supp φ ⊆ ℝⁿ × (−∞, −ε₁]. Choosing ε < ε₁/2, one has

    η(t) η(s) η(as − ‖y‖) φ(x + y, t + s) = 0,

such that ⟨T * S, φ⟩ = 0. Continuity of the convolution product follows from the continuity of the tensor product.
Chapter 17
PDE II The Equations of Mathematical
Physics
In this chapter we study in detail the Laplace equation, wave equation as well as the heat equation. Firstly, for all space dimensions n we determine the fundamental solutions to the corresponding differential operators; then we consider initial value problems and initial boundary
value problems. We study eigenvalue problems for the Laplace equation.
Recall Green's identities, see Proposition 10.2:

    ∫∫∫_G (u Δv − v Δu) dxdydz = ∫∫_{∂G} (u ∂v/∂n⃗ − v ∂u/∂n⃗) dS,
    ∫∫∫_G Δu dxdydz = ∫∫_{∂G} ∂u/∂n⃗ dS.   (17.1)

A direct computation shows that

    Δ( 1/‖x‖^{n−2} ) = 0 on ℝⁿ \ {0}, n ≥ 3;   Δ( log‖x‖ ) = 0 on ℝ² \ {0}, n = 2.

The fundamental solution of the Laplace operator is

    Eₙ(x) = (1/(2π)) log‖x‖,   n = 2;
    Eₙ(x) = −1/((n−2) ωₙ) · 1/‖x‖^{n−2},   n ≥ 3,

where ωₙ denotes the (n−1)-dimensional measure of the unit sphere S^{n−1}; that is, ΔEₙ = δ in D′(ℝⁿ).
Proof (case n ≥ 3). Since Eₙ ∈ L¹_loc(ℝⁿ),

    ⟨ΔEₙ, φ⟩ = ⟨Eₙ, Δφ⟩ = −1/((n−2) ωₙ) ∫_{ℝⁿ} Δφ(x)/‖x‖^{n−2} dx
             = −1/((n−2) ωₙ) lim_{ε→0} ∫_{‖x‖≥ε} Δφ(x)/‖x‖^{n−2} dx.

Apply Green's identity (17.1) to the region between the sphere ‖x‖ = ε and a large sphere outside supp φ, with v = 1/r^{n−2}, which is harmonic there; only boundary terms over ‖x‖ = ε survive:

    ∫_{‖x‖≥ε} Δφ(x)/r^{n−2} dx = ∫_{‖x‖=ε} ( (1/r^{n−2}) ∂φ/∂r − φ · ∂/∂r (1/r^{n−2}) ) dS.

Consider the first boundary integral as ε → 0. Note that φ and grad φ are both bounded by a constant c since φ is a test function. We make use of the estimate

    | ∫_{‖x‖=ε} (1/r^{n−2}) ∂φ/∂r dS | ≤ c ∫_{‖x‖=ε} dS/ε^{n−2} = c ωₙ ε^{n−1}/ε^{n−2} = c ωₙ ε,

which tends to 0 as ε → 0. In the second boundary integral, ∂/∂r (1/r^{n−2}) = −(n−2)/ε^{n−1} on ‖x‖ = ε, and the constants combine to

    1/(ωₙ ε^{n−1}) ∫_{‖x‖=ε} φ(x) dS.

Note that ωₙ ε^{n−1} is exactly the (n−1)-dimensional measure of the sphere of radius ε. So the integral is the mean value of φ over the sphere of radius ε. Since φ is continuous at 0, the mean value tends to φ(0). This proves the assertion in case n ≥ 3. The proof in case n = 2 is quite analogous.
Corollary 17.2 Suppose that f(x) is a continuous function with compact support. Then S = Eₙ * f is a regular distribution and we have ΔS = f in D′. In particular,

    S(x) = (1/(2π)) ∫∫_{ℝ²} log‖x − y‖ f(y) dy,   n = 2;
    S(x) = −(1/(4π)) ∫∫∫_{ℝ³} f(y)/‖x − y‖ dy,   n = 3.   (17.2)
Remarks 17.1 (a) The given solution (17.2) is even a classical solution of the Poisson equation. Indeed, we can differentiate the parameter integral as usual.
(b) The function G(x, y) = Eₙ(x − y) is called the Green's function of the Laplace equation.

The fundamental solution of the heat operator ∂/∂t − a²Δ is

    E(x, t) = H(t)/(4πa²t)^{n/2} · e^{−‖x‖²/(4a²t)}.   (17.3)
Proof. Step 1. The function E(x, t) is locally integrable since E = 0 for t ≤ 0, E ≥ 0 for t > 0, and for t > 0

    ∫_{ℝⁿ} E(x, t) dx = 1/(4πa²t)^{n/2} ∫_{ℝⁿ} e^{−‖x‖²/(4a²t)} dx = Π_{k=1}^{n} (1/√π) ∫_ℝ e^{−ρ²} dρ = 1.   (17.4)

Step 2. For t > 0, E ∈ C∞ and

    ∂E/∂t = ( ‖x‖²/(4a²t²) − n/(2t) ) E,
    ∂E/∂xᵢ = −xᵢ/(2a²t) E,   ∂²E/∂xᵢ² = ( xᵢ²/(4a⁴t²) − 1/(2a²t) ) E,

hence

    ∂E/∂t − a²ΔE = 0   for t > 0.   (17.5)

See also homework 59.2.
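Step 2 can be checked symbolically for n = 1. This is my own sketch; the computer-algebra check mirrors the hand computation above:

```python
import sympy as sp

x, t, a = sp.symbols('x t a', positive=True)
# one-dimensional heat kernel for t > 0
E = sp.exp(-x**2 / (4 * a**2 * t)) / sp.sqrt(4 * sp.pi * a**2 * t)

residual = sp.simplify(sp.diff(E, t) - a**2 * sp.diff(E, x, 2))   # the PDE residual
total = sp.integrate(E, (x, -sp.oo, sp.oo))                       # total heat, cf. (17.4)
print(residual, total)   # 0 and 1
```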
We give a proof using the Fourier transformation with respect to the spatial variables. Let Ẽ(ξ, t) = (F_x E)(ξ, t). Applying F_x to the defining equation of the fundamental solution yields a first-order ODE with respect to the time variable t,

    Ẽ_t + a²‖ξ‖² Ẽ = κₙ δ(t),

with solution Ẽ(ξ, t) = κₙ H(t) e^{−a²‖ξ‖²t}. By the scaled Gaussian formula F(e^{−c²x²/2})(ξ) = c^{−n} e^{−‖ξ‖²/(2c²)} with 1/(2c²) = a²t, i.e. c = 1/(a√(2t)), we obtain

    E(x, t) = H(t) κₙ F⁻¹( e^{−a²t‖ξ‖²} )(x)
            = H(t) · 1/(2π)^{n/2} · 1/(2a²t)^{n/2} · e^{−‖x‖²/(4a²t)}
            = H(t)/(4πa²t)^{n/2} · e^{−‖x‖²/(4a²t)}.
Corollary 17.4 Suppose that f(x, t) is a continuous function on ℝⁿ × ℝ₊ with compact support. Let

    V(x, t) = H(t)/(4πa²)^{n/2} ∫₀ᵗ ∫_{ℝⁿ} e^{−‖x−y‖²/(4a²(t−s))}/(t − s)^{n/2} · f(y, s) dy ds.

Then V(x, t) is a regular distribution in D′(ℝⁿ × ℝ₊) and a solution of u_t − a²Δu = f in D′(ℝⁿ × ℝ₊).
Proof. This follows from Theorem 16.8.
Proof. As in the case of the heat equation, let Ẽ(ξ, t) = (F_x E)(ξ, t) be the Fourier transform of the fundamental solution E(x, t). Then Ẽ(ξ, t) satisfies

    ∂²Ẽ/∂t² + a²‖ξ‖² Ẽ = κ₃ · 1(ξ) δ(t).

Again, this is an ODE of order 2 in t. Recall from Example 16.10 that u″ + a²u = δ, a ≠ 0, has a solution u(t) = H(t) sin(at)/a. Thus,

    Ẽ(ξ, t) = κ₃ H(t) sin(a‖ξ‖t)/(a‖ξ‖),

where ξ is thought of as a parameter. Apply the inverse Fourier transformation F_x⁻¹ to this function. Recall from Example 16.12 (b) that the Fourier transform of the single layer of the sphere of radius at around 0 is

    Fδ_{S_{at}}(ξ) = √(2/π) · at sin(at‖ξ‖)/‖ξ‖.

This shows

    E₃(x, t) = H(t)/(4πa²t) · δ_{S_{at}}(x).

Let us evaluate ⟨E₃, φ(x, t)⟩. Using dx₁ dx₂ dx₃ = dS_r dr, where x = (x₁, x₂, x₃), r = ‖x‖ and dS is the surface element of the sphere S_r(0), as well as the transformation r = at, dr = a dt, we obtain

    ⟨E₃, φ(x, t)⟩ = 1/(4πa²) ∫₀^∞ (1/t) ∫∫_{S_{at}} φ(x, t) dS dt   (17.6)
                = 1/(4πa²) ∫₀^∞ (1/r) ∫∫_{S_r} φ(x, r/a) dS dr
                = 1/(4πa²) ∫_{ℝ³} φ(x, ‖x‖/a)/‖x‖ dx.   (17.7)
(b) The Dimensions n = 2 and n = 1
To construct the fundamental solution E2 (x, t), x = (x1 , x2 ), we use the so-called method of
descent.
Lemma 17.6 A fundamental solution E₂ of the 2-dimensional wave operator □_{a,2} is given by

    ⟨E₂, φ(x₁, x₂, t)⟩ = lim_{k→∞} ⟨E₃(x₁, x₂, x₃, t), φ(x₁, x₂, t) ηₖ(x₃)⟩,

where E₃ denotes a fundamental solution of the 3-dimensional wave operator □_{a,3} and ηₖ ∈ D(ℝ) is a sequence of functions converging to 1 as k → ∞.
(a) The regular distribution

    E₂(x, t) = H(at − ‖x‖)/( 2πa √(a²t² − ‖x‖²) ),   i.e. E₂ = 0 for at ≤ ‖x‖,

is a fundamental solution of the 2-dimensional wave operator.

(b) The regular distribution

    E₁(x, t) = (1/(2a)) H(at − |x|) = { 1/(2a), |x| < at;   0, |x| ≥ at }

is a fundamental solution of the one-dimensional wave operator.
Proof. By the above lemma,

    ⟨E₂, φ(x₁, x₂, t)⟩ = ⟨E₃, φ(x₁, x₂, t)·1(x₃)⟩ = 1/(4πa²) ∫₀^∞ (1/t) ∫∫_{S_{at}} φ(x₁, x₂, t) dS dt.

Integration over both the upper and the lower half-sphere yields a factor 2; parametrizing the upper half-sphere by (x₁, x₂) with dS = at dx₁ dx₂/√(a²t² − x₁² − x₂²), we get

    = 2/(4πa²) ∫₀^∞ (1/t) ∫∫_{x₁²+x₂² ≤ a²t²} at φ(x₁, x₂, t)/√(a²t² − x₁² − x₂²) dx₁ dx₂ dt
    = 1/(2πa) ∫₀^∞ ∫∫_{‖x‖≤at} φ(x₁, x₂, t)/√(a²t² − x₁² − x₂²) dx₁ dx₂ dt.
(b) It was already shown in homework 57.2 that E₁ is the fundamental solution to the one-dimensional wave operator. A short proof is to be found in [Wla72, II.6.5, Example g)].
Consider the initial value problem for the ODE

    u″ + a²u = f(t),   u|_{t=0+} = u₀,   u′|_{t=0+} = u₁,   (17.8)

where f ∈ C(ℝ₊). We extend the solution u(t) as well as f(t) by 0 for negative values of t, t < 0. We denote the new functions by ũ and f̃, respectively. Since ũ has a jump of height u₀ at 0, by Example 16.6, ũ′(t) = {u′(t)} + u₀ δ(t). Similarly, ũ′(t) jumps at 0 by u₁, such that ũ″(t) = {u″(t)} + u₀ δ′(t) + u₁ δ(t). Hence, ũ satisfies on ℝ the equation

    ũ″ + a²ũ = f̃(t) + u₀ δ′(t) + u₁ δ(t).   (17.9)

We construct the solution ũ. Since the fundamental solution E(t) = H(t) sin(at)/a as well as the right-hand side of (17.9) has positive support, the convolution product exists and equals

    ũ = E * ( f̃ + u₀ δ′ + u₁ δ ) = E * f̃ + u₀ E′(t) + u₁ E(t)
      = (1/a) ∫₀ᵗ f(τ) sin a(t − τ) dτ + u₀ E′(t) + u₁ E(t).

Since in case t > 0, ũ satisfies (17.9) and the solution of the Cauchy problem is unique, the above formula gives the classical solution for t > 0, that is,

    u(t) = (1/a) ∫₀ᵗ f(τ) sin a(t − τ) dτ + u₀ cos at + u₁ sin(at)/a.
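The closed-form solution via the fundamental solution can be cross-checked against a direct numerical integration. This sketch is my own illustration; the coefficients, data and forcing term are arbitrary choices:

```python
import numpy as np
from scipy.integrate import solve_ivp

a, u0, u1 = 2.0, 1.0, 0.5
f = lambda t: np.sin(3.0 * t)

def formula(t):
    # u(t) = (1/a) * int_0^t f(tau) sin a(t - tau) dtau + u0 cos at + u1 sin(at)/a
    tau = np.linspace(0.0, t, 20_001)
    duh = np.trapz(f(tau) * np.sin(a * (t - tau)), tau) / a
    return duh + u0 * np.cos(a * t) + u1 * np.sin(a * t) / a

sol = solve_ivp(lambda t, y: [y[1], f(t) - a**2 * y[0]],
                (0.0, 2.0), [u0, u1], rtol=1e-10, atol=1e-12)
print(formula(2.0), sol.y[0, -1])   # the two values agree
```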
(a) The problem

    u_tt − a²Δu = f(x, t),   x ∈ ℝⁿ, t > 0,   (17.10)
    u|_{t=0+} = u₀(x),   (17.11)
    u_t|_{t=0+} = u₁(x),   (17.12)

with f ∈ C(ℝⁿ × ℝ₊), u₀ ∈ C¹(ℝⁿ), u₁ ∈ C(ℝⁿ), is called the classical initial value problem (CIVP, for short) for the wave equation. A function u(x, t) is called a classical solution of the CIVP if

    u(x, t) ∈ C²(ℝⁿ × ℝ₊) ∩ C¹(ℝⁿ × ℝ̄₊),

u(x, t) satisfies the wave equation (17.10) for t > 0 and the initial conditions (17.11) and (17.12) as t → 0 + 0.

(b) The problem

    □_a U = F(x, t) + U₀(x)·δ′(t) + U₁(x)·δ(t),

that is, for all test functions φ,

    ⟨U_tt − a²ΔU, φ⟩ = ⟨F, φ⟩ − ⟨U₀, φ_t(·, 0)⟩ + ⟨U₁, φ(·, 0)⟩
    = ∫₀^∞ ∫_{ℝⁿ} f(x, t) φ(x, t) dx dt − ∫_{ℝⁿ} u₀(x) φ_t(x, 0) dx + ∫_{ℝⁿ} u₁(x) φ(x, 0) dx   (17.13)

(in the regular case), is called the generalized initial value problem (GIVP) for the wave equation.
Conversely, suppose u is a classical solution, extended by 0 to t < 0. Two integrations by parts in t give

    ∫₀^∞ u_tt φ dt = −u_t(x, 0) φ(x, 0) + u(x, 0) φ_t(x, 0) + ∫₀^∞ u φ_tt dt.

Since ℝⁿ has no boundary and φ has compact support, integration by parts with respect to the spatial variables x yields no boundary terms. Hence, by the above formula and ∫∫ Δu φ dt dx = ∫∫ u Δφ dt dx, we obtain

    ⟨U_tt − a²ΔU, φ⟩ = ⟨U, φ_tt − a²Δφ⟩ = ∫_{ℝⁿ} ∫₀^∞ u(x, t) (φ_tt − a²Δφ) dt dx
    = ∫_{ℝⁿ} ∫₀^∞ (u_tt − a²Δu) φ dt dx + ∫_{ℝⁿ} ( u(x, 0) φ_t(x, 0) − u_t(x, 0) φ(x, 0) ) dx.   (17.14)

By Lemma 16.2 (Du Bois-Reymond) it follows that u_tt − a²Δu = f on ℝⁿ × ℝ₊. Inserting this into (17.13) and (17.14), we have

    ∫_{ℝⁿ} (u₀(x) − u(x, 0)) φ_t(x, 0) dx − ∫_{ℝⁿ} (u₁(x) − u_t(x, 0)) φ(x, 0) dx = 0.

Since φ(x, 0) and φ_t(x, 0) can be prescribed arbitrarily for φ ∈ D(ℝ^{n+1}), it follows that

    u₀(x) = u(x, 0),   u₁(x) = u_t(x, 0).
Corollary 17.9 Suppose that F, U₀, and U₁ are data of the GIVP. Then there exists a unique solution U of the GIVP. It can be written as

    U = V + V⁽⁰⁾ + V⁽¹⁾,   where   V = Eₙ * F,   V⁽¹⁾ = Eₙ *_x U₁,   V⁽⁰⁾ = ∂/∂t ( Eₙ *_x U₀ ).

Here Eₙ *_x U₁ := Eₙ * (U₁(x)·δ(t)) denotes the convolution product with respect to the spatial variables x only. The solution U depends continuously, in the sense of convergence in D′, on F, U₀, and U₁. Here Eₙ denotes the fundamental solution of the n-dimensional wave operator □_{a,n}.

Proof. The supports of the distributions U₀·δ′ and U₁·δ are contained in the hyperplane {(x, t) ∈ ℝ^{n+1} | t = 0}. Hence the support of the distribution F + U₀·δ′ + U₁·δ is contained in the half-space ℝⁿ × ℝ₊. It follows from Proposition 16.15 below that the convolution product

    U = Eₙ * (F + U₀·δ′ + U₁·δ)

exists and has support in the half-space t ≥ 0. It follows from Theorem 16.8 that U is a solution of the GIVP. On the other hand, any solution of the GIVP has support in ℝⁿ × ℝ₊ and therefore, by Proposition 16.15, possesses the convolution with Eₙ. By Theorem 16.8, the solution U is unique.

Suppose that U_k → U as k → ∞ in D′(ℝ^{n+1}); then Eₙ * U_k → Eₙ * U by the continuity of the convolution product in D′ (see Proposition 16.15).
For n = 3 this yields Kirchhoff's formula:

    u(x, t) = 1/(4πa²) [ ∫∫∫_{U_{at}(x)} f(y, t − ‖x−y‖/a)/‖x − y‖ dy
              + (1/t) ∫∫_{S_{at}(x)} u₁(y) dS_y
              + ∂/∂t ( (1/t) ∫∫_{S_{at}(x)} u₀(y) dS_y ) ].   (17.15)
For n = 2 one obtains Poisson's formula:

    u(x, t) = 1/(2πa) [ ∫₀ᵗ ∫∫_{U_{a(t−s)}(x)} f(y, s)/√(a²(t−s)² − ‖x−y‖²) dy ds
              + ∫∫_{U_{at}(x)} u₁(y)/√(a²t² − ‖x−y‖²) dy
              + ∂/∂t ∫∫_{U_{at}(x)} u₀(y)/√(a²t² − ‖x−y‖²) dy ].   (17.16)

Stability: if the data of two problems are uniformly close, say |f − f̃| ≤ ε₂, |u₁ − ũ₁| ≤ ε₁, |u₀ − ũ₀| ≤ ε₀ and |grad(u₀ − ũ₀)| ≤ ε₀′ (where we impose the last inequality only in cases n = 3 and n = 2), then the corresponding solutions u(x, t) and ũ(x, t) satisfy in a strip 0 ≤ t ≤ T

    |u(x, t) − ũ(x, t)| ≤ (1/2) T² ε₂ + T ε₁ + ε₀ + aT ε₀′,

where the last term is omitted in case n = 1.
Proof (idea of proof). We show Kirchhoff's formula.

(a) The potential term with f. By Proposition 16.15 below, the convolution product E₃ * f exists. It is shown in [Wla72, p. 153] that for a locally integrable function f ∈ L¹_loc(ℝ^{n+1}) with supp f ⊆ ℝⁿ × ℝ₊, Eₙ * T_f is again a locally integrable function. Formally, the convolution product is given by

    (E₃ * f)(x, t) = ∫_{ℝ⁴} E₃(y, s) f(x − y, t − s) dy ds,

where the integral is to be understood as the evaluation of E₃(y, s) on the shifted function f(x − y, t − s). Since f has support on the positive time axis, one can restrict oneself to s > 0 and t − s > 0, that is, to 0 < s < t. Formula (17.6) then gives

    (E₃ * f)(x, t) = 1/(4πa²) ∫₀^∞ (1/s) ∫∫_{S_{as}} f(x − y, t − s) dS(y) ds
                  = 1/(4πa²) ∫₀^{at} (1/r) ∫∫_{S_r} f(x − y, t − r/a) dS(y) dr.

The shift z = x − y, dz₁ dz₂ dz₃ = dy₁ dy₂ dy₃, finally yields

    V(x, t) = 1/(4πa²) ∫∫∫_{U_{at}(x)} f(z, t − ‖x − z‖/a)/‖x − z‖ dz.

(b) The term with u₁:

    V⁽¹⁾(x, t) = (E₃ *_x u₁)(x, t) = 1/(4πa²t) ∫∫∫_{ℝ³} δ_{S_{at}}(y) u₁(x − y) dy
              = 1/(4πa²t) ∫∫_{S_{at}} u₁(x − y) dS(y) = 1/(4πa²t) ∫∫_{S_{at}(x)} u₁(y) dS(y).

(c) The term with u₀ is V⁽⁰⁾ = ∂/∂t (E₃ *_x u₀), which gives the third summand in (17.15).
Remark 17.2 (a) The stronger regularity (differentiability) conditions on f, u₀, u₁ are necessary to prove u ∈ C²(ℝⁿ × ℝ₊) and to show stability.
(b) Proposition 17.10 and Corollary 17.9 show that the GIVP for the wave equation is a well-posed problem (existence, uniqueness, stability).
$$u_t - a^2 \Delta u = f(x,t), \quad x \in \mathbb{R}^n,\ t > 0, \tag{17.18}$$
$$u(x,0) = u_0(x), \quad x \in \mathbb{R}^n, \tag{17.19}$$
with given $f \in C(\mathbb{R}^n \times \overline{\mathbb{R}_+})$ and $u_0 \in C(\mathbb{R}^n)$, is called the classical initial value problem (CIVP, for short) to the heat equation.
A function $u(x,t)$ is called a classical solution of the CIVP if
$$u(x,t) \in C^2(\mathbb{R}^n \times (0,+\infty)) \cap C(\mathbb{R}^n \times [0,+\infty)),$$
and $u(x,t)$ satisfies the heat equation (17.18) and the initial condition (17.19).
(b) The problem
$$U_t - a^2 \Delta U = F + U_0 \cdot \delta(t)$$
The fundamental solution of the heat operator has the following properties:
$$\int_{\mathbb{R}^n} E(x,t)\, dx = 1, \quad t > 0, \qquad E(x,t) \to \delta(x) \quad\text{as}\quad t \to 0+.$$
The fundamental solution describes the heat distribution of a point source at the origin $(0,0)$. Since $E(x,t) > 0$ for all $t > 0$ and all $x \in \mathbb{R}^n$, the heat propagates with infinite speed. This is in contrast to our experience. However, for short distances the heat equation gives sufficiently good results. For long distances one uses the transport equation. We summarize the results, which are similar to those of the wave equation.
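As a quick numerical sanity check (a sketch, not part of the original text), both properties can be verified for the one-dimensional heat kernel $E(x,t) = (4\pi a^2 t)^{-1/2} e^{-x^2/(4a^2t)}$; the diffusivity $a = 1$ and the test function $\cos x$ are arbitrary choices:

```python
import numpy as np

# 1D heat kernel E(x,t); the value a = 1 is an arbitrary choice.
def E(x, t, a=1.0):
    return np.exp(-x**2 / (4 * a**2 * t)) / np.sqrt(4 * np.pi * a**2 * t)

x = np.linspace(-50, 50, 200001)

# Total heat is conserved: the integral over R equals 1 for every t > 0.
masses = [np.trapz(E(x, t), x) for t in (0.01, 1.0, 10.0)]

# Delta behaviour as t -> 0+: the pairing with cos(x) equals e^{-t} exactly,
# so for small t it is close to cos(0) = 1.
pairing = np.trapz(E(x, 0.01) * np.cos(x), x)
```

The pairing value $e^{-a^2 t}$ follows from the Fourier transform of the kernel, which is exactly the damping factor that reappears in the Fourier-series solutions below.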
Proposition 17.11 (a) Suppose that $u(x,t)$ is a solution of the CIVP with the given data $f$ and $u_0$. Then the regular distribution $T_{\tilde u}$ is a solution of the GIVP with the right hand side $T_{\tilde f} + T_{u_0} \cdot \delta(t)$, provided that $f(x,t)$ and $u(x,t)$ are extended to $\tilde f(x,t)$ and $\tilde u(x,t)$ by $0$ into the left half-space $\{(x,t) \mid (x,t) \in \mathbb{R}^{n+1},\ t < 0\}$.
(b) Conversely, suppose that $U$ is a solution of the GIVP. Let the distributions $F = T_f$, $U_0 = T_{u_0}$, and $U = T_u$ be regular and satisfy the regularity assumptions of the CIVP. Then $u(x,t)$ is a solution of the CIVP.
Proposition 17.12 Suppose that F and U0 are data of the GIVP. Suppose further that F and U0
both have compact support. Then there exists a solution U of the GIVP which can be written as
$$U = V + V^{(0)}, \quad\text{where}\quad V = E * F, \qquad V^{(0)} = E *_x U_0.$$
$$C_b(\mathbb{R}^n) = \{ f \in C(\mathbb{R}^n) \mid f\ \text{is bounded on}\ \mathbb{R}^n \}$$
Corollary 17.13 (a) Let $f \in \mathcal{M}$ and $u_0 \in C_b(\mathbb{R}^n)$. Then the two potentials $V(x,t)$ as in Corollary 17.4 and
$$V^{(0)}(x,t) = E * T_{u_0} = \frac{H(t)}{(4\pi a^2 t)^{\frac{n}{2}}} \int_{\mathbb{R}^n} u_0(y)\, e^{-\frac{\|x-y\|^2}{4a^2 t}}\, dy$$
are regular distributions and $u = V + V^{(0)}$ is a solution of the GIVP.
(b) In case $f \in C^2(\mathbb{R}^n \times \overline{\mathbb{R}_+})$ with $D^\alpha f \in \mathcal{M}$ for all $\alpha$ with $|\alpha| \le 1$ (first order partial derivatives), the solution in (a) is a solution of the CIVP. In particular, $V^{(0)}(x,t) \to u_0(x)$ as $t \to 0+$.
(c) The solution u of the GIVP is unique in the class M.
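To illustrate the potential $V^{(0)}$ and its limit $t \to 0+$ numerically — a sketch with assumed data, not part of the text — take $n = 1$, $a = 1$, and $u_0(x) = e^{-x^2}$, for which the Gaussian convolution has the closed form $(1+4t)^{-1/2}\,e^{-x^2/(1+4t)}$:

```python
import numpy as np

a = 1.0
u0 = lambda x: np.exp(-x**2)   # chosen so the convolution has a closed form

def V0(x, t):
    # V^(0)(x,t) = (4 pi a^2 t)^(-1/2) * integral of u0(y) exp(-(x-y)^2/(4 a^2 t)) dy
    y = np.linspace(-30, 30, 60001)
    k = np.exp(-(x - y)**2 / (4 * a**2 * t)) / np.sqrt(4 * np.pi * a**2 * t)
    return np.trapz(k * u0(y), y)

# Convolution of two Gaussians, evaluated in closed form:
exact = lambda x, t: np.exp(-x**2 / (1 + 4 * a**2 * t)) / np.sqrt(1 + 4 * a**2 * t)

err = max(abs(V0(xx, 0.5) - exact(xx, 0.5)) for xx in (-1.0, 0.0, 2.0))
init = V0(0.0, 1e-4)          # V^(0)(x,t) -> u0(x) as t -> 0+
```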
[Figure: the backward light cone $\Gamma^-(x,t)$, whose sphere at time $0$ has radius $at$, and the forward light cone $\Gamma^+(x,t)$, drawn in the $(y_1, y_2)$-plane.]
$$\Gamma^-(x,t) = \{(y,s) \mid \|x-y\| \le a(t-s),\ s < t\}, \qquad \Gamma^+(x,t) = \{(y,s) \mid \|x-y\| \le a(s-t),\ s > t\},$$
which are called domain of dependence (backward light cone) and domain of influence (forward
light cone), respectively.
Recall that the boundaries $\partial\Gamma^+$ and $\partial\Gamma^-$ are characteristic surfaces of the wave equation.
$$u = \frac{1}{4\pi a^2 t}\, \delta_{S_{at}} * T_{u_1}$$
It follows by the superposition principle that the solution $u(x_0, t_0)$ of an initial disturbance $u_0(x)\delta'(t) + u_1(x)\delta(t)$ is completely determined by the values of $u_0$ and $u_1$ on the sphere of the backward light cone at $t = 0$; that is, by the values $u_0(x)$ and $u_1(x)$ at all $x$ with $\|x - x_0\| = a t_0$.
Now let the disturbance be situated in a compact set $K$ rather than at a single point. Suppose that $d$ and $D$ are the minimal and maximal distances of $x$ from $K$. Then the disturbance starts to act at $x$ at time $t_0 = d/a$; it lasts for $(D-d)/a$; and again, for $t > D/a = t_1$ there is silence at $x$. Therefore, we can observe a forward wave front at time $t_0$ and a backward wave front at time $t_1$.
[Figure: the compact disturbance $K$, its domain of influence $M(K)$, and the regions of silence before and after the wave passes.]
This shows that the domain of influence $M(K)$ of a compact set $K$ is the union of all boundaries of forward light cones $\Gamma^+(y, 0)$ with $y \in K$ at time $t = 0$:
$$M(K) = \{(y,s) \mid \exists\, x \in K \colon \|x - y\| = as\}.$$
(b) Propagation of Plane Waves
Consider the fundamental solution
$$E_2(x,t) = \frac{H(at - \|x\|)}{2\pi a \sqrt{a^2 t^2 - \|x\|^2}}, \qquad x = (x_1, x_2).$$
[Figure: supports of the disturbance at times $t = 0, 1, 2, 3, 4$ — expanding discs of radius $at$; there is a sharp forward wave front but no backward wave front.]
Diffusion can also be observed in case of an arbitrary initial disturbance $u_0(x)\delta'(t) + u_1(x)\delta(t)$. Indeed, the superposition principle shows that the domain of dependence of a compact initial disturbance $K$ is the union of all discs $U_{at}(y)$ with $y \in K$.
(c) Propagation on a Line
Recall that $E_1(x,t) = \frac{1}{2a}\, H(at - |x|)$. The disturbance at time $t > 0$ which is caused by a point source $\delta(x)\delta(t)$ is the whole closed interval $[-at, at]$. We have two forward wave fronts, one at the point $x = at$ and one at $x = -at$; one moving to the right and one moving to the left. As in the plane case, there does not exist a backward wave front; we observe diffusion.
For more details, see the discussion in Wladimirow, [Wla72, p. 155–159].
Each of the following sets forms a CNOS in the indicated Hilbert space $H$:
$$\left\{ \frac{1}{\sqrt{2\pi}}\, e^{int} \;\middle|\; n \in \mathbb{Z} \right\}, \quad H = L^2(a, a+2\pi),$$
$$\left\{ \frac{1}{\sqrt{b-a}}\, e^{\frac{2\pi i n t}{b-a}} \;\middle|\; n \in \mathbb{Z} \right\}, \quad H = L^2(a, b),$$
$$\left\{ \frac{1}{\sqrt{2\pi}},\ \frac{1}{\sqrt{\pi}}\sin(nt),\ \frac{1}{\sqrt{\pi}}\cos(nt) \;\middle|\; n \in \mathbb{N} \right\}, \quad H = L^2(a, a+2\pi),$$
$$\left\{ \frac{1}{\sqrt{b-a}},\ \sqrt{\frac{2}{b-a}}\sin\!\left(\frac{2\pi n t}{b-a}\right),\ \sqrt{\frac{2}{b-a}}\cos\!\left(\frac{2\pi n t}{b-a}\right) \;\middle|\; n \in \mathbb{N} \right\}, \quad H = L^2(a, b).$$
For any function $f \in L^1(0, 2\pi)$ one has an associated Fourier series
$$f \sim \sum_{n \in \mathbb{Z}} c_n e^{int}, \qquad c_n = \frac{1}{2\pi} \int_0^{2\pi} f(t)\, e^{-int}\, dt.$$
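The coefficient formula is easy to exercise numerically; the test function $f(t) = 3 + \sin t$ below is an arbitrary choice (a sketch, not from the text), for which $c_0 = 3$, $c_{\pm 1} = \mp i/2$ and all other $c_n$ vanish:

```python
import numpy as np

t = np.linspace(0, 2 * np.pi, 20001)
f = 3 + np.sin(t)

def c(n):
    # c_n = (1/2pi) * integral_0^{2pi} f(t) e^{-int} dt, by trapezoidal quadrature
    return np.trapz(f * np.exp(-1j * n * t), t) / (2 * np.pi)
```

The trapezoidal rule is spectrally accurate here because the integrand is smooth and periodic.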
Lemma 17.14 Each of the following two sets forms a CNOS in $H = L^2(0, \pi)$ (on the half interval):
$$\left\{ \sqrt{\tfrac{2}{\pi}}\, \sin(nt) \;\middle|\; n \in \mathbb{N} \right\}, \qquad \left\{ \sqrt{\tfrac{2}{\pi}}\, \cos(nt) \;\middle|\; n \in \mathbb{N}_0 \right\}.$$
Proof. To check that they form an NOS is left to the reader. We show completeness of the first set. Let $f \in L^2(0, \pi)$. Extend $f$ to an odd function $\tilde f \in L^2(-\pi, \pi)$, that is, $\tilde f(x) = f(x)$ and $\tilde f(-x) = -f(x)$ for $x \in (0, \pi)$. Since $\tilde f$ is an odd function, in its Fourier series
$$\frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos(nt) + b_n \sin(nt) \right)$$
all cosine coefficients $a_n$ vanish; hence $\tilde f$, and with it $f$, lies in the closed linear span of the sine system.

$$u(x + 2\pi, t) = u(x, t), \quad x \in \mathbb{R},\ t \in \mathbb{R}_+. \tag{PBC}$$
The initial temperature distribution at time $t = 0$ is given, so that the BIVP reads
$$u_t - a^2 u_{xx} = 0, \quad x \in \mathbb{R},\ t > 0, \qquad u(x,0) = u_0(x), \quad x \in \mathbb{R}, \qquad \text{(PBC)}. \tag{17.20}$$
Separation of variables, $u(x,t) = f(x)g(t)$, leads to
$$\frac{1}{a^2}\frac{g'(t)}{g(t)} = \frac{f''(x)}{f(x)} = \lambda = \text{const.},$$
hence
$$f''(x) - \lambda f(x) = 0, \qquad g'(t) - a^2 \lambda\, g(t) = 0.$$
The periodicity condition (PBC) gives
$f(x)g(t) = f(x + 2\pi)g(t)$, which for nontrivial $g$ gives $f(0) = f(2\pi)$ and $f'(0) = f'(2\pi)$. In case $\lambda = 0$, $f''(x) = 0$ has the general solution $f(x) = ax + b$. The only periodic solution is $f(x) = b = \text{const}$. Suppose now that $\lambda = -\mu^2 < 0$. Then the general solution of the second order ODE is
$$f(x) = c_1 \cos(\mu x) + c_2 \sin(\mu x).$$
Since $f$ is periodic with period $2\pi$, only a discrete set of values $\mu$ is possible, namely $\mu_n = n$, $n \in \mathbb{Z}$. This implies $\lambda_n = -n^2$, $n \in \mathbb{N}$. Finally, in case $\lambda = \mu^2 > 0$, the general solution
$$f(x) = c_1 e^{\mu x} + c_2 e^{-\mu x}$$
provides no periodic solutions $f$. So far, we have obtained a set of solutions
$$f_n(x) = a_n \cos(nx) + b_n \sin(nx), \quad n \in \mathbb{N}_0,$$
$$u_n(x,t) = e^{-a^2 n^2 t} \left( a_n \cos(nx) + b_n \sin(nx) \right), \qquad u_0(x,t) = \frac{a_0}{2},$$
and by superposition
$$u(x,t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} e^{-a^2 n^2 t} \left( a_n \cos(nx) + b_n \sin(nx) \right), \tag{17.21}$$
$$u(x,0) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos(nx) + b_n \sin(nx) \right), \tag{17.22}$$
which gives the ordinary Fourier series of $u_0(x)$. That is, the Fourier coefficients $a_n$ and $b_n$ of the initial function $u_0(x)$ formally give a solution $u(x,t)$.
1. If the Fourier series of $u_0$ converges pointwise to $u_0$, the initial conditions are satisfied by the function $u(x,t)$ given in (17.21).
2. If the Fourier series of $u_0$ is twice differentiable (with respect to $x$), so is the function $u(x,t)$ given by (17.21).
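The recipe — Fourier-analyze $u_0$, damp mode $n$ by $e^{-a^2 n^2 t}$, resum — can be sketched in a few lines. The choice $u_0(x) = \cos x$ (so that $u(x,t) = e^{-a^2 t}\cos x$ exactly) is an assumption made for the comparison:

```python
import numpy as np

a = 1.0
x = np.linspace(0, 2 * np.pi, 4001)
u0 = np.cos(x)                      # a_1 = 1, all other coefficients vanish

def u(xx, t, N=10):
    # truncated series (17.21) with numerically computed Fourier coefficients
    s = np.trapz(u0, x) / (2 * np.pi)            # a_0 / 2
    for n in range(1, N + 1):
        an = np.trapz(u0 * np.cos(n * x), x) / np.pi
        bn = np.trapz(u0 * np.sin(n * x), x) / np.pi
        s += np.exp(-a**2 * n**2 * t) * (an * np.cos(n * xx) + bn * np.sin(n * xx))
    return s

err = abs(u(1.0, 0.3) - np.exp(-0.3) * np.cos(1.0))
```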
Lemma 17.15 Consider the BIVP (17.20).
(a) Existence. Suppose that $u_0 \in C^4(\mathbb{R})$ is periodic. Then the function $u(x,t)$ given by (17.21) is in $C^{2,1}_{x,t}([0, 2\pi] \times \mathbb{R}_+)$ and solves the classical BIVP (17.20).
(b) Uniqueness and Stability. In the class of functions $C^{2,1}_{x,t}([0, 2\pi] \times \mathbb{R}_+)$ the solution of the above BIVP is unique.
Proof. (a) The Fourier coefficients of $u_0^{(4)}$ are bounded, so that the Fourier coefficients of $u_0$ decay like $1/n^4$ (integrate the Fourier series of $u_0^{(4)}$ four times). Then the series for $u_{xx}(x,t)$ and $u_t(x,t)$ are both dominated by the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$; hence they converge uniformly. This shows that the series $u(x,t)$ can be differentiated term by term twice w.r.t. $x$ and once w.r.t. $t$.
(b) For any fixed $t \ge 0$, $u(x,t)$ is continuous in $x$. Consider $v(t) := \|u(\cdot, t)\|^2_{L^2(0,2\pi)}$. Then
$$v'(t) = \frac{d}{dt} \int_0^{2\pi} u(x,t)^2\, dx = 2 \int_0^{2\pi} u(x,t)\, u_t(x,t)\, dx = 2 \int_0^{2\pi} u(x,t)\, a^2 u_{xx}(x,t)\, dx$$
$$= 2a^2\, u u_x \Big|_0^{2\pi} - 2a^2 \int_0^{2\pi} \left( u_x(x,t) \right)^2 dx = -2a^2 \int_0^{2\pi} u_x^2\, dx \le 0,$$
so that $v$ is non-increasing on $\mathbb{R}_+$.
This shows that small changes in the initial conditions u0 imply small changes in the solution
u(x, t). The problem is well-posed.
$$u_t - a^2 u_{xx} = f(x,t), \qquad u(x,0) = 0, \qquad \text{(PBC)}. \tag{17.23}$$
Solution. Let $e_n(x) = e^{inx}/\sqrt{2\pi}$, $n \in \mathbb{Z}$, be the CNOS in $L^2(0, 2\pi)$. These functions are all eigenfunctions with respect to the differential operator $\frac{d^2}{dx^2}$, $e_n''(x) = -n^2 e_n(x)$. Let $t > 0$ be fixed and
$$f(x,t) \sim \sum_{n \in \mathbb{Z}} c_n(t)\, e_n(x)$$
be the Fourier series of $f(x,t)$ with coefficients $c_n(t)$. For $u$, we try the following ansatz:
$$u(x,t) \sim \sum_{n \in \mathbb{Z}} d_n(t)\, e_n(x). \tag{17.24}$$
Inserting (17.24) into the heat equation yields $d_n'(t) + a^2 n^2 d_n(t) = c_n(t)$ with $d_n(0) = 0$, hence
$$d_n(t) = e^{-a^2 n^2 t} \int_0^t e^{a^2 n^2 s}\, c_n(s)\, ds.$$
Under certain regularity and growth conditions on f , (17.24) solves the inhomogeneous IBVP.
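As a small consistency check (not part of the text), for a constant coefficient $c_n(s) \equiv 1$ the Duhamel integral has the closed form $d_n(t) = (1 - e^{-a^2 n^2 t})/(a^2 n^2)$, which can be compared with direct quadrature; the values of $a$, $n$, $t$ are arbitrary:

```python
import numpy as np

a, n, t = 1.0, 3, 0.7
s = np.linspace(0.0, t, 20001)

# d_n(t) = e^{-a^2 n^2 t} * integral_0^t e^{a^2 n^2 s} c_n(s) ds, with c_n = 1
dn = np.exp(-a**2 * n**2 * t) * np.trapz(np.exp(a**2 * n**2 * s), s)
closed = (1 - np.exp(-a**2 * n**2 * t)) / (a**2 * n**2)
```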
(c) The Homogeneous Wave Equation with Dirichlet Conditions
Consider the initial boundary value problem of the vibrating string of length $\pi$:
$$\text{(E)} \quad u_{tt} - a^2 u_{xx} = 0, \quad 0 < x < \pi,\ t > 0;$$
$$\text{(BC)} \quad u(0,t) = u(\pi, t) = 0;$$
$$\text{(IC)} \quad u(x,0) = \varphi(x), \quad u_t(x,0) = \psi(x), \quad 0 < x < \pi.$$
Separation of variables $u = f(x)g(t)$ leads to
$$f''(x) = \lambda f(x), \qquad g'' = a^2 \lambda\, g.$$
The boundary conditions imply $f(0) = f(\pi) = 0$. Hence, the first ODE has the only solutions
$$f_n(x) = c_n \sin(nx), \qquad \lambda_n = -n^2, \quad n \in \mathbb{N}.$$
$$u(x,t) = \sum_{n=1}^{\infty} \left( a_n \cos(nat) + b_n \sin(nat) \right) \sin(nx)$$
solves the boundary value problem in the sense of $\mathcal{D}'(\mathbb{R}^2)$ (choose any $a_n, b_n$ of polynomial growth). Now insert the initial conditions at $t = 0$:
$$u(x,0) = \sum_{n=1}^{\infty} a_n \sin(nx) = \varphi(x), \qquad u_t(x,0) = \sum_{n=1}^{\infty} n a\, b_n \sin(nx) = \psi(x).$$
The series
$$u(x,t) = \sum_{n=1}^{\infty} \left( a_n \cos(nat) + b_n \sin(nat) \right) \sin(nx) \tag{17.25}$$
can be differentiated twice with respect to $x$ or $t$, since the differentiated series have a summable upper bound $c/n^2$. Hence, (17.25) solves the IBVP.
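For a concrete instance (an assumed example, not from the text) take $\varphi(x) = \sin 2x$ and $\psi = 0$, so that $a_2 = 1$ and all other coefficients vanish; the series collapses to $u(x,t) = \cos(2at)\sin(2x)$, which also satisfies d'Alembert's identity $u = \tfrac12(\varphi(x-at) + \varphi(x+at))$ and the Dirichlet boundary conditions:

```python
import numpy as np

a = 1.0
u = lambda x, t: np.cos(2 * a * t) * np.sin(2 * x)   # series (17.25) for phi = sin 2x

x0, t0 = 0.4, 1.3
series = u(x0, t0)
# d'Alembert form of the same solution
dalembert = 0.5 * (np.sin(2 * (x0 - a * t0)) + np.sin(2 * (x0 + a * t0)))
boundary = (u(0.0, t0), u(np.pi, t0))                # Dirichlet conditions (BC)
```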
(d) The Wave Equation with Inhomogeneous Boundary Conditions
Consider the following problem in $\Omega \subseteq \mathbb{R}^n$:
$$u_{tt} - a^2 \Delta u = 0, \qquad u(x,0) = u_t(x,0) = 0, \qquad u\,|_{\partial\Omega} = w(x,t).$$
Idea. Find an extension $v(x,t)$ of $w(x,t)$, $v \in C^2(\Omega \times \mathbb{R}_+)$, and look for the function $\tilde u = u - v$. Then $\tilde u$ has homogeneous boundary conditions and satisfies the IBVP
$$\tilde u_{tt} - a^2 \Delta \tilde u = -v_{tt} + a^2 \Delta v, \qquad \tilde u(x,0) = -v(x,0), \quad \tilde u_t(x,0) = -v_t(x,0), \qquad \tilde u\,|_{\partial\Omega} = 0.$$
This problem can be split into two problems, one with zero initial conditions and one with
homogeneous wave equation.
$$\int_0^l \left( u'' v - u v'' \right) dx = \left( u' v - u v' \right)\Big|_0^l = 0.$$
The eigenvalue problem
$$\Delta u = \lambda u, \quad u \in C^2(\Omega) \cap C^1(\overline\Omega), \qquad u\,|_{\partial\Omega} = 0, \tag{17.26}$$
has countably many eigenvalues $\lambda_k$. All eigenvalues are negative and of finite multiplicity. Let $0 > \lambda_1 > \lambda_2 > \cdots$; then the sequence $(\frac{1}{\lambda_k})$ tends to $0$. The eigenfunctions $u_k$ corresponding to $\lambda_k$ form a CNOS in $L^2(\Omega)$.
Sketch of proof. (a) Let $H = L^2(\Omega)$. We use Green's 1st formula with $u = v$, $u\,|_{\partial\Omega} = 0$,
$$\int_\Omega u\, \Delta u\, dx + \int_\Omega (\nabla u)^2\, dx = 0,$$
to show that all eigenvalues of $\Delta$ are negative. Let $\Delta u = \lambda u$. First note that $\lambda = 0$ is not an eigenvalue of $\Delta$. Suppose to the contrary $\Delta u = 0$, that is, $u$ is harmonic. Since $u\,|_{\partial\Omega} = 0$, by the uniqueness theorem for the Dirichlet problem, $u = 0$ in $\Omega$. Then
$$\lambda \|u\|^2 = \langle \lambda u, u \rangle = \langle \Delta u, u \rangle = \int_\Omega \Delta u \cdot u\, dx = -\int_\Omega (\nabla u)^2\, dx < 0.$$
Hence, $\lambda$ is negative.
(b) Assume that a Green's function $G$ for $\Omega$ exists. By (17.34), that is,
$$u(y) = \int_\Omega G(x,y)\, \Delta u(x)\, dx + \int_{\partial\Omega} u(x)\, \frac{\partial G(x,y)}{\partial \vec n_x}\, dS(x),$$
$u\,|_{\partial\Omega} = 0$ implies
$$u(y) = \int_\Omega G(x,y)\, \Delta u(x)\, dx,$$
that is, the integral operator $A$ with kernel $G$ is inverse to the Laplacian. Since $G(x,y) = G(y,x)$ is real, $A$ is self-adjoint. By (a), its eigenvalues $1/\lambda_k$ are all negative. If
$$\iint_{\Omega \times \Omega} |G(x,y)|^2\, dx\, dy < \infty,$$
$A$ is a compact operator.
We want to justify the last statement. Let $(Kf)(x) = \int_\Omega k(x,y) f(y)\, dy$ be an integral operator on $H = L^2(\Omega)$ with kernel $k(x,y) \in \tilde H = L^2(\Omega \times \Omega)$. Let $\{u_n \mid n \in \mathbb{N}\}$ be a CNOS in $H$; then $\{u_n(x) u_m(y) \mid n, m \in \mathbb{N}\}$ is a CNOS in $\tilde H$. Let $k_{nm}$ be the Fourier coefficients of $k$ with respect to the basis $\{u_n(x) u_m(y)\}$ in $\tilde H$. Then
$$(Kf)(x) = \int_\Omega f(y) \sum_{n,m} k_{nm}\, u_n(x)\, u_m(y)\, dy = \sum_n u_n(x) \sum_m k_{nm} \int_\Omega u_m(y) f(y)\, dy = \sum_{n,m} k_{nm}\, \langle f, u_m \rangle\, u_n.$$
By the Cauchy–Schwarz and Bessel inequalities,
$$\|Kf\|^2 = \sum_n \Big| \sum_m k_{nm} \langle f, u_m \rangle \Big|^2 \le \sum_{m,n} k_{nm}^2\, \|f\|^2, \qquad \text{so that} \quad \|K\|^2 \le \sum_{m,n} k_{nm}^2 = \sum_n \|K u_n\|^2.$$
Define the finite rank operators
$$K_n f = \sum_m \sum_{r=1}^{n} k_{rm}\, \langle f, u_m \rangle\, u_r.$$
Then
$$\|(K - K_n) f\|^2 \le \sum_m \sum_{r=n+1}^{\infty} k_{rm}^2\; \|f\|^2, \qquad \text{such that} \qquad \|K - K_n\|^2 \le \sum_m \sum_{r=n+1}^{\infty} k_{rm}^2 \longrightarrow 0$$
as $n \to \infty$. Hence, $K$ is a norm limit of finite rank operators, and so $K$ is compact.
(c) By (a) and (b), $A$ is a negative, compact, self-adjoint operator. By the spectral theorem for compact self-adjoint operators, Theorem 13.33, there exists an NOS $(u_k)$ of eigenfunctions to the eigenvalues $1/\lambda_k$ of $A$. The NOS $(u_k)$ is complete since $0$ is not an eigenvalue of $A$.
Example 17.1 Dirichlet Conditions on the Square. Let $Q = (0, \pi) \times (0, \pi) \subseteq \mathbb{R}^2$. The Laplace operator with Dirichlet boundary conditions on $\partial Q$ has eigenfunctions
$$u_{mn}(x,y) = \frac{2}{\pi} \sin(mx) \sin(ny), \qquad \lambda_{mn} = -(m^2 + n^2), \quad m, n \in \mathbb{N}.$$
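One can verify $\Delta u_{mn} = -(m^2 + n^2)\, u_{mn}$ with a central-difference Laplacian at an arbitrary interior point (a numerical sketch; the values of $m$, $n$ and the evaluation point are assumptions):

```python
import numpy as np

m, n = 3, 2
u = lambda x, y: (2 / np.pi) * np.sin(m * x) * np.sin(n * y)

x0, y0, h = 1.0, 0.7, 1e-4
# five-point central-difference approximation of the Laplacian
lap = (u(x0 + h, y0) + u(x0 - h, y0) + u(x0, y0 + h) + u(x0, y0 - h)
       - 4 * u(x0, y0)) / h**2
```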
Next consider the Dirichlet eigenvalue problem on the unit disc, $u\,|_{S_1(0)} = 0$. In polar coordinates, with $u = R(r)\Phi(\varphi)$,
$$\frac{1}{r}\frac{\partial}{\partial r}\left( r\, u_r \right) = \frac{1}{r}\left( r R' \right)' \Phi = \left( \frac{R'}{r} + R'' \right)\Phi, \qquad u_{\varphi\varphi} = R\, \Phi''.$$
Hence, $\Delta u = \lambda u$ now reads
$$R''\Phi + \frac{R'}{r}\Phi + \frac{R\,\Phi''}{r^2} = \lambda R\Phi \quad\Longleftrightarrow\quad \frac{r^2 R'' + r R'}{R} - \lambda r^2 = -\frac{\Phi''}{\Phi} = \mu.$$
This gives the two problems
$$\Phi'' + \mu\Phi = 0, \quad \Phi(0) = \Phi(2\pi); \qquad r^2 R'' + r R' + \left( -\lambda r^2 - \mu \right) R = 0, \quad |R(0)| < \infty,\ R(1) = 0. \tag{17.27}$$
$$\Phi_k(\varphi) = e^{ik\varphi}, \qquad \mu = k^2, \quad k \in \mathbb{Z}.$$
Equation (17.27) is the Bessel ODE. For $\mu = k^2$ the solution of (17.27) bounded at $r = 0$ is given by the Bessel function $J_k(r\sqrt{-\lambda})$. Recall from homework 21.2 that
$$J_k(x) = \sum_{n=0}^{\infty} \frac{(-1)^n \left( \frac{x}{2} \right)^{2n+k}}{n!\,(n+k)!}, \qquad k \in \mathbb{N}_0.$$
To determine the eigenvalues $\lambda$ we use the boundary condition $R(1) = 0$ in (17.27), namely $J_k(\sqrt{-\lambda}) = 0$; that is, $\sqrt{-\lambda}$ runs through the positive zeros of $J_k$, $k \in \mathbb{Z}$, $j = 1, 2, \dots$
Note that the Bessel functions $\{J_k \mid k \in \mathbb{Z}_+\}$ and the system $\{e^{ikt} \mid k \in \mathbb{Z}\}$ form a complete OS in $L^2((0,1), r\,dr)$ and in $L^2(0, 2\pi)$, respectively. Hence, the OS $\{u_{kl} \mid k \in \mathbb{Z},\ l \in \mathbb{Z}_+\}$ is a complete OS in $L^2(U_1(0))$. Thus, there are no further solutions to the given BEVP. For more details on Bessel functions, see [FK98, p. 383].
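The eigenvalues can be computed from the series above; for instance, the smallest one is $\lambda = -\mu_{01}^2$, where $\mu_{01} \approx 2.4048$ is the first positive zero of $J_0$, easily found by bisection (a sketch; the bracket $[2,3]$ is an assumption based on $J_0(2) > 0 > J_0(3)$):

```python
import math

def J(k, x, terms=40):
    # partial sum of the series for J_k given in the text
    return sum((-1)**n * (x / 2)**(2 * n + k)
               / (math.factorial(n) * math.factorial(n + k))
               for n in range(terms))

lo, hi = 2.0, 3.0          # J_0 changes sign on [2, 3]
for _ in range(60):
    mid = (lo + hi) / 2
    if J(0, mid) > 0:
        lo = mid
    else:
        hi = mid
mu01 = (lo + hi) / 2       # first positive zero of J_0
lam1 = -mu01**2            # smallest Dirichlet eigenvalue of the unit disc
```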
17.4 Boundary Value Problems for the Laplace and the Poisson Equations
Throughout this section (if nothing is stated otherwise) we will assume that $\Omega$ is a bounded region in $\mathbb{R}^n$, $n \ge 2$. We suppose further that $\partial\Omega$ belongs to the class $C^2$, that is, the boundary consists of finitely many twice continuously differentiable hypersurfaces; $\Omega' := \mathbb{R}^n \setminus \overline\Omega$ is assumed to be connected (i.e. it is a region, too). All functions are assumed to be real valued.
(a) The Interior Dirichlet Problem: $\Delta u(x) = f(x)$, $x \in \Omega$, and $u(y) = \varphi(y)$, $y \in \partial\Omega$.
(b) The Exterior Dirichlet Problem: $\Delta u(x) = f(x)$, $x \in \Omega'$, and $u(y) = \varphi(y)$, $y \in \partial\Omega$, together with
$$\lim_{|x| \to \infty} u(x) = 0.$$
(c) The Interior Neumann Problem: here
$$\frac{\partial u}{\partial \vec n_-}(y) := \lim_{t \to 0+} \vec n(y) \cdot \nabla u(x)\Big|_{x = y - t\vec n},$$
and $\vec n(y)$ is the outer normal to $\partial\Omega$ at $y \in \partial\Omega$. That is, $x$ approaches $y$ in the direction of the normal vector $\vec n(y)$. We assume that this limit exists for all boundary points $y \in \partial\Omega$.
(d) The Exterior Neumann Problem:
$$\Delta u(x) = f(x), \quad x \in \Omega', \qquad \frac{\partial u}{\partial \vec n_+}(y) = \psi(y), \quad y \in \partial\Omega, \qquad \text{and} \quad \lim_{|x| \to \infty} u(x) = 0.$$
Here
$$\frac{\partial u}{\partial \vec n_+}(y) := \lim_{t \to 0+} \vec n(y) \cdot \nabla u(y + t\vec n(y)),$$
and $\vec n(y)$ is the outer normal to $\partial\Omega$ at $y \in \partial\Omega$. We assume that this limit exists for all boundary points $y \in \partial\Omega$. In both Neumann problems one can also look for a function $u \in C^2(\Omega) \cap C(\overline\Omega)$.
where the dot denotes the inner product in $\mathbb{R}^n$. The term under the integral can be written as
where the hat means omission of this factor. In this way $\omega(y)$ becomes a differential $(n-1)$-form. Using differentiation of forms, see Definition 11.7, we obtain
$$d\omega = \operatorname{div} f(y)\; dy_1 \wedge dy_2 \wedge \cdots \wedge dy_n.$$
This establishes the above generalized form of Gauß' divergence theorem. Let $U \colon \partial\Omega \to \mathbb{R}$ be a continuous scalar function on $\partial\Omega$; one can define $U(y)\, d\vec S(y) := U(y)\, \vec n(y)\, dS(y)$, where $\vec n$ is the outer unit normal vector to the surface $\partial\Omega$.
Recall that we obtain Green's first formula inserting $f(x) = v(x)\,\nabla u(x)$, $u, v \in C^2(\overline\Omega)$:
$$\int_\Omega v(x)\, \Delta u(x)\, dx + \int_\Omega \nabla u(x) \cdot \nabla v(x)\, dx = \int_{\partial\Omega} v(y)\, \frac{\partial u}{\partial \vec n}(y)\, dS(y).$$
Interchanging the roles of $u$ and $v$ and taking the difference, we obtain Green's second formula
$$\int_\Omega \left( v(x)\,\Delta u(x) - u(x)\,\Delta v(x) \right) dx = \int_{\partial\Omega} \left( v(y)\, \frac{\partial u}{\partial \vec n}(y) - u(y)\, \frac{\partial v}{\partial \vec n}(y) \right) dS(y). \tag{17.28}$$
Recall that
$$E_2(x) = \frac{1}{2\pi} \log \|x\|, \qquad E_n(x) = -\frac{1}{(n-2)\,\omega_n}\, \|x\|^{-n+2}, \quad n \ge 3. \tag{17.29}$$
Here $\frac{\partial}{\partial \vec n_y}$ denotes the derivative in the direction of the outer normal with respect to the variable $y$.
Applying Green's second formula to $u$ and $v(y) = E_n(x-y)$ on $\Omega \setminus U_\varepsilon(x)$ gives
$$\int_{\Omega \setminus U_\varepsilon(x)} E_n(x-y)\, \Delta u(y)\, dy = \int_{\partial\Omega\, \cup\, \partial U_\varepsilon(x)} \left( E_n(x-y)\, \frac{\partial u}{\partial \vec n}(y) - u(y)\, \frac{\partial E_n(x-y)}{\partial \vec n_y} \right) dS(y). \tag{17.30}$$
In the second integral $\vec n$ denotes the outer normal to $\Omega \setminus U_\varepsilon(x)$, hence the inner normal of $U_\varepsilon(x)$. We wish to evaluate the limits of the individual integrals in this formula as $\varepsilon \to 0$. Consider the left-hand side of (17.30). Since $u \in C^2(\Omega)$, $\Delta u$ is bounded; since $E_n(x-y)$ is locally integrable, the lhs converges to
$$\int_\Omega E_n(x-y)\, \Delta u(y)\, dy.$$
On $\partial U_\varepsilon(x)$ the first integrand is of size $\frac{\varepsilon^{-n+2}}{(n-2)\omega_n} \left| \frac{\partial u}{\partial \vec n} \right|$, so its integral over the sphere of area $\omega_n \varepsilon^{n-1}$ tends to $0$.
Furthermore, since $\vec n$ is the interior normal of the ball $U_\varepsilon(x)$, the same calculations as in the proof of Theorem 17.1 show that
$$\frac{\partial E_n(x-y)}{\partial \vec n_y} = -\frac{1}{(n-2)\,\omega_n}\, \frac{d}{d\varepsilon}\left( \varepsilon^{-n+2} \right) = \varepsilon^{-n+1}/\omega_n.$$
We obtain
$$\int_{\partial U_\varepsilon(x)} u(y)\, \frac{\partial E_n(x-y)}{\partial \vec n_y}\, dS(y) = \underbrace{\frac{1}{\omega_n\, \varepsilon^{n-1}} \int_{S_\varepsilon(x)} u(y)\, dS(y)}_{\text{spherical mean}} \longrightarrow u(x).$$
In the last line we used that the integral is the mean value of $u$ over the sphere $S_\varepsilon(x)$, and $u$ is continuous at $x$.
Remarks 17.4 (a) Green's representation formula is also true for functions $u \in C^2(\Omega) \cap C^1(\overline\Omega)$. To prove this, consider Green's representation theorem on smaller regions $\Omega_\varepsilon$ such that $\overline{\Omega_\varepsilon} \subseteq \Omega$.
(b) Applying Green's representation formula to a test function $\varphi \in \mathcal{D}(\Omega)$, see Definition 16.1, with $\varphi(y) = \frac{\partial\varphi}{\partial\vec n}(y) = 0$, $y \in \partial\Omega$, we obtain
$$\varphi(x) = \int_\Omega E_n(x-y)\, \Delta\varphi(y)\, dy.$$
(c) We may now draw the following consequence from Green's representation formula: If one knows $\Delta u$, then $u$ is completely determined by its values and those of its normal derivative on $\partial\Omega$. In particular, a harmonic function on $\Omega$ can be reconstructed from its boundary data. One may ask conversely whether one can construct a harmonic function for arbitrarily given values of $u$ and $\frac{\partial u}{\partial \vec n}$ on $\partial\Omega$. Ignoring regularity conditions, we will find out that this is not possible in general. Roughly speaking, only one of these data is sufficient to describe $u$ completely.
(d) In case of a harmonic function $u \in C^2(\Omega) \cap C^1(\overline\Omega)$, $\Delta u = 0$, Green's representation formula reads ($n = 3$):
$$u(x) = \frac{1}{4\pi} \int_{\partial\Omega} \left( \frac{1}{\|x-y\|}\, \frac{\partial u}{\partial \vec n}(y) - u(y)\, \frac{\partial}{\partial \vec n_y} \frac{1}{\|x-y\|} \right) dS(y). \tag{17.31}$$
In particular, the surface potentials $V^{(0)}(x)$ and $V^{(1)}(x)$ can be differentiated arbitrarily often for $x \notin \partial\Omega$. Outside $\partial\Omega$, $V^{(0)}$ and $V^{(1)}$ are harmonic. It follows from (17.31) that any harmonic function is a $C^\infty$-function.
$$\int_{\partial\Omega} \frac{\partial u}{\partial \vec n}(y)\, dS(y) = 0. \tag{17.32}$$
Indeed, this follows from Green's first formula inserting $v = 1$ and $u$ harmonic, $\Delta u = 0$.
Proposition 17.18 (Mean Value Property) Suppose that $u$ is harmonic in $U_R(x_0)$ and continuous in $\overline{U_R(x_0)}$.
(a) Then $u(x_0)$ coincides with its spherical mean over the sphere $S_R(x_0)$:
$$u(x_0) = \frac{1}{\omega_n R^{n-1}} \int_{S_R(x_0)} u(y)\, dS(y) \quad \text{(spherical mean)}. \tag{17.33}$$
(b) Further,
$$u(x_0) = \frac{n}{\omega_n R^n} \int_{U_R(x_0)} u(x)\, dx.$$
Proof. (a) For simplicity, we consider only the case $n = 3$ and $x_0 = 0$. Apply Green's representation formula (17.31) to the ball $\Omega = U_\varrho(0)$ with $\varrho < R$. Noting (17.32), from (17.31) it follows that
$$u(0) = \frac{1}{4\pi} \left( \int_{S_\varrho(0)} \frac{1}{\varrho}\, \frac{\partial u(y)}{\partial \vec n}\, dS - \int_{S_\varrho(0)} u(y)\, \frac{\partial}{\partial \vec n_y} \frac{1}{\|y\|}\, dS \right)$$
$$= -\frac{1}{4\pi} \int_{S_\varrho(0)} u(y)\, \frac{\partial}{\partial \vec n_y} \frac{1}{\|y\|}\, dS = \frac{1}{4\pi} \int_{S_\varrho(0)} u(y)\, \frac{1}{\varrho^2}\, dS = \frac{1}{4\pi \varrho^2} \int_{S_\varrho(0)} u(y)\, dS.$$
Since $u$ is continuous on the closed ball of radius $R$, the formula remains valid as $\varrho \to R$.
(b) Use $dx = dx_1 \cdots dx_n = dr\, dS_r$ where $\|x\| = r$. Multiply both sides of (17.33) by $\omega_n r^{n-1}\, dr$ and integrate with respect to $r$ from $0$ to $R$:
$$\int_0^R \omega_n r^{n-1} u(x_0)\, dr = \int_0^R \left( \int_{S_r(x_0)} u(y)\, dS \right) dr \quad\Longrightarrow\quad \frac{\omega_n}{n} R^n\, u(x_0) = \int_{U_R(x_0)} u(x)\, dx.$$
The assertion follows. Note that $R^n \omega_n / n$ is exactly the $n$-dimensional volume of $U_R(x_0)$. The proof in case $n = 2$ is similar.
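The spherical mean value property is easy to confirm numerically; $u(x,y) = x^2 - y^2$ is harmonic in the plane, and the centre, radius and grid below are arbitrary choices (a sketch, not from the text):

```python
import numpy as np

u = lambda x, y: x**2 - y**2          # harmonic in R^2
x0, y0, R = 0.3, -0.2, 2.0

t = np.linspace(0, 2 * np.pi, 20001)
# circle mean: (1/(2 pi R)) * integral over S_R(x0) of u dS, with dS = R dt
mean = np.trapz(u(x0 + R * np.cos(t), y0 + R * np.sin(t)), t) / (2 * np.pi)
```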
i.e., $u$ attains its maximum on the boundary $\partial\Omega$. The same is true for the minimum.
Proof. Suppose to the contrary that $M = u(x_0) = \max_{x \in \overline\Omega} u(x)$ is attained at an inner point $x_0$
for all $y \in \Omega$.
Corollary 17.20 (Uniqueness) The inner and the outer Dirichlet problem each have at most one solution.
Proof. Suppose that $u_1$ and $u_2$ both are solutions of the Dirichlet problem, $\Delta u_1 = \Delta u_2 = f$. Put $u = u_1 - u_2$. Then $\Delta u(x) = 0$ for all $x \in \Omega$ and $u(y) = 0$ on the boundary, $y \in \partial\Omega$.
(a) Inner problem. By the maximum principle, $u(x) = 0$ for all $x \in \Omega$; that is, $u_1 = u_2$.
(b) Outer problem. Suppose that $u \not\equiv 0$. Without loss of generality we may assume that $u(x_1) = \delta > 0$ for some $x_1 \in \Omega'$. By assumption, $|u(x)| \to 0$ as $\|x\| \to \infty$. Hence, there exists $r > 0$ such that $|u(x)| < \delta/2$ for all $\|x\| \ge r$. Since $u$ is harmonic in $B_r(0) \setminus \overline\Omega$, the maximum principle yields
[Figure: the region $B_r(0) \setminus \overline\Omega$.]
$$\max_{x \in S_r(0)\, \cup\, \partial\Omega} u(x) \le \delta/2 < \delta = u(x_1),$$
a contradiction.
Corollary 17.21 (Stability) Suppose that $u_1$ and $u_2$ are solutions of the inner Dirichlet problem $\Delta u_1 = \Delta u_2 = f$ with boundary values $\varphi_1(y)$ and $\varphi_2(y)$ on $\partial\Omega$, respectively. Suppose further that
$$|\varphi_1(y) - \varphi_2(y)| \le \varepsilon, \quad y \in \partial\Omega.$$
Then $|u_1(x) - u_2(x)| \le \varepsilon$ for all $x \in \Omega$.
A similar statement is true for the exterior Dirichlet problem.
Proof. Put $u = u_1 - u_2$. Then $\Delta u = 0$ and $|u(y)| \le \varepsilon$ for all $y \in \partial\Omega$. By the Maximum Principle, $|u(x)| \le \varepsilon$ for all $x \in \Omega$.
Lemma 17.22 Suppose that $u$ is a non-constant harmonic function on $\overline\Omega$ and the maximum of $u(x)$ is attained at $y \in \partial\Omega$. Then $\frac{\partial u}{\partial \vec n}(y) > 0$.
For the proof see [Tri92, 3.4.2. Theorem, p. 174].
Proposition 17.23 (Uniqueness) (a) The exterior Neumann problem has at most one solution.
(b) A necessary condition for solvability of the inner Neumann problem is
$$\int_{\partial\Omega} \psi\, dS = \int_\Omega f(x)\, dx.$$
Proposition 17.24 (Converse Mean Value Theorem) Suppose that $u \in C(\Omega)$ and that whenever $x_0 \in \Omega$ is such that $\overline{U_r(x_0)} \subseteq \Omega$, we have the mean value property
$$u(x_0) = \frac{1}{\omega_n r^{n-1}} \int_{S_r(x_0)} u(y)\, dS(y) = \frac{1}{\omega_n} \int_{S_1(0)} u(x_0 + ry)\, dS(y).$$
Then $u \in C^\infty(\Omega)$ and $u$ is harmonic in $\Omega$.
Proof. (a) We show that $u \in C^\infty(\Omega)$. The Mean Value Property ensures that the mollification $\omega_h * u$ equals $u$ as long as $\overline{U_{1/h}(x_0)} \subseteq \Omega$; that is, the mollification does not change $u$. By
the radial symmetry of the mollifier, $\omega_h(y) = g(\|y\|)$, and the mean value property,
$$(\omega_h * u)(x) = \int_{U_1(0)} \omega_h(y)\, u(x - y)\, dy = \int_0^1 g(r)\, r^{n-1} \int_{S_1(0)} u(x - rz)\, dS(z)\, dr$$
$$= \omega_n\, u(x) \int_0^1 g(r)\, r^{n-1}\, dr = u(x) \int_{\mathbb{R}^n} \omega_h(y)\, dy = u(x).$$
Second Part. Differentiating the mean value property with respect to $r$ yields $\int_{U_r(x)} \Delta u(y)\, dy = 0$ for any ball in $\Omega$, since the left-hand side $u(x)$ does not depend on $r$:
$$0 = \frac{d}{dr} \int_{S_1(0)} u(x + ry)\, dS(y) = \int_{S_1(0)} y \cdot \nabla u(x + ry)\, dS(y)$$
$$= \int_{S_r(0)} (r^{-1} z) \cdot \nabla u(x + z)\, r^{1-n}\, dS(z) = r^{-n} \int_{S_r(0)} \vec n(z) \cdot \nabla u(x + z)\, dS(z)$$
$$= r^{-n} \int_{S_r(0)} \frac{\partial u}{\partial \vec n}(x + z)\, dS(z) = r^{-n} \int_{S_r(x)} \frac{\partial u(y)}{\partial \vec n}\, dS(y) = r^{-n} \int_{U_r(x)} \Delta u(y)\, dy.$$
In the last line we used Green's 2nd formula with $v = 1$. Thus $\Delta u = 0$. Suppose to the contrary that $\Delta u(x_0) \ne 0$, say $\Delta u(x_0) > 0$. By continuity of $\Delta u(x)$, $\Delta u(x) > 0$ for $x \in U_\varepsilon(x_0)$. Hence $\int_{U_\varepsilon(x_0)} \Delta u(x)\, dx > 0$, which contradicts the above equation. We conclude that $u$ is harmonic in $\Omega$.
$u\,|_{\partial\Omega} = 0$.
Use separation of variables, $u(x,y) = X(x)Y(y)$, to solve the problem.
17.5 Appendix
17.5.1 Existence of Solutions to the Boundary Value Problems
(a) Greens Function
Let $u \in C^2(\Omega) \cap C^1(\overline\Omega)$. Let us combine Green's representation formula and Green's 2nd formula with a harmonic function $v(x) = v_y(x)$, $x \in \Omega$, where $y \in \Omega$ is thought of as a parameter:
$$u(y) = \int_\Omega E_n(x-y)\, \Delta u(x)\, dx + \int_{\partial\Omega} \left( u(x)\, \frac{\partial E_n}{\partial \vec n_x}(x-y) - E_n(x-y)\, \frac{\partial u}{\partial \vec n}(x) \right) dS(x),$$
$$0 = \int_\Omega v_y(x)\, \Delta u(x)\, dx + \int_{\partial\Omega} \left( u(x)\, \frac{\partial v_y}{\partial \vec n}(x) - v_y(x)\, \frac{\partial u}{\partial \vec n}(x) \right) dS(x).$$
Adding both equations and writing $G(x,y) := E_n(x-y) + v_y(x)$ gives
$$u(y) = \int_\Omega G(x,y)\, \Delta u(x)\, dx + \int_{\partial\Omega} \left( u(x)\, \frac{\partial G}{\partial \vec n_x}(x,y) - G(x,y)\, \frac{\partial u}{\partial \vec n}(x) \right) dS(x).$$
Suppose now that $G(x,y)$ vanishes for all $x \in \partial\Omega$; then the last surface integral is $0$ and
$$u(y) = \int_\Omega G(x,y)\, \Delta u(x)\, dx + \int_{\partial\Omega} u(x)\, \frac{\partial G(x,y)}{\partial \vec n_x}\, dS(x). \tag{17.34}$$
In the above formula, $u$ is completely determined by its boundary values and $\Delta u$ in $\Omega$. This motivates the following definition.
Definition 17.4 A function $G \colon \overline\Omega \times \Omega \to \mathbb{R}$ satisfying
(a) $G(x,y) = 0$ for all $x \in \partial\Omega$, $y \in \Omega$, $x \ne y$;
(b) $v_y(x) = G(x,y) - E_n(x-y)$ is harmonic in $x$ for all $y \in \Omega$;
is called a Green's function of $\Omega$. More precisely, $G(x,y)$ is a Green's function to the inner Dirichlet problem on $\Omega$.
Remarks 17.7 (a) The function $v_y(x)$ is in particular harmonic at $x = y$. Since $E_n(x-y)$ has a pole at $x = y$, $G(x,y)$ has a pole of the same order at $x = y$, such that $G(x,y) - E_n(x-y)$ has no singularity.
If such a function $G(x,y)$ exists, then (17.34) holds for all $u \in C^2(\overline\Omega)$.
For harmonic $u$, (17.34) reduces to
$$u(y) = \int_{\partial\Omega} u(x)\, \frac{\partial G(x,y)}{\partial \vec n_x}\, dS(x). \tag{17.35}$$
This is the so-called Poisson formula for $\Omega$. In general, it is difficult to find Green's function. For most regions it is even impossible to give $G(x,y)$ explicitly. However, if $\Omega$ has some kind of symmetry, one can use the reflection principle to construct $G(x,y)$ explicitly. Nevertheless, $G(x,y)$ exists for all well-behaved $\Omega$ (the boundary is a $C^2$-set and Gauß' divergence theorem holds for $\Omega$).
$$y^* := \begin{cases} \dfrac{R^2}{\|y\|^2}\, y, & y \ne 0, \\[4pt] \infty, & y = 0. \end{cases}$$
Note that this map has the property $\|y\|\,\|y^*\| = R^2$ and $\|y\|^2\, y^* = R^2 y$. Points on the sphere $S_R(0)$ are fixed under this map, $y^* = y$. Let $\bar E_n \colon \mathbb{R}_+ \to \mathbb{R}$ denote the radial scalar function corresponding to $E_n$, with $E_n(x) = \bar E_n(\|x\|)$, that is, $\bar E_n(r) = -1/((n-2)\,\omega_n\, r^{n-2})$, $n \ge 3$. Then we put
$$G(x,y) = \begin{cases} \bar E_n(\|x-y\|) - \bar E_n\!\left( \dfrac{\|y\|}{R}\, \|x - y^*\| \right), & y \ne 0, \\[6pt] \bar E_n(\|x\|) - \bar E_n(R), & y = 0. \end{cases} \tag{17.36}$$
For $x \ne y$, $G(x,y)$ is harmonic in $x$, since for $\|y\| < R$ we have $\|y^*\| > R$ and therefore $x - y^* \ne 0$. The function $G(x,y)$ has only one singularity in $U_R(0)$, namely at $x = y$, and this is the same as that of $E_n(x-y)$. Therefore,
$$v_y(x) = G(x,y) - E_n(x-y) = \begin{cases} -\bar E_n\!\left( \dfrac{\|y\|}{R}\, \|x - y^*\| \right), & y \ne 0, \\[6pt] -\bar E_n(R), & y = 0, \end{cases}$$
is harmonic in $U_R(0)$.
We check that $G(x,y) = 0$ for $\|x\| = R$. For $y \ne 0$, using $\|y\|^2 y^* = R^2 y$,
$$\frac{\|y\|}{R}\, \|x - y^*\| = \left( \frac{\|y\|^2}{R^2}\, \|x\|^2 - 2\, x \cdot y + R^2 \right)^{\frac12} = \left( R^2 + \|y\|^2 - 2\, x \cdot y \right)^{\frac12} = \|x - y\|,$$
hence
$$\bar E_n\!\left( \left( R^2 + \|y\|^2 - 2\, x \cdot y \right)^{\frac12} \right) - \bar E_n\!\left( \left( \|x\|^2 + \|y\|^2 - 2\, x \cdot y \right)^{\frac12} \right) = 0.$$
For $y = 0$ we have $G(x, 0) = \bar E_n(R) - \bar E_n(R) = 0$. Moreover, for $\|x\| = R$,
$$(x - y) \cdot x - \frac{\|y\|^2}{R^2}\, (x - y^*) \cdot x = R^2 - \|y\|^2.$$
Hence, for $y \ne 0$ and $\|x\| = R$, with $r = \|x - y\|$,
$$\frac{\partial G}{\partial \vec n_x}(x,y) = \nabla_x G(x,y) \cdot \frac{x}{R} = \frac{1}{\omega_n r^n} \left( (x - y) - \frac{\|y\|^2}{R^2}\, (x - y^*) \right) \cdot \frac{x}{R} \tag{17.37}$$
$$= \frac{1}{\omega_n R\, r^n} \left( (x - y) \cdot x - \frac{\|y\|^2}{R^2}\, (x - y^*) \cdot x \right), \tag{17.38}$$
so that, by the identity above,
$$\frac{\partial G}{\partial \vec n_x}(x,y) = \frac{1}{\omega_n R}\, \frac{R^2 - \|y\|^2}{\|x-y\|^n}.$$
This formula holds true in case $y = 0$. Inserting this into (17.34), for any harmonic function $u \in C^2(U_R(0)) \cap C(\overline{U_R(0)})$ we have
$$u(y) = \frac{R^2 - \|y\|^2}{\omega_n R} \int_{S_R(0)} \frac{u(x)}{\|x-y\|^n}\, dS(x). \tag{17.39}$$
Proposition 17.25 Let $n \ge 2$. Consider the inner Dirichlet problem in $\Omega = U_R(0)$ with $f = 0$. The function
$$u(y) = \begin{cases} \dfrac{R^2 - \|y\|^2}{\omega_n R} \displaystyle\int_{S_R(0)} \frac{\varphi(x)}{\|x-y\|^n}\, dS(x), & \|y\| < R, \\[8pt] \varphi(y), & \|y\| = R, \end{cases}$$
is continuous on the closed ball $\overline{U_R(0)}$ and harmonic in $U_R(0)$.
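Proposition 17.25 can be exercised numerically for $n = 2$ (where $\omega_2 = 2\pi$) with the harmonic function $u = \operatorname{Re} z^2 = x^2 - y^2$ as boundary data; the evaluation points below are arbitrary choices (a sketch, not from the text):

```python
import numpy as np

R = 1.0
t = np.linspace(0, 2 * np.pi, 20001)
bx, by = R * np.cos(t), R * np.sin(t)
phi = bx**2 - by**2                  # boundary values of Re z^2

def u(y1, y2):
    # Poisson integral: (R^2 - |y|^2)/(2 pi R) * integral phi(x)/|x-y|^2 dS, dS = R dt
    r2 = y1**2 + y2**2
    kern = phi / ((bx - y1)**2 + (by - y2)**2)
    return (R**2 - r2) / (2 * np.pi * R) * np.trapz(kern * R, t)

inside = u(0.3, -0.4)                # should reproduce 0.3^2 - 0.4^2 = -0.07
center = u(0.0, 0.0)                 # mean of the boundary values, here 0
```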
In case $n = 2$ the function $u(y)$ can be written in the following form:
$$u(y) = \operatorname{Re}\left( \frac{1}{2\pi i} \int_{S_R(0)} \varphi(z)\, \frac{z+y}{z-y}\, \frac{dz}{z} \right), \qquad y \in U_R(0) \subset \mathbb{C}.$$
For the proof of the general statement with $n \ge 2$, see [Jos02, Theorem 1.1.2] or [Joh82, p. 107]. We show the last statement for $n = 2$. Since $y\bar z - \bar y z$ is purely imaginary,
$$\operatorname{Re} \frac{z+y}{z-y} = \operatorname{Re} \frac{(z+y)(\bar z - \bar y)}{(z-y)(\bar z - \bar y)} = \operatorname{Re} \frac{|z|^2 - |y|^2 + y\bar z - \bar y z}{|z-y|^2} = \frac{R^2 - |y|^2}{|z-y|^2}.$$
Using the parametrization $z = R e^{it}$, $dt = \frac{dz}{iz}$, we obtain
$$\operatorname{Re}\left( \frac{1}{2\pi i} \int_{S_R(0)} \varphi(z)\, \frac{z+y}{z-y}\, \frac{dz}{z} \right) = \frac{1}{2\pi} \int_0^{2\pi} \frac{R^2 - |y|^2}{|z-y|^2}\, \varphi(z)\, dt = \frac{R^2 - |y|^2}{2\pi R} \int_{S_R(0)} \frac{\varphi(x)}{|x-y|^2}\, |dx|.$$
In the last line we have a (real) line integral of the first kind, using $x = (x_1, x_2) = x_1 + i x_2 = z$, $x \in S_R(0)$, and $|dx| = R\, dt$ on the circle.
Other Examples. (a) $n = 3$. The half-space $\Omega = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_3 > 0\}$. We use the ordinary reflection map with respect to the plane $x_3 = 0$, which is given by $y = (y_1, y_2, y_3) \mapsto y^* = (y_1, y_2, -y_3)$. Then Green's function to $\Omega$ is
$$G(x,y) = E_3(x-y) - E_3(x-y^*) = \frac{1}{4\pi} \left( \frac{1}{\|x-y^*\|} - \frac{1}{\|x-y\|} \right)$$
(see homework 57.1).
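A one-line check (not from the text) that this $G$ vanishes on the boundary: for $x_3 = 0$ the distances $\|x-y\|$ and $\|x-y^*\|$ coincide, so the two terms cancel exactly; the points below are arbitrary:

```python
import numpy as np

E3 = lambda v: -1.0 / (4 * np.pi * np.linalg.norm(v))   # fundamental solution, n = 3

y     = np.array([0.5, -1.0, 2.0])       # arbitrary point with y3 > 0
ystar = np.array([0.5, -1.0, -2.0])      # its reflection in the plane x3 = 0
x     = np.array([3.0, 4.0, 0.0])        # arbitrary boundary point (x3 = 0)

G = E3(x - y) - E3(x - ystar)
```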
(b) $n = 3$. The half ball $\Omega = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid \|x\| < R,\ x_3 > 0\}$. We use the reflections $y \mapsto y^*$ (with respect to the plane $x_3 = 0$) and $y \mapsto \bar y := \frac{R^2}{\|y\|^2}\, y$ (reflection with respect to the sphere $S_R(0)$). Then
$$G(x,y) = E_3(x-y) - \frac{R}{\|y\|}\, E_3(x - \bar y) - E_3(x - y^*) + \frac{R}{\|y\|}\, E_3(x - \bar y^{\,*})$$
is Green's function to $\Omega$.
(c) $n = 3$, $\Omega = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_2 > 0,\ x_3 > 0\}$. We introduce the reflections $y = (y_1, y_2, y_3) \mapsto y^* = (y_1, y_2, -y_3)$ and $y \mapsto y' = (y_1, -y_2, y_3)$. Then Green's function to $\Omega$ is
$$G(x,y) = E_3(x-y) - E_3(x-y^*) - E_3(x-y') + E_3(x-(y')^*).$$
17.5 Appendix
Consider the Neumann problem and the ansatz for Green's function in the case of the Dirichlet problem:

u(y) = \int_\Omega H(x,y)\,\Delta u(x)\, dx + \int_{\partial\Omega}\left( u(x)\,\frac{\partial H(x,y)}{\partial \vec n_x} - H(x,y)\,\frac{\partial u}{\partial \vec n}(x) \right) dS(x). \qquad (17.40)

We want to choose a Green's function of the second kind H(x,y) in such a way that only the last surface integral remains present.
Inserting u = 1 in the above formula, we have

1 = \int_{\partial\Omega} \frac{\partial G(x,y)}{\partial \vec n_x}\, dS(x).
Hence \partial H/\partial \vec n_x cannot vanish identically on \partial\Omega; instead one requires it to be constant there. For the ball \Omega = U_R(0) in \mathbb{R}^3 one obtains

H(x,y) = \frac{1}{4\pi}\left( \frac{1}{\|x-y\|} + \frac{R}{\|y\|\,\|x-y^*\|} + \frac{1}{R}\,\log\frac{2R^2}{R^2 - x\cdot y + \|y\|\,\|x-y^*\|} \right).
For \varphi \in C(\partial\Omega) we call

u(x) = \int_{\partial\Omega} \varphi(y)\, E_n(x-y)\, dS(y), \qquad x \in \mathbb{R}^n, \qquad (17.41)

a single-layer potential, and

u(x) = \int_{\partial\Omega} \varphi(y)\, \frac{\partial E_n}{\partial \vec n_y}(x-y)\, dS(y), \qquad x \in \mathbb{R}^n \setminus \partial\Omega, \qquad (17.42)

a double-layer potential.
Remarks 17.8 (a) For x \notin \partial\Omega the integrals (17.41) and (17.42) exist.
(b) The single-layer potential u(x) is continuous on \mathbb{R}^n. The double-layer potential jumps at y_0 \in \partial\Omega by \varphi(y_0) as x approaches y_0, see (17.43) below.
Theorem 17.26 Let \Omega be a connected bounded region in \mathbb{R}^n of class C^2 and let \Omega' = \mathbb{R}^n \setminus \bar\Omega also be connected.
Then the interior Dirichlet problem for the Laplace equation has a unique solution. It can be represented in the form of a double-layer potential. The exterior Neumann problem likewise has a unique solution, which can be represented in the form of a single-layer potential.
Theorem 17.27 Under the same assumptions as in the previous theorem, the inner Neumann problem for the Laplace equation has a solution if and only if \int_{\partial\Omega} \varphi(y)\, dS(y) = 0. If this condition is satisfied, the solution is unique up to a constant.
The exterior Dirichlet problem has a unique solution.
Remark. Let \varphi_{ID} denote the continuous function which produces the solution v(x) of the interior Dirichlet problem, i.e. v(x) = \int_{\partial\Omega} \varphi_{ID}(y)\, K(x,y)\, dS(y), where K(x,y) = \frac{\partial E_n}{\partial \vec n_y}(x-y). Because of the jump relation for v(x) at x_0 \in \partial\Omega,

\lim_{x \to x_0,\, x \in \Omega} v(x) - \tfrac12\,\varphi_{ID}(x_0) = v(x_0) = \lim_{x \to x_0,\, x \in \Omega'} v(x) + \tfrac12\,\varphi_{ID}(x_0), \qquad (17.43)

where v(x_0) = \int_{\partial\Omega} \varphi_{ID}(y)\, K(x_0, y)\, dS(y) for x_0 \in \partial\Omega.
The above equation can be written as \varphi = (A + \tfrac12 I)\varphi_{ID}, where A is the above integral operator in L^2(\partial\Omega). One can prove the following facts: A is compact, A + \tfrac12 I is injective and surjective, and \varphi continuous implies \varphi_{ID} continuous. For details, see [Tri92, 3.4].
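The jump by \varphi(y_0) and the factor \tfrac12 on the boundary can be seen numerically in the simplest case: the double-layer potential of the constant density 1 on a circle takes the values 1 inside, \tfrac12 on, and 0 outside the circle (a Gauss-type identity). The sketch below, with illustrative names, assumes the convention E_2(x) = \frac{1}{2\pi}\log\|x\|; with the opposite sign convention all values change sign.

```python
import math

def double_layer_const(x, R=1.0, m=4000):
    """Double-layer potential of density 1 on the circle S_R(0) in R^2:
    v(x) = \int_{S_R} dE_2/dn_y (x - y) dS(y),  E_2(x) = log|x|/(2*pi).
    The value jumps: 1 inside, 1/2 on the circle, 0 outside."""
    x1, x2 = x
    total = 0.0
    for k in range(m):
        t = 2 * math.pi * (k + 0.5) / m        # midpoint rule on the circle
        y1, y2 = R * math.cos(t), R * math.sin(t)
        d2 = (x1 - y1) ** 2 + (x2 - y2) ** 2
        # dE_2/dn_y (x - y) = -(x - y).n_y / (2*pi*|x - y|^2), n_y = y/R
        total += -((x1 - y1) * y1 + (x2 - y2) * y2) / (R * 2 * math.pi * d2)
    return total * (2 * math.pi * R / m)        # dS = R dt

print(double_layer_const((0.3, 0.2)))   # ~ 1   (inside)
print(double_layer_const((1.0, 0.0)))   # ~ 1/2 (on the circle)
print(double_layer_const((2.0, 0.0)))   # ~ 0   (outside)
```

The midpoint nodes avoid the (removable) singularity when x lies on the circle, which is why the boundary value can be computed with the same rule.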
Application to the Poisson Equation
Consider the inner Dirichlet problem \Delta u = f in \Omega and u = \varphi on \partial\Omega. We suppose that f \in C(\bar\Omega) \cap C^1(\Omega). We already know that

w(x) = (E_n * f)(x) = -\frac{1}{(n-2)\,\omega_n} \int_{\mathbb{R}^n} \frac{f(y)}{\|x-y\|^{n-2}}\, dy

is a distributional solution of the Poisson equation, \Delta w = f. By the assumptions on f, w \in C^2(\Omega) and therefore w is a classical solution. To solve the problem we try the ansatz u = w + v. Then \Delta u = \Delta w + \Delta v = f + \Delta v. Hence, \Delta u = f if and only if \Delta v = 0. Thus, the inner Dirichlet problem for the Poisson equation reduces to the inner Dirichlet problem for the Laplace equation \Delta v = 0 with boundary values

v(y) = u(y) - w(y) = \varphi(y) - w(y) =: \psi(y), \qquad y \in \partial\Omega.
E(v) = \frac12 \int_\Omega \|\nabla v\|^2\, dx + \int_\Omega v\, f\, dx, \qquad v \in C^1(\bar\Omega). \qquad (17.44)

This integral is also called the energy integral. The Dirichlet principle says that among all functions v \in C^1(\bar\Omega) with given boundary values \varphi, the function u with \Delta u = f minimizes the energy integral E.
Proposition 17.28 A function u \in C^1(\bar\Omega) \cap C^2(\Omega) with boundary values \varphi is a solution of the inner Dirichlet problem if and only if the energy integral E attains at u its minimum over all v \in C^1(\bar\Omega) with boundary values \varphi.
Proof. (a) Suppose first that u \in C^1(\bar\Omega) \cap C^2(\Omega) is a solution of the inner Dirichlet problem, \Delta u = f. For v \in C^1(\bar\Omega) with boundary values \varphi let w = v - u \in C^1(\bar\Omega); then w = 0 on \partial\Omega. We have

E(v) = E(u+w) = \frac12 \int_\Omega \nabla(u+w)\cdot\nabla(u+w)\, dx + \int_\Omega (u+w)\, f\, dx
     = \frac12 \int_\Omega \|\nabla u\|^2\, dx + \frac12 \int_\Omega \|\nabla w\|^2\, dx + \int_\Omega \nabla u\cdot\nabla w\, dx + \int_\Omega (u+w)\, f\, dx.

By Green's first formula and w|_{\partial\Omega} = 0 we have \int_\Omega \nabla u\cdot\nabla w\, dx = -\int_\Omega w\,\Delta u\, dx = -\int_\Omega w f\, dx, so that

E(v) = E(u) + \frac12 \int_\Omega \|\nabla w\|^2\, dx \ge E(u).

This shows that E(u) is minimal.
(b) Conversely, let u \in C^1(\bar\Omega) \cap C^2(\Omega) minimize the energy integral. In particular, for any test function \varphi \in \mathcal{D}(\Omega), which has zero boundary values, the function

g(t) = E(u + t\varphi) = E(u) + t \int_\Omega (\nabla u\cdot\nabla\varphi + \varphi f)\, dx + \frac{t^2}{2} \int_\Omega \|\nabla\varphi\|^2\, dx

has a local minimum at t = 0. Hence, g'(0) = 0, which is, again by Green's first formula and \varphi|_{\partial\Omega} = 0, equivalent to

0 = \int_\Omega (\nabla u\cdot\nabla\varphi + \varphi f)\, dx = \int_\Omega (-\Delta u + f)\, \varphi\, dx.

Since \varphi \in \mathcal{D}(\Omega) was arbitrary, \Delta u = f in \Omega.
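In one dimension (\Omega = (0,1), where \Delta u = u''), the minimizing property of part (a) can be observed directly. The following sketch, with illustrative names, perturbs the exact solution of u'' = 2, u(0) = u(1) = 0, by multiples of \sin(\pi x) and watches the energy grow:

```python
import math

def energy(v, vp, f, m=4000):
    """E(v) = 1/2 * int_0^1 v'(x)^2 dx + int_0^1 v(x) f(x) dx, midpoint rule."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        x = (k + 0.5) * h
        total += (0.5 * vp(x) ** 2 + v(x) * f(x)) * h
    return total

f  = lambda x: 2.0                       # Dirichlet problem u'' = f on (0,1)
u  = lambda x: x * x - x                 # exact solution, u(0) = u(1) = 0
up = lambda x: 2.0 * x - 1.0

def E_perturbed(eps):
    """Energy of u + eps*sin(pi x); the perturbation vanishes on the boundary."""
    v  = lambda x: u(x) + eps * math.sin(math.pi * x)
    vp = lambda x: up(x) + eps * math.pi * math.cos(math.pi * x)
    return energy(v, vp, f)

print(E_perturbed(0.0), E_perturbed(0.1), E_perturbed(0.5))
# smallest at eps = 0, in line with Proposition 17.28 (E(u) = -1/6)
```

As in the proof, the linear term in eps cancels by Green's formula, so E grows quadratically in the perturbation.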
(b) Hilbert Space Methods
We want to give another reformulation of the Dirichlet problem. Consider the problem

-\Delta u = f, \qquad u|_{\partial\Omega} = 0,

and on C^1(\bar\Omega) the symmetric bilinear form

\langle u, v\rangle_E = \int_\Omega \nabla u \cdot \nabla v\, dx.

C^1(\bar\Omega) is not yet an inner product space since \langle u, u\rangle_E = 0 for any non-vanishing constant function u. Denote by C^1_0(\bar\Omega) the subspace of functions in C^1(\bar\Omega) vanishing on the boundary \partial\Omega. Now \langle u, v\rangle_E is an inner product on C^1_0(\bar\Omega); the positive definiteness is a consequence of the Poincaré inequality below. Its corresponding norm is \|u\|_E^2 = \int_\Omega \|\nabla u\|^2\, dx. Let u be a solution of the above Dirichlet problem. Then for any v \in C^1_0(\bar\Omega), by Green's first formula,

\langle v, u\rangle_E = \int_\Omega \nabla v \cdot \nabla u\, dx = -\int_\Omega v\,\Delta u\, dx = \int_\Omega v\, f\, dx = \langle v, f\rangle_{L^2}.
This suggests that u can be found by representing the known linear functional

F(v) = \int_\Omega v\, f\, dx

as an inner product \langle v, u\rangle_E with a fixed element u of the energy space.
Suppose \Omega is contained in the cube (-a, a)^n and u \in C^1_0(\bar\Omega) is extended by 0 to the whole cube. Then u(x) = \int_{-a}^{x_1} u_{x_1}(y_1, x_2, \dots, x_n)\, dy_1, and the Cauchy–Schwarz inequality (CSI) yields

u(x)^2 = \left( \int_{-a}^{x_1} u_{x_1}\, dy_1 \right)^2 \overset{\text{CSI}}{\le} (x_1 + a) \int_{-a}^{a} u_{x_1}^2\, dy_1 \le 2a \int_{-a}^{a} u_{x_1}^2\, dy_1.

Since the last integral does not depend on x_1, integration with respect to x_1 gives

\int_{-a}^{a} u(x)^2\, dx_1 \le 4a^2 \int_{-a}^{a} u_{x_1}^2\, dy_1.

Doing the same for every coordinate direction and integrating over the remaining variables, we obtain the Poincaré inequality \|u\|_{L^2} \le C\,\|u\|_E, where C = 2a/\sqrt{n}.
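The inequality \|u\|_{L^2} \le C\,\|u\|_E can be checked numerically for a concrete function vanishing on the boundary. The sketch below (illustrative names, midpoint quadrature) uses u(x) = \cos(\pi x/2) on (-1, 1), where a = 1 and n = 1, so C = 2:

```python
import math

def l2_and_energy_norm(u, up, a=1.0, m=4000):
    """Return (||u||_{L^2}, ||u||_E) on (-a, a) via the midpoint rule."""
    h = 2 * a / m
    l2 = en = 0.0
    for k in range(m):
        x = -a + (k + 0.5) * h
        l2 += u(x) ** 2 * h
        en += up(x) ** 2 * h
    return math.sqrt(l2), math.sqrt(en)

# u(x) = cos(pi x / 2) vanishes at x = +-1:
u  = lambda x: math.cos(math.pi * x / 2)
up = lambda x: -math.pi / 2 * math.sin(math.pi * x / 2)
l2, en = l2_and_energy_norm(u, up)
print(l2, en, l2 <= 2.0 * en)   # ratio l2/en = 2/pi, well below C = 2
```

Here \|u\|_{L^2} = 1 and \|u\|_E = \pi/2, so the inequality holds with room to spare.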
The Poincaré inequality is sometimes called the Poincaré–Friedrichs inequality. It remains true for functions u in the completion W of C^1_0(\bar\Omega) with respect to the energy norm. Let us discuss the elements of W in more detail. By definition, f \in W if there is a Cauchy sequence (f_n) in C^1_0(\bar\Omega) such that (f_n) converges to f in the energy norm. By the Poincaré inequality, (f_n) is also an L^2-Cauchy sequence. Since L^2(\Omega) is complete, (f_n) has an L^2-limit f. This shows W \subset L^2(\Omega). For simplicity, let \Omega \subset \mathbb{R}^1. By definition of the energy norm, \int_\Omega |(f_n - f_m)'|^2\, dx \to 0 as m, n \to \infty; that is, (f_n') is an L^2-Cauchy sequence, too. Hence, (f_n') also has some L^2-limit, say g \in L^2(\Omega). So far,
\|f_n - f\|_{L^2} \to 0, \qquad \|f_n' - g\|_{L^2} \to 0. \qquad (17.45)
We will show that the above limits imply f' = g in \mathcal{D}'(\Omega). Indeed, by (17.45) and the Cauchy–Schwarz inequality, for all \varphi \in \mathcal{D}(\Omega),

\left| \int (f_n - f)\,\varphi\, dx \right| \le \left( \int |f_n - f|^2\, dx \right)^{1/2} \left( \int |\varphi|^2\, dx \right)^{1/2} \to 0,
\left| \int (f_n' - g)\,\varphi\, dx \right| \le \left( \int |f_n' - g|^2\, dx \right)^{1/2} \left( \int |\varphi|^2\, dx \right)^{1/2} \to 0.

Hence,

\int f\,\varphi'\, dx = \lim_{n\to\infty} \int f_n\,\varphi'\, dx = -\lim_{n\to\infty} \int f_n'\,\varphi\, dx = -\int g\,\varphi\, dx.
This shows f' = g in \mathcal{D}'(\Omega). One says that the elements of W possess weak derivatives; that is, their distributional derivative is an L^2-function (and hence a regular distribution).
Also, the inner product \langle\cdot,\cdot\rangle_E is positive definite on W since the L^2-inner product is. It turns out that W is a separable Hilbert space. W is the so-called Sobolev space W_0^{1,2}(\Omega), sometimes also denoted by H_0^1(\Omega). The upper indices 1 and 2 in W_0^{1,2}(\Omega) refer to the highest order of partial derivatives (|\alpha| = 1) and to the L^p-space (p = 2) in the definition of W, respectively. The lower index 0 refers to the so-called generalized boundary values 0. For further reading on Sobolev spaces, see [Fol95, Chapter 6].
Corollary 17.30 F(v) = \langle v, f\rangle_{L^2} = \int_\Omega f\, v\, dx defines a bounded linear functional on W.
Proof. By the Cauchy–Schwarz and Poincaré inequalities,

|F(v)| \le \int_\Omega |f\, v|\, dx \le \|f\|_{L^2}\, \|v\|_{L^2} \le C\, \|f\|_{L^2}\, \|v\|_E.
Corollary 17.31 Let f \in C(\bar\Omega). Then there exists a unique u \in W such that

\langle v, u\rangle_E = \langle v, f\rangle_{L^2}, \qquad v \in W.

This u solves -\Delta u = f in \mathcal{D}'(\Omega).
The first statement is a consequence of Riesz's representation theorem; note that F is a bounded linear functional on the Hilbert space W. The last statement follows from \mathcal{D}(\Omega) \subset C^1_0(\bar\Omega) and

\langle \varphi, u\rangle_E = \langle u, -\Delta\varphi\rangle = \langle f, \varphi\rangle_{L^2}.

This is the so-called modified Dirichlet problem. It remains to identify the solution u \in W with an ordinary function u \in C^2(\Omega).
One replaces the derivative u'(x) by the forward or backward difference quotients

\frac{u(x+h) - u(x)}{h}, \qquad \frac{u(x) - u(x-h)}{h},

where h is called the step size. One can also use a symmetric difference \frac{u(x+h) - u(x-h)}{2h}. The five-point formula for the Laplacian in \mathbb{R}^2 is then given by

\Delta_h u(x,y) := (\partial_x^- \partial_x^+ + \partial_y^- \partial_y^+)\, u(x,y) = \frac{u(x-h, y) + u(x+h, y) + u(x, y-h) + u(x, y+h) - 4u(x,y)}{h^2}.
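The five-point formula is easy to test: it reproduces the Laplacian exactly on quadratic polynomials and is accurate to O(h^2) in general. A small sketch (illustrative names):

```python
def laplace_5pt(u, x, y, h):
    """Five-point approximation of the Laplacian (Delta u)(x, y)."""
    return (u(x - h, y) + u(x + h, y) + u(x, y - h) + u(x, y + h)
            - 4.0 * u(x, y)) / (h * h)

# exact on quadratics: the harmonic u = x^2 - y^2 gives 0 up to rounding:
print(laplace_5pt(lambda x, y: x * x - y * y, 0.4, 0.7, 0.1))
# for u = x^4 + y^4 (Delta u = 12 x^2 + 12 y^2 = 7.8 at (0.4, 0.7)),
# the error is O(h^2):
print(laplace_5pt(lambda x, y: x ** 4 + y ** 4, 0.4, 0.7, 0.01))
```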
Besides the equation, the domain \Omega as well as its boundary \partial\Omega undergo a discretization: if \Omega = (0,1) \times (0,1), then

\Omega_h = \{(nh, mh) \mid n, m \in \mathbb{N}\} \cap \Omega, \qquad x \in \Omega_h.
where we are thinking of the Sobolev space V = W from the previous paragraph. Of course, F is assumed to be bounded.
Difference methods arise through discretizing the differential operator. Now we wish to leave the differential operator, which is hidden in \langle\cdot,\cdot\rangle_E, unchanged. The Ritz–Galerkin method consists in replacing the infinite-dimensional space V by a finite-dimensional space V_N,

V_N \subset V, \qquad \dim V_N = N < \infty.

V_N equipped with the norm \|\cdot\|_E is still a Banach space. Since V_N \subset V, both the inner product \langle\cdot,\cdot\rangle_E and F are defined for u, v \in V_N. Thus, we may pose the problem:

Find u_N \in V_N such that \langle u_N, v\rangle_E = F(v) for all v \in V_N.

The solution to the above problem, if it exists, is called the Ritz–Galerkin solution (belonging to V_N).
An introductory example can be found in [Hac92, 8.1.11, p. 164]; see also [Bra01, Chapter 2].
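A minimal one-dimensional instance of the Ritz–Galerkin method — not the example from the cited references; all names are illustrative — takes V_N to be the span of hat functions on a uniform grid for -u'' = f on (0,1) with zero boundary values. The Galerkin condition \langle u_N, v\rangle_E = F(v) then becomes a tridiagonal linear system:

```python
def ritz_galerkin_1d(f, N):
    """Ritz-Galerkin solution of -u'' = f on (0,1), u(0) = u(1) = 0, with
    V_N = span of hat functions at the N interior nodes x_i = (i+1)*h.
    Stiffness entries: <phi_i, phi_i>_E = 2/h, <phi_i, phi_{i+1}>_E = -1/h;
    the load F(phi_i) is approximated by h * f(x_i)."""
    h = 1.0 / (N + 1)
    a = [2.0 / h] * N                 # diagonal of the stiffness matrix
    b = [-1.0 / h] * (N - 1)          # symmetric off-diagonal
    rhs = [h * f((i + 1) * h) for i in range(N)]
    # Thomas algorithm: forward elimination, then back substitution
    for i in range(1, N):
        m = b[i - 1] / a[i - 1]
        a[i] -= m * b[i - 1]
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * N
    u[-1] = rhs[-1] / a[-1]
    for i in range(N - 2, -1, -1):
        u[i] = (rhs[i] - b[i] * u[i + 1]) / a[i]
    return u                          # nodal values u_N(x_i)

u = ritz_galerkin_1d(lambda x: 1.0, 9)
# exact solution of -u'' = 1 is u(x) = x(1 - x)/2; at x = 0.5 this is 0.125
print(u[4])
```

For this particular problem the piecewise-linear Galerkin solution is exact at the nodes, which is why the printed value matches x(1-x)/2 at x = 1/2.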
Bibliography
[AF01]
I. Agricola and T. Friedrich. Globale Analysis (in German). Friedr. Vieweg & Sohn,
Braunschweig, 2001.
[Ahl78] L. V. Ahlfors. Complex analysis. An introduction to the theory of analytic functions of one complex variable. International Series in Pure and Applied Mathematics. McGraw-Hill Book Co., New York, third edition, 1978.
[Arn04] V. I. Arnold. Lectures in Partial Differential Equations. Universitext. Springer and
Phasis, Berlin. Moscow, 2004.
[Bra01] D. Braess. Finite elements. Theory, fast solvers, and applications in solid mechanics.
Cambridge University Press, Cambridge, 2001.
[Bre97] Glen E. Bredon. Topology and geometry. Number 139 in Graduate Texts in Mathematics. Springer-Verlag, New York, 1997.
[Bro92] Th. Bröcker. Analysis II (in German). B. I. Wissenschaftsverlag, Mannheim, 1992.
[Con78] J. B. Conway. Functions of one complex variable. Number 11 in Graduate Texts in
Mathematics. Springer-Verlag, New York, 1978.
[Con90] J. B. Conway. A course in functional analysis. Number 96 in Graduate Texts in
Mathematics. Springer-Verlag, New York, 1990.
[Cou88] R. Courant. Differential and integral calculus III. Wiley Classics Library. John
Wiley & Sons, New York etc., 1988.
[Die93] J. Dieudonné. Treatise on analysis. Volumes I–IX. Pure and Applied Mathematics. Academic Press, Boston, 1993.
[Els02]
[Eva98] L. C. Evans. Partial differential equations. Number 19 in Graduate Studies in Mathematics. AMS, Providence, 1998.
[FB93]
[FK98]
[FL88]
[Fol95] G. B. Folland. Introduction to partial differential equations. Princeton University Press, Princeton, second edition, 1995.
[For81]
[For01]
[GS64]
[GS69]
[Hac92] W. Hackbusch. Elliptic differential equations. Theory and numerical treatment. Number 18 in Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 1992.
[HW96] E. Hairer and G. Wanner. Analysis by its history. Undergraduate texts in Mathematics.
Readings in Mathematics. Springer-Verlag, New York, 1996.
[Jan93]
[Joh82] F. John. Partial differential equations. Number 1 in Applied Mathematical Sciences. Springer-Verlag, New York, fourth edition, 1982.
[Jos02] J. Jost. Partial differential equations. Number 214 in Graduate Texts in Mathematics. Springer-Verlag, New York, 2002.
[KK71] A. Kufner and J. Kadlec. Fourier Series. G. A. Toombs Iliffe Books, London, 1971.
[Kno78] K. Knopp. Elemente der Funktionentheorie. Number 2124 in Sammlung Göschen. Walter de Gruyter, Berlin, New York, ninth edition, 1978.
[Kon90] K. Königsberger. Analysis 1 (English). Springer-Verlag, Berlin, Heidelberg, New York, 1990.
[Lan89] S. Lang. Undergraduate Analysis. Undergraduate texts in mathematics. Springer,
New York-Heidelberg, second edition, 1989.
[MW85] J. Marsden and A. Weinstein. Calculus. I, II, III. Undergraduate Texts in Mathematics. Springer-Verlag, New York etc., 1985.
[Nee97] T. Needham. Visual complex analysis. Oxford University Press, New York, 1997.
[ON75] P. V. O'Neil. Advanced calculus. Collier Macmillan Publishing Co., London, 1975.
[RS80]
[Rud66] W. Rudin. Real and Complex Analysis. International Student Edition. McGraw-Hill
Book Co., New York-Toronto, 1966.
[Rud76] W. Rudin. Principles of mathematical analysis. International Series in Pure and Applied Mathematics. McGraw-Hill Book Co., New York-Auckland-Düsseldorf, third edition, 1976.
[Ruh83] F. Rühs. Funktionentheorie. Hochschulbücher für Mathematik. VEB Deutscher Verlag der Wissenschaften, Berlin, fourth edition, 1983.
[Spi65]
[Spi80]
[Str92]
W. A. Strauss. Partial differential equations. John Wiley & Sons, New York, 1992.
[Tri92] H. Triebel. Higher analysis. Johann Ambrosius Barth, Leipzig, 1992.
[vW81] C. von Westenholz. Differential forms in mathematical physics. Number 3 in Studies in Mathematics and its Applications. North-Holland Publishing Co., Amsterdam-New York, second edition, 1981.
[Wal74] W. Walter. Einführung in die Theorie der Distributionen (in German). Bibliographisches Institut, B.I.-Wissenschaftsverlag, Mannheim-Wien-Zürich, 1974.
[Wal02] W. Walter. Analysis 1–2 (in German). Springer-Lehrbuch. Springer, Berlin, fifth edition, 2002.
PD Dr. A. Schüler
Mathematisches Institut
Universität Leipzig
04009 Leipzig
Axel.Schueler@math.uni-leipzig.de