
INTRODUCTION TO FUNCTIONAL ANALYSIS

VLADIMIR V. KISIL
ABSTRACT. These are lecture notes for several courses on Functional Analysis at the
School of Mathematics of the University of Leeds. They are based on the notes of
Dr. Matt Daws, Prof. Jonathan R. Partington and Dr. David Salinger used in
previous years. Some sections are borrowed from the textbooks which I have used since
being a student myself. However, all misprints, omissions, and errors are my
responsibility alone. I am very grateful to Filipa Soares de Almeida, Eric Borgnet, Pasc
Gavruta for pointing out some of them. Please let me know if you find more.
The notes are also available for download in PDF.
The suggested textbooks are [1, 6, 8, 9]. Other nice books with many
interesting problems are [3, 7].
Exercises with stars are not part of the mandatory material but are nevertheless
worth hearing about. They are not necessarily difficult, so try to solve them!
CONTENTS
List of Figures
Notations and Assumptions
Integrability conditions
1. Motivating Example: Fourier Series
1.1. Fourier series: basic notions
1.2. The vibrating string
1.3. Historic: Joseph Fourier
2. Basics of Linear Spaces
2.1. Banach spaces (basic definitions only)
2.2. Hilbert spaces
2.3. Subspaces
2.4. Linear spans
3. Orthogonality
3.1. Orthogonal System in Hilbert Space
3.2. Bessel's inequality
3.3. The Riesz-Fischer theorem
3.4. Construction of Orthonormal Sequences
3.5. Orthogonal complements
4. Fourier Analysis
4.1. Fourier series
4.2. Fejér's theorem
4.3. Parseval's formula
4.4. Some Application of Fourier Series
5. Duality of Linear Spaces
5.1. Dual space of a normed space
5.2. Self-duality of Hilbert space
6. Operators
6.1. Linear operators
6.2. B(H) as a Banach space (and even algebra)
Date: 28th November 2013.
6.3. Adjoints
6.4. Hermitian, unitary and normal operators
7. Spectral Theory
7.1. The spectrum of an operator on a Hilbert space
7.2. The spectral radius formula
7.3. Spectrum of Special Operators
8. Compactness
8.1. Compact operators
8.2. Hilbert-Schmidt operators
9. The spectral theorem for compact normal operators
9.1. Spectrum of normal operators
9.2. Compact normal operators
10. Applications to integral equations
11. Banach and Normed Spaces
11.1. Normed spaces
11.2. Bounded linear operators
11.3. Dual Spaces
11.4. Hahn-Banach Theorem
11.5. C(X) Spaces
12. Measure Theory
12.1. Basic Measure Theory
12.2. Extension of Measures
12.3. Complex-Valued Measures and Charges
12.4. Constructing Measures, Products
13. Integration
13.1. Measurable functions
13.2. Lebesgue Integral
13.3. Properties of the Lebesgue Integral
13.4. Integration on Product Measures
13.5. Absolute Continuity of Measures
14. Functional Spaces
14.1. Integrable Functions
14.2. Dense Subspaces in $L_p$
14.3. Continuous functions
14.4. Riesz Representation Theorem
15. Fourier Transform
15.1. Convolutions on Commutative Groups
15.2. Characters of Commutative Groups
15.3. Fourier Transform on Commutative Groups
15.4. Fourier Integral
Appendix A. Tutorial Problems
A.1. Tutorial problems I
A.2. Tutorial problems II
A.3. Tutorial Problems III
A.4. Tutorial Problems IV
A.5. Tutorial Problems V
A.6. Tutorial Problems VI
A.7. Tutorial Problems VII
Appendix B. Solutions of Tutorial Problems
Appendix C. Course in the Nutshell
C.1. Some useful results and formulae (1)
C.2. Some useful results and formulae (2)
Appendix D. Supplementary Sections
D.1. Reminder from Complex Analysis
References
Index
LIST OF FIGURES
1 Triangle inequality
2 Different unit balls
3 To the parallelogram identity.
4 Jump function as a limit of continuous functions
5 The Pythagoras theorem
6 Best approximation from a subspace
7 Best approximation by three trigonometric polynomials
8 Legendre and Chebyshev polynomials
9 A modification of continuous function to periodic
10 The Fejér kernel
11 The dynamics of a heat equation
12 Appearance of dissonance
13 Different musical instruments
14 Fourier series for different musical instruments
15 Two frequencies separated in time
16 Distance between scales of orthonormal vectors
17 The $\epsilon/3$ argument to estimate $|f(x) - f(y)|$.
NOTATIONS AND ASSUMPTIONS
$\mathbb{Z}_+$, $\mathbb{R}_+$ denote non-negative integers and reals.
$x, y, z, \ldots$ denote vectors.
$\lambda, \mu, \nu, \ldots$ denote scalars.
$\Re z$, $\Im z$ stand for the real and imaginary parts of a complex number $z$.
Integrability conditions. In this course, the functions we consider will be real or
complex valued functions defined on the real line which are locally Riemann
integrable. This means that they are Riemann integrable on any finite closed interval
$[a, b]$. (A complex valued function is Riemann integrable iff its real and
imaginary parts are Riemann integrable.) In practice, we shall be dealing mainly with
bounded functions that have only a finite number of points of discontinuity in
any finite interval. We can relax the boundedness condition to allow improper
Riemann integrals, but we then require the integral of the absolute value of the
function to converge.
We mention this right at the start to get it out of the way. There are many
fascinating subtleties connected with Fourier analysis, but those connected with
technical aspects of integration theory are beyond the scope of the course. It turns
out that one needs a better integral than the Riemann integral: the Lebesgue
integral, and I commend the module Linear Analysis 1, which includes an
introduction to that topic and is available to MM students (or you could look it up
in Real and Complex Analysis by Walter Rudin). Once one has the Lebesgue
integral, one can start thinking about the different classes of functions to which Fourier
analysis applies: the modern theory (not available to Fourier himself) can even go
beyond functions and deal with generalized functions (distributions) such as the
Dirac delta function, which may be familiar to some of you from quantum theory.
From now on, when we say "function", we shall assume the conditions of the
first paragraph, unless anything is stated to the contrary.
1. MOTIVATING EXAMPLE: FOURIER SERIES
1.1. Fourier series: basic notions. Before proceeding with an abstract theory we
consider a motivating example: Fourier series.
1.1.1. $2\pi$-periodic functions. In this part of the course we deal with functions (as
above) that are periodic.
We say a function $f : \mathbb{R} \to \mathbb{C}$ is periodic with period $T > 0$ if $f(x + T) = f(x)$ for all
$x \in \mathbb{R}$. For example, $\sin x$, $\cos x$, $e^{ix}$ $(= \cos x + i \sin x)$ are periodic with period $2\pi$.
For $k \in \mathbb{R} \setminus \{0\}$, $\sin kx$, $\cos kx$, and $e^{ikx}$ are periodic with period $2\pi/|k|$. Constant
functions are periodic with period $T$, for any $T > 0$. We shall specialize to periodic
functions with period $2\pi$: we call them $2\pi$-periodic functions, for short. Note that
$\cos nx$, $\sin nx$ and $e^{inx}$ are $2\pi$-periodic for $n \in \mathbb{Z}$. (Of course these are also
$2\pi/|n|$-periodic when $n \neq 0$.)
Any half-open interval of length $T$ is a fundamental domain of a periodic function
$f$ of period $T$. Once you know the values of $f$ on the fundamental domain, you
know them everywhere, because any point $x$ in $\mathbb{R}$ can be written uniquely as
$x = w + nT$ where $n \in \mathbb{Z}$ and $w$ is in the fundamental domain. Thus
$f(x) = f(w + (n-1)T + T) = \cdots = f(w + T) = f(w)$.
For $2\pi$-periodic functions, we shall usually take the fundamental domain to be
$]-\pi, \pi]$. By abuse of language, we shall sometimes refer to $[-\pi, \pi]$ as the
fundamental domain. We then have to be aware that $f(\pi) = f(-\pi)$.
1.1.2. Integrating the complex exponential function. We shall need to calculate
\[ \int_a^b e^{ikx}\,dx, \]
for $k \in \mathbb{R}$. Note first that when $k = 0$, the integrand is the constant function
1, so the result is $b - a$. For non-zero $k$,
\[ \int_a^b e^{ikx}\,dx = \int_a^b (\cos kx + i \sin kx)\,dx = (1/k)\,[\sin kx - i \cos kx]_a^b = (1/ik)\,[\cos kx + i \sin kx]_a^b = (1/ik)\,[e^{ikx}]_a^b = (1/ik)(e^{ikb} - e^{ika}). \]
Note that this is exactly the result you would have got by treating $i$ as a real
constant and using the usual formula for integrating $e^{ax}$. Note also that the cases
$k = 0$ and $k \neq 0$ have to be treated separately: this is typical.
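The closed formula above can be checked numerically. The following is only a sketch under stated assumptions: the function names, the interval $[0, 2]$ and the midpoint rule are illustrative choices, not part of the notes.

```python
import cmath

def integral_closed_form(k, a, b):
    """Closed form of the integral of e^{ikx} over [a, b] (k real)."""
    if k == 0:
        return complex(b - a)  # the integrand is the constant function 1
    return (cmath.exp(1j * k * b) - cmath.exp(1j * k * a)) / (1j * k)

def integral_midpoint(k, a, b, steps=20000):
    """Midpoint-rule approximation of the same integral (illustrative only)."""
    h = (b - a) / steps
    return h * sum(cmath.exp(1j * k * (a + (j + 0.5) * h)) for j in range(steps))

for k in (0, 1, 2.5, -3):
    exact = integral_closed_form(k, 0.0, 2.0)
    approx = integral_midpoint(k, 0.0, 2.0)
    print(k, abs(exact - approx))  # the discrepancy should be tiny
```

The case $k = 0$ needs its own branch in the code for the same reason as in the text: the formula with $1/ik$ is undefined there.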
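Definition 1.1. Let $f : \mathbb{R} \to \mathbb{C}$ be a $2\pi$-periodic function which is Riemann
integrable on $[-\pi, \pi]$. For each $n \in \mathbb{Z}$ we define the Fourier coefficient $\hat{f}(n)$ by
\[ \hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx. \]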
Remark 1.2. (i) $\hat{f}(n)$ is a complex number whose modulus is the amplitude
and whose argument is the phase (of that component of the original
function).
(ii) If $f$ and $g$ are Riemann integrable on an interval, then so is their product,
so the integral is well-defined.
(iii) The constant before the integral is to divide by the length of the interval.
(iv) We could replace the range of integration by any interval of length $2\pi$,
without altering the result, since the integrand is $2\pi$-periodic.
(v) Note the minus sign in the exponent of the exponential. The reason for
this will soon become clear.
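Example 1.3. (i) $f(x) = c$; then $\hat{f}(0) = c$ and $\hat{f}(n) = 0$ when $n \neq 0$.
(ii) $f(x) = e^{ikx}$, where $k$ is an integer. $\hat{f}(n) = \delta_{nk}$.
(iii) $f$ is $2\pi$-periodic and $f(x) = x$ on $]-\pi, \pi]$. (Diagram) Then $\hat{f}(0) = 0$ and, for
$n \neq 0$,
\[ \hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} x e^{-inx}\,dx = \left[ \frac{-x e^{-inx}}{2\pi i n} \right]_{-\pi}^{\pi} + \frac{1}{in} \cdot \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-inx}\,dx = \frac{(-1)^n i}{n}. \]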
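As a sanity check on Example 1.3, the Fourier coefficients can be approximated numerically. This is a minimal sketch: the helper name `fourier_coefficient` and the use of a midpoint rule are assumptions made for illustration only.

```python
import cmath
import math

def fourier_coefficient(f, n, steps=20000):
    """Midpoint-rule approximation of (1/2pi) * integral of f(x) e^{-inx} over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0j
    for j in range(steps):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

def sawtooth(x):
    """Values of the function of Example 1.3(iii) on ]-pi, pi]."""
    return x

for n in (1, 2, 3):
    # numerical value next to the claimed value (-1)^n i / n
    print(n, fourier_coefficient(sawtooth, n), (-1) ** n * 1j / n)
```

The same helper applied to a constant function or to $e^{ikx}$ reproduces parts (i) and (ii) up to rounding.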
Proposition 1.4 (Linearity). If $f$ and $g$ are $2\pi$-periodic functions and $c$ and $d$ are complex
constants, then, for all $n \in \mathbb{Z}$,
\[ (cf + dg)\hat{\ }(n) = c\hat{f}(n) + d\hat{g}(n). \]
Corollary 1.5. If $p(x) = \sum_{n=-k}^{k} c_n e^{inx}$, then $\hat{p}(n) = c_n$ for $|n| \le k$ and $= 0$ for $|n| > k$, so
\[ p(x) = \sum_{n \in \mathbb{Z}} \hat{p}(n) e^{inx}. \]
This follows immediately from Ex. 1.3(ii) and Prop. 1.4.
Remark 1.6. (i) This corollary explains why the minus sign is natural in the
definition of the Fourier coefficients.
(ii) The first part of the course will be devoted to the question of how far this
result can be extended to other $2\pi$-periodic functions, that is, for which
functions, and for which interpretations of infinite sums, is it true that
\[ f(x) = \sum_{n \in \mathbb{Z}} \hat{f}(n) e^{inx}. \tag{1.1} \]
Definition 1.7. $\sum_{n \in \mathbb{Z}} \hat{f}(n) e^{inx}$ is called the Fourier series of the $2\pi$-periodic
function $f$.
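For real-valued functions, the introduction of complex exponentials seems
artificial: indeed they can be avoided as follows. We work with (1.1) in the case of a
finite sum: then we can rearrange the sum as
\[ \hat{f}(0) + \sum_{n>0} \bigl(\hat{f}(n) e^{inx} + \hat{f}(-n) e^{-inx}\bigr) \]
\[ = \hat{f}(0) + \sum_{n>0} \bigl[(\hat{f}(n) + \hat{f}(-n)) \cos nx + i(\hat{f}(n) - \hat{f}(-n)) \sin nx\bigr] \]
\[ = \frac{a_0}{2} + \sum_{n>0} (a_n \cos nx + b_n \sin nx). \]
Here
\[ a_n = \hat{f}(n) + \hat{f}(-n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)(e^{-inx} + e^{inx})\,dx = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\,dx \]
for $n > 0$ and
\[ b_n = i(\hat{f}(n) - \hat{f}(-n)) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\,dx \]
for $n > 0$. $a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\,dx$, the constant chosen for consistency.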


The $a_n$ and $b_n$ are also called Fourier coefficients: if it is necessary to
distinguish them, we may call them Fourier cosine and sine coefficients, respectively.
We note that if $f$ is real-valued, then the $a_n$ and $b_n$ are real numbers and so
$\Re \hat{f}(-n) = \Re \hat{f}(n)$, $\Im \hat{f}(-n) = -\Im \hat{f}(n)$: thus $\hat{f}(-n)$ is the complex conjugate of $\hat{f}(n)$.
Further, if $f$ is an even function then all the sine coefficients are 0 and if $f$ is an
odd function, all the cosine coefficients are zero. We note further that the sine and
cosine coefficients of the functions $\cos kx$ and $\sin kx$ themselves have a particularly
simple form: $a_k = 1$ in the first case and $b_k = 1$ in the second. All the rest are zero.
For example, we should expect the $2\pi$-periodic function whose value on $]-\pi, \pi]$
is $x$ to have just sine coefficients: indeed this is the case: $a_n = 0$ and
$b_n = i(\hat{f}(n) - \hat{f}(-n)) = (-1)^{n+1}\, 2/n$ for $n > 0$.
The above question can then be reformulated as: to what extent is $f(x)$
represented by the Fourier series $a_0/2 + \sum_{n>0} (a_n \cos nx + b_n \sin nx)$? For instance,
how well does $\sum (-1)^{n+1} (2/n) \sin nx$ represent the $2\pi$-periodic sawtooth function
$f$ whose value on $]-\pi, \pi]$ is given by $f(x) = x$? The easy points are $x = 0$, $x = \pi$,
where the terms are identically zero. This gives the wrong value for $x = \pi$, but,
if we look at the periodic function near $\pi$, we see that it jumps from $\pi$ to $-\pi$, so
perhaps the mean of those values isn't a bad value for the series to converge to.
We could conclude that we had defined the function incorrectly to begin with and
that its value at the points $(2n + 1)\pi$ should have been zero anyway. In fact one
can show (ref. ) that the Fourier series converges at all other points to the given
values of $f$, but I shan't include the proof in this course. The convergence is not at
all uniform (it can't be, because the partial sums are continuous functions, but the
limit is discontinuous). In particular, setting $x = \pi/2$ we get the expansion
\[ \frac{\pi}{2} = 2(1 - 1/3 + 1/5 - \cdots) \]
which can also be deduced from the Taylor series for $\tan^{-1}$.
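The slow, non-uniform convergence described above can be observed by evaluating partial sums. The sketch below is illustrative only; the function name and the sample points are assumptions.

```python
import math

def sawtooth_partial_sum(x, terms):
    """Partial sum of sum_{n>=1} (-1)^(n+1) (2/n) sin(nx)."""
    return sum((-1) ** (n + 1) * (2.0 / n) * math.sin(n * x)
               for n in range(1, terms + 1))

for terms in (10, 100, 1000):
    print(terms,
          sawtooth_partial_sum(1.0, terms),          # slowly approaches f(1) = 1
          sawtooth_partial_sum(math.pi / 2, terms),  # slowly approaches pi/2
          sawtooth_partial_sum(math.pi, terms))      # every term vanishes up to rounding
```

At $x = \pi$ the partial sums stay at zero, the mean of the one-sided limits $\pi$ and $-\pi$, rather than at the value $f(\pi) = \pi$.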
1.2. The vibrating string. In this subsection we shall discuss the formal solutions
of the wave equation in a special case which Fourier dealt with in his work.
We discuss the wave equation
\[ \frac{\partial^2 y}{\partial x^2} = \frac{1}{K^2} \frac{\partial^2 y}{\partial t^2}, \tag{1.2} \]
subject to the boundary conditions
\[ y(0, t) = y(\pi, t) = 0, \tag{1.3} \]
for all $t \ge 0$, and the initial conditions
\[ y(x, 0) = F(x), \qquad y_t(x, 0) = 0. \]
This is a mathematical model of a string on a musical instrument (guitar, harp,
violin) which is of length $\pi$ and is plucked, i.e. held in the shape $F(x)$ and
released at time $t = 0$. The constant $K$ depends on the length, density and tension of
the string. We shall derive the formal solution (that is, a solution which assumes
existence and ignores questions of convergence or of domain of definition).
1.2.1. Separation of variables. We first look (as Fourier and others before him did)
for solutions of the form $y(x, t) = f(x)g(t)$. Feeding this into the wave equation
(1.2) we get
\[ f''(x) g(t) = \frac{1}{K^2} f(x) g''(t) \]
and so, dividing by $f(x)g(t)$, we have
\[ \frac{f''(x)}{f(x)} = \frac{1}{K^2} \frac{g''(t)}{g(t)}. \tag{1.4} \]
The left-hand side is an expression in $x$ alone, the right-hand side in $t$ alone. The
conclusion must be that they are both identically equal to the same constant $C$, say.
We have $f''(x) - Cf(x) = 0$ subject to the condition $f(0) = f(\pi) = 0$. Working
through the method of solving linear second order differential equations tells you
that the only solutions occur when $C = -n^2$ for some positive integer $n$ and the
corresponding solutions, up to constant multiples, are $f(x) = \sin nx$.
Returning to equation (1.4) gives the equation $g''(t) + K^2 n^2 g(t) = 0$ which has
the general solution $g(t) = a_n \cos Knt + b_n \sin Knt$. Thus the solutions we get
through separation of variables, using the boundary conditions but ignoring the
initial conditions, are
\[ y_n(x, t) = \sin nx\,(a_n \cos Knt + b_n \sin Knt), \]
for $n \ge 1$.
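One can check numerically that each $y_n$ satisfies (1.2) and (1.3). The sketch below uses central second differences; the values of $K$, $a_n$, $b_n$ and the sample point are hypothetical, chosen only for illustration.

```python
import math

K = 2.0  # hypothetical value of the constant K

def y_n(x, t, n, a=1.0, b=0.5):
    """Separated solution sin(nx) (a cos(Knt) + b sin(Knt)); a and b are sample values."""
    return math.sin(n * x) * (a * math.cos(K * n * t) + b * math.sin(K * n * t))

def second_difference(g, s, h=1e-3):
    """Central finite-difference approximation of g''(s)."""
    return (g(s + h) - 2 * g(s) + g(s - h)) / h ** 2

def wave_residual(x, t, n):
    """Approximate value of y_xx - (1/K^2) y_tt, which should be close to zero."""
    y_xx = second_difference(lambda s: y_n(s, t, n), x)
    y_tt = second_difference(lambda s: y_n(x, s, n), t)
    return y_xx - y_tt / K ** 2

print(wave_residual(0.7, 0.3, 3))                  # small, limited by the difference scheme
print(y_n(0.0, 0.3, 3), y_n(math.pi, 0.3, 3))      # boundary values, zero up to rounding
```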
1.2.2. Principle of Superposition. To get the general solution we just add together all
the solutions we have got so far, thus
\[ y(x, t) = \sum_{n=1}^{\infty} \sin nx\,(a_n \cos Knt + b_n \sin Knt) \tag{1.5} \]
ignoring questions of convergence. (We can do this for a finite sum without
difficulty because we are dealing with a linear differential equation: the iffy bit is to
extend to an infinite sum.)
We now apply the initial condition $y(x, 0) = F(x)$ (note $F$ has $F(0) = F(\pi) = 0$).
This gives
\[ F(x) = \sum_{n=1}^{\infty} a_n \sin nx. \]
We apply the reflection trick: the right-hand side is a series of odd functions so if
we extend $F$ to a function $G$ by reflection in the origin, giving
\[ G(x) := \begin{cases} F(x), & \text{if } 0 \le x \le \pi; \\ -F(-x), & \text{if } -\pi < x < 0, \end{cases} \]
we have
\[ G(x) = \sum_{n=1}^{\infty} a_n \sin nx, \]
for $-\pi \le x \le \pi$.
If we multiply through by $\sin rx$ and integrate term by term, we get
\[ a_r = \frac{1}{\pi} \int_{-\pi}^{\pi} G(x) \sin rx\,dx \]
so, assuming that this operation is valid, we find that the $a_n$ are precisely the sine
coefficients of $G$. (Those of you who took Real Analysis 2 last year may remember
that a sufficient condition for integrating term by term is that the series which is
integrated is itself uniformly convergent.)
If we now assume, further, that the right-hand side of (1.5) is differentiable
(term by term) we differentiate with respect to $t$, and set $t = 0$, to get
\[ 0 = y_t(x, 0) = \sum_{n=1}^{\infty} b_n K n \sin nx. \tag{1.6} \]
This equation is solved by the choice $b_n = 0$ for all $n$, so we have the following
result.
Proposition 1.8 (Formal). Assuming that the formal manipulations are valid, a solution
of the differential equation (1.2) with the given boundary and initial conditions is
\[ y(x, t) = \sum_{n=1}^{\infty} a_n \sin nx \cos Knt, \tag{2.11} \]
where the coefficients $a_n$ are the Fourier sine coefficients
\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} G(x) \sin nx\,dx \]
of the $2\pi$-periodic function $G$, defined on $]-\pi, \pi]$ by reflecting the graph of $F$ in the origin.
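To see the formal solution at work, here is a small sketch for one particular initial shape. Everything specific in it is an assumption made for illustration: the triangular pluck `pluck`, the value $K = 1$, the truncation at 59 terms, and the midpoint rule used for the coefficients.

```python
import math

K = 1.0  # hypothetical value of the constant K

def pluck(x):
    """Hypothetical initial shape F on [0, pi]: a triangle with F(0) = F(pi) = 0."""
    return x if x <= math.pi / 2 else math.pi - x

def odd_extension(x):
    """The function G: F reflected in the origin, on ]-pi, pi]."""
    return pluck(x) if x >= 0 else -pluck(-x)

def sine_coefficient(n, steps=4000):
    """a_n = (1/pi) * integral of G(x) sin(nx) over [-pi, pi], by the midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for j in range(steps):
        x = -math.pi + (j + 0.5) * h
        total += odd_extension(x) * math.sin(n * x)
    return total * h / math.pi

COEFFS = [sine_coefficient(n) for n in range(1, 60)]  # a_1, ..., a_59

def string_shape(x, t):
    """Truncated formal solution: sum of a_n sin(nx) cos(Knt)."""
    return sum(a * math.sin(n * x) * math.cos(K * n * t)
               for n, a in enumerate(COEFFS, start=1))

print(string_shape(1.0, 0.0), pluck(1.0))                   # initial shape roughly recovered
print(string_shape(0.0, 0.7), string_shape(math.pi, 0.7))   # the ends stay fixed
```

Whether the infinite series really solves the problem is exactly the convergence question that the formal derivation sets aside.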
Remark 1.9. This leaves us with the questions
(i) For which $F$ are the manipulations valid?
(ii) Is this the only solution of the differential equation? (which I'm not going
to try to answer.)
(iii) Is $b_n = 0$ for all $n$ the only solution of (1.6)? This is a special case of the
uniqueness problem for trigonometric series.
1.3. Historic: Joseph Fourier. Joseph Fourier, Civil Servant, Egyptologist, and
mathematician, was born in 1768 in Auxerre, France, son of a tailor. Debarred by birth
from a career in the artillery, he was preparing to become a Benedictine monk (in
order to be a teacher) when the French Revolution violently altered the course of
history and Fourier's life. He became president of the local revolutionary
committee, was arrested during the Terror, but released at the fall of Robespierre.
Fourier then became a pupil at the École Normale (the teachers' academy) in
Paris, studying under such great French mathematicians as Laplace and Lagrange.
He became a teacher at the École Polytechnique (the military academy).
He was ordered to serve as a scientist under Napoleon in Egypt. In 1801, Fourier
returned to France to become Prefect of the Grenoble region. Among his most
notable achievements in that office were the draining of some 20 thousand acres
of swamps and the building of a new road across the Alps.
During that time he wrote an important survey of Egyptian history ("a
masterpiece and a turning point in the subject").
In 1804 Fourier started the study of the theory of heat conduction, in the course
of which he systematically used the sine-and-cosine series which are named after
him. At the end of 1807, he submitted a memoir on this work to the Academy of
Science. The memoir proved controversial both in terms of his use of Fourier series
and of his derivation of the heat equation and was not accepted at that stage. He
was able to resubmit a revised version in 1811: this had several important new
features, including the introduction of the Fourier transform. With this version of his
memoir, he won the Academy's prize in mathematics. In 1817, Fourier was finally
elected to the Academy of Sciences and in 1822 his 1811 memoir was published as
"Théorie de la Chaleur".
For more details see Fourier Analysis by T.W. Körner, 475-480 and for even
more, see the biography by J. Herivel, Joseph Fourier: the man and the physicist.
What is Fourier analysis. The idea is to analyse functions (into sines and cosines
or, equivalently, complex exponentials) to find the underlying frequencies, their
strengths (and phases) and, where possible, to see if they can be recombined
(synthesis) into the original function. The answers will depend on the original
properties of the functions, which often come from physics (heat, electronic or sound
waves). This course will give basically a mathematical treatment and so will be
interested in mathematical classes of functions (continuity, differentiability
properties).
2. BASICS OF LINEAR SPACES
A person is solely the concentration of an infinite set of
interrelations with another and others, and to separate a person
from these relations means to take away any real meaning of
the life.
Vl. Soloviev
The space around us could be described as a three dimensional Euclidean space.
To single out a point of that space we need a fixed frame of reference and three
real numbers, which are the coordinates of the point. Similarly, to describe a pair of
points from our space we could use six coordinates; for three points, nine, and so
on. This makes it reasonable to consider Euclidean (linear) spaces of an arbitrary
finite dimension, which are studied in the courses of linear algebra.
The basic properties of Euclidean spaces are determined by their linear and metric
structures. The linear space (or vector space) structure allows us to add and subtract
vectors associated to points as well as to multiply vectors by real or complex
numbers (scalars).
The metric space structure assigns a distance (a non-negative real number) to a
pair of points or, equivalently, defines the length of a vector defined by that pair. A
metric (or, more generally, a topology) is essential for the definition of the core
analytical notions like limit or continuity. The importance of linear and metric
(topological) structure in analysis is sometimes encoded in the formula:
(2.1) Analysis = Algebra + Geometry .
On the other hand we could observe that many sets admit a sort of linear and
metric structures which are linked to each other. Just a few among many other
examples are:
• The set of convergent sequences;
• The set of continuous functions on [0, 1].
It is a very mathematical way of thinking to declare such sets to be spaces and call
their elements points.
But shall we lose all information on a particular element (e.g. a sequence $\{1/n\}$)
if we represent it by a shapeless and size-less "point" without any inner
configuration? Surprisingly not: all properties of an element could now be retrieved
not from its inner configuration but from interactions with other elements through
linear and metric structures. Such a "sociological" approach to all kinds of
mathematical objects was codified in the abstract category theory.
Another surprise is that starting from our three dimensional Euclidean space
and walking far away by a road of abstraction to infinite dimensional Hilbert
spaces we arrive at just another picture of the surrounding space, this
time in the language of quantum mechanics.
The distance from Manchester to Liverpool is 35 miles, just
about the mileage in the opposite direction!
A tourist guide to England
2.1. Banach spaces (basic definitions only). The following definition generalises
the notion of distance known from everyday life.
Definition 2.1. A metric (or distance function) $d$ on a set $M$ is a function
$d : M \times M \to \mathbb{R}_+$ from the set of pairs to non-negative real numbers such that:
(i) $d(x, y) \ge 0$ for all $x, y \in M$, $d(x, y) = 0$ implies $x = y$.
(ii) $d(x, y) = d(y, x)$ for all $x$ and $y$ in $M$.
(iii) $d(x, y) + d(y, z) \ge d(x, z)$ for all $x$, $y$, and $z$ in $M$ (triangle inequality).
Exercise 2.2. Let $M$ be the set of the UK's cities. Are the following functions metrics
on $M$?
(i) $d(A, B)$ is the price of a 2nd class railway ticket from $A$ to $B$.
(ii) $d(A, B)$ is the off-peak driving time from $A$ to $B$.
The following notion is a useful specialisation of a metric adapted to the linear
structure.
Definition 2.3. Let $V$ be a (real or complex) vector space. A norm on $V$ is a
real-valued function, written $\|x\|$, such that
(i) $\|x\| \ge 0$ for all $x \in V$, and $\|x\| = 0$ implies $x = 0$.
(ii) $\|\lambda x\| = |\lambda| \, \|x\|$ for all scalars $\lambda$ and vectors $x$.
(iii) $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality).
A vector space with a norm is called a normed space.
The connection between norm and metric is as follows:
Proposition 2.4. If $\|\cdot\|$ is a norm on $V$, then it gives a metric on $V$ by $d(x, y) = \|x - y\|$.
FIGURE 1. Triangle inequality in metric (a) and normed (b) spaces.
Proof. This is a simple exercise to derive items 2.1(i)-2.1(iii) of Definition 2.1 from
the corresponding items of Definition 2.3. For example, see Figure 1 to derive the
triangle inequality.
Important notions known from real analysis are limit and convergence.
In particular, we usually wish to have enough limiting points for all reasonable
sequences.
Definition 2.5. A sequence $\{x_k\}$ in a metric space $(M, d)$ is a Cauchy sequence, if for
every $\epsilon > 0$, there exists an integer $n$ such that $k, l > n$ implies that $d(x_k, x_l) < \epsilon$.
Definition 2.6. $(M, d)$ is a complete metric space if every Cauchy sequence in $M$
converges to a limit in $M$.
For example, the set of integers $\mathbb{Z}$ and reals $\mathbb{R}$ with the natural distance
functions are complete spaces, but the set of rationals $\mathbb{Q}$ is not. The complete normed
spaces deserve a special name.
Definition 2.7. A Banach space is a complete normed space.
Exercise* 2.8. A convenient way to define a norm in a Banach space is as follows.
The unit ball $U$ in a normed space $B$ is the set of $x$ such that $\|x\| \le 1$. Prove that:
(i) $U$ is a convex set, i.e. for $x, y \in U$ and $\lambda \in [0, 1]$ the point $\lambda x + (1 - \lambda) y$ is also
in $U$.
(ii) $\|x\| = \inf\{\lambda \in \mathbb{R}_+ \mid \lambda^{-1} x \in U\}$.
(iii) $U$ is closed if and only if the space is Banach.
FIGURE 2. Different unit balls defining norms in $\mathbb{R}^2$ from Example 2.9.
Example 2.9. Here are some examples of normed spaces.
(i) $\ell_2^n$ is either $\mathbb{R}^n$ or $\mathbb{C}^n$ with norm defined by
\[ \|(x_1, \ldots, x_n)\|_2 = \sqrt{|x_1|^2 + |x_2|^2 + \cdots + |x_n|^2}. \]
(ii) $\ell_1^n$ is either $\mathbb{R}^n$ or $\mathbb{C}^n$ with norm defined by
\[ \|(x_1, \ldots, x_n)\|_1 = |x_1| + |x_2| + \cdots + |x_n|. \]
(iii) $\ell_\infty^n$ is either $\mathbb{R}^n$ or $\mathbb{C}^n$ with norm defined by
\[ \|(x_1, \ldots, x_n)\|_\infty = \max(|x_1|, |x_2|, \ldots, |x_n|). \]
(iv) Let $X$ be a topological space, then $C_b(X)$ is the space of continuous bounded
functions $f : X \to \mathbb{C}$ with norm $\|f\|_\infty = \sup_X |f(x)|$.
(v) Let $X$ be any set, then $\ell_\infty(X)$ is the space of all bounded (not necessarily
continuous) functions $f : X \to \mathbb{C}$ with norm $\|f\|_\infty = \sup_X |f(x)|$.
All these normed spaces are also complete and thus are Banach spaces. Some more
examples of both complete and incomplete spaces shall appear later.
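The three finite-dimensional norms, and the metric that each induces as in Proposition 2.4, can be written down directly. This is a minimal sketch with illustrative function names and sample vectors, not code from the notes.

```python
import math

def norm_1(x):
    """The norm of Example 2.9(ii): sum of moduli."""
    return sum(abs(c) for c in x)

def norm_2(x):
    """The norm of Example 2.9(i): square root of the sum of squared moduli."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def norm_inf(x):
    """The norm of Example 2.9(iii): largest modulus."""
    return max(abs(c) for c in x)

def distance(norm, x, y):
    """Metric induced by a norm, d(x, y) = ||x - y||."""
    return norm([a - b for a, b in zip(x, y)])

x = [3.0, -4.0]
y = [1.0 + 2.0j, 0.5]
for norm in (norm_1, norm_2, norm_inf):
    sum_norm = norm([a + b for a, b in zip(x, y)])
    # triangle inequality ||x + y|| <= ||x|| + ||y|| for each norm
    print(norm.__name__, norm(x), sum_norm <= norm(x) + norm(y))
```

A finite check like this only illustrates the axioms on sample vectors; it is not a proof.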
We need an extra space to accommodate this product!
A manager to a shop assistant
2.2. Hilbert spaces. Although metric and norm capture important geometric
information about linear spaces they are not sensitive enough to represent such
geometric characterisations as angles (particularly orthogonality). To this end we need
a further refinement.
From courses of linear algebra it is known that the scalar product
$\langle x, y \rangle = x_1 y_1 + \cdots + x_n y_n$ is important in the space $\mathbb{R}^n$ and defines a norm
by $\|x\|^2 = \langle x, x \rangle$. Here is a suitable generalisation:
Definition 2.10. A scalar product (or inner product) on a real or complex vector space
$V$ is a mapping $V \times V \to \mathbb{C}$, written $\langle x, y \rangle$, that satisfies:
(i) $\langle x, x \rangle \ge 0$ and $\langle x, x \rangle = 0$ implies $x = 0$.
(ii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$ in complex spaces and $\langle x, y \rangle = \langle y, x \rangle$ in real ones for all
$x, y \in V$.
(iii) $\langle \lambda x, y \rangle = \lambda \langle x, y \rangle$, for all $x, y \in V$ and scalar $\lambda$. (What is $\langle x, \lambda y \rangle$?)
(iv) $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$, for all $x$, $y$, and $z \in V$. (What is $\langle x, y + z \rangle$?)
The last two properties of the scalar product are often encoded in the phrase: "it is
linear in the first variable if we fix the second and anti-linear in the second if we
fix the first".
Definition 2.11. An inner product space $V$ is a real or complex vector space with a
scalar product on it.
Example 2.12. Here are some examples of inner product spaces which demonstrate
that the expression $\|x\| = \sqrt{\langle x, x \rangle}$ defines a norm.
(i) The inner product for $\mathbb{R}^n$ was defined in the beginning of this section.
The inner product for $\mathbb{C}^n$ is given by $\langle x, y \rangle = \sum_{j=1}^{n} x_j \bar{y}_j$. The norm
$\|x\| = \sqrt{\sum_{j=1}^{n} |x_j|^2}$ makes it $\ell_2^n$ from Example 2.9(i).
(ii) The extension for infinite vectors: let $\ell_2$ be
\[ \ell_2 = \Bigl\{ \text{sequences } \{x_j\}_1^\infty \;\Bigm|\; \sum_{1}^{\infty} |x_j|^2 < \infty \Bigr\}. \tag{2.2} \]
Let us equip this set with the operations of term-wise addition and
multiplication by scalars, then $\ell_2$ is closed under them. Indeed, this follows from the
triangle inequality and properties of absolutely convergent series. From
the standard Cauchy-Bunyakovskii-Schwarz inequality it follows that the
series $\sum_{1}^{\infty} x_j \bar{y}_j$ converges absolutely and its sum is defined to be $\langle x, y \rangle$.
(iii) Let $C_b[a, b]$ be the space of continuous functions on the interval $[a, b] \subset \mathbb{R}$.
As we learn from Example 2.9(iv), it is a normed
space with the norm $\|f\|_\infty = \sup_{[a,b]} |f(x)|$. We could also define an
inner product:
\[ \langle f, g \rangle = \int_a^b f(x) \bar{g}(x)\,dx \quad \text{and} \quad \|f\|_2 = \Bigl( \int_a^b |f(x)|^2\,dx \Bigr)^{1/2}. \tag{2.3} \]
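Now we state, probably, the most important inequality in analysis.
Theorem 2.13 (Cauchy-Schwarz-Bunyakovskii inequality). For vectors $x$ and $y$ in
an inner product space $V$ let us define $\|x\| = \sqrt{\langle x, x \rangle}$ and $\|y\| = \sqrt{\langle y, y \rangle}$; then we have
\[ |\langle x, y \rangle| \le \|x\| \, \|y\|, \tag{2.4} \]
with equality if and only if $x$ and $y$ are scalar multiples of each other.
Proof. For any $x, y \in V$ and any $t \in \mathbb{R}$ we have:
\[ 0 \le \langle x + ty, x + ty \rangle = \langle x, x \rangle + 2t \Re \langle y, x \rangle + t^2 \langle y, y \rangle. \]
Thus the discriminant of this quadratic expression in $t$ is non-positive:
$(\Re \langle y, x \rangle)^2 - \|x\|^2 \|y\|^2 \le 0$, that is $|\Re \langle x, y \rangle| \le \|x\| \, \|y\|$. Replacing $y$ by $e^{i\alpha} y$ for an arbitrary
$\alpha \in [-\pi, \pi]$ we get $|\Re (e^{i\alpha} \langle x, y \rangle)| \le \|x\| \, \|y\|$. Choosing $\alpha$ such that $e^{i\alpha} \langle x, y \rangle$ is real,
this implies the desired inequality.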
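The inequality (2.4) and its equality case can be probed on random vectors of $\mathbb{C}^n$. The sketch below is illustrative: the helper names, the dimension 4 and the fixed random seed are assumptions.

```python
import math
import random

def inner(x, y):
    """Inner product on C^n: linear in x, anti-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)

random.seed(0)  # fixed seed so that the run is reproducible

def random_vector(n):
    return [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]

for _ in range(5):
    x, y = random_vector(4), random_vector(4)
    print(abs(inner(x, y)) <= norm(x) * norm(y))

# Equality case: y is a scalar multiple of x.
x = random_vector(4)
y = [(2 - 3j) * a for a in x]
print(abs(inner(x, y)), norm(x) * norm(y))  # the two numbers agree up to rounding
```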

Corollary 2.14. Any inner product space is a normed space with norm $\|x\| = \sqrt{\langle x, x \rangle}$
(hence also a metric space, Prop. 2.4).
Proof. Just check items 2.3(i)-2.3(iii) from Definition 2.3.
Again complete inner product spaces deserve a special name.
Definition 2.15. A complete inner product space is called a Hilbert space.
The relations between spaces introduced so far are as follows:
\[ \begin{array}{ccccc} \text{Hilbert spaces} & \Rightarrow & \text{Banach spaces} & \Rightarrow & \text{Complete metric spaces} \\ \Downarrow & & \Downarrow & & \Downarrow \\ \text{inner product spaces} & \Rightarrow & \text{normed spaces} & \Rightarrow & \text{metric spaces.} \end{array} \]
How can we tell if a given norm comes from an inner product?
Theorem 2.16 (Parallelogram identity). In an inner product space $H$ we have for all $x$
and $y \in H$ (see Figure 3):
\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2. \tag{2.5} \]
FIGURE 3. To the parallelogram identity.
Proof. Just by linearity of the inner product:
\[ \langle x + y, x + y \rangle + \langle x - y, x - y \rangle = 2 \langle x, x \rangle + 2 \langle y, y \rangle, \]
because the cross terms cancel out.
Exercise 2.17. Show that (2.5) is also a sufficient condition for a norm to arise from
an inner product. Namely, for a norm on a complex Banach space satisfying (2.5)
the formula
\[ \langle x, y \rangle = \frac{1}{4} \Bigl( \|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2 \Bigr) = \frac{1}{4} \sum_{k=0}^{3} i^k \bigl\| x + i^k y \bigr\|^2 \tag{2.6} \]
defines an inner product. What is a suitable formula for a real Banach space?
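Both the parallelogram identity (2.5) and the polarization formula (2.6) are easy to test numerically in $\mathbb{C}^n$. In the sketch below the helper names and sample vectors are illustrative assumptions; the $\ell_1$ norm is included only to show a norm for which (2.5) fails.

```python
import math

def inner(x, y):
    """Inner product on C^n: linear in x, anti-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_2(x):
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def norm_1(x):
    return sum(abs(c) for c in x)

def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2; zero when (2.5) holds."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

def polarization(norm, x, y):
    """Right-hand side of (2.6): (1/4) * sum over k = 0..3 of i^k ||x + i^k y||^2."""
    return sum(1j ** k * norm([a + 1j ** k * b for a, b in zip(x, y)]) ** 2
               for k in range(4)) / 4

x = [1 + 2j, -1j, 0.5]
y = [2 - 1j, 3, 1j]
print(parallelogram_defect(norm_2, x, y))            # zero up to rounding
print(parallelogram_defect(norm_1, [1, 0], [0, 1]))  # 4 + 4 - 2 - 2 = 4, so (2.5) fails
print(polarization(norm_2, x, y), inner(x, y))       # the two values agree up to rounding
```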
Divide and rule!
Old but still much used recipe
2.3. Subspaces. To study Hilbert spaces we may use the traditional mathematical
technique of analysis and synthesis: we split the initial Hilbert space into smaller
and probably simpler subsets, investigate them separately, and then reconstruct
the entire picture from these parts.
As known from linear algebra, a linear subspace is a subset of a linear space
which inherits the linear structure, i.e. the possibility to add vectors and
multiply them by scalars. In this course we also need subspaces to inherit the
topological structure (coming either from a norm or an inner product).
Definition 2.18. By a subspace of a normed space (or inner product space) we mean
a linear subspace with the same norm (inner product respectively). We write $X \subset Y$
or $X \subseteq Y$.
Example 2.19. (i) $C_b(X) \subset \ell_\infty(X)$ where $X$ is a metric space.
(ii) Any linear subspace of $\mathbb{R}^n$ or $\mathbb{C}^n$ with any norm given in Example 2.9(i)-2.9(iii).
(iii) Let $c_{00}$ be the space of finite sequences, i.e. all sequences $(x_n)$ such that there exists
$N$ with $x_n = 0$ for $n > N$. This is a subspace of $\ell_2$ since $\sum_{1}^{\infty} |x_j|^2$ is a finite
sum, so finite.
We also wish that both inherited structures (linear and topological) should be
in agreement, i.e. the subspace should be complete. Such inheritance is linked to
the property of being closed.
A subspace need not be closed: for example the sequence
\[ x = (1, 1/2, 1/3, 1/4, \ldots) \in \ell_2 \quad \text{because} \quad \sum 1/k^2 < \infty \]
and $x_n = (1, 1/2, \ldots, 1/n, 0, 0, \ldots) \in c_{00}$ converges to $x$, thus $x \in \overline{c_{00}} \subset \ell_2$,
although $x \notin c_{00}$.
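The convergence $x_n \to x$ in $\ell_2$ can be made quantitative: the squared distance is the tail $\sum_{k>n} 1/k^2$. The sketch below approximates this tail by a long finite sum; the function name and the cut-off `tail_terms` are assumptions made for illustration.

```python
import math

def truncation_distance(n, tail_terms=200000):
    """Approximate ell_2 distance between x = (1/k) and x_n = (1, ..., 1/n, 0, 0, ...).

    The exact squared distance is the sum of 1/k^2 over all k > n; here the tail
    is cut after `tail_terms` terms, so the value is a slight underestimate.
    """
    return math.sqrt(sum(1.0 / k ** 2 for k in range(n + 1, n + 1 + tail_terms)))

for n in (1, 10, 100, 1000):
    print(n, truncation_distance(n))  # decreases roughly like 1/sqrt(n)
```

Each $x_n$ has only finitely many non-zero entries, while the limit $x$ has none equal to zero, which is why the limit leaves $c_{00}$.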
Proposition 2.20. (i) Any closed subspace of a Banach/Hilbert space is complete,
hence also a Banach/Hilbert space.
(ii) Any complete subspace is closed.
(iii) The closure of a subspace is again a subspace.
Proof. (i) This is true in any complete metric space $X$: any Cauchy sequence from $Y$ has
a limit $x \in X$ belonging to $\bar{Y}$, but if $Y$ is closed then $x \in Y$.
(ii) Let $Y$ be complete and $x \in \bar{Y}$, then there is a sequence $x_n \to x$ in $Y$ and it is a
Cauchy sequence. Then completeness of $Y$ implies $x \in Y$.
(iii) If $x, y \in \bar{Y}$ then there are $x_n$ and $y_n$ in $Y$ such that $x_n \to x$ and $y_n \to y$.
From the triangle inequality:
\[ \|(x_n + y_n) - (x + y)\| \le \|x_n - x\| + \|y_n - y\| \to 0, \]
so $x_n + y_n \to x + y$ and $x + y \in \bar{Y}$. Similarly $x \in \bar{Y}$ implies $\lambda x \in \bar{Y}$ for any
$\lambda$.
Hence $c_{00}$ is an incomplete inner product space, with inner product
$\langle x, y \rangle = \sum_{1}^{\infty} x_k \bar{y}_k$ (this is a finite sum!) as it is not closed in $\ell_2$.
FIGURE 4. Jump function on (b) as an $L_2$ limit of continuous functions from (a). [Panel (a): axis marks at $\frac12 - \frac1n$ and $\frac12 + \frac1n$, height 1; panel (b): axis mark at $\frac12$, height 1.]
Similarly $C[0, 1]$ with the inner product norm $\|f\| = \left(\int_0^1 |f(t)|^2\,dt\right)^{1/2}$ is incomplete. Take the larger space $X$ of functions continuous on $[0, 1]$ except for a possible jump at $\frac12$ (i.e. the left and right limits exist but may be unequal, and $f(\frac12) = \lim_{t \to \frac12+} f(t)$).
Then the sequence of functions defined on Figure 4(a) has the limit shown on Figure 4(b) since:
$$\|f - f_n\|^2 = \int_{\frac12 - \frac1n}^{\frac12 + \frac1n} |f - f_n|^2\,dt < \frac{2}{n} \to 0.$$
Obviously $f \in \overline{C[0, 1]} \setminus C[0, 1]$.
Exercise 2.21. Show alternatively that the sequence of functions $f_n$ from Figure 4(a) is a Cauchy sequence in $C[0, 1]$ but has no continuous limit.
Similarly the space $C[a, b]$ is incomplete for any $a < b$ if equipped with the inner product and the corresponding norm:
$$\langle f, g\rangle = \int_a^b f(t)\,\bar{g}(t)\,dt, \tag{2.7}$$
$$\|f\|_2 = \left(\int_a^b |f(t)|^2\,dt\right)^{1/2}. \tag{2.8}$$
Definition 2.22. Define a Hilbert space $L_2[a, b]$ to be the smallest complete inner product space containing the space $C[a, b]$ with the restriction of the inner product given by (2.7).
It is practical to realise $L_2[a, b]$ as a certain space of "functions" with the inner product defined via an integral. There are several ways to do that and we mention just two:
(i) Elements of $L_2[a, b]$ are equivalence classes of Cauchy sequences $f^{(n)}$ of functions from $C[a, b]$.
(ii) Let integration be extended from the Riemann definition to the wider Lebesgue integration (see Section 13). Let $L$ be the set of functions on $[a, b]$ which are square integrable in the Lebesgue sense, i.e. have a finite norm (2.8). Then $L_2[a, b]$ is the quotient space of $L$ with respect to the equivalence relation $f \sim g \Leftrightarrow \|f - g\|_2 = 0$.
Example 2.23. Let the Dirichlet function on $[0, 1]$ be defined as follows:
$$f(t) = \begin{cases} 1, & t \in \mathbb{Q}; \\ 0, & t \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$
This function is not integrable in the Riemann sense but does have a Lebesgue integral. The latter however is equal to $0$, and as an $L_2$-function the Dirichlet function is equivalent to the function identically equal to $0$.
(iii) The third possibility is to map $L_2(\mathbb{R})$ onto a space of "true" functions but with an additional structure. For example, in quantum mechanics it is useful to work with the Segal–Bargmann space of analytic functions on $\mathbb{C}$ with the inner product:
$$\langle f_1, f_2\rangle = \int_{\mathbb{C}} f_1(z)\,\bar{f}_2(z)\,e^{-|z|^2}\,dz.$$
Theorem 2.24. The sequence space $\ell_2$ is complete, hence a Hilbert space.
Proof. Take a Cauchy sequence $x^{(n)} \in \ell_2$, where $x^{(n)} = (x^{(n)}_1, x^{(n)}_2, x^{(n)}_3, \ldots)$. Our proof will have three steps: identify the limit $x$; show it is in $\ell_2$; show $x^{(n)} \to x$.
(i) If $x^{(n)}$ is a Cauchy sequence in $\ell_2$ then $x^{(n)}_k$ is also a Cauchy sequence of numbers for any fixed $k$:
$$\left|x^{(n)}_k - x^{(m)}_k\right| \le \left(\sum_{k=1}^\infty \left|x^{(n)}_k - x^{(m)}_k\right|^2\right)^{1/2} = \left\|x^{(n)} - x^{(m)}\right\| \to 0.$$
Let $x_k$ be the limit of $x^{(n)}_k$.
(ii) For a given $\epsilon > 0$ find $n_0$ such that $\left\|x^{(n)} - x^{(m)}\right\| < \epsilon$ for all $n, m > n_0$. For any $K$ and $m$:
$$\sum_{k=1}^K \left|x^{(n)}_k - x^{(m)}_k\right|^2 \le \left\|x^{(n)} - x^{(m)}\right\|^2 < \epsilon^2.$$
Let $m \to \infty$, then $\sum_{k=1}^K \left|x^{(n)}_k - x_k\right|^2 \le \epsilon^2$.
Let $K \to \infty$, then $\sum_{k=1}^\infty \left|x^{(n)}_k - x_k\right|^2 \le \epsilon^2$. Thus $x^{(n)} - x \in \ell_2$ and, because $\ell_2$ is a linear space, $x = x^{(n)} - (x^{(n)} - x)$ is also in $\ell_2$.
(iii) We saw above that for any $\epsilon > 0$ there is $n_0$ such that $\left\|x^{(n)} - x\right\| \le \epsilon$ for all $n > n_0$. Thus $x^{(n)} \to x$.
Consequently $\ell_2$ is complete.
All good things are covered by a thick layer of chocolate (well,
if something is not yet, it certainly will)
2.4. Linear spans. As was explained in the introduction to Section 2, we describe "internal" properties of a vector through its relations to other vectors. For a detailed description we need sufficiently many external reference points.
Let $A$ be a subset (finite or infinite) of a normed space $V$. We may wish to upgrade it to a linear subspace in order to make it subject to our theory.
Definition 2.25. The linear span of $A$, written $\mathrm{Lin}(A)$, is the intersection of all linear subspaces of $V$ containing $A$, i.e. the smallest subspace containing $A$, equivalently the set of all finite linear combinations of elements of $A$. The closed linear span of $A$, written $\mathrm{CLin}(A)$, is the intersection of all closed linear subspaces of $V$ containing $A$, i.e. the smallest closed subspace containing $A$.
Exercise* 2.26. (i) Show that if $A$ is a subset of a finite dimensional space then $\mathrm{Lin}(A) = \mathrm{CLin}(A)$.
(ii) Show that for an infinite $A$ the spaces $\mathrm{Lin}(A)$ and $\mathrm{CLin}(A)$ could be different. (Hint: use Example 2.19(iii).)
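Proposition 2.27. $\overline{\mathrm{Lin}(A)} = \mathrm{CLin}(A)$.
Proof. Clearly $\overline{\mathrm{Lin}(A)}$ is a closed subspace containing $A$, thus it should contain $\mathrm{CLin}(A)$. Also $\mathrm{Lin}(A) \subset \mathrm{CLin}(A)$, thus $\overline{\mathrm{Lin}(A)} \subset \overline{\mathrm{CLin}(A)} = \mathrm{CLin}(A)$. Therefore $\overline{\mathrm{Lin}(A)} = \mathrm{CLin}(A)$.
Consequently $\mathrm{CLin}(A)$ is the set of all limiting points of finite linear combinations of elements of $A$.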
Example 2.28. Let $V = C[a, b]$ with the sup norm $\|\cdot\|_\infty$. Then:
$\mathrm{Lin}\{1, x, x^2, \ldots\} = \{\text{all polynomials}\}$,
$\mathrm{CLin}\{1, x, x^2, \ldots\} = C[a, b]$ by the Weierstrass approximation theorem proved later.
The following simple result will be used later many times without comments.
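Lemma 2.29 (about Inner Product Limit). Suppose $H$ is an inner product space and sequences $x_n$ and $y_n$ have limits $x$ and $y$ correspondingly. Then $\langle x_n, y_n\rangle \to \langle x, y\rangle$ or equivalently:
$$\lim_{n\to\infty} \langle x_n, y_n\rangle = \left\langle \lim_{n\to\infty} x_n, \lim_{n\to\infty} y_n \right\rangle.$$
Proof. Obviously by the Cauchy–Schwarz inequality:
$$\begin{aligned}
|\langle x_n, y_n\rangle - \langle x, y\rangle| &= |\langle x_n - x, y_n\rangle + \langle x, y_n - y\rangle| \\
&\le |\langle x_n - x, y_n\rangle| + |\langle x, y_n - y\rangle| \\
&\le \|x_n - x\|\,\|y_n\| + \|x\|\,\|y_n - y\| \to 0,
\end{aligned}$$
since $\|x_n - x\| \to 0$, $\|y_n - y\| \to 0$, and $\|y_n\|$ is bounded.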
3. ORTHOGONALITY
Pythagoras is forever!
The catchphrase from TV commercial of Hilbert Spaces
course
As was mentioned in the introduction, Hilbert spaces are an analogue of our 3D Euclidean space, and the theory of Hilbert spaces is similar to plane or space geometry. One of the primary results of Euclidean geometry, which still survives in the high school curriculum despite its continuous nasty de-geometrisation, is Pythagoras' theorem based on the notion of orthogonality (see footnote 1).
So far we were concerned only with distances between points. Now we would like to study angles between vectors and notably right angles. Pythagoras' theorem states that if the angle $C$ in a triangle is right then $c^2 = a^2 + b^2$, see Figure 5.
FIGURE 5. The Pythagoras' theorem $c^2 = a^2 + b^2$ [a right triangle with sides labelled $a$, $b$, $c$].
It is a very mathematical way of thinking to turn this property of right angles into their definition, which will work even in infinite dimensional Hilbert spaces.
Look for a triangle, or even for a right triangle
A universal advice in solving problems from elementary
geometry.
3.1. Orthogonal System in Hilbert Space. In inner product spaces it is even more convenient to give a definition of orthogonality not from Pythagoras' theorem but from an equivalent property of the inner product.
Definition 3.1. Two vectors $x$ and $y$ in an inner product space are orthogonal if $\langle x, y\rangle = 0$, written $x \perp y$.
An orthogonal sequence (or orthogonal system) $e_n$ (finite or infinite) is one in which $e_n \perp e_m$ whenever $n \ne m$.
An orthonormal sequence (or orthonormal system) $e_n$ is an orthogonal sequence with $\|e_n\| = 1$ for all $n$.
Exercise 3.2. (i) Show that if $x \perp x$ then $x = 0$ and consequently $x \perp y$ for any $y \in H$.
(ii) Show that if all vectors of an orthogonal system are non-zero then they are linearly independent.
Example 3.3. These are orthonormal sequences:
(i) Basis vectors $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$ in $\mathbb{R}^3$ or $\mathbb{C}^3$.
(ii) Vectors $e_n = (0, \ldots, 0, 1, 0, \ldots)$ (with the only $1$ on the $n$th place) in $\ell_2$. (Could you see a similarity with the previous example?)
(iii) Functions $e_n(t) = \frac{1}{\sqrt{2\pi}} e^{int}$, $n \in \mathbb{Z}$, in $C[0, 2\pi]$:
$$\langle e_n, e_m\rangle = \int_0^{2\pi} \frac{1}{2\pi}\, e^{int} e^{-imt}\,dt = \begin{cases} 1, & n = m; \\ 0, & n \ne m. \end{cases} \tag{3.1}$$
Exercise 3.4. Let $A$ be a subset of an inner product space $V$ and $x \perp y$ for any $y \in A$. Prove that $x \perp z$ for all $z \in \mathrm{CLin}(A)$.
Theorem 3.5 (Pythagoras). If $x \perp y$ then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$. Also if $e_1, \ldots, e_n$ is orthonormal then
$$\left\|\sum_1^n a_k e_k\right\|^2 = \left\langle \sum_1^n a_k e_k, \sum_1^n a_k e_k \right\rangle = \sum_1^n |a_k|^2.$$
Footnote 1. Some more "strange" types of orthogonality can be seen in the paper Elliptic, Parabolic and Hyperbolic Analytic Function Theory–1: Geometry of Invariants.
Proof. A one-line calculation.
The following theorem provides an important property of Hilbert spaces which will be used many times. Recall that a subset $K$ of a linear space $V$ is convex if for all $x, y \in K$ and $\lambda \in [0, 1]$ the point $\lambda x + (1 - \lambda) y$ is also in $K$. In particular, any subspace is convex and any unit ball as well (see Exercise 2.8(i)).
Theorem 3.6 (about the Nearest Point). Let $K$ be a non-empty convex closed subset of a Hilbert space $H$. For any point $x \in H$ there is the unique point $y \in K$ nearest to $x$.
Proof. Let $d = \inf_{y \in K} d(x, y)$, where $d(x, y)$ is the distance coming from the norm $\|x\| = \sqrt{\langle x, x\rangle}$, and let $y_n$ be a sequence of points in $K$ such that $\lim_{n\to\infty} d(x, y_n) = d$. Then $y_n$ is a Cauchy sequence. Indeed from the parallelogram identity for the parallelogram generated by vectors $x - y_n$ and $x - y_m$ we have:
$$\|y_n - y_m\|^2 = 2\|x - y_n\|^2 + 2\|x - y_m\|^2 - \|2x - y_n - y_m\|^2.$$
Note that $\|2x - y_n - y_m\|^2 = 4\left\|x - \frac{y_n + y_m}{2}\right\|^2 \ge 4d^2$ since $\frac{y_n + y_m}{2} \in K$ by its convexity. For sufficiently large $m$ and $n$ we get $\|x - y_m\|^2 \le d^2 + \epsilon$ and $\|x - y_n\|^2 \le d^2 + \epsilon$, thus $\|y_n - y_m\|^2 \le 4(d^2 + \epsilon) - 4d^2 = 4\epsilon$, i.e. $y_n$ is a Cauchy sequence.
Let $y$ be the limit of $y_n$, which exists by the completeness of $H$, then $y \in K$ since $K$ is closed. Then $d(x, y) = \lim_{n\to\infty} d(x, y_n) = d$. This shows the existence of the nearest point. Let $y'$ be another point in $K$ such that $d(x, y') = d$, then the parallelogram identity implies:
$$\|y - y'\|^2 = 2\|x - y\|^2 + 2\|x - y'\|^2 - \|2x - y - y'\|^2 \le 4d^2 - 4d^2 = 0.$$
This shows the uniqueness of the nearest point.
Exercise* 3.7. The essential role of the parallelogram identity in the above proof indicates that the theorem does not hold in a general Banach space.
(i) Show that in $\mathbb{R}^2$ with either norm $\|\cdot\|_1$ or $\|\cdot\|_\infty$ from Example 2.9 the nearest point could be non-unique;
(ii) Could you construct an example (in a Banach space) when the nearest point does not exist?
Liberté, Égalité, Fraternité!
A longstanding ideal approximated in the real life by
something completely different
3.2. Bessel's inequality. For the case when a convex subset is a subspace we could characterise the nearest point in terms of orthogonality.
Theorem 3.8 (on Perpendicular). Let $M$ be a subspace of a Hilbert space $H$ and a point $x \in H$ be fixed. Then $z \in M$ is the nearest point to $x$ if and only if $x - z$ is orthogonal to any vector in $M$.
Proof. Let $z$ be the nearest point to $x$, existing by the previous Theorem. We claim that $x - z$ is orthogonal to any vector in $M$; otherwise there exists $y \in M$ such that $\langle x - z, y\rangle \ne 0$. Then
$$\|x - z - \epsilon y\|^2 = \|x - z\|^2 - 2\epsilon\,\Re\langle x - z, y\rangle + \epsilon^2 \|y\|^2 < \|x - z\|^2,$$
if $\epsilon$ is chosen to be small enough and such that $\epsilon\,\Re\langle x - z, y\rangle$ is positive, see Figure 6(i). Therefore we get a contradiction with the statement that $z$ is the closest point to $x$.
On the other hand if $x - z$ is orthogonal to all vectors in $M$ then in particular $(x - z) \perp (z - y)$ for all $y \in M$, see Figure 6(ii). Since $x - y = (x - z) + (z - y)$ we get by Pythagoras' theorem:
$$\|x - y\|^2 = \|x - z\|^2 + \|z - y\|^2.$$
So $\|x - y\|^2 \ge \|x - z\|^2$ and they are equal if and only if $z = y$.
FIGURE 6. (i) A smaller distance for a non-perpendicular direction; and (ii) Best approximation from a subspace.
Exercise 3.9. The above proof does not work if $\langle x - z, y\rangle$ is an imaginary number. What is to be done in this case?
Consider now a basic case of approximation: let $x \in H$ be fixed and $e_1, \ldots, e_n$ be orthonormal and denote $H_1 = \mathrm{Lin}\{e_1, \ldots, e_n\}$. We could try to approximate $x$ by a vector $y = \lambda_1 e_1 + \cdots + \lambda_n e_n \in H_1$.
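Corollary 3.10. The minimal value of $\|x - y\|$ for $y \in H_1$ is achieved when $y = \sum_1^n \langle x, e_i\rangle e_i$.
Proof. Let $z = \sum_1^n \langle x, e_i\rangle e_i$, then $\langle x - z, e_i\rangle = \langle x, e_i\rangle - \langle z, e_i\rangle = 0$. By the previous Theorem $z$ is the nearest point to $x$.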
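Example 3.11. (i) In $\mathbb{R}^3$ find the best approximation to $(1, 0, 0)$ from the plane $V: \{x_1 + x_2 + x_3 = 0\}$. We take an orthonormal basis $e_1 = (2^{-1/2}, -2^{-1/2}, 0)$, $e_2 = (6^{-1/2}, 6^{-1/2}, -2 \cdot 6^{-1/2})$ of $V$ (Check this!). Then:
$$z = \langle x, e_1\rangle e_1 + \langle x, e_2\rangle e_2 = \left(\frac12, -\frac12, 0\right) + \left(\frac16, \frac16, -\frac13\right) = \left(\frac23, -\frac13, -\frac13\right).$$
(ii) In $C[0, 2\pi]$ what is the best approximation to $f(t) = t$ by functions $a + b e^{it} + c e^{-it}$? Let
$$e_0 = \frac{1}{\sqrt{2\pi}}, \quad e_1 = \frac{1}{\sqrt{2\pi}} e^{it}, \quad e_{-1} = \frac{1}{\sqrt{2\pi}} e^{-it}.$$
We find:
$$\langle f, e_0\rangle = \int_0^{2\pi} \frac{t}{\sqrt{2\pi}}\,dt = \left[\frac{t^2}{2} \frac{1}{\sqrt{2\pi}}\right]_0^{2\pi} = \sqrt{2}\,\pi^{3/2};$$
$$\langle f, e_1\rangle = \int_0^{2\pi} \frac{t e^{-it}}{\sqrt{2\pi}}\,dt = i\sqrt{2\pi} \quad \text{(Check this!)}$$
$$\langle f, e_{-1}\rangle = \int_0^{2\pi} \frac{t e^{it}}{\sqrt{2\pi}}\,dt = -i\sqrt{2\pi} \quad \text{(Why may we not check this one?)}$$
Then the best approximation is (see Figure 7):
$$f_0(t) = \langle f, e_0\rangle e_0 + \langle f, e_1\rangle e_1 + \langle f, e_{-1}\rangle e_{-1} = \frac{\sqrt{2}\,\pi^{3/2}}{\sqrt{2\pi}} + i e^{it} - i e^{-it} = \pi - 2\sin t.$$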
FIGURE 7. Best approximation by three trigonometric polynomials
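Both parts of Example 3.11 can be verified numerically with the formula from Corollary 3.10. The sketch below assumes NumPy is available; the grid size `N`, the midpoint quadrature and the helper name `coeff` are our own illustrative choices, not part of the notes.

```python
import numpy as np

# Part (i): project x = (1, 0, 0) onto the plane x1 + x2 + x3 = 0
# using the orthonormal basis e1, e2 from the example.
e1 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)
e2 = np.array([1.0, 1.0, -2.0]) / np.sqrt(6.0)
x = np.array([1.0, 0.0, 0.0])
z = np.dot(x, e1) * e1 + np.dot(x, e2) * e2
print(z)

# Part (ii): Fourier coefficients of f(t) = t on [0, 2*pi] by a midpoint rule.
N = 20000
dt = 2 * np.pi / N
t = (np.arange(N) + 0.5) * dt

def coeff(n):
    """Approximate <f, e_n> = int_0^{2 pi} t * conj(e_n(t)) dt."""
    e_n = np.exp(1j * n * t) / np.sqrt(2 * np.pi)
    return np.sum(t * np.conj(e_n)) * dt

# Best approximation sum_n <f, e_n> e_n over n = -1, 0, 1.
best = sum(coeff(n) * np.exp(1j * n * t) / np.sqrt(2 * np.pi) for n in (-1, 0, 1))
print(np.max(np.abs(best - (np.pi - 2 * np.sin(t)))))
```

The first printed vector should agree with $(2/3, -1/3, -1/3)$ and the second printed number should be close to zero, up to quadrature error.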
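Corollary 3.12 (Bessel's inequality). If $(e_i)$ is orthonormal then
$$\|x\|^2 \ge \sum_{i=1}^\infty |\langle x, e_i\rangle|^2.$$
Proof. Let $z = \sum_1^n \langle x, e_i\rangle e_i$, then $x - z \perp e_i$ for all $i$, therefore by Exercise 3.4 $x - z \perp z$. Hence:
$$\|x\|^2 = \|z\|^2 + \|x - z\|^2 \ge \|z\|^2 = \sum_{i=1}^n |\langle x, e_i\rangle|^2.$$
Since this holds for every $n$, the inequality follows for the infinite sum.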
Did you say "rice and fish for them"?
A student question
3.3. The Riesz–Fischer theorem. When $(e_i)$ is orthonormal we call $\langle x, e_n\rangle$ the $n$th Fourier coefficient of $x$ (with respect to $(e_i)$, naturally).
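Theorem 3.13 (Riesz–Fischer). Let $(e_n)_1^\infty$ be an orthonormal sequence in a Hilbert space $H$. Then $\sum_1^\infty \lambda_n e_n$ converges in $H$ if and only if $\sum_1^\infty |\lambda_n|^2 < \infty$. In this case $\left\|\sum_1^\infty \lambda_n e_n\right\|^2 = \sum_1^\infty |\lambda_n|^2$.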
Proof. Necessity: Let $x_k = \sum_1^k \lambda_n e_n$ and $x = \lim_{k\to\infty} x_k$. So $\langle x, e_n\rangle = \lim_{k\to\infty} \langle x_k, e_n\rangle = \lambda_n$ for all $n$. By Bessel's inequality for all $k$
$$\|x\|^2 \ge \sum_1^k |\langle x, e_n\rangle|^2 = \sum_1^k |\lambda_n|^2,$$
hence $\sum_1^k |\lambda_n|^2$ converges and the sum is at most $\|x\|^2$.
Sufficiency: Consider $\|x_k - x_m\| = \left\|\sum_{m+1}^k \lambda_n e_n\right\| = \left(\sum_{m+1}^k |\lambda_n|^2\right)^{1/2}$ for $k > m$. Since $\sum_1^\infty |\lambda_n|^2$ converges, $x_k$ is a Cauchy sequence in $H$ and thus has a limit $x$. By Pythagoras' theorem $\|x_k\|^2 = \sum_1^k |\lambda_n|^2$, thus for $k \to \infty$ we get $\|x\|^2 = \sum_1^\infty |\lambda_n|^2$ by the Lemma about inner product limit.
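Observation: the closed linear span of an orthonormal sequence in any Hilbert space looks like $\ell_2$, i.e. $\ell_2$ is a universal model for a Hilbert space.
By Bessel's inequality and the Riesz–Fischer theorem we know that the series $\sum_1^\infty \langle x, e_i\rangle e_i$ converges for any $x \in H$. What is its limit?
Let $y = x - \sum_1^\infty \langle x, e_i\rangle e_i$, then
$$\langle y, e_k\rangle = \langle x, e_k\rangle - \sum_1^\infty \langle x, e_i\rangle \langle e_i, e_k\rangle = \langle x, e_k\rangle - \langle x, e_k\rangle = 0 \quad \text{for all } k. \tag{3.2}$$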
Definition 3.14. An orthonormal sequence $(e_i)$ in a Hilbert space $H$ is complete if the identities $\langle y, e_k\rangle = 0$ for all $k$ imply $y = 0$.
A complete orthonormal sequence is also called an orthonormal basis in $H$.
Theorem 3.15 (on Orthonormal Basis). Let $e_i$ be an orthonormal basis in a Hilbert space $H$. Then for any $x \in H$ we have
$$x = \sum_{n=1}^\infty \langle x, e_n\rangle e_n \quad \text{and} \quad \|x\|^2 = \sum_{n=1}^\infty |\langle x, e_n\rangle|^2.$$
Proof. By the Riesz–Fischer theorem, equation (3.2) and the definition of orthonormal basis.
There are constructive existence theorems in mathematics.
An example of pure existence statement
3.4. Construction of Orthonormal Sequences. Natural questions are: Do orthonormal sequences always exist? Could we construct them?
Theorem 3.16 (Gram–Schmidt). Let $(x_i)$ be a sequence of linearly independent vectors in an inner product space $V$. Then there exists an orthonormal sequence $(e_i)$ such that
$$\mathrm{Lin}\{x_1, x_2, \ldots, x_n\} = \mathrm{Lin}\{e_1, e_2, \ldots, e_n\}, \quad \text{for all } n.$$
Proof. We give an explicit algorithm working by induction. The base of induction: the first vector is $e_1 = x_1 / \|x_1\|$. The step of induction: let $e_1, e_2, \ldots, e_n$ be already constructed as required. Let $y_{n+1} = x_{n+1} - \sum_{i=1}^n \langle x_{n+1}, e_i\rangle e_i$. Then by (3.2) $y_{n+1} \perp e_i$ for $i = 1, \ldots, n$. We may put $e_{n+1} = y_{n+1} / \|y_{n+1}\|$ because $y_{n+1} \ne 0$ due to the linear independence of the $x_k$'s. Also
$$\begin{aligned}
\mathrm{Lin}\{e_1, e_2, \ldots, e_{n+1}\} &= \mathrm{Lin}\{e_1, e_2, \ldots, y_{n+1}\} \\
&= \mathrm{Lin}\{e_1, e_2, \ldots, x_{n+1}\} \\
&= \mathrm{Lin}\{x_1, x_2, \ldots, x_{n+1}\}.
\end{aligned}$$
So $(e_i)$ is an orthonormal sequence.
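The inductive step of the proof translates directly into code. The following is a minimal sketch for vectors in $\mathbb{C}^d$ with the standard inner product $\langle x, y\rangle = \sum x_k \bar{y}_k$; the function name `gram_schmidt` and the tolerance `tol` are hypothetical choices made for this illustration, and NumPy is assumed.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise linearly independent vectors (given as rows)."""
    basis = []
    for x in vectors:
        x = np.asarray(x, dtype=complex)
        # y_{n+1} = x_{n+1} - sum_i <x_{n+1}, e_i> e_i
        y = x.copy()
        for e in basis:
            # np.vdot conjugates its first argument, so vdot(e, x) = <x, e>
            y = y - np.vdot(e, x) * e
        norm = np.linalg.norm(y)
        if norm < tol:
            raise ValueError("vectors are (numerically) linearly dependent")
        basis.append(y / norm)   # e_{n+1} = y_{n+1} / ||y_{n+1}||
    return np.array(basis)

X = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
E = gram_schmidt(X)
print(np.round(E @ E.conj().T, 10))   # Gram matrix of the result
```

The printed Gram matrix is expected to be close to the identity. In floating point arithmetic this classical form can lose orthogonality for nearly dependent inputs, so it should be read as an illustration of the proof rather than a robust routine.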
Example 3.17. Consider $C[0, 1]$ with the usual inner product (2.7) and apply orthogonalisation to the sequence $1, x, x^2, \ldots$. Because $\|1\| = 1$ then $e_1(x) = 1$. The continuation could be presented by the table:
$e_1(x) = 1$
$y_2(x) = x - \langle x, 1\rangle 1 = x - \frac12$, $\quad \|y_2\|^2 = \int_0^1 (x - \frac12)^2\,dx = \frac{1}{12}$, $\quad e_2(x) = \sqrt{12}\,(x - \frac12)$
$y_3(x) = x^2 - \langle x^2, 1\rangle 1 - \langle x^2, x - \frac12\rangle (x - \frac12) \cdot 12$, $\quad \ldots$, $\quad e_3 = \frac{y_3}{\|y_3\|}$
$\ldots$
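The table can be continued mechanically. The sketch below repeats the Gram–Schmidt step for real polynomials on $[0, 1]$ using NumPy's `Polynomial` class; the helper names `inner` and `orthonormalise` are our own and only illustrate the computation.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def inner(f, g):
    """<f, g> = int_0^1 f(x) g(x) dx for real polynomials."""
    F = (f * g).integ()          # an antiderivative of the product
    return F(1.0) - F(0.0)

def orthonormalise(polys):
    basis = []
    for p in polys:
        y = p
        for e in basis:
            y = y - inner(p, e) * e
        basis.append(y / np.sqrt(inner(y, y)))
    return basis

x = P([0, 1])
e1, e2, e3 = orthonormalise([P([1]), x, x**2])
print(e2.coef)   # coefficients of e_2 in increasing powers of x
print(e3.coef)   # coefficients of e_3 in increasing powers of x
```

The coefficients printed for $e_2$ should match $\sqrt{12}\,(x - \frac12)$ from the table, and $e_3$ completes the row that the table leaves open.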
Example 3.18. Many famous sequences of orthogonal polynomials, e.g. Chebyshev, Legendre, Laguerre, Hermite, can be obtained by orthogonalisation of $1, x, x^2, \ldots$ with various inner products.
(i) Legendre polynomials in $C[-1, 1]$ with inner product
$$\langle f, g\rangle = \int_{-1}^1 f(t)\overline{g(t)}\,dt. \tag{3.3}$$
(ii) Chebyshev polynomials in $C[-1, 1]$ with inner product
$$\langle f, g\rangle = \int_{-1}^1 f(t)\overline{g(t)}\,\frac{dt}{\sqrt{1 - t^2}}. \tag{3.4}$$
(iii) Laguerre polynomials in the space of polynomials $P[0, \infty)$ with inner product
$$\langle f, g\rangle = \int_0^\infty f(t)\overline{g(t)}\,e^{-t}\,dt.$$
FIGURE 8. Five first Legendre $P_i$ and Chebyshev $T_i$ polynomials
See Figure 8 for the five first Legendre and Chebyshev polynomials. Observe the difference caused by the different inner products (3.3) and (3.4). On the other hand note the similarity in oscillating behaviour with different "frequencies".
Another natural question is: When is an orthonormal sequence complete?
Proposition 3.19. Let $(e_n)$ be an orthonormal sequence in a Hilbert space $H$. The following are equivalent:
(i) $(e_n)$ is an orthonormal basis.
(ii) $\mathrm{CLin}((e_n)) = H$.
(iii) $\|x\|^2 = \sum_1^\infty |\langle x, e_n\rangle|^2$ for all $x \in H$.
Proof. Clearly 3.19(i) implies 3.19(ii) because $x = \sum_1^\infty \langle x, e_n\rangle e_n$ is in $\mathrm{CLin}((e_n))$, and 3.19(i) implies 3.19(iii) because $\|x\|^2 = \sum_1^\infty |\langle x, e_n\rangle|^2$ by Theorem 3.15.
If $(e_n)$ is not complete then there exists $x \in H$ such that $x \ne 0$ and $\langle x, e_k\rangle = 0$ for all $k$, so 3.19(iii) fails; consequently 3.19(iii) implies 3.19(i).
Finally if $\langle x, e_k\rangle = 0$ for all $k$ then $\langle x, y\rangle = 0$ for all $y \in \mathrm{Lin}((e_n))$ and moreover for all $y \in \mathrm{CLin}((e_n))$, by the Lemma on continuity of the inner product. But then $x \notin \mathrm{CLin}((e_n))$ and 3.19(ii) also fails, because $\langle x, x\rangle = 0$ is not possible for $x \ne 0$. Thus 3.19(ii) implies 3.19(i).
Corollary 3.20. A separable Hilbert space (i.e. one with a countable dense set) can be identified with either $\ell_2^n$ or $\ell_2$, in other words it has an orthonormal basis $(e_n)$ (finite or infinite) such that
$$x = \sum_{n=1}^\infty \langle x, e_n\rangle e_n \quad \text{and} \quad \|x\|^2 = \sum_{n=1}^\infty |\langle x, e_n\rangle|^2.$$
Proof. Take a countable dense set $(x_k)$, then $H = \mathrm{CLin}((x_k))$. Delete all vectors which are linear combinations of preceding vectors, orthonormalise the remaining set by Gram–Schmidt, and apply the previous proposition.
Most pleasant compliments are usually orthogonal to our real
qualities.
An advice based on observations
3.5. Orthogonal complements.
Definition 3.21. Let $M$ be a subspace of an inner product space $V$. The orthogonal complement, written $M^\perp$, of $M$ is
$$M^\perp = \{x \in V : \langle x, m\rangle = 0 \ \ \forall m \in M\}.$$
Theorem 3.22. If $M$ is a closed subspace of a Hilbert space $H$ then $M^\perp$ is a closed subspace too (hence a Hilbert space too).
Proof. Clearly $M^\perp$ is a subspace of $H$ because $x, y \in M^\perp$ implies $ax + by \in M^\perp$:
$$\langle ax + by, m\rangle = a\langle x, m\rangle + b\langle y, m\rangle = 0.$$
Also if all $x_n \in M^\perp$ and $x_n \to x$ then $x \in M^\perp$ due to the inner product limit Lemma.
Theorem 3.23. Let $M$ be a closed subspace of a Hilbert space $H$. Then for any $x \in H$ there exists the unique decomposition $x = m + n$ with $m \in M$, $n \in M^\perp$ and $\|x\|^2 = \|m\|^2 + \|n\|^2$. Thus $H = M \oplus M^\perp$ and $(M^\perp)^\perp = M$.
Proof. For a given $x$ there exists the unique closest point $m$ in $M$ by the Theorem on nearest point, and by the Theorem on perpendicular $(x - m) \perp y$ for all $y \in M$. So $x = m + (x - m) = m + n$ with $m \in M$ and $n \in M^\perp$. The identity $\|x\|^2 = \|m\|^2 + \|n\|^2$ is just Pythagoras' theorem and $M \cap M^\perp = \{0\}$ because the null vector is the only vector orthogonal to itself.
Finally $(M^\perp)^\perp = M$. We have $H = M \oplus M^\perp = (M^\perp)^\perp \oplus M^\perp$; for any $x \in (M^\perp)^\perp$ there is a decomposition $x = m + n$ with $m \in M$ and $n \in M^\perp$, but then $n$ is orthogonal to itself and therefore is zero.
Corollary 3.24 (about Orthoprojection). There is a linear map $P_M$ from $H$ onto $M$ (the orthogonal projection or orthoprojection) such that
$$P_M^2 = P_M, \quad \ker P_M = M^\perp, \quad P_{M^\perp} = I - P_M. \tag{3.5}$$
Proof. Let us define $P_M(x) = m$ where $x = m + n$ is the decomposition from the previous theorem. The linearity of this operator follows from the fact that both $M$ and $M^\perp$ are linear subspaces. Also $P_M(m) = m$ for all $m \in M$ and the image of $P_M$ is $M$. Thus $P_M^2 = P_M$. Also if $P_M(x) = 0$ then $x \perp M$, i.e. $\ker P_M = M^\perp$. Similarly $P_{M^\perp}(x) = n$ where $x = m + n$ and $P_M + P_{M^\perp} = I$.
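In finite dimensions the identities (3.5) are easy to test. The sketch below works in $\mathbb{R}^5$ with a randomly chosen two-dimensional subspace; the dimensions, the random seed and the variable names are arbitrary assumptions of this illustration, and NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))   # columns span a 2-dimensional subspace M
Q, _ = np.linalg.qr(A)            # columns of Q: an orthonormal basis of M
P = Q @ Q.T                       # orthoprojection onto M
P_perp = np.eye(5) - P            # orthoprojection onto the complement

x = rng.standard_normal(5)
m, n = P @ x, P_perp @ x          # decomposition x = m + n

print(np.allclose(P @ P, P))                  # P^2 = P
print(abs(m @ n) < 1e-12)                     # m is orthogonal to n
print(np.isclose(x @ x, m @ m + n @ n))       # Pythagoras' identity
```

Here $P = QQ^T$ is the matrix form of $x \mapsto \sum_i \langle x, e_i\rangle e_i$ from Corollary 3.10, with the columns of $Q$ playing the role of the $e_i$.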
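Example 3.25. Let $(e_n)$ be an orthonormal basis in a Hilbert space and let $S \subset \mathbb{N}$ be fixed. Let $M = \mathrm{CLin}\{e_n : n \in S\}$ and $M^\perp = \mathrm{CLin}\{e_n : n \in \mathbb{N} \setminus S\}$. Then
$$\sum_{k=1}^\infty a_k e_k = \sum_{k \in S} a_k e_k + \sum_{k \notin S} a_k e_k.$$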
Remark 3.26. In fact there is a one-to-one correspondence between closed linear subspaces of a Hilbert space $H$ and orthogonal projections defined by identities (3.5).
4. FOURIER ANALYSIS
All bases are equal, but some are more equal than others.
As we saw already, any separable Hilbert space possesses an orthonormal basis (infinitely many of them indeed). Are they equally good? This depends on our purposes. For the solution of differential equations which arise in mathematical physics (wave, heat, Laplace equations, etc.) there is a preferred choice. The fundamental formula $\frac{d}{dx} e^{ax} = a e^{ax}$ reduces the derivative to a multiplication by $a$. We could benefit from this observation if the orthonormal basis is constructed out of exponents. This helps to solve differential equations as was demonstrated in Subsection 1.2.
7.40pm Fourier series: Episode II
Today's TV listing
4.1. Fourier series. Now we wish to address questions stated in Remark 1.9. Let us consider the space $L_2[-\pi, \pi]$. As we saw in Example 3.3(iii) there is an orthonormal sequence $e_n(t) = (2\pi)^{-1/2} e^{int}$ in $L_2[-\pi, \pi]$. We will show that it is an orthonormal basis, i.e.
$$f(t) \in L_2[-\pi, \pi] \quad \Leftrightarrow \quad f(t) = \sum_{k=-\infty}^\infty \langle f, e_k\rangle e_k(t),$$
with convergence in $L_2$ norm. To do this we show that $\mathrm{CLin}\{e_k : k \in \mathbb{Z}\} = L_2[-\pi, \pi]$.
Let $CP[-\pi, \pi]$ denote the continuous functions $f$ on $[-\pi, \pi]$ such that $f(\pi) = f(-\pi)$. We also define $f$ outside of the interval $[-\pi, \pi]$ by periodicity.
Lemma 4.1. The space $CP[-\pi, \pi]$ is dense in $L_2[-\pi, \pi]$.
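Proof. Let $f \in L_2[-\pi, \pi]$. Given $\epsilon > 0$ there exists $g \in C[-\pi, \pi]$ such that $\|f - g\| < \epsilon/2$. From the continuity of $g$ on a compact set it follows that there is $M$ such that $|g(t)| < M$ for all $t \in [-\pi, \pi]$. We can now replace $g$ by a periodic $\tilde{g}$, which coincides with $g$ on $[-\pi, \pi - \delta]$ for an arbitrary $\delta > 0$ and has the same bounds: $|\tilde{g}(t)| < M$, see Figure 9. Then
$$\|g - \tilde{g}\|_2^2 = \int_{\pi - \delta}^{\pi} |g(t) - \tilde{g}(t)|^2\,dt \le (2M)^2 \delta.$$
So if $\delta < \epsilon^2/(4M)^2$ then $\|g - \tilde{g}\| < \epsilon/2$ and $\|f - \tilde{g}\| < \epsilon$.
FIGURE 9. A modification of a continuous function to a periodic one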
Now if we could show that $\mathrm{CLin}\{e_k : k \in \mathbb{Z}\}$ includes $CP[-\pi, \pi]$ then it also includes $L_2[-\pi, \pi]$.
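Notation 4.2. Let $f \in CP[-\pi, \pi]$, write
$$f_n = \sum_{k=-n}^n \langle f, e_k\rangle e_k, \quad \text{for } n = 0, 1, 2, \ldots \tag{4.1}$$
the partial sum of the Fourier series for $f$.
We want to show that $\|f - f_n\|_2 \to 0$. To this end we define the $n$th Fejér sum by the formula
$$F_n = \frac{f_0 + f_1 + \cdots + f_n}{n + 1}, \tag{4.2}$$
and show that
$$\|F_n - f\|_\infty \to 0.$$
Then we conclude
$$\|F_n - f\|_2 = \left(\int_{-\pi}^{\pi} |F_n(t) - f(t)|^2\,dt\right)^{1/2} \le (2\pi)^{1/2} \|F_n - f\|_\infty \to 0.$$
Since $F_n \in \mathrm{Lin}((e_n))$ then $f \in \mathrm{CLin}((e_n))$ and hence $f = \sum_{-\infty}^\infty \langle f, e_k\rangle e_k$.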
Remark 4.3. It is not always true that $\|f_n - f\|_\infty \to 0$ even for $f \in CP[-\pi, \pi]$.
Exercise 4.4. Find an example illustrating the above Remark.
It took 19 years of his life to prove this theorem
4.2. Fejér's theorem.
Proposition 4.5 (Fejér, age 19). Let $f \in CP[-\pi, \pi]$. Then
$$F_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) K_n(x - t)\,dt, \quad \text{where} \tag{4.3}$$
$$K_n(t) = \frac{1}{n + 1} \sum_{k=0}^n \sum_{m=-k}^k e^{imt}, \tag{4.4}$$
is the Fejér kernel.
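Proof. From notation (4.1):
$$\begin{aligned}
f_k(x) &= \sum_{m=-k}^k \langle f, e_m\rangle e_m(x) \\
&= \sum_{m=-k}^k \int_{-\pi}^{\pi} f(t) \frac{e^{-imt}}{\sqrt{2\pi}}\,dt\; \frac{e^{imx}}{\sqrt{2\pi}} \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \sum_{m=-k}^k e^{im(x - t)}\,dt.
\end{aligned}$$
Then from (4.2):
$$\begin{aligned}
F_n(x) &= \frac{1}{n + 1} \sum_{k=0}^n f_k(x) \\
&= \frac{1}{n + 1} \frac{1}{2\pi} \sum_{k=0}^n \int_{-\pi}^{\pi} f(t) \sum_{m=-k}^k e^{im(x - t)}\,dt \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, \frac{1}{n + 1} \sum_{k=0}^n \sum_{m=-k}^k e^{im(x - t)}\,dt,
\end{aligned}$$
which finishes the proof.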
Lemma 4.6. The Fejér kernel is $2\pi$-periodic, $K_n(0) = n + 1$ and:
$$K_n(t) = \frac{1}{n + 1}\, \frac{\sin^2 \frac{(n+1)t}{2}}{\sin^2 \frac{t}{2}}, \quad \text{for } t \notin 2\pi\mathbb{Z}. \tag{4.5}$$
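Proof. Let $z = e^{it}$, then:
$$\begin{aligned}
K_n(t) &= \frac{1}{n + 1} \sum_{k=0}^n (z^{-k} + \cdots + 1 + z + \cdots + z^k) \\
&= \frac{1}{n + 1} \sum_{j=-n}^n (n + 1 - |j|) z^j,
\end{aligned}$$
by switching from counting in rows to counting in columns in Table 1.
$$\begin{array}{ccccc}
 & & 1 & & \\
 & z^{-1} & 1 & z & \\
z^{-2} & z^{-1} & 1 & z & z^2 \\
\vdots & \vdots & \vdots & \vdots & \vdots
\end{array}$$
TABLE 1. Counting powers in rows and columns
Let $w = e^{it/2}$, i.e. $z = w^2$, then
$$\begin{aligned}
K_n(t) &= \frac{1}{n + 1} \left(w^{-2n} + 2w^{-2n+2} + \cdots + (n + 1) + n w^2 + \cdots + w^{2n}\right) \\
&= \frac{1}{n + 1} \left(w^{-n} + w^{-n+2} + \cdots + w^{n-2} + w^n\right)^2 \qquad (4.6) \\
&= \frac{1}{n + 1} \left(\frac{w^{-n-1} - w^{n+1}}{w^{-1} - w}\right)^2 \qquad \text{(Could you sum a geometric progression?)} \\
&= \frac{1}{n + 1} \left(\frac{-2i \sin \frac{(n+1)t}{2}}{-2i \sin \frac{t}{2}}\right)^2,
\end{aligned}$$
if $w \ne \pm 1$. For the value of $K_n(0)$ we substitute $w = 1$ into (4.6).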
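The agreement between the double sum (4.4) and the closed formula (4.5) can be confirmed numerically. This is a quick sanity check only; the function names and the sample values of $n$ and $t$ are our own choices, and NumPy is assumed.

```python
import numpy as np

def fejer_double_sum(n, t):
    """K_n(t) computed from the double sum (4.4)."""
    total = 0.0 + 0.0j
    for k in range(n + 1):
        for m in range(-k, k + 1):
            total += np.exp(1j * m * t)
    return (total / (n + 1)).real

def fejer_closed_form(n, t):
    """K_n(t) from the closed formula (4.5); valid for t outside 2*pi*Z."""
    return np.sin((n + 1) * t / 2) ** 2 / ((n + 1) * np.sin(t / 2) ** 2)

for n in (0, 1, 5, 10):
    for t in (0.3, 1.0, 2.5):
        print(n, t, fejer_double_sum(n, t), fejer_closed_form(n, t))

print(fejer_double_sum(4, 0.0))   # the lemma gives K_n(0) = n + 1
```

Each printed pair should coincide up to rounding, and the last line evaluates the kernel at $t = 0$, where only the double sum is defined directly.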
The first eleven Fejér kernels are shown on Figure 10, we could observe that:
Lemma 4.7. Fejér's kernel has the following properties:
(i) $K_n(t) \ge 0$ for all $t \in \mathbb{R}$ and $n \in \mathbb{N}$.
(ii) $\int_{-\pi}^{\pi} K_n(t)\,dt = 2\pi$.
(iii) For any $\delta \in (0, \pi)$
$$\left(\int_{-\pi}^{-\delta} + \int_{\delta}^{\pi}\right) K_n(t)\,dt \to 0 \quad \text{as } n \to \infty.$$
Proof. The first property immediately follows from the explicit formula (4.5). In contrast the second property is easier to deduce from the expression with the double sum (4.4):
$$\begin{aligned}
\int_{-\pi}^{\pi} K_n(t)\,dt &= \int_{-\pi}^{\pi} \frac{1}{n + 1} \sum_{k=0}^n \sum_{m=-k}^k e^{imt}\,dt \\
&= \frac{1}{n + 1} \sum_{k=0}^n \sum_{m=-k}^k \int_{-\pi}^{\pi} e^{imt}\,dt \\
&= \frac{1}{n + 1} \sum_{k=0}^n 2\pi \\
&= 2\pi,
\end{aligned}$$
by the formula (3.1).
Finally if $|t| > \delta$ then $\sin^2(t/2) \ge \sin^2(\delta/2) > 0$ by monotonicity of sine on $[0, \pi/2]$, so:
$$0 \le K_n(t) \le \frac{1}{(n + 1) \sin^2(\delta/2)},$$
implying:
$$0 \le \int_{\delta \le |t| \le \pi} K_n(t)\,dt \le \frac{2(\pi - \delta)}{(n + 1) \sin^2(\delta/2)} \to 0 \quad \text{as } n \to \infty.$$
Therefore the third property follows from the squeeze rule.
FIGURE 10. A family of Fejér kernels with the parameter $m$ running from 0 to 9 is on the left picture. For a comparison unregularised Fourier kernels are on the right picture.
Theorem 4.8 (Fejér Theorem). Let $f \in CP[-\pi, \pi]$. Then its Fejér sums $F_n$ (4.2) converge in supremum norm to $f$ on $[-\pi, \pi]$ and hence in $L_2$ norm as well.
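Proof. Idea of the proof: if in the formula (4.3)
$$F_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) K_n(x - t)\,dt,$$
$t$ is a long way from $x$, $K_n$ is small (see Lemma 4.7 and Figure 10); for $t$ near $x$, $K_n$ is big with total "weight" $2\pi$, so the weighted average of $f(t)$ is near $f(x)$.
Here are the details. Using property 4.7(ii) and periodicity of $f$ and $K_n$ we could express trivially
$$f(x) = f(x) \frac{1}{2\pi} \int_{x - \pi}^{x + \pi} K_n(x - t)\,dt = \frac{1}{2\pi} \int_{x - \pi}^{x + \pi} f(x) K_n(x - t)\,dt.$$
Similarly we rewrite (4.3) as
$$F_n(x) = \frac{1}{2\pi} \int_{x - \pi}^{x + \pi} f(t) K_n(x - t)\,dt,$$
then
$$\begin{aligned}
|f(x) - F_n(x)| &= \frac{1}{2\pi} \left|\int_{x - \pi}^{x + \pi} (f(x) - f(t)) K_n(x - t)\,dt\right| \\
&\le \frac{1}{2\pi} \int_{x - \pi}^{x + \pi} |f(x) - f(t)|\, K_n(x - t)\,dt.
\end{aligned}$$
Given $\epsilon > 0$ split $[x - \pi, x + \pi]$ into three intervals: $I_1 = [x - \pi, x - \delta]$, $I_2 = [x - \delta, x + \delta]$, $I_3 = [x + \delta, x + \pi]$, where $\delta$ is chosen such that $|f(t) - f(x)| < \epsilon/2$ for $t \in I_2$, which is possible by the (uniform) continuity of $f$. So
$$\frac{1}{2\pi} \int_{I_2} |f(x) - f(t)|\, K_n(x - t)\,dt \le \frac{\epsilon}{2}\, \frac{1}{2\pi} \int_{I_2} K_n(x - t)\,dt < \frac{\epsilon}{2}.$$
And
$$\begin{aligned}
\frac{1}{2\pi} \int_{I_1 \cup I_3} |f(x) - f(t)|\, K_n(x - t)\,dt &\le 2\|f\|_\infty \frac{1}{2\pi} \int_{I_1 \cup I_3} K_n(x - t)\,dt \\
&= \frac{\|f\|_\infty}{\pi} \int_{\delta < |u| < \pi} K_n(u)\,du \\
&< \frac{\epsilon}{2},
\end{aligned}$$
if $n$ is sufficiently large due to property 4.7(iii) of $K_n$. Hence $|f(x) - F_n(x)| < \epsilon$ for a large $n$ independent of $x$.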
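The uniform convergence of Fejér sums can be observed numerically. In the sketch below the test function $f(t) = |t|$, the grid size `N` and the helper names are assumptions of this illustration. It uses the identity $F_n = \sum_{|k| \le n} \left(1 - \frac{|k|}{n+1}\right) c_k e^{ikt}$ with $c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) e^{-ikt}\,dt$, which follows from (4.1) and (4.2) by the same column counting as in Table 1.

```python
import numpy as np

N = 4096
t = -np.pi + (np.arange(N) + 0.5) * (2 * np.pi / N)   # midpoint grid on [-pi, pi]
f = np.abs(t)                                          # continuous, f(pi) = f(-pi)

def c(k):
    """c_k = (1 / 2 pi) int f(t) exp(-i k t) dt, approximated by the midpoint rule."""
    return np.mean(f * np.exp(-1j * k * t))

def fejer_sum(n):
    """F_n on the grid, as a weighted sum of Fourier harmonics."""
    return sum((1 - abs(k) / (n + 1)) * c(k) * np.exp(1j * k * t)
               for k in range(-n, n + 1)).real

for n in (1, 4, 16, 64):
    print(n, np.max(np.abs(fejer_sum(n) - f)))   # sup-norm error on the grid
```

The printed errors should decrease as $n$ grows, in line with the theorem; the decay is slow, which is typical for Fejér sums.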
We almost finished the demonstration that $e_n(t) = (2\pi)^{-1/2} e^{int}$ is an orthonormal basis of $L_2[-\pi, \pi]$:
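Corollary 4.9 (Fourier series). Let $f \in L_2[-\pi, \pi]$, with Fourier series
$$\sum_{n=-\infty}^\infty \langle f, e_n\rangle e_n = \sum_{n=-\infty}^\infty c_n e^{int} \quad \text{where} \quad c_n = \frac{\langle f, e_n\rangle}{\sqrt{2\pi}} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int}\,dt.$$
Then the series $\sum_{-\infty}^\infty \langle f, e_n\rangle e_n = \sum_{-\infty}^\infty c_n e^{int}$ converges in $L_2[-\pi, \pi]$ to $f$, i.e.
$$\lim_{k\to\infty} \left\|f - \sum_{n=-k}^k c_n e^{int}\right\|_2 = 0.$$
Proof. This follows from the previous Theorem, Lemma 4.1 about density of $CP$ in $L_2$, and Theorem 3.15 on orthonormal basis.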
4.3. Parseval's formula. The following result first appeared in the framework of $L_2[-\pi, \pi]$ and only later was understood to be a general property of inner product spaces.
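Theorem 4.10 (Parseval's formula). If $f, g \in L_2[-\pi, \pi]$ have Fourier series $f = \sum_{n=-\infty}^\infty c_n e^{int}$, $g = \sum_{n=-\infty}^\infty d_n e^{int}$ then
$$\langle f, g\rangle = \int_{-\pi}^{\pi} f(t)\overline{g(t)}\,dt = 2\pi \sum_{-\infty}^\infty c_n \bar{d}_n. \tag{4.7}$$
More generally if $f$ and $g$ are two vectors of a Hilbert space $H$ with an orthonormal basis $(e_n)_{-\infty}^\infty$ then
$$\langle f, g\rangle = \sum_{n=-\infty}^\infty c_n \bar{d}_n, \quad \text{where } c_n = \langle f, e_n\rangle,\ d_n = \langle g, e_n\rangle,$$
are the Fourier coefficients of $f$ and $g$.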
Proof. In fact we could just prove the second, more general, statement; the first one is its particular realisation. Let $f_n = \sum_{k=-n}^n c_k e_k$ and $g_n = \sum_{k=-n}^n d_k e_k$ be partial sums of the corresponding Fourier series. Then from orthonormality of $(e_n)$ and linearity of the inner product:
$$\langle f_n, g_n\rangle = \left\langle \sum_{k=-n}^n c_k e_k, \sum_{k=-n}^n d_k e_k \right\rangle = \sum_{k=-n}^n c_k \bar{d}_k.$$
This formula together with the facts that $f_k \to f$ and $g_k \to g$ (following from Corollary 4.9) and the Lemma about continuity of the inner product implies the assertion.
Corollary 4.11. An integrable function $f$ belongs to $L_2[-\pi, \pi]$ if and only if its Fourier series is convergent and then $\|f\|^2 = 2\pi \sum_{-\infty}^\infty |c_k|^2$.
Proof. The necessity, i.e. implication $f \in L_2 \Rightarrow \langle f, f\rangle = \|f\|^2 = 2\pi \sum |c_k|^2$, follows from the previous Theorem. The sufficiency follows by the Riesz–Fischer Theorem.
Remark 4.12. The actual role of Parseval's formula is shadowed by the orthonormality and is rarely recognised until we meet the wavelets or coherent states. Indeed the equality (4.7) should be read as follows:
Theorem 4.13 (Modified Parseval). The map $W: H \to \ell_2$ given by the formula $[Wf](n) = \langle f, e_n\rangle$ is an isometry for any orthonormal basis $(e_n)$.
We could find many other systems of vectors $(e_x)$, $x \in X$ (very different from orthonormal bases) such that the map $W: H \to L_2(X)$ given by the simple universal formula
$$[Wf](x) = \langle f, e_x\rangle \tag{4.8}$$
will be an isometry of Hilbert spaces. The map (4.8) is often called the wavelet transform and the most famous one is the Cauchy integral formula in complex analysis. The majority of wavelet transforms are linked with group representations, see our postgraduate course Wavelets in Applied and Pure Maths.
Heat and noise but not a fire?
Answer: "Application of Fourier Series"
4.4. Some Application of Fourier Series. We are going to provide now a few examples which demonstrate the importance of the Fourier series in many questions. The first two (Example 4.14 and Theorem 4.15) belong to pure mathematics and the last two are of a more applied nature.
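Example 4.14. Let $f(t) = t$ on $[-\pi, \pi]$. Then
$$\sqrt{2\pi}\,\langle f, e_n\rangle = \int_{-\pi}^{\pi} t e^{-int}\,dt = \begin{cases} (-1)^n \frac{2\pi i}{n}, & n \ne 0 \\ 0, & n = 0 \end{cases} \quad \text{(check!)},$$
so $f(t) \sim \sum_{n \ne 0} (-1)^n (i/n) e^{int}$. By a direct integration:
$$\|f\|_2^2 = \int_{-\pi}^{\pi} t^2\,dt = \frac{2\pi^3}{3}.$$
On the other hand by the previous Corollary:
$$\|f\|_2^2 = 2\pi \sum_{n \ne 0} \left|\frac{(-1)^n i}{n}\right|^2 = 4\pi \sum_{n=1}^\infty \frac{1}{n^2}.$$
Thus we get a beautiful formula
$$\sum_1^\infty \frac{1}{n^2} = \frac{\pi^2}{6}.$$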
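The formula is easy to test numerically. The sketch below only compares partial sums with $\pi^2/6$; the function name `basel_partial` and the cut-offs are our own choices.

```python
import math

def basel_partial(n):
    """Partial sum 1/1^2 + 1/2^2 + ... + 1/n^2."""
    return sum(1.0 / k**2 for k in range(1, n + 1))

for n in (10, 1000, 100000):
    s = basel_partial(n)
    print(n, s, math.pi**2 / 6 - s)   # the gap behaves roughly like 1/n
```

The slow $1/n$ decay of the gap reflects the slow decay $|c_n| = 1/|n|$ of the Fourier coefficients of $f(t) = t$.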
Here is another important result.
Theorem 4.15 (Weierstrass Approximation Theorem). For any function $f \in C[a, b]$ and any $\epsilon > 0$ there exists a polynomial $p$ such that $\|f - p\|_\infty < \epsilon$.
Proof. Change variable: $t = 2\pi(x - \frac{a + b}{2})/(b - a)$, this maps $x \in [a, b]$ onto $t \in [-\pi, \pi]$. Let $P$ denote the subspace of polynomials in $C[-\pi, \pi]$. Then $e^{int} \in \overline{P}$ for any $n \in \mathbb{Z}$ since the Taylor series converges uniformly in $[-\pi, \pi]$. Consequently $\overline{P}$ contains the closed linear span (in supremum norm) of $e^{int}$, any $n \in \mathbb{Z}$, which is $CP[-\pi, \pi]$ by the Fejér theorem. Thus $\overline{P} \supseteq CP[-\pi, \pi]$ and we extend that to non-periodic functions as follows (why could we not make use of Lemma 4.1 here, by the way?).
For any $f \in C[-\pi, \pi]$ let $\lambda = (f(\pi) - f(-\pi))/(2\pi)$, then $f_1(t) = f(t) - \lambda t \in CP[-\pi, \pi]$ and could be approximated by a polynomial $p_1(t)$ from the above discussion. Then $f(t)$ is approximated by the polynomial $p(t) = p_1(t) + \lambda t$.
It is easy to see that the role of the exponents $e^{int}$ in the above proof is rather modest: they can be replaced by any functions which have a Taylor expansion. The real glory of Fourier analysis is demonstrated in the two following examples.
Example 4.16. The modern history of Fourier analysis starts from the works of Fourier on the heat equation. As was mentioned in the introduction to this part, the exceptional role of Fourier coefficients for differential equations is explained by the simple formula $\partial_x e^{inx} = in\,e^{inx}$. We briefly review a solution of the heat equation to illustrate this.
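Let us have a rod of length $2\pi$. The temperature at its point $x \in [-\pi, \pi]$ and a moment $t \in [0, \infty)$ is described by a function $u(t, x)$ on $[0, \infty) \times [-\pi, \pi]$. The mathematical equation describing the dynamics of the temperature distribution is:
$$\frac{\partial u(t, x)}{\partial t} = \frac{\partial^2 u(t, x)}{\partial x^2} \quad \text{or, equivalently,} \quad \left(\partial_t - \partial_x^2\right) u(t, x) = 0. \tag{4.9}$$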
For any fixed moment $t_0$ the function $u(t_0, x)$ depends only on $x \in [-\pi, \pi]$ and according to Corollary 4.9 could be represented by its Fourier series:
$$u(t_0, x) = \sum_{n=-\infty}^\infty \langle u, e_n\rangle e_n = \sum_{n=-\infty}^\infty c_n(t_0) e^{inx},$$
where
$$c_n(t_0) = \frac{\langle u, e_n\rangle}{\sqrt{2\pi}} = \frac{1}{2\pi} \int_{-\pi}^{\pi} u(t_0, x) e^{-inx}\,dx,$$
with Fourier coefficients $c_n(t_0)$ depending on $t_0$. We substitute that decomposition into the heat equation (4.9) to receive:
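$$\begin{aligned}
\left(\partial_t - \partial_x^2\right) u(t, x) &= \left(\partial_t - \partial_x^2\right) \sum_{n=-\infty}^\infty c_n(t) e^{inx} \\
&= \sum_{n=-\infty}^\infty \left(\partial_t - \partial_x^2\right) c_n(t) e^{inx} \\
&= \sum_{n=-\infty}^\infty \left(c_n'(t) + n^2 c_n(t)\right) e^{inx} = 0.
\end{aligned} \tag{4.10}$$
Since the functions $e^{inx}$ form a basis, the last equation (4.10) holds if and only if
$$c_n'(t) + n^2 c_n(t) = 0 \quad \text{for all } n \text{ and } t. \tag{4.11}$$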
Equations from the system (4.11) have general solutions of the form:
(4.12) c
n
(t) = c
n
(0)e
n
2
t
for all t [0, ),
producing a general solution of the heat equation (4.9) in the form:
(4.13) u(t, x) =

n=
c
n
(0)e
n
2
t
e
inx
=

n=
c
n
(0)e
n
2
t+inx
,
where constant c
n
(0) could be dened from boundary condition. For example,
if it is known that the initial distribution of temperature was u(0, x) = g(x) for a
function g(x) L
2
[, ] then c
n
(0) is the n-th Fourier coefcient of g(x).
The general solution (4.13) helps produce both the analytical study of the heat
equation (4.9) and numerical simulation. For example, from (4.13) obviously fol-
lows that
the temperature is rapidly relaxing toward the thermal equilibrium with
the temperature given by c
0
(0), however never reach it within a nite
time;
the higher frequencies (bigger thermal gradients) have a bigger speed
of relaxation; etc.
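The recipe (4.11)-(4.13) can be tried numerically. The following is only a minimal sketch, not the simulation used for the notes: the helper names `fourier_coefficients` and `heat_solution`, the truncation `n_max` and the number of sample points are assumptions made for illustration. It approximates $c_n(0)$ by a Riemann sum, damps each coefficient by $e^{-n^2 t}$ and sums the truncated series.

```python
import cmath
import math

def fourier_coefficients(g, n_max, samples=2048):
    """Approximate c_n = (1/2pi) * integral of g(x) e^{-inx} over [-pi, pi] by a Riemann sum."""
    h = 2 * math.pi / samples
    xs = [-math.pi + k * h for k in range(samples)]
    return {
        n: sum(g(x) * cmath.exp(-1j * n * x) for x in xs) * h / (2 * math.pi)
        for n in range(-n_max, n_max + 1)
    }

def heat_solution(c0, t, x):
    """Evaluate the truncated series (4.13): sum of c_n(0) e^{-n^2 t} e^{inx}."""
    return sum(c * math.exp(-n * n * t) * cmath.exp(1j * n * x)
               for n, c in c0.items()).real

def g(x):
    # Initial temperature; the same trigonometric polynomial as in the text's simulation.
    return 2 * math.cos(2 * x) + 1.5 * math.sin(x)

c0 = fourier_coefficients(g, n_max=4)
u = heat_solution(c0, 0.5, 1.0)
# Closed form for this g: each harmonic e^{inx} is damped by e^{-n^2 t}.
exact = 2 * math.exp(-2.0) * math.cos(2.0) + 1.5 * math.exp(-0.5) * math.sin(1.0)
```

For this particular $g$ the mean temperature $c_0(0)$ is zero, and the $\cos 2x$ component decays four times faster in the exponent than the $\sin x$ component, in line with the two conclusions listed above.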
FIGURE 11. The dynamics of a heat equation: $x$ is the coordinate on the rod, $t$ is time, $T$ is temperature.
Figure 11 shows an example of numerical simulation for the initial value problem with $g(x) = 2\cos(2x) + 1.5\sin(x)$. It clearly illustrates our above conclusions.
Example 4.17. Among the oldest periodic functions in human culture are acoustic waves of musical tones. The mathematical theory of music (including rudiments of the Fourier analysis!) is as old as mathematics itself and was highly respected already in Pythagoras' school more than 2500 years ago.
FIGURE 12. Two oscillations with unharmonious frequencies (the blue and green pure harmonics) and the appearing dissonance (red).
The earliest observations are that
(i) The musical sounds are made of pure harmonics (see the blue and green graphs on Figure 12); in our language, the cos and sin functions form a basis;
(ii) Not every two pure harmonics are compatible: to be compatible, their frequencies should make a simple ratio. Otherwise the dissonance (red graph on Figure 12) appears.
The musical tone, say G5, performed on different instruments clearly has something in common and something different, see Figure 13 for comparisons. The decomposition into the pure harmonics, i.e. finding the Fourier coefficients of the signal, can provide the complete characterisation, see Figure 14.
FIGURE 13. Graphics of G5 performed on different musical instruments (panels: bowedvib, Human voice, AltoSax, Dizi, Violin, Glockenspiel). Samples are taken from Sound Library.
FIGURE 14. Fourier series for G5 performed on different musical instruments (same order and colour as on the previous Figure).
The Fourier analysis tells that:
(i) All sounds have the same base (i.e. the lowest) frequency, which corresponds to the G5 tone, i.e. 788 Hz.
(ii) The higher frequencies, which are necessarily multiples of 788 Hz to avoid dissonance, appear with different weights for different instruments.
The Fourier analysis is very useful in signal processing and is indeed the fundamental tool. However it is not universal and has very serious limitations. Consider the simple case of the signals plotted on Figure 15(a) and (b). They are both made out of the same two pure harmonics:
(i) In the first signal the two harmonics (drawn in blue and green) follow one after another in time, Figure 15(a);
(ii) In the second signal they are just blended in equal proportions over the whole interval, Figure 15(b).
FIGURE 15. Limits of the Fourier analysis: different frequencies separated in time. Panels (a) and (b) show the two signals, panel (c) their Fourier transforms.
These appear to be two very different signals. However the Fourier transforms performed over the whole interval do not seem to be very different, see Figure 15(c). Both transforms (drawn in blue-green and pink) have two major peaks corresponding to the pure frequencies. It is not very easy to extract differences between the signals from their Fourier transforms (yet this should be possible according to our study).
A better picture can be obtained if we use the windowed Fourier transform, namely use a sliding window of constant width instead of the entire interval for the Fourier transform. An even better analysis can be obtained by means of wavelets, already mentioned in Remark 4.12 in connection with Plancherel's formula. Roughly, wavelets correspond to a sliding window of a variable size: narrow for high frequencies and wide for low.
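The effect can be reproduced with a small discrete experiment. This is a hedged sketch only: the sample count `N`, the frequencies 8 and 20 and the helper `dft_magnitudes` are choices made here for illustration and are not the data behind Figure 15. The transform over the whole interval gives the same two peaks for both signals, while a transform over a window covering the first half of the interval tells them apart.

```python
import cmath
import math

N = 256  # number of samples on the whole interval (an assumption)

def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform for the first half of the bins."""
    m = len(signal)
    return [abs(sum(s * cmath.exp(-2j * math.pi * j * k / m)
                    for k, s in enumerate(signal)))
            for j in range(m // 2)]

def tone(freq, k):
    return math.cos(2 * math.pi * freq * k / N)

# (a) one harmonic after another; (b) both blended in equal proportions
sequential = [tone(8, k) if k < N // 2 else tone(20, k) for k in range(N)]
blended = [0.5 * (tone(8, k) + tone(20, k)) for k in range(N)]

full_seq = dft_magnitudes(sequential)   # peaks at bins 8 and 20
full_mix = dft_magnitudes(blended)      # the same peaks, same heights

# Window = first half of the interval; frequency 8 becomes bin 4, frequency 20 becomes bin 10.
win_seq = dft_magnitudes(sequential[: N // 2])
win_mix = dft_magnitudes(blended[: N // 2])
```

Inside the window the sequential signal has no component at bin 10, whereas the blended one does; this is the information that the transform over the entire interval hides in the small off-peak values.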
5. DUALITY OF LINEAR SPACES
Everything has another side
An orthonormal basis allows us to reduce any question on a Hilbert space to a question on a sequence of numbers. This is a powerful but sometimes heavy technique. Sometimes we need a smaller and faster tool to study questions which are represented by a single number; for example, to demonstrate that two vectors are different it is enough to show that they have unequal values of a single coordinate. In such cases linear functionals are just what we need.
Is it functional?
Yes, it works!
5.1. Dual space of a normed space.
Definition 5.1. A linear functional on a vector space $V$ is a linear mapping $\alpha : V \to \mathbb{C}$ (or $\alpha : V \to \mathbb{R}$ in the real case), i.e.
$\alpha(ax + by) = a\alpha(x) + b\alpha(y)$, for all $x, y \in V$ and $a, b \in \mathbb{C}$.
Exercise 5.2. Show that $\alpha(0)$ is necessarily 0.
We will not consider any functionals but linear ones, thus below functional always means linear functional.
Example 5.3. (i) Let $V = \mathbb{C}^n$ and $c_k$, $k = 1, \ldots, n$ be complex numbers. Then $\alpha((x_1, \ldots, x_n)) = c_1 x_1 + \cdots + c_n x_n$ is a linear functional.
(ii) On $C[0, 1]$ a functional is given by $\alpha(f) = \int_0^1 f(t) \, dt$.
(iii) On a Hilbert space $H$ for any $x \in H$ a functional $\alpha_x$ is given by $\alpha_x(y) = \langle y, x \rangle$.
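Theorem 5.4. Let $V$ be a normed space and $\alpha$ be a linear functional. The following are equivalent:
(i) $\alpha$ is continuous (at any point of $V$).
(ii) $\alpha$ is continuous at point 0.
(iii) $\sup\{|\alpha(x)| : \|x\| \le 1\} < \infty$, i.e. $\alpha$ is a bounded linear functional.
Proof. Implication 5.4(i) $\Rightarrow$ 5.4(ii) is trivial.
Show 5.4(ii) $\Rightarrow$ 5.4(iii). By the definition of continuity: for any $\epsilon > 0$ there exists $\delta > 0$ such that $\|v\| < \delta$ implies $|\alpha(v) - \alpha(0)| < \epsilon$. Take $\epsilon = 1$, then $|\alpha(\delta x)| < 1$ for all $x$ with norm less than 1 because $\|\delta x\| < \delta$. But from linearity of $\alpha$ the inequality $|\alpha(\delta x)| < 1$ implies $|\alpha(x)| < 1/\delta$ for all $\|x\| < 1$, and hence, by scaling, $|\alpha(x)| \le 1/\delta < \infty$ for all $\|x\| \le 1$.
5.4(iii) $\Rightarrow$ 5.4(i). Let the mentioned supremum be $M$. For any $x, y \in V$ such that $x \ne y$ the vector $(x - y)/\|x - y\|$ has norm 1. Thus $|\alpha((x - y)/\|x - y\|)| \le M$. By the linearity of $\alpha$ this implies that $|\alpha(x) - \alpha(y)| \le M \|x - y\|$. Thus $\alpha$ is continuous.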

Definition 5.5. The dual space $X^*$ of a normed space $X$ is the set of continuous linear functionals on $X$. Define a norm on it by
(5.1) $\|\alpha\| = \sup_{\|x\| \le 1} |\alpha(x)|$.
Exercise 5.6. (i) Show that $X^*$ is a linear space with natural operations.
(ii) Show that (5.1) defines a norm on $X^*$.
(iii) Show that $|\alpha(x)| \le \|\alpha\| \cdot \|x\|$ for all $x \in X$, $\alpha \in X^*$.
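Theorem 5.7. $X^*$ is a Banach space with the defined norm (even if $X$ was incomplete).
Proof. Due to Exercise 5.6 we only need to show that $X^*$ is complete. Let $(\alpha_n)$ be a Cauchy sequence in $X^*$, then for any $x \in X$ the scalars $\alpha_n(x)$ form a Cauchy sequence, since $|\alpha_m(x) - \alpha_n(x)| \le \|\alpha_m - \alpha_n\| \cdot \|x\|$. Thus the sequence has a limit and we define $\alpha$ by $\alpha(x) = \lim_{n \to \infty} \alpha_n(x)$. Clearly $\alpha$ is a linear functional on $X$. We should show that it is bounded and $\alpha_n \to \alpha$. Given $\epsilon > 0$ there exists $N$ such that $\|\alpha_n - \alpha_m\| < \epsilon$ for all $n, m \ge N$. If $\|x\| \le 1$ then $|\alpha_n(x) - \alpha_m(x)| \le \epsilon$; let $m \to \infty$, then $|\alpha_n(x) - \alpha(x)| \le \epsilon$, so
$|\alpha(x)| \le |\alpha_n(x)| + \epsilon \le \|\alpha_n\| + \epsilon$,
i.e. $\|\alpha\|$ is finite and $\|\alpha_n - \alpha\| \le \epsilon$, thus $\alpha_n \to \alpha$.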
Definition 5.8. The kernel of a linear functional $\alpha$, written $\ker \alpha$, is the set of all vectors $x \in X$ such that $\alpha(x) = 0$.
Exercise 5.9. Show that
(i) $\ker \alpha$ is a subspace of $X$.
(ii) If $\alpha \not\equiv 0$ then $\ker \alpha$ is a proper subspace of $X$.
(iii) If $\alpha$ is continuous then $\ker \alpha$ is closed.
Study one and get any other for free!
Hilbert spaces sale
5.2. Self-duality of Hilbert space.
Lemma 5.10 (Riesz-Fréchet). Let $H$ be a Hilbert space and $\alpha$ a continuous linear functional on $H$, then there exists a unique $y \in H$ such that $\alpha(x) = \langle x, y \rangle$ for all $x \in H$. Also $\|\alpha\|_{H^*} = \|y\|_H$.
Proof. Uniqueness: if $\langle x, y \rangle = \langle x, y' \rangle \Leftrightarrow \langle x, y - y' \rangle = 0$ for all $x \in H$ then $y - y'$ is self-orthogonal and thus is zero (Exercise 3.2(i)).
Existence: we may assume that $\alpha \not\equiv 0$ (otherwise take $y = 0$), then $M = \ker \alpha$ is a closed proper subspace of $H$. Since $H = M \oplus M^\perp$, there exists a non-zero $z \in M^\perp$; by scaling we can get $\alpha(z) = 1$. Then for any $x \in H$:
$x = (x - \alpha(x) z) + \alpha(x) z$, with $x - \alpha(x) z \in M$, $\alpha(x) z \in M^\perp$.
Because $\langle x, z \rangle = \alpha(x) \langle z, z \rangle = \alpha(x) \|z\|^2$ for any $x \in H$ we set $y = z / \|z\|^2$.
Equality of the norms $\|\alpha\|_{H^*} = \|y\|_H$ follows from the Cauchy-Bunyakovskii-Schwarz inequality in the form $|\alpha(x)| \le \|x\| \cdot \|y\|$ and the identity $\alpha(y / \|y\|) = \|y\|$.
Example 5.11. On $L_2[0, 1]$ let $\alpha(f) = \langle f, t^2 \rangle = \int_0^1 f(t) t^2 \, dt$. Then
$\|\alpha\| = \|t^2\| = \left( \int_0^1 (t^2)^2 \, dt \right)^{1/2} = \frac{1}{\sqrt{5}}$.
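The value in Example 5.11 can be checked numerically. The sketch below is an illustration under stated assumptions: the midpoint-rule helper `integrate` and its step count are introduced here and are not part of the notes. It computes $\|t^2\|$ and verifies that the functional attains this value on the unit vector $y/\|y\|$, as in the proof of Lemma 5.10.

```python
import math

def integrate(f, a=0.0, b=1.0, n=20000):
    """Midpoint rule on [a, b]; a simple stand-in for the integral in the text."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def y(t):
    return t * t  # the representing vector y(t) = t^2

# ||y|| in L_2[0, 1]
norm_y = math.sqrt(integrate(lambda t: y(t) ** 2))

# alpha applied to the unit vector y / ||y||, i.e. <y/||y||, y>
alpha_on_unit = integrate(lambda t: (y(t) / norm_y) * y(t))
```

Both numbers agree with $1/\sqrt{5}$ up to the quadrature error, illustrating that the supremum in (5.1) is attained at $y/\|y\|$.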
6. OPERATORS
All the spaces a stage,
and all functionals and operators merely players!
All our previous considerations were only a preparation of the stage and now the main actors come forward to perform a play. Vector spaces are not so interesting while we consider them in statics; what really makes them exciting is their transformations. The natural first step is to consider transformations which respect both the linear structure and the norm.
6.1. Linear operators.
Definition 6.1. A linear operator $T$ between two normed spaces $X$ and $Y$ is a mapping $T : X \to Y$ such that $T(\lambda v + \mu u) = \lambda T(v) + \mu T(u)$. The kernel $\ker T$ and the image $\operatorname{Im} T$ of a linear operator are defined by
$\ker T = \{x \in X : Tx = 0\}$, $\operatorname{Im} T = \{y \in Y : y = Tx \text{ for some } x \in X\}$.
Exercise 6.2. Show that the kernel of $T$ is a linear subspace of $X$ and the image of $T$ is a linear subspace of $Y$.
As usual we are interested also in connections with the second (topological) structure:
Definition 6.3. The norm of a linear operator is defined by:
(6.1) $\|T\| = \sup\{\|Tx\|_Y : \|x\|_X \le 1\}$.
$T$ is a bounded linear operator if $\|T\| = \sup\{\|Tx\| : \|x\| \le 1\} < \infty$.
Exercise 6.4. Show that $\|Tx\| \le \|T\| \cdot \|x\|$ for all $x \in X$.
Example 6.5. Consider the following examples and determine the kernels and images of the mentioned operators.
(i) On a normed space $X$ define the zero operator to a space $Y$ by $Z : x \mapsto 0$ for all $x \in X$. Its norm is 0.
(ii) On a normed space $X$ define the identity operator by $I_X : x \mapsto x$ for all $x \in X$. Its norm is 1.
(iii) On a normed space $X$ any linear functional defines a linear operator from $X$ to $\mathbb{C}$; its norm as an operator is the same as its norm as a functional.
(iv) The set of operators from $\mathbb{C}^n$ to $\mathbb{C}^m$ is given by $n \times m$ matrices which act on vectors by matrix multiplication. All linear operators on finite-dimensional spaces are bounded.
(v) On $\ell_2$, let $S(x_1, x_2, \ldots) = (0, x_1, x_2, \ldots)$ be the right shift operator. Clearly $\|Sx\| = \|x\|$ for all $x$, so $\|S\| = 1$.
(vi) On $L_2[a, b]$, let $w(t) \in C[a, b]$ and define the multiplication operator $M_w$ by $(M_w f)(t) = w(t) f(t)$. Now:
$\|M_w f\|^2 = \int_a^b |w(t)|^2 |f(t)|^2 \, dt \le K^2 \int_a^b |f(t)|^2 \, dt$, where $K = \|w\|_\infty = \sup_{[a,b]} |w(t)|$,
so $\|M_w\| \le K$.
Exercise 6.6. Show that for the multiplication operator there is in fact the equality of norms $\|M_w\| = \|w\|_\infty$.
Theorem 6.7. Let $T : X \to Y$ be a linear operator. The following conditions are equivalent:
(i) $T$ is continuous on $X$;
(ii) $T$ is continuous at the point 0;
(iii) $T$ is a bounded linear operator.
Proof. The proof essentially follows the proof of the similar Theorem 5.4.
6.2. B(H) as a Banach space (and even algebra).
Theorem 6.8. Let $B(X, Y)$ be the space of bounded linear operators from $X$ to $Y$ with the norm defined above. If $Y$ is complete, then $B(X, Y)$ is a Banach space.
Proof. The proof repeats the proof of Theorem 5.7, which is a particular case of the present theorem for $Y = \mathbb{C}$, see Example 6.5(iii).
Theorem 6.9. Let $T \in B(X, Y)$ and $S \in B(Y, Z)$, where $X$, $Y$, and $Z$ are normed spaces. Then $ST \in B(X, Z)$ and $\|ST\| \le \|S\| \|T\|$.
Proof. Clearly $(ST)x = S(Tx) \in Z$, and
$\|STx\| \le \|S\| \|Tx\| \le \|S\| \|T\| \|x\|$,
which implies the norm estimate if $\|x\| \le 1$.
Corollary 6.10. Let $T \in B(X, X) = B(X)$, where $X$ is a normed space. Then for any $n \ge 1$, $T^n \in B(X)$ and $\|T^n\| \le \|T\|^n$.
Proof. It is induction on $n$ with the trivial base $n = 1$ and the step following from the previous theorem.
Remark 6.11. Some texts use the notations $L(X, Y)$ and $L(X)$ instead of our $B(X, Y)$ and $B(X)$.
Definition 6.12. Let $T \in B(X, Y)$. We say $T$ is an invertible operator if there exists $S \in B(Y, X)$ such that
$ST = I_X$ and $TS = I_Y$.
Such an $S$ is called the inverse operator of $T$.
Exercise 6.13. Show that
(i) for an invertible operator $T : X \to Y$ we have $\ker T = \{0\}$ and $\operatorname{Im} T = Y$.
(ii) the inverse operator is unique (if it exists at all). (Assume the existence of $S$ and $S'$, then consider the operator $STS'$.)
Example 6.14. We consider inverses to operators from Example 6.5.
(i) The zero operator is never invertible, except in the pathological case of the spaces $X = Y = \{0\}$.
(ii) The identity operator $I_X$ is the inverse of itself.
(iii) A linear functional is not invertible unless it is non-zero and $X$ is one dimensional.
(iv) An operator $\mathbb{C}^n \to \mathbb{C}^m$ is invertible if and only if $m = n$ and the corresponding square matrix is non-singular, i.e. has non-zero determinant.
(v) The right shift $S$ is not invertible on $\ell_2$ (it is one-to-one but is not onto). But the left shift operator $T(x_1, x_2, \ldots) = (x_2, x_3, \ldots)$ is its left inverse, i.e. $TS = I$ but $ST \ne I$ since $ST(1, 0, 0, \ldots) = (0, 0, \ldots)$. $T$ is not invertible either (it is onto but not one-to-one), however $S$ is its right inverse.
(vi) The operator of multiplication $M_w$ is invertible if and only if $w^{-1} \in C[a, b]$, and the inverse is $M_{w^{-1}}$. For example $M_{1+t}$ is invertible on $L_2[0, 1]$ and $M_t$ is not.
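Item (v) can be made concrete on finitely supported sequences. This is a small sketch only: a Python list stands for a sequence in $\ell_2$ followed by zeros, and the function names are chosen here for illustration.

```python
import math

def right_shift(x):
    """S(x1, x2, ...) = (0, x1, x2, ...)."""
    return [0.0] + list(x)

def left_shift(x):
    """T(x1, x2, ...) = (x2, x3, ...)."""
    return list(x[1:])

def norm(x):
    return math.sqrt(sum(abs(v) ** 2 for v in x))

x = [1.0, 2.0, -3.0]
ts = left_shift(right_shift(x))   # TS x: recovers x, so T is a left inverse of S
st = right_shift(left_shift(x))   # ST x: the first coordinate is lost
```

Here `ts` equals `x` while `st` has 0 in the first position, and `norm(right_shift(x))` equals `norm(x)`, matching $\|Sx\| = \|x\|$ from Example 6.5(v).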
6.3. Adjoints.
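Theorem 6.15. Let $H$ and $K$ be Hilbert spaces and $T \in B(H, K)$. Then there exists an operator $T^* \in B(K, H)$ such that
$\langle Th, k \rangle_K = \langle h, T^* k \rangle_H$ for all $h \in H$, $k \in K$.
Such $T^*$ is called the adjoint operator of $T$. Also $T^{**} = T$ and $\|T^*\| = \|T\|$.
Proof. For any fixed $k \in K$ the expression $h \mapsto \langle Th, k \rangle_K$ defines a bounded linear functional on $H$. By the Riesz-Fréchet lemma there is a unique $y \in H$ such that $\langle Th, k \rangle_K = \langle h, y \rangle_H$ for all $h \in H$. Define $T^* k = y$, then $T^*$ is linear:
$\langle h, T^*(\lambda_1 k_1 + \lambda_2 k_2) \rangle_H = \langle Th, \lambda_1 k_1 + \lambda_2 k_2 \rangle_K$
$= \bar{\lambda}_1 \langle Th, k_1 \rangle_K + \bar{\lambda}_2 \langle Th, k_2 \rangle_K$
$= \bar{\lambda}_1 \langle h, T^* k_1 \rangle_H + \bar{\lambda}_2 \langle h, T^* k_2 \rangle_H$
$= \langle h, \lambda_1 T^* k_1 + \lambda_2 T^* k_2 \rangle_H$.
So $T^*(\lambda_1 k_1 + \lambda_2 k_2) = \lambda_1 T^* k_1 + \lambda_2 T^* k_2$. $T^{**}$ is defined by $\langle k, T^{**} h \rangle = \langle T^* k, h \rangle$ and the identity $\langle T^{**} h, k \rangle = \langle h, T^* k \rangle = \langle Th, k \rangle$ for all $h$ and $k$ shows $T^{**} = T$. Also:
$\|T^* k\|^2 = \langle T^* k, T^* k \rangle = \langle k, T T^* k \rangle \le \|k\| \cdot \|T T^* k\| \le \|k\| \cdot \|T\| \cdot \|T^* k\|$,
which implies $\|T^* k\| \le \|T\| \cdot \|k\|$, consequently $\|T^*\| \le \|T\|$. The opposite inequality follows from the identity $\|T\| = \|T^{**}\|$.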
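Exercise 6.16. (i) For operators $T_1$ and $T_2$ show that
$(T_1 T_2)^* = T_2^* T_1^*$, $(T_1 + T_2)^* = T_1^* + T_2^*$, $(\lambda T)^* = \bar{\lambda} T^*$.
(ii) If $A$ is an operator on a Hilbert space $H$ then $(\ker A)^\perp = \overline{\operatorname{Im} A^*}$, the closure of the image of $A^*$.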
6.4. Hermitian, unitary and normal operators.
Definition 6.17. An operator $T : H \to H$ is a Hermitian operator or self-adjoint operator if $T = T^*$, i.e. $\langle Tx, y \rangle = \langle x, Ty \rangle$ for all $x, y \in H$.
Example 6.18. (i) On $\ell_2$ the adjoint $S^*$ to the right shift operator $S$ is given by the left shift $S^* = T$, indeed:
$\langle Sx, y \rangle = \langle (0, x_1, x_2, \ldots), (y_1, y_2, \ldots) \rangle = x_1 \bar{y}_2 + x_2 \bar{y}_3 + \cdots = \langle (x_1, x_2, \ldots), (y_2, y_3, \ldots) \rangle = \langle x, Ty \rangle$.
Thus $S$ is not Hermitian.
(ii) Let $D$ be the diagonal operator on $\ell_2$ given by
$D(x_1, x_2, \ldots) = (\lambda_1 x_1, \lambda_2 x_2, \ldots)$,
where $(\lambda_k)$ is any bounded complex sequence. It is easy to check that $\|D\| = \|(\lambda_n)\|_\infty = \sup_k |\lambda_k|$ and
$D^*(x_1, x_2, \ldots) = (\bar{\lambda}_1 x_1, \bar{\lambda}_2 x_2, \ldots)$,
thus $D$ is Hermitian if and only if $\lambda_k \in \mathbb{R}$ for all $k$.
(iii) If $T : \mathbb{C}^n \to \mathbb{C}^n$ is represented by multiplication of a column vector by a matrix $A$, then $T^*$ is multiplication by the matrix $A^*$, the conjugate transpose of $A$.
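Item (iii) can be verified directly for a small matrix. The sketch below is illustrative: the matrix `A` and the vectors `x`, `y` are arbitrary sample data, and the inner product is taken linear in the first argument and conjugate-linear in the second, as in Example 5.3(iii).

```python
def inner(u, v):
    """<u, v> = sum of u_k * conj(v_k): linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

A = [[1 + 2j, 3j], [0.5 + 0j, 2 - 1j]]
x = [1 - 1j, 2 + 0j]
y = [3j, -1 + 1j]

lhs = inner(apply(A, x), y)            # <Ax, y>
rhs = inner(x, apply(adjoint(A), y))   # <x, A* y>
```

The two values coincide up to rounding, which is the defining identity $\langle Th, k \rangle = \langle h, T^* k \rangle$ of Theorem 6.15 in coordinates.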
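Exercise 6.19. Show that for any bounded operator $T$ the operators $T_1 = \frac{1}{2}(T + T^*)$, $T^* T$ and $T T^*$ are Hermitian.
Theorem 6.20. Let $T$ be a Hermitian operator on a Hilbert space. Then
$\|T\| = \sup_{\|x\| = 1} |\langle Tx, x \rangle|$.
Proof. If $Tx = 0$ for all $x \in H$, both sides of the identity are 0. So we suppose that there exists $x \in H$ for which $Tx \ne 0$.
We see that $|\langle Tx, x \rangle| \le \|Tx\| \|x\| \le \|T\| \|x\|^2$, so $\sup_{\|x\| = 1} |\langle Tx, x \rangle| \le \|T\|$. To get the inequality the other way around, we first write $s := \sup_{\|x\| = 1} |\langle Tx, x \rangle|$. Then for any $x \in H$, we have $|\langle Tx, x \rangle| \le s \|x\|^2$.
We now consider
$\langle T(x + y), x + y \rangle = \langle Tx, x \rangle + \langle Tx, y \rangle + \langle Ty, x \rangle + \langle Ty, y \rangle = \langle Tx, x \rangle + 2 \Re \langle Tx, y \rangle + \langle Ty, y \rangle$
(because $T$ being Hermitian gives $\langle Ty, x \rangle = \langle y, Tx \rangle = \overline{\langle Tx, y \rangle}$) and, similarly,
$\langle T(x - y), x - y \rangle = \langle Tx, x \rangle - 2 \Re \langle Tx, y \rangle + \langle Ty, y \rangle$.
Subtracting gives
$4 \Re \langle Tx, y \rangle = \langle T(x + y), x + y \rangle - \langle T(x - y), x - y \rangle \le s(\|x + y\|^2 + \|x - y\|^2) = 2s(\|x\|^2 + \|y\|^2)$,
by the parallelogram identity.
Now, for $x \in H$ such that $Tx \ne 0$, we put $y = \|Tx\|^{-1} \|x\| Tx$. Then $\|y\| = \|x\|$ and when we substitute into the previous inequality, we get
$4 \|Tx\| \|x\| = 4 \Re \langle Tx, y \rangle \le 4s \|x\|^2$.
So $\|Tx\| \le s \|x\|$ and it follows that $\|T\| \le s$, as required.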
Definition 6.21. We say that $U : H \to H$ is a unitary operator on a Hilbert space $H$ if $U^* = U^{-1}$, i.e. $U^* U = U U^* = I$.
Example 6.22. (i) If $D : \ell_2 \to \ell_2$ is a diagonal operator such that $D e_k = \lambda_k e_k$, then $D^* e_k = \bar{\lambda}_k e_k$ and $D$ is unitary if and only if $|\lambda_k| = 1$ for all $k$.
(ii) The shift operator $S$ satisfies $S^* S = I$ but $S S^* \ne I$, thus $S$ is not unitary.


Theorem 6.23. For an operator $U$ on a complex Hilbert space $H$ the following are equivalent:
(i) $U$ is unitary;
(ii) $U$ is a surjection and an isometry, i.e. $\|Ux\| = \|x\|$ for all $x \in H$;
(iii) $U$ is a surjection and preserves the inner product, i.e. $\langle Ux, Uy \rangle = \langle x, y \rangle$ for all $x, y \in H$.
Proof. 6.23(i) $\Rightarrow$ 6.23(ii). Clearly unitarity of an operator implies its invertibility and hence surjectivity. Also
$\|Ux\|^2 = \langle Ux, Ux \rangle = \langle x, U^* U x \rangle = \langle x, x \rangle = \|x\|^2$.
6.23(ii) $\Rightarrow$ 6.23(iii). We use the polarisation identity (cf. polarisation in equation (2.6)):
$4 \langle Tx, y \rangle = \langle T(x + y), x + y \rangle + i \langle T(x + iy), x + iy \rangle - \langle T(x - y), x - y \rangle - i \langle T(x - iy), x - iy \rangle = \sum_{k=0}^{3} i^k \langle T(x + i^k y), x + i^k y \rangle$.
Take $T = U^* U$ and $T = I$, then
$4 \langle U^* U x, y \rangle = \sum_{k=0}^{3} i^k \langle U^* U(x + i^k y), x + i^k y \rangle = \sum_{k=0}^{3} i^k \langle U(x + i^k y), U(x + i^k y) \rangle = \sum_{k=0}^{3} i^k \langle x + i^k y, x + i^k y \rangle = 4 \langle x, y \rangle$.
6.23(iii) $\Rightarrow$ 6.23(i). Indeed $\langle U^* U x, y \rangle = \langle x, y \rangle$ implies $\langle (U^* U - I)x, y \rangle = 0$ for all $x, y \in H$, then $U^* U = I$. Since $U$ is invertible (it is surjective, and injective as an isometry) we see that $U^* = U^{-1}$.
Definition 6.24. A normal operator $T$ is one for which $T^* T = T T^*$.
Example 6.25. (i) Any self-adjoint operator $T$ is normal, since $T^* = T$.
(ii) Any unitary operator $U$ is normal, since $U^* U = I = U U^*$.
(iii) Any diagonal operator $D$ is normal, since $D e_k = \lambda_k e_k$, $D^* e_k = \bar{\lambda}_k e_k$, and $D D^* e_k = D^* D e_k = |\lambda_k|^2 e_k$.
(iv) The shift operator $S$ is not normal.
(v) A finite matrix is normal (as an operator on $\ell_2^n$) if and only if it has an orthonormal basis in which it is diagonal.
Remark 6.26. Theorems 6.20 and 6.23(ii) draw a similarity between those types of operators and multiplications by complex numbers. Indeed Theorem 6.20 says that an operator which significantly changes the direction of vectors (rotates them) cannot be Hermitian, just like a multiplication by a real number scales but does not rotate. On the other hand Theorem 6.23(ii) says that a unitary operator just rotates vectors but does not scale them, as a multiplication by a unimodular complex number. We will see further such connections in Theorem 7.17.
7. SPECTRAL THEORY
Beware of ghosts
2
in this area!
As we saw, operators can be added and multiplied by each other; in some sense they behave like numbers, but are much more complicated. In this lecture we will associate to each operator a set of complex numbers which reflects certain (unfortunately not all) properties of this operator.
The analogy between operators and numbers becomes even deeper since we can construct functions of operators (called functional calculus) in the way we build numeric functions. The most important function of this sort is called the resolvent (see Definition 7.5). The methods of analytic functions are very powerful in operator theory and students may wish to refresh their knowledge of complex analysis before this part.
7.1. The spectrum of an operator on a Hilbert space. An eigenvalue of an operator $T \in B(H)$ is a complex number $\lambda$ such that there exists a nonzero $x \in H$, called an eigenvector, with the property $Tx = \lambda x$, in other words $x \in \ker(T - \lambda I)$.
In finite dimensions $T - \lambda I$ is invertible if and only if $\lambda$ is not an eigenvalue. In infinite dimensions it is not the same: the right shift operator $S$ is not invertible but 0 is not its eigenvalue because $Sx = 0$ implies $x = 0$ (check!).
Definition 7.1. The resolvent set $\rho(T)$ of an operator $T$ is the set
$\rho(T) = \{\lambda \in \mathbb{C} : T - \lambda I \text{ is invertible}\}$.
The spectrum of an operator $T \in B(H)$, denoted $\sigma(T)$, is the complement of the resolvent set $\rho(T)$:
$\sigma(T) = \{\lambda \in \mathbb{C} : T - \lambda I \text{ is not invertible}\}$.
Example 7.2. If $H$ is finite dimensional then from the previous discussion it follows that $\sigma(T)$ is the set of eigenvalues of $T$ for any $T$.
Even this example demonstrates that the spectrum does not provide a complete description of an operator even in the finite-dimensional case. For example, both operators in $\mathbb{C}^2$ given by the matrices $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$ have the single point spectrum $\{0\}$, however they are rather different. The situation becomes even worse in infinite dimensional spaces.
Theorem 7.3. The spectrum $\sigma(T)$ of a bounded operator $T$ is a nonempty compact (i.e. closed and bounded) subset of $\mathbb{C}$.
For the proof we will need several lemmas.
Lemma 7.4. Let $A \in B(H)$. If $\|A\| < 1$ then $I - A$ is invertible in $B(H)$ and the inverse is given by the Neumann series (C. Neumann, 1877):
(7.1) $(I - A)^{-1} = I + A + A^2 + A^3 + \ldots = \sum_{k=0}^{\infty} A^k$.
Proof. Define the sequence of operators $B_n = I + A + \cdots + A^n$, the partial sums of the infinite series (7.1). It is a Cauchy sequence, indeed (if $m < n$):
$\|B_n - B_m\| = \|A^{m+1} + A^{m+2} + \cdots + A^n\| \le \|A^{m+1}\| + \|A^{m+2}\| + \cdots + \|A^n\| \le \|A\|^{m+1} + \|A\|^{m+2} + \cdots + \|A\|^n \le \frac{\|A\|^{m+1}}{1 - \|A\|} < \epsilon$
for a large $m$. By the completeness of $B(H)$ there is a limit, say $B$, of the sequence $B_n$. It is simple algebra to check that $(I - A)B_n = B_n(I - A) = I - A^{n+1}$; passing to the limit in the norm topology, where $A^{n+1} \to 0$ and $B_n \to B$, we get:
$(I - A)B = B(I - A) = I \Leftrightarrow B = (I - A)^{-1}$.
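The partial sums $B_n$ from the proof are easy to compute for a small matrix. The following sketch uses a hypothetical $2 \times 2$ matrix `A` chosen here so that $\|A\| < 1$ (its Frobenius norm is about 0.55, which bounds the operator norm); it is an illustration of (7.1), not part of the notes.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

I2 = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.2, 0.3], [-0.1, 0.4]]   # sample matrix with norm less than 1

B = [row[:] for row in I2]      # partial sum B_0 = I
P = [row[:] for row in I2]      # current power A^k
for _ in range(60):
    P = matmul(P, A)            # A^{k+1}
    B = add(B, P)               # B_{k+1} = B_k + A^{k+1}

I_minus_A = [[I2[i][j] - A[i][j] for j in range(2)] for i in range(2)]
check = matmul(I_minus_A, B)    # equals I - A^{61}, i.e. I up to rounding
```

The product `check` differs from the identity only by $A^{61}$, which is far below rounding error here, in agreement with $(I - A)B_n = I - A^{n+1}$.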

Definition 7.5. The resolvent of an operator $T$ is the operator valued function defined on the resolvent set by the formula:
(7.2) $R(\lambda, T) = (T - \lambda I)^{-1}$.
Corollary 7.6. (i) If $|\lambda| > \|T\|$ then $\lambda \in \rho(T)$, hence the spectrum is bounded.
(ii) The resolvent set $\rho(T)$ is open, i.e. for any $\lambda \in \rho(T)$ there exists $\epsilon > 0$ such that all $\mu$ with $|\lambda - \mu| < \epsilon$ are also in $\rho(T)$; consequently the spectrum is closed.
Both statements together imply that the spectrum is compact.
Proof. (i) If $|\lambda| > \|T\|$ then $\|\lambda^{-1} T\| < 1$ and the operator $T - \lambda I = -\lambda(I - \lambda^{-1} T)$ has the inverse
(7.3) $R(\lambda, T) = (T - \lambda I)^{-1} = -\sum_{k=0}^{\infty} \lambda^{-k-1} T^k$
by the previous Lemma.
(ii) Indeed:
$T - \mu I = T - \lambda I + (\lambda - \mu) I = (T - \lambda I)(I + (\lambda - \mu)(T - \lambda I)^{-1})$.
The last line is an invertible operator because $T - \lambda I$ is invertible by the assumption and $I + (\lambda - \mu)(T - \lambda I)^{-1}$ is invertible by the previous Lemma, since $\|(\lambda - \mu)(T - \lambda I)^{-1}\| < 1$ if $\epsilon < \|(T - \lambda I)^{-1}\|^{-1}$.

Exercise 7.7. (i) Prove the first resolvent identity:
(7.4) $R(\lambda, T) - R(\mu, T) = (\lambda - \mu) R(\lambda, T) R(\mu, T)$.
(ii) Use the identity (7.4) to show that $(T - \mu I)^{-1} \to (T - \lambda I)^{-1}$ as $\mu \to \lambda$.
(iii) Use the identity (7.4) to show that for $z \in \rho(T)$ the complex derivative $\frac{d}{dz} R(z, T)$ of the resolvent $R(z, T)$ is well defined, i.e. the resolvent is an analytic operator valued function of $z$.
Lemma 7.8. The spectrum is non-empty.
Proof. Let us assume the opposite, $\sigma(T) = \emptyset$; then the resolvent function $R(\lambda, T)$ is well defined for all $\lambda \in \mathbb{C}$. As can be seen from the Neumann series (7.3), $\|R(\lambda, T)\| \to 0$ as $\lambda \to \infty$. Thus for any vectors $x, y \in H$ the function $f(\lambda) = \langle R(\lambda, T)x, y \rangle$ is an analytic (see Exercise 7.7(iii)) function tending to zero at infinity. Then by the Liouville theorem from complex analysis $R(\lambda, T) = 0$, which is impossible. Thus the spectrum is not empty.
Proof of Theorem 7.3. The spectrum is nonempty by Lemma 7.8 and compact by Corollary 7.6.
Remark 7.9. Theorem 7.3 gives the maximal possible description of the spectrum, indeed any non-empty compact set can be the spectrum of some bounded operator, see Problem A.23.
7.2. The spectral radius formula. The following definition is of interest.
Definition 7.10. The spectral radius of $T$ is
$r(T) = \sup\{|\lambda| : \lambda \in \sigma(T)\}$.
From Corollary 7.6(i) it immediately follows that $r(T) \le \|T\|$. A more accurate estimate is given by the following theorem.
Theorem 7.11. For a bounded operator $T$ we have
(7.5) $r(T) = \lim_{n \to \infty} \|T^n\|^{1/n}$.
We start from the following general lemma:
Lemma 7.12. Let a sequence $(a_n)$ of positive real numbers satisfy the inequalities $0 \le a_{m+n} \le a_m + a_n$ for all $m$ and $n$. Then there is a limit $\lim_{n \to \infty} (a_n / n)$ and it is equal to $\inf_n (a_n / n)$.
Proof. The statement follows from the observation that for any $n$ and $m = nk + l$ with $0 \le l < n$ we have $a_m \le k a_n + l a_1$; thus, for big $m$ we get $a_m / m \le a_n / n + l a_1 / m \le a_n / n + \epsilon$.
Proof of Theorem 7.11. The existence of the limit $\lim_{n \to \infty} \|T^n\|^{1/n}$ in (7.5) follows from the previous Lemma since by Theorem 6.9 $\log \|T^{n+m}\| \le \log \|T^n\| + \log \|T^m\|$. Now we use some results from complex analysis. The Laurent series for the resolvent $R(\lambda, T)$ in the neighbourhood of infinity is given by the Neumann series (7.3). The radius of its convergence (which is equal, obviously, to $r(T)$) by the Hadamard theorem is exactly $\lim_{n \to \infty} \|T^n\|^{1/n}$.
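Formula (7.5) can be observed on a non-normal matrix, where $\|T\|$ greatly overestimates $r(T)$. The sketch below is illustrative only: the matrix `T` is a hypothetical example with eigenvalues 0.5 and 0.25, and `op_norm` uses the closed form for the largest singular value of a real $2 \times 2$ matrix.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm(A):
    """Operator norm of a real 2x2 matrix: square root of the largest eigenvalue of A^T A."""
    (a, b), (c, d) = A
    tr = a * a + b * b + c * c + d * d       # trace of A^T A
    det = a * d - b * c                      # det(A^T A) = det(A)^2
    disc = max(tr * tr - 4.0 * det * det, 0.0)
    return math.sqrt((tr + math.sqrt(disc)) / 2.0)

T = [[0.5, 4.0], [0.0, 0.25]]   # upper triangular: r(T) = 0.5, but ||T|| > 4

power = [[1.0, 0.0], [0.0, 1.0]]
estimates = {}
for n in range(1, 101):
    power = matmul(power, T)                       # T^n
    estimates[n] = op_norm(power) ** (1.0 / n)     # ||T^n||^{1/n}
```

The first estimate is just $\|T\|$, while the later ones decrease towards $r(T) = 0.5$ from above; the convergence is slow because of the large off-diagonal entry.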
Corollary 7.13. There exists $\lambda \in \sigma(T)$ such that $|\lambda| = r(T)$.
Proof. Indeed, as is known from complex analysis, the boundary of the convergence circle of a Laurent (or Taylor) series contains a singular point; a singular point of the resolvent obviously belongs to the spectrum.
Example 7.14. Let us consider the left shift operator $S^*$. For any $\lambda \in \mathbb{C}$ such that $|\lambda| < 1$ the vector $(1, \lambda, \lambda^2, \lambda^3, \ldots)$ is in $\ell_2$ and is an eigenvector of $S^*$ with eigenvalue $\lambda$, so the open unit disk $|\lambda| < 1$ belongs to $\sigma(S^*)$. On the other hand the spectrum of $S^*$ belongs to the closed unit disk $|\lambda| \le 1$ since $r(S^*) \le \|S^*\| = 1$. Because the spectrum is closed it should coincide with the closed unit disk, since the open unit disk is dense in it. Particularly $1 \in \sigma(S^*)$, but it is easy to see that 1 is not an eigenvalue of $S^*$.
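The eigenvector in this example can be inspected on a finite truncation. This is only a sketch: the value of `lam` and the truncation length `N` are assumptions made here, and the truncated list stands for the first `N` terms of the infinite sequence.

```python
lam = 0.5 + 0.3j                      # any complex number with |lam| < 1
N = 200                               # truncation length (an assumption)
x = [lam ** k for k in range(N)]      # first N terms of (1, lam, lam^2, ...)

shifted = x[1:]                       # left shift: drop the first coordinate
# Coordinate-wise deviation from the eigenvector equation S* x = lam x
residual = max(abs(shifted[k] - lam * x[k]) for k in range(N - 1))

# Squared l_2 norm: a geometric series with sum 1 / (1 - |lam|^2)
norm_sq = sum(abs(v) ** 2 for v in x)
```

The residual is at rounding level and `norm_sq` is close to $1/(1 - |\lambda|^2)$, so the vector is indeed square-summable. For $|\lambda| = 1$ the same geometric series diverges, which is why such $\lambda$ are in the spectrum without being eigenvalues.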
Proposition 7.15. For any $T \in B(H)$ the spectrum of the adjoint operator is $\sigma(T^*) = \{\bar{\lambda} : \lambda \in \sigma(T)\}$.
Proof. If $(T - \lambda I)V = V(T - \lambda I) = I$ then by taking adjoints $V^*(T^* - \bar{\lambda} I) = (T^* - \bar{\lambda} I)V^* = I$. So $\lambda \in \rho(T)$ implies $\bar{\lambda} \in \rho(T^*)$; using the property $T^{**} = T$ we can invert the implication and get the statement of the proposition.
Example 7.16. In continuation of Example 7.14, using the previous Proposition we conclude that $\sigma(S)$ is also the closed unit disk, but $S$ does not have eigenvalues at all!
7.3. Spectrum of Special Operators.
Theorem 7.17. (i) If $U$ is a unitary operator then $\sigma(U) \subseteq \{|z| = 1\}$.
(ii) If $T$ is Hermitian then $\sigma(T) \subseteq \mathbb{R}$.
Proof. (i) If $|\lambda| > 1$ then $\|\lambda^{-1} U\| < 1$ and then $\lambda I - U = \lambda(I - \lambda^{-1} U)$ is invertible, thus $\lambda \notin \sigma(U)$. If $|\lambda| < 1$ then $\|\lambda U^*\| < 1$ and then $\lambda I - U = U(\lambda U^* - I)$ is invertible, thus $\lambda \notin \sigma(U)$. The remaining set is exactly $\{z : |z| = 1\}$.
(ii) Without loss of generality we can assume that $\|T\| < 1$, otherwise we can multiply $T$ by a small real scalar. Let us consider the Cayley transform which maps the real axis to the unit circle:
$U = (T - iI)(T + iI)^{-1}$.
Straightforward calculations show that $U$ is unitary if $T$ is Hermitian. Let us take $\lambda \notin \mathbb{R}$ and $\lambda \ne -i$ (this case can be checked directly by Lemma 7.4). Then the Cayley transform $\mu = (\lambda - i)(\lambda + i)^{-1}$ of $\lambda$ is not on the unit circle and thus the operator
$U - \mu I = (T - iI)(T + iI)^{-1} - (\lambda - i)(\lambda + i)^{-1} I = 2i(\lambda + i)^{-1}(T - \lambda I)(T + iI)^{-1}$
is invertible, which implies invertibility of $T - \lambda I$. So $\lambda \notin \sigma(T)$, i.e. $\sigma(T) \subseteq \mathbb{R}$.
The above reduction of a self-adjoint operator to a unitary one (it can be done in the opposite direction as well!) is an important tool which can be applied in other questions as well, e.g. in the following exercise.
Exercise 7.18. (i) Show that the operator $U : f(t) \mapsto e^{it} f(t)$ on $L_2[0, 2\pi]$ is unitary and has the entire unit circle $\{|z| = 1\}$ as its spectrum.
(ii) Find a self-adjoint operator $T$ with the entire real line as its spectrum.
8. COMPACTNESS
It is not easy to study linear operators in general and there are many questions about operators in Hilbert spaces raised many decades ago which are still unanswered. Therefore it is reasonable to single out classes of operators which have (relatively) simple properties. Such a class of operators, closer to finite dimensional ones, will be studied here.
These operators are so compact that we even can t them in
our course
8.1. Compact operators. Let us recall some topological definitions and results.
Definition 8.1. A compact set in a metric space is defined by the property that any covering of it by a family of open sets contains a subcovering by a finite subfamily.
In the finite dimensional vector spaces $\mathbb{R}^n$ or $\mathbb{C}^n$ there is the following equivalent definition of compactness (the equivalence of 8.2(i) and 8.2(ii) is known as the Heine-Borel theorem):
Theorem 8.2. If a set $E$ in $\mathbb{R}^n$ or $\mathbb{C}^n$ has any of the following properties then it has the other two as well:
(i) $E$ is bounded and closed;
(ii) $E$ is compact;
(iii) Any infinite subset of $E$ has a limiting point belonging to $E$.
Exercise* 8.3. Which equivalences from above are not true any more in infinite dimensional spaces?
Definition 8.4. Let $X$ and $Y$ be normed spaces. $T \in B(X, Y)$ is a finite rank operator if $\operatorname{Im} T$ is a finite dimensional subspace of $Y$. $T$ is a compact operator if whenever $(x_i)_1^\infty$ is a bounded sequence in $X$ then its image $(Tx_i)_1^\infty$ has a convergent subsequence in $Y$.
The set of finite rank operators is denoted by $F(X, Y)$ and the set of compact operators by $K(X, Y)$.
Exercise 8.5. Show that both $F(X, Y)$ and $K(X, Y)$ are linear subspaces of $B(X, Y)$.
We intend to show that $F(X, Y) \subset K(X, Y)$.
Lemma 8.6. Let $Z$ be a finite-dimensional normed space. Then there is a number $N$ and a mapping $S : \ell_2^N \to Z$ which is invertible and such that $S$ and $S^{-1}$ are bounded.
Proof. The proof is given by an explicit construction. Let $N = \dim Z$ and $z_1, z_2, \ldots, z_N$ be a basis in $Z$. Let us define
$S : \ell_2^N \to Z$ by $S(a_1, a_2, \ldots, a_N) = \sum_{k=1}^{N} a_k z_k$,
then we have an estimate of the norm:
$\|Sa\| = \left\| \sum_{k=1}^{N} a_k z_k \right\| \le \sum_{k=1}^{N} |a_k| \|z_k\| \le \left( \sum_{k=1}^{N} |a_k|^2 \right)^{1/2} \left( \sum_{k=1}^{N} \|z_k\|^2 \right)^{1/2}$.
So $\|S\| \le \left( \sum_1^N \|z_k\|^2 \right)^{1/2}$ and $S$ is continuous.
Clearly $S$ has the trivial kernel, particularly $\|Sa\| > 0$ if $\|a\| = 1$. By the Heine-Borel theorem the unit sphere in $\ell_2^N$ is compact, consequently the continuous function $a \mapsto \left\| \sum_1^N a_k z_k \right\|$ attains its lower bound, which has to be positive. This means there exists $\delta > 0$ such that $\|a\| = 1$ implies $\|Sa\| \ge \delta$, or, equivalently, if $\|z\| < \delta$ then $\|S^{-1} z\| < 1$. The latter means that $\|S^{-1}\| \le \delta^{-1}$, which gives the boundedness of $S^{-1}$.
Corollary 8.7. For any two normed spaces $X$ and $Y$ we have $F(X, Y) \subset K(X, Y)$.
Proof. Let $T \in F(X, Y)$. If $(x_n)_1^\infty$ is a bounded sequence in $X$ then $(Tx_n)_1^\infty \subset Z = \operatorname{Im} T$ is also bounded. Let $S : \ell_2^N \to Z$ be a map constructed in the above Lemma. The sequence $(S^{-1} T x_n)_1^\infty$ is bounded in $\ell_2^N$ and thus has a limiting point, say $a_0$. Then $S a_0$ is a limiting point of $(Tx_n)_1^\infty$.
There is a simple condition which allows to determine which diagonal operat-
ors are compact (particularly the identity operator IX
is not compact if dimX = ):
Proposition 8.8. Let T is a diagonal operator and given by identities Te
n
=
n
e
n
for all
n in a basis e
n
. T is compact if and only if
n
0.
Proof. If
n
, 0 then there exists a subsequence
n
k
and > 0 such that |
n
k
| >
for all k. Now the sequence (e
n
k
) is bounded but its image Te
n
k
=
n
k
e
n
k
has no
convergent subsequence because for any k ,= l:
|
n
k
e
n
k

n
l
e
n
l
| = (|
n
k
|
2
+|
n
l
|
2
)
1/2

2,
i.e. Te
n
k
is not a Cauchy sequence, see Figure 16.
e
1

1
e
1
e
1
e
2

2
e
2
e
2
FIGURE 16. Distance between scales of orthonormal vectors
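For the converse, note that if $\lambda_n \to 0$ then we can define a finite rank operator $T_m$, $m \ge 1$ (the $m$-truncation of $T$) by:
$$(8.1)\qquad T_m e_n = \begin{cases} Te_n = \lambda_n e_n, & 1 \le n \le m; \\ 0, & n > m. \end{cases}$$
Then obviously
$$(T - T_m) e_n = \begin{cases} 0, & 1 \le n \le m; \\ \lambda_n e_n, & n > m, \end{cases}$$
and $\|T - T_m\| = \sup_{n>m} |\lambda_n| \to 0$ if $m \to \infty$. All $T_m$ are finite rank operators (so are compact) and $T$ is also compact as their limit, by the next Theorem.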
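The convergence $\|T - T_m\| = \sup_{n>m}|\lambda_n| \to 0$ can be illustrated numerically. The following is only a sketch: the choice $\lambda_n = 1/n$, the helper name `truncation_error` and the finite cut-off `n_max` (which stands in for the supremum over all $n > m$) are assumptions made for the illustration, not part of the notes.

```python
def truncation_error(lam, m, n_max):
    """Approximate ||T - T_m|| = sup_{n > m} |lam(n)|.

    The supremum is taken only over m < n <= n_max, which is exact
    whenever |lam(n)| is non-increasing (as for lam(n) = 1/n).
    """
    return max(abs(lam(n)) for n in range(m + 1, n_max + 1))


def lam(n):
    # Hypothetical diagonal entries: lambda_n = 1/n, which tend to zero.
    return 1.0 / n


# Errors of the m-truncations for growing m; they decrease towards zero.
errors = [truncation_error(lam, m, 10_000) for m in (1, 10, 100, 1000)]
```

For this particular sequence the error after the $m$-truncation is $1/(m+1)$, so the finite rank operators $T_m$ approximate $T$ in norm as slowly or as quickly as the eigenvalues decay.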
Theorem 8.9. Let $T_m$ be a sequence of compact operators convergent to an operator $T$ in the norm topology (i.e. $\|T - T_m\| \to 0$). Then $T$ is compact itself. Equivalently, $K(X, Y)$ is a closed subspace of $B(X, Y)$.
Proof. Take a bounded sequence $(x_n)_1^\infty$. From compactness
of $T_1$ $\Rightarrow$ $\exists$ subsequence $(x_n^{(1)})_1^\infty$ of $(x_n)_1^\infty$ s.t. $(T_1 x_n^{(1)})_1^\infty$ is convergent.
of $T_2$ $\Rightarrow$ $\exists$ subsequence $(x_n^{(2)})_1^\infty$ of $(x_n^{(1)})_1^\infty$ s.t. $(T_2 x_n^{(2)})_1^\infty$ is convergent.
of $T_3$ $\Rightarrow$ $\exists$ subsequence $(x_n^{(3)})_1^\infty$ of $(x_n^{(2)})_1^\infty$ s.t. $(T_3 x_n^{(3)})_1^\infty$ is convergent.
. . .
Could we find a subsequence which converges for all $T_m$ simultaneously? The first guess, to take the intersection of all the above sequences $(x_n^{(k)})_1^\infty$, does not work because the intersection could be empty. The way out is provided by the diagonal argument (see Table 2): the sequence $(T_m x_k^{(k)})_{k=1}^\infty$ is convergent for all $m$, because at the latest after the term $x_m^{(m)}$ the sequence $(x_k^{(k)})$ is a subsequence of $(x_k^{(m)})_1^\infty$.
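$$\begin{array}{cccccccc}
T_1 x_1^{(1)} & T_1 x_2^{(1)} & T_1 x_3^{(1)} & \ldots & T_1 x_n^{(1)} & \ldots & \to & a_1 \\
T_2 x_1^{(2)} & T_2 x_2^{(2)} & T_2 x_3^{(2)} & \ldots & T_2 x_n^{(2)} & \ldots & \to & a_2 \\
T_3 x_1^{(3)} & T_3 x_2^{(3)} & T_3 x_3^{(3)} & \ldots & T_3 x_n^{(3)} & \ldots & \to & a_3 \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & & \ldots \\
T_n x_1^{(n)} & T_n x_2^{(n)} & T_n x_3^{(n)} & \ldots & T_n x_n^{(n)} & \ldots & \to & a_n \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & & \ldots \\
 & & & & & & & \downarrow \\
 & & & & & & & a
\end{array}$$
TABLE 2. The diagonal argument.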
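We claim that the subsequence $(Tx_k^{(k)})_1^\infty$ of $(Tx_n)_1^\infty$ is convergent as well. We use here the $\varepsilon/3$ argument (see Figure 17). Assume, without loss of generality, that $\|x_n\| \le 1$ for all $n$. For a given $\varepsilon > 0$ choose $p \in \mathbb{N}$ such that $\|T - T_p\| < \varepsilon/3$. Because $(T_p x_k^{(k)})$ is convergent, it is a Cauchy sequence, thus there exists $n_0 > p$ such that $\bigl\|T_p x_k^{(k)} - T_p x_l^{(l)}\bigr\| < \varepsilon/3$ for all $k, l > n_0$. Then:
$$\begin{aligned}
\bigl\|Tx_k^{(k)} - Tx_l^{(l)}\bigr\| &= \bigl\|(Tx_k^{(k)} - T_p x_k^{(k)}) + (T_p x_k^{(k)} - T_p x_l^{(l)}) + (T_p x_l^{(l)} - Tx_l^{(l)})\bigr\| \\
&\le \bigl\|Tx_k^{(k)} - T_p x_k^{(k)}\bigr\| + \bigl\|T_p x_k^{(k)} - T_p x_l^{(l)}\bigr\| + \bigl\|T_p x_l^{(l)} - Tx_l^{(l)}\bigr\| \\
&\le \varepsilon.
\end{aligned}$$
Thus $(Tx_k^{(k)})$ is a Cauchy sequence, hence convergent, and $T$ is compact.
[Figure: graphs of $f(t)$ and $f_n(t)$ with the estimates $|f(x) - f_n(x)| < \varepsilon/3$, $|f_n(x) - f_n(y)| < \varepsilon/3$, $|f_n(y) - f(y)| < \varepsilon/3$ at the points $x$, $y$.]
FIGURE 17. The $\varepsilon/3$ argument to estimate $|f(x) - f(y)|$.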
8.2. Hilbert-Schmidt operators.
Definition 8.10. Let $T : H \to K$ be a bounded linear map between two Hilbert spaces. Then $T$ is said to be a Hilbert-Schmidt operator if there exists an orthonormal basis $(e_k)$ in $H$ such that the series $\sum_{k=1}^{\infty} \|Te_k\|^2$ is convergent.
Example 8.11. (i) Let $T : \ell_2 \to \ell_2$ be a diagonal operator defined by $Te_n = e_n/n$, for all $n \ge 1$. Then $\sum \|Te_n\|^2 = \sum n^{-2} = \pi^2/6$ (see Example 4.14) is finite.
(ii) The identity operator $I_H$ is not a Hilbert-Schmidt operator, unless $H$ is finite dimensional.
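The value $\sum n^{-2} = \pi^2/6$ in part (i) can be checked against partial sums. This is a small sketch; the function name `hs_partial_sum` is introduced only for this illustration.

```python
import math


def hs_partial_sum(n_terms):
    """Partial sum of sum_n ||T e_n||^2 for T e_n = e_n / n, i.e. sum of 1/n^2."""
    return sum(1.0 / n**2 for n in range(1, n_terms + 1))


limit = math.pi**2 / 6
# The tail of the series after N terms is roughly 1/N.
gap = limit - hs_partial_sum(100_000)
```

For the identity operator in part (ii) the corresponding partial sums equal the number of terms, so they diverge whenever the basis is infinite.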
The relation to compact operators is as follows.
Theorem 8.12. All Hilbert-Schmidt operators are compact. (The opposite inclusion is false, give a counterexample!)
Proof. Let $T \in B(H, K)$ have a convergent series $\sum \|Te_n\|^2$ in an orthonormal basis $(e_n)_1^\infty$ of $H$. We again (see (8.1)) define the $m$-truncation of $T$ by the formula
$$(8.2)\qquad T_m e_n = \begin{cases} Te_n, & 1 \le n \le m; \\ 0, & n > m. \end{cases}$$
Then $T_m\bigl(\sum_{1}^{\infty} a_k e_k\bigr) = \sum_{1}^{m} a_k Te_k$ and each $T_m$ is a finite rank operator because its image is spanned by the finite set of vectors $Te_1, \ldots, Te_m$. We claim that $\|T - T_m\| \to 0$. Indeed by linearity and the definition of $T_m$:
$$(T - T_m)\Bigl(\sum_{n=1}^{\infty} a_n e_n\Bigr) = \sum_{n=m+1}^{\infty} a_n (Te_n).$$
Thus:
$$\begin{aligned}
\Bigl\|(T - T_m)\Bigl(\sum_{n=1}^{\infty} a_n e_n\Bigr)\Bigr\| &= \Bigl\|\sum_{n=m+1}^{\infty} a_n (Te_n)\Bigr\| && (8.3)\\
&\le \sum_{n=m+1}^{\infty} |a_n|\,\|Te_n\| \\
&\le \Bigl(\sum_{n=m+1}^{\infty} |a_n|^2\Bigr)^{1/2} \Bigl(\sum_{n=m+1}^{\infty} \|Te_n\|^2\Bigr)^{1/2} \\
&\le \Bigl\|\sum_{n=1}^{\infty} a_n e_n\Bigr\| \Bigl(\sum_{n=m+1}^{\infty} \|Te_n\|^2\Bigr)^{1/2} && (8.4)
\end{aligned}$$
so $\|T - T_m\| \to 0$ and by the previous Theorem $T$ is compact as a limit of compact operators.
Corollary 8.13 (from the above proof). For a Hilbert-Schmidt operator
$$\|T\| \le \Bigl(\sum_{n=1}^{\infty} \|Te_n\|^2\Bigr)^{1/2}.$$
Proof. Just consider the difference of $T$ and $T_0 = 0$ (that is, $m = 0$) in (8.3)-(8.4).
Example 8.14. An integral operator $T$ on $L_2[0, 1]$ is defined by the formula:
$$(8.5)\qquad (Tf)(x) = \int_0^1 K(x, y) f(y)\,dy, \qquad f \in L_2[0, 1],$$
where the function $K$, continuous on $[0, 1] \times [0, 1]$, is called the kernel of the integral operator.
Theorem 8.15. The integral operator (8.5) is Hilbert-Schmidt.
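Proof. Let $(e_n)_{-\infty}^{\infty}$ be an orthonormal basis of $L_2[0, 1]$, e.g. $(e^{2\pi i n t})_{n \in \mathbb{Z}}$. Let us consider the kernel $K_x(y) = K(x, y)$ as a function of the argument $y$ depending on the parameter $x$. Then:
$$(Te_n)(x) = \int_0^1 K(x, y) e_n(y)\,dy = \int_0^1 K_x(y) e_n(y)\,dy = \langle K_x, \bar{e}_n \rangle.$$
So $\|Te_n\|^2 = \int_0^1 |\langle K_x, \bar{e}_n \rangle|^2\,dx$. Consequently:
$$\begin{aligned}
\sum_{n} \|Te_n\|^2 &= \sum_{n} \int_0^1 |\langle K_x, \bar{e}_n \rangle|^2\,dx \\
&= \int_0^1 \sum_{n} |\langle K_x, \bar{e}_n \rangle|^2\,dx && (8.6)\\
&= \int_0^1 \|K_x\|^2\,dx \\
&= \int_0^1 \int_0^1 |K(x, y)|^2\,dx\,dy < \infty.
\end{aligned}$$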
Exercise 8.16. Justify the exchange of summation and integration in (8.6).

Remark 8.17. Definition 8.14 and Theorem 8.15 work also for any $T : L_2[a, b] \to L_2[c, d]$ with a continuous kernel $K(x, y)$ on $[c, d] \times [a, b]$.
Definition 8.18. Define the Hilbert-Schmidt norm of a Hilbert-Schmidt operator $A$ by $\|A\|_{HS}^2 = \sum_{n=1}^{\infty} \|Ae_n\|^2$ (it is independent of the choice of orthonormal basis $(e_n)_1^\infty$, see Question A.27).
Exercise* 8.19. Show that the set of Hilbert-Schmidt operators with the above norm is a Hilbert space and find an expression for the inner product.
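Example 8.20. Let $K(x, y) = x - y$, then
$$(Tf)(x) = \int_0^1 (x - y) f(y)\,dy = x \int_0^1 f(y)\,dy - \int_0^1 y f(y)\,dy$$
is a rank 2 operator. Furthermore:
$$\begin{aligned}
\|T\|_{HS}^2 &= \int_0^1 \int_0^1 (x - y)^2\,dx\,dy = \int_0^1 \Bigl[\frac{(x - y)^3}{3}\Bigr]_{x=0}^{1} dy \\
&= \int_0^1 \Bigl(\frac{(1 - y)^3}{3} + \frac{y^3}{3}\Bigr) dy = \Bigl[-\frac{(1 - y)^4}{12} + \frac{y^4}{12}\Bigr]_0^1 = \frac{1}{6}.
\end{aligned}$$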
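On the other hand, $T$ is normal (its kernel is real and antisymmetric, so $T^* = -T$) with the eigenvalues $\pm i/\sqrt{12}$, hence there is an orthonormal basis such that
$$Tf = \frac{i}{\sqrt{12}} \langle f, e_1 \rangle e_1 - \frac{i}{\sqrt{12}} \langle f, e_2 \rangle e_2,$$
and $\|T\| = \frac{1}{\sqrt{12}}$ and $\sum_{1}^{2} \|Te_k\|^2 = \frac{1}{6}$, and we get $\|T\| \le \|T\|_{HS}$ in agreement with Corollary 8.13.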
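The numbers in Example 8.20 can be cross-checked numerically. The sketch below is an illustration only: the midpoint quadrature, the grid size `n`, and the helper names are assumptions. It approximates $\|T\|_{HS}^2$ by a double midpoint sum and computes the non-zero eigenvalues of $T$ from its $2 \times 2$ matrix on the invariant subspace spanned by $1$ and $x$, using $T1 = x - \tfrac12$ and $Tx = \tfrac{x}{2} - \tfrac13$.

```python
import cmath
import math


def hs_norm_squared(kernel, n=200):
    """Midpoint-rule approximation of the double integral of |K(x, y)|^2 over [0,1]^2."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(abs(kernel(x, y)) ** 2 for x in pts for y in pts) * h * h


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix [[a, b], [c, d]] via the characteristic polynomial."""
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


hs2 = hs_norm_squared(lambda x, y: x - y)

# Columns are the coordinates of T1 and Tx in the basis (1, x):
# T1 = -1/2 + 1*x,  Tx = -1/3 + (1/2)*x.
eig = eigenvalues_2x2(-0.5, -1.0 / 3.0, 1.0, 0.5)
```

The two eigenvalues come out as $\pm i/\sqrt{12}$, so the sum of their squared moduli is $1/6$, matching the Hilbert-Schmidt norm, while the operator norm is the largest modulus $1/\sqrt{12}$.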
9. THE SPECTRAL THEOREM FOR COMPACT NORMAL OPERATORS
Recall from Section 6.4 that an operator $T$ is normal if $TT^* = T^*T$; Hermitian ($T^* = T$) and unitary ($T^* = T^{-1}$) operators are normal.
9.1. Spectrum of normal operators.
Theorem 9.1. Let $T \in B(H)$ be a normal operator. Then
(i) $\ker T = \ker T^*$, so $\ker(T - \lambda I) = \ker(T^* - \bar\lambda I)$ for all $\lambda \in \mathbb{C}$.
(ii) Eigenvectors corresponding to distinct eigenvalues are orthogonal.
(iii) $\|T\| = r(T)$.
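Proof. (i) Obviously:
$$\begin{aligned}
x \in \ker T &\Leftrightarrow \langle Tx, Tx \rangle = 0 \Leftrightarrow \langle T^*Tx, x \rangle = 0 \\
&\Leftrightarrow \langle TT^*x, x \rangle = 0 \Leftrightarrow \langle T^*x, T^*x \rangle = 0 \\
&\Leftrightarrow x \in \ker T^*.
\end{aligned}$$
The second part holds because normalities of $T$ and $T - \lambda I$ are equivalent.
(ii) If $Tx = \lambda x$, $Ty = \mu y$ then from the previous statement $T^*y = \bar\mu y$. If $\lambda \ne \mu$ then the identity
$$\lambda \langle x, y \rangle = \langle Tx, y \rangle = \langle x, T^*y \rangle = \mu \langle x, y \rangle$$
implies $\langle x, y \rangle = 0$.
(iii) Let $S = T^*T$, then $S$ is Hermitian (check!). Consequently the inequality
$$\|Sx\|^2 = \langle Sx, Sx \rangle = \langle S^2x, x \rangle \le \|S^2\|\,\|x\|^2$$
implies $\|S\|^2 \le \|S^2\|$. But the opposite inequality follows from Theorem 6.9, thus we have the equality $\|S^2\| = \|S\|^2$ and more generally by induction: $\bigl\|S^{2^m}\bigr\| = \|S\|^{2^m}$ for all $m$.
Now we claim $\|S\| = \|T\|^2$. From Theorems 6.9 and 6.15 we get $\|S\| = \|T^*T\| \le \|T\|^2$. On the other hand if $\|x\| = 1$ then
$$\|T^*T\| \ge |\langle T^*Tx, x \rangle| = \langle Tx, Tx \rangle = \|Tx\|^2$$
implies the opposite inequality $\|S\| \ge \|T\|^2$. And because $(T^{2^m})^* T^{2^m} = (T^*T)^{2^m}$ (here normality of $T$ is used) we get the equality
$$\bigl\|T^{2^m}\bigr\|^2 = \bigl\|(T^*T)^{2^m}\bigr\| = \|T^*T\|^{2^m} = \|T\|^{2^{m+1}}.$$
Thus:
$$r(T) = \lim_{m\to\infty} \bigl\|T^{2^m}\bigr\|^{1/2^m} = \lim_{m\to\infty} \|T\|^{2^{m+1}/2^{m+1}} = \|T\|,$$
by the spectral radius formula (7.5).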
Example 9.2. It is easy to see that normality is important in 9.1(iii), indeed the non-normal operator $T$ given by the matrix $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ in $\mathbb{C}^2$ has the one-point spectrum $\{0\}$, consequently $r(T) = 0$ but $\|T\| = 1$.
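Example 9.2 can be verified by a direct computation. The sketch below uses hypothetical helper names (`matmul`, `operator_norm`) and computes the operator norm of a real $2 \times 2$ matrix as the square root of the largest eigenvalue of $A^{T}A$; since $T^2 = 0$, every term $\|T^n\|^{1/n}$ with $n \ge 2$ in the spectral radius formula vanishes.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def operator_norm(a):
    """Spectral norm of a real 2x2 matrix: sqrt of the largest eigenvalue of a^T a."""
    at = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]
    g = matmul(at, a)
    tr = g[0][0] + g[1][1]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    disc = max(tr * tr - 4 * det, 0.0)
    return math.sqrt((tr + math.sqrt(disc)) / 2)


T = [[0.0, 1.0], [0.0, 0.0]]
T2 = matmul(T, T)            # the zero matrix: T is nilpotent

norm_T = operator_norm(T)    # equals 1
norm_T2 = operator_norm(T2)  # equals 0, hence r(T) = lim ||T^n||^(1/n) = 0
```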
Lemma 9.3. Let $T$ be a compact normal operator. Then
(i) The set of eigenvalues of $T$ is either finite or a countable sequence tending to zero.
(ii) All the eigenspaces, i.e. $\ker(T - \lambda I)$, are finite-dimensional for all $\lambda \ne 0$.
Remark 9.4. This Lemma is true for any compact operator, but we will not use that in our course.
Proof. (i) Let $H_0$ be the closed linear span of eigenvectors of $T$. Then $T$ restricted to $H_0$ is a diagonal compact operator with the same set of eigenvalues $\lambda_n$ as in $H$. Then $\lambda_n \to 0$ from Proposition 8.8.
Exercise 9.5. Use the proof of Proposition 8.8 to give a direct demonstration.
Solution. Or straightforwardly assume the opposite: there exist a $\delta > 0$ and infinitely many eigenvalues $\lambda_n$ such that $|\lambda_n| > \delta$. By Theorem 9.1(ii) there is an orthonormal sequence $v_n$ of corresponding eigenvectors $Tv_n = \lambda_n v_n$. Now the sequence $(v_n)$ is bounded but its image $Tv_n = \lambda_n v_n$ has no convergent subsequence because for any $k \ne l$:
$$\|\lambda_k v_k - \lambda_l v_l\| = \bigl(|\lambda_k|^2 + |\lambda_l|^2\bigr)^{1/2} \ge \sqrt{2}\,\delta,$$
i.e. no subsequence of $(Tv_n)$ is a Cauchy sequence, see Figure 16.
(ii) Similarly if $H_0 = \ker(T - \lambda I)$ is infinite dimensional, then the restriction of $T$ to $H_0$ is $\lambda I$, which is non-compact by Proposition 8.8. Alternatively consider the infinite orthonormal sequence $(v_n)$, $Tv_n = \lambda v_n$ as in Exercise 9.5.
Lemma 9.6. Let $T$ be a compact normal operator. Then all non-zero points $\lambda \in \sigma(T)$ are eigenvalues and there exists an eigenvalue of modulus $\|T\|$.
Proof. Assume without loss of generality that $T \ne 0$. Let $0 \ne \lambda \in \sigma(T)$, without loss of generality (multiplying by a scalar) $\lambda = 1$.
We claim that if $1$ is not an eigenvalue then there exists $\delta > 0$ such that
$$(9.1)\qquad \|(I - T)x\| \ge \delta \|x\|.$$
Otherwise there exists a sequence of vectors $(x_n)$ with unit norm such that $(I - T)x_n \to 0$. Then from the compactness of $T$ for a subsequence $(x_{n_k})$ there is $y \in H$ such that $Tx_{n_k} \to y$, then $x_{n_k} \to y$ implying $Ty = y$ and $y \ne 0$, i.e. $y$ is an eigenvector with eigenvalue $1$.
Now we claim $\operatorname{Im}(I - T)$ is closed, i.e. $y \in \overline{\operatorname{Im}(I - T)}$ implies $y \in \operatorname{Im}(I - T)$. Indeed, if $(I - T)x_n \to y$, then $(x_n)$ is bounded by (9.1) and there is a subsequence $(x_{n_k})$ such that $Tx_{n_k} \to z$ implying $x_{n_k} \to y + z$, then $(I - T)(z + y) = y$.
Finally $I - T$ is injective, i.e. $\ker(I - T) = \{0\}$, by (9.1). By the property 9.1(i), $\ker(I - T^*) = \{0\}$ as well. But because always $\ker(I - T^*) = \operatorname{Im}(I - T)^\perp$ (check!) we got surjectivity, i.e. $\operatorname{Im}(I - T)^\perp = \{0\}$, of $I - T$. Thus $(I - T)^{-1}$ exists and is bounded because (9.1) implies $\|y\| \ge \delta \bigl\|(I - T)^{-1}y\bigr\|$. Thus $1 \notin \sigma(T)$.
The existence of an eigenvalue $\lambda$ such that $|\lambda| = \|T\|$ follows from the combination of Lemma 7.13 and Theorem 9.1(iii).
9.2. Compact normal operators.
Theorem 9.7 (The spectral theorem for compact normal operators). Let $T$ be a compact normal operator on a Hilbert space $H$. Then there exists an orthonormal sequence $(e_n)$ of eigenvectors of $T$ and corresponding eigenvalues $(\lambda_n)$ such that:
$$(9.2)\qquad Tx = \sum_n \lambda_n \langle x, e_n \rangle e_n, \quad \text{for all } x \in H.$$
If $(\lambda_n)$ is an infinite sequence it tends to zero.
Conversely, if $T$ is given by a formula (9.2) then it is compact and normal.
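Proof. Suppose $T \ne 0$. Then by Lemma 9.6 there exists an eigenvalue $\lambda_1$ such that $|\lambda_1| = \|T\|$ with a corresponding eigenvector $e_1$ of unit norm. Let $H_1 = \operatorname{Lin}(e_1)^\perp$. If $x \in H_1$ then
$$(9.3)\qquad \langle Tx, e_1 \rangle = \langle x, T^*e_1 \rangle = \langle x, \bar\lambda_1 e_1 \rangle = \lambda_1 \langle x, e_1 \rangle = 0,$$
thus $Tx \in H_1$ and similarly $T^*x \in H_1$. Write $T_1 = T|_{H_1}$, which is again a normal compact operator with a norm not exceeding $\|T\|$. We could inductively repeat this procedure for $T_1$, obtaining a sequence of eigenvalues $\lambda_2, \lambda_3, \ldots$ with eigenvectors $e_2, e_3, \ldots$. If $T_n = 0$ for a finite $n$ then the theorem is already proved. Otherwise we have an infinite sequence $\lambda_n \to 0$. Let
$$x = \sum_{1}^{n} \langle x, e_k \rangle e_k + y_n \quad\Rightarrow\quad \|x\|^2 = \sum_{1}^{n} |\langle x, e_k \rangle|^2 + \|y_n\|^2, \qquad y_n \in H_n,$$
from Pythagoras's theorem. Then $\|y_n\| \le \|x\|$ and $\|Ty_n\| \le \|T_n\|\,\|y_n\| \le |\lambda_n|\,\|x\| \to 0$ by Lemma 9.3. Thus
$$Tx = \lim_{n\to\infty} \Bigl(\sum_{1}^{n} \langle x, e_k \rangle Te_k + Ty_n\Bigr) = \sum_{1}^{\infty} \lambda_n \langle x, e_n \rangle e_n.$$
Conversely, if $Tx = \sum_{1}^{\infty} \lambda_n \langle x, e_n \rangle e_n$ then
$$\langle Tx, y \rangle = \sum_{1}^{\infty} \lambda_n \langle x, e_n \rangle \langle e_n, y \rangle = \sum_{1}^{\infty} \langle x, e_n \rangle \lambda_n \overline{\langle y, e_n \rangle},$$
thus $T^*y = \sum_{1}^{\infty} \bar\lambda_n \langle y, e_n \rangle e_n$. Then we get the normality of $T$: $T^*Tx = TT^*x = \sum_{1}^{\infty} |\lambda_n|^2 \langle x, e_n \rangle e_n$. Also $T$ is compact because it is a uniform limit of the finite rank operators $T_n x = \sum_{k=1}^{n} \lambda_k \langle x, e_k \rangle e_k$.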
Corollary 9.8. Let $T$ be a compact normal operator on a separable Hilbert space $H$. Then there exists an orthonormal basis $(g_k)$ such that
$$Tx = \sum_{1}^{\infty} \lambda_n \langle x, g_n \rangle g_n,$$
and $\lambda_n$ are eigenvalues of $T$ including zeros.
Proof. Let $(e_n)$ be the orthonormal sequence constructed in the proof of the previous Theorem. Then $x$ is perpendicular to all $e_n$ if and only if it is in the kernel of $T$. Let $(f_n)$ be any orthonormal basis of $\ker T$. Then the union of $(e_n)$ and $(f_n)$ is the orthonormal basis $(g_n)$ we have looked for.
Exercise 9.9. Finish all details in the above proof.
Corollary 9.10 (Singular value decomposition). If $T$ is any compact operator on a separable Hilbert space then there exist orthonormal sequences $(e_k)$ and $(f_k)$ such that $Tx = \sum_k \mu_k \langle x, e_k \rangle f_k$ where $(\mu_k)$ is a sequence of positive numbers such that $\mu_k \to 0$ if it is an infinite sequence.
Proof. The operator $T^*T$ is compact and Hermitian (hence normal). From the previous Corollary there is an orthonormal basis $(e_n)$ such that $T^*Tx = \sum_n \lambda_n \langle x, e_n \rangle e_n$ for some non-negative $\lambda_n = \|Te_n\|^2$. Let $\mu_n = \|Te_n\|$ and $f_n = Te_n/\mu_n$ (for those $n$ with $\mu_n \ne 0$; the remaining terms vanish). Then $(f_n)$ is an orthonormal sequence (check!) and
$$Tx = \sum_n \langle x, e_n \rangle Te_n = \sum_n \langle x, e_n \rangle \mu_n f_n.$$


Corollary 9.11. A bounded operator in a Hilbert space is compact if and only if it is a uniform limit of finite rank operators.
Proof. Sufficiency follows from 8.9. Necessity: by the previous Corollary $Tx = \sum_n \langle x, e_n \rangle \mu_n f_n$, thus $T$ is a uniform limit of the operators $T_m x = \sum_{n=1}^{m} \langle x, e_n \rangle \mu_n f_n$, which are of finite rank.
10. APPLICATIONS TO INTEGRAL EQUATIONS
In this lecture we will study the Fredholm equation defined as follows. Let the integral operator with a kernel $K(x, y)$ defined on $[a, b] \times [a, b]$ be defined as before:
$$(10.1)\qquad (T\phi)(x) = \int_a^b K(x, y)\phi(y)\,dy.$$
The Fredholm equations of the first and second kinds correspondingly are:
$$(10.2)\qquad T\phi = f \quad\text{and}\quad \phi - \lambda T\phi = f,$$
for a function $f$ on $[a, b]$. A special case is the Volterra equation, given by an integral operator $T$ of the form (10.1) with a kernel $K(x, y) = 0$ for all $y > x$, which can be written as:
$$(10.3)\qquad (T\phi)(x) = \int_a^x K(x, y)\phi(y)\,dy.$$
We will consider integral operators with kernels $K$ such that $\int_a^b \int_a^b |K(x, y)|^2\,dx\,dy < \infty$, then by Theorem 8.15 $T$ is a Hilbert-Schmidt operator and in particular bounded.
As a reason to study Fredholm operators we will mention that solutions of differential equations in mathematical physics (notably the heat and wave equations) require a decomposition of a function $f$ as a linear combination of functions $K(x, y)$ with coefficients $\phi$. This is a continuous analog of the discrete decomposition into Fourier series.
Using ideas from the proof of Lemma 7.4 we define the Neumann series for the resolvent:
$$(10.4)\qquad (I - \lambda T)^{-1} = I + \lambda T + \lambda^2 T^2 + \cdots,$$
which is valid for all $|\lambda| < \|T\|^{-1}$.
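Example 10.1. Solve the Volterra equation
$$\phi(x) - \lambda \int_0^x y\,\phi(y)\,dy = x^2, \quad\text{on } L_2[0, 1].$$
In this case $(I - \lambda T)\phi = f$, with $f(x) = x^2$ and:
$$K(x, y) = \begin{cases} y, & 0 \le y \le x; \\ 0, & x < y \le 1. \end{cases}$$
Straightforward calculations show:
$$(Tf)(x) = \int_0^x y \cdot y^2\,dy = \frac{x^4}{4}, \qquad (T^2f)(x) = \int_0^x y\,\frac{y^4}{4}\,dy = \frac{x^6}{24}, \quad \ldots$$
and generally by induction:
$$(T^nf)(x) = \int_0^x y\,\frac{y^{2n}}{2^{n-1} n!}\,dy = \frac{x^{2n+2}}{2^n (n + 1)!}.$$
Hence:
$$\phi(x) = \sum_{0}^{\infty} \lambda^n T^n f = \sum_{0}^{\infty} \frac{\lambda^n x^{2n+2}}{2^n (n + 1)!} = \frac{2}{\lambda} \sum_{0}^{\infty} \frac{\lambda^{n+1} x^{2n+2}}{2^{n+1} (n + 1)!} = \frac{2}{\lambda} \bigl(e^{\lambda x^2/2} - 1\bigr) \quad\text{for all } \lambda \in \mathbb{C} \setminus \{0\},$$
because in this case $r(T) = 0$. For the Fredholm equations this is not always the case, see Tutorial problem A.29.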
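The closed form of Example 10.1 can be sanity-checked in two ways: by comparing it with partial sums of the Neumann series, and by substituting it back into the Volterra equation. The sketch below is an illustration under stated assumptions: real $\lambda$, a midpoint quadrature for the integral, and helper names (`neumann_partial_sum`, `closed_form`, `residual`) that are not part of the notes.

```python
import math


def neumann_partial_sum(lam, x, terms):
    """Sum of lam^n (T^n f)(x) for n < terms, with (T^n f)(x) = x^(2n+2) / (2^n (n+1)!)."""
    return sum(lam**n * x ** (2 * n + 2) / (2**n * math.factorial(n + 1))
               for n in range(terms))


def closed_form(lam, x):
    """phi(x) = (2 / lam) (exp(lam x^2 / 2) - 1), for lam != 0."""
    return 2.0 / lam * (math.exp(lam * x * x / 2.0) - 1.0)


def residual(lam, x, n=2000):
    """phi(x) - lam * integral_0^x y phi(y) dy - x^2, integral by the midpoint rule."""
    h = x / n
    integral = sum((k + 0.5) * h * closed_form(lam, (k + 0.5) * h)
                   for k in range(n)) * h
    return closed_form(lam, x) - lam * integral - x * x


series_gap = abs(neumann_partial_sum(3.0, 0.7, 30) - closed_form(3.0, 0.7))
equation_gap = abs(residual(3.0, 0.7))
```

Both gaps are at the level of rounding or quadrature error even for $\lambda = 3$, which is consistent with the series converging for every $\lambda$, as expected when $r(T) = 0$.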
Among other integral operators there is an important subclass with a separable kernel, namely a kernel which has the form:
$$(10.5)\qquad K(x, y) = \sum_{j=1}^{n} g_j(x) h_j(y).$$
In such a case:
$$(T\phi)(x) = \int_a^b \sum_{j=1}^{n} g_j(x) h_j(y) \phi(y)\,dy = \sum_{j=1}^{n} g_j(x) \int_a^b h_j(y) \phi(y)\,dy,$$
i.e. the image of $T$ is spanned by $g_1(x), \ldots, g_n(x)$ and is finite dimensional, consequently the solution of such an equation reduces to linear algebra.
Example 10.2. Solve the Fredholm equation (actually find eigenvectors of $T$):
$$\phi(x) = \lambda \int_0^{2\pi} \cos(x + y)\phi(y)\,dy = \lambda \int_0^{2\pi} (\cos x \cos y - \sin x \sin y)\phi(y)\,dy.$$
Clearly $\phi(x)$ should be a linear combination $\phi(x) = A\cos x + B\sin x$ with coefficients $A$ and $B$ satisfying:
$$A = \lambda \int_0^{2\pi} \cos y\,(A\cos y + B\sin y)\,dy, \qquad B = -\lambda \int_0^{2\pi} \sin y\,(A\cos y + B\sin y)\,dy.$$
Basic calculus implies $A = \lambda\pi A$ and $B = -\lambda\pi B$ and the only nonzero solutions are:
$\lambda = \pi^{-1}$, $A \ne 0$, $B = 0$;
$\lambda = -\pi^{-1}$, $A = 0$, $B \ne 0$.
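The two eigenfunctions of Example 10.2 can be confirmed numerically: applying $T$ to $\cos$ should give $\pi \cos$, and applying it to $\sin$ should give $-\pi \sin$. The sketch below uses an equispaced quadrature rule (very accurate for periodic trigonometric integrands); the name `apply_T` and the grid size are assumptions for this illustration.

```python
import math


def apply_T(phi, x, n=256):
    """Approximate (T phi)(x) = integral_0^{2 pi} cos(x + y) phi(y) dy
    by an equispaced rule with n nodes on the period."""
    h = 2 * math.pi / n
    return sum(math.cos(x + k * h) * phi(k * h) for k in range(n)) * h


x0 = 0.3
cos_gap = abs(apply_T(math.cos, x0) - math.pi * math.cos(x0))
sin_gap = abs(apply_T(math.sin, x0) + math.pi * math.sin(x0))
```

Since $T\cos = \pi\cos$ and $T\sin = -\pi\sin$, the equation $\phi = \lambda T\phi$ holds for $\phi = \cos$ with $\lambda = \pi^{-1}$ and for $\phi = \sin$ with $\lambda = -\pi^{-1}$.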
We develop some Hilbert-Schmidt theory for integral operators.
Theorem 10.3. Suppose that $K(x, y)$ is a continuous function on $[a, b] \times [a, b]$ and $K(x, y) = \overline{K(y, x)}$ and the operator $T$ is defined by (10.1). Then
(i) $T$ is a self-adjoint Hilbert-Schmidt operator.
(ii) All eigenvalues of $T$ are real and satisfy $\sum \lambda_n^2 < \infty$.
(iii) The eigenvectors $v_n$ of $T$ can be chosen as an orthonormal basis of $L_2[a, b]$, are continuous for nonzero $\lambda_n$ and
$$T\phi = \sum_{n=1}^{\infty} \lambda_n \langle \phi, v_n \rangle v_n \quad\text{where}\quad \phi = \sum_{n=1}^{\infty} \langle \phi, v_n \rangle v_n.$$
Proof. (i) The condition $K(x, y) = \overline{K(y, x)}$ implies the Hermitian property of $T$:
$$\begin{aligned}
\langle T\phi, \psi \rangle &= \int_a^b \Bigl(\int_a^b K(x, y)\phi(y)\,dy\Bigr) \bar\psi(x)\,dx \\
&= \int_a^b \int_a^b K(x, y)\phi(y)\bar\psi(x)\,dx\,dy \\
&= \int_a^b \phi(y)\,\overline{\Bigl(\int_a^b K(y, x)\psi(x)\,dx\Bigr)}\,dy \\
&= \langle \phi, T\psi \rangle.
\end{aligned}$$
The Hilbert-Schmidt property (and hence compactness) was proved in Theorem 8.15.
(ii) The spectrum of $T$ is real as for any Hermitian operator, see Theorem 7.17(ii), and the finiteness of $\sum \lambda_n^2$ follows from the Hilbert-Schmidt property.
(iii) The existence of an orthonormal basis consisting of eigenvectors $(v_n)$ of $T$ was proved in Corollary 9.8. If $\lambda_n \ne 0$ then:
$$v_n(x_1) - v_n(x_2) = \frac{1}{\lambda_n} \bigl((Tv_n)(x_1) - (Tv_n)(x_2)\bigr) = \frac{1}{\lambda_n} \int_a^b \bigl(K(x_1, y) - K(x_2, y)\bigr) v_n(y)\,dy$$
and by the Cauchy-Schwarz-Bunyakovskii inequality:
$$|v_n(x_1) - v_n(x_2)| \le \frac{1}{|\lambda_n|}\,\|v_n\|_2 \Bigl(\int_a^b |K(x_1, y) - K(x_2, y)|^2\,dy\Bigr)^{1/2}$$
which tends to $0$ as $x_1 \to x_2$ due to the (uniform) continuity of $K(x, y)$.
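Theorem 10.4. Let $T$ be as in the previous Theorem. Then if $\lambda \ne 0$ and $\lambda^{-1} \notin \sigma(T)$, the unique solution $\phi$ of the Fredholm equation of the second kind $\phi - \lambda T\phi = f$ is
$$(10.6)\qquad \phi = \sum_{1}^{\infty} \frac{\langle f, v_n \rangle}{1 - \lambda\lambda_n}\,v_n.$$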
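Proof. Let $\phi = \sum_{1}^{\infty} a_n v_n$ where $a_n = \langle \phi, v_n \rangle$, then
$$\phi - \lambda T\phi = \sum_{1}^{\infty} a_n (1 - \lambda\lambda_n) v_n = f = \sum_{1}^{\infty} \langle f, v_n \rangle v_n$$
if and only if $a_n = \langle f, v_n \rangle/(1 - \lambda\lambda_n)$ for all $n$. Note $1 - \lambda\lambda_n \ne 0$ since $\lambda^{-1} \notin \sigma(T)$. Because $\lambda_n \to 0$ we get $\sum_{1}^{\infty} |a_n|^2 < \infty$ by its comparison with $\sum_{1}^{\infty} |\langle f, v_n \rangle|^2 = \|f\|^2$, thus the solution exists and is unique by the Riesz-Fischer Theorem.
See Exercise A.30 for an example.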
Theorem 10.5 (Fredholm alternative). Let $T \in K(H)$ be compact normal and $\lambda \in \mathbb{C} \setminus \{0\}$. Consider the equations:
$$(10.7)\qquad \phi - \lambda T\phi = 0$$
$$(10.8)\qquad \phi - \lambda T\phi = f$$
then either
(A) the only solution to (10.7) is $\phi = 0$ and (10.8) has a unique solution for any $f \in H$; or
(B) there exists a nonzero solution to (10.7) and (10.8) can be solved if and only if $f$ is orthogonal to all solutions to (10.7).
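Proof. (A) If $\phi = 0$ is the only solution of (10.7), then $\lambda^{-1}$ is not an eigenvalue of $T$ and then by Lemma 9.6 it is not in the spectrum of $T$ either. Thus $I - \lambda T$ is invertible and the unique solution of (10.8) is given by $\phi = (I - \lambda T)^{-1} f$.
(B) A nonzero solution to (10.7) means that $\lambda^{-1} \in \sigma(T)$. Let $(v_n)$ be an orthonormal basis of eigenvectors of $T$ for eigenvalues $(\lambda_n)$. By Lemma 9.3(ii) only a finite number of $\lambda_n$ are equal to $\lambda^{-1}$, say they are $\lambda_1, \ldots, \lambda_N$, then
$$(I - \lambda T)\phi = \sum_{n=1}^{\infty} (1 - \lambda\lambda_n) \langle \phi, v_n \rangle v_n = \sum_{n=N+1}^{\infty} (1 - \lambda\lambda_n) \langle \phi, v_n \rangle v_n.$$
If $f = \sum_{1}^{\infty} \langle f, v_n \rangle v_n$ then the identity $(I - \lambda T)\phi = f$ is only possible if $\langle f, v_n \rangle = 0$ for $1 \le n \le N$. Conversely from that condition we could give a solution
$$\phi = \sum_{n=N+1}^{\infty} \frac{\langle f, v_n \rangle}{1 - \lambda\lambda_n}\,v_n + \phi_0, \quad\text{for any } \phi_0 \in \operatorname{Lin}(v_1, \ldots, v_N),$$
which is again in $H$ because $f \in H$ and $\lambda_n \to 0$.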
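Example 10.6. Let us consider
$$(T\phi)(x) = \int_0^1 (2xy - x - y + 1)\phi(y)\,dy.$$
Because the kernel of $T$ is real and symmetric, $T = T^*$. The kernel is also separable:
$$(T\phi)(x) = x \int_0^1 (2y - 1)\phi(y)\,dy + \int_0^1 (-y + 1)\phi(y)\,dy,$$
and $T$ is of rank 2 with the image of $T$ spanned by $1$ and $x$. By direct calculations:
$$T : 1 \mapsto \tfrac{1}{2}, \qquad T : x \mapsto \tfrac{1}{6}x + \tfrac{1}{6},$$
or $T$ is given by the matrix $\begin{pmatrix} \frac{1}{2} & \frac{1}{6} \\ 0 & \frac{1}{6} \end{pmatrix}$. According to linear algebra the decomposition over eigenvectors is:
$$\lambda_1 = \tfrac{1}{2} \text{ with vector } \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \lambda_2 = \tfrac{1}{6} \text{ with vector } \begin{pmatrix} -\frac{1}{2} \\ 1 \end{pmatrix},$$
with normalisation $v_1(y) = 1$, $v_2(y) = \sqrt{12}\,(y - 1/2)$, and we complete it to an orthonormal basis $(v_n)$ of $L_2[0, 1]$. Then:
If $\lambda \ne 2$ and $\lambda \ne 6$ then $(I - \lambda T)\phi = f$ has a unique solution (cf. equation (10.6)):
$$\begin{aligned}
\phi &= \sum_{n=1}^{2} \frac{\langle f, v_n \rangle}{1 - \lambda\lambda_n}\,v_n + \sum_{n=3}^{\infty} \langle f, v_n \rangle v_n \\
&= \sum_{n=1}^{2} \frac{\langle f, v_n \rangle}{1 - \lambda\lambda_n}\,v_n + \Bigl(f - \sum_{n=1}^{2} \langle f, v_n \rangle v_n\Bigr) \\
&= f + \sum_{n=1}^{2} \frac{\lambda\lambda_n}{1 - \lambda\lambda_n} \langle f, v_n \rangle v_n.
\end{aligned}$$
If $\lambda = 2$ then the solutions exist provided $\langle f, v_1 \rangle = 0$ and are:
$$\phi = f + \frac{\lambda\lambda_2}{1 - \lambda\lambda_2} \langle f, v_2 \rangle v_2 + Cv_1 = f + \frac{1}{2} \langle f, v_2 \rangle v_2 + Cv_1, \quad C \in \mathbb{C}.$$
If $\lambda = 6$ then the solutions exist provided $\langle f, v_2 \rangle = 0$ and are:
$$\phi = f + \frac{\lambda\lambda_1}{1 - \lambda\lambda_1} \langle f, v_1 \rangle v_1 + Cv_2 = f - \frac{3}{2} \langle f, v_1 \rangle v_1 + Cv_2, \quad C \in \mathbb{C}.$$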
11. BANACH AND NORMED SPACES
We will work with either the field of real numbers $\mathbb{R}$ or the complex numbers $\mathbb{C}$. To avoid repetition, we use $\mathbb{K}$ to denote either $\mathbb{R}$ or $\mathbb{C}$.
11.1. Normed spaces. Recall, see Defn. 2.3, a norm on a vector space $V$ is a map $\|\cdot\| : V \to [0, \infty)$ such that
(i) $\|u\| = 0$ only when $u = 0$;
(ii) $\|\lambda u\| = |\lambda|\,\|u\|$ for $\lambda \in \mathbb{K}$ and $u \in V$;
(iii) $\|u + v\| \le \|u\| + \|v\|$ for $u, v \in V$.
A norm induces a metric, see Defn. 2.1, on $V$ by setting $d(u, v) = \|u - v\|$. When $V$ is complete, see Defn. 2.6, for this metric, we say that $V$ is a Banach space.
Theorem 11.1. Every finite-dimensional normed vector space is a Banach space.
We will use the following simple inequality:
Lemma 11.2 (Young's inequality). Let two real numbers $1 < p, q < \infty$ be related through $\frac{1}{p} + \frac{1}{q} = 1$, then
$$(11.1)\qquad |ab| \le \frac{|a|^p}{p} + \frac{|b|^q}{q},$$
for any complex $a$ and $b$.
First proof: analytic. It is enough to consider positive reals $a$ and $b$. Consider the function $\phi(t) = t^m - mt$ for $0 < m < 1$. From its derivative $\phi'(t) = m(t^{m-1} - 1)$ we find the only critical point $t = 1$ on $[0, \infty)$, which is its maximum. Thus write the inequality $\phi(t) \le \phi(1)$ for $t = a^p/b^q$ and $m = 1/p$. After a transformation we get $a \cdot b^{-q/p} - 1 \le \frac{1}{p}(a^p b^{-q} - 1)$ and multiplication by $b^q$ with rearrangements leads to the desired result.
Second proof: geometric. Consider the plane with coordinates $(x, y)$ and take the curve $y = x^{p-1}$ which is the same as $x = y^{q-1}$. Comparing areas on the figure:
[Figure: the curve $y = x^{p-1}$; $S_1$ is the area between the curve and the $x$-axis for $0 \le x \le a$, and $S_2$ is the area between the curve and the $y$-axis for $0 \le y \le b$.]
we see that $S_1 + S_2 \ge ab$ for any positive reals $a$ and $b$. Elementary integration shows:
$$S_1 = \int_0^a x^{p-1}\,dx = \frac{a^p}{p}, \qquad S_2 = \int_0^b y^{q-1}\,dy = \frac{b^q}{q}.$$
This finishes the demonstration.
Remark 11.3. You may notice that both proofs introduced some specific auxiliary functions related to $x^p/p$. It is a fruitful generalisation to conduct the proofs for more functions and derive the respective forms of Young's inequality.
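Proposition 11.4 (Hölder's Inequality). For $1 < p < \infty$, let $q \in (1, \infty)$ be such that $1/p + 1/q = 1$. For $n \ge 1$ and $u, v \in \mathbb{K}^n$, we have that
$$\sum_{j=1}^{n} |u_j v_j| \le \Bigl(\sum_{j=1}^{n} |u_j|^p\Bigr)^{1/p} \Bigl(\sum_{j=1}^{n} |v_j|^q\Bigr)^{1/q}.$$
Proof. For reasons which will become clear soon we use the notation $\|u\|_p = \bigl(\sum_{j=1}^{n} |u_j|^p\bigr)^{1/p}$ and $\|v\|_q = \bigl(\sum_{j=1}^{n} |v_j|^q\bigr)^{1/q}$. The statement is trivial if $u = 0$ or $v = 0$; otherwise define for $1 \le i \le n$:
$$a_i = \frac{u_i}{\|u\|_p} \quad\text{and}\quad b_i = \frac{v_i}{\|v\|_q}.$$
Summing up for $1 \le i \le n$ all inequalities obtained from (11.1):
$$|a_i b_i| \le \frac{|a_i|^p}{p} + \frac{|b_i|^q}{q},$$
and using $\sum |a_i|^p = \sum |b_i|^q = 1$, we obtain $\sum |a_i b_i| \le \frac{1}{p} + \frac{1}{q} = 1$, which is the result after multiplication by $\|u\|_p \|v\|_q$.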
Using Hölder's inequality we can derive the following one:
Proposition 11.5 (Minkowski's Inequality). For $1 < p < \infty$, and $n \ge 1$, let $u, v \in \mathbb{K}^n$. Then
$$\Bigl(\sum_{j=1}^{n} |u_j + v_j|^p\Bigr)^{1/p} \le \Bigl(\sum_{j=1}^{n} |u_j|^p\Bigr)^{1/p} + \Bigl(\sum_{j=1}^{n} |v_j|^p\Bigr)^{1/p}.$$
Proof. For $p > 1$ we have:
$$(11.2)\qquad \sum_{1}^{n} |u_k + v_k|^p \le \sum_{1}^{n} |u_k|\,|u_k + v_k|^{p-1} + \sum_{1}^{n} |v_k|\,|u_k + v_k|^{p-1}.$$
By Hölder's inequality
$$\sum_{1}^{n} |u_k|\,|u_k + v_k|^{p-1} \le \Bigl(\sum_{1}^{n} |u_k|^p\Bigr)^{1/p} \Bigl(\sum_{1}^{n} |u_k + v_k|^{q(p-1)}\Bigr)^{1/q}.$$
Adding a similar inequality for the second term in the right hand side of (11.2), noting that $q(p - 1) = p$, and dividing by $\bigl(\sum_{1}^{n} |u_k + v_k|^{q(p-1)}\bigr)^{1/q}$ yields the result.
Minkowski's inequality shows that for $1 \le p < \infty$ (the case $p = 1$ is easy) we can define a norm $\|\cdot\|_p$ on $\mathbb{K}^n$ by
$$\|u\|_p = \Bigl(\sum_{j=1}^{n} |u_j|^p\Bigr)^{1/p} \qquad (u = (u_1, \ldots, u_n) \in \mathbb{K}^n).$$
See Figure 2 for an illustration of various norms of this type defined in $\mathbb{R}^2$.
We can define an infinite analogue of this. Let $1 \le p < \infty$, let $\ell_p$ be the space of all scalar sequences $(x_n)$ with $\sum_n |x_n|^p < \infty$. A careful use of Minkowski's inequality shows that $\ell_p$ is a vector space. Then $\ell_p$ becomes a normed space for the $\|\cdot\|_p$ norm. Note also that $\ell_2$ is the Hilbert space introduced before in Example 2.12(ii).
Recall that a Cauchy sequence, see Defn. 2.5, in a normed space is bounded: if $(x_n)$ is Cauchy then we can find $N$ with $\|x_n - x_m\| < 1$ for all $n, m \ge N$. Then $\|x_n\| \le \|x_n - x_N\| + \|x_N\| < \|x_N\| + 1$ for $n \ge N$, so in particular, $\|x_n\| \le \max(\|x_1\|, \|x_2\|, \ldots, \|x_{N-1}\|, \|x_N\| + 1)$.
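Theorem 11.6. For $1 \le p < \infty$, the space $\ell_p$ is a Banach space.
Proof. Most completeness proofs are similar to this, see Thm. 2.24 which we repeat here changing $2$ to $p$. Let $(x^{(n)})$ be a Cauchy sequence in $\ell_p$; we wish to show this converges to some vector in $\ell_p$.
For each $n$, $x^{(n)} \in \ell_p$ so is a sequence of scalars, say $(x_k^{(n)})_{k=1}^{\infty}$. As $(x^{(n)})$ is Cauchy, for each $\epsilon > 0$ there exists $N_\epsilon$ so that $\bigl\|x^{(n)} - x^{(m)}\bigr\|_p \le \epsilon$ for $n, m \ge N_\epsilon$.
For $k$ fixed,
$$\bigl|x_k^{(n)} - x_k^{(m)}\bigr| \le \Bigl(\sum_j \bigl|x_j^{(n)} - x_j^{(m)}\bigr|^p\Bigr)^{1/p} = \bigl\|x^{(n)} - x^{(m)}\bigr\|_p \le \epsilon,$$
when $n, m \ge N_\epsilon$. Thus the scalar sequence $(x_k^{(n)})_{n=1}^{\infty}$ is Cauchy in $\mathbb{K}$ and hence converges, to $x_k$ say. Let $x = (x_k)$, so that $x$ is a candidate for the limit of $(x^{(n)})$.
Firstly, we check that $x - x^{(n)} \in \ell_p$ for some $n$. Indeed, for a given $\epsilon > 0$ find $n_0$ such that $\bigl\|x^{(n)} - x^{(m)}\bigr\| < \epsilon$ for all $n, m > n_0$. For any $K$ and $m$:
$$\sum_{k=1}^{K} \bigl|x_k^{(n)} - x_k^{(m)}\bigr|^p \le \bigl\|x^{(n)} - x^{(m)}\bigr\|^p < \epsilon^p.$$
Let $m \to \infty$, then $\sum_{k=1}^{K} \bigl|x_k^{(n)} - x_k\bigr|^p \le \epsilon^p$.
Let $K \to \infty$, then $\sum_{k=1}^{\infty} \bigl|x_k^{(n)} - x_k\bigr|^p \le \epsilon^p$. Thus $x^{(n)} - x \in \ell_p$ and because $\ell_p$ is a linear space then $x = x^{(n)} - (x^{(n)} - x)$ is also in $\ell_p$.
Finally, we saw above that for any $\epsilon > 0$ there is $n_0$ such that $\bigl\|x^{(n)} - x\bigr\| \le \epsilon$ for all $n > n_0$. Thus $x^{(n)} \to x$.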
For $p = \infty$, there are two analogies to the $\ell_p$ spaces. First, we define $\ell_\infty$ to be the vector space of all bounded scalar sequences, with the sup-norm ($\|\cdot\|_\infty$-norm):
$$(11.3)\qquad \|(x_n)\|_\infty = \sup_{n \in \mathbb{N}} |x_n| \qquad ((x_n) \in \ell_\infty).$$
Second, we define $c_0$ to be the space of all scalar sequences $(x_n)$ which converge to $0$. We equip $c_0$ with the sup norm (11.3). This is defined, as if $x_n \to 0$, then $(x_n)$ is bounded. Hence $c_0$ is a subspace of $\ell_\infty$, and we can check (exercise!) that $c_0$ is closed.
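Theorem 11.7. The spaces $c_0$ and $\ell_\infty$ are Banach spaces.
Proof. This is another variant of the previous proof of Thm. 11.6. We do the $\ell_\infty$ case. Again, let $(x^{(n)})$ be a Cauchy sequence in $\ell_\infty$, and for each $n$, let $x^{(n)} = (x_k^{(n)})_{k=1}^{\infty}$. For $\epsilon > 0$ we can find $N$ such that $\bigl\|x^{(n)} - x^{(m)}\bigr\|_\infty < \epsilon$ for $n, m \ge N$. Thus, for any $k$, we see that $\bigl|x_k^{(n)} - x_k^{(m)}\bigr| < \epsilon$ when $n, m \ge N$. So $(x_k^{(n)})_{n=1}^{\infty}$ is Cauchy, and hence converges, say to $x_k \in \mathbb{K}$. Let $x = (x_k)$.
Let $m \ge N$, so that for any $k$, we have that
$$\bigl|x_k - x_k^{(m)}\bigr| = \lim_{n\to\infty} \bigl|x_k^{(n)} - x_k^{(m)}\bigr| \le \epsilon.$$
As $k$ was arbitrary, we see that $\sup_k \bigl|x_k - x_k^{(m)}\bigr| \le \epsilon$. So, firstly, this shows that $(x - x^{(m)}) \in \ell_\infty$, and so also $x = (x - x^{(m)}) + x^{(m)} \in \ell_\infty$. Secondly, we have shown that $\bigl\|x - x^{(m)}\bigr\|_\infty \le \epsilon$ when $m \ge N$, so $x^{(m)} \to x$ in norm.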
Example 11.8. We can also consider a Banach space of functions $L_p[a, b]$ with the norm
$$\|f\|_p = \Bigl(\int_a^b |f(t)|^p\,dt\Bigr)^{1/p}.$$
See the discussion after Defn. 2.22 for a realisation of such spaces.
11.2. Bounded linear operators. Recall what a linear map is, see Defn. 6.1. A linear map is often called an operator. A linear map $T : E \to F$ between normed spaces is bounded if there exists $M > 0$ such that $\|T(x)\| \le M\|x\|$ for $x \in E$, see Defn. 6.3. We write $B(E, F)$ for the set of bounded operators from $E$ to $F$. For the natural operations, $B(E, F)$ is a vector space. We norm $B(E, F)$ by setting
$$(11.4)\qquad \|T\| = \sup\Bigl\{\frac{\|T(x)\|}{\|x\|} : x \in E,\ x \ne 0\Bigr\}.$$
Exercise 11.9. Show that
(i) The expression (11.4) is a norm in the sense of Defn. 2.3.
(ii) We equivalently have
$$\|T\| = \sup\{\|T(x)\| : x \in E,\ \|x\| \le 1\} = \sup\{\|T(x)\| : x \in E,\ \|x\| = 1\}.$$
Proposition 11.10. For a linear map $T : E \to F$ between normed spaces, the following are equivalent:
(i) $T$ is continuous (for the metrics induced by the norms on $E$ and $F$);
(ii) $T$ is continuous at $0$;
(iii) $T$ is bounded.
Proof. The proof essentially follows the proof of the similar Theorem 5.4. See also the discussion about the usefulness of this theorem there.
Theorem 11.11. Let $E$ be a normed space, and let $F$ be a Banach space. Then $B(E, F)$ is a Banach space.
Proof. In essence, we follow the same three-step procedure as in Thms. 2.24, 11.6 and 11.7. Let $(T_n)$ be a Cauchy sequence in $B(E, F)$. For $x \in E$, check that $(T_n(x))$ is Cauchy in $F$, and hence converges to, say, $T(x)$, as $F$ is complete. Then check that $T : E \to F$ is linear, bounded, and that $\|T_n - T\| \to 0$.
We write $B(E)$ for $B(E, E)$. For normed spaces $E$, $F$ and $G$, and for $T \in B(E, F)$ and $S \in B(F, G)$, we have that $ST = S \circ T \in B(E, G)$ with $\|ST\| \le \|S\|\,\|T\|$.
For $T \in B(E, F)$, if there exists $S \in B(F, E)$ with $ST = I_E$, the identity of $E$, and $TS = I_F$, then $T$ is said to be invertible, and we write $T^{-1} = S$. In this case, we say that $E$ and $F$ are isomorphic spaces, and that $T$ is an isomorphism.
If $\|T(x)\| = \|x\|$ for each $x \in E$, we say that $T$ is an isometry. If additionally $T$ is an isomorphism, then $T$ is an isometric isomorphism, and we say that $E$ and $F$ are isometrically isomorphic.
11.3. Dual Spaces. Let $E$ be a normed vector space, and let $E^*$ (also written $E'$) be $B(E, \mathbb{K})$, the space of bounded linear maps from $E$ to $\mathbb{K}$, which we call functionals, or more correctly, bounded linear functionals, see Defn. 5.1. Notice that as $\mathbb{K}$ is complete, the above theorem shows that $E^*$ is always a Banach space.
Theorem 11.12. Let $1 < p < \infty$, and again let $q$ be such that $1/p + 1/q = 1$. Then the map $\ell_q \to (\ell_p)^* : u \mapsto \phi_u$ is an isometric isomorphism, where $\phi_u$ is defined, for $u = (u_j) \in \ell_q$, by
$$\phi_u(x) = \sum_{j=1}^{\infty} u_j x_j \qquad \bigl(x = (x_j) \in \ell_p\bigr).$$
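Proof. By Hölder's inequality, we see that
$$|\phi_u(x)| \le \sum_{j=1}^{\infty} |u_j|\,|x_j| \le \Bigl(\sum_{j=1}^{\infty} |u_j|^q\Bigr)^{1/q} \Bigl(\sum_{j=1}^{\infty} |x_j|^p\Bigr)^{1/p} = \|u\|_q \|x\|_p.$$
So the sum converges, and hence $\phi_u$ is defined. Clearly $\phi_u$ is linear, and the above estimate also shows that $\|\phi_u\| \le \|u\|_q$. The map $u \mapsto \phi_u$ is also clearly linear, and we've just shown that it is norm-decreasing.
Now let $\phi \in (\ell_p)^*$. For each $n$, let $e_n = (0, \ldots, 0, 1, 0, \ldots)$ with the $1$ in the $n$th position. Then, for $x = (x_n) \in \ell_p$,
$$\Bigl\|x - \sum_{k=1}^{n} x_k e_k\Bigr\|_p = \Bigl(\sum_{k=n+1}^{\infty} |x_k|^p\Bigr)^{1/p} \to 0,$$
as $n \to \infty$. As $\phi$ is continuous, we see that
$$\phi(x) = \lim_{n\to\infty} \sum_{k=1}^{n} \phi(x_k e_k) = \sum_{k=1}^{\infty} x_k \phi(e_k).$$
Let $u_k = \phi(e_k)$ for each $k$. If $u = (u_k) \in \ell_q$ then we would have that $\phi = \phi_u$.
Let us fix $N \in \mathbb{N}$, and define
$$x_k = \begin{cases} 0, & \text{if } u_k = 0 \text{ or } k > N; \\ \overline{u_k}\,|u_k|^{q-2}, & \text{if } u_k \ne 0 \text{ and } k \le N. \end{cases}$$
Then we see that
$$\sum_{k=1}^{\infty} |x_k|^p = \sum_{k=1}^{N} |u_k|^{p(q-1)} = \sum_{k=1}^{N} |u_k|^q,$$
as $p(q - 1) = q$. Then, by the previous paragraph,
$$\phi(x) = \sum_{k=1}^{\infty} x_k u_k = \sum_{k=1}^{N} |u_k|^q.$$
Hence
$$\|\phi\| \ge \frac{|\phi(x)|}{\|x\|_p} = \Bigl(\sum_{k=1}^{N} |u_k|^q\Bigr)^{1 - 1/p} = \Bigl(\sum_{k=1}^{N} |u_k|^q\Bigr)^{1/q}.$$
By letting $N \to \infty$, it follows that $u \in \ell_q$ with $\|u\|_q \le \|\phi\|$. So $\phi = \phi_u$ and $\|\phi\| = \|\phi_u\| \le \|u\|_q$. Hence every element of $(\ell_p)^*$ arises as $\phi_u$ for some $u$, and also $\|\phi_u\| = \|u\|_q$.
Loosely speaking, we say that $\ell_q = (\ell_p)^*$, although we should always be careful to keep in mind the exact map which gives this.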
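The key step of the proof, the choice $x_k = \overline{u_k}\,|u_k|^{q-2}$ which turns Hölder's inequality into an equality, can be tested on a finite vector. The sketch below is an illustration only: the sample vector `u`, the exponent `q = 3` and the helper names are assumptions.

```python
def p_norm(u, p):
    """(sum |u_k|^p)^(1/p) for a finite sequence of real or complex numbers."""
    return sum(abs(a) ** p for a in u) ** (1.0 / p)


def extremal_vector(u, q):
    """x_k = conj(u_k) |u_k|^(q-2) where u_k != 0, and x_k = 0 otherwise."""
    return [0 if a == 0 else a.conjugate() * abs(a) ** (q - 2) for a in u]


def pairing(u, x):
    """phi_u(x) = sum u_k x_k."""
    return sum(a * b for a, b in zip(u, x))


q = 3.0
p = q / (q - 1.0)                      # conjugate exponent, 1/p + 1/q = 1
u = [1 + 2j, -3j, 0.5, 0j]
x = extremal_vector(u, q)

ratio = abs(pairing(u, x)) / p_norm(x, p)   # |phi_u(x)| / ||x||_p
```

The ratio coincides with $\|u\|_q$ up to rounding, so the bound $\|\phi_u\| \le \|u\|_q$ from Hölder's inequality is attained on this $x$.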
Corollary 11.13 (Riesz-Frechet Self-duality Lemma 5.10). $\ell_2$ is self-dual: $\ell_2 = \ell_2^*$.
Similarly, we can show that $c_0^* = \ell_1$ and that $(\ell_1)^* = \ell_\infty$ (the implementing isometric isomorphism is given by the same summation formula).
11.4. Hahn-Banach Theorem. Mathematical induction is a well known method to prove statements depending on a natural number. Mathematical induction is based on the following property of natural numbers: any non-empty subset of $\mathbb{N}$ has a least element. This observation can be generalised to the transfinite induction described as follows.
A poset is a set $X$ with a relation $\preceq$ such that $a \preceq a$ for all $a \in X$, if $a \preceq b$ and $b \preceq a$ then $a = b$, and if $a \preceq b$ and $b \preceq c$, then $a \preceq c$. We say that $(X, \preceq)$ is total if for every $a, b \in X$, either $a \preceq b$ or $b \preceq a$. For a subset $S \subseteq X$, an element $a \in X$ is an upper bound for $S$ if $s \preceq a$ for every $s \in S$. An element $a \in X$ is maximal if whenever $b \in X$ is such that $a \preceq b$, then also $b \preceq a$.
Then Zorn's Lemma tells us that if $X$ is a non-empty poset such that every total subset has an upper bound, then $X$ has a maximal element. Really this is an axiom which we have to assume, in addition to the usual axioms of set theory. Zorn's Lemma is equivalent to the axiom of choice and Zermelo's theorem.
Theorem 11.14 (HahnBanach Theorem). Let E be a normed vector space, and let
F E be a subspace. Let F

. Then there exists E

with || || and
(x) = (x) for each x F.
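Proof. We do the real case. An "extension" of $\phi$ is a bounded linear map $\phi_G : G \to \mathbb{R}$ such that $F \subseteq G \subseteq E$, $\phi_G(x) = \phi(x)$ for $x \in F$, and $\|\phi_G\| \le \|\phi\|$. We introduce a partial order on the pairs $(G, \phi_G)$ of subspaces and functionals as follows: $(G_1, \phi_{G_1}) \preceq (G_2, \phi_{G_2})$ if and only if $G_1 \subseteq G_2$ and $\phi_{G_1}(x) = \phi_{G_2}(x)$ for all $x \in G_1$. A Zorn's Lemma argument shows that a maximal extension $\phi_G : G \to \mathbb{R}$ exists. We shall show that if $G \neq E$, then we can extend $\phi_G$, a contradiction.
Let $x \notin G$ and, to lighten notation, write $\phi$ for $\phi_G$. An extension $\phi_1$ of $\phi$ to the linear span of $G$ and $x$ must have the form
$$\phi_1(\tilde{x} + ax) = \phi(\tilde{x}) + a\alpha \qquad (\tilde{x} \in G,\ a \in \mathbb{R}),$$
for some $\alpha \in \mathbb{R}$. Under this, $\phi_1$ is linear and extends $\phi$, but we also need to ensure that $\|\phi_1\| \le \|\phi\|$. That is, we need
$$|\phi(\tilde{x}) + a\alpha| \le \|\phi\| \, \|\tilde{x} + ax\| \qquad (\tilde{x} \in G,\ a \in \mathbb{R}). \tag{11.5}$$
This is straightforward for $a = 0$; otherwise, to simplify the proof, put $-ay = \tilde{x}$ in (11.5) and divide both sides of the inequality by $|a|$. Thus we need to show that there exists $\alpha$ such that
$$|\alpha - \phi(y)| \le \|\phi\| \, \|x - y\| \qquad \text{for all } y \in G,$$
or
$$\phi(y) - \|\phi\| \, \|x - y\| \le \alpha \le \phi(y) + \|\phi\| \, \|x - y\|.$$
For any $y_1$ and $y_2$ in $G$ we have:
$$\phi(y_1) - \phi(y_2) \le \|\phi\| \, \|y_1 - y_2\| \le \|\phi\| \, (\|x - y_2\| + \|x - y_1\|).$$
Thus
$$\phi(y_1) - \|\phi\| \, \|x - y_1\| \le \phi(y_2) + \|\phi\| \, \|x - y_2\|.$$
As $y_1$ and $y_2$ were arbitrary,
$$\sup_{y \in G} \big(\phi(y) - \|\phi\| \, \|x - y\|\big) \le \inf_{y \in G} \big(\phi(y) + \|\phi\| \, \|x - y\|\big).$$
Hence we can choose $\alpha$ between the sup and the inf.
The complex case follows by "complexification".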
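The one-step extension in the proof can be tried out numerically. The sketch below is only an illustration under assumptions that are not in the notes: $E = \mathbb{R}^2$ with the $\ell^1$ norm, $G = \{(t, 0)\}$, $\phi(t, 0) = t$ (so $\|\phi\| = 1$) and $x = (0, 1)$. The supremum and infimum are taken over a finite grid of sample points of $G$, which happens to give the exact values here.

```python
from fractions import Fraction

def norm1(v):
    """l^1 norm on R^2 (an illustrative choice of norm, not from the notes)."""
    return abs(v[0]) + abs(v[1])

def phi(t):
    """phi((t, 0)) = t on G = {(t, 0)}; its norm with respect to norm1 is 1."""
    return t

PHI_NORM = 1
x = (Fraction(0), Fraction(1))                     # a vector outside G
grid = [Fraction(k, 10) for k in range(-50, 51)]   # sample points y = (t, 0) of G

def dist_to_x(t):
    """||x - y|| for y = (t, 0)."""
    return norm1((x[0] - t, x[1]))

# sup of phi(y) - ||phi|| ||x - y|| and inf of phi(y) + ||phi|| ||x - y|| over the grid
lower = max(phi(t) - PHI_NORM * dist_to_x(t) for t in grid)
upper = min(phi(t) + PHI_NORM * dist_to_x(t) for t in grid)

alpha = (lower + upper) / 2                        # any value in [lower, upper] would do

def phi1(t, a):
    """Extension phi1((t, 0) + a x) = phi((t, 0)) + a * alpha."""
    return phi(t) + a * alpha

# the extension does not increase the norm on the sampled vectors (t, a)
bound_holds = all(abs(phi1(t, a)) <= PHI_NORM * norm1((t, a))
                  for t in grid for a in grid)
```

In this toy case every $\alpha \in [-1, 1]$ gives a norm-preserving extension, which illustrates that the extension provided by the theorem need not be unique.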
The Hahn-Banach theorem tells us that a functional from a subspace can be
extended to the whole space without increasing the norm. In particular, extending
a functional on a one-dimensional subspace yields the following.
Corollary 11.15. Let $E$ be a normed vector space, and let $x \in E$. Then there exists $\phi \in E^*$ with $\|\phi\| = 1$ and $\phi(x) = \|x\|$.


Another useful result which can be proved by Hahn-Banach is the following.
Corollary 11.16. Let $E$ be a normed vector space, and let $F$ be a subspace of $E$. For $x \in E$, the following are equivalent:
(i) $x \in \overline{F}$, the closure of $F$;
(ii) for each $\phi \in E^*$ with $\phi(y) = 0$ for each $y \in F$, we have that $\phi(x) = 0$.


Proof. 11.16(i) $\Rightarrow$ 11.16(ii) follows because we can find a sequence $(y_n)$ in $F$ with $y_n \to x$; then it is immediate that $\phi(x) = 0$, because $\phi$ is continuous. Conversely, we show that if 11.16(i) doesn't hold then 11.16(ii) doesn't hold (that is, the contrapositive to 11.16(ii) $\Rightarrow$ 11.16(i)).
So, $x \notin \overline{F}$. Define $\psi : \operatorname{lin}\{F, x\} \to \mathbb{K}$ by
$$\psi(y + tx) = t \qquad (y \in F,\ t \in \mathbb{K}).$$
This is well-defined, for if $y + tx = y' + t'x$ then either $t = t'$, or otherwise $x = (t - t')^{-1}(y' - y) \in F$, which is a contradiction. The map $\psi$ is obviously linear, so we need to show that it is bounded. Towards a contradiction, suppose that $\psi$ is not bounded, so we can find a sequence $(y_n + t_n x)$ with $\|y_n + t_n x\| \le 1$ for each $n$, and yet $|\psi(y_n + t_n x)| = |t_n| \to \infty$. Then $\|t_n^{-1} y_n + x\| \le 1/|t_n| \to 0$, so that the sequence $(-t_n^{-1} y_n)$, which is in $F$, converges to $x$. So $x$ is in the closure of $F$, a contradiction. So $\psi$ is bounded. By the Hahn-Banach theorem, we can find some $\phi \in E^*$ extending $\psi$. For $y \in F$, we have $\phi(y) = \psi(y) = 0$, while $\phi(x) = \psi(x) = 1$, so 11.16(ii) doesn't hold, as required.
We define $E^{**} = (E^*)^*$ to be the bidual of $E$, and define $J : E \to E^{**}$ as follows. For $x \in E$, $J(x)$ should be in $E^{**}$, that is, a map $E^* \to \mathbb{K}$. We define this to be the map $\phi \mapsto \phi(x)$ for $\phi \in E^*$. We write this as
$$J(x)(\phi) = \phi(x) \qquad (x \in E,\ \phi \in E^*).$$
Corollary 11.15 shows that $J$ is an isometry; when $J$ is surjective (that is, when $J$ is an isomorphism), we say that $E$ is reflexive. For example, $\ell_p$ is reflexive for $1 < p < \infty$.
11.5. C(X) Spaces. This section is not examinable. Standard facts about topology
will be used in later sections of the course.
All our topological spaces are assumed Hausdorff. Let $X$ be a compact space, and let $C_{\mathbb{K}}(X)$ be the space of continuous functions from $X$ to $\mathbb{K}$, with pointwise operations, so that $C_{\mathbb{K}}(X)$ is a vector space. We norm $C_{\mathbb{K}}(X)$ by setting
$$\|f\|_\infty = \sup_{x \in X} |f(x)| \qquad (f \in C_{\mathbb{K}}(X)).$$
Theorem 11.17. Let $X$ be a compact space. Then $C_{\mathbb{K}}(X)$ is a Banach space.
Let $E$ be a vector space, and let $\|\cdot\|_{(1)}$ and $\|\cdot\|_{(2)}$ be norms on $E$. These norms are equivalent if there exists $m > 0$ with
$$m^{-1} \|x\|_{(2)} \le \|x\|_{(1)} \le m \|x\|_{(2)} \qquad (x \in E).$$
Theorem 11.18. Let $E$ be a finite-dimensional vector space with basis $\{e_1, \ldots, e_n\}$, so we can identify $E$ with $\mathbb{K}^n$ as vector spaces, and hence talk about the norm $\|\cdot\|_2$ on $E$. If $\|\cdot\|$ is any norm on $E$, then $\|\cdot\|$ and $\|\cdot\|_2$ are equivalent.
Corollary 11.19. Let $E$ be a finite-dimensional normed space. Then a subset $X \subseteq E$ is compact if and only if it is closed and bounded.
Lemma 11.20. Let $E$ be a normed vector space, and let $F$ be a closed subspace of $E$ with $E \neq F$. For $0 < \theta < 1$, we can find $x_0 \in E$ with $\|x_0\| \le 1$ and $\|x_0 - y\| > \theta$ for $y \in F$.
Theorem 11.21. Let $E$ be an infinite-dimensional normed vector space. Then the closed unit ball of $E$, the set $\{x \in E : \|x\| \le 1\}$, is not compact.
Proof. Use the above lemma to construct a sequence $(x_n)$ in the closed unit ball of $E$ with, say, $\|x_n - x_m\| \ge 1/2$ for each $n \neq m$. Then $(x_n)$ can have no convergent subsequence, and so the closed unit ball cannot be compact.
12. MEASURE THEORY
The presentation in this section is close to [35].
12.1. Basic Measure Theory.
Definition 12.1. Let $X$ be a set. A $\sigma$-algebra on $X$ is a collection of subsets of $X$, say $R \subseteq 2^X$, such that
(i) $X \in R$;
(ii) if $A, B \in R$, then $A \setminus B \in R$;
(iii) if $(A_n)$ is any sequence in $R$, then $\bigcup_n A_n \in R$.
Note that in the third condition we admit any countable unions. The usage of "$\sigma$" in the names of $\sigma$-algebra and $\sigma$-ring is a reference to this. If we replace the condition by
(iii') if $(A_n)_1^m$ is any finite family in $R$, then $\bigcup_{n=1}^m A_n \in R$;
then we obtain the definition of an algebra.
For a $\sigma$-algebra $R$ and $A, B \in R$, we have
$$A \cap B = X \setminus (X \setminus (A \cap B)) = X \setminus ((X \setminus A) \cup (X \setminus B)) \in R.$$
Similarly, $R$ is closed under taking (countably) infinite intersections.
If we drop the first condition from the definition of a ($\sigma$-)algebra (but keep the above conclusion from it!) we get a ($\sigma$-)ring, that is, a ($\sigma$-)ring is closed under (countable) unions, (countable) intersections and subtractions of sets.
Exercise 12.2. Show that the empty set belongs to any non-empty ring.
Sets $A_k$ are pairwise disjoint if $A_n \cap A_m = \varnothing$ for $n \neq m$. We denote the union of pairwise disjoint sets by $\sqcup$, e.g. $A \sqcup B \sqcup C$.
It is easy to work with a vector space through its basis. For a ring of sets the following notion works as a helpful "basis".
Definition 12.3. A semiring $S$ of sets is a collection such that
(i) it is closed under intersection;
(ii) for $A, B \in S$ we have $A \setminus B = C_1 \sqcup \ldots \sqcup C_N$ with $C_k \in S$.
Again, any non-empty semiring contains the empty set.
Example 12.4. The following are semirings but not rings:
(i) The collection of intervals $[a, b)$ on the real line;
(ii) The collection of all rectangles $\{a \le x < b,\ c \le y < d\}$ on the plane.
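For the intervals of Example 12.4(i), the semiring property can be made concrete: a difference of two half-open intervals is a disjoint union of at most two half-open intervals. The following is a minimal sketch; the function name and the representation of $[a, b)$ as a pair `(a, b)` are choices made here for illustration only.

```python
def interval_difference(a, b, c, d):
    """Return [a, b) minus [c, d) as a list of disjoint half-open intervals (lo, hi).

    Assumes a < b and c < d; each returned pair (lo, hi) stands for [lo, hi).
    """
    pieces = []
    left_hi = min(b, c)      # part of [a, b) lying to the left of c
    if a < left_hi:
        pieces.append((a, left_hi))
    right_lo = max(a, d)     # part of [a, b) lying to the right of d
    if right_lo < b:
        pieces.append((right_lo, b))
    return pieces
```

For instance, removing $[3, 5)$ from $[0, 10)$ leaves $[0, 3) \sqcup [5, 10)$, which is not a single interval; this is why the intervals form a semiring but not a ring.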
As the intersection of a family of $\sigma$-algebras is again a $\sigma$-algebra, and the power set $2^X$ is a $\sigma$-algebra, it follows that given any collection $D \subseteq 2^X$, there is a $\sigma$-algebra $R$ such that $D \subseteq R$, and such that if $S$ is any other $\sigma$-algebra with $D \subseteq S$, then $R \subseteq S$. We call $R$ the $\sigma$-algebra generated by $D$.
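On a finite set every algebra is automatically a $\sigma$-algebra, so the generated $\sigma$-algebra can be computed by brute force: keep adding complements and pairwise unions until nothing new appears. The sketch below does exactly this; the function name is hypothetical and the approach is only practical for very small $X$.

```python
from itertools import combinations

def generated_algebra(X, D):
    """Smallest family containing D that is closed under complement and union.

    X is a finite set and D an iterable of subsets of X. On a finite X this
    family is the sigma-algebra generated by D.
    """
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in D}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for A in current:                      # close under complements
            complement = X - A
            if complement not in family:
                family.add(complement)
                changed = True
        for A, B in combinations(current, 2):  # close under pairwise unions
            union = A | B
            if union not in family:
                family.add(union)
                changed = True
    return family

family = generated_algebra({1, 2, 3, 4}, [{1}, {2}])
```

Here the generated family consists of all unions of the three "atoms" $\{1\}$, $\{2\}$ and $\{3, 4\}$.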
Exercise 12.5. Let $S$ be a semiring. Show that
(i) The collection of all finite disjoint unions $\sqcup_{k=1}^n A_k$, where $A_k \in S$, is a ring. We call it the ring $R(S)$ generated by the semiring $S$.
(ii) Any ring containing $S$ contains $R(S)$ as well.
We introduce the symbols $+\infty$, $-\infty$, and treat these as being "extended real numbers", so $-\infty < t < \infty$ for $t \in \mathbb{R}$. We define $t + \infty = \infty$, $t \cdot \infty = \infty$ if $t > 0$ and so forth. We do not (and cannot, in a consistent manner) define $\infty - \infty$ or $0 \cdot \infty$.
Definition 12.6. A measure is a map $\mu : R \to [0, \infty]$ defined on a (semi-)ring (or $\sigma$-algebra) $R$, such that if $A = \sqcup_n A_n$ for $A \in R$ and a finite family $(A_n)$ of pairwise disjoint sets in $R$, then $\mu(A) = \sum_n \mu(A_n)$. This property is called additivity of a measure.
Exercise 12.7. Show that the following two conditions are equivalent:
(i) $\mu(\varnothing) = 0$.
(ii) There is a set $A \in R$ such that $\mu(A) < \infty$.
The first condition often (but not always) is included in the definition of a measure.
In analysis we are interested in infinities and limits, thus the following extension of additivity is very important.
Definition 12.8. In terms of the previous definition we say that $\mu$ is countably additive (or $\sigma$-additive) if for any countable family $(A_n)$ of pairwise disjoint sets from $R$ such that $A = \sqcup_n A_n \in R$ we have $\mu(A) = \sum_n \mu(A_n)$. If the sum diverges, then, as it is a sum of non-negative numbers, we can without problem define it to be $+\infty$.
Example 12.9. (i) Fix a point $a \in \mathbb{R}$ and define a measure $\mu$ by the condition $\mu(A) = 1$ if $a \in A$ and $\mu(A) = 0$ otherwise.
(ii) For the ring obtained in Exercise 12.5 from the semiring $S$ in Example 12.4(i) define $\mu([a, b)) = b - a$ on $S$. This is a measure, and we will show its $\sigma$-additivity.
(iii) For the ring obtained in Exercise 12.5 from the semiring in Example 12.4(ii), define $\mu(V) = (b - a)(d - c)$ for the rectangle $V = \{a \le x < b,\ c \le y < d\} \in S$. It will again be a $\sigma$-additive measure.
We will see how to define a measure which is not $\sigma$-additive in Section 12.4.
Definition 12.10. A measure $\mu$ is finite if $\mu(A) < \infty$ for all $A \subseteq X$.
A measure $\mu$ is $\sigma$-finite if $X$ is a union of a countable number of sets $X_k$, such that the restriction of $\mu$ to each $X_k$ is finite.
Exercise 12.11. Modify Example 12.9(i) to obtain a measure which is not $\sigma$-finite.
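Proposition 12.12. Let $\mu$ be a measure on a $\sigma$-algebra $R$. Then:
(i) If $A, B \in R$ with $A \subseteq B$, then $\mu(A) \le \mu(B)$ [we call this property monotonicity of a measure];
(ii) If $A, B \in R$ with $A \subseteq B$ and $\mu(B) < \infty$, then $\mu(B \setminus A) = \mu(B) - \mu(A)$;
(iii) If $(A_n)$ is a sequence in $R$ with $A_1 \subseteq A_2 \subseteq A_3 \subseteq \cdots$, then
$$\lim_{n \to \infty} \mu(A_n) = \mu\Big(\bigcup_n A_n\Big).$$
(iv) If $(A_n)$ is a sequence in $R$ with $A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots$ and $\mu(A_m) < \infty$ for some $m$, then
$$\lim_{n \to \infty} \mu(A_n) = \mu\Big(\bigcap_n A_n\Big).$$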
12.2. Extension of Measures. From now on we consider only finite measures; an extension to $\sigma$-finite measures will be done later.
Proposition 12.13. Any measure $\mu'$ on a semiring $S$ is uniquely extended to a measure $\mu$ on the generated ring $R(S)$, see Ex. 12.5. If the initial measure was $\sigma$-additive, then the extension is $\sigma$-additive as well.
Proof. If an extension exists it shall satisfy $\mu(A) = \sum_{k=1}^n \mu'(A_k)$ whenever $A = \sqcup_{k=1}^n A_k$ with $A_k \in S$. We need to show two things for this definition:
(i) Consistency, i.e. independence of the value from a presentation of $A \in R(S)$ as $A = \sqcup_{k=1}^n A_k$, where $A_k \in S$. For two different presentations $A = \sqcup_{j=1}^n A_j$ and $A = \sqcup_{k=1}^m B_k$ define $C_{jk} = A_j \cap B_k$, which will be pairwise disjoint. By the additivity of $\mu'$ we have $\mu'(A_j) = \sum_k \mu'(C_{jk})$ and $\mu'(B_k) = \sum_j \mu'(C_{jk})$. Then
$$\sum_j \mu'(A_j) = \sum_j \sum_k \mu'(C_{jk}) = \sum_k \sum_j \mu'(C_{jk}) = \sum_k \mu'(B_k).$$
(ii) Additivity. For $A = \sqcup_{k=1}^n A_k$, where $A_k \in R(S)$, we can present $A_k = \sqcup_{j=1}^{n(k)} C_{jk}$, $C_{jk} \in S$. Thus $A = \sqcup_{k=1}^n \sqcup_{j=1}^{n(k)} C_{jk}$ and:
$$\mu(A) = \sum_{k=1}^n \sum_{j=1}^{n(k)} \mu'(C_{jk}) = \sum_{k=1}^n \mu(A_k).$$
Finally, we show the $\sigma$-additivity. For a set $A = \sqcup_{k=1}^\infty A_k$, where $A$ and $A_k \in R(S)$, find presentations $A = \sqcup_{j=1}^n B_j$, $B_j \in S$, and $A_k = \sqcup_{l=1}^{m(k)} B_{lk}$, $B_{lk} \in S$. Define $C_{jlk} = B_j \cap B_{lk} \in S$, then $B_j = \sqcup_{k=1}^\infty \sqcup_{l=1}^{m(k)} C_{jlk}$ and $A_k = \sqcup_{j=1}^n \sqcup_{l=1}^{m(k)} C_{jlk}$. Then, from the $\sigma$-additivity of $\mu'$:
$$\mu(A) = \sum_{j=1}^n \mu'(B_j) = \sum_{j=1}^n \sum_{k=1}^\infty \sum_{l=1}^{m(k)} \mu'(C_{jlk}) = \sum_{k=1}^\infty \sum_{j=1}^n \sum_{l=1}^{m(k)} \mu'(C_{jlk}) = \sum_{k=1}^\infty \mu(A_k),$$
where we changed the summation order in series with non-negative terms.
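The consistency step can be checked on a small instance. The sketch below takes the length of half-open intervals as the measure on the semiring and two different presentations of $A = [0, 3)$, both chosen here purely for illustration; it forms the common refinement $C_{jk} = A_j \cap B_k$ and compares the three sums.

```python
from fractions import Fraction

def length(interval):
    """Length of [a, b); an 'empty' pair with a >= b gets measure 0."""
    a, b = interval
    return max(b - a, Fraction(0))

def intersect(i1, i2):
    """Intersection of two half-open intervals, again as a pair (lo, hi)."""
    return (max(i1[0], i2[0]), min(i1[1], i2[1]))

# two presentations of A = [0, 3) as disjoint unions of elements of the semiring
pres_A = [(Fraction(0), Fraction(1)), (Fraction(1), Fraction(3))]
pres_B = [(Fraction(0), Fraction(2)), (Fraction(2), Fraction(5, 2)),
          (Fraction(5, 2), Fraction(3))]

# common refinement C_jk = A_j intersected with B_k
C = {(j, k): intersect(A_j, B_k)
     for j, A_j in enumerate(pres_A)
     for k, B_k in enumerate(pres_B)}

sum_A = sum(length(i) for i in pres_A)
sum_B = sum(length(i) for i in pres_B)
sum_C = sum(length(i) for i in C.values())
```

All three sums agree, as the double-sum argument in part (i) of the proof predicts.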
In a similar way we can extend a measure from a semiring to the corresponding $\sigma$-ring, however it can be done even for a larger family. To do that we will use the following notion.
Definition 12.14. Let $S$ be a semiring of subsets of $X$, and $\mu$ be a measure defined on $S$. An outer measure $\mu^*$ on $X$ is a map $\mu^* : 2^X \to [0, \infty]$ defined by:
$$\mu^*(A) = \inf\Big\{ \sum_k \mu(A_k) \ :\ A \subseteq \bigcup_k A_k,\ A_k \in S \Big\}.$$
Proposition 12.15. An outer measure has the following properties:
(i) $\mu^*(\varnothing) = 0$;
(ii) if $A \subseteq B$ then $\mu^*(A) \le \mu^*(B)$;
(iii) if $(A_n)$ is any sequence in $2^X$, then $\mu^*(\bigcup_n A_n) \le \sum_n \mu^*(A_n)$.
The final condition says that an outer measure is countably sub-additive. Note that an outer measure may fail to be a measure in the sense of Defn. 12.6 due to a lack of additivity.
Example 12.16. The Lebesgue outer measure on $\mathbb{R}$ is defined out of the measure from Example 12.9(ii), that is, for $A \subseteq \mathbb{R}$, as
$$\mu^*(A) = \inf\Big\{ \sum_{j=1}^\infty (b_j - a_j) \ :\ A \subseteq \bigcup_{j=1}^\infty [a_j, b_j) \Big\}.$$
We make this definition as, intuitively, the "length", or measure, of the interval $[a, b)$ is $(b - a)$.
For example, for the outer Lebesgue measure we have $\mu^*(A) = 0$ for any countable set, which follows as clearly $\mu^*(\{x\}) = 0$ for any $x \in \mathbb{R}$.
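The claim about countable sets rests on a standard covering trick: enumerate the points and cover the $j$-th one by an interval of length $\varepsilon / 2^{j+1}$, so that the total length stays below $\varepsilon$. The sketch below carries this out for finitely many rationals; the function name and the particular enumeration are illustrative assumptions, and the countable case follows by the same geometric series.

```python
from fractions import Fraction

def cover_points(points, eps):
    """Cover the j-th point (j = 0, 1, ...) by an interval of length eps / 2**(j + 1).

    Returns the list of intervals (lo, hi), read as [lo, hi), and their total length.
    """
    cover = []
    for j, x in enumerate(points):
        half = eps / 2 ** (j + 2)          # half of the length eps / 2**(j + 1)
        cover.append((x - half, x + half))
    total = sum(hi - lo for lo, hi in cover)
    return cover, total

# some rationals in [0, 1], listed with repetitions (harmless for a cover)
points = [Fraction(p, q) for q in range(1, 6) for p in range(q + 1)]
eps = Fraction(1, 1000)
cover, total = cover_points(points, eps)
```

Since `total` is below `eps` for every choice of `eps > 0`, the infimum in the definition of $\mu^*$ is $0$.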


Lemma 12.17. Let $a < b$. Then $\mu^*([a, b]) = b - a$.
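Proof. For $\varepsilon > 0$, as $[a, b] \subseteq [a, b + \varepsilon)$, we have that $\mu^*([a, b]) \le (b - a) + \varepsilon$. As $\varepsilon > 0$ was arbitrary, $\mu^*([a, b]) \le b - a$.
To show the opposite inequality we observe that $[a, b) \subseteq [a, b]$ and $\mu^*([a, b)) = b - a$ (because $[a, b)$ is in the semiring), so $\mu^*([a, b]) \ge b - a$ by 12.15(ii).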
Our next aim is to construct measures from outer measures. We use the notation $A \mathbin{\triangle} B = (A \cup B) \setminus (A \cap B)$ for the symmetric difference of sets.
Definition 12.18. Given an outer measure $\mu^*$ defined by a semiring $S$, we define $A \subseteq X$ to be Lebesgue measurable if for any $\varepsilon > 0$ there is a finite union $B$ of elements in $S$ (in other words: $B \in R(S)$), such that $\mu^*(A \mathbin{\triangle} B) < \varepsilon$.
Obviously all elements of $S$ are measurable. An alternative definition of a measurable set is due to Carathéodory.
Definition 12.19. Given an outer measure $\mu^*$, we define $E \subseteq X$ to be Carathéodory measurable if
$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \setminus E),$$
for any $A \subseteq X$.
As $\mu^*$ is sub-additive, this is equivalent to
$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \setminus E) \qquad (A \subseteq X),$$
as the other inequality is automatic.
Exercise* 12.20. Show that measurability in the sense of Lebesgue and in the sense of Carathéodory are equivalent.
Suppose now that the ring $R(S)$ is an algebra (i.e., contains the maximal element $X$). Then the outer measure of any set is finite, and the following theorem holds:
Theorem 12.21 (Lebesgue). Let $\mu^*$ be an outer measure on $X$ defined by a semiring $S$, and let $L$ be the collection of all Lebesgue measurable sets for $\mu^*$. Then $L$ is a $\sigma$-algebra, and if $\mu$ is the restriction of $\mu^*$ to $L$, then $\mu$ is a measure.
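Sketch of proof. First we show that $\mu^*(A) = \mu(A)$ for a set $A \in R(S)$. If $A \subseteq \bigcup_k A_k$ for $A_k \in S$, then $\mu(A) \le \sum_k \mu(A_k)$; taking the infimum we get $\mu(A) \le \mu^*(A)$. For the opposite inequality, any $A \in R(S)$ has a disjoint representation $A = \sqcup_k A_k$, $A_k \in S$, thus $\mu^*(A) \le \sum_k \mu(A_k) = \mu(A)$.
Now we will show that $R(S)$ is an incomplete metric space, with the measure $\mu$ being a uniformly continuous function on it. Measurable sets form the completion of $R(S)$, and the restriction of $\mu^*$ to them is the continuation of $\mu$ to the completion by continuity.
Define a distance between elements $A, B \in L$ as the outer measure of the symmetric difference of $A$ and $B$: $d(A, B) = \mu^*(A \mathbin{\triangle} B)$. Introduce the equivalence relation $A \sim B$ if $d(A, B) = 0$ and use the inclusion
$$A \mathbin{\triangle} B \subseteq (A \mathbin{\triangle} C) \cup (C \mathbin{\triangle} B)$$
for the triangle inequality. Then, by the definition, Lebesgue measurable sets form the closure of $R(S)$ with respect to this distance.
We can check that measurable sets form an algebra. To this end we need to make estimations, say, of $\mu^*((A_1 \cap A_2) \mathbin{\triangle} (B_1 \cap B_2))$ in terms of $\mu^*(A_i \mathbin{\triangle} B_i)$. A demonstration for any finite number of sets is performed through mathematical induction. The above two-sets case provides both the base and the step of the induction.
Now we show that $L$ is a $\sigma$-algebra. Let $A_k \in L$ and $A = \bigcup_k A_k$. Then for any $\varepsilon > 0$ there exist $B_k \in R(S)$ such that $\mu^*(A_k \mathbin{\triangle} B_k) < \varepsilon / 2^k$. Define $B = \bigcup_k B_k$. Then
$$\Big(\bigcup_k A_k\Big) \mathbin{\triangle} \Big(\bigcup_k B_k\Big) \subseteq \bigcup_k (A_k \mathbin{\triangle} B_k) \quad \text{implies} \quad \mu^*(A \mathbin{\triangle} B) < \varepsilon.$$
We cannot stop at this point since $B = \bigcup_k B_k$ may not be in $R(S)$. Thus, define $B'_1 = B_1$ and $B'_k = B_k \setminus \bigcup_{i=1}^{k-1} B_i$, so that the $B'_k$ are pairwise disjoint. Then $B = \sqcup_k B'_k$ and $B'_k \in R(S)$. From the convergence of the series there is $N$ such that $\sum_{k=N}^\infty \mu(B'_k) < \varepsilon$. Let $B' = \bigcup_{k=1}^N B'_k$, which is in $R(S)$. Then $\mu^*(B \mathbin{\triangle} B') \le \varepsilon$ and, thus, $\mu^*(A \mathbin{\triangle} B') \le 2\varepsilon$.
To check that $\mu^*$ is a measure on $L$ we use the following
Lemma 12.22. $|\mu^*(A) - \mu^*(B)| \le \mu^*(A \mathbin{\triangle} B)$, that is, $\mu^*$ is uniformly continuous in the metric $d(A, B)$.
Proof of the Lemma. Use the inclusions $A \subseteq B \cup (A \mathbin{\triangle} B)$ and $B \subseteq A \cup (A \mathbin{\triangle} B)$.
To show additivity take $A_{1,2} \in L$, $A = A_1 \sqcup A_2$, $B_{1,2} \in R(S)$ and $\mu^*(A_i \mathbin{\triangle} B_i) < \varepsilon$. Then $\mu^*(A \mathbin{\triangle} (B_1 \cup B_2)) < 2\varepsilon$ and $|\mu^*(A) - \mu^*(B_1 \cup B_2)| < 2\varepsilon$. Furthermore $\mu^*(B_1 \cup B_2) = \mu(B_1 \cup B_2) = \mu(B_1) + \mu(B_2) - \mu(B_1 \cap B_2)$, but $\mu(B_1 \cap B_2) = d(B_1 \cap B_2, \varnothing) = d(B_1 \cap B_2, A_1 \cap A_2) < 2\varepsilon$. Therefore
$$|\mu^*(B_1 \cup B_2) - \mu(B_1) - \mu(B_2)| < 2\varepsilon.$$
Combining everything together (and using $|\mu(B_i) - \mu^*(A_i)| < \varepsilon$ from Lemma 12.22) we get:
$$|\mu^*(A) - \mu^*(A_1) - \mu^*(A_2)| < 6\varepsilon.$$
Thus $\mu^*$ is additive.
Check the countable additivity for $A = \sqcup_k A_k$. The inequality $\mu^*(A) \le \sum_k \mu^*(A_k)$ follows from countable sub-additivity. The opposite inequality is the limiting case of the finite inequality $\mu^*(A) \ge \sum_{k=1}^N \mu^*(A_k)$, following from additivity and monotonicity of $\mu$.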
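The two inequalities that drive this sketch, the triangle inequality for $d(A, B) = \mu^*(A \mathbin{\triangle} B)$ and Lemma 12.22, can be verified exhaustively in a toy model. The code below assumes, for illustration only, the counting measure on the subsets of a four-point set, so that $d(A, B)$ is just the number of points in the symmetric difference.

```python
from itertools import combinations

def subsets(X):
    """All subsets of a finite set X, as frozensets."""
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def d(A, B):
    """Distance between sets: counting measure of the symmetric difference."""
    return len(A ^ B)

all_sets = subsets(range(4))

# triangle inequality d(A, B) <= d(A, C) + d(C, B)
triangle_ok = all(d(A, B) <= d(A, C) + d(C, B)
                  for A in all_sets for B in all_sets for C in all_sets)

# Lemma 12.22 in this model: |mu(A) - mu(B)| <= d(A, B)
lemma_ok = all(abs(len(A) - len(B)) <= d(A, B)
               for A in all_sets for B in all_sets)
```

In this finite model $d$ already separates sets, so the passage to equivalence classes is only needed when non-empty sets of measure zero exist.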
Corollary 12.23. Let $E \subseteq \mathbb{R}$ be open or closed. Then $E$ is Lebesgue measurable.
Proof. This is a common trick, using the density and the countability of the rationals. As $\sigma$-algebras are closed under taking complements, we need only show that open sets are Lebesgue measurable.
Intervals $(a, b)$ are Lebesgue measurable by the very definition. Now let $U \subseteq \mathbb{R}$ be open. For each $x \in U$, there exist $a_x < b_x$ with $x \in (a_x, b_x) \subseteq U$. By making $a_x$ slightly larger, and $b_x$ slightly smaller, we can ensure that $a_x, b_x \in \mathbb{Q}$. Thus $U = \bigcup_x (a_x, b_x)$. Each interval is measurable, and there are at most countably many of them (the endpoints form a countable set), thus $U$ is a countable (or finite) union of Lebesgue measurable sets, and hence $U$ is Lebesgue measurable itself.
We now perform an extension from finite measures to $\sigma$-finite ones. Let $\mu$ be a $\sigma$-additive and $\sigma$-finite measure defined on a semiring in $X = \sqcup_k X_k$, where the restriction of $\mu$ to every $X_k$ is finite. Consider the Lebesgue extension $\mu_k$ of $\mu$ defined within $X_k$. A set $A \subseteq X$ is measurable if every intersection $A \cap X_k$ is $\mu_k$-measurable. For such a measurable set $A$ we define its measure by the identity:
$$\mu(A) = \sum_k \mu_k(A \cap X_k).$$
We call a measure $\mu$ defined on $L$ complete if whenever $E \subseteq X$ is such that there exists $F \in L$ with $\mu(F) = 0$ and $E \subseteq F$, we have that $E \in L$. Measures constructed from outer measures by the above theorem are always complete. On the example sheet, we saw how to form a complete measure from a given measure. We call sets like $E$ null sets: complete measures are useful, because it is helpful to be able to say that null sets are in our $\sigma$-algebra. Null sets can be quite complicated. For the Lebesgue measure, all countable subsets of $\mathbb{R}$ are null, but then so is the Cantor set, which is uncountable.
Definition 12.24. If a property $P(x)$ is true for all $x$ except possibly $x \in A$, where $\mu(A) = 0$, we say that $P(x)$ holds almost everywhere (a.e.).
12.3. Complex-Valued Measures and Charges. We start from the following observation.
Exercise 12.25. Let $\mu_1$ and $\mu_2$ be measures on the same $\sigma$-algebra. Define $\mu_1 + \mu_2$ and $\lambda\mu_1$, $\lambda > 0$, by $(\mu_1 + \mu_2)(A) = \mu_1(A) + \mu_2(A)$ and $(\lambda\mu_1)(A) = \lambda(\mu_1(A))$. Then $\mu_1 + \mu_2$ and $\lambda\mu_1$ are measures on the same $\sigma$-algebra as well.
In view of this, it will be helpful to extend the notion of a measure to obtain a linear space.
Definition 12.26. Let $X$ be a set, and $R$ be a $\sigma$-ring. A real- (complex-) valued function $\nu$ on $R$ is called a charge (or signed measure) if it is countably additive as follows: for any $A_k \in R$ the identity $A = \sqcup_k A_k$ implies that the series $\sum_k \nu(A_k)$ is absolutely convergent and has the sum $\nu(A)$.
Example 12.27. Any linear combination of $\sigma$-additive measures on $R$ with real (complex) coefficients is a real (complex) charge.
The opposite statement is also true:
Theorem 12.28. Any real (complex) charge $\nu$ has a representation $\nu = \mu_1 - \mu_2$ ($\nu = \mu_1 - \mu_2 + i\mu_3 - i\mu_4$), where $\mu_k$ are $\sigma$-additive measures.
To prove the theorem we need the following definition.
Definition 12.29. The variation of a charge $\nu$ on a set $A$ is $|\nu|(A) = \sup \sum_k |\nu(A_k)|$, where the supremum is taken over all disjoint splittings $A = \sqcup_k A_k$.
Example 12.30. If $\nu = \mu_1 - \mu_2$, then $|\nu|(A) \le \mu_1(A) + \mu_2(A)$. The inequality becomes an identity for disjunctive measures on $A$ (that is, there is a partition $A = A_1 \sqcup A_2$ such that $\mu_2(A_1) = \mu_1(A_2) = 0$).
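On a finite set the variation can be computed directly from Definition 12.29 by running over all partitions. The sketch below assumes a hypothetical charge given by signed point weights `w` (so $\nu(A)$ is the sum of the weights in $A$) and confirms that the supremum is attained on the finest partition.

```python
def partitions(items):
    """Yield all partitions of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):                 # put `first` into an existing block
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield [[first]] + p                     # or into a block of its own

# hypothetical signed weights defining the charge nu
w = {"a": 2, "b": -3, "c": 1, "d": -1}

def nu(A):
    """Charge of a set: the sum of the signed weights of its points."""
    return sum(w[x] for x in A)

def variation(A):
    """|nu|(A): maximum of sum |nu(A_k)| over all disjoint splittings of A."""
    return max(sum(abs(nu(block)) for block in p) for p in partitions(list(A)))

X = list(w)
mu1 = sum(v for v in w.values() if v > 0)       # positive part, concentrated on {a, c}
mu2 = sum(-v for v in w.values() if v < 0)      # negative part, concentrated on {b, d}
```

Here $\mu_1$ and $\mu_2$ are disjunctive in the sense of Example 12.30, so the variation of $X$ equals $\mu_1(X) + \mu_2(X)$.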
The relation of variation to charge is as follows:
Theorem 12.31. For any charge $\nu$ the function $|\nu|$ is a $\sigma$-additive measure.
Finally, to prove Thm. 12.28 we use the following
Proposition 12.32. For any real charge $\nu$ the function $|\nu| - \nu$ is a $\sigma$-additive measure as well.
From Thm. 12.28 we can deduce
Corollary 12.33. The collection of all charges on a $\sigma$-algebra $R$ is a linear space which is complete with respect to the distance:
$$d(\nu_1, \nu_2) = \sup_{A \in R} |\nu_1(A) - \nu_2(A)|.$$
The following result is also important:
Theorem 12.34 (Hahn Decomposition). Let $\nu$ be a charge. There exist $A, B \in L$, called a Hahn decomposition of $(X, \nu)$, with $A \cap B = \varnothing$, $A \cup B = X$ and such that for any $E \in L$,
$$\nu(A \cap E) \ge 0, \qquad \nu(B \cap E) \le 0.$$
This need not be unique.
Sketch of proof. We only sketch this. We say that $A \in L$ is positive if
$$\nu(E \cap A) \ge 0 \qquad (E \in L),$$
and similarly define what it means for a measurable set to be negative. Suppose that $\nu$ never takes the value $-\infty$ (the other case follows by considering the charge $-\nu$).
Let $\beta = \inf \nu(B_0)$ where we take the infimum over all negative sets $B_0$. If $\beta = -\infty$ then for each $n$, we can find a negative $B_n$ with $\nu(B_n) \le -n$. But then $B = \bigcup_n B_n$ would be negative with $\nu(B) \le -n$ for any $n$, so that $\nu(B) = -\infty$, a contradiction.
So $\beta > -\infty$ and so for each $n$ we can find a negative $B_n$ with $\nu(B_n) < \beta + 1/n$. Then we can show that $B = \bigcup_n B_n$ is negative, and argue that $\nu(B) \le \beta$. As $B$ is negative, actually $\nu(B) = \beta$.
There then follows a very tedious argument, by contradiction, to show that $A = X \setminus B$ is a positive set. Then $(A, B)$ is the required decomposition.
12.4. Constructing Measures, Products. Consider the semiring $S$ of intervals $[a, b)$. There is a simple description of all measures on it. For a measure $\mu$ define
$$F_\mu(t) = \begin{cases} \mu([0, t)) & \text{if } t > 0,\\ 0 & \text{if } t = 0,\\ -\mu([t, 0)) & \text{if } t < 0. \end{cases} \tag{12.1}$$
$F_\mu$ is monotonic, and any monotonic function $F$ defines a measure $\mu$ on $S$ by $\mu([a, b)) = F(b) - F(a)$. The correspondence is one-to-one with the additional assumption $F(0) = 0$.
Theorem 12.35. The above measure $\mu$ is $\sigma$-additive on $S$ if and only if $F$ is continuous from the left: $F(t - 0) = F(t)$ for all $t \in \mathbb{R}$.
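Proof. The necessity: $F(t) - F(t - 0) = \lim_{\varepsilon \to 0} \mu([t - \varepsilon, t)) = 0$.
For sufficiency assume $[a, b) = \sqcup_k [a_k, b_k)$. The inequality $\mu([a, b)) \ge \sum_k \mu([a_k, b_k))$ follows from additivity and monotonicity. For the opposite inequality take $\delta > 0$ and $\delta_k > 0$ such that $F(b) - F(b - \delta) < \varepsilon$ and $F(a_k) - F(a_k - \delta_k) < \varepsilon / 2^k$ (use the left continuity of $F$). Then the interval $[a, b - \delta]$ is covered by the open intervals $(a_k - \delta_k, b_k)$, so there is a finite subcovering. Thus
$$\mu([a, b - \delta)) \le \sum_{j=1}^N \mu([a_{k_j} - \delta_{k_j}, b_{k_j})).$$
By the choice of $\delta$ and $\delta_k$ this gives $\mu([a, b)) \le \sum_k \mu([a_k, b_k)) + 2\varepsilon$, and $\varepsilon > 0$ was arbitrary.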
Exercise 12.36. (i) Give an example of a function discontinuous from the left at 1 and show that the resulting measure is additive but not $\sigma$-additive.
(ii) Check that, if a function $F$ is continuous at a point $a$, then $\mu(\{a\}) = 0$.
Example 12.37. (i) Take $F(t) = t$, then the corresponding measure is the Lebesgue measure on $\mathbb{R}$.
(ii) Take $F(t)$ to be the integer part of $t$, rounded up so that $F$ is continuous from the left, that is $F(t) = \lceil t \rceil$; then $\mu$ counts the number of integers within the set.
(iii) Define the Cantor function as follows: $\alpha(x) = 1/2$ on $(1/3, 2/3)$; $\alpha(x) = 1/4$ on $(1/9, 2/9)$; $\alpha(x) = 3/4$ on $(7/9, 8/9)$, and so forth. This function is monotonic and can be continued to $[0, 1]$ by continuity; it is known as the Cantor ladder. The resulting measure has the following properties:
The measure of the entire interval is 1.
The measure of every point is zero.
The measure of the Cantor set is 1, while its Lebesgue measure is 0.
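The rule $\mu([a, b)) = F(b) - F(a)$ is easy to experiment with. The sketch below compares two step functions on a few sample intervals: the ceiling, which is continuous from the left, and the floor, which is not. Reading the "integer part" in Example 12.37(ii) as the ceiling is an assumption made here so that Theorem 12.35 applies; the helper names are illustrative.

```python
import math

def stieltjes(F, a, b):
    """Measure of the half-open interval [a, b) generated by a monotonic F."""
    return F(b) - F(a)

def count_integers(a, b):
    """Number of integers n with a <= n < b, counted directly."""
    return sum(1 for n in range(math.floor(a) - 1, math.ceil(b) + 1) if a <= n < b)

intervals = [(0.5, 1.0), (1.0, 2.0), (-1.5, 0.0), (0.0, 3.0), (0.25, 0.75)]

# F = ceil is left-continuous and reproduces the counting of integers in [a, b)
ceil_matches = all(stieltjes(math.ceil, a, b) == count_integers(a, b)
                   for a, b in intervals)

# F = floor jumps "too early": it charges [0.5, 1) with the integer 1 lying outside it
floor_value = stieltjes(math.floor, 0.5, 1.0)
```

With the floor function the intervals $[1 - 1/n, 1)$ all have measure 1 although their intersection is empty, which is the kind of failure of $\sigma$-additivity that Exercise 12.36(i) asks about.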
Another possibility to build measures is their product. In particular, it allows us to expand various measures defined through (12.1) on the real line to $\mathbb{R}^n$.
Definition 12.38. Let $X$ and $Y$ be spaces, and let $S$ and $T$ be semirings on $X$ and $Y$ respectively. Then $S \times T$ is the semiring consisting of $\{A \times B : A \in S,\ B \in T\}$ ("generalised rectangles"). Let $\mu$ and $\nu$ be measures on $S$ and $T$ respectively. Define the product measure $\mu \times \nu$ on $S \times T$ by the rule $(\mu \times \nu)(A \times B) = \mu(A)\nu(B)$.
Example 12.39. The measure from Example 12.9(iii) is the product of two copies of the pre-Lebesgue measure from Example 12.9(ii).
13. INTEGRATION
We now come to the main use of measure theory: to define a general theory of integration.
13.1. Measurable functions. From now on, by a measure space we shall mean a triple $(X, L, \mu)$, where $X$ is a set, $L$ is a $\sigma$-algebra on $X$, and $\mu$ is a $\sigma$-additive measure defined on $L$. We say that the members of $L$ are measurable, or $L$-measurable, if necessary to avoid confusion.
Definition 13.1. A function $f : X \to \mathbb{R}$ is measurable if
$$E_c(f) = \{x \in X : f(x) < c\}$$
is in $L$ for any $c \in \mathbb{R}$.
A complex-valued function is measurable if its real and imaginary parts are measurable.
Lemma 13.2. The following are equivalent:
(i) A function $f$ is measurable;
(ii) For any $a < b$ the set $f^{-1}((a, b))$ is measurable;
(iii) For any open set $U \subseteq \mathbb{R}$ the set $f^{-1}(U)$ is measurable.
Proof. Use that any open set $U \subseteq \mathbb{R}$ is a union of a countable set of intervals $(a, b)$, cf. the proof of Cor. 12.23.
Corollary 13.3. Let $f$ be measurable and $g$ be continuous, then the composition $g(f(x))$ is measurable.
Proof. The preimage of $(-\infty, c)$ under a continuous $g$ is an open set, and its preimage under $f$ is measurable.
Theorem 13.4. Let $f, g : X \to \mathbb{R}$ be measurable. Then $af$, $f + g$, $fg$, $\max(f, g)$ and $\min(f, g)$ are all measurable. That is, measurable functions form an algebra and this algebra is closed under convergence a.e.
Proof. Use Cor. 13.3 to show measurability of $af$, $|f|$ and $f^2$.
Next use the following identities:
$$E_c(f_1 + f_2) = \bigcup_{r \in \mathbb{Q}} \big(E_r(f_1) \cap E_{c - r}(f_2)\big),$$
$$f_1 f_2 = \frac{(f_1 + f_2)^2 - (f_1 - f_2)^2}{4},$$
$$\max(f_1, f_2) = \frac{(f_1 + f_2) + |f_1 - f_2|}{2}.$$
If $(f_n)$ is a non-increasing sequence of measurable functions converging to $f$, then $E_c(f) = \bigcup_n E_c(f_n)$.
Moreover any limit can be replaced by two monotonic limits:
$$\lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \lim_{k \to \infty} \max(f_n(x), f_{n+1}(x), \ldots, f_{n+k}(x)). \tag{13.1}$$
Finally, if $f_1$ is measurable and $f_2 = f_1$ almost everywhere, then $f_2$ is measurable as well.
We can define several types of convergence for measurable functions.
Definition 13.5. We say that a sequence $(f_n)$ of functions converges
(i) uniformly to $f$ (notated $f_n \rightrightarrows f$) if
$$\sup_{x \in X} |f_n(x) - f(x)| \to 0;$$
(ii) almost everywhere to $f$ (notated $f_n \xrightarrow{\text{a.e.}} f$) if
$$f_n(x) \to f(x) \quad \text{for all } x \in X \setminus A, \quad \mu(A) = 0;$$
(iii) in measure $\mu$ to $f$ (notated $f_n \xrightarrow{\mu} f$) if for all $\varepsilon > 0$
$$\mu(\{x \in X : |f_n(x) - f(x)| > \varepsilon\}) \to 0.$$
Clearly uniform convergence implies both convergence a.e. and convergence in measure.
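Theorem 13.6. For a finite measure, convergence a.e. implies convergence in measure.
Proof. Define $A_n(\varepsilon) = \{x \in X : |f_n(x) - f(x)| \ge \varepsilon\}$. Let $B_n(\varepsilon) = \bigcup_{k \ge n} A_k(\varepsilon)$. Clearly $B_n(\varepsilon) \supseteq B_{n+1}(\varepsilon)$; let $B(\varepsilon) = \bigcap_1^\infty B_n(\varepsilon)$. If $x \in B(\varepsilon)$ then $f_n(x) \not\to f(x)$. Thus $\mu(B(\varepsilon)) = 0$, but $\mu(B(\varepsilon)) = \lim_{n \to \infty} \mu(B_n(\varepsilon))$. Since $A_n(\varepsilon) \subseteq B_n(\varepsilon)$ we see that $\mu(A_n(\varepsilon)) \to 0$.
Note that the construction of the sets $B_n(\varepsilon)$ is just another implementation of the "two monotonic limits" trick (13.1) for sets.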
Exercise 13.7. Present examples of sequences $(f_n)$ and functions $f$ such that:
(i) $f_n \xrightarrow{\mu} f$ but not $f_n \xrightarrow{\text{a.e.}} f$.
(ii) $f_n \xrightarrow{\text{a.e.}} f$ but not $f_n \rightrightarrows f$.
However we can slightly "fix" either the set or the sequence to "upgrade" the convergence, as shown in the following two theorems.
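Theorem 13.8 (Egorov). If $f_n \xrightarrow{\text{a.e.}} f$ on a finite measure set $X$ then for any $\sigma > 0$ there is $E_\sigma \subseteq X$ with $\mu(E_\sigma) < \sigma$ and $f_n \rightrightarrows f$ on $X \setminus E_\sigma$.
Proof. We use $A_n(\varepsilon)$ and $B_n(\varepsilon)$ from the proof of Thm. 13.6. For every $\varepsilon > 0$ we have seen that $\mu(B_n(\varepsilon)) \to 0$, thus for each $k$ there is $N(k)$ such that $\mu(B_{N(k)}(1/k)) < \sigma / 2^k$. Put $E_\sigma = \bigcup_k B_{N(k)}(1/k)$.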
Theorem 13.9. If $f_n \xrightarrow{\mu} f$ then there is a subsequence $(n_k)$ such that $f_{n_k} \xrightarrow{\text{a.e.}} f$ for $k \to \infty$.
Proof. In the notation of the two previous proofs: for every natural $k$ take $n_k$ such that $\mu(A_{n_k}(1/k)) < 1/2^k$. Define $C_m = \bigcup_{k=m}^\infty A_{n_k}(1/k)$ and $C = \bigcap_m C_m$. Then $\mu(C_m) \le 1/2^{m-1}$ and, thus, $\mu(C) = 0$. If $x \notin C$ then there is $N$ such that $x \notin A_{n_k}(1/k)$ for all $k > N$. That means that $|f_{n_k}(x) - f(x)| < 1/k$ for all such $k$, i.e. $f_{n_k}(x) \to f(x)$.
It is worth noting that we can apply the last two theorems in succession and upgrade convergence in measure to uniform convergence of a subsequence on a subset.
Exercise 13.10. For your counterexamples from Exercise 13.7, find
(i) a subsequence $f_{n_k}$ of the sequence from 13.7(i) which converges to $f$ a.e.;
(ii) a subset on which the sequence from 13.7(ii) converges uniformly.
Exercise 13.11. Read about Luzin's C-property.
13.2. Lebesgue Integral. First we define a sort of "basis" for the space of integrable functions.
Definition 13.12. For $A \subseteq X$, we define $\chi_A$ to be the indicator function of $A$, by
$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A,\\ 0 & \text{if } x \notin A. \end{cases}$$
Then, if $\chi_A$ is measurable, then $\chi_A^{-1}((1/2, 3/2)) = A \in L$; conversely, if $A \in L$, then $X \setminus A \in L$, and we see that for any open $U \subseteq \mathbb{R}$, $\chi_A^{-1}(U)$ is either $\varnothing$, $A$, $X \setminus A$, or $X$, all of which are in $L$. So $\chi_A$ is measurable if and only if $A \in L$.
Definition 13.13. A measurable function $f : X \to \mathbb{R}$ is simple if it attains only a countable number of values.
Lemma 13.14. A function $f : X \to \mathbb{R}$ is simple if and only if
$$f = \sum_{k=1}^\infty t_k \chi_{A_k} \tag{13.2}$$
for some $(t_k)_{k=1}^\infty \subseteq \mathbb{R}$ and $A_k \in L$. That is, simple functions are linear combinations of indicator functions of measurable sets.
Moreover, in the above representation the sets $A_k$ can be pairwise disjoint and all $t_k$ pairwise different. In this case the representation is unique.
Notice that it is now obvious that
Corollary 13.15. The collection of simple functions forms a vector space: this wasn't clear from the original definition.
Definition 13.16. A simple function in the form (13.2) with disjoint $A_k$ is called summable if the following series converges:
$$\sum_{k=1}^\infty |t_k| \, \mu(A_k) \quad \text{if } f \text{ has the above unique representation } f = \sum_{k=1}^\infty t_k \chi_{A_k}. \tag{13.3}$$
It is another combinatorial exercise to show that this definition is independent of the way we write $f$.
Definition 13.17. For any simple summable function $f$ from the previous Definition we define the integral of $f : X \to \mathbb{R}$ over a measurable set $A$ by setting
$$\int_A f \, d\mu = \sum_{k=1}^\infty t_k \, \mu(A_k \cap A).$$
Clearly the series converges for any simple summable function $f$. Moreover
Lemma 13.18. The value of the integral of a simple summable function is independent of its representation by a sum of indicators over pairwise disjoint sets.
Proof. This is another slightly tedious combinatorial exercise. You need to prove that the integral of a simple function is well-defined, in the sense that it is independent of the way we choose to write the simple function.
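Definition 13.17 and the claim of Lemma 13.18 can be illustrated on a finite measure space. The sketch below assumes a hypothetical six-point space with equal weights $1/6$ and represents a simple function as a list of pairs $(t_k, A_k)$ with pairwise disjoint $A_k$; the function name is a choice made here.

```python
from fractions import Fraction

def integrate_simple(terms, weight, A):
    """Integral over A of f = sum of t_k * chi_{A_k}, i.e. sum of t_k * mu(A_k & A).

    terms:  list of (t_k, A_k) with pairwise disjoint sets A_k
    weight: dict mapping each point of X to its (non-negative) measure
    """
    def mu(S):
        return sum(weight[x] for x in S)
    return sum(t * mu(A_k & A) for t, A_k in terms)

X = set(range(6))
weight = {x: Fraction(1, 6) for x in X}          # uniform measure on six points

# f = 2 on {0, 1}, f = -1 on {2, 3, 4}, f = 0 at the point 5
f_coarse = [(2, {0, 1}), (-1, {2, 3, 4})]
# the same function written over a finer family of disjoint sets
f_fine = [(2, {0}), (2, {1}), (-1, {2, 3}), (-1, {4})]

I_coarse = integrate_simple(f_coarse, weight, X)
I_fine = integrate_simple(f_fine, weight, X)
I_part = integrate_simple(f_coarse, weight, {1, 2})
```

Both representations give the same value over $X$, and restricting to $A = \{1, 2\}$ only keeps the parts $A_k \cap A$.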
Exercise 13.19. Let $f$ be the function on $[0, 1]$ which takes the value 1 at all rational points and 0 everywhere else. Find the value of the Lebesgue integral $\int_{[0,1]} f \, d\mu$ with respect to the Lebesgue measure on $[0, 1]$. Show that the Riemann upper and lower sums for $f$ converge to different values, so $f$ is not Riemann-integrable.
Remark 13.20. The previous exercise shows that the Lebesgue integral does not have those problems of the Riemann integral which are related to discontinuities. Indeed, most functions which are not Riemann-integrable are integrable in the sense of Lebesgue. The only reason why a measurable function is not integrable in the sense of Lebesgue is divergence of the series (13.3). Therefore, we prefer to say that the function is summable rather than integrable. However, those terms are used interchangeably in the mathematical literature.
We will denote by S(X) the collection of all simple summable functions on X.
Proposition 13.21. Let $f, g : X \to \mathbb{R}$ be in $S(X)$ (that is, simple summable), let $a, b \in \mathbb{R}$ and let $A$ be a measurable set. Then:
(i) $\int_A (af + bg) \, d\mu = a \int_A f \, d\mu + b \int_A g \, d\mu$, that is, $S(X)$ is a linear space;
(ii) The correspondence $f \mapsto \int_A f \, d\mu$ is a linear functional on $S(X)$;
(iii) The correspondence $A \mapsto \int_A f \, d\mu$ is a charge;
(iv) The function
$$d_1(f, g) = \int_X |f(x) - g(x)| \, d\mu(x) \tag{13.4}$$
has all properties of a distance on $S(X)$, possibly except separation.
(v) For all $A \subseteq X$:
$$\left| \int_A f(x) \, d\mu(x) - \int_A g(x) \, d\mu(x) \right| \le d_1(f, g).$$
(vi) If $f \le g$ then $\int_X f \, d\mu \le \int_X g \, d\mu$, that is, the integral is monotonic;
(vii) For $f \ge 0$ we have $\int_X f \, d\mu = 0$ if and only if $\mu(\{x \in X : f(x) \neq 0\}) = 0$.
Proof. The proof is almost obvious; for example, Property 13.21(i) easily follows from Lem. 13.18.
We will outline 13.21(iii) only. Let $f$ be the indicator function of a set $B$, then $A \mapsto \int_A f \, d\mu = \mu(A \cap B)$ is a $\sigma$-additive measure (and thus a charge). By Cor. 12.33 the same is true for finite linear combinations of indicator functions and their limits in the sense of the distance $d_1$.
We can identify functions which have the same values a.e. Then $S(X)$ becomes a metric space with the distance $d_1$ (13.4). The space may be incomplete and we may wish to look for its completion. However, if we simply try to assign a limiting point to every Cauchy sequence in $S(X)$, then the resulting space becomes so huge that it will be impossible to realise it as a space of functions on $X$. To reduce the number of Cauchy sequences in $S(X)$ eligible to have a limit, we shall impose an additional condition. A convenient reduction to functions on $X$ appears if we ask for both the convergence in the $d_1$ metric and the pointwise convergence on $X$ a.e.
Definition 13.22. A function $f$ is summable by a measure $\mu$ if there is a sequence $(f_n) \subset S(X)$ such that
(i) the sequence $(f_n)$ is a Cauchy sequence in $S(X)$;
(ii) $f_n \xrightarrow{\text{a.e.}} f$.
Clearly, if a function is summable, then any equivalent function is summable as well. The set of equivalence classes will be denoted by $L_1(X)$.
Lemma 13.23. If the measure $\mu$ is finite then any bounded measurable function is summable.
Proof. Define $E_{kn}(f) = \{x \in X : k/n \le f(x) < (k + 1)/n\}$ and $f_n = \sum_k \frac{k}{n} \chi_{E_{kn}}$ (note that the sum is finite due to the boundedness of $f$).
Since $|f_n(x) - f(x)| < 1/n$ we have uniform convergence (thus convergence a.e.) and $(f_n)$ is a Cauchy sequence: $d_1(f_n, f_m) = \int_X |f_n - f_m| \, d\mu \le \left(\frac{1}{n} + \frac{1}{m}\right) \mu(X)$.
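The approximating functions of this proof round $f$ down to the nearest multiple of $1/n$, that is $f_n(x) = \lfloor n f(x) \rfloor / n$. The sketch below checks the bound $0 \le f - f_n < 1/n$ on sample points, using exact rational arithmetic to avoid rounding issues; the test function and the sample grid are arbitrary choices made for illustration.

```python
from fractions import Fraction
import math

def simple_approx(f, n):
    """Return f_n with f_n(x) = k/n on E_kn = {x : k/n <= f(x) < (k + 1)/n}."""
    def f_n(x):
        k = math.floor(n * f(x))
        return Fraction(k, n)
    return f_n

def f(x):
    """A bounded test function on [0, 1] (illustrative choice)."""
    return x * x - Fraction(1, 3)

points = [Fraction(i, 97) for i in range(98)]    # sample points in [0, 1]

# largest and smallest value of f - f_n over the sample points, for several n
max_err = {n: max(f(x) - simple_approx(f, n)(x) for x in points) for n in (1, 10, 100)}
min_err = {n: min(f(x) - simple_approx(f, n)(x) for x in points) for n in (1, 10, 100)}
```

The error stays within $[0, 1/n)$ uniformly in $x$, which is the uniform convergence used in the proof.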
Remark 13.24. This Lemma can be extended to the space of essentially bounded functions $L_\infty(X)$, in other words $L_\infty(X) \subset L_1(X)$ for finite measures.
Another simple result, which is useful on many occasions, is as follows.
Lemma 13.25. If the measure $\mu$ is finite and $f_n \rightrightarrows f$ then $d_1(f_n, f) \to 0$.
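Corollary 13.26. For a convergent sequence $f_n \xrightarrow{\text{a.e.}} f$, which admits the uniform bound $|f_n(x)| < M$ for all $n$ and $x$, we have $d_1(f_n, f) \to 0$.
Proof. Note that $|f(x)| \le M$ a.e., so $|f_n - f| \le 2M$ a.e. For any $\varepsilon > 0$, by Egorov's theorem 13.8 we can find $E$ such that
(i) $\mu(E) < \frac{\varepsilon}{4M}$; and
(ii) from the uniform convergence on $X \setminus E$ there exists $N$ such that for any $n > N$ we have $|f(x) - f_n(x)| < \frac{\varepsilon}{2\mu(X)}$.
Combining this we find that for $n > N$, $d_1(f_n, f) < 2M \frac{\varepsilon}{4M} + \mu(X) \frac{\varepsilon}{2\mu(X)} = \varepsilon$.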
Exercise 13.27. Convergence in the metric $d_1$ and convergence a.e. do not imply each other:
(i) Give an example of $f_n \xrightarrow{\text{a.e.}} f$ such that $d_1(f_n, f) \not\to 0$.
(ii) Give an example of a sequence $(f_n)$ and a function $f$ in $L_1(X)$ such that $d_1(f_n, f) \to 0$ but $f_n$ does not converge to $f$ a.e.
To build the integral we need the following
Lemma 13.28. Let $(f_n)$ and $(g_n)$ be two Cauchy sequences in $S(X)$ with the same limit a.e.; then $d_1(f_n, g_n) \to 0$.
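Proof. Let $\varphi_n = f_n - g_n$; then this is a Cauchy sequence with zero limit a.e. Assume the opposite to the statement: there exist $\varepsilon > 0$ and a sequence $(n_k)$ such that $\int_X |\varphi_{n_k}|\,d\mu > \varepsilon$. Rescaling and renumbering we can obtain $\int_X |\varphi_n|\,d\mu > 1$.
Take a quickly convergent subsequence using the Cauchy property:
\[ d_1(\varphi_{n_k}, \varphi_{n_{k+1}}) \le 1/2^{k+2}. \]
Renumbering again, assume $d_1(\varphi_k, \varphi_{k+1}) \le 1/2^{k+2}$.
Since $\varphi_1$ is simple, that is $\varphi_1 = \sum_k t_k \chi_{A_k}$, we have $\sum_k |t_k|\,\mu(A_k) = \int_X |\varphi_1|\,d\mu \ge 1$. Thus there exists $N$ such that $\sum_{k=1}^{N} |t_k|\,\mu(A_k) \ge 3/4$. Put $A = \bigsqcup_{k=1}^{N} A_k$ and $C = \max_{1 \le k \le N} |t_k| = \max_{x \in A} |\varphi_1(x)|$.
By Egorov's Theorem 13.8 there is $E \subset A$ such that $\mu(E) < 1/(4C)$ and $\varphi_n \rightrightarrows 0$ on $B = A \setminus E$. Then
\[ \int_B |\varphi_1|\,d\mu = \int_A |\varphi_1|\,d\mu - \int_E |\varphi_1|\,d\mu \ge \frac{3}{4} - \frac{1}{4C}\cdot C = \frac{1}{2}. \]
Since
\[ \Big| \int_B |\varphi_n|\,d\mu - \int_B |\varphi_{n+1}|\,d\mu \Big| \le d_1(\varphi_n, \varphi_{n+1}) \le \frac{1}{2^{n+2}} \]
we get
\[ \int_B |\varphi_n|\,d\mu \ge \int_B |\varphi_1|\,d\mu - \sum_{k=1}^{n-1} \Big| \int_B |\varphi_k|\,d\mu - \int_B |\varphi_{k+1}|\,d\mu \Big| \ge \frac{1}{2} - \sum_{k=1}^{n-1} \frac{1}{2^{k+2}} > \frac{1}{4}. \]
But this contradicts the fact that $\int_B |\varphi_n|\,d\mu \to 0$, which follows from the uniform convergence $\varphi_n \rightrightarrows 0$ on $B$.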
Corollary 13.29. The functional $I_A(f) = \int_A f(x)\,d\mu(x)$, defined for any $A \in L$ on the space of simple functions $S(X)$, can be extended by continuity to a functional on $L_1(X, \mu)$.
Definition 13.30. For an arbitrary summable $f \in L_1(X)$, we define the Lebesgue integral
\[ \int_A f\,d\mu = \lim_{n\to\infty} \int_A f_n\,d\mu, \]
where the Cauchy sequence $(f_n)$ of summable simple functions converges to $f$ a.e.
Theorem 13.31.
(i) $L_1(X)$ is a linear space.
(ii) For any set $A \subset X$ the correspondence $f \mapsto \int_A f\,d\mu$ is a linear functional on $L_1(X)$.
(iii) For any $f \in L_1(X)$ the value $\nu(A) = \int_A f\,d\mu$ is a charge.
(iv) $d_1(f, g) = \int_X |f - g|\,d\mu$ is a distance on $L_1(X)$.
Proof. The proof follows from Prop. 13.21 and continuity of the extension.
Remark 13.32. Note that we build $L_1(X)$ as a completion of $S(X)$ with respect to the distance $d_1$. Its realisation as equivalence classes of measurable functions on $X$ is somehow secondary to this.
13.3. Properties of the Lebesgue Integral. The space $L_1$ was defined through a dual convergence: in the $d_1$ metric and a.e. Can we get the continuity of the integral from the convergence almost everywhere alone? No, in general. However, we will now state some results on continuity of the integral under convergence a.e. with some additional assumptions. Finally, we show that $L_1(X)$ is closed in the $d_1$ metric.
Theorem 13.33 (Lebesgue on dominated convergence). Let $(f_n)$ be a sequence of $\mu$-summable functions on $X$, and let there be $\varphi \in L_1(X)$ such that $|f_n(x)| \le \varphi(x)$ for all $x \in X$, $n \in \mathbb{N}$.
If $f_n \xrightarrow{\text{a.e.}} f$, then $f \in L_1(X)$ and for any measurable $A$:
\[ \lim_{n\to\infty} \int_A f_n\,d\mu = \int_A f\,d\mu. \]
Proof. For any measurable $A$ the expression $\nu(A) = \int_A \varphi\,d\mu$ defines a finite measure on $X$ due to non-negativeness of $\varphi$ and Thm. 13.31.
Lemma 13.34. If $g$ is measurable and bounded then $f = \varphi g$ is $\mu$-summable and for any $\mu$-measurable set $A$ we have
\[ \int_A f\,d\mu = \int_A g\,d\nu. \]
Proof of the Lemma. Let $M$ be the set of all $g$ such that the Lemma is true. $M$ includes any indicator function $g = \chi_B$ of a measurable $B$:
\[ \int_A f\,d\mu = \int_A \varphi\chi_B\,d\mu = \int_{A\cap B} \varphi\,d\mu = \nu(A \cap B) = \int_A g\,d\nu. \]
Thus $M$ also contains finite linear combinations of indicators. For any $n \in \mathbb{N}$ and a bounded $g$ the two functions $g_-(x) = \frac{1}{n}[n g(x)]$ and $g_+(x) = g_-(x) + \frac{1}{n}$ are finite linear combinations of indicators and are in $M$. Since $g_-(x) \le g(x) \le g_+(x)$ we have
\[ \int_A g_-(x)\,d\nu = \int_A \varphi\, g_-\,d\mu \le \int_A \varphi\, g(x)\,d\mu \le \int_A \varphi\, g_+(x)\,d\mu = \int_A g_+(x)\,d\nu. \]
By the squeeze rule for $n \to \infty$ the middle term tends to $\int_A g\,d\nu$, that is $g \in M$.
For the proof of the theorem define:
\[ g_n(x) = \begin{cases} f_n(x)/\varphi(x), & \text{if } \varphi(x) \neq 0,\\ 0, & \text{if } \varphi(x) = 0, \end{cases} \qquad g(x) = \begin{cases} f(x)/\varphi(x), & \text{if } \varphi(x) \neq 0,\\ 0, & \text{if } \varphi(x) = 0. \end{cases} \]
Then $g_n$ is bounded by 1 and $g_n \xrightarrow{\text{a.e.}} g$. To show the theorem it will be enough to show $\lim_{n\to\infty} \int_A g_n\,d\nu = \int_A g\,d\nu$. For uniformly bounded functions on a set of finite measure this can be derived from Egorov's Thm. 13.8; see an example of this in the proof of Lemma 13.28.
Exercise 13.35. Give an example of $f_n \xrightarrow{\text{a.e.}} f$ such that $\int_X f_n\,d\mu \neq \int_X f\,d\mu$.
Exercise 13.36 (Chebyshev's inequality). Show that: if $f$ is non-negative and summable, then
\[ \mu\{x \in X : f(x) > c\} < \frac{1}{c}\int_X f\,d\mu. \tag{13.5} \]
Theorem 13.37 (B. Levi's, on monotone convergence). Let $(f_n)$ be a monotonically increasing sequence of $\mu$-summable functions on $X$. Define $f(x) = \lim_{n\to\infty} f_n(x)$ (allowing the value $+\infty$).
(i) If all integrals $\int_X f_n\,d\mu$ are bounded by the same value then $f$ is summable and $\int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu$.
(ii) If $\lim_{n\to\infty} \int_X f_n\,d\mu = +\infty$ then the function $f$ is not summable.
Proof. Replacing $f_n$ by $f_n - f_1$ and $f$ by $f - f_1$ we can assume $f_n \ge 0$ and $f \ge 0$. Let $E$ be the set where $f$ is infinite; then $E = \bigcap_N \bigcup_n E_{Nn}$, where $E_{Nn} = \{x \in X : f_n(x) \ge N\}$. By Chebyshev's inequality we have
\[ N\mu(E_{Nn}) \le \int_{E_{Nn}} f_n\,d\mu \le \int_X f_n\,d\mu \le C, \]
so $\mu(E_{Nn}) \le C/N$. Thus $\mu(E) = \lim_{N\to\infty} \lim_{n\to\infty} \mu(E_{Nn}) = 0$.
Thus $f$ is finite a.e.
Lemma 13.38. Let $f$ be a measurable non-negative function attaining only finite values. Then $f$ is summable if and only if $\sup \int_A f\,d\mu < \infty$, where the supremum is taken over all finite-measure sets $A$ such that $f$ is bounded on $A$.
Proof of the Lemma. If $f$ is summable then for any set $A \subset X$ we have $\int_A f\,d\mu \le \int_X f\,d\mu < \infty$, thus the supremum is finite.
Let $\sup \int_A f\,d\mu = M < \infty$, and define $B = \{x \in X : f(x) = 0\}$ and $A_k = \{x \in X : 2^k \le f(x) < 2^{k+1}\}$, $k \in \mathbb{Z}$. We have $\mu(A_k) \le M/2^k$ and $X = B \sqcup \big(\bigsqcup_{k\in\mathbb{Z}} A_k\big)$. Define
\[ g(x) = \begin{cases} 2^k, & \text{if } x \in A_k,\\ 0, & \text{if } x \in B, \end{cases} \qquad f_n(x) = \begin{cases} f(x), & \text{if } x \in \bigsqcup_{k=-n}^{n} A_k,\\ 0, & \text{otherwise.} \end{cases} \]
Then $g(x) \le f(x) < 2g(x)$. The function $g$ is a simple function; its summability follows from the estimate
\[ \int_{\bigsqcup_{k=-n}^{n} A_k} g\,d\mu \le \int_{\bigsqcup_{k=-n}^{n} A_k} f\,d\mu \le M, \]
which is valid for any $n$; taking $n \to \infty$ we get summability of $g$. Furthermore, $f_n \xrightarrow{\text{a.e.}} f$ and $f_n(x) \le f(x) < 2g(x)$, so we use the Lebesgue Thm. 13.33 on dominated convergence to obtain the conclusion.
Let $A$ be a finite measure set such that $f$ is bounded on $A$; then
\[ \int_A f\,d\mu \overset{\text{Cor. 13.26}}{=} \lim_{n\to\infty} \int_A f_n\,d\mu \le \lim_{n\to\infty} \int_X f_n\,d\mu \le C. \]
This shows summability of $f$ by the previous Lemma. The rest of the statement and (the contrapositive of) the second part follow from the Lebesgue Thm. 13.33 on dominated convergence.
Now we can extend this result, dropping the monotonicity assumption.
Lemma 13.39 (Fatou). If a sequence $(f_n)$ of $\mu$-summable non-negative functions is such that:
• $\int_X f_n\,d\mu \le C$ for all $n$;
• $f_n \xrightarrow{\text{a.e.}} f$,
then $f$ is $\mu$-summable and $\int_X f\,d\mu \le C$.
Proof. Let us replace the limit $f_n \to f$ by two monotonic limits. Define:
\[ g_{kn}(x) = \min(f_n(x), \ldots, f_{n+k}(x)), \qquad g_n(x) = \lim_{k\to\infty} g_{kn}(x). \]
Then $(g_n)$ is a non-decreasing sequence of functions and $\lim_{n\to\infty} g_n(x) = f(x)$ a.e. Since $g_n \le f_n$, from monotonicity of the integral we get $\int_X g_n\,d\mu \le C$ for all $n$. Then Levi's Thm. 13.37 implies that $f$ is summable and $\int_X f\,d\mu \le C$.
Remark 13.40. Note that the price for dropping monotonicity from Thm. 13.37 to Lem. 13.39 is that the limit $\int_X f_n\,d\mu \to \int_X f\,d\mu$ may not hold any more.
Exercise 13.41. Give an example such that under the conditions of Fatou's lemma we get $\lim_{n\to\infty} \int_X f_n\,d\mu \neq \int_X f\,d\mu$.
Now we can show that $L_1(X)$ is complete:
Theorem 13.42. $L_1(X)$ is a Banach space.
Proof. It is clear that the distance function $d_1$ indeed defines a norm $\|f\|_1 = d_1(f, 0)$. We only need to demonstrate the completeness. Take a Cauchy sequence $(f_n)$ and, building a subsequence if necessary, assume that it is quickly convergent, that is $d_1(f_n, f_{n+1}) \le 1/2^n$. Put $\varphi_1 = f_1$ and $\varphi_n = f_n - f_{n-1}$ for $n > 1$. The sequence $\psi_n(x) = \sum_{k=1}^{n} |\varphi_k(x)|$ is monotonic, and the integrals $\int_X \psi_n\,d\mu$ are bounded by the same constant $\|f_1\|_1 + 1$. Thus, by B. Levi's Thm. 13.37 and its proof, $\psi_n \to \psi$ for a summable function $\psi$ which is finite a.e. Therefore, the series $\sum_k \varphi_k(x)$ converges as well to a function $f$. But this means that $f_n \xrightarrow{\text{a.e.}} f$. We also notice $|f_n(x)| \le |\psi(x)|$. Thus by the Lebesgue Thm. 13.33 on dominated convergence $\lim_{n\to\infty} \int_X |f_n - f|\,d\mu = 0$. That is, $f_n \to f$ in the norm of $L_1(X)$.
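The next important property of the Lebesgue integral is its absolute continuity.
Theorem 13.43 (Absolute continuity of Lebesgue integral). Let $f \in L_1(X)$. Then for any $\varepsilon > 0$ there is a $\delta > 0$ such that $\left| \int_A f\,d\mu \right| < \varepsilon$ if $\mu(A) < \delta$.
Proof. If $f$ is essentially bounded by $M$, then it is enough to set $\delta = \varepsilon/M$. In general let:
\[ A_n = \{x \in X : n \le |f(x)| < n + 1\}, \qquad B_n = \bigsqcup_{k=0}^{n} A_k, \qquad C_n = X \setminus B_n. \]
Then $\int_X |f|\,d\mu = \sum_{k=0}^{\infty} \int_{A_k} |f|\,d\mu$, thus there is an $N$ such that $\sum_{k=N+1}^{\infty} \int_{A_k} |f|\,d\mu = \int_{C_N} |f|\,d\mu < \varepsilon/2$. Now put $\delta = \frac{\varepsilon}{2N+2}$; since $|f| < N+1$ on $B_N$, for any $A \subset X$ with $\mu(A) < \delta$:
\[ \left| \int_A f\,d\mu \right| \le \int_A |f|\,d\mu = \int_{A\cap B_N} |f|\,d\mu + \int_{A\cap C_N} |f|\,d\mu < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]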
13.4. Integration on Product Measures. The geometrical interpretation of an integral in calculus as the area under the graph is well known. If we advance from "area" to a "measure" then the Lebesgue integral can be treated as a theory of measures of very special shapes created by graphs of functions. These shapes belong to the product space of the function's domain and its range. We introduced product measures in Defn. 12.38; now we will study them in some detail using the Lebesgue integral. We start from the following
Theorem 13.44. Let $X$ and $Y$ be spaces, let $S$ and $T$ be semirings on $X$ and $Y$ respectively, and let $\mu$ and $\nu$ be measures on $S$ and $T$ respectively. If $\mu$ and $\nu$ are $\sigma$-additive, then the product measure $\mu \times \nu$ from Defn. 12.38 is $\sigma$-additive as well.
Proof. For any $C = A \times B \in S \times T$ let us define $f_C(x) = \chi_A(x)\,\nu(B)$. Then
\[ (\mu \times \nu)(C) = \mu(A)\,\nu(B) = \int_X f_C\,d\mu. \]
If the same set $C$ has a representation $C = \bigsqcup_k C_k$ for $C_k \in S \times T$, then $\sigma$-additivity of $\nu$ implies $f_C = \sum_k f_{C_k}$. By the Lebesgue theorem 13.33 on dominated convergence:
\[ \int_X f_C\,d\mu = \sum_k \int_X f_{C_k}\,d\mu. \]
Thus
\[ (\mu \times \nu)(C) = \sum_k (\mu \times \nu)(C_k). \]
The above correspondence $C \mapsto f_C$ can be extended to the ring $R(S \times T)$ generated by $S \times T$ by the formula:
\[ f_C = \sum_k f_{C_k}, \quad \text{for } C = \bigsqcup_k C_k \in R(S \times T). \]
We have the uniform continuity of this correspondence:
\[ \|f_{C_1} - f_{C_2}\|_1 \le (\mu \times \nu)(C_1 \mathbin{\triangle} C_2), \]
because from the representation $C_1 = A_1 \sqcup B$ and $C_2 = A_2 \sqcup B$, where $B = C_1 \cap C_2$, one can see that $f_{C_1} - f_{C_2} = f_{A_1} - f_{A_2}$ and $f_{C_1 \triangle C_2} = f_{A_1} + f_{A_2}$, together with $|f_{A_1} - f_{A_2}| \le f_{A_1} + f_{A_2}$ for non-negative functions.
Thus the map $C \mapsto f_C$ can be extended to a map from the $\sigma$-algebra $L(X \times Y)$ of $\mu \times \nu$-measurable sets to $L_1(X)$ by the formula $f_{\lim_n C_n} = \lim_n f_{C_n}$.
Exercise 13.45. Describe topologies where two limits from the last formula are
taken.
The following lemma provides the geometric interpretation of the function $f_C$ as the size of the slice of the set $C$ along $x = \text{const}$.
Lemma 13.46. Let $C \in L(X \times Y)$. For almost every $x \in X$ the set $C_x = \{y \in Y : (x, y) \in C\}$ is $\nu$-measurable and $\nu(C_x) = f_C(x)$.
Proof. For sets from the ring $R(S \times T)$ it is true by the definition. If $C^{(n)}$ is a monotonic sequence of sets, then $\nu(\lim_n C^{(n)}_x) = \lim_n \nu(C^{(n)}_x)$ by $\sigma$-additivity of measures. Thus the property $\nu(C_x) = f_C(x)$ is preserved by monotonic limits. The following result is of separate interest:
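Lemma 13.47. Any measurable set can be obtained, up to a set of zero measure, from elementary sets by two monotonic limits.
Proof of Lem. 13.47. Let $C$ be a measurable set, and take $C_n \in R(S \times T)$ approximating $C$ up to $2^{-n}$ in $\mu \times \nu$. Let $\tilde{C} = \bigcap_{n=1}^{\infty} \bigcup_{k=1}^{\infty} C_{n+k}$; then
\[ (\mu \times \nu)\Big(C \setminus \bigcup_{k=1}^{\infty} C_{n+k}\Big) = 0 \quad\text{and}\quad (\mu \times \nu)\Big(\bigcup_{k=1}^{\infty} C_{n+k} \setminus C\Big) \le 2^{1-n}. \]
Then $(\mu \times \nu)(\tilde{C} \mathbin{\triangle} C) \le 2^{1-n}$ for any $n \in \mathbb{N}$.
Coming back to Lem. 13.46 we notice that (in the above notations) $f_C = f_{\tilde{C}}$ almost everywhere. Then:
\[ f_C(x) \overset{\text{a.e.}}{=} f_{\tilde{C}}(x) = \nu(\tilde{C}_x) = \nu(C_x). \]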
The following theorem generalizes the meaning of the integral as "area under the graph".
Theorem 13.48. Let $\mu$ and $\nu$ be $\sigma$-finite measures and let $C$ be a $\mu \times \nu$-measurable set in $X \times Y$. We define $C_x = \{y \in Y : (x, y) \in C\}$. Then for $\mu$-almost every $x \in X$ the set $C_x$ is $\nu$-measurable, the function $f_C(x) = \nu(C_x)$ is $\mu$-measurable and
\[ (\mu \times \nu)(C) = \int_X f_C\,d\mu, \tag{13.6} \]
where both parts may have the value $+\infty$.
Proof. If $C$ has a finite measure, then the statement is reduced to Lem. 13.46 and a passage to the limit in (13.6).
If $C$ has an infinite measure, then there exists a sequence of $C_n \subset C$ such that $\bigcup_n C_n = C$ and $(\mu \times \nu)(C_n) \to \infty$. Then $f_C(x) = \lim_n f_{C_n}(x)$ and
\[ \int_X f_{C_n}\,d\mu = (\mu \times \nu)(C_n) \to +\infty. \]
Thus $f_C$ is measurable and non-summable.
This theorem justifies the well-known technique of calculating areas (volumes) as integrals of lengths (areas) of the sections.
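For purely atomic measures on finite sets the identity $(\mu\times\nu)(C) = \int_X \nu(C_x)\,d\mu(x)$ reduces to regrouping a finite sum. The sketch below assumes hypothetical weights `mu`, `nu` and a hypothetical set `C` (none of them from the notes) and compares the product measure of `C` with the integral of its slice sizes.

```python
from fractions import Fraction

# Hypothetical atomic measures: weights of single points of X and of Y.
mu = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
nu = {0: Fraction(1, 3), 1: Fraction(2, 3), 2: Fraction(1)}
# A subset C of X x Y, given by listing its points.
C = {("a", 0), ("a", 2), ("b", 1)}


def product_measure(subset):
    """(mu x nu)(C) as the sum of mu({x}) * nu({y}) over points (x, y) of C."""
    return sum(mu[x] * nu[y] for (x, y) in subset)


def slice_measure(subset, x):
    """f_C(x) = nu(C_x), where C_x = {y : (x, y) in C}."""
    return sum(nu[y] for (x0, y) in subset if x0 == x)


def integral_of_slices(subset):
    """Integral over X of f_C d mu, a finite weighted sum here."""
    return sum(slice_measure(subset, x) * mu[x] for x in mu)


print(product_measure(C), integral_of_slices(C))
```

The content of Theorem 13.48 lies in the measurability of the slices and the passage to limits; in this finite setting only the bookkeeping remains visible.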
Remark 13.49.
(i) The role of spaces $X$ and $Y$ in Theorem 13.48 is symmetric, thus we can swap them in the conclusion.
(ii) Theorem 13.48 can be extended to any finite number of measure spaces. For the case of three spaces $(X, \mu)$, $(Y, \nu)$, $(Z, \lambda)$ we have:
\[ (\mu \times \nu \times \lambda)(C) = \int_{X\times Y} \lambda(C_{xy})\,d(\mu \times \nu)(x, y) = \int_Z (\mu \times \nu)(C_z)\,d\lambda(z), \tag{13.7} \]
where
\[ C_{xy} = \{z \in Z : (x, y, z) \in C\}, \qquad C_z = \{(x, y) \in X \times Y : (x, y, z) \in C\}. \]
Theorem 13.50 (Fubini). Let $f(x, y)$ be a summable function on the product of spaces $(X, \mu)$ and $(Y, \nu)$. Then:
(i) For $\mu$-almost every $x \in X$ the function $f(x, y)$ is summable on $Y$ and $f_Y(x) = \int_Y f(x, y)\,d\nu(y)$ is $\mu$-summable on $X$.
(ii) For $\nu$-almost every $y \in Y$ the function $f(x, y)$ is summable on $X$ and $f_X(y) = \int_X f(x, y)\,d\mu(x)$ is $\nu$-summable on $Y$.
(iii) There are the identities:
\[ \int_{X\times Y} f(x, y)\,d(\mu \times \nu)(x, y) = \int_X \Big( \int_Y f(x, y)\,d\nu(y) \Big) d\mu(x) = \int_Y \Big( \int_X f(x, y)\,d\mu(x) \Big) d\nu(y). \tag{13.8} \]
(iv) For a non-negative function the existence of any repeated integral in (13.8) implies summability of $f$ on $X \times Y$.
Proof. From the decomposition $f = f_+ - f_-$ we can reduce our consideration to non-negative functions. Let us consider the product of three spaces $(X, \mu)$, $(Y, \nu)$, $(\mathbb{R}, \lambda)$, with $\lambda = dz$ being the Lebesgue measure on $\mathbb{R}$. Define
\[ C = \{(x, y, z) \in X \times Y \times \mathbb{R} : 0 \le z \le f(x, y)\}. \]
Using the relation (13.7) we get:
\[ C_{xy} = \{z \in \mathbb{R} : 0 \le z \le f(x, y)\}, \qquad \lambda(C_{xy}) = f(x, y), \]
\[ C_x = \{(y, z) \in Y \times \mathbb{R} : 0 \le z \le f(x, y)\}, \qquad (\nu \times \lambda)(C_x) = \int_Y f(x, y)\,d\nu(y). \]
The theorem follows from those relations.
Exercise 13.51.
• Show that the first three conclusions of the Fubini Theorem may fail if $f$ is not summable.
• Show that the fourth conclusion of the Fubini Theorem may fail if $f$ has values of different signs.
13.5. Absolute Continuity of Measures. Here, we consider another topic in measure theory which benefits from the integration theory.
Definition 13.52. Let $X$ be a set with a $\sigma$-algebra $R$, a $\sigma$-finite measure $\mu$ and a finite charge $\nu$ on $R$. The charge $\nu$ is absolutely continuous with respect to $\mu$ if $\mu(A) = 0$ for $A \in R$ implies $\nu(A) = 0$. Two charges $\nu_1$ and $\nu_2$ are equivalent if the two conditions $|\nu_1|(A) = 0$ and $|\nu_2|(A) = 0$ are equivalent.
The above definition does not seem to justify the name "absolute continuity", but this will become clear from the following important theorem.
Theorem 13.53 (Radon-Nikodym). Any charge $\nu$ which is absolutely continuous with respect to a measure $\mu$ has the form
\[ \nu(A) = \int_A f\,d\mu, \]
where $f$ is a function from $L_1$. The function $f \in L_1$ is uniquely defined by the charge $\nu$.
Sketch of the proof. First we will assume that $\nu$ is a measure. Let $D$ be the collection of measurable functions $g : X \to [0, \infty)$ such that
\[ \int_E g\,d\mu \le \nu(E) \qquad (E \in L). \]
Let $\alpha = \sup_{g\in D} \int_X g\,d\mu \le \nu(X) < \infty$. So we can find a sequence $(g_n)$ in $D$ with $\int_X g_n\,d\mu \to \alpha$.
We define $f_0(x) = \sup_n g_n(x)$. We can show that $f_0 = \infty$ only on a set of $\mu$-measure zero, so if we adjust $f_0$ on this set, we get a measurable function $f : X \to [0, \infty)$. There is now a long argument to show that $f$ is as required.
If $\nu$ is a charge, we can find $f$ by applying the previous operation to the measures $\nu_+$ and $\nu_-$ (as it is easy to verify that $\nu_+, \nu_- \ll \mu$).
We show that $f$ is essentially unique. If $g$ is another function inducing $\nu$, then
\[ \int_E (f - g)\,d\mu = \nu(E) - \nu(E) = 0 \qquad (E \in L). \]
Let $E = \{x \in X : f(x) - g(x) \ge 0\}$, so as $f - g$ is measurable, $E \in L$. Then $\int_E (f - g)\,d\mu = 0$ and $f - g \ge 0$ on $E$, so by our result from integration theory, we have that $f - g = 0$ almost everywhere on $E$. Similarly, if $F = \{x \in X : f(x) - g(x) \le 0\}$, then $F \in L$ and $f - g = 0$ almost everywhere on $F$. As $E \cup F = X$, we conclude that $f = g$ almost everywhere.
Corollary 13.54. Let $\mu$ be a measure on $X$ and let $\nu$ be a finite charge which is absolutely continuous with respect to $\mu$. For any $\varepsilon > 0$ there exists $\delta > 0$ such that $\mu(A) < \delta$ implies $|\nu|(A) < \varepsilon$.
Proof. By the Radon-Nikodym theorem there is a function $f \in L_1(X, \mu)$ such that $\nu(A) = \int_A f\,d\mu$. Then $|\nu|(A) = \int_A |f|\,d\mu$ and we get the statement from Theorem 13.43 on absolute continuity of the Lebesgue integral.
14. FUNCTIONAL SPACES
In this section we describe various Banach spaces of functions on sets with
measure.
14.1. Integrable Functions. Let $(X, L, \mu)$ be a measure space. For $1 \le p < \infty$, we define $\mathcal{L}_p(\mu)$ to be the space of measurable functions $f : X \to \mathbb{K}$ such that
\[ \int_X |f|^p\,d\mu < \infty. \]
We define $\|\cdot\|_p : \mathcal{L}_p(\mu) \to [0, \infty)$ by
\[ \|f\|_p = \Big( \int_X |f|^p\,d\mu \Big)^{1/p} \qquad (f \in \mathcal{L}_p(\mu)). \]
Notice that if $f = 0$ almost everywhere, then $|f|^p = 0$ almost everywhere, and so $\|f\|_p = 0$. However, there can be non-zero functions such that $f = 0$ almost everywhere. So $\|\cdot\|_p$ is not a norm on $\mathcal{L}_p(\mu)$.
Exercise 14.1. Find a measure space $(X, \mu)$ such that $\ell_p = L_p(\mu)$, that is, the space of sequences $\ell_p$ is a particular case of the function spaces considered in this section. It also explains why the following proofs refer to Section 11 so often.
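Lemma 14.2 (Integral Hölder inequality). Let $1 < p < \infty$, and let $q \in (1, \infty)$ be such that $1/p + 1/q = 1$. For $f \in \mathcal{L}_p(\mu)$ and $g \in \mathcal{L}_q(\mu)$, we have that $fg$ is summable, and
\[ \int_X |fg|\,d\mu \le \|f\|_p \|g\|_q. \tag{14.1} \]
Proof. Recall that we know from Lem. 11.2 that
\[ |ab| \le \frac{|a|^p}{p} + \frac{|b|^q}{q} \qquad (a, b \in \mathbb{K}). \]
Now we follow the steps in the proof of Prop. 11.4. Define measurable functions $a, b : X \to \mathbb{K}$ by setting
\[ a(x) = \frac{f(x)}{\|f\|_p}, \qquad b(x) = \frac{g(x)}{\|g\|_q} \qquad (x \in X). \]
So we have that
\[ |a(x)b(x)| \le \frac{|f(x)|^p}{p\,\|f\|_p^p} + \frac{|g(x)|^q}{q\,\|g\|_q^q} \qquad (x \in X). \]
By integrating, we see that
\[ \int_X |ab|\,d\mu \le \frac{1}{p\,\|f\|_p^p} \int_X |f|^p\,d\mu + \frac{1}{q\,\|g\|_q^q} \int_X |g|^q\,d\mu = \frac{1}{p} + \frac{1}{q} = 1. \]
Hence, by the definition of $a$ and $b$,
\[ \int_X |fg|\,d\mu \le \|f\|_p \|g\|_q, \]
as required.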
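A quick numerical sanity check of (14.1) may be helpful. The sketch below assumes a hypothetical measure space of four weighted points (the lists `mu`, `f`, `g` and the helper names are illustrative only) and also evaluates the normalised functions $a$, $b$ from the proof, whose product should integrate to at most 1.

```python
# Hypothetical finite measure space: four points with the weights below.
mu = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 3.0]
g = [-1.5, 0.3, 2.0, 1.0]
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1


def integral(h):
    """Integral of h with respect to mu: a finite weighted sum."""
    return sum(v * w for v, w in zip(h, mu))


def norm(h, r):
    """The L_r norm: (integral of |h|^r d mu)^(1/r)."""
    return integral([abs(v) ** r for v in h]) ** (1.0 / r)


lhs = integral([abs(x * y) for x, y in zip(f, g)])
rhs = norm(f, p) * norm(g, q)

# Normalised functions a = f/||f||_p and b = g/||g||_q, as in the proof.
a = [v / norm(f, p) for v in f]
b = [v / norm(g, q) for v in g]
normalised = integral([abs(x * y) for x, y in zip(a, b)])

print(lhs, "<=", rhs)
```

Floating-point arithmetic is used here, so the comparison is only meaningful up to rounding error.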
Lemma 14.3. Let $f, g \in \mathcal{L}_p(\mu)$ and let $a \in \mathbb{K}$. Then:
(i) $\|af\|_p = |a|\,\|f\|_p$;
(ii) $\|f + g\|_p \le \|f\|_p + \|g\|_p$.
In particular, $\mathcal{L}_p$ is a vector space.
Proof. Part 14.3(i) is easy. For 14.3(ii), we need a version of Minkowski's Inequality, which will follow from the previous lemma. We essentially repeat the proof of Prop. 11.5.
Notice that the $p = 1$ case is easy, so suppose that $1 < p < \infty$. We have that
\[ \int_X |f + g|^p\,d\mu = \int_X |f + g|^{p-1} |f + g|\,d\mu \le \int_X |f + g|^{p-1} (|f| + |g|)\,d\mu = \int_X |f + g|^{p-1} |f|\,d\mu + \int_X |f + g|^{p-1} |g|\,d\mu. \]
Applying the lemma, this is
\[ \le \|f\|_p \Big( \int_X |f + g|^{q(p-1)}\,d\mu \Big)^{1/q} + \|g\|_p \Big( \int_X |f + g|^{q(p-1)}\,d\mu \Big)^{1/q}. \]
As $q(p - 1) = p$, we see that
\[ \|f + g\|_p^p \le \big( \|f\|_p + \|g\|_p \big) \|f + g\|_p^{p/q}. \]
As $p - p/q = 1$, we conclude that
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p, \]
as required.
In particular, if $f, g \in \mathcal{L}_p(\mu)$ then $af + g \in \mathcal{L}_p(\mu)$, showing that $\mathcal{L}_p(\mu)$ is a vector space.
We define an equivalence relation $\sim$ on the space of measurable functions by setting $f \sim g$ if and only if $f = g$ almost everywhere. We can check that $\sim$ is an equivalence relation (the slightly non-trivial part is that $\sim$ is transitive).
Proposition 14.4. For $1 \le p < \infty$, the collection of equivalence classes $\mathcal{L}_p(\mu)/\!\sim$ is a vector space, and $\|\cdot\|_p$ is a well-defined norm on $\mathcal{L}_p(\mu)/\!\sim$.
Proof. We need to show that addition and scalar multiplication are well-defined on $\mathcal{L}_p(\mu)/\!\sim$. Let $a \in \mathbb{K}$ and $f_1, f_2, g_1, g_2 \in \mathcal{L}_p(\mu)$ with $f_1 \sim f_2$ and $g_1 \sim g_2$. Then it is easy to see that $af_1 + g_1 \sim af_2 + g_2$; but this is all that is required!
If $f \sim g$ then $|f|^p = |g|^p$ almost everywhere, and so $\|f\|_p = \|g\|_p$. So $\|\cdot\|_p$ is well-defined on equivalence classes. In particular, if $f \sim 0$ then $\|f\|_p = 0$. Conversely, if $\|f\|_p = 0$ then $\int_X |f|^p\,d\mu = 0$, so as $|f|^p$ is a positive function, we must have that $|f|^p = 0$ almost everywhere. Hence $f = 0$ almost everywhere, so $f \sim 0$. That is,
\[ \{f \in \mathcal{L}_p(\mu) : f \sim 0\} = \big\{ f \in \mathcal{L}_p(\mu) : \|f\|_p = 0 \big\}. \]
It follows from the above lemma that this is a subspace of $\mathcal{L}_p(\mu)$.
The above lemma now immediately shows that $\|\cdot\|_p$ is a norm on $\mathcal{L}_p(\mu)/\!\sim$.
Definition 14.5. We write $L_p(\mu)$ for the normed space $(\mathcal{L}_p(\mu)/\!\sim, \|\cdot\|_p)$.
We will abuse notation and continue to write members of $L_p(\mu)$ as functions. Really they are equivalence classes, and so care must be taken when dealing with $L_p(\mu)$. For example, if $f \in L_p(\mu)$, it does not make sense to talk about the value of $f$ at a point.
Theorem 14.6. Let $(f_n)$ be a Cauchy sequence in $L_p(\mu)$. There exists $f \in L_p(\mu)$ with $\|f_n - f\|_p \to 0$. In fact, we can find a subsequence $(n_k)$ such that $f_{n_k} \to f$ pointwise, almost everywhere.
Proof. Consider first the case of a finite measure space $X$. Let $(f_n)$ be a Cauchy sequence in $L_p(\mu)$. From the Hölder inequality (14.1) we see that $\|f_n - f_m\|_1 \le \|f_n - f_m\|_p\,(\mu(X))^{1/q}$. Thus, $(f_n)$ is also a Cauchy sequence in $L_1(\mu)$. Thus by Theorem 13.42 there is a limit function $f \in L_1(\mu)$. Moreover, from the proof of that theorem we know that there is a subsequence $(f_{n_k})$ of $(f_n)$ convergent to $f$ almost everywhere. Thus in the Cauchy sequence inequality
\[ \int_X |f_{n_k} - f_{n_m}|^p\,d\mu < \varepsilon \]
we can pass to the limit $m \to \infty$ by the Fatou Lemma 13.39 and conclude:
\[ \int_X |f_{n_k} - f|^p\,d\mu \le \varepsilon. \]
So $f_{n_k}$ converges to $f$ in $L_p(\mu)$, and then $f_n$ converges to $f$ in $L_p(\mu)$ as well.
For a $\sigma$-finite measure $\mu$ we represent $X = \bigsqcup_k X_k$ with $\mu(X_k) < +\infty$ for all $k$. The restriction $(f^{(k)}_n)$ of a Cauchy sequence $(f_n) \subset L_p(X, \mu)$ to every $X_k$ is a Cauchy sequence in $L_p(X_k, \mu)$. By the previous paragraph there is the limit $f^{(k)} \in L_p(X_k, \mu)$. Define a function $f \in L_p(X, \mu)$ by the identities $f(x) = f^{(k)}(x)$ if $x \in X_k$. By the additivity of the integral, the Cauchy condition on $(f_n)$ can be written as:
\[ \int_X |f_n - f_m|^p\,d\mu = \sum_{k=1}^{\infty} \int_{X_k} \big| f^{(k)}_n - f^{(k)}_m \big|^p\,d\mu < \varepsilon. \]
It implies for any $M$:
\[ \sum_{k=1}^{M} \int_{X_k} \big| f^{(k)}_n - f^{(k)}_m \big|^p\,d\mu < \varepsilon. \]
In the last inequality we can pass to the limit $m \to \infty$:
\[ \sum_{k=1}^{M} \int_{X_k} \big| f^{(k)}_n - f^{(k)} \big|^p\,d\mu \le \varepsilon. \]
Since the last inequality is independent of $M$ we conclude:
\[ \int_X |f_n - f|^p\,d\mu = \sum_{k=1}^{\infty} \int_{X_k} \big| f^{(k)}_n - f^{(k)} \big|^p\,d\mu \le \varepsilon. \]
Thus we conclude that $f_n \to f$ in $L_p(X, \mu)$.
Corollary 14.7. $L_p(\mu)$ is a Banach space.
Proposition 14.8. Let $(X, L, \mu)$ be a measure space, and let $1 < p < \infty$. We can define a map $\Phi : L_q(\mu) \to L_p(\mu)^*$ by setting $\Phi(f) = F$, for $f \in L_q(\mu)$, $\frac{1}{p} + \frac{1}{q} = 1$, where
\[ F : L_p(\mu) \to \mathbb{K}, \qquad g \mapsto \int_X fg\,d\mu \qquad (g \in L_p(\mu)). \]
Proof. This proof is very similar to the proof of Thm. 11.12. For $f \in L_q(\mu)$ and $g \in L_p(\mu)$, it follows by Hölder's Inequality (14.1) that $fg$ is summable, and
\[ \Big| \int_X fg\,d\mu \Big| \le \int_X |fg|\,d\mu \le \|f\|_q \|g\|_p. \]
Let $f_1, f_2 \in \mathcal{L}_q(\mu)$ and $g_1, g_2 \in \mathcal{L}_p(\mu)$ with $f_1 \sim f_2$ and $g_1 \sim g_2$. Then $f_1 g_1 = f_2 g_1$ almost everywhere and $f_2 g_1 = f_2 g_2$ almost everywhere, so $f_1 g_1 = f_2 g_2$ almost everywhere, and hence
\[ \int_X f_1 g_1\,d\mu = \int_X f_2 g_2\,d\mu. \]
So $\Phi$ is well-defined.
Clearly $\Phi$ is linear, and we have shown that $\|\Phi(f)\| \le \|f\|_q$.
Let $f \in L_q(\mu)$ and define $g : X \to \mathbb{K}$ by
\[ g(x) = \begin{cases} \overline{f(x)}\,|f(x)|^{q-2} & : f(x) \neq 0,\\ 0 & : f(x) = 0. \end{cases} \]
Then $|g(x)| = |f(x)|^{q-1}$ for all $x \in X$, and so
\[ \int_X |g|^p\,d\mu = \int_X |f|^{p(q-1)}\,d\mu = \int_X |f|^q\,d\mu, \]
so $\|g\|_p = \|f\|_q^{q/p}$, and so, in particular, $g \in L_p(\mu)$. Let $F = \Phi(f)$, so that
\[ F(g) = \int_X fg\,d\mu = \int_X |f|^q\,d\mu = \|f\|_q^q. \]
Thus $\|F\| \ge \|f\|_q^q / \|g\|_p = \|f\|_q$. So we conclude that $\|F\| = \|f\|_q$, showing that $\Phi$ is an isometry.
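The norm-attaining function $g = \overline{f}\,|f|^{q-2}$ from this proof can be checked numerically. The sketch below assumes a hypothetical measure space of three weighted points and an arbitrary complex-valued `f` (all names are illustrative, not from the notes); it evaluates $F(g)$, $\|g\|_p$ and the quotient $F(g)/\|g\|_p$, which should agree with $\|f\|_q$ up to rounding.

```python
import math

# Hypothetical finite measure space: three points with the weights below.
mu = [0.5, 1.0, 2.0]
f = [1 + 2j, 0j, -0.5j]  # an arbitrary complex-valued function
q = 1.5
p = q / (q - 1.0)        # conjugate exponent, here p = 3


def integral(h):
    """Integral of h with respect to mu: a finite weighted sum."""
    return sum(v * w for v, w in zip(h, mu))


def norm(h, r):
    """The L_r norm: (integral of |h|^r d mu)^(1/r)."""
    return integral([abs(v) ** r for v in h]) ** (1.0 / r)


# g = conj(f) * |f|^(q-2) where f != 0, and g = 0 where f = 0.
g = [v.conjugate() * abs(v) ** (q - 2.0) if v != 0 else 0j for v in f]

F_g = integral([x * y for x, y in zip(f, g)])  # F(g) = integral of f g d mu
ratio = F_g.real / norm(g, p)

print(ratio, norm(f, q))
print(math.isclose(norm(g, p), norm(f, q) ** (q / p)))
```

Since $F(g)/\|g\|_p \le \|F\| \le \|f\|_q$, agreement of the two printed numbers reflects the equality $\|F\| = \|f\|_q$ in this toy case.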
Proposition 14.9. Let $(X, L, \mu)$ be a finite measure space, let $1 \le p < \infty$, and let $F \in L_p(\mu)^*$. Then there exists $f \in L_q(\mu)$, $\frac{1}{p} + \frac{1}{q} = 1$, such that
\[ F(g) = \int_X fg\,d\mu \qquad (g \in L_p(\mu)). \]
Sketch of the proof. As $\mu(X) < \infty$, for $E \in L$ we have that $\|\chi_E\|_p = \mu(E)^{1/p} < \infty$. So $\chi_E \in L_p(\mu)$, and hence we can define
\[ \nu(E) = F(\chi_E) \qquad (E \in L). \]
We proceed to show that $\nu$ is a signed (or complex) measure. Then we can apply the Radon-Nikodym Theorem 13.53 to find a function $f : X \to \mathbb{K}$ such that
\[ F(\chi_E) = \nu(E) = \int_E f\,d\mu \qquad (E \in L). \]
There is then a long argument to show that $f \in L_q(\mu)$ and that
\[ \int_X fg\,d\mu = F(g) \]
for all $g \in L_p(\mu)$, and not just for $g = \chi_E$.
Proposition 14.10. For $1 < p < \infty$, we have that $L_p(\mu)^* = L_q(\mu)$ isometrically, under the identification of the above results.
Remark 14.11. Note that $L_\infty^*$ is not isomorphic to $L_1$, except in the finite-dimensional situation. Moreover, if $\mu$ is not a point measure, $L_1$ is not a dual to any Banach space.
14.2. Dense Subspaces in $L_p$. We note that $f \in L_p(X)$ if and only if $|f|^p$ is summable, thus we can use all results from Section 13 to investigate $L_p(X)$.
Proposition 14.12. Let $(X, L, \mu)$ be a finite measure space, and let $1 \le p < \infty$. Then the collection of simple bounded functions attaining only a finite number of values is dense in $L_p(\mu)$.
Proof. Let $f \in L_p(\mu)$, and suppose for now that $f \ge 0$. For each $n \in \mathbb{N}$, let
\[ f_n = \min\big(n,\ 2^{-n} \lfloor 2^n f \rfloor\big). \]
Then each $f_n$ is simple, $f_n \uparrow f$, and $|f_n - f|^p \to 0$ pointwise. For each $n$, we have that
\[ 0 \le f_n \le f \implies 0 \le f - f_n \le f, \]
so that $|f - f_n|^p \le |f|^p$ for all $n$. As $\int |f|^p\,d\mu < \infty$, we can apply the Dominated Convergence Theorem to see that
\[ \lim_{n\to\infty} \int_X |f_n - f|^p\,d\mu = 0, \]
that is, $\|f_n - f\|_p \to 0$.
The general case follows by taking positive and negative parts, and if $\mathbb{K} = \mathbb{C}$, by taking real and imaginary parts first.
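The truncated dyadic approximation $f_n = \min(n, 2^{-n}\lfloor 2^n f\rfloor)$ acts pointwise, so its monotonicity can be checked value by value. The sketch below is a minimal illustration in exact arithmetic; the function name `dyadic_approx` and the sample values are assumptions made for the example.

```python
from fractions import Fraction
from math import floor


def dyadic_approx(value, n):
    """f_n(x) = min(n, 2^(-n) * floor(2^n * f(x))) for one value f(x) >= 0."""
    return min(Fraction(n), Fraction(floor(2 ** n * value), 2 ** n))


# A few sample values f(x) >= 0 (arbitrary choices).
samples = [Fraction(0), Fraction(1, 3), Fraction(22, 7), Fraction(10)]

for v in samples:
    approximations = [dyadic_approx(v, n) for n in range(1, 13)]
    print(v, [str(a) for a in approximations[:5]])
```

Once $n \ge f(x)$ the truncation at level $n$ is inactive and the error $f(x) - f_n(x)$ is smaller than $2^{-n}$, which gives the pointwise convergence used in the proof.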
Let $([0, 1], L, \mu)$ be the restriction of Lebesgue measure to $[0, 1]$. We often write $L_p([0, 1])$ instead of $L_p(\mu)$.
Proposition 14.13. For $1 < p < \infty$, we have that $C_{\mathbb{K}}([0, 1])$ is dense in $L_p([0, 1])$.
Proof. As $[0, 1]$ is a finite measure space, and each member of $C_{\mathbb{K}}([0, 1])$ is bounded, it is easy to see that each $f \in C_{\mathbb{K}}([0, 1])$ is such that $\|f\|_p < \infty$. So it makes sense to regard $C_{\mathbb{K}}([0, 1])$ as a subspace of $L_p(\mu)$. If $C_{\mathbb{K}}([0, 1])$ is not dense in $L_p(\mu)$, then we can find a non-zero $F \in L_p([0, 1])^*$ with $F(f) = 0$ for each $f \in C_{\mathbb{K}}([0, 1])$. This was a corollary of the Hahn-Banach theorem 11.14.
So there exists a non-zero $g \in L_q([0, 1])$ with
\[ \int_{[0,1]} fg\,d\mu = 0 \qquad (f \in C_{\mathbb{K}}([0, 1])). \]
Let $a < b$ in $[0, 1]$. By approximating $\chi_{(a,b)}$ by a continuous function, we can show that
\[ \int_{(a,b)} g\,d\mu = \int g\chi_{(a,b)}\,d\mu = 0. \]
Suppose for now that $\mathbb{K} = \mathbb{R}$. Let $A = \{x \in [0, 1] : g(x) \ge 0\} \in L$. By the definition of the Lebesgue (outer) measure, for $\varepsilon > 0$, there exist sequences $(a_n)$ and $(b_n)$ with $A \subset \bigcup_n (a_n, b_n)$, and $\sum_n (b_n - a_n) \le \mu(A) + \varepsilon$.
For each $N$, consider $\bigcup_{n=1}^{N} (a_n, b_n)$. If some $(a_i, b_i)$ overlaps $(a_j, b_j)$, then we could just consider the larger interval $(\min(a_i, a_j), \max(b_i, b_j))$. Formally, by an induction argument, we see that we can write $\bigcup_{n=1}^{N} (a_n, b_n)$ as a finite union of some disjoint open intervals, which, abusing notation, we still denote by $(a_n, b_n)$. By linearity, it hence follows that for $N \in \mathbb{N}$, if we set $B_N = \bigsqcup_{n=1}^{N} (a_n, b_n)$, then
\[ \int g\chi_{B_N}\,d\mu = \int g\chi_{(a_1,b_1)\sqcup\cdots\sqcup(a_N,b_N)}\,d\mu = 0. \]
Let $B = \bigcup_n (a_n, b_n)$, so $A \subset B$ and $\mu(B) \le \sum_n (b_n - a_n) \le \mu(A) + \varepsilon$. We then have that
\[ \Big| \int g\chi_{B_N}\,d\mu - \int g\chi_B\,d\mu \Big| = \Big| \int g\chi_{B\setminus((a_1,b_1)\sqcup\cdots\sqcup(a_N,b_N))}\,d\mu \Big|. \]
We now apply Hölder's inequality to see that this is at most
\[ \Big( \int \chi_{B\setminus((a_1,b_1)\sqcup\cdots\sqcup(a_N,b_N))}\,d\mu \Big)^{1/p} \|g\|_q = \mu\big(B \setminus ((a_1, b_1) \sqcup \cdots \sqcup (a_N, b_N))\big)^{1/p} \|g\|_q \le \Big( \sum_{n=N+1}^{\infty} (b_n - a_n) \Big)^{1/p} \|g\|_q. \]
We can make this arbitrarily small by making $N$ large. Hence we conclude that
\[ \int g\chi_B\,d\mu = 0. \]
Then we apply Hölder's inequality again to see that
\[ \Big| \int g\chi_A\,d\mu \Big| = \Big| \int g\chi_A\,d\mu - \int g\chi_B\,d\mu \Big| = \Big| \int g\chi_{B\setminus A}\,d\mu \Big| \le \|g\|_q\,\mu(B \setminus A)^{1/p} \le \|g\|_q\,\varepsilon^{1/p}. \]
As $\varepsilon > 0$ was arbitrary, we see that $\int_A g\,d\mu = 0$. As $g$ is positive on $A$, we conclude that $g = 0$ almost everywhere on $A$.
A similar argument applied to the set $\{x \in [0, 1] : g(x) \le 0\}$ allows us to conclude that $g = 0$ almost everywhere. If $\mathbb{K} = \mathbb{C}$, then take real and imaginary parts.
14.3. Continuous functions. Let $K$ be a compact (always assumed Hausdorff) topological space.
Definition 14.14. The Borel $\sigma$-algebra, $B(K)$, on $K$, is the $\sigma$-algebra generated by the open sets in $K$ (recall what this means from Section 11.5). A member of $B(K)$ is a Borel set.
Notice that if $f : K \to \mathbb{K}$ is a continuous function, then clearly $f$ is $B(K)$-measurable (the inverse image of an open set will be open, and hence certainly Borel). So if $\mu : B(K) \to \mathbb{K}$ is a finite real or complex charge (for $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ respectively), then $f$ will be $\mu$-summable (as $f$ is bounded) and so we can define
\[ \varphi_\mu : C_{\mathbb{K}}(K) \to \mathbb{K}, \qquad \varphi_\mu(f) = \int_K f\,d\mu \qquad (f \in C_{\mathbb{K}}(K)). \]
Clearly $\varphi_\mu$ is linear. Suppose for now that $\mu$ is positive, so that
\[ |\varphi_\mu(f)| \le \int_K |f|\,d\mu \le \|f\|_\infty\,\mu(K) \qquad (f \in C_{\mathbb{K}}(K)). \]
So $\varphi_\mu \in C_{\mathbb{K}}(K)^*$ with $\|\varphi_\mu\| \le \mu(K)$.
The aim of this section is to show that all of $C_{\mathbb{K}}(K)^*$ arises in this way. First we need to define a class of measures which are in good agreement with the topological structure.
Definition 14.15. A measure $\mu : B(K) \to [0, \infty)$ is regular if for each $A \in B(K)$, we have
\[ \mu(A) = \sup\{\mu(E) : E \subset A \text{ and } E \text{ is compact}\} = \inf\{\mu(U) : A \subset U \text{ and } U \text{ is open}\}. \]
A charge $\nu = \nu_+ - \nu_-$ is regular if $\nu_+$ and $\nu_-$ are regular measures. A complex measure is regular if its real and imaginary parts are regular.
Note the similarity between this notion and the definition of outer measure.
Example 14.16.
(i) Many common measures on the real line, e.g. the Lebesgue measure, point measures, etc., are regular.
(ii) An example of a measure $\mu$ on $[0, 1]$ which is not regular:
\[ \mu(\varnothing) = 0, \qquad \mu(\{\tfrac{1}{2}\}) = 1, \qquad \mu(A) = +\infty \]
for any other subset $A \subset [0, 1]$.
(iii) Another example of a $\sigma$-additive measure $\mu$ on $[0, 1]$ which is not regular:
\[ \mu(A) = \begin{cases} 0, & \text{if } A \text{ is at most countable;}\\ +\infty & \text{otherwise.} \end{cases} \]
As we are working only with compact spaces, for us, "compact" is the same as "closed". Regular measures somehow interact "well" with the underlying topology on $K$.
We let $M_{\mathbb{R}}(K)$ and $M_{\mathbb{C}}(K)$ be the collections of all finite, regular real or complex charges (that is, signed or complex measures) on $B(K)$.
Exercise 14.17. Check that $M_{\mathbb{R}}(K)$ and $M_{\mathbb{C}}(K)$ are real or complex, respectively, vector spaces for the obvious definition of addition and scalar multiplication.
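Recall, Defn. 12.29, that for $\mu \in M_{\mathbb{K}}(K)$ we define the variation of $\mu$:
\[ \|\mu\| = \sup\Big\{ \sum_{n=1}^{\infty} |\mu(A_n)| \Big\}, \]
where the supremum is taken over all sequences $(A_n)$ of pairwise disjoint members of $B(K)$, with $\bigsqcup_n A_n = K$. Such $(A_n)$ are called partitions.
Proposition 14.18. The variation $\|\cdot\|$ is a norm on $M_{\mathbb{K}}(K)$.
Proof. If $\mu = 0$ then clearly $\|\mu\| = 0$. If $\|\mu\| = 0$, then for $A \in B(K)$, let $A_1 = A$, $A_2 = K \setminus A$ and $A_3 = A_4 = \cdots = \varnothing$. Then $(A_n)$ is a partition, and so
\[ 0 = \sum_{n=1}^{\infty} |\mu(A_n)| = |\mu(A)| + |\mu(K \setminus A)|. \]
Hence $\mu(A) = 0$, and so as $A$ was arbitrary, we have that $\mu = 0$.
Clearly $\|a\mu\| = |a|\,\|\mu\|$ for $a \in \mathbb{K}$ and $\mu \in M_{\mathbb{K}}(K)$.
For $\mu, \lambda \in M_{\mathbb{K}}(K)$ and a partition $(A_n)$, we have that
\[ \sum_n |(\mu + \lambda)(A_n)| = \sum_n |\mu(A_n) + \lambda(A_n)| \le \sum_n |\mu(A_n)| + \sum_n |\lambda(A_n)| \le \|\mu\| + \|\lambda\|. \]
As $(A_n)$ was arbitrary, we see that $\|\mu + \lambda\| \le \|\mu\| + \|\lambda\|$.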
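On a finite set the supremum over partitions can be computed by brute force. The sketch below assumes a hypothetical four-point space carrying two real charges `nu` and `lam` given by their point masses (the names and numbers are illustrative only); it enumerates all partitions, confirms that the variation equals the sum of absolute point masses, and evaluates both sides of the triangle inequality.

```python
from fractions import Fraction


def partitions(points):
    """Yield every partition of the list `points` into non-empty blocks."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for part in partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + part


def variation(charge):
    """Supremum over partitions of the sum of |charge(A_n)|."""
    points = list(charge)
    return max(
        sum(abs(sum(charge[x] for x in block)) for block in part)
        for part in partitions(points)
    )


# Hypothetical real charges on the four-point set {a, b, c, d}.
nu = {"a": Fraction(2), "b": Fraction(-3), "c": Fraction(1, 2), "d": Fraction(-1)}
lam = {"a": Fraction(-1), "b": Fraction(4), "c": Fraction(1), "d": Fraction(1, 3)}
total = {x: nu[x] + lam[x] for x in nu}

print(variation(nu), variation(lam), variation(total))
```

The finest partition (into single points) realises the supremum here, which is the finite-dimensional shadow of the Hahn decomposition used later in Lemma 14.21.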
To get a handle on the "regular" condition, we need to know a little more about $C_{\mathbb{K}}(K)$.
Theorem 14.19 (Urysohn's Lemma). Let $K$ be a compact space, and let $E, F$ be closed subsets of $K$ with $E \cap F = \varnothing$. There exists $f : K \to [0, 1]$ continuous with $f(x) = 1$ for $x \in E$ and $f(x) = 0$ for $x \in F$ (written $f(E) = \{1\}$ and $f(F) = \{0\}$).
Proof. See a book on (point set) topology.
Lemma 14.20. Let $\mu : B(K) \to [0, \infty)$ be a regular measure. Then for $U \subseteq K$ open, we have
\[ \mu(U) = \sup\Big\{ \int_K f\,d\mu : f \in C_{\mathbb{R}}(K),\ 0 \le f \le \chi_U \Big\}. \]
Proof. If $0 \le f \le \chi_U$, then
\[ 0 = \int_K 0\,d\mu \le \int_K f\,d\mu \le \int_K \chi_U\,d\mu = \mu(U). \]
Conversely, let $F = K \setminus U$, a closed set. Let $E \subseteq U$ be closed. By Urysohn Lemma 14.19, there exists $f : K \to [0, 1]$ continuous with $f(E) = \{1\}$ and $f(F) = \{0\}$. So $\chi_E \le f \le \chi_U$, and hence
\[ \mu(E) \le \int_K f\,d\mu \le \mu(U). \]
As $\mu$ is regular,
\[ \mu(U) = \sup\{\mu(E) : E \subseteq U \text{ closed}\} \le \sup\Big\{ \int_K f\,d\mu : 0 \le f \le \chi_U \Big\} \le \mu(U). \]
Hence we have equality throughout.
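The next result tells us that the variation coincides with the norm on real charges viewed as linear functionals on $C_{\mathbb{R}}(K)$.
Lemma 14.21. Let $\mu \in M_{\mathbb{R}}(K)$. Then
\[ \|\mu\| = \|\varphi_\mu\| := \sup\Big\{ \Big| \int_K f\,d\mu \Big| : f \in C_{\mathbb{R}}(K),\ \|f\|_\infty \le 1 \Big\}. \]
Proof. Let $(A, B)$ be a Hahn decomposition (Thm. 12.34) for $\mu$. For $f \in C_{\mathbb{R}}(K)$ with $\|f\|_\infty \le 1$, we have that
\[ \Big| \int_K f\,d\mu \Big| \le \Big| \int_A f\,d\mu \Big| + \Big| \int_B f\,d\mu \Big| = \Big| \int_A f\,d\mu_+ \Big| + \Big| \int_B f\,d\mu_- \Big| \le \int_A |f|\,d\mu_+ + \int_B |f|\,d\mu_- \le \|f\|_\infty (\mu(A) - \mu(B)) \le \|f\|_\infty \|\mu\|, \]
using the fact that $\mu(B) \le 0$ and that $(A, B)$ is a partition of $K$.
Conversely, as $\mu$ is regular, for $\varepsilon > 0$, there exist closed sets $E$ and $F$ with $E \subseteq A$, $F \subseteq B$, and with $\mu_+(E) > \mu_+(A) - \varepsilon$ and $\mu_-(F) > \mu_-(B) - \varepsilon$. By Urysohn Lemma 14.19, there exists $f : K \to [0, 1]$ continuous with $f(E) = \{1\}$ and $f(F) = \{0\}$. Let $g = 2f - 1$, so $g$ is continuous, $g$ takes values in $[-1, 1]$, and $g(E) = \{1\}$, $g(F) = \{-1\}$. Then
\[ \int_K g\,d\mu = \int_E 1\,d\mu + \int_F (-1)\,d\mu + \int_{K\setminus(E\cup F)} g\,d\mu = \mu(E) - \mu(F) + \int_{A\setminus E} g\,d\mu + \int_{B\setminus F} g\,d\mu. \]
As $E \subseteq A$, we have $\mu(E) = \mu_+(E)$, and as $F \subseteq B$, we have $-\mu(F) = \mu_-(F)$. So
\[ \int_K g\,d\mu > \mu_+(A) - \varepsilon + \mu_-(B) - \varepsilon + \int_{A\setminus E} g\,d\mu + \int_{B\setminus F} g\,d\mu \ge |\mu(A)| + |\mu(B)| - 2\varepsilon - |\mu(A \setminus E)| - |\mu(B \setminus F)| \ge |\mu(A)| + |\mu(B)| - 4\varepsilon. \]
As $\varepsilon > 0$ was arbitrary, we see that $\|\varphi_\mu\| \ge |\mu(A)| + |\mu(B)| = \|\mu\|$.
Thus, we know that $M_{\mathbb{R}}(K)$ is isometrically embedded in $C_{\mathbb{R}}(K)^*$.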
14.4. Riesz Representation Theorem. To facilitate an approach to the key point of this Subsection we will require some more definitions.
Definition 14.22. A functional $F$ is positive if for any non-negative function $f \ge 0$ we have $F(f) \ge 0$.
Lemma 14.23. Any positive linear functional $F$ on $C(X)$ is continuous and $\|F\| = F(1)$, where $1$ is the function identically equal to 1 on $X$.
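Proof. For any function $f$ such that $\|f\|_\infty \le 1$ the function $1 - f$ is non-negative, thus $F(1) - F(f) = F(1 - f) \ge 0$. Thus $F(1) \ge F(f)$; applying the same argument to $-f$ gives $|F(f)| \le F(1)$. That is, $F$ is bounded and, since $\|1\|_\infty = 1$, its norm is $F(1)$.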
So for a positive functional you know the exact place where to spot its norm,
while a general linear functional can attain its norm at a generic point (if any) of the
unit ball in $C(X)$. It is also remarkable that any bounded linear functional can be
represented by a pair of positive ones.
Lemma 14.24. Let $\lambda$ be a continuous linear functional on $C(X)$. Then there are positive functionals $\lambda_+$ and $\lambda_-$ on $C(X)$, such that $\lambda = \lambda_+ - \lambda_-$.
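Proof. First, for $f \in C_{\mathbb{R}}(K)$ with $f \ge 0$, we define
\[ \lambda_+(f) = \sup\{ \lambda(g) : g \in C_{\mathbb{R}}(K),\ 0 \le g \le f \} \ge 0, \]
\[ \lambda_-(f) = \lambda_+(f) - \lambda(f) = \sup\{ \lambda(g) - \lambda(f) : g \in C_{\mathbb{R}}(K),\ 0 \le g \le f \} = \sup\{ \lambda(h) : h \in C_{\mathbb{R}}(K),\ 0 \le h + f \le f \} = \sup\{ \lambda(h) : h \in C_{\mathbb{R}}(K),\ -f \le h \le 0 \} \ge 0. \]
In a sense, this is similar to the Hahn decomposition (Thm. 12.34).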
We can check that
\[ \lambda_+(tf) = t\lambda_+(f), \quad \lambda_-(tf) = t\lambda_-(f) \qquad (t \ge 0,\ f \ge 0). \]
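For $f_1, f_2 \ge 0$, we have that
\[ \lambda_+(f_1 + f_2) = \sup\{ \lambda(g) : 0 \le g \le f_1 + f_2 \} = \sup\{ \lambda(g_1 + g_2) : 0 \le g_1 + g_2 \le f_1 + f_2 \} \ge \sup\{ \lambda(g_1) + \lambda(g_2) : 0 \le g_1 \le f_1,\ 0 \le g_2 \le f_2 \} = \lambda_+(f_1) + \lambda_+(f_2). \]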
Conversely, if $0 \le g \le f_1 + f_2$, then set $g_1 = \min(g, f_1)$, so $0 \le g_1 \le f_1$. Let $g_2 = g - g_1$, so $g_1 \le g$ implies that $0 \le g_2$. For $x \in K$, if $g_1(x) = g(x)$ then $g_2(x) = 0 \le f_2(x)$; if $g_1(x) = f_1(x)$ then $f_1(x) \le g(x)$ and so $g_2(x) = g(x) - f_1(x) \le f_2(x)$. So $0 \le g_2 \le f_2$, and $g = g_1 + g_2$. So in the above displayed equation, we really have equality throughout, and so $\lambda_+(f_1 + f_2) = \lambda_+(f_1) + \lambda_+(f_2)$. As $\lambda$ is additive, it is now immediate that
\[ \lambda_-(f_1 + f_2) = \lambda_-(f_1) + \lambda_-(f_2). \]
For $f \in C_{\mathbb{R}}(K)$ we put $f_+(x) = \max(f(x), 0)$ and $f_-(x) = -\min(f(x), 0)$. Then $f_\pm \ge 0$ and $f = f_+ - f_-$. We define:
\[ \lambda_+(f) = \lambda_+(f_+) - \lambda_+(f_-), \qquad \lambda_-(f) = \lambda_-(f_+) - \lambda_-(f_-). \]
As when we were dealing with integration, we can check that $\lambda_+$ and $\lambda_-$ become
linear functionals; by the previous Lemma they are bounded.
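As a rough finite-dimensional illustration of Lemma 14.24 (not part of the original notes), assume that $K$ is a finite set, so that a functional is a weighted sum $\lambda(f) = \sum_i w_i f_i$ with hypothetical real weights $w_i$. Under that assumption the supremum defining $\lambda_+(f)$ for $f \ge 0$ is attained by keeping $f$ where the weight is positive and replacing it by zero elsewhere. The Python sketch below compares this closed form with random competitors $0 \le g \le f$; all names are illustrative.

```python
import random

def lam(w, f):
    # Functional on a finite set: a weighted sum with real weights w.
    return sum(wi * fi for wi, fi in zip(w, f))

def lam_plus(w, f):
    # sup{lam(g) : 0 <= g <= f} for f >= 0: take g = f where w > 0, g = 0 elsewhere.
    return sum(max(wi, 0.0) * fi for wi, fi in zip(w, f))

def lam_minus(w, f):
    # Defined as in the text: lam_minus = lam_plus - lam.
    return lam_plus(w, f) - lam(w, f)

w = [2.0, -1.5, 0.5, -3.0]   # hypothetical weights of mixed sign
f = [1.0, 2.0, 0.0, 4.0]     # a non-negative function on four points

random.seed(0)
# Random functions g with 0 <= g <= f never beat the closed form.
best_random = max(
    lam(w, [random.uniform(0.0, fi) for fi in f]) for _ in range(2000)
)
print(lam_plus(w, f), lam_minus(w, f), best_random)
```

In this toy setting both $\lambda_+$ and $\lambda_-$ are again weighted sums, now with the non-negative weights $\max(w_i, 0)$ and $\max(-w_i, 0)$.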
Finally, we need a technical definition.
Definition 14.25. For $f \in C_{\mathbb{R}}(K)$, we define the support of $f$, written $\mathrm{supp}(f)$, to be
the closure of the set $\{x \in K : f(x) \ne 0\}$.
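Theorem 14.26 (Riesz Representation). Let $K$ be a compact (Hausdorff) space, and let $\lambda \in C_{\mathbb{K}}(K)^*$. There exists a unique $\mu \in M_{\mathbb{K}}(K)$ such that
\[ \lambda(f) = \int_K f \, d\mu \qquad (f \in C_{\mathbb{K}}(K)). \]
Furthermore, $\|\lambda\| = \|\mu\|$.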
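Proof. Let us show uniqueness. If $\mu_1, \mu_2 \in M_{\mathbb{K}}(K)$ both induce $\lambda$ then $\mu = \mu_1 - \mu_2$ induces the zero functional on $C_{\mathbb{K}}(K)$. So for $f \in C_{\mathbb{R}}(K)$,
\[ 0 = \Re \int_K f \, d\mu = \int_K f \, d\mu_r = \Im \int_K f \, d\mu = \int_K f \, d\mu_i. \]
So $\mu_r$ and $\mu_i$ both induce the zero functional on $C_{\mathbb{R}}(K)$. By Lemma 14.21, this means that $\|\mu_r\| = \|\mu_i\| = 0$, showing that $\mu = \mu_r + i\mu_i = 0$, as required.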
Existence is harder, and we shall only sketch it here. Firstly, we shall suppose that $\mathbb{K} = \mathbb{R}$ and that $\lambda$ is positive.
Motivated by the above Lemmas 14.20 and 14.21, for $U \subseteq K$ open, we define
\[ \mu^*(U) = \sup\{ \lambda(f) : f \in C_{\mathbb{R}}(K),\ 0 \le f \le \chi_U,\ \mathrm{supp}(f) \subseteq U \}. \]
For $A \subseteq K$ general, we define
\[ \mu^*(A) = \inf\{ \mu^*(U) : U \subseteq K \text{ is open},\ A \subseteq U \}. \]
We then proceed as follows:
(i) We show that $\mu^*$ is an outer measure: this requires a technical topological lemma, where we make use of the support condition in the definition.
(ii) We then check that every open set is $\mu^*$-measurable.
(iii) As $B(K)$ is generated by open sets, and the collection of $\mu^*$-measurable sets is a $\sigma$-algebra, it follows that every member of $B(K)$ is $\mu^*$-measurable.
(iv) By using results from Section 12, it follows that if we let $\mu$ be the restriction of $\mu^*$ to $B(K)$, then $\mu$ is a measure on $B(K)$.
(v) We then check that this measure is regular.
(vi) Finally, we show that $\mu$ does induce the functional $\lambda$. Arguably, it is this last step which is the hardest (or least natural to prove).
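If $\lambda$ is not positive, then by Lemma 14.24 represent it as $\lambda = \lambda_+ - \lambda_-$ for positive $\lambda_\pm$. As $\lambda_+$ and $\lambda_-$ are positive functionals, we can find positive measures $\mu_+$ and $\mu_-$ in $M_{\mathbb{R}}(K)$ such that
\[ \lambda_+(f) = \int_K f \, d\mu_+, \qquad \lambda_-(f) = \int_K f \, d\mu_- \qquad (f \in C_{\mathbb{R}}(K)). \]
Then if $\mu = \mu_+ - \mu_-$, we see that
\[ \lambda(f) = \lambda_+(f) - \lambda_-(f) = \int_K f \, d\mu \qquad (f \in C_{\mathbb{R}}(K)). \]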
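Finally, if $\mathbb{K} = \mathbb{C}$, then we use the same complexification trick from the proof of the Hahn-Banach Theorem 11.14. Namely, let $\lambda \in C_{\mathbb{C}}(K)^*$, and define $\lambda_r, \lambda_i \in C_{\mathbb{R}}(K)^*$ by
\[ \lambda_r(f) = \Re \lambda(f), \qquad \lambda_i(f) = \Im \lambda(f) \qquad (f \in C_{\mathbb{R}}(K)). \]
These are both clearly $\mathbb{R}$-linear. Notice also that $|\lambda_r(f)| = |\Re \lambda(f)| \le |\lambda(f)| \le \|\lambda\| \, \|f\|_\infty$, so $\lambda_r$ is bounded; similarly $\lambda_i$.
By the real version of the Riesz Representation Theorem, there exist charges $\mu_r$ and $\mu_i$ such that
\[ \Re \lambda(f) = \lambda_r(f) = \int_K f \, d\mu_r, \qquad \Im \lambda(f) = \lambda_i(f) = \int_K f \, d\mu_i \qquad (f \in C_{\mathbb{R}}(K)). \]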
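Then let $\mu = \mu_r + i\mu_i$, so for $f \in C_{\mathbb{C}}(K)$,
\begin{align*}
\int_K f \, d\mu &= \int_K f \, d\mu_r + i \int_K f \, d\mu_i \\
&= \int_K \Re(f) \, d\mu_r + i \int_K \Im(f) \, d\mu_r + i \int_K \Re(f) \, d\mu_i - \int_K \Im(f) \, d\mu_i \\
&= \lambda_r(\Re(f)) + i\lambda_r(\Im(f)) + i\lambda_i(\Re(f)) - \lambda_i(\Im(f)) \\
&= \Re\lambda(\Re(f)) + i\Re\lambda(\Im(f)) + i\Im\lambda(\Re(f)) - \Im\lambda(\Im(f)) \\
&= \lambda(\Re(f) + i\Im(f)) = \lambda(f),
\end{align*}
as required.
Notice that we have not currently proved that $\|\mu\| = \|\lambda\|$ in the case $\mathbb{K} = \mathbb{C}$. See a textbook for this.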
15. FOURIER TRANSFORM
In this section we will briefly present a theory of the Fourier transform focusing on the commutative group approach. We mainly follow the footsteps of [3, Ch. IV].
15.1. Convolutions on Commutative Groups. Let $G$ be a commutative group. We will use the $+$ sign to denote the group operation; respectively, the inverse element of $g \in G$ will be denoted $-g$. We assume that $G$ has a Hausdorff topology such that the operations $(g_1, g_2) \mapsto g_1 + g_2$ and $g \mapsto -g$ are continuous maps. We also assume that the topology is locally compact, that is the group neutral element has a neighbourhood with a compact closure.
Example 15.1. Our main examples will be as follows:
(i) $G = \mathbb{Z}$, the group of integers with the operation of addition and the discrete topology (each point is an open set).
(ii) $G = \mathbb{R}$, the group of real numbers with addition and the topology defined by open intervals.
(iii) $G = \mathbb{T}$, the group of Euclidean rotations of the unit circle in $\mathbb{R}^2$ with the natural topology. Other realisations of the same group are: unimodular complex numbers under multiplication; the factor group $\mathbb{R}/\mathbb{Z}$, that is addition of real numbers modulo 1. There is a homomorphism between the two realisations given by $z = e^{2\pi i t}$, $t \in [0, 1)$, $|z| = 1$.
We assume that $G$ has a regular Borel measure which is invariant in the following sense.
Definition 15.2. Let $\mu$ be a measure on a commutative group $G$; $\mu$ is called invariant (or Haar measure) if for any measurable $X$ and any $g \in G$ the sets $g + X$ and $-X$ are also measurable and $\mu(X) = \mu(g + X) = \mu(-X)$.
Such an invariant measure exists if and only if the group is locally compact; in this case the measure is uniquely defined up to a constant factor.
Exercise 15.3. Check that in the above three cases invariant measures are:
G = Z, the invariant measure of X is equal to number of elements in X.
G = R the invariant measure is the Lebesgue measure.
G = T the invariant measure coincides with the Lebesgue measure.
Definition 15.4. A convolution of two functions on a commutative group $G$ with an invariant measure $\mu$ is defined by:
\[ (f_1 * f_2)(x) = \int_G f_1(x - y) \, f_2(y) \, d\mu(y) = \int_G f_1(y) \, f_2(x - y) \, d\mu(y). \tag{15.1} \]
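To make the definition concrete, here is a minimal Python sketch under an assumption that is not in the text: we take the finite cyclic group $\mathbb{Z}_N$ (integers modulo $N$) with the counting measure, so that the integral in (15.1) becomes a finite sum. The function names are illustrative only. The sketch evaluates both forms of (15.1) by swapping the arguments and also checks the norm estimate $\|f_1 * f_2\|_1 \le \|f_1\|_1 \|f_2\|_1$ on one example.

```python
def convolve_cyclic(f1, f2):
    # (f1 * f2)(x) = sum over y of f1(x - y) f2(y), indices taken modulo N.
    n = len(f1)
    return [sum(f1[(x - y) % n] * f2[y] for y in range(n)) for x in range(n)]

def l1_norm(f):
    # L1 norm with respect to the counting measure.
    return sum(abs(v) for v in f)

f1 = [1, -2, 0, 3, 0]
f2 = [0, 1, 4, 0, -1]

c12 = convolve_cyclic(f1, f2)
c21 = convolve_cyclic(f2, f1)
print(c12 == c21)                                  # the two forms of (15.1) agree
print(l1_norm(c12), l1_norm(f1) * l1_norm(f2))     # norm estimate on this example
```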
Theorem 15.5. If $f_1, f_2 \in L_1(G, \mu)$, then the integrals in (15.1) exist for almost every $x \in G$, the function $f_1 * f_2$ is in $L_1(G, \mu)$ and $\|f_1 * f_2\| \le \|f_1\| \cdot \|f_2\|$.
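Proof. If $f_1, f_2 \in L_1(G, \mu)$ then by Fubini's Thm. 13.50 the function $\phi(x, y) = f_1(x) \cdot f_2(y)$ is in $L_1(G \times G, \mu \times \mu)$ and $\|\phi\| = \|f_1\| \cdot \|f_2\|$.
Let us define a map $\tau : G \times G \to G \times G$ such that $\tau(x, y) = (x + y, y)$. It is measurable (sends Borel sets to Borel sets) and preserves the measure $\mu \times \mu$. Indeed, for an elementary set $X = A \times B \subset G \times G$ we have:
\begin{align*}
(\mu \times \mu)(\tau(X)) &= \int_{G \times G} \chi_{\tau(X)}(x, y) \, d\mu(x) \, d\mu(y) \\
&= \int_{G \times G} \chi_X(x - y, y) \, d\mu(x) \, d\mu(y) \\
&= \int_G \left( \int_G \chi_X(x - y, y) \, d\mu(x) \right) d\mu(y) \\
&= \int_B \mu(A + y) \, d\mu(y) = \mu(A) \cdot \mu(B) = (\mu \times \mu)(X).
\end{align*}
We used invariance of $\mu$ and Fubini's Thm. 13.50. Therefore we have an isometric isomorphism of $L_1(G \times G, \mu \times \mu)$ into itself by the formula:
\[ T\phi(x, y) = \phi(\tau^{-1}(x, y)) = \phi(x - y, y). \]
Applying this isomorphism to the above function $\phi(x, y) = f_1(x) \cdot f_2(y)$ we get $T\phi(x, y) = f_1(x - y) \, f_2(y)$, which is again in $L_1(G \times G, \mu \times \mu)$ with the same norm; integrating it in $y$ (Fubini's theorem once more) gives the statement.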
Definition 15.6. Denote by $S(k)$ the map $S(k) : f \mapsto k * f$, which we will call the convolution operator with the kernel $k$.
Corollary 15.7. If $k \in L_1(G)$ then the convolution $S(k)$ is a bounded linear operator on $L_1(G)$.
Theorem 15.8. Convolution is a commutative, associative and distributive operation. In particular $S(f_1) S(f_2) = S(f_2) S(f_1) = S(f_1 * f_2)$.
Proof. Direct calculation using change of variables.
It follows from Thm. 15.5 that convolution is a closed operation on $L_1(G)$ and has nice properties due to Thm. 15.8. We fix this in the following definition.
Definition 15.9. $L_1(G)$ equipped with the operation of convolution is called the convolution algebra $L_1(G)$.
The following operators are of special interest.
Definition 15.10. An operator of shift $T(a)$ acts on functions by $T(a) : f(x) \mapsto f(x + a)$.
Lemma 15.11. An operator of shift is an isometry of $L_p(G)$, $1 \le p \le \infty$.
Theorem 15.12. Operators of shifts and convolutions commute:
\[ T(a)(f_1 * f_2) = T(a)f_1 * f_2 = f_1 * T(a)f_2, \]
or
\[ T(a) S(f) = S(f) T(a) = S(T(a)f). \]
Proof. Just another calculation with a change of variables.
Remark 15.13. Note that the operators of shift $T(a)$ provide a representation of the group $G$ by linear isometric operators in $L_p(G)$, $1 \le p \le \infty$. A map $f \mapsto S(f)$ is a representation of the convolution algebra.
There is a useful relation between the supports of functions and of their convolutions.
Lemma 15.14. For any $f_1, f_2 \in L_1(G)$ we have:
\[ \mathrm{supp}(f_1 * f_2) \subset \mathrm{supp}(f_1) + \mathrm{supp}(f_2). \]
Proof. If $x \notin \mathrm{supp}(f_1) + \mathrm{supp}(f_2)$ then for any $y \in \mathrm{supp}(f_2)$ we have $x - y \notin \mathrm{supp}(f_1)$.
Thus for such $x$ the convolution is the integral of the identical zero.
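The support relation is easy to observe for $G = \mathbb{Z}$ with the counting measure. The Python sketch below stores finitely supported sequences as dictionaries (a representation chosen here for illustration, not taken from the notes) and compares the support of the convolution with the sum of the supports.

```python
def convolve(f1, f2):
    # (f1 * f2)(x) = sum over y of f1(x - y) f2(y); with a = x - y and b = y
    # every product f1(a) f2(b) contributes to the point a + b.
    out = {}
    for a, u in f1.items():
        for b, v in f2.items():
            out[a + b] = out.get(a + b, 0) + u * v
    return out

def support(f):
    # Points where the sequence is non-zero.
    return {x for x, v in f.items() if v != 0}

f1 = {0: 1, 3: -2}
f2 = {-1: 5, 2: 1}

conv = convolve(f1, f2)
sum_of_supports = {a + b for a in support(f1) for b in support(f2)}
print(sorted(support(conv)), sorted(sum_of_supports))
```

The inclusion can be strict, since products landing on the same point may cancel; in this particular example no cancellation occurs.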
15.2. Characters of Commutative Groups. Our purpose is to map the commutative algebra of convolutions to a commutative algebra of functions with point-wise multiplication. To this end we first represent elements of the group as operators of multiplication.
Definition 15.15. A character $\chi : G \to \mathbb{T}$ is a continuous homomorphism of an abelian topological group $G$ to the group $\mathbb{T}$ of unimodular complex numbers under multiplication:
\[ \chi(x + y) = \chi(x) \chi(y). \]
Lemma 15.16. The product of two characters of a group is again a character of the group. If $\chi$ is a character of $G$ then $\chi^{-1} = \bar{\chi}$ is a character as well.
Proof. Let $\chi_1$ and $\chi_2$ be characters of $G$. Then:
\[ \chi_1(gh) \chi_2(gh) = \chi_1(g) \chi_1(h) \chi_2(g) \chi_2(h) = (\chi_1(g) \chi_2(g)) (\chi_1(h) \chi_2(h)) \in \mathbb{T}. \]

Definition 15.17. The dual group $\hat{G}$ is the collection of all characters of $G$ with the operation of multiplication.
The dual group becomes a topological group with the uniform convergence on compacts: $\chi_n \to \chi$ if for any compact subset $K \subset G$ and any $\epsilon > 0$ there is $N \in \mathbb{N}$ such that $|\chi_n(x) - \chi(x)| < \epsilon$ for all $x \in K$ and $n > N$.
Exercise 15.18. Check that the sequence $f_n(x) = x^n$ does not converge uniformly on compacts if considered on $[0, 1]$. However it does converge uniformly on compacts if considered on $(0, 1)$.
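Example 15.19. If $G = \mathbb{Z}$ then any character $\chi$ is defined by its value $\chi(1)$ since
\[ \chi(n) = [\chi(1)]^n. \tag{15.2} \]
Since $\chi(1)$ can be any number on $\mathbb{T}$ we see that $\hat{\mathbb{Z}}$ is parametrised by $\mathbb{T}$.
Theorem 15.20. The group $\hat{\mathbb{Z}}$ is isomorphic to $\mathbb{T}$.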
Proof. The correspondence from the above example is a group homomorphism. Indeed if $\chi_z$ is the character with $\chi_z(1) = z$, then $\chi_{z_1} \chi_{z_2} = \chi_{z_1 z_2}$. Since $\mathbb{Z}$ is discrete, every compact consists of a finite number of points, thus uniform convergence on compacts means point-wise convergence. The equation (15.2) shows that $\chi_{z_n} \to \chi_z$ if and only if $\chi_{z_n}(1) \to \chi_z(1)$, that is $z_n \to z$.
Theorem 15.21. The group $\hat{\mathbb{T}}$ is isomorphic to $\mathbb{Z}$.
Proof. For every $n \in \mathbb{Z}$ define a character of $\mathbb{T}$ by
\[ \chi_n(z) = z^n, \qquad z \in \mathbb{T}. \tag{15.3} \]
We will show that these are the only characters in Cor. 15.25. The isomorphism property is easy to establish. The topological isomorphism follows from discreteness of $\hat{\mathbb{T}}$. Indeed due to compactness of $\mathbb{T}$ for $n \ne m$:
\[ \max_{z \in \mathbb{T}} |\chi_n(z) - \chi_m(z)|^2 = \max_{z \in \mathbb{T}} \left( 2 - 2\Re z^{m-n} \right) = 4. \]
Thus, any convergent sequence $(n_k)$ has to be constant for sufficiently large $k$, that corresponds to a discrete topology on $\mathbb{Z}$.
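The separation of distinct characters of $\mathbb{T}$ can also be observed numerically. The sketch below is only an illustration: it parametrises $\mathbb{T}$ by $z = e^{2\pi i t}$, samples $t$ on a uniform grid (the grid size is an arbitrary choice) and evaluates the maximal squared distance between $\chi_n$ and $\chi_m$.

```python
import cmath
import math

def chi(n, t):
    # The character chi_n(z) = z**n evaluated at z = exp(2*pi*i*t).
    return cmath.exp(2j * math.pi * n * t)

def max_sq_distance(n, m, samples=3600):
    # Maximum of |chi_n(z) - chi_m(z)|**2 over a uniform grid on the circle.
    return max(
        abs(chi(n, k / samples) - chi(m, k / samples)) ** 2
        for k in range(samples)
    )

print(max_sq_distance(1, 2))    # close to 4
print(max_sq_distance(3, -2))   # close to 4
print(max_sq_distance(2, 2))    # 0 for equal characters
```

The grid contains points where $z^{m-n} = -1$ for both pairs above, so the value 4 is reached up to rounding.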
The last two Theorems are an illustration of the following general statement.
Principle 15.22 (Pontryagin's duality). For any locally compact commutative topological group $G$ the natural map $G \to \hat{\hat{G}}$, which maps $g \in G$ to a character $f_g$ on $\hat{G}$ by the formula:
\[ f_g(\chi) = \chi(g), \qquad \chi \in \hat{G}, \tag{15.4} \]
is an isomorphism of topological groups.
Remark 15.23. (i) The principle is not true for commutative groups which are not locally compact.
(ii) Note the similarity with an embedding of a vector space into the second dual.
In particular, Pontryagin's duality tells us that the collection of all characters contains enough information to rebuild the initial group.
Theorem 15.24. The group $\hat{\mathbb{R}}$ is isomorphic to $\mathbb{R}$.
Proof. For $\lambda \in \mathbb{R}$ define a character $\chi_\lambda \in \hat{\mathbb{R}}$ by
\[ \chi_\lambda(x) = e^{2\pi i \lambda x}, \qquad x \in \mathbb{R}. \tag{15.5} \]
Moreover any smooth character of the group $G = (\mathbb{R}, +)$ has the form (15.5). Indeed, let $\chi$ be a smooth character of $\mathbb{R}$. Put $c = \chi'(t)|_{t=0} \in \mathbb{C}$. Then $\chi'(t) = c\chi(t)$ and $\chi(t) = e^{ct}$. We also get $c \in i\mathbb{R}$ and any such $c$ defines a character. Then the multiplication of characters is: $\chi_1(t) \chi_2(t) = e^{c_1 t} e^{c_2 t} = e^{(c_2 + c_1)t}$. So we have a group isomorphism.
For a generic character we can first apply the smoothing technique and reduce to the above case.
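Let us show topological homeomorphism. If $\lambda_n \to \lambda$ then $\chi_{\lambda_n} \to \chi_\lambda$ uniformly on any compact in $\mathbb{R}$ from the explicit formula of the character. In the reverse direction, let $\chi_{\lambda_n} \to \chi_\lambda$ uniformly on any interval. Then $\chi_{\lambda_n - \lambda}(x) \to 1$ uniformly on any compact, in particular, on $[0, 1]$. But, since $|e^{2\pi i \theta} - 1| = 2|\sin \pi\theta|$,
\[ \sup_{[0,1]} |\chi_{\lambda_n - \lambda}(x) - 1| = 2 \sup_{[0,1]} |\sin \pi(\lambda_n - \lambda)x| = \begin{cases} 2, & \text{if } |\lambda_n - \lambda| \ge 1/2, \\ 2 \sin \pi|\lambda_n - \lambda|, & \text{if } |\lambda_n - \lambda| \le 1/2. \end{cases} \]
Thus $\lambda_n \to \lambda$.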
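Corollary 15.25. Any character of the group $\mathbb{T}$ has the form (15.3).
Proof. Let $\chi \in \hat{\mathbb{T}}$, consider $\chi_1(t) = \chi(e^{2\pi i t})$ which is a character of $\mathbb{R}$. Thus $\chi_1(t) = e^{2\pi i \lambda t}$ for some $\lambda \in \mathbb{R}$. Since $\chi_1(1) = 1$ then $\lambda = n \in \mathbb{Z}$. Thus $\chi_1(t) = e^{2\pi i n t}$, that is $\chi(z) = z^n$ for $z = e^{2\pi i t}$.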
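Remark 15.26. Although $\hat{\mathbb{R}}$ is isomorphic to $\mathbb{R}$ there is no canonical form for this isomorphism (unlike for $\mathbb{R} \to \hat{\hat{\mathbb{R}}}$). Our choice is convenient for the Poisson formula below, however some other popular definitions are $\lambda \to e^{i\lambda x}$ or $\lambda \to e^{-i\lambda x}$.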
We can unify the previous three Theorems into the following statement.
Theorem 15.27. Let $G = \mathbb{R}^n \times \mathbb{Z}^k \times \mathbb{T}^l$ be the direct product of groups. Then the dual group is $\hat{G} = \mathbb{R}^n \times \mathbb{T}^k \times \mathbb{Z}^l$.
15.3. Fourier Transform on Commutative Groups.
Definition 15.28. Let $G$ be a locally compact commutative group with an invariant measure $\mu$. For any $f \in L_1(G)$ define the Fourier transform $\hat{f}$ by
\[ \hat{f}(\chi) = \int_G f(x) \, \bar{\chi}(x) \, d\mu(x), \qquad \chi \in \hat{G}. \tag{15.6} \]
That is, the Fourier transform $\hat{f}$ is a function on the dual group $\hat{G}$.
Example 15.29. (i) If $G = \mathbb{Z}$, then $f \in L_1(\mathbb{Z})$ is a two-sided summable sequence $(c_n)_{n \in \mathbb{Z}}$. Its Fourier transform is the function $f(z) = \sum_{n=-\infty}^{\infty} c_n z^n$ on $\mathbb{T}$. Sometimes $f(z)$ is called the generating function of the sequence $(c_n)$.
(ii) If $G = \mathbb{T}$, then the Fourier transform of $f \in L_1(\mathbb{T})$ is its Fourier coefficients, see Section 4.1.
(iii) If $G = \mathbb{R}$, the Fourier transform is also a function on $\mathbb{R}$ given by the Fourier integral:
\[ \hat{f}(\lambda) = \int_{\mathbb{R}} f(x) \, e^{-2\pi i \lambda x} \, dx. \tag{15.7} \]
The important properties of the Fourier transform are captured in the following statement.
Theorem 15.30. Let $G$ be a locally compact commutative group with an invariant measure $\mu$. The Fourier transform maps functions from $L_1(G)$ to continuous bounded functions on $\hat{G}$. Moreover, a convolution is transformed to point-wise multiplication:
\[ (f_1 * f_2)\hat{\ }(\chi) = \hat{f}_1(\chi) \cdot \hat{f}_2(\chi), \tag{15.8} \]
a shift operator $T(a)$, $a \in G$, is transformed into multiplication by the character $f_a \in \hat{\hat{G}}$:
\[ (T(a)f)\hat{\ }(\chi) = f_a(\chi) \cdot \hat{f}(\chi), \qquad f_a(\chi) = \chi(a), \tag{15.9} \]
and multiplication by a character $\chi \in \hat{G}$ is transformed to the shift $T(\chi^{-1})$:
\[ (\chi \cdot f)\hat{\ }(\chi_1) = T(\chi^{-1}) \hat{f}(\chi_1) = \hat{f}(\chi^{-1} \chi_1). \tag{15.10} \]
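Proof. Let $f \in L_1(G)$. For any $\epsilon > 0$ there is a compact $K \subset G$ such that $\int_{G \setminus K} |f| \, d\mu < \epsilon$. If $\chi_n \to \chi$ in $\hat{G}$, then we have the uniform convergence of $\chi_n \to \chi$ on $K$, so there is $n(\epsilon)$ such that for $k > n(\epsilon)$ we have $|\chi_k(x) - \chi(x)| < \epsilon$ for all $x \in K$. Then for $n > n(\epsilon)$
\[ \left| \hat{f}(\chi_n) - \hat{f}(\chi) \right| \le \int_K |f(x)| \, |\chi_n(x) - \chi(x)| \, d\mu(x) + \int_{G \setminus K} |f(x)| \, |\chi_n(x) - \chi(x)| \, d\mu(x) \le \epsilon \|f\| + 2\epsilon. \]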
Thus $\hat{f}$ is continuous. Its boundedness follows from the integral estimations. The algebraic maps (15.8)-(15.10) can be obtained by changes of variables under integration. For example, using Fubini's Thm. 13.50 and invariance of the measure:
\begin{align*}
(f_1 * f_2)\hat{\ }(\chi) &= \int_G \int_G f_1(s) \, f_2(t - s) \, ds \, \bar{\chi}(t) \, dt \\
&= \int_G \int_G f_1(s) \, \bar{\chi}(s) \, f_2(t - s) \, \bar{\chi}(t - s) \, ds \, dt \\
&= \hat{f}_1(\chi) \, \hat{f}_2(\chi).
\end{align*}
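The algebraic properties (15.8) and (15.9) can be checked numerically in the simplest possible setting. The sketch below assumes the finite cyclic group $\mathbb{Z}_N$ with the counting measure, whose characters are $\chi_k(x) = e^{2\pi i k x / N}$; this group is not among the examples of the text and is chosen only because all integrals become finite sums. Function names are illustrative.

```python
import cmath
import math

def fourier(f):
    # hat f(chi_k) = sum over x of f(x) * conj(chi_k(x)), chi_k(x) = exp(2*pi*i*k*x/N).
    n = len(f)
    return [
        sum(f[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]

def convolve_cyclic(f1, f2):
    # (f1 * f2)(x) = sum over y of f1(x - y) f2(y) on Z_N.
    n = len(f1)
    return [sum(f1[(x - y) % n] * f2[y] for y in range(n)) for x in range(n)]

def shift(f, a):
    # (T(a) f)(x) = f(x + a) on Z_N.
    n = len(f)
    return [f[(x + a) % n] for x in range(n)]

f1 = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]
f2 = [0.0, 1.0, 4.0, 0.0, -1.0, 2.0]
n, a = len(f1), 2

# (15.8): the transform of a convolution is the product of transforms.
lhs_conv = fourier(convolve_cyclic(f1, f2))
rhs_conv = [u * v for u, v in zip(fourier(f1), fourier(f2))]
err_conv = max(abs(u - v) for u, v in zip(lhs_conv, rhs_conv))

# (15.9): the transform of a shift is multiplication by chi_k(a).
lhs_shift = fourier(shift(f1, a))
rhs_shift = [
    cmath.exp(2j * math.pi * k * a / n) * v for k, v in enumerate(fourier(f1))
]
err_shift = max(abs(u - v) for u, v in zip(lhs_shift, rhs_shift))
print(err_conv, err_shift)
```

Both discrepancies are at the level of floating-point rounding.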

15.4. Fourier Integral. We recall the formula (15.7):
Definition 15.31. We define the Fourier integral of a function $f \in L_1(\mathbb{R})$ by
\[ \hat{f}(\lambda) = \int_{\mathbb{R}} f(x) \, e^{-2\pi i \lambda x} \, dx. \tag{15.11} \]
We already know that $\hat{f}$ is a bounded continuous function on $\mathbb{R}$; a further property is:
Lemma 15.32. If a sequence of functions $(f_n) \subset L_1(\mathbb{R})$ converges in the metric of $L_1(\mathbb{R})$, then the sequence $(\hat{f}_n)$ converges uniformly on the real line.
Proof. This follows from the estimation:
\[ \left| \hat{f}_n(\lambda) - \hat{f}_m(\lambda) \right| \le \int_{\mathbb{R}} |f_n(x) - f_m(x)| \, dx. \]

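Lemma 15.33. The Fourier integral $\hat{f}$ of $f \in L_1(\mathbb{R})$ has zero limits at $-\infty$ and $+\infty$.
Proof. Take $f$ the indicator function of $[a, b]$. Then $\hat{f}(\lambda) = \frac{1}{2\pi i \lambda} \left( e^{-2\pi i a \lambda} - e^{-2\pi i b \lambda} \right)$, $\lambda \ne 0$. Thus $\lim_{\lambda \to \pm\infty} \hat{f}(\lambda) = 0$. By continuity from the previous Lemma this can be extended to the closure of simple functions, the space $L_1(\mathbb{R})$.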
Lemma 15.34. If $f$ is absolutely continuous on every interval and $f' \in L_1(\mathbb{R})$, then
\[ (f')\hat{\ } = 2\pi i \lambda \hat{f}. \]
More generally:
\[ (f^{(k)})\hat{\ } = (2\pi i \lambda)^k \hat{f}. \tag{15.12} \]
Proof. A direct demonstration is based on integration by parts, which is possible because of the assumptions in the Lemma.
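A sketch of the integration by parts behind the first identity, under the additional assumptions (not spelled out in the Lemma) that $f \in L_1(\mathbb{R})$ and $f(x) \to 0$ as $x \to \pm\infty$, so that the boundary term vanishes:

```latex
\begin{align*}
\widehat{f'}(\lambda)
  &= \int_{\mathbb{R}} f'(x)\, e^{-2\pi i \lambda x}\, dx \\
  &= \Big[ f(x)\, e^{-2\pi i \lambda x} \Big]_{x=-\infty}^{x=+\infty}
     + 2\pi i \lambda \int_{\mathbb{R}} f(x)\, e^{-2\pi i \lambda x}\, dx \\
  &= 2\pi i \lambda\, \hat{f}(\lambda).
\end{align*}
```

Iterating the same step $k$ times gives (15.12), provided each intermediate derivative satisfies the same assumptions.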
It may also be interesting to mention that the operation of differentiation $D$ can be expressed through the shift operator $T_a$:
\[ D = \lim_{\Delta t \to 0} \frac{T_{\Delta t} - I}{\Delta t}. \tag{15.13} \]
By the formula (15.9), the Fourier integral transforms $\frac{1}{\Delta t}(T_{\Delta t} - I)$ into $\frac{1}{\Delta t}(\chi_\lambda(\Delta t) - 1)$. Provided we can justify that the Fourier integral commutes with the limit, the last operation is multiplication by $\chi'_\lambda(0) = 2\pi i \lambda$.
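Corollary 15.35. If $f^{(k)} \in L_1(\mathbb{R})$ then
\[ |\hat{f}(\lambda)| = \frac{\left| (f^{(k)})\hat{\ }(\lambda) \right|}{|2\pi\lambda|^k}, \qquad \text{where } (f^{(k)})\hat{\ }(\lambda) \to 0 \text{ as } \lambda \to \pm\infty, \]
that is $\hat{f}$ decreases at infinity faster than $|\lambda|^{-k}$.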
Definition 15.36. We say that a function $f$ is rapidly decreasing if $\lim_{x \to \pm\infty} \left| x^k f(x) \right| = 0$ for any $k \in \mathbb{N}$.
The collection of all smooth rapidly decreasing functions is called the Schwartz space and is denoted by $\mathcal{S}$.
Example 15.37. An example of a rapidly decreasing function is the Gaussian $e^{-\pi x^2}$. Find its Fourier transform using the classical integral $\int_{\mathbb{R}} e^{-\pi x^2} \, dx = 1$, cf. Corollary 15.40.
Lemma 15.38. Let $f(x)$ and $x f(x)$ both be in $L_1(\mathbb{R})$, then $\hat{f}$ is differentiable and
\[ \hat{f}' = (-2\pi i x f)\hat{\ }. \]
More generally
\[ \hat{f}^{(k)} = ((-2\pi i x)^k f)\hat{\ }. \tag{15.14} \]
Proof. There are several strategies to prove this result, all having their own merits:
(i) The most straightforward uses differentiation under the integration sign.
(ii) We can use the intertwining property (15.10) of the Fourier integral and the connection of the derivative with shifts (15.13).
(iii) Using the inverse Fourier integral (see below), we regard this Lemma as the dual to Lemma 15.34.

Corollary 15.39. The Fourier transform of a smooth rapidly decreasing function is a smooth rapidly decreasing function.
Corollary 15.40. The Fourier integral of the Gaussian $e^{-\pi x^2}$ is $e^{-\pi \lambda^2}$.
Proof. [2] Note that the Gaussian $g(x) = e^{-\pi x^2}$ is the unique (up to a factor) solution of the equation $g' + 2\pi x g = 0$. Then, by Lemmas 15.34 and 15.38, its Fourier transform shall satisfy the equation $2\pi i \lambda \hat{g} + i \hat{g}' = 0$. Thus $\hat{g} = c \cdot e^{-\pi \lambda^2}$ with a constant factor $c$; its value 1 can be found from the classical integral $\int_{\mathbb{R}} e^{-\pi x^2} \, dx = 1$ which represents $\hat{g}(0)$.
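The statement of Corollary 15.40 is easy to test numerically. The following Python sketch approximates the Fourier integral (15.11) by the trapezoid rule on a truncated interval; the truncation width and the number of steps are arbitrary choices made here, not values from the notes.

```python
import cmath
import math

def fourier_integral(f, lam, half_width=8.0, steps=16000):
    # Trapezoid approximation of the integral of f(x) exp(-2*pi*i*lam*x)
    # over [-half_width, half_width].
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-2j * math.pi * lam * x)
    return total * h

def gaussian(x):
    return math.exp(-math.pi * x * x)

errors = [
    abs(fourier_integral(gaussian, lam) - math.exp(-math.pi * lam * lam))
    for lam in (0.0, 0.5, 1.0, 1.5)
]
print(max(errors))
```

For such a rapidly decreasing integrand the trapezoid rule is very accurate, so the printed discrepancy is at the level of rounding errors.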
The relations (15.12) and (15.14) allow us to reduce many partial differential equations to algebraic ones; to finish the solution we need the inverse of the Fourier transform.
Definition 15.41. We define the inverse Fourier transform on $L_1(\mathbb{R})$:
\[ \check{f}(\lambda) = \int_{\mathbb{R}} f(x) \, e^{2\pi i \lambda x} \, dx. \tag{15.15} \]
We can notice the formal correspondence $\check{f}(\lambda) = \hat{f}(-\lambda) = \overline{\hat{\bar{f}}(\lambda)}$, which is a manifestation of the group duality $\hat{\mathbb{R}} = \mathbb{R}$ for the real line. This immediately generates analogous results from Lem. 15.32 to Cor. 15.40 for the inverse Fourier transform.
Theorem 15.42. The Fourier integral and the inverse Fourier transform are inverse maps. That is, if $g = \hat{f}$ then $f = \check{g}$.
Sketch of a proof. The exact meaning of the statement depends on the spaces which we consider as the domain and the range. Various variants and their proofs can be found in the literature. For example, in [3, IV.2.3], it is proven for the Schwartz space $\mathcal{S}$ of smooth rapidly decreasing functions.
The outline of the proof is as follows. Using the intertwining relations (15.12) and (15.14), we conclude that the composition of the Fourier integral and the inverse Fourier transform commutes both with the operator of multiplication by $x$ and with differentiation. Then we need a result that any operator commuting with multiplication by $x$ is an operator of multiplication by a function $f$. For this function, the commutation with differentiation implies $f' = 0$, that is $f = \mathrm{const}$. The value of this constant can be evaluated by a Fourier transform on a single function, say the Gaussian $e^{-\pi x^2}$ from Cor. 15.40.
The above Theorem states that the Fourier integral is an invertible map. For the Hilbert space $L_2(\mathbb{R})$ we can show a stronger property: its unitarity.
Theorem 15.43 (Plancherel identity). The Fourier transform extends uniquely to a unitary map $L_2(\mathbb{R}) \to L_2(\mathbb{R})$:
\[ \int_{\mathbb{R}} |f|^2 \, dx = \int_{\mathbb{R}} \left| \hat{f} \right|^2 d\lambda. \tag{15.16} \]
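Proof. The proof will be done in three steps: first we establish the identity for smooth rapidly decreasing functions, then for $L_2$ functions with compact support and finally for any $L_2$ function.
(i) Let $f_1, f_2 \in \mathcal{S}$ be smooth rapidly decreasing functions and $g_1$ and $g_2$ be their Fourier transforms. Then (using Fubini's Thm. 13.50):
\begin{align*}
\int_{\mathbb{R}} f_1(t) \, \bar{f}_2(t) \, dt &= \int_{\mathbb{R}} \int_{\mathbb{R}} g_1(\lambda) \, e^{2\pi i \lambda t} \, d\lambda \, \bar{f}_2(t) \, dt \\
&= \int_{\mathbb{R}} g_1(\lambda) \int_{\mathbb{R}} e^{2\pi i \lambda t} \, \bar{f}_2(t) \, dt \, d\lambda \\
&= \int_{\mathbb{R}} g_1(\lambda) \, \bar{g}_2(\lambda) \, d\lambda.
\end{align*}
Putting $f_1 = f_2 = f$ (and therefore $g_1 = g_2 = g$) we get the identity.
The same identity (15.16) can be obtained from the property $(f_1 f_2)\hat{\ } = \hat{f}_1 * \hat{f}_2$, cf. (15.8), or explicitly:
\[ \int_{\mathbb{R}} f_1(x) \, f_2(x) \, e^{-2\pi i \lambda x} \, dx = \int_{\mathbb{R}} \hat{f}_1(t) \, \hat{f}_2(\lambda - t) \, dt. \]
Now, substitute $\lambda = 0$ and $f_2 = \bar{f}_1$ (with its corollary $\hat{f}_2(t) = \overline{\hat{f}_1(-t)}$) and obtain (15.16).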
(ii) Next let $f \in L_2(\mathbb{R})$ with a support in $(-a, a)$; then $f \in L_1(\mathbb{R})$ as well, thus the Fourier transform is well-defined. Let $f_n \in \mathcal{S}$ be a sequence with support in $(-a, a)$ which converges to $f$ in $L_2$ and thus in $L_1$. The Fourier transforms $g_n$ converge to $g$ uniformly and form a Cauchy sequence in $L_2$ due to the above identity. Thus $g_n \to g$ in $L_2$ and we can extend the Plancherel identity by continuity to $L_2$ functions with compact support.
(iii) The final bit is done for a general $f \in L_2$ by means of the sequence
\[ f_n(x) = \begin{cases} f(x), & \text{if } |x| < n, \\ 0, & \text{otherwise;} \end{cases} \]
of truncations to the interval $(-n, n)$. For $f_n$ the Plancherel identity is established above, and $f_n \to f$ in $L_2(\mathbb{R})$. We also build their Fourier images $g_n$ and see that this is a Cauchy sequence in $L_2(\mathbb{R})$, so $g_n \to g$.
If $f \in L_1 \cap L_2$ then the above $g$ coincides with the ordinary Fourier transform on $L_1$.
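The Plancherel identity (15.16) can be illustrated numerically for a concrete rapidly decreasing function. The Python sketch below uses the hypothetical test function $f(x) = (1 + x) e^{-\pi x^2}$ and approximates both the Fourier integral and the two $L_2$ norms by the trapezoid rule; the grids are arbitrary choices made for this illustration.

```python
import cmath
import math

def trapezoid(values, step):
    # Trapezoid rule for equally spaced samples.
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def grid(half_width, steps):
    # Uniform grid on [-half_width, half_width] together with its step.
    h = 2.0 * half_width / steps
    return [-half_width + k * h for k in range(steps + 1)], h

def f(x):
    # Hypothetical test function: smooth and rapidly decreasing.
    return (1.0 + x) * math.exp(-math.pi * x * x)

xs, hx = grid(5.0, 800)
lams, hl = grid(4.0, 400)
fx = [f(x) for x in xs]

def fourier_integral(lam):
    # Approximation of (15.11) at the point lam.
    samples = [v * cmath.exp(-2j * math.pi * lam * x) for v, x in zip(fx, xs)]
    return trapezoid(samples, hx)

lhs = trapezoid([abs(v) ** 2 for v in fx], hx)
rhs = trapezoid([abs(fourier_integral(lam)) ** 2 for lam in lams], hl)
print(lhs, rhs)
```

Both numbers approximate $(1 + \frac{1}{4\pi})/\sqrt{2}$, the exact value of $\int_{\mathbb{R}} |f|^2 \, dx$ for this test function.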
Proofs of the following statements are not examinable Thms. 12.21, 12.34, 13.53,
14.26, 15.42, Props. 14.12, 14.13.
APPENDIX A. TUTORIAL PROBLEMS
These are tutorial problems intended for self-assessment of the course under-
standing.
A.1. Tutorial problems I. All spaces are complex, unless otherwise specified.
A.1. Show that $\|f\| = |f(0)| + \sup |f'(t)|$ defines a norm on $C^1[0, 1]$, which is the space of (real) functions on $[0, 1]$ with continuous derivative.
A.2. Show that the formula $\langle (x_n), (y_n) \rangle = \sum_{n=1}^{\infty} x_n \bar{y}_n / n^2$ defines an inner product on $\ell_\infty$, the space of bounded (complex) sequences. What norm does it produce?
A.3. Use the Cauchy-Schwarz inequality for a suitable inner product to prove that for all $f \in C[0, 1]$ the inequality
\[ \left| \int_0^1 f(x) \, x \, dx \right| \le C \left( \int_0^1 |f(x)|^2 \, dx \right)^{1/2} \]
holds for some constant $C > 0$ (independent of $f$) and find the smallest possible $C$ that holds for all functions $f$ (hint: consider the cases of equality).
A.4. We define the following norm on $\ell_\infty$, the space of bounded complex sequences:
\[ \|(x_n)\|_\infty = \sup_{n \ge 1} |x_n|. \]
Show that this norm makes $\ell_\infty$ into a Banach space (i.e., a complete normed space).
A.5. Fix a vector $(w_1, \ldots, w_n)$ whose components are strictly positive real numbers, and define an inner product on $\mathbb{C}^n$ by
\[ \langle x, y \rangle = \sum_{k=1}^{n} w_k x_k \bar{y}_k. \]
Show that this makes $\mathbb{C}^n$ into a Hilbert space (i.e., a complete inner-product space).
A.2. Tutorial problems II.
A.6. Show that the supremum norm on $C[0, 1]$ isn't given by an inner product, by finding a counterexample to the parallelogram law.
A.7. In $\ell_2$ let $e_1 = (1, 0, 0, \ldots)$, $e_2 = (0, 1, 0, 0, \ldots)$, $e_3 = (0, 0, 1, 0, 0, \ldots)$, and so on. Show that $\mathrm{Lin}(e_1, e_2, \ldots) = c_{00}$, and that $\mathrm{CLin}(e_1, e_2, \ldots) = \ell_2$. What is $\mathrm{CLin}(e_2, e_3, \ldots)$?
A.8. Let $C[-1, 1]$ have the standard $L_2$ inner product, defined by
\[ \langle f, g \rangle = \int_{-1}^{1} f(t) \, \overline{g(t)} \, dt. \]
Show that the functions $1$, $t$ and $t^2 - 1/3$ form an orthogonal (not orthonormal!) basis for the subspace $P_2$ of polynomials of degree at most 2 and hence calculate the best $L_2$-approximation of the function $t^4$ by polynomials in $P_2$.
A.9. Define an inner product on $C[0, 1]$ by
\[ \langle f, g \rangle = \int_0^1 \sqrt{t} \, f(t) \, \overline{g(t)} \, dt. \]
Use the Gram-Schmidt process to find the first 2 terms of an orthonormal sequence formed by orthonormalising the sequence $1, t, t^2, \ldots$.
A.10. Consider the plane $P$ in $\mathbb{C}^4$ (usual inner product) spanned by the vectors $(1, 1, 0, 0)$ and $(1, 0, 0, 1)$. Find orthonormal bases for $P$ and $P^\perp$, and verify directly that $(P^\perp)^\perp = P$.
A.3. Tutorial Problems III.
A.11. Let $a$ and $b$ be arbitrary real numbers with $a < b$. By using the fact that the functions $\frac{1}{\sqrt{2\pi}} e^{inx}$, $n \in \mathbb{Z}$, are orthonormal in $L_2[0, 2\pi]$, together with the change of variable $x = 2\pi(t - a)/(b - a)$, find an orthonormal basis in $L_2[a, b]$ of the form $e_n(t) = \alpha e^{in\lambda t}$, $n \in \mathbb{Z}$, for suitable real constants $\alpha$ and $\lambda$.
A.12. For which real values of $\alpha$ is
\[ \sum_{n=1}^{\infty} n^\alpha e^{int} \]
the Fourier series of a function in $L_2[-\pi, \pi]$?
A.13. Calculate the Fourier series of $f(t) = e^t$ on $[-\pi, \pi]$ and use Parseval's identity to deduce that
\[ \sum_{n=-\infty}^{\infty} \frac{1}{n^2 + 1} = \frac{\pi}{\tanh \pi}. \]
A.14. Using the fact that $(e_n)$ is a complete orthonormal system in $L_2[-\pi, \pi]$, where $e_n(t) = \exp(int)/\sqrt{2\pi}$, show that $e_0, s_1, c_1, s_2, c_2, \ldots$ is a complete orthonormal system, where $s_n(t) = \sin nt/\sqrt{\pi}$ and $c_n(t) = \cos nt/\sqrt{\pi}$. Show that every $L_2[-\pi, \pi]$ function $f$ has a Fourier series
\[ a_0 + \sum_{n=1}^{\infty} a_n \cos nt + b_n \sin nt, \]
converging in the $L_2$ sense, and give a formula for the coefficients.
A.15. Let $C(\mathbb{T})$ be the space of continuous (complex) functions on the circle $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ with the supremum norm. Show that, for any polynomial $f(z)$ in $C(\mathbb{T})$
\[ \int_{|z|=1} f(z) \, dz = 0. \]
Deduce that the function $f(z) = \bar{z}$ is not the uniform limit of polynomials on the circle (i.e., Weierstrass's approximation theorem doesn't hold in this form).
A.4. Tutorial Problems IV.
A.16. Define a linear functional $\phi$ on $C[0, 1]$ (continuous functions on $[0, 1]$) by $\phi(f) = f(1/2)$. Show that $\phi$ is bounded if we give $C[0, 1]$ the supremum norm. Show that $\phi$ is not bounded if we use the $L_2$ norm, because we can find a sequence $(f_n)$ of continuous functions on $[0, 1]$ such that $\|f_n\|_2 \le 1$, but $f_n(1/2) \to \infty$.
A.17. The Hardy space $H_2$ is the Hilbert space of all power series $f(z) = \sum_{n=0}^{\infty} a_n z^n$, such that $\sum_{n=0}^{\infty} |a_n|^2 < \infty$, where the inner product is given by
\[ \left\langle \sum_{n=0}^{\infty} a_n z^n, \sum_{n=0}^{\infty} b_n z^n \right\rangle = \sum_{n=0}^{\infty} a_n \bar{b}_n. \]
Show that the sequence $1, z, z^2, z^3, \ldots$ is an orthonormal basis for $H_2$.
Fix $w$ with $|w| < 1$ and define a linear functional $\phi$ on $H_2$ by $\phi(f) = f(w)$. Write down a formula for the function $g(z) \in H_2$ such that $\phi(f) = \langle f, g \rangle$. What is $\|\phi\|$?
A.18. The Volterra operator $V : L_2[0, 1] \to L_2[0, 1]$ is defined by
\[ (Vf)(x) = \int_0^x f(t) \, dt. \]
Use the Cauchy-Schwarz inequality to show that $|(Vf)(x)| \le \sqrt{x} \, \|f\|_2$ (hint: write $(Vf)(x) = \langle f, J_x \rangle$ where $J_x$ is a function that you can write down explicitly).
Deduce that $\|Vf\|_2^2 \le \frac{1}{2} \|f\|_2^2$, and hence $\|V\| \le 1/\sqrt{2}$.
A.19. Find the adjoints of the following operators:
(i) $A : \ell_2 \to \ell_2$, defined by $A(x_1, x_2, \ldots) = (0, \frac{x_1}{1}, \frac{x_2}{2}, \frac{x_3}{3}, \ldots)$;
and, on a general Hilbert space $H$:
(ii) The rank-one operator $R$, defined by $Rx = \langle x, y \rangle z$, where $y$ and $z$ are fixed elements of $H$;
(iii) The projection operator $P_M$, defined by $P_M(m + n) = m$, where $m \in M$ and $n \in M^\perp$, and $H = M \oplus M^\perp$ as usual.
A.20. Let $U \in B(H)$ be a unitary operator. Show that $(Ue_n)$ is an orthonormal basis of $H$ whenever $(e_n)$ is.
Let $\ell_2(\mathbb{Z})$ denote the Hilbert space of two-sided sequences $(a_n)_{n=-\infty}^{\infty}$ with
\[ \|(a_n)\|^2 = \sum_{n=-\infty}^{\infty} |a_n|^2 < \infty. \]
Show that the bilateral right shift, $V : \ell_2(\mathbb{Z}) \to \ell_2(\mathbb{Z})$ defined by $V((a_n)) = (b_n)$, where $b_n = a_{n-1}$ for all $n \in \mathbb{Z}$, is unitary, whereas the usual right shift $S$ on $\ell_2 = \ell_2(\mathbb{N})$ is not unitary.
A.5. Tutorial Problems V.
A.21. Let $f \in C[-\pi, \pi]$ and let $M_f$ be the multiplication operator on $L_2(-\pi, \pi)$, given by $(M_f g)(t) = f(t) g(t)$, for $g \in L_2(-\pi, \pi)$. Find a function $\tilde{f} \in C[-\pi, \pi]$ such that $M_f^* = M_{\tilde{f}}$.
Show that $M_f$ is always a normal operator. When is it Hermitian? When is it unitary?
A.22. Let $T$ be any operator such that $T^n = 0$ for some integer $n$ (such operators are called nilpotent). Show that $I - T$ is invertible (hint: consider $I + T + T^2 + \ldots + T^{n-1}$).
Deduce that $I - T/\lambda$ is invertible for any $\lambda \ne 0$.
What is $\sigma(T)$? What is $r(T)$?
A.23. Let $(\lambda_n)$ be a fixed bounded sequence of complex numbers, and define an operator on $\ell_2$ by $T((x_n)) = ((y_n))$, where $y_n = \lambda_n x_n$ for each $n$. Recall that $T$ is a bounded operator and $\|T\| = \|(\lambda_n)\|_\infty$. Let $\Lambda = \{\lambda_1, \lambda_2, \ldots\}$. Prove the following:
(i) Each $\lambda_k$ is an eigenvalue of $T$, and hence is in $\sigma(T)$.
(ii) If $\lambda \notin \bar{\Lambda}$, then the inverse of $T - \lambda I$ exists (and is bounded).
Deduce that $\sigma(T) = \bar{\Lambda}$. Note that then any non-empty compact set could be the spectrum of some bounded operator.
A.24. Let $S$ be an isomorphism between Hilbert spaces $H$ and $K$, that is, $S : H \to K$ is a linear bijection such that $S$ and $S^{-1}$ are bounded operators. Suppose that $T \in B(H)$. Show that $T$ and $STS^{-1}$ have the same spectrum and the same eigenvalues (if any).
A.25. Define an operator $U : \ell_2(\mathbb{Z}) \to L_2(-\pi, \pi)$ by $U((a_n)) = \sum_{n=-\infty}^{\infty} a_n e^{int}/\sqrt{2\pi}$. Show that $U$ is a bijection and an isometry, i.e., that $\|Ux\| = \|x\|$ for all $x \in \ell_2(\mathbb{Z})$.
Let $V$ be the bilateral right shift on $\ell_2(\mathbb{Z})$, the unitary operator defined in Question A.20. Let $f \in L_2(-\pi, \pi)$. Show that $(UVU^{-1}f)(t) = e^{it} f(t)$, and hence, using Question A.24, show that $\sigma(V) = \mathbb{T}$, the unit circle, but that $V$ has no eigenvalues.
A.6. Tutorial Problems VI.
A.26. Show that $K(X)$ is a closed linear subspace of $B(X)$, and that $AT$ and $TA$ are compact whenever $T \in K(X)$ and $A \in B(X)$. (This means that $K(X)$ is a closed ideal of $B(X)$.)
A.27. Let $A$ be a Hilbert-Schmidt operator, and let $(e_n)_{n \ge 1}$ and $(f_m)_{m \ge 1}$ be orthonormal bases of the underlying Hilbert space. By writing each $Ae_n$ as $Ae_n = \sum_{m=1}^{\infty} \langle Ae_n, f_m \rangle f_m$, show that
\[ \sum_{n=1}^{\infty} \|Ae_n\|^2 = \sum_{m=1}^{\infty} \|A^* f_m\|^2. \]
Deduce that the quantity $\|A\|_{HS}^2 = \sum_{n=1}^{\infty} \|Ae_n\|^2$ is independent of the choice of orthonormal basis, and that $\|A\|_{HS} = \|A^*\|_{HS}$. ($\|A\|_{HS}$ is called the Hilbert-Schmidt norm of $A$.)
A.28. (i) Let $T \in K(H)$ be a compact operator. Using Question A.26, show that $T^*T$ and $TT^*$ are compact Hermitian operators.
(ii) Let $(e_n)_{n \ge 1}$ and $(f_n)_{n \ge 1}$ be orthonormal bases of a Hilbert space $H$, let $(\alpha_n)_{n \ge 1}$ be any bounded complex sequence, and let $T \in B(H)$ be an operator defined by
\[ Tx = \sum_{n=1}^{\infty} \alpha_n \langle x, e_n \rangle f_n. \]
Prove that $T$ is Hilbert-Schmidt precisely when $(\alpha_n) \in \ell_2$. Show that $T$ is a compact operator if and only if $\alpha_n \to 0$, and in this case write down spectral decompositions for the compact Hermitian operators $T^*T$ and $TT^*$.
A.29. Solve the Fredholm integral equation $\phi - \lambda T\phi = f$, where $f(x) = x$ and
\[ (T\phi)(x) = \int_0^1 x y^2 \phi(y) \, dy \qquad (\phi \in L_2(0, 1)), \]
for small values of $\lambda$ by means of the Neumann series.
For what values of $\lambda$ does the series converge? Write down a solution which is valid for all $\lambda$ apart from one exception. What is the exception?
A.30. Suppose that $h$ is a $2\pi$-periodic $L_2(-\pi, \pi)$ function with Fourier series $\sum_{n=-\infty}^{\infty} a_n e^{int}$. Show that each of the functions $\phi_k(y) = e^{iky}$, $k \in \mathbb{Z}$, is an eigenvector of the integral operator $T$ on $L_2(-\pi, \pi)$ defined by
\[ (T\phi)(x) = \int_{-\pi}^{\pi} h(x - y) \phi(y) \, dy, \]
and calculate the corresponding eigenvalues.
Now let $h(t) = -\log(2(1 - \cos t))$. Assuming, without proof, that $h(t)$ has the Fourier series $\sum_{n \in \mathbb{Z}, n \ne 0} e^{int}/|n|$, use the Hilbert-Schmidt method to solve the Fredholm equation $\phi - \lambda T\phi = f$, where $f(t)$ has Fourier series $\sum_{n=-\infty}^{\infty} c_n e^{int}$ and $1/\lambda \notin \sigma(T)$.
A.7. Tutorial Problems VII.
A.31. Use the Gram–Schmidt algorithm to find an orthonormal basis for the subspace $X$ of $L_2(-1,1)$ spanned by the functions $t$, $t^2$ and $t^4$.
Hence find the best $L_2(-1,1)$ approximation of the constant function $f(t) = 1$ by functions from $X$.
A.32. For $n = 1, 2, \ldots$ let $\phi_n$ denote the linear functional on $\ell_2$ defined by
\[
\phi_n(x) = x_1 + x_2 + \ldots + x_n,
\]
where $x = (x_1, x_2, \ldots) \in \ell_2$. Use the Riesz–Fréchet theorem to calculate $\|\phi_n\|$.
A.33. Let $T$ be a bounded linear operator on a Hilbert space, and suppose that $T = A + iB$, where $A$ and $B$ are self-adjoint operators. Express $T^*$ in terms of $A$ and $B$, and hence solve for $A$ and $B$ in terms of $T$ and $T^*$.
Deduce that every operator $T$ can be written $T = A + iB$, where $A$ and $B$ are self-adjoint, in a unique way.
Show that $T$ is normal if and only if $AB = BA$.
A.34. Let $P_n$ be the subspace of $L_2(-\pi,\pi)$ consisting of all polynomials of degree at most $n$, and let $T_n$ be the subspace consisting of all trigonometric polynomials of the form $f(t) = \sum_{k=-n}^{n} a_k e^{ikt}$. Calculate the spectrum of the differentiation operator $D$, defined by $(Df)(t) = f'(t)$, when
(i) $D$ is regarded as an operator on $P_n$, and
(ii) $D$ is regarded as an operator on $T_n$.
Note that both $P_n$ and $T_n$ are finite-dimensional Hilbert spaces.
Show that $T_n$ has an orthonormal basis of eigenvectors of $D$, whereas $P_n$ does not.
A.35. Use the Neumann series to solve the Volterra integral equation $\phi - \lambda T\phi = f$ in $L_2[0,1]$, where $\lambda \in \mathbb{C}$, $f(t) = 1$ for all $t$, and $(T\phi)(x) = \int_0^x t^2 \phi(t)\,dt$. (You should be able to sum the infinite series.)
APPENDIX B. SOLUTIONS OF TUTORIAL PROBLEMS
Solutions of the tutorial problems will be distributed in due course on paper.
APPENDIX C. COURSE IN THE NUTSHELL
C.1. Some useful results and formulae (1).
C.1. A norm on a vector space, $\|x\|$, satisfies $\|x\| \geq 0$, $\|x\| = 0$ if and only if $x = 0$, $\|\lambda x\| = |\lambda|\,\|x\|$, and $\|x + y\| \leq \|x\| + \|y\|$ (triangle inequality). A norm defines a metric and a complete normed space is called a Banach space.
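C.2. An inner-product space is a vector space (usually complex) with a scalar product on it, $\langle x, y\rangle \in \mathbb{C}$ such that $\langle x, y\rangle = \overline{\langle y, x\rangle}$, $\langle \lambda x, y\rangle = \lambda\langle x, y\rangle$, $\langle x+y, z\rangle = \langle x, z\rangle + \langle y, z\rangle$, $\langle x, x\rangle \geq 0$ and $\langle x, x\rangle = 0$ if and only if $x = 0$. This defines a norm by $\|x\|^2 = \langle x, x\rangle$.
A complete inner-product space is called a Hilbert space. A Hilbert space is automatically a Banach space.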
C.3. The Cauchy–Schwarz inequality. $|\langle x, y\rangle| \leq \|x\|\,\|y\|$ with equality if and only if $x$ and $y$ are linearly dependent.
C.4. Some examples of Hilbert spaces. (i) Euclidean $\mathbb{C}^n$. (ii) $\ell_2$, sequences $(a_k)$ with $\|(a_k)\|_2^2 = \sum |a_k|^2 < \infty$. In both cases $\langle (a_k), (b_k)\rangle = \sum a_k \overline{b_k}$. (iii) $L_2[a,b]$, functions on $[a,b]$ with $\|f\|_2^2 = \int_a^b |f(t)|^2\,dt < \infty$. Here $\langle f, g\rangle = \int_a^b f(t)\overline{g(t)}\,dt$. (iv) Any closed subspace of a Hilbert space.
C.5. Other examples of Banach spaces. (i) $C_b(X)$, continuous bounded functions on a topological space $X$. (ii) $\ell_\infty(X)$, all bounded functions on a set $X$. The supremum norms on $C_b(X)$ and $\ell_\infty(X)$ make them into Banach spaces. (iii) Any closed subspace of a Banach space.
C.6. On incomplete spaces. The inner-product ($L_2$) norm on $C[0,1]$ is incomplete. $c_{00}$ (sequences eventually zero), with the $\ell_2$ norm, is another incomplete i.p.s.
C.7. The parallelogram identity. $\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2$ in an inner-product space. Not in general normed spaces.
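The remark that the identity fails in general normed spaces can be checked on a small example. The sketch below is an illustration added here, not part of the original notes: it compares the Euclidean norm with the supremum norm on $\mathbb{R}^2$, and the helper names `norm_2`, `norm_sup` and `parallelogram_defect` are hypothetical.

```python
import math

def norm_2(v):
    """Euclidean (inner-product) norm."""
    return math.sqrt(sum(abs(a) ** 2 for a in v))

def norm_sup(v):
    """Supremum norm, which does not come from an inner product."""
    return max(abs(a) for a in v)

def parallelogram_defect(norm, x, y):
    """Return ||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 for the given norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

x, y = [1.0, 0.0], [0.0, 1.0]
defect_2 = parallelogram_defect(norm_2, x, y)      # zero up to rounding
defect_sup = parallelogram_defect(norm_sup, x, y)  # 1 + 1 - 2 - 2 = -2
```

A nonzero defect for one pair of vectors is enough to show that the supremum norm is not induced by any inner product.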
C.8. On subspaces. Complete $\Rightarrow$ closed. The closure of a linear subspace is still a linear subspace. $\mathrm{Lin}(A)$ is the smallest subspace containing $A$ and $\mathrm{CLin}(A)$ is its closure, the smallest closed subspace containing $A$.
C.9. From now on we work in inner-product spaces.
C.10. The orthogonality. $x \perp y$ if $\langle x, y\rangle = 0$. An orthogonal sequence has $\langle e_n, e_m\rangle = 0$ for $n \neq m$. If all the vectors have norm 1 it is an orthonormal sequence (o.n.s.), e.g. $e_n = (0, \ldots, 0, 1, 0, 0, \ldots) \in \ell_2$ and $e_n(t) = (1/\sqrt{2\pi})\,e^{int}$ in $L_2(-\pi,\pi)$.
C.11. Pythagoras's theorem: if $x \perp y$ then $\|x+y\|^2 = \|x\|^2 + \|y\|^2$.
C.12. The best approximation to $x$ by a linear combination $\sum_{k=1}^{n} \lambda_k e_k$ is $\sum_{k=1}^{n} \langle x, e_k\rangle e_k$ if the $e_k$ are orthonormal. Note that $\langle x, e_k\rangle$ is the Fourier coefficient of $x$ w.r.t. $e_k$.
C.13. Bessel's inequality. $\|x\|^2 \geq \sum_{k=1}^{n} |\langle x, e_k\rangle|^2$ if $e_1, \ldots, e_n$ is an o.n.s.
C.14. Riesz–Fischer theorem. For an o.n.s. $(e_n)$ in a Hilbert space, $\sum \lambda_n e_n$ converges if and only if $\sum |\lambda_n|^2 < \infty$; then $\|\sum \lambda_n e_n\|^2 = \sum |\lambda_n|^2$.
C.15. A complete o.n.s. or orthonormal basis (o.n.b.) is an o.n.s. $(e_n)$ such that if $\langle y, e_n\rangle = 0$ for all $n$ then $y = 0$. In that case every vector is of the form $\sum \lambda_n e_n$ as in the R-F theorem. Equivalently: the closed linear span of the $(e_n)$ is the whole space.
C.16. Gram–Schmidt orthonormalization process. Start with $x_1, x_2, \ldots$ linearly independent. Construct $e_1, e_2, \ldots$ an o.n.s. by inductively setting $y_{n+1} = x_{n+1} - \sum_{k=1}^{n} \langle x_{n+1}, e_k\rangle e_k$ and then normalizing $e_{n+1} = y_{n+1}/\|y_{n+1}\|$.
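The inductive step above can be sketched in a few lines. This is a minimal illustration in $\mathbb{C}^n$ with the standard inner product, not code from the notes; the names `inner` and `gram_schmidt` and the tolerance `1e-12` are assumptions made for the example.

```python
import math

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Orthonormalise linearly independent vectors x_1, x_2, ... in C^n."""
    basis = []
    for x in vectors:
        # y_{n+1} = x_{n+1} - sum_k <x_{n+1}, e_k> e_k
        y = list(x)
        for e in basis:
            c = inner(x, e)
            y = [yi - c * ei for yi, ei in zip(y, e)]
        norm = math.sqrt(inner(y, y).real)
        if norm < 1e-12:  # assumed tolerance for detecting linear dependence
            raise ValueError("vectors are not linearly independent")
        # e_{n+1} = y_{n+1} / ||y_{n+1}||
        basis.append([yi / norm for yi in y])
    return basis

es = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
```

For function spaces such as $L_2(-1,1)$ the same loop applies once `inner` is replaced by the corresponding integral.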
C.17. On orthogonal complements. $M^\perp$ is the set of all vectors orthogonal to everything in $M$. If $M$ is a closed linear subspace of a Hilbert space $H$ then $H = M \oplus M^\perp$. There is also a linear map, $P_M$, the projection from $H$ onto $M$ with kernel $M^\perp$.
C.18. Fourier series. Work in $L_2(-\pi,\pi)$ with o.n.s. $e_n(t) = (1/\sqrt{2\pi})\,e^{int}$. Let $CP(-\pi,\pi)$ be the continuous periodic functions, which are dense in $L_2$. For $f \in CP(-\pi,\pi)$ write $f_m = \sum_{n=-m}^{m} \langle f, e_n\rangle e_n$, $m \geq 0$. We wish to show that $\|f_m - f\|_2 \to 0$, i.e., that $(e_n)$ is an o.n.b.
C.19. The Fejér kernel. For $f \in CP(-\pi,\pi)$ write $F_m = (f_0 + \ldots + f_m)/(m+1)$. Then $F_m(x) = (1/2\pi)\int_{-\pi}^{\pi} f(t) K_m(x-t)\,dt$ where $K_m(t) = (1/(m+1)) \sum_{k=0}^{m} \sum_{n=-k}^{k} e^{int}$ is the Fejér kernel. Also $K_m(t) = (1/(m+1))\,[\sin^2((m+1)t/2)]/[\sin^2(t/2)]$.
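The two expressions for the Fejér kernel can be compared numerically. The following sketch is added for illustration only; the function names are hypothetical. It evaluates the double sum and the closed form at a point $t \neq 0$.

```python
import cmath
import math

def fejer_kernel_sum(m, t):
    """K_m(t) = (1/(m+1)) * sum_{k=0}^{m} sum_{n=-k}^{k} e^{int}."""
    total = 0
    for k in range(m + 1):
        for n in range(-k, k + 1):
            total += cmath.exp(1j * n * t)
    return (total / (m + 1)).real  # imaginary parts cancel in pairs n, -n

def fejer_kernel_closed(m, t):
    """K_m(t) = sin^2((m+1)t/2) / ((m+1) sin^2(t/2)), for t not a multiple of 2*pi."""
    return math.sin((m + 1) * t / 2) ** 2 / ((m + 1) * math.sin(t / 2) ** 2)

difference = abs(fejer_kernel_sum(5, 0.7) - fejer_kernel_closed(5, 0.7))
```

The closed form also makes it visible that $K_m(t) \geq 0$, which is the property the Dirichlet partial-sum kernel lacks.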
C.20. Fejér's theorem. If $f \in CP(-\pi,\pi)$ then its Fejér sums tend uniformly to $f$ on $[-\pi,\pi]$ and hence in $L_2$ norm also. Hence $\mathrm{CLin}((e_n)) \supseteq CP(-\pi,\pi)$ so must be all of $L_2(-\pi,\pi)$. Thus $(e_n)$ is an o.n.b.
C.21. Corollary. If $f \in L_2(-\pi,\pi)$ then $f(t) = \sum c_n e^{int}$ with convergence in $L_2$, where $c_n = (1/2\pi)\int_{-\pi}^{\pi} f(t) e^{-int}\,dt$.
C.22. Parseval's formula. If $f, g \in L_2(-\pi,\pi)$ have Fourier series $\sum c_n e^{int}$ and $\sum d_n e^{int}$ then $(1/2\pi)\langle f, g\rangle = \sum c_n \overline{d_n}$.
C.23. Weierstrass approximation theorem. The polynomials are dense in C[a, b] for
any a < b (in the supremum norm).
C.2. Some useful results and formulae (2).
C.24. On dual spaces. A linear functional on a vector space $X$ is a linear mapping $\phi\colon X \to \mathbb{C}$ (or to $\mathbb{R}$ in the real case), i.e., $\phi(ax + by) = a\phi(x) + b\phi(y)$. When $X$ is a normed space, $\phi$ is continuous if and only if it is bounded, i.e., $\sup\{|\phi(x)| : \|x\| \leq 1\} < \infty$. Then we define $\|\phi\|$ to be this sup, and it is a norm on the space $X^*$ of bounded linear functionals, making $X^*$ into a Banach space.
C.25. Riesz–Fréchet theorem. If $\phi\colon H \to \mathbb{C}$ is a bounded linear functional on a Hilbert space $H$, then there is a unique $y \in H$ such that $\phi(x) = \langle x, y\rangle$ for all $x \in H$; also $\|\phi\| = \|y\|$.
C.26. On linear operators. These are linear mappings $T\colon X \to Y$ between normed spaces. Defining $\|T\| = \sup\{\|T(x)\| : \|x\| \leq 1\}$, when finite, makes the bounded (i.e., continuous) operators into a normed space, $B(X, Y)$. When $Y$ is complete, so is $B(X, Y)$. We get $\|Tx\| \leq \|T\|\,\|x\|$, and, when we can compose operators, $\|ST\| \leq \|S\|\,\|T\|$. Write $B(X)$ for $B(X, X)$, and for $T \in B(X)$, $\|T^n\| \leq \|T\|^n$. Inverse $S = T^{-1}$ when $ST = TS = I$.
C.27. On adjoints. $T \in B(H, K)$ determines $T^* \in B(K, H)$ such that $\langle Th, k\rangle_K = \langle h, T^*k\rangle_H$ for all $h \in H$, $k \in K$. Also $\|T^*\| = \|T\|$ and $T^{**} = T$.
C.28. On unitary operators. Those $U \in B(H)$ for which $UU^* = U^*U = I$. Equivalently, $U$ is surjective and an isometry (and hence preserves the inner product).
Hermitian operator or self-adjoint operator. Those $T \in B(H)$ such that $T = T^*$.
On normal operators. Those $T \in B(H)$ such that $TT^* = T^*T$ (so including Hermitian and unitary operators).
C.29. On spectrum. $\sigma(T) = \{\lambda \in \mathbb{C} : (T - \lambda I) \text{ is not invertible in } B(X)\}$. Includes all eigenvalues $\lambda$ where $Tx = \lambda x$ for some $x \neq 0$, and often other things as well. On spectral radius: $r(T) = \sup\{|\lambda| : \lambda \in \sigma(T)\}$. Properties: $\sigma(T)$ is closed, bounded and nonempty. Proof: based on the fact that $(I - A)$ is invertible for $\|A\| < 1$. This implies that $r(T) \leq \|T\|$.
C.30. The spectral radius formula. $r(T) = \inf_{n\geq 1} \|T^n\|^{1/n} = \lim_{n\to\infty} \|T^n\|^{1/n}$.
Note that $\sigma(T^n) = \{\lambda^n : \lambda \in \sigma(T)\}$ and $\sigma(T^*) = \{\bar\lambda : \lambda \in \sigma(T)\}$. The spectrum of a unitary operator is contained in $\{|z| = 1\}$, and the spectrum of a self-adjoint operator is real (proof by Cayley transform: $U = (T - iI)(T + iI)^{-1}$ is unitary).
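The spectral radius formula can be watched at work on a matrix whose norm is much larger than its spectral radius. The sketch below is an added illustration, not part of the notes: it is restricted to real $2\times 2$ matrices, for which the operator norm has a closed form, and the matrix `T` and all function names are chosen for the example.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm(A):
    """Operator norm of a real 2x2 matrix: sqrt of the largest eigenvalue of A^T A."""
    (a, b), (c, d) = A
    p, q, r = a * a + c * c, b * b + d * d, a * b + c * d  # entries of A^T A
    return math.sqrt((p + q) / 2 + math.sqrt(((p - q) / 2) ** 2 + r * r))

def root_norms(T, n_max):
    """Return the list ||T^n||^(1/n) for n = 1, ..., n_max."""
    out, P = [], [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, n_max + 1):
        P = mat_mul(P, T)
        out.append(op_norm(P) ** (1.0 / n))
    return out

# Upper triangular, so the eigenvalues are 0.5 and 0.25 and r(T) = 0.5,
# while ||T|| exceeds 4 because of the off-diagonal entry.
T = [[0.5, 4.0], [0.0, 0.25]]
values = root_norms(T, 200)
```

Here `values[0]` is $\|T\|$, and later entries approach $r(T) = 0.5$ from above, slowly, since the convergence is only of order $16^{1/n}$.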
C.31. On finite rank operators. $T \in F(X, Y)$ if $\mathrm{Im}\,T$ is finite-dimensional.
On compact operators. $T \in K(X, Y)$ if: whenever $(x_n)$ is bounded, then $(Tx_n)$ has a convergent subsequence. Now $F(X, Y) \subseteq K(X, Y)$ since bounded sequences in a finite-dimensional space have convergent subsequences (because when $Z$ is f.d., $Z$ is isomorphic to $\ell_2^n$, i.e., there is $S\colon \ell_2^n \to Z$ with $S$, $S^{-1}$ bounded). Also limits of compact operators are compact, which shows that a diagonal operator $Tx = \sum \lambda_n \langle x, e_n\rangle e_n$ is compact iff $\lambda_n \to 0$.
C.32. Hilbert–Schmidt operators. $T$ is HS when $\sum \|Te_n\|^2 < \infty$ for some o.n.b. $(e_n)$. All such operators are compact: write them as a limit of finite rank operators $T_k$ with $T_k \sum_{n=1}^{\infty} a_n e_n = \sum_{n=1}^{k} a_n (Te_n)$. This class includes integral operators $T\colon L_2(a, b) \to L_2(a, b)$ of the form
\[
(Tf)(x) = \int_a^b K(x, y) f(y)\,dy,
\]
where $K$ is continuous on $[a, b] \times [a, b]$.
C.33. On spectral properties of normal operators. If $T$ is normal, then (i) $\ker T = \ker T^*$, so $Tx = \lambda x \implies T^*x = \bar\lambda x$; (ii) eigenvectors corresponding to distinct eigenvalues are orthogonal; (iii) $\|T\| = r(T)$.
If $T \in B(H)$ is compact normal, then its set of eigenvalues is either finite or a sequence tending to zero. The eigenspaces are finite-dimensional, except possibly for $\lambda = 0$. All nonzero points of the spectrum are eigenvalues.
C.34. On spectral theorem for compact normal operators. There is an orthonormal sequence $(e_k)$ of eigenvectors of $T$, and eigenvalues $(\lambda_k)$, such that $Tx = \sum \lambda_k \langle x, e_k\rangle e_k$. If $(\lambda_k)$ is an infinite sequence, then it tends to 0. All operators of the above form are compact and normal.
Corollary. In the spectral theorem we can have the same formula with an orthonormal basis, adding in vectors from $\ker T$.
C.35. On general compact operators. We can write $Tx = \sum \mu_k \langle x, e_k\rangle f_k$, where $(e_k)$ and $(f_k)$ are orthonormal sequences and $(\mu_k)$ is either a finite sequence or an infinite sequence tending to 0. Hence $T \in B(H)$ is compact if and only if it is the norm limit of a sequence of finite-rank operators.
C.36. On integral equations. Fredholm equations on $L_2(a, b)$ are $T\phi = f$ or $\phi - \lambda T\phi = f$, where $(T\phi)(x) = \int_a^b K(x, y)\phi(y)\,dy$. Volterra equations are similar, except that $T$ is now defined by $(T\phi)(x) = \int_a^x K(x, y)\phi(y)\,dy$.
C.37. Neumann series. $(I - \lambda T)^{-1} = 1 + \lambda T + \lambda^2 T^2 + \ldots$, for $\|\lambda T\| < 1$.
On separable kernel. $K(x, y) = \sum_{j=1}^{n} g_j(x) h_j(y)$. The image of $T$ (and hence its eigenvectors for $\lambda \neq 0$) lies in the space spanned by $g_1, \ldots, g_n$.
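C.38. Hilbert–Schmidt theory. Suppose that $K \in C([a, b] \times [a, b])$ and $K(y, x) = \overline{K(x, y)}$. Then (in the Fredholm case) $T$ is a self-adjoint Hilbert–Schmidt operator and eigenvectors corresponding to nonzero eigenvalues are continuous functions. If $\lambda \neq 0$ and $1/\lambda \notin \sigma(T)$, then the solution of $\phi - \lambda T\phi = f$ is
\[
\phi = \sum_{k=1}^{\infty} \frac{\langle f, v_k\rangle}{1 - \lambda\lambda_k}\, v_k,
\]
where $(v_k)$ is an orthonormal basis of eigenvectors of $T$ with corresponding eigenvalues $(\lambda_k)$.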
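The eigenvector expansion of the solution can be tried on a symmetric matrix, which is the finite-dimensional analogue of a self-adjoint Hilbert–Schmidt operator. The sketch below is an added illustration under that assumption; the matrix, its hand-computed eigenpairs and the name `solve_second_kind` are all chosen for the example.

```python
import math

def inner(x, y):
    """Inner product of real vectors (no conjugation needed)."""
    return sum(a * b for a, b in zip(x, y))

# Symmetric matrix T with orthonormal eigenvectors v_k and eigenvalues lam_k,
# computed by hand: T(1,1) = 3(1,1) and T(1,-1) = 1(1,-1).
T = [[2.0, 1.0], [1.0, 2.0]]
s = 1 / math.sqrt(2)
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]

def solve_second_kind(lam, f):
    """phi = sum_k <f, v_k> / (1 - lam*lam_k) v_k, valid when 1/lam is not an eigenvalue."""
    phi = [0.0] * len(f)
    for lam_k, v in eigenpairs:
        coeff = inner(f, v) / (1 - lam * lam_k)
        phi = [p + coeff * vi for p, vi in zip(phi, v)]
    return phi

lam, f = 0.5, [1.0, 0.0]          # 1/lam = 2 is not an eigenvalue of T
phi = solve_second_kind(lam, f)   # approximately [0, -2]
T_phi = [inner(row, phi) for row in T]
residual = [p - lam * t - fi for p, t, fi in zip(phi, T_phi, f)]
```

Unlike the Neumann series, this formula does not need $\|\lambda T\| < 1$: here $\|\lambda T\| = 1.5$, and only the condition $1/\lambda \notin \sigma(T)$ is used.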
C.39. Fredholm alternative. Let $T$ be compact and normal and $\lambda \neq 0$. Consider the equations (i) $\phi - \lambda T\phi = 0$ and (ii) $\phi - \lambda T\phi = f$. Then EITHER (A) the only solution of (i) is $\phi = 0$ and (ii) has a unique solution for all $f$ OR (B) (i) has nonzero solutions and (ii) can be solved if and only if $f$ is orthogonal to every solution of (i).
APPENDIX D. SUPPLEMENTARY SECTIONS
D.1. Reminder from Complex Analysis. The theory of analytic functions is the most powerful tool in operator theory. Here we briefly recall a few facts of complex analysis used in this course. Consult any decent textbook on complex variables for a concise exposition. The only difference in our version is that we consider a function $f(z)$ of a complex variable $z$ taking values in an arbitrary normed space $V$ over the field $\mathbb{C}$. By direct inspection one can check that all standard proofs of the listed results work equally well in this more general case.
Definition D.1. A function $f(z)$ of a complex variable $z$ taking values in a normed vector space $V$ is called differentiable at a point $z_0$ if the following limit (called the derivative of $f(z)$ at $z_0$) exists:
\[
\text{(D.1)}\qquad f'(z_0) = \lim_{\Delta z \to 0} \frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z}.
\]
Definition D.2. A function $f(z)$ is called holomorphic (or analytic) in an open set $\Omega \subset \mathbb{C}$ if it is differentiable at every point of $\Omega$.
Theorem D.3 (Laurent Series). Let a function $f(z)$ be analytic in the annulus $r < |z| < R$ for some real $r < R$; then it can be uniquely represented by the Laurent series:
\[
\text{(D.2)}\qquad f(z) = \sum_{k=-\infty}^{\infty} c_k z^k, \quad \text{for some } c_k \in V.
\]
Theorem D.4 (Cauchy–Hadamard). The radii $r'$ and $R'$ ($r' < R'$) of convergence of the Laurent series (D.2) are given by
\[
\text{(D.3)}\qquad r' = \limsup_{n\to\infty} \|c_{-n}\|^{1/n} \quad\text{and}\quad \frac{1}{R'} = \limsup_{n\to\infty} \|c_n\|^{1/n}.
\]
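As an illustration of Theorems D.3 and D.4 (a worked example added here, not part of the original notes), take the scalar case $V = \mathbb{C}$ and a function with poles at $z = 1$ and $z = 2$, expanded in the annulus between them:

```latex
\[
f(z) = \frac{1}{(z-1)(z-2)} = \frac{1}{z-2} - \frac{1}{z-1}
     = -\sum_{k=0}^{\infty} \frac{z^{k}}{2^{k+1}} - \sum_{k=1}^{\infty} z^{-k},
\qquad 1 < |z| < 2 .
\]
% Coefficients: c_k = -2^{-(k+1)} for k >= 0 and c_{-k} = -1 for k >= 1.
\[
\lim_{n\to\infty} |c_{-n}|^{1/n} = 1 ,
\qquad
\lim_{n\to\infty} |c_{n}|^{1/n} = \lim_{n\to\infty} 2^{-(n+1)/n} = \frac{1}{2} .
\]
```

So the series converges exactly for $1 < |z| < 2$: the inner radius is governed by the coefficients with negative indices and the outer radius by those with positive indices, and both radii coincide with the distances from the origin to the poles.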
REFERENCES
[1] Béla Bollobás, Linear analysis. An introductory course, Second edition, Cambridge University Press, Cambridge, 1999. MR2000g:46001
[2] Roger Howe, On the role of the Heisenberg group in harmonic analysis, Bull. Amer. Math. Soc. (N.S.) 3 (1980), no. 2, 821–843. MR81h:22010
[3] Alexander A. Kirillov and Alexei D. Gvishiani, Theorems and problems in functional analysis, Problem Books in Mathematics, Springer-Verlag, New York, 1982.
[4] A. N. Kolmogorov and S. V. Fomin, Measure, Lebesgue integrals, and Hilbert space, Translated by Natascha Artin Brunswick and Alan Jeffrey, Academic Press, New York, 1961. MR0118797 (22 #9566b)
[5] A. N. Kolmogorov and S. V. Fomin, Introductory real analysis, Dover Publications Inc., New York, 1975. Translated from the second Russian edition and edited by Richard A. Silverman, Corrected reprinting. MR0377445 (51 #13617)
[6] Erwin Kreyszig, Introductory functional analysis with applications, John Wiley & Sons Inc., New York, 1989. MR90m:46003
[7] Michael Reed and Barry Simon, Functional analysis, Second edition, Methods of Modern Mathematical Physics, vol. 1, Academic Press, Orlando, 1980.
[8] Walter Rudin, Real and complex analysis, Third edition, McGraw-Hill Book Co., New York, 1987. MR88k:00002
[9] Nicholas Young, An introduction to Hilbert space, Cambridge University Press, Cambridge, 1988. MR90e:46001
SCHOOL OF MATHEMATICS, UNIVERSITY OF LEEDS, LEEDS LS2 9JT, UK
E-mail address: kisilv@maths.leeds.ac.uk
URL: http://www.maths.leeds.ac.uk/~kisilv/
