Term 1, 2009–2010
Vassili Gelfreich
Contents
1 Vector spaces
1.1 Definition
1.2 Examples of vector spaces
1.3 Hamel bases
2 Normed spaces
2.1 Norms
2.2 Four famous inequalities
2.3 Examples of norms on a space of functions
2.4 Equivalence of norms
2.5 Linear Isometries
4 Banach spaces
4.1 Completeness: Definition and examples
4.2 The completion of a normed space
4.3 Examples
5 Lebesgue spaces
5.1 Integrable functions
5.2 Properties of Lebesgue integrals
5.3 Lebesgue space $L^1(\mathbb{R})$
5.4 $L^p$ spaces
6 Hilbert spaces
6.1 Inner product spaces
6.2 Natural norms
6.3 Parallelogram law and polarisation identity
6.4 Hilbert spaces: Definition and examples
8 Closest points and approximations
8.1 Closest points in convex subsets
8.2 Orthogonal complements
8.3 Best approximations
8.4 Weierstrass Approximation Theorem
11 Linear functionals
11.1 Definition and examples
11.2 Riesz representation theorem
14 Sturm-Liouville problems
Preface
These notes follow the lectures on Functional Analysis given in the Autumn Term of
2009. If you find a mistake or misprint please inform the author by sending an e-mail
to v.gelfreich@warwick.ac.uk. The author thanks James Robinson for his set of
notes and selection of exercises which significantly facilitated the preparation of the
lectures. The author also thanks all students who helped with proofreading the notes.
1 Vector spaces
1.1 Definition.
A vector space V over a field K is a set equipped with two binary operations called
vector addition and multiplication by scalars. Elements of V are called vectors and
elements of K are called scalars. The sum of two vectors x, y ∈ V is denoted x + y, the
product of a scalar α ∈ K and vector x ∈ V is denoted αx.
It is possible to consider vector spaces over an arbitrary field K, but we will
consider the fields R and C only. So we will always assume that K denotes either R or C
and refer to V as a real or complex vector space respectively.
In a vector space, addition and multiplication have to satisfy the following set of
axioms: Let x, y, z be arbitrary vectors in V , and α, β be arbitrary scalars in K, then
• Associativity of addition: x + (y + z) = (x + y) + z.
• Commutativity of addition: x + y = y + x.
• There exists an element 0 ∈ V , called the zero vector, such that x + 0 = x for all
x ∈ V.
• For all x ∈ V , there exists an element y ∈ V , called the additive inverse of x, such
that x + y = 0. The additive inverse is denoted −x.
• Distributivity:
α(x + y) = αx + αy and (α + β )x = αx + β x.
It is convenient to define two additional operations: subtraction of two vectors and
division by a (non-zero) scalar are defined by
x − y = x + (−y),
x/α = (1/α)x.
7. The space $C[0, 1]$ of all real-valued continuous functions on the closed interval
$[0, 1]$ is a vector space. Addition and multiplication by scalars are defined
naturally: for $f, g \in C[0, 1]$ and $\alpha \in \mathbb{R}$ we denote by $f + g$ and $\alpha f$ the functions whose
values are given by
\[ (f + g)(t) = f(t) + g(t), \qquad (\alpha f)(t) = \alpha f(t), \qquad t \in [0, 1] . \]
Definition 1.2 A set $E$ is linearly independent if any finite collection of elements
$e_1, \dots, e_n$ of $E$ is linearly independent:
\[ \sum_{j=1}^{n} \alpha_j e_j = 0 \implies \alpha_1 = \alpha_2 = \dots = \alpha_n = 0 . \]
Examples:
Lemma 1.4 If $E$ is a Hamel basis for a vector space $V$ then any element $x \in V$ can be
uniquely written in the form
\[ x = \sum_{j=1}^{n} \alpha_j e_j \]
where $n \in \mathbb{N}$, $\alpha_j \in K$ and $e_j \in E$.
Definition 1.5 We say that a set is finite if it consists of a finite number of elements.
Theorem 1.6 If V has a finite Hamel basis then every Hamel basis for V has the same
number of elements.
Definition 1.7 If V has a finite basis E then the dimension of V (denoted dimV ) is
the number of elements in E. If V has no finite basis then we say that V is infinite-
dimensional.
Proposition 1.10 Any $n$-dimensional vector space over $K$ is linearly isomorphic to
$K^n$.
Proof: If $E = \{ e_j : 1 \le j \le n \}$ is a basis in $V$, then every element $x \in V$ is represented
uniquely in the form
\[ x = \sum_{j=1}^{n} \alpha_j e_j . \]
The map $L : x \mapsto (\alpha_1, \dots, \alpha_n)$ is a linear bijection $V \to K^n$. Therefore $V$ is linearly
isomorphic to $K^n$.
In order to show that a vector space is infinite-dimensional it is sufficient to find an
infinite linearly independent subset. Let’s consider the following examples:
1. $\ell^p(K)$ is infinite-dimensional ($1 \le p \le \infty$).
Proof. The set
\[ E = \{ (1, 0, 0, 0, \dots), (0, 1, 0, 0, \dots), (0, 0, 1, 0, \dots), \dots \} \]
is infinite and linearly independent. Therefore $\dim \ell^p(K) = \infty$.
Remark: This linearly independent subset is not a Hamel basis. Indeed, the
sequence $x = (x_1, x_2, x_3, \dots)$ with $x_k = e^{-k}$ belongs to $\ell^p(K)$ for any $p \ge 1$ but
cannot be represented as a finite linear combination of elements of the set $E$.
2. C[0, 1] is infinite-dimensional.
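Proof: The set $E = \{ x^k : k \in \mathbb{N} \}$ is a linearly independent subset of $C[0, 1]$. Indeed,
suppose
\[ p(x) = \sum_{k=1}^{n} \alpha_k x^k = 0 \quad \text{for all } x \in [0, 1] . \]
Differentiating the equality $n$ times we get $p^{(n)}(x) = n!\,\alpha_n = 0$, which implies
$\alpha_n = 0$. Repeating the argument for the remaining sum of degree $n - 1$ we successively
obtain $\alpha_{n-1} = \dots = \alpha_1 = 0$. Therefore $p(x) \equiv 0$ implies $\alpha_k = 0$ for all $k$.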
The linearly independent sets provided in the last two examples are not Hamel
bases. This is not a coincidence: we will see later that $\ell^p(K)$ and $C[0, 1]$ (as well as
many other function spaces) do not have a countable Hamel basis.
Theorem 1.11 Every vector space has a Hamel basis.
The proof of this theorem is based on Zorn’s Lemma.
We note that in many interesting vector spaces (called normed spaces), a very large
number of elements should be included into a Hamel basis in order to enable repre-
sentation of every element in the form of a finite sum. Then the basis is too large to
be useful for the study of the original vector space. A natural idea would be to allow
infinite sums in the definition of a basis. In order to use infinite sums we need to de-
fine convergence which cannot be done using the axioms of vector spaces only. An
additional structure on the vector space should be defined.
2 Normed spaces
2.1 Norms
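Definition 2.1 A norm on a vector space $V$ is a map $\|\cdot\| : V \to \mathbb{R}$ such that for any
$x, y \in V$ and any $\alpha \in K$:
1. $\|x\| \ge 0$, and $\|x\| = 0$ if and only if $x = 0$ (positive definiteness);
2. $\|\alpha x\| = |\alpha|\,\|x\|$ (homogeneity);
3. $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality).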
In order to prove the triangle inequality for the $\ell^p$ norm, we will state and prove
several inequalities.
2.2 Four famous inequalities
Lemma 2.2 (Young’s inequality) If $a, b > 0$, $1 < p, q < \infty$, $\frac1p + \frac1q = 1$, then
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q} . \]
Proof: Consider the function $f(t) = \frac{t^p}{p} - t + \frac1q$ defined for $t \ge 0$. Since $f'(t) = t^{p-1} - 1$
vanishes at $t = 1$ only, and $f''(t) = (p-1)t^{p-2} \ge 0$, the point $t = 1$ is a global minimum
for $f$. Consequently, $f(t) \ge f(1) = 0$ for all $t \ge 0$. Now substitute $t = ab^{-q/p}$:
\[ f(ab^{-q/p}) = \frac{a^p b^{-q}}{p} - ab^{-q/p} + \frac1q \ge 0 . \]
Multiplying the inequality by $b^q$ and using $q - q/p = 1$ yields Young’s inequality.
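Young’s inequality is easy to test numerically. The following Python sketch is an illustration only and not part of the lecture material: the helper name `young_gap` and the sample values are our own choices. It evaluates $a^p/p + b^q/q - ab$ on a small grid and records the smallest value, which should be non-negative up to rounding.

```python
import itertools


def young_gap(a: float, b: float, p: float) -> float:
    """Return a^p/p + b^q/q - a*b, where q is the conjugate exponent of p."""
    q = p / (p - 1.0)  # solves 1/p + 1/q = 1
    return a**p / p + b**q / q - a * b


# Illustrative sample of positive numbers a, b and exponents p > 1.
values = [0.1, 0.5, 1.0, 2.0, 7.5]
exponents = [1.5, 2.0, 3.0, 10.0]

gaps = [young_gap(a, b, p) for a, b, p in itertools.product(values, values, exponents)]
min_gap = min(gaps)  # expected to be >= 0 up to floating point rounding
```

Equality in Young’s inequality holds when $a^p = b^q$, for instance $a = b = 1$, which is why the smallest gap on this grid is zero up to rounding.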
Lemma 2.3 (Hölder’s inequality) If $1 \le p, q \le \infty$, $\frac1p + \frac1q = 1$, $x \in \ell^p(K)$, $y \in \ell^q(K)$,
then
\[ \sum_{j=1}^{\infty} |x_j y_j| \le \|x\|_{\ell^p} \|y\|_{\ell^q} . \]
Proof. If $1 < p, q < \infty$, we use Young’s inequality to get that for any $n \in \mathbb{N}$
\[ \sum_{j=1}^{n} \frac{|x_j|}{\|x\|_{\ell^p}} \frac{|y_j|}{\|y\|_{\ell^q}} \le \sum_{j=1}^{n} \left( \frac1p \frac{|x_j|^p}{\|x\|_{\ell^p}^p} + \frac1q \frac{|y_j|^q}{\|y\|_{\ell^q}^q} \right) \le \frac1p + \frac1q = 1 . \]
Since the partial sums are monotonically increasing and bounded above, the series
converge and Hölder’s inequality follows by taking the limit as $n \to \infty$.
If $p = 1$ and $q = \infty$:
\[ \sum_{j=1}^{n} |x_j y_j| \le \max_{1 \le j \le n} |y_j| \sum_{j=1}^{n} |x_j| \le \|x\|_{\ell^1} \|y\|_{\ell^\infty} . \]
Therefore the series converges and Hölder’s inequality follows by taking the limit as
$n \to \infty$.
Proof: This inequality coincides with Hölder’s inequality with p = q = 2.
Now we state and prove the triangle inequality for the $\ell^p$ norm.
Proof: If $1 < p < \infty$, define $q$ from $\frac1p + \frac1q = 1$. Then using Hölder’s inequality (finite
sequences belong to $\ell^p$ with any $p$) we get
\begin{align*}
\sum_{j=1}^{n} |x_j + y_j|^p &= \sum_{j=1}^{n} |x_j + y_j|^{p-1} |x_j + y_j| \\
&\le \sum_{j=1}^{n} |x_j + y_j|^{p-1} |x_j| + \sum_{j=1}^{n} |x_j + y_j|^{p-1} |y_j| \\
&\le \Bigl( \sum_{j=1}^{n} |x_j + y_j|^{(p-1)q} \Bigr)^{1/q} \Bigl( \sum_{j=1}^{n} |x_j|^p \Bigr)^{1/p} \quad \text{(Hölder’s inequality)} \\
&\quad + \Bigl( \sum_{j=1}^{n} |x_j + y_j|^{(p-1)q} \Bigr)^{1/q} \Bigl( \sum_{j=1}^{n} |y_j|^p \Bigr)^{1/p} .
\end{align*}
Dividing the inequality by $\bigl( \sum_{j=1}^{n} |x_j + y_j|^p \bigr)^{1/q}$ and using that $(p-1)q = p$ and
$1 - \frac1q = \frac1p$, we get for all $n$
\[ \Bigl( \sum_{j=1}^{n} |x_j + y_j|^p \Bigr)^{1/p} \le \Bigl( \sum_{j=1}^{n} |x_j|^p \Bigr)^{1/p} + \Bigl( \sum_{j=1}^{n} |y_j|^p \Bigr)^{1/p} . \]
The series on the right hand side converge to $\|x\|_{\ell^p} + \|y\|_{\ell^p}$. Consequently the series
on the left hand side also converge, $x + y \in \ell^p(K)$, and Minkowski’s inequality follows
by taking the limit as $n \to \infty$.
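As a sanity check, both Hölder’s and Minkowski’s inequalities can be verified on concrete finite sequences. The sketch below is only an illustration: the helper `lp_norm`, the two sample vectors and the exponent $p = 3$ are arbitrary choices made here, not taken from the lectures.

```python
def lp_norm(x, p):
    """l^p norm of a finite sequence for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


# Arbitrary sample data (finite sequences belong to l^p for every p).
x = [1.0, -2.0, 0.5, 3.0]
y = [0.25, 4.0, -1.0, 2.0]
p = 3.0
q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1

# Hoelder: sum |x_j y_j| <= ||x||_p ||y||_q
holder_lhs = sum(abs(a * b) for a, b in zip(x, y))
holder_rhs = lp_norm(x, p) * lp_norm(y, q)

# Minkowski: ||x + y||_p <= ||x||_p + ||y||_p
minkowski_lhs = lp_norm([a + b for a, b in zip(x, y)], p)
minkowski_rhs = lp_norm(x, p) + lp_norm(y, p)
```

A numerical check of this kind proves nothing, of course, but it is a quick way to catch a misplaced exponent when working through the exercise for $p = 1$ and $p = \infty$.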
Exercise: Prove Minkowski’s inequality for p = 1 and p = ∞.
1. the “sup(remum) norm”
\[ \|f\|_\infty = \sup_{t \in [0,1]} |f(t)| ; \]
Exercise: Check that each of these formulae defines a norm. For the case of the $L^2$
norm, you will need a Cauchy-Schwarz inequality for integrals.
Example: Let $k \in \mathbb{N}$. The space $C^k[0, 1]$ consists of all continuous real-valued
functions which have continuous derivatives up to order $k$. The norm on $C^k[0, 1]$ is defined
by
\[ \|f\|_{C^k} = \sum_{j=0}^{k} \sup_{t \in [0,1]} |f^{(j)}(t)| , \]
Definition 2.6 Two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on a vector space $V$ are equivalent if there
are constants $c_1, c_2 > 0$ such that
\[ c_1 \|x\|_1 \le \|x\|_2 \le c_2 \|x\|_1 \quad \text{for all } x \in V . \]
Proof: Consider the sequence of functions $f_n(t) = t^n$ with $n \in \mathbb{N}$. Obviously $f_n \in C[0, 1]$.
We see that
\[ \|f_n\|_\infty = \max_{t \in [0,1]} |t|^n = 1 , \qquad \|f_n\|_{L^1} = \int_0^1 t^n \, dt = \frac{1}{n+1} . \]
Suppose the norms are equivalent. Then there is a constant $c_2 > 0$ such that for all $f_n$:
\[ \frac{\|f_n\|_\infty}{\|f_n\|_{L^1}} = n + 1 \le c_2 , \]
which is not possible for all $n$. This contradiction implies the norms are not equivalent.
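The unbounded ratio $\|f_n\|_\infty / \|f_n\|_{L^1} = n + 1$ can also be observed numerically. The following sketch is a rough illustration under our own assumptions: the helper names `sup_norm`, `l1_norm` and `monomial`, the grid size, and the use of the midpoint rule are choices made here, and the norms are only approximated on a finite grid.

```python
def sup_norm(f, samples=10_000):
    """Approximate the sup norm on [0, 1] by sampling a uniform grid."""
    return max(abs(f(i / samples)) for i in range(samples + 1))


def l1_norm(f, samples=10_000):
    """Approximate the L^1 norm on [0, 1] by the midpoint rule."""
    h = 1.0 / samples
    return h * sum(abs(f((i + 0.5) * h)) for i in range(samples))


def monomial(n):
    """Return the function t -> t^n."""
    return lambda t: t**n


# The ratio of the two norms grows like n + 1, so no constant c_2 can bound it.
ratios = {n: sup_norm(monomial(n)) / l1_norm(monomial(n)) for n in (1, 2, 5, 10, 50)}
```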
Definition 2.8 If a linear map $L : V \to W$ preserves norms, i.e. $\|L(x)\| = \|x\|$ for all
$x \in V$, it is called a linear isometry.
This definition implies $L$ is injective, i.e., $L : V \to L(V)$ is bijective, but it does not
imply $L(V) = W$, i.e., $L$ is not necessarily invertible. Note that sometimes the invertibility
property is included into the definition of the isometry. Finally, in Metric Spaces the
word “isometry” is used to denote distance-preserving transformations.
Definition 2.9 We say that two normed spaces are isometrically isomorphic (or simply
isometric), if there is an invertible linear isometry between them.
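\[ \|x\|_W = \|L(x)\|_V \ge 0 , \]
\[ \|\alpha x\|_W = \|L(\alpha x)\|_V = |\alpha| \, \|L(x)\|_V = |\alpha| \, \|x\|_W . \]
Finally, the triangle inequality follows from the triangle inequality for $\|\cdot\|_V$:
\[ \|x + y\|_W = \|L(x) + L(y)\|_V \le \|L(x)\|_V + \|L(y)\|_V = \|x\|_W + \|y\|_W . \]
Therefore, $\|\cdot\|_W$ is a norm.
Note that in the proposition the new norm is introduced in such a way that
$L : (W, \|\cdot\|_W) \to (V, \|\cdot\|_V)$ is a linear isometry.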
Let $V$ be a finite dimensional vector space and $n = \dim V$. We have seen that $V$ is
linearly isomorphic to $K^n$. Then the proposition implies the following statements.
Corollary 2.11 Any finite dimensional vector space $V$ can be equipped with a norm.
Since any two norms on $\mathbb{R}^n$ (and therefore on $\mathbb{C}^n$) are equivalent we also get the
following statement.
Theorem 2.13 Let $V$ be a finite-dimensional vector space. Then all norms on $V$ are
equivalent.
3 Convergence in a normed space
3.1 Definition and examples
The norm on a vector space V can be used to measure distances between points x, y ∈ V .
So we can define the limit of a sequence.
Then we write xn → x.
We note that the sequence of vectors $x_n \to x$ if and only if the sequence of
non-negative real numbers $\|x_n - x\| \to 0$.
Example: Consider the sequence $f_n \in C[0, 1]$ defined by $f_n(t) = t^n$.
1. $f_n \to 0$ in the $L^1$ norm.
Proof: We have already computed the norms:
\[ \|f_n\|_{L^1} = \frac{1}{n+1} \to 0 . \]
Consequently, $f_n \to 0$.
2. $f_n$ does not converge in the sup norm.
Proof: If $m > 2n \ge 1$ then
\[ f_n(2^{-1/n}) - f_m(2^{-1/n}) = \frac12 - \frac{1}{2^{m/n}} \ge \frac14 . \]
Consequently $(f_n)$ is not Cauchy in the sup norm and hence not convergent.
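The lower bound $\frac14$ used in the second part can be checked directly at the witness point $t = 2^{-1/n}$. The sketch below is an illustration only; the helper name `gap_at_witness` and the choice $m = 2n + 1$ (the smallest admissible $m$) are assumptions made here.

```python
def gap_at_witness(n: int, m: int) -> float:
    """Return f_n(t) - f_m(t) at t = 2^(-1/n), where f_k(t) = t^k."""
    t = 2.0 ** (-1.0 / n)
    return t**n - t**m


# For m = 2n + 1 > 2n the gap is 1/2 - 2^(-m/n), which stays above 1/4,
# so the sup-norm distance between f_n and f_m does not tend to zero.
gaps = [gap_at_witness(n, 2 * n + 1) for n in range(1, 200)]
smallest_gap = min(gaps)
```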
This example shows that convergence in the $L^1$ norm does not imply pointwise
convergence to the same limit ($f_n(1) = 1$ for all $n$) and, as a result, does not imply
convergence in the sup norm (often called uniform convergence). Note that in contrast to
uniform and $L^1$ convergence, the notion of pointwise convergence is not based on a norm
on the space of continuous functions.
Exercise: Show that pointwise convergence does not imply $L^1$ convergence.
Hint: Construct $f_n$ with a very small support but make the maximum of $f_n$ very
large to ensure that $\|f_n\|_{L^1} > n$. Then $f_n$ is not bounded in the $L^1$ norm, hence not
convergent.
We can also make $\operatorname{supp} f_n \cap \operatorname{supp} f_m = \emptyset$ for all $m, n$ such that $n \ne m$. Then for any
$t$ there is at most one $n$ such that $f_n(t) \ne 0$. The last property guarantees pointwise
convergence: $f_n(t) \to 0$ for any $t$.
Proposition 3.9 If $f_n \in C[0, 1]$ for all $n \in \mathbb{N}$ and $f_n \to f$ in the sup norm, then $f_n \to f$
in the $L^1$ norm, i.e.,
\[ \|f_n - f\|_\infty \to 0 \implies \|f_n - f\|_{L^1} \to 0 . \]
Proof:
\[ 0 \le \|f_n - f\|_{L^1} = \int_0^1 |f_n(t) - f(t)| \, dt \le \sup_{0 \le t \le 1} |f_n(t) - f(t)| = \|f_n - f\|_\infty \to 0 . \]
Therefore $\|f_n - f\|_{L^1} \to 0$.
We have seen that different norms may lead to different conclusions about the
convergence of a given sequence, but sometimes convergence in one norm implies convergence
in another one. The following lemma shows that equivalent norms give rise to the same
notion of convergence.
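Lemma 3.10 Suppose $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent norms on a vector space $V$. Then
for any sequence $(x_n)$:
\[ \|x_n - x\|_1 \to 0 \iff \|x_n - x\|_2 \to 0 . \]
Proof: Since the norms are equivalent, there are constants $c_1, c_2 > 0$ such that
\[ c_1 \|x_n - x\|_1 \le \|x_n - x\|_2 \le c_2 \|x_n - x\|_1 \]
for all $n$. Then $\|x_n - x\|_2 \to 0$ implies $\|x_n - x\|_1 \to 0$, and vice versa.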
1. $\emptyset, V \in \mathcal{T}$;
Definition 3.11 A subset $X \subset V$ is open, if for any $x \in X$ there is $\varepsilon > 0$ such that the
ball of radius $\varepsilon$ centred at $x$ is contained in $X$:
\[ B(x, \varepsilon) = \{ y \in V : \|y - x\| < \varepsilon \} \subset X . \]
1. The unit ball centred at zero, $B_0 = \{ x : \|x\| < 1 \}$, is open.
3. V is open.
It is not too difficult to check that the collection of open sets defines a topology
on V . You can easily check from the definition that equivalent norms generate the
same topology, i.e., open sets are exactly the same. The notion of convergence can be
defined in terms of the topology.
Definition 3.12 An open neighbourhood of x is an open set which contains x.
Proof: ($\Longrightarrow$). Let $x_n \to x$. Take any open $X$ such that $x \in X$. Then there is $\varepsilon > 0$ such
that $B(x, \varepsilon) \subset X$. Since the sequence converges there is $N$ such that $\|x_n - x\| < \varepsilon$ for
all $n > N$. Then $x_n \in B(x, \varepsilon) \subset X$ for the same values of $n$.
($\Longleftarrow$). Take any $\varepsilon > 0$. The ball $B(x, \varepsilon)$ is open, therefore there is $N$ such that
$x_n \in B(x, \varepsilon)$ for all $n > N$. Hence $\|x_n - x\| < \varepsilon$ and $x_n \to x$.
\[ E = \{ e_1, e_2, \dots, e_n \} \subset L \]
such that $L = \operatorname{Span}(E)$. Suppose $L$ is not closed, then by Lemma 3.15 there is a
convergent sequence $x_k \to x^*$, $x_k \in L$ but $x^* \in V \setminus L$. Then $x^*$ is linearly independent
from $E$ (otherwise it would belong to $L$). Consequently
\[ \tilde E = \{ e_1, e_2, \dots, e_n, x^* \} \]
is a Hamel basis in $\tilde L = \operatorname{Span}(\tilde E)$. In this basis, the components of $x_k$ are given by
$(\alpha_1^k, \dots, \alpha_n^k, 0)$ and $x^*$ corresponds to the vector $(0, \dots, 0, 1)$. We get in the limit as
$k \to \infty$
\[ (\alpha_1^k, \dots, \alpha_n^k, 0) \to (0, \dots, 0, 1) , \]
which is obviously impossible. Therefore $L$ is closed.
Example: The subspace of polynomial functions is linear but not closed in C[0, 1]
equipped with the sup norm.
3.4 Compactness
Definition 3.18 (sequential compactness) A subset $K$ of a normed space $(V, \|\cdot\|_V)$ is
(sequentially) compact if any sequence $(x_n)_{n=1}^{\infty}$ with $x_n \in K$ has a convergent
subsequence $x_{n_j} \to x^*$ with $x^* \in K$.
Example: The unit sphere in $\ell^p(K)$ is closed, bounded but not compact.
Proof: Take the sequence $e_j$ such that
\[ e_j = (0, \dots, 0, \underbrace{1}_{j\text{th place}}, 0, \dots) . \]
We note that $\|e_j - e_k\|_{\ell^p} = 2^{1/p}$ for all $j \ne k$. Consequently, $(e_j)_{j=1}^{\infty}$ does not have any
convergent subsequence, hence $S$ is not compact.
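The constant distance $2^{1/p}$ between distinct unit vectors is easy to confirm on truncated sequences. The sketch below is illustrative only: the helper names `unit_vector` and `lp_distance`, the truncation length 6 and the exponent $p = 3$ are our own choices.

```python
def unit_vector(j, length):
    """Finite truncation of e_j: 1 in position j, 0 elsewhere."""
    return [1.0 if i == j else 0.0 for i in range(length)]


def lp_distance(x, y, p):
    """l^p distance between two finite sequences of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


p = 3.0
vectors = [unit_vector(j, 6) for j in range(6)]

# All pairwise distances between distinct e_j, e_k coincide with 2^(1/p),
# so no subsequence of (e_j) can be Cauchy.
distances = {
    lp_distance(vectors[j], vectors[k], p)
    for j in range(6)
    for k in range(6)
    if j != k
}
```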
Lemma 3.22 (Riesz’ Lemma) Let $X$ be a normed vector space and $Y$ be a closed
linear subspace of $X$ such that $Y \ne X$ and $\alpha \in \mathbb{R}$, $0 < \alpha < 1$. Then there is $x_\alpha \in X$
such that $\|x_\alpha\| = 1$ and $\|x_\alpha - y\| > \alpha$ for all $y \in Y$.
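Since $\alpha^{-1} > 1$ there is a point $z \in Y$ such that $\|x - z\| < d\alpha^{-1}$. Let $x_\alpha = \dfrac{x - z}{\|x - z\|}$. Then
$\|x_\alpha\| = 1$ and for any $y \in Y$,
\[ \|x_\alpha - y\| = \left\| \frac{x - z}{\|x - z\|} - y \right\| = \frac{\| x - (z + \|x - z\| \, y) \|}{\|x - z\|} > \frac{d}{d\alpha^{-1}} = \alpha , \]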
Theorem 3.23 A normed space is finite dimensional iff the unit sphere is compact.
Proof: Bolzano-Weierstrass Theorem and Lemma 3.15 imply that in a finite dimen-
sional normed space the unit sphere is compact (the unit sphere is bounded and closed).
So we only need to show that if the unit sphere $S \subset V$ is sequentially compact, then
the normed space $V$ is finite dimensional. Indeed, if $V$ is infinite dimensional, then
Riesz’ Lemma can be used to construct an infinite sequence of $x_n \in S$ such that
$\|x_n - x_m\| > \alpha > 0$ for all $m \ne n$. This sequence does not have a convergent subsequence
(none of the subsequences is Cauchy) and therefore $S$ is not compact.
We construct $x_n$ inductively. Fix $\alpha \in (0, 1)$ and take any $x_1 \in S$.
Suppose that for some $n \ge 1$ we have found $E_n = \{ x_1, \dots, x_n \}$ such that $x_k \in S$ and
$\|x_l - x_k\| > \alpha$ for all $1 \le k, l \le n$, $k \ne l$ (note that the second property is automatically
fulfilled for $n = 1$). The linear subspace $Y_n = \operatorname{Span}(E_n)$ is $n$-dimensional and hence
closed (see Proposition 3.17). Since $V$ is infinite dimensional, $Y_n \ne V$. Then Riesz’
Lemma implies that there is $x_{n+1} \in S$ such that $\|x_{n+1} - x_k\| > \alpha$ for all $1 \le k \le n$.
Repeating this argument we generate $x_n$ for all $n \in \mathbb{N}$.
4 Banach spaces
4.1 Completeness: Definition and examples
Definition 4.1 (Banach space) A normed space V is called complete if any Cauchy
sequence in V converges to a limit in V . A complete normed space is called a Banach
space.
Proof: Theorem 3.6 implies that $\mathbb{R}$ is complete, i.e., every Cauchy sequence of
numbers has a limit.
Now let $V$ be a real vector space, $\dim V = n < \infty$. Take any basis in $V$. Then
a sequence of vectors in $V$ converges iff each component of the vectors converges,
and a sequence of vectors is Cauchy iff each component is Cauchy. Therefore each
component has a limit, and those limits constitute the limit vector for the original
sequence. Hence $V$ is complete.
Considering $\mathbb{C}$ as a real vector space we conclude that it is also complete.
Therefore, any finite-dimensional complex vector space $V$ is also complete.
In particular, $\mathbb{R}^n$ and $\mathbb{C}^n$ are complete.
Theorem 4.3 ($\ell^p$ is a Banach space) The space $\ell^p(K)$ equipped with the standard $\ell^p$
norm is complete.
Proof: Suppose that $x^k = (x^k_1, x^k_2, \dots) \in \ell^p(K)$ is Cauchy. Then for every $\varepsilon > 0$ there is
$N$ such that
\[ \|x^m - x^n\|_{\ell^p}^p = \sum_{j=1}^{\infty} |x^m_j - x^n_j|^p < \varepsilon \]
for all $m, n > N$. Consequently, for each $j \in \mathbb{N}$ the sequence $x^k_j$ is Cauchy, and the
completeness of $K$ implies that there is $a_j \in K$ such that
\[ x^k_j \to a_j \]
as $k \to \infty$. Let $a = (a_1, a_2, \dots)$. First we note that for any $M \ge 1$ and $m, n > N$:
\[ \sum_{j=1}^{M} |x^m_j - x^n_j|^p \le \sum_{j=1}^{\infty} |x^m_j - x^n_j|^p < \varepsilon . \]
Taking the limit as $n \to \infty$ in the finite sum on the left we get
\[ \sum_{j=1}^{M} |x^m_j - a_j|^p \le \varepsilon . \]
This holds for any $M$, so we can take the limit as $M \to \infty$:
\[ \sum_{j=1}^{\infty} |x^m_j - a_j|^p \le \varepsilon . \]
We conclude that $x^m - a \in \ell^p(K)$. Since $\ell^p(K)$ is a vector space and $x^m \in \ell^p(K)$, then
$a \in \ell^p(K)$. Moreover, $\|x^m - a\|_{\ell^p}^p \le \varepsilon$ for all $m > N$. Consequently $x^m \to a$ in $\ell^p(K)$
with the standard norm, and so $\ell^p(K)$ is complete.
Theorem 4.4 ($C$ is a Banach space) The space $C[0, 1]$ equipped with the sup norm is
complete.
Proof: Let $f_n$ be a Cauchy sequence. Then for any $\varepsilon > 0$ there is $N$ such that
\[ \|f_n - f_m\|_\infty = \sup_{t \in [0,1]} |f_n(t) - f_m(t)| < \varepsilon \]
for all $m, n > N$. In particular, $f_n(t)$ is Cauchy for any fixed $t$ and consequently has a
limit. Set
\[ f(t) = \lim_{n \to \infty} f_n(t) . \]
Let’s prove that $f_n(t) \to f(t)$ uniformly in $t$. Indeed, we already know that
\[ |f_n(t) - f_m(t)| < \varepsilon \]
for all $n, m > N$ and all $t \in [0, 1]$. Taking the limit as $m \to \infty$ we get
\[ |f_n(t) - f(t)| \le \varepsilon \]
for all $n > N$ and all $t \in [0, 1]$. Therefore $f_n$ converges uniformly:
\[ \|f_n - f\|_\infty = \sup_{t \in [0,1]} |f_n(t) - f(t)| \le \varepsilon \]
for all $n > N$. The uniform limit of a sequence of continuous functions is continuous.
Consequently, $f \in C[0, 1]$ which completes the proof of completeness.
Example: The space C[0, 2] equipped with the L1 norm is not complete.
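Proof: Consider the following sequence of functions:
\[ f_n(t) = \begin{cases} t^n & \text{for } 0 \le t \le 1, \\ 1 & \text{for } 1 \le t \le 2 . \end{cases} \]
This sequence is Cauchy in the $L^1$ norm. Indeed, for $m > n$
\[ \|f_n - f_m\|_{L^1} = \int_0^1 (t^n - t^m) \, dt = \frac{1}{n+1} - \frac{1}{m+1} < \frac{1}{n+1} , \]
and consequently for any $m, n > N$
\[ \|f_n - f_m\|_{L^1} < \frac{1}{N} . \]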
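Now let us show that $f_n$ does not converge to a continuous function in the $L^1$ norm.
Indeed, suppose such a limit exists and call it $f$. Then
\[ \|f_n - f\|_{L^1} = \int_0^1 |t^n - f(t)| \, dt + \int_1^2 |1 - f(t)| \, dt \to 0 . \]
Since
\[ |f(t)| - |t^n| \le |t^n - f(t)| \le |f(t)| + |t^n| \]
implies that
\[ \int_0^1 |f(t)| \, dt - \int_0^1 t^n \, dt \le \int_0^1 |t^n - f(t)| \, dt \le \int_0^1 |f(t)| \, dt + \int_0^1 t^n \, dt , \]
and $\int_0^1 t^n \, dt = \frac{1}{n+1} \to 0$, taking the limit as $n \to \infty$ we get
\[ \int_0^1 |f(t)| \, dt + \int_1^2 |1 - f(t)| \, dt = 0 . \]
Since $f$ is continuous, this forces $f(t) = 0$ for $t \in [0, 1]$ and $f(t) = 1$ for $t \in [1, 2]$, which is
contradictory at $t = 1$. We see that the limit function $f$ cannot be continuous. This
contradiction implies that $C[0, 2]$ is not complete with respect to the $L^1$ norm.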
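The Cauchy estimate in this example can be checked with exact rational arithmetic. For $m > n$ we have $t^n \ge t^m$ on $[0, 1]$ and $f_n = f_m$ on $[1, 2]$, so the $L^1$ distance equals $\frac{1}{n+1} - \frac{1}{m+1}$. The sketch below is an illustration only; the helper name `l1_distance` and the ranges of indices are our own choices.

```python
from fractions import Fraction


def l1_distance(n: int, m: int) -> Fraction:
    """Exact L^1[0, 2] distance between f_n and f_m (f_k = t^k on [0, 1], 1 on [1, 2])."""
    return abs(Fraction(1, n + 1) - Fraction(1, m + 1))


N = 10
# Largest distance among all pairs with N < n, m < 60; it stays below 1/N.
worst = max(l1_distance(n, m) for n in range(N + 1, 60) for m in range(N + 1, 60))
```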
Definition 4.5 (dense set) We say that a subset $X \subset V$ is dense in $V$ if for any $v \in V$
and any $\varepsilon > 0$ there is $x \in X$ such that $\|x - v\| < \varepsilon$.
5 In this context “minimal” means that if any other space X̃ has the same property, then the minimal
X̂ is isometric to a subspace of X̃. It turns out that this property can be achieved by requiring X to be
dense in X̂.
Note that X is dense in V iff for every point v ∈ V there is a sequence xn ∈ X such
that xn → v.
Theorem 4.6 Let $(X, \|\cdot\|_X)$ be a normed space. Then there is a complete normed
space $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$ and a linear map $i : X \to \mathcal{X}$ such that $i$ is an isometric
isomorphism between $(X, \|\cdot\|_X)$ and $(i(X), \|\cdot\|_{\mathcal{X}})$, and $i(X)$ is dense in $\mathcal{X}$.
Moreover, $\mathcal{X}$ is unique up to isometry, i.e., if there is another complete normed
space $(\tilde{\mathcal{X}}, \|\cdot\|_{\tilde{\mathcal{X}}})$ with these properties, then $\mathcal{X}$ and $\tilde{\mathcal{X}}$ are isometrically isomorphic.
write $x \sim y$, if
\[ \lim_{n \to \infty} \|x_n - y_n\|_X = 0 . \]
Let $\mathcal{X}$ be the space of all equivalence classes in $Y$, i.e., it is the factor space: $\mathcal{X} = Y / \sim$.
The elements of $\mathcal{X}$ are collections of equivalent Cauchy sequences from $X$.
We will use $[x]$ to denote the equivalence class of $x$.
Exercise: Show that $\mathcal{X}$ is a vector space.
Norm on $\mathcal{X}$. For an $\eta \in \mathcal{X}$ take any representative $x = (x_n)_{n=1}^{\infty}$, $x_n \in X$, of the
equivalence class $\eta$. Then the equation
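Now consider the sequence $x^*$ defined by
\[ x^* = \bigl( x^{(k)}_{n_k} \bigr)_{k=1}^{\infty} . \]
Next we will check that $x^*$ is Cauchy, and consider its equivalence class $\eta^* = [x^*] \in \mathcal{X}$.
Then we will prove that $\eta^{(k)} \to \eta^*$ in $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$.
The sequence $x^*$ is Cauchy. Since the sequence of $\eta^{(k)}$ is Cauchy, for any $\varepsilon > 0$ there
is $M_\varepsilon$ such that
\[ \lim_{n \to \infty} \|x^{(k)}_n - x^{(l)}_n\|_X = \|\eta^{(k)} - \eta^{(l)}\|_{\mathcal{X}} < \varepsilon \quad \text{for all } k, l > M_\varepsilon . \]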
4.3 Examples
The theorem provides an explicit construction for the completion of a normed space.
Often this description is not sufficiently convenient and a more direct description is
desirable.
2. Example: Let $\ell_f(K)$ be the space of all sequences which have only a finite
number of non-zero elements. This space is not complete in the $\ell^p$ norm. The
completion of $\ell_f(K)$ in the $\ell^p$ norm is isometric to $\ell^p(K)$.
Indeed, we have already seen that $\ell^p(K)$ is complete. So in order to prove the
claim you only need to check that $\ell_f(K)$ is dense in $\ell^p(K)$.
We see that the completion of a space depends both on the space and on the
norm.
In spite of the fact that $f$ can be considered as a limit of the sequence of continuous
functions $f_n$ in the $L^1$ norm, we cannot define the value of $f(t)$ for a given $t$
as the limit of $f_n(t)$, because the limit may not exist or may depend on the choice
of the representative in the equivalence class.
In spite of that we can define the integral of $f$ by setting
\[ \int_0^1 f(t) \, dt := \lim_{n \to \infty} \int_0^1 f_n(t) \, dt . \tag{4.4} \]
hence convergent. It is not difficult to check that the limit does not depend on
the choice of a representative in the equivalence class $f$.
Obviously, if $f_n$ is a constant sequence, i.e., $f_n(t) = f_0(t)$, $t \in [0, 1]$, for all $n \in \mathbb{N}$,
then $\int_0^1 f(t) \, dt = \int_0^1 f_0(t) \, dt$. Therefore this definition can be considered as an
where the integral should be interpreted in the sense of the new definition. Note
that $f$ and $g$ are not continuous and consequently the equality above cannot be
used to deduce that $f(t) = g(t)$ for all $t$.
Here the integral should be considered as the Lebesgue integral. We note that it
coincides with the definition provided above but its construction is more direct.
Since the notion of Lebesgue integration is very important for functional
analysis and its applications we will discuss it in more detail in the next few
lectures. A more detailed study of this topic is a part of the MA359 Measure Theory
module.
5 Lebesgue spaces
The exposition of the Lebesgue integral is based on the book H. A. Priestley, Introduction
to Integration, Oxford Science Publications, 1997, 306 p.
where $|I_j|$ is the length of $I_j$. We note that this sum equals the Riemann integral
which you studied in Year 1, i.e., $\int \varphi$ is the “algebraic” area under the graph of the
step function $\varphi$ (the area is counted negative on those intervals where $\varphi(x) < 0$).
Exercise: Show that a countable union of measure zero sets has measure zero. Hint:
for $A_n$ choose a cover with $\varepsilon_n = \varepsilon / 2^n$.
Examples. The set Q of all rational numbers has measure zero. The Cantor set has
measure zero.
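The covering trick from the hint can be made concrete for the rational numbers. The sketch below is an illustration under our own assumptions: it enumerates only the rationals in $[0, 1]$ with denominator at most 12 (a finite portion of a countable enumeration), and the helper names `rationals_in_unit_interval` and `cover` are hypothetical. The $k$th rational receives an interval of length $\varepsilon / 2^k$, so the total length stays below $\varepsilon$ however many rationals are enumerated.

```python
from fractions import Fraction


def rationals_in_unit_interval(max_denominator: int):
    """Enumerate distinct rationals p/q in [0, 1] with q <= max_denominator."""
    seen = set()
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                yield r


def cover(eps: Fraction, max_denominator: int):
    """Cover the k-th enumerated rational by an interval of length eps / 2^k."""
    intervals = []
    for k, r in enumerate(rationals_in_unit_interval(max_denominator), start=1):
        half = eps / 2 ** (k + 1)
        intervals.append((r - half, r + half))
    return intervals


eps = Fraction(1, 100)
intervals = cover(eps, 12)
# Geometric sum: eps * (1/2 + 1/4 + ... + 1/2^K) < eps.
total_length = sum(b - a for a, b in intervals)
```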
Definition 5.2 A property is said to hold “almost everywhere” or “for almost every
x” (and abbreviated to “a.e.”) if the set of points at which the property does not hold
has measure zero.
Almost everywhere convergence
Theorem 5.3 Let $(\varphi_n(x))_{n=1}^{\infty}$ be an increasing sequence of step functions ($\varphi_{n+1}(x) \ge \varphi_n(x)$
for all $x$) such that $\int \varphi_n < K$. Then $\varphi_n(x)$ converges for a.e. $x$.
Proof: First note that an increasing sequence of numbers has a limit if and only if it is
bounded from above. So in order to prove the theorem it is sufficient to show that the
set
\[ E = \{ x \in \mathbb{R} : \varphi_n(x) \to +\infty \} \]
has measure zero.
Without loss of generality, we can assume that $\varphi_n(x) \ge 0$. Let us define the set
The total length of those intervals is less than $K/m$. Indeed, since $c_j^{(n)} \ge 0$ for all $j$ and
$c_j^{(n)} > m$ for $j \in I_{n,m}$,
\[ K > \int \varphi_n = \sum_j c_j^{(n)} |I_j^{(n)}| > m \sum_{j \in I_{n,m}} |I_j^{(n)}| . \]
Finally, $E \subset E_m = \bigcup_{l=1}^{\infty} E_{l,m}$ for every $m$. Since the sequence $\varphi_n$ is increasing, $E_{n,m} \subset$
Lemma 5.4 If $\varphi_n$ and $\psi_n$ are two increasing sequences of step functions which
respectively tend to $f$ and $g$ a.e. and $f(x) \ge g(x)$ a.e., then
\[ \lim_{n \to \infty} \int \varphi_n \ge \lim_{n \to \infty} \int \psi_n . \]
Consequently
\[ (\psi_k - \varphi_n)^+(x) = \max\{ \psi_k(x) - \varphi_n(x), 0 \} \to 0 \]
Since it is a decreasing sequence of non-negative step functions which converges to 0
a.e.,
\[ \int (\psi_k - \varphi_n)^+ \to 0 . \]
Corollary 5.5 If $\varphi_n$ and $\psi_n$ are two increasing sequences of step functions which tend
to a function $f$ a.e., then
\[ \lim_{n \to \infty} \int \varphi_n = \lim_{n \to \infty} \int \psi_n . \]
Lebesgue integrable functions
Definition 5.6 If a function $f : \mathbb{R} \to \mathbb{R}$ can be represented as an a.e. limit of an
increasing sequence of step functions $\varphi_n$, then the integral of $f$ is given by
\[ \int f = \lim_{n \to \infty} \int \varphi_n . \]
Note that the limit is independent of the choice of the increasing sequence of step
functions.
Unfortunately $L^{\mathrm{inc}}(\mathbb{R})$ is not a vector space as $f \in L^{\mathrm{inc}}(\mathbb{R})$ does not imply
$-f \in L^{\mathrm{inc}}(\mathbb{R})$. Indeed, $f$ is bounded from below by $\varphi_1$ but not necessarily bounded from
above. Then $-f$ is not bounded from below and therefore $-f \notin L^{\mathrm{inc}}(\mathbb{R})$. For example,
\[ f(x) = \begin{cases} \dfrac{1}{\sqrt{|x|}}, & x \ne 0, \\ 0, & x = 0, \end{cases} \qquad \text{or} \qquad f = \sum_{k=1}^{\infty} k^{1/2} \chi_{[(k+1)^{-1}, k^{-1}]} . \]
Of course in the definition the choice of $g$ and $h$ is not unique. So we will have to
check that the value of the integral does not depend on this freedom.
Exercise: Prove the proposition.
First we state the main elementary properties of the Lebesgue integration.
Proof:
1. Since $f_1, f_2$ are integrable, there are functions $g_1, g_2, h_1, h_2 \in L^{\mathrm{inc}}(\mathbb{R})$ such that
$f_k = g_k - h_k$, $k = 1, 2$. If $\lambda \ge 0$ then $g_1 + \lambda g_2, h_1 + \lambda h_2 \in L^{\mathrm{inc}}(\mathbb{R})$ and
\begin{align*}
\int (f_1 + \lambda f_2) &= \int (g_1 + \lambda g_2) - \int (h_1 + \lambda h_2) \\
&= \int g_1 + \lambda \int g_2 - \int h_1 - \lambda \int h_2 = \int f_1 + \lambda \int f_2 .
\end{align*}
The case of $\lambda < 0$ can be reduced to the previous one. Indeed, in this case
we can write $f_1 + \lambda f_2 = f_1 + (-\lambda)(-f_2)$ and observe that $-f_2 = h_2 - g_2$ and
consequently is also integrable. Linearity is proved.
then Proposition 5.8 implies that the maximum and the minimum belong to
$L^{\mathrm{inc}}(\mathbb{R})$ and hence $|f|$ is integrable. The inequality
Consequently
\[ \int f = \int g - \int h \le \int \max\{ g, h \} - \int \min\{ g, h \} = \int |f| . \]
3. Since $f(x) = g(x) - h(x)$ with $g, h \in L^{\mathrm{inc}}(\mathbb{R})$ and $f(x) \ge 0$ a.e., we conclude
$g(x) \ge h(x)$ a.e. Then Proposition 5.8 implies that $\int g \ge \int h$. Consequently
\[ \int f = \int g - \int h \ge 0 . \]
The following two theorems establish conditions which allow swapping the limit
and integration. They play a fundamental role in the theory of Lebesgue integrals.
Theorem 5.10 (Monotone Convergence Theorem) Suppose that $f_n$ are integrable,
$f_n(x) \le f_{n+1}(x)$ a.e., and $\int f_n < K$ for some constant $K$ independent of $n$. Then there is
an integrable function $g$ such that $f_n(x) \to g(x)$ a.e. and
\[ \int g = \lim_{n \to \infty} \int f_n . \]
Corollary 5.11 If $f$ is integrable and $\int |f| = 0$, then $f(x) = 0$ a.e.
Proof: Let $f_n(x) = n|f(x)|$. This sequence satisfies the assumptions of the MCT (integrable,
increasing and $\int f_n = 0 < 1$), consequently there is an integrable $g(x)$ such that $f_n(x) \to g(x)$ for a.e.
$x$. Since the sequence is increasing, $f_n(x) \le g(x)$ a.e. which implies $|f(x)| \le g(x)/n$
for all $n$ and a.e. $x$. Consequently $f(x) = 0$ a.e.
Theorem 5.12 (Dominated Convergence Theorem) Suppose that $f_n$ are integrable
functions and $f_n(x) \to f(x)$ for a.e. $x$. If there is an integrable function $g$ such that
$|f_n(x)| \le g(x)$ for every $n$ and a.e. $x$, then $f$ is integrable and
\[ \int f = \lim_{n \to \infty} \int f_n . \]
9 Indeed, we can sketch an example. It is based on partitioning the interval [0, 1] into two very nasty
subsets. So let f (x) = 0 outside [0, 1], for x ∈ [0, 1] let f (x) = 1 if x belongs to the Vitali set and
f (x) = −1 otherwise. Then | f | = χ[0,1] but f is not integrable.
It is also possible to integrate complex valued functions: $f : \mathbb{R} \to \mathbb{C}$ is integrable if
its real and imaginary parts are both integrable, and
\[ \int f := \int \operatorname{Re} f + i \int \operatorname{Im} f . \]
The MCT has no meaning for complex valued functions. The DCT is valid without
modifications (and indeed follows easily from the real version).
Revise the properties of the Lebesgue integral which imply that $L^1(\mathbb{R})$ is a normed
space. The completeness of $L^1(\mathbb{R})$ follows from the combination of the following two
statements: The first lemma gives a criterion for completeness of a normed space,
and the second one implies that the assumptions of the first lemma are satisfied for
$X = L^1(\mathbb{R})$.
Let $y_1 = x_{n_1}$ and $y_k = x_{n_k} - x_{n_{k-1}}$ for $k \ge 2$. Since $\|y_k\|_X \le 2^{1-k}$ for $k \ge 2$,
\[ \sum_{k=1}^{\infty} \|y_k\|_X \le \|y_1\|_X + \sum_{k=1}^{\infty} 2^{1-k} = \|y_1\|_X + 2 < \infty . \]
2. $\sum_{k=1}^{\infty} f_k(x)$ converges a.e. to an integrable function.
Proof: The first statement follows from MCT applied to the sequence $g_n = \sum_{k=1}^{n} |f_k|$
and $K = \sum_{k=1}^{\infty} \|f_k\|_{L^1}$. So there is an integrable function $g(x)$ such that
\[ g(x) = \sum_{k=1}^{\infty} |f_k(x)| \]
for almost all $x$. For these values of $x$ the partial sums $h_n(x) = \sum_{k=1}^{n} f_k(x)$ obviously
converge, so let
\[ h(x) = \sum_{k=1}^{\infty} f_k(x) . \]
Moreover
\[ |h_n(x)| = \Bigl| \sum_{k=1}^{n} f_k(x) \Bigr| \le \sum_{k=1}^{n} |f_k(x)| \le \sum_{k=1}^{\infty} |f_k(x)| = g(x) . \]
Therefore the partial sums $h_n$ satisfy DCT and the second statement follows.
In addition to $L^1(\mathbb{R})$ we will sometimes consider the Lebesgue spaces $L^1(I)$ where
$I$ is an interval. We say that $f \in L^1(I)$ if $\chi_I f \in L^1(\mathbb{R})$, i.e., we extend the function by
zero outside its original domain.
Proposition 5.17 The space $C[0, 1]$ is dense in $L^1(0, 1)$.
Proof: First show that step functions are dense in $L^1(0, 1)$. Then check that every step
function can be approximated by a piecewise linear continuous function.
Consequently the space we constructed in this section is isometrically isomorphic
to the completion of $C[0, 1]$ in the $L^1$ norm.
5.4 L p spaces
Another important class of Lebesgue spaces consists of the $L^p$ spaces for $1 \le p < \infty$. Among these, the $L^2$ space is the most remarkable: it is also a Hilbert space (see the next chapter for details). In this section we sketch the main definitions of these spaces, noting that a full discussion requires more knowledge of Measure Theory than we can fit into this module.
If I = (a, b) is an interval, then L p (I) can be defined in terms of the integration
procedure developed earlier in this chapter. This definition is equivalent to the standard
one which will be given a bit later.
The Lebesgue space $L^p(I)$ is the space of all integrable functions such that
$$\|f\|_{L^p}^p = \int_I |f|^p < \infty$$
modulo the equivalence relation: $f = g$ iff $f(x) = g(x)$ a.e. We note that in this case $L^p(I) \subset L^1(I)$. The definition of $L^p(\mathbb{R})$ is slightly different. We say that $f \in L^p(\mathbb{R})$ if $f$ is locally integrable (i.e., $f \in L^1(I)$ for any interval $I$; see footnote 10) and its $p$th power is integrable. The norm is defined by the same formula:
$$\|f\|_{L^p}^p = \int_{\mathbb{R}} |f|^p < \infty .$$
We note that although $L^1(\mathbb{R}) \cap L^2(\mathbb{R}) \ne \emptyset$ (e.g. both spaces contain all step functions), neither of these spaces is a subset of the other. For example, $f(x) = 1/(1 + |x|)$ belongs to $L^2(\mathbb{R})$ but not to $L^1(\mathbb{R})$. Indeed, $\int f^2 < \infty$ but $\int f = \infty$, so it is not integrable on $\mathbb{R}$. On the other hand
$$g(x) = \frac{\chi_{(0,1)}(x)}{|x|^{1/2}}$$
belongs to $L^1(\mathbb{R})$ but not to $L^2(\mathbb{R})$.
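The two counterexamples can be checked with elementary antiderivatives. The following is only an illustrative sketch added to the notes: the helper names are hypothetical, and the integrals are truncated to $[-R, R]$ and $(\varepsilon, 1)$ so that the divergent ones can be watched growing.

```python
import math

def int_f(R):
    """Integral of f(x) = 1/(1+|x|) over [-R, R]; equals 2*log(1+R), unbounded in R."""
    return 2.0 * math.log1p(R)

def int_f_squared(R):
    """Integral of f(x)^2 over [-R, R]; equals 2*(1 - 1/(1+R)), which tends to 2."""
    return 2.0 * (1.0 - 1.0 / (1.0 + R))

def int_g(eps):
    """Integral of g(x) = x^(-1/2) over (eps, 1); equals 2*(1 - sqrt(eps)), tends to 2."""
    return 2.0 * (1.0 - math.sqrt(eps))

def int_g_squared(eps):
    """Integral of g(x)^2 = 1/x over (eps, 1); equals -log(eps), unbounded as eps -> 0."""
    return -math.log(eps)

for R in (1e2, 1e4, 1e8):
    print(f"R={R:.0e}: int f = {int_f(R):.3f}, int f^2 = {int_f_squared(R):.6f}")
for eps in (1e-2, 1e-4, 1e-8):
    print(f"eps={eps:.0e}: int g = {int_g(eps):.6f}, int g^2 = {int_g_squared(eps):.3f}")
```

The bounded columns stay below 2 while the other two grow logarithmically, which is the numerical shadow of $f \in L^2 \setminus L^1$ and $g \in L^1 \setminus L^2$.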
Theorem 5.18 L p (R) and L p (I) are Banach spaces for p ≥ 1 and any interval I.
We will not give a complete proof but sketch the main ideas instead.
Let $\frac{1}{p} + \frac{1}{q} = 1$ and $f \in L^p(\mathbb{R})$, $g \in L^q(\mathbb{R})$. Then the Hölder inequality states (see footnote 11) that
$$\int |fg| \le \|f\|_p \, \|g\|_q .$$
10 This is an important requirement. It is not sufficient to define $L^p$ as the set of all functions such that $|f|^p$ is integrable: this set would not be a vector space. Indeed, let $p = 2$, $g = \chi_{[0,1]}$ and let $f$ be the non-integrable function from footnote 9. Then $f^2 = g^2 = \chi_{[0,1]}$ is integrable. But $(f + g)^2 = f^2 + 2fg + g^2 = 2\chi_{[0,1]} + 2f$ is not integrable. Therefore, if we followed this definition, $f, g \in L^2$ would not imply $f + g \in L^2$.
11 We will not discuss the proof of this inequality in these lectures.
Note that the characteristic function $\chi_I \in L^q(\mathbb{R})$ for any interval $I$ and any $q \ge 1$; moreover $\|\chi_I\|_{L^q} = |I|^{1/q}$, where $|I| = b - a$ is the length of $I$. The Hölder inequality with $g = \chi_I$ implies that
$$\int \chi_I |f| = \int_I |f| \le |I|^{1/q} \, \|f\|_p .$$
The left hand side of this inequality is the norm of $f$ in $L^1(I)$:
1. Let $(f_k)_{k=1}^{\infty}$ be a sequence in $L^2(\mathbb{R})$ such that
$$\sum_{k=1}^{\infty} \|f_k\|_{L^2} < \infty .$$
If you look into a textbook, you will probably see a different-looking definition of the Lebesgue spaces. Traditionally a function $f$ is required to be measurable instead of locally integrable. Local integrability is a stronger property: every locally integrable function is measurable, but there are measurable functions which are not locally integrable, e.g. $x^{-2}$ is measurable but not locally integrable since $\int_{-1}^{1} x^{-2} = +\infty$. Nevertheless the two alternative definitions of the Lebesgue space are equivalent.
Let us discuss the notion of a measurable function from the perspective of our
definitions.
First we need to define the measure, which can be considered as a generalisation of the length of an interval. We say that a subset $A \subset \mathbb{R}$ has finite Lebesgue measure $\mu(A)$ if the characteristic function $\chi_A$ is Lebesgue integrable. Then
$$\mu(A) := \int \chi_A \ge 0 .$$
Obviously $\mu([a, b]) = b - a$ for an interval $[a, b]$, so its Lebesgue measure coincides with its length.
In order to study large sets (like $\mathbb{R}$) we need to extend this definition to allow measuring sets with infinitely large measures. We say that $A \subset \mathbb{R}$ is measurable if $\chi_A$ is locally integrable. In particular, if $A$ is measurable then $A_n = A \cap [-n, n]$ has finite measure for each $n$. Since $\mu(A_n)$ is an increasing sequence, the following limit exists (but can be $+\infty$):
$$\mu(A) = \lim_{n \to \infty} \mu(A_n) \le +\infty .$$
Note that if $\int \chi_A < \infty$, the MCT implies that the limit coincides with the previous definition of $\mu(A)$.
For example $\mathbb{R}$ is measurable and $\mu(\mathbb{R}) = +\infty$.
We note that sums, products and pointwise limits of measurable functions are measurable.
Consider the set of all measurable functions from $\mathbb{R}$ to $\mathbb{R}$ (or $\mathbb{C}$) whose absolute value raised to the $p$th power has a finite Lebesgue integral, i.e.,
$$\|f\|_{L^p} := \left( \int |f|^p \right)^{1/p} < \infty .$$
This space modulo the equivalence relation “$f = g$ iff $f(x) = g(x)$ a.e.” is called the Lebesgue space $L^p(\mathbb{R})$.
6 Hilbert spaces
6.1 Inner product spaces
You have already seen the inner product on Rn .
• If $K = \mathbb{C}$, then (iv) with $y = x$ implies that $(x, x)$ is real and therefore the requirement $(x, x) \ge 0$ makes sense.
6.2 Natural norms
Every inner product space is a normed space as well.
Proposition 6.2 If $V$ is an inner product space, then
$$\|v\| = \sqrt{(v, v)}$$
defines a norm on $V$.
Definition 6.3 We say that $\|x\| = \sqrt{(x, x)}$ is the natural norm induced by the inner product.
Lemma 6.4 (Cauchy-Schwarz inequality) If $V$ is an inner product space and $\|v\| = \sqrt{(v, v)}$ for all $v \in V$, then
$$|(x, y)| \le \|x\| \, \|y\| \quad \text{for all } x, y \in V .$$
Proof of the lemma: The inequality is obvious if $y = 0$. So suppose that $y \ne 0$. Then for any $\lambda \in K$:
$$0 \le (x - \lambda y, x - \lambda y) = (x, x) - \lambda (y, x) - \bar{\lambda} (x, y) + |\lambda|^2 (y, y) .$$
Then substitute $\lambda = (x, y)/\|y\|^2$:
$$0 \le (x, x) - 2\,\frac{|(x, y)|^2}{\|y\|^2} + \frac{|(x, y)|^2}{\|y\|^2} = \|x\|^2 - \frac{|(x, y)|^2}{\|y\|^2} ,$$
which implies the desired inequality.
Proof of the proposition: Now we can complete the proof of Proposition 6.2. We note that positive definiteness and homogeneity of $\|\cdot\|$ easily follow from (i), (iii) and (iv) in the definition of the inner product. In order to establish the triangle inequality we use the Cauchy-Schwarz inequality. Let $x, y \in V$. Then
$$\|x + y\|^2 = (x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y) \le \|x\|^2 + 2\|x\| \, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2 ,$$
and the triangle inequality follows by taking the square root.
Therefore $\|\cdot\|$ is a norm.
We have already proved the Cauchy-Schwarz inequality for $\ell^2(K)$ using a different strategy (see Lemma 2.4).
The Cauchy-Schwarz inequality in $L^2(a, b)$ takes the form
$$\left| \int_a^b f(x) \overline{g(x)} \, dx \right| \le \left( \int_a^b |f(x)|^2 \, dx \right)^{1/2} \left( \int_a^b |g(x)|^2 \, dx \right)^{1/2} .$$
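The inequality in $L^2(a, b)$ can be sanity-checked numerically. The sketch below is an illustration only: it assumes the inner product $(f, g) = \int_a^b f \bar{g}$, replaces the integrals by a midpoint rule, and uses the hypothetical test functions $f(x) = x$ and $g(x) = e^x$ on $(0, 1)$.

```python
import math

def midpoint(h, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (i + 0.5) * step) for i in range(n))

def inner(f, g, a, b):
    """Approximate L^2(a, b) inner product (f, g) = integral of f * conj(g)."""
    return midpoint(lambda x: f(x) * complex(g(x)).conjugate(), a, b)

def norm(f, a, b):
    """Approximate natural norm sqrt((f, f))."""
    return math.sqrt(inner(f, f, a, b).real)

f = lambda x: x
g = lambda x: math.exp(x)
lhs = abs(inner(f, g, 0.0, 1.0))
rhs = norm(f, 0.0, 1.0) * norm(g, 0.0, 1.0)
print(f"|(f, g)| = {lhs:.6f} <= ||f|| ||g|| = {rhs:.6f}")
```

For this pair the left hand side is $\int_0^1 x e^x \, dx = 1$, while the right hand side is $\sqrt{(e^2 - 1)/6}$, slightly larger than 1.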
Lemma 6.5 If V is an inner product space equipped with the natural norm, then xn →
x and yn → y imply that
(xn , yn ) → (x, y) .
The lemma implies that we can swap inner products and limits.
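Lemma 6.6 (Parallelogram law) If $V$ is an inner product space with the natural norm $\|\cdot\|$, then
$$\|x + y\|^2 + \|x - y\|^2 = 2\left( \|x\|^2 + \|y\|^2 \right) \quad \text{for all } x, y \in V .$$
Proof: The linearity of the inner product implies that for any $x, y \in V$
$$\|x + y\|^2 + \|x - y\|^2 = (x + y, x + y) + (x - y, x - y)$$
$$= (x, x) + (x, y) + (y, x) + (y, y) + (x, x) - (x, y) - (y, x) + (y, y)$$
$$= 2\left( \|x\|^2 + \|y\|^2 \right) .$$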
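Example (some norms are not induced by an inner product): There is no inner product which induces the following norms on $C[0, 1]$:
$$\|f\|_\infty = \sup_{t \in [0,1]} |f(t)| \quad \text{or} \quad \|f\|_{L^1} = \int_0^1 |f(t)| \, dt .$$
Indeed, these norms do not satisfy the parallelogram law. For example, take $f(x) = x$ and $g(x) = 1 - x$; obviously $f, g \in C[0, 1]$ and, in the supremum norm, $\|f\|_\infty = \|g\|_\infty = \|f + g\|_\infty = \|f - g\|_\infty = 1$, so that $\|f + g\|_\infty^2 + \|f - g\|_\infty^2 = 2 \ne 4 = 2(\|f\|_\infty^2 + \|g\|_\infty^2)$.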
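The failure of the parallelogram law for $f(x) = x$ and $g(x) = 1 - x$ can be confirmed numerically for both norms. This is a sketch under stated assumptions: the supremum is approximated by sampling a uniform grid, the $L^1$ norm by a midpoint rule, and all function names are hypothetical.

```python
def sup_norm(h, n=10_001):
    """Approximate sup of |h| on [0, 1] by sampling a uniform grid (endpoints included)."""
    return max(abs(h(i / (n - 1))) for i in range(n))

def l1_norm(h, n=100_000):
    """Midpoint-rule approximation of the integral of |h| over [0, 1]."""
    return sum(abs(h((i + 0.5) / n)) for i in range(n)) / n

def parallelogram_sides(norm, f, g):
    """Return (||f+g||^2 + ||f-g||^2, 2(||f||^2 + ||g||^2)) for the given norm."""
    lhs = norm(lambda t: f(t) + g(t)) ** 2 + norm(lambda t: f(t) - g(t)) ** 2
    rhs = 2 * (norm(f) ** 2 + norm(g) ** 2)
    return lhs, rhs

f = lambda t: t
g = lambda t: 1 - t
sup_lhs, sup_rhs = parallelogram_sides(sup_norm, f, g)
l1_lhs, l1_rhs = parallelogram_sides(l1_norm, f, g)
print("sup norm:", sup_lhs, "vs", sup_rhs)
print("L1 norm: ", l1_lhs, "vs", l1_rhs)
```

In the supremum norm the two sides are 2 and 4; in the $L^1$ norm they are $5/4$ and $1$. In both cases the identity fails, so neither norm comes from an inner product.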
Lemma 6.7 (Polarisation identity) Let $V$ be an inner product space with the natural norm $\|\cdot\|$. Then
1. If $V$ is real
$$4(x, y) = \|x + y\|^2 - \|x - y\|^2 ;$$
2. If $V$ is complex
$$4(x, y) = \|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2 .$$
Proof: Plug the definition of the natural norm into the right hand side and use linearity of the inner product.
Lemma 6.7 shows that the inner product can be restored from its natural norm. Although the right hand sides of the polarisation identities are meaningful for any norm, we should not rush to the conclusion that any normed space is automatically an inner product space. Indeed, the example above implies that for some norms these formulae cannot define an inner product. Nevertheless, if the norm satisfies the parallelogram law, we indeed get an inner product:
Proposition 6.8 Let $V$ be a real normed space with the norm $\|\cdot\|$ satisfying the parallelogram law. Then
$$(x, y) = \frac{\|x + y\|^2 - \|x - y\|^2}{4} = \frac{\|x + y\|^2 - \|x\|^2 - \|y\|^2}{2}$$
defines an inner product on $V$ (see footnote 12).
Proof: Let us check that $(x, y)$ satisfies the axioms of an inner product. Positivity and symmetry are straightforward (Exercise). The linearity:
for any $m \in \mathbb{Z}$ and $n \in \mathbb{N}$. Consequently, for any rational $\lambda = \frac{m}{n}$
$$(\lambda x, y) = \lambda (x, y) .$$
We note that the right hand side of the definition involves the norms only, which com-
mute with the limits. Any real number is a limit of rational numbers and therefore the
linearity holds for all λ ∈ R.
7 Orthonormal bases in Hilbert spaces
The goal of this section is to discuss properties of orthonormal bases in a Hilbert space $H$. Unlike Hamel bases, the orthonormal ones involve a countable number of elements, i.e. a vector $x$ is represented in the form of an infinite sum
$$x = \sum_{k=1}^{\infty} \alpha_k e_k$$
for some $\alpha_k \in K$.
We will mainly consider complex spaces with K = C. The real case K = R is not
very different. We will use (·, ·) to denote an inner product on H, and k · k will stand
for the natural norm induced by the inner product.
Definition 7.3 A set $E$ is orthonormal if $\|e\| = 1$ for all $e \in E$ and $(e_1, e_2) = 0$ for all $e_1, e_2 \in E$ such that $e_1 \ne e_2$.
Note that this definition does not require the set E to be countable.
Exercise: Any orthonormal set is linearly independent.
Indeed, suppose $\sum_{k=1}^{n} \alpha_k e_k = 0$ with $e_k \in E$ and $\alpha_k \in K$. Taking the inner product of this equality with $e_j$ we get
$$0 = \left( \sum_{k=1}^{n} \alpha_k e_k , e_j \right) = \sum_{k=1}^{n} \alpha_k (e_k, e_j) = \alpha_j .$$
Definition 7.4 (Kronecker delta) The Kronecker delta is the function defined by
$$\delta_{jk} = \begin{cases} 1, & \text{if } j = k, \\ 0, & \text{if } j \ne k . \end{cases}$$
Example: For every $j \in \mathbb{N}$, let $e_j = (\delta_{jk})_{k=1}^{\infty}$ (it is an infinite sequence of zeros with 1 at the $j$th position). The set $E = \{ e_j : j \in \mathbb{N} \}$ is orthonormal in $\ell^2$. Indeed, from the definition of the scalar product in $\ell^2$ we see that $(e_j, e_k) = \delta_{jk}$ for all $j, k \in \mathbb{N}$.
and if $j \ne k$
$$(f_k, f_j) = \int_{-\pi}^{\pi} f_k(x) \overline{f_j(x)} \, dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(k-j)x} \, dx = \left. \frac{e^{i(k-j)x}}{2\pi i (k - j)} \right|_{x=-\pi}^{x=\pi} = 0 .$$
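The orthogonality computation above can be checked numerically. The definition of $f_k$ is not visible in this part of the text, so the sketch assumes $f_k(x) = e^{ikx}/\sqrt{2\pi}$ on $(-\pi, \pi)$, which is consistent with the factor $\frac{1}{2\pi}$ in the formula; the integral is approximated by a midpoint rule.

```python
import cmath
import math

def f(k):
    """Assumed normalisation: f_k(x) = exp(i k x) / sqrt(2 pi) on (-pi, pi)."""
    return lambda x: cmath.exp(1j * k * x) / math.sqrt(2 * math.pi)

def inner(u, v, n=2000):
    """Midpoint approximation of the integral of u * conj(v) over (-pi, pi)."""
    step = 2 * math.pi / n
    return step * sum(
        u(x) * v(x).conjugate()
        for x in (-math.pi + (i + 0.5) * step for i in range(n))
    )

indices = range(-2, 3)
gram = {(k, j): inner(f(k), f(j)) for k in indices for j in indices}
for k in indices:
    # Each printed row is one row of the Gram matrix (absolute values).
    print([round(abs(gram[(k, j)]), 12) for j in indices])
```

The printed Gram matrix is the identity up to rounding, i.e. $(f_k, f_j) = \delta_{kj}$ for the sampled indices.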
Lemma 7.5 If $\{e_1, \dots, e_n\}$ is an orthonormal set in an inner product space $V$, then for any $\alpha_j \in K$
$$\left\| \sum_{j=1}^{n} \alpha_j e_j \right\|^2 = \sum_{j=1}^{n} |\alpha_j|^2 .$$
Proof: Let $e_1 = \frac{v_1}{\|v_1\|}$. Then
$$\operatorname{Span}\{ v_1 \} = \operatorname{Span}\{ e_1 \}$$
and the statement is true for $n = 1$ as the set $E_1 = \{ e_1 \}$ is obviously orthonormal.
Then we continue inductively (see footnote 14). Suppose that for some $k \ge 2$ we have found an orthonormal set $E_{k-1} = \{ e_1, \dots, e_{k-1} \}$ such that its span coincides with the span of $\{ v_1, \dots, v_{k-1} \}$. Then set
$$\tilde{e}_k = v_k - \sum_{j=1}^{k-1} (v_k, e_j) e_j ,$$
which implies that $\tilde{e}_k \perp e_j$ for all $j < k$. Finally let $e_k = \tilde{e}_k / \|\tilde{e}_k\|$. Then $\{ e_1, \dots, e_k \}$ is an orthonormal set such that
$$\operatorname{Span}\{ e_1, \dots, e_k \} = \operatorname{Span}\{ v_1, \dots, v_k \} .$$
If the original sequence is finite, the orthonormalisation procedure will stop after a finite number of steps. Otherwise, we get an infinite sequence of $e_k$.
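The inductive construction in the proof translates directly into a short procedure. The following is a minimal sketch for vectors in $\mathbb{C}^n$ with the standard inner product (linear in the first argument, as in these notes); the function names and the tolerance `tol` are illustrative choices, not part of the lectures.

```python
import math

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise a list of linearly independent vectors, one at a time."""
    basis = []
    for v in vectors:
        # e~_k = v_k - sum_{j<k} (v_k, e_j) e_j
        e_tilde = list(v)
        for e in basis:
            c = inner(v, e)
            e_tilde = [a - c * b for a, b in zip(e_tilde, e)]
        norm = math.sqrt(inner(e_tilde, e_tilde).real)
        if norm < tol:
            raise ValueError("vectors are not linearly independent")
        # e_k = e~_k / ||e~_k||
        basis.append([a / norm for a in e_tilde])
    return basis

E = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for e in E:
    print([round(c.real, 6) for c in e])
```

The check `norm < tol` plays the role of the observation in the footnote that $\tilde{e}_k \ne 0$ for linearly independent input; in floating point arithmetic a small threshold is used instead of an exact comparison with zero.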
Corollary 7.7 Any infinite-dimensional inner product space contains a countable or-
thonormal sequence.
Corollary 7.8 Any finite-dimensional inner product space has an orthonormal basis.
Proposition 7.9 Any finite dimensional inner product space is isometric to Cn (or Rn
if the space is real) equipped with the standard inner product.
Proof: Let $n = \dim V$ and $e_j$, $j = 1, \dots, n$, be an orthonormal basis in $V$. Note that $(e_k, e_j) = \delta_{kj}$. Any two vectors $x, y \in V$ can be written as
$$x = \sum_{k=1}^{n} x_k e_k \quad \text{and} \quad y = \sum_{j=1}^{n} y_j e_j .$$
Then
$$(x, y) = \left( \sum_{k=1}^{n} x_k e_k , \sum_{j=1}^{n} y_j e_j \right) = \sum_{k=1}^{n} \sum_{j=1}^{n} x_k \bar{y}_j (e_k, e_j) = \sum_{k=1}^{n} x_k \bar{y}_k .$$
Therefore the map $x \mapsto (x_1, \dots, x_n)$ is an isometry.
We see that an arbitrary inner product, when written in orthonormal coordinates,
takes the form of the “canonical” inner product on Cn (or Rn if the original space is
real).
14 For example, let $k = 2$. We define $\tilde{e}_2 = v_2 - (v_2, e_1) e_1$. Then $(\tilde{e}_2, e_1) = (v_2, e_1) - (v_2, e_1)(e_1, e_1) = 0$. Since $v_1, v_2$ are linearly independent, $\tilde{e}_2 \ne 0$. So we can define $e_2 = \frac{\tilde{e}_2}{\|\tilde{e}_2\|}$.
7.3 Bessel’s inequality
Lemma 7.10 (Bessel’s inequality) If $V$ is an inner product space and $E = (e_k)_{k=1}^{\infty}$ is an orthonormal sequence, then for every $x \in V$
$$\sum_{k=1}^{\infty} |(x, e_k)|^2 \le \|x\|^2 .$$
Corollary 7.11 If $E$ is an orthonormal set in an inner product space $V$, then for any $x \in V$ the set
$$E_x = \{ e \in E : (x, e) \ne 0 \}$$
is at most countable.
Proof: For any $m \in \mathbb{N}$ the set $E_m = \{ e \in E : |(x, e)| > \frac{1}{m} \}$ has a finite number of elements. Otherwise there would be an infinite sequence $(e_k)_{k=1}^{\infty}$ of distinct elements $e_k \in E_m$, and then the series $\sum_{k=1}^{\infty} |(x, e_k)|^2 = +\infty$, which contradicts Bessel’s inequality. Therefore $E_x = \bigcup_{m=1}^{\infty} E_m$ is a countable union of finite sets and hence at most countable.
7.4 Convergence
In this section we will discuss convergence of series which involve elements from an
orthonormal set.
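Lemma 7.12 Let $H$ be a Hilbert space and $E = (e_k)_{k=1}^{\infty}$ an orthonormal sequence. The series $\sum_{k=1}^{\infty} \alpha_k e_k$ converges iff $\sum_{k=1}^{\infty} |\alpha_k|^2 < +\infty$. Then
$$\left\| \sum_{k=1}^{\infty} \alpha_k e_k \right\|^2 = \sum_{k=1}^{\infty} |\alpha_k|^2 . \qquad (7.1)$$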
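Proof: Let $x_n = \sum_{k=1}^{n} \alpha_k e_k$ and $\beta_n = \sum_{k=1}^{n} |\alpha_k|^2$. Lemma 7.5 implies that $\|x_n\|^2 = \beta_n$ and that for any $n > m$
$$\|x_n - x_m\|^2 = \left\| \sum_{k=m+1}^{n} \alpha_k e_k \right\|^2 = \sum_{k=m+1}^{n} |\alpha_k|^2 = \beta_n - \beta_m .$$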
converges unconditionally.
Proof: Exercise.
If $E$ is a basis, then it is a linearly independent set. Indeed, if $\sum_{k=1}^{n} \alpha_k e_k = 0$ then $\alpha_k = 0$ due to the uniqueness.
Note that in this definition the uniqueness is a delicate point. Indeed, the sum $\sum_{k=1}^{\infty} \alpha_k e_k$ is defined as a limit of partial sums $x_n = \sum_{k=1}^{n} \alpha_k e_k$. A permutation of $e_k$ changes the partial sums and may lead to a different limit. In general, we cannot even guarantee that after a permutation the series remains convergent.
If E is countable, we can assume that the sum involves all elements of the basis
(some αk can be zero) and that the summation is taken following the order of a selected
enumeration of E. The situation is more difficult if E is uncountable since in this case
there is no natural way of numbering the elements.
The situation is much simpler if $E$ is orthonormal, as in this case the series converges unconditionally and the order of summation is not important.
(a) $E$ is a basis in $H$;
(b) $x = \sum_{k=1}^{\infty} (x, e_k) e_k$ for all $x \in H$;
(c) $\|x\|^2 = \sum_{k=1}^{\infty} |(x, e_k)|^2$;
Proof:
(a) $\iff$ (b): use Lemma 7.15.
(b) $\implies$ (c): use Lemma 7.12.
(c) $\implies$ (d): Let $(x, e_k) = 0$ for all $k$; then (c) implies that $\|x\| = 0$, hence $x = 0$.
(d) $\implies$ (b): let $y = x - \sum_{k=1}^{\infty} (x, e_k) e_k$. Corollary 7.14 implies that the series converges. Then Lemma 6.5 implies we can swap the limit and the inner product to get for every $n$
$$(y, e_n) = \left( x - \sum_{k=1}^{\infty} (x, e_k) e_k , e_n \right) = (x, e_n) - \sum_{k=1}^{\infty} (x, e_k)(e_k, e_n) = (x, e_n) - (x, e_n) = 0 .$$
Since $(y, e_n) = 0$ for all $n$, (d) implies that $y = 0$, which is equivalent to $x = \sum_{k=1}^{\infty} (x, e_k) e_k$ as required.
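(e) $\implies$ (d): since $\operatorname{Span}(E)$ is dense in $H$, for any $x \in H$ there is a sequence $x_n \in \operatorname{Span}(E)$ such that $x_n \to x$. Take $x$ such that $(x, e_n) = 0$ for all $n$. Then $(x_n, x) = 0$ and consequently
$$\|x\|^2 = \left( \lim_{n \to \infty} x_n , x \right) = \lim_{n \to \infty} (x_n, x) = 0 .$$
Therefore $x = 0$.
(a) $\implies$ (e): Since $E$ is a basis, any $x = \lim_{n \to \infty} x_n$ with $x_n = \sum_{k=1}^{n} \alpha_k e_k \in \operatorname{Span}(E)$.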
Example: The orthonormal sets from examples of Section 7.1 are also examples of
orthonormal bases.
8 Closest points and approximations
8.1 Closest points in convex subsets
Definition 8.1 A subset $A$ of a vector space $V$ is convex if $\lambda x + (1 - \lambda) y \in A$ for any two vectors $x, y \in A$ and any $\lambda \in [0, 1]$.
Lemma 8.2 Let $A$ be a non-empty closed convex subset of a Hilbert space $H$ and $x \in H$. Then there is a unique $a^* \in A$ such that
$$\|x - a^*\| = \inf_{a \in A} \|x - a\| .$$
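Then
$$\|u - v\|^2 = 2\|x - u\|^2 + 2\|x - v\|^2 - 4\|x - \tfrac{1}{2}(u + v)\|^2 .$$
Let $d = \inf_{a \in A} \|x - a\|$. Since $A$ is convex, $\tfrac{1}{2}(u + v) \in A$ for any $u, v \in A$, and consequently $\|x - \tfrac{1}{2}(u + v)\| \ge d$. Then
$$\|u - v\|^2 \le 2\|x - u\|^2 + 2\|x - v\|^2 - 4d^2 . \qquad (8.1)$$
Since $d$ is the infimum, for any $n$ there is $a_n \in A$ such that $\|x - a_n\|^2 < d^2 + \frac{1}{n}$. Then equation (8.1) implies that
$$\|a_n - a_m\|^2 \le 2d^2 + \frac{2}{n} + 2d^2 + \frac{2}{m} - 4d^2 = \frac{2}{n} + \frac{2}{m} .$$
Consequently $(a_n)$ is Cauchy and, since $H$ is complete, it converges to some $a^*$. Since $A$ is closed, $a^* \in A$. Then
$$\|x - a^*\|^2 = \lim_{n \to \infty} \|x - a_n\|^2 = d^2 .$$
Therefore $a^*$ is the point closest to $x$. Now suppose that there is another point $\tilde{a} \in A$ such that $\|x - \tilde{a}\| = d$; then (8.1) implies
$$\|a^* - \tilde{a}\|^2 \le 2d^2 + 2d^2 - 4d^2 = 0 .$$
So $\tilde{a} = a^*$ and $a^*$ is unique.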
8.2 Orthogonal complements
In an infinite dimensional space a linear subspace does not need to be closed. For example, the space $\ell_f$ of all sequences with only a finite number of non-zero elements is a linear subspace of $\ell^2$ but it is not closed in $\ell^2$ (e.g. consider the sequence $x_n = (1, 2^{-1}, 2^{-2}, \dots, 2^{-n}, 0, 0, \dots)$).
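To see why this sequence shows that $\ell_f$ is not closed, note that $x_n$ converges in $\ell^2$ to $(2^{-k})_{k \ge 0}$, which has infinitely many non-zero entries. The sketch below illustrates this; it relies on the geometric series $\sum_{k > n} 4^{-k} = 4^{-n}/3$, and the function names are hypothetical.

```python
import math

def x_n(n):
    """Finitely supported sequence (1, 1/2, ..., 2^-n), stored without trailing zeros."""
    return [2.0 ** (-k) for k in range(n + 1)]

def tail_distance(n):
    """l^2 distance from x_n to the limit (2^-k)_{k>=0}: sqrt(4^-n / 3) = 2^-n / sqrt(3)."""
    return 2.0 ** (-n) / math.sqrt(3.0)

def distance(n, m):
    """l^2 distance between x_n and x_m, summed over the entries where they differ."""
    lo, hi = sorted((n, m))
    return math.sqrt(sum(4.0 ** (-k) for k in range(lo + 1, hi + 1)))

for n in (1, 5, 10, 20):
    print(n, len(x_n(n)), tail_distance(n), distance(n, 2 * n))
```

The distances shrink geometrically, so $(x_n)$ is Cauchy in $\ell^2$ and every $x_n$ lies in $\ell_f$, but the limit does not.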
Exercises:
1. If E is a basis in H, then E ⊥ = { 0 }.
2. If Y ⊆ X, then X ⊥ ⊆ Y ⊥ .
3. X ⊆ (X ⊥ )⊥
Definition 8.5 The closed linear span of E ⊂ H is a minimal closed set which contains
Span(E):
Theorem 8.7 If U is a closed linear subspace of a Hilbert space H then
1. any x ∈ H can be written uniquely in the form x = u + v with u ∈ U and v ∈ U ⊥ .
2. u is the closest point to x in U.
3. The map $P_U : H \to U$ defined by $P_U x = u$ is linear and satisfies
$$P_U^2 x = P_U x \quad \text{and} \quad \|P_U(x)\| \le \|x\| \quad \text{for all } x \in H .$$
Let $v = x - u$. Let us show that $v \in U^\perp$. Indeed, take any $y \in U$ and consider the function $\Delta : \mathbb{C} \to \mathbb{R}$ defined by
$$\Delta(t) = \|v + ty\|^2 = \|x - (u - ty)\|^2 .$$
Since the definition of $u$ together with $u - ty \in U$ imply that $\Delta(t) \ge \Delta(0) = \|x - u\|^2$, the function $\Delta$ has a minimum at $t = 0$. On the other hand
$$\Delta(t) = \|v + ty\|^2 = (v + ty, v + ty) = (v, v) + t(y, v) + \bar{t}(v, y) + |t|^2 (y, y) .$$
First suppose that $t$ is real. Then $\bar{t} = t$ and $\frac{d\Delta}{dt}(0) = 0$ implies
$$(y, v) + (v, y) = 0 .$$
Then suppose that $t$ is purely imaginary. Then $\bar{t} = -t$ and $\frac{d\Delta}{dt}(0) = 0$ implies
$$(y, v) - (v, y) = 0 .$$
Taking the sum of these two equalities we conclude
$$(y, v) = 0 \quad \text{for every } y \in U .$$
Therefore $v \in U^\perp$.
In order to prove the uniqueness of the representation suppose $x = u_1 + v_1 = u + v$ with $u_1, u \in U$ and $v_1, v \in U^\perp$. Then $u_1 - u = v - v_1$. Since $u - u_1 \in U$ and $v - v_1 \in U^\perp$,
$$\|v - v_1\|^2 = (v - v_1, v - v_1) = (v - v_1, u_1 - u) = 0 .$$
Therefore $u$ and $v$ are unique.
Finally $x = u + v$ with $u \perp v$ implies $\|x\|^2 = \|u\|^2 + \|v\|^2$. Consequently $\|P_U(x)\| = \|u\| \le \|x\|$. We also note that $P_U(u) = u$ for any $u \in U$. So $P_U^2(x) = P_U(x)$ as $P_U(x) \in U$.
Corollary 8.9 If U is a closed linear subspace in a Hilbert space H and x ∈ H, then
PU (x) is the closest point to x in U.
8.3 Best approximations
Theorem 8.10 Let $E$ be an orthonormal sequence: $E = \{ e_j : j \in J \}$ where $J$ is either a finite or a countable set. Then for any $x \in H$, the closest point to $x$ in $\operatorname{Span}(E)$ is given by
$$y = \sum_{j \in J} (x, e_j) e_j .$$
$$P_U(x) = \sum_{j \in J} (x, e_j) e_j .$$
Proof: Corollary 7.14 implies that u = ∑ j∈J (x, e j )e j converges. Then obviously u ∈
Span(E) which is a closed linear subset. Let v = x−u. Since (v, ek ) = (x, ek )−(u, ek ) =
0 for all k ∈ J, we conclude v ∈ E ⊥ = (Span(E))⊥ (Lemma 8.6). Theorem 8.7 implies
that u is the closest point.
Now suppose that the set E is not orthonormal. If the set E is finite or countable we
can use the Gram-Schmidt orthonormalisation procedure to construct an orthonormal
basis in Span(E). After that the theorem above gives us an explicit expression for the
best approximation. Let’s consider some examples.
The set E is not orthonormal. Let’s apply the Gram-Schmidt orthonormalisation pro-
cedure to construct an orthonormal basis in Span(E). For the sake of shortness, let’s
write k · k = k · kL2 (−1,1) .
First note that $\|1\| = \sqrt{2}$ and let
$$e_1 = \frac{1}{\sqrt{2}} .$$
Then $(1, x) = \int_{-1}^{1} x \, dx = 0$ and $\|x\|^2 = \int_{-1}^{1} |x|^2 \, dx = \frac{2}{3}$, so let
$$e_2 = \sqrt{\frac{3}{2}} \, x .$$
Then
$$\frac{7}{8}(5x^3 - 3x) \int_{-1}^{1} f(t)(5t^3 - 3t) \, dt + \frac{5}{8}(3x^2 - 1) \int_{-1}^{1} f(t)(3t^2 - 1) \, dt + \frac{3}{2} x \int_{-1}^{1} t f(t) \, dt + \frac{1}{2} \int_{-1}^{1} f(t) \, dt$$
For example, if $f(x) = |x|$ its best approximation by a third degree polynomial is
$$p_3 = \frac{15x^2 + 3}{16} .$$
We can check (after computing the corresponding integral) that
$$\|f - p_3\|^2 = \frac{1}{96} .$$
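The best approximation of $f(x) = |x|$ in $L^2(-1, 1)$ can be recomputed in exact rational arithmetic. The sketch below is an illustration, not the method of the lectures: it projects onto the unnormalised orthogonal polynomials $1$, $x$, $3x^2 - 1$, $5x^3 - 3x$ (the same polynomials that appear in the formula above) and uses $\int_{-1}^{1} |x| x^m \, dx = \frac{2}{m+2}$ for even $m$. All helper names are hypothetical.

```python
from fractions import Fraction as F

# Polynomials are coefficient lists [c0, c1, c2, ...] with exact rational entries.
P = [[F(1)], [F(0), F(1)], [F(-1), F(0), F(3)], [F(0), F(-3), F(0), F(5)]]

def mul(p, q):
    """Product of two polynomials."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integral(p):
    """Exact integral of the polynomial p over (-1, 1); odd powers integrate to zero."""
    return sum(c * F(2, m + 1) for m, c in enumerate(p) if m % 2 == 0)

def integral_abs(p):
    """Exact integral of |x| * p(x) over (-1, 1); odd powers integrate to zero."""
    return sum(c * F(2, m + 2) for m, c in enumerate(p) if m % 2 == 0)

# Projection of f(x) = |x| onto Span{1, x, x^2, x^3}: sum of (f, q)/(q, q) * q.
p3 = [F(0)] * 4
for q in P:
    coeff = integral_abs(q) / integral(mul(q, q))
    for m, c in enumerate(q):
        p3[m] += coeff * c

# ||f - p3||^2 = ||f||^2 - 2 (f, p3) + ||p3||^2, with ||f||^2 = 2/3.
err_sq = F(2, 3) - 2 * integral_abs(p3) + integral(mul(p3, p3))
print(p3, err_sq)
```

With these inputs the coefficients come out as $p_3 = \frac{3}{16} + \frac{15}{16} x^2$ and the squared $L^2$ error as $\frac{1}{96}$. The odd coefficients vanish because $|x|$ is even.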
Note that the best approximation in the $L^2$ norm is not necessarily the best approximation in the sup norm. Indeed, for example,
$$\sup_{x \in [-1,1]} \left| |x| - \frac{15x^2 + 3}{16} \right| \ge \frac{3}{16}$$
(the supremum is not smaller than the value at $x = 0$). At the same time
$$\sup_{x \in [-1,1]} \left| |x| - x^2 - \frac{1}{8} \right| = \frac{1}{8} .$$
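The comparison of the two sup-norm errors can be reproduced on a grid. This is only a numerical sketch: the supremum is approximated by the maximum over a uniform grid on $[-1, 1]$ chosen so that it contains $x = 0$ and $x = \pm 1$, and the names `sup_error`, `l2_best` and `other` are hypothetical.

```python
def sup_error(p, n=100_001):
    """Approximate sup over [-1, 1] of | |x| - p(x) | on a uniform grid containing x = 0."""
    xs = (-1 + 2 * i / (n - 1) for i in range(n))
    return max(abs(abs(x) - p(x)) for x in xs)

l2_best = lambda x: (15 * x * x + 3) / 16   # best L^2 approximation from the text
other = lambda x: x * x + 1 / 8             # the competing quadratic from the text

err_l2 = sup_error(l2_best)
err_other = sup_error(other)
print(err_l2, err_other)
```

On this grid the first error is $3/16$, attained at $x = 0$, and the second is $1/8$, so the quadratic $x^2 + \frac{1}{8}$ is the better approximation in the sup norm even though it is not the best one in $L^2$.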
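We get
$$\sum_{p=0}^{n} r_p(x) = 1 ,$$
$$\sum_{p=0}^{n} p \, r_p(x) = nx ,$$
$$\sum_{p=0}^{n} p(p - 1) \, r_p(x) = n(n - 1)x^2 .$$
Consequently,
$$\sum_{p=0}^{n} (p - nx)^2 r_p(x) = \sum_{p=0}^{n} p^2 r_p(x) - 2nx \sum_{p=0}^{n} p \, r_p(x) + n^2 x^2 \sum_{p=0}^{n} r_p(x) = n(n-1)x^2 + nx - 2n^2x^2 + n^2x^2 = nx(1 - x) .$$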
Let M = supx∈[0,1] | f (x)|. Note that f is uniformly continuous, i.e., for every ε > 0
there is δ > 0 such that
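The second sum is bounded by
$$\left| \sum_{|x - p/n| > \delta} \left( f(x) - f(p/n) \right) r_p(x) \right| \le 2M \sum_{|nx - p| > n\delta} r_p(x) \le 2M \sum_{p=0}^{n} \frac{(p - nx)^2}{n^2 \delta^2} \, r_p(x) = \frac{2M x(1 - x)}{n \delta^2} \le \frac{2M}{n \delta^2} ,$$
which is less than $\varepsilon$ for any $n > \frac{2M}{\delta^2 \varepsilon}$. Therefore for these values of n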
Consequently,
Corollary 8.13 The set of polynomials is dense in C[0, 1] equipped with the supremum
norm.
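The proof can be watched at work numerically. The definition of $r_p$ is not visible in this part of the text, so the sketch below assumes the usual Bernstein basis $r_p(x) = \binom{n}{p} x^p (1 - x)^{n - p}$ and the approximant $\sum_{p=0}^{n} f(p/n) \, r_p(x)$; the test function $f(x) = |x - \tfrac{1}{2}|$ and the evaluation grid are illustrative choices.

```python
import math

def r(n, p, x):
    """Assumed Bernstein basis polynomial r_p(x) = C(n, p) x^p (1 - x)^(n - p)."""
    return math.comb(n, p) * x ** p * (1 - x) ** (n - p)

def bernstein(f, n, x):
    """Value at x of the n-th Bernstein polynomial of f on [0, 1]."""
    return sum(f(p / n) * r(n, p, x) for p in range(n + 1))

def sup_error(f, n, grid=201):
    """Maximum of |f - B_n f| over a uniform grid on [0, 1]."""
    return max(abs(f(i / (grid - 1)) - bernstein(f, n, i / (grid - 1))) for i in range(grid))

f = lambda x: abs(x - 0.5)
n, x = 50, 0.3
moments = (
    sum(r(n, p, x) for p in range(n + 1)),                     # should equal 1
    sum(p * r(n, p, x) for p in range(n + 1)),                 # should equal n x
    sum((p - n * x) ** 2 * r(n, p, x) for p in range(n + 1)),  # should equal n x (1 - x)
)
errors = [sup_error(f, m) for m in (10, 40, 160)]
print(moments, errors)
```

The three moments reproduce the identities used in the proof, and the sup-norm error decreases as $n$ grows, although slowly: Bernstein polynomials give a constructive proof of density rather than an efficient approximation scheme.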
9 Separable Hilbert spaces
9.1 Definition and examples
Definition 9.1 A normed space is separable if it contains a countable dense subset.
kxn − uk < ε .
Example: The space C[0, 1] is separable. Indeed, the Weierstrass approximation the-
orem states that every continuous function can be approximated (in the sup norm) by
a polynomial. The dense countable set is given by polynomials with rational coeffi-
cients.
9.2 Isometry to `2
If H is a Hilbert space, then its separability is equivalent to existence of a countable
orthonormal basis.
Proof: If a Hilbert space has a countable basis, then we can construct a countable dense
set by taking finite linear combinations of the basis elements with rational coefficients.
Therefore the space is separable.
If H is separable, then it contains a countable dense subset V = {xn : n ∈ N}.
Obviously, the closed linear span of $V$ coincides with $H$. First we construct a linearly independent set $\tilde{V}$ which has the same linear span as $V$ by eliminating from $V$ those $x_n$ which are linear combinations of $\{ x_1, \dots, x_{n-1} \}$. Then the Gram-Schmidt process gives an orthonormal sequence with the same closed linear span, i.e., it is a basis by characterisation (e) of Proposition 7.17.
The following theorem shows that all infinite dimensional separable Hilbert spaces are isometric. So in some sense $\ell^2$ is essentially the “only” separable infinite-dimensional Hilbert space.
is invertible. Indeed, the image of $A$ is in $\ell^2$ due to Lemma 7.12, and the inverse map is given by
$$A^{-1} : (x_k)_{k=1}^{\infty} \mapsto \sum_{k=1}^{\infty} x_k e_k .$$
The characterisation of a basis in Proposition 7.17 implies that $\|u\|_H = \|A(u)\|_{\ell^2}$.
Note that there are Hilbert spaces which are not separable.
Example: Let $J$ be uncountable. The space of all functions $f : J \to \mathbb{R}$ such that
$$\sum_{j \in J} |f(j)|^2 < \infty$$
15 How do we define the sum over an uncountable set? For any $n \in \mathbb{N}$ the set $J_n = \{ j \in J : |f(j)| > \frac{1}{n} \}$ is finite (otherwise the sum is obviously infinite). Consequently, the set $J(f) := \{ j \in J : |f(j)| > 0 \}$ is countable because it is a countable union of finite sets: $J(f) = \bigcup_{n=1}^{\infty} J_n$. Therefore, the number of non-zero terms in the sum is countable and the usual definition of an infinite sum can be used.
10 Linear maps between Banach spaces
A linear map on a vector space is traditionally called a linear operator. All linear
functions defined on a finite-dimensional space are continuous. This statement is no
longer true in the case of an infinite dimensional space.
We will begin our study with continuous operators: this class has a rich theory and
numerous applications. We will only slightly touch some of them (the most remarkable
examples will be the shift operators on `2 , and integral operators and multiplication
operators on L2 ).
Of course many interesting linear maps are not continuous, e.g., the differential operator $A : f \mapsto f'$ on the space of continuously differentiable functions. More accurately, let $D(A) = C^1[0, 1] \subset L^2(0, 1)$ be the domain of $A$. Obviously $A : D(A) \to L^2(0, 1)$ is linear but not continuous. Indeed, consider the sequence $x_n(t) = n^{-1} \sin(nt)$. Obviously $\|x_n\|_{L^2} \le n^{-1}$ so $x_n \to 0$, but $A(x_n) = \cos(nt)$ does not converge to $A(0) = 0$ in the $L^2$ norm, so $A$ is not continuous.
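The norms in this example have closed forms, which makes the discontinuity easy to see numerically. The sketch below is an added illustration using $\int_0^1 \sin^2(nt) \, dt = \frac{1}{2} - \frac{\sin 2n}{4n}$ and $\int_0^1 \cos^2(nt) \, dt = \frac{1}{2} + \frac{\sin 2n}{4n}$; the function names are hypothetical.

```python
import math

def norm_x(n):
    """L^2(0,1) norm of x_n(t) = sin(n t) / n, from the closed-form integral of sin^2."""
    return math.sqrt(0.5 - math.sin(2 * n) / (4 * n)) / n

def norm_Ax(n):
    """L^2(0,1) norm of A x_n = cos(n t), from the closed-form integral of cos^2."""
    return math.sqrt(0.5 + math.sin(2 * n) / (4 * n))

ratios = {n: norm_Ax(n) / norm_x(n) for n in (1, 10, 100, 1000)}
for n, ratio in ratios.items():
    print(n, norm_x(n), norm_Ax(n), ratio)
```

The norm of $x_n$ tends to zero while the norm of $A x_n$ tends to $1/\sqrt{2}$, so the ratio $\|A x_n\| / \|x_n\|$ grows roughly like $n$ and no bound of the form $\|Ax\| \le M \|x\|$ can hold.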
Some definitions and properties from the theory of continuous linear operators can be extended literally to unbounded ones, but sometimes subtle differences appear: e.g., we will see that a bounded operator is self-adjoint iff it is symmetric, which is no longer true for unbounded operators. In the study of unbounded operators special attention should be paid to their domains.
Proof: Suppose A is bounded. Then there is M > 0 such that
Since $A$ is linear
$$\|A\|_{B(U,V)} = \sup_{\|x\|_U = 1} \|A(x)\|_V .$$
We note that $\|A\|_{B(U,V)}$ is the smallest $M$ such that (10.1) holds: indeed, it is easy to see that the definition of operator norm implies
$$\|Ax\|_V \le \|A\|_{B(U,V)} \, \|x\|_U \quad \text{for all } x \in U$$
and (10.1) holds with $M = \|A\|_{B(U,V)}$. On the other hand, (10.1) implies $M \ge \frac{\|Ax\|_V}{\|x\|_U}$ for any $x \ne 0$ and consequently $M \ge \|A\|_{B(U,V)}$.
Theorem 10.5 Let U be a normed space and V be a Banach space. Then B(U,V ) is a
Banach space.
The operator $A$ is bounded. Indeed, $(A_n)$ is Cauchy and hence bounded: there is a constant $M \in \mathbb{R}$ such that $\|A_n\|_{op} < M$ for all $n$. Taking the limit in the inequality $\|A_n u\| \le M\|u\|$ implies $\|Au\| \le M\|u\|$. Therefore $A \in B(U, V)$.
Finally, $A_n \to A$ in the operator norm. Indeed, since $(A_n)$ is Cauchy, for any $\varepsilon > 0$ there is $N$ such that $\|A_n - A_m\|_{op} < \varepsilon$ or
10.2 Examples
1. Example: Shift operator: $T_l, T_r : \ell^2 \to \ell^2$:
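(If $t_0 = b$ let $x_\varepsilon = \chi_{[t_0 - \varepsilon, t_0]}$.) Since $f$ is continuous,
$$\frac{\|A x_\varepsilon\|^2}{\|x_\varepsilon\|^2} = \frac{1}{\varepsilon} \int_{t_0}^{t_0 + \varepsilon} |f(t)|^2 \, dt \to |f(t_0)|^2 \quad \text{as } \varepsilon \to 0 .$$
Therefore $\|A\|_{op} = \|f\|_\infty$.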
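where
$$\int_a^b \int_a^b |K(s, t)| \, ds \, dt < +\infty .$$
Let us estimate the norm of $A$:
$$\|Ax\|^2 = \int_a^b \left| \int_a^b K(t, s) x(s) \, ds \right|^2 dt$$
$$\le \int_a^b \left( \int_a^b |K(t, s)|^2 \, ds \right) \left( \int_a^b |x(s)|^2 \, ds \right) dt \quad \text{(Cauchy-Schwarz)}$$
$$= \int_a^b \int_a^b |K(t, s)|^2 \, ds \, dt \; \|x\|^2 .$$
Consequently
$$\|A\|_{op}^2 \le \int_a^b \int_a^b |K(t, s)|^2 \, ds \, dt .$$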
Note that this example requires a bit more from the theory of Lebesgue integrals
than we discussed in Section 5. If you are not taking Measure Theory and feel
uncomfortable with these integrals, you may assume that x, y and K are continu-
ous functions.
Ker A = { x ∈ U : Ax = 0 }
Range of A:
Range A = { y ∈ V : ∃x ∈ U such that y = Ax }
We note that 0 ∈ Ker A for any linear operator A. We say that Ker A is trivial if
Ker A = { 0 }.
Proof: If x, y ∈ Ker A and α, β ∈ K, then
Note that the range is a linear subspace but not necessarily closed (see Examples
3).
11 Linear functionals
11.1 Definition and examples
Definition 11.1 If $U$ is a vector space then a linear map $U \to K$ is called a linear functional on $U$.
Definition 11.2 The space of all continuous functionals on a normed space U is called
the dual space, i.e., U ∗ = B(U, K) .
The dual space equipped with the operator norm is Banach. Indeed, K = R or C
which are both complete. Then Theorem 10.5 implies that U ∗ is Banach.
$$\ell_y(x) = (x, y)$$
Theorem 11.3 (Riesz Representation Theorem) Let H be a Hilbert space. For any
bounded linear functional f : H → K there is a unique y ∈ H such that
Moreover, k f kH ∗ = kykH .
Theorem 8.7 implies that every vector x ∈ H can be written uniquely in the form
f (x) = (x, y) .
If there is another $y' \in H$ such that $f(x) = (x, y')$ for all $x \in H$, then $(x, y) = (x, y')$ for all $x$, i.e., $(x, y - y') = 0$. Setting $x = y - y'$ we conclude $\|y - y'\|^2 = 0$, i.e. $y = y'$, so $y$ is unique.
Finally, the Cauchy-Schwarz inequality implies
Consequently, $\|f\|_{H^*} = \|y\|_H$.
12 Linear operators on Hilbert spaces
12.1 Complexification
In the next lectures we will discuss the spectral theory of linear operators. The spectral theory looks more natural in complex spaces. For example, a part of the theory studies eigenvalues and eigenvectors of linear maps (i.e. non-zero solutions of the equation $Ax = \lambda x$). In a finite-dimensional space a linear operator can be described by a matrix. You already know that a matrix (even a real one) can have complex eigenvalues. Fortunately a real Hilbert space can always be considered as a part of a complex one due to the “complexification” procedure.
The following lemma states that any bounded operator on H can be extended to a
bounded operator on HC .
Proof: Let y ∈ H and f (x) = (Ax, y) for all x ∈ H. The map f : H → K is linear and
A∗ (α1 y1 + α2 y2 ) = α1 A∗ y1 + α2 A∗ y2 .
Dividing by kA∗ yk (do not forget to consider the case A∗ y = 0 separately), we conclude
that
kA∗ yk ≤ kAkop kyk .
Therefore A∗ is bounded and kA∗ kop ≤ kAkop .
Definition 12.4 The operator A∗ from Theorem 12.3 is called the adjoint operator.
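Indeed, for any $x, y \in L^2(0, 1)$:
$$(Ax, y) = \int_0^1 \left( \int_0^1 K(t, s) x(s) \, ds \right) \bar{y}(t) \, dt$$
$$= \int_0^1 \int_0^1 K(t, s) x(s) \bar{y}(t) \, ds \, dt$$
$$= \int_0^1 x(s) \, \overline{\left( \int_0^1 \overline{K(t, s)} \, y(t) \, dt \right)} \, ds$$
$$= (x, A^* y) .$$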
2. (AB)∗ = B∗ A∗
3. (A∗ )∗ = A
4. kA∗ k = kAk
Proof: Statements 1-3 follow directly from the definition of an adjoint operator (Exercise). Statement 4 follows from 3 and the estimate of Theorem 12.3: indeed, $\|A^*\| \le \|A\|$ and $\|A\| = \|(A^*)^*\| \le \|A^*\|$.
Finally,
$$\|Ax\|^2 = (Ax, Ax) = (x, A^* A x) \le \|x\| \, \|A^* A x\| \le \|A^* A\| \, \|x\|^2$$
implies $\|A\|^2 \le \|A^* A\|$. On the other hand $\|A^* A\| \le \|A^*\| \, \|A\| = \|A\|^2$, so it follows that $\|A^* A\| = \|A\|^2$.
12.3 Self-adjoint operators
Definition 12.6 A linear operator A is self-adjoint, if A∗ = A.
Theorem 12.8 Let A be a self-adjoint operator on a Hilbert space H. Then all eigen-
values of A are real and the eigenvectors corresponding to distinct eigenvalues are
orthogonal.
Consequently, λ is real.
Now if λ1 and λ2 are distinct eigenvalues and Ax1 = λ1 x1 , Ax2 = λ2 x2 , then
Exercise: Let A be a self-adjoint operator on a real space. Show that the complexifi-
cation of A is also self-adjoint and has the same eigenvalues as the original operator
A.
Proof: For any $x \in H$
$$(Ax, x) = (x, Ax) = \overline{(Ax, x)} ,$$
which implies $(Ax, x)$ is real. Now let
for all x ∈ H such that kxk = 1. Consequently M ≤ kAkop . On the other hand, for any
u, v ∈ H we have
$$= 2M \left( \|u\|^2 + \|v\|^2 \right)$$
$$v = \frac{\|u\|}{\|Au\|} \, Au$$
Consequently kAuk ≤ Mkuk (for all u, including those with Au = 0) and kAkop ≤ M.
Therefore kAkop = M.
Unbounded operators and their adjoint operators (see footnote 16)
The notions of adjoint and self-adjoint operators play an important role in the general theory of linear operators. If an operator is not bounded, special care should be taken in the consideration of its domain of definition.
Let D(A) be a linear subspace of a Hilbert space H, and A : D(A) → H be a linear
operator. If D(A) is dense in H we say that A is densely defined.
Example: Consider the operator $A(f) = \frac{df}{dt}$ on the set of all continuously differentiable functions, i.e., $D(A) = C^1[0, 1] \subset L^2(0, 1)$. This operator is densely defined.
Given a densely defined linear operator A on H, its adjoint A∗ is defined as follows:
y 7→ (x, Ay)
16 Optional topic
13 Introduction to Spectral Theory
13.1 Point spectrum
Let H be a complex Hilbert space and A : H → H a linear operator. If Ax = λ x for
some x ∈ H, x 6= 0, and λ ∈ C, then λ is an eigenvalue of A and x is an eigenvector.
The space
Eλ = { x ∈ H : Ax = λ x }
is called the eigenspace.
Exercise: Prove the following: If $A \in B(H, H)$ and $\lambda$ is an eigenvalue of $A$, then $E_\lambda$ is a closed linear subspace in $H$. Moreover, $E_\lambda$ is invariant, i.e., $A(E_\lambda) \subseteq E_\lambda$.
Definition 13.1 The point spectrum of A consists of all eigenvalues of A:
σ p (A) = { λ ∈ C : Ax = λ x for some x ∈ H, x 6= 0 } .
Proposition 13.2 If $A : H \to H$ is bounded and $\lambda$ is its eigenvalue then
$$|\lambda| \le \|A\|_{op} .$$
Proof: If $Ax = \lambda x$ with $x \ne 0$, then
$$\|A\|_{op} = \sup_{y \ne 0} \frac{\|Ay\|}{\|y\|} \ge \frac{\|Ax\|}{\|x\|} = |\lambda| .$$
Examples:
1. A linear map on an $n$-dimensional complex vector space (with $n \ge 1$) has at least one and at most $n$ different eigenvalues.
2. The right shift $T_r : \ell^2 \to \ell^2$ has no eigenvalues, i.e., the point spectrum is empty. Indeed, suppose $T_r x = \lambda x$; then
$$(0, x_1, x_2, x_3, x_4, \dots) = \lambda (x_1, x_2, x_3, x_4, \dots)$$
implies $0 = \lambda x_1$, $x_1 = \lambda x_2$, $x_2 = \lambda x_3$, ... If $\lambda \ne 0$, we divide by $\lambda$ and conclude $x_1 = x_2 = \dots = 0$. If $\lambda = 0$ we also get $x = 0$. Consequently
$$\sigma_p(T_r) = \emptyset .$$
3. The point spectrum of the left shift $T_l : \ell^2 \to \ell^2$ is the open unit disk. Indeed, suppose $T_l x = \lambda x$ with $\lambda \in \mathbb{C}$. Then
$$(x_2, x_3, x_4, \dots) = \lambda (x_1, x_2, x_3, x_4, \dots)$$
is equivalent to $x_2 = \lambda x_1$, $x_3 = \lambda x_2$, $x_4 = \lambda x_3$, ... Consequently, $x = (x_k)_{k=1}^{\infty}$ with $x_k = \lambda^{k-1} x_1$ for all $k \ge 2$. This sequence belongs to $\ell^2$ if and only if $\sum_{k=1}^{\infty} |x_k|^2 = |x_1|^2 \sum_{k=1}^{\infty} |\lambda|^{2(k-1)}$ converges or, equivalently (for $x_1 \ne 0$), $|\lambda| < 1$. Therefore
$$\sigma_p(T_l) = \{ \lambda \in \mathbb{C} : |\lambda| < 1 \} .$$
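The eigenvectors of the left shift can be explored on finite truncations. The sketch below is an illustration only: sequences are cut off after a fixed number of entries, $x_1$ is normalised to 1, and the function names are hypothetical.

```python
def left_shift(x):
    """T_l drops the first entry: (x1, x2, x3, ...) -> (x2, x3, ...)."""
    return x[1:]

def eigen_candidate(lam, length):
    """First `length` entries of x = (1, lam, lam^2, ...), i.e. x_k = lam^(k-1) with x_1 = 1."""
    return [lam ** k for k in range(length)]

lam = 0.5 + 0.5j                          # |lam| < 1, so x should lie in l^2
x = eigen_candidate(lam, 200)
shifted = left_shift(x)
# Largest deviation from the eigenvalue equation T_l x = lam x on the truncation.
residual = max(abs(s - lam * xk) for s, xk in zip(shifted, x))
norm_sq = sum(abs(xk) ** 2 for xk in x)   # partial sum of the geometric series
limit = 1 / (1 - abs(lam) ** 2)

# For |lam| = 1 the same partial sums grow linearly, so the candidate is not in l^2.
growing = sum(abs(xk) ** 2 for xk in eigen_candidate(1j, 200))
print(residual, norm_sq, limit, growing)
```

For $|\lambda| < 1$ the partial sums of $\sum |x_k|^2$ settle at $1/(1 - |\lambda|^2)$, while on the unit circle they equal the number of terms, in line with the claim that the point spectrum is the open disk.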
13.2 Invertible operators
Let us discuss the concept of an inverse operator.
Definition 13.3 (injective operator) We say that A : U → V is injective if the equation
Ax = y has a unique solution for every y ∈ Range(A).
Definition 13.4 (bijective operator) We say that A : U → V is bijective if the equation
Ax = y has exactly one solution for every y ∈ V .
Definition 13.5 (inverse operator) We say that A is invertible if it is bijective. Then
the equation Ax = y has a unique solution for all y ∈ V and we define A⁻¹y = x.
1. Exercise: Show that A⁻¹ is a linear operator.
2. Exercise: Show that if A is invertible, then A⁻¹ is also invertible and
(A⁻¹)⁻¹ = A.
3. Exercise: Show that if A and B are two invertible linear operators, then AB is
also invertible and (AB)⁻¹ = B⁻¹A⁻¹.
Proposition 13.6 A linear operator A : U → V is invertible iff
Ker(A) = { 0 } and Range(A) = V.
Exercise: prove the proposition.
We will use I_V : V → V to denote the identity operator on V, i.e., I_V(x) = x for all
x ∈ V. Moreover, we will omit the subscript V if there is no danger of confusion. It is
easy to see that if A : U → V is invertible then
AA⁻¹ = I_V and A⁻¹A = I_U .
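The identities for inverse operators can be tried out on matrices. The following sketch uses exact rational arithmetic for 2×2 matrices; the matrices `A` and `B` and the helpers `matmul` and `inverse` are assumptions made for illustration. It checks that (AB)⁻¹ = B⁻¹A⁻¹ and that the order of the factors matters.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

F = Fraction
A = [[F(2), F(1)], [F(1), F(1)]]      # hypothetical invertible matrices
B = [[F(1), F(3)], [F(0), F(2)]]
IDENTITY = [[F(1), F(0)], [F(0), F(1)]]

inv_AB = inverse(matmul(A, B))
right_order = matmul(inverse(B), inverse(A))   # B^{-1} A^{-1}
wrong_order = matmul(inverse(A), inverse(B))   # A^{-1} B^{-1}
```

Since the entries are `Fraction` objects, the comparison `inv_AB == right_order` is exact rather than approximate.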
13.3 Resolvent and spectrum
Let A : V → V be a linear operator on a vector space V. A complex number λ is an
eigenvalue of A if Ax = λx for some x ≠ 0. This equation is equivalent to (A − λI)x = 0.
Then we immediately see that A − λI is not invertible since 0 has infinitely many
preimages: αx with α ∈ C.
If V is finite dimensional the converse is also true: if A − λI is not invertible
then λ is an eigenvalue of A (recall the Fredholm alternative from the first year
Linear Algebra). In the infinite dimensional case this is not necessarily true.
Definition 13.8 (resolvent set and spectrum) The resolvent set of a linear operator
A : H → H is defined by
R(A) = { λ ∈ C : (A − λI)⁻¹ ∈ B(H, H) } .
The resolvent set consists of regular values. The spectrum is the complement to the
resolvent set in C:
σ (A) = C \ R(A) .
Note that the definition of the resolvent set assumes existence of the inverse operator
(A − λI)⁻¹ for λ ∈ R(A). If λ ∈ σp(A) then A − λI is not invertible. Consequently
every eigenvalue of A belongs to σ(A), and
σp(A) ⊆ σ(A) .
The spectrum of A can be larger than the point spectrum.
Example: The point spectrum of the right shift operator Tr is empty but since
Range Tr ≠ ℓ² it is not invertible and therefore 0 ∈ σ(Tr). So σp(Tr) ≠ σ(Tr).
Technical lemmas
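Lemma 13.9 If T ∈ B(H, H) and ‖T‖ < 1, then (I − T)⁻¹ ∈ B(H, H). Moreover
(I − T)⁻¹ = I + T + T² + T³ + . . .
and
‖(I − T)⁻¹‖ ≤ (1 − ‖T‖)⁻¹ .
Proof: Consider the partial sums Vn = I + T + T² + · · · + Tⁿ. For m > n,
\[ \|V_m - V_n\| = \|T^{n+1} + \dots + T^m\| \le \|T\|^{n+1} + \|T\|^{n+2} + \dots + \|T\|^m = \frac{\|T\|^{n+1} - \|T\|^{m+1}}{1 - \|T\|} \le \frac{\|T\|^{n+1}}{1 - \|T\|} . \]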
Since ‖T‖ < 1, (Vn) is a Cauchy sequence in the operator norm. The space B(H, H) is
complete and there is V ∈ B(H, H) such that Vn → V. Moreover,
‖V‖ ≤ 1 + ‖T‖ + ‖T‖² + · · · = (1 − ‖T‖)⁻¹ .
Finally,
Vn(I − T) = Vn − VnT = I − T^(n+1) ,
(I − T)Vn = Vn − TVn = I − T^(n+1) ,
and T^(n+1) → 0 in the operator norm as n → ∞. Taking the limit we get V(I − T) = (I − T)V = I, i.e., V = (I − T)⁻¹.
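The Neumann series of Lemma 13.9 is easy to test on a small matrix. The sketch below is an illustration under assumptions: the real 2×2 matrix `T` and the helper names are hypothetical, and the Euclidean operator norm is computed as the largest singular value. It forms the partial sum V_n = I + T + · · · + Tⁿ, checks that (I − T)V_n is close to the identity and compares the norm of V_n with the bound (1 − ‖T‖)⁻¹.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def operator_norm(A):
    """Euclidean operator norm of a real 2x2 matrix (largest singular value)."""
    (a, b), (c, d) = A
    p, q, r = a * a + c * c, b * b + d * d, a * b + c * d
    return math.sqrt((p + q) / 2 + math.sqrt(((p - q) / 2) ** 2 + r * r))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def neumann_partial_sum(T, n):
    """V_n = I + T + T^2 + ... + T^n."""
    V, power = IDENTITY, IDENTITY
    for _ in range(n):
        power = matmul(power, T)
        V = add(V, power)
    return V

T = [[0.2, 0.5], [0.1, 0.3]]          # hypothetical example with ||T|| < 1
V = neumann_partial_sum(T, 80)
I_minus_T = [[1.0 - T[0][0], -T[0][1]], [-T[1][0], 1.0 - T[1][1]]]
product = matmul(I_minus_T, V)        # equals I - T^81, close to the identity
residual = max(abs(product[i][j] - IDENTITY[i][j])
               for i in range(2) for j in range(2))
bound = 1.0 / (1.0 - operator_norm(T))
```

The residual is the size of T^(n+1), which decays geometrically, in agreement with the Cauchy estimate used in the proof.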
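Lemma 13.10 Let H be a Hilbert space and T, T⁻¹ ∈ B(H, H). If U ∈ B(H, H) and
‖U‖ < ‖T⁻¹‖⁻¹, then the operator T + U is invertible and
\[ \|(T+U)^{-1}\| \le \frac{\|T^{-1}\|}{1 - \|U\|\,\|T^{-1}\|} . \]
Proof: Consider the operator V = T⁻¹(T + U) = I + T⁻¹U. Since ‖T⁻¹U‖ ≤ ‖T⁻¹‖ ‖U‖ < 1,
Lemma 13.9 implies that V is invertible and ‖V⁻¹‖ ≤ (1 − ‖T⁻¹‖ ‖U‖)⁻¹. Then
I = V⁻¹V = V⁻¹T⁻¹(T + U)
and, since T + U = TV, also (T + U)V⁻¹T⁻¹ = I. Therefore
(T + U)⁻¹ = V⁻¹T⁻¹ .
Finally, ‖(T + U)⁻¹‖ ≤ ‖V⁻¹‖ ‖T⁻¹‖ implies the desired upper bound for the norm of
the inverse operator.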
Properties of the spectrum
Lemma 13.11 If A is bounded and λ ∈ σ (A) then λ̄ ∈ σ (A∗ ).
Proposition 13.12 If A is bounded, then σ(A) is contained in the closed disk { λ ∈ C : |λ| ≤ ‖A‖op }.
Proof: Take λ ∈ C such that |λ| > ‖A‖op. Since ‖λ⁻¹A‖op < 1, Lemma 13.9 implies
that I − λ⁻¹A is invertible and the inverse operator is bounded. Consequently, A − λI =
−λ(I − λ⁻¹A) also has a bounded inverse and so λ ∈ R(A). The proposition follows
immediately since σ(A) is the complement of R(A).
Example: The spectra of Tl and Tr are both equal to the closed unit disk in the
complex plane.
Indeed, σp(Tl) = { λ ∈ C : |λ| < 1 }. Since σp(Tl) ⊂ σ(Tl) and σ(Tl) is closed, we
conclude that σ(Tl) contains the closed unit disk. On the other hand, Proposition 13.12
implies that σ(Tl) is a subset of the closed disk |λ| ≤ ‖Tl‖op = 1. Therefore
σ(Tl) = { λ ∈ C : |λ| ≤ 1 } .
Since Tr = Tl∗ and σ(Tl) is invariant under complex conjugation, Lemma 13.11 implies σ(Tr) = σ(Tl).
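The relation Tr = Tl∗ used in this example can be checked on finitely supported sequences. The sketch below is only an illustration: finite lists are padded with zeros, and the test vectors `x`, `y` and the function names are assumptions for this example. It verifies (Tl x, y) = (x, Tr y) and also shows that Tl Tr = I while Tr Tl ≠ I, which reflects the fact that Tr is injective but not surjective.

```python
def right_shift(x):
    """(x1, x2, ...) -> (0, x1, x2, ...)."""
    return [0] + list(x)

def left_shift(x):
    """(x1, x2, x3, ...) -> (x2, x3, ...)."""
    return list(x[1:])

def inner(x, y):
    """(x, y) = sum of x_k * conj(y_k); the shorter list is padded with zeros."""
    n = max(len(x), len(y))
    xs = list(x) + [0] * (n - len(x))
    ys = list(y) + [0] * (n - len(y))
    return sum(a * complex(b).conjugate() for a, b in zip(xs, ys))

x = [1, 2j, -3, 0.5]                  # hypothetical finitely supported sequences
y = [2, -1j, 4, 1 + 1j]

lhs = inner(left_shift(x), y)         # (T_l x, y)
rhs = inner(x, right_shift(y))        # (x, T_r y)
left_after_right = left_shift(right_shift(x))   # T_l T_r x = x
right_after_left = right_shift(left_shift(x))   # T_r T_l x loses x1
```

The vector `right_after_left` has its first entry replaced by zero, so Tr Tl is the projection that forgets x1 rather than the identity.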
13.4 Compact operators
Definition 13.14 Let X be a normed space and Y be a Banach space. Then a linear
operator A : X → Y is compact if the image of any bounded sequence has a convergent
subsequence.
Theorem 13.16 If X is a normed space and Y is a Banach space, then compact linear
operators form a closed linear subspace in B(X,Y ).
Given ε > 0 choose n sufficiently large to ensure that the first term is less than ε/2,
then choose N sufficiently large to guarantee that the second term is less than ε/2 for
all j, l > N. So (Ky_j) is Cauchy and consequently converges. Therefore K is a compact
operator, and the subspace formed by compact operators is closed.
Proposition 13.17 The integral operator A : L²(a, b) → L²(a, b) defined by
\[ (Af)(t) = \int_a^b K(t,s) f(s)\,ds \quad\text{with}\quad \int_a^b\!\int_a^b |K(t,s)|^2\,ds\,dt < \infty \]
is compact.
13.5 Spectral theory for compact self-adjoint operators
Lemma 13.18 Let H be an infinite-dimensional Hilbert space and T : H → H a
compact self-adjoint operator. Then at least one of λ± = ±‖T‖op is an eigenvalue of
T.
xn → x = α⁻¹y .
The operator T is continuous and consequently Tx = αx. Finally, since ‖xn‖ = 1 for
all n, we have ‖x‖ = 1, and consequently α is an eigenvalue.
Proof: Suppose there is ε > 0 such that T has infinitely many different eigenvalues
with |λn| > ε. Let xn be corresponding eigenvectors with ‖xn‖ = 1. Since the operator
is self-adjoint, this sequence is orthonormal and for any n ≠ m
\[ \|Tx_n - Tx_m\|^2 = \|\lambda_n x_n - \lambda_m x_m\|^2 = (\lambda_n x_n - \lambda_m x_m, \lambda_n x_n - \lambda_m x_m) = |\lambda_n|^2 + |\lambda_m|^2 > 2\varepsilon^2 . \]
Hence (Txn) has no convergent subsequence, which contradicts the compactness of T.
Theorem 13.20 (Hilbert-Schmidt theorem) Let H be a Hilbert space and T : H → H
be a compact self-adjoint operator. Then there is a finite or countable orthonormal
sequence (en) of eigenvectors of T with corresponding real eigenvalues (λn) such that
\[ Tx = \sum_{j} \lambda_j (x, e_j) e_j \quad\text{for all } x \in H . \]
implies ‖yn‖² ≤ ‖x‖². On the other hand
\[ \Bigl\| Tx - \sum_{j=1}^{n-1} (x, e_j) \lambda_j e_j \Bigr\| = \|T y_n\| \le \|T_n\|\,\|y_n\| \le |\lambda_n|\,\|x\| . \]
Exercise: Deduce that operators with finite-dimensional range are dense among compact
self-adjoint operators.
Proposition 13.19 implies that zero is the only possible limit point of σ p (T ). There-
fore the theorem means that either σ (T ) = σ p (T ) or σ (T ) = σ p (T ) ∪ { 0 }. In partic-
ular, σ (T ) = σ p (T ) if zero is an eigenvalue. Note that if zero is not an eigenvalue,
then there is a sequence of eigenvalues which accumulates to zero since H is infinite
dimensional.
Proof: According to the Hilbert-Schmidt theorem
\[ Tx = \sum_{j=1}^{\infty} \lambda_j (x, e_j) e_j \]
where { e_j } is an orthonormal basis in H.¹⁸ Then x = ∑ (x, e_j)e_j, summed over j ≥ 1, and for any µ ∈ C
\[ (T - \mu I)x = \sum_{j=1}^{\infty} (\lambda_j - \mu)(x, e_j) e_j . \]
¹⁸ The proof uses a countable basis in H, therefore it assumes that H is separable. The theorem remains
valid for a non-separable H but the proof should be slightly modified. The modification is based on the
following observation: Proposition 13.19 implies that (Ker T)⊥ has a countable orthonormal basis of
eigenvectors { e_j }. Then for any vector x ∈ H write x = P_{Ker T}(x) + ∑ (x, e_j)e_j, where P_{Ker T} is the
orthogonal projection on the kernel of T. Then follow the arguments of the proof (adding −µ⁻¹ P_{Ker T}(y)
to the definition of S, since T − µI acts as −µI on Ker T).
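Let µ ∈ C \ σ̄p(T), where σ̄p(T) denotes the closure of σp(T); this complement is an open subset of C. Consequently there is ε > 0 such that
|µ − λ| > ε for all λ ∈ σp(T) ⊂ σ̄p(T). Consider an operator S defined by
\[ Sy = \sum_{k=1}^{\infty} \frac{(y, e_k)}{\lambda_k - \mu}\, e_k . \]
Lemma 7.12 implies that the series converges since |λk − µ| > ε and
\[ \|Sy\|^2 = \sum_{k=1}^{\infty} \left| \frac{(y, e_k)}{\lambda_k - \mu} \right|^2 \le \varepsilon^{-2} \sum_{k=1}^{\infty} |(y, e_k)|^2 = \varepsilon^{-2} \|y\|^2 . \]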
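The construction of S can be tried out in coordinates with respect to the eigenbasis. The sketch below is a finite truncation under assumptions: the eigenvalues λk = 1/k, the truncation length `N`, the value of `mu` and the function names are all chosen for illustration. A vector is represented by its coefficients (y, e_k), so that T − µI and S act by multiplication and division by λk − µ.

```python
import math

def apply_T_minus_mu(coeffs, eigenvalues, mu):
    """Coefficients of (T - mu I) x in the eigenbasis."""
    return [(lam - mu) * c for lam, c in zip(eigenvalues, coeffs)]

def apply_S(coeffs, eigenvalues, mu):
    """Coefficients of S y, i.e. (y, e_k) / (lambda_k - mu)."""
    return [c / (lam - mu) for lam, c in zip(eigenvalues, coeffs)]

def norm(coeffs):
    """Norm computed from the coefficients (Parseval identity)."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))

N = 200                                        # truncation length (assumption)
eigenvalues = [1.0 / k for k in range(1, N + 1)]
mu = -0.5 + 0.25j                              # not in the closure of {1/k}
y = [1.0 / k for k in range(1, N + 1)]         # coefficients of a test vector

Sy = apply_S(y, eigenvalues, mu)
back = apply_T_minus_mu(Sy, eigenvalues, mu)   # should reproduce y
error = max(abs(b - c) for b, c in zip(back, y))
eps = min(abs(lam - mu) for lam in eigenvalues)
```

Here `eps` plays the role of ε in the estimate, and the inequality ‖Sy‖ ≤ ε⁻¹‖y‖ can be compared directly via `norm(Sy)` and `norm(y) / eps`.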
14 Sturm-Liouville problems
In this chapter we will study the Sturm-Liouville problem:
\[ -\frac{d}{dx}\Bigl( p(x) \frac{du}{dx} \Bigr) + q(x) u = \lambda u \quad\text{with}\quad u(a) = u(b) = 0 \]
where p and q are given functions. The values of λ for which the problem has a non-trivial
solution are called eigenvalues of the Sturm-Liouville problem and the corresponding
solutions u are called eigenfunctions. We will prove that the eigenfunctions
form an orthonormal basis in L²(a, b).
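We assume that p ∈ C¹[a, b], q ∈ C[a, b] and p(x) > 0, q(x) ≥ 0 for all x ∈ [a, b]. We write the left-hand side of the equation as Lu = −(pu′)′ + qu.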
Obviously L : C²[a, b] → C⁰[a, b] is linear. We will see that Range L = C⁰[a, b] and
Ker L = Span{ u1, u2 } where u1, u2 are two linearly independent solutions of the equation
Lu = 0. Therefore the operator is not invertible.
We will restrict L to the space
D(L) = { u ∈ C²[a, b] : u(a) = u(b) = 0 }
and show that L : D(L) → C⁰[a, b] is invertible. We will prove that L⁻¹ is the restriction
to Range L of a compact self-adjoint operator A : L²(a, b) → L²(a, b). The Sturm-Liouville
theorem states that the eigenfunctions of A form an orthonormal basis in
L²(a, b). Moreover, we will see that A and L have the same eigenfunctions.
Wp(u1, u2) ≠ 0.
Proof: The equation Lu = 0 implies pu″ = −p′u′ + qu. Then differentiate Wp with
respect to x:
Therefore Wp is constant.
Finally, since p ≠ 0, Wp = 0 implies u1′(x)u2(x) − u1(x)u2′(x) = 0 for all x. Consequently
at all points where u1 and u2 do not vanish
\[ \frac{u_1'}{u_1} = \frac{u_2'}{u_2} \quad\Leftrightarrow\quad \frac{d \ln|u_1|}{dx} = \frac{d \ln|u_2|}{dx} , \]
which implies that u1 = Cu2 for some constant C, i.e., if Wp = 0 then u1 and u2 are not
linearly independent. In the reverse direction the statement is straightforward.
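The constancy of Wp can be observed on an explicit example. The sketch below assumes the expression Wp(u1, u2) = p (u1′u2 − u1u2′), the particular choice p = 1, q = 1 on [0, 1], and the solutions u1 = sinh x, u2 = sinh(x − 1) of −u″ + u = 0; these choices and the function name are made for illustration only. For them the addition formula for sinh gives Wp = −sinh 1 at every point.

```python
import math

def wronskian_p(x):
    """W_p(u1, u2) = p * (u1' * u2 - u1 * u2') for the example p = 1, q = 1."""
    p = 1.0
    u1, du1 = math.sinh(x), math.cosh(x)                # u1(0) = 0
    u2, du2 = math.sinh(x - 1.0), math.cosh(x - 1.0)    # u2(1) = 0
    return p * (du1 * u2 - u1 * du2)

# Sample the interval [0, 1]; every value should equal -sinh(1).
values = [wronskian_p(i / 10) for i in range(11)]
expected = -math.sinh(1.0)
spread = max(values) - min(values)
```

The value is non-zero, which is consistent with u1 and u2 being linearly independent.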
Lemma 14.2 The equation Lu = 0 has two linearly independent solutions, u1, u2 ∈
C²[a, b], such that u1(a) = u2(b) = 0.
Proof: Let u1, u2 be solutions of the Cauchy problems
Lu1 = 0, u1(a) = 0, u1′(a) = 1,
Lu2 = 0, u2(b) = 0, u2′(b) = 1 .
According to the theory of linear ordinary differential equations u1 and u2 exist, belong
to C²[a, b] and are unique.
Moreover, u1 and u2 are linearly independent. Indeed, suppose Lu = 0 for some
u ∈ C²[a, b] such that u(a) = u(b) = 0. Then
\[ \begin{aligned} 0 = (Lu, u) &= \int_a^b \bigl( -(pu')'u + qu^2 \bigr)\,dx \\ &= -p(x)u'(x)u(x)\Big|_a^b + \int_a^b \bigl( p(u')^2 + qu^2 \bigr)\,dx \qquad\text{(integration by parts)} \\ &= \int_a^b \bigl( p(u')^2 + qu^2 \bigr)\,dx . \end{aligned} \]
Since p > 0 and q ≥ 0 on [a, b], we conclude that u′ ≡ 0. Then u(a) = u(b) = 0 implies u(x) = 0
for all x ∈ [a, b].
Consequently, since u2(b) = 0 and u2 is not identically zero, u2(a) ≠ 0 and so
Wp(u1, u2) = p(a)(u1′(a)u2(a) − u1(a)u2′(a)) = p(a)u1′(a)u2(a) ≠ 0.
Therefore u1, u2 are linearly independent by Lemma 14.1.
Proposition 14.3 If u1 and u2 are linearly independent solutions of the equation Lu =
0 such that u1(a) = u2(b) = 0 and
\[ G(x,y) = \frac{1}{W_p(u_1,u_2)} \begin{cases} u_1(x)u_2(y), & a \le x < y \le b, \\ u_1(y)u_2(x), & a \le y \le x \le b, \end{cases} \]
then for any f ∈ C⁰[a, b] the function
\[ u(x) = \int_a^b G(x,y) f(y)\,dy \]
belongs to C²[a, b], satisfies the equation Lu = f and the boundary conditions u(a) =
u(b) = 0.
Proof: The statement is proved by a direct substitution of
\[ u(x) = \frac{u_2(x)}{W_p(u_1,u_2)} \int_a^x u_1(y) f(y)\,dy + \frac{u_1(x)}{W_p(u_1,u_2)} \int_x^b u_2(y) f(y)\,dy \]
into the differential equation. Moreover, since u1(a) = u2(b) = 0, we get u(a) = u(b) = 0.
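The Green's function formula can be tested on the simplest case. The sketch below assumes p = 1, q = 0 on [0, 1], so that Lu = −u″, with u1(x) = x and u2(x) = x − 1, for which Wp = u1′u2 − u1u2′ = −1; the midpoint quadrature and all names are assumptions made for illustration. For f ≡ 1 the exact solution of −u″ = 1 with u(0) = u(1) = 0 is u(x) = x(1 − x)/2, which the numerical integral should reproduce.

```python
def u1(t):
    """Solution of -u'' = 0 with u1(0) = 0."""
    return t

def u2(t):
    """Solution of -u'' = 0 with u2(1) = 0."""
    return t - 1.0

W_P = -1.0  # p * (u1' * u2 - u1 * u2') with p = 1

def green(x, y):
    """G(x, y) built from u1, u2 as in the two-case formula."""
    if x < y:
        return u1(x) * u2(y) / W_P
    return u1(y) * u2(x) / W_P

def solve(f, x, n=2000):
    """u(x) = integral of G(x, y) f(y) over [0, 1], midpoint rule with n cells."""
    h = 1.0 / n
    return sum(green(x, (i + 0.5) * h) * f((i + 0.5) * h) for i in range(n)) * h

def exact(x):
    """Exact solution for f = 1."""
    return x * (1.0 - x) / 2.0

points = (0.0, 0.25, 0.5, 1.0)
errors = [abs(solve(lambda y: 1.0, x) - exact(x)) for x in points]
```

In this case G(x, y) = x(1 − y) for x < y and y(1 − x) otherwise, so G is symmetric and non-negative, and the boundary values u(0) = u(1) = 0 come out automatically.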
is compact and self-adjoint. Moreover, Range A is dense in L²(a, b), Ker A = { 0 } and
all its eigenfunctions, Au = µu, belong to C²[a, b] and satisfy u(a) = u(b) = 0.
Proposition 14.5 The operator L : D(L) → C⁰[a, b] has a bounded inverse (in the operator
norm induced by the L² norm on both spaces).
Lemma 14.4 states that this operator is bounded. In other words, L⁻¹ coincides with
the restriction of A onto Range L.
Proof: Since A : L²(a, b) → L²(a, b) is compact and self-adjoint, Theorem 13.21 implies
that its eigenvectors form an orthonormal basis in L²(a, b). If u is an eigenfunction
of A, then Au = µu, µ ≠ 0 and u ∈ C²[a, b]. Consequently Lu = λu = µ⁻¹u, i.e.,
u is also an eigenvector of L which corresponds to the eigenvalue λ = µ⁻¹.
\[ -\frac{d^2u}{dx^2} = \lambda u, \qquad u(0) = u(1) = 0 . \]
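It corresponds to the choice p = 1, q = 0. Theorem 14.6 implies that the normalised
eigenfunctions of this problem form an orthonormal basis in L²(0, 1). In this example
the eigenfunctions are easy to find:
\[ \bigl\{ \sqrt{2}\, \sin k\pi x : k \in \mathbb{N} \bigr\} . \]
Consequently every f ∈ L²(0, 1) can be expanded in a series f(x) = ∑ αk sin kπx, summed over k ≥ 1 and convergent in L²(0, 1),
where
\[ \alpha_k = 2 \int_0^1 f(x) \sin k\pi x\,dx . \]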
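The sine expansion can be checked numerically. The sketch below assumes the expansion f(x) = ∑ αk sin kπx with respect to the orthonormal functions √2 sin kπx, so that αk = 2 ∫ f(x) sin kπx dx over (0, 1); the test function f(x) = x(1 − x), the midpoint quadrature and the function names are assumptions made for illustration. For this f a direct integration by parts gives αk = 8/(kπ)³ for odd k and αk = 0 for even k.

```python
import math

def sine_coefficient(f, k, n=2000):
    """alpha_k = 2 * integral of f(x) sin(k pi x) over (0, 1), midpoint rule."""
    h = 1.0 / n
    integral = sum(
        f((i + 0.5) * h) * math.sin(k * math.pi * (i + 0.5) * h)
        for i in range(n)
    ) * h
    return 2.0 * integral

def partial_sum(coeffs, x):
    """Sum of alpha_k sin(k pi x) for k = 1, ..., len(coeffs)."""
    return sum(a * math.sin(k * math.pi * x)
               for k, a in enumerate(coeffs, start=1))

def f(x):
    """Hypothetical test function with f(0) = f(1) = 0."""
    return x * (1.0 - x)

coeffs = [sine_coefficient(f, k) for k in range(1, 16)]
alpha_1_exact = 8.0 / math.pi ** 3
approx_at_half = partial_sum(coeffs, 0.5)   # should be close to f(0.5) = 0.25
```

Because the coefficients decay like k⁻³, fifteen terms already approximate f at the midpoint to well within 10⁻³.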