
Introduction to Analysis

Lecture Notes 2005/2006

Vitali Liskevich
With minor adjustments by Vitaly Moroz

School of Mathematics
University of Bristol
Bristol BS8 1TW, UK
Contents

1 Elements of Logic and Set Theory 1


1.1 Propositional connectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Conjunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.3 Disjunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.4 Implication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.5 Equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Logical laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.1 Subsets. The empty set . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.2 Operations on sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3.3 Laws for operations on sets . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.4 Universe. Complement . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Predicates and quantifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.5 Ordered pairs. Cartesian products . . . . . . . . . . . . . . . . . . . . . . . . 14
1.6 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.7 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.1 Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.2 Injections and surjections. Bijections. Inverse functions . . . . . . . . 21
1.8 Some methods of proof. Proof by induction . . . . . . . . . . . . . . . . . . . 24

2 Numbers 27
2.1 Various Sorts of Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.1 Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.2 Rational Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.1.3 Irrational Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.1.4 Cuts of the Rationals . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 The Field of Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3 Bounded sets of numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.4 Supremum and infimum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36


3 Sequences and Limits 40


3.1 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.2 Null sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 Sequence converging to a limit . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Monotone sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.5 Cauchy sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.6 Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.6.1 Series of positive terms . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4 Limits of functions and continuity 55


4.1 Limits of function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Continuous functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3 Continuous functions on a closed interval . . . . . . . . . . . . . . . . . . . . 63
4.4 Uniform continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.5 Inverse functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

5 Differential Calculus 68
5.1 Definition of derivative. Elementary properties . . . . . . . . . . . . . . . . . 68
5.2 Theorems on differentiable functions . . . . . . . . . . . . . . . . . . . . . . . 72
5.3 Approximation by polynomials. Taylor’s Theorem . . . . . . . . . . . . . . . 75

6 Series 78
6.1 Series of positive and negative terms . . . . . . . . . . . . . . . . . . . . . . . 78
6.1.1 Alternating series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.1.2 Absolute convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.1.3 Rearranging series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.1.4 Multiplication of series . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.2 Power series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

7 Elementary functions 87
7.1 Exponential function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.2 Logarithmic function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.3 Trigonometric functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

8 The Riemann Integral 92


8.1 Definition of integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
8.1.1 Definition of integral and integrable functions . . . . . . . . . . . . . . 92
8.1.2 Properties of upper and lower sums . . . . . . . . . . . . . . . . . . . 94
8.2 Criterion of integrability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
8.3 Integrable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
8.4 Elementary properties of integral . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.5 Integration as the inverse to differentiation . . . . . . . . . . . . . . . . . . . 102



8.6 Integral as the limit of integral sums . . . . . . . . . . . . . . . . . . . . . . . 104
8.7 Improper integrals. Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.8 Constant π . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Chapter 1

Elements of Logic and Set Theory

In mathematics we always assume that our propositions are definite and unambiguous, so that
such propositions are always true or false (there is no intermediate option). In this respect
they differ from propositions in ordinary life, which are often ambiguous or indeterminate. We
also assume that our mathematical propositions are objective, so that they are determinately
true or determinately false independently of our knowing which of these alternatives holds.
We will denote propositions by capital Roman letters.
Examples of propositions
1. A ≡ London is the capital of the UK.
2. B ≡ Paris is the capital of the UK.
3. C ≡ 3 > 5
4. D ≡ 3 < 5. etc.
(We will use the sign ≡ in order to define propositions.)
A and D are true, whereas B and C are false; nonetheless, all four are propositions. Thus,
according to our agreement, a proposition takes exactly one of the two values: “true” or
“false” (never both). We use the symbol T for an arbitrary true proposition and
F for an arbitrary false one.
Not every sentence is a mathematical proposition. For instance, the following sentences
are not mathematical propositions.

(i) It is easy to study Analysis.


(ii) The number 0.00001 is very small.
(iii) Is there a number whose square is 2?
(iv) x > 2.

For (i) it is impossible to judge whether it is true or false (it depends for whom). (ii) does
not have a precise sense. (iii) is a question, so it does not state anything. And (iv) contains a
letter x, and whether it is true or false depends on the value of x.
At the same time, there are propositions for which it is not immediately easy to establish
whether they are true or false. For example,

E ≡ (126³⁷²⁸ + 151⁵⁸⁷⁶) + 4 is a prime number.


This is obviously a definite proposition, but it would take a lot of computation actually to
determine whether it is true or false.

There are some propositions in mathematics for which their truth value has not been
determined yet. For instance, “in the decimal representation of π there are infinitely many
digits 7”. It is not known whether it is true, but this is certainly a mathematical proposition.
The truth value of this proposition constitutes an open question (at least the answer is not
known to the author of these notes). There are plenty of open questions in mathematics!

1.1 Propositional connectives


Propositional connectives are used to combine simple propositions into complex ones. They
can be regarded as operations with propositions.

1.1.1 Negation
One can build a new proposition from an old one by negating it. Take A above as an example.
The negation of A (not A) will mean

¬A ≡ London is not the capital of the UK.

We will use the symbol ¬A to denote not A. Another example: for the proposition
“8 is a prime number”, its negation is “8 is not a prime number”. Since we agreed
that a proposition has one of the two truth values, true or false, we can define the negation of a proposition
A by saying that if A is true then ¬A is false, and if A is false then ¬A is true. This definition
is reflected in the following table.

A ¬A
T F
F T

This is called the truth table for negation.

1.1.2 Conjunction
Conjunction is a binary operation on propositions which corresponds to the word “and” in
English. We stipulate by definition that “A and B” is true if A is true and B is true, and “A
and B” is false if A is false or B is false. This definition is expressed in the following truth
table. We use the notation A ∧ B for the conjunction “A and B”.

A B A∧B
T T T
T F F
F T F
F F F


The four rows of this table correspond to the four possible truth combinations of the propo-
sition A and the proposition B. The last entry in each row stipulates the truth or falsity of
the complex proposition in question.
Conjunction is sometimes called logical product.

1.1.3 Disjunction
Disjunction is a binary operation on propositions which corresponds to the word “or” in
English. We stipulate by definition that “A or B” is true if A is true or B is true, and “A
or B” is false if A is false and B is false. This definition is expressed in the following truth
table. We use the notation A ∨ B for the disjunction “A or B”.

A B A∨B
T T T
T F T
F T T
F F F

Disjunction is sometimes called logical sum.

1.1.4 Implication
Implication is a binary operation on propositions which corresponds to the word “if... then...”
in English. We will denote this operation ⇒. So A ⇒ B can be read “if A then B”, or “A
implies B”. A is called the antecedent, and B is called the consequent. The truth table for
implication is the following.

A B A⇒B
T T T
T F F
F T T
F F T

So, as we see from the truth table which constitutes the definition of the operation “im-
plication”, the implication is false only in the case in which the antecedent is true and the
consequent is false. It is true in the remaining cases.
Notice that the “implies” introduced here differs from the one used in ordinary speech. The
reason for such a definition will become clear later; it has proved to be useful in mathematics.
For now we simply accept it. Thus in the mathematical meaning of “implies” the proposition

“Snow is black implies grass is red”

is true (since it corresponds to the last line of the truth table).


The proposition A ⇒ B can be also read as “A is sufficient for B”, “B if A”, “A only if
B” and “B is necessary for A” (the meaning of the latter will be clarified later on).


1.1.5 Equivalence
The last binary operation on propositions we introduce is equivalence. Saying that A is
equivalent to B we will mean that A is true whenever B is true, and vice versa. We denote
this by A ⇔ B. So we stipulate that A ⇔ B is true in the cases in which the truth values of
A and B are the same. In the remaining cases it is false. This is given by the following truth
table.

A B A⇔B
T T T
T F F
F T F
F F T

A ⇔ B can be read as “A is equivalent to B”, or “A if and only if B” (this is usually


shortened to “A iff B”), or “A is necessary and sufficient for B”. The equivalence A ⇔ B
can be also defined by means of conjunction and implication as follows

(A ⇒ B) ∧ (B ⇒ A).

Now let A be a proposition. Then ¬A is also a proposition, so that we can construct its
negation ¬¬A. It is easy to see that ¬¬A has the same truth value as A, since there are
only two truth values T and F.

¬¬A ⇔ A

The last equivalence is called the double negation law.


1.2 Logical laws


In our definitions in the previous section A, B, etc. stand for arbitrary propositions. They
may themselves be built from simpler propositions by means of the introduced operations.
Logical laws or, in other words, logical tautologies are composite propositions built from
simple propositions A, B, etc. (operands) by means of the introduced operations, that are
true no matter what the truth values of the operands A, B, etc. are.
The truth value of a proposition constructed from A, B, etc. does not depend on the
propositions A, B, etc. themselves, but only on their truth values. Hence,
in order to check whether a composite proposition is a law or not, one can substitute T or F
for A, B, etc. in all possible combinations and determine the corresponding truth values
of the proposition in question. If all the values are T then the proposition in question is a
law. If there is a substitution which gives the value F, then it is not a law.

Example 1.2.1. The proposition

(A ∧ B) ⇒ (A ∨ B)

is a law.

The shortest way to justify this is to build the truth table for (A ∧ B) ⇒ (A ∨ B).

A B A∧B A∨B (A ∧ B) ⇒ (A ∨ B)
T T T T T
T F F T T
F T F T T
F F F F T

As we see the last column consists entirely of T ’s, which means that this is a law.
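
For small numbers of operands this brute-force check is easy to mechanise. The following Python sketch (an illustration only; the helper name implies is chosen here, not taken from the notes) enumerates all truth-value combinations of A and B and confirms that (A ∧ B) ⇒ (A ∨ B) always takes the value T.

    from itertools import product

    def implies(p, q):
        # truth table of "p implies q": false only when p is true and q is false
        return (not p) or q

    for A, B in product([True, False], repeat=2):
        row_value = implies(A and B, A or B)
        print(A, B, row_value)
        assert row_value          # every row gives T, so the proposition is a law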

Below we list without proof some of the most important logical laws. We recommend that
you verify them by constructing the truth tables of these laws.

• Commutative law of disjunction

(1.2.1) (A ∨ B) ⇔ (B ∨ A)

• Associative law of disjunction

(1.2.2) [(A ∨ B) ∨ C] ⇔ [A ∨ (B ∨ C)]

• Commutative law of conjunction

(1.2.3) (A ∧ B) ⇔ (B ∧ A)

• Associative law of conjunction

(1.2.4) [(A ∧ B) ∧ C] ⇔ [A ∧ (B ∧ C)]


• First distributive law

(1.2.5) [A ∧ (B ∨ C)] ⇔ [(A ∧ B) ∨ (A ∧ C)]

• Second distributive law

(1.2.6) [A ∨ (B ∧ C)] ⇔ [(A ∨ B) ∧ (A ∨ C)]

• Idempotent laws

(1.2.7) (A ∧ A) ⇔ A, (A ∨ A) ⇔ A

• Absorption laws

(A ∧ T ) ⇔ A, (A ∧ F ) ⇔ F,
(1.2.8) (A ∨ T ) ⇔ T, (A ∨ F ) ⇔ A

• Syllogistic law

(1.2.9) [(A ⇒ B) ∧ (B ⇒ C)] ⇒ (A ⇒ C)

(1.2.10) (A ∨ ¬A) ⇔ T

(1.2.11) (A ∧ ¬A) ⇔ F

• De Morgan’s laws

(1.2.12) ¬(A ∨ B) ⇔ (¬A ∧ ¬B)


(1.2.13) ¬(A ∧ B) ⇔ (¬A ∨ ¬B)

• Contrapositive law

(1.2.14) (A ⇒ B) ⇔ (¬B ⇒ ¬A)

(1.2.15) (A ⇒ B) ⇔ (¬A ∨ B)


1.3 Sets
The notion of a set is one of the basic notions. We cannot, therefore, give it a precise
mathematical definition. Roughly speaking:

A set is a collection of objects, to which one can assign a “size”.

It is obviously not a definition (what is “collection”? what is “size”?). A rigorous set theory
is constructed in the axiomatic way. We however confine ourselves to the naive set theory,
introducing some axioms only in order to clarify the notion of a set. We accept as the basic
notions “set”, and the relation “to be an element of a set”. If A is a set we write a ∈ A to
express that “a is an element of the set A”, or “a belongs to A”. If a is not an element of A
we write a ∉ A. So the following is true for any x and A:

(x ∉ A) ⇔ ¬(x ∈ A).

One way to define a set is just by listing its elements.

Example 1.3.1. (i) A = {0, 1}. This means that the set A consists of two elements, 0
and 1.

(ii) B = {0, {1}, {0, 1}}. The set B contains three elements: the number 0; the set {1}
containing one element: the number 1; and the set containing two elements: the numbers
0 and 1.

The order in which the elements are listed is irrelevant. Thus {0, 1} = {1, 0}.

A set can be also specified by an elementhood test.

Example 1.3.2. (i) C = {x ∈ N | x is a prime number}. The set C contains all primes.
We cannot list them for the reason that there are infinitely many primes. (Here N is
the set of natural numbers 1, 2, 3 . . . )

(ii) D = {x ∈ R | x² − x = 0}. The set D contains the roots of the equation x² − x = 0. But
these are 0 and 1, so the set D contains the same elements as the set A in the previous
example. In this case we say that the sets A and D coincide, and write A = D. (Here
R is the set of real numbers.)

Sets satisfy the following fundamental law (axiom)

If the sets A and B contain the same elements, then they coincide, (or are equal).

1.3.1 Subsets. The empty set


Definition 1.3.1. If each element of the set A is also an element of the set B we say that A
is a subset of B and write A ⊂ B.

This definition can be expressed as a logical proposition as follows:

For every x [(x ∈ A) ⇒ (x ∈ B)].


Note that by the definition

(A = B) ⇔ [(A ⊂ B) ∧ (B ⊂ A)].

This is the main method we will use to prove equalities of sets.
Since it happens that a set may contain no elements, the following definition is useful.

Definition 1.3.2. The empty set is the set which contains no elements.

Warning: there is only one empty set. It is denoted by ∅.

This in particular means that the set of elephants taking this course of Analysis coincides
with the set of natural numbers solving the equation x² − 2 = 0.

Let us illustrate the notion of a subset by a simple example.

Example 1.3.3. Let S = {0, 1, 2}. Then the subsets of S are:

∅, {0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}.

Subsets of a set A which do not coincide with A are called proper subsets. In this example
all subsets except for the last one are proper.

Definition 1.3.3. The set of all subsets of a set A is called the power set of A, and is denoted
by P (A) (or 2A in some books).

Note that in the last example S contains 3 elements, whereas P (S) has 8 elements (2³ = 8).
In general, if a set A has n elements then P (A) has 2ⁿ elements (provided n is finite). Prove
it!
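
As a quick sanity check of this count (a Python sketch using the standard itertools module; it is only an illustration of the definition), one can list all subsets of S = {0, 1, 2} and verify that there are 2³ = 8 of them.

    from itertools import combinations

    S = {0, 1, 2}
    # collect the subsets of every size 0, 1, ..., |S|
    power_set = [set(c) for r in range(len(S) + 1) for c in combinations(S, r)]
    print(power_set)                     # the eight subsets listed in Example 1.3.3
    assert len(power_set) == 2 ** len(S)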

1.3.2 Operations on sets


Now we introduce operations on sets. The main operations are: union, intersection, difference
and symmetric difference.

Union of sets

Definition 1.3.4. The union of sets A and B is the set containing the elements of A and the
elements of B, and no other elements.

We denote the union of A and B by A ∪ B.


Note: existence of the union for arbitrary A and B is accepted as an axiom.

For arbitrary x and arbitrary A and B the following proposition is true.

(x ∈ A ∪ B) ⇔ (x ∈ A) ∨ (x ∈ B).


Intersection of sets

Definition 1.3.5. The intersection of sets A and B is the set containing the elements which
are elements of both A and B, and no other elements.

We denote the intersection of A and B by A ∩ B. Thus for arbitrary x and arbitrary A


and B the following proposition is true.

(x ∈ A ∩ B) ⇔ (x ∈ A) ∧ (x ∈ B).

Difference of sets

Definition 1.3.6. The difference of sets A and B is the set containing the elements of A
which do not belong to B.

We use the notation A − B for the difference. The following is true for arbitrary x and
arbitrary A and B :
(x ∈ A − B) ⇔ [(x ∈ A) ∧ (x ∉ B)].

By the de Morgan law and by the double negation law

¬(x ∈ A − B) ⇔ [¬(x ∈ A) ∨ (x ∈ B)].

Symmetric difference

Definition 1.3.7. The symmetric difference of the sets A and B is defined by

A△B = (A − B) ∪ (B − A).

Let us illustrate the introduced operations by a simple example.

Example 1.3.4. Let A = {0, 1, 2, 3, 4, 5} and B = {1, 3, 5, 7, 9}. Then

A ∪ B = {0, 1, 2, 3, 4, 5, 7, 9}.

A ∩ B = {1, 3, 5}.

A − B = {0, 2, 4}, B − A = {7, 9}.

A△B = {0, 2, 4, 7, 9}.

Note that
A ∪ B = (A ∩ B) ∪ (A△B).
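
These operations correspond directly to Python's built-in set operators, so the computations of Example 1.3.4 and the identity above can be checked mechanically; the following lines are only an illustration of the definitions.

    A = {0, 1, 2, 3, 4, 5}
    B = {1, 3, 5, 7, 9}

    print(A | B)    # union:                {0, 1, 2, 3, 4, 5, 7, 9}
    print(A & B)    # intersection:         {1, 3, 5}
    print(A - B)    # difference A - B:     {0, 2, 4}
    print(B - A)    # difference B - A:     {7, 9}
    print(A ^ B)    # symmetric difference: {0, 2, 4, 7, 9}

    # the identity noted above: A ∪ B = (A ∩ B) ∪ (A △ B)
    assert A | B == (A & B) | (A ^ B)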


1.3.3 Laws for operations on sets


• Commutative laws
(1.3.16) A ∪ B = B ∪ A,
(1.3.17) A ∩ B = B ∩ A.

• Associative laws
(1.3.18) A ∪ (B ∪ C) = (A ∪ B) ∪ C,
(1.3.19) A ∩ (B ∩ C) = (A ∩ B) ∩ C.

• Distributive laws
(1.3.20) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
(1.3.21) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

• Idempotent laws
(1.3.22) A ∪ A = A, A ∩ A = A.

The proofs of the above laws are based on the corresponding laws for conjunction and
disjunction.

Now let us formulate and prove some laws of the difference.

(1.3.23) A ∪ (B − A) = A ∪ B.

Proof.
[x ∈ A ∪ (B − A)] ⇔ {(x ∈ A) ∨ [(x ∈ B) ∧ ¬(x ∈ A)]}
⇔ [(x ∈ A) ∨ (x ∈ B)] ∧ [(x ∈ A) ∨ ¬(x ∈ A)]
⇔ [(x ∈ A) ∨ (x ∈ B)],
since [(x ∈ A) ∨ ¬(x ∈ A)] is true.

From the last formula it follows that the difference is not an inverse operation to the union
(which means that in general A ∪ (B − A) ≠ B).

(1.3.24) A − B = A − (A ∩ B).

Proof.
[x ∈ A − (A ∩ B)] ⇔ {(x ∈ A) ∧ ¬(x ∈ A ∩ B)}
⇔ {(x ∈ A) ∧ ¬[(x ∈ A) ∧ (x ∈ B)]}
⇔ {(x ∈ A) ∧ [¬(x ∈ A) ∨ ¬(x ∈ B)]}
⇔ {[(x ∈ A) ∧ ¬(x ∈ A)] ∨ [(x ∈ A) ∧ ¬(x ∈ B)]}
⇔ [(x ∈ A) ∧ ¬(x ∈ B)]
⇔ (x ∈ A − B),


since [(x ∈ A) ∧ ¬(x ∈ A)] is false.

De Morgan’s laws

(1.3.25) A − (B ∩ C) = (A − B) ∪ (A − C),
(1.3.26) A − (B ∪ C) = (A − B) ∩ (A − C).

The proof is based on de Morgan’s laws for propositions.

1.3.4 Universe. Complement


In many applications of set theory one considers only sets which are contained in some
fixed set. (For example, in plane geometry we study only sets consisting of points of the
plane.) This fixed set is called a universe. We will denote it by U.
Definition 1.3.8. Let U be the universe. The set U − A is called the complement to A. It
is denoted by Aᶜ.
It is easy to see that the following properties hold

(1.3.27) (Aᶜ)ᶜ = A,
(1.3.28) (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ,
(1.3.29) (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ.

Using the properties of the difference prove it!
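
One possible way to convince oneself of these identities before proving them is to test them on a small finite universe; the Python sketch below does this for an arbitrarily chosen U, A and B (the choice of sets is illustrative only).

    U = set(range(10))          # a small universe
    A = {1, 2, 3, 4}
    B = {3, 4, 5, 6}

    def complement(X):
        # the complement of X with respect to the universe U
        return U - X

    assert complement(complement(A)) == A                      # (1.3.27)
    assert complement(A & B) == complement(A) | complement(B)  # (1.3.28)
    assert complement(A | B) == complement(A) & complement(B)  # (1.3.29)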


1.4 Predicates and quantifiers


In mathematics, along with propositions, one deals with statements that depend on one or
more variables, letters denoting elements of sets. In this case we speak of predicates. For
instance, “n is a prime number” is a predicate. As we see, it may be true or false depending
on the value of n. A predicate becomes a proposition after substituting for the variable a
fixed element from the set of definition (the set to which the variable belongs). Generally, a
predicate can be written as
A(x) (x ∈ S),
where S is the set of definition (which we often omit when it is clear what S is).
A subset of S containing all the elements of S which make A(x) true is called the truth
set for A(x).
For the truth set of A(x) we write

Truth set of A = {x ∈ S |A(x)}.

Example 1.4.1. Let A(x) ≡ (x² − x = 0), x ∈ R. (The notation R is used to denote the set of
real numbers, which we will discuss in detail later on.) Then

{x ∈ R | A(x)} = {0, 1}.

We often want to say that some property holds for every element from S. In this case we
use the universal quantifier ∀. So for “for all x ∈ S A(x)” we write (∀x ∈ S) A(x). After
applying the universal quantifier to a predicate we obtain a proposition (which may be true
or false, as usual). The universal quantifier ∀ substitutes for the words “every”, “for every”,
“any”, “for all”.
Example 1.4.2. (i) The proposition “Every real number has a non-negative square” can
be written as
(∀x ∈ R) [x² ≥ 0].

(ii) The proposition “Every real number is non-negative” can be written as

(∀x ∈ R) [x ≥ 0].

Evidently, (i) is true and (ii) is false.

Note that the proposition [(∀x ∈ S) A(x)] means that the truth set of A(x) is the whole set
S. Thus if A(x) is false for some element x of S, the proposition [(∀x ∈ S) A(x)] is false; hence
in order to show that it is false it is enough to find one element of S for which A(x) is false.

To express that a property holds for some element of S, or in other words, “there exists
an element in S such that A(x) holds”, we use
the existential quantifier ∃ and write

(∃x ∈ S) A(x).

∃ substitutes for the words “for some”, “there exists”.


Note that in order to state that

[(∃x ∈ S) A(x)] is true

it is enough to find one element in S for which A(x) is true.


Example 1.4.3. The proposition “Some real numbers are greater than their squares” can
be written as
(∃x ∈ R) [x² < x].
It is true, of course.

In propositions “(∀x) P (x)” and “(∃x) P (x)” P (x) may itself contain quantifiers.
Example 1.4.4. (i)
(∀x ∈ N)(∃y ∈ N) [y = x + x].

(ii)
(∀x ∈ R)(∃y ∈ R) [y < x].

(N denotes the set of natural numbers.)

Quantifiers negation laws

The following equivalences are laws

¬[(∀x ∈ S) P (x)] ⇔ [(∃x ∈ S) ¬P (x)],

¬[(∃x ∈ S) P (x)] ⇔ [(∀x ∈ S) ¬P (x)].

This means that the negation of a proposition beginning with a quantifier is equivalent
to the proposition obtained by pushing the negation inside and changing ∀ to ∃ and ∃ to ∀.
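
Over a finite set S these laws are mirrored by Python's built-in all and any; the sketch below (a finite illustration only, since no program can check a quantifier over an infinite set) verifies them for a sample predicate on S = {0, 1, ..., 9}.

    S = range(10)
    P = lambda x: x % 2 == 0      # a sample predicate: "x is even"

    # ¬[(∀x ∈ S) P(x)]  ⇔  (∃x ∈ S) ¬P(x)
    assert (not all(P(x) for x in S)) == any(not P(x) for x in S)

    # ¬[(∃x ∈ S) P(x)]  ⇔  (∀x ∈ S) ¬P(x)
    assert (not any(P(x) for x in S)) == all(not P(x) for x in S)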
Example 1.4.5. Consider the following proposition

(∀x ∈ N) (∃y ∈ N) (2y > x).

The negation of it is the proposition

(∃x ∈ N) (∀y ∈ N) (2y ≤ x).

Example 1.4.6.

¬[(∀x ∈ X)(∃y ∈ Y )(∀z ∈ Z) P (x, y, z)]


⇔ [(∃x ∈ X)(∀y ∈ Y )(∃z ∈ Z) ¬P (x, y, z)].


1.5 Ordered pairs. Cartesian products


Let us talk about sets from a universe U. Recall that {a} denotes the set containing one
element a. The set {a, b} contains two elements if a ≠ b and one element otherwise. Obviously,
{a, b} = {b, a}. In many problems in mathematics we need an object in which the order in a
pair is important. So, we want to define an ordered pair.
You have already seen ordered pairs studying points in the xy plane. The use of x and y
coordinates to identify points in the plane works by assigning to each point in the plane an
ordered pair of real numbers x and y. The pair must be ordered because, for example, (2, 5)
and (5, 2) correspond to different points.
How to define an ordered pair formally? Whatever we define the ordered pairs (a, b) and
(c, d) to be, it must come out that

(1.5.30) [(a, b) = (c, d)] ⇔ [(a = c) ∧ (b = d)].

Definition 1.5.1.
(a, b) = {{a}, {a, b}}.

Let us prove that (1.5.30) is fulfilled.

Proof. 1)(if) [(a = c) ∧ (b = d)] ⇒ [(a, b) = (c, d)].


Suppose a = c and b = d. Then

(a, b) = {{a}, {a, b}} = {{c}, {c, d}} = (c, d).

2)(only if) [(a, b) = (c, d)] ⇒ [(a = c) ∧ (b = d)]. Suppose that (a, b) = (c, d), i.e.

{{a}, {a, b}} = {{c}, {c, d}}.

There are two cases to consider.


Case 1. a = b. In this case

{a, b} = {a} so {{a}, {a, b}} = {{a}}.

Hence
{{a}} = {{c}, {c, d}}, so {c} = {a} and {c, d} = {c},
i.e. a = b = c = d as required.
Case 2. a ≠ b. In this case we have

[{c} = {a}] ∨ [{c} = {a, b}].

But [{c} = {a, b}] is false since a ≠ b. Hence {c} = {a}, so a = c. Since a ≠ b, it follows
that {a} ≠ {a, b}. Therefore {c, d} = {a, b} and, as a = c, {a, d} = {a, b}. Hence b = d, as
required.
Thus the definition of ordered pair, however “artificial” it may appear, gives to ordered
pairs the crucial property that we require of them; and that is all we can reasonably require
from a mathematical definition.
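
The definition can even be experimented with directly. Encoding sets of sets by Python's immutable frozenset (an encoding chosen here purely for illustration), one can check property (1.5.30) on concrete examples.

    def pair(a, b):
        # the set-theoretic ordered pair (a, b) = {{a}, {a, b}}
        return frozenset({frozenset({a}), frozenset({a, b})})

    assert pair(1, 2) == pair(1, 2)                    # equal components give equal pairs
    assert pair(1, 2) != pair(2, 1)                    # the order matters
    assert pair(1, 1) == frozenset({frozenset({1})})   # the case a = b collapses to {{a}}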
Now we define Cartesian product which is an essential tool for further development.


Definition 1.5.2. Let A and B be sets. The Cartesian product of A and B, denoted by
A × B, is the set of all ordered pairs (a, b) in which a ∈ A and b ∈ B, i.e.

A × B = {(a, b) | (a ∈ A) ∧ (b ∈ B)}.

Thus
(p ∈ A × B) ⇔ {(∃a ∈ A)(∃b ∈ B) [p = (a, b)]}.
Example 1.5.1. 1. If A ={red, green} and B = {1, 2, 3} then

A × B = {(red, 1), (red, 2), (red, 3), (green, 1), (green, 2), (green, 3)}.

2. R × R = {(x, y) | x and y are real numbers}. These are the coordinates of all points in the
plane. The notation R² is usually used for this set.
X × X is called the Cartesian square of X.

The following theorem provides some basic properties of the Cartesian product.
Theorem 1.5.2. Let A, B, C, D be sets. Then

(1.5.31) A × (B ∩ C) = (A × B) ∩ (A × C),
(1.5.32) A × (B ∪ C) = (A × B) ∪ (A × C),
(1.5.33) A × ∅ = ∅ × A = ∅.

Proof of (1.5.31). For arbitrary p

[p ∈ A × (B ∩ C)] ⇔ {(∃a ∈ A)(∃x ∈ B ∩ C) [p = (a, x)]}


⇔{(∃a ∈ A)(∃x ∈ B) [p = (a, x)]} ∧ {(∃a ∈ A)(∃x ∈ C) [p = (a, x)]}
⇔{p ∈ A × B} ∧ {p ∈ A × C}
⇔{p ∈ (A × B) ∩ (A × C)}

which proves (1.5.31).


The proof of (1.5.32) and of (1.5.33) is left as an exercise.


1.6 Relations
Definition 1.6.1. Let X, Y be sets. A set R ⊂ X × Y is called a relation from X to Y .
If (x, y) ∈ R, we say that x is in the relation R to y. We will also write in this case xRy.
Example 1.6.1. 1. Let A = {1, 2, 3}, B = {3, 4, 5}. The set R = {(1, 3), (1, 5), (3, 3)} is
a relation from A to B since R ⊂ A × B.
2. G = {(x, y) ∈ R × R | x > y} is a relation from R to R.
Definition 1.6.2. Let R be a relation from X to Y . The domain of R is the set
D(R) = {x ∈ X | ∃y ∈ Y [(x, y) ∈ R]}.
The range of R is the set
Ran(R) = {y ∈ Y | ∃x ∈ X [(x, y) ∈ R]}.
The inverse of R is the relation R⁻¹ from Y to X defined as follows
R⁻¹ = {(y, x) ∈ Y × X | (x, y) ∈ R}.
Definition 1.6.3. Let R be a relation from X to Y , S be a relation from Y to Z. The
composition of S and R is a relation from X to Z defined as follows
S ◦ R = {(x, z) ∈ X × Z | ∃y ∈ Y [(x, y) ∈ R] ∧ [(y, z) ∈ S]}.
Theorem 1.6.2. Let R be a relation from X to Y , S be a relation from Y to Z, T be a
relation from Z to V . Then

1. (R⁻¹)⁻¹ = R.
2. D(R⁻¹) = Ran(R).
3. Ran(R⁻¹) = D(R).
4. T ◦ (S ◦ R) = (T ◦ S) ◦ R.
5. (S ◦ R)⁻¹ = R⁻¹ ◦ S⁻¹.

For the proof see [1], p.170.

Next we take a look at some particular types of relations. Let us consider relations from
X to X, i.e. subsets of X × X. In this case we talk about relations on X.
A simple example of such a relation is the identity relation on X which is defined as follows
iX = {(x, y) ∈ X × X | x = y}.
Definition 1.6.4. 1. A relation R on X is said to be reflexive if
(∀x ∈ X) (x, x) ∈ R.

2. R is said to be symmetric if
(∀x ∈ X)(∀y ∈ X) {[(x, y) ∈ R] ⇒ [(y, x) ∈ R]}.

3. R is said to be transitive if
(∀x ∈ X)(∀y ∈ X)(∀z ∈ X){[((x, y) ∈ R) ∧ ((y, z) ∈ R)] ⇒ [(x, z) ∈ R]}.


Equivalence relations

A particularly important class of relations are equivalence relations.


Definition 1.6.5. A relation R on X is called an equivalence relation if it is reflexive, symmetric
and transitive.
Example 1.6.3. 1. Let X be a set of students. A relation on X × X: “to be friends”.
It is reflexive (I presume that everyone is a friend to himself/herself). It is symmetric.
But it’s not transitive.

2. Let X = R, a be some positive number. Define R ⊂ X × X as

R = {(x, y) | |x − y| ≤ a}.

R is reflexive, symmetric, but not transitive.

3. Let X = Z, m ∈ N. Define the congruence mod m on X × X as follows:

x ≡ y if (∃k ∈ Z)[x − y = km].

This is an equivalence relation on X.

Definition 1.6.6. Let R be an equivalence relation on X. Let x ∈ X. The equivalence class


of x with respect to R is the set

[x]R = {y ∈ X | (y, x) ∈ R}.

Let us take a look at several properties of classes of equivalence.


Proposition 1.6.1. Let R be an equivalence relation on X. Then
1.(∀x ∈ X) x ∈ [x]R .
2.(∀x ∈ X)(∀y ∈ X) [(y ∈ [x]R ) ⇔ ([y]R = [x]R )].

Proof. 1. Since R is reflexive, (x, x) ∈ R, hence x ∈ [x]R .


2. First, let y ∈ [x]R . (a) Suppose that z ∈ [y]R (an arbitrary element of [y]R ). Then, using
symmetry, (x, y) ∈ R and (y, z) ∈ R, so by transitivity (x, z) ∈ R and, by symmetry again,
z ∈ [x]R , which shows that [y]R ⊂ [x]R . (b) Suppose that z ∈ [x]R . Similarly one shows that z ∈ [y]R .
Therefore [x]R = [y]R .
The implication ([y]R = [x]R ) ⇒ (y ∈ [x]R ) follows from part 1.

From the above proposition it follows that distinct equivalence classes are disjoint and every
element of the set X belongs to an equivalence class (so the union of the classes equals the
set X).
Remark 1. See more on this in [1].
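
For the congruence mod m of Example 1.6.3 this partition can be displayed explicitly on a finite sample; the short Python sketch below (illustrative only) groups the integers 0, ..., 11 into the three classes of congruence mod 3.

    m = 3
    X = range(12)

    # x and y lie in the same class exactly when x - y is a multiple of m
    classes = {}
    for x in X:
        classes.setdefault(x % m, []).append(x)

    print(classes)   # {0: [0, 3, 6, 9], 1: [1, 4, 7, 10], 2: [2, 5, 8, 11]}

    # the classes are disjoint and their union is the whole of X
    assert sorted(v for c in classes.values() for v in c) == list(X)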


1.7 Functions
1.7.1 Function
The notion of a function is of fundamental importance in all branches of mathematics. You
met functions in your previous study of mathematics, but without precise definition. Here
we give a definition and make connections to examples you learned before.

Definition 1.7.1. Let X and Y be sets. Let F be a relation from X to Y . Then F is called
a function if the following properties are satisfied
(i) (∀x ∈ X)(∃y ∈ Y ) [(x, y) ∈ F ].
In this case it is customary to write y = F (x).
(ii) (∀x ∈ X)(∀y ∈ Y )(∀z ∈ Y ){([(x, y) ∈ F ] ∧ [(x, z) ∈ F ]) ⇒ (y = z)}.
(In other words, for every x ∈ X there is only one y ∈ Y such that (x, y) ∈ F ).
X is called the domain of F and Y is called codomain.

Let us consider several examples.


Example 1.7.1. (i) Let X = {1, 2, 3}, Y = {4, 5, 6}. Define F ⊂ X × Y as
F = {(1, 4), (2, 5), (3, 5)}.
Then F is a function.
(In contrast to that define G ⊂ X × Y as
G = {(1, 4), (1, 5), (2, 6), (3, 6)}.
Then G is not a function.)
(ii) Let X = R and Y = R. Define F ⊂ X × Y as
F = {(x, y) ∈ R × R | y = x²}.
Then F is a function from R to R.
(In contrast to that define G ⊂ X × Y as
G = {(x, y) ∈ R × R | x² + y² = 1}.
Then G is not a function.)

We often use the notation


F : X→Y
for a function F from X to Y .
Note that in order to define a function F from X to Y we have to define X, Y and a
subset of X × Y satisfying (i) and (ii) of the definition.
Let F : X → Y be a function. If x ∈ X then we know that there is a unique y such that
(x, y) ∈ F . For this y we write y = F (x). This y is called the image of x under F .
Note: one can assign a value of y to each value x ∈ X by means of a “rule” or a “formula”.
However, these notions are rather vague and cannot be used in the definition of a function.


Theorem 1.7.2. Let X, Y be sets, F, G be functions from X to Y . Then

[(∀x ∈ X)(F (x) = G(x))] ⇔ (F = G).

Proof. 1. (⇒). Let x ∈ X. Then (∃y ∈ Y )(y = F (x)). But G(x) = F (x), so (x, y) ∈ G.
Therefore F ⊂ G. Analogously one sees that G ⊂ F .
2. (⇐). Obvious since the sets F and G are equal.
The above theorem says that in order to establish that two functions are equal one has to
check that they have the same domain and codomain and that for every element of the domain
they have equal images. Note that equal functions may be defined by different “rules”.
Example 1.7.3. Let f : R → R, g : R → R, h : R → R+ (by R+ we denote the set of
non-negative real numbers). Let, for all x ∈ R,

f(x) = (x + 1)², g(x) = x² + 2x + 1, h(x) = (x + 1)².

Then f and g are equal, but f and h are not since they have different codomains.

Definition 1.7.2. Let f : X → Y be a function. The set

Ran(f ) = {y | (∃x ∈ X)(f (x) = y)}

is called the range of f .


Example 1.7.4. Let f : R → R, f(x) = 2x/(x² + 1). Find Ran(f).

Solution. We have to find the set of y such that the equation y = 2x/(x² + 1) has a solution x ∈ R.
The equation is equivalent to yx² − 2x + y = 0. The existence of a real solution is equivalent
to D = 4 − 4y² ≥ 0. Solving this inequality we obtain that Ran(f) = [−1, 1].

The definition of composition of relations can be also applied to functions. If f : X → Y


and g : Y → Z then

g ◦ f = {(x, z) ∈ X × Z | (∃y ∈ Y )[(x, y) ∈ f ] ∧ [(y, z) ∈ g]}.

Theorem 1.7.5. Let f : X → Y and g : Y → Z. Then g ◦ f : X → Z and

(∀x ∈ X) [(g ◦ f )(x) = g(f (x))].

Proof. We know that g ◦ f is a relation. So we must prove that for every x ∈ X there exists
a unique element z ∈ Z such that (x, z) ∈ g ◦ f .
Existence: Let x ∈ X be arbitrary. Then ∃y ∈ Y such that y = f (x), or in other words,
(x, y) ∈ f . Also ∃z ∈ Z such that z = g(y), or in other words, (y, z) ∈ g. By the definition it
means that (x, z) ∈ g ◦ f . Moreover, we see that

(g ◦ f )(x) = g(f (x)).

Uniqueness: Suppose that (x, z1 ) ∈ g ◦ f and (x, z2 ) ∈ g ◦ f . Then by the definition of


composition (∃y1 ∈ Y ) [(x, y1 ) ∈ f ] ∧ [(y1 , z1 ) ∈ g] and (∃y2 ∈ Y ) [(x, y2 ) ∈ f ] ∧ [(y2 , z2 ) ∈ g].
But f is a function. Therefore y1 = y2 . g is also a function, hence z1 = z2 .


Example 1.7.6. Let f : R → R, g : R → R,

f(x) = 1/(x² + 2), g(x) = 2x − 1.

Find (f ◦ g)(x) and (g ◦ f)(x).
Solution.

(f ◦ g)(x) = f(g(x)) = 1/([g(x)]² + 2) = 1/((2x − 1)² + 2),

(g ◦ f)(x) = g(f(x)) = 2f(x) − 1 = 2/(x² + 2) − 1.

Warning: As you clearly see from the above, f ◦ g ≠ g ◦ f.
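
The same computation can be repeated numerically; the Python lines below (an illustration only) evaluate both compositions at a sample point and confirm that they disagree.

    f = lambda x: 1 / (x ** 2 + 2)
    g = lambda x: 2 * x - 1

    fog = lambda x: f(g(x))        # (f ∘ g)(x) = 1 / ((2x - 1)² + 2)
    gof = lambda x: g(f(x))        # (g ∘ f)(x) = 2 / (x² + 2) - 1

    x = 1.0
    print(fog(x), gof(x))          # 1/3 and -1/3
    assert fog(x) != gof(x)        # composition is not commutative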

Images and inverse images

Definition 1.7.3. Let f : X → Y and A ⊂ X. The image of A under f is the set

f (A) = {f (x) | x ∈ A}.

Example 1.7.7. Let f : R → R be defined by f(x) = x². Let A = {x ∈ R | 0 ≤ x ≤ 2}. Then


f (A) = [0, 4].

The following theorem (which we give without proof) establishes some properties of images
of sets.
Theorem 1.7.8. Let f : X → Y and A ⊂ X, B ⊂ X. Then

(i) f (A ∪ B) = f (A) ∪ f (B),

(ii) f (A ∩ B) ⊂ f (A) ∩ f (B),

(iii) (A ⊂ B) ⇒ [f (A) ⊂ f (B)].

Remark 2. Note that in (ii) there is no equality in general. Consider the following example.
Let f : R → R be defined by f(x) = x². Let A = [−1, 1/2] and B = [−1/2, 1]. Then f(A) = [0, 1],
f(B) = [0, 1], so that f(A) ∩ f(B) = [0, 1]. At the same time A ∩ B = [−1/2, 1/2], and hence
f(A ∩ B) = [0, 1/4].

Definition 1.7.4. Let f : X → Y and B ⊂ Y . The inverse image of B under f is the set

f⁻¹(B) = {x ∈ X | f(x) ∈ B}.

Example 1.7.9. Let f : R → R be defined by f(x) = x². Let B = [−1, 4]. Then f⁻¹(B) =
[−2, 2].
The following theorem (which we give without proof) establishes some properties of inverse
images of sets.


Theorem 1.7.10. Let f : X → Y and A ⊂ Y , B ⊂ Y . Then

(i) f⁻¹(A ∪ B) = f⁻¹(A) ∪ f⁻¹(B),

(ii) f⁻¹(A ∩ B) = f⁻¹(A) ∩ f⁻¹(B),

(iii) (A ⊂ B) ⇒ [f⁻¹(A) ⊂ f⁻¹(B)].


Remark 3. Note the difference with the previous theorem.

1.7.2 Injections and surjections. Bijections. Inverse functions


In the last section we saw that the composition of two functions is again a function. If
f : X → Y then f is a relation from X to Y. One can define the inverse relation f⁻¹. Then the
question arises: is f⁻¹ a function? In general the answer is “no”. In this section we will
study particular classes of functions and find out when the answer is “yes”.

Definition 1.7.5. Let f : X → Y . Then f is called an injection if

(∀x1 ∈ X)(∀x2 ∈ X)[(f (x1 ) = f (x2 )) ⇒ (x1 = x2 )].

The above definition means that f is a one-to-one correspondence between X and Ran(f).
Using the contrapositive law one can rewrite the above definition as follows.

(∀x₁ ∈ X)(∀x₂ ∈ X)[(x₁ ≠ x₂) ⇒ (f(x₁) ≠ f(x₂))].

Example 1.7.11. (i) Let X = {1, 2, 3}, Y = {4, 5, 6, 7}. Define f : X → Y as

f = {(1, 5), (2, 6), (3, 7)}.

Then f is an injection.
In contrast define g : X → Y as

g = {(1, 5), (2, 6), (3, 5)}.

Then g is not an injection since g(1) = g(3).


(ii) Let f : N → N be defined by f(n) = n². Then f is an injection.
In contrast, let g : Z → Z be defined by g(n) = n². Then g is not an injection since, for instance,
g(1) = g(−1).

Definition 1.7.6. Let f : X → Y . Then f is called a surjection if

(∀y ∈ Y )(∃x ∈ X)[f (x) = y].

The above definition means that Ran(f ) = Y . For this reason surjections are sometimes
called onto.


Example 1.7.12. (i) Let X = {1, 2, 3, 4}, Y = {5, 6, 7}. Define f : X → Y as


f = {(1, 5), (2, 6), (3, 7), (4, 6)}.
Then f is a surjection.
In contrast define g : X → Y as
g = {(1, 5), (2, 6), (3, 5), (4, 6)}.
Then g is not a surjection since 7 is not in its range.
(ii) Let f : Z → Z be defined by f(n) = n + 2. Then f is a surjection.
In contrast, let g : Z → Z be defined by g(n) = n². Then g is not a surjection since, for instance,
there is no integer whose square is 2.

Definition 1.7.7. Let f : X → Y . Then f is called a bijection if it is an injection and a


surjection.

Example 1.7.13. (i) Let X = {1, 2, 3} and Y = {4, 5, 6}. Define f : X → Y by


f = {(1, 4), (2, 5), (3, 6)}.
Then f is a bijection.
(ii) Let X = Y = [0, 1]. Define g : X → Y by g(x) = x². Then g is a bijection.
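
For functions between finite sets these properties can be tested by direct enumeration. The sketch below (the helper names are chosen here for illustration) represents a function by a Python dictionary on its domain and re-checks the examples of this section.

    def is_injection(f):
        # no two elements of the domain share an image
        return len(set(f.values())) == len(f)

    def is_surjection(f, codomain):
        # every element of the codomain is an image
        return set(f.values()) == set(codomain)

    f = {1: 5, 2: 6, 3: 7, 4: 6}      # Example 1.7.12(i): a surjection which is not an injection
    g = {1: 5, 2: 6, 3: 5}            # Example 1.7.11(i), the map g: not an injection
    h = {1: 4, 2: 5, 3: 6}            # Example 1.7.13(i): a bijection

    assert is_surjection(f, {5, 6, 7}) and not is_injection(f)
    assert not is_injection(g)
    assert is_injection(h) and is_surjection(h, {4, 5, 6})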

Now we are ready to answer the question about the inverse of a function. First, recall
that if f : X → Y then f⁻¹ is a relation from Y to X with the properties D(f⁻¹) = Ran(f)
and Ran(f⁻¹) = D(f). So, if f⁻¹ is a function from Y to X then D(f⁻¹) = Y. Therefore we
conclude that the condition Ran(f) = Y is a necessary condition for f⁻¹ to be a function,
which means that f has to be surjective. Also, from the definition of the inverse relation f⁻¹
it is clear that injectivity of f is also a necessary condition for f⁻¹ to be a function. Hence
bijectivity of f is a necessary condition for f⁻¹ to be a function.
It turns out that this is also a sufficient condition.
Theorem 1.7.14. Let f : X → Y . Then
(f⁻¹ : Y → X) ⇔ (f is a bijection).

Proof. We have to show only that

(f is a bijection) ⇒ (f⁻¹ : Y → X).
Recall that
f⁻¹ = {(y, x) ∈ Y × X | (x, y) ∈ f}.
We have to verify two properties in the definition of a function.
First, existence. We have to show that (∀y ∈ Y)(∃x ∈ X)[(y, x) ∈ f⁻¹], or in other words,
(∀y ∈ Y)(∃x ∈ X)[(x, y) ∈ f]. This follows from the surjectivity of f.
Second, uniqueness. We have to show that (∀x₁ ∈ X)(∀x₂ ∈ X){[((y, x₁) ∈ f⁻¹) ∧
((y, x₂) ∈ f⁻¹)] ⇒ (x₁ = x₂)}, or in other words (∀x₁ ∈ X)(∀x₂ ∈ X){[f(x₁) = f(x₂)] ⇒
(x₁ = x₂)}. But this follows from the injectivity of f.


Definition 1.7.8. Let f : X → Y . If f⁻¹ is a function from Y to X we say that f is
invertible. In that case f⁻¹ is called the inverse function.

Theorem 1.7.15. Let f : X → Y . Let f⁻¹ be a function from Y to X. Then

f⁻¹ ◦ f = iX and f ◦ f⁻¹ = iY .

Proof. Let x ∈ X be arbitrary. Let y = f(x) ∈ Y . Then (x, y) ∈ f so that (y, x) ∈ f⁻¹.
Therefore f⁻¹(y) = x. Thus

(f⁻¹ ◦ f)(x) = f⁻¹(f(x)) = f⁻¹(y) = x.

We have proved that

(∀x ∈ X)[(f⁻¹ ◦ f)(x) = x].

This is the same as to say that f⁻¹ ◦ f = iX . The second statement is similar and is left
as an exercise.

Example 1.7.16. Let f : R → R be defined by f(x) = (x + 7)/5. Then f is a bijection.

Indeed, let x₁, x₂ ∈ R be arbitrary and (x₁ + 7)/5 = (x₂ + 7)/5. Then, of course, x₁ + 7 = x₂ + 7,
therefore x₁ = x₂, so f is an injection.
Now let y ∈ R be arbitrary and y = (x + 7)/5. Then, of course, x + 7 = 5y and x = 5y − 7 ∈ R.
So we have proved that
(∀y ∈ R)(∃x ∈ R)[y = f(x)].
By definition this means that f is a surjection.
So f is a bijection. Moreover f⁻¹(y) = 5y − 7. This means that g(x) = 5x − 7 is the inverse
function to f.


1.8 Some methods of proof. Proof by induction


1. First we discuss a couple of widely used methods of proof: contrapositive proof and proof
by contradiction.

The idea of contrapositive proof is the following equivalence

(A ⇒ B) ⇔ (¬B ⇒ ¬A).

So to prove that A ⇒ B is true is the same as to prove that ¬B ⇒ ¬A is true.

Example 1.8.1. For integers m and n, if mn is odd then so are m and n.

Proof. We have to prove that (∀m, n ∈ Z)

(mn is odd) ⇒ [(m is odd) ∧ (n is odd)],

which is the same as to prove that

[(m is even) ∨ (n is even)] ⇒ (mn is even).

The latter is evident.

The idea of proof by contradiction is the following equivalence

(A ⇒ B) ⇔ (¬A ∨ B) ⇔ ¬(A ∧ ¬B).

So to prove that A ⇒ B is true is the same as to prove that ¬A ∨ B is true or else that A ∧ ¬B
is false.

Example 1.8.2. Let x, y ∈ R be positive. If x² + y² = 25 and x ≠ 3, then y ≠ 4.

Proof. In order to prove by contradiction we assume that

[(x² + y² = 25) ∧ (x ≠ 3)] ∧ (y = 4).

Then x² + y² = x² + 16 = 25, so x² = 9 and, since x is positive, x = 3. Hence (x = 3) ∧ (x ≠ 3), which is a contradiction.

2. In the remaining part of this short section we will discuss an important property of
natural numbers. The set of natural numbers

N = {0, 1, 2, 3, 4, . . . }

which we will always denote by N, is taken for granted.

We shall denote the set of positive natural numbers N+ = {1, 2, 3, 4, . . . }.


The Principle of Mathematical Induction is often used when one needs to prove the state-
ment of the form
(∀n ∈ N+ ) P (n)
or similar types of statements.
Since there are infinitely many natural numbers we cannot check one by one that they all
have property P . The idea of mathematical induction is that to list all natural numbers one
has to start from 1 and then repeatedly add 1. Thus one can show that 1 has property P
and that whenever one adds 1 to a number that has property P , the resulting number also
has property P .
Principle of Mathematical Induction. If for a statement P (n)

(i) P (1) is true,


(ii) [P (n) ⇒ P (n + 1)] is true,

then (∀n ∈ N+ ) P (n) is true.


Part (i) is called base case; (ii) is called induction step.

Example 1.8.3. Prove that for all n ∈ N+:

1² + 2² + 3² + · · · + n² = n(n + 1)(2n + 1)/6.

Solution. Base case: n = 1. 1² = (1 · 2 · 3)/6 is true.
Induction step: Suppose that the statement is true for n = k (k ≥ 1). We have to prove that
it is true for n = k + 1. So our assumption is

1² + 2² + 3² + · · · + k² = k(k + 1)(2k + 1)/6.

Therefore we have

1² + 2² + 3² + · · · + k² + (k + 1)² = k(k + 1)(2k + 1)/6 + (k + 1)²
= (k + 1)[k(2k + 1) + 6(k + 1)]/6
= (k + 1)(2k² + 7k + 6)/6
= (k + 1)(k + 2)(2k + 3)/6,

which proves the statement for n = k + 1. By the principle of mathematical induction the
statement is true for all natural n.
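
Induction establishes the identity for every n; a finite check such as the short Python loop below (illustrative only) proves nothing by itself, but it is a convenient way of testing a conjectured formula before attempting the induction.

    for n in range(1, 101):
        lhs = sum(k ** 2 for k in range(1, n + 1))     # 1² + 2² + ... + n²
        rhs = n * (n + 1) * (2 * n + 1) // 6
        assert lhs == rhs
    print("formula checked for n = 1, ..., 100")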

Example 1.8.4. 1 + 3 + 5 + · · · + (2n − 1) =?


First we have to work out a conjecture. For this let us try several particular cases for n.
n = 1: 1 = 1;
n = 2: 1 + 3 = 4 = 2²;
n = 3: 1 + 3 + 5 = 9 = 3².
So a reasonable conjecture is that ∀n ∈ N+,
1 + 3 + 5 + · · · + (2n − 1) = n².


Proof. We need not check the base case separately, since it has already been done while working out the conjecture.
Induction step:
Let
1 + 3 + 5 + · · · + (2k − 1) = k² (k ≥ 1).
Then
1 + 3 + 5 + · · · + (2k − 1) + (2k + 1) = k² + 2k + 1 = (k + 1)².
This completes the proof by induction.

The base case need not start from n = 1; it can start from any integer.

Example 1.8.5. Prove that (∀n ∈ N)(3 | (n³ − n)).

Solution. Base case: n = 0. (3 | 0) is true.
Induction step:
Assume that 3 | (k³ − k) for some k ≥ 0. Then

(k + 1)³ − (k + 1) = k³ + 3k² + 3k + 1 − k − 1 = (k³ − k) + 3(k² + k),

where the first term is divisible by 3 by the induction hypothesis and the second term is
divisible by 3 as well.

Example 1.8.6. Prove that (∀n ∈ N)[(n > 5) ⇒ (2ⁿ > n²)].
Solution. Base case: n = 5. Indeed 2⁵ = 32 > 25 = 5².
Induction step:
Suppose that 2ᵏ > k² (k ≥ 5). Then

2ᵏ⁺¹ = 2 · 2ᵏ > 2k².

Now it is sufficient to prove that

2k² > (k + 1)².
Consider the difference:

2k² − (k + 1)² = k² − 2k − 1 = (k − 1)² − 2.

Since k ≥ 5 we have that k − 1 ≥ 4 and (k − 1)² − 2 ≥ 14 > 0, which proves the above
inequality.
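
Again, a finite computation is no substitute for the induction, but it is a handy sanity check of the base case and of the statement itself; one possible Python version is the following.

    for n in range(5, 60):                 # the induction starts at n = 5
        assert 2 ** n > n ** 2
    print("2^n > n^2 checked for n = 5, ..., 59")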



Chapter 2

Numbers

2.1 Various Sorts of Numbers


2.1.1 Integers

We take for granted the system N of natural numbers

N = {0, 1, 2, 3, 4 . . .}

stressing only that for N the following properties hold.

(∀a ∈ N)(∀b ∈ N)(∃c ∈ N)(∃d ∈ N)[(a + b = c) ∧ (ab = d)].

(closure under addition and multiplication)

(∀a ∈ N)[a · 1 = 1 · a = a].

(existence of a multiplicative identity)

(∀a ∈ N)(∀b ∈ N)[(a = b) ∨ (a < b) ∨ (a > b)].

(order)
The first difficulty occurs when we try to solve in N the equation

a + x = b,

with b ≤ a. In order to make this equation soluble we have to widen the set N by introducing
0 and negative integers as solutions of the equations

a + x = a (existence of additive identity)

and
a + x = 0 (existence of additive inverse)
respectively. Our extended system, which is denoted by Z, now contains all integers and can
be arranged in order

Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . .} = N ∪ {−a | a ∈ N}.


2.1.2 Rational Numbers


Let a ∈ Z, b ∈ Z. The equation

(2.1.1) ax = b

need not have a solution x ∈ Z. In order to enable one to solve (2.1.1) (for a ≠ 0) we
have to widen our system of numbers again so that it includes fractions b/a (existence of
multiplicative inverses in Z − {0}). This motivates the following
Definition 2.1.1. The rational numbers, or rationals, form the set

{r = p/q | p ∈ Z, q ∈ N, q ≠ 0}.

The set of rational numbers will be denoted by Q. When writing p/q for a rational we
always assume that the numbers p and q have no common factor greater than 1.
All the arithmetical operations in Q are straightforward. Let us introduce a relation of
order for rationals.
Definition 2.1.2. Let b ∈ N, d ∈ N, with both b, d > 0. Then

(a/b > c/d) ⇐⇒ (ad > bc).
The following theorem provides a very important property of rationals.
Theorem 2.1.1. Between any two rational numbers there is another (and, hence, infinitely
many others).
Proof. Let b ∈ N, d ∈ N, both b, d > 0, and a/b > c/d. Notice that

(∀m ∈ N) [a/b > (a + mc)/(b + md) > c/d].
Indeed, since b, d and m are positive we have

[a(b + md) > b(a + mc)] ⇐⇒ [mad > mbc] ⇔ (ad > bc),

and
[d(a + mc) > c(b + md)] ⇔ (ad > bc).

2.1.3 Irrational Numbers


Suppose that a ∈ Q and consider the equation

(2.1.2) x² = a.

In general (2.1.2) does not have rational solutions. For example the following theorem holds.
Theorem 2.1.2. No rational number has square 2.


Proof. Suppose for a contradiction that p/q, with p ∈ Z, q ∈ N, q ≠ 0 and p, q having no
common factor greater than 1, is such that (p/q)² = 2. Then p² = 2q². Hence p² is even, and
so is p. Hence (∃k ∈ Z) [p = 2k]. This implies that

2k² = q²

and therefore q is also even. The last statement contradicts our assumption that p and q have
no common factor.
Theorem 2.1.2 provides an example of a number which is not rational, or irrational. Here
are some other examples of irrational numbers.
Theorem 2.1.3. No rational x satisfies the equation x³ = x + 7.

Proof. First we show that there are no integers satisfying the equation x³ = x + 7. For a
contradiction suppose that there is. Then x(x + 1)(x − 1) = 7, from which it follows that x
divides 7. Hence x can only be ±1 or ±7. Direct verification shows that these numbers do not
satisfy the equation.
Second, we show that there are no fractions satisfying the equation x³ = x + 7. For a
contradiction suppose that there is. Let m/n, with m ∈ Z, n ∈ N, n ≠ 0, and m, n having no
common factor greater than 1, be such that (m/n)³ = m/n + 7. Multiplying this equality by n²
we obtain m³/n = mn + 7n², which is impossible since the right-hand side is an integer and
the left-hand side is not (n ≥ 2 because the integer case has already been excluded, and m
and n have no common factor).
Example 2.1.4. No rational satisfies the equation x⁵ = x + 4.

Prove it!

Algebraic numbers A correspond to all real solutions of polynomial equations with integer
coefficients. So x is algebraic if there exist n ∈ N and a₀, a₁, . . . , aₙ ∈ Z, not all zero, such that

a₀ + a₁x + · · · + aₙxⁿ = 0.

2.1.4 Cuts of the Rationals



In this subsection we discuss how a particular irrational number, say √2, fits in among
the rationals.
In trying to isolate a number whose square is 2, we first observe from Theorem 2.1.2 that
the positive rational numbers fall into two classes: those whose squares are less than 2
and those whose squares are greater than 2. Call these classes the left-hand class L and the
right-hand class R respectively, corresponding to their relative positions when represented graphically on
a horizontal line. Examples of numbers l ∈ L are 7/5 and 1.41, and of numbers r ∈ R are
17/12 and 1.42. At this stage we shall have to convince ourselves that for elements of the
classes L and R the following properties hold.

(i) (∀l ∈ L)(∀r ∈ R)[r > l];

(ii) (∀l1 ∈ L)(∃l2 ∈ L)[l2 > l1 ];

(iii) (∀r1 ∈ R)(∃r2 ∈ R)[r2 < r1 ].


The last three statements become more concrete if we use the arithmetical rule for square
roots to find, to as many decimal places as we please, a set of numbers l ∈ L

1, 1.4, 1.41, 1.414, . . . ,

each of which is greater than the preceding (or equal to it if the last digit is 0) and each
having its square less than 2. Moreover, the numbers obtained by adding 1 to the last digit of these
numbers l ∈ L form a set of numbers r ∈ R

2, 1.5, 1.42, 1.415, . . . ,

each having its square greater than 2 and each less than (or equal to) the preceding.
If we are now given a particular number a whose square is less than 2, then by going
far enough along the set of numbers 1, 1.4, 1.41, 1.414, . . . we shall come to one which is greater
than a.
If, then, we are building a number-system starting with the integers and then including the
rational numbers, we see that an irrational number (such as √2) corresponds to and can be
defined by a cutting of the rationals into two classes L, R of which L has no greatest member
and R no least member. This is Dedekind’s definition of irrationals by the cut.

There are cuts which correspond to non-algebraic numbers: the transcendental numbers,
denoted by T. For example, the cut corresponding to the solution of 2ˣ = 3 is transcendental. (Check that
no rational x satisfies 2ˣ = 3.) The proof that x ∉ A is more complicated. Summarising:

N ⊂ Z ⊂ Q ⊂ A ⊂ T ∪ A = R.


2.2 The Field of Real Numbers


In the previous sections, starting from integers, we have sketched the building-up of the set
of real numbers which is the union of the sets of rationals and irrationals and denoted by R.
Now we make a list of the basic properties which the real numbers satisfy. Since
we provide no proof of these statements we present them in the form of axioms. Naturally they
fall into two groups, concerning the algebraic operations with real numbers and the relation of
order for them.
Those of you familiar with basic concepts of algebra will find that the first group of axioms
defines R to be a field.

A.1. (∀a ∈ R)(∀b ∈ R) [(a + b) ∈ R] .

A.2. (∀a ∈ R)(∀b ∈ R) [a + b = b + a] .

A.3. (∀a ∈ R)(∀b ∈ R)(∀c ∈ R) [(a + b) + c = a + (b + c)] .

A.4. (∃0 ∈ R)(∀a ∈ R) [0 + a = a] .

A.5. (∀a ∈ R)(∃!x ∈ R) [a + x = 0] . We write x = −a.

Notation: We use the symbol ∃! in order to say that there exists a unique... So (∃!x ∈ R)
has to be read: there exists a unique real number x.
The axioms A.6-A.10 that follow are analogues of A.1-A.5 for the operation of multipli-
cation.

A.6. (∀a ∈ R)(∀b ∈ R) [ab ∈ R].

A.7. (∀a ∈ R)(∀b ∈ R) [ab = ba].

A.8. (∀a ∈ R)(∀b ∈ R)(∀c ∈ R) [(ab)c = a(bc)].

A.9. (∃1 ∈ R)(∀a ∈ R) [1 · a = a].

A.10. (∀a ∈ R − {0})(∃!y ∈ R) [ay = 1] . We write y = 1/a.

The last axiom links the operations of summation and multiplication.

A.11. (∀a ∈ R)(∀b ∈ R)(∀c ∈ R) [(a + b)c = ac + bc] .

From the axioms above the familiar rules of manipulation of real numbers can be deduced.
As an illustration we present
Example 2.2.1. (∀a ∈ R)[0 · a = 0]. Indeed, by A.11, A.2 and A.4 we have

1 · a + 0 · a = (1 + 0)a = 1 · a.

Adding −(1 · a) to both sides and using A.3, A.5 and A.4, we obtain 0 · a = 0.


Let us observe that integers do not form a field since they do not satisfy axiom A.10.
Now we are adding the relevant axioms of order for real numbers.


O.1. (∀a ∈ R)(∀b ∈ R)[(a = b) ∨ (a < b) ∨ (a > b)].


(∀a ∈ R)(∀b ∈ R)[(a ≥ b) ∧ (b ≥ a) ⇒ (a = b)].

O.2. (∀a ∈ R)(∀b ∈ R)(∀c ∈ R)[(a > b) ∧ (b > c) ⇒ (a > c)].

O.3. (∀a ∈ R)(∀b ∈ R)(∀c ∈ R)[(a > b) ⇒ (a + c > b + c)].

O.4. (∀a ∈ R)(∀b ∈ R)(∀c ∈ R)[(a > b) ∧ (c > 0) ⇒ (ac > bc)].

Observe that
(∀a ∈ R)(∀b ∈ R){[a > b] ⇔ [a − b > 0]}.
This follows from (O.3).
Completeness axiom states that there are no gaps among the reals. We will give it in the
form of
Dedekind’s axiom: Suppose that the system of all real numbers is divided into two classes
L and R, such that every number l ∈ L is less than every number r ∈ R (and R, L 6= ∅).
Then there is a dividing number ξ with the property that every number less than ξ belongs to L
and every number greater than ξ belongs to R.
The number ξ is either in L or in R. If ξ ∈ R then ξ is the least number in R. If ξ ∈ L
then ξ is the greatest number in L.
Let us express Dedekind’s axiom in the form of a true logical proposition.

[(R = L ∪ R) ∧ (L 6= ∅) ∧ (R 6= ∅) ∧ (∀l ∈ L)(∀r ∈ R)(l < r)]


⇒ (∃ξ ∈ R){(∀x ∈ R){[(x < ξ) ⇒ (x ∈ L)] ∧ [(x > ξ) ⇒ (x ∈ R)]}}.

Consequences of the Dedekind’s axiom:

1) L ∩ R = ∅.

2) (ξ ∈ L) ∨ (ξ ∈ R) is true.

3) (ξ ∈ L) ⇒ (ξ is the greatest number in L).

4) (ξ ∈ R) ⇒ (ξ is the least number in R).

Inequalities

The order axioms express the properties of the order relation (inequalities) on the set of real
numbers. Inequalities play an extremely important role in analysis. Here we discuss several
ideas how to prove inequalities. But before we start doing so we give the definition of the
absolute value of a real number and derive a simple consequence from it.
Definition 2.2.1. Let a ∈ R. The absolute value |a| of a is defined by

|a| = a if a ≥ 0, and |a| = −a if a < 0.

Theorem 2.2.2.
(∀a ∈ R)(∀b ∈ R) [ |a + b| ≤ |a| + |b| ] .


The proof is left as an exercise. Hint: Consider four cases:

1) a ≥ 0, b ≥ 0;
2) a ≥ 0, b < 0;
3) a < 0, b ≥ 0;
4) a < 0, b < 0.

1. First idea of proving inequalities is very simple and is based on the equivalence

(∀x ∈ R)(∀y ∈ R)[(x ≥ y) ⇔ (x − y ≥ 0)].

Example 2.2.3. Prove that

(∀a ∈ R)(∀b ∈ R) [a² + b² ≥ 2ab].

Proof. It is equivalent to show that a² + b² − 2ab ≥ 0. Indeed,

a² + b² − 2ab = (a − b)² ≥ 0.

The equality holds iff a = b.


Example 2.2.4. Prove that
(∀a ∈ R+)(∀b ∈ R+) [ (a + b)/2 ≥ √(ab) ].

(Reminder: R+ = {x ∈ R | x ≥ 0}.)

Proof. As above, let us prove that the difference between the left-hand side (LHS) and the
right-hand side (RHS) is non-negative.
(a + b)/2 − √(ab) = (√a − √b)²/2 ≥ 0.
The equality holds iff a = b.

2. The second idea is to use already proved inequalities and axioms to derive new ones.
Example 2.2.5. Prove that

(∀a ∈ R)(∀b ∈ R)(∀c ∈ R) [a² + b² + c² ≥ ab + ac + bc].

Proof. Adding the three inequalities from Example 2.2.3

a² + b² ≥ 2ab,
a² + c² ≥ 2ac,
b² + c² ≥ 2bc,

we obtain 2(a² + b² + c²) ≥ 2ab + 2ac + 2bc, which proves the desired inequality. The equality holds iff
a = b = c.

33 January 28, 2006


2.2. THE FIELD OF REAL NUMBERS

Example 2.2.6. Prove that


(∀a ∈ R+)(∀b ∈ R+)(∀c ∈ R+)(∀d ∈ R+) [ (a + b + c + d)/4 ≥ (abcd)^{1/4} ].

Proof. By Example 2.2.4 and by (O.2) we have


(a + b + c + d)/4 ≥ ( 2√(ab) + 2√(cd) )/4 ≥ (abcd)^{1/4}.
The equality holds iff a = b = c = d.

3. The third idea is to use the transitivity property

(∀a ∈ R+ )(∀b ∈ R+ )(∀c ∈ R+ )[(a ≥ b) ∧ (b ≥ c) ⇒ (a ≥ c)].

In general this means that when proving that a ≥ c we have to find b such that a ≥ b and
b ≥ c.
Example 2.2.7. Let n ≥ 2 be a natural number. Prove that
1/(n + 1) + 1/(n + 2) + · · · + 1/(2n) > 1/2.
Proof.
1/(n + 1) + 1/(n + 2) + · · · + 1/(2n) > 1/(2n) + 1/(2n) + · · · + 1/(2n)  (n terms)  = n · 1/(2n) = 1/2.

4. Inequalities involving integers may be proved by induction, as we discussed before. Here


we give an inequality which is of particular importance in analysis.
Theorem 2.2.8. (Bernoulli’s inequality)

(∀n ∈ N)(∀α > −1) [ (1 + α)^n ≥ 1 + nα ].

Proof. For n = 0, 1 it is true.


Suppose that it is true for n = k (k ≥ 1), i.e. (1 + α)^k ≥ 1 + kα. We have to prove that it is
true for n = k + 1, i.e.
(1 + α)^{k+1} ≥ 1 + (k + 1)α.
Indeed, since 1 + α > 0,

(1 + α)^{k+1} = (1 + α)^k (1 + α) ≥ (1 + kα)(1 + α)

= 1 + (k + 1)α + kα² ≥ 1 + (k + 1)α.
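As a quick numerical sanity check (ours, illustrative only and of course not a proof), one can verify the inequality for a few sampled exponents n and values α > −1 with the following Python sketch.

# Illustrative numerical check of Bernoulli's inequality (not a proof).
# For several exponents n and values alpha > -1 we confirm (1 + alpha)**n >= 1 + n*alpha.
for n in range(0, 10):
    for alpha in [-0.9, -0.5, 0.0, 0.3, 2.0, 10.0]:
        assert (1 + alpha) ** n >= 1 + n * alpha - 1e-12  # small tolerance for rounding
print("Bernoulli's inequality holds in all sampled cases")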

34 January 28, 2006


CHAPTER 2. NUMBERS

2.3 Bounded sets of numbers


The simplest examples of bounded sets of numbers are finite sets of numbers. For a finite set
of real numbers we can always find the maximal and the minimal number in the set.
In general we can define these numbers for any set of real numbers by the following
Definition 2.3.1. Let S ⊂ R. Then
(x = max S) ⇔ {(x ∈ S) ∧ [(∀y ∈ S)(y ≤ x)]},
(x = min S) ⇔ {(x ∈ S) ∧ [(∀y ∈ S)(y ≥ x)]}.
Note that the definition says nothing about the existence of max S and min S. And indeed
they do not always exist.
Definition 2.3.2. (i) A set S ⊂ R is called bounded above if
(∃K ∈ R)(∀x ∈ S)(x ≤ K).
The number K is called an upper bound of S.
(ii) A set S ⊂ R is called bounded below if
(∃k ∈ R)(∀x ∈ S)(x ≥ k).
The number k is called a lower bound of S.
(iii) A set S ⊂ R is called bounded if it is bounded above and below.

Note that if S is bounded above with upper bound K then any real number greater than
K is also an upper bound of S. Similarly, if S is bounded below with lower bound k then any
real number smaller than k is also a lower bound of S.

Example 2.3.1. Let a, b ∈ R and a < b.


(i) A = {x ∈ R | a ≤ x ≤ b} = [a, b] - closed interval. It is bounded above since (∀x ∈
A)(x ≤ b + 1) and bounded below since (∀x ∈ A)(x ≥ a − 1).
Therefore max A = b, min A = a.
(ii) B = {x ∈ R | a < x < b} = (a, b) - open interval. It is bounded above and below for the
same reasons as A.
However, B has no maximal or minimal number: for any x ∈ B the numbers (x + b)/2 and (a + x)/2 also lie in B and are, respectively, larger and smaller than x.
Note that if A ⊂ R is a bounded set and B ⊂ A then B is also bounded. Prove it!
Example 2.3.2. A = { n/(n + 1) | n ∈ N }.

A is bounded. Indeed, (∀n ∈ N) ( n/(n + 1) > 0 ), so A is bounded below. Moreover,

(∀n ∈ N) ( n/(n + 1) < 1 )

(Why? Prove this inequality) so A is bounded above, and therefore is bounded.
Therefore min A = 1/2, max A does not exist.

35 January 28, 2006


2.4. SUPREMUM AND INFIMUM

2.4 Supremum and infimum


If a set S ⊂ R is bounded above one tries to find the sharp, or in other words, the least upper
bound (existence of which is non-trivial and will be proved below). The least upper bound of
a set is called supremum of this set. Let us give it a precise definition.

Definition 2.4.1. Let S ⊂ R be bounded above. Then a ∈ R is called supremum of S (the


least upper bound of S) if

(i) a is an upper bound of S, i.e.


(∀x ∈ S)(x ≤ a);

(ii) there is no upper bound of S less than a, i.e.

(∀ε > 0)(∃b ∈ S)(b > a − ε).

Analogously one defines the greatest lower bound for a set bounded below.

Definition 2.4.2. Let S ⊂ R be bounded below. Then a ∈ R is called infimum of S (the


greatest lower bound of S) if

(i) a is a lower bound of S, i.e.


(∀x ∈ S)(x ≥ a);

(ii) there is no lower bound of S greater than a, i.e.

(∀ε > 0)(∃b ∈ S)(b < a + ε).

The next theorem is one of the corner-stones of analysis.

Theorem 2.4.1. Let S ⊂ R be non-empty and bounded above. Then sup S exists.

Proof. Divide R into two classes L and R as follows.

(x ∈ L) ⇔ (∃s ∈ S)(s > x),


(x ∈ R) ⇔ (∀s ∈ S)(s ≤ x).

Then

1. L ∩ R = ∅, and L 6= ∅, R 6= ∅.
Indeed, since S 6= ∅, it follows that ∃s ∈ S. Then x = s − 1 ∈ L. Let K be an upper
bound of S. Then K ∈ R.

2.
(∀l ∈ L)(∀r ∈ R) (l < r).
Indeed, let l ∈ L and r ∈ R. Then by the definition of L there exists s ∈ S such that
s > l. By the definition of R we have that r ≥ s, hence l < r.

36 January 28, 2006


CHAPTER 2. NUMBERS

From this it follows that the classes L and R satisfy Dedekind’s axiom. Therefore

(∃ξ ∈ R)(∀ε > 0) [(ξ − ε ∈ L) ∧ (ξ + ε ∈ R)].

From Dedekind’s axiom one cannot conclude whether ξ ∈ L or ξ ∈ R.


We claim that in our circumstances ξ ∈ R.
Let us prove this claim by contradiction.
For a contradiction suppose that ξ ∈ L. Then by the definition of L there exists s ∈ S such
that s > ξ. Fix this s.
Take η = (s + ξ)/2. Then ξ < η < s. Hence η ∈ R since η > ξ. Therefore by the definition of
R we conclude that η ≥ s which is a contradiction. This proves that ξ ∈ R.
So we obtain that ξ satisfies

(1) (∀s ∈ S) (s ≤ ξ);

(2) (∀ε > 0)(∃s ∈ S) (s > ξ − ε).

By the definition of the supremum we conclude that ξ = sup S, which proves the theorem.
We use the following notation

−S = {x ∈ R | − x ∈ S}.

Theorem 2.4.2. Let S ⊂ R be non-empty and bounded below. Then inf S exists and is equal
to − sup(−S).

Proof. Since S is bounded below, it follows that ∃k ∈ R such that (∀x ∈ S)(x ≥ k). Hence
(∀x ∈ S)(−x ≤ −k) which means that −S is bounded above. By Theorem 2.4.1 there exists
ξ = sup(−S). We will show that −ξ = inf S. First, we have that (∀x ∈ S)(−x ≤ ξ), so
that x ≥ −ξ. Hence −ξ is a lower bound of S. Next, let η is another lower bound of S, i.e.
(∀x ∈ S)(x ≥ η). Then (∀x ∈ S)(−x ≤ −η). Hence −η is an upper bound of −S, and by the
definition of supremum −η ≥ sup(−S) = ξ. Hence η ≤ −ξ which proves the required.

Archimedian Principle and its consequences.

Theorem 2.4.3. (The Archimedian Principle) Let x > 0 and y ∈ R. Then there exists
n ∈ Z such that y < nx.

Proof. Suppose for a contradiction that

(∀n ∈ Z) (nx ≤ y).

Then the set A := {nx | n ∈ Z} is bounded above and y is an upper bound of it. Then by Theorem
2.4.1 there exists sup A. The number sup A − x < sup A is not an upper bound of A, so there
exists m ∈ Z such that mx > sup A − x, or in other words

(2.4.3) (m + 1)x > sup A.

But m+1 ∈ Z, hence (m+1)x ∈ A, and (2.4.3) contradicts the definition of supremum.
Corollary 2.4.1. N is unbounded.

37 January 28, 2006


2.4. SUPREMUM AND INFIMUM

Corollary 2.4.2.
(∀x > 0)(∀y > 0)(∃n ∈ N) ( y/n < x ).
Proof. By the Archimedian Principle (Theorem 2.4.3)

(∃n ∈ Z) (nx > y).


Since y > 0 it follows that n ∈ N (indeed, nx > y > 0 and x > 0 force n > 0). Therefore y/n < x.
Corollary 2.4.3. (∀y > 0) ( inf { y/n | n ∈ N, n ≠ 0 } = 0 ).
Proof. Since all the elements of the set { y/n | n ∈ N, n ≠ 0 } are positive, we conclude that 0 is a
lower bound of it. By Corollary 2.4.2 no positive number x is a lower bound of it, which proves
the assertion.
Example 2.4.4. inf { 1/n | n ∈ N, n ≠ 0 } = 0.
Example 2.4.5. Let
A := { (n − 1)/(2n) | n ∈ N, n ≠ 0 }.
Then sup A = 1/2 and inf A = 0.

Proof. All the elements of A are positive, except for the first one which is 0. Therefore
min A = 0, and hence inf A = 0.
Now notice that

(∀n ∈ N) ( n ≠ 0 ⇒ (n − 1)/(2n) = 1/2 − 1/(2n) < 1/2 ).

Hence 1/2 is an upper bound of A.
We have to prove that 1/2 is the least upper bound. For this we have to prove that

(∀ε > 0)(∃n ∈ N) ( (n − 1)/(2n) > 1/2 − ε ).

Notice that

( (n − 1)/(2n) > 1/2 − ε ) ⇔ ( 1/2 − 1/(2n) > 1/2 − ε )

⇔ ( 1/(2n) < ε ) ⇔ ( n · (2ε) > 1 ).
By the Archimedian Principle there exists such n ∈ N.
Note that for any set A ⊂ R if max A exists then sup A = max A. Also if min A exists
then inf A = min A.
Theorem 2.4.6. For any interval (a, b) there exists a rational r ∈ (a, b). In other words,

(∀a ∈ R)(∀b ∈ R)[(b > a) ⇒ (∃r ∈ Q)(a < r < b)].

38 January 28, 2006


CHAPTER 2. NUMBERS

Proof. Let h = b − a > 0. Then by the Archimedian Principle


(∃n ∈ N) ( 1/n < h ).

Fix this n. By the Archimedian Principle (∃m, m′ ∈ N) ( −m′/n < a < m/n ). Indeed,

(∃m ∈ N) ( m/n > a )   and   (∃m′ ∈ N) ( m′/n > −a ).

Now the set of integers between −m′ and m is finite. Let m′′ be the least element of that
set such that
m′′/n > a.
Set r = m′′/n. Then we have
a < r = m′′/n = (m′′ − 1)/n + 1/n ≤ a + 1/n < a + h = b.
The next theorem is a multiplicative analogue of the Archimedian Principle.
Theorem 2.4.7. Let x > 1, y > 0. Then
(∃n ∈ N) (x^n > y).

Prove this theorem repeating the argument from the proof of Theorem 2.4.3.
Theorem 2.4.8. (∃!x > 0) (x² = 2).

Proof. Define a set


A := {y > 0 | y² < 2}.
A ≠ ∅ since 1 ∈ A. A is bounded above since (∀y ∈ A) (y < 2). Then by Theorem 2.4.1
there exists sup A = x. We will prove that x² = 2.
1) First let us suppose that x² > 2. Let ε = (x² − 2)/(2x). Then ε > 0 and

(x − ε)² = x² − 2xε + ε² > x² − 2xε = x² − 2x · (x² − 2)/(2x) = 2.

Hence x − ε is another upper bound for A (indeed, for y ∈ A we have y² < 2 < (x − ε)², hence y < x − ε),
so that x is not the least upper bound for A. This is a contradiction.
2) Now suppose that x² < 2. Let ε = (2 − x²)/(2x + 1). Then by assumption 0 < ε < 1, so that

(x + ε)² = x² + 2xε + ε² < x² + 2xε + ε

= x² + ε(2x + 1) = x² + ( (2 − x²)/(2x + 1) )(2x + 1) = 2.

Hence x + ε is also in A, in which case x cannot be an upper bound for A, which is a
contradiction.
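The proof above is non-constructive, but the supremum can be approximated numerically. The following Python sketch (ours, purely illustrative) uses bisection to locate the number x with x² = 2, mirroring the set A from the proof.

# Illustrative bisection sketch: approximate the x > 0 with x**2 == 2.
# Invariant: lo**2 < 2 <= hi**2, mirroring the set A = {y > 0 | y**2 < 2}.
lo, hi = 1.0, 2.0
for _ in range(50):
    mid = (lo + hi) / 2
    if mid * mid < 2:
        lo = mid          # mid belongs to A, so the supremum lies to the right
    else:
        hi = mid          # mid is an upper bound of A
print(lo)                  # approximately 1.41421356...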

In the same way one can prove the following theorem.


Theorem 2.4.9. (∀a > 0)(∀n ∈ N) [ n ≠ 0 ⇒ (∃!x > 0) (x^n = a) ].

39 January 28, 2006


Chapter 3

Sequences and Limits

3.1 Sequences
In general any function f : N − {0} → X with domain N+ and arbitrary codomain X is
called a sequence. This means that a sequence (an)n∈N+ is a list of elements of X, each of
which is labelled by its index, i.e. an = f(n). (We do not call this a series - this term is used for
another concept.)
In this course we confine ourselves to sequences of numbers. In particular, in this chapter
we always assume that X = R. So we adopt the following definition.

Definition 3.1.1. A sequence of real numbers is a function f : N+ → R. We use the


following notation for the general term an = f (n) and for the whole sequence (an )n∈N+ .

Example 3.1.1. 1. an = 1/n. The sequence is 1, 1/2, 1/3, 1/4, . . . .

2. an = n². The sequence is 1, 4, 9, 16, 25, . . . .


3. an = (−1)^n/n. The sequence is −1, 1/2, −1/3, 1/4, . . . .

3.2 Null sequences


In example 1 of the previous section the nth member of the sequence becomes smaller as n
becomes larger. Making computations we see that an “tends” to zero. The same applies to
example 3. Such sequences are called null sequences.

We will give the precise mathematical definition to the notion of a null sequence.
Definition 3.2.1. (an )n∈N+ is a null sequence if

(∀ε > 0)(∃N ∈ N+ )(∀n ∈ N+ ) [(n ≥ N ) ⇒ (|an | < ε)].

Now we are ready to verify rigorously this definition for some examples.

Example 3.2.1. (an)n∈N+ with an = 1/n is a null sequence.

40
CHAPTER 3. SEQUENCES AND LIMITS

Proof. We have to prove that


(∀ε > 0)(∃N ∈ N+)(∀n ∈ N+) [ (n > N) ⇒ ( 1/n < ε ) ].

In other words we need n to satisfy the inequality n > 1/ε. By AP (the Archimedian Principle)
(∃N ∈ N+) (N > 1/ε). (In particular one can take N = [1/ε] + 1.) Then [(n > N) ∧ (N > 1/ε)] ⇒ (n > 1/ε),
which proves the statement.
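To illustrate this purely numerically (our sketch, not a proof), the code below exhibits, for a given ε, an index N beyond which 1/n < ε; the function name is of course our own choice.

import math

# Illustrative sketch: for a given eps, exhibit an N such that 1/n < eps for all n > N.
def witness_N(eps: float) -> int:
    # Following the proof, N = floor(1/eps) + 1 works.
    return math.floor(1 / eps) + 1

for eps in [0.1, 0.01, 0.001]:
    N = witness_N(eps)
    assert all(1 / n < eps for n in range(N + 1, N + 1000))
    print(eps, "->", N)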

The next theorem follows straight from the definition of a null sequence and the simple
fact that (∀a ∈ R) (|a| = ||a||).
Theorem 3.2.2. A sequence (an )n∈N+ is a null sequence if and only if the sequence (|an |)n∈N+
is a null sequence.
(−1)n
Example 3.2.3. (an )n∈N+ with an = is a null sequence.
n

Definition 3.2.2. Let x ∈ R and ε > 0. The set

B(x, ε) := (x − ε, x + ε) = {y ∈ R | |y − x| < ε}

is called the ε-neighbourhood of the point x.

Using this notion one can say that a sequence (an)n∈N+ is a null sequence if and only if
for any ε > 0 starting from some number all the elements of the sequence belong to the
ε-neighbourhood of zero.

3.3 Sequence converging to a limit


A null sequence is one whose terms approach zero. This is a particular case of a sequence
tending to a limit.
Example 3.3.1. Let an = n/(n + 1). Then an tends to 1 as n → ∞.

Definition 3.3.1. A sequence (an )n∈N+ converges to the limit a ∈ R if

(∀ε > 0)(∃N ∈ N+ )(∀n ∈ N+ ) [(n > N ) ⇒ (|an − a| < ε)].

We write in this case that


lim an = a
n→∞
or abbreviating:
lim an = a
n
or, again
an → a as n → ∞.

41 January 28, 2006


3.3. SEQUENCE CONVERGING TO A LIMIT

Remark 4. 1. From the above definition it is clear that


³ ´
lim an = a ⇔ ((an − a)n is a null sequence).
n→∞

2. limn→∞ an = a if and only if for any ε > 0 starting from some number all the elements
of the sequence belong to the ε-neighbourhood of a.

3. Note that in Definition 3.3.1 N depends on ε.

Let us prove now the statement of Example 3.3.1.

Proof. Let ε > 0 be given. We have to find N ∈ N+ such that for n > N |an − a| < ε. The
last inequality for our example reads as follows
| n/(n + 1) − 1 | < ε.

An equivalent form of this inequality is

( 1/(n + 1) < ε ) ⇔ ( n > 1/ε − 1 ).

Choose N to be any integer greater than or equal to 1/ε − 1 (which exists by AP). Then, if n > N,
we have
n > N ≥ 1/ε − 1.
Therefore
1/(n + 1) < ε, i.e. | n/(n + 1) − 1 | < ε.

If a sequence (an )n does not converge, we say that it diverges.

Definition 3.3.2. 1. The sequence (an )n∈N+ diverges to ∞ if

(∀M ∈ R)(∃N ∈ N+ )(∀n ∈ N+ ) [(n > N ) ⇒ (an > M )].

2. The sequence (an )n∈N+ diverges to −∞ if

(∀M ∈ R)(∃N ∈ N+ )(∀n ∈ N+ ) [(n > N ) ⇒ (an < M )].

Example 3.3.2. 1. The sequence an = n² diverges to ∞.

2. The sequence an = −n² diverges to −∞.

3. The sequence an = (−1)^n diverges.

Theorem 3.3.3. A sequence (an )n∈N+ can have at most one limit.

42 January 28, 2006


CHAPTER 3. SEQUENCES AND LIMITS

Proof. We have to prove that the following implication is true for all a, b ∈ R
h³ ´ ³ ´i
lim an = a ∧ lim an = b ⇒ (a = b).
n→∞ n→∞

By the definition

(∀ε > 0)(∃N1 ∈ N+)(∀n ∈ N+) [(n > N1) ⇒ (|an − a| < ε)],

(∀ε > 0)(∃N2 ∈ N+)(∀n ∈ N+) [(n > N2) ⇒ (|an − b| < ε)].

Suppose for a contradiction that the statement is not true. Then its negation
h³ ´ ³ ´i
lim an = a ∧ lim an = b ∧ (a 6= b)
n→∞ n→∞

is true.
Fix ε = (1/3)|a − b| > 0 and take N = max{N1, N2}. Then, if n > N, we have

|a − b| = |(a − an) + (an − b)| ≤ |an − a| + |an − b| < 2ε = (2/3)|a − b|,
which is a contradiction.
Theorem 3.3.4. Any convergent sequence is bounded.

Proof. We have to prove that the following implication is true


h³ ´i
lim an = a ⇒ [(∃m, M ∈ R)(∀n ∈ N+ ) (m ≤ an ≤ M )].
n→∞

Take ε = 1 in the definition of the limit. Then ∃N ∈ N+ such that for n > N we have

(|an − a| < 1) ⇔ (a − 1 < an < a + 1).

Set M = max{a + 1, a1 , . . . , aN } and m = min{a − 1, a1 , . . . , aN }. Then

(∀n ∈ N+ ) (m ≤ an ≤ M ).

Note that Theorem 3.3.4 can be expressed by the contrapositive law in the following way
• If a sequence is unbounded then it diverges.

So boundedness of a sequence is a necessary condition for its convergence!

Theorem 3.3.5. If lim_{n→∞} an = a ≠ 0 then

(∃N ∈ N+)(∀n ∈ N+) [ (n > N) ⇒ ( |an| > |a|/2 ) ].

Moreover, if a > 0 then for the above n we have an > a/2, and if a < 0 then for the above n we have an < a/2.

43 January 28, 2006


3.3. SEQUENCE CONVERGING TO A LIMIT

Proof. Fix ε = |a|/2. Then ∃N ∈ N+ such that for n > N

|a|/2 > |a − an| ≥ |a| − |an|,

which implies |an| > |a| − |a|/2 = |a|/2,
and the first assertion is proved. On the other hand

( |a|/2 > |a − an| ) ⇔ ( a − |a|/2 < an < a + |a|/2 ).

Thus if a > 0 then for n > N
an > a − |a|/2 = a/2.
Also if a < 0 then for n > N
an < a + |a|/2 = a/2.

Theorem 3.3.6. Let a, b ∈ R, (an )n , (bn )n be real sequences. Then


n³ ´ ³ ´ £ ¤o
lim an = a ∧ lim bn = b ∧ (∀n ∈ N+ ) (an ≤ bn ) ⇒ (a ≤ b).
n→∞ n→∞

Proof. For a contradiction assume that b < a. Fix ε < (a − b)/2, so that b + ε < a − ε, and choose
N1 and N2 such that the following is true

[(∀n > N1)(an > a − ε)] ∧ [(∀n > N2)(bn < b + ε)].

If n > max{N1, N2} then
bn < b + ε < a − ε < an,
which contradicts the condition (∀n ∈ N+) (an ≤ bn).

Theorem 3.3.7. Sandwich rule. Let a ∈ R and (an )n , (bn )n , (cn )n be real sequences. Then

{[(∀n ∈ N+ ) (an ≤ bn ≤ cn )] ∧ ( lim an = lim cn = a)} ⇒ ( lim bn = a).


n→∞ n→∞ n→∞

Proof. Let ε > 0. Then one can find N1 , N2 ∈ N+ such that

[(n > N1 ) ⇒ (a − ε < an )] ∧ [(n > N2 ) ⇒ (cn < a + ε)].

Then choosing N = max{N1 , N2 } we have

(a − ε < an ≤ bn ≤ cn < a + ε) ⇒ (|bn − a| < ε) .

The next theorem is a useful tool in computing limits.


Theorem 3.3.8. Let lim an = a and lim bn = b. Then
n→∞ n→∞

(i) lim (an + bn ) = a + b;


n→∞

44 January 28, 2006


CHAPTER 3. SEQUENCES AND LIMITS

(ii) lim (an bn ) = ab.


n→∞

(iii) If in addition b ≠ 0 and (∀n ∈ N+) (bn ≠ 0) then
lim_{n→∞} (an/bn) = a/b.

Proof. (i) Let ε > 0. Choose N1 such that for all n > N1 |an − a| < ε/2. Choose N2 such
that for all n > N2 |bn − b| < ε/2. Then for all n > max{N1 , N2 }

|an + bn − (a + b)| ≤ |an − a| + |bn − b| < ε.

(ii) Since (bn )n is convergent, it is bounded by Theorem 3.3.4. Let K be such that
(∀n ∈ N+ ) (|bn | ≤ K) and |a| ≤ K. We have

|an bn − ab| = |an bn − a bn + a bn − ab|

≤ |an − a||bn| + |a||bn − b| ≤ K|an − a| + K|bn − b|.

Choose N1 such that for n > N1 K|an − a| ≤ ε/2.


Choose N2 such that for n > N2 K|bn − b| ≤ ε/2.
Then for n > N = max{N1 , N2 }
|an bn − ab| < ε.

(iii) We have
| an/bn − a/b | = | (an b − a bn)/(bn b) | ≤ ( |an − a||b| + |a||bn − b| ) / |bn b|.

Choose N1 such that for n > N1

|bn| > (1/2)|b|  (possible by Theorem 3.3.5).

Choose N2 such that for n > N2

|an − a| < (1/4)|b|ε.

Choose N3 such that for n > N3

|a||bn − b| < (1/4)b²ε.

Then for n > N := max{N1, N2, N3}

| an/bn − a/b | < ε.

45 January 28, 2006


3.4. MONOTONE SEQUENCES

3.4 Monotone sequences


In this section we consider a particular class of sequences which often occur in applications.
Definition 3.4.1. (i) A sequence of real numbers (an )n∈N+ is called increasing if

(∀n ∈ N+ ) (an ≤ an+1 ).

(ii) A sequence of real numbers (an )n∈N+ is called decreasing if

(∀n ∈ N+ ) (an ≥ an+1 ).

(iii) A sequence of real numbers (an )n∈N+ is called monotone if it is either increasing or
decreasing.

Example 3.4.1. (i) an = n² is increasing.


(ii) an = 1/n is decreasing.

(iii) an = (−1)^n/n is not monotone.

The next theorem is one of the main theorems in the theory of converging sequences.
Theorem 3.4.2. If a sequence of real numbers is bounded above and increasing then it is
convergent.

Proof. Let (an )n∈N+ be a bounded above increasing sequence of real numbers. Since the set
{an | n ∈ N+ } is bounded above, there exists supremum

a := sup{an }.
n

By the definition of supremum

(∀ε > 0)(∃N ∈ N+ ) (aN > a − ε).

Fix this N. Since (an)n is increasing, (∀n > N) (an ≥ aN). Therefore

(∀n ∈ N+) [(n > N) ⇒ (an > a − ε)].

Also by the definition of supremum

(∀ε > 0)(∀n ∈ N+ ) (an < a + ε).

Therefore for n > N

(a − ε < an < a + ε) ⇔ (|an − a| < ε).

In complete analogy to the previous theorem one proves the following one.

46 January 28, 2006


CHAPTER 3. SEQUENCES AND LIMITS

Theorem 3.4.3. If a sequence of real numbers is bounded below and decreasing then it is
convergent.

In many cases before computing the limit of a sequence one has to prove that the limit
exists, i.e. the sequence converges. In particular, this is the case when a sequence is defined
by a recurrent formula, i.e. the formula for the nth member of the sequence via previous
members with numbers n − 1 and less.

Let us illustrate this point by an example.

Example 3.4.4. Let (an )n be a sequence defined by


(3.4.1)   an = √(an−1 + 2),   a1 = √2.

1. (an )n is bounded.
Indeed, we prove that
(∀n ∈ N+ ) (0 < an ≤ 2).
Proof. Positivity is obvious.
For n = 1 we have a1 = √2 ≤ 2, and the statement is true.
Suppose that it is true for n = k (k ≥ 1), i.e. ak ≤ 2.
For n = k + 1 we have
ak+1 = √(ak + 2) ≤ √(2 + 2) = 2,
and by the principle of induction the statement is proved.

2. (an )n is increasing.
We have to prove that
(∀n ∈ N+ ) (an+1 ≥ an ).
Proof. It is equivalent to prove that

[(∀n ∈ N+) ( √(an + 2) ≥ an )] ⇔ [(∀n ∈ N+) ( an + 2 ≥ an² )].

Indeed,
an + 2 − an² = (2 − an)(an + 1) ≥ 0,
since 0 < an ≤ 2 by part 1.

By Theorem 3.4.2 we conclude that (an) is convergent. Let limn an = x. Then, of course,
limn an−1 = x. Passing to the limit in (3.4.1) we obtain that

x = √(x + 2),

from which we easily find that x = 2 (the equation x² = x + 2 has the roots 2 and −1, and the negative root is excluded since an > 0).
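The convergence can also be observed numerically; the short Python sketch below (ours, illustrative only) iterates the recurrence (3.4.1) and shows the terms approaching 2.

import math

# Illustrative sketch: iterate a_n = sqrt(a_{n-1} + 2) starting from a_1 = sqrt(2).
a = math.sqrt(2)
for n in range(2, 12):
    a = math.sqrt(a + 2)
    print(n, a)   # the terms increase and approach the limit 2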

Now we are ready to prove a very important property of real numbers.


Theorem 3.4.5. Let ([an , bn ])n∈N+ be a sequence of closed intervals on the real axis such
that:

(i) (∀n ∈ N+) ([an+1, bn+1] ⊂ [an, bn]);

47 January 28, 2006


3.5. CAUCHY SEQUENCES

(ii) lim (bn − an ) = 0


n→∞
(i.e. the length of the intervals tends to zero).

Then there is a unique point which belongs to all the intervals.

Proof. The sequence (an )n is increasing and bounded above, and the sequence (bn )n is de-
creasing and bounded below. Therefore (an )n , (bn )n are convergent.
Let a := limn an ; b := limn bn .
Then
a≤b
by Theorem 3.3.6. Moreover,

b − a = lim(bn − an ) = 0.
n

Hence \
a=b∈ [an , bn ].
n∈N+

3.5 Cauchy sequences


In this short section we are going to derive the intrinsic criterion for convergence of sequences
(i.e. the criterion which does not use the value of the limit and is expressed in terms of a
condition on the members of the sequence).

First we introduce an important notion of a Cauchy sequence.

Definition 3.5.1. A sequence of real numbers (an )n∈N+ is called a Cauchy sequence if

(∀ε > 0)(∃N ∈ N) { [(n > N) ∧ (m > N)] ⇒ (|an − am| < ε) }.

Proposition 3.5.1. If a sequence (an )n∈N+ is convergent then it is a Cauchy sequence.

Proof. Let a := limn an . Let ε > 0. Then there exists N ∈ N such that for all n > N

|an − a| < ε/2.

Let m, n > N . Then by the triangle inequality

|an − am | ≤ |an − a| + |am − a| < ε.

It turns out that the converse of Proposition 3.5.1 is also true. This is expressed in the
next theorem that delivers the Cauchy criterion for convergence of sequences.

Theorem 3.5.1. If (an )n∈N+ is a Cauchy sequence of real numbers then it is convergent.

48 January 28, 2006


CHAPTER 3. SEQUENCES AND LIMITS

Proof. Let (an )n∈N+ be a Cauchy sequence. Take ε = 1. There exists N ∈ N such that for all n, m > N

|an − am | < 1.

Fix m > N . Then


|an | − |am | ≤ |an − am | < 1.
Therefore
(n > N ) ⇒ (|an | < |am | + 1).
It follows that (an )n∈N+ is bounded (Why?)
For any n ∈ N define
αn := inf ak , βn := sup ak .
k>n k>n

Then we have
(∀n ∈ N) ([αn+1 , βn+1 ] ⊂ [αn , βn ]).
(Justify the above)
The length of the interval [αn, βn] converges to zero as n → ∞ since from the Cauchy condition it
follows that for any ε > 0 there exists N ∈ N+ such that for all k > N

aN − ε < ak < aN + ε,

from which we conclude that


aN − ε ≤ αN ≤ βN ≤ aN + ε,
or else
(∀n ∈ N+ ) [(n > N ) ⇒ (2ε ≥ βN − αN ≥ βn − αn )].
By Theorem 3.4.5 there exists a ∈ ∩_{n∈N+} [αn, βn].
Fix ε > 0. Find N such that |βN − αN| < ε. Clearly (∀k > N) (ak ∈ [αN, βN]). It follows that

(∀k > N ) (|ak − a| < ε).

Therefore lim an = a.
n→∞

49 January 28, 2006


3.6. SERIES

3.6 Series
In this section we discuss infinite series. Let us start from an example which is likely to be
familiar from high school. How to make sense of the infinite sum
1/2 + 1/4 + 1/8 + 1/16 + . . .

This is the known sum of a geometric progression. One proceeds as follows. Define the sum
of the first n terms

sn = 1/2 + 1/4 + 1/8 + · · · + 1/2^n = 1 − 1/2^n

and then define the infinite sum s as limn→∞ sn, so that s = 1.
This idea is used to define formally an arbitrary series.
Definition 3.6.1. Let (an )n∈N+ be a sequence of real (complex) numbers. Let
sn = a1 + a2 + · · · + an = ∑_{k=1}^{n} ak.

We say that the series



∑_{k=1}^{∞} ak = a1 + a2 + . . .

is convergent if the sequence of the partial sums (sn )n∈N+ is convergent. The limit of this
sequence is called the sum of the series.

If the series is not convergent we say that it is divergent (or diverges).


Theorem 3.6.1. If the series

∑_{n=1}^{∞} an

is convergent then lim_{n→∞} an = 0.
Proof. Let sn = ∑_{k=1}^{n} ak. Then by the definition the limit limn sn exists. Denote it by s.
Then of course limn sn−1 = s. Note that an = sn − sn−1 for n ≥ 2. Hence

lim an = lim sn − lim sn−1 = s − s = 0.


n→∞ n→∞ n→∞

The same idea can be used to prove the following theorem.


Theorem 3.6.2. If the series

∑_{n=1}^{∞} an

is convergent then

s2n − sn = an+1 + an+2 + · · · + a2n → 0 as n → ∞.

50 January 28, 2006


CHAPTER 3. SEQUENCES AND LIMITS

The above theorem expresses the simplest necessary condition for the convergence of a
series.
For example, each of the following series is divergent

1 + 1 + 1 + 1 + ...,

1 − 1 + 1 − 1 + ...
since the necessary condition is not satisfied.

Let us take a look at several important examples.


Example 3.6.3. The series

∑_{k=0}^{∞} x^k = 1 + x + x² + . . .
is convergent if and only if −1 < x < 1.
Indeed,
sn = 1/(1 − x) − x^n/(1 − x)  if x ≠ 1,   and   sn = n  if x = 1.
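As a numerical illustration (our sketch, not from the notes), the Python code below compares the partial sums of the geometric series with the value 1/(1 − x) for a sample |x| < 1.

# Illustrative sketch: partial sums of the geometric series sum of x**k for |x| < 1.
def partial_sum(x: float, n: int) -> float:
    return sum(x ** k for k in range(n))   # s_n = 1 + x + ... + x**(n-1)

x = 0.5
for n in [5, 10, 20, 40]:
    print(n, partial_sum(x, n), 1 / (1 - x))   # partial sums approach 1/(1-x) = 2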

Example 3.6.4. Harmonic series The series


1 + 1/2 + 1/3 + 1/4 + · · · + 1/n + . . .

diverges.
Indeed,

s_{2n} − s_n = 1/(n + 1) + 1/(n + 2) + · · · + 1/(2n) > 1/2.

Example 3.6.5. Generalised harmonic series The series


1 + 1/2^p + 1/3^p + 1/4^p + · · · + 1/n^p + . . .

converges if p > 1 and diverges if p ≤ 1.

3.6.1 Series of positive terms


In this subsection we confine ourselves to considering series whose terms are non-negative
numbers, i.e.

∑_{k=1}^{∞} ak  with ak ≥ 0 for every k ∈ N+.

The main feature of this case is that the sequence of the partial sums sn = ∑_{k=1}^{n} ak is
increasing.
Indeed, sn − sn−1 = an ≥ 0.
This allows us to formulate a simple criterion of convergence for such a series.

51 January 28, 2006


3.6. SERIES

Theorem 3.6.6. A series of positive terms is convergent if and only if the sequence of partial
sums is bounded above.

Proof. Indeed, if a series is convergent then the sequence of partial sums is convergent by the
definition, and therefore it is bounded.
Conversely, if the sequence of partial sums is bounded above then it is convergent by
Theorem 3.4.2, and therefore the series is convergent.

Comparison tests

Here we establish tests which make it possible to infer the convergence or divergence of a series
by comparing it with another series whose convergence or divergence is known.

Theorem 3.6.7. Let ∑_{k=1}^{∞} ak, ∑_{k=1}^{∞} bk be two series of positive terms. Assume that

(3.6.2) (∀n ∈ N+ ) (an ≤ bn ).

Then

(i) If ∑_{k=1}^{∞} bk is convergent then ∑_{k=1}^{∞} ak is convergent.

(ii) If ∑_{k=1}^{∞} ak is divergent then ∑_{k=1}^{∞} bk is divergent.

Proof. Let sn = ∑_{k=1}^{n} ak, s′n = ∑_{k=1}^{n} bk. Then

(∀n ∈ N+) (sn ≤ s′n).

The theorem follows now from Theorem 3.6.6.

Remark 5. Theorem 3.6.7 holds if instead of (3.6.2) one assumes

(∃K > 0)(∀n ∈ N+ ) (an ≤ Kbn ).

Now since for p ≤ 1

(∀n ∈ N+) ( 1/n^p ≥ 1/n ),

we obtain one of the assertions of Example 3.6.5 from the above theorem.

The following observation (which follows immediately from the definition of convergence
of a series) is important for the next theorem.
The convergence or divergence of a series is unaffected if a finite number of terms are
inserted, or suppressed, or altered.

52 January 28, 2006


CHAPTER 3. SEQUENCES AND LIMITS

Theorem 3.6.8. Let (an)n, (bn)n be two sequences of positive numbers. Assume that

lim_{n→∞} an/bn = L > 0  (the limit is positive and finite).

Then the series ∑_{k=1}^{∞} ak is convergent if and only if the series ∑_{k=1}^{∞} bk is convergent.

Proof. By the definition of the limit

(∃N ∈ N)(∀n > N) ( (1/2)L < an/bn < (3/2)L ).

Now the assertion follows from the observation before the theorem and from Theorem 3.6.7
together with Remark 5.


X
2n
Example 3.6.9. (i) The series is divergent since
n2 + 1
n=1
2n ∞
X
n2 +1 1
lim 1 = 2 and the series is divergent.
n→∞
n
n
n=1


X n ∞
X
n 2n3 +2 1 1
(ii) The series 3
is convergent since lim 1 = and the series is
2n + 2 n→∞
n2
2 n2
n=1 n=1
convergent.

Other tests of convergence

Theorem 3.6.10. (Cauchy’s Test or The Root Test) Let (an )n∈N+ be a sequence of positive
numbers. Suppose that

lim_{n→∞} (an)^{1/n} = l.

Then if l < 1, the series ∑_{n=1}^{∞} an converges; if l > 1, the series ∑_{n=1}^{∞} an diverges. If l = 1, no
conclusion can be drawn.

Proof. Suppose that l < 1. Choose r such that l < r < 1. Then

(∃N ∈ N)(∀n ∈ N+) [(n > N) ⇒ ( (an)^{1/n} < r )].

In other words an < r^n for n > N; since finitely many terms do not affect convergence, the convergence
follows from the comparison with the convergent series ∑_{n=1}^{∞} r^n.
Next suppose that l > 1. Then

(∃N ∈ N)(∀n ∈ N+) [(n > N) ⇒ ( (an)^{1/n} > 1 )].

In other words an > 1 for n > N, so (an)n is not a null sequence. Hence the series diverges by Theorem 3.6.1.

53 January 28, 2006


3.6. SERIES

Theorem 3.6.11. (D’Alembert’s Test or The Ratio Test) Let (an )n∈N+ be a sequence of
positive numbers. Suppose that
lim_{n→∞} an+1/an = l.

Then if l < 1, the series ∑_{n=1}^{∞} an converges; if l > 1, the series ∑_{n=1}^{∞} an diverges. If l = 1, no
conclusion can be drawn.

Proof. Suppose that l < 1. Choose r such that l < r < 1. Then

(∃N ∈ N)(∀n ∈ N+) [(n > N) ⇒ ( an+1/an < r )].

Therefore, for n > N,

an = (an/an−1) · (an−1/an−2) · ··· · (aN+2/aN+1) · aN+1 < r^{n−N−1} aN+1 = (aN+1 / r^{N+1}) · r^n.

The convergence follows now from the comparison (in the sense of Remark 5) with the convergent series ∑_{n=1}^{∞} r^n.
Next suppose that l > 1. Then

(∃N ∈ N)(∀n ∈ N+) [(n > N) ⇒ ( an+1/an > 1 )].

In other words an+1 > an for n > N, so (an)n is not a null sequence. Hence the series diverges.

Example 3.6.12. Investigate the convergence of the series

∑_{n=1}^{∞} (2^n + n)/(3^n − n).

Let us use the d’Alembert test. Compute the limit

lim_{n→∞} an+1/an = lim_{n→∞} ( (2^{n+1} + n + 1)/(2^n + n) ) · ( (3^n − n)/(3^{n+1} − n − 1) ) = 2/3.

Therefore the series converges.
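One can corroborate this numerically; the Python sketch below (ours, illustrative only) prints the ratios a_{n+1}/a_n, which approach 2/3, and a partial sum of the series.

# Illustrative sketch for Example 3.6.12: ratios a_{n+1}/a_n approach 2/3 < 1.
def a(n: int) -> float:
    return (2 ** n + n) / (3 ** n - n)

for n in [1, 5, 10, 20]:
    print(n, a(n + 1) / a(n))               # ratios tend to 2/3

print(sum(a(n) for n in range(1, 60)))      # partial sums stabilise, suggesting convergence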

Example 3.6.13. Investigate the convergence of the series


∑_{n=1}^{∞} 2^{(−1)^n} (1/2)^n.

Let us use the Cauchy test. Compute the limit

lim_{n→∞} (an)^{1/n} = lim_{n→∞} 2^{(−1)^n / n} · (1/2) = 1/2.

(We used the fact that lim_{n→∞} 2^{1/n} = 1.) Therefore the series converges.

54 January 28, 2006


Chapter 4

Limits of functions and continuity

4.1 Limits of function


In this chapter we deal only with functions with values in the set of real numbers (real-valued
functions).

We want to define what is meant by

f (x) → b as x → a.

Definition 4.1.1. (Cauchy) Let f : D(f ) → R and let c < a < d be such that (c, a)∪(a, d) ⊂
D(f ). We say that f (x) → b as x → a and write lim f (x) = b if
x→a
(∀ε > 0)(∃δ > 0)(∀x ∈ D(f)) [ (0 < |x − a| < δ) ⇒ ( |f(x) − b| < ε ) ].

Note that
(|x − a| < δ) ⇔ (−δ < x − a < δ) ⇔ (a − δ < x < a + δ) ⇔ ( x ∈ (a − δ, a + δ) ).

The set (a − δ, a + δ) is again called a δ-neighborhood of a.

Example 4.1.1. Let f : R → R be defined by f (x) = x. Then for any a ∈ R

lim x = a.
x→a

Proof. Let a ∈ R be arbitrary. Fix ε > 0.
[We need to show that 0 < |x − a| < δ implies |x − a| < ε. Hence one can choose δ = ε.]
Choose δ = ε. Then

(∀x ∈ R) [ (0 < |x − a| < δ) ⇒ (|x − a| < ε) ].

55
4.1. LIMITS OF FUNCTION

Example 4.1.2. Let f : R → R be defined by f(x) = x². Then for any a ∈ R

lim_{x→a} x² = a².

Proof. Let a ∈ R be arbitrary. Fix ε > 0.
[We need to show that 0 < |x − a| < δ implies |x² − a²| < ε. We can restrict
ourselves to |x − a| < 1. Then |x| ≤ |x − a| + |a| < 1 + |a|,
and |x + a| ≤ |x| + |a| < 1 + 2|a|.
Note that |x² − a²| = |x − a| · |x + a| < (1 + 2|a|) · |x − a|.]
Choose δ := min{1, ε/(1 + 2|a|)}. Then

(∀x ∈ R) [ (0 < |x − a| < δ) ⇒ (|x² − a²| < ε) ].
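A small numerical sanity check of this choice of δ (our sketch, purely illustrative): for sampled points within δ of a, the values of x² indeed stay within ε of a².

import random

# Illustrative check of the delta chosen in Example 4.1.2 (not a proof).
a, eps = 3.0, 1e-3
delta = min(1.0, eps / (1 + 2 * abs(a)))
for _ in range(10000):
    x = a + random.uniform(-delta, delta)
    if x != a:
        assert abs(x * x - a * a) < eps
print("all sampled points satisfy |x^2 - a^2| <", eps)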

There is another possibility to define a limit of a function. It is based upon the definition
of a limit of a sequence.

Definition 4.1.2. (Heine) Let f be such that (c, a) ∪ (a, d) ⊂ D(f ). We say that

f (x) → b as x → a

if for any sequence (xn )n∈N such that


h i
(i) (∀n ∈ N) (xn ∈ (c, d)) ∧ (xn 6= a) ,

(ii) xn → a as n → ∞

we have
lim f (xn ) = b.
n→∞

In order to use any of the two definitions of the limit of a function we have to make sure
that they are equivalent. This is the matter of the next theorem.
Theorem 4.1.3. Definitions 4.1.1 and 4.1.2 are equivalent.

Proof. (i) Definition (4.1.1) implies Definition (4.1.2).


Let lim f (x) = b in the sense of Definition (4.1.1). Let (xn )n∈N be a sequence satisfying
x→a
the conditions of Definition (4.1.2). We have to prove that lim f (xn ) = b.
n→∞
Fix ε > 0.
h ¡ ¢i
(4.1.1) (∃δ > 0)(∀x ∈ R) (0 < |x − a| < δ) ⇒ |f (x) − b| < ε .

Fix now δ found above. Then


h ¡ ¢i
(4.1.2) (∃N ∈ N)(∀n ∈ N) (n > N ) ⇒ |xn − a| < δ .

56 January 28, 2006


CHAPTER 4. LIMITS OF FUNCTIONS AND CONTINUITY

Then by (4.1.1) and (4.1.2)


¡ ¢
(n > N ) ⇒ |f (xn ) − b| < ε

which proves that


lim f (xn ) = b.
n→∞

(ii) Definition (4.1.2) implies Definition (4.1.1).


By contradiction. Suppose that f (x) → b as x → a in the sense of Definition (4.1.2) but not
in the sense of Definition (4.1.1).
This means that
(∃ε > 0)(∀δ > 0)(∃x ∈ R) [ (0 < |x − a| < δ) ∧ ( |f(x) − b| ≥ ε ) ].

Take δ = 1/n. Find xn such that

( 0 < |xn − a| < 1/n ) ∧ ( |f(xn) − b| ≥ ε ).

We have that xn → a as n → ∞. Therefore by Definition (4.1.2) f(xn) → b as n → ∞, which
contradicts the fact that |f(xn) − b| ≥ ε for all n. This is a contradiction.

Definition 4.1.3. We say that A is the limit of the function f : X → R (X ⊂ R) as x → +∞


and write lim f (x) = A if ∃K ∈ R such that (K, ∞) ⊂ D(f ) and
x→+∞
h ¡ ¢i
(∀ε > 0)(∃M > K)(∀x ∈ D(f )) (x > M ) ⇒ |f (x) − A| < ε .

Similarly one can define what it means that

lim f (x) = A.
x→−∞

Example 4.1.4.
lim_{x→∞} 1/x = 0.

Proof. We have to prove that


(∀ε > 0)(∃M > 0)(∀x ∈ D(f)) [ (x > M) ⇒ ( |1/x| < ε ) ].

It is sufficient to choose M such that M ε > 1 which exists by the Archimedian Principle.

Limits of functions exhibit many properties similar to those of limits of sequences. Let us
prove uniqueness of the limit of a function at a point using the Heine definition.
Theorem 4.1.5. If lim f (x) = A and lim f (x) = B then A = B.
x→a x→a

57 January 28, 2006


4.1. LIMITS OF FUNCTION

Proof. Let xn → a as n → ∞. Then f (xn ) → A as n → ∞ and f (xn ) → B as n → ∞. From


the uniqueness of the limit of a sequence it follows that A = B.

Sometimes it is more convenient to use the Cauchy definition.


Theorem 4.1.6. Let lim f (x) = A. Let B > A. Then
x→a
h ¡ ¢i
(∃δ > 0)(∀x ∈ D(f )) (0 < |x − a| < δ) ⇒ f (x) < B .

Proof. Using ε = B − A > 0 in the Cauchy definition of the limit we have


h ¡ ¢i
(∃δ > 0)(∀x ∈ D(f )) (0 < |x − a| < δ) ⇒ |f (x) − A| < B − A

which implies that f (x) − A < B − A, or f (x) < B.

The following two theorems can be proved using the Heine definition of the limit of a
function at a point and the corresponding property of a limit of a sequence. Proofs are left
as exercises.
Theorem 4.1.7. Let lim f (x) = A and lim g(x) = B. Then
x→a x→a
¡ ¢
(i) lim f (x) + g(x) = A + B;
x→a
¡ ¢
(ii) lim f (x) · g(x) = A · B;
x→a

If in addition B 6= 0 then
µ ¶
f (x) A
(iii) lim = .
x→a g(x) B

Theorem 4.1.8. Let lim f (x) = A and lim g(x) = B. Suppose that
h x→a x→a i
(∃δ > 0) {x ∈ R | 0 < |x − a| < δ} ⊂ D(f ) ∩ D(g) and
¡ ¢ £ ¤
0 < |x − a| < δ ⇒ f (x) ≤ g(x) .

Then A ≤ B, i.e.
lim f (x) ≤ lim g(x).
x→a x→a

Using Theorem 4.1.7 one can easily compute limits of some functions.
Example 4.1.9.

lim_{x→2} (x³ + 2x² − 7)/(2x³ − 4) = ( lim_{x→2} x³ + lim_{x→2} 2x² − lim_{x→2} 7 ) / ( lim_{x→2} 2x³ − lim_{x→2} 4 )
= (8 + 8 − 7)/(16 − 4) = 3/4.

58 January 28, 2006


CHAPTER 4. LIMITS OF FUNCTIONS AND CONTINUITY

One-sided limits

Definition 4.1.4. (i) Let f be defined on an interval (a, d) ⊂ R. We say that f (x) →
b as x → a+ and write lim f (x) = b if
x→a+
(∀ε > 0)(∃δ > 0)(∀x ∈ (a, d)) [ (0 < x − a < δ) ⇒ ( |f(x) − b| < ε ) ].

(ii) Let f be defined on an interval (c, a) ⊂ R. We say that f (x) → b as x → a− and write
lim f (x) = b if
x→a−
(∀ε > 0)(∃δ > 0)(∀x ∈ (c, a)) [ (−δ < x − a < 0) ⇒ ( |f(x) − b| < ε ) ].

59 January 28, 2006


4.2. CONTINUOUS FUNCTIONS

4.2 Continuous functions


Definition 4.2.1. Let a ∈ R. Let a function f be defined in a neighborhood of a. Then the
function f is called continuous at a if

lim f (x) = f (a).


x→a

Using the definitions of the limit given in the previous section we can formulate the above
definition in the following way.
Definition 4.2.2. A function f is called continuous at a point a ∈ R if f is defined on an
interval (c, d) containing a and
h ¡ ¢i
(∀ε > 0)(∃δ > 0)(∀x ∈ (c, d)) (|x − a| < δ) ⇒ |f (x) − f (a)| < ε .

Note the difference between this definition and the definition of the limit: the function f
has to be defined at a.

Using the above definition it is easy to formulate what it means that a function f is
discontinuous at a point a:
A function f is discontinuous at a point a ∈ R if
f is not defined on any neighbourhood (c, d) containing a, or if
h ¡ ¢i
(∃ε > 0)(∀δ > 0)(∃x ∈ D(f )) (|x − a| < δ) ∧ |f (x) − f (a)| ≥ ε .

One more equivalent way to define continuity at a point is to use the Heine definition of
the limit.
Definition 4.2.3. A function f is called continuous at a point a ∈ R if f is defined on an
interval (c, d) containing a and for any sequence (xn )n∈N such that
h i
(i) (∀n ∈ N) (xn ∈ (c, d)) ∧ (xn 6= a) ,

(ii) xn → a as n → ∞

we have
lim f (xn ) = f (a).
n→∞

Example 4.2.1. Let c ∈ R. Let f : R → R be defined by f (x) = c for any x ∈ R. Then f is


continuous at any point in R.
Example 4.2.2. Let f : R → R be defined by f (x) = x for any x ∈ R. Then f is continuous
at any point in R.

The following theorem easily follows from the definition of continuity and properties of
limits.

60 January 28, 2006


CHAPTER 4. LIMITS OF FUNCTIONS AND CONTINUITY

Theorem 4.2.3. Let f and g be continuous at a ∈ R. Then

(i) f + g is continuous at a.
(ii) f · g is continuous at a.

Moreover, if g(a) ≠ 0, then
f
(iii) is continuous at a.
g
Based on this theorem and the two examples above we conclude that
f(x) = (a0 x^n + a1 x^{n−1} + · · · + an) / (b0 x^m + b1 x^{m−1} + · · · + bm)
is continuous at every point of its domain of definition.

Theorem 4.2.4. Let g be continuous at a ∈ R, f be continuous at b = g(a) ∈ R. Then f ◦ g


is continuous at a.

Proof. Fix ε > 0. Since f is continuous at b,


h ¡ ¢i
(∃δ > 0)(∀y ∈ D(f )) (|y − b| < δ) ⇒ |f (y) − f (b)| < ε .

Fix this δ > 0. From the continuity of g at a


h ¡ ¢i
(∃γ > 0)(∀x ∈ D(g)) (|x − a| < γ) ⇒ |g(x) − g(a)| < δ .

From the above it follows that


h ¡ ¢i
(∀ε > 0)(∃γ > 0)(∀x ∈ D(g)) (|x − a| < γ) ⇒ |f (g(x)) − f (g(a))| < ε .

This proves continuity of f ◦ g at a.

Another useful characterization of continuity of a function f at a point a is the following.


f is continuous at a point a ∈ R if and only if
lim_{x→a−} f(x) = lim_{x→a+} f(x) = f(a),

which means that the one-sided limits exist, are equal and equal the value of the function at
a.

Theorem 4.2.5. Let f be continuous at a ∈ R. Let f (a) < B. Then there exists a neigh-
bourhood of a such that f (x) < B for all the points x from the neighbourhood.

Proof. Take ε = B − f (a) in Definition 4.2.2. Then


(∃δ > 0)(∀x ∈ (c, d)) [ (|x − a| < δ) ⇒ ( |f(x) − f(a)| < ε ) ],

and for such x we have f(x) < f(a) + ε = B.

Remark 6. A similar fact is true for the case f(a) > B.

61 January 28, 2006


4.2. CONTINUOUS FUNCTIONS

Boundedness of functions

Definition 4.2.4. Let f : D(f ) → R, D(f ) ⊂ R. f is called bounded if Ran(f ) is a bounded


subset of R.
Example 4.2.6. Let f : R → R be defined by f(x) = 1/(1 + x²). Then Ran(f) = (0, 1], so f is
bounded.
Definition 4.2.5. Let A ⊂ D(f ). f is called bounded above on A if
¡ ¢
(∃K ∈ R)(∀x ∈ A) f (x) ≤ K .

Remark 7. Boundedness below and boundedness on a set is defined analogously.

Theorem 4.2.7. If f is continuous at a then there exists δ > 0 such that f is bounded on
the interval (a − δ, a + δ).

Proof. Since lim f (x) = f (a)


x→a
£ ¡ ¢¤
(∃δ > 0)(∀x ∈ D(f )) (|x − a| < δ) ⇒ |f (x) − f (a)| < 1 .

So on the interval (a − δ, a + δ)

f (a) − 1 < f (x) < f (a) + 1.

62 January 28, 2006


CHAPTER 4. LIMITS OF FUNCTIONS AND CONTINUITY

4.3 Continuous functions on a closed interval


In the previous section we dealt with functions which were continuous at a point. Here we
consider functions which are continuous at every point of an interval [a, b]. In this case we
speak of functions continuous on [a, b]. In comparison with the previous section we will be
interested here in global behaviour of such functions.
We first define this notion formally:
Definition 4.3.1. Let [a, b] ⊆ R be a closed interval, f a function with f : D(f ) −→ R, and
with [a, b] ⊆ D(f ). Then we say that f is a continuous function on [a, b] if:
(i) f is continuous at every point of (a, b) and
(ii) limx→a+ f (x) = f (a); limx→b− f (x) = f (b).

Our first theorem exhibits the intermediate-value property of continuous functions.


Theorem 4.3.1. The Intermediate Value Theorem
Let f be a continuous function on a closed interval [a, b] ⊂ R. Suppose that f (a) < f (b).
Then
(∀η ∈ [f(a), f(b)])(∃x ∈ [a, b]) ( f(x) = η ).

Proof. If η = f (a) or η = f (b) then there is nothing to prove. Fix η ∈ (f (a), f (b)). Let us
introduce the set
A := {x ∈ [a, b] | f (x) < η}.

The set A is not empty since a ∈ A. The set A is bounded above (by b). Therefore there
exists ξ := sup A.
Our aim now is to prove that f (ξ) = η. We will do that by ruling out two other possibilities:
f (ξ) < η and f (ξ) > η.

First, let us assume that f (ξ) < η. Then by Theorem 4.2.5 we have that
h¡ ¢ ¡ ¢i
(∃δ > 0)(∀x ∈ D(f )) x ∈ (ξ − δ, ξ + δ) ⇒ f (x) < η .

Therefore h¡ ¢ ¡ ¢i
(∃x1 ∈ R) x1 ∈ (ξ, ξ + δ) ∧ f (x1 ) < η .

In other words h¡ ¢ ¡ ¢i
(∃x1 ∈ R) x1 > ξ ∧ x1 ∈ A .

This contradicts the fact that ξ is an upper bound of A.

Next, let us assume that f (ξ) > η. Then by the remark after Theorem 4.2.5 we have that
£ ¤
(∃δ > 0) D(f ) ∩ (ξ − δ, ξ + δ) ⊂ Ac (Ac is the complement to A).

This contradicts to the fact that ξ = sup A.
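The proof is again non-constructive, but an intermediate value can be located numerically by bisection; the following Python sketch (ours, illustrative only) does this for a sample continuous function.

# Illustrative bisection sketch for the Intermediate Value Theorem.
# For a continuous f with f(a) < eta < f(b), we locate x with f(x) close to eta.
def f(x: float) -> float:
    return x ** 3 + x              # continuous and increasing on [0, 2]

a, b, eta = 0.0, 2.0, 5.0          # f(0) = 0 < 5 < 10 = f(2)
for _ in range(60):
    mid = (a + b) / 2
    if f(mid) < eta:
        a = mid
    else:
        b = mid
print(a, f(a))                      # f(a) is now very close to eta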

The next theorem establishes boundedness of functions continuous on an interval.


Theorem 4.3.2. Let f be continuous on [a, b]. Then f is bounded on [a, b].

63 January 28, 2006


4.3. CONTINUOUS FUNCTIONS ON A CLOSED INTERVAL

Proof. Let us introduce a set


A := { x ∈ [a, b] | f is bounded on [a, x] }.

Note that

1. A 6= ∅ since a ∈ A.
2. A is bounded (by b).

Therefore there exists ξ := sup A.


Our first task will be to prove that ξ = b.
First, we prove that ξ > a. Indeed, if we let ξ = a then by continuity of f at a it follows
that £ ¡ ¢¤
(∃δ > 0)(∀x ∈ D(f )) (0 ≤ x − a < δ) ⇒ |f (x) − f (a)| < 1 .
Hence the function f is bounded on [a, a + δ) . Therefore
¡ ¢¡ ¢
∃x1 ∈ (a, a + δ) f is bounded on [a, x1 ] ,

which proves that ξ > a.


Now suppose that ξ < b. Then by Theorem 4.2.7
£ ¤
(4.3.3) (∃δ > 0) f is bounded on (ξ − δ, ξ + δ) .

Fix δ as above. By the definition of sup


¡ ¢£ ¤
∃x1 ∈ (ξ − δ, ξ] f is bounded on [a, x1 ] .

Also from (4.3.3) it follows that


¡ ¢£ ¤
∃x2 ∈ (ξ, ξ + δ) f is bounded on [x1 , x2 ] .

Therefore f is bounded on [a, x2] where x2 > ξ, which contradicts the fact that ξ is the
supremum of A. This proves that ξ = b.
(Note that this does not complete the proof since supremum may not belong to a set, i.e.
it is possible that ξ 6∈ A).
From the continuity of f at b it follows that
£ ¤
(∃δ1 > 0) f is bounded on (b − δ1 , b] .

By the definition of supremum


£ ¤
(∃x1 ∈ (b − δ1 , b] ) f is bounded on [a, x1 ] .

Therefore f is bounded on [a, b].

The last theorem asserts that the range of a continuous function restricted to a closed
interval is a bounded subset of R, so that its supremum and infimum exist. The next
theorem asserts that these values are attained, which means that there are points in [a, b]
at which the values of the function equal the supremum (infimum).

64 January 28, 2006


CHAPTER 4. LIMITS OF FUNCTIONS AND CONTINUITY

Theorem 4.3.3. Let f be continuous on [a, b] ⊂ R. Then


¡ ¢
(∃y ∈ [a, b])(∀x ∈ [a, b]) f (x) ≤ f (y) .

(i.e. f (y) = max f (x). ) Similarly,


x∈[a,b]
¡ ¢
(∃z ∈ [a, b])(∀x ∈ [a, b]) f (x) ≥ f (z) .

Proof. Let us introduce the set of values of f on [a, b]


F := { f(x) | x ∈ [a, b] }.

We know that F 6= ∅ and by Theorem 4.3.2 F is bounded. Therefore its supremum exists.
Denote α := sup F .
Our aim is to prove that there exists y ∈ [a, b] such that f (y) = α. We prove this by
contradiction.
Suppose that
(∀x ∈ [a, b]) ( f(x) < α ).
Define the following function
g(x) = 1/(α − f(x)),   x ∈ [a, b].

Since the denominator is never zero on [a, b] we conclude that g is continuous and by Theorem
4.3.2 is bounded on [a, b].
At the same time by the definition of supremum

(∀ε > 0)(∃x ∈ [a, b]) ( f(x) > α − ε ),

in other words α − f(x) < ε, and so g(x) > 1/ε. This proves that

(∀ε > 0)(∃x ∈ [a, b]) ( g(x) > 1/ε ).
Therefore g is unbounded on [a, b] which is a contradiction.

4.4 Uniform continuity


Definition 4.4.1. Let f be a function defined on a set A ⊂ R. We say that f is uniformly
continuous on A if
h¡ ¢ ¡ ¢i
(∀ε > 0)(∃δ > 0)(∀x1 , x2 ∈ A) |x1 − x2 | < δ ⇒ |f (x1 ) − f (x2 )| < ε .

Theorem 4.4.1. Let f be defined and continuous on a closed interval [a, b]. Then f is
uniformly continuous on [a, b].

65 January 28, 2006


4.5. INVERSE FUNCTIONS

We leave this theorem without proof. See [4], Theorem 3.82.

The next example shows that the property of uniform continuity is stronger than the
property of continuity.
Example 4.4.2. The function f(x) = 1/x is continuous on (0, 1) but not uniformly continuous.
Indeed, take xn = 1/n, yn = 1/(n + 1); then |xn − yn| → 0 while |f(xn) − f(yn)| = 1 for every n.


Example 4.4.3. The function f (x) = x is uniformly continuous on [1, ∞).
Indeed,
√ √ |x1 − x2 |
|f (x1 ) − f (x2 )| = | x1 − x2 | = √ √ ≤ |x1 − x2 |.
x1 + x2
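To illustrate the difference numerically (our sketch, not from the notes): for f(x) = 1/x on (0, 1) the points x_n = 1/n and y_n = 1/(n + 1) get arbitrarily close while their images stay distance 1 apart, whereas for √x on [1, ∞) close inputs always give close outputs.

import math

# Illustrative sketch contrasting Examples 4.4.2 and 4.4.3.
for n in [10, 100, 1000]:
    x, y = 1 / n, 1 / (n + 1)
    print(n, abs(x - y), abs(1 / x - 1 / y))   # input gap -> 0, image gap stays 1

for n in [10, 100, 1000]:
    x, y = float(n), n + 0.001
    print(n, abs(math.sqrt(x) - math.sqrt(y))) # stays small: uniform continuity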

4.5 Inverse functions


In this section we discuss the inverse to a continuous function. Recall that for existence of
the inverse of a function, the function must be a bijection. It is advisable to revise Section
1.7.2.

It is not difficult to show (this point is delegated to Exercises) that a continuous bijection
defined on an interval [a, b] is monotone (either increasing or decreasing), the notion which
was discussed for sequences and the intuitive meaning of which is clear. The precise definition
reads as follows.

Definition 4.5.1. A function f defined on [a, b] is called increasing if


h ¡ ¢i
(∀x1 , x2 ∈ [a, b]) (x1 < x2 ) ⇒ f (x1 ) < f (x2 ) .

Theorem 4.5.1. Let f be continuous on [a, b] and increasing. Let f (a) = c, f (b) = d. Then
there exists a function g : [c, d] → [a, b] which is continuous, increasing and such that
h i
(∀y ∈ [c, d]) f (g(y)) = y .

Proof. Let η ∈ [c, d]. By Theorem 4.3.1


h i
(∃ξ ∈ [a, b]) f (ξ) = η .

There is only one such value ξ since f is increasing. (Why?) The inverse function g is defined
by
ξ = g(η).
It is easy to see that g is increasing.
• Indeed, let y1 < y2 , y1 = f (x1 ), y2 = f (x2 ). Suppose that at the same time x2 ≤ x1 . Since
f is increasing it follows that y2 ≤ y1 . Contradiction.•

66 January 28, 2006


CHAPTER 4. LIMITS OF FUNCTIONS AND CONTINUITY

Now let us prove that g is continuous.


Let y0 ∈ (c, d). Then ¡ ¢£ ¤
∃x0 ∈ (a, b) y0 = f (x0 ) ,
or in other words
x0 = g(y0 ).
Let ε > 0. We assume also that ε is small enough such that [x0 − ε, x0 + ε] ⊂ (a, b). Let
y1 = f (x0 − ε), y2 = f (x0 + ε). Since g is increasing we have that
£ ¤ £ ¤
y ∈ (y1 , y2 ) ⇒ x = g(y) ∈ (x0 − ε, x0 + ε) .

Take δ = min{y2 − y0 , y0 − y1 }. Then


£ ¤ £ ¤
|y − y0 | < δ ⇒ |g(y) − g(y0 )| < ε .

The continuity at the ends of the interval can be established similarly.

67 January 28, 2006


Chapter 5

Differential Calculus

5.1 Definition of derivative. Elementary properties


Definition 5.1.1. Let f be defined in a neighbourhood of a ∈ R. (This means that
(∃δ > 0) [ (a − δ, a + δ) ⊂ D(f) ].)

We say that f is differentiable at a if the limit lim_{h→0} (f(a + h) − f(a))/h exists. This limit, denoted
by f′(a), is called the derivative of f at a.

For a function f, its derivative f′ is the function defined on the set

D(f′) = { x ∈ D(f) | lim_{h→0} (f(x + h) − f(x))/h exists }

with the values

f′(x) = lim_{h→0} (f(x + h) − f(x))/h.

Example 5.1.1. Let c ∈ R. Let f : R → R be defined by f (x) = c. Then f is differentiable


at any x ∈ R and f 0 (x) = 0.

Proof.
lim_{h→0} (f(a + h) − f(a))/h = lim_{h→0} (c − c)/h = 0.

Example 5.1.2. Let f : R → R be defined by f (x) = x. Then f is differentiable at any


x ∈ R and f 0 (x) = 1.

Proof.
lim_{h→0} (f(a + h) − f(a))/h = lim_{h→0} (a + h − a)/h = 1.

68
CHAPTER 5. DIFFERENTIAL CALCULUS

Example 5.1.3. Let f : R → R be defined by f(x) = x². Then f is differentiable at any
x ∈ R and f′(x) = 2x.

Proof.

lim_{h→0} (f(a + h) − f(a))/h = lim_{h→0} ((a + h)² − a²)/h = lim_{h→0} (2ah + h²)/h = lim_{h→0} (2a + h) = 2a.
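Numerically, the difference quotient indeed approaches 2a as h shrinks; the Python sketch below (ours, illustrative only) tabulates it for f(x) = x² at a = 3.

# Illustrative sketch: difference quotients of f(x) = x**2 at a = 3 approach f'(3) = 6.
def f(x: float) -> float:
    return x * x

a = 3.0
for h in [1.0, 0.1, 0.01, 0.001, 1e-6]:
    print(h, (f(a + h) - f(a)) / h)   # values approach 6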

Example 5.1.4. Let n ∈ N. Let f : R → R be defined by f(x) = x^n. Then f is differentiable
at any x ∈ R and f′(x) = n x^{n−1}.

The proof which can be performed by induction, based on the product rule below, is left
as an exercise.

Example 5.1.5. Let f : R → R be defined by f (x) = |x|. Then f is differentiable at any


x ∈ R − {0}. f is not differentiable at 0.

Proof. If x > 0 then


f (x + h) − f (x)
lim = 1.
h→0 h
If x < 0 then
f (x + h) − f (x)
lim = −1.
h→0 h
The derivative does not exist at 0 since
f (x + h) − f (x) f (x + h) − f (x)
lim 6= lim .
h→0+ h h→0− h

Since in the above example the function f is continuous at 0, it shows that continuity
does not imply differentiability.

Theorem 5.1.6. If f is differentiable at a then f is continuous at a.

Proof.
f′(a) = lim_{h→0} (f(a + h) − f(a))/h.
Hence we have

lim_{x→a} ( f(x) − f(a) ) = lim_{h→0} ( f(a + h) − f(a) )

= lim_{h→0} ( (f(a + h) − f(a))/h ) · h = f′(a) · lim_{h→0} h = 0.

Therefore
lim_{x→a} f(x) = f(a).

69 January 28, 2006


5.1. DEFINITION OF DERIVATIVE. ELEMENTARY PROPERTIES

Remark 8. (Important!) If f is differentiable at a ∈ R then there exists a function α(x)


such that lim α(x) = 0 and
x→a

f (x) = f (a) + f 0 (a)(x − a) + α(x)(x − a).

Indeed, define
α(x) := (f(x) − f(a))/(x − a) − f′(a).
Then α(x) → 0 as x → a and f(x) = f(a) + f′(a)(x − a) + α(x)(x − a).

Theorem 5.1.7. If f and g are differentiable at a then f + g is also differentiable at a, and

(f + g)0 (a) = f 0 (a) + g 0 (a).

The proof is left as an exercise.

Theorem 5.1.8. (The product rule). If f and g are differentiable at a then f · g is also
differentiable at a, and
(f · g)0 (a) = f 0 (a) · g(a) + f (a) · g 0 (a).

Proof.

lim_{h→0} ( (f · g)(a + h) − (f · g)(a) )/h = lim_{h→0} ( f(a + h)g(a + h) − f(a)g(a) )/h

= lim_{h→0} [ f(a + h)·(g(a + h) − g(a))/h + ((f(a + h) − f(a))/h)·g(a) ]

= lim_{h→0} f(a + h) · lim_{h→0} (g(a + h) − g(a))/h + lim_{h→0} (f(a + h) − f(a))/h · g(a)

= f(a) · g′(a) + f′(a) · g(a) = f′(a) · g(a) + f(a) · g′(a),

where we used the continuity of f at a (Theorem 5.1.6) to evaluate lim_{h→0} f(a + h) = f(a).

Theorem 5.1.9. If g is differentiable at a and g(a) ≠ 0 then φ = 1/g is also differentiable at
a, and
φ′(a) = (1/g)′(a) = − g′(a)/[g(a)]².

Proof. Follows from


(φ(a + h) − φ(a))/h = ( g(a) − g(a + h) ) / ( h g(a) g(a + h) ).

Theorem 5.1.10. (The quotient rule). If f and g are differentiable at a and g(a) ≠ 0
then φ = f/g is also differentiable at a, and

φ′(a) = (f/g)′(a) = ( f′(a) · g(a) − f(a) · g′(a) ) / [g(a)]².

70 January 28, 2006


CHAPTER 5. DIFFERENTIAL CALCULUS

Proof. Follows from Theorems 5.1.8 and 5.1.9.

Theorem 5.1.11. (The Chain rule) If g is differentiable at a ∈ R and f is differentiable


at g(a) then f ◦ g is differentiable at a and
(f ◦ g)0 (a) = f 0 (g(a)) · g 0 (a).

Proof. By the definition of the derivative and Remark 8 we have


f (y) − f (y0 ) = f 0 (y0 )(y − y0 ) + α(y)(y − y0 ),
where α(y) → 0 as y → y0 . Set in the above equality, for x 6= a, y = g(x), y0 = g(a) and
divide both sides by x − a. We obtain
(f(g(x)) − f(g(a)))/(x − a) = f′(g(a)) · (g(x) − g(a))/(x − a) + α(g(x)) · (g(x) − g(a))/(x − a).
By Theorem 5.1.6 g is continuous at a. Hence y = g(x) → g(a) = y0 as x → a, and
α(g(x)) → 0 as x → a. Passing to the limit in the above equality as x → a we obtain the
required.
Example 5.1.12. Let f : R → R be defined by f(x) = (x² + 1)^100. Then f is differentiable at
every point in R and f′(x) = 100(x² + 1)^99 · 2x.

In the next theorem we establish relation between the derivative of an invertible function
and the derivative of the inverse function.

Theorem 5.1.13. Let f be continuous, increasing on (a, b) and given by y = f (x). Suppose
that for some x0 ∈ (a, b) f is differentiable at x0 and f 0 (x0 ) 6= 0. Then the inverse function
g = f −1 given by x = g(y) is differentiable at y0 = f (x0 ) and
g′(y0) = 1/f′(x0).
Proof. Remark 8 implies that
y − y0 = f (g(y)) − f (g(y0 ))
0
= f (g(y0 ))(g(y) − g(y0 )) + α(g(y))(g(y) − g(y0 )),
where α(g(y)) → 0 as g(y) → g(y0 ). Since g is continuous at y0 it follows that g(y) → g(y0 )
as y → y0 , and hence α(g(y)) → 0 as y → y0 . Therefore we have
(g(y) − g(y0))/(y − y0) = (g(y) − g(y0)) / ( f′(g(y0))(g(y) − g(y0)) + α(g(y))(g(y) − g(y0)) )

= 1 / ( f′(g(y0)) + α(g(y)) ) → 1/f′(g(y0)) as y → y0.

Example 5.1.14. Let n ∈ N. Let f : R+ → R+ be defined by f(x) = x^n. Then the inverse
function is g(y) = y^{1/n}. We have that f′(x) = n x^{n−1}. Hence by the previous theorem

g′(y) = 1/f′(x) = 1/(n x^{n−1}) = (1/n) · 1/y^{(n−1)/n} = (1/n) · y^{1/n − 1}.

71 January 28, 2006


5.2. THEOREMS ON DIFFERENTIABLE FUNCTIONS

One-sided derivatives

Similar to one-sided limits we define left and right derivatives of f at a as

f′−(a) = lim_{h→0−} (f(a + h) − f(a))/h,    f′+(a) = lim_{h→0+} (f(a + h) − f(a))/h.

5.2 Theorems on differentiable functions


Theorems of this section show how with the help of derivative (which is a local notion) one
can investigate behaviour of functions on intervals.
Theorem 5.2.1. Let f be a function defined on (a, b). If x0 ∈ (a, b) is a maximum (or
minimum) point for f on (a, b) and f is differentiable at x0 then f′(x0) = 0.

Note that we do not assume differentiability, or even continuity, of f at any other point.

Proof. We prove the theorem for the case of maximum. The case of minimum is similar.
If h is any number such that x0 + h ∈ (a, b), then

f (x0 ) ≥ f (x0 + h).

Thus for h > 0 we have

(f(x0 + h) − f(x0))/h ≤ 0,

and consequently

lim_{h→0+} (f(x0 + h) − f(x0))/h ≤ 0.

On the other hand, if h < 0, then

(f(x0 + h) − f(x0))/h ≥ 0,

so that

lim_{h→0−} (f(x0 + h) − f(x0))/h ≥ 0.
By hypothesis f 0 (x0 ) exists so that f−0 (x0 ) = f+0 (x0 ). So from the above f 0 (x0 ) = 0.

Note that the converse statement is false. A simple example is

f : R → R,   f(x) = x³.

We see that f′(0) = 0, however 0 is not a point of maximum or minimum on any interval.

Theorem 5.2.2. (Rolle’s Theorem). If f is continuous on [a, b] and differentiable on


(a, b), and f (a) = f (b), then
¡ ¢£ ¤
∃x0 ∈ (a, b) f 0 (x0 ) = 0 .

72 January 28, 2006


CHAPTER 5. DIFFERENTIAL CALCULUS

Proof. It follows from the continuity of f that f has a maximum and a minimum value on
[a, b].
Suppose that the maximum value occurs at x0 ∈ (a, b). Then by Theorem 5.2.1 f 0 (x0 ) = 0,
and we are done.
Suppose next that the minimum value occurs at x0 ∈ (a, b). Then again by Theorem 5.2.1
f 0 (x0 ) = 0.
Finally, suppose that the maximum value and the minimum value both occur at the end
points. Since f (a) = f (b), the maximum value and the minimum value are equal, so that f
is a constant. Hence f 0 (x) = 0 for all x ∈ (a, b).

Theorem 5.2.3. (The Mean Value Theorem) If f is continuous on [a, b] and differen-
tiable on (a, b), then

(∃x0 ∈ (a, b)) [ f′(x0) = (f(b) − f(a))/(b − a) ].

Proof. Let

g(x) = f(x) − ( (f(b) − f(a))/(b − a) ) (x − a).

Then g is continuous on [a, b] and differentiable on (a, b), and

g′(x) = f′(x) − (f(b) − f(a))/(b − a).

Moreover, g(a) = f(a) and

g(b) = f(b) − ( (f(b) − f(a))/(b − a) ) (b − a) = f(a).

Therefore by Rolle’s Theorem

(∃x0 ∈ (a, b)) [g 0 (x0 ) = 0].

Corollary 5.2.1. If f is defined on an interval and f 0 (x) = 0 for all x in the interval, then
f is a constant on the interval.

Proof. Let a and b be any two points in the interval with a 6= b. Then by the Mean Value
theorem there is a point x in (a, b) such that

f′(x) = (f(b) − f(a))/(b − a).

But f′(x) = 0 for all x in the interval, so that

0 = (f(b) − f(a))/(b − a),
and consequently f (b) = f (a). Thus the value of f at any two points is the same. Therefore
f is a constant on the interval.

73 January 28, 2006


5.2. THEOREMS ON DIFFERENTIABLE FUNCTIONS

Corollary 5.2.2. If f and g are defined on the same interval and f'(x) = g'(x) for all x in the interval, then there is some number c ∈ R such that f = g + c on the interval.
The proof is left as an exercise.
Corollary 5.2.3. If f'(x) > 0 for all x in an interval, then f is increasing on the interval; if f'(x) < 0 for all x in an interval, then f is decreasing on the interval.

Proof. Consider the case f'(x) > 0. Let a and b be any two points in the interval with a < b. Then by the Mean Value Theorem there is a point x in (a, b) such that

f'(x) = (f(b) − f(a))/(b − a).

But f'(x) > 0 for all x in the interval, so that

(f(b) − f(a))/(b − a) > 0.

Since b − a > 0, it follows that f(b) > f(a), which proves that f is increasing on the interval. The case f'(x) < 0 is left as an exercise.
The next theorem is a generalisation of the Mean Value Theorem. It is of interest because
of its applications.
Theorem 5.2.4. (The Cauchy Mean Value Theorem). If f and g are continuous on [a, b] and differentiable on (a, b), then

(∃x0 ∈ (a, b)) [ [f(b) − f(a)] g'(x0) = [g(b) − g(a)] f'(x0) ].

(If g(b) ≠ g(a) and g'(x0) ≠ 0, the above equality can be rewritten as

(f(b) − f(a))/(g(b) − g(a)) = f'(x0)/g'(x0).

Note that if g(x) = x, we obtain the Mean Value Theorem.)

Proof. Let h : [a, b] → R be defined by

h(x) = [f(b) − f(a)] g(x) − [g(b) − g(a)] f(x).

Then h(a) = f(b)g(a) − f(a)g(b) = h(b), so that h satisfies the hypotheses of Rolle's theorem. Therefore

(∃x0 ∈ (a, b)) [ 0 = h'(x0) = [f(b) − f(a)] g'(x0) − [g(b) − g(a)] f'(x0) ].

Theorem 5.2.5. (L'Hôpital's Rule). Let f and g be differentiable, with g'(x) ≠ 0, in a neighbourhood of a point a ∈ R, and such that f(a) = g(a) = 0. Suppose that the limit lim_{x→a} f'(x)/g'(x) exists. Then

lim_{x→a} f(x)/g(x) = lim_{x→a} f'(x)/g'(x).


Proof. By the Cauchy Mean Value Theorem

(f(a + h) − f(a))/(g(a + h) − g(a)) = f'(a + th)/g'(a + th)

for some 0 < t < 1. Passing to the limit as h → 0 we get the result.

L’Hôpital’s Rule is a useful tool in computing limits.


Example 5.2.6. Let m, n ∈ N and 0 ≠ a ∈ R. Then

lim_{x→a} (x^m − a^m)/(x^n − a^n) = lim_{x→a} (m x^{m−1})/(n x^{n−1}) = (m/n) a^{m−n}.
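A quick numerical check of this limit (an illustrative Python sketch; the values m = 5, n = 3, a = 2 are arbitrary choices, not part of the example):

    # Check lim_{x->a} (x^m - a^m)/(x^n - a^n) = (m/n) * a^(m-n) numerically.
    m, n, a = 5, 3, 2.0

    def ratio(x):
        return (x**m - a**m) / (x**n - a**n)

    for h in [1e-1, 1e-3, 1e-5]:
        print(h, ratio(a + h))

    print("predicted:", (m / n) * a**(m - n))   # (5/3) * 2^2 = 20/3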

5.3 Approximation by polynomials. Taylor’s Theorem


Polynomials, i.e. functions of the form

a0 + a1 x + a2 x² + · · · + an x^n,

where n ∈ N, a0, a1, . . . , an ∈ R, form the simplest class of functions. In this section we will show that functions that are differentiable sufficiently many times can be approximated by polynomials. We use the notation f^(n)(a) for the nth derivative of f at a.
Lemma 1. If f is the polynomial

a0 + a1 x + a2 x² + · · · + an x^n,

then

ak = f^(k)(0)/k!   (0 ≤ k ≤ n).

Proof. Differentiate f k times and evaluate at 0: all terms of degree less than k vanish, the term ak x^k contributes k! ak, and the remaining terms still contain a factor x.
Thus for a polynomial f of degree n we have

f(x) = f(0) + x f'(0) + (x²/2!) f''(0) + · · · + (x^n/n!) f^(n)(0).

More generally, if x = a + h, where a is fixed, we have

f(a + h) = f(a) + h f'(a) + (h²/2!) f''(a) + · · · + (h^n/n!) f^(n)(a).

The next theorem gives an approximation of a sufficiently “nice” function by a polynomial. We will call it Taylor's theorem (see [4] for comments).

Theorem 5.3.1. Let h > 0, p ≥ 1. Suppose that f and its derivatives up to order n − 1 are continuous on [a, a + h] and f^(n) exists on (a, a + h). Then there exists a number t ∈ (0, 1) such that

(5.3.1)   f(a + h) = f(a) + h f'(a) + (h²/2!) f''(a) + · · · + (h^{n−1}/(n − 1)!) f^(n−1)(a) + (h^n (1 − t)^{n−p}/(p (n − 1)!)) f^(n)(a + th).


Proof. Set

Rn = f(a + h) − f(a) − h f'(a) − (h²/2!) f''(a) − · · · − (h^{n−1}/(n − 1)!) f^(n−1)(a).

Define g : [a, a + h] → R by

g(x) = f(a + h) − f(x) − (a + h − x) f'(x) − ((a + h − x)²/2!) f''(x) − · · · − ((a + h − x)^{n−1}/(n − 1)!) f^(n−1)(x) − ((a + h − x)^p/h^p) Rn.

Clearly g(a + h) = 0. From our definition of Rn it follows that g(a) = 0. Therefore we can apply Rolle's theorem to g on [a, a + h]. Hence there exists t ∈ (0, 1) such that

g'(a + th) = 0.

It is easy to verify that

g'(a + th) = − ([h(1 − t)]^{n−1}/(n − 1)!) f^(n)(a + th) + (p (1 − t)^{p−1}/h) Rn,

all other terms in the differentiation cancelling in pairs (it is advisable to check this). From this we find that

Rn = (h^n (1 − t)^{n−p}/(p (n − 1)!)) f^(n)(a + th),

which proves the theorem.

Corollary 5.3.1. (Forms of the remainder). Let the conditions of Theorem 5.3.1 be satisfied. Then

(5.3.2)   f(a + h) = f(a) + h f'(a) + (h²/2!) f''(a) + · · · + (h^{n−1}/(n − 1)!) f^(n−1)(a) + Rn,

where

(i) (Lagrange) Rn = (h^n/n!) f^(n)(a + th) for some t ∈ (0, 1);

(ii) (Cauchy) Rn = ((1 − s)^{n−1} h^n/(n − 1)!) f^(n)(a + sh) for some s ∈ (0, 1).

Proof. Putting p = n in (5.3.1) gives (i), and putting p = 1 gives (ii).


Remark 9. The number Rn in (5.3.2) is called the remainder term. Note that lim_{h→0} Rn/h^{n−1} = 0, which means that, as h → 0, Rn tends to 0 faster than the highest-order term written out in (5.3.2).
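To see the remainder term at work, here is a small Python sketch (illustrative only; it uses f(x) = eˣ at a = 0, for which every derivative is eˣ, so the Lagrange remainder is bounded by (hⁿ/n!)·e^h):

    from math import exp, factorial

    # Taylor expansion of e^h about 0: e^h = sum_{k<n} h^k/k! + R_n,
    # with |R_n| <= (h^n / n!) * e^h in the Lagrange form.
    h = 0.5
    for n in range(1, 8):
        poly = sum(h**k / factorial(k) for k in range(n))   # terms up to h^(n-1)
        remainder = exp(h) - poly
        bound = h**n / factorial(n) * exp(h)
        print(n, remainder, bound)   # the remainder shrinks like h^n/n! and stays below the bound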


Example 5.3.2. Let α ∈ R. Let f : (−1, 1] → R be given by f(x) = (1 + x)^α. Since

f^(n)(x) = α(α − 1) · · · (α − n + 1)(1 + x)^{α−n},
f^(n)(0) = α(α − 1) · · · (α − n + 1),

Taylor's expansion of f takes the form

(1 + x)^α = 1 + (α/1!) x + (α(α − 1)/2!) x² + · · · + (α(α − 1) · · · (α − n + 1)/n!) x^n + R_{n+1}(x),

where the remainder term in the Lagrange form is

R_{n+1}(x) = (α(α − 1) · · · (α − n)/(n + 1)!) (1 + tx)^{α−n−1} x^{n+1}   for some t ∈ (0, 1).



Chapter 6

Series

In this chapter we continue the study of series. We have already studied series of positive terms (it is advisable to revise Section 3.6). Here we discuss convergence and divergence of series which have infinitely many positive and infinitely many negative terms.

6.1 Series of positive and negative terms


6.1.1 Alternating series
Theorem 6.1.1. Let (an)n be a sequence of positive numbers such that

(i) (∀n ∈ N) (an ≥ an+1);

(ii) lim_{n→∞} an = 0.

Then the series

a1 − a2 + a3 − a4 + · · · + (−1)^{n−1} an + · · · = Σ_{n=1}^∞ (−1)^{n−1} an

converges.

Proof. For each n ∈ N set

sn = a1 − a2 + a3 − a4 + · · · + (−1)^{n−1} an = Σ_{k=1}^n (−1)^{k−1} ak,
tn = a1 − a2 + a3 − a4 + · · · + a_{2n−1} − a_{2n} = s_{2n},
un = a1 − a2 + a3 − a4 + · · · + a_{2n−1} = s_{2n−1}.

Then we have by (i)

t_{n+1} − tn = s_{2n+2} − s_{2n} = a_{2n+1} − a_{2n+2} ≥ 0.

Thus the sequence (tn)n is increasing. Similarly

u_{n+1} − un = −a_{2n} + a_{2n+1} ≤ 0,

so that (un)n is decreasing. Also for every n ∈ N

un − tn = a_{2n} ≥ 0, i.e. un ≥ tn.

Thus the sequence (un)n is decreasing and bounded below by t1. Therefore it converges to a limit u = lim_{n→∞} un. The sequence (tn)n is increasing and bounded above by u1. Therefore it converges to a limit t = lim_{n→∞} tn. Moreover u − t = lim_{n→∞}(un − tn) = lim_{n→∞} a_{2n} = 0. Denote t = u = s. So we have

s_{2n} = tn → s and s_{2n−1} = un → s as n → ∞.

Therefore sn → s as n → ∞.

Example 6.1.2. The series

1 − 1/2 + 1/3 − 1/4 + · · ·

converges.
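A numerical look at this series (an illustrative Python sketch, not part of the proof): the even and odd partial sums squeeze the limit from both sides, exactly as in the proof of Theorem 6.1.1.

    # Partial sums of 1 - 1/2 + 1/3 - 1/4 + ...
    s = 0.0
    for n in range(1, 11):
        s += (-1)**(n - 1) / n
        print(n, s)
    # Even-indexed partial sums increase, odd-indexed ones decrease,
    # and both approach the same limit (which is log 2 = 0.6931...).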
Example 6.1.3. The series

Σ_{n=1}^∞ x^n/n

converges if and only if x ∈ [−1, 1).

Proof. First, let x > 0. Then

lim_{n→∞} a_{n+1}/a_n = lim_{n→∞} x n/(n + 1) = x.

By the d'Alembert test, if x < 1 then the series converges, and if x > 1 then it diverges. If x = 1 we obtain the harmonic series, which is divergent.
Now let x < 0. Denote y = −x > 0. Then we have the alternating series

Σ_{n=1}^∞ (−1)^n y^n/n.

If y ≤ 1 then y^n/n > y^{n+1}/(n + 1) and lim_{n→∞} y^n/n = 0, so that the series converges by Theorem 6.1.1. If y > 1 then lim_{n→∞} y^n/n ≠ 0, and the series diverges.

6.1.2 Absolute convergence

Definition 6.1.1. The series Σ_{n=1}^∞ an is said to be absolutely convergent if the series Σ_{n=1}^∞ |an| is convergent. A series which is convergent but not absolutely convergent is said to be conditionally convergent.

Theorem 6.1.4. An absolutely convergent series is convergent.



Proof. Let Σ_{n=1}^∞ an be an absolutely convergent series. Define

bn = an if an ≥ 0,   bn = 0 if an < 0;
cn = 0 if an ≥ 0,   cn = −an if an < 0.

Then (∀n ∈ N)(bn ≥ 0) ∧ (cn ≥ 0). Note that

an = bn − cn,   |an| = bn + cn.

Since (∀n ∈ N)(bn ≤ |an|) ∧ (cn ≤ |an|), by the comparison test we conclude that the series

Σ_{n=1}^∞ bn and Σ_{n=1}^∞ cn

converge, and therefore so does the series Σ_{n=1}^∞ (bn − cn).

Example 6.1.5. (i) The series

1/4 + 1 − 1/9 − 1/16 + · · ·

is absolutely convergent since the series

1 + 1/4 + 1/9 + 1/16 + · · ·

converges.
(ii) The series

1 − 1/2 + 1/3 − 1/4 + · · ·

is conditionally convergent since the series

1 + 1/2 + 1/3 + 1/4 + · · · + 1/n + · · ·

diverges.

Example 6.1.6. The exponential series

1 + x + x²/2! + x³/3! + · · · + x^n/n! + · · ·

converges for all x ∈ R.

Proof. To see this consider the series of absolute values

Σ_{n=1}^∞ |x|^n/n!.

Using d'Alembert's test prove that it is convergent for every x ∈ R. From this conclude that the exponential series is absolutely convergent. Use Theorem 6.1.4 to conclude that the series converges. Details are left as an exercise.
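A short Python sketch (illustrative, not a substitute for the exercise) comparing partial sums of the exponential series with the library exponential:

    from math import exp, factorial

    # Partial sums of 1 + x + x^2/2! + ... approach exp(x) for any fixed x.
    for x in [1.0, -3.0, 5.0]:
        partial = sum(x**n / factorial(n) for n in range(30))
        print(x, partial, exp(x))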


6.1.3 Rearranging series

Example 6.1.7. Let s be the sum of the series

1 − 1/2 + 1/3 − 1/4 + · · ·

Let sn be the nth partial sum, i.e.

sn = 1 − 1/2 + 1/3 − 1/4 + · · · + (−1)^{n−1}/n.

Let us now rearrange the series in the following way:

1 − 1/2 − 1/4 + 1/3 − 1/6 − 1/8 + 1/5 − 1/10 − 1/12 + · · ·

Let tn be the nth partial sum of the new series. Then

t_{3n} = (1 − 1/2 − 1/4) + (1/3 − 1/6 − 1/8) + · · · + (1/(2n − 1) − 1/(4n − 2) − 1/(4n))
      = (1/2)(1 − 1/2) + (1/2)(1/3 − 1/4) + · · · + (1/2)(1/(2n − 1) − 1/(2n)) = (1/2) s_{2n}.

But s_{2n} → s as n → ∞. Hence t_{3n} → s/2 as n → ∞. Also t_{3n+1} = t_{3n} + 1/(2n + 1) → s/2 and t_{3n+2} = t_{3n+1} − 1/(4n + 2) → s/2. Therefore

tn → s/2 as n → ∞.

Remark 10. The series in Example 6.1.7 is conditionally convergent. We see that by rearranging this series we change its sum. In fact, one can prove that any conditionally convergent series can be rearranged in such a way that the sum of the rearranged series is any real number chosen in advance.
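The rearrangement above is easy to test numerically (a Python sketch; purely illustrative):

    # Compare partial sums of 1 - 1/2 + 1/3 - ... with the rearrangement
    # 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...  (one positive term, then two negative terms).
    N = 2000
    s = sum((-1)**(n - 1) / n for n in range(1, 2 * N + 1))      # s_{2N}

    t = 0.0
    for k in range(1, N + 1):                                    # t_{3N}
        t += 1 / (2 * k - 1) - 1 / (4 * k - 2) - 1 / (4 * k)

    print(s, t, s / 2)   # t is close to s/2, not to s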

In the next theorem we prove that rearranging an absolutely convergent series does not alter its sum.
(We say that the series Σ_{n=1}^∞ bn is a rearrangement of the series Σ_{n=1}^∞ an if there is a bijection f on the set of natural numbers N such that

(∀n ∈ N)(bn = a_{f(n)}).)


Theorem 6.1.8. Let Σ_{n=1}^∞ an be an absolutely convergent series with sum s. Then any rearrangement of Σ_{n=1}^∞ an is also convergent with sum s.


Proof. First let us consider the case in which an ≥ 0 for all n. Let b1 + b2 + b3 + · · · be a rearrangement of a1 + a2 + a3 + · · ·. Denote by sn and tn the corresponding partial sums, i.e.

sn = a1 + a2 + a3 + · · · + an,   tn = b1 + b2 + b3 + · · · + bn.

The sequences (sn)n and (tn)n are increasing and sn → s as n → ∞. So (∀n ∈ N)(sn ≤ s). Also

(∀n ∈ N)(∃k ∈ N) (b1 + b2 + b3 + · · · + bn ≤ sk).

(Each of the numbers b1, b2, . . . , bn is one of the a's.) Therefore (∀n ∈ N)(tn ≤ s). Hence the sequence (tn)n is bounded above by s. Thus it converges, and the limit t ≤ s. By the same argument, considering a1 + a2 + a3 + · · · as a rearrangement of b1 + b2 + b3 + · · ·, we obtain that s ≤ t. So we conclude that t = s.
The general case is treated similarly to the proof of Theorem 6.1.4. Define

dn = an if an ≥ 0,   dn = 0 if an < 0;
cn = 0 if an ≥ 0,   cn = −an if an < 0.

We have

(∀n ∈ N) [0 ≤ dn ≤ |an|] ∧ [0 ≤ cn ≤ |an|].

So by the comparison test

Σ_{n=1}^∞ dn and Σ_{n=1}^∞ cn

converge. Set Σ_{n=1}^∞ dn = d and Σ_{n=1}^∞ cn = c. Then s = Σ_{n=1}^∞ an = d − c. Let b1 + b2 + b3 + · · · be a rearrangement of a1 + a2 + a3 + · · ·. Set

xn = bn if bn ≥ 0,   xn = 0 if bn < 0;
yn = 0 if bn ≥ 0,   yn = −bn if bn < 0.

Then x1 + x2 + · · · is a rearrangement of d1 + d2 + · · · and from what we proved

x1 + x2 + · · · = d1 + d2 + · · ·

Similarly

y1 + y2 + · · · = c1 + c2 + · · ·

But bn = xn − yn for all n. Therefore

b1 + b2 + · · · = (x1 + x2 + · · ·) − (y1 + y2 + · · ·) = (d1 + d2 + · · ·) − (c1 + c2 + · · ·) = d − c = s.


6.1.4 Multiplication of series

Theorem 6.1.9. Let Σ_{n=1}^∞ an and Σ_{n=1}^∞ bn be absolutely convergent series with sums s and t respectively. Then the series formed from all the products ak bl (k, l ∈ N), taken in any order, is absolutely convergent with sum st.

Proof. Denote by w1, w2, w3, . . . the products ak bl (k, l ∈ N), listed in the chosen order. Let us prove that the series Σ_{n=1}^∞ |wn| converges. Let Sn denote its nth partial sum; Sn is a sum of terms |ak bl|. Denote by m the maximal index among the k and l for which |ak bl| occurs in Sn. Then

(6.1.1)   Sn ≤ (|a1| + |a2| + · · · + |am|)(|b1| + |b2| + · · · + |bm|).

The right-hand side of (6.1.1) is the product of the mth partial sums of the series Σ_{n=1}^∞ |an| and Σ_{n=1}^∞ |bn|. Therefore the sequence (Sn)n is bounded above, which proves the absolute convergence of the series in question.
It remains to prove that the sum of the series is st. Let S denote the sum of the series Σ_{n=1}^∞ wn; by Theorem 6.1.8 it does not depend on the order of the terms, so we may arrange the products "by squares" and denote by σn the partial sums of the series so arranged. Then

σ_{n²} = (a1 + a2 + · · · + an)(b1 + b2 + · · · + bn),

which implies that

lim_{n→∞} σ_{n²} = st.

But of course lim_{n→∞} σ_{n²} = lim_{n→∞} σn = S, which completes the proof.
Example 6.1.10. Set

e(x) = 1 + x/1! + x²/2! + · · · + x^n/n! + · · ·

We know from Example 6.1.6 that the series on the right-hand side of the above equality is absolutely convergent for all x ∈ R. Hence any two such series can be multiplied together in the obvious way and the order of the terms can be changed without affecting the sum. Therefore for all x, y ∈ R we have

e(x)e(y) = (1 + x/1! + x²/2! + · · · + x^n/n! + · · ·)(1 + y/1! + y²/2! + · · · + y^n/n! + · · ·)
        = 1 + (x + y) + (x²/2 + xy + y²/2) + · · ·
        = 1 + (x + y)/1! + (x + y)²/2! + · · · + (x + y)^n/n! + · · · = e(x + y),

where we used the observation that the terms of degree n in x and y are

x^n/n! + · · · + x^k y^{n−k}/(k!(n − k)!) + · · · + y^n/n! = (x + y)^n/n!.


6.2 Power series


Definition 6.2.1. A power series is a series of the form

a0 + a1 x + a2 x² + · · · + an x^n + · · · = Σ_{n=0}^∞ an x^n,

where the coefficients an and the variable x are real numbers.

Example 6.2.1. The exponential series

1 + x/1! + x²/2! + · · · + x^n/n! + · · ·

converges for all x ∈ R (see Example 6.1.10).

Example 6.2.2. The series

Σ_{n=1}^∞ n^n x^n

converges only for x = 0. Indeed, if x ≠ 0 then lim_{n→∞} n^n x^n ≠ 0.
Example 6.2.3. The geometric series

1 + x + x² + · · · + x^n + · · · = Σ_{n=0}^∞ x^n

converges if and only if x ∈ (−1, 1).

Example 6.2.4. The series

x − x²/2 + x³/3 − x⁴/4 + · · · + (−1)^{n−1} x^n/n + · · · = Σ_{n=1}^∞ (−1)^{n−1} x^n/n

converges if and only if x ∈ (−1, 1] (see Example 6.1.2).


Theorem 6.2.5. Let r ∈ R be such that the series Σ_{n=0}^∞ an r^n converges. Then the power series Σ_{n=0}^∞ an x^n is absolutely convergent for all x ∈ (−|r|, |r|) (i.e. for |x| < |r|).

Proof. If r = 0 there is nothing to prove, so we can assume that r ≠ 0.
From the convergence of the series Σ_{n=0}^∞ an r^n it follows that lim_{n→∞} an r^n = 0, therefore the sequence (an r^n)_{n∈N} is bounded. Thus

(∃K ∈ R)(∀n ∈ N)(|an r^n| ≤ K).


Let x ∈ R be such that |x| < |r|, and set y = |x|/|r|. Note that 0 ≤ y < 1. We have

|an x^n| = |an| · |x|^n = |an| · |r|^n y^n = |an r^n| · y^n ≤ K y^n.

Therefore the series Σ_{n=0}^∞ |an x^n| converges by comparison with the convergent geometric series K(1 + y + y² + · · · + y^n + · · ·).
K(1 + y + y 2 + · · · + y n + . . . ).


Theorem 6.2.6. Let Σ_{n=0}^∞ an x^n be a power series. Then one of the following possibilities occurs:

(i) The series converges only when x = 0;

(ii) The series converges absolutely for all x ∈ R;

(iii) There exists r > 0 such that the series is absolutely convergent for all x ∈ R such that |x| < r and is divergent for all x ∈ R such that |x| > r.

Proof. Let us denote by E the set of all non-negative numbers x such that the series Σ_{n=0}^∞ an x^n converges. Clearly 0 ∈ E. If E = {0} then (i) is true.
If E contains some x > 0 then, by the previous theorem, the series converges (absolutely) for all y ∈ R with |y| < x.
Suppose that E is not bounded above. Let y be any real number. Then |y| is not an upper bound for E, so that (∃x ∈ E)(|y| < x). By Theorem 6.2.5 the series converges absolutely at y. Thus (ii) holds.
Finally let us suppose that E is bounded above and contains at least one positive number. Then sup E exists. Set r = sup E. We know that r > 0. Let x ∈ (−r, r). Then there exists y ∈ (|x|, r] such that y ∈ E. Indeed, |x| is not an upper bound for E, hence (∃y ∈ E)(|x| < y). Therefore by Theorem 6.2.5 the series is absolutely convergent at x.
Now suppose that |x| > r. Then there exists u ∈ R such that r < u < |x| and u ∉ E. The series is not convergent at u, hence by Theorem 6.2.5 it is not convergent at x (otherwise it would converge at u).

Definition 6.2.2. The radius of convergence, denoted by R, of the power series

a0 + a1 x + a2 x2 + . . .

is defined as follows: R = ∞ if the series converges for all x ∈ R; R = 0 if the series converges
for x = 0 only; and R = r if (iii) in Theorem 6.2.6 holds.

Example 6.2.7. The series

1 − x²/2! + x⁴/4! − x⁶/6! + · · ·

has infinite radius of convergence.


Proof. Let us set y = x². Then the series can be written as

a0 + a1 y + a2 y² + · · ·   where an = (−1)^n/(2n)!.

Using d'Alembert's test we compute, for any y ≥ 0,

lim_{n→∞} |a_{n+1} y^{n+1}|/|an y^n| = lim_{n→∞} y/((2n + 1)(2n + 2)) = 0.

Therefore the series is absolutely convergent for every x ∈ R.


Example 6.2.8. The series

x − x³/3! + x⁵/5! − x⁷/7! + · · ·

has infinite radius of convergence.
The proof is similar to the previous one.

Example 6.2.9. Find all x ∈ R such that the series

1 − x/3 + x²/5 − x³/7 + · · · = Σ_{n=0}^∞ (−1)^n x^n/(2n + 1)

is convergent.
For absolute convergence we use d'Alembert's test:

lim_{n→∞} |a_{n+1} x^{n+1}|/|an x^n| = lim_{n→∞} |x|(2n + 1)/(2n + 3) = |x|.

Therefore the series is absolutely convergent if |x| < 1.
The series diverges if |x| > 1 since the general term does not converge to zero.
For x = 1 we have the alternating series

1 − 1/3 + 1/5 − 1/7 + · · ·

which converges.
For x = −1 we have the series

1 + 1/3 + 1/5 + 1/7 + · · · = Σ_{n=0}^∞ 1/(2n + 1),

which diverges by comparison with the harmonic series.
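A Python sketch (illustrative only) showing the different behaviour at the two endpoints x = 1 and x = −1:

    # Partial sums of sum_{n>=0} (-1)^n x^n / (2n + 1) at the endpoints.
    def partial(x, N):
        return sum((-1)**n * x**n / (2 * n + 1) for n in range(N))

    for N in [10, 100, 1000, 10000]:
        print(N, partial(1.0, N), partial(-1.0, N))
    # At x = 1 the partial sums settle down (near 0.785...);
    # at x = -1 they grow without bound, roughly like (log N)/2.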



Chapter 7

Elementary functions

7.1 Exponential function


Definition 7.1.1. We define the exponential function exp : R → R by the following series:

(7.1.1)   exp x = 1 + x/1! + x²/2! + · · · + x^n/n! + · · ·

The series on the right is convergent for all x ∈ R (see Example 6.2.1).
Theorem 7.1.1. For all x, y ∈ R

exp x · exp y = exp(x + y).

The proof was given in Example 6.1.10.


The following properties easily follow from the definition and the above theorem.
Corollary 7.1.1. (i) exp 0 = 1;

(ii) exp(−x) = 1/exp x;

(iii) (∀x ∈ R) (exp x > 0).

Theorem 7.1.2. The exponential function f (x) = exp x is everywhere differentiable and

f'(a) = exp a, i.e. (exp x)' = exp x.

We will also use the notation e^x = exp x, for a reason which will become apparent a little later.

Proof. Using e^{x+y} = e^x · e^y we have

(e^{a+h} − e^a)/h = e^a (e^h − 1)/h.

Now

(e^h − 1)/h = 1 + h/2! + h²/3! + · · · = 1 + g(h).

If |h| < 2 we have

|g(h)| ≤ |h|/2 + |h|²/4 + · · · + |h|^n/2^n + · · · = (|h|/2)/(1 − |h|/2) → 0 as h → 0.

Hence (e^h − 1)/h → 1 as h → 0, and therefore f'(a) = e^a.

Corollary 7.1.2. exp x is a continuous function.


Theorem 7.1.3. (i) exp x is a strictly increasing function.

(ii) exp x → +∞ as x → +∞.

(iii) exp x → 0 as x → −∞.

(iv) Ran(exp) = (0, +∞).


(v) (∀k ∈ N) [ lim_{x→+∞} (exp x)/x^k = +∞ ].

Definition 7.1.2. e = exp 1.

This number, namely

e = 1 + 1/1! + 1/2! + · · · + 1/n! + · · ·

is one of the fundamental constants in mathematics. Its approximate value is

e ≈ 2.718281828459045.

It is not difficult to prove that e is irrational.

Note that from the definition and from Theorem 7.1.1 it follows that

(∀n ∈ N) (exp n = e^n).

Theorem 7.1.4. Let r ∈ Q. Then

exp r = e^r.

Proof. Let n ∈ N and r = −n. Then

exp(−n) = 1/exp n = 1/e^n = e^{−n}.

Let r = p/q, p, q ∈ N. Then

(exp(p/q))^q = exp((p/q) · q) = exp p = e^p.

Hence

exp(p/q) = e^{p/q}.
Definition 7.1.3. If x is irrational we set

e^x = exp x.


Due to monotonicity and continuity of the function exp(·) we obtain

e^x = sup{e^p | p is rational and p < x}.

Remark 11. It is useful to redefine the function e^x (after we have studied the properties above) as a mapping from R to (0, ∞).
So, by the exponential function we mean the mapping f : R → (0, ∞) defined by f(x) = e^x.
The above properties show in particular that this function is a bijection.

7.2 Logarithmic function


The bijection defined in the previous section has an inverse function.
Definition 7.2.1. The function from (0, ∞) to R defined as the inverse to e^x is called the logarithmic function (the logarithm to base e).
In other words, for x > 0, y = log x if e^y = x.
Theorem 7.2.1. log : (0, ∞) → R is a differentiable (and therefore continuous) increasing function satisfying the following properties:

(i) (∀x, y ∈ (0, ∞)) [log(xy) = log x + log y];

(ii) (log x)' = 1/x;

(iii) lim_{x→∞} log x = ∞;

(iv) lim_{x→0+} log x = −∞;

(v) (∀k ∈ N) [ lim_{x→∞} (log x)/x^k = 0 ].

Proof. The proofs are straightforward from the properties of exp. Let us show (i):

exp(log(xy)) = xy = exp(log x) · exp(log y) = exp(log x + log y).

Now (i) follows from the injectivity of exp.
Let y = f(x) = log x. Then x = g(y) = exp y. We have

f'(x) = 1/g'(y) = 1/exp y = 1/x.

The rest is left as an exercise.

There is a simple representation of log(1 + x) as a power series.


Theorem 7.2.2. For x ∈ (−1, 1)

log(1 + x) = x − x²/2 + x³/3 − x⁴/4 + · · · + (−1)^{n−1} x^n/n + · · ·
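A Python sketch comparing partial sums of this series with the library logarithm (illustrative; any x in (−1, 1) will do):

    from math import log

    # Partial sums of x - x^2/2 + x^3/3 - ... approach log(1 + x) for |x| < 1.
    def partial(x, N):
        return sum((-1)**(n - 1) * x**n / n for n in range(1, N + 1))

    for x in [0.5, -0.5, 0.9]:
        print(x, partial(x, 200), log(1 + x))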


Now we are able to define arbitrary powers of positive numbers (until now it was only clear how to define rational powers).
Definition 7.2.2. Let a > 0. Then for any x ∈ R we set

a^x := e^{x log a}.

The standard laws

a^x · a^y = a^{x+y},   (a^x)^y = a^{xy},

valid for a > 0 and arbitrary x, y ∈ R, can be easily verified from the definition and the properties of the exponential and logarithmic functions discussed above.

7.3 Trigonometric functions


In this section we sketch an approach to defining the trigonometric functions.
Definition 7.3.1. We define the two main trigonometric functions sin : R → R and cos : R → R by the following series:

(7.3.2)   sin x := x − x³/3! + x⁵/5! − x⁷/7! + · · ·

(7.3.3)   cos x := 1 − x²/2! + x⁴/4! − x⁶/6! + · · ·

The series are absolutely convergent for all x ∈ R (see Example 6.2.8 and Example 6.2.7).
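A Python sketch (illustrative only) evaluating truncations of these series and comparing them with the library functions:

    from math import sin, cos, factorial

    def sin_series(x, N=20):
        return sum((-1)**k * x**(2 * k + 1) / factorial(2 * k + 1) for k in range(N))

    def cos_series(x, N=20):
        return sum((-1)**k * x**(2 * k) / factorial(2 * k) for k in range(N))

    for x in [0.5, 2.0, 10.0]:
        print(x, sin_series(x), sin(x), cos_series(x), cos(x))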

The next theorem can be proved using the above definitions by multiplying and adding
the corresponding series. We skip the details of the proof. An easier proof would be to use
the exponential function with complex variables. This will be done in the course “Further
topics in Analysis”.
Theorem 7.3.1. For all x, y ∈ R

(7.3.4)   sin(x + y) = sin x cos y + cos x sin y;

(7.3.5)   sin² x + cos² x = 1.

From the above it follows that


(∀x ∈ R) [(−1 ≤ sin x ≤ 1) ∧ (−1 ≤ cos x ≤ 1)].

Theorem 7.3.2. The functions sin : R → R and cos : R → R are everywhere differentiable
and
(sin x)' = cos x,   (cos x)' = − sin x.

The proof is similar to that of Theorem 7.1.2. We do not give the details.


Periodicity of trigonometric functions

Theorem 7.3.3. There exists a smallest positive constant ϖ/2 such that

cos(ϖ/2) = 0 and sin(ϖ/2) = 1.

Proof. If 0 < x < 2 then

sin x = (x − x³/3!) + (x⁵/5! − x⁷/7!) + · · · > 0.

Therefore for 0 < x < 2 we have that

(cos x)' = − sin x < 0,

hence cos x is decreasing on (0, 2).
Notice that

cos x = (1 − x²/2!) + (x⁴/4! − x⁶/6!) + · · ·

It is easy to see that cos √2 > 0.
Another rearrangement,

cos x = 1 − x²/2! + x⁴/4! − (x⁶/6! − x⁸/8!) − · · · ,

shows that cos √3 < 0.
This proves the existence of a smallest number ϖ/2, with √2 < ϖ/2 < √3, such that cos(ϖ/2) = 0.
Since sin(ϖ/2) > 0 it follows from (7.3.5) that sin(ϖ/2) = 1.
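The proof suggests a way to compute ϖ/2 numerically: cos is decreasing on (0, 2) and changes sign between √2 and √3, so bisection applies. A Python sketch (illustrative; it uses a truncation of the cosine series):

    from math import factorial, sqrt

    def cos_series(x, N=25):
        return sum((-1)**k * x**(2 * k) / factorial(2 * k) for k in range(N))

    lo, hi = sqrt(2), sqrt(3)        # cos(lo) > 0 > cos(hi)
    for _ in range(60):
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid
        else:
            hi = mid

    print((lo + hi) / 2)   # 1.5707963..., i.e. the value later identified with pi/2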
Theorem 7.3.4. Let ϖ be the number defined in Theorem 7.3.3. Then for all x ∈ R

(7.3.6)   sin(x + ϖ/2) = cos x,   cos(x + ϖ/2) = − sin x;
(7.3.7)   sin(x + ϖ) = − sin x,   cos(x + ϖ) = − cos x;
(7.3.8)   sin(x + 2ϖ) = sin x,   cos(x + 2ϖ) = cos x.

The proof is an easy consequence of (7.3.4).

The above theorem implies that sin and cos are periodic with period 2ϖ. (We will later identify the number ϖ with π; see Section 8.8.)



Chapter 8

The Riemann Integral

In this chapter we study one of the approaches to the theory of integration.

8.1 Definition of integral


Let us consider a function f : [a, b] → R defined on a closed interval [a, b] ⊂ R. We are going to measure the area under the graph of the curve y = f(x) between the vertical lines x = a and x = b. In order to visualize the picture you may think of the case f(x) ≥ 0.
The approach we take is to approximate this area by sums of areas of rectangles obtained by dividing the interval [a, b].

In the following we always assume that f is bounded on [a, b].

8.1.1 Definition of integral and integrable functions


Definition 8.1.1. Let a < b. A partition of the interval [a, b] is a finite collection of points
in [a, b] one of which is a and one of which is b.

The points of a partition can be numbered x0, x1, . . . , xn so that

a = x0 < x1 < x2 < · · · < xn−1 < xn = b.

Definition 8.1.2. Let P = {x0, . . . , xn} be a partition of [a, b]. Let

mi = inf{f(x) | x_{i−1} ≤ x ≤ xi},
Mi = sup{f(x) | x_{i−1} ≤ x ≤ xi}.

The lower sum of f for P is defined as

L(f, P) = Σ_{i=1}^n mi (xi − x_{i−1}).


The upper sum of f for P is defined as

U(f, P) = Σ_{i=1}^n Mi (xi − x_{i−1}).
Definition 8.1.3. The integral sum of f for the partition P = {x0, . . . , xn} and a choice of points ξi ∈ [x_{i−1}, xi] (i = 1, . . . , n) is

σ(f, P, ξ) = Σ_{i=1}^n f(ξi)(xi − x_{i−1}).

Denote

m := inf_{x∈[a,b]} f(x),   M := sup_{x∈[a,b]} f(x).

The following inequality is obviously true:

(∀i ∈ {1, 2, . . . , n}) [m ≤ mi ≤ f(ξi) ≤ Mi ≤ M].

Therefore we have

m(b − a) = Σ_{i=1}^n m(xi − x_{i−1}) ≤ Σ_{i=1}^n mi(xi − x_{i−1}) ≤ Σ_{i=1}^n f(ξi)(xi − x_{i−1}) ≤ Σ_{i=1}^n Mi(xi − x_{i−1}) ≤ Σ_{i=1}^n M(xi − x_{i−1}) = M(b − a),

which implies that for every partition P and every choice of points ξi ∈ [x_{i−1}, xi] the following inequality holds:

(8.1.1)   m(b − a) ≤ L(f, P) ≤ σ(f, P, ξ) ≤ U(f, P) ≤ M(b − a).

In other words, the sets of real numbers

{L(f, P) | P},   {U(f, P) | P}

are bounded.
Definition 8.1.4. The upper integral of f over [a, b] is

J := inf{U(f, P) | P},

the lower integral of f over [a, b] is

j := sup{L(f, P) | P}.

(The infimum and the supremum are taken over all partitions of [a, b].)

Definition 8.1.5. A function f : [a, b] → R is called Riemann integrable if

J = j.

The common value is called the integral of f over [a, b] and is denoted by

∫_a^b f(x) dx.
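A Python sketch computing lower and upper sums over uniform partitions (illustrative; it uses f(x) = x² on [0, 1], where the infimum and supremum on each subinterval are attained at its endpoints because f is increasing):

    # Lower and upper sums of f(x) = x^2 on [0, 1] for the uniform partition with n subintervals.
    def f(x):
        return x * x

    def lower_upper(n):
        xs = [i / n for i in range(n + 1)]
        L = sum(f(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))  # inf at left endpoint
        U = sum(f(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))      # sup at right endpoint
        return L, U

    for n in [10, 100, 1000]:
        print(n, lower_upper(n))
    # Both sums approach 1/3, and U - L = 1/n, so j = J here.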


8.1.2 Properties of upper and lower sums


Lemma 2. Let P and Q be two partitions of [a, b] such that P ⊂ Q. Then

L(f, P) ≤ L(f, Q),   U(f, P) ≥ U(f, Q).

(The partition Q is called a refinement of P.)

Proof. First let us consider a particular case. Let P' be the partition formed from P by adding one extra point, say c ∈ [x_{k−1}, xk]. Let

m'_k = inf_{x∈[x_{k−1},c]} f(x),   m''_k = inf_{x∈[c,xk]} f(x).

Then m'_k ≥ mk, m''_k ≥ mk, and we have

L(f, P') = Σ_{i=1}^{k−1} mi(xi − x_{i−1}) + m'_k(c − x_{k−1}) + m''_k(xk − c) + Σ_{i=k+1}^n mi(xi − x_{i−1})
        ≥ Σ_{i=1}^{k−1} mi(xi − x_{i−1}) + mk(xk − x_{k−1}) + Σ_{i=k+1}^n mi(xi − x_{i−1}) = L(f, P).

Similarly one obtains that

U(f, P') ≤ U(f, P).

Now to prove the assertion one adds to P, one point at a time, a finite number of points so as to form Q.
Proposition 8.1.1. Let P and Q be arbitrary partitions of [a, b]. Then

L(f, P) ≤ U(f, Q).

Proof. Consider the partition P ∪ Q. Then by Lemma 2 we have

L(f, P) ≤ L(f, P ∪ Q) ≤ U(f, P ∪ Q) ≤ U(f, Q).
Theorem 8.1.1. J ≥ j.

Proof. Fix a partition Q. Then by Proposition 8.1.1

(∀P) L(f, P) ≤ U(f, Q).

Therefore

j = sup{L(f, P) | P} ≤ U(f, Q).

And from the above,

(∀Q) j ≤ U(f, Q).

Hence

j ≤ inf{U(f, Q) | Q} = J.

Remark. Integrability of f means by definition j = J. If j < J we say that f is not Riemann integrable.


Example 8.1.2. Let f : [a, b] → R be defined by f(x) = C. Then

(∀P) [L(f, P) = U(f, P) = C(b − a)].

Hence

J = j = C(b − a).

Example 8.1.3. The Dirichlet function D : [0, 1] → R is defined by D(x) = 1 if x is rational and D(x) = 0 if x is irrational. Then

(∀P) [L(D, P) = 0] ∧ [U(D, P) = 1].

Hence the Dirichlet function is not Riemann integrable.

8.2 Criterion of integrability


Theorem 8.2.1. A function f : [a, b] → R is Riemann integrable if and only if for any ε > 0
there exists a partition P of [a, b] such that

U (f, P ) − L(f, P ) < ε.

Proof. 1) Necessity. Let J = j, i.e. let us assume that f is integrable. Fix ε > 0. Then

(∃P1) [L(f, P1) > j − ε/2].

Also

(∃P2) [U(f, P2) < J + ε/2].

Let Q = P1 ∪ P2. Then

j − ε/2 < L(f, P1) ≤ L(f, Q) ≤ U(f, Q) ≤ U(f, P2) < J + ε/2.

Therefore (since J = j)

U(f, Q) − L(f, Q) < ε.

2) Sufficiency. Fix ε > 0. Let P be a partition such that

U(f, P) − L(f, P) < ε.

Note that

J − j ≤ U(f, P) − L(f, P) < ε.

Therefore it follows that

(∀ε > 0) (J − j < ε).

This implies that J = j.


8.3 Integrable functions


The following definition is used in the proofs of the next theorems and will also be used in the subsequent sections.
Definition 8.3.1. Let P be a partition of [a, b]. The length of the greatest subinterval of [a, b] under the partition P is called the norm of the partition P and is denoted by ‖P‖, i.e.

‖P‖ := max_{1≤i≤n} (xi − x_{i−1}).

Theorem 8.3.1. Let f : [a, b] → R be monotone. Then f is Riemann integrable.

Proof. Without loss of generality assume that f is increasing and not constant, so that f(a) < f(b) (the constant case was treated in Example 8.1.2). Fix ε > 0. Let us consider a partition P of [a, b] such that

‖P‖ < δ = ε/(f(b) − f(a)).

For this partition we obtain

U(f, P) − L(f, P) = Σ_{i=1}^n (Mi − mi)(xi − x_{i−1}) = Σ_{i=1}^n (f(xi) − f(x_{i−1}))(xi − x_{i−1})
                 < δ Σ_{i=1}^n (f(xi) − f(x_{i−1})) = δ(f(b) − f(a)) = ε.

Hence f is Riemann integrable by Theorem 8.2.1.

Theorem 8.3.2. Let f : [a, b] → R be continuous. Then f is Riemann integrable.

Proof. Fix ε > 0. Since f is continuous on a closed interval, it is uniformly continuous (see Section 4.4). Therefore for ε/(b − a) there exists δ > 0 such that

(∀x1, x2 ∈ [a, b]) [(|x1 − x2| < δ) ⇒ (|f(x1) − f(x2)| < ε/(b − a))].

Hence for every partition P with norm ‖P‖ < δ we have

U(f, P) − L(f, P) = Σ_{i=1}^n (Mi − mi)(xi − x_{i−1}) < (ε/(b − a)) Σ_{i=1}^n (xi − x_{i−1}) = ε,

and f is Riemann integrable by Theorem 8.2.1.


8.4 Elementary properties of integral


Theorem 8.4.1. Let a < c < b. Let f be integrable on [a, b]. Then f is integrable on [a, c] and on [c, b], and

∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.

Conversely, if f is integrable on [a, c] and on [c, b] then it is integrable on [a, b].

Proof. Suppose that f is integrable on [a, b]. Fix ε > 0. Then there exists a partition
P = {x0 , . . . , xn } of [a, b] such that

U (f, P ) − L(f, P ) < ε.

We can assume that c ∈ P so that c = xj for some j ∈ {0, 1, . . . , n} (otherwise consider


the refinement of P adding the point c). Then P1 = {x0 , . . . , xj } is a partition of [a, c] and
P2 = {xj , . . . , xn } is a partition of [c, b]. Moreover,

L(f, P ) = L(f, P1 ) + L(f, P2 ), U (f, P ) = U (f, P1 ) + U (f, P2 ).

Therefore we have

[U (f, P1 ) − L(f, P1 )] + [U (f, P2 ) − L(f, P2 )] = U (f, P ) − L(f, P ) < ε.

Since each of the terms on the left hand side is non-negative, each one is less than ε, which
proves that f is integrable on [a, c] and on [c, b]. Note also that
L(f, P1) ≤ ∫_a^c f(x) dx ≤ U(f, P1),
L(f, P2) ≤ ∫_c^b f(x) dx ≤ U(f, P2),

so that

L(f, P) ≤ ∫_a^c f(x) dx + ∫_c^b f(x) dx ≤ U(f, P).

This is true for any partition of [a, b]. Therefore

∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.

Now suppose that f is integrable on [a, c] and on [c, b]. Fix ε > 0. Then there exists a
partition P1 of [a, c] such that

U (f, P1 ) − L(f, P1 ) < ε/2.

Also there exists a partition P2 of [c, b] such that

U (f, P2 ) − L(f, P2 ) < ε/2.

Let P = P1 ∪ P2 . Then

U (f, P ) − L(f, P ) = [U (f, P1 ) − L(f, P1 )] + [U (f, P2 ) − L(f, P2 )] < ε.


The integral ∫_a^b f(x) dx was defined only for a < b. We add by definition that

∫_a^a f(x) dx = 0   and   ∫_a^b f(x) dx = − ∫_b^a f(x) dx if a > b.

With this convention we always have that

∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.

Theorem 8.4.2. Let f and g be integrable on [a, b]. Then f + g is also integrable on [a, b] and

∫_a^b [f(x) + g(x)] dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx.

Proof. Let P = {x0, . . . , xn} be a partition of [a, b]. Let

mi = inf{f(x) + g(x) | x_{i−1} ≤ x ≤ xi},
m'_i = inf{f(x) | x_{i−1} ≤ x ≤ xi},
m''_i = inf{g(x) | x_{i−1} ≤ x ≤ xi}.

Define Mi, M'_i, M''_i similarly. The following inequalities hold:

mi ≥ m'_i + m''_i,
Mi ≤ M'_i + M''_i.

Therefore we have

L(f, P) + L(g, P) ≤ L(f + g, P),
U(f + g, P) ≤ U(f, P) + U(g, P).

Hence for any partition P

L(f, P ) + L(g, P ) ≤ L(f + g, P ) ≤ U (f + g, P ) ≤ U (f, P ) + U (g, P ),

or, equivalently,

U (f + g, P ) − L(f + g, P ) ≤ [U (f, P ) − L(f, P )] + [U (g, P ) − L(g, P )].

Fix ε > 0. Since f and g are integrable there are partitions P1 and P2 such that

U (f, P1 ) − L(f, P1 ) < ε/2,


U (g, P2 ) − L(g, P2 ) < ε/2.

Thus for the partition P = P1 ∪ P2 we obtain that

U (f + g, P ) − L(f + g, P ) < ε.

This proves that f + g is integrable on [a, b].


Moreover,

L(f, P) + L(g, P) ≤ L(f + g, P) ≤ ∫_a^b [f(x) + g(x)] dx ≤ U(f + g, P) ≤ U(f, P) + U(g, P),

and

L(f, P) + L(g, P) ≤ ∫_a^b f(x) dx + ∫_a^b g(x) dx ≤ U(f, P) + U(g, P).

Therefore it follows that

∫_a^b [f(x) + g(x)] dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx.

Theorem 8.4.3. Let f be integrable on [a, b]. Then, for any c ∈ R, cf is also integrable on [a, b] and

∫_a^b c f(x) dx = c ∫_a^b f(x) dx.

Proof. The proof is left as an exercise. Consider separately the two cases c ≥ 0 and c ≤ 0.

Theorem 8.4.4. Let f, g be integrable on [a, b] and

(∀x ∈ [a, b]) (f(x) ≤ g(x)).

Then

∫_a^b f(x) dx ≤ ∫_a^b g(x) dx.

Proof. For any partition P of [a, b] we have

L(f, P) ≤ L(g, P) ≤ ∫_a^b g(x) dx.

The assertion follows by taking the supremum over all partitions.

Corollary 8.4.1. Let f be integrable on [a, b] and let M, m ∈ R be such that

(∀x ∈ [a, b]) (m ≤ f(x) ≤ M).

Then

m(b − a) ≤ ∫_a^b f(x) dx ≤ M(b − a).

Corollary 8.4.2. Let f be continuous on [a, b]. Then there exists θ ∈ [a, b] such that

∫_a^b f(x) dx = f(θ)(b − a).


Proof. From Corollary 8.4.1 it follows that

m ≤ (1/(b − a)) ∫_a^b f(x) dx ≤ M,

where m = min_{[a,b]} f(x), M = max_{[a,b]} f(x). Then by the Intermediate Value Theorem we conclude that there exists θ ∈ [a, b] such that

f(θ) = (1/(b − a)) ∫_a^b f(x) dx.

Theorem 8.4.5. Let f be integrable on [a, b]. Then |f| is integrable on [a, b] and

| ∫_a^b f(x) dx | ≤ ∫_a^b |f(x)| dx.

Proof. Note that for any interval [α, β]

(8.4.2)   sup_{[α,β]} |f(x)| − inf_{[α,β]} |f(x)| ≤ sup_{[α,β]} f(x) − inf_{[α,β]} f(x).

Indeed,

(∀x, y ∈ [α, β]) ( f(x) − f(y) ≤ sup_{[α,β]} f(x) − inf_{[α,β]} f(x) ), so that
(∀x, y ∈ [α, β]) ( |f(x)| − |f(y)| ≤ sup_{[α,β]} f(x) − inf_{[α,β]} f(x) ),

which proves (8.4.2) by passing to the supremum in x and y.
It follows from (8.4.2) that for any partition P of [a, b]

U(|f|, P) − L(|f|, P) ≤ U(f, P) − L(f, P),

which proves the integrability of |f| by the criterion of integrability, Theorem 8.2.1. The last assertion follows from Theorem 8.4.4, applied to the inequalities ±f ≤ |f|.
Theorem 8.4.6.¹ Let f : [a, b] → R be integrable and (∀x ∈ [a, b])(m ≤ f(x) ≤ M). Let g : [m, M] → R be continuous. Then h : [a, b] → R defined by h(x) = g(f(x)) is integrable.

Proof. Fix ε > 0. Since g is uniformly continuous on [m, M], there exists δ > 0 such that δ < ε and

(∀t, s ∈ [m, M]) [(|t − s| < δ) ⇒ (|g(t) − g(s)| < ε)].

By integrability of f there exists a partition P = {x0, . . . , xn} of [a, b] such that

(8.4.3)   U(f, P) − L(f, P) < δ².

Let mi = inf_{[x_{i−1},xi]} f(x), Mi = sup_{[x_{i−1},xi]} f(x) and m*_i = inf_{[x_{i−1},xi]} h(x), M*_i = sup_{[x_{i−1},xi]} h(x).
Decompose the set {1, . . . , n} into two subsets: (i ∈ A) ⇔ (Mi − mi < δ) and (i ∈ B) ⇔ (Mi − mi ≥ δ).
For i ∈ A, by the choice of δ we have that M*_i − m*_i ≤ ε.

¹This theorem is outside the syllabus and will not be included in the examination paper.


For i ∈ B we have that M*_i − m*_i ≤ 2K, where K = sup_{t∈[m,M]} |g(t)|. By (8.4.3) we have

δ Σ_{i∈B} (xi − x_{i−1}) ≤ Σ_{i∈B} (Mi − mi)(xi − x_{i−1}) < δ²,

so that Σ_{i∈B} (xi − x_{i−1}) < δ. Therefore

U(h, P) − L(h, P) = Σ_{i∈A} (M*_i − m*_i)(xi − x_{i−1}) + Σ_{i∈B} (M*_i − m*_i)(xi − x_{i−1})
                 ≤ ε(b − a) + 2Kδ < ε[(b − a) + 2K],

which proves the assertion since ε is arbitrary.


Corollary 8.4.3. Let f, g be integrable on [a, b]. Then the product fg is integrable on [a, b].

Proof. Since f + g and f − g are integrable on [a, b], (f + g)² and (f − g)² are integrable on [a, b] by the previous theorem (applied with the continuous function t ↦ t²). Therefore

fg = (1/4)[(f + g)² − (f − g)²] is integrable on [a, b].


8.5 Integration as the inverse to differentiation


Theorem 8.5.1. Let f be integrable on [a, b] and let F be defined on [a, b] by

F(x) = ∫_a^x f(t) dt.

Then F is continuous on [a, b].

Proof. By the definition of integrability f is bounded on [a, b]. Let M = sup_{[a,b]} |f(x)|. Then for x, y ∈ [a, b] we have

|F(x) − F(y)| = | ∫_x^y f(t) dt | ≤ M |x − y|,

which proves that F is uniformly continuous on [a, b].


If in the previous theorem we in addition assume that f is continuous, we can prove more.

Theorem 8.5.2. Let f be integrable on [a, b] and let F be defined on [a, b] by

F(x) = ∫_a^x f(t) dt.

Let f be continuous at c ∈ [a, b]. Then F is differentiable at c and

F'(c) = f(c).

Proof. Let c ∈ (a, b) and let h > 0. Then

(F(c + h) − F(c))/h = (1/h) ∫_c^{c+h} f(t) dt.

By Corollary 8.4.2 there exists θ ∈ [c, c + h] such that

∫_c^{c+h} f(t) dt = f(θ) h.

Hence we have

(F(c + h) − F(c))/h = f(θ).

As h → 0, θ → c, and due to the continuity of f we conclude that lim_{h→0} f(θ) = f(c). The assertion follows. The case h < 0 is similar. The cases c = a and c = b are similar (in these cases one talks about one-sided derivatives only).

Theorem 8.5.3. Let f be continuous on [a, b] and f = g' for some function g defined on [a, b]. Then for x ∈ [a, b]

∫_a^x f(t) dt = g(x) − g(a).


Proof. Let

F(x) = ∫_a^x f(t) dt.

By Theorem 8.5.2 the function F − g is differentiable on [a, b] and F' − g' = (F − g)' = 0. Therefore by Corollary 5.2.2 there is a number c such that

F = g + c.

Since F(a) = 0 we have that g(a) = −c. Thus for x ∈ [a, b]

∫_a^x f(t) dt = F(x) = g(x) − g(a).

The next theorem is often called the Fundamental Theorem of Calculus.

Theorem 8.5.4. Let f be integrable on [a, b] and f = g' for some function g defined on [a, b]. Then

∫_a^b f(x) dx = g(b) − g(a).

Proof. Let P = {x0, . . . , xn} be a partition of [a, b]. By the Mean Value Theorem there exists a point ti ∈ [x_{i−1}, xi] such that

g(xi) − g(x_{i−1}) = g'(ti)(xi − x_{i−1}) = f(ti)(xi − x_{i−1}).

Let

mi = inf_{[x_{i−1},xi]} f(x),   Mi = sup_{[x_{i−1},xi]} f(x).

Then

mi(xi − x_{i−1}) ≤ f(ti)(xi − x_{i−1}) ≤ Mi(xi − x_{i−1}), that is
mi(xi − x_{i−1}) ≤ g(xi) − g(x_{i−1}) ≤ Mi(xi − x_{i−1}).

Adding these inequalities for i = 1, . . . , n we obtain

Σ_{i=1}^n mi(xi − x_{i−1}) ≤ g(b) − g(a) ≤ Σ_{i=1}^n Mi(xi − x_{i−1}),

so that for any partition we have

L(f, P) ≤ g(b) − g(a) ≤ U(f, P),

which means that

g(b) − g(a) = ∫_a^b f(x) dx.
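A Python sketch checking the Fundamental Theorem of Calculus on an example (illustrative; f(x) = cos x with antiderivative g(x) = sin x on [0, 1]):

    from math import sin, cos

    # Integral sums of f = cos on [0, 1] versus g(1) - g(0) with g = sin.
    a, b = 0.0, 1.0
    for n in [10, 100, 1000]:
        h = (b - a) / n
        riemann = sum(cos(a + (i + 0.5) * h) * h for i in range(n))  # midpoints as sample points
        print(n, riemann, sin(b) - sin(a))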


8.6 Integral as the limit of integral sums ²

Let f : [a, b] → R. We defined the integral sums of f in Definition 8.1.3.
Definition 8.6.1. A number A is called the limit of the integral sums σ(f, P, ξ) if

(∀ε > 0)(∃δ > 0)(∀P)(∀ξ) [(‖P‖ < δ) ⇒ (|σ(f, P, ξ) − A| < ε)],

where ξ runs over all choices of points ξi ∈ [x_{i−1}, xi]. In this case we write

lim_{‖P‖→0} σ(f, P, ξ) = A.

The next theorem shows that the Riemann integral can be equivalently defined via the limit of the integral sums.
Theorem 8.6.1. Let f : [a, b] → R. Then f is Riemann integrable if and only if lim_{‖P‖→0} σ(f, P, ξ) exists. In this case

lim_{‖P‖→0} σ(f, P, ξ) = ∫_a^b f(x) dx.

Proof. First, assume that f is Riemann integrable. Then we know that f is bounded, so that there is a constant C such that |f(x)| ≤ C for all x ∈ [a, b]. Fix ε > 0. Then there exists a partition P0 of [a, b] such that

U(f, P0) − L(f, P0) < ε/2.

Let m be the number of points in the partition P0. Choose δ = ε/(8mC). Then for any partition P1 such that ‖P1‖ < δ, and P = P0 ∪ P1, we have

U(f, P1) = U(f, P) + (U(f, P1) − U(f, P))
         ≤ U(f, P0) + (U(f, P1) − U(f, P))
         ≤ U(f, P0) + 2C‖P1‖m < U(f, P0) + ε/4.

Similarly,

L(f, P1) > L(f, P0) − ε/4.

Therefore we get

L(f, P0) − ε/4 < L(f, P1) ≤ U(f, P1) < U(f, P0) + ε/4.

Hence

U(f, P1) − L(f, P1) < ε,

which together with the inequalities

L(f, P1) ≤ ∫_a^b f(x) dx ≤ U(f, P1),
L(f, P1) ≤ σ(f, P1, ξ) ≤ U(f, P1)

leads to

| ∫_a^b f(x) dx − σ(f, P1, ξ) | < ε.

Now suppose that lim_{‖P‖→0} σ(f, P, ξ) = A. Fix ε > 0. Then there exists δ > 0 such that if ‖P‖ < δ then

A − ε/2 < σ(f, P, ξ) < A + ε/2.
²The entire section is not included in the examination paper.


Choose P as above. Varying (ξi), take the sup and inf of σ(f, P, ξ) in the above inequality. We obtain

A − ε/2 ≤ L(f, P) ≤ U(f, P) ≤ A + ε/2.

By the criterion of integrability this shows that f is Riemann integrable, and the above inequalities also show that ∫_a^b f(x) dx = A.

8.7 Improper integrals. Series

Integrals over an infinite interval

Definition 8.7.1. The improper integral ∫_a^∞ f(x) dx is defined as

∫_a^∞ f(x) dx = lim_{A→∞} ∫_a^A f(x) dx.

We use the same terminology for improper integrals as for series, that is, an improper integral may converge or diverge.
Example 8.7.1.

∫_1^∞ (1/x²) dx = 1.

Indeed,

∫_1^A (1/x²) dx = 1 − 1/A → 1 as A → ∞.

Theorem 8.7.2. ∫_1^∞ (1/x^k) dx converges if and only if k > 1.

Integrals of unbounded functions

Example 8.7.3.

∫_0^1 dx/√x = lim_{δ→0+} ∫_δ^1 dx/√x = lim_{δ→0+} (2 − 2√δ) = 2.

The notion of the improper integral is useful for the investigation of convergence of certain series. The following theorem is often called the integral test for convergence of series.

Theorem 8.7.4. Let f : [1, ∞) → R be positive and decreasing. Then the integral ∫_1^∞ f(x) dx and the series Σ_{n=1}^∞ f(n) both converge or both diverge.

Proof. Since f is monotone it is integrable on any finite interval (Theorem 8.3.1). For n − 1 ≤ x ≤ n we have

f(n) ≤ f(x) ≤ f(n − 1).

Integrating the above inequality from n − 1 to n we obtain

f(n) ≤ ∫_{n−1}^n f(x) dx ≤ f(n − 1).

Adding these inequalities over the intervals [1, 2], [2, 3], . . . , [n − 1, n] we have

Σ_{k=2}^n f(k) ≤ ∫_1^n f(x) dx ≤ Σ_{k=1}^{n−1} f(k).

Now the assertion easily follows.
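A Python sketch of these bounds for f(x) = 1/x² (illustrative only), where ∫₁ⁿ x⁻² dx = 1 − 1/n:

    # Check sum_{k=2}^n f(k) <= int_1^n f(x) dx <= sum_{k=1}^{n-1} f(k) for f(x) = 1/x^2.
    def f(x):
        return 1.0 / (x * x)

    n = 1000
    left = sum(f(k) for k in range(2, n + 1))
    integral = 1 - 1 / n
    right = sum(f(k) for k in range(1, n))
    print(left, integral, right)   # left <= integral <= right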

As an application of the above theorem we consider the following examples.

Example 8.7.5. The series

Σ_{n=1}^∞ 1/n^α

converges if α > 1.

Proof. By Theorem 8.7.4 it is enough to prove that

∫_1^∞ dx/x^α < ∞.

Indeed,

∫_1^∞ dx/x^α = lim_{A→∞} ∫_1^A dx/x^α = lim_{A→∞} (1/(α − 1))(1 − 1/A^{α−1}) = 1/(α − 1) < ∞.
Example 8.7.6. The series

Σ_{n=2}^∞ 1/(n (log n)^α)

converges if α > 1 and diverges if α ≤ 1.

Proof. Left as an exercise.

8.8 Constant π
In Chapter 7 we defined the trigonometric functions sin and cos by their series and proved that they are periodic with period 2ϖ. Here we show that ϖ is the same constant as the π known from elementary geometry.
Consider the disc x² + y² ≤ 1, centred at the origin with radius 1. It is known that its area is π, so the area of the upper half-disc is π/2.
The area of the half-disc can be obtained as

∫_{−1}^1 √(1 − x²) dx = ∫_0^ϖ sin² θ dθ = (1/2) ∫_0^ϖ (1 − cos 2θ) dθ = ϖ/2,

where we used the substitution x = cos θ.
Hence ϖ = π.



Further reading

[1] D. J. Velleman, How to prove it. A structural approach, Cambridge University Press, 1994.

[2] I. Stewart and D. Tall, The Foundations of Mathematics, Oxford University Press, 1977.

[3] S. Krantz, Real Analysis and Foundations, CRC Press, 1991.

[4] J. C. Burkill, A first course in Mathematical Analysis, Cambridge University Press, 1962.

[5] G. H. Hardy, A course of Pure Mathematics, Cambridge University Press, 1908.

