
8: DIFFERENTIATION IN SEVERAL VARIABLES

STEVEN HEILMAN

Date: December 22, 2014.

Contents
1. Review
2. Introduction
3. Differentiation in multiple variables
4. Partial and Directional Derivatives
5. The Chain Rule in Several Variables
6. Iterated Derivatives and Clairaut's Theorem
7. Appendix: Notation
1. Review
Definition 1.1 (Derivative on the real line). Let $E$ be a subset of $\mathbb{R}$, let $x_0$ be a limit point of $E$, and let $f \colon E \to \mathbb{R}$. If the limit
$$\lim_{x \to x_0;\, x \in E \setminus \{x_0\}} \frac{f(x) - f(x_0)}{x - x_0}$$
exists and converges to a real number $L \in \mathbb{R}$, then we write $f'(x_0) = L$ and we say that $f$ is differentiable at $x_0$. If this limit does not exist, then we say that $f$ is not differentiable at $x_0$.
Lemma 1.2. Let $E$ be a subset of $\mathbb{R}$, let $f \colon E \to \mathbb{R}$, let $x_0 \in E$, and let $L \in \mathbb{R}$. Then the following two statements are equivalent.
• $f$ is differentiable at $x_0$ and $f'(x_0) = L$.
• We have $\lim_{x \to x_0;\, x \in E \setminus \{x_0\}} \dfrac{|f(x) - (f(x_0) + L(x - x_0))|}{|x - x_0|} = 0$.
Definition 1.3. Let $n$ be a positive integer. Let $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$. We define the $\ell^2$ norm $\|x\|$ of $x$ by
$$\|x\| = \|(x_1, \ldots, x_n)\| := \left( \sum_{i=1}^n x_i^2 \right)^{1/2}.$$
Let $y = (y_1, \ldots, y_n) \in \mathbb{R}^n$. We define the standard inner product $\langle \cdot, \cdot \rangle$ on $\mathbb{R}^n$ by
$$\langle x, y \rangle := \sum_{i=1}^n x_i y_i.$$
So, $\|x\| = \sqrt{\langle x, x \rangle}$. We also denote the standard basis vectors $e_1, \ldots, e_n$, so that
$$e_1 = (1, 0, \ldots, 0), \qquad e_2 = (0, 1, 0, \ldots, 0), \qquad \ldots, \qquad e_n = (0, \ldots, 0, 1).$$

Definition 1.4. Let $n, m$ be positive integers. A linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$ is a function $L \colon \mathbb{R}^n \to \mathbb{R}^m$ which satisfies the following properties.
• For all $x, y \in \mathbb{R}^n$, we have $L(x + y) = L(x) + L(y)$.
• For all $x \in \mathbb{R}^n$ and for all $\lambda \in \mathbb{R}$, we have $L(\lambda x) = \lambda L(x)$.

Remark 1.5. Given a linear transformation $L \colon \mathbb{R}^n \to \mathbb{R}^m$, there exists an $m \times n$ matrix $A$ (that is, a matrix $A$ with $m$ rows and $n$ columns) such that
$$L(x) = Ax, \qquad \forall\, x \in \mathbb{R}^n.$$
Conversely, given an $m \times n$ matrix $A$, the function $L \colon \mathbb{R}^n \to \mathbb{R}^m$ defined by $L(x) := Ax$ for all $x \in \mathbb{R}^n$ is a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$. So, on Euclidean spaces, the notions of matrices and linear transformations are interchangeable.
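For readers who like to compute, here is a minimal sketch (in Python, assuming the NumPy library; the matrix $A$ below is a hypothetical example) checking the two defining properties of a linear transformation for the map $x \mapsto Ax$ of Remark 1.5.

```python
import numpy as np

# Hypothetical 2x3 matrix A, so L maps R^3 to R^2.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])

def L(x):
    # The linear transformation L(x) = Ax from Remark 1.5.
    return A @ x

x = np.array([1.0, 1.0, 2.0])
y = np.array([0.5, -2.0, 1.0])
c = 3.0
print(np.allclose(L(x + y), L(x) + L(y)))  # additivity
print(np.allclose(L(c * x), c * L(x)))     # homogeneity
```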
2. Introduction
Our final topic in this course will be differentiation in several variables. Here the theory somewhat resembles the theory of differentiation in one variable; however, there are many key differences. The first obstacle we need to overcome is simply to define the derivative in the higher-dimensional setting. We therefore begin with this task.
3. Differentiation in multiple variables
Let $n, m$ be positive integers, and let $f \colon \mathbb{R}^n \to \mathbb{R}^m$. In order to define the derivative of $f$, we cannot simply copy and paste Definition 1.1, since we would need to let $x \in \mathbb{R}^n$ and then divide by $x$, which is meaningless unless $n = 1$. We instead use the equivalent definition within Lemma 1.2. In this case, we can successfully define differentiation by replacing the absolute values by the appropriate norms, and by replacing $L$ by a linear map.
Definition 3.1 (Derivatives in multiple variables). Let $E$ be a subset of $\mathbb{R}^n$, let $f \colon E \to \mathbb{R}^m$ be a function, let $x_0 \in E$, and let $L \colon \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. We say that $f$ is differentiable at $x_0$ with derivative $L$ if and only if we have
$$\lim_{x \to x_0;\, x \in E \setminus \{x_0\}} \frac{\|f(x) - (f(x_0) + L(x - x_0))\|}{\|x - x_0\|} = 0.$$
Example 3.2. Let $f \colon \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $f(x_1, x_2) = (x_1^2, x_2^2)$. Define the linear transformation $L \colon \mathbb{R}^2 \to \mathbb{R}^2$ by $L(x_1, x_2) := (2x_1, 4x_2)$. We will show that $L$ is the derivative of $f$ at the point $x_0 = (1, 2)$. We want to show that
$$\lim_{x \to (1,2);\, x \neq (1,2)} \frac{\|f(x) - (f(1,2) + L(x - (1,2)))\|}{\|x - (1,2)\|} = 0.$$
Now, note that
$$f(x) - (f(1,2) + L(x - (1,2))) = (x_1^2, x_2^2) - ((1,4) + (2x_1, 4x_2) - (2,8)) = (x_1^2, x_2^2) - (2x_1 - 1,\, 4x_2 - 4) = ((x_1 - 1)^2, (x_2 - 2)^2).$$
So, using the triangle inequality,
$$\|f(x) - (f(1,2) + L(x - (1,2)))\| \leq \|((x_1 - 1)^2, 0)\| + \|(0, (x_2 - 2)^2)\| = (x_1 - 1)^2 + (x_2 - 2)^2.$$
In conclusion,
$$0 \leq \lim_{x \to (1,2);\, x \neq (1,2)} \frac{(x_1 - 1)^2 + (x_2 - 2)^2}{\sqrt{(x_1 - 1)^2 + (x_2 - 2)^2}} = \lim_{x \to (1,2);\, x \neq (1,2)} \sqrt{(x_1 - 1)^2 + (x_2 - 2)^2} = 0.$$
So, we have proven our desired statement.
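The limit in Example 3.2 can also be checked numerically. The following minimal sketch (Python with NumPy; the approach direction is an arbitrary choice, and this is an illustration, not part of the proof) shows the error ratio shrinking as $x \to (1,2)$.

```python
import numpy as np

def f(x):
    # f(x1, x2) = (x1^2, x2^2), as in Example 3.2
    return np.array([x[0]**2, x[1]**2])

def L(v):
    # candidate derivative at x0 = (1, 2): L(v1, v2) = (2 v1, 4 v2)
    return np.array([2.0 * v[0], 4.0 * v[1]])

x0 = np.array([1.0, 2.0])
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)  # any fixed unit vector
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    x = x0 + t * direction
    ratio = np.linalg.norm(f(x) - (f(x0) + L(x - x0))) / np.linalg.norm(x - x0)
    print(t, ratio)  # the ratio is on the order of ||x - x0||, so it tends to 0
```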

The following lemma shows that a function can have at most one derivative at an interior
point of E.
Lemma 3.3. Let $E$ be a subset of $\mathbb{R}^n$, let $f \colon E \to \mathbb{R}^m$ be a function, and let $x_0$ be an interior point of $E$. Let $L_a \colon \mathbb{R}^n \to \mathbb{R}^m$ and $L_b \colon \mathbb{R}^n \to \mathbb{R}^m$ be linear transformations. Suppose $f$ is differentiable at $x_0$ with derivative $L_a$, and $f$ is differentiable at $x_0$ with derivative $L_b$. Then $L_a = L_b$.
Exercise 3.4. Prove Lemma 3.3. (Hint: argue by contradiction. Assume that $L_a \neq L_b$. Then there exists a nonzero vector $v \in \mathbb{R}^n$ such that $L_a v \neq L_b v$. Then, apply the definition of the derivative, and try to specialize to the case where $x = x_0 + tv$ for some scalar $t$, in order to obtain a contradiction.)
Using Lemma 3.3, we can now talk about the derivative of $f$ at interior points $x_0$, and we will label this derivative as $f'(x_0)$. That is, if $x_0$ is an interior point of $E$, then $f'(x_0)$ is the unique linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$ such that
$$\lim_{x \to x_0;\, x \in E \setminus \{x_0\}} \frac{\|f(x) - (f(x_0) + f'(x_0)(x - x_0))\|}{\|x - x_0\|} = 0.$$
Informally, we therefore have Newton's approximation:
$$f(x) \approx f(x_0) + f'(x_0)(x - x_0).$$
Remark 3.5. We sometimes refer to $f'(x_0)$ as the total derivative of $f$, to distinguish $f'(x_0)$ from the related directional and partial derivatives.
4. Partial and Directional Derivatives
We now relate the total derivative to the partial and directional derivatives. Let n, m be
positive integers.
Definition 4.1. Let $E$ be a subset of $\mathbb{R}^n$, let $f \colon E \to \mathbb{R}^m$ be a function, let $x_0$ be an interior point of $E$, let $v \in \mathbb{R}^n$, and let $t$ be a real number. If the limit
$$\lim_{t \to 0;\, t \neq 0,\, x_0 + tv \in E} \frac{f(x_0 + tv) - f(x_0)}{t}$$
exists, we say that $f$ is differentiable in the direction $v$ at $x_0$, and we denote this limit by $D_v f(x_0)$:
$$D_v f(x_0) := \lim_{t \to 0;\, t \neq 0,\, x_0 + tv \in E} \frac{f(x_0 + tv) - f(x_0)}{t}.$$
Equivalently, we have
$$D_v f(x_0) := \frac{d}{dt} f(x_0 + tv)\Big|_{t=0}.$$

Note that in this definition we are dividing by the scalar $t$, so this division is okay, and $D_v f(x_0) \in \mathbb{R}^m$.
Example 4.2. Let $f \colon \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $f(x_1, x_2) = (x_1^2, x_2^2)$. Let $x_0 := (1, 2)$ and let $v := (3, 4)$. We then compute
$$\frac{f(x_0 + tv) - f(x_0)}{t} = \frac{((1 + 3t)^2, (2 + 4t)^2) - (1, 4)}{t} = \frac{(1 + 6t + 9t^2,\, 4 + 16t + 16t^2) - (1, 4)}{t} = (6 + 9t,\, 16 + 16t).$$
Therefore,
$$D_v f(x_0) = \lim_{t \to 0;\, t \neq 0} (6 + 9t,\, 16 + 16t) = (6, 16).$$
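A quick numerical check of Example 4.2 (a minimal sketch assuming NumPy; not a proof): the difference quotient from Definition 4.1 approaches $(6, 16)$ as $t \to 0$.

```python
import numpy as np

def f(x):
    return np.array([x[0]**2, x[1]**2])

x0 = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])
for t in [1e-1, 1e-3, 1e-5]:
    print(t, (f(x0 + t * v) - f(x0)) / t)  # tends to (6, 16)
```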

If $v$ is a standard basis vector $e_j$, then we write $\frac{\partial f}{\partial x_j}(x_0)$ or $\frac{\partial}{\partial x_j} f(x_0)$ for $D_{e_j} f(x_0)$. We refer to $\frac{\partial f}{\partial x_j}(x_0)$ as the partial derivative of $f$ with respect to $x_j$. So,
$$\frac{\partial f}{\partial x_j}(x_0) := \lim_{t \to 0;\, t \neq 0,\, x_0 + te_j \in E} \frac{f(x_0 + te_j) - f(x_0)}{t} = \frac{d}{dt} f(x_0 + te_j)\Big|_{t=0}.$$
Note that if $f \colon E \to \mathbb{R}^m$, then $\frac{\partial f}{\partial x_j} \in \mathbb{R}^m$. And if we write $f$ in its components as $f = (f_1, \ldots, f_m)$, then
$$\frac{\partial f}{\partial x_j}(x_0) = \left( \frac{\partial f_1}{\partial x_j}(x_0), \ldots, \frac{\partial f_m}{\partial x_j}(x_0) \right).$$
The total derivative and directional derivative are related in the following way.

Lemma 4.3. Let $E$ be a subset of $\mathbb{R}^n$, let $f \colon E \to \mathbb{R}^m$ be a function, let $x_0$ be an interior point of $E$, and let $v \in \mathbb{R}^n$. If $f$ is differentiable at $x_0$, then $f$ is also differentiable in the direction $v$ at $x_0$, and
$$D_v f(x_0) = f'(x_0)v.$$
Exercise 4.4. Prove Lemma 4.3.
From Lemma 4.3, total differentiability implies directional differentiability. Unfortunately,
the converse is false.
Exercise 4.5. Define $f \colon \mathbb{R}^2 \to \mathbb{R}$ by $f(x, y) := x^3/(x^2 + y^2)$ when $(x, y) \neq (0, 0)$, and $f(0, 0) := 0$. Show that for any $v \in \mathbb{R}^2$, $f$ is differentiable at $(0, 0)$ in the direction $v$. However, show that $f$ is not differentiable at $(0, 0)$.
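The following sketch (Python, assuming NumPy; an illustration only, not a solution) computes difference quotients for the directional derivatives of this $f$ at $(0,0)$. Notice that $v \mapsto D_v f(0,0)$ fails to be linear in $v$, which is one way to reconcile this example with Lemma 4.3.

```python
import numpy as np

def f(p):
    x, y = p
    return 0.0 if (x, y) == (0.0, 0.0) else x**3 / (x**2 + y**2)

def directional_derivative_at_origin(v, t=1e-8):
    # difference quotient (f(0 + t v) - f(0)) / t from Definition 4.1
    return (f(t * np.asarray(v)) - f((0.0, 0.0))) / t

for v in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]:
    print(v, directional_derivative_at_origin(v))
# D_{(1,0)} f(0,0) = 1 and D_{(0,1)} f(0,0) = 0, but D_{(1,1)} f(0,0) = 1/2 != 1 + 0,
# so v -> D_v f(0,0) is not linear, and f cannot be differentiable at (0,0).
```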
Remark 4.6. From Lemma 4.3, if $E \subseteq \mathbb{R}^n$ and if $f \colon E \to \mathbb{R}^m$ is differentiable at $x_0 \in E$, then all partial derivatives $\frac{\partial f}{\partial x_j}$ exist at $x_0$, for all $j \in \{1, \ldots, n\}$, and
$$\frac{\partial f}{\partial x_j}(x_0) = f'(x_0)e_j, \qquad \forall\, j \in \{1, \ldots, n\}.$$
Also, given $v = (v_1, \ldots, v_n) = \sum_{j=1}^n v_j e_j \in \mathbb{R}^n$, we have
$$D_v f(x_0) = f'(x_0) \sum_{j=1}^n v_j e_j = \sum_{j=1}^n v_j f'(x_0) e_j = \sum_{j=1}^n v_j \frac{\partial f}{\partial x_j}(x_0). \qquad (*)$$

From Exercise 4.5, partial differentiability does not imply differentiability. However, if the partial derivatives of a function are continuous, then partial differentiability does imply differentiability. We will use equation $(*)$ to prove this assertion.

Theorem 4.7. Let $E$ be a subset of $\mathbb{R}^n$, let $f \colon E \to \mathbb{R}^m$ be a function, let $F$ be a subset of $E$, and let $x_0$ be an interior point of $F$. If the partial derivatives $\frac{\partial f}{\partial x_j}$ exist on $F$ and are continuous at $x_0$ for all $j \in \{1, \ldots, n\}$, then $f$ is differentiable at $x_0$. Moreover, $f'(x_0) \colon \mathbb{R}^n \to \mathbb{R}^m$ is defined by
$$f'(x_0)(v_1, \ldots, v_n) = \sum_{j=1}^n v_j \frac{\partial f}{\partial x_j}(x_0).$$
Proof. Define a linear transformation $L \colon \mathbb{R}^n \to \mathbb{R}^m$ by
$$L(v_1, \ldots, v_n) := \sum_{j=1}^n v_j \frac{\partial f}{\partial x_j}(x_0).$$
We need to show that
$$\lim_{x \to x_0;\, x \in E \setminus \{x_0\}} \frac{\|f(x) - (f(x_0) + L(x - x_0))\|}{\|x - x_0\|} = 0.$$
Let $\varepsilon > 0$. We will find $\delta > 0$ such that, if $x$ satisfies $0 < \|x - x_0\| < \delta$, then
$$\frac{\|f(x) - (f(x_0) + L(x - x_0))\|}{\|x - x_0\|} < \varepsilon.$$
That is, we will show that, if $x$ satisfies $0 < \|x - x_0\| < \delta$, then
$$\|f(x) - (f(x_0) + L(x - x_0))\| < \varepsilon \|x - x_0\|.$$
Since $x_0$ is an interior point of $F$, there exists $r > 0$ such that $B(x_0, r) \subseteq F$. Since the partial derivative $\frac{\partial f}{\partial x_j}$ is continuous on $F$ for each $j \in \{1, \ldots, n\}$, there exists $0 < \delta_j < r$ such that $\|\frac{\partial f}{\partial x_j}(x) - \frac{\partial f}{\partial x_j}(x_0)\| < \varepsilon/(nm)$ for every $x \in B(x_0, \delta_j)$, for every $j \in \{1, \ldots, n\}$. Define $\delta := \min_{j=1,\ldots,n} \delta_j$. Then $\|\frac{\partial f}{\partial x_j}(x) - \frac{\partial f}{\partial x_j}(x_0)\| < \varepsilon/(nm)$ for every $x \in B(x_0, \delta)$, for every $j \in \{1, \ldots, n\}$.

Let $x \in B(x_0, \delta)$, and write $x = x_0 + v_1 e_1 + \cdots + v_n e_n$ for some scalars $v_1, \ldots, v_n$. Note that
$$\|x - x_0\| = \sqrt{v_1^2 + \cdots + v_n^2}.$$
In particular, we have $|v_j| \leq \|x - x_0\|$ for all $j \in \{1, \ldots, n\}$. Recall that we need to show
$$\Big\|f(x_0 + v_1 e_1 + \cdots + v_n e_n) - f(x_0) - \sum_{j=1}^n v_j \frac{\partial f}{\partial x_j}(x_0)\Big\| < \varepsilon \|x - x_0\|.$$
Write $f$ in its components as $f = (f_1, \ldots, f_m)$, so that $f_i \colon E \to \mathbb{R}$ for all $i \in \{1, \ldots, m\}$. Applying the Mean Value Theorem in the first variable, there exists a real number $t_i$ between $0$ and $v_1$ such that
$$f_i(x_0 + v_1 e_1) - f_i(x_0) = \frac{\partial f_i}{\partial x_1}(x_0 + t_i e_1)\, v_1.$$
Note that, for all $i \in \{1, \ldots, m\}$, we have
$$\Big|\frac{\partial f_i}{\partial x_1}(x_0 + t_i e_1) - \frac{\partial f_i}{\partial x_1}(x_0)\Big| \leq \Big\|\frac{\partial f}{\partial x_1}(x_0 + t_i e_1) - \frac{\partial f}{\partial x_1}(x_0)\Big\| \leq \varepsilon/(nm).$$
Therefore,
$$\Big|f_i(x_0 + v_1 e_1) - f_i(x_0) - \frac{\partial f_i}{\partial x_1}(x_0)\, v_1\Big| \leq |v_1|\, \varepsilon/(nm).$$
Summing this inequality over $i \in \{1, \ldots, m\}$ and using $\|(y_1, \ldots, y_m)\| \leq |y_1| + \cdots + |y_m|$, we have
$$\Big\|f(x_0 + v_1 e_1) - f(x_0) - \frac{\partial f}{\partial x_1}(x_0)\, v_1\Big\| \leq |v_1|\, \varepsilon/n \leq \varepsilon \|x - x_0\|/n.$$
In the last inequality, we used $|v_1| \leq \|x - x_0\|$. Using a similar argument, we conclude that
$$\Big\|f(x_0 + v_1 e_1 + v_2 e_2) - f(x_0 + v_1 e_1) - \frac{\partial f}{\partial x_2}(x_0)\, v_2\Big\| \leq \varepsilon \|x - x_0\|/n.$$
And so on, until we get
$$\Big\|f(x_0 + v_1 e_1 + \cdots + v_n e_n) - f(x_0 + v_1 e_1 + \cdots + v_{n-1} e_{n-1}) - \frac{\partial f}{\partial x_n}(x_0)\, v_n\Big\| \leq \varepsilon \|x - x_0\|/n.$$
Summing these $n$ inequalities and using the triangle inequality $\|x + y\| \leq \|x\| + \|y\|$, we get a telescoping sum which finally gives
$$\Big\|f(x_0 + v_1 e_1 + \cdots + v_n e_n) - f(x_0) - \sum_{j=1}^n v_j \frac{\partial f}{\partial x_j}(x_0)\Big\| < \varepsilon \|x - x_0\|. \qquad \square$$


From Theorem 4.7 and Lemma 4.3, if the partial derivatives of a function $f \colon E \to \mathbb{R}^m$ exist and are continuous on a set $F$, then all directional derivatives of $f$ exist at every interior point $x_0$ of $F$, and
$$D_{(v_1, \ldots, v_n)} f(x_0) = \sum_{j=1}^n v_j \frac{\partial f}{\partial x_j}(x_0).$$
In particular, if $f \colon E \to \mathbb{R}$ is a real-valued function, and if we define the gradient $\nabla f(x_0)$ of $f$ at $x_0$ to be the $n$-dimensional row vector
$$\nabla f(x_0) := \left( \frac{\partial f}{\partial x_1}(x_0), \ldots, \frac{\partial f}{\partial x_n}(x_0) \right),$$
then we have the formula
$$D_v f(x_0) = \langle \nabla f(x_0), v \rangle.$$
More generally, if $f \colon E \to \mathbb{R}^m$ is a function with $f = (f_1, \ldots, f_m)$, and $x_0$ is in the interior of the region where the partial derivatives of $f$ exist and are continuous, then Theorem 4.7 says
$$f'(x_0)(v_1, \ldots, v_n) = \sum_{j=1}^n v_j \frac{\partial f}{\partial x_j}(x_0) = \left( \sum_{j=1}^n v_j \frac{\partial f_i}{\partial x_j}(x_0) \right)_{i=1}^m.$$

So, if we define the matrix
$$Df(x_0) := \left( \frac{\partial f_i}{\partial x_j}(x_0) \right)_{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}} = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x_0) & \frac{\partial f_1}{\partial x_2}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\ \frac{\partial f_2}{\partial x_1}(x_0) & \frac{\partial f_2}{\partial x_2}(x_0) & \cdots & \frac{\partial f_2}{\partial x_n}(x_0) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x_0) & \frac{\partial f_m}{\partial x_2}(x_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0) \end{pmatrix},$$
then we have
$$D_v f(x_0) = f'(x_0)v = Df(x_0)\, v.$$
The matrix $Df(x_0)$ is sometimes called the derivative or the differential of $f$ at $x_0$. We still wish to distinguish the matrix $Df(x_0)$ from the linear transformation $f'(x_0)$, since the latter is defined in a way which does not depend on the chosen basis of Euclidean space.
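As a concrete illustration (a minimal sketch in Python with NumPy, reusing $f$ from Examples 3.2 and 4.2; the step size $h$ is an arbitrary choice), one can approximate the matrix $Df(x_0)$ column by column with difference quotients and then recover $D_v f(x_0)$ as $Df(x_0)v$.

```python
import numpy as np

def f(x):
    # f(x1, x2) = (x1^2, x2^2), as in Examples 3.2 and 4.2
    return np.array([x[0]**2, x[1]**2])

def jacobian(f, x0, h=1e-6):
    # approximate Df(x0): column j is the difference quotient (f(x0 + h e_j) - f(x0)) / h
    x0 = np.asarray(x0, dtype=float)
    cols = []
    for j in range(len(x0)):
        e_j = np.zeros_like(x0)
        e_j[j] = 1.0
        cols.append((f(x0 + h * e_j) - f(x0)) / h)
    return np.column_stack(cols)

x0 = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])
Df = jacobian(f, x0)
print(Df)      # approximately [[2, 0], [0, 4]]
print(Df @ v)  # approximately (6, 16) = D_v f(x0), as computed in Example 4.2
```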
5. The Chain Rule in Several Variables
Let $n, m, p$ be positive integers. Recall that if $f \colon X \to Y$ and $g \colon Y \to Z$ are functions, then the composition $g \circ f \colon X \to Z$ is defined by $(g \circ f)(x) := g(f(x))$, for all $x \in X$.
Theorem 5.1 (The Chain Rule in Multiple Variables). Let $E$ be a subset of $\mathbb{R}^n$, let $F$ be a subset of $\mathbb{R}^m$, let $f \colon E \to F$ be a function, and let $g \colon F \to \mathbb{R}^p$. Let $x_0$ be a point in the interior of $E$. Assume that $f$ is differentiable at $x_0$ and that $f(x_0)$ is in the interior of $F$. Assume also that $g$ is differentiable at $f(x_0)$. Then $g \circ f \colon E \to \mathbb{R}^p$ is also differentiable at $x_0$, and
$$(g \circ f)'(x_0) = g'(f(x_0))\, f'(x_0).$$
Remark 5.2. We can intuitively think of the chain rule as follows. From Newton's approximation, we have
$$f(x) - f(x_0) \approx f'(x_0)(x - x_0).$$
Also, using Newton's approximation again,
$$g(f(x)) - g(f(x_0)) \approx g'(f(x_0))(f(x) - f(x_0)).$$
So, combining these two approximations, we have
$$g(f(x)) - g(f(x_0)) \approx g'(f(x_0))\, f'(x_0)(x - x_0).$$
That is, $(g \circ f)'(x_0) = g'(f(x_0))\, f'(x_0)$. The rigorous version of this proof irons out the details inherent in Newton's approximation.
Exercise 5.3.
• Let $L \colon \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. Show that there exists a real number $M > 0$ such that $\|Lx\| \leq M \|x\|$ for all $x \in \mathbb{R}^n$. (Hint: first, using Remark 1.5, write $L$ in terms of a matrix $A$. Then, set $M$ to be equal to the sum of the absolute values of the entries of $A$. Use the triangle inequality a lot. There are many different ways to do this exercise, some of which use a different value of $M$. For example, you could try using the Cauchy-Schwarz inequality.) In particular, conclude that any linear transformation $L \colon \mathbb{R}^n \to \mathbb{R}^m$ is continuous. (A numerical illustration of this bound appears after this exercise.)
• Let $E$ be a subset of $\mathbb{R}^n$. Assume that $f \colon E \to \mathbb{R}^m$ is differentiable at an interior point $x_0$ of $E$. Show that $f$ is also continuous at $x_0$.
• Prove Theorem 5.1. (Hint: it may be helpful to review the proof of the single variable chain rule. It is probably easiest to use the sequence definition of a limit.)
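A numerical illustration of the bound in the first part of Exercise 5.3 (a minimal sketch assuming NumPy; the matrix is randomly generated, and $M$ is taken to be the sum of absolute values of the entries, as in the hint):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))  # a random 3x4 matrix, so L : R^4 -> R^3
M = np.abs(A).sum()          # the constant suggested in the hint

for _ in range(5):
    x = rng.normal(size=4)
    # the bound ||Ax|| <= M ||x|| holds for every sample
    print(np.linalg.norm(A @ x) <= M * np.linalg.norm(x))
```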
Example 5.4. Suppose $f \colon \mathbb{R}^n \to \mathbb{R}^m$ is a differentiable function, and $x_j \colon \mathbb{R} \to \mathbb{R}$ are differentiable functions for all $j \in \{1, \ldots, n\}$. Then
$$\frac{d}{dt} f(x_1(t), \ldots, x_n(t)) = \sum_{j=1}^n x_j'(t)\, \frac{\partial f}{\partial x_j}(x_1(t), \ldots, x_n(t)).$$
This follows from the chain rule.
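A minimal numerical check of this formula (Python with NumPy; the particular $f$, $x_1(t)$, $x_2(t)$ below are hypothetical choices, not taken from the text):

```python
import numpy as np

# hypothetical example: f(x1, x2) = x1^2 * x2, with x1(t) = cos(t) and x2(t) = t^2
def composite(t):
    return np.cos(t)**2 * t**2

def chain_rule_rhs(t):
    x1, x2 = np.cos(t), t**2
    dx1, dx2 = -np.sin(t), 2.0 * t
    df_dx1, df_dx2 = 2.0 * x1 * x2, x1**2   # partial derivatives of f
    return dx1 * df_dx1 + dx2 * df_dx2

t, h = 0.7, 1e-6
lhs = (composite(t + h) - composite(t - h)) / (2.0 * h)  # centered difference for d/dt
print(lhs, chain_rule_rhs(t))  # the two values agree to several digits
```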

6. Iterated Derivatives and Clairaut's Theorem


We now investigate what happens when we differentiate a function twice, in two different
directions.
Definition 6.1. Let $E$ be a subset of $\mathbb{R}^n$, and let $f \colon E \to \mathbb{R}^m$ be a function. We say that $f$ is continuously differentiable if and only if the partial derivatives $\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}$ exist and are continuous on $E$. We say that $f$ is twice continuously differentiable if and only if it is continuously differentiable, and the partial derivatives $\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}$ are themselves continuously differentiable.

Continuously differentiable functions are sometimes called $C^1$ functions. Twice continuously differentiable functions are sometimes called $C^2$ functions. One can also define $C^3$ functions, $C^4$ functions, etc., but we will not do so here.

Let $f \colon \mathbb{R}^2 \to \mathbb{R}$. As you may have learned, it is often true that $\frac{\partial}{\partial x_1} \frac{\partial}{\partial x_2} f = \frac{\partial}{\partial x_2} \frac{\partial}{\partial x_1} f$. Unfortunately, this equality does not always hold.
Exercise 6.2. Define $f \colon \mathbb{R}^2 \to \mathbb{R}$ by $f(x, y) := (x^3 y)/(x^2 + y^2)$ when $(x, y) \neq (0, 0)$, and $f(0, 0) := 0$. Show that $f$ is continuously differentiable, and the double derivatives $\frac{\partial}{\partial x_1} \frac{\partial}{\partial x_2} f$ and $\frac{\partial}{\partial x_2} \frac{\partial}{\partial x_1} f$ exist, but these derivatives are not equal at $(0, 0)$.
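The following sketch (plain Python; an illustration of the phenomenon, not a solution to the exercise, and the step sizes are arbitrary choices) approximates the two iterated derivatives at the origin by nested difference quotients and obtains different values.

```python
def f(x, y):
    # the function from Exercise 6.2
    return 0.0 if x == 0.0 and y == 0.0 else x**3 * y / (x**2 + y**2)

def df_dx(x, y, k=1e-6):
    return (f(x + k, y) - f(x - k, y)) / (2.0 * k)

def df_dy(x, y, k=1e-6):
    return (f(x, y + k) - f(x, y - k)) / (2.0 * k)

h = 1e-3  # outer step, much larger than the inner step k
dxdy = (df_dy(h, 0.0) - df_dy(-h, 0.0)) / (2.0 * h)  # approximates (d/dx)(d/dy) f at (0,0)
dydx = (df_dx(0.0, h) - df_dx(0.0, -h)) / (2.0 * h)  # approximates (d/dy)(d/dx) f at (0,0)
print(dxdy, dydx)  # approximately 1 and 0: the iterated derivatives differ at the origin
```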
Thankfully, if f is twice continuously differentiable, then the order of differentiation does
not matter.
Theorem 6.3 (Clairaut's Theorem). Let $E$ be an open subset of $\mathbb{R}^n$, and let $f \colon E \to \mathbb{R}^m$ be a twice continuously differentiable function. Then, for all $1 \leq i, j \leq n$ and for all interior points $x_0$ of $E$, we have $\frac{\partial}{\partial x_i} \frac{\partial}{\partial x_j} f(x_0) = \frac{\partial}{\partial x_j} \frac{\partial}{\partial x_i} f(x_0)$.
Proof. The claim is certainly true for $i = j$, so assume that $i \neq j$. By replacing $f$ with the translated function $x \mapsto f(x_0 + x)$ as necessary, we may assume that $x_0 = 0$.

Define $a := \frac{\partial}{\partial x_i} \frac{\partial}{\partial x_j} f(x_0)$ and define $a' := \frac{\partial}{\partial x_j} \frac{\partial}{\partial x_i} f(x_0)$. We need to show that $a = a'$. Let $\varepsilon > 0$. Since $f$ is twice continuously differentiable, there exists $\delta > 0$ such that, for all $x$ with $\|x\| < 2\delta$, we have
$$\Big\|\frac{\partial}{\partial x_i} \frac{\partial}{\partial x_j} f(x) - a\Big\| < \varepsilon, \qquad \Big\|\frac{\partial}{\partial x_j} \frac{\partial}{\partial x_i} f(x) - a'\Big\| < \varepsilon.$$
Define
$$M := f(\delta e_i + \delta e_j) - f(\delta e_i) - f(\delta e_j) + f(0).$$
Applying the Fundamental Theorem of Calculus in the $e_i$ variable, we have
$$f(\delta e_i + \delta e_j) - f(\delta e_j) = \int_0^\delta \frac{\partial f}{\partial x_i}(x_i e_i + \delta e_j)\, dx_i.$$
And
$$f(\delta e_i) - f(0) = \int_0^\delta \frac{\partial f}{\partial x_i}(x_i e_i)\, dx_i.$$
Therefore,
$$M = \int_0^\delta \Big( \frac{\partial f}{\partial x_i}(x_i e_i + \delta e_j) - \frac{\partial f}{\partial x_i}(x_i e_i) \Big)\, dx_i.$$
For each $x_i \in (0, \delta)$, there exists $x_j \in [0, \delta]$ such that, by the Mean Value Theorem, we have
$$\frac{\partial f}{\partial x_i}(x_i e_i + \delta e_j) - \frac{\partial f}{\partial x_i}(x_i e_i) = \delta\, \frac{\partial}{\partial x_j} \frac{\partial f}{\partial x_i}(x_i e_i + x_j e_j).$$
By our choice of $\delta$ (noting that $\|x_i e_i + x_j e_j\| < 2\delta$), we therefore have
$$\Big\|\frac{\partial f}{\partial x_i}(x_i e_i + \delta e_j) - \frac{\partial f}{\partial x_i}(x_i e_i) - \delta a'\Big\| < \delta \varepsilon.$$
So, integrating this inequality over $x_i \in [0, \delta]$, we get
$$\|M - \delta^2 a'\| < \delta^2 \varepsilon.$$
We can run this same argument with the roles of $i$ and $j$ reversed (noting that $M$ is symmetric in $i, j$) to get
$$\|M - \delta^2 a\| < \delta^2 \varepsilon.$$
So, from the triangle inequality, we conclude that
$$\|a - a'\| < 2\varepsilon.$$
Since this inequality holds for all $\varepsilon > 0$, we conclude that $a = a'$, as desired. $\square$

7. Appendix: Notation
Let $A, B$ be sets in a space $X$. Let $m, n$ be nonnegative integers.
• $\mathbb{Z} := \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$, the integers
• $\mathbb{N} := \{0, 1, 2, 3, 4, 5, \ldots\}$, the natural numbers
• $\mathbb{Z}^+ := \{1, 2, 3, 4, \ldots\}$, the positive integers
• $\mathbb{Q} := \{m/n : m, n \in \mathbb{Z},\, n \neq 0\}$, the rationals
• $\mathbb{R}$ denotes the set of real numbers
• $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty\} \cup \{+\infty\}$ denotes the set of extended real numbers
• $\mathbb{C} := \{x + y\sqrt{-1} : x, y \in \mathbb{R}\}$, the complex numbers
• $\emptyset$ denotes the empty set, the set consisting of zero elements
• $\in$ means "is an element of." For example, $2 \in \mathbb{Z}$ is read as "2 is an element of $\mathbb{Z}$."
• $\forall$ means "for all"
• $\exists$ means "there exists"
• $\mathbb{R}^n := \{(x_1, \ldots, x_n) : x_i \in \mathbb{R},\ \forall\, i \in \{1, \ldots, n\}\}$
• $A \subseteq B$ means $\forall\, a \in A$, we have $a \in B$, so $A$ is contained in $B$
• $A \setminus B := \{x \in A : x \notin B\}$
• $A^c := X \setminus A$, the complement of $A$
• $A \cap B$ denotes the intersection of $A$ and $B$
• $A \cup B$ denotes the union of $A$ and $B$
Let $(X, d)$ be a metric space, let $x_0 \in X$, let $r > 0$ be a real number, and let $E$ be a subset of $X$. Let $(x_1, \ldots, x_n)$ be an element of $\mathbb{R}^n$, and let $p \geq 1$ be a real number.
• $B_{(X,d)}(x_0, r) = B(x_0, r) := \{x \in X : d(x, x_0) < r\}$
• $\overline{E}$ denotes the closure of $E$
• $\mathrm{int}(E)$ denotes the interior of $E$
• $\partial E$ denotes the boundary of $E$
• $\|(x_1, \ldots, x_n)\|_{\ell^p} := \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$
• $\|(x_1, \ldots, x_n)\|_{\ell^\infty} := \max_{i=1,\ldots,n} |x_i|$

Let $f, g \colon (X, d_X) \to (Y, d_Y)$ be maps between metric spaces. Let $V \subseteq X$, and let $W \subseteq Y$.
• $f(V) := \{f(v) \in Y : v \in V\}$
• $f^{-1}(W) := \{x \in X : f(x) \in W\}$
• $d_\infty(f, g) := \sup_{x \in X} d_Y(f(x), g(x))$
• $B(X; Y)$ denotes the set of functions $f \colon X \to Y$ that are bounded
• $C(X; Y) := \{f \in B(X; Y) : f \text{ is continuous}\}$

Let $f, g \colon \mathbb{R} \to \mathbb{C}$ be $\mathbb{Z}$-periodic functions.
• $\|f\|_\infty := \sup_{x \in [0,1]} |f(x)|$
• $\langle f, g \rangle := \int_0^1 f(x)\overline{g(x)}\, dx$
• $\|f\|_2 := \sqrt{\langle f, f \rangle} = \left( \int_0^1 |f(x)|^2\, dx \right)^{1/2}$
• $d_{L^2}(f, g) := \|f - g\|_2 = \left( \int_0^1 |f(x) - g(x)|^2\, dx \right)^{1/2}$

Let $n, m$ be positive integers, let $(e_1, \ldots, e_n)$ denote the standard basis of $\mathbb{R}^n$, let $E$ be a subset of $\mathbb{R}^n$, let $f \colon E \to \mathbb{R}^m$ be a function, let $x_0 \in E$ be an interior point of $E$, let $v \in \mathbb{R}^n$, and let $j \in \{1, \ldots, n\}$.
• $f'(x_0)$ denotes the total derivative of $f$
• $D_v f(x_0)$ denotes the derivative of $f$ in the direction $v$
• $\frac{\partial f}{\partial x_j}(x_0) = \frac{\partial}{\partial x_j} f(x_0) = D_{e_j} f(x_0)$

Let $E$ be a subset of $\mathbb{R}^n$, let $f \colon E \to \mathbb{R}$ be a function, and let $x_0$ be an interior point of $E$.
$$\nabla f(x_0) = \left( \frac{\partial f}{\partial x_1}(x_0), \ldots, \frac{\partial f}{\partial x_n}(x_0) \right).$$

7.1. Set Theory. Let $X, Y$ be sets, and let $f \colon X \to Y$ be a function. The function $f \colon X \to Y$ is said to be injective (or one-to-one) if and only if: for every $x, x' \in X$, if $f(x) = f(x')$, then $x = x'$.
The function $f \colon X \to Y$ is said to be surjective (or onto) if and only if: for every $y \in Y$, there exists $x \in X$ such that $f(x) = y$.
The function $f \colon X \to Y$ is said to be bijective (or a one-to-one correspondence) if and only if: for every $y \in Y$, there exists exactly one $x \in X$ such that $f(x) = y$. A function $f \colon X \to Y$ is bijective if and only if it is both injective and surjective.

Two sets X, Y are said to have the same cardinality if and only if there exists a bijection
from X onto Y .
UCLA Department of Mathematics, Los Angeles, CA 90095-1555
E-mail address: heilman@math.ucla.edu

