
Inverse problems in Image processing and Image segmentation: some mathematical and numerical aspects

A. Chambolle

CEREMADE (CNRS, UMR 7534), Université de Paris-Dauphine, 75775 Paris cedex 16, France

Lecture given at the School on Mathematical Problems in Image Processing, Trieste, 4-22 September 2000

LNS002001

antonin.chambolle@ceremade.dauphine.fr

Abstract
These notes contain an introduction to some approaches to the regularization of inverse problems in image processing and to the mathematical tools needed to handle these approaches correctly. The methods we consider here are variational methods. We consider mainly the minimization of two kinds of functionals: functionals based on the total variation of the image, and the so-called Mumford and Shah functional, which penalizes the edge set and the gradient of the image. In both cases we study mathematically the existence of a solution in the space of functions with bounded variation ($BV$), and then discuss some approximations and numerical methods for computing solutions.

Keywords: Image processing, inverse problems, image segmentation, functions with bounded variation, $\Gamma$-convergence, iterative algorithms.

AMS Classification numbers: 26A45, 49J45, 49Q20, 68U10

Contents

1 Introduction: denoising and deblurring images
  1.1 The classical approach
  1.2 The total variation criterion
  1.3 The segmentation of images
    1.3.1 A statistical approach to image denoising
    1.3.2 The Mumford-Shah functional
2 Some mathematical preliminaries
  2.1 The functions with bounded variation ($BV$)
    2.1.1 Why we need bounded variation functions
    2.1.2 $BV$ functions: definition and main properties
    2.1.3 Existence for the Rudin-Osher approach
  2.2 More properties of $BV$ functions
    2.2.1 The jumps set $S_u$
    2.2.2 $BV$ functions in one dimension
    2.2.3 The jumps set and the singular part of $Du$
    2.2.4 Special $BV$ functions, in dimension one
    2.2.5 The general, $N$-dimensional case
    2.2.6 Special $BV$ functions
    2.2.7 Ambrosio's compactness theorem
    2.2.8 Slicing
  2.3 Back to the Mumford-Shah functional
    2.3.1 Existence for the weak formulation
    2.3.2 From the weak to the strong formulation
  2.4 Variational approximations and $\Gamma$-convergence
3 The numerical analysis of the total variation minimization
  3.1 The discrete energy
  3.2 The method
  3.3 Proof of the convergence of the algorithm
  3.4 Two examples
4 The numerical analysis of the Mumford-Shah problem (I)
  4.1 Ambrosio and Tortorelli's approximate energy
  4.2 Sketch of the proof of Ambrosio and Tortorelli's theorem, in dimension one
    4.2.1 Proof of (i)
    4.2.2 Proof of (ii)
  4.3 Higher dimensions
    4.3.1 The first inequality
    4.3.2 The second inequality
5 The numerical analysis of the Mumford-Shah problem (II)
  5.1 Rescaling Blake and Zisserman's functional
  5.2 The $\Gamma$-limit of the rescaled 1-dimensional functional
    5.2.1 Proof of (i)
    5.2.2 Proof of (ii)
  5.3 The $\Gamma$-limit of the rescaled 2-dimensional functional
  5.4 More general finite-differences approximations
6 A numerical method for minimizing the Mumford-Shah functional
  6.1 An iterative procedure for minimizing (34)
  6.2 Anisotropy of the length term
  6.3 Numerical experiments
A Proof of Theorems 11 and 12
  A.1 A compactness lemma
  A.2 Estimate from below the $\Gamma$-limit
  A.3 Estimate from above the $\Gamma$-limit
  A.4 Proof of Theorem 12
References

Main notations

- $a \vee b$, $a \wedge b$: respectively, the max and the min of the two real numbers $a, b \in \mathbb{R}$.

- $H^k$: the $k$-dimensional Hausdorff measure. In particular, for every set $E \subset \mathbb{R}^N$, $H^0(E)$ is the cardinality of $E$, also denoted by $\sharp E$.

- $\chi_E(x)$: the characteristic function of a set $E$, i.e., $\chi_E(x) = 1$ if $x \in E$ and $\chi_E(x) = 0$ otherwise.

- $|E| = L^N(E) = \int_{\mathbb{R}^N} \chi_E(x)\,dx$: the Lebesgue measure of $E \subset \mathbb{R}^N$.

- $C_c(\Omega)$, $C_c^1(\Omega)$, $C_c^\infty(\Omega)$: the space of compactly supported continuous (respectively, continuously differentiable, infinitely differentiable) real-valued functions on the domain $\Omega \subset \mathbb{R}^N$. $C_c^\infty(\Omega)$ is also denoted by $D(\Omega)$ when it is equipped with the appropriate topology in order to define the distributions by duality (see [55]). $C_c(\Omega; \mathbb{R}^N) = [C_c(\Omega)]^N$, etc.

- $C_0(\Omega)$: the space of the real-valued functions that are continuous on $\Omega$ and vanish at the boundary and/or at infinity, in the sense that if $\varphi \in C_0(\Omega)$, $\forall \varepsilon > 0$, $\exists K \subset \Omega$ compact set such that $\sup_{\Omega \setminus K} |\varphi| \le \varepsilon$. The norm on $C_0(\Omega)$ is $\|\varphi\| = \sup_\Omega |\varphi|$. With this norm, $C_0(\Omega) = \overline{C_c(\Omega)}$. Similarly, $C_0(\Omega; \mathbb{R}^N) = [C_0(\Omega)]^N$.

- $M(\Omega)$: the space of bounded Radon measures on $\Omega$. It is isomorphic (and isometric) to the topological dual $C_0(\Omega)'$ of $C_0(\Omega)$. $M(\Omega; \mathbb{R}^N) = [M(\Omega)]^N = C_0(\Omega; \mathbb{R}^N)'$.

- $\langle x', x \rangle$: the Euclidean scalar product of $x, x' \in \mathbb{R}^N$, or the duality product between an element $x \in X$ of a space $X$ and an element $x' \in X'$ of its dual (also sometimes denoted by $\langle x', x \rangle_{X',X}$). The Euclidean norm in $\mathbb{R}^N$ is usually denoted by $|\cdot| = \sqrt{\langle \cdot, \cdot \rangle}$.

- $(a, b)$: the set $\{t \in \mathbb{R} : a < t < b\}$. $[a, b] = \{t \in \mathbb{R} : a \le t \le b\}$, $(a, b] = \{t \in \mathbb{R} : a < t \le b\}$, etc.

- $S^{N-1} = \{\xi \in \mathbb{R}^N : |\xi| = 1\}$ is the $(N-1)$-dimensional sphere in $\mathbb{R}^N$.


1 Introduction: denoising and deblurring images

One fundamental branch of image processing concerns the problem of reconstructing images: given some data (which may be a corrupted image, but also any kind of signal, like the output of a tomography device or of a satellite aerial), how can one reconstruct a clear and clean image that can be correctly understood by a human operator or post-processed by other image analysis methods?

The most basic examples of image reconstruction problems are the problems of denoising and of deblurring an image. Although they are the simplest, they share many common features with more complicated problems that are usually too specific for the purpose of short lectures. All these problems belong to the class usually known as inverse problems. This means that the process through which the data is obtained from the physical characteristics of the observed scene corresponds to transformations that are roughly well understood and can be modelled mathematically more or less correctly, but whose inverse either is not known, or is not computable by direct methods, or is highly unstable and sensitive to small changes in the data (or noise), so that the scene itself is difficult to reconstruct.

First we will describe the main classical approach to denoising and deblurring (more or less the standard method for solving inverse problems) and will try to explain why it is not well suited to the nature and structure of images. Then, we will introduce solutions that have been proposed in the past years to improve this approach.

1.1 The classical approach

Assume you observe a signal (an image) which is a matrix $G = (g_{i,j})_{1 \le i,j \le n}$ of grey level values in $[0, 1]$, and suppose you know that this signal is the sum of a perfect world unknown signal $U = (u_{i,j})_{1 \le i,j \le n}$ and an additive Gaussian noise $N = (n_{i,j})_{1 \le i,j \le n}$, where, for instance, all $n_{i,j}$ are independent and have mean 0 and known variance $\sigma^2$.

From a different point of view, in the continuous setting, you can assume that the signal you observe is a bounded grey-level function $g : \Omega \to [0, 1]$, where $\Omega$ is the screen, usually an open domain of $\mathbb{R}^2$ (although lower or higher dimensions may be considered), and most of the time, in the applications, a rectangle, e.g., $(0, 1) \times (0, 1)$. This function $g(x)$ will be assumed to be the sum $u(x) + n(x)$ of a good image $u(x)$ and an oscillation $n(x)$ that we would like to remove. We will assume that $\int_\Omega n(x)\,dx = 0$ and that $\int_\Omega n(x)^2\,dx = \sigma^2$ is known or can be correctly estimated.

The first point of view (the discrete setting) describes well the structure of digital images, and is usually adopted in the statistical approaches to image reconstruction. We will return to this setting in section 1.3, devoted to the image segmentation problem, since the origins of the approach that we will discuss in these notes are to be found in the statistical approach to image denoising. However, in the PDE or variational approach that we will usually adopt here, it is more common and more convenient to work in the continuous setting, and unless otherwise mentioned we will consider this point of view in the sequel.

Up to now we have just considered an image corrupted by some noise, but usually an image also goes through all kinds of degradations, which are usually modelled by a blur with a more or less known kernel. This means that instead of $g(x) = u(x) + n(x)$, the correct model should be $g(x) = Au(x) + n(x)$, where $A$ is a linear operator, say, from $L^2(\Omega)$ into $L^2(\Omega)$ (or any kind of reasonable function space). Usually

$$Au(x) = \rho * u(x) = \int \rho(x - y)\,u(y)\,dy$$

is simply a blur (a convolution), with $\rho$ some (usually nonnegative) kernel that is known or estimated, but one may imagine more complex operators (like tomography kernels, or all sorts of transformations).

Then, the problem we need to solve is the following: given $g$ and an estimation of $A$ and $\sigma^2$, is it possible to get a good approximation of $u$?

The first idea would be to compute $A^{-1}g = u + A^{-1}n$. However, this is not feasible in practice: the operator $A$ is often not invertible, or its inverse is impossible to compute. Consider for instance the case where $Au = \rho * u$. In the Fourier domain, we find that $\widehat{Au} = \hat\rho\,\hat u$, where $\hat{\ }$ denotes the Fourier transform, so that $u = A^{-1}v$ if and only if $\hat u = \hat v / \hat\rho$. But, even if $\hat\rho$ does not vanish, this ratio is usually not in $L^2$ for an arbitrary $v \in L^2$. Moreover, if $\rho$ is a smooth low-pass filter, then $\hat\rho(\xi)$ is very small for large frequencies $|\xi|$, so that in the case where $v$ is the oscillatory signal $n$, for which $|\hat n(\xi)|$ remains strictly greater than zero for large $|\xi|$, the ratio $\hat n(\xi)/\hat\rho(\xi)$ will become very large and go to $+\infty$ as $|\xi|$ increases. This enhancement of the high frequencies gives birth to wild oscillations and artifacts that make the image $u + A^{-1}n$ impossible to read.
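The instability of the direct inversion can be observed on a small one-dimensional example. The sketch below is only an illustration under stated assumptions: the signal length, the binomial kernel, the alternating "noise" and the periodic boundary conditions are all choices made here, not taken from the notes. It divides the discrete Fourier transform of the data by that of the kernel, exactly as in the formula $\hat u = \hat v / \hat\rho$.

```python
import cmath
import math

def dft(x, sign=-1):
    """Plain O(n^2) discrete Fourier transform (sign=+1 gives the unnormalized inverse)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(sign * 2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [v.real / n for v in dft(X, sign=+1)]

def blur(u, kernel):
    """Periodic convolution with a kernel whose centre sits at offset 0."""
    n, r = len(u), len(kernel) // 2
    return [sum(kernel[m + r] * u[(i - m) % n] for m in range(-r, r + 1))
            for i in range(n)]

# Illustrative data (assumptions): a 1D "white square", a smooth low-pass kernel
# of integral 1, and a small rapidly oscillating perturbation.
n = 31
u = [1.0 if 10 <= i < 20 else 0.0 for i in range(n)]
rho = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]
noise = [0.01 * (-1) ** i for i in range(n)]
g_clean = blur(u, rho)
g_noisy = [a + b for a, b in zip(g_clean, noise)]

# Fourier multiplier of the blur: DFT of the kernel wrapped around index 0.
rho_wrapped = [0.0] * n
for m in range(-2, 3):
    rho_wrapped[m % n] = rho[m + 2]
rho_hat = dft(rho_wrapped)

def naive_inverse(g):
    """Compute A^{-1} g by dividing in the Fourier domain."""
    return idft([gh / rh for gh, rh in zip(dft(g), rho_hat)])

err_clean = max(abs(a - b) for a, b in zip(naive_inverse(g_clean), u))
err_noisy = max(abs(a - b) for a, b in zip(naive_inverse(g_noisy), u))
print("error without noise:", err_clean)
print("error with noise of amplitude 0.01:", err_noisy)
```

Without noise the inversion is accurate up to rounding, whereas the perturbation of amplitude 0.01 is amplified by several orders of magnitude, because the multiplier of this kernel is close to zero at the highest frequencies.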

A better approach to this kind of problem, therefore, is the following: we will try to find the best function $u$ among all $u$ satisfying

$$\begin{cases} \displaystyle\int_\Omega \big(Au(x) - g(x)\big)\,dx = 0, \\[1ex] \displaystyle\int_\Omega |Au(x) - g(x)|^2\,dx = \sigma^2. \end{cases} \qquad (1)$$

So the main issue, now, is to find a good criterion for characterizing what the best function $u$ is.

The classical approach of Tichonov consists in minimizing some quadratic norm of $u$, like $\int_\Omega |u|^2$ or $\int_\Omega |\nabla u|^2$, under the constraints (1). Both problems can easily be solved (using the Fourier transform), and the linear transformation of $g$ that gives the solution $u$ is called a Wiener filter.
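As a hedged illustration of why the Fourier transform solves the first problem, here is a short derivation under simplifying assumptions that are not in the notes: the domain is the whole space (or a periodic cell), $A$ is a convolution with kernel $\rho$, the mean constraint is ignored, and the second constraint in (1) is handled through a Lagrange multiplier written $\beta > 0$ (a hypothetical name).

```latex
% Assumptions: whole-space or periodic setting, Au = \rho * u,
% \beta > 0 is a hypothetical Lagrange multiplier for the constraint in (1).
\begin{align*}
\min_u \int |u|^2\,dx + \beta \int |\rho * u - g|^2\,dx
  &= \min_{\hat u} \int \Big( |\hat u(\xi)|^2
     + \beta\,\big|\hat\rho(\xi)\,\hat u(\xi) - \hat g(\xi)\big|^2 \Big)\,d\xi
     && \text{(Plancherel, up to a constant factor)} \\
\hat u(\xi) + \beta\,\overline{\hat\rho(\xi)}\,
     \big(\hat\rho(\xi)\,\hat u(\xi) - \hat g(\xi)\big) &= 0
     && \text{(pointwise optimality in } \xi\text{)} \\
\hat u(\xi) &= \frac{\beta\,\overline{\hat\rho(\xi)}}{1 + \beta\,|\hat\rho(\xi)|^2}\;\hat g(\xi).
\end{align*}
```

The multiplier $\beta\,\overline{\hat\rho}/(1 + \beta|\hat\rho|^2)$ has modulus at most $\sqrt{\beta}/2$, so the high frequencies of the noise are no longer amplified without bound, in contrast with the ratio $1/\hat\rho$ of the direct inversion.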

Figure 1: From left to right: a white square on a black background; the same image with noise added; the Tichonov reconstruction obtained by minimizing $\int |u|^2$; the reconstruction obtained by minimizing $\int |\nabla u|^2$.

However, while the first criterion is not regularizing enough and produces images that still look very noisy, the criterion $\int_\Omega |\nabla u|^2$ is not well suited either for the analysis of images (see Fig. 1). Indeed, if it is finite, it means that the image belongs to the Sobolev space $H^1(\Omega)$, and it is well known that a function in that space may not have discontinuities along a hypersurface, whereas the grey level of an image should be allowed to have such discontinuities, which correspond to edges and boundaries of objects in the image. For instance, in dimension 1, it is well known that if $u \in H^1(I)$ ($I$ being some interval of $\mathbb{R}$), then for every $x, y \in I$ with $x \le y$,

$$u(y) - u(x) = \int_x^y u'(s)\,ds \le \sqrt{y - x}\,\sqrt{\int_x^y |u'(s)|^2\,ds},$$

so that $u \in C^{0,1/2}(I)$ (the space of $\frac{1}{2}$-Hölder continuous functions in $I$) and may not have discontinuities.

This motivates the introduction of the criterion that we discuss in the next section.
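The obstruction can be checked numerically on a unit step. The following sketch (the sampling of $[0,1]$ and the function name `step_energies` are choices made here for illustration) compares the discrete analogue of $\int |u'|^2$ with the discrete analogue of $\int |u'|$ as the grid is refined.

```python
def step_energies(n):
    """Discrete H^1 and total-variation seminorms of a unit step sampled at n+1 points of [0, 1]."""
    h = 1.0 / n
    u = [0.0 if i * h < 0.5 else 1.0 for i in range(n + 1)]
    diffs = [u[i + 1] - u[i] for i in range(n)]
    h1 = sum((d / h) ** 2 * h for d in diffs)  # approximates the integral of |u'|^2
    tv = sum(abs(d) for d in diffs)            # approximates the integral of |u'|
    return h1, tv

for n in (10, 100, 1000):
    h1, tv = step_energies(n)
    print(n, h1, tv)
```

The quadratic energy grows like $n$, so it is infinite in the limit, while the sum of absolute differences stays equal to the height of the jump whatever the resolution.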

1.2 The total variation criterion

In their paper [54], Rudin, Osher and Fatemi describe a different approach (see also [46, 53, 60, 61, 24, 25, 44, 33, 47]). Their idea is to try to find a criterion of minimization that corresponds better to the structure of images. They propose to consider the total variation of the function $u$ as a measure of the optimality of an image.

The total variation (which will be introduced correctly in section 2.1) is roughly the integral $\int_\Omega |\nabla u(x)|\,dx$. The main advantage is that it can be defined for functions that have discontinuities along hypersurfaces (in 2-dimensional images, along 1-dimensional curves), and this is essential to get a correct representation of the edges in an image.

The problem to solve is thus the following:

$$\min \left\{ \int_\Omega |\nabla u(x)|\,dx \;:\; u \text{ satisfies (1)} \right\}. \qquad (2)$$

We will show in section 2.1 that under some simple and natural assumptions, this problem has a solution. Then, we will propose a numerical approach for computing a solution.

1.3 The segmentation of images

The last approach that we will discuss in these notes can be seen as an independent problem, although historically it has the same origin. It is called the problem of image segmentation, and can be described as the problem of finding a simple representation of a given image in terms of edges and smooth areas. The proposal of D. Mumford and J. Shah ([49, 50]) to solve this problem by minimizing a functional is indeed derived from statistical approaches to image denoising, introduced in particular by S. and D. Geman, that we will describe in the next section. Again, the problem of Geman and Geman was to regularize correctly an inverse problem (the problem that we have described in the previous paragraphs, written in the discrete setting), and to restore correctly the edges of the image. Thus we will briefly describe the point of view of Geman and Geman, and then explain how Mumford and Shah derived their continuous formulation.

1.3.1 A statistical approach to image denoising

The origins of the variational approaches to image segmentation are to be found in Geman and Geman's famous paper [41], in which they introduce a statistical approach for image analysis that has proved to be very efficient. First we will briefly explain how it appeared in the probabilistic setting.

We return to the discrete setting of the image denoising problem: the observed signal (or image) is a matrix $G = (g_{i,j})_{1 \le i,j \le n}$ of grey level values in $[0, 1]$, and is the combination of a perfect world unknown signal $U = (u_{i,j})_{1 \le i,j \le n}$ and an additive Gaussian noise $N = (n_{i,j})_{1 \le i,j \le n}$. The $n_{i,j}$ are independent and have mean 0 and variance $\sigma^2$. If you know the a priori probability $P(U)$ of the perfect world signal $U$, then, since for a given $G = U + N$ the probability of $G$ knowing $U$ is $P(G|U) = P(N = G - U) \sim \exp(-\|G - U\|^2 / 2\sigma^2)$, Bayes' rule tells you that

$$P(U|G)\,P(G) = P(G|U)\,P(U),$$

so that $P(U|G)$, up to a constant, is $P(G|U)\,P(U)$, that is:

$$\frac{1}{(\sqrt{2\pi}\,\sigma)^{n \times n}} \exp\left( -\frac{1}{2\sigma^2} \sum_{i,j} (g_{i,j} - u_{i,j})^2 \right) P(U).$$
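Taking logarithms makes the link with energy minimization explicit. The short computation below only rewrites the expression above; the constant $C$ is a name introduced here for the terms that do not depend on $U$.

```latex
% C collects the terms independent of U: \log P(G) and the Gaussian normalization.
\begin{align*}
-\log P(U \mid G)
  &= -\log P(G \mid U) - \log P(U) + \log P(G) \\
  &= \frac{1}{2\sigma^2} \sum_{i,j} (g_{i,j} - u_{i,j})^2 \;-\; \log P(U) \;+\; C .
\end{align*}
```

Maximizing the posterior probability $P(U|G)$ is therefore the same as minimizing a quadratic data term plus the regularizing term $-\log P(U)$ supplied by the prior.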

Geman and Geman proposed the following a priori probability for $U$: they considered that most scenes are piecewise smooth with possible discontinuities (the edges), and introduced an edge set (or line process) $L = \big( (l_{i+\frac12,j})_{1 \le i < n,\, 1 \le j \le n},\ (l_{i,j+\frac12})_{1 \le i \le n,\, 1 \le j < n} \big)$, where each variable $l_{\cdot,\cdot}$ is either 0 or 1, and (see Fig. 2):

$$l_{i+\frac12,j} = \begin{cases} 1 & \text{if there is a break (a vertical piece of edge) between } i,j \text{ and } i+1,j, \\ 0 & \text{if } U \text{ has to be smooth between } i,j \text{ and } i+1,j, \end{cases}$$

$$l_{i,j+\frac12} = \begin{cases} 1 & \text{if there is a break (a horizontal piece of edge) between } i,j \text{ and } i,j+1, \\ 0 & \text{if } U \text{ has to be smooth between } i,j \text{ and } i,j+1. \end{cases}$$

They then proposed the following probability law for $U, L$:

$$P(U, L) = \frac{1}{Z} \exp\left\{ -\sum_{i,j} \Big[ \lambda (1 - l_{i+\frac12,j})(u_{i+1,j} - u_{i,j})^2 + \mu\, l_{i+\frac12,j} + \lambda (1 - l_{i,j+\frac12})(u_{i,j+1} - u_{i,j})^2 + \mu\, l_{i,j+\frac12} \Big] \right\},$$

where $\lambda$, $\mu$ are two positive weights, and $Z$ is computed in order to have $\sum_{U,L} P(U, L) = 1$, the sum being computed over all the possible states $U, L$.

Figure 2: The line process: $l_{i+\frac12,j}$ and $l_{i,j+\frac12}$.

The problem that needs to be solved is therefore the following: among all possible images $U$ and line processes $L$, find the one that has the greatest probability

$$P(U, L | G) \sim e^{-E(U, L, G)}, \qquad (3)$$

where the free energy $E(U, L, G)$ is given by

$$E(U, L, G) = \sum_{i,j} \Big[ \lambda \big( (1 - l_{i+\frac12,j})(u_{i+1,j} - u_{i,j})^2 + (1 - l_{i,j+\frac12})(u_{i,j+1} - u_{i,j})^2 \big) + \mu \big( l_{i+\frac12,j} + l_{i,j+\frac12} \big) + \frac{1}{2\sigma^2} (g_{i,j} - u_{i,j})^2 \Big] \qquad (4)$$

and $G = (g_{i,j})_{1 \le i,j \le n}$ is the given data.

In what follows, since the observed data $G$ will be fixed, we will drop the dependency on $G$ in the notation and merely write $E(U, L)$.

Then, Geman and Geman proposed to maximize the probability (3) using a simulated annealing algorithm (see for instance [11, 12], [15], [32], the book [51] for more general segmentation models, and the book on Markov Random Field Modeling in Computer Vision by Li [45] for a general introduction to the field). This kind of method is still widely used in the computer vision community and gives good results. It has to be adapted to each particular segmentation problem (among which the problem we have presented is one of the simplest, but might not be the most interesting!).

We will present other approaches, since in some simple cases it might be too costly to implement a simulated annealing algorithm. Notice that the problem of maximizing (3) is equivalent to the problem of finding a minimum of the free energy $E(U, L)$ that appears in the exponential in (3). The problem is that this energy is not convex, so that there is no known deterministic (i.e., non-probabilistic) algorithm that can be proved to converge to the minimum. The history of the minimization of $E(U, L)$ is therefore mostly a competition for finding a better algorithm, in any possible sense.

Already in the 80's, many authors suggested deterministic methods to minimize energy (4) directly. See for instance [38, 39], or [40], and more recently [10], but of course this list is far from being exhaustive.

In most of these papers, the problem is iteratively approximated by a sequence of simpler problems, each one becoming less convex as the process evolves. This is the central idea of the book Visual Reconstruction by Blake and Zisserman [14], who introduce the so-called Graduated Non-Convexity (GNC) algorithm.
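They first noticed that, minimizing with respect to $L$, the energy $E$ in (4) can be rewritten as

$$E(U) = \sum_{i,j} \Big[ W_{\lambda,\mu}(u_{i+1,j} - u_{i,j}) + W_{\lambda,\mu}(u_{i,j+1} - u_{i,j}) + \frac{1}{2\sigma^2} (g_{i,j} - u_{i,j})^2 \Big] \qquad (5)$$

where the non-convex potential $W_{\lambda,\mu}$ is (see Figure 3)

$$W_{\lambda,\mu}(x) = \min(\lambda x^2, \mu).$$

(We will also denote $\min(\lambda x^2, \mu)$ by $(\lambda x^2) \wedge \mu$.)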
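The elimination of the line process can be verified by brute force on a tiny image. The sketch below is an illustration only: the names `lam`, `mu`, `sigma` stand for the two weights and the noise level of energy (4), the $2 \times 2$ test data are made up, and the sums are restricted to pairs of neighbours that both lie inside the grid (an assumption about the boundary treatment).

```python
from itertools import product

def energy_UL(U, Lv, Lh, G, lam, mu, sigma):
    """Energy (4). Lv[i][j] = 1 breaks the link (i,j)-(i+1,j); Lh[i][j] = 1 breaks (i,j)-(i,j+1)."""
    n = len(U)
    E = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                E += lam * (1 - Lv[i][j]) * (U[i + 1][j] - U[i][j]) ** 2 + mu * Lv[i][j]
            if j + 1 < n:
                E += lam * (1 - Lh[i][j]) * (U[i][j + 1] - U[i][j]) ** 2 + mu * Lh[i][j]
            E += (G[i][j] - U[i][j]) ** 2 / (2 * sigma ** 2)
    return E

def W(x, lam, mu):
    """Truncated quadratic potential min(lam * x^2, mu)."""
    return min(lam * x * x, mu)

def energy_U(U, G, lam, mu, sigma):
    """Reduced energy (5), in which the line process no longer appears."""
    n = len(U)
    E = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                E += W(U[i + 1][j] - U[i][j], lam, mu)
            if j + 1 < n:
                E += W(U[i][j + 1] - U[i][j], lam, mu)
            E += (G[i][j] - U[i][j]) ** 2 / (2 * sigma ** 2)
    return E

def min_over_L(U, G, lam, mu, sigma):
    """Minimize energy (4) over every binary line process by exhaustive enumeration."""
    n = len(U)
    nv = (n - 1) * n
    best = float("inf")
    for bits in product((0, 1), repeat=2 * nv):
        Lv = [[bits[i * n + j] for j in range(n)] for i in range(n - 1)]
        Lh = [[bits[nv + i * (n - 1) + j] for j in range(n - 1)] for i in range(n)]
        best = min(best, energy_UL(U, Lv, Lh, G, lam, mu, sigma))
    return best

# Made-up 2x2 example: a horizontal edge between the two rows.
U = [[0.0, 0.1], [1.0, 1.05]]
G = [[0.0, 0.0], [1.0, 1.0]]
e_reduced = energy_U(U, G, lam=1.0, mu=0.25, sigma=1.0)
e_brute = min_over_L(U, G, lam=1.0, mu=0.25, sigma=1.0)
print(e_reduced, e_brute)
```

Both values agree because each variable $l$ appears in a single term $(1 - l)\lambda d^2 + \mu l$, whose minimum over $l \in \{0, 1\}$ is $\min(\lambda d^2, \mu)$. The enumeration is exponential in the number of links and is only meant as a check on very small grids.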

Figure 3: The function $W_{\lambda,\mu}(x)$ for $\lambda = \mu = 1$.

Blake and Zisserman call $E(U)$ the weak membrane energy, since it looks like the potential of an elastic membrane that can break when the elastic energy becomes locally too high. Notice now that their problem is very similar to the inverse problems that we have presented in the previous sections. Now, instead of minimizing a regularizing factor (which is quite a bit more complex than Tichonov's) under a constraint like (1) (with $A = \mathrm{Id}$), the energy has a term $\frac{1}{2\sigma^2}(g_{i,j} - u_{i,j})^2$ that could be seen as a Lagrange multiplier term for the constraint $\int_\Omega |u(x) - g(x)|^2\,dx = \sigma^2$.

Their idea to minimize $E(U)$ is to replace $W_{\lambda,\mu}$ with a family of potentials $W^\theta_{\lambda,\mu}$, $\theta \in [0, 1]$, with $W^\theta_{\lambda,\mu}$ convex for $\theta = 0$ and gradually going to $W_{\lambda,\mu}$ as $\theta$ increases to 1. They then propose to solve the problem for small $\theta$, and then to slowly increase $\theta$ to improve the solution.

1.3.2 The Mumford-Shah functional

In order to study energies (4) or (5), Mumford and Shah (see [49, 50]) proposed to rewrite them in a continuous setting. They considered an observed image $g(x, y)$, with $(x, y) \in \Omega$, a bounded open set of $\mathbb{R}^2$, and $g(x, y) \in [0, 1]$ for (almost) every $(x, y)$, and then they noticed that the variable $L$, or rather the set $\{L = 1\}$, describes the discontinuity or jump set $K \subset \Omega$ of a piecewise regular function $u(x, y)$, $(x, y) \in \Omega$, whereas the finite differences $u_{i+1,j} - u_{i,j}$ (resp., $u_{i,j+1} - u_{i,j}$) are approximations of the partial derivatives $\frac{\partial u}{\partial x}(x, y)$ (resp., $\frac{\partial u}{\partial y}(x, y)$). The energy they wrote was thus (with the standard notation $\nabla u = (\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y})$ for the gradient)

$$E(u, K) = \lambda \int_{\Omega \setminus K} |\nabla u(x, y)|^2\,dx\,dy + \mu \cdot \mathrm{length}(K \cap \Omega) + \nu \int_\Omega (u(x, y) - g(x, y))^2\,dx\,dy \qquad (6)$$

where $\lambda, \mu, \nu$ are positive parameters. They then proposed to study the problem of minimizing energy (6).

In these lecture notes we will try to explain briefly

(a) how this problem can be handled mathematically: in what setting, in what function space, and in what sense it has a solution,

(b) a first approximation result that has been proposed in order to minimize energy $E(u, K)$ more easily, in a continuous setting,

(c) in what sense one can say that $E(u, K)$ and $E(U, L)$ are the same energies, in a continuous and in a discrete setting,

(d) how it is possible to approximate $E(u, K)$ by discrete energies better, in some sense, than with energy $E(U, L)$.

What we will not describe, on the other hand, are the possible finite-element approaches that have also been proposed for solving the Mumford-Shah problem. It is still not clear whether they are of some interest for image processing applications or not. They are usually useful in other fields where similar problems are relevant, in particular in fracture mechanics. The interested reader may consult [13, 37, 16], or [23, 17].

2 Some mathematical preliminaries

2.1 The functions with bounded variation ($BV$)

2.1.1 Why we need bounded variation functions

For the study of Rudin and Osher's problem (2), the correct mathematical setting is clearly the functions with bounded variation (the criterion they propose to minimize being simply the semi-norm defining such functions), which we will define in the next paragraph. Although it may not be as clear, this is also true for the analysis of the Mumford-Shah functional. Indeed, in order to study energy $E$, Ambrosio and De Giorgi have suggested introducing a weak formulation depending only on the variable $u$. This formulation assumes that we are able to define, given a function $u$, a set of discontinuities $S_u$ and a gradient $\nabla u$ everywhere outside of $S_u$. The weak Mumford-Shah energy is then

$$E(u) = \lambda \int_\Omega |\nabla u(x)|^2\,dx + \mu\, H^{N-1}(S_u) + \nu \int_\Omega |u(x) - g(x)|^2\,dx. \qquad (7)$$

Here we consider that $u$, $g$ are defined in a domain $\Omega$ of a space of arbitrary dimension $N$, and the set $S_u$ is $(N-1)$-dimensional; for images you can just replace $N$ by 2 everywhere in the notes. $H^{N-1}$ denotes the $(N-1)$-dimensional Hausdorff measure (see for instance [35]). It is a Borel measure in $\mathbb{R}^N$ that agrees with the traditional definition of the surface for every regular hypersurface in $\mathbb{R}^N$ (any bounded part of a hyperplane, a sphere, ...).

The discontinuity set $S_u$ can be defined for very general functions, but it usually has no kind of regularity. A correct definition of the gradient $\nabla u$ requires more regularity of $u$. Usually, we can define a gradient $\nabla u$ as an integrable function if $u$ belongs to the Sobolev space $W^{1,1}(\Omega)$ (or at least $W^{1,1}_{loc}(\Omega)$). But in this case it is possible to show that $S_u$ is almost empty (in fact, $H^{N-1}(S_u) = 0$: we say that $S_u$ is $H^{N-1}$-essentially empty).

The space of bounded variation functions, which we are going to introduce, does not suffer from this drawback. It contains functions for which it is possible to define correctly $S_u$ and the gradient $\nabla u$, in such a way that $0 < H^{N-1}(S_u) < +\infty$ and $\int_\Omega |\nabla u(x)|^2\,dx < +\infty$. Such a function combines some regularity with discontinuities across the essentially $(N-1)$-dimensional set $S_u$.

dene correctly

2.1.2 BV functions: definition and main properties

The space of bounded variation functions in $\Omega$, denoted by $BV(\Omega)$, is defined in the following way:

$$BV(\Omega) = \left\{ u \in L^1(\Omega) : Du \text{ is a bounded Radon vector measure on } \Omega \right\},$$

where $Du$ is the distributional (or weak) derivative of $u$, defined by

$$\langle Du, \varphi \rangle_{D'(\Omega; \mathbb{R}^N), D(\Omega; \mathbb{R}^N)} = -\int_\Omega u(x)\,\mathrm{div}\,\varphi(x)\,dx \qquad (8)$$

for any vector field $\varphi \in D(\Omega; \mathbb{R}^N)$, i.e., $C^\infty$ with compact support in $\Omega$.

Let us denote by $M(\Omega; \mathbb{R}^N)$ the space of $N$-dimensional bounded (vector-valued) Radon measures on $\Omega$. It is well known (as a consequence of Riesz' representation theorem) that $M(\Omega; \mathbb{R}^N)$ is identified with the dual of $C_0(\Omega; \mathbb{R}^N)$, the space of all continuous vector fields vanishing at the boundary (this means that if $\varphi \in C_0(\Omega; \mathbb{R}^N)$, for every $\varepsilon > 0$, there exists a compact set $K \subset \Omega$ such that $\sup_{x \notin K} |\varphi| < \varepsilon$), on which the norm is given by $\|\varphi\|_{C_0(\Omega; \mathbb{R}^N)} = \sup_{x \in \Omega} |\varphi(x)|$. If $\mu \in M(\Omega; \mathbb{R}^N)$ is a measure, we can define its variation as the Borel positive measure given by

$$|\mu|(E) = \sup \left\{ \sum_{i=1}^n |\mu(E_i)| \;:\; \bigcup_{i=1}^n E_i \subset E,\ E_i \cap E_j = \emptyset\ \ \forall i \ne j \right\} \qquad (9)$$

for every Borel set $E$ (here the $E_i$, $i = 1, \dots, n$, are disjoint Borel sets). Saying that the measure $\mu$ is bounded is nothing else than saying that $|\mu|(\Omega) < +\infty$; the quantity $|\mu|(\Omega)$ is called the total variation of $\mu$ (on $\Omega$) and defines the usual norm in the Banach space $M(\Omega; \mathbb{R}^N)$.

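As an element of the dual $C_0(\Omega; \mathbb{R}^N)'$ of $C_0(\Omega; \mathbb{R}^N)$, $\mu$ also has a norm given by

$$\|\mu\|_{C_0(\Omega; \mathbb{R}^N)'} = \sup_{\|\varphi\|_{C_0(\Omega; \mathbb{R}^N)} \le 1} \langle \mu, \varphi \rangle = \sup_{\|\varphi\|_{C_0(\Omega; \mathbb{R}^N)} \le 1} \int_\Omega \varphi(x) \cdot \mu(dx).$$

In fact, both norms coincide, which means that for every $\mu \in M(\Omega; \mathbb{R}^N)$,

$$|\mu|(\Omega) = \sup \left\{ \int_\Omega \varphi(x) \cdot \mu(dx) \;:\; \varphi \in C_0(\Omega; \mathbb{R}^N),\ |\varphi(x)| \le 1\ \ \forall x \in \Omega \right\}.$$

The weak-$*$ convergence of a sequence of measures $(\mu_n)$ is understood as the weak-$*$ convergence in the dual of $C_0(\Omega; \mathbb{R}^N)$, which means that $\mu_n \rightharpoonup \mu$ weakly-$*$ if and only if

$$\int_\Omega \varphi(x) \cdot \mu_n(dx) \to \int_\Omega \varphi(x) \cdot \mu(dx)$$

for every $\varphi \in C_0(\Omega; \mathbb{R}^N)$.

If $Du$ is a bounded Radon measure, then since $\int_\Omega \varphi \cdot Du = -\int_\Omega u\,\mathrm{div}\,\varphi$ for every $C^1$-regular $\varphi$ in $C_c^1(\Omega; \mathbb{R}^N)$, we deduce (the compactly supported functions being dense in the space $C_0(\Omega; \mathbb{R}^N)$) that if $u \in L^1(\Omega)$,

$$u \in BV(\Omega) \iff V(u, \Omega) = \sup \left\{ \int_\Omega u(x)\,\mathrm{div}\,\varphi(x)\,dx \;:\; \varphi \in C_c^1(\Omega; \mathbb{R}^N),\ |\varphi(x)| \le 1\ \ \forall x \in \Omega \right\} < +\infty. \qquad (10)$$

The quantity $V(u, \Omega)$ coincides with the total variation of the measure $Du$, i.e., $V(u, \Omega) = |Du|(\Omega)$. In fact, saying that $V(u, \Omega)$ must be finite is an equivalent way to define the space $BV(\Omega)$. We call $V(u, \Omega) = |Du|(\Omega)$ the total variation of $u$ in $\Omega$. If $u \in C^1(\Omega)$, or $u$ is in the Sobolev space $W^{1,1}(\Omega)$, then the notation in (2) is valid since it is simple to show that $|Du|(\Omega) = \int_\Omega |\nabla u(x)|\,dx$. The space $BV(\Omega)$, endowed with the norm $\|u\|_{BV(\Omega)} = \|u\|_{L^1(\Omega)} + |Du|(\Omega)$, is a Banach space.

Exercise. Prove that, given $u \in L^1(\Omega)$, $Du \in M(\Omega; \mathbb{R}^N)$ if and only if $V(u, \Omega)$ (given by (10)) is finite.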
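The dual definition (10) has a simple one-dimensional discrete analogue, sketched below. This is an illustration under assumptions made here: $u$ is a list of samples, the test field `phi` lives on the interior edges between consecutive samples and is extended by zero outside (mimicking compact support), and the function names are hypothetical.

```python
def tv_primal(u):
    """Sum of absolute jumps: the discrete total variation."""
    return sum(abs(u[i + 1] - u[i]) for i in range(len(u) - 1))

def dual_value(u, phi):
    """Discrete analogue of the integral of u * div(phi).
    phi[k] sits on the edge between samples k and k+1 and vanishes outside the grid."""
    n = len(u)
    ext = [0.0] + list(phi) + [0.0]  # ext[i] is the value on the edge to the left of sample i
    return sum(u[i] * (ext[i + 1] - ext[i]) for i in range(n))

def sign(x):
    return (x > 0) - (x < 0)

def optimal_phi(u):
    """Field saturating |phi| <= 1 that attains the supremum: phi = -sign(jump)."""
    return [-float(sign(u[i + 1] - u[i])) for i in range(len(u) - 1)]

u = [0.0, 0.0, 1.0, 1.0, 0.25]
print(tv_primal(u))                    # sum of jumps: 1 + 0.75
print(dual_value(u, optimal_phi(u)))   # same value, reached by the optimal field
print(dual_value(u, [0.5, -0.3, 1.0, -1.0]))  # any other admissible field gives less
```

By summation by parts, `dual_value(u, phi)` equals minus the sum of the jumps of `u` weighted by `phi`, so its supremum over fields bounded by 1 is exactly the sum of the absolute jumps.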

The first result we can state about the total variation is the following semi-continuity property:

Theorem 1 (Semicontinuity of the total variation) The convex functional $u \mapsto V(u, \Omega) = |Du|(\Omega) \in [0, +\infty]$ is lower semicontinuous in the $L^1_{loc}(\Omega)$ topology.

This means that if $u_n$ goes to $u$ in $L^1(\Omega')$ for every $\Omega' \subset\subset \Omega$, then $|Du|(\Omega) \le \liminf_{n \to \infty} |Du_n|(\Omega)$. The proof of Theorem 1 is straightforward if we consider the definition (10) of the variation of $u$. Indeed, in (10), $V(u, \Omega)$ is built as the sup of the linear functionals $u \mapsto \int_\Omega u(x)\,\mathrm{div}\,\varphi(x)\,dx$ for $\varphi \in C_c^1(\Omega; \mathbb{R}^N)$. Since each of these functionals is continuous in the $L^1_{loc}$ topology, we deduce that the sup is lower semicontinuous.
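A simple example, not taken from the notes, shows that the inequality in the semicontinuity property can be strict, so that the total variation is lower semicontinuous but not continuous for the $L^1$ convergence. Take $\Omega = (0, 2\pi)$ in dimension one.

```latex
% Illustrative example on \Omega = (0, 2\pi): oscillations of vanishing amplitude.
\begin{align*}
u_n(x) &= \frac{\sin(nx)}{n}, \qquad
\|u_n\|_{L^1(0,2\pi)} \le \frac{2\pi}{n} \longrightarrow 0
  \quad\text{so } u_n \to u = 0 \text{ in } L^1, \\
|Du_n|(0, 2\pi) &= \int_0^{2\pi} |\cos(nx)|\,dx
  = \frac{1}{n} \int_0^{2\pi n} |\cos t|\,dt = 4 \quad\text{for every } n \ge 1, \\
|Du|(0, 2\pi) &= 0 \;<\; 4 = \liminf_{n \to \infty} |Du_n|(0, 2\pi).
\end{align*}
```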


Next, we have the following Poincaré inequalities.

Theorem 2 (Poincaré inequalities) There exists a constant $c = c(N)$ such that if $u \in L^1_{loc}(\mathbb{R}^N)$, then

$$\|u\|_{L^{N/(N-1)}(\mathbb{R}^N)} \le c\,|Du|(\mathbb{R}^N),$$

and if $B$ is a ball and $u \in L^1(B)$,

$$\left\| u - \fint_B u \right\|_{L^{N/(N-1)}(B)} \le c\,|Du|(B).$$

Here and everywhere in the notes $\fint_X u = \fint_X u(x)\,dx$ denotes the average $\frac{1}{|X|} \int_X u(x)\,dx$.

If $\Omega$ is a bounded Lipschitz-regular open set (this will always be assumed in what follows) we can build a continuous linear extension operator $T_{\Omega,\Omega'}$ from $BV(\Omega)$ to $BV(\Omega')$ for every $\Omega'$ with $\Omega \subset\subset \Omega'$, which means that for every $u \in BV(\Omega)$ we can find $u' \in BV(\Omega')$ with $u' \equiv u$ on $\Omega$ and $\|u'\|_{BV(\Omega')} \le c\,\|u\|_{BV(\Omega)}$, the constant $c$ depending only on $\Omega, \Omega'$. This extension allows us to generalize the second inequality in the last theorem to any such $\Omega$: we deduce that there exists a constant $c = c(\Omega)$ such that

$$\left\| u - \fint_\Omega u \right\|_{L^{N/(N-1)}(\Omega)} \le c\,|Du|(\Omega) \qquad (11)$$

for every $u \in BV(\Omega)$ (see [34] for details).

We state, still without any proof, the next two theorems, which are fundamental for the study of the space $BV(\Omega)$.

Theorem 3 (Sobolev embeddings) Let $\Omega$ be bounded and Lipschitz-regular. Then the space $BV(\Omega)$ is continuously embedded in $L^{N/(N-1)}(\Omega)$, and compactly embedded in $L^p(\Omega)$ for every $1 \le p < N/(N-1)$.

The first assertion is a consequence of the previous theorem. The second means that if a sequence of functions $(u_j)_{j \ge 1}$ is bounded in $BV(\Omega)$, i.e., $\sup_j \|u_j\|_{L^1(\Omega)} + |Du_j|(\Omega) < +\infty$, then we can extract a subsequence $u_{j_k}$ and there exists a function $u \in BV(\Omega)$ such that, as $k \to \infty$, $Du_{j_k} \rightharpoonup Du$ weakly-$*$ as a measure and $u_{j_k} \to u$ strongly in $L^p(\Omega)$, for every $p < N/(N-1)$.

Theorem 4 (Approximation by smooth functions) Let $u \in BV(\Omega)$. Then there exists a sequence $(u_n)_{n \ge 1} \subset C^\infty(\Omega)$ such that, as $n \to \infty$, $u_n \to u$ in $L^1(\Omega)$, $Du_n \rightharpoonup Du$ weakly-$*$ as measures, and

$$|Du_n|(\Omega) = \int_\Omega |\nabla u_n(x)|\,dx \to |Du|(\Omega).$$

These properties (in fact, mainly Theorem 2) are sufficient to derive the existence for problem (2), as we are going to show in the next section.

2.1.3 Existence for the Rudin-Osher approach

The existence for problem (2) in dimension $N = 1$ or $N = 2$ is ensured provided we assume that

- the operator $A$ satisfies $A1 = 1$ (i.e., the image of a constant function is the same function),

- the initial data satisfies $\int_\Omega \left| g(x) - \fint_\Omega g \right|^2 dx \ge \sigma^2$,

- there exists a $\tilde u$ satisfying (1) such that $|D\tilde u|(\Omega) < +\infty$.

The first assumption is not absolutely necessary (we need that $A1 \ne 0$) but simplifies the proof a lot; it is obviously satisfied if $A$ corresponds to a convolution with a kernel of integral 1 ($Au = \rho * u$, $\int \rho = 1$), provided the boundary effects are treated correctly. The second assumption is needed; observe that if the model $g = Au + n$ is correct then it should be satisfied (with $n$ rapidly oscillating so that $\int_\Omega Au \cdot n \simeq 0$). The last assumption means that $I = \inf\{|Du|(\Omega) : u \text{ satisfies (1)}\} < +\infty$; otherwise any $u$ satisfying (1) is a solution but the problem is of little interest. In the general continuous setting the existence of such a $\tilde u$ is not absolutely obvious.

The following proof is taken from [24]. We consider a minimizing sequence $(u_n)_{n \ge 1}$ for (2), of functions $u_n$ that all satisfy the constraints and such that $|Du_n|(\Omega) \to I$ as $n \to \infty$. Such a sequence exists because of our third assumption. We assume in order to simplify the notations that $|\Omega| = 1$ (so that in particular the integral $\int_\Omega u$ of any $u$ coincides with its average $\frac{1}{|\Omega|}\int_\Omega u$). We show, first, that the average $m_n = \int_\Omega u_n$ remains bounded. This is obvious if $A$ is the identity, or has a continuous inverse. Otherwise, we can write (since $A1 = 1$)

\[ \sigma^2 = \int_\Omega |Au_n - g|^2 = \int_\Omega |Au_n - m_n + m_n - g|^2 = \int_\Omega |A(u_n - m_n) + m_n - g|^2 \]

so that

\[ \sigma \ge \|m_n - g\|_{L^2(\Omega)} - \|A(u_n - m_n)\|_{L^2(\Omega)} \ge \|m_n - g\|_{L^2(\Omega)} - \|A\|\,\|u_n - m_n\|_{L^2(\Omega)} \]

where $\|A\|$ denotes the norm of $A$ as a continuous operator of $L^2(\Omega)$. Since $N = 1$ or $2$, $2 \le N/(N-1)$ and by (11),

\[ \|u_n - m_n\|_{L^2(\Omega)} = \left\| u_n - \int_\Omega u_n(x)\,dx \right\|_{L^2(\Omega)} \le c\,|Du_n|(\Omega). \tag{12} \]

The total variation $|Du_n|(\Omega)$ remains bounded, therefore also $m_n = \int_\Omega u_n$ is bounded. This implies (using again (12)) that $u_n$ is bounded in $L^2(\Omega)$.

Upon extracting a subsequence we may thus assume that there exists $u \in L^2(\Omega) \cap BV(\Omega)$ such that $u_n \rightharpoonup u$ weakly in $L^2$ and $Du_n \rightharpoonup Du$ weakly-$*$ as a measure. We also have (since $A$ is continuous and linear) $Au_n \rightharpoonup Au$, therefore by semicontinuity we get

\[ |Du|(\Omega) \le \liminf_{n \to \infty} |Du_n|(\Omega) = I, \qquad \int_\Omega |Au(x) - g(x)|^2\,dx \le \sigma^2, \qquad \text{and} \qquad \int_\Omega Au(x)\,dx = \int_\Omega g(x)\,dx. \]

(Alternatively, we could invoke Theorem 3 to deduce that some subsequence $(u_n)$ converges to some $u$ strongly in $L^1(\Omega)$, and Theorem 1 to conclude that $|Du|(\Omega) \le \liminf_{n \to \infty} |Du_n|(\Omega) = I$.)
We now introduce for $t \in [0,1]$ the function $u^t = tu + (1-t)\int_\Omega g$. We have for every $t$, $|Du^t|(\Omega) = t|Du|(\Omega) \le tI \le I$, $\int_\Omega Au^t = \int_\Omega g$, and we have $\int_\Omega |Au^0 - g|^2 = \int_\Omega |g - \int_\Omega g|^2 \ge \sigma^2$ (by assumption), and $\int_\Omega |Au^1 - g|^2 = \int_\Omega |Au - g|^2 \le \sigma^2$. By continuity of the map $t \mapsto \int_\Omega |Au^t - g|^2$, there exists therefore a $t_0 \in [0,1]$ such that $u^{t_0}$ satisfies (1), and $|Du^{t_0}|(\Omega) \le I$. Necessarily we must have $|Du^{t_0}|(\Omega) = I$, so that $t_0 = 1$ and $u$ is the solution of problem (2).
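The intermediate-value step of this proof can be illustrated on a small discrete example. The following is only a sketch under simplifying assumptions that are not in the text: $A$ is taken to be the identity, $\Omega$ is replaced by a list of samples (so that integrals become averages), and the data `g`, the candidate `u` and the value `sigma2` are hypothetical. Since this `u` is not a minimizer, the root found is $t_0 < 1$, and $u^{t_0}$ has total variation $t_0$ times that of `u`, which is exactly why a minimizer must have $t_0 = 1$.

```python
def mean(values):
    return sum(values) / len(values)


def residual(u, g):
    """Discrete stand-in for the integral of |Au - g|^2, with A = identity and |Omega| = 1."""
    return mean([(a - b) ** 2 for a, b in zip(u, g)])


def find_t0(u, g, sigma2, tol=1e-12):
    """Bisection for t in [0, 1] such that residual(u^t, g) = sigma2,
    where u^t = t*u + (1 - t)*mean(g)."""
    g_bar = mean(g)

    def phi(t):
        u_t = [t * a + (1.0 - t) * g_bar for a in u]
        return residual(u_t, g) - sigma2

    lo, hi = 0.0, 1.0
    # The two assumptions of the proof: residual >= sigma2 at t = 0, <= sigma2 at t = 1.
    assert phi(lo) >= 0.0 >= phi(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


g = [0.0, 2.0, 0.0, 2.0, 4.0, 6.0, 4.0, 6.0]  # hypothetical noisy data
u = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]  # hypothetical candidate with the same mean as g
sigma2 = 2.0                                  # hypothetical noise level
t0 = find_t0(u, g, sigma2)
```

For these values the residual is $5 - 8t + 4t^2$, so the root in $[0,1]$ is $t_0 = 1/2$.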

2.2 More properties of BV functions

In the previous section we have just introduced the very basic properties of $BV$ functions that allowed us to state correctly problem (2) and show that it is well posed. Now, if we want to study the weak Mumford-Shah energy (7), we see that we need to know more properties of these functions. In particular, we must define correctly the discontinuity set $S_u$ and study its regularity. We also need to describe precisely the measure $Du$. This will be done in the next sections. We will not prove all the results since it is too difficult for the purpose of these lectures, but we will try to give a correct idea of these results by describing with more precision the simpler one-dimensional case.

2.2.1 The jumps set Su


Let us first introduce the approximate limits of a function at some point $x \in \Omega$. Given $u : \Omega \to [-\infty, +\infty]$ a measurable function, we can define the approximate upper limit of $u$ at $x \in \Omega$ as

\[ u^+(x) = \inf\left\{ t \in [-\infty, +\infty] \;:\; \lim_{\rho \downarrow 0} \frac{|\{y : u(y) > t\} \cap B_\rho(x)|}{\rho^N} = 0 \right\}, \]

where $B_\rho(x)$ is the ball of radius $\rho$ centered at $x$ and $|E|$ denotes the Lebesgue measure of the set $E$. $u^+(x)$ is thus the greatest lower bound of the set of values $t$ for which the set $\{u > t\}$ has (Lebesgue) density 0 at $x$: on the other hand if $t < u^+(x)$, then this set must have strictly positive density at $x$. The approximate lower limit $u^-(x)$ is defined in the same way, i.e.,

\[ u^-(x) = -(-u)^+(x) = \sup\left\{ t \in [-\infty, +\infty] \;:\; \lim_{\rho \downarrow 0} \frac{|\{y : u(y) < t\} \cap B_\rho(x)|}{\rho^N} = 0 \right\}. \]

The set

\[ S_u = \{ x \in \Omega : u^-(x) < u^+(x) \} \]

is the set of essential discontinuities of $u$; it is a (Lebesgue-)negligible Borel set. If $x \notin S_u$, we write $\tilde u(x) = u^-(x) = u^+(x) = \operatorname{ap\,lim}_{y \to x} u(y)$, and when $\tilde u(x) \neq \pm\infty$ we say that $u$ is approximately continuous at $x$.
Let us first analyse the one-dimensional case, which is simpler.

2.2.2 BV functions in one dimension


In this section, we consider a (bounded) interval $I = (a,b) \subset \mathbb{R}$ (here, $a < b$ are two real numbers). In this case the total variation of a function $u \in L^1(I)$ is simply

\[ V(u; I) = \sup\left\{ \int_I u(x) v'(x)\,dx \;:\; v \in C_c^1(I; [-1,1]) \right\}. \]

Notice that the usual classical definition is different:

\[ \mathrm{Var}(u; I) = \sup\left\{ \sum_{i=1}^{n-1} |u(t_{i+1}) - u(t_i)| \;:\; a < t_1 < \dots < t_n < b \right\} \]

and in general we do not have $V(u; I) = \mathrm{Var}(u; I)$. Indeed, the second definition depends on the pointwise values of the function $u$ and the value of $\mathrm{Var}(u; I)$ can be made infinite by changing $u$ on a set of measure zero (for instance on a sequence $(x_n)_{n \ge 1}$ of points in $I$), whereas the first definition gives the same value for two functions that are almost everywhere equal. In fact, if $u \in C^1(I)$, then clearly for every $v \in C_c^1(I; [-1,1])$, $\int_I u v' = -\int_I u' v$, and we deduce that $V(u; I) = \int_I |u'|$; in this case it is easy to show that $V(u; I) = \mathrm{Var}(u; I)$.

Exercise. Show that for every $u$, $V(u; I) \le \mathrm{Var}(u; I)$. [Hint: consider $v \in C_c^1(I; [-1,1])$ and remark that $\lim_{h \to 0} \frac{1}{h} \int_{I_h} (v(x+h) - v(x)) u(x)\,dx = \int_I v'(x) u(x)\,dx$. (Here $I_h = \{x \in I : x + h \in I\}$.) Prove then that for $h > 0$ sufficiently small, $\frac{1}{h} \int_{I_h} (v(x+h) - v(x)) u(x)\,dx \le \mathrm{Var}(u; I)$.]

In the general case we have that $V(u; I) \le \mathrm{Var}(u; I)$ and $V(u; I) = \min\{\mathrm{Var}(v; I) : v = u \text{ a.e.}\}$ (see the next exercise).

The distributional derivative of $u \in L^1(I)$ is the distribution $Du$ defined by

\[ \langle Du, \varphi \rangle_{\mathcal{D}'(I), \mathcal{D}(I)} = -\int_I u(x) \varphi'(x)\,dx \]

for every $\varphi \in \mathcal{D}(I)$ (i.e., the set $C_c^\infty(I)$ with the appropriate topology). The function $u$ is in $BV(I)$ if and only if $Du$ is a bounded Radon measure on $I$, which means that $Du \in \mathcal{M}(I) \simeq C_0(I)'$, the dual of $C_0(I)$, which is the set of continuous functions $\varphi$ on $\bar I = [a,b]$ such that $\varphi(a) = \varphi(b) = 0$. It can be proved (quite easily) that $Du$ is a bounded Radon measure on $I$ if and only if $V(u; I) < +\infty$, and that in this case we have

\[ V(u; I) = |Du|(I) = \sup\left\{ \sum_{i=1}^{n} |Du(I_i)| \;:\; \bigcup_{i=1}^{n} I_i \subset I,\ I_i \cap I_j = \emptyset \ \forall i \neq j \right\} \]

(where the sets $I_i$ are Borel sets), the right-hand side of the last equation being the standard definition of the total variation of the measure $Du$ (which is also the norm of $Du$ when it is seen as an element of the dual $C_0(I)'$). (More generally, the variation of a vector-valued (or real-valued) Borel measure $\mu$ is the Borel positive measure $|\mu|$ defined by equation (9).)
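The sensitivity of the classical definition $\mathrm{Var}(u; I)$ to pointwise values, noted earlier in this section, is easy to observe numerically. The sketch below is an illustration only: the helper names are hypothetical, and a function is represented by its values on one fixed partition $t_1 < \dots < t_n$, so the sum computed is a lower bound for the supremum defining $\mathrm{Var}(u; I)$. Changing the zero function at $m$ isolated points adds $2$ per point to this sum, while $V(u; I)$, which only sees $u$ through integrals, stays equal to $0$.

```python
def pointwise_variation(samples):
    """Sum of |u(t_{i+1}) - u(t_i)| over consecutive samples of one partition."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


def spiked_at(m):
    """Samples of the zero function modified to take the value 1 at m isolated points."""
    samples = [0.0] * (2 * m + 1)
    for i in range(1, 2 * m, 2):
        samples[i] = 1.0
    return samples


var_zero = pointwise_variation([0.0] * 9)       # the zero function
var_one_spike = pointwise_variation(spiked_at(1))   # one modified point
var_five_spikes = pointwise_variation(spiked_at(5)) # five modified points
```

Letting the number of modified points grow makes the sum arbitrarily large, in agreement with the remark that $\mathrm{Var}(u; I)$ can be made infinite by changing $u$ on a countable set.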

We now introduce two functions $u_l$ and $u_r$, defined for every $x \in I$ by

\[ u_l(x) = Du((a, x)) \qquad \text{and} \qquad u_r(x) = Du((a, x]). \]

Here, as usual, $(a, x) = \{y : a < y < x\}$ denotes the open interval of extremities $a$ and $x > a$, which is sometimes also denoted by $]a, x[$, whereas $(a, x]$ is the interval $\{y : a < y \le x\}$.


Lemma 1 The function $u_l$ is left-continuous, while $u_r$ is right-continuous. Moreover, $u_l = u_r$ except on a set at most countable.

Proof. First of all, $u_r(x) - u_l(x) = Du(\{x\})$ so that $u_r(x) = u_l(x)$ except when $x$ is an atom of the measure $Du$ (a point such that $Du(\{x\}) \neq 0$), but a bounded vector (or real-valued) measure can have at most a countable number of atoms.

To show that $u_l$ is left-continuous, for any $x \in I$ and each sequence of nonnegative numbers $\varepsilon_n \downarrow 0$ we must show that $u_l(x - \varepsilon_n)$ goes to $u_l(x)$ as $n \to \infty$. But $|u_l(x - \varepsilon_n) - u_l(x)| = |Du([x - \varepsilon_n, x))| \le |Du|([x - \varepsilon_n, x))$, and by standard properties of positive measures we know that (assuming, without loss of generality, that $\varepsilon_n$ is a decreasing sequence) $\lim_n |Du|([x - \varepsilon_n, x)) = |Du|\left(\bigcap_{n \ge 1} [x - \varepsilon_n, x)\right) = |Du|(\emptyset) = 0$. Therefore $u_l$ is left-continuous. For the same reason, $u_r$ is right-continuous (indeed, $u_r(x) = Du(I) - Du((x, b))$).

Remark. More precisely, we can show in the same way that

\[ \lim_{\varepsilon \to 0,\ \varepsilon > 0} u_l(x - \varepsilon) = \lim_{\varepsilon \to 0,\ \varepsilon > 0} u_r(x - \varepsilon) = u_l(x), \qquad \text{and} \qquad \lim_{\varepsilon \to 0,\ \varepsilon > 0} u_l(x + \varepsilon) = \lim_{\varepsilon \to 0,\ \varepsilon > 0} u_r(x + \varepsilon) = u_r(x). \]


Lemma 2 The distributional derivatives of $u_l$, $u_r$ and $u$ are equal ($Du_l = Du_r = Du$).

Proof. Let us show, for instance, that $Du_l = Du$. Consider $\varphi \in \mathcal{D}(I)$. We have (using Fubini's theorem)

\[ \int_a^b \varphi\,Du_l = -\int_a^b \varphi'(x)\,Du((a,x))\,dx = -\int_a^b \varphi'(x) \left( \int_{y \in (a,x)} Du(dy) \right) dx \]
\[ = -\int_{y \in (a,b)} \left( \int_{x \in (y,b)} \varphi'(x)\,dx \right) Du(dy) = \int_a^b \varphi(y)\,Du(dy) = \int_a^b \varphi\,Du, \]

showing the desired equality.


In particular, we deduce from the last lemma that $D(u - u_l) = D(u - u_r) = D(u_l - u_r) = 0$ so that the functions $u$, $u_l$, $u_r$ can differ at most by a constant. We can redefine the functions $u_l$ and $u_r$ by adding the appropriate constant so that $u_l = u$ and $u_r = u$ almost everywhere in $I$ (i.e., now, $u_l(x) = c + Du((a,x))$ and $u_r(x) = c + Du((a,x])$ with $c \in \mathbb{R}$ appropriately chosen to have $u_l = u_r = u$ a.e.). We have shown so far the following proposition.

Proposition 1 Every $u \in BV(I)$ has a left-continuous and a right-continuous representant$^1$.

Exercise. Show that $|Du|(I) = \mathrm{Var}(u_l; I) = \mathrm{Var}(u_r; I)$.
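The definitions of $u_l$ and $u_r$ can be made concrete when $Du$ is a finite sum of Dirac masses. The following sketch assumes a hypothetical purely atomic measure on $I = (0,1)$ (the positions and masses in `atoms` are made up for illustration) and takes the constant $c$ equal to $0$. It shows that $u_r(x) - u_l(x) = Du(\{x\})$ and that the variation of $u_l$ along a partition separating the atoms equals $|Du|(I)$; it is a numerical check on one example, not a proof of the exercise.

```python
# Hypothetical purely atomic measure Du on I = (0, 1): position -> mass
atoms = {0.25: 1.0, 0.5: -2.0, 0.75: 0.5}


def u_l(x):
    """Du((a, x)): sum of the masses of the atoms strictly to the left of x."""
    return sum(mass for pos, mass in atoms.items() if pos < x)


def u_r(x):
    """Du((a, x]): sum of the masses of the atoms to the left of x, x included."""
    return sum(mass for pos, mass in atoms.items() if pos <= x)


def pointwise_variation(samples):
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


total_variation = sum(abs(mass) for mass in atoms.values())  # |Du|(I)
jump_at_half = u_r(0.5) - u_l(0.5)                           # Du({0.5})

# A partition with one point between consecutive atoms captures every jump of u_l.
partition = [0.1, 0.3, 0.6, 0.9]
var_u_l = pointwise_variation([u_l(t) for t in partition])
```

With these values, `jump_at_half` is the mass of the atom at $0.5$, and `var_u_l` coincides with `total_variation`.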

We now introduce the function

\[ \dot u = \frac{Du}{\mathcal{L}^1} \]

which is the Radon-Nikodym derivative of the measure $Du$ with respect to the Lebesgue measure $\mathcal{L}^1$ on $I$ (in particular, $\dot u \in L^1(I)$). The Radon-Nikodym derivation theorem states that for $\mathcal{L}^1$-a.e. $x \in I$,

\[ \dot u(x) = \lim_{\varepsilon \to 0} \frac{Du([x - \varepsilon, x + \varepsilon])}{2\varepsilon} = \lim_{\varepsilon \to 0} \frac{Du((x - \varepsilon, x + \varepsilon))}{2\varepsilon} \]

and we can write the measure $Du$ as

\[ Du = \dot u(x)\,dx + D^s u \]

with $D^s u \perp \mathcal{L}^1$, which means that there exists a Borel set $E \subset I$ such that $|E| = \mathcal{L}^1(E) = 0$ and $|D^s u|(I \setminus E) = 0$. In particular the Radon-Nikodym derivative $|D^s u| / \mathcal{L}^1$ is zero, so that for $\mathcal{L}^1$-a.e. $x \in I$, $\lim_{\varepsilon \to 0} |D^s u|((x - \varepsilon, x + \varepsilon)) / (2\varepsilon) = 0$.

$^1$ We recall that a representant of a function $u \in L^1$ is a function $\tilde u$ a.e. equal to $u$, or more precisely belonging to the equivalence class of a.e. equal functions defining $u$.

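Consider now $x$ a Lebesgue point of $\dot u$, i.e., such that $\lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_{x - \varepsilon}^{x + \varepsilon} |\dot u(y) - \dot u(x)|\,dy = 0$ (a.e. $x \in I$ satisfies this property), and also assume that $\lim_{\varepsilon \to 0} |D^s u|([x - \varepsilon, x + \varepsilon]) / (2\varepsilon) = 0$. Then:

\[ \limsup_{\varepsilon \downarrow 0} \left| \frac{u_l(x + \varepsilon) - u_l(x)}{\varepsilon} - \dot u(x) \right| = \limsup_{\varepsilon \downarrow 0} \left| \frac{1}{\varepsilon} \int_x^{x + \varepsilon} \dot u(y)\,dy + \frac{D^s u([x, x + \varepsilon))}{\varepsilon} - \dot u(x) \right| \]
\[ \le \limsup_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} |D^s u|([x, x + \varepsilon)) + \limsup_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \int_x^{x + \varepsilon} |\dot u(y) - \dot u(x)|\,dy = 0. \]

In the same way, we can prove that $\limsup_{\varepsilon \downarrow 0} \left| \frac{u_l(x - \varepsilon) - u_l(x)}{-\varepsilon} - \dot u(x) \right| = 0$, showing that $u_l$ has a (classical) derivative at $x$ which is $\dot u(x)$. We have shown the following proposition.

Proposition 2 The functions $u_l$ and $u_r$ have a derivative a.e. in $I$, and $u_l'(x) = u_r'(x) = \dot u(x)$ for a.e. $x \in I$.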

Remark. In a similar way we can show that at a.e. $x$,

\[ \limsup_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_{|y - x| < \varepsilon} \frac{|u(y) - u(x) - \dot u(x)(y - x)|}{|y - x|}\,dy = 0, \tag{13} \]

which expresses the fact that $\dot u(x)$ is the approximate derivative of $u$ at $x$. This property will have a generalization in higher dimension.

2.2.3 The jumps set and the singular part of Du


Now we will try to describe better the singular part $D^s u$ of the measure $Du$, and the set $S_u$. The first property is the following.

Proposition 3 At every $x \in I$, $u^+(x) = u_l(x) \vee u_r(x)$ and $u^-(x) = u_l(x) \wedge u_r(x)$. In particular, $S_u = \{x \in I : u_l(x) \neq u_r(x)\}$.

Remark. By Lemma 1 we deduce that the set $S_u$ is at most countable. In fact, since $u_r(x) - u_l(x) = Du(\{x\})$, it is the set of the atoms of the measure $Du$.

Proof. Let us first show that $u^+(x) \ge u_l(x)$ for every $x \in I$. Let $t < u_l(x)$: by the left-continuity of $u_l$ there exists $\varepsilon > 0$ such that $x - \varepsilon < y \le x \Rightarrow u_l(y) > t$. Therefore $\{y : u_l(y) > t\} \supset (x - \varepsilon, x)$ so that if $\varepsilon' \le \varepsilon$, $\varepsilon' = |(x - \varepsilon', x)| \le |\{y : u_l(y) > t\} \cap B_{\varepsilon'}(x)| = |\{y : u(y) > t\} \cap B_{\varepsilon'}(x)|$, where the last equality comes from the fact that $u = u_l$ a.e. in $I$. We deduce that $\liminf_{\varepsilon' \downarrow 0} |\{y : u(y) > t\} \cap B_{\varepsilon'}(x)| / \varepsilon' \ge 1$ so that (by the definition of $u^+$) $t \le u^+(x)$. Thus $u_l(x) \le u^+(x)$. In the same way we get that $u_r(x) \le u^+(x)$.

Conversely let $t > u_l(x) \vee u_r(x)$. By left- and right-continuity we know that there exists $\varepsilon > 0$ such that $x - \varepsilon < y < x \Rightarrow u_l(y) < t$ and $x < y < x + \varepsilon \Rightarrow u_r(y) < t$. As before we deduce this time that $\limsup_{\varepsilon' \downarrow 0} |\{y : u(y) > t\} \cap B_{\varepsilon'}(x)| / \varepsilon' = 0$. Thus, $u^+(x) \le u_l(x) \vee u_r(x)$. This proves that $u^+ = u_l \vee u_r$ on $I$. The proof of the equality $u^- = u_l \wedge u_r$ is identical.


Now, we split the measure $D^s u$ into two parts, called respectively $Ju$ ($J$ for jumps) and $Cu$ ($C$ for Cantor):

\[ Ju = D^s u \llcorner S_u \qquad \text{and} \qquad Cu = D^s u \llcorner (I \setminus S_u). \]

(Notice that, since $|S_u| = 0$ ($S_u$ is finite or countable), $Ju$ is also $Du \llcorner S_u$.) Since $S_u$ is the set of the atoms of the measure $Du$, we have

\[ Ju = \sum_{x \in S_u} Du(\{x\})\,\delta_x = \sum_{x \in S_u} (u_r(x) - u_l(x))\,\delta_x. \]

($\delta_x$ stands for the Dirac mass at $x$.) This measure represents the jumps of $u$ across its discontinuities. It can also be written as

\[ Ju = \sum_{x \in S_u} (u^+(x) - u^-(x))\,\nu_u(x)\,\delta_x = (u^+ - u^-)\,\nu_u\,\mathcal{H}^0 \llcorner S_u \tag{14} \]

where $\nu_u(x) \in \{-1, +1\}$ represents the direction of the jump of $u$ at $x$: $\nu_u(x) = +1$ if $u_l(x) = u^-(x)$, $u_r(x) = u^+(x)$, so that $u$ is increasing at $x$ ($u_l(x) < u_r(x)$), whereas $\nu_u(x) = -1$ when $u_l(x) = u^+(x)$, $u_r(x) = u^-(x)$, meaning $u$ is decreasing at $x$ ($u_l(x) > u_r(x)$). This last expression (14) will be generalized in higher dimension.

Consider now the measure $Cu$. It has no atoms (i.e., $Cu(\{x\}) = 0$ for every $x \in I$) since $Du(\{x\}) = 0$ and $D^s u(\{x\}) = 0$ for every $x \in I \setminus S_u$. On the other hand, it is singular with respect to the Lebesgue measure $\mathcal{L}^1$ (i.e., $Cu \perp \mathcal{L}^1$). It is called the Cantor part of $u$. We will soon show an example of a function $u$ with $Du$ having a Cantor part.

Let us now return for a while to the weak Mumford-Shah functional (7). In one dimension, we can write it

\[ E(u) = \lambda \int_I |\dot u(x)|^2\,dx + \mu\,\mathcal{H}^0(S_u) + \int_I |u(x) - g(x)|^2\,dx. \]

(Here the zero-dimensional Hausdorff measure of $S_u$ is simply the cardinality $\# S_u$ of the set $S_u$.) With our definitions of $\dot u(x)$ and $S_u$, we see that the weak energy $E(u)$ is correctly defined.

However, if we try to find a minimum of $E(u)$ in the class of all functions with bounded variation, we realize that $\inf\{E(u) : u \in BV(I)\} = 0$ and that it is in general not reached! This happens because it is possible to approximate every function in $L^2(I)$ (here, $g$) by $BV$ functions such that $S_u = \emptyset$, $\dot u(x) = 0$ a.e., and all the derivative is $Cu$. A typical example of such a function is the Cantor-Vitali function, defined as the (uniform) limit of the continuous functions in $[0,1]$

\[ u_k(x) = \frac{|C_k \cap [0, x]|}{|C_k|} \quad \text{where } C_0 = [0,1],\quad C_k = C_{k-1} \setminus \bigcup_{n=0}^{3^{k-1}-1} \left( \frac{3n+1}{3^k}, \frac{3n+2}{3^k} \right) \text{ for } k \ge 1, \]

see Figure 4. The set $C = \bigcap_{k=0}^{\infty} C_k = \lim_k C_k$ is the Cantor set; it has zero length. The function $u$ is continuous, and $u' = 0$ in $[0,1] \setminus C$, i.e., almost everywhere in $(0,1)$. The derivative $Du$ is entirely supported by the negligible set $C$, and is therefore singular with respect to the Lebesgue measure. Thus $Du = Cu$.

Exercise. Show that any function $f \in L^2(0,1)$ can be approximated in $L^2$ norm by a sequence $f_n$ of functions in $BV(0,1)$ with $Df_n = Cf_n$ ($\dot f_n = 0$, $S_{f_n} = \emptyset$ for every $n$).

If we want to minimize $E(u)$, we have to restrict the set of functions we want to consider. We will therefore introduce a new subspace of $BV(I)$, made of the functions for which $Cu$ is zero.
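The approximations $u_k$ of the Cantor-Vitali function introduced above can be evaluated with a short recursion. The sketch below relies on the self-similarity $C_k = \frac{1}{3} C_{k-1} \cup \left(\frac{2}{3} + \frac{1}{3} C_{k-1}\right)$, which gives $u_k(x) = \frac{1}{2} u_{k-1}(3x)$ on $[0, 1/3]$, $u_k = \frac{1}{2}$ on the removed middle third, and $u_k(x) = \frac{1}{2} + \frac{1}{2} u_{k-1}(3x - 2)$ on $[2/3, 1]$. The function name is hypothetical and the code is an illustration, not part of the original construction.

```python
def cantor_vitali(x, k):
    """u_k(x) = |C_k ∩ [0, x]| / |C_k| for x in [0, 1], computed by self-similarity."""
    if k == 0:
        return x  # C_0 = [0, 1], so u_0(x) = x
    if x <= 1.0 / 3.0:
        return 0.5 * cantor_vitali(3.0 * x, k - 1)
    if x < 2.0 / 3.0:
        return 0.5  # removed middle third: u_k is constant there
    return 0.5 + 0.5 * cantor_vitali(3.0 * x - 2.0, k - 1)


# u_k on a regular grid: nondecreasing from 0 to 1, flat on every removed interval
grid = [i / 100.0 for i in range(101)]
values = [cantor_vitali(x, 6) for x in grid]
```

The intervals on which $u_k$ is not constant have total length $(2/3)^k$, which tends to $0$: in the limit all the growth of $u$ is concentrated on the Cantor set.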

2.2.4 Special BV functions, in dimension one

Definition. We say that a function $u \in BV(I)$ is a special function with bounded variation if $Cu = 0$, which means that the singular part $D^s u$ of the distributional derivative $Du$ is concentrated on the jump set $S_u$. We denote by $SBV(I)$ the space of such functions.

[Figure 4: The Cantor-Vitali function (curves $u_1$, $u_2$ and $u_{20}$ on $[0,1]$).]

The main tool in order to prove the existence of a minimizer for the weak Mumford-Shah energy is the following compactness and semicontinuity theorem, due to Ambrosio.

Theorem 5 (Ambrosio, one dimensional version) Let $I \subset \mathbb{R}$ be an open and bounded interval and $(u_j)$ be a sequence in $SBV(I)$. Suppose that

\[ \sup_j \int_I \dot u_j(x)^2\,dx + \mathcal{H}^0(S_{u_j}) + \|u_j\|_{L^\infty(I)} < +\infty. \]

Then there exist a subsequence (not relabeled) and a function $u \in SBV(I)$ such that

\[ u_j(x) \to u(x) \ \text{a.e. in } I, \qquad \dot u_j \rightharpoonup \dot u \ \text{weakly in } L^2(I), \qquad \mathcal{H}^0(S_u) \le \liminf_{j \to \infty} \mathcal{H}^0(S_{u_j}). \tag{15} \]

Proof. Consider such a sequence $u_j$. In what follows we will extract several subsequences from $u_j$ that will all still be denoted by $u_j$.


Remark that since $\sup_j \mathcal{H}^0(S_{u_j}) = \sup_j \# S_{u_j} < +\infty$, there exists an integer $k$ such that $k = \liminf_j \# S_{u_j}$ and we can extract a first subsequence such that $\# S_{u_j} = k$ for every $j$. We let $S_{u_j} = \{x_j^1, \dots, x_j^k\}$, with $a < x_j^1 < x_j^2 < \dots < x_j^k < b$. Extracting a further subsequence we may assume that each $x_j^n$ converges to some $x^n \in \bar I = [a,b]$. For $t \ge 0$ we will set $I_t = I \setminus \bigcup_{n=1}^{k} [x^n - t, x^n + t]$. For a fixed $\ell \ge 1$, if $j$ is large enough we have that $x_j^n \notin I_{1/\ell}$ for every $n = 1, \dots, k$. In this case, $u_j \in H^1(I_{1/\ell})$ and is uniformly bounded:

\[ \sup_j \int_{I_{1/\ell}} |u_j'|^2\,dx = \sup_j \int_{I_{1/\ell}} |\dot u_j|^2\,dx < +\infty, \qquad \text{and} \qquad \sup_j \|u_j\|_{L^\infty(I_{1/\ell})} < +\infty. \]

We can therefore extract a subsequence such that $u_j$ converges to some function $u \in H^1(I_{1/\ell})$, uniformly on $I_{1/\ell}$, and $\dot u_j = u_j' \rightharpoonup u'$ weakly in $L^2(I_{1/\ell})$. Using a diagonal procedure, since $\bigcup_{\ell \ge 1} I_{1/\ell} = I_0$, we can in this way build a function $u \in H^1_{loc}(I_0)$ such that $u_j \to u$ locally uniformly on $I_0$ and $\dot u_j \rightharpoonup u'$ weakly in $L^2_{loc}(I_0)$.

But since $\dot u_j$ is bounded in $L^2(I) = L^2(I_0)$, we deduce that $\dot u_j \rightharpoonup u'$ weakly in $L^2(I)$.

In particular, $u' \in L^2(I)$ and $u \in H^1(I \setminus \{x^1, \dots, x^k\}) \cap L^\infty(I)$, so that $u \in SBV(I)$, $\dot u = u'$, and $S_u \subset \{x^1, \dots, x^k\}$, showing also that $\# S_u \le k$ and achieving the proof of Theorem 5.

Exercise. Show that $u \in H^1(I \setminus \{x^1, \dots, x^k\}) \cap L^\infty(I) \Rightarrow u \in SBV(I)$, $\dot u = u'$, and $S_u \subset \{x^1, \dots, x^k\}$.

Exercise. Use Theorem 5 to show that the weak Mumford-Shah energy $E$ has a minimizer in $SBV(I)$.

2.2.5 The general, N-dimensional case

We return to the general case of functions defined on an open set $\Omega \subset \mathbb{R}^N$, $N \ge 1$. If $u \in BV(\Omega)$, it can be shown that the set $S_u$ is countably $(\mathcal{H}^{N-1}, N-1)$-rectifiable, i.e.,

\[ S_u = \bigcup_{i=1}^{\infty} K_i \cup \mathcal{N} \]

where $\mathcal{H}^{N-1}(\mathcal{N}) = 0$ and each $K_i$ is a compact subset of a $C^1$-hypersurface $\Gamma_i$. Note that this is a very weak notion of regularity: the set $S_u$ could still be, for instance, dense in $\Omega$.

There exists a Borel function $\nu_u : S_u \to S^{N-1}$ such that $\mathcal{H}^{N-1}$-a.e. in $S_u$ the vector $\nu_u(x)$ is normal to $S_u$ at $x$ in the sense that it is normal to $\Gamma_i$ if $x \in K_i$. For every $u, v \in BV(\Omega)$, we must therefore have $\nu_u = \pm\nu_v$ $\mathcal{H}^{N-1}$-a.e. in $S_u \cap S_v$.

As in the one-dimensional case, the derivative $Du$ of every $u \in BV(\Omega)$ can be decomposed as follows:

\[ Du = \nabla u(x)\,dx + Ju + Cu = \nabla u(x)\,dx + (u^+ - u^-)\,\nu_u\,\mathcal{H}^{N-1} \llcorner S_u + Cu \]

where $\nabla u = Du / \mathcal{L}^N$, the Radon-Nikodym derivative of $Du$ with respect to the Lebesgue measure $\mathcal{L}^N$, is also the approximate gradient of $u$, defined a.e. in $\Omega$ by

\[ \operatorname*{ap\,lim}_{y \to x} \frac{u(y) - u(x) - \langle \nabla u(x), y - x \rangle}{|y - x|} = 0 \]

(remember equation (13)). $\mathcal{H}^{N-1} \llcorner S_u$ is the restriction of the $(N-1)$-dimensional Hausdorff measure to the set $S_u$ so that $Ju = (u^+ - u^-)\,\nu_u\,\mathcal{H}^{N-1} \llcorner S_u$ is the jump part of the measure $Du$, that is carried by the discontinuity set of $u$ (compare with equation (14)). Finally, $Cu$ is the Cantor part of the measure $Du$, which is singular with respect to the Lebesgue measure and such that $|Cu|(E) = 0$ for any $(N-1)$-dimensional set $E$ with $\mathcal{H}^{N-1}(E) < +\infty$.

With these definitions of $\nabla u(x)$ and $S_u$, we see here again that the weak energy (7), $E(u)$, is correctly defined. Here again as in the one-dimensional case we have $\inf\{E(u) : u \in BV(\Omega)\} = 0$ and the infimum is usually not reached. We must consider as previously the functions $u \in BV(\Omega)$ such that $Cu$ is zero.

2.2.6 Special BV functions

Definition. We say that a function $u \in BV(\Omega)$ is a special function with bounded variation if $Cu = 0$, which means that the singular part of the distributional derivative $Du$ is concentrated on the jump set $S_u$. We denote by $SBV(\Omega)$ the space of such functions. We also define the space $GSBV(\Omega)$ of generalized $SBV$ functions as the set of all measurable functions $u : \Omega \to [-\infty, +\infty]$ such that for any $k > 0$, $u^k = (-k \vee u) \wedge k \in SBV(\Omega)$ (where $X \wedge Y = \min(X, Y)$ and $X \vee Y = \max(X, Y)$). (This follows Ambrosio's definition in [2]; notice that sometimes $GSBV(\Omega)$ is defined as the space we call hereafter $GSBV_{loc}(\Omega)$, which is the space of functions that belong to $GSBV(A)$ for any open set $A \subset\subset \Omega$, i.e., such that $\bar A$ is compact and included in $\Omega$.)

If $u \in GSBV_{loc}(\Omega) \cap L^1_{loc}(\Omega)$, $u$ has an approximate gradient a.e. in $\Omega$; moreover, as $k \uparrow \infty$, the function $u^k = (-k \vee u) \wedge k$ satisfies

\[ \nabla u^k \to \nabla u \ \text{a.e. in } \Omega, \quad \text{and} \quad |\nabla u^k| \uparrow |\nabla u| \ \text{a.e. in } \Omega, \tag{16} \]
\[ S_{u^k} \subset S_u, \quad \mathcal{H}^{N-1}(S_{u^k}) \to \mathcal{H}^{N-1}(S_u) \quad \text{and} \quad \nu_{u^k} = \nu_u \ \mathcal{H}^{N-1}\text{-a.e. in } S_{u^k}. \tag{17} \]

2.2.7 Ambrosio's compactness theorem


We mention the following compactness and lower semi-continuity result that was proved in [2]:

Theorem 6 (Ambrosio) Let $\Omega$ be an open subset of $\mathbb{R}^N$ and let $(u_j)$ be a sequence in $GSBV(\Omega)$. Suppose that there exist $p \in [1, \infty]$ and a constant $C$ such that

\[ \int_\Omega |\nabla u_j|^2\,dx + \mathcal{H}^{N-1}(S_{u_j}) + \|u_j\|_{L^p(\Omega)} \le C < +\infty \]

for every $j$. Then there exist a subsequence (not relabeled) and a function $u \in GSBV(\Omega) \cap L^p(\Omega)$ such that

\[ u_j(x) \to u(x) \ \text{a.e. in } \Omega, \qquad \nabla u_j \rightharpoonup \nabla u \ \text{weakly in } L^2(\Omega; \mathbb{R}^N), \qquad \mathcal{H}^{N-1}(S_u) \le \liminf_{j \to \infty} \mathcal{H}^{N-1}(S_{u_j}). \tag{18} \]

Moreover

\[ \int_{S_u} |\langle \nu_u, \xi \rangle|\,d\mathcal{H}^{N-1} \le \liminf_{j \to \infty} \int_{S_{u_j}} |\langle \nu_{u_j}, \xi \rangle|\,d\mathcal{H}^{N-1} \tag{19} \]

for every $\xi \in S^{N-1}$.

There exist variants of this theorem, with different proofs (see [3, 4, 5]). We need however in these lectures to consider this version, since the conclusion (19) will be useful in order to study the anisotropic variants of the Mumford-Shah functional that appear in the finite differences discretizations that are common in image processing.

Remark. By a standard diagonalization technique Theorem 6 also holds if $u_j$ and $u$ are only in $GSBV_{loc}(\Omega)$.

In this setting we are now able to show the existence of a minimizer for the weak Mumford-Shah functional of Ambrosio and De Giorgi (cf. section 2.3.1). However, first we end this section on functions with bounded variation with a paragraph about some useful additional properties.

2.2.8 Slicing
We now explain how a (special) bounded variation function can be described and its properties recovered from its 1-dimensional slices, i.e., its restrictions to 1-dimensional lines. Many results of the sections 2.2.2-2.2.4 can be extended to the $N$-dimensional case using the following properties. In fact, most of Theorem 6 (in the case $p = \infty$) can be recovered from Theorem 3 and Theorem 5 in this way, the very difficult part being to show that $\nabla u_j \rightharpoonup \nabla u$. Many of the following results will be needed in order to study the variational approximations of the Mumford-Shah functional.

We consider for $\xi \in S^{N-1}$ the sets $\xi^\perp = \{x \in \mathbb{R}^N : \langle \xi, x \rangle = 0\}$ and for any $z \in \xi^\perp$, $\Omega_{z,\xi} = \{t \in \mathbb{R} : z + t\xi \in \Omega\}$. On $\Omega_{z,\xi}$ we define a function $u_{z,\xi} : \Omega_{z,\xi} \to [-\infty, +\infty]$ by $u_{z,\xi}(s) = u(z + s\xi)$. If $u \in BV(\Omega)$, we have the following classical representation (see for instance [2, 7]): for $\mathcal{H}^{N-1}$-a.e. $z \in \xi^\perp$, $u_{z,\xi} \in BV(\Omega_{z,\xi})$ and for any Borel set $B \subset \Omega$

\[ \langle Du, \xi \rangle(B) = \langle Du(B), \xi \rangle = \int_{\xi^\perp} d\mathcal{H}^{N-1}(z)\,Du_{z,\xi}(B_{z,\xi}) \]

where $B_{z,\xi}$ is defined in the same way as $\Omega_{z,\xi}$; conversely if $u_{z,\xi} \in BV(\Omega_{z,\xi})$ for at least $N$ independent vectors $\xi \in S^{N-1}$ and $\mathcal{H}^{N-1}$-a.e. $z \in \xi^\perp$, and if

\[ \int_{\xi^\perp} d\mathcal{H}^{N-1}(z)\,|Du_{z,\xi}|(\Omega_{z,\xi}) < +\infty \]

then $u \in BV(\Omega)$. Now (see [3, 2]), if $u \in SBV_{loc}(\Omega)$, then for almost every $z \in \xi^\perp$, $u_{z,\xi} \in SBV_{loc}(\Omega_{z,\xi})$ (the converse is true provided this property is satisfied for at least $N$ independent vectors $\xi$ and $u$ has locally bounded variation), and the approximate derivative satisfies

\[ \dot u_{z,\xi}(s) = \langle \nabla u(z + s\xi), \xi \rangle \]

for a.e. $s \in \Omega_{z,\xi}$; moreover

\[ S_{u_{z,\xi}} = \{ s \in \Omega_{z,\xi} : z + s\xi \in S_u \}, \]
\[ (u_{z,\xi})^\pm(s) = u^\pm(z + s\xi) \quad \forall s \in S_{u_{z,\xi}}, \]

and for any Borel set $B \subset \Omega$

\[ \int_{\xi^\perp} d\mathcal{H}^{N-1}(z)\,\mathcal{H}^0(B_{z,\xi} \cap S_{u_{z,\xi}}) = \int_{B \cap S_u} |\langle \nu_u(x), \xi \rangle|\,d\mathcal{H}^{N-1}(x). \]

The reader interested in knowing more about the space $BV$ and how the results in this section are proved should consult, for instance, the books [6, 34, 36, 42, 62].

2.3 Back to the Mumford-Shah functional

2.3.1 Existence for the weak formulation


Now, in this setting, it is clear that the weak Mumford-Shah functional (7) has a minimizer in $GSBV(\Omega)$ (which, in fact, is in $SBV(\Omega)$). Indeed, consider a minimizing sequence $(u_j)_{j \ge 1}$ for the problem $\inf_u E(u)$, in $GSBV(\Omega)$. Then, this sequence satisfies the conditions of Theorem 6 (with $p = 2$). Therefore, some subsequence (still denoted by $u_j$) converges almost everywhere to a function $u \in GSBV(\Omega)$ with

\[ \int_\Omega |\nabla u(x)|^2\,dx \le \liminf_{j \to \infty} \int_\Omega |\nabla u_j(x)|^2\,dx \]

(since $\nabla u_j$ goes to $\nabla u$ weakly in $L^2(\Omega)$, by (18)),

\[ \mathcal{H}^{N-1}(S_u) \le \liminf_{j \to \infty} \mathcal{H}^{N-1}(S_{u_j}), \]

and

\[ \int_\Omega |u(x) - g(x)|^2\,dx \le \liminf_{j \to \infty} \int_\Omega |u_j(x) - g(x)|^2\,dx \]

(by Fatou's lemma). Therefore $E(u) \le \liminf_{j \to \infty} E(u_j)$ and $u$ is a minimizer for the weak Mumford-Shah functional. Notice that since $g$ is bounded, we can always replace $u$ with its truncation at level $\|g\|_\infty$, $(-\|g\|_\infty \vee u(x)) \wedge \|g\|_\infty$, and decrease the energy, so that the minimizer $u$ has to satisfy $\|u\|_\infty \le \|g\|_\infty$ and is in $SBV(\Omega) \cap L^\infty(\Omega)$.

In the next section we will explain how the weak problem is then related to the strong original one (that is, the minimization of $E(u, K)$).
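The truncation argument used above (replacing $u$ by $(-\|g\|_\infty \vee u) \wedge \|g\|_\infty$ does not increase the energy) can be checked on a one-dimensional discrete example. The sketch below is only a sanity check under assumptions that are not in the text: signals are lists of samples, the gradient term is replaced by a sum of squared increments, and the values of `g` and `u` are hypothetical. The jump term is not modeled here; truncation cannot create new jumps.

```python
def truncate(u, level):
    """Pointwise truncation (-level ∨ u) ∧ level."""
    return [max(-level, min(level, a)) for a in u]


def fidelity(u, g):
    """Discrete stand-in for the integral of |u - g|^2."""
    return sum((a - b) ** 2 for a, b in zip(u, g))


def dirichlet(u):
    """Sum of squared increments, a crude stand-in for the integral of |grad u|^2."""
    return sum((b - a) ** 2 for a, b in zip(u, u[1:]))


g = [-1.0, -0.4, 0.1, 1.0, 1.0, 0.8]   # hypothetical bounded data
u = [-1.5, -0.5, 0.2, 0.9, 1.8, 1.2]   # hypothetical candidate exceeding the range of g
level = max(abs(b) for b in g)         # discrete analogue of the sup norm of g
u_trunc = truncate(u, level)
```

Both `fidelity` and `dirichlet` are smaller for `u_trunc` than for `u`, because truncation is 1-Lipschitz and moves each sample closer to the interval containing the values of `g`.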


2.3.2 From the weak to the strong formulation


Once we have proved the existence of a minimizer for the weak Mumford-Shah energy $E(u)$ using Theorem 6, we need to show that it can also be considered as a minimizer for the original energy $E(u, K)$ defined by (6). In arbitrary dimension $N$ the general definition for $E$ is

\[ E(u, K) = \int_{\Omega \setminus K} |\nabla u(x)|^2\,dx + \mathcal{H}^{N-1}(K \cap \Omega) + \int_\Omega |u(x) - g(x)|^2\,dx, \]

where the length has been replaced with the $(N-1)$-dimensional Hausdorff measure. (We have also dropped the constant parameters $\lambda$, $\mu$.)

The natural way to associate a set $K$ to $u \in SBV(\Omega)$ is to set $K = \bar S_u$. However, if $u$ is arbitrary, we could have $\mathcal{H}^{N-1}(K \cap \Omega) > \mathcal{H}^{N-1}(S_u)$. For instance, the function

\[ v(x) = \sum_{k=1}^{\infty} \frac{1}{2^k}\,\chi_{B_{2^{-k}}(x_k)}, \]

where the sequence $(x_k)_{k \ge 1}$ is the set of all points in $\Omega$ with rational coordinates, is such that $S_v = \Omega \cap \bigcup_{k=1}^{\infty} \partial B_{2^{-k}}(x_k)$. This has finite length, but is dense in $\Omega$. Thus $\mathcal{H}^{N-1}(\bar S_v \cap \Omega) = +\infty$.

A minimizer $u$ of $E(u)$ will be a minimizer for $E(u, K)$ if and only if we can prove that $\mathcal{H}^{N-1}(\bar S_u \cap \Omega) = \mathcal{H}^{N-1}(S_u)$. (Conversely, it is simple to show that if $u \in H^1(\Omega \setminus K)$ and $K$ is a closed set with $\mathcal{H}^{N-1}(K) < +\infty$, then $u \in SBV(\Omega)$ and $\mathcal{H}^{N-1}(S_u \setminus K) = 0$, that is, $S_u$ is included in $K$ up to a $\mathcal{H}^{N-1}$-negligible set.)

This difficult result was proved by De Giorgi, Carriero and Leaci [29], and independently in dimension $N = 2$ by Dal Maso, Morel and Solimini [28] (see also the book [48] for a general overview of the problem). They proved that if $u$ minimizes $E(u)$, then $\mathcal{H}^{N-1}(\Omega \cap \bar S_u \setminus S_u) = 0$ and $u \in C^1(\Omega \setminus \bar S_u)$, so that $(u, \bar S_u)$ minimizes $E$.

We make here the observation that if we slightly change the problem, introducing an anisotropy in the energy, then these results still hold. Indeed, if we consider a weak functional

\[ E'(u) = \int_\Omega Q(\nabla u(x))\,dx + \int_{S_u} \mathbf{N}(\nu_u(x))\,d\mathcal{H}^{N-1}(x) + \int_\Omega |u(x) - g(x)|^2\,dx, \tag{20} \]

where $Q$ is a positive definite quadratic form in $\mathbb{R}^N$ and $\mathbf{N}$ is a norm in $\mathbb{R}^N$ (a 1-homogeneous convex function with $0 < \min_{\nu \in S^{N-1}} \mathbf{N}(\nu) \le \max_{\nu \in S^{N-1}} \mathbf{N}(\nu) < +\infty$), then $E'$ has a minimizer $u$ in $GSBV(\Omega)$ (exercise, you need to use inequality (19) in Theorem 6); moreover, it is possible to adapt the proofs in [29] and show that $\mathcal{H}^{N-1}(\Omega \cap \bar S_u \setminus S_u) = 0$ and $u \in C^1(\Omega \setminus \bar S_u)$.

2.4 Variational approximations and $\Gamma$-convergence

In these lectures we will describe a few ways to approximate the Mumford-Shah problem, or variants of this problem. This has to be done because numerically, it is difficult to deal with a jump set $K$. We introduce in this part a special notion of convergence that is adapted to variational problems. As a matter of fact, if you are looking for the minimizer of a function $F(x)$, $x \in X$ (where $X$ is some space), and want to approximate it with minimizers $(x_n)$ of approximate problems $\min_{x \in X} F_n(x)$, then when can you say that $(x_n)$ converges to a minimizer of $F$? If you consider the classical notions of limits of functions, then only the uniform convergence seems suitable to handle this problem. However, this notion of convergence is far too strong for most applications. This motivates the introduction of the following definition of $\Gamma$-convergence, specially invented for studying the limit of variational problems. We will limit ourselves to the case where $X$ is a metric space. For more details we refer to [27].

Given a metric space $(X, d)$ and $F_k : X \to [-\infty, +\infty]$ a sequence of functions, we define for every $u \in X$ the $\Gamma$-lim inf of $F_k$

\[ F'(u) = \Gamma\text{-}\liminf_{k \to \infty} F_k(u) = \inf_{u_k \to u} \liminf_{k \to \infty} F_k(u_k) \]

and the $\Gamma$-lim sup of $F_k$

\[ F''(u) = \Gamma\text{-}\limsup_{k \to \infty} F_k(u) = \inf_{u_k \to u} \limsup_{k \to \infty} F_k(u_k), \]

and we say that $F_k$ $\Gamma$-converges to $F : X \to [-\infty, +\infty]$ if $F' = F'' = F$. $F'$, $F''$, and $F$ (if it exists) are lower semi-continuous on $X$. We have the following two properties:

1. $F_k$ $\Gamma$-converges to $F$ if and only if for every $u \in X$,

   (i) for every sequence $u_k$ converging to $u$, $F(u) \le \liminf_{k \to \infty} F_k(u_k)$;

   (ii) there exists a sequence $u_k$ that converges to $u$ and such that $\limsup_{k \to \infty} F_k(u_k) \le F(u)$;

2. If $G : X \to \mathbb{R}$ is continuous and $F_k$ $\Gamma$-converges to $F$, then $F_k + G$ $\Gamma$-converges to $F + G$.

The following result makes clear the interest of the notion of $\Gamma$-convergence:

Theorem 7 Assume $F_k$ $\Gamma$-converges to $F$ and for every $k$ let $u_k$ be a minimizer of $F_k$ over $X$. Then, if the sequence (or a subsequence) $u_k$ converges to some $u \in X$, $u$ is a minimizer for $F$ and $F_k(u_k)$ converges to $F(u)$.
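A one-dimensional example helps to see why the $\Gamma$-limit, and not the pointwise limit, is the right object in Theorem 7. The sketch below uses a standard textbook sequence that does not come from these notes: on $X = \mathbb{R}$, $F_k(x) = \sqrt{2e}\,k\,x\,e^{-k^2 x^2}$. It converges pointwise to $0$, but its minimizers $x_k = -1/(k\sqrt{2})$ tend to $0$ with $F_k(x_k) = -1$; a direct computation shows that the $\Gamma$-limit is $F(0) = -1$ and $F(x) = 0$ for $x \neq 0$, whose minimizer and minimal value are exactly the limits predicted by the theorem.

```python
import math

C = math.sqrt(2.0 * math.e)


def F_k(x, k):
    """F_k(x) = sqrt(2e) * k * x * exp(-k^2 x^2); for each fixed x it tends to 0 as k grows."""
    return C * k * x * math.exp(-((k * x) ** 2))


def gamma_limit(x):
    """Gamma-limit of F_k on the real line: -1 at the origin, 0 elsewhere."""
    return -1.0 if x == 0.0 else 0.0


def minimizer(k):
    """Exact minimizer of F_k, obtained by solving F_k'(x) = 0."""
    return -1.0 / (k * math.sqrt(2.0))


for k in (1, 10, 100, 1000):
    x_k = minimizer(k)
    print(k, x_k, F_k(x_k, k), F_k(0.0, k))
```

The printed minimizers tend to $0$ while the minimal values stay at $-1$, which is the value of the $\Gamma$-limit at its minimizer; the pointwise limit at $0$ is $0$ and would give the wrong minimal value.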
Finally, we give the following definition of $\Gamma$-convergence in the case where $(F_h)_{h > 0}$ is a family of functionals on $X$ indexed by a continuous parameter $h$: we say that $F_h$ $\Gamma$-converges to $F$ in $X$ as $h \downarrow 0$ if and only if for every sequence $(h_j)$ that converges to zero as $j \to \infty$, $F_{h_j}$ $\Gamma$-converges to $F$.

The reader who would like to know more about $\Gamma$-convergence may consult the books [9, 27]. Also, the excellent notes [1] by G. Alberti contain a good introduction to this theory as well as to the applications to phase transition problems, that are very close (at least technically) to the methods and techniques of section 4.

3 The numerical analysis of the total variation minimization


3.1 The discrete energy

Let us consider problem (2), in dimension 2, and let us try to find a way to compute a solution. We will discuss the approach studied by Vogel and Oman [60, 61] (see also [24, 31]).

Although it is not absolutely obvious we will first assume that there exists a Lagrange multiplier $\lambda > 0$ such that problem (2) is equivalent to the problem

\[ \min_{u \in BV(\Omega)} |Du|(\Omega) + \lambda \int_\Omega |Au(x) - g(x)|^2\,dx \tag{21} \]

(see [24] for details; we must assume here $A1 = 1$, so that a minimizer of (21) automatically satisfies $\int_\Omega Au = \int_\Omega g$, as well as the other assumptions of section 2.1.3). The problem of determining the correct $\lambda$ is also difficult; we will not consider it in this short section.
First we must discretize (21). For simplicity we assume that $u$ and $g$ are discretized on the same square lattice, $i, j = 1, \dots, L$. (This is the case in some situations, but there exist other common situations like the reconstruction of tomographic data, or the zooming, where it is not true.) The functions $u$ and $g$ are approximated by discrete matrices $U = (U_{i,j})_{1 \le i,j \le L}$ and $G = (G_{i,j})_{1 \le i,j \le L}$. The term $\lambda \int_\Omega |Au(x) - g(x)|^2\,dx$ is replaced, in the discrete setting, by a term $\lambda \sum_{i,j} |(AU)_{i,j} - g_{i,j}|^2$. (We omit the scale factor, but it is important in the practical applications.) In the discrete formula $A$ denotes a linear operator of $\mathbb{R}^N = \mathbb{R}^{L \times L}$ (we set $N = L^2$) and $(AU)_{i,j}$ is the component $i, j$ of $AU$.

There are several ways to approximate $|Du|(\Omega)$. The simplest (which, however, has several drawbacks), is to consider the variation along the horizontal and vertical directions

\[ \sum_{1 \le i < L} \sum_{1 \le j \le L} |U_{i+1,j} - U_{i,j}| + \sum_{1 \le i \le L} \sum_{1 \le j < L} |U_{i,j+1} - U_{i,j}| \]

(here again we omitted the scale factor).

Due to the strong anisotropy of this approximation the results it gives are not very good (in fact it is an approximation of $|D_1 u|(\Omega) + |D_2 u|(\Omega)$ where in $\mathbb{R}^2$ $D_i u$ is the derivative of $u$ along the $i$-th direction, $i = 1, 2$, which is a seminorm in $BV(\Omega)$ that is equivalent to $|Du|(\Omega)$ but not invariant under rotations in the plane), and many other authors try to consider a more isotropic approximation of the total variation (see for instance [60, 33, 47]).
Therefore the discrete energy we need to minimize is the following

\[ E(U) = \sum_{i,j} \bigl( |U_{i+1,j} - U_{i,j}| + |U_{i,j+1} - U_{i,j}| \bigr) + \lambda \sum_{i,j} |(AU)_{i,j} - g_{i,j}|^2 . \tag{22} \]

3.2 The method

Due to the strong nonlinearity of (22) (or rather of its derivative $D_U E$), it is difficult (although feasible) to minimize it by a straightforward gradient descent method. The nonexistence of the derivative of the absolute value $|x|$ at $x = 0$ (a problem that is often overcome by replacing $|x|$ with $\sqrt{\beta + x^2}$, with $\beta$ a small parameter) is not the only difficulty. Another approach would be to define a dual problem, using convex duality. In the one-dimensional case and when $A$ is the identity, it leads to a very simple and efficient algorithm, but in other situations it is not very practical. (If you know a bit of convex analysis you may think about it!) For a similar approach see [25].
The solution we will study here is common in the image processing literature (see [10, 15, 38, 39, 40, 60, 61]). It is closely related to the method we will use in section 6, or to the approach in section 4.

It consists in noticing that for every $x \in \mathbb{R}$, $x \ne 0$,

\[ |x| = \min_{v > 0} \; \frac{v}{2} x^2 + \frac{1}{2v}, \]

the minimum being reached for $v = 1/|x|$. We thus introduce the function $f(x, v) = v x^2 / 2 + 1/(2v)$, a new field $V = (V_{i+1/2,j})_{1 \le i < L, 1 \le j \le L} \cup (V_{i,j+1/2})_{1 \le i \le L, 1 \le j < L} \in \mathbb{R}_+^{(L-1) \times L + L \times (L-1)}$ (of positive real numbers) and a new energy,

\begin{align*}
F(U, V) &= \sum_{i,j} \Bigl( f(|U_{i+1,j} - U_{i,j}|, V_{i+\frac12,j}) + f(|U_{i,j+1} - U_{i,j}|, V_{i,j+\frac12}) \Bigr) + \lambda \sum_{i,j} |(AU)_{i,j} - g_{i,j}|^2 \\
&= \sum_{i,j} \Bigl( \tfrac12 V_{i+\frac12,j} |U_{i+1,j} - U_{i,j}|^2 + \tfrac12 V_{i,j+\frac12} |U_{i,j+1} - U_{i,j}|^2 + \frac{1}{2 V_{i+\frac12,j}} + \frac{1}{2 V_{i,j+\frac12}} \Bigr) + \lambda \sum_{i,j} |(AU)_{i,j} - g_{i,j}|^2
\end{align*}

and we notice that

\[ \min_V F(U, V) = E(U), \]

the minimum being reached for $V_{i+1/2,j} = 1/|U_{i+1,j} - U_{i,j}|$ (or at $+\infty$ if $U_{i+1,j} = U_{i,j}$) and $V_{i,j+1/2} = 1/|U_{i,j+1} - U_{i,j}|$.
We choose some starting values $U^0$, $V^0$ and compute for every $n \ge 1$

\[ U^n = \arg\min_U F(U, V^{n-1}), \quad \text{and} \quad V^n = \arg\min_V F(U^n, V). \]

The idea is that as $n$ becomes large, $U^n$ will converge to the minimizer of (22).
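
The identity $|x| = \min_{v>0} f(x, v)$ on which this scheme rests can be checked by a direct computation. The following worked derivation is a sketch added for convenience; it uses only the definition $f(x, v) = v x^2/2 + 1/(2v)$ given above, for fixed $x \ne 0$.

```latex
\begin{align*}
\partial_v f(x, v) &= \frac{x^2}{2} - \frac{1}{2 v^2} = 0
  \quad\Longleftrightarrow\quad v = \frac{1}{|x|} \quad (v > 0), \\
\partial_v^2 f(x, v) &= \frac{1}{v^3} > 0
  \quad\text{so } v \mapsto f(x, v) \text{ is strictly convex on } (0, +\infty), \\
f\Bigl(x, \frac{1}{|x|}\Bigr) &= \frac{|x|}{2} + \frac{|x|}{2} = |x| .
\end{align*}
```

Equivalently, this is the inequality $a + b \ge 2\sqrt{ab}$ applied to $a = v x^2/2$ and $b = 1/(2v)$, with equality exactly when $a = b$.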
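This is actually true if we slightly modify the algorithm (and the function we minimize).
We choose an $\varepsilon > 0$ and introduce the convex closed set $K_\varepsilon = \{ V : \varepsilon \le V_{i+1/2,j} \le 1/\varepsilon \text{ and } \varepsilon \le V_{i,j+1/2} \le 1/\varepsilon \ \forall i, j \}$ in $\mathbb{R}^M$ ($M = (L-1) \times L + L \times (L-1)$). We define a new energy $E_\varepsilon(U) = \min_{V \in K_\varepsilon} F(U, V)$. It is possible to show that as $\varepsilon$ becomes small, the minimizer of $E_\varepsilon$ approaches the minimizer of $E$. Moreover, it is easy to compute explicitly $E_\varepsilon$:

\[ E_\varepsilon(U) = \sum_{i,j} \bigl( j_\varepsilon(U_{i+1,j} - U_{i,j}) + j_\varepsilon(U_{i,j+1} - U_{i,j}) \bigr) + \lambda \sum_{i,j} |(AU)_{i,j} - g_{i,j}|^2 \]

where

\[ j_\varepsilon(x) = \min_{\varepsilon \le v \le 1/\varepsilon} f(x, v) =
\begin{cases}
\dfrac{1}{2\varepsilon} x^2 + \dfrac{\varepsilon}{2} & \text{if } |x| \le \varepsilon, \\[1ex]
|x| & \text{if } \varepsilon \le |x| \le \dfrac{1}{\varepsilon}, \\[1ex]
\dfrac{\varepsilon}{2} x^2 + \dfrac{1}{2\varepsilon} & \text{if } |x| \ge \dfrac{1}{\varepsilon}.
\end{cases} \]

Define $v_\varepsilon(x) = (\varepsilon \vee 1/|x|) \wedge 1/\varepsilon$ ($= 1/|x|$ if $\varepsilon \le |x| \le 1/\varepsilon$, $1/\varepsilon$ if $|x| \le \varepsilon$, and $\varepsilon$ if $|x| \ge 1/\varepsilon$). Then $v_\varepsilon(x)$ is the unique value in $[\varepsilon, 1/\varepsilon]$ such that $j_\varepsilon(x) = f(x, v_\varepsilon(x))$. We deduce that the unique $V \in K_\varepsilon$ for which $E_\varepsilon(U) = \min_{K_\varepsilon} F(U, \cdot) = F(U, V)$ is given by $V_{i+\frac12,j} = v_\varepsilon(U_{i+1,j} - U_{i,j})$ and $V_{i,j+\frac12} = v_\varepsilon(U_{i,j+1} - U_{i,j})$ for every $i, j$. In this case we set $v_\varepsilon(U) = V$ and this defines a continuous function $v_\varepsilon : \mathbb{R}^N \to K_\varepsilon \subset \mathbb{R}^M$.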
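
The alternating minimization of this section (minimize $F$ in $U$ with $V$ fixed, then set $V = v_\varepsilon(U)$) can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions that are not in the text: $A$ is the identity (pure denoising), the scale factors are omitted, and dense matrices are used, so it is meant for very small images. The function names (`difference_matrix`, `v_eps`, `j_eps`, `alternate_minimization`) are hypothetical.

```python
import numpy as np

def difference_matrix(L):
    """Dense M x N matrix of nearest-neighbour differences on an L x L grid
    (N = L*L, M = 2*L*(L-1)); rows are U[i+1,j]-U[i,j], then U[i,j+1]-U[i,j]."""
    idx = np.arange(L * L).reshape(L, L)
    pairs = np.concatenate([
        np.stack([idx[:-1, :].ravel(), idx[1:, :].ravel()], axis=1),
        np.stack([idx[:, :-1].ravel(), idx[:, 1:].ravel()], axis=1),
    ])
    D = np.zeros((len(pairs), L * L))
    rows = np.arange(len(pairs))
    D[rows, pairs[:, 0]] = -1.0
    D[rows, pairs[:, 1]] = 1.0
    return D

def v_eps(x, eps):
    """Weight 1/|x| truncated to the interval [eps, 1/eps]."""
    with np.errstate(divide="ignore"):
        return np.clip(1.0 / np.abs(x), eps, 1.0 / eps)

def j_eps(x, eps):
    """j_eps(x) = f(x, v_eps(x)) with f(x, v) = v x^2 / 2 + 1 / (2 v)."""
    v = v_eps(x, eps)
    return 0.5 * v * x**2 + 0.5 / v

def alternate_minimization(g, lam, eps=1e-3, n_iter=50):
    """Assumes A = identity. Returns the last iterate U^n and the list of
    energies E_eps(U^1), ..., E_eps(U^n)."""
    L = g.shape[0]
    D = difference_matrix(L)
    gvec = np.asarray(g, dtype=float).ravel()
    V = v_eps(D @ gvec, eps)          # V^0 built from the data
    energies = []
    for _ in range(n_iter):
        # U-step: F(., V) is quadratic; its gradient D^T diag(V) D U + 2 lam (U - g)
        # vanishes at the solution of a symmetric positive definite system.
        H = D.T @ (V[:, None] * D) + 2.0 * lam * np.eye(L * L)
        U = np.linalg.solve(H, 2.0 * lam * gvec)
        # V-step: explicit minimizer over K_eps.
        V = v_eps(D @ U, eps)
        energies.append(j_eps(D @ U, eps).sum() + lam * np.sum((U - gvec) ** 2))
    return U.reshape(L, L), energies
```

Each pass decreases $E_\varepsilon(U^n) = F(U^n, V^n)$, which is the monotonicity used in the convergence proof; since $A 1_N = 1_N$ holds trivially here, the mean of $U^n$ equals the mean of the data. For realistic image sizes one would replace the dense solve by a sparse or iterative solver.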
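The algorithm, now, consists in computing for every $n \ge 1$, the starting values $U^0$, $V^0$ being chosen,

\[ U^n = \arg\min_U F(U, V^{n-1}), \quad \text{and} \quad V^n = \arg\min_{V \in K_\varepsilon} F(U^n, V) = v_\varepsilon(U^n). \]

3.3 Proof of the convergence of the algorithm

We assume (as in the continuous formulation) that $A 1_N = 1_N$, where $1_N$ is the vector in $\mathbb{R}^N$ defined by $(1_N)_{i,j} = 1$ for every $1 \le i, j \le L$ (remember $N = L \times L$ is the dimension of the space where $U$ lives). Then we have the following proposition.

Proposition 4 There exist $\bar U$, $\bar V = v_\varepsilon(\bar U)$ such that as $n \to \infty$, $U^n \to \bar U$ and $V^n \to \bar V$, and $\bar U$ is a (the) minimizer of $E_\varepsilon$.

Proof. First we claim that the following holds.

Lemma 3 There exist $0 < \alpha < \beta < +\infty$ such that the second derivatives $D^2_{UU} F$ and $D^2_{VV} F$ satisfy

\[ \alpha I_N \le D^2_{UU} F(U, V) \le \beta I_N \quad \text{and} \quad \alpha I_M \le D^2_{VV} F(U, V) \le \beta I_M \]

for every $U \in \mathbb{R}^N$ and $V \in K_\varepsilon$.

This is equivalent to saying that for every $U \in \mathbb{R}^N$, $V \in K_\varepsilon$, $\xi \in \mathbb{R}^N$ and $\eta \in \mathbb{R}^M$, $\alpha |\xi|^2 \le \langle D^2_{UU} F(U, V) \xi, \xi \rangle \le \beta |\xi|^2$ and $\alpha |\eta|^2 \le \langle D^2_{VV} F(U, V) \eta, \eta \rangle \le \beta |\eta|^2$. Here $\alpha$ and $\beta$ both depend on $\varepsilon$.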

Proof. We will leave to the reader the proof of three of the inequalities of the lemma and will prove the first one, which is the more difficult. We first recall the following Poincaré inequality (in finite dimension): there exists a constant $c > 0$ such that for every $\xi \in \mathbb{R}^N = \mathbb{R}^{L \times L}$ such that $\sum_{i,j} \xi_{i,j} = 0$,

\[ \sum_{1 \le i,j \le L} |\xi_{i,j}|^2 \le c \Bigl( \sum_{1 \le i < L,\, j} |\xi_{i+1,j} - \xi_{i,j}|^2 + \sum_{i,\, 1 \le j < L} |\xi_{i,j+1} - \xi_{i,j}|^2 \Bigr). \tag{23} \]
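
As a numerical illustration of (23) (not a proof), one can observe that the right-hand side is the quadratic form of the graph Laplacian of the $L \times L$ grid, so the best constant $c$ is the inverse of its smallest nonzero eigenvalue. The sketch below assumes NumPy and a small $L$; the names `grid_laplacian` and `poincare_constant` are hypothetical.

```python
import numpy as np

def grid_laplacian(L):
    """Matrix Lap with <xi, Lap xi> equal to the sum of squared
    nearest-neighbour differences of xi on the L x L grid."""
    T = 2.0 * np.eye(L) - np.eye(L, k=1) - np.eye(L, k=-1)
    T[0, 0] = T[-1, -1] = 1.0          # end points have a single neighbour
    I = np.eye(L)
    return np.kron(T, I) + np.kron(I, T)

def poincare_constant(L):
    """Best constant in (23): 1 / (second smallest eigenvalue);
    the smallest eigenvalue is 0, with the constant vectors as eigenspace."""
    eig = np.linalg.eigvalsh(grid_laplacian(L))   # ascending order
    return 1.0 / eig[1]
```

The value returned agrees with the closed form $1/(4 \sin^2(\pi/(2L)))$ coming from the known spectrum of the path graph, so the constant grows roughly like $L^2/\pi^2$ with the grid size.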

Exercise. Prove this inequality. Hint: suppose it is not true and consider a sequence $\xi^n$ such that for every $n$, $\sum_{i,j} \xi^n_{i,j} = 0$ and $1 = \sum_{i,j} |\xi^n_{i,j}|^2 \ge n \sum_{i,j} \bigl( |\xi^n_{i+1,j} - \xi^n_{i,j}|^2 + |\xi^n_{i,j+1} - \xi^n_{i,j}|^2 \bigr)$. Then, if $\xi$ is the limit of a subsequence $\xi^{n_k}$ of $\xi^n$, find a contradiction on $\xi$.
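Notice that for every $U$, $V \in K_\varepsilon$ and $\xi \in \mathbb{R}^N$,

\begin{align*}
\langle D^2_{UU} F(U, V) \xi, \xi \rangle
&= \sum_{i,j} \bigl( V_{i+\frac12,j} |\xi_{i+1,j} - \xi_{i,j}|^2 + V_{i,j+\frac12} |\xi_{i,j+1} - \xi_{i,j}|^2 \bigr) + 2\lambda |A\xi|^2 \\
&\ge \varepsilon \sum_{i,j} \bigl( |\xi_{i+1,j} - \xi_{i,j}|^2 + |\xi_{i,j+1} - \xi_{i,j}|^2 \bigr) + 2\lambda |A\xi|^2 .
\end{align*}

In particular, letting $m(\xi) = (1/N) \sum_{i,j} \xi_{i,j}$ be the average of $\xi$, we have (since $A 1_N = 1_N$)

\[ \sqrt{\langle D^2_{UU} F(U, V) \xi, \xi \rangle} \ge \sqrt{2\lambda}\, |A\xi| = \sqrt{2\lambda}\, |A(\xi - m(\xi) 1_N) + m(\xi) 1_N| \ge \sqrt{2\lambda} \bigl( |m(\xi) 1_N| - |A|\, |\xi - m(\xi) 1_N| \bigr). \]

But by (23), $|\xi - m(\xi) 1_N|^2 \le c \sum_{i,j} \bigl( |\xi_{i+1,j} - \xi_{i,j}|^2 + |\xi_{i,j+1} - \xi_{i,j}|^2 \bigr) \le (c/\varepsilon) \langle D^2_{UU} F(U, V) \xi, \xi \rangle$, therefore $|m(\xi) 1_N| \le c \sqrt{\langle D^2_{UU} F(U, V) \xi, \xi \rangle}$ (here $c$ denotes any positive constant that does not depend on $U$, $V$, $\xi$). Moreover, using again (23), $c \langle D^2_{UU} F(U, V) \xi, \xi \rangle \ge |\xi - m(\xi) 1_N|^2$. Since $1_N$ and $\xi - m(\xi) 1_N$ are orthogonal we deduce that $|\xi|^2 \le c \langle D^2_{UU} F(U, V) \xi, \xi \rangle$.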

Remark.

Observe the identity between this proof and the proof of the

coerciveness of the energy in section 2.1.3.


We next prove the following lemma.

Lemma 4 For every $n \ge 1$,

\[ E_\varepsilon(U^{n-1}) - E_\varepsilon(U^n) \ge \frac{\alpha}{2} \bigl( |U^{n-1} - U^n|^2 + |V^{n-1} - V^n|^2 \bigr). \]

Proof. For every $n \ge 1$, we have $D_U F(U^n, V^{n-1}) = 0$ while

\[ \langle D_V F(U^n, V^n), V - V^n \rangle \ge 0 \]

for every $V \in K_\varepsilon$. We deduce that (using Lemma 3)

\begin{align*}
F(U^n, V^{n-1}) &= F(U^n, V^n) + \langle D_V F(U^n, V^n), V^{n-1} - V^n \rangle \\
&\quad + \int_0^1 (1 - t) \bigl\langle D^2_{VV} F(U^n, V^n + t(V^{n-1} - V^n)) (V^{n-1} - V^n), V^{n-1} - V^n \bigr\rangle \, dt \\
&\ge F(U^n, V^n) + \frac{\alpha}{2} |V^{n-1} - V^n|^2 .
\end{align*}

In a similar way we prove that $F(U^{n-1}, V^{n-1}) \ge F(U^n, V^{n-1}) + \frac{\alpha}{2} |U^{n-1} - U^n|^2$. Since $E_\varepsilon(U^n) = F(U^n, V^n)$, the lemma is proved.
Since by construction the sequence $E_\varepsilon(U^n) = F(U^n, V^n)$ must decrease and is bounded from below, it must have a limit in $\mathbb{R}_+$ and $E_\varepsilon(U^{n-1}) - E_\varepsilon(U^n) \to 0$, therefore $U^{n-1} - U^n$ and $V^{n-1} - V^n$ go to zero as $n \to \infty$.
Now, we notice that $E_\varepsilon$ is coercive, which means that for every $c > 0$ the set $\{E_\varepsilon \le c\}$ is bounded in $\mathbb{R}^N$ (and closed), hence compact (this can be deduced from Lemma 3). Thus we may extract a subsequence $U^{n_k}$ and find a $\bar U \in \mathbb{R}^N$ such that as $k \to \infty$, $U^{n_k} \to \bar U$. By continuity $V^{n_k} = v_\varepsilon(U^{n_k}) \to v_\varepsilon(\bar U)$, and we let $\bar V = v_\varepsilon(\bar U)$. We also have $D_U F(U^{n_k}, V^{n_k - 1}) = 0$ and since $V^{n_k - 1} - V^{n_k} \to 0$ by Lemma 4, $V^{n_k - 1} \to \bar V$ so that by continuity, $D_U F(\bar U, \bar V) = 0$.

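We conclude using the following lemma.

Lemma 5 Let $\bar U$, $\bar V$ satisfy $D_U F(\bar U, \bar V) = 0$ and $\bar V = \arg\min_{V \in K_\varepsilon} F(\bar U, V) = v_\varepsilon(\bar U)$. Then $D_U E_\varepsilon(\bar U) = 0$.

Proof. Let $h \in \mathbb{R}^N$ and $t > 0$. Letting $V_t = v_\varepsilon(\bar U + th)$ (that goes to $\bar V$ as $t \downarrow 0$), we have

\begin{align*}
E_\varepsilon(\bar U + th) - E_\varepsilon(\bar U) &= F(\bar U + th, v_\varepsilon(\bar U + th)) - F(\bar U, \bar V) \\
&= \bigl( F(\bar U + th, V_t) - F(\bar U, V_t) \bigr) + \bigl( F(\bar U, V_t) - F(\bar U, \bar V) \bigr).
\end{align*}

Since $V_t \in K_\varepsilon$, $F(\bar U, V_t) \ge F(\bar U, \bar V)$ so that $E_\varepsilon(\bar U + th) - E_\varepsilon(\bar U) \ge F(\bar U + th, V_t) - F(\bar U, V_t)$. Since $F(\bar U + th, V_t) - F(\bar U, V_t) = t \langle D_U F(\bar U, V_t), h \rangle + \int_0^t (t - s) \langle D^2_{UU} F(\bar U, V_t) h, h \rangle \, ds$ (recall that $F$ is quadratic in $U$) and $\int_0^t (t - s) \langle D^2_{UU} F(\bar U, V_t) h, h \rangle \, ds \le \beta t^2 |h|^2 / 2$, while $D_U F(\bar U, V_t) \to D_U F(\bar U, \bar V)$ by continuity, we deduce that

\[ \langle D_U E_\varepsilon(\bar U), h \rangle = \lim_{t \downarrow 0} \frac{E_\varepsilon(\bar U + th) - E_\varepsilon(\bar U)}{t} \ge \langle D_U F(\bar U, \bar V), h \rangle = 0 . \]

Since $h$ is arbitrary (apply the inequality to both $h$ and $-h$), $D_U E_\varepsilon(\bar U) = 0$.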

Since $E_\varepsilon$ is ($C^1$ and) strictly convex² (meaning that for every $U$, $U'$ and $0 < \theta < 1$, $E_\varepsilon(\theta U + (1 - \theta) U') < \theta E_\varepsilon(U) + (1 - \theta) E_\varepsilon(U')$ unless $U = U'$), it has a unique minimizer characterized by the equation $D_U E_\varepsilon = 0$. We deduce that $\bar U$ is the unique minimizer of $E_\varepsilon$. This achieves the proof of Proposition 4, since by uniqueness of this minimizer any subsequence of $(U^n)$ must converge to the same value $\bar U$, so that the whole sequence $(U^n)$ converges to $\bar U$.

² We have to assume that $A$ is injective. Otherwise, $\bar U$ is still a minimizer of $E_\varepsilon$ but might be not unique.

Exercise. Prove that as $\varepsilon \to 0$, $E_\varepsilon$ $\Gamma$-converges to $E$, so that the minimizer of $E_\varepsilon$ tends to the minimizer of $E$.

3.4 Two examples

Figure 5: A noisy image and the reconstruction.

We just show two examples of image denoising (i.e., $A$ is the identity matrix) obtained with this method (these are taken from [24]). The first one (Fig. 5) represents the reconstruction of a piecewise constant image on which a Gaussian noise (of standard deviation 60 for values between 0 and 255) has been added. On the left, the noisy image is presented, while on the right the reconstruction with an appropriate value of $\lambda$ is shown. The second example (Fig. 6) shows a true picture that has been corrupted by some Gaussian noise (here the standard deviation is approximately 30, always for values between 0 and 255). The reconstruction (on the right) is less good, since the total variation tends to favor piecewise constant images, so that the result is a bit blocky.

Figure 6: Another example of total variation minimization.

4 The numerical analysis of the Mumford-Shah problem (I)

4.1 Ambrosio and Tortorelli's approximate energy

Now we will describe the first attempt that has been made to provide an approximation of the Mumford-Shah functional by simpler (elliptic) variational problems. This result is due to Ambrosio and Tortorelli (see [7] and [8]). They have proposed to replace the set $K$ by a function $v(x)$, and design energies that depend on a scale parameter $\varepsilon$, so that as $\varepsilon$ goes to zero the function $1 - v(x)$, in some sense, becomes an approximation of the characteristic function of the discontinuity set $K$. Their approximation $F_\varepsilon(u, v)$, defined over the space $L^2(\Omega) \times L^2(\Omega)$, is the following

\[ F_\varepsilon(u, v) = \int_\Omega (v(x)^2 + k_\varepsilon) |\nabla u(x)|^2 \, dx + \int_\Omega \varepsilon |\nabla v(x)|^2 + \frac{(1 - v(x))^2}{4\varepsilon} \, dx + \int_\Omega |u(x) - g(x)|^2 \, dx \tag{24} \]

for $u, v \in H^1(\Omega)$, and they set $F_\varepsilon(u, v) = +\infty$ if $u$ or $v$ is in $L^2(\Omega) \setminus H^1(\Omega)$.

The parameter $k_\varepsilon > 0$ is needed in order to have that for $\varepsilon > 0$, $F_\varepsilon$ is coercive in $H^1(\Omega) \times H^1(\Omega)$ (i.e., greater than a constant times the norm of $(u, v)$ in this space). It has to go to zero faster than $\varepsilon$ as $\varepsilon$ goes to zero (i.e., $\lim_{\varepsilon \downarrow 0} k_\varepsilon / \varepsilon = 0$); the reason for this will be made clear in the proof.
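
To make the energy (24) concrete, here is a minimal finite-difference evaluation of it in dimension one. It is only a sketch under assumptions that are not in the text: a uniform grid of step `dx`, forward differences for the derivatives, and $v$ averaged on each cell in the first term. The function name `at_energy_1d` and its parameters are hypothetical.

```python
import numpy as np

def at_energy_1d(u, v, g, eps, k_eps, dx):
    """Discrete value of the Ambrosio-Tortorelli energy (24) for signals
    u, v, g sampled on a uniform 1-D grid of step dx."""
    u, v, g = (np.asarray(a, dtype=float) for a in (u, v, g))
    du = np.diff(u) / dx                    # forward differences of u
    dv = np.diff(v) / dx                    # forward differences of v
    v_mid = 0.5 * (v[:-1] + v[1:])          # v averaged over each cell
    elastic = dx * np.sum((v_mid**2 + k_eps) * du**2)
    edge = dx * np.sum(eps * dv**2) + dx * np.sum((1.0 - v) ** 2) / (4.0 * eps)
    fidelity = dx * np.sum((u - g) ** 2)
    return elastic + edge + fidelity
```

With $u = g$ constant and $v \equiv 1$ the value is zero, while $v \equiv 0$ costs about $|\Omega| / (4\varepsilon)$: the energy only tolerates $v$ far from 1 on a set whose size is of order $\varepsilon$.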
In order to show that $F_\varepsilon$ $\Gamma$-converges to an energy such as the weak Mumford-Shah energy $E$, they need to redefine it on the same space as $F_\varepsilon$, that is, $L^2(\Omega) \times L^2(\Omega)$. To do this, they consider the fact that as $\varepsilon$ goes to zero, they want $v$ to become almost everywhere equal to 1, and thus they set for every $u, v \in L^2(\Omega)$:

\[ F(u, v) =
\begin{cases}
\displaystyle \int_\Omega |\nabla u(x)|^2 \, dx + \mathcal{H}^{N-1}(S_u) + \int_\Omega (u(x) - g(x))^2 \, dx & \text{if } u \in GSBV(\Omega),\ v(x) = 1 \text{ a.e. in } \Omega, \\
+\infty & \text{otherwise.}
\end{cases} \]

Notice that $F(u, v)$ is just $E(u)$ when $v = 1$ a.e., and $+\infty$ otherwise. Then, they are able to show the following theorem.

Theorem 8 (Ambrosio-Tortorelli) As $\varepsilon$ goes to zero, $F_\varepsilon$ $\Gamma$-converge to $F$.

In particular, this means that for small $\varepsilon$, the minimizers of $F_\varepsilon$ will be close to minimizers of $F$.

This approximation has been used in efforts to compute image segmentations and for other applications (see [16, 37, 52], that are based on a finite-element version of Theorem 8 established by Bellettini and Coscia [13], and [57, 56, 58, 59] where in particular Shah considers the gradient flow of $F_\varepsilon(u, v)$ and of similar energies). It works quite well; however, it has the feature that the approximation of the Mumford-Shah functional will be correct only if the discretization step (the pixels' size) is much smaller than the scale parameter $\varepsilon$. This is not very convenient for most applications. For this reason, we will study in the following sections the problem from a different point of view, considering the original discrete energies ($E(U, L)$ or $E(U)$) and explain how one can find the energy they approximate in the continuous setting (which will be a variant of the Mumford-Shah energy).
We quickly explain in one dimension how the proof of Theorem 8 goes. In dimension greater than one, the proof is obtained mostly through a localization and a slicing argument, that we will briefly mention in the subsequent section.

4.2 Sketch of the proof of Ambrosio and Tortorelli's theorem, in dimension one

Basically, in order to show Theorem 8, you have to choose any sequence $(\varepsilon_j)_{j \ge 1}$ of positive numbers with $\lim_{j \to \infty} \varepsilon_j = 0$, and to prove that for any $(u, v)$,

i) if $(u_j, v_j)$ goes to $(u, v)$ (in $L^2$-norm) as $j \to \infty$, then $F(u, v) \le \liminf_{j \to \infty} F_{\varepsilon_j}(u_j, v_j)$, and

ii) there exists $(u_j, v_j)$ that converges to $(u, v)$ such that $\limsup_{j \to \infty} F_{\varepsilon_j}(u_j, v_j) \le F(u, v)$.

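4.2.1 Proof of (i)

To prove (i), we consider a sequence $(u_j, v_j)$. We can of course assume that $\liminf_{j \to \infty} F_{\varepsilon_j}(u_j, v_j) < +\infty$ (otherwise there is nothing to prove), and we can extract a subsequence such that $\liminf_{j \to \infty} F_{\varepsilon_j}(u_j, v_j) = \lim_{j \to \infty} F_{\varepsilon_j}(u_j, v_j)$. (Unless really needed, i.e., if we need to keep track of the original sequence or compare two different subsequences, we will always denote the subsequences like the original sequence.) In particular we have $c = \sup_j F_{\varepsilon_j}(u_j, v_j) < +\infty$, thus $\int_\Omega (1 - v_j(x))^2 \, dx \le 4 c \varepsilon_j \to 0$ as $j \to \infty$. Then, since $v_j$ goes to 1 in $L^2(\Omega)$, we must have $v = 1$ a.e. We then just need to show that $E(u) \le \liminf_{j \to \infty} F_{\varepsilon_j}(u_j, v_j)$. Since it is clear that $\int_\Omega (u_j(x) - g(x))^2 \, dx$ goes to $\int_\Omega (u(x) - g(x))^2 \, dx$ as $j \to \infty$, we have to show that $u \in GSBV(\Omega)$ and estimate the other two terms, $\mathcal{H}^0(S_u) = \sharp S_u$ (notice that the measure $\mathcal{H}^0(S_u)$ is the cardinality of the set $S_u$, that we also denote by $\sharp S_u$), and $\int_\Omega \dot u(x)^2 \, dx$.
Now let $\Sigma$ be the set of points $x \in \Omega$ such that for every $\delta > 0$ (small, so that $(x - \delta, x + \delta) \subset \Omega$), $u \notin H^1(x - \delta, x + \delta)$. We will show that this set is finite. Indeed, choose $x_1, \dots, x_k \in \Sigma$ with $x_1 < x_2 < \dots < x_k$. Choose also $\delta$ such that $x_i + \delta < x_{i+1} - \delta$ for every $i = 1, \dots, k - 1$ and $(x_i - \delta, x_i + \delta) \subset \Omega$ for every $i$.
Then, we must have for every $i$ that $\limsup_{j \to \infty} \inf_{B_{\delta/2}(x_i)} v_j = 0$. Otherwise, there exist $i$ and a subsequence $u_{j_k}, v_{j_k}$ such that $\lim_{k \to \infty} \inf_{B_{\delta/2}(x_i)} v_{j_k} = \alpha > 0$, and $v_{j_k} \ge \alpha/2$ in $B_{\delta/2}(x_i)$ for $k$ large enough. But then, $\int_{x_i - \delta/2}^{x_i + \delta/2} (v_{j_k}(x)^2 + k_{\varepsilon_{j_k}}) u'_{j_k}(x)^2 \, dx \le c$ implies that $\int_{x_i - \delta/2}^{x_i + \delta/2} u'_{j_k}(x)^2 \, dx \le 4c/\alpha^2$, so that $u_{j_k}$ is uniformly bounded in $H^1(x_i - \delta/2, x_i + \delta/2)$. In this case, its limit $u$ also has to be in $H^1(x_i - \delta/2, x_i + \delta/2)$, and this is in contradiction with the choice of $x_i \in \Sigma$.
Now, for every $i$ we know that $\limsup_{j \to \infty} \inf_{B_{\delta/2}(x_i)} v_j = 0$. We also know that $v_j \to 1$ in $L^2(\Omega)$. Therefore if we fix $i$ and choose $\eta > 0$ small, there will exist for $j$ large enough a point $x_i(j) \in (x_i - \delta/2, x_i + \delta/2)$ with $v_j(x_i(j)) < \eta$, and $x'_i(j) \in (x_i - \delta, x_i - \delta/2)$ and $x''_i(j) \in (x_i + \delta/2, x_i + \delta)$ with $v_j(x'_i(j)) > 1 - \eta$ and $v_j(x''_i(j)) > 1 - \eta$. We then have (using the fact that $A^2 + B^2 \ge 2AB$)

\begin{align*}
\int_{x_i - \delta}^{x_i + \delta} \varepsilon_j v'_j(x)^2 + \frac{(1 - v_j(x))^2}{4\varepsilon_j} \, dx
&\ge \int_{x_i - \delta}^{x_i + \delta} |v'_j(x)| \, |1 - v_j(x)| \, dx \\
&\ge \int_{x'_i(j)}^{x_i(j)} |1 - v_j(x)| \, |v'_j(x)| \, dx + \int_{x_i(j)}^{x''_i(j)} |1 - v_j(x)| \, |v'_j(x)| \, dx \\
&\ge \frac{(1 - \eta)^2 - \eta^2}{2} + \frac{(1 - \eta)^2 - \eta^2}{2} = 1 - 2\eta .
\end{align*}

In particular, we get that for $j$ large enough $\int_\Omega \varepsilon_j v'_j(x)^2 + \frac{(1 - v_j(x))^2}{4\varepsilon_j} \, dx \ge (1 - 2\eta) k$. Since this is valid for an arbitrary finite subset $\{x_1, \dots, x_k\}$ of $\Sigma$, it shows that $\sharp\Sigma < +\infty$, and more precisely,

\[ \sharp\Sigma \, (1 - 2\eta) \le \liminf_{j \to \infty} \int_\Omega \varepsilon_j v'_j(x)^2 + \frac{(1 - v_j(x))^2}{4\varepsilon_j} \, dx \le c = \sup_j F_{\varepsilon_j}(u_j, v_j) < +\infty . \]

This is true for every $\eta > 0$, so that eventually

\[ \sharp\Sigma \le \liminf_{j \to \infty} \int_\Omega \varepsilon_j v'_j(x)^2 + \frac{(1 - v_j(x))^2}{4\varepsilon_j} \, dx \]

holds.
Now, by the definition of $\Sigma$, $u \in H^1_{loc}(\Omega \setminus \Sigma)$, and we need to find an estimate for $\int_{\Omega \setminus \Sigma} u'(x)^2 \, dx$. Indeed, if we knew that $\int_{\Omega \setminus \Sigma} u'(x)^2 \, dx < \infty$ (i.e., $u \in H^1(\Omega \setminus \Sigma)$), then it would yield $u \in SBV(\Omega)$ and $S_u = \Sigma$, $\dot u = u'$.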

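Notice that the proof we have just written could easily be transformed to lead to the following lemma:

Lemma 6 Let $\eta > 0$. Then, for every $\delta > 0$, there exists $J$ such that for every $j \ge J$, if $x_1 < x_2 < \dots < x_k$ are such that $v_j(x_i) < \eta$ and $x_{i+1} - x_i > \delta$, then $k \le c/(1 - 2\eta)$.

We leave the proof of this lemma to the reader (use the same arguments as in the proof above, after having chosen $J$ such that if $j \ge J$, $|\{x : v_j(x) < 1 - \eta\}| < \delta$).
In this case, once $\eta > 0$ is chosen, we can choose $\delta > 0$ and select for all $j \ge J$ a maximal set $x_1(j) < x_2(j) < \dots < x_{k(j)}(j)$, with $x_{i+1}(j) - x_i(j) > \delta$ and $v_j(x_i(j)) < \eta$, and we have $k(j) \le c/(1 - 2\eta)$. Therefore, there exist $k \le c/(1 - 2\eta)$, a subsequence $(u_{j_l}, v_{j_l})$, and $k$ points $x_1 < x_2 < \dots < x_k$, such that $k(j_l) = k$ for all $l$ and $x_i(j_l) \to x_i$ as $l \to \infty$ for every $i = 1, \dots, k$.
If $l$ is large enough we thus have (by the maximality of the set $\{x_1(j), \dots, x_{k(j)}(j)\}$) that $v_{j_l} \ge \eta$ in the open set $\Omega_\delta = \Omega \setminus \bigcup_{i=1}^k [x_i - 2\delta, x_i + 2\delta]$, so that $\int_{\Omega_\delta} u'_{j_l}(x)^2 \, dx \le c/\eta^2$, and since $u_{j_l}$ goes to $u$ in $L^2(\Omega)$ we know that this implies the convergence of $u'_{j_l}$ to $u'$ weakly in $L^2(\Omega_\delta)$.
If $w \in L^2(\Omega_\delta)$, the functions $w \sqrt{k_{\varepsilon_{j_l}} + v_{j_l}^2}$ go to $w$ strongly in $L^2(\Omega_\delta)$, since $v_{j_l} \to 1$ in $L^2(\Omega)$. Thus,

\[ \lim_{l \to \infty} \int_{\Omega_\delta} \sqrt{k_{\varepsilon_{j_l}} + v_{j_l}(x)^2} \, u'_{j_l}(x) w(x) \, dx = \lim_{l \to \infty} \int_{\Omega_\delta} u'_{j_l}(x) \Bigl( w(x) \sqrt{k_{\varepsilon_{j_l}} + v_{j_l}(x)^2} \Bigr) dx = \int_{\Omega_\delta} u'(x) w(x) \, dx, \]

so that $\sqrt{k_{\varepsilon_{j_l}} + v_{j_l}(x)^2} \, u'_{j_l}(x)$ goes weakly to $u'$ in $L^2(\Omega_\delta)$, and

\[ \int_{\Omega_\delta} u'(x)^2 \, dx \le \liminf_{l \to \infty} \int_{\Omega_\delta} (k_{\varepsilon_{j_l}} + v_{j_l}(x)^2) u'_{j_l}(x)^2 \, dx \le c < +\infty . \]

Since $\delta$ is arbitrary we can deduce that

\[ \int_{\Omega \setminus \Sigma} u'(x)^2 \, dx \le \liminf_{j \to \infty} \int_\Omega (k_{\varepsilon_j} + v_j(x)^2) u'_j(x)^2 \, dx \]

and in particular $u \in H^1(\Omega \setminus \Sigma)$. This yields $u \in SBV(\Omega)$, $S_u = \Sigma$, $\dot u = u'$, and therefore

\begin{align*}
\int_\Omega \dot u(x)^2 \, dx + \sharp\Sigma + \int_\Omega (u(x) - g(x))^2 \, dx
\le \liminf_{j \to \infty} \Bigl[ & \int_\Omega (k_{\varepsilon_j} + v_j(x)^2) u'_j(x)^2 \, dx + \int_\Omega \varepsilon_j v'_j(x)^2 + \frac{(1 - v_j(x))^2}{4\varepsilon_j} \, dx \\
& + \int_\Omega (u_j(x) - g(x))^2 \, dx \Bigr],
\end{align*}

which was the thesis we wanted to prove, and (i) is true.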

4.2.2 Proof of (ii)

To prove (ii), we consider $u \in SBV(\Omega)$ with $E(u) = F(u, 1) < +\infty$ (otherwise there is nothing to prove). In this case, $S_u$ is a finite set, and $u \in H^1(\Omega \setminus S_u)$ (in particular it is continuous everywhere but in $S_u$). In order to simplify the proof we will consider that $\Omega = (-1, 1)$ and that $u$ has just one jump at point 0, but the study of the general case can be localized in small intervals around each discontinuity of $u$ so that it is not very different.
Consider the function $\gamma(t) = 1 - \exp(-t/2)$ for $t \ge 0$. We leave the following result to the reader:

Exercise. Prove that $v_\varepsilon(x) = \gamma(x/\varepsilon)$ minimizes

\[ \int_0^{+\infty} \varepsilon v'(x)^2 + \frac{(1 - v(x))^2}{4\varepsilon} \, dx \]

on the set $\{ v \in L^2_{loc}(0, +\infty),\ v' \in L^2(0, +\infty),\ v(0) = 0 \}$, and that the value of the minimum is $1/2$.
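
The value $1/2$ in this exercise can at least be checked by hand. The following computation is a sketch added for convenience (the symbol $\gamma$ is used here for the profile $t \mapsto 1 - \exp(-t/2)$); it evaluates the integral on the candidate profile and recalls the lower bound, valid for any admissible $v$ of finite energy, obtained from $A^2 + B^2 \ge 2AB$ as in the proof of (i).

```latex
\begin{align*}
v_\varepsilon(x) = 1 - e^{-x/(2\varepsilon)}
  &\;\Longrightarrow\;
  \varepsilon\, v_\varepsilon'(x)^2 = \frac{e^{-x/\varepsilon}}{4\varepsilon}
  = \frac{(1 - v_\varepsilon(x))^2}{4\varepsilon}, \\
\int_0^{+\infty} \varepsilon\, v_\varepsilon'(x)^2
  + \frac{(1 - v_\varepsilon(x))^2}{4\varepsilon}\, dx
  &= \int_0^{+\infty} \frac{e^{-x/\varepsilon}}{2\varepsilon}\, dx = \frac12, \\
\int_0^{+\infty} \varepsilon\, v'(x)^2 + \frac{(1 - v(x))^2}{4\varepsilon}\, dx
  &\ge \int_0^{+\infty} v'(x)\,(1 - v(x))\, dx
  = \Bigl[ -\tfrac12 (1 - v(x))^2 \Bigr]_0^{+\infty} = \frac12 ,
\end{align*}
```

where the last equality uses $v(0) = 0$ and $v(x) \to 1$ as $x \to +\infty$ (which holds when the energy is finite). Equality in the inequality requires $\varepsilon v' = (1 - v)/2$, which is exactly the equation solved by $v_\varepsilon$.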

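Now, we set for every $\varepsilon > 0$ $v_\varepsilon(x) = 0$ if $|x| < a_\varepsilon$, and $v_\varepsilon(x) = \gamma\bigl( \frac{|x| - a_\varepsilon}{\varepsilon} \bigr)$ otherwise, where $a_\varepsilon$ goes to zero with $\varepsilon$ and will be fixed later on, and we set $u_\varepsilon(x) = u(-a_\varepsilon) + \frac{u(a_\varepsilon) - u(-a_\varepsilon)}{2 a_\varepsilon} (x + a_\varepsilon)$ if $|x| < a_\varepsilon$, and $u_\varepsilon(x) = u(x)$ otherwise. Then (using the result of the previous exercise),

\begin{align*}
F_\varepsilon(u_\varepsilon, v_\varepsilon)
&= \int_{|x| \ge a_\varepsilon} (k_\varepsilon + v_\varepsilon(x)^2) u'(x)^2 \, dx + 2 a_\varepsilon k_\varepsilon \Bigl( \frac{u(a_\varepsilon) - u(-a_\varepsilon)}{2 a_\varepsilon} \Bigr)^2 \\
&\quad + \int_{|x| \ge a_\varepsilon} \frac{1}{\varepsilon} \gamma'\Bigl( \frac{|x| - a_\varepsilon}{\varepsilon} \Bigr)^2 + \frac{1}{4\varepsilon} \Bigl( 1 - \gamma\Bigl( \frac{|x| - a_\varepsilon}{\varepsilon} \Bigr) \Bigr)^2 dx + \frac{2 a_\varepsilon}{4\varepsilon} \\
&\quad + \int_{|x| \ge a_\varepsilon} (u(x) - g(x))^2 \, dx + \int_{-a_\varepsilon}^{a_\varepsilon} (u_\varepsilon(x) - g(x))^2 \, dx \\
&\le (1 + k_\varepsilon) \int_{-1}^{1} u'(x)^2 \, dx + 2 \|u\|_\infty^2 \frac{k_\varepsilon}{a_\varepsilon} + 2 \times \frac12 + \frac{2 a_\varepsilon}{4\varepsilon} \\
&\quad + \int_{-1}^{1} (u(x) - g(x))^2 \, dx + 2 a_\varepsilon (\|u\|_\infty + \|g\|_\infty)^2 .
\end{align*}

We see that we will get the result if both $k_\varepsilon / a_\varepsilon$ and $a_\varepsilon / \varepsilon$ go to zero as $\varepsilon$ goes to zero. This is possible if and only if $k_\varepsilon / \varepsilon \to 0$, since in this case we can let $a_\varepsilon = \sqrt{k_\varepsilon \varepsilon}$. Then, we have $\limsup_{\varepsilon \downarrow 0} F_\varepsilon(u_\varepsilon, v_\varepsilon) \le F(u, 1)$ and point (ii) is proved.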

4.3 Higher dimensions

In dimension greater than one, the proof is obtained through a localization and a slicing argument. The interested reader should refer to Ambrosio and Tortorelli's paper, to [19] where a similar argument is used, or to Braides' book [18]. We will briefly explain, without giving too many details, how the proof goes.

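4.3.1 The first inequality

Consider a sequence $\varepsilon_j \downarrow 0$ and $u^j, v^j$ such that $u^j \to u$ and $v^j \to v$ strongly in $L^2(\Omega)$ as $j \to \infty$. Again, it is clear that $v = 1$ a.e., and we need to show that $E(u) \le \liminf_{j \to \infty} F_{\varepsilon_j}(u^j, v^j)$. To prove this inequality we first localize the energy $F_{\varepsilon_j}(u^j, v^j)$ by letting, for every $A \subset \Omega$ open,

\[ F_{\varepsilon_j}(u^j, v^j, A) = \int_A (v^j(x)^2 + k_{\varepsilon_j}) |\nabla u^j(x)|^2 + \varepsilon_j |\nabla v^j(x)|^2 + \frac{(1 - v^j(x))^2}{4\varepsilon_j} + |u^j(x) - g(x)|^2 \, dx . \tag{25} \]

Then, we fix an open set $A$ and a unit vector $\xi \in S^{N-1}$, and write

\begin{align*}
F_{\varepsilon_j}(u^j, v^j, A) = \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \int_{A_{z,\xi}} dt \, \Bigl[ & (v^j(z + t\xi)^2 + k_{\varepsilon_j}) |\nabla u^j(z + t\xi)|^2 + \varepsilon_j |\nabla v^j(z + t\xi)|^2 \\
& + \frac{(1 - v^j(z + t\xi))^2}{4\varepsilon_j} + |u^j(z + t\xi) - g(z + t\xi)|^2 \Bigr]
\end{align*}

so that

\[ F_{\varepsilon_j}(u^j, v^j, A) \ge \int_{\xi^\perp} F^{1D}_{\varepsilon_j}(u^j_{z,\xi}, v^j_{z,\xi}, A_{z,\xi}, g_{z,\xi}) \, d\mathcal{H}^{N-1}(z) . \]

(We follow the notations in section 2.2.8: $A_{z,\xi} = \{ t \in \mathbb{R} : z + t\xi \in A \}$, and for every $t \in A_{z,\xi}$, $u^j_{z,\xi}(t) = u^j(z + t\xi)$, $v^j_{z,\xi}(t) = v^j(z + t\xi)$, $g_{z,\xi}(t) = g(z + t\xi)$.) Here $F^{1D}_{\varepsilon_j}$ denotes the localized Ambrosio-Tortorelli energy (25) in dimension 1 (given $I \subset \mathbb{R}$ an open set and $w, r \in H^1(I)$, $h \in L^\infty(I)$):

\[ F^{1D}_{\varepsilon_j}(w, r, I, h) = \int_I (r(t)^2 + k_{\varepsilon_j}) w'(t)^2 \, dt + \int_I \varepsilon_j r'(t)^2 + \frac{(1 - r(t))^2}{4\varepsilon_j} \, dt + \int_I |w(t) - h(t)|^2 \, dt . \]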

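Since as $j \to \infty$,

\[ \int_\Omega |u^j(x) - u(x)|^2 \, dx = \int_{\xi^\perp} \int_{\Omega_{z,\xi}} |u^j_{z,\xi}(t) - u_{z,\xi}(t)|^2 \, dt \, d\mathcal{H}^{N-1}(z) \to 0, \]

we may assume we have extracted a subsequence (not relabeled) such that $u^j_{z,\xi} \to u_{z,\xi}$ in $L^2(\Omega_{z,\xi})$ for almost every $z \in \xi^\perp$ (such that $\Omega_{z,\xi} \ne \emptyset$). Then, the one-dimensional result states that for such a $z$,

\[ E^{1D}(u_{z,\xi}, A_{z,\xi}, g_{z,\xi}) \le \liminf_{j \to \infty} F^{1D}_{\varepsilon_j}(u^j_{z,\xi}, v^j_{z,\xi}, A_{z,\xi}, g_{z,\xi}), \]

where again if $I \subset \mathbb{R}$ is open,

\[ E^{1D}(w, I, h) = \int_I \dot w(t)^2 \, dt + \mathcal{H}^0(S_w \cap I) + \int_I (w(t) - h(t))^2 \, dt \]

if $w \in SBV(I)$, and $E^{1D}(w, I, h) = +\infty$ otherwise. Using Fatou's lemma, we deduce that

\[ \int_{\xi^\perp} \Bigl[ \int_{A_{z,\xi}} \dot u_{z,\xi}(t)^2 \, dt + \mathcal{H}^0(S_{u_{z,\xi}} \cap A_{z,\xi}) + \int_{A_{z,\xi}} (u_{z,\xi}(t) - g_{z,\xi}(t))^2 \, dt \Bigr] d\mathcal{H}^{N-1}(z) \le \liminf_{j \to \infty} F_{\varepsilon_j}(u^j, v^j, A) . \]

In particular we get that if $\liminf_{j \to \infty} F_{\varepsilon_j}(u^j, v^j, A) < +\infty$, $u_{z,\xi} \in SBV(A_{z,\xi})$ for a.e. $z \in \xi^\perp$, and since this is true for every $\xi$ we deduce that $u \in SBV(A)$. Thanks to the results of section 2.2.8, the last inequality can then be rewritten as

\[ E_\xi(u, A) = \int_A \langle \nabla u(x), \xi \rangle^2 \, dx + \int_{S_u \cap A} |\langle \nu_u(x), \xi \rangle| \, d\mathcal{H}^{N-1}(x) + \int_A |u(x) - g(x)|^2 \, dx \le \liminf_{j \to \infty} F_{\varepsilon_j}(u^j, v^j, A) . \]

To conclude, we admit that (see [19, Prop. 6.5]), if $(\xi_n)_{n \ge 1}$ is a dense sequence of points in $S^{N-1}$,

\[ E(u) = \sup \Bigl\{ \sum_{n=1}^k E_{\xi_n}(u, A_n) : k \in \mathbb{N},\ (A_n)_{n=1,\dots,k} \text{ disjoint open subsets of } \Omega \Bigr\} \]

and we observe that if $(A_n)_{n=1,\dots,k}$ is such a family, then

\[ \sum_{n=1}^k E_{\xi_n}(u, A_n) \le \sum_{n=1}^k \liminf_{j \to \infty} F_{\varepsilon_j}(u^j, v^j, A_n) \le \liminf_{j \to \infty} F_{\varepsilon_j}(u^j, v^j), \]

so that $E(u) \le \liminf_{j \to \infty} F_{\varepsilon_j}(u^j, v^j)$.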

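4.3.2 The second inequality

Now, given $u \in SBV(\Omega)$ such that $E(u) < +\infty$, let us build functions $u_\varepsilon$ and $v_\varepsilon$ such that $u_\varepsilon \to u$, $v_\varepsilon \to 1$ as $\varepsilon \downarrow 0$, and such that $\limsup_{\varepsilon \downarrow 0} F_\varepsilon(u_\varepsilon, v_\varepsilon) \le E(u)$. We will also assume that $u$ is bounded and that $S_u$ is essentially closed in $\Omega$, which means that $\mathcal{H}^{N-1}(\Omega \cap (\overline{S_u} \setminus S_u)) = 0$. This, in fact, is not restrictive, since it is possible to approximate every $u \in SBV(\Omega)$ with a sequence of bounded functions $(u_j)$ such that for every $j$, $\mathcal{H}^{N-1}(\Omega \cap (\overline{S_{u_j}} \setminus S_{u_j})) = 0$, in such a way that $\lim_{j \to \infty} E(u_j) = E(u)$. This is a consequence of the essential closedness of the jump set of the minimizers of the Mumford-Shah functional, mentioned in section 2.3.2.

Exercise. Show this approximation property.

If $S_u$ (which is rectifiable) is essentially closed in $\Omega$, then the limit of the quantities

\[ L_\delta(S_u) = \frac{|\{ x \in \Omega : \mathrm{dist}(x, S_u) \le \delta \}|}{2\delta} \]

as $\delta \downarrow 0$, called the Minkowski content of $S_u$, is exactly $\mathcal{H}^{N-1}(S_u)$ (see [36]). Notice moreover that since $\delta \mapsto L_\delta(S_u)$ is continuous on $(0, +\infty)$ and bounded by $|\Omega| / (2\delta)$, it is bounded, so that there exists a constant $c_L$ such that

\[ |\{ x \in \Omega : \mathrm{dist}(x, S_u) \le \delta \}| \le 2 c_L \delta \tag{26} \]

for every $\delta \ge 0$.
We consider the function $\gamma(\cdot)$ defined in section 4.2.2, and, again, $a_\varepsilon = \sqrt{k_\varepsilon \varepsilon}$, and let $S^\varepsilon = \{ x \in \Omega : \mathrm{dist}(x, S_u) \le a_\varepsilon \}$. We set

\[ v_\varepsilon(x) =
\begin{cases}
\gamma\Bigl( \dfrac{\mathrm{dist}(x, S_u) - a_\varepsilon}{\varepsilon} \Bigr) & \text{if } x \notin S^\varepsilon, \\
0 & \text{otherwise,}
\end{cases}
\qquad \text{and} \qquad
u_\varepsilon(x) = u(x) \Bigl( \frac{\mathrm{dist}(x, S_u)}{a_\varepsilon} \wedge 1 \Bigr) \]

so that $u_\varepsilon = u$ on $\Omega \setminus S^\varepsilon$. Then $u_\varepsilon \to u$ and $v_\varepsilon \to 1$ as $\varepsilon \downarrow 0$. We will denote in what follows $\mathrm{dist}(x, S_u) = d(x)$. Out of $S^\varepsilon$, $\nabla u_\varepsilon = \nabla u$, whereas if $x \in S^\varepsilon$, $\nabla u_\varepsilon(x) = \nabla u(x) d(x)/a_\varepsilon + u(x) \nabla d(x)/a_\varepsilon$ so that $|\nabla u_\varepsilon(x)| \le |\nabla u(x)| + \|u\|_\infty / a_\varepsilon$ (we admit that $\nabla d$ exists a.e. and that $|\nabla d| \le 1$). Therefore,

\[ \int_\Omega (v_\varepsilon(x)^2 + k_\varepsilon) |\nabla u_\varepsilon(x)|^2 \, dx \le (1 + k_\varepsilon) \int_{\Omega \setminus S^\varepsilon} |\nabla u(x)|^2 \, dx + 2 k_\varepsilon \Bigl( \int_{S^\varepsilon} |\nabla u(x)|^2 \, dx + \frac{|S^\varepsilon| \, \|u\|_\infty^2}{a_\varepsilon^2} \Bigr) . \]

Since $|S^\varepsilon| = 2 a_\varepsilon L_{a_\varepsilon}(S_u) \sim 2 a_\varepsilon \mathcal{H}^{N-1}(S_u)$ as $\varepsilon \to 0$ and $k_\varepsilon / a_\varepsilon \to 0$, we deduce that $\limsup_{\varepsilon \downarrow 0} \int_\Omega (v_\varepsilon^2 + k_\varepsilon) |\nabla u_\varepsilon|^2 \le \int_\Omega |\nabla u|^2$.

Exercise. Show that $u_\varepsilon \in H^1(\Omega)$ and that $\nabla u_\varepsilon = (d \nabla u + u \nabla d)/a_\varepsilon$ in $S^\varepsilon$.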
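Let us now show that $\limsup_{\varepsilon \downarrow 0} \int_\Omega \varepsilon |\nabla v_\varepsilon|^2 + (1 - v_\varepsilon)^2 / (4\varepsilon) \le \mathcal{H}^{N-1}(S_u)$. Out of $S^\varepsilon$, $\nabla v_\varepsilon(x) = \gamma'((d(x) - a_\varepsilon)/\varepsilon) \nabla d(x) / \varepsilon$, so that

\[ \int_\Omega \varepsilon |\nabla v_\varepsilon(x)|^2 + \frac{(1 - v_\varepsilon(x))^2}{4\varepsilon} \, dx = \frac{|S^\varepsilon|}{4\varepsilon} + \frac{1}{4\varepsilon} \int_{\Omega \setminus S^\varepsilon} 4 \gamma'\Bigl( \frac{d(x) - a_\varepsilon}{\varepsilon} \Bigr)^2 + \Bigl( 1 - \gamma\Bigl( \frac{d(x) - a_\varepsilon}{\varepsilon} \Bigr) \Bigr)^2 dx . \]

The ratio $|S^\varepsilon| / (4\varepsilon)$ is of order $a_\varepsilon / \varepsilon$ and goes to zero as $\varepsilon \downarrow 0$. Since $\gamma'(t) = (1 - \gamma(t))/2 = \exp(-t/2)/2$, the second integral is

\[ \frac{1}{2\varepsilon} \int_{\Omega \setminus S^\varepsilon} \exp\Bigl( -\frac{d(x) - a_\varepsilon}{\varepsilon} \Bigr) dx . \]

We notice that

\begin{align*}
\int_{\Omega \setminus S^\varepsilon} \exp\Bigl( -\frac{d(x) - a_\varepsilon}{\varepsilon} \Bigr) dx
&= \int_{\Omega \setminus S^\varepsilon} \int_0^1 \chi_{\{ t \,:\, t \le \exp(-(d(x) - a_\varepsilon)/\varepsilon) \}} \, dt \, dx \\
&= \int_0^1 |\{ x \in \Omega : a_\varepsilon < d(x) \le a_\varepsilon - \varepsilon \log t \}| \, dt .
\end{align*}

Let $h_\varepsilon(t) = |\{ x \in \Omega : a_\varepsilon < d(x) \le a_\varepsilon - \varepsilon \log t \}| / (2\varepsilon) = (a_\varepsilon/\varepsilon - \log t) L_{(a_\varepsilon - \varepsilon \log t)}(S_u) - (a_\varepsilon/\varepsilon) L_{a_\varepsilon}(S_u)$. By (26), $|h_\varepsilon(t)| \le c_L (a_\varepsilon/\varepsilon - \log t) \le c_L (1 - \log t)$ (if $\varepsilon$ is small enough) for every $t \in (0, 1)$ and the latter function is integrable on $(0, 1)$. Moreover, we know that $\lim_{\varepsilon \downarrow 0} h_\varepsilon(t) = (-\log t) \mathcal{H}^{N-1}(S_u)$. By Lebesgue's dominated convergence theorem we deduce that $\lim_{\varepsilon \downarrow 0} \int_0^1 h_\varepsilon(t) \, dt = \mathcal{H}^{N-1}(S_u)$, so that

\[ \lim_{\varepsilon \downarrow 0} \frac{1}{2\varepsilon} \int_{\Omega \setminus S^\varepsilon} \exp\Bigl( -\frac{d(x) - a_\varepsilon}{\varepsilon} \Bigr) dx = \mathcal{H}^{N-1}(S_u) . \]

This shows that $\lim_{\varepsilon \downarrow 0} \int_\Omega \varepsilon |\nabla v_\varepsilon|^2 + (1 - v_\varepsilon)^2 / (4\varepsilon) = \mathcal{H}^{N-1}(S_u)$, and achieves the proof of Theorem 8 in arbitrary dimension.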

5 The numerical analysis of the Mumford-Shah problem (II)

Now we consider a different problem, that can be stated in this way: in what sense is Blake and Zisserman's energy $E(U)$ (or Geman and Geman's energy $E(U, L)$) a discrete approximation of the Mumford-Shah functional? We will see that in fact, it is not an approximation of this functional, but of a slightly different functional in which the length of the discontinuity set is measured in a different (anisotropic) way.

5.1 Rescaling Blake and Zisserman's functional

But first of all, if we want to see $E(U)$ as a discrete approximation of something, we have to introduce a discretization step (or scale parameter) $h > 0$ and explain how the parameters $\lambda$ and $\mu$ in (5) must vary with $h$ in order to get some result. How can this be done?
Consider the simpler, one-dimensional Blake and Zisserman's energy

\[ E^1(U) = \sum_{i=1}^{n-1} W_{\lambda,\mu}(u_{i+1} - u_i) + \sum_{i=1}^{n} (u_i - g_i)^2 \]

where $U = (u_i)_{i=1}^n$ and $G = (g_i)_{i=1}^n$ are 1-dimensional signals.

First of all, if we want the last term of the energy to be an approximation of an integral $\int_0^1 (u(x) - g(x))^2 \, dx$, then we need to assume that the signal $G$ is in fact some discretization $G^h = (g^h_i)_{i=1}^n$ at step $h = 1/n$ of the function $g \in L^2(0, 1)$. For instance, we can let $g^h_i = (1/h) \int_{(i-1)h}^{ih} g(x) \, dx$. Then, we know that $\sum_{i=1}^n g^h_i \chi_{[(i-1)h, ih)}$ will converge to $g$ in $L^2$. In this case, if we consider for all $h = 1/n$ a signal $U = U^h = (u^h_i)_{i=1}^n$, we will have that

\[ h \sum_{i=1}^n (u^h_i - g^h_i)^2 \to \int_0^1 (u(x) - g(x))^2 \, dx \]

as $h$ goes to 0 ($n$ to $+\infty$) if the functions $\sum_{i=1}^n u^h_i \chi_{[(i-1)h, ih)}$ converge to $u$ in $L^2(0, 1)$.
But if this is true, then it is easy to show that, at least in the distributional sense,

\[ \sum_{i=1}^{n-1} \frac{u^h_{i+1} - u^h_i}{h} \chi_{[(i-1)h, ih)} \rightharpoonup Du \]

where $Du$ denotes the distributional derivative of $u$. In the regions where $u$ is differentiable, it is therefore reasonable to ask that $(u^h_{i+1} - u^h_i)/h \simeq u'(x)$ if $x \simeq ih$, and the sum $\sum_{i=1}^{n-1} W_{\lambda,\mu}(u^h_{i+1} - u^h_i)$ will be an approximation of $\int u'(x)^2 \, dx$ in these regions if, when $u^h_{i+1} - u^h_i$ is small,

\[ W_{\lambda,\mu}(u^h_{i+1} - u^h_i) \simeq h \Bigl( \frac{u^h_{i+1} - u^h_i}{h} \Bigr)^2 = \frac{(u^h_{i+1} - u^h_i)^2}{h} . \]

This implies that we must choose $\lambda \simeq 1/h$.
On the other hand, when $u$ has a jump, if $(i+1)h$ is on one side of the jump and $ih$ on the other side, the difference $u^h_{i+1} - u^h_i$ should go to $u_r - u_l = \pm(u^+ - u^-)$ and thus have the order of magnitude of a constant. In this case, we want to count 1 in the energy. Since the value of $W_{\lambda,\mu}(t)$ for large $t$ is $\mu$, it means that we must choose $\mu \simeq 1$. The rescaled energy then becomes (recalling that $W_{1/h,1}(t) = \min(t^2/h, 1)$)

\[ E^1_h(U^h) = \sum_{i=1}^{n-1} \min\Bigl( \frac{(u^h_{i+1} - u^h_i)^2}{h}, 1 \Bigr) + h \sum_{i=1}^n (u^h_i - g^h_i)^2 . \]

Letting $f(t) = \min(t, 1) = t \wedge 1$, this can be written as

\[ E^1_h(U^h) = h \Bigl( \sum_{i=1}^{n-1} \frac{1}{h} f\Bigl( \frac{(u^h_{i+1} - u^h_i)^2}{h} \Bigr) + \sum_{i=1}^n (u^h_i - g^h_i)^2 \Bigr) . \tag{27} \]
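
The rescaled energy (27) is straightforward to evaluate, which helps to see how it counts a jump as 1 and a smooth slope as $\int u'^2$. The sketch below assumes NumPy and takes $h = 1/n$ from the length of the signal; the function name `rescaled_bz_energy_1d` is hypothetical.

```python
import numpy as np

def rescaled_bz_energy_1d(u, g):
    """Value of (27) for signals u = (u_i) and g = (g_i), i = 1..n, with h = 1/n."""
    u = np.asarray(u, dtype=float)
    g = np.asarray(g, dtype=float)
    h = 1.0 / u.size
    increments = np.diff(u)                           # u_{i+1} - u_i, i = 1..n-1
    # f(t) = min(t, 1): quadratic cost for small increments, cost 1 for a jump
    regularity = np.minimum(increments**2 / h, 1.0).sum()
    fidelity = h * np.sum((u - g) ** 2)
    return regularity + fidelity
```

For instance, with $g = u$ a unit step sampled on $n = 100$ points contributes exactly 1 (one jump), while the ramp $u_i = ih$ contributes $(n-1)h = 0.99$, close to $\int_0^1 u'(x)^2 \, dx = 1$.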

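In a similar way, the rescaled 2-dimensional Blake and Zisserman energy will be

\[ E^2_h(U^h) = h^2 \Bigl( \sum_{i,j} \Bigl[ \frac{1}{h} f\Bigl( \frac{(u^h_{i+1,j} - u^h_{i,j})^2}{h} \Bigr) + \frac{1}{h} f\Bigl( \frac{(u^h_{i,j+1} - u^h_{i,j})^2}{h} \Bigr) \Bigr] + \sum_{i,j} (u^h_{i,j} - g^h_{i,j})^2 \Bigr) \tag{28} \]

where now $(g^h_{i,j})_{1 \le i,j \le n}$ is the correct discretization of an image $g \in L^2(\Omega)$ with $\Omega = (0, 1) \times (0, 1)$: for instance, we can let

\[ g^h_{i,j} = \frac{1}{h^2} \int_{(i-1)h}^{ih} \int_{(j-1)h}^{jh} g(x, y) \, dx \, dy . \]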

5.2 The $\Gamma$-limit of the rescaled 1-dimensional functional

In order to state a $\Gamma$-convergence result for the energy $E^1_h$ defined by (27), we must first consider $E^1_h(U^h)$ as a functional over $L^2(0, 1)$. This is done by defining it for every $u^h \in L^2(0, 1)$ as

\[ E^1_h(u^h) =
\begin{cases}
E^1_h(U^h) & \text{if } u^h = \sum_{i=1}^n u^h_i \chi_{[(i-1)h, ih)},\ U^h = (u^h_i)_{i=1}^n, \\
+\infty & \text{otherwise,}
\end{cases} \]

which means that $E^1_h(u^h)$ has a finite value only when $u^h$ is a piecewise constant function at scale $h$. Then, we have the following result ([20]).

Theorem 9 $E^1_h$ $\Gamma$-converges to $E^1_0$ as $h$ goes to zero ($n$ goes to infinity), where

\[ E^1_0(u) =
\begin{cases}
\displaystyle \int_0^1 \dot u(x)^2 \, dx + \mathcal{H}^0(S_u) + \int_0^1 (u(x) - g(x))^2 \, dx & \text{if } u \in SBV(0, 1), \\
+\infty & \text{if } u \in L^2(0, 1) \setminus SBV(0, 1).
\end{cases} \tag{29} \]

We will sketch a proof of this result. We need to prove that

i. if $(u^h)$ goes to $u$ (in $L^2$-norm) as $h \to 0$, then $E^1_0(u) \le \liminf_{h \to 0} E^1_h(u^h)$, and

ii. there exists $(u^h)$ that converges to $u$ such that $\limsup_{h \to 0} E^1_h(u^h) \le E^1_0(u)$.

5.2.1 Proof of (i)

To prove (i), we consider a sequence $(u^h)$ converging to $u$ as $h$ goes to zero. We can assume that $\liminf_{h \to 0} E^1_h(u^h) < +\infty$ (otherwise there is nothing to prove), and even, by extracting a subsequence, that $\sup_h E^1_h(u^h) < +\infty$. In particular, for every $h$, there is a discrete signal $U^h = (u^h_i)_{i=1}^n$ ($n = 1/h$) such that $u^h = \sum_{i=1}^n u^h_i \chi_{[(i-1)h, ih)}$ and $E^1_h(u^h) = E^1_h(U^h)$. Then, we build a new function $v^h$ in the following way:

• if $x \in [0, h)$ we let $v^h(x) = u^h_1$;
• then, for $x \in [ih, (i+1)h)$ ($1 \le i \le n - 1$):
  - if $|u^h_{i+1} - u^h_i| \le \sqrt{h}$, we let $v^h(x) = u^h_i + (x - ih)(u^h_{i+1} - u^h_i)/h$,
  - otherwise, if $|u^h_{i+1} - u^h_i| > \sqrt{h}$, we let $v^h(x) = u^h_i$ if $x \in [ih, (i + 1/2)h)$ and $v^h(x) = u^h_{i+1}$ if $x \in [(i + 1/2)h, (i+1)h)$.

We see that $v^h(ih) = u^h_i$ for all $i$, and that $v^h$ is affine in the intervals $[ih, (i+1)h)$ such that $f((u^h_{i+1} - u^h_i)^2/h) = (u^h_{i+1} - u^h_i)^2/h$, and piecewise constant with exactly one jump in the intervals such that $f((u^h_{i+1} - u^h_i)^2/h) = 1$.
With this construction, we have that $v^h \in SBV(0, 1)$, and that

\[ \int_0^1 \dot v^h(x)^2 \, dx + \mathcal{H}^0(S_{v^h}) = h \sum_{i=1}^{n-1} \frac{1}{h} f\Bigl( \frac{(u^h_{i+1} - u^h_i)^2}{h} \Bigr) . \]

We can show that $\int_0^1 (v^h(x) - g(x))^2 \, dx$ is less than some constant times $h \sum_{i=1}^n (u^h_i - g^h_i)^2$, so that we may apply Theorem 6 to deduce that some subsequence of $v^h$ converges a.e. to a function $v \in SBV(0, 1)$, with

\[ \int_0^1 \dot v(x)^2 \, dx + \mathcal{H}^0(S_v) \le \liminf_{h \to 0} \int_0^1 \dot v^h(x)^2 \, dx + \mathcal{H}^0(S_{v^h}) . \]

Since it is easy to show that $v^h$ must converge (at least) weakly to $u$ in $L^2(0, 1)$, and since $h \sum_{i=1}^n (u^h_i - g^h_i)^2 \to \int_0^1 (u(x) - g(x))^2 \, dx$, we get that $v = u$, $u \in SBV(0, 1)$ and

\[ \int_0^1 \dot u(x)^2 \, dx + \mathcal{H}^0(S_u) + \int_0^1 (u(x) - g(x))^2 \, dx \le \liminf_{h \to 0} E^1_h(u^h), \]

which is the thesis we wanted to show.

Remark. We have shown slightly more than just the point (i). Notice indeed that we can easily deduce that if $u^h$ is bounded uniformly in $L^\infty(0, 1)$ and $\sup_h E^1_h(u^h) < +\infty$, then some subsequence of $u^h$ converges (weakly in $L^2$: easy; strongly in $L^2$: first show that $(u^h)$ is bounded in $BV(0, 1)$, hence compact in $L^2$) to a function $u$ with $E^1_0(u) \le \liminf_{h \to 0} E^1_h(u^h)$. In particular, we can deduce from Theorem 9 and this remark that if $u^h$ is for every $h$ a minimizer of $E^1_h$, then it has subsequences that converge to a minimizer of $E^1_0$. (See Theorem 12 in section 5.4 below for a more general statement.)

5.2.2 Proof of (ii)

Now we consider proving (ii). This is very simple. Choose $u \in SBV(0,1)$ with $E_0^1(u) < +\infty$. The function $u$ is piecewise continuous and has a finite number of jumps $x_1 < x_2 < \dots < x_k$. We define for every $n \ge 1$ and $h = 1/n$ the discrete signal $U^h = (u_i^h)_{i=1}^n$ by $u_i^h = u(ih - 0)$, which is the left limit $\lim_{\varepsilon \downarrow 0} u(ih - \varepsilon)$ of $u$ at $ih$ (thus $u_i^h = u(ih)$ if $ih \notin S_u$). It is standard that $u^h = \sum_{i=1}^n u_i^h \chi_{[(i-1)h,\,ih)}$ goes to $u$ in $L^2(0,1)$; in fact it is easy to prove that $u^h$ goes to $u$ uniformly on $(x_l + \delta, x_{l+1} - \delta)$ for every $l = 1,\dots,k$ and small $\delta > 0$.

If $[ih,(i+1)h) \cap S_u = \emptyset$, we have (using the Cauchy-Schwarz inequality)

$$u_{i+1}^h - u_i^h = \int_{ih}^{(i+1)h} \dot u(x)\,dx \le \sqrt{h}\left(\int_{ih}^{(i+1)h} \dot u(x)^2\,dx\right)^{1/2},$$

so that $(u_{i+1}^h - u_i^h)^2/h \le \int_{ih}^{(i+1)h} \dot u(x)^2\,dx$. Thus, denoting $I_h$ the set $\{i \in [1,n] : [ih,(i+1)h) \cap S_u \ne \emptyset\}$, we have

$$h\sum_{i=1}^{n-1} \frac{1}{h} f\!\left(\frac{(u_{i+1}^h - u_i^h)^2}{h}\right) \le \sum_{i \notin I_h} \int_{ih}^{(i+1)h} \dot u(x)^2\,dx + \# I_h \le \int_0^1 \dot u(x)^2\,dx + \# S_u.$$

Therefore $\limsup_{h\to 0} E_h^1(u^h) \le E_0^1(u)$ and point (ii) is proved.

5.3 The Γ-limit of the rescaled 2-dimensional functional

For the 2-dimensional functional, we have the same kind of result. We also define the functional $E_h^2(U^h)$ as a functional over $L^2(\Omega)$, with $\Omega = (0,1) \times (0,1)$, in the following way: for every $1 \le i,j \le n$ we let $C_{i,j}^h$ be the square $[(i-1)h, ih) \times [(j-1)h, jh)$, and

$$E_h^2(u^h) = \begin{cases} E_h^2(U^h) & \text{if } u^h = \sum_{1 \le i,j \le n} u_{i,j}^h \chi_{C_{i,j}^h},\ U^h = (u_{i,j}^h)_{1 \le i,j \le n}, \\ +\infty & \text{otherwise.} \end{cases}$$

Then, we define

$$E_0^2(u) = \begin{cases} \displaystyle\int_\Omega |\nabla u(x)|^2\,dx + \int_{S_u} |\nu_u^1(x)| + |\nu_u^2(x)|\,d\mathcal{H}^1 + \int_\Omega (u(x) - g(x))^2\,dx & \text{if } u \in SBV(\Omega), \\ +\infty & \text{if } u \in L^2(\Omega) \setminus SBV(\Omega). \end{cases} \tag{30}$$

Here the vector $\nu_u(x) = (\nu_u^1(x), \nu_u^2(x))$ is the normal vector to the jump set $S_u$ at $x$. Notice that in this case $E_0^2$ is not the Mumford and Shah functional: it is slightly different and measures the length of the jump set in an anisotropic way. We point out the fact that it is of the form of $E_0$ in definition (20), so that it still admits minimizers and the jump set $S_u$ of a minimizer $u$ is still essentially closed (i.e., $\mathcal{H}^1(\Omega \cap \overline{S_u} \setminus S_u) = 0$). The anisotropy in $E_0^2$ is unavoidable since the discrete energy $E_h^2$ is not isotropic either, as illustrated by the following exercise.

Exercise. Assume $g = 0$. Let $C = [a,b] \times [c,d] \subset \Omega = (0,1) \times (0,1)$, with $\ell = b - a = d - c > 0$, be a square in $\Omega$, and let $C'$ be the same square rotated by $45^\circ$. Let $u_{i,j}^h$ be 1 if $(ih, jh) \in C$ (i.e., if $a \le ih \le b$ and $c \le jh \le d$) and 0 otherwise ($u^h$ is an approximation of the characteristic function of $C$). Similarly, let $u'^h_{i,j}$ be 1 if $(ih, jh) \in C'$ and 0 otherwise. Show that as $h$ goes to zero, $E_h^2(u^h) \to 4\ell$. On the other hand, show that $E_h^2(u'^h) \to 4\sqrt{2}\,\ell$.

With these definitions of $E_h^2$ and $E_0^2$, we have the following theorem [21]:

Theorem 10. As $h = 1/n$ goes to zero, $E_h^2$ Γ-converges to $E_0^2$ in $L^2(\Omega)$.

The proof of Theorem 10 in [21] is based on the same ideas as the proof of Theorem 9 that was just given, and we will not repeat it here. On the other hand, this result is a particular case of Theorem 11 that will be proved in the next section.
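The anisotropy described in the exercise can be observed numerically. The following is a minimal sketch, not part of the original notes: it evaluates only the finite-difference (edge) part of the discrete energy, with $f(t) = \min(t,1)$, on the indicator of a square centred at $(1/2,1/2)$ and of the same square rotated by $45^\circ$. The data term is deliberately left out, and the names `jump_energy`, `axis_square` and `rotated_square` are illustrative assumptions.

```python
import math

def f(t):
    """Truncation f(t) = min(t, 1) used in section 5."""
    return min(t, 1.0)

def jump_energy(inside, n):
    """Finite-difference part of the 2D energy for an indicator function, h = 1/n.

    `inside(x, y)` tells whether the grid point (x, y) belongs to the set.
    The data term of the energy is not included here.
    """
    h = 1.0 / n
    u = [[1.0 if inside(i / n, j / n) else 0.0 for j in range(1, n + 1)]
         for i in range(1, n + 1)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:  # difference with the neighbour in the first direction
                total += f((u[i + 1][j] - u[i][j]) ** 2 / h) / h
            if j + 1 < n:  # difference with the neighbour in the second direction
                total += f((u[i][j + 1] - u[i][j]) ** 2 / h) / h
    return h * h * total

ell = 0.4  # side length of the square (an arbitrary choice for this sketch)

def axis_square(x, y):
    return abs(x - 0.5) <= ell / 2 and abs(y - 0.5) <= ell / 2

def rotated_square(x, y):
    # the same square rotated by 45 degrees about its centre
    return abs(x - 0.5) + abs(y - 0.5) <= ell / math.sqrt(2)

n = 400
e_axis = jump_energy(axis_square, n)      # close to 4 * ell
e_diag = jump_energy(rotated_square, n)   # close to 4 * sqrt(2) * ell
print(e_axis, e_diag)
```

Both sets have the same Euclidean perimeter $4\ell$, but the rotated one is charged roughly $\sqrt{2}$ times more by the discrete edge term, which is the anisotropy of $|\nu_u^1| + |\nu_u^2|$ in (30).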

5.4 More general finite-differences approximations

Now, we will introduce a general result for finite difference discrete approximations of the Mumford-Shah functional, of which Theorems 9 and 10 are particular cases. What follows is derived from [22].

In 1995, De Giorgi imagined the following non-local functional, defined for any measurable function $u$ on $\mathbb{R}^N$,

$$F_\varepsilon(u) = \frac{1}{\varepsilon} \iint_{\mathbb{R}^N \times \mathbb{R}^N} \operatorname{arctg}\!\left(\frac{(u(x) - u(x + \varepsilon\xi))^2}{\varepsilon}\right) e^{-|\xi|^2}\,dx\,d\xi$$

as a possible approximation, as $\varepsilon$ goes to zero, of the first-order part of the Mumford-Shah functional (the part that depends on $Du$)

$$F(u) = \lambda \int_{\mathbb{R}^N} |\nabla u(x)|^2\,dx + \mu\,\mathcal{H}^{N-1}(S_u),$$

defined on functions $u \in GSBV_{loc}(\mathbb{R}^N)$. Here $\lambda$, $\mu$ are two positive parameters. Notice that the function $\operatorname{arctg}(t)$ looks like the function $f(t) = \min(t,1)$ that we introduced in the previous sections: in the neighborhood of 0, it behaves as $t$, whereas as $t \to \infty$, it behaves like a constant ($\pi/2$ instead of 1). We will see in the sequel that we could consider any non-decreasing function $f$ such that $f(0) = 0$, $f'(0) = 1$ (or some positive constant), and $\lim_{t\to+\infty} f(t) = 1$ (or any other positive constant).

This conjecture of De Giorgi was proved by Gobbino [43]. He established that, as $\varepsilon \downarrow 0$, $F_\varepsilon$ Γ-converges to $F$ in the strong $L^p(\mathbb{R}^N)$ topology for any $1 \le p < +\infty$, for explicit values of $\lambda$ and $\mu$ that depend only on $N$ (see [43]).

This result is very close to the Theorems stated in the previous section. We will show here how we can formulate a discrete version of Gobbino's theorem, and give its complete proof, based on Gobbino's proof.

Let us give some details. Let $\Omega \subset \mathbb{R}^N$ be an open domain with Lipschitz boundary, and for every $h > 0$ and every $u : \Omega \cap h\mathbb{Z}^N \to \mathbb{R}$ let

$$F_h(u, \Omega) = h^N \sum_{\substack{x \in h\mathbb{Z}^N \\ x \in \Omega}} \sum_{\substack{\xi \in \mathbb{Z}^N \\ x + h\xi \in \Omega}} \frac{1}{h} f_\xi\!\left(\frac{(u(x) - u(x + h\xi))^2}{h}\right) \rho(\xi) \in [0,+\infty], \tag{31}$$

where:

• $\rho : \mathbb{Z}^N \to [0,+\infty)$ is even, satisfies $\rho(0) = 0$, $\sum_{\xi \in \mathbb{Z}^N} |\xi|^2 \rho(\xi) < +\infty$, and $\rho(e_i) > 0$ for any $i = 1,\dots,N$ where $(e_i)_{1 \le i \le N}$ is the canonical basis of $\mathbb{R}^N$ (in practical applications the support of $\rho$ will have to be finite and small);

• for any $\xi$ with $\rho(\xi) > 0$, $f_\xi : [0,+\infty) \to [0,+\infty)$ is a non-decreasing bounded function with $f_\xi \equiv f_{-\xi}$, $f_\xi(0) = 0$, $f_\xi'(0) = \lambda_\xi > 0$, and $\lim_{t\to+\infty} f_\xi(t) = \mu_\xi$, and we assume that $f_\xi$ is below (or equal to) the function $t \mapsto \lambda_\xi t \wedge \mu_\xi$. We also assume both $\sup_{\xi \in \mathbb{Z}^N} \lambda_\xi$ and $\sup_{\xi \in \mathbb{Z}^N} \mu_\xi$ are finite;

• we will adopt in the sequel the convention that any term in the sum above is zero whenever either $x$ or $x + h\xi$ is not in $\Omega$ even if we do not explicitly write these conditions under the summation signs (this convention will be adopted everywhere in what follows unless otherwise stated); as well, we will usually write $F_h(u)$ instead of $F_h(u,\Omega)$ when not ambiguous.
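Fix $p \in [1,+\infty)$ (but you can assume $p = 2$, as in the previous sections), and let $\ell^p(\Omega \cap h\mathbb{Z}^N)$ be the vector space of functions $u : \Omega \cap h\mathbb{Z}^N \to \mathbb{R}$ such that the norm

$$\|u\|_p = \Big\{ h^N \sum_{x \in \Omega \cap h\mathbb{Z}^N} |u(x)|^p \Big\}^{1/p}$$

is finite. In the sequel we will always identify a function $u$ in $\ell^p(\Omega \cap h\mathbb{Z}^N)$ and the piecewise constant function in $L^p(\mathbb{R}^N)$ equal to $u(x)$ on $x + \left[-\tfrac{h}{2}, \tfrac{h}{2}\right)^N$ for any $x \in \Omega \cap h\mathbb{Z}^N$ (and to 0 elsewhere), so that $\|u\|_p = \|u\|_{L^p(\mathbb{R}^N)}$ and that a sentence such as "$u_h \in \ell^p(\Omega \cap h\mathbb{Z}^N)$ converges to $u \in L^p(\Omega)$ as $h \downarrow 0$" will have a natural sense. We also set $F_h(u) = +\infty$ for any $u \in L^p(\Omega)$ that is not the restriction to $\Omega$ of the piecewise constant extension of a function in $\ell^p(\Omega \cap h\mathbb{Z}^N)$.

Let now, for any $u \in L^p(\Omega) \cap GSBV_{loc}(\Omega)$,

$$F(u) = \int_\Omega \sum_{\xi \in \mathbb{Z}^N} \rho(\xi)\,\lambda_\xi\,|\langle \nabla u(x), \xi \rangle|^2\,dx + \int_{S_u} \sum_{\xi \in \mathbb{Z}^N} \rho(\xi)\,\mu_\xi\,|\langle \nu_u(x), \xi \rangle|\,d\mathcal{H}^{N-1}(x) \in [0,+\infty], \tag{32}$$

and set $F(u) = +\infty$ if $u \in L^p(\Omega) \setminus GSBV_{loc}(\Omega)$. We will also denote sometimes

$$F(u, B) = \int_B \sum_{\xi \in \mathbb{Z}^N} \rho(\xi)\,\lambda_\xi\,|\langle \nabla u(x), \xi \rangle|^2\,dx + \int_{B \cap S_u} \sum_{\xi \in \mathbb{Z}^N} \rho(\xi)\,\mu_\xi\,|\langle \nu_u(x), \xi \rangle|\,d\mathcal{H}^{N-1}(x)$$

when $B \subseteq \Omega$ is a Borel set.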

We have the following theorem.

Theorem 11. $F_h$ Γ-converges to $F$ as $h \downarrow 0$ in $L^p(\Omega)$ (endowed with its strong topology), for any $p \in [1,+\infty)$.

We also have the following compactness result:

Theorem 12. Let $p \in [1,+\infty)$, $g \in L^p(\Omega) \cap L^\infty(\Omega)$, and for any $h > 0$ let $u_h$ be a minimizer over $\ell^p(\Omega \cap h\mathbb{Z}^N)$ of

$$F_h(u) + \int_\Omega |u(x) - g(x)|^p\,dx \tag{33}$$

(or, equivalently, of

$$F_h(u) + \|u - g_h\|_p^p \tag{34}$$

where $g_h \in \ell^p(\Omega \cap h\mathbb{Z}^N)$ is a suitable discretization of $g$ at scale $h$, with $g_h \to g$ in $L^p(\Omega)$ as $h \downarrow 0$ and $\|g_h\|_\infty \le \|g\|_\infty$ for all $h$). Then $(u_h)$ is relatively compact in $L^p(\Omega)$ and if some subsequence $u_{h_j}$ goes to $u$ as $j \to \infty$, $u \in SBV_{loc}(\Omega) \cap L^p(\Omega)$ is a minimizer of

$$F(u) + \int_\Omega |u(x) - g(x)|^p\,dx.$$

These theorems, for $p = 2$, provide a generalization of the previous Theorems 9 and 10. For instance, Theorem 10 is the case where $N = 2$, $p = 2$, $\Omega = (0,1) \times (0,1)$, $\rho \equiv 0$ on $\mathbb{Z}^2$ except $\rho(0,1) = \rho(1,0) = \rho(0,-1) = \rho(-1,0) = 1/2$, and $f_\xi(t) = t \wedge 1$, so that

$$F(u) = \int_\Omega |\nabla u(x)|^2\,dx + \int_{S_u} |\nu_u^1(x)| + |\nu_u^2(x)|\,d\mathcal{H}^1(x).$$

Remark. The condition $\rho(e_i) > 0$ for $i = 1,\dots,N$ is necessary only for the coercivity, i.e. to establish Lemma 7 and Theorem 12. This is important in practical applications for the stability of the numerical schemes. Even if we have not discussed it in the previous sections (except in the Remark following the proof of (i) in section 5.2.1), similar coercivity and compactness results also hold for Theorems 8, 9 and 10. If we wanted only to prove the Γ-convergence of $F_h$ to $F$, it would be sufficient to assume that $\rho(\xi_i) > 0$, $i = 1,\dots,N$, for some basis $(\xi_i)_{1 \le i \le N} \subset \mathbb{Z}^N$ of $\mathbb{R}^N$.

We will first describe the implementation of these energies. The proofs of Theorem 11 and Theorem 12 will then be given in the last section of these notes. The next sections are extracted from [22].
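To make definition (31) concrete, here is a minimal sketch of how $F_h(u,\Omega)$ could be evaluated for $N = 2$ on a rectangular grid. It is an illustration under simplifying assumptions, not the implementation used in the notes: a single function $f$ is used for every $\xi$, and the names `discrete_energy`, `rho` and `f` are hypothetical. Terms for which $x$ or $x + h\xi$ falls outside the grid are skipped, following the convention stated after (31).

```python
def discrete_energy(u, h, rho, f):
    """Evaluate F_h(u) as in (31) for N = 2 on an n x m grid.

    u   : list of rows; u[i][j] is the value at the grid point (i*h, j*h)
    rho : dict mapping integer offsets (xi1, xi2) to weights rho(xi) >= 0
    f   : the function f (taken identical for all xi in this sketch)
    Pairs with x or x + h*xi outside the grid contribute nothing.
    """
    n, m = len(u), len(u[0])
    total = 0.0
    for (d1, d2), weight in rho.items():
        for i in range(max(0, -d1), min(n, n - d1)):
            for j in range(max(0, -d2), min(m, m - d2)):
                diff = u[i][j] - u[i + d1][j + d2]
                total += weight * f(diff * diff / h) / h
    return h ** 2 * total

# Stencil of Theorem 10: the four nearest neighbours with weight 1/2 each.
rho_theorem_10 = {(1, 0): 0.5, (-1, 0): 0.5, (0, 1): 0.5, (0, -1): 0.5}
truncation = lambda t: min(t, 1.0)

n = 50
h = 1.0 / n
ramp = [[i * h for _ in range(n)] for i in range(n)]  # u(x) = x_1
print(discrete_energy(ramp, h, rho_theorem_10, truncation))
```

For the linear ramp $u(x) = x_1$ the value is $(n-1)/n$, which is consistent with the gradient term $\int_\Omega |\nabla u|^2\,dx = 1$ of the limit functional.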

6 A numerical method for minimizing the Mumford-Shah functional

In this section we describe a numerical method for the implementation of the energies we have introduced in these lectures. We will describe the minimization of problem (34) for $p = 2$, since the energies $E_h^1$ and $E_h^2$ are a particular case. In particular, we will show how the choice of $\rho$, $\lambda_\xi$ and $\mu_\xi$ influences the results.

6.1 An iterative procedure for minimizing (34)

Let us quickly describe a standard procedure for minimizing energies such as (34). Of course we do not claim to compute an exact minimizer of the energy, since the high non-convexity of the problem does not allow this. However, the iterative algorithm we describe gives satisfactory results. A variant has been successfully implemented in the case of the approximation of [23] (see [17]). Many other similar implementations have been made for solving image reconstruction problems (see for instance [60, 10], and the pioneering work [40] by D. Geman and G. Reynolds).

We assume $\Omega$ is bounded so that the discrete problem is finite-dimensional for every fixed $h > 0$ (in the applications $\Omega$ will be a rectangle). The non-convexity in the energy $F_h$ comes from the non-convexity of the functions $f_\xi$, $\xi \in \mathbb{Z}^N$. In order to simplify the computations we will assume that the $f_\xi$ are all identical, up to a rescaling:

$$f_\xi(t) = \mu_\xi\, f\!\left(\frac{\lambda_\xi}{\mu_\xi}\, t\right)$$

for all $\xi \in \mathbb{Z}^N$ (with $\rho(\xi) > 0$) and $t \ge 0$. The function $f$ is nondecreasing, and satisfies $f(0) = 0$, $f(+\infty) = 1$, and $f'(0) = 1$. It could be, of course, the function $f(t) = t \wedge 1$ of section 5, except that a differentiable function provides better numerical results. An interpretation of Blake and Zisserman's GNC algorithm would correspond to approximating gradually $t \wedge 1$ with smooth functions.

We will thus assume, as well, that $f$ is concave, and differentiable. Thus, $-f$ is convex (we extend it with the value $+\infty$ on $\{t < 0\}$), and lower semi-continuous. Let

$$\psi(-v) = \sup_{t \in \mathbb{R}}\, tv - (-f)(t) = (-f)^*(v)$$

be the Legendre-Fenchel transform of $-f$; by a classical result $(-f)^{**} = -f$ so that

$$-f(t) = \sup_{v \in \mathbb{R}}\, tv - \psi(-v) = -\inf_{v \in \mathbb{R}}\, tv + \psi(v).$$

It is well known that the first sup in this equation is attained at $v$ such that $t \in \partial(-f)^*(v)$ (the subdifferential of $(-f)^*$ at $v$), and that this is equivalent to $v \in \partial(-f)(t)$, and since $\partial(-f)(t) = \{-f'(t)\}$ for $t > 0$ and $(-\infty,-1]$ for $t = 0$ we deduce that the sup is reached at some $v \in [-1,0]$ (since for $t = 0$ we check that $(-f)^*(-1) = 0$ and thus the sup is reached at $v = -1$). Hence

$$f(t) = \min_{v \in [0,1]}\, tv + \psi(v)$$

and the min is reached for $v = f'(t)$. (If $f(t) = t \wedge 1$, all of this is still true except that for $t = 1$, the min is reached for any $v \in [0,1]$.) We may therefore rewrite $F_h$ in the following way:

$$F_h(u) = \min_{v(\cdot,\cdot)} F_h(u,v)$$

for $v : (\Omega \cap h\mathbb{Z}^N) \times (\Omega \cap h\mathbb{Z}^N) \to [0,1]$ and

$$F_h(u,v) = h^N \sum_{x \in h\mathbb{Z}^N} \sum_{\xi \in \mathbb{Z}^N} \left\{ \lambda_\xi\, v(x, x+h\xi) \left(\frac{u(x) - u(x+h\xi)}{h}\right)^2 + \frac{\mu_\xi}{h}\, \psi(v(x, x+h\xi)) \right\} \rho(\xi). \tag{35}$$

The algorithm consists in minimizing alternately $F_h(u,v) + \|u - g_h\|_2^2$ with respect to $u$ and $v$. The minimization with respect to $v$ is straightforward, since it just consists in computing for each $x, y \in \Omega \cap h\mathbb{Z}^N$

$$v(x,y) = f'\!\left(\frac{\lambda_\xi}{\mu_\xi}\,\frac{(u(x) - u(y))^2}{h}\right),$$

with $\xi = (y - x)/h$. The minimization with respect to $u$ is also a simple (linear) problem, since the energy is convex and quadratic with respect to $u$. Of course there is no way of knowing whether the algorithm converges to a solution or not; what is certain is that the energy decreases and goes to some critical level, while the function $u$ converges to either a critical point or, if it exists, a continuum of critical points. Notice that if $f$ is strictly increasing, $v$ is everywhere strictly positive.

In the applications shown in these notes we considered

$$f(t) = \frac{2}{\pi} \operatorname{arctg}\!\left(\frac{\pi t}{2}\right), \quad \text{so that} \quad f'(t) = \frac{1}{1 + \frac{\pi^2 t^2}{4}}.$$
Notice that one never has to compute explicitly the position of the edges during the minimization. Once a minimizer of the energy has been found, it is possible to extract the edges out of the segmented image by standard algorithms (using Canny's or more sophisticated edge detectors, with a very narrow kernel since the images on which the edges have to be found are piecewise smooth). The value of the auxiliary function $v$ is also a good indicator for the position of the edges (since $v = f'(\cdot)$ with $f$ concave, it is close to zero on the edges and close to one everywhere else), and should be taken into account. An elementary method may be for instance to consider the zero-crossings of the (discretized) operator $d^2u(\nabla u, \nabla u)$ in the regions where $v$ is small.

6.2 Anisotropy of the length term

In some of the above mentioned image processing papers it had been noticed that the segmentations could be improved by trying to modify slightly the energy, making it less anisotropic. Here we illustrate how the result of Theorem 11 allows us to control this anisotropy and find explicitly the correct parameters for the best energies.

In this section, like in section 5, $n$ will be an integer ($n > 1$), we will set $h = 1/n$ and the functions $u$ and $g_h$ (defined on $[0,1) \times [0,1) \cap h\mathbb{Z}^2$) will be denoted as $n \times n$ matrices $(u_{i,j})_{0 \le i,j < n}$ and $(g_{i,j}^h)_{0 \le i,j < n}$. We will compare the following two cases (pay attention to the fact that the notations here for the energies are different from the notations in section 5; in fact, the following $E_n^1$ is similar to $E_h^2$ in sec. 5, the other energies are new)

$$E_n^1(u) = h^2 \sum_{i,j} \left\{ \frac{\mu_1}{h} f\!\left(\frac{\lambda_1}{\mu_1}\frac{|u_{i+1,j} - u_{i,j}|^2}{h}\right) + \frac{\mu_1}{h} f\!\left(\frac{\lambda_1}{\mu_1}\frac{|u_{i,j+1} - u_{i,j}|^2}{h}\right) + |u_{i,j} - g_{i,j}^h|^2 \right\},$$

(which is Blake and Zisserman's weak membrane energy, except that $f$ is smoother than $t \mapsto t \wedge 1$) and

$$E_n^2(u) = h^2 \sum_{i,j} \left\{ \frac{\mu_2}{h} f\!\left(\frac{\lambda_2}{\mu_2}\frac{|u_{i+1,j} - u_{i,j}|^2}{h}\right) + \frac{\mu_2}{h} f\!\left(\frac{\lambda_2}{\mu_2}\frac{|u_{i,j+1} - u_{i,j}|^2}{h}\right) + \frac{\mu_2'}{h} f\!\left(\frac{\lambda_2'}{\mu_2'}\frac{|u_{i+1,j+1} - u_{i,j}|^2}{h}\right) + \frac{\mu_2'}{h} f\!\left(\frac{\lambda_2'}{\mu_2'}\frac{|u_{i-1,j+1} - u_{i,j}|^2}{h}\right) + |u_{i,j} - g_{i,j}^h|^2 \right\}.$$

By Theorem 12, the limit points of the minimizers of $E_n^1$ and $E_n^2$, as $n \to \infty$, will be minimizers of respectively

$$E_\infty^1(u) = \bar\lambda_1 \int_\Omega |\nabla u(x)|^2\,dx + \bar\mu_1 \Lambda_1(S_u) + \int_\Omega |u(x) - g(x)|^2\,dx$$

and

$$E_\infty^2(u) = \bar\lambda_2 \int_\Omega |\nabla u(x)|^2\,dx + \bar\mu_2 \Lambda_2(S_u) + \int_\Omega |u(x) - g(x)|^2\,dx$$

(for $u \in L^2(\Omega) \cap GSBV(\Omega)$, and $+\infty$ otherwise) with $\Omega = (0,1) \times (0,1)$, $\bar\lambda_1 = \lambda_1$, $\bar\lambda_2 = \lambda_2 + 2\lambda_2'$, and

$$\bar\mu_1 \Lambda_1(S_u) = \int_{S_u} \mu_1 (|\nu_1(x)| + |\nu_2(x)|)\,d\mathcal{H}^1(x),$$

$$\bar\mu_2 \Lambda_2(S_u) = \int_{S_u} \mu_2 (|\nu_1(x)| + |\nu_2(x)|) + \mu_2' (|\nu_1(x) - \nu_2(x)| + |\nu_1(x) + \nu_2(x)|)\,d\mathcal{H}^1(x),$$

where $(\nu_1(x), \nu_2(x))$ is the normal vector to $S_u$ at $x$. Simple computations show that $\mu_2' = \mu_2/\sqrt{2}$ is an optimal choice, since it minimizes the ratio of the length $\Lambda_2$ of the longest (with respect to the length $\Lambda_2$) vector in $S^1$ over the length of the shortest. For any rectifiable 1-set $E \subset \mathbb{R}^2$ with normal vector $(\nu_1, \nu_2)$ at $x$ we define the lengths

$$\Lambda_1(E) = \frac{\pi}{4} \int_E (|\nu_1(x)| + |\nu_2(x)|)\,d\mathcal{H}^1(x)$$

and

$$\Lambda_2(E) = \frac{\pi}{8} \int_E \left( |\nu_1(x)| + |\nu_2(x)| + \frac{|\nu_1(x) - \nu_2(x)| + |\nu_1(x) + \nu_2(x)|}{\sqrt{2}} \right) d\mathcal{H}^1(x).$$

Figure 7: The solid line represents the length of a unit vector, as a function of the angle. Left: for $\Lambda_1$, right: for $\Lambda_2$. The dashed line is the unit circle.

(Notice that $\Lambda_2 = (\Lambda_1 + \Lambda_1 \circ R_{\pi/4})/2$ where $R_{\pi/4}$ is the rotation of angle $\pi/4$ in $\mathbb{R}^2$.) The choice of the parameters $\pi/4$ and $\pi/8$ is made in order to ensure that a random set of lines has on average the same length $\Lambda_1$ and $\Lambda_2$ (and Euclidean length); in other words, the unit circle has length $2\pi$ in both cases. This is of course not the only possible choice. For instance, one could prefer to parameterize these lengths in such a way that the error (with respect to the Euclidean length) $\Lambda_i(e_{max}) - 1$ on the measure of the longest (for $\Lambda_i$) vector $e_{max} \in S^1$ is equal to the error $1 - \Lambda_i(e_{min})$ on the measure of the shortest vector. In this case, one should choose as parameters $\frac{2}{1+\sqrt{2}}$ instead of $\pi/4$ for $\Lambda_1$ and $\frac{2}{1+\sqrt{2}+\sqrt{4+2\sqrt{2}}}$ instead of $\pi/8$ for $\Lambda_2$. With the choice we made, we get that $\bar\mu_1 = 4\mu_1/\pi$ and $\bar\mu_2 = 8\mu_2/\pi$.

In both cases the limit energy is anisotropic; what is interesting is that the second length $\Lambda_2$ is far less anisotropic than the first length $\Lambda_1$. As a matter of fact, the longest vector in $S^1$ for $\Lambda_1$ is about 41.4% longer than the shortest (the ratio is $\sqrt{2}$) while it is only 8.2% longer for the length $\Lambda_2$ (the ratio is $2\sqrt{2}\cos\frac{\pi}{8}/(1+\sqrt{2}) = \sqrt{4+2\sqrt{2}}/(1+\sqrt{2})$). The difference of anisotropy of both lengths is striking in figure 7.

Figure 8: Original images for the next examples.
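The anisotropy figures quoted above can be checked numerically. The following sketch (an illustration; the names `phi1`, `phi2` and `anisotropy` are hypothetical) samples the integrands of $\Lambda_1$ and $\Lambda_2$ on unit normals $\nu = (\cos\theta, \sin\theta)$, and returns both the ratio of the largest to the smallest value and the resulting length of the unit circle.

```python
import math

def phi1(theta):
    """Integrand of Lambda_1 for the unit normal (cos theta, sin theta)."""
    n1, n2 = math.cos(theta), math.sin(theta)
    return math.pi / 4 * (abs(n1) + abs(n2))

def phi2(theta):
    """Integrand of Lambda_2 for the unit normal (cos theta, sin theta)."""
    n1, n2 = math.cos(theta), math.sin(theta)
    diagonal = (abs(n1 - n2) + abs(n1 + n2)) / math.sqrt(2)
    return math.pi / 8 * (abs(n1) + abs(n2) + diagonal)

def anisotropy(phi, samples=20000):
    """Return (max/min of phi over the circle, length of the unit circle for phi)."""
    values = [phi(2 * math.pi * k / samples) for k in range(samples)]
    circle_length = 2 * math.pi * sum(values) / samples
    return max(values) / min(values), circle_length

ratio1, length1 = anisotropy(phi1)
ratio2, length2 = anisotropy(phi2)
print(ratio1, length1)
print(ratio2, length2)
```

The first ratio should be close to $\sqrt{2}$ and the second close to $\sqrt{4+2\sqrt{2}}/(1+\sqrt{2})$, while both circle lengths should be close to $2\pi$, in agreement with the normalization by $\pi/4$ and $\pi/8$.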

6.3 Numerical experiments

We show here a few experiments, so that the reader can see for themselves the difference of behaviour of the lengths $\Lambda_1$ and $\Lambda_2$. Notice that in all of our comparisons we of course always choose $\bar\lambda_1 = \bar\lambda_2$ and $\bar\mu_1 = \bar\mu_2$. In Figure 9 and Figure 10 (see original pictures in Figure 8), one notices that the edges are usually nicer when length $\Lambda_2$ is used, whereas images obtained by minimization of energy $E_n^1$ are more blocky. Notice in particular the diagonal line at the bottom of the image. However, the vertical edges on the column (Figure 9, second picture) are nicer with energy $E_n^1$. The reason is clear: these edges are vertical, and the vertical and horizontal lines have a much lower cost than lines with other orientations with this energy.

In Figures 10 and 11 the results are similar: the edges look much nicer when energy $E_n^2$ is minimized. The other two figures (Figs. 12 and 13) show segmentations in presence of noise. In the two segmentations of the disk, the total length of the edges found was $6.56 \times R$ with energy $E_n^2$ and $6.40 \times R$ with $E_n^1$. These lengths are slightly overestimated because a few spurious edges were found, and also because of some oscillation of the boundary, that is due to the noise. Again in Figure 13 the result is more blocky with energy $E_n^1$.

Figure 9: The segmented column. Left, by using energy $E_n^2$, then with energy $E_n^1$. The details appear in the same order.

Figure 10: Segmented lady with energy $E_n^2$ and two details.

Figure 11: Segmented lady with energy $E_n^1$ and two details.

Figure 12: The noisy disk (grey level values 64 (disk) and 192 (background), std. dev. of noise 40). Middle, the segmented disk with energy $E_n^2$. Right, with $E_n^1$.

Figure 13: A noisy image (std. dev. = 25 for values between 0 and 255) and the segmented outputs, by minimizing $E_n^2$ (middle) and $E_n^1$ (right).

In the last two experiments we used another energy, namely,

$$E_n^3(u) = h^2 \sum_{i,j} \Bigg\{ \frac{\mu_3}{h} f\!\left(\frac{\lambda_3}{\mu_3}\frac{|u_{i+1,j} - u_{i,j}|^2}{h}\right) + \frac{\mu_3}{h} f\!\left(\frac{\lambda_3}{\mu_3}\frac{|u_{i,j+1} - u_{i,j}|^2}{h}\right) + \sum_{\pm} \frac{\mu_3'}{h} f\!\left(\frac{\lambda_3'}{\mu_3'}\frac{|u_{i\pm 1,j+1} - u_{i,j}|^2}{h}\right)$$
$$\qquad + \sum_{\pm} \frac{\mu_3''}{h} f\!\left(\frac{\lambda_3''}{\mu_3''}\frac{|u_{i+2,j\pm 1} - u_{i,j}|^2}{h}\right) + \sum_{\pm} \frac{\mu_3''}{h} f\!\left(\frac{\lambda_3''}{\mu_3''}\frac{|u_{i\pm 1,j+2} - u_{i,j}|^2}{h}\right) + |u_{i,j} - g_{i,j}^h|^2 \Bigg\}.$$

We choose $\mu_3' = \mu_3/\sqrt{2}$ and $\mu_3'' = \mu_3/\sqrt{5}$. Now, as $n \to \infty$, the limit points of the minimizers of $E_n^3$ minimize

$$E_\infty^3(u) = \bar\lambda_3 \int_\Omega |\nabla u(x)|^2\,dx + \bar\mu_3 \Lambda_3(S_u) + \int_\Omega |u(x) - g(x)|^2\,dx,$$

with $\bar\lambda_3 = \lambda_3 + 2\lambda_3' + 10\lambda_3''$,

$$\Lambda_3(E) = \frac{\pi}{16} \int_E \Bigg( |\nu_1(x)| + |\nu_2(x)| + \frac{|\nu_1(x) - \nu_2(x)| + |\nu_1(x) + \nu_2(x)|}{\sqrt{2}} + \frac{|2\nu_1(x) - \nu_2(x)| + |\nu_1(x) + 2\nu_2(x)| + |2\nu_1(x) + \nu_2(x)| + |\nu_1(x) - 2\nu_2(x)|}{\sqrt{5}} \Bigg)\,d\mathcal{H}^1(x)$$

(this time $\Lambda_3 = (\Lambda_1 + \Lambda_1 \circ R_{\pi/4} + \Lambda_1 \circ R_\theta + \Lambda_1 \circ R_{-\theta})/4$ with $\theta = \operatorname{arctg} 2$), and $\bar\mu_3 = 16\mu_3/\pi$. Figure 14 illustrates how isotropic the measure $\Lambda_3$ is, and Figures 15 and 16 show examples. (Now, the length of the longest vector in $S^1$ is about 5.0% greater than the length of the shortest.) The results look slightly better than the ones obtained with energy $E_n^2$; however, the computational cost is considerably higher.

Figure 14: Same as Figure 7, this time for $\Lambda_3$.

Figure 15: Segmentation with energy $E_n^3$ (the column).

Figure 16: Segmentation with energy $E_n^3$ (the lady).
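The constants attached to $E_n^3$ can be verified from its stencil. The sketch below is an illustration with hypothetical names (`stencil3`, `gradient_form`, `jump_density`) and arbitrary sample values for $\lambda_3$, $\lambda_3'$, $\lambda_3''$, $\mu_3$: it checks that the quadratic form $\sum_\xi \lambda_\xi \langle p, \xi\rangle^2$ equals $(\lambda_3 + 2\lambda_3' + 10\lambda_3'')|p|^2$, that the jump density averages to $16\mu_3/\pi$ times... more precisely to $2\mu_3 \cdot 8/\pi$ over the circle, and that its largest value exceeds its smallest by about 5%. Each offset stands for the pair $\pm\xi$, counted once as in the formula for $E_n^3$.

```python
import math

def stencil3(lam, lam1, lam2, mu):
    """Offsets of E_n^3 with their (lambda, mu) weights; mu' = mu/sqrt(2), mu'' = mu/sqrt(5)."""
    mu1, mu2 = mu / math.sqrt(2), mu / math.sqrt(5)
    return [
        ((1, 0), lam, mu), ((0, 1), lam, mu),
        ((1, 1), lam1, mu1), ((-1, 1), lam1, mu1),
        ((2, 1), lam2, mu2), ((2, -1), lam2, mu2),
        ((1, 2), lam2, mu2), ((-1, 2), lam2, mu2),
    ]

def gradient_form(stencil, p):
    """Sum over the stencil of lambda_xi * <p, xi>^2."""
    return sum(lam * (p[0] * xi[0] + p[1] * xi[1]) ** 2 for xi, lam, _ in stencil)

def jump_density(stencil, theta):
    """Sum over the stencil of mu_xi * |<nu, xi>| for nu = (cos theta, sin theta)."""
    nu = (math.cos(theta), math.sin(theta))
    return sum(mu * abs(nu[0] * xi[0] + nu[1] * xi[1]) for xi, _, mu in stencil)

lam, lam1, lam2, mu = 1.0, 0.3, 0.1, 2.0   # arbitrary sample values
s = stencil3(lam, lam1, lam2, mu)
samples = 20000
densities = [jump_density(s, 2 * math.pi * k / samples) for k in range(samples)]
ratio3 = max(densities) / min(densities)
mean_density = sum(densities) / samples
print(ratio3, mean_density, 16 * mu / math.pi)
```

The isotropy of the gradient term holds for any choice of $\lambda_3, \lambda_3', \lambda_3''$, whereas the near-isotropy of the length term relies on the specific weights $\mu_3/\sqrt{2}$ and $\mu_3/\sqrt{5}$.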

A Proof of Theorems 11 and 12

This last section is entirely devoted to the proof of the theorems of section 5.4. Most of it relies on Gobbino's work [43], except for some adaptations and a slight difference in the proof (which avoids the use of Gobbino's technical Lemmas 3.1 and 3.2 in [43], and makes it simpler).

We let for any $u \in L^p(\Omega)$

$$F'(u) = \big(\Gamma\text{-}\liminf_{h \downarrow 0} F_h\big)(u) \quad \text{and} \quad F''(u) = \big(\Gamma\text{-}\limsup_{h \downarrow 0} F_h\big)(u).$$

In the next section A.1 we will prove a preliminary lemma that will be helpful in the sequel. Then, the aim of the following two sections A.2 and A.3 will be to prove Theorem 11, i.e., to prove that $F'(u) \ge F(u)$ and $F''(u) \le F(u)$ for all $u \in L^p(\Omega)$. Finally, in section A.4, we will prove Theorem 12.

A.1 A compactness lemma

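The lemma we show in this section will be needed to establish Theorem 12, but it will also give some a priori information on the regularity of functions $u \in L^p(\Omega)$ such that $F'(u) < +\infty$.

Lemma 7. Let $h_j \downarrow 0$ and $u_{h_j} \in \ell^p(\Omega \cap h_j\mathbb{Z}^N)$ such that

$$\sup_j F_{h_j}(u_{h_j}) < +\infty \quad \text{and} \quad \sup_j \|u_{h_j}\|_\infty < +\infty.$$

Then there exist a subsequence (not relabeled) $u_{h_j}$ and $u \in SBV_{loc}(\Omega)$ such that $u_{h_j}(x) \to u(x)$ a.e. in $\Omega$ as $j$ goes to infinity, and

$$\int_\Omega |\nabla u(x)|^2\,dx + \mathcal{H}^{N-1}(S_u) < +\infty.$$

Proof. In order to simplify the notations we drop the subscript $j$. Let $f = \min_{i=1,\dots,N} f_{e_i}$, $c = \min_{i=1,\dots,N} \rho(e_i) > 0$, and choose $\alpha, \beta > 0$ such that $\alpha t \wedge \beta \le f(t)$ for all $t \ge 0$. We have:

$$F_h(u_h) \ge 2c\,h^N \sum_{x \in h\mathbb{Z}^N} \sum_{i=1}^N \left( \alpha \left|\frac{u_h(x) - u_h(x + he_i)}{h}\right|^2 \wedge \frac{\beta}{h} \right)$$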

(remember we only sum on $x$ such that $x, x + he_i \in \Omega$). We first show that the sequence $(u_h)$ is bounded in $BV_{loc}(\Omega)$ (so that it is compact in $L^1_{loc}(\Omega)$). Choose $R > 0$, $i \in \{1,\dots,N\}$, and write (with $\Omega_{\frac12\sqrt{N}h} = \{x \in \Omega : \operatorname{dist}(x,\partial\Omega) > \frac12\sqrt{N}h\}$)

$$|D_i u_h|(\Omega_{\frac12\sqrt{N}h} \cap B_R(0)) \le \sum_{x \in h\mathbb{Z}^N \cap B_R(0)} h^{N-1} |u_h(x) - u_h(x + he_i)| \le 2\|u_h\|_\infty h^{N-1}\,\# X_+ + h^N \sum_{x \in X_-} \left|\frac{u_h(x) - u_h(x + he_i)}{h}\right|$$

where $X_+ = \{x \in h\mathbb{Z}^N \cap B_R(0) : |u_h(x) - u_h(x + he_i)| > \sqrt{\beta h/\alpha}\}$ and $X_- = h\mathbb{Z}^N \cap B_R(0) \setminus X_+$. Of course, we only consider points $x \in h\mathbb{Z}^N$ such that $x$ and $x + he_i$ belong to $\Omega$. Then, using the Cauchy-Schwarz inequality,

$$|D_i u_h|(\Omega_{\frac12\sqrt{N}h} \cap B_R(0)) \le 2\|u_h\|_\infty h^{N-1}\,\# X_+ + C R^{N/2} \left\{ h^N \sum_{x \in X_-} \left|\frac{u_h(x) - u_h(x + he_i)}{h}\right|^2 \right\}^{1/2} \le \|u_h\|_\infty \frac{F_h(u_h)}{c\beta} + C R^{N/2} \sqrt{\frac{F_h(u_h)}{2c\alpha}}$$

with $C$ some constant depending only on $N$, so that eventually, for any $\delta > 0$,

$$\sup_h |Du_h|(\Omega_\delta \cap B_R(0)) < +\infty.$$

This shows that upon extracting a subsequence we may assume that $u_h$ converges almost everywhere in $\Omega$ (and in $L^p_{loc}(\Omega)$, as well) to some function $u$ that belongs to $BV(\Omega \cap B_R(0))$ for any $R > 0$.

Now consider the extension of $u_h$ (on $\mathbb{R}^N$, $u_h(x)$ being considered to be 0 outside of $\Omega$)

$$v_h(y) = \sum_{x \in h\mathbb{Z}^N} u_h(x)\,\Delta^N\!\left(\frac{y - x}{h}\right),$$

where $\Delta(t) = (1 - |t|)^+$ for any $t \in \mathbb{R}$ and $\Delta^N(y) = \prod_{i=1}^N \Delta(y_i)$ for any $y \in \mathbb{R}^N$. We estimate $|\nabla v_h|^2$ on an elementary cell, for instance $(0,h)^N$:

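$$\int_{(0,h)^N} |\partial_1 v_h(y)|^2\,dy = \int_{(0,h)^N} \Bigg| \sum_{x \in \{0,h\}^N} u_h(x)\,\frac{1}{h}\Delta'\!\left(\frac{y_1 - x_1}{h}\right) \prod_{i=2}^N \Delta\!\left(\frac{y_i - x_i}{h}\right) \Bigg|^2 dy$$

$$= h \int_{(0,h)^{N-1}} \Bigg| \sum_{x \in \{0,h\}^{N-1}} \frac{u_h(h,x) - u_h(0,x)}{h} \prod_{i=2}^N \Delta\!\left(\frac{y_i - x_i}{h}\right) \Bigg|^2 dy_2 \dots dy_N$$

$$= h \int_{(0,h)^{N-2}} dy_3 \dots dy_N \int_0^h dy_2\, \Bigg| \frac{y_2}{h} \sum_{x \in \{0,h\}^{N-2}} \frac{u_h(h,h,x) - u_h(0,h,x)}{h} \prod_{i=3}^N \Delta\!\left(\frac{y_i - x_i}{h}\right) + \left(1 - \frac{y_2}{h}\right) \sum_{x \in \{0,h\}^{N-2}} \frac{u_h(h,0,x) - u_h(0,0,x)}{h} \prod_{i=3}^N \Delta\!\left(\frac{y_i - x_i}{h}\right) \Bigg|^2$$

$$\le \frac{h^2}{2} \int_{(0,h)^{N-2}} dy_3 \dots dy_N \Bigg\{ \Bigg| \sum_{x \in \{0,h\}^{N-2}} \frac{u_h(h,h,x) - u_h(0,h,x)}{h} \prod_{i=3}^N \Delta\!\left(\frac{y_i - x_i}{h}\right) \Bigg|^2 + \Bigg| \sum_{x \in \{0,h\}^{N-2}} \frac{u_h(h,0,x) - u_h(0,0,x)}{h} \prod_{i=3}^N \Delta\!\left(\frac{y_i - x_i}{h}\right) \Bigg|^2 \Bigg\};$$

by induction we deduce that

$$\int_{(0,h)^N} |\partial_1 v_h(y)|^2\,dy \le \frac{h^N}{2^{N-1}} \sum_{x \in \{0,h\}^{N-1}} \left|\frac{u_h(h,x) - u_h(0,x)}{h}\right|^2.$$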

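Notice that we could therefore conclude that

$$\int_{\Omega_{\sqrt{N}h}} |\nabla v_h(y)|^2\,dy \le h^N \sum_{x \in h\mathbb{Z}^N} \sum_{i=1}^N \left|\frac{u_h(x) - u_h(x + he_i)}{h}\right|^2$$

(with $\Omega_{\sqrt{N}h} = \{x \in \Omega : \operatorname{dist}(x,\partial\Omega) > \sqrt{N}h\}$, since we control the gradient of $v_h$ only on the cubes $x + (0,h)^N$, $x \in h\mathbb{Z}^N$, whose $2^N$ vertices all belong to $\Omega$), but since we cannot control the right-hand side of this expression if it is summed over all $x$ we must introduce a slight modification of $v_h$: we thus define $\hat v_h = v_h$, except each time

$$|u_h(x) - u_h(x + he_i)| > \sqrt{\frac{\beta h}{\alpha}}, \tag{A.1}$$

in which case we set $\hat v_h \equiv 0$ on $(x, x + he_i) \times \prod_{i' \ne i} (x - he_{i'}, x + he_{i'}) = U^h_{x,e_i}$. The new function $\hat v_h$ is in $SBV_{loc}(\Omega)$, and $S_{\hat v_h} \subseteq \bigcup_{(x,e_i) \in X_h} \partial U^h_{x,e_i}$ where the union is taken on $X_h = \{(x,e_i) : \text{(A.1) holds}\}$. Now, we can write

$$\int_{\Omega_{\sqrt{N}h}} |\nabla \hat v_h(y)|^2\,dy \le h^N \sum_{(x,e_i) \notin X_h} \left|\frac{u_h(x) - u_h(x + he_i)}{h}\right|^2 \le \frac{F_h(u_h)}{2c\alpha}; \tag{A.2}$$

moreover since $\mathcal{H}^{N-1}(\partial U^h_{x,e_i}) = \kappa h^{N-1}$ (with $\kappa = 2^{N-1}(N+1)$),

$$\mathcal{H}^{N-1}(\Omega_{\sqrt{N}h} \cap S_{\hat v_h}) \le \kappa\,\# X_h\,h^{N-1} \le \kappa\,\frac{F_h(u_h)}{2c\beta}. \tag{A.3}$$

From (A.2) and (A.3) and since $\sup_h \|\hat v_h\|_\infty < +\infty$, we deduce invoking Ambrosio's Theorem 6 (see section 2.2.6) that some subsequence of $\hat v_h$ converges to a function $v \in L^\infty(\Omega) \cap SBV_{loc}(\Omega)$, with

$$\int_\Omega |\nabla v(x)|^2\,dx + \mathcal{H}^{N-1}(S_v) = \sup_{A \subset\subset \Omega} \int_A |\nabla v(x)|^2\,dx + \mathcal{H}^{N-1}(A \cap S_v) \le \frac{1}{2c}\left(\frac{1}{\alpha} + \frac{\kappa}{\beta}\right) \liminf_{h \downarrow 0} F_h(u_h) < +\infty.$$

The proof of the lemma is achieved once we notice that $v$ must be equal to $u$ (as for instance by the construction of $v_h$ and $\hat v_h$ it is simple to check that for any $A \subset\subset \Omega$ with regular boundary $\int_A (u_h(y) - \hat v_h(y))\,dy \to 0$ as $h \downarrow 0$).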

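Remark. If we drop the condition $\rho(e_i) > 0$ for $i = 1,\dots,N$, the result may be false. For instance, if $N = 1$, $\rho \equiv 0$ except at $-2$ and $2$ where $\rho(-2) = \rho(2) = 1$, the family $(u_h)_{h>0}$ defined by

$$u_h(kh) = \begin{cases} 0 & \text{if } k \in 2\mathbb{Z} \\ 1 & \text{if } k \in 2\mathbb{Z} + 1 \end{cases} \qquad \text{for every } k \in \mathbb{Z}$$

satisfies the assumptions of Lemma 7 but is not compact.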
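The counterexample of this Remark is easy to reproduce. The sketch below (an illustration with hypothetical names, using $f(t) = \min(t,1)$ and a finite signal of $n = 1/h$ samples) evaluates the one-dimensional energy for the oscillating family, once with the stencil $\rho(\pm 2) = 1$ and once with the nearest-neighbour stencil $\rho(\pm 1) = 1/2$.

```python
def f(t):
    return min(t, 1.0)

def energy_1d(u, h, rho):
    """F_h(u) of (31) for N = 1; rho maps integer offsets xi to weights."""
    n = len(u)
    total = 0.0
    for xi, weight in rho.items():
        for k in range(max(0, -xi), min(n, n - xi)):
            total += weight * f((u[k] - u[k + xi]) ** 2 / h) / h
    return h * total

def oscillating(n):
    """u_h(kh) = 0 for even k and 1 for odd k."""
    return [float(k % 2) for k in range(n)]

for n in (10, 100, 1000):
    u = oscillating(n)
    h = 1.0 / n
    blind = energy_1d(u, h, {2: 1.0, -2: 1.0})      # stencil of the Remark
    nearest = energy_1d(u, h, {1: 0.5, -1: 0.5})    # stencil with rho(e_1) > 0
    print(n, blind, nearest)
```

With $\rho(\pm 2)$ only, the energy vanishes for every $h$ although the family oscillates between 0 and 1, so no bound on $F_h$ can give compactness. With $\rho(\pm 1) > 0$ the same family has an energy that grows like $n - 1$, which is how the condition $\rho(e_i) > 0$ restores coercivity.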

A.2 Estimate from below of the Γ-limit

In this section we wish to prove that for all $u \in L^p(\Omega)$,

$$F(u) \le F'(u). \tag{A.4}$$

We must therefore prove that for any $u \in L^p(\Omega)$ and any sequence $(u_{h_j})$ that converges to $u$ in $L^p(\Omega)$ as $j \to \infty$ (with $\lim_{j\to\infty} h_j = 0$) we have

$$F(u) \le \liminf_{j\to\infty} F_{h_j}(u_{h_j}). \tag{A.5}$$

Let $u \in L^p(\Omega)$, and we will suppose first that it is bounded. Choose also an arbitrary decreasing sequence $h_j \downarrow 0$ and functions $u_{h_j}$ that converge to $u$ in $L^p(\Omega)$. We can assume that $\|u_{h_j}\|_\infty \le \|u\|_\infty$, as truncating $u_{h_j}$ we decrease its energy $F_{h_j}(u_{h_j})$. It is clearly not restrictive to consider, as well, that the $\liminf$ is in fact a limit, and that $\sup_j F_{h_j}(u_{h_j}) < +\infty$ (since if $\liminf_{j\to\infty} F_{h_j}(u_{h_j}) = +\infty$ the result is obvious). In view of Lemma 7 we deduce that

$$u \in SBV_{loc}(\Omega) \quad \text{and} \quad \int_\Omega |\nabla u(x)|^2\,dx + \mathcal{H}^{N-1}(S_u) < +\infty. \tag{A.6}$$

In the sequel we will drop the subscripts $j$ and write "$h \downarrow 0$" for "$j \to \infty$".

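We prove (A.5) following Gobbino's method in [43], with a few modifications and adaptations. Let

$$\hat\Omega_h = \bigcup_{x \in h\mathbb{Z}^N \cap \Omega} \left( x + \left[-\frac{h}{2}, \frac{h}{2}\right)^N \right)$$

and notice that $\Omega_{\frac12\sqrt{N}h} = \{x \in \Omega : \operatorname{dist}(x,\partial\Omega) > \frac12\sqrt{N}h\} \subset \hat\Omega_h$. We have (still using the convention that we only consider in the sums the points that fall inside $\Omega$)

$$F_h(u_h) = h^N \sum_{x \in h\mathbb{Z}^N} \sum_{\xi \in \mathbb{Z}^N} \frac{1}{h} f_\xi\!\left(\frac{(u_h(x) - u_h(x + h\xi))^2}{h}\right) \rho(\xi)$$

$$= \int_{\hat\Omega_h} dy \sum_{\xi \in \mathbb{Z}^N \cap \frac{1}{h}(\hat\Omega_h - y)} \frac{1}{h} f_\xi\!\left(\frac{(u_h(y) - u_h(y + h\xi))^2}{h}\right) \rho(\xi)$$

$$= \sum_{\xi \in \mathbb{Z}^N} \rho(\xi) \int_{\hat\Omega_h \cap (\hat\Omega_h - h\xi)} \frac{1}{h} f_\xi\!\left(\frac{(u_h(y) - u_h(y + h\xi))^2}{h}\right) dy.$$

For every $\xi \in \mathbb{Z}^N$ we let

$$\hat F_h(u_h, \xi) = \int_{\hat\Omega_h \cap (\hat\Omega_h - h\xi)} \frac{1}{h} f_\xi\!\left(\frac{(u_h(y) - u_h(y + h\xi))^2}{h}\right) dy.$$

Inequality (A.5) will follow by Fatou's lemma if we prove that for any $\xi$,

$$\liminf_{h \downarrow 0} \hat F_h(u_h, \xi) \ge \lambda_\xi \int_\Omega |\langle \nabla u(x), \xi \rangle|^2\,dx + \mu_\xi \int_{S_u} |\langle \nu_u(x), \xi \rangle|\,d\mathcal{H}^{N-1}(x). \tag{A.7}$$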

We choose $A \subset\subset \Omega$. If $h$ is small enough (i.e., $h \le \operatorname{dist}(A,\partial\Omega)/(|\xi| + \frac12\sqrt{N})$) then

$$\hat F_h(u_h, \xi) \ge \int_A \frac{1}{h} f_\xi\!\left(\frac{(u_h(y) - u_h(y + h\xi))^2}{h}\right) dy$$

and it will be sufficient to show that

$$\liminf_{h \downarrow 0} \int_A \frac{1}{h} f_\xi\!\left(\frac{(u_h(y) - u_h(y + h\xi))^2}{h}\right) dy \ge \lambda_\xi \int_A |\langle \nabla u(x), \xi \rangle|^2\,dx + \mu_\xi \int_{S_u \cap A} |\langle \nu_u(x), \xi \rangle|\,d\mathcal{H}^{N-1}(x), \tag{A.8}$$

as the supremum of the right-hand side of (A.8) for all $A \subset\subset \Omega$ is the right-hand side of (A.7). This is part of Gobbino's result [43], but we present a slightly different approach, still based on the slicing (see section 2.2.6 for technical details) of the functions $u_h$ in the direction $\xi$.

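Let $\xi^\perp = \{z \in \mathbb{R}^N : \langle z, \xi \rangle = 0\}$, and for every $z \in \xi^\perp$, $A_{z,\xi} = \{s \in \mathbb{R} : z + s\xi \in A\}$, $(u_h)_{z,\xi}(s) = u_h(z + s\xi)$. We rewrite the first integral over $A$ in (A.8):

$$\int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \int_{A_{z,\xi}} \frac{1}{h} f_\xi\!\left(\frac{((u_h)_{z,\xi}(s) - (u_h)_{z,\xi}(s + h))^2}{h}\right) |\xi|\,ds$$

$$= |\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \sum_{k \in \mathbb{Z}} \int_{A_{z,\xi} \cap [kh, kh+h)} \frac{1}{h} f_\xi\!\left(\frac{((u_h)_{z,\xi}(s) - (u_h)_{z,\xi}(s + h))^2}{h}\right) ds$$

$$= |\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \int_{[0,h)} dt \left\{ \sum_{k \in \mathbb{Z}} \frac{1}{h} f_\xi\!\left(\frac{((u_h)_{z,\xi}(t + kh) - (u_h)_{z,\xi}(t + (k+1)h))^2}{h}\right) \right\}$$

(by the change of variable $t + kh = s$) where the sum is taken only on the $k \in \mathbb{Z}$ such that $t + kh \in A_{z,\xi}$. Now, with the change of variable $t = h\tau$, this becomes

$$|\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \int_0^1 d\tau \left\{ h \sum_{k \in \mathbb{Z}} \frac{1}{h} f_\xi\!\left(\frac{((u_h)_{z,\xi}((\tau + k)h) - (u_h)_{z,\xi}((\tau + k + 1)h))^2}{h}\right) \right\}.$$

We will prove that for a.e. $(z,\tau) \in \xi^\perp \times (0,1)$,

$$\liminf_{h \downarrow 0}\, h \sum_{\substack{k \in \mathbb{Z} \\ (\tau + k)h \in A_{z,\xi}}} \frac{1}{h} f_\xi\!\left(\frac{((u_h)_{z,\xi}((\tau + k)h) - (u_h)_{z,\xi}((\tau + k + 1)h))^2}{h}\right) \ge \lambda_\xi \int_{A_{z,\xi}} |\dot u_{z,\xi}(x)|^2\,dx + \mu_\xi\,\mathcal{H}^0(S_{u_{z,\xi}} \cap A_{z,\xi}). \tag{A.9}$$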

In order to prove (A.9), we need some information on the limit of $((u_h)_{z,\xi}((\tau + k)h))_{k \in \mathbb{Z}}$ as $h \downarrow 0$. Since, using the same changes of variables,

$$\int_\Omega |u_h(y) - u(y)|^p\,dy = \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \int_{\Omega_{z,\xi}} |(u_h)_{z,\xi}(s) - u_{z,\xi}(s)|^p\,|\xi|\,ds$$

$$= |\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \int_0^1 d\tau \left\{ h \sum_{k \in \mathbb{Z}} |(u_h)_{z,\xi}((\tau + k)h) - u_{z,\xi}((\tau + k)h)|^p \right\}$$

(where in the sum we consider only $k$ such that $(\tau + k)h \in \Omega_{z,\xi}$) we may assume (upon extracting a subsequence) that for a.e. $(z,\tau) \in \xi^\perp \times (0,1)$,

$$\lim_{h \downarrow 0}\, h \sum_{k \in \mathbb{Z}} |(u_h)_{z,\xi}((\tau + k)h) - u_{z,\xi}((\tau + k)h)|^p = 0. \tag{A.10}$$

Choose a $(z,\tau)$ such that (A.10) holds. By (A.6) we may also assume when choosing $z$ that $u_{z,\xi} \in SBV_{loc}(\Omega_{z,\xi})$ and

$$\int_{\Omega_{z,\xi}} |\dot u_{z,\xi}(s)|^2\,ds + \mathcal{H}^0(S_{u_{z,\xi}}) < +\infty,$$

so that $u_{z,\xi}$ is continuous except at a finite number of points. Thus, for almost all $s \in \Omega_{z,\xi}$,

$$\lim_{h \downarrow 0} u_{z,\xi}\!\left(\left(\tau + \left[\frac{s}{h}\right]\right)h\right) = u_{z,\xi}(s)$$

(where $[\cdot]$ denotes the integer part). We easily deduce from this and (A.10) that the piecewise constant function $v_h : \Omega_{z,\xi} \to \mathbb{R}$ defined by

$$v_h(s) = (u_h)_{z,\xi}\!\left(\left(\tau + \left[\frac{s}{h}\right]\right)h\right)$$

converges to $u_{z,\xi}$ in $L^p_{loc}(\Omega_{z,\xi})$.

Remark. Following Gobbino (proof of Lemma 3.3, Step 2 in [43]) we could also prove that for a.e. $\tau \in (0,1)$, $u_{z,\xi}((\tau + [s/h])h) \to u_{z,\xi}(s)$ in $L^1_{loc}(\Omega_{z,\xi})$, so that the a priori information on the regularity of $u$ is not really needed.

We return to the proof of inequality (A.9). Notice that if $f_\xi(t) = t \wedge 1$, it is simply a consequence of Theorem 9. The proof that follows is needed because we want to consider more general functions $f_\xi$, and provide a generalization to such functions of the thesis of Theorem 9. For any $I \subset\subset A_{z,\xi}$, we denote

$$G(v_h, I) = \int_I \frac{1}{h} f_\xi\!\left(\frac{|v_h(s + h) - v_h(s)|^2}{h}\right) ds = \sum_{k \in \mathbb{Z}} |(kh, kh + h) \cap I|\,\frac{1}{h} f_\xi\!\left(\frac{|v_h((k+1)h) - v_h(kh)|^2}{h}\right).$$

If $h$ is small enough, $(\tau + [s/h])h \in A_{z,\xi}$ for every $s \in I$ so that the $\liminf$ in (A.9) is greater than $\liminf_{h \downarrow 0} G(v_h, I)$. Therefore, we just need to prove that for any $I \subset\subset A_{z,\xi}$,

$$\liminf_{h \downarrow 0} G(v_h, I) \ge \lambda_\xi \int_I |\dot u_{z,\xi}(s)|^2\,ds + \mu_\xi\,\mathcal{H}^0(S_{u_{z,\xi}} \cap I); \tag{A.11}$$

indeed, taking then the least upper bound of the right-hand term of (A.11) for all $I$, we will get (A.9).

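Because of the super-additivity of $\liminf_{h\downarrow 0} G(v_h, \cdot)$ we may assume without loss of generality that $I$ is an interval. To prove (A.11), we then choose $\alpha, \beta > 0$ such that $\alpha t \wedge \beta \le f(t)$ for all $t \ge 0$ (noticing that $\alpha$, respectively $\beta$, may be chosen as close as wanted to $\lambda$, resp., $\mu$), and we write
\[
G(v_h, I) \;\ge\; \sum_{(kh,kh+h)\subset I} \alpha\,\frac{|v_h((k+1)h) - v_h(kh)|^2}{h} \wedge \beta.
\]
Redefining a function $\tilde v_h$ with $\tilde v_h(kh) = v_h(kh)$ for $kh \in I$, affine on the intervals $(kh, kh+h) \subset I$ such that $\alpha\,|v_h((k+1)h) - v_h(kh)|^2/h \le \beta$ and piecewise constant, jumping once on the intervals with the reverse inequality (just like in the proof of Theorem 9), we get
\[
G(v_h, I) \;\ge\; \alpha \int_{I_h} |\dot{\tilde v}_h(s)|^2\,ds + \beta\, \mathcal{H}^0(S_{\tilde v_h} \cap I_h)
\]
with $I_h = \{x \in I : \mathrm{dist}(x, \mathbb{R}\setminus I) > h\}$, so that invoking Theorem 6 (section 2.2.6) we get the existence of a function $\tilde v$ such that some subsequence of $\tilde v_h$ goes to $\tilde v$ a.e., and that satisfies
\[
\alpha \int_I |\dot{\tilde v}(s)|^2\,ds + \beta\, \mathcal{H}^0(S_{\tilde v} \cap I) \;\le\; \liminf_{h\downarrow 0} G(v_h, I). \tag{A.12}
\]
We check then that $\tilde v$ has to be equal to $u_{z,\xi}$ (noticing easily, for instance, that $(v_h - \tilde v_h) \rightharpoonup 0$ weakly in $L^p$). If $\alpha \to \lambda$ we deduce from (A.12) that
\[
\lambda \int_I |\dot u_{z,\xi}(s)|^2\,ds \;\le\; \liminf_{h\downarrow 0} G(v_h, I), \tag{A.13}
\]
whereas sending $\beta$ to $\mu$ we get
\[
\mu\, \mathcal{H}^0(S_{u_{z,\xi}} \cap I) \;\le\; \liminf_{h\downarrow 0} G(v_h, I). \tag{A.14}
\]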

Inequality (A.11) is deduced from the last two inequalities by subdividing the interval $I$ into suitable subintervals (the connected components of a small neighborhood of $S_{u_{z,\xi}}$ and its complement) and using the appropriate inequality (A.13) or (A.14) in each subinterval. Hence (A.9) holds, and using Fatou's lemma we deduce (A.8), as
\[
|\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \left( \lambda \int_{A_{z,\xi}} |\dot u_{z,\xi}(s)|^2\,ds + \mu\, \mathcal{H}^0(S_{u_{z,\xi}} \cap A_{z,\xi}) \right)
= \lambda \int_A |\langle \nabla u(x), \xi\rangle|^2\,dx + \mu \int_{S_u \cap A} |\langle \nu_u(x), \xi\rangle|\,d\mathcal{H}^{N-1}(x).
\]
Inequality (A.5) therefore holds in the case $u \in L^\infty(\Omega)$.
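The redefinition of $v_h$ into $\tilde v_h$ used to obtain (A.12) can be checked on a toy example. The sketch below is only an illustration under stated assumptions, not part of the original proof: the sample signal, the values of `alpha` and `beta`, and the helper names `truncated_sum` and `reconstructed_energy` are all hypothetical. It verifies that, cell by cell, the truncated sum coincides with $\alpha\int|\dot{\tilde v}_h|^2 + \beta\,\mathcal{H}^0(S_{\tilde v_h})$ when $\tilde v_h$ is affine on the "small increment" cells and jumps once on the others.

```python
import numpy as np

def truncated_sum(samples, h, alpha, beta):
    """Sum over cells of min(alpha * |v((k+1)h) - v(kh)|^2 / h, beta)."""
    d = np.diff(samples)
    return float(np.sum(np.minimum(alpha * d**2 / h, beta)))

def reconstructed_energy(samples, h, alpha, beta):
    """alpha * int |v~'|^2 + beta * (number of jumps) for the reconstructed v~_h.

    On a cell where alpha * d^2 / h <= beta, v~_h is affine with slope d / h,
    so the cell contributes alpha * (d / h)^2 * h. On every other cell v~_h
    is piecewise constant with exactly one jump and contributes beta.
    """
    d = np.diff(samples)
    affine = alpha * d**2 / h <= beta
    dirichlet = alpha * np.sum(d[affine] ** 2) / h
    n_jumps = int(np.count_nonzero(~affine))
    return float(dirichlet + beta * n_jumps), n_jumps

# Hypothetical samples v_h(kh): a smooth part plus one jump near s = 0.5.
h = 0.01
s = np.arange(0.0, 1.0 + h / 2, h)
samples = np.sin(2 * np.pi * s) + 1.5 * (s > 0.5)
alpha, beta = 0.9, 0.8

lhs = truncated_sum(samples, h, alpha, beta)
rhs, n_jumps = reconstructed_energy(samples, h, alpha, beta)
print(lhs, rhs, n_jumps)
```

In this toy case the two quantities agree up to rounding, which is the discrete identity behind the inequality preceding (A.12); the inequality in the text only appears because $I$ is replaced by the slightly smaller set $I_h$.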

Now, if $u \in L^p(\Omega)$ is not bounded, choose again $u_{h_j} \to u$ in $L^p(\Omega)$. Consider $u^k = (-k \vee u) \wedge k$ and $u^k_{h_j} = (-k \vee u_{h_j}) \wedge k$, clearly $u^k_{h_j} \to u^k$ in $L^p(\Omega)$, so that
\[
F(u^k) \;\le\; \liminf_{j\to\infty} F_{h_j}(u^k_{h_j}).
\]
But as $f$ is increasing, $F_{h_j}(u^k_{h_j}) \le F_{h_j}(u_{h_j})$ so that
\[
F(u^k) \;\le\; \liminf_{j\to\infty} F_{h_j}(u_{h_j}).
\]
If this is finite, we conclude by noticing that $\lim_{k\to\infty} F(u^k) = F(u)$ (by (16), (17)); so that the proof of (A.4) is achieved.
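The monotonicity step $F_{h_j}(u^k_{h_j}) \le F_{h_j}(u_{h_j})$ can be observed numerically. The following is a minimal sketch under simplifying assumptions that are not in the text: a one-dimensional signal, nearest-neighbour differences only, and the particular choice $f(t) = t \wedge 1$. The names `f`, `discrete_energy` and the random test signal are hypothetical.

```python
import numpy as np

def f(t, lam=1.0, mu=1.0):
    # One admissible choice (an assumption): nondecreasing with f(0) = 0.
    return np.minimum(lam * t, mu)

def discrete_energy(u, h):
    """1D nearest-neighbour analogue: h * sum_k (1/h) f(|u_{k+1} - u_k|^2 / h)."""
    d = np.diff(u)
    return float(np.sum(f(d**2 / h)))

rng = np.random.default_rng(0)
h = 0.05
u = np.cumsum(rng.normal(scale=0.3, size=200))   # a signal with large values

levels = (0.5, 1.0, 2.0)
full = discrete_energy(u, h)
# np.clip(u, -k, k) is the truncation (-k v u) ^ k used in the proof.
energies = [discrete_energy(np.clip(u, -k, k), h) for k in levels]
print(energies, full)
```

Truncation is 1-Lipschitz, so every difference $|u^k(x) - u^k(x')|$ is at most $|u(x) - u(x')|$; since $f$ is nondecreasing, each term of the sum can only decrease.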

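Remark. Notice that if $u_{h_j} \to u$ in $L^p_{\mathrm{loc}}(\Omega)$, the result still holds. Indeed, for any $A \subset\subset \Omega$ we have $u_{h_j} \to u$ in $L^p(A)$ and since the result holds in this case we can write
\[
F(u, A) \;\le\; \liminf_{j\to\infty} F_{h_j}(u_{h_j}, A) \;\le\; \liminf_{j\to\infty} F_{h_j}(u_{h_j}, \Omega).
\]
Then, as $F(u, \Omega) = \sup_{A \subset\subset \Omega} F(u, A)$ we get (A.5). (Thus the $F_h$ also $\Gamma$-converge to $F$ in $L^p(\Omega)$ endowed with the $L^p_{\mathrm{loc}}(\Omega)$ topology.)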

A.3 Estimate from above of the $\Gamma$-limit

Given $u \in GSBV_{\mathrm{loc}}(\Omega) \cap L^p(\Omega)$ with $F(u) = F(u, \Omega) < +\infty$, we want to build $u_h \in \ell^p(\Omega \cap h\mathbb{Z}^N)$ such that $u_h \to u$ in $L^p(\Omega)$ and
\[
\limsup_{h\downarrow 0} F_h(u_h) \;\le\; F(u). \tag{A.15}
\]
In order to be able to assume some regularity on the function $u$ we first prove the following lemma. It is a (simpler) variant of the results in [30] and [26] that are usually needed to show the $\Gamma$-lim sup inequality for most approximations of the Mumford–Shah functional, like Ambrosio and Tortorelli's. For $F_\varepsilon$, however, a very strong regularity of the jump set is not needed, and this lemma is sufficient.

Lemma 8 Let $u \in GSBV_{\mathrm{loc}}(\Omega) \cap L^p(\Omega)$ with $F(u) < +\infty$. There exists a sequence $(u_k)_{k\ge 1} \subset SBV(\Omega)$ of bounded functions with bounded supports, that are almost everywhere continuous in $\Omega$ and such that

• $u_k \to u$ in $L^p(\Omega)$ as $k$ goes to infinity,
• $\lim_{k\to\infty} F(u_k) = F(u)$.

Remark. The information on the support of $u_k$ makes sense only when $\Omega$ is unbounded.

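Proof. For every integer $k \ge 1$ first let $u^k = (-k \vee u) \wedge k$ be the truncation of $u$ at level $k$. We choose in $L^p(\mathbb{R}^N)$ a minimizer $v_k$ of
\[
v \;\mapsto\; F(v) + k \int_{\mathbb{R}^N} |v(x) - u^k(x)|^p\,dx.
\]
Then,
\[
\|v_k - u\|_{L^p(\mathbb{R}^N)} \le \|v_k - u^k\|_{L^p(\mathbb{R}^N)} + \|u^k - u\|_{L^p(\mathbb{R}^N)}
\le \left(\frac{1}{k} F(u^k)\right)^{1/p} + \left(\int_{\{|u|>k\}} (|u(x)| - k)^p\,dx\right)^{1/p}
\le \left(\frac{1}{k} F(u)\right)^{1/p} + \left(\int_{\{|u|>k\}} |u(x)|^p\,dx\right)^{1/p} \to 0
\]
as $k \to \infty$, moreover (see the observation about functional $E'$ defined by (20) in section 2.3.2), we know that $\mathcal{H}^{N-1}(\Omega \cap \overline{S_{v_k}} \setminus S_{v_k}) = 0$ and $v_k \in C^1(\Omega \setminus \overline{S_{v_k}})$.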

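In particular $v_k$ is almost everywhere continuous. We also have that $F(v_k) \le F(u^k) \le F(u)$ and $|v_k(x)| \le k$ for all $x \in \Omega$. Set now for every integer $n > 1$ and $x \in \Omega$
\[
v_{k,n}(x) =
\begin{cases}
v_k(x) - \frac{1}{n} & \text{if } v_k(x) > 1/n,\\[2pt]
0 & \text{if } |v_k(x)| \le 1/n, \text{ and}\\[2pt]
v_k(x) + \frac{1}{n} & \text{if } v_k(x) < -1/n.
\end{cases}
\]
Clearly $v_{k,n}$ is still a.e. continuous and goes to $v_k$ in $L^p(\Omega)$ as $n \to \infty$, so that we can choose $n_k$ such that $\|v_{k,n_k} - v_k\|_{L^p(\Omega)} \le 1/k$. We set $w_k = v_{k,n_k}$. We also have $S_{w_k} \subset S_{v_k}$,
\[
\nabla w_k =
\begin{cases}
\nabla v_k & \text{a.e. in } \{x \in \Omega : |v_k(x)| > 1/n_k\},\\
0 & \text{a.e. in the complement,}
\end{cases}
\]
and
\[
|\{w_k \ne 0\}| = \left|\left\{|v_k| > \frac{1}{n_k}\right\}\right| \;\le\; n_k^p \int_\Omega |v_k(x)|^p\,dx < +\infty
\]
so that in particular $w_k \in L^q(\Omega)$ for any $q \in [1, +\infty]$.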
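The three properties of $v_{k,n}$ used here (it stays within $1/n$ of $v_k$, it does not enlarge any jump, and its support has finite measure by a Chebyshev-type bound) can be checked on sampled data. The sketch below is a hedged illustration only: the sampled function standing in for $v_k$, the grid, and the helper name `shrink` are assumptions, and the integrals are replaced by Riemann sums.

```python
import numpy as np

def shrink(v, n):
    """v_{k,n}: move values towards 0 by 1/n and set values with |v| <= 1/n to 0."""
    return np.sign(v) * np.maximum(np.abs(v) - 1.0 / n, 0.0)

p, n = 2, 20
x = np.linspace(-5.0, 5.0, 2001)
dx = x[1] - x[0]
# Hypothetical stand-in for v_k: smooth decaying part plus a plateau with two jumps.
v = np.exp(-x**2) * np.cos(3 * x) + 0.7 * (np.abs(x) < 1.0)
w = shrink(v, n)

max_gap = float(np.max(np.abs(w - v)))                 # sup |w - v|, at most 1/n
support = float(np.count_nonzero(w) * dx)              # discrete |{w != 0}|
chebyshev = float(n**p * np.sum(np.abs(v) ** p) * dx)  # n^p * int |v|^p
jumps_ok = bool(np.all(np.abs(np.diff(w)) <= np.abs(np.diff(v)) + 1e-12))
print(max_gap, support, chebyshev, jumps_ok)
```

The last check reflects that the shrinkage map is 1-Lipschitz, which is why $S_{w_k} \subset S_{v_k}$ and $|\nabla w_k| \le |\nabla v_k|$.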
1 N
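Choose at last $\psi \in C_0^\infty(\mathbb{R}^N)$ with $0 \le \psi \le 1$ and $\psi \equiv 1$ on $B_1(0)$, and set for $R > 0$ and any $x \in \Omega$, $w_{k,R}(x) = \psi\!\left(\frac{x}{R}\right) w_k(x)$. For any $R$, $S_{w_{k,R}} \subset S_{w_k} \subset S_{v_k}$ and if $\xi \in \mathbb{Z}^N$,
\[
\begin{aligned}
\int_\Omega |\langle \nabla w_{k,R}(x), \xi\rangle|^2\,dx
&= \int_{B_R(0)\cap\Omega} |\langle \nabla w_k(x), \xi\rangle|^2\,dx
 + \int_{\Omega\setminus B_R(0)} \left| \psi\!\left(\tfrac{x}{R}\right) \langle \nabla w_k(x), \xi\rangle + \frac{w_k(x)}{R} \left\langle \nabla\psi\!\left(\tfrac{x}{R}\right), \xi \right\rangle \right|^2 dx \\
&\le \int_{B_R(0)\cap\Omega} |\langle \nabla v_k(x), \xi\rangle|^2\,dx
 + 2 \int_{\Omega\setminus B_R(0)} |\langle \nabla v_k(x), \xi\rangle|^2\,dx
 + \frac{C}{R^2}\,|\xi|^2 \int_{\Omega\setminus B_R(0)} |w_k(x)|^2\,dx
\end{aligned}
\]
with $C = 2\|\nabla\psi\|^2_{L^\infty(\mathbb{R}^N)}$. Hence
\[
F(w_{k,R}) \;\le\; F(v_k) + \left( \lambda \sum_{\xi\in\mathbb{Z}^N} \rho(\xi)\,|\xi|^2 \right) \left( \int_{\Omega\setminus B_R(0)} |\nabla v_k(x)|^2\,dx + \frac{C}{R^2} \int_{\Omega\setminus B_R(0)} |w_k(x)|^2\,dx \right).
\]
Since $w_k$ and $\nabla v_k$ are in $L^2(\Omega)$, we can choose $R$ large enough in order to have
\[
F(w_{k,R}) \;\le\; F(v_k) + \frac{1}{k}. \tag{A.16}
\]
Choose $R_k$ large enough so that (A.16) holds and $\|w_{k,R_k} - w_k\|_{L^p(\Omega)} \le 1/k$, and set $u_k = w_{k,R_k}$. Clearly $u_k$ is still a.e. continuous. Moreover, $F(u_k) \le F(u) + 1/k$, $u_k$ goes to $u$ in $L^p(\Omega)$ as $k \to \infty$, and by Theorem 6 (section 2.2.6) we deduce that
\[
F(u) \;\le\; \liminf_{k\to\infty} F(u_k)
\]
so that $\lim_{k\to\infty} F(u_k) = F(u)$ and the lemma is true.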

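We now establish (A.15). First consider the case $\Omega = \mathbb{R}^N$. Given $u \in GSBV_{\mathrm{loc}}(\mathbb{R}^N) \cap L^p(\mathbb{R}^N)$ with $F(u) < +\infty$, we build invoking Lemma 8 a sequence of compactly supported, bounded and a.e. continuous functions $u_k$ converging to $u$ in $L^p(\mathbb{R}^N)$ such that $F(u_k) \to F(u)$ as $k$ goes to infinity. By a standard diagonalization procedure, if we know how to build for every $u_k$ a sequence $((u_k)_h)_{h>0}$ converging to $u_k$ in $L^p(\mathbb{R}^N)$ as $h \downarrow 0$, such that
\[
\limsup_{h\downarrow 0} F_h((u_k)_h) \;\le\; F(u_k),
\]
we will be able to find $u_h$ with $u_h \to u$ and satisfying (A.15). In the sequel we may therefore assume that $u$ is bounded, compactly supported, and continuous at almost every $x \in \mathbb{R}^N$.

For $y \in (0,h)^N$ define $u_h^y \in \ell^p(h\mathbb{Z}^N)$ by $u_h^y(x) = u(y + x)$ for any $x \in h\mathbb{Z}^N$. We compute the mean of $F_h(u_h^y)$ over $(0,h)^N$:
\[
\begin{aligned}
\frac{1}{h^N} \int_{(0,h)^N} F_h(u_h^y)\,dy
&= \sum_{\xi\in\mathbb{Z}^N} \rho(\xi) \sum_{x\in h\mathbb{Z}^N} \int_{(0,h)^N} \frac{1}{h}\, f\!\left( \frac{(u(y+x) - u(y+x+h\xi))^2}{h} \right) dy \\
&= \sum_{\xi\in\mathbb{Z}^N} \rho(\xi) \int_{\mathbb{R}^N} \frac{1}{h}\, f\!\left( \frac{(u(y) - u(y+h\xi))^2}{h} \right) dy.
\end{aligned}
\]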

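At this point (following exactly Gobbino's proof), we write
\[
\begin{aligned}
\int_{\mathbb{R}^N} \frac{1}{h}\, f\!\left( \frac{(u(y) - u(y+h\xi))^2}{h} \right) dy
&= \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \int_{\mathbb{R}} dt\; \frac{1}{h}\, f\!\left( \frac{\bigl(u(z + t\tfrac{\xi}{|\xi|}) - u(z + t\tfrac{\xi}{|\xi|} + h\xi)\bigr)^2}{h} \right) \\
&= |\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \int_{\mathbb{R}} ds\; \frac{1}{h}\, f\!\left( \frac{(u(z + s\xi) - u(z + (s+h)\xi))^2}{h} \right) \\
&= |\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z)\; F^1_{\xi,h}(u_{z,\xi})
\end{aligned}
\]
where $u_{z,\xi}(s) = u(z + s\xi)$ and we have set
\[
F^1_{\xi,h}(v) = \int_{\mathbb{R}} \frac{1}{h}\, f\!\left( \frac{(v(s) - v(s+h))^2}{h} \right) ds
\]
for any measurable function $v$. Since we assumed $f(t) \le \lambda t \wedge \mu$, we have
\[
F^1_{\xi,h}(v) \;\le\; \int_{\mathbb{R}} \lambda \left| \frac{v(s) - v(s+h)}{h} \right|^2 \wedge \frac{\mu}{h}\; ds \tag{A.17}
\]
and as shown in [43] by M. Gobbino, this is less than
\[
\lambda \int_{\mathbb{R}} |\dot v(s)|^2\,ds + \mu\, \mathcal{H}^0(S_v)
\]
provided $v \in SBV_{\mathrm{loc}}(\mathbb{R})$ and this expression is finite.

Exercise. Check this fact, by computing the integral in (A.17) separately over $S_v^h = \bigcup_{s\in S_v} [s-h, s]$ and over $\mathbb{R} \setminus S_v^h$.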
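Before attempting the exercise, the bound can be observed numerically on one example. The sketch below is an illustration under assumptions that are not in the text: the test function $v(s) = \sin s + 2\cdot\mathbf{1}_{\{s > 0.3\}}$, the window $[-2, 2]$ in place of $\mathbb{R}$, the values $\lambda = \mu = 1$ (named `LAM`, `MU`), and a trapezoidal quadrature. It compares the right-hand side of (A.17), restricted to the window, with $\lambda\int|\dot v|^2 + \mu\,\mathcal{H}^0(S_v)$ on the same window.

```python
import numpy as np

LAM, MU = 1.0, 1.0      # hypothetical values of the two constants
JUMP_AT = 0.3           # v has a single jump, located here

def v(s):
    return np.sin(s) + 2.0 * (s > JUMP_AT)

def nonlocal_energy(h, a=-2.0, b=2.0, n=400_001):
    """Trapezoidal value of int_a^{b-h} min(LAM*|v(s)-v(s+h)|^2/h^2, MU/h) ds."""
    s = np.linspace(a, b - h, n)
    integrand = np.minimum(LAM * (v(s) - v(s + h)) ** 2 / h**2, MU / h)
    ds = s[1] - s[0]
    return float(ds * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1])))

def limit_energy(a=-2.0, b=2.0):
    """LAM * int_a^b cos^2 + MU * (one jump), with the integral in closed form."""
    dirichlet = (b - a) / 2 + (np.sin(2 * b) - np.sin(2 * a)) / 4
    return float(LAM * dirichlet + MU * 1)

steps = (0.2, 0.05, 0.01)
values = [nonlocal_energy(h) for h in steps]
bound = limit_energy()
print(values, bound)
```

The split suggested in the exercise is visible in the computation: on the strip of length $h$ to the left of the jump the integrand is capped at $\mu/h$ and contributes about $\mu$, while elsewhere the difference quotient is controlled by the average of $|\dot v|^2$ over $[s, s+h]$.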

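Therefore,
\[
\begin{aligned}
\frac{1}{h^N} \int_{(0,h)^N} F_h(u_h^y)\,dy
&= \sum_{\xi\in\mathbb{Z}^N} \rho(\xi)\,|\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z)\; F^1_{\xi,h}(u_{z,\xi}) \\
&\le \sum_{\xi\in\mathbb{Z}^N} \rho(\xi)\,|\xi| \int_{\xi^\perp} d\mathcal{H}^{N-1}(z) \left( \lambda \int_{\mathbb{R}} |\langle \nabla u(z + s\xi), \xi\rangle|^2\,ds + \mu\, \mathcal{H}^0(S_{u_{z,\xi}}) \right) \\
&= \sum_{\xi\in\mathbb{Z}^N} \rho(\xi) \left( \lambda \int_{\mathbb{R}^N} |\langle \nabla u(x), \xi\rangle|^2\,dx + \mu \int_{S_u} |\langle \nu_u(x), \xi\rangle|\,d\mathcal{H}^{N-1}(x) \right) \\
&= F(u).
\end{aligned}
\]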

Thus, for $y$ in some set of positive measure in $(0,h)^N$,
\[
F_h(u_h^y) \;\le\; F(u). \tag{A.18}
\]
For all $h$ we choose $y_h$ such that inequality (A.18) holds and set $u_h = u_h^{y_h}$. We easily check that if $u$ is continuous at $x \in \mathbb{R}^N$ then $u_h(x) \to u(x)$ as $h \downarrow 0$ (since $u_h(x) = u(x')$ for some $x'$ such that $|x - x'| < \frac{3}{2}\sqrt{N}\,h$). Since $u$ is almost everywhere continuous, $u_h$ converges to $u$ a.e. in $\mathbb{R}^N$. We also have $\|u_h\|_{L^\infty(\mathbb{R}^N)} \le \|u\|_{L^\infty(\mathbb{R}^N)}$ and the functions $u_h$, $u$ are zero outside some compact set so that by Lebesgue's theorem $u_h \to u$ in $L^p(\mathbb{R}^N)$. Since clearly, (A.15) holds for this sequence $u_h$, the proof of the case $\Omega = \mathbb{R}^N$ is achieved.
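The averaging argument behind (A.18) can be mimicked in one dimension: sample $u$ on the shifted grids $y + h\mathbb{Z}$, compute the discrete energy for each shift, and keep a shift whose energy does not exceed the mean. The sketch below is a toy model under assumptions that are not in the text: $N = 1$, nearest-neighbour differences only, $f(t) = \lambda t \wedge \mu$ with $\lambda = \mu = 1$ (named `LAM`, `MU`), and a hypothetical test function `u` with two jumps.

```python
import numpy as np

LAM, MU = 1.0, 1.0   # hypothetical values of the two constants

def u(s):
    # Bounded, compactly supported, continuous except at the jumps s = 0.25 and s = 1.
    return np.maximum(0.0, 1.0 - s**2) + 1.5 * ((s > 0.25) & (s < 1.0))

def energy_of_shift(y, h, a=-2.0, b=2.0):
    """F_h(u_h^y) for the 1D nearest-neighbour model with f(t) = min(LAM*t, MU)."""
    grid = np.arange(a, b, h)          # a finite piece of the grid covering supp(u)
    d = np.diff(u(y + grid))
    return float(np.sum(np.minimum(LAM * d**2 / h, MU)))

h = 0.05
shifts = np.linspace(0.0, h, 201)[1:-1]            # sample shifts y in (0, h)
energies = np.array([energy_of_shift(y, h) for y in shifts])

mean_energy = float(energies.mean())
best = float(energies.min())
y_h = float(shifts[np.argmin(energies)])           # one admissible choice of y_h
limit = LAM * 8.0 / 3.0 + MU * 2                   # LAM * int |u'|^2 + MU * (2 jumps)
print(y_h, best, mean_energy, limit)
```

Selecting the minimizing shift is just one convenient way of picking $y_h$; the proof only needs some shift whose energy is below the mean, and the mean itself is bounded by $F(u)$.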
We now return to the general case where $\Omega$ is a Lipschitz domain. The method used in order to localize the previous result is adapted from [23]. We choose a function $u \in GSBV_{\mathrm{loc}}(\Omega) \cap L^p(\Omega)$, and once again invoking Lemma 8 we see that it is not restrictive to assume that $u$ is bounded with bounded support. Since we assumed that $\partial\Omega$ is Lipschitz, (and since $u$ is zero outside some bounded set) we can extend $u$ outside of $\Omega$ (using the same reflection procedure as for instance in [34] for the extension of $W^{1,p}$ functions) into a bounded compactly supported $SBV$ function (still denoted by $u$) such that $\mathcal{H}^{N-1}(\partial\Omega \cap S_u) = 0$ and $F(u, \mathbb{R}^N) < +\infty$. Then, we build $(u_h)$ like previously, such that $u_h$ goes to $u$ in $L^p(\mathbb{R}^N)$ and
\[
\limsup_{h\downarrow 0} F_h(u_h, \mathbb{R}^N) \;\le\; F(u, \mathbb{R}^N).
\]
We can write
\[
F_h(u_h, \mathbb{R}^N) \;\ge\; F_h(u_h, \Omega) + F_h(u_h, \overline{\Omega}^c)
\]
where $\overline{\Omega}^c$ is the complement of $\overline{\Omega}$ in $\mathbb{R}^N$. Notice that we have dropped all terms involving differences of values of $u_h$ at one point in $\Omega$ and another in $\overline{\Omega}^c$. Sending $h$ to zero we get
\[
\limsup_{h\downarrow 0} F_h(u_h, \mathbb{R}^N) \;\ge\; \limsup_{h\downarrow 0} F_h(u_h, \Omega) + \liminf_{h\downarrow 0} F_h(u_h, \overline{\Omega}^c),
\]
and we deduce from (A.4) that
\[
\limsup_{h\downarrow 0} F_h(u_h, \Omega) + F(u, \overline{\Omega}^c) \;\le\; \limsup_{h\downarrow 0} F_h(u_h, \mathbb{R}^N) \;\le\; F(u, \mathbb{R}^N).
\]
Thus, $u$ being extended in such a way that $F(u, \overline{\Omega}^c) < +\infty$,
\[
\limsup_{h\downarrow 0} F_h(u_h, \Omega) \;\le\; F(u, \overline{\Omega}).
\]
Since $\mathcal{H}^{N-1}(\partial\Omega \cap S_u) = 0$, $F(u, \overline{\Omega}) = F(u, \Omega)$ and we get the thesis. This achieves the proof of Theorem 11.

A.4 Proof of Theorem 12

For any $h > 0$ let $(u_h)_{h>0}$ be a minimizer in $\ell^p(\Omega \cap h\mathbb{Z}^N)$ of
\[
F_h(u) + \int_\Omega |u(x) - g(x)|^p\,dx \tag{A.19}
\]
where $g \in L^\infty(\Omega) \cap L^p(\Omega)$. Replacing $u_h$ with $(-\|g\|_{L^\infty(\Omega)} \vee u_h) \wedge \|g\|_{L^\infty(\Omega)}$ we decrease the energy, thus in fact $\|u_h\|_{L^\infty(\Omega)} \le \|g\|_{L^\infty(\Omega)}$. In view of Lemma 7, since $\sup_{h>0} F_h(u_h) < +\infty$, some subsequence $(u_{h_j})_{j\ge 1}$ of $(u_h)_{h>0}$ converges to a function $u \in SBV_{\mathrm{loc}}(\Omega)$ a.e. in $\Omega$. From the uniform bound on $\|u_h\|_\infty$ we deduce that $u_{h_j} \to u$ in $L^p_{\mathrm{loc}}(\Omega)$.

If $|\Omega| < +\infty$, the convergence is in $L^p(\Omega)$ and we simply conclude invoking Theorem 7 (section 2.4). Otherwise, we know (by the remark at the end of section A.2 and Fatou's lemma) that
\[
F(u) + \int_\Omega |u(x) - g(x)|^p\,dx \;\le\; \liminf_{j\to\infty} \left( F_{h_j}(u_{h_j}) + \int_\Omega |u_{h_j}(x) - g(x)|^p\,dx \right).
\]
For any $v \in L^p(\Omega)$, we consider $(v_{h_j})_{j\ge 1}$ a sequence converging to $v$ in $L^p(\Omega)$ such that
\[
\limsup_{j\to\infty} F_{h_j}(v_{h_j}) \;\le\; F(v).
\]
For all $j$ we have that
\[
F_{h_j}(u_{h_j}) + \int_\Omega |u_{h_j}(x) - g(x)|^p\,dx \;\le\; F_{h_j}(v_{h_j}) + \int_\Omega |v_{h_j}(x) - g(x)|^p\,dx,
\]
so that at the limit we get
\[
F(u) + \int_\Omega |u(x) - g(x)|^p\,dx \;\le\; F(v) + \int_\Omega |v(x) - g(x)|^p\,dx,
\]
showing the minimality of $u$. If we choose $v = u$, we also deduce that
\[
\lim_{j\to\infty} \|u_{h_j} - g\|_{L^p(\Omega)} = \|u - g\|_{L^p(\Omega)};
\]
thus, by equi-integrability, $u_{h_j} \to u$ strongly in $L^p(\Omega)$, since we had the convergence in $L^p_{\mathrm{loc}}(\Omega)$.

In the case where we minimize
\[
F_h(u) + \|u - g_h\|_p^p
\]
instead of (A.19) the proof is not different.

References

[1] G. Alberti. Variational models for phase transitions, an approach via gamma-convergence. In G. Buttazzo et al., editor, Differential Equations and Calculus of Variations. Springer-Verlag, 2000. (Also available at http://cvgmt.sns.it/papers/).

[2] L. Ambrosio. A compactness theorem for a new class of functions with bounded variation. Boll. Un. Mat. Ital. (7), 3-B:857–881, 1989.

[3] L. Ambrosio. Variational problems in SBV and image segmentation. Acta Appl. Math., 17:1–40, 1989.

[4] L. Ambrosio. Existence theory for a new class of variational problems. Arch. Rat. Mech. Anal., 111(1):291–322, 1990.

[5] L. Ambrosio. A new proof of the SBV compactness theorem. Calc. Var. Partial Differential Equations, 3(1):127–137, 1995.

[6] L. Ambrosio, N. Fusco, and D. Pallara. Functions of Bounded Variation and Free Discontinuity Problems. Oxford mathematical monographs. Oxford Clarendon Press, 2000.

[7] L. Ambrosio and V.M. Tortorelli. Approximation of functionals depending on jumps by elliptic functionals via Γ-convergence. Comm. Pure Appl. Math., 43(8):999–1036, 1990.

[8] L. Ambrosio and V.M. Tortorelli. On the approximation of free discontinuity problems. Boll. Un. Mat. Ital. (7), 6-B:105–123, 1992.

[9] H. Attouch. Variational convergence for functions and operators. Applicable Mathematics Series. Pitman (Advanced Publishing Program), Boston, Mass.–London, 1984.

[10] G. Aubert, M. Barlaud, P. Charbonnier, and L. Blanc-Féraud. Deterministic edge-preserving regularization in computed imaging. Technical Report TR#94-01, I3S, CNRS URA 1376, Sophia-Antipolis, France, 1994.

[11] R. Azencott. Image analysis and Markov fields. In SIAM Proceedings of 1st Int. Conf. Appl. Math., Paris, 1987.

[12] R. Azencott. Markov fields and image analysis. In Proceedings AFCET Congress, Antibes, France, 1987.

[13] G. Bellettini and A. Coscia. Discrete approximation of a free discontinuity problem. Numer. Funct. Anal. Optim., 15(3-4):201–224, 1994.

[14] A. Blake and A. Zisserman. Visual Reconstruction. MIT Press, 1987.

[15] L. Blanc-Féraud and M. Barlaud. Restauration d'images bruitées par analyse multirésolution et champs de Markov. In proceedings of « 13e colloque GRETSI sur le traitement du signal et des images », Juan-les-Pins, France, pages 829–832, 1991.

[16] B. Bourdin. Image segmentation with a finite element method. M2AN Math. Model. Numer. Anal., 33(2):229–244, 1999.

[17] B. Bourdin and A. Chambolle. Implementation of a finite-elements approximation of the Mumford–Shah functional. Numer. Math., 85(4):609–646, 2000.

[18] A. Braides. Approximation of free-discontinuity problems. Number 1694 in Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1998.

[19] A. Braides and G. Dal Maso. Non-local approximation of the Mumford–Shah functional. Calc. Var. Partial Differential Equations, 5(4):293–322, 1997.

[20] A. Chambolle. Un théorème de Γ-convergence pour la segmentation des signaux. C. R. Acad. Sci. Paris, t. 314 Série I:191–196, 1992.

[21] A. Chambolle. Image segmentation by variational methods: Mumford and Shah functional and the discrete approximations. SIAM J. Appl. Math., 55(3):827–863, 1995.

[22] A. Chambolle. Finite-differences discretizations of the Mumford–Shah functional. M2AN Math. Model. Numer. Anal., 33(2):261–288, 1999.

[23] A. Chambolle and G. Dal Maso. Discrete approximation of the Mumford–Shah functional in dimension two. M2AN Math. Model. Numer. Anal., 33(4):651–672, 1999.

[24] A. Chambolle and P.-L. Lions. Image recovery via total variation minimization and related problems. Numer. Math., 76(2):167–188, 1997.

[25] T. F. Chan, G. H. Golub, and P. Mulet. A nonlinear primal-dual method for total variation-based image restoration. SIAM J. Sci. Comput., 20(6):1964–1977, 1999.

[26] G. Cortesani. Strong approximation of GSBV functions by piecewise smooth functions. Ann. Univ. Ferrara Sez. VII (N.S.), 43:27–49, 1997.

[27] G. Dal Maso. An introduction to Γ-convergence. Birkhäuser, Boston, 1993.

[28] G. Dal Maso, J.-M. Morel, and S. Solimini. A variational method in image segmentation: Existence and approximation results. Acta Math., 168:89–151, 1992.

[29] E. De Giorgi, M. Carriero, and A. Leaci. Existence theorem for a minimum problem with free discontinuity set. Arch. Rational Mech. Anal., 108:195–218, 1989.

[30] F. Dibos and E. Séré. An approximation result for the minimizers of Mumford–Shah functional. Boll. Un. Mat. Ital. (7), 11-A:149–162, 1997.

[31] D. C. Dobson and C. R. Vogel. Convergence of an iterative method for total variation denoising. SIAM J. Numer. Anal., 34(5):1779–1791, 1997.

[32] R. C. Dubes, A. K. Jain, S. G. Nadabar, and C. C. Chen. MRF model-based algorithms for image segmentation. In proc. 10th IEEE Int. Conf on Pattern Recognition, Atlantic City, pages 808–814, 1990.

[33] S. Durand, F. Malgouyres, and B. Rougé. Image de-blurring, spectrum interpolation and application to satellite imaging. Technical Report 9916, CMLA, ENS Cachan, 1999.

[34] L. C. Evans and R. F. Gariepy. Measure theory and fine properties of functions. Studies in Advanced Mathematics. CRC Press, Boca Raton, FL, 1992.

[35] K. J. Falconer. The geometry of fractal sets. Cambridge University Press, Cambridge, 1985.

[36] H. Federer. Geometric Measure Theory. Springer-Verlag, New York, 1969.

[37] S. Finzi-Vita and P. Perugia. Some numerical experiments on the energy minimization problem. In Proc. of the Second European Workshop on Image Processing and Mean Curvature Motion, pages 233–240, Palma de Mallorca, September 1995.

[38] D. Geiger and F. Girosi. Parallel and deterministic algorithms for MRFs: surface reconstruction. IEEE Trans. PAMI, PAMI-13(5):401–412, May 1991.

[39] D. Geiger and A. Yuille. A common framework for image segmentation. Internat. J. Comp. Vision, 6(3):227–243, August 1991.

[40] D. Geman and G. Reynolds. Constrained image restoration and the recovery of discontinuities. IEEE Trans. PAMI, PAMI-3(14):367–383, 1992.

[41] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. PAMI, PAMI-6(6), November 1984.

[42] E. Giusti. Minimal surfaces and functions of bounded variation. Birkhäuser, Boston, 1984.

[43] M. Gobbino. Finite difference approximation of the Mumford–Shah functional. Comm. Pure Appl. Math., 51(2):197–228, 1998.

[44] F. Guichard and F. Malgouyres. Total variation based interpolation. In Proceedings of the European Signal Processing Conference, volume 3, pages 1741–1744, 1998.

[45] S. Z. Li. Markov Random Field Modeling in Computer Vision. Springer-Verlag, 1995. (see also http://markov.eee.ntu.ac.sg:8000/szli/MRF_Book/MRF_Book.html).

[46] P.-L. Lions, S. J. Osher, and L. Rudin. Denoising and deblurring using constrained nonlinear partial differential equations. Technical report, Cognitech Inc., Santa Monica, CA, 1992.

[47] F. Malgouyres and F. Guichard. Edge direction preserving image zooming: a mathematical and numerical analysis. Technical Report 9930, CMLA, ENS Cachan, 1999.

[48] J.-M. Morel and S. Solimini. Variational Methods in Image Segmentation. Birkhäuser, Boston, 1995.

[49] D. Mumford and J. Shah. Boundary detection by minimizing functionals, I. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, San Francisco, 1985. (also Image Understanding, 1988).

[50] D. Mumford and J. Shah. Optimal approximation by piecewise smooth functions and associated variational problems. Comm. Pure Appl. Math., 42:577–685, 1989.

[51] M. Nitzberg, D. Mumford, and T. Shiota. Filtering, segmentation and depth. Number 662 in Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1993.

[52] T. J. Richardson and S. K. Mitter. A variational formulation-based edge focussing algorithm. Sādhanā, 22(4):553–574, 1997.

[53] L. Rudin and S. J. Osher. Total variation based image restoration with free local constraints. In Proceedings of the IEEE ICIP'94, volume 1, pages 31–35, Austin, Texas, 1994.

[54] L. Rudin, S. J. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D., 60:259–268, 1992. [also in Experimental Mathematics: Computational Issues in Nonlinear Science (Proc. Los Alamos Conf. 1991)].

[55] L. Schwartz. Théorie des distributions. Hermann, Paris, 1966.

[56] J. Shah. Properties of segmentations which minimize energy functionals. (Preprint Northeastern Univ. Math. Dept., Boston), December 1988.

[57] J. Shah. Parameter estimation, multiscale representation and algorithms for energy-minimizing segmentations. In proc. 10th IEEE Int. Conf. on Pattern Recognition, Atlantic City, pages 815–819, 1990.

[58] J. Shah. Segmentation by nonlinear diffusion. In proc. IEEE Comp. Soc. Conf. on Pattern Recognition, pages 202–207, 1991.

[59] J. Shah. Segmentation by nonlinear diffusion, II. In proc. IEEE Comp. Soc. Conf. on Pattern Recognition, Champaign, IL, pages 644–647, June 1992.

[60] C. R. Vogel and M. E. Oman. Iterative methods for total variation denoising. SIAM J. Sci. Comput., 17(1):227–238, 1996. Special issue on iterative methods in numerical linear algebra (Breckenridge, CO, 1994).

[61] C. R. Vogel and M. E. Oman. Fast, robust total variation-based reconstruction of noisy, blurred images. IEEE Trans. Image Process., 7(6):813–824, 1998.

[62] W. P. Ziemer. Weakly Differentiable Functions. Springer-Verlag, Berlin, 1989.