
Stochastic calculus with applications to risk theory.

Jostein Paulsen.
Department of Mathematical Sciences
University of Copenhagen
Universitetsparken 5, DK-2100 Copenhagen
DENMARK
Contents

1  Some basic terminology
2  Stopping times
3  The optional σ-field
4  Martingale theory
5  Predictability
6  Some Markov theory
7  Processes with finite variation and the Doob-Meyer decomposition
8  Local martingales and semimartingales
9  Stochastic integrals
10 Quadratic variation and the Itô formula
11 Linear stochastic differential equations
12 Applications to risk theory I
13 Random measures
14 Applications to risk theory II
15 References
1  Some basic terminology.

In these notes we will use the following notation:

  R = (−∞, ∞)
  R_+ = [0, ∞)
  R̄_+ = R_+ ∪ {∞}
  R^d = R × ⋯ × R (d factors)
  N = {0, 1, 2, …}
  N* = {1, 2, …}
  Q = all rational numbers
  Q_+ = all nonnegative rational numbers

Definition 1.1  A stochastic basis is a probability space (Ω, ℱ, P) equipped with a filtration 𝔽 = (ℱ_t)_{t∈R_+}. Here filtration means an increasing and right continuous family of sub-σ-algebras of ℱ, i.e. for 0 ≤ s < t ≤ ∞, ℱ_s ⊂ ℱ_t ⊂ ℱ and ℱ_t = ℱ_{t+} = ∩_{s>t} ℱ_s.
The stochastic basis (Ω, ℱ, 𝔽, P) is said to satisfy the usual conditions if ℱ is P-complete and ℱ_t contains all P-null sets of ℱ.

Assumptions such as right continuity of ℱ_t and that ℱ_t contains all P-null sets of ℱ are by no means innocent, and they are not necessary for many results either. However, it makes matters much more comfortable to just assume that the stochastic basis satisfies the usual conditions.

Some more terminology is:

ℱ_∞ = ℱ and ℱ_{∞−} = ⋁_{t∈R_+} ℱ_t, i.e. the smallest σ-algebra containing all the ℱ_t.

A random set is a subset of Ω × R_+.

A stochastic process is a family X = (X_t)_{t∈R_+} of mappings from Ω into R^d. Considering X as a mapping from Ω × R_+ into R^d, we will sometimes use the notation
  (ω, t) → X(ω, t) = X_t(ω).

Two stochastic processes X and Y are said to be indistinguishable if
  P(ω : ∃ t ∈ R_+ so that X_t(ω) ≠ Y_t(ω)) = 0.
In these notes we just write X = Y when X and Y are indistinguishable.

Exercise 1.1  Show that if X and Y are right continuous (or left continuous) and P(X_t(ω) ≠ Y_t(ω)) = 0 ∀ t ∈ R_+, then X = Y.

If A is a random set, then X_t(ω) = 1_A(ω, t) is a stochastic process.

When we say that a process is RCLL it means that it is Right Continuous with Left Limits. Similarly LCRL means Left Continuous with Right Limits.

When the process X is RCLL, we can define two other processes:

1. X_− = (X_{t−})_{t∈R_+} where X_{t−} = lim_{s↑t} X_s and X_{0−} = X_0.
2. ΔX = (ΔX_t)_{t∈R_+} where ΔX_t = X_t − X_{t−}.

Let T : Ω → R̄_+ be a mapping. Then we define the stopped process X^T by
  X^T_t = X_{t∧T}.
2  Stopping times.

Definition 2.1  Let (Ω, ℱ, 𝔽, P) be a stochastic basis. A stopping time is a mapping T : Ω → R̄_+ such that {T ≤ t} ∈ ℱ_t ∀ t ∈ R_+.

Note that we can equivalently assume that {T < t} ∈ ℱ_t ∀ t ∈ R_+. Indeed, assume first that {T ≤ t} ∈ ℱ_t ∀ t. Then
  {T < t} = ∪_{n∈N*} {T ≤ t − 1/n} ∈ ℱ_t,
since by assumption {T ≤ t − 1/n} ∈ ℱ_{t−1/n} ⊂ ℱ_t. On the other hand assume that {T < t} ∈ ℱ_t ∀ t. Then for any natural number m,
  {T ≤ t} = ∩_{n=m}^∞ {T < t + 1/n} ∈ ℱ_{t+1/m}.
But a set A ∈ ℱ_{t+} if for any u > 0, A ∈ ℱ_{t+u}, hence {T ≤ t} ∈ ℱ_{t+} = ℱ_t (by assumption).

Now let T = t_0 be a fixed number. Then
  {T ≤ t} = {t_0 ≤ t} = Ω if t_0 ≤ t, and = ∅ if t_0 > t,
so {T ≤ t} ∈ ℱ_t, i.e. T = t_0 is a stopping time.

For stopping times we have the following simple but useful result.

Theorem 2.1  Let S and T be stopping times. Then
  S ∨ T, S ∧ T and S + T
are all stopping times.

Proof. We have
  {S ∨ T ≤ t} = {S ≤ t} ∩ {T ≤ t},
  {S ∧ T ≤ t} = {S ≤ t} ∪ {T ≤ t},
and for S + T write
  {S + T > t} = {S = 0, T > t} ∪ {S > t, T = 0} ∪ {S ≥ t, T > 0} ∪ {0 < S < t, S + T > t}.
The first three events of this decomposition are obviously in ℱ_t, and for the fourth
  {0 < S < t, S + T > t} = ∪_{r∈Q_+∩(0,t)} {r < S < t, T > t − r}
    = ∪_{r∈Q_+∩(0,t)} ({S > r} ∩ {S < t} ∩ {T > t − r}) ∈ ℱ_t.

Definition 2.2  If T is a stopping time we denote by ℱ_T the collection of sets A ∈ ℱ so that
  A ∩ {T ≤ t} ∈ ℱ_t ∀ t ∈ R_+.

Exercise 2.1  Show that ℱ_T is a σ-algebra.

Exercise 2.2  Show that equivalently ℱ_T consists of the collection of sets A ∈ ℱ so that
  A ∩ {T < t} ∈ ℱ_t ∀ t ∈ R_+.

Again let T = t_0. Then
  A ∩ {T ≤ t} = A if t_0 ≤ t, and = ∅ if t_0 > t.
Hence A ∩ {T ≤ t} ∈ ℱ_t ∀ t if and only if A ∈ ℱ_{t_0}, i.e. ℱ_T = ℱ_{t_0} as should be expected.

Definition 2.3  We denote by ℱ_{T−} the σ-algebra generated by ℱ_0 and all the sets of the form
  A ∩ {T > t}, t ∈ R_+ and A ∈ ℱ_t.

Exercise 2.3  Show that T is ℱ_{T−} measurable.

Theorem 2.2  Let S < T be stopping times. Then ℱ_S ⊂ ℱ_{T−} ⊂ ℱ_T.

Proof. Let A ∈ ℱ_S. Then A ∩ {S ≤ q} ∈ ℱ_q ∀ q > 0. Also by assumption
  A = A ∩ {S < T} = ∪_{q∈Q_+} A ∩ {S ≤ q < T} = ∪_{q∈Q_+} (A ∩ {S ≤ q}) ∩ {T > q} ∈ ℱ_{T−}.
Now let B be a generating set of ℱ_{T−}, i.e. B = A ∩ {T > s} for some s with A ∈ ℱ_s. Then B ∩ {T ≤ t} is empty if s ≥ t, and for s < t,
  B ∩ {T ≤ t} = A ∩ {T > s} ∩ {T ≤ t} ∈ ℱ_t,
since when s < t we have that A, {T > s} and {T ≤ t} are all in ℱ_t. Therefore B ∈ ℱ_T, and since ℱ_T is a σ-algebra, it follows that ℱ_{T−} ⊂ ℱ_T.

Theorem 2.3  Let S and T be stopping times and assume A ∈ ℱ_S. Then
  A ∩ {S ≤ T} ∈ ℱ_T,
  A ∩ {S < T} ∈ ℱ_{T−}.
(In particular {S < T} ∈ ℱ_{T−}.) Furthermore
  {S ≤ T} ∈ ℱ_S ∩ ℱ_T.

Proof. To prove that A ∩ {S ≤ T} ∈ ℱ_T, we must prove that
  (A ∩ {S ≤ T}) ∩ {T ≤ t} = (A ∩ {S ≤ t}) ∩ {S ∧ t ≤ T ∧ t} ∩ {T ≤ t}
is in ℱ_t. Now the first and the third term are in ℱ_t by assumption. Also for any number a,
  {T ∧ t ≤ a} = Ω if t ≤ a, and = {T ≤ a} ∈ ℱ_a ⊂ ℱ_t if t > a.
Therefore {S ∧ t ≤ T ∧ t} ∈ ℱ_t and the assertion is proven.
That A ∩ {S < T} ∈ ℱ_{T−} is actually proven in Theorem 2.2.
By setting A = Ω we immediately get from the above that {S < T} ∈ ℱ_{T−} and {S ≤ T} ∈ ℱ_T. Also since T ∧ t is a stopping time, we get from this last relation
  {S ≤ T} ∩ {S ≤ t} = {S ≤ T ∧ t} ∈ ℱ_{T∧t} ⊂ ℱ_t,
hence {S ≤ T} ∈ ℱ_S. This ends the proof of the theorem.

Exercise 2.4  Let S and T be stopping times. Show that
  ℱ_{S∧T} = ℱ_S ∩ ℱ_T.

Exercise 2.5  Let T be a stopping time and let A ∈ ℱ_T. Define
  T_A(ω) = T(ω) if ω ∈ A, and = ∞ if ω ∈ A^c.
Prove that T_A is a stopping time.

Exercise 2.6  Let (T_n) be a sequence of stopping times and set
  S = ∧_n T_n and T = ∨_n T_n.
Prove that S and T are stopping times and that ℱ_S = ∩_n ℱ_{T_n}.

Exercise 2.7  Let T be a stopping time and define
  T_n(ω) = k/2^n if (k − 1)/2^n ≤ T(ω) < k/2^n, and = ∞ if T(ω) = ∞.
Then T_n(ω) ↓ T(ω) as n → ∞. Prove that T_n is a stopping time.
Thus by Exercise 2.6, we have that ℱ_T = ∩_n ℱ_{T_n}.
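The dyadic approximation in Exercise 2.7 is easy to experiment with numerically. The following minimal Python sketch (an illustration only; the function name is ours) rounds a time up to the grid {k/2^n} and checks that the approximations decrease to T from above:

  import numpy as np

  def dyadic_upper(T, n):
      # k/2^n with (k-1)/2^n <= T < k/2^n, i.e. round T up to the 2^-n grid
      return (np.floor(T * 2**n) + 1.0) / 2**n

  T = np.random.exponential(size=5)            # a few sample times
  for n in range(1, 11):
      Tn = dyadic_upper(T, n)
      assert np.all(Tn > T) and np.all(Tn - T <= 2.0**-n)
  print(np.round(T, 4), np.round(dyadic_upper(T, 10), 4))

Measurability of {T_n ≤ t} reduces to events of the form {T < k/2^n} ∈ ℱ_{k/2^n}, which is the point of the exercise.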
3  The optional σ-field.

We make the following definitions.

Definition 3.1  A process X is said to be adapted to a filtration 𝔽 if X_t is ℱ_t measurable ∀ t ∈ R_+.
The optional σ-field is the σ-field 𝒪 on Ω × R_+ that is generated by all RCLL adapted processes (considered as mappings on Ω × R_+).
A process or a random set that is 𝒪 measurable will be called optional.

A very important tool in proving measurability for a class of functions is the monotone class theorem. It appears in various versions; here is a quite useful one.

Definition 3.2  A monotone vector space ℋ of functions defined on a space Ω is defined to be a collection of bounded, real-valued functions f on Ω satisfying the three conditions:
(i) ℋ is a vector space over R, i.e. if f, g ∈ ℋ, then af + bg ∈ ℋ for real numbers a, b.
(ii) 1_Ω ∈ ℋ (i.e. the constant functions are in ℋ).
(iii) If (f_n) ⊂ ℋ and 0 ≤ f_1 ≤ ⋯ ≤ f_n ≤ ⋯ and lim_{n→∞} f_n = f and f is bounded, then f ∈ ℋ.

A collection 𝒞 of real functions defined on Ω is said to be multiplicative if f, g ∈ 𝒞 implies that fg ∈ 𝒞.

Theorem 3.1  Let 𝒞 be a multiplicative class of bounded real-valued functions defined on a space Ω, and let 𝒦 = σ(𝒞), i.e. the σ-algebra on Ω generated by the sets {f^{−1}(A) : A ∈ B(R), f ∈ 𝒞}.
If ℋ is a monotone vector space containing 𝒞, then ℋ contains all bounded 𝒦 measurable functions.

Comments.
1. If we want to use the monotone class theorem to prove that a class of functions Ω → R^d is measurable w.r.t. a σ-algebra 𝒢, it is sufficient to prove that each component is 𝒢-measurable, and therefore we may assume that d = 1. Indeed if f = (f_1, …, f_d) : Ω → R^d and A is a generating set of B(R^d) of the form A = B_1 × ⋯ × B_d, we have
  f^{−1}(A) = ∩_{i=1}^d f_i^{−1}(B_i) ∈ 𝒢
if each f_i is measurable.
2. Note that the theorem assumes bounded functions. However, in most cases we want to prove measurability for unbounded functions. In that case it is usually straightforward to extend to unbounded functions by setting f_n = (f ∧ n) ∨ (−n), and then letting n → ∞.

Here is an application of the monotone class theorem.

Theorem 3.2  Let X be an optional process. When considered as a mapping on Ω × R_+ it is ℱ ⊗ B(R_+) measurable. Moreover, if T is a stopping time, then
(a) X_T 1_{T<∞} is ℱ_T measurable (hence X is adapted).
(b) the stopped process X^T is also optional.

Proof. The collection of bounded processes that are ℱ ⊗ B(R_+) measurable and satisfy (a) and (b) is easily seen to be a monotone vector space. Hence if we can prove that every right continuous bounded adapted process is ℱ ⊗ B(R_+) measurable and satisfies (a) and (b), we can conclude that this also applies for any bounded optional process, since it is readily seen that 𝒪 is equivalently generated by all bounded RCLL adapted processes. To go from there to any optional process X we just set X^n = (X ∧ n) ∨ (−n) and then let n → ∞.
We therefore assume that X is RCLL and adapted. Define X^n by
  X^n_t = X_{k/2^n} for t ∈ [(k − 1)/2^n, k/2^n), n, k ∈ N*.
Then for any B ∈ B(R^d),
  {(ω, t) : X^n(ω, t) ∈ B} = ∪_{k∈N*} ({ω : X_{k/2^n}(ω) ∈ B} × [(k − 1)/2^n, k/2^n)) ∈ ℱ ⊗ B(R_+),
i.e. X^n is ℱ ⊗ B(R_+) measurable. Now since X is right continuous, X^n → X pointwise, hence X is ℱ ⊗ B(R_+) measurable.
Now let T be a stopping time and define the stopping times T_n as in Exercise 2.7. Then
  {X_{T_n} ∈ B} ∩ {T_n < ∞} ∩ {T_n ≤ t} = ∪_{k∈N*, k/2^n≤t} ({X_{k/2^n} ∈ B} ∩ {T_n = k/2^n}) ∈ ℱ_t,
hence X_{T_n} 1_{T_n<∞} is ℱ_{T_n} measurable, and by right continuity of X, X_{T_n} 1_{T_n<∞} → X_T 1_{T<∞} pointwise. Since the (T_n) are decreasing, X_T 1_{T<∞} is ℱ_{T_n} measurable for all n, and by Exercise 2.7, ℱ_T = ∩_n ℱ_{T_n}. Therefore X_T 1_{T<∞} is ℱ_T measurable.
Finally X^T is also RCLL and
  {X^T_t ∈ A} = ({X_t ∈ A} ∩ {T > t}) ∪ ({X_T 1_{T<∞} ∈ A} ∩ {T ≤ t}).
The first term here is obviously in ℱ_t, and since X_T 1_{T<∞} is ℱ_T measurable, the second component is also in ℱ_t by definition of ℱ_T. Therefore X^T is adapted, hence optional.

Let S and T be two stopping times. We define the following four random sets:
  [[S, T]] = {(ω, t) : t ∈ R_+, S(ω) ≤ t ≤ T(ω)},
  [[S, T[[ = {(ω, t) : t ∈ R_+, S(ω) ≤ t < T(ω)},
  ]]S, T]] = {(ω, t) : t ∈ R_+, S(ω) < t ≤ T(ω)},
  ]]S, T[[ = {(ω, t) : t ∈ R_+, S(ω) < t < T(ω)}.
Instead of [[T, T]] we write [[T]]. Then [[T]] is the graph of the restriction of the mapping T : Ω → R̄_+ to the set {T < ∞}. In particular for any set A ∈ ℱ_T we get
  [[T_A]] ⊂ [[T]].
Note that the stochastic process 1_{[[0,T[[} = 1_{[[0,T[[}(ω, t) is optional since it is right continuous and
  {ω : 1_{[[0,T[[}(t) = 0} = {ω : T(ω) ≤ t} ∈ ℱ_t,
hence it is adapted. In particular [[0, T[[ ∈ 𝒪. In fact we have the following result.

Theorem 3.3  The σ-field 𝒪 is generated by the sets of the form A × {0} where A ∈ ℱ_0 and [[0, T[[ for stopping times T.

Theorem 3.4  Let S, T be two stopping times and let Y be an ℱ_S measurable random variable. Then the processes
  Y 1_{[[S,T]]}, Y 1_{[[S,T[[}, Y 1_{]]S,T]]} and Y 1_{]]S,T[[}
are all optional.

Proof. By using the monotone class theorem, it is sufficient to prove the result for Y(ω) = 1_A(ω) with A ∈ ℱ_S. Then for the simplest case
  Y 1_{[[S,T[[} = 1_A 1_{[[S,T[[} = 1_{[[S_A,T_A[[} = 1_{[[0,T_A[[} − 1_{[[0,S_A∧T_A[[},
which is optional by Exercise 2.5.
For the more complicated case X = 1_A 1_{]]S,T]]}, set
  X^n = 1_A 1_{[[S_n,T_n[[},
where S_n = S + 1/n and T_n = T + 1/n. Then S_n and T_n are both stopping times, A ∈ ℱ_S ⊂ ℱ_{S_n}, hence by what we have just proved, X^n is optional. But X^n → X as n → ∞, hence X is also optional.

As an application we have:

Theorem 3.5  Every adapted left continuous process is optional.

Proof. For each n ∈ N*, define
  X^n = Σ_{k∈N} X_{k/2^n} 1_{[[k/2^n,(k+1)/2^n[[}.
Then by the above theorem, X^n is optional, and by left continuity of X, X^n → X as n → ∞, hence X is also optional.

As a corollary we get that for an RCLL adapted process X, the two processes X_− and ΔX = X − X_− are both optional.

Theorem 3.6  Let X and Y be two optional processes. Then either of the following two conditions implies that X = Y.
(a) X_T = Y_T for all finite stopping times T.
(b) For all stopping times T, X_T 1_{T<∞} and Y_T 1_{T<∞} are integrable, and
  E[X_T 1_{T<∞}] = E[Y_T 1_{T<∞}].

Now let B be a set in R^d, and define the hitting time
  T = inf{t : X_t ∈ B}.
Also define
  S = inf{t : X_t ∈ B or X_{t−} ∈ B}.
If B is open then S = T. This is also true if X is continuous or nondecreasing. In general we have:

Theorem 3.7  Let X be adapted and RCLL.
(a) if B is open, then T is a stopping time.
(b) if B is closed, then S is a stopping time.

Proof. Let B be open. Then
  {T < t} = ∪_{s∈Q∩[0,t)} {X_s ∈ B} ∈ ℱ_t.
(Note that the inclusion ⊃ also applies for sets that are not open.) This proves part (a). For part (b) let d(x, B) be the distance from the point x to the set B, and define
  B_n = {x : d(x, B) < 1/n} = ∪_{y∈B} {x : d(x, y) < 1/n}.
Then B_n is open and
  {S ≤ t} = {X_t ∈ B} ∪ {X_{t−} ∈ B} ∪ ∩_{n∈N*} ∪_{s∈Q∩[0,t)} {X_s ∈ B_n} ∈ ℱ_t.
This ends the proof.

Comment. In fact it can be proven that the hitting time for an optional process (not necessarily RCLL) of any Borel set is a stopping time. Here the completeness of the filtration 𝔽 is essential, i.e. ℱ_t is right continuous and contains all P-null sets of ℱ. In the above proof we used only the right continuity of ℱ_t.
4  Martingale theory.

Definition 4.1  An RCLL adapted process X is called a submartingale if E[|X_t|] < ∞ ∀ t and
  E[X_t | ℱ_s] ≥ X_s, s ≤ t.
If the inequality is reversed it is called a supermartingale, and if there is an equality it is called a martingale.
If E[X²_t] < ∞ ∀ t, it is said to be square integrable.

The following theorem is classical.

Theorem 4.1 (Doob's convergence theorem.)  Let X be a submartingale and assume
  sup_{t≥0} E[X⁺_t] < ∞, where X⁺_t = max(X_t, 0).
Then
  X_∞(ω) = lim_{t→∞} X_t(ω) exists a.e. and E[|X_∞|] < ∞.

If X is a nonnegative supermartingale, it follows that X_∞(ω) = lim_{t→∞} X_t(ω) always exists a.e., and that E[|X_∞|] < ∞. Also by Fatou's lemma
  X_t ≥ E[X_∞ | ℱ_t].

Definition 4.2  A family of random variables (U_α)_{α∈A} is said to be uniformly integrable (UI) if
  lim_{n→∞} sup_α ∫_{|U_α|>n} |U_α| dP = lim_{n→∞} sup_α E[|U_α| 1_{|U_α|>n}] = 0.

Note that when (U_α)_{α∈A} is UI, then
  sup_α E[|U_α|] = sup_α (E[|U_α| 1_{|U_α|≤n}] + E[|U_α| 1_{|U_α|>n}]) ≤ n + c_n < c,  (4.1)
for some constant c.

Uniform integrability is very often proved by using the following easy theorem.

Theorem 4.2  Assume φ is an increasing function with lim_{u→∞} φ(u)/u = ∞, and that
  sup_{α∈A} E[φ(|U_α|)] < ∞.
Then (U_α)_{α∈A} is uniformly integrable.

Proof. By assumption u ≤ ε_n φ(u) for u > n, with ε_n → 0 as n → ∞. Therefore
  ∫_{|U_α|>n} |U_α| dP ≤ ε_n ∫_{|U_α|>n} φ(|U_α|) dP ≤ ε_n sup_α E[φ(|U_α|)] → 0 as n → ∞.
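A standard choice in Theorem 4.2 is φ(u) = u² (this observation is used repeatedly below, e.g. in Theorem 4.10 and for the class ℋ² in Section 8): since φ(u)/u = u → ∞,
  sup_α E[U²_α] < ∞ ⟹ (U_α)_{α∈A} is uniformly integrable,
i.e. any family bounded in L² is UI.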
Theorem 4.3  Let X be a uniformly integrable submartingale. Then X_∞ = lim_{t→∞} X_t exists a.e., E[|X_∞|] < ∞ and
  X_t ≤ E[X_∞ | ℱ_t],
i.e. X is a submartingale on [0, ∞].

Proof. It follows from (4.1) and Theorem 4.1 that X_∞ exists and is integrable. Let Y_t = |X_t − X_∞|. Then Y_t ≤ |X_t| + |X_∞| which is UI, hence Y is UI. But
  E[Y_t] = ∫ Y_t 1_{Y_t≤n} dP + ∫_{Y_t>n} Y_t dP.
The first integrand is bounded by n and Y_t → 0 a.e., hence the first integral converges to 0 as t → ∞ by the bounded convergence theorem. As for the second integral we have
  limsup_{t→∞} ∫_{Y_t>n} Y_t dP ≤ sup_t ∫_{Y_t>n} Y_t dP → 0 as n → ∞
by the very definition of UI. Therefore E[|X_t − X_∞|] → 0 as t → ∞, and so for any A ∈ ℱ_t and s > t,
  ∫_A X_t dP ≤ ∫_A X_s dP ≤ lim_{u→∞} ∫_A X_u dP = ∫_A lim_{u→∞} X_u dP = ∫_A X_∞ dP.  (4.2)
Here we changed limit and integral because
  |∫_A X_u dP − ∫_A X_∞ dP| ≤ E[1_A |X_u − X_∞|] ≤ E[|X_u − X_∞|] → 0 as u → ∞.
The theorem is thus proved.

Similarly we have:

Theorem 4.4  Let X be a martingale. Then the following four conditions are equivalent.
(a) X is uniformly integrable.
(b) it converges in L¹ as t → ∞ (i.e. E[|X_t − X_∞|] → 0 as t → ∞).
(c) it converges a.s. to an integrable random variable X_∞ so that X is a martingale on [0, ∞], i.e. X_t = E[X_∞ | ℱ_t].
(d) there exists an integrable random variable Y such that X_t = E[Y | ℱ_t] ∀ t. Moreover E[Y | ℱ_{∞−}] = X_∞.

Proof. The implications (a) ⇒ (b) ⇒ (c) are contained in the proof of Theorem 4.3 since the inequalities in (4.2) now become equalities. Furthermore (c) implies the first part of (d) by setting Y = X_∞. Finally assume (d). Then for s < t,
  E[X_t | ℱ_s] = E[E[Y | ℱ_t] | ℱ_s] = E[Y | ℱ_s] = X_s.
Define the measure μ on ℱ by μ(A) = E[|Y| 1_A]. Then μ(Ω) = E[|Y|] < ∞, μ is absolutely continuous w.r.t. P and
  E[|X_t| 1_{|X_t|>n}] = E[|E[Y | ℱ_t]| 1_{|X_t|>n}] ≤ E[E[|Y| | ℱ_t] 1_{|X_t|>n}] = μ(|X_t| > n).
But by Markov's inequality
  P(|X_t| > n) ≤ E[|X_t|]/n ≤ E[|Y|]/n → 0 as n → ∞,
uniformly in t, so by the absolute continuity sup_t μ(|X_t| > n) → 0 as n → ∞. This proves that X is UI.
It only remains to prove that the first part of (d) implies the second. Let A ∈ ℱ_t, hence E[1_A X_t] = E[1_A Y] by definition. But by (b) (which we have just proven follows from the first part of (d)), E[1_A X_t] → E[1_A X_∞] as t → ∞, hence E[1_A X_∞] = E[1_A Y]. Since ∪_{t≥0} ℱ_t generates ℱ_{∞−}, the result follows from e.g. Theorem 6.2.

The following theorem is extremely useful.

Theorem 4.5 (Optional sampling theorem.)  Let X be a submartingale and S ≤ T bounded stopping times. Then
  X_S ≤ E[X_T | ℱ_S].
In particular E[X_0] ≤ E[X_T]. If X is a martingale, the inequalities can be replaced by equalities.
If X is uniformly integrable, the results hold for arbitrary stopping times S ≤ T.

As an application we have:

Theorem 4.6  An RCLL adapted process X is a submartingale (martingale) if and only if for each pair of bounded stopping times S ≤ T,
  E[X_S] ≤ E[X_T]  (E[X_S] = E[X_T] = E[X_0]).

Proof. We consider the submartingale case, since in the martingale case the inequalities can be replaced by equalities. If X is a submartingale, it follows by the optional sampling theorem that E[X_S] ≤ E[X_T]. On the other hand assume that this inequality holds for each pair of bounded stopping times S ≤ T. Let s < t and, for A ∈ ℱ_s, set S = s and T = t_A ∧ s_{A^c}, i.e. T = t on A and T = s on A^c. Then S ≤ T ≤ t are bounded stopping times by Exercise 2.5 and Theorem 2.1, hence by assumption,
  E[X_s] ≤ E[X_T] = E[X_t 1_A] + E[X_s 1_{A^c}] ⟹ E[X_s 1_A] ≤ E[X_t 1_A],
which proves the theorem.

Exercise 4.1  Let X be an RCLL adapted process and assume that E[X_T] = E[X_0] for any stopping time T. Prove that X is a uniformly integrable martingale.

Here is another very useful theorem.

Theorem 4.7  Let X be a submartingale (martingale) and T a stopping time. Then the stopped process X^T is also a submartingale (martingale).

Proof. Again it is sufficient to consider the submartingale case. Let S_1 ≤ S_2 be bounded stopping times. Then by Theorem 2.1, S_1 ∧ T ≤ S_2 ∧ T are also bounded stopping times, hence from the above theorem
  E[X^T_{S_1}] = E[X_{S_1∧T}] ≤ E[X_{S_2∧T}] = E[X^T_{S_2}].
The result now follows from the above theorem.

Theorem 4.8 (Jensen's inequality)  Let φ be a convex function and assume that Y and φ(Y) are integrable random variables. Then for any σ-algebra 𝒢,
  φ(E[Y | 𝒢]) ≤ E[φ(Y) | 𝒢].

Theorem 4.9  Let X be a stochastic process and let φ be a convex function. Assume that X_t and φ(X_t) are integrable ∀ t ≥ 0. Then
(a) if X is a martingale, then φ(X) is a submartingale.
(b) if X is a submartingale and φ is nondecreasing, then φ(X) is a submartingale.

Proof. We have from Jensen's inequality with s < t:
(a) φ(X_s) = φ(E[X_t | ℱ_s]) ≤ E[φ(X_t) | ℱ_s].
(b) φ(X_s) ≤ φ(E[X_t | ℱ_s]) ≤ E[φ(X_t) | ℱ_s].

The following inequality will be useful.

Theorem 4.10 (Doob's inequality.)  Let X be a square integrable martingale so that sup_t E[X²_t] < ∞ (which by Theorem 4.2 implies that X is uniformly integrable). Then
  E[sup_t X²_t] ≤ 4 sup_t E[X²_t] = 4E[X²_∞].
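As a quick sanity check of the constant 4, here is a simulation sketch (not part of the formal development; the random walk example is our choice): it compares E[max_{k≤n} X²_k] with 4E[X²_n] for the symmetric random walk martingale X_k = ξ_1 + ⋯ + ξ_k, a finite horizon n playing the role of t → ∞.

  import numpy as np

  rng = np.random.default_rng(0)
  n, paths = 200, 20000
  xi = rng.choice([-1.0, 1.0], size=(paths, n))   # i.i.d. +-1 increments
  X = np.cumsum(xi, axis=1)                       # martingale paths X_1, ..., X_n
  lhs = np.mean(np.max(X**2, axis=1))             # E[ sup_k X_k^2 ]
  rhs = 4 * np.mean(X[:, -1]**2)                  # 4 E[ X_n^2 ] (= 4n here)
  print(f"E[sup X^2] = {lhs:.1f} <= 4 E[X_n^2] = {rhs:.1f}")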
Definition 4.3  An RCLL adapted stochastic process X is said to be a local martingale if there exists an increasing sequence (T_n) of stopping times with lim_{n→∞} T_n = ∞ a.s. so that each stopped process X^{T_n} is a uniformly integrable martingale.
We will denote the class of uniformly integrable martingales by ℳ, and the class of local martingales by ℳ_loc.

Exercise 4.2  Prove that a martingale is a local martingale.

The opposite conclusion is not valid, however. In fact it is possible to construct a uniformly integrable local martingale that is not a martingale. However we have:

Exercise 4.3  Let X be an RCLL adapted process so that there exists an increasing sequence (T_n) of stopping times with lim_{n→∞} T_n = ∞ a.s. and each X^{T_n} is a martingale. Prove that X is a local martingale.

Exercise 4.4  Let X be a local martingale, and assume that ∀ t ≥ 0, E[sup_{s≤t} |X_s|] < ∞. Prove that X is a martingale. Prove that X is a uniformly integrable martingale if E[sup_t |X_t|] < ∞.
5  Predictability.

Definition 5.1  The predictable σ-field 𝒫 is the σ-field on Ω × R_+ that is generated by all left continuous adapted processes.
A process or a random set that is 𝒫 measurable is called predictable.

By Theorem 3.5 all adapted left continuous processes are optional, hence 𝒫 ⊂ 𝒪.

Theorem 5.1  𝒫 is also generated by either of the following collections of random sets.
(i) A × {0} where A ∈ ℱ_0, and [[0, T]] where T is any stopping time.
(ii) A × {0} where A ∈ ℱ_0, and A × (s, t] where s < t and A ∈ ℱ_s.

Proof. Let 𝒫' and 𝒫'' be the σ-fields generated by the sets in (i) and (ii) respectively.
Since 1_{[[0,T]]} is left continuous and
  {ω : 1_{[[0,T]]}(t) = 0} = {ω : T(ω) < t} ∈ ℱ_t,
it follows that 𝒫' ⊂ 𝒫.
For s < t and A ∈ ℱ_s, it follows from Exercise 2.5 that s_A and t_A both are stopping times. Also s_A ≤ t_A, so therefore
  A × (s, t] = ]]s_A, t_A]] = [[0, t_A]] \ [[0, s_A]] ∈ 𝒫',
hence 𝒫'' ⊂ 𝒫'.
Finally let X be adapted and left continuous and define
  X^n(ω, t) = X_0(ω) 1_{{0}}(t) + Σ_{k∈N} X_{k/2^n}(ω) 1_{(k/2^n,(k+1)/2^n]}(t).
Then X^n is 𝒫'' measurable and X^n → X pointwise, hence X is 𝒫'' measurable. Therefore 𝒫 ⊂ 𝒫'' and the result follows.

Theorem 5.2  If X is predictable and T a stopping time then
(a) X_T 1_{T<∞} is ℱ_{T−} measurable.
(b) the stopped process X^T is also predictable.

Proof. By a monotone class argument using the above theorem we may assume that X(ω, t) = 1_A(ω) 1_{{0}}(t) with A ∈ ℱ_0, or X(ω, t) = 1_{[[0,S]]}(ω, t) for S any stopping time. But in the first case
  {ω : X_T(ω) 1_{T<∞} = 1} = A ∩ {T = 0} ∈ ℱ_0 ⊂ ℱ_{T−},
and for the second case
  {ω : X_T(ω) 1_{T<∞} = 1} = {T ≤ S} ∩ {T < ∞} = {S < T}^c ∩ {T < ∞} ∈ ℱ_{T−}
by Theorem 2.3 and Exercise 2.3.

Definition 5.2  A predictable time T is a stopping time so that the stochastic interval [[0, T[[ is predictable.

Since [[T]] = [[0, T]] \ [[0, T[[ it follows that [[T]] ∈ 𝒫 when T is predictable. The converse is also true.

Theorem 5.3  Let T be a stopping time so that [[T]] ∈ 𝒫. Then T is predictable.

Proof. [[0, T[[ = [[0, T]] \ [[T]] ∈ 𝒫.

Theorem 5.4  Let X be an RCLL predictable process. Then
(a) if X is increasing, then T = inf{t : X_t ≥ c} is a predictable stopping time.
(b) T = inf{t : |ΔX_t| > 0} is a predictable stopping time.

Proof. For part (a) it follows by Theorem 3.7 (b) that T is a stopping time. Let B = {(ω, t) : X(ω, t) ≥ c} ∈ 𝒫 since X is predictable. Also since X is increasing, X(ω, T(ω)) ≥ c on {T(ω) < ∞}, hence [[T]] ⊂ B. But then [[T]] = [[0, T]] ∩ B, and the result follows from the above theorem.
To prove part (b), we use the comment after Theorem 3.7 to conclude that T is a stopping time. Since ΔX = X − X_− is predictable, the rest follows as above.

Exercise 5.1  Let (T_n) be a sequence of predictable times. Prove that T = ∨_n T_n is a predictable time.

Theorem 5.5  Let T be a predictable time and A ∈ ℱ_{T−}. Then T_A is a predictable time.

Theorem 5.6  Let S be a predictable time and T a stopping time. Let A ∈ ℱ_{S−}. Then
  A ∩ {S ≤ T} ∈ ℱ_{T−}.

Proof.
  A ∩ {S ≤ T} = (A ∩ {S ≤ T < ∞}) ∪ (A ∩ {T = ∞}) = {S_A ≤ T < ∞} ∪ (A ∩ {T = ∞}).
By the above theorem S_A is a predictable time, so X = 1_{[[S_A,∞[[} is predictable, and hence {X_T 1_{T<∞} = 1} = {S_A ≤ T < ∞} is in ℱ_{T−} by Theorem 5.2.
It remains to prove that A ∩ {T = ∞} ∈ ℱ_{T−}. Now A ∈ ℱ_{S−} ⊂ ℱ_{∞−}, and ℱ_{∞−} is generated by the sets B ∈ ℱ_t for t ≥ 0. Also {T = ∞} ∈ ℱ_{T−} by Exercise 2.3, and therefore
  {B : B ∩ {T = ∞} ∈ ℱ_{T−}}
is a σ-algebra. It is therefore sufficient to prove that A ∩ {T = ∞} ∈ ℱ_{T−} for A ∈ ℱ_t. But then
  A ∩ {T = ∞} = (A ∩ {T > t}) ∩ {T = ∞} ∈ ℱ_{T−},
since A ∩ {T > t} ∈ ℱ_{T−} by definition of ℱ_{T−}.

Theorem 5.7  Let S, T be two stopping times, and Y a random variable. Then
(a) if T is predictable and Y is ℱ_S measurable, then Y 1_{]]S,T[[} is predictable.
(b) if S is predictable and Y is ℱ_{S−} measurable, then Y 1_{[[S,T]]} is predictable.
(c) if S and T are both predictable and Y is ℱ_{S−} measurable, then Y 1_{[[S,T[[} is predictable.

Proof. For part (a) note that X = Y 1_{]]S,T]]} is left continuous, and for A ∈ B(R^d) so that 0 is not in A,
  {ω : X_t(ω) ∈ A} = {Y ∈ A} ∩ {S < t ≤ T} = ({Y ∈ A} ∩ {S < t}) ∩ {T ≥ t} ∈ ℱ_t,
hence X is predictable. But Y 1_{]]S,T[[} = (Y 1_{]]S,T]]}) 1_{[[0,T[[}, which is the product of two predictable processes.
To prove (b), note that by the monotone class theorem we may assume that Y = 1_A where A ∈ ℱ_{S−}. Then
  Y 1_{[[S,T]]} = 1_{[[S_A,T]]} = (1_{[[0,T]]} − 1_{[[0,S_A[[}) 1_{[[0,T]]},
which is predictable since by Theorem 5.5, S_A is predictable.
Finally (c) follows from the fact that
  Y 1_{[[S,T[[} = (Y 1_{[[S,T]]}) 1_{[[0,T[[}.

Theorem 5.8  We have
(a) if (T_n) is a sequence of stopping times that increases to T, and such that T_n < T on {T > 0}, then T is predictable.
(b) if T is a predictable time, then there exists a sequence (T_n) of stopping times that increases to T, and such that T_n < T on {T > 0}.

Proof.
(a) ]]0, T[[ = ∪_n ]]0, T_n]] ∈ 𝒫, hence [[0, T[[ = ({T > 0} × {0}) ∪ ]]0, T[[ ∈ 𝒫 since {T > 0} ∈ ℱ_0.
(b) this is very difficult.

Theorem 5.9  Let X and Y be two predictable processes. Then either of the following two conditions implies that X = Y.
(a) X_T = Y_T for all finite predictable stopping times T.
(b) for all predictable stopping times T, X_T 1_{T<∞} and Y_T 1_{T<∞} are integrable, and
  E[X_T 1_{T<∞}] = E[Y_T 1_{T<∞}].

Definition 5.3  A stopping time T is called totally inaccessible if P(T = S < ∞) = 0 for all predictable times S.

Note that if T is a totally inaccessible stopping time and S is a stopping time so that [[S]] ⊂ [[T]], then S is also totally inaccessible.

We have the following decomposition.

Theorem 5.10  Let T be a stopping time. There exist a sequence (S_n) of predictable times and a unique (up to a P-null set) ℱ_T measurable subset A ⊂ {T < ∞} so that the stopping time T_A is totally inaccessible, and the stopping time T_{A^c} satisfies
  [[T_{A^c}]] ⊂ ∪_n [[S_n]].
T_A is called the totally inaccessible part of T and T_{A^c} its accessible part. They are uniquely defined up to a P-null set.

In most applications a stopping time will either be totally inaccessible, e.g. the times of jumps of a Poisson process, or predictable, e.g. the times a Brownian motion hits a boundary.

Exercise 5.2  Let X be an adapted process with continuous paths, and let B ⊂ R^d be closed. Let T = inf{t : X_t ∈ B}. Prove that T is a predictable stopping time.

Definition 5.4  An RCLL adapted process X is called quasi-left continuous if ΔX_T = 0 a.s. on the set {T < ∞} for every predictable time T.

Note that a quasi-left continuous process can have no fixed time of discontinuity, i.e. there can be no t_0 so that P(ΔX_{t_0} ≠ 0) > 0 (since T = t_0 is a predictable stopping time).

Theorem 5.11  Let X be an RCLL adapted process. Then there is equivalence between
(a) X is quasi-left continuous.
(b) for any increasing sequence (T_n) of stopping times with limit T, we have that lim_{n→∞} X_{T_n} = X_T a.s. on the set {T < ∞}.

Proof. Assume that X is not quasi-left continuous. Then there exists a predictable time T so that P(ΔX_T ≠ 0, T < ∞) > 0. Let T_n ↑ T be stopping times so that T_n < T on {T > 0} (they exist by Theorem 5.8 (b)). Since by assumption X has a left limit, lim X_{T_n} = X_{T−} a.s. on {0 < T < ∞}, hence lim X_{T_n} ≠ X_T on the set {ΔX_T ≠ 0, T < ∞}, which contradicts (b).
Now assume that (b) fails for some sequence T_n ↑ T, and put S_n = (T_n)_{{T_n<T}} ∧ n and S = T_A where A = ∩_n {T_n < T}. Then S_n < S and S_n ↑ S, hence S is a predictable stopping time by Theorem 5.8 (a). Also
  {lim X_{T_n} ≠ X_T, T < ∞} ⊂ {ΔX_S ≠ 0, S < ∞},
hence P(ΔX_S ≠ 0, S < ∞) > 0, which contradicts (a).

The following three theorems deal with predictable projections of processes. They are not very hard to prove, but we omit the proofs here.

Theorem 5.12  Let X be a local martingale. Then E[X_T | ℱ_{T−}] = X_{T−} on the set {T < ∞} for all predictable times T.

Example 5.1  Let N be a Poisson process with intensity 1. Then M_t = N_t − t is a martingale. Let T be the time of the first jump of N, so that N_T = 1 and N_{T−} = 0, i.e. M_T = 1 − T and M_{T−} = −T. But by Exercise 2.3, T is ℱ_{T−} measurable, hence
  E[M_T | ℱ_{T−}] = 1 − T ≠ −T = M_{T−}.
Therefore T cannot be predictable.

Theorem 5.13  Let X be an ℱ ⊗ B(R_+) measurable process. There exists a (−∞, ∞]-valued process, called the predictable projection of X and denoted by ᵖX, that is determined uniquely by the following two conditions.
(i) it is predictable.
(ii) for all predictable times T,
  (ᵖX)_T = E[X_T | ℱ_{T−}] on {T < ∞}.

Theorem 5.14  The predictable projection ᵖX has the following properties.
(a) for any stopping time T,
  ᵖ(X^T) = (ᵖX) 1_{[[0,T]]} + X_T 1_{]]T,∞[[}.
(b) if Z is any predictable process, ᵖ(ZX) = Z(ᵖX).
6  Some Markov theory.

Definition 6.1  Let X be an RCLL adapted process defined on the probability space (Ω, ℱ, 𝔽, P) satisfying the usual conditions. Let ℱ°_t = σ(X_s : s ≥ t). We say that X is a Markov process (relative to 𝔽) if for all B ∈ ℱ°_t and all t ≥ 0,
  P(B | ℱ_t) = P(B | X_t).  (6.1)
In particular we get for any C ∈ B(R^d) and any s ≥ 0 that P(X_{t+s} ∈ C | ℱ_t) = P(X_{t+s} ∈ C | X_t).

Basically, for a Markov process information about the future depends on the past only through the last observed value of the process.

Remark 6.1  For a Markov process it is possible that the filtration 𝔽 is strictly bigger than the filtration generated by the process X (extended so that it satisfies the usual conditions). It is therefore common to use the terminology "(X_t, ℱ_t; t ≥ 0) is a Markov process", since it is then clear which filtration X is Markov relative to (by extending the filtration, we may lose the Markov property (6.1)). In these notes no confusion will arise, and therefore we will usually just say that X is a Markov process.

An equivalent definition of a Markov process is the following. Let A ∈ ℱ_t and B ∈ ℱ°_t. Then
  P(A ∩ B | X_t) = P(A | X_t) P(B | X_t),  (6.2)
i.e. the past and the future are independent given the present.

To prove that (6.1) and (6.2) are equivalent, assume first (6.1). Then using that σ(X_t) ⊂ ℱ_t and the law of iterated expectations, we get
  P(A ∩ B | X_t) = E[1_A 1_B | X_t] = E[E[1_A 1_B | ℱ_t] | X_t] = E[1_A E[1_B | ℱ_t] | X_t]
    = E[1_A P(B | ℱ_t) | X_t] = E[1_A P(B | X_t) | X_t] = P(B | X_t) E[1_A | X_t]
    = P(A | X_t) P(B | X_t).
Assume now that (6.2) holds. By the definition of conditional probability, we have to prove that ∀ A ∈ ℱ_t,
  E[1_A P(B | X_t)] = P(A ∩ B).
But (6.2) says that E[1_A | X_t] E[1_B | X_t] = E[1_{A∩B} | X_t], hence
  E[1_A P(B | X_t)] = E[E[1_A P(B | X_t) | X_t]] = E[E[1_B | X_t] E[1_A | X_t]]
    = E[E[1_{A∩B} | X_t]] = E[1_{A∩B}] = P(A ∩ B).

To obtain even more general characterizations of a Markov process, the following two general results are useful. First define C_c(R^d) as the space of continuous functions on R^d with compact support, i.e. for every f ∈ C_c(R^d) there exists a compact C ⊂ R^d so that f(x) = 0 when x ∉ C.

Theorem 6.1  Let G ⊂ R^d be open. Then there exists a sequence of functions (f_n) in C_c(R^d) so that f_n ↑ 1_G.

Theorem 6.2 (Dynkin's theorem.)  Let S be an arbitrary space and ℋ a class of subsets of S which is closed under finite intersection. Let 𝒢 be a class of subsets of S such that S ∈ 𝒢 and ℋ ⊂ 𝒢. Assume furthermore that 𝒢 has the following closure properties.
(i) if A_n ∈ 𝒢 and A_n ⊂ A_{n+1} for n ∈ N*, then ∪_n A_n ∈ 𝒢.
(ii) if A ⊂ B and A ∈ 𝒢, B ∈ 𝒢, then B \ A ∈ 𝒢.
Then σ(ℋ) ⊂ 𝒢.

Exercise 6.1  Use Theorems 6.1 and 6.2 to prove that the following are equivalent.
(i) for all s, t ≥ 0, f ∈ C_c(R^d),
  E[f(X_{s+t}) | ℱ_t] = E[f(X_{s+t}) | X_t].
(ii) for all s, t ≥ 0 and all bounded measurable f,
  E[f(X_{s+t}) | ℱ_t] = E[f(X_{s+t}) | X_t].
(iii) for all B ∈ ℱ°_t and all t ≥ 0,
  P(B | ℱ_t) = P(B | X_t).
(iv) for all bounded ℱ°_t measurable random variables Y,
  E[Y | ℱ_t] = E[Y | X_t].  (6.3)

The most standard way of describing a Markov process is by means of a transition function.

Definition 6.2  A homogeneous transition function is a function P : R_+ × R^d × B(R^d) → [0, 1] so that
(i) for all t ∈ R_+, x ∈ R^d, P_t(x, ·) is a probability measure on B(R^d).
(ii) for all A ∈ B(R^d), P_·(·, A) is B(R_+) ⊗ B(R^d) measurable.
(iii) it satisfies the Chapman-Kolmogorov relation, i.e. for all s, t ≥ 0 and A ∈ B(R^d),
  P_{s+t}(x, A) = ∫_{R^d} P_t(x, dy) P_s(y, A).

For a measurable function f we then write
  (P_t f)(x) = ∫_{R^d} P_t(x, dy) f(y),
if the integral exists.

By (ii), P_t f is measurable, and by (i) P_t f is bounded if f is bounded. Also by (iii),
  (P_{s+t} f)(x) = ∫_{R^d} P_{s+t}(x, dz) f(z) = ∫_{R^d} ∫_{R^d} P_t(x, dy) P_s(y, dz) f(z)
    = ∫_{R^d} P_t(x, dy) (P_s f)(y) = (P_t P_s f)(x),
i.e.
  P_{s+t} = P_t P_s = P_s P_t.
For this reason the family (P_t) is called a semigroup, with identity P_0 defined by (P_0 f)(x) = f(x).
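The Chapman-Kolmogorov relation is concrete enough to verify numerically for a specific kernel. Here is a minimal sketch; the kernel is our choice (the Gaussian transition function of Brownian motion, treated formally later in this section), and the grid and test points are arbitrary:

  import numpy as np

  def p(t, x, y):
      # density of P_t(x, dy) = N(x, t)
      return np.exp(-(y - x)**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

  s, t, x = 0.7, 1.3, 0.5
  y = np.linspace(-15.0, 15.0, 4001)     # quadrature grid for the y-integral
  dy = y[1] - y[0]
  for z in (-1.0, 0.0, 2.0):
      two_step = np.sum(p(t, x, y) * p(s, y, z)) * dy   # integral of P_t(x,dy) p_s(y,z)
      assert abs(two_step - p(s + t, x, z)) < 1e-6
  print("Chapman-Kolmogorov holds numerically for the Gaussian kernel")

The plain Riemann sum is accurate here because the integrand is smooth and decays rapidly within the grid.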
The semigroup formulation gives us the following definition.

Definition 6.3  The process X is a homogeneous Markov process for the filtration 𝔽 with (P_t) as its transition function if for all s, t ≥ 0 and bounded measurable f,
  E[f(X_{s+t}) | ℱ_t] = (P_s f)(X_t).  (6.4)
By (P_s f)(X_t) is just meant that we evaluate h(x) = (P_s f)(x) for every x and then insert X_t, i.e. (P_s f)(X_t) = h(X_t).

In order to fully specify the process, we need the initial distribution μ of X_0. Then we have for 0 ≤ t_1 < ⋯ < t_n and bounded f,
  E[f(X_{t_1}, …, X_{t_n})] = ∫ μ(dx_0) ∫ P_{t_1}(x_0, dx_1) ⋯ ∫ P_{t_n−t_{n−1}}(x_{n−1}, dx_n) f(x_1, …, x_n).

Assume now that the transition functions (P_t)_{t≥0} are given, and let ℱ_t = σ(X_s : s ≤ t) be the natural filtration. Then using Kolmogorov's existence theorem, it can be proven that a Markov process with the given transition functions always exists (not necessarily RCLL). If X_0 = x, we denote the corresponding probability measure on Ω by P^x, i.e. for C ∈ B(R^d),
  P^x(X_t ∈ C) = P_t(x, C).
Now let Y = 1_C(X_t) and set
  E^x[Y] = ∫ Y(ω) P^x(dω) = P^x(X_t ∈ C) = P_t(x, C).
It is easy to verify that this expression extends to all bounded ℱ° = σ(X_s : s ≥ 0) measurable Y, i.e.
  E^x[Y] = ∫ Y(ω) P^x(dω).
The Markov property can now be written
  P(X_{s+t} ∈ C | ℱ_t) = P(X_{s+t} ∈ C | X_t) = P^{X_t}(X_s ∈ C) = P_s(X_t, C).

By using methods similar to those in Exercise 6.1, we can extend the definition of a Markov process as follows. Let X^{(t)} = (X_{t+s})_{s≥0} denote the whole path of X after time t, i.e. what comes after time t. Also let F ∈ B((R^d)^{[0,∞)}), which means that {X^{(t)} ∈ F} ∈ ℱ°_t = σ(X_s : s ≥ t). Then the Markov property can be written as
  P(X^{(t)} ∈ F | ℱ_t) = P(X^{(t)} ∈ F | X_t) = P^{X_t}(X^{(0)} ∈ F).  (6.5)
Equivalently, for any bounded measurable g : (R^d)^{[0,∞)} → R,
  E[g(X^{(t)}) | ℱ_t] = E^{X_t}[g(X^{(0)})].  (6.6)
In the last case we may assume g nonnegative.

If X_0 has initial measure μ we simply define for any A ∈ ℱ°,
  P^μ(A) = ∫_{R^d} P^x(A) μ(dx).

The next definition is crucial for many applications of Markov processes.

Definition 6.4  We say that the homogeneous Markov process X has the strong Markov property if for all finite stopping times T, numbers s > 0 and C ∈ B(R^d),
  P(X_{T+s} ∈ C | ℱ_T) = P_s(X_T, C) = P^{X_T}(X_s ∈ C),
or equivalently, for all bounded measurable f,
  E[f(X_{T+s}) | ℱ_T] = (P_s f)(X_T) = E^{X_T}[f(X_s)].

As in (6.5) and (6.6) we get the equivalent definitions (see above (6.5) and (6.6) for the notation),
  P(X^{(T)} ∈ F | ℱ_T) = P(X^{(T)} ∈ F | X_T) = P^{X_T}(X^{(0)} ∈ F).  (6.7)
  E[g(X^{(T)}) | ℱ_T] = E^{X_T}[g(X^{(0)})].  (6.8)

Now let C_0 be the set of continuous functions that vanish at infinity, i.e. lim_{|x|→∞} f(x) = 0.

Definition 6.5  The Markov transition functions (P_t) are said to have the Feller property if for all f ∈ C_0
(i) P_t f ∈ C_0 ∀ t ≥ 0.
(ii) (P_t f)(x) → f(x) as t ↓ 0 (pointwise for all x).
In this case we call the corresponding Markov process a Feller process.

It can be shown that the corresponding Markov process can be chosen to be RCLL. If on the other hand we construct a Markov process in a way that makes it right continuous, then condition (ii) is automatically satisfied. Indeed, by right continuity X_t → X_0 a.s. as t ↓ 0, hence by the bounded convergence theorem
  (P_t f)(x) = E^x[f(X_t)] → E^x[f(X_0)] = f(x) as t ↓ 0.

Most Markov processes that turn up in applications have the Feller property. That this is very useful follows from the next theorem.

Theorem 6.3  Let X be a Feller process. Then X is a quasi-left continuous strong Markov process.

This means in particular that a Feller process can have no fixed time of discontinuity. We actually have a seemingly stronger result.

Theorem 6.4  Let X be a Feller process and T ≥ S finite stopping times. Assume that T is ℱ_S measurable. Then for any bounded measurable f,
  E[f(X_T) | ℱ_S] = (P_{T−S} f)(X_S) = E^{X_S}[f(X_{T−S})].
If lim_{t→∞} X_t = X_∞ exists, we can drop the assumption that the stopping times are finite.

Example 6.1  Let X be an RCLL Markov process taking values in the integers. Here the natural topology is the discrete topology, i.e. each point is an open set, and therefore every function is continuous. As mentioned above, condition (ii) of Definition 6.5 is always satisfied for RCLL processes, hence X is a Feller process.

We will now analyse RCLL processes with stationary independent increments (called Lévy processes), since they are important for our applications. So let X be an RCLL adapted process with transition probabilities
  P_t(x, A) = P_t(0, A − x) =: μ_t(A − x),
where A − x = {y : y + x ∈ A}.
The measures (μ_t) have the convolution property, i.e. by the Chapman-Kolmogorov relation
  μ_{s+t}(A) = P_{s+t}(0, A) = ∫ P_t(0, dy) P_s(y, A) = ∫ μ_s(A − y) μ_t(dy).
Let x_n → x and let f ∈ C_0(R^d). Then by the above properties and bounded convergence
  lim_n (P_t f)(x_n) = lim_n ∫ f(y) P_t(x_n, dy) = lim_n ∫ f(y) μ_t(dy − x_n)
    = lim_n ∫ f(y + x_n) μ_t(dy) = ∫ f(y + x) μ_t(dy) = (P_t f)(x).
Hence X is a Feller process. Actually we will prove:

Theorem 6.5  Let X be an RCLL adapted process with stationary independent increments, and let T be a finite stopping time. Then the process Y defined by Y_t = X_{T+t} − X_T is independent of ℱ_T, and Y has the same law as X − X_0.

Proof. We must prove that for bounded measurable f_1, …, f_n and times 0 < t_1 < ⋯ < t_n,
  E[f_1(Y_{t_1}) Π_{i=2}^n f_i(Y_{t_i} − Y_{t_{i−1}}) | ℱ_T] = E^0[f_1(X_{t_1})] Π_{i=2}^n E[f_i(X_{t_i} − X_{t_{i−1}})].
There is no loss of generality in assuming that n = 2, since the method used carries straightforwardly over to general n. From the discussion preceding the theorem, we get for bounded measurable f,
  E^x[f(X_t − X_0)] = E^x[f(X_t − x)] = ∫ f(y − x) P_t(x, dy) = ∫ f(y) P_t(0, dy) = h(t),
independent of x. Therefore, since Y_{t_2} − Y_{t_1} = X_{T+t_2} − X_{T+t_1}, we get from (6.8)
  E[f_1(Y_{t_1}) | ℱ_T] = E[f_1(X_{T+t_1} − X_T) | ℱ_T] = E^{X_T}[f_1(X_{t_1} − X_0)] = h_1(t_1) = E^0[f_1(X_{t_1})],
  E[f_2(Y_{t_2} − Y_{t_1}) | ℱ_{T+t_1}] = h_2(t_2 − t_1) = E[f_2(X_{t_2} − X_{t_1})],
which are both independent of ℱ_T. Therefore, since Y_{t_1} is ℱ_{T+t_1} measurable,
  E[f_1(Y_{t_1}) f_2(Y_{t_2} − Y_{t_1}) | ℱ_T] = E[f_1(Y_{t_1}) E[f_2(Y_{t_2} − Y_{t_1}) | ℱ_{T+t_1}] | ℱ_T]
    = E[f_1(Y_{t_1}) h_2(t_2 − t_1) | ℱ_T] = h_1(t_1) h_2(t_2 − t_1).
This ends the proof.

Remark 6.2  As in Remark 6.1, when we say that X has independent increments, we actually mean that X_{t+s} − X_t is independent of ℱ_t, and of course the filtration 𝔽 may be strictly bigger than the filtration generated by the process X itself. However, our shorthand terminology will cause no confusion in what follows.

A special case of an RCLL process with independent increments is a Brownian motion. Using the above notation, its transition function is given by
  μ_t(A) = ∫_A (1/√(2πt)) e^{−y²/(2t)} dy = P_t(0, A),
i.e. W_{t+s} − W_s ∼ N(0, t) (we use the notation W for Wiener process).

To prove that W has almost surely continuous paths, we need the following theorem by Kolmogorov.

Theorem 6.6  Let X be an RCLL process, and assume there exist positive constants α, β and C so that
  E[|X_t − X_s|^α] ≤ C|t − s|^{1+β} ∀ t > s.
Then X is a.s. continuous.

Exercise 6.2  Prove that a Brownian motion is a.s. continuous.
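A standard moment computation, stated here as a hint (the constant is the well-known Gaussian fourth moment): for t > s, W_t − W_s ∼ N(0, t − s), so
  E[|W_t − W_s|⁴] = 3(t − s)²,
and Theorem 6.6 applies with α = 4, β = 1 and C = 3.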
By means of Theorem 6.5, we get the density of the hitting times of a Brownian motion.

Theorem 6.7  Let W be a Brownian motion with W_0 = 0, and let b ≠ 0 be a number. Define
  T_b = inf{t : W_t = b}.
Then T_b has density
  f_{T_b}(t) = (|b| / √(2πt³)) e^{−b²/(2t)}.

Proof. By the symmetry of Brownian motion, we may assume that b > 0. Also, since random walk is null recurrent, it follows that T_b < ∞ a.s. Furthermore, by Theorem 6.5, W_{T_b+t} − W_{T_b} is a Brownian motion independent of ℱ_{T_b}, hence by symmetry
  P(W_{T_b+t} − W_{T_b} < 0 | ℱ_{T_b}) = P(W_{T_b+t} − W_{T_b} > 0 | ℱ_{T_b}) (= 1/2).
Now T_b is ℱ_{T_b} measurable, so therefore on the set {T_b < t},
  P(W_{T_b+(t−T_b)} − W_{T_b} < 0 | ℱ_{T_b}) = P(W_{T_b+(t−T_b)} − W_{T_b} > 0 | ℱ_{T_b}) (= 1/2).
Hence
  P(T_b < t, W_t − W_{T_b} < 0) = ∫_{{T_b<t}} P(W_{T_b+(t−T_b)} − W_{T_b} < 0 | ℱ_{T_b}) dP
    = P(T_b < t, W_t − W_{T_b} > 0).
This gives
  P(T_b < t) = P(T_b < t, W_t − W_{T_b} < 0) + P(T_b < t, W_t − W_{T_b} > 0)
    = 2P(T_b < t, W_t − W_{T_b} > 0) = 2P(W_t > b) = 2P(W_1 > bt^{−1/2})
    = 2 ∫_{bt^{−1/2}}^∞ (1/√(2π)) e^{−y²/2} dy.
Differentiation w.r.t. t gives the desired result.
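The reflection computation above is easy to test by simulation. A minimal sketch (Euler grid; the step size, horizon and level are our arbitrary choices, and the discrete grid slightly underestimates hitting probabilities):

  import numpy as np
  from math import erf, sqrt

  rng = np.random.default_rng(1)
  b, t, dt, paths = 1.0, 2.0, 2e-3, 5000
  n = int(t / dt)
  W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(paths, n)), axis=1)
  hit = np.mean(np.any(W >= b, axis=1))              # estimate of P(T_b < t)

  Phi = lambda a: 0.5 * (1.0 + erf(a / sqrt(2.0)))   # standard normal cdf
  exact = 2.0 * (1.0 - Phi(b / sqrt(t)))             # P(T_b < t) = 2 P(W_1 > b t^(-1/2))
  print(f"simulated {hit:.3f} vs exact {exact:.3f}")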
Let us return to general processes with independent stationary increments, and prove the following easy result.

Theorem 6.8  Let X be a process with independent stationary increments, and let X_0 = x. Then, assuming the relevant quantities exist, there are constants c_1, c_2 and a function c_3(θ) so that
  E[X_t] = x + c_1 t,
  V[X_t] = c_2 t,
  E[e^{θX_t}] = e^{θx + c_3(θ)t}.
Also c_3(1) ≥ c_1. Here V[X_t] is the variance of X_t. Furthermore
  M_t = e^{θX_t} / E[e^{θX_t}]
is a nonnegative martingale with M_0 = 1.

Proof. Let U_t = X_t − x so that U_0 = 0. Then
  h_1(t + s) = E[U_{t+s}] = E[U_t + (U_{t+s} − U_t)] = h_1(t) + h_1(s) ⟹ h_1(t) = c_1 t,
  h_2(t + s) = V[U_{t+s}] = V[U_t + (U_{t+s} − U_t)] = h_2(t) + h_2(s) ⟹ h_2(t) = c_2 t,
  h_3(t + s) = E[e^{θU_{t+s}}] = E[e^{θU_t} e^{θ(U_{t+s}−U_t)}] = h_3(t) h_3(s) ⟹ h_3(t) = e^{c_3(θ)t}.
By Jensen's inequality
  e^{x + c_3(1)t} = E[e^{X_t}] ≥ e^{E[X_t]} = e^{x + c_1 t},
showing that c_3(1) ≥ c_1. Furthermore
  E[M_{t+s} | ℱ_t] = E[e^{θX_{t+s}} | ℱ_t] / E[e^{θX_{t+s}}]
    = e^{θX_t} E[e^{θ(X_{t+s}−X_t)} | ℱ_t] / (E[e^{θX_t}] E[e^{θ(X_{t+s}−X_t)}])
    = e^{θX_t} / E[e^{θX_t}] = M_t.
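As a concrete illustration (a worked example of ours, using the familiar Gaussian moment generating function): for X_t = x + ct + σW_t with W a Brownian motion,
  c_1 = c, c_2 = σ², c_3(θ) = θc + ½θ²σ²,
since E[e^{θσW_t}] = e^{θ²σ²t/2}, and the martingale of Theorem 6.8 becomes M_t = e^{θσW_t − θ²σ²t/2}.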
This result will be used to prove the ruin result in Theorem 6.9 below, but first a preliminary lemma.

Lemma 6.1  Let X be a process with independent stationary increments, and let X_0 = x. Assume that E[X_t] exists, and that P(X_t = E[X_t]) < 1. Assume also that the equation
  E[e^{−rX_1}] = e^{−rx}  (6.9)
has a solution R > 0. Then this solution is unique. Furthermore, if we as in the above theorem write E[X_t] = x + c_1 t, it implies that c_1 > 0, and finally it implies that X_t → ∞ a.s. as t → ∞.

Note that by the assumption of independent stationary increments, (6.9) is equivalent to the seemingly stronger assumption E[e^{−RX_t}] = e^{−Rx} ∀ t.

Proof. Let r ∈ (0, R). For any y we can write −ry = a·0 + (1 − a)(−Ry) with a = 1 − r/R ∈ (0, 1). Since e^z is convex, we get that e^{−ry} ≤ a e^0 + (1 − a) e^{−Ry} < 1 + e^{−Ry}, hence
  E[e^{−rX_t}] < 1 + E[e^{−RX_t}] = 1 + e^{−Rx} < ∞.
Setting U_t = X_t − x, this means that (let t be fixed) h(r) = E[e^{−rU_t}] exists for r ∈ [0, R]. Also h(0) = h(R) = 1 and h''(r) = E[U²_t e^{−rU_t}] > 0, the latter property implying that h is convex where it is defined. Together this implies that R is unique.
Now by Jensen's inequality
  1 = E[e^{−RU_t}] > e^{−RE[U_t]} = e^{−Rc_1 t},
which means that c_1 > 0. Therefore
  h'(0) = −E[U_t] = −c_1 t < 0.
Consequently there exists an r > 0 so that E[e^{−rU_t}] < 1. But by the above theorem, there exists a c_3 so that E[e^{−rU_t}] = e^{−c_3 t} < 1, which implies that c_3 > 0. But again by the above theorem, for this r,
  N_t = e^{−rU_t} / E[e^{−rU_t}]
is a nonnegative (super)martingale, hence by Theorem 4.1 there exists an a.s. finite random variable N_∞ so that N_t → N_∞ as t → ∞, i.e.
  e^{−rU_t} e^{c_3 t} → N_∞ as t → ∞.
But c_3 > 0, hence U_t → ∞ as t → ∞.

We are now ready for the result on ruin probabilities.

Theorem 6.9  Let X be a process with independent stationary increments, and assume that X_0 = x > 0. Define
  T_x = inf{t : X_t < 0}.
If R > 0 is the unique solution of (6.9), we have
  P(T_x < ∞) = e^{−Rx} / E[e^{−RX_{T_x}} | T_x < ∞] ≤ e^{−Rx}.

Proof. Since X_{T_x} ≤ 0 on {T_x < ∞}, the inequality (called Lundberg's inequality) follows immediately.
By Theorem 6.8 and the definition of R (see the note after Lemma 6.1), M_t = e^{−RX_t} is a martingale, and since T_x ∧ t is a finite stopping time, we get by Theorem 4.5 (optional sampling theorem),
  M_0 = e^{−Rx} = E[M_{T_x∧t}] = E[M_{T_x∧t} 1_{T_x≤t}] + E[M_{T_x∧t} 1_{T_x>t}]
    = E[M_{T_x} 1_{T_x≤t}] + E[M_t 1_{T_x>t}].
But on {T_x > t}, X_t ≥ 0, hence M_t 1_{T_x>t} ≤ 1. By Lemma 6.1, M_t → 0 a.s. as t → ∞, and therefore by the bounded convergence theorem,
  E[M_t 1_{T_x>t}] → 0 as t → ∞.
Also M_{T_x} 1_{T_x≤t} increases monotonically towards M_{T_x} 1_{T_x<∞}, hence by the monotone convergence theorem, as t → ∞,
  E[M_{T_x} 1_{T_x≤t}] → E[M_{T_x} 1_{T_x<∞}] = E[M_{T_x} | T_x < ∞] P(T_x < ∞).
We therefore have
  e^{−Rx} = E[M_{T_x} | T_x < ∞] P(T_x < ∞).
The result follows.

Exercise 6.3  Let X_t = x + ct + W_t where c and x are positive and W is a Brownian motion. Find P(T_x < ∞).

Exercise 6.4  Let X_t = x + ct − Σ_{i=1}^{N_t} S_i where N is a Poisson process with intensity λ, and the (S_i) are i.i.d., independent of N and exponentially distributed with expectation μ^{−1}. Assume that x is positive and that c > λμ^{−1}. Find P(T_x < ∞).
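For the setting of Exercise 6.3, equation (6.9) reads e^{−R(x+c)+R²/2} = e^{−Rx}, giving R = 2c; and since X is continuous, X_{T_x} = 0 on {T_x < ∞}, so Theorem 6.9 holds there with equality. A minimal Monte Carlo sketch of this case (finite horizon and Euler grid, hence a slight downward bias; all numerical choices are ours):

  import numpy as np

  rng = np.random.default_rng(2)
  x, c, dt, horizon, paths = 1.0, 0.5, 1e-2, 200.0, 4000
  n = int(horizon / dt)
  ruined = 0
  for _ in range(paths):
      steps = c * dt + rng.normal(0.0, np.sqrt(dt), size=n)
      X = x + np.cumsum(steps)               # X_t = x + c t + W_t on the grid
      ruined += bool(np.any(X < 0.0))
  print(f"simulated {ruined / paths:.3f} vs e^(-2cx) = {np.exp(-2*c*x):.3f}")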
7  Processes with finite variation and the Doob-Meyer decomposition.

Definition 7.1  We denote by 𝒱⁺ (resp. 𝒱) the set of processes A that are RCLL and adapted with A_0 = 0, so that each path is increasing (has finite variation on each finite time interval). If A ∈ 𝒱⁺ we write A_∞ = lim_{t→∞} A_t. Then A_∞(ω) ∈ R̄_+. We write Var(A) for the variation of A, i.e. Var(A)_t(ω) is the total variation of A(ω) on [0, t].

Theorem 7.1  Let A ∈ 𝒱. There exists a unique pair (B, C) of processes in 𝒱⁺ so that A = B − C and Var(A) = B + C. If A is predictable, then B and C are also predictable.

Proof. It is known from real analysis that for each ω there is a unique pair (B(ω), C(ω)) that are RCLL, nondecreasing with B_0(ω) = C_0(ω) = 0 and such that A(ω) = B(ω) − C(ω) and Var(A)(ω) = B(ω) + C(ω). This gives
  B = ½(Var(A) + A) and C = ½(Var(A) − A).
We must therefore prove that Var(A) is adapted, and predictable when A is predictable. But
  Var(A)_t = lim_{n→∞} Σ_{k=1}^n |A_{tk/n} − A_{t(k−1)/n}|,
which shows that Var(A)_t is ℱ_t measurable. Also Var(A)_− is left continuous, hence predictable, and since ΔVar(A) = |ΔA| is predictable if A is predictable, it follows that Var(A) = Var(A)_− + ΔVar(A) is predictable if A is so.

Let H be an optional process, so that by Theorem 3.2, H is ℱ ⊗ B(R_+) measurable. Therefore, for ω fixed, the function t → H_t(ω) is Borel measurable. Consequently, with A ∈ 𝒱, we may define the integral process H·A by
  (H·A)_t(ω) = ∫_0^t H_s(ω) dA_s(ω) if ∫_0^t |H_s(ω)| d(Var(A)_s(ω)) < ∞.
We can then prove:

Theorem 7.2  Let A ∈ 𝒱 and H be an optional process so that B = H·A is finite valued. Then B ∈ 𝒱. (If A ∈ 𝒱⁺ and H is nonnegative, then B ∈ 𝒱⁺.) If A and H are predictable, then B is predictable.

Proof. Clearly B is RCLL. Let H = 1_{[[0,T[[} with T a finite stopping time. Then
  B_t = (H·A)_t = ∫_0^t 1_{[[0,T[[}(s) dA_s = A_t 1_{t<T} + A_{T−} 1_{T≤t}.
Now A_t 1_{t<T} is adapted. Also A_− is left continuous and adapted, hence optional by Theorem 3.5. Therefore by Theorem 3.2, (A_−)_T = A_{T−} is ℱ_T measurable, and consequently A_{T−} 1_{T≤t} is ℱ_t measurable. Hence B ∈ 𝒱 when H = 1_{[[0,T[[}. The result for general H follows by a monotone class argument (Theorem 3.1) and Theorem 3.3.
If A and H are predictable, then ΔB = HΔA is predictable, hence B = B_− + ΔB is predictable.

Definition 7.2  We denote by 𝒜⁺ the set of all A ∈ 𝒱⁺ so that E[A_∞] < ∞.
We denote by 𝒜 the set of all A ∈ 𝒱 so that E[Var(A)_∞] < ∞.
We define 𝒜⁺_loc as follows: for each A ∈ 𝒜⁺_loc there is a sequence of increasing stopping times (T_n) with T_n ↑ ∞ and every A^{T_n} ∈ 𝒜⁺.
Similarly 𝒜_loc is defined relative to 𝒜.

Since 𝒱⁺ and 𝒱 are defined by pathwise properties, we have that 𝒱⁺_loc = 𝒱⁺ and 𝒱_loc = 𝒱. Also we have
  𝒜⁺ ⊂ 𝒜⁺_loc ⊂ 𝒱⁺ and 𝒜 ⊂ 𝒜_loc ⊂ 𝒱.

Definition 7.3  A stochastic process X is said to be locally bounded if there is a sequence of increasing stopping times (T_n) with T_n ↑ ∞, and so that each |X^{T_n}| is bounded.

In particular if A ∈ 𝒱⁺ and A is locally bounded, then A ∈ 𝒜⁺_loc. Similarly if A ∈ 𝒱 and Var(A) is locally bounded, then Var(A) ∈ 𝒜⁺_loc.

Theorem 7.3  The following applies:
(a) Any left continuous finite process X is locally bounded.
(b) If A ∈ 𝒱 is predictable, then Var(A) is locally bounded.

Proof. For part (a), let T_n = inf{t : |X_t| ≥ n}. Then T_n ↑ ∞, and |X^{T_n}| ≤ n because of left continuity. We omit the proof of part (b).

We also have:

Theorem 7.4  Any local martingale that belongs to 𝒱 also belongs to 𝒜_loc.

Proof. Assume first that X is a uniformly integrable martingale. Let T_n = inf{t : Var(X)_t ≥ n}. Then T_n ↑ ∞ and Var(X)_{T_n−} ≤ n. Since
  |ΔX_{T_n}| = |X_{T_n} − X_{T_n−}| ≤ |X_{T_n}| + |X_{T_n−}| ≤ |X_{T_n}| + Var(X)_{T_n−},
we get
  Var(X)_{T_n} ≤ |X_{T_n}| + 2n.
But by Theorem 4.4,
  E[|X_{T_n}|] = E[|E[X_∞ | ℱ_{T_n}]|] ≤ E[E[|X_∞| | ℱ_{T_n}]] = E[|X_∞|] < ∞,
hence E[Var(X^{T_n})_∞] = E[Var(X)_{T_n}] < ∞ and X ∈ 𝒜_loc.
Now let X be a local martingale. From what we have proven there exists an increasing sequence (T_n) of stopping times so that each X^{T_n} ∈ 𝒜_loc, i.e. for each n there exists an increasing sequence T_{n,p} ↑ ∞ as p → ∞ of stopping times so that each (X^{T_n})^{T_{n,p}} ∈ 𝒜. For each n there exists a p(n) so that P(T_{n,p(n)} < T_n ∧ n) < 2^{−n}. Put
  S_n = T_n ∧ (∧_{m≥n} T_{m,p(m)}).
Then each S_n is a stopping time, (S_n) is increasing and
  P(S_n < T_n ∧ n) ≤ Σ_{m≥n} P(T_{m,p(m)} < T_n ∧ n) ≤ Σ_{m≥n} P(T_{m,p(m)} < T_m ∧ m) < Σ_{m≥n} 2^{−m} = 2^{−(n−1)}.
Therefore T_n ↑ ∞ implies S_n ↑ ∞ by the Borel-Cantelli lemma. Finally
  X^{S_n} = ((X^{T_n})^{T_{n,p(n)}})^{S_n} ∈ 𝒜,
and therefore X ∈ 𝒜_loc.

Theorem 7.5  Let A ∈ 𝒜 and M be a bounded martingale. For any stopping time T we have that E[M_T A_T] = E[(M·A)_T]. If A is predictable, then E[M_T A_T] = E[(M_−·A)_T].

Theorem 7.6  Let A ∈ 𝒜_loc and M be a locally bounded local martingale. Then the process MA − M·A is a local martingale.
If A is predictable, the process MA − M_−·A is a local martingale.

Proof. By localization we may assume that A ∈ 𝒜 and that M is a bounded martingale. Then for any stopping time T we get by the above theorem
  E[(MA − M·A)_T] = 0.
The result follows from Exercise 4.1.

Definition 7.4  A process X is of class (D) if the set of random variables {X_T : T a finite valued stopping time} is uniformly integrable.

Note that a process of class (D) is uniformly integrable; just let T = t ∈ R_+. The converse is true in the following case.

Exercise 7.1  Prove that a uniformly integrable martingale is of class (D).

Theorem 7.7 (Doob-Meyer decomposition.)  If X is a submartingale of class (D), there exists a unique predictable process A ∈ 𝒜⁺ so that X − A is a uniformly integrable martingale.

The following is a very useful application of Theorem 7.7.

Theorem 7.8  Let X ∈ 𝒱 be a predictable local martingale. Then X = 0. In particular, every continuous local martingale with finite variation on finite intervals is a constant.

Proof. By Theorem 7.4, X ∈ 𝒜_loc. Hence there exist stopping times T_n ↑ ∞ and S_n ↑ ∞ so that X^{T_n} ∈ 𝒜 and X^{S_n} ∈ ℳ. But then X^{T_n∧S_n} ∈ 𝒜 ∩ ℳ and T_n ∧ S_n ↑ ∞. We may therefore assume that X ∈ 𝒜 ∩ ℳ, which by Exercise 7.1 implies that X is of class (D).
Since X ∈ 𝒜 is predictable, by Theorem 7.1 there are two unique predictable processes A, B ∈ 𝒜⁺ so that
  X = A − B, i.e. A = X + B.
But A ∈ 𝒜⁺, hence a submartingale of class (D) (since 0 ≤ A_T ≤ A_∞ ∈ L¹). Therefore by Theorem 7.7 there exist a unique predictable A' ∈ 𝒜⁺ and a uniformly integrable martingale M so that
  A = M + A'.
Because of uniqueness we must have A' = B. But A = 0 + A is also a decomposition, therefore by uniqueness A = A', hence A = B, implying that X = 0.

The next theorem is also very important.

Theorem 7.9  Let A ∈ 𝒜⁺_loc. There is a predictable process A^p ∈ 𝒜⁺_loc, called the compensator of A, which is uniquely characterized by any of the following three equivalent statements.
(i) A − A^p is a local martingale.
(ii) E[A^p_T] = E[A_T] for all stopping times T.
(iii) E[(H·A^p)_∞] = E[(H·A)_∞] for all nonnegative predictable H.

Proof. We start by proving that (i)-(iii) are equivalent.
(i) ⇒ (ii). Let (T_n) be a localizing sequence for A − A^p, so that A^{T_n} − (A^p)^{T_n} is a uniformly integrable martingale. Then we get by Theorems 4.5 and 4.7 that for any stopping time T,
  E[A_{T∧T_n}] = E[A^{T_n}_T] = E[(A^p)^{T_n}_T] = E[A^p_{T∧T_n}].
But A and A^p are increasing, so by monotone convergence
  E[A_T] = lim_n E[A_{T∧T_n}] = lim_n E[A^p_{T∧T_n}] = E[A^p_T].
(ii) ⇒ (i). Let (T_n) be a localizing sequence so that A^{T_n} and (A^p)^{T_n} are in 𝒜⁺. Then for any stopping time T,
  E[(A − A^p)^{T_n}_T] = E[A_{T∧T_n}] − E[A^p_{T∧T_n}] = 0,
hence by Exercise 4.1, (A − A^p)^{T_n} is a uniformly integrable martingale. Therefore A − A^p is a local martingale.
(iii) ⇒ (ii). Let H = 1_{[[0,T]]}, so that
  (H·A^p)_∞ = A^p_T and (H·A)_∞ = A_T.
(ii) ⇒ (iii). For H = 1_{[[0,T]]} we saw above that this is true. The result for general nonnegative predictable H follows by a monotone class argument and Theorem 5.1.
It remains to prove the existence of a predictable A^p ∈ 𝒜⁺_loc that satisfies (i). Let (T_n) be a localizing sequence so that A^{T_n} ∈ 𝒜⁺. Then A^{T_n} is a submartingale of class (D), hence by Theorem 7.7 there is a unique predictable B(n) ∈ 𝒜⁺ so that M_n = A^{T_n} − B(n) ∈ ℳ. Now M^{T_n}_{n+1} = (A^{T_{n+1}} − B(n + 1))^{T_n} also, and
  M^{T_n}_{n+1} = A^{T_n} − B(n + 1)^{T_n} = M_n + B(n) − B(n + 1)^{T_n}.
Therefore B(n) − B(n + 1)^{T_n} = M^{T_n}_{n+1} − M_n is a predictable element of ℳ ∩ 𝒱 with initial value zero, so by Theorem 7.8, B(n) = B(n + 1)^{T_n}. Consequently the process
  A^p = Σ_n B(n) 1_{]]T_{n−1},T_n]]}
(with T_0 = 0) is predictable and satisfies (A^p)^{T_n} = B(n). Furthermore A^p ∈ 𝒜⁺_loc and A − A^p ∈ ℳ_loc, so this ends the proof of the theorem.
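Characterization (ii) is easy to see at work in the simplest case: the Poisson process N with intensity λ has compensator A^p_t = λt (compare Example 8.1 below). A hedged simulation sketch of ours, with the genuine stopping time T = time of the third jump, truncated at a fixed horizon to keep it bounded:

  import numpy as np

  rng = np.random.default_rng(3)
  lam, t_max, paths = 2.0, 5.0, 20000
  sum_N, sum_comp = 0.0, 0.0
  for _ in range(paths):
      jumps = np.cumsum(rng.exponential(1.0 / lam, size=50))  # jump times of N
      s = min(jumps[2], t_max)                   # bounded stopping time T ∧ t_max
      sum_N += np.searchsorted(jumps, s, side='right')   # N at time T ∧ t_max
      sum_comp += lam * s                                # compensator λ·(T ∧ t_max)
  print(f"E[N] ≈ {sum_N/paths:.3f}  vs  E[A^p] ≈ {sum_comp/paths:.3f}")

Both averages estimate E[N_{T∧t_max}] = λ E[T ∧ t_max] ≈ 3.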
By splitting into negative and positive parts, the following generalization can be obtained.

Theorem 7.10  Let A ∈ 𝒜_loc. There is a predictable process A^p ∈ 𝒜_loc (again called the compensator) which is uniquely characterized by A − A^p being a local martingale.
Moreover, for each predictable process H so that H·A ∈ 𝒜_loc, we have H·A^p ∈ 𝒜_loc and H·A^p = (H·A)^p. In particular H·A − H·A^p is a local martingale.

Exercise 7.2  Prove the following for A ∈ 𝒜_loc.
(a) if A is predictable, then A^p = A.
(b) if T is a stopping time, then (A^T)^p = (A^p)^T.
(c) ᵖ(ΔA) = ΔA^p.
(d) A is a local martingale if and only if A^p = 0.

Exercise 7.3  Let A ∈ ℳ_loc ∩ 𝒱 and H a predictable process so that H·A ∈ 𝒜_loc. Prove that H·A is a local martingale.

Exercise 7.4  Let X_t = Σ_{i=1}^{N_t} S_i be a compound Poisson process, i.e. N is a Poisson process with intensity λ and the (S_i) are i.i.d. and independent of N. Find the compensator of X.
8  Local martingales and semimartingales.

Definition 8.1  We denote by ℋ² the set of all martingales M so that sup_t E[M²_t] < ∞.
The set of all martingales for which there exists an increasing sequence (T_n) of stopping times so that M^{T_n} ∈ ℋ² for all n is denoted ℋ²_loc.

Note that by Theorem 4.2, ℋ² ⊂ ℳ, hence by Exercise 7.1, M ∈ ℋ² implies that M is of class (D).

We have the important result:

Theorem 8.1  Let M, N ∈ ℋ²_loc. Then there is a unique predictable process ⟨M, N⟩ ∈ 𝒜_loc, bilinear in (M, N) (called the predictable quadratic covariation), so that
  MN − ⟨M, N⟩ ∈ ℳ_loc.
Moreover
  ⟨M, N⟩ = ¼(⟨M + N, M + N⟩ − ⟨M − N, M − N⟩),  (8.1)
and if M, N ∈ ℋ², then ⟨M, N⟩ ∈ 𝒜 and MN − ⟨M, N⟩ ∈ ℳ.
Finally ⟨M, M⟩ ∈ 𝒜⁺_loc, and ⟨M, M⟩ is continuous if and only if M is quasi-left continuous.

Proof. Assume that B ∈ 𝒱 is another predictable process so that MN − B ∈ ℳ_loc. Then by taking differences, the predictable process X = ⟨M, N⟩ − B ∈ ℳ_loc ∩ 𝒱, so by Theorem 7.8, X = 0, and uniqueness follows.
Assume we can prove the result when N = M. Then by (8.1)
  MN − ⟨M, N⟩ = ¼[((M + N)² − ⟨M + N, M + N⟩) − ((M − N)² − ⟨M − N, M − N⟩)] ∈ ℳ_loc.
We may therefore assume that M = N. Also by localization we may assume that M ∈ ℋ². By Theorem 4.9, M² is a submartingale, and by Theorem 4.10, E[sup_t M²_t] ≤ 4 sup_t E[M²_t] < ∞. Since for any stopping time T, M²_T ≤ sup_t M²_t, it follows that M² is of class (D). Therefore by Theorem 7.7, there is a unique predictable process ⟨M, M⟩ ∈ 𝒜 so that M² − ⟨M, M⟩ ∈ ℳ.
To prove the final statement, let T be any predictable stopping time. By Theorems 5.12 and 5.2(a),
  0 = E[Δ(M² − ⟨M, M⟩)_T | ℱ_{T−}] = E[ΔM²_T | ℱ_{T−}] − Δ⟨M, M⟩_T.
Now
  ΔM²_T = (M_{T−} + ΔM_T)² − M²_{T−} = (ΔM_T)² + 2M_{T−}ΔM_T,
hence by Theorems 5.12 and 5.2(a),
  Δ⟨M, M⟩_T = E[(ΔM_T)² | ℱ_{T−}].
Assume first that ⟨M, M⟩ is continuous. Then
  E[(ΔM_T)² | ℱ_{T−}] = 0 ⟹ E[(ΔM_T)²] = 0 ⟹ ΔM_T = 0,
so that M is quasi-left continuous.
Assume next that M is quasi-left continuous and let T = inf{t : Δ⟨M, M⟩_t > 0}. Then by Theorem 5.4(b), T is a predictable stopping time. Hence on {T < ∞} we have that
  ΔM_T = 0 ⟹ E[(ΔM_T)² | ℱ_{T−}] = 0 ⟹ Δ⟨M, M⟩_T = 0,
so therefore T = ∞, i.e. ⟨M, M⟩ is continuous.

Exercise 8.1  Compute ⟨M, M⟩ when M is a martingale with independent stationary increments (e.g. a Wiener process or a compensated compound Poisson process).

For M, N ∈ ℋ²_loc, M_0 N and N_0 M are both in ℳ_loc. Therefore
  (M − M_0)(N − N_0) − MN ∈ ℳ_loc.
Also
  (M − M_0)(N − N_0) − ⟨M − M_0, N − N_0⟩ − (MN − ⟨M, N⟩) ∈ ℳ_loc.
Therefore by uniqueness
  ⟨M − M_0, N − N_0⟩ = ⟨M, N⟩.

Assume now that M, N ∈ ℋ², and let M_∞ = lim_{t→∞} M_t and N_∞ = lim_{t→∞} N_t. On ℋ² we define the inner product
  (M, N)_{ℋ²} = E[M_∞N_∞], ‖M‖²_{ℋ²} = E[M²_∞] = ‖M_∞‖²_{L²},  (8.2)
where L² = L²(Ω, ℱ_∞, P). Note that ‖M‖_{ℋ²} = 0 ⟺ ‖M_∞‖_{L²} = 0 ⟺ M_∞ = 0 (a.s.). But then M_t = E[M_∞ | ℱ_t] = 0, hence ‖M‖_{ℋ²} = 0 implies that M = 0, so that ‖·‖_{ℋ²} is a norm. Also X = MN − ⟨M, N⟩ ∈ ℳ with X_0 = M_0N_0, and E[X_∞] = E[M_∞N_∞] − E[⟨M, N⟩_∞] = E[M_0N_0], which implies
  (M, N)_{ℋ²} = E[⟨M, N⟩_∞] + E[M_0N_0].  (8.3)
To prove that ℋ² is a Hilbert space, we must prove that it is complete. So let (M^n) be a Cauchy sequence in ℋ², i.e.
  ‖M^n − M^m‖_{ℋ²} = ‖M^n_∞ − M^m_∞‖_{L²} → 0 as m, n → ∞.
But it is well known that L² is complete, so there is a random variable M_∞ in L² so that M^n_∞ → M_∞ in L². Then define M ∈ ℋ² by M_t = E[M_∞ | ℱ_t]; hence ℋ² is a Hilbert space.

We have:

Theorem 8.2  The space ℋ² with the norm (8.2), or equivalently (8.3), is a Hilbert space.
The set of all continuous elements of ℋ² is a closed subspace of ℋ².

Proof. It only remains to prove the last part. So let M^n → M in ℋ² where the M^n are continuous. Then by Theorem 4.10,
  E[sup_t |M^n_t − M_t|²] ≤ 4E[(M^n_∞ − M_∞)²] = 4‖M^n − M‖²_{ℋ²} → 0 as n → ∞.
It follows that M is continuous.

Definition 8.2  Two local martingales M and N are called orthogonal if their product MN is a local martingale.

For M, N ∈ ℋ²_loc this is equivalent to ⟨M, N⟩ = 0 by uniqueness of ⟨M, N⟩. Thus if M, N ∈ ℋ² and either M_0 = 0 or N_0 = 0, it follows from (8.3) that orthogonality of M and N implies orthogonality in the Hilbert space ℋ².

Definition 8.3  A local martingale X is called purely discontinuous if X_0 = 0 and X is orthogonal to all continuous local martingales.

To get a feeling of what is involved, we have the following.

Theorem 8.3  Let M and N be local martingales. Then
(a) M is orthogonal to itself if and only if E[M²_0] < ∞ and M = M_0, i.e. M is constant in time.
(b) a purely discontinuous local martingale which is continuous is equal to zero.
(c) if M, N are orthogonal, then for all stopping times S and T, the stopped local martingales M^S and N^T are also orthogonal.

Proof. Part (b) follows from (a). To prove (a), note that if E[M²_0] < ∞ and M_t = M_0, then obviously M² is a local martingale. So assume that M² is a local martingale. By localization we may assume that M and M² are uniformly integrable, implying that M ∈ ℋ². Then E[M²_0] < ∞ and also E[M_∞] = E[M_0], E[M²_∞] = E[M²_0], hence by Theorem 4.10,
  E[sup_t (M_t − M_0)²] ≤ 4E[(M_∞ − M_0)²] = 0,
since E[M_∞M_0] = E[M_0 E[M_∞ | ℱ_0]] = E[M²_0], so that E[(M_∞ − M_0)²] = E[M²_∞] − 2E[M_∞M_0] + E[M²_0] = 0.
We omit the proof of part (c).

Exercise 8.2  Prove the following.
(a) a local martingale M with M_0 = 0 is purely discontinuous if and only if it is orthogonal to all bounded continuous martingales N with N_0 = 0.
(b) a local martingale that belongs to 𝒱 is purely discontinuous.

In particular, if X_t = Σ_{i=1}^{N_t} S_i is a compound Poisson process with μ = E[S_i] and λ the intensity of N, then
  M_t = X_t − λμt
is a purely discontinuous martingale; hence if W is a Brownian motion, the process MW is a local martingale (actually a martingale).

The following result explains the terminology.

Theorem 8.4  Any local martingale M admits a unique decomposition
  M = M^c + M^d,
where M^c is a continuous local martingale and M^d is a purely discontinuous local martingale.

Definition 8.4  A semimartingale is a process X of the form
  X = M + A
with M ∈ ℳ_loc and A ∈ 𝒱. We denote by 𝒮 the space of all semimartingales.

Definition 8.5  A special semimartingale is a semimartingale so that A in the decomposition X = M + A can be chosen to be predictable. We denote by 𝒮_p the space of all special semimartingales.

Theorem 8.5  The decomposition of a special semimartingale is unique; hence by Theorem 8.4 a special semimartingale can be uniquely decomposed as
  X = M^c + M^d + A.
Note that we have M^d_0 = A_0 = 0.

Proof. Let X = M¹ + A¹ = M² + A² be two decompositions. Then A¹ − A² = M² − M¹ ∈ ℳ_loc ∩ 𝒱 is predictable, hence zero by Theorem 7.8.

Exercise 8.3  By Theorem 8.4 a semimartingale X can be decomposed as X = M^c + M^d + A. Prove that M^c is unique in this decomposition.

To decide whether a semimartingale is special, we have the following result.

Theorem 8.6  Let X be a semimartingale. There is equivalence between:
(a) X is a special semimartingale.
(b) there exists a decomposition X = M + A where A ∈ 𝒜_loc.
(c) all decompositions X = M + A satisfy A ∈ 𝒜_loc.
(d) the process Y_t = sup_{s≤t} |X_s| belongs to 𝒜⁺_loc.

Proof. We only prove the equivalence between (a) and (b). Assume (a). Then by Theorem 7.3 (b), A ∈ 𝒜_loc, so that (b) follows. Next assume (b). Then by Theorem 7.10 there is a predictable process A^p ∈ 𝒜_loc so that A − A^p ∈ ℳ_loc. But then
  X = (M + A − A^p) + A^p
is a decomposition as a special semimartingale.

Example 8.1  By the Doob-Meyer decomposition, every submartingale is a semimartingale. So let W be a Brownian motion. Then
  W_t = W_t + 0,
  W²_t = (W²_t − t) + t
are decompositions as special semimartingales.
Let N be a Poisson process with intensity λ. Then
  N_t = 0 + N_t
is a decomposition as a semimartingale with N ∈ 𝒱. However,
  N_t = (N_t − λt) + λt
is the unique decomposition as a special semimartingale.
37
9 Stochastic integrals.
Let X H
2
. We denote by L
2
p
(X) the set of all predictable processes H so that
|H|
2
L
2
p
(X)
= E[H
2
'X, X`

] < . We will dene the stochastic integral H X


when X H
2
and H L
2
p
(X), and also show that it has the following properties.
It is linear, i.e. for a and b real constants and H
1
, H
2
L
2
p
(X),
(aH
1
+bH
2
) X = aH
1
X +bH
2
X, (9.1)
H X H
2
, (9.2)
'H X, H X` = H
2
'X, X`, (9.3)
(H X)
0
= 0. (9.4)
Let c be the class of functions H so that either
H = Y 1
{0}
Y bounded T
0
measurable,
H = Y 1
(r,s]
Y bounded T
r
measurable.
In this case we get the obvious denition
H X
t
= 0 if H = Y 1
{0}
,
H X
t
= Y (X
st
X
rt
) if H = Y 1
(r,s]
.
This denition extends readily to functions H of the form
H = Y
0
1
{0}
+
n

i=1
Y
i
1
(t
i
,t
i+1
]
, (9.5)
where 0 t
1
< < t
n+1
and the Y
i
are T
t
i
measurable. We denote the set of such
functions by c

.
On ( R
+
, {) dene the nite measure
m(B) = E[1
B
'X, X`

] =

R
+
1
B
(, t)d'X, X`
t
dP()
(nite since m( R
+
) = E['X, X`

] < ). This just says that m(d, dt) =


d'X, X`
t
()dP(), hence for H L
2
p
(X),
|H|
2
L
2
(R
+
,P,m)
=

R
+
H
2
(, t)m(d, dt)
=

R
+
H
2
(, t)d'X, X`
t
()dP() = |H|
2
L
2
p
(X)
.
Therefore L
2
p
(X) is the Hilbert space L
2
(R
+
, {, m). But from the general theory
of Hilbert spaces, this implies that c

is dense in L
2
p
(X), so therefore we will dene
the stochastic integral H X for H c

and then extend it to general H L


2
p
(X).
For H c

of the form (9.5) we dene


H X =
n

i=1
Y
i
(X
t
i+1
X
t
i
). (9.6)
38
Then H X obviously satises (9.1) and (9.4).
Now the (Y
i
) are bounded, so for i < j
E[Y
i
(X
t
i+1
X
t
i
)Y
j
(X
t
j+1
X
t
j
)] = E[Y
i
Y
j
(X
t
i+1
X
t
i
)E[X
t
j+1
X
t
j
[T
t
j
]] = 0
and also
E[Y
2
i
X
t
i+1
X
t
i
] = E[Y
2
i
X
t
i
E[X
t
i+1
[T
t
i
]] = E[Y
2
i
X
2
t
i
].
Therefore
E[(H X)
2

] =
n

i=1
E[Y
2
i
(X
t
i+1
X
t
i
)
2
] =
n

i=1
E[Y
2
i
(X
2
t
i+1
X
2
t
i
)] < ,
so that (H X)

L
2
(, T

, P).
Exercise 9.1 Prove that Y
i
(X
t
i+1
X
t
i
) is a martingale.
Together this means that the stochastic integral (9.6) satises (9.2).
Some straightforward calculations show that
(H X)
2
H
2
'X, X` = 2

i<j
Y
i
Y
j
(X
t
i+1
X
t
i
)(X
t
j+1
X
t
j
)
+
n

i=1
Y
2
i
[((X
t
i+1
)
2
'X, X`
t
i+1
) ((X
t
i
)
2
'X, X`
t
i
)]
2
n

i=1
Y
2
i
X
t
i
(X
t
i+1
X
t
i
).
Some boring calculatations show that this is a martingale, and since H
2
'X, X` is
predictable by Theorem 7.2, the uniqueness of the decomposition in Theorem 8.1
shows that the stochastic integral (9.6) satises (9.3) as well.
Since (H X)
0
= 0, we get by (8.3) that the map H H X is an isometry from
the subspace c

of the Hilbert space L


2
( R
+
, {, m) into the Hilbert space H
2
.
This is because
|H|
2
L
2
(R
+
,P,m)
= E[H
2
'X, X`

] = E['H X, H X`

] = |H X|
2
H
2.
Since c

is dense in L
2
( R
+
, {, m), we may extend this denition to all of
L
2
(R
+
, {, m), i.e. HX is the unique element of H
2
so that |H
n
XHX|
H
2
0 as n for any sequence H
n
L
2
(R
+
, {, m) with |H
n
H|
L
2
(R
+
,P,m)

0 as n .
This general integral obviously satises (9.2) and (9.4) and it is straightforward
to show that it satises the linearity condition (9.1). It only remains to prove (9.3).
So let H
n
c

, H
n
H in L
2
p
(X) as n , i.e.
(H
n
)
2
'X, X`
t
H
2
'X, X`
t
in L
1
as n .
Also by the denition of the stochastic integral (replace H by H1
[0,t]
).
(H
n
X)
2
t
(H X)
2
t
in L
1
as n .
39
Therefore
M
n
t
= (H
n
X)
2
t
(H
n
)
2
'X, X`
t
(H X)
2
t
H
2
'X, X`
t
def
= M
t
in L
1
as n . To see that M is a martingale, let s < t and A T
s
. Then
E[1
A
M
t
] = lim
n
E[1
A
M
n
t
] = lim
n
E[1
A
M
n
s
] = E[1
A
M
s
].
Therefore
(H X)
2
H
2
'X, X`
is a martingale, but by denition
(H X)
2
'H X, H X`
is also a martingale. Also by Theorem 7.2, H
2
'X, X` is predictable, so (9.3) follows
by uniqueness in Theorem 8.1.
We will not prove any further properties and extensions, but the following result
can be proven.
Theorem 9.1 Let X H
2
loc
and let H L
2
p,loc
(X), i.e. E[H
2
'X, X`
Tn
] < for
T
n
stopping times. Then the map H H X can be uniquely extended (still
denoted H X), having the following properties.
(i) if a sequence (H
n
) of predictable processes converges pointwise to a limit H
and [H
n
[ < K for some K L
2
p,loc
(X), then
sup
st
[H
n
X
s
H X
s
[
P
0 as n t 0.
We say that H
n
X converges in probability to H X, uniformly on compacts.
(ii) H X H
2
loc
and H X H
2
if and only if H L
2
p
(X).
(iii) properties (9.1), (9.3) and (9.4) hold.
(iv) (H X) = HX. In particular H X is continuous if X is so.
(v) if H L
2
p,loc
(X) and K L
2
p,loc
(H X), then
K (H X) = (KH) X.
In particular for K = 1
[[0,T]]
, T any stopping time
K (H X)
t
= (H X)
Tt
= ((H1
[[0,T]]
) X)
t
.
(vi) for X, Y H
2
loc
and H L
2
p,loc
(X), K L
2
p,loc
(Y )
'H X, K Y ` = HK 'X, Y `.
In particular with K = 1 and H = 1
[[0,T]]
, T any stopping time,
'X
T
, Y ` = 'X, Y `
T
.
40
Now let X = M + A be a semimartingale, and assume that H L
2
p
(M) is
predictable and locally bounded. Then we dene the stochastic integral H X by
H X = H M +H A,
where H M is in the above sense, and H A is in the sense of Section 7.
This denition can also be extended as follows.
Theorem 9.2 Let X be a semimartingale. Then the map H H X dened above
can be uniquely extended to all locally bounded predictable processes, H and it has
the properties
(i) H X is a semimartingale, and if X is a local martingale, then so is H X.
(ii) property (i) of Theorem 9.1 is valid (but now with K locally bounded).
(iii) properties (9.1) and (9.4) are valid. (Note that (9.3) is not dened here).
(iv) property (iv) of Theorem 9.1 is valid.
(v) property (v) of Theorem 9.1 is valid (but with H and K locally bounded).
The following Riemann approximation is often useful.
Theorem 9.3 Let (t
n
i
) satisfy 0 = t
n
0
< t
n
1
< and let t
n
i
as i . Assume
also that sup
i
(t
n
i+1
t
n
i
) 0 as n . Let H be an adapted left continuous process,
and dene
H
n
=

iN
H
t
i
1
(t
n
i
,t
n
i+1
]
. (9.7)
Then for any t 0 and X a semimartingale
sup
st
[H
n
X
s
H X
s
[
P
0 as n .
Proof. H is locally bounded by Theorem 7.3, and H
n
H pointwise. Furthermore
K
t
= sup
st
[H
s
[ is locally bounded, and the result follows from Theorem 9.2(ii).
10 Quadratic variation and the Ito formula.
Denition 10.1 The quadratic covariation of the two semimartingales X and Y
(quadratic variation when X = Y ) is the process
[X, Y ] = XY X
0
Y
0
X

Y Y

X.
Note that we get the integration by parts formula
Y

X = XY X
0
Y
0
X

Y [X, Y ].
As for the predictable quadratic covariation, we have the decomposition
[X, Y ] =
1
4
([X +Y, X +Y ] [X Y, X Y ]),
41
as well as the properties
[X, Y ]
0
= 0 and [X X
0
, Y Y
0
] = [X, Y ].
The following result explains the word quadratic covariation.
Theorem 10.1 Let X and Y be semimartingales.
(a) let (t
n
i
) be as in Theorem 9.3. Dene
S
n
(X, Y )
t
=

iN
(X
t
n
i+1
t
X
t
n
i
t
)(Y
t
n
i+1
t
Y
t
n
i
t
).
Then S
n
(X, Y ) converges in probability uniformly on compacts to [X, Y ] as
n (see Theorem 9.1 for the terminology.)
(b) [X, Y ] 1 and [X, X] 1
+
.
(c) [X, Y ] = XY.
Proof. Using the polarization identities above as well as the facts that
S
n
(X, Y ) =
1
4
(S
n
(X +Y, X +Y ) S
n
(X Y, X Y ))
and
XY =
1
4
((X +Y )
2
(X Y )
2
),
it is sucient to prove the result for Y = X. Then using that (x y)
2
= x
2
y
2

2y(x y), we get


S
n
(X, X)
t
=

iN
(X
t
n
i+1
t
X
t
n
i
t
)
2
=

iN
X
2
t
n
i+1
t

iN
X
2
t
n
i
t
2

iN
(X
t
n
i
t
+X
t
n
i
t
)(X
t
n
i+1
t
X
t
n
i
t
)
= X
2
t
X
2
0
2X
n

X
t
2

iN
X
t
n
i
t
(X
t
n
i+1
t
X
t
n
i
t
),
where
X
n

iN
(X
t
n
i

)1
(t
n
i
,t
n
i+1
]
.
It follows from Theorem 9.3 that X
n

X
P
X

X uniformly on compacts, and also


since X is right continuous

iN
X
t
n
i
t
(X
t
n
i+1
t
X
t
n
i
t
)
P
0,
uniformly on compacts. This proves part (a).
To prove (b), let s < t. Then for n suciently large, S
n
(X, X)
t
S
n
(X, X)
s
since S
n
(X, X)
t
will contain more nonnegative summands. Hence the limit [X, X]
42
is nondecreasing. Also since [X, X]
0
= 0 and [X, X] is RCLL and adapted by
denition, it follows that [X, X] 1
+
. Consequently [X, Y ] 1.
Finally it follows from the denition of [X, X] together with Theorem 9.2(iv) that
[X, X] = (X
2
) 2X

X = (X

+X)
2
2X

X
= 2X

X + (X)
2
2X

X = (X)
2
which ends the proof of the theorem.
The following theorem is often useful.
Theorem 10.2 Let X o and Y 1. Then
(a) [X, Y ] = X Y and XY = Y

X +X Y.
(b) if Y is predictable, then [X, Y ] = Y X and XY = Y X +X

Y.
(c) if Y is predictable and X
loc
, then [X, Y ]
loc
.
(d) if X or Y is continuous, then [X, Y ] = 0.
Proof. We omit parts (a) and (b), noting that by Theorem 7.3 (b), Y is locally
bounded in (b). Then (c) follows from (b) and Theorem 9.2(i). Also (d) follows from
(a) if X is continuous, and from (b) if Y is continuous, since a continuous process
is predictable.
Exercise 10.1 Let X
loc
/. Prove that X .
Theorem 10.3 Let X, Y
loc
. Then
(a) XY X
0
Y
0
[X, Y ]
loc
.
(b) if X, Y H
2
loc
then [X, Y ] /
loc
and its compensator is 'X, Y `. If X, Y
H
2
, then XY [X, Y ] .
(c) X H
2
(resp. H
2
loc
) if and only if [X, X] /
+
(resp /
+
loc
) and E[X
2
0
] < .
(d) X = X
0
if and only if [X, X] = 0.
Proof. Since
XY X
0
Y
0
[X, Y ] = X

Y +Y

X,
part (a) follows from Theorem 9.2(i).
By polarization it is sucient to prove (b) when Y = X, in which case we prove
(c) simultaneously. By localization we may assume that X H
2
or [X, X] /. So
assume rst that X H
2
. Clearly E[X
2
0
] < . Also by (a),
X
2
[X, X]
loc
,
and by Theorem 8.1
X
2
'X, X` .
43
Therefore
M = [X, X] 'X, X`
loc
1 with M
0
= 0.
So by Theorem 7.4, M
loc
/
loc
and since 'X, X` /
+
we must have that
[X, X] /
+
loc
. Hence by Theorem 7.10, 'X, X` is the compensator of [X, X]. Now
let T
n
be a localizing sequence so that [X, X]
Tn
/
+
. Then we get by the
monotone convergence theorem,
E[[X, X]

] = lim
n
E[[X, X]
Tn
] = lim
n
E['X, X`
Tn
] = E['X, X`

] < .
Therefore [X, X] /
+
, so by Exercise 10.1, [X, X] 'X, X` . But X
2

'X, X` , so therefore
X
2
[X, X] = X
2
'X, X` +'X, X` [X, X] .
Assume on the other hand that [X, X] /
+
and that E[X
2
0
] < . Let T
n

be a localizing sequence so that (X
2
X
2
0
[X, X])
Tn
, see (a). Then by Fatous
lemma,
sup
t
E[X
2
t
] = sup
t
E[liminf
n
X
2
tTn
] sup
t
liminf
n
E[X
2
tTn
]
= sup
t
lim
n
(E[X
2
0
] +E[[X, X]
tTn
]) = E[X
2
0
] +E[[X, X]

] < .
Therefore X H
2
.
As for (d), it is obvious that X = X
0
implies that [X, X] = 0. Assume that
[X, X] = 0. Then by (a), X
2
X
2
0

loc
, and since X X
0

loc
, the result
follows from Theorem 8.3(a).
Theorem 10.4 Let X and Y be semimartingales, and let X
c
and Y
c
denote their
continuous martingale parts (see Exercise 8.3). Then
(a) [X, Y ]
t
= 'X
c
, Y
c
` +

st
XY.
(b) For any locally bounded predictable H and K.
[H X, K Y ] = HK [X, Y ].
In particular with H = 1
[[0,T]]
, T any stopping time, and K = 1,
[X
T
, Y ] = [X, Y ]
T
.
Proof. The proof of (a) is fairly complicated and is omitted. Part (b) follows from
(a), Theorem 9.1(vi) and Theorem 9.2(iv).
Theorem 10.5 Let X, Y
loc
. Then
(a) [X, X]
1/2
/
loc
.
(b) [X, Y ] = 0 when X is continuous and Y purely discontinuous.
44
(c) if X is continuous (resp. purely discontinuous) and H is locally bounded pre-
dictable, then H X is a continuous (resp. purely discontinuous) local martin-
gale.
Exercise 10.2 Prove Theorem 10.5 (b) and (c) (not (a)).
In applications it is often of great interest to know when a local martingale is a
martingale or even a uniformly integrable martingale. A criteria for deciding this is
given in Exercise 4.4, but it is usually hard to use directly. However, Exercise 4.4
together with the following theorem is often very useful.
Theorem 10.6 (Burkholder-Davis-Gundy inequality.) Let X
loc
with
X
0
= 0, and let T be a stopping time. Then for any p 1 there exist constants c
p
and C
p
(independent of X and T) so that,
c
p
E[[X, X]
p/2
T
] E[sup
tT
[X
t
[
p
] C
p
E[[X, X]
p/2
T
].
The importance of the next theorem can hardly be overstated.
Theorem 10.7 (Itos formula) Let X = (X
1
, , X
d
) be a d-dimensional semi-
martingale, and let f : R
d
R be twice continuously dierentiable. Then f(X) is
a semimartingale, and
f(X
t
) = f(X
0
) +
d

i=1
D
i
f(X

) X
i
t
+
1
2
d

i,j=1
D
ij
f(X

) 'X
i,c
, X
j,c
`
t
+

st

f(X
s
) f(X
s
)
d

i=1
D
i
f(X
s
)X
i
s

.
Here D
i
=

x
i
and D
ij
=

2
x
i
x
j
.
If d = 1 we get
f(X
t
) = f(X
0
) +f

(X

) X
t
+
1
2
f

(X

) 'X
c
, X
c
`
t
+

st
[f(X
s
) f(X
s
) f

(X
s
)X
s
].
Comment. The above form of Itos formula is the only correct form in the general
case. To see why this is so, let us consider the one-dimensional case. Since f

(X

)
is left continuous, hence locally bounded and predictable, it follows from Theorem
9.2 that f

(X

) X is a semimartingale. Also f

(X

) 'X
c
, X
c
` is a process of
bounded variation, hence a semimartingale. To prove that the sum in Itos formula
is a semimartingale, assume that [f

(x)[ K for all x, K a constant. Then by


Taylors formula
[f(X
s
) f(X
s
) f

(X
s
)X
s
[ K(X
s
)
2
.
But by Theorem 10.4 (a),

st
(X
s
)
2
< (since [X, X]
t
< ), and therefore the
last sum in Itos formula converges, so it is a semimartingale. We can then extend
45
this argument to general twice continuously dierentiable f by considering stopping
times of the form
T
n
= inft : [f

(X
t
)[ n
and then use the same argument.
It is tempting to merge the two expressions containing f

(X

) into one to get rid


of the jumps, i.e. to write
f(X
t
) = f(X
0
) +f

(X

)

X
t
+
1
2
f

(X

) 'X
c
, X
c
`
t
+

st
[f(X
s
) f(X
s
)],
(10.1)
where

X
t
= X
t

st
X
s
.
However (10.1) does not make sense in the general case, since

st
X
s
may not
converge. If however we know that

st
X
s
converges, then (10.1) is often more
easy to use in computations.
Example 10.1 Let
X
t
= x +pt +W
t

Nt

i=1
S
i
,
where W is a Brownian motion independent of the compound Poisson process

Nt
i=1
S
i
. Then

st
X
s
=

Nt
i=1
S
i
, and the form (10.1) can be used. Here
X
c
= W, hence 'X
c
, X
c
`
t
=
2
t. Also

X
t
= x +pt + W
t
, so with f(x) = e
ax
, we
get
e
aXt
= e
ax
+ap

t
0
e
aX
s
ds +a

t
0
e
aX
s
dW
s
+
1
2
a
2

t
0
e
aX
s
ds
+
Nt

i=1
(e
aX
T
i
e
aX
T
i

),
where (T
i
) are the times of jump of N. But for , the set s : X
s
() = X
s
()
has Lebesgue measure zero, hence

t
0
e
aX
s
ds =

t
0
e
aXs
ds and

t
0
e
aX
s
dW
s
=

t
0
e
aXs
dW
s
.
Also
e
aX
T
i
e
aX
T
i

= e
aX
T
i

(e
aX
T
i
1) = e
aX
T
i

(e
aS
i
1).
Therefore
e
aXt
= e
ax
+ (ap +
1
2
a
2

2
)

t
0
e
aXs
ds +a

t
0
e
aXs
dW
s
+
Nt

i=1
e
aX
T
i

(e
aS
i
1).
The proofs of the following theorems show the force of Itos formula.
Theorem 10.8 (Levys theorem.) Let X be a continuous d-dimensional local
martingale with X
0
= 0, and assume that 'X
i
, X
j
`
t
=
ij
t, i, j = 1, , d. Then
X is a d-dimensional Brownian motion (i.e. the components of X are independent
one-dimensional Brownian motions).
46
Proof. We restrict ourselves to the case d = 1, i.e. the one-dimensional case. Fix
u R and set
Y
t
= iuX
t
+
1
2
u
2
t.
Then Y
c
t
= iuX
t
and 'Y
c
, Y
c
`
t
= u
2
t. So letting Z
t
= e
Yt
, we get by Itos formula,
Z
t
= 1 +

t
0
Z
s
dY
s
+
1
2

t
0
Z
s
d'Y
c
, Y
c
`
s
= 1 +iu

t
0
Z
s
dX
s
+
1
2
u
2

t
0
Z
s
ds
1
2
u
2

t
0
Z
s
ds
= 1 +iu

t
0
Z
s
dX
s
.
Since X is a continuous local martingale, it follows by Theorem 9.1 that Z also is a
continuous local martingale. Moreover [Z
t
[ = e
1/2u
2
t
, hence by Exercise 4.4, Z is a
continuous martingale with Z
0
= 1. Therefore since Z = 0,
E

Z
t
Z
s

T
s

= E

Z
t
Z
s
Z
s
+ 1

T
s

= Z
1
s
E[Z
t
Z
s
[T
s
] + 1 = 1.
But since Z
t
Z
1
s
= expiu(X
t
X
s
) + 1/2u
2
(t s), this implies
E[e
iu(XtXs)
[T
s
] = e
1/2u
2
(ts)
.
This shows that X
t
X
s
is independent of T
s
and also that X
t
X
s
has characteristic
function exp1/2u
2
(t s). This proves the theorem.
We also have a similar characterization of Poisson processes.
Theorem 10.9 (Watanabes teorem.) Let X be a point process (i.e. X is RCLL
adapted, and X increases by 1 on times 0 < T
1
< T
2
< ). Assume that X
t
t
is a local martingale where > 0 is a number. Then X is a homogeneous Poisson
process with intensity .
Proof. Fix u R and set
Y
t
= iuX
t
t(e
iu
1).
In this case,

st
Y
s
= iuX
t
, so we can use formula (10.1). Referring to this we
get that Y
c
= 0 and

Y
t
= t(e
iu
1). Also we get

st
(e
Ys
e
Y
s
) =
Xt

i=1
(e
Y
T
i
e
Y
T
i

) = (e
iu
1)
Xt

i=1
e
Y
T
i

= (e
iu
1)

t
0
e
Y
s
dX
s
.
So by setting Z
t
= e
Yt
we have
Z
t
= 1 + (e
iu
1)

t
0
Z
s
d(X
s
s).
47
Since [Z
t
[ e
2t
, it follows that Z is a martingale. Therefore as in the proof of the
above theorem, we get
E[e
iu(XtXs)
[T
s
] = exp(t s)(e
iu
1).
But this shows that X
t
X
s
is independent of T
s
and also that X
t
X
s
has char-
acteristic function exp(t s)(e
iu
1). This proves the theorem.
Comment. In Theorem 10.9 it is assumed that X is a point process, in particular
that it is increasing, and in addition it is assumed that X
t
t is a martingale. By
Theorem 7.9 this amounts to saying that the compensator of X is t. Also by that
theorem, instead of assuming that X
t
t is a martingale, we could have assumed
that
E


0
H
t
dX
t

= E


0
H
t
dt

for all nonnegative predictable H.


11 Linear stochastic dierential equations.
The following theorem has found a wide number of applications.
Theorem 11.1 (Exponential formula.) Let X be a semimartingale with X
0
= 0.
Then there exists a unique semimartingale Z that satises the stochastic dierential
equation
Z
t
= 1 +

t
0
Z
s
dX
s
.
This solution is
Z
t
= expX
t

1
2
'X
c
, X
c
`
t

st
(1 +X
s
)e
Xs
, (11.1)
and we write
Z = c(X).
If

st
X
s
converges, we may write the solution as
Z
t
= exp

X
t

1
2
'X
c
, X
c
`
t

st
(1 +X
s
), (11.2)
where

X
t
= X
t

st
X
s
.
Proof. We start by proving that (11.1) is a semimartingale. The term expX
t

1
2
'X
c
, X
c
`
t
is obviously a semimartingale, and since the product of two semimartin-
gales again is a semimartingale, we must prove that

st
(1 +X
s
)e
Xs
=

st
(1 +X
s
1
{|Xs|<1/2}
)e
Xs1
{|Xs|<1/2}

st
(1 +X
s
1
{|Xs|1/2}
)e
Xs1
{|Xs|1/2}
48
is a semimartingale. But X is RCLL, hence for each , there is only a nite
number of s t so that [X
s
()[ 1/2, therefore the last product above consists
of a nite number of terms (which varies with ), hence it has nite variation.
Evidently it is RCLL and adapted, so therefore it is a semimartingale. Next let
Y
s
= X
s
1
{|Xs|<1/2}
. We want to prove that
A
t
=

st
(1 +Y
s
)e
Ys
has nite variation. It then follows directly that A is RCLL and adapted, and
consequently Z given by (11.1) is a semimartingale. But since [Y
s
[ < 1/2, we have
Var(log A)
t

st
[ log(1 +Y
s
) Y
s
[

st
Y
2
s

st
(X
s
)
2
< .
Therefore log(A) has nite variation, hence A has nite variation.
To prove that (11.1) actually is a solution, set
U
t
= X
t

1
2
'X
c
, X
c
`
t
,
V
t
=

st
(1 +X
s
)e
Xs
.
Then Z
t
= f(U
t
, V
t
) = e
Ut
V
t
. For f(u, v) = e
u
v we have (using the terminology of
Theorem 10.7)
D
1
f = f, D
2
f = e
u
, D
11
f = f, D
12
f = e
u
and D
22
f = 0.
Also 'U
c
, U
c
` = 'X
c
, X
c
` and since V has nite variation, V
c
= 0, hence 'U
c
, V
c
` =
'V
c
, V
c
` = 0. Furthermore we have that

t
0
e
U
s
dV
s
=

st
e
U
s
V
s
, that U
s
=
X
s
and that V
s
= V
s
(1 +X
s
)e
Xs
. So
Z
s
Z
s
Z
s
U
s
= e
U
s
+Us
V
s
(1 +X
s
)e
Xs
Z
s
Z
s
X
s
= Z
s
(e
Xs
(1 +X
s
)e
Xs
1 X
s
) = 0.
All this combined with Itos formula now gives,
Z
t
= 1 +

t
0
Z
s
dU
s
+

t
0
e
U
s
dV
s
+
1
2

t
0
Z
s
d'X
c
, X
c
`
s
+

st
(Z
s
Z
s
Z
s
U
s
e
U
s
V
s
)
= 1 +

t
0
Z
s
dX
s
.
We omit the proof that the solution is unique.
Comment. If we rather want to solve
Z
t
= Z
0
+

t
0
Z
s
dX
s
, (11.3)
49
the solution is given as
Z
t
= Z
0
c(X)
t
. (11.4)
This is seen as follows. If Z
0
() = 0 then Z
t
() = 0 obviously satises both (11.3)
and (11.4). On Z
0
() = 0, dene Z

t
= Z
t
/Z
0
, so that Z

t
satises
Z

t
= 1 +

t
0
Z

s
dX
s
,
hence Z

t
= c(X)
t
so that Z
t
= Z
0
c(X)
t
.
Example 11.1 Let
X
t
= ct +W
t
+
Nt

i=1
S
i
,
where N is a counting process. Let Z solve
Z
t
= Z
0
+

t
0
Z
s
dX
s
.
Using (11.2) we see that

X
t
= ct +W
t
, 'X
c
, X
c
`
t
=
2
t, so therefore
Z
t
= Z
0
e
(c1/2
2
)t+Wt
Nt

i=1
(1 +S
i
).
The exponential c(X) has the following quasi-multiplicative property.
Theorem 11.2 Let X and Y be two semimartingales with X
0
= Y
0
= 0. Then
c(X)c(Y ) = c(X +Y + [X, Y ]).
Comment. Intuitively one would prefer c(X)c(Y ) = c(X + Y ) as with ordinary
exponentials. This is so if and only if [X, Y ] = 'X
c
, Y
c
` +

XY = 0. This is
the case if e.g. X or Y is continuous with nite variation on nite intervals.
Proof. Let U
t
= c(X)
t
and V
t
= c(Y )
t
so that U
t
= 1+U

X
t
and V
t
= 1+V

Y
t
.
Then by the integration by parts formula (Denition 10.1), Theorem 9.2(v) and
Theorem 10.4(b).
U
t
V
t
= 1 +U

V
t
+V

U
t
+ [U, V ]
t
= 1 + (UV )

X
t
+ (UV )

Y
t
+ (UV )

[X, Y ]
t
= 1 +

t
0
(UV )
s
d(X +Y + [X, Y ])
s
= c(X +Y + [X, Y ])
t
.
Theorem 11.1 can be generalized to linear stochastic dierential equations.
50
Theorem 11.3 Let V and X be semimartingales, and let T = inft : X
t
= 1.
Consider the stochastic dierential equation
Z
t
= V
t
+

t
0
Z
s
dX
s
.
If T = , the unique solution is given by
c
V
(X)
t
= Z
t
= c(X)
t

V
0
+

t
0
c(X)
1
s
dV
s

t
0
c(X)
1
s
d[V, X]
s
)

= c(X)
t

V
0
+

t
0
c(X)
1
s
d(V
s
[

V, X]
s
)

,
where
[

V, X]
t
=

t
0
1
1 +X
s
d[V, X]
s
= 'V
c
, X
c
` +

st
V
s
X
s
1 +X
s
.
Note that [

V, X]
t
= [

X, V ]
t
.
We omit the proof of the rst part. To prove the second part, note that by Theorem
11.1
c(X)
s
c(X)
s
=
1
1 +X
s
,
hence

t
0
c(X)
1
s
[V, X]
s
=

t
0
c(X)
1
s
d[

V, X]
s
.
The last expression for [

V, X]
t
follows from Theorem 10.4.
Comment. If T < in Theorem 11.3, then X
T
= 1, so c(X)
T
= 0 by (11.1),
and c(X)
1
T
is not dened. Therefore the above solution does not work in this case,
but it is still possible to obtain a solution by considering the successive stopping
times (T
n
) dened by T
0
= 0 and for n 0, T
n+1
= inft > T
n
: X
t
= 1, and
then dene the solution on the intervals [[T
n
, T
n+1
[[.
Theorem 11.4 Assume that (V, X) in Theorem 11.3 is a Levy process, i.e. it has
independent stationary increments (relative to the ltration F). Then Z is a strong
Markov process.
Proof. Let T be a nite stopping time and dene Q = V [

V, X]. Then
Z
T+t
= c(X)
T+t
c(X)
1
T

c(X)
T
(V
0
+

T
0
c(X)
1
s
dQ
s
) +

T+t
T
c(X)
1
s
c(X)
T
dQ
s

= c(X)
T+t
c(X)
1
T

Z
T
+

t
0
c(X)
1
T+s
c(X)
T
d

Q
s

,
where

Q
s
= Q
T+s
Q
T
. By Exercise 8.1 and Theorem 11.3,
[

V, X]
t
= ct +

st
V
s
X
s
1 +X
s
,
51
hence [

V, X] is also a Levy process, implying that Q is a Levy process. By The-
orem 6.5,

Q is a Levy process independent of T
T
, and

Q has the same law as Q.
Furthermore by (11.1) and Exercise 8.1,
c(X)
T+t
c(X)
1
T
= expX
T+t
X
T

1
2
c
1
t

T<sT+t
(1 +X
s
)e
Xs
,
which by Theorem 6.5 is independent of T
T
and has the same law as c(X)
t
. Thus
Z
T+t
depends on T
T
only through Z
T
, and the theorem is proved.
Example 11.2 Consider the stochastic dierential equation
dZ
t
= (a +bZ
t
)dt +Z
t
dW
t
= adt +Z
t
d(bt +W
t
)
with Z
0
= z. Here W is a Brownian motion. For this equation
V
t
= z +at and X
t
= bt +W
t
.
Therefore c(X)
t
= e
(b1/2
2
)t+Wt
and [

V, X]
t
= 0, hence
Z
t
= e
(b1/2
2
)t+Wt

z +a

t
0
e
(b1/2
2
)sWs
ds

.
Exercise 11.1 Solve the equation
Z
t
= z +pt
Nt

i=1
S
i
+

t
0
Z
s
d(rs +W
s
),
where N is a point process and W is a Brownian motion.
12 Applications to risk theory I.
In this section we will dene a rather general model for the assets of an insurance
company, involving income and payments from the insurance business, return on
investment on assets and also ination. Then new assumptions will gradually be
introduced so that actual computations can be performed.
As before we work on a ltered probability space (, T, F, P) satisfying the usual
conditions.
The construction of the model will be through several steps.
Step 1. There is a surplus generating process P with P
0
= 0. P is assumed to be
a semimartingale. This process consists of premium income minus claims
payments in an ination and interest free economy. One typical example is
the classical risk process
P
t
= pt
Nt

i=1
S
i
, (12.1)
where

Nt
i=1
S
i
is a compound Poisson process.
52
Step 2. There is an ination generating process I with I
0
= 0. I is assumed to be a
semimartingale. So if

I
t
is the price level at time t,

I
t
is the solution of the
exponential equation

I
t
= 1 +

t
0

I
s
dI
s
= c(I)
t
.
where the notation is from Theorem 11.1. It will be assumed throughout that

I
t
> 0 t.
If e.g. the process I
t
= t, then

I
t
= e
t
, i.e. we have a xed rate of ination .
Step 3. The surplus generating process is subject to ination, so that the inated
surplus process

P is given by

P
t
=

t
0

I
s
dP
s
.
Step 4. There is a return on investment generating process R with R
0
= 0. R is
assumed to be a semimartingale.
Total assets of the insurance company at time t then follows the linear stochas-
tic dierential equation,

Y
t
= y +

P
t
+

t
0

Y
s
dR
s
.
Here y are initial assets. Constant return on investments at rate r corresponds
to R
t
= rt. We will write

R = c(R), i.e. the exponential of R.
Step 5. Total assets in real units at time t is then given as,
Y
t
=

I
1
t

Y
t
.
We will call Y the risk process. In addition to assuming that

I
t
> 0 t we will
also assume that

R
t
> 0 t. Letting T = inft : I
t
1 or R
t
1, this
assumption amounts to P(T = ) = 1.
With this assumption we can use Theorem 11.3 which gives
Y
t
=

I
1
t

R
t

y +

t
0

R
1
s
d(

P
s
[

P, R]
s
)

= U
1
t

y +

t
0
U
s
d(P
s
[

P, R]
s
)

(12.2)
where U
1
t
=

I
1
t

R
t
is the level of return in real units. The last equality in (12.2)
follows from the fact that d

P
s
=

I
s
dP
s
and d[

P, R]
s
=

I
s
d[

P, R]
s
(easy to prove).
Let us remind the reader that
[

P, R]
t
= 'P
c
, R
c
`
t
+

st
P
s
R
s
1 +R
s
,
53
so that in particular [

P, R] = [

R, P].
We will be interested in the probability of eventual ruin, i.e let
T
y
= inft : Y
t
< 0 = inft :

Y
t
< 0.
(equality because

I > 0). We use the subscript y to indicate dependence on initial
capital y. But U > 0, hence
T
y
= inf

t : y +

t
0
U
s
d(P
s
[

P, R]
s
) < 0

= inft : Z
t
< y (12.3)
where
Z
t
=

t
0
U
s
d(P
s
[

P, R]
s
). (12.4)
A main problem in risk theory is to nd P(T
y
< ). A more dicult problem
is to nd the distribution of T
y
. The distribution of a nonnegative random variable
(can be innite) is determined by its Laplace transform, and it is usually simpler to
nd E[e
Ty
] for > 0.
Remark 12.1 The process U
1
t
=

I
1
t

R
t
= c(R)
t
/c(I)
t
measures the level of return
in real units. If we instead had worked with rate of return in real units R I, we
would instead of Y have

Y
t
= y +P
t
+

t
0

Y
s
d(R
s
I
s
),
and by Theorem 11.3 the solution is

Y
t
= c(R I)
t

y +

t
0
c(R I)
1
s
d(P
s
[

P, RI]
s
)

,
which is dierent from Y given in (12.2). If we assume that P is independent of
(R, I) so that [

P, R] = [

P, R I] = 0, then

Y = Y if and only if
c(R I)c(I) = c(R).
But according to Theorem 11.2
c(RI)c(I) = c(R + [I, R I]),
so in this case

Y = Y if and only if [I, RI] = 0. We see that this is the case if e.g.
I
t
= t, i.e. constant rate of ination.
Before we make any further simplicications, let us take a closer look at the
process U = c(I)/c(R). By Theorem 11.1, keeping in mind that we have assumed
I
t
> 1 and R
t
> 1, we get by (11.1),
U
t
= exp

R
t
I
t

1
2
('R
c
, R
c
`
t
'I
c
, I
c
`
t
)
+

st
(log(1 +R
s
) R
s
(log(1 +I
s
) I
s
))

.
(12.5)
54
If

st
R
s
and

st
I
s
both converge, we can use (11.2) to get
U
t
= exp


R
t


I
t

1
2
('R
c
, R
c
`
t
'I
c
, I
c
`
t
) +

st
log

1 +R
s
1 +I
s

. (12.6)
The model (12.2) is too general to be of any use in analytical considerations,
but it may be useful for Monte-Carlo simulations. In these notes we will focus on
analytical methods, so let us introduce one further assumption.
A1: The 3-dimensional process (P, I, R) is a Levy process, i.e. it has independent
stationary increments.
From the proof of Theorem 11.4, [

P, R] is also a Levy process, so if we dene
Q = P [

P, R],
then (Q, I, R) is a 3-dimensional Levy process as well. We get from (12.2)
Y
t
= U
1
t
(y +Z
t
), (12.7)
where (12.4) now takes the form
Z
t
=

t
0
U
s
dQ
s
. (12.8)
By (12.5)
U
t
= e
Xt
, (12.9)
where X is a Levy process. This gives in particular
U
t+s
U
1
t
= e
(X
t+s
Xt)
e
Xs
= U
s
, (12.10)
where we by A B mean that A and B have the same distribution. Furthermore
since X has independent increments (relative to the ltration F), it follows from
Theorem 6.5 that U
t+s
U
1
t
is independent of T
t
.
Theorem 12.1 Let Z
t
be given by (12.8) and assume that Z

= lim
t
Z
t
exists,
is a.e. nite and is not equal to a constant. Let H be the distribution function of
Z

. Then H is continuous, and the probability of ultimate ruin, (see (12.3)) is given
as
P(T
y
< ) =
H(y)
E[H(Y
Ty
)[T
y
< ]
.
(Note the resemblence with Theorem 6.9).
Proof. Dene
V
t
= U
1
t


t
U
s
dQ
s
=

U
s
U
t

dQ
s
=

U
s
d

Q
s
Z

,
where

U
s
= U
t+s
U
1
t
U
s
by (12.10) and

Q
s
= Q
s+t
Q
t
Q
s
. Since both

U
and

Q are independent of T
t
, it follows that V
t
is independent of T
t
. In fact by
55
Theorem 6.5, V
T
is independent of T
T
for any nite stopping time T, and V
T
Z

.
Furthermore Z

= Z
T
+U
T
V
T
.
Now let p be the largest probability of any point mass of Z

. Then by assumption
p < 1. Assume that P(Z

= c
i
) = p, i = 1, , K, and let G
t
be the distribution
function of U
1
t
(c
1
Z
t
). Then since Z

= Z
t
+ U
t
V
t
and V
t
is independent of
U
1
t
(c
1
Z
t
),
p = P(Z

= c
1
) = P(V
t
= U
1
t
(c
1
Z
t
)) =

H(z)dG
t
(z),
which implies that

K
k=1
G
t
(c
k
) = 1 t. But Z
t
Z

as t (hence U
t
0
as t ), so
H(c
1
) = P(Z

= c
1
) P(limsup
n
Z
n
c
1
U
n
c
1
, , c
K
)
limsup
n
P(Z
n
c
1
U
n
c
1
, , c
K
) = 1,
a contradiction. Hence H is continuous. Here the rst inequality comes from the fact
that U
n
0 as n , the second inequality is standard, and the last equality from
the above conclusion that

K
k=1
G
n
(c
k
) = 1, i.e. P(U
1
n
(c
1
Z
n
) c
1
, , c
K
) =
1.
For notational simplicity we now replace T
y
by T. On T < we have,
y +Z

= y +Z
T
+U
T
V
T
= U
T
[U
1
T
(y +Z
T
) +V
T
)] = U
T
(Y
T
+V
T
).
Therefore by continuity of H,
H(y) = P(y +Z

< 0) = P(T < , y +Z

< 0)
= P(T < , V
T
< Y
T
) =

{T<}
P(V
T
< Y
T
[T
T
)dP
=

{T<}
H(Y
T
)dP = E[H(Y
T
)1
{T<}
]
= E[H(Y
T
)[T < ]P(T < ).
Here the second equality follows from that fact that y + Z

< 0 if and only if


there exists a t so that y + Z
t
< 0, which again by positivity of U is equivalent
to U
1
t
(y + Z
t
) = Y
t
< 0. This then implies that T < . The third equality is
because U
T
> 0, hence y + Z

< 0 Y
T
+ V
T
< . The fourth equality is just
the denition of conditional probability, and the fth comes from the fact that H is
continuous and that V
T
is independent of T
T
and has distribution function H, while
Y
T
is T
T
measurable. Finally the last equality is just the denition of conditional
expectation. This ends the proof.
Remark 12.2 If Z

exists, then by (12.8)


Z


0
U
t
dQ
t
=


0
e
X
t
dQ
t
,
where the latter expression is from (12.9). So considering X
t
as a discount factor
applicable at time t, Z

is just the present value of the cash ow Q discounted by


56
the random factor e
X

. This makes the task of nding the distribution function


H of Z

interesting also for other purposes. We will return to this subject later.
Going back to the ruin problem, we see from Theorem 12.1 that even if we
know H, we still have to compute E[H(Y
Ty
)[T
y
< ]. By the denition of
T
y
, on T
y
< we have Y
Ty
< 0, hence since H is a distribution function,
H(0) E[H(Y
Ty
)[T
y
< ] 1. Also if P is continuous (which implies that
Q is continuous), then as ruin occurs because P gets too small (remember U > 0),
it follows that in this case Y
Ty
= 0, hence E[H(Y
Ty
)[T
y
< ] = H(0). We therefore
have
Corollary 12.1 Under the same assumptions as in Theorem 12.1, we have that the
probability of ruin satises
H(y) P(T
y
< )
H(y)
H(0)
where the right hand inequality becomes an equality if P is continuous.
So when does Z

exist and nite? The following simple result can be improved,


but at the expense of a much more complicated proof.
Theorem 12.2 Assume that Q is square integrable and that E[U
2
t
] < 1 when t > 0.
Then Z

exists and is a.s. nite and


Z
t
Z

as t a.s. and in L
1
.
Also E[Z
2

] < .
Proof. By Theorem 6.8 there is a constant c
1
so that Q
t
= M
t
+ c
1
t where M is a
square integrable martingale. Then
Z
t
=

t
0
U
s
dM
s
+c
1

t
0
U
s
ds = N
t
+V
t
.
By Exercise 8.1, 'M, M`
t
= c
2
t for some constant c
2
, hence by Theorems 9.1 and
10.3
E[[N, N]
t
] = E['N, N`
t
] = E[U
2
'M, M`
t
] = c
2
E

t
0
U
2
s
ds

.
Now by assumption E[U
2
t
] < 1, but by (12.9), U
t
= e
Xt
so U
2
t
= e
2Xt
where X is
a Levy process. But by Theorem 6.8, E[e
2Xt
] = e
c
4
t
where c
4
> 0. Therefore
E[[N, N]

] = c
2


0
e
c
4
t
dt =
c
2
c
4
< .
Hence by Theorem 10.3, N H
2
so N
t
N

a.s. and in L
1
. Exercise 12.1 nishes
the proof.
Exercise 12.1 In Theorem 12.2, prove that
E


0
U
s
ds

< .
Comments.
57
(a) using the Burkholder-Davis-Gundy inequality, Theorem 10.6, it is not hard to
prove that under the assumptions of Theorem 12.2, Z
t
Z

a.s. and in L
2
.
(b) it can be proven, but the proof gets considerably more complicated, that the
conclusion of Theorem 12.2 still holds if we assume that Q is integrable, that
E[U
t
] < 1 when t > 0, and that

st
[X
s
[ < t > 0 where again
U
t
= e
Xt
.
In order to get any further, it seems necessary to add the following assumption.
A2: The surplus generating process P is independent of the ination and return
on investment generating process (I, R).
Since the processes P and (I, R) model dierent aspects of economic activity, this
assumption is quite reasonable.
We will from now on assume A2 (as well as A1). Then [

P, R] = 0, hence the
processes P and Q are equal. Furthermore by writing T = T
y
, on T < we
have that U
T
= 0 since ruin must come from the process P (remember U > 0).
Therefore by (12.2) and Theorem 9.2(iv).
Y
T
= U
1
T
(U

P)
T
= U
1
T
U
T
P
T
= P
T
.
In particular assume P is the classical risk process (12.1). Then Y
T
= S
N
T
so
that,
Y
T
= (Y
T
+Y
T
) = S
N
T
Y
T
.
But Y
T
< 0, hence S
N
T
> Y
T
0. So therefore
E[H(Y
T
)[T < ] = E[H(S
N
T
Y
T
)[S
N
T
> Y
T
, T < ].
In particular if the claims S are exponentially distributed, by the memoryless prop-
erty of the exponential distribution we obtain,
E[H(S
N
T
Y
T
)[S
N
T
> Y
T
, T < ] = E[H(S)].
If S has a decreasing failure rate (new worse than used), i.e. P(S > t + s[S > t) is
increasing in t for all nonnegative s, then
E[H(S
N
T
Y
T
)[S
N
T
> Y
T
, T < ] E[H(S)],
with the equality reversed if S has increasing failure rate (new better than used).
We have proved.
Theorem 12.3 In addition to the assumptions in Theorem 12.1, assume that P is
of the form (12.1) (pluss A2). Then if S has decreasing failure rate, the probability
of ruin satises
P(T
y
< )
H(y)
E[H(S)]
.
If S is exponentially distributed there is equality, and the inequality is reversed if S
has increasing failure rate.
58
Exercise 12.2 Let S be Pareto distributed, i.e. S has density
f
S
(s) = (1 +s)
(1+)
> 0, s > 0.
Does S have increasing or decreasing failure rate?
Let us now turn to the distribution function H of Z

, assuming Z

exists and
is a.s. nite. Since P has stationary independent increments, we get as in Theorem
6.8.
h
u
(t +s) = E[e
iuP
t+s
] = E[e
iuPt
]E[e
iu(P
t+s
Pt)
] = h
u
(t)h
u
(s).
Therefore
E[e
iuPt
] = e
(u)t
, (12.11)
for some continuous function . (Continuous since e
(u)
= E[e
iuP
1
] is a characteristic
function.) We then dene
(u) = E[e
iuZ
]. (12.12)
Theorem 12.4 Under the same assumptions as in Theorem 12.3, we have
(u) = E

exp


0
(uU
t
)dt

= E
u

exp


0
(U
t
)dt

where in the rst expectation U


t
= e
Xt
as usual (see (12.9)), while in the sec-
ond U
t
= ue
Xt
, i.e. U
0
= u. Here and are dened in (12.11) and (12.12)
respectively.
Proof. The equality of the two expressions is obvious. To prove the rst equality, let
( = U
t
: t 0. Note that the -algebras ( and P
t
: t 0 are independent.
Let t
n
i
= i2
n
t, i = 0, 1, 2, . Then (t
n
i
) satises the assumptions of Theorem 9.3,
so by that theorem
Z
(n)
t
=
2
n
1

i=0
U
t
n
i

(P
t
n
i+1
P
t
n
i
)
P

t
0
U
s
dP
s
= Z
t
as n .
Therefore
lim
n
E[e
iuZ
(n)
t
] = E[e
iuZt
].
And since Z
t
Z

as t by assumption,
lim
t
lim
n
E[e
iuZ
(n)
t
] = E[e
iuZ
] = (u).
Now since U is ( measurable and P has independent increments and is independent
of (, it follows that
E[e
iuZ
(n)
t
] = E

exp

iu
2
n
1

i=0
U
t
n
i

(P
t
n
i+1
P
t
n
i
)

= E

2
n
1

i=0
exp

iuU
t
n
i

(P
t
n
i+1
P
t
n
i
)

59
= E

2
n
1

i=0
E

exp

iuU
t
n
i

(P
t
n
i+1
P
t
n
i
)

= E

2
n
1

i=0
exp

(uU
t
n
i

)(t
n
i+1
t
n
i
)

= E

exp

2
n
1

i=0
(uU
t
n
i

)(t
n
i+1
t
n
i
)

.
Since [E[e
iuPt
][ 1, it follows that Re((u)) 0. Therefore by the dominated
convergence theorem and continuity of ,
lim
n
E[e
iuZ
(n)
t
] = E

exp

t
0
(uU
t
)dt

.
Letting t , dominated convergence theorem gives the desired result.
In order to get further, we will now specify a particular model for (I, R). We set
I
t
= t +
I
W
I,t
R
t
= rt +
R
W
R,t
.
(12.13)
Here W
I
and W
R
are (correlated) Brownian motions with E[W
I,t
W
R,t
] =
I,R
t where

I,R
< 1. Then by (12.5) and (12.9)
X
t
= rt +
R
W
R,t
t
I
W
I,t

1
2
(
2
R
t
2
I
t).
Dene
W
t
=
1

2
R
2
I,R

R
+
2
I
(
R
W
R,t

I
W
I,t
).
Then W has independent stationary normally distributed increments with E[W
t
] = 0
and E[W
2
t
] = t, hence W is a Brownian motion. Therefore if we set,
r = r
1
2

2
R
(
1
2

2
I
)

2
=
2
R
2
I,R

R
+
2
I
,
(12.14)
we get
X
t
= rt +W
t
.
Therefore in our model (12.13) is equivalent to I = 0 and R
t
= rt + W
t
where r
and are given by (12.14). Furthermore (with U
0
= 1)
E[U
2
t
] = E[e
2Xt
] = e
2rt
E[e
2Wt
] = e
2rt
e
2
2
t
= e
2(r
2
)t
.
So in order to have E[U
2
t
] < 1 we must have
r
2
= r
3
2

2
R

1
2

2
I
+ 2
I,R

R
> 0 (12.15)
We then have the following theorem.
60
Theorem 12.5 Let (I, R) be given by (12.13) and assume that (12.15) holds. As-
sume also that P is square integrable. Then (u) = E[e
iuZ
] is the solution of the
dierential equation
1
2

2
u
2

(u) (r
1
2

2
)u

(u) +(u)(u) = 0, (12.16)


where r and
2
are given by (12.14) and by (12.11). Side conditions are
(0) = 1 and [(u)[ 1, u R.
Comment. The condition [(u)[ 1 may seem a bit imprecise, but it is a strong
condition requiring that we are searching for a bounded solution of (12.16). If
however we know that Z

has a density function h, the condition can be replaced


by the more precise (and numerically much better) condition (u) 0 as u
and u . This follows directly from the Riemann-Lebesgue lemma, Feller
(1971) Lemma 3 p.513.
Proof. The side conditions follow from (u) = E[e
iuZ
]. By Theorem 12.2 and the
above, E[Z
2

] < . Hence (u) is twice continuously dierentiable with

(u) =
u
2
E[Z
2

e
iuZ
].
Dene the uniformly integrable martingale M (see Theorem 4.4) by,
M
t
= E
u

exp


0
(U
s
)ds

T
t

= e

t
0
(Us)ds
E
u

exp


t
(U
s
)ds

T
t

= e

t
0
(Us)ds
E
Ut

exp


0
(U
s
)ds

= e

t
0
(Us)ds
(U
t
) = V
t
(U
t
).
Here the second equality is because

t
0
(U
s
)ds is T
t
measurable, the third equality
comes from (6.6) and the fourth from Theorem 12.4. Note that from Theorem 12.4,
M
0
= (u).
Since the process V above is continuous with nite variation, it follows from
Theorem 10.2(d) that [V, (U)] = 0. Therefore by the integration by parts formula
(Denition 10.1)
M
t
= (u) +

t
0
V
s
d(U
s
) +

t
0
(U
s
)dV
s
. (12.17)
(In the rst integral we can write V
s
instead of V
s
because (U
s
) is continuous.
Similar with the second integral). Now
dV
s
= V
s
(U
s
)ds.
Also by the exponential formula, Theorem 11.1, we see that U is the solution of
dU
s
= U
s
d

X
s
, U
0
= 1,
61
where

X
s
= X
s
+
1
2

2
s = (r
1
2

2
)s W
s
. Therefore U
c
= U W, hence by
Theorem 9.1(vi), 'U
c
, U
c
`
t
=
2

t
0
U
2
s
ds. So by Itos formula
d(U
s
) =

(U
s
)U
s
d

X
s
+
1
2

(U
s
)U
2
s
ds.
Therefore (12.17) becomes
M
t
= (u) +

t
0
V
s

1
2

2
U
2
s

(U
s
) (r
1
2

2
)U
s

(U
s
) +(U
s
)(U
s
)

ds

t
0
V
s
U
s

(U
s
)dW
s
.
But M is a martingale and the last integral is a local martingale, hence the rst
integral is a continuous local martingale of nite variation. But then it is identically
zero by Theorem 7.8, and since V is never zero, we get
1
2

2
U
2
s

(U
s
) (r
1
2

2
)U
s

(U
s
) +(U
s
)(U
s
) = 0.
Now as in Theorem 10.3, U
0
= u and this equality applies for all u. Therefore
(12.16) must be true, and the theorem is proved.
Remark 12.3 When it is known that the Laplace transform

L
(u) = E[e
uZ
]
exists for all u 0, it is often more convenient to work with this instead of the
characteristic function. In particular if P is increasing, in which case Z

0,
L
(u)
will always exist for nonnegative u, and also

L
(u) = E[Z
2

e
uZ
] < for u > 0,
thus avoiding the extra condition (12.15). Note that P nondecreasing implies that
ruin never occurs, but as mentioned in Remark 12.2, Z

is a present value of an
innite cash ow, and therefore it is of interest by itself to nd this distribution. To
nd an expression for
L
(u), we can replace (u) in (12.11) by
L
(u) dened by
E[e
uPt
] = e

L
(u)t
. (12.18)
Then replacing by
L
in the above calculations, we get the dierential equation
1
2

2
u
2

L
(u) (r
1
2

2
)u

L
(u)
L
(u)
L
(u) = 0. (12.19)
Assuming that Z

0, we get the additional boundary conditions

L
(0) = 1 and lim
u

L
(u) = 0. (12.20)
Let us now assume that P has the form
P
t
= pt +
P
W
P,t

Nt

i=1
S
i
, (12.21)
62
where

Nt
i=1
S
i
is a compound Poisson process independent of the Brownian motion
W
P
. We denote the distribution function of the claims S
i
by F and the intensity of
the Poisson process N by .
We then get
E[e
iuPt
] = e
iupt
E[e
iu
P
W
P,t
]E[expiu
Nt

i=1
S
i
].
Therefore some easy calculations show that (u) in (12.11) is given by,
(u) = iup
1
2
u
2

2
P
(1 (u)),
where
(u) = E[e
iuS
] =

R
e
ius
dF(s).
As is well known, there exists an inversion formula for the characteristic function
giving the distribution function. Indeed, by Feller (1971) Formula (3.11) p.511, this
formula is given by
H(z) =
1
2
lim
a

R
e
iua
e
iuz
iu
(u)du
whenever the integral exists. Using this relation together with the integration by
parts formula, it is possible to prove.
Theorem 12.6 In addition to the assumptions of Theorem 12.5, assume that P is
given by (12.21). Assume also
(a) if
2
> 0 or
2
P
> 0 then

R
[u(u)[du < .
Otherwise it is sucient that

R
[(u)[du < .
(b)

R
[

(u)[du < .
Then the distribution function H of Z

is twice continuously dierentiable and is


the solution of
1
2
(
2
z
2
+
2
P
)H

(z) + ((r +
1
2

2
)z p)H

(z) +

R
(H(z +s) H(z))dF(s) = 0,
with boundary conditions,
lim
z
H(z) = 0 and lim
z
H(z) = 1.
63
The assumptions of Theorem 12.6 are not easy to verify in general unless is known,
but the following result may be helpful.
Theorem 12.7 The assumptions (a) and (b) of Theorem 12.6 are satised if either
of the following conditions hold:
(a)
2
P
> 0.
(b) > 2(r +
2
) and there exist positive constants K, c and so that when
[u[ K then [Re((u))[ c[u[

.
We will not prove Theorems 12.6 or 12.7 here, but we see that Theorem 12.7 give
sucient conditions for [(u)[ 0 when u or u , see the comment
after Theorem 12.5.
Example 12.1 Assume = 0 so that
Z


0
e
rt+Wt
d(pt +
P
W
P,t
),
where r >
2
. Then by Theorems 12.6 and 12.7, the density h of Z

is the solution
of
1
2
(
2
z
2
+
2
P
)h

(z) + ((r +
1
2

2
)z p)h(z) = 0,
i.e.
h

(z)
h(z)
= (r +
1
2

2
)
2z

2
z
2
+
2
P
+
2p

2
z
2
+
2
P
,
or
d
dz
log h(z) = (
1
2
+
r

2
)
d
dz
log(
2
z
2
+
2
P
) +
2p

P
d
dz
arctan(

P
z).
This nally gives
h(z) =
h
0
(
2
z
2
+
2
P
)
1/2+r/
2
exp

2p

P
arctan

P
z

,
where h
0
is a constant so that

R
h(z)dz = 1. A closed form expression for h
0
does
not seem to exist.
We see from the expression for h(z) that Z

is distributed on the whole real


axis, and also that Z

has only a nite number of moments. Therefore the Laplace


transform or moment generating function does not exist.
Now by Corollary 12.1 (se also (12.13) and (12.14)),
P(T
y
< ) =
H(y)
H(0)
=

(
2
z
2
+
2
P
)
(1/2+r/
2
)
exp

2p

P
arctan

P
z

dz

(
2
z
2
+
2
P
)
(1/2+r/
2
)
exp

2p

P
arctan

P
z

dz
.
Substituting v = arctan(

P
z) we get
P(T
y
< ) =
G(arctan(

P
y))
G(0)
,
64
where
G(x) =

2
cos

v e
v
dv,
with =
2r

2
1 and =
2p

P
.
Unfortunately when p, and
2
are all positive, it is very hard to get closed form
solutions for either or H. However, when p = 0 the following example is a very
nice result.
Example 12.2 In this example we are going to nd the distribution of
Z


0
e
rt+Wt
d(
Nt

i=1
S
i
) =

n=1
e
rTn+W
Tn
S
n
(12.22)
where

Nt
i=1
S
i
is a compound Poisson process and (T
n
) are the times of jumps of N.
As usual W is an independent Brownian motion, and we assume that the S
i
are
exponentially distributed with expectation
1
.
Since Z

is nonnegative, we use the Laplace transform discussed in Remark 12.3.


Referring to (12.18), we see that
E[e
uPt
] = e
(1
L
(u))t
where
L
(u) = E[e
uS
] = /( +u), hence 1
L
(u) = u/( +u).
By (12.19) we then get
1
2

2
u
2

L
(u) (r
1
2

2
)u

L
(u)
u
+u

L
(u) = 0.
A change of variable g(v) = (u) where v = u/, brings this equation into the
hypergeometric form
v(1 v)g

(v) + ( (1 + +)v)g

(v) g(v) = 0,
where , and are determined by
1 + + = 1
2r

2
, =
2

2
and = 1
2r

2
.
Some easy calculations show that
= b, = (a +b) and = 1 a,
where
a =
2r

2
and b =
r

1 +
2
2
r
2
1

. (12.23)
By (12.20), boundary conditions are g(0) = 1 and g() = 0.
Using standard results from the theory of hypergeometric dierential equations,
it can be proven that the solution is

L
(u) = c

1
0
y
a+b1
(1 y)
b
(y +u)
b
dy, (12.24)
65
where c can be determined from
L
(0) = 1. However, it is well known that the
Laplace transform of a (b, y) distributed random variable Z is
E[e
uZ
] =


0
(y)
b
(b)
z
b1
e
yz
e
uz
dz =

y
y +u

b
.
Therefore
(y +u)
b
=


0
1
(b)
z
b1
e
yz
e
uz
dz.
Inserting this expression in (12.24) and using Fubinis theorem, this gives

L
(u) = C

z
b1

1
0
y
a+b1
(1 y)
b
e
yz
dy

e
uz
dz
with C = c/(b). But then the uniqueness of the Laplace transform gives
h(z) = Cz
b1

1
0
y
a+b1
(1 y)
b
e
yz
dy.
Now let X (, ) be independent of Y B(, ) for general nonnegative , ,
and , i.e.
f
X
(x) =

()
x
1
e
x
, x 0,
f
Y
(y) =
( +)
()()
y
1
(1 y)
1
, 0 y 1.
Then letting Z = X/Y , we nd by conditioning on Y
P(Z z) =

1
0
P(X zy)f
Y
(y)dy.
So that
h(z) =
d
dz
P(Z z) =

1
0
yf
X
(zy)f
Y
(y)dy
=
( +)
()()

()
z
1

1
0
y
+1
(1 y)
1
e
zy
dy.
By identifying parameters, we then get the folowing result.
Theorem 12.8 Let Z

be given by (12.22) where r > 0. Then Z

exists and is
a.e. nite, and furthermore Z

has the same distribution as

=
X
Y
,
where X (b, ) is independent of Y B(a, 1 + b). Here a and b are dened in
(12.23).
66
13 Random measures.
Random measures are very useful to keep track of the jumps of an RCLL adapted
process X. A random measure is dened as follows.
Denition 13.1 A random measure on R
+
R
d
is a family = ((; dt, dx) :
) of nonnegative measures on (R
+
R
d
, B(R
+
)B(R
d
)) satisfying (; 0
R
d
) = 0.
Denition 13.2 We dene the space

= R
+
R
d
,
and the -algebras

O = O B(R
d
) and

{ = { B(R
d
).
A function W = W(, t, x) that is

O measurable (resp.

{ measurable) is called an
optional (resp. predictable) function.
For an optional function W we dene the integral process
W
t
() =

[0,t]R
d
W(, s, x)(; ds, dx) if

[0,t]R
d
[W(, s, x)[(; ds, dx) < ,
otherwise.
Denition 13.3 The random measure is said to be optional (resp. predictable)
if for any optional (resp. predictable) function W, the process W
t
is optional
(resp. predictable).
Denition 13.4 An integer valued random measure is a random measure that sat-
ises
(i) (; t R
d
) 1.
(ii) for each A B(R
+
) B(R
d
), ( ; A) takes it values in N .
(iii) is optional.
The following result can be proven.
Theorem 13.1 Let X be an adapted RCLL process. Then

X
(; dt, dx) =

s
1
{Xs()=0}

(s,Xs())
(dt, dx)
denes an integer valued random measure on R
+
R
d
. Here
(s,y)
(dt, dx) = 1 if
s dt and y dx and is zero otherwise.
We say that
X
is the random measure associated with X.
67
Example 13.1 Let W(, t, x) = [x[
2
. Then
W
X
t
() =

[0,t]R
d
[x[
2

X
(; ds, dx) =

st
[X
s
()[
2
.
Obviously W is a predictable function.
Example 13.2 Let W(, t, x) = f(X
t
()+x)f(X
t
()) where f is a measurable
function. Then since X
t
is predictable (left continuous), it follows that X
t
() +x
is

{ measurable, hence W is a predictable function. Moreover
W
X
t
() =

[0,t]R
d
(f(X
s
+x) f(X
s
))
X
(; ds, dx)
=

st
(f(X
s
+X
s
) f(X
s
)) =

st
(f(X
s
) f(X
s
)),
provided

st
[f(X
s
) f(X
s
)[ < .
Let be an optional random measure and let A B(R
d
). Dene
K(, t; A) =

[0,t]R
d
1
A
(x)(; ds, dx) = (; [0, t] A).
Then for any (, t), K(, t; ) is a measure on R
d
. Furthermore for any A
B(R
d
), the function (, t) K(, t; A) is (R
+
, O) measurable (and therefore by
Theorem 3.2, it is (R
+
, T B(R
+
)) measurable). Similarly if is a predictable
random measure, then (, t) K(, t; A) is a (R
+
, {) measurable function for
all A B(R
d
).
Denition 13.5 Let (G, () and (E, c) be two measure spaces. We say that the
map K(g; de) is a transition kernel from (G, () into (E, c) if g G, K(g; ) is a
measure on (E, c), and A c, K( ; A) is a (G, () measurable function.
We saw above that if is a predictable random measure, then the kernel K(, t; dx) =
(; [0, t] dx) is a transition kernel from ( R
+
, {) into (R
d
, B(R
d
)).
Note the similarity of a transition kernel and the transition function of Denition
6.2. However, for given g G, K(g; ) is only assumed to be a measure on (E, c),
not a probability measure.
The following result is a generalization of Theorem 7.10
Theorem 13.2 Let the the random measure associated with the adapted RCLL
process X. Then there exists a unique predictable random measure
p
called the
compensator of , satisfying either of the two equivalent properties.
(i) E[W
p

] = E[W

] for every nonnegative predictable function W.


(ii) for every predictable function W so that [W[ /
+
loc
, then [W[
p
/
+
loc
and W
p
is the compensator of the process W (or equivalently WW
p
is a local martingale).
68
Moreover, there exists a predictable process A /
+
and a transition kernel K(, t; dx)
from ( R
+
, {) into (R
d
, B(R
d
)) so that

p
(; dt, dx) = dA
t
()K(, t; dx)
(this decomposition is not unique).
Example 13.3 Let X =

Nt
i=1
S
i
be a compound Poisson process with associated
random measure . Then of course

[0,t]R
d
x(; ds, dx) =

st
X
s
= X
t
.
We will prove that

p
(; dt, dx) = dtdF(x) (13.1)
(independent of ), where is the intensity of N and F the distribution function of
the jumps (S
i
). By Theorem 5.1, { is generated by sets of the form A0 where
A T
0
and A(u, v] where u < v and A T
u
. Now by denition (; 0R
d
) =
0, so we can forget the sets A 0. Using a monotone class argument, it is
enough to prove that condition (i) of Theorem 13.2 is satised for W of the form
W(, t, x) = 1
A
()1
(u,v]
(t)1
B
(x) with A T
u
and B B(R
d
). But then since

Nv
i=Nu+1
S
i
is independent of T
u
and 1
A
() is T
u
measurable, we get
E[W

] = E

1
A
()
Nv

i=Nu+1
1
{S
i
B}

= E

1
A
()
Nv

i=Nu+1
1
{S
i
B}

T
u

= E[1
A
()]E

Nv

i=Nu+1
1
{S
i
B}

= E

1
A
()(v u)

B
dF(x)

= E[W
p

].
This proves (13.1).
14 Applications to risk theory II.
In this section we will consider a more direct approach to the probability of
eventual ruin than in Section 12. Referring to the setup in that section, our model
assumptions here will be
P
t
= pt +
P
W
P,t

Nt

i=1
S
i
,
I
t
= t,
R
t
= rt +W
t
,
69
i.e. P is the same as in (12.21), rate of ination is constant and R is the same as in
(12.13). Although not necessary, we assume that P and R are independent.
By Remark 12.1, Y is given as the solution of the linear equation
Y
t
= y +P
t
+

t
0
Y
s
d(R
s
I
s
).
But R
s
I
s
= ( r )s +W
s
= rs +W
s
where r = r . We therefore get exactly
the same model by letting I = 0 and R
t
= rt + W
t
, and since this is notationally
simpler, assume from now on the following model
Y
t
= y +P
t
+

t
0
Y
s
dR
s
, (14.1)
with
P
t
= pt +
P
W
P,t

Nt

i=1
S
i
,
R
t
= rt +W
t
.
(14.2)
In addition we will assume that r > 0, i.e. that r > .
As in (12.2), the solution of (14.1) is
Y
t
=

R
t
(y +

t
0

R
1
s
dP
s
),
where

R
t
= e
(r
1
2

2
)t+Wt
.
By Theorem 11.4, Y is a strong Markov process.
We are going to develop two integro-dierential equations useful in the analysis
of the ruin problem. These equations will depend on the initial capital y, so let as
before
T
y
= inft : Y
t
< 0,
with T
y
= if ruin never occurs. Also dene the operator A acting on twice
continuously dierentiable functions as follows
Ag(y) =
1
2
(
2
y
2
+
2
P
)g

(y) + (ry +p)g

(y) +


0
(g(y x) g(y))dF(x),
where again is the intensity of N and F is the distribution function of the claims
(S
i
). We then get.
Theorem 14.1 With the above notation and assumptions, we have:

(i) Assume $R(y)$ is a bounded and twice continuously differentiable function on $y \ge 0$ with a bounded first derivative there, where we at $y = 0$ mean the right hand derivative. If $R(y)$ solves
$$AR(y) = 0 \quad \text{on } y > 0,$$
together with the boundary conditions
$$R(y) = 1 \ \text{on } y < 0, \qquad R(0) = 1 \ \text{if } \sigma_P^2 > 0, \qquad \lim_{y \to \infty} R(y) = 0,$$
then
$$R(y) = P(T_y < \infty).$$

(ii) Assume $q_\delta(y)$ is a bounded and twice continuously differentiable function on $y \ge 0$ with a bounded first derivative there, where we at $y = 0$ mean the right hand derivative. If $q_\delta(y)$ solves
$$Aq_\delta(y) = \delta q_\delta(y) \quad \text{on } y > 0,$$
together with the boundary conditions
$$q_\delta(y) = 1 \ \text{on } y < 0, \qquad q_\delta(0) = 1 \ \text{if } \sigma_P^2 > 0, \qquad \lim_{y \to \infty} q_\delta(y) = 0,$$
then
$$q_\delta(y) = E[e^{-\delta T_y}].$$
Proof. We prove part (i). Obviously $P(T_y < \infty) = 1$ if $y < 0$. Furthermore, it can be proven that a Brownian motion starting at 0 will immediately assume a negative value, so if $\sigma_P^2 > 0$ and $y = 0$, then $P$ (and hence $Y$) will immediately assume a negative value, and ruin occurs. If $y \to \infty$, then since $r > 0$, interest on capital will offset any bad behaviour of the insurance business $P$, and therefore $P(T_y < \infty) \to 0$ as $y \to \infty$. Therefore the stipulated boundary conditions of $R(y)$ match those of $P(T_y < \infty)$.

Now $R$ is not necessarily twice continuously differentiable on $(-\infty, \infty)$. Therefore define $R_n$ by
$$R_n(y) = R(y) \quad \text{on } (-\infty, -\tfrac{1}{n}] \cup [0, \infty),$$
and let $R_n$ lie between an $\varepsilon$-neighbourhood of $R(0)$ and of 1 on $(-\tfrac{1}{n}, 0]$, so that $R_n$ is twice continuously differentiable on $(-\infty, \infty)$. (Such an $R_n$ exists.) Then, using the simplified form (10.1) of Ito's formula,
$$R_n(Y_t) = R_n(y) + \int_0^t R_n'(Y_{s-})\, d\tilde{Y}_s + \frac{1}{2} \int_0^t R_n''(Y_s)\, d\langle Y^c, Y^c \rangle_s + \sum_{s \le t} \big[ R_n(Y_s) - R_n(Y_{s-}) \big],$$
where $\tilde{Y}_t = y + pt + \sigma_P W_{P,t} + \int_0^t Y_{s-}\, d(rs + \sigma W_s)$. Using that $W_P$ and $W$ are independent, we get
$$\langle Y^c, Y^c \rangle_t = \sigma_P^2 t + \sigma^2 \int_0^t Y_s^2\, ds.$$
Let $\mu$ be the random measure associated with $-\Delta Y$ (see Theorem 13.1). Since $\Delta Y = \Delta P$, this equals the random measure associated with $-\Delta P$, which again equals the random measure associated with $\sum_{i=1}^{N_t} S_i$. Therefore, as in Example 13.2,
$$\sum_{s \le t} \big[ R_n(Y_s) - R_n(Y_{s-}) \big] = \int_{[0,t] \times R} \big( R_n(Y_{s-} - x) - R_n(Y_{s-}) \big)\, \mu(\omega; ds, dx).$$
By assumption $R_n$ is bounded, so letting $\mu^p$ be the compensator of $\mu$, it follows from Theorem 13.2 that
$$\tilde{M}_t^{(n)} = \int_{[0,t] \times R} \big( R_n(Y_{s-} - x) - R_n(Y_{s-}) \big)\, \big( \mu(\omega; ds, dx) - \mu^p(\omega; ds, dx) \big)$$
is a martingale. Moreover, from Example 13.3, $\mu^p(\omega; ds, dx) = \lambda\, ds\, dF(x)$, hence
$$\int_{[0,t] \times R} \big( R_n(Y_{s-} - x) - R_n(Y_{s-}) \big)\, \mu^p(\omega; ds, dx) = \int_0^t \lambda \Big( \int_R \big( R_n(Y_s - x) - R_n(Y_s) \big)\, dF(x) \Big)\, ds.$$
Therefore
$$R_n(Y_t) = R(y) + \int_0^t A R_n(Y_s)\, ds + M_t^{(n)},$$
where $M^{(n)}$ is the local martingale
$$M_t^{(n)} = \sigma_P \int_0^t R_n'(Y_s)\, dW_{P,s} + \sigma \int_0^t R_n'(Y_s)\, Y_s\, dW_s + \tilde{M}_t^{(n)}.$$
Since $R'$ is assumed bounded, it follows that $R_n'$ is bounded (for $n$ fixed), hence the first and the third terms are martingales. Let $T_b = \inf\{t : Y_t \ge b\}$ for $b > y$. Then on $[0, T_y \wedge T_b)$, $|R_n'(Y_s) Y_s| \le mb$ where $m = \sup\{|R_n'(z)| : z > 0\}$. Therefore $(M^{(n)})^{T_y \wedge T_b}$ is a martingale, and since by assumption $AR(Y_s) = 0$ for $s < T_y$, we get
$$\big| E[R_n(Y_{t \wedge T_y \wedge T_b})] - R(y) \big| = \Big| E\Big[ \int_0^{t \wedge T_y \wedge T_b} \big( A R_n(Y_s) - A R(Y_s) \big)\, ds \Big] \Big| \le E\Big[ \int_0^{t \wedge T_y \wedge T_b} \big| A R_n(Y_s) - A R(Y_s) \big|\, ds \Big].$$
But for $y \ge 0$,
$$|A R_n(y) - A R(y)| = \lambda \Big| \int_0^\infty \big( R_n(y - x) - R(y - x) \big)\, dF(x) \Big| = \lambda \Big| \int_y^{y + 1/n} \big( R_n(y - x) - 1 \big)\, dF(x) \Big|$$
$$\le \lambda \big( 1 + |R(0)| + \varepsilon \big) \big( F(y + 1/n) - F(y) \big) \to 0 \quad \text{as } n \to \infty.$$
Therefore, by the bounded convergence theorem, letting $n \to \infty$,
$$E[R(Y_{t \wedge T_y \wedge T_b})] = R(y).$$
Letting $b \to \infty$, the bounded convergence theorem gives
$$E[R(Y_{t \wedge T_y})] = R(y).$$
But
$$E[R(Y_{t \wedge T_y})] = E[R(Y_{T_y}) 1_{\{T_y \le t\}}] + E[R(Y_t) 1_{\{T_y > t\}}].$$
Now $R(Y_{T_y}) = 1$, and since $r > 0$, $Y_t \to \infty$ as $t \to \infty$ on $\{T_y = \infty\}$, hence $R(Y_t) \to 0$ as $t \to \infty$ on $\{T_y = \infty\}$. Therefore, by bounded convergence,
$$R(y) = \lim_{t \to \infty} E[R(Y_{t \wedge T_y})] = \lim_{t \to \infty} P(T_y \le t) = P(T_y < \infty).$$
This proves part (i). The proof of part (ii) is similar and is left as an exercise.
It follows from the theorem that the boundary conditions, together with the boundedness assumptions, are sufficient to determine $R$ and $q_\delta$ uniquely, provided the solutions exist. However, it may happen that the relevant integro-differential equations must be differentiated one or several times to get rid of the integral operator, and in this case it is important to retain the boundary value information from the original equation. To be specific, consider the equation $Aq_\delta(y) = \delta q_\delta(y)$, since the equation $AR(y) = 0$ is obtained from this by setting $\delta = 0$ and $q_\delta = R$. Then, letting $y \downarrow 0$ in this equation and using the boundary conditions of Theorem 14.1, we find
$$\frac{1}{2} \sigma_P^2 q_\delta''(0+) + p q_\delta'(0+) - \delta = 0 \quad \text{when } \sigma_P^2 > 0, \qquad (14.3)$$
and
$$p q_\delta'(0+) - (\lambda + \delta) q_\delta(0) + \lambda = 0 \quad \text{when } \sigma_P^2 = 0. \qquad (14.4)$$
Similarly if we have to differentiate $Aq_\delta(y) = \delta q_\delta(y)$ twice. Then, letting $y \downarrow 0$ in $\frac{d}{dy}(Aq_\delta(y) - \delta q_\delta(y)) = 0$ and assuming that $f(x) = F'(x)$ exists and is continuous in an interval $[0, \varepsilon)$ (right derivative at 0), we get the following additional boundary conditions: when $\sigma_P^2 > 0$,
$$\frac{1}{2} \sigma_P^2 q_\delta'''(0+) + \Big( p - \frac{\sigma_P^2}{2p}(r - \lambda - \delta) \Big) q_\delta''(0+) + \frac{\delta}{p}(r - \lambda - \delta) = 0, \qquad (14.5)$$
and when $\sigma_P^2 = 0$,
$$p q_\delta''(0+) + \Big( r - \lambda - \delta + \frac{\lambda p f(0)}{\lambda + \delta} \Big) q_\delta'(0+) - \frac{\lambda \delta f(0)}{\lambda + \delta} = 0. \qquad (14.6)$$
Here we used (14.3) and (14.4) to get rid of $q_\delta'(0)$ and $q_\delta(0)$ respectively.
To see that all this makes sense, consider the special case when the $(S_i)$ are exponentially distributed with expectation $\frac{1}{\beta}$. Then from
$$\frac{d}{dy}\big( Aq_\delta(y) - \delta q_\delta(y) \big) + \beta \big( Aq_\delta(y) - \delta q_\delta(y) \big) = 0,$$
we find that $q_\delta(y)$ solves
$$\frac{1}{2}(\sigma^2 y^2 + \sigma_P^2) q_\delta'''(y) + \Big( \frac{\beta}{2} \sigma^2 y^2 + (r + \sigma^2) y + p + \frac{\beta}{2} \sigma_P^2 \Big) q_\delta''(y)$$
$$+ \big( \beta r y + \beta p + r - \lambda - \delta \big) q_\delta'(y) - \beta \delta q_\delta(y) = 0, \qquad (14.7)$$
with boundary conditions given in Theorem 14.1 and (14.3) or (14.4).
Exercise 14.1 Prove formulas (14.3)-(14.7).
It is time to look at some examples. We shall follow the notation of Slater (1960) and write
$$h(y) = T(a, b, y)$$
for any general solution of the confluent hypergeometric equation
$$y h''(y) + (b - y) h'(y) - a h(y) = 0.$$
Similarly, we write $F(a, b, y)$ for the standard confluent hypergeometric function and $U(a, b, y)$ for its second form (see Slater (1960) p. 5 for details). We will make use of the integral representations
$$F(a, b, y) = \frac{\Gamma(b)}{\Gamma(b - a)\Gamma(a)} \int_0^1 e^{yt} t^{a-1} (1 - t)^{b-a-1}\, dt, \qquad b > a > 0, \qquad (14.8)$$
and
$$U(a, b, y) = \frac{1}{\Gamma(a)} \int_0^\infty e^{-yt} t^{a-1} (1 + t)^{b-a-1}\, dt, \qquad a > 0,\ y > 0. \qquad (14.9)$$
These formulas are given in Slater (1960) p. 34 and 38 respectively.
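Both representations are easy to check numerically. The sketch below, with arbitrary illustrative values of $a$, $b$, $y$, compares the quadratures in (14.8) and (14.9) with SciPy's built-in implementations of $F$ and $U$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, hyp1f1, hyperu

a, b, y = 0.7, 1.9, 1.3   # illustrative values with b > a > 0, y > 0

# (14.8): F(a, b, y) as a finite integral
F_int = gamma(b) / (gamma(b - a) * gamma(a)) * quad(
    lambda t: np.exp(y * t) * t**(a - 1) * (1 - t)**(b - a - 1), 0, 1)[0]

# (14.9): U(a, b, y) as an integral over (0, inf)
U_int = 1 / gamma(a) * quad(
    lambda t: np.exp(-y * t) * t**(a - 1) * (1 + t)**(b - a - 1), 0, np.inf)[0]

print(F_int, hyp1f1(a, b, y))   # should agree to quadrature accuracy
print(U_int, hyperu(a, b, y))
```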
Example 14.1 Consider the case when $\sigma = \sigma_P = 0$ and the $(S_i)$ are exponentially distributed with expectation $\frac{1}{\beta}$. Then the stochastic differential equation (14.1) takes the form
$$Y_t = y + pt - \sum_{i=1}^{N_t} S_i + r \int_0^t Y_s\, ds.$$
By Theorem 14.1, (14.4) and (14.7) we must find a solution of
$$(ry + p) R''(y) + \big( \beta r y + \beta p + r - \lambda \big) R'(y) = 0$$
with
$$\lim_{y \to \infty} R(y) = 0, \qquad p R'(0+) - \lambda R(0) + \lambda = 0.$$
It is straightforward to verify that $R(y)$ is given by
$$R(y) = \frac{\lambda \int_y^\infty e^{-\beta x} \big( 1 + \frac{r x}{p} \big)^{\frac{\lambda}{r} - 1}\, dx}{p + \lambda \int_0^\infty e^{-\beta x} \big( 1 + \frac{r x}{p} \big)^{\frac{\lambda}{r} - 1}\, dx},$$
a result that goes back to Segerdahl (1942).
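As a numerical illustration (the parameter values below are hypothetical), $R(y)$ is a ratio of two one-dimensional integrals and can be evaluated by standard quadrature:

```python
import numpy as np
from scipy.integrate import quad

lam, beta, p, r = 1.0, 1.0, 1.2, 0.5   # illustrative parameters

def integrand(x):
    return np.exp(-beta * x) * (1 + r * x / p)**(lam / r - 1)

def ruin_prob(y):
    num = lam * quad(integrand, y, np.inf)[0]
    den = p + lam * quad(integrand, 0, np.inf)[0]
    return num / den

for y in (0.0, 1.0, 5.0, 10.0):
    print(y, ruin_prob(y))   # decreasing in y, tends to 0
```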
Consider now the problem of finding $q_\delta(y) = E[e^{-\delta T_y}]$. Then, again by Theorem 14.1, (14.4) and (14.7), we look for a solution of
$$(ry + p) q_\delta''(y) + \big( \beta r y + \beta p + r - \lambda - \delta \big) q_\delta'(y) - \beta \delta q_\delta(y) = 0 \qquad (14.10)$$
with
$$\lim_{y \to \infty} q_\delta(y) = 0, \qquad p q_\delta'(0) - (\lambda + \delta) q_\delta(0) + \lambda = 0. \qquad (14.11)$$
A change of variable $y = -\frac{z}{\beta} - \frac{p}{r}$ and $g(z) = q_\delta(y)$ brings (14.10) into the confluent hypergeometric form
$$z g''(z) + \Big( 1 - \frac{\lambda + \delta}{r} - z \Big) g'(z) + \frac{\delta}{r}\, g(z) = 0.$$
Therefore
$$q_\delta(y) = T\Big( -\frac{\delta}{r},\ 1 - \frac{\lambda + \delta}{r},\ -\beta \Big( y + \frac{p}{r} \Big) \Big).$$
The solution we are seeking is (see Slater (1960) p. 5)
$$q_\delta(y) = c\, e^{-\beta y} \Big( y + \frac{p}{r} \Big)^{\frac{\lambda + \delta}{r}} U\Big( 1 + \frac{\delta}{r},\ 1 + \frac{\lambda + \delta}{r},\ \beta \Big( y + \frac{p}{r} \Big) \Big),$$
since then, by Slater (1960) p. 60,
$$q_\delta(y) \sim c\, \beta^{-(1 + \frac{\delta}{r})} e^{-\beta y} \Big( y + \frac{p}{r} \Big)^{\frac{\lambda}{r} - 1} \to 0 \quad \text{when } y \to \infty.$$
Using the differentiation rule for confluent hypergeometric functions (see Slater (1960) p. 16), the boundary conditions (14.11) are satisfied for
$$q_\delta(y) = \frac{\lambda}{\beta p}\, e^{-\beta y} \Big( 1 + \frac{r}{p} y \Big)^{\frac{\lambda + \delta}{r}} \frac{U\big( 1 + \frac{\delta}{r},\ 1 + \frac{\lambda + \delta}{r},\ \beta (y + \frac{p}{r}) \big)}{U\big( 1 + \frac{\delta}{r},\ 1 + \frac{\lambda + \delta}{r},\ \frac{\beta p}{r} \big) + \big( 1 + \frac{\delta}{r} \big) U\big( 2 + \frac{\delta}{r},\ 2 + \frac{\lambda + \delta}{r},\ \frac{\beta p}{r} \big)}.$$
This can be computed by using (14.9) and then inverted numerically to find the distribution of $T_y$.
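A minimal sketch of this program, assuming illustrative parameters and a recent mpmath (for hyperu and the numerical Laplace inverter invertlaplace): since $q_\delta(y) = \int_0^\infty e^{-\delta t}\, dP(T_y \le t)$, the function $\delta \mapsto q_\delta(y)/\delta$ is the Laplace transform of the (possibly defective) distribution function $t \mapsto P(T_y \le t)$, which can then be inverted numerically.

```python
import mpmath as mp

lam, beta, p, r = mp.mpf(1), mp.mpf(1), mp.mpf('1.2'), mp.mpf('0.5')

def q(y, d):
    """The closed form above; d may be complex (needed by the inverter)."""
    a, b = 1 + d / r, 1 + (lam + d) / r
    num = mp.hyperu(a, b, beta * (y + p / r))
    den = mp.hyperu(a, b, beta * p / r) + (1 + d / r) * mp.hyperu(a + 1, b + 1, beta * p / r)
    return lam / (beta * p) * mp.exp(-beta * y) * (1 + r * y / p)**((lam + d) / r) * num / den

y = mp.mpf(2)
print(q(y, mp.mpf('0.5')))    # Laplace transform at delta = 0.5

# defective distribution function of T_y via numerical Laplace inversion
cdf = lambda t: mp.invertlaplace(lambda d: q(y, d) / d, t, method='talbot')
print(cdf(5.0))               # approximate P(T_y <= 5)
```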
Exercise 14.2 Verify the expression for R(y) in Example 14.1.
Example 14.2 As in Example 14.1, assume that $\sigma = \sigma_P = 0$, but that $S$ is a mixture of two exponentials, i.e.
$$f(x) = \big( \alpha \beta_1 e^{-\beta_1 x} + (1 - \alpha) \beta_2 e^{-\beta_2 x} \big) 1_{\{x \ge 0\}}.$$
Some tedious calculations show that from
$$\frac{d^2}{dy^2} AR(y) + (\beta_1 + \beta_2) \frac{d}{dy} AR(y) + \beta_1 \beta_2\, AR(y) = 0,$$
we get the following differential equation
$$(ry + p) R'''(y) + \big( (\beta_1 + \beta_2) r y + 2r + (\beta_1 + \beta_2) p - \lambda \big) R''(y)$$
$$+ \big( \beta_1 \beta_2 r y + (\beta_1 + \beta_2) r + \beta_1 \beta_2 p - \lambda ((1 - \alpha) \beta_1 + \alpha \beta_2) \big) R'(y) = 0. \qquad (14.12)$$
From Theorem 14.1, (14.4) and (14.6) we get the boundary conditions
$$\lim_{y \to \infty} R(y) = 0,$$
$$p R'(0+) - \lambda R(0) + \lambda = 0,$$
$$p R''(0+) + \big( r - \lambda + p(\alpha \beta_1 + (1 - \alpha) \beta_2) \big) R'(0+) = 0. \qquad (14.13)$$
We assume without loss of generality that $\beta_1 > \beta_2$ and define $\beta_D = \beta_1 - \beta_2$. A change of variable $y = \frac{z}{\beta_D} - \frac{p}{r}$ and $R'(y) = \exp\big( -\frac{\beta_1}{\beta_D} z \big) g(z)$ brings (14.12) into the confluent hypergeometric form
$$z g''(z) + \Big( 2 - \frac{\lambda}{r} - z \Big) g'(z) - \Big( 1 - \frac{\alpha \lambda}{r} \Big) g(z) = 0.$$
We therefore get
$$R'(y) = e^{-\beta_1 y}\, T\Big( 1 - \frac{\alpha \lambda}{r},\ 2 - \frac{\lambda}{r},\ \beta_D \Big( y + \frac{p}{r} \Big) \Big).$$
We choose the following two independent solutions
$$\varphi_1(y) = e^{-\beta_1 y} \Big( y + \frac{p}{r} \Big)^{\frac{\lambda}{r} - 1} F\Big( (1 - \alpha) \frac{\lambda}{r},\ \frac{\lambda}{r},\ \beta_D \Big( y + \frac{p}{r} \Big) \Big),$$
$$\varphi_2(y) = e^{-\beta_1 y} \Big( y + \frac{p}{r} \Big)^{\frac{\lambda}{r} - 1} U\Big( (1 - \alpha) \frac{\lambda}{r},\ \frac{\lambda}{r},\ \beta_D \Big( y + \frac{p}{r} \Big) \Big).$$
Our candidate solution is then
$$R(y) = c_1 R_1(y) + c_2 R_2(y),$$
where, as before, asymptotic formulas give that
$$R_i(y) = \int_y^\infty \varphi_i(x)\, dx, \qquad i = 1, 2,$$
exist and are finite.
Some calculations show that the last two conditions of (14.13) are satisfied for
$$c_1 = -\lambda\, \frac{\varphi_U(0) + \frac{\lambda}{r} \varphi_U(1)}{D}, \qquad c_2 = \lambda\, \frac{\varphi_F(0) - \varphi_F(1)}{D},$$
where
$$D = \lambda \big( \varphi_F(0) - \varphi_F(1) \big) \Big( \frac{p}{\lambda} \Big( \frac{p}{r} \Big)^{\frac{\lambda}{r} - 1} \varphi_U(0) + R_2(0) \Big) - \lambda \Big( \varphi_U(0) + \frac{\lambda}{r} \varphi_U(1) \Big) \Big( \frac{p}{\lambda} \Big( \frac{p}{r} \Big)^{\frac{\lambda}{r} - 1} \varphi_F(0) + R_1(0) \Big),$$
and we used the abbreviated forms
$$\varphi_F(k) = F\Big( k + (1 - \alpha) \frac{\lambda}{r},\ k + \frac{\lambda}{r},\ \beta_D \frac{p}{r} \Big), \qquad \varphi_U(k) = U\Big( k + (1 - \alpha) \frac{\lambda}{r},\ k + \frac{\lambda}{r},\ \beta_D \frac{p}{r} \Big).$$
Using the integral representations (14.8) and (14.9) together with Fubini's theorem, we also find
$$R_1(y) = \frac{\Gamma(\frac{\lambda}{r})}{\Gamma(\frac{\alpha \lambda}{r}) \Gamma((1 - \alpha) \frac{\lambda}{r})}\, e^{\frac{\beta_1 p}{r}} \int_0^1 t^{(1-\alpha)\frac{\lambda}{r} - 1} (1 - t)^{\frac{\alpha \lambda}{r} - 1} (\beta_1 - \beta_D t)^{-\frac{\lambda}{r}}\, \Gamma\Big( \frac{\lambda}{r},\ \Big( y + \frac{p}{r} \Big)(\beta_1 - \beta_D t) \Big)\, dt$$
and
$$R_2(y) = \frac{1}{\Gamma((1 - \alpha)\frac{\lambda}{r})}\, e^{\frac{\beta_1 p}{r}} \int_0^\infty t^{(1-\alpha)\frac{\lambda}{r} - 1} (1 + t)^{\frac{\alpha \lambda}{r} - 1} (\beta_1 + \beta_D t)^{-\frac{\lambda}{r}}\, \Gamma\Big( \frac{\lambda}{r},\ \Big( y + \frac{p}{r} \Big)(\beta_1 + \beta_D t) \Big)\, dt,$$
where $\Gamma(a, x)$ is the incomplete gamma function
$$\Gamma(a, x) = \int_x^\infty e^{-t} t^{a-1}\, dt.$$
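For numerical work, $\Gamma(a, x)$ is available through the regularized upper incomplete gamma function, $\Gamma(a, x) = \Gamma(a) Q(a, x)$, so $R_1$ and $R_2$ reduce to one-dimensional quadratures. A sketch with hypothetical parameter values:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, gammaincc

lam, p, r = 1.0, 1.2, 0.5
alpha, b1, b2 = 0.4, 2.0, 1.0    # mixture weight and rates, beta1 > beta2
bD, rho = b1 - b2, lam / r       # beta_D and lambda/r

def Gamma_upper(a, x):           # Gamma(a, x) = Gamma(a) * Q(a, x)
    return gamma(a) * gammaincc(a, x)

def R1(y):
    c = gamma(rho) / (gamma(alpha * rho) * gamma((1 - alpha) * rho)) * np.exp(b1 * p / r)
    f = lambda t: (t**((1 - alpha) * rho - 1) * (1 - t)**(alpha * rho - 1)
                   * (b1 - bD * t)**(-rho)
                   * Gamma_upper(rho, (y + p / r) * (b1 - bD * t)))
    return c * quad(f, 0, 1)[0]

def R2(y):
    c = np.exp(b1 * p / r) / gamma((1 - alpha) * rho)
    f = lambda t: (t**((1 - alpha) * rho - 1) * (1 + t)**(alpha * rho - 1)
                   * (b1 + bD * t)**(-rho)
                   * Gamma_upper(rho, (y + p / r) * (b1 + bD * t)))
    return c * quad(f, 0, np.inf)[0]

print(R1(1.0), R2(1.0))          # values of the two basis solutions at y = 1
```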
Before we consider the pure diffusion case, i.e. when $\lambda = 0$, we will discuss how this can be seen as a limit of nondiffusion cases.

Let $(P^{(n)}, R^{(n)})$ be a sequence of processes of the form (14.2), i.e.
$$P_t^{(n)} = p^{(n)} t + \sigma_P^{(n)} W_{P,t}^{(n)} - \sum_{i=1}^{N_t^{(n)}} S_i^{(n)}, \qquad R_t^{(n)} = r^{(n)} t + \sigma^{(n)} W_t^{(n)},$$
and let $\lambda^{(n)}$ be the intensity of $N^{(n)}$. We assume that $(P^{(n)})$ and $(R^{(n)})$ are independent sequences of processes (all defined on the same probability space $(\Omega, \mathcal{T}, F, P)$). Then, as in (14.1), we set
$$Y_t^{(n)} = y + P_t^{(n)} + \int_0^t Y_{s-}^{(n)}\, dR_s^{(n)},$$
i.e. we let the initial assets $Y_0^{(n)} = y$ be independent of $n$. We then have:
Theorem 14.2 With the above model, assume there exist constants $p$, $\sigma_P$, $r$, $\sigma$ so that
$$\lim_{n \to \infty} \big( p^{(n)} - \lambda^{(n)} E[S^{(n)}] \big) = p,$$
$$\lim_{n \to \infty} \big( (\sigma_P^{(n)})^2 + \lambda^{(n)} E[(S^{(n)})^2] \big) = \sigma_P^2,$$
$$\lim_{n \to \infty} \lambda^{(n)} E\big[ 1_{\{|S^{(n)}| > \varepsilon\}} (S^{(n)})^2 \big] = 0 \quad \forall \varepsilon > 0,$$
$$\lim_{n \to \infty} r^{(n)} = r, \qquad \lim_{n \to \infty} (\sigma^{(n)})^2 = \sigma^2.$$
Then, if
$$T_y^{(n)} = \inf\{t : Y_t^{(n)} \le 0\},$$
with $T_y^{(n)} = \infty$ if $Y_t^{(n)} > 0$ for all $t \ge 0$, we have for all $t \ge 0$,
$$\lim_{n \to \infty} P(T_y^{(n)} \le t) = P(T_y \le t),$$
where as usual $T_y = \inf\{t : Y_t < 0\}$, with $T_y = \infty$ if $Y_t \ge 0$ for all $t \ge 0$. Also, if $r > \sigma^2$ we have
$$\lim_{n \to \infty} P(T_y^{(n)} < \infty) = P(T_y < \infty).$$
Since $R$ in (14.2) is already a diffusion, it is natural to let $R^{(n)} = R$ for all $n$.
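The scaling in Theorem 14.2 can be illustrated by simulation. In the sketch below (a hypothetical setup, not from the text), claims are exponential with mean $1/n$, with $\lambda^{(n)} = \sigma_P^2 n^2 / 2$ and $p^{(n)} = p + \lambda^{(n)}/n$, so that the three limits for the $P$-part hold with $\sigma_P^{(n)} = 0$; the finite-horizon ruin frequency is then compared with that of the limiting diffusion model.

```python
import numpy as np

rng = np.random.default_rng(7)
y, p, sig_P, r, sig = 2.0, 1.0, 0.5, 0.05, 0.1
T, dt, n_paths = 10.0, 0.005, 500
steps = int(T / dt)

def ruin_freq(n=None):
    """Fraction of paths ruined before T; n = None gives the diffusion limit."""
    ruined = 0
    for _ in range(n_paths):
        Y = y
        for _ in range(steps):
            if n is None:                              # limiting model
                dP = p * dt + sig_P * np.sqrt(dt) * rng.standard_normal()
            else:                                      # jump model
                lam_n = sig_P**2 * n**2 / 2            # so lam_n E[S^2] = sig_P^2
                p_n = p + lam_n / n                    # so p_n - lam_n E[S] = p
                dP = p_n * dt - rng.exponential(1 / n, rng.poisson(lam_n * dt)).sum()
            Y += dP + Y * (r * dt + sig * np.sqrt(dt) * rng.standard_normal())
            if Y <= 0:
                ruined += 1
                break
    return ruined / n_paths

print(ruin_freq(5), ruin_freq(20), ruin_freq())   # should be close for large n
```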
Exercise 14.3 Let
$$P_t = pt + \sigma_P W_{P,t}, \qquad R_t = rt + \sigma W_t, \qquad (14.14)$$
with $P$ and $R$ independent. Use Theorem 14.1 to prove that $P(T_y < \infty)$ is the same as in Example 12.1.
Consider now the model (14.14), so that
$$Y_t = y + \int_0^t (p + r Y_s)\, ds + \sigma_P W_{P,t} + \sigma \int_0^t Y_s\, dW_s.$$
We are interested in the Laplace transform $q_\delta(y) = E[e^{-\delta T_y}]$ for $\delta > 0$. By Theorem 14.1, $q_\delta(y)$ is the solution of
$$\frac{1}{2}(\sigma^2 y^2 + \sigma_P^2) q_\delta''(y) + (ry + p) q_\delta'(y) - \delta q_\delta(y) = 0$$
with boundary conditions
$$q_\delta(0) = 1 \quad \text{and} \quad \lim_{y \to \infty} q_\delta(y) = 0.$$
By using the method of contour integration it can be proven that a solution is given by
$$q_\delta(y) = \frac{D(y, \delta)}{D(0, \delta)},$$
where
$$\gamma = \frac{1}{2} \Bigg( \sqrt{\Big( \frac{2r}{\sigma^2} - 1 \Big)^2 + \frac{8\delta}{\sigma^2}} + \Big( 1 - \frac{2r}{\sigma^2} \Big) \Bigg),$$
and
$$D(x, \delta) = \int_x^\infty (t - x)^\gamma K(t)\, dt,$$
with
$$K(t) = (\sigma^2 t^2 + \sigma_P^2)^{-(1 + \frac{1}{2} \kappa)} \exp\Big( -\frac{2p}{\sigma \sigma_P} \arctan\Big( \frac{\sigma}{\sigma_P} t \Big) \Big)$$
and
$$\kappa = \sqrt{\Big( \frac{2r}{\sigma^2} - 1 \Big)^2 + \frac{8\delta}{\sigma^2}} - 1.$$
15 References.
These notes have drawn heavily on the book by Jacod and Shiryaev (1987). With
the exception of Section 6, the first 11 sections are to a large extent taken from
Chapter I of that book. In addition I have benefitted from the book by He, Wang
and Yan (1992), and the book by Protter (1992) has also been useful. Theorem 10.9
is taken from Brémaud (1981). For Section 6, the main source has been the book
by Chung (1982), but the books by Karatzas and Shreve (1988) and Revuz and Yor
(1994) have also been very useful, mainly in this section, but also in other sections.
Theorem 6.9 is taken from Grandell (1991).
Section 13 is basically taken from Chapter II of Jacod and Shiryaev. Section 12
is based on a paper by Paulsen (1993), except Example 12.2 which is from Nilsen
and Paulsen (1996). Section 14 is taken from Paulsen and Gjessing (1996).
Brémaud, P. (1981). Point Processes and Queues: Martingale Dynamics. Springer,
New York.
Chung, K.L. (1982). Lectures from Markov Processes to Brownian Motion. Springer,
New York.
Feller, W. (1971). An Introduction to Probability Theory and its Applications, Vol.
II. Wiley, New York.
Grandell, J. (1991). Aspects of Risk Theory. Springer, New York.
He, S., Wang, J. and J. Yan (1992). Semimartingale Theory and Stochastic Cal-
culus. CRC Press Inc., Boca Raton.
Jacod, J. and A.N. Shiryaev (1987). Limit Theorems for Stochastic Processes.
Springer, Berlin.
Karatzas, I. and S.E. Shreve (1988). Brownian Motion and Stochastic Calcu-
lus. Springer, New York.
Nilsen, T. and J. Paulsen (1996). On the distribution of a randomly discounted
compound Poisson process. Stochastic Processes and their Applications, 61, 305-
310.
Paulsen, J. (1993). Risk theory in a stochastic economic environment. Stochastic
Processes and their Applications, 46, 327-361.
Paulsen, J. and H.K. Gjessing (1996). Ruin theory with stochastic return on
investments. To appear in Advances in Applied Probability.
Protter, P. (1992). Stochastic Integration and Differential Equations: A New Ap-
proach. Springer, Berlin.
Revuz, D. and M. Yor (1994). Continuous Martingales and Brownian Motion,
Second edition. Springer, Berlin.
Segerdahl, C.O. (1942). Über einige risikotheoretische Fragestellungen. Skand.
Aktuar. Tidskr. 25, 43-83.
Slater, L.J. (1960). Confluent Hypergeometric Functions. Cambridge University
Press, London.