RANDOM WALKS BY SEQUENTIAL IMPORTANCE SAMPLING AND RESAMPLING

HOCK PENG CHAN,

SHAOJIE DENG,

Microsoft

TZE-LEUNG LAI,

Stanford University

Abstract

We introduce a new approach to simulating rare events for Markov random walks with heavy-tailed increments. This approach involves sequential importance sampling and resampling, and uses a martingale representation of the corresponding estimate of the rare-event probability to show that it is unbiased and to bound its variance. By choosing the importance measures and resampling weights suitably, it is shown how this approach can yield asymptotically efficient Monte Carlo estimates.

Keywords: Efficient simulation; heavy-tailed distributions; regularly varying tails; sequential Monte Carlo

2010 Mathematics Subject Classification: Primary 65C05; Secondary 60G50

1. Introduction

The past decade has witnessed many important advances in Monte Carlo methods for computing tail distributions and boundary crossing probabilities of multivariate random walks with i.i.d. or Markov-dependent increments; see the survey paper by Blanchet and Lam [6]. In particular, the case of heavy-tailed random walks has attracted much recent attention because of its applications to queueing and communication networks. A random variable is called light-tailed if its moment generating function is finite in some neighborhood of the origin. It is said to be heavy-tailed otherwise.

Another area of much recent interest is the development and the associated probability theory of efficient Monte Carlo methods to compute rare-event probabilities

2 H.P. CHAN, S. DENG AND T.L. LAI

$\alpha_n = P(A_n)$ such that $\alpha_n \to 0$ as $n \to \infty$. A Monte Carlo estimator $\hat\alpha_n$ of $\alpha_n$ using $m$ simulation runs is said to be logarithmically efficient if
\[ m\,\mathrm{Var}(\hat\alpha_n) \le \alpha_n^{2+o(1)} \quad \text{as } n \to \infty; \tag{1.1} \]
it is said to be strongly efficient if
\[ m\,\mathrm{Var}(\hat\alpha_n) = O(\alpha_n^2). \tag{1.2} \]
Strong efficiency means that for every $\epsilon > 0$,
\[ \mathrm{Var}(\hat\alpha_n) \le \epsilon\,\alpha_n^2 \tag{1.3} \]
can be achieved by using $m$ simulation runs, with $m$ depending on $\epsilon$ but not on $n$. In the case of logarithmic efficiency, (1.3) can be achieved by using $m_n$ simulation runs, with $m_n = (\alpha_n^{-1})^{o(1)}$ to cancel the $\alpha_n^{o(1)}$ term in (1.1). Since the focus of this paper is on rare events associated with a random walk $S_n$, any Monte Carlo estimate of a rare-event probability has to generate the i.i.d. or Markov dependent increments $X_1, \dots, X_n$ of the random walk for each simulation run, and this computational task is linear in $n$. We call the Monte Carlo estimate linearly efficient if $m_n = O(n)$ simulation runs can be used to achieve (1.3). More generally, for any nondecreasing sequence of positive constants $C_n$ such that $C_n = o(\alpha_n^{-1})$, we call the Monte Carlo estimate $C_n$-efficient if $m_n = O(C_n)$ simulation runs can achieve (1.3). Note in this connection that the variance of the direct Monte Carlo estimate of $\alpha_n$ using $m_n$ independent simulation runs is $\alpha_n(1-\alpha_n)/m_n$, and therefore (1.3) can be achieved only by choosing $m_n \ge (\epsilon\alpha_n)^{-1}(1-\alpha_n)$.
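To make the cost of direct Monte Carlo concrete, the following minimal sketch evaluates the smallest number of runs for which the variance of the direct estimate, $\alpha(1-\alpha)/m$, is at most $\epsilon\alpha^2$. The function name `direct_mc_runs` and the chosen values of the probability and of the tolerance are illustrative assumptions, not part of the paper.

```python
import math

def direct_mc_runs(alpha, eps):
    """Smallest m with alpha*(1-alpha)/m <= eps*alpha**2 (direct Monte Carlo)."""
    return math.ceil((1.0 - alpha) / (eps * alpha))

# The required number of runs grows like 1/alpha as the event becomes rarer.
for alpha in (1e-2, 1e-4, 1e-6):
    print(alpha, direct_mc_runs(alpha, eps=0.1))
```

The growth proportional to $1/\alpha$ is what the efficiency notions above are designed to avoid.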

To achieve strong efficiency, Blanchet and Glynn [4] and Blanchet and Liu [7] have made use of approximations of Doob's h-transform to develop an importance sampling method for computing $P(A)$ when the event $A$ is related to a Markov chain $\{Y_k\}$ that has transition probability densities $p_k(\cdot|Y_{k-1})$ with respect to some measure $\nu$. Letting $h_k(Y_k) = P(A|Y_k)$, note that
\[ E(h_k(Y_k)|Y_{k-1}) = E(P(A|Y_k)|\mathcal{F}_{k-1}) = P(A|\mathcal{F}_{k-1}) = h_{k-1}(Y_{k-1}), \]
i.e., $\int p_k(x,y)h_k(y)\,d\nu(y) = h_{k-1}(x)$. This yields the transition density
\[ p_k^h(x,y) := p_k(x,y)\,\frac{h_k(y)}{h_{k-1}(x)} \tag{1.4} \]

SISR for heavy-tailed random walks 3

of an importance measure $Q = P(\cdot|A)$, and $p_k^h$ is called the h-transform of $p_k$. Although the likelihood ratio $dP/dQ$ is equal to $P(A)$ and has therefore zero variance, this importance measure cannot be used in practice because $P(A)$ is the unknown probability to be estimated. On the other hand, one may be able to find a tractable approximation $v_k$ of $h_k$ for $k = 1, 2, \dots$ so that $p_k^h(x,y)$ can be approximated by
\[ q_k(x,y) = \frac{p_k(x,y)\,v_k(y)}{\int p_k(x,y')\,v_k(y')\,d\nu(y')}, \tag{1.5} \]
which is the transition density function of an importance measure that can be used to perform importance sampling.
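The zero-variance property of the h-transform (1.4) can be checked exactly on a small finite-state chain. The sketch below is only an illustration under stated assumptions: the clamped birth-death chain, the event that the chain ends in state 3 or above, and all function names are hypothetical choices that do not come from the paper. It computes $h_k$ by backward recursion, samples one path from the h-transformed chain, and accumulates the likelihood ratio $dP/dQ$, which telescopes to $h_0(y_0) = P(A)$.

```python
import random

def birth_death(num_states, up):
    """Toy transition matrix: step up w.p. `up`, down otherwise, clamped at both ends."""
    P = [[0.0] * num_states for _ in range(num_states)]
    for x in range(num_states):
        P[x][min(x + 1, num_states - 1)] += up
        P[x][max(x - 1, 0)] += 1.0 - up
    return P

def h_functions(P, in_event, n):
    """Backward recursion h_{k-1}(x) = sum_y P[x][y] h_k(y), with h_n the indicator of A."""
    states = range(len(P))
    h = [None] * (n + 1)
    h[n] = [1.0 if in_event(y) else 0.0 for y in states]
    for k in range(n, 0, -1):
        h[k - 1] = [sum(P[x][y] * h[k][y] for y in states) for x in states]
    return h

def h_transform_row(P, h, k, x):
    """Row x of p^h_k(x, y) = P[x][y] h_k(y) / h_{k-1}(x), as in (1.4)."""
    return [P[x][y] * h[k][y] / h[k - 1][x] for y in range(len(P))]

def sample_under_q(P, h, n, x0, rng):
    """Draw one path from the h-transformed chain; return (final state, dP/dQ)."""
    x, ratio = x0, 1.0
    for k in range(1, n + 1):
        row = h_transform_row(P, h, k, x)
        y = rng.choices(range(len(P)), weights=row)[0]
        ratio *= P[x][y] / row[y]      # one-step likelihood ratio h_{k-1}(x) / h_k(y)
        x = y
    return x, ratio

P = birth_death(5, up=0.3)
h = h_functions(P, lambda y: y >= 3, n=6)
end, ratio = sample_under_q(P, h, 6, 0, random.Random(0))
print(end >= 3, abs(ratio - h[0][0]) < 1e-12)
```

Every sampled path lands in the event and carries the same weight, which is why the scheme is unusable directly: building it requires the very probabilities $h_k$ that one wants to estimate.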

In this paper, we propose a new approach to simulating rare-event probabilities for heavy-tailed random walks. This approach uses not only sequential (dynamic) importance sampling but also resampling. Chan and Lai [9] have introduced the sequential importance sampling with resampling (SISR) methodology and applied it to simulate $P\{g(S_n/n) \ge b\}$ and $P\{\max_{n_0 \le n \le n_1} n\,g(S_n/n) \ge c\}$ for light-tailed random walks, where $g$ is a general function and $S_n$ is a random walk. Note that unlike [2], we consider here the situation in which $n$ approaches $\infty$, rather than with $n$ fixed. In [9], the importance measure is simply $Q = P$ and the resampling weights for the light-tailed case depend heavily on the finiteness of the moment generating function. Moreover, a distinguishing feature of a heavy-tailed random walk $S_n$ is the possibility of a single large increment resulting in the exceedance of $g(S_n/n)$ or $\max_{n_0 \le n \le n_1} n\,g(S_n/n)$ over a threshold. An important idea underlying the SISR method to simulate rare-event probabilities for heavy-tailed random walks in Section 3 is to make use of the single large jump property to decompose the event of interest into two disjoint events, one of which involves the maximum increment being large. We use different Monte Carlo schemes to simulate these two events.

In Section 2, we describe another way of using SISR to simulate rare-event probabilities of heavy-tailed random walks. Here we start with a target importance measure, such as the one that uses the transition density (1.5) to approximate the h-transform (1.4). The normalizing constant, which is the integral in (1.5), may be difficult to compute for general state spaces. Moreover, it may be difficult to sample from such a density. The SISR procedure in Section 2 provides an alternative to this elaborate direct importance sampling procedure but still achieves its effect. The analysis of the two different SISR schemes for estimating rare-event probabilities, given in Sections 2 and 3 respectively, enables us to bound the variance of a SISR estimate. In Section 4 we use these bounds to show that the SISR estimates developed in Sections 2 and 3 are linearly efficient under certain regularity conditions. Section 5 provides numerical results to supplement the asymptotic theory and gives further discussions on related literature.

2. Implementing a target importance measure by SISR

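Let $\mathbf{Y}_n = (Y_1, \dots, Y_n)$ and let $p_k(\cdot|\mathbf{y}_{k-1})$ be the conditional density, with respect to some measure $\nu$, of $Y_k$ given $\mathbf{Y}_{k-1} = \mathbf{y}_{k-1}$. Let $p_n(\mathbf{y}_n) = \prod_{k=1}^n p_k(y_k|\mathbf{y}_{k-1})$. To evaluate a rare-event probability $\alpha = P\{\mathbf{Y}_n \in \Gamma\}$, direct Monte Carlo involves the generation of $m$ independent samples from the density function $p_n(\mathbf{y}_n)$ and then estimating $\alpha$ by
\[ \hat\alpha_D = m^{-1}\sum_{j=1}^m I_{\{\mathbf{Y}_n^{(j)} \in \Gamma\}}. \]
Importance sampling involves the generation of $m$ independent samples from alternative conditional densities $q_k(\cdot|\mathbf{y}_{k-1})$ and then estimating $\alpha$ by
\[ \hat\alpha_I = m^{-1}\sum_{j=1}^m \frac{p_n(\mathbf{Y}_n^{(j)})\,I_{\{\mathbf{Y}_n^{(j)} \in \Gamma\}}}{q_n(\mathbf{Y}_n^{(j)})}, \tag{2.1} \]
where $q_n(\mathbf{y}_n) = \prod_{k=1}^n q_k(y_k|\mathbf{y}_{k-1})$ satisfies $q_n(\mathbf{y}_n) > 0$ whenever $p_n(\mathbf{y}_n) I_{\{\mathbf{y}_n \in \Gamma\}} > 0$. If one is able to choose $q_n$ such that $p_n(\mathbf{y}_n) I_{\{\mathbf{y}_n \in \Gamma\}}/q_n(\mathbf{y}_n) \le c\alpha$ for some positive constant $c$, then each summand in (2.1) has second moment at most $c^2\alpha^2$ under $Q$, so that
\[ m\,\mathrm{Var}_Q(\hat\alpha_I) \le c^2\alpha^2, \tag{2.2} \]
yielding a strongly efficient $\hat\alpha_I$.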

For the case in which $Y_n$ is a random walk $S_n$ and the rare event is $A = \{S_n \ge b\}$, a candidate for the choice of $q_k(\cdot|S_{k-1})$ is (1.5) in which $v_k$ is an approximation to the h-transform. Large deviation or some other asymptotic method leads to an asymptotic approximation of the form
\[ P(S_n \ge b\,|\,S_k) \sim g(b - S_k,\, n - k), \tag{2.3} \]
which can be used to derive $v_k$. As noted in Section 1, the normalizing constant (i.e., the denominator) in (1.5) is often difficult to evaluate and the target importance measure with transition density (1.5) may be difficult to sample from. We next show that we can bypass the normalizing constant by using SISR, which also enables us to weaken and generalize (2.3) to
\[ \underline{c}_n\, g_n(\mathbf{Y}_k, n-k) \le P(A_n|\mathbf{Y}_k) \le \bar{c}_n\, g_n(\mathbf{Y}_k, n-k) \tag{2.4} \]
for all $n$ and $k$ and almost all $\mathbf{Y}_k$, where $\underline{c}_n$ and $\bar{c}_n$ are positive constants. In (2.4), $\mathbf{Y}_k$ is a general stochastic sequence and we denote the event of interest by $A_n$ to indicate that it is rare in the sense that $\alpha_n = P(A_n) \to 0$ as $n \to \infty$. The weakening of (2.3) to (2.4) is of particular importance for implementation since it allows one to choose $g_n$ to be piecewise constant so that not only can the normalizing constants in (2.5) below be easily computed but (2.5) is also convenient to sample from. Let
\[ q_k(y_k|\mathbf{y}_{k-1}) = \frac{p_k(y_k|\mathbf{y}_{k-1})\, g_n(\mathbf{y}_k, n-k)}{w_{k-1}(\mathbf{y}_{k-1})\, g_n(\mathbf{y}_{k-1}, n-k+1)}, \tag{2.5} \]
in which $w_0 \equiv 1$ and $w_{k-1}(\mathbf{y}_{k-1})$ is a normalizing constant to make $q_k(\cdot|\mathbf{y}_{k-1})$ a density function for $k \ge 2$. From (2.4), it follows that
\[ \gamma_n^{-1} \le w_{k-1}(\mathbf{y}_{k-1}) \le \gamma_n, \quad\text{where } \gamma_n = \bar{c}_n/\underline{c}_n. \tag{2.6} \]

To be more specific, we describe the SISR procedure in stages, initializing with $Y_0^{(\ell)} = y_0$, a specified initial state, or with $Y_0^{(1)}, \dots, Y_0^{(m)}$ generated from the initial distribution.

1. Importance sampling at stage $k$. Generate $\tilde{Y}_k^{(j)}$ from $q_k(\cdot|\mathbf{Y}_{k-1}^{(j)})$ and let $\tilde{\mathbf{Y}}_k^{(j)} = (\mathbf{Y}_{k-1}^{(j)}, \tilde{Y}_k^{(j)})$, for all $1 \le j \le m$.

2. Resampling at stage $k$. Let $\bar{w}_k = m^{-1}\sum_{\ell=1}^m w_k(\tilde{\mathbf{Y}}_k^{(\ell)})$ and define the resampling weights
\[ w_k^{(j)} = w_k(\tilde{\mathbf{Y}}_k^{(j)})/(m\bar{w}_k). \tag{2.7} \]
Generate i.i.d. multinomial random variables $b_1, \dots, b_m$ such that $P\{b_1 = j\} = w_k^{(j)}$ for $1 \le j \le m$. Let $\mathbf{Y}_k^{(\ell)} = \tilde{\mathbf{Y}}_k^{(b_\ell)}$ for all $1 \le \ell \le m$. If $k < n$, increment $k$ by 1 and go to step 1, otherwise end the procedure. There is no resampling at stage $n$.
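The resampling step can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `resample`, the list-based particle representation and the error raised for degenerate weights are hypothetical choices, not part of the paper. It draws $m$ ancestor indices independently with probabilities given by the normalized weights of (2.7) and returns the average weight, which is the factor that later enters the estimate.

```python
import random

def resample(particles, weights, rng):
    """One multinomial resampling step with probabilities w_j / (m * w_bar), as in (2.7)."""
    m = len(particles)
    w_bar = sum(weights) / m
    if w_bar <= 0.0:
        raise ValueError("all resampling weights are zero")
    probs = [w / (m * w_bar) for w in weights]
    ancestors = rng.choices(range(m), weights=probs, k=m)   # i.i.d. indices b_1, ..., b_m
    return [particles[a] for a in ancestors], w_bar

new_particles, w_bar = resample(["a", "b", "c"], [0.0, 0.0, 3.0], random.Random(1))
print(new_particles, w_bar)
```

Particles with zero weight are never selected, so paths that can no longer reach the rare event are dropped and replaced by copies of more promising ones.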

After stage $n$, estimate $\alpha$ by
\[ \hat\alpha_B = \frac{\bar{w}_1 \cdots \bar{w}_{n-1}}{m}\sum_{j=1}^m I_{\{\tilde{\mathbf{Y}}_n^{(j)} \in \Gamma\}}\, \frac{g_n(Y_0^{(a_j)}, n)}{g_n(\tilde{\mathbf{Y}}_n^{(j)}, 0)}, \tag{2.8} \]
where $Y_0^{(a_j)}$ is the initial (ancestral) state of $\tilde{\mathbf{Y}}_n^{(j)}$. For notational simplicity, we assume a specified initial state $Y_0^{(\ell)} = y_0$ for all $\ell$ and denote $g_n(y_0, n)$ and $\alpha_n$ simply by $g_0$ and $\alpha$, respectively.

Resampling is used in the above procedure to handle the normalizing constants in a target importance measure that approximates the h-transform. In [8], a computationally expensive discretization scheme, with partition width $1/n$, is used to implement the state-dependent importance sampling scheme based on the asymptotic approximation (2.3) in the case of regularly varying random walks. Using resampling as described in the preceding paragraph enables us to bypass the costly computation of the normalizing constants, and the SISR estimate $\hat\alpha_B$ is still linearly efficient in this case, as will be shown in the second paragraph of Section 4.1. More importantly, for more complicated models, one can at best expect to have approximations of the type (2.4) rather than the sharp asymptotic formula (2.3). In this case, using (2.5) to perform importance sampling usually does not yield a good Monte Carlo estimate because unlike the situation in (2.2), (2.4) does not imply good bounds for $\prod_{k=1}^n [p_k(y_k|\mathbf{y}_{k-1})/q_k(y_k|\mathbf{y}_{k-1})]$ on $A_n$. On the other hand, using (2.5) for the importance sampling component of an SISR procedure, whose resampling weights are proportional to $w_{k-1}(\mathbf{y}_{k-1})$, can result in a Monte Carlo estimate $\hat\alpha_B$ that has a bound similar to (2.2), which can be used to establish efficiency of the SISR procedure, as we now proceed to show.

Following [9], let $\tilde{E}$ denote expectation with respect to the probability measure from which the $\tilde{\mathbf{Y}}_k^{(i)}$ and $\mathbf{Y}_k^{(i)}$ are drawn; this differs from $E_Q$ for importance sampling from the measure $Q$ since it involves both importance sampling and resampling. A key tool for the analysis of the SISR estimate $\hat\alpha_B$ is the following martingale representation of $m(\hat\alpha_B - \alpha)$; see Section 2 of [9].

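Lemma 1. Let $f_k(\mathbf{y}_k) = P\{\mathbf{Y}_n \in \Gamma\,|\,\mathbf{Y}_k = \mathbf{y}_k\}$ for $1 \le k \le n-1$, $f_0 = \alpha$ and $f_n(\mathbf{y}_n) = I_{\{\mathbf{y}_n \in \Gamma\}}$. Let
\[ \mathcal{F}_{2k-1} = \sigma\big(\{(\tilde{\mathbf{Y}}_t^{(j)}, \mathbf{Y}_t^{(j)}) : 1 \le j \le m,\ 1 \le t \le k-1\} \cup \{\tilde{\mathbf{Y}}_k^{(j)} : 1 \le j \le m\}\big), \]
\[ \mathcal{F}_{2k} = \sigma\big(\{(\tilde{\mathbf{Y}}_t^{(j)}, \mathbf{Y}_t^{(j)}) : 1 \le j \le m,\ 1 \le t \le k\}\big). \]
Let $\#_k^{(j)}$ be the number of copies of $\tilde{\mathbf{Y}}_k^{(j)}$ generated during the $k$th resampling stage. Define
\[ \epsilon_{2k-1}^{(j)} = (g_0\bar{w}_1 \cdots \bar{w}_{k-1})\left[\frac{f_k(\tilde{\mathbf{Y}}_k^{(j)})}{g_n(\tilde{\mathbf{Y}}_k^{(j)}, n-k)} - \frac{f_{k-1}(\mathbf{Y}_{k-1}^{(j)})}{w_{k-1}(\mathbf{Y}_{k-1}^{(j)})\, g_n(\mathbf{Y}_{k-1}^{(j)}, n-k+1)}\right], \quad 1 \le k \le n, \tag{2.9} \]
\[ \epsilon_{2k}^{(j)} = (\#_k^{(j)} - m w_k^{(j)})(g_0\bar{w}_1 \cdots \bar{w}_k)\,\frac{f_k(\tilde{\mathbf{Y}}_k^{(j)})}{w_k(\tilde{\mathbf{Y}}_k^{(j)})\, g_n(\tilde{\mathbf{Y}}_k^{(j)}, n-k)}, \quad 1 \le k \le n-1. \]
Then $\{(\epsilon_t^{(1)}, \dots, \epsilon_t^{(m)}) : 1 \le t \le 2n-1\}$ is a martingale difference sequence with respect to the filtration $\{\mathcal{F}_t, 1 \le t \le 2n-1\}$. Moreover,
\[ m(\hat\alpha_B - \alpha) = \sum_{t=1}^{2n-1}\sum_{j=1}^m \epsilon_t^{(j)}. \tag{2.10} \]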

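Proof. By (2.5),
\[ \tilde{E}\left[\frac{f_k(\tilde{\mathbf{Y}}_k^{(j)})}{g_n(\tilde{\mathbf{Y}}_k^{(j)}, n-k)}\,\Big|\,\mathcal{F}_{2k-2}\right] = E_Q\left[\frac{f_k(\tilde{\mathbf{Y}}_k^{(j)})}{g_n(\tilde{\mathbf{Y}}_k^{(j)}, n-k)}\,\Big|\,\mathbf{Y}_{k-1}^{(j)}\right] \tag{2.11} \]
\[ = \frac{E[f_k(\tilde{\mathbf{Y}}_k^{(j)})\,|\,\mathbf{Y}_{k-1}^{(j)}]}{w_{k-1}(\mathbf{Y}_{k-1}^{(j)})\, g_n(\mathbf{Y}_{k-1}^{(j)}, n-k+1)} = \frac{f_{k-1}(\mathbf{Y}_{k-1}^{(j)})}{w_{k-1}(\mathbf{Y}_{k-1}^{(j)})\, g_n(\mathbf{Y}_{k-1}^{(j)}, n-k+1)}. \]
Since $g_0\bar{w}_1 \cdots \bar{w}_{k-1}$ is measurable with respect to $\mathcal{F}_{2k-2}$, $\tilde{E}(\epsilon_{2k-1}^{(j)}|\mathcal{F}_{2k-2}) = 0$ by (2.11). Moreover, note that $\tilde{E}(\#_k^{(j)}|\mathcal{F}_{2k-1}) = m w_k^{(j)}$ and that $g_0\bar{w}_1 \cdots \bar{w}_k$ and $\tilde{\mathbf{Y}}_k^{(j)}$ are measurable with respect to $\mathcal{F}_{2k-1}$. Therefore $\tilde{E}(\epsilon_{2k}^{(j)}|\mathcal{F}_{2k-1}) = 0$.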

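Theorem 1. If (2.4) holds, then $\hat\alpha_B$ is unbiased and
\[ m\,\widetilde{\mathrm{Var}}(\hat\alpha_B) \le n(\gamma_n^4 + \gamma_n^6)(1 + \gamma_n^2/m)^{n-2}\,\alpha^2. \tag{2.12} \]
Hence the SISR estimate $\hat\alpha_B$ of $\alpha$ is $n\gamma_n^6$-efficient.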
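To see how the bound of Theorem 1 behaves when the number of particles grows in proportion to $n$, the following small sketch evaluates the factor $n(\gamma^4 + \gamma^6)(1 + \gamma^2/m)^{n-2}$, read here as the right-hand side of (2.12) divided by $\alpha^2$. The function name and the value $\gamma = 1.5$ are illustrative assumptions only.

```python
import math

def theorem1_relative_bound(n, gamma, m):
    """n * (gamma^4 + gamma^6) * (1 + gamma^2/m)^(n-2): bound on m*Var / alpha^2."""
    return n * (gamma ** 4 + gamma ** 6) * (1.0 + gamma ** 2 / m) ** (n - 2)

# With m = n particles, the implied bound on Var / alpha^2 stays bounded as n grows.
for n in (10, 100, 1000):
    m = n
    print(n, theorem1_relative_bound(n, 1.5, m) / m)
```

With $m$ proportional to $n$ the factor $(1 + \gamma^2/m)^{n-2}$ stays below $e^{\gamma^2 n/m}$, which is the mechanism behind linear efficiency when $\gamma_n$ is bounded.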

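Proof. Since $\{\epsilon_t^{(i)}\}$ is a martingale difference sequence by Lemma 1, it follows from (2.10) that $\hat\alpha_B$ is unbiased. Moreover, as shown in Example 1 of [9], all terms on the right-hand side of (2.10) are either uncorrelated or negatively correlated with each other, and therefore
\[ m^2\,\widetilde{\mathrm{Var}}(\hat\alpha_B) \le \sum_{t=1}^{2n-1}\sum_{j=1}^m \widetilde{\mathrm{Var}}(\epsilon_t^{(j)}). \tag{2.13} \]
Since $f_k(\mathbf{y}_k) = P(A_n|\mathbf{Y}_k = \mathbf{y}_k)$ with $A_n = \{\mathbf{Y}_n \in \Gamma\}$, $f_k(\mathbf{y}_k)/g_n(\mathbf{y}_k, n-k) \le \bar{c}_n$ by (2.4); moreover, $g_0 \le \alpha/\underline{c}_n$. Hence
\[ \frac{g_0 f_k(\mathbf{y}_k)}{g_n(\mathbf{y}_k, n-k)} \le \gamma_n\alpha. \tag{2.14} \]
By (2.6), (2.9) and (2.11),
\[ \widetilde{\mathrm{Var}}(\epsilon_{2k-1}^{(j)}) \le \tilde{E}\big[(\epsilon_{2k-1}^{(j)})^2\big] \le \gamma_n^2\alpha^2\,\tilde{E}(\bar{w}_1^2 \cdots \bar{w}_{k-1}^2). \tag{2.15} \]
Similarly, since $\widetilde{\mathrm{Var}}(\#_k^{(j)}|\mathcal{F}_{2k-1}) \le m w_k^{(j)}$ and $\sum_{j=1}^m w_k^{(j)} = 1$,
\[ \sum_{j=1}^m \widetilde{\mathrm{Var}}(\epsilon_{2k}^{(j)}) \le m\gamma_n^4\alpha^2\,\tilde{E}(\bar{w}_1^2 \cdots \bar{w}_k^2). \tag{2.16} \]
By (2.5), for $0 \le s \le k$,
\[ \prod_{\ell=s}^k q_{\ell+1}(y_{\ell+1}|\mathbf{y}_\ell) = \frac{\prod_{\ell=s}^k p_{\ell+1}(y_{\ell+1}|\mathbf{y}_\ell)\; g_n(\mathbf{y}_{k+1}, n-k-1)}{\prod_{\ell=s}^k w_\ell(\mathbf{y}_\ell)\; g_n(\mathbf{y}_s, n-s)}, \]
and therefore by (2.4),
\[ E_Q\Big[\prod_{\ell=s}^k w_\ell(\mathbf{Y}_\ell)\,\Big|\,\mathbf{Y}_s\Big] \le \gamma_n \quad\text{for } 0 \le s \le k. \tag{2.17} \]
Theorem 1 follows from (2.13)-(2.16) and Lemma 2 below.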

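Lemma 2. If (2.17) holds, then $\tilde{E}(\bar{w}_1^2 \cdots \bar{w}_k^2) \le \gamma_n^2(1 + m^{-1}\gamma_n^2)^{k-1}$.

Proof. By (2.6),
\[ \bar{w}_k^2 \le m^{-2}\sum_{u \ne v} w_k(\tilde{\mathbf{Y}}_k^{(u)})\,w_k(\tilde{\mathbf{Y}}_k^{(v)}) + m^{-1}\gamma_n^2. \]
Hence by the independence of $\tilde{\mathbf{Y}}_k^{(u)}$ and $\tilde{\mathbf{Y}}_k^{(v)}$ conditioned on $\mathcal{F}_{2k-2}$,
\[ \tilde{E}(\bar{w}_k^2|\mathcal{F}_{2k-2}) \le m^{-2}\sum_{u \ne v} E_Q[w_k(\mathbf{Y}_k)|\mathbf{Y}_{k-1}^{(u)}]\,E_Q[w_k(\mathbf{Y}_k)|\mathbf{Y}_{k-1}^{(v)}] + m^{-1}\gamma_n^2. \tag{2.18} \]
Since $\mathbf{Y}_{k-1}^{(u)}$ is sampled from $\tilde{\mathbf{Y}}_{k-1}^{(i)}$ with probability $w_{k-1}^{(i)} = w_{k-1}(\tilde{\mathbf{Y}}_{k-1}^{(i)})/(m\bar{w}_{k-1})$,
\[ \tilde{E}\big(E_Q[w_k(\mathbf{Y}_k)|\mathbf{Y}_{k-1}^{(u)}]\,\big|\,\mathcal{F}_{2k-3}\big) = \sum_{i=1}^m w_{k-1}^{(i)}\,E_Q[w_k(\mathbf{Y}_k)|\tilde{\mathbf{Y}}_{k-1}^{(i)}] \tag{2.19} \]
\[ = \frac{1}{m\bar{w}_{k-1}}\sum_{i=1}^m E_Q\Big[\prod_{\ell=k-1}^k w_\ell(\mathbf{Y}_\ell)\,\Big|\,\tilde{\mathbf{Y}}_{k-1}^{(i)}\Big]. \]
By (2.17)-(2.19),
\[ \tilde{E}(\bar{w}_1^2 \cdots \bar{w}_k^2|\mathcal{F}_{2k-3}) \le \bar{w}_1^2 \cdots \bar{w}_{k-2}^2\Big\{m^{-1}\sum_{i=1}^m E_Q\Big[\prod_{\ell=k-1}^k w_\ell(\mathbf{Y}_\ell)\,\Big|\,\tilde{\mathbf{Y}}_{k-1}^{(i)}\Big]\Big\}^2 + m^{-1}\gamma_n^2\,\tilde{E}(\bar{w}_1^2 \cdots \bar{w}_{k-1}^2|\mathcal{F}_{2k-3}) \]
\[ \le \bar{w}_1^2 \cdots \bar{w}_{k-2}^2\, m^{-2}\sum_{u \ne v} E_Q\Big[\prod_{\ell=k-1}^k w_\ell(\mathbf{Y}_\ell)\,\Big|\,\tilde{\mathbf{Y}}_{k-1}^{(u)}\Big]\,E_Q\Big[\prod_{\ell=k-1}^k w_\ell(\mathbf{Y}_\ell)\,\Big|\,\tilde{\mathbf{Y}}_{k-1}^{(v)}\Big] + m^{-1}\gamma_n^2\,\tilde{E}(\bar{w}_1^2 \cdots \bar{w}_{k-1}^2 + \bar{w}_1^2 \cdots \bar{w}_{k-2}^2|\mathcal{F}_{2k-3}). \]
Conditioning successively on $\mathcal{F}_{2k-4}, \mathcal{F}_{2k-5}, \dots$ then yields
\[ \tilde{E}(\bar{w}_1^2 \cdots \bar{w}_k^2) \le \Big\{E_Q\Big[\prod_{\ell=1}^k w_\ell(\mathbf{Y}_\ell)\Big]\Big\}^2 + m^{-1}\gamma_n^2\,\tilde{E}(\bar{w}_1^2 \cdots \bar{w}_{k-1}^2 + \cdots + \bar{w}_1^2) \]
\[ \le \gamma_n^2 + m^{-1}\gamma_n^2\,\tilde{E}(\bar{w}_1^2 \cdots \bar{w}_{k-1}^2 + \cdots + \bar{w}_1^2), \]
from which the desired conclusion follows by induction.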

3. SISR schemes via truncation and tilting for heavy-tailed random walks

Let $X, X_1, X_2, \dots$ be i.i.d. with a common distribution function $F$. Let $S_n = \sum_{k=1}^n X_k$ and $M_n = \max_{1 \le k \le n} X_k$. Let $\tau_b = \inf\{n : S_n \ge b\}$. Assume that
\[ \bar{F}(x)\,[= 1 - F(x)] = e^{-\Psi(x)}, \quad\text{with } \psi(x) = \Psi'(x) \to 0 \text{ as } x \to \infty. \tag{3.1} \]
Then $\Psi(x) = o(x)$ and $F$ is heavy-tailed, with density function
\[ f(x) = \psi(x)e^{-\Psi(x)}. \]
We use $\Psi$ to develop general SISR procedures for simulating the probabilities
\[ p = P(S_n \ge b), \qquad \alpha = P\big(\max_{1 \le j \le n} S_j \ge b\big) = P(\tau_b \le n). \tag{3.2} \]
These algorithms are shown to be linearly efficient in Section 4 as $b = b_n$ approaches $\infty$ with $n$, under certain conditions for which asymptotic approximations to $p$ and $\alpha$ have been developed. Unlike the SISR procedures in Section 2 that are based on (2.3) or its relaxation (2.4), the SISR procedures based on $\Psi$ do not make explicit use of the asymptotic approximations to $p$ and $\alpha$. On the other hand, these approximations guide the choice of importance measure and the truncation in the SISR procedure.

3.1. Truncation and tilting measures for evaluating p by SISR

To evaluate $p$, we express it as the sum of probabilities of two disjoint events
\[ A_1 = \{S_n \ge b,\ M_n \le c_b\}, \qquad A_2 = \{S_n \ge b,\ M_n > c_b\}, \tag{3.3} \]
for which the choice of $c_b$ ($\to \infty$ as $b \to \infty$) will be discussed in Theorem 2 and in Sections 4 and 5. Juneja [16] applied a similar decomposition in the special case of non-negative regularly varying random walks, and efficiency was achieved with $c_b = b$ and with fixed $n$. However, the rare events considered herein involve $n \to \infty$, which requires a more elaborate method to evaluate $P(A_1)$.

Let $\theta_b = \Psi(b)/b$, $\beta_b = \int_1^{c_b} x^{-2}\,dx$ ($\le 1$), $0 < r < 1$ and define the mixture density
\[ q(x) = r f(x) + \frac{1-r}{\beta_b x^2}\,I_{\{1 \le x \le c_b\}}. \tag{3.4} \]
Let $\hat{p}_1$ be the SISR estimate of $P(A_1)$, with importance density (3.4) and resampling weights
\[ w_k(X_k) = \frac{e^{\theta_b X_k} f(X_k)}{q(X_k)}\,I_{\{X_k \le c_b\}}. \tag{3.5} \]

Specifically, instead of using (2.5) to define $q_k(\cdot|\mathbf{y}_{k-1})$, we define $q_k(\cdot|\mathbf{y}_{k-1})$ by (3.4) for the importance sampling step at stage $k$ in the third paragraph of Section 2. Moreover, we now use (3.5) instead of (2.6) to define the resampling weights and perform resampling even at stage $n$. The counterpart of (2.8) now takes the simple form
\[ \hat{p}_1 = (\bar{w}_1 \cdots \bar{w}_n)\,m^{-1}\sum_{j=1}^m e^{-\theta_b S_n^{(j)}}\,I_{\{S_n^{(j)} \ge b\}}, \tag{3.6} \]
where $\bar{w}_k = m^{-1}\sum_{j=1}^m w_k(\tilde{X}_k^{(j)})$; see (2.3) and (2.4) of [9]. As in (2.4) and (2.5) of [9], define
\[ Z_k(\mathbf{x}_k) = \Big[\prod_{t=1}^k \frac{f(x_t)}{q(x_t)}\Big]P(A_1|\mathbf{x}_k), \qquad h_k(\mathbf{x}_k) = \prod_{t=1}^k \frac{\bar{w}_t}{w_t(x_t)}, \qquad w_k^{(j)} = \frac{w_k(\tilde{X}_k^{(j)})}{m\bar{w}_k}, \tag{3.7} \]
with $Z_0 = P(A_1)$ and $h_0 = 1$. Then (2.10) of [9] gives the martingale decomposition
\[ m[\hat{p}_1 - P(A_1)] = \sum_{t=1}^{2n}\epsilon_t, \tag{3.8} \]
where $\#_k^{(j)}$ is the number of copies of $\tilde{\mathbf{X}}_k^{(j)}$ in the $k$th resampling step and
\[ \epsilon_{2k-1} = \sum_{j=1}^m [Z_k(\tilde{\mathbf{X}}_k^{(j)}) - Z_{k-1}(\mathbf{X}_{k-1}^{(j)})]\,h_{k-1}(\mathbf{X}_{k-1}^{(j)}), \]
\[ \epsilon_{2k} = \sum_{j=1}^m (\#_k^{(j)} - m w_k^{(j)})\,Z_k(\tilde{\mathbf{X}}_k^{(j)})\,h_k(\tilde{\mathbf{X}}_k^{(j)}). \]
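The truncation-and-tilting scheme for $P(A_1)$ can be sketched end to end for one concrete increment law. The following is a minimal illustration under stated assumptions, not the authors' implementation: the increments are taken to be Pareto on $[1, \infty)$ with tail $x^{-a}$ (so the log-tail function is $a \log x$ and the tilting parameter is $a \log b / b$), and the function name `sisr_p1`, its arguments and the default mixture weight are hypothetical. Each stage samples from the mixture density (3.4), weights by (3.5), resamples, and the final estimate follows the form of (3.6).

```python
import math
import random

def sisr_p1(n, b, c_b, a, m, r=0.5, seed=0):
    """SISR estimate of P(S_n >= b, M_n <= c_b) for Pareto(a) increments on [1, inf)."""
    rng = random.Random(seed)
    theta = a * math.log(b) / b          # tilting parameter: (log-tail at b) / b
    beta = 1.0 - 1.0 / c_b               # integral of x^-2 over [1, c_b]

    def f(x):                            # Pareto density
        return a * x ** (-a - 1.0) if x >= 1.0 else 0.0

    def q(x):                            # mixture importance density, as in (3.4)
        extra = (1.0 - r) / (beta * x * x) if 1.0 <= x <= c_b else 0.0
        return r * f(x) + extra

    def draw_q():
        u = rng.random()
        if rng.random() < r:
            return (1.0 - u) ** (-1.0 / a)   # Pareto draw by inversion
        return 1.0 / (1.0 - u * beta)        # density x^-2 / beta on [1, c_b], by inversion

    sums = [0.0] * m
    prod_w_bar = 1.0
    for _ in range(n):
        xs = [draw_q() for _ in range(m)]
        # resampling weights, as in (3.5); zero if the increment exceeds the truncation level
        ws = [math.exp(theta * x) * f(x) / q(x) if x <= c_b else 0.0 for x in xs]
        w_bar = sum(ws) / m
        if w_bar == 0.0:
            return 0.0
        prod_w_bar *= w_bar
        proposed = [s + x for s, x in zip(sums, xs)]
        ancestors = rng.choices(range(m), weights=ws, k=m)   # resample at every stage
        sums = [proposed[i] for i in ancestors]
    # estimate of the form (3.6)
    return prod_w_bar * sum(math.exp(-theta * s) for s in sums if s >= b) / m

print(sisr_p1(n=5, b=50.0, c_b=20.0, a=2.0, m=2000, seed=1))
```

For $n = 1$ the target reduces to $P(b \le X \le c_b) = b^{-a} - c_b^{-a}$, which gives a simple check of the sketch.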

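Theorem 2. Let $\zeta_b = E_Q[w_1(X_1)]$. Suppose that one of the following conditions is satisfied:

(C) $\int_1^{c_b} \psi^2(x)\,x^2 e^{2[\theta_b x - \Psi(x)]}\,dx = O(1)$,

(C′) $\int_{-\infty}^{c_b} \psi(x)\,e^{2\theta_b x - \Psi(x)}\,dx = O(1)$

as $b \to \infty$. Then there exists a constant $K > 0$ such that for all large $b$,
\[ \mathrm{Var}(\hat{p}_1) \le \frac{Kn}{m}\,\zeta_b^{2n}\,e^{Kn/m}\,P^2\{X > b\}. \]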

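Proof. We shall show that
\[ P\{S_t \ge x,\ M_t \le c_b\} \le \zeta_b^t\,e^{-\theta_b x} \quad\text{for all } t \ge 1,\ x \in \mathbb{R}. \tag{3.9} \]
Let $G$ be the distribution function with density
\[ g(x) = \zeta_b^{-1}e^{\theta_b x}f(x)\,I_{\{x \le c_b\}}. \]
Let $E_G$ denote expectation under which $X_1, \dots, X_t$ are i.i.d. with distribution $G$. Then
\[ P\{S_t \ge x,\ M_t \le c_b\} = E_G\Big[\Big(\prod_{k=1}^t \frac{f(X_k)}{g(X_k)}\Big)I_{\{S_t \ge x\}}\Big] = \zeta_b^t\,E_G\big(e^{-\theta_b S_t}I_{\{S_t \ge x\}}\big), \]
and (3.9) indeed holds, because $e^{-\theta_b S_t} \le e^{-\theta_b x}$ on $\{S_t \ge x\}$.

In the martingale decomposition (3.8), the summands are either uncorrelated or negatively correlated with each other, as shown in Example 1 of [9]. Therefore
\[ \tilde{E}[\hat{p}_1 - P(A_1)]^2 \le m^{-1}\sum_{k=1}^n \tilde{E}\big[Z_k^2(\tilde{\mathbf{X}}_k^{(1)})\,h_{k-1}^2(\mathbf{X}_{k-1}^{(1)})\big] + m^{-1}\sum_{k=1}^n \tilde{E}\big[(\#_k^{(1)} - m w_k^{(1)})^2 Z_k^2(\tilde{\mathbf{X}}_k^{(1)})\,h_k^2(\tilde{\mathbf{X}}_k^{(1)})\big]. \tag{3.10} \]
Let $s_k = x_1 + \cdots + x_k$. Since $P(A_1|\mathbf{x}_k) = P\{S_{n-k} \ge b - s_k,\ M_{n-k} \le c_b\}\,I_{\{\max(x_1, \dots, x_k) \le c_b\}}$, it follows from (3.5), (3.7) and (3.9) that
\[ \tilde{E}\big[Z_k^2(\tilde{\mathbf{X}}_k^{(1)})\,h_{k-1}^2(\mathbf{X}_{k-1}^{(1)})\,\big|\,\mathbf{X}_{k-1}^{(1)} = \mathbf{x}_{k-1}\big] \tag{3.11} \]
\[ = \bar{w}_1^2 \cdots \bar{w}_{k-1}^2\,E_Q\Big[\frac{f^2(X)\,e^{-2\theta_b s_{k-1}}\,P^2[A_1|\mathbf{X}_k = (\mathbf{x}_{k-1}, X)]}{q^2(X)}\Big] \]
\[ \le \bar{w}_1^2 \cdots \bar{w}_{k-1}^2\,\zeta_b^{2n-2k}\,e^{-2\theta_b b}\,E_Q\Big[\frac{f^2(X)\,e^{2\theta_b X}}{q^2(X)}\,I_{\{X \le c_b\}}\Big] = \bar{w}_1^2 \cdots \bar{w}_{k-1}^2\,\zeta_b^{2n-2k}\,e^{-2\theta_b b}\,E_Q[w_1^2(X_1)]. \]
By independence of the $X_k$ in (3.5),
\[ \tilde{E}(\bar{w}_1^2 \cdots \bar{w}_{k-1}^2) = [E_Q(\bar{w}_1^2)]^{k-1} = \Big\{\zeta_b^2 + \frac{\mathrm{Var}_Q[w_1(X_1)]}{m}\Big\}^{k-1} \le \zeta_b^{2k-2}\exp\Big\{\frac{(k-1)\,E_Q[w_1^2(X_1)]}{m\,\zeta_b^2}\Big\}. \tag{3.12} \]
Since $c_b \to \infty$ as $b \to \infty$, $\zeta_b \ge 1 + o(1)$. Moreover, $e^{-2\theta_b b} = P^2\{X > b\}$. Hence it follows from (3.11), (3.12) and Lemma 3 below that there exists $K_1 > 0$ such that
\[ m^{-1}\sum_{k=1}^n \tilde{E}\big[Z_k^2(\tilde{\mathbf{X}}_k^{(1)})\,h_{k-1}^2(\mathbf{X}_{k-1}^{(1)})\big] \le \frac{K_1 n}{m}\,\zeta_b^{2n}\exp\Big(\frac{K_1 n}{m}\Big)P^2\{X > b\}. \tag{3.13} \]
By (3.9),
\[ Z_k^2(\tilde{\mathbf{X}}_k^{(j)})\,h_k^2(\tilde{\mathbf{X}}_k^{(j)}) = \bar{w}_1^2 \cdots \bar{w}_k^2\,e^{-2\theta_b\tilde{S}_k^{(j)}}\,P^2(A_1|\tilde{\mathbf{X}}_k^{(j)}) \le \bar{w}_1^2 \cdots \bar{w}_k^2\,\zeta_b^{2n-2k}\,e^{-2\theta_b b}. \tag{3.14} \]
Since $\widetilde{\mathrm{Var}}(\#_k^{(j)}|\mathcal{F}_{2k-1}) \le m w_k^{(j)}$ and $\sum_{j=1}^m w_k^{(j)} = 1$, by (3.14),
\[ \tilde{E}\big[(\#_k^{(1)} - m w_k^{(1)})^2 Z_k^2(\tilde{\mathbf{X}}_k^{(1)})\,h_k^2(\tilde{\mathbf{X}}_k^{(1)})\big] = m^{-1}\sum_{j=1}^m \tilde{E}\big[(\#_k^{(j)} - m w_k^{(j)})^2 Z_k^2(\tilde{\mathbf{X}}_k^{(j)})\,h_k^2(\tilde{\mathbf{X}}_k^{(j)})\big] \tag{3.15} \]
\[ \le \zeta_b^{2n-2k}\,e^{-2\theta_b b}\,\tilde{E}\Big[\Big(\sum_{j=1}^m w_k^{(j)}\Big)\bar{w}_1^2 \cdots \bar{w}_k^2\Big] = \zeta_b^{2n-2k}\,e^{-2\theta_b b}\,\tilde{E}(\bar{w}_1^2 \cdots \bar{w}_k^2). \]
Combining (3.12) with (3.15) and applying (3.13), we then obtain Theorem 2 from (3.10).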

Lemma 3. Under the assumptions of Theorem 2,
\[ E_Q[w_1^2(X_1)] = \cdots = E_Q[w_n^2(X_n)] = O(1) \quad\text{as } b \to \infty. \tag{3.16} \]

Proof. First assume (C). Then
\[ E_Q[w_1^2(X_1)] = \int_{-\infty}^{c_b} \frac{e^{2\theta_b x}f^2(x)}{q(x)}\,dx \le \frac{e^{2\theta_b}}{r}\int_{-\infty}^1 f(x)\,dx + \frac{1}{1-r}\int_1^{c_b} \psi^2(x)\,x^2 e^{2[\theta_b x - \Psi(x)]}\,dx. \]
As $b \to \infty$, $\theta_b = \Psi(b)/b \to 0$ and therefore the first summand in the above inequality converges to $F(1)/r$. Moreover, by (C), the integral in the second summand is $O(1)$, proving (3.16) in this case.

Next assume (C′). Since $E_Q[w_1^2(X_1)] \le r^{-1}\int_{-\infty}^{c_b} e^{2\theta_b x}f(x)\,dx$ and $f(x) = \psi(x)e^{-\Psi(x)}$, (3.16) follows similarly. In fact, under (C′), (3.16) holds even when $q$ is the original density $f$. Therefore, if (C′) holds, the conclusion of Theorem 2 remains valid with $q = f$.

We next evaluate $P(A_2)$ by using importance sampling that draws $\mathbf{X}_n$ from a measure $\tilde{Q}$ for which
\[ \frac{d\tilde{Q}}{dP}(\mathbf{X}_n) = \frac{\#\{i : X_i > c_b\}}{nP(X > c_b)} \quad\text{on } \{M_n > c_b\}. \tag{3.17} \]
Letting $F(x|X > c) = P\{c < X \le x\}/P\{X > c\}$, we carry out $m$ simulation runs, each using the following procedure:

1. Choose an index $k \in \{1, \dots, n\}$ at random.

2. Generate $X_k \sim F(\cdot|X > c_b)$ and $X_i \sim F$ for $i \ne k$.

This sampling procedure indeed draws from the measure $\tilde{Q}$ as the factor $\#\{i : X_i > c_b\}$ in the likelihood ratio (3.17) corresponds to assigning equal probability to each component $X_i$ of $\mathbf{X}_n$ that exceeds $c_b$ to be the maximum $M_n$ on $\{M_n > c_b\}$. We estimate $P(A_2)$ by the average $\hat{p}_2$ of the $m$ independent realizations of
\[ \frac{nP(X > c_b)}{\#\{i : X_i > c_b\}}\,I_{\{S_n \ge b\}} \tag{3.18} \]
given by the $m$ simulation runs. Note that $\hat{p}_2$ is an importance sampling estimate and is therefore unbiased. Since the denominator in (3.18) is at least 1 under the measure $\tilde{Q}$, each realization of (3.18) is at most $nP(X > c_b)$, yielding the variance bound
\[ \mathrm{Var}(\hat{p}_2) \le n^2P^2\{X > c_b\}/m. \tag{3.19} \]
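The two-step sampling procedure for the large-jump event and the estimate based on (3.18) can be sketched as follows. This is an illustration under stated assumptions: the increments are taken to be Pareto on $[1, \infty)$ with tail $x^{-a}$ (for which the conditional law given an exceedance of $c$ is $c$ times a Pareto variable), the truncation level is assumed to be at least 1, and the function name `p2_estimate` and its arguments are hypothetical.

```python
import random

def p2_estimate(n, b, c_b, a, m, seed=0):
    """Importance sampling estimate of P(S_n >= b, M_n > c_b) for Pareto(a) increments."""
    rng = random.Random(seed)
    tail = c_b ** (-a)                   # P(X > c_b), valid for c_b >= 1
    total = 0.0
    for _ in range(m):
        k = rng.randrange(n)             # step 1: index chosen uniformly at random
        xs = [(1.0 - rng.random()) ** (-1.0 / a) for _ in range(n)]   # X_i ~ F
        xs[k] = c_b * (1.0 - rng.random()) ** (-1.0 / a)              # X_k ~ F(. | X > c_b)
        if sum(xs) >= b:
            # guard against the measure-zero case X_k == c_b after rounding
            exceed = max(1, sum(1 for x in xs if x > c_b))
            total += n * tail / exceed   # realization of (3.18)
    return total / m

print(p2_estimate(n=10, b=100.0, c_b=40.0, a=2.0, m=5000, seed=1))
```

Each realization is at most $n$ times the exceedance probability of the truncation level, which is the boundedness that the variance bound (3.19) relies on.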

3.2. Truncations and tilting measures for SISR estimates of $\alpha$

We are interested here in Monte Carlo evaluation of $P\{\max_{1 \le j \le n} S_j \ge b\}$ as $b, n \to \infty$, when $E(X) \le 0$. It is technically easier to consider the equivalent case of evaluating $P\{S_j \ge b + ja \text{ for some } 1 \le j \le n\}$ in the case where $E(X) = 0$ and $a \ge 0$. More generally, consider the evaluation of $P\{\tau_b \le n\}$, where $\tau_b = \inf\{j : S_j \ge b(j)\}$ and $b(j)$ is monotone increasing, e.g., $b(j) = b + ja$. Let $c_b$ be monotone increasing in $b$, $\tilde{n}_i = \min\{j : b(j) \ge 2^i\}$ and $n_i = \min(\tilde{n}_i, n)$. Let
\[ \mathcal{A}_{1,i} = \{n_i \le \tau_b < n_{i+1},\ X_k \le c_{b(k)} \text{ for all } 1 \le k \le \tau_b\}, \tag{3.20} \]
\[ \mathcal{A}_2 = \{\tau_b \le n,\ X_k > c_{b(k)} \text{ for some } 1 \le k \le \tau_b\}. \]
Let $\theta_i = \Psi(2^i)/2^i$. Let $\hat\alpha_{1,i}$ be the SISR estimate of $P(\mathcal{A}_{1,i})$, with importance density for $X_k$ of the form
\[ q_k(x) = r f(x) + \frac{1-r}{\beta_{b(k)}\,x^2}\,I_{\{1 \le x \le c_{b(k)}\}}, \tag{3.21} \]
and with resampling weights
\[ w_{k,i}(X_k) = \begin{cases} e^{\theta_i X_k}\,\dfrac{f(X_k)}{q_k(X_k)}\,I_{\{X_k \le c_{b(k)}\}} & \text{for } 1 \le k \le \tau_b, \\ 1 & \text{otherwise.} \end{cases} \tag{3.22} \]
Note the similarity between (3.4)-(3.5) and (3.21)-(3.22). In fact, the latter just replaces $c_b$, $\theta_b$ and $q$ in (3.4)-(3.5) by $c_{b(k)}$, $\theta_i$ and $q_k$. Using an argument similar to the proof of Theorem 2, we can extend (3.6) to obtain a similar variance bound for $\hat\alpha_{1,i}$ in the following.

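Theorem 3. Let $\zeta_i = \max\{1, \int_{-\infty}^{c_{2^{i+1}}} e^{\theta_i x}f(x)\,dx\}$. Suppose $\hat\alpha_{1,i}$ is based on $m_i$ SISR samples. Suppose one of the following conditions is satisfied:

(A) $\int_1^{c_{2^{i+1}}} \psi^2(x)\,x^2 e^{2[\theta_i x - \Psi(x)]}\,dx = O(1)$,

(A′) $\int_{-\infty}^{c_{2^{i+1}}} \psi(x)\,e^{2\theta_i x - \Psi(x)}\,dx = O(1)$,

as $i \to \infty$. Then there exists a constant $K > 0$ such that for all large $i$,
\[ \mathrm{Var}(\hat\alpha_{1,i}) \le \frac{Kn_{i+1}}{m_i}\,(\zeta_i)^{n_{i+1}}\,e^{Kn_{i+1}/m_i}\,P^2\{X > 2^i\}. \]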

To evaluate $P(\mathcal{A}_2)$, we perform $m$ simulations, each using the following procedure:

1. Choose an index $k \in \{1, \dots, n\}$ with probability $\bar{F}(c_{b(k)})/\sum_{j=1}^n \bar{F}(c_{b(j)})$.

2. Generate $X_k \sim F(\cdot|X > c_{b(k)})$ and $X_j \sim F$ for $j \ne k$.

We estimate $P(\mathcal{A}_2)$ by the average $\hat\alpha_2$ of $m$ independent realizations of
\[ \frac{\big[\sum_{k=1}^n \bar{F}(c_{b(k)})\big]\,I_{\mathcal{A}_2}}{\#\{k : X_k > c_{b(k)}\}} \tag{3.23} \]
given by the $m$ simulation runs. Analogous to (3.19), we have the following variance bound for $\hat\alpha_2$.

Lemma 4. Suppose $\bar{F}(c_{b(k)}) = O(\bar{F}(b(k)))$ as $b \to \infty$, uniformly in $1 \le k \le n$. Then
\[ m\,\mathrm{Var}(\hat\alpha_2) \le \Big[\sum_{k=1}^n \bar{F}(c_{b(k)})\Big]^2 = O\Big(\Big[\sum_{k=1}^n \bar{F}(b(k))\Big]^2\Big). \]

4. Efficiency of SISR schemes

In this section, we apply the bounds in Theorems 1-3 to show that the above SISR procedures give efficient estimates of $p$ and $\alpha$ when we have asymptotic lower bounds to these quantities for certain classes of heavy-tailed random walks. Except for the second paragraph of Section 4.1 that considers the SISR procedures in Section 2, the efficiency results are for the SISR procedures developed in Section 3.

4.1. Regularly varying tails

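We say that a distribution function $F$ is regularly varying with index $\rho > 0$ if
\[ \bar{F}(x) \sim x^{-\rho}L(x) \quad\text{as } x \to \infty, \tag{4.1} \]
for some slowly varying function $L$, that is, $\lim_{x \to \infty} L(tx)/L(x) = 1$ for all $t > 0$. Suppose $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2 < \infty$. Let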

g

(b, n) = n

F(b (n 1))I

{bn

n}

+

_

b n

n

_

, (4.2)

in which denotes the standard normal distribution. Rozovskii [18] has shown that if

F is regularly varying then

PS

n

b g

By (4.3), (2.3) holds with g = g

only requires bounds is that g

n

can be chosen to be considerably simpler than g

. In

particular, we can discretize g

and dene

g

n

(Y

n

, n k) = g

(

i

, n k) if

i

b S

k

<

i+1

, (4.4)

where

2

= ,

1

= 0 and

i

=

i

for some > 1 and all i 0. Consider the

SISR procedure with importance density (2.5), in which $g_n$ is given by (4.4), and with resampling weights (2.7). From (4.1) and (4.3), it follows that (2.4) holds with $\underline{c}_n = \underline{c} < 1 < \bar{c} = \bar{c}_n$, and therefore by Theorem 1, the SISR estimate of $\alpha_n = P\{S_n \ge b\}$ is linearly efficient, noting that $\gamma_n = \bar{c}/\underline{c}$ in this case.

In Section 3.1 we have proposed an alternative SISR procedure that involves a truncation scheme and established in Theorem 2 and (3.19) upper bounds for $\mathrm{Var}(\hat{p}_1) + \mathrm{Var}(\hat{p}_2)$, which can be used to prove linear efficiency of the procedure, in the case of $b$ being some power of $n$. This is the content of the following corollary, which gives a stronger result than linear efficiency.

Corollary 1. Assume (4.1) and that there exists J > 0 for which

(x) =

Assume that for some 0 < < with 2, n = O(b

/(logb)

) and E(X

< .

For the case > 1, also assume EX = 0. Then the estimate p

1

+ p

2

of p is linearly

ecient if c

b

= b for some 0 < < min

,

1

2

. In fact,

\[ \mathrm{Var}(\hat{p}_1 + \hat{p}_2) = O(p^2/m)\ [= o(p^2)] \quad\text{when } \liminf (m/n) > 0. \tag{4.6} \]

Proof. Recall that f(x) = (x)e

(x)

is the density of X. With

b

dened in

Theorem 2, we shall show that

b

=

_

b

e

bx

f(x) dx 1 + O(

b

) = 1 + O(n

1

), (4.7)

_

b

e

2bx

f(x) dx = O(1), (4.8)

i.e, (C

2n

b

= O(1). Moreover, it will be shown that

nPX b = O(PS

n

b). (4.9)

Since PX > b = O(PX > b) by (4.1), Corollary 1 follows from Theorem 2, (3.19)

and (4.9).

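To prove (4.9), note that in the case $\rho > 2$, $\mathrm{Var}(X) < \infty$ and (4.9) follows from (4.3). For the case $\rho \le 2$, we use the inclusion-exclusion principle to obtain
\[ P\{S_n \ge b\} \ge P\Big(\bigcup_{i=1}^n B_i\Big) \quad\text{where } B_i = \{X_i \ge 2b,\ S_n - X_i \ge -b\} \tag{4.10} \]
\[ \ge nP\{S_{n-1} \ge -b\}\,P\{X \ge 2b\} - n^2P^2\{X \ge 2b\}. \]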


Note that nPX b 0 under (4.1) and n = O(b

/(log b)

case 1, PS

n1

< b E(S

n1

)

/b

nE(X

/b

2, (4.1) and the assumption E(X

< . Therefore by

the Marcinkiewicz-Zygmund law of large numbers [12, p.125], S

n

= o(n

1/

) a.s. and

hence PS

n1

< b 0 as n

1/

= o(b). Since PX 2b 2

PX b by (4.1),

(4.9) follows from (4.10).

We next prove (4.7) and (4.8) when 0 < 1. Since e

x

1 + 2x

for 0 < x 1

and e

x

1 for x 0,

b

1 + 2

b

_ 1

b

0

x

f(x) dx +

_

b

1

b

e

bx

f(x) dx. (4.11)

By (4.1), E(X

+

)

2

b

_ 1

b

0

x

f(x) dx = O(

b

). (4.12)

Let 0 < < 1. Since (x) logx, (x) logx for large x. Moreover

1

b

b

for

all large b and therefore by selecting

_

/,

e

(

2J

b

)

e

(

1

b

)

= O(e

log b

) = O(b

2

) = O(

b

). (4.13)

By (4.5) and (4.13),

_ 2J

b

1

b

e

bx

f(x) dx

_

2J

b

_

e

2J

sup

1

b

x

2J

b

[(x)e

(x)

] = O(

b

). (4.14)

Integration by parts yields

_

b

2J

b

e

bx

(x)e

(x)

dx e

2J(

2J

b

)

+

b

_

b

2J

b

e

bx(x)

dx. (4.15)

For x 2J/

b

, [

b

x (x)]

=

b

(x)

b

2

by (4.5) and therefore

b

x (x)

b

b (b) +

b

2

(x b) if x b. A change of variables y = x b then yields

b

_

b

2J

b

e

bx(x)

dx

b

e

b(b)(b)

_

0

e

yb/2

dy = O(e

(b)(b)

). (4.16)

By (4.1), (b) = (b) + O(1) and therefore (b) (b) = ( 1)(b) + O(1).

Since 1 <

(b)(b)

= O(b

) = O(

b

).

Combining this with (4.13)(4.16) yields

_

b

1

b

e

bx

f(x) dx = O(

b

). (4.17)


Substituting (4.12) and (4.17) into (4.11) proves (4.7). To prove (4.8), we make use of

the inequality

_

b

e

2bx

f(x) dx e

2J

_

J/b

f(x) dx +

_

b

J

b

e

2bx

f(x) dx. (4.18)

Since [2

b

x (x)]

b

for x

J

b

and

1

2

, it follows from integration by parts

and the bounds in (4.13)(4.17) that

_

b

J

b

e

2bx

(x)e

(x)

dx = e

2bx(x)

[

b

J

b

+ 2

b

_

b

J

b

e

2bx(x)

dx (4.19)

= O(

b

) + O(e

(21)(b)

) = O(1).

By (4.18) and (4.19), (4.8) holds.

To prove (4.7) and (4.8) for the case 1 < 2, E(X) = 0 and E([X[

) < ,

we start with the bound e

x

1 + 2x

1

for 0 x 1 from which it follows by

integration that

e

x

1 + x + 2[x[

(4.20)

for 0 x 1. We next show that (4.20) in fact holds for all x 1, by noting that

LHS of (4.20) 1 whereas RHS of (4.20) 1 + [x[ for x 1, and that

RHS of (4.20) 1+x+2x

2

_

_

_

1 +x + x

2

+ x

4

+ x

6

+ e

x

for

1

2

x 0,

= 2(x +

1

4

)

2

+

7

8

e

x

for 1 x

1

2

.

It follows from (4.20) that

b

1 +

b

_ 1

xf(x) dx + 2

b

_ 1

f(x) dx +

_

b

1

b

e

bx

f(x) dx (4.21)

1 + O(

b

),

since E(X) = 0 implies that

b

_

1

xf(x) dx 0, E([X[

2

b

_

1

f(x) dx = O(

b

), and (4.14)(4.16) can still be applied to show that (4.17)

holds. Using arguments similar to (4.18) and (4.19), we can prove (4.8) in this case.

Similarly, we can prove the following analog of Corollary 1 for the SISR algorithm in Section 3.2 to simulate $\alpha$.

Corollary 2. Assume (4.1) with > 1 and (4.5). Suppose n = O(b

/(logb)

),

E(X) = 0 and E(X

< for some 1 < < with 2. Let b(j) = b for all


1 j n, and suppose 2

i

b < 2

i+1

. Assign all m simulations to evaluate P(/

1,i

).

Then

1

+

2

is linearly ecient when c

b

= b for some 0 < <

1

2

min

,

1

2

. In

fact,

Var(

1

+

2

) = O(

2

/m) when liminf(m/n) > 0. (4.22)

Proof. By Theorem 3 and Lemma 4,

Var(

1

) = O

_

n

m

P

2

X > 2

i

_

= O

_

n

m

P

2

X > b

_

when liminf(m/n) > 0,

Var(

2

) = O

_

n

2

m

P

2

X > b

_

.

By (4.9), nPX > b = O(PS

n

b) = O() and therefore (4.22) holds.

Corollary 3. Assume (4.1) with > 1 and (4.5). Suppose E(X) = 0 and E(X

<

for some 1 < < . Let b(j) = b + ja for some a > 0 and let (k) = log

2

b(k)|,

where log

2

denote logarithm to base 2. Assign m

i

simulation runs for the estimation

of P(/

1,i

) such that

m

i

m[i (1) + 1]

2

_

(n)(1)+1

=1

2

uniformly over (1) i (n) as b ,

(4.23)

where m =

(n)

i=(1)

m

i

is the total number of simulation runs. Let

1

=

(n)

i=(1)

1,i

.

Then the estimate

1

+

2

is n(log

2

n)

2

-ecient if c

b

= b for some 0 < <

1

2

min

,

1

2

. In fact,

Var(

1

+

2

) = m

1

O(

2

)[= o(

2

)] whenever liminf(m/[n(log

2

n)

2

]) > 0.

Proof. We can proceed as the proofs of (4.7) and (4.8) to show that

i

1 +O(

i

)

and (A

liminf

_

m

nlog

2

n

_

> 0 liminf

_

inf

(1)i(n)

m

n(i (1) + 1)

2

_

> 0,

we obtain from (4.23) that

liminf

_

inf

(1)i(n)

m

i

n

_

> 0. (4.24)

Since n

i+1

i

0, it then follows from Theorem 3 and (4.24) that

Var(

1,i

) = O

_

n

i+1

m

i

P

2

X > 2

i

_

uniformly over (1) i (n),


and hence by (4.23),

Var(

1

) = m

1

O

_

(n)

i=(1)

[i (1) + 1]

2

n

i+1

P

2

X > 2

i

_

. (4.25)

Since 2

i

b(j) 2

i+1

for n

i

j < n

i+1

and n

i+1

n

i

ni+1

2

, (4.1) implies that for

some positive constants C

1

and C

2

,

_

ni+11

j=ni

PX > b(j)

_

2

C

1

(n

i+1

n

i

)

2

P

2

X > 2

i

C

2

[i(1)+1]

2

n

i+1

P

2

X > 2

i

.

Putting this in (4.25) yields

Var(

1

) = m

1

O

__

n

j=1

PX > b(j)

_

2

_

= m

1

O(

2

);

see [15, Theorem 5.5(i)]. A similar bound can be derived for Var(

2

) by applying

Lemma 4, completing the proof of Corollary 3.

4.2. More general heavy-tailed distributions

A distribution function $F$ is said to be (right) heavy-tailed if $\int_{-\infty}^{\infty} e^{\lambda x} F(dx) = \infty$ for all $\lambda > 0$. It is said to be long-tailed if its support is not bounded above and for all fixed $a > 0$, $\bar F(x+a)/\bar F(x) \to 1$ as $x \to \infty$; see [15, Section 3.5]. To simulate $p = P\{S_n \ge b\}$, we have shown in Section 4.1 that the truncation method described in Section 3.1 is linearly efficient in the case of regularly varying tails. For other long-tailed distributions, such as the Weibull and log-normal distributions, some modification of the truncation method is needed for efficiency. It is based on representing $P\{S_n \ge b\}$ as a sum of four probabilities that can be evaluated by SISR or importance sampling.

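Let $c_b < b$, $V_n = \#\{k : c_b < X_k \le b\}$,

$$\mathcal{A}_1 = \{S_n \ge b,\ M_n \le c_b\}, \qquad \mathcal{A}_2 = \{S_n \ge b,\ M_n > b\},$$

$$\mathcal{A}_3 = \{S_n \ge b,\ V_n = 1,\ M_n \le b\}, \qquad \mathcal{A}_4 = \{S_n \ge b,\ V_n \ge 2,\ M_n \le b\}.$$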
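The four events are disjoint and together cover $\{S_n \ge b\}$. The following Python sketch makes this concrete by assigning a path to exactly one event; the function name `classify_path` and its argument names are illustrative and do not come from the paper.

```python
def classify_path(xs, b, c_b):
    """Return the index i of the event A_i containing the path, or None if S_n < b.

    xs  : increments X_1, ..., X_n
    b   : threshold for the sum S_n
    c_b : truncation level, assumed to satisfy c_b < b
    """
    if sum(xs) < b:
        return None                              # outside {S_n >= b}
    m_n = max(xs)                                # M_n = max_k X_k
    v_n = sum(1 for x in xs if c_b < x <= b)     # V_n = #{k : c_b < X_k <= b}
    if m_n > b:
        return 2                                 # A_2: some increment exceeds b
    if v_n == 0:
        return 1                                 # A_1: M_n <= c_b
    return 3 if v_n == 1 else 4                  # A_3 or A_4, both with M_n <= b
```

For instance, with `b = 4` and `c_b = 2`, the path `[3, 1]` has one increment in `(2, 4]` and falls in the third event, while `[3, 3]` falls in the fourth.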

The Monte Carlo estimate $\hat p_1$ of $P(\mathcal{A}_1)$ is described in Section 3.1, using SISR with mixture density (3.4) and resampling weights (3.5). The Monte Carlo estimate $\hat p_2$ of $P(\mathcal{A}_2)$ uses the importance sampling scheme described in Section 3.1, with $b$ taking the place of $c_b$. To simulate $P(\mathcal{A}_3)$, we retain the simulation results $\{\mathbf{X}^{(j)}_{n-1} : 1 \le j \le m\}$ in $\hat p_1$ after the $(n-1)$th resampling step. The corresponding SISR estimate is

$$\hat p_3 = (\bar w_1 \cdots \bar w_{n-1})\, m^{-1} \sum_{j=1}^{m} n e^{-\theta_b S^{(j)}_{n-1}} \Big[\min\big(e^{-\Psi(b - S^{(j)}_{n-1})},\ e^{-\Psi(c_b)}\big) - e^{-\Psi(b)}\Big] I_{\{S^{(j)}_{n-1} \ge 0,\ M^{(j)}_{n-1} \le c_b\}}.$$

To evaluate $P(\mathcal{A}_4)$, we perform $m$ simulations such that for the $j$th simulation run, $k^{(j)}_1$ and $k^{(j)}_2$ are selected at random without replacement from $\{1, \dots, n\}$, and $X^{(j)}_k \sim F(\cdot \mid c_b < X \le b)$ for $k = k^{(j)}_1, k^{(j)}_2$, while $X^{(j)}_k \sim F$ for $k \ne k^{(j)}_1, k^{(j)}_2$. The Monte Carlo estimate of $P(\mathcal{A}_4)$ is

$$\hat p_4 = [\bar F(c_b) - \bar F(b)]^2 \binom{n}{2} m^{-1} \sum_{j=1}^{m} \binom{V^{(j)}_n}{2}^{-1} I_{\{S^{(j)}_n \ge b,\ M^{(j)}_n \le b\}}.$$
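The sampling scheme for the fourth event can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the increments are taken to be uncentered Weibull with tail $e^{-x^{\gamma}}$ on $x > 0$, the conditional draws use inverse-transform sampling, and the names `weibull_tail`, `sample_conditional` and `p4_estimate` are hypothetical.

```python
import math
import random

def weibull_tail(x, gamma):
    """Assumed tail P(X > x) = exp(-x**gamma) for x > 0 (uncentered Weibull)."""
    return 1.0 if x <= 0.0 else math.exp(-x ** gamma)

def sample_conditional(rng, lo, hi, gamma):
    """Inverse-transform draw from F(. | lo < X <= hi) for the tail above."""
    u = rng.uniform(weibull_tail(hi, gamma), weibull_tail(lo, gamma))
    return (-math.log(u)) ** (1.0 / gamma)

def p4_estimate(n, b, c_b, gamma, m, seed=0):
    """Importance sampling estimate of P(S_n >= b, V_n >= 2, M_n <= b)."""
    rng = random.Random(seed)
    mass = weibull_tail(c_b, gamma) - weibull_tail(b, gamma)  # P(c_b < X <= b)
    total = 0.0
    for _ in range(m):
        forced = set(rng.sample(range(n), 2))   # two indices, without replacement
        xs = [sample_conditional(rng, c_b, b, gamma) if k in forced
              else rng.weibullvariate(1.0, gamma)   # scale 1, shape gamma
              for k in range(n)]
        v = sum(1 for x in xs if c_b < x <= b)      # V_n for this path
        if v >= 2 and sum(xs) >= b and max(xs) <= b:
            total += 1.0 / math.comb(v, 2)          # inverse of binom(V_n, 2)
    return mass ** 2 * math.comb(n, 2) * total / m
```

When `n = 2` and `c_b = b / 2`, both increments lie in `(b/2, b]`, so the indicator is always 1 and the estimate reduces to `mass ** 2`, which gives a simple check of the weighting.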

Theorem 4. The Monte Carlo estimate p

i

of P(/

i

) is unbiased for i = 1, 2, 3, 4. Let

b

= min

0xbcb

[

b

x+(bx)]. Assume either (C) or (C

such that

mVar( p

1

) Kn

2n

b

e

Kn/m

P

2

X > b, mVar( p

2

) n

2

F

2

(b), (4.26)

mVar( p

3

) K(n 1)

2n2

b

e

K(n1)/m

n

2

e

2b

, mVar( p

4

) n

4

F

4

(c

b

).

Proof. As noted above, $\hat p_1$ is an SISR estimate of $P(\mathcal{A}_1)$ and $\hat p_2$ is an importance sampling estimate of $P(\mathcal{A}_2)$. By exchangeability,

$$P(\mathcal{A}_3) = nP\{S_n \ge b,\ M_{n-1} \le c_b,\ c_b < X_n \le b\}. \tag{4.27}$$

Let $\mathcal{A} = \{S_n \ge b,\ c_b < X_n \le b\}$. In view of (4.27), $P(\mathcal{A}_3)$ can be evaluated by Monte Carlo using the SISR estimate

$$n \bar w_1 \cdots \bar w_{n-1}\, m^{-1} \sum_{j=1}^{m} e^{-\theta_b S^{(j)}_{n-1}} P(\mathcal{A} \mid S^{(j)}_{n-1})\, I_{\{M^{(j)}_{n-1} \le c_b\}}. \tag{4.28}$$

Note that $P(\mathcal{A} \mid S^{(j)}_{n-1}) = 0$ if $S^{(j)}_{n-1} < 0$. For $s > 0$,

$$P(\mathcal{A} \mid S^{(j)}_{n-1} = s) = \begin{cases} P\{b - s \le X_n \le b\} = e^{-\Psi(b-s)} - e^{-\Psi(b)} & \text{if } b - s > c_b, \\ P\{c_b < X_n \le b\} = e^{-\Psi(c_b)} - e^{-\Psi(b)} & \text{if } b - s \le c_b. \end{cases}$$

Hence $\hat p_3$ is the same as the SISR estimate (4.28) of $P(\mathcal{A}_3)$ and is therefore unbiased. The estimate $\hat p_4$ is also unbiased. In fact, it is an importance sampling estimate that draws $\mathbf{X}_n$ from a measure $Q$ for which

$$\frac{dQ}{dP}(\mathbf{X}_n) = \binom{V_n}{2} \Big/ \left\{ \binom{n}{2} P^2(c_b < X \le b) \right\} \quad \text{on } \{V_n \ge 2,\ M_n \le b\},$$

which is an extension of (3.17) to the present problem.

We next prove the variance bounds (4.26) for the unbiased estimates p

3

and p

4

;

those for p

1

and p

2

have already been shown in Section 3.1. Consider the martingale

decomposition m[ p

3

P(/

3

)] =

2(n1)

t=1

t

, where

t

is given in the display after (3.8)

with

Z

k

(x

k

) =

_

k

t=1

f(x

t

)

q(x

t

)

_

nPS

n

b, M

n1

c

b

, c

b

< X

n

b[X

k

= x

k

(4.29)

in view of (4.27), noting that $\hat p_3$ is based on the simulations used in $\hat p_1$ up to the $(n-1)$th resampling step. The change-of-measure argument used to prove (3.9) can be modified to show that for all $t \ge 1$ and $x \in \mathbb{R}$,

PS

t

x, M

t1

c

b

, c

b

< X

t

b

t1

b

e

b(bx)

max

0ybcb

e

by(by)

. (4.30)

Making use of (4.29) and (4.30), we can proceed as in the proof of Theorem 2 to

prove the upper bound for Var( p

3

) in (4.26). The bound for Var( p

4

) follows from

(

F(c

b

)

F(b))

2

_

n

2

_

n

2

F

2

(c

b

), thus completing the proof of Theorem 4.

The following corollary of Theorem 4 establishes linear efficiency of the Monte Carlo method to evaluate $P\{S_n \ge b\}$ for heavy-tailed distributions satisfying certain assumptions. Examples 1 and 2 in Section 5.1 show that these assumptions are satisfied in particular by Weibull and log-normal $X$.

Corollary 4. Let X be heavy-tailed with E(X) = 0, Var(X) < and let n =

O(b

2

/

2

(b)). Assume either (C) or (C

). If

b

(x) for all c

b

x b,

b

= 1+O(

2

b

)

and

be

2(cb)

= O(e

(b)

), (4.31)

then

4

i=1

p

i

is linearly ecient for estimating PS

n

b.

Proof. Since n = o(b

2

), nPX > b 0 by Chebyshevs inequality. Therefore it

follows from the inclusion-exclusion principle and the central limit theorem that

PS

n

b nPS

n1

0PX > b n

2

P

2

X > b


[1 + o(1)]nPX > b/2.

Hence it suces to show that for any > 0, there exists m = O(n) such that

Var( p

i

) n

2

F

2

(b) for 1 i 4. (4.32)

We shall assume liminf m/n > 0. Since nP(X > b) 0, (4.32) holds for i = 2.

Since n

2

b

= O(1) and

b

= 1 + O(

2

b

),

2n

b

= O(1) and (4.32) holds for i = 1. Since

[

b

x+(bx)]

=

b

(bx) 0 for all 0 x bc

b

, the minimum of

b

x+(bx)

over 0 x c

b

is attained at x = 0 and therefore

b

= (b), proving (4.32) for i = 3.

Finally, by (4.31), n

3

F

4

(c

b

) = O(n

3

e

2(b)

/b

2

) = O(n

2

e

2(b)

), proving (4.32) for

i = 4.

5. Examples and discussion

In this concluding section, we first give examples of heavy-tailed distributions satisfying the assumptions of Corollary 4. We also give numerical examples to illustrate the performance of the proposed Monte Carlo methods. In this connection we describe in Section 5.2 some implementation details such as the use of occasional resampling to speed up the SISR procedure and the estimation of standard errors for the SISR estimates of rare-event probabilities. Finally we discuss in Section 5.4 related works in the literature and compare our approach with importance sampling and IPS (interacting particle system) methods.

5.1. Weibull and log-normal increments

Example 1. (Weibull.) A long-tailed distribution is Weibull if (x) = x

I

{x>0}

for

some 0 < < 1. Let Y F where

F(x) = e

(x)

and let X = Y EY . Then

PX > x = e

(x+)

= exp((x +)

x > ,

(x+) = (x+)

1

. Therefore

b

= (b+)/b b

1

and

(x+)

b

for all b/2 x b when b is suciently large, noting that 2

1

< 1 for 0 < < 1.

Let c

b

= b/2 and n = O(b

2(1)

). It is easy to check that (4.31) holds. By (4.20) with

= 2 and (4.21),

b

1 +O(

2

b

) +

_

b/2

1/b

f(x)e

bx

dx 1 +O(

2

b

),

n       Direct Method                       Truncation Method

10      $(4.80 \pm 0.69) \times 10^{-4}$    $(5.02 \pm 0.04) \times 10^{-4}$

50      $0 \pm 0$                           $(8.78 \pm 0.06) \times 10^{-7}$

100     $0 \pm 0$                           $(2.61 \pm 0.02) \times 10^{-8}$

500     $0 \pm 0$                           $(1.27 \pm 0.01) \times 10^{-12}$

1000    $0 \pm 0$                           $(8.61 \pm 0.07) \times 10^{-15}$

Table 1: Monte Carlo estimates of $P\{S_n \ge (5 + \mu)n\}$ for log-normal increments, with estimated standard errors (after the $\pm$ sign).

where f(x) = (x + )

1

exp((x + )

the range 2

b

x 1 and using the bound f(x) 1 for x 1,

_

b/2

1

x

2

e

2bx

f

2

(x) dx e

_

1/(2b)

1

x

2

f(x) dx + (b/2)

3

max

1

2

b

x

b

2

exp2[

b

x (x + )

],

(5.1)

in which the last term is an upper bound of

_

b/2

1/(2b)

x

2

e

2bx

f

2

(x) dx, noting that

f(x) exp((x+)

b

x(x+)

over

1

2b

x

b

2

is attained at

1

2b

and is equal to (

1

2

+o(1))b

(1)

, for all large b.

Since E(X

2

) < , (5.1) implies that (C) holds. Hence all the conditions of Corollary

4 hold in this case.

Example 2. (log-normal.) Let $\phi$ and $\Phi$ be the standard normal density and distribution functions, respectively. Let $X = e^{Z}$, where $Z$ is standard normal. Then $X$ is log-normal and has distribution function $F(x) = 1 - e^{-\Psi(x)}$, where $\Psi(x) = |\log \Phi(-\log x)|\, I_{\{x > 0\}}$. Since $\Phi(-z) \sim (2\pi z^2)^{-1/2} e^{-z^2/2}$ as $z \to \infty$, it follows that

$$\Psi(x) = (\log x)^2/2 + \log\log x + \log(2\pi)/2 + o(1) \quad \text{as } x \to \infty,$$

$$f(x)\,\big(= \psi(x) e^{-\Psi(x)}\big) = \frac{\phi(\log x)}{x} \ \Rightarrow\ \psi(x) \sim \frac{\log x}{x} \quad \text{as } x \to \infty.$$

Let $\mu = E(X) = E(e^{Z}) = \sqrt{e}$ and $p = P\{S_n \ge b + n\mu\}$, where $n = O(b^2/\Psi^2(b))$. Let $c_b = b/2$. By using arguments similar to those in Example 1, it can be shown that all the assumptions of Corollary 4 again hold in this case.
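The expansion of the log-normal tail exponent can be checked numerically. The sketch below compares the exact value $-\log P(X > x)$, computed with the complementary error function, against the three leading terms of the expansion; the function names are illustrative and not taken from the paper.

```python
import math

def log_tail_exact(x):
    """-log P(X > x) for X = exp(Z), Z standard normal, x > 1."""
    z = math.log(x)
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))   # P(Z > log x)
    return -math.log(tail)

def log_tail_asymptotic(x):
    """Leading terms (log x)^2/2 + log log x + log(2*pi)/2 of the expansion."""
    z = math.log(x)
    return 0.5 * z * z + math.log(z) + 0.5 * math.log(2.0 * math.pi)

for z in (4.0, 6.0, 8.0):
    x = math.exp(z)
    # The gap is the o(1) remainder and should shrink as x grows.
    print(z, log_tail_exact(x) - log_tail_asymptotic(x))
```

The printed gaps are positive and decrease roughly like $1/(\log x)^2$, which is the size of the first correction term in the normal tail approximation.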

To illustrate the performance of the truncation method in Section 4.2 to estimate $p = P\{S_n \ge (5 + \mu)n\}$, which is shown to be linearly efficient in Corollary 4, we consider $n = 10, 50, 100, 500$ and $1000$ and use the procedure described in the next subsection to implement the SISR estimates $\hat p_1$ and $\hat p_3$ with 10,000 sample paths and the importance density (3.4) in which $r = 0.8$. Recall that $\hat p_3$ uses the SISR sample paths for $\hat p_1$ up to the $(n-1)$th resampling step. The importance sampling estimates $\hat p_2$ and $\hat p_4$ are each based on 100,000 simulations. For comparison, we also apply direct Monte Carlo with 100,000 runs to evaluate the probability. The results are given in Table 1, which shows about 300-fold variance reduction for a probability of order $10^{-4}$. For probabilities of order $10^{-7}$ or smaller, Table 1 shows that direct Monte Carlo is not feasible whereas the truncation method does not seem to deteriorate in performance.

5.2. Standard errors and occasional resampling

The SISR procedure carries out importance sampling sequentially within each simulated trajectory and performs resampling across the $m$ trajectories. Instead of implementing this procedure directly, we use the modification in [9, Section 3.3] to reduce computation time for resampling, which increases with $m$, and also to obtain standard error estimates easily. Dividing the $m$ sample paths into $r$ subgroups of size $\nu$ so that $m = r\nu$, we perform resampling within each subgroup of sample paths, independently of the other subgroups. This method also has the advantage of providing a direct estimate of the standard error of the Monte Carlo estimate $\hat\alpha := r^{-1} \sum_{i=1}^{r} \hat\alpha(i)$, where $\hat\alpha(i)$ denotes the SISR estimate of the rare-event probability based on the $i$th subgroup of simulated sample paths. Due to resampling, the SISR samples are no longer independent and one cannot use the conventional estimate of the standard error for Monte Carlo estimates. On the other hand, since the $r$ subgroups are independent and yield the independent estimates $\hat\alpha(1), \dots, \hat\alpha(r)$ of $\alpha$, we can estimate the standard error of $\hat\alpha$ by $\hat\sigma/\sqrt{r}$, where $\hat\sigma^2 = (r-1)^{-1} \sum_{i=1}^{r} (\hat\alpha(i) - \hat\alpha)^2$. In Example 2 above and Example 3 below, we use $\nu = r = 100$, corresponding to a total of $m = 10{,}000$ SISR sample paths.
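The standard error computation from independent subgroups is short enough to write out. The sketch below assumes the per-subgroup SISR estimates have already been computed and are passed in as a list; the function name `subgroup_standard_error` is illustrative.

```python
import math

def subgroup_standard_error(estimates):
    """Combine r independent subgroup estimates into a mean and its standard error.

    Returns (mean, se) where se = sigma_hat / sqrt(r) and sigma_hat**2 is the
    sample variance of the subgroup estimates with divisor r - 1.
    """
    r = len(estimates)
    if r < 2:
        raise ValueError("need at least two subgroups to estimate a standard error")
    mean = sum(estimates) / r
    sigma2 = sum((a - mean) ** 2 for a in estimates) / (r - 1)
    return mean, math.sqrt(sigma2 / r)
```

For example, subgroup estimates `[1.0, 2.0, 3.0]` give a mean of 2.0 and a standard error of `sqrt(1/3)`. Only independence between subgroups is used, so the dependence created by resampling within a subgroup does not matter here.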

An additional modification that can be used to further reduce the resampling task is to carry out resampling at stage $k$ only when the coefficient of variation (CV) of the resampling weights $w^{(j)}_k$ exceeds some threshold. As pointed out in [17], the purpose of resampling is to help prevent the weights $w^{(j)}_k$ from becoming heavily skewed (e.g., nearly degenerate) and the effective sample size for sequentially generated sample paths is $\nu/(1 + \mathrm{CV}^2)$. Therefore [17] recommends resampling when CV exceeds a threshold. Choosing the threshold to be 0 is tantamount to resampling at every step, and a good choice in many applications is in the range from 1 to 2.
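The CV rule can be sketched as follows. This is a minimal illustration that uses plain multinomial (bootstrap) resampling and resets every weight to the pre-resampling mean; both choices are assumptions of the sketch, and the function names are hypothetical.

```python
import math
import random

def coefficient_of_variation(weights):
    """CV of the weights: population standard deviation divided by the mean."""
    m = len(weights)
    mean = sum(weights) / m
    var = sum((w - mean) ** 2 for w in weights) / m
    return math.sqrt(var) / mean

def effective_sample_size(weights):
    """Effective sample size (number of paths) / (1 + CV^2)."""
    return len(weights) / (1.0 + coefficient_of_variation(weights) ** 2)

def resample_if_needed(paths, weights, threshold=2.0, rng=random):
    """Resample only when the CV of the weights exceeds the threshold.

    Returns (paths, weights, resampled_flag). After resampling, all weights are
    set to the mean of the old weights so that their total is preserved.
    """
    if coefficient_of_variation(weights) <= threshold:
        return paths, weights, False
    m = len(paths)
    new_paths = rng.choices(paths, weights=weights, k=m)
    mean_w = sum(weights) / m
    return new_paths, [mean_w] * m, True
```

With weights `[4, 0, 0, 0]` the CV is `sqrt(3)` and the effective sample size is 1, so a threshold of 1 triggers resampling while a threshold of 2 does not.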

5.3. Positive increments with regularly varying tails

Example 3. Let $X$ be the product of $Y$ and an independent Laplace(1) random variable, where $P(Y > x) = \min(x^{-4}, 1)$. Blanchet and Liu [8] in their Example 1 showed that $X$ has tail probability

$$\bar F(x) = 2x^{-4}\,[6 - e^{-x}(6 + 6x + 3x^2 + x^3)]. \tag{5.2}$$
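Formula (5.2) can be cross-checked by simulation for $x > 0$. The sketch below draws $Y$ from the Pareto law with tail $y^{-4}$ on $[1, \infty)$, builds a Laplace(1) variable as an exponential variable with a random sign, and compares the empirical tail of the product with the closed form; the function names are illustrative only.

```python
import math
import random

def tail_formula(x):
    """Right-hand side of (5.2), for x > 0."""
    poly = 6.0 + 6.0 * x + 3.0 * x ** 2 + x ** 3
    return 2.0 * x ** -4 * (6.0 - math.exp(-x) * poly)

def sample_x(rng):
    """Draw the product of Y (tail min(y**-4, 1)) and an independent Laplace(1)."""
    y = rng.paretovariate(4.0)                              # Pareto on [1, inf)
    lap = rng.expovariate(1.0) * rng.choice((-1.0, 1.0))    # symmetric Laplace(1)
    return y * lap

def empirical_tail(x, m, seed=0):
    """Fraction of m simulated values exceeding x."""
    rng = random.Random(seed)
    return sum(sample_x(rng) > x for _ in range(m)) / m
```

At $x = 2$ the closed form gives about 0.1072, and for large $x$ it behaves like $12x^{-4}$, which is the regularly varying tail with index 4 used in this example.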

Let $X, X_1, \dots, X_n$ be i.i.d. and $S_n = X_1 + \cdots + X_n$. In [8], $P\{S_n \ge n\}$ is simulated for $n = 100$, $500$ and $1000$ by using

(I) state-dependent importance sampling (IS) that approximates the h-transform,

(II) time-varying mixtures for IS introduced by Dupuis, Leder and Wang [14].

We compare their results in [8], each of which is based on 10,000 simulations, with those of 10,000 SISR sample paths generated by the following methods:

(III) SISR using (4.4) with

i

=

_

_

for i = 1,

bi

180

for 0 i 90,

b

2

+

b(i90)

20

for 91 i 100,

for i = 101,

(5.3)

and resampling conducted at every step,

(IV) SISR using (4.4) and (5.3) with resampling only when CV exceeds 2.

In addition, we also apply the truncation method in Section 3.1 with $c_b = 2b/5$, importance density (3.4) with $r = 0.9$ and resampling weights (3.5) in which $\theta_b = 4b^{-1}\log b$. For this truncation method, which is labeled Method V in Table 2, we use 10,000 SISR sample paths to estimate $P\{S_n \ge n,\ M_n \le 2n/5\}$ and 10,000 IS simulations to estimate $P\{S_n \ge n,\ M_n > 2n/5\}$. As shown in Table 2, the standard errors of (I) and (III)–(V) are comparable and are all smaller than that of (II) when $n = 500$ and $1000$, whereas for $n = 100$, the standard errors of (III)–(V) are substantially smaller than those of (I) and (II). Although Blanchet and Liu [8, Theorem 4] have shown (II) to be strongly efficient, their parametric mixtures are based on a single large jump since the effect of two or more large jumps is asymptotically negligible when the tail probability is of the order $10^{-7}$ or smaller. For larger tail probabilities, the effect of two or more jumps may be significant, and Table 2 shows that (V) can provide substantial improvement by taking this effect into consideration.

Method   n = 100                             n = 500                             n = 1000

I        $(2.37 \pm 0.23) \times 10^{-5}$    $(1.02 \pm 0.01) \times 10^{-7}$    $(1.23 \pm 0.01) \times 10^{-8}$

II       $(2.09 \pm 0.10) \times 10^{-5}$    $(1.11 \pm 0.04) \times 10^{-7}$    $(1.16 \pm 0.05) \times 10^{-8}$

III      $(2.21 \pm 0.06) \times 10^{-5}$    $(1.04 \pm 0.01) \times 10^{-7}$    $(1.25 \pm 0.01) \times 10^{-8}$

IV       $(2.26 \pm 0.03) \times 10^{-5}$    $(1.05 \pm 0.01) \times 10^{-7}$    $(1.24 \pm 0.01) \times 10^{-8}$

V        $(2.16 \pm 0.03) \times 10^{-5}$    $(1.05 \pm 0.02) \times 10^{-7}$    $(1.24 \pm 0.02) \times 10^{-8}$

Table 2: Monte Carlo estimate of $P\{S_n \ge n\}$ $\pm$ standard error.

5.4. Other methods, related works and discussion

Asmussen, Binswanger and Hojgaard [2] have introduced several methods for importance sampling of tail probabilities of sums of heavy-tailed random variables and shown that these importance sampling methods are strongly efficient for fixed $n$ as $b \to \infty$. One such method involves simulating i.i.d. $X_1, \dots, X_n$ from a distribution $H$ that has a heavier tail than $F$. This method cannot be extended to the case $n \to \infty$ because the likelihood ratio statistic has exponentially increasing variance with $n$. Noting that

$$P(S_n \ge b) = nE\{P[S_n \ge b,\ X_n \ge \max(X_1, \dots, X_{n-1}) \mid X_1, \dots, X_{n-1}]\},$$

Asmussen and Kroese [3] introduced the conditional Monte Carlo method that estimates $P(S_n \ge b)$ by the average of $m$ independent realizations of

$$n\bar F(\max\{b - (X_1 + \cdots + X_{n-1}),\ X_1, \dots, X_{n-1}\}),$$

and showed that it is strongly efficient for fixed $n$ as $b \to \infty$, when $F$ is regularly varying. This approach, however, breaks down if $n$ also approaches $\infty$.
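The conditional Monte Carlo estimator above is easy to sketch for a concrete regularly varying law. The example below assumes Pareto increments on $[1, \infty)$ with tail $x^{-\alpha}$, an assumption made only for illustration; the names `pareto_tail` and `asmussen_kroese` are hypothetical.

```python
import math
import random

def pareto_tail(x, alpha):
    """Assumed tail P(X > x) = x**-alpha for x >= 1, and 1 for x < 1."""
    return 1.0 if x < 1.0 else x ** -alpha

def asmussen_kroese(n, b, alpha, m, seed=0):
    """Average of m realizations of n * tail(max(b - S_{n-1}, M_{n-1}))."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        xs = [rng.paretovariate(alpha) for _ in range(n - 1)]   # X_1, ..., X_{n-1}
        s = sum(xs)                                             # S_{n-1}
        mx = max(xs) if xs else -math.inf                       # M_{n-1}
        total += n * pareto_tail(max(b - s, mx), alpha)
    return total / m
```

For `n = 1` the estimator returns the tail probability at `b` with no simulation noise. For fixed small `n` each realization is a bounded multiple of the tail at `b`, which is the source of the low variance in that regime.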

Blanchet, Juneja and Rojas-Nandayapa [5] have also introduced a truncation method to simulate tail probabilities of a random walk $S_n$ with log-normal increments, and showed that it is strongly efficient as $b \to \infty$ for fixed $n$. Their truncation method uses $c_b = b$ and importance sampling to estimate $P\{S_n \ge b,\ M_n \le b\}$, and their argument depends heavily on fixed $n$. By using SISR instead, we can control the variances of the likelihood ratio statistics associated with sequential importance sampling and of the resampling steps, as shown in Theorems 2 and 4 and Corollaries 2 and 4.

The truncation scheme in Sections 3 and 4 can be regarded as a Monte Carlo implementation of a similar truncation method for the analysis of tail probabilities of random walks whose i.i.d. increments have mean 0 and finite variance. Chow and

Lai [10, 11] have used the truncation method to prove that for > 1/2 and p > 1/,

n=1

n

p2

P max

1kn

S

k

n

C

p,

E(X

+

)

p

+ (EX

2

)

(p1)/(21)

, (5.4)

where C

p,

is a universal constant depending only on p and . This inequality is sharp

in the sense that there is a corresponding lower bound for the two-sided tail probability

in the case p 2:

n=1

n

p2

P max

1kn

[S

k

[ n

n=1

n

p2

P[S

n

[ n

(5.5)

B

p,

E[X[

p

+ (EX

2

)

(p1)/(21)

.

The proof of (5.4) makes use of the bound

P max

1kn

S

k

n

PM

n

> n

+P max

1kn

S

k

n

, M

n

n

,

with = 1/(2) for some positive integer . In fact, the term E(X

+

)

p

in (5.4) comes

from the bound

n=1

n

p2

PM

n

> n

n=1

n

p1

PX > n

A

p,

E(X

+

)

p

,

and is associated with the large jump probability of an increment for heavy-tailed

random walks. In this connection, note that b = n

O(b

2

/

2

(b)) in Corollary 1 and Examples 1 and 2 when > 1/2 and EX

2

< .

Although we have focused on one-dimensional random walks, the SISR procedures

can be readily extended to the multivariate setting in which the X

i

are i.i.d. d-

dimensional random vectors such that |X| is heavy-tailed, satisfy P|X| > x =

e

(x)

such that (x) =

n

/n) b and = Pmax

n1jn

jg(S

j

/j) b

n

, as considered in [9] for the light-tailed case. Another extension, also

considered in [9] for the light-tailed case, is to heavy-tailed Markov random walks for


which (x) above is replaced by

u

(x), where u is a generic state of the underlying

Markov chain.

Approximating the h-transform closely is crucial for the sequential (state-dependent) importance sampling methods of Blanchet and Glynn [4] and Blanchet and Liu [7, 8] to be strongly efficient. This requires sharp and easily computable analytic approximations of $\alpha$ and $p$, provided for [4] by the Pakes-Veraverbeke theorem [1, p. 296] and provided for [8] by Rozovskii's theorem [18]. In addition, an elaborate acceptance-rejection scheme is needed to sample from the state-dependent importance measure at every stage. If less accurate approximations to the h-transform are used, e.g., using (2.4) instead of (2.3) because either (2.3) is not available or because the $g_n$ in (2.4) is much simpler to compute, then the likelihood ratios associated with the corresponding sequential importance sampling scheme would eventually have very large variances that approach $\infty$ as $n \to \infty$. This was first pointed out by Kong, Liu and Wong [17] who proposed to use resampling to address this difficulty. While these SISR schemes, also called particle filters or interacting particle systems (IPS), were used primarily for filtering in nonlinear state-space models and more general hidden Markov models, Del Moral and Garnier [13] recognized that they could be used to simulate probabilities of rare events of the form $\{V(U_n) \ge a\}$ for a possibly non-homogeneous Markov chain $\{U_n\}$, with large $a$ but fixed $n$. Chan and Lai [9] recently developed a comprehensive theory of SISR for simulating large deviation probabilities of $g(S_n/n)$ for large $n$ in the case of light-tailed multivariate random walks. This paper continues the development for the heavy-tailed case, which provides new insights into the SISR approach to rare-event simulation.

References

[1] Asmussen, S. (2003). Applied Probability and Queues. Springer, New York.

[2] Asmussen, S., Binswanger, K. and Hojgaard, B. (2000). Rare events simulation for heavy-tailed distributions. Bernoulli 6, 303–322.

[3] Asmussen, S. and Kroese, D. (2006). Improved algorithms for rare event simulation with heavy tails. Adv. Appl. Prob. 38, 545–558.

[4] Blanchet, J. and Glynn, P. (2008). Efficient rare-event simulation for the maximum of heavy-tailed random walks. Ann. Appl. Prob. 18, 1351–1378.

[5] Blanchet, J., Juneja, S. and Rojas-Nandayapa, L. (2008). Efficient tail estimation for sums of correlated lognormals. Proc. 40th Conf. Winter Simulation, IEEE, 607–614.

[6] Blanchet, J. and Lam, H. (2011). State-dependent importance sampling for rare-event simulation: an overview and recent advances. Manuscript.

[7] Blanchet, J. and Liu, J. (2006). Efficient simulation for large deviation probabilities of sums of heavy-tailed distributions. Proc. 38th Conf. Winter Simulation, IEEE, 664–672.

[8] Blanchet, J. and Liu, J. (2008). State-dependent importance sampling for regularly varying random walks. Adv. Appl. Prob. 40, 1104–1128.

[9] Chan, H.P. and Lai, T.L. (2011). A sequential Monte Carlo approach to computing tail probabilities in stochastic models. Ann. Appl. Prob. 21, 2315–2342.

[10] Chow, Y.S. and Lai, T.L. (1975). Some one-sided theorems on the tail distribution of sample sums with applications to the last time and largest excess of boundary crossings. Trans. Amer. Math. Soc. 208, 51–72.

[11] Chow, Y.S. and Lai, T.L. (1978). Paley-type inequalities and convergence rates related to the law of large numbers and extended renewal theory. Z. Wahrsch. verw. Geb. 45, 1–19.

[12] Chow, Y.S. and Teicher, H. (1988). Probability Theory: Independence, Interchangeability, Martingales, 2nd edn. Springer, New York.

[13] Del Moral, P. and Garnier, J. (2005). Genealogical particle analysis of rare events. Ann. Appl. Prob. 15, 2496–2534.

[14] Dupuis, P., Leder, K. and Wang, H. (2007). Notes on importance sampling for random variables with regularly varying heavy tails. ACM Trans. Modeling Comp. Simulation 17.

[15] Foss, S., Korshunov, D. and Zachary, S. (2009). An Introduction to Heavy-tailed and Subexponential Distributions. Springer, New York.

[16] Juneja, S. (2007). Estimating tail probabilities of heavy tailed distributions with asymptotically zero relative error. Queueing Syst. 57, 115–127.

[17] Kong, A., Liu, J. and Wong, W.H. (1994). Sequential imputation and Bayesian missing data problems. Jour. Amer. Statist. Assoc. 89, 278–288.

[18] Rozovskii, L.V. (1989). Probabilities of large deviations of sums of independent random variables with common distribution function in the domain of attraction of the normal law. Theory Prob. Appl. 34, 625–644.
