iid uniforms

Discrete RVs

Continuous RVs

Rejection method

2. Simulation of Random Variables

Illusions are art, for the feeling person, and it is by art that we
live, if we do
Elizabeth Bowen


Often we are interested in the distribution of a random variable X
which is complicated, but which can nonetheless be built up from
simple components such as independent rvs with known
distributions.
Monte Carlo simulation is an excellent tool for such problems: we
seek to generate a random sample from the distribution of X,
which we can use to estimate its mean, median, mode, percentiles,
etc.
The starting point for any simulation is the generation of rvs with
known distributions (binomial, Poisson, exponential, normal, etc.),
which are the building blocks for more complicated distributions. It
turns out that all random variables can be generated by
manipulating U(0, 1) rvs.


Simulating iid uniform samples

We cannot generate truly random numbers on a computer. Instead
we generate pseudo-random numbers, which have the appearance
of random numbers, but are in fact completely deterministic.
Pseudo-random numbers can be generated by chaotic dynamical
systems, which have the characteristic that the future is very hard
to predict given the present.
A very important advantage of using pseudo-random numbers is
that, because they are deterministic, any experiment performed
using pseudo-random numbers can be repeated exactly.


Congruential generators
Congruential generators were the first reasonable class of
pseudo-random number generators. R uses a pseudo-random
number generator called the Mersenne-Twister, which has similar
properties to congruential generators.
Given an initial number $X_0 \in \{0, 1, \ldots, m-1\}$ and two big
numbers $A$ and $B$, we define a sequence of numbers
$X_n \in \{0, 1, \ldots, m-1\}$, $n = 0, 1, \ldots$, by
$$X_{n+1} = (A X_n + B) \bmod m.$$
We get a sequence of numbers $U_n \in [0, 1)$, $n = 0, 1, \ldots$, by putting
$U_n = X_n / m$. If $m$, $A$, and $B$ are well chosen then the sequence
$U_0, U_1, \ldots$ is almost impossible to distinguish from an iid
sequence of U(0, 1) random variables.

In practice it is sensible to discard the value 0 when it occurs, as
we often divide by $U_n$. This is justifiable since for a true uniform,
the probability of taking on the value 0 is zero. The value 1 can
also be a problem, but note that as defined, $U_n < 1$ for all $n$.
Example: If we take $m = 10$, $A = 103$, and $B = 17$, then for
$X_0 = 2$, we have
$X_1 = 223 \bmod 10 = 3$
$X_2 = 326 \bmod 10 = 6$
$X_3 = 635 \bmod 10 = 5$
...
Clearly the sequence produced by a congruential generator will
eventually cycle, and since there are at most $m$ possible
values, the maximum cycle length is $m$.
(The Mersenne-Twister has a cycle length of $2^{19937} - 1$.)
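The recurrence is easy to try out directly. The following is a minimal sketch, where `lcg` is a hypothetical helper name (it is not an R built-in) that returns the first n values after the seed. It reproduces the toy example with m = 10 and shows the cycle closing after only four steps.

```r
# lcg() is a hypothetical helper: it returns X_1, ..., X_n for the recurrence
# X_{n+1} = (A * X_n + B) mod m, starting from the seed x0.
# Doubles represent integers exactly below 2^53, which is enough here.
lcg <- function(n, x0, A, B, m) {
  x <- numeric(n)
  prev <- x0
  for (i in seq_len(n)) {
    prev <- (A * prev + B) %% m
    x[i] <- prev
  }
  x
}

x <- lcg(8, x0 = 2, A = 103, B = 17, m = 10)
x            # 3 6 5 2 3 6 5 2
u <- x / 10  # the corresponding U_n values in [0, 1)
```

With these parameters the generator returns to the seed after four steps, well short of the maximum possible cycle length m = 10.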

Because computers use binary arithmetic, if we have $m = 2^k$ for
some $k$, then taking $x \bmod m$ is very quick.
An example of a good congruential generator is $m = 2^{32}$,
$A = 1{,}664{,}525$, and $B = 1{,}013{,}904{,}223$.
An example of a bad congruential generator is RANDU, which was
shipped with IBM computers in the 1970s. RANDU used
$m = 2^{31}$, $A = 65{,}539$, and $B = 0$.
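One way to see what goes wrong with RANDU is to note that 65,539 = 2^16 + 3, so that 65,539^2 = 6 × 65,539 − 9 (mod 2^31). Every value is therefore a fixed linear combination of the two before it. The sketch below checks this numerically; `randu` is a hypothetical helper name and the default seed of 1 is an assumption made for illustration.

```r
# RANDU recurrence: X_{n+1} = 65539 * X_n mod 2^31 (B = 0).
# randu() is a hypothetical helper; an odd seed keeps every value odd.
randu <- function(n, x0 = 1) {
  m <- 2^31
  x <- numeric(n)
  prev <- x0
  for (i in seq_len(n)) {
    prev <- (65539 * prev) %% m   # exact in doubles: products stay below 2^53
    x[i] <- prev
  }
  x
}

x <- randu(1000)
i <- 1:998
# X_{n+2} - 6 X_{n+1} + 9 X_n should always be a multiple of 2^31
resid <- (x[i + 2] - 6 * x[i + 1] + 9 * x[i]) %% 2^31
all(resid == 0)   # TRUE
```

Consecutive triples are thus far from independent, which is the sort of structure a well-chosen A and B should avoid.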


Seeding

The number X0 is called the seed. If you know the seed (as well as
m, A, and B ), then you can reproduce the whole sequence exactly.
This is a very good idea from a scientific point of view; being able
to repeat an experiment means that your results are verifiable.
To generate n pseudo-random numbers in R, use runif(n). R
does not use a congruential generator, but it still needs a seed to
generate pseudo-random numbers. For a given value of seed
(assumed integer), the command set.seed(seed) always puts
you at the same point on the cycle of pseudo-random numbers.


The current state of the random number generator is kept in the
vector .Random.seed. You can save the value of .Random.seed
and then use it to return to that point in the sequence of
pseudo-random numbers.
If the random number generator is not initialised before you start
generating pseudo-random numbers, then R initialises it using a
value taken from the system clock.


> set.seed(42)
> runif(2)
[1] 0.9148060 0.9370754
> RNG.state <- .Random.seed
> runif(2)
[1] 0.2861395 0.8304476
> set.seed(42)
> runif(4)
[1] 0.9148060 0.9370754 0.2861395 0.8304476
> .Random.seed <- RNG.state
> runif(2)
[1] 0.2861395 0.8304476

Simulating discrete random variables


Let X be a discrete random variable taking values in the set
{0, 1, . . .} with cdf F and pmf p. The following snippet of code
takes a uniform random variable U and returns a discrete random
variable X with cdf F .
# given U ~ U(0,1)
X <- 0
while (F(X) < U) {
  X <- X + 1
}
When the algorithm terminates we have $F(X) \ge U$ and
$F(X-1) < U$, that is $U \in (F(X-1), F(X)]$. That is,
$P(X = x) = P(U \in (F(x-1), F(x)]) = F(x) - F(x-1) = p(x)$.
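To make the snippet runnable, it can be wrapped in a function that takes the cdf as an argument. The following is a sketch only: `sim_discrete` and `binom_cdf` are hypothetical names, and the binom(3, 0.5) cdf is used purely as an illustration.

```r
# sim_discrete() is a hypothetical helper: inversion for a discrete rv on
# {0, 1, ...}, given its cdf as an R function and a uniform value u.
sim_discrete <- function(cdf, u = runif(1)) {
  x <- 0
  while (cdf(x) < u) {
    x <- x + 1
  }
  x
}

# cdf of a binom(3, 0.5) rv, taken from R's built-in pbinom
binom_cdf <- function(x) pbinom(x, size = 3, prob = 0.5)

set.seed(1)
draws <- replicate(10000, sim_discrete(binom_cdf))
table(draws) / length(draws)   # compare with dbinom(0:3, 3, 0.5)
```

Passing `u` explicitly, for example `sim_discrete(binom_cdf, 0.6)`, shows which interval a given uniform value falls into.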

[Figure: simulating from a binom(3, 0.5) c.d.f. Vertical axis: U ~ U(0,1); horizontal axis: X ~ binom(3, 0.5). The interval (0, 0.125) is mapped to 0, (0.125, 0.5) to 1, (0.5, 0.875) to 2, and (0.875, 1) to 3.]


To simulate binomial, geometric, negative-binomial or Poisson rvs
in R, use rbinom, rgeom, rnbinom or rpois.
For simulating other (finite) discrete rvs R provides
sample(x, size, replace = FALSE, prob = NULL).
The inputs are
  x        A vector giving the possible values the rv can take;
  size     How many rvs to simulate;
  replace  Set this to TRUE to generate an iid sample, otherwise
           the rvs will be conditioned to be different from each
           other;
  prob     A vector giving the probabilities of the values in x. If
           omitted then the values in x are assumed to be
           equally likely.
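As a brief illustration of these arguments, the sketch below draws an iid sample from a three-point distribution. The values and probabilities are made up for the example.

```r
set.seed(1)
vals  <- c(-1, 0, 2)        # possible values of the rv (illustrative)
probs <- c(0.2, 0.5, 0.3)   # their probabilities (illustrative)

# replace = TRUE gives an iid sample of size 10000
x <- sample(vals, size = 10000, replace = TRUE, prob = probs)
table(x) / length(x)        # empirical frequencies, to compare with probs
```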


Simulating continuous random variables

Suppose that we are given $U \sim U(0, 1)$ and want to simulate a
continuous rv $X$ with cdf $F_X$.
Put $Y = F_X^{-1}(U)$; then we have
$$F_Y(y) = P(Y \le y) = P(F_X^{-1}(U) \le y) = P(U \le F_X(y)) = F_X(y).$$
That is, $Y$ has the same distribution as $X$.
Thus, if we can simulate a U(0, 1) rv, then we can simulate any
continuous rv $X$ for which we know $F_X^{-1}$. This is called the inverse
transformation method or simply the inversion method.


Inversion method for U(1, 3)

[Figure: plot illustrating the inversion method for U(1, 3).]

If $X \sim U(1, 3)$ then
$F_X(x) = (x - 1)/2$ for
$x \in (1, 3)$ and thus
$F_X^{-1}(y) = 2y + 1$ for
$y \in (0, 1)$.


Inversion method for exp(1)

[Figure: plot illustrating the inversion method for exp(1).]

If $X \sim \exp(\lambda)$ then
$F_X(x) = 1 - e^{-\lambda x}$ for
$x \ge 0$ and thus
$F_X^{-1}(y) = -\lambda^{-1} \log(1 - y)$.
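Both inverse cdfs are one-liners in R. The sketch below applies them to the same vector of uniforms and compares the sample means with the theoretical values; the sample size and the choice lambda = 1 are assumptions for illustration.

```r
set.seed(1)
n <- 10000
u <- runif(n)

# U(1, 3): F^{-1}(y) = 2y + 1
x_unif <- 2 * u + 1

# exp(lambda): F^{-1}(y) = -log(1 - y) / lambda
lambda <- 1
x_exp <- -log(1 - u) / lambda

c(mean(x_unif), mean(x_exp))   # theoretical means are 2 and 1 / lambda
```

In practice one would normally call runif(n, 1, 3) and rexp(n, lambda) instead; the point here is only to show the inversion method at work.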


Random variable simulators in R


Distribution         R command
binomial             rbinom
Poisson              rpois
geometric            rgeom
negative binomial    rnbinom
uniform              runif
exponential          rexp
normal               rnorm
gamma                rgamma
beta                 rbeta
Student t            rt
F                    rf
chi-squared          rchisq
Weibull              rweibull


The rejection method

The inversion method works well if we can find $F^{-1}$ analytically. If
not, we can use root-finding techniques to invert $F$ numerically,
but this can be time-consuming. An alternative method in this
situation, which is often faster, is the rejection method.
We start with an example. Suppose that we have a continuous
random variable $X$ with pdf $f_X$ concentrated on the interval (0, 4).
We imagine sprinkling points $P_1, P_2, \ldots$ uniformly at random
under the density function, and consider the distribution of $X_1$, the
x coordinate of $P_1$.


[Figure: the pdf of X, with the points a and b marked on the x axis.]


Let $R$ be the shaded region under $f_X$ between $a$ and $b$, then
$$P(a < X_1 < b) = P(P_1 \text{ hits } R) = \frac{\text{Area of } R}{\text{Area under density}} = \frac{\int_a^b f_X(x)\,dx}{1} = \int_a^b f_X(x)\,dx.$$
So $X_1$ has the same distribution as $X$.
But how do we generate the points $P_i$ uniformly under $f_X$? The
answer is to generate points at random in the rectangle
$[0, 4] \times [0, 0.5]$, and then reject those that fall above the pdf.

Rejection method (uniform envelope)
Suppose that $f_X$ is nonzero only on $[a, b]$, and $f_X \le k$.
1. Generate $X \sim U(a, b)$ and $Y \sim U(0, k)$ independent of $X$
   (so $P = (X, Y)$ is uniformly distributed over the rectangle
   $[a, b] \times [0, k]$).
2. If $Y < f_X(X)$ then return $X$, otherwise go back to step 1.
Example: consider the triangular pdf $f_X$ defined as
$$f_X(x) = \begin{cases} x & \text{if } 0 < x < 1; \\ 2 - x & \text{if } 1 \le x < 2; \\ 0 & \text{otherwise.} \end{cases}$$
We apply the rejection method as follows:
source("rejecttriangle.r")
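A minimal sketch of how the two steps might be coded for the triangular pdf is given below. It is an illustration under stated assumptions and not necessarily what rejecttriangle.r contains: `f_tri` and `reject_uniform` are hypothetical names, and the envelope is the rectangle [0, 2] × [0, 1].

```r
# Triangular pdf from the example
f_tri <- function(x) {
  ifelse(x > 0 & x < 1, x, ifelse(x >= 1 & x < 2, 2 - x, 0))
}

# reject_uniform() is a hypothetical helper: rejection sampling with a
# uniform envelope over the rectangle [a, b] x [0, k].
reject_uniform <- function(f, a, b, k) {
  repeat {
    x <- runif(1, a, b)        # step 1: X ~ U(a, b)
    y <- runif(1, 0, k)        #         Y ~ U(0, k), independent of X
    if (y < f(x)) return(x)    # step 2: accept if the point is under f
  }
}

set.seed(1)
draws <- replicate(5000, reject_uniform(f_tri, a = 0, b = 2, k = 1))
mean(draws)   # the triangular pdf is symmetric about 1
```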

General rejection method

Our rejection method uses a rectangular envelope to cover the
target density $f_X$. What to do if $f_X$ is unbounded?
Let $X$ have pdf $h$ and let $Y \sim U(0, k h(X))$, then $(X, Y)$ is
uniformly distributed under the curve $kh$:
$$P((X, Y) \in (x, x + dx) \times (y, y + dy))$$
$$= P(Y \in (y, y + dy) \mid X \in (x, x + dx)) \, P(X \in (x, x + dx))$$
$$= \frac{dy}{k h(x)} \, h(x)\,dx = \frac{1}{k}\,dx\,dy.$$


Suppose we wish to simulate from the density $f_X$. Let $h$ be a
density we can simulate from, and choose $k$ such that
$$k \ge k^* = \sup_x \frac{f_X(x)}{h(x)}.$$
Then $kh$ forms an envelope for $f_X$, and we can generate points
uniformly within this envelope. By accepting points below the
curve $f_X$, we get the general rejection method:
General rejection method
To simulate from the density $f_X$, we assume that we have an envelope
density $h$ from which we can simulate, and that we have some
$k < \infty$ such that $\sup_x f_X(x)/h(x) \le k$.
1. Simulate $X$ from $h$.
2. Generate $Y \sim U(0, k h(X))$.
3. If $Y < f_X(X)$ then return $X$, otherwise go back to step 1.
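The three steps translate almost line for line into R. In the sketch below `reject_general`, `h` and `rh` are hypothetical names, and the half-normal target with an exp(1) envelope is an assumed example chosen only because the supremum of f/h is easy to compute (it equals sqrt(2e/π), attained at x = 1).

```r
# reject_general() is a hypothetical helper. f is the target density,
# h the envelope density, rh() draws one value from h, and k >= sup f/h.
reject_general <- function(f, h, rh, k) {
  repeat {
    x <- rh()                     # step 1: simulate X from h
    y <- runif(1, 0, k * h(x))    # step 2: Y ~ U(0, k h(X))
    if (y < f(x)) return(x)       # step 3: accept if below f
  }
}

# Assumed illustration: half-normal target on x > 0 with exp(1) envelope
f  <- function(x) sqrt(2 / pi) * exp(-x^2 / 2)
h  <- function(x) exp(-x)
rh <- function() rexp(1)
k  <- sqrt(2 * exp(1) / pi)       # sup of f/h, attained at x = 1

set.seed(1)
draws <- replicate(5000, reject_general(f, h, rh, k))
c(mean(draws), sqrt(2 / pi))      # sample mean vs half-normal mean
```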

Efficiency

The efficiency of the rejection method is measured by the expected
number of times you have to generate a candidate point $(X, Y)$.
The area under the curve $kh$ is $k$ and the area under the curve $f_X$
is 1, so the probability of accepting a candidate is $1/k$.
Thus the number of times $N$ we have to generate a candidate
point has distribution $1 + \mathrm{geom}(1/k)$, with mean
$$EN = 1 + (1 - 1/k)/(1/k) = k.$$
So, the closer $h$ is to $f_X$, the smaller we can choose $k$, and the
more efficient the algorithm.
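The claim EN = k can be checked by counting candidates. The sketch below uses the triangular pdf on (0, 2) with a U(0, 2) envelope, so that h(x) = 1/2 and k = sup f/h = 2; `f_tri` and `count_trials` are hypothetical names.

```r
f_tri <- function(x) {
  ifelse(x > 0 & x < 1, x, ifelse(x >= 1 & x < 2, 2 - x, 0))
}

# count_trials() is a hypothetical helper: it returns the number of
# candidate points generated up to and including the first acceptance.
count_trials <- function() {
  n <- 0
  repeat {
    n <- n + 1
    x <- runif(1, 0, 2)          # X ~ h, where h(x) = 1/2 on (0, 2)
    y <- runif(1, 0, 2 * 0.5)    # Y ~ U(0, k h(X)) with k = 2
    if (y < f_tri(x)) return(n)
  }
}

set.seed(1)
trials <- replicate(10000, count_trials())
mean(trials)   # should be close to k = 2
```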


Example: gamma

For $m, \lambda > 0$ the $\Gamma(\lambda, m)$ density is
$$f(x) = \lambda^m x^{m-1} e^{-\lambda x} / \Gamma(m), \quad \text{for } x > 0.$$
There is no explicit formula for the cdf $F$ or its inverse, so we will
use the rejection method to simulate from $f$.
We will use an exponential envelope $h(x) = \mu e^{-\mu x}$, for $x > 0$.
Using the inversion method we can easily simulate from $h$ using
$-\log(U)/\mu$, where $U \sim U(0, 1)$.


To envelop $f$ we need to find
$$k = \sup_{x > 0} \frac{f(x)}{h(x)} = \sup_{x > 0} \frac{\lambda^m x^{m-1} e^{-(\lambda - \mu) x}}{\mu \, \Gamma(m)}.$$
Clearly $k$ will be infinite if $m < 1$ or $\lambda \le \mu$. For $m = 1$ the
gamma is just an exponential. Thus we will assume $m > 1$ and
choose $\mu < \lambda$.
For $m \in (0, 1)$ the rejection method can still be used, but a
different envelope is required.
To find $k$ we take the derivative of the right-hand side above and
set it to zero, to find the point where the maximum occurs. You
can check that this is at the point $x = (m - 1)/(\lambda - \mu)$, which
gives
$$k = \frac{\lambda^m (m - 1)^{m-1} e^{-(m-1)}}{\mu (\lambda - \mu)^{m-1} \Gamma(m)}.$$

To improve efficiency we would like to choose our envelope to
make $k$ as small as possible. Looking at the formula for $k$ this
means choosing $\mu$ to make $\mu (\lambda - \mu)^{m-1}$ as large as possible.
Setting the derivative with respect to $\mu$ to zero, we see that the
maximum occurs when $\mu = \lambda/m$. Plugging this back in we get
$$k = m^m e^{-(m-1)} / \Gamma(m).$$
We can now code up our rejection algorithm.
gamma_sim.r
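One possible way to code this up for m > 1 is sketched below, using the exponential envelope with rate lambda/m and the value of k derived above. `gamma_reject` is a hypothetical name and the parameter values are assumptions; this is not necessarily the contents of gamma_sim.r.

```r
# gamma_reject() is a hypothetical helper: one draw from the gamma(lambda, m)
# density for m > 1, by rejection from an exponential envelope.
gamma_reject <- function(lambda, m) {
  stopifnot(m > 1, lambda > 0)
  mu <- lambda / m                           # envelope rate minimising k
  k  <- m^m * exp(-(m - 1)) / gamma(m)       # k for this choice of mu
  f  <- function(x) lambda^m * x^(m - 1) * exp(-lambda * x) / gamma(m)
  h  <- function(x) mu * exp(-mu * x)
  repeat {
    x <- -log(runif(1)) / mu                 # X ~ h by inversion
    y <- runif(1, 0, k * h(x))               # Y ~ U(0, k h(X))
    if (y < f(x)) return(x)
  }
}

set.seed(1)
draws <- replicate(5000, gamma_reject(lambda = 2, m = 3))
c(mean(draws), 3 / 2)   # sample mean vs theoretical mean m / lambda
```

For everyday use R's built-in rgamma(n, shape = m, rate = lambda) is the natural choice; the sketch is only meant to show the rejection steps.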
