
Chapter 7

Random-Number Generation
Properties of Random Numbers
 Random numbers should be uniformly distributed and independent.
 Uniformity: If we divide (0,1) into n equal intervals, then we expect the number
of observations in each sub-interval to be N/n where N is the total number of
observations.
 Independence: The probability of observing a value in any sub-interval is not
influenced by any previous value drawn.
 Each random number, Ri, must be independently drawn from a uniform distribution with pdf:

f(x) = 1,   0 ≤ x ≤ 1
f(x) = 0,   otherwise
Generation of Pseudo-Random Numbers
 “Pseudo”, because generating numbers using a known method removes the potential for
true randomness.

 Pseudo-random numbers serve as a model of true random numbers for simulation purposes.
 Goal: To produce a sequence of numbers in [0,1] that simulates, or imitates, the ideal
properties of random numbers (RN).
 Potential issues:
 Non-uniformity
 Discrete-valued, not continuous-valued, output
 Inaccurate mean
 Inaccurate variance
 Dependence:
• Autocorrelation between numbers
• Runs of numbers with skewed values, with respect to previous numbers or the mean value
 Important considerations in RN routines:
 Fast
 Portable to different computers
 Have sufficiently long cycle
 Replicable
 Closely approximate the ideal statistical properties of uniformity and independence.

Techniques for Generating Random Numbers

 Linear Congruential Method (LCM).
 Combined Linear Congruential Generators (CLCG).
 Random-Number Streams.

Linear Congruential Method

 The LCM produces a sequence of integers, X1, X2, …, between 0 and m−1 by following the recursive relationship:

Xi+1 = (a Xi + c) mod m,   i = 0, 1, 2, …

where
X0 = the seed
a = the constant multiplier
c = the increment
m = the modulus
 c=0: multiplicative congruential method, otherwise mixed congruential method
 The selection of the values for a, c, m, and X0 drastically affects the statistical properties
and the cycle length.
 The random integers are generated on [0, m−1]; to convert them to random numbers on [0,1):

Ri = Xi / m,   i = 1, 2, …

Example: X0 = 27, a = 17, c = 43, m = 100
Xi+1 = (aXi+c) mod m
= (17Xi+43) mod 100
X1 = (17*27 + 43) mod 100
= 502 mod 100
= 2
R1 = X1/m
= 2/100
= .02
X2 = (17*2 + 43) mod 100
= 77 mod 100
= 77
R2 = X2/m
= 77/100
= .77
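The recurrence is straightforward to express in code. Below is a minimal Python sketch (function name and structure are illustrative, not from the text) that reproduces the example above:

```python
def lcg(seed, a, c, m, n):
    """Generate n random numbers on [0, 1) with a linear congruential generator."""
    x = seed
    numbers = []
    for _ in range(n):
        x = (a * x + c) % m          # X_{i+1} = (a X_i + c) mod m
        numbers.append(x / m)        # R_i = X_i / m
    return numbers

# Reproduces the example: X0 = 27, a = 17, c = 43, m = 100
print(lcg(27, 17, 43, 100, 2))       # [0.02, 0.77]
```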

Characteristics of a Good Generator [LCM]

 Maximum Density
 Such that the values assumed by Ri, i = 1,2,…, leave no large gaps on [0,1]
 Problem: Instead of continuous, each Ri is discrete
 Solution: a very large integer for modulus m
• Approximation appears to be of little consequence
 Maximum Period
 To achieve maximum density and avoid cycling.
 Achieve by: proper choice of a, c, m, and X0.
 Most digital computers use a binary representation of numbers
 Speed and efficiency are aided by choosing the modulus, m, to be (or close to) a power of 2.

Combined Linear Congruential Generators [Techniques]

 Reason: A longer-period generator is needed because of the increasing complexity of
simulated systems.
 Approach: Combine two or more multiplicative congruential generators.
 Let Xi,1, Xi,2, …, Xi,k be the ith output from k different multiplicative congruential generators.
 The jth generator:
• Has prime modulus mj and multiplier aj, and period mj − 1
• Produces integers Xi,j that are approximately uniform on the integers in [1, mj − 1]
• Wi,j = Xi,j − 1 is then approximately uniform on the integers in [0, mj − 2]
 Suggested form:

Xi = ( Σ(j=1 to k) (−1)^(j−1) Xi,j ) mod (m1 − 1)

with
Ri = Xi / m1,              Xi > 0
Ri = (m1 − 1) / m1,        Xi = 0

• The maximum possible period is:

P = (m1 − 1)(m2 − 1) … (mk − 1) / 2^(k−1)
 Example: For 32-bit computers, L’Ecuyer [1988] suggests combining k = 2 generators with
m1 = 2,147,483,563, a1 = 40,014, m2 = 2,147,483,399 and a2 = 40,692. The algorithm
becomes:
Step 1: Select seeds
 X1,0 in the range [1, 2,147,483,562] for the 1st generator
 X2,0 in the range [1, 2,147,483,398] for the 2nd generator.
Step 2: For each individual generator,
X1,j+1 = 40,014 X1,j mod 2,147,483,563
X2,j+1 = 40,692 X2,j mod 2,147,483,399.
Step 3: Xj+1 = (X1,j+1 − X2,j+1) mod 2,147,483,562.
Step 4: Return

Rj+1 = Xj+1 / 2,147,483,563,               if Xj+1 > 0
Rj+1 = 2,147,483,562 / 2,147,483,563,      if Xj+1 = 0

Step 5: Set j = j + 1, and go back to Step 2.
 The combined generator has period (m1 − 1)(m2 − 1)/2 ≈ 2 × 10^18.
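A minimal Python sketch of this combined generator, using the constants from the example (the seed values below are arbitrary illustrations):

```python
def combined_lcg(seed1, seed2, n):
    """L'Ecuyer (1988) combined LCG sketch; constants from the example above."""
    m1, a1 = 2147483563, 40014
    m2, a2 = 2147483399, 40692
    x1, x2 = seed1, seed2
    out = []
    for _ in range(n):
        x1 = (a1 * x1) % m1                  # first generator
        x2 = (a2 * x2) % m2                  # second generator
        x = (x1 - x2) % (m1 - 1)             # Step 3; Python's % is non-negative
        out.append(x / m1 if x > 0 else (m1 - 1) / m1)   # Step 4
    return out

print(combined_lcg(12345, 67890, 3))
```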

Random-Number Streams [Techniques]

 The seed for a linear congruential random-number generator:

 Is the integer value X0 that initializes the random-number sequence.
 Any value in the sequence can be used to “seed” the generator.
 A random-number stream:
 Refers to a starting seed taken from the sequence X0, X1, …, XP.
 If the streams are b values apart, then stream i could be defined by the starting seed:

Si = X_b(i−1)

 Older generators: b = 10^5; newer generators: b = 10^37.

 A single random-number generator with k streams can act like k distinct virtual random-
number generators
 To compare two or more alternative systems.
 Advantageous to dedicate portions of the pseudo-random number sequence to
the same purpose in each of the simulated systems.
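A brute-force Python sketch of deriving stream seeds spaced b values apart, reusing the toy LCM parameters from the earlier example (b = 1000 is an arbitrary illustration; real generators use efficient jump-ahead formulas rather than stepping one value at a time):

```python
def nth_state(x0, a, c, m, n):
    """Advance an LCG n steps to obtain the state X_n (brute-force sketch)."""
    x = x0
    for _ in range(n):
        x = (a * x + c) % m
    return x

# Stream i starts at seed S_i = X_{b(i-1)}
b = 1000
seeds = [nth_state(27, 17, 43, 100, b * (i - 1)) for i in (1, 2, 3)]
```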

Tests for Random Numbers

 Two categories:
 Testing for uniformity:
• H0: Ri ~ U[0,1]
• H1: Ri ≁ U[0,1]
 Failure to reject the null hypothesis, H0, means that evidence of non-
uniformity has not been detected.
 Testing for independence:
• H0: Ri independent
• H1: Ri not independent
 Failure to reject the null hypothesis, H0, means that evidence of dependence
has not been detected.
 Level of significance α, the probability of rejecting H0 when it is true: α = P(reject H0 | H0 is
true)
 When to use these tests:
 If a well-known simulation language or random-number generator is used, it is
probably unnecessary to test
 If the generator is not explicitly known or documented, e.g., spreadsheet programs,
symbolic/numerical calculators, tests should be applied to many sample numbers.
 Types of tests:
 Theoretical tests: evaluate the choices of m, a, and c without actually generating
any numbers
 Empirical tests: applied to actual sequences of numbers produced. Our emphasis.

Frequency Tests [Tests for RN]

 Test of uniformity
 Two different methods:
 Kolmogorov-Smirnov test
 Chi-square test
Kolmogorov-Smirnov Test [Frequency Test]
 Compares the continuous cdf, F(x), of the uniform distribution with the empirical cdf, SN(x),
of the N sample observations.
 We know: F(x) = x, 0 ≤ x ≤ 1
 If the sample from the RN generator is R1, R2, …, RN, then the empirical cdf, SN(x), is:

SN(x) = (number of R1, R2, …, RN that are ≤ x) / N
 Based on the statistic: D = max| F(x) - SN(x)|
 Sampling distribution of D is known (a function of N, tabulated in Table A.8.)
 A more powerful test, recommended.
Example: Suppose 5 generated numbers are 0.44, 0.81, 0.14, 0.05, 0.93.

Step 1: Arrange R(i) from smallest to largest:

R(i)               0.05   0.14   0.44   0.81   0.93
i/N                0.20   0.40   0.60   0.80   1.00

Step 2: Compute D+ = max{i/N − R(i)} and D− = max{R(i) − (i−1)/N}:

i/N − R(i)         0.15   0.26   0.16   −      0.07
R(i) − (i−1)/N     0.05   −      0.04   0.21   0.13

Step 3: D = max(D+, D−) = max(0.26, 0.21) = 0.26

Step 4: For α = 0.05, the critical value from Table A.8 is Dα = 0.565 > D.

Hence, H0 is not rejected.
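A Python sketch of the K-S statistic computation (names are illustrative; the critical value still comes from Table A.8):

```python
def ks_uniform(samples):
    """Kolmogorov-Smirnov statistic D for testing U(0,1) uniformity (sketch)."""
    r = sorted(samples)                                   # Step 1: order statistics
    n = len(r)
    d_plus = max((i + 1) / n - x for i, x in enumerate(r))
    d_minus = max(x - i / n for i, x in enumerate(r))
    return max(d_plus, d_minus)                           # Step 3: D = max(D+, D-)

# Reproduces the example: D = 0.26
print(ks_uniform([0.44, 0.81, 0.14, 0.05, 0.93]))
```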

Chi-square test [Frequency Test]

 Chi-square test uses the sample statistic:

χ0² = Σ(i=1 to n) (Oi − Ei)² / Ei

where:
n  is the number of classes
Oi is the observed number in the ith class
Ei is the expected number in the ith class
 Approximately the chi-square distribution with n-1 degrees of freedom (where the
critical values are tabulated in Table A.6)
 For the uniform distribution, Ei, the expected number in each class, is:

Ei = N / n, where N is the total number of observations

 Valid only for large samples, e.g., N ≥ 50
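A Python sketch of the chi-square statistic for equal-width classes on [0,1) (names are illustrative; the critical value still comes from Table A.6):

```python
def chi_square_uniform(samples, n_classes=10):
    """Chi-square statistic for testing U(0,1) uniformity (sketch)."""
    N = len(samples)
    expected = N / n_classes                              # E_i = N / n
    observed = [0] * n_classes
    for r in samples:
        observed[min(int(r * n_classes), n_classes - 1)] += 1
    return sum((o - expected) ** 2 / expected for o in observed)

# Compare against the chi-square critical value with n_classes - 1
# degrees of freedom (Table A.6).
```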
Tests for Autocorrelation [Tests for RN]
 Testing the autocorrelation between every m numbers (m is also known as the lag), starting with
the ith number
 The autocorrelation ρim between the numbers: Ri, Ri+m, Ri+2m, …, Ri+(M+1)m
 M is the largest integer such that i + (M+1)m ≤ N
 Hypotheses:

H0: ρim = 0, if the numbers are independent
H1: ρim ≠ 0, if the numbers are dependent
 If the values are uncorrelated:
 For large values of M, the distribution of the estimator of ρim, denoted ρ̂im, is
approximately normal.
 The test statistic is:

Z0 = ρ̂im / σ̂ρ̂im

 Z0 is distributed normally with mean = 0 and variance = 1, and:

ρ̂im = [ 1/(M+1) Σ(k=0 to M) R(i+km) R(i+(k+1)m) ] − 0.25

σ̂ρ̂im = √(13M + 7) / [12(M + 1)]
 If ρim > 0, the subsequence has positive autocorrelation
 High random numbers tend to be followed by high ones, and vice versa.
 If ρim < 0, the subsequence has negative autocorrelation
 Low random numbers tend to be followed by high ones, and vice versa.

Example: Test whether the 3rd, 8th, 13th, and so on, numbers of the sequence on p. 265 are autocorrelated.
 Here α = 0.05, i = 3, m = 5, N = 30, and M = 4 (the largest integer such that 3 + (M+1)5 ≤ 30).

ρ̂35 = (1/(4+1)) [ (0.23)(0.28) + (0.25)(0.33) + (0.33)(0.27) + (0.28)(0.05) + (0.05)(0.36) ] − 0.25
     = −0.1945

σ̂ρ̂35 = √(13(4) + 7) / (12(4 + 1)) = 0.1280

Z0 = −0.1945 / 0.1280 = −1.516

 From Table A.3, z0.025 = 1.96. Since |Z0| = 1.516 < 1.96, the hypothesis of independence is not rejected.
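A Python sketch of the test-statistic computation (1-based indexing follows the text's convention; names are illustrative):

```python
import math

def autocorrelation_z0(r, i, m):
    """Z0 statistic for the lag-m autocorrelation test starting at the ith number (sketch)."""
    N = len(r)
    M = (N - i) // m - 1                      # largest M with i + (M+1)m <= N
    s = sum(r[i - 1 + k * m] * r[i - 1 + (k + 1) * m] for k in range(M + 1))
    rho_hat = s / (M + 1) - 0.25
    sigma_hat = math.sqrt(13 * M + 7) / (12 * (M + 1))
    return rho_hat / sigma_hat

# Reject independence at alpha = 0.05 if |Z0| > z_{0.025} = 1.96.
```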

Shortcomings:
 The test is not very sensitive for small values of M, particularly when the numbers being
tested are on the low side.
 Problem when “fishing” for autocorrelation by performing numerous tests:
 If α = 0.05, there is a probability of 0.05 of rejecting a true hypothesis.
 If 10 independent sequences are examined,
• The probability of finding no significant autocorrelation, by chance alone, is
0.95^10 ≈ 0.60.
• Hence, the probability of detecting significant autocorrelation when it does
not exist is about 40%.

Summary
 In this chapter, we described:
 Generation of random numbers
 Testing for uniformity and independence
 Caution:
 Even generators that have been used for years, some of which are still in use, have
been found to be inadequate.
 This chapter provides only the basics.
 Also, even if generated numbers pass all the tests, some underlying pattern might
have gone undetected.
CHAPTER 8: Random Variate Generation
Inverse-transform Technique

• For the cdf: r = F(x)

• Generate r from Uniform(0,1)

• Find x: x = F⁻¹(r)

(Figure: graphically, a generated value r1 on the vertical axis maps through the cdf to the variate x1 on the horizontal axis.)

Exponential Distribution

Exponential cdf:

r = F(x) = 1 − e^(−λx) for x ≥ 0

To generate X1, X2, X3, …:

Xi = F⁻¹(Ri) = −(1/λ) ln(1 − Ri)

Example: Generate 200 variates Xi with distribution exp(λ = 1).

Generate 200 Ri's from U(0,1) and apply the equation above. (Figure: the resulting histogram of the Xi's approximates the exponential pdf.)
Check: Does the random variable X1 have the desired distribution?
P(X1 ≤ x0) = P(R1 ≤ F(x0)) = F(x0)
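A Python sketch of the inverse-transform generator for the exponential distribution:

```python
import math, random

def exponential_variate(lam):
    """Inverse-transform sketch: X = -(1/lambda) * ln(1 - R), R ~ U(0,1)."""
    r = random.random()
    return -math.log(1.0 - r) / lam       # 1 - r stays in (0, 1], so log is safe

# Example from the text: 200 variates with lambda = 1
xs = [exponential_variate(1.0) for _ in range(200)]
```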

Other Distributions
• Examples of other distributions for which inverse cdf works are:

1. Uniform distribution

2. Weibull distribution

3. Triangular distribution

Empirical Continuous Dist’n

• When a theoretical distribution is not applicable

• To collect empirical data:

1. Resample the observed data

2. Interpolate between observed data points to fill in the gaps

• For a small sample set (size n):

1. Arrange the data from smallest to largest: x(1) ≤ x(2) ≤ … ≤ x(n)

2. Assign the probability 1/n to each interval x(i−1) ≤ x ≤ x(i)

3. Generate via the interpolated inverse cdf:

X = F̂⁻¹(R) = x(i−1) + ai ( R − (i−1)/n )

where

ai = (x(i) − x(i−1)) / (i/n − (i−1)/n) = (x(i) − x(i−1)) / (1/n)
Example: Suppose the data collected for 100 broken-widget repair times are:

i   Interval (Hours)   Frequency   Relative Frequency   Cumulative Frequency, ci   Slope, ai
1   0.25 ≤ x ≤ 0.5     31          0.31                 0.31                       0.81
2   0.5 < x ≤ 1.0      10          0.10                 0.41                       5.0
3   1.0 < x ≤ 1.5      25          0.25                 0.66                       2.0
4   1.5 < x ≤ 2.0      34          0.34                 1.00                       1.47

Consider R1 = 0.83:

c3 = 0.66 < R1 ≤ c4 = 1.00

X1 = x(4−1) + a4 (R1 − c(4−1))
   = 1.5 + 1.47 (0.83 − 0.66)
   = 1.75
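A Python sketch of the table-lookup-with-interpolation generator, in the grouped-data form used by the example (names are illustrative):

```python
def empirical_variate(r, breakpoints, cum_freq):
    """Empirical continuous inverse-cdf sketch (table lookup with interpolation).

    breakpoints: interval endpoints x(0), x(1), ..., x(n)
    cum_freq:    cumulative frequencies c1, ..., cn
    """
    c_prev = 0.0
    for i, c in enumerate(cum_freq):
        if r <= c:
            slope = (breakpoints[i + 1] - breakpoints[i]) / (c - c_prev)
            return breakpoints[i] + slope * (r - c_prev)
        c_prev = c
    return breakpoints[-1]

# Reproduces the example: R1 = 0.83 -> X1 = 1.75
print(empirical_variate(0.83, [0.25, 0.5, 1.0, 1.5, 2.0], [0.31, 0.41, 0.66, 1.00]))
```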
Discrete Distribution

• All discrete distributions can be generated via the inverse-transform technique

• Method: numerically, table-lookup procedure, algebraically, or a formula

Example: Suppose the number of shipments, x, on the loading dock of the IHW company is 0, 1, or 2. Internal
consultants have been asked to improve the efficiency of the loading and hauling operations.

Data - Probability distribution:

x     p(x)    F(x)
0     0.50    0.50
1     0.30    0.80
2     0.20    1.00

F(x) is given by:

        0,     x < 0
F(x) =  0.5,   0 ≤ x < 1
        0.8,   1 ≤ x < 2
        1.0,   x ≥ 2

Consider R1 = 0.73: find xi such that F(xi−1) < R1 ≤ F(xi). Here F(x0) = 0.5 < 0.73 ≤ F(x1) = 0.8; hence, x1 = 1.

The cdf of a discrete random variable has horizontal line segments with jumps of size p(x) at those points x that
the random variable can assume.

Method - Given R, the generation scheme becomes:

     0,    if R ≤ 0.5
x =  1,    if 0.5 < R ≤ 0.8
     2,    if 0.8 < R ≤ 1.0
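A Python sketch of the table-lookup procedure:

```python
def discrete_variate(r, values, cdf):
    """Table-lookup sketch for the inverse transform of a discrete distribution."""
    for x, f in zip(values, cdf):
        if r <= f:                 # first x with R <= F(x)
            return x
    return values[-1]

# IHW shipments example: R1 = 0.73 -> x = 1
print(discrete_variate(0.73, [0, 1, 2], [0.50, 0.80, 1.00]))
```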

Acceptance-Rejection Technique

• Useful particularly when the inverse cdf does not exist in closed form; thinning (used for the NSPP below) is one application

• Illustration: To generate random variates X ~ U(1/4, 1)

(Flowchart: Generate R → test the condition → if yes, output R′; if no, generate another R.)

• Procedure:

Step 1. Generate R ~ U[0,1].

Step 2a. If R ≥ 1/4, accept X = R.

Step 2b. If R < 1/4, reject R, and return to Step 1.

Step 3. If another random variate is required, return to Step 1.

• R does not have the desired distribution, but R conditioned on the event {R ≥ 1/4}, denoted R′, does.

• Efficiency: depends heavily on the ability to minimize the number of rejections.

For 1/4 ≤ R ≤ 1, set X = R:

Probability of rejection = 1/4

Probability of acceptance = P = 3/4

Mean number of rejections per accepted variate = 1/P − 1 = 4/3 − 1 = 1/3

Mean number of random numbers needed per variate = 1/3 + 1 = 4/3 ≈ 1.33

I.e., generating 100 values of X requires about 133 random numbers on average.
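A Python sketch of this acceptance-rejection procedure:

```python
import random

def uniform_quarter_one():
    """Acceptance-rejection sketch for X ~ U(1/4, 1)."""
    while True:
        r = random.random()        # Step 1: R ~ U(0,1)
        if r >= 0.25:              # Step 2a: accept X = R
            return r
                                   # Step 2b: reject and try again

xs = [uniform_quarter_one() for _ in range(100)]   # ~133 random numbers on average
```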

Poisson Distribution:

A Poisson random variable, N, with mean α > 0 has pmf

p(n) = P(N = n) = e^(−α) α^n / n!,   n = 0, 1, 2, …     (5.29)

but, more important, N can be interpreted as the number of arrivals from a Poisson arrival process in
one unit of time. Recall that the interarrival times, A1, A2, …, of successive customers are
exponentially distributed with rate α (i.e., α is the mean number of arrivals per unit time); in
addition, an exponential variate can be generated by Equation (5.3). Thus, there is a relationship
between the (discrete) Poisson distribution and the (continuous) exponential distribution, namely

N = n

if and only if

A1 + A2 + … + An ≤ 1 < A1 + … + An + An+1     (5.30)

Equation (5.29), N = n, says there were exactly n arrivals during one unit of time; relation (5.30)
says that the nth arrival occurred before time 1 while the (n+1)st arrival occurred after time 1.
Clearly, these two statements are equivalent. Proceed now by generating exponential interarrival
times until some arrival, say n + 1, occurs after time 1; then set N = n.
For efficient generation purposes, relation (5.30) is usually simplified by first using Equation (5.3),
Ai = (−1/α) ln Ri, to obtain

Σ(i=1 to n) (−1/α) ln Ri ≤ 1 < Σ(i=1 to n+1) (−1/α) ln Ri

Next multiply through by −α, which reverses the sign of the inequality, and use the fact that a sum of logarithms
is the logarithm of a product, to get

ln Π(i=1 to n) Ri ≥ −α > ln Π(i=1 to n+1) Ri

Finally, use the relation e^(ln x) = x for any number x to obtain

Π(i=1 to n) Ri ≥ e^(−α) > Π(i=1 to n+1) Ri     (5.31)

which is equivalent to relation (5.30). The procedure for generating a Poisson random variate, N,

is given by the following steps:

Step 1. Set n = 0, P = 1.

Step 2. Generate a random number Rn+1 and replace P by P · Rn+1.

Step 3. If P < e^(−α), then accept N = n. Otherwise, reject the current n, increase n by one, and return to Step 2.

Notice that upon completion of Step 2, P is equal to the rightmost expression in relation (5.31). The basic idea
of a rejection technique is again exhibited; if P ≥ e^(−α) in Step 3, then n is rejected and the generation process
must proceed through at least one more trial.

How many random numbers will be required, on average, to generate one Poisson variate, N? If N = n, then
n + 1 random numbers are required, so the average number is given by

E(N + 1) = α + 1

which is quite large if the mean, α, of the Poisson distribution is large.
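A Python sketch of the resulting procedure:

```python
import math, random

def poisson_variate(alpha):
    """Poisson variate sketch: multiply uniforms until the product drops below e^(-alpha)."""
    threshold = math.exp(-alpha)
    n, p = 0, 1.0                     # Step 1
    while True:
        p *= random.random()          # Step 2: P = P * R_{n+1}
        if p < threshold:             # Step 3: accept N = n
            return n
        n += 1                        # otherwise reject and continue
```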

NSPP: Non-stationary Poisson Process

• A Poisson arrival process with an arrival rate that varies with time

• Idea behind thinning:

1. Generate a stationary Poisson arrival process at the fastest rate, λ* = max λ(t).

2. But “accept” only a portion of the arrivals, thinning out just enough to get the desired time-varying rate.

(Flowchart: generate E ~ Exp(λ*) and set t = t + E; if R ≤ λ(t)/λ*, output the arrival time t; otherwise, discard it and generate the next E.)

Example: Generate random variates for a NSPP with the following arrival-rate data:

Data: Arrival Rates

t (min)   Mean Time Between Arrivals (min)   Arrival Rate λ(t) (#/min)
0         15                                  1/15
60        12                                  1/12
120       7                                   1/7
180       5                                   1/5
240       8                                   1/8
300       10                                  1/10
360       15                                  1/15
420       20                                  1/20
480       20                                  1/20

Procedure:

Step 1. λ* = max λ(t) = 1/5, t = 0 and i = 1.

Step 2. For random number R = 0.2130,
E = −5 ln(0.213) = 13.13
t = 0 + 13.13 = 13.13

Step 3. Generate R = 0.8830.
λ(13.13)/λ* = (1/15)/(1/5) = 1/3
Since R > 1/3, do not generate the arrival.

Step 2. For random number R = 0.5530,
E = −5 ln(0.553) = 2.96
t = 13.13 + 2.96 = 16.09

Step 3. Generate R = 0.0240.
λ(16.09)/λ* = (1/15)/(1/5) = 1/3
Since R < 1/3, T1 = t = 16.09, and i = i + 1 = 2.
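A Python sketch of the thinning procedure, with a piecewise-constant rate function built from the table above (assumed constant over each 60-minute block; names are illustrative):

```python
import math, random

def nspp_arrivals(rate_fn, rate_max, horizon):
    """Thinning sketch: arrival times of a non-stationary Poisson process on [0, horizon]."""
    t, arrivals = 0.0, []
    while True:
        t += -math.log(1.0 - random.random()) / rate_max   # E ~ Exp(rate_max)
        if t > horizon:
            return arrivals
        if random.random() <= rate_fn(t) / rate_max:       # accept w.p. lambda(t)/lambda*
            arrivals.append(t)

# Rates from the table, one per 60-minute block (t in minutes)
rates = [1/15, 1/12, 1/7, 1/5, 1/8, 1/10, 1/15, 1/20, 1/20]
arrivals = nspp_arrivals(lambda t: rates[min(int(t // 60), 8)], 1/5, 540)
```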

Gamma Distribution:

Special Properties

• They are based on features of a particular family of probability distributions

• For example:

1. Direct Transformation for normal and lognormal distributions

2. Convolution

3. Beta distribution (from gamma distribution)

Direct Transformation

1. Approach for normal (0,1):

Consider two standard normal random variables, Z1 and Z2, plotted as a point in the plane. In polar coordinates:

Z1 = B cos φ

Z2 = B sin φ

B² = Z1² + Z2² has a chi-square distribution with 2 degrees of freedom, which is the same as an exponential
distribution with mean 2. Hence, the radius can be generated by B = (−2 ln R)^(1/2).

The radius B and angle φ are mutually independent, with φ uniform on (0, 2π), so:

Z1 = (−2 ln R1)^(1/2) cos(2π R2)

Z2 = (−2 ln R1)^(1/2) sin(2π R2)

2. Approach for normal (µ, σ²):

Generate Zi ~ N(0,1), then

Xi = µ + σ Zi

3. Approach for lognormal (µ, σ²):

Generate Xi ~ N(µ, σ²), then

Yi = e^(Xi)
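A Python sketch combining the three approaches (Box-Muller for N(0,1), then shifting and scaling, then exponentiating):

```python
import math, random

def normal_pair(mu=0.0, sigma=1.0):
    """Direct-transformation sketch: two independent N(mu, sigma^2) variates."""
    r1 = 1.0 - random.random()               # keep r1 in (0, 1] so log is safe
    r2 = random.random()
    b = math.sqrt(-2.0 * math.log(r1))       # radius B = sqrt(-2 ln R1)
    z1 = b * math.cos(2.0 * math.pi * r2)    # Z1 = B cos(2 pi R2)
    z2 = b * math.sin(2.0 * math.pi * r2)    # Z2 = B sin(2 pi R2)
    return mu + sigma * z1, mu + sigma * z2  # X = mu + sigma Z

# Lognormal(mu, sigma^2): exponentiate a normal variate
x, _ = normal_pair(mu=0.0, sigma=1.0)
y = math.exp(x)
```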

Vous aimerez peut-être aussi