Nilesh M. Patil
SYLLABUS
Teaching Scheme and Credits
Course Code: BEITC803
Course Name: Computer Simulation And Modeling
Teaching Scheme (Hrs. / Week): Theory 04, Practical 02, Tutorial --
Credits Assigned: Theory 04, Practical/Oral 01, Tutorial --, Total 05
Examination Scheme
Course Code: BEITC803
Course Name: Computer Simulation And Modeling
Theory Marks, Internal Assessment: Test 1 = 20, Test 2 = 20, Avg. of 2 Tests = 20
Theory Marks, End Semester Exam: 80
Term Work: 25
Practical: 25
Oral: --
Total: 150
Course Objectives:
This course presents an introduction to discrete event simulation systems. Emphasis of the course
will be on modeling and the use of simulation languages/software to solve real world problems in
the manufacturing as well as services sectors. The course discusses the modeling techniques of
entities, queues, resources and entity transfers in discrete event environment. The course will teach
the students the necessary skills to formulate and build valid models, implement the model, perform
simulation analysis of the system and analyze results properly. The “theory” of simulation involves
probability and statistics, so a good background in probability and statistics is a
prerequisite.
Course Outcomes:
Understand the meaning of simulation and its importance in business, science, engineering,
industry and services.
Identify the common applications of discrete-event system simulation.
Practice formulation and modeling skills.
Understand simulation languages.
Ability to analyze events and inter-arrival time, arrival process, queuing strategies, resources
and disposal of entities.
An ability to perform a simulation using spreadsheets as well as simulation
language/package.
Ability to generate pseudorandom numbers using the Linear Congruential Method.
Ability to perform statistical tests to measure the quality of a pseudorandom number
generator.
Ability to define random variate generators for finite random variables.
Ability to analyze and fit the collected data to different distributions.
DETAILED SYLLABUS
Sr. No.  Module                                    Detailed Content                                          Hours
1        UNIT – I                                  1. Introduction to Simulation                             15
         Introduction to Simulation                2. Simulation Examples
                                                   3. General Principles
2        UNIT – II                                 4. Statistical Models in Simulation                       8
         Mathematical & Statistical Models         5. Queuing Models
         in Simulation
3        UNIT – III                                6. Random Number Generation, Testing random numbers       9
         Random Numbers                               (Refer to Third edition)
                                                   7. Random Variate Generation: Inverse transform
                                                      technique, Direct Transformation for the Normal
                                                      Distribution, Convolution Method,
                                                      Acceptance-Rejection Technique
                                                      (only Poisson Distribution)
4        UNIT – IV                                 8. Input Modeling                                         12
         Analysis of Simulation Data               9. Verification, Calibration and Validation of
                                                      Simulation Models
                                                   10. Estimation of absolute performance
5        UNIT – V                                  11. Case study:                                           4
         Application                                   Processor and Memory simulation
                                                       Manufacturing & Material handling
Text Books:
1. Discrete Event System Simulation; Third Edition, Jerry Banks, John Carson, Barry Nelson, and
David M. Nicol, Prentice-Hall
2. Discrete Event System Simulation; Fifth Edition, Jerry Banks, John Carson, Barry Nelson, and
David M. Nicol, Prentice-Hall
References:
1. System Modeling & Analysis; Averill M Law, 4th Edition TMH.
2. Principles of Modeling and Simulation; Banks C M , Sokolowski J A; Wiley
3. System Simulation ; Geoffrey Gordon ; EEE
4. System Simulation with Digital Computer; Narsing Deo, PHI
Term work:
Laboratory work: 10 Marks
Mini Simulation Project presentation: 10 Marks
Attendance / Quiz: 5 Marks
Mini Simulation Project: real-world examples, which can be from the fields of business, transportation,
medical, computing, manufacturing and material handling. A presentation is to be given.
Theory Examination:
The question paper will comprise 6 questions, each carrying 20 marks.
A total of 4 questions need to be solved.
Q.1 will be compulsory and based on the entire syllabus, with sub-questions of 2 to 3 marks
each.
The remaining questions will be randomly selected from all the modules.
UNIT: I
INTRODUCTION TO SIMULATION
CHAPTER 1
INTRODUCTION TO SIMULATION
2. Semiconductor Manufacturing
i. To compare dispatching rules using large-facility models.
ii. To plan capacity with time constraints between operations.
iii. To compare a 200-mm and 300-mm X-ray lithography cell.
iv. To reduce up to 300-mm logistic system risk.
3. Construction Engineering
i. To construct a dam embankment.
Computer Simulation and Modeling 4
4. Military Applications
i. To design and test an intelligent controller for autonomous underwater vehicles.
ii. To model military requirements for non-war-fighting operations.
iii. To evaluate multi-trajectory performance for varying scenario sizes.
iv. To use adaptive agents in Air Force pilot retention.
7. Human Systems
i. Modeling human performance in complex systems.
ii. Studying the human element in air traffic control.
8. Health Care
i. Estimating maximum capacity in the emergency room.
ii. Reducing the length of stay in an emergency department.
possible outcomes.
4. State: The state of a system is defined to be that collection of variables necessary to
describe the system at any time, relative to the objectives of the study. Example: Number
of tellers, number of customers waiting in queue.
5. Event: An event is defined as an instantaneous occurrence that may change the state
of the system.
There are two types of events: [D-14]
i. Endogenous Event: An endogenous event is an event occurring within the system.
Example: Completion of service of a customer
ii. Exogenous Event: An exogenous event is an event occurring in the environment that
affects the system. Example: Arrival of customer in bank
discrete set of points in time. Example: Number of customers waiting in line in bank.
8. Continuous Model: A continuous model is one in which the state variables change
continuously over time. Example: Head of water behind the dam.
1. Problem Formulation
i. Every simulation study begins with a statement of the problem.
ii. If the statement is provided by the policy makers, or those that have the
problem, the analyst must ensure that the problem being described is clearly
understood.
iii. If a problem statement is being developed by the analyst, it is important that
policy makers understand and agree with the formulation.
iv. Even with all these precautions, it is possible that the problem will need to be
reformulated as the simulation study progresses.
3. Model conceptualization
i. A conceptual model abstracts the real-world system under investigation.
ii. Modeling should begin in a simple manner and then build toward greater but
appropriate complexity.
iii. The model user should be involved in conceptualization, as this will enhance the
quality of the resulting model and also increase the confidence of the model
user in the application of the model.
4. Data Collection
i. There is a constant interplay between the construction of the model and the
collection of the needed input data.
ii. As the complexity of the model changes, the required data elements may also
change.
iii. The system objectives dictate the kind of data to be collected.
iv. Data collection should be started as early as possible, since it takes a large
portion of the total time required to perform a simulation study.
5. Model Translation
i. The conceptual model constructed in step 3 is coded into a computer-recognizable
form, called the operational model, using a simulation language.
6. Verified?
i. Verification pertains to the computer program prepared for the simulation
8. Experimental Design
i. For each system design that is to be simulated, decisions need to be made
concerning the length of the initialization period, the length of simulation runs,
and the number of replications to be made of each run.
12. Implementation
i. This is the final step in the simulation study.
ii. If the model user has been involved throughout the study period, and the
simulation analyst has followed all the steps rigorously, then the
implementation is likely to be successful.
■■■
CHAPTER 2
SIMULATION EXAMPLES
A queueing system is described by its calling population, the nature of the arrivals, the
service mechanism, the system capacity, and the queueing discipline. These attributes of a
queueing system are described in detail below.
we might be able to determine a rate for arrivals by counting the number of customers
during a specific time period, we would not know exactly when these customers
would arrive. It might be that no customers would arrive during one hour and 20
customers would arrive during another hour.
Arrivals are assumed to be independent of each other and to vary randomly over time.
Arrivals may occur at scheduled times or random times.
The most important model for random arrivals is the Poisson arrival process. It is used
as a model for the arrival of people at restaurants, drive-in banks and other service
facilities, and for the arrival of telephone calls at a telephone exchange.
A second important class of arrivals is scheduled arrivals. In this case the inter-arrival
times may be constant, or constant plus or minus a small random amount to
represent early or late arrivals. Examples: patients arriving at a physician's office,
scheduled airline flights arriving at an airport.
A third situation occurs when at least one customer is assumed to be always present in
the queue so that the server is never idle because of lack of customers. For example, a
customer may represent raw material for a product and sufficient raw material is
assumed to be always available.
The system capacity has no limit, and the units are served in the order of their arrival
(FIFO) by a single server or channel.
Arrivals and services are defined by the distribution of the time between arrivals and
the distribution of the service times respectively.
Some concepts of queuing system are:
State of the system – The number of units in the system and the status of the server-
busy or idle
Event – A set of circumstances that causes an instantaneous change in the state of the
system. Only two events can affect the state of the system: the entry of a unit into the
system (the arrival event) and the completion of service on a unit (the departure
event).
Simulation Clock – used to track simulated time.
The queuing system includes the server, the unit being serviced and the units in the
queue. If a unit has just completed the service, the simulation proceeds as shown in
the figure 2.4.
The arrival event occurs when the unit enters the system. The flow diagram for the
arrival event is shown in the figure 2.5.
The unit may find the server either idle or busy; therefore it begins service
immediately or enters the queue for the server. This course of action is shown in
figure 2.6.
                    Queue status
                    Not empty      Empty
Server    Busy      Enter queue    Enter queue
status    Idle      Impossible     Enter service
Fig 2.6 Potential unit actions upon arrival
After the completion of the service, the server may become idle or remain busy with
the next unit. This relationship of the server outcomes to the status of the queue is
shown in figure 2.7.
                    Queue status
                    Not empty            Empty
Server    Busy      Possible (shaded)    Impossible
outcomes  Idle      Impossible           Possible (shaded)
Figure 2.7 Server outcomes after service completion
If the queue is not empty, another unit will enter the server and it will be busy. If the
queue is empty, the server will be idle after a service is completed. These two
possibilities are shown in the shaded portion of the figure 2.7.
Simulation clock times for arrivals and departures are computed in a simulation table
customized for each problem.
In simulation, events usually occur at random times.
The randomness needed to imitate real life is made possible through the use of
random numbers.
Random numbers are distributed uniformly and independently on the interval (0, 1).
Random digits are uniformly distributed on the set {0, 1, 2,…, 9}.
Random digits can be used to form random numbers by selecting the proper number
of digits for each random number and placing a decimal point to the left of the value
selected.
The proper number of digits is dictated by the accuracy of the data being used for
input purposes.
If the input distribution has values with two decimal places, two digits are taken from
a random-digits table (such as Table A.1) and the decimal point is placed to the left to
form a random number.
In a single-channel queueing system inter-arrival times and service times are
generated from the distributions of these random variables.
Solution:
The time between arrivals is uniformly distributed between 1 and 10 minutes, so each of the
ten possible inter-arrival times has the same probability, 0.100.
Using this, we can generate the distribution of inter-arrival times and assign random digits to it
as shown in table 2.1 below:
Time Between         Probability   Cumulative    Random-Digit
Arrivals (minutes)                 Probability   Assignment
1                    0.100         0.100         001-100
2                    0.100         0.200         101-200
3                    0.100         0.300         201-300
4                    0.100         0.400         301-400
5                    0.100         0.500         401-500
6                    0.100         0.600         501-600
7                    0.100         0.700         601-700
8                    0.100         0.800         701-800
9                    0.100         0.900         801-900
10                   0.100         1.000         901-000
Table 2.1 Distribution of time between arrivals
Similarly, assign random digits to the service time as shown in table 2.2 below.
Service Time   Probability   Cumulative    Random-Digit
(minutes)                    Probability   Assignment
1              0.04          0.04          01-04
2              0.20          0.24          05-24
3              0.10          0.34          25-34
4              0.26          0.60          35-60
5              0.35          0.95          61-95
6              0.05          1.00          96-00
Table 2.2 Service time distributions
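The random-digit assignments in Tables 2.1 and 2.2 follow mechanically from the cumulative probabilities. The sketch below is one possible way to build and use such an assignment in Python. It assumes the convention used in the tables that the all-zero digit string (00 or 000) closes the last interval; the helper names `digit_table` and `lookup` are illustrative and do not come from the text.

```python
from itertools import accumulate


def digit_table(values, probabilities, digits):
    """Build [(upper_bound, value)] pairs from a discrete distribution.

    digits is the number of random digits per draw (2 -> 01..00, 3 -> 001..000).
    """
    scale = 10 ** digits
    bounds = [round(c * scale) for c in accumulate(probabilities)]
    return list(zip(bounds, values))


def lookup(random_digits, table, digits):
    """Map a random-digit draw to a value; 00 / 000 counts as the top of the range."""
    draw = random_digits if random_digits != 0 else 10 ** digits
    for upper, value in table:
        if draw <= upper:
            return value
    raise ValueError("draw outside the table range")


# Table 2.1: inter-arrival times 1..10 minutes, each with probability 0.100
ARRIVAL_TABLE = digit_table(range(1, 11), [0.1] * 10, digits=3)
# Table 2.2: service times 1..6 minutes
SERVICE_TABLE = digit_table(range(1, 7), [0.04, 0.20, 0.10, 0.26, 0.35, 0.05], digits=2)

print(lookup(853, ARRIVAL_TABLE, 3))  # inter-arrival time for random digits 853
print(lookup(71, SERVICE_TABLE, 2))   # service time for random digits 71
```

Rounding the scaled cumulative probabilities avoids floating-point drift when the upper bounds are computed.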
First, initialize the table by entering the details for the first customer. Here, we are assuming
that the first customer is arriving at time 0. Service of first customer begins immediately since
nobody is present in the system and gets over at time 5. So we can say that the customer
spends 5 minutes in the system. After filling the first customer details, subsequent rows in the
table are filled based on the random numbers for inter-arrival time and service time and the
completion of service time of the previous customer.
The simulation table for single server system is given in table 2.3 below.
Customer  R.D. for  Time Between  Arrival  R.D. for  Service  Time Service  Time Customer    Time Service  Time Customer     Idle Time
          Arrival   Arrivals      Time     Service   Time     Begins        Waits in Queue   Ends          Spends in System  of Server
1         --        --            0        71        5        0             0                5             5                 0
2         853       9             9        59        4        9             0                13            4                 4
3         340       4             13       12        2        13            0                15            2                 0
4         205       3             16       88        5        16            0                21            5                 1
5         99        1             17       97        6        21            4                27            10                0
6         669       7             24       66        5        27            3                32            8                 0
7         742       8             32       81        5        32            0                37            5                 0
8         301       4             36       35        4        37            1                41            5                 0
9         888       9             45       29        3        45            0                48            3                 4
10        444       5             50       91        5        50            0                55            5                 2
                    Σ = 50                           Σ = 44                 Σ = 8                          Σ = 52            Σ = 11
Table 2.3 Simulation Table for Single Server Queue
Output Statistics:
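1. Average waiting time = (total time customers wait in queue) / (total number of customers)
   = 8/10 = 0.8 minutes
2. Probability that a customer has to wait = (number of customers who wait) / (total number of customers)
   = 3/10 = 0.3
3. Probability of idle server = (total idle time of server) / (total run time of simulation)
   = 11/55 = 0.2, so the probability of a busy server = 1 − 0.2 = 0.8
4. Average service time = (total service time) / (total number of customers) = 44/10 = 4.4 minutes
   Expected service time E(S) = ∑ s·p(s) = 1(0.04) + 2(0.20) + 3(0.10) + 4(0.26) + 5(0.35) + 6(0.05) = 3.83 minutes
5. Average time between arrivals = (sum of all times between arrivals) / (number of arrivals − 1)
   = 50/9 = 5.56 minutes
6. Average waiting time of those who wait = (total time customers wait in queue) / (number of customers who wait)
   = 8/3 = 2.67 minutes
7. Average time a customer spends in the system = (total time customers spend in the system) / (total number of customers)
   = 52/10 = 5.2 minutes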
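The bookkeeping behind Table 2.3 can be sketched in a few lines of Python. This is a minimal illustration, assuming a single FIFO server and the inter-arrival and service times already read off from the random digits in Table 2.3; the function name `single_server_table` and the dictionary keys are illustrative choices, not notation from the text.

```python
def single_server_table(interarrivals, services):
    """Return one row per customer for a single-server FIFO queue."""
    rows = []
    arrival = 0      # the first customer is assumed to arrive at time 0
    prev_end = 0     # time at which the previous service ended
    for i, service in enumerate(services):
        if i > 0:
            arrival += interarrivals[i - 1]
        begin = max(arrival, prev_end)   # wait if the server is still busy
        end = begin + service
        rows.append({
            "arrival": arrival,
            "service": service,
            "begin": begin,
            "wait": begin - arrival,
            "end": end,
            "in_system": end - arrival,
            "idle": begin - prev_end,    # server idle time before this service
        })
        prev_end = end
    return rows


INTERARRIVALS = [9, 4, 3, 1, 7, 8, 4, 9, 5]      # customers 2..10 in Table 2.3
SERVICES = [5, 4, 2, 5, 6, 5, 5, 4, 3, 5]        # customers 1..10 in Table 2.3

rows = single_server_table(INTERARRIVALS, SERVICES)
n = len(rows)
total_wait = sum(r["wait"] for r in rows)
total_idle = sum(r["idle"] for r in rows)
print("average waiting time:", total_wait / n)
print("fraction of time server is idle:", total_idle / rows[-1]["end"])
```

The column totals produced by this sketch can be compared directly with the Σ row of Table 2.3.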
Solution:
The simulation table for single server system is given in table below.
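Customer  Time Between  Arrival  Service  Time Service  Time Customer    Time Service  Time Customer     Idle Time
          Arrivals      Time     Time     Begins        Waits in Queue   Ends          Spends in System  of Server
1         --            0        4        0             0                4             4                 0
2         8             8        1        8             0                9             1                 4
3         6             14       4        14            0                18            4                 5
4         1             15       3        18            3                21            6                 0
5         8             23       2        23            0                25            2                 2
6         3             26       4        26            0                30            4                 1
7         8             34       5        34            0                39            5                 4
8         7             41       4        41            0                45            4                 2
9         2             43       5        45            2                50            7                 0
10        3             46       3        50            4                53            7                 0
          Σ = 46                 Σ = 35                 Σ = 9                          Σ = 44            Σ = 18
Output Statistics:
1. Average waiting time = (total time customers wait in queue) / (total number of customers) = 9/10 = 0.9 minutes
2. Probability that a customer has to wait = (number of customers who wait) / (total number of customers) = 3/10 = 0.3
3. Probability of idle server = (total idle time of server) / (total run time of simulation) = 18/53 = 0.34
4. Average service time = (total service time) / (total number of customers) = 35/10 = 3.5 minutes
5. Average time a customer spends in the system = (total time customers spend in the system) / (total number of customers) = 44/10 = 4.4 minutes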
Example 2.3
Consider a drive-in restaurant where carhops take orders and bring food to the car. Cars arrive
according to the inter-arrival distribution of cars. There are two carhops, Able and Baker.
Able is better able to do the job and works a bit faster than Baker. The distributions of their
service times are given below.
Able's service time distribution
Service Time   2      3      4      5
Probability    0.17   0.24   0.29   0.30
Baker's service time distribution
Service Time   3      4      5      6
Probability    0.18   0.22   0.30   0.30
Develop the simulation table and analyze the system by simulating the arrival and service of
10 customers. Random digits for inter-arrival time and service time are as follows:
Customer 1 2 3 4 5 6 7 8 9 10
R.D. for Interarrival Time -- 32 66 41 21 37 79 18 60 98
R.D. for Service Time 49 53 34 17 30 52 22 62 56 73
Solution:
Example 2.4
Suppose that the maximum inventory level, M, is 11 units and the review period, N, is 5 days.
The distribution of the number of units demanded per day is given in table 2.8.
Demand 0 1 2 3 4
Probability 0.10 0.25 0.35 0.21 0.09
Table 2.8: Probability distribution of the number of units demanded per day
Lead time is a random variable as shown in table 2.9.
Lead Time (Day) 1 2 3
Probability 0.6 0.3 0.1
Table 2.9: Probability distribution of lead time
Assume that orders are placed at the close of business and are received for inventory at
the beginning of business as determined by the lead time. Also assume that the initial inventory is
3 units. Simulate the system for 5 cycles and estimate the average ending units in inventory and the
number of days on which a shortage condition occurs.
Random digits for demand and lead time are given in table 2.10 and table 2.11.
Random digits for demand: 26 68 33 39 86 18 64 79 55 74 21 43 49
                          90 35 08 98 61 85 81 53 15 94 19 44
Table 2.10: Random digit for demand
Random digit for lead time 8 7 2 3 1
Table 2.11: Random digit for lead time
Solution:
Table 2.12: Random Digit Assignment for Daily Demand
Demand Probability Cumulative Probability Random Digit Assignment
0 0.10 0.10 01 – 10
1 0.25 0.35 11 – 35
2 0.35 0.70 36 – 70
3 0.21 0.91 71 – 91
4 0.09 1.00 92 – 00
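Cycle  Day  Beginning  R.D. for  Demand  Ending     Shortage  Order     R.D. for   Days until
            Inventory  Demand            Inventory  Quantity  Quantity  Lead Time  Order Arrives
       5    3          35        1       2          0         9         2          1
4      1    2          08        0       2          0         -         -          0
       2    11         98        4       7          0         -         -          -
       3    7          61        2       5          0         -         -          -
       4    5          85        3       2          0         -         -          -
       5    2          81        3       0          1         11        3          1
5      1    0          53        2       0          3         -         -          0
       2    11         15        1       7          0         -         -          -
       3    7          94        4       3          0         -         -          -
       4    3          19        1       2          0         -         -          -
       5    2          44        2       0          0         11        1          1
                                         ∑ = 64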
Based on five cycles of simulation, the average ending inventory is approximately 2.56 units
(64/25). A shortage condition occurs on 5 of the 25 days.
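The (M, N) inventory policy of Example 2.4 can be replayed with a short Python sketch. Several points here are assumptions rather than statements from the text: an initial outstanding order of 8 units arriving at the start of day 3 (without it the total ending inventory of 64 is not reproduced), an order quantity of M minus the ending inventory (as the surviving table rows suggest), a random-digit assignment for lead time of 1-6, 7-9 and 0 derived from Table 2.9, and all function and variable names.

```python
DEMAND_TABLE = [(10, 0), (35, 1), (70, 2), (91, 3), (100, 4)]  # (upper digit, demand)
LEAD_TABLE = [(6, 1), (9, 2), (10, 3)]                          # assumed: 1-6, 7-9, 0

DEMAND_DIGITS = [26, 68, 33, 39, 86, 18, 64, 79, 55, 74, 21, 43, 49,
                 90, 35, 8, 98, 61, 85, 81, 53, 15, 94, 19, 44]
LEAD_DIGITS = [8, 7, 2, 3, 1]


def lookup(digits, table, top):
    """Map random digits to a value; a draw of 0 stands for the top of the range."""
    draw = digits or top
    for upper, value in table:
        if draw <= upper:
            return value
    raise ValueError("draw outside the table range")


def simulate(demand_digits, lead_digits, M=11, N=5, inventory=3,
             on_order=8, days_left=2):
    """Return (ending inventories per day, number of days with a shortage)."""
    shortage, shortage_days, ending = 0, 0, []
    leads = iter(lead_digits)
    for day, digits in enumerate(demand_digits, start=1):
        if on_order and days_left == 0:          # order received at start of day
            inventory += on_order
            on_order = 0
            filled = min(inventory, shortage)    # fill backorders first
            inventory -= filled
            shortage -= filled
        demand = lookup(digits, DEMAND_TABLE, 100)
        served = min(inventory, demand)
        inventory -= served
        shortage += demand - served
        ending.append(inventory)
        shortage_days += shortage > 0
        if on_order:
            days_left -= 1
        if day % N == 0:                         # review at the end of each cycle
            on_order = M - inventory
            days_left = lookup(next(leads), LEAD_TABLE, 10)
    return ending, shortage_days


ending, shortage_days = simulate(DEMAND_DIGITS, LEAD_DIGITS)
print(sum(ending) / len(ending), shortage_days)
```

Under these assumptions the last two cycles of the sketch match the table rows shown above, which is a useful check on the replenishment timing.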
Solution:
Table 2.15: Random Digit Assignment for Daily Demand
Demand Probability Cumulative Probability Random Digit Assignment
0 0.2 0.2 1-2
1 0.5 0.7 3-7
2 0.3 1.0 8-0
If the initial inventory is 4 units, determine on which day the shortage condition occurs.
Solution:
Table 2.17: Random Digit Assignment for Daily Demand
Demand Probability Cumulative Probability Random Digit Assignment
0 0.2 0.2 1-2
1 0.5 0.7 3-7
2 0.3 1.0 8-0
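Profit = [(revenue from sales) − (cost of newspapers) − (lost profit from excess demand) + (salvage from sale of scrap papers)]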
Solution:
Note: $1 = 100 cents
The revenue from sales is 50 cents for each paper sold. The cost of newspapers is 33
cents for each paper purchased.
The lost profit from excess demand is 50 – 33 = 17 cents for each paper demanded
that could not be provided.
The salvage value of scrap papers is 5 cents each.
Tables 2.20 and 2.21 provide the random digits for the types of newsdays and the
demands for those days.
Solving this problem by simulation requires setting a policy of buying a certain
number of papers each day, then simulating the demands for papers over the 10-day
time period to determine the total profit. Here the simulation is carried out for a purchase
of 70 newspapers and is shown in table 2.22.
The policy (number of newspapers purchased) is changed to other values and the
simulation is repeated until the best value is found.
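The daily profit for one purchase policy can be sketched as a small Python function. It uses the prices listed above (50 cents revenue, 33 cents cost, 17 cents lost profit, 5 cents salvage) and the usual profit expression for this problem: revenue from sales, minus cost of newspapers, minus lost profit from excess demand, plus salvage from scrap papers. The function name and the sample demands of 60, 70 and 80 papers are illustrative assumptions, not values taken from the simulation table.

```python
def daily_profit_cents(purchased, demand, price=50, cost=33, salvage=5):
    """Profit in cents for one day under a fixed purchase quantity."""
    sold = min(purchased, demand)
    revenue = price * sold
    purchase_cost = cost * purchased
    lost_profit = (price - cost) * max(demand - purchased, 0)  # unmet demand
    scrap = salvage * max(purchased - demand, 0)               # unsold papers
    return revenue - purchase_cost - lost_profit + scrap


# Policy from the text: buy 70 papers per day; the demands are hypothetical.
for demand in (60, 70, 80):
    print(demand, daily_profit_cents(70, demand) / 100)
```

Repeating the loop for other purchase quantities is one way to search for the best policy.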
Table 2.20 Random - digit assignment for type of newsday
Type of Newsday   Probability   Cumulative Probability   Random-Digit Assignment
Good              0.35          0.35                     01-35
Fair              0.45          0.80                     36-80
Poor              0.20          1.00                     81-00
Bagels sell for ₹ 4.40 per dozen. They cost ₹ 3.40 per dozen to make. All bagels not sold at
the end of the day are sold at half price to a local grocery store. Assume that the baker bakes
20 dozen every day; simulate 5 days and find the total profit. Also give your
suggestions to the baker on the basis of the current scenario. Random digits for the number of
customers/day and dozens ordered/customer are given in table 2.25 and table 2.26 respectively.
R.D. for Dozens Ordered/Customer:
1 0 0 2 0 9 3 3 2 9 8 7 5 3 6 3 4 9 6 3 1 0 7 2 8 7 4
8 5 1 7 1 3 5 0 4 0 1 8 2 4 5 9 2 6 3 0 9 0 0 8 1 5 6
Table 2.26 Random Digits for dozens ordered/customer
Solution:
Table 2.27 Random Digit Assignment for Number of Customers/Day
Number of Customers/Day   Probability   Cumulative Probability   Random-Digit Assignment
8                         0.35          0.35                     01 – 35
10                        0.25          0.60                     36 – 60
12                        0.20          0.80                     61 – 80
14                        0.14          0.94                     81 – 94
16                        0.06          1.00                     95 – 00
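Profit = [(revenue from sales) + (revenue from grocery store sales) − (cost of bagels made) − (lost profit from excess demand)]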
Given that:
Cost price of 1 dozen bagels = ₹ 3.40
Selling price of 1 dozen bagels = ₹ 4.40
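Day  R.D. for       Number of  Dozens  Revenue     Revenue from   Lost      Daily
     Customers/Day  Customers  Baked   from Sales  Grocery Store  Profit    Profit
3    97             16         20      ₹ 88.00     ₹ 0            ₹ 19.00   ₹ 1.00
4    72             12         20      ₹ 88.00     ₹ 0            ₹ 13.00   ₹ 7.00
5    20             8          20      ₹ 88.00     ₹ 0            ₹ 10.00   ₹ 10.00
Random digits and dozens ordered, customer by customer:
Day 2 (last three customers only):
R.D. for dozens   3 4 9
Dozens ordered    2 2 5
Day 3:
R.D. for dozens   6 3 1 0 7 2 8 7 4 8 5 1 7 1 3 5
Dozens ordered    3 2 1 5 3 1 4 3 2 4 2 1 3 1 2 2
Day 4:
R.D. for dozens   0 4 0 1 8 2 4 5 9 2 6 3
Dozens ordered    5 2 5 1 4 1 2 2 5 1 3 2
Day 5:
R.D. for dozens   0 9 0 0 8 1 5 6
Dozens ordered    5 5 5 5 4 1 2 3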
Total Profit = ₹ (13+11+1+7+10) = ₹ 42. Looking at the trends, the baker should bake more
than 20 dozens of bagels.
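The daily profit figures for the bagel problem can be checked with a short Python sketch. It assumes, as the daily figures above suggest, that lost profit is the margin (selling price minus cost) on each dozen demanded but not available, and that unsold dozens go to the grocery store at half price. Amounts are kept in paise to avoid floating-point rounding; the function name and the demand totals passed to it are illustrative.

```python
def bagel_profit(baked, ordered, price=440, cost=340):
    """Daily profit in rupees; price and cost are per dozen, in paise."""
    sold = min(baked, ordered)
    unsold = baked - sold
    revenue = sold * price
    grocery_revenue = unsold * price // 2          # half price for leftovers
    making_cost = baked * cost
    lost_profit = (ordered - sold) * (price - cost)
    return (revenue + grocery_revenue - making_cost - lost_profit) / 100


# 20 dozen baked; total dozens ordered on three days (illustrative totals)
for ordered in (39, 33, 30):
    print(ordered, bagel_profit(20, ordered))
```

Because demand exceeds 20 dozen on each of these days, every extra dozen baked would add margin instead of lost profit, which is the reasoning behind the suggestion to bake more.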
Example 2.9
A large milling machine has three different bearings that fail in service. The cumulative
distribution function of the life of each bearing is identical, as shown in Table 2.30.
Bearing Life (Hours) 1000 1100 1200 1300 1400 1500
Probability 0.20 0.30 0.25 0.10 0.08 0.07
Table 2.30 Probability Distribution of the life of bearing
When a bearing fails, the mill stops, a repairperson is called, and a new bearing is installed.
The delay time of the repairperson's arriving at the milling machine is also a random variable,
with the distribution given in Table 2.31.
Delay Time (Minutes) 5 10 15
Probability 0.4 0.5 0.1
Table 2.31 Probability Distribution of delay
Downtime for the mill is estimated at ₹5 per minute. The direct on-site cost of the
repairperson is ₹15 per hour. It takes 20 minutes to change one bearing, 30 minutes to change
two bearings, and 40 minutes to change three bearings. The bearings cost ₹16 each. A
proposal has been made to replace all three bearings whenever a bearing fails. Management
needs an evaluation of this proposal. Run the simulation for 10,000 hours of operation.
Solution:
Table 2.32 Random Digit Assignment for Distribution of the life of bearing
Bearing Life   Probability   Cumulative    Random-Digit
(Hours)                      Probability   Assignment
1000           0.20          0.20          01 – 20
1100           0.30          0.50          21 – 50
1200           0.25          0.75          51 – 75
1300           0.10          0.85          76 – 85
1400           0.08          0.93          86 – 93
1500           0.07          1.00          94 – 00
Proposed Method:
This method is given in table 2.35. The bearing lifetimes are assumed to be the same as
those used for the current method. Because the proposed method requires more bearings than
the current method, the simulation needs a new set of random digits to generate the additional
bearing lifetimes. Additional lifetimes are required at the 9th replacement of bearings 2 and 3.
In the proposed method, all three bearings are replaced together whenever the first failure occurs.
The repairperson delay is generated independently using different sets of random digits.
At the end of the simulation for 10,000 hours, 9 sets of bearings had been replaced.
The total cost of the proposed method is computed as follows:
Cost of bearings = 27 bearings * ₹16 per bearing = ₹432
Cost of delay = 80 minutes * ₹5 per minute = ₹400
Cost of downtime during repair = 9 sets * ₹5 per minute * 40 minutes per set
= ₹1800
Cost of repairperson = 9 sets * 40 minutes per set * ₹ 15/60 minutes
= ₹90
Total cost = ₹ (432 + 400 + 1800 + 90) = ₹2722
Hence, we can conclude that the proposed method has a saving of ₹1278 over a period of
10,000 hours simulation.
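The cost arithmetic above can be written as a small Python function, which makes it easy to re-evaluate the proposal for other simulation outcomes. The unit costs are those given in the example; the function and parameter names are illustrative assumptions.

```python
def replacement_cost(sets_replaced, bearings_per_set, total_delay_minutes,
                     repair_minutes_per_set, bearing_cost=16,
                     downtime_cost_per_minute=5, repair_rate_per_hour=15):
    """Total cost in rupees of a bearing-replacement policy."""
    bearings = sets_replaced * bearings_per_set * bearing_cost
    delay = total_delay_minutes * downtime_cost_per_minute
    repair_minutes = sets_replaced * repair_minutes_per_set
    downtime = repair_minutes * downtime_cost_per_minute
    repairperson = repair_minutes * repair_rate_per_hour / 60
    return bearings + delay + downtime + repairperson


# Proposed method: 9 sets of 3 bearings, 80 minutes of total delay,
# 40 minutes to change three bearings.
proposed = replacement_cost(9, 3, 80, 40)
print(proposed)
```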
Example 2.10
A firm sells bulk rolls of newsprint. The daily demand is given by the following probability
distribution:
Daily Demand (Rolls) 3 4 5 6
Probability 0.20 0.35 0.30 0.15
Lead time is a random variable given by the following distribution:
Solution:
[Figure: histogram of lead-time demand. x-axis: Lead-Time Demand (classes 0-3, 4-7, 8-11, 12-15); y-axis: Relative Frequency.]
■■■
CHAPTER 3
GENERAL PRINCIPLES
This chapter develops a common framework for the modeling of complex systems using
discrete-event simulation. It covers the basic building blocks of all discrete-event simulation
models: entities and attributes, activities and events. In discrete-event simulation, a system is
modeled in terms of its state at each point in time; the entities that pass through the system
and the entities that represent system resources; and the activities and events that cause
system state to change. Discrete-event models are appropriate for those systems for which
changes in system state occur only at discrete points in time.
Delay versus Activity
1. Delay: A delay's duration is not specified by the modeler ahead of time, but rather
   is determined by system conditions.
   Activity: An activity's duration is defined by the modeler and is computable from
   its specification at the instant it begins.
2. Delay: A delay is sometimes called a conditional wait.
   Activity: An activity is called an unconditional wait.
3. Delay: The completion of a delay is sometimes called a conditional or secondary event,
   but such events are not represented by event notices, nor do they appear on the FEL.
   Activity: The completion of an activity is an event, often called a primary event,
   which is managed by placing an event notice on the FEL.
4. Delay: A delay's duration is measured and is one of the desired outputs of a model
   run. Typically, a delay ends when some set of logical conditions becomes true or one
   or more other events occur.
   Activity: An activity typically represents a service time, an inter-arrival time, or
   any other processing time.
Clock  System State   Entities and  Set 1  Set 2  …  Future Event List, FEL                        Cumulative Statistics
                      Attributes                                                                    and Counters
t      (x, y, z, …)                                  (2, t1) – type 2 event to occur at time t1
                                                     (3, t2) – type 3 event to occur at time t2
.      .              .             .      .      .  .                                              .
.      .              .             .      .      .  .                                              .
Figure 3.1 Prototype System Snapshot at simulation time t
The FEL is ordered by event time, meaning that the events are arranged
chronologically; that is, the event times satisfy
t < t1 ≤ t2 ≤ t3 ≤ … ≤ tn, where t is the current value of the simulation clock.
Future Events Generation by the alternate generation of runtimes and downtimes for a
machine subject to breakdowns
At time 0, the first runtime will be generated and an end-of-runtime event scheduled.
Whenever an end-of-runtime event occurs, a downtime will be generated and an end-
of-downtime event is scheduled on the FEL.
When the CLOCK is eventually advanced to the time of this end-of-downtime event,
a runtime is generated and an end-of-runtime event is scheduled on the FEL.
In this way, runtimes and downtimes continually alternate throughout the simulation.
A runtime and a downtime are examples of activities, and end of runtime and end of
downtime are primary events.
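The alternate generation of runtimes and downtimes described above can be sketched with a future event list kept as a heap. This is an illustration only: the exponential distributions with means of 100 and 10 time units, the seed, and the names `machine_events`, `draw_runtime` and `draw_downtime` are assumptions and are not part of the text.

```python
import heapq
import random


def machine_events(t_end, draw_runtime, draw_downtime):
    """Return the (time, event) history of a machine subject to breakdowns."""
    # At time 0 the first runtime is generated and its end is scheduled.
    fel = [(draw_runtime(), "end_of_runtime")]
    history = []
    while fel and fel[0][0] <= t_end:
        clock, kind = heapq.heappop(fel)          # advance CLOCK to imminent event
        history.append((clock, kind))
        if kind == "end_of_runtime":
            # the machine has just failed: generate a downtime
            heapq.heappush(fel, (clock + draw_downtime(), "end_of_downtime"))
        else:
            # the repair has just finished: generate the next runtime
            heapq.heappush(fel, (clock + draw_runtime(), "end_of_runtime"))
    return history


rng = random.Random(1)
history = machine_events(
    500,
    draw_runtime=lambda: rng.expovariate(1 / 100),   # assumed mean runtime 100
    draw_downtime=lambda: rng.expovariate(1 / 10),   # assumed mean downtime 10
)
for clock, kind in history:
    print(f"{clock:8.2f}  {kind}")
```

Each event routine schedules exactly one successor, so the FEL in this sketch never holds more than one notice at a time.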
Stopping Event
Every simulation must have a stopping event, here called E, which defines how long the
simulation will run. There are generally two ways to stop a simulation:
At time 0, schedule a stop simulation event at a specified future time TE. Thus, before
simulating, it is known that the simulation will run over the time interval [0, TE].
Example: Simulate a job shop for TE = 40 hours.
Run length TE is determined by the simulation itself. Generally, TE is the time of
occurrence of some specified event E. Examples: TE is the time of the 100th service
completion at a certain service center. TE is the time of breakdown of a complex
system. TE is the time of disengagement or total kill (whichever occurs first) in a
combat simulation. TE is the time at which a distribution center ships the last carton in
a day's orders.
Proponents claim that the activity-scanning approach is simple in concept and leads to
modular models that are more maintainable and easily understood and modified by
other analysts at later times. They admit, however, that the repeated scanning to
determine whether an activity can begin results in slow runtime on computers.
Thus, the pure activity-scanning approach has been modified (and made conceptually
somewhat more complex) by what is called the three-phase approach, which
combines some of the features of event scheduling with activity scanning to allow for
variable time advance and the avoidance of scanning when it is not necessary, but
keeping the main advantages of the activity-scanning approach.
In the three-phase approach, events are considered to be activities of duration-zero
time units. With this definition, activities are divided into two categories, called B and
C.
B activities: activities bound to occur; all primary events and unconditional activities.
C activities: activities or events that are conditional upon certain conditions being
true.
The B-type activities and events can be scheduled ahead of time, just as in the event-
scheduling approach.
This allows variable time advance. The FEL contains only B-type events. Scanning to
check if any C-type activities can begin or C-type events occur happens only at the
end of each time advance, after all
B-type events have completed. In summary, with the three-phase approach, the
simulation proceeds with repeated execution of the three phases until it is completed:
Phase A Remove the imminent event from the FEL and advance the clock to its event
time. Remove any other events from the FEL that have the same event time.
Phase B Execute all B-type events that were removed from the FEL. (This may free a
number of resources or otherwise change system state.)
Phase C Scan the conditions that trigger each C-type activity and activate any whose
conditions are met. Rescan until no additional C-type activities can begin or events
occur.
The three-phase approach improves the execution efficiency of the activity scanning
method.
In addition, proponents claim that the activity-scanning and three-phase approaches
are particularly good at handling complex resource problems, in which various
combinations of resources are needed to accomplish different tasks.
These approaches guarantee that resources being freed at a given simulated time will
all be freed before any available resources are reallocated to new tasks.
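The three phases can be illustrated with a minimal Python sketch of a single-server queue. The B events here are the arrival and the end of service, and the only C activity is the start of service, which is conditional on a waiting customer and an idle server. The function names, the state dictionary and the sample inter-arrival and service times are illustrative assumptions.

```python
import heapq
from itertools import count


def three_phase_queue(interarrivals, services, t_end):
    """Return the departure times produced by a three-phase simulation."""
    fel, seq = [], count()
    state = {"queue": 0, "busy": False, "departures": []}
    gaps, service_times = iter(interarrivals), iter(services)

    def schedule(time, event):
        heapq.heappush(fel, (time, next(seq), event))

    def arrival(now):                       # B event: bound to occur
        state["queue"] += 1
        gap = next(gaps, None)
        if gap is not None:
            schedule(now + gap, arrival)

    def end_service(now):                   # B event: bound to occur
        state["busy"] = False
        state["departures"].append(now)

    def begin_service(now):                 # C activity: conditional
        if state["queue"] > 0 and not state["busy"]:
            state["queue"] -= 1
            state["busy"] = True
            schedule(now + next(service_times), end_service)
            return True
        return False

    c_activities = [begin_service]
    schedule(0, arrival)
    while fel and fel[0][0] <= t_end:
        # Phase A: advance the clock and remove all events due at that time
        clock = fel[0][0]
        due = []
        while fel and fel[0][0] == clock:
            due.append(heapq.heappop(fel)[2])
        # Phase B: execute the B events that were removed
        for event in due:
            event(clock)
        # Phase C: rescan until no C activity can start
        while any(activity(clock) for activity in c_activities):
            pass
    return state["departures"]


departures = three_phase_queue([4, 5, 2, 8, 3, 7], [5, 3, 4, 6, 2, 7, 1], 30)
print(departures)
```

Because all B events at a given clock time are executed before phase C, a server freed at that time is seen as free by every conditional activity that is scanned.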
Event Scheduling
Basic building block is the event
Model program's code segments consist of event routines waiting to be executed
Event routine associated with each event type -- performs required operations for that
type
Simulation executive moves from event to event executing the corresponding event
routines
Process Interaction
Basic building block is the process
System consists of a set of interacting processes
Model program's code for each process includes the operations that it carries out
throughout its lifetime
Event sequence for system consists of merging of event sequences for all processes
Future event list consists of a sequence of event nodes (or notices)
Each event node indicates the event time and process to which it belongs
Simulation executive carries out the following tasks:
o placement of processes at particular points of time in the list
o removal of processes from the event list
o activation of the process corresponding to the next event node from the event
list
o rescheduling of processes in the event list
Typically, a process object can be in one of several states:
o active -- process is currently executing
There is only one such process in a system.
o ready -- process is in event list, waiting for activation at a certain time
o idle (or blocked) -- process is not in event list, but eligible to be reactivated
by some other entity (e.g., waiting for a passive resource)
o terminated -- process has completed its sequence of actions, is not in event
list, and cannot be reactivated
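The role of the simulation executive in the process-interaction view can be sketched with Python generators standing in for processes. This is a simplified illustration: it covers only the ready, active and terminated states (the idle or blocked state would need resources, which are omitted), and the class name `Executive`, the `job` process and its durations are hypothetical.

```python
import heapq
from itertools import count


class Executive:
    """Keeps the event list and activates one process at a time."""

    def __init__(self):
        self.now = 0
        self._event_list = []
        self._seq = count()

    def schedule(self, process, delay=0):
        # placement of a process at a particular point of time in the list
        heapq.heappush(self._event_list, (self.now + delay, next(self._seq), process))

    def run(self, until):
        while self._event_list and self._event_list[0][0] <= until:
            # removal from the list and activation of the next process
            self.now, _, process = heapq.heappop(self._event_list)
            try:
                delay = next(process)        # process is active until it holds
            except StopIteration:
                continue                     # process has terminated
            self.schedule(process, delay)    # rescheduling: process is ready again


def job(name, durations, sim, log):
    """A process: carries out its operations one after another, then terminates."""
    for duration in durations:
        yield duration                       # hold for the operation time
    log.append((sim.now, name))


sim, log = Executive(), []
sim.schedule(job("A", [3, 4], sim, log))            # starts at time 0
sim.schedule(job("B", [2], sim, log), delay=1)      # starts at time 1
sim.run(until=20)
print(log)
```

Each generator holds the whole life cycle of one process, and the executive merges the event sequences of all processes through the shared event list.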
3.3 Manual Simulation Using Event Scheduling [M-12,17 D-06, 09,10, 11]
1. SINGLE SERVER QUEUE
Consider a grocery store with single checkout counter.
The system consists of those customers in the waiting line plus the one (if any) checking out.
The model has the following components:
System state (LQ(t), LS(t)), where LQ(t) is the number of customers in the waiting line, and
LS(t) is the number being served (0 or 1) at time t .
Entities The server and customers are not explicitly modeled, except in terms of the state
variables above.
Events
Arrival (A)
Departure (D)
Stopping event (SE), scheduled to occur at time TSE.
Event notices
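(A, t), representing an arrival event to occur at future time t
(D, t), representing a customer departure at future time t
(SE, TSE), representing the simulation-stop event at future time TSE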
Activities
Inter-arrival time
Service time
The event notices are written as (event type, event time). In this model, the FEL will always
contain either two or three event notices.
The effect of arrival and departure events is shown in figure 3.2 and 3.3 respectively.
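[Figures 3.2 and 3.3: flowcharts of the execution of the arrival event and of the departure event. Each begins when the event occurs at CLOCK = t and ends with the step "Collect Statistics".]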
Develop the simulation table using event-scheduling approach and analyze the system until
the stopping event occurring at clock time 30. Calculate the server utilization and the
maximum queue length.
Solution:
The system consists of those customers in the waiting line plus the one (if any) checking out.
The model has the following components:
System state (LQ(t), LS(t)), where LQ(t) is the number of customers in the waiting line, and
LS(t) is the number being served (0 or 1) at time t .
Entities The server and customers are not explicitly modeled, except in terms of the state
variables above.
Events
Arrival (A)
Departure (D)
Stopping event (SE), scheduled to occur at time 30.
Event notices
(A, t ), representing an arrival event to occur at future time t
(D, t ), representing a customer departure at future time t
(SE, 30), representing the simulation-stop event at future time 30.
Activities
Inter-arrival time
Service time
The event notices are written as (event type, event time). In this model, the FEL will always
contain either two or three event notices.
Using the distribution of inter-arrival time and service time, we get the clock times for arrival
and departure event.
Customer   Inter-arrival   Clock Time    Service   Departure
           Time            of Arrival    Time      Time
1          --              0             5         5
2          4               4             3         8
3          5               9             4         13
4          2               11            6         19
5          8               19            2         21
6          3               22            7         29
7          7               29            1         30
Using clock time of arrival and departure of each customer, we generate the chronological
ordering of events as shown below:
Event Type Customer Number Clock Time
Arrival 1 0
Arrival 2 4
Departure 1 5
Departure 2 8
Arrival 3 9
Arrival 4 11
Departure 3 13
Departure 4 19
Arrival 5 19
Departure 5 21
Arrival 6 22
Departure 6 29
Arrival 7 29
Departure 7 30
Event notices are written as (event type, event time). For the single server queue, the
future event list (FEL) contains either two or three notices, drawn from (A,t), (D,t), and
(SE,t). Here the departure of the 7th customer occurs at clock time 30, so the stopping event
notice is (SE,30).
The simulation table for single server queue is given below. Two statistics are gathered:
the total busy time (B) of the server, from which the server utilization is computed, and the
maximum queue length (MQL).
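The event-scheduling procedure for this example can be sketched in Python. This is a minimal sketch, not the table from the notes: the function name simulate_single_server and the tie-breaking rule (a departure is processed before an arrival at the same clock time, matching the chronological ordering above) are assumptions made for illustration.

```python
import heapq

def simulate_single_server(arrival_times, service_times, t_stop):
    """Event-scheduling simulation of a single-server FCFS queue.

    Returns (B, MQL): total server busy time and maximum queue length.
    """
    # Tie-break at equal clock times: departure, then arrival, then stop event.
    order = {"D": 0, "A": 1, "SE": 2}
    fel = []  # future event list, kept as a heap of (time, priority, type)
    heapq.heappush(fel, (t_stop, order["SE"], "SE"))
    heapq.heappush(fel, (arrival_times[0], order["A"], "A"))
    next_arrival = 1   # index of the next customer to arrive
    next_service = 0   # index of the next customer to begin service
    lq = ls = 0        # system state (LQ(t), LS(t))
    clock = busy = mql = 0
    while fel:
        t, _, kind = heapq.heappop(fel)
        busy += ls * (t - clock)   # accumulate busy time since the last event
        clock = t
        if kind == "SE":
            break
        if kind == "A":
            # Schedule the next arrival, if any customers remain.
            if next_arrival < len(arrival_times):
                heapq.heappush(fel, (arrival_times[next_arrival], order["A"], "A"))
                next_arrival += 1
            if ls == 0:            # server idle: begin service immediately
                ls = 1
                heapq.heappush(fel, (t + service_times[next_service], order["D"], "D"))
                next_service += 1
            else:                  # server busy: join the waiting line
                lq += 1
                mql = max(mql, lq)
        else:                      # departure
            if lq > 0:             # next customer in line begins service
                lq -= 1
                heapq.heappush(fel, (t + service_times[next_service], order["D"], "D"))
                next_service += 1
            else:
                ls = 0
    return busy, mql

# Arrival and service times taken from the table above.
B, MQL = simulate_single_server([0, 4, 9, 11, 19, 22, 29], [5, 3, 4, 6, 2, 7, 1], 30)
print("B =", B, " utilization =", round(B / 30, 3), " MQL =", MQL)
```

With the data of this example the sketch gives B = 28 and MQL = 1, i.e. a server utilization of 28/30 = 0.93.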
Solution:
The inter-arrival times are uniformly distributed between 1 and 10 minutes, so each of the
ten possible inter-arrival times has the same probability, 0.100.
Using this we can generate the distribution of inter-arrival time and assign random digits to it
as shown in table 3.1 below:
Time Between          Probability   Cumulative    Random-Digit
Arrivals (minutes)                  Probability   Assignment
1                     0.100         0.100         001-100
2                     0.100         0.200         101-200
3                     0.100         0.300         201-300
4                     0.100         0.400         301-400
5                     0.100         0.500         401-500
6                     0.100         0.600         501-600
7                     0.100         0.700         601-700
8                     0.100         0.800         701-800
9                     0.100         0.900         801-900
10                    0.100         1.000         901-000
Table 3.1 Distribution of time between arrivals
Similarly, assign random digits to the service time as shown in table 3.2 below.
Service Time   Probability   Cumulative    Random-Digit
(minutes)                    Probability   Assignment
1              0.04          0.04          01-04
2              0.20          0.24          05-24
3              0.10          0.34          25-34
4              0.26          0.60          35-60
5              0.35          0.95          61-95
6              0.05          1.00          96-00
Table 3.2 Service time distributions
First, initialize the table by entering the details for the first customer. Here, we are assuming
that the first customer is arriving at time 0. Service of first customer begins immediately since
nobody is present in the system and gets over at time 5. So we can say that the customer
spends 5 minutes in the system. After filling the first customer details, subsequent rows in the
table are filled based on the random numbers for inter-arrival time and service time and the
completion of service time of the previous customer.
The simulation table for single server system is given in table 3.3 below.
Column legend: RDA = random digits for arrival, IAT = time between arrivals, AT = arrival time,
RDS = random digits for service, ST = service time, TSB = time service begins,
WQ = time customer waits in queue, TSE = time service ends,
TS = time customer spends in system, IDLE = idle time of server.

Customer   RDA   IAT    AT   RDS   ST     TSB   WQ    TSE   TS     IDLE
1          --    --     0    71    5      0     0     5     5      0
2          853   9      9    59    4      9     0     13    4      4
3          340   4      13   12    2      13    0     15    2      0
4          205   3      16   88    5      16    0     21    5      1
5          99    1      17   97    6      21    4     27    10     0
6          669   7      24   66    5      27    3     32    8      0
7          742   8      32   81    5      32    0     37    5      0
8          301   4      36   35    4      37    1     41    5      0
9          888   9      45   29    3      45    0     48    3      4
10         444   5      50   91    5      50    0     55    5      2
                 Ʃ=50              Ʃ=44         Ʃ=8         Ʃ=52   Ʃ=11
Table 3.3 Simulation Table for Single Server Queue
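The construction of Table 3.3 can be sketched in Python as a check on the arithmetic. This is only an illustrative sketch: the helper names lookup and build_table are hypothetical, and the convention that random digits 000 (or 00) stand for the top of the range is an assumption read off the last rows of Tables 3.1 and 3.2.

```python
from bisect import bisect_left

# Upper ends of the random-digit ranges from Tables 3.1 and 3.2.
IAT_BOUNDS = list(range(100, 1001, 100))   # 001-100, 101-200, ..., 901-000
IAT_VALUES = list(range(1, 11))            # 1..10 minutes
SVC_BOUNDS = [4, 24, 34, 60, 95, 100]      # 01-04, 05-24, ..., 96-00
SVC_VALUES = [1, 2, 3, 4, 5, 6]            # 1..6 minutes

def lookup(digits, bounds, values):
    """Map random digits to a time; digits 0 are read as the top of the range."""
    r = digits if digits != 0 else bounds[-1]
    return values[bisect_left(bounds, r)]

def build_table(arrival_digits, service_digits):
    """Return one dict per customer with the columns of the simulation table."""
    rows, arrival, prev_end = [], 0, 0
    for n, (ra, rs) in enumerate(zip(arrival_digits, service_digits), start=1):
        if ra is not None:                 # first customer arrives at time 0
            arrival += lookup(ra, IAT_BOUNDS, IAT_VALUES)
        service = lookup(rs, SVC_BOUNDS, SVC_VALUES)
        begin = max(arrival, prev_end)     # wait if the server is still busy
        end = begin + service
        rows.append({"customer": n, "arrival": arrival, "service": service,
                     "begin": begin, "wait": begin - arrival, "end": end,
                     "in_system": end - arrival, "idle": begin - prev_end})
        prev_end = end
    return rows

ARRIVAL_DIGITS = [None, 853, 340, 205, 99, 669, 742, 301, 888, 444]
SERVICE_DIGITS = [71, 59, 12, 88, 97, 66, 81, 35, 29, 91]
rows = build_table(ARRIVAL_DIGITS, SERVICE_DIGITS)
for key in ("service", "wait", "in_system", "idle"):
    print(key, sum(r[key] for r in rows))
```

The column sums printed by the sketch can be compared with the Ʃ row of Table 3.3.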
Using clock time of arrival and departure of each customer, we generate the chronological
ordering of events as shown below:
Event Type Customer Number Clock Time
Arrival 1 0
Departure 1 5
Arrival 2 9
Departure 2 13
Arrival 3 13
Departure 3 15
Arrival 4 16
Arrival 5 17
Departure 4 21
Arrival 6 24
Departure 5 27
Departure 6 32
Arrival 7 32
Arrival 8 36
Departure 7 37
Departure 8 41
Arrival 9 45
Departure 9 48
Arrival 10 50
Departure 10 55
The system consists of those customers in the waiting line plus the one (if any) checking out.
The model has the following components:
System state (LQ(t), LS(t)), where LQ(t) is the number of customers in the waiting line, and
LS(t) is the number being served (0 or 1) at time t .
Entities The server and customers are not explicitly modeled, except in terms of the state
variables above.
Events
Arrival (A)
Departure (D)
Stopping event (SE), scheduled to occur at time 55.
Event notices
(A, t ), representing an arrival event to occur at future time t
(D, t ), representing a customer departure at future time t
(SE, 55), representing the simulation-stop event at future time 55.
Activities
Inter-arrival time
Service time
The event notices are written as (event type, event time). In this model, the FEL will always
contain either two or three event notices.
For the single server queue, the future event list (FEL) contains either two or three
notices, drawn from (A,t), (D,t), and (SE,t). Here the departure of the 10th customer occurs
at clock time 55, so the stopping event notice is (SE, 55).
The simulation table for single server queue is given below. Two statistics are gathered:
the total busy time (B) of the server, from which the server utilization is computed, and the
maximum queue length (MQL).
Six dump trucks are used to haul coal from the entrance of a small mine railroad. Figure
below provides a schematic of the dump-truck operation. Each truck is loaded by one of the
two loaders. After loading, the truck immediately moves to the scale, to be weighed as soon
as possible. Both the loaders and the scale have a FCFS waiting line (or queue) for trucks.
Travel time from a loader to a scale is considered negligible. After being weighed, a truck
begins a travel time (during which time the truck unloads) and then afterward returns to the
loader queue.
[Figure: schematic of the dump-truck operation, showing the loader queue, loading, the weighing queue, the scale, and traveling.]
The distributions of loading time, weighing time, and travel time are given in tables 1, 2, and
3 respectively.
Loading Time 10 5 5 10 15 10 10
Weighing Time 12 12 12 16 12 16
Travel Time 60 100 40 40 80
Estimate the loader and scale utilizations (percentage of time busy).
SOLUTION:
The model has the following components:
Event notices
(ALQ, t, DTi), dump truck i arrives at loader queue (ALQ) at time t
(EL, t, DTi), dump truck i ends loading (EL) at time t
(EW, t, DTi), dump truck i ends weighing (EW) at time t
Lists
Loader queue, all trucks waiting to begin loading, ordered on a first-come, first-served basis
Weigh queue, all trucks waiting to be weighed, ordered on a first-come, first-served basis.
When an end-loading (EL) event occurs, say for truck j at time t, other events may be
triggered.
If the scale is idle [W(t)=0], truck j begins weighing and an end-weighing event (EW)
is scheduled on the FEL.
Otherwise, truck j joins the weigh queue.
If at this time there is another truck waiting for a loader, it will be removed from the
loader queue and will begin loading by the scheduling of an end-loading event (EL)
on the FEL.
In order to estimate the loader and scale utilizations, two cumulative statistics are maintained:
BL, the total busy time of both loaders from time 0 to time t, and BS, the total busy time of
the scale from time 0 to time t.

Clock  LQ(t)  L(t)  WQ(t)  W(t)  Loader  Weigh   Future event list    BL   BS
t                                queue   queue
10     0      2     3      1     --      DT3     (EW,12,DT1)          20   10
                                         DT2     (EL,20,DT5)
                                         DT4     (EL,10+15,DT6)
12     0      2     2      1     --      DT2     (EL,20,DT5)          24   12
                                         DT4     (EW,12+12,DT3)
                                                 (EL,25,DT6)
                                                 (ALQ,12+60,DT1)
20     0      1     3      1     --      DT2     (EW,24,DT3)          40   20
                                         DT4     (EL,25,DT6)
                                         DT5     (ALQ,72,DT1)
24     0      1     2      1     --      DT4     (EL,25,DT6)          44   24
                                         DT5     (EW,24+12,DT2)
                                                 (ALQ,72,DT1)
                                                 (ALQ,24+100,DT3)
25     0      0     3      1     --      DT4     (EW,36,DT2)          45   25
                                         DT5     (ALQ,72,DT1)
                                         DT6     (ALQ,124,DT3)
36     0      0     2      1     --      DT5     (EW,36+16,DT4)       45   36
                                         DT6     (ALQ,72,DT1)
                                                 (ALQ,36+40,DT2)
                                                 (ALQ,124,DT3)
52     0      0     1      1     --      DT6     (EW,52+12,DT5)       45   52
                                                 (ALQ,72,DT1)
                                                 (ALQ,76,DT2)
                                                 (ALQ,52+40,DT4)
                                                 (ALQ,124,DT3)
64     0      0     0      1     --      --      (ALQ,72,DT1)         45   64
                                                 (ALQ,76,DT2)
                                                 (EW,64+16,DT6)
                                                 (ALQ,92,DT4)
                                                 (ALQ,124,DT3)
                                                 (ALQ,64+80,DT5)
72     0      1     0      1     --      --      (ALQ,76,DT2)         45   72
                                                 (EW,80,DT6)
                                                 (EL,72+10,DT1)
                                                 (ALQ,92,DT4)
                                                 (ALQ,124,DT3)
                                                 (ALQ,144,DT5)
76     0      2     0      1     --      --      (EW,80,DT6)          49   76
                                                 (EL,82,DT1)
                                                 (EL,76+10,DT2)
                                                 (ALQ,92,DT4)
                                                 (ALQ,124,DT3)
                                                 (ALQ,144,DT5)
Average loader utilization = (BL/2) / 76 = (49/2) / 76 = 0.32
Average scale utilization = BS / 76 = 76 / 76 = 1.00
■■■
UNIT: II
MATHEMATICAL & STATISTICAL
MODELS IN SIMULATION
CHAPTER 4
STATISTICAL MODELS
Let X be the discrete random variable, RX the possible values of X, given by range
space of X and xi the individual outcome in RX.
A number p(xi) = P(X = xi) gives the probability that the random variable equals the
value of xi.
The number p(xi), i=1, 2, 3 … must satisfy two conditions
1. p(xi) ≥ 0, for all the values of i
2. $\sum_{i=1}^{\infty} p(x_i) = 1$
The collection of pairs (xi, p(xi)) i.e. a list of probabilities associated with each of its
possible values is called probability distribution of X.
p(xi) is called probability mass function (pmf) of X.
Example 4.1
Consider the experiment of tossing a single die. Define X as the number of spots on up face
of die after a toss. Assume that the die is loaded so that the probability that a given face lands
up is proportional to the number of spots showing.
a) Give the probability distribution of X.
b) Show that X is a discrete random variable.
Solution:
Let X be the random variable denoting the number of spots on the up face.
RX = {1, 2, 3, 4, 5, 6}
Since p(xi) is proportional to xi, the normalizing constant is N = 1 + 2 + 3 + 4 + 5 + 6 = 21.
a) The discrete probability distribution is given by
xi 1 2 3 4 5 6
p(xi) 1/21 2/21 3/21 4/21 5/21 6/21
b) The conditions also are satisfied, i.e.
1. p(xi) ≥ 0, for i = 1, 2, …, 6
2. $\sum_{i=1}^{6} p(x_i) = 1/21 + 2/21 + \cdots + 6/21 = 1$
Hence, X is a discrete random variable.
The distribution is shown graphically in figure 4.1
For a continuous random variable X, the probability that X lies in the interval [a, b], is
given by
$P(a \le X \le b) = \int_a^b f(x)\,dx$
The function f(x) is called probability density function (pdf) of random variable X.
The pdf must satisfy the following conditions
1. f(x) ≥ 0, for all x in RX
2. $\int_{R_X} f(x)\,dx = 1$
3. f(x) = 0, if x is not in RX
For any specified value x0, P(X = x0) = 0 since $\int_{x_0}^{x_0} f(x)\,dx = 0$
Since P(X = x0) = 0, the following equations also hold:
P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b)
Example 4.2
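The life of a laser-ray device used to inspect cracks in aircraft wings is given by X, a
continuous random variable assuming all values in the range x ≥ 0. The pdf of the lifetime, in years, is
$f(x) = \begin{cases} \frac{1}{2} e^{-x/2}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
Find the probability that the life of the device is between 2 and 3 years.
Solution:
Given
$f(x) = \begin{cases} \frac{1}{2} e^{-x/2}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
$P(2 \le X \le 3) = \frac{1}{2}\int_2^3 e^{-x/2}\,dx = \left[-e^{-x/2}\right]_2^3$
$= -e^{-3/2} + e^{-1}$
= -0.223 + 0.368 = 0.145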
If X is continuous, then
$F(x) = \int_{-\infty}^{x} f(t)\,dt$
Note: All probability questions about X can be answered in terms of cdf. For example,
P (a < X ≤ b) = F (b) – F (a), for all a < b.
Example 4.3
The life of a laser-ray device used to inspect cracks in aircraft wings is given by X, a
continuous random variable assuming all values in the range x ≥ 0. The cdf for the device is given by:
$F(x) = \frac{1}{2}\int_0^x e^{-t/2}\,dt = 1 - e^{-x/2}$
1. Determine the probability that the device will last for less than 2 years.
2. Determine the probability that the life of laser ray device is between 2 and 3 years.
Solution:
The cdf for the device is given by:
$F(x) = \frac{1}{2}\int_0^x e^{-t/2}\,dt = 1 - e^{-x/2}$
1. The probability that the device will last for less than 2 years is,
P(0 ≤ X ≤ 2) = F(2) – F(0) = F(2)
$= 1 - e^{-1}$
= 0.632
2. The probability that the life of the laser ray device is between 2 and 3 years is
$P(2 \le X \le 3) = F(3) - F(2) = (1 - e^{-3/2}) - (1 - e^{-1})$
= -0.223 + 0.368 = 0.145
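If X is discrete, the expected value (mean) of X is
$E(X) = \sum_{\text{all } i} x_i\, p(x_i)$
If X is continuous, then
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Note: The variance of X is $V(X) = E(X^2) - [E(X)]^2$ and the standard deviation is $\sigma = \sqrt{V(X)}$.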
Example 4.4
Consider the experiment of tossing a single die. Define X as the number of spots on up face
of die after a toss. Assume that the die is loaded so that the probability that a given face lands
up is proportional to the number of spots showing.
a) Give the probability distribution of X.
b) Find the mean, variance and standard deviation.
Solution:
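Let X be the random variable denoting the number of spots on the up face.
RX = {1, 2, 3, 4, 5, 6}
Since p(xi) is proportional to xi, the normalizing constant is N = 1 + 2 + 3 + 4 + 5 + 6 = 21.
a) The discrete probability distribution is given by
xi 1 2 3 4 5 6
p(xi) 1/21 2/21 3/21 4/21 5/21 6/21
b) $E(X) = \sum_{i=1}^{6} x_i\,p(x_i) = 1(1/21) + 2(2/21) + \cdots + 6(6/21) = 91/21 = 4.33$
$E(X^2) = \sum_{i=1}^{6} x_i^2\,p(x_i) = 1(1/21) + 4(2/21) + \cdots + 36(6/21) = 441/21 = 21$
$V(X) = E(X^2) - [E(X)]^2 = 21 - (4.33)^2 = 2.22$
$\sigma = \sqrt{V(X)} = \sqrt{2.22} = 1.49$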
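As a cross-check of the mean, variance and standard deviation of the loaded die, the sums can be evaluated exactly with rational arithmetic. This is a small illustrative sketch; the variable names are not from the notes.

```python
from fractions import Fraction
from math import sqrt

faces = range(1, 7)
total = sum(faces)                                   # normalizing constant, 21
pmf = {x: Fraction(x, total) for x in faces}         # p(x) proportional to x

mean = sum(x * p for x, p in pmf.items())            # E(X)
second_moment = sum(x * x * p for x, p in pmf.items())  # E(X^2)
variance = second_moment - mean ** 2                 # V(X) = E(X^2) - [E(X)]^2
std_dev = sqrt(variance)

print(mean, second_moment, variance, round(std_dev, 2))
```

The exact values are E(X) = 13/3 and V(X) = 20/9, which round to the figures used in the solution.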
4.1.4 Mode
For a discrete random variable, the mode is the value that occurs most frequently.
For a continuous random variable, the mode is the value at which the pdf is maximized.
The mode may not be unique; if the modal value occurs at two values of the random
variable, the distribution is said to be bimodal.
1. Queuing system
In queuing examples, inter-arrival and service-time patterns are given.
The times between the arrivals are always probabilistic.
Service times may be constant or probabilistic.
If service times are completely random, exponential distribution is often used for
simulation purposes.
If service time is constant, but some random variability causes fluctuations in positive
or negative way, then normal distribution is used. For example, the time it takes for
lathe to traverse a 10cm shaft should be always same, but the material may have slight
difference in hardness, causing different processing times.
If large service times occur more often than a normal distribution allows, the Weibull
distribution is a better model.
To model inter-arrival and service times, exponential, gamma and Weibull
distributions are used.
The differences between these distributions involve the location of modes of pdfs and
the shapes of their tails for large and small times.
2. Inventory systems
It has three random variables
1. Number of units demanded per order or per time period
2. Time between demands
3. Lead time (time between placing an order and receiving that order)
In simple mathematical model, demand is constant over time and lead time is zero or
constant.
In realistic cases (Simulation models), demand occurs randomly in time and number
of units demanded each time is also random.
In practice, the lead-time distribution can often be fitted fairly well by a gamma
distribution.
The geometric, Poisson, and negative binomial distribution provide a range of
distribution shapes that satisfy a variety of demand patterns.
The geometric distribution has its mode at unity, given that at least one demand has
occurred.
If demand data are characterized by a long tail, Negative Binomial distribution is
appropriate.
The Poisson distribution is often used to model the demand.
4. Limited data
In many situations, simulations begin before data collection is completed.
Three distributions have application to incomplete/ limited data. These are uniform,
triangular, and beta distribution.
Uniform distribution can be used when an inter-arrival or service time is known to be
random but no information is immediately available about the distribution.
Triangular distribution can be used when assumptions are made about minimum,
maximum and modal values of the random variable.
Beta distribution provides a variety of distributional forms on the unit interval, which,
with appropriate modification, can be shifted to any desired interval.
5. Other distributions
Several other distributions may be useful in discrete system simulation.
The Bernoulli and binomial distributions are two discrete distributions which may
describe phenomena of interest.
The hyperexponential distribution is similar to exponential distribution, but its greater
variability may make it useful in certain instances.
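$p_j(x_j) = p(x_j) = \begin{cases} p, & x_j = 1 \\ 1 - p = q, & x_j = 0 \\ 0, & \text{otherwise} \end{cases}$ for j = 1, 2, …, n
Mean of Xj is given by E(Xj) = 0·q + 1·p = p
Variance of Xj is given by
V(Xj) = (0²·q + 1²·p) – p² = p(1 – p) = pq
The probability of a particular outcome with x successes (S) followed by n – x failures (F) is
P (SSS……………..SS FF……………FF) = $p^x q^{n-x}$
where q = 1 – p,
There are $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$ outcomes having the required number of S's and F's, so the binomial pmf is
$p(x) = \begin{cases} \binom{n}{x} p^x q^{n-x}, & x = 0, 1, 2, \ldots, n \\ 0, & \text{otherwise} \end{cases}$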
Mean and variance is computed by considering X as a sum of n independent Bernoulli
random variables each with mean p and variance p (1 - p) = pq.
Then X = X1 + X2 + … + Xn
and the mean E (X) = p + p + … + p = np
Variance V (X) = pq + pq + …+ pq = npq
Example 4.5
A production process manufactures computer chips on the average at 2% non-conforming.
Every day a random sample of size 50 is taken from the process. If the sample contains more
than two non-conforming chips, the process will be stopped. Determine the probability that
the process is stopped by the sampling scheme. Also calculate the mean and variance of the
number of non-conforming chips.
Solution:
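Let the sampling process be n = 50 Bernoulli trials with p = 2% = 0.02 for each trial.
Therefore, q = 1 - p = 0.98
Let X be the random variable denoting the total number of non-conforming chips in the
sample. This X will have a Binomial distribution given by
$p(x) = \begin{cases} \binom{50}{x} (0.02)^x (0.98)^{50-x}, & x = 0, 1, 2, \ldots, 50 \\ 0, & \text{otherwise} \end{cases}$
To determine the probability that more than two non-conforming chips are present in the
sample:
P(X > 2) = 1 – P(X ≤ 2)
$P(X \le 2) = \sum_{x=0}^{2} \binom{50}{x} (0.02)^x (0.98)^{50-x}$
$= (0.98)^{50} + 50(0.02)(0.98)^{49} + 1225(0.02)^2(0.98)^{48}$
= 0.3642 + 0.3716 + 0.1858 = 0.9216
Therefore, P(X > 2) = 1 – 0.9216 = 0.0784, i.e. about 0.08.
Mean is given by
E (X) = np = 50 (0.02) = 1
Variance is given by
V (X) = npq = 50 (0.02) (0.98) = 0.98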
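The binomial computation above can be checked numerically. The sketch below is illustrative only; the helper names binomial_pmf and binomial_cdf are not from the notes.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial(n, p) random variable."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x), summed term by term."""
    return sum(binomial_pmf(i, n, p) for i in range(x + 1))

n, p = 50, 0.02
p_stop = 1 - binomial_cdf(2, n, p)        # process stops if X > 2
mean, variance = n * p, n * p * (1 - p)
print(round(p_stop, 4), mean, round(variance, 2))
```

The same two helpers also reproduce Examples 4.6 and 4.7 by changing n, p and the range of x.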
Example 4.6
A recent survey indicated that 82% of single women aged 25 years old will be married in
their lifetime. Using the binomial distribution, find the probability that two or three women in
a sample of twenty will never be married.
Solution:
Let X be the random variable denoting the number of women in the sample who will never be married.
Therefore, p = 1 – 0.82 = 0.18 and q = 0.82
Given, n = 20
The probability that two or three women in a sample of twenty will never be married is given
by the binomial distribution as:
$P(2 \le X \le 3) = \sum_{x=2}^{3} \binom{20}{x} (0.18)^x (0.82)^{20-x}$
$= \binom{20}{2}(0.18)^2(0.82)^{18} + \binom{20}{3}(0.18)^3(0.82)^{17}$
= 0.173 + 0.228
= 0.401
Example 4.7
The Hawks are currently winning 0.55 of their games. There are 5 games in the next two
weeks. What is the probability that they will win more games than they lose?
Solution:
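Let X be the random variable denoting the number of games won.
Therefore, p = 0.55 and q = 1 – p = 0.45, with n = 5.
The probability that the Hawks will win more games than they lose is given by the binomial
distribution as:
$P(3 \le X \le 5) = \sum_{x=3}^{5} \binom{5}{x} (0.55)^x (0.45)^{5-x}$
$= \binom{5}{3}(0.55)^3(0.45)^2 + \binom{5}{4}(0.55)^4(0.45)^1 + \binom{5}{5}(0.55)^5(0.45)^0$
= 0.337 + 0.206 + 0.050
= 0.593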
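$p(y) = \begin{cases} \binom{y-1}{k-1} q^{y-k} p^{k}, & y = k, k+1, k+2, \ldots \\ 0, & \text{otherwise} \end{cases}$
Because we can think of the negative binomial random variable Y as the sum of k
independent geometric random variables, it is easy to see that
Mean, $E(Y) = k/p$
Variance, $V(Y) = kq/p^2$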
Example 4.8
Forty percent of assembled ink-jet printers are rejected at inspection station. Find the
probability that
(i) the first acceptable printer is third one inspected.
(ii) the second acceptable printer is third one inspected
Solution:
Let X be the random variable denoting the number of printers inspected until an acceptable one is found.
Therefore, q = 40% = 0.4 and p = 1 - q = 1 - 0.4 = 0.6
(i) The probability that the first acceptable printer (k = 1) is third one inspected is given
by geometric distribution $p(x) = q^{x-1} p$ as:
$p(3) = (0.4)^{3-1}(0.6) = 0.096$
(ii) To determine the probability that the second acceptable printer (k = 2) is the third one,
we use negative binomial distribution.
Therefore, $p(3) = \binom{3-1}{2-1}(0.4)^{3-2}(0.6)^{2} = \binom{2}{1}(0.4)(0.6)^2 = 0.288$
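Both parts of Example 4.8 can be verified with a few lines of Python. This is a minimal sketch; the function names are chosen here for illustration.

```python
from math import comb

def geometric_pmf(x, p):
    """P(first success occurs on trial x), x = 1, 2, ..."""
    return (1 - p) ** (x - 1) * p

def negative_binomial_pmf(y, k, p):
    """P(k-th success occurs on trial y), y = k, k+1, ..."""
    return comb(y - 1, k - 1) * (1 - p) ** (y - k) * p ** k

p = 0.6  # probability that a printer is accepted
print(round(geometric_pmf(3, p), 3))             # part (i)
print(round(negative_binomial_pmf(3, 2, p), 3))  # part (ii)
```

With k = 1 the negative binomial pmf reduces to the geometric pmf, which is a quick consistency check on both formulas.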
Solution:
Let X be the random variable denoting the number of calls received until an order is placed.
Then, X is geometric (p = 48% = 0.48) with the probability mass function
$p(x) = q^{x-1} p = (0.52)^{x-1}(0.48)$, x = 1, 2, 3, …
(i) The probability that the first order will come on the fourth sales call of the day is
$p(4) = (0.52)^{3}(0.48) = 0.0675$
(ii) The number of orders, Y, in eight calls is binomial (n = 8, p = 0.48) with the
probability mass function
$p(y) = \binom{8}{y}(0.48)^{y}(0.52)^{8-y}$, y = 0, 1, 2, …, 8
Solution:
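The geometric distribution is memoryless if P(X > s + t | X > s) = P(X > t), where s and t are
non-negative integers and X is a geometrically distributed random variable.
$P(X > s) = \sum_{j=s+1}^{\infty} q^{j-1} p = p\,q^{s} \sum_{j=0}^{\infty} q^{j}$
We have, $\sum_{j=0}^{\infty} q^{j} = \frac{1}{1-q} = \frac{1}{p}$
Therefore, $P(X > s) = q^{s}$
and similarly $P(X > s + t) = q^{s+t}$
$P(X > s + t \mid X > s) = \frac{P(X > s + t)}{P(X > s)} = \frac{q^{s+t}}{q^{s}} = q^{t} = P(X > t)$
Hence, we proved that the geometric distribution is memoryless.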
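$p(x) = \begin{cases} \frac{e^{-\alpha}\alpha^{x}}{x!}, & x = 0, 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$
where α > 0
One of the important properties of the Poisson distribution is that the mean and
variance are both equal to α, i.e.,
E (X) = α = V (X)
The cdf is given by
$F(x) = \sum_{i=0}^{x} \frac{e^{-\alpha}\alpha^{i}}{i!}$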
Most queuing systems characteristics such as arrival and departure processes are
described by a Poisson distribution.
Example 4.11
A computer terminal repairperson is “beeped” each time there is a call for service. The
number of beeps per hour is known to occur in accordance with a Poisson distribution with a
mean of α = 2 per hour.
(i) Determine the probability of three beeps in next hour.
(ii) Determine the probability of two or more beeps in an hour period.
Solution:
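Let X be the random variable denoting the number of beeps.
Given α = 2 per hour
(i) The probability of three beeps in the next hour is
$p(3) = \frac{e^{-2}\,2^{3}}{3!} = 0.18$
(ii) The probability of two or more beeps in an hour period is
$P(X \ge 2) = 1 - p(0) - p(1) = 1 - e^{-2}(1 + 2) = 1 - 0.406 = 0.594$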
Example 4.12
The number of hurricanes hitting the coast of Florida annually has a Poisson distribution with
mean of 0.8.
(i) What is the probability that more than two hurricanes will hit the Florida coast in a
year?
(ii) What is the probability that exactly one hurricane will hit the coast of Florida in a
year?
Solution:
Let X be the random variable denoting the number of hurricanes hitting the coast.
It is Poisson distributed (α = 0.8) with the probability mass function
$p(x) = \frac{e^{-0.8}(0.8)^{x}}{x!}$, x = 0, 1, 2, …
(i) The probability that more than two hurricanes will hit the Florida coast in a year is
given by
$P(X > 2) = 1 - P(X \le 2)$
$= 1 - [p(0) + p(1) + p(2)] = 1 - e^{-0.8}(1 + 0.8 + 0.32) = 1 - 0.9526 = 0.0474$
(ii) The probability that exactly one hurricane will hit the coast of Florida in a year is
given by,
$p(1) = \frac{e^{-0.8}(0.8)^{1}}{1!}$
= 0.3595
Example 4.13
Lane Braintwain is quite a popular student. Lane receives, on the average, four phone calls a
night with Poisson distribution. What is the probability that tomorrow night the number of
calls received will exceed that average by more than one standard deviation?
Solution:
Let X be the random variable denoting the number of calls, which is Poisson distributed with
mean α = 4.
For Poisson distribution E(X) = V(X) = 4
Standard deviation is given by $\sigma = \sqrt{V(X)} = \sqrt{4} = 2$
The probability that tomorrow night the number of calls received will exceed the average (i.e.
E(X) = 4) by more than one standard deviation (i.e. σ = 2) is given by
$P(X > 4 + 2) = P(X > 6) = 1 - P(X \le 6)$
$= 1 - [p(0) + p(1) + p(2) + p(3) + p(4) + p(5) + p(6)]$
= 1 – 0.8893 = 0.1107
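The Poisson probabilities in Examples 4.12 and 4.13 can be recomputed from the pmf directly. The sketch below is illustrative; poisson_pmf and poisson_cdf are helper names introduced here.

```python
from math import exp, factorial, sqrt

def poisson_pmf(x, alpha):
    """P(X = x) for a Poisson random variable with mean alpha."""
    return exp(-alpha) * alpha ** x / factorial(x)

def poisson_cdf(x, alpha):
    """P(X <= x), summed term by term."""
    return sum(poisson_pmf(i, alpha) for i in range(x + 1))

# Example 4.12: hurricanes, alpha = 0.8
p_more_than_two = 1 - poisson_cdf(2, 0.8)
p_exactly_one = poisson_pmf(1, 0.8)

# Example 4.13: calls, alpha = 4, threshold = mean + one standard deviation
alpha = 4
threshold = int(alpha + sqrt(alpha))      # 4 + 2 = 6
p_exceed = 1 - poisson_cdf(threshold, alpha)

print(round(p_more_than_two, 4), round(p_exactly_one, 4), round(p_exceed, 4))
```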
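A random variable X is uniformly distributed on the interval (a, b), if its pdf is given
by,
$f(x) = \begin{cases} \frac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$
The cdf is given by
$F(x) = \begin{cases} 0, & x < a \\ \frac{x-a}{b-a}, & a \le x < b \\ 1, & x \ge b \end{cases}$
Note that $P(x_1 < X < x_2) = F(x_2) - F(x_1) = \frac{x_2 - x_1}{b - a}$
The probability is proportional to the length of interval for all x1 and x2 satisfying
a ≤ x1 < x2 ≤ b.
Mean, $E(X) = \frac{a+b}{2}$
Variance, $V(X) = \frac{(b-a)^2}{12}$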
Uniform distribution plays a vital role in simulation. Random numbers, uniformly
distributed between 0 and 1, provide a means to generate random events.
Example 4.14
A bus arrives every 20 minutes at McDonald's stop beginning at 6:40 A.M. and continues till
8:40 A.M. A certain passenger does not know the schedule, but arrives randomly (uniformly
distributed) between 7:00 A.M. and 7:30 A.M. every morning. What is probability that the
passenger waits more than 5 minutes for a bus?
Solution:
The passenger waits for more than 5 minutes only if his/her arrival time is between 7:00 A.M.
and 7:15 A.M. or between 7:20 A.M. and 7:30 A.M.
Let X be the random variable denoting the number of minutes past 7:00 A.M. that the
passenger arrives.
The probability is given by
P (0 < X < 15) + P (20 < X < 30)
Now, X is uniform random variable on (0, 30) i.e. a = 0 and b = 30.
Therefore the desired probability is
F (15) – F (0) + F (30) – F (20) = 15/30 – 0 + 1 – 20/30 = 5/6
Example 4.15
Ace Heating and Air Conditioning service finds that the amount of time a repairman needs to
fix a furnace is uniformly distributed between 1.5 and 4 hours.
(i) Find the probability that a randomly selected furnace repair requires more than 2
hours.
(ii) Find the probability that a randomly selected furnace repair requires less than 3 hours.
(iii)Find the mean and standard deviation.
Solution:
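Let X be the random variable denoting the time needed to fix a furnace. It is uniformly
distributed between 1.5 and 4 hours, i.e. a = 1.5 and b = 4.
(i) The probability that a randomly selected furnace repair requires more than 2 hours is
given as
$P(X > 2) = 1 - F(2) = 1 - \frac{2 - 1.5}{4 - 1.5}$
= 1 – 0.2 = 0.8
(ii) The probability that a randomly selected furnace repair requires less than 3 hours is
given as
$P(X < 3) = F(3) = \frac{3 - 1.5}{4 - 1.5} = 0.6$
(iii) The mean is
$E(X) = \frac{a + b}{2} = \frac{1.5 + 4}{2} = 2.75$ hours
and the standard deviation is
$\sigma = \sqrt{V(X)} = \sqrt{\frac{(4 - 1.5)^2}{12}} = \sqrt{0.5208} = 0.72$ hours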
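The uniform cdf is simple enough to code directly, which gives a quick check of Examples 4.14 and 4.15. This is a sketch only; uniform_cdf is a helper name introduced here.

```python
from math import sqrt

def uniform_cdf(x, a, b):
    """F(x) for a uniform random variable on (a, b)."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Example 4.14: arrival uniform on (0, 30) minutes past 7:00 A.M.
p_wait = (uniform_cdf(15, 0, 30) - uniform_cdf(0, 0, 30)) \
       + (uniform_cdf(30, 0, 30) - uniform_cdf(20, 0, 30))

# Example 4.15: repair time uniform on (1.5, 4) hours
a, b = 1.5, 4.0
p_more_than_2 = 1 - uniform_cdf(2, a, b)
p_less_than_3 = uniform_cdf(3, a, b)
mean = (a + b) / 2
std_dev = sqrt((b - a) ** 2 / 12)

print(round(p_wait, 4), p_more_than_2, p_less_than_3, mean, round(std_dev, 4))
```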
The exponential distribution has been used to model inter-arrival times when arrivals
are completely random and to model service times which are highly variable. In these
instances, λ is a rate: arrivals per hour or services per minute.
The exponential distribution has also been used to model the lifetime of a component
that fails catastrophically (instantaneously), such as light bulb. Then, λ is the failure
rate.
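Several different exponential pdfs are shown in figure 4.4.
$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
The cdf is given by
$F(x) = \begin{cases} 0, & x < 0 \\ \int_0^x \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}, & x \ge 0 \end{cases}$
Exponential distribution is memoryless i.e. P(X > s + t | X > s) = P(X > t).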
Solution:
Let X be the random variable denoting the life of the lamp, in thousands of hours.
Given, λ = 1/3
1. The probability that the lamp will last longer than its mean life of 3000 hours is given
by
P(X > 3) = 1 – P (X ≤ 3)
= 1 – F (3)
$= 1 - (1 - e^{-3/3})$
= 0.368
2. The probability that the lamp will last between 2000 and 3000 hours is given by
P(2 ≤ X ≤ 3) = F (3) – F (2)
$= (1 - e^{-3/3}) - (1 - e^{-2/3})$
= - 0.368 + 0.513
= 0.145
3. The probability that the lamp will last for another 1000 hours, given that it is
operating after 2500 hours is given by (use memoryless property)
P(X > 2.5 + 1 | X > 2.5) = P(X > 1)
= 1 – P (X ≤ 1)
= 1 – F (1)
$= 1 - (1 - e^{-1/3})$
= 0.7165
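The three parts of this example, including the memoryless step, can be checked numerically. The sketch below computes part 3 both ways: from the conditional-probability definition and from the memoryless shortcut. The helper name exponential_cdf is introduced here for illustration.

```python
from math import exp

def exponential_cdf(x, lam):
    """F(x) = 1 - exp(-lam * x) for x >= 0, and 0 otherwise."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

lam = 1 / 3   # rate per thousand hours (mean life 3 thousand hours)

p_longer_than_mean = 1 - exponential_cdf(3, lam)
p_between_2_and_3 = exponential_cdf(3, lam) - exponential_cdf(2, lam)

# Part 3 from the definition: P(X > 3.5 | X > 2.5) = P(X > 3.5) / P(X > 2.5)
p_conditional = (1 - exponential_cdf(3.5, lam)) / (1 - exponential_cdf(2.5, lam))
# Part 3 from the memoryless property: P(X > 1)
p_memoryless = 1 - exponential_cdf(1, lam)

print(round(p_longer_than_mean, 3), round(p_between_2_and_3, 3),
      round(p_conditional, 4), round(p_memoryless, 4))
```

The last two printed values agree, which illustrates the memoryless property for this λ.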
Example 4.17
A component has an exponential time-to-failure distribution with mean of 10,000 hours.
(i) The component has already been in operation for its mean life. What is the probability
that it will fail by 15,000 hours?
(ii) After 15,000 hours the component is still in operation. What is the probability that it
will operate for another 5000 hours?
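Solution:
Let X be the random variable denoting the lifetime of a component.
Then X is exponentially distributed (λ = 1/10,000 per hour) with cumulative distribution function
$F(x) = 1 - e^{-x/10000}$, x > 0
We use the memoryless property to compute the probabilities.
Given that the component has not failed for s = 10,000 hours or s = 15,000 hours,
the probability that the component lasts at least 5000 more hours is given by
P(X ≥ s + 5000 | X > s) = P(X ≥ 5000)
= 1 – P (X ≤ 5000)
= 1 – F (5000)
$= 1 - (1 - e^{-5000/10000})$
= 0.6065
(i) The probability that the component fails by 15,000 hours, given that it has survived
10,000 hours, is therefore 1 – 0.6065 = 0.3935.
(ii) The probability that the component operates for another 5000 hours, given that it has
survived 15,000 hours, is 0.6065.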
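A random variable X is gamma distributed with parameters β and θ if its pdf is given
by
$f(x) = \begin{cases} \frac{\beta\theta}{\Gamma(\beta)} (\beta\theta x)^{\beta-1} e^{-\beta\theta x}, & x > 0 \\ 0, & \text{otherwise} \end{cases}$
Note: $\Gamma(\beta) = (\beta - 1)!$ when β is an integer.
The parameter β is called the shape parameter and θ is called the scale parameter.
Several different pdfs for gamma distribution with shape parameter = 2 are shown in
figure 4.5.
The cdf is given by
$F(x) = \begin{cases} 1 - \int_x^{\infty} \frac{\beta\theta}{\Gamma(\beta)} (\beta\theta t)^{\beta-1} e^{-\beta\theta t}\,dt, & x > 0 \\ 0, & x \le 0 \end{cases}$
When β is an integer, if X = X1 + X2 + … + Xβ, where each Xj is exponentially distributed
with parameter βθ and the Xj are mutually independent, X has the pdf same as that of gamma
distribution.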
When β = 1, it results in the exponential distribution.
Example 4.18
Lead time is gamma distributed in 100s of units with a shape parameter of 3 and a scale
parameter of 1. What is the probability that the lead time exceeds 2 (hundred) units during an
upcoming cycle?
Solution:
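Let X be the random variable denoting the lead time.
Given shape parameter β = 3 and scale parameter θ = 1, so that βθ = 3 and Γ(3) = 2! = 2.
The probability that the lead time exceeds 2 is given by
P(X > 2) = 1 – P(X ≤ 2)
= 1 – F(2)
$= 1 - \int_0^2 \frac{\beta\theta}{\Gamma(\beta)} (\beta\theta x)^{\beta-1} e^{-\beta\theta x}\,dx$
$= 1 - \int_0^2 \frac{3}{\Gamma(3)} (3x)^{2} e^{-3x}\,dx$
$= \int_2^{\infty} \frac{27}{2} x^{2} e^{-3x}\,dx$
$= \frac{27}{2} \left[ -\frac{x^2}{3} e^{-3x} - \frac{2x}{9} e^{-3x} - \frac{2}{27} e^{-3x} \right]_2^{\infty}$
$= \frac{27}{2}\, e^{-6} \left( \frac{4}{3} + \frac{4}{9} + \frac{2}{27} \right)$
$= 25 e^{-6}$
= 0.062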
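The integration by parts can be double-checked by integrating the gamma pdf numerically. The sketch below uses composite Simpson's rule, which is a choice made here for illustration and not a method from the notes; the parametrization (rate βθ) follows the pdf given above.

```python
from math import exp, gamma

def gamma_pdf(x, beta, theta):
    """Gamma pdf in the (beta, theta) parametrization used in these notes."""
    if x <= 0:
        return 0.0
    rate = beta * theta
    return rate / gamma(beta) * (rate * x) ** (beta - 1) * exp(-rate * x)

def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

# Example 4.18: beta = 3, theta = 1, P(X > 2) = 1 - F(2)
p_exceeds = 1 - simpson(lambda x: gamma_pdf(x, 3, 1), 0.0, 2.0)
print(round(p_exceeds, 4), round(25 * exp(-6), 4))
```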
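$f(x) = \begin{cases} \frac{k\theta}{\Gamma(k)} (k\theta x)^{k-1} e^{-k\theta x}, & x > 0 \\ 0, & \text{otherwise} \end{cases}$ where $\Gamma(k) = (k-1)!$
We know that, the expected value of the sum of random variables is the sum of
expected value of each of the random variable.
Thus,
E(X) = E(X1) + E(X2) + …+ E(Xk)
But, the expected value of each exponentially distributed Xj is $\frac{1}{k\theta}$.
Hence, the expected value for Erlang distribution is
$E(X) = \frac{1}{k\theta} + \frac{1}{k\theta} + \cdots + \frac{1}{k\theta} = \frac{1}{\theta}$
If the random variables Xj are independent, the variance of their sum is sum of the
variances.
$V(X) = \frac{1}{(k\theta)^2} + \frac{1}{(k\theta)^2} + \cdots + \frac{1}{(k\theta)^2} = \frac{1}{k\theta^2}$
The cdf of the Erlang distribution is given as
$F(x) = \begin{cases} 1 - \sum_{i=0}^{k-1} \frac{e^{-k\theta x} (k\theta x)^{i}}{i!}, & x > 0 \\ 0, & x \le 0 \end{cases}$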
Solution:
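Let X be the random variable denoting the duration of the exam.
A medical examination is given in three stages by a physician, i.e. k = 3.
Each stage is exponentially distributed with a mean service time of 20 minutes, i.e.
$\frac{1}{k\theta} = 20$, so that $k\theta = \frac{1}{20}$ per minute and $\theta = \frac{1}{60}$.
The probability that the exam will take 50 minutes or less is given by
$P(X \le 50) = F(50)$
$= 1 - \sum_{i=0}^{2} \frac{e^{-(3)(1/60)(50)} \left[(3)(1/60)(50)\right]^{i}}{i!}$
$= 1 - \sum_{i=0}^{2} \frac{e^{-5/2} (5/2)^{i}}{i!}$
$= 1 - e^{-2.5}(1 + 2.5 + 3.125)$
= 1 – 0.544
= 0.456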
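The Erlang cdf is a finite sum, so it can be evaluated directly. The sketch below follows the cdf formula given above; the function name erlang_cdf is introduced here for illustration.

```python
from math import exp, factorial

def erlang_cdf(x, k, theta):
    """Erlang cdf of order k: 1 - sum_{i=0}^{k-1} e^{-k*theta*x} (k*theta*x)^i / i!"""
    if x <= 0:
        return 0.0
    m = k * theta * x
    return 1 - sum(exp(-m) * m ** i / factorial(i) for i in range(k))

# Medical examination: k = 3 stages, each with mean 20 minutes, so theta = 1/60.
p_within_50 = erlang_cdf(50, 3, 1 / 60)
print(round(p_within_50, 3))

# The gamma lead-time example (beta = 3, theta = 1) is the same sum with k = 3.
print(round(1 - erlang_cdf(2, 3, 1), 3))
```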
Solution:
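Let X be the random variable denoting the time until the third dial-up connection.
X is Erlang distributed with k = 3, and the probability is evaluated at x = 30 seconds.
The probability that the third dial-up connection occurs after 30 seconds is given by
$P(X > 30) = 1 - P(X \le 30)$
$= 1 - F(30)$
$= \sum_{i=0}^{2} \frac{e^{-k\theta (30)} \left[k\theta (30)\right]^{i}}{i!}$
where kθ is the connection rate per second given in the problem statement.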
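A random variable X with mean µ (-∞ < µ < ∞) and variance σ2 > 0 has a normal distribution
if its pdf is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2} \right]$, -∞ < x < ∞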
It is also denoted sometimes as X ~ N(µ, σ2) to indicate that random variable X is
normally distributed with mean µ and variance σ2.
The normal pdf is represented in figure 4.6.
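The cdf of the normal distribution is given by
$F(x) = P(X \le x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{t - \mu}{\sigma} \right)^{2} \right] dt$
This integral cannot be evaluated in closed form, so it has to be evaluated numerically.
A transformation of variables, $z = \frac{t - \mu}{\sigma}$, allows the evaluation to be independent of
µ and σ.
If X ~ N(µ, σ2), let Z = (X - µ) / σ, to obtain
$F(x) = P(X \le x) = P\left( Z \le \frac{x - \mu}{\sigma} \right)$
$= \int_{-\infty}^{(x-\mu)/\sigma} \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}\,dz$
$= \int_{-\infty}^{(x-\mu)/\sigma} \phi(z)\,dz = \Phi\left( \frac{x - \mu}{\sigma} \right)$
The pdf of the standard normal distribution with mean 0 and variance 1 [Z~N(0, 1)] is
given as
$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}$, -∞ < z < ∞
The standard normal distribution is shown in figure 4.7 below.
The cdf of the standard normal distribution is
$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^{2}/2}\,dt$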
The probabilities Ф(z) for Z ≥ 0 are given in Table A.2.
Example 4.21
The time to pass through a queue to begin self-service at a cafeteria has been found to be
N(15, 9). Determine the probability that an arriving customer waits between 14 and 17
minutes.
Solution:
Let X be the random variable denoting the time an arriving customer waits.
Given N (15, 9). It implies that µ = 15, σ2 = 9, i.e. σ = 3.
The probability that an arriving customer waits between 14 and 17 minutes is given by
P (14 ≤ X ≤ 17) = F (17) – F (14)
= Φ [(17 – 15)/3] - Φ [(14 – 15)/3]
= Φ [0.667] - Φ [-0.333]
= Φ [0.667] – [1 - Φ [0.333]]
= 0.7476 – (1 – 0.6304)
= 0.3780
Example 4.22
IQ scores are normally distributed throughout society with a mean of 100 and a standard
deviation of 15.
(i) A person with an IQ of 140 or higher is called a “genius”. What proportion of society
is in the genius category?
(ii) What proportion of society will miss the genius category by 5 or less points?
(iii)An IQ of 110 or better is required to make it through an accredited college or
university. What proportion of society could be eliminated from completing a higher
education based on a low IQ score?
Solution:
Let X be the random variable denoting I.Q. scores.
Given X is normally distributed with mean μ = 100, and standard deviation σ = 15.
(i) The probability that a score is 140 or greater is given by
P(X ≥140) = 1 − Ф [(140 − 100)/15] = 0.00383
(ii) The probability that a score is between 135 and 140 is given by
P(135 ≤ X ≤ 140) = Φ[(140 − 100)/15] − Φ[(135 − 100)/15]
= 0.00598
(iii)The probability that a score is less than 110 is given by
P(X <110) = Φ [(110 − 100)/15] = 0.7475
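Instead of reading Ф(z) from Table A.2, the standard normal cdf can be evaluated with the error function, using the identity Ф(z) = ½[1 + erf(z/√2)]. The sketch below applies it to Examples 4.21 and 4.22; the helper names phi and normal_cdf are introduced here.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(x, mu, sigma):
    """F(x) for X ~ N(mu, sigma^2), using the transformation z = (x - mu) / sigma."""
    return phi((x - mu) / sigma)

# Example 4.21: N(15, 9), so sigma = 3
p_wait = normal_cdf(17, 15, 3) - normal_cdf(14, 15, 3)

# Example 4.22: IQ scores with mu = 100, sigma = 15
p_genius = 1 - normal_cdf(140, 100, 15)
p_near_miss = normal_cdf(140, 100, 15) - normal_cdf(135, 100, 15)
p_below_110 = normal_cdf(110, 100, 15)

print(round(p_wait, 4), round(p_genius, 5), round(p_near_miss, 5), round(p_below_110, 4))
```

Small differences in the last digit relative to the worked solutions come from rounding z before the table lookup.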
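A random variable X has a Weibull distribution if its pdf is given by
$f(x) = \begin{cases} \frac{\beta}{\alpha} \left( \frac{x - \upsilon}{\alpha} \right)^{\beta - 1} \exp\left[ -\left( \frac{x - \upsilon}{\alpha} \right)^{\beta} \right], & x \ge \upsilon \\ 0, & \text{otherwise} \end{cases}$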
The three parameters of the Weibull distribution are
υ → location parameter (-∞ < υ < ∞)
α → Scale parameter (α > 0)
β → Shape parameter (β > 0)
When υ = 0, pdf becomes
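$f(x) = \begin{cases} \frac{\beta}{\alpha} \left( \frac{x}{\alpha} \right)^{\beta - 1} \exp\left[ -\left( \frac{x}{\alpha} \right)^{\beta} \right], & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
Mean is given as $E(X) = \upsilon + \alpha\,\Gamma\left( \frac{1}{\beta} + 1 \right)$
Variance is given as $V(X) = \alpha^{2} \left[ \Gamma\left( \frac{2}{\beta} + 1 \right) - \left[ \Gamma\left( \frac{1}{\beta} + 1 \right) \right]^{2} \right]$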
The location parameter υ has no effect on the variance; however, the mean is
increased or decreased by υ.
The cdf of the Weibull distribution is given by
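$F(x) = \begin{cases} 0, & x < \upsilon \\ 1 - \exp\left[ -\left( \frac{x - \upsilon}{\alpha} \right)^{\beta} \right], & x \ge \upsilon \end{cases}$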
Example 4.23
The time it takes for an aircraft to land and clear the runway at a major airport has a weibull
distribution with υ = 1.34 minutes, β = 0.5 and α = 0.04 minutes. Determine the probability
that an incoming airplane will take more than 1.5 minutes to land and clear the runway.
Solution:
Let X be the random variable denoting the time taken by an aircraft to land and clear the runway.
Given υ = 1.34 minutes, β = 0.5, and α = 0.04 minutes
The probability that an incoming airplane will take more than 1.5 minutes is given by
P (X > 1.5) = 1 - P(X ≤ 1.5)
= 1 - F(1.5)
= 1 - {1 - exp [- {(1.5 – 1.34)/ (0.04)}^0.5]}
= 1 - (1 - e^-2)
= 1 - 0.865
= 0.135
Example 4.24
The time to failure of an electronic subassembly can be modeled by a Weibull distribution
whose location parameter is zero, β = 0.5 and α = 1000 hours.
(i) What is the mean time to failure?
(ii) What fraction of these subassemblies will fail by 3000 hours?
Solution:
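Let X be the random variable denoting the time to failure of an electronic subassembly.
Given υ = 0, β = 0.5, and α = 1000 hours
(i) The mean time to failure is given by
$E(X) = \alpha\,\Gamma\left(\frac{1}{\beta} + 1\right) = 1000\,\Gamma\left(\frac{1}{0.5} + 1\right) = 1000\,\Gamma(3) = 2000$ hours
(ii) The probability that these subassemblies will fail by 3000 hours is given by
$P(X \le 3000) = F(3000) = 1 - \exp\left[-\left(\frac{3000}{1000}\right)^{0.5}\right]$
= 1 – e^-1.732
= 0.823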
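The Weibull computations in Examples 4.23 and 4.24 follow directly from the cdf and the mean formula. A minimal sketch, assuming the helper names `weibull_cdf` and `weibull_mean` (they are illustrative, not from the text):

```python
import math

def weibull_cdf(x: float, nu: float, alpha: float, beta: float) -> float:
    """F(x) for a Weibull with location nu, scale alpha, shape beta."""
    if x < nu:
        return 0.0
    return 1.0 - math.exp(-(((x - nu) / alpha) ** beta))

def weibull_mean(nu: float, alpha: float, beta: float) -> float:
    """E(X) = nu + alpha * Gamma(1/beta + 1)."""
    return nu + alpha * math.gamma(1.0 / beta + 1.0)

# Example 4.23: probability that landing takes more than 1.5 minutes
p_more_than_1_5 = 1.0 - weibull_cdf(1.5, nu=1.34, alpha=0.04, beta=0.5)

# Example 4.24: mean time to failure and fraction failing by 3000 hours
mttf = weibull_mean(nu=0.0, alpha=1000.0, beta=0.5)
fraction_failed = weibull_cdf(3000.0, nu=0.0, alpha=1000.0, beta=0.5)
print(round(p_more_than_1_5, 3), mttf, round(fraction_failed, 3))
```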
4.4.7 Triangular distribution
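A random variable X has a triangular distribution if its pdf is given by
$f(x) = \frac{2(x-a)}{(b-a)(c-a)}$ for $a \le x \le b$
$f(x) = \frac{2(c-x)}{(c-b)(c-a)}$ for $b < x \le c$
$f(x) = 0$ elsewhere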
where a ≤ b ≤ c.
The mode occurs at x = b.
A triangular pdf and the representation of height is shown in figure 4.9
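Mean = E(X) = (a + b + c) / 3
Variance = V(X) = (a² + b² + c² – ab – ac – bc) / 18
Mode = b = 3 E(X) – (a + c)
Since a ≤ b ≤ c, it follows that
$\frac{2a + c}{3} \le E(X) \le \frac{a + 2c}{3}$
The cdf for the triangular distribution is
$F(x) = 0$ for $x \le a$
$F(x) = \frac{(x-a)^2}{(b-a)(c-a)}$ for $a < x \le b$
$F(x) = 1 - \frac{(c-x)^2}{(c-b)(c-a)}$ for $b < x \le c$
$F(x) = 1$ for $x > c$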
Example 4.25
The central processing requirements for a program that will execute have a triangular
distribution with a = 0.05 second, b = 1.1 seconds and c = 6.5 seconds. Determine the
probability that the CPU requirement for a random program is 2.5 seconds or less.
Solution:
Let X be the random variable denoting the CPU requirement for a program.
X is triangularly distributed with a = 0.05 second, b = 1.1 seconds, and c = 6.5 seconds.
The probability that the CPU requirement for a random program is 2.5 seconds or less is given
by
P (X ≤ 2.5) = F (2.5)
Since 1.1 < 2.5 ≤ 6.5 (b < x ≤ c), the value of F(2.5) comes from the portion of the cdf in
the interval (0.05, 1.1) plus that portion in the interval (1.1, 2.5).
Therefore $F(2.5) = 1 - \frac{(6.5 - 2.5)^2}{(6.5 - 0.05)(6.5 - 1.1)} = 1 - \frac{16}{34.83} = 0.541$
Thus, the probability is 0.541 that the CPU requirement is 2.5 seconds or less.
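The piecewise cdf of the triangular distribution is easy to get wrong by hand, so a short check of Example 4.25 may help. The function name `triangular_cdf` is an illustrative choice.

```python
def triangular_cdf(x: float, a: float, b: float, c: float) -> float:
    """cdf of the triangular distribution with minimum a, mode b, maximum c."""
    if x <= a:
        return 0.0
    if x <= b:
        return (x - a) ** 2 / ((b - a) * (c - a))
    if x <= c:
        return 1.0 - (c - x) ** 2 / ((c - b) * (c - a))
    return 1.0

a, b, c = 0.05, 1.1, 6.5           # CPU requirement of Example 4.25
p_cpu = triangular_cdf(2.5, a, b, c)
mean = (a + b + c) / 3
variance = (a * a + b * b + c * c - a * b - a * c - b * c) / 18
print(round(p_cpu, 3), round(mean, 3), round(variance, 3))
```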
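Solution:
A random variable, X, has a triangular distribution with pdf
$f(x) = \frac{2(x-a)}{(b-a)(c-a)}$ for $a \le x \le b$
$f(x) = \frac{2(c-x)}{(c-b)(c-a)}$ for $b < x \le c$
$f(x) = 0$ elsewhere
The variance is
$V(X) = E(X^2) - [E(X)]^2$
E(X) = (a + b + c)/3
$E(X^2) = \frac{2}{(b-a)(c-a)}\int_a^b x^2 (x-a)\,dx + \frac{2}{(c-b)(c-a)}\int_b^c x^2 (c-x)\,dx = \frac{a^2 + b^2 + c^2 + ab + ac + bc}{6}$
$V(X) = \frac{a^2 + b^2 + c^2 + ab + ac + bc}{6} - \frac{(a + b + c)^2}{9} = \frac{a^2 + b^2 + c^2 - ab - ac - bc}{18}$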
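where σ² > 0
If the mean $\mu_L$ and the variance $\sigma_L^2$ of the lognormal random variable are known, the parameters µ and σ² are given by
$\mu = \ln\left(\frac{\mu_L^2}{\sqrt{\mu_L^2 + \sigma_L^2}}\right)$ and $\sigma^2 = \ln\left(\frac{\mu_L^2 + \sigma_L^2}{\mu_L^2}\right)$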
Example 4.27
The rate of return on a volatile investment is modeled as having a lognormal distribution with
mean 20% and standard deviation 5%. Compute the parameters for the lognormal
distribution.
Solution:
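Given the mean $\mu_L = 20$ and the standard deviation $\sigma_L = 5$ (i.e., $\sigma_L^2 = 25$), the parameters are
$\mu = \ln\left(\frac{20^2}{\sqrt{20^2 + 5^2}}\right) = 2.9654$
$\sigma^2 = \ln\left(\frac{20^2 + 5^2}{20^2}\right) = 0.06$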
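The parameter conversion of Example 4.27 can be sketched as follows. The function name `lognormal_params` is illustrative, and the conversion formulas used are the standard ones, µ = ln(µL²/√(µL² + σL²)) and σ² = ln((µL² + σL²)/µL²), where µL and σL² are the mean and variance of the lognormal variable itself.

```python
import math

def lognormal_params(mean_l: float, var_l: float) -> tuple[float, float]:
    """Return (mu, sigma^2) of the underlying normal for a lognormal
    random variable with mean mean_l and variance var_l."""
    mu = math.log(mean_l ** 2 / math.sqrt(mean_l ** 2 + var_l))
    sigma2 = math.log((mean_l ** 2 + var_l) / mean_l ** 2)
    return mu, sigma2

mu, sigma2 = lognormal_params(mean_l=20.0, var_l=5.0 ** 2)
# Round trip: the lognormal mean is exp(mu + sigma^2 / 2).
mean_back = math.exp(mu + sigma2 / 2.0)
print(round(mu, 4), round(sigma2, 4), round(mean_back, 4))
```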
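A random variable X is beta-distributed with parameters β1 > 0 and β2 > 0 if its pdf is given by
$f(x) = \frac{x^{\beta_1 - 1}(1 - x)^{\beta_2 - 1}}{B(\beta_1, \beta_2)}$ for $0 < x < 1$, and $f(x) = 0$ otherwise
where $B(\beta_1, \beta_2) = \frac{\Gamma(\beta_1)\Gamma(\beta_2)}{\Gamma(\beta_1 + \beta_2)}$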
The cdf of the beta distribution does not have a closed form in general.
The beta distribution is very flexible and has a finite range from 0 to 1, as shown in
figure 4.11.
In practice, we often need a beta distribution defined on a different range, say (a,b),
with a < b, rather than (0,1).
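This is easily accomplished by defining a new random variable
$Y = a + (b - a)X$
The mean and variance of Y are given by
$E(Y) = a + (b - a)\left(\frac{\beta_1}{\beta_1 + \beta_2}\right)$
$V(Y) = (b - a)^2\left(\frac{\beta_1\beta_2}{(\beta_1 + \beta_2)^2(\beta_1 + \beta_2 + 1)}\right)$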
The counting process, {N(t), t ≥ 0}is said to be a Poisson process with mean rate λ, if
it satisfies the following assumptions.
E[N(t)] = α = λt = V[N(t)]
For any times s and t, such that s < t, the assumption - stationary increments implies
random variable N(t) –N(s) representing number of arrivals in interval s to t is also
Poisson distributed with mean λ (t - s).
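Thus,
$P[N(t) - N(s) = n] = \frac{e^{-\lambda(t-s)}[\lambda(t-s)]^n}{n!}$, n = 0, 1, 2, …
$E[N(t) - N(s)] = \lambda(t - s) = V[N(t) - N(s)]$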
Now, consider the times at which arrivals occur in a Poisson process. Let the first
arrival occur at time A1, the second at time A1+A2 and so on. Thus A1, A2, … are
successive inter-arrival times. This is depicted in figure 4.12.
If the first arrival occurs after time t, there are no arrivals in the interval [0, t], so
{A1 > t} = {N(t) = 0}
Therefore P(A1 > t) = P[N(t) = 0] = e^(-λt)
The probability that the first arrival will occur in [0, t] is given by P(A1 ≤ t) = 1 - e^(-λt)
which is the cdf of the exponential distribution with parameter λ.
Hence A1 is exponentially distributed with mean 1/λ, and all the inter-arrival times
A1, A2, … are independent and exponentially distributed with mean 1/λ.
1. Random splitting
Let N1(t) be random variable denoting number of type I event, N2(t) for type II event.
N(t) = N1(t)+ N2(t).
N1(t)and N2(t) are both Poisson processes having rates λp and λ(1-p) as shown in
figure 4.13.
2. Pooled process
It is the process of pooling two arrival streams.
If Ni(t) are random variables representing independent Poisson process with rates λi
then N(t) = N1(t)+ N2(t) is a poisson process with rate λ1+ λ2, as shown in figure 4.14.
Solution:
Let X be the random variable denoting the number of hours until the next crash occurs.
Given, λ=1/36
The probability that the next crash will occur between 24 and 48 hours is given by
P(24 ≤ X ≤ 48) = F (48) – F (24)
= (1 – e^(-48/36)) - (1 – e^(-24/36))
= e^(-24/36) – e^(-48/36)
= 0.513 – 0.264 = 0.249
■■■
CHAPTER 5
QUEUEING MODELS
– When N and K are infinite, they are often dropped from the notation
• G/G/1/5/5 represents a queueing system with general (or arbitrary) inter-
arrival and service distribution with single server, with a queue capacity of 5
and finite population model of size 5
– General models are used to solve the queue system with no particular
distribution in mind
– Very useful as the final results can be obtained by plugging in the
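values of specific distributions
$\hat{L} = \frac{1}{T}\sum_{i=0}^{\infty} i\,T_i = \frac{1}{T}\int_0^T L(t)\,dt$
If LQ(t) denotes the number of customers waiting in line, and $T_i^Q$ denotes the total
time during [0,T] in which exactly i customers are waiting in line, then the time-
weighted-average number in queue is:
$\hat{L}_Q = \frac{1}{T}\sum_{i=0}^{\infty} i\,T_i^Q = \frac{1}{T}\int_0^T L_Q(t)\,dt$
The average time spent in system per customer, called the average system time, is:
$\hat{w} = \frac{1}{N}\sum_{i=1}^{N} W_i$
The average time spent in queue per customer is:
$\hat{w}_Q = \frac{1}{N}\sum_{i=1}^{N} W_i^Q$
where $\hat{w}_Q$ is the observed average time spent in queue (called delay), and wQ is the
long-run average delay per customer.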
Referring to the figure 5.4, N=5 customers are in the system over a period (0, 20). For
getting W1, W2, …, W5 we require more information. Here we assume that the system
has single server and a FIFO queue discipline. In figure 5.4, each jump in upward
direction of L(t) represents the arrival event whereas the jump in downward direction
represent the departure event. Hence arrivals occur at 0, 3, 5, 7, 16 where as the
departures occur at 2, 8, 10, 14, 20.
Now, W1 = 2 – 0 = 2, W2 = 8 – 3 = 5, W3 = 10 – 5 = 5, W4 = 14 – 7 = 7, W5 = 20 – 16
=4
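The average system time is:
$\hat{w} = \frac{W_1 + W_2 + \dots + W_5}{5} = \frac{2 + (8 - 3) + \dots + (20 - 16)}{5} = \frac{23}{5} = 4.6$ time units
Similarly, the average time spent by each customer in the waiting line is:
$\hat{w}_Q = \frac{W_1^Q + W_2^Q + \dots + W_5^Q}{5} = \frac{0 + 0 + 3 + 3 + 0}{5} = 1.2$ time units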
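The averages above can be recomputed from the arrival and departure times read off figure 5.4. The sketch below assumes, as the text does, a single server with FIFO discipline; the function name `fifo_single_server_stats` and the dictionary keys are illustrative.

```python
def fifo_single_server_stats(arrivals, departures, horizon):
    """Observed measures for a single-server FIFO queue on [0, horizon]."""
    system_times, delays = [], []
    busy_time = 0.0
    previous_departure = 0.0
    for arrival, departure in zip(arrivals, departures):
        # Service starts when the customer has arrived and the server is free.
        service_start = max(arrival, previous_departure)
        system_times.append(departure - arrival)
        delays.append(service_start - arrival)
        busy_time += departure - service_start
        previous_departure = departure
    n = len(arrivals)
    return {
        "w_hat": sum(system_times) / n,       # average system time
        "wq_hat": sum(delays) / n,            # average delay in queue
        "lambda_hat": n / horizon,            # observed arrival rate
        "L_hat": sum(system_times) / horizon, # time-average number in system
        "rho_hat": busy_time / horizon,       # observed server utilization
    }

stats = fifo_single_server_stats(
    arrivals=[0, 3, 5, 7, 16], departures=[2, 8, 10, 14, 20], horizon=20
)
print(stats)
```

The values also illustrate the conservation equation: `L_hat` equals `lambda_hat * w_hat` because every customer has departed by time 20.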
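Derivation:
Total system time for all customers is given by the total area under the number in system
function L(t).
$\sum_{i=1}^{N} W_i = \int_0^T L(t)\,dt$
Therefore, with $\hat{\lambda} = N/T$,
$\hat{L} = \frac{1}{T}\int_0^T L(t)\,dt = \frac{N}{T}\cdot\frac{1}{N}\sum_{i=1}^{N} W_i = \hat{\lambda}\,\hat{w}$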
4. Server Utilization
Server utilization is the proportion of time that a server is busy.
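Observed server utilization, $\hat{\rho}$, is defined over a specified time interval [0,T].
Long-run server utilization is ρ.
For systems with long-run stability: $\hat{\rho} \to \rho$ as $T \to \infty$
For the system of figure 5.4, the server utilization is
$\hat{\rho} = \frac{\text{total busy time}}{T} = \frac{T - T_0}{T} = \frac{20 - 3}{20} = 0.85$
where T0 is the total idle time (here the server is idle during (2, 3) and (14, 16)).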
Any single-server queueing system with average arrival rate λ customers per time
unit, where average service time E(S) = 1/µ time units or µ is the average service rate,
infinite queue capacity and calling population.
The server alone can be considered as a queueing system in itself, so conservation
equation, L = λw, can be applied.
For a stable system, the average arrival rate to the server, λs, must be identical to λ.
The average number of customers in the server is:
$\hat{L}_s = \frac{1}{T}\int_0^T (L(t) - L_Q(t))\,dt = \frac{T - T_0}{T} = \hat{\rho}$
For a single-server case, the average number of customers being served at any
arbitrary point in time is equal to server utilization.
In general, for a single-server queue:
$\hat{L}_s = \hat{\rho} \to L_s = \rho$ as $T \to \infty$, and $\rho = \lambda E(S) = \lambda/\mu$
For a single-server stable queue, the arrival rate λ must be less than the service rate µ;
(λ < µ) or ρ = λ/µ < 1
For an unstable queue (λ > µ)
The server is always busy
Waiting line tends to grow in length at an average rate of (λ - µ) customers
per time unit and the long run average queue length is infinite
Long-run server utilization is 1.
Example 5.1
Customers arrive at random to the passport center at a rate of 40 customers per hour.
Currently, there are 20 clerks, each serving 4 customers per hour on the average. Estimate the
average utilization of a server and the average number of busy servers. Can we decrease the
number of servers?
Solution:
Given that
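Arrival rate, λ = 40 customers per hour,
Service rate, µ = 4 customers per hour
Number of servers, c = 20 i.e. it is a multi-server system.
The long-run or steady state average utilization of a server is
ρ = λ/(cµ) = 40/(20 × 4) = 0.5
The average number of busy servers is:
Ls = λ/µ = 40/4 = 10
Since a stable system requires c > λ/µ = 10, the number of servers can be decreased, but at
least 11 clerks are needed.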
Example 5.2
A physician schedules patients every 10 minutes and spends Si minutes with the ith patient
which is given by:
Si = 9 minutes with probability 0.9, and Si = 12 minutes with probability 0.1
Find the steady state average utilization of the server.
Solution:
Given that patients are arriving after every 10 minutes; hence the arrivals are deterministic.
Therefore, arrival rate λ = (1/10) per minute.
The services are probabilistic.
Therefore, the mean service time = E(S) = 9(0.9) + 12(0.1) = 9.3 minutes
Hence, the steady state utilization of the server is ρ = λ E(S) = (1/10)(9.3) = 0.93
Since L is also one of the long-run performance measures of the system, the
other steady-state parameters can be computed by using Little's equation.
Long-run average time spent in system/customer: $w = \frac{1}{\mu} + \frac{\lambda(1/\mu^2 + \sigma^2)}{2(1 - \rho)}$
Long-run average time spent in queue/customer: $w_Q = \frac{\lambda(1/\mu^2 + \sigma^2)}{2(1 - \rho)}$
Example 5. 3
Widget-making machines malfunction apparently at random and then require a mechanic's
attention. It is assumed that malfunctions occur according to a Poisson process, at the rate of
1.5 per hour. Observations over several months have found that repairs by a single
mechanic take an average time of 30 minutes, with a standard deviation of 20 minutes.
Compute:
a) The utilization of mechanic
b) The time-average number of machines in the system
c) The average time an arrival spends in the system
d) The average time machine spends in the queue
e) The time-average number in the queue
f) The probability of zero machines with the mechanic.
Solution:
Given, λ=1.5 per hour
The mean service time = 1/µ = 30/60 = ½ hour
Therefore, µ = 2 per hour
σ² = (20)² minutes² = 1/9 hour²
$L = \rho + \frac{\lambda^2(1/\mu^2 + \sigma^2)}{2(1 - \rho)} = \rho + \frac{\rho^2(1 + \sigma^2\mu^2)}{2(1 - \rho)} = 0.75 + \frac{(0.75)^2(1 + 4/9)}{2(1 - 0.75)} = 2.375$ machines
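All six quantities asked for in Example 5.3 follow from ρ, the formula for L, and Little's equation. The sketch below assumes the standard M/G/1 relations (w = L/λ, wQ = w − 1/µ, LQ = λ·wQ, P0 = 1 − ρ); the function name `mg1_measures` is illustrative.

```python
def mg1_measures(lam: float, mu: float, sigma2: float) -> dict:
    """Steady-state measures of an M/G/1 queue.

    lam: arrival rate, mu: service rate, sigma2: variance of the service time.
    """
    rho = lam / mu
    if rho >= 1.0:
        raise ValueError("the queue is unstable (rho >= 1)")
    L = rho + rho ** 2 * (1.0 + sigma2 * mu ** 2) / (2.0 * (1.0 - rho))
    w = L / lam            # Little's equation
    wq = w - 1.0 / mu      # time in queue = time in system - mean service time
    Lq = lam * wq          # Little's equation applied to the queue
    return {"rho": rho, "L": L, "w": w, "wQ": wq, "LQ": Lq, "P0": 1.0 - rho}

# Example 5.3: lam = 1.5 per hour, mu = 2 per hour, sigma = 20 minutes = 1/3 hour
result = mg1_measures(lam=1.5, mu=2.0, sigma2=(1.0 / 3.0) ** 2)
print({name: round(value, 3) for name, value in result.items()})
```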
Solution:
Given: λ = 2 per hour, µ = 3 per hour
( ) ( )
( ) ( )
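f) The probability of zero, one, two, three, and four or more customers in the shop.
$P_n = (1 - \rho)\rho^n$ with $\rho = \lambda/\mu = 2/3$
$P_0 = 1 - \frac{2}{3} = \frac{1}{3}$
$P_1 = \left(\frac{1}{3}\right)\left(\frac{2}{3}\right) = \frac{2}{9}$
$P_2 = \left(\frac{1}{3}\right)\left(\frac{2}{3}\right)^2 = \frac{4}{27}$
$P_3 = \left(\frac{1}{3}\right)\left(\frac{2}{3}\right)^3 = \frac{8}{81}$
$P_{\ge 4} = 1 - \sum_{n=0}^{3} P_n = \frac{16}{81}$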
Solution:
The CNG station facility is modeled by M/M/2 queue with:
λ= 15/ hour and
Mean service time =1/ µ = 5 minutes = (5/60) hours = (1/12) hours.
Therefore, service rate µ = 12 / hour
Number of servers = c = 2
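The steady state parameters of M/M/2 system are computed as follows:
Server utilization: ρ = λ/(cµ) = 15/(2 × 12) = 0.625
Probability of an empty system:
$P_0 = \left\{\left[\sum_{n=0}^{c-1}\frac{(\lambda/\mu)^n}{n!}\right] + \left(\frac{\lambda}{\mu}\right)^c\left(\frac{1}{c!}\right)\left(\frac{c\mu}{c\mu - \lambda}\right)\right\}^{-1} = \left\{1 + \frac{15}{12} + \left(\frac{15}{12}\right)^2\left(\frac{1}{2}\right)\left(\frac{24}{9}\right)\right\}^{-1} = \frac{3}{13} = 0.231$
Probability that all servers are busy, i.e. all CNG filling machines are busy:
$P(L(\infty) \ge c) = \frac{(c\rho)^c P_0}{c!\,(1 - \rho)} = \frac{(1.25)^2(0.231)}{2(1 - 0.625)} = 0.481$
Long-run time average number of customers in the system, i.e. average number of taxis:
$L = c\rho + \frac{\rho\,P(L(\infty) \ge c)}{1 - \rho} = 1.25 + \frac{(0.625)(0.481)}{1 - 0.625} = 2.05$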
Solution:
The tool crib is modeled by an M/M/c queue (λ = 1/4, μ = 1/3, c = 1 or 2).
Given that attendants are paid ₹ 10 per hour and the mechanics are paid ₹ 15 per hour.
Mean cost per hour = ₹10c + ₹15L
assuming that mechanics impose cost on the system while in the queue and in service.
It would be advisable to have a second attendant because long run costs are reduced by ₹
21.91 per hour.
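The cost comparison for the tool crib can be reproduced with the standard M/M/c formulas for P0, P(L(∞) ≥ c) and L. This is a sketch under that assumption; `mmc_number_in_system` and `hourly_cost` are illustrative names, and only the ratio λ/µ matters, so the rates are used exactly as given.

```python
import math

def mmc_number_in_system(lam: float, mu: float, c: int) -> float:
    """Long-run average number in system, L, for an M/M/c queue."""
    offered_load = lam / mu            # equals c * rho
    rho = offered_load / c
    if rho >= 1.0:
        raise ValueError("the queue is unstable (rho >= 1)")
    partial_sum = sum(offered_load ** n / math.factorial(n) for n in range(c))
    tail_term = offered_load ** c / (math.factorial(c) * (1.0 - rho))
    p0 = 1.0 / (partial_sum + tail_term)
    p_all_busy = tail_term * p0        # P(L(inf) >= c)
    return offered_load + rho * p_all_busy / (1.0 - rho)

def hourly_cost(c: int) -> float:
    """Mean cost per hour = 10c + 15L for the tool crib."""
    return 10.0 * c + 15.0 * mmc_number_in_system(lam=1 / 4, mu=1 / 3, c=c)

cost_one, cost_two = hourly_cost(1), hourly_cost(2)
savings = cost_one - cost_two
print(round(cost_one, 2), round(cost_two, 2), round(savings, 2))
```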
Many systems are naturally modeled as networks of single queues in which customers
departing one queue may be routed to another.
The following results assume a stable system with infinite calling population and no
limit on system capacity:
1. If there are no customers created or destroyed in the queue, then the departure
rate out of the queue is the same as the arrival rate into the queue, over the
long run.
2. If customers arrive to queue m at rate λm, and a fraction 0 ≤ Pmn ≤ 1 of them
are routed to queue n upon departure, then the arrival rate from queue m to
queue n is λmPmn over the long run.
3. The overall arrival rate into queue n is the sum of the arrival rate from all
sources.
4. If queue n has cn < ∞ parallel servers with service rate µn, then the long run
utilization of each server is ρn = λn/(cn µn), and ρn < 1 is the condition for a stable
queue.
5. If, for each queue n, arrivals from outside the network form a Poisson process with
rate an and there are cn identical servers having exponentially distributed
service times with mean 1/µn, then in steady state queue n behaves like an
M/M/cn queue with arrival rate
$\lambda_n = a_n + \sum_{\text{all } m} \lambda_m p_{mn}$
■■■
UNIT: III
RANDOM NUMBERS
CHAPTER 6
RANDOM NUMBER GENERATION
6.1 Introduction
A random number is a number generated by a process whose outcome is
unpredictable, and which cannot subsequently be reliably reproduced.
Random numbers are the basic building blocks for all simulation algorithms.
Similarly, simulation languages generate random numbers that are used to generate
event times and other random variables.
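Each random number Ri is an independent sample drawn from a continuous uniform
distribution between 0 and 1, whose pdf is
$f(x) = 1$ for $0 \le x \le 1$, and $f(x) = 0$ otherwise
$E(R) = \int_0^1 x\,dx = \left[\frac{x^2}{2}\right]_0^1 = \frac{1}{2}$
$V(R) = \int_0^1 x^2\,dx - [E(R)]^2 = \left[\frac{x^3}{3}\right]_0^1 - \left(\frac{1}{2}\right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$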
Some consequences of the uniformity and independence properties are the following:
1. If the interval (0, 1) is divided into n classes, or subintervals of equal length,
the expected number of observations in each interval is N / n, where N is the
total number of observations.
2. The probability of observing a value in a particular interval is independent of
previous values drawn.
The ultimate test of the linear congruential method or any generation technique is
uniformity and independence.
There are, however, several secondary properties that must be considered. These
include maximum density and maximum period.
Maximum Density: It means that the values assumed by Ri, i = 1,2,… leave no large
gaps on [0, 1].
Maximum Period: It helps to achieve maximum density, and to avoid cycling (i.e.
recurrence of the same sequence of generated numbers). Maximal period can be
achieved by the proper choice of a, c, m, and X0. Table below gives the period that
can be achieved for different choices of a, c, m, and X0.
Modulus (m) | Increment (c) | Constant Multiplier (a) | Seed (X0) | Period (P)
2^b | ≠ 0 and relatively prime to m, i.e., greatest common factor of c and m is 1 | 1 + 4k, where k is an integer | -- | m = 2^b
2^b | 0 | 3 + 8k or 5 + 8k, k = 0, 1, … | Odd | m/4 = 2^(b-2)
Prime Number | 0 | The smallest integer k such that a^k – 1 is divisible by m is k = m - 1 | -- | m - 1
Example 6.1
(i) Use the linear congruential method to generate a sequence of three two-digit random
integers. Let X0 = 27, a = 8, c = 47, and m = 100.
(ii) Do we encounter a problem if X0 = 0?
Solution:
(i) Given X0 = 27, a = 8, c = 47, and m = 100.
Note: Here the random integers will be generated between 0 and 99 because of the
value of the modulus.
Random numbers between 0 and 1 can be generated by $R_i = X_i/m$, i = 1, 2, …
According to linear congruential method, $X_{i+1} = (aX_i + c) \bmod m$, i = 0, 1, 2, …
X0 = 27
X1 = (8 x 27 + 47) mod 100 = 263 mod 100 = 63
R1 = 63/100 = 0.63
(ii) If X0 = 0
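The recurrence of the linear congruential method, X(i+1) = (a·Xi + c) mod m with Ri = Xi/m, is short enough to sketch directly. The function name `lcg` is illustrative; with c = 0 the same function gives the multiplicative congruential method.

```python
def lcg(x0: int, a: int, c: int, m: int, n: int) -> list[int]:
    """Return the next n integers X1..Xn of X_{i+1} = (a*X_i + c) mod m."""
    values = []
    x = x0
    for _ in range(n):
        x = (a * x + c) % m
        values.append(x)
    return values

# Example 6.1 (i): X0 = 27, a = 8, c = 47, m = 100
integers = lcg(x0=27, a=8, c=47, m=100, n=3)
random_numbers = [x / 100 for x in integers]

# Example 6.1 (ii): seed X0 = 0 with the same a, c, m
integers_zero_seed = lcg(x0=0, a=8, c=47, m=100, n=3)
print(integers, random_numbers, integers_zero_seed)
```

With X0 = 0 the generator still moves on to 47, 23, 31, because the increment c is nonzero; a zero seed would only lock the sequence at 0 in the multiplicative case (c = 0).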
Solution:
Given X0 = 13, a = 9, and c = 35.
Random integers between 0 and 24 indicates m = 25.
Random numbers between 0 and 1 can be generated by $R_i = X_i/m$, i = 1, 2, …
According to linear congruential method, $X_{i+1} = (aX_i + c) \bmod m$, i = 0, 1, 2, …
X0 = 13
Example 6.3
Use the multiplicative congruential method to generate a sequence of four three-digit random
numbers. Use X0 = 117, a = 43, and m = 1000.
Solution:
Given: X0 = 117, a = 43, m = 1000
For multiplicative congruential method,
$X_{i+1} = (aX_i) \bmod m$, i = 0, 1, 2, …
X1 = [43(117)] mod 1000 = 5031 mod 1000 = 31
Example 6.4
Determine whether the multiplicative congruential generator with a = 6507, c = 0, and m =
1024 can achieve a maximum period. Also state the restriction on X0 to obtain this period.
Solution:
Given a = 6507, c = 0, and m = 1024 = 2^10
Since modulus m = 2^10 is of the form 2^b and c = 0, the constant multiplier must be of the form
3 + 8k or 5 + 8k for k = 0, 1, 2, …
When a = 3 + 8k,
i.e., 6507 = 3 + 8k ⇒ k = 813, which is an integer.
When a = 5 + 8k,
i.e., 6507 = 5 + 8k ⇒ k = 812.75, which is not an integer.
Since a is of the form 3 + 8k, the multiplicative congruential generator can achieve the
maximum period P = m/4 = 2^8 = 256, provided that the seed X0 is odd.
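The conclusion of Example 6.4 can be checked by brute force, since m = 1024 is small enough to iterate the generator until it cycles. The helper `lcg_period` below is an illustrative name; it simply records when each state was first seen.

```python
def lcg_period(x0: int, a: int, c: int, m: int) -> int:
    """Length of the cycle eventually entered by X_{i+1} = (a*X_i + c) mod m."""
    first_seen = {}
    x, step = x0, 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - first_seen[x]

period_odd_seed = lcg_period(x0=1, a=6507, c=0, m=1024)
period_even_seed = lcg_period(x0=2, a=6507, c=0, m=1024)
print(period_odd_seed, period_even_seed)
```

An odd seed reaches the period m/4, while an even seed gives a shorter cycle, which is why the restriction on X0 matters.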
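$W_i = \left(\sum_{j=1}^{k} W_{i,j}\right) \bmod (m_1 - 1)$
is also uniformly distributed on the integers 0 to m1-2. Now this result can be used
to combine generators.
ii. Let Xi,1, Xi,2,…, Xi,k be the ith output from k different multiplicative congruential
generators.
iii. In this case the jth generator has the prime modulus mj, the multiplier aj and the
period mj-1.
iv. The jth generator produces integers $X_{i,j}$ that are approximately uniformly distributed on
1 to mj-1, and $W_{i,j} = X_{i,j} - 1$ is approximately uniformly distributed on 0 to mj-2.
v. The combined generators take the form
$X_i = \left(\sum_{j=1}^{k} (-1)^{j-1} X_{i,j}\right) \bmod (m_1 - 1)$
with $R_i = X_i/m_1$ if $X_i > 0$, and $R_i = (m_1 - 1)/m_1$ if $X_i = 0$
vi. The maximum possible period for this generator is
$P = \frac{(m_1 - 1)(m_2 - 1)\cdots(m_k - 1)}{2^{k-1}}$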
vii. This leads to the following algorithm that combines k generators, with modulus mi
and constant multiplier ai.
Algorithm:
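1. Select seeds Xi,0 in the range [1, mi-1] for each of the k generators, i = 1 to k. Set
j=0, l=0.
2. Evaluate each individual generator.
For i= 1 to k
$X_{i,j+1} = a_i X_{i,j} \bmod m_i$
3. Combine the outputs: $X_{j+1} = \left(\sum_{i=1}^{k} (-1)^{i-1} X_{i,j+1}\right) \bmod (m_1 - 1)$
4. Return $R_{j+1} = X_{j+1}/m_1$ if $X_{j+1} > 0$, and $R_{j+1} = (m_1 - 1)/m_1$ if $X_{j+1} = 0$
5. Set j = j + 1 and go to step 2.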
Example 6.5
Use the combined linear congruential method to combine three multiplicative generators with
m1 = 32363, a1 = 157, m2 = 31727, a2 = 146, m3 = 31657, and a3 = 142. Generate random
number with the combined generator using initial seeds Xi,0 = 100, 300, 500 for the individual
generators i = 1, 2, 3.
Solution:
Given the following:
Initial seeds: X1,0 = 100, X2,0 = 300, X3,0 = 500
Modulus: m1 = 32363, m2 = 31727, m3 = 31657
Constant multiplier: a1 = 157, a2 = 146, a3 = 142
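Evaluating each individual generator gives
X1,1 = (157 × 100) mod 32363 = 15700
X2,1 = (146 × 300) mod 31727 = 12073
X3,1 = (142 × 500) mod 31657 = 7686
$X_1 = \left(\sum_{i=1}^{3} (-1)^{i-1} X_{i,1}\right) \bmod (m_1 - 1) = (15700 - 12073 + 7686) \bmod 32362$
= 11313
4. Compute the 1st random number
R1 = X1/m1 = 11313/32363 = 0.3496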
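The combined generator of Example 6.5 can be sketched as below. The function name `combined_lcg` is illustrative; it alternates the signs of the individual outputs, reduces the sum modulo m1 − 1, and maps a zero result to (m1 − 1)/m1.

```python
def combined_lcg(seeds, multipliers, moduli, n):
    """Return n pairs (X_j, R_j) from a combination of multiplicative generators."""
    states = list(seeds)
    m1 = moduli[0]
    output = []
    for _ in range(n):
        # Step each multiplicative generator once.
        states = [(a * x) % m for a, x, m in zip(multipliers, states, moduli)]
        # Alternate signs: +X1 - X2 + X3 ...; Python's % keeps the result in [0, m1 - 2].
        combined = sum((-1) ** i * x for i, x in enumerate(states)) % (m1 - 1)
        r = combined / m1 if combined > 0 else (m1 - 1) / m1
        output.append((combined, r))
    return output

sequence = combined_lcg(
    seeds=[100, 300, 500],
    multipliers=[157, 146, 142],
    moduli=[32363, 31727, 31657],
    n=3,
)
first_x, first_r = sequence[0]
print(first_x, round(first_r, 4))
```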
The null hypothesis, H0, reads that the numbers are independent. Failure to reject the
null hypothesis means that the evidence of dependence has not been detected by the
test.
For each test, a level of significance α must be stated. The level α is the probability
of rejecting the null hypothesis when the null hypothesis is true i.e.,
α = P(reject H0 | H0 true)
There are two errors associated with hypothesis testing, namely Type I error and Type
II error.
1. Type I error (α): It is the error of rejecting H0 when in fact it is true.
2. Type II error (β): It is the error of accepting H0 when in fact it is false.
The type I error is commonly known as a false positive, and the type II error is known
as a false negative.
A basic test that should always be performed to validate a new generator is the test for
uniformity.
Two different methods of testing are available. They are the Kolmogorov-Smirnov
and the chi-square test.
Both of these tests measure the degree of agreement between the distribution of a
sample of generated random numbers and the theoretical uniform distribution.
Both tests are based on the null hypothesis of no significant difference between the
sample distribution and the theoretical distribution.
This test compares the continuous CDF, F(x), of the uniform distribution with the
empirical CDF, SN(x), of the N sample observations.
By definition, F(x) = x, 0 ≤ x ≤ 1
If the sample from the random-number generator is R1, R2,…,RN, then the empirical
CDF, SN(x), is defined by
$S_N(x) = \frac{\text{number of } R_1, R_2, \dots, R_N \text{ which are} \le x}{N}$
As N becomes larger, SN(x) should become a better approximation to F(x), provided
that the null hypothesis is true.
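The Kolmogorov-Smirnov test is based on the largest absolute deviation between F(x)
and SN(x) over the range of the random variable, i.e., it is based on the statistic
D = max |F(x) - SN(x)|
The sampling distribution of D is known and is tabulated as a function of N in Table
A.5.
Algorithm:
1. Define the hypothesis for testing the uniformity.
2. Rank the data from smallest to largest: R(1) ≤ R(2) ≤ … ≤ R(N).
3. Compute D+ = max{i/N - R(i)} and D- = max{R(i) - (i - 1)/N} over 1 ≤ i ≤ N.
4. Compute D = max{D+, D-}.
5. Determine the critical value Dα for the specified significance level α and the sample
size N from Table A.5. If D > Dα, reject H0. Else, conclude that there is no difference
detected between the sample distribution and the uniform distribution.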
Note: Kolmogorov-Smirnov test is more powerful and can be applied to small sample sizes
(N<50).
Solution:
Step 1: Define the hypothesis for testing the uniformity as:
H0: Ri ~ U[0, 1]
H1: Ri ≁ U[0, 1]
Step 2: Now, rank data from smallest to largest
0.11 ≤ 0.54 ≤ 0.68 ≤ 0.73 ≤ 0.98
Step 3: Compute D+ and D-
i 1 2 3 4 5
R(i) 0.11 0.54 0.68 0.73 0.98
i/N 0.20 0.40 0.60 0.80 1.00
D+ = max{i/N - R(i)} = 0.09,
D- = max{R(i) - (i - 1)/N} = 0.34
Step 4: Compute D
D = max {D+, D-} = max{0.09, 0.34} = 0.34
Step 5: Given, for α = 0.05, D0.05, 5 = 0.565
Since D = 0.34 < D0.05, 5 = 0.565,
H0: Ri ~ U[0, 1] cannot be rejected i.e., random numbers are uniformly distributed.
Example 6.7
The sequence of numbers 0.63, 0.49, 0.24, 0.89, and 0.71 has been generated. Use the
Kolmogorov-Smirnov test with α = 0.05 to determine if the hypothesis that the numbers are
uniformly distributed on the interval [0, 1] can be rejected. Use D0.05, 5 = 0.565.
Solution:
Step 1: Define the hypothesis for testing the uniformity as:
H0: Ri ~ U[0, 1]
H1: Ri ≁ U[0, 1]
Step 2: Now, rank data from smallest to largest
0.24 ≤ 0.49 ≤ 0.63 ≤ 0.71 ≤ 0.89
Step 3: Compute D+ and D-
i 1 2 3 4 5
R(i) 0.24 0.49 0.63 0.71 0.89
i/N 0.20 0.40 0.60 0.80 1.00
i/N - R(i) -- -- -- 0.09 0.11
R(i) - (i - 1)/N 0.24 0.29 0.23 0.11 0.09
D+ = max{i/N - R(i)} = 0.11,
D- = max{R(i) - (i - 1)/N} = 0.29
Step 4: Compute D
D = max {D+, D-} = max{0.11, 0.29} = 0.29
Step 5: Given, for α = 0.05, D0.05, 5 = 0.565
Since D = 0.29 < D0.05, 5 = 0.565,
H0: Ri ~ U[0, 1] cannot be rejected i.e., random numbers are uniformly distributed.
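The Kolmogorov-Smirnov statistic for a small sample is a few lines of code, which makes it easy to check the hand computation. The sketch below uses the sample of Example 6.7; the function name `ks_uniform` is illustrative and the critical value 0.565 is the one quoted in the example.

```python
def ks_uniform(sample):
    """Return (D_plus, D_minus, D) for a test against U[0, 1]."""
    ranked = sorted(sample)
    n = len(ranked)
    # enumerate() is 0-based, so rank i corresponds to index i - 1.
    d_plus = max((index + 1) / n - value for index, value in enumerate(ranked))
    d_minus = max(value - index / n for index, value in enumerate(ranked))
    return d_plus, d_minus, max(d_plus, d_minus)

d_plus, d_minus, d = ks_uniform([0.63, 0.49, 0.24, 0.89, 0.71])
critical_value = 0.565  # D for alpha = 0.05 and N = 5, as given in the example
decision = "reject H0" if d > critical_value else "do not reject H0"
print(round(d_plus, 2), round(d_minus, 2), round(d, 2), decision)
```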
Solution:
Step 1: Define the hypothesis for testing the uniformity as:
H0: Ri ~ U[0, 1]
H1: Ri ≁ U[0, 1]
Step 2: Now, rank data from smallest to largest
0.05 ≤ 0.15 ≤ 0.29 ≤ 0.51 ≤ 0.94
Step 3: Compute D+ and D-
i 1 2 3 4 5
R(i) 0.05 0.15 0.29 0.51 0.94
i/N 0.20 0.40 0.60 0.80 1.00
i/N - R(i) 0.15 0.25 0.31 0.29 0.06
R(i) - (i - 1)/N 0.05 -- -- -- 0.14
D+ = max{i/N - R(i)} = 0.31,
D- = max{R(i) - (i - 1)/N} = 0.14
Step 4: Compute D
D = max {D+, D-} = max{0.31, 0.14} = 0.31
Step 5: Given, for α = 0.05, D0.05, 5 = 0.565
Since D = 0.31 < D0.05, 5 = 0.565,
H0: Ri ~ U[0, 1] cannot be rejected i.e., random numbers are uniformly distributed.
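$\chi_0^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
where Oi is the observed number in the ith class, n is the number of classes, and Ei is the
expected number in the ith class which is N/n.
4. Determine the critical value for the specified significance level α with (n-1) degrees of
freedom from Table A.4.
5. If $\chi_0^2 > \chi^2_{\alpha, n-1}$, reject H0. Else, conclude that there is no difference detected between
the sample distribution and the uniform distribution.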
Note: Chi-square test is valid only for large sample sizes (N ≥ 50).
Example 6.9
Consider the following sequence of 100 numbers.
0.43 0.09 0.52 0.98 0.78 0.44 0.21 0.12 0.64 0.76
0.38 0.67 0.97 0.46 0.07 0.18 0.49 0.47 0.22 0.47
0.69 0.99 0.77 0.76 0.65 0.14 0.25 0.37 0.99 0.20
0.74 0.03 0.71 0.28 0.65 0.50 0.54 0.13 0.87 0.50
0.97 0.17 0.32 0.91 0.28 0.39 0.56 0.73 0.93 0.24
0.99 0.71 0.99 0.64 0.50 0.66 0.01 0.24 0.81 0.94
0.73 0.15 0.45 0.10 0.18 0.82 0.96 0.43 0.57 0.94
0.27 0.34 0.65 0.79 0.03 0.49 0.69 0.85 0.37 0.50
0.60 0.93 0.48 0.42 0.04 0.46 0.04 0.91 0.97 0.26
0.81 0.62 0.79 0.88 0.46 0.74 0.06 0.11 0.92 0.87
Use the chi-square test, with α = 0.05, to test whether the hypothesis that the numbers are
uniformly distributed on the interval [0, 1] can be rejected.
Solution:
Step 1: Define the hypothesis for testing the uniformity as:
H0: Ri ~ U[0, 1]
H1: Ri ≁ U[0, 1]
Step 2: Choose the value of n such that Ei ≥ 5.
Since N = 100 and Ei = N/n
Let n = 10 intervals of equal length, namely [0, 0.1), [0.1, 0.2), … , [0.9,1.0)
Step 3: Compute the test statistic
Interval Oi Ei (Oi - Ei)²/Ei
[0, 0.1) 8 10 0.4
[0.1, 0.2) 9 10 0.1
[0.2, 0.3) 10 10 0.0
[0.3, 0.4) 6 10 1.6
[0.4, 0.5) 13 10 0.9
[0.5, 0.6) 8 10 0.4
[0.6, 0.7) 11 10 0.1
[0.7, 0.8) 12 10 0.4
[0.8, 0.9) 7 10 0.9
[0.9, 1.0) 16 10 3.6
Total 100 100 8.4
$\chi_0^2 = \sum_{i=1}^{10} \frac{(O_i - E_i)^2}{E_i} = 8.4$
Step 4: Determine the critical value for the specified level of significance α with (n-1)
degree of freedom from Table A.4.
Since α = 0.05, and degree of freedom (n - 1) = 10 – 1 = 9, the critical value is $\chi^2_{0.05, 9} = 16.9$
Step 5: Since $\chi_0^2 = 8.4 < \chi^2_{0.05, 9} = 16.9$,
H0: Ri ~ U[0, 1] cannot be rejected i.e., random numbers are uniformly distributed.
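The chi-square statistic of Example 6.9 can be recomputed from the observed class counts in the table. In this sketch the function name `chi_square_uniform` is illustrative, and the critical value 16.9 for α = 0.05 with 9 degrees of freedom is taken from a standard chi-square table.

```python
def chi_square_uniform(observed_counts):
    """Chi-square statistic against equal expected counts N/n in each class."""
    n_classes = len(observed_counts)
    total = sum(observed_counts)
    expected = total / n_classes
    return sum((o - expected) ** 2 / expected for o in observed_counts)

observed = [8, 9, 10, 6, 13, 8, 11, 12, 7, 16]   # class counts of Example 6.9
chi2_statistic = chi_square_uniform(observed)
critical_value = 16.9                             # chi-square(0.05, 9)
decision = "reject H0" if chi2_statistic > critical_value else "do not reject H0"
print(round(chi2_statistic, 2), decision)
```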
Interval Oi Ei (Oi - Ei)²/Ei
[0] 8 5 1.8
[1] 5 5 0
[2] 3 5 0.8
[3] 1 5 3.2
[4] 6 5 0.2
[5] 2 5 1.8
[6] 11 5 7.2
[7] 4 5 0.2
[8] 4 5 0.2
[9] 6 5 0.2
Total 50 50 15.6
$\chi_0^2 = \sum_{i=1}^{10} \frac{(O_i - E_i)^2}{E_i} = 15.6$
Step 4: Determine the critical value for the specified level of significance α with (n-1)
degree of freedom from Table A.4.
Since α = 0.05, and degree of freedom (n - 1) = 10 – 1 = 9, the critical value is $\chi^2_{0.05, 9} = 16.9$
Step 5: Since $\chi_0^2 = 15.6 < \chi^2_{0.05, 9} = 16.9$,
H0: Ri ~ U[0, 1] cannot be rejected i.e., random numbers are uniformly distributed.
The runs test examines the arrangement of numbers in a sequence to test the
hypothesis of independence.
A run is defined as a succession of similar events preceded and followed by a
different event.
The length of the run is the number of events that occur in the run.
Example: Consider the following sequence generated by tossing a coin 10 times.
H T T H H H T H H T
1 2 3 1 2 1
In the above example, there are six runs. The length of each run is 1, 2, 3, 1, 2, and 1.
Algorithm:
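1. Define the hypothesis for testing the independence.
2. Write down the sequence of runs up (+) and runs down (-).
3. Count the total number of runs (a) present in the sequence.
4. Compute mean and variance of a.
$\mu_a = \frac{2N - 1}{3}$ and $\sigma_a^2 = \frac{16N - 29}{90}$
5. Compute the standard normal statistics
$Z_0 = \frac{a - \mu_a}{\sigma_a}$
6. Determine the critical values $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ for the specified significance level α from
Table A.2.
7. If $-Z_{\alpha/2} \le Z_0 \le Z_{\alpha/2}$, H0 is not rejected for the specified significance level α.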
The critical values and rejection region are shown in figure 6.2.
Example 6.11
Consider the following sequence of 40 numbers.
0.41 0.68 0.89 0.94 0.74 0.91 0.55 0.62 0.36 0.27
0.19 0.72 0.75 0.08 0.54 0.02 0.01 0.36 0.16 0.28
0.18 0.01 0.95 0.69 0.18 0.47 0.23 0.32 0.82 0.53
0.31 0.42 0.73 0.04 0.83 0.45 0.13 0.57 0.63 0.29
Based on runs up and runs down, determine whether the hypothesis of independence can be
rejected, where α = 0.05 and Z0.025 = 1.96
Solution:
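Step 1: Define the hypothesis for testing the independence as:
H0: Ri ~ independently
H1: Ri ≁ independently
Step 2: The sequence of runs up and runs down is:
+ + + - + - + - - - + + - + - - + - + - - + - - + - + + - - + + - + - - + + -
Step 3: The total number of runs a = 26
Step 4: Mean and variance of a is given as
$\mu_a = \frac{2N - 1}{3} = \frac{2(40) - 1}{3} = 26.33$
$\sigma_a^2 = \frac{16N - 29}{90} = \frac{16(40) - 29}{90} = 6.79$
Step 5: $Z_0 = \frac{a - \mu_a}{\sigma_a} = \frac{26 - 26.33}{\sqrt{6.79}} = -0.13$
Step 6: Given critical value → Zα/2= Z0.025 = 1.96
Step 7: Since -1.96 ≤ -0.13 ≤ 1.96, H0 cannot be rejected, i.e., the numbers are independent.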
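Counting runs up and runs down by hand over 40 numbers is error-prone, so a short check of Example 6.11 may be useful. The sketch assumes the usual formulas µa = (2N − 1)/3 and σa² = (16N − 29)/90; the function name `runs_up_down_test` is illustrative.

```python
import math

def runs_up_down_test(data):
    """Return (a, mu_a, var_a, z0) for the runs up and runs down test."""
    n = len(data)
    # '+' when the next value is larger, '-' otherwise (ties are assumed absent).
    signs = ['+' if later > earlier else '-' for earlier, later in zip(data, data[1:])]
    # A new run starts at every sign change.
    a = 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    mu_a = (2 * n - 1) / 3
    var_a = (16 * n - 29) / 90
    z0 = (a - mu_a) / math.sqrt(var_a)
    return a, mu_a, var_a, z0

data = [
    0.41, 0.68, 0.89, 0.94, 0.74, 0.91, 0.55, 0.62, 0.36, 0.27,
    0.19, 0.72, 0.75, 0.08, 0.54, 0.02, 0.01, 0.36, 0.16, 0.28,
    0.18, 0.01, 0.95, 0.69, 0.18, 0.47, 0.23, 0.32, 0.82, 0.53,
    0.31, 0.42, 0.73, 0.04, 0.83, 0.45, 0.13, 0.57, 0.63, 0.29,
]
a, mu_a, var_a, z0 = runs_up_down_test(data)
decision = "reject H0" if abs(z0) > 1.96 else "do not reject H0"
print(a, round(mu_a, 2), round(var_a, 2), round(z0, 2), decision)
```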
Solution:
Step 1: Define the hypothesis for testing the independence as:
( )
( )
Solution:
( )
( )
√
Step 6: Given critical value → Zα/2= Z0.025 = 1.96
–Zα/2 ≤ Z0 ≤ Zα/2 → -1.96 ≤ -0.27 ≤ 1.96
Therefore, H0 cannot be rejected, we accept null hypothesis.
i.e.; the numbers are independent
Here, the runs are defined as being above the mean or below the mean.
A “+” will be used to denote a number above the mean and a “-” sign will be used to
denote a number below the mean.
Example: Consider the following sequence of 10 numbers.
0.37 0.59 0.63 0.07 0.92 0.48 0.12 0.86 0.71 0.24
The numbers are given “+” or “-” depending on whether they are greater than or
smaller than the expected mean which is (0 + 0.99)/2 = 0.495.
The sequence of 10 +'s and -'s is given below:
- + + - + - - + + -
In the above example, there are seven runs.
The length of each run is 1, 2, 1, 1, 2, 2, and 1.
Let n1 and n2 be the number of observations above and below the mean respectively.
The maximum number of runs is N = n1 + n2 and the minimum number of runs is 1.
Algorithm:
1. Define the hypothesis for testing the independence as:
H0: Ri ~ independently
H1: Ri ≁ independently
2. Write down the sequence of runs above and runs below the mean.
3. Count the number of observations above mean (n1), the number of observations below
mean (n2), and the total number of runs (b) present in the sequence.
4. Compute mean and variance of b.
$\mu_b = \frac{2 n_1 n_2}{N} + \frac{1}{2}$
$\sigma_b^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - N)}{N^2 (N - 1)}$
5. Compute the standard normal statistics
$Z_0 = \frac{b - \mu_b}{\sigma_b}$
6. Determine the critical values $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ for the specified significance level α from
Table A.2.
7. If $-Z_{\alpha/2} \le Z_0 \le Z_{\alpha/2}$, H0 is not rejected for the specified significance level α.
Solution:
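Step 1: Define the hypothesis for testing the independence as:
H0: Ri ~ independently
H1: Ri ≁ independently
Step 2: The sequence of runs above and below the mean (i.e. 0.495) is:
- - - - + - + - - +
- - - - + - + - + +
Step 3: The number of observations above mean = n1 = 7
The number of observations below mean = n2 = 13
The total number of runs b = 12
Step 4: Mean and variance of b is given as (with N = n1 + n2 = 20)
$\mu_b = \frac{2 n_1 n_2}{N} + \frac{1}{2} = \frac{2(7)(13)}{20} + \frac{1}{2} = 9.6$
$\sigma_b^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - N)}{N^2 (N - 1)} = \frac{2(7)(13)[2(7)(13) - 20]}{(20)^2 (20 - 1)} = 3.88$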
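The remaining steps of the runs above and below the mean test can be sketched from the sign sequence alone. The function name `runs_above_below_test` is illustrative, and the critical value 1.96 (α = 0.05) is an assumption made here for the sake of the example, since the significance level is not restated in the solution.

```python
import math

def runs_above_below_test(signs: str):
    """Return (n1, n2, b, mu_b, var_b, z0) for a string of '+' and '-' signs."""
    n1 = signs.count('+')              # observations above the mean
    n2 = signs.count('-')              # observations below the mean
    n = n1 + n2
    # A new run starts at every sign change.
    b = 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    mu_b = 2 * n1 * n2 / n + 0.5
    var_b = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z0 = (b - mu_b) / math.sqrt(var_b)
    return n1, n2, b, mu_b, var_b, z0

signs = "----+-+--+" + "----+-+-++"   # sequence from Step 2 of the solution
n1, n2, b, mu_b, var_b, z0 = runs_above_below_test(signs)
decision = "reject H0" if abs(z0) > 1.96 else "do not reject H0"
print(n1, n2, b, round(mu_b, 2), round(var_b, 2), round(z0, 2), decision)
```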
Length of runs is another concern of runs test and it is expected that length of runs
should not be constant.
Let Yi be the number of runs of length 'i' in a sequence of N numbers.
The expected value of Yi for runs up and runs down or runs above and runs below the
mean is determined.
Then the chi-square test is applied to compare expected value with the observed
value.
The following algorithm is used to test independence on the basis of length of runs up
and down or above and below the mean in the given sequence of numbers.
Algorithm:
2. Write down the sequence of runs up and down or above and below the mean (as asked
in the problem).
3. Find the length of runs in the sequence.
4. Prepare a table for number of observed runs of each length.
Run length (i) 1 2 …
Observed runs (Oi)
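5. Compute the expected value of Yi.
(i) For runs up and runs down:
$E(Y_i) = \frac{2}{(i+3)!}\left[N(i^2 + 3i + 1) - (i^3 + 3i^2 - i - 4)\right]$ for $i \le N - 2$
$E(Y_i) = \frac{2}{N!}$ for $i = N - 1$
(ii) For runs above and below mean:
$E(Y_i) = \frac{N w_i}{E(I)}$, N > 20
where
$w_i = \left(\frac{n_1}{N}\right)^i\left(\frac{n_2}{N}\right) + \left(\frac{n_1}{N}\right)\left(\frac{n_2}{N}\right)^i$ is the approximate probability that a run has length i, and
$E(I) = \frac{n_1}{n_2} + \frac{n_2}{n_1}$ is the approximate expected length of a run.
6. Compute the mean or expected total number of runs (of all lengths) in a sequence.
(i) For runs up and runs down:
$\mu_a = \frac{2N - 1}{3}$
(ii) For runs above and below mean:
$E(A) = \frac{N}{E(I)}$, N > 20
7. Compute the expected number of runs of length greater than or equal to the maximum
length of observed length.
(i) For runs up and runs down:
$E(Y_{\ge m}) = \mu_a - \sum_{i=1}^{m-1} E(Y_i)$, where m is equal to the maximum length of observed run.
(ii) For runs above and below mean:
$E(Y_{\ge m}) = E(A) - \sum_{i=1}^{m-1} E(Y_i)$, where m is equal to the maximum length of observed run.
8. Apply the chi-square test.
Run Length, i | Observed Number of Runs, Oi | Expected Number of Runs, E(Yi) | [Oi - E(Yi)]²/E(Yi)
1
.
.
.
≥m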
1
.
.
.
≥m
Example 6.15
Consider the following sequence of the numbers below. Can the hypothesis that the numbers
are independent be rejected on the basis of the length of runs up and runs down at α = 0.05.
0.34 0.90 0.25 0.89 0.87 0.44 0.12 0.21 0.46 0.67
0.83 0.76 0.09 0.64 0.70 0.81 0.04 0.74 0.22 0.74
0.96 0.99 0.27 0.67 0.56 0.41 0.52 0.73 0.99 0.02
Solution:
Step 1: Define the hypothesis for testing the independence as:
+ - + - - - + + + +
- - + + + - + - + +
+ - + - - + + + -
Step 3: The length of run in the sequence is
1 1 1 3 4 2 3 1 1 1 3 1 1 2 3 1
Step 4: The number of observed runs of each length is
Run length, i 1 2 3 4
Observed Runs, Oi 9 2 4 1
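Step 5: The expected number of runs of length one, two, three, and four are
$E(Y_1) = \frac{2}{4!}[30(1 + 3 + 1) - (1 + 3 - 1 - 4)] = 12.58$
$E(Y_2) = \frac{2}{5!}[30(4 + 6 + 1) - (8 + 12 - 2 - 4)] = 5.26$
$E(Y_3) = \frac{2}{6!}[30(9 + 9 + 1) - (27 + 27 - 3 - 4)] = 1.45$
$E(Y_4) = \frac{2}{7!}[30(16 + 12 + 1) - (64 + 48 - 4 - 4)] = 0.30$
Step 6: The mean or expected total number of runs of all lengths in a sequence is:
$\mu_a = \frac{2N - 1}{3} = \frac{2(30) - 1}{3} = 19.67$
Step 7: The expected number of runs of length greater than or equal to 5 is:
$E(Y_{\ge 5}) = \mu_a - \sum_{i=1}^{4} E(Y_i) = 19.67 - (12.58 + 5.26 + 1.45 + 0.30) = 0.08$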
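Here classes 3, 4, and 5 have expected frequencies less than 5, so they are combined with class 2.
Similarly, the observed frequencies of classes 3, 4, and 5 are combined with that of class 2.
This reduces the number of classes by 3, which leads to n = 5 - 3 = 2. The test statistic is
calculated as:
$\chi_0^2 = \sum_{i=1}^{2} \frac{[O_i - E(Y_i)]^2}{E(Y_i)} = \frac{(9 - 12.58)^2}{12.58} + \frac{(7 - 7.09)^2}{7.09} = 1.02$
Step 9: The critical value for the specified significance level α = 0.05 with n - 1= (2 - 1) = 1
degree of freedom from Table A.4 is: $\chi^2_{0.05, 1} = 3.84$
Since $\chi_0^2 = 1.02 < \chi^2_{0.05, 1} = 3.84$, H0 cannot be rejected, i.e., the numbers are independent.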
Example 6.16
Consider the following sequence of the numbers below. Can the hypothesis that the numbers
are independent be rejected on the basis of the length of runs above and below mean at α =
0.05.
0.34 0.90 0.25 0.89 0.87 0.44 0.12 0.21 0.46 0.67
0.83 0.76 0.09 0.64 0.70 0.81 0.04 0.74 0.22 0.74
0.96 0.99 0.27 0.67 0.56 0.41 0.52 0.73 0.99 0.02
Solution:
Step 1: Define the hypothesis for testing the independence as:
- + - + + - - - - +
+ + - + + + - + - +
+ + - + + - + + + -
Step 3: The length of run in the sequence is
1 1 1 2 4 3 1 3 1 1 1 3 1 2 1 3 1
Step 4: The number of observed runs of each length is
Run length, i 1 2 3 ≥4
Observed Runs, Oi 10 2 4 1
Step 5: The expected number of runs of length one, two, three, and four are:
The number of observations above mean = n1 = 18 and
The number of observations below mean = n2 = 12
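(i) The approximate probability that a run has length 'i' is:
$w_i = \left(\frac{n_1}{N}\right)^i\left(\frac{n_2}{N}\right) + \left(\frac{n_1}{N}\right)\left(\frac{n_2}{N}\right)^i$
$w_1 = \left(\frac{18}{30}\right)\left(\frac{12}{30}\right) + \left(\frac{18}{30}\right)\left(\frac{12}{30}\right) = 0.48$
$w_2 = \left(\frac{18}{30}\right)^2\left(\frac{12}{30}\right) + \left(\frac{18}{30}\right)\left(\frac{12}{30}\right)^2 = 0.24$
$w_3 = \left(\frac{18}{30}\right)^3\left(\frac{12}{30}\right) + \left(\frac{18}{30}\right)\left(\frac{12}{30}\right)^3 = 0.125$
(ii) The approximate expected length of a run is $E(I) = \frac{n_1}{n_2} + \frac{n_2}{n_1} = \frac{18}{12} + \frac{12}{18} = 2.167$, so that
$E(Y_1) = \frac{N w_1}{E(I)} = \frac{30(0.48)}{2.167} = 6.65$
$E(Y_2) = \frac{N w_2}{E(I)} = \frac{30(0.24)}{2.167} = 3.32$
$E(Y_3) = \frac{N w_3}{E(I)} = \frac{30(0.125)}{2.167} = 1.73$
Step 6: The mean or expected total number of runs of all length in a sequence is:
$E(A) = \frac{N}{E(I)} = \frac{30}{2.167} = 13.85$
Step 7: The expected number of runs of length greater than or equal to 4 is:
$E(Y_{\ge 4}) = E(A) - \sum_{i=1}^{3} E(Y_i) = 13.85 - (6.65 + 3.32 + 1.73) = 2.15$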
, ( )-
∑
( )
Step 9: The critical value for the specified significance level α = 0.05 with n - 1= (1 - 1) = 0
degree of freedom from Table A.4 is:
If ρim > 0 then the subsequence has positive autocorrelation whereas if ρim < 0 then
the subsequence has negative autocorrelation.
Algorithm:
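2. Find out the value of 'i' and lag 'm' using the given data.
3. Using i, m, and N estimate the value of M where M is largest integer such that
i + (M + 1)m ≤ N,
where N is the total number of values in the sequence.
4. For large values of M, the distribution of the estimator of $\rho_{im}$, denoted by $\hat{\rho}_{im}$, is
approximately normal if the numbers Ri, Ri+m, Ri+2m, …, Ri+(M+1)m are uncorrelated
where
$\hat{\rho}_{im} = \frac{1}{M + 1}\left[\sum_{k=0}^{M} R_{i+km}\,R_{i+(k+1)m}\right] - 0.25$
5. Compute the standard deviation of the estimator: $\sigma_{\hat{\rho}_{im}} = \frac{\sqrt{13M + 7}}{12(M + 1)}$
6. Compute the test statistic: $Z_0 = \hat{\rho}_{im} / \sigma_{\hat{\rho}_{im}}$
7. Determine the critical values $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ for the specified significance level α from
Table A.2.
8. If $-Z_{\alpha/2} \le Z_0 \le Z_{\alpha/2}$, H0 is not rejected for the specified significance level α.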
Example 6.17
Consider the following sequence of 60 numbers.
0.30 0.48 0.36 0.01 0.54 0.34 0.96 0.06 0.61 0.85
0.48 0.86 0.14 0.86 0.89 0.37 0.49 0.60 0.04 0.83
0.42 0.83 0.37 0.21 0.90 0.89 0.91 0.79 0.57 0.99
0.95 0.27 0.41 0.81 0.96 0.31 0.09 0.06 0.23 0.77
0.73 0.47 0.13 0.55 0.11 0.75 0.36 0.25 0.23 0.72
0.60 0.84 0.70 0.30 0.26 0.38 0.05 0.19 0.73 0.44
Test whether the 2nd, 9th, 16th, … numbers in the sequence are autocorrelated, where α = 0.05.
Solution:
Step 1: Define the hypothesis for testing the independence as
H0: ρim = 0
H1: ρim ≠ 0
Step 2: Here, the value of i = 2 (starting with the second number) and lag m = 7 (every seventh
number).
Step 3: Given that N = 60. Using i, m and N, estimate M which is the largest integer such that
i + (M+1)m ≤ N
i.e. 2 + (M + 1)7 ≤ 60
i.e. (M + 1)7 ≤ 58
i.e. (M + 1) ≤ 8.28
i.e. M ≤ 7.28
M = max {7, 6 , 5, ….}
Hence, M = 7
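Step 4: The estimator of the autocorrelation is
$\hat{\rho}_{im} = \frac{1}{M+1}\left[\sum_{k=0}^{M} R_{i+km}\,R_{i+(k+1)m}\right] - 0.25$
$\hat{\rho}_{2,7} = \frac{1}{7+1}\left[\sum_{k=0}^{7} R_{2+7k}\,R_{2+7(k+1)}\right] - 0.25$
= (1/8)[R2R9 + R9R16 + R16R23 + R23R30 + R30R37 + R37R44 + R44R51 + R51R58] – 0.25
= (1/8)[(0.48)(0.61) + (0.61)(0.37) + (0.37)(0.37) + (0.37)(0.99) + (0.99)(0.09)
+ (0.09)(0.55) + (0.55)(0.60) + (0.60)(0.19)] – 0.25
= 0.20 – 0.25
= -0.05
Step 5: The standard deviation of the estimate
$\sigma_{\hat{\rho}_{2,7}} = \frac{\sqrt{13M + 7}}{12(M + 1)} = \frac{\sqrt{13(7) + 7}}{12(7 + 1)} = 0.10$
Step 6: The test statistic
$Z_0 = \frac{\hat{\rho}_{2,7}}{\sigma_{\hat{\rho}_{2,7}}} = \frac{-0.05}{0.10} = -0.50$
Step 7: Determine the critical values $-z_{\alpha/2}$ and $z_{\alpha/2}$ for the specified significance level α from
Table A.2.
Since α = 0.05, Zα/2 = Z0.025 = 1.96
Step 8: Since -1.96 ≤ Z0 = -0.50 ≤ 1.96, H0 is not rejected at α = 0.05.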
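The arithmetic of Example 6.17 can be reproduced with a few lines of Python. This is a sketch only; the function name `autocorrelation_test` and its return layout are hypothetical, and the critical value 1.96 is passed in rather than looked up.

```python
import math

def autocorrelation_test(r, i, m, z_crit=1.96):
    """Autocorrelation test for the subsequence R_i, R_(i+m), R_(i+2m), ...

    r is the full sequence, i is the 1-based starting position, m is the lag.
    """
    n = len(r)
    big_m = (n - i) // m - 1                      # largest M with i + (M+1)m <= N
    products = [r[i - 1 + k * m] * r[i - 1 + (k + 1) * m] for k in range(big_m + 1)]
    rho_hat = sum(products) / (big_m + 1) - 0.25
    sigma = math.sqrt(13 * big_m + 7) / (12 * (big_m + 1))
    z0 = rho_hat / sigma
    return big_m, rho_hat, sigma, z0, abs(z0) <= z_crit

numbers = [
    0.30, 0.48, 0.36, 0.01, 0.54, 0.34, 0.96, 0.06, 0.61, 0.85,
    0.48, 0.86, 0.14, 0.86, 0.89, 0.37, 0.49, 0.60, 0.04, 0.83,
    0.42, 0.83, 0.37, 0.21, 0.90, 0.89, 0.91, 0.79, 0.57, 0.99,
    0.95, 0.27, 0.41, 0.81, 0.96, 0.31, 0.09, 0.06, 0.23, 0.77,
    0.73, 0.47, 0.13, 0.55, 0.11, 0.75, 0.36, 0.25, 0.23, 0.72,
    0.60, 0.84, 0.70, 0.30, 0.26, 0.38, 0.05, 0.19, 0.73, 0.44,
]

big_m, rho_hat, sigma, z0, not_rejected = autocorrelation_test(numbers, i=2, m=7)
print(big_m, round(rho_hat, 4), round(sigma, 4), round(z0, 2), not_rejected)
```

Without intermediate rounding the statistic is about -0.48 rather than -0.50; the conclusion is the same.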
The gap test is used to count the number of digits between successive occurrences of
the same digit.
Here, we are interested in the frequency of the gaps. A gap of length x occurs between
the recurrences of some specified digit.
The theoretical probability of a gap of length x or less is
$P(\text{gap} \le x) = F(x) = 0.1\sum_{n=0}^{x}(0.9)^n = 1 - 0.9^{x+1}$
The following algorithm is used to test independence on the basis of length of gaps
associated with every digit.
Algorithm:
2. Determine the number of gaps and length of each gap associated with each digit (0,
1,…,9).
3. Select the interval width based on the number of gaps and generate the frequency
distribution table for the sample of gaps and apply Kolmogorov-Smirnov test.
Gap Length    Frequency    Relative Frequency    Cumulative Relative Frequency, SN(x)    CDF of Theoretical Distribution, F(x)    |F(x) – SN(x)|
4. Compute the test statistic D, which is the maximum deviation between F(x) and
SN(x).
$D = \max |F(x) - S_N(x)|$
5. Determine the critical value, Dα for the specified significance level α and the sample
size N from Table A.5.
6. If D ≤ Dα, H0 is accepted.
Solution:
Step 1: Define the hypothesis for testing the independence as:
Step 5: Determine the critical value, Dα for the specified significance level α and the sample
size N from Table A.5.
Since α = 0.05 and N = 110,
$D_{0.05} = \frac{1.36}{\sqrt{N}} = \frac{1.36}{\sqrt{110}} = 0.13$
The poker test for independence is based on the frequency with which certain digits
are repeated in a series of numbers.
Here, we show only the three digit version for testing the independence property.
In this case, the generated random numbers are rounded to three digits.
In three-digit numbers there are only three possibilities, as follows:
1. The individual digits can all be different.
2. The individual digits can all be the same.
3. There can be one pair of like digits.
The probability associated with each of these possibilities is given by the following:
P(three different digits) = P(second different from the first) x
P(third different from the first and second)
= (0.9)(0.8) = 0.72
P(three like digits) = P(second digit same as the first) x
P(third digit same as the first)
= (0.1)(0.1) = 0.01
P(exactly one pair) = 1 – [0.72 + 0.01] = 0.27
The observed value is then compared with the expected value using the chi-square
test.
Algorithm:
2. Generate the frequency distribution table for the above three combinations and apply
chi-square test.
Combination, i           Observed Frequency, Oi    Expected Frequency, Ei    (Oi – Ei)²/Ei
3 different digits, 1
3 like digits, 2
Exactly one pair, 3
3. Compute the sample test statistic
$\chi_0^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
where Oi is the observed number in the ith class, Ei is the expected number in the ith
class, which is N × Pi for the poker test, and n is the number of classes.
4. Determine the critical value for the specified significance level α with (n-1) degrees of
freedom from Table A.4.
Example 6.19
A sequence of 1000 three digit numbers has been generated and an analysis indicates that 290
have three different digits, 570 contain exactly one pair of like digits, and 140 contain exactly
three like digits. Based on the Poker test, check whether these numbers are independent. Use
α = 0.05.
Solution:
Step 1: Define the hypothesis for testing the independence as:
Step 2: Generate the frequency distribution table for the above three combinations and apply
chi-square test.
Combination, i           Observed Frequency, Oi    Expected Frequency, Ei = P x N    (Oi – Ei)²/Ei
3 different digits, 1    290                       0.72 x 1000 = 720                 256.80
3 like digits, 2         140                       0.01 x 1000 = 10                  1690.00
Exactly one pair, 3      570                       0.27 x 1000 = 270                 333.33
Step 4: Determine the critical value for the specified significance level α with (n-1) degrees
of freedom.
Since α = 0.05 and n – 1 = 3 – 1 = 2
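The table of Example 6.19 can be totalled and compared with the critical value in code. A minimal sketch follows: the function name `poker_chi_square` is hypothetical, and the critical value 5.99 is the standard chi-square table value for α = 0.05 with 2 degrees of freedom, entered by hand as an assumption rather than read from Table A.4.

```python
def poker_chi_square(observed, total):
    """Chi-square statistic for the three-digit poker test.

    observed maps each class to its observed count; the class probabilities
    are the ones derived in the text (0.72, 0.01, 0.27).
    """
    probabilities = {"different": 0.72, "like": 0.01, "pair": 0.27}
    statistic = 0.0
    for name, p in probabilities.items():
        expected = p * total
        statistic += (observed[name] - expected) ** 2 / expected
    return statistic

counts = {"different": 290, "like": 140, "pair": 570}
chi_square = poker_chi_square(counts, total=1000)

critical = 5.99  # assumed table value for alpha = 0.05, 2 degrees of freedom
reject_independence = chi_square > critical
print(round(chi_square, 2), reject_independence)
```

With these counts the statistic is far above the critical value, so the independence hypothesis would be rejected.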
Solution:
Given N = 20
Let us assume to be three-digit numbers.
Hence, the numbers are
{594, 928, 515, 055, 507, 351, 262, 797, 788, 442, 097, 798, 227, 127, 474, 825, 007, 182,
929, 852}
Step 1: Define the hypothesis for testing the independence as:
Step 2: Generate the frequency distribution table for the above three combinations and apply
chi-square test.
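Combination, i           Oi    Ei = P x N          Combined Oi    Combined Ei    (Oi – Ei)²/Ei
3 different digits, 1    10    0.72 x 20 = 14.4    10             14.4           1.34
3 like digits, 2         0     0.01 x 20 = 0.2     10             5.6            3.45
Exactly one pair, 3      10    0.27 x 20 = 5.4     (combined with class 2)
The classes "3 like digits" and "exactly one pair" are combined because the expected frequency of
"3 like digits" is less than 5, which leaves n = 2 classes.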
Step 4: Determine the critical value for the specified significance level α with (n-1) degrees
of freedom from Table A.4.
Since α = 0.05 and n – 1 = 2 – 1 = 1
■■■
CHAPTER 7
RANDOM VARIATE GENERATION
7.1 Introduction
In this chapter it is assumed that a distribution has been completely specified, and
ways are sought to generate sample from this distribution to be used as input to a
simulation model.
Here, we explain and illustrate some widely used techniques for generating random
variates.
All the techniques in this chapter assume that a source of uniform (0, 1) random
numbers R1, R2, …is readily available, where each Ri has pdf
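$f_R(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
and cdf
$F_R(x) = \begin{cases} 0, & x < 0 \\ x, & 0 \le x \le 1 \\ 1, & x > 1 \end{cases}$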
Algorithm:
Example 7.1
The time to attend a breakdown call is found to follow an exponential distribution with rate
λ = 0.5 (i.e. a mean of 1/λ = 2). Generate two exponential random variates representing the time
to attend. Use R1 = 0.54 and R2 = 0.72.
Solution:
The exponential random variate is
$X = -\frac{1}{\lambda}\ln(1 - R)$
Given λ = 0.5
$X = -\frac{1}{0.5}\ln(1 - R) = -2\ln(1 - R)$
Now, X1 = - 2 ln (1 – R1)
= - 2 ln (1 – 0.54)
= 1.55
X2 = - 2 ln (1 – R2)
= - 2 ln (1 – 0.72)
= 2.55
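The inverse transform for the exponential distribution is a one-line function. The sketch below assumes the rate parameterisation used in the solution (λ = 0.5); the name `exponential_variate` is hypothetical.

```python
import math

def exponential_variate(r, rate):
    """Inverse-transform variate X = -(1/rate) * ln(1 - R) for R in [0, 1)."""
    return -math.log(1.0 - r) / rate

x1 = exponential_variate(0.54, rate=0.5)
x2 = exponential_variate(0.72, rate=0.5)
print(round(x1, 2), round(x2, 2))
```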
∫ ∫
0 1 = 0 1 =
Set F(X) = R Set F(X) = R
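$F(x) = \begin{cases} 0, & x < a \\ \frac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$
$\frac{X - a}{b - a} = R$
$X = a + (b - a)R$
4. Generate uniform random numbers R1, R2, … from [0,1) and calculate desired
random variate by
$X_i = a + (b - a)R_i$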
Example 7.3
The time required to travel from station to college is uniformly distributed over the interval
20 to 25 minutes. Generate two random travel times from this distribution. Use R1= 0.2753
and R2 = 0.6418.
Solution:
The uniform random variate is
$X = a + (b - a)R$
Given a = 20 and b = 25
$X = 20 + (25 - 20)R = 20 + 5R$
Now, X1 = 20 + 5 R1
= 20 + 5(0.2753)
= 21.3765
X2 = 20 + 5 R2
= 20 + 5(0.6418)
= 23.209
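$F(X) = 1 - e^{-(X/\alpha)^{\beta}}, \quad X \ge 0$
2. Set F(X) = R.
$1 - e^{-(X/\alpha)^{\beta}} = R$
$e^{-(X/\alpha)^{\beta}} = 1 - R$
$(X/\alpha)^{\beta} = -\ln(1 - R)$
Taking β root on both sides,
$X/\alpha = [-\ln(1 - R)]^{1/\beta}$
$X = \alpha[-\ln(1 - R)]^{1/\beta}$
4. Generate uniform random numbers R1, R2,… from [0,1) and calculate desired
random variate by
$X_i = \alpha[-\ln(1 - R_i)]^{1/\beta}$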
Solution:
$X = \alpha[-\ln(1 - R)]^{1/\beta}$
Now, $X_1 = \alpha[-\ln(1 - R_1)]^{1/\beta}$
= (8)[-ln(1 – 0.5249)]^1.33
= 5.4
$X_2 = \alpha[-\ln(1 - R_2)]^{1/\beta}$
= (8)[-ln(1 – 0.612)]^1.33
= 7.4385
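The two Weibull variates can be checked numerically. This sketch assumes α = 8 and an exponent 1/β = 1.33, both read off the worked solution since the problem statement is not reproduced here; `weibull_variate` is a hypothetical name.

```python
import math

def weibull_variate(r, alpha, beta):
    """Inverse-transform variate X = alpha * [-ln(1 - R)]**(1/beta)."""
    return alpha * (-math.log(1.0 - r)) ** (1.0 / beta)

# Assumption: 1/beta = 1.33, as in the worked solution.
beta = 1 / 1.33
x1 = weibull_variate(0.5249, alpha=8, beta=beta)
x2 = weibull_variate(0.612, alpha=8, beta=beta)
print(round(x1, 2), round(x2, 4))
```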
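$f(x) = \begin{cases} x, & 0 \le x \le 1 \\ 2 - x, & 1 < x \le 2 \\ 0, & \text{otherwise} \end{cases}$
2. This distribution is called a triangular distribution with endpoints (0, 2) and mode at 1.
3. Its CDF is given by
$F(x) = \begin{cases} 0, & x \le 0 \\ \frac{x^2}{2}, & 0 < x \le 1 \\ 1 - \frac{(2 - x)^2}{2}, & 1 < x \le 2 \\ 1, & x > 2 \end{cases}$
4. Set F(X) = R.
For 0 ≤ X ≤ 1: $R = \frac{X^2}{2}$, so $X = \sqrt{2R}$
For 1 < X ≤ 2: $R = 1 - \frac{(2 - X)^2}{2}$, so $(2 - X)^2 = 2(1 - R)$ and $X = 2 - \sqrt{2(1 - R)}$
Hence
$X = \begin{cases} \sqrt{2R}, & 0 \le R \le 1/2 \\ 2 - \sqrt{2(1 - R)}, & 1/2 < R \le 1 \end{cases}$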
Example 7.5
Develop a generator for a triangular distribution with range (1, 10) and mode at x = 4. Also
generate two random variates for R1= 0.2584 and R2 = 0.6591.
Solution:
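Given a triangular distribution with minimum a = 1, mode b = 4 and maximum c = 10.
[Figure: triangular pdf f(x), rising from x = 1 to a peak of height h at x = 4 and falling to zero at x = 10]
The total area under the pdf is 1, so (1/2)(10 – 1)h = 1,
i.e. h = 2/9
For 1 ≤ x ≤ 4, by similar triangles $\frac{f(x)}{x - 1} = \frac{h}{4 - 1}$, so
$f(x) = \frac{2(x - 1)}{27}$
$F(x) = \frac{(x - 1)^2}{27}$
For 4 < x ≤ 10, by similar triangles $\frac{f(x)}{10 - x} = \frac{h}{10 - 4}$, so
$f(x) = \frac{10 - x}{27}$
$F(x) = 1 - \frac{(10 - x)^2}{54}$
Setting F(X) = R and solving for X:
$X = 1 + \sqrt{27R}$, 0 ≤ R ≤ 9/27
$X = 10 - \sqrt{54(1 - R)}$, 9/27 < R ≤ 1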
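The generator derived above can be written for a general triangular distribution and applied to R1 = 0.2584 and R2 = 0.6591. A minimal sketch; the function and parameter names (`triangular_variate`, `low`, `mode`, `high`) are hypothetical.

```python
import math

def triangular_variate(r, low, mode, high):
    """Inverse-transform variate for a triangular(low, mode, high) distribution."""
    split = (mode - low) / (high - low)          # F(mode); 9/27 in Example 7.5
    if r <= split:
        return low + math.sqrt(r * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - r) * (high - low) * (high - mode))

x1 = triangular_variate(0.2584, low=1, mode=4, high=10)   # uses 1 + sqrt(27 R)
x2 = triangular_variate(0.6591, low=1, mode=4, high=10)   # uses 10 - sqrt(54 (1 - R))
print(round(x1, 4), round(x2, 4))
```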
Algorithm
I. Original Data
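1. Sort the data points in increasing order.
2. Generate a frequency distribution table and assign a probability of 1/n to each interval,
where n is the number of observations.
i    Interval    Probability, 1/n    Cumulative Probability, i/n    Slope, ai
1
2
:
:
n
3. Compute the slope 'ai' of the ith line segment.
$a_i = \frac{x_{(i)} - x_{(i-1)}}{i/n - (i-1)/n} = \frac{x_{(i)} - x_{(i-1)}}{1/n}$
4. The empirical CDF is piecewise linear:
$\hat{F}(x) = \frac{i-1}{n} + \frac{x - x_{(i-1)}}{a_i}$ for $x_{(i-1)} < x \le x_{(i)}$
5. Set F(X) = R on the range of X.
$\frac{i-1}{n} + \frac{X - x_{(i-1)}}{a_i} = R$
6. Compute the inverse of empirical CDF by solving the equation F(X) = R.
$X = x_{(i-1)} + a_i\left(R - \frac{i-1}{n}\right)$
7. Generate uniform random numbers R1, R2,… and compute desired random
variate by
$X = x_{(i-1)} + a_i\left(R - \frac{i-1}{n}\right)$, if $\frac{i-1}{n} < R \le \frac{i}{n}$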
Example 7.6
Five data points of computer service center response time to incoming phone call have been
collected and are used in simulation to investigate and improve the service quality for
customer. The response times are 2.76, 1.83, 0.80, 1.45, and 1.24. Set up a table for
generating response time by the table look-up method and generate two values of response
times using uniform random numbers R1 = 0.71 and R2 = 0.83.
Solution:
1. Sort the data points in increasing order and let them be X1 ≤ X2 ≤ … ≤ X5.
0.80 1.24 1.45 1.83 2.76
Since the smallest possible value is believed to be 0, x0 = 0.
2. Generate a frequency distribution table and assign a probability to each interval.
Given that n = 5 observations.
Hence, 1/5 = 0.2 is the probability of each interval.
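i    Interval             Probability, 1/n    Cumulative Probability, i/n    Slope, ai
1    0 < X ≤ 0.80         0.2                 0.2                            4.00
2    0.80 < X ≤ 1.24      0.2                 0.4                            2.20
3    1.24 < X ≤ 1.45      0.2                 0.6                            1.05
4    1.45 < X ≤ 1.83      0.2                 0.8                            1.90
5    1.83 < X ≤ 2.76      0.2                 1.0                            4.65
3. The slopes are $a_i = \frac{x_{(i)} - x_{(i-1)}}{1/n}$:
a1 = (0.80 – 0)/0.2 = 4.00
a2 = (1.24 – 0.80)/0.2 = 2.20
a3 = (1.45 – 1.24)/0.2 = 1.05
a4 = (1.83 – 1.45)/0.2 = 1.90
a5 = (2.76 – 1.83)/0.2 = 4.65
4. The empirical random variate of original data is
$X = x_{(i-1)} + a_i\left(R - \frac{i-1}{n}\right)$, if $\frac{i-1}{n} < R \le \frac{i}{n}$
5. Given that, R1 = 0.71 and R2 = 0.83.
R1 = 0.71 is lying between 3/5 and 4/5.
Therefore, i = 4.
X1 = 1.45 + 1.90(0.71 – 0.60) = 1.66
R2 = 0.83 is lying between 4/5 and 5/5. Therefore, i = 5.
X2 = 1.83 + 4.65(0.83 – 0.80) = 1.97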
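The table look-up for Example 7.6 can be sketched in code as follows. The function name `empirical_variate` is hypothetical; the lower bound x0 = 0 is the assumption stated in the solution.

```python
import math

def empirical_variate(sorted_data, r, x0=0.0):
    """Table look-up variate from the piecewise-linear empirical CDF.

    sorted_data holds the n observations in increasing order; each interval
    (x_(i-1), x_(i)] carries probability 1/n.
    """
    n = len(sorted_data)
    points = [x0] + list(sorted_data)
    i = min(max(math.ceil(r * n), 1), n)          # interval with (i-1)/n < R <= i/n
    slope = (points[i] - points[i - 1]) * n       # a_i = (x_(i) - x_(i-1)) / (1/n)
    return points[i - 1] + slope * (r - (i - 1) / n)

response_times = sorted([2.76, 1.83, 0.80, 1.45, 1.24])
x1 = empirical_variate(response_times, 0.71)
x2 = empirical_variate(response_times, 0.83)
print(round(x1, 3), round(x2, 4))
```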
Example 7.7
Data have been collected on service times at a drive-in bank window at the Canara Bank.
These data are summarized into intervals as follows:
Interval (seconds)    Frequency
15 – 30               10
30 – 45               20
45 – 60               25
60 – 90               35
90 – 120              30
120 – 180             20
180 – 300             10
Set up a table for generating service time by the table look-up method and generate two
values of service times using uniform random numbers 0.3561 and 0.5459.
Solution:
1. Summarize the data into frequency distribution table.
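i    Interval (seconds)    Frequency    Relative Frequency    Cumulative Frequency, ci    Slope, ai
1    15 < X ≤ 30           10           0.067                 0.067                       223.88
2    30 < X ≤ 45           20           0.133                 0.200                       112.78
3    45 < X ≤ 60           25           0.167                 0.367                       89.82
4    60 < X ≤ 90           35           0.233                 0.600                       128.76
5    90 < X ≤ 120          30           0.200                 0.800                       150.00
6    120 < X ≤ 180         20           0.133                 0.933                       451.13
7    180 < X ≤ 300         10           0.067                 1.000                       1791.04
∑ = 150
2. The random variate is $X = x_{i-1} + a_i(R - c_{i-1})$, if $c_{i-1} < R \le c_i$.
R1 = 0.3561 lies between c2 = 0.200 and c3 = 0.367, so i = 3:
X1 = 45 + 89.82(0.3561 – 0.200) = 59.02
R2 = 0.5459 lies between c3 = 0.367 and c4 = 0.600, so i = 4:
X2 = 60 + 128.76(0.5459 – 0.367) = 83.04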
Algorithm
1. Generate uniform random numbers R1, R2,…
i.e. $R \sim U(0, 1)$
2. Return the desired random variate X = i if it satisfies
$F(i-1) = \sum_{j=0}^{i-1} p(j) < R \le \sum_{j=0}^{i} p(j) = F(i)$
3. Note that the algorithm will never return a value X = i for p(i) = 0, because the strict
inequality between the two summations in step (2) is impossible.
4. Step (2) requires a search, which is time consuming.
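Consider the discrete uniform distribution on {1, 2,…,k} with pmf and CDF given by
$p(x) = \frac{1}{k}, \quad x = 1, 2, \ldots, k$ and
$F(x) = \begin{cases} 0, & x < 1 \\ \frac{1}{k}, & 1 \le x < 2 \\ \frac{2}{k}, & 2 \le x < 3 \\ \vdots \\ \frac{k-1}{k}, & k-1 \le x < k \\ 1, & k \le x \end{cases}$
Let xi = i and ri = p(1) + p(2) + ….+ p(xi)
= F(xi) = i/k, i = 1, 2, …, k
Then from inequality $F(x_{i-1}) = r_{i-1} < R \le r_i = F(x_i)$, it can be seen that, if the
generated random number satisfies
$\frac{i-1}{k} < R \le \frac{i}{k}$
then X = i, i.e. $X = \lceil Rk \rceil$.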
Example 7.8
Consider the discrete uniform distribution on {1, 2, 3, …, 10} with CDF
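$F(x) = \begin{cases} 0, & x < 1 \\ \frac{1}{10}, & 1 \le x < 2 \\ \frac{2}{10}, & 2 \le x < 3 \\ \vdots \\ \frac{9}{10}, & 9 \le x < 10 \\ 1, & 10 \le x \end{cases}$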
Generate two random values of X using two random numbers 0.78 and 0.23.
Solution:
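$X = \lceil Rk \rceil$
Here k=10
$X_1 = \lceil R_1 k \rceil = \lceil (0.78)(10) \rceil = \lceil 7.8 \rceil = 8$
$X_2 = \lceil R_2 k \rceil = \lceil (0.23)(10) \rceil = \lceil 2.3 \rceil = 3$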
Example 7.9
The CDF of a discrete random variable X is given by
$F(x) = \frac{x(x+1)(2x+1)}{n(n+1)(2n+1)}, \quad x = 1, 2, \ldots, n$
When n = 4, generate three values of X, using R1 = 0.83, R2 = 0.24, and R3 = 0.57.
Solution:
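For a given random number R, X will take the value x in Rx = {1, 2, 3, 4} provided that
$F(x - 1) < R \le F(x)$
But, $F(x) = \frac{x(x+1)(2x+1)}{n(n+1)(2n+1)}$
Put n = 4, we have
$F(x) = \frac{x(x+1)(2x+1)}{(4)(5)(9)} = \frac{x(x+1)(2x+1)}{180}$
We know that,
$F(x - 1) < R \le F(x)$
$\frac{(x-1)(x)(2x-1)}{180} < R \le \frac{x(x+1)(2x+1)}{180}$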
As x = 1, 2, 3, 4
F(1) = 0.033 F(2) = 0.167
F(3) = 0.467 F(4) = 1
Now, X can be generated by table look-up method below:
i Xi F(Xi)
1 1 0.033
2 2 0.167
3 3 0.467
4 4 1.000
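The discrete random variate is
X = i if $F(x_{i-1}) < R \le F(x_i)$
For R1 = 0.83, 0.467 < 0.83 ≤ 1.000, so X1 = 4.
For R2 = 0.24, 0.167 < 0.24 ≤ 0.467, so X2 = 3.
For R3 = 0.57, 0.467 < 0.57 ≤ 1.000, so X3 = 4.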
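The search in the table look-up step can be done with a binary search over the cumulative probabilities. The sketch below rebuilds the table of Example 7.9 for n = 4; the names `cdf_value` and `discrete_variate` are hypothetical.

```python
import bisect

def cdf_value(x, n):
    """F(x) = x(x+1)(2x+1) / [n(n+1)(2n+1)] for x = 1, ..., n."""
    return x * (x + 1) * (2 * x + 1) / (n * (n + 1) * (2 * n + 1))

def discrete_variate(r, cdf_table):
    """Return X = i such that F(x_(i-1)) < R <= F(x_i); values are 1, 2, ..."""
    return bisect.bisect_left(cdf_table, r) + 1

n = 4
table = [cdf_value(x, n) for x in range(1, n + 1)]     # approx. 0.033, 0.167, 0.467, 1.0
variates = [discrete_variate(r, table) for r in (0.83, 0.24, 0.57)]
print([round(f, 3) for f in table], variates)
```

`bisect_left` returns the first position whose cumulative probability is at least R, which matches the inequality F(x_(i-1)) < R ≤ F(x_i).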
Solution:
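For integer values of x in the range {1, 2, …., k}, the CDF is given by
$F(x) = \sum_{i=1}^{x} p(i)$
$F(x) = \sum_{i=1}^{x} \frac{2i}{k(k+1)} = \frac{2}{k(k+1)} \sum_{i=1}^{x} i$
$F(x) = \frac{2}{k(k+1)} \cdot \frac{x(x+1)}{2} = \frac{x(x+1)}{k(k+1)}$
We know that,
$F(x - 1) < R \le F(x)$
$\frac{(x-1)x}{k(k+1)} < R \le \frac{x(x+1)}{k(k+1)}$
$(x-1)x < k(k+1)R \le x(x+1)$
To solve this inequality for x in terms of R, first find a value of x that satisfies
$(x-1)x = k(k+1)R$
$x^2 - x - k(k+1)R = 0$
By quadratic formula, namely $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ with a = 1, b = -1 and c = -k(k+1)R,
$x = \frac{1 + \sqrt{1 + 4k(k+1)R}}{2}$
Rounding up, the desired random variate is
$X = \lceil x - 1 \rceil = \left\lceil \frac{\sqrt{1 + 4k(k+1)R} - 1}{2} \right\rceil$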
$X = \left\lceil \frac{\ln(1 - R)}{\ln(1 - p)} - 1 \right\rceil$
Since p is a fixed parameter, let
$\beta = \frac{-1}{\ln(1 - p)}$
$X = \lceil -\beta \ln(1 - R) - 1 \rceil$
Here, - β ln(1 – R) is an exponentially distributed random variable with mean β.
Occasionally, a geometric variate X is needed which can assume values {q, q+1,…}
with pmf $p(x) = p(1 - p)^{x - q}$
In that case, the desired random variate is
$X = q + \left\lceil \frac{\ln(1 - R)}{\ln(1 - p)} - 1 \right\rceil$
Example 7.11
In an international conference held by Delhi University, the number of research papers a
reviewer reviews per day is geometrically distributed on the range {X ≥ 1} with a mean of 3
papers per day. Generate two values of X using random numbers 0.932 and 0.105.
Solution:
Given that the geometric distribution is defined on the range {X ≥ 1}.
$p(x) = p(1 - p)^{x - 1}$ for x = 1,2,… with mean 1/p.
The geometric random variate is
$X = 1 + \left\lceil \frac{\ln(1 - R)}{\ln(1 - p)} - 1 \right\rceil$
Given that mean = 1/p = 3, we have p = 1/3.
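The two requested values follow directly from the formula once p = 1/3 is substituted. A minimal sketch, with `geometric_variate` as a hypothetical name and `q` the smallest value the variate can take:

```python
import math

def geometric_variate(r, p, q=0):
    """Geometric variate on {q, q+1, ...}: X = q + ceil(ln(1-R)/ln(1-p) - 1)."""
    return q + math.ceil(math.log(1.0 - r) / math.log(1.0 - p) - 1.0)

p = 1 / 3                       # from mean = 1/p = 3
x1 = geometric_variate(0.932, p, q=1)
x2 = geometric_variate(0.105, p, q=1)
print(x1, x2)
```

For R = 0.932 the ratio of logarithms is about 6.63, which gives X = 7; for R = 0.105 it is about 0.27, which gives X = 1.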
Consider two standard normal random variables Z1 and Z2, plotted as a point in the
plane and represented in polar coordinates as
$Z_1 = B\cos\theta$
$Z_2 = B\sin\theta$ ……………. (1)
Therefore, $B^2 = Z_1^2 + Z_2^2$ has the chi-square distribution with 2 degrees of freedom, which is
equivalent to an exponential distribution with mean 2.
Now, B can be generated using inverse transform technique for exponential
distribution.
i.e. $B = \sqrt{-2\ln R}$ ……………. (2)
The angle θ is uniformly distributed on (0,2π) and is independent of B.
Combining equations (1) and (2) gives a direct method for generating two
independent standard normal variates Z1 and Z2 from two independent random
numbers R1 and R2.
$Z_1 = \sqrt{-2\ln R_1}\,\cos(2\pi R_2)$
$Z_2 = \sqrt{-2\ln R_1}\,\sin(2\pi R_2)$
Obtain normal variate Xi with mean µ and variance σ², using the following
transformation
$X_i = \mu + \sigma Z_i$
Finally, obtain the lognormal variate by using the following direct transformation
$Y_i = e^{X_i}$
Solution:
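1. Two standard normal random variates are generated as follows:
$Z_1 = \sqrt{-2\ln R_1}\,\cos(2\pi R_2)$
$Z_2 = \sqrt{-2\ln R_1}\,\sin(2\pi R_2)$
Given R1 = 0.1758 and R2 = 0.1489.
$Z_1 = \sqrt{-2\ln(0.1758)}\,\cos(2\pi \times 0.1489) = 1.11$
$Z_2 = \sqrt{-2\ln(0.1758)}\,\sin(2\pi \times 0.1489) = 1.50$
2. The normal variate is $X_i = \mu + \sigma Z_i$
where µ = 10 and σ² = 4 i.e. σ = 2
$X_1 = 10 + 2(1.11) = 12.22$
$X_2 = 10 + 2(1.50) = 13.00$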
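The direct transformation can be checked numerically as follows. This is a sketch; `box_muller` is a hypothetical name for the pair of formulas above, and the angle is taken in radians.

```python
import math

def box_muller(r1, r2):
    """Return two independent standard normal variates from two uniforms."""
    b = math.sqrt(-2.0 * math.log(r1))        # radius B
    theta = 2.0 * math.pi * r2                # angle, in radians
    return b * math.cos(theta), b * math.sin(theta)

z1, z2 = box_muller(0.1758, 0.1489)

mu, sigma = 10.0, 2.0                         # sigma**2 = 4
x1, x2 = mu + sigma * z1, mu + sigma * z2     # normal variates
y1, y2 = math.exp(x1), math.exp(x2)           # lognormal variates, if required
print(round(z1, 2), round(z2, 2), round(x1, 2), round(x2, 2))
```

Carrying full precision gives X1 ≈ 12.21 rather than 12.22, because the worked solution rounds Z1 to 1.11 before scaling.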
The convolution method thus refers to adding together two or more random variables
to obtain a new random variable with the desired distribution.
This technique can be applied to obtain
1. Erlang Variates
2. Binomial Variates
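We know that an Erlang random variable X with parameters k and θ is the sum of k
independent exponential random variables, Xi (i = 1, 2,…, k) each having mean 1/kθ.
i.e. $X = \sum_{i=1}^{k} X_i$
As each Xi is exponentially distributed with mean 1/λ = 1/kθ, it can be generated using
inverse transform technique
$X_i = -\frac{1}{k\theta}\ln R_i$ where Ri is uniform random number.
$X = \sum_{i=1}^{k} -\frac{1}{k\theta}\ln R_i$
But
$\sum_{i=1}^{k} \ln R_i = \ln\left(\prod_{i=1}^{k} R_i\right)$, so $X = -\frac{1}{k\theta}\ln\left(\prod_{i=1}^{k} R_i\right)$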
Example 7.13
In McDonald's restaurant, the consumption of bread is approximated by Erlang distribution
with parameters k = 2 and θ = 5. Generate the value of consumption from this distribution.
Use R1 = 0.937 and R2 = 0.217.
Solution:
The Erlang random variate is
$X = -\frac{1}{k\theta}\ln\left(\prod_{i=1}^{k} R_i\right)$
$X = -\frac{1}{(2)(5)}\ln[(0.937)(0.217)] = -(0.1)\ln(0.2033) = 0.159$
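The convolution formula translates directly into code. A minimal sketch, assuming the list of uniforms has exactly k entries; `erlang_variate` is a hypothetical name.

```python
import math

def erlang_variate(uniforms, theta):
    """Erlang(k, theta) variate as the sum of k exponentials with mean 1/(k*theta)."""
    k = len(uniforms)
    return -math.log(math.prod(uniforms)) / (k * theta)

x = erlang_variate([0.937, 0.217], theta=5)
print(round(x, 4))
```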
We know that the binomial random variate X can be represented as the number of
successes in 'n' independent Bernoulli trials, each success having probability p.
Thus, $X = \sum_{i=1}^{n} X_i$
where P(Xi = 1) = p
and P(Xi = 0) = 1 – p
Example 7.14
The number appearing on the up-face of the tossed die is modeled by binomial distribution.
The value of x is 1, 2, 3, 4, 5, or 6 with probability 0.10, 0.20, 0.30, 0.25, 0.10, 0.05. Set up a
table for generating appearances of number using table look-up method. Use R1= 0.40 and R2
= 0.60
Solution:
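[Figure: flowchart of the acceptance-rejection technique. Generate R and check whether it meets the acceptance condition; if not, reject R and try again; if so, accept R. If finished, stop; otherwise generate the next R.]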
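Algorithm
$\sum_{i=1}^{n} -\frac{1}{\lambda}\ln R_i \le 1 < \sum_{i=1}^{n+1} -\frac{1}{\lambda}\ln R_i$
Next multiply by - λ, which reverses the sign of the inequality, and use the fact that a
sum of logarithms is the logarithm of a product.
$\ln\prod_{i=1}^{n} R_i = \sum_{i=1}^{n} \ln R_i \ge -\lambda > \sum_{i=1}^{n+1} \ln R_i = \ln\prod_{i=1}^{n+1} R_i$
$\prod_{i=1}^{n} R_i \ge e^{-\lambda} > \prod_{i=1}^{n+1} R_i$
1. Set x = 0 and P = 1.
2. Generate a random number $R_{x+1} \sim U(0, 1)$ and replace P by P.(Rx+1).
3. If $P < e^{-\lambda}$, then accept X = x.
4. Else, reject x and increment x by 1 and return to step 2.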
Example 7.15
The number of customers arriving at Café Coffee Day is Poisson distributed with mean 4.
Generate Poisson variate. Use random numbers 0.5389, 0.0532, 0.3492 in sequence.
Solution:
Iteration 1:
1. Set x = 0 and P = 1.
2. Given R1 = 0.5389
Compute P.R1 = 1(0.5389) = 0.5389
Set P = P.R1 = 0.5389
3. Check $P < e^{-\lambda}$
i.e. $0.5389 < e^{-4}$
i.e. 0.5389 < 0.0183 is false.
Reject x and set x = x + 1 i.e. 0 + 1 = 1 and go to step 2.
Iteration 2:
1. Given R2 = 0.0532
Compute P.R2 = (0.5389)(0.0532) = 0.0286
Set P = P.R2 = 0.0286
2. Check $P < e^{-\lambda}$
i.e. $0.0286 < e^{-4}$
i.e. 0.0286 < 0.0183 is false.
Reject x and set x = x + 1 i.e. 1 + 1 = 2 and go to step 2.
Iteration 3:
1. Given R3 = 0.3492
Compute P.R3 = (0.0286)(0.3492) = 0.0099
Set P = P.R3 = 0.0099
2. Check $P < e^{-\lambda}$
i.e. $0.0099 < e^{-4}$
i.e. 0.0099 < 0.0183 is true.
Accept x.
Poisson random variate X = x = 2.
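The three iterations above follow a simple loop, sketched here in Python. The function name `poisson_variate` is hypothetical, and the uniforms are supplied as a fixed list so that the example is reproducible.

```python
import math

def poisson_variate(mean, uniforms):
    """Acceptance-rejection Poisson variate using the supplied uniform numbers."""
    threshold = math.exp(-mean)
    x, product = 0, 1.0
    for r in uniforms:
        product *= r                     # step 2: P = P * R_(x+1)
        if product < threshold:          # step 3: accept X = x
            return x
        x += 1                           # step 4: reject and try again
    raise ValueError("not enough random numbers to generate a variate")

x = poisson_variate(4, [0.5389, 0.0532, 0.3492])
print(x)
```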
■■■
UNIT: IV
ANALYSIS OF SIMULATION DATA
CHAPTER 8
INPUT MODELING
Input data provide the driving force for a simulation model.
In queuing system the distribution of time between arrivals and service times are the
input data.
The distributions of demand and lead time are the input data for inventory system.
There are four steps in development of a useful model of input data
1. Data Collection
2. Identifying the distribution with data
3. Parameter Estimation
4. Goodness of fit tests
are largely beyond the control of system and will not be altered by changes made
to improve the system.
8.1.2.1 Histogram
The sample mean and sample variance are used to estimate the parameters of a
hypothesized distribution.
The observations in a sample of size n are X1, X2, … Xn.
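1. If data are discrete or continuous raw data then sample mean and sample variance
are defined by
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$, $S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n - 1}$
2. If data are discrete and grouped in a frequency distribution then mean and
variance are given by
$\bar{X} = \frac{\sum_{j=1}^{k} f_j X_j}{n}$, $S^2 = \frac{\sum_{j=1}^{k} f_j X_j^2 - n\bar{X}^2}{n - 1}$
where k is the number of distinct values of X and fj is the observed frequency of the value Xj.
3. If data are discrete or continuous and have been placed in class intervals, then
mean and variance are
$\bar{X} = \frac{\sum_{j=1}^{c} f_j m_j}{n}$, $S^2 = \frac{\sum_{j=1}^{c} f_j m_j^2 - n\bar{X}^2}{n - 1}$
where fj is the observed frequency in the jth class interval, mj is the midpoint of the jth
interval and c is the number of class intervals.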
Numerical estimates of the distribution parameters are required to reduce the family
of distributions to a specific distribution and to test the resulting hypothesis.
The table 8.2 contains suggested estimators for distribution often used in simulation.
Example 8.2
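The percentage rates of return on 10 investments in a portfolio are 18.8, 27.9, 21.0, 6.1, 37.4,
5.0, 22.9, 1.0, 3.1 and 8.3. Estimate the parameters of a lognormal model of this data.
Solution:
Natural log of the given data is 2.9, 3.3, 3.0, 1.8, 3.6, 1.6, 3.1, 0, 1.1 and 2.1 respectively.
$\hat{\mu} = \bar{X} = \frac{\sum_{i=1}^{n} \ln X_i}{n} = \frac{22.5}{10} = 2.25$
$\hat{\sigma}^2 = S^2 = \frac{\sum_{i=1}^{n} (\ln X_i)^2 - n\bar{X}^2}{n - 1} = \frac{62.29 - 10(2.25)^2}{9} = 1.30$
$\hat{\sigma} = 1.14$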
It formalizes the spontaneous idea of comparing the histogram of data to the shape of
candidate density or mass function.
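The test procedure begins by arranging the n observations into a set of k class
intervals or cells.
The test statistic is given by
$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
where Oi is the observed frequency and Ei is the expected frequency in the ith class interval.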
Note
1. For discrete - Number of class intervals is determined by number of cells resulting
after combining adjacent cells as necessary.
2. For continuous – Number of class intervals must be specified.
The table 8.3 helps in determining the number of class intervals for continuous data.
Example 8.3
The number of vehicles arriving at the toll booth in a 5-minute period between 7:00 A.M.
and 7:05 A.M. was monitored for 100 days. The table below shows the resulting data.
Arrivals per Period Frequency
0 12
1 10
2 19
3 17
4 10
5 8
6 7
7 5
8 5
9 3
10 3
11 1
Use the Chi-square test to test the hypothesis that these arrivals are Poisson distributed.
Use the level of significance α = 0.05 and $\chi^2_{0.05,5} = 11.1$.
Solution:
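$p(x) = \begin{cases} \frac{e^{-\alpha}\alpha^x}{x!}, & x = 0, 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$
where the mean α is estimated by the sample mean, $\hat{\alpha} = \bar{X} = 364/100 = 3.64$.
xi    Observed Frequency, Oi    Expected Frequency, Ei    (Oi – Ei)²/Ei
0     12  (combined: 22)        2.6  (combined: 12.2)     7.87
1     10                        9.6
2     19                        17.4                      0.15
3     17                        21.1                      0.80
4     10                        19.2                      4.41
5     8                         14.0                      2.57
6     7                         8.5                       0.26
7     5   (combined: 17)        4.4  (combined: 7.6)      11.62
8     5                         2.0
9     3                         0.8
10    3                         0.3
11    1                         0.1
∑ = 100                         ∑ = 100.0                 ∑ = 27.68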
The value of E1 is given by nP0 = 100(0.026) = 2.6. In the similar manner, the
remaining Ei values are determined.
Since E1 = 2.6 < 5, E1 and E2 are combined. In that case O1 and O2 are also combined
and the value of k is reduced by 1. The last five class intervals are also combined for
the same reason, and k is further reduced by four.
Hence value of k becomes k = 12 – 1 - 4 = 7
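The calculated $\chi_0^2 = 27.68$
The degrees of freedom for the tabulated value of $\chi^2$ is k – s – 1 = 7 – 1 – 1 = 5
Here, s = 1, since one parameter (the mean α) was estimated from the data.
Critical value $\chi^2_{0.05,5} = 11.1$
Since $\chi_0^2 = 27.68 > \chi^2_{0.05,5} = 11.1$, H0 would be rejected, i.e., the given data is not Poisson
distributed.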
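The calculation of Example 8.3 can be reproduced from the frequency table. The sketch below uses the same cell grouping as the solution (0 and 1 combined, 7 to 11 combined); the variable names are hypothetical, and because it does not round the expected frequencies to one decimal it gives about 27.6 instead of 27.68.

```python
import math

def poisson_pmf(x, mean):
    """Poisson probability mass function p(x) = exp(-mean) * mean**x / x!."""
    return math.exp(-mean) * mean ** x / math.factorial(x)

observed = [12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1]    # arrivals 0..11
n = sum(observed)
mean = sum(x * o for x, o in enumerate(observed)) / n   # estimated alpha

# Cells grouped as in the solution so that every expected frequency is >= 5.
groups = [[0, 1], [2], [3], [4], [5], [6], [7, 8, 9, 10, 11]]

chi_square = 0.0
for cells in groups:
    o = sum(observed[x] for x in cells)
    e = n * sum(poisson_pmf(x, mean) for x in cells)
    chi_square += (o - e) ** 2 / e

degrees_of_freedom = len(groups) - 1 - 1    # k - s - 1 with s = 1
reject = chi_square > 11.1                  # critical value given in the example
print(mean, round(chi_square, 2), degrees_of_freedom, reject)
```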
i. Apply the Chi-Square test to these data to test the hypothesis that the underlying
distribution is Poisson.
ii. Apply the Chi-Square test to these data to test the hypothesis that the underlying
distribution is Poisson with mean 1.0
Use level of significance α = 0.05 and ,
Solution:
i. Define the hypothesis as:
H0: The data fits to Poisson distribution.
H1: The data does not fit to Poisson distribution.
For Poisson distribution,
∑
Where ̅
Using Chi-Square test we test the hypothesis as:
Xi    Oi    Pi    Ei = n*Pi    (Oi – Ei)²/Ei
Where
Using Chi-Square test we test the hypothesis as:
Xi    Oi    Pi    Ei = n*Pi    (Oi – Ei)²/Ei
Notice that we have grouped cells i = 3, 4, 5 and 6 together into a single cell with Oi = 12 and
Ei = 8.03 since the expected frequency of those cells is less than 5.
We do not estimate the value of α, hence, s = 0.
Therefore, degrees of freedom = k-s-1 = 4 – 0 – 1 = 3
Given, $\chi^2_{0.05,3} = 7.81$
Since $\chi_0^2 < 7.81$, H0 is accepted.
Hence the given data fits to Poisson distribution.
If a continuous distributional assumption is being tested, class intervals that are equal
in probability should be used instead of equal in width of interval.
For equal probabilities
pi = 1/k
Ei = npi ≥ 5
Substituting for pi, we get
n/k ≥ 5
Solving for k,
k ≤ n/5
Changing the number of classes and interval width affects the value of calculated and
tabulated chi-square.
A hypothesis may be accepted when the data are grouped in one way but rejected if it
is done in another way.
The test requires the data to be placed in class intervals. In the continuous case, this
grouping is arbitrary.
Solution:
Define the hypothesis as:
H0: The data fits to Uniform distribution.
H1: The data does not fit to Uniform distribution.
Given N = 30 and D0.05,30 = 0.24.
Arrange all the random digits RD in ascending order.
i RD Ri i/N (i-1)/N (i/N)-Ri Ri-[(i-1)/N]
1 6.0 0.06 0.033333 0 - 0.06
2 7.0 0.07 0.066667 0.033333 - 0.036667
3 17.2 0.172 0.1 0.066667 - 0.105333
4 20.6 0.206 0.133333 0.1 - 0.106
5 21.6 0.216 0.166667 0.133333 - 0.082667
6 23.3 0.233 0.2 0.166667 - 0.066333
7 23.7 0.237 0.233333 0.2 - 0.037
8 27.3 0.273 0.266667 0.233333 - 0.039667
9 27.3 0.273 0.3 0.266667 0.027 0.006333
10 32.4 0.324 0.333333 0.3 0.009333 0.024
11 36.3 0.363 0.366667 0.333333 0.003667 0.029667
12 36.8 0.368 0.4 0.366667 0.032 0.001333
13 40.7 0.407 0.433333 0.4 0.026333 0.007
14 45.2 0.452 0.466667 0.433333 0.014667 0.018667
15 45.3 0.453 0.5 0.466667 0.047 -
16 62.6 0.626 0.533333 0.5 - 0.126
17 67.3 0.673 0.566667 0.533333 - 0.139667
18 69.8 0.698 0.6 0.566667 - 0.131333
19 73.1 0.731 0.633333 0.6 - 0.131
20 73.2 0.732 0.666667 0.633333 - 0.098667
21 76.6 0.766 0.7 0.666667 - 0.099333
22 87.2 0.872 0.733333 0.7 - 0.172
23 87.6 0.876 0.766667 0.733333 - 0.142667
24 87.8 0.878 0.8 0.766667 - 0.111333
25 88.6 0.886 0.833333 0.8 - 0.086
26 90.1 0.901 0.866667 0.833333 - 0.067667
27 91.7 0.917 0.9 0.866667 - 0.050333
28 97.4 0.974 0.933333 0.9 - 0.074
29 98.8 0.988 0.966667 0.933333 - 0.054667
30 99.7 0.997 1 0.966667 0.003 0.030333
D+ = max{(i/N)-Ri} = 0.047
D- = max{Ri-[(i-1)/N]} = 0.172
D = max(D+, D-) = 0.172
D0.05,30 = 0.24.
Since, D0.05,30 = 0.24 > D = 0.172, do not reject H0.
Hence, the data fits to Uniform distribution.
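The D+ and D- columns of the table can be recomputed in a few lines. A minimal sketch: `ks_uniform` is a hypothetical name, and the critical value 0.24 is the one quoted in the solution.

```python
def ks_uniform(values):
    """Kolmogorov-Smirnov statistics against the uniform(0, 1) distribution."""
    r = sorted(values)
    n = len(r)
    d_plus = max((i + 1) / n - x for i, x in enumerate(r))    # max{i/N - R_i}
    d_minus = max(x - i / n for i, x in enumerate(r))         # max{R_i - (i-1)/N}
    return d_plus, d_minus, max(d_plus, d_minus)

r_values = [
    0.060, 0.070, 0.172, 0.206, 0.216, 0.233, 0.237, 0.273, 0.273, 0.324,
    0.363, 0.368, 0.407, 0.452, 0.453, 0.626, 0.673, 0.698, 0.731, 0.732,
    0.766, 0.872, 0.876, 0.878, 0.886, 0.901, 0.917, 0.974, 0.988, 0.997,
]

d_plus, d_minus, d = ks_uniform(r_values)
do_not_reject = d <= 0.24            # D(0.05, 30) as given in the solution
print(round(d_plus, 3), round(d_minus, 3), round(d, 3), do_not_reject)
```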
There are several ways to obtain information when data are not available. A few are mentioned
below.
1. Engineering data
The values provided by manufacturers (for example, the mean time to failure of a
computer chip, the cutting speed of the tool) provide a starting point for input
modeling by fixing a central value. Company rules might specify time or production
standards.
2. Expert opinion
Talk to experts who have experience with the process or with similar processes.
They can provide optimistic, pessimistic and most likely estimates. They might also be
able to say whether the process is nearly constant or highly variable, and they might
be able to define the source of variability.
3. Physical or conventional limitations
Many real processes have physical limits on performance (Ex. Computer data entry is
not faster than what a person can type). Because of company policies, there could be
upper limits on how long a process may take. Do not ignore obvious limits or bounds
that narrow the range of input process.
4. The nature of process
The choice of distribution should be done after clear understanding of distributions.
When no data is available then uniform, triangular and beta distributions are used as input
models. A useful refinement is obtained, when minimum, maximum and one or more
breakpoints can be given. A breakpoint is an intermediate value and a probability of being
less than or equal to that value.
The variables may be related and if the variables appear in a simulation models as
inputs, the relationship should be determined.
When inputs exhibit dependence then multivariate input models are used. Example:
Two random variables lead time and annual demand in inventory system.
Time series models are useful for representing a sequence of dependent inputs.
Example: Successive time between orders in a system.
Let X1 and X2 be two random variables, and let μi = E(Xi) and σi2 = var(Xi) be the
mean and variance of Xi.
The covariance and correlation are the measures of linear dependence between X1
and X2.
In other words, it indicates how well the relationship between X1 and X2 is described
by the model
(X1 – μ1) = β(X2 – μ2) + ε
where ε is a random variable with mean 0 that is independent of X2.
If (X1 – μ1) = β(X2 – μ2), then the model is perfect.
If X1 and X2 are statistically independent, then β = 0, and the model is of no value.
A positive value of β indicates that X1 and X2 tend to be above or below their means together.
A negative value of β indicates that X1 and X2 tend to be on opposite sides of their
means.
The covariance between X1 and X2 is defined as
cov ( X1, X2 ) = E[(X1 – μ1 ) (X2 – μ2 )] = E (X1X2 ) - μ1μ2
The value cov (X1, X2) = 0 implies β = 0.
The value cov (X1, X2) < 0 implies β < 0.
The value cov (X1, X2) > 0 implies β > 0.
The covariance can take any value between - ∞ and ∞.
The correlation standardizes covariance to be between -1 and 1:
$\rho = corr(X_1, X_2) = \frac{cov(X_1, X_2)}{\sigma_1 \sigma_2}$
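If X1 and X2 are normally distributed then dependence between them can be modeled
by bivariate normal distribution with parameters μ1, μ2, σ1², σ2² and ρ = corr (X1, X2).
To estimate ρ, let (X11, X21), (X12, X22), … (X1n, X2n) be the n independent and
identically distributed pairs.
The sample covariance is
$\widehat{cov}(X_1, X_2) = \frac{1}{n-1}\sum_{j=1}^{n}(X_{1j} - \bar{X}_1)(X_{2j} - \bar{X}_2) = \frac{1}{n-1}\left(\sum_{j=1}^{n} X_{1j}X_{2j} - n\bar{X}_1\bar{X}_2\right)$
and the correlation is estimated by
$\hat{\rho} = \frac{\widehat{cov}(X_1, X_2)}{\hat{\sigma}_1 \hat{\sigma}_2}$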
Solution:
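Lead Time, X1    Demand, X2    X1·X2    X1²      X2²
6.5              103           669.5    42.25    10609
4.3              83            356.9    18.49    6889
6.9              116           800.4    47.61    13456
6.0              97            582.0    36.00    9409
6.9              112           772.8    47.61    12544
6.9              104           717.6    47.61    10816
5.8              106           614.8    33.64    11236
7.3              109           795.7    53.29    11881
4.5              92            414.0    20.25    8464
6.3              96            604.8    39.69    9216
$\bar{X}_1 = \frac{\sum X_{1j}}{n} = \frac{61.4}{10} = 6.14$, $\bar{X}_2 = \frac{\sum X_{2j}}{n} = \frac{1018}{10} = 101.80$
$\hat{\sigma}_1^2 = 1.04$, $\hat{\sigma}_2^2 = 98.62$
$\hat{\sigma}_1 = 1.02$, $\hat{\sigma}_2 = 9.93$
$\widehat{cov}(X_1, X_2) = \frac{1}{n-1}\left(\sum_{j=1}^{n} X_{1j}X_{2j} - n\bar{X}_1\bar{X}_2\right)$
$\widehat{cov}(X_1, X_2) = \frac{1}{10-1}[6328.5 - (10)(6.14)(101.80)]$
$\widehat{cov}(X_1, X_2) = 8.66$
$\hat{\rho} = \frac{\widehat{cov}(X_1, X_2)}{\hat{\sigma}_1\hat{\sigma}_2} = \frac{8.66}{(1.02)(9.93)} = 0.86$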
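The sample covariance and correlation for the lead time and demand data can be verified as below. This is a sketch with a hypothetical function name; it keeps full precision, so the correlation comes out near 0.85 rather than the 0.86 obtained from the rounded standard deviations.

```python
import math

def sample_cov_corr(x1, x2):
    """Return the sample covariance and correlation of two paired samples."""
    n = len(x1)
    mean1, mean2 = sum(x1) / n, sum(x2) / n
    cov = (sum(a * b for a, b in zip(x1, x2)) - n * mean1 * mean2) / (n - 1)
    var1 = (sum(a * a for a in x1) - n * mean1 ** 2) / (n - 1)
    var2 = (sum(b * b for b in x2) - n * mean2 ** 2) / (n - 1)
    return cov, cov / math.sqrt(var1 * var2)

lead_time = [6.5, 4.3, 6.9, 6.0, 6.9, 6.9, 5.8, 7.3, 4.5, 6.3]
demand = [103, 83, 116, 97, 112, 104, 106, 109, 92, 96]

cov, corr = sample_cov_corr(lead_time, demand)
print(round(cov, 2), round(corr, 2))
```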
If X1, X2, X3,… is a sequence of identically distributed but dependent and covariance –
stationary random variables then there are number of time series models that can be
used to represent the process.
Two models for describing time series are
1. AR(1) – Autoregressive order-1 model
2. EAR(1) – Exponential autoregressive order-1 model
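Both these models have the characteristic that the autocorrelation takes the form
$\rho_h = corr(X_t, X_{t+h}) = \rho^h$, the lag-h autocorrelation, for h = 1, 2, …
To estimate the lag-1 autocorrelation, first estimate the lag-1 autocovariance:
$\widehat{cov}(X_t, X_{t+1}) = \frac{1}{n-1}\sum_{t=1}^{n-1}(X_t - \bar{X})(X_{t+1} - \bar{X}) = \frac{1}{n-1}\left(\sum_{t=1}^{n-1} X_t X_{t+1} - (n-1)\bar{X}^2\right)$
Then $\hat{\phi} = \frac{\widehat{cov}(X_t, X_{t+1})}{\hat{\sigma}^2}$
Finally, estimate µ and σε² by $\hat{\mu} = \bar{X}$ and $\hat{\sigma}_\varepsilon^2 = \hat{\sigma}^2(1 - \hat{\phi}^2)$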
Solution:
∑ ∑
̅
∑ ̅ ∑ ( )
̂
̂( ) (∑ ( )̅ )
, ( )( ) -
̂
Therefore we could model interarrival times as an EAR (1) process with
̂ and ̂ ̂ = 0.8
̅
provided exponential distribution is good model for individual gaps.
Example 8.8
The numbers of patrons staying at Hotel Delight near Andheri Station on 20 successive nights
were observed to be 20, 14, 21, 19, 14, 18, 21, 25, 27, 26, 22, 18, 18, 18, 25, 23, 20, 21. Fit
both an AR(1) and EAR(1) model to this data.
Solution:
For AR(1) model, the parameters ̂ ̂ ̂ are estimated as follows:
∑ ∑
̂ ̅
∑ ̅ ∑ ( )
̂
̂( ) (∑ ( )̅ )
, ( )( ) -
̂
Therefore we could model interarrival times as an EAR (1) process with
̂ and ̂ ̂ = 0.42
̅
■■■
CHAPTER 9
VERIFICATION, CALIBRATION AND VALIDATION OF
SIMULATION MODELS
Verification and validation of the simulation model are among the most important and
difficult tasks facing the model developer, who should work closely with end users
throughout the period of development to increase the model's credibility.
Validation is an integral part of the model development.
The goal of the validation process is twofold:
1. To produce a model that represents true system behavior closely enough to be used as a
substitute for the actual system for the purpose of experimenting.
2. To increase the credibility of the model to an acceptable level, so that the model will be
used by managers and other decision makers.
Conceptually, the verification and validation process consists of the following
components.
1. Verification is concerned with building the model right, which is used in
comparison of conceptual model to the computer representation.
2. Validation is concerned with building the right model, which is used to
determine that a model is an accurate representation of a real system. It is
achieved through the calibration of the model, an iterative process of comparing
the model to actual system behavior. This process is repeated until model
accuracy is judged to be acceptable.
This step leads in understanding the system behavior. Persons familiar with the
system or sub system should be questioned and gain the advantage of their special
knowledge. As the development proceeds, new questions may arise and model
developers will return to this step.
Step 2- The second step is the construction of a conceptual model. It includes the
collection of assumptions on the components and structure of the system and hypotheses on
the values of model input parameters.
Step 3- The third step is the translation of the operational model into a computer-recognizable
form, the computerized model.
Model building is not a linear process; the model builder returns to each of these steps
many times while building, verifying and validating the model. Figure 9.1 shows the
model building process.
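[Figure 9.1: Model building, verification and validation. The real system and the conceptual model are linked by conceptual validation; the conceptual model and the operational model (computerized representation) are linked by model verification; the operational model and the real system are linked by calibration and validation.]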
Figure 9.2 shows the relationship of model calibration to the overall validation
process.
The comparison of the model to reality is carried out by a variety of tests – some
subjective and others objective.
Subjective tests involve people, who are knowledgeable about one or more aspects of
the system, making judgments about the model and its output.
Objective tests always require data on the system‟s behavior plus the corresponding
data produced by the model.
Then one or more statistical tests are performed to compare some aspect of the system
data set to the same aspect of the model data set.
This iterative process of comparing model and system, and revising both the
conceptual and operational models to accommodate any perceived model deficiencies,
is continued until the model is judged to be sufficiently accurate.
The potential users of the model must be involved in model construction, from its
conceptualization to its implementation, to ensure that realism is built into the model
through reasonable assumptions regarding system structure and through reliable data.
The advantages of involving potential users are:
1. They can evaluate the model output for reasonableness and help identify
deficiencies. They are thus involved in the calibration process as the model is iteratively
improved.
2. The increase in the model's perceived validity, or credibility, helps the manager to trust
the simulation results as a basis for decision making.
Sensitivity analysis can also be used to check a model's face validity – the model user
is asked whether the model behaves in the expected way when one or more input
variables are changed. Based on experience with and observation of the real system, both
the model user and the model builder usually have some notion of at least the direction in
which the model output should change.
For most large-scale simulation models there are many input variables and therefore
many possible sensitivity tests. If it is too expensive or time consuming to run them all,
the builder must choose the most critical input variables for testing.
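A sensitivity check of this kind can be sketched in a few lines. The example below is only an illustration: it assumes a single-server queue with exponential interarrival and service times, for which the standard steady-state result L = ρ/(1 − ρ) gives the mean number in system. The function names, the choice of model and the rates are hypothetical and do not come from this section. The check asks whether the output moves in the expected direction when one input is varied and the others are held fixed.

```python
def mm1_mean_number_in_system(arrival_rate, service_rate):
    """Steady-state mean number in an M/M/1 system, L = rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("unstable: arrival rate must be below service rate")
    return rho / (1 - rho)


def sensitivity(model, base_inputs, name, values):
    """Re-run `model`, varying the input `name` and holding the others fixed."""
    results = []
    for v in values:
        inputs = dict(base_inputs, **{name: v})
        results.append((v, model(**inputs)))
    return results


# Hypothetical base case: 4 arrivals and 10 service completions per hour.
base = {"arrival_rate": 4.0, "service_rate": 10.0}
table = sensitivity(mm1_mean_number_in_system, base, "arrival_rate", [4.0, 6.0, 8.0])
for rate, mean_in_system in table:
    print(f"arrival rate {rate:4.1f} -> mean number in system {mean_in_system:.3f}")

# Face-validity check: more arrivals should mean more congestion.
outputs = [value for _, value in table]
increasing = all(a < b for a, b in zip(outputs, outputs[1:]))
print("output increases with arrival rate:", increasing)
```

For a real simulation model the `model` argument would wrap a full simulation run, and each point would need several replications before the direction of change could be judged.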
2. Data assumptions should be based on the collection of reliable data and correct statistical analysis of
the data.
Example – For a bank, the data collected were
a. Inter-arrival times of customers during several 2-hour periods of peak
loading.
b. Inter-arrival times during a slack period.
c. Service times for commercial accounts.
d. Service times for personal accounts.
The reliability of the data is verified by consultation with bank managers, who identify
typical slack and rush times. When two or more collected data sets are combined, objective
statistical tests are performed for homogeneity of the data.
Additional tests may be required for correlation in the data. The analyst begins statistical
analysis as soon as he is assured of dealing with a random sample.
The analysis consists of three steps
1. Identifying the appropriate probability distribution.
2. Estimating the parameters of hypothesized distribution.
3. Validating the model by goodness-of-fit tests (chi-square or Kolmogorov-Smirnov
test) and by graphical methods.
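The last two steps can be sketched as follows, assuming step 1 has already suggested an exponential distribution. The data here are synthetic (drawn from a seeded generator), the number of cells k = 5 is an arbitrary choice, and the function name is hypothetical. The parameter is estimated as 1/mean, the cells are chosen to be equally probable under the fitted distribution, and the chi-square statistic is compared with the tabulated critical value for α = 0.05 and 3 degrees of freedom (7.81).

```python
import math
import random


def exponential_chi_square(sample, k):
    """Fit an exponential distribution and return the chi-square statistic."""
    n = len(sample)
    rate = n / sum(sample)                 # step 2: estimate lambda = 1 / mean
    # Boundaries a_i with F(a_i) = i/k for F(x) = 1 - exp(-rate * x),
    # so every cell has the same expected count n/k.
    edges = [-math.log(1 - i / k) / rate for i in range(1, k)]
    observed = [0] * k
    for x in sample:
        cell = sum(1 for a in edges if x > a)   # index of the cell holding x
        observed[cell] += 1
    expected = n / k
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    dof = k - 1 - 1                        # k cells, minus 1, minus 1 estimated parameter
    return rate, observed, chi2, dof


rng = random.Random(42)                    # fixed seed so the run is repeatable
service_times = [rng.expovariate(0.5) for _ in range(100)]   # synthetic data

rate, observed, chi2, dof = exponential_chi_square(service_times, k=5)
critical = 7.81                            # chi-square table value, alpha = 0.05, 3 d.o.f.
print(f"rate={rate:.3f} observed={observed} chi2={chi2:.2f} dof={dof}")
print("reject exponential fit" if chi2 > critical else "do not reject exponential fit")
```

With real data, a histogram or a quantile plot would normally be inspected as well, since the chi-square result depends on how the cells are chosen.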
In this phase, the model is viewed as an input-output transformation, i.e. the model accepts
values of the input parameters and transforms these inputs into output measures of
performance.
The modeler may collect two sets of data: one used when developing and
calibrating the model, and the other, if required, for the final validation test.
In any case, the modeler should use the main responses of interest as the criteria for
validating a model. A necessary condition in this phase is that some version of the system
under study exists, so that system data can be collected (for at least one set of input conditions)
and compared with the model's predictions. If the system is in the planning
stage and no system operating data can be collected, complete input-output validation is
not possible.
What about the validity of a model of a non-existent proposed system, or of a model of an
existing system under new input conditions?
First, the responses of the two models under similar input conditions will be used as the
criteria for comparing the existing and proposed systems.
Second, in most cases the proposed system is a modification of the existing system.
The modeler hopes that confidence in the model of the existing system can be
transferred to the model of the new system. This transfer of confidence
can be justified only if the new model is a relatively minor modification of the old
model in terms of changes to the computerized representation of the system.
Changes in the computerized representation of the system, ranging from relatively minor
to relatively major, include:
Minor changes of single numerical parameters. Example – the speed of a machine or the
arrival rate of customers.
Minor changes of a statistical distribution. Example – the distribution of a service time
or of the time to failure of a machine.
Major changes in the logical structure of a subsystem. Example – a change in queue
discipline for a waiting-line model.
Major changes in the design of the new system. Example – a computerized inventory
control system.
If the changes are minor, they can be carefully verified and the output from the new
model accepted with confidence. If a similar subsystem exists elsewhere, it may be
possible to validate the sub model that represents the subsystem and then integrate this
sub model with other validated sub models to build a complete model; this gives a partial
validation of major changes.
There is no way to completely validate the input-output transformations of a model of
a non-existent system. The modeler should use as many validation techniques as time and
budget constraints allow, including input-output validation of subsystem
models if operating data can be collected on such subsystems.
To conduct a validation test using historical input data, it is important that all input
data (An, Sn, …) and all system response data, such as average delay (Z2), be
collected during the same time period. Otherwise the comparison of model responses to
system responses could be misleading, because the responses depend on the inputs as well
as on the structure of the system or model.
Implementing this technique for a large system is difficult because of the need for
simultaneous data collection. Electronic counters and similar devices can ease
data collection. In this technique the modeler hopes that the simulation will
replicate the real system, but deciding what level of accuracy is sufficient requires the
judgment of both the model builder and the model user.
When no statistical test is readily applicable, the comparison of model output to system
output can be carried out by persons who are knowledgeable about system behavior.
For example: Suppose five reports of system performance over five different days are
prepared, and simulation output data are used to produce five fake reports. The
10 reports are in exactly the same format and contain the information required by managers
and engineers. The 10 reports are shuffled randomly and given to an engineer, who is asked
to identify which are fake and which are real. If the engineer identifies the fake reports, the model
builder questions the engineer and uses the information gained to improve the model.
If the engineer cannot distinguish fake from real reports, the modeler concludes that the
test provides no evidence that the model is inadequate. This type of validation test is called a Turing test.
It provides a valuable tool for detecting model inadequacies and, eventually, for
increasing model credibility.
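To see why a fully correct identification is informative, it helps to know how unlikely it is under pure guessing. A minimal sketch, assuming (this is not stated in the text) that the engineer knows exactly five of the ten reports are fake and picks five at random:

```python
from math import comb

real_reports = 5
fake_reports = 5

# Number of distinct ways to choose which 5 of the 10 reports are labelled fake.
ways = comb(real_reports + fake_reports, fake_reports)

# Only one of those selections labels every report correctly.
p_all_correct_by_chance = 1 / ways

print(f"{ways} possible selections, chance of a perfect guess = {p_all_correct_by_chance:.4f}")
```

A perfect identification is therefore very unlikely to be luck, which is why the model builder treats it as a signal that the model output differs recognizably from the real system.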
■■■
CHAPTER 10
ESTIMATION OF ABSOLUTE PERFORMANCE
̂ ∑
Example: ̂ and ̂ in the queueing models, where Wi is the time spent by the
customer i in the system.
given by . / .
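It is very difficult to obtain approximately unbiased estimates of $\sigma^2(\hat{\theta})$.
The overall point estimator over R replications, where $\hat{\theta}_r$ is the estimate from replication r:
$$\hat{\theta} = \frac{1}{R}\sum_{r=1}^{R}\hat{\theta}_r$$
The estimated variance of $\hat{\theta}$:
$$\hat{\sigma}^2(\hat{\theta}) = \frac{1}{R(R-1)}\sum_{r=1}^{R}\left(\hat{\theta}_r - \hat{\theta}\right)^2$$
4. A 100(1 - α)% confidence interval with f = R – 1 degrees of freedom is given
by
$$\hat{\theta} - t_{\alpha/2,f}\,\hat{\sigma}(\hat{\theta}) \le \theta \le \hat{\theta} + t_{\alpha/2,f}\,\hat{\sigma}(\hat{\theta})$$
5. The standard error of the point estimator $\hat{\theta}$ is given by
$$\text{s.e.}(\hat{\theta}) = \hat{\sigma}(\hat{\theta}) = \sqrt{\hat{\sigma}^2(\hat{\theta})}$$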
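As R increases the standard error becomes smaller and approaches zero.
For continuous-time data, with $\hat{\phi}_r$ the time-average estimate from replication r:
$$\hat{\phi} = \frac{1}{R}\sum_{r=1}^{R}\hat{\phi}_r$$
$$\hat{\sigma}^2(\hat{\phi}) = \frac{1}{R(R-1)}\sum_{r=1}^{R}\left(\hat{\phi}_r - \hat{\phi}\right)^2$$
4. The confidence interval is similar to that for Discrete Time Data.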
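Solution:
The analyst desires a 95% confidence interval for Able's true utilization, ρ.
As a 95% confidence interval is expected, 100(1 – α) = 95
1 – α = 95/100 = 0.95
α = 1 – 0.95 = 0.05 = 5%
Also, degrees of freedom f = R – 1 = 4 – 1 = 3
An overall point estimator is
$$\hat{\rho} = \frac{1}{R}\sum_{r=1}^{R}\hat{\rho}_r = 0.808$$
with estimated variance
$$\hat{\sigma}^2(\hat{\rho}) = \frac{1}{R(R-1)}\sum_{r=1}^{4}\left(\hat{\rho}_r - \hat{\rho}\right)^2 = (0.036)^2$$
Thus, the standard error of $\hat{\rho}$ = 0.808 is estimated by s.e.($\hat{\rho}$) = $\hat{\sigma}(\hat{\rho})$ = 0.036.
Given t0.025,3 = 3.18.
Now, compute the 95% confidence interval by
$$\hat{\rho} \pm t_{0.025,3}\,\hat{\sigma}(\hat{\rho}) = 0.808 \pm (3.18)(0.036) = 0.808 \pm 0.114$$
or, with 95% confidence, 0.694 ≤ ρ ≤ 0.922
In a similar way, compute a 95% confidence interval for the mean time in system w:
$$\hat{\sigma}^2(\hat{w}) = \frac{1}{R(R-1)}\sum_{r=1}^{4}\left(\hat{w}_r - \hat{w}\right)^2 = (0.176)^2$$
so that
$$\hat{w} \pm t_{0.025,3}\,\hat{\sigma}(\hat{w}) = 4.02 \pm (3.18)(0.176) = 4.02 \pm 0.56$$
or, with 95% confidence, 3.46 ≤ w ≤ 4.58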
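The calculation above can be sketched as a small function. The four utilization values below are assumptions: the per-replication data are not recoverable from this text, so values were chosen that are consistent with the quoted point estimate 0.808 and standard error 0.036. The function name is hypothetical, and the t quantile 3.18 is the table value given in the solution.

```python
import math


def replication_ci(estimates, t_quantile):
    """Across-replication point estimate, standard error and confidence interval."""
    R = len(estimates)
    point = sum(estimates) / R
    # Estimated variance of the point estimator: sum of squares / (R (R - 1)).
    variance = sum((y - point) ** 2 for y in estimates) / (R * (R - 1))
    std_error = math.sqrt(variance)
    half_length = t_quantile * std_error
    return point, std_error, (point - half_length, point + half_length)


# Assumed replication results (illustrative, see the note above).
utilization = [0.808, 0.875, 0.708, 0.842]
point, std_error, (lower, upper) = replication_ci(utilization, t_quantile=3.18)
print(f"point={point:.3f} s.e.={std_error:.3f} interval=({lower:.3f}, {upper:.3f})")
```

Because the worked solution rounds the point estimate and standard error before forming the interval, the endpoints computed here can differ from 0.694 and 0.922 in the third decimal place.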
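The half-length H of the 100(1 – α)% confidence interval based on R replications, where S is the sample standard deviation of the replication estimates, must not exceed the specified error ε:
$$H = t_{\alpha/2,R-1}\,\frac{S}{\sqrt{R}} \le \epsilon$$
Solving for R in the inequality, with $S_0$ the standard deviation estimated from $R_0$ initial replications,
$$R \ge \left(\frac{t_{\alpha/2,R-1}\,S_0}{\epsilon}\right)^2$$
An initial estimate of R is obtained by using $z_{\alpha/2}$ in place of $t_{\alpha/2,R-1}$.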
̂ ̂
√ √
Also, ̂ ( ̂)
5. If the confidence interval is too large, repeat the steps to estimate an even
larger sample size.
̂( ̂)
̂ √
We know that the quantile estimation problem is the inverse of the probability
estimation problem.
Steps:
1. Obtain θ such that Pr(Y ≤ θ) = p.
2. To estimate the p quantile, find $\hat{\theta}$ such that 100p% of the data in the histogram
of Y is to the left of $\hat{\theta}$; equivalently, $\hat{\theta}$ is the (Rp)th smallest value of Y1, Y2, …, YR.
3. Obtain the approximate (1 – α)100% confidence interval for θ by calculating:
(i) $\hat{\theta}_l$ that cuts off 100$p_l$% of the histogram and
(ii) $\hat{\theta}_u$ that cuts off 100$p_u$% of the histogram
where
$$p_l = p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{R-1}}, \qquad p_u = p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{R-1}}$$
4. In case of sorted values,
$\hat{\theta}_l$ is the ($Rp_l$)th smallest value (rounded down) of Y1, Y2, …, YR
$\hat{\theta}_u$ is the ($Rp_u$)th smallest value (rounded up) of Y1, Y2, …, YR
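The steps above can be sketched as follows. The data are purely illustrative (the integers 1 to 100 stand in for R = 100 replication outputs), the function names are hypothetical, and rounding the point estimate's rank up is an assumption, since the text only specifies rounding for the two interval endpoints. The interval uses the normal-approximation bounds p ± z·sqrt(p(1 − p)/(R − 1)).

```python
import math
from statistics import NormalDist


def quantile_with_ci(data, p, alpha):
    """Estimate the p quantile from sorted data with an approximate CI."""
    y = sorted(data)
    R = len(y)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * math.sqrt(p * (1 - p) / (R - 1))
    p_l = max(p - half, 0.0)
    p_u = min(p + half, 1.0)

    def kth_smallest(k):
        k = min(max(k, 1), R)          # keep the rank inside 1..R
        return y[k - 1]

    point = kth_smallest(math.ceil(R * p))
    lower = kth_smallest(math.floor(R * p_l))   # rounded down
    upper = kth_smallest(math.ceil(R * p_u))    # rounded up
    return point, (lower, upper)


outputs = list(range(1, 101))          # stand-in for Y1, ..., YR with R = 100
point, (lower, upper) = quantile_with_ci(outputs, p=0.8, alpha=0.05)
print(point, lower, upper)
```

Note how wide the interval is even with 100 replications; quantile estimates far in the tail generally need more replications than estimates of a mean.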
The sample size 'n' (or TSE) is a design choice; it is not inherently determined by the nature of the
problem.
Since a simulation cannot run forever, we choose n, and hence when to terminate the simulation, based on:
1. Bias in the point estimator due to inappropriate initial conditions. This is more
severe for short simulations and averages out over long runs.
2. Desired precision of the point estimator (measured by the point estimator's variance).
3. Time and budget available for computers and other resources.
The selection of 'T0' is an important issue, as the system state 'I' at time
T0 should be more representative of steady-state behavior than the original initial
condition (I0) at time 0.
Also, the duration of the data collection phase 'TSE' should be long enough that
the steady-state behavior is estimated with sufficient precision.
Note that the system state 'I' is a random variable. When the probability
distribution of the system state at time 'T0' is sufficiently close to the
steady-state probability distribution, the bias in the point estimates
of the response variables is negligible, and we can say that the system is
approximately in steady state.
Several issues:
1. Ensemble average will reveal a smoother and more precise trend as the number of
replications, R, is increased.
2. Ensemble average can be smoothened further by plotting a moving average. In a
moving average each plotted point is actually the average of several adjacent
ensemble averages.
3. Cumulative averages become less variable as more data are averaged. Thus, the left
side of the plotted curve (the start of the simulation) is expected to be
less smooth than the right side.
4. Simulation data, especially from queueing models, usually exhibit positive
autocorrelation. The more correlation present, the longer it takes for the average to
approach steady state.
5. In most simulation studies the analyst is interested in several measures such as queue
length, waiting time, utilization, etc. Different performance measures may approach
steady state at different rates. Thus it is important to examine each performance
measure individually for initialization bias and use a deletion point that is adequate
for all of them.
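The three kinds of average mentioned in the list above can be sketched as follows. The replication data are made up for illustration (R = 3 replications, six observations each), and the function names and the window half-width are hypothetical choices.

```python
def ensemble_averages(replications):
    """Average across replications at each observation index."""
    R = len(replications)
    return [sum(column) / R for column in zip(*replications)]


def moving_average(series, half_width):
    """Centered moving average; points without a full window are dropped."""
    w = 2 * half_width + 1
    return [sum(series[i:i + w]) / w for i in range(len(series) - w + 1)]


def cumulative_average(series, deleted=0):
    """Running average of the series after deleting the first `deleted` points."""
    averages, total = [], 0.0
    for count, y in enumerate(series[deleted:], start=1):
        total += y
        averages.append(total / count)
    return averages


# Hypothetical output: each row is one replication started from an empty system.
replications = [
    [0, 2, 4, 6, 6, 6],
    [2, 4, 6, 6, 8, 6],
    [1, 3, 5, 6, 7, 6],
]
ensemble = ensemble_averages(replications)
smoothed = moving_average(ensemble, half_width=1)
after_deletion = cumulative_average(ensemble, deleted=2)
print("ensemble averages:   ", ensemble)
print("moving average:      ", [round(v, 3) for v in smoothed])
print("cumulative (d = 2):  ", after_deletion)
```

Plotting the ensemble or moving averages against the observation index shows where the upward trend caused by the empty initial state levels off, which suggests a deletion point.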
Solution:
Given that: R0 = 10, d = 2, $S_0^2$ = 25.30, ε = 2
As a 90% confidence interval is expected, 100(1 – α) = 90
1 – α = 90/100 = 0.90
α = 1 – 0.90 = 0.10 = 10%
Estimate the final sample size as desired by:
$$R \ge \left(\frac{z_{0.05}\,S_0}{\epsilon}\right)^2 = \frac{(1.645)^2(25.30)}{2^2} = 17.1$$
Thus, at least 18 replications will be needed.
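The initial estimate can be checked numerically. This sketch reads the given 25.30 as the sample variance from the R0 = 10 initial replications and the given 2 as the required half-length of the interval; those readings, like the function name, are assumptions. It uses the normal quantile for the first estimate, so a follow-up check with the t quantile for the chosen R would still be needed.

```python
import math
from statistics import NormalDist


def initial_replication_bound(sample_variance, epsilon, alpha):
    """Lower bound (z * S0 / epsilon)^2 on the number of replications."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2 * sample_variance / epsilon ** 2


bound = initial_replication_bound(sample_variance=25.30, epsilon=2.0, alpha=0.10)
replications_needed = math.ceil(bound)
print(f"bound = {bound:.1f}, so at least {replications_needed} replications")
```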
10.6.5 Batch Means for Interval Estimation in Steady State Simulations [M-12,17, D-10]
One problem with the replication method is that some initial data, i.e. 'd'
observations, must be deleted from each of the 'R' replications.
So a total of 'dR' observations is thrown away.
To overcome this issue we may use an experimental design that is based on a single
long replication, from which initial data are deleted only once.
But the major problem with a single replication is computing the standard error of the sample mean:
because the data are dependent, the usual estimator is biased.
The method of batch means attempts to overcome this problem.
Method:
Divide the output data from one replication (after appropriate deletion) into a
few large batches.
Process the means of these batches as if they were independent.
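Calculate the batch means ($\bar{Y}_j$) based on the form of the raw output data.
After the first d observations are deleted, the remaining observations are divided into k batches of size m:
$$\underbrace{Y_1,\ldots,Y_d}_{\text{deleted}},\ \underbrace{Y_{d+1},\ldots,Y_{d+m}}_{\bar{Y}_1},\ \underbrace{Y_{d+m+1},\ldots,Y_{d+2m}}_{\bar{Y}_2},\ \ldots,\ \underbrace{Y_{d+(k-1)m+1},\ldots,Y_{d+km}}_{\bar{Y}_k}$$
To test the batch means for independence, compute the test statistic
$$C = \sqrt{\frac{k^2-1}{k-2}}\left(\hat{\rho}_1 + \frac{(\bar{Y}_1-\bar{Y})^2 + (\bar{Y}_k-\bar{Y})^2}{2\sum_{j=1}^{k}(\bar{Y}_j-\bar{Y})^2}\right)$$
where $\hat{\rho}_1$ is the estimated lag-1 autocorrelation of the batch means and $\bar{Y}$ is their overall mean.
If $C < z_\beta$, then accept the independence of the batch means; else extend the
replication by 50% to 100% and go to step 2.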
Due to the small number of batches the estimated lag-1 autocorrelation would be
slightly negative.
Also, it is difficult to estimate correlation with small numbers of observation.
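A minimal sketch of the batch means procedure follows. Everything specific in it is an assumption made for illustration: the output series is a synthetic autocorrelated sequence, 200 initial observations are deleted, k = 40 batches are used, and the lag-1 autocorrelation estimator and the statistic C = sqrt((k² − 1)/(k − 2))·(ρ̂1 + end-point correction) are written in the form commonly given for this test. Function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def batch_means(data, k):
    """Split data into k equal batches (dropping any remainder) and average each."""
    m = len(data) // k
    return [mean(data[j * m:(j + 1) * m]) for j in range(k)]


def lag1_autocorrelation(means):
    """Estimated lag-1 autocorrelation of the batch means."""
    grand = mean(means)
    numerator = sum((a - grand) * (b - grand) for a, b in zip(means, means[1:]))
    denominator = sum((y - grand) ** 2 for y in means)
    return numerator / denominator


def independence_statistic(means):
    """Test statistic C; small values support independent batch means."""
    k = len(means)
    grand = mean(means)
    denominator = sum((y - grand) ** 2 for y in means)
    ends = (means[0] - grand) ** 2 + (means[-1] - grand) ** 2
    correction = ends / (2 * denominator)
    return math.sqrt((k * k - 1) / (k - 2)) * (lag1_autocorrelation(means) + correction)


# Synthetic, positively autocorrelated output from one long run.
rng = random.Random(7)
x, output = 0.0, []
for _ in range(4200):
    x = 0.8 * x + rng.gauss(0, 1)
    output.append(x)
retained = output[200:]                 # delete the first d = 200 observations

means = batch_means(retained, k=40)
C = independence_statistic(means)
z_beta = NormalDist().inv_cdf(0.95)     # beta = 0.05
std_error = stdev(means) / math.sqrt(len(means))
print(f"grand mean={mean(means):.3f} s.e.={std_error:.3f} C={C:.3f} z_beta={z_beta:.3f}")
print("treat batch means as independent" if C < z_beta else "extend the run and rebatch")
```

The standard error is computed from the batch means as if they were independent, which is only reasonable once the test above does not reject independence.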
̂ ∑̂
■■■
UNIT: V
APPLICATION
CHAPTER 11
Application: Case Study
2. Memory Simulation
Data is written into the L1 cache. Writes are then propagated to the lower levels of the memory hierarchy in one of two ways:
(i) Write through all the caches whenever there is a write.
(ii) Write back the entire block every time it is ejected.
The simulation here compares the performance of these two strategies,
taking costs and contention for resources into consideration.
Direct-execution simulation is used to provide reference streams for the evaluation of
a memory hierarchy design.
Caches must be very fast. This is achieved by dedicating comparison hardware to every location in the
associative memory.
Another role of simulation is to work out, for a given cache size, how the space ought
to be partitioned into sets. This is largely a cost consideration.
It is possible to compute the miss ratios for many different set sizes in one pass over the reference
string.
Presented with a reference, the simulation searches the list of cache blocks for a match.
Building a cache with a unique comparator for every address in the
cache is extremely expensive, so various tricks are applied to bring down the
costs.
For this, the address space is partitioned into sets, and any given memory
address is mapped to the set identified by its set-id address bits.
In the context of a set-associative cache simulation, each set must be managed separately.
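A small trace-driven sketch shows how the partition into sets affects the miss ratio. The replacement policy (least recently used within each set), the block size, and the reference string are all assumptions chosen for illustration; the function name is hypothetical. Each set is kept as its own ordered dictionary so that it is managed separately from the others.

```python
from collections import OrderedDict


def miss_ratio(addresses, num_sets, ways, block_size):
    """Simulate a set-associative cache with LRU replacement inside each set."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for address in addresses:
        block = address // block_size
        set_id = block % num_sets          # low-order block bits select the set
        tag = block // num_sets
        lines = sets[set_id]
        if tag in lines:
            lines.move_to_end(tag)         # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return misses / len(addresses)


# Hypothetical reference string cycling over three 64-byte blocks.
trace = [0, 64, 128, 0, 64, 128]
for num_sets, ways in [(1, 2), (2, 1), (1, 4)]:
    ratio = miss_ratio(trace, num_sets, ways, block_size=64)
    print(f"{num_sets} set(s) x {ways} way(s): miss ratio {ratio:.3f}")
```

This version re-runs the trace once per configuration. One-pass techniques that evaluate many configurations together exist, but they are more involved than this sketch.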
4. Maintenance policy
5. Effect of changes in order profiles
Due to the stochastic nature of the data, each single simulation should run for a couple
of days.
■■■
APPENDIX
21215 91791 76831 58678 87054 31687 93205 43685 19732 08468
10438 44482 66558 37649 08882 90870 12462 41810 01806 02977
36792 26236 33266 66583 60881 97395 20461 36742 02852 50564
73944 04773 12032 51414 82384 38370 00249 80709 72605 67497
49563 12872 14063 93104 78483 72717 68714 18048 25005 04151
64208 48237 41701 73117 33242 42314 83049 21933 92813 04763
51486 72875 38605 29341 80749 80151 33835 52602 79147 08868
99756 26360 64516 17971 48478 09610 04638 17141 09227 10606
71325 55217 13015 72907 00431 45117 33827 92873 02953 85474
65285 97198 12138 53010 95601 15838 16805 61004 43516 17020
17264 57327 38224 29301 31381 38109 34976 65692 98566 29550
95639 99754 31199 92558 68368 04985 51092 37780 40261 14479
61555 76404 86210 11808 12841 45147 97438 60022 12645 62000
78137 98768 04689 87130 79225 08153 84967 64539 79493 74917
62490 99215 84987 28759 19177 14733 24550 28067 68894 38490
24216 63444 21283 07044 92729 37284 13211 37485 10415 36457
16975 95428 33226 55903 31605 43817 22250 03918 46999 98501
59138 39542 71168 57609 91510 77904 74244 50940 31553 62562
29478 59652 50414 31966 87912 87514 12944 49862 96566 48825
To get a random number between 0 and 1, take a grouping of digits, for example 5 at a time,
and place a decimal point in front.
Example: If we take random digit 92729, then the random number will be represented as
.92729
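The same conversion can be sketched in code. The function name is hypothetical; the digits used are the first five groups of one row of the table above, which includes the 92729 of the example.

```python
def digits_to_random_numbers(row, group=5):
    """Turn a row of random digits into numbers in [0, 1), `group` digits at a time."""
    digits = "".join(row.split())
    usable = len(digits) - len(digits) % group   # ignore an incomplete last group
    return [int(digits[i:i + group]) / 10 ** group for i in range(0, usable, group)]


row = "24216 63444 21283 07044 92729"
numbers = digits_to_random_numbers(row)
print(numbers)

# Two digits at a time gives coarser random numbers from the same digits.
coarse = digits_to_random_numbers(row, group=2)
print(coarse[:3])
```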
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
.3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
.4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0 .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1 .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2 .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
3.3 .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997
3.4 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998
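Entries of the table above are values of the standard normal cumulative distribution function, so they can be reproduced with Python's standard library. The helper name `phi` is hypothetical. The sketch also shows the inverse lookup used when a critical value such as z for a 95% two-sided interval is needed.

```python
from statistics import NormalDist

standard_normal = NormalDist()          # mean 0, standard deviation 1


def phi(z):
    """Cumulative probability P(Z <= z) for a standard normal variable."""
    return standard_normal.cdf(z)


# Row 1.9, column .06 of the table corresponds to z = 1.96.
print(f"{phi(1.96):.4f}")

# Inverse lookup: the z value that leaves 2.5% in the upper tail.
z_critical = standard_normal.inv_cdf(0.975)
print(f"{z_critical:.2f}")
```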
(If calculated t is greater than value shown, reject the null hypothesis.)
(If calculated ratio is greater than value shown, then reject the null hypothesis at the chosen
level of confidence.)
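OVER 35: 1.07/√N, 1.14/√N, 1.22/√N, 1.36/√N, 1.63/√N (Kolmogorov-Smirnov critical values for significance levels 0.20, 0.15, 0.10, 0.05 and 0.01, where N is the sample size)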
References:
1. Jerry Banks, John S. Carson II, Barry L. Nelson and David M. Nicol,
“Discrete-Event System Simulation”, 3rd Edition, © 2001, Pearson
Education.
2. Jerry Banks, John S. Carson II, Barry L. Nelson and David M. Nicol,
“Discrete-Event System Simulation”, 5th Edition, © 2010, Pearson
Education.
3. Averill M Law, “System Modeling & Analysis”, 4th Edition, TMH.
4. http://www.fao.org/docrep/009/a0238e/A0238E02.htm
5. http://individual.utoronto.ca/ranodya/6P1.html
6. www.bcnn.net
7. http://chemwiki.ucdavis.edu/Reference/Reference_Tables/Analytic_References/Appendix_14%3A_Random_Number_Table
8. http://www.sjsu.edu/faculty/gerstman/EpiInfo/z-table.htm
9. http://dlc.erieri.com/index.cfm?fuseaction=textbook.appendix