CSM Book 2018 PDF

Academic Year 2017-18
Information Technology Branch

University of Mumbai
Nilesh M. Patil
SYLLABUS
Teaching Scheme
Course Course Credits Assigned
(Hrs. / Week)
Code Name
Theory Practical Tutorial Theory Practical/Oral Tutorial Total
Computer
Simulation
BEITC803 04 02 -- 04 01 -- 05
And
Modeling
Examination Scheme
Theory Marks
Internal
Course Assessment
Course Name End Term
Code Avg. Practical Oral Total
Semester Work
Test Test of
Exam
1 2 2
Tests
Computer
BEITC803 Simulation 20 20 20 80 25 25 -- 150
And Modeling
Course Objectives:
This course presents an introduction to discrete event simulation systems. Emphasis of the course
will be on modeling and the use of simulation languages/software to solve real world problems in
the manufacturing as well as services sectors. The course discusses the modeling techniques of
entities, queues, resources and entity transfers in discrete event environment. The course will teach
the students the necessary skills to formulate and build valid models, implement the model, perform
simulation analysis of the system and analyze results properly. The “theory” of simulation involves
probability and statistics, thus a good background in probability and statistics is a required
prerequisite
Course Outcomes:
 Understand the meaning of simulation and its importance in business, science, engineering,
industry and services.
 Identify the common applications of discrete-event system simulation.
 Practice formulation and modeling skills.
 Understand simulation languages.
 Ability to analyze events and inter-arrival time, arrival process, queuing strategies, resources
and disposal of entities.
 An ability to perform a simulation using spreadsheets as well as simulation
language/package.
 Ability to generate pseudorandom numbers using the Linear Congruential Method.
 Ability to perform statistical tests to measure the quality of a pseudorandom number
generator.
 Ability to define random variate generators for finite random variables.
 Ability to analyze and fit the collected data to different distributions.
DETAILED SYLLABUS
Sr.
Module Detailed Content Hours
No.
UNIT – I 1. Introduction to Simulation
1 Introduction 2. Simulation Examples 15
to Simulation 3. General Principles
UNIT – II
Mathematical
& 4. Statistical Models in simulation.
2 8
Statistical 5. Queuing Models
Models
In Simulation
6. Random Number Generation, Testing random numbers
(Refer to Third edition)
UNIT – III
7. Random Variate Generation: Inverse transform technique,
3 Random 9
Direct Transformation for the Normal Distribution,
Numbers
Convolution Method, Acceptance-Rejection Technique
(only Poisson Distribution)
UNIT – IV 8. Input Modeling
Analysis of 9. Verification, Calibration and Validation of Simulation
4 12
Simulation Models
Data 10. Estimation of absolute performance
11. Case study:
UNIT – V
5 Processor and Memory simulation 4
Application
Manufacturing & Material handling
Text Books:
1. Discrete Event System Simulation; Third Edition, Jerry Banks, John Carson, Barry Nelson, and
David M. Nicol, Prentice-Hall
2. Discrete Event System Simulation; Fifth Edition, Jerry Banks, John Carson, Barry Nelson, and
David M. Nicol, Prentice-Hall
References:
1. System Modeling & Analysis; Averill M Law, 4th Edition TMH.
2. Principles of Modeling and Simulation; Banks C M , Sokolowski J A; Wiley
3. System Simulation ; Geoffrey Gordon ; EEE
4. System Simulation with Digital Computer; Narsing Deo, PHI
Term work:
Laboratory work: 10 Marks
Mini Simulation Project presentation: 10 Marks
Attendance / Quiz: 5 Marks
Suggested Practical List (If Any):

Perform simulation exercises given in the text book (third edition) using spreadsheets and/or
simulation language/package
 Queue- single server, multi-server, classic case- dump truck
 Inventory – Lead time=0, lead time fixed, lead time probabilistic
 Reliability problem
 Tutorials on statistical models
 Random number generate and test
 Goodness of fit test
 Output analysis – Point estimate and Confidence Interval
Simulation: Real World Examples – can be in the field of business, transportation, medical,
computing, manufacturing and material handling- Presentation to be taken.
Theory Examination:
 Question paper will comprise of 6 questions, each carrying 20 marks.
 Total 4 questions need to be solved.
 Q.1 will be compulsory, based on entire syllabus where in sub questions of 2 to 3 marks will
be asked.
 Remaining question will be randomly selected from all the modules.
Weightage of marks should be proportional to number of hours assigned to each module.

Computer Simulation and Modeling 1
UNIT: I
INTRODUCTION TO SIMULATION
CHAPTER 1 INTRODUCTION TO SIMULATION
CHAPTER 2 SIMULATION EXAMPLES
CHAPTER 3 GENERAL PRINCIPLES

CHAPTER 1
INTRODUCTION TO SIMULATION
1.1 Introduction [M-05,06,07,08,09, S-15, N-04, D-09]

1. A simulation is the imitation of the operation of a real-world process or system over
time.
2. Simulation is acting out or mimicking an actual or probable real life condition, event,
or situation to find a cause of a past occurrence (such as an accident), or
to forecast future effects (outcomes) of assumed circumstances or factors.
3. A simulation may be performed through
a) solving a set of equations (a mathematical model),
b) constructing a physical (scale) model,
c) game (such as war games), or
d) a computer graphics model (such as an animated flowchart)
4. It involves generation of an artificial history of the system.
5. It draws inferences from the artificial history concerning the operating characteristics
of the real system.
6. Simulation is used to describe and analyze the behavior of a system.
7. Example: Flight Simulator, Need for Speed.
1.2 When Simulation is an Appropriate Tool [M-06,08,09,12,N-04,D-05,10 ]

Simulation can be used for the following purposes:
1. Simulation enables one to study internal interactions of a complex system, or of a
subsystem within a complex system.
2. The effect of informational, organizational, and environmental changes on the
model‟s behavior can be simulated and observed.
3. The performance of the system under investigation can be improved by knowledge
gained from simulations.
4. Simulation results can help to find which variables are the most important ones and
how variables interact.
5. Simulation can be used as the learning tool.
6. Simulations can be used to verify analytical solutions, e.g. queueing systems
7. By simulating different capabilities for machine requirements can be determined.
8. Simulation models designed for training allow learning without any cost.
9. Simulation in the form of animation can show the system in action, so that the plan
can be visualized.
10. The interactions in modern complex system like factory, water fabrication plant, etc,
can be treated only through simulation.
1.3 When Simulation is Not Appropriate [M-08,09]

Simulation should not be used, in the case
1. when problem is solvable by common sense
2. when the problem can be solved mathematically
3. when direct experiments are easier
4. when the simulation costs exceed the savings
5. when the simulation requires time, which is not available
6. when no (input) data is available, but simulations need data
7. when the simulation cannot be verified or validated
8. when the system behavior is too complex or unknown
1.4 Advantages of Simulation [M-10,11]

1. We can explore new policies, procedures, decision rules, information flows and so on
without disrupting ongoing operations of the real system.
2. We can test new hardware designs, physical layouts, and transportation systems
without committing resources for their acquisition.
3. Hypotheses about how or why certain phenomena occur can be tested for feasibility.
4. Time can be compressed or expanded allowing for slow-down or speed-up of the
phenomena under investigation.
5. Insight can be obtained about the interaction of variables.
6. Insight can be obtained about the importance of variables to the performance of the
system.
7. Bottleneck analysis can be performed to detect excessive delays.
8. Simulation can help to understand how the system operates rather than how people
think the system operates.
9. “What if” questions can be answered which is particularly important in the design of
new systems.
1.5 Disadvantages of Simulation [M-10,11]

1. Model building requires special training. It is an art that is learned over time and
through experience.
2. Simulation results may be difficult to interpret. Since most simulation outputs are
essentially random variables, it may be not simple to decide whether output is
randomness or system behavior
3. Simulation can be time consuming and expensive. Skimping in time and resources
could lead to useless or wrong results
4. Simulation is used in some cases when an analytical solution is possible.
The disadvantages can be offset as follows

1. Simulation packages contain models that only need input data.
2. Simulation packages contain output-analysis capabilities.
3. Sophistication in computer technology improves simulation times.
4. For most of the real-world problems there are no closed form solutions.
1.6 Areas of Application [M-05,06,13, D-05,06]

1. Manufacturing Applications
i. To analyze electronics assembly operations.
ii. To analyze storage and retrieval strategies in a warehouse.
iii. To investigate dynamics in service-oriented supply chain.
iv. To model an Army chemical munitions disposal facility.
2. Semiconductor Manufacturing
i. To compare dispatching rules using large-facility models.
ii. To plan capacity with time constraints between operations.
iii. To compare a 200-mm and 300-mm X-ray lithography cell.
iv. To reduce up to 300-mm logistic system risk.
3. Construction Engineering
i. To construct a dam embankment.
ii. To renew underground urban infrastructure.

iii. To investigate the structural steel erection process.
iv. To design special-purpose template for utility tunnel construction.
4. Military Applications
i. To design and test an intelligent controller for autonomous underwater vehicles.
ii. To model military requirements for non-war-fighting operations.
iii. To evaluate multi-trajectory performance for varying scenario sizes.
iv. To use adaptive agents in Air Force pilot retention.
5. Logistics, Transportation, and Distribution Application

i. To evaluate the potential benefits of a rail-traffic planning algorithm.
ii. To analyze passenger flows in an airport terminal.
iii. Product distribution in the newspaper industry.
iv. Design of a toll plaza.
6. Business Process Simulation

i. Modeling and simulation of a telephone call center.
ii. Simulation‟s role in baggage screening at airports.
iii. Optimization of a telecommunications billing system.
iv. Segmenting the customer base for maximum returns.
7. Human Systems
i. Modeling human performance in complex system.
ii. Studying the human element in air traffic control.
8. Health Care
i. Estimating maximum capacity in the emergency room.
ii. Reducing the length of stay in an emergency department.
1.7 System and System Environment [M-12]

i. A system is defined as a group of objects that are joined together in some regular
interaction or interdependence toward the accomplishment of some purpose.
ii. An example is a production system manufacturing automobiles. The machines,
component parts, and workers operate jointly along an assembly line to produce
high-quality vehicle.
iii. A system is often affected by changes occurring outside the system. Such changes
are said to occur in the system environment.
iv. In modeling systems, it is necessary to decide on the boundary between the system
and its environment. This decision may depend on the purpose of study.
1.8 Components of a System

1. Entity: An entity is an object of interest in the system. Example: Bank, Customers
2. Attribute: An attribute is property of an entity. Example: Balance in an account
3. Activity: An activity represents a time period of specified length. Example:Making
deposit
There are two types of activities:
i. Deterministic Activity: The outcome of the activity can be described completely
in terms of its inputs.
ii. Stochastic Activity: The effects of the activity vary randomly over various
possible outcomes.
4. State: The state of a system is defined to be that collection of variables necessary to
describe the system at any time, relative to the objects of the study. Example:Number
of Tellers, Number of customers waiting in queue
5. Event: An event is defined as an instantaneous occurrence that may change the state
of the system.
There are two types of events: [D-14]
i. Endogenous Event: An endogenous event is an event occurring within the system.
Example: Completion of service of a customer
ii. Exogenous Event: An exogenous event is an event occurring in the environment that
affects the system. Example: Arrival of customer in bank
Table 1.1 Examples of Systems and Components

System Entities Attributes Activities Events System
Variables
Banking Customers Account Making Arrival, Number of
Number, Deposits, Departure Teller
Balance Withdraw Machines,
Number of
Customers in
Queue
Production Machines Speed, Welding, Breakdown Status of
Capacity Stamping Machines
(Busy, Idle or
Down)
Communication Messages Length, Transmitting, Arrival at Number of
Source, Receiving Destination Packets to be
Destination Transmitted
Inventory Warehouse Capacity Withdrawing Demand Inventory
Level,
Backlogged
Demand
Cafeteria Diners Size of Selecting Food, Arrival at Number of
Appetite Paying for Service Line, Diners in
Food Departure from Queue,
Service Line Number of
Busy Servers
Grocery Store Shoppers Grocery List Checking Out Arrival at Number of
Checkout Checkout
Counter, Counters,
Departure from Number of
Checkout Shoppers in
Counter Line
Fast-food Customers Desired Placing the Arrival at Number of
Restaurant Order Order, Paying Counter, Waiting
for the Order Completion of Customers
Purchase
Hospital Patients Attention Providing Arrival of Number of
Emergency Level Service Patient, Waiting
Room Required Departure of Patients,
Patient Number of
Busy
Physicians
1.9 Types of Systems

1. Discrete System: A discrete system is one in which the state variables change only at
a discrete set of points in time. Example: Number of customers waiting in line in
bank.
2. Continuous System: A continuous system is one in which the state variables change
continuously over time. Example: Head of water behind the dam.
Figure 1.1 Discrete and Continuous Systems
1.10 Model of a System [N-04, M-05,07,12, D-09,15,17]

i. A model is defined as a representation of a system for the purpose of studying the
system.
ii. It is necessary to consider only those aspects of the system that affect the problem
under investigation for studying them.
iii. These aspects are represented in a model of the system, and the model, by
definition, is a simplification of the system.
iv. Model should be sufficiently detailed to permit valid conclusions to be drawn
about the real system.
1.11 Types of Models [M-05,13, N-04,D-05,06,12,13,15]

Different types of models are as follows:
1. Mathematical Model: A mathematical model uses symbolic notation and

mathematical equations to represent a system. It represents a system in terms of
2
logical and quantitative relationships. Example: Area of Circle = πR
2. Physical Model: A physical model is based on some analogy between systems such
as mechanical and electrical, electrical and hydraulic. System attributes are
represented by measurements such as voltage or the position of a motor shaft.
Example: Scale Models, Prototype Plants
3. Static Model: A static model, sometimes called a Monte Carlo simulation model,
represents a system at a particular point in time. Example: A map is a static model.
So is a photo.
4. Dynamic Model: A dynamic model represents systems as they change over time.
Example: Simulation of cafeteria from 2:00 p.m. to 4:00 p.m.
5. Deterministic Model: A deterministic model is one which contains no random
variables. It has a known set of inputs which will result in a unique set of outputs.
Example: Arrival of patients at a doctor‟s clinic as their scheduled appointment time.
6. Stochastic Model: A stochastic model is one which contains one or more random
variables as inputs. Random outputs are generated which can be considered only as
estimates of the true characteristics of a model. Example: Simulation of a bank
7. Discrete Model: A discrete model is one in which the state variables change only at a
discrete set of points in time. Example: Number of customers waiting in line in bank.
8. Continuous Model: A continuous model is one in which the state variables change
continuously over time. Example: Head of water behind the dam.
1.12 Steps in a Simulation Study

[M-05,06,11,12,13,14,15,16,17 S-15, D - 06,07,08,11,12,13,14,16,17]
Figure 1.2 shows a set of steps to guide a model builder in simulation study.
1. Problem Formulation
i. Every simulation study begins with a statement of the problem.
ii. If the statement is provided by the policy makers, or those that have the
problem, the analyst must ensure that the problem being described is clearly
understood.
iii. If a problem statement is being developed by the analyst, it is important that
policy makers understand and agree with the formulation.
iv. Even with all these precautions, it is possible that the problem will need to be
reformulated as the simulation study progresses.
2. Setting of objectives and overall project plan

i. The objectives indicate the questions to be answered by simulation.
ii. The overall project plan should include a statement of the alternative systems
to be considered, and a method of evaluating the effectiveness of these
alternatives.
iii. It should also include the plans for the study in terms of the number of people
involved, the cost of the study, and the number of days required to accomplish
each phase of the work with the anticipated results at the end of each stage.
3. Model conceptualization
i. A conceptual model abstracts the real-world system under investigation.
ii. Modeling should begin in a simple manner and then built towards greater but
appropriate complexity.
iii. Model user should be involved in conceptualization as it will enhance the
quality of the resulting model and also increase the confidence of the model
user in the application of the model.
4. Data Collection
i. There is a constant overplay between the construction of the model and the
collection of the needed input data.
ii. As the complexity of the model changes, the required data elements may also
change.
iii. The system objectives dictate the kind of data to be collected.
iv. Data collection should be started as early as possible since it takes large
amount of simulation time.
5. Model Translation
i. The conceptual model constructed in step 3 is coded into a computer
recognizable form, an operational model using any simulation language.
6. Verified?
i. Verification pertains to the computer program prepared for the simulation
model. Is the computer program performing properly?

ii. If the input parameters and logical structure of the model are correctly
represented in the computer, verification has been completed.
iii. For most of the part, common sense is used in completing this step.
Figure 1.2 Steps in Simulation Study

7. Validated?
i. Validation is the determination that the model is an accurate representation of
the real system.
ii. Validation is usually achieved through the calibration of model, an iterative
process of comparing the model to actual system behavior and using
discrepancies between the two, and the insights gained, to improve the model.
iii. This process is repeated until model accuracy is judged acceptable.
8. Experimental Design
i. For each system design that is to be simulated, decisions need to be made
concerning the length of the initialization period, the length of simulation runs,
and the number of replications to be made of each run.
9. Production runs and analysis

i. A run is defined as a succession of similar events preceded and followed by a
different event.
ii. Production runs and their subsequent analysis are used to estimate measures of
performance for the system designs that are being simulated.
10. More runs?

i. Based on the analysis of runs that have been completed, the analyst
determines if additional runs are needed and what design those additional
experiments should follow.
11. Documenting and reporting

i. There are two types of documentation: program and progress.
ii. Program documentation is necessary to understand how the program
operates, if the program is to be used again by the same or different analysts.
iii. This will build confidence in the program, so that model builders and policy
makers can make decisions based on the analysis.
iv. If the program is to be modified, the program documentation can be greatly
facilitated by adequate documentation.
v. Program document is also used to determine the input parameters that
optimize some output measure of performance.
vi. Progress reports provide the important written history of a simulation
project. Project reports give a chronology of work done and decisions made.
vii. The result of all the analysis should be reported clearly and concisely in a final
report. This will enable the model users to review the final formulation, the
alternative systems that were addressed, the results of the experiments, and the
recommended solution to the problem.
12. Implementation
i. This is the final step in the simulation study.
ii. If the model user has been involved throughout the study period, and the
simulation analyst has followed all the steps rigorously, then the
implementation is likely to be successful.
■■■
CHAPTER 2
SIMULATION EXAMPLES
2.1 The Simulation Process

 Simulations can be performed by devising a simulation table either manually or with a
spreadsheet (Excel Sheet).
 The simulation table provides a systematic method for tracking system state over
time.
 The examples in this chapter provide insight into the methodology of discrete system
simulation and the descriptive statistics used for predicting system performance.
 The simulations in comprises of three steps:
1. Determine the characteristics of each of the inputs to the simulation that may be
modeled as probability distributions, either continuous or discrete.
2. Construct a simulation table. Each simulation table is different for different
problem at hand. A simulation table consists of p inputs, xij; j = 1; 2; . . . ; p, and
one response, yi, for each of repetitions i = 1; 2; . . . ; n. Initialize the table by
filling in the data for iteration 1.
3. For each iteration i , generate a value for each of the p inputs, and evaluate the
function, calculating a value of the response yi . The input values may be
computed by sampling values from the distributions determined in step 1. A
response typically depends on the inputs and one or more previous responses.
2.2 Simulation of Queueing Systems

[M-05,06, 08,09,11,13,14, N-04, D-06,07,09,11,12]
Simulation is often used in the analysis of queuing models. In a simple but typical queuing
model, shown in Figure 2.1, customers arrive from time to time and join a queue or waiting
line, are eventually served by the server and finally leave the system.
Figure 2.1 Queueing System
Characteristics of Queueing Systems

The key elements of a queuing system are the customers and servers.
The term customer can refer to people, machines, trucks, mechanics, patients, airplanes, e-
mail, cases, orders, or dirty clothes-anything that arrives at a facility and requires service.
The term server might refer to receptionists, mechanics, tool-crib clerks, medical personnel,
automatic storage and retrieval machines (e.g., cranes), runaways at an airport, automatic
packers, etc which provides the requested service.
A queueing system is described by its calling population, the nature of the arrivals, the
service mechanism, the system capacity, and the queueing discipline. These attributes of a
queueing system are described in detail below.
2.2.1 The calling population

 The population of potential customers is referred to as the calling population.
 It may be assumed to be finite or infinite.
 Finite calling population model: The arrival rate to the queuing system does depend
on the number of customers being served and waiting. Examples of a finite calling
population are a repair facility in a shop, where there is a fixed number of machines
available to be worked on, a trucking terminal that services a fleet of a specific
number of trucks, or a nurse assigned to attend to a specific number of patients.
 Infinite calling population model: The arrival rate (i.e., the average number of
arrivals per unit time) is not affected by the number of customers who have left the
calling population and joined the queuing system. When the arrival process is
homogeneous over time (e.g., there are no “rush hours”), the arrival rate is usually
assumed to be constant. Examples of infinite calling population include the potential
customers of the restaurant, bank, or other similar service facility and also very large
group of machines serviced by a technician.
 In systems with large population of potential customers, the calling population is
usually assumed to be infinite.
1.2.2 System capacity

 In many queuing systems there is a limit to the number of customers that may be in
the waiting line or system.
 When a system has limited capacity, a distinction is made between the arrival rate
(i.e., the number of arrivals per time unit) and the effective rate (i.e., the number who
arrive and enter the system per time unit).
 For example,
1. (Limited capacity) - An automatic car wash may have room only for 10 cars to
enter the mechanism. It may be too dangerous or illegal for cars to wait in the
street. An arriving customer who finds the system full does not enter but returns
immediately to the calling population.
2. (Unlimited capacity) - Some systems, such as concert ticket sales for students,
may be considered as having unlimited capacity. There are no limits on the
number of students allowed to wait to purchase tickets.
2.2.3. The arrival process

 The arrival rate is the rate at which customers arrive at the service facility during a
specified period of time.
 This rate can be estimated from empirical data derived from studying the system or a
similar system, or it can be an average of these empirical data.
 For example, if 100 customers arrive at a store checkout counter during a 10-hour
day, we could say the arrival rate averages 10 customers per hour. However, although
we might be able to determine a rate for arrivals by counting the number of customers
during a specific time period, we would not know exactly when these customers
would arrive. It might be that no customers would arrive during one hour and 20
customers would arrive during another hour.
 Arrivals are assumed to be independent of each other and to vary randomly over time.
Arrivals may occur at scheduled times or random times.
 The most important model for random arrivals is the Poisson arrival process. Poisson
arrival process is used as a model for the arrival of people to restaurants, driving
banks and other service facilities like the arrival of telephone calls to a telephone
exchange, etc.
 Second important class of intervals is the scheduled arrivals. In this case the inter
arrival times may be constant, or constant plus or minus a small random amount to
represent early or late arrivals. For example, Patients to a physician‟s office,
scheduled airline flight arrivals to an airport.
 A third situation occurs when at least one customer is assumed to be always present in
the queue so that the server is never idle because of lack of customers. For example, a
customer may represent raw material for a product and sufficient raw material is
assumed to be always available.
2.2.4 Queue behavior and queue discipline

 Queue behavior refers to customer actions while in a queue waiting for service to
begin. There is a possibility that the incoming customers may
1. Balk: leave when they see that the line is too long.
2. Renege: leave after being in the line when they see that the line is moving too
slow.
3. Jockey: move from one line to another if they think they have chosen a slow line.
 Queue discipline refers to the logical ordering of customers in a queue and determines
which customer is chosen for service when the server becomes free.
 Queue disciplines can be FIFO (first in, first out), LIFO (last in, first out), SIRO
(service in random order), SPT (shortest processing time first), PR (priority service).
 In a job shop, queue disciplines are sometimes based on due dates and expected
processing time for a given type of job.
2.2.5 Service times and the service mechanism

 The service times of successive arrivals are denoted by S1, S2, and S3…
 They may be constant or of random duration.
 In case of random, {S1, S2, …} is usually characterized as a sequence of independent
and identically distributed random variables.
 The distributions like exponential, weibull, gamma, etc can be used as models for
service times.
 Sometimes services may be identically distributed for all customers of a given type or
class or priority, while customers of different types may have different service time
distributions.
 Sometimes, service times depend upon the time of day or length of waiting line.
 A queuing system consists of a number of service centers and interconnecting queues.

 Each service center consists some number of severs c, working in parallel.
 Parallel service mechanisms are either single server (c=1), multiple servers (1 < c <
∞) or unlimited servers (c=∞).
 A self service facility is usually characterized as having an unlimited number of
servers.
 Example: Consider a discount warehouse where customers may either serve
themselves or wait for one of the 3 clerks and finally leave after paying a single
cashier. The system is represented by the flow diagram in figure 2.2.The sub system
consisting of queue 2 and service center 2 is shown in more detail in the figure 2.3.
Figure 2.2 Discount warehouse with three service centers
Figure 2.3 Service center 2, with c=3 parallel servers
Single Channel Queuing Systems

 In a single channel queue the calling population is infinite.
 Arrivals for service occur one at a time in a random fashion; once they join the
waiting line, they are eventually served.
 The system capacity has no limit, the units are served in the order of their arrival
usually in FIFO for a single server or channel.
 Arrivals and services are defined by the distribution of the time between arrivals and
the distribution of the service times respectively.
 Some concepts of queuing system are:
State of the system – The number of units in the system and the status of the server-
busy or idle
Event – Set of circumstances that cause an instantaneous change in the system. There
are only 2 possible events that can affect the state of the system. They are the entry of
a unit in the system (the arrival event) or the completion of service on a unit (the
departure unit).
Simulation Clock – used to track simulated time.
 The queuing system includes the server, the unit being serviced and the units in the
queue. If a unit has just completed the service, the simulation proceeds as shown in
the figure 2.4.
Figure 2.4 Service-just-completed flow diagram
 The arrival event occurs when the unit enters the system. The flow diagram for the
arrival event is shown in the figure 2.5.
Figure 2.5 Unit-entering-system flow diagram
 The unit may find the server either idle or busy; therefore it begins service
immediately or enters the queue for the server. This course of action is shown in
figure 2.6.
Queue status
Not empty Empty
Server Busy Enter queue Enter queue
status Idle Impossible Enter service
Fig 2.6 Potential unit actions upon arrival
 After the completion of the service, the server may become idle or remain busy with
the next unit. This relationship of the server outcomes to the status of the queue is
shown in figure 2.7.
Queue status
Not empty Empty
Server Busy Impossible
outcomes Idle Impossible
Figure 2.7 Server outcomes after service completion
 If the queue is not empty, another unit will enter the server and it will be busy. If the
queue is empty, the server will be idle after a service is completed. These two
possibilities are shown in the shaded portion of the figure 2.7.
 Simulation clock times for arrivals and departures are computed in a simulation table
customized for each problem.
 In simulation, events usually occur at random times.
 The randomness needed to imitate real life is made possible through the use of
random numbers.
 Random numbers are distributed uniformly and independently on the interval (0, 1).
 Random digits are uniformly distributed on the set {0, 1, 2,…, 9}.
 Random digits can be used to form random numbers by selecting the proper number
of digits for each random number and placing a decimal point to the left of the value
selected.
 The proper number of digits is dictated by the accuracy of the data being used for
input purposes.
 If the input distribution has values with two decimal places, two digits are taken from
a random-digits table (such as Table A.1) and the decimal point is placed to the left to
form a random number.
 In a single-channel queueing system inter-arrival times and service times are
generated from the distributions of these random variables.
Steps to perform Manual Simulation of Single Channel Queue

1. Generate the arrivals using a set of uniformly distributed random numbers.
2. We use random digits (from Table A.1) to generate the random numbers.
3. We start at random position in the random digit table and proceed in proper direction,
without using the same stream of digits.
4. Determine the inter-arrival time and service time.
5. Fill all the cells for the first arriving customer initially.
6. After filling the first customer details, subsequent rows in the table are filled based on
the random numbers for inter-arrival time and service time and the completion of
service time of the previous customer.
7. Collect the measures of the performance of the system from the table columns and
generate the summary statistics.
Example 2.1 (Single Channel Queue Simulation Problem) [S-15, D-16,17]

Consider a single server system. Let the arrival distribution be uniformly distributed between
1 and 10 minutes and the service time distribution is as follows:
Service Time (Min) 1 2 3 4 5 6
Probability 0.04 0.20 0.10 0.26 0.35 0.05
Develop the simulation table and analyze the system by simulating the arrival and service of
10 customers. Random digits for inter-arrival time and service times are as follows:
Customer 1 2 3 4 5 6 7 8 9 10
R.D. for Inter-arrival Time -- 853 340 205 99 669 742 301 888 444
R.D. for Service Time 71 59 12 88 97 66 81 35 29 91
Solution:
Given that the arrival distribution is uniformly distributed between 1 and 10 minutes. Hence
the probability of occurrence of each inter-arrival time is the same which would be 0.100.
Using this we can generate the distribution of inter-arrival time and assign random digits to it
as shown in table 2.1 below:
Random-
Time Between Cumulative
Probability Digit
Arrivals(minutes) Probability
Assignment
1 0.100 0.100 001-100
2 0.100 0.200 101-200
3 0.100 0.300 201-300
4 0.100 0.400 301-400
5 0.100 0.500 401-500
6 0.100 0.600 501-600
7 0.100 0.700 601-700
8 0.100 0.800 701-800
9 0.100 0.900 801-900
10 0.100 1.000 901-000
Table 2.1 Distribution of time between arrivals
Similarly, assign random digits to the service time as shown in table 2.2 below.
Random-
Service Time Cumulative
Probability Digit
(minutes) Probability
Assignment
1 0.04 0.04 01-04
2 0.20 0.24 05-24
3 0.10 0.34 25-34
4 0.26 0.60 35-60
5 0.35 0.95 61-95
6 0.05 1.00 96-00
Table 2.2 Service time distributions
First, initialize the table by entering the details for the first customer. Here, we are assuming
that the first customer is arriving at time 0. Service of first customer begins immediately since
nobody is present in the system and gets over at time 5. So we can say that the customer
spends 5 minutes in the system. After filling the first customer details, subsequent rows in the
table are filled based on the random numbers for inter-arrival time and service time and the
completion of service time of the previous customer.
The simulation table for single server system is given in table 2.3 below.
Customer Random Time Arrival Random Service Time Time Time Time Idle
Digits Between Time Digits Time Service Customer Service Customer Time
For Arrival For Begins Waits in Ends Spends In Of
Arrival Service Queue System Server
1 -- -- 0 71 5 0 0 5 5 0
2 853 9 9 59 4 9 0 13 4 4
3 340 4 13 12 2 13 0 15 2 0
4 205 3 16 88 5 16 0 21 5 1
5 99 1 17 97 6 21 4 27 10 0
6 669 7 24 66 5 27 3 32 8 0
7 742 8 32 81 5 32 0 37 5 0
8 301 4 36 35 4 37 1 41 5 0
9 888 9 45 29 3 45 0 48 3 4
10 444 5 50 91 5 50 0 55 5 2
Ʃ =50 Ʃ =44 Ʃ =8 Ʃ =52 Ʃ =11
Table 2.3 Simulation Table for Single Server Queue
Output Statistics:
( )
( )
( ) ( )
∑ ( ) ( ) ( ) ( ) ( ) ( ) ( )
( )
( )
( )
Example 2.2 [M-10,17, D-07,08,14]

Calculate the output statistics for the queueing system whose inter-arrival and service times
for ten arrivals are given below:
Inter-arrival time -- 8 6 1 8 3 8 7 2 3
Service time 4 1 4 3 2 4 5 4 5 3
Solution:
The simulation table for single server system is given in table below.
Customer Time Arrival Service Time Time Time Time Idle
Between Time Time Service Customer Service Customer Time
Arrival Begins Waits in Ends Spends In Of
Queue System Server
1 -- 0 4 0 0 4 4 0
2 8 8 1 8 0 9 1 4
3 6 14 4 14 0 18 4 5
4 1 15 3 18 3 21 6 0
5 8 23 2 23 0 25 2 2
6 3 26 4 26 0 30 4 1
7 8 34 5 34 0 39 5 4
8 7 41 4 41 0 45 4 2
9 2 43 5 45 2 50 7 0
10 3 46 3 50 4 53 7 0
Ʃ =46 Ʃ =33 Ʃ =9 Ʃ =40 Ʃ =18
Output Statistics:
( )
( )
( )
( )
( )
Able-Baker Carhop Problem (Multi Server Simulation Problem) [M-12]

 In this simulation problem we have more than one server.
 Consider a drive-in restaurant where carhops take orders and bring food to the car.
 Cars arrive in random fashion as per the distribution of inter-arrival time of cars.
 There are two carhops- Able and Baker.
 Able is better able to do the job and works a bit faster than Baker.
 These two carhops serve the customer according to their service times distribution.
 When a customer arrives, if he finds Able free, the customer is immediately served by
Able.
 If Able is busy, and Baker is idle, then the customer starts service immediately with
Baker.
 If both are busy, the customer starts service with the first server to become free.
 If both carhops are free, Able will only get the customer.
Example 2.3
Consider a drive in restaurant where carhops take order and bring food to the car. Cars arrive
according to the inter-arrival distribution of cars. There are two carhops, Able and Baker.
Able is better able to do the job and works a bit faster than Baker. The distribution of their
service time is also given.
Time Between Arrivals (min) 2 3 4 5 6

Probability 0.18 0.25 0.27 0.17 0.13
Able’s Baker’s
Service 2 3 4 5 Service 3 4 5 6
Time Time
Probability 0.17 0.24 0.29 0.30 Probability 0.18 0.22 0.30 0.30
10 customers. Random digits for inter-arrival time and service time are as follows:
Customer 1 2 3 4 5 6 7 8 9 10
R.D. for Interarrival Time -- 32 66 41 21 37 79 18 60 98
Solution:
Table 2.4: Inter-arrival Time Distribution

Time Between
Cumulative Random Digit
Arrivals Probability
Probability Assignment
(Minutes)
2 0.18 0.18 01 – 18
3 0.25 0.43 19 – 43
4 0.27 0.70 44 – 70
5 0.17 0.87 71 – 87
6 0.13 1.00 88 – 00
Table 2.5: Able's Service Time Distribution

Service Time Probability
2 0.17 0.17 01 – 17
3 0.24 0.41 18 – 41
4 0.29 0.70 42 – 70
5 0.30 1.00 71 – 00
Table 2.6: Baker's Service Time Distribution

Service Time Probability
3 0.18 0.18 01 – 18
4 0.22 0.40 19 – 40
5 0.30 0.70 41 – 70
6 0.30 1.00 71 – 00
Table 2.7 Simulation Table for Carhop Problem

Customer Random Inter- Arrival Random Able Baker Time
Digit arrival Time Digit for in
Time Service Time Time Service Time
For Time (Min) Service Queue
Service Time Service Service Time Service
Arrival
Begins Ends Begins Ends
1 -- -- 0 49 0 4 4 -- -- -- 0
2 32 3 3 53 -- -- -- 3 5 8 0
3 66 4 7 34 7 3 10 -- -- -- 0
4 41 3 10 17 10 2 12 -- -- -- 0
5 21 3 13 30 13 3 16 -- -- -- 0
6 37 3 16 52 16 4 20 -- -- -- 0
7 79 5 21 22 21 3 24 -- -- -- 0
8 18 2 23 62 -- -- -- 23 5 28 0
9 60 4 27 56 27 4 31 -- -- -- 0
10 98 6 33 73 33 5 38 -- -- -- 0
∑= ∑= ∑=
28 10 0
2.3 Simulation of Inventory Systems [M-06,08,13,14,16 D-06,07,09,11,12,13,17]

 Inventory system is one of the important classes of simulation problems.
 This inventory system has a periodic review of length N, at which time the inventory
is checked.
 At each review, an order is made to bring the inventory up to the level M (Maximum
inventory).
 At the end of first review period, an order quantity Q1 is placed.
 The lead time (length of time between the placement and receipt of an order) is zero.
 A simple inventory system is shown in figure 2.8.
Figure 2.8 Probabilistic Order-Level Inventory System

 Demand is shown as being uniform over the time period in figure 2.8.
 Actually demands are not usually uniform, they fluctuate over time.
 One possibility is that demands all occur at the beginning of the cycle. Another is that
the lead time is random of some positive length.
 In the second cycle the amount in inventory drops below zero, indicating shortage. In
figure 2.8 these units are backordered; when the order arrives, the demands for the
backordered items are satisfied first.
 To avoid shortages, a buffer, or safety stock would need to be carried.
 Carrying stock in inventory might affect the cost attribute. Some of the scenarios are
1. Holding Cost: The costs can be interest paid on the funds borrowed to buy the
items, renting of storage space, hiring guards etc.
2. Ordering Cost: Carrying high inventory causes more frequent reviews and
consequently, more frequent purchases which lead to ordering cost.
3. Shortage Cost: Carrying short inventory may cause loss of good will to
customers.
 Larger inventories decreases the possibilities of shortages but these costs must be
traded off in order to minimize the total cost of an inventory system.
 The total cost (profit) of an inventory system is the measure of performance which is
affected by changing M and N.
 In an (M, N) inventory systems, the events that may occur are:
1. the demand for items in the inventory,
2. the review of the inventory position and
3. the receipt of an order at the end of each review period.
 When the lead time is zero, as in figure 2.8, the last two events occur simultaneously.
Example 2.4
Suppose that the maximum inventory level, M, is 11 units and the review period, N, is 5 days.
The distribution of the number of units demanded per day is given in table 2.8.
Demand 0 1 2 3 4
Probability 0.10 0.25 0.35 0.21 0.09
Table 2.8: Probability distribution of the number of units demanded per day
Lead time is a random variable as shown in table 2.9.
Lead Time (Day) 1 2 3
Probability 0.6 0.3 0.1
Table 2.9: Probability distribution of lead time
Assume that orders are placed at the close of business and are received for the inventory at
the beginning of business as determined by lead time. Also assume that the initial inventory is
3 units. Estimate the simulation for 5 cycles, the average ending units in the inventory and the
number of days when a shortage condition occurs.
Random digits for demand and lead time are given in table 2.10 and table 2.11.
Random digit for 26 68 33 39 86 18 64 79 55 74 21 43 49
demand 90 35 08 98 61 85 81 53 15 94 19 44
Table 2.10: Random digit for demand
Random digit for lead time 8 7 2 3 1
Table 2.11: Random digit for lead time
Solution:
Table 2.12: Random Digit Assignment for Daily Demand
Demand Probability Cumulative Probability Random Digit Assignment
0 0.10 0.10 01 – 10
1 0.25 0.35 11 – 35
2 0.35 0.70 36 – 70
3 0.21 0.91 71 – 91
4 0.09 1.00 91 – 00
Table 2.13: Random Digit Assignment for Lead Time

Lead Time (Day) Probability Cumulative Probability Random Digit Assignment
1 0.6 0.6 1–6
2 0.3 0.9 7–9
3 0.1 1.0 0
Table 2.14: Simulation Table for (M, N) Inventory System

Cycle Day Beginning Random Demand Ending Shortage Order Random Days
Inventory Digit Inventory Quantity Quantity Digit for until
For Lead Order
Demand Time Arrives
1 1 3 26 1 2 0 - - 1
2 2 68 2 0 0 - - 0
3 8 33 1 7 0 - - -
4 7 39 2 5 0 - - -
5 5 86 3 2 0 9 8 2
2 1 2 18 1 1 0 - - 1
2 1 64 2 0 1 - - 0
3 9 79 3 5 0 - - -
4 5 55 2 3 0 - - -
5 3 74 3 0 0 11 7 2
3 1 0 21 1 0 1 - - 1
2 0 43 2 0 3 - - 0
3 11 49 2 6 0 - - -
4 6 90 3 3 0 - - -
5 3 35 1 2 0 9 2 1
4 1 2 08 0 2 0 - - 0
2 11 98 4 7 0 - - -
3 7 61 2 5 0 - - -
4 5 85 3 2 0 - - -
5 2 81 3 0 1 11 3 1
5 1 0 53 2 0 3 - - 0
2 11 15 1 7 0 - - -
3 7 94 4 3 0 - - -
4 3 19 1 2 0 - - -
5 2 44 2 0 0 11 1 1
∑ = 64
Based on five cycles of simulation, the average ending inventory is approximately, 2.56
(64/25) units. On 5 of 25 days a shortage condition occurs.
Example 2.5 [M-07,09,15,17]

Perform the simulation of the inventory system. Daily demand is represented by the random
number 4, 1, 8, 5, 2 and the demand probability is given by
Demand Probability
0 0.2
1 0.5
2 0.3
If the initial inventory is 4 units, determine on which day the shortage condition occurs.
Solution:
0 0.2 0.2 1-2
1 0.5 0.7 3-7
2 0.3 1.0 8-0
Table 2.16: Simulation Table to Determine Shortage Condition

Day Beginning Random Demand Ending Shortage
Inventory Digits for Inventory Quantity
Demand
1 4 4 1 3 0
2 3 1 0 3 0
3 3 8 2 1 0
4 1 5 1 0 0
5 0 2 0 0 0
Hence, no shortage occurs in the inventory.
Example 2.6 [D-11]

Perform the simulation of the inventory system. Daily demand is represented by the random
number 4, 3, 8, 2, 5 and the demand probability is given by
Demand Probability
0 0.2
1 0.5
2 0.3
If the initial inventory is 4 units, determine on which day the shortage condition occurs.
Solution:
0 0.2 0.2 1-2
1 0.5 0.7 3-7
2 0.3 1.0 8-0
Table 2.18: Simulation Table to Determine Shortage Condition

Day Beginning Random Demand Ending Shortage
Inventory Digits for Inventory Quantity
Demand
1 4 4 1 3 0
2 3 3 1 2 0
3 2 8 2 0 0
4 0 2 0 0 0
5 0 5 1 0 1
th
Hence, shortage occurs on 5 day of the inventory.
Example 2.7 (The Newspaper Seller’s Problem)

A classical inventory problem concerns the purchase and sales of newspapers. The paper
seller buys the paper for 33 cents each and sells them for 50 cents each. Newspapers not sold
at the end of the day are sold as scrap for 5 cents each. Newspapers can be purchased in
bundles of 10. Thus, the paper seller can buy 50, 60 and so on. There are 3 types of newsdays,
„good‟, „fair‟ and „poor‟ with probabilities of 0.35, 0.45, and 0.20 respectively. The
distribution of papers demanded on each of these days is given in table 2.19. The problem is
to determine the optimal number of papers the newspaper seller should purchase. Simulate
demands for 10 days and record profits from sales each day.
Table 2.19 Distribution of newspapers demanded

Demand Probability Distribution
Demand
Good Fair Poor
40 0.03 0.10 0.44
50 0.05 0.18 0.22
60 0.15 0.40 0.16
70 0.20 0.20 0.12
80 0.35 0.08 0.06
90 0.15 0.04 0.00
100 0.07 0.00 0.00
The profits are given by the following relationship
[( + ( + ( , ( ,]
Assume that purchase of 70 newspapers is done each day.

Solution:
Note→ 1$ = 100 cents
 The revenue from sales is 50 cents for each paper sold. The cost of newspapers is 33
cents for each paper purchased.
 The lost profit from excess demand is 50 – 33 = 17 cents for each paper demanded
that could not be provided.
 The salvage value of scrap papers is 5 cents each.
 Tables 2.20 and 2.21 provide the random digits for the types of newsdays and the
demands for those days.
 To solve this problem by simulation, requires setting a policy of buying a certain
number of papers each day, then simulating the demands for papers over the 10-day
time period to determine the total profit. Here simulation is carried out for 70
newspapers and is shown in the table 2.22.
 The policy (number of newspapers purchased) is changed to other values and the
simulation is repeated until the best value is found.
Table 2.20 Random - digit assignment for type of newsday
Cumulative Random-digit
Type of Newsday Probability
probability assignment
Good 0.35 0.35 01-35
Fair 0.45 0.80 36-80
Poor 0.20 1.00 81-00
Table 2.21 Random – Digit assignments for newspapers demanded

Cumulative Distribution Random – Digit assignment
Demand
Good Fair Poor Good Fair Poor
40 0.03 0.10 0.44 01-03 01-10 01-44
50 0.08 0.28 0.66 04-08 11-28 45-66
60 0.23 0.68 0.82 09-23 29-68 67-82
70 0.43 0.88 0.94 24-43 69-88 83-94
80 0.78 0.96 1.00 44-78 89-96 95-00
90 0.93 1.00 1.00 79-93 97-00
100 1.00 1.00 1.00 94-00
 Cost of 70 newspapers = 70 * 33 cents = 2310 cents = $23.10

 On day 1 the demand is 60 newspapers. The revenue from the sale of 60 newspapers
is $30. Ten newspapers are left at the end of the day. The salvage value at 5 cents
each is 50 cents. The profit for the first day is
Profit = $30.00 - $23.10 – 0 + $0.50 = $7.40
 th
On 5 day the demand is greater than the supply. The revenue from sales is $35, since
only 70 papers are available under this policy. An additional 20 papers could have
been sold. Thus a lost profit of $3.40 (20 * 17 cents) is assessed. The daily profit is
determined as follows
Profit = $35.00 - $23.10 - $3.40 + 0 = $8.50
 The profit for the 10-day period is the sum of the daily profits $85.2. It can also be
computed from the totals for the 10 days of simulation as
Total profit = $320.00 - $231.00 - $6.80 + $3.0 = $85.20.
 This simulation is repeated by changing the values for the policy (buying number of
newspapers) like 60, 80 and so on. The best policy is obtained by comparing all the
profits.
Table 2.22 Simulation Table for Purchase of 70 Newspapers

Day Random Type Random Demand Revenue Lost Salvage Daily
Digits of Digits From Profit From Profit
for Type Newsday for Sale From Sale of ($)
of Demand ($) Excess Scrap
Newsday Demand ($)
($)
1 94 Poor 80 60 30 -- 0.5 7.4
2 77 Fair 20 50 25 -- 1.0 2.9
3 49 Fair 15 50 25 -- 1.0 2.9
4 45 Fair 88 70 35 -- -- 11.9
5 43 Fair 98 90 35 3.4 -- 8.5
6 32 Good 65 80 35 1.7 -- 10.2
7 49 Fair 86 70 35 -- -- 11.9
8 00 Poor 73 60 30 -- 0.5 7.4
9 16 Good 24 70 35 -- -- 11.9
10 24 Good 60 80 35 1.7 -- 10.2
∑ = $320 ∑ = $6.8 ∑ = $3 ∑ = $85.2
Comparing all the profits, we can suggest newspaper seller to buy 70 newspapers each day.
Example 2.8 [D-05, 10]

A baker is trying to determine how many dozens of bagels to bake each day. The probability
distribution of the number of customers/day is given in table 2.23.
Number of 8 10 12 14 16
Customers/day
Probability 0.35 0.25 0.20 0.14 0.06
Table 2.23 Probability distribution of number of customers/day
Customers order 1, 2, 3, 4 or 5 dozen bagels according to the probability distribution given in
table 2.24.
Number of 1 2 3 4 5
dozens
ordered/customer
Probability 0.2 0.3 0.2 0.1 0.2
Table 2.24 Probability distribution of bagels ordered by customer
Bagels sell for ₹ 4.40 per dozen. They cost ₹ 3.40 per dozen to make. All bagels not sold at
the end of the day are sold at half price to a local grocery store. Assume that the baker baked
20 dozens every day, simulate for 5 days and find out the total profit. Also mention your
suggestions to baker on the basis of current scenario. Random digits for number of
customers/day and dozens ordered/customer is given in table 2.25 and table 2.26 respectively.
R.D. for customers/day 34 53 97 72 20

Table 2.25 Random Digits for bagels customer/day
R.D. for
1 0 0 2 0 9 3 3 2 9 8 7 5 3 6 3 4 9 6 3 1 0 7 2 8 7 4
Dozens
Ordered/
8 5 1 7 1 3 5 0 4 0 1 8 2 4 5 9 2 6 3 0 9 0 0 8 1 5 6
Customer
Table 2.26 Random Digits for dozens ordered/customer
Solution:
Table 2.27 Random Digit Assignment for Number of Customers/Day
Number of Probability Cumulative Random-Digit
Customers/Day Probability Assignment
8 0.35 0.35 01 – 35
10 0.25 0.60 36 – 60
12 0.20 0.80 61 – 80
14 0.14 0.94 81 – 94
16 0.06 1.00 95 – 00
Table 2.28 Random Digit Assignment for Number of Dozens Ordered/Customer

Number of Dozens Probability Cumulative Random-Digit
Ordered/Customer Probability Assignment
1 0.2 0.2 1–2
2 0.3 0.5 3–5
3 0.2 0.7 6–7
4 0.1 0.8 8
5 0.2 1.0 9–0
The profit on each day is:
[( + ( + ( , ( )]
Given that:
Cost price of 1 dozen bagels = ₹ 3.40
Selling price of 1 dozen bagels = ₹ 4.40
Table 2.29 Simulation Table for 5 Days Sale of Bagels

TOTAL
REVENUE
R.D. NO. NO. NO. OF
D R.D. REVENUE FROM
FOR OF OF DOZENS LOST DAILY
A FOR FROM GROCERY
CUST/ CUST/ DOZEN/ SOLD ON PROFIT PROFIT
Y DEMAND SALE STORE
DAY DAY CUSTOMER EACH
SALE
DAY
1 1
0 5
0 5
2 1
1 34 8 20 ₹ 88.00 ₹0 ₹ 7.00 ₹ 13.00
0 5
9 5
6 3
3 2
2 1
9 5
8 4
2 53 10 7 3 20 ₹ 88.00 ₹0 ₹ 9.00 ₹ 11.00
5 2
3 2
6 3
3 2
4 2
9 5
6 3
3 2
1 1
0 5
7 3
2 1
8 4
7 3
3 97 16 20 ₹ 88.00 ₹0 ₹ 19.00 ₹ 1.00
4 2
8 4
5 2
1 1
7 3
1 1
3 2
5 1
0 5
4 2
0 5
1 1
8 4
2 1
4 72 12 20 ₹ 88.00 ₹0 ₹ 13.00 ₹ 7.00
4 2
5 2
9 5
2 1
6 3
3 2
0 5
9 5
0 5
0 5
5 20 8 20 ₹ 88.00 ₹0 ₹ 10.00 ₹ 10.00
8 4
1 1
5 2
6 3
Total Profit = ₹ (13+11+1+7+10) = ₹ 42. Looking at the trends, the baker should bake more
than 20 dozens of bagels.
2.4 A Reliability Problem [D-06,09,11,12,16]

 A large milling machine has three different bearings that fail in service.
 The cumulative distribution function of the life of each bearing is identical.
 When a bearing fails, the mill stops, a repairperson is called, and a new bearing is
installed.
 The delay time of the repairperson's arriving at the milling machine is also a random
variable.
 Downtime for the mill is estimated at ₹ x per minute.
 The direct on-site cost of the repairperson is ₹ y per hour.
 It takes „n1‟ minutes to change one bearing, „n2‟ minutes to change two bearings, and
„n3‟ minutes to change three bearings, where n1 < n2 < n3.
 The bearings cost ₹ z each.

 Current Method: If a bearing fails only that bearing is replaced which means that
only one bearing is changed at any breakdown.
 Proposed Method: Replace all three bearings whenever a bearing fails.
 Management needs an evaluation of this proposal.
Example 2.9
A large milling machine has three different bearings that fail in service. The cumulative
distribution function of the life of each bearing is identical, as shown in Table 2.30.
Bearing Life (Hours) 1000 1100 1200 1300 1400 1500
Probability 0.20 0.30 0.25 0.10 0.08 0.07
Table 2.30 Probability Distribution of the life of bearing
When a bearing fails, the mill stops, a repairperson is called, and a new bearing is installed.
The delay time of the repairperson's arriving at the milling machine is also a random variable,
with the distribution given in Table 2.31.
Delay Time (Minutes) 5 10 15
Table 2.31 Probability Distribution of delay
Downtime for the mill is estimated at ₹5 per minute. The direct on-site cost of the
repairperson is ₹15 per hour. It takes 20 minutes to change one bearing, 30 minutes to change
two bearings, and 40 minutes to change three bearings. The bearings cost ₹16 each. A
proposal has been made to replace all three bearings whenever a bearing fails. Management
needs an evaluation of this proposal. Run the simulation for 10,000 hours of operation.
Solution:
Table 2.32 Random Digit Assignment for Distribution of the life of bearing
Bearing Probability Cumulative Random-
Life Probability Digit
(Hours) Assignment
1000 0.20 0.20 01 – 20
1100 0.30 0.50 21 – 50
1200 0.25 0.75 51 – 75
1300 0.10 0.85 76 – 85
1400 0.08 0.93 86 – 93
1500 0.07 1.00 94 – 00
Table 2.33 Random Digit Assignment for Distribution of delay

Delay Probability Cumulative Random
Time Probability Digit
(Minutes) Assignment
5 0.4 0.4 1–4
10 0.5 0.9 5–9
15 0.1 1.0 0
Current Method
Table 2.34 represents the simulation table for current method. The simulation runs for 10,000
hours of life time of bearing. As the accumulated life time of all the three bearings reach to
10,000 hours or more, we stop the simulation. In current method, we replace only one
bearing.
Table 2.34 Simulation table for current method

No. Bearing 1 Bearing 2 Bearing 3
RD Life Accumu RD Delay RD Life Accumu RD Delay RD Life Accumu RD Delay
For (Hrs) lated For (Min) For (Hrs) lated For (Min) For (Hrs) lated For (Min)
Life Life Delay Life Life Delay Life Life Delay
(Hrs) (Hrs) (Hrs)
1 65 1200 1200 3 5 72 1200 1200 5 10 86 1400 1400 3 5
2 46 1100 2300 5 10 56 1200 2400 4 5 96 1500 2900 8 10
3 89 1400 3700 2 5 96 1500 3900 3 5 74 1200 4100 6 10
4 74 1200 4900 0 15 87 1400 5300 2 5 66 1200 5300 5 10
5 23 1100 6000 1 5 54 1200 6500 8 10 78 1300 6600 2 5
6 12 1000 7000 6 10 35 1100 7600 6 10 98 1500 8100 0 15
7 36 1100 8100 2 5 66 1200 8800 7 10 64 1200 9300 4 5
8 07 1000 9100 3 5 58 1200 10000 1 5 85 1300 10600 7 10
9 42 1100 10200 4 5 33 32
10 12 21 46
Ʃ= Ʃ= Ʃ=
65 60 70
At the end of simulation for 10,000 hours, 25 bearings were replaced.
The total cost of the current method is computed as follows:
Cost of bearings = 25 bearings * ₹16 per bearing = ₹400
Cost of delay = (65 + 60 + 70) minutes * ₹5 per minute = ₹975
Cost of downtime during repair = 25 bearings * ₹5 per minute * 20 minutes per bearing
= ₹2500
Cost of repairperson = 25 bearings * 20 minutes per bearing * ₹ 15/60 minutes
= ₹125
Total cost = ₹ (400 + 975 + 2500 + 125) = ₹4000
Proposed Method:
This method is given in table 2.35. In this method the bearing life time is assumed to be same
as that used for current method. As proposed method requires more bearings than the current
method, the simulation requires a new set of random digits for generating the additional life
times of bearings. Additional life times required at 9th replacement of bearing 2 and 3.
In proposed method, we replace all the three bearings at a time when first failure occurred.
The repairperson delay is generated independently using different sets of random digits.
Table 2.35 Simulation table for proposed method

No. Bearing 1 Bearing 2 Bearing 3 First Accumulated Random Delay
Life Life Life Failure Life Digit for (Min)
(Hrs) (Hrs) (Hrs) (Hrs) (Hrs) Delay
1 1200 1200 1400 1200 1200 2 5
2 1100 1200 1500 1100 2300 6 10
3 1400 1500 1200 1200 3500 9 10
4 1200 1400 1200 1200 4700 0 15
5 1100 1200 1300 1100 5800 1 5
6 1000 1100 1500 1000 6800 3 5
7 1100 1200 1200 1100 7900 5 10
8 1000 1200 1300 1000 8900 8 10
9 1100 33/1100 32/1100 1100 10000 7 10
Ʃ = 80
At the end of simulation for 10,000 hours, 9 sets of bearings were replaced.
The total cost of the current method is computed as follows:
Cost of bearings = 27 bearings * ₹16 per bearing = ₹432
Cost of delay = 80 minutes * ₹5 per minute = ₹400
Cost of downtime during repair = 9 sets * ₹5 per minute * 40 minutes per set
= ₹1800
Cost of repairperson = 9 sets * 40 minutes per set * ₹ 15/60 minutes
= ₹90
Total cost = ₹ (432 + 400 + 1800 + 90) = ₹2722
Hence, we can conclude that the proposed method has a saving of ₹1278 over a period of
10,000 hours simulation.
2.5 Lead Time Demand

 Lead-time demand may occur in an inventory system when the lead time is not
instantaneous.
 The lead time is the time from placement of an order until the order is received.
 In a realistic situation, lead time is a random variable.
 During the lead time, demands also occur at random.
 Lead-time demand is thus a random variable defined as the sum of the demands over
the lead time.
 ∑ ,
where i is the time period of the lead time, i = 0, 1, 2,…
Di is the demand during the ith time period; and
T is the lead time.
 The distribution of lead-time demand is determined by simulating many cycles of lead
time and building a histogram based on the results.
Example 2.10
A firm sells bulk rolls of newsprint. The daily demand is given by the following probability
distribution:
Daily Demand (Rolls) 3 4 5 6
Probability 0.20 0.35 0.30 0.15
Lead time is a random variable given by the following distribution:
Lead Time (Days) 1 2 3

Determine the lead-time demand for 5 cycles of simulation.
Solution:
Table 2.36 Random Digit Assignment for Demand

Daily Demand (Rolls) Probability Cumulative Probability Random Digit Assignment
3 0.20 0.20 01 – 20
4 0.35 0.55 21 – 55
5 0.30 0.85 56 – 85
6 0.15 1.00 86 – 00
Table 2.37 Random Digit Assignment for Lead Time

Lead Time (Days) Probability Cumulative Probability Random Digit Assignment
1 0.36 0.36 01 – 36
2 0.42 0.78 37 – 78
3 0.22 1.00 79 - 00
Table 2.38 Simulation Table for Lead-Time Demand

Cycle Random Lead Time Random Demand Lead-Time
Digit for (Days) Digit for (Rolls) Demand
Lead Time Demand
34 4
1 46 2 9
56 5
26 4
2 75 2 9
68 5
88 6
3 86 3 19 3 13
43 4
4 27 1 46 4 4
92 6
5 63 2 10
36 4
The histogram might appear as shown below:
3.5
3
3
Relative Frequency
2.5
1.5
1 1
1
0.5
0
0
0-3 4-7 8 - 11 12 - 15
Lead-Time Demand
■■■
CHAPTER 3
GENERAL PRINCIPLES
This chapter develops a common framework for the modeling of complex systems using
discrete-event simulation. It covers the basic building blocks of all discrete-event simulation
models: entities and attributes, activities and events. In discrete-event simulation, a system is
modeled in terms of its state at each point in time; the entities that pass through the system
and the entities that represent system resources; and the activities and events that cause
system state to change. Discrete-event models are appropriate for those systems for which
changes in system state occur only at discrete points in time.
3.1 Concepts in Discrete-Event Simulation [M-05,06,07,10,11,14,15 D-05,07,11]

The major concepts are briefly defined as follows:
 System: A collection of entities (Example: people and machines) that interact
together over time to accomplish one or more goals.
 Model: An abstract representation of a system, usually containing structural, logical,
or mathematical relationships which describe a system in terms of state, entities and
their attributes, sets, processes, events, activities, and delays.
 System state: A collection of variables that contain all the information necessary to
describe the system at any time.
 Entity: Any object or component in the system which requires explicit representation
in the model (Example: a server, a customer, a machine).
 Attributes: The properties of a given entity (Example: the priority of a waiting
customer, the routing of a job through a job shop).
 List: A collection of (permanently or temporarily) associated entities, ordered in some
logical fashion (such as all customers currently in a waiting line, ordered by first
come, first served, or by priority).
 Event: An instantaneous occurrence that changes the state of a system (such as an
arrival of a new customer).
 Event notice: A record of an event to occur at the current or some future time, along
with any associated data necessary to execute the event; at a minimum, the record
includes the event type and the event time.
 Event list: A list of event notices for future events, ordered by time of occurrence;
also known as the future event list (FEL).
 Activity: A duration of time of specified length (Example: a service time or inter-
arrival time), which is known when it begins (although it may be defined in terms of a
statistical distribution).
 Delay: A duration of time of unspecified indefinite length, which is not known until it
ends (Example: a customer's delay in a last-in, first-out waiting line which, when it
begins, depends on future arrivals).
 Clock A variable that represents simulated time.
Difference between Delay and Activity [M-12, D-11]
Delay Activity
1. A delay's duration is not specified by 1. An activity‟s duration is defined by
the modeler ahead of time, but rather the modeler and is computable from
is determined by system conditions. its specification at the instant it
begins.
2. A delay is sometimes called a 2. An activity is called an unconditional
conditional wait. wait.
3. The completion of a delay is 3. The completion of an activity is an
sometimes called a conditional or event, often called a primary event,
secondary event, but such events are which is managed by placing an event
not represented by event notices, nor notice on the FEL.
do they appear on the FEL.
4. A delay's duration is measured and is 4. An activity typically represents a
one of the desired outputs of a model service time, an inter-arrival time, or
run. Typically, a delay ends when any other processing time.
some set of logical conditions
becomes true or one or more other
events occur.
3.2 World Views [M-10, D-07,09,11,14,17]

3.2.1 The Event-Scheduling/Time-Advance Algorithm
[M-08,09,11,12,13,15, D-05, 06, 08, 12,13,14,17]
 The mechanism for advancing simulation time and guaranteeing that all events occur
in correct chronological order is based on the future event list (FEL).
 This list contains all event notices for events that have been scheduled to occur at a
future time.
 Scheduling a future event means that at the instant an activity begins, its duration is
computed or drawn as a sample from a statistical distribution and the end-activity
event, together with its event time, is placed on the future event list.
 At any given time t, the FEL contains all previously scheduled future events and their
associated event times (called t1; t2; ….) as shown in figure 3.1.
Cumulative
Entities Future
System Set Set Statistics
Clock and … Event
State 1 2 and
Attributes List, FEL
Counters
t (x, y, z, …) (2, t1) – type 2 event
to occur at time t1
(3, t2) – type 3 event
to occur at time t2
. . . . . . .
. . . . . . .
Figure 3.1 Prototype System Snapshot at simulation time t
 The FEL is ordered by event time, meaning that the events are arranged
chronologically; that is, the event times satisfy
 Time t is the value of CLOCK, the current value of simulated time.

 The event associated with time t1 is called the imminent event; that is, it is the next
event that will occur.
 After the system snapshot at simulation time CLOCK = t has been updated, the
CLOCK is advanced to simulation time CLOCK = t1, and the imminent event notice
is removed from the FEL and the event executed.
 Execution of the imminent event means that a new system snapshot for time t1 is
created based on the old snapshot at time t and the nature of the imminent event.
 At time t1, new future events may or may not be generated, but if any are, they are
scheduled by creating event notices and putting them in their proper position on the
FEL.
 After the new system snapshot for time t1 has been updated, the clock is advanced to
the time of the new imminent event and that event is executed.
 This process repeats until the simulation is over.
 The sequence of actions which a simulator (or simulation language) must perform
to advance the clock and build a new system snapshot is called the event-
scheduling/time-advance algorithm.
Old system snapshot at time t

CLOCK System … Future Event List …
State
t (4, 2, 5) (2, t1) – Type 2 event to occur at time t1
(3, t2) – Type 3 event to occur at time t2
.
.
.
(4, tn) – Type 4 event to occur at time tn
Event –Scheduling / Time – Advance Algorithm

Step 1: Remove the event notice for the imminent event (event 2, time t1)
from FEL.
Step 2: Advance CLOCK to imminent event time t1.
Step 3: Execute imminent event, update system state, change entity attributes,
and set membership as needed.
Step 4: Generate future events (if necessary) and place their event notices on
FEL ranked by their event time. (Example: Event 1 to occur at time t*,
where t2 ≤ t* ≤ t3)
Step 5: Update cumulative statistics and counters.
New system snapshot at time t1

CLOCK System … Future Event List …
State
t1 (4, 2, 5) (3, t2) – Type 3 event to occur at time t2
(1, t*) – Type 1 event to occur at time t*
.
.
.
(4, tn) – Type 4 event to occur at time tn
Future Event Generation by an External Arrival Event in Queueing System

 At time 0, the first arrival event is generated and is scheduled on the FEL (meaning its
event notice is placed on the FEL). The inter-arrival time is an example of an activity.
 When the clock eventually is advanced to the time of this first arrival, a second arrival
event is generated.
 First, an inter-arrival time is generated, a*; it is added to the current time, CLOCK=t ;
the resulting (future) event time, t + a* = t*, is used to position the new arrival event
notice on the FEL.
 This method of generating an external arrival stream is called bootstrapping.
Future Event Generation by a Service Completion Event in Queueing System

 When one customer completes service, at current time CLOCK = t, if the next
customer is present, then a new service time, s*, will be generated for the next
customer.
 The next service-completion event will be scheduled to occur at future time t* = t +
s* by placing onto the FEL a new event notice of type service completion with event
time t*.
 In addition, a service-completion event will be generated and scheduled at the time of
an arrival event, provided that, upon arrival, there is at least one idle server in the
server group.
 A service time is an example of an activity.
 Beginning service is a conditional event, because its occurrence is triggered only on
the condition that a customer is present and a server is free.
 Service completion is an example of a primary event.
Future Events Generation by the alternate generation of runtimes and downtimes for a
machine subject to breakdowns
 At time 0, the first runtime will be generated and an end-of-runtime event scheduled.
 Whenever an end-of-runtime event occurs, a downtime will be generated and an end-
of-downtime event is scheduled on the FEL.
 When the CLOCK is eventually advanced to the time of this end-of-downtime event,
a runtime is generated and an end-of-runtime event is scheduled on the FEL.
 In this way, runtimes and downtimes continually alternate throughout the simulation.
A runtime and a downtime are examples of activities, and end of runtime and end of
downtime are primary events.
Stopping Event
Every simulation must have a stopping event, here called E, which defines how long the
simulation will run. There are generally two ways to stop a simulation:
 At time 0, schedule a stop simulation event at a specified future time TE. Thus, before
simulating, it is known that the simulation will run over the time interval [0, TE].
Example: Simulate a job shop for TE = 40 hours.
 Run length TE is determined by the simulation itself. Generally, TE is the time of
occurrence of some specified event E. Examples: TE is the time of the 100th service
completion at a certain service center. TE is the time of breakdown of a complex
system. TE is the time of disengagement or total kill (whichever occurs first) in a
combat simulation. TE is the time at which a distribution center ships the last carton in
a day's orders.
3.2.2 Process Interaction Approach

 A simulation analyst thinks in terms of processes.
 The analyst defines the simulation model in terms of entities or objects and their life
cycle as they flow through the system, demanding resources and queueing to wait for
resources.
 More precisely, a process is the life cycle of one entity. This life cycle consists of
various events and activities.
 Some activities may require the use of one or more resources whose capacities are
limited.
 These and other constraints cause processes to interact, the simplest example being an
entity forced to wait in a queue (on a list) because the resource it needs is busy with
another entity.
 The process-interaction approach is popular because of its intuitive appeal, and
because the simulation packages that implement it allow an analyst to describe the
process flow in terms of high-level block or network constructs, while the interaction
among processes is handled automatically.
 In more precise terms, a process is a time-sequenced list of events, activities, and
delays, including demands for resources, which define the life cycle of one entity as it
moves through a system.
 The process-interaction approach use a variable time advance; that is, when all events
and system state changes have occurred at one instant of simulated time, the
simulation clock is advanced to the time of the next imminent event on the FEL.
3.2.3 Activity Scanning Approach [D-10]

 The activity-scanning approach, in contrast, uses a fixed time increment and a rule-
based approach to decide whether any activities can begin at each point in simulated
time.
 With the activity-scanning approach, a modeler concentrates on the activities of a
model and those conditions, simple or complex, that allow an activity to begin.
 At each clock advance, the conditions for each activity are checked and, if the
conditions are true, then the corresponding activity begins.
 Proponents claim that the activity-scanning approach is simple in concept and leads to
modular models that are more maintainable and easily understood and modified by
other analysts at later times. They admit, however, that the repeated scanning to
determine whether an activity can begin results in slow runtime on computers.
 Thus, the pure activity-scanning approach has been modified (and made conceptually
somewhat more complex) by what is called the three-phase approach, which
combines some of the features of event scheduling with activity scanning to allow for
variable time advance and the avoidance of scanning when it is not necessary, but
keeping the main advantages of the activity-scanning approach.
 In the three-phase approach, events are considered to be activities of duration-zero
time units. With this definition, activities are divided into two categories, called B and
C.
B activities: activities bound to occur; all primary events and unconditional activities.
C activities: activities or events those are conditional upon certain conditions being
true.
 The B-type activities and events can be scheduled ahead of time, just as in the event-
scheduling approach.
 This allows variable time advance. The FEL contains only B-type events. Scanning to
check if any C-type activities can begin or C-type events occur happens only at the
end of each time advance, after all
 B-type events have completed. In summary, with the three-phase approach, the
simulation proceeds with repeated execution of the three phases until it is completed:
Phase A Remove the imminent event from the FEL and advance the clock to its event
time. Remove any other events from the FEL that have the same event time.
Phase B Execute all B-type events that were removed from the FEL. (This may free a
number of resources or otherwise change system state.)
Phase C Scan the conditions that trigger each C-type activity and activate any whose
conditions are met. Rescan until no additional C-type activities can begin or events
occur.
 The three-phase approach improves the execution efficiency of the activity scanning
method.
 In addition, proponents claim that the activity-scanning and three-phase approaches
are particularly good at handling complex resource problems, in which various
combinations of resources are needed to accomplish different tasks.
 These approaches guarantee that resources being freed at a given simulated time will
all be freed before any available resources are reallocated to new tasks.
Compare event-scheduling, process interaction and activity scanning algorithms.

Answer:
Activity Scanning
 Basic building block is the activity
 Model program's code segments consist of sequences of activities (operations)
waiting to be executed
 An activity sequence's conditions must be satisfied before it is executed
 Simulation executive moves from event to event executing those activity sequences
whose conditions are satisfied
Event Scheduling
 Basic building block is the event
 Model program's code segments consist of event routines waiting to be executed
 Event routine associated with each event type -- performs required operations for that
type
 Simulation executive moves from event to event executing the corresponding event
routines
Process Interaction
 Basic building block is the process
 System consists of a set of interacting processes
 Model program's code for each process includes the operations that it carries out
throughout its lifetime
 Event sequence for system consists of merging of event sequences for all processes
 Future event list consists of a sequence of event nodes (or notices)
 Each event node indicates the event time and process to which it belongs
 Simulation executive carries out the following tasks:
o placement of processes at particular points of time in the list
o removal of processes from the event list
o activation of the process corresponding to the next event node from the event
list
o rescheduling of processes in the event list
 Typically, a process object can be in one of several states:
o active -- process is currently executing
There is only one such process in a system.
o ready -- process is in event list, waiting for activation at a certain time
o idle (or blocked) -- process is not in event list, but eligible to be be reactivated
by some other entity (e.g., waiting for a passive resource)
o terminated -- process has completed its sequence of actions, is not in event
list, and cannot be reactivated
3.3 Manual Simulation Using Event Scheduling [M-12,17 D-06, 09,10, 11]
1. SINGLE SERVER QUEUE
Consider a grocery store with single checkout counter.
The system consists of those customers in the waiting line plus the one (if any) checking out.
The model has the following components:
System state (LQ(t), LS(t)), where LQ(t) is the number of customers in the waiting line, and
LS(t) is the number being served (0 or 1) at time t .
Entities The server and customers are not explicitly modeled, except in terms of the state
variables above.
Events
Arrival (A)
Departure (D)
Stopping event (SE), scheduled to occur at time TSE.
Event notices
(A, t), representing an arrival event to occur at future time t
(D, t), representing a customer departure at future time t

(SE, 30), representing the simulation-stop event at future time 30.
Activities
Inter-arrival time
Service time
Delay Customer time spent in waiting line.
The event notices are written as (event type, event time). In this model, the FEL will always
contain either two or three event notices.
The effect of arrival and departure events is shown in figure 3.2 and 3.3 respectively.
Arrival event occur at
CLOCK= t
Set LS(t) = 1 Is Increase LQ(t)

LS(t) = by 1
Generate service time S*; schedule new

departure event at time t + S*
Generate interarrival time a*;

schedule next arrival at time t + a*
Collect Statistics
Return control to time-

advance routine to continue
simulation
Figure 3.2 Execution of Arrival Event

Departure event occurs at

CLOCK= t
Set LS(t) = 0 Is Reduce LQ(t)

LQ(t) > by 1
0?
Generate service time S*; schedule new

departure event at time t + S*
Collect Statistics
Return control to time-

advance routine to continue
simulation
Figure 3.3 Execution of Departure Event

Example 3.1:
Consider the single server system. Let the inter-arrival times and the service times be as
follows:
Inter-arrival
-- 4 5 2 8 3 7
Times
Service Times 5 3 4 6 2 7 1
Develop the simulation table using event-scheduling approach and analyze the system until
the stopping event occurring at clock time 30. Calculate the server utilization and the
maximum queue length.
Solution:
variables above.
Events
Arrival (A)
Departure (D)
Stopping event (SE), scheduled to occur at time 30.
Event notices
(A, t ), representing an arrival event to occur at future time t
(D, t ), representing a customer departure at future time t
Activities
Inter-arrival time
Service time
Using the distribution of inter-arrival time and service time, we get the clock times for arrival
and departure event.
Customer Inter-arrival Clock Time of Service Time Departure Time
Time Arrival
1 -- 0 5 5
2 4 4 3 8
3 5 9 4 13
4 2 11 6 19
5 8 19 2 21
6 3 22 7 29
7 7 29 1 30
Using clock time of arrival and departure of each customer, we generate the chronological
ordering of events as shown below:
Event Type Customer Number Clock Time
Arrival 1 0
Arrival 2 4
Departure 1 5
Departure 2 8
Arrival 3 9
Arrival 4 11
Departure 3 13
Departure 4 19
Arrival 5 19
Departure 5 21
Arrival 6 22
Departure 6 29
Arrival 7 29
Departure 7 30
We know that the event notices are written as (event type, event time). In case of single
server queue, the future event list (FEL) will contain either two or three notices; the notices
are (A,t), (D,t), and (SE,t). Here it is found that the departure of 7th customer occurs at clock
time 30, so the stopping event notice is (SE,30).
The simulation table for single server queue is given below. We are gathering here two
statistics such as server utilization i.e., the total busy time (B) of the server and the maximum
queue length (MQL).
System State Cumulative Statistics

Clock Future Event List(FEL)
LQ(t) LS(t) B MQL
0 0 1 (A,4) (D,5) (SE,30) 0 0
4 1 1 (D,5) (SE,30) 4 1
5 0 1 (D,8) (A,9) (SE,30) 5 0
8 0 0 (A,9) (SE,30) 8 0
9 0 1 (A,11) (D,13) (SE,30) 8 0
11 1 1 (D,13) (SE,30) 10 1
13 0 1 (D,19) (A,19) (SE,30) 12 0
19 0 0 (A,19) (SE,30) 18 0
19 0 1 (D,21) (A,22) (SE,30) 18 0
21 0 0 (A,22) (SE,30) 20 0
22 0 1 (D,29) (A,29) (SE,30) 20 0
29 0 0 (A,29) (SE,30) 27 0
29 0 1 (D,30) (SE,30) 27 0
30 0 0 (SE,30) 28 0
30 0 0 Execute Stopping Event
At clock time 30, stopping event is executed. The system is empty.
Example 3.2: [S-15, D-16]

Consider a single server system. Let the arrival distribution be uniformly distributed between
1 and 10 minutes and the service time distribution is as follows:
Service Time (Min) 1 2 3 4 5 6
Probability 0.04 0.20 0.10 0.26 0.35 0.05
10 customers. Random digits for inter-arrival time and service times are as follows:
Customer 1 2 3 4 5 6 7 8 9 10
R.D. for Inter-arrival Time -- 853 340 205 99 669 742 301 888 444
Also calculate the server utilization and the maximum queue length.
Solution:
Given that the arrival distribution is uniformly distributed between 1 and 10 minutes. Hence
the probability of occurrence of each inter-arrival time is the same which would be 0.100.
Using this we can generate the distribution of inter-arrival time and assign random digits to it
as shown in table 3.1 below:
Random-
Time Between Cumulative
Probability Digit
Arrivals(minutes) Probability
Assignment
1 0.100 0.100 001-100
2 0.100 0.200 101-200
3 0.100 0.300 201-300
4 0.100 0.400 301-400
5 0.100 0.500 401-500
6 0.100 0.600 501-600
7 0.100 0.700 601-700
8 0.100 0.800 701-800
9 0.100 0.900 801-900
10 0.100 1.000 901-000
Table 3.1 Distribution of time between arrivals
Similarly, assign random digits to the service time as shown in table 3.2 below.
Random-
Service Time Cumulative
Probability Digit
(minutes) Probability
Assignment
1 0.04 0.04 01-04
2 0.20 0.24 05-24
3 0.10 0.34 25-34
4 0.26 0.60 35-60
5 0.35 0.95 61-95
6 0.05 1.00 96-00
Table 3.2 Service time distributions
First, initialize the table by entering the details for the first customer. Here, we are assuming
that the first customer is arriving at time 0. Service of first customer begins immediately since
nobody is present in the system and gets over at time 5. So we can say that the customer
spends 5 minutes in the system. After filling the first customer details, subsequent rows in the
table are filled based on the random numbers for inter-arrival time and service time and the
completion of service time of the previous customer.
The simulation table for single server system is given in table 3.3 below.
Customer Random Time Arrival Random Service Time Time Time Time Idle
Digits Between Time Digits Time Service Customer Service Customer Time
For Arrival For Begins Waits in Ends Spends In Of
Arrival Service Queue System Server
1 -- -- 0 71 5 0 0 5 5 0
2 853 9 9 59 4 9 0 13 4 4
3 340 4 13 12 2 13 0 15 2 0
4 205 3 16 88 5 16 0 21 5 1
5 99 1 17 97 6 21 4 27 10 0
6 669 7 24 66 5 27 3 32 8 0
7 742 8 32 81 5 32 0 37 5 0
8 301 4 36 35 4 37 1 41 5 0
9 888 9 45 29 3 45 0 48 3 4
10 444 5 50 91 5 50 0 55 5 2
Ʃ =50 Ʃ =44 Ʃ =8 Ʃ =52 Ʃ =11
Table 3.3 Simulation Table for Single Server Queue
Using clock time of arrival and departure of each customer, we generate the chronological
ordering of events as shown below:
Event Type Customer Number Clock Time
Arrival 1 0
Departure 1 5
Arrival 2 9
Departure 2 13
Arrival 3 13
Departure 3 15
Arrival 4 16
Arrival 5 17
Departure 4 21
Arrival 6 24
Departure 5 27
Departure 6 32
Arrival 7 32
Arrival 8 36
Departure 7 37
Departure 8 41
Arrival 9 45
Departure 9 48
Arrival 10 50
Departure 10 55
variables above.
Events
Arrival (A)
Departure (D)
Stopping event (SE), scheduled to occur at time 55.
Event notices
(A, t ), representing an arrival event to occur at future time t
(D, t ), representing a customer departure at future time t
Activities
Inter-arrival time
Service time

We know that the event notices are written as (event type, event time). In case of single
server queue, the future event list (FEL) will contain either two or three notices; the notices
are (A,t), (D,t), and (SE,t). Here it is found that the departure of 10th customer occurs at clock
time 55, so the stopping event notice is (SE, 55).
The simulation table for single server queue is given below. We are gathering here two
statistics such as server utilization i.e., the total busy time (B) of the server and the maximum
queue length (MQL).
System Cumulative Statistics

Clock State Future Event List(FEL)
LQ(t) LS(t) B MQL
0 0 1 (D, 5), (A, 9), (SE, 55) 0 0
5 0 0 (A, 9), (SE, 55) 5 0
9 0 1 (D,13), (A, 13), (SE, 55) 5 0
13 0 0 (A, 13), (SE, 55) 9 0
13 0 1 (D, 15), (A, 16), (SE, 55) 9 0
15 0 0 (A, 16), (SE, 55) 11 0
16 0 1 (A, 17), (D, 21), (SE,55) 11 0
17 1 1 (D, 21), (SE, 55) 12 1
21 0 1 (A, 24), (D, 27), (SE, 55) 16 0
24 1 1 (D, 27), (SE, 55) 19 1
27 0 1 (D, 32), (A, 32), (SE,55) 22 0
32 0 0 (A,32), (SE,55) 27 0
32 0 1 (A, 36), (D, 37), (SE, 55) 27 0
36 1 1 (D, 37), (SE, 55) 31 1
37 0 1 (D, 41), (A, 45), (SE, 55) 32 0
41 0 0 (A, 45), (SE, 55) 36 0
45 0 1 (D, 48), (A, 50), (SE, 55) 36 0
48 0 0 (A, 50), (SE, 55) 39 0
50 0 1 (D, 55), (SE, 55) 39 0
55 0 0 (SE, 55) 44 0
55 0 0 Execute Stopping Event
At clock time 30, stopping event is executed. The system is empty.
2. DUMP TRUCK PROBLEM [D-15,16]
Six dump trucks are used to haul coal from the entrance of a small mine railroad. Figure
below provides a schematic of the dump-truck operation. Each truck is loaded by one of the
two loaders. After loading, the truck immediately moves to the scale, to be weighed as soon
as possible. Both the loaders and the scale have a FCFS waiting line (or queue) for trucks.
Travel time from a loader to a scale is considered negligible. After being weighed, a truck
begins a travel time (during which time the truck unloads) and then afterward returns to the
loader queue.
Traveling
Loading
Scale
Loader Weighing
queue queue
The distributions of loading time, weighing time, and travel time are given in tables 1, 2, and
3 respectively.
Table 1: Distribution of Loading Time for the Dump Trucks

Loading Time Probability Cumulative Random Digit
5 0.3 0.3 1-3
10 0.5 0.8 4-8
15 0.2 1.0 9-0
Table 2: Distribution of Weighing Time for the Dump Trucks

Weighing Time Probability Cumulative Random Digit
12 0.7 0.7 1-7
16 0.3 1.0 8-0
Table 3: Distribution of Travel Time for the Dump Trucks

Travel Time Probability Cumulative Random Digit
40 0.4 0.4 1-4
60 0.3 0.7 5-7
80 0.2 0.9 8-9
100 0.1 1.0 0
Assume that five of the trucks are at the loaders and one is at the scale at time 0. The activity
times are taken from the following list as needed:
Loading Time 10 5 5 10 15 10 10
Weighing Time 12 12 12 16 12 16
Travel Time 60 100 40 40 80
Estimate the loader and scale utilizations (percentage of time busy).
SOLUTION:
System state [LQ (t), L (t), WQ (t), W (t)], where

LQ(t) = number of trucks in loader queue
L(t) = number of trucks (0,1, or 2) being loaded
WQ(t)= number of trucks in weigh queue
W(t) = number of trucks (0 or 1) being weighed, all at simulation time t
Event notices
(ALQ, t, DTi), dump truck arrives at loader queue (ALQ) at time t
(EL, t, DTi), dump truck i ends loading (EL) at time t
(EW, t, DTi), dump truck i ends weighing (EW) at time t
Entities The six dump trucks (DTI,..., DT6)
Lists
Loader queue, all trucks waiting to begin loading, ordered on a first-come, first-served basis
Weigh queue, all trucks waiting to be weighed, ordered on a first-come, first-serve basis.
Activities Loading time, weighing time, and travel time.

Delays Delay at loader queue, and delay at scale.
The logic for the occurrence of the end-loading event
 When an end-loading (EL) event occurs, say for truck j at time t, other events may be
triggered.
 If the scale is idle [W(t)=0], truck j begins weighing and an end-weighing event (EW)
is scheduled on the FEL.
 Otherwise, truck j joins the weigh queue.
 If at this time there is another truck waiting for a loader, it will be removed from the
loader queue and will begin loading by the scheduling of an end-loading event (EL)
on the FEL.
In order to estimate the loader and scale utilizations, two cumulative statistics are maintained:
 BL = total busy time of both loaders from time 0 to time t

 BS = total busy time of the scale from time 0 to time t
Table 4: Simulation Table for Dump-Truck Operation

Clock System State Lists Future Event Cumulative
t List Statistics
LQ(t) L(t) WQ(t) W(t) Loader Weigh BL BS
Queue Queue
0 3 2 0 1 DT4 (EL,5,DT3) 0 0
DT5 (EL,10,DT2)
DT6 (EW,12,DT1)
5 2 2 1 1 DT5 DT3 (EL,10,DT2) 10 5
DT6 (EL,5+5, DT4)
(EW,12,DT1)
10 1 2 2 1 DT6 DT3 (EL,10,DT4) 20 10
DT2 (EW,12,DT1)
(EL,10+10,DT5)
10 0 2 3 1 DT3 (EW,12,DT1) 20 10
DT2 (EL,20,DT5)
DT4 (EL,10+15,DT6)
12 0 2 2 1 DT2 (EL,20,DT5) 24 12
DT4 (EW, 12+12,DT3)
(EL,25,DT6)
(ALQ,12+60,DT1)
20 0 1 3 1 DT2 (EW,24,DT3) 40 20
DT4 (EL,25,DT6)
DT5 (ALQ,72,DT1)
24 0 1 2 1 DT4 (EL,25,DT6) 44 24
DT5 (EW,24+12,DT2)
(ALQ,72,DT1)
(ALQ,24+100,DT3)
25 0 0 3 1 DT4 (EW,36,DT2) 45 25
DT5 (ALQ,72,DT1)
DT6 (ALQ,124,DT3)
36 0 0 2 1 DT5 (EW,36+16,DT4) 45 36
DT6 (ALQ,72,DT1)
(ALQ,36+40,DT2)
(ALQ,124,DT3)
52 0 0 1 1 DT6 (EW,52+12,DT5) 45 52
(ALQ,72,DT1)
(ALQ,76,DT2)
(ALQ,52+40,DT4)
(ALQ,124,DT3)
64 0 0 0 1 (ALQ,72,DT1) 45 64
(ALQ,76,DT2)
(EW,64+16,DT6)
(ALQ,92,DT4)
(ALQ,124,DT3)
(ALQ,64+80,DT5)
72 0 1 0 1 (ALQ,76,DT2) 45 72
(EW,80,DT6)
(ALQ,92,DT4)
(ALQ,124,DT3)
(ALQ,144,DT5)
76 0 2 0 1 (EW,80,DT6) 49 76
(EL,82,DT1)
(EL,76+10,DT2)
(ALQ,92,DT4)
(ALQ,124,DT3)
(ALQ,144,DT5)
Average loader utilization = 0.32 Average scale utilization = 1.00
■■■
UNIT: II
MATHEMATICAL & STATISTICAL
MODELS IN SIMULATION
CHAPTER 4 STATISTICAL MODELS IN SIMULATION
CHAPTER 5 QUEUEING MODELS

CHAPTER 4
STATISTICAL MODELS
4.1 Review of Terminology and Concepts

4.1.1 Random variables
 A random variable is a rule that assigns a number to each outcome of an experiment.
 These numbers are called values of random variable.
 Random variables are usually denoted by X.
 Example:
1. If a die is rolled out, the outcome has a value from 1 through 6.
2. If a coin is tossed, the possible outcome is head „H‟ or tail „T‟.
 There are two types of random variables
1. Discrete random variable
2. Continuous random variable
Discrete Random Variable

 A discrete random variable takes only specific, isolated numerical values.
 The variables which take finite numeric values are called as Finite discrete random
variables and which takes unlimited values are called as Infinite discrete random
variables.
 The examples are shown in the table 4.1.
Table 4.1 Examples for discrete random variables
Random Variable Values Type
Flip a coin three times; Discrete Finite
X = the total number of {0, 1, 2, 3} There are only four possible
heads. values for X.
Select a mutual fund;
Discrete Infinite
X = the number of
{2, 3, 4, ...} There is no stated upper limit
companies in the fund
to the size of the portfolio.
portfolio.
 Let X be the discrete random variable, RX the possible values of X, given by range
space of X and xi the individual outcome in RX.
 A number p(xi) = P(X = xi) gives the probability that the random variable equals the
value of xi.
 The number p(xi), i=1, 2, 3 … must satisfy two conditions
1. p(xi) ≥ 0, for all the values of i
2. ∑ ( )
 The collection of pairs (xi, p(xi)) i.e. a list of probabilities associated with each of its
possible values is called probability distribution of X.
 p(xi) is called probability mass function (pmf) of X.
Example 4.1
Consider the experiment of tossing a single die. Define X as the number of spots on up face
of die after a toss. Assume that the die is loaded so that the probability that a given face lands
up is proportional to the number of spots showing.
a) Give the probability distribution of X.
b) Show that X is a discrete random variable.
Solution:
Given X is the random variable showing the number of spots on the up face.
RX = {1, 2, 3, 4, 5, 6}
N=total number of observations = 21
a) The discrete probability distribution is given by
xi 1 2 3 4 5 6
P(xi) 1/21 2/21 3/21 4/21 5/21 6/21
b) The conditions also are satisfied, i.e.
1. p(xi) ≥ 0, for i = 1,2,…6
2. ∑ ( )
Hence, X is a discrete random variable.
The distribution is shown graphically in figure 4.1
Fig 4.1 pmf for loaded die example
Continuous random variable

 Continuous Random variable takes any values within a continuous range or an
interval. The example is tabulated in table 4.2.
Table 4.2 Example for continuous random variables
Random Variable Values Type
Measure the length of an Any positive real number Continuous
object; X = its length in The set of possible
centimeters. measurements can take
on any positive value
 For a continuous random variable X, the probability that X lies in the interval [a, b], is
given by
( ) ∫ ( )
 The function f(x) is called probability density function (pdf) of random variable X.
 The pdf must satisfy the following conditions
1. f(x) ≥ 0, for all x in RX
2. ∫ ( ) ( )
3. f(x) = 0, if x is not in RX
 For any specified value x0, P(X = x0 ) = 0 since ∫ ( )
 Since P(X = x0 ) = 0, the following equation also hold:
P (a ≤ X ≤ b) = P (a < X ≤ b) = P (a ≤ X < b) = P (a < X < b)
 The graphical interpretation of equation 4.1 is shown in figure 4.2
Figure 4.2 Graphical interpretation of P (a < X < b)
Example 4.2
The life of a laser- ray device used to inspect cracks in aircraft wings is given by X,
continuous random variable, assuming x ≥ 0.The pdf of lifetime in years is ,
( ) {
Determine the life of device between 2 and 3 years.
Solution:
Given
( ) {
Also given that X is a continuous random variable.

The probability that the life of laser ray device between 2 and 3 years is determined from
( ) ∫
0 1
= -e -3/2 + e -1
= -0.223 + 0.368 = 0.145
4.1.2 Cumulative distribution function (cdf)

 The cumulative distribution function (cdf), denoted by F(x), measures the probability
that the random variable X assumes the value less than or equal to x,
i.e. F(x) = P (X ≤ x).
 If X is discrete, then
( ) ∑ ( )
 If X is continuous, then
( ) ∫ ( )
 Some properties of cdf are

1. F is a non decreasing function. If a < b, then F (a) ≤ F (b)
2. ( )
3. ( )
Note: All probability questions about X can be answered in terms of cdf. For example,
P (a < X ≤ b) = F (b) – F (a), for all a < b.
Example 4.3
The life of a laser- ray device used to inspect cracks in aircraft wings is given by X,
continuous random variable, assuming x ≥ 0. The cdf for the device is given by:
( ) ∫
1. Determine the probability that the device will last for less than 2 years.
2. Determine the probability that the life of laser ray device is between 2 and 3 years.
Solution:
The cdf for the device is given by:
( ) ∫
1. The probability that the device will last for less than 2 years is,
P(0 ≤ X ≤ 2) = F(2) – F(0) = F(2)
= 1- e-1
= 0.632
2. The probability that the life of laser ray device between 2 and 3 years is
P(2 ≤ X ≤ 3) = F(3) – F(2)

= (1 – e-3/2) – (1 – e-1)
= -e-3/2 + e-1
= -0.233 + 0.368
= 0.145
4.1.3 Expectation, Variance and Standard Deviation

 If X is a random variable, the expected value of X, denoted by E (X) is defined as
( ) ∑ ( )
( ) ∫ ( )
 E (X) is also referred as Mean µ or first moment of X.

 E (X n), n ≥ 1, called as the nth moment of X, is computed as
( ) ∑ ( )
( ) ∫ ( )
 Variance of random variable X is denoted by V(X) or Var(X) or σ2, is defined as
V(X) = E(X2) – [E (X)] 2
 Standard deviation is given by √ ( )
Note:
1. Mean E (X) is a measure of central tendency of a random variable.

2. Variance V(X) is a measure of the spread or variation of the possible values of X
around the mean E (X).
Example 4.4
Consider the experiment of tossing a single die. Define X as the number of spots on up face
of die after a toss. Assume that the die is loaded so that the probability that a given face lands
up is proportional to the number of spots showing.
a) Give the probability distribution of X.
b) Find the mean, variance and standard deviation.
Solution:
Given X is the random variable showing the number of spots on the up face.
RX = {1, 2, 3, 4, 5, 6}
N=total number of observations = 21
a) The discrete probability distribution is given by
xi 1 2 3 4 5 6
P(xi) 1/21 2/21 3/21 4/21 5/21 6/21
b) The mean, variance and standard deviation are calculated below:

( ) ( * ( * ( *
( ) ( * ( * ( *
Variance = V(X) = 21 – (4.33)2 = 2.22

√
4.1.4 Mode
 In case of discrete, Mode is the value of random variable that occurs most frequently.
 In case of continuous, the mode is the value at which the pdf is maximized.
 The mode may not be unique; if the modal value occurs at two values of the random
variable, the distribution is said to be bimodal.
4.2 Useful Statistical Models [M-07,09,10,14, N-04, D-05,09,10, 12,14]

 During the conduct of simulation, numerous situations arise where an investigator
choose to introduce probabilistic events.
 For example:
1. In Queuing systems ─ inter-arrival and service times are probabilistic.
2. In Inventory models ─ time between demand and lead time may be probabilistic.
3. In Reliability model ─ time to failure may be probabilistic.
 In each case, the simulation analyst desires to generate random events and use a
known statistical model if the distribution can be found.
 Some of the systems and the chosen statistical models are discussed.
1. Queuing system
 In queuing examples, inter-arrival and service-time patterns are given.
 The times between the arrivals are always probabilistic.
 Service times may be constant or probabilistic.
 If service times are completely random, exponential distribution is often used for
simulation purposes.
 If service time is constant, but some random variability causes fluctuations in positive
or negative way, then normal distribution is used. For example, the time it takes for
lathe to traverse a 10cm shaft should be always same, but the material may have slight
difference in hardness, causing different processing times.
 If there are more large service times, then weibull distribution is a better model.
 To model inter-arrival and service times, exponential, gamma and weibull
distributions are used.
 The differences between these distributions involve the location of modes of pdf‟s and
the shapes of their tails for large and small times.
Distribution Mode Tail

Exponential distribution at origin Long
Gamma distribution at some point (≥ 0) Long
Weibull distribution at some point (≥ 0) declines more or less rapidly
2. Inventory systems
 It has three random variables
1. Number of units demanded per order or per time period
2. Time between demands
3. Lead time ( time between placing an order and receiving receipt of that order)
 In simple mathematical model, demand is constant over time and lead time is zero or
constant.
 In realistic cases (Simulation models), demand occurs randomly in time and number
of units demanded each time is also random.
 In practice, the lead-time distribution can often be fitted fairly well by a gamma
distribution.
 The geometric, Poisson, and negative binomial distribution provide a range of
distribution shapes that satisfy a variety of demand patterns.
 The geometric distribution has its mode at unity, given that at least one demand has
occurred.
 If demand data are characterized by a long tail, Negative Binomial distribution is
appropriate.
 The Poisson distribution is often used to model the demand.
3. Reliability and maintainability

 Time to failure has been modeled with numerous distributions, including the
exponential, gamma, and Weibull.
 If only random failures occur, the time-to-failure distribution may be modeled using
exponential distribution.
 The gamma distribution arises from modeling standby redundancy where each
component has an exponential time to failure.
 If most failure are due to wear, normal distributions may be appropriate.
 To describe time-to-failure for some types of components, lognormal distribution is
found to be applicable.
 When there are number of components in a system and failures is due to serious large
number of defects, then Weibull distribution is extensively used.
4. Limited data
 In many situations, simulations begin before data collection is completed.
 Three distributions have application to incomplete/ limited data. These are uniform,
triangular, and beta distribution.
 Uniform distribution can be used when an inter-arrival or service time is known to be
random but no information is immediately available about the distribution.
 Triangular distribution can be used when assumptions are made about minimum,
maximum and modal values of the random variable.
 Beta distribution provides a variety of distributional forms on the unit interval, which,
with appropriate modification, can be shifted to any desired interval.
5. Other distributions
 Several other distributions may be useful in discrete system simulation.
 The Bernoulli and binomial distributions are two discrete distributions which may
describe phenomena of interest.
 The hyperexponential distribution is similar to exponential distribution, but its greater
variability may make it useful in certain instances.
4.3 Discrete distributions

Discrete random variables are used to describe random phenomena in which only integer
values can occur. The four distributions are discussed.
4.3.1 Bernoulli trials and Bernoulli distribution

 Consider an experiment consisting of n trials each of which is success or failure.
 Let Xj =1, if jth experiment results in success and Xj =0, if jth experiment results in
failure.
 The n Bernoulli trails are called Bernoulli process, if trails are independent, each trail
has only two possible outcomes (success, failures) and the probability of success
remains constant from trail to trail.
 Thus, p(x1, x2, … xn) = p1(x1). p2(x2)… pn (xn) and
( ) ( ) {
 For one trial, the distribution above is called Bernoulli distribution.

 Mean of Xj is given by
E (Xj) = 0.q + 1.p = p
 Variance of Xj is given by
V (Xj) = E (X2j) – [E (Xj)]2 = [(02 .q) + (12 .p)] – p2 = p (1– p)
4.3.2 Binomial distribution

 The random variable X denotes the number of successes in n Bernoulli trails has a
binomial distribution given by p(x).
. /
( ) {
i.e. probability of a particular outcome with all successes, each denoted by S,

occurring in the first x trails followed by n-x failures, each denoted by an F.
x of these n-x of these
P (SSS……………..SS FF……………FF) =
where q = 1 – p,
 There are ( ) outcomes having the required number of S‟s and F‟s.
( )
 Mean and variance is computed by considering X as a sum of n independent Bernoulli
random variables each with mean p and variance p (1 - p) = pq.
 Then X = X1 + X2 + … + Xn
and the mean E (X) = p + p + … + p = np
Variance V (X) = pq + pq + …+ pq = npq
Example 4.5
A production process manufactures computer chips on the average at 2% non-conforming.
Every day a random sample of size 50 is taken from the process. If the sample contains more
than two non-conforming chips, the process will be stopped. Determine the probability that
the process is stopped by the sampling scheme. Also calculate the mean and variance of the
number of non-conforming chips.
Solution:
Let the sampling process be n = 50 Bernoulli trails with p = 2% = 0.02 for each trial.
Therefore, q = 1 - p = 0.98
Let X is the random variable denoting the total number of non conforming chips in the
sample. This X will have a Binomial distribution given by
( ) {( *( )
To determine the probability that more than two non conforming chips are present in the
sample:
P(X > 2) = 1 – P(X ≤ 2)
P(X ≤ 2) = ∑ ( )( ) ( )
= (0.98)50 + 50 (0.02) (0.98)49 + 1225 (0.02)2. (0.98)48

= 0.92
P(X>2) = 1 - 0.92 = 0.08
Therefore the probability that the process is stopped on any day by sampling scheme is
approximately 0.08.
The mean number of non conforming chips with sample size 50 is

E (X) = np = 50 (0.02) =1
Variance is given by
V (X) = npq = 50 (0.02) (0.98) =0.98
Example 4.6
A recent survey indicated that 82% of single women aged 25 years old will be married in
their lifetime. Using the binomial distribution, find the probability that two or three women in
a sample of twenty will never be married.
Solution:
Let X is a random variable denoting the number of women in a sample will never be married.
Therefore, q = 82% = 0.82 and p = 1 – q = 0.18
Given, n = 20
The probability that two or three women in a sample of twenty will never be married is given
by the binomial distribution as:
P(2 ≤ X ≤ 3) = ∑ ( )
= ( )( ) ( ) ( )( ) ( )
= 0.173 + 0.228
= 0.401
Example 4.7
The Hawks are currently winning 0.55 of their games. There are 5 games in the next two
weeks. What is the probability that they will win more games than they lose?
Solution:
Let X is the random variable denoting the number of games won.
Therefore, p = 0.55 and q = 1 – p = 0.45
The probability that the Hawks will win more games than they lose is given by the binomial
distribution as:
P(3 ≤ X ≤ 5) = ∑ ( )
= ( )( ) ( ) ( )( ) ( ) ( )( ) ( )
= 0.337 + 0.206 + 0.050

= 0.593
4.3.3 Geometric and Negative Binomial distribution

 The geometric distribution is related to a sequence of Bernoulli trails. The random
variable X is defined as number of trails to achieve the first success. The distribution
of X is given by
( ) {
 The event {X = x} occurs when there are x-1 failures followed by a success.
 Each failure has an associated probability q = 1 – p and each success has the
probability p.
 Thus, P (FFF…..FS) = q x - 1 p
 Mean, ( ) and Variance, ( )
 Geometric distribution is Memoryless.
 The negative binomial distribution is the distribution of the number of trials until the
kth success, for k = 1, 2, …..
 If Y has the negative binomial distribution with parameters p and k, then the
distribution of Y is given by:
{(
( ) *
 Because we can think of the negative binomial random variable Y as the sum of k
independent geometric random variables, it is easy to see that
 Mean, ( )
 Variance, ( )
Example 4.8
Forty percent of assembled ink-jet printers are rejected at inspection station. Find the
probability that
(i) the first acceptable printer is third one inspected.
(ii) the second acceptable printer is third one inspected
Solution:
Let X is the random variable denoting printer is accepted.
Therefore, q =40% =0.4 and p = 1- q = 1 - 0.4 = 0.6
(i) The probability that the first acceptable printer (k = 1) is third one inspected is given
by geometric distribution p (x) = qx-1 p as:
p (3) = (0.4)3-1(0.6) = 0.096
(ii) To determine the probability that the second acceptable printer (k = 2) is the third one,
we use negative binomial distribution.
Therefore, ( ) ( )( ) ( ) ( )( )( )
Example 4.9 [D-13]

An industrial chemical that will retard the spread of fire in paint has been developed. The
local sales representative has determined from past experience that 48% of the sales calls will
result in an order.
(i) What is the probability that the first order will come on the fourth sales call of the
day?
(ii) If eight sales calls are made in a day, what is the probability of receiving exactly six
orders?
(iii)If four sales calls are made before lunch, what is the probability that one or less
results in an order?
Solution:
Let X is the random variable denoting the number of calls received until an order is placed.
Then, X is geometric (p = 48% = 0.48) with the probability mass function
( ) ( ) ( )
(i) The probability that the first order will come on the fourth sales call of the day is
( ) ( ) ( )
(ii) The number of orders, Y, in eight calls is binomial (n = 8, p = 0.48) with the
probability mass function
( ) ( *( ) ( )
The probability of receiving exactly six orders in eight calls is

( ) ( *( ) ( )
(iii)The number of orders, X, in four calls is binomial (n = 4, p = 0.48) with probability
mass function
( ) ( *( ) ( )
The probability of receiving one or less order in four calls is
( ) ( *( ) ( *( )( )
= 0.0731 + 0.2699
= 0.343
Example 4.10
Show that the geometric distribution is Memoryless.
Solution:
The geometric distribution is Memoryless if P(X > s + t | X > s) = P(X > t), where s and t are
constant integers and X is geometrically distributed random variable.
( ) ∑ ∑
We have, ∑
( )
Similarly we can show that, ( ) and ( )
( )
( | ) ( )
( )
Hence, we proved that geometric distribution is Memoryless.
4.3.4 Poisson distribution

 Poisson distribution is a discrete probability distribution that expresses the probability
of a number of events occurring in a fixed period of time if these events occur with a
known average rate and independently of the time since the last event.
 The probability mass function is given by
( ) {
where α > 0
 One of the important properties of the Poisson distribution is that the mean and
variance are both equal to α, i.e.,
E (X) = α = V (X)
 The cumulative distribution function is given by,
( ) ∑
 Most queuing systems characteristics such as arrival and departure processes are
described by a Poisson distribution.
Example 4.11
A computer terminal repairperson is “beeped” each time there is a call for service. The
number of beeps per hour is known to occur in accordance with a Poisson distribution with a
mean of α = 2 per hour.
(i) Determine the probability of three beeps in next hour.
(ii) Determine the probability of two or more beeps in an hour period.
Solution:
Let X is the random variable denoting the number of beeps.
Given α = 2 per hour
(i) The probability of three beeps in next hour

( )( )
( )
(ii) The probability of two or more beeps in an hour period
p (2 or more) = 1 – p(x ≤ 1)
= 1- F (1)
= 1- 0.406
= 0.594
Example 4.12
The number of hurricanes hitting the coast of Florida annually has a Poisson distribution with
mean of 0.8.
(i) What is the probability that more than two hurricanes will hit the Florida coast in a
year?
(ii) What is the probability that exactly one hurricane will hit the coast of Florida in a
year?
Solution:
Let X is the random variable denoting the number of hurricanes hitting the coast.
It is Poisson distributed (α = 0.8) with the probability mass function
( )
( )
(i) The probability that more than two hurricanes will hit the Florida coast in a year is
given by
( ) ( )
( ) ( ) ( )
= 1 – 0.45 – 0.36 – 0.14 = 0.046

(ii) The probability that exactly one hurricane will hit the coast of Florida in a year is
given by,
( )
( )
Example 4.13
Lane Braintwain is quite a popular student. Lane receives, on the average, four phone calls a
night with Poisson distribution. What is the probability that tomorrow night the number of
calls received will exceed that average by more than one standard deviation.
Solution:
Let X is the random variable denoting the number of calls which is Poisson distributed, with
mean α = 4.
For Poisson distribution E(X) = V(X) = 4
Standard deviation is given by √ ( ) √
The probability that tomorrow night the number of calls received will exceed the average (i.e.
E(X) = 4) by more than one standard deviation (i.e. ) is given by
( ) ( )
( ) ( ) ( ) ( ) ( ) ( ) ( )
=
= 1 - 0.018 – 0.073 – 0.147 – 0.195 – 0.195 – 0.156 – 0.104

= 0.112
4.4 Continuous distribution

 Continuous random variables are used to describe the random phenomena. The
distributions are described below.
4.4.1 Uniform distribution
 A random variable X is uniformly distributed on the interval (a, b), if its pdf is given
by,
( ) {
 The cdf is given by
( ) {
 Note that ( ) ( ) ( )
 The probability is proportional to the length of interval for all x1 and x2 satisfying
a ≤ x1 < x2 ≤ b.
 Mean, ( )
( )
 Variance, ( )
 Uniform distribution plays a vital role in simulation. Random numbers, uniformly
distributed between 0 and 1, provide a means to generate random events.
Example 4.14
A bus arrives every 20 minutes at Mc Donald‟s stop beginning at 6:40 A.M. and continues till
8:40 A.M. A certain passenger does not know the schedule, but arrives randomly (uniformly
distributed) between 7:00 A.M. and 7:30 A.M. every morning. What is probability that the
passenger waits more than 5 minutes for a bus?
Solution:
The passenger waits for more than 5 minutes only if his/her arrival time is between 7:00 A.M.
and 7:15 A.M. or between 7:20 A.M. and 7:30 A.M.
Let X is the random variable denoting the number of minutes past 7:00 A.M. that the
passenger arrives.
The probability is given by
P (0 < X < 15) + P (20 < X < 30)
Now, X is uniform random variable on (0, 30) i.e. a = 0 and b = 30.
Therefore the desired probability is
F (15) – F (0) + F (30) – F (20) =
Example 4.15
Ace Heating and Air Conditioning service finds that the amount of time a repairman needs to
fix a furnace is uniformly distributed between 1.5 and 4 hours.
(i) Find the probability that a randomly selected furnace repair requires more than 2
hours.
(ii) Find the probability that a randomly selected furnace repair requires less than 3 hours.
(iii)Find the mean and standard deviation.
Solution:
Let X is the random variable denoting the time needed to fix a furnace. It is uniformly
distributed between 1.5 and 4 hours.
(i) The probability that a randomly selected furnace repair requires more than 2 hours is
given as
( ) ( )
( )
(ii) The probability that a randomly selected furnace repair requires less than 3 hours is
given as
( ) ( )
(iii)
( ) ( )
(iv) √ √
4.4.2 Exponential distribution

 A random variable X is said to be exponentially distributed with parameter λ > 0 if its
pdf is given by
( ) {
 The pdf and cdf for exponential distribution are shown below:
Figure 4.3 Exponential density function and cumulative distribution function
 The exponential distribution has been used to model inter-arrival times when arrivals
are completely random and to model service times which are highly variable. In these
instances, λ is a rate: arrivals per hour or services per minute.
 The exponential distribution has also been used to model the lifetime of a component
that fails catastrophically (instantaneously), such as light bulb. Then, λ is the failure
rate.
 Several different exponential pdf‟s are shown in figure 4.4.
Figure 4.4 The pdf’s for several exponential distributions

 The exponential distribution has mean and variance given by

( ) ( )
 Therefore Mean and Standard deviation are equal.
 The cdf is given by
( ) {
∫
 Exponential distribution is memoryless i.e. P(X > s + t | X > s) = P(X > t).
Example 4.16 [D-17]

Suppose that the life of an industrial lamp, in thousands of hours, is exponentially distributed
with failure rate λ=1/3 (one failure every 3000 hours, on average).
1. Determine the probability that lamp will last longer than its mean life of 3000 hours.
2. Determine the probability that the lamp will last between 2000 and 3000 hours.
3. Find the probability that the lamp will last for another 1000 hours, given that it is
operating after 2500 hours.
Solution:
Let X is the random variable denoting life of the lamp.
Given, λ=1/3
1. The probability that the lamp will last longer than its mean life of 3000 hours is given
by
P(X > 3) = 1 – P (X ≤ 3)
= 1 – F (3)
= 1 – (1 – e-3/3)
= 0.368
2. The probability that the lamp will last between 2000 and 3000 hours is given by
P(2 ≤ X ≤ 3) = F (3) – F (2)
= (1 – e-3/3) - (1 – e-2/3)
= - 0.368 + 0.513
= 0.145
3. The probability that the lamp will last for another 1000 hours, given that it is
operating after 2500 hours is given by (use memoryless property)
P(X > 2.5 + 1| X > 2.5) = P(X > 1)
= 1 – P (X ≤ 1)
= 1 – F (1)
= 1 – (1 – e-1/3)
= 0.7165
Example 4.17
A component has an exponential time-to-failure distribution with mean of 10,000 hours.
(i) The component has already been in operation for its mean life. What is the probability
that it will fail by 15,000 hours?
(ii) After 15,000 hours the component is still in operation. What is the probability that it
will operate for another 5000 hours?
Solution:
Let X is the random variable denoting the lifetime of a component.
Then X is exponentially distributed (λ=1/10,000 hours) with cumulative distribution function
( ) ,x>0
We use the memoryless property to compute the probabilities.
Given that the component has not failed for s = 10,000 hours or s = 15,000 hours.
The probability that the component lasts more 5000 hours is given by
P(X ≥ s + 5000 | X > s) = P(X ≥ 5000)
= 1 – P (X ≤ 5000)
= 1 – F (5000)
= 1 – (1 – e-5000/10000)
= 0.6065
4.4.3 Gamma distribution
 A random variable X is gamma distributed with parameters β and θ if its pdf is given
by
( )
( ) {⌈
Note: ⌈ ( )
 The parameter β is called the shape parameter and θ is called the scale parameter.
 Several different pdf‟s for gamma distribution with shape parameter = 2 are shown in
figure 4.5.
Figure 4.5 The pdf’s for several gamma distributions when β = 2
 The mean and variance of the gamma distribution are given by

( ) ( )
 The cdf of X is given by
∫ ( )
( ) { ⌈
 When β is an integer, the gamma distribution is related to the exponential distribution

in the following manner:
If the random variable, X, is the sum of β independent, exponentially distributed
random variables, each with parameter βθ, then X has a gamma distribution with
parameters β and θ. Thus, if,
where the pdf of Xj is given by

( )
( ) {
and the Xj are mutually independent, X has the pdf same as that of gamma
distribution.
 When β = 1, it results in the exponential distribution.
Example 4.18
Lead time is gamma distributed in 100s of units with a shape parameter of 3 and a scale
parameter of 1. What is the probability that the lead time exceeds 2 (hundred) units during an
upcoming cycle?
Solution:
Let X is the random variable denoting the lead time.
Given shape parameter β = 3 and scale parameter θ = 1.
The probability that the lead time exceeds 2 is given by
( ) ( )
= 1 – F(2)
= 0 ∫ ( ) 1
⌈
= 0 ∫ ( ) 1
⌈
=∫
= ∫
= 0( ) . / ( ). / . /1
= 0 . /1
= , ( )-
= 25e-6
= 0.062
4.4.4 Erlang distribution
 The pdf of gamma distribution
( )
( ) {⌈
is often referred to as the Erlang distribution of order k when β = k, an integer.
 We know that, the expected value of the sum of random variables is the sum of
expected value of each of the random variable.
 Thus,
E(X) = E(X1) + E(X2) + …+ E(Xk)
 But, the expected value of the exponentially distributed Xi is .
 Hence, the expected value for Erlang distribution is
( )
 If the random variables Xj are independent, the variance of their sum is sum of the
variances.
( )
( ) ( ) ( )
 The cdf of the Erlang distribution is given as
( )
( ) ∑
{
 The mode of the Erlang distribution is given by
Example 4.19 [D-10,11]

A medical examination is given in three stages by a physician. Each stage is exponentially
distributed with a mean service time of 20 minutes. Find the probability that the exam will
take 50 minutes or less. Also compute the expected length of the exam, variance, and the
modal value.
Solution:
Let X is the random variable denoting the duration of exam.
A medical examination is given in three stages by a physician, i.e. k = 3.
Each stage is exponentially distributed with a mean service time of 20 minutes, i.e.
The probability that the exam will take 50 minutes or less is given by
( ) ( )
. /( )
,( )( )-
= ∑
= 1 – [0.0821 + 0.2052 + 0.2565]

= 0.4562
( )
( )
( )( )
Modal value (mode) =
Example 4.20 [M-10, S-15]

The time intervals between dial-up connections to an Internet Service Provider are
exponentially distributed with a mean of 15 seconds. Find the probability that the third dial-
up connection occurs after 30 seconds have elapsed.
Solution:
Let X is the random variable denoting time between dial-up connections.
The desired probability is Erlang distributed with and X = 30.
The probability that the third dial-up connection occurs after 30 seconds is given by
( ) ( )
( )
( )( )
,( )( )-
[ ∑ ]
= 1 – [1 – 0.1353 – 0.2707 – 0.2707]

= 0.6767
4.4.5 Normal distribution

 A random variable X with mean µ (-∞ < µ < ∞) and variance σ2 (σ2 > 0) has a normal
distribution, if its pdf is given by
( ) [ . / ]
√
 It is also denoted sometimes as X ~ N(µ, σ2) to indicate that random variable X is
normally distributed with mean µ and variance σ2.
 The normal pdf is represented in figure 4.6.
Figure 4.6 pdf of normal distribution
 Some of the special properties of normal distribution are listed here:

1. ( ) ( )
The value of f(x) approaches zero as x approaches negative or positive
infinity.
2. ( ) ( ); pdf is symmetric about µ.
3. Maximum value of pdf occurs at x = µ [Mean & mode are equal].
 The cdf of normal distribution is given as
( ) ( ) ∫ * ( * +
√
 Since the above equation is in closed form, it is not possible to evaluate.
 So a transformation variable, . /, allows the evaluation to be independent of
µ and σ.
 If X ~ N(µ, σ2), let Z = (X - µ) / σ, to obtain
( ) ( ) . /
( )
∫ √
( )
∫ ( ) . /
 The pdf of the standard normal distribution with mean 0 and variance 1 [Z~N(0, 1)] is
given as
( )
√
 The standard normal distribution is shown in figure 4.7 below.
Figure 4.7 The pdf of standard normal distribution

 The cdf for Standard Normal Distribution is given as
( ) ∫
√
 The probabilities Ф(z) for Z ≥ 0 are given in Table A.2.
Example 4.21
The time to pass through a queue to begin self-service at a cafeteria has been found to be
N(15, 9). Determine the probability that an arriving customer waits between 14 and 17
minutes.
Solution:
Let X is the random variable denoting that an arriving customer waits.
Given N (15, 9). It implies that µ = 15, σ2 = 9.
The probability that an arriving customer waits between 14 and 17 minutes is given by
P (14 ≤ X ≤ 17) = F (17) – F (14)
= Φ [(17 - 15)/3] - Φ [(14 - 15)/3]
= Φ [0.667] - Φ [-0.333]
= Φ [0.667] – [1 - Φ [0.333]]
= 0.3780
Example 4.22
IQ scores are normally distributed throughout society with a mean of 100 and a standard
deviation of 15.
(i) A person with an IQ of 140 or higher is called a “genius”. What proportion of society
is in the genius category?
(ii) What proportion of society will miss the genius category by 5 or less points?
(iii)An IQ of 110 or better is required to make it through an accredited college or
university. What proportion of society could be eliminated from completing a higher
education based on a low IQ score?
Solution:
Let X is the random variable denoting I.Q. scores.
Given X is normally distributed with mean μ = 100, and standard deviation σ = 15.
(i) The probability that a score is 140 or greater is given by
P(X ≥140) = 1 − Ф [(140 − 100)/15] = 0.00383
(ii) The probability that a score is between 135 and 140 is given by
P(135 ≤ X ≤ 140) = Φ[(140 − 100)/15] − Φ[(135 − 100)/15]
= 0.00598
(iii)The probability that a score is less than 110 is given by
P(X <110) = Φ [(110 − 100)/15] = 0.7475
4.4.6 Weibull distribution

 The random variable X has a weibull distribution if its pdf has the form
. / * . / +
( )
{
 The three parameters of the Weibull distribution are
υ → location parameter (-∞ < υ < ∞)
α → Scale parameter (α > 0)
β → Shape parameter (β > 0)
 When υ = 0, pdf becomes
. / * . / +
( )
 When υ = 0 & β = 1, pdf is reduced to
( ) {
which is an exponential distribution with parameter λ = 1/ α.
 Figure 4.8 shows several Weibull densities for
Figure 4.8 Weibull pdf’s for =0
 Mean is given as ( ) ⌈. /
 Variance is given as ( ) [⌈. / 0⌈. /1 ]
 The location parameter υ has no effect on the variance; however, the mean is
increased or decreased by υ.
 The cdf of the Weibull distribution is given by
( ) {
* . / +
Example 4.23
The time it takes for an aircraft to land and clear the runway at a major airport has a weibull
distribution with υ = 1.34 minutes, β = 0.5 and α = 0.04 minutes. Determine the probability
that an incoming airplane will take more than 1.5 minutes to land and clear the runway.
Solution:
Let X is the random variable denoting the time taken by an aircraft.
Given v = 1.34 minutes, β = 0.5, and α = 0.04 minutes
The probability than an incoming airplane will take more than 1.5 minutes is given by
P (X > 1.5) = 1- P(X ≤ 1.5)
= 1- F(1.5)
= 1- exp [- {(1.5 – 1.34)/ (0.04)} 0.5]
= 1- e- 2
= 1- 0.135
= 0.865
Example 4.24
The time to failure of an electronic subassembly can be modeled by a Weibull distribution
whose location parameter is zero, β = 0.5 and α = 1000 hours.
(i) What is the mean time to failure?
(ii) What fraction of these subassemblies will fail by 3000 hours?
Solution:
Let X is the random variable denoting the time to failure of an electronic subassembly.
Given v = 0, β = 0.5, and α = 1000 hours
(i) The mean time to failure is given by
( ) ⌈( * ⌈( *
(ii) The probability that these subassemblies will fail by 3000 hours is given by
( ) ( )
[ . / ]
= 1 – e-1.732
= 0.823
4.4.7Triangular distribution
 A random variable X has a triangular distribution if its pdf is given by
( )
( )( )
( ) ( )
( )( )
{
where a ≤ b ≤ c.
 The mode occurs at x = b.
 A triangular pdf and the representation of height is shown in figure 4.9
Figure 4.9 The pdf of the triangular distribution
 Mean = E(X) = (a + b + c) / 3
 Variance = V(X) = (a2 + b2 + c2 – ab – ac – bc) / 18
 Mode = b = 3 E(X) – (a + c)
 Since a ≤ b ≤ c, it follows that
( )
 The cdf for the triangular distribution is
( )
( )( )
( )
( )
( )( )
{
Example 4.25
The central processing requirements for a program that will execute, have a triangular
distribution with a = 0.05 second, b = 1.1 seconds and c = 6.5 seconds. Determine the
probability that the CPU requirement for a random number is 2.5 seconds or less.
Solution:
Let X is the random variable denoting the CPU requirement for a program.
This is triangular distributed with a = 0.05 second, b = 1.1 seconds, and c = 6.5 seconds.
The probability that the CPU requirement for a random number is 2.5 seconds or less is given
by
P (X ≤ 2.5) = F (2.5)
The value of F(2.5) is from the portion of the cdf in the interval (0.05, 1.1) plus that portion
in the interval (1.1, 2.5) i.e., 1.1 < 2.5 ≤ 6.5 (b < x ≤ c)
( ) ( )
Therefore F (2.5) =
( )( ) ( )( )
Thus, the probability is 0.541 that the CPU requirement is 2.5 seconds or less.
Example 4.26 [M-10]

Determine the variance, V(X), of the triangular distribution.
Solution:
A random variable, X, has a triangular distribution with pdf
( )
( )( )
( ) ( )
( )( )
{
The variance is
( ) ( ) , ( )-
E(X) = (a + b + c)/3
( ) ( *∫ ( ) ( *∫ ( )
( )( ) ( )( )
0 1, ( ) ( )-
( )
( ) ( )
( ) * + [ ]
4.4.8 Lognormal distribution

 A random variable X has a lognormal distribution, if its pdf is given by
( )
* +
( ) {√
where σ2 > 0
 The pdf‟s of the lognormal distribution are shown in figure 4.10.

Figure 4.10 The Pdf’s of the lognormal distribution
 Mean of lognormal random variable is ( )

 Variance of lognormal random variable is ( ) ( )
 2
The parameter µ and σ are not the mean and variance of lognormal. These
parameters come from the normal distribution.
 When Y has N(µ, σ2), then X= eY has a lognormal distribution with parameters µ and
σ2 .
 If mean and variance of lognormal are known to be µL and σ2L , then the parameter µ
and σ2 are given by
( )
√
. /
Example 4.27
The rate of return on a volatile investment is modeled as having a lognormal distribution with
mean 20% and standard deviation 5%. Compute the parameters for the lognormal
distribution.
Solution:
Given µL=20 and σL=5
( ) ( )
√ √
. / . /
4.4.9 Beta distribution

 A random variable X is beta distributed with parameters β1 > 0 and β2 > 0 if its pdf is
given by
( )
( ) { ( )
⌈( )⌈( )
where ( )
⌈( )
 The cdf of the beta distribution does not have a closed form in general.
 The beta distribution is very flexible and has a finite range from 0 to 1, as shown in
figure 4.11.
 In practice, we often need a beta distribution defined on a different range, say (a,b),
with a < b, rather than (0,1).
 This is easily accomplished by defining a new random variable
( )
 The mean and variance of Y are given by
( ) ( )( *
( ) ( ) ( *
( ) ( )
Figure 4.11 The pdf for several beta distributions
4.5 Poisson Process [M-07,10,11,12,13,15,17, S-15, D-05,06,08,09,11,13,16,17]

 Consider a random event such as the arrival of customers at bank, which follows
some specified distribution.
 This event can be described by a counting function N(t), for all t ≥ 0.The counting
function will represent the number of events that occurred in interval [0, t].
 Time zero is the point at which the observation begins on the arrival event.
 For each interval [0, t], the value N(t) is an observation of a random variable with
possible values 0, 1, 2, …
 The counting process, {N(t), t ≥ 0}is said to be a Poisson process with mean rate λ, if
it satisfies the following assumptions.
1. Arrivals occur one at a time.

2. {N(t),t ≥ 0} has stationary increments.
The distribution of number of arrivals between t and t + s depends only on length
of interval s and not on starting point t. Thus arrivals are completely random.
3. {N(t),t ≥ 0} has independent increments.
The numbers of arrivals during non-overlapping time intervals are independent
random variables.
 If arrivals occur according to the Poisson process, and if it meets the above three
assumptions, it can be shown that probability N(t) is equal to n, i.e.
( )
, ( ) -
 Comparing Poisson pmf with the above equation, N(t) has Poisson distribution with
parameter α = λt.
 Thus, its mean and variance are given by
E(N(t) = α = λt = V[N(t)]
 For any times s and t, such that s < t, the assumption - stationary increments implies
random variable N(t) –N(s) representing number of arrivals in interval s to t is also
Poisson distributed with mean λ (t - s).
 Thus,
( ), ( )-
[ ( )– ( ) ]
and E [N(t) –N(s)] = λ (t - s) = V [N(t) – N(s)]
 Now, consider the time at which an arrival occur in Poisson process. Let the first
arrival occur at time A1, the second at time A1+A2 and so on. Thus A1, A2, … are
successive inter arrival times. It is depicted in figure 4.12.
Figure 4.12 Arrival process
 Since first arrival occurs after time t and no arrivals in interval [0, t], it is seen that
{A1 > t} = {N(t) = 0}
 Therefore P(A1 > t) = P[N(t) = 0] = e-λt
 The probability that the first arrival will occur in [0, t] is given by P(A1 ≤ t) = 1 - e-λt
which is the cdf of exponential distribution with λ.
 Hence A1 is exponentially distributed with mean 1/λ and also A1, A2, … inter-arrival
times are exponentially distributed and independent with mean 1/λ.
 Therefore Poisson process can be also defined as if inter-arrival times are

exponentially and independently distributed then number of arrivals by time t, say
N(t), meets three assumptions.
4.5.1 Properties of Poisson process

 Suppose that, each time an event occurs is classified as either type I or type II event.
 Suppose further that each event is classified as type I event with probability p and
type II event with probability 1-p, independently of all other events.
 The two properties are:
1. Random splitting
 Let N1(t) be random variable denoting number of type I event, N2(t) for type II event.
 N(t) = N1(t)+ N2(t).
 N1(t)and N2(t) are both Poisson processes having rates λp and λ(1-p) as shown in
figure 4.13.
Figure 4.13 Random splitting
2. Pooled process
 It is the process of pooling two arrival streams.
 If Ni(t) are random variables representing independent Poisson process with rates λi
then N(t) = N1(t)+ N2(t) is a poisson process with rate λ1+ λ2, as shown in figure 4.14.
Figure 4.14 Pooled process
Example 4.28 [D-05]

A mainframe computer crashes in accordance with a Poisson process with a mean rate of one
crash every 36 hours. Determine the probability that the next crash will occur between 24 and
48 hours after the last crash.
Solution:
Let X is the random variable denoting the number of hours until crash occurs.
Given, λ=1/36
The probability that the next crash will occur between 24 and 48 hours is given by
P(24 ≤ X ≤ 48) = F (48) – F (24)
= (1 – e-48/36) - (1 – e-24/36)
= 0.513 – 0.264 = 0.249
■■■
CHAPTER 5
QUEUEING MODELS
5.1 Simulation of Queueing Systems

Simulation is often used in the analysis of queuing models. In a simple but typical queuing
model, shown in Figure 5.1, customers arrive from time to time and join a queue or waiting
line, are eventually served by the server and finally leave the system.
Figure 5.1 Queueing System
Characteristics of Queueing Systems [M-09, 14, N-04,D-15]

 The key elements of a queuing system are the customers and servers.
 The term customer can refer to people, machines, trucks, mechanics, patients,
airplanes, e-mail, cases, orders, or dirty clothes-anything that arrives at a facility and
requires service.
 The term server might refer to receptionists, mechanics, tool-crib clerks, medical
personnel, automatic storage and retrieval machines (e.g., cranes), runaways at an
airport, automatic packers, etc which provides the requested service.
 A queueing system is described by its calling population, the nature of the arrivals, the
service mechanism, the system capacity, and the queueing discipline. These attributes
of a queueing system are described in detail below.
5.1.1 The calling population

 The population of potential customers is referred to as the calling population.
 It may be assumed to be finite or infinite.
 Finite calling population model: The arrival rate to the queuing system does
depend on the number of customers being served and waiting. Examples of a
finite calling population are a repair facility in a shop, where there is a fixed
number of machines available to be worked on, a trucking terminal that
services a fleet of a specific number of trucks, or a nurse assigned to attend to
a specific number of patients.
 Infinite calling population model: The arrival rate (i.e., the average number
of arrivals per unit time) is not affected by the number of customers who have
left the calling population and joined the queuing system. When the arrival
process is homogeneous over time (e.g., there are no “rush hours”), the arrival
rate is usually assumed to be constant. Examples of infinite calling
population include the potential customers of the restaurant, bank, or other

similar service facility and also very large group of machines serviced by a
technician.
 In systems with large population of potential customers, the calling
population is usually assumed to be infinite.
5.1.2 System capacity

 In many queuing systems there is a limit to the number of customers that may
be in the waiting line or system.
 When a system has limited capacity, a distinction is made between the arrival
rate (i.e., the number of arrivals per time unit) and the effective rate (i.e., the
number who arrive and enter the system per time unit).
 For example,
1. (Limited capacity) - An automatic car wash may have room only for
10 cars to enter the mechanism. It may be too dangerous or illegal for
cars to wait in the street. An arriving customer who finds the system
full does not enter but returns immediately to the calling population.
2. (Unlimited capacity) - Some systems, such as concert ticket sales for
students, may be considered as having unlimited capacity. There are
no limits on the number of students allowed to wait to purchase
tickets.
5.1.3 The arrival process

 The arrival rate is the rate at which customers arrive at the service facility
during a specified period of time.
 This rate can be estimated from empirical data derived from studying the
system or a similar system, or it can be an average of these empirical data.
 For example, if 100 customers arrive at a store checkout counter during a 10-
hour day, we could say the arrival rate averages 10 customers per hour.
However, although we might be able to determine a rate for arrivals by
counting the number of customers during a specific time period, we would not
know exactly when these customers would arrive. It might be that no
customers would arrive during one hour and 20 customers would arrive during
another hour.
 Arrivals are assumed to be independent of each other and to vary randomly
over time. Arrivals may occur at scheduled times or random times.
 The most important model for random arrivals is the Poisson arrival process.
Poisson arrival process is used as a model for the arrival of people to
restaurants, driving banks and other service facilities like the arrival of
telephone calls to a telephone exchange, etc.
 Second important class of intervals is the scheduled arrivals. In this case the
inter arrival times may be constant, or constant plus or minus a small random
amount to represent early or late arrivals. For example, Patients to a
physician‟s office, scheduled airline flight arrivals to an airport.
 A third situation occurs when at least one customer is assumed to be always

present in the queue so that the server is never idle because of lack of
customers. For example, a customer may represent raw material for a product
and sufficient raw material is assumed to be always available.
5.1.4 Queue behavior and queue discipline [M-16,17]

 Queue behavior refers to customer actions while in a queue waiting for
service to begin. There is a possibility that the incoming customers may
1. Balk: leave when they see that the line is too long.
2. Renege: leave after being in the line when they see that the line is
moving too slow.
3. Jockey: move from one line to another if they think they have chosen
a slow line.
 Queue discipline refers to the logical ordering of customers in a queue and
determines which customer is chosen for service when the server becomes
free.
 Queue disciplines can be FIFO (first in, first out), LIFO (last in, first out),
SIRO (service in random order), SPT (shortest processing time first), PR
(priority service).
 In a job shop, queue disciplines are sometimes based on due dates and
expected processing time for a given type of job.
5.1.5 Service times and the service mechanism

 The service times of successive arrivals are denoted by S1, S2, and S3…
 They may be constant or of random duration.
 In case of random, {S1, S2, …} is usually characterized as a sequence of
independent and identically distributed random variables.
 The distributions like exponential, weibull, gamma, etc can be used as models
for service times.
 Sometimes services may be identically distributed for all customers of a given
type or class or priority, while customers of different types may have different
service time distributions.
 Sometimes, service times depend upon the time of day or length of waiting
line.
 A queuing system consists of a number of service centers and interconnecting
queues.
 Each service center consists some number of severs c, working in parallel.
 Parallel service mechanisms are either single server (c=1), multiple servers (1
< c < ∞) or unlimited servers (c=∞).
 A self service facility is usually characterized as having an unlimited number
of servers.
 Example: Consider a discount warehouse where customers may either serve
themselves or wait for one of the 3 clerks and finally leave after paying a
single cashier. The system is represented by the flow diagram in figure

5.2.The sub system consisting of queue 2 and service center 2 is shown in
more detail in the figure 5.3.
Figure 5.2 Discount warehouse with three service centers
Figure 5.3 Service center 2, with c=3 parallel servers
5.2 Queueing Notation [M-16,17 D-12]

 A notation system for parallel server queues given by Kendall is A/B/c/N/K
 A represents the interarrival-time distribution,
 B represents the service-time distribution,
 c represents the number of parallel servers,
 N represents the system capacity,
 K represents the size of the calling population.
 Common symbols for A and B include M (exponential or Markov), D (constant or
deterministic), Ek (Erlang of order k), P H (phase-type), H (hyperexponential), G
(arbitrary or general), and GI (general independent).
 Examples:
• M/M/1 (also M/M/1//) indicates a single-server that has unlimited queue
capacity and an infinite population model. The inter-arrivals and service times
are exponentially distributed.
– When N and K are infinite, they are often dropped from the notation
• G/G/1/5/5 represents a queueing system with general (or arbitrary) inter-
arrival and service distribution with single server, with a queue capacity of 5
and finite population model of size 5
– General models are used to solve the queue system with no particular
distribution in mind
– Very useful as the final results can be obtained by plugging in the
values of specific distributions
 Additional Queueing Notations for Parallel Server Systems

Pn: steady-state probability of having n customers in system,
Pn(t): probability of n customers in system at time t,
λ: arrival rate,
λe: effective arrival rate,
μ: service rate of one server,
ρ: server utilization,
An: interarrival time between customers n-1 and n,
Sn: service time of the nth arriving customer,
Wn: total time spent in system by the nth arriving customer,
WnQ: total time spent in the waiting line by customer n,
L(t): the number of customers in system at time t,
LQ(t): the number of customers in queue at time t,
L: long-run time-average number of customers in system,
LQ: long-run time-average number of customers in queue,
w: long-run average time spent in system per customer,
wQ: long-run average time spent in queue per customer.
5.3 Long Run Performance Measures of Queueing Systems

[M-05,06,11,12, D-06,07,13]
1. Time-Average Number in System L

 Consider a queueing system over a period of time T, and let L(t) denote the number of
customers in the system at time t.
 A simulation of such system is shown in figure 5.4.
Figure 5.4 Number in System, L(t), at time t

 Let Ti denote the total time during [0, T] in which the system contained exactly i
customers.
 The time-weighted-average number in a system is defined by:

̂ ∑ ∑ ( *
 From the figure 5.4 above:

T0 = 3, T1 = 12, T2 = 4, and T3 = 1
̂ , ( ) ( ) ( ) ( )- =1.15customers
 Consider the total area under the function is L(t), then,
̂ ∑ ∫ ( )
 The long-run time-average number in system, with probability 1:

̂ ∫ ( )
 If LQ(t) denotes the number of customers waiting in line, and denotes the total
time during [0,T] in which exactly i customers are waiting in line then, the time-
weighted-average number in queue is:
̂ ∑ ∫ ( )
Figure 5.5 Number waiting in line, LQ(t), at time t

From the figure 5.5 above,
( )
( ) {
( ) ( )
Area under T3 is 1 i.e.

Area under T2 is 4 i.e.
Hence Area under T1 is 20 – (1 +4) =15 i.e.
Therefore,
( ) ( ) ( )
̂ = 0.3 customers
2. Average Time Spent in System Per Customer w

 If we simulate the queueing system for a period, say T and then record the time each
customer spends in the system during [0,T], say W1, W2, …., WN where N is the
number of arrivals in [0,T].
 The average time spent in system per customer, called the average system time, is:
̂ ∑
 For stable systems, ̂ with probability 1, where w is called the long-

run average system time.
 If the system under consideration is the queue alone (with WiQ being the total time
customer i spends waiting in the queue):
̂ ∑
where ̂ is the observed average time spent in queue (called delay), and wQ is the
long-run average delay per customer.
 Referring to the figure 5.4, N=5 customers are in the system over a period (0, 20). For
getting W1, W2, …, W5 we require more information. Here we assume that the system
has single server and a FIFO queue discipline. In figure 5.4, each jump in upward
direction of L(t) represents the arrival event whereas the jump in downward direction
represent the departure event. Hence arrivals occur at 0, 3, 5, 7, 16 where as the
departures occur at 2, 8, 10, 14, 20.
 Now, W1 = 2 – 0 = 2, W2 = 8 – 3 = 5, W3 = 10 – 5 = 5, W4 = 14 – 7 = 7, W5 = 20 – 16
=4
 The average system time is:
W1  W2  ...  W5 2  (8  3)  ...  (20  16)
wˆ    4.6 time units
5 5
 Similarly, the time spent by each customer in waiting line is:
W1  W2  ...  W5 0 0330
Q Q Q
wˆ Q    1.2 time units

5 5
3. Conservation Equation or Little’s Law [D-07]

 Conservation equation (a.k.a. Little‟s law) is given by ̂ ̂̂
 This relationship holds for almost all queueing systems or subsystems (regardless
of the number of servers, the queue discipline, or other special circumstances).
Derivation:
Total system time for all customers is given by the total area under the number in system
function L(t).
∑ ∫ ( )
Multiplying both sides by (1/T)

∑ ∫ ( )
Multiplying and Dividing by N on L.H.S. and then interchanging sides
∫ ( ) ∑ ̂ ̂̂ ̂
4. Server Utilization
 Server utilization is the proportion of time that a server is busy.
 Observed server utilization, ̂, is defined over a specified time interval [0,T].
 Long-run server utilization is ρ.
 For systems with long-run stability: ̂
As per the above figure same as figure 5.4, the server utilization is
( ) ∑
̂
where T0 is the total idle time.
For G/G/1/∞/∞ queues:
 Any single-server queueing system with average arrival rate λ customers per time
unit, where average service time E(S) = 1/µ time units or µ is the average service rate,
infinite queue capacity and calling population.
 The server alone can be considered as a queueing system in itself, so conservation
equation, L = λw, can be applied.
 For a stable system, the average arrival rate to the server, λs, must be identical to λ.
 The average number of customers in the server is:
̂ ∫ ( ( ) ( ))
 For a single-server case, the average number of customers being served at any
arbitrary point in time is equal to server utilization!!
 In general, for a single-server queue:
(̂ ̂) ( ) ( )
 For a single-server stable queue, the arrival rate λ must be less than the service rate µ;
(λ< µ) or
 For an unstable queue (λ > µ)
 The server is always busy
 Waiting line tend to grow in length at an average rate of (λ -µ) customers
per time unit and long run average queue length is 
 Long-run server utilization is 1.
For G/G/c/∞/∞ queues:

 Consider a system with c identical servers in parallel.
 Arrivals occur at a rate λ and each server works at a rate µ customers per time unit.
 If an arriving customer finds more than one server idle, the customer chooses a server
without favoring any particular server.
 For systems in statistical equilibrium, the average number of busy servers, Ls, is: Ls, =
λE(s) = λ/µ.
 The long-run average server utilization is:
Example 5.1
Customers arrive at random to the passport center at a rate of 40 customers per hour.
Currently, there are 20 clerks, each serving 4 customers per hour on the average. Estimate the
average utilization of a server and the average number of busy servers. Can we decrease the
number of servers?
Solution:
Given that
Arrival rate,  = 40 customers per hour,
Service rate  = 4 customers per hour
Number of servers, c = 20 i.e. it is a multi-server system.
The long-run or steady state average utilization of a server is
( )
The average number of busy servers is:
A typical clerk is busy only 50% of the time!!
For system to be stable, the number of servers must be

Possibilities are c ≥ 11, 12, etc.
For D/D/1 queues:

 Consider the queuing system with single server and deterministic arrival and service
time.
 Here, the interarrival times {A1,A2, ….}are equal to E(A) = 1/λ.
 Also, the service times {S1, S2, ….}are equal to E(S) = 1/µ.
 Assume that a customer arrives to an empty system at time 0.
 The evolution of the system is completely predictable as shown in the figure 5.6
below:
Figure 5.6 Deterministic Queue (D/D/1)
 Here, L = ρ =λ/µ, w = E(S) =1/µ, LQ = WQ = 0

 By varying λ and µ, server utilization can assume any value between 0 and 1.
 Yet there is never any line.
 In general, variability of inter-arrival and service times (small inter-arrival times and
large service times which increases existing waiting line whereas large inter-arrival
times and small service times which reduces the existing waiting line) causes lines to
fluctuate in length.
Example 5.2
A physician schedules patients every 10 minutes and spends Si minutes with the ith patient
which is given by:
{
Find the steady state average utilization of the server.
Solution:
Given that patients are arriving after every 10 minutes; hence the arrivals are deterministic.
Therefore, arrival rate λ = (1/10) per minute.
The services are probabilistic.
Therefore, the mean service time = E(S) = 9(0.9) + 12(0.1) = 9.3 minutes
Therefore, service rate µ = (1/9.3) per minute.
Average sever utilization = =

From this we can say that, a physician will be busy for 93%.
5. Costs in Queueing System [N-04, S-15, D-11,15]

 Costs can be associated with various aspects such as the waiting line or servers in the
queueing problem.
 Let ₹ x be the cost per hour per customer in the queue.
 If customer j spends hours in queue then the total cost of the N customers those
arrive during the simulation is ∑ .
 Hence, average cost per customer is∑ ̂.

 Let ̂ customers per hour arrive to the system then the average cost per hour is
(̂ )( ̂ ) ̂
 The server may also enforce costs on the system.
 Let a group of c parallel servers have utilization ρ and each server imposes a cost of
₹.y per hour while busy.
 Then the cost for a set of c parallel servers:
 When servers are busy: ₹. y.cρ
 When servers are idle: ₹. y.c(1 - ρ)
 From this we can say that, total costs of the system can be reduced by varying the
parameters such as number of servers, service rate, and system capacity.
5.4 Steady-State Behavior of Infinite-Population Markovian Models

 Consider the infinite population models where the arrivals occur according to a
Poisson process with rate λ arrivals per unit time.
 Hence, the interarrival times are exponentially distributed with mean 1/ λ.
 Let the service times be exponentially distributed (M) or arbitrary (G).
 Let the queue discipline be FIFO.
 The models having exponentially distributed interarrival times are called as
Markovian models.
 A queueing system is in static equilibrium (steady state), if the probability that the
system state (number of customers in system) is independent of time t, i.e.,
P[L(t) = n] = Pn(t) = Pn
where Pn(t): probability of n customers in system at time t
Pn: steady-state probability of having n customers in system
 In many queueing models the following two characteristics are observed:
1. Approaching static equilibrium (steady state) from any starting state and
2. Staying in static equilibrium (steady state) once it is reached.
 The steady-state parameter, the time-average number of customers in the system (L)
is given by:
 Since L is also one of the long-run performance measure of the system, hence, the
other steady-state parameters can be computed by using Little‟s equation.
 In case of infinite calling population, for the system to be stable .

5.4.1 M/G/1 Queue [M-13,15,16, D-05,07,08,11,14,17]
 M/G/1 queue has a single server with unlimited capacity and interarrival time which
is exponentially / Poisson distributed whereas the service time is general having mean
1/µ with variance σ².
 If ρ = λ/µ < 1, the steady-state parameters of M/G/1 queue:
Table 5.1 Steady-State Parameters of the M/G/1 Queue
Steady-State Parameter Equation
Server Utilization  /
Long-run time average number of 2 (1  2   2 )  2 (1   2  2 )
customers in the system L      
2(1   ) 2(1   )
Long-run average time spent in the 1  (1 /    )
2 2
system/customer w 
 2(1   )
Long-run average time spent in  (1 /  2   2 )
queue/customer w 
2(1   )
Q
Long-run time average number of  2 (1   2  2 )

customers in queue L 
2(1   )
Q
Steady state probability of having 0 P0  (1   )

customers in system
Example 5. 3
Widget-making machines malfunction apparently at random and then require a mechanic‟s
attention. It is assumed that malfunctions occur according to a Poisson process, at the rate of
1.5 per hour. Observations over several months has found that repair times by a single
mechanic take an average time of 30 minutes, with the standard deviation of 20 minutes.
Compute:
a) The utilization of mechanic
b) The time-average number of machines in the system
c) The average time an arrival spends in the system
d) The average time machine spends in the queue
e) The time-average number in the queue
f) The probability of zero machines with the mechanic.
Solution:
Given, λ=1.5 per hour
The mean service time = 1/µ = 30/60 = ½ hour
Therefore, µ = 2 per hour
σ² = (20)² minutes² =1/9 hour²
a) The utilization of the mechanic

   /  = 1.5/2 = 0.75
b) The time-average number of machines in the system
2 (1  2   2 )  2 (1   2  2 ) (0.75)2 (1  4 / 9)
L   0.75   2.375 machines
2(1   ) 2(1   ) 2(1  0.75)
c) The average time an arrival spends in the system

1 1
 (1 /  2   2 ) 1 1.5( 4  9 )
1
w     1.583hour
 2(1   ) 2 2(1  0.75)
d) The average time machine spends in the queue
1 1
 (1 /  2   2 ) 1.5( 4  9 )
wQ    1.083hour
2(1   ) 2(1  0.75)
e) The time - average number in the queue
4
(0.75) 2 (1  )
 (1    )
2 2 2
9  1.625machines
LQ  
2(1   ) 2(1  0.75)
f) The probability of zero machines with the mechanic.
P0  (1   )  (1  0.75)  0.25
5.4.2 M/M/1 Queue [M-08,13,15,16, D-05,06,07,08,11,12,14,17]

 M/M/1 queue has a single server with unlimited capacity and interarrival time which
is exponentially distributed whereas the service time is exponentially distributed
having mean 1/µ with variance σ² = 1/µ².
 This model is useful when service times have standard deviations approximately
equal to their means.
 The steady state parameters may be computed by substituting σ² = 1/µ² into the
equation of M/G/1 queue.
Table 5.2 Steady-State Parameters of the M/M/1 Queue

Server Utilization  /
Long-run time average number of  
L 
customers in the system    1 
Long-run average time spent in the 1 1
w  
system/customer     (1   )
Long-run average time spent in  
wQ  
queue/customer       (1   )
Long-run time average number of 2 2
L  
customers in queue      1  
Q
Steady state probability of having 0 P0  (1-  )

customers in system
Steady state probability of having n Pn  1    n
customers in system
Example 5.4 [S-15, D-16]

Suppose that the interarrival times and service times at a single chair unisex hair-styling shop
have been shown to be exponentially distributed with values 2 per hour and 3 per hour
respectively. Compute
a) The utilization of server
b) The time-average number of customers in the system
c) The time-average number of customers in the queue
d) The average time customer spends in the system
e) The average time customer spends in the queue
f) The probability of zero, one, two, three, and four or more customers in the shop.
Solution:
Given: λ = 2 per hour, µ = 3 per hour
a) The utilization of server
b) The time-average number of customers in the system
c) The time-average number of customers in the queue

( )
d) The average time customer spends in the system
( ) ( )
e) The average time customer spends in the queue
( ) ( )
f) The probability of zero, one, two, three, and four or more customers in the shop.
( )
( ) ( )
( ) ( )( )
( ) ( )( )
( ) ( )( )
∑ ( )
5.4.3 Multi-Server Queue: M/M/c/∞/∞ Queue

 In M/M/c queue, c channels (servers) are operating in parallel.
 Each channel has an independent and identical exponential service time distributions
with mean 1/µ.
 The arrivals are assumed to be from Poisson process with rate λ.
 Arrivals will join a single queue and enter the first available service channel.
 If the number of customers in the system n < c, then an arriving customer will enter an
available channel else customer will join a queue.
 The offered load is given by λ/µ.
 If λ ≥ cµ, the system is not in steady state and the waiting line grows in length at the
rate of (λ - cµ) customers per time unit, on the average.
Table 5.3 Steady-State Parameters of the M/M/c Queue

Server Utilization    / c
Steady state probability of having 0 1
 c 1 ( /  ) n     c  1  c  
customers in system P0          
  n  0 n !  
     c!  c    
1
 c 1 (c ) n   c  1  1    
     c     
 n 0 n!    c!  1    
Probability that all servers are busy
PL   c  
  c P0  c c P0
c!1   c  c!1   
Long-run time average number of (c ) c 1 P0  P  L ( )  c 
customers in the system L  c   c 
c(c!)(1   ) 2 1 
Long-run average time spent in the L
w
system/customer 
Long-run average time spent in 1
wQ  w 
queue/customer 
Long-run time average number of (c ) c 1 P0  P  L ()  c 
customers in queue LQ  wQ  
c(c!)(1   ) 2
1 
Average number of busy servers or 
L  LQ   c
average number of customers being 
served
Steady state probability of having n Pn = P(L(∞)=n)
customers in system
Example 5.5 [D-13]

A CNG station has two filling machines. The service time follows the exponential
distribution with mean of 5 minutes and taxis arrives for service in Poisson fashion at rate of
15 per hour. Compute the steady state parameters for this M/M/c system.
Solution:
The CNG station facility is modeled by M/M/2 queue with:
λ= 15/ hour and
Mean service time =1/ µ = 5 minutes = (5/60) hours = (1/12) hours.
Therefore, service rate µ = 12 / hour
Number of servers = c = 2
The steady state parameters of M/M/2 system are computed as follows:

Server Utilization    / c =15/(2)(12) = 0.625
Steady state probability of having 0 1
 c 1 ( /  ) n     c  1  c  
customers in system i.e. CNG P0          
system is empty  n 0 n!      c!  c    
( ⁄ ) ( )
=〈*∑ + [. / . / . ( )
/]〉
=( . / . / . /) = . /
Probability that all servers are busy c c P0
PL   c  
c!1   
i.e. all CNG filling machines are
busy
,( )( )- ( )
= ( )( )
Long-run time average number of  P  L ( )  c 
L  c 
customers in the system i.e. average 1 
number of taxis ( )( )
=( )( )
Long-run average time spent in the L

w = = 0.137 hours
system/taxi 
Long-run average time spent in 1
wQ  w  = 0.137- (1/12) = 0.054 hours
queue/taxi 
Long-run time average number of LQ  wQ = (15)(0.054) = 0.81 taxis
taxis in queue
Example 5.6[M-16, D-10]

A tool crib has exponential interarrival time and service time, and it serves a very large group
of mechanics. The mean time between arrivals is 4 minutes. It takes 3 minutes on the average
for a tool crib attendant to service a mechanic. The attendant is paid ₹ 10 per hour and the
mechanic is paid ₹ 15 per hour. Would it be advisable to have a second tool crib attendant?
Solution:
The tool crib is modeled by an M/M/c queue (λ = 1/4, μ = 1/3, c = 1 or 2).
Given that attendants are paid ₹ 10 per hour and the mechanics are paid ₹ 15 per hour.
Mean cost per hour = ₹10c + ₹15L
assuming that mechanics impose cost on the system while in the queue and in service.
CASE 1: one attendant - M/M/1 (c = 1, ρ = λ/μ = 0.75)

L = ρ/(1 − ρ) = 3 mechanics
Mean cost per hour = ₹ 10(1) + ₹ 15(3) = ₹ 55 per hour.
CASE 2: two attendants - M/M/2 (c = 2, ρ = λ/cμ = 0.375)

(c ) c 1 P0
L  c  =0.8727,
c(c!)(1   ) 2
where
1

 c 1 (c )  
n
c  1  1  

P0 =    c     

 n 0 n!    c!  1    
= 0.4545
Mean cost per hour = ₹ 10(2) + ₹ 15(0.8727) = ₹ 33.09 per hour
It would be advisable to have a second attendant because long run costs are reduced by ₹
21.91 per hour.
5.5 Network of Queues [M-10, D-11]
 Many systems are naturally modeled as networks of single queues in which customers
departing one queue may be routed to another.
 The following results assume a stable system with infinite calling population and no
limit on system capacity:
1. If there are no customers created or destroyed in the queue, then the departure
rate out of the queue is the same as the arrival rate into the queue, over the
long run.
2. If customers arrive to queue m at rate λm, and a fraction 0 ≤ Pmn ≤ 1 of them
are routed to queue n upon departure, then the arrival rate from queue m to
queue n is λmPmn over the long run.
3. The overall arrival rate into queue n is the sum of the arrival rate from all
sources.
4. If queue n has cn < ∞ parallel servers with service rate µn, then the long run
utilization of each server is and < 1is the condition for stable
queue.
5. If, for each queue n, arrivals from outside the network is Poisson process with
rate an and there are cn identical servers having exponentially distributed
service times with mean 1/µn, then in steady state queue n behaves like an
M/M/cn queue with arrival rate
∑
■■■
UNIT: III
RANDOM NUMBERS
CHAPTER 6 RANDOM NUMBER GENERATION
CHAPTER 9 RANDOM VARIATE GENERATION

CHAPTER 6
RANDOM NUMBER GENERATION
6.1 Introduction
 A random number is a number generated by a process, whose outcome is
unpredictable, and which cannot be sub sequentially reliably reproduced.
 Random numbers are the basic building blocks for all simulation algorithms.
 Similarly, simulation languages generate random numbers that are used to generate
event times and other random variables.
Why are random numbers used in simulation? [M-07, 13]

 For most real-life systems, we do not have exact characterizations of the input
parameters. Hence, using probabilistic inputs makes the results of the analysis more
robust.
 Even if we do an exact characterization of the input parameters, it is often
computationally too expensive or analytically intractable to take them into account.
 The variability and uncertainty inherent in the systems studied is modeled using
random inputs.
 Random numbers are generated to model the timing and behavior of events. Also they
are used to generate values from a statistical distribution, for example, interarrival
time, service time, number of items in an order.
6.2 Properties of Random Numbers [M-05,06,08,09,15, S-15, D-05,08,09,10,16]

 Consider a sequence of random numbers R1, R2,…
 These random numbers must have two important statistical properties, uniformity and
independence.
 Each random number Ri is an independent sample drawn from a continuous uniform
distribution between 0 and 1.
 The probability density function (pdf) is given by
( ) 2
Figure 6.1 The pdf for random number

 The expected value of each Ri is given by
( ) ∫ * +
 The variance of each Ri is given by
( ) ∫ , ( )- * +
 Some consequences of the uniformity and independence properties are the following:
1. If the interval (0, 1) is divided into n classes, or subintervals of equal length,
the expected number of observations in each interval is N / n, where N is the
total number of observations.
2. The probability of observing a value in a particular interval is independent of
previous values drawn.
6.3 Generation of Pseudo-Random Numbers

 The term “pseudo” means false, hence this involves the generation of false random
numbers.
 It implies that generating random numbers by a known method removes the potential
for true randomness.
 The goal of any generation scheme, however, is to produce a sequence of numbers
between 0 and 1 that simulates, or imitates, the ideal properties of uniform
distribution and independence as closely as possible.
In the generation of pseudo-random numbers, certain problems or errors can occur.

[M-10, S-15, D-05,17]
Some examples of such errors include the following:
1. The generated numbers might not be uniformly distributed.
2. The generated numbers might be discrete-valued instead of continuous-valued.
3. The mean of the generated numbers might be too high or too low.
4. The variance of the generated numbers might be too high or too low.
5. There might be dependence. The following are examples:
a) Autocorrelation between numbers;
b) Numbers successively higher or lower than adjacent numbers;
c) Several numbers above the mean followed by several numbers below the mean.
Important criteria for selection of a random number generator:

1. The routine should be fast. Generator may generate millions of random numbers.
2. The routine should be portable to different computers with different numeric precision
and ideally to different programming languages.
3. The routine should have a sufficiently long cycle. The cycle length, or period,
represents the length of the random number sequence before previous numbers begin
to repeat themselves in an earlier order.
4. The random numbers should be replicable. Given the starting point (seed value) it
should be possible to generate the same set of random numbers, completely
independent of the system that is being simulated.
5. The generated random numbers should closely approximate the ideal statistical
properties of uniformity and independence.
6.4 Techniques for Generating Random Numbers

[M-05,06,07,08,09,10,11,13,14,15, D-07,08,09,10,12,14 ]
6.4.1 Linear Congruential Method

 The linear congruential method produces a sequence of integers, X1, X2, … between
zero and m – 1 according to the following recursive relationship:
( )
 The initial value X0 is called the seed, a is called the multiplier, c is called the
increment, and m is the modulus.
 If c ≠ 0 in the above equation, the form is called the mixed congruential method.
 When c = 0, the form is known as the multiplicative congruential method.
 The selection of the values for a, c, m, and X0 drastically affects the statistical
properties and cycle length.
 The random integers generated from linear congruential method can be converted to
random number by the formula:
 The ultimate test of the linear congruential method or any generation technique is
uniformity and independence.
 There are, however, several secondary properties that must be considered. These
include maximum density and maximum period.
 Maximum Density: It means that the values assumed by Ri, i = 1,2,… leave no large
gaps on [0, 1].
 Maximum Period: It helps to achieve maximum density, and to avoid cycling (i.e.
recurrence of the same sequence of generated numbers). Maximal period can be
achieved by the proper choice of a, c, m, and X0. Table below gives the period that
can be achieved for different choices of a, c, m, and X0.
Modulus Increment Constant Seed Period
(m) (c) Multiplier (a) (X0) (P)
≠ 0 and
relatively prime
to m i.e., 1 + 4k and k is an
2b -- m = 2b
greatest integer
common factor
of c and m is 1.
3 + 8k or 5 + 8k
2b 0 Odd
k = 0, 1, …
For smallest integer
Prime Number 0 k , ak – 1 is -- m-1
divisible by m
Example 6.1
(i) Use the linear congruential method to generate a sequence of three two-digit random
integers. Let X0 = 27, a = 8, c = 47, and m = 100.
(ii) Do we encounter a problem if X0 = 0?
Solution:
(i) Given X0 = 27, a = 8, c = 47, and m = 100.
Note: Here the random integers will be generated between 0 and 99 because of the
value of the modulus.
Random numbers between 0 and 1 can be generated by
According to linear congruential method, ( )
X0 = 27
X1 = (8 x 27 + 47) mod 100 = 263 mod 100 = 63
R1 = 63/100 = 0.63
X2 = (8 x 63 + 47) mod 100 = 551 mod 100 = 51

R2 = 51/100 = 0.51
X3 = (8 x 51 + 47) mod 100 = 455 mod 100 = 55

R3 = 55/100 = 0.55
(ii) If X0 = 0
X1 = (8 x 0 + 47) mod 100 = 47 mod 100 = 47

R1 = 47/100 = 0.47
X2 = (8 x 47 + 47) mod 100 = 423 mod 100 = 23

R2 = 23/100 = 0.23
X3 = (8 x 23 + 47) mod 100 = 231 mod 100 = 31

R3 = 31/100 = 0.31
Hence, no problem is encountered if X0 = 0. A problem would occur only if c = 0 also.
Example 6.2 [S-15, D-16]

Use the mixed congruential method to generate a sequence of three two-digit random integers
between 0 and 24 with X0 = 13, a = 9, and c = 35.
Solution:
Given X0 = 13, a = 9, and c = 35.
Random integers between 0 and 24 indicates m = 25.
Random numbers between 0 and 1 can be generated by
According to linear congruential method, ( )
X0 = 13
X1 = (9 x 13 + 35) mod 25 = 152 mod 25 = 2

R1 = 2/25 = 0.08
X2 = (9 x 2 + 35) mod 25 = 53 mod 25 = 3

R2 = 3/25 = 0.12
X3 = (9 x 3 + 35) mod 25 = 62 mod 25 = 2

R3 = 2/25 = 0.08
Example 6.3
Use the multiplicative congruential method to generate a sequence of four three-digit random
numbers. Use X0 = 117, a = 43, and m = 1000.
Solution:
Given: X0 = 117, a = 43, m = 1000
For multiplicative congruential method,
( )
X1 = [43(117)] mod 1000 = 31
X2 = [43(31)] mod 1000 = 333
X3 = [43(333)] mod 1000 = 319
X4 = [43(319)] mod 1000 = 717
Example 6.4
Determine whether the multiplicative congruential generator with a = 6507, c = 0, and m =
1024 can achieve a maximum period. Also state the restriction on X0 to obtain this period.
Solution:
Given a = 6507, c = 0, and m = 1024 = 210
Since modulus m = 210 is of the form 2b and c = 0, then constant multiplier must be
3 + 8k or 5 + 8k for k = 0.1,2,…
When a = 3 + 8k,
i.e., 6507 = 3 + 8k k = 813 which is an integer.
When a = 5 + 8k,
i.e., 6507 = 5+ 8k k = 812.75 which is not an integer.
From this result we can say that the multiplicative congruential generator can achieve a
maximum period
and the seed X0 must be odd.
6.4.2 Combined Linear Congruential Generators

 As computing power has increased, the complexity of the systems that we are able to
simulate has also increased.
 Hence, the generators with substantially longer periods are required.
 This problem is overcome by combining two or more multiplicative congruential

generators with relatively prime cycle length.
 The method is based on following results:
i. If Wi,1, Wi,2,…,Wi,k are any independent, discrete-valued random variables such
that Wi,1 is uniformly distributed on the integers from 0 to m1-2, then,
(∑ )
is also uniformly distributed on the integers 0 to m1-2. Now this result can be used
to combine generators.
ii. Let Xi,1, Xi,2,…, Xi,k be the ith output from k different multiplicative congruential
generators.
iii. In this case the jth generator has the prime modulus mj, the multiplier aj and the
period mj-1.
iv. The jth generator produces integers ( ) and
( )
v. The combined generators take the form
(∑( ) )
{
vi. The maximum possible period for this generator is
( )( ) ( )
vii. This leads to the following algorithm that combines k generators, with modulus mi
and constant multiplier ai.
Algorithm:
1. Select seeds Xi,0 in the range [1, mi-1] for each k generator such that i = 1 to k. Set
j=0, l=0.
2. Evaluate each individual generator.
For i= 1 to k
3. Combine the generators
(∑( ) )
4. Return {
5. Set j = j + 1, l = l + 1 and go to step 2.

Example 6.5
Use the combined linear congruential method to combine three multiplicative generators with
m1 = 32363, a1 = 157, m2 = 31727, a2 = 146, m3 = 31657, and a3 = 142. Generate random
number with the combined generator using initial seeds Xi,0 = 100, 300, 500 for the individual
generators i = 1, 2, 3.
Solution:
Given the following:
Initial seeds: X1,0 = 100, X2,0 = 300, X3,0 = 500
Modulus: m1 = 32363, m2 = 31727, m3 = 31657
Constant multiplier: a1 = 157, a2 = 146, a3 = 142
1. Select the seed Xi,0 in the range [1, mi-1].

X1,0 = 100, X2,0 = 300, X3,0 = 500
Set j = 0, l = 0
2. Evaluate each individual generator
X1,1 = (a1X1,0) mod m1
= (157 * 100) mod 32363
= 15700 mod 32363
= 15700
X2,1 = (a2X2,0) mod m2
= (146 * 300) mod 31727
= 12073 mod 31727
= 12073
X3,1 = (a3X3,0) mod m1
= (142 * 500) mod 31657
= 71000 mod 31657
= 7686
3. Combine the generators
(∑( ) ) ( )
( )
= 11313
4. Compute the 1st random number
5. Set j = 1, l = 1, and go to step 2 for generating the next random number.
6.5 Tests for Random Numbers or Hypothesis Testing [N-04, D-07]

 Hypothesis testing is used in statistics to decide whether a particular assumption is
correct or not.
 This assumption is called a hypothesis, and it typically is an assertion about a

distribution of one or more random variables, or about some measure of a distribution,
such as the mean and the variance.
 The desirable properties of random numbers we discussed are uniformity and
independence.
 To insure that these desirable properties are achieved, a number of tests can be
performed.
 The tests can be placed in two categories according to the properties of interest.
 The first entry in the list below concerns testing for uniformity.
 The second through fifth entries concern testing for independence.
 The five types of tests discussed in this chapter are as follows:
4. Frequency test: Uses the Kolmogorov-Smirnov or the Chi-square test to
compare the distribution of the set of numbers generated to a uniform
distribution.
5. Runs test: Tests the runs up and down or the runs above and below the mean
by comparing the actual values to the expected values. The statistic for the
comparison is the chi-square.
6. Autocorrelation test: Tests the correlation between numbers and compares the
sample correlation to the expected correlation of zero.
7. Gap test: Counts the number of digits that appear between repetitions of a
particular digit and then uses the Kolmogorov-Smirnov test to compare with
the expected size of gaps.
8. Poker test: Treats numbers grouped together as a poker hand. Then the hands
obtained are compared to what is expected using the chi-square test.
 In testing for uniformity, the hypotheses are as follows:
, -
, -
The null hypothesis, H0, reads that the numbers are distributed uniformly on the
interval [0, 1]. Failure to reject the null hypothesis means that the evidence of non-
uniformity has not been detected by the test.
 In testing for independence, the hypotheses are as follows:
The null hypothesis, H0, reads that the numbers are independent. Failure to reject the
null hypothesis means that the evidence of dependence has not been detected by the
test.
 From each test, a level of significance α must be stated. The level α is the probability
of rejecting the null hypothesis when the null hypothesis is true i.e.,
α = P(reject H0 | H0 true)
 There are two errors associated with hypothesis testing, namely Type I error and Type
II error.
1. Type I error (α): It is the error of rejecting H0 when in fact it is true.
2. Type II error (β): It is the error of accepting H0 when in fact it is false.
 The type I error is commonly known as false negative, and the type II error is known
as the false positive.
6.5.1 Test for Uniformity / Frequency Test
 A basic test that should always be performed to validate a new generator is the test for
uniformity.
 Two different methods of testing are available. They are the Kolmogorov-Smirnov
and the chi-square test.
 Both of these tests measure the degree of agreement between the distribution of a
sample of generated random numbers and the theoretical uniform distribution.
 Both tests are based on the null hypothesis of no significant difference between the
sample distribution and the theoretical distribution.
1. The Kolmogorov-Smirnov test [D-11,14]
 This test compares the continuous CDF, F(x), of the uniform distribution with the
empirical CDF, SN(x), of the N sample observations.
 By definition, ( )
 If the sample from the random-number generator is R1, R2,…,RN, then the empirical
CDF, SN(x), is defined by
( )
 As N becomes larger, SN(x) should become a better approximation to F(x), provided
that the null hypothesis is true.
 The Kolmogorov-Smirnov test is based on the largest absolute deviation between F(x)
and SN(x) over the range of the random variable, i.e., it is based on the statistic
| ( ) ( )|
 The sampling distribution of D is known and is tabulated as a function of N in Table
A.5.
Algorithm:
1. Define the hypothesis for testing the uniformity as

, -
, -
2. Rank the data from smallest to largest. Let Ri denote the ith smallest observation, so
that
3. Compute D+ and D- where

{ }
{ }
4. Compute D = max(D+, D-)

5. Locate in Appendix A.8 the critical value, Dα, for the specified significance level α
and the given sample size N.
6. If D > Dα, reject H0. Else, conclude that no difference has been detected between the
sample distribution and the uniform distribution.
Note: Kolmogorov-Smirnov test is more powerful and can be applied to small sample sizes
(N<50).
Example 6.6 [D-14]

The sequence of numbers 0.54, 0.73, 0.98, 0.11, and 0.68 has been generated. Use the
Kolmogorov-Smirnov test with α = 0.05 to determine if the hypothesis that the numbers are
uniformly distributed on the interval [0, 1] can be rejected. Use D0.05, 5 = 0.565.
Solution:
Step 1: Define the hypothesis for testing the uniformity as:
, -
, -
Step 2: Now, rank data from smallest to largest
0.11 ≤ 0.54 ≤ 0.68 ≤ 0.73 ≤ 0.98
Step 3: Compute D+ and D-
i 1 2 3 4 5
Ri 0.11 0.54 0.68 0.73 0.98
0.20 0.40 0.60 0.80 1.00
0.00 0.20 0.40 0.60 0.80
0.09 -- -- 0.07 0.02
0.11 0.34 0.28 0.13 0.18
D+ = max{ } = 0.09,
D- = max { } = 0.34
Step 4: Compute D
D = max {D+, D-} = max{0.09, 0.34} = 0.34
Step 5: Given, for α = 0.05, D0.05, 5 = 0.565
Since D = 0.34 < D0.05, 5 = 0.565,
, - is accepted i.e., random numbers are uniformly distributed.
Example 6.7
The sequence of numbers 0.63, 0.49, 0.24, 0.89, and 0.71 has been generated. Use the
Solution:
, -
, -
0.11 ≤ 0.54 ≤ 0.68 ≤ 0.73 ≤ 0.98
i 1 2 3 4 5
Ri 0.24 0.49 0.57 0.63 0.71
0.20 0.40 0.60 0.80 1.00
0.00 0.20 0.40 0.60 0.80
-- -- 0.03 0.17 0.29
0.24 0.29 0.17 0.03 --
D+ = max{ } = 0.29,
D- = max { } = 0.29
Step 4: Compute D
D = max {D+, D-} = max{0.29, 0.29} = 0.29
Step 5: Given, for α = 0.05, D0.05, 5 = 0.565
Since D = 0.29 < D0.05, 5 = 0.565,
Example 6.8 [M-16]

Consider the following sequence of 5 numbers: 0.15, 0.94, 0.05, 0.51, 0.29. Use the
Solution:
, -
, -
0.05 ≤ 0.15 ≤ 0.29 ≤ 0.51 ≤ 0.94
i 1 2 3 4 5
Ri 0.05 0.15 0.29 0.51 0.94
0.20 0.40 0.60 0.80 1.00
0.00 0.20 0.40 0.60 0.80

0.15 0.25 0.31 0.29 0.06
0.05 -- -- -- 0.14
D+ = max{ } = 0.31,
D- = max { } = 0.14
Step 4: Compute D
D = max {D+, D-} = max{0.31, 0.14} = 0.31
Step 5: Given, for α = 0.05, D0.05, 5 = 0.565
Since D = 0.31 < D0.05, 5 = 0.565,
2. The chi-square test:

 Pearson was the first to recognize the practical value of the chi-square test.
 The chi-square test can be used to test the uniformity (or homogeneity) within a data
set or to determine the probability that there is a dependency relationship between two
or more distinct data sets.
Algorithm:
1. Define the hypothesis for testing the uniformity as
, -
, -
2. Divide the total number of observations (N) into mutually exclusive equally
numbered classes (n) as [a0, a1), [a1, a2),…, [an-1, aN). The n be chosen in such a way
that each Ei ≥ 5.
3. Compute the sample test statistics
( )
∑
( )
class
class which is N/n.
4. Determine the critical value for the specified significance level α with (n-1) degrees of
freedom from Table A.4.
5. If , reject H0. Else, conclude that, there is no difference detected between
the sample distribution and the uniform distribution.
Note: Chi-square test is valid only for large sample sizes (N ≥ 50).
Example 6.9
Consider the following sequence of 100 numbers.
0.43 0.09 0.52 0.98 0.78 0.44 0.21 0.12 0.64 0.76
0.38 0.67 0.97 0.46 0.07 0.18 0.49 0.47 0.22 0.47
0.69 0.99 0.77 0.76 0.65 0.14 0.25 0.37 0.99 0.20
0.74 0.03 0.71 0.28 0.65 0.50 0.54 0.13 0.87 0.50
0.97 0.17 0.32 0.91 0.28 0.39 0.56 0.73 0.93 0.24
0.99 0.71 0.99 0.64 0.50 0.66 0.01 0.24 0.81 0.94
0.73 0.15 0.45 0.10 0.18 0.82 0.96 0.43 0.57 0.94
0.27 0.34 0.65 0.79 0.03 0.49 0.69 0.85 0.37 0.50
0.60 0.93 0.48 0.42 0.04 0.46 0.04 0.91 0.97 0.26
0.81 0.62 0.79 0.88 0.46 0.74 0.06 0.11 0.92 0.87
Use the chi-square test, with α = 0.05, to test the hypothesis that they are uniformly
distributed on the interval [0, 1] can be rejected.
Solution:
, -
, -
Step 2: Choose the value of n such that Ei ≥ 5.
Since N = 100 and Ei = N/n
Let n = 10 intervals of equal length, namely [0, 0.1), [0.1, 0.2), … , [0.9,1.0)
Step 3: Compute the test statistics
( )
Interval
[0, 0.1) 8 10 0.4
[0.1, 0.2) 9 10 0.1
[0.2, 0.3) 10 10 0.0
[0.3, 0.4) 6 10 1.6
[0.4, 0.5) 13 10 0.9
[0.5, 0.6) 8 10 0.4
[0.6, 0.7) 11 10 0.1
[0.7, 0.8) 12 10 0.4
[0.8, 0.9) 7 10 0.9
[0.9, 1.0) 16 10 3.6
( )
( )
∑
Step 4: Determine the critical value for the specified level of significance α with (n-1)
degree of freedom from Table A.4.
Since α = 0.05, and degree of freedom (n - 1) = 10 – 1 = 9
Step 5: Since
Example 6.10 [M-17, D-14]

The following is set of single digit numbers from a random number generator. Using
appropriate test check whether the numbers are uniformly distributed. Given N = 50, α =
0.05,
{6, 7, 0, 6, 9, 9, 0, 6, 4, 6, 4, 0, 8, 2, 6, 6, 1, 2, 6, 8, 5, 6, 0, 4, 7, 1, 3, 5, 0, 7, 1, 4, 9, 8, 6, 0, 9,
6, 6, 7, 1, 0, 4, 7, 9, 2, 0, 1, 4, 8}
Solution:
, -
, -
Step 2: Choose the value of n such that Ei ≥ 5.
Since N = 50 and Ei = N/n
Let n = 10 intervals of equal length, namely [0], [1], …….,[9]
Step 3: Compute the test statistics
( )
Interval
[0] 8 5 1.8
[1] 5 5 0
[2] 3 5 0.8
[3] 1 5 3.2
[4] 6 5 0.2
[5] 2 5 1.8
[6] 11 5 7.2
[7] 4 5 0.2
[8] 4 5 0.2
[9] 6 5 0.2
( )
( )
∑
Step 4: Determine the critical value for the specified level of significance α with (n-1)
degree of freedom from Table A.4.
Since α = 0.05, and degree of freedom (n - 1) = 10 – 1 = 9
Step 5: Since
6.5.2 Test for Independence [D-06]
6.5.2.1. Runs test
 The runs test examines the arrangement of numbers in a sequence to test the
hypothesis of independence.
 A run is defined as a succession of similar events preceded and followed by a
different event.
 The length of the run is the number of events that occur in the run.
 Example: Consider the following sequence generated by tossing a coin 10 times.
H T T H H H T H H T
1 2 3 1 2 1
In the above example, there are six runs. The length of each run is 1, 2, 3, 1, 2, and 1.
6.5.2.1.1 Runs up and runs down:
 An up run is a sequence of numbers each of which is followed by a larger number.

 Similarly, a down run is a sequence of numbers each of which is followed by a
smaller number.
 Example: Consider the following sequence of 10 numbers.
0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28 0.33 0.93
 The numbers are given “+” or “-” depending on whether they are followed by a larger
number or a smaller number.
 Since there are 10 numbers, there will be total 9 +‟s and –„s.
 The last number is followed by “no event” and hence it will neither get + or -.
 The sequence of 9 +‟s and –„s is given below
- + + + - + - + +
 Here, each succession of +‟s and –„s forms a run.
 In the above example, there are six runs and the length of each run is 1, 3, 1, 1, 1, 2.
 If N is number of numbers in sequence, then maximum numbers of runs is N-1 and
minimum number of runs is one.
Algorithm:
1. Define the hypothesis for testing the independence as:
2. Write down the sequence of runs up and runs down.

3. Count the total number of runs (a), present in the sequence.
4. Compute mean and variance of a.
and
5. Compute the standard normal statistics
( )
6. Determine the critical value and for the specified significance level α from
Table A.2.
7. If , H0 is not rejected for the specified significance level α.
The critical values and rejection region are shown in figure 6.2.
Figure 6.2 Failure to reject hypothesis
Example 6.11
0.41 0.68 0.89 0.94 0.74 0.91 0.55 0.62 0.36 0.27
0.19 0.72 0.75 0.08 0.54 0.02 0.01 0.36 0.16 0.28
0.18 0.01 0.95 0.69 0.18 0.47 0.23 0.32 0.82 0.53
0.31 0.42 0.73 0.04 0.83 0.45 0.13 0.57 0.63 0.29
Based on runs up and runs down, determine whether the hypothesis of independence can be
rejected, where α = 0.05 and Z0.025 = 1.96
Solution:
Step 1: Define the hypothesis for testing the independence as:
Step 2: The sequence of runs up and down is as follows:

+ + + - + - + - - -
+ + - + - - + - + -
- + - - + - + + - -
+ + - + - - + + -
Step 3: No. of runs a = 26
N = 40
Step 4: Mean and variance of a is given as
( )
( )
Step 5: The standard normal statistics
√
Step 6: Given critical value → Zα/2= Z0.025 = 1.96
–Zα/2 ≤ Z0 ≤ Zα/2 → -1.96 ≤ -0.13 ≤ 1.96

Therefore, H0 cannot be rejected, we accept null hypothesis.
i.e.; the numbers are independent
Example 6.12 [D-10]

0.67 0.31 0.53 0.91 0.80 0.27 0.61 0.49 0.76 0.85
0.62 0.28 0.55 0.77 0.38 0.65 0.29 0.55 0.83 0.92
0.09 0.33 0.24 0.07 0.30 0.54 0.43 0.66 0.71 0.52
0.11 0.36 0.12 0.78 0.95 0.44 0.50 0.19 0.22 0.38
Based on runs up and runs down, determine whether the hypothesis of independence can be rejected,
where α = 0.05 and Z0.025 = 1.96
Solution:
Step 2: The sequence of runs up and down is as follows:

- + + - - + - + + -
- + + - + - + + + -
+ - - + + - + + - -
+ - + + - + - + +
N = 40
( )
( )

–Zα/2 ≤ Z0 ≤ Zα/2 → -1.96 ≤ -0.13 ≤ 1.96
Example 6.13 [M-09,14]

Test the following random numbers for independence by runs up and runs down test. Take α
= 0.05 and the critical value Z0.025 = 1.96.
{0.12, 0.01, 0.23, 0.28, 0.89, 0.31, 0.64, 0.28, 0.33, 0.93}
Solution:
Step2: The sequence of runs up and down is as follows:

- + + + - + - + +
N = 10
( )
( )
√
–Zα/2 ≤ Z0 ≤ Zα/2 → -1.96 ≤ -0.27 ≤ 1.96
6.5.2.1.2 Runs above and below the mean
 Here, the runs are defined as being above the mean or below the mean.
 A “+” will be used to denote a number above the mean and a “-” sign will be used to
denote a number below the mean.
 Example: Consider the following sequence of 10 numbers.
0.37 0.59 0.63 0.07 0.92 0.48 0.12 0.86 0.71 0.24
 The numbers are given “+” or “-” depending on whether they are greater than or
smaller than the expected mean which is (0 + 0.99)/2 = 0.495.
 The sequence of 10 +‟s and –„s is given below:
- + + - + - - + + -
 In the above example, there are seven runs.
 The length of each run is 1, 2, 1, 1, 2, 2, and 1.
 Let n1 and n2 be the number of observations above and below the mean respectively.
 The maximum number of runs is N = n1 + n2 and the minimum number of runs is 1.
Algorithm:
2. Write down the sequence of runs above and runs below the mean.
3. Count the number of observations above mean (n1), the number of observations below
mean (n2), and the total number of runs (b) present in the sequence.
4. Compute mean and variance of b.
( )
( )
5. Compute the standard normal statistics
( )
Table A.2.
Example 6.14 [M-17, D-05]

Consider the following sequence of random numbers. How would you test it for
independence based on runs above and below the mean for the significance level α = 0.05 and
the critical value Z0.025 = 1.96?
0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28 0.33 0.93
0.39 0.15 0.33 0.35 0.91 0.41 0.60 0.25 0.55 0.88
Solution:
Step 2: The sequence of runs above and below the mean (i.e. 0.495) is:
- - - - + - + - - +
- - - - + - + - + +
Step 3: The number of observations above mean = n1 = 7
The number of observations below mean = n2 = 13
The total number of runs b = 12
Step 4: Mean and variance of b is given as
( )( )
( ) ( )( ), ( )( ) -
( ) ( )
Step5: The standard normal statistics

–Zα/2 ≤ Z0 ≤ Zα/2 → -1.96 ≤ 1.2185 ≤ 1.96
6.5.2.1.3 Length of Runs
 Length of runs is another concern of runs test and it is expected that length of runs
should not be constant.
 Let Yi be the number of runs of length „i‟ in a sequence of N numbers.
 The expected value of Yi fro runs up and runs down or runs above and runs below the
mean is determined.
 Then the chi-square test is applied to compare expected value with the observed
value.
 The following algorithm is used to test independence on the basis of length of runs up
and down or above and below the mean in the given sequence of numbers.
Algorithm:
2. Write down the sequence of runs up and down or above and below the mean (as asked
in the problem).
3. Find the length of runs in the sequence.
4. Prepare a table for number of observed runs of each length.
Run length (i) 1 2 …
Observed runs (Oi)
5. Compute the expected value of Yi.
(i) For runs up and runs down:
, ( ) ( )-
( )
( )
{
(ii) For runs above and below mean:
( )
( )
where
. / . / . /. /
E(I) is the approximate expected length of a run and it is given as
( )
6. Compute the mean or expected total number of runs (of all lengths) in a sequence.
(ii) For runs above and below the mean:
( )
( )
7. Compute the expected number of runs of length greater than or equal to the maximum
length of observed length.
∑ ( ), where m is equal to the maximum length of observed run.
(ii) For runs above and below mean:
( ) ∑ ( ), where m is equal to the maximum length of observed run.
8. Apply the chi-square test.
Run Observed Number of Expected Number of , ( )-
Length, Runs, Runs, ( )
i Oi E(Yi)
1
.
.
.
≥m
The test statistics is calculated as:

, ( )-
∑ , where n is the number of classes.
( )
9. Determine the critical value for the specified significance level α with (n-1) degree of
10. If , reject H0.
Example 6.15
Consider the following sequence of the numbers below. Can the hypothesis that the numbers
are independent be rejected on the basis of the length of runs up and runs down at α = 0.05.
0.34 0.90 0.25 0.89 0.87 0.44 0.12 0.21 0.46 0.67
0.83 0.76 0.09 0.64 0.70 0.81 0.04 0.74 0.22 0.74
0.96 0.99 0.27 0.67 0.56 0.41 0.52 0.73 0.99 0.02
Solution:
Step 2: The sequence of runs up and runs down is
+ - + - - - + + + +
- - + + + - + - + +
+ - + - - + + + -
Step 3: The length of run in the sequence is
1 1 1 3 4 2 3 1 1 1 3 1 1 2 3 1
Step 4: The number of observed runs of each length is
Run length, i 1 2 3 4
Observed Runs, Oi 9 2 4 1
Step 5: The expected number of runs of length one, two, three, and four are
( ) , ( ) ( )-=12.58
( )
( ) , ( ) ( )-=5.26
( )
( ) , ( ) ( )-=1.45
( )
( ) , ( ) ( )-=0.30
( )
Step 6: The mean or expected total number of runs of all lengths in a sequence is:
( )
Step 7: The expected number of runs of length greater than or equal to 5 is
∑ ( ) ( )
Step 8: Apply chi-square test

i Oi E(Yi)
1 9 12.58 1.02
2 2 5.26
3 4 7 1.45 7.09 1.14 x 10-3
4 1 0.30
≥5 0 0.08
As, it is suggested, the minimum value of expected frequency is 5 in case of chi-square test.
If it is less than 5, it can be combined with the expected frequency of an adjacent class
interval. The corresponding observed frequencies would also be combined accordingly and
the value of n (number of classes) would be reduced.
Here class 3, 4, and 5 has expected frequency less than 5, so it is combined with class 2.
Similarly, we combine the observed frequency of class 3, class 4, and class 5 with class 2.
Reduce the number of classes by 3 which leads n = 5 - 3 = 2. The test statistics is calculated
as:
, ( )-
∑
( )
Step 9: The critical value for the specified significance level α = 0.05 with n - 1= (2 - 1) = 1
degree of freedom from Table A.4 is:
Since , H0 is not rejected i.e.; the numbers are independent.

Example 6.16
Consider the following sequence of the numbers below. Can the hypothesis that the numbers
are independent be rejected on the basis of the length of runs above and below mean at α =
0.05.
0.34 0.90 0.25 0.89 0.87 0.44 0.12 0.21 0.46 0.67
0.83 0.76 0.09 0.64 0.70 0.81 0.04 0.74 0.22 0.74
0.96 0.99 0.27 0.67 0.56 0.41 0.52 0.73 0.99 0.02
Solution:
Step 2: The sequence of runs up and runs down is
- + - + + - - - - +
+ + - + + + - + - +
+ + - + + - + + + -
Step 3: The length of run in the sequence is
1 1 1 2 4 3 1 3 1 1 1 3 1 2 1 3 1
Step 4: The number of observed runs of each length is
Run length, i 1 2 3 ≥4
Observed Runs, Oi 10 2 4 1
Step 5: The expected number of runs of length one, two, three, and four are:
The number of observations above mean = n1 = 18 and
The number of observations below mean = n2 = 12
(i) The approximate probability that a run has length „i‟ is:
. / . / . /. /
. / . / . /. / ( * ( * ( *( *
. / . / . /. / ( * ( * ( *( *
. / . / . /. / ( * ( * ( *( *
(ii) The approximate expected length of a run is

( )
(iii)The expected runs of various lengths:

( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
Step 6: The mean or expected total number of runs of all length in a sequence is:
( )
( )
Step 7: The expected number of runs of length greater than or equal to 4 is:
( ) ∑ ( ) ( )
Step 8: Apply the chi-square test.

i Oi E(Yi)
1 10 6.65 1.6875
2 2 3.32
3 4 7 1.73 7.19 5.0208 x 10-3
≥4 1 2.14
Since, class 2, 3, and 4 has expected frequency less than 5; so it is combined which leads to
7.19. Similarly, the observed frequency of class 2, 3, and 4 is also combined which leads to 7.
Reduce the number of classes by 2 which leads in n = 4 – 2 = 2.
The test statistics is calculated as:
, ( )-
∑
( )
Step 9: The critical value for the specified significance level α = 0.05 with n - 1= (1 - 1) = 0
degree of freedom from Table A.4 is:
Since , H0 is not rejected.

i.e.; the numbers are independent.
6.5.2.2. Autocorrelation test
 It deals with the dependence between numbers in a sequence.

 The numbers in the sequence might be related.
 Example: The numbers at position 5, 10, 15,… has larger or smaller values.
 The test computes the autocorrelation between every m numbers (lag) starting with
the ith number.
 Finally, it compares the sample correlation to the expected correlation of zero.
 A nonzero autocorrelation implies a lack of independence.
 If ρim > 0 then the subsequence has positive autocorrelation whereas if ρim < 0 then
the subsequence has negative autocorrelation.
Algorithm:
1. Define the hypothesis for testing the independence as
2. Find out the value of „i‟ and lag „m‟ using the given data.
3. Using i, m, and N estimate the value of M where M is largest integer such that
( ) ,
where N is the total number of values in the sequence.
4. For large values of M, the distribution of estimator of , denoted by ̂ is
approximately normal if the numbers Ri,Ri+m, Ri+2m, …, Ri+(M+1)m are uncorrelated
where
̂ [∑ ( ) ]
5. Find the standard deviation of the estimator as

√
̂
( )
6. Compute the test statistics as
̂
̂
Table A.2.
Example 6.17
0.30 0.48 0.36 0.01 0.54 0.34 0.96 0.06 0.61 0.85
0.48 0.86 0.14 0.86 0.89 0.37 0.49 0.60 0.04 0.83
0.42 0.83 0.37 0.21 0.90 0.89 0.91 0.79 0.57 0.99
0.95 0.27 0.41 0.81 0.96 0.31 0.09 0.06 0.23 0.77
0.73 0.47 0.13 0.55 0.11 0.75 0.36 0.25 0.23 0.72
0.60 0.84 0.70 0.30 0.26 0.38 0.05 0.19 0.73 0.44
Test whether the 2nd, 9th, 16th, … numbers in the sequence are autocorrelated, where α = 0.05.
Solution:
Step 1: Define the hypothesis for testing the independence as
Step 2: Here, the value of i = 2 (starting with second number) and lag m = 7 (every seven
numbers).
Step 3: Given that N = 60. Using i, m and N, estimate M which is the largest integer such that
i + (M+1)m ≤ N
i.e. 2 + (M + 1)7 ≤ 60
i.e. (M + 1)7 ≤ 58
i.e. (M + 1) ≤ 8.28
i.e. M ≤ 7.28
M = max {7, 6 , 5, ….}
Hence, M = 7
Step 4: The distribution of estimator
̂ [∑ ( ) ]
̂ [∑ ( ) ]
= , -
,( )( ) ( )( ) ( )( ) ( )( ) ( )( )
( )( ) ( )( ) ( )( )-
= 0.20 – 0.25
= -0.05
Step 5: The standard deviation of the estimate
√ √ ( )
̂ = 0.10
( ) ( )
Step 6: The test statistics

̂ ̂
-0.5
̂ ̂
Step 7: Determine the critical value and for the specified significance level α from
Table A.2.
Since α = 0.05, Zα/2= Z0.025 = 1.96
Step 8: –Zα/2 ≤ Z0 ≤ Zα/2 → -1.96 ≤ -0.5 ≤ 1.96

6.5.2.3. Gap test [D-11]
 The gap test is used to count the number of digits between successive occurrences of
the same digit.
 Here, we are interested in the frequency of the gaps. A gap of length x occurs between
the recurrences of some specified digit.
 The probability of the gap is determined by:

( ) ( )
 Every digit 0, 1, 2, …,9 must be analyzed to test the numbers are independent using
gap test.
 The observed frequencies of the various gap sizes for all the digits are recorded and
compared with the theoretical frequency using the Kolmogorov-Smirnov test.
 The CDF of theoretical frequency distribution based on the selected class interval
width is given by
( ) ( ) ∑( )
 The following algorithm is used to test independence on the basis of length of gaps
associated with every digit.
Algorithm:
2. Determine the number of gaps and length of each gap associated with each digit (0,
1,…,9).
3. Select the interval width based on the number of gaps and generate the frequency
distribution table for the sample of gaps and apply Kolmogorov-Smirnov test.
Gap Frequency Relative Cumulative CDF of |F(x) – SN(x)|
Length Frequency Relative Theoretical
Frequency Frequency
SN(x) Distribution
F(x)
4. Compute the test statistic D, which is the maximum deviation between F(x) and
SN(x).
| ( ) ( )|
5. Determine the critical value, Dα for the specified significance level α and the sample
size N from Table A.5.
6. If D ≤ Dα, H0 is accepted.
Example 6.18 [S-15]

Consider the following sequence of 120 digits. Test whether these digits can be assumed to
be independent based on the frequency with which gap occur. Use α = 0.05.
0 5 0 1 3 4 1 2 6 7 6 3 7 5 9 2 6 5 1 6
7 0 1 5 9 2 8 4 3 4 8 1 3 5 3 4 4 3 9 0
2 9 8 5 6 0 6 9 1 7 7 8 0 1 4 3 0 0 8 7
9 3 1 8 3 3 6 6 7 8 2 3 5 9 6 5 0 2 4 2
6 7 9 4 0 6 4 0 3 9 3 6 8 1 5 6 7 0 3 1
1 3 7 4 8 0 7 6 2 6 2 3 6 5 6 0 8 8 2 6
Solution:
Step 2: Given that: Number of digits = 120

Total number of gaps = Number of digits - Number of distinct digits
= 120 – 10
= 110
The number of gaps and length of each gap associated with each digit is:
Digit Length of Each Gap Number of Gaps

0 1, 18, 17, 5, 6, 3, 0, 18, 7, 2, 9, 7, 9 13
1 2, 11, 3, 8, 16, 4, 8, 30, 5, 0 10
2 7, 9, 14, 29, 6, 1, 28, 1, 7 9
3 6, 16, 3, 1, 2, 17, 5, 2, 0, 5, 16, 1, 7, 2, 9 15
4 21, 1, 5, 0, 17, 23, 4, 2, 16 9
5 11, 3, 5, 9, 9, 28, 2, 18, 18 9
6 1, 5, 2, 24, 1, 19, 0, 6, 5, 4, 5, 3, 11, 1, 2, 1, 4 17
7 2, 7, 28, 0, 8, 8, 12, 14, 5, 3 10
8 3, 11, 8, 6, 4, 5, 22, 11, 11, 0 10
9 9, 13, 2, 5, 12, 12, 8, 6 8
∑ = 110
Step 3: Select the interval width based on the number of gaps and generate the frequency
distribution table for the sample of gaps and apply Kolmogorov-Smirnov test.
Gap Frequency Relative Cumulative CDF of |F(x) – SN(x)|

Length Frequency Relative Theoretical
Frequency Frequency
SN(x) Distribution
F(x)=
1 – 0.9x+1
0–3 35 0.3182 0.3182 0.3439 0.0257
4–7 29 0.2636 0.5818 0.5695 0.0123
8 – 11 19 0.1727 0.7545 0.7176 0.0369
12 – 15 6 0.0545 0.8090 0.8147 0.0057
16 – 19 12 0.1091 0.9181 0.8784 0.0397
20 – 23 3 0.0273 0.9454 0.9202 0.0252
24 – 27 1 0.0091 0.9545 0.9497 0.0048
28 – 31 5 0.0455 1.0000 0.9657 0.0343
Step 4: The test statistics is D = max |F(x) – SN(x)| = 0.0397
Step 5: Determine the critical value, Dα for the specified significance level α and the sample
size N from Table A.5.
Since α = 0.05 and N = 110
√ √
Step 6: Since D = 0.0397 < = 0.1297, H0 is accepted.

i.e. the numbers are independent.
6.5.2.4. Poker test [D-13]
 The poker test for independence is based on the frequency with which certain digits
are repeated in a series of numbers.
 Here, we show only the three digit version for testing the independence property.
 In this case, the generated random numbers are rounded to three digits.
 In three-digit numbers there are only three possibilities, as follows:
1. The individual numbers can all be different.
2. The individual numbers can all be the same.
3. There can be one pair of like digits.
 The probability associated with each of these possibilities is given by the following:
P(three different digits) = P(second different from the first) x
P(third different from the first and second)
= (0.9)(0.8) = 0.72
P(three like digits) = P(second digit same as the first) x
P(third digit same as the first)
= (0.1)(0.1) = 0.01
P(exactly one pair) = 1 – [0.72 + 0.01] = 0.27
 The observed value is then compared with the expected value using the chi-square
test.
Algorithm:
2. Generate the frequency distribution table for the above three combinations and apply
chi-square test.
Combination, i Observed Frequency, Expected Frequency, ( )
Oi Ei
3 different digits, 1
3 like digits, 2
Exactly one pair, 3
3. Compute the sample test statistics
( )
∑
( )
class
class which is N/n.
4. Determine the critical value for the specified significance level α with (n-1) degrees of
5. If , accept H0; i.e., random numbers are independent.
Example 6.19
A sequence of 1000 three digit numbers has been generated and an analysis indicates that 290
have three different digits, 570 contain exactly one pair of like digits, and 140 contain exactly
three like digits. Based on the Poker test, check whether these numbers are independent. Use
α = 0.05.
Solution:
Step 2: Generate the frequency distribution table for the above three combinations and apply
chi-square test.
Oi Ei = PxN
3 different digits, 1 290 0.72 x 1000 = 720 256.80
3 like digits, 2 140 0.01 x 1000 = 10 1690.00
Exactly one pair, 3 570 0.27 x 1000 = 270 333.33
Step 3: Compute the sample test statistics

( )
∑
Step 4: Determine the critical value for the specified significance level α with (n-1) degrees
of freedom.
Since α = 0.05 and n – 1 = 3 – 1 = 2
Step 5: , reject H0; i.e., random numbers are not independent.
Example 6.20 [D-09,11,17]

Test the following random numbers for independence by Poker test.
{0.594, 0.928, 0.515, 0.055, 0.507, 0.351, 0.262, 0.797, 0.788, 0.442, 0.097, 0.798, 0.227,
0.127, 0.474, 0.825, 0.007, 0.182, 0.929, 0.852}
Use α = 0.05.
Solution:
Given N = 20
Let us assume to be three-digit numbers.
Hence, the numbers are
{594, 928, 515, 055, 507, 351, 262, 797, 788, 442, 097, 798, 227, 127, 474, 825, 007, 182,
929, 852}
Step 2: Generate the frequency distribution table for the above three combinations and apply
chi-square test.
Oi Ei = PxN
3 different digits, 1 10 0.72 x 20 = 14.4 1.34
3 like digits, 2 0 0.01 x 20 = 0.2 3.45
Exactly one pair, 3 10 10 0.27 x 20 = 5.4 5.6
Step 3: Compute the sample test statistics

( )
∑
Step 4: Determine the critical value for the specified significance level α with (n-1) degrees
of freedom from Table A.4.
Since α = 0.05 and n – 1 = 2 – 1 = 1
Step 5: , reject H0; i.e., random numbers are not independent.
■■■
CHAPTER 7
RANDOM VARIATE GENERATION
7.1 Introduction
 In this chapter it is assumed that a distribution has been completely specified, and
ways are sought to generate sample from this distribution to be used as input to a
simulation model.
 Here, we explain and illustrate some widely used techniques for generating random
variates.
 All the techniques in this chapter assume that a source of uniform (0, 1) random
numbers R1, R2, …is readily available, where each Ri has pdf
( ) 2
and cdf
( ) {
Difference between Random Number and Random Variate [M-12,14,16 D-11,13,14]
Random Number Random Variate

Definition Random numbers are numbers that A random variate is a variable
occur in a sequence such that two generated from uniformly distributed
conditions are met: (1) the values pseudorandom numbers. Depending on
are uniformly distributed over a how they are generated, a random
defined interval or set, and (2) it is variate can be uniformly or non-
impossible to predict future values uniformly distributed.
based on past or present ones.
Generation 1. Linear Congruential Method 1. Inverse-Transform Technique
Techniques 2. Multiplicative Congruential 2. Convolution Method
Method 3. Acceptance-Rejection
3. Combined Linear Technique
Congruential Method
Use Random numbers are important in Random variates are frequently used
statistical analysis and probability as the input to simulation models.
theory.
7.2 Inverse Transform Technique [M-06,11,12,16,17, D-07,08,09]

 The inverse transform technique can be used to sample from the exponential, the
uniform, the Weibull, and the triangular distributions and empirical distributions.
 It is the underlying principle for sampling from a wide variety of discrete
distributions.
 It is the most straightforward, but not always the most efficient, technique
computationally.
Algorithm:
1. Get the CDF of the desired random variable X.

2. Generate R U(0,1) and set F(X) = R on the range of X.
3. Calculate F-1 by solving the equation F(X) = R in terms of R which returns X= F-1(R).
4. Generate uniform random numbers R1, R2, … from [0,1) and use Xi = F-1(Ri) to
generate.
7.2.1 Exponential Distribution [M-07,14, D-06,12,14]

( )
2. Set F(X) = R on the range of X.
3. Solve the equation F(X) = R.
Taking natural log on both sides

( )
( )
4. Generate uniform random numbers R1, R2, … from [0,1) and calculate desired
random variate by:
( )
Example 7.1
The time to attend a breakdown call is found to follow exponential distribution with mean of
0.5. Generate two exponential random variates representing the time to attend. Use R1 = 0.54
and R2 = 0.72.
Solution:
The exponential random variate is
( )
Given λ = 0.5
( ) ( ) ( )
Now, X1 = - 2 ln (1 – R1)
= - 2 ln (1 – 0.54)
= 1.55
X2 = - 2 ln (1 – R2)
= - 2 ln (1 – 0.72)
= 2.54
Example 7.2 [D-07,08,09]

Develop a random variate generator for random variable X with pdf
( ) {
Solution:
For For
CDF : CDF :
( ) ∫ ( ) ( ) ∫ ( )
∫ ∫
0 1 = 0 1 =
Set F(X) = R Set F(X) = R
Taking ln on both sides Taking ln on both sides

2x = ln(2R) -2x = ln(1-2R)
( ) ( )
7.2.2 Uniform Distribution
( ) {
( )
( )
4. Generate uniform random numbers R1, R2, … from [0,1) and calculate desired
random variate by
( )
Example 7.3
The time required to travel from station to college is uniformly distributed over the interval
20 to 25 minutes. Generate two random travel times from this distribution. Use R1= 0.2753
and R2 = 0.6418.
Solution:
The uniform random variate is
( )
Given a = 20 and b = 25
( )
Now, X1 = 20 + 5 R1
= 20 + 5(0.2753)
= 21.3765
X2 = 20 + 5 R2
= 20 + 5(0.6418)
= 23.509
7.2.3 Weibull Distribution
( ) , . /
2. Set F(X) = R.
. /

. /
. /
Taking natural log on both sides,
. / ( )
Taking β root on both sides,
, ( )-
, ( )-
4. Generate uniform random numbers R1, R2,… from [0,1) and calculate desired
random variate by
, ( )-
Example 7.4 [S-15, D-16]

The time to failure of a nickel-cadmium battery is Weibull distributed with parameters β =
0.75, α = 8 and υ = 0. Generate two random failure time of battery from this distribution. Use
R1 = 0.5249 and R2 = 0.612
Solution:
The Weibull distribution is , ( )-

Given β = 0.75, and α = 8
( ), ( )-
Now, ( ), ( )-
1.33
= (8)[-ln(1 – 0.5249)]
= 5.4
( ), ( )-
= (8)[-ln(1 – 0.612)]1.33
= 7.4385
7.2.4 Triangular Distribution
1. Consider a random variable X that has pdf
( ) {
Figure 7.1 Density function for triangular distribution
2. This distribution is called a triangular distribution with endpoints (0, 2) and mode at 1.
3. Its CDF is given by
( )
( )
{
4. Set F(X) = R.
√
( )
( ) ( )
√ ( )
√ ( )
√
{
√ ( )
Example 7.5
Develop a generator for a triangular distribution with range (1, 10) and mode at x = 4. Also
generate two random variates for R1= 0.2584 and R2 = 0.6591.
Solution:
Given a triangular distribution with a = 1, b = 4 and c = 10.
f(x)
0 1 4 10 x
Total area of triangle = 1
i.e. h = 2/9
Step 1: Find CDF F(x) = total area from 1 to x.

( )
For 1 ≤ x ≤ 4, by similar triangles so
( ) ( ) ( )
( )
( )
For 4 < x ≤ 10, by similar triangles so
( ) ( ) ( )
( )
Step 2: Set F(X) = R on 1 ≤ X ≤ 10

Step 3: Solve for X.
For 1 ≤ x ≤ 4 For 4 < x ≤ 10

F(X) = R F(X) = R
( ) ( )
( ) ( )
√
√ , 0 ≤ R ≤ 9/27 ( ) ( )
√ ( )
√ ( ), 9/27 < R ≤ 1
7.2.5 Empirical Continuous Distributions

 It is used when the modeler is unable to find a theoretical distribution that provides a
good model for the input data.
 One possibility is to simply resample the observed data itself. This is known as using
the empirical distribution; and it makes particularly good sense when the input
process is known to take on a finite number of values.
 If the data are drawn from continuous distribution, then interpolate between the
observed data points to fill in the gaps.
 In case of original data as well as grouped data, we use table look-up method for
generating random variate from empirical continuous distribution.
 A frequency distribution table is generated showing the interval and its probability,
cumulative probability and slope.
 In case of continuous data, since all values between tabulated values are possible, so
an interpolation is required between the tabulated values.
Algorithm
I. Original Data
1. Sort the data points increasing order.
2. Generate a frequency distribution table and assign a probability to each interval as
1/n where n is the number of observations.
i Interval Probability Cumulative Probability Slope
1/n i/n ai
1
2
:
:
n
3. Compute the slope „ai‟ of the ith line segment.
( )
4. Get CDF of the desired random variable X

( )
( )
{
( )
6. Compute the inverse of empirical CDF by solving the equation F(X) = R.
( )
( )
( )( *
( *
7. Generate uniform random numbers R1, R2,… and compute desired random
variate by
. /, if
Example 7.6
Five data points of computer service center response time to incoming phone call have been
collected and are used in simulation to investigate and improve the service quality for
customer. The response times are 2.76, 1.83, 0.80, 1.45, and 1.24. Set up a table for
generating response time by the table look-up method and generate two values of response
times using uniform random numbers R1 = 0.71 and R2 = 0.83.
Solution:
1. Sort the data points in increasing order and let it be X1 ≤ X2 ≤…≤ X3.
0.80 1.24 1.45 1.83 2.76
Since the smallest possible value is believed to be 0, x0 = 0.
2. Generate a frequency distribution table and assign a probability to each interval.
Given that n = 5 observations.
Hence, 1/5 = 0.2 is the probability of each interval.
i Interval Probability Cumulative Probability Slope
1/n i/n ai
1 0 < X ≤ 0.80 0.2 0.2 4.00
2 0.80 < X ≤ 1.24 0.2 0.4 2.20
3 1.24 < X ≤ 1.45 0.2 0.6 1.05
4 1.45 < X ≤ 1.83 0.2 0.8 1.90
5 1.83 < X ≤ 2.76 0.2 1.0 4.65
3. Compute the slope „ai‟ of the ith line segment.

( ) ( )
( ) ( )
( ) ( )
( ) ( )
( ) ( )
4. The empirical random variate of original data is
. /, if
5. Given that, R1 = 0.71 and R2 = 0.83.
R1 = 0.71 is lying between 3/5 and 4/5.
Therefore, i = 4.
( * ( )
Now, R2 = 0.83 is lying between 4/5 and 5/5.

Therefore, i = 5.
( * ( )
II. For Large Sample Size or Grouped Data

1. Summarize the data into a frequency distribution with a much smaller number of
intervals and fit the continuous empirical CDF to the frequency distribution.
i Interval Frequency Relative Cumulative Frequency Slope
Frequency Ci ai
1
2
:
:
n
where Relative Frequency = Individual Frequency / Total Frequency
2. Compute the slope ai for the ith line segment.
where Ci is the cumulative probability of the first i intervals of the frequency

distribution and Xi-1 < X ≤ Xi is the ith interval.
3. The inverse CDF is calculated as
( ) ( )
Example 7.7
Data have been collected on service times at a drive-in bank window at the Canara Bank.
This data are summarized into intervals as follows:
Interval (seconds) Frequency
15 – 30 10
30 – 45 20
45 – 60 25
60 – 90 35
90 – 120 30
120 – 180 20
180 - 300 10
Set up a table for generating service time by the table look-up method and generate two
values of service times using uniform random numbers 0.3561 and 0.5459.
Solution:
1. Summarize the data into frequency distribution table.
i Interval Frequency Relative Cumulative Slope
Frequency Frequency |ai|
Ci
1 15 < X 10 0.067 0.067 223.88
2 30 < X 20 0.133 0.200 112.78
3 45 < X 25 0.167 0.367 89.82
4 60 < X 35 0.233 0.600 128.76
5 90 < X 30 0.2 0.800 150
6 120 < X 20 0.133 0.933 451.13
7 180 < X 0 10 0.067 1.000 1791.04
∑ = 150
2. Now compute the slope ai.
3. The empirical random variate of grouped data is

( ) ( )
4. Given R1 = 0.3561 and R2 = 0.5459.
Now, R1 = 0.3561 which lies between C2 = 0.2 and C3 = 0.367.
Hence, i =3
( ) ( )
( )
Now, R2 = 0.5459 which lies between C3 = 0.367 and C4 = 0.600.

Hence, i =4
( ) ( )
( )
7.2.6 Empirical Discrete Distributions

 Consider an empirical distribution having probability mass function p(0), p(1),… on
the non-negative integers.
Algorithm
1. Generate uniform random numbers R1, R2,…
i.e. ( )
2. Return the desired random variate X = i if it satisfies
∑ () ∑ () ( ) ( )
3. Note that the algorithm will never return a value X = i for p(i) = 0, because the strict
inequality between the two summations in step (2) is impossible.
4. Step (2) requires a search, which is time consuming.
7.2.6.1 Discrete Uniform Distribution
 Consider the discrete uniform distribution on {1, 2,…,k} with pmf and CDF given by
( ) and
( )
{
 Let xi = i and ri = p(1) + p(2) + ….+ p(xi)
= F(xi) = i/k, i = 1, 2, …, k
 Then from inequality ( ) ( ), it can be seen that, if the
generated random number satisfies
then X is generated by setting X = i.
 Let ⌈ ⌉ denote the smallest integer ≥ y.

For example, ⌈ ⌉ and ⌈ ⌉
 For y ≥ 0, ⌈ ⌉
 This notation and inequality yield a formula for generating X,

namely
⌈ ⌉
Example 7.8
Consider the discrete uniform distribution on {1, 2, 3, …, 10} with CDF
( )
{
Generate two random values of X using two random numbers 0.78 and 0.23.
Solution:
⌈ ⌉
Here k=10
⌈ ⌉ ⌈( )( )⌉ ⌈ ⌉
⌈ ⌉ ⌈( )( )⌉ ⌈ ⌉
Example 7.9
The CDF of a discrete random variable X is given by
( )( )
( )
( )( )
When n = 4, generate three values of X, using R1 = 0.83, R2 = 0.24, and R3 = 0.57.
Solution:
For given random number R, X will take the value in Rx = {1, 2, 3, 4} provided that
( ) ( )
( )( )
But, ( )
( )( )
Put n = 4, we have
( )( ) ( )( )
( )
( )( )
We know that,
( ) ( )
( )( )( ) ( )( )
As x = 1, 2, 3, 4
F(1) = 0.033 F(2) = 0.167
F(3) = 0.467 F(4) = 1
Now, X can be generated by table look-up method below:
i Xi F(Xi)
1 1 0.033
2 2 0.167
3 3 0.467
4 4 1.000
The discrete random variate is
X = i if ( ) ( )
Given that R1 = 0.83, R2 = 0.24, and R3 = 0.57

Now, [F(x3) = 0.467] < [R1 = 0.83] [F(x4) = 1.000]
Therefore, i = 4
Hence, X1 = 4
Similarly, [F(x2) = 0.167] < [R2 = 0.24] [F(x3) = 0.467]
Therefore, i = 3
Hence, X2 = 3
Now, [F(x3) = 0.467] < [R1 = 0.54] [F(x4)=1.000]
Therefore, i = 4
Hence, X3 = 4
Example 7.10 [S-15]

Design a generator for the discrete distribution whose pmf is given below:
( )
( )
Generate the random variate for R1 = 0.3456, and R2 = 0.8912
Solution:
For integer values of x in the range {1, 2, …., k}, the CDF is given by
( ) ( )
( ) ∑ ∑
( ) ( ) ( ) ( )
We know that,
( ) ( )
( ) ( )
( ) ( )
( ) ( ) ( )
To solve this inequality for x in terms of R, first find a value of x that satisfies
( ) ( )
( )
Then by rounding up, the solution is ⌈ ⌉.
√
By quadratic formula, namely
with a = 1, b = -1, and c = -k(k+1)R, the solution to the quadratic equation is

√ ( )
We will use only positive root.

√ ( )
So X is generated by ⌈ ⌉
Given R1 = 0.3456, and R2 = 0.8912
Let k = 2
√ ( ) √ ( )
⌈ ⌉ ⌈ ⌉
√ ( ) √ ( )
⌈ ⌉ ⌈ ⌉
7.2.6.2 Geometric Distribution [D-10]
 Consider the geometric distribution with pmf

( ) ( )
 Its CDF is given by
, ( ) -
( ) ∑ ( ) ( )
( )
 Using the inverse transform technique, we have that a geometric random variable X
will assume the value x whenever
( ) ( ) ( ) ( )
( ) ( )
( ) ( )
Taking natural log, we have
( ) ( ) ( ) ( )
But (1 – p) < 1 implies that ln(1 – p) < 0, so that
( ) ( )
( ) ( )
( )
⌈ ( )
⌉
 Since p is a fixed parameter, let
( )
⌈ ( ) ⌉
Here, - β ln(1 – R) is an exponentially distributed random variable with mean β.
 Occasionally, a geometric variate X is needed which can assume values {q, q+1,…}
with pmf ( ) ( )
 In that case, the desired random variate is
( )
⌈ ⌉
( )
 One of the most common cases is q = 1.
Example 7.11
In an International Conference held by Delhi University, the reviewer reviews the research
papers which is geometric distributed on the range {X ≥ 1} with mean of 3 papers per day.
Generate two values of X using random numbers 0.932 and 0.105.
Solution:
Given that the geometric distribution is defined on the range {X ≥ 1}.
( ) ( ) for x = 1,2,… with mean 1/p.
The geometric random variate is
( )
⌈ ⌉
( )
Given that mean = 1/p = 3
Also R1 = 0.932 and R2 = 0.105

( ) ( )
⌈ ⌉ ⌈ ⌉ ⌈ ⌉
( ) ( )
( ) ( )
⌈ ⌉ ⌈ ⌉ ⌈ ⌉
( ) ( )
7.3 Direct Transformation for Normal and Lognormal Distribution

[D-15,17]
 A commonly used method for generating standard normal variates was developed by
Box and Muller.
 Given standard normal variable ( ), we can obtain ( ) by setting
 Consider two standard normal random variables Z1 and Z2, plotted as a point in the
plane as shown below:
Figure 7.2 Polar representation of a pair of standard normal variables
 Here, Z1 = B cosθ and Z2 = B sinθ ……………..(1)

whereas
 Therefore, B2 has the chi-square distribution with 2 degrees of freedom, which is
equivalent to an exponential distribution with mean 2.
 Now, B can be generated using inverse transform technique for exponential
distribution.

i.e. √ ……………. (2)
 The angle θ is uniformly distributed on (0,2π) and is independent on B.
 Combining equations (1) and (2) gives a direct method for generating two
independent standard normal variates Z1 and Z2 from two independent random
numbers R1 and R2.
√ ( ) ( )
√ ( ) ( )
 Obtain normal variate Xi with mean µ and variance σ2, using the following
transformation
 Finally, obtain the lognormal variate by using the following direct transformation
Example 7.12 [D-15]

Regular maintenance of a production routine has been found to vary and has been modeled as
a normally distributed random variable with mean 10 minutes and variance 4 minutes2.
Generate two random maintenance times with the given distribution. Use R 1 = 0.1758 and R2
= 0.1489.
Solution:
1. Two standard normal random variates are generated as follows:
√ ( ) ( )
√ ( ) ( )
Given R1 = 0.1758 and R2 = 0.1489.
√ ( ) ( )
√ ( ) ( )
2. The normal variate is
where µ = 10 and σ2 = 4 i.e.
( )
( )
7.4 Convolution Method [M-09]

 The probability distribution of a sum of two or more independent random variables is
called a convolution of the distributions of the original variables.
 The convolution method thus refers to adding together two or more random variables
to obtain a new random variable with the desired distribution.
 This technique can be applied to obtain
1. Erlang Variates
2. Binomial Variates
7.4.1 Erlang Distribution [M-14, D-06,10]
 We know that an Erlang random variable X with parameters k and θ is the sum of k
independent exponential random variables, Xi (i = 1, 2,…, k) each having mean 1/kθ.
i.e. ∑
 As each is exponentially distributed with mean 1/λ, so it can be generated using
inverse transform technique
( ) where is uniform random number.
∑ ( )
But
∑ ( ) (∏ +
Example 7.13
In Mc Donald‟s restaurant, the consumption of bread is approximated by Erlang distribution
with parameters k = 2 and θ = 5. Generate the value of consumption from this distribution.
Use R1 = 0.937 and R2 = 0.217.
Solution:
The Erlang random variate is
(∏ +
Given that k = 2 and θ = 5.
R1 = 0.937 and R2 = 0.217
(∏ + ,( )( )- ( )
7.4.2 Binomial Distribution [M-09]
 We know that, the binomial random variate X can be represented as the number of
successes in ‘n’ independent Bernoulli trials, each success having probability p.
 Thus, ∑
where p(Xi = 1) = p
and p(Xi = 0) = 1 – p
Example 7.14
The number appearing on the up-face of the tossed die is modeled by binomial distribution.
The value of x is 1, 2, 3, 4, 5, or 6 with probability 0.10, 0.20, 0.30, 0.25, 0.10, 0.05. Set up a
table for generating appearances of number using table look-up method. Use R1= 0.40 and R2
= 0.60
Solution:
1. Generate a frequency distribution table for appearances of number when a die is

tossed.
i Number appearing Probability Cumulative
On up face of die f(xi) Probability
xi F(Xi)
1 1 0.10 0.10
2 2 0.20 0.30
3 3 0.30 0.60
4 4 0.25 0.85
5 5 0.10 0.95
6 6 0.05 1.00
2. The binomial random variate is X = i if
( ) ( )
3. Given that R1= 0.35 and R2 = 0.65
Now, F(X2) = 0.30 < R1 = 0.35 ≤ F(X3) = 0.60
Hence, i = 2
Therefore, X1 = 2
And, F(X3) = 0.60 < R2 = 0.65 ≤ F(X4) = 0.85
Hence, i = 4
Therefore, X2 = 4
7.5 Acceptance –Rejection Technique [M-08,10,12,13,15, D-08,09]

 It is used when the direct methods go wrong or are inefficient, or no straightforward
solution exists.
 The flowchart for acceptance rejection technique is given below:
Generate R with desired
characteristics
No
Does R Reject R, try again
meet
Yes
Yes
Accept R Finished? Stop
No
Algorithm
1. Generate a random number R having density function f(x).

2. Test a condition based on R and some expression derived for the desired distribution.
3. If the condition is satisfied then accept X computed from a formula. Else, reject, and
go to step 1 and try again.
7.5.1 Poisson Distribution
 A Poisson random variable, X, with λ > 0 has pmf

( ) ( ) , where x = 0, 1, 2…
 Here, X is the number of arrivals from Poisson process in one unit of time.
 The inter-arrival times A1, A2,… of successive customers are exponentially
distributed with rate λ.
 There is a relationship between the discrete Poisson distribution and the continuous
exponential distribution, namely
X = x ……………………………(1)
if and only if
……………….(2)
 Equation (1), X = x says that there are exactly x arrivals during one unit of time.
 Equation (2) says that the xth arrival occurred before time 1 while the (x+1)th arrival
occurred after time 1.
 Clearly, these two statements are equivalent.
 Now, generate exponential interarrival times until some arrival, say x+1, occurs after
time 1; say X = x.
 As interarrivals (Ai) are from exponential distribution, which can be generated using
inverse transform technique by:
∑ ∑
 Next multiply by - λ, which reverses the sign of the inequality, and use the fact that a
sum of logarithm is the logarithm of product.
∏ ∑ ∑ ∏
 Finally, use the relation to obtain
∏ ∏
 The procedure to generate a Poisson random variate, X, is given by the following

steps:
1. Set x = 0 and P = 1.
2. Generate a random number ( ) and replace P by P.(Rx+1).
-λ
3. If P < e , then accept X = x.
4. Else, reject x and increment x by 1 and return to step 2.
Example 7.15
The number of customers arriving at Café Coffee Day is Poisson distributed with mean 4.
Generate Poisson variate. Use random numbers 0.5389, 0.0532, 0.3492 in sequence.
Solution:
Iteration 1:
1. Set x= 0 and P = 1.
2. Given ( )
Compute P.R1 = 1(0.5389) = 0.5389
Set P = P.R1 = 0.5389
3. Check P < e-λ
i.e. 0.5389 < e-4
i.e. 0.5389 < 0.0183 is false.
Reject x and set x = x + 1 i.e. 0 + 1 = 1 and go to step 2.
Iteration 2:
1. Given ( )
Compute P.R2 = (0.5389)(0.0532) = 0.0286
Set P = P.R2 = 0.0286
2. Check P < e-λ
i.e. 0.0286 < e-4
i.e. 0.0286 < 0.0183 is false.
Reject x and set x = x + 1 i.e. 1 + 1 = 2 and go to step 2.
Iteration 3:
1. Given ( )
Compute P.R3 = (0.0286)(0.3492) = 0.0099
Set P = P.R3 = 0.0099
2. Check P < e-λ
i.e. 0.0099 < e-4
i.e. 0.0099 < 0.0183 is true.
Accept x.
Poisson random variate X = x = 2.
■■■
UNIT: IV
ANALYSIS OF SIMULATION DATA
CHAPTER 8 INPUT MODELING
CHAPTER 9 VERIFICATION, CALIBRATION AND VALIDATION

OF SIMULATION MODELS
CHAPTER 10 ESTIMATION OF ABSOLUTE PERFORMANCE

CHAPTER 8
INPUT MODELING
 Input data provide the driving force for a simulation model.
 In queuing system the distribution of time between arrivals and service times are the
input data.
 The distributions of demand and lead time are the input data for inventory system.
 There are four steps in development of a useful model of input data
1. Data Collection
2. Identifying the distribution with data
3. Parameter Estimation
4. Goodness of fit tests
8.1 Steps In Input Modeling [M-05,06,08,09,11,12 D-05,08,09,11,13]
8.1.1 Data Collection [M-16]
 Data collection is one of the biggest tasks in solving a real problem.

 Collection of data requires a substantial time and resource commitment.
 The collection of data is from the real system of interest.
 In some situation it is not possible to collect the data.
 When data are not available, expert opinion and knowledge of the process must be
used to make educated guess.
 “GIGO or garbage - in, garbage - out” is the basic concept in computers. In case input
data are not accurately collected and analyzed then simulation output data will result
in misleading and possibly damage with increase in cost factor.
 To enhance and facilitate data collection, the following suggestions are required.
1. A useful expenditure of time is in planning.
 Data collection should start before observing the process. Devise the forms
for this purpose. These forms have to be modified several times before
actual data collection begins.
 Check for unusual circumstances and consider how they will be handled.
 If data is already present then plenty of time is required for converting the
data into a usable format.
2. Try to analyze the data that are collected i.e. whether the data collected is
adequate for simulation. Check the data that are useless for the system.
3. Try to combine homogeneous data sets; an initial test is to see if means of
distribution are same.
4. Be aware of possibility of data censoring. Censoring is whether the part or whole
data is accepted. Censoring results in long process time, so collect only relevant
data.
5. Build a scatter diagram to determine the relationship between two variables.
Scatter diagram is a dotted graph with one variable on x-axis and one variable on
y-axis.
6. Observe the sequence of inputs for autocorrelation. If service time of ith customer
affects the „i + n‟ customer then there is an autocorrelation.
7. Keep in mind the difference between the input data from output or performance
data. Make sure to collect input data which represents the uncertain quantities that
are largely beyond the control of system and will not be altered by changes made
to improve the system.
8.1.2 Identifying the distribution with data

 When the data is available, this step begins by developing a frequency distribution or
histogram, of the data.
 Based on the frequency distribution and structural knowledge of the process, a family
of distribution is chosen.
8.1.2.1 Histogram
 It is useful in identifying the shape of a distribution. The procedure for construction is

as follows.
1. Divide the range of the data into intervals of equal width (unequal width may be
used).
2. Label the horizontal axis to conform to the intervals selected.
3. Determine the frequency of occurrences within each interval.
4. Label the vertical axis. So that total occurrences can be plotted for each interval.
5. Plot the frequencies on the vertical axis.
 The number of class intervals depends on number of observations and dispersion in

the data.
 If intervals are too wide, histogram will be coarse or blocky.
 If intervals are too narrow, the histogram will be ragged.
 The figure 8.1 shows the different forms of histogram.
(a) Original data – too ragged
(b) Combining adjacent cells – too coarse

(c) Combining adjacent cells - appropriate

Figure 8.1 Ragged, coarse, and appropriate histograms
 Histogram for continuous data corresponds to probability density function (pdf) of a

theoretical distribution; if continuous, a line is drawn through the center point of each
class interval frequency.
 Histograms for discrete data corresponds to probability mass function (pmf) of

distribution, it has large number of data points so it should have a cell for each value
in the range of data. If there are few points then combine adjacent cells to eliminate
ragged appearance of histogram.
Example 8.1 (Discrete data)

The number of vehicles arriving at the corner of the road from 8 am to 8:05 am was
monitored for five working days over 20 week period. The table 9.1 shows the resulting data
with first entry – 12 periods each of 5 minutes during which zero vehicles arrived, 10 periods
during which one vehicle arrived and so.
Arrivals per Arrivals per

Frequency Frequency
period period
0 12 6 7
1 10 7 5
2 19 8 5
3 17 9 3
4 10 10 3
5 8 11 1
Table 8.1 Number of arrivals in a 5 minute period
The number of vehicles is a discrete variable; since there are sample data the histogram can
have a cell for each possible value in the range of data. The resulting histogram is shown in
figure 8.2
Figure 8.2 Histogram of number of arrivals per period
8.1.2.2 Selecting the Family of Distributions [D-07]
 The purpose of preparing a histogram is to infer a known pdf or pmf.

 A family of distributions is selected, on the basis of content being investigated with
the shape of the histogram.
 The exponential, normal & Poisson distribution are frequently used, easy to analyze;
whereas gamma & weibull distribution provide a wide array of shapes, difficult to
analyze.
Some of the examples to select the distributions:
 Binomial - Models the number of successes in n independent trials with probability p.

Example: Number of defective chips found in n chips.
 Negative binomial -Models the number of trials required to achieve „k‟ successes.
Example: Number of chips that must be inspected to find 4 defective chips.
 Poisson - Models number of independent events that occur in fixed amount of time.
Example: Number of customers arriving to a restaurant during 1 hour.
 Normal - Models the process as sum of number of component processes
Example: Time to assemble a product is sum of times required for each assembly
operation.
 Lognormal -Models the distribution of process as product of number of component
processes
Example: Rate of return on an investment.
 Exponential -Models the time between independent events or process time which is
memory less.
Example: Times between arrivals of large number of customers.
 Gamma - Models nonnegative random variables, the gamma can be shifted away
from 0 by adding a constant.
 Weibull - Models the time to failure for components
Example: Time to failure for a disk drive. Exponential is the special case of weibull
 Discrete or continuous uniform - Models complete uncertainty, since all outcomes

are equally likely. This distribution is often used when there are no data.
 Empirical -It is used when no theoretical distributions are appropriate. Resample
from the actual data collected.
8.1.2.3 Quantile - Quantile plots [D-07]
 A quantile-quantile (q-q) plot is a useful tool for evaluating distribution fit.

 If X is a random variable with cdf F then q–quantile of X is that value γ such that
F(γ) = P ( X ≤ γ) = q , 0 < q <1

γ = F-1(q)
 Let
{xi, i = 1, 2, …n} → Sample of data from X
{yj j = 1, 2, …n} → Samples arranged in ascending order, where y1 ≤ y2 ≤…≤yn.
j → ranking or order number
j =1 for smallest and j = n for largest.
( )
 The q –q plot is based on the fact that yj is an estimate of quantile of x, i.e.
( )
yj is approximately F-1* +.
( )
 If F is a member of an appropriate family of distributions, then plot yj Vs F-1* +
will be approximately a straight line.
 If F is a member of an appropriate family of distributions and has appropriate
parameter values, then the line will have slope 1.
 If the assumed distribution is not appropriate then points will deviate from a straight
line.
 In evaluation of linearity of q–q plot, the following should be considered:
1. The observed values will never fall exactly on a straight line
2. The observed values are not independent, since they are ranked. Hence, if one
point is above a straight line, it is likely that the next point will also lie above the
line. And it is unlikely that the points will be scattered about the line.
3. The variances of extremes (largest & smallest values) are higher than variance in
the middle of the plot. Greater discrepancies can be accepted at extremes. The
linearity of points in the middle of plot is more important than linearity at the
extremes.
 Modern data analysis software often includes tools for generating q-q plots, especially
for the normal distribution.
 The q-q plot can also be used to compare two samples of data to see if they can be
represented by the same distribution (that is, they are homogenous).
 If x1, x2, x3, …., xn are a sample of random variable X, and z1, z2, z3, …., zn are a
sample of random variable Z, then plotting the ordered values of X versus the ordered
values of Z will reveal approximately a straight line if both samples are well
represented by the same distribution.
Figure 8.3 Histogram and q-q plot of radon
8.1.3 Parameter Estimation
 The parameters are chosen to determine a specific instance of distribution family.

 When the data are available, these parameters can be estimated.
 Some estimators are described in this section.
8.1.3.1 Preliminary Statistics: Sample Mean and Sample Variance
 The sample mean and sample variance are used to estimate the parameters of a
hypothesized distribution.
 The observations in a sample of size n are X1, X2, … Xn.
1. If data are discrete or continuous raw data then sample mean and sample variance
is defined by
∑ ∑
2. If data are discrete and grouped in a frequency distribution then mean and
variance is given by
∑ ∑
Where fj → Observed frequency of value Xj of X

k→ Number of distinct values of X
3. If data are discrete or continuous and have been placed in class interval, then
mean and variance is
∑ ∑
Where fj→ Observed frequency,

mj→ Midpoint of jth interval and
c→ Number of class intervals
8.1.3.2 Suggested Estimators
 Numerical estimates of the distribution parameters are required to reduce the family
of distributions to a specific distribution and to test the resulting hypothesis.
 The table 8.2 contains suggested estimators for distribution often used in simulation.
Table 8.2 Suggested estimators for distributions often used in simulation

Distribution Parameter(s) Suggested Estimators(s)
Poisson α ̂ ̅
Exponential λ ̂
̅
̂
Gamma β, θ
̂
̅
2 ̂ ̅
Normal μ,σ
̂ ( )
̂ ̅
Lognormal μ,σ2 ̂
Note - The parameter is denoted by α and estimator is denoted by ̂
Example 8.2
The percentage rates of return on 10 investments in a portfolio are 18 .8, 27.9, 21.0, 6.1, 37.4,
5.0, 22.9, 1.0, 3.1 and 8.3.Estimate the parameter of a lognormal model of this data.
Solution:
Natural log of the given data is 2.9, 3.3, 3.0, 1.8, 3.6, 1.6, 3.1, 0, 1.1 and 2.1 respectively.
∑
̅ ̅
∑ ̅ ( )
̂
8.1.4 Goodness-of-Fit tests [M-09,15, D-05,09]
 Goodness-of-fit test provide helpful guidance for evaluating the suitability of a

potential input model.
 If single distribution is selected, then other distributions are called candidate
distributions.
 It is especially important to understand the effect of sample size.
 If very little data are available, then goodness-of-fit test is unlikely to reject any
candidate distributions.
 If very large data are available, then goodness-of-fit test is likely to reject all
candidate distributions.
8.1.4.1 Chi-square Test

 This test is valid for large sample sizes (n ≥ 50) and used for both discrete and
continuous distributional assumptions.
 It formalizes the spontaneous idea of comparing the histogram of data to the shape of
candidate density or mass function.
 The test procedure beings by arranging the n observations into a set of k class
intervals or cells.
 The test statistic is given by
∑ ( )
 The expected frequency Ei for each class intervals is computed as Ei = npi

where pi is the theoretical, hypothesized probability associated with ith class interval.
 approximately follows chi-square distribution with k-s-1 degrees of freedom.

where, s is the number of parameters of hypothesized distribution estimated by
sample statistics.
 The hypotheses are
H0: Random variable X, conforms to distribution assumption with parameter(s) given
by estimate(s).
H1: Random variable X does not conform
 The critical value is found in table A.4. Null hypothesis H0, is accepted if
<
 If an Ei value is too small (i.e. < 5), it can be combined with expected frequencies in
adjacent class intervals. The corresponding observed frequency Oi values should be
also combined and k should be reduced by one for each cell combined.
 If the distribution being tested is discrete, each value of random variable should be a
class interval, unless it is required to combine adjacent class intervals to meet
minimum expected cell-frequency requirements.
 For discrete, if combining adjacent cells is not required, then
pi = p(xi) = P(X = xi)
otherwise
pi is sum of probabilities of appropriate adjacent cells.
 If the distribution being tested is continuous, the class intervals are given by [ai – 1, ai),
Where ai – 1 and ai are endpoints of ith class interval
 For continuous with assumed pdf f(x) or assumed cdf F(x), pi is
∫ ( ) ( ) ( )
Note
1. For discrete - Number of class intervals is determined by number of cells resulting
after combining adjacent cells as necessary.
2. For continuous – Number of class intervals must be specified.
The table 8.3 helps in determining the number of class intervals for continuous data.
Sample size Number of Class Interval

(n) (k)
20 Do not use the chi-square test
50 5 to 10
100 10 to 20
>100 √n to n/5
Table 8.3 Recommendations for number of class intervals for continuous data
Example 8.3
The number of vehicles arriving at the toll booth in a 5-minute period between 7:00 A.M.
and 7:05 A.M. was monitored for 100 days. The table below shows the resulting data.
Arrivals per Period Frequency
0 12
1 10
2 19
3 17
4 10
5 8
6 7
7 5
8 5
9 3
10 3
11 1
Use the Chi-square test to test the hypothesis that these arrivals are Poisson distributed.
Use the level of significance α = 0.05 and
Solution:
Let H0 indicate that the given data is Poisson distributed and

H1 indicate that the given data is not Poisson distributed.
Given, n = 100, k=12
For Poisson distribution we have an estimator ̂
∑ ∑
̂
( ) ( ) ( ) ( )
( ) {
P(0) = 0.026 P(3) = 0.211 P(6) = 0.085 P(9) = 0.008

P(1) = 0.096 P(4) = 0.192 P(7) = 0.044 P(10) = 0.003
P(2) = 0.174 P(5) = 0.140 P(8) = 0.020 P(11) = 0.001
With the above information, the below table is constructed:
Xi Observed Frequency, Oi Expected Frequency, ( )

Ei = n Pi
0 12 22 2.6 12.2
7.87
1 10 9.6
2 19 17.4 0.15
3 17 21.1 0.80
4 10 19.2 4.41
5 8 14.0 2.57
6 7 8.5 0.26
7 5 4.4
8 5 2.0
9 3 17 0.8 7.6 11.62
10 3 0.3
11 1 0.1
∑ = 100 ∑ = 100.0 ∑ = 27.68
The value of E1 is given by nP0 = 100(0.026) = 2.6. In the similar manner, the
remaining Ei values are determined.
Since E1 = 2.6 < 5, E1 and E2 are combined. In that case O1 and O2 are also combined
and the value of k is reduced by 1. The last five class intervals are also combined for
the same reason, and k is further reduced by four.
Hence value of k becomes k = 12 – 1 - 4 = 7
The calculated
The degrees of freedom for the tabulated value of is
Here, s =1, since one parameter.
Critical value
Since, , H0 would be rejected, i.e., the given data is not Poisson
distributed.
Example 8.4 [M-05,10,12,16 D-10,15,17]

A federal agency studied the records pertaining to the number of job-related injuries at an
underground coal mine. The values for the past 100 months were as follows:
Injuries per Month 0 1 2 3 4 5 6

Frequency of Occurrence 35 40 13 6 4 1 1
i. Apply the Chi-Square test to these data to test the hypothesis that the underlying
distribution is Poisson.
ii. Apply the Chi-Square test to these data to test the hypothesis that the underlying
distribution is Poisson with mean 1.0
Use level of significance α = 0.05 and ,
Solution:
i. Define the hypothesis as:
H0: The data fits to Poisson distribution.
H1: The data does not fit to Poisson distribution.
For Poisson distribution,
∑
Where ̅
Using Chi-Square test we test the hypothesis as:
Xi Oi Pi Ei= n*Pi ( )
0 35 0.3296 32.96 0.126

1 40 0.3658 36.58 0.320
2 13 0.2030 20.30 2.625
3 6 0.0751 7.51
4 4 12 0.0209 2.09 10.16 0.333
5 1 0.0046 0.46
6 1 0.0010 0.10
∑ = 100 ∑= ∑ = 100
1.0000
Notice that we have grouped cells i = 3; 4; 5and 6 together into a single cell with Oi = 12 and
Ei = 10.16 since the expected frequency of those cells is less than 5.
We estimate the value of , hence, s = 1. Therefore, degree of freedom = k-s-1 = 4 – 1 – 1 =2
Given,
Since, < , H0 is accepted.
Hence the given data fits to Poisson distribution.
ii. Define the hypothesis as:

H0: The data fits to Poisson distribution.
H1: The data does not fit to Poisson distribution.
For Poisson distribution,
Where
Using Chi-Square test we test the hypothesis as:
Xi Oi Pi Ei= n*Pi ( )
0 35 0.3679 36.79 0.087

1 40 0.3679 36.79 0.280
2 13 0.1839 18.39 1.580
3 6 0.0613 6.13
4 4 12 0.0153 1.53 8.03 1.963
5 1 0.0031 0.31
6 1 0.0006 0.06
∑ = 100 ∑ = 1.0000 ∑ = 100
Notice that we have grouped cells i = 3; 4; 5and 6 together into a single cell with Oi = 12 and
Ei = 8.03 since the expected frequency of those cells is less than 5.
We do not estimate the value of , hence, s = 0.
Therefore, degree of freedom = k-s-1 = 4 – 0 – 1 =3
Given, 7.81
Since, < 7.81, H0 is accepted.
Hence the given data fits to Poisson distribution.
8.1.4.2 Chi- square test with equal probabilities
 If a continuous distributional assumption is being tested, class intervals that are equal
in probability should be used instead of equal in width of interval.
 For equal probabilities
pi = 1/k
Ei = npi ≥ 5
 Substituting for pi, we get
 Solving for k,
 In case of normal, exponential or weibull distribution, the method is straight forward.

 For gamma or other certain distributions, the computation of end points for class
intervals is complex and requires numerical integration for density function.
Disadvantages of using the chi-square test
 Changing the number of classes and interval width affects the value of calculated and
tabulated chi-square.
 A hypothesis may be accepted when the data are grouped in one way but rejected if it
is done in another way.
 It requires the data to be placed in the class intervals. In case of continuous grouping
is arbitrary.
8.1.4.3 Kolmogorov-Smirnov Goodness-of-Fit Test
 Any continuous distributional assumption can be tested for goodness-of-fit using

kolmogorov-smirnov test, while discrete distributional assumptions can be tested
using gap test.
 This test is useful when sample sizes are small (n < 50) and when no parameters have
been estimated from the data.
 The critical values in Table A.5 are biased, they are too conservative. Conservative
means that critical values will be too large, resulting in smaller Type I (α) errors than
those specified.
Example 8.5 [N-04, D-16]

The highway between Atlanta, Georgia and Athens, Georgia has a high incidence of
accidents along its 100 kilometers. Public safety officers say that the occurrence of accidents
along the highway is randomly (uniformly) distributed, but the news media say otherwise.
The Georgia Department of Public Safety published records for the month of September.
These records indicated the point at which 30 accidents involving an injury or death occurred,
as follows (the data points representing the distance from the city limits of Atlanta):
88.3 40.7 36.3 27.3 36.8

91.7 67.3 7.0 45.2 23.3
98.8 90.1 17.2 23.7 97.4
32.4 87.8 69.8 62.6 99.7
20.6 73.1 21.6 6.0 45.3
76.6 73.2 27.3 87.6 87.2
Use Kolmogorov-Smirnov test to discover whether the distribution of location of accidents is
uniformly distributed for the month of September with level of significance α = 0.05 and
D0.05, 30 = 0.24
Solution:
Define the hypothesis as:
H0: The data fits to Uniform distribution.
H1: The data does not fit to Uniform distribution.
Given N = 30 and D0.05,30 = 0.24.
Arrange all the random digits RD in ascending order.
i RD Ri i/N (i-1)/N (i/N)-Ri [Ri-(i-1)]/N
1 6.0 0.06 0.033333 0 - 0.06
2 7.0 0.07 0.066667 0.033333 - 0.036667
3 17.2 0.172 0.1 0.066667 - 0.105333
4 20.6 0.206 0.133333 0.1 - 0.106
5 21.6 0.216 0.166667 0.133333 - 0.082667
6 23.3 0.233 0.2 0.166667 - 0.066333
7 23.7 0.237 0.233333 0.2 - 0.037
8 27.3 0.273 0.266667 0.233333 - 0.039667
9 27.3 0.273 0.3 0.266667 0.027 0.006333
10 32.4 0.324 0.333333 0.3 0.009333 0.024
11 36.3 0.363 0.366667 0.333333 0.003667 0.029667
12 36.8 0.368 0.4 0.366667 0.032 0.001333
13 40.7 0.407 0.433333 0.4 0.026333 0.007
14 45.2 0.452 0.466667 0.433333 0.014667 0.018667
15 45.3 0.453 0.5 0.466667 0.047 -
16 62.6 0.626 0.533333 0.5 - 0.126
17 67.3 0.673 0.566667 0.533333 - 0.139667
18 69.8 0.698 0.6 0.566667 - 0.131333
19 73.1 0.731 0.633333 0.6 - 0.131
20 73.2 0.732 0.666667 0.633333 - 0.098667
21 76.6 0.766 0.7 0.666667 - 0.099333
22 87.2 0.872 0.733333 0.7 - 0.172
23 87.6 0.876 0.766667 0.733333 - 0.142667
24 87.8 0.878 0.8 0.766667 - 0.111333
25 88.6 0.886 0.833333 0.8 - 0.086
26 90.1 0.901 0.866667 0.833333 - 0.067667
27 91.7 0.917 0.9 0.866667 - 0.050333
28 97.4 0.974 0.933333 0.9 - 0.074
29 98.8 0.988 0.966667 0.933333 - 0.054667
30 99.7 0.997 1 0.966667 0.003 0.030333
D+ = max{(i/N)-Ri} = 0.047
D- = max{Ri-[(i-1)/N]} = 0.172
D = max(D+, D-) = 0.172
D0.05,30 = 0.24.
Since, D0.05,30 = 0.24 > D = 0.172, do not reject H0.
Hence, the data fits to Uniform distribution.
8.2 Selecting Input Models without Data [M-07,13, D-05,06,10]
It is often necessary in practice to develop a simulation model for demonstration purpose or

preliminary study before any process, data are available. In this case modeler chooses input
models and carefully checks the sensitivity of results to the chosen models.
There are many ways to obtain information, if data are not available. Few are mentioned
below
1. Engineering data
The values provided by manufacturers (example, the mean time to failure of a
computer chip, the cutting speed of the tool) provide a starting point for input
modeling by fixing a central value. Company rules might specify time or production
standards.
2. Expert option
Talking to the experts who have experience with the process or similar processes.
They can provide optimistic, pessimistic and most likely thoughts. They might also be
able to say whether the process is nearly constant or highly variable, and they might
be able to define the source of availability.
3. Physical or conventional limitations
Many real processes have physical limits on performance (Ex. Computer data entry is
not faster than what a person can type). Because of company policies, there could be
upper limits on how long a process may take. Do not ignore obvious limits or bounds
that narrow the range of input process.
4. The nature of process
The choice of distribution should be done after clear understanding of distributions.
When no data is available then uniform, triangular and beta distributions are used as input
models. A useful refinement is obtained, when minimum, maximum and one or more
breakpoints can be given. A breakpoint is an intermediate value and a probability of being
less than or equal to that value.
8.3 Multivariate and Time Series Input Models
 The variables may be related and if the variables appear in a simulation models as
inputs, the relationship should be determined.
 When inputs exhibit dependence then multivariate input models are used. Example:
Two random variables lead time and annual demand in inventory system.
 Time series models are useful for representing a sequence of dependent inputs.
Example: Successive time between orders in a system.
The two measures of dependence are covariance and correlation.

8.3.1 Covariance and correlation [M-15, S-15, D-11,14]
 Let X1 and X2 be two random variables, and let μi = E(Xi) and σi2 = var(Xi) be the
mean and variance of Xi.
 The covariance and correlation are the measures of linear dependence between X1
and X2.
 In others words it indicates how well the relationship between X1 and X2 is described
by the model
(X1 – μ1) = β(X2 – μ2) + ε
where ε is the random variable with mean 0, and is independent of X2
 If (X1 – μ1) = β(X2 – μ2), then model is perfect.
 If X1 and X2 are statistically independent, then β = 0, and the model is of no value.
 A positive value of β indicates that X1 and X2 tends to be above or below their means.
 A negative value of β indicates that X1 and X2 tends to be on opposite sides of their
means.
 The covariance between X1 and X2 is defined as
cov ( X1, X2 ) = E[(X1 – μ1 ) (X2 – μ2 )] = E (X1X2 ) - μ1μ2
 The value cov (X1, X2) = 0 implies β = 0.
 The value cov (X1, X2) < 0 implies β < 0.
 The value cov (X1, X2) > 0 implies β > 0.
 The covariance can take any value between - ∞ and ∞.
 The correlation standardizes covariance to be between -1 and 1:
( )
( )
 Value of corr (X1, X2) = 0 implies β = 0

 Value of corr (X1, X2) < 0(> 0) implies β < 0 (> 0)
 The closer ρ is to -1 or 1; stronger is the linear relationship between X1 and X2.
 The sequence of random variables X1, X2,… that are identically distributed (same
mean & variance) and may be dependent, such a sequence is called as Time series.
 cov (Xt, Xt + h) is called as lag-h autocovariance.
 corr (Xt, Xt + h) is called as lag-h autocorrelation.
 If the value of autocovariance depends only on h and not on t, then time series is
called as covariance stationary.
 The covariance stationary time series for lag-h autocorrelation is denoted as
ρh = corr (Xt, Xt + h)
8.3.2 Multivariate Input Models [M-06,11]
 If X1 and X2 are normally distributed then dependence between them can be modeled
by bivariate normal distribution with parameter μ1, μ2, σ12, σ22 and ρ = corr (X1, X2).
 To estimate ρ, let (X11, X21), (X12, X22), … (X1n, X2n) be the n independent and
identically distributed pairs.
 The sample covariance is
̂( ) ∑( ̅̅̅̅)( ̅̅̅̅) (∑ ̅̅̅̅ ̅̅̅̅)

where ̅̅̅ ̅̅̅ are the sample means.

 The correlation is estimated by
̂( )
̂
̂̂
where ̂ ̂ are the sample variances.
Example 8.6 [M-17, N-04, D-14,15]

The following data were available for the past 10 years on demand and lead time.
Lead time 6.5 4.3 6.9 6.0 6.9 6.9 5.8 7.3 4.5 6.3
Demand 103 83 116 97 112 104 106 109 92 96
Estimate correlation and covariance.
Solution:
Lead Time Demand
X1 X2 X1 X2 X12 X22
6.5 103 669.5 42.25 10609
4.3 83 356.9 18.49 6889
6.9 116 800.4 47.61 13456
6.0 97 582.0 36.00 9409
6.9 112 772.8 47.61 12544
6.9 104 717.6 47.61 10816
5.8 106 614.8 33.64 11236
7.3 109 795.7 53.29 11881
4.5 92 414.0 20.25 8464
6.3 96 604.8 39.69 9216
̅̅̅ , ̅̅̅ , ̅̅̅ ̅̅̅ 625.052 , ∑ 6328.5
∑ ̅̅̅̅ ∑ ̅̅̅̅
̂ = 1.04 ̂ = 98.62
̂ ̂
̂( ) ∑ ̅̅̅ ̅̅̅
, -
̂( )
̂( )
̂
̂̂ ( )( )
Therefore lead time and demand are strongly dependent.

8.3.3 Time Series Input Models [M-06,08,09,10,14,16 S-15, N-04, D-13,14,16]
 If X1, X2, X3,… is a sequence of identically distributed but dependent and covariance –
stationary random variables then there are number of time series models that can be
used to represent the process.
 Two models for describing time series are
1. AR(1) – Autoregressive order-1 model
2. EAR(1) – Exponential autoregressive order-1 model
 Both these models have the characteristic that the autocorrelation take the form
ρh = corr (Xt, Xt +h) lag-h autocorrelation for h = 1, 2, …
8.3.3.1 AR(1) model

 Consider the time-series model
Xt = µ + Ф(X t -1 - µ) +εt , t= 2, 3, …
where ε2, ε3, ε4… are independent and identically (normally) distributed with mean
zero and variance σε2 and -1< Ф < 1
 If the initial value X1 is appropriately chosen then X1, X2, are normally distributed with
mean µ, variance σ2ε / (1 - Ф2) and ρh = Фh , h = 1,2,…
 This time-series model is called Autoregressive order -1 model or AR(1).
 Parameter Ф = ρ1 = corr (Xt, Xt +1) with lag-1 autocorrelation
 To estimate Ф, we first estimate the lag-1 autocovariance by
̂( ) ∑( ̅ )( ̅) (∑ ( )̅ +
and the variance σ2 = var(X) by the usual estimator ̂
̂( )
 Then ̂
̂
 Finally, estimate µ and σε2 by ̂ ̅ and ̂ ̂ ( ̂ )
Algorithm to generate a stationary AR(1) time series

Given values of parameter Ф, µ, σЄ2
1. Generate X1 from normal distribution with mean µ and variance σЄ2∕ (1- Ф2).
Set t=2.
2. Generate εt from normal distribution with mean 0 and variance σЄ2.
3. Set Xt = μ +Ф ( Xt -1 - µ ) + ε t
4. Set t = t +1 and go to step2
8.3.3.2 EAR (1) Model
 Consider the time-series model

{
for t =2, 3, …, where ε2, ε 3 are independent and identically exponentially distributed
with mean 1/λ and 0 ≤ Ф < 1.
 If initial values X1 is chosen appropriately then X1, X2, ….all are exponentially
distributed with mean 1/λ and ρh= Фh for h=1, 2,…
 This time–series model is called exponential autoregressive order-1model or

EAR(1).
 Only autocorrelation greater than 0, can be represented by this model.
 Estimation of parameter proceeds as AR (1) by setting
̂ ̂ , lag –1 autocorrelation and
̂
̅
Algorithm to generate a stationary EAR (1) time series
Given values of parameter Ф and λ
1. Generate X1 from exponential distribution with mean 1/λ. Set t = 2.
2. Generate U from uniform distribution on [0, 1]. If U ≤ Ф then set
Xt = ФXt - 1
Otherwise generate εt from exponential distribution with mean 1/λ and set
Xt = ФXt - 1 + ε t
3. Set t = t +1 and go to step2.
Example 8.7 [D-11]

In stock brokerage, the following 20 time gaps were recorded between customers buy and sell
orders (in seconds):
1.95 1.75 1.58 1.42 1.28 1.15 1.04 0.93 0.84 0.75
0.68 0.61 11.98 10.79 9.71 14.02 12.62 11.36 10.22 9.20
Assuming exponential distribution is a good model for the individual gaps, calculate lag-1
autocorrelation.
Solution:
∑ ∑
̅
∑ ̅ ∑ ( )
̂
̂( ) (∑ ( )̅ )
, ( )( ) -
̂
Therefore we could model interarrival times as an EAR (1) process with
̂ and ̂ ̂ = 0.8
̅
provided exponential distribution is good model for individual gaps.
Example 8.8
The numbers of patrons staying at Hotel Delight near Andheri Station on 20 successive nights
were observed to be 20, 14, 21, 19, 14, 18, 21, 25, 27, 26, 22, 18, 18, 18, 25, 23, 20, 21. Fit
both an AR(1) and EAR(1) model to this data.
Solution:
For AR(1) model, the parameters ̂ ̂ ̂ are estimated as follows:
∑ ∑
̂ ̅
∑ ̅ ∑ ( )
̂
̂( ) (∑ ( )̅ )
, ( )( ) -
̂
Therefore we could model interarrival times as an EAR (1) process with
̂ and ̂ ̂ = 0.42
̅
■■■
CHAPTER 9
VERIFICATION, CALIBRATION AND VALIDATION OF
SIMULATION MODELS
 Verification and Validation of the simulation model is one of the most important and
difficult task carried out by the model developer, to work closely with end users
throughout the period of development and to increase the model‟s credibility.
 Validation is an integral part of the model development.
 The goal of validation is a twofold process:
1. To produce a model that represents true system behavior, this can be used as a
substitute for the actual system, for the purpose of experimenting.
2. To increase the acceptance, credibility level of model, so that the model will be
used by managers and other decision makers.
 Conceptually, the verification and validation process consists of the following
components.
1. Verification is concerned with building the model right, which is used in
comparison of conceptual model to the computer representation.
2. Validation is concerned with building the right model, which is used to
determine that a model is an accurate representation of a real system. It is
achieved through the calibration of the model, an iterative process of comparing
the model to actual system behavior. This process is repeated until model
accuracy is judged to be acceptable.
9.1 Model building, Verification and Validation [M-16]

The model building involves 3 steps
Step 1 – The first step in model building consists of
 Observing the real system
 Interactions among its various components
 Collecting data on its behavior
This step leads in understanding the system behavior. Persons familiar with the
system or sub system should be questioned and gain the advantage of their special
knowledge. As the development proceeds, new questions may arise and model
developers will return to this step.
Step 2- The second step involves in the construction of a conceptual model. It includes the
collection of assumptions of components, structure of the system and hypothesis on
the values of model input parameters.
Step 3- The third step is the translation of operational model into a computer recognizable
form – computerized model.
Model building is not a linear process instead the model builder goes back to these steps
many times while building, verifying and validating the model. The figure 9.1 shows the
model building process.
Real
Calibration system
and Validation
Conceptual validation
Conceptual model
1. Assumptions on system components

2. Structural assumptions, which define the
interactions between system components
3. Input parameters and data assumptions
Model verification
Operational model
(Computerized
representation)
Figure 9.1 Model building, Verification and Validation
9.2 Verification of Simulation Models

[M-07,08,09,10,12,13,14,15,17, D-06,07,08,09,11,15]
 The main purpose of verification is to assure that the conceptual model is accurately
reflected in the operational model (computerized representation).
 The conceptual model involves some degree of abstraction about the operations of the
system.
 Some considerations in verification process are
1. The computerized model has to be checked by others, than its developer.

2. Make a flow diagram (logic flow), which includes operations of system, so that
we know what operations takes place when the events occur.
3. Closely examine the output of model for reasonableness, under a variety of
settings of input parameters.
4. At the end of simulation, have the computerized representation that prints the
input parameter. This is to confirm that these parameter values are not changed or
modified.
5. As far as possible make the self-documentation of computerized model.
6. If the computerized representation is animated, then check whether the animation
reflects the real system.
7. The interactive run controller (IRC), debugger assists in finding and correcting the
errors in the following ways.
a. As the simulation progress, it can be monitored. This is achieved by

advancing the simulation under a desired time and then display model
information.
b. Focus on each or multiple line of logic that constitutes a procedure or a
particular entity. For example every time a specified entity becomes active,
simulation will pause.
c. Selected model components values can be observed.
d. Simulation can be temporarily suspended or paused, to view information
and reassign values or redirect entities.
8. Graphical interfaces are required to represent the model graphically, it simplifies
the model understanding.
 The standard statistics (average waiting time, average queue length etc) are
automatically collected in simulation language, which takes little time to display all
statistics of interest.
 Two sets of statistics that indicates the factor model reasonableness are
1. Current contents refer to the number of items in each components of the system
at a given time.
2. Total count refers to the total number of items that have entered each component
of the system by a given time.
 Some possibilities of two test statistics are
1. If the current content in some parts of the system is high, then it indicates that a
large number of entities are delayed and queue is unstable.
2. If the total count for some subsystem is zero, then no items entered the system.
3. If the current count and total count is equal to one then an entity has captured a
resource but never freed it.
 A careful evaluation is required to detect the mistakes in model logic. To help in error
detection, it is best to adopt any of these verification processes.
1. Common sense technique – Forecasts a reasonable range of values of selected
output statistics before making a run of the model. So it reduces discrepancies and
unusual output.
2. Documentation should contain brief comments, definitions of all variables and
parameters and description of each major section of computerized model.
Documentation is important as it provides a means to clarify the logic of a model.
3. Trace is more sophisticated technique. Trace is a detailed computer printout
which gives the value of each variable in a program, every time that one of these
variables changes in value. The purpose of trace is to verify the correctness of
computer program by making detailed calculations (manual). To make this
practical, a simulation with trace is usually restricted because of time factor.
Selective trace is also carried out as required. For example a selective trace could
be set for specific locations in the computerized model.
9.3 Calibration and Validation of Models

[M-05,06,07,08,09,10,11,12,13,14, S-15, N- 04, D-05,07,08,09,13,14,16,17]
 Validation is the overall process of comparing the model and its behavior to the real
system and its behavior.
 Calibration is the iterative process of comparing the model to the real system and
making adjustments to the model, comparing the revised model to reality, making
additional adjustments, comparing again, and so on.
 Figure 9.2 shows the relationship of model calibration to the overall validation
process.
 The comparison of the model to reality is carried out by a variety of tests – some
subjective and others objective.
 Subjective tests involve people, who are knowledgeable about one or more aspects of
the system, making judgments about the model and its output.
 Objective tests always require data on the system‟s behavior plus the corresponding
data produced by the model.
 Then one or more statistical tests are performed to compare some aspect of the system
data set to the same aspect of the model data set.
 This iterative process of comparing model and system, and revising both the
conceptual and operational models to accommodate any perceived model deficiencies,
is continued until the model is judged to be sufficiently accurate.
Figure 9.2 Iterative process of calibrating a model
Naylor and finger [1967] proposed a three-step approach for validation

1. Build a model that has high face validity.
2. Validate model assumptions.
3. Compare the model input–output transformation to corresponding input output
transformation for real system.
9.3.1 Face Validity
 The potential users of the model must be involved in the model construction from its
conceptualization to implementation, to assure that the reality is built into the model
through assumptions regarding system structure and reliable data.
 The advantages of involving potential users are.
1. They can evaluate the model output for reasonableness and help in identifying the
deficiencies. So they are involved in the calibration process, as the model is iteratively
improved.
2. The increase in the model‟s perceived validity or credibility helps the manager to trust
the simulation results, a basis for decision making.
 Sensitivity analysis can also be used to check a model‟s face validity – the model user
is asked whether the model behaves in the expected way, when one or more input
variables are changed. Based on experience and observation on the real system, both
model user and builder address the problem.
 For most large-scale simulation models, many possible sensitivity tests are carried out
as there are many input variables. The builder must choose the most critical input
variables for testing if it is too expensive or time consuming.
9.3.2 Validation of Model Assumptions
 Model assumptions have two general classes

1. Structural assumptions involve questions of how the system operates simplifications
and abstractions of reality.
Example: Customer queuing and service facility in a bank – Customers form a queue
for each teller, they are served on first-come, first-serve basis. When there are many
queues, customers may shift to other line that moves faster. The numbers of tellers are
either fixed or variable. These structural assumptions should be verified by
observations at regular interval with discussions between managers and tellers,
regarding policies and implementation of these bank policies.
2. Data assumptions involve collection of reliable data and correct statistical analysis of
the data.
Example – For a bank the data that were collected are
a. Inter arrival times of customers during several 2 hours period of peak
loading.
b. Inter arrival times during a slack period.
c. Service times for commercial accounts.
d. Service times for personal accounts.
 The reliability of data is verified by consultation with bank managers, who identify
typical slack/rush time. When two or more data sets collected are combined, objective
statistical tests is performed for homogeneity of data.
 Additional tests may be required for correlation in data. The analyst begins statistical
analysis as soon as he is assured of dealing with a random sample.
 The analysis consists of three steps
1. Identifying the appropriate probability distribution.
2. Estimating the parameters of hypothesized distribution.
3. Validating the model by goodness-of-fit tests (chi-square or Kolmogorov-Smirnov
test) and by graphical methods.
Note – The use of goodness-of-fit tests is an important part of validation of model

assumptions.
9.3.3 Validating Input-Output Transformations [M-13,16 D-07,08]
 In this phase, the model is viewed as an input-output transformation i.e. model accepts
values of input parameters and transforms these inputs into outputs measures of
performance.
 The modeler collects two sets of data, one data set used at the time of developing and
calibrating the model and the other if required at the final validation test.
 In any case, the modeler should use the main responses of interest as criteria for
validating a model. A necessary condition in this phase is, some version of system
under study exists, so data can be collected (at least one set of input conditions),
which might be useful to compare with model predictions. If system is in planning
stage and no system operating data is collected, complete input output validation is
not possible.
 What about the validity of model of a non- existent proposed system or model of
existing system under new input conditions?
 First, the responses of two models under similar input conditions will be used as
criteria for comparison of existing and proposed system.
 Second, the proposed system is a modification of existing system in most cases.
The modeler hopes that confidence in the model of existing system can be
transferred to the model of new system. This transfer of confidence by modeler
can be justified only if new model is relatively with minor modification of old
model in terms of changes to computerized representation of the system.
 Changes in computerized representation of the system, ranging relatively from minor
to major includes.
 Minor changes of single numerical parameters. Example – speed of a machine,
arrival rate of customers
 Minor changes of statistical distribution. Example – distribution of a service time
or time to failure of a machine.
 Major changes in logical structure of a subsystem. Example – change in queue
discipline for a waiting–line model.
 Major changes of design in new system. Example – computerized inventory
control system.
 If the changes are minor then it can be carefully verified and output from the new
model is accepted with confidence. If a similar subsystem exists elsewhere, it may be
possible to validate sub model that represents the subsystem and then integrate this
sub model with other validated sub models to build a complete model, this is a partial
validation of major changes.
 There is no way to completely validate the input-output transformations of a model of
non existing system. The modeler should consider time and budget constraints and
use as many validation techniques including input-output validation of subsystem
models if operating data can be collected on such subsystems..
10.3.4 Input-Output Validation: Using Historical Input Data
 To conduct a validation test using historical input data, it is important that all input
data (An , Sn…..) and all system response data such as average delay (Z2) should be
collected during the same time period. Otherwise the comparison of model to the
system responses could be misleading – the responses depends on inputs and structure
of the system or model.
 Implementation of this technique for large system is difficult because of the need of
simultaneous data collection. Some electronic counters and devices are used for ease
of data collection. In this technique the modeler hopes that simulation will provide a
replica of a real system, but to determine the level of accuracy both model builder‟s
and model user‟s judgment is considered.
10.3.5 Input-Output Validation: Using a Turing Test
 The comparison of model output to system output can be carried out by persons who
are knowledgeable about system behavior, when no statistical test is readily
applicable.
 For example: Suppose five reports of system performance over five different days are
prepared and simulation output data are used to produce five fake reports. So there are
10 reports exactly in same format and contains information as required by managers
and engineers. These 10 reports are shuffled randomly and submitted to the engineer,
to identify fake and real reports. If the engineer identifies fake reports, then the model
builder questions the engineer and uses the information gained to improve the model.
If the engineer cannot distinguish, then the modeler will conclude that this model is
adequate. This type of validation test is called Turing test.
 It provides a valuable tool in detecting model inadequacies and eventually in
increasing model credibility.
■■■
CHAPTER 10
ESTIMATION OF ABSOLUTE PERFORMANCE
10.1 Introduction [D-10,14]

 Output analysis refers to the examination of data that has been produced as a
result after the simulation of a particular system.
 We need to interpret results for the following reasons:
(i) To predict the system performance.
(ii) Compare performance of two or more design solutions.
 The need for statistical output analysis is based on the observation that the output data
from a simulation exhibits random variability
(i) due to use of random numbers to produce input variables
(ii) Two different streams or sequences of random variables will produce two sets of
outputs which will differ
 There are various issues relating to the output analysis like:
(i) The sequence of the output variables may be auto-correlated, hence creating a
problem in applying the classical methods of statistics.
(ii) Initial conditions of the system may influence the output data.
10.2 Types of Simulations With Respect To Output Analysis [M-09,10,11,12,

S-15, D-06, 11,13]
 In the analyzing of simulation output data, a distinction is made between terminating
or transient simulations and steady-state simulations.
(i) Terminating (Transient) simulation:
 Runs for some duration of time TE, where E is a specified event that stops the
simulation.
 Starts at time 0 under well-specified initial conditions.
 Ends at the stopping time TE.
 Bank example: Opens at 8:30 am (time 0) with no customers present, and closes
at 4:30 pm (Time TE = 480 minutes).
 The simulation analyst chooses to consider it a terminating system because the
object of interest is one day‟s operation.
 TE may be known from the beginning or it may not.
(ii) Non-terminating (Steady-state) simulation:

 Runs continuously or at least over a very long period of time.
 Examples: assembly lines that shut down infrequently, telephone systems,
hospital emergency rooms.
 Initial conditions defined by the analyst.
 Runs for some analyst-specified period of time TE.
 Study the steady-state (long-run) properties of the system, properties that are not
influenced by the initial conditions of the model.
 Whether a simulation is considered to be terminating or non-terminating depends on

both the objectives of the simulation study and the nature of the system.
10.3 Stochastic Nature of Output Data

 Model output consists of one or more random variables because the model is an input-
output transformation and the input variables are random variables.
 The simulation model is run with different data sets for n times.
 Two issues need to be addressed with regard to the statistical analysis. They are:
 Output data from various runs are independent and identically distributed.
(i) Estimation of the true utilization, Point estimate ( ̂)
(ii) Estimation of the error in our point estimate as standard error or a
confidence interval.
 The effects of correlation and initial conditions of the estimation of the long
run mean measures of performance.
(i) The batching transforms the continuous time queue length process into
a discrete time batch mean process.
(ii) The sequence of the batches is auto-correlated as all the data is
obtained using one replication only.
(iii)Avoid direct statistical analysis of the within replication output, as the
sequence is a non-stationary auto-correlated stochastic process.
(iv) Here non-stationary means not identically distributed.
10.4 Measures of Performance and Their Estimation

 The basic aim here is to have a point estimate and an interval estimate of the
performance parameter θ (or Ф).
 The length of the interval estimate is the measure of the error in the point estimate.
 If the simulation output data is of the form {Y1, Y2, …, Yn}, it is called as discrete-
time data, since n is discrete valued. Her θ is an ordinary mean.
 For continuous time, data = f{ Y1, Y2, …, Yn }, is a time weighted mean.
 Here Yi presents the customer delay.
 And Y(t) represents the length of the queue at time t.
10.4.1 Point Estimation [S-15, N-04]

1. Discrete Time Data
 The point estimator of θ based on {Y1, Y2, …, Yn} is given by
̂ ∑
 The point estimator ̂ is unbiased for θ if ( ̂ ) .

 Usually ( ̂ ) .
 Here, the estimator bias is ( ̂) .
 It is desirable to have unbiased or low biased estimator.
 Example: ̂ and ̂ in the queueing models, where Wi is the time spent by the
customer I in the system.
2. Continuous Time Data

 The point estimator based on the data [W(t), 0 ≤ t ≤ TSE] is given by
̂ ∫ ( )
 The point estimator ̂ is unbiased for if ( ̂ ) .

 Usually ( ̂ ) .
 Here, the estimator bias is ( ̂ ) .
 It is desirable to have unbiased or low biased estimator.
 Example: ̂ and ̂ in the queueing models.
Generally, and are called as mean measures of performance.
10.4.2 Quantile and Percentile Estimation

 Performance measures like quantile and percentile does not fit to the above
mentioned framework.
 Quantile describe the level of performance that can be delivered with a given
probability p.
 Let WQ represents the delay in queue.
 The 0.65 quantile of WQ is the value such that Pr{ WQ ≤ }= p where p = 0.65.
 As a percentage, is the 100pth or 65th percentile of customer delay.
 Hence 65% of all customers will experience a delay of minutes or less.
 Median is also widely used as a performance measure.
 Median is the 0.5 quantile or 50th percentile.
 Estimation of a quantile is the inverse of the problem of estimating a proportion or
probability.
 For estimation of proportion, is given and p is to be estimated but in case of a
quantile, p is given and is to be estimated.
 To estimate a quantile, the simple and easiest method is:
1. Form a histogram of the observed values of WQ.
2. Find a value of ̂ where 100p% of histogram is to the left of θ.
3. Example: Let {WQ1, WQ2, … , WQ200}be the delays of n = 200 customers.
Then ̂, which represents an estimate of the 65th percentile of the delay is
(0.65)(200) = 130.
4. Hence we say that 130 observed values are less than or equal to and ̂ is
equal to 130th smallest value in the given sample.
5. In case of continuous time process, histogram gives the fraction of the time
that process spent at each possible level.
6. Example: The queue length process * ( ) +.
10.4.3 Interval Estimation [S-15, N-04]

 The point estimate rarely matches the actual value of the parameter being estimated.
 Hence, there is a need to find an interval estimate.
 For this we need to estimate the variance of the point estimator ̂ or ̂.
 For the given number of samples we can determine how many samples we need for a
given desired variance.
 We construct an interval as the confidence interval, in which we have definite
confidence that it contains the true value of the unknown parameter.
 Consider the sample data point {Y1, Y2, …, Yn}.
 The true variance of the point estimator is ( ̂) ( ̂).
 The estimate of the variance of point estimator is ̂ ( ̂) ( ̂).
 If , ̂ ( ̂ )- ( ̂), the estimator is then considered to be biased.
 But if the variance of point estimator ̂ ( ̂) is nearly unbiased, the statistic
̂
̂ ( ̂)
is approximately t-distributed with f degrees of freedom and can be bounded by
100(1 - α) confidence interval for θ and it is given by
̂ ̂( ̂) ̂ ̂( ̂) ̂ ̂ ̂( ̂)
 is the 100(1 – α/2)% point of a t-distribution with f degrees of freedom and is
given by . / .
 It is very difficult to obtain approximately unbiased estimates of ( ̂)
10.5 Output Analysis of Terminating Simulations [N-04]

 Consider the simulation that runs over a time interval [0, TSE].
 Let Y1, Y2, …, Yn be the output observations of the sample size n.
 A method of independent replication is used to estimate θ using simulation where,
. ∑ /.
 In this method the simulation is repeated „R‟ times with different random-number
stream and independently chosen initial conditions.
 Let Yri be the ith observation within replication r, for i = 1, 2,…, nr and r = 1, 2,…, R.
 For fixed replication r, Yr1, Yr2, … is an auto-correlated sequence within replication r.
 But for different replications r and s (r ≠ s), Yri and Ysj are statistically independent
for all i and j.
 Let ̂ be the sample mean within each replication r such that ̂ ∑ , r = 1,
…, R.
 The R sample means ̂ ̂ ̂ are statistically independent and identically
distributed and are unbiased estimators of θ.
 Now, one can apply the classical methods of interval estimation.
 Similarly, Ф is defined by ̂ .
10.5.1 Statistical Review [M-09, D-06]

 Let {Y1, Y2, …, Yn}be the statistically independent observations.
 Then,
 Sample Mean is given by ̂ . ∑ / and
( ̂) ∑ ̂
 Sample variance is given by ∑ .
 If Yi are independent and identically distributed, the sample variance S2 is an
unbiased estimator of the population variance ( ) for i = 1 to n.
 The variance of ̂ is ( ̂) which is unbiased estimator of ( ̂), with f = n – 1
degrees of freedom which is given by ( ̂) .

 Now the confidence interval which is given by the equation
̂ ̂( ̂) ̂ ̂ ̂( ̂) is valid provided that the point estimate ̂ is
unbiased.
 ̂( ̂) is also denoted by s.e.( ̂ ) and is called the Standard Error of the point
√
estimator ̂.
 The Standard Error is a measure of:
1. The precision of a point estimator and
2. The average deviation to be expected between the point estimator and the true
mean.
10.5.2 Confidence Interval Estimation for a Fixed Number of Replications [N-04]
1. Discrete Time Data

 Consider R independent replications of simulation are made.
 It results in:
1. Independent estimates those are given by
̂ ∑ , r = 1, …, R.
2. The overall point estimate, ̂ is given by
̂ ∑̂
3. The estimate of the variance of ̂ is given by
̂ ( ̂) ∑( ̂ ̂)
( )
4. A 100(1 - α)% confidence interval with f = R – 1 degrees of freedom is given
by ̂ ̂( ̂) ̂ ̂ ̂( ̂)
5. The standard error of the point estimator ̂ is given by
̂( ̂) √ ̂ ( ̂)
 As R increases the standard error becomes smaller and finally reduces to zero.
2. Continuous Time Data

 Consider the output data {Yr(t), 0 ≤ t ≤ TSE}with r = 1, 2, …, R independent
replications.
 It results in:
1. Independent estimates those are given by
̂ ∫ ( ), r = 1, …, R.
2. The overall point estimate, ̂ is given by
̂ ∑̂
3. The estimate of the variance of ̂ is given by
̂ (̂) ∑( ̂ ̂)
( )
4. The confidence interval is similar as that in Discrete Time Data.
Example 10.1 [D-11]

Given the following data for utilization and time spent in system for the Able – Baker carhop
problem. Calculate the overall point estimators, standard error and 95% confidence interval
for the same. Given t0.025,3 = 3.18
Run r 1 2 3 4
Able‟s Utilization 0.808 0.875 0.708 0.842
Average system time (mins) 3.74 4.53 3.84 3.98
Solution:
The analyst desires a 95% confidence interval for Able‟s true utilization, .
As 95% confidence interval is expected, 100(1 – α) = 95
1 – α = 95/100 = 0.95
α = 1 – 0.95 = 0.05 = 5%
Also degree of freedom f = R – 1 = 4 – 1 = 3
An overall point estimator
̂ ∑̂
The estimate of the variance of ̂ is given by

̂ ( ̂)
∑(̂ ̂)
( )
( ) ( ) ( ) ( )
( )
= (0.036)2
Thus, the standard error of ̂ = 0.808 is estimated by s.e.( ̂) = ̂( ̂) = 0.036.
Given t0.025,3 = 3.18.
Now, compute the 95% confidence interval by
̂ ̂( ̂) ( )( )
or with 95% confidence, 0.694 ≤ ρ ≤ 0.922
In the similar way, compute a 95% confidence interval for mean time in system w:
( ) ( ) ( ) ( )
̂ ( ̂) ( )
( )
so that
̂ ̂( ̂) ( )( )
or with 95% confidence, 3.46 ≤ w ≤ 4.58
10.5.3 Confidence Intervals with Specified Precision

 Sometimes we would like to estimate CI with a specified precision
 The half-length (h.l.) of a 100(1 – a)% confidence interval for a mean θ, based on the
t - distribution, is given by:
̂( ̂) ̂( ̂)
√
 Here, we need a large sample size R, if an error criterion is specified to estimate
within with high probability 1 – α, so that (| ̂ | ) .
 STEPS:
1. Make R0 independent replications where R0 = 2 but 4 or 5 is recommended; 10
or more is desirable.
2. Get an initial estimate of the population variance based on R0
replications. Thus,
√
Solving for R in inequality,
( )
3. As , then an initial estimate for R is given by
( *
where is the . / point of the standard normal distribution.

Also, for large R (say R ≥ 50), .
4. After estimating R, collect R – R0 additional observations and form the
( ) confidence interval of by
̂ ̂
√ √
Also, ̂ ( ̂)
5. If the confidence interval is too large, repeat the steps to estimate an even
larger sample size.
10.5.4 Confidence Intervals for Quantiles

 Consider that the number of independent replications Y1, Y2, …, YR is large enough
so that .
 The confidence interval for a probability p is given by
̂( ̂)
̂ √
 We know that the quantile estimation problem is the inverse of the probability
estimation problem.
 Steps:
1. Obtain such that ( ) .
̂
2. To estimate „p‟ quantile, find such that 100p% of the data in the histogram
of Y is to the left of ̂ or the R pth smallest value of Y1, Y2, …, YR.
3. Obtain the appropriate (1 – α)100% confidence interval for by calculating:
(i) that cuts off 100P1% of the histogram and
(ii) that cuts off 100Pµ% of the histogram
where
( ) ( )
√ √
4. In case of sorted values,
̂ is the Rp1 smallest value (rounded down) of Y1, Y2, …, YR
̂ is the Rpµ smallest value (rounded up) of Y1, Y2, …, YR
10.6 Output Analysis of Steady State Simulation [M-05, D-05,08]

 The goal is to estimate the steady state or long run measures of performance of a
system.
 Consider a single run of a simulation model that generates observations Y1, Y2,..,
which are samples of an auto-correlated time series.
 The steady state measure of performance „ ‟ with probability 1 is given by:
 The value of is independent of the initial conditions.

 In practice, simulation stops after say „n‟ numbers of observations have been collected
or for some length of time TSE.
 The sample size „n‟ (or TSE) is a design choice and independent of the nature of
problem.
 Since simulation can‟t run forever, we choose n to terminate simulation based on:
1. Bias in the point estimator due to inappropriate initial conditions. This is more
severe for short term simulation. Average out for long runs.
2. Desired precision of the point estimator (measure point estimator variance).
3. Time and budget available for computers and other resources.
10.6.1 Initialization Bias in Steady State Simulations [M-08,11,14, D-06,07,08,11,15,17]

 Initial conditions can be artificial or unrealistic.
 For this we use methods to reduce the point-estimator bias caused by using such
conditions.
1. Intelligent Initialization
 Initialize the simulation in a state that is more representative of long-run
conditions.
 Give some time to the system to stabilize.
 If the system exists, collect data on it and use that data to specify more
nearly typical initial conditions.
 If the system does not exist, then use any data to make a simplified model
so as to make it mathematically solvable.
 Example: Queueing models, solve the simplified model to find long-run
expected or most likely conditions, use that to initialize the simulation.
2. Divide each simulation into two phases as shown in figure 10.1.
First: Initialization phase from time „0‟ to time T0 (Don‟t collect data until some
time T0, as system requires some time to stabilize).
Second: Data collection phase from time „T0‟ to „T0+ TSE‟
Specified Initial “Steady State” Initial

Conditions Conditions
I0 I
|------------------------------|------------------------------------|
0 T0 T0+TSE
Initialization phase of Data collection phase of

length T0 length TSE
Figure 10.1 Initialization and data collection phases of steady state simulation run
 The selection of „T0‟ is an important issue as the system state „I‟ at time
T0, is proper representation of steady state behavior than the original initial
condition (I0) at time 0.
 Also, the duration of data collection phase „TSE‟ should be long enough so
that the steady state behavior must be sufficiently precise estimates.
 Note that the system state „I‟ is a random variable. The probability
distribution of the system state at time „T0‟ is sufficiently close to the
steady state probability distribution which makes the bias point estimates
of the response variables negligible. So we can say that the system is
approximately in steady state.
Several issues:
1. Ensemble average will reveal a smoother and more precise trend as the number of
replications, R, is increased.
2. Ensemble average can be smoothened further by plotting a moving average. In a
moving average each plotted point is actually the average of several adjacent
ensemble averages.
3. Cumulative averages become less variable as more data are averaged. Thus, it is
expected that the curve at left side (the starting of the simulation) of the plotting is
less smooth than the right side.
4. Simulation data, especially from queueing models, usually exhibits positive
autocorrelation. The more correlation present, the longer it takes for the average to
approach steady state.
5. In most simulation studies the analyst is interested in several measures such as queue
length, waiting time, utilization, etc. Different performance measures may approach
stead state at different rates. Thus it is important to examine each performance
measure individually for initialization bias and use a deletion point that is adequate
for all of them.
10.6.2 Replication Method for Steady State Simulations [D-16]

 If initialization bias in the point estimator has been reduced to a negligible level, then
the method of independent replications can be used to estimate point-estimator
variability and to construct a confidence interval.
 The basic idea is simple: Make R replications, initializing and deleting from each one
as in terminating simulations.
 If, however, significant bias remains in the point estimator and a large number of
replications are used to reduce point estimator variability, the resulting confidence
interval can be misleading.
 This happens because bias is not affected by the number of replications (R); it is
affected only by deleting more data (i.e. increasing T0) or extending the length of each
run (i.e. increasing TSE).
 Thus, increasing the number of replications R may produce shorter confidence
intervals around the “wrong point.” Therefore, it is important to investigate the initial-
condition bias thoroughly.
 If the simulation analyst decides to delete d observations of the total n observations in
a replication, then the point estimator is the average of the remaining data.
10.6.3 Sample Size in Steady State Simulations

 Let us say that we have to estimate a long-run performance measure, , within
with confidence interval 100(1 – α)%.
 In a steady-state simulation, a specified precision may be achieved either by

increasing the number of replications (R) or by increasing the run length TSE.
Example 10.2 [D-13]

Consider the following data for the M/M/1 queue simulation. R0 = 10, d = 2, and = 25.30.
Estimate the long-run mean queue length, LQ, within =2 customers with 90% confidence.
From the table the value of Z0.05 = 1.645. How many additional replications are required?
Solution:
Given that: R0 = 10, d = 2, = 25.30, =2
As 90% confidence interval is expected, 100(1 – α) = 90
1 – α = 90/100 = 0.90
α = 1 – 0.90 = 0.10 = 10%
Estimate the final sample size as desired by:
( ) ( )
( +
( )
Thus, at least 18 replications will be needed.
Use inequality ( ) to estimate the possible candidate (R = 18, 19, 20,…) to
estimate the final size.

R 18 19
t0.05, R-1 1.74 1.73
19.15 18.93
( *
Since R = 19 ( ) =18.93 is the smallest integer R satisfying the inequality, a total
sample size of R = 19 replications are needed to estimate LQ to within customers.

So, R – R0 = 19 – 10 = 9 additional replications are needed to achieve the specified error.
10.6.4 Increasing the Run Length

 An alternative to increasing r is to increase total run length T0+TSE within each
replication.
 The run length is increased from T0+TSE in the proportion of R/R0 to new run length
(R/R0) (T0+TSE).
 Here, we delete additional amount of data from time „0‟ to time „(R/ R0)T0‟ and more
data will be used to calculate the point estimate.
 If we increase only the number of replications but maintain the same run length then
the total amount of simulation effort remains the same.
 Advantage: Any residual bias in the point estimator should be further reduced.
 Disadvantage: It is necessary to save the state of the model at time T0+TSE to restart it
and to run it for additional time.
Figure 10.2 Increasing run-length to achieve specified accuracy
10.6.5 Batch Means for Interval Estimation in Steady State Simulations [M-12,17, D-10]
 One problem of replication method is that you have to delete some initial data i.e.„d‟
observations from each of the „R‟ replications.
 So, we are throwing away a total of „dR‟ observations over all of the observations.
 To overcome this issue we may use an experimental design that is based on single
long replication.
 But, the major problem is in the computation of the standard error of the sample mean
as the data are dependent, so usual estimator is biased.
 Batch means is the method that overcomes the above mentioned problems.
 Method:
 Divide the output data from one replication (after appropriate deletion) into a
few large batches.
 Process the means of these batches as if they were independent.
 Calculate the batch means ( ̅ ) based on the form of the raw output data.
1. Continuous time process

 If the number of batches are “k”, then batch size .
 Batch mean ̅ = ∫( )
( ) for j = 1, 2, .., k
 The jth batch mean is the time weighted average of the process over the
time interval [T0 + (j – 1)m, T0+jm].
2. Discrete time process

( )
 If the number of batches are “k” then the batch size .
 Batch mean ̅ = ∑ ( ) for j = 1, 2,…, k.
 Different batch means those we are getting using above equation are given
below:
… ( )
deleted ̅ ̅ ̅
 Now estimate the variance of the sample mean:

(̅ ̅) ∑ ̅ ̅
∑
( )
where ̅ is the overall sample mean of the data after deletion.
 Note that, ̅ ̅ ̅ are not independent, but if the batch size is large
enough then successive batch means will be approximately independent and
the variance estimator will be approximately unbiased.
10.6.5.1 General Guidelines for choosing an acceptable batch size m or Number of

batches k.
1. For a fixed sample size, a number of batches should be used.
2. The lag-1 autocorrelation ( ̅ ̅ ) is used to examine the dependence
between the batch means.
 Estimate the lag-1 autocorrelation from a large number of batch means based
on a smaller batch size ( ).
 If we rebatch the data to be between 10 and 30 batch means based on a
larger batch size then the autocorrelation will be very small.
3. If the total sample size to be chosen sequentially, then the run length increases as
the batch size and the number of batches increases.
10.6.5.2 Using these guidelines, the following general strategy is recommended

1. Collect the output data from a single replication and delete the appropriate amount
of data from it.
2. Calculate the batch means ( ) and estimate the sample lag-1
autocorrelation of the batch means:
∑ ( ̅ ̅ )( ̅ ̅)
̂
∑ (̅ ̅)
3. Check that if the correlation is sufficiently small:
a) If ̂ , then –
1. Rebatch the data into batches.
2. Form a confidence interval using k – 1 degrees of freedom for the t –
distribution and estimate the variance of ̅ .
b) If ̂ , then
1. Increase the replication by 50% to 100%.
2. Go to step 2.
4. Check again the confidence interval by examining the batch means for
independence test using the test statistics „C‟ where
(̅ ̅) (̅ ̅)
√ (̂ +
∑ (̅ ̅)
 If , then accept the independence of the batch means else extend the
replication by 50% to 100% and go to step 2.
 Due to the small number of batches the estimated lag-1 autocorrelation would be
slightly negative.
 Also, it is difficult to estimate correlation with small numbers of observation.
10.6.6 Confidence Intervals for Quartiles

 Consider the output process Yd+1, …, Yn from a single replication.
 Let Yi represents the delay in queue of the ith customer.
 The point estimate of the pth quantile can be obtained either from the histogram or the
sorted values.
 Here, we use the data after deletion point only.
 Now, make R replications; let ̂ be the quantile estimate of the rth replication.
 The R quantile estimates ̂ , .., ̂ are independently and identically distributed.
 The average is given by:
̂ ∑̂
 The average ̂ can be used as the point estimate of .

 An approximate confidence interval is given by
̂ where S2 is the sample variance of ̂ , .., ̂
√
 If we have only one replication then we say that ̂ be the quantile estimate from
within the ith batch of data.
 But it requires sorting of data or formatting a histogram within each batch.
 If the batches are sufficiently large, then these within-batch quantile estimates are also
independent and identically distributed.
■■■
UNIT: V
APPLICATION
CHAPTER 11 CASE STUDY

CHAPTER 11
Application: Case Study
11.1 Processor and Memory Simulation [M-06, D-15]

1. Processor (CPU) Simulation
 It is the lower level abstraction that we do to discover what the execution time is for
the system.
 The input for this simulation is basically instructions.
 The simulation works through the logical design of CPU to find out what happens in
response to a given stream of instructions, time for executions and where does the
bottleneck exist in the CPU.
 The main challenge to make effective use of CPU is to avoid stalling it, major cause
of which is the latency delay between the CPU and main memory.
 Modern processors after pipelining and extra capabilities exploit Instruction Level
Parallelism (ILP).
 A simulation of ILP CPU will model the logic of each stage and coordinate the
movement of various instructions from stage to stage.
 It starts with the instruction fetch stage that interacts with the simulated memory
system, if present.
 Then comes the instruction decode stage that places an instruction in the CPU‟s list of
active instructions.
 Then is the instruction issue stage. Implementation of this stage model depends on
marking registers and functional units.
 For the instruction execute stage, simulation is a matter of computing the result
specified by the instruction.
 Between the instruction issue and the instruction complete stage, instructions can get
processed in order that does not correspond to the original stream.
 The last stage, “graduation”, re-orders these instructions.
 Simulating this stage is a matter of knowing the sequence number of the next
instruction to be generated and then graduating when it appears.
2. Memory Simulation
 Data is written into the L1 cache. This is done in one of the ways:
(i) Write through all the caches whenever there is a write.
(ii) Entire block is written back every time it is ejected.
 The simulation here compares the performance of these two methods or strategies,
taking costs and contention for resources into consideration.
 Direct-execution simulation is used to provide reference streams for the evaluation of
a memory hierarchy design.
 Caches must be very fast. Dedicating comparison hardware with every location in the
associative memory does this.
 Another role of simulation is to work out, for a given cache size, how the space ought
to be partitioned into sets. This is largely a cost consideration.
 It works to compute the miss ratio of many different set sizes in 1 pass of reference
string.
 Presented with a reference, simulation searches the list of cache blocks for a match.
 Building a cache to have a unique comparator associated with every address in the
cache is extremely expensive; so various memory tricks are applied to bring down the
costs.
 For this, the address space is partitioned into various sets, and any given memory
address is mapped to the set identified by its set-id address bits.
 In context of a set associative cache simulation, each set must be managed separately.
11.2 Manufacturing and Material Handling Systems

[M-08,14,16,17, D-05,09,12,14,17]
 Before designing the model, one should first determine the scope of the simulation. In
this case the objectives of the study and data being collected are useful to freeze the
scope of the work.
 Level of details may be constrained by the availability of input data and the
knowledge about the working of various system components and their interaction
with each other in this process.
 In case of new or non – existent systems, data may be inadequate and the system
knowledge may be based on assumptions and expert‟s opinion.
 The experienced simulation analyst involves customer throughout the simulation
study to determine the purpose of model which provide the basis for selecting a
proper scope and level of detail.
 Simulation is often used to evaluate and compare the effectiveness of various schemes
and to evaluate suggested improvements. It can be used to debug and fine – tune the
system before it is installed.
11.2.1 Models of Manufacturing Systems

While modeling the manufacturing system one has to consider following components -
1. Physical Layout of the System
2. Labor and their skill set
3. Equipment for the system
4. Maintenance policies
5. Work Centers for the different types of labor
6. End Product of the system
7. Production Schedules
8. Production Control and Quality Control
9. Suppliers, ordering policy and lead time
10. Warehouse
11. Packing and Shipping
11.2.2 Goals and Performance Measures [N-04, D-07]

 The modeler uses simulation software to gain insight and understanding how a new or
modified system will work.
 While simulations are expected to provide numeric measures of performance, such as
throughput under a given set of conditions, the major benefit of simulation comes
from the insight and understanding gained regarding systems operations.
 The major goals of simulation of manufacturing system are to identify problem areas
and quantify system performance.
 Common measures of system performance include the following:
1. throughput under average and peak loads;
2. system cycle time (how long it takes to produce one part);
3. utilization of resources, labor, and machines;
4. bottlenecks and choke points;
5. queueing at work locations;
6. queueing and delays caused by material-handling devices and systems;
7. WIP storage needs;
8. staffing requirements;
9. effectiveness of scheduling systems;
10. effectiveness of control systems.
11.2.3 Issues in the simulation of manufacturing system [M-05,07, S-15, D-08,13,14,16]

The following are some of the specific issues that simulation needs to address in
manufacturing:
 The Resources
1. Number and type of machines required for a particular job
2. Number, type, and physical arrangement of transporters, conveyors, and other
support equipment.
3. Location and size of warehouse
4. Evaluation of a change in number of items produced
5. Evaluation of the effect of a new equipment on an existing manufacturing system
6. Evaluation of capital investments
7. Labor-requirements and Staffing policy
 Performance Evaluation
1. Throughput of the system at any point of time.
2. System‟s response time i.e. time taken by system to produce an item
3. Utilization of various resources like labors, machines etc.
4. Bottlenecks and Chokepoints analysis
5. Various statistics related to waiting time
6. Time-in-system analysis
 Evaluation of operational procedures
1. Effectiveness of production scheduling policy
2. Effectiveness of quality control system
3. Inventory policies or Need of Warehouse
4. Maintenance policy
5. Effect of changes in order profiles
11.2.4 Modeling Downtimes

 Downtime has major impact on system performance. But downtimes are usually
predictable and can be scheduled to minimize disruptions in manufacturing process.
Following suggestions one may follow to model random unscheduled downtimes.
1. Ignore it
2. Do not model it explicitly but one may adjust processing time.
3. One may use constant values for life time and repair time.
4. Statistical distributions are more suited to model life time and repair time.
 Time to failure can be measured in different ways like:
1. by clock time
2. by machine or equipment busy time
3. by number of run times
4. by number of items manufacture
 Time to repair can also be modeled in different ways:
1. As a pure time delay
2. As a wait time for a resource plus time delay for actual repair.
11.2.5 Material Handling System [M-06]

 The design of material handling systems is modular, i.e. they consist of a limited set
of similar or identical elements.
 Typical material handling elements are -
 Automatic storage and retrieval machines (ASRS)
 Working stations
 Junctions to control the goods flow.
 The operation of a material handling element is defined by following properties -
 Stochastic properties of goods flows which enter the component (inter-arrival time
distribution), local transfer matrix,
 The technical properties (e.g. speed, handling or delay time), and
 The control strategies (e.g. priority or batch building rules) which control the
element‟s operation.
 These properties determine the element‟s performance parameters like
 Utilization,
 Mean and upper bound of queue sizes in front of the element‟s inputs,
 Stochastic parameters of its outputs.
 So a material handling element can be seen as an apparatus which transforms a certain
set of properties into a certain set of parameters according to its very own
functionality and character.
 The operation of such an apparatus can be modeled in a very small simulation model
to get parameter sets corresponding to property sets.
 Due to the stochastic nature of the data, each single simulation should run for a couple
of days.
11.2.6 Goods Flow Properties

 All elements are connected to form a whole material handling system.
 The links are transportation lines which transfer goods from one element to the next
one.
 Over this links adjacent material handling elements influence each other.
 A goods flow which leaves an element (its output) has certain characteristics (inter-
arrival time distribution).
 This goods flow characteristics become properties which influence the operation of
the successive element.
 Extensive research has been performed to identify the properties of the inter-arrival
time distribution which have a noticeable influence on the elements and which are
likely to describe goods flow characteristics sufficiently:
 The minimum inter-arrival time,
 The mean inter-arrival time,
 The variance (or coefficient of variation) of inter-arrival time.
 The minimum inter-arrival time is usually determined by the technical transportation
devices involved and can be easily estimated.
 The mean inter-arrival time can be derived from goods flow balances (with the
exception of stores, the average rate of arriving items is equal to the leaving rate) and
the element‟s local transfer matrix.
 Thus, the variance of inter-arrival time is the only property which usually has to be
calculated.
 Although the arrival processes look quite different the resulting performance
parameters of the elements are comparable. This leads to the conclusion that goods
flows can be described sufficiently by the three properties named above.
11.2.7 Model of Material Handling Systems [M-06, N-04, D-07]

 The approach to material handling system modeling is consequently straight forward.
 Material handling systems are composed of material handling elements.
 They are linked by goods flows which can be characterized by a limited set of three
properties.
 These properties are on the one hand properties of the incoming goods flows of an
element which determine its operation; they are on the other hand properties of the
outgoing goods flows which result from the element‟s operation.
 In the way how the goods pass a material handling system the performance
parameters of the elements involved and properties of their outgoing goods flows can
be calculated from the sources towards the sinks.
 If the system contains cycles (e.g. repair and restoring work), an iterative calculation
of the goods flow properties is required.
 It turned out that the iteration will converge in most cases rather quickly.
Figure 11.1 Block Diagram of Manufacturing Systems

The sequence of operations in a manufacturing system of metal parts shown in figure 11.1
above is explained as follows:
 The manufacturing facility uses three distinct subassemblies to produce several
different metal parts.
 Subassemblies corresponding to a particular part are produced in large batches on one
of two subassembly manufacturing lines.
 The conveyor then moves these to a loader where they are placed into empty
containers.
 Each container holds only one type of subassembly at a time.
 The containers are stored in a warehouse until all three of the part subassemblies are
available for assembly.
 Containers of the three subassemblies corresponding to a particular part are brought to
an unloader/ assembler, where they are unloaded and assembled into the final product.
 This final product is then sent to shipping.
 The resultant empty containers are temporarily stored in a finite capacity-
accumulating conveyor at the back of the assembler.
 They are then taken to the loaders, if needed; otherwise they are transported to the
warehouse.
 Forklift trucks move full and empty containers.
 The subassembly line, loaders, and the assembler are all subject to random
breakdowns.
 Automod or Arena are the suggested simulation languages that can be used for
simulating the manufacturing system.
■■■
APPENDIX
Table A.1 Random Digits and Random Numbers

11164 36318 75061 37674 26320 75100 10431 20418 19228 91792
21215 91791 76831 58678 87054 31687 93205 43685 19732 08468
10438 44482 66558 37649 08882 90870 12462 41810 01806 02977
36792 26236 33266 66583 60881 97395 20461 36742 02852 50564
73944 04773 12032 51414 82384 38370 00249 80709 72605 67497
49563 12872 14063 93104 78483 72717 68714 18048 25005 04151
64208 48237 41701 73117 33242 42314 83049 21933 92813 04763
51486 72875 38605 29341 80749 80151 33835 52602 79147 08868
99756 26360 64516 17971 48478 09610 04638 17141 09227 10606
71325 55217 13015 72907 00431 45117 33827 92873 02953 85474
65285 97198 12138 53010 95601 15838 16805 61004 43516 17020
17264 57327 38224 29301 31381 38109 34976 65692 98566 29550
95639 99754 31199 92558 68368 04985 51092 37780 40261 14479
61555 76404 86210 11808 12841 45147 97438 60022 12645 62000
78137 98768 04689 87130 79225 08153 84967 64539 79493 74917
62490 99215 84987 28759 19177 14733 24550 28067 68894 38490
24216 63444 21283 07044 92729 37284 13211 37485 10415 36457
16975 95428 33226 55903 31605 43817 22250 03918 46999 98501
59138 39542 71168 57609 91510 77904 74244 50940 31553 62562
29478 59652 50414 31966 87912 87514 12944 49862 96566 48825
To get a random number between 0 and 1, take a grouping of digits, for example 5 at a time,
and place a decimal point in front.
Example: If we take random digit 92729, then the random number will be represented as
.92729
Table A.2 Cumulative Normal Distribution
PROBABILITY VALUES ASSOCIATED WITH Z IN NORMAL DISTRIBUTIONS
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
.3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
.4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0 .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1 .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2 .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
3.3 .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997
3.4 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998
Table A.3 Student's t Test
(If calculated t is greater than value shown, reject the null hypothesis.)
SIGNIFICANCE LEVEL FOR ONE-DIRECTION TEST

df .10 .05 .025 .01 .005 .000
1 3.078 6.314 12.706 31.821 63.657 636.619
2 1.886 2.920 4.303 6.965 9.925 31.598
3 1.638 2.353 3.182 4.541 5.841 12.941
4 1.533 2.132 2.776 3.747 4.604 8.610
5 1.476 2.015 2.571 3.365 4.032 6.859
6 1.440 1.943 2.447 3.143 3.707 5.959
7 1.415 1.895 2.365 2.998 3.499 5.405
8 1.397 1.860 2.306 2.896 3.355 5.041
9 1.383 1.833 2.262 2.821 3.250 4.781
10 1.372 1.812 2.228 2.764 3.169 4.587
11 1.363 1.796 2.201 2.718 3.106 4.437
12 1.356 1.782 2.179 2.681 3.055 4.318
13 1.350 1.771 2.160 2.650 3.012 4.221
14 1.345 1.761 2.145 2.624 2.977 4.140
15 1.341 1.753 2.131 2.602 2.947 4.073
16 1.337 1.746 2.120 2.583 2.921 4.015
17 1.333 1.740 2.110 2.567 2.898 3.965
18 1.330 1.734 2.101 2.552 2.878 3.922
19 1.328 1.729 2.093 2.539 2.861 3.883
20 1.325 1.725 2.086 2.528 2.845 3.850
21 1.323 1.721 2.080 2.518 2.831 3.819
22 1.321 1.717 2.074 2.508 2.819 3.792
23 1.319 1.714 2.069 2.500 2.807 3.767
24 1.318 1.711 2.064 2.492 2.797 3.745
25 1.316 1.708 2.060 2.485 2.787 3.725
26 1.315 1.706 2.056 2.479 2.779 3.707
27 1.314 1.703 2.052 2.473 2.771 3.690
28 1.313 1.701 2.048 2.467 2.763 3.674
29 1.311 1.699 2.045 2.462 2.756 3.659
30 1.310 1.697 2.042 2.457 2.750 3.646
40 1.303 1.684 2.021 2.423 2.704 3.551
60 1.296 1.671 2.000 2.390 2.660 3.460
120 1.289 1.658 1.980 2.358 2.617 3.373
∞ 1.282 1.645 1.960 2.326 2.576 3.291
Table A.4 Chi-Square Probabilities

The areas given across the top are the areas to the right of the critical value. To look up an
area on the left, subtract it from one, and then look it up (ie: 0.05 on the left is 0.95 on the
right)
df 0.995 0.99 0.975 0.95 0.90 0.10 0.05 0.025 0.01 0.005
1 --- --- 0.001 0.004 0.016 2.706 3.841 5.024 6.635 7.879
2 0.010 0.020 0.051 0.103 0.211 4.605 5.991 7.378 9.210 10.59
0.072 0.115 0.216 0.352 0.584 6.251 7.815 9.348 11.34 7
12.83
3
0.207 0.297 0.484 0.711 1.064 7.779 9.488 11.14 5
13.27 8
14.86
4
0.412 0.554 0.831 1.145 1.610 9.236 11.07 3
12.83 7
15.08 0
16.75
5
0.676 0.872 1.237 1.635 2.204 10.64 0
12.59 3
14.44 6
16.81 0
18.54
6
0.989 1.239 1.690 2.167 2.833 5
12.01 2
14.06 9
16.01 2
18.47 8
20.27
7
1.344 1.646 2.180 2.733 3.490 7
13.36 7
15.50 3
17.53 5
20.09 8
21.95
8
1.735 2.088 2.700 3.325 4.168 2
14.68 7
16.91 5
19.02 0
21.66 5
23.58
9
2.156 2.558 3.247 3.940 4.865 4
15.98 9
18.30 3
20.48 6
23.20 9
25.18
10
2.603 3.053 3.816 4.575 5.578 7
17.27 7
19.67 3
21.92 9
24.72 8
26.75
11
3.074 3.571 4.404 5.226 6.304 5
18.54 5
21.02 0
23.33 5
26.21 7
28.30
12
3.565 4.107 5.009 5.892 7.042 9
19.81 6
22.36 7
24.73 7
27.68 0
29.81
13
4.075 4.660 5.629 6.571 7.790 2
21.06 2
23.68 6
26.11 8
29.14 9
31.31
14
4.601 5.229 6.262 7.261 8.547 4
22.30 5
24.99 9
27.48 1
30.57 9
32.80
15
5.142 5.812 6.908 7.962 9.312 7
23.54 6
26.29 8
28.84 8
32.00 1
34.26
16
5.697 6.408 7.564 8.672 10.08 2
24.76 6
27.58 5
30.19 0
33.40 7
35.71
17
6.265 7.015 8.231 9.390 5
10.86 9
25.98 7
28.86 1
31.52 9
34.80 8
37.15
18
6.844 7.633 8.907 10.11 5
11.65 9
27.20 9
30.14 6
32.85 5
36.19 6
38.58
19
7.434 8.260 9.591 7
10.85 1
12.44 4
28.41 4
31.41 2
34.17 1
37.56 2
39.99
20
8.034 8.897 10.28 1
11.59 3
13.24 2
29.61 0
32.67 0
35.47 6
38.93 7
41.40
21
8.643 9.542 3
10.98 1
12.33 0
14.04 5
30.81 1
33.92 9
36.78 2
40.28 1
42.79
22
9.260 10.19 2
11.68 8
13.09 1
14.84 3
32.00 4
35.17 1
38.07 9
41.63 6
44.18
23
9.886 6
10.85 9
12.40 1
13.84 8
15.65 7
33.19 2
36.41 6
39.36 8
42.98 1
45.55
24
10.52 6
11.52 1
13.12 8
14.61 9
16.47 6
34.38 5
37.65 4
40.64 0
44.31 9
46.92
25
0
11.16 4
12.19 0
13.84 1
15.37 3
17.29 2
35.56 2
38.88 6
41.92 4
45.64 8
48.29
26
0
11.80 8
12.87 4
14.57 9
16.15 2
18.11 3
36.74 5
40.11 3
43.19 2
46.96 0
49.64
27
8
12.46 9
13.56 3
15.30 1
16.92 4
18.93 1
37.91 3
41.33 5
44.46 3
48.27 5
50.99
28
1
13.12 5
14.25 8
16.04 8
17.70 9
19.76 6
39.08 7
42.55 1
45.72 8
49.58 3
52.33
29
1
13.78 6
14.95 7
16.79 8
18.49 8
20.59 7
40.25 7
43.77 2
46.97 8
50.89 6
53.67
30
7
20.70 3
22.16 1
24.43 3
26.50 9
29.05 6
51.80 3
55.75 9
59.34 2
63.69 2
66.76
40
7
27.99 4
29.70 3
32.35 9
34.76 1
37.68 5
63.16 8
67.50 2
71.42 1
76.15 6
79.49
50
1
35.53 7
37.48 7
40.48 4
43.18 9
46.45 7
74.39 5
79.08 0
83.29 4
88.37 0
91.95
60
4
43.27 5
45.44 2
48.75 8
51.73 9
55.32 7
85.52 2
90.53 8
95.02 9
100.4 2
104.2
70
5
51.17 2
53.54 8
57.15 9
60.39 9
64.27 7
96.57 1
101.8 3
106.6 25
112.3 15
116.3
80
2
59.19 0
61.75 3
65.64 1
69.12 8
73.29 8
107.5 79
113.1 29
118.1 29
124.1 21
128.2
90
6
67.32 4
70.06 7
74.22 6
77.92 1
82.35 65
118.4 45
124.3 36
129.5 16
135.8 99
140.1
100
0 8 5 2 9 8 98 42 61 07 69
Table A.5 Kolmogorov-Smirnov Test
(If calculated ratio is greater than value shown, then reject the null hypothesis at the chosen
level of confidence.)
SAMPLE SIZE LEVEL OF SIGNIFICANCE FOR D = MAXIMUM [ F0(X) - Sn(X) ]

(N) .20 .15 .10 .05 .01
1 .900 .925 .950 .975 .995
2 .684 .726 .776 .842 .929
3 .565 .597 .642 .708 .828
4 .494 .525 .564 .624 .733
5 .446 .474 .510 .565 .669
6 .410 .436 .470 .521 .618
7 .381 .405 .438 .486 .577
8 .358 .381 .411 .457 .543
9 .339 .360 .388 .432 .514
10 .322 .342 .368 .410 .490
11 .307 .326 .352 .391 .468
12 .295 .313 .338 .375 .450
13 .284 .302 .325 .361 .433
14 .274 .292 .314 .349 .418
15 .266 .283 .304 .338 .404
16 .258 .274 .295 .328 .392
17 .250 .266 .286 .318 .381
18 .244 .259 .278 .309 .371
19 .237 .252 .272 .301 .363
20 .231 .246 .264 .294 .356
25 .210 .220 .240 .270 .320
30 .190 .200 .220 .240 .290
35 .180 .190 .210 .230 .270
OVER 35
√ √ √ √ √
References:
1. Jerry Banks, John S. Carson II, Barry L. Nelson and David M. Nicol,
“Discrete-Event System Simulation”, 3rd Edition, © 2001, Pearson
Education.
2. Jerry Banks, John S. Carson II, Barry L. Nelson and David M. Nicol,
“Discrete-Event System Simulation”, 5th Edition, © 2010, Pearson
Education.
3. Averill M Law, “System Modeling & Analysis”, 4th Edition, TMH.
4. http://www.fao.org/docrep/009/a0238e/A0238E02.htm
5. http://individual.utoronto.ca/ranodya/6P1.html
6. www.bcnn.net
7. http://chemwiki.ucdavis.edu/Reference/Reference_Tables/Analytic_Refer
ences/Appendix_14%3A_Random_Number_Table
8. http://www.sjsu.edu/faculty/gerstman/EpiInfo/z-table.htm
9. http://dlc.erieri.com/index.cfm?fuseaction=textbook.appendix

CSM Book 2018 PDF

Transféré par

Informations du document

Titre original

Copyright

Formats disponibles

Partager ce document

Partager ou intégrer le document

Options de partage

Avez-vous trouvé ce document utile ?

Ce contenu est-il inapproprié ?

Droits d'auteur :

Formats disponibles

CSM Book 2018 PDF

Transféré par

Droits d'auteur :

Formats disponibles

Academic Year 2017-18

Information Technology Branch

Suggested Practical List (If Any):

Weightage of marks should be proportional to number of hours assigned to each module.

CHAPTER 1 INTRODUCTION TO SIMULATION

CHAPTER 2 SIMULATION EXAMPLES

CHAPTER 3 GENERAL PRINCIPLES

1.1 Introduction [M-05,06,07,08,09, S-15, N-04, D-09]

1.2 When Simulation is an Appropriate Tool [M-06,08,09,12,N-04,D-05,10 ]

1.3 When Simulation is Not Appropriate [M-08,09]

8. when the system behavior is too complex or unknown

1.4 Advantages of Simulation [M-10,11]

1.5 Disadvantages of Simulation [M-10,11]

The disadvantages can be offset as follows

1.6 Areas of Application [M-05,06,13, D-05,06]

ii. To renew underground urban infrastructure.

5. Logistics, Transportation, and Distribution Application

6. Business Process Simulation

1.7 System and System Environment [M-12]

1.8 Components of a System

Table 1.1 Examples of Systems and Components

1.9 Types of Systems

continuously over time. Example: Head of water behind the dam.

Figure 1.1 Discrete and Continuous Systems

1.10 Model of a System [N-04, M-05,07,12, D-09,15,17]

1.11 Types of Models [M-05,13, N-04,D-05,06,12,13,15]

1. Mathematical Model: A mathematical model uses symbolic notation and

1.12 Steps in a Simulation Study

2. Setting of objectives and overall project plan

model. Is the computer program performing properly?

Figure 1.2 Steps in Simulation Study

9. Production runs and analysis

10. More runs?

11. Documenting and reporting

2.1 The Simulation Process

2.2 Simulation of Queueing Systems

Figure 2.1 Queueing System

Characteristics of Queueing Systems

2.2.1 The calling population

1.2.2 System capacity

2.2.3. The arrival process

2.2.4 Queue behavior and queue discipline

2.2.5 Service times and the service mechanism

 A queuing system consists of a number of service centers and interconnecting queues.

Figure 2.2 Discount warehouse with three service centers

Figure 2.3 Service center 2, with c=3 parallel servers

Single Channel Queuing Systems

Figure 2.4 Service-just-completed flow diagram

Figure 2.5 Unit-entering-system flow diagram

Steps to perform Manual Simulation of Single Channel Queue

Example 2.1 (Single Channel Queue Simulation Problem) [S-15, D-16,17]

Example 2.2 [M-10,17, D-07,08,14]

Able-Baker Carhop Problem (Multi Server Simulation Problem) [M-12]

Time Between Arrivals (min) 2 3 4 5 6

Table 2.4: Inter-arrival Time Distribution

Table 2.5: Able's Service Time Distribution

Table 2.6: Baker's Service Time Distribution

Table 2.7 Simulation Table for Carhop Problem

2.3 Simulation of Inventory Systems [M-06,08,13,14,16 D-06,07,09,11,12,13,17]

Figure 2.8 Probabilistic Order-Level Inventory System

Table 2.13: Random Digit Assignment for Lead Time

Table 2.14: Simulation Table for (M, N) Inventory System

Example 2.5 [M-07,09,15,17]