Vous êtes sur la page 1sur 448

Chapter 1 Introduction to

Simulation
Banks, Carson, Nelson & Nicol
Discrete-Event System Simulation

Outline

When Simulation Is the Appropriate Tool


When Simulation Is Not Appropriate
Advantages and Disadvantages of Simulation
Areas of Application
Systems and System Environment
Components of a System
Discrete and Continuous Systems
Model of a System
Types of Models
Discrete-Event System Simulation
Steps in a Simulation Study
2

Definition

A simulation is the imitation of the operation of real-world


process or system over time.

Generation of artificial history and observation of that


observation history

A model construct a conceptual framework that


describes a system
The behavior of a system that evolves over time is
studied by developing a simulation model.
The model takes a set of expressed assumptions:
Mathematical, logical
Symbolic relationship between the entities

Goal of modeling and simulation

A model can be used to investigate a wide verity of what


if questions about real-world system.
Potential changes to the system can be simulated and predicate
their impact on the system.
Find adequate parameters before implementation

So simulation can be used as


Analysis tool for predicating the effect of changes
Design tool to predicate the performance of new system

It is better to do simulation before Implementation.

How a model can be developed?

Mathematical Methods
Probability

theory, algebraic method ,


Their results are accurate
They have a few Number of parameters
It is impossible for complex systems

Numerical computer-based simulation


It

is simple
It is useful for complex system

When Simulation Is the Appropriate Tool

Simulation enable the study of internal interaction of a subsystem


with complex system
Informational, organizational and environmental changes can be
simulated and find their effects
A simulation model help us to gain knowledge about improvement of
system
Finding important input parameters with changing simulation inputs
Simulation can be used with new design and policies before
implementation
Simulating different capabilities for a machine can help determine
the requirement
Simulation models designed for training make learning possible
without the cost disruption
A plan can be visualized with animated simulation
The modern system (factory, wafer fabrication plant, service
organization) is too complex that its internal interaction can be
treated only by simulation
6

When Simulation Is Not Appropriate

When the problem can be solved by common


sense.
When the problem can be solved analytically.
If it is easier to perform direct experiments.
If cost exceed savings.
If resource or time are not available.
If system behavior is too complex.
Like

human behavior

Advantages and disadvantages of simulation

In contrast to optimization models, simulation


models are run rather than solved.
Given

as a set of inputs and model characteristics the


model is run and the simulated behavior is observed

Advantages of simulation

New policies, operating procedures, information flows and son on


can be explored without disrupting ongoing operation of the real
system.
New hardware designs, physical layouts, transportation systems
and can be tested without committing resources for their
acquisition.
Time can be compressed or expanded to allow for a speed-up or
slow-down of the phenomenon( clock is self-control).
Insight can be obtained about interaction of variables and important
variables to the performance.
Bottleneck analysis can be performed to discover where work in
process, the system is delayed.
A simulation study can help in understanding how the system
operates.
What if questions can be answered.

Disadvantages of simulation

Model building requires special training.


Vendors

of simulation software have been actively


developing packages that contain models that only
need input (templates).

Simulation results can be difficult to interpret.


Simulation modeling and analysis can be time
consuming and expensive.
Many

simulation software have output-analysis.

10

Areas of application

Manufacturing Applications
Semiconductor Manufacturing
Construction Engineering and project management
Military application
Logistics, Supply chain and distribution application
Transportation modes and Traffic
Business Process Simulation
Health Care
Automated Material Handling System (AMHS)

Risk analysis

Insurance, portfolio,...

Computer Simulation

Test beds for functional testing of control-system software

CPU, Memory,

Network simulation

Internet backbone, LAN (Switch/Router), Wireless, PSTN (call center),...

11

Systems and System Environment

A system is defined as a groups of objects that


are joined together in some regular interaction
toward the accomplishment of some purpose.
An

automobile factory: Machines, components parts


and workers operate jointly along assembly line

A system is often affected by changes occurring


outside the system: system environment.
Factory

: Arrival orders

Effect of supply on demand : relationship between factory


output and arrival (activity of system)

Banks

: arrival of customers

12

Components of system

Entity

Attribute

A instantaneous occurrence that might change the state of the system:


breakdown

Endogenous

A collection of variables that describe the system in any time : status of machine
(busy, idle, down,)

Event

A time period of specified length :welding, stamping

State

The property of an entity : speed, capacity

Activity

An object of interest in the system : Machines in factory

Activities and events occurring with the system

Exogenous

Activities and events occurring with the environment

13

Discrete and Continues Systems

A discrete system is one in which the state variables


change only at a discrete set of points in time : Bank
example

14

Discrete and Continues Systems (cont.)

A continues system is one in which the state variables


change continuously over time: Head of water behind the
dam

15

Model of a System

To study the system

it is sometimes possible to experiments with system

This is not always possible (bank, factory,)


A new system may not yet exist

Model: construct a conceptual framework that


describes a system
It

is necessary to consider those accepts of systems


that affect the problem under investigation
(unnecessary details must remove)

16

Types of Models

17

Characterizing a Simulation Model

Deterministic or Stochastic
Does

the model contain stochastic components?


Randomness is easy to add to a DES

Static or Dynamic
Is

time a significant variable?

Continuous or Discrete
Does

the system state evolve continuously or only at


discrete points in time?
Continuous: classical mechanics
Discrete: queuing, inventory, machine shop models

18

Discrete-Event Simulation Model

Stochastic: some state variables are random


Dynamic: time evolution is important
Discrete-Event: significant changes occur at
discrete time instances

19

Model Taxonomy

20

DES Model Development


How to develop a model:
1)
2)
3)
4)
5)
6)

Determine the goals and objectives


Build a conceptual model
Convert into a specification model
Convert into a computational model
Verify
Validate

Typically an iterative process

21

Three Model Levels

Conceptual
Very high level
How comprehensive should the model be?
What are the state variables, which are dynamic, and which are
important?

Specification
On paper
May involve equations, pseudocode, etc.
How will the model receive input?

Computational
A computer program
General-purpose PL or simulation language?

22

Verification vs. Validation

Verification
Computational

model should be consistent with


specification model
Did we build the model right?

Validation
Computational

model should be consistent with the


system being analyzed
Did we build the right model?
Can an expert distinguish simulation output from
system output?

Interactive graphics can prove valuable


23

Steps in Simulation
Study

24

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 9
Verification and Validation of Simulation Models

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose & Overview


The goal of the validation process is:
To produce a model that represents true
behavior closely enough for decision-making
purposes
To increase the models credibility to an
acceptable level

Validation is an integral part of model


development:
Verification: building the model correctly,
correctly implemented with good input and
structure
Validation: building the correct model, an
accurate representation of the real system

Most methods are informal subjective


comparisons while a few are formal
statistical procedures
Chapter 9. Verification and Validation of Simulation Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Modeling-Building, Verification & Validation


Steps in Model-Building
Observing the real system
and the interactions among
their various components
and of collecting data on
their behavior.
Construction of a
conceptual model
Implementation of an
operational model

Chapter 9. Verification and Validation of Simulation Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Verification
Purpose: ensure the conceptual model is reflected accurately in the
computerized representation.
Many common-sense suggestions, for example:
Have someone else check the model.
Make a flow diagram that includes each logically possible action a
system can take when an event occurs.
Closely examine the model output for reasonableness under a variety of
input parameter settings.
Print the input parameters at the end of the simulation, make sure they
have not been changed inadvertently.
Make the operational model as self-documenting as possible.
If the operational model is animated, verify that what is seen in the
animation imitates the actual system.
Use the debugger.
If possible use a graphical representation of the model.
Chapter 9. Verification and Validation of Simulation Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Examination of Model Output for Reasonableness


Two statistics that give a quick indication of model
reasonableness are current contents and total counts
Current content: The number of items in each component of the
system at a given time.
Total counts: Total number of items that have entered each
component of the system by a given time.

Compute certain long-run measures of performance, e.g.


compute the long-run server utilization and compare to
simulation results.

Chapter 9. Verification and Validation of Simulation Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Examination of Model Output for Reasonableness

A model of a complex network of queues consisting of


many service centers.
If the current content grows in a more or less linear fashion as
the simulation run time increases, it is likely that a queue is
unstable
If the total count for some subsystem is zero, indicates no items
entered that subsystem, a highly suspect occurrence
If the total and current count are equal to one, can indicate that
an entity has captured a resource but never freed that resource.

Chapter 9. Verification and Validation of Simulation Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Other Important Tools


Documentation
A means of clarifying the logic of a model and verifying its
completeness.
Comment the operational model, definition of all variables and
parameters.

Use of a trace
A detailed printout of the state of the simulation model over time.
- Can be very labor intensive if the programming language does not
support statistic collection.
- Labor can be reduced by a centralized tracing mechanism
- In object-oriented simulation framework, trace support can be
integrated into class hierarchy. New classes need only to add little for
the trace support.

Chapter 9. Verification and Validation of Simulation Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Trace - Example
Simple queue from Chapter 2
Trace over a time interval [0, 16]
Allows the test of the results by pen-and-paper method
Definition of Variables:
CLOCK = Simulation clock
EVTYP = Event type (Start, Arrival, Departure, Stop)
NCUST = Number of customers in system at time CLOCK
STATUS = Status of server (1=busy, 0=idle)
State
CLOCK
CLOCK
CLOCK
CLOCK
CLOCK
CLOCK
...

of System Just After the Named Event Occurs:


= 0 EVTYP = Start
NCUST=0 STATUS = 0
= 3 EVTYP = Arrival NCUST=1 STATUS = 0
= 5 EVTYP = Depart
NCUST=0 STATUS = 0
= 11 EVTYP = Arrival NCUST=1 STATUS = 0
= 12 EVTYP = Arrival NCUST=2 STATUS = 1
= 16 EVTYP = Depart
NCUST=1 STATUS = 1

Chapter 9. Verification and Validation of Simulation Models

There is a customer,
but the status is 0

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Calibration and Validation

Validation: the overall process of comparing the model and its behavior to the
real system.
Calibration: the iterative process of comparing the model to the real system
and making adjustments.
Comparison of the model to
real system
Subjective tests
- People who are
knowledgeable about
the system
Objective tests
- Requires data on the
real systems behavior
and the output of the
model

Chapter 9. Verification and Validation of Simulation Models

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Calibration and Validation

Danger during the calibration phase

No model is ever a perfect representation of the system

Typically few data sets are available, in the worst case only one, and the
model is only validated for these.
Solution: If possible collect new data sets

The modeler must weigh the possible, but not guaranteed, increase in
model accuracy versus the cost of increased validation effort.

Three-step approach for validation:


1. Build a model that has high face validity.
2. Validate model assumptions.
3. Compare the model input-output transformations with the real systems
data.

Chapter 9. Verification and Validation of Simulation Models

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

High Face Validity


Ensure a high degree of realism:
Potential users should be involved in model construction from its
conceptualization to its implementation.

Sensitivity analysis can also be used to check a models face


validity.
Example: In most queueing systems, if the arrival rate of
customers were to increase, it would be expected that server
utilization, queue length and delays would tend to increase.
For large-scale simulation models, there are many input
variables and thus possibly many sensitity tests.
- Sometimes not possible to perform all of theses tests, select the
most critical ones.

Chapter 9. Verification and Validation of Simulation Models

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Validate Model Assumptions


General classes of model assumptions:
Structural assumptions: how the system operates.
Data assumptions: reliability of data and its statistical analysis.

Bank example: customer queueing and service facility in a bank.


Structural assumptions
- Customer waiting in one line versus many lines
- Customers are served according FCFS versus priority

Data assumptions, e.g., interarrival time of customers, service times for


commercial accounts.
- Verify data reliability with bank managers
- Test correlation and goodness of fit for data

Chapter 9. Verification and Validation of Simulation Models

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Validate Input-Output Transformation


Goal: Validate the models ability to predict future behavior
The only objective test of the model.
The structure of the model should be accurate enough to make good
predictions for the range of input data sets of interest.

One possible approach: use historical data that have been reserved
for validation purposes only.
Criteria: use the main responses of interest.
Input

System

Output

Model is viewed as an
input-output
transformation

Input

Chapter 9. Verification and Validation of Simulation Models

Model
14

Output

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Bank Example
Example: One drive-in window serviced by one teller, only one or two
transactions are allowed.
Data collection: 90 customers during 11 am to 1 pm.
- Observed service times {Si, i = 1,2, , 90}.
- Observed interarrival times {Ai, i = 1,2, , 90}.

Data analysis let to the conclusion that:

- Interarrival times: exponentially distributed with rate = 45


- Service times: N(1.1, 0.22)

Chapter 9. Verification and Validation of Simulation Models

15

Input variables

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Bank Example The Black Box


A model was developed in close consultation with bank management and
employees
Model assumptions were validated
Resulting model is now viewed as a black box:
Model Output Variables, Y
Input Variables

Uncontrolled
variables, X

Controlled
Decision
variables, D

Possion arrivals
= 45/hr: X11, X12,
Services times,
N(D2, 0.22): X21, X22,

Model
black box
f(X,D) = Y

D1 = 1 (one teller)
D2 = 1.1 min
(mean service time)
D3 = 1 (one line)

Chapter 9. Verification and Validation of Simulation Models

16

Primary interest:
Y1 = tellers utilization
Y2 = average delay
Y3 = maximum line length
Secondary interest:
Y4 = observed arrival rate
Y5 = average service time
Y6 = sample std. dev. of
service times
Y7 = average length of time
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Comparison with Real System Data


Real system data are necessary for validation.
System responses should have been collected during the same time
period (from 11am to 1pm on the same day.)

Compare the average delay from the model Y2 with the actual delay
Z2:
Average delay observed, Z2 = 4.3 minutes, consider this to be the true
mean value 0 = 4.3.
When the model is run with generated random variates X1n and X2n, Y2
should be close to Z2.

Chapter 9. Verification and Validation of Simulation Models

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Comparison with Real System Data


Six statistically independent replications of the model, each of
2-hour duration, are run.

Replication

Y4
Arrivals/Hour

Y5
Service Time [Minutes]

Y2
Average Delay [Minutes]

51

1.07

2.79

40

1.12

1.12

45.5

1.06

2.24

50.5

1.10

3.45

53

1.09

3.13

49

1.07

2.38

Sample mean

2.51

Standard deviation

0.82

Chapter 9. Verification and Validation of Simulation Models

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Hypothesis Testing
Compare the average delay from the model Y2 with the actual delay Z2
Null hypothesis testing: evaluate whether the simulation and the real
system are the same (w.r.t. output measures):

H 0: E(Y2 ) = 4.3 minutes


H1: E(Y2 ) 4.3 minutes
- If H0 is not rejected, then, there is no reason to consider the model
invalid
- If H0 is rejected, the current version of the model is rejected, and the
modeler needs to improve the model

Chapter 9. Verification and Validation of Simulation Models

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Hypothesis Testing
Conduct the t test:
Chose level of significance ( = 0.5) and sample size (n = 6).
Compute the same mean and sample standard deviation over
the n replications:
n

1 n
Y2 = Y2i = 2.51 minutes
n i =1

S=

(Y
i =1

2i

Y2 ) 2

n 1

= 0.82 minutes

Compute test statistics:


t0 =

Y2 0
2.51 4.3
=
= 5.34 > tcritical = 2.571 (for a 2 - sided test)
S/ n
0.82 / 6

Hence, reject H0.

- Conclude that the model is inadequate.

Check: the assumptions justifying a t test, that the observations


(Y2i) are normally and independently distributed.
Chapter 9. Verification and Validation of Simulation Models

20

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Hypothesis Testing
Similarly, compare the model output with the observed output for
other measures:
Y4 Z4, Y5 Z5, and Y6 Z6

Chapter 9. Verification and Validation of Simulation Models

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Type II Error
For validation, the power of the test is:
Probability[ detecting an invalid model ] = 1
= P(Type II error) = P(failing to reject H0 | H1 is true)
Consider failure to reject H0 as a strong conclusion, the modeler would
want to be small.
Value of depends on:
- Sample size, n
- The true difference, , between E(Y) and :

E (Y )

In general, the best approach to control error is:


Specify the critical difference, .
Choose a sample size, n, by making use of the operating characteristics
curve (OC curve).
Chapter 9. Verification and Validation of Simulation Models

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Type II Error
Operating characteristics curve
(OC curve).
Graphs of the probability of a
Type II Error () versus for a
given sample size n

For the same error


probability with smaller
difference the required
sample size increses!

Chapter 9. Verification and Validation of Simulation Models

23

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Type I and II Error


Type I error ():
Error of rejecting a valid model.
Controlled by specifying a small level of significance .

Type II error ():


Error of accepting a model as valid when it is invalid.
Controlled by specifying critical difference and find the n.

For a fixed sample size n, increasing will decrease .

Statistical Terminology

Modeling Terminology

Associated Risk

Type I: rejecting H0 when H0 is true

Rejecting a valid model

Type II: failure to reject H0 when H1 is true

Failure to reject an invalid model

Chapter 9. Verification and Validation of Simulation Models

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence Interval Testing


Confidence interval testing: evaluate whether the simulation and the
real system performance measures are close enough.
If Y is the simulation output, and = E(Y)
The confidence interval (CI) for is:

Y t ,n1
2

Chapter 9. Verification and Validation of Simulation Models

25

S
n

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence Interval Testing


Validating the model:

is a difference value chosen

Suppose the CI does not contain 0:


- If the best-case error is > , model
needs to be refined.
- If the worst-case error is , accept
the model.
- If best-case error is , additional
replications are necessary.

by the analyst, that is small


enough to allow valid
decisions to be based on
simulations!

Suppose the CI contains 0:

- If either the best-case or worst-case


error is > , additional replications
are necessary.
- If the worst-case error is , accept
the model.

Chapter 9. Verification and Validation of Simulation Models

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence Interval Testing


Bank example: 0 = 4.3, and close enough is = 1 minute of
expected customer delay.
A 95% confidence interval, based on the 6 replications is
[1.65, 3.37] because:

S
n
0.82
2.51 2.571
6
Y t0.025,5

0 = 4.3 falls outside the confidence interval,

- the best case |3.37 4.3| = 0.93 < 1, but


- the worst case |1.65 4.3| = 2.65 > 1
Additional replications are needed to reach a decision.

Chapter 9. Verification and Validation of Simulation Models

27

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Using Historical Output Data


An alternative to generating input data:
Use the actual historical record.
Drive the simulation model with the historical record and then compare
model output to system data.
In the bank example, use the recorded interarrival and service times for
the customers {An, Sn, n = 1,2,}.

Procedure and validation process: similar to the approach used for


system generated input data.

Chapter 9. Verification and Validation of Simulation Models

28

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Using a Turing Test


Use in addition to statistical test, or when no statistical test is readily
applicable.
Turing Test
Described by Alan Turing in 1950. A human jugde is involved in a natural language
conversation with a human and a machine. If the judge cannot reliably tell which of the
partners is the machine, then the machine has passed the test.

Utilize persons knowledge about the system.


For example:
Present 10 system performance reports to a manager of the system.
Five of them are from the real system and the rest are fake reports
based on simulation output data.
If the person identifies a substantial number of the fake reports, interview
the person to get information for model improvement.
If the person cannot distinguish between fake and real reports with
consistency, conclude that the test gives no evidence of model
inadequacy.
Chapter 9. Verification and Validation of Simulation Models

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary
Model validation is essential:
Model verification
Calibration and validation
Conceptual validation

Best to compare system data to model data, and make comparison


using a wide variety of techniques.
Some techniques that we covered:

Insure high face validity by consulting knowledgeable persons.


Conduct simple statistical tests on assumed distributional forms.
Conduct a Turing test.
Compare model output to system output by statistical tests.

Chapter 9. Verification and Validation of Simulation Models

30

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 10
Output Analysis for a Single Model

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose
Objective: Estimate system performance via simulation
If is the system performance, the precision of the estimator can
be measured by:
The standard error of .
The width of a confidence interval (CI) for .

Purpose of statistical analysis:

To estimate the standard error or CI .


To figure out the number of observations required to achieve desired
error or CI.

Potential issues to overcome:

Autocorrelation, e.g. inventory cost for subsequent weeks lack statistical


independence.
Initial conditions, e.g. inventory on hand and number of backorders at
time 0 would most likely influence the performance of week 1.

Chapter 10. Output Analysis for a Single Model

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Outline
Distinguish the two types of simulation:
transient vs.
steady state

Illustrate the inherent variability in a stochastic discrete-event


simulation.
Cover the statistical estimation of performance measures.
Discusses the analysis of transient simulations.
Discusses the analysis of steady-state simulations.

Chapter 10. Output Analysis for a Single Model

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Type of Simulations
Terminating versus non-terminating simulations
Terminating simulation:
Runs for some duration of time TE, where E is a specified event that
stops the simulation.
Starts at time 0 under well-specified initial conditions.
Ends at the stopping time TE.
Bank example: Opens at 8:30 am (time 0) with no customers present
and 8 of the 11 teller working (initial conditions), and closes at 4:30 pm
(Time TE = 480 minutes).
The simulation analyst chooses to consider it a terminating system
because the object of interest is one days operation.

Chapter 10. Output Analysis for a Single Model

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Type of Simulations
Non-terminating simulation:
Runs continuously, or at least over a very long period of time.
Examples: assembly lines that shut down infrequently, hospital
emergency rooms, telephone systems, network of routers, Internet.
Initial conditions defined by the analyst.
Runs for some analyst-specified period of time TE.
Study the steady-state (long-run) properties of the system, properties
that are not influenced by the initial conditions of the model.

Whether a simulation is considered to be terminating or nonterminating depends on both


The objectives of the simulation study and
The nature of the system

Chapter 10. Output Analysis for a Single Model

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Stochastic Nature of Output Data


Model output consist of one or more random variables because the
model is an input-output transformation and the input variables are
random variables.
M/G/1 queueing example:
Poisson arrival rate = 0.1 per minute;
service time ~ N( = 9.5, =1.75).
System performance: long-run mean queue length, LQ(t).
Suppose we run a single simulation for a total of 5000 minutes
- Divide the time interval [0, 5000) into 5 equal subintervals of 1000 minutes.
- Average number of customers in queue from time (j-1)1000 to j(1000) is Yj .

Waiting line
Chapter 10. Output Analysis for a Single Model

2
2
=
LQ =
( ) 1

Server
7

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Stochastic Nature of Output Data


M/G/1 queueing example (cont.):
Batched average queue length for 3 independent replications:
Batching Interval
(minutes)
[0, 1000)
[1000, 2000)
[2000, 3000)
[3000, 4000)
[4000, 5000)
[0, 5000)

1, Y1j
3.61
3.21
2.18
6.92
2.82
3.75

Batch, j
1
2
3
4
5

Replication
2, Y2j
2.91
9.00
16.15
24.53
25.19
15.56

3, Y3j
7.67
19.53
20.36
8.11
12.62
13.66

Inherent variability in stochastic simulation both within a single replication


and across different replications.
The average across 3 replications, Y1. , Y2. , Y3. , can be regarded as
independent observations, but averages within a replication, Y11, , Y15,
are not.

Chapter 10. Output Analysis for a Single Model

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Measures of performance
Consider the estimation of a performance parameter, (or ), of a
simulated system.
Discrete time data: [Y1, Y2, , Yn], with ordinary mean:
Continuous-time data: {Y(t), 0 t TE} with time-weighted mean:
Point estimation for discrete time data.
The point estimator:
n
1
= Yi
n i =1

- Is unbiased if its expected value is , that is if:

E () =

Desired

- Is biased if: E () and E () is called bias of

Chapter 10. Output Analysis for a Single Model

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Point Estimator
Point estimation for continuous-time data.
The point estimator:

=
TE

TE

Y (t )dt

- Is biased in general where: E ()


.
- An unbiased or low-bias estimator is desired.

Usually, system performance measures can be put into the common


framework of or :
The proportion of days on which sales are lost through an out-of-stock
situation, let:

1, if out of stock on day i


Y (i) =
0, otherwise
Chapter 10. Output Analysis for a Single Model

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Point Estimator
Performance measure that does not fit:
quantile or percentile: Pr{Y } = p
Estimating quantiles: the inverse of the problem of estimating a proportion
or probability.
Consider a histogram of the observed values Y:
- Find such that 100p% of the histogram is to the left of (smaller than) .
A widely used used performance meausure is the median, which is the
0.5 quantile or 50-th percentile.

Chapter 10. Output Analysis for a Single Model

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence-Interval Estimation
To understand confidence intervals fully, it is important to distinguish
between measures of error, and measures of risk, e.g., confidence
interval versus prediction interval.
Suppose the model is the normal distribution with mean , variance 2
(both unknown).
Let Yi. be the average cycle time for parts produced on the i-th replication
of the simulation (its mathematical expectation is ).
Average cycle time will vary from day to day, but over the long-run the
average of the averages will be close to .
Sample variance across R replications:

1 R
S =
(Yi. Y.. ) 2

R 1 i =1
2

Chapter 10. Output Analysis for a Single Model

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence-Interval Estimation
Confidence Interval (CI):
A measure of error.
Where Yi are normally distributed.

Y.. t , R 1
2

Quantile of the t
distribution with R-1
degrees of freedom.

S
R

We cannot know for certain how far Y.. is from but CI attempts to bound
that error.
A CI, such as 95%, tells us how much we can trust the interval to actually
bound the error between Y.. and .
The more replications we make, the less error there is in Y.. (converging
to 0 as R goes to infinity).

Chapter 10. Output Analysis for a Single Model

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence-Interval Estimation
Prediction Interval (PI):
A measure of risk.
A good guess for the average cycle time on a particular day is our
estimator but it is unlikely to be exactly right.
PI is designed to be wide enough to contain the actual average cycle time
on any particular day with high probability.
Normal-theory prediction interval:

1
Y.. t / 2, R 1S 1 +
R
The length of PI will not go to 0 as R increases because we can never
simulate away risk.
PIs limit is: z / 2

Chapter 10. Output Analysis for a Single Model

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Output Analysis for Terminating Simulations


A terminating simulation: runs over a simulated time interval [0, TE].
A common goal is to estimate:
1 n
= E Yi ,
for discrete output
n i =1
1 TE

= E Y (t )dt , for continuous output Y (t ),0 t TE


TE 0

In general, independent replications are used, each run using a


different random number stream and independently chosen initial
conditions.

Chapter 10. Output Analysis for a Single Model

15

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Statistical Background
Important to distinguish within-replication data from across-replication
data.
For example, simulation of a manufacturing system
Two performance measures of that system: cycle time for parts and work
in process (WIP).
Let Yij be the cycle time for the j-th part produced in the i-th replication.
Across-replication data are formed by summarizing within-replication
data Yi. .
Within-Replication Data

Across-Replication
Data

Y11

Y12

Y1n 1

Y1 , S 12 , H 1

Y 21

Y 22

Y 2n 2

Y 2 , S 22 , H 2

Y R1

YR 2

Y Rn R

Y R , S 2R , H R

Chapter 10. Output Analysis for a Single Model

16

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Statistical Background
Across Replication:
For example: the daily cycle time averages (discrete time data)
-

1 R
The average: Y.. = Yi.
R i =1
1 R
2
The sample variance: S =
(Yi. Y.. ) 2

R 1 i =1

- The confidence-interval half-width:

H = t / 2, R 1

S
R

Within replication:
For example: the WIP (a continuous time data)

1 TEi
Yi (t )dt

0
T Ei
1 TEi
2
2
(Yi (t ) Yi. ) dt
The sample variance: S i =

0
T Ei

- The average:
-

Chapter 10. Output Analysis for a Single Model

Yi. =

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Statistical Background
Overall sample average, Y.., and the interval replication sample
averages, Yi., are always unbiased estimators of the expected daily
average cycle time or daily average WIP.
Across-replication data are independent (different random numbers)
and identically distributed (same model), but within-replication data
do not have these properties.

Chapter 10. Output Analysis for a Single Model

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence Intervals with Specified Precision


The half-length H of a100(1 )% confidence interval for a mean ,
based on the t distribution, is given by:

H = t / 2, R 1

S
R
S2 is the sample
variance

R is the # of
replications

Suppose that an error criterion is specified with probability 1- , a


sufficiently large sample size should satisfy:

P Y.. < 1

Chapter 10. Output Analysis for a Single Model

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence Intervals with Specified Precision


Assume that an initial sample of size R0 (independent) replications has
been observed.
Obtain an initial estimate S02 of the population variance 2.
Then, choose sample size R such that R R0:
Since t/2, R-1 z/2, an initial estimate of R:
2

z S
R / 2 0 , z / 2 is the standard normal distribution.

2

t / 2, R 1S 0

R is the smallest integer satisfying R R0 and R

Collect R - R0 additional observations.


The 100(1- )% CI for :

Y.. t / 2, R 1
Chapter 10. Output Analysis for a Single Model

20

S
R
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Confidence Intervals with Specified Precision

Call Center Example: estimate the agents utilization over the first 2 hours of
the workday.

Initial sample of size R0 = 4 is taken and an initial estimate of the population variance
is S02 = (0.072)2 = 0.00518.
The error criterion is = 0.04 and confidence coefficient is 1- = 0.95, hence, the final
sample size must be at least:
2

2
z0.025 S0 1.96 * 0.00518
= 12.14

=
0.042

For the final sample size:


R

t 0.025, R-1

(t / 2, R1S 0 / )

13

14

15

2.18

2.16

2.14

15.39

15.1

14.83

R = 15 is the smallest integer satisfying the error criterion, so R - R0 = 11 additional


replications are needed.
After obtaining additional outputs, half-width should be checked.

Chapter 10. Output Analysis for a Single Model

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantiles
Here, a proportion or probability is treated as a special case of a
mean.
When the number of independent replications Y1, ,YR is large
enough that t/2,n-1 = z/2, the confidence interval for a probability p is
often written as:

p (1 p )
R 1

p z / 2
The sample proportion

A quantile is the inverse of the probability to the probability


estimation problem:
p is given

Find such that Pr(Y ) = p

Chapter 10. Output Analysis for a Single Model

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantiles
The best way is to sort the outputs and use the (R*p)-th smallest value,
i.e., find such that100p% of the data in a histogram of Y is to the left of
.
Example: If we have R=10 replications and we want the p = 0.8 quantile,
first sort, then estimate by the (10)(0.8) = 8-th smallest value (round if
necessary).
5.6 sorted data
7.1
8.8
8.9
9.5
9.7
10.1
12.2 this is our point estimate
12.5
12.9

Chapter 10. Output Analysis for a Single Model

23

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantiles
Confidence Interval of Quantiles: An approximate (1-)100%
confidence interval for can be obtained by finding two values l
and u.
l cuts off 100pl% of the histogram (the Rpl smallest value of the sorted

data).
u cuts off 100pu% of the histogram (the Rpu smallest value of the sorted
data).

Chapter 10. Output Analysis for a Single Model

where pl = p z / 2

p (1 p )
R 1

pu = p + z / 2

p (1 p )
R 1

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantiles
Example: Suppose R = 1000 reps, to estimate the p = 0.8 quantile
with a 95% confidence interval.
First, sort the data from smallest to largest.
Then estimate of by the (1000)(0.8) = 800-th smallest value, and the
point estimate is 212.03.
A portion of the 1000
And find the confidence interval:

pl = 0.8 1.96

.8(1 .8)
= 0.78
1000 1

.8(1 .8)
= 0.82
1000 1
The c.i. is the 780 th and 820 th smallest values
pu = 0.8 + 1.96

The point estimate is


The 95% c.i. is [188.96, 256.79]
Chapter 10. Output Analysis for a Single Model

25

sorted values:
Output
Rank
180.92
779
188.96
780
190.55
781
208.58
799
212.03
800
216.99
801
250.32
819
256.79
820
256.99
821

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Output Analysis for Steady-State Simulation


Consider a single run of a simulation model to estimate a steadystate or long-run characteristics of the system.

The single run produces observations Y1, Y2, ... (generally the samples of
an autocorrelated time series).
Performance measure:
1 n
= lim Yi ,
n n i =1

for discrete measure

1
= lim
TE TE

for continuous measure

TE

Y (t )dt ,

(with probability 1)

(with probability 1)

- Independent of the initial conditions.

Chapter 10. Output Analysis for a Single Model

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Output Analysis for Steady-State Simulation


The sample size is a design choice, with several considerations in
mind:

Any bias in the point estimator that is due to artificial or arbitrary initial
conditions (bias can be severe if run length is too short).
Desired precision of the point estimator.
Budget constraints on computer resources.

Notation: the estimation of from a discrete-time output process.

One replication (or run), the output data: Y1, Y2, Y3,
With several replications, the output data for replication r: Yr1, Yr2, Yr3,

Chapter 10. Output Analysis for a Single Model

27

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Initialization Bias
Methods to reduce the point-estimator bias caused by using artificial
and unrealistic initial conditions:
Intelligent initialization.
Divide simulation into an initialization phase and data-collection phase.

Intelligent initialization

Initialize the simulation in a state that is more representative of long-run


conditions.
If the system exists, collect data on it and use these data to specify more
nearly typical initial conditions.
If the system can be simplified enough to make it mathematically solvable,
e.g. queueing models, solve the simplified model to find long-run expected
or most likely conditions, use that to initialize the simulation.

Chapter 10. Output Analysis for a Single Model

28

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Initialization Bias
Divide each simulation into two phases:

An initialization phase, from time 0 to time T0.


A data-collection phase, from T0 to the stopping time T0+TE.
The choice of T0 is important:

- After T0 , system should be more nearly representative of steady-state behavior.

System has reached steady state: the probability distribution of the system
state is close to the steady-state probability distribution (bias of response
variable is negligible).

Chapter 10. Output Analysis for a Single Model

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Initialization Bias
M/G/1 queueing example: A total of 10 independent replications were
made.

Each replication beginning in the empty and idle state.


Simulation run length on each replication was T0+TE = 15000 minutes.
Response variable: queue length, LQ(t,r) (at time t of the r-th replication).
Batching intervals of 1000 minutes, batch means

Ensemble averages:

To identify trend in the data due to initialization bias


The average corresponding batch means across replications:

1 R
Y. j = Yrj
R r =1

R replications

The preferred method to determine deletion point.

Chapter 10. Output Analysis for a Single Model

30

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Initialization Bias
A plot of the ensemble averages, Y ..(n, d ) , versus 1000j,
for j = 1,2, ,15.

Chapter 10. Output Analysis for a Single Model

31

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Initialization Bias
Cumulative average sample mean (after deleting d observations):

1
Y.. (n, d ) =
nd

j = d +1

.j

Not recommended to determine the initialization phase.

It is apparent that downward bias is present and this bias can be reduced
by deletion of one or more observations.
Chapter 10. Output Analysis for a Single Model

32

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Initialization Bias
No widely accepted, objective and proven technique to guide how
much data to delete to reduce initialization bias to a negligible level.
Plots can, at times, be misleading but they are still recommended.

Ensemble averages reveal a smoother and more precise trend as the # of


replications, R, increases.
Ensemble averages can be smoothed further by plotting a moving
average.
Cumulative average becomes less variable as more data are averaged.
The more correlation present, the longer it takes for Y. j to approach
steady state.
Different performance measures could approach steady state at different
rates.

Chapter 10. Output Analysis for a Single Model

33

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Error Estimation
If {Y1, , Yn} are not statistically independent, then S2/n is a biased
estimator of the true variance.

Almost always the case when {Y1, , Yn} is a sequence of output


observations from within a single replication (autocorrelated sequence,
time-series).

Suppose the point estimator is the sample mean

1 n
Y = i =1 Yi
n
Variance of Y is very hard to estimate.
For systems with steady state, produce an output process that is
approximately covariance stationary (after passing the transient phase).
- The covariance between two random variables in the time series depends
only on the lag, i.e. the number of observations between them.

Chapter 10. Output Analysis for a Single Model

34

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Error Estimation
For a covariance stationary time series, {Y1, , Yn}:
Lag-k autocovariance is:

k = cov(Y1 , Y1+ k ) = cov(Yi , Yi + k )

Lag-k autocorrelation is:

k =

k
2

1 k 1

If a time series is covariance stationary, then the variance of

V (Y ) =

Y is:

k
1
2
+
1 k

n
n
k =1
n 1

The expected value of the variance estimator is:

S2
E = B V (Y ),
n
Chapter 10. Output Analysis for a Single Model

n / c 1
where B =
n 1
35

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Error Estimation
a)

k > 0 for most k


Stationary time series Yi exhibiting
positive autocorrelation.

b)

c)

Serie slowly drifts above and then


below the mean.

k < 0 for most k

Stationary time series Yi exhibiting


negative autocorrelation.

Nonstationary time series with an


upward trend

Chapter 10. Output Analysis for a Single Model

36

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Error Estimation
The expected value of the variance estimator is:

S2
E = B V (Y ),
n

where B =

n / c 1
and V (Y ) is the variance of Y
n 1

If Yi are independent, then S2/n is an unbiased estimator of V (Y )


If the autocorrelation k are primarily positive, then S2/n is biased low as
an estimator of V (Y ) .
If the autocorrelation k are primarily negative, then S2/n is biased high as
an estimator of V (Y ).

Chapter 10. Output Analysis for a Single Model

37

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Replication Method

Use to estimate point-estimator variability and to construct a confidence


interval.
Approach: make R replications, initializing and deleting from each one the
same way.
Important to do a thorough job of investigating the initial-condition bias:

Bias is not affected by the number of replications, instead, it is affected only by


deleting more data (i.e., increasing T0) or extending the length of each run (i.e.
increasing TE).

Basic raw output data {Yrj, r = 1, ..., R; j = 1, , n} is derived by:

Individual observation from within replication r.


Batch mean from within replication r of some number of discrete-time
observations.
Batch mean of a continuous-time process over time interval j.

Chapter 10. Output Analysis for a Single Model

38

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Replication Method
Each replication is regarded as a single sample for estimating .
For replication r:
1
Yr . (n, d ) =
nd

j = d +1

rj

The overall point estimator:

1 R
Y.. (n, d ) = Yr . (n, d )
R r =1

and

E[Y.. (n, d )] = n,d

If d and n are chosen sufficiently large:


n,d ~ .
Y.. (n, d ) is an approximately unbiased estimator of .

Chapter 10. Output Analysis for a Single Model

39

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Replication Method
To estimate the standard error of Y.. , the sample variance and
standard error:

1 R
1 R 2
2
2
(

)
=

S =
Y
Y
Y
R
Y

r. .. R 1
..
r.
R 1 r =1
r =1

Mean of the
undeleted
observations
from the r-th
replication.

Chapter 10. Output Analysis for a Single Model

Mean of

and

s.e.(Y.. ) =

S
R

Standard error

Y1 (n, d ),K, YR (n, d )

40

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Replication Method
Length of each replication (n) beyond deletion point (d):
( n d ) > 10d or
TE > 10T0
Number of replications (R) should be as many as time permits, up to
about 25 replications.
For a fixed total sample size (n), as fewer data are deleted (d):
CI shifts: greater bias.
Standard error of Y.. ( n, d ) decreases: decrease variance.

Reducing
bias

Chapter 10. Output Analysis for a Single Model

Trade off

41

Increasing
variance

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Replication Method

M/G/1 queueing example:

Suppose R = 10, each of length TE = 15000


minutes, starting at time 0 in the empty and
idle state, initialized for T0 = 2000 minutes
before data collection begins.
Each batch means is the average number of
customers in queue for a 1000-minute interval.
The 1-st two batch means are deleted (d = 2).

The point estimator and standard error are:

Y.. (15,2) = 8.43

and

s.e.(Y.. (15,2) ) = 1.59

The 95% CI for long-run mean queue length is:

Y.. t / 2, R 1S / R Y.. + t / 2, R 1S / R
8.43 2.26(1.59) LQ 8.42 + 2.26(1.59)

A high degree of confidence that the long-run mean queue length is between 4.84
and 12.02 (if d and n are large enough).

Chapter 10. Output Analysis for a Single Model

42

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Sample Size
To estimate a long-run performance measure, , within
with confidence 100(1- )%.
M/G/1 queueing example (cont.):
We know: R0 = 10, d = 2 and S02 = 25.30.
To estimate the long-run mean queue length, LQ, within = 2 customers
with 90% confidence ( = 10%).
Initial estimate:

2
z0.05 S0 1.645 (25.30)
R
= 17.1
=
2
2

Hence, at least 18 replications are needed, next try R = 18,19,


using R t0.05, R1S 0 / 2 . We found that:

R = 19 (t0.05,191S 0 / ) = (1.732 * 25.3 / 4) = 18.93


2

Additional replications needed is R R0 = 19-10 = 9.


Chapter 10. Output Analysis for a Single Model

43

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Sample Size
An alternative to increasing R is to increase total run length T0+TE
within each replication.
Approach:
- Increase run length from (T0+TE) to (R/R0)(T0+TE), and
- Delete additional amount of data, from time 0 to time (R/R0)T0.

Advantage: any residual bias in the point estimator should be further


reduced.
However, it is necessary to have saved the state of the model at time
T0+TE and to be able to restart the model.

Chapter 10. Output Analysis for a Single Model

44

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Batch Means for Interval Estimation


Using a single, long replication:
Problem: data are dependent so the usual estimator is biased.
Solution: batch means.

Batch means: divide the output data from 1 replication (after appropriate

deletion) into a few large batches and then treat the means of these batches
as if they were independent.

A continuous-time process, {Y(t), T0 t T0+TE}:


k batches of size m = TE / k, batch means:

1 jm
Yj =
Y (t + T0 )dt
j
m
(

1
)
m
A discrete-time process, {Yi, i = d+1,d+2, , n}:
k batches of size m = (n d)/k, batch means:
jm
1
Yj =
Yi + d

m i =( j 1) m +1
Chapter 10. Output Analysis for a Single Model

45

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Batch Means for Interval Estimation

Y1 , ..., Yd , Yd +1 , ..., Yd + m , Yd + m+1 , ..., Yd + 2 m , ... , Yd +( k 1) m+1 , ..., Yd + km


deleted

Y1

Yk

Y2

Starting either with continuous-time or discrete-time data, the


variance of the sample mean is estimated by:
2

S
1
=
k
k

j =1

(Y j Y )2 =
k 1

j =1

Y j2 kY 2
k (k 1)

If the batch size is sufficiently large, successive batch means will be


approximately independent, and the variance estimator will be
approximately unbiased.
No widely accepted and relatively simple method for choosing an
acceptable batch size m. Some simulation software does it
automatically.
Chapter 10. Output Analysis for a Single Model

46

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary

Stochastic discrete-event simulation is a statistical experiment.

Distinguish: terminating simulations and steady-state simulations.


Steady-state output data are more difficult to analyze

Purpose of statistical experiment: obtain estimates of the performance measures


of the system.
Purpose of statistical analysis: acquire some assurance that these estimates are
sufficiently precise.

Decisions: initial conditions and run length


Possible solutions to bias: deletion of data and increasing run length

Statistical precision of point estimators are estimated by standard-error or


confidence interval
Method of independent replications was emphasized.

Chapter 10. Output Analysis for a Single Model

47

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 11
Comparison and Evaluation of Alternative System Designs

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose
Purpose: comparison of alternative system designs.
Approach: discuss a few of many statistical methods that can be
used to compare two or more system designs.
Statistical analysis is needed to discover whether observed
differences are due to:
Differences in design or
The random fluctuation inherent in the models

Chapter 11. Comparison and Evaluation of Alternative Designs

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Outline
For two-system comparisons:
Independent sampling.
Correlated sampling (common random numbers).

For multiple system comparisons:


Bonferroni approach: confidence-interval estimation, screening, and
selecting the best.

Metamodels

Chapter 11. Comparison and Evaluation of Alternative Designs

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Comparison of Two System Designs


Goal: compare two possible configurations of a system
Two possible ordering policies in a supply-chain system, two possible
scheduling rules in a job shop.

Approach: the method of replications is used to analyze the output


data.
The mean performance measure for system i
Denoted by i , i = 1,2.
To obtain point and interval estimates for the difference in mean
performance, namely 1 2.

Chapter 11. Comparison and Evaluation of Alternative Designs

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Comparison of Two System Designs

Vehicle-safety inspection example:

The station performs 3 jobs: (1) brake check, (2) headlight check, and (3) steering
check.
Vehicles arrival: Possion with rate = 9.5/hour.
Present system:

Alternative system:

Performance measure: mean response time per vehicle (total time from vehicle
arrival to its departure).

- Three stalls in parallel (one attendant makes all 3 inspections at each stall).
- Service times for the 3 jobs: normally distributed with means 6.5, 6.0 and 5.5 minutes,
respectively.
- Each attendant specializes in a single task, each vehicle will pass through three work
stations in series
- Mean service times for each job decreases by 10% (5.85, 5.4, and 4.95 minutes).

Chapter 11. Comparison and Evaluation of Alternative Designs

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Comparison of Two System Designs


From replication r of system i, the simulation analyst obtains an estimate
Yir of the mean performance measure i .
Assuming that the estimators Yir are (at least approximately) unbiased:
1 = E(Y1r ), r = 1, , R1;
2 = E(Y2r ), r = 1, , R2
Goal: compute a confidence interval for 1 2 to compare the two
system designs
Confidence interval for 1 2:

- If CI is totally to the left of 0, strong evidence for the hypothesis that


1 2 < 0 (1 < 2 ).
- If CI is totally to the right of 0, strong evidence for the hypothesis that
1 2 > 0 (1 > 2 ).
- If CI contains 0, no strong statistical evidence that one system is better than
the other
If enough additional data were collected (i.e., Ri increased), the CI would
most likely shift, and definitely shrink in length, until conclusion of
1 < 2 or 1 > 2 would be drawn.

Chapter 11. Comparison and Evaluation of Alternative Designs

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Comparison of Two System Designs


In this chapter:
A two-sided 100(1-)% CI for 1 2 always takes the form of:

Y.1 Y.2 t / 2, s.e.(Y.1 Y.2 )


Sample
mean for
system i

Degree
of
freedom

Standard error
of the estimator

All three techniques assume that the basic data Yir are approximately
normally distributed.

Chapter 11. Comparison and Evaluation of Alternative Designs

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Comparison of Two System Designs


Statistically significant versus practically significant
Statistical significance: is the observed difference Y.1 Y.2 larger than

the variability in Y.1 Y.2 ?


Practical significance: is the true difference 1 2 large enough to
matter for the decision we need to make?
Confidence intervals do not answer the question of practical significance
directly, instead, they bound the true difference within the range:

Y1 Y2 t , s.e.(Y1 Y2 ) 1 2 Y1 Y2 + t , s.e.(Y1 Y2 )
2

Whether a difference within these bounds is practically significant


depends on the particular problem.

Chapter 11. Comparison and Evaluation of Alternative Designs

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Independent Sampling with Equal Variances


Different and independent random number streams are used
to simulate the two systems
All observations of simulated system 1 are statistically
independent of all the observations of simulated system 2.

The variance of the sample mean, Y.i , is:

V (Y.i ) i2
V Y.i =
=
,
Ri
Ri

( )

i = 1,2

For independent samples:

) ( ) ( )

V Y.1 Y.2 = V Y.1 + V Y.2 =


Chapter 11. Comparison and Evaluation of Alternative Designs

10

12
R1

22
R2
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Independent Sampling with Equal Variances


If it is reasonable to assume that 21 = 22 (approximately) or if R1 =
R2, a two-sample-t confidence-interval approach can be used:
The point estimate of the mean performance difference is:
The sample variance for system i is:

Y.1 Y.2

1 Ri
1 Ri 2
2
2
(Yri Y.i ) =
S =
Yri RiY.i

Ri 1 r =1
Ri 1 r =1
2
i

The pooled estimate of 2 is:

S p2

( R1 1) S12 + ( R2 1) S 22
=
,
R1 + R2 2

where = R1 + R2 -2 degrees of freedom

CI is given by: Y.1 Y.2 t / 2, s.e.(Y.1 Y.2 )


Standard error: s.e.(Y.1 Y.2 ) = S p
Chapter 11. Comparison and Evaluation of Alternative Designs

1
1
+
R1 R2
11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Independent Sampling with Unequal Variances


If the assumption of equal variances cannot safely be made,
an approximate 100(1-)% CI can be computed as:

s.e. Y.1 Y.2 =


With degrees of freedom:

(S

/ R1 + S 22 / R2
=
S 2 / R 2 / (R 1) + S 2 / R
1
2 2
1 1

2
1

S12 S 22
+
R1 R2

) / (R
2

1)

, round to an interger

In this case, the minimum number of replications


R1 > 7 and R2 > 7 is recommended.

Chapter 11. Comparison and Evaluation of Alternative Designs

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Common Random Numbers (CRN)


For each replication, the same random numbers are used to simulate
both systems R1=R2=R.
For each replication r, the two estimates, Yr1 and Yr2, are correlated.
However, independent streams of random numbers are used on different
replications, so the pairs (Yr1 ,Ys2 ) are mutually independent for r s.

Purpose: induce positive correlation between Y.1 ,Y.2 (for each r) to


reduce variance in the point estimator of Y.1 Y.2 .

) ( ) ( )

V Y.1 Y.2 = V Y.1 + V Y.2 2 cov Y.1 , Y.2

12

Correlation:
12 is positive

2 12 1 2
=
+

R
R
R
Chapter 11. Comparison and Evaluation of Alternative Designs

22

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Common Random Numbers (CRN)


Compare variance from independent sampling with variance
from CRN:

VCRN

2 12 1 2
= VIND
R

Variance of Y.1 Y.2 arising from CRN is less than that of


independent sampling (with R1=R2).

Chapter 11. Comparison and Evaluation of Alternative Designs

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Common Random Numbers (CRN)


The estimator based on CRN is more precise, leading to a shorter
confidence interval for the difference.
Sample variance of the differences D = Y.1 Y.2 :

1 R
1 R 2
2
2
(
)

S =
D
D
D
R
D
r

r
R 1 r =1
R 1 r =1

2
D

where Dr = Yr1-Yr 2

Standard error:

1 R
and D = Dr , with degress of freedom = R-1
R r =1

s.e.( D ) = s.e. Y.1 Y.2 =

Chapter 11. Comparison and Evaluation of Alternative Designs

15

SD
R

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Common Random Numbers (CRN)


It is never enough to simply use the same seed for the randomnumber generator(s):
The random numbers must be synchronized: each random number used
in one model for some purpose should be used for the same purpose in
the other model.
Example: if the i-th random number is used to generate a service time at
work station 2 for the 5-th arrival in model 1, the i-th random number
should be used for the very same purpose in model 2.

Chapter 11. Comparison and Evaluation of Alternative Designs

16

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Common Random Numbers (CRN)


Vehicle inspection example:
4 input random variables:
- An interarrival time between vehicle n and vehicle n+1,
- Sn(i) inspection time for task i for vehicle n in model 1 (i=1,2,3; refers to brake,
headlight and steering task, respectively).

When using CRN:


- Same values should be generated for A1, A2, A3, in both models.
- However, mean service time for model 2 is 10% less.
- Two possible approaches to obtain correlated service times:
Let Sn(i), be the service times generated for model 1, use:
Sn(i) - 0.1E[Sn(i)]
Let Zn(i), as the standard normal variate, = 0.5 minutes, use:
E[Sn(i)] + Zn(i)
- For synchronized runs: the service times for a vehicle were generated at the
instant of arrival and stored as its attribute and used as needed.

Chapter 11. Comparison and Evaluation of Alternative Designs

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Common Random Numbers (CRN)


Vehicle inspection example (cont.)
Each replication run of 16 hours
Model 2 with independen random numbers

Model 2 with common random numbers


with synchronisation

Model 1

Chapter 11. Comparison and Evaluation of Alternative Designs

Model 2 with common random numbers


without synchronisation

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Common Random Numbers (CRN)


Vehicle inspection example (cont.): compare the two systems using
independent sampling and CRN where R = R1 = R2 =10:
Independent sampling: Y.1 Y.2 = 5.4 minutes

with = 17, t 0.05,17 = 2.11, S12 = 118.9 and S 22 = 244.3, CI : -18.1 1-2 7.3
CRN without synchronization: Y.1 Y.2 = 1.9 minutes

with = 9, t 0.05,9 = 2.26, S D2 = 208.9, CI : - 12.3 1 - 2 8.5

CRN with synchronization: Y.1 Y.2 = 0.4 minutes

with = 9, t 0.05,9 = 2.26, S D2 = 1.7, CI : - 0.50 1 - 2 1.30

Chapter 11. Comparison and Evaluation of Alternative Designs

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

CRN with Specified Precision


Goal: The error in our estimate of 1 2 to be less than .
Approach: determine the # of replications R such that the half-width of
CI:
H = t / 2, s.e.(Y.1 Y.2 )
Vehicle inspection example (cont.):
R0 = 10, complete synchronization of random numbers
yield 95% CI: 0.4 0.9 minutes
Suppose = 0.5 minutes for practical significance, we know R is the
smallest integer satisfying R R0 and:

t / 2, R 1S D
R

t / 2, R0 1S D
t

t
Since / 2, R 1 / 2, R0 1 , a conservation estimate of R is: R

Hence, 35 replications are needed (25 additional).


Chapter 11. Comparison and Evaluation of Alternative Designs

20

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 2
Simulation Examples

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation using a Table


Repetition

Introducing simulation by

manually simulating on a table

Inputs
xi1

Response
yi

xi2 xij xip

1
2
:

Can be done via pen-and-paper


or by using a spreadsheet

n
Three steps

1. Determine the characteristics of each inputs to the simulation.


2. Construct a simulation table consisting of

p inputs xij, j=1,2,,p

one response y , i=1,2,,n


i

3. For each repetition i, generate a value for each of the p inputs xij and
calculate the response yi.

Next some simulation examples using tables.


Chapter 2. Simulation Examples

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems


A queueing system is described by
Calling population
Arrival rate
Service mechanism
System capacity
Queueing discipline

Calling population
Waiting line

Chapter 2. Simulation Examples

Server

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems


Single server queue

Queueing system state

Calling population is infinite

System

Arrival rate does not change

Units are served according FIFO

Arrivals are defined by the

Units (in queue or being served)


- Clock
-

distribution of the time between


arrivals

State of the system

Events

- Number of units in the system


- Status of server (idle, busy)

Service times are according to

Arrival rate must be less than


service rate

inter-arrival time

distribution

Server

stable system

- Arrival of a unit
- Departure of a unit

Otherwise waiting line will grow


unbounded

unstable system

Calling population

ti+1

ti

Arrivals
Waiting line

Chapter 2. Simulation Examples

Server

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems


Arrival Event

If server idle unit gets service,


otherwise unit enters queue.

Departure Event

If queue is not empty begin


servicing next unit, otherwise
server will be idle.

How do events occur?

Events occur randomly

Interarrival times

Service times

Chapter 2. Simulation Examples

{1,...,6}

{1,...,4}

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems


Customer

Interarrival
Time

Arrival Time
on Clock

The interarrival and service


times are taken from
distributions!

Service
Time

15

Customer
Number

The simulation run is


build by meshing
clock, arrival and
service times!

Chapter 2. Simulation Examples

Arrival Time
[Clock]

Time Service
Begins [Clock]

Service Time
[Duration]

Time Service
Ends [Clock]

11

11

12

15

15

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems


Chronological ordering of events
Clock
Time

Customer
Number

Event
Type

Number of
customers

Arrival

Departure

Arrival

Departure

Arrival

Arrival

Departure

Arrival

11

Departure

12

Departure

15

Arrival

19

Departure

Chapter 2. Simulation Examples

Number of customers in the system

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems A Grocery


Analysis of a small grocery store

One checkout counter

Customers arrive at random times

Interarrival
Time

Probability

0.125

0.125

0.125

0.250

0.125

0.375

0.125

0.500

0.125

0.625

0.125

0.750

0.125

0.875

0.125

1.000

from {1,2,,8} minutes

Service times vary from {1,2,,6}


minutes

Consider the system for 100


customers

Problems/Simplifications

Sample size is too small to be able


to draw reliable conclusions

Initial condition is not considered

Chapter 2. Simulation Examples

Service
Time

Probability

Cumulative
Probability

Cumulative
Probability

0.10

0.10

0.20

0.30

0.30

0.60

0.25

0.85

0.10

0.95

0.05

1.00
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems A Grocery


Customer

Interarrival
Time
[Minutes]

Arrival Time
[Clock]

Service
Time
[Minutes]

Time
Service
Begins
[Clock]

Time
Service
Ends
[Clock]

Waiting
Time in
Queue
[Minutes]

Time
Customer
in System
[Minutes]

Idle Time
of Server
[Minutes]

11

11

15

11

15

16

18

18

23

416

418

174

491

101

...
100

Total

415

Chapter 2. Simulation Examples

415

2
317

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems A Grocery


Average waiting time

w=

Probability that a customer has to wait


Porpotion of server idle time

p( wait ) =

p(idle server) =

Average service time

s=

Number of customers

100

Number of customer who wait 46


=
= 0.46
Nunber of customers
100

Idle time of server = 101 = 0.24


Simulation run time

Service time

Number of customers

418

317
= 3.17 min
100

s p(s) = 0.110 + 0.2 20 + L + 0.05 6 = 3.2 min


Times between arrivals 415
=
= 4.19 min
=

E (s ) =

Average time between arrivals

Waiting time in queue = 174 = 1.74 min

s=0

Number of arrivals - 1

E ( ) =

Average waiting time of those who wait


Average time a customer spends in system

wwaited =

t=

99

a + b 1+ 8
=
= 4.5 min
2
2

Waiting time in queue

Number of customers that wait

174
= 3.22 min
54

Time customers spend in system = 491 = 4.91min


Number of customers

100

t = w + s = 1.74 + 3.17 = 4.91 min

Chapter 2. Simulation Examples

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation of Queueing Systems A Grocery


Interesting results for a

Average of 50 Trials

manager, but

Histogram for the Average Customer Waiting Time

longer simulation run would


increase the accuracy

18

17

Some interpretations

Average waiting time is not


high

Server hast not undue amount

Occurrences (No. of Trials)

16
14
12
10
8

4
2

1
0

of idle time, it is well loaded ;-)

15

1
0

4,5

0
0

0,5

1,5

2,5

3,5

>4.5

Average customer waiting time [min]

Nearly half of the customers


have to wait (46%)

Chapter 2. Simulation Examples

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary
This chapter introduced simulation concepts my means of
examples
Example simulation were performed on a table manually
Use a spreadsheet for large experiments (Excel, OpenOffice)
Input data is important
Random variables can be used
Output analysis important and difficult
The used tables were of ad hoc, a more methodic approach is
needed

Chapter 2. Simulation Examples

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 3
General Principles

Computer Science, Informatik 4


Communication and Distributed Systems

General Principles Introduction


Framework for modeling systems by discrete-event
simulation
A system is modeled in terms of its state at each point in time
This is appropriate for systems where changes occur only at
discrete points in time

Chapter 3. General Principles

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Concepts in Discrete-Event Simulation

Concepts of dynamic, stochastic systems that change in a discrete


manner

System

A collection of entities that interact together over time to accomplish one or more
goals.

Model

An abstract representation of a system, usually containing structural, logical, or


mathematical relationships that describe the system.

System state

A collection of variables that contain all the information necessary to describe the
system at any time.

Entity

An object in the system that requires explicit representation in the model, e.g., people,
machines, nodes, packets, server, customer.

Attributes

The properties of a given entity.

List, Set

A collection of associated entities ordered in some logical fashion in a waiting line.


Holds entities and event notices
Entities on a list are always ordered by some rule, e.g. FIFO, LIFO, or ranked by
some attribute, e.g. priority, due date

Event

An instantaneous occurrence that changes the state of a system.

Event notice

A record of an event to occur at the current or some future time, along with any
associated data necessary to execute the event.

Chapter 3. General Principles

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Concepts in Discrete-Event Simulation


Event list

A list of event notices for future events, ordered by time of occurrence; known as the
future event list (FEL).
Always ranked by the event time

Activity

A duration of time of specified length, which is known when it begins.


Represents a service time, interarrival time, or any other processing time whose
duration has been characterized by the modeler. The duration of an activity can be
specified as:
Deterministic Always 5 time units
Statistical Random draw from {2, 5, 7}
A function depending on system variables and entities
The duration of an activity is computable when it begins
The duration is not affected by other events
To track activities, an event notice is created for the completion time, e.g., let
clock=100 and service with duration 5 time units is starting
Schedule an end of service-event for clock + 5 = 105

Delay

A duration of time of unspecified indefinite length, which is not known until it ends.
Customers delay in waiting line depends on the number and service times of other
customers.
Typically a desired output of the simulation run.

Clock

A variable representing the simulated time.

Chapter 3. General Principles

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Concepts in Discrete-Event Simulation


Activity vs. Delay
Activity
Activity is known as unconditional wait
End of an activity is an event, for this an event notice is placed in
the future event list
This event is a primary event

Delay
Delay is known as conditional wait
Delays are managed by placing the entity on another list, e.g.,
representing a waiting line
Completion of delay is a secondary event, but they are not
placed in the future event list

Chapter 3. General Principles

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Concepts in Discrete-Event Simulation


Activity vs. Delay

Delay
Delay
Activity2

Activity1

t
A1

Chapter 3. General Principles

A2

D1

A3

D2

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Concepts in Discrete-Event Simulation

A model consists of

static description of the model and


the dynamic relationships and interactions between the components

Some questions that need to be answered for the dynamic behavior

Events
- How does each event affect system state, entity attributes, and set contents?

Activities
- How are activities defined?
- What event marks the beginning or end of each activity?
- Can the activity begin regardless of system state, or is its beginning conditioned on the
system being in a certain state?

Delays
- Which events trigger the beginning (and end) of each delay?
- Under what condition does a delay begin or end?

System state initialization


- What is the system state at time 0?
- What events should be generated at time 0 to prime the model that is, to get the
simulation started?

Chapter 3. General Principles

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Concepts in Discrete-Event Simulation


A discrete-event simulation proceeds by producing a
sequence of system snapshots over time
A snapshot of the system at a given time includes
System state
Status of all entities
Status of all sets
- Sets are used to collect required information for calculating
performance metrics

List of activities (FEL)


Statistics
Clock

System state

(x, y, z, ...)

...

...

Chapter 3. General Principles

Entities and
attributes

Set 1

Set 2

...

Future event list (FEL)

Statistics

(3,t1) Type 3 event to occur at t1


...

...

...

...

...

...

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Event-scheduling/Time-advance algorithm
Future event list (FEL)

All event notices are chronologically ordered in the FEL


At current time t, the FEL contains all scheduled events
The event times satisfy: t < t1 t2 t3 ... tn
The event associated with t1 is the imminent event, i.e., the next
event to occur
Scheduling of an event
- At the beginning of an activity the duration is computed and an endof-activity event is placed on the future event list

The content of the FEL is changing during simulation run


- Efficient management of the FEL has a major impact on the
performance of a simulation run
- Class: Data structures and algorithms

Chapter 3. General Principles

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Event-scheduling/Time-advance algorithm
Old system snapshot at time t
Event-scheduling/Time-advance algorithm

Clock

State

(5,1,6)

(3,t1) Type 3 event to occur at t1


(1,t2) Type 1 event to occur at t2

Step 1: Remove the event notice for the


imminent event from FEL
event (3, t1) in the example

(1,t3) Type 1 event to occur at t3


...
(2,tn) Type 2 event to occur at tn

Step 2: Advance Clock to imminent event time


Set clock = t1
Step 3: Execute imminent event
update system state
change entity attributes
set membership as needed
Step 4: Generate future events and place their
event notices on FEL
Event (4, t*)

Future event list

New system snapshot at time t1


Clock

State

t1

(5,1,5)

Future event list


(1,t2) Type 1 event to occur at t2
(4,t*) Type 4 event to occur at t*
(1,t3) Type 1 event to occur at t3

Step 5: Update statistics and counters

...
(2,tn) Type 2 event to occur at tn

Chapter 3. General Principles

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Event-scheduling/Time-advance algorithm
System snapshot at time 0
Initial conditions
Generation of exogenous events
- Exogenous event, is an event which happens outside the system,
but impinges on the system, e.g., arrival of a customer

Chapter 3. General Principles

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Event-scheduling/Time-advance algorithm
Generation of events
Arrival of a customer
- At t=0 first arrival is generated and scheduled
- When the clock is advanced to the time of the
first arrival, a second arrival is generated
- Generate an interarrival time a*
- Calculate t* = clock + a*
- Place event notice at t* on the FEL

Chapter 3. General Principles

13

Bootstrapping

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Event-scheduling/Time-advance algorithm
Generation of events
Service completion of a customer
-

A customer completes service at t


If the next customer is present a new service time s* is generated
Calculate t* = clock + s*
Schedule next service completion at t*
Additionally: Service completion event will scheduled at the arrival
time, when there is an idle server
- Service time is an activity
- Beginning service is a conditional event
Conditions: Customer is present and server is idle
- Service completion is a primary event

Chapter 3. General Principles

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Event-scheduling/Time-advance algorithm
Generation of events
Alternate generation of runtimes and downtimes
- At time 0, the first runtime will be generated and an end-of-runtime
event will be scheduled
- Whenever an end-of-runtime event occurs, a downtime will be
generated, and a end-of-downtime event will be scheduled
- At the end-of-downtime event, a runtime is generated and an endof-runtime event is scheduled
- Runtimes and downtimes are activities
- end-of-runtime and end-of-downtime are primary events

runtime

downtime

runtime
Time

Time 0
Chapter 3. General Principles

15

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Event-scheduling/Time-advance algorithm

Stopping a simulation
1. At time 0, schedule a stop simulation event at a specified future
time TE Simulation will run over [0, TE]
2. Run length TE is determined by the simulation itself.

TE is not known ahead.


Example: TE = When FEL is empty

Chapter 3. General Principles

16

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

World Views

World view
A world view is an
orientation for the model
developer
Simulation packages
typically support some
world views
Here, only world views for
discrete simulations

Chapter 3. General Principles

Discrete Simulation

Event-scheduling

17

Process-interaction

Activity-scanning

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

World Views
Start

Event-scheduling
Focus on events
Identify the entities and their
attributes
Identify the attributes of the
system
Define what causes a change
in system state
Write a routine to execute for
each event
Variable time advance

Initialization
Select next event

Event
routine 1

Event
routine 2

No

Event
routine n

Terminate?

Yes
Output
End

Chapter 3. General Principles

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

World Views

Process-interaction

Modeler thinks in terms of processes


A process is the lifecycle of one entity, which consists of various events and activities
Simulation model is defined in terms of entities or objects and their life cycle as they flow
through the system, demanding resources and queueing to wait for resources
Some activities might require the use of one or more resources whose capacities are limited
Processes interact, e.g., one process has to wait in a queue because the resource it needs is
busy with another process
A process is a time-sequenced list of events, activities and delays, including demands for
resource, that define the life cycle of one entity as it moves through a system
Variable time advance

Chapter 3. General Principles

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

World Views
Start

Activity-scanning

Initialization

Modeler concentrates on activities


of a model and those conditions
that allow an activity to begin
At each clock advance, the
conditions for each activity are
checked, and, if the conditions are
true, then the corresponding
activity begins
Fix time advance
Disadvantage: The repeated
scanning to discover whether an
activity can begin results in slow
runtime
Improvement: Three-phase
approach

Phase 1: Time Scan


Phase 2: Activity Scan
Activity 1
Condition
Actions

Activity 2
Condition
Actions

Other condition
satisfied?

Activity n
Condition
Actions

Yes

No
No

Terminate?

Yes

- Combination of event scheduling


with activity scanning

Output
End

Chapter 3. General Principles

20

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

World Views
Start

Three-phase approach

Initialization

Events are activities of duration


zero time units
Two types of activities

Phase A: Time Scan

- B activities: activities bound to


occur; all primary events and
unconditional activities
- C activities: activities or events
that are conditional upon certain
conditions being true

Phase B: Execute B activities due now


Phase C: Scan all C activities
Activity 1
Condition
Actions

The B-type activites can be


scheduled ahead of time, just as
in the event-scheduling approach

Activity 2
Condition
Actions

- Variable time advance


- FEL contains only B-type events

Other condition
satisfied?

Scanning to learn whether any Ctype activities can begin or C-type


events occur happen only at the
end of each time advance, after
all B-type events have completed

Activity n
Condition
Actions

Yes

No
No

Terminate?

Yes
Output
End
Chapter 3. General Principles

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

World Views

P2
E4

E3
A3

A4

P4
E8

E7
A7

P3
P1
E2

E1
A1

E6

E5
A5

A8

A6

A2
Time

Chapter 3. General Principles

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Grocery

Reconsider grocery example from Chapter 2

System state = ( LQ(t), LS(t) )

(A, t) arrival event at future time t


(D, t) departure event at future time t
(E, t) simulation stop at future time t

Activities

Arrival (A)
Departure (D)
Stopping event (E)

Event notices

Server and customers are not explicitly modeled

Events

LQ(t) = Number of customers in the waiting line at t


LS(t) = Number of customers being served at t (0 or 1)

Entities

In chapter 2: We used an ad hoc method to simulate the grocery

Interarrival time
Service time

Delay

Customer time spent in waiting line


Calling population
Waiting line

Chapter 3. General Principles

23

Server
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Grocery

System state = ( LQ(t), LS(t) ) is affected by the events

Arrival
Departure

Chapter 3. General Principles

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Grocery


Server Busy time
Maximum Queue Length

Initial conditions
First customer arrives at t=0
and gets service
An arrival and a departure
event is on FEL

Server was busy for 21 of


23 time units
Maximum queue length
was 2

Chapter 3. General Principles

25

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Grocery

When event scheduling is implemented, consider

Only one snapshot is kept in the memory


A new snapshot can be derived only from the previous snapshot
Past snapshot are ignored for advancing the clock
The current snapshot must contain all information necessary to
continue the simulation!

In the example
No information about particular customer
If needed, the model has to be extended

Chapter 3. General Principles

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Grocery


Analyst wants estimates per customer basis
Mean response time (system time)
Mean proportion of customers who spend more than 5 time units

Extend the model to represent customers explicitly


Entities: Customer entities denoted as C1, C2, C3,
- (Ci, t) customer Ci arrived at t

Event notices
- (A, t, Ci) arrival of customer Ci at t
- (D, t, Cj) departure of customer Cj at t

Set
- Checkout Line set of customers currently at the checkout counter ordered
by time of arrival

Statistics
- S: sum of customer response times for all customers who have departed by
the current time
- F: total number of customers who spend 5 time units
- ND: number of departures up to the current simulation time

Chapter 3. General Principles

27

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Grocery


Extended version of the simulation table from Slide 25
System State
Clock

LQ(t)

(C1,0)

(A,1,C2) (D,4,C1)(E,60)

(C1,0)(C2,1)

(A,2,C3)(D,4,C1)(E,60)

(C1,0)(C2,1)(C3,2)

(D,4,C1)(A,8,C4)(E,60)

(C2,1)(C3,2)

(D,6,C2)(A,8,C4)(E,60)

(C3,2)

(A,8,C4)(D,11,C3)(E,60)

(C3,2)(C4,8)

(D,11,C3)(A,11,C5)(E,60)

11

(C4,8)(C5,11)

(D,15,C4)(A,18,C6)(E,60)

18

15

(C5,11)

(D,16,C4)(A,18,C6)(E,60)

25

16

(A,18,C6)(E,60)

30

18

(C6,18)

(D,23,C6)(A,23,C7)(E,60)

30

23

(C7,23)

(A,25,C8)(D,27,C7)(E,60)

35

response time =

Chapter 3. General Principles

LS(t)

Statistics
Checkout Line

35
S
=
= 5.83
6
ND

Future Event List

N 5 =

28

ND

5
F
= = 0.83
ND 6
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Dump Truck


The DumpTruck Problem
Six dump trucks are used to haul coal from the entrace of a small mine
to the railroad
Each truck is loaded by one of two loaders
After loading, the truck immediately moves to the scale, to be weighed
Loader and Scale have a first-come-first-serve (FCFS) queue
The travel time from loader to scale is negligible
After being weighed, a truck begins a travel time, afterwards unloads
the coal and returns to the loader queue
Purpose of the study: Estimation of the loader and scale utilizations.

Chapter 3. General Principles

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Dump Truck

Loader queue Trucks waiting to begin loading, FCFS


Weigh queue Truck waiting to bei weighed, FCFS

Delay at loader queue


Delay at scale

Chapter 3. General Principles

30

CDF

0.30

0.30

10

0.50

0.80

15

0.20

1.00

Weighing Time

PDF

CDF

12

0.70

0.70

16

0.30

1.00

Travel Time Distribution


Travel Time

Loading Loading time


Weighing Weighing time
Travel Travel time

Delays

PDF

Weighing Time Distribution

The six dump trucks DT1, DT2, ..., DT6

Activities

Loading Time

Lists

(ALQ, t, DTi) dump truck i arrives at loader queue (ALQ) at time t


(EL, t, DTi) dump truck i ends loading (EL) at time t
(EW, t, DTi) dump truck i ends weighing (EW) at time t

Entities

LQ(t) = number of trucks in the loader queue {0,1,2,...}


L(t) = number of trucks being loaded {0,1,2}
WQ(t) = number of trucks in weigh queue {0,1,2,...}
W(t) = number of trucks being weighed {0,1}

Event notices

Loading Time Distribution

System state [ LQ(t), L(t), WQ(t), W(t) ]

PDF

CDF

40

0.40

0.40

60

0.30

0.70

80

0.20

0.90

100

0.10

1.00

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Manual Simulation Using Event Scheduling Dump Truck


Initialization
It is assumed that five trucks are at the loader and one is at the scale at
time 0

Activity times

Both loaders
are busy!

Loading time: 10, 5, 5, 10, 15, 10, 10


Weighing time: 12, 12, 12, 16, 12, 16
Travel time: 60, 100, 40, 40 80
System State

Lists
Loader Queue

Statistics

Clock

LQ(t)

L(t)

WQ(t)

W(t)

Weigh Queue

DT4, DT5, DT6

DT5, DT6

DT3

10

DT6

10

12

20

24

Future Event List

BS

(EL,10,DT2) (EL,5+5,DT4) (EW,12,DT1)

10

DT3, DT2

(EL,10,DT4) (EW,12,DT1)
(EL,10+10,DT5)

20

10

DT3, DT2, DT4

(EW,12,DT1) (EL,20,DT5)
(EL,10+15,DT6)

20

10

DT2, DT4

(EL,20,DT5) (EW,12+12,DT3)
(EL,25,DT6) (ALQ,12+60,DT1)

24

12

DT2, DT4, DT5

(EW,24,DT3) (EL,25,DT6) (ALQ,72,DT1)

40

20

DT4, DT5

(EL,25,DT6) (EW,24+12,DT2)
(ALQ,72,DT1) (ALQ,24+100,DT3)

44

24

Chapter 3. General Principles

(EL,5,DT3) (EL,10,DT2) (EW,12,DT1)

BL

31

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java
Again the grocery example

Java is a general purpose


programming language

Single server queue


Run for 1000 customers
Interarrival times are
exponentially distributed with
mean 4.5
Service times are also
exponentially distributed with
mean 3.2
Known as: M/M/1 queueing
system

Object-oriented

First simple specific


simulation implementation
Later, object-oriented
framework for discrete event
simulation

Calling population

ti+1

ti

Arrivals
Waiting line

Chapter 3. General Principles

33

Server

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java

System state

queueLength
numberInService

Entity attributes
customers

futureEventList

Activity durations

meanInterArrivalTime
meanServiceTime
meanInterarrivalTime
meanServiceTime
totalCustomers

34

exponential(mu)

Methods

Input parameters

rho = BusyTime/Clock
avgr = Average response time
pc4 = Number of customers who spent
more than 4 minutes

Help functions

clock
lastEventTime
totalBusy
maxQueueLength
sumResponseTime

Statistics

Future event list

Chapter 3. General Principles

Simulation variables

initialization
processArrival
processDeparture
reportGeneration

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java
Overall structure of the Java program

Overall structure of an event-scheduling


simulation program

Chapter 3. General Principles

35

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java Class Event


class Event {
public double time;
private int type;
public Event(int _type, double _time) {
type = _type;
time = _time;
}
public int getType() {
return type;
}
public double getTime() {
return time;
}
}

Chapter 3. General Principles

36

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java Sim Class


class Sim {
// Class Sim variables
public static double clock,
meanInterArrivalTime,
meanServiceTime,
lastEventTime,
totalBusy,
maxQueueLength,
sumResponseTime;
public static long

numberOfCustomers,
queueLength,
numberInService,
totalCustomers,
numberOfDepartures,
longService;

public final static int arrival = 1;


public final static int departure = 2;

// Event type for an arrival


// Event type for a departure

public static EventList futureEventList;


public static Queue customers;
public static Random stream;

Chapter 3. General Principles

37

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java Main program


public static void main(String
meanInterArrivalTime
meanServiceTime
totalCustomers
long seed

argv[]) {
= 4.5;
= 3.2;
= 1000;
= Long.parseLong(argv[0]);

stream = new Random(seed);


futureEventList = new EventList();
customers = new Queue();

// Initialize rng stream

initialization();
// Loop until first totalCustomers" have departed
while( numberOfDepartures < totalCustomers ) {
Event event = (Event)futureEventList.getMin(); // Get imminent event
futureEventList.dequeue();
// Be rid of it
clock = event.getTime();
// Advance simulation time
if( event.getType() == arrival ) {
processArrival(event);
}
else {
processDeparture(event);
}
}
reportGeneration();
}

Chapter 3. General Principles

38

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java Initialization


// Seed the event list with TotalCustomers arrivals
public static void initialization()
{
clock = 0.0;
queueLength = 0;
numberInService = 0;
lastEventTime = 0.0;
totalBusy = 0 ;
maxQueueLength = 0;
sumResponseTime = 0;
numberOfDepartures = 0;
longService = 0;
// Create first arrival event
double eventTime = exponential(stream, MeanInterArrivalTime);
Event event = new Event(arrival, eventTime);
futureEventList.enqueue(event);
}

Chapter 3. General Principles

39

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java Event Arrival


public static void processArrival(Event event) {
customers.enqueue(event);
queueLength++;
// If the server is idle, fetch the event, do statistics and put into service
if( numberInService == 0 ) {
scheduleDeparture();
}
else {
totalBusy += (clock - lastEventTime); // server is busy
}
// Adjust max queue length statistics
if(maxQueueLength < queueLength) {
maxQueueLength = queueLength;
}
// Schedule the next arrival
Double eventTime = clock + exponential(stream, meanInterArrivalTime);
Event nextArrival = new Event(arrival, eventTime);
futureEventList.enqueue( nextArrival );
lastEventTime = clock;
}

Chapter 3. General Principles

40

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java Event Departure


public static void scheduleDeparture() {
double serviceTime = exponential(stream, meanServiceTime);
Event depart = new Event(departure, clock + serviceTime);
futureEventList.enqueue(depart);
numberInService = 1;
queueLength--;
}
public static void processDeparture(Event e) {
// Get the customer description
Event finished = (Event) customers.dequeue();
// If there are customers in the queue then schedule the departure of the next one
if( queueLength > 0 ) {
scheduleDeparture();
}
else {
numberInService = 0;
}
// Measure the response time and add to the sum
double response = clock - finished.getTime();
sumResponseTime += response;
if( response > 4.0 )
longService++; // record long service
totalBusy += (clock - lastEventTime);
numberOfDepartures++;
lastEventTime = clock;
}

Chapter 3. General Principles

41

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java Report Generator


public static void
double rho
=
double avgr =
double pc4
=

reportGeneration() {
totalBusy/clock;
sumResponseTime/totalCustomers;
((double)longService)/totalCustomers;

System.out.println( "SINGLE SERVER QUEUE SIMULATION - GROCERY STORE CHECKOUT COUNTER ");
System.out.println( "\tMEAN INTERARRIVAL TIME
" + meanInterArrivalTime );
System.out.println( "\tMEAN SERVICE TIME
" + meanServiceTime );
System.out.println( "\tNUMBER OF CUSTOMERS SERVED
" + totalCustomers );
System.out.println();
System.out.println( "\tSERVER UTILIZATION
" + rho );
System.out.println( "\tMAXIMUM LINE LENGTH
" + maxQueueLength );
System.out.println( "\tAVERAGE RESPONSE TIME
" + avgr + " Time Units");
System.out.println( "\tPROPORTION WHO SPEND FOUR ");
System.out.println( "\t MINUTES OR MORE IN SYSTEM
" + pc4 );
System.out.println( "\tSIMULATION RUNLENGTH
" + clock + " Time Units");
System.out.println( "\tNUMBER OF DEPARTURES
" + totalCustomers );
}

Chapter 3. General Principles

42

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation in Java - Output


SINGLE SERVER QUEUE SIMULATION - GROCERY STORE CHECKOUT COUNTER
MEAN INTERARRIVAL TIME
4.5
MEAN SERVICE TIME
3.2
NUMBER OF CUSTOMERS SERVED
1000
SERVER UTILIZATION
MAXIMUM LINE LENGTH
AVERAGE RESPONSE TIME
PROPORTION WHO SPEND FOUR
MINUTES OR MORE IN SYSTEM
SIMULATION RUNLENGTH
NUMBER OF DEPARTURES

Chapter 3. General Principles

0.718
13.0
9.563
0.713
4485.635
1000

43

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-oriented Discrete-Event Simulation


Framework

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework


Package rng

Package core

Chapter 3. General Principles

45

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework


OO Discrete-Event Simulation Framework consists of
Two packages

Package core

SimEvent
SimEntity
SimQueue
SimControl

Package rng
RNG

Chapter 3. General Principles

46

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework SimEvent


public class SimEvent {
double time;
int type;
SimEntity src;
SimEntity dst;
public long id;
public SimEvent(SimEntity _dst) {
type = 0;
time = 0;
src = null;
dst = _dst;
}
public SimEvent(double _time, SimEntity _dst) {
type = 0;
time = _time;
src = null;
dst = _dst;
}
public SimEvent(double _time, SimEntity _src, SimEntity _dst) {
type = 0;
time = _time;
src = _src;
dst = _dst;
}

Chapter 3. General Principles

47

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework SimEntity


public abstract class SimEntity {
protected SimControl simControl;
/**
* An entity has to know the current instance of the simulator.
* @param _simControl
* @see SimControl
*/
public SimEntity(SimControl _simControl) {
simControl = _simControl;
}
/**
* This method handles the events destined to this entity.
* @param event
* @see SimEvent
*/
abstract public void handleEvent(SimEvent event);
}

Chapter 3. General Principles

48

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework SimQueue


public abstract class SimQueue {
/**
* Schedule the given event according to the event time.
* @param event
* @see SimEvent
*/
abstract public void schedule(SimEvent event);
/**
* Return the next event in the queue.
* @return imminent event in queue.
* @see SimEvent
*/
abstract public SimEvent getNextEvent();
/**
* This method dumps the content of the queue.
* It is for debugging purposes.
*/
abstract public void dump();
abstract public boolean isEmpty();
}

Chapter 3. General Principles

49

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework SimControl


public class SimControl {
private SimQueue queue;
private double time;
private double endTime;
public SimControl(SimQueue _queue) {
queue = _queue;
}
public void run() {
SimEvent event;
while( queue.isEmpty() == false ) {
// If there is an event in FEL and the sim-end is not reached ...
event = queue.getNextEvent();
time = event.getTime();
if( event.getTime() <= endTime )
dispatch(event); // ... call the destination object of this event
else
break;
}
}
private void dispatch(SimEvent event) {
event.getDestination().handleEvent(event);
}
Chapter 3. General Principles

50

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework SimControl


... public class SimControl ...
public void setRunTime(double _runTime) {
endTime = _runTime;
}
public void schedule(SimEvent event) {
queue.schedule(event);
}
public void schedule(SimEvent event, double _delta) {
event.setTime(time +_delta);
schedule(event);
}

Chapter 3. General Principles

51

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework RNG


public abstract class RNG {
abstract public double getNext();
}

public class Exponential extends RNG {


double lambda;
Random uniform;
public Exponential(double _lambda) {
lambda = _lambda;
uniform = new Random(System.currentTimeMillis());
}
/*
* @see rng.RNG#getNext()
*/
public double getNext() {
return -Math.log(uniform.nextDouble())/lambda;
}
}

Chapter 3. General Principles

52

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework


Again our Grocery
example
Use of the objectoriented simulation
framework

MM1Generator
Generates new
customer

MM1Server
Serves customer

Chapter 3. General Principles

53

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework

Chapter 3. General Principles

54

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Object-Oriented Simulation Framework


9

Simulation

Simulation
Theory

6
Queueing Time

System Time

Queueing Time

0
0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.1

0.9

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

rho

rho

0.9
Simulation

p0 Probability that
a customer finds the
system idle

Theory

p0

0.8

0.7

Probability of Empty System

System Time

Theory

0.6

0.5

0.4

0.3

0.2

0.1

0
0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

rho

Chapter 3. General Principles

55

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary
Introduced a general framework for discrete event
simulations
Event-scheduling and time-advance algorithm
Generation of events
World views for discrete simulations
Introduced manual discrete event simulation
Introduced simulation in Java
Object-oriented simulation framework in Java

Chapter 3. General Principles

56

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 4
Statistical Models in Simulation

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose & Overview


The world the model-builder sees is probabilistic rather than
deterministic.
Some statistical model might well describe the variations.

An appropriate model can be developed by sampling the


phenomenon of interest:
Select a known distribution through educated guesses
Make estimate of the parameters
Test for goodness of fit

In this chapter:
Review several important probability distributions
Present some typical application of these models

Chapter 4. Statistical Models in Simulation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Review of Terminology and Concepts


In this section, we will review the following concepts:

Discrete random variables


Continuous random variables
Cumulative distribution function
Expectation

Chapter 4. Statistical Models in Simulation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Discrete Random Variables


X is a discrete random variable if the number of possible values
of X is finite, or countable infinite.
Example: Consider jobs arriving at a job shop.
- Let X be the number of jobs arriving each week at a job shop.
Rx = possible values of X (range space of X) = {0,1,2,}
p(xi) = probability the random variable X is xi , p(xi) = P(X = xi)

p(xi), i = 1,2, must satisfy:

1. p( xi ) 0, for all i
2.

i =1

p( xi ) = 1

The collection of pairs [xi, p(xi)], i = 1,2,, is called the probability


distribution of X, and p(xi) is called the probability mass function (pmf) of
X.

Chapter 4. Statistical Models in Simulation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Continuous Random Variables

X is a continuous random variable if its range space Rx is an interval or


a collection of intervals.
The probability that X lies in the interval [a, b] is given by:
b

P(a X b) = f ( x)dx
a

f(x) is called the probability density function (pdf) of X, satisfies:

1. f ( x) 0 , for all x in R X
2.

f ( x)dx = 1

RX

3. f ( x) = 0, if x is not in RX

Properties
x0

1. P ( X = x0 ) = 0, because f ( x)dx = 0
x0

2. P ( a X b ) = P ( a < X b ) = P ( a X < b ) = P ( a < X < b )


Chapter 4. Statistical Models in Simulation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Continuous Random Variables


Example: Life of an inspection device is given by X, a
continuous random variable with pdf:

1 x / 2
e , x0
f ( x) = 2
0,
otherwise
Lifetime in Year

X has an exponential distribution with mean 2 years


Probability that the devices life is between 2 and 3 years is:

1 3 x / 2
P(2 x 3) = e dx = 0.14
2 2
Chapter 4. Statistical Models in Simulation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Cumulative Distribution Function

Cumulative Distribution Function (cdf) is denoted by F(x), where


F(x) = P(X x)

If X is discrete, then

F ( x) = p ( xi )
xi x

If X is continuous, then

F ( x) = f (t )dt

Properties
1. F is nondecreasing function. If a b, then F (a ) F (b)
2. lim F ( x) = 1
x

3. lim F ( x) = 0
x

All probability question about X can be answered in terms of the cdf:

P (a X b) = F (b) F (a ), for all a b


Chapter 4. Statistical Models in Simulation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Cumulative Distribution Function


Example: An inspection device has cdf:

1 x t / 2
F ( x) = e dt = 1 e x / 2
2 0
The probability that the device lasts for less than 2 years:

P(0 X 2) = F (2) F (0) = F (2) = 1 e 1 = 0.632

The probability that it lasts between 2 and 3 years:

P (2 X 3) = F (3) F (2) = (1 e ( 3 / 2 ) ) (1 e 1 ) = 0.145

Chapter 4. Statistical Models in Simulation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Expectation

The expected value of X is denoted by E(X)

E ( x) = xi p ( xi )
all i

E ( x) = x f ( x)dx

If X is continuous

a.k.a the mean, m, , or the 1st moment of X


A measure of the central tendency

The variance of X is denoted by V(X) or var(X) or 2

If X is discrete

Definition:
V(X) = E( (X E[X])2 )
Also,
V(X) = E(X2) ( E(x) )2
A measure of the spread or variation of the possible values of X around the
mean

The standard deviation of X is denoted by


Definition: = V (x)

Expressed in the same units as the mean

Chapter 4. Statistical Models in Simulation

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Expectations
Example: The mean of life of the previous inspection device is:

1
x / 2
x / 2
E ( X ) = xe dx = xe
+ e x / 2 dx = 2
0
2 0
0

To compute variance of X, we first compute E(X2):

1 2 x / 2
x / 2
2
E ( X ) = x e dx = x e
+ e x / 2 dx = 8
0
2 0
0
2

Hence, the variance and standard deviation of the devices life


are:
2

V (X ) = 8 2 = 4

= V (X ) = 2
Chapter 4. Statistical Models in Simulation

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Expectations

1 x / 2
x / 2
E ( X ) = xe dx = xe
+ e x / 2 dx = 2
0
2 0
0

1
x / 2
+ e x / 2 dx = 2
E ( X ) = xe x / 2 dx = xe
0
2 0
0
Partial Integration

u ( x)v' ( x)dx = u ( x)v( x) u ' ( x)v( x)dx


Set
u ( x) = x
v' ( x) = e x / 2

u ' ( x) = 1
v( x) = 2e x / 2

1 x/ 2
1
x/ 2
E ( X ) = xe dx = ( x (2e ) 1 (2e x / 2 )dx)
0
2 0
2
0
Chapter 4. Statistical Models in Simulation

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Useful Statistical Models


In this section, statistical models appropriate to some
application areas are presented.
The areas include:

Queueing systems
Inventory and supply-chain systems
Reliability and maintainability
Limited data

Chapter 4. Statistical Models in Simulation

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Useful models Queueing Systems


In a queueing system, interarrival and service-time patterns
can be probabilistic.
Sample statistical models for interarrival or service time
distribution:
Exponential distribution: if service times are completely random
Normal distribution: fairly constant but with some random variability
(either positive or negative)
Truncated normal distribution: similar to normal distribution but with
restricted value.
Gamma and Weibull distribution: more general than exponential
(involving location of the modes of pdfs and the shapes of tails.)

Chapter 4. Statistical Models in Simulation

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Useful models Inventory and supply chain


In realistic inventory and supply-chain systems, there are at
least three random variables:
The number of units demanded per order or per time period
The time between demands
The lead time = Time between placing an order and the receipt of that
order

Sample statistical models for lead time distribution:


Gamma

Sample statistical models for demand distribution:


Poisson: simple and extensively tabulated.
Negative binomial distribution: longer tail than Poisson (more large
demands).
Geometric: special case of negative binomial given at least one demand
has occurred.

Chapter 4. Statistical Models in Simulation

15

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Useful models Reliability and maintainability


Time to failure (TTF)
Exponential: failures are random
Gamma: for standby redundancy where each component has an
exponential TTF
Weibull: failure is due to the most serious of a large number of
defects in a system of components
Normal: failures are due to wear

Chapter 4. Statistical Models in Simulation

16

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Useful models Other areas


For cases with limited data, some useful distributions
are:
Uniform
Triangular
Beta

Other distribution:
Bernoulli
Binomial
Hyperexponential

Chapter 4. Statistical Models in Simulation

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Discrete Distributions
Discrete random variables are used to describe random
phenomena in which only integer values can occur.
In this section, we will learn about:

Bernoulli trials and Bernoulli distribution


Binomial distribution
Geometric and negative binomial distribution
Poisson distribution

Chapter 4. Statistical Models in Simulation

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Bernoulli Trials and Bernoulli Distribution


Bernoulli Trials:
Consider an experiment consisting of n trials, each can be a success or
a failure.
- Xj = 1 if the j-th experiment is a success
- Xj = 0 if the j-th experiment is a failure

The Bernoulli distribution (one trial):

xj =1
p,
p j ( x j ) = p( x j ) =
,
=

=
q
:
1
p
,
x
0
j

j = 1,2,..., n

where E(Xj) = p and V(Xj) = p(1-p) = pq

Bernoulli process:
The n Bernoulli trials where trails are independent:
p(x1,x2,, xn) = p1(x1)p2(x2) pn(xn)
Chapter 4. Statistical Models in Simulation

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Binomial Distribution
The number of successes in n Bernoulli trials, X, has a binomial
distribution.

n x n x
p q , x = 0,1,2,..., n
p( x) = x
0,
otherwise

The number of
outcomes having the
required number of
successes and
failures

Probability that
there are
x successes and
(n-x) failures

The mean, E(x) = p + p + + p = n*p


The variance, V(X) = pq + pq + + pq = n*pq

Chapter 4. Statistical Models in Simulation

20

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Geometric Distribution

Geometric distribution
The number of Bernoulli trials, X, to achieve the 1st success:

q x 1 p, x = 0,1,2,..., n
p ( x) =
otherwise
0,
E(x) = 1/p, and V(X) = q/p2

Chapter 4. Statistical Models in Simulation

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Negative Binomial Distribution

Negative binomial distribution


The number of Bernoulli trials, X, until the kth success
If Y is a negative binomial distribution with parameters p and k,
then:

y 1 y k k
q p , y = k , k + 1, k + 2,...

p ( x) = k 1
0,
otherwise

y 1 y k k 1
q p {p
p ( x) =
k 1
1
442443 k th success
(k-1 ) successes

E(Y) = k/p, and V(X) = kq/p2

Chapter 4. Statistical Models in Simulation

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution
Poisson distribution describes many random processes quite
well and is mathematically quite simple.
where > 0, pdf and cdf are:

p( x) = x! e , x = 0,1,...
0,
otherwise

i e

i =0

i!

F ( x) =

E(X) = = V(X)

Chapter 4. Statistical Models in Simulation

23

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution
Example: A computer repair person is beeped each
time there is a call for service. The number of beeps per
hour ~ Poisson( = 2 per hour).
The probability of three beeps in the next hour:
p(3)
= 23/3! e-2 = 0.18
also,
p(3)
= F(3) F(2) = 0.857-0.677=0.18
The probability of two or more beeps in a 1-hour period:
p(2 or more)
= 1 ( p(0) + p(1) )
= 1 F(1)
= 0.594

Chapter 4. Statistical Models in Simulation

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Continuous Distributions
Continuous random variables can be used to describe
random phenomena in which the variable can take on any
value in some interval.
In this section, the distributions studied are:

Uniform
Exponential
Weibull
Normal
Lognormal

Chapter 4. Statistical Models in Simulation

25

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Uniform Distribution
A random variable X is uniformly distributed on the interval
(a, b), U(a, b), if its pdf and cdf are:

x<a
0,
x a
F ( x) =
, a x<b
b a
xb
1,

, a xb
f ( x) = b a
0,
otherwise
Properties

P(x1 < X < x2) is proportional to the length of the interval


[F(x2) F(x1) = (x2-x1)/(b-a)]
E(X) = (a+b)/2
V(X) = (b-a)2/12

U(0,1) provides the means to


generate random numbers, from
which random variates can be
generated.
Chapter 4. Statistical Models in Simulation

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Exponential Distribution
A random variable X is exponentially distributed with parameter
> 0 if its pdf and cdf are:
x<0
0,
F ( x ) = x t
x
0 e dt = 1 e , x 0

e x , x 0
f ( x) =
elsewhere
0,
E(X) = 1/ V(X) = 1/2

Chapter 4. Statistical Models in Simulation

27

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Exponential Distribution
Used to model interarrival
times when arrivals are
completely random, and to
model service times that are
highly variable
For several different
exponential pdfs (see figure),
the value of intercept on the
vertical axis is , and all pdfs
eventually intersect.

Chapter 4. Statistical Models in Simulation

28

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Exponential Distribution
Memoryless property
For all s and t greater or equal to 0:
P(X > s+t | X > s) = P(X > t)
Example: A lamp ~ exp( = 1/3 per hour), hence, on average, 1
failure per 3 hours.
- The probability that the lamp lasts longer than its mean life is:
P(X > 3) = 1-(1-e-3/3) = e-1 = 0.368

- The probability that the lamp lasts between 2 to 3 hours is:


P(2 <= X <= 3) = F(3) F(2) = 0.145

- The probability that it lasts for another hour given it is operating for
2.5 hours:
P(X > 3.5 | X > 2.5) = P(X > 1) = e-1/3 = 0.717

Chapter 4. Statistical Models in Simulation

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Exponential Distribution
Memoryless property

P( X > s + t )
P( X > s + t | X > s) =
P( X > s)
e ( s +t )
= s
e
= e t
= P( X > t )

Chapter 4. Statistical Models in Simulation

30

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Weibull Distribution

A random variable X has a Weibull distribution if its pdf has the form:
x 1
x

exp
, x
f ( x) =



0,
otherwise

3 parameters:

( < < )

Location parameter: ,
Scale parameter: , ( > 0)
Shape parameter. , (> 0)

Example: = 0 and = 1:

Chapter 4. Statistical Models in Simulation

31

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Weibull Distribution
Weibull Distribution
x 1
x

exp
, x
f ( x) =

0,
otherwise

For = 1, =0
1
x
1

f ( x) = exp , x
0,
otherwise

When = 1,
X ~ exp( = 1/)

Chapter 4. Statistical Models in Simulation

32

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Normal Distribution
A random variable X is normally distributed if it has the pdf:
1 x 2
1
f ( x) =
exp
, < x <
2
2
Mean: < <
2
Variance: > 0
Denoted as X ~ N(,2)

Special properties:
f ( x) = 0, and lim f ( x) = 0
xlim

x
f(-x)=f(+x); the pdf is symmetric about .
The maximum value of the pdf occurs at x = ; the mean and mode are
equal.

Chapter 4. Statistical Models in Simulation

33

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Normal Distribution
Evaluating the distribution:
Use numerical methods (no closed form)
Independent of and , using the standard normal distribution:
Z ~ N(0,1)
Transformation of variables: let Z = (X - ) / ,

F ( x ) = P ( X x ) = P Z

( x ) /
1 z2 / 2
=
e
dz

2
=

( x ) /

Chapter 4. Statistical Models in Simulation

( z )dz = ( x )

34

, where ( z ) =

1 t 2 / 2
e
dt
2

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Normal Distribution
Example: The time required to load an oceangoing vessel, X, is
distributed as N(12,4), =12, =2
The probability that the vessel is loaded in less than 10 hours:

10 12
F (10) =
= (1) = 0.1587
2
- Using the symmetry property, (1) is the complement of (-1)

Chapter 4. Statistical Models in Simulation

35

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Lognormal Distribution
A random variable X has a lognormal distribution if its pdf has
the form:
1
(ln x ) 2
exp

, x > 0
2
f ( x) = 2 x
2

0,
otherwise

=1,

2=0.5,1,2.

Mean E(X) = e+ /2
2
2
Variance V(X) = e2+ /2 (e - 1)
2

Relationship with normal distribution


When Y ~ N(, 2), then X = eY ~ lognormal(, 2)
Parameters and 2 are not the mean and variance of the lognormal
random variable X

Chapter 4. Statistical Models in Simulation

36

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution
Definition: N(t) is a counting function that represents the
number of events occurred in [0,t].
A counting process {N(t), t>=0} is a Poisson process with
mean rate if:
Arrivals occur one at a time
{N(t), t>=0} has stationary increments
{N(t), t>=0} has independent increments

Properties

( t ) n t
P[ N (t ) = n] =
e ,
n!

for t 0 and n = 0,1,2,...

Equal mean and variance: E[N(t)] = V[N(t)] = t


Stationary increment: The number of arrivals in time s to t is also
Poisson-distributed with mean (t-s)

Chapter 4. Statistical Models in Simulation

37

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution Interarrival Times

Consider the interarrival times of a Possion process (A1, A2, ), where


Ai is the elapsed time between arrival i and arrival i+1

The 1st arrival occurs after time t iff there are no arrivals in the interval [0,t],
hence:
P(A1 > t) = P(N(t) = 0) = e-t
P(A1 <= t) = 1 e-t
[cdf of exp()]

Interarrival times, A1, A2, , are exponentially distributed and independent with
mean 1/
Arrival counts
~ Poisson()

Interarrival time
~ Exp(1/)

Stationary & Independent

Chapter 4. Statistical Models in Simulation

Memoryless

38

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution Splitting and Pooling


Splitting:
Suppose each event of a Poisson process can be classified as Type I,
with probability p and Type II, with probability 1-p.
N(t) = N1(t) + N2(t), where N1(t) and N2(t) are both Poisson processes
with rates p and (1-p)
N(t) ~ Poisson()

p
(1-p)

N1(t) ~ Poisson[p]
N2(t) ~ Poisson[(1-p)]

Pooling:
Suppose two Poisson processes are pooled together
N1(t) + N2(t) = N(t), where N(t) is a Poisson processes with rates 1 + 2
N1(t) ~ Poisson[1]
N2(t) ~ Poisson[2]
Chapter 4. Statistical Models in Simulation

1 + 2

N(t) ~ Poisson(1 + 2)

2
39

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution Empirical Distributions


A distribution whose parameters are the observed values in a
sample of data.
May be used when it is impossible or unnecessary to establish that a
random variable has any particular parametric distribution.
Advantage: no assumption beyond the observed values in the sample.
Disadvantage: sample might not cover the entire range of possible values.

Chapter 4. Statistical Models in Simulation

40

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary
The world that the simulation analyst sees is probabilistic, not
deterministic.
In this chapter:
Reviewed several important probability distributions.
Showed applications of the probability distributions in a simulation context.

Important task in simulation modeling is the collection and


analysis of input data, e.g., hypothesize a distributional form for
the input data.
Student should know:
Difference between discrete, continuous, and empirical distributions.
Poisson process and its properties.

Chapter 4. Statistical Models in Simulation

41

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation Techniques
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 7
Queueing Models

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose
Simulation is often used in the analysis of queueing models.
A simple but typical queueing model

Calling population
Waiting line

Server

Queueing models provide the analyst with a powerful tool for


designing and evaluating the performance of queueing systems.
Typical measures of system performance
Server utilization, length of waiting lines, and delays of customers
For relatively simple systems, compute mathematically
For realistic models of complex systems, simulation is usually required

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Outline
Discuss some well-known models
Not development of queueing theory, for this see other class!

We will deal with

General characteristics of queues


Meanings and relationships of important performance measures
Estimation of mean measures of performance
Effect of varying input parameters
Mathematical solutions of some basic queueing models

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Characteristics of Queueing Systems


Key elements of queueing systems
Customer: refers to anything that arrives at a facility and requires
service, e.g., people, machines, trucks, emails.
Server: refers to any resource that provides the requested service, e.g.,
repairpersons, retrieval machines, runways at airport.
System

Customers

Server

Reception desk

People

Receptionist

Hospital

Patients

Nurses

Airport

Airplanes

Runway

Production line

Cases

Case-packer

Road network

Cars

Traffic light

Grocery

Shoppers

Checkout station

Computer

Jobs

CPU, disk, CD

Network

Packets

Router

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Calling Population
Calling population: the population of potential customers, may be
assumed to be finite or infinite.
Finite population model: if arrival rate depends on the number of
customers being served and waiting, e.g., model of one corporate jet, if it
is being repaired, the repair arrival rate becomes zero.

n-1

Infinite population model: if arrival rate is not affected by the number of


customers being served and waiting, e.g., systems with large population
of potential customers.

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

System Capacity
System Capacity: a limit on the number of customers that may
be in the waiting line or system.
Limited capacity, e.g., an automatic car wash only has room for 10 cars
to wait in line to enter the mechanism.

Waiting line

Server

Unlimited capacity, e.g., concert ticket sales with no limit on the number
of people allowed to wait to purchase tickets.

Waiting line

Chapter 7. Queueing Models

Server

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Arrival Process
For infinite-population models:
In terms of interarrival times of successive customers.
Random arrivals: interarrival times usually characterized by a probability
distribution.
- Most important model: Poisson arrival process (with rate ), where An
represents the interarrival time between customer n-1 and customer n, and is
exponentially distributed (with mean 1/).

Scheduled arrivals: interarrival times can be constant or constant plus or


minus a small random amount to represent early or late arrivals.
- Example: patients to a physician or scheduled airline flight arrivals to an
airport

At least one customer is assumed to always be present, so the server is


never idle, e.g., sufficient raw material for a machine.

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Arrival Process
For finite-population models:
Customer is pending when the customer is outside the queueing system,
e.g., machine-repair problem: a machine is pending when it is
operating, it becomes not pending the instant it demands service from
the repairman.
Runtime of a customer is the length of time from departure from the
queueing system until that customers next arrival to the queue, e.g.,
machine-repair problem, machines are customers and a runtime is time
to failure (TTF).
Let A1(i), A2(i), be the successive runtimes of customer i, and S1(i), S2(i)
be the corresponding successive system times:

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Queue Behavior and Queue Discipline


Queue behavior: the actions of customers while in a queue
waiting for service to begin, for example:
Balk: leave when they see that the line is too long
Renege: leave after being in the line when its moving too slowly
Jockey: move from one line to a shorter line

Queue discipline: the logical ordering of customers in a queue


that determines which customer is chosen for service when a
server becomes free, for example:

First-in-first-out (FIFO)
Last-in-first-out (LIFO)
Service in random order (SIRO)
Shortest processing time first (SPT)
Service according to priority (PR)

Chapter 7. Queueing Models

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Service Times and Service Mechanism


Service times of successive arrivals are denoted by S1, S2, S3.

May be constant or random.


{S1, S2, S3, } is usually characterized as a sequence of independent and
identically distributed random variables, e.g., exponential, Weibull,
gamma, lognormal, and truncated normal distribution.

A queueing system consists of a number of service centers and


interconnected queues.
Each service center consists of some number of servers, c, working in
parallel, upon getting to the head of the line, a customer takes the 1st
available server.

Chapter 7. Queueing Models

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Service Times and Service Mechanism


Example: consider a discount warehouse where customers may:
Serve themselves before paying at the cashier

Chapter 7. Queueing Models

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Service Times and Service Mechanism


Wait for one of the three clerks:

Batch service (a server serving several customers simultaneously), or


customer requires several servers simultaneously.

Chapter 7. Queueing Models

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Service Times and Service Mechanism

Chapter 7. Queueing Models

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Example
Candy production line
Three machines separated by buffers
Buffers have capacity of 1000 candies

Assumption:Allways
sufficient supply of
raw material.
Chapter 7. Queueing Models

15

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Queueing Notation Kendall Notation

A notation system for parallel server queues: A/B/c/N/K

represents the interarrival-time distribution


represents the service-time distribution
represents the number of parallel servers
represents the system capacity
represents the size of the calling population
are usually dropped, if they are infinity

Common symbols for A and B

A
B
c
N
K
N, K
M
D
Ek
H
G

Markov, exponential distribution


Constant, deterministic
Erlang distribution of order k
Hyperexponential distribution
General, arbitrary

Examples

M/M/1// same as M/M/1: Single-server with unlimited capacity and callpopulation. Interarrival and service times are exponentially distributed
G/G/1/5/5: Single-server with capacity 5 and call-population 5.

Chapter 7. Queueing Models

16

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Queueing Notation
Primary performance measures of queueing systems:

Pn
Pn(t)

An
Sn
Wn
WnQ
L(t)
LQ(t)
L
LQ
w
wQ

Chapter 7. Queueing Models

steady-state probability of having n customers in system


probability of n customers in system at time t
arrival rate
effective arrival rate
service rate of one server
server utilization
interarrival time between customers n-1 and n
service time of the n-th arriving customer
total time spent in system by the n-th arriving customer
total time spent in the waiting line by customer n
the number of customers in system at time t
the number of customers in queue at time t
long-run time-average number of customers in system
long-run time-average number of customers in queue
long-run average time spent in system per customer
long-run average time spent in queue per customer

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Evolving of a Queueing System

Number of
customers in the
system

Time

Chapter 7. Queueing Models

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Time-Average Number in System L


Consider a queueing system over a period of time T
Let Ti denote the total time during [0,T] in which the system contained
exactly i customers, the time-weighted-average number in a system is
defined by:

1
T
L =
iTi =
i i
T i =0
T
i =0

Consider the total area under the function is L(t), then,

1
L =
T

1
iTi =
T
i =0

L(t )dt
0

The long-run time-average number of customers in system, with


probability 1:

L=
T

Chapter 7. Queueing Models

L(t )dt T
L

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Time-Average Number in System L


The time-weighted-average number in queue is:

1
1
Q

LQ = iTi =
T i =0
T

LQ (t )dt T
LQ

G/G/1/N/K example: consider the results from the queueing system (N> 4,
K > 3).
L = [0(3) + 1(12) + 2(4) + 3(1)] / 20
= 23 / 20 = 1.15 cusomters

if L(t) = 0
0,
LQ (t ) =
L(t ) 1, if L(t) 1
0(15) + 1(4) + 2(1)
LQ =
= 0.3 customers
20

Chapter 7. Queueing Models

20

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Average Time Spent in System Per Customer w


The average time spent in system per customer, called the average
system time, is:

w=
N

W
i =1

where W1, W2, , WN are the individual times that each of the N
customers spend in the system during [0,T].
w as N
For stable systems: w
If the system under consideration is the queue alone:

1 N Q
w Q = Wi wQ
N i =1

as

G/G/1/N/K example (cont.): the average system time is

w =

W1 + W2 + ... + W5 2 + (8 3) + ... + (20 16)


=
= 4.6 time units
5
5

Chapter 7. Queueing Models

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

The Conservation Equation Littles Law


Conservation equation (a.k.a. Littles law)
Average # in
system

L = w

Average
System time

Arrival rate

L = w as T and N
Holds for almost all queueing systems or subsystems (regardless of the
number of servers, the queue discipline, or other special circumstances).
G/G/1/N/K example (cont.): On average, one arrival every 4 time units
and each arrival spends 4.6 time units in the system. Hence, at an
arbitrary point in time, there is (1/4)(4.6) = 1.15 customers present on
average.

Chapter 7. Queueing Models

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization
Definition: the proportion of time that a server is busy.
Observed server utilization, , is defined over a specified time interval
[0,T].
Long-run server utilization is .
For systems with long-run stability: as T

Chapter 7. Queueing Models

23

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization
For G/G/1// queues:
Any single-server queueing system with average arrival rate
customers per time unit, where average service time E(S) = 1/
time units, infinite queue capacity and calling population.
Conservation equation, L = w, can be applied.
For a stable system, the average arrival rate to the server, s,
must be identical to .
The average number of customers in the server is:

Ls =
T

Chapter 7. Queueing Models

(
T

T T0
L(t ) LQ (t ) dt =
T

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization
In general, for a single-server queue:

Ls = T
Ls =

and

= E (s) =

- For a single-server stable queue:

<1

- For an unstable queue ( > ), long-run server utilization is 1.

Chapter 7. Queueing Models

25

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization
For G/G/c// queues:
A system with c identical servers in parallel.
If an arriving customer finds more than one server idle, the
customer chooses a server without favoring any particular server.
For systems in statistical equilibrium, the average number of busy
servers, Ls, is: Ls, = E(s) = /.
The long-run average server utilization is:

Chapter 7. Queueing Models

Ls

=
, where < c for stable systems
c c

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization and System Performance


System performance varies widely for a given utilization .
For example, a D/D/1 queue where E(A) = 1/ and E(S) = 1/,
where:
L = = /, w = E(S) = 1/, LQ = WQ = 0.
- By varying and , server utilization can assume any value between 0
and 1.
- Yet there is never any line.

In general, variability of interarrival and service times causes lines


to fluctuate in length.

Chapter 7. Queueing Models

27

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization and System Performance

Consider the system is simulated


with service times: S1= 9, S2=12,

Example: A physician who


schedules patients every 10
minutes and spends Si minutes
with the i-th patient:

S3 = 9, S4 = 9, S5 = 9, .

The system becomes:

9 minutes with probability 0.9


Si =
12 minutes with probability 0.1
Arrivals are deterministic,
A1 = A2 = = -1 = 10.
Services are stochastic
- E(Si) = 9.3 min
- V(S0) = 0.81 min2
- = 0.9 min

The occurrence of a relatively


long service time (S2 = 12)
causes a waiting line to form
temporarily.

On average, the physician's


utilization = = / = 0.93 < 1.

Chapter 7. Queueing Models

28

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Costs in Queueing Problems


Costs can be associated with various aspects of the waiting line or
servers:

System incurs a cost for each customer in the queue, say at a rate of $10
per hour per customer.
WjQ is the time
- The average cost per customer is:
customer j spends
N

$10 W jQ

j =1

in queue

= $10 w Q

- If customers per hour arrive (on average), the average cost per
hour is:

L
$
10
customer $10 w Q
Q
= $10 w Q =

hour customer
hour

Server may also impose costs on the system, if a group of c parallel


servers (1 c ) have utilization r, each server imposes a cost of $5 per
hour while busy.
- The total server cost is: $5 c
Chapter 7. Queueing Models

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Markovian Models

Markovian models:

Exponential-distributed arrival process (mean arrival rate = 1/).


Service times may be exponentially (M) or arbitrary (G) distributed.
Queue discipline is FIFO.
A queueing system is in statistical equilibrium if the probability that the
system is in a given state is not time dependent:

P ( L(t ) = n) = Pn (t ) = Pn
Mathematical models in this chapter can be used to obtain approximate
results even when the model assumptions do not strictly hold, as a
rough guide.
Simulation can be used for more refined analysis, more faithful
representation for complex systems.
Chapter 7. Queueing Models

30

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Markovian Models

Properties of processes with statistical equilibrium


The state of statistical equilibrium is reached from any starting
state.
The process remain in statistical equilibrium once it has reached
it.

Chapter 7. Queueing Models

31

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Markovian Models

For the simple model studied in this chapter, the steady-state


parameter, L, the time-average number of customers in the
system is:

L=

nP

n =0

Apply Littles equation, L= w, to the whole system and to the


queue alone:

w=

wQ = w

LQ = wQ

G/G/c// example: to have a statistical equilibrium, a


necessary and sufficient condition is:

=
<1
c
Chapter 7. Queueing Models

32

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/G/1 Queues

Single-server queues with Poisson arrivals and unlimited capacity.


Suppose service times have mean 1/ and variance 2 and = / < 1, the
steady-state parameters of M/G/1 queue:

P0 = 1

2 (1 + 2 2 )
L=+
2(1 )

The particular
distribution is not
known!

2 (1 + 2 2 )
LQ =
2(1 )
(1 / 2 + 2 )
w= +

2(1 )
1

(1 / 2 + 2 )
wQ =
2(1 )
Chapter 7. Queueing Models

33

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/G/1 Queues
There are no simple expression for the steady-state probabilities P0,
P1,
L LQ = is the time-average number of customers being served.
Average length of queue, LQ, can be rewritten as:

2 2
+
LQ =
2(1 ) 2(1 )
If and are held constant, LQ depends on the variability, 2, of the
service times.

Chapter 7. Queueing Models

34

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/G/1 Queues

Example: Two workers competing for a job, Able claims to be faster than
Baker on average, but Baker claims to be more consistent,

Poisson arrivals at rate = 2 per hour (1/30 per minute).


Able: 1/ = 24 minutes and 2 = 202 = 400 minutes2:

(1 / 30) 2 [24 2 + 400]


LQ =
= 2.711 customers
2(1 4 / 5)
- The proportion of arrivals who find Able idle and thus experience no delay is P0 = 1- = 1/5

= 20%.

Baker: 1/ = 25 minutes and 2 = 22 = 4 minutes2:

(1 / 30) 2 [252 + 4]
LQ =
= 2.097 customers
2(1 5 / 6)
- The proportion of arrivals who find Baker idle and thus experience no delay is P0 = 1- =
1/6 = 16.7%.

Although working faster on average, Ables greater service variability results in an


average queue length about 30% greater than Bakers.

Chapter 7. Queueing Models

35

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/M/1 Queues
Suppose the service times in an M/G/1 queue are exponentially
distributed with mean 1/, then the variance is 2 = 1/2.
M/M/1 queue is a useful approximate model when service times
have standard deviation approximately equal to their means.
The steady-state parameters
=

Pn = (1 ) n
L=

P0 = 1

2
2
=
LQ =
( ) 1
w=
wQ =
Chapter 7. Queueing Models

1
1
=
(1 )

=
( ) (1 )
36

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/M/1 Queues
Single-chair unisex hair-styling shop
Interarrival and service times are exponentially distributed
=2 customers/hour and =3 customers/hour
=

2
=
3

L=
1
3

P0 = 1 =
1

n =0

Chapter 7. Queueing Models

2
= 2 Customers
3 2

4
4
2
LQ =
=
= Customers
( ) 3(3 2) 3
4 2
L = LQ + = + = 2 Customers
3 3

1 2
4
P2 = =
3 3
27
P 4 = 1 Pn =

2
= 1 hour
2
1
1 2
wQ = w = 1 = hour
3 3

w=

1 2 2
P1 = =
3 3 9

16
81

37

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/M/1 Queues
Example: M/M/1 queue with
service rate =10 customers
per hour.

Consider how L and w


increase as arrival rate, ,
increases from 5 to 8.64 by
increments of 20%

5,0

6,0

7,2

8,6

10,0

0,5
1,0

0,6
1,5

0,7
2,6

0,9
6,4

1,0

0,2

0,3

0,4

0,7

20

L
18 w

If / 1, waiting lines tend to


continually grow in length

Number of Customers

16

Increase in average system


time (w) and average number
in system (L) is highly
nonlinear as a function of .

14
12
10
8
6
4
2
0
0.5

0.6

0.7

0.8

0.9

rho

Chapter 7. Queueing Models

38

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Effect of Utilization and Service Variability


For almost all queues, if lines are too long, they can be reduced by
decreasing server utilization () or by decreasing the service time
variability (2).
A measure of the variability of a distribution,
coefficient of variation (cv):

(cv) 2 =

V (X )

[E ( X )]2

The larger cv is, the more variable is the distribution relative to its
expected value
For exponential service times with rate
- E(X)=1/
- V(X)=1/2
cv=1
Chapter 7. Queueing Models

39

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Effect of Utilization and Service Variability


Consider LQ for any M/G/1 queue:
2 (1 + 2 2 )
LQ =
2(1 )
2 1 + (cv) 2

1
2

LQ for M/M/1
queue
Corrects the M/M/1
formula to account
for a non-exponential
service time distn

Chapter 7. Queueing Models

40

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
M/M/c// queue: c servers operating in parallel
Arrival process is poisson with rate
Each server has an independent and identical exponential service-time
distribution, with mean 1/.
To achieve statistical equilibrium, the offered load (/) must satisfy /<c,
where /(c) = is the server utilization.

Calling population

1
Waiting line

c
Chapter 7. Queueing Models

41

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue

The steady-state parameters


=

c 1 ( / ) n c 1 c

P0 =
+
n =0 n! c! c
(c ) c P0
P (L ( ) c ) =
c!(1 )
Probability
that all servers
are busy

(c ) c +1 P0
P (L ( ) c )
c
L = c +
=

+
1
c(c!)(1 ) 2
L
w=

P (L ( ) c )
LQ =
1
L LQ = c
Chapter 7. Queueing Models

42

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
Probability of empty system

Chapter 7. Queueing Models

Number of customers in system

43

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
Other common multiserver queueing models
2 1 + (cv) 2

LQ =
2
1

Corrects the M/M/1


formula

LQ for M/M/1
queue

M/G/c/: general service times and c parallel server. The parameters can
be approximated from those of the M/M/c// model.
M/G/: general service times and infinite number of servers.
M/M/c/N/: service times are exponentially distributed at rate and c
servers where the total system capacity is N c customer. When an arrival
occurs and the system is full, that arrival is turned away.
Chapter 7. Queueing Models

44

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
M/G/: general service times and infinite number of servers
- customer is its own server
- service capacity far exceeds service demand
- when we want to know how many servers are required so that
customers are rarely delayed

Pn = e

P0 = e

w=

()

n!

, n = 0,1, K

wQ = 0
L=

LQ = 0
Chapter 7. Queueing Models

45

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
How many users can be logged in simultaneously in a computer
system

Customers log on with rate =500 per hour


Stay connected in average for 1/=180 minutes = 3 hours
For planning purposes it is pretended that the simultaneous logged in
users is infinite
Expected number of simultaneous users L

L = = 500 3 = 1500

To ensure providing adequate capacity 95% of the time, the number of


parallel users c has to be restricted
e 1500 (1500) n
0.95
P ( L() c) = Pn =
n!
n =0
n =0
c

The capacity c=1564 simultaneous users satisfies this requirement


Chapter 7. Queueing Models

46

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
M/M/c/N/: service times are exponentially distributed at rate and
c servers where the total system capacity is N c customer
when an arrival occurs and the system is full, that arrival is turned away
Effective arrival rate e is defined as the mean number of arrivals per
time unit who enter and remain in the system
1

P0
PN
LQ

a n a c N n c
= 1 + +

!
!
n
c
1
1
=
=
+
n
n
c

aN
=
P0
c!c N c
P0 a c
1 N c ( N c) N c (1 )
=
c!(1 )
= (1 PN )

wQ =

LQ

w = wQ +

(1-PN) probability that a


customer will find a space
and be able to enter the
system

L = e w
Chapter 7. Queueing Models

47

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
Single-chair unisex hair-styling shop (again!)
Space only for 3 customers:
one in service and two waiting
First computer P0
P0 =

1
2 2 3 2 n 1
1 + +
3 3 n = 2 3

Queue time
wQ =

= 0.415

PN = P3 =

()
1!1

P0 =

w = wQ +

8
= 0.123
65

28
= 0.246
114

66
= 0.579
114

Expected number of
customers in shop

Average of the queue


LQ = 0.431

L = e w =

Effective arrival rate

66
= 1.015
65

Probability of busy shop

1 P0 = e = 0.585

8 114

e = 21 =
= 1.754
65 65

Chapter 7. Queueing Models

System time, time in shop

P(system is full)
2 3
3
2

LQ

48

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Finite-Population Models

In practical problems calling population is finite

When the calling population is small, the presence of one or more customers in
the system has a strong effect on the distribution of future arrivals.

Consider a finite-calling population model with K customers (M/M/c/K/K)

The time between the end of one service visit and the next call for service is
exponentially distributed with mean = 1/.
Service times are also exponentially distributed with mean 1/.
c parallel servers and system capacity is K.

K Customers

1
Waiting line

c
Chapter 7. Queueing Models

49

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Finite-Population Models


Some of the steady-state probabilities:
n
c 1 K n K

K!

P0 = +
n c
n =0 n n =c ( K n)!c!c
K n
n = 0,1,..., c 1
P0 ,
n
Pn =
n

!
K

( K n)!c!c n c , n = c, c + 1,...K

L = nPn ,
n =0

w = L / e ,

e
c

where e is the long run effective arrival rate of customers to queue (or entering/exiting service)
K

e = ( K n)Pn
n =0

Chapter 7. Queueing Models

50

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Finite-Population Models


Example: two workers who are responsible for 10 milling
machines.

Machines run on the average for 20 minutes, then require an


average 5-minute service period, both times exponentially
distributed: = 1/20 and = 1/5.
All of the performance measures depend on P0:
n
21 10 5 n 10
10!
5
P0 = +

n2
n
20
n
(
10

)!
2
!
2

20
n =0
n=2

= 0.065

Then, we can obtain the other Pn, and can compute the
expected number of machines in system:
10

L=

nP

= 3.17 machines

n =0

The average number of running machines:


K L = 10 3.17 = 6.83 machines
Chapter 7. Queueing Models

51

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Networks of Queues
Many systems are naturally modeled as networks of single queues
customers departing from one queue may be routed to another

The following results assume a stable system with infinite calling


population and no limit on system capacity:

Provided that no customers are created or destroyed in the queue, then


the departure rate out of a queue is the same as the arrival rate into the
queue, over the long run.
If customers arrive to queue i at rate i, and a fraction 0 pij 1 of them
are routed to queue j upon departure, then the arrival rate from queue i
to queue j is i pij over the long run.

Chapter 7. Queueing Models

52

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Networks of Queues
The overall arrival rate into queue j:

j = aj +

p
i

ij

all i

Arrival rate
from outside
the network

Sum of arrival rates


from other queues
in network

If queue j has cj < parallel servers, each working at rate j, then the
long-run utilization of each server is j=j /(cj) (where j < 1 for stable
queue).
If arrivals from outside the network form a Poisson process with rate aj
for each queue j, and if there are cj identical servers delivering
exponentially distributed service times with mean 1/j, then, in steady
state, queue j behaves likes an M/M/cj queue with arrival rate

j = aj +

p
i

ij

all i

Chapter 7. Queueing Models

53

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Network of Queues
Customer
Population

80 cust
hour

0.4

c=
c=1

0.6

Discount store example:

Suppose customers arrive at the rate 80 per hour and


40% choose self-service.
Hence:
- Arrival rate to service center 1 is 1 = 80(0.4) = 32 per hour
- Arrival rate to service center 2 is 2 = 80(0.6) = 48 per hour.

c2 = 3 clerks and 2 = 20 customers per hour.

The long-run utilization of the clerks is:


2 = 48/(3*20) = 0.8
All customers must see the cashier at service center 3,
the overall rate to service center 3 is 3 = 1 + 2 = 80
per hour.
- If 3 = 90 per hour, then the utilization of the cashier is:
3 = 80/90 = 0.89

Chapter 7. Queueing Models

54

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary

Introduced basic concepts of queueing models.


Show how simulation, and some times mathematical analysis, can be used
to estimate the performance measures of a system.
Commonly used performance measures: L, LQ, w, wQ, , and e.
When simulating any system that evolves over time, analyst must decide
whether to study transient behavior or steady-state behavior.

Simple formulas exist for the steady-state behavior of some queues.

Simple models can be solved mathematically, and can be useful in providing


a rough estimate of a performance measure.

Chapter 7. Queueing Models

55

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation Techniques
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 7
Queueing Models

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose
Simulation is often used in the analysis of queueing models.
A simple but typical queueing model

Calling population
Waiting line

Server

Queueing models provide the analyst with a powerful tool for


designing and evaluating the performance of queueing systems.
Typical measures of system performance
Server utilization, length of waiting lines, and delays of customers
For relatively simple systems, compute mathematically
For realistic models of complex systems, simulation is usually required

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Outline
Discuss some well-known models
Not development of queueing theory, for this see other class!

We will deal with

General characteristics of queues


Meanings and relationships of important performance measures
Estimation of mean measures of performance
Effect of varying input parameters
Mathematical solutions of some basic queueing models

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Characteristics of Queueing Systems


Key elements of queueing systems
Customer: refers to anything that arrives at a facility and requires
service, e.g., people, machines, trucks, emails.
Server: refers to any resource that provides the requested service, e.g.,
repairpersons, retrieval machines, runways at airport.
System

Customers

Server

Reception desk

People

Receptionist

Hospital

Patients

Nurses

Airport

Airplanes

Runway

Production line

Cases

Case-packer

Road network

Cars

Traffic light

Grocery

Shoppers

Checkout station

Computer

Jobs

CPU, disk, CD

Network

Packets

Router

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Calling Population
Calling population: the population of potential customers, may be
assumed to be finite or infinite.
Finite population model: if arrival rate depends on the number of
customers being served and waiting, e.g., model of one corporate jet, if it
is being repaired, the repair arrival rate becomes zero.

n-1

Infinite population model: if arrival rate is not affected by the number of


customers being served and waiting, e.g., systems with large population
of potential customers.

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

System Capacity
System Capacity: a limit on the number of customers that may
be in the waiting line or system.
Limited capacity, e.g., an automatic car wash only has room for 10 cars
to wait in line to enter the mechanism.

Waiting line

Server

Unlimited capacity, e.g., concert ticket sales with no limit on the number
of people allowed to wait to purchase tickets.

Waiting line

Chapter 7. Queueing Models

Server

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Arrival Process
For infinite-population models:
In terms of interarrival times of successive customers.
Random arrivals: interarrival times usually characterized by a probability
distribution.
- Most important model: Poisson arrival process (with rate ), where An
represents the interarrival time between customer n-1 and customer n, and is
exponentially distributed (with mean 1/).

Scheduled arrivals: interarrival times can be constant or constant plus or


minus a small random amount to represent early or late arrivals.
- Example: patients to a physician or scheduled airline flight arrivals to an
airport

At least one customer is assumed to always be present, so the server is


never idle, e.g., sufficient raw material for a machine.

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Arrival Process
For finite-population models:
Customer is pending when the customer is outside the queueing system,
e.g., machine-repair problem: a machine is pending when it is
operating, it becomes not pending the instant it demands service from
the repairman.
Runtime of a customer is the length of time from departure from the
queueing system until that customers next arrival to the queue, e.g.,
machine-repair problem, machines are customers and a runtime is time
to failure (TTF).
Let A1(i), A2(i), be the successive runtimes of customer i, and S1(i), S2(i)
be the corresponding successive system times:

Chapter 7. Queueing Models

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Queue Behavior and Queue Discipline


Queue behavior: the actions of customers while in a queue
waiting for service to begin, for example:
Balk: leave when they see that the line is too long
Renege: leave after being in the line when its moving too slowly
Jockey: move from one line to a shorter line

Queue discipline: the logical ordering of customers in a queue


that determines which customer is chosen for service when a
server becomes free, for example:

First-in-first-out (FIFO)
Last-in-first-out (LIFO)
Service in random order (SIRO)
Shortest processing time first (SPT)
Service according to priority (PR)

Chapter 7. Queueing Models

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Service Times and Service Mechanism


Service times of successive arrivals are denoted by S1, S2, S3.

May be constant or random.


{S1, S2, S3, } is usually characterized as a sequence of independent and
identically distributed random variables, e.g., exponential, Weibull,
gamma, lognormal, and truncated normal distribution.

A queueing system consists of a number of service centers and


interconnected queues.
Each service center consists of some number of servers, c, working in
parallel, upon getting to the head of the line, a customer takes the 1st
available server.

Chapter 7. Queueing Models

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Service Times and Service Mechanism


Example: consider a discount warehouse where customers may:
Serve themselves before paying at the cashier

Chapter 7. Queueing Models

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Service Times and Service Mechanism


Wait for one of the three clerks:

Batch service (a server serving several customers simultaneously), or


customer requires several servers simultaneously.

Chapter 7. Queueing Models

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Service Times and Service Mechanism

Chapter 7. Queueing Models

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Example
Candy production line
Three machines separated by buffers
Buffers have capacity of 1000 candies

Assumption:Allways
sufficient supply of
raw material.
Chapter 7. Queueing Models

15

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Queueing Notation Kendall Notation

A notation system for parallel server queues: A/B/c/N/K

represents the interarrival-time distribution


represents the service-time distribution
represents the number of parallel servers
represents the system capacity
represents the size of the calling population
are usually dropped, if they are infinity

Common symbols for A and B

A
B
c
N
K
N, K
M
D
Ek
H
G

Markov, exponential distribution


Constant, deterministic
Erlang distribution of order k
Hyperexponential distribution
General, arbitrary

Examples

M/M/1// same as M/M/1: Single-server with unlimited capacity and callpopulation. Interarrival and service times are exponentially distributed
G/G/1/5/5: Single-server with capacity 5 and call-population 5.

Chapter 7. Queueing Models

16

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Queueing Notation
Primary performance measures of queueing systems:

Pn
Pn(t)

An
Sn
Wn
WnQ
L(t)
LQ(t)
L
LQ
w
wQ

Chapter 7. Queueing Models

steady-state probability of having n customers in system


probability of n customers in system at time t
arrival rate
effective arrival rate
service rate of one server
server utilization
interarrival time between customers n-1 and n
service time of the n-th arriving customer
total time spent in system by the n-th arriving customer
total time spent in the waiting line by customer n
the number of customers in system at time t
the number of customers in queue at time t
long-run time-average number of customers in system
long-run time-average number of customers in queue
long-run average time spent in system per customer
long-run average time spent in queue per customer

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Evolving of a Queueing System

Number of
customers in the
system

Time

Chapter 7. Queueing Models

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Time-Average Number in System L


Consider a queueing system over a period of time T
Let Ti denote the total time during [0,T] in which the system contained
exactly i customers, the time-weighted-average number in a system is
defined by:

1
T
L =
iTi =
i i
T i =0
T
i =0

Consider the total area under the function is L(t), then,

1
L =
T

1
iTi =
T
i =0

L(t )dt
0

The long-run time-average number of customers in system, with


probability 1:

L=
T

Chapter 7. Queueing Models

L(t )dt T
L

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Time-Average Number in System L


The time-weighted-average number in queue is:

1
1
Q

LQ = iTi =
T i =0
T

LQ (t )dt T
LQ

G/G/1/N/K example: consider the results from the queueing system (N> 4,
K > 3).
L = [0(3) + 1(12) + 2(4) + 3(1)] / 20
= 23 / 20 = 1.15 cusomters

if L(t) = 0
0,
LQ (t ) =
L(t ) 1, if L(t) 1
0(15) + 1(4) + 2(1)
LQ =
= 0.3 customers
20

Chapter 7. Queueing Models

20

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Average Time Spent in System Per Customer w


The average time spent in system per customer, called the average
system time, is:

w=
N

W
i =1

where W1, W2, , WN are the individual times that each of the N
customers spend in the system during [0,T].
w as N
For stable systems: w
If the system under consideration is the queue alone:

1 N Q
w Q = Wi wQ
N i =1

as

G/G/1/N/K example (cont.): the average system time is

w =

W1 + W2 + ... + W5 2 + (8 3) + ... + (20 16)


=
= 4.6 time units
5
5

Chapter 7. Queueing Models

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

The Conservation Equation Littles Law


Conservation equation (a.k.a. Littles law)
Average # in
system

L = w

Average
System time

Arrival rate

L = w as T and N
Holds for almost all queueing systems or subsystems (regardless of the
number of servers, the queue discipline, or other special circumstances).
G/G/1/N/K example (cont.): On average, one arrival every 4 time units
and each arrival spends 4.6 time units in the system. Hence, at an
arbitrary point in time, there is (1/4)(4.6) = 1.15 customers present on
average.

Chapter 7. Queueing Models

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization
Definition: the proportion of time that a server is busy.
Observed server utilization, , is defined over a specified time interval
[0,T].
Long-run server utilization is .
For systems with long-run stability: as T

Chapter 7. Queueing Models

23

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization
For G/G/1// queues:
Any single-server queueing system with average arrival rate
customers per time unit, where average service time E(S) = 1/
time units, infinite queue capacity and calling population.
Conservation equation, L = w, can be applied.
For a stable system, the average arrival rate to the server, s,
must be identical to .
The average number of customers in the server is:

Ls =
T

Chapter 7. Queueing Models

(
T

T T0
L(t ) LQ (t ) dt =
T

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization
In general, for a single-server queue:

Ls = T
Ls =

and

= E (s) =

- For a single-server stable queue:

<1

- For an unstable queue ( > ), long-run server utilization is 1.

Chapter 7. Queueing Models

25

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization
For G/G/c// queues:
A system with c identical servers in parallel.
If an arriving customer finds more than one server idle, the
customer chooses a server without favoring any particular server.
For systems in statistical equilibrium, the average number of busy
servers, Ls, is: Ls, = E(s) = /.
The long-run average server utilization is:

Chapter 7. Queueing Models

Ls

=
, where < c for stable systems
c c

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization and System Performance


System performance varies widely for a given utilization .
For example, a D/D/1 queue where E(A) = 1/ and E(S) = 1/,
where:
L = = /, w = E(S) = 1/, LQ = WQ = 0.
- By varying and , server utilization can assume any value between 0
and 1.
- Yet there is never any line.

In general, variability of interarrival and service times causes lines


to fluctuate in length.

Chapter 7. Queueing Models

27

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Server Utilization and System Performance

Consider the system is simulated


with service times: S1= 9, S2=12,

Example: A physician who


schedules patients every 10
minutes and spends Si minutes
with the i-th patient:

S3 = 9, S4 = 9, S5 = 9, .

The system becomes:

9 minutes with probability 0.9


Si =
12 minutes with probability 0.1
Arrivals are deterministic,
A1 = A2 = = -1 = 10.
Services are stochastic
- E(Si) = 9.3 min
- V(S0) = 0.81 min2
- = 0.9 min

The occurrence of a relatively


long service time (S2 = 12)
causes a waiting line to form
temporarily.

On average, the physician's


utilization = = / = 0.93 < 1.

Chapter 7. Queueing Models

28

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Costs in Queueing Problems


Costs can be associated with various aspects of the waiting line or
servers:

System incurs a cost for each customer in the queue, say at a rate of $10
per hour per customer.
WjQ is the time
- The average cost per customer is:
customer j spends
N

$10 W jQ

j =1

in queue

= $10 w Q

- If customers per hour arrive (on average), the average cost per
hour is:

L
$
10
customer $10 w Q
Q
= $10 w Q =

hour customer
hour

Server may also impose costs on the system, if a group of c parallel


servers (1 c ) have utilization r, each server imposes a cost of $5 per
hour while busy.
- The total server cost is: $5 c
Chapter 7. Queueing Models

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Markovian Models

Markovian models:

Exponential-distributed arrival process (mean arrival rate = 1/).


Service times may be exponentially (M) or arbitrary (G) distributed.
Queue discipline is FIFO.
A queueing system is in statistical equilibrium if the probability that the
system is in a given state is not time dependent:

P ( L(t ) = n) = Pn (t ) = Pn
Mathematical models in this chapter can be used to obtain approximate
results even when the model assumptions do not strictly hold, as a
rough guide.
Simulation can be used for more refined analysis, more faithful
representation for complex systems.
Chapter 7. Queueing Models

30

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Markovian Models

Properties of processes with statistical equilibrium


The state of statistical equilibrium is reached from any starting
state.
The process remain in statistical equilibrium once it has reached
it.

Chapter 7. Queueing Models

31

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Markovian Models

For the simple model studied in this chapter, the steady-state


parameter, L, the time-average number of customers in the
system is:

L=

nP

n =0

Apply Littles equation, L= w, to the whole system and to the


queue alone:

w=

wQ = w

LQ = wQ

G/G/c// example: to have a statistical equilibrium, a


necessary and sufficient condition is:

=
<1
c
Chapter 7. Queueing Models

32

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/G/1 Queues

Single-server queues with Poisson arrivals and unlimited capacity.


Suppose service times have mean 1/ and variance 2 and = / < 1, the
steady-state parameters of M/G/1 queue:

P0 = 1

2 (1 + 2 2 )
L=+
2(1 )

The particular
distribution is not
known!

2 (1 + 2 2 )
LQ =
2(1 )
(1 / 2 + 2 )
w= +

2(1 )
1

(1 / 2 + 2 )
wQ =
2(1 )
Chapter 7. Queueing Models

33

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/G/1 Queues
There are no simple expression for the steady-state probabilities P0,
P1,
L LQ = is the time-average number of customers being served.
Average length of queue, LQ, can be rewritten as:

2 2
+
LQ =
2(1 ) 2(1 )
If and are held constant, LQ depends on the variability, 2, of the
service times.

Chapter 7. Queueing Models

34

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/G/1 Queues

Example: Two workers competing for a job, Able claims to be faster than
Baker on average, but Baker claims to be more consistent,

Poisson arrivals at rate = 2 per hour (1/30 per minute).


Able: 1/ = 24 minutes and 2 = 202 = 400 minutes2:

(1 / 30) 2 [24 2 + 400]


LQ =
= 2.711 customers
2(1 4 / 5)
- The proportion of arrivals who find Able idle and thus experience no delay is P0 = 1- = 1/5

= 20%.

Baker: 1/ = 25 minutes and 2 = 22 = 4 minutes2:

(1 / 30) 2 [252 + 4]
LQ =
= 2.097 customers
2(1 5 / 6)
- The proportion of arrivals who find Baker idle and thus experience no delay is P0 = 1- =
1/6 = 16.7%.

Although working faster on average, Ables greater service variability results in an


average queue length about 30% greater than Bakers.

Chapter 7. Queueing Models

35

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/M/1 Queues
Suppose the service times in an M/G/1 queue are exponentially
distributed with mean 1/, then the variance is 2 = 1/2.
M/M/1 queue is a useful approximate model when service times
have standard deviation approximately equal to their means.
The steady-state parameters
=

Pn = (1 ) n
L=

P0 = 1

2
2
=
LQ =
( ) 1
w=
wQ =
Chapter 7. Queueing Models

1
1
=
(1 )

=
( ) (1 )
36

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/M/1 Queues
Single-chair unisex hair-styling shop
Interarrival and service times are exponentially distributed
=2 customers/hour and =3 customers/hour
=

2
=
3

L=
1
3

P0 = 1 =
1

n =0

Chapter 7. Queueing Models

2
= 2 Customers
3 2

4
4
2
LQ =
=
= Customers
( ) 3(3 2) 3
4 2
L = LQ + = + = 2 Customers
3 3

1 2
4
P2 = =
3 3
27
P 4 = 1 Pn =

2
= 1 hour
2
1
1 2
wQ = w = 1 = hour
3 3

w=

1 2 2
P1 = =
3 3 9

16
81

37

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

M/M/1 Queues
Example: M/M/1 queue with
service rate =10 customers
per hour.

Consider how L and w


increase as arrival rate, ,
increases from 5 to 8.64 by
increments of 20%

5,0

6,0

7,2

8,6

10,0

0,5
1,0

0,6
1,5

0,7
2,6

0,9
6,4

1,0

0,2

0,3

0,4

0,7

20

L
18 w

If / 1, waiting lines tend to


continually grow in length

Number of Customers

16

Increase in average system


time (w) and average number
in system (L) is highly
nonlinear as a function of .

14
12
10
8
6
4
2
0
0.5

0.6

0.7

0.8

0.9

rho

Chapter 7. Queueing Models

38

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Effect of Utilization and Service Variability


For almost all queues, if lines are too long, they can be reduced by
decreasing server utilization () or by decreasing the service time
variability (2).
A measure of the variability of a distribution,
coefficient of variation (cv):

(cv) 2 =

V (X )

[E ( X )]2

The larger cv is, the more variable is the distribution relative to its
expected value
For exponential service times with rate
- E(X)=1/
- V(X)=1/2
cv=1
Chapter 7. Queueing Models

39

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Effect of Utilization and Service Variability


Consider LQ for any M/G/1 queue:
2 (1 + 2 2 )
LQ =
2(1 )
2 1 + (cv) 2

1
2

LQ for M/M/1
queue
Corrects the M/M/1
formula to account
for a non-exponential
service time distn

Chapter 7. Queueing Models

40

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
M/M/c// queue: c servers operating in parallel
Arrival process is poisson with rate
Each server has an independent and identical exponential service-time
distribution, with mean 1/.
To achieve statistical equilibrium, the offered load (/) must satisfy /<c,
where /(c) = is the server utilization.

Calling population

1
Waiting line

c
Chapter 7. Queueing Models

41

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue

The steady-state parameters


=

c 1 ( / ) n c 1 c

P0 =
+
n =0 n! c! c
(c ) c P0
P (L ( ) c ) =
c!(1 )
Probability
that all servers
are busy

(c ) c +1 P0
P (L ( ) c )
c
L = c +
=

+
1
c(c!)(1 ) 2
L
w=

P (L ( ) c )
LQ =
1
L LQ = c
Chapter 7. Queueing Models

42

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
Probability of empty system

Chapter 7. Queueing Models

Number of customers in system

43

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
Other common multiserver queueing models
2 1 + (cv) 2

LQ =
2
1

Corrects the M/M/1


formula

LQ for M/M/1
queue

M/G/c/: general service times and c parallel server. The parameters can
be approximated from those of the M/M/c// model.
M/G/: general service times and infinite number of servers.
M/M/c/N/: service times are exponentially distributed at rate and c
servers where the total system capacity is N c customer. When an arrival
occurs and the system is full, that arrival is turned away.
Chapter 7. Queueing Models

44

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
M/G/: general service times and infinite number of servers
- customer is its own server
- service capacity far exceeds service demand
- when we want to know how many servers are required so that
customers are rarely delayed

Pn = e

P0 = e

w=

()

n!

, n = 0,1, K

wQ = 0
L=

LQ = 0
Chapter 7. Queueing Models

45

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
How many users can be logged in simultaneously in a computer
system

Customers log on with rate =500 per hour


Stay connected in average for 1/=180 minutes = 3 hours
For planning purposes it is pretended that the simultaneous logged in
users is infinite
Expected number of simultaneous users L

L = = 500 3 = 1500

To ensure providing adequate capacity 95% of the time, the number of


parallel users c has to be restricted
e 1500 (1500) n
0.95
P ( L() c) = Pn =
n!
n =0
n =0
c

The capacity c=1564 simultaneous users satisfies this requirement


Chapter 7. Queueing Models

46

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
M/M/c/N/: service times are exponentially distributed at rate and
c servers where the total system capacity is N c customer
when an arrival occurs and the system is full, that arrival is turned away
Effective arrival rate e is defined as the mean number of arrivals per
time unit who enter and remain in the system
1

P0
PN
LQ

a n a c N n c
= 1 + +

!
!
n
c
1
1
=
=
+
n
n
c

aN
=
P0
c!c N c
P0 a c
1 N c ( N c) N c (1 )
=
c!(1 )
= (1 PN )

wQ =

LQ

w = wQ +

(1-PN) probability that a


customer will find a space
and be able to enter the
system

L = e w
Chapter 7. Queueing Models

47

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multiserver Queue
Single-chair unisex hair-styling shop (again!)
Space only for 3 customers:
one in service and two waiting
First computer P0
P0 =

1
2 2 3 2 n 1
1 + +
3 3 n = 2 3

Queue time
wQ =

= 0.415

PN = P3 =

()
1!1

P0 =

w = wQ +

8
= 0.123
65

28
= 0.246
114

66
= 0.579
114

Expected number of
customers in shop

Average of the queue


LQ = 0.431

L = e w =

Effective arrival rate

66
= 1.015
65

Probability of busy shop

1 P0 = e = 0.585

8 114

e = 21 =
= 1.754
65 65

Chapter 7. Queueing Models

System time, time in shop

P(system is full)
2 3
3
2

LQ

48

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Finite-Population Models

In practical problems calling population is finite

When the calling population is small, the presence of one or more customers in
the system has a strong effect on the distribution of future arrivals.

Consider a finite-calling population model with K customers (M/M/c/K/K)

The time between the end of one service visit and the next call for service is
exponentially distributed with mean = 1/.
Service times are also exponentially distributed with mean 1/.
c parallel servers and system capacity is K.

K Customers

1
Waiting line

c
Chapter 7. Queueing Models

49

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Finite-Population Models


Some of the steady-state probabilities:
n
c 1 K n K

K!

P0 = +
n c
n =0 n n =c ( K n)!c!c
K n
n = 0,1,..., c 1
P0 ,
n
Pn =
n

!
K

( K n)!c!c n c , n = c, c + 1,...K

L = nPn ,
n =0

w = L / e ,

e
c

where e is the long run effective arrival rate of customers to queue (or entering/exiting service)
K

e = ( K n)Pn
n =0

Chapter 7. Queueing Models

50

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Steady-State Behavior of Finite-Population Models


Example: two workers who are responsible for 10 milling
machines.

Machines run on the average for 20 minutes, then require an


average 5-minute service period, both times exponentially
distributed: = 1/20 and = 1/5.
All of the performance measures depend on P0:
n
21 10 5 n 10
10!
5
P0 = +

n2
n
20
n
(
10

)!
2
!
2

20
n =0
n=2

= 0.065

Then, we can obtain the other Pn, and can compute the
expected number of machines in system:
10

L=

nP

= 3.17 machines

n =0

The average number of running machines:


K L = 10 3.17 = 6.83 machines
Chapter 7. Queueing Models

51

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Networks of Queues
Many systems are naturally modeled as networks of single queues
customers departing from one queue may be routed to another

The following results assume a stable system with infinite calling


population and no limit on system capacity:

Provided that no customers are created or destroyed in the queue, then


the departure rate out of a queue is the same as the arrival rate into the
queue, over the long run.
If customers arrive to queue i at rate i, and a fraction 0 pij 1 of them
are routed to queue j upon departure, then the arrival rate from queue i
to queue j is i pij over the long run.

Chapter 7. Queueing Models

52

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Networks of Queues
The overall arrival rate into queue j:

j = aj +

p
i

ij

all i

Arrival rate
from outside
the network

Sum of arrival rates


from other queues
in network

If queue j has cj < parallel servers, each working at rate j, then the
long-run utilization of each server is j=j /(cj) (where j < 1 for stable
queue).
If arrivals from outside the network form a Poisson process with rate aj
for each queue j, and if there are cj identical servers delivering
exponentially distributed service times with mean 1/j, then, in steady
state, queue j behaves likes an M/M/cj queue with arrival rate

j = aj +

p
i

ij

all i

Chapter 7. Queueing Models

53

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Network of Queues
Customer
Population

80 cust
hour

0.4

c=
c=1

0.6

Discount store example:

Suppose customers arrive at the rate 80 per hour and


40% choose self-service.
Hence:
- Arrival rate to service center 1 is 1 = 80(0.4) = 32 per hour
- Arrival rate to service center 2 is 2 = 80(0.6) = 48 per hour.

c2 = 3 clerks and 2 = 20 customers per hour.

The long-run utilization of the clerks is:


2 = 48/(3*20) = 0.8
All customers must see the cashier at service center 3,
the overall rate to service center 3 is 3 = 1 + 2 = 80
per hour.
- If 3 = 90 per hour, then the utilization of the cashier is:
3 = 80/90 = 0.89

Chapter 7. Queueing Models

54

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary

Introduced basic concepts of queueing models.


Show how simulation, and some times mathematical analysis, can be used
to estimate the performance measures of a system.
Commonly used performance measures: L, LQ, w, wQ, , and e.
When simulating any system that evolves over time, analyst must decide
whether to study transient behavior or steady-state behavior.

Simple formulas exist for the steady-state behavior of some queues.

Simple models can be solved mathematically, and can be useful in providing


a rough estimate of a performance measure.

Chapter 7. Queueing Models

55

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 5
Random-Number Generation

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose & Overview


Discuss the generation of random numbers.
Introduce the subsequent testing for randomness:
Frequency test
Autocorrelation test.

Chapter 5. Random-Number Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Properties of Random Numbers


Two important statistical properties:
Uniformity
Independence.

Random Number, Ri, must be independently drawn from a uniform


distribution with pdf:

1, 0 x 1
f ( x) =
0, otherwise
2

x
E ( R) = xdx =
0
2
1

Chapter 5. Random-Number Generation

1
=
2
Figure: pdf for
random numbers
4

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Generation of Pseudo-Random Numbers


Pseudo, because generating numbers using a known method
removes the potential for true randomness.
Goal: To produce a sequence of numbers in [0,1] that simulates, or
imitates, the ideal properties of random numbers (RN).
Important considerations in RN routines:

Fast
Portable to different computers
Have sufficiently long cycle
Replicable
Closely approximate the ideal statistical properties of uniformity and
independence.

Chapter 5. Random-Number Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Techniques for Generating Random Numbers


Linear Congruential Method (LCM).
Combined Linear Congruential Generators (CLCG).
Random-Number Streams.

Chapter 5. Random-Number Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Linear Congruential Method


To produce a sequence of integers, X1, X2, between 0 and m-1
by following a recursive relationship:

X i +1 = (aX i + c) mod m, i = 0,1,2,...


The
multiplier

The
modulus

The
increment

The selection of the values for a, c, m, and X0 drastically affects


the statistical properties and the cycle length.
The random integers are being generated [0,m-1], and to convert
the integers to random numbers:

Ri =

Chapter 5. Random-Number Generation

Xi
, i = 1,2,...
m
7

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Linear Congruential Method Example


Use X0 = 27, a = 17, c = 43, and m = 100.
The Xi and Ri values are:
X1 = (17*27+43) mod 100 = 502 mod 100 = 2,
X2 = (17*2+43) mod 100 = 77,
X3 = (17*77+43) mod 100 = 52,
X4=(17*52+43) mod 100 = 27,

Chapter 5. Random-Number Generation

R1 = 0.02;
R2 = 0.77;
R3 = 0.52;
R3 = 0.27;

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Characteristics of a Good Generator


Maximum Density

Such that he values assumed by Ri, i = 1,2,, leave no large gaps on

[0,1]

Problem: Instead of continuous, each Ri is discrete


Solution: a very large integer for modulus m
- Approximation appears to be of little consequence

Maximum Period

To achieve maximum density and avoid cycling.


Achieve by: proper choice of a, c, m, and X0.

Most digital computers use a binary representation of numbers


Speed and efficiency are aided by a modulus, m, to be (or close to) a
power of 2.

Chapter 5. Random-Number Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Random-Numbers in Java
Defined in java.util.Random
private final static long multiplier = 0x5DEECE66DL;
private final static long addend = 0xBL;
private final static long mask = (1L << 48) - 1;
protected int next(int bits) {
long oldseed, nextseed;
...
oldseed = seed.get();
nextseed = (oldseed * multiplier + addend) & mask;
...
return (int)(nextseed >>> (48 - bits));
}

Chapter 5. Random-Number Generation

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Combined Linear Congruential Generators


Reason: Longer period generator is needed because of the
increasing complexity of simulated systems.
Approach: Combine two or more multiplicative congruential
generators.
Let Xi,1, Xi,2, , Xi,k, be the i-th output from k different multiplicative
congruential generators.
The j-th generator:
- Has prime modulus mj and multiplier aj and period is mj -1
- Produces integers Xi,j is approx ~ Uniform on integers in [1, mj 1]
- Wi,j = Xi,j -1 is approx ~ Uniform on integers in [0, mj -2]

Chapter 5. Random-Number Generation

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Combined Linear Congruential Generators


Suggested form:

k
j 1
X i = (1) X i , j mod m1 1

j =1

Xi
m ,
Hence, Ri = 1
m 1
1 ,
m1

Xi > 0
Xi = 0

The coefficient:
Performs the
subtraction Xi,1-1

The maximum possible period is: P =

Chapter 5. Random-Number Generation

12

(m1 1)(m2 1)...(mk 1)


2 k 1

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Combined Linear Congruential Generators

Example: For 32-bit computers, combining k = 2 generators with


m1 = 2,147,483,563, a1 = 40,014, m2 = 2,147,483,399 and a2 = 20,692. The
algorithm becomes:
Step 1: Select seeds
- X1,0 in the range [1, 2,147,483,562] for the 1st generator
- X2,0 in the range [1, 2,147,483,398] for the 2nd generator.
Step 2:
For each individual generator,
X1,j+1 = 40,014 X1,j mod 2,147,483,563
X2,j+1 = 40,692 X1,j mod 2,147,483,399.

Step 3:
Step 4:

Xj+1 = (X1,j+1 - X2,j+1 ) mod 2,147,483,562.

Return

X j +1

2,147,483,
563

R j +1 =
2,147,483,562
2,147,483,563 ,

X j +1 > 0
X j +1 = 0

Step 5: Set j = j+1, go back to step 2.


Combined generator has period: (m1 1)(m2 1)/2 ~ 2 x 1018

Chapter 5. Random-Number Generation

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Random-Numbers in Excel 2003


In Excel 2003 new Random Number Generator

X, Y, Z {1,...,30000}
X = X 171 mod 30269
Y = Y 172 mod 30307
Z = Z 170 mod 30323
Y
Z
X
R =
+
+
mod 1.0
30269 30307 30323
It is stated that this method produces more than 10^13
numbers

Chapter 5. Random-Number Generation

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Random-Numbers Streams

The seed for a linear congruential random-number generator:

Is the integer value X0 that initializes the random-number sequence.


Any value in the sequence can be used to seed the generator.

Refers to a starting seed taken from the sequence X0, X1, , XP.
If the streams are b values apart, then stream i could defined by starting seed:

A random-number stream:

i = 1,2, K , Pb

S i = X b ( i 1)

Older generators: b = 105; Newer generators: b = 1037.

A single random-number generator with k streams can act like k distinct


virtual random-number generators

To compare two or more alternative systems.

Advantageous to dedicate portions of the pseudo-random number sequence to


the same purpose in each of the simulated systems.

Chapter 5. Random-Number Generation

15

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Tests for Random Numbers


Two categories:
Testing for uniformity:

H0: Ri ~ U[0,1]
H1: Ri ~/ U[0,1]

- Failure to reject the null hypothesis, H0, means that evidence of nonuniformity has not been detected.

Testing for independence:


H0: Ri ~ independently
H1: Ri ~/ independently

- Failure to reject the null hypothesis, H0, means that evidence of


dependence has not been detected.

Level of significance , the probability of rejecting H0 when it is true:


= P( reject H0 | H0 is true)

Chapter 5. Random-Number Generation

16

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Tests for Random Numbers


When to use these tests:
If a well-known simulation languages or random-number generators is
used, it is probably unnecessary to test
If the generator is not explicitly known or documented, e.g., spreadsheet
programs, symbolic/numerical calculators, tests should be applied to
many sample numbers.

Types of tests:
Theoretical tests: evaluate the choices of m, a, and c without actually
generating any numbers
Empirical tests: applied to actual sequences of numbers produced.
- Our emphasis.

Chapter 5. Random-Number Generation

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Frequency Tests
Test of uniformity
Two different methods:
Kolmogorov-Smirnov test
Chi-square test

Chapter 5. Random-Number Generation

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Kolmogorov-Smirnov Test
Compares the continuous cdf, F(x), of the uniform distribution with
the empirical cdf, SN(x), of the N sample observations.
We know: F ( x) = x, 0 x 1
If the sample from the RN generator is R1, R2, , RN, then the empirical
cdf, SN(x) is:

S N ( x) =

Number of Ri where Ri x
N

Based on the statistic: D = max | F(x) - SN(x)|


Sampling distribution of D is known
A more powerful test, recommended.
Chapter 5. Random-Number Generation

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Kolmogorov-Smirnov Test
The test consists of the following steps
Step 1: Rank the data from smallest to largest
R(1)R(2)... R(N)
Step 2: Compute

D + = max R( i )
1i N N

i 1

D = max R( i )

1i N
N

Step 3: Compute D = max(D+, D-)


Step 4: Get D for the significance level
Step 5: If D D accept, otherwise reject H0

Chapter 5. Random-Number Generation

20

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Kolmogorov-Smirnov Test
Example: Suppose N=5 numbers: 0.44, 0.81, 0.14, 0.05, 0.93.
i

Step 1:

Step 2:

R(i)

0.05

0.14

0.44

0.81

0.93

i/N

0.20

0.40

0.60

0.80

1.00

i/N R(i)

0.15

0.26

0.16

0.07

R(i) (i-1)/N

0.05

0.04

0.21

0.13

Arrange R(i) from


smallest to largest
D+ = max {i/N R(i)}

D- = max {R(i) - (i-1)/N}

Step 3: D = max(D+, D-) = 0.26


Step 4: For = 0.05,
D = 0.565 > D
Hence, H0 is not rejected.

Chapter 5. Random-Number Generation

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chi-square test
Chi-square test uses the sample statistic:
n is the # of classes
n

02 =
i =1

(Oi Ei )
Ei

Ei is the expected
# in the i-th class
Oi is the observed
# in the i-th class

Approximately the chi-square distribution with n-1 degrees of freedom


For the uniform distribution, Ei, the expected number in each class is:
N
Ei = , where N is the total # of observation
n

Valid only for large samples, e.g. N >= 50

Chapter 5. Random-Number Generation

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chi-square test

Example
100 numbers from [0,1]
=0.05
10 intervals
X20.05,9=16.9
Accept, since

Interval

Upper Limit

0.1

10

0.2

X20=11.2 < X20.05,9

Oi-Ei

(Oi-Ei)^2

(Oi-Ei)^2/Ei

10

10

-1

0.1

0.3

10

-5

25

2.5

0.4

10

-4

16

1.6

0.5

16

10

36

3.6

0.6

13

10

0.9

0.7

10

10

0.8

10

-3

0.9

0.9

10

10

10

1.0

14

10

16

1.6

100

100

11.2

Oi

Ei

X20=11.2

Chapter 5. Random-Number Generation

23

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Tests for Autocorrelation


Autocorrelation is concerned with dependence between
numbers in a sequence
Example:
0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28 0.83 0.93
0.99 0.15 0.33 0.35 0.91 0.41 0.60 0.27 0.75 0.88
0.68 0.49 0.05 0.43 0.95 0.58 0.19 0.36 0.69 0.87

Numbers at 5-th, 10-th, 15-th, ... are very similar


Numbers can be
Low
High
Alternating

Chapter 5. Random-Number Generation

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Tests for Autocorrelation


Testing the autocorrelation between every m numbers (m is a.k.a.
the lag), starting with the i-th number
The autocorrelation im between numbers: Ri, Ri+m, Ri+2m, Ri+(M+1)m
M is the largest integer such that i + (M + 1 )m N
Hypothesis:

H 0 : im = 0,

if numbers are independen t

H 1 : im 0,

if numbers are dependent

If the values are uncorrelated:


For large values of M, the distribution of the estimator of im, denoted im
is approximately normal.

Chapter 5. Random-Number Generation

25

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Tests for Autocorrelation

Test statistics is:

im
Z0 =

im

Z0 is distributed normally with mean = 0 and variance = 1, and:

1 M

im =
R

i + km
i +(k +1 )m 0.25

M + 1 k =0

im =

If im > 0, the subsequence has positive autocorrelation

13M + 7
12(M + 1 )

High random numbers tend to be followed by high ones, and vice versa.

If im < 0, the subsequence has negative autocorrelation

Low random numbers tend to be followed by high ones, and vice versa.

Chapter 5. Random-Number Generation

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Example
Test whether the 3rd, 8th, 13th, and so on, for the numbers on Slide
24.
Hence, = 0.05, i = 3, m = 5, N = 30, and M = 4
35 =

1 (0.23)(0.28) + (0.28)(0.33) + (0.33)(0.27 )


0.25

4 + 1 + (0.27 )(0.05) + (0.05)(0.36)

= 0.1945
13( 4) + 7
= 0.128
35 =
12( 4 + 1 )
0.1945
= 1.516
Z0 =
0.1280
z0.025 = 1.96 hence, the hypothesis is not rejected.

Chapter 5. Random-Number Generation

27

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Shortcomings
The test is not very sensitive for small values of M, particularly when
the numbers being tested are on the low side.
Problem when fishing for autocorrelation by performing numerous
tests:
If = 0.05, there is a probability of 0.05 of rejecting a true hypothesis.
If 10 independence sequences are examined,
- The probability of finding no significant autocorrelation, by chance
alone, is 0.9510 = 0.60.
- Hence, the probability of detecting significant autocorrelation when it
does not exist = 40%

Chapter 5. Random-Number Generation

28

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary
In this chapter, we described:
Generation of random numbers
Testing for uniformity and independence

Caution:
Even with generators that have been used for years, some of which still
in used, are found to be inadequate.
This chapter provides only the basic
Also, even if generated numbers pass all the tests, some underlying
pattern might have gone undetected.

Chapter 5. Random-Number Generation

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 6
Random-Variate Generation

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose & Overview


Develop understanding of generating samples from a
specified distribution as input to a simulation model.
Illustrate some widely-used techniques for generating random
variates.
Inverse-transform technique
Acceptance-rejection technique
Special properties

Chapter 6. Random-Variate Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Preparation
It is assumed that a source of uniform [0,1] random numbers
exists.
Linear Congruential Method

Random numbers R, R1, R2, with


PDF

1 0 x 1
f R ( x) =
0 otherwise
CDF

0 x < 0

FR ( x) = x 0 x 1
1 x > 1

Chapter 6. Random-Variate Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Inverse-transform Technique
The concept:
For CDF function: r = F(x)
Generate r from uniform (0,1), a.k.a U(0,1)
Find x,
x = F-1(r)

F(x)

F(x)
1

r = F(x)

r = F(x)
r1

r1

x
x1

x1
Chapter 6. Random-Variate Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Inverse-transform Technique

The inverse-transform technique can be used in principle for


any distribution.
Most useful when the CDF F(x) has an inverse F -1(x) which
is easy to compute.
Required steps
1.
2.
3.
4.

Compute the CDF of the desired random variable X.


Set F(X)=R on the range of X.
Solve the equation F(X)=R for X in terms of R.
Generate uniform random numbers R1, R2, R3, and compute
the desired random variate by Xi = F-1(Ri)

Chapter 6. Random-Variate Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Exponential Distribution
Exponential Distribution

To generate X1, X2, X3

PDF

f ( x ) = e

1 e X = R

e X = 1 R
X = ln(1 R)
ln(1 R)
X=

ln(1 R)
X =

CDF

F ( x ) = 1 e x

Simplification

X =

ln(R)

X = F 1 ( R)

Since R and (1-R) are


uniformly distributed on [0,1]
Chapter 6. Random-Variate Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Exponential Distribution

Figure: Inverse-transform technique for


exp( = 1)
Chapter 6. Random-Variate Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Exponential Distribution
Example: Generate 200 variates Xi with distribution exp(= 1)
Generate 200 Rs with U(0,1), the histogram of Xs become:
0,7
0,6
0,5
0,4
0,3
0,2
0,1
0
0,5

1,5

2,5

3,5

Empirical Histogram

4,5

5,5

6,5

Theor. PDF

Check: Does the random variable X1 have the desired distribution?

P ( X 1 x0 ) = P ( R1 F ( x0 )) = F ( x 0 )

Chapter 6. Random-Variate Generation

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Other Distributions
Examples of other distributions for which inverse CDF works
are:
Uniform distribution
Weibull distribution
Triangular distribution

Chapter 6. Random-Variate Generation

10

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Uniform Distribution
Random variable X uniformly distributed over [a, b]

F(X ) = R
X a
=R
ba
X a = R(b a )
X = a + R (b a )

Chapter 6. Random-Variate Generation

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Weibull Distribution
The Weibull Distribution is
described by

The variate is

F(X ) = R

PDF

1 e

1 ( )
f ( x) = x e

( X

( X

=R
= 1 R

( X ) = ln(1 R)

CDF

F ( X ) = 1 e

(x )

= ln(1 R )

X = ln(1 R )
x = ln(1 R )
X = ln(1 R )

Chapter 6. Random-Variate Generation

12

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Triangular Distribution
The CDF of a Triangular
Distribution with endpoints (0, 2)
is given by

0
x2
2
F ( x) =
(2 x) 2
1
2

For 0 X 1

x0

X2
R=
2
and for 1 X 2

0 < x 1
1< x 2

(2 X ) 2
R = 1
2

x>2

X is generated by

2R
X =
2 2(1 R)
Chapter 6. Random-Variate Generation

0 R
1
2

1
2

< R 1
13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Empirical Continuous Distributions

When theoretical distribution is not applicable


To collect empirical data:

Resample the observed data


Interpolate between observed data points to fill in the gaps

For a small sample set (size n):

Arrange the data from smallest to largest

x (1) x (2) x (n)

Set x(0)=0
Assign the probability 1/n to each interval x (i-1)
The slope of each line segment is defined as
x(i ) x(i 1)
x(i ) x(i 1)
=
ai =
1 / n (i 1) / n
1/ n
The inverse CDf is given by

x x (i)

(i 1)

X = F 1 ( R) = x(i 1) + ai R

Chapter 6. Random-Variate Generation

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Empirical Continuous Distributions


i

Interval

Probability

Cumulative Probability

Slope ai

0.0 < x 0.8

0.2

0.2

4.00

0.8 < x 1.24

0.2

0.4

2.20

1.24 < x 1.45

0.2

0.6

1.05

1.45 < x 1.83

0.2

0.8

1.90

1.83 < x 2.76

0.2

1.0

4.65

R1 = 0.71
X 1 = x( 41) + a4 ( R1 (4 1) / n)
= 1.45 + 1.90(0.71 0.6)
= 1.66
Chapter 6. Random-Variate Generation

15

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Empirical Continuous Distributions


What happens for large samples of data
Several hundreds or tens of thousand

First summarize the data into a frequency distribution with


smaller number of intervals
Afterwards, fit continuous empirical CDF to the frequency
distribution
Slight modifications
Slope

ai =

ci cumulative probability of
the first i intervals

x(i ) x(i 1)
ci ci 1

The inverse CDf is given by


X = F 1 ( R) = x + a (R c
( i 1)

Chapter 6. Random-Variate Generation

i 1

16

)
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Empirical Continuous Distributions


Example: Suppose the data collected for 100 broken-widget repair
times are:
Cumulative Slope,
ai
Frequency, c i

Interval
(Hours)

Frequency

Relative
Frequency

0.25 x 0.5

31

0.31

0.31

0.81

0.5 x 1.0

10

0.10

0.41

5.0

1.0 x 1.5

25

0.25

0.66

2.0

1.5 x 2.0

34

0.34

1.00

1.47

Consider R1 = 0.83:
c3 = 0.66 < R1 < c4 = 1.00
X1 = x(4-1) + a4(R1 c(4-1))
= 1.5 + 1.47(0.83-0.66)
= 1.75

Chapter 6. Random-Variate Generation

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Empirical Continuous Distributions


Problems with empirical distributions
The data in the previous example is restricted in the range
0.25 X 2.0
The underlying distribution might have a wider range
Thus, try to find a theoretical distribution

Hints for building empirical distribution based on frequency


tables
It is recommended to use relatively short intervals
- Number of bin increase

This will result in a more accurate estimate

Chapter 6. Random-Variate Generation

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Discrete Distribution
All discrete distributions can be generated via inverse-transform
technique
Method: numerically, table-lookup procedure, algebraically, or a
formula
Examples of application:
Empirical
Discrete uniform
Gamma

Chapter 6. Random-Variate Generation

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Discrete Distribution
Example: Suppose the number of shipments, x, on the
loading dock of a company is either 0, 1, or 2
Data - Probability distribution:
x

p(x)

F(x)

0
1
2

0.50
0.30
0.20

0.50
0.80
1.00

The inverse-transform technique as table-lookup procedure

F ( xi 1 ) = ri 1 < R ri = F ( xi )
Set X1 = xi

Chapter 6. Random-Variate Generation

20

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Discrete Distribution
Method - Given R, the
generation scheme
becomes:

R 0.5
0,

x = 1, 0.5 < R 0.8


2, 0.8 < R 1.0

Table for generating


the discrete variate
X

Consider R1 = 0.73:
F(xi-1) < R <= F(xi)
F(x0) < 0.73 <= F(x1)
Hence, x1 = 1

i Input ri Output xi
1

0.5

0.8

1.0

Chapter 6. Random-Variate Generation

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Acceptance-Rejection technique
Useful particularly when inverse cdf does not exist in closed form, a.k.a.
thinning
Illustration: To generate random variates, X ~ U(1/4, 1)
Generate R

Procedures:
no

Step 1. Generate R ~ U[0,1]


Step 2a. If R >= , accept X=R.
Step 2b. If R < , reject R, return
to Step 1

Condition
yes

Output R

R does not have the desired distribution, but R conditioned (R) on the
event {R } does.
Efficiency: Depends heavily on the ability to minimize the number of
rejections.
Chapter 6. Random-Variate Generation

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution
PMF of a Poisson Distribution
P ( N = n) =

n!
Exactly n arrivals during one time unit

A1 + A2 + L + An 1 < A1 + A2 + L + An + An +1
Since interarrival times are exponentially distributed we can set

Ai =

ln( Ri )

Well known, we derived this generator in the beginning of the class

Chapter 6. Random-Variate Generation

23

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution
Substitute the sum by
n

Simplify by

ln( Ri )

i =1

n +1

1<

ln( Ri )

i =1

multiply by -, which reverses the inequality sign


sum of logs is the log of a product
n

n +1

i =1

i =1

i =1

i =1

ln Ri = ln( Ri ) > ln( Ri ) = ln Ri


Simplify by eln(x) = x
n

i =1

Chapter 6. Random-Variate Generation

n +1

> Ri
i =1

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution

Procedure of generating a Poisson random variate N is as


follows
1. Set n=0, P=1
2. Generate a random number Rn+1, and replace P by P x Rn+1
3. If P < exp(-), then accept N=n
-

Otherwise, reject the current n, increase n by one, and return to


step 2.

Chapter 6. Random-Variate Generation

25

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution

Example: Generate three Poisson variates with mean =0.2

exp(-0.2) = 0.8187

Step 1: Set n=0, P=1


Step 2: R1 = 0.4357, P = 1 x 0.4357
Step 3: Since P = 0.4357 < exp(- 0.2), accept N = 0

Step 1: Set n=0, P=1


Step 2: R1 = 0.4146, P = 1 x 0.4146
Step 3: Since P = 0.4146 < exp(-0.2), accept N = 0

Step 1: Set n=0, P=1


Step 2: R1 = 0.8353, P = 1 x 0.8353
Step 3: Since P= 0.8353 > exp(-0.2), reject n=0 and return to Step 2 with n=1
Step 2: R2 = 0.9952, P = 0.8353 x 0.9952 = 0.8313
Step 3: Since P= 0.8313 > exp(-0.2), reject n=1 and return to Step 2 with n=2
Step 2: R3 = 0.8004, P = 0.8313 x 0.8004 = 0.6654
Step 3: Since P = 0.6654 < exp(-0.2), accept N = 2

Variate 1

Variate 2

Variate 3

Chapter 6. Random-Variate Generation

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Poisson Distribution
It took five random numbers to generate three Poisson
variates
In long run, the generation of Poisson variates requires some
overhead!
N Rn+1

Accept/Reject

Result

0 0.4357 0.4357 P < exp(- )

Accept

N=0

0 0.4146 0.4146 P < exp(- )

Accept

N=0

0 0.8353 0.8353 P exp(- )

Reject

1 0.9952 0.8313 P exp(- )

Reject

2 0.8004 0.6654 P < exp(- )

Accept

Chapter 6. Random-Variate Generation

27

N=2

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Special Properties
Based on features of particular family of probability
distributions
For example:
Direct Transformation for normal and lognormal distributions
Convolution
Beta distribution (from gamma distribution)

Chapter 6. Random-Variate Generation

28

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Direct Transformation

Approach for N(0,1):


Consider two standard normal random variables, Z1 and Z2, plotted as a
point in the plane:
In polar coordinates:
Z1 = B cos()
Z2 = B sin()

B2 = Z21 + Z22 ~ 2 distribution with 2 degrees of freedom = Exp( = 2).


Hence,
B = (2 ln R)1/ 2
The radius B and angle are mutually independent.

Z1 = (2 ln R )1/ 2 cos(2R2 )
Z 2 = (2 ln R)1/ 2 sin(2R2 )
Chapter 6. Random-Variate Generation

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Direct Transformation
Approach for N(,2):

Generate Zi ~ N(0,1)

Xi = + Zi

Approach for Lognormal(,2):


Generate X ~ N((,2)

Yi = eXi

Chapter 6. Random-Variate Generation

30

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary
Principles of random-variate generation via
Inverse-transform technique
Acceptance-rejection technique
Special properties

Important for generating continuous and discrete distributions

Chapter 6. Random-Variate Generation

31

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Simulation
Discrete-Event System Simulation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chapter 8
Input Modeling

Computer Science, Informatik 4


Communication and Distributed Systems

Purpose & Overview

Input models provide the driving force for a simulation model.


The quality of the output is no better than the quality of inputs.
In this chapter, we will discuss the 4 steps of input model
development:
1) Collect data from the real system
2) Identify a probability distribution to represent the input process
3) Choose parameters for the distribution
4) Evaluate the chosen distribution and parameters for goodness
of fit.

Chapter 8. Input Modeling

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Data Collection
One of the biggest tasks in solving a real problem
GIGO Garbage-In-Garbage-Out

Raw Data

System
Performance
simulation

Input
Data

Output

Even when model structure is valid simulation results can be


misleading, if the input data are
inaccurately collected
inappropriately analyzed
not representative of the environment

Chapter 8. Input Modeling

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Data Collection

Suggestions that may enhance and facilitate data


collection:
Plan ahead: begin by a practice or pre-observing session, watch
for unusual circumstances
Analyze the data as it is being collected: check adequacy
Combine homogeneous data sets: successive time periods,
during the same time period on successive days
Be aware of data censoring: the quantity is not observed in its
entirety, danger of leaving out long process times
Check for relationship between variables: build scatter diagram
Check for autocorrelation:
Collect input data, not performance data

Chapter 8. Input Modeling

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Identifying the Distribution

Histograms
Scatter Diagrams
Selecting families of distribution
Parameter estimation
Goodness-of-fit tests
Fitting a non-stationary process

Chapter 8. Input Modeling

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Histograms
A frequency distribution or histogram is useful in determining
the shape of a distribution
The number of class intervals depends on:
The number of observations
The dispersion of the data
Suggested number of intervals: the square root of the sample size

For continuous data:

Corresponds to the probability density function of a theoretical


distribution

For discrete data:

Corresponds to the probability mass function

If few data points are available

combine adjacent cells to eliminate the ragged appearance of the


histogram

Chapter 8. Input Modeling

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Histograms

Vehicle Arrival Example: Number of vehicles arriving at an intersection


between 7 am and 7:05 am was monitored for 100 random workdays.
Arrivals per
Period
Frequency
0
12
1
10
2
19
3
17
4
10
5
8
6
7
7
5
8
5
9
3
10
3
11
1

Same data
with different
interval sizes

There are ample data, so the histogram may have a cell for each possible
value in the data range

Chapter 8. Input Modeling

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Histograms Example
Life tests were performed on electronic components at 1.5
times the nominal voltage, and their lifetime was recorded

Component Life Frequency


0x<3

23

3x<6

10

6x<9

9 x < 12

12 x < 15

42 x < 45

144 x < 147


Chapter 8. Input Modeling

1
9

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Histograms Example
Stanford University Mobile Activity Traces (SUMATRA)

Available traces

Target community: cellular


network research community
Traces contain mobility as well as
connection information

SULAWESI (S.U. Local Area Wireless


Environment Signaling Information)
BALI (Bay Area Location Information)

BALI Characteristics

San Francisco Bay Area


Trace length: 24 hour
Number of cells: 90
Persons per cell: 1100
Persons at all: 99.000
Active persons: 66.550
Move events: 243.951
Call events: 1.570.807

Chapter 8. Input Modeling

Question: How to transform the BALI


information so that it is usable with a
network simulator, e.g., ns-2?

10

Node number as well as connection


number is too high for ns-2

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Histograms Example
1800

Analysis of the BALI Trace

1600

Goal: Reduce the amount of


data by identifying user groups

1400

People

1200

User group

1000
800
600

Between 2 local minima


Communication characteristic
is kept in the group
A user represents a group

400
200
50
0 40
30

lls
Ca

Groups with different mobility


characteristics

0
5

20

10

nts
eme
Mov

15

10
20
0

25000

Intra- and inter group


communication

Number of People

20000

Interesting characteristic

Number of people with odd


number movements is
negligible!

15000

10000

5000

-1 0

9 10 11 12 13 14 15 16 17 18 19

Number of Movements

Chapter 8. Input Modeling

11

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Scatter Diagrams
A scatter diagram is a quality tool that can show the
relationship between paired data
Random Variable X = Data 1
Random Variable Y = Data 2
Draw random variable X on the x-axis and Y on the y-axis

Strong Correlation
Chapter 8. Input Modeling

Moderate Correlation
12

No Correlation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Scatter Diagrams
Linear relationship

Correlation: Measures how well data line up


Slope: Measures the steepness of the data
Direction
Y Intercept

Chapter 8. Input Modeling

13

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Selecting the Family of Distributions


A family of distributions is selected based on:
The context of the input variable
Shape of the histogram

Frequently encountered distributions:


Easier to analyze: Exponential, Normal and Poisson

Harder to analyze: Beta, Gamma and Weibull

Chapter 8. Input Modeling

14

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Selecting the Family of Distributions


Use the physical basis of the distribution as a guide, for example:
Binomial: Number of successes in n trials
Poisson: Number of independent events that occur in a fixed amount of
time or space
Normal: Distribution of a process that is the sum of a number of
component processes
Exponential: time between independent events, or a process time that is
memoryless
Weibull: time to failure for components
Discrete or continuous uniform: models complete uncertainty
Triangular: a process for which only the minimum, most likely, and
maximum values are known
Empirical: resamples from the actual data collected

Chapter 8. Input Modeling

15

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Selecting the Family of Distributions


Remember the physical characteristics of the process
Is the process naturally discrete or continuous valued?
Is it bounded?

No true distribution for any stochastic input process


Goal: obtain a good approximation

Chapter 8. Input Modeling

16

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantile-Quantile Plots
Q-Q plot is a useful tool for evaluating distribution fit
If X is a random variable with CDF F, then the q-quantile of X is the
such that

F( ) = P(X ) = q,

for 0 < q < 1

When F has an inverse, = F-1(q)

Let {xi, i = 1,2, ., n} be a sample of data from X and {yj, j = 1,2, , n}


be the observations in ascending order:
j - 0.5
y j is approximately F -1

n
where j is the ranking or order number

Chapter 8. Input Modeling

17

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantile-Quantile Plots

The plot of yj versus F-1( ( j - 0.5 ) / n) is


Approximately a straight line if F is a member of an appropriate family of
distributions
The line has slope 1 if F is a member of an appropriate family of
distributions with appropriate parameter values

Chapter 8. Input Modeling

18

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantile-Quantile Plots
Example: Door installation times of a robot follows a normal
distribution.
The observations are ordered from the smallest to the largest:

j
1
2
3
4
5

Value
99.55
99.56
99.62
99.65
99.79

j
6
7
8
9
10

Value
99.98
100.02
100.06
100.17
100.23

j
11
12
13
14
15

Value
100.26
100.27
100.33
100.41
100.47

yj are plotted versus F-1( (j-0.5)/n) where F has a normal distribution with
the sample mean (99.99 sec) and sample variance (0.28322 sec2)

Chapter 8. Input Modeling

19

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantile-Quantile Plots

Example (continued): Check whether the door


installation times follow a normal distribution.

100,8
100,6
100,4
100,2

Straight line,
supporting the
hypothesis of a
normal distribution

100
99,8
99,6
99,4
99,2
99,2

99,4

99,6

99,8

100

100,2

100,4

100,6

100,8

0,35
0,3
0,25
0,2

Superimposed
density function of
the normal
distribution

0,15
0,1
0,05
0
99,4

Chapter 8. Input Modeling

20

99,6

99,8

100

100,2

100,4

100,6

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Quantile-Quantile Plots
Consider the following while evaluating the linearity of a Q-Q plot:

The observed values never fall exactly on a straight line


The ordered values are ranked and hence not independent, unlikely for
the points to be scattered about the line
Variance of the extremes is higher than the middle. Linearity of the
points in the middle of the plot is more important.

Q-Q plot can also be used to check homogeneity

It can be used to check whether a single distribution can represent two


sample sets
Given two random variables
- X and x1, x2, , xn
- Z and z1, z2, , zn

Plotting the ordered values of X and Z against each other reveals


approximately a straight line if X and Z are well represented by the same
distribution

Chapter 8. Input Modeling

21

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Parameter Estimation
Parameter Estimation: Next step after selecting a family of
distributions
If observations in a sample of size n are X1, X2, , Xn (discrete or
continuous), the sample mean and variance are:

i=1 X i
n

X=

2
2
X

n
X
i=1 i
n

S2 =

n 1

If the data are discrete and have been grouped in a frequency


distribution:

j =1 f j X j

X=

S2

2
2

f
X
n
X
j
j
j =1

n 1

where fj is the observed frequency of value Xj

Chapter 8. Input Modeling

22

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Parameter Estimation
When raw data are unavailable (data are grouped into class
intervals), the approximate sample mean and variance are:

j =1 f j m j
c

X=

S2

2
2
f
m

n
X
j
j
j =1

n 1

fj is the observed frequency in the j-th class interval


mj is the midpoint of the j-th interval
c is the number of class intervals

A parameter is an unknown constant, but an estimator is a


statistic.

Chapter 8. Input Modeling

23

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Parameter Estimation

Vehicle Arrival Example (continued): Table in the histogram of the example


on Slide 8 can be analyzed to obtain:
n = 100, f1 = 12, X 1 = 0, f 2 = 10, X 2 = 1,...

and

j =1 f j X j = 364, and j =1 f j X 2j = 2080


k

The sample mean and variance are

25

20

Frequency

364
X=
= 3.64
100
2080 100 (3.64) 2
2
S =
99
= 7.63

15

10

0
0

10

11

Number of Arrivals per Period

The histogram suggests X to have a Possion distribution


- However, note that sample mean is not equal to sample variance.
Theoretically: Poisson with parameter = 2 =
- Reason: each estimator is a random variable, it is not perfect.
Chapter 8. Input Modeling

24

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Parameter Estimation

Suggested Estimators for Distributions often used in Simulation


Maximum-Likelihood Esitmators
Distribution

Parameter Estimator

Poisson

Exponential

= X
=

1
X

Gamma

Normal

, 2

= X , 2 = S 2

Lognormal

, 2

= X , 2 = S 2

, =

1
X

After taking ln
of data.
Chapter 8. Input Modeling

25

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Goodness-of-Fit Tests
Conduct hypothesis testing on input data distribution using
Kolmogorov-Smirnov test
Chi-square test

No single correct distribution in a real application exists

If very little data are available, it is unlikely to reject any candidate


distributions
If a lot of data are available, it is likely to reject all candidate distributions

Be aware of mistakes in decision finding


Type I Error:
Type II Error:

Chapter 8. Input Modeling

Statistical
Decision

State of the null hypothesis


H0 True

H0 False

Reject H0

Type I Error

Correct

Accept H0

Correct

Type II Error

26

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chi-Square Test
Intuition: comparing the histogram of the data to the shape of
the candidate density or mass function
Valid for large sample sizes when parameters are estimated
by maximum-likelihood
Arrange the n observations into a set of k class intervals
The test statistic is:
02

i =1

(Oi Ei ) 2
Ei

Observed Frequency
in the i-th class

Expected Frequency
Ei = n*pi
where pi is the theoretical
prob. of the i-th interval.
Suggested Minimum = 5

02 approximately follows the chi-square distribution with k-s-1

degrees of freedom
s = number of parameters of the hypothesized distribution
estimated by the sample statistics.
Chapter 8. Input Modeling

27

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chi-Square Test
The hypothesis of a chi-square test is
H0: The random variable, X, conforms to the distributional

assumption with the parameter(s) given by the estimate(s).


H1: The random variable X does not conform.

2
2
H0 is rejected if 0 > ,k s 1

If the distribution tested is discrete and combining adjacent cell is not


required (so that Ei > minimum requirement):
Each value of the random variable should be a class interval, unless
combining is necessary, and

pi = p(xi ) = P(X = xi )
Chapter 8. Input Modeling

28

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chi-Square Test
If the distribution tested is continuous:

pi =

ai

ai1

f ( x) dx = F (ai ) F (ai 1 )

where ai-1 and ai are the endpoints of the i-th class interval
f(x) is the assumed pdf, F(x) is the assumed cdf
Recommended number of class intervals (k):
Sample Size, n

Number of Class Intervals, k

20

Do not use the chi-square test

50

5 to 10

100

10 to 20
n1/2 to n/5

> 100

Caution: Different grouping of data (i.e., k) can affect the hypothesis


testing result.
Chapter 8. Input Modeling

29

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Chi-Square Test
Vehicle Arrival Example (continued):
H0: the random variable is Poisson distributed.
H1: the random variable is not Poisson distributed.
xi

Observed Frequency, Oi

Expected Frequency, Ei

0
1
2
3
4
5
6
7
8
9
10
> 11

12
10
19
17
19
6
7
5
5
3
3
1
100

2.6
9.6
17.4
21.1
19.2
14.0
8.5
4.4
2.0
0.8
0.3
0.1
100.0

22

17

12.2

(Oi - Ei)2/Ei
7.87
0.15
0.8
4.41
2.57
0.26

7.6

11.62

27.68

Ei = np ( x)
e x
=n
x!
Combined because
of the assumption of
min Ei = 5, e.g.,
E1 = 2.6 < 5, hence
combine with E2

Degree of freedom is k-s-1 = 7-1-1 = 5, hence, the hypothesis is rejected


at the 0.05 level of significance.

02 = 27.68 > 02.05,5 = 11.1


Chapter 8. Input Modeling

30

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Kolmogorov-Smirnov Test
Intuition: formalize the idea behind examining a Q-Q plot
Recall

The test compares the continuous cdf, F(x), of the hypothesized


distribution with the empirical cdf, SN(x), of the N sample observations.
Based on the maximum difference statistics
D = max| F(x) - SN(x) |

A more powerful test, particularly useful when:

Sample sizes are small


No parameters have been estimated from the data

When parameter estimates have been made:

Critical values are biased, too large.


More conservative, i.e., smaller Type I error than specified.

Chapter 8. Input Modeling

31

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

p-Values and Best Fits


p-value for the test statistics
The significance level at which one would just reject H0 for the given test
statistic value.
A measure of fit, the larger the better
Large p-value: good fit
Small p-value: poor fit

Vehicle Arrival Example (cont.):


H0: data is Poisson
Test statistics: 02 = 27.68 , with 5 degrees of freedom
p-value = 0.00004, meaning we would reject H0 with 0.00004 significance
level, hence Poisson is a poor fit.

Chapter 8. Input Modeling

32

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

p-Values and Best Fits


Many software use p-value as the ranking measure to automatically
determine the best fit.
Things to be cautious about:
Software may not know about the physical basis of the data, distribution
families it suggests may be inappropriate.
Close conformance to the data does not always lead to the most
appropriate input model.
p-value does not say much about where the lack of fit occurs

Recommended: always inspect the automatic selection using


graphical methods.

Chapter 8. Input Modeling

33

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Fitting a Non-stationary Poisson Process


Fitting a NSPP to arrival data is difficult, possible approaches:
Fit a very flexible model with lots of parameters
Approximate constant arrival rate over some basic interval of
time, but vary it from time interval to time interval.

Suppose we need to model arrivals over time [0,T], our


approach is the most appropriate when we can:

Observe the time period repeatedly


Count arrivals / record arrival times
Divide the time period into k equal intervals of length t =T/k
Over n periods of observation let Cij be the number of arrivals
during the i-th interval on the j-th period

Chapter 8. Input Modeling

34

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Fitting a Non-stationary Poisson Process


The estimated arrival rate during the i-th time period
(i-1) t < t i t is:
n

(t ) =
Cij

nt j =1

n = Number of observation periods,


t = time interval length
Cij = # of arrivals during the i-th time interval on the j-th observation
period

Example: Divide a 10-hour business day [8am,6pm] into equal


intervals k = 20 whose length t = , and observe over n=3 days
Time Period

Chapter 8. Input Modeling

Number of Arrivals
Day 1
Day 2
Day 3

Estimated Arrival
Rate (arrivals/hr)

8:00 - 8:30

12

14

10

24

8:30 - 9:00

23

26

32

54

9:00 - 9:30

27

18

32

52

9:30 - 10:00

20

13

12

30

35

For instance,
1/3(0.5)*(23+26+32)
= 54 arrivals/hour

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Selecting Model without Data


If data is not available, some possible sources to obtain
information about the process are:

Engineering data: often product or process has performance


ratings provided by the manufacturer or company rules specify
time or production standards.
Expert option: people who are experienced with the process or
similar processes, often, they can provide optimistic, pessimistic
and most-likely times, and they may know the variability as well.
Physical or conventional limitations: physical limits on
performance, limits or bounds that narrow the range of the input
process.
The nature of the process.

The uniform, triangular, and beta distributions are often used


as input models.
Speed of a vehicle?

Chapter 8. Input Modeling

36

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Selecting Model without Data

Example: Production planning


simulation.

Input of sales volume of various


products is required, salesperson
of product XYZ says that:
- No fewer than 1000 units and no
more than 5000 units will be sold.
- Given her experience, she believes
there is a 90% chance of selling
more than 2000 units, a 25%
chance of selling more than 2500
units, and only a 1% chance of
selling more than 4500 units.

Interval (Sales)

1000 <= X <= 2000

0,1

0,10

2000 < X <=2500

0,65

0,75

2500 < X <= 4500

0,24

0,99

4500 < X <= 5000

0,01

1,00

1,20

1,00

0,80

0,60

Translating these information into


a cumulative probability of being
less than or equal to those goals
for simulation input:

Chapter 8. Input Modeling

Cumulative
Frequency, ci

0,40

0,20

0,00
1000 <= X <= 2000

37

2000 < X <=2500

2500 < X <= 4500

4500 < X <= 5000

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multivariate and Time-Series Input Models


The random variable discussed until now were considered to be
independent of any other variables within the context of the problem
However, variables may be related
If they appear as input, the relationship should be investigated and taken
into consideration

Multivariate input models


Fixed, finite number of random variables
For example, lead time and annual demand for an inventory model
An increase in demand results in lead time increase, hence variables are
dependent.

Time-series input models


Infinite sequence of random variables
For example, time between arrivals of orders to buy and sell stocks
Buy and sell orders tend to arrive in bursts, hence, times between arrivals
are dependent.
Chapter 8. Input Modeling

38

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Covariance and Correlation


Consider the model that describes relationship between X1 and X2:

( X 1 1 ) = ( X 2 2 ) +

is a random
variable with mean 0
and is independent
of X2

= 0, X1 and X2 are statistically independent


> 0, X1 and X2 tend to be above or below their means together
< 0, X1 and X2 tend to be on opposite sides of their means

Covariance between X1 and X2:

cov( X 1 , X 2 ) = E[( X 1 1 )( X 2 2 )] = E ( X 1 X 2 ) 1 2
Covariance between X1 and X2:
where

Chapter 8. Input Modeling

= 0

cov( X 1 , X 2 )< 0
> 0

= 0

< 0
> 0

39

< cov( X 1 , X 2 ) <

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Covariance and Correlation


Correlation between X1 and X2 (values between -1 and 1):

= corr( X 1 , X 2 ) =

where

= 0

corr ( X 1 , X 2 )< 0
> 0

cov( X 1 , X 2 )

1 2

= 0

< 0
> 0

The closer is to -1 or 1, the stronger the linear relationship is between


X1 and X2.

Chapter 8. Input Modeling

40

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Covariance and Correlation


A time series is a sequence of random variables X1, X2, X3,
which are identically distributed (same mean and variance)
but dependent.
cov(Xt, Xt+h) is the lag-h autocovariance
corr(Xt, Xt+h) is the lag-h autocorrelation
If the autocovariance value depends only on h and not on t, the
time series is covariance stationary
For covariance stationary time series, the shorthand for lag-h is
used
h = corr ( X t , X t + h )

Notice

autocorrelation measures the dependence between random


variables that are separated by h-1

Chapter 8. Input Modeling

41

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multivariate Input Models


If X1 and X2 are normally distributed, dependence between them can
be modeled by the bivariate normal distribution with 1, 2, 12, 22
and correlation
To estimate 1, 2, 12, 22, see Parameter Estimation
To estimate , suppose we have n independent and identically distributed
pairs (X11, X21), (X12, X22), (X1n, X2n),

Then the sample covariance is

1 n
cov( X 1 , X 2 ) =
( X 1 j X 1 )( X 2 j X 2 )

n 1 j =1
The sample correlation is

=
Chapter 8. Input Modeling

cov( X 1 , X 2 )
1 2
42

Sample deviation
Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Multivariate Input Models - Example


Let X1 the average lead time to deliver and X2 the annual demand
for a product.
Data for 10 years is available.
Demand
Lead Time
(X2)

(X1)
6,5

103

4,3

83

X 2 = 101.8, 2 = 9.93

6,9

116

6,0

97

csample = 8.66

6,9

112

6,9

104

5,8

106

7,3

109

4,5

92

6,3

96

X 1 = 6.14, 1 = 1.02

Covariance

8.66
= 0.86
1.02 9.93

Lead time and demand are strongly dependent.

Before accepting this model, lead time and demand should be checked
individually to see whether they are represented well by normal
distribution.

Chapter 8. Input Modeling

43

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Time-Series Input Models


If X1, X2, X3, is a sequence of identically distributed, but dependent
and covariance-stationary random variables, then we can represent
the process as follows:
Autoregressive order-1 model, AR(1)
Exponential autoregressive order-1 model, EAR(1)
- Both have the characteristics that:

h = corr( X t , X t + h ) = h ,

for h = 1,2,...

- Lag-h autocorrelation decreases geometrically as the lag increases, hence,


observations far apart in time are nearly independent

Chapter 8. Input Modeling

44

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

AR(1) Time-Series Input Models


Consider the time-series model:

X t = + ( X t 1 ) + t ,

for t = 2,3,...

where 2 , 3 , are i.i.d. normally distributed with = 0 and variance 2


If initial value X1 is chosen appropriately, then
X1, X2, are normally distributed with mean = , and variance = 2/(1-2)
Autocorrelation h = h
To estimate , , 2 :

= X ,

= (1 2 ) ,
2

cov( X t , X t +1 )

=
2

where cov( X t , X t +1 ) is the lag-1 autocovariance

Chapter 8. Input Modeling

45

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

EAR(1) Time-Series Input Models


Consider the time-series model:

with probability
X t 1 ,
Xt =
X t 1 + t , with probability 1-

for t = 2,3,...

where 2 , 3 , are i.i.d. exponentially distributed with = 1/, and 0 < 1

If X1 is chosen appropriately, then

X1, X2, are exponentially distributed with mean = 1/


Autocorrelation h = h , and only positive correlation is allowed.

To estimate , :

= 1 / X ,

cov( X t , X t +1 )

= =
2

where cov( X t , X t +1 ) is the lag-1 autocovariance


Chapter 8. Input Modeling

46

Dr. Mesut Gne

Computer Science, Informatik 4


Communication and Distributed Systems

Summary

In this chapter, we described the 4 steps in developing input data


models:
1)
2)
3)
4)

Collecting the raw data


Identifying the underlying statistical distribution
Estimating the parameters
Testing for goodness of fit

Chapter 8. Input Modeling

47

Dr. Mesut Gne

Vous aimerez peut-être aussi