WHAT IS MODELING?
Modeling is the process of producing a model; a model is a representation of the construction and working of some system of interest. A model is similar to but simpler than the system it represents. One purpose of a model is to enable the analyst to predict the effect of changes to the system. On the one hand, a model should be a close approximation to the real system and incorporate most of its salient features. On the other hand, it should not be so complex that it is impossible to understand and experiment with it. A good model is a judicious tradeoff between realism and simplicity. Simulation practitioners recommend increasing the complexity of a model iteratively. An important issue in modeling is model validity. Model validation techniques include simulating the model under known input conditions and comparing model output with system output.
Generally, a model intended for a simulation study is a mathematical model developed with the help of simulation software. Mathematical model classifications include deterministic (input and output variables are fixed values) or stochastic (at least one of the input or output variables is probabilistic); static (time is not taken into account) or dynamic (time-varying interactions among variables are taken into account). Typically, simulation models are stochastic and dynamic.

WHAT IS SIMULATION?
A simulation of a system is the operation of a model of the system. The model can be reconfigured and experimented with; usually, this is impossible, too expensive, or impractical to do in the system it represents. The operation of the model can be studied, and hence, properties concerning the behavior of the actual system or its subsystem can be inferred. In its broadest sense, simulation is a tool to evaluate the performance of a system, existing or proposed, under different configurations of interest and over long periods of real time.

Simulation is used before an existing system is altered or a new system built, to: reduce the chances of failure to meet specifications; eliminate unforeseen bottlenecks; prevent under- or over-utilization of resources; and optimize system performance. For instance, simulation can be used to answer questions like: What is the best design for a new telecommunications network? What are the associated resource requirements? How will a telecommunication network perform when the traffic load increases by 50%? How will a new routing algorithm affect its performance? Which network protocol optimizes network performance? What will be the impact of a link failure?

EEEQ 337 Page 2

The subject of this tutorial is discrete event simulation, in which the central assumption is that the system changes instantaneously in response to certain discrete events. For instance, in an M/M/1 queue (a single-server queuing process in which the time between arrivals and the service time are exponential), an arrival causes the system to change instantaneously. On the other hand, continuous simulators, like flight simulators and weather simulators, attempt to quantify the changes in a system continuously over time in response to controls. Discrete event simulation is less detailed (coarser in its smallest time unit) than continuous simulation, but it is much simpler to implement, and hence, is used in a wide variety of situations.

Figure 1 is a schematic of a simulation study. The iterative nature of the process is indicated by the system under study becoming the altered system, which then becomes the system under study, and the cycle repeats. In a simulation study, human decision making is required at all stages, namely, model development, experiment design, output analysis, conclusion formulation, and making decisions to alter the system under study. The only stage where human intervention is not required is the running of the simulations, which most simulation software packages perform efficiently. The important point is that powerful simulation software is merely a hygiene factor: its absence can hurt a simulation study but its presence will not ensure success. Experienced problem formulators and simulation modelers and analysts are indispensable for a successful simulation study.
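To make the idea of discrete events concrete, the following is a minimal sketch of an event-driven M/M/1 queue in Python. It is an illustration only, not the method of any particular simulation package; the function name `simulate_mm1`, its parameters, and the chosen rates are assumptions made for this example.

```python
import heapq
import random
from collections import deque

def simulate_mm1(arrival_rate, service_rate, num_customers, seed=1):
    """Event-driven sketch of a single-server FIFO queue.

    Returns the mean time spent waiting in queue by the first
    `num_customers` customers that start service.
    """
    rng = random.Random(seed)
    # Future event list: (event time, event kind), ordered by time.
    events = [(rng.expovariate(arrival_rate), "arrival")]
    waiting = deque()        # arrival times of customers still in queue
    server_busy = False
    started = 0              # customers that have begun service
    total_wait = 0.0

    while started < num_customers:
        clock, kind = heapq.heappop(events)   # jump to the next event
        if kind == "arrival":
            # Every arrival schedules the next arrival.
            heapq.heappush(events, (clock + rng.expovariate(arrival_rate), "arrival"))
            if server_busy:
                waiting.append(clock)
            else:
                server_busy = True
                started += 1                  # served at once, zero wait
                heapq.heappush(events, (clock + rng.expovariate(service_rate), "departure"))
        else:  # departure
            if waiting:
                total_wait += clock - waiting.popleft()
                started += 1
                heapq.heappush(events, (clock + rng.expovariate(service_rate), "departure"))
            else:
                server_busy = False
    return total_wait / started

mean_wait = simulate_mm1(arrival_rate=0.5, service_rate=1.0, num_customers=50000)
print(f"mean wait in queue: {mean_wait:.3f}")
```

The system state (queue contents, server status) changes only at arrival and departure instants, and the clock jumps from one event to the next. For these assumed rates the textbook M/M/1 result for the mean wait in queue, λ/(μ(μ−λ)), is 1.0, so the printed estimate should be close to that value.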
The steps involved in developing a simulation model, designing a simulation experiment, and performing simulation analysis are:
Step 1. Identify the problem.
Step 2. Formulate the problem.
Step 3. Collect and process real system data.
Step 4. Formulate and develop a model.
Step 5. Validate the model.
Step 6. Document model for future use.
Step 7. Select appropriate experimental design.
Step 8. Establish experimental conditions for runs.
Step 9. Perform simulation runs.
Step 10. Interpret and present results.
Step 11. Recommend further course of action.
Although this is a logical ordering of steps in a simulation study, many iterations at various sub-stages may be required before the objectives of a simulation study are achieved. Not all the steps may be possible and/or required. On the other hand, additional steps may have to be performed. The next three sections describe these steps in detail.

HOW TO DEVELOP A SIMULATION MODEL
Simulation models consist of the following components: system entities, input variables, performance measures, and functional relationships. For instance, in a simulation model of an M/M/1 queue, the server and the queue are system entities, arrival rate and service rate are input variables, mean wait time and maximum queue length are performance measures, and 'time in system = wait time + service time' is an example of a functional relationship. Almost all simulation software packages provide constructs to model each of the above components. Modeling is arguably the most important part of a simulation study. Indeed, a simulation study is only as good as its simulation model. Simulation modeling comprises the following steps:
Step 1. Identify the problem. Enumerate problems with an existing system. Produce requirements for a proposed system.
Step 2. Formulate the problem. Select the bounds of the system, the problem or a part thereof, to be studied. Define the overall objective of the study and a few specific issues to be addressed. Define performance measures, i.e., quantitative criteria on the basis of which different system configurations will be compared and ranked. Identify, briefly at this stage, the configurations of interest and formulate hypotheses about system performance. Decide the time frame of the study, i.e., will the model be used for a one-time decision (e.g., capital expenditure) or over a period of time on a regular basis (e.g., air traffic scheduling)? Identify the end user of the simulation model, e.g., corporate management versus a production supervisor. Problems must be formulated as precisely as possible.
Step 3. Collect and process real system data. Collect data on system specifications (e.g., bandwidth for a communication network), input variables, as well as performance of the existing system. Identify sources of randomness in the system, i.e., the stochastic input variables. Select an appropriate input probability distribution for each stochastic input variable and estimate the corresponding parameter(s).
Software packages for distribution fitting and selection include ExpertFit, BestFit, and add-ons in some standard statistical packages. These aids combine goodness-of-fit tests, e.g., the chi-square test, Kolmogorov-Smirnov test, and Anderson-Darling test, and parameter estimation in a user-friendly format.
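As a rough illustration of what such fitting aids do internally, the sketch below estimates the mean of an exponential distribution and computes a Kolmogorov-Smirnov distance using only the Python standard library. The data are synthetic stand-ins generated for this example (not real system data), and the function names are hypothetical.

```python
import math
import random

def fit_exponential(samples):
    """Maximum-likelihood estimate of the exponential mean (the sample mean)."""
    return sum(samples) / len(samples)

def ks_statistic_exponential(samples, mean):
    """Largest gap between the empirical CDF and the exponential CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / mean)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

# Synthetic stand-in for collected service times (assumed true mean: 15).
rng = random.Random(42)
data = [rng.expovariate(1 / 15.0) for _ in range(5000)]

mean_hat = fit_exponential(data)
d_stat = ks_statistic_exponential(data, mean_hat)
print(f"estimated mean = {mean_hat:.2f}, KS distance = {d_stat:.4f}")
```

A small distance suggests the candidate distribution is plausible. Note that when the parameter is estimated from the same data, the standard Kolmogorov-Smirnov critical values are conservative, which is one reason dedicated fitting software is convenient.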
Standard distributions, e.g., exponential, Poisson, normal, hyperexponential, etc., are easy to model and simulate. Although most simulation software packages include many distributions as a standard feature, issues relating to random number generators and generating random variates from various distributions are pertinent and should be looked into. Empirical distributions are used when standard distributions are not appropriate or do not fit the available system data. A triangular, uniform, or normal distribution is often used as a first guess when no data are available.
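One common way to generate random variates from a standard distribution is the inverse-transform method. The sketch below applies it to the exponential distribution; the function name and the mean of 15 are assumptions chosen for illustration, and a simulation package may well use a different generator internally.

```python
import math
import random

def exponential_variate(u, mean):
    """Map a uniform number u in [0, 1) to an exponential variate.

    Inverts the CDF F(x) = 1 - exp(-x / mean), giving x = -mean * ln(1 - u).
    """
    return -mean * math.log(1.0 - u)

rng = random.Random(7)
variates = [exponential_variate(rng.random(), mean=15.0) for _ in range(20000)]
sample_mean = sum(variates) / len(variates)
print(f"sample mean of generated variates: {sample_mean:.2f}")
```

Because the variates are a deterministic function of the uniform numbers, the quality of the underlying random number generator directly affects the quality of the simulated input.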
Step 4. Formulate and develop a model. Develop schematics and network diagrams of the system (How do entities flow through the system?). Translate these conceptual models into a form acceptable to the simulation software. Verify that the simulation model executes as intended. Verification techniques include traces, varying input parameters over their acceptable range and checking the output, substituting constants for random variables and manually checking results, and animation.
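The verification technique of substituting constants for random variables can be sketched as follows. The example uses the Lindley recursion for a single-server FIFO queue as a stand-in model; the function name and the constant inputs are hypothetical and chosen so that the results can be checked by hand.

```python
def waiting_times(gaps, service_times):
    """Queue waits in a single-server FIFO queue (Lindley recursion).

    gaps[n - 1] is the time between the arrivals of customers n - 1 and n;
    service_times[n] is the service time of customer n.
    The first customer arrives to an empty system and does not wait.
    """
    waits = [0.0]
    for n in range(1, len(service_times)):
        waits.append(max(0.0, waits[-1] + service_times[n - 1] - gaps[n - 1]))
    return waits

# Constants in place of random variables: service (2) is slower than the
# arrival spacing (1), so each customer waits one time unit longer.
overloaded = waiting_times(gaps=[1.0] * 4, service_times=[2.0] * 5)
print(overloaded)

# Service (1) is faster than the arrival spacing (2): nobody should wait.
idle = waiting_times(gaps=[2.0] * 4, service_times=[1.0] * 5)
print(idle)
```

If the model's output with constant inputs disagrees with a manual calculation like this one, the logic is wrong regardless of how plausible the stochastic output looks.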
Step 5. Validate the model. Compare the model's performance under known conditions with the performance of the real system. Perform statistical inference tests and get the model examined by system experts. Assess the confidence that the end user places on the model and address problems, if any. For major simulation studies, experienced consultants advocate a structured presentation of the model by the simulation analyst(s) before an audience of management and system experts. This not only ensures that the model assumptions are correct, complete, and consistent, but also enhances confidence in the model.
Step 6. Document model for future use. Document objectives, assumptions and input variables in detail.
HOW TO DESIGN A SIMULATION EXPERIMENT
A simulation experiment is a test or a series of tests in which meaningful changes are made to the input variables of a simulation model so that we may observe and identify the reasons for changes in the performance measures. The number of experiments in a simulation study is greater than or equal to the number of questions being asked about the model (e.g., Is there a significant difference between the mean delay in communication networks A and B? Which network has the least delay: A, B, or C? How will a new routing algorithm affect the performance of network B?). Design of a simulation experiment involves answering the question: what data need to be obtained, in what form, and how much? The following steps illustrate the process of designing a simulation experiment.
Step 7. Select appropriate experimental design. Select a performance measure, a few input variables that are likely to influence it, and the levels of each input variable. Document the experimental design.
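One simple way to lay out such a design is a full factorial, in which every combination of input levels is run. The sketch below enumerates one; the factor names and levels are hypothetical and serve only to show the bookkeeping.

```python
from itertools import product

# Hypothetical input variables and their levels (assumptions for illustration).
factors = {
    "arrival_rate": [0.5, 0.8],
    "routing": ["current", "proposed"],
    "num_servers": [1, 2, 3],
}

def full_factorial(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product(*(factors[name] for name in names))]

design = full_factorial(factors)
print(f"{len(design)} configurations")
for config in design[:3]:
    print(config)
```

The number of configurations grows multiplicatively with the number of levels, which is why the step above recommends selecting only a few input variables.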
Step 8. Establish experimental conditions for runs. Address the question of obtaining accurate information and the most information from each run. Determine if the system is stationary (performance measure does not change over time) or non-stationary (performance measure changes over time). Generally, in stationary systems, steady-state behavior of the response variable is of interest. Ascertain whether a terminating or a non-terminating simulation run is appropriate. Select the run length. Select appropriate starting conditions (e.g., empty and idle, five customers in queue at time 0). Select the length of the warm-up period, if required. Decide the number of independent runs (each run uses a different random number stream and the same starting conditions) by considering output data sample size. Sample size must be large enough (at least 3-5 runs for each configuration) to provide the required confidence in the performance measure estimates. Alternatively, use common random numbers to compare alternative configurations by using a separate random number stream for each sampling process in a configuration. Identify output data most likely to be correlated.
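The run conditions described in Step 8 can be sketched in a few lines. The example below performs five independent runs of a single-server queue, each starting empty and idle, each with its own seed (a simple stand-in for a separate random number stream), and discards a warm-up period before collecting statistics. The rates, run length, and warm-up length are assumptions chosen for illustration.

```python
import random

def run_once(seed, arrival_rate=0.5, service_rate=1.0,
             num_customers=20000, warmup=2000):
    """One run: mean wait in queue, ignoring the first `warmup` customers."""
    rng = random.Random(seed)      # a different seed for every run
    wait = 0.0                     # empty and idle: first customer waits 0
    total = 0.0
    for n in range(num_customers):
        if n >= warmup:            # statistics are collected after warm-up
            total += wait
        service = rng.expovariate(service_rate)
        gap = rng.expovariate(arrival_rate)
        # Wait of the next customer (Lindley recursion).
        wait = max(0.0, wait + service - gap)
    return total / (num_customers - warmup)

run_means = [run_once(seed) for seed in range(1, 6)]
for seed, value in enumerate(run_means, start=1):
    print(f"run {seed}: mean wait = {value:.3f}")
```

Each run yields one observation of the performance measure; the spread among the five values gives a first impression of how many runs are needed.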
Most simulation packages provide run statistics (mean, standard deviation, minimum value, maximum value) on the performance measures, e.g., wait time (non-time persistent statistic), inventory on hand (time persistent statistic). Let the mean wait time in an M/M/1 queue observed from n runs be W1, W2, ..., Wn. It is important to understand that the mean wait time W is a random variable and the objective of output analysis is to estimate the true mean of W and to quantify its variability.
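A standard way to estimate the true mean of W and quantify its variability is a confidence interval built from the n run means. The sketch below assumes the runs are independent and the run means are approximately normally distributed; the five values are hypothetical, and the small table of Student t critical values covers only up to 11 runs.

```python
import math
import statistics

# Two-sided 95% Student t critical values, indexed by degrees of freedom.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
         6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def mean_confidence_interval(run_means):
    """Return (point estimate, half-width) of a 95% confidence interval."""
    n = len(run_means)
    center = statistics.mean(run_means)
    # stdev() is the sample standard deviation (divides by n - 1).
    half_width = T_975[n - 1] * statistics.stdev(run_means) / math.sqrt(n)
    return center, half_width

# Hypothetical mean wait times W1, ..., W5 from five independent runs.
w = [4.8, 5.1, 5.0, 4.9, 5.2]
center, half_width = mean_confidence_interval(w)
print(f"mean wait = {center:.2f} +/- {half_width:.2f}")
```

A narrower interval requires more runs or longer runs; the half-width shrinks roughly with the square root of the number of runs.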
Notwithstanding the facts that there are no data collection errors in simulation, the underlying model is fully known, and replications and configurations are user controlled, simulation results are difficult to interpret. An observation may be due to system characteristics or just a random occurrence. Normally, statistical inference can assess the significance of an observed phenomenon, but most statistical inference techniques assume independent, identically distributed (iid) data. Most types of simulation data are autocorrelated, and hence, do not satisfy this assumption. Analysis of simulation output data consists of the following steps.
Step 10. Interpret and present results. Compute numerical estimates (e.g., mean, confidence intervals) of the desired performance measure for each configuration of interest. To obtain confidence intervals for the mean of autocorrelated data, the technique of batch means can be used. In batch means, the original contiguous data set from a run is replaced with a smaller data set containing the means of contiguous batches of original observations. The assumption that batch means are independent may not always be true; increasing the total sample size and increasing the batch length may help.
Test hypotheses about system performance. Construct graphical displays (e.g., pie charts, histograms) of the output data. Document results and conclusions.
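The batch means technique mentioned in Step 10 can be sketched as follows. The function name and the toy data are illustrative assumptions; in practice the observations would come from a single long simulation run, and the batch size would be chosen large enough that the batch means are roughly uncorrelated.

```python
def batch_means(observations, batch_size):
    """Replace a run's observations with means of contiguous batches.

    Observations left over after the last full batch are dropped.
    """
    num_batches = len(observations) // batch_size
    return [sum(observations[i * batch_size:(i + 1) * batch_size]) / batch_size
            for i in range(num_batches)]

# Toy data standing in for an autocorrelated output series.
series = list(range(1, 13))
means = batch_means(series, batch_size=3)
print(means)
```

The batch means are then treated as approximately independent observations, so the usual confidence interval for a mean can be computed from them.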
Step 11. Recommend further course of action. This may include further experiments to increase the precision and reduce the bias of estimators, to perform sensitivity analyses, optimization, etc.
AN EXAMPLE
A machine shop contains two drills, one straightener, and one finishing operator. Figure 2 shows a schematic of the machine shop. Two types of parts enter the machine shop.
Type 1 parts require drilling, straightening, and finishing in sequence. Type 2 parts require only drilling and finishing. The frequency of arrival and the time to be routed to the drilling area are deterministic for both types of parts.
Step 1. Identify the problem. The utilization of drills, straightener, and finishing operator needs to be assessed. In addition, the following modification to the original system is of interest: the interarrival times of both part types are exponentially distributed with the same respective means as in the original system.
Step 2. Formulate the problem. The objective is to obtain the utilization of drills, straightener, and finishing operator for the original system and the modification. The assumptions include: the two drills are identical; there is no material handling time between the three operations; machine availability implies operator availability; parts are processed on a FIFO basis; all times are in minutes.
Step 3. Collect and process real system data. At the job shop, a Type 1 part arrives every 30 minutes, and a Type 2 part arrives every 20 minutes. It takes 2 minutes to route a Type 1 part and 10 minutes to route a Type 2 part to the drilling area. Parts wait in a queue till one of the two drilling machines becomes available. After drilling, Type 1 parts are routed to the straightener and Type 2 parts are routed to the finishing operator. After straightening, Type 1 parts are routed to the finishing operator.
The operation times for either part were determined to be as follows. Drilling time is normally distributed with mean 10.0 and standard deviation 1.0. Straightening time is exponentially distributed with a mean of 15.0. Finishing requires 5 minutes per part.
Step 4. Formulate and develop a model. A model of the system and the modification was developed using a simulation package. A trace verified that the parts flowed through the job shop as expected.
Step 5. Validate the model. The utilization for a sufficiently long run of the original system was judged to be reasonable by the machine shop operators.
Step 6. Document model for future use. The models of the original system and the modification were documented as thoroughly as possible.
Step 7. Select appropriate experimental design. The original system and the modification described above were studied.
Step 8. Establish experimental conditions for runs. Each model was run three times for 4000 minutes and statistical registers were cleared at time 1000, so the statistics below were collected on the time interval [1000, 4000]. At the beginning of a simulation run, there were no parts in the machine shop.
Step 9. Perform simulation runs. Runs were performed as specified in Step 8 above.
Step 10. Interpret and present results. Table 1 contains the utilization statistics of the three operations for the original system and the modification (in parentheses).
Mean utilization represents the fraction of time a server is busy, i.e., busy time/total time. Furthermore, the average utilization output for drilling must be divided by the number of drills in order to get the utilization per drill. Each drill is busy about 40% of the time and straightening and finishing operations are busy about half the time. This implies that for the given work load, the system is underutilized. Consequently, the average utilization did not change substantially between the original system and the modification; the standard deviation of the drilling operation seems to have increased because of the increased randomness in the modification. The statistical significance of these observations can be determined by computing confidence intervals on the mean utilization of the original and modified systems.
Step 11. Recommend further course of action. Other performance measures of interest may be: throughput of parts for the system, mean time in system for both types of parts, average and maximum queue lengths for each operation. Other modifications of interest may be: the flow of parts to the machine shop doubles, the finishing operation will be repeated for 10% of the products on a probabilistic basis.
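As a back-of-the-envelope cross-check on simulated utilizations, the long-run fraction of time a station is busy can be approximated as arrival rate times mean service time, divided by the number of servers. The sketch below applies this to the data from Step 3 of the example. It assumes every arriving part is eventually served (no station is overloaded); it does not reproduce Table 1, and simulated values over a finite interval will differ somewhat.

```python
def offered_utilization(arrivals_per_minute, mean_service_minutes, servers=1):
    """Approximate long-run busy fraction per server."""
    return arrivals_per_minute * mean_service_minutes / servers

type1_rate = 1 / 30    # one Type 1 part every 30 minutes
type2_rate = 1 / 20    # one Type 2 part every 20 minutes

# Both part types are drilled (mean 10 minutes) on one of two drills.
drill = offered_utilization(type1_rate + type2_rate, 10.0, servers=2)
# Only Type 1 parts are straightened (mean 15 minutes).
straightener = offered_utilization(type1_rate, 15.0)
# Both part types are finished (5 minutes each).
finishing = offered_utilization(type1_rate + type2_rate, 5.0)

print(f"per drill:    {drill:.3f}")
print(f"straightener: {straightener:.3f}")
print(f"finishing:    {finishing:.3f}")
```

Because these values depend only on the mean arrival and service times, they are the same for the original system and the modification, which is consistent with the observation that average utilization did not change substantially.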
HOW TO SELECT SIMULATION SOFTWARE
Although a simulation model can be built using general purpose programming languages which are familiar to the analyst, available over a wide variety of platforms, and less expensive, most simulation studies today are implemented using a simulation package. The advantages are reduced programming requirements; natural framework for simulation modeling; conceptual guidance; automated gathering of statistics; graphic symbolism for communication; animation; and increasingly, flexibility to change the model.

Naturally, the question of how to select the best simulation software for an application arises. Metrics for evaluation include modeling flexibility, ease of use, modeling structure (hierarchical vs. flat; object-oriented vs. nested), code reusability, graphic user interface, animation, dynamic business graphics, hardware and software requirements, statistical capabilities, output reports and graphical plots, customer support, and documentation.

The two types of simulation packages are simulation languages and application-oriented simulators. Simulation languages offer more flexibility than the application-oriented simulators. On the other hand, languages require varying amounts of programming expertise. Application-oriented simulators are easier to learn and have modeling constructs closely related to the application. Most simulation packages incorporate animation, which is excellent for communication and can be used to debug the simulation program; a "correct looking" animation, however, is not a guarantee of a valid model. More importantly, animation is not a substitute for output analysis.