26  Simulation Modelling

26.1 What Simulation Is

Simulation modelling replaces a real stochastic system with a computational model driven by random inputs, runs the model many times, and studies the distribution of the outputs. When an analytic formula is available it is usually faster; simulation is chosen when the system is too complex (queueing with general service times, network flow with congestion, inventory policies with stochastic demand and lead times, project schedules with random activity durations) to solve in closed form. Metropolis and Ulam (1949) is the standard starting reference for Monte Carlo methods; Law (2015) and Banks et al. (2010) are the modern textbook treatments.

```mermaid
flowchart LR
    A[Random inputs] --> B[Model logic]
    B --> C[Sample paths]
    C --> D[Output statistics]
    D --> E[Confidence intervals]
    D --> F[Decision]
```

Note: Simulation vs LP vs IP

LP and IP (Chapters 23 and 24) optimise under certainty. Simulation studies the distribution of outcomes under uncertainty, but does not itself optimise. Combining the two (for example, simulating outcomes across a range of candidate policies and picking the best) is the workflow for prescriptive analytics under uncertainty.

26.2 Monte Carlo and the Law of Large Numbers

Monte Carlo methods estimate an expectation by averaging many independent simulated draws. The law of large numbers guarantees convergence of the running mean to the true expectation; the central limit theorem describes the error: the standard error shrinks as \(1 / \sqrt{n}\). Doubling precision therefore costs four times the sample size, which is the basic budget equation for any simulation study.
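
As a minimal sketch of this pattern, the following estimates \(E[U^2] = 1/3\) for \(U \sim \text{Uniform}(0, 1)\) and reports the CLT-based 95 percent interval (the target expectation and sample size are illustrative choices, not from a particular application):

```r
# Monte Carlo estimate of E[U^2] = 1/3 with a 95% confidence interval.
set.seed(42)
n <- 10000
x <- runif(n)^2            # n independent draws of U^2

est <- mean(x)             # running mean after n replications
se  <- sd(x) / sqrt(n)     # CLT standard error, shrinks as 1/sqrt(n)
ci  <- est + c(-1.96, 1.96) * se

round(c(estimate = est, lower = ci[1], upper = ci[2]), 4)
```

Re-running with `4 * n` draws halves `se`, which is the four-times-the-budget-for-double-the-precision trade-off described above.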

The running mean drifts toward the analytic value as \(n\) grows and the 95 percent confidence interval brackets it. The shrinking-band intuition applies to every simulation estimator in this chapter.

26.3 Random Variate Generation

Business simulation pulls from a short list of distributions: uniform for symmetric uncertainty with hard limits, normal for symmetric continuous outcomes, lognormal for multiplicative positive quantities (prices, revenues, durations), Poisson for count arrivals in a fixed interval, exponential for inter-arrival and service times in memoryless systems, and binomial for pass/fail batches. set.seed() fixes the stream so a simulation is exactly reproducible.
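
A sketch of one draw set from each workhorse distribution; all parameter values below are illustrative:

```r
# One reproducible sample from each standard input distribution.
set.seed(123)
n <- 5000

u   <- runif(n, min = 50, max = 150)               # symmetric, hard limits
z   <- rnorm(n, mean = 100, sd = 15)               # symmetric continuous
rev <- rlnorm(n, meanlog = log(100), sdlog = 0.4)  # positive, right-skewed
k   <- rpois(n, lambda = 20)                       # counts per interval
s   <- rexp(n, rate = 1 / 4)                       # memoryless, mean 4
b   <- rbinom(n, size = 100, prob = 0.02)          # defects per batch of 100

round(c(mean_u = mean(u), mean_z = mean(z), mean_k = mean(k),
        mean_s = mean(s), mean_b = mean(b)), 2)
```

Because the seed is fixed, every re-run produces exactly the same draws and therefore exactly the same output statistics.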

The sample means and standard deviations land close to the parameter targets, and the histograms show the typical shapes. Picking the right distribution for the input is the single most consequential modelling decision in a simulation.

26.4 Risk Analysis

A capital project has an up-front cost, uncertain annual cash flows, and a discount rate. The net present value (NPV) is a linear function of the cash flows; treating the cash flows as random inputs turns NPV into a random output whose distribution tells the business about downside risk.
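
A sketch of the NPV-as-random-output idea, with assumed illustrative parameters (up-front cost 1000, five years of lognormal cash flows, 8 percent discount rate):

```r
# Distribution of NPV when annual cash flows are random.
set.seed(1)
n_rep <- 10000
cost  <- 1000
rate  <- 0.08
years <- 1:5

npv <- replicate(n_rep, {
  cf <- rlnorm(length(years), meanlog = log(300), sdlog = 0.3)
  sum(cf / (1 + rate)^years) - cost    # discounted cash flows minus cost
})

c(mean_npv   = mean(npv),
  p_negative = mean(npv < 0),                 # probability of a loss
  var_5pct   = unname(quantile(npv, 0.05)))   # 5% value-at-risk
```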

The mean NPV is positive, but the histogram shows a non-trivial left tail: the 5 percent value-at-risk and the probability of a negative NPV quantify the downside in a way a single point-estimate NPV calculation cannot.

26.5 Queueing Simulation

A single-server queue with exponential inter-arrival times (rate \(\lambda\)) and exponential service times (rate \(\mu\)) is the classical M/M/1 model (Kendall 1953). Analytic formulas exist for utilisation \(\rho = \lambda / \mu\) and mean waiting time, so this example also validates the simulation against a known answer.
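
One way to simulate the M/M/1 waiting times is the Lindley recursion \(W_{k+1} = \max(0, W_k + S_k - A_{k+1})\), validated here against the analytic mean wait in queue \(W_q = \rho / (\mu - \lambda)\); the rates \(\lambda = 0.8\), \(\mu = 1\) are illustrative:

```r
# M/M/1 waiting times via the Lindley recursion.
set.seed(7)
lambda <- 0.8; mu <- 1.0
n <- 200000

A <- rexp(n, rate = lambda)   # inter-arrival times
S <- rexp(n, rate = mu)       # service times

W <- numeric(n)               # waiting time in queue of each customer
for (k in 1:(n - 1)) W[k + 1] <- max(0, W[k] + S[k] - A[k + 1])

rho <- lambda / mu
c(simulated_Wq = mean(W), analytic_Wq = rho / (mu - lambda))
```

Swapping `rexp` for any other service-time distribution turns the same recursion into an M/G/1 simulation, for which no equally simple closed form exists.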

Simulated utilisation and mean waiting time match the analytic M/M/1 values. The same simulation scales immediately to non-exponential service, multi-server queues, and networks of queues where closed-form answers stop being available.

26.6 Inventory Simulation

The newsvendor problem stocks a perishable item ahead of a single selling period with uncertain demand. Overstock is wasted at cost \(c_o\) per unit; understock forgoes margin \(c_u\) per unit. The optimal order quantity under a known demand distribution is the critical-ratio quantile \(F^{-1}(c_u / (c_u + c_o))\) (Hadley and Whitin 1963). Simulation over a range of order quantities confirms the rule without assuming which quantile formula applies.
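
A sketch of the cost sweep, with illustrative costs \(c_u = 5\), \(c_o = 2\) and lognormal demand, checked against the critical-ratio quantile:

```r
# Newsvendor: minimise simulated overage + underage cost over a grid of q.
set.seed(99)
n_rep  <- 20000
demand <- rlnorm(n_rep, meanlog = log(100), sdlog = 0.3)
c_u <- 5; c_o <- 2

q_grid <- seq(60, 180, by = 5)
cost <- sapply(q_grid, function(q)
  mean(c_o * pmax(q - demand, 0) +     # waste on unsold units
       c_u * pmax(demand - q, 0)))     # forgone margin on lost sales

q_best <- q_grid[which.min(cost)]
crit   <- qlnorm(c_u / (c_u + c_o), meanlog = log(100), sdlog = 0.3)
c(simulated_q = q_best, critical_ratio_q = round(crit, 1))
```

Note that the same demand sample is reused for every candidate `q` (common random numbers), which keeps the cost curve smooth across the grid.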

The simulated cost curve is convex and the minimum identifies the best order quantity without requiring the analyst to invert any distribution function. Swapping rlnorm for a gamma or a real historical demand sample is a one-line change.

26.7 Project Duration under Uncertainty

The critical path method in Chapter 25 assumed every activity duration was known. In reality durations are uncertain, the critical path itself shifts across replications, and the project finish time is a distribution. Simulation replaces point durations with distributions and reports the full finish-time distribution (Malcolm et al. 1959, the original PERT paper).
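
A sketch with a tiny illustrative two-path project (path 1 = A then B, path 2 = A then C then D) and assumed lognormal durations; the project finishes when its longest path finishes:

```r
# Monte Carlo finish-time distribution for a two-path project.
set.seed(2024)
n_rep <- 10000

finish <- replicate(n_rep, {
  a <- rlnorm(1, log(5), 0.2); b <- rlnorm(1, log(8), 0.3)
  c <- rlnorm(1, log(4), 0.2); d <- rlnorm(1, log(7), 0.3)
  max(a + b, a + c + d)        # longest path determines the finish
})

c(mean_finish       = mean(finish),
  deterministic_cpm = 5 + 4 + 7,   # longest path at the point (median) durations
  p90               = unname(quantile(finish, 0.90)))
```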

The finish-time distribution shifts right of the deterministic CPM answer because the longest of several random paths is itself a skewed variable. The 90th percentile is what a stakeholder asking for a “safe” deadline actually needs.

26.8 Variance Reduction

Antithetic variates pair each random draw \(U\) with \(1 - U\) and average the two evaluations. When the integrand is monotone in \(U\), this reduces variance without increasing sample size, so the same precision is obtained with fewer replications.
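
A sketch for the monotone integrand \(e^U\), whose true expectation is \(e - 1\); both estimators below spend the same budget of \(2n\) function evaluations:

```r
# Antithetic variates vs crude Monte Carlo for E[exp(U)], U ~ Uniform(0,1).
set.seed(11)
n <- 10000

u     <- runif(n)
crude <- exp(runif(2 * n))              # 2n independent evaluations
anti  <- (exp(u) + exp(1 - u)) / 2      # n antithetic pairs = 2n evaluations

c(se_crude      = sd(crude) / sqrt(2 * n),
  se_antithetic = sd(anti) / sqrt(n))
```

The negative correlation between `exp(u)` and `exp(1 - u)` is what drives the pair variance, and hence the standard error, below the crude estimator's.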

At the same replication budget, the antithetic estimator reports a much smaller standard error than the crude estimator, which is why variance-reduction techniques are standard in production simulations (Kleijnen 2015).

26.9 Confidence Intervals and Replications

For a Monte Carlo mean estimator, the half-width of a 95 percent CI is \(1.96 \, s / \sqrt{n}\). To shrink the half-width to a target \(h\) requires \(n \approx (1.96 \, s / h)^2\). Welford’s (1962) online update computes the running mean and variance in a single pass without storing all replications.
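
A sketch of the one-pass update and the replication-planning formula, on illustrative normal data:

```r
# Welford's one-pass running mean and variance, then the replication
# count needed for a target 95% CI half-width h.
welford <- function(x) {
  m <- 0; s <- 0
  for (k in seq_along(x)) {
    d <- x[k] - m
    m <- m + d / k              # update running mean
    s <- s + d * (x[k] - m)     # accumulate sum of squared deviations
  }
  c(mean = m, var = s / (length(x) - 1))
}

set.seed(5)
x  <- rnorm(1000, mean = 10, sd = 2)
st <- welford(x)

h <- 0.05                                          # target half-width
n_needed <- ceiling((1.96 * sqrt(st["var"]) / h)^2)
c(st, n_needed = unname(n_needed))
```

A short pilot run supplies the `s` in the formula; the full run is then sized before it starts.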

The half-width shrinks as \(1 / \sqrt{n}\), which appears as a straight line of slope \(-1/2\) on a log-log plot, and the formula gives the replication count needed for a target precision before the full run begins.

26.10 Validating a Simulation Model

Simulation outputs are only as credible as the model itself. Three layers of validation are standard (Law 2015). Face validity checks that the flowchart matches how the business thinks the system works. Historical validation runs the model with parameters estimated from past data and compares simulated outputs to the outcomes that actually occurred. Statistical validation applies formal tests (goodness-of-fit for the input distributions, and hypothesis tests comparing simulated and real output distributions). A model that passes all three layers is defensible in a decision review; one that skips validation is advocacy, not analysis.

Tip: Sensitivity for free

Once the simulation harness exists, sweeping a single input parameter across a range and re-running is a cheap sensitivity analysis. Reporting the output distribution at the 10th, 50th, and 90th percentiles of the input gives stakeholders a feel for which assumptions matter most.
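
A sketch of such a sweep, re-running an NPV simulation (with the same illustrative structure as Section 26.4) at three assumed percentiles of the mean-cash-flow input:

```r
# One-parameter sensitivity sweep over the mean annual cash flow.
set.seed(3)
npv_at <- function(cf_mean, n_rep = 5000) {
  mean(replicate(n_rep, {
    cf <- rlnorm(5, meanlog = log(cf_mean), sdlog = 0.3)
    sum(cf / 1.08^(1:5)) - 1000
  }))
}

# Assumed 10th/50th/90th percentiles of the input, for illustration only.
sens <- sapply(c(p10 = 250, p50 = 300, p90 = 350), npv_at)
round(sens, 1)
```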

26.11 Reporting Simulation Results

A complete simulation report states the population and the question; the model logic (flowchart plus the equations); each input with its distribution and source; the number of replications, the seed, and any variance-reduction technique used; the output distribution summarised by mean, standard deviation, relevant quantiles, and the 95 percent CI for the mean; a validation summary against historical data; and a sensitivity table for the one or two inputs stakeholders care most about. Law (2015) and Banks et al. (2010) are the standard methodological references for structuring and reviewing simulation studies.

26.12 Summary

Summary of simulation-modelling concepts introduced in this chapter

| Concept | Description |
| --- | --- |
| **Foundations** | |
| Simulation vs closed-form | Simulate when analytic formulas are unavailable or intractable |
| Monte Carlo and the LLN | Running mean converges; standard error shrinks as \(1 / \sqrt{n}\) |
| **Random generation** | |
| Random variate generation | runif, rnorm, rlnorm, rexp, rpois, rbinom for standard inputs |
| Seed and reproducibility | set.seed() fixes the random stream for an exactly repeatable run |
| **Applications** | |
| Risk analysis (NPV distribution) | Distribution of NPV; mean, sd, P(NPV > 0), VaR at 5 percent |
| M/M/1 queueing | Exponential arrivals and service; utilisation and mean wait |
| Newsvendor inventory | Critical-ratio quantile found by minimising simulated cost |
| Project duration under uncertainty | Random activity durations yield a finish-time distribution |
| **Accuracy** | |
| Antithetic variates | Pair \(U\) with \(1 - U\); reduces variance when the integrand is monotone |
| Running mean and variance | Welford online update for mean and variance in one pass |
| Confidence interval half-width | \(1.96 \, s / \sqrt{n}\); shrinks as \(1 / \sqrt{n}\) |
| Replication planning | \(n \approx (1.96 \, s / h)^2\) for target half-width \(h\) |
| **Delivery** | |
| Validation layers | Face, historical, statistical validation before decisions are made |
| Reporting checklist | Model, inputs, replications, seed, outputs, CI, validation, sensitivity |