Queueing Theory Fundamentals

Queueing Theory is the mathematical study of waiting lines (queues), providing the analytical foundation for staffing decisions, service-level modeling, and capacity planning in contact centers and other service operations. The theory characterizes the behavior of systems in which customers (or contacts) arrive at a service facility, wait if servers are unavailable, and eventually receive service before departing.[1] In workforce management, queueing models underpin the Erlang-C and Erlang-A formulas that translate offered load and staffing levels into predicted Service Level, abandonment rates, and average wait times. A working understanding of queueing fundamentals — arrival processes, service distributions, queue disciplines, and Little's Law — is prerequisite to interpreting staffing calculations and understanding their limitations.[2]
The M/M/c Queue
The canonical model for contact center staffing is the M/M/c queue, characterized by three components corresponding to its Kendall notation:
- M (first): Markovian (Poisson) arrival process — interarrival times are exponentially distributed and independent.
- M (second): Markovian (exponential) service times — handle times are exponentially distributed and independent.
- c: c parallel servers (agents), each serving one customer at a time with a common service rate.
Additional assumptions in the basic M/M/c model include: infinite waiting room (no capacity limit on queue length), infinite customer population, first-come first-served (FCFS) queue discipline, and no customer abandonment.
Poisson Arrival Process
The Poisson process is the standard model for random arrivals in which the occurrence of one arrival does not affect the probability of subsequent arrivals (memoryless property). The key parameter is the arrival rate λ (contacts per unit time). The probability that exactly k contacts arrive in a time interval of length t is:
- P(N=k) = (λt)^k e^{-λt} / k!
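The formula above can be evaluated directly. The following is a minimal sketch; the function name `poisson_pmf` and the example figures (2 contacts per minute, a 15-minute interval) are illustrative assumptions, not values from any data set. The computation is done in log space so that large counts do not overflow.

```python
import math

def poisson_pmf(k: int, lam: float, t: float) -> float:
    """P(N(t) = k) for a Poisson process with rate lam over an interval of length t.

    Assumes lam * t > 0. Uses logs to avoid overflow in (lam*t)**k and k!.
    """
    mean = lam * t
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

# Hypothetical example: 2 contacts per minute over a 15-minute interval,
# so the expected count is lam * t = 30.
lam, t = 2.0, 15.0
p_exactly_30 = poisson_pmf(30, lam, t)
p_at_most_30 = sum(poisson_pmf(k, lam, t) for k in range(31))

print(f"P(N = 30)  = {p_exactly_30:.4f}")
print(f"P(N <= 30) = {p_at_most_30:.4f}")
```

Even when the expected count is exactly 30, the chance of seeing exactly 30 contacts is small; most of the probability is spread over neighboring counts.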
Empirical studies of contact center arrival data confirm that intraday arrivals are approximately Poisson within short intervals (typically 15–30 minutes), conditional on the arrival rate being stable within the interval. However, arrival rates vary substantially across intervals and across days of week, necessitating interval-level staffing calculations rather than a single daily model. Brown et al. (2005) document that contact center arrivals exhibit overdispersion — greater variance than a pure Poisson process — particularly over longer time horizons, implying that uncertainty in the arrival process is larger than the Poisson model suggests.[3]
Service Time Distribution
The exponential service time assumption (the second M in M/M/c) implies that handle times are memoryless: the probability of a call ending in the next second is the same regardless of how long the call has already lasted. In practice, contact center handle times are more accurately described by lognormal or Erlang-k distributions, which have a mode greater than zero and a right tail. The exponential assumption is analytically convenient but can misstate handle-time variability, and with it the predicted service performance.
The mean service rate is μ = 1/AHT, where AHT is the Average Handle Time (the mean time to handle one contact). The offered load (in Erlangs) is:
- A = λ / μ = λ × AHT
where λ and AHT use consistent time units.
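As a small worked example of the unit-consistency requirement, the sketch below converts an interval volume into a per-second arrival rate before multiplying by AHT in seconds. The function name `offered_load` and the input figures are illustrative assumptions.

```python
def offered_load(contacts: float, interval_minutes: float, aht_seconds: float) -> float:
    """Offered load A = lam * AHT in Erlangs, with both terms expressed in seconds."""
    lam_per_second = contacts / (interval_minutes * 60.0)  # arrival rate, contacts/second
    return lam_per_second * aht_seconds

# Hypothetical interval: 120 contacts in 30 minutes, AHT of 300 seconds.
a = offered_load(contacts=120, interval_minutes=30, aht_seconds=300)
mu = 1.0 / 300  # service rate per agent, contacts/second

print(f"Offered load: {a:.1f} Erlangs")       # 120/1800 * 300 = 20.0
print(f"Service rate: {mu:.5f} contacts/s")
```

An offered load of 20 Erlangs means that, on average, 20 agents' worth of work is in progress at any moment if no contact is lost or delayed.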
Traffic Intensity
Traffic intensity ρ (rho) is the ratio of offered load to capacity:
- ρ = A / c = λ × AHT / c
For a stable queue (one that does not grow without bound), traffic intensity must be strictly less than 1: ρ < 1. When ρ ≥ 1, the queue grows without bound — the system is in overload and wait times diverge to infinity. In practice, contact center staffing targets ρ in the range 0.80–0.90, corresponding to an Occupancy of 80–90%.
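The stability condition and the occupancy target can be checked with a few lines of arithmetic. This is a sketch only; the helper names `traffic_intensity` and `min_agents_for_occupancy` and the 20-Erlang load are assumptions made for illustration.

```python
import math

def traffic_intensity(load_erlangs: float, agents: int) -> float:
    """rho = A / c."""
    return load_erlangs / agents

def min_agents_for_occupancy(load_erlangs: float, max_occupancy: float) -> int:
    """Smallest agent count c such that A / c does not exceed max_occupancy."""
    return math.ceil(load_erlangs / max_occupancy)

load = 20.0  # hypothetical offered load in Erlangs
for agents in (19, 20, 22, 24):
    rho = traffic_intensity(load, agents)
    status = "stable" if rho < 1 else "overloaded"
    print(f"c = {agents}: rho = {rho:.3f} ({status})")

print(min_agents_for_occupancy(load, 0.85))  # 24, since 20 / 0.85 = 23.5...
```

Note that an occupancy cap gives only a lower bound on staffing; whether that staffing meets a service-level target is a separate calculation.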
Little's Law
Little's Law is one of the most powerful and general results in queueing theory. It states that for any stable queueing system in steady state:
- L = λ × W
where:
- L = mean number of customers in the system (in queue + in service)
- λ = mean arrival rate
- W = mean time a customer spends in the system (wait time + service time)
Little's Law requires no distributional assumptions — it holds for any stable queue under very general conditions. In contact center applications, it provides a direct relationship between observable operational metrics:
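- Given the mean number of contacts in the system L (waiting plus in service) and the arrival rate λ, the mean time in system is W = L / λ.
- Given a target mean time in system and an arrival rate, the mean number of contacts the system must hold follows directly as L = λ × W.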
Little's Law also applies to subsystems: the queue subsystem alone obeys L_q = λ × W_q, where L_q is the mean queue length and W_q is the mean wait time (excluding service time). See Gans et al. (2003) for a comprehensive treatment.
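A short numerical sketch shows how the system-level and queue-level forms of Little's Law fit together. All figures are hypothetical, and time is measured in minutes throughout.

```python
# Hypothetical steady-state observations (time unit: minutes).
lam = 2.0   # mean arrival rate, contacts per minute
l_q = 4.0   # mean number of contacts waiting in queue
aht = 5.0   # mean service time, minutes

w_q = l_q / lam        # mean wait in queue: L_q = lam * W_q  ->  2.0 minutes
w = w_q + aht          # mean time in system (wait + service)  ->  7.0 minutes
l = lam * w            # mean number in system: L = lam * W    ->  14.0 contacts
in_service = l - l_q   # mean number in service                ->  10.0 contacts

print(w_q, w, l, in_service)  # 2.0 7.0 14.0 10.0
```

The mean number in service (10) equals λ × AHT, the offered load in Erlangs, which is Little's Law applied to the service stage alone.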
The Erlang-C Formula
The Erlang-C formula is the M/M/c result for the probability that an arriving customer must wait (i.e., finds all c servers busy). This probability — C(c, A) — is used to compute predicted service level, average speed of answer (ASA), and occupancy. The formula requires only two inputs: offered load A (Erlangs) and number of agents c.
The probability that a customer waits more than time t is:
- P(wait > t) = C(c, A) × e^{-(c−A)μt}
The service level (probability of answer within threshold T) is:
- SL = 1 − C(c, A) × e^{−(c−A)μT}
These are the core formulas driving Interval Level Staffing Requirements calculations. See the Erlang-C article for full derivation and worked examples.
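The quantities above can be computed as in the following sketch. It obtains C(c, A) from the standard Erlang-B recursion rather than from factorials, which is a common numerically stable route and not necessarily how any particular calculator does it. The function names and the example inputs are assumptions for illustration.

```python
import math

def erlang_c(agents: int, load: float) -> float:
    """Probability that an arrival must wait, C(c, A), for an M/M/c queue."""
    if load >= agents:
        return 1.0  # unstable regime: every arrival waits
    b = 1.0  # Erlang-B blocking probability with 0 servers
    for n in range(1, agents + 1):
        b = load * b / (n + load * b)  # Erlang-B recursion
    rho = load / agents
    return b / (1.0 - rho * (1.0 - b))  # convert Erlang-B to Erlang-C

def service_level(agents: int, load: float, aht: float, threshold: float) -> float:
    """SL = 1 - C(c, A) * exp(-(c - A) * mu * T), with mu = 1 / AHT."""
    if load >= agents:
        return 0.0
    return 1.0 - erlang_c(agents, load) * math.exp(-(agents - load) * threshold / aht)

def average_speed_of_answer(agents: int, load: float, aht: float) -> float:
    """Mean wait in queue, ASA = C(c, A) * AHT / (c - A). Requires load < agents."""
    return erlang_c(agents, load) * aht / (agents - load)

# Hypothetical interval: 20 Erlangs, AHT 300 s, 20-second answer threshold.
load, aht, threshold = 20.0, 300.0, 20.0
for agents in (21, 23, 25, 27):
    sl = service_level(agents, load, aht, threshold)
    asa = average_speed_of_answer(agents, load, aht)
    print(f"c = {agents}: P(wait) = {erlang_c(agents, load):.3f}, "
          f"SL = {sl:.1%}, ASA = {asa:.1f} s")
```

Because AHT and the threshold enter only through the product μT, the service-level calculation needs AHT and T in addition to the two inputs of C(c, A) itself.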
The Erlang-A Formula
The M/M/c+M model (written in full as M/M/c/∞/∞ + M and commonly called the Erlang-A model) extends the M/M/c queue to incorporate customer abandonment (patience). Each customer has a patience time that is exponentially distributed with mean 1/θ and abandons the queue if not answered within it; equivalently, each waiting customer abandons at rate θ (theta).
The Erlang-A model is more realistic than Erlang-C for most contact center environments, where a material fraction of customers abandon when wait times are long. Because Erlang-C assumes that every customer waits until answered, it ignores the relief that abandonment gives the queue: relative to Erlang-A it underestimates service level, overestimates wait times, and therefore tends to overstate staffing requirements. In high-occupancy environments where abandonment is common, Erlang-A provides substantially more accurate predictions. See Erlang-A for formula details and Abandonment for the operational implications of patience modeling.
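One way to evaluate the M/M/c+M model numerically, without the closed-form expressions covered in the Erlang-A article, is to solve its birth-death chain directly. In state n the departure rate is min(n, c)·μ from agents finishing contacts plus max(n − c, 0)·θ from waiting customers abandoning. The sketch below is a minimal illustration under that formulation; the function name `erlang_a`, the truncation parameters, and the example rates are assumptions, and the plain-float weights are only suitable for moderate loads.

```python
def erlang_a(agents: int, lam: float, mu: float, theta: float,
             max_queue: int = 10_000, tol: float = 1e-16):
    """Steady-state metrics for M/M/c+M from its truncated birth-death chain.

    Returns (p_wait, p_abandon, mean_queue_length). Assumes theta > 0, or
    lam < agents * mu when theta is negligible, so the state weights decay.
    """
    weights = [1.0]  # unnormalized probability of n contacts in the system
    total = 1.0
    for n in range(1, agents + max_queue + 1):
        # Departures in state n: service completions plus abandonments from queue.
        death = min(n, agents) * mu + max(n - agents, 0) * theta
        w = weights[-1] * lam / death
        weights.append(w)
        total += w
        if n > agents and w < tol * total:
            break  # remaining tail is negligible
    probs = [w / total for w in weights]
    p_wait = sum(probs[agents:])  # arrival finds all agents busy
    mean_queue = sum((n - agents) * p for n, p in enumerate(probs) if n > agents)
    p_abandon = theta * mean_queue / lam  # abandonment rate / arrival rate
    return p_wait, p_abandon, mean_queue

# Hypothetical inputs (per minute): 4 arrivals/min, AHT 5 min, mean patience 2 min.
p_wait, p_abandon, mean_queue = erlang_a(agents=22, lam=4.0, mu=0.2, theta=0.5)
print(f"P(wait) = {p_wait:.3f}, P(abandon) = {p_abandon:.3f}, L_q = {mean_queue:.2f}")
```

Setting θ very close to zero (with λ < cμ) recovers the M/M/c delay probability, which gives a convenient sanity check against an Erlang-C calculator.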
Queue Discipline
Queue discipline determines the order in which waiting customers are served. Common disciplines include:
- FCFS (First Come, First Served): Standard in most inbound contact centers.
- LCFS (Last Come, First Served): Rare in customer-facing operations; common in inventory systems.
- Priority queuing: Customers are assigned priority classes; high-priority customers are served before lower-priority customers regardless of arrival order. Relevant in multi-skill and Skill-Based Routing environments.
- Processor sharing: All customers present are served simultaneously, each at a reduced rate, so no one waits in a separate queue. Approximates behavior in some asynchronous channel environments.
Priority queuing introduces trade-offs between service quality for different customer segments and overall system efficiency. An operation that provides high service level to a premium tier may do so at the cost of substantially degraded service for the standard tier — an effect that must be modeled explicitly, not assumed away.
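The difference between FCFS and priority ordering can be illustrated with a small sketch that ranks a fixed set of waiting customers. It assumes non-preemptive service and that all four customers are already waiting when an agent becomes free; the tier names, customer labels, and the `service_order` helper are hypothetical.

```python
import heapq

PRIORITY = {"premium": 0, "standard": 1}  # lower number is served first

def service_order(arrivals, discipline):
    """Return customer names in the order they would be served.

    arrivals: list of (arrival_sequence, tier, name) for customers already waiting.
    discipline: "fcfs" or "priority" (priority class first, then arrival order).
    """
    heap = []
    for seq, tier, name in arrivals:
        key = (seq,) if discipline == "fcfs" else (PRIORITY[tier], seq)
        heapq.heappush(heap, (key, name))
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

waiting = [(1, "standard", "s1"), (2, "premium", "p1"),
           (3, "standard", "s2"), (4, "premium", "p2")]

print(service_order(waiting, "fcfs"))      # ['s1', 'p1', 's2', 'p2']
print(service_order(waiting, "priority"))  # ['p1', 'p2', 's1', 's2']
```

Under the priority rule the standard-tier customer who arrived first is served third, which is the kind of displacement that has to be quantified when tiers share agents.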
Stationarity and the Interval Staffing Problem
The M/M/c model assumes a stationary (time-invariant) arrival rate. Contact centers experience highly non-stationary arrival patterns: volume varies by time of day, day of week, and season. The standard industry solution is to treat each short interval (15 or 30 minutes) as approximately stationary and apply the M/M/c (or M/M/c+M) model separately to each interval. This approximation is accurate when the arrival rate changes slowly relative to the mean service time — a condition satisfied in most contact center environments. The Interval Level Staffing Requirements article discusses the full interval staffing workflow.
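The interval-by-interval approach can be sketched as a loop that applies the M/M/c model independently to each interval and searches for the smallest agent count that meets a service-level target. The forecast volumes, the 80%-in-20-seconds target, and the function names are assumptions for illustration; the sketch ignores shrinkage, abandonment, and carry-over of queue between intervals.

```python
import math

def erlang_c(agents: int, load: float) -> float:
    """Delay probability C(c, A) via the Erlang-B recursion."""
    if load >= agents:
        return 1.0
    b = 1.0
    for n in range(1, agents + 1):
        b = load * b / (n + load * b)
    return b / (1.0 - (load / agents) * (1.0 - b))

def service_level(agents: int, load: float, aht_s: float, threshold_s: float) -> float:
    """SL = 1 - C(c, A) * exp(-(c - A) * T / AHT)."""
    if load >= agents:
        return 0.0
    return 1.0 - erlang_c(agents, load) * math.exp(-(agents - load) * threshold_s / aht_s)

def required_agents(contacts: float, interval_s: float, aht_s: float,
                    target_sl: float, threshold_s: float) -> int:
    """Smallest c meeting target_sl (must be < 1), treating the interval as stationary."""
    load = contacts / interval_s * aht_s
    agents = max(1, math.floor(load) + 1)  # start at the smallest stable agent count
    while service_level(agents, load, aht_s, threshold_s) < target_sl:
        agents += 1
    return agents

# Hypothetical forecast: contacts per 30-minute interval, AHT 300 s, target 80/20.
forecast = {"09:00": 80, "09:30": 120, "10:00": 150, "10:30": 110}
for interval, contacts in forecast.items():
    c = required_agents(contacts, 1800, 300, 0.80, 20)
    print(f"{interval}: {contacts} contacts -> {c} agents")
```

Each interval is solved in isolation here, which is exactly the stationarity approximation described above.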
Limitations and Extensions
The M/M/c framework, while powerful, rests on assumptions that may not hold in all contact center settings:
- Non-exponential service times: The M/G/c queue (general service distribution) does not have a closed-form solution; simulation or approximation methods are required. See Discrete-Event vs Monte Carlo Simulation Models.
- Correlated arrivals: If arrivals cluster (e.g., after a marketing email send), the Poisson assumption fails and queue performance is worse than M/M/c predicts.
- Multiple skills and routing: Multi-skill environments with Skill-Based Routing cannot be accurately modeled as a single M/M/c queue; network queueing models or simulation are required. See Multi-Skill Scheduling and Pooling Theory.
- Non-stationary arrivals: The interval approximation breaks down during rapid ramp-up or ramp-down periods.
The Erlang-A model extends the basic framework to handle abandonment. Discrete-Event vs Monte Carlo Simulation Models provides a treatment of simulation approaches for systems where closed-form queueing models are inadequate.
Maturity Model Considerations
At L1 maturity, staffing decisions are made without reference to queueing theory. Headcount is set by intuition, historical patterns, or simple volume-per-agent ratios that ignore service-level dynamics.
At L2, organizations use the Erlang-C formula — often via spreadsheet calculators — to translate offered load and staffing into predicted service level. The mechanics of the formula may not be well understood, but the inputs and outputs are applied correctly.
At L3 and above, planners understand the distributional assumptions underlying Erlang-C and Erlang-A, recognize when those assumptions are violated, and select simulation or more sophisticated models when the standard formulas are inadequate. See WFM Labs Maturity Model.
Related Concepts
- Erlang-C
- Erlang-A
- Service Level
- Abandonment
- Occupancy
- Average Handle Time
- Interval Level Staffing Requirements
- Power of One
- Pooling Theory
- Multi-Skill Scheduling
- Skill-Based Routing
- Discrete-Event vs Monte Carlo Simulation Models
- Probabilistic Forecasting
- Staffing to Percentile vs. Mean Forecast
- WFM Labs Maturity Model
References
- [1] Gross, D., Shortle, J. F., Thompson, J. M., & Harris, C. M. (2008). Fundamentals of Queueing Theory, 4th ed. John Wiley & Sons.
- [2] Gans, N., Koole, G., & Mandelbaum, A. (2003). Telephone call centers: Tutorial, review, and research prospects. Manufacturing & Service Operations Management, 5(2), 79–141.
- [3] Brown, L., Gans, N., Mandelbaum, A., Sakov, A., Shen, H., Zeltyn, S., & Zhao, L. (2005). Statistical analysis of a telephone call center: A queueing-science perspective. Journal of the American Statistical Association, 100(469), 36–50.
