Waiting Time Distributions
Waiting time distributions describe the full probability structure of how long callers wait before being answered — not just the mean (ASA), but the complete distribution from instant answer to extreme tail. In contact center WFM, this distribution determines service level, shapes caller experience, drives abandonment behavior, and reveals risks that mean-based metrics hide.
Overview
A contact center reports ASA of 20 seconds. This sounds acceptable. But what does the distribution look like?
- 60% of callers are answered immediately (wait = 0)
- 15% wait 1–30 seconds
- 10% wait 30–60 seconds
- 10% wait 60–120 seconds
- 5% wait over 120 seconds
The mean is 20 seconds, but 15% of callers wait over a minute and 5% wait over two minutes. The experience of the caller in the tail bears no resemblance to the mean. Service level (e.g., 80/20) captures one point on this distribution. The full distribution reveals the rest.
Understanding waiting time distributions is essential for:
- Setting meaningful SLA targets
- Predicting abandonment (which depends on how the wait distribution interacts with the patience distribution)
- Communicating service quality to stakeholders who think "average wait of 20 seconds" means everyone waits about 20 seconds
Mathematical Foundation
Probability of Waiting (Erlang C)
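The Erlang C formula gives the probability that an arriving caller must wait at all:
$$C(s, A) = \frac{\dfrac{A^s}{s!}\cdot\dfrac{s}{s - A}}{\displaystyle\sum_{k=0}^{s-1}\dfrac{A^k}{k!} + \dfrac{A^s}{s!}\cdot\dfrac{s}{s - A}}$$
where s is the number of agents and A is the offered load in Erlangs (arrival rate × AHT), with A < s so that the queue is stable. This is the probability of delay: the fraction of callers not answered immediately.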
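The factorials in the formula overflow quickly at realistic agent counts, so it is rarely evaluated literally. A minimal sketch, assuming the standard Erlang B recursion as an intermediate step; the function names `erlang_b` and `erlang_c` are illustrative and not taken from any particular WFM product:

```python
def erlang_b(s: int, a: float) -> float:
    """Erlang B blocking probability via the recursion
    B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1))."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b


def erlang_c(s: int, a: float) -> float:
    """Probability of delay C(s, A) for s agents and offered load a (Erlangs)."""
    if a <= 0:
        return 0.0
    if a >= s:
        return 1.0  # no steady state: treat every caller as delayed
    b = erlang_b(s, a)
    return s * b / (s - a * (1.0 - b))


# Single-agent sanity check: for one agent the delay probability equals occupancy.
print(erlang_c(1, 0.5))
# The 150-agent, 133.3-Erlang configuration used later in this article.
print(round(erlang_c(150, 133.3), 3))
```

The recursion keeps every intermediate value between 0 and 1, which is why it stays stable where the factorial form does not.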
Conditional Wait Time Distribution
Given that a caller must wait (i.e., all agents are busy at arrival), the waiting time follows an exponential distribution:
$$P(W > t \mid W > 0) = e^{-\mu (s - A)\, t}, \qquad t \ge 0$$
where μ = 1/AHT is the individual service rate, s is the number of agents, and A is the offered load in Erlangs.
The decay rate of the exponential is μ(s − A), which equals μs(1 − ρ), where ρ = A/s is the traffic intensity (occupancy). Equivalently, it is sμ − λ: the combined service rate of s busy agents minus the arrival rate λ = Aμ. This is the excess capacity rate, the rate at which the s agents collectively drain the queue beyond what is needed to keep up with arrivals.
Unconditional Wait Time Distribution
Combining the probability of delay with the conditional distribution:
$$P(W > t) = C(s, A)\, e^{-\mu (s - A)\, t}, \qquad t \ge 0$$
This is the survival function of the waiting time. The service level at threshold T is:
$$SL(T) = P(W \le T) = 1 - C(s, A)\, e^{-\mu (s - A)\, T}$$
This is the formula behind Erlang C-based SL projections in WFM. When such a system projects "80% answered within 20 seconds," it has computed 1 − C(s, A) · e^{−μ(s−A)·20} and found that it equals 0.80.
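The survival function and the service level translate directly into code. The sketch below assumes the Erlang C model exactly as stated (no abandonment, exponential handle times); `wait_survival` and `service_level` are hypothetical helper names:

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay, computed via the Erlang B recursion."""
    if a >= s:
        return 1.0
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return s * b / (s - a * (1.0 - b))


def wait_survival(s: int, a: float, aht: float, t: float) -> float:
    """P(W > t): probability that a caller waits longer than t seconds."""
    if a >= s:
        return 1.0  # no steady state: treat every wait as exceeding t
    decay_rate = (s - a) / aht  # mu * (s - A), per second
    return erlang_c(s, a) * math.exp(-decay_rate * t)


def service_level(s: int, a: float, aht: float, threshold: float) -> float:
    """SL(T) = 1 - P(W > T)."""
    return 1.0 - wait_survival(s, a, aht, threshold)


# Service level at a few thresholds for 150 agents, 133.3 Erlangs, AHT = 240 sec.
for t in (0, 20, 60):
    print(t, round(service_level(150, 133.3, 240.0, t), 4))
```

At T = 0 the service level reduces to 1 − C(s, A), the share of callers answered immediately.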
Mean Waiting Time (ASA)
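The unconditional mean waiting time is:
$$\mathrm{ASA} = E[W] = \frac{C(s, A)}{\mu (s - A)}$$
ASA is the first moment of the waiting time distribution. It equals the probability of waiting multiplied by the mean wait conditional on waiting:
$$E[W] = P(W > 0)\cdot E[W \mid W > 0] = C(s, A)\cdot\frac{1}{\mu (s - A)}$$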
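One way to see where this expression comes from is to integrate the survival function, using the standard identity that the mean of a non-negative random variable equals the integral of its survival function. The derivation assumes only the Erlang C survival function P(W > t) = C(s, A) · e^{−μ(s−A)t}:

$$
\begin{aligned}
E[W] &= \int_0^\infty P(W > t)\,dt \\
     &= C(s, A)\int_0^\infty e^{-\mu (s - A)\, t}\,dt \\
     &= \frac{C(s, A)}{\mu (s - A)}
\end{aligned}
$$

The integral converges only when s > A, which is the same stability condition the Erlang C formula requires.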
Percentiles
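The p-th percentile of the waiting time, t_p, is the time by which p% of callers have been answered.
For p > (1 − C(s, A)) × 100 (i.e., the percentile falls in the waiting portion of the distribution), solving 1 − C(s, A) · e^{−μ(s−A)·t} = p/100 for t gives:
$$t_p = \frac{1}{\mu (s - A)}\,\ln\!\left(\frac{C(s, A)}{1 - p/100}\right)$$
For p ≤ (1 − C(s, A)) × 100, the percentile is 0 (these callers are answered immediately).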
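The two cases can be folded into one function. This is a sketch under the Erlang C assumptions; `wait_percentile` is a hypothetical name, and p is expressed on a 0–100 scale to match the text:

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay, computed via the Erlang B recursion."""
    if a >= s:
        return 1.0
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return s * b / (s - a * (1.0 - b))


def wait_percentile(s: int, a: float, aht: float, p: float) -> float:
    """Time in seconds by which p percent of callers have been answered."""
    if not 0 <= p < 100:
        raise ValueError("p must be in [0, 100)")
    if a >= s:
        return math.inf  # unstable queue: waits grow without bound
    c = erlang_c(s, a)
    tail = 1.0 - p / 100.0  # fraction of callers still waiting at t_p
    if tail >= c:
        return 0.0  # percentile falls among the immediately answered callers
    return math.log(c / tail) * aht / (s - a)


# 95th percentile of all callers for 30 agents, 25 Erlangs, AHT = 300 sec.
print(round(wait_percentile(30, 25.0, 300.0, 95), 1))
```

Reporting a few percentiles (for example the 90th, 95th, and 99th) alongside ASA is a compact way to show the tail without plotting the whole curve.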
The Heavy Tail Problem
The conditional wait distribution is exponential — it has a memoryless tail. This means the wait distribution is not heavy-tailed in the technical sense (it decays faster than any power law). But from a practical perspective, the tail is thick enough to matter because:
- A significant fraction of callers do wait: C(s, A) can reach 30–60% at high occupancy, and in small pools even at moderate occupancy.
- The exponential decay rate μ(s − A) slows dramatically as occupancy increases.
At high occupancy, the effective tail extends far beyond the mean. The ratio of the 95th percentile wait to the mean wait is always ln(20) ≈ 3.0 for the conditional distribution — regardless of parameters. This means the longest-waiting 5% of delayed callers always wait at least 3× the average conditional wait.
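To illustrate how quickly the tail stretches as occupancy rises, the sketch below holds the pool at 150 agents with a 240-second AHT (illustrative values) and varies only the load. `conditional_tail` is a hypothetical helper:

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay, computed via the Erlang B recursion."""
    if a >= s:
        return 1.0
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return s * b / (s - a * (1.0 - b))


def conditional_tail(s: int, a: float, aht: float):
    """Return (C, mean wait of delayed callers, their 95th percentile), in seconds."""
    decay_rate = (s - a) / aht              # mu * (s - A)
    mean_delayed = 1.0 / decay_rate
    p95_delayed = math.log(20.0) / decay_rate  # solves exp(-rate * t) = 0.05
    return erlang_c(s, a), mean_delayed, p95_delayed


AGENTS, AHT = 150, 240.0  # illustrative values
for occupancy in (0.80, 0.90, 0.95, 0.98):
    c, mean_d, p95_d = conditional_tail(AGENTS, occupancy * AGENTS, AHT)
    print(f"occupancy={occupancy:.0%}  C={c:.3f}  "
          f"mean|delayed={mean_d:6.1f}s  p95|delayed={p95_d:6.1f}s")
```

The ratio between the last two columns stays at ln(20) on every row; what changes is the scale, because the decay rate μ(s − A) shrinks toward zero as the load approaches the agent count.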
WFM Applications
Worked Example: Anatomy of a Low ASA
Parameters: 150 agents, 133.3 Erlangs offered load (occupancy ≈ 89%), AHT = 240 sec (μ = 1/240).
Computed: C(150, 133.3) ≈ 0.105 (about 10.5% of callers wait), μ(s − A) = (1/240)(150 − 133.3) ≈ 0.0694/sec.
| Metric | Value | Interpretation |
|---|---|---|
| P(W = 0) | 1 − 0.105 ≈ 89.5% | Roughly nine in ten callers are answered instantly |
| ASA | 0.105 / 0.0694 ≈ 1.5 sec | Sounds negligible, but almost no caller actually waits 1.5 sec |
| P(W > 20 sec) | 0.105 × e^{−0.0694×20} = 0.105 × 0.25 ≈ 0.026 | 2.6% of callers wait > 20 sec |
| P(W > 60 sec) | 0.105 × e^{−0.0694×60} = 0.105 × 0.0155 ≈ 0.0016 | About 0.16% wait > 60 sec |
| SL (80/20) | 1 − 0.026 ≈ 97% | Well above an 80/20 target |
| Mean wait given wait > 0 | 1/0.0694 ≈ 14.4 sec | Delayed callers wait 14.4 sec on average, nearly 10× the ASA |
| 95th percentile wait (delayed) | −ln(0.05)/0.0694 ≈ 43 sec | 5% of delayed callers wait > 43 sec |
| 95th percentile wait (all) | −ln(0.05/0.105)/0.0694 ≈ 11 sec | 5% of all callers wait > 11 sec |
The distribution is a mixture rather than a bell curve around the mean: a point mass at zero plus an exponential tail. The "average" experience does not exist. Callers either get instant service or enter the tail.
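The metrics in the table can be recomputed from the three inputs (agents, offered load, AHT). A sketch assuming the Erlang C model; `wait_profile` and its dictionary keys are hypothetical names chosen for this example:

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay, computed via the Erlang B recursion."""
    if a >= s:
        return 1.0
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return s * b / (s - a * (1.0 - b))


def wait_profile(s: int, a: float, aht: float) -> dict:
    """Summary statistics of the Erlang C waiting time distribution (requires a < s)."""
    c = erlang_c(s, a)
    rate = (s - a) / aht  # mu * (s - A)

    def p_exceeds(t: float) -> float:
        return c * math.exp(-rate * t)

    # The 95th percentile over all callers is zero when at most 5% of callers wait.
    p95_all = math.log(c / 0.05) / rate if c > 0.05 else 0.0
    return {
        "p_immediate": 1.0 - c,
        "asa": c / rate,
        "p_wait_over_20": p_exceeds(20.0),
        "p_wait_over_60": p_exceeds(60.0),
        "sl_at_20": 1.0 - p_exceeds(20.0),
        "mean_wait_delayed": 1.0 / rate,
        "p95_delayed": math.log(20.0) / rate,
        "p95_all": p95_all,
    }


for name, value in wait_profile(150, 133.3, 240.0).items():
    print(f"{name:>18}: {value:.4f}")
```

Printing the whole profile side by side makes the gap between ASA and the conditional mean wait hard to miss: the two differ by exactly the factor C(s, A).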
Worked Example: Similar Occupancy, Different Tails
Two scenarios with the same AHT (300 sec) and comparable occupancy, but very different pool sizes:
Scenario A: 100 agents, 88 Erlangs (88% occupancy). C ≈ 0.146, μ(s−A) = (1/300)(100−88) = 0.040/sec. ASA ≈ 0.146/0.040 ≈ 3.6 sec.
Scenario B: 30 agents, 25 Erlangs (83% occupancy). C ≈ 0.25, μ(s−A) = (1/300)(30−25) ≈ 0.0167/sec. ASA ≈ 0.25/0.0167 ≈ 15 sec.
Scenario B has a much slower decay rate because the spare capacity (s − A = 5 agents) is small. The tail is much longer:
| Metric | Scenario A (100 agents) | Scenario B (30 agents) |
|---|---|---|
| ASA | 3.6 sec | 15 sec |
| P(W > 60 sec) | 0.146 × e^{−0.040×60} ≈ 1.3% | 0.25 × e^{−0.0167×60} ≈ 9.2% |
| P(W > 120 sec) | 0.146 × e^{−0.040×120} ≈ 0.1% | 0.25 × e^{−0.0167×120} ≈ 3.4% |
Scenario B, despite running at lower occupancy, illustrates the small pool problem: with few agents, spare capacity is low in absolute terms, the decay rate is slow, and tail waits are long. This connects to Pooling Theory: larger pools drain their queues faster.
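The pooling effect can be isolated by holding occupancy and AHT fixed and changing only the pool size. The sketch below uses an illustrative 85% occupancy and 300-second AHT; `tail_probability` is a hypothetical helper name:

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay, computed via the Erlang B recursion."""
    if a >= s:
        return 1.0
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return s * b / (s - a * (1.0 - b))


def tail_probability(s: int, a: float, aht: float, t: float) -> float:
    """P(W > t) under Erlang C (requires a < s)."""
    return erlang_c(s, a) * math.exp(-(s - a) / aht * t)


OCCUPANCY, AHT = 0.85, 300.0  # illustrative values
for agents in (10, 30, 100, 300):
    load = OCCUPANCY * agents
    print(f"agents={agents:4d}  C={erlang_c(agents, load):.3f}  "
          f"decay={(agents - load) / AHT:.4f}/s  "
          f"P(W>60s)={tail_probability(agents, load, AHT, 60.0):.4f}")
```

Both factors move in the same direction as the pool grows: fewer callers are delayed at all, and those who are delayed clear the queue faster because s − A is larger in absolute terms.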
Service Level as a Survival Function
Service level at threshold T is 1 minus the survival function evaluated at T:
$$SL(T) = 1 - P(W > T) = 1 - C(s, A)\, e^{-\mu (s - A)\, T}$$
Plotting SL(T) for various T produces the service level curve. The curve starts at 1 − C(s, A) for T = 0 and is concave: the marginal SL improvement from increasing T by one second diminishes as T grows.
| Threshold T | SL(T) for 150 agents, 133.3 Erlangs, AHT = 240 sec | Interpretation |
|---|---|---|
| 0 sec | 1 − 0.105 ≈ 89.5% | Share of callers answered immediately |
| 10 sec | 1 − 0.105 × e^{−0.694} ≈ 94.8% | The first 10 seconds add about 5 points |
| 20 sec | 1 − 0.105 × e^{−1.389} ≈ 97.4% | The next 10 seconds add about 2.6 points |
| 30 sec | 1 − 0.105 × e^{−2.083} ≈ 98.7% | The next 10 seconds add about 1.3 points |
| 60 sec | 1 − 0.105 × e^{−4.167} ≈ 99.8% | Nearly all callers answered |
The choice of T matters as much as the percentage. An 80/20 target is not inherently "better" than 80/30 — it is a different operating point with different staffing cost.
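The staffing cost of different operating points can be compared by searching for the smallest agent count that meets each target. A sketch assuming Erlang C with no shrinkage or abandonment; `agents_required` is a hypothetical helper and the load and AHT values are illustrative:

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay, computed via the Erlang B recursion."""
    if a >= s:
        return 1.0
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return s * b / (s - a * (1.0 - b))


def service_level(s: int, a: float, aht: float, threshold: float) -> float:
    """SL(T) = 1 - C(s, A) * exp(-mu * (s - A) * T); zero if the queue is unstable."""
    if a >= s:
        return 0.0
    return 1.0 - erlang_c(s, a) * math.exp(-(s - a) / aht * threshold)


def agents_required(a: float, aht: float, target_sl: float, threshold: float) -> int:
    """Smallest agent count whose projected SL(threshold) meets target_sl."""
    if not 0.0 < target_sl < 1.0:
        raise ValueError("target_sl must be strictly between 0 and 1")
    s = math.floor(a) + 1  # smallest stable agent count
    while service_level(s, a, aht, threshold) < target_sl:
        s += 1
    return s


LOAD, AHT = 133.3, 240.0  # illustrative values
for target, threshold in ((0.80, 30.0), (0.80, 20.0), (0.90, 20.0), (0.90, 10.0)):
    print(f"{target:.0%}/{threshold:.0f}s -> {agents_required(LOAD, AHT, target, threshold)} agents")
```

A linear search is enough here because service level increases with each added agent and the answer is never far above the offered load.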
Relationship to Patience and Abandonment
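The Erlang-A model extends the waiting time distribution by introducing caller patience with rate θ. In Erlang-A, a caller abandons if the wait they would otherwise face exceeds their (random, exponentially distributed) patience. The observed waiting time is therefore the smaller of the two:
$$W_{\text{obs}} = \min(V, \tau)$$
where V is the offered wait (the time the caller would have to wait if infinitely patient) and τ is the caller's patience, exponential with mean 1/θ.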
The abandonment probability depends on the overlap between the wait distribution tail and the patience distribution. When the wait distribution tail extends well beyond mean patience, abandonment is heavy. When the tail decays faster than patience, abandonment is minimal.
This interaction is precisely why Erlang-A gives different (and more realistic) results than Erlang C: Erlang C assumes infinite patience, meaning the entire wait distribution tail is experienced. Erlang-A truncates the tail by removing impatient callers.
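Erlang-A has no formula as compact as Erlang C, but its steady state can be computed numerically from the birth-death balance equations. The sketch below is one way to do that, assuming exponential patience and truncating the queue at a hypothetical `max_queue` waiting callers; `erlang_a` is an illustrative name, not a library function:

```python
def erlang_a(s: int, a: float, aht: float, mean_patience: float,
             max_queue: int = 2000):
    """Return (P(wait), P(abandon)) for an M/M/s+M queue.

    State n is the number of callers in the system. Arrivals occur at rate
    lam; departures occur at rate min(n, s)*mu (service) plus
    max(n - s, 0)*theta (abandonment from the queue).
    """
    if a <= 0:
        return 0.0, 0.0
    mu = 1.0 / aht
    theta = 1.0 / mean_patience
    lam = a * mu

    # Unnormalised stationary weights, built state by state.
    # Note: plain floats can overflow for very large offered loads.
    weights = [1.0]
    for n in range(1, s + max_queue + 1):
        departure_rate = min(n, s) * mu + max(n - s, 0) * theta
        weights.append(weights[-1] * lam / departure_rate)
    total = sum(weights)
    probs = [w / total for w in weights]

    p_wait = sum(probs[s:])  # arrival finds all agents busy
    mean_queue = sum((n - s) * p for n, p in enumerate(probs) if n > s)
    p_abandon = theta * mean_queue / lam  # abandonment rate / arrival rate
    return p_wait, p_abandon


# 30 agents, 25 Erlangs, AHT = 300 sec, with two illustrative patience levels.
for patience in (60.0, 600.0):
    p_wait, p_abandon = erlang_a(30, 25.0, 300.0, patience)
    print(f"mean patience={patience:5.0f}s  P(wait)={p_wait:.3f}  P(abandon)={p_abandon:.3f}")
```

As mean patience grows very large the delay probability approaches the Erlang C value, which matches the description of Erlang C as the infinite-patience special case.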
Conditional Wait Time: "Given I Wait, How Long?"
Callers who have already been waiting t seconds and have not been answered face an additional expected wait of 1/[μ(s − A)] seconds — the same as a newly delayed caller. This is the memoryless property of the exponential conditional wait distribution.
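Practical implication: In the Erlang C model, a caller who knows only that they are still waiting gains no information from the time already spent in queue. Their expected remaining wait stays at 1/[μ(s − A)] because the hazard rate of the conditional wait is constant, so an IVR announcement of "your expected wait is X minutes" based on this average should not count down as the caller waits. Estimates based on queue position behave differently: each call completion moves the caller forward, so a position-based estimate does decrease over time. Mixing the two views is a well-known source of frustration when wait-time announcements don't match experience.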
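The contrast between the two kinds of estimate can be sketched as follows. The first function checks the memoryless property numerically. The second uses the standard M/M/s result that, with all s agents busy and first-come-first-served order, a caller with k callers ahead waits for k + 1 service completions, each arriving at rate sμ. Both function names are hypothetical:

```python
import math


def remaining_wait_survival(s: int, a: float, aht: float,
                            elapsed: float, extra: float) -> float:
    """P(W > elapsed + extra | W > elapsed) for a delayed caller under Erlang C.

    The delay probability C(s, A) cancels in the ratio, so only the
    exponential part is needed.
    """
    rate = (s - a) / aht  # mu * (s - A)
    return math.exp(-rate * (elapsed + extra)) / math.exp(-rate * elapsed)


def expected_wait_by_position(callers_ahead: int, s: int, aht: float) -> float:
    """Expected wait in seconds given queue position, assuming all s agents
    are busy, exponential handle times, FCFS order, and no abandonment."""
    return (callers_ahead + 1) * aht / s


# Memoryless: the chance of waiting 20 more seconds does not depend on time served.
for elapsed in (0.0, 30.0, 120.0):
    print(elapsed, round(remaining_wait_survival(150, 133.3, 240.0, elapsed, 20.0), 4))

# Position-based: the estimate shrinks as the caller moves up the queue.
for ahead in (9, 4, 0):
    print(ahead, round(expected_wait_by_position(ahead, 150, 240.0), 1))
```

A wait-time announcement that tracks queue position can therefore count down honestly, while one derived from the distribution-wide average cannot.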
Common Misconceptions
1. "ASA of 20 seconds means most callers wait about 20 seconds."
The wait distribution is a mixture: a spike at zero plus an exponential tail. "About 20 seconds" is the experience of almost nobody. Most callers are answered immediately; those who wait can wait much longer than the mean suggests.
2. "80/20 is the industry standard because it provides good service."
80/20 is a convention, not a theorem. It is commonly traced to AT&T practice in the 1980s. Whether 80/20 provides "good service" depends entirely on the cost of waiting to the customer and the business. A financial advisory line might need 90/10; a low-cost utility might be fine with 70/60. The right SL target comes from the value equation, not tradition.
3. "If we meet SL for the day, service was good all day."
Daily SL averages can mask interval-level failures. A center meeting 80/20 for the day might have 95/20 in off-peak intervals and 50/20 in peaks. The tail callers during peaks had a terrible experience hidden by the daily average.
4. "Reducing ASA from 20 to 15 seconds improves SL proportionally."
ASA and SL are not linearly related. Both depend on the underlying parameters (C(s,A) and μ(s−A)) in nonlinear ways. A 25% reduction in ASA might correspond to a 5-point SL improvement or a 15-point improvement depending on where on the curve the operation sits.
5. "The exponential wait distribution means waits can be infinitely long."
Mathematically, yes — the exponential has infinite support. Practically, very long waits are vanishingly unlikely in the Erlang C model. But in the Erlang C model, those callers are assumed to wait patiently. In reality, they abandon — which is exactly why Erlang-A exists.
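Misconception 3 can be made concrete with a few lines of arithmetic. The sketch below aggregates interval results into a call-weighted daily service level; the interval figures are invented purely for illustration, and `daily_service_level` is a hypothetical helper:

```python
def daily_service_level(intervals):
    """Call-weighted service level over a list of
    (calls_offered, calls_answered_within_threshold) tuples."""
    offered = sum(calls for calls, _ in intervals)
    within = sum(answered for _, answered in intervals)
    return within / offered if offered else 0.0


def worst_interval(intervals):
    """Lowest interval-level service level among intervals with traffic."""
    return min(answered / calls for calls, answered in intervals if calls)


# Hypothetical day: two quiet intervals at 95/20 and one peak interval at 50/20.
day = [(100, 95), (100, 95), (100, 50)]
print(f"daily SL: {daily_service_level(day):.0%}")
print(f"worst interval SL: {worst_interval(day):.0%}")
```

The daily figure lands exactly on an 80% target even though one third of the day's callers experienced 50/20, which is why interval-level reporting belongs next to the daily number.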
Maturity Model Position
- Level 1 — Initial. Service measured by ASA only. No distributional thinking. "Average wait is 20 seconds" accepted without question.
- Level 2 — Foundational. Service level (80/20) tracked and targeted. Practitioners understand the difference between ASA and SL but may not connect them mathematically. SL threshold (20 sec, 30 sec) inherited from industry convention.
- Level 3 — Progressive. Wait time distribution understood conceptually. Percentile waits calculated and reported alongside SL. Conditional wait times used for wait-time announcements. SL threshold chosen based on customer experience analysis rather than convention.
- Level 4 — Advanced. Full wait distribution modeled and used for planning. Patience distributions estimated and integrated via Erlang-A. Service level reported as a survival curve, not a single point. Tail probabilities (P(wait > 120 sec)) tracked as explicit metrics. Cost-of-waiting models quantify the business impact of distributional tails.
- Level 5 — Pioneering. Real-time wait distribution estimation. Individual caller wait-time predictions based on current queue state. Dynamic SL targets that adapt based on customer segment value and current distributional position. Wait-time distributions integrated into routing optimization.
See Also
- Erlang C
- Erlang-A
- Service Level
- Average Speed of Answer (ASA)
- Traffic Intensity and Server Utilization
- Palm's Theorem and PASTA
- Queueing Theory Fundamentals
- Pooling Theory
- Probabilistic Planning
- Abandonment
References
- Gross, D. & Harris, C.M. (2008). Fundamentals of Queueing Theory, 4th ed. Wiley. Chapter 3.
- Gans, N., Koole, G. & Mandelbaum, A. (2003). "Telephone Call Centers: Tutorial, Review, and Research Prospects." Manufacturing & Service Operations Management 5(2), 79–141.
- Koole, G. (2013). Call Center Optimization. MG Books. Sections 3.4–3.5.
- Garnett, O., Mandelbaum, A. & Reiman, M.I. (2002). "Designing a Call Center with Impatient Customers." Manufacturing & Service Operations Management 4(3), 208–227.
- Whitt, W. (1999). "Predicting Queueing Delays." Management Science 45(6), 870–888.
- Brown, L. et al. (2005). "Statistical Analysis of a Telephone Call Center: A Queueing-Science Perspective." Journal of the American Statistical Association 100(469), 36–50.
