Waiting Time Distributions

Waiting time distributions describe the full probability structure of how long callers wait before being answered — not just the mean (ASA), but the complete distribution from instant answer to extreme tail. In contact center WFM, this distribution determines service level, shapes caller experience, drives abandonment behavior, and reveals risks that mean-based metrics hide.

Overview

A contact center reports ASA of 20 seconds. This sounds acceptable. But what does the distribution look like?

60% of callers are answered immediately (wait = 0)
15% wait 1–30 seconds
10% wait 30–60 seconds
10% wait 60–120 seconds
5% wait over 120 seconds

The mean is 20 seconds, but 15% of callers wait over a minute and 5% wait over two minutes. The experience of the caller in the tail bears no resemblance to the mean. Service level (e.g., 80/20) captures one point on this distribution. The full distribution reveals the rest.

Understanding waiting time distributions is essential for:

Setting meaningful SLA targets
Predicting abandonment (which depends on how the wait distribution interacts with the patience distribution)
Communicating service quality to stakeholders who think "average wait of 20 seconds" means everyone waits about 20 seconds

Mathematical Foundation

Probability of Waiting (Erlang C)

The Erlang C formula gives the probability that an arriving caller must wait at all:

P(W > 0) = C(s, A)

where s is the number of agents and A is the offered load in Erlangs (arrival rate × AHT). This is the probability of delay: the fraction of callers not answered immediately. All formulas below assume s > A, so that the queue is stable.
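A minimal Python sketch of C(s, A), assuming the standard Erlang B recursion followed by the usual conversion from B to C. The function names `erlang_b` and `erlang_c` are illustrative and not taken from any particular WFM library, and the sketch assumes an integer number of agents.

```python
def erlang_b(agents: int, load: float) -> float:
    """Erlang B blocking probability via the numerically stable recursion."""
    if agents < 0 or load < 0:
        raise ValueError("agents and load must be non-negative")
    b = 1.0  # B(0, A) = 1: with no agents every call is blocked
    for k in range(1, agents + 1):
        b = load * b / (k + load * b)
    return b


def erlang_c(agents: int, load: float) -> float:
    """Probability of delay P(W > 0) for s agents and offered load A (Erlangs)."""
    if load >= agents:
        return 1.0  # unstable queue: treated as "everyone waits" (assumption of this sketch)
    b = erlang_b(agents, load)
    rho = load / agents  # traffic intensity (occupancy)
    return b / (1.0 - rho * (1.0 - b))


if __name__ == "__main__":
    # Single-agent check: for M/M/1 the delay probability equals the occupancy.
    print(erlang_c(1, 0.5))
    # Two agents sharing one Erlang of load.
    print(erlang_c(2, 1.0))
```

The recursion avoids the large factorials and powers of the textbook formula, which matters for pools of a hundred agents or more.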

Conditional Wait Time Distribution

Given that a caller must wait (i.e., all agents are busy at arrival), the waiting time follows an exponential distribution:

P(W > t \mid W > 0) = e^{-μ (s - A) t}

where μ = 1/AHT is the individual service rate, s is the number of agents, and A is the offered load in Erlangs.

The decay rate of the exponential is μ(s − A), which equals μs(1 − ρ), where ρ = A/s is the traffic intensity (occupancy). This is the excess capacity rate: the rate at which the s agents collectively drain the queue beyond what is needed to keep up with arrivals.

Unconditional Wait Time Distribution

Combining the probability of delay with the conditional distribution:

P(W > t) = C(s, A) \cdot e^{-μ (s - A) t}

This is the survival function of the waiting time. The service level at threshold T is:

SL(T) = 1 - P(W > T) = 1 - C(s, A) \cdot e^{-μ (s - A) T}

This is the formula behind Erlang C based SL calculations in WFM. When an Erlang C based staffing tool projects "80% answered within 20 seconds," it has computed 1 − C(s, A) · e^{−μ(s−A)·20} and found it equals 0.80.
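The survival function and the service level can be sketched in a few lines of Python. This is an illustration under stated assumptions: the delay probability C is passed in as an input rather than computed, and the function names and example inputs are hypothetical.

```python
import math


def wait_survival(delay_prob: float, aht_sec: float, agents: float,
                  load: float, t_sec: float) -> float:
    """P(W > t) = C * exp(-mu * (s - A) * t), with mu = 1 / AHT."""
    if load >= agents:
        raise ValueError("requires agents > load for a stable queue")
    decay_rate = (agents - load) / aht_sec  # mu * (s - A), per second
    return delay_prob * math.exp(-decay_rate * t_sec)


def service_level(delay_prob: float, aht_sec: float, agents: float,
                  load: float, threshold_sec: float) -> float:
    """Fraction of callers answered within the threshold: 1 - P(W > T)."""
    return 1.0 - wait_survival(delay_prob, aht_sec, agents, load, threshold_sec)


if __name__ == "__main__":
    # Hypothetical inputs: 12 agents, 10 Erlangs, AHT 180 s, delay probability 0.45.
    for t in (0, 20, 60):
        print(t, round(service_level(0.45, 180.0, 12, 10.0, t), 3))
```

At T = 0 the service level is simply 1 − C, the share of callers answered immediately. Each additional second of threshold recovers a shrinking slice of the delayed callers.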

Mean Waiting Time (ASA)

The unconditional mean waiting time is:

E[W] = \frac{C(s, A)}{μ (s - A)} = ASA

ASA is the first moment of the waiting time distribution. It equals the probability of waiting multiplied by the mean wait conditional on waiting:

ASA = P(W > 0) \times E[W \mid W > 0] = C(s, A) \times \frac{1}{μ (s - A)}
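The same result follows from integrating the survival function, a standard identity for non-negative random variables. A short derivation using only the formulas already stated:

```latex
E[W] = \int_0^{\infty} P(W > t)\, dt
     = C(s, A) \int_0^{\infty} e^{-\mu (s - A) t}\, dt
     = \frac{C(s, A)}{\mu (s - A)}
```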

Percentiles

The p-th percentile of the waiting time is the time by which p% of callers are answered.

For p > (1 − C(s, A)) × 100, where the percentile falls in the waiting portion of the distribution:

W_{p} = \frac{-\ln\left(\frac{1 - p/100}{C(s, A)}\right)}{μ (s - A)}

For p ≤ (1 − C(s, A)) × 100, the percentile is 0 (these callers are answered immediately).
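The two cases can be combined in one small function. This is a sketch, assuming the delay probability and the decay rate μ(s − A) are already known; the name `wait_percentile` and the example inputs are illustrative.

```python
import math


def wait_percentile(p: float, delay_prob: float, decay_rate: float) -> float:
    """Wait (seconds) by which p% of all callers have been answered.

    delay_prob is C(s, A); decay_rate is mu * (s - A) in 1/seconds.
    """
    if not 0.0 <= p < 100.0:
        raise ValueError("p must be in [0, 100)")
    tail = 1.0 - p / 100.0  # fraction of callers still waiting at the percentile
    if tail >= delay_prob:
        return 0.0  # percentile falls among callers answered immediately
    return -math.log(tail / delay_prob) / decay_rate


if __name__ == "__main__":
    # Illustrative inputs: C = 0.44 and a decay rate of 0.0694 per second.
    for p in (50, 80, 95, 99):
        print(p, round(wait_percentile(p, 0.44, 0.0694), 1))
```

With these inputs the median is zero, because more than half of callers never wait, while the 95th percentile is roughly 31 seconds.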

The Heavy Tail Problem

The conditional wait distribution is exponential — it has a memoryless tail. This means the wait distribution is not heavy-tailed in the technical sense (it decays faster than any power law). But from a practical perspective, the tail is thick enough to matter because:

A significant fraction of callers do wait (C(s, A) can be 30–60% in moderate-occupancy systems).
The exponential decay rate μ(s − A) slows dramatically as occupancy increases.

At high occupancy, the effective tail extends far beyond the mean. The ratio of the 95th percentile wait to the mean wait is always ln(20) ≈ 3.0 for the conditional distribution — regardless of parameters. This means the longest-waiting 5% of delayed callers always wait at least 3× the average conditional wait.
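The ln(20) ratio can be checked directly from the conditional survival function. Writing t_{95} for the 95th percentile of the conditional wait (a symbol introduced here for convenience):

```latex
P(W > t_{95} \mid W > 0) = e^{-\mu (s - A)\, t_{95}} = 0.05
\;\Longrightarrow\;
t_{95} = \frac{\ln 20}{\mu (s - A)},
\qquad
\frac{t_{95}}{E[W \mid W > 0]}
  = \frac{\ln 20 / (\mu (s - A))}{1 / (\mu (s - A))}
  = \ln 20 \approx 3.0
```

The decay rate cancels, which is why the ratio does not depend on staffing, load, or AHT.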

WFM Applications

Worked Example: Anatomy of a Wait Time Distribution

Parameters: 150 agents, 133.3 Erlangs offered load, AHT = 240 sec (μ = 1/240).

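Delay probability used in this example: C = 0.44 (44% of callers wait), an illustrative value; an exact Erlang C evaluation at these parameters gives a considerably lower figure, roughly 0.1. Decay rate: μ(s − A) = (1/240)(150 − 133.3) = 0.0694/sec.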

Metric	Value	Interpretation
P(W = 0)	56%	More than half answered instantly
ASA	0.44 / 0.0694 = 6.3 sec	Low on average, but it hides the tail shown in the rows below
P(W > 20 sec)	0.44 × e^{−0.0694×20} = 0.44 × 0.25 = 0.11	11% of callers wait > 20 sec
P(W > 60 sec)	0.44 × e^{−0.0694×60} = 0.44 × 0.016 = 0.007	0.7% wait > 60 sec
SL (80/20)	1 − 0.11 = 89%	Exceeds 80/20 target
Mean wait given wait > 0	1/0.0694 = 14.4 sec	Delayed callers wait 14.4 sec on average
95th percentile wait (delayed)	−ln(0.05)/0.0694 = 43 sec	5% of delayed callers wait > 43 sec
95th percentile wait (all)	−ln(0.05/0.44)/0.0694 = 31 sec	5% of all callers wait > 31 sec

The distribution is a mixture rather than a single hump: a point mass at zero for callers answered immediately, plus an exponentially distributed delay for those who wait. The "average" experience describes almost no one: callers either get instant service or enter the tail.
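A small Monte Carlo sketch makes the two-part shape visible. It samples waits using the worked example's C = 0.44 and decay rate 0.0694/sec. The function name, the sample size, and the 2-second window around the mean are arbitrary choices for illustration.

```python
import random


def sample_waits(delay_prob: float, decay_rate: float, n: int, seed: int = 0) -> list[float]:
    """Draw n waits: zero with probability 1 - C, otherwise exponential."""
    rng = random.Random(seed)
    waits = []
    for _ in range(n):
        if rng.random() < delay_prob:
            waits.append(rng.expovariate(decay_rate))  # delayed caller
        else:
            waits.append(0.0)  # answered immediately
    return waits


waits = sample_waits(delay_prob=0.44, decay_rate=0.0694, n=100_000)
share_zero = sum(w == 0.0 for w in waits) / len(waits)
mean_wait = sum(waits) / len(waits)
share_near_mean = sum(abs(w - mean_wait) <= 2.0 for w in waits) / len(waits)

print(f"answered immediately: {share_zero:.1%}")
print(f"mean wait: {mean_wait:.1f} s")
print(f"within 2 s of the mean: {share_near_mean:.1%}")
```

Only a small minority of simulated callers land near the mean, which is the point of the paragraph above: the mean summarizes a distribution that few callers actually experience.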

Worked Example: Pool Size and the Tail

Two scenarios at similar occupancy but very different pool sizes. The delay probabilities C are illustrative values, not exact Erlang C results for these parameters:

Scenario A: 100 agents, 88 Erlangs, AHT = 300 sec. C = 0.60, μ(s−A) = (1/300)(100−88) = 0.040/sec. ASA = 0.60/0.040 = 15 sec.

Scenario B: 30 agents, 25 Erlangs, AHT = 300 sec. C = 0.55, μ(s−A) = (1/300)(30−25) = 0.0167/sec. ASA = 0.55/0.0167 = 33 sec.

Scenario B has a much slower decay rate because the spare capacity (s − A = 5 agents) is small. The tail is much longer:

Metric	Scenario A (100 agents)	Scenario B (30 agents)
ASA	15 sec	33 sec
P(W > 60 sec)	0.60 × e^{−0.040×60} = 5.4%	0.55 × e^{−0.0167×60} = 20.2%
P(W > 120 sec)	0.60 × e^{−0.040×120} = 0.5%	0.55 × e^{−0.0167×120} = 7.4%

Scenario B illustrates the small pool problem: with few agents, spare capacity is low in absolute terms, the decay rate is slow, and tail waits are extreme, even though its occupancy (83%) is lower than Scenario A's (88%). This connects to Pooling Theory: larger pools have faster queue drainage.
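One way to compare the two tails is to ask how long it takes until only 1% of callers are still waiting. The sketch below inverts the survival function for that purpose. The name `tail_horizon` and the 1% cutoff are illustrative choices; the scenario inputs are the ones quoted above.

```python
import math


def tail_horizon(delay_prob: float, decay_rate: float, tail_share: float) -> float:
    """Smallest t (seconds) with P(W > t) <= tail_share."""
    if not 0.0 < tail_share <= 1.0:
        raise ValueError("tail_share must be in (0, 1]")
    if tail_share >= delay_prob:
        return 0.0  # fewer callers wait at all than the requested tail share
    return math.log(delay_prob / tail_share) / decay_rate


# (C, decay rate per second) for each scenario.
scenarios = {
    "A (100 agents)": (0.60, (100 - 88) / 300),
    "B (30 agents)": (0.55, (30 - 25) / 300),
}
for name, (c, rate) in scenarios.items():
    print(name, round(tail_horizon(c, rate, 0.01)), "sec until only 1% are still waiting")
```

With these inputs the horizon is a little over 100 seconds for Scenario A and about 240 seconds for Scenario B, so the smaller pool's tail runs more than twice as long.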

Service Level as a Survival Function

Service level at threshold T is 1 minus the survival function evaluated at T:

SL(T) = 1 - C(s, A) \cdot e^{-μ (s - A) T}

Plotting SL(T) for various T produces the service level curve. This curve is concave — the marginal SL improvement from increasing T by one second diminishes as T grows.

SL target	Threshold T	Interpretation (reference point: 150 agents, 133.3 Erlangs, C = 0.44)
80% answered in T	T ≈ 11 sec	Aggressive: the threshold at which the reference point reaches exactly 80%, since 1 − 0.44 × e^{−0.0694 × 11.4} ≈ 0.80
80% answered in T	T = 20 sec (typical 80/20)	Standard
90% answered in T	T = 20 sec	Requires more agents (≈157, indicative)
80% answered in T	T = 30 sec (80/30)	Slightly relaxed, achievable with fewer agents (~147, indicative)

The choice of T matters as much as the percentage. An 80/20 target is not inherently "better" than 80/30 — it is a different operating point with different staffing cost.
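The staffing cost of an operating point can be explored by searching for the smallest agent count that meets a given target. The sketch below assumes the plain Erlang C model with no abandonment, shrinkage, or interval effects. The function names are illustrative, and the inputs reuse the load and AHT from the worked example.

```python
import math


def erlang_c(agents: int, load: float) -> float:
    """Delay probability via the Erlang B recursion (requires agents > load)."""
    b = 1.0
    for k in range(1, agents + 1):
        b = load * b / (k + load * b)
    rho = load / agents
    return b / (1.0 - rho * (1.0 - b))


def service_level(agents: int, load: float, aht_sec: float, threshold_sec: float) -> float:
    """SL(T) = 1 - C(s, A) * exp(-(s - A) * T / AHT)."""
    c = erlang_c(agents, load)
    return 1.0 - c * math.exp(-(agents - load) * threshold_sec / aht_sec)


def agents_for_target(load: float, aht_sec: float, threshold_sec: float,
                      target: float, max_agents: int = 10_000) -> int:
    """Smallest integer agent count whose Erlang C service level meets the target."""
    agents = math.floor(load) + 1  # first stable staffing level (s > A)
    while agents <= max_agents:
        if service_level(agents, load, aht_sec, threshold_sec) >= target:
            return agents
        agents += 1
    raise ValueError("target not reachable within max_agents")


if __name__ == "__main__":
    load, aht = 133.3, 240.0
    for target, threshold in [(0.80, 20.0), (0.80, 30.0), (0.90, 20.0)]:
        print(f"{target:.0%} within {threshold:.0f} s:",
              agents_for_target(load, aht, threshold, target), "agents")
```

Results depend on the exact Erlang C value at each staffing level, so they need not match the approximate agent counts quoted in the table above.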

Relationship to Patience and Abandonment

The Erlang-A model extends the waiting time distribution by introducing caller patience with rate θ. In Erlang-A, a caller abandons if their waiting time exceeds their (random, exponential) patience. The observed waiting time is:

W_{observed} = \min(W_{potential}, Patience)

The abandonment probability depends on the overlap between the wait distribution tail and the patience distribution. When the wait distribution tail extends well beyond mean patience, abandonment is heavy. When the tail decays faster than patience, abandonment is minimal.

This interaction is precisely why Erlang-A gives different (and more realistic) results than Erlang C: Erlang C assumes infinite patience, meaning the entire wait distribution tail is experienced. Erlang-A truncates the tail by removing impatient callers.
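The overlap between the wait tail and the patience distribution can be illustrated with a deliberately simplified sketch. It is not an Erlang-A solver: it holds the Erlang C wait distribution fixed and overlays independent exponential patience. In Erlang-A, abandonment also shortens the queue for everyone else, so this sketch tends to overstate abandonment. The function names and the 60-second mean patience are assumptions for illustration.

```python
import random


def abandonment_share(delay_prob: float, decay_rate: float, patience_rate: float) -> float:
    """P(Patience < W) when W is zero w.p. 1 - C, else exponential(decay_rate)."""
    # For independent exponentials, P(patience ends first) = theta / (theta + rate).
    return delay_prob * patience_rate / (patience_rate + decay_rate)


def simulate_abandonment(delay_prob: float, decay_rate: float, patience_rate: float,
                         n: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity via W_observed = min(W, Patience)."""
    rng = random.Random(seed)
    abandoned = 0
    for _ in range(n):
        potential = rng.expovariate(decay_rate) if rng.random() < delay_prob else 0.0
        patience = rng.expovariate(patience_rate)
        if patience < potential:
            abandoned += 1  # caller gives up before an agent becomes free
    return abandoned / n


if __name__ == "__main__":
    c, rate, theta = 0.44, 0.0694, 1 / 60  # theta: hypothetical mean patience of 60 s
    print(f"closed form: {abandonment_share(c, rate, theta):.3f}")
    print(f"simulated:   {simulate_abandonment(c, rate, theta):.3f}")
```

A slower decay rate or a larger θ both push the ratio θ / (θ + μ(s − A)) toward 1, which is the "tail extends well beyond mean patience" case described above.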

Conditional Wait Time: "Given I Wait, How Long?"

Callers who have already been waiting t seconds and have not been answered face an additional expected wait of 1/[μ(s − A)] seconds — the same as a newly delayed caller. This is the memoryless property of the exponential conditional wait distribution.
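The memoryless property follows in one line from the survival function, for any elapsed wait t ≥ 0 and additional wait u ≥ 0:

```latex
P(W > t + u \mid W > t)
  = \frac{C(s, A)\, e^{-\mu (s - A)(t + u)}}{C(s, A)\, e^{-\mu (s - A) t}}
  = e^{-\mu (s - A) u},
\qquad
E[W - t \mid W > t] = \frac{1}{\mu (s - A)}
```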

Practical implication: if an IVR announces "your expected wait is X minutes" based only on this model, the estimate does not decrease as the caller waits, because the expected remaining wait is constant. That constancy holds for a caller whose queue position is unknown. Conditional on a known queue position, the remaining wait does shrink as agents complete calls and the caller moves forward. Mismatches between wait-time announcements and experience are a well-known source of frustration.

Common Misconceptions

1. "ASA of 20 seconds means most callers wait about 20 seconds."

The wait distribution is bimodal: zero or exponential. "About 20 seconds" is the experience of almost nobody. Most callers are answered immediately; those who wait can wait much longer than the mean suggests.

2. "80/20 is the industry standard because it provides good service."

80/20 is a convention, not a theorem. It is often said to have originated with AT&T in the 1980s. Whether 80/20 provides "good service" depends entirely on the cost of waiting to the customer and the business. A financial advisory line might need 90/10; a low-cost utility might be fine with 70/60. The right SL target comes from the value equation, not tradition.

3. "If we meet SL for the day, service was good all day."

Daily SL averages can mask interval-level failures. A center meeting 80/20 for the day might have 95/20 in off-peak intervals and 50/20 in peaks. The tail callers during peaks had a terrible experience hidden by the daily average.

4. "Reducing ASA from 20 to 15 seconds improves SL proportionally."

ASA and SL are not linearly related. Both depend on the underlying parameters (C(s,A) and μ(s−A)) in nonlinear ways. A 25% reduction in ASA might correspond to a 5-point SL improvement or a 15-point improvement depending on where on the curve the operation sits.

5. "The exponential wait distribution means waits can be infinitely long."

Mathematically, yes — the exponential has infinite support. Practically, very long waits are vanishingly unlikely in the Erlang C model. But in the Erlang C model, those callers are assumed to wait patiently. In reality, they abandon — which is exactly why Erlang-A exists.
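The masking effect in misconception 3 is call-weighted arithmetic. The sketch below uses hypothetical interval volumes, chosen so that 95/20 off-peak and 50/20 at peak combine to exactly 80/20 for the day.

```python
# Hypothetical interval data: (label, calls offered, calls answered within 20 s).
intervals = [
    ("off-peak", 2000, 1900),  # 95/20
    ("peak", 1000, 500),       # 50/20
]

total_calls = sum(calls for _, calls, _ in intervals)
total_within = sum(within for _, _, within in intervals)
daily_sl = total_within / total_calls

print(f"daily SL: {daily_sl:.0%}")
for label, calls, within in intervals:
    print(f"{label}: {within / calls:.0%} of {calls} calls")

# Callers who missed the threshold, by interval.
missed = {label: calls - within for label, calls, within in intervals}
print(missed)
```

The day reports 80/20, yet five of every six callers who missed the threshold called during the peak. Reporting SL by interval keeps that concentration visible.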

Maturity Model Position

Level 1 — Initial. Service measured by ASA only. No distributional thinking. "Average wait is 20 seconds" accepted without question.
Level 2 — Foundational. Service level (80/20) tracked and targeted. Practitioners understand the difference between ASA and SL but may not connect them mathematically. SL threshold (20 sec, 30 sec) inherited from industry convention.
Level 3 — Progressive. Wait time distribution understood conceptually. Percentile waits calculated and reported alongside SL. Conditional wait times used for wait-time announcements. SL threshold chosen based on customer experience analysis rather than convention.
Level 4 — Advanced. Full wait distribution modeled and used for planning. Patience distributions estimated and integrated via Erlang-A. Service level reported as a survival curve, not a single point. Tail probabilities (P(wait > 120 sec)) tracked as explicit metrics. Cost-of-waiting models quantify the business impact of distributional tails.
Level 5 — Pioneering. Real-time wait distribution estimation. Individual caller wait-time predictions based on current queue state. Dynamic SL targets that adapt based on customer segment value and current distributional position. Wait-time distributions integrated into routing optimization.

