The Service Level Savings Fallacy
The Service Level Savings Fallacy is the widely held misconception that loosening a contact center's Service Level target — from, say, 80/20 to 80/120 — will produce proportionally large headcount savings. The reality is that Erlang C's nonlinear mathematics typically yield only a 3–8% staffing reduction for mid-size centers, while the hidden costs of degraded service — increased abandonment, higher attrition, longer handle times, and customer lifetime value erosion — frequently exceed the savings. In most realistic scenarios, this trade is NPV-negative.
This page exists because the conversation between WFM and Finance about service level targets happens in every organization, usually annually during budget planning. WFM leaders need hard numbers, not intuition, to push back on the assumption that "relaxing the SL target" is a free lunch.
Overview

The argument sounds reasonable on its surface: "We're answering 80% of calls in 20 seconds. If we relax that to 80% in 120 seconds, we need fewer agents, because we're giving ourselves more buffer." Finance models this as a proportional relationship — if we're answering faster than needed, we must have excess capacity.
The math says otherwise. Erlang C is nonlinear: each extra second of threshold buys less staffing relief than the one before it, so the relationship between threshold and required staffing shows steeply diminishing returns rather than a straight line. Doubling the threshold from 20 to 40 seconds saves a few agents. Going from 40 to 80 saves fewer. Going from 80 to 120 saves barely any at all. The curve flattens dramatically.
Meanwhile, the costs of operating at 80/120 instead of 80/20 are real, measurable, and compounding.
The Erlang Math: Worked Examples
To make this concrete, consider three centers of different sizes. All assume a mean AHT of 360 seconds (6 minutes), calls arriving via Poisson process, and the target is 80% of calls answered within threshold T.
Small Center: ~50 Agents
Offered load: 40 Erlangs (400 calls/hour at 6-minute AHT).
| SL Target | Required Agents | Difference from 80/20 |
|---|---|---|
| 80/20 | 51 | — |
| 80/30 | 49 | −2 (3.9%) |
| 80/60 | 47 | −4 (7.8%) |
| 80/90 | 46 | −5 (9.8%) |
| 80/120 | 46 | −5 (9.8%) |
At 50 agents, the difference between 80/20 and 80/120 is approximately 5 agents — roughly 10%. But notice: going from 80/60 to 80/120 saves only 1 agent. The curve has already flattened.
Mid-Size Center: ~150 Agents
Offered load: 130 Erlangs (1,300 calls/hour at 6-minute AHT).
| SL Target | Required Agents | Difference from 80/20 |
|---|---|---|
| 80/20 | 149 | — |
| 80/30 | 146 | −3 (2.0%) |
| 80/60 | 143 | −6 (4.0%) |
| 80/90 | 141 | −8 (5.4%) |
| 80/120 | 140 | −9 (6.0%) |
Nine agents out of 149. That's 6%, and the last 60 seconds of threshold relaxation (from 80/60 to 80/120) buys only 3 agents.
Large Center: ~500 Agents
Offered load: 450 Erlangs (4,500 calls/hour at 6-minute AHT).
| SL Target | Required Agents | Difference from 80/20 |
|---|---|---|
| 80/20 | 486 | — |
| 80/30 | 480 | −6 (1.2%) |
| 80/60 | 474 | −12 (2.5%) |
| 80/90 | 471 | −15 (3.1%) |
| 80/120 | 469 | −17 (3.5%) |
Seventeen agents out of 486. That's 3.5%. The pooling effect means larger centers already operate more efficiently, so the marginal savings from relaxing the threshold are smaller in percentage terms.
The pattern is clear: relaxing from 80/20 to 80/120 saves 3–10% of headcount, with the savings percentage inversely related to center size. And those are the gross savings — before accounting for the costs that relaxation creates.
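To explore this curve for your own operation, a textbook Erlang C calculation is enough. The sketch below is a minimal Python version; the function names are illustrative, and it assumes Poisson arrivals, exponential handle times, no abandonment, and no shrinkage. Because it is a bare Erlang C model, the agent counts it returns need not match the tables above exactly, which may embed additional assumptions.

```python
import math


def erlang_c_wait_probability(agents: int, load: float) -> float:
    """Probability that an arriving call has to queue (Erlang C)."""
    if agents <= load:
        return 1.0  # unstable queue: every call waits
    blocking = 1.0  # Erlang B with zero agents
    for n in range(1, agents + 1):
        # Numerically stable Erlang B recursion.
        blocking = load * blocking / (n + load * blocking)
    utilisation = load / agents
    return blocking / (1.0 - utilisation * (1.0 - blocking))


def service_level(agents: int, load: float, aht_s: float, threshold_s: float) -> float:
    """Fraction of calls answered within threshold_s seconds."""
    if agents <= load:
        return 0.0
    p_wait = erlang_c_wait_probability(agents, load)
    return 1.0 - p_wait * math.exp(-(agents - load) * threshold_s / aht_s)


def required_agents(load: float, aht_s: float, threshold_s: float,
                    target: float = 0.80) -> int:
    """Smallest agent count whose service level meets the target (target < 1)."""
    agents = math.floor(load) + 1
    while service_level(agents, load, aht_s, threshold_s) < target:
        agents += 1
    return agents


# Mid-size example: 130 Erlangs, 360-second AHT, 80% target.
for threshold in (20, 30, 60, 90, 120):
    print(f"80/{threshold}: {required_agents(130.0, 360.0, threshold)} agents")
```

Running the loop for several thresholds makes the flattening visible: each additional block of seconds removes fewer agents than the previous one.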
The Hidden Cost Cascade
The 80/120 target doesn't just mean longer waits. It creates a cascade of second- and third-order effects that erode the savings.
Cost 1: Occupancy Rise and Burnout
Removing 9 agents from a 149-agent center doesn't redistribute the work — it concentrates it. Occupancy rises from approximately 87% to 93%. The difference between 87% and 93% occupancy is the difference between agents getting ~8 minutes of recovery time per hour versus ~4 minutes. At 93%, agents have barely enough time to finish after-call work before the next call hits.
The research on this is clear. Job Demands-Resources theory and Conservation of Resources theory both predict that sustained high occupancy leads to emotional exhaustion, depersonalization, and ultimately turnover. The lag is typically 4–8 weeks. The Occupancy Trap covers this mechanism in detail.
If attrition increases by just 5 percentage points (from, say, 30% to 35% annually in a typical center), the incremental replacement cost is:
- 140 agents × 5% incremental attrition = 7 additional departures/year
- 7 departures × $15,000 replacement cost = $105,000/year
And that assumes the 5-point increase is modest. In high-emotion queues like complaints or collections, the attrition response to occupancy spikes can be 8–15 points.
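The occupancy and attrition figures in this section are simple ratios, so they are easy to re-run with your own inputs. The following is a minimal sketch; the function names are hypothetical, and occupancy is approximated as offered load divided by agents on the phones.

```python
def occupancy(load_erlangs: float, agents: int) -> float:
    """Share of logged-in time spent handling contacts."""
    return load_erlangs / agents


def idle_minutes_per_hour(occ: float) -> float:
    """Recovery time left in each hour at a given occupancy."""
    return 60.0 * (1.0 - occ)


def incremental_attrition_cost(agents: int, attrition_increase: float,
                               replacement_cost: float) -> float:
    """Annual cost of the extra departures caused by higher occupancy."""
    return agents * attrition_increase * replacement_cost


before = occupancy(130.0, 149)
after = occupancy(130.0, 140)
print(f"Occupancy: {before:.1%} -> {after:.1%}")
print(f"Idle min/hour: {idle_minutes_per_hour(before):.1f} -> "
      f"{idle_minutes_per_hour(after):.1f}")
print(f"Attrition cost: ${incremental_attrition_cost(140, 0.05, 15_000):,.0f}")
```

Swapping in the 8–15 point attrition response mentioned for high-emotion queues shows how quickly this single line item grows.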
Cost 2: Abandonment and Lost Revenue
At 80/120, the Average Speed of Answer (ASA) is significantly higher. Callers who would have been answered in 30 seconds at 80/20 now wait 90+ seconds. The patience distribution means more callers reach their tolerance limit and abandon.
For a mid-size center moving from 80/20 to 80/120:
- ASA increases from ~12 seconds to ~45 seconds
- Abandonment rises from ~2% to ~6%
- At 1,300 calls/hour, that's 52 additional abandoned calls per hour
- If 30% of abandoned callers don't call back (they switch channels, switch providers, or give up), that's 16 lost contacts per hour
- At $25 revenue per contact (a moderate B2C assumption), that's $400/hour or $832,000/year (assuming 2,080 operating hours)
Even if revenue per contact is only $10, that's $333,000/year, more than 60% of the $540,000 loaded cost of the agents you "saved."
Cost 3: AHT Creep
Agents answering calls from frustrated customers who waited 90+ seconds spend longer on each call. The customer opens with frustration ("I've been on hold forever"), the agent must de-escalate before problem-solving, and the emotional labor of repeated de-escalation is draining. Research consistently shows that AHT increases 5–15% when customers experience long waits before connection.[1]
If AHT increases by 8% (from 360 to 389 seconds), the offered load increases from 130 Erlangs to 140 Erlangs. That 10-Erlang increase requires approximately 12 additional agents to maintain the relaxed 80/120 target — more than the 9 you removed. The "savings" has created a staffing deficit.
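The load arithmetic behind this paragraph is worth making explicit, because offered load scales directly with AHT. A minimal sketch, assuming the 1,300 calls/hour mid-size example and a hypothetical helper name:

```python
def offered_load(calls_per_hour: float, aht_seconds: float) -> float:
    """Offered load in Erlangs: arrival rate times average handle time."""
    return calls_per_hour * aht_seconds / 3600.0


base_load = offered_load(1300, 360.0)
crept_load = offered_load(1300, 360.0 * 1.08)  # 8% AHT creep
print(f"{base_load:.1f} Erlangs -> {crept_load:.1f} Erlangs")
```

Feeding the higher load into whatever staffing model you use shows how many of the removed agents have to come back just to hold the relaxed target.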
Cost 4: Customer Lifetime Value Erosion
This is the slowest-acting but largest cost. Customers who experience long waits are measurably more likely to churn. The exact relationship depends on industry and customer segment, but studies consistently show a 10–30% increase in churn probability for customers who experience wait times above their expectation threshold.[2]
For a center serving 2 million customers with an average CLV of $1,200:
- If 5% of contacts (65 per hour × 2,080 hours = 135,200 contacts/year) experience waits beyond the expectation threshold that wouldn't have under 80/20
- And 3% of those customers churn who otherwise wouldn't have: 4,056 incremental churned customers
- At $1,200 CLV: $4.9 million in lifetime value erosion
Even discounting this heavily for attribution uncertainty (say 20% confidence), that's still $970,000, dwarfing the $540,000 in headcount savings.
Full P&L: The "Savings" vs the Hidden Costs
For a 150-agent center, 1,300 calls/hour, moving from 80/20 to 80/120:
| Line Item | Annual Impact |
|---|---|
| Visible Savings | |
| 9 fewer agents × $60,000 loaded cost | +$540,000 |
| Hidden Costs | |
| Incremental attrition (occupancy-driven) | −$105,000 |
| Abandoned-caller revenue loss (conservative) | −$333,000 |
| AHT creep requiring additional overtime/staffing | −$180,000 |
| CLV erosion (20% attribution confidence) | −$970,000 |
| Net Impact | −$1,048,000 |
The "savings" is a million-dollar loss. And this P&L is deliberately conservative: it uses the lower $10 revenue-per-contact figure, a 5-point attrition increase, and only 20% of the modeled CLV erosion.
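The table can be reproduced, and stress-tested with your own line items, in a few lines. This sketch simply sums the figures from the table above; the dictionary keys are illustrative labels, and the second calculation drops the CLV line to show how the result looks if Finance rejects that estimate entirely.

```python
# Annual impact in dollars, taken from the P&L table above.
line_items = {
    "headcount_savings": 540_000,
    "incremental_attrition": -105_000,
    "abandonment_revenue_loss": -333_000,
    "aht_creep": -180_000,
    "clv_erosion": -970_000,
}

net_impact = sum(line_items.values())
net_without_clv = net_impact - line_items["clv_erosion"]

print(f"Net impact: {net_impact:,}")                    # -1,048,000
print(f"Net impact excluding CLV: {net_without_clv:,}")  # -78,000
```

On these figures the trade is negative even with the CLV line removed, because the three operational costs alone ($618,000) exceed the headcount savings.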
When Loosening SL Does Work
This isn't a universal rule. There are scenarios where relaxing the service level target genuinely saves money without creating offsetting costs:
Very high-volume centers (5,000+ seats). The Erlang curve is flatter at scale, and the customer experience impact per incremental wait second is smaller when the operation can maintain consistent (if slower) service. The pooling effect means the occupancy increase from removing agents is smaller.
Low-value or low-stakes queues. If the queue handles routine informational contacts where customer retention isn't at risk (e.g., balance inquiries that are migrating to self-service anyway), the CLV erosion component disappears and the P&L can turn positive.
When combined with channel deflection. If relaxing voice SL is paired with investment in self-service that genuinely absorbs demand (not just adds a frustrating IVR layer), the reduced demand justifies lower staffing. But the savings come from reduced demand, not from a relaxed SL.
Queues with very patient callers. If empirical patience data shows callers willing to wait 3–5 minutes (some B2B or technical support queues), 80/120 doesn't generate the abandonment spike seen in impatient consumer queues.
Having This Conversation with Your CFO
The typical Finance argument: "We're paying for a Rolls-Royce level of service. We can save money by targeting a Honda level."
The WFM response, backed by data:
- Show the math. "Relaxing from 80/20 to 80/120 saves 9 agents, not 30. The Erlang function is nonlinear — here's the curve." Present the worked example for your actual operation.
- Show the cascade. "Those 9 agents represent $540K in savings. Here are the four cost categories that erode it." Walk through occupancy, abandonment, AHT creep, and CLV.
- Offer the alternative. "If you need to save $540K in the contact center, here are approaches that don't trigger the cost cascade: shrinkage reduction, schedule optimization, self-service containment improvement, or AHT reduction through process improvement." Each of these reduces demand or improves efficiency without degrading customer experience.
- Propose a test. If Finance insists, propose a controlled test: relax SL on one queue or during off-peak hours and measure abandonment, AHT, CSAT, and attrition over 8 weeks. Let the data settle the argument.
The strongest position: "I agree we should optimize our service level target. Let me show you the economic optimization framework that finds the right target for our business — it's based on our customer value, not on convention."
The Human Performance Science Behind the Cascade
The cost cascade above isn't just an operational pattern — it's predicted by decades of human performance research. Understanding why loosening SL triggers these effects makes the argument evidence-based, not anecdotal.
Why AHT Creeps: Cognitive Load and Attention Restoration
When occupancy rises, agents lose the recovery windows between contacts. Attention Restoration Theory (Kaplan, 1995) explains why this matters: directed attention — the cognitive resource agents use to listen, diagnose, and resolve — is finite and fatiguing. It requires restoration through brief periods of low-demand activity. The Microsoft Human Factors Lab EEG study (2021) demonstrated this directly: back-to-back sessions without breaks produced cumulative beta-wave stress that reset only with 10-minute recovery periods.
When recovery windows shrink from 9 minutes per hour (85% occupancy) to 3 minutes (95% occupancy), agents lose the cognitive reset that keeps them sharp. The result: slower information retrieval, more clarification questions, longer documentation — AHT creeps up not because agents are lazy, but because their directed attention is depleted. See Cognitive Load and Contact Center Work for Sweller's framework applied to agent performance.
Why Burnout Accelerates: The JD-R Model and Loss Spirals
The Job Demands-Resources Model (Demerouti, Bakker, Nachreiner & Schaufeli, 2001) provides the theoretical mechanism: job demands (call volume, emotional labor, time pressure) consume resources; job resources (autonomy, recovery time, social support) replenish them. High occupancy eliminates the most fundamental resource, time between contacts, while demands remain constant or increase (frustrated callers who waited longer). The demand-resource ratio inverts, and the exhaustion pathway activates.
Conservation of Resources Theory (Hobfoll, 1989) explains why the damage compounds: resource loss is disproportionately harmful compared to equivalent gain, and losses spiral. An agent who loses recovery time also loses the ability to emotionally regulate (deeper surface acting per Hochschild's framework), which degrades call quality, which produces negative customer interactions, which further depletes emotional resources. Each loss makes the next more likely. This is the 4–8 week lag between occupancy increase and attrition spike — the spiral takes time to reach the breaking point.
Why Break Science Matters for This Conversation
The microbreak meta-analysis (Albulescu et al., 2022; N=2,335) found statistically significant well-being gains from breaks as short as 5–10 minutes. Ultradian rhythm research (Kleitman's Basic Rest-Activity Cycle) shows that human alertness cycles in ~90-minute waves — peak performance followed by natural recovery troughs.
The critical insight: the idle time between contacts at 85% occupancy isn't waste. It's the natural recovery mechanism that sustains performance. Eliminating it by loosening SL (which raises occupancy) doesn't improve efficiency — it removes the biological recovery that makes the next contact productive. The savings on paper are the cost of destroying the recovery architecture that keeps the operation running.
The Allostatic Load Argument
For operations that sustain high occupancy for weeks or months, allostatic load theory (McEwen & Stellar, 1993) adds a physiological dimension: chronic stress produces cumulative biological wear that manifests as increased cortisol, cardiovascular strain, and immune suppression. This isn't metaphorical burnout — it's measurable physiological damage with downstream healthcare costs. An operation "saving" $600K by loosening SL may be generating $200K+ in incremental healthcare claims, absenteeism, and disability — costs that show up in a different budget line and are never traced back to the staffing decision.
Connecting the Science to the Business Case
When presenting the service level fallacy to Finance, the human science evidence transforms the argument from "trust me, agents will burn out" to:
- "Directed attention research shows that eliminating recovery time between contacts produces measurable cognitive performance degradation" — Kaplan (1995), Microsoft EEG study (2021)
- "The demand-resource imbalance created by high occupancy activates the exhaustion pathway documented in 20+ years of JD-R research" — Bakker & Demerouti (2017), meta-analyses with N>100,000
- "Resource loss spirals predict the 4–8 week attrition lag we observe after occupancy increases" — Hobfoll (1989, 2001)
- "The idle time we're eliminating is the biological recovery mechanism that keeps agents productive — it's not waste, it's maintenance" — Albulescu microbreak meta-analysis (2022)
This is the difference between a WFM leader saying "I feel like agents will burn out" and a WFM leader saying "here's the peer-reviewed evidence for exactly how and when they will."
How To Actually Reduce Workforce Cost
The service level savings fallacy persists because it answers a real question: "how do we reduce workforce cost?" The answer isn't "loosen SL" — it's to improve how tightly supply matches demand across the day. The savings are in the gaps between staffing and workload, not in the service level target itself.
The Schedule Quality Index
Most WFM organizations don't measure how well their schedules actually match demand. The Schedule Quality Index (SQI) quantifies this:
- SQI = 1 − (Σ |Staffed_i − Required_i|) / (Σ Required_i)
where i represents each interval of the day. A perfect schedule (SQI = 1.0) has exactly the right number of agents in every interval. In practice:
| SQI Range | What It Means | Typical Operations |
|---|---|---|
| 0.90–1.00 | Excellent fit — minimal overstaffing or understaffing | Best-in-class WFM with flexible shifts |
| 0.80–0.90 | Good fit — some shoulder/transition gaps | Competent WFM with standard shift catalog |
| 0.70–0.80 | Moderate fit — significant gaps at shift boundaries | Traditional fixed-shift operations |
| Below 0.70 | Poor fit — large structural overstaffing or understaffing | Rigid shifts, limited scheduling flexibility |
The gap between your current SQI and 1.0 is where the real savings live. An operation at SQI 0.75 has 25% of its staffing misaligned — some intervals overstaffed (wasted labor), some understaffed (service failures). Improving SQI from 0.75 to 0.85 recovers more labor cost than any service level change, without touching the SL target at all.
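The SQI formula above translates directly into code. The sketch below is one possible implementation; the function name and the four-interval sample data are made up for illustration, and a real calculation would use every interval of the day.

```python
from typing import Sequence


def schedule_quality_index(staffed: Sequence[float],
                           required: Sequence[float]) -> float:
    """SQI = 1 - sum(|staffed_i - required_i|) / sum(required_i)."""
    if len(staffed) != len(required):
        raise ValueError("staffed and required must cover the same intervals")
    total_required = sum(required)
    if total_required <= 0:
        raise ValueError("total requirement must be positive")
    gap = sum(abs(s - r) for s, r in zip(staffed, required))
    return 1.0 - gap / total_required


# Hypothetical four-interval day.
required = [20, 30, 40, 30]
staffed = [25, 30, 34, 33]
sqi = schedule_quality_index(staffed, required)
print(f"SQI = {sqi:.3f}")
```

Note that the absolute value treats overstaffing and understaffing symmetrically, so the same SQI can hide very different cost and service profiles; it is worth reporting the two components separately as well.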
Six Levers That Actually Save Money
These are the legitimate workforce cost levers — each improves occupancy by tightening supply-demand fit, not by running agents harder:
1. Shift Catalog Optimization
Most operations run 3–5 shift types designed decades ago. Expanding the shift catalog — adding split shifts, short shifts (4–6 hours), staggered starts at 15-minute increments — closes the gap between staffing curves and demand curves. A well-designed shift catalog can improve SQI by 5–10 points, translating to 3–7% labor cost reduction at the same service level.
2. Real-Time Automation (Level 3)
This is the Level 3 breakthrough in the WFM Labs Maturity Model™. Platforms like Intradiem and QStory continuously tune supply to demand in real time — pushing micro-learning during dips, offering VTO during sustained overstaffing, pulling training forward when conditions allow, and protecting breaks during surges. This is Variance Harvesting in production.
The impact: real-time automation typically recovers 15–30 minutes of productive time per agent per day from natural variance windows that traditional scheduling can't capture. At scale:
- 200 agents × 20 minutes recovered/day × 250 working days × ($25/hour ÷ 60) = $416,667/year
That's nearly $420K in recovered productivity — without changing the SL target, without raising occupancy, without burning anyone out. The savings come from doing productive work (coaching, training, wellness) during time that would otherwise be idle.
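The recovered-time estimate is sensitive to the minutes-per-day assumption, so it helps to run the whole 15–30 minute range rather than a single point. A minimal sketch, using the 200-agent, 250-day, $25/hour inputs from the example above and a hypothetical function name:

```python
def recovered_value(agents: int, minutes_per_day: float,
                    working_days: int, hourly_rate: float) -> float:
    """Annual dollar value of idle time converted to productive activity."""
    return agents * minutes_per_day * working_days * hourly_rate / 60.0


for minutes in (15, 20, 30):
    value = recovered_value(200, minutes, 250, 25.0)
    print(f"{minutes} min/day: ${value:,.0f}")
```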
3. Flexible Workforce Models
Adding part-time agents, split shifts, and gig/on-demand workers (see Platform and Gig Workforce Planning) improves demand matching at the edges of the day where fixed full-time shifts create the biggest overstaffing. A 20% part-time mix typically improves SQI by 3–5 points.
4. AI Containment
Deflecting 20–30% of contacts to AI agents (see AI Agent Capacity Planning) directly reduces the human workload. Unlike loosening SL — which doesn't reduce demand, just tolerates longer waits — AI containment actually removes contacts from the queue. The human pool shrinks, occupancy becomes more manageable for the remaining agents, and the SL target can be maintained or even tightened.
5. Shrinkage Reduction
The typical contact center runs 30–38% shrinkage. Reducing shrinkage by even 2 percentage points (e.g., from 35% to 33%) through better schedule adherence, reduced unplanned absence, and more efficient training delivery raises productive time from 65% to 67% of paid hours, equivalent to roughly 3% more productive agent capacity without hiring anyone. See Shrinkage for the full taxonomy and reduction strategies.
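The capacity effect of a shrinkage change can be checked directly. Measured against the productive hours you already have, the gain is slightly larger than the percentage-point change in shrinkage. A minimal sketch with a hypothetical function name:

```python
def productive_capacity_gain(shrinkage_before: float,
                             shrinkage_after: float) -> float:
    """Relative increase in productive hours from a shrinkage reduction."""
    if not (0.0 <= shrinkage_after <= shrinkage_before < 1.0):
        raise ValueError("expected 0 <= after <= before < 1")
    return (1.0 - shrinkage_after) / (1.0 - shrinkage_before) - 1.0


gain = productive_capacity_gain(0.35, 0.33)
print(f"Capacity gain: {gain:.1%}")
```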
6. Cross-Skilling and Pool Consolidation
This is one of the most powerful — and most underutilized — cost levers in workforce management. The square-root staffing law proves that larger agent pools require proportionally fewer surplus agents to meet the same service level. Two 50-agent queues each need roughly 7 surplus agents (14 total); combined into one 100-agent queue, the surplus drops to about 10. That's 4 agents of pure mathematical efficiency — roughly $240K annually — gained by changing nothing except how agents are grouped.
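The surplus figures in this example follow from the square-root rule. The sketch below mirrors the text by applying the rule to pool size with a safety factor of 1; both choices are simplifying assumptions. The rule is more commonly stated in terms of offered load, as required agents ≈ load + β√load, with β chosen to match the service target.

```python
import math


def surplus_agents(pool_size: float, beta: float = 1.0) -> float:
    """Approximate safety staffing under the square-root rule."""
    return beta * math.sqrt(pool_size)


separate = 2 * surplus_agents(50)   # two independent 50-agent queues
pooled = surplus_agents(100)        # one combined 100-agent queue
print(f"Separate: {separate:.1f}, pooled: {pooled:.1f}, "
      f"saved: {separate - pooled:.1f}")
```

Because the surplus grows with the square root of size rather than with size itself, every doubling of a pool raises the required buffer by only about 41%.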
The pooling opportunity: when two lines of business or customer segments handle similar-enough work, consolidating them into a single pool delivers immediate efficiency gains without touching the service level target, without raising occupancy, and without burning anyone out.
The feasibility assessment: pooling isn't free — it depends on three practical dimensions:
| Dimension | What To Evaluate | Pooling Feasibility |
|---|---|---|
| System access | Do agents need different CRM instances, knowledge bases, or toolsets? | If both LOBs use the same systems, pooling is straightforward. If they need separate logins or different applications, the technology cost of cross-access may offset the staffing gain. |
| Call complexity and training | How different is the work? How long does cross-training take? | Similar-complexity work (e.g., two product lines with the same support model) can cross-train in 1–2 weeks. Fundamentally different work (e.g., sales vs. technical support) may require 4–8 weeks and ongoing proficiency maintenance. |
| Licensing and compliance | Are there regulatory, contractual, or licensing constraints? | Financial services, healthcare, and insurance often have compliance-driven separation requirements. Some BPO contracts mandate dedicated teams. These are hard constraints that may prevent pooling regardless of the efficiency math. |
The cross-skilling spectrum: pooling isn't all-or-nothing. Operations can pursue it progressively:
- Full pool consolidation — merge two queues into one with all agents cross-skilled. Maximum efficiency gain. Works when work types are similar.
- Overflow pooling — maintain separate primary queues but enable cross-trained agents to take overflow from the other queue during demand spikes. Captures most of the pooling benefit during peak intervals without requiring full cross-training.
- Skill-group expansion — within a single LOB, cross-train agents across sub-skills (e.g., billing AND account changes) to reduce the number of narrow skill groups. Each consolidation improves the effective pool size. See Multi-Skill Scheduling and Cross-Training and Skill Mix Strategy.
- Time-of-day pooling — pool queues during off-peak hours (evenings, weekends) when neither has enough volume to staff independently, but keep them separate during peak hours when specialization matters. This is the lowest-risk entry point.
Worked example: Two LOBs each running 75 agents at 80/20 SL with 4-minute AHT and similar call types:
- Separate: 75 + 75 = 150 agents, combined occupancy ~83%
- Pooled: 140 agents to meet the same 80/20 SL, occupancy ~89% (the same workload carried by 140 agents instead of 150)
- Savings: 10 agents ($600K annually) with no service level degradation and a roughly 6-point occupancy increase
This is a legitimate $600K saving. Unlike the SL-loosening version, it comes from mathematical efficiency, not from making callers wait longer. The occupancy increase from 83% to about 89% should still be monitored, but it stays below the ~93% level described in the occupancy discussion above. See Pooling Theory for the full mathematical framework and Skill-Based Routing for the routing configurations that enable pooled operations.
When pooling doesn't work:
- Regulatory separation requirements (HIPAA, PCI, contractual)
- Fundamentally different knowledge domains where cross-training ROI is negative
- Technology environments that can't support shared agent access
- Client contracts in BPO that mandate dedicated teams
- Work complexity so different that blending degrades quality for both queues
The Comparison: Loosening SL vs Real Optimization
| Approach | Savings Mechanism | Typical Savings | Side Effects | Sustainability |
|---|---|---|---|---|
| Loosen SL from 80/20 to 80/120 | Remove 5–17 agents (per the worked examples) | 3–10% labor reduction | Occupancy spike, AHT creep, abandon increase, attrition, CLV erosion | ✗ Often NPV-negative |
| Shift catalog optimization | Better demand matching | 3–7% labor reduction | Improved agent satisfaction (more shift choices) | ✓ Permanent |
| Real-time automation | Recover idle time for productive activity | $1,500–$2,500 per agent/year | Improved training completion, coaching delivery, break compliance | ✓ Compounds over time |
| AI containment | Remove contacts from queue | 15–30% volume reduction | Requires AI investment; remaining contacts may be harder | ✓ Scales with AI capability |
| Shrinkage reduction | More productive hours from existing staff | 2–5% effective capacity gain | Requires adherence culture, not policing | ✓ Sustainable with right approach |
| Pool consolidation | Larger pools need fewer surplus agents (square-root law) | 5–10% headcount reduction per consolidation | Requires cross-training investment, system access, compliance review | ✓ Permanent mathematical efficiency |
The message to Finance: "You're asking the right question — how do we reduce workforce cost? But loosening SL is the most expensive way to do it. Here are five approaches that deliver bigger savings with zero service degradation. Let me model them."
Maturity Model Position
- Level 1 — Initial (Emerging Operations). Service level target inherited or arbitrary. No analysis of the economic impact of SL changes. Finance and WFM don't share a framework for discussing service costs.
- Level 2 — Foundational (Traditional WFM Excellence). 80/20 accepted as standard. WFM can explain the Erlang curve qualitatively but doesn't model the full cost cascade of SL changes. Finance sees SL as a cost lever.
- Level 3 — Progressive (Breaking the Monolith). WFM builds the full P&L of SL changes, incorporating abandonment, attrition, and AHT effects. SL targets differentiated by queue based on customer value. Finance and WFM share a cost model.
- Level 4 — Advanced (The Ecosystem Emerges). SL targets derived from economic optimization (staffing cost + customer loss cost). CLV data feeds the model. SL, abandonment, and occupancy targets set jointly using Erlang-A or simulation. The Value-Based Planning Model drives queue differentiation.
- Level 5 — Pioneering (Enterprise-Wide Intelligence). SL targets adjust dynamically based on real-time demand, staffing availability, and customer-segment value. The economic optimization runs continuously. Finance treats contact center SL as an investment decision with measurable ROI, not a cost to minimize.
See Also
- Service Level
- Service Level Target Selection
- Erlang C
- Erlang-A
- Occupancy
- The Occupancy Trap
- Average Speed of Answer (ASA)
- Abandonment
- Traffic Intensity and Server Utilization
- The True Cost of Understaffing
- Erlang Sensitivity and the Staffing Cliff
- The ASA-SL-Abandon Relationship
- Conservation of Resources Theory and Loss Spirals
- Value-Based Planning Model
References
- [1] Aksin, Z., Armony, M., and Mehrotra, V. (2007). The Modern Call Center: A Multi-Disciplinary Perspective on Operations Management Research. Production and Operations Management, 16(6), 665–688.
- [2] Dixon, M., Toman, N., and DeLisi, R. (2013). The Effortless Experience: Conquering the New Battleground for Customer Loyalty. Portfolio/Penguin.
