The Flaw of Averages

The Flaw of Averages is the systematic error that arises when a single average value is used to represent an uncertain quantity in a calculation whose output depends nonlinearly on that quantity. Plans built on average inputs are, in the memorable summary of the concept's popularizer, "wrong on average." The idea was named and developed by Stanford management scientist Sam L. Savage in his 2009 book The Flaw of Averages, which collected a wide range of business and engineering failures under a single diagnosis: replacing an uncertain number with its mean discards the very variability that drives risk and cost.[1] For workforce management — where almost every important relationship between demand and resource is nonlinear — the flaw is one of the most consequential and least recognized sources of planning error.
The principle
Savage illustrates the idea with the parable of the statistician who drowns crossing a river that is, on average, three feet deep. The average is a true fact about the river and a fatal basis for the plan. More formally, the flaw appears whenever an outcome is a nonlinear function of an uncertain input: the outcome computed from the average input is not equal to the average of the outcomes computed across the full range of inputs. Using one number — the average — in place of the distribution produces an answer that is not merely imprecise but systematically biased in a predictable direction.
A central source of the bias is that real systems contain nonlinearities and constraints: capacity limits, queueing effects, thresholds, and minimums. When work is fed through these, good and bad scenarios do not cancel. A project made of several uncertain tasks running in parallel finishes, on average, later than the plan built on each task's average duration, because the project waits for the slowest path, not the average one. The flaw is therefore not a forecasting error in the inputs (the average input may be exactly right) but an error in how a single average is propagated through a nonlinear system.
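The project example can be checked with a small simulation. The sketch below is illustrative only: the five parallel tasks, the uniform 5 to 15 day durations, and the function name simulate_project are assumptions made for the example, not figures from Savage.

```python
import random

def simulate_project(n_tasks=5, low=5.0, high=15.0, trials=10_000, seed=1):
    """Compare the average-based plan with the simulated average finish time.

    Hypothetical setup: n_tasks run in parallel, each taking a uniformly
    distributed number of days between low and high; the project finishes
    when the slowest task does.
    """
    rng = random.Random(seed)
    mean_task = (low + high) / 2           # average duration of one task
    plan = max([mean_task] * n_tasks)      # plan built from average inputs
    finishes = [
        max(rng.uniform(low, high) for _ in range(n_tasks))
        for _ in range(trials)
    ]
    return plan, sum(finishes) / trials    # (plan, average simulated finish)

plan_days, expected_days = simulate_project()
print(f"plan from averages: {plan_days:.1f} days")
print(f"simulated average:  {expected_days:.1f} days")
```

Every task averages 10 days, so the average-based plan says 10 days. Under these assumptions the expected finish is 5 + 10 × 5/6, about 13.3 days, and the simulated average lands close to that: every input average is exactly right, and the plan is still late on average.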
Jensen's inequality
The mathematical core of the flaw is Jensen's inequality, proved by Johan Jensen in 1906: for a convex function f and a random input X, the expected value of the function is greater than or equal to the function of the expected value, E[f(X)] ≥ f(E[X]); for a concave function the inequality reverses.[2] The size of the gap depends on how nonlinear the function is and how much the input varies — which is why the flaw bites hardest exactly where uncertainty is greatest. The diagram shows the convex case: the chord connecting two demand scenarios lies above the curve, so the average of the two outcomes sits above the outcome at the average input.
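A two-scenario calculation makes the chord argument concrete. In this sketch the convex function is the mean number of customers in an M/M/1 queue, load / (1 - load), chosen only because it is simple and convex; the two load scenarios are hypothetical.

```python
def congestion(load):
    """Illustrative convex outcome: mean number in an M/M/1 system."""
    return load / (1.0 - load)

scenarios = [0.60, 0.90]                  # hypothetical low-load and high-load days
mean_load = sum(scenarios) / len(scenarios)

outcome_at_mean = congestion(mean_load)   # f(E[X]): what the average-based plan sees
mean_outcome = sum(congestion(x) for x in scenarios) / len(scenarios)  # E[f(X)]
jensen_gap = mean_outcome - outcome_at_mean

print(f"f(E[X]) = {outcome_at_mean:.2f}")
print(f"E[f(X)] = {mean_outcome:.2f}")
print(f"gap     = {jensen_gap:.2f}")
```

At the average load of 0.75 the function gives 3.0, while the two scenarios give 1.5 and 9.0, which average to 5.25. The gap of 2.25 is the Jensen gap, and it widens if the scenarios are spread further apart or pushed closer to saturation.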
Why it matters in workforce management
WFM is unusually exposed because its governing relationships are nonlinear and its inputs are genuinely uncertain:
- Staffing for the average interval. Required staffing is a nonlinear function of volume through Erlang, and the service a fixed staff delivers degrades far faster when volume runs above plan than it improves when volume runs below. Staffing each interval for the average day therefore falls short of what the actual, variable days require. This WFM-specific form of the flaw is sometimes called the "Erlang law of averages," and it is the conceptual parent of staffing to a percentile rather than the mean.
- Service level and the staffing cliff. Service level is a steep, S-shaped function of staffing, so averaging across intervals hides the intervals that fall off the staffing cliff. A plan that hits target "on average" can miss badly in the intervals that matter.
- Occupancy and abandonment. Both respond nonlinearly to load; planning on average load misstates both the cost and the customer experience.
- Project and ramp timelines. Hiring classes, training pipelines, and implementation projects made of several uncertain, overlapping activities tend to finish later than the average-based plan, because completion waits on the longest path.
- Aggregation. Rolling volatile intervals, sites, or skills into a single average smooths away the variation that drives staffing and risk — the same averaging trap that produces Simpson's paradox when a confounder is hidden.
In each case the average is not wrong as a description; it is wrong as a plan, because the operation does not run on average days.
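The staffing and service-level items above can be illustrated with a standard Erlang C calculation. This is a minimal sketch under stated assumptions: the three equally likely offered loads, the 300-second handle time, and the 80/20 target are hypothetical, and the function names are invented for the example.

```python
import math

def erlang_c(agents, load):
    """Probability that a contact waits, for offered load in erlangs."""
    if agents <= load:
        return 1.0                        # unstable queue: everyone waits
    b = 1.0                               # Erlang B via the standard recursion
    for n in range(1, agents + 1):
        b = load * b / (n + load * b)
    return b / (1.0 - (load / agents) * (1.0 - b))

def service_level(agents, load, aht=300.0, threshold=20.0):
    """Fraction answered within `threshold` seconds under Erlang C."""
    if agents <= load:
        return 0.0                        # backlog grows without bound
    wait_prob = erlang_c(agents, load)
    return 1.0 - wait_prob * math.exp(-(agents - load) * threshold / aht)

def agents_for(load, target=0.80):
    """Smallest headcount that meets the target service level."""
    agents = int(load) + 1
    while service_level(agents, load) < target:
        agents += 1
    return agents

days = [80.0, 100.0, 120.0]               # hypothetical offered loads, equally likely
mean_load = sum(days) / len(days)
staff = agents_for(mean_load)             # plan built on the average day

sl_at_mean = service_level(staff, mean_load)
sl_by_day = [service_level(staff, d) for d in days]
mean_sl = sum(sl_by_day) / len(days)

print(f"staff for the average day: {staff}")
print(f"service level on the average day: {sl_at_mean:.1%}")
print(f"average service level across days: {mean_sl:.1%}")
```

The plan meets target on the average day by construction. On the quiet day the service level can rise at most to 100 percent, while on the busy day the same headcount is below the offered load and service collapses, so the average across days falls well short of the figure the average-day plan promised.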
Avoiding the flaw
The remedy is to carry uncertainty through the calculation instead of collapsing it to a mean at the start, the discipline Savage and his co-authors termed "probability management."[3]
- Model distributions, not points. Use probabilistic forecasts and Monte Carlo simulation so the full range of demand passes through the staffing math.
- Staff to a percentile, not the mean. Percentile staffing explicitly targets a service-level confidence rather than the average case.
- Report plan risk, not a single number. The WFM Labs Risk Score™ rates how fragile a plan is across plausible scenarios rather than certifying it against the average.
- Harvest the variance. Variance Harvesting treats expected variation as a planned input, the operational counterpart of refusing to plan on the mean.
- Beware the average in dashboards. Pair averages with ranges and distributions, a habit shared with statistical thinking and a defense against the overconfidence that single numbers encourage.
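The first two remedies can be sketched together: simulate the demand distribution, pass every scenario through the staffing math, and read off a percentile of the result. The code below is an illustration, not a production method. It assumes a normally distributed offered load and uses a square-root safety staffing rule as a stand-in for a full Erlang calculation; the beta value, the forecast parameters, and the function names are all hypothetical.

```python
import math
import random
import statistics

def required_staff(load, beta=0.85):
    """Stand-in staffing rule (square-root safety staffing); an assumption."""
    return math.ceil(load + beta * math.sqrt(load))

rng = random.Random(42)
# Hypothetical forecast: offered load ~ Normal(100, 15) erlangs, floored at 1.
loads = [max(1.0, rng.gauss(100.0, 15.0)) for _ in range(10_000)]

# Carry the whole distribution through the staffing math.
staff_needed = [required_staff(a) for a in loads]

staff_at_mean = required_staff(statistics.fmean(loads))          # point plan
staff_at_p90 = math.ceil(statistics.quantiles(staff_needed, n=10)[-1])  # 90th percentile

def coverage(staff):
    """Share of simulated scenarios whose requirement the plan meets."""
    return sum(need <= staff for need in staff_needed) / len(staff_needed)

print(f"staff at mean forecast: {staff_at_mean} (covers {coverage(staff_at_mean):.0%})")
print(f"staff at 90th percentile: {staff_at_p90} (covers {coverage(staff_at_p90):.0%})")
```

Under these assumptions the mean-based plan covers only about half of the simulated scenarios, while the percentile plan states its confidence level explicitly. Reporting both numbers with their coverage is one simple way to show plan risk instead of a single figure.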
Maturity Model Position
In the WFM Labs Maturity Model™, whether planning carries uncertainty or collapses it to an average is a defining maturity tell.
- Level 1–2 (Emerging / Foundational) — plans are built on single average inputs (average volume, average AHT, average shrinkage); the resulting systematic bias is absorbed as recurring "misses."
- Level 3 (Progressive) — planning works with ranges and percentiles, staffs to confidence rather than the mean, and recognizes where nonlinearity makes the average misleading.
- Level 4–5 (Advanced / Pioneering) — distributions are carried end to end through simulation and probability management, plan risk is rated explicitly, and automated planning never reduces an uncertain quantity to its mean before it has passed through the model.
See also
- Erlang Law of Averages
- Staffing to Percentile vs Mean Forecast
- Probabilistic Forecasting
- Deterministic vs Probabilistic Models
- Erlang Sensitivity and the Staffing Cliff
- Fractional Agents and Staffing Interpolation
- Variance Harvesting
- WFM Labs Risk Score™
- Statistical Thinking in WFM
- Cognitive Biases in WFM Decisions
References
- [1] Savage, S. L. (2009). The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty. Wiley. ISBN 978-0-471-38197-6.
- [2] Jensen, J. L. W. V. (1906). "Sur les fonctions convexes et les inégalités entre les valeurs moyennes". Acta Mathematica, 30, 175–193.
- [3] Savage, S. L., Scholtes, S., & Zweidler, D. (2006). "Probability Management". OR/MS Today, 33(1), 20–28.
