Potential Outcomes Framework

The Potential Outcomes Framework (also called the Neyman–Rubin causal model) is one of the two dominant languages for causal inference, alongside the structural-diagram approach. It defines a causal effect as the comparison between the outcome a unit would have under treatment and the outcome the same unit would have without it. For workforce management, the framework is the rigorous foundation under every claim that an intervention "caused" a result — a coaching program, a schedule change, an automation rollout — and it makes precise exactly what cannot be observed and must instead be estimated.[1]

Potential outcomes and the fundamental problem

For each unit i and a binary treatment, there are two potential outcomes: Y_i(1), the outcome if treated, and Y_i(0), the outcome if not. The individual causal effect is their difference, Y_i(1) − Y_i(0). The difficulty, known as the fundamental problem of causal inference, is that only one of the two is ever observed for any unit: an agent either received the coaching or did not, so the other outcome is missing and counterfactual.[2] Causal inference is therefore, formally, a missing-data problem.

| Agent | Y(1) coached | Y(0) not coached | Effect |
|-------|--------------|------------------|--------|
| A     | 88           | ?                | ?      |
| B     | ?            | 71               | ?      |
| C     | 90           | ?                | ?      |

The "?" cells are the counterfactuals. Because they cannot be filled in for any individual, causal questions are answered at the level of averages across groups, where the average of the missing cells can be estimated under explicit assumptions.[3]
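The missing-data view can be made concrete with a small Python sketch. It encodes the three agents from the table, with `None` standing in for each "?" cell. The dictionary layout and the helper names `observed_outcome` and `individual_effect` are illustrative assumptions, not part of any library.

```python
from typing import Optional

# Each agent has two potential outcomes, but only one is ever observed.
# None marks a counterfactual "?" cell from the table.
agents = {
    "A": {"treated": True,  "y1": 88,   "y0": None},
    "B": {"treated": False, "y1": None, "y0": 71},
    "C": {"treated": True,  "y1": 90,   "y0": None},
}

def observed_outcome(row: dict) -> int:
    """Return the one potential outcome that was actually realized."""
    return row["y1"] if row["treated"] else row["y0"]

def individual_effect(row: dict) -> Optional[int]:
    """Return Y(1) - Y(0), or None when either potential outcome is missing."""
    if row["y1"] is None or row["y0"] is None:
        return None
    return row["y1"] - row["y0"]

for name, row in agents.items():
    print(name, observed_outcome(row), individual_effect(row))
```

Every agent yields an observed score but no individual effect, which is the fundamental problem restated as code.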

Estimands: ATE and ATT

Since individual effects are unobservable, the framework targets average effects. The Average Treatment Effect (ATE) is the mean of Y(1) − Y(0) across the whole population. The Average Treatment Effect on the Treated (ATT) is that mean restricted to the units that actually received the treatment — often the more relevant quantity in WFM, where the question is "did the program help the agents it was applied to?" rather than "would it help everyone?" Distinguishing the estimand matters because a method can identify one and not the other.
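The difference between the two estimands is easiest to see on invented data where both potential outcomes are written down, something that is only possible in a simulation. The sketch below assumes six hypothetical agents, with coaching given to the three who gain the most from it. The scores are made up for illustration.

```python
# Hypothetical quality scores per agent:
# (y0 = score if not coached, y1 = score if coached, coached?)
# Both potential outcomes are visible only because this is invented data.
agents = [
    (70, 80, True),
    (72, 80, True),
    (75, 81, True),
    (80, 82, False),
    (82, 84, False),
    (85, 87, False),
]

def mean(xs):
    return sum(xs) / len(xs)

# ATE: average of Y(1) - Y(0) over every agent.
ate = mean([y1 - y0 for y0, y1, _ in agents])

# ATT: the same average, restricted to agents who were actually coached.
att = mean([y1 - y0 for y0, y1, treated in agents if treated])

print(ate, att)  # 5.0 8.0
```

Here the program helped the agents it was applied to by 8 points on average, while rolling it out to everyone would be expected to yield 5, so reporting one number as if it were the other would mislead.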

Identifying assumptions

Estimating these averages from data requires assumptions that make the treated and untreated groups comparable:

  • Ignorability (unconfoundedness). Given the measured covariates, treatment assignment is independent of the potential outcomes: there is no unmeasured confounding. Randomization guarantees this by design; observational WFM data does not, which is why quasi-experimental methods are needed.
  • Overlap (positivity). Within every combination of measured covariates, a unit has a probability of treatment strictly between 0 and 1, so comparable cases exist on both sides.
  • SUTVA (Stable Unit Treatment Value Assumption). Two conditions: one unit's treatment does not affect another's outcome (no interference), and there is a single, consistent version of the treatment. SUTVA is routinely strained in contact centers: coaching one agent can spill over to teammates, and a pooled queue links agents' outcomes. Both are forms of interference.
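A minimal sketch of how the first two assumptions are used in practice, assuming tenure is the only confounder and that SUTVA holds. The records are invented: coaching is given mostly to new agents, who score lower regardless. The function names `naive_difference` and `stratified_ate` are hypothetical helpers, and stratification is only one of several possible adjustment methods.

```python
from collections import defaultdict

# Hypothetical observed data: (tenure stratum, coached?, quality score)
records = [
    ("new", True, 70), ("new", True, 72), ("new", True, 74), ("new", False, 66),
    ("senior", True, 90), ("senior", False, 83), ("senior", False, 84), ("senior", False, 85),
]

def mean(xs):
    return sum(xs) / len(xs)

def naive_difference(rows):
    """Raw treated-minus-untreated gap, ignoring the confounder."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)

def stratified_ate(rows):
    """Compare within each stratum, then weight by stratum size."""
    strata = defaultdict(lambda: {True: [], False: []})
    for stratum, t, y in rows:
        strata[stratum][t].append(y)
    estimate = 0.0
    for name, groups in strata.items():
        # Overlap check: each stratum needs both coached and uncoached agents.
        if not groups[True] or not groups[False]:
            raise ValueError(f"overlap violated in stratum {name!r}")
        weight = (len(groups[True]) + len(groups[False])) / len(rows)
        estimate += weight * (mean(groups[True]) - mean(groups[False]))
    return estimate

print(naive_difference(records))  # -3.0
print(stratified_ate(records))    # 6.0
```

The raw comparison suggests coaching lowers scores by 3 points, while the within-tenure comparison gives a gain of 6. The adjusted figure is only as credible as the assumption that tenure captures all confounding.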

Relationship to the other tools

The potential-outcomes and structural-diagram frameworks are complementary, not competing: a DAG encodes the assumptions, and the potential-outcomes notation defines the quantity being estimated. The ignorability assumption corresponds to the diagram's backdoor criterion being satisfied by the measured covariates, and the individual effect Y(1) − Y(0) is exactly the counterfactual contrast. The estimation methods built on this framework — difference-in-differences, instrumental variables, regression discontinuity, and propensity-score matching — are all strategies for filling in the missing potential outcomes credibly.
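As one illustration of "filling in" a missing potential outcome, difference-in-differences imputes what the treated group's average would have been without treatment by assuming it would have moved in parallel with the control group. The sketch below uses invented handle-time means for a hypothetical schedule change; the function name `did_estimate` and all numbers are assumptions for illustration.

```python
# Hypothetical mean handle time in seconds, before and after a schedule change.
treated_pre, treated_post = 420.0, 390.0
control_pre, control_post = 410.0, 400.0

def did_estimate(t_pre, t_post, c_pre, c_post):
    """Return (imputed untreated post-period mean for the treated group, ATT estimate).

    Relies on the parallel-trends assumption: without treatment, the treated
    group's mean would have changed by the same amount as the control group's.
    """
    imputed_y0 = t_pre + (c_post - c_pre)  # the missing average Y(0) for the treated
    return imputed_y0, t_post - imputed_y0

imputed_y0, att_hat = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(imputed_y0, att_hat)  # 410.0 -20.0
```

The estimate targets the ATT, since the imputed quantity is the untreated outcome of the treated group only.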

Maturity Model Position

In the WFM Labs Maturity Model™, thinking in potential outcomes is the foundation of disciplined program evaluation.

  • Level 1–2 (Emerging / Foundational) — "effect" means a raw before-and-after or treated-vs-untreated difference, with no attention to comparability or the missing counterfactual.
  • Level 3 (Progressive) — evaluations name the estimand (ATE vs ATT), check overlap, and reason about confounding and SUTVA before claiming an effect.
  • Level 4–5 (Advanced / Pioneering) — potential-outcomes estimands are standard in how interventions are designed and measured, and the assumptions are stated and stress-tested rather than assumed.

References

  1. Rubin, D. B. (1974). "Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies". Journal of Educational Psychology, 66(5), 688–701. doi:10.1037/h0037350.
  2. Holland, P. W. (1986). "Statistics and Causal Inference". Journal of the American Statistical Association, 81(396), 945–960.
  3. Imbens, G. W., & Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press. ISBN 978-0-521-88588-1.