Selection and Collider Bias in WFM

Selection and Collider Bias in WFM is the distortion that arises when the population analyzed is not the population the conclusion is meant to apply to — because units entered (or were kept in) the sample through a process related to the variables under study. Its structural cause is conditioning on a collider: when an analysis is restricted to cases sharing a common effect of two variables, a spurious association between those variables appears inside the restricted group. In workforce management, where data is constantly filtered to "agents who stayed," "contacts that reached a human," or "the top performers," selection bias is pervasive and frequently mistaken for a real finding.^[1]

The structural cause

Two variables that are unrelated in the full population can become correlated once the sample is restricted to a shared consequence of both. Formally, conditioning on a collider — or on a descendant of a collider — opens a non-causal path between its causes.^[2] Selection bias is this mechanism in disguise: the act of selecting the sample is the act of conditioning on the collider. This is why it cannot be fixed by collecting more of the same data — the bias is built into how the data was chosen, not how much of it there is.
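The mechanism can be seen in a small simulation. The sketch below is illustrative only: the two causes are drawn as independent standard normal variables, and the selection rule (keep a unit when the sum of its two causes exceeds a threshold) is a hypothetical stand-in for any filter such as "was promoted" or "stayed twelve months". None of these names or values come from the cited sources.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_collider(n=20000, threshold=1.0, seed=0):
    """Draw two independent causes, then keep only units whose sum passes a cutoff."""
    rng = random.Random(seed)
    cause_a = [rng.gauss(0, 1) for _ in range(n)]
    cause_b = [rng.gauss(0, 1) for _ in range(n)]

    # Hypothetical selection rule: the collider is "a + b > threshold".
    kept = [(a, b) for a, b in zip(cause_a, cause_b) if a + b > threshold]

    r_full = pearson(cause_a, cause_b)
    r_selected = pearson([a for a, _ in kept], [b for _, b in kept])
    return r_full, r_selected, len(kept)


r_full, r_selected, n_selected = simulate_collider()
print(f"full population:  r = {r_full:+.3f} (n = 20000)")
print(f"selected sample:  r = {r_selected:+.3f} (n = {n_selected})")
```

In the full population the correlation is close to zero, while inside the selected group it is clearly negative. Raising `n` tightens both estimates but does not move the selected-sample correlation toward zero, which is the sense in which more of the same data does not help.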

Forms in workforce management

Survivorship. Analyzing CSAT or performance drivers using only agents who reached twelve months of tenure means conditioning on retention, which is itself an outcome of the very factors under study. Conclusions about "what makes agents successful" drawn from survivors can be backwards, because the agents who would have contradicted the pattern have already left.
Outcome-defined subgroups. Studying whether training works using only agents who were promoted means conditioning on a collider, because promotion is caused by both training and aptitude. Within the promoted group, agents who skipped training must have had enough aptitude to be promoted anyway, so training can appear useless or harmful there even when it helps everyone.
Deflection and reachability. Comparing channels or vendors using only contacts that reached an agent excludes everything resolved or abandoned in self-service, distorting the comparison whenever deflection depends on the same factors as the outcome.
Voluntary response. Drawing engagement or satisfaction conclusions only from agents who chose to answer a survey means conditioning on the decision to respond, which correlates with the attitudes being measured.
Range restriction. Validating a hiring assessment only on candidates who were hired (and stayed) understates its true predictive value, because the low-scoring range was filtered out before any performance data could be collected.
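The range-restriction case lends itself to a quick numerical check. The following is a minimal sketch under assumed values: a hypothetical assessment whose score correlates about 0.5 with later performance across all applicants, and a hypothetical hiring rule that keeps only the top 20% of scores. The function and parameter names are invented for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_range_restriction(n=20000, validity=0.5, hire_fraction=0.2, seed=1):
    """Compare assessment validity among all applicants vs. hired applicants only."""
    rng = random.Random(seed)
    scores = [rng.gauss(0, 1) for _ in range(n)]

    # Performance = validity * score + independent noise, scaled to unit variance.
    noise_sd = (1 - validity ** 2) ** 0.5
    performance = [validity * s + rng.gauss(0, noise_sd) for s in scores]

    # Hypothetical hiring rule: only the top hire_fraction of scores are hired.
    cutoff = sorted(scores)[int(n * (1 - hire_fraction))]
    hired = [(s, p) for s, p in zip(scores, performance) if s >= cutoff]

    r_all = pearson(scores, performance)
    r_hired = pearson([s for s, _ in hired], [p for _, p in hired])
    return r_all, r_hired


r_all, r_hired = simulate_range_restriction()
print(f"all applicants: r = {r_all:.3f}")
print(f"hired only:     r = {r_hired:.3f}")
```

The correlation among hired candidates comes out well below the correlation across all applicants, even though the assessment works identically for everyone. In real data only the hired-only figure is observable, which is why a validation study run on incumbents can make a useful assessment look weak.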

The survivorship classic

The canonical illustration comes from the statistician Abraham Wald, who during the Second World War was asked where to add armor on bombers based on the damage pattern of returning aircraft. The data showed hits concentrated on the wings and fuselage and almost none on the engines. The intuitive answer — armor where the bullet holes are — is exactly wrong: the planes hit in the engines did not return to be measured. The sample of survivors had been selected on the outcome (returning), and the absence of engine damage among survivors was the signal that engine hits were fatal.^[3] Every WFM analysis run on "the agents who stayed" risks the same inversion.

Detection and avoidance

Define the target population before selecting the sample. State who the conclusion should apply to, then check whether the analyzed data represents them or a survivor subset.
Do not condition on anything caused by the variables under study. Restricting, segmenting, or controlling for a post-treatment outcome (promotion, retention, reaching an agent) is the most common way selection bias enters.
Bring the excluded cases back in. Where possible, analyze the full cohort — including leavers, deflected contacts, and non-respondents — rather than the survivors.
Weight or model the selection. When exclusion is unavoidable, inverse-probability-of-selection weighting can recover the target estimate, provided that selection depends only on measured variables and that every kind of unit has some nonzero chance of being selected; at minimum, reason explicitly about the direction of the likely bias.
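The weighting idea in the last item can be sketched as follows. Everything in this example is assumed for illustration: a hypothetical survey in which long-tenure agents both score higher and respond more often, with response probabilities that are known by construction. In practice those probabilities would have to be estimated from variables observed for respondents and non-respondents alike, and the correction is only as good as that model.

```python
import random


def simulate_survey(n=20000, seed=2):
    """Return (score, p_respond, responded) for each agent in a hypothetical survey."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        long_tenure = rng.random() < 0.5
        # Assumed: long-tenure agents are more satisfied and more likely to respond.
        score = rng.gauss(8.0 if long_tenure else 5.0, 1.0)
        p_respond = 0.8 if long_tenure else 0.2
        responded = rng.random() < p_respond
        rows.append((score, p_respond, responded))
    return rows


def naive_mean(rows):
    """Average score among respondents only."""
    scores = [score for score, _, responded in rows if responded]
    return sum(scores) / len(scores)


def ipw_mean(rows):
    """Respondent average with each respondent weighted by 1 / P(respond)."""
    weighted_sum = sum(score / p for score, p, responded in rows if responded)
    weight_total = sum(1 / p for _, p, responded in rows if responded)
    return weighted_sum / weight_total


rows = simulate_survey()
true_mean = sum(score for score, _, _ in rows) / len(rows)
naive = naive_mean(rows)
weighted = ipw_mean(rows)
print(f"true mean (all agents):   {true_mean:.2f}")
print(f"naive mean (respondents): {naive:.2f}")
print(f"weighted mean (IPW):      {weighted:.2f}")
```

The naive respondent average overstates satisfaction because the high-scoring group is over-represented among respondents; the weighted average lands close to the full-population value. The weights are normalized by their sum, a common choice that keeps the estimate on the original score scale.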

The discipline reduces to one habit: be suspicious of any finding drawn from a group that was defined by an outcome. It is the operational sibling of regression to the mean and a direct application of the collider rule.

Maturity Model Position

In the WFM Labs Maturity Model™, awareness of selection bias separates analytics that generalize from analytics that quietly describe only the survivors.

Level 1–2 (Emerging / Foundational) — analyses routinely run on convenient subsets (current agents, answered surveys, handled contacts) and the findings are generalized without question.
Level 3 (Progressive) — analysts define the target population first, avoid conditioning on outcomes, and flag survivorship and selection effects when interpreting results.
Level 4–5 (Advanced / Pioneering) — selection is modeled explicitly (weighting, full-cohort designs), and automated analyses are specified to avoid outcome-conditioned samples by construction.

References

[1] Hernán, M. A., Hernández-Díaz, S., & Robins, J. M. (2004). "A Structural Approach to Selection Bias". Epidemiology, 15(5), 615–625. doi:10.1097/01.ede.0000135174.63482.43.
[2] Elwert, F., & Winship, C. (2014). "Endogenous Selection Bias: The Problem of Conditioning on a Collider Variable". Annual Review of Sociology, 40, 31–53. doi:10.1146/annurev-soc-071913-043455.
[3] Mangel, M., & Samaniego, F. J. (1984). "Abraham Wald's Work on Aircraft Survivability". Journal of the American Statistical Association, 79(386), 259–267.
