WFM Research Agenda

From WFM Labs

The WFM Research Agenda proposes a structured program of investigation for the workforce management field, organized by time horizon. Contact centers are undergoing their most significant transformation since the shift from on-premises to cloud infrastructure, driven by AI agent deployment, algorithmic scheduling, and evolving labor regulations. The research community must anticipate and address the challenges that practitioners will face in the coming decade. This agenda draws on the open problems identified by the academic community and the practical priorities emerging from industry.

Near-Term Research Priorities (1–2 Years)

AI Agent Staffing Models

Research questions:

  • What queueing models correctly characterize AI agent behavior in service systems, given their unique properties (near-infinite concurrency, stochastic resolution capability, zero fatigue, computational latency constraints)?
  • How should capacity planning account for the coupling between AI and human staffing — where AI resolution rates determine human escalation volumes?
  • What is the appropriate cost function for blended human-AI workforce optimization, given that AI costs are infrastructure-based (compute, licensing) rather than labor-based?

Potential methodologies: Fluid model analysis of tandem queues with AI front-end and human back-end servers. Simulation studies calibrated with empirical AI agent performance data from production deployments. Extension of the Garnett-Mandelbaum-Reiman Erlang-A framework to incorporate an AI resolution stage before the traditional human queue.[1]
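The coupling between AI resolution rates and human staffing can be illustrated with a minimal sketch. It assumes that the AI stage resolves a fixed fraction of contacts, that escalations reach the human queue as a Poisson stream, and that the human stage behaves as an Erlang C (M/M/n) queue without abandonment. That is a deliberate simplification of the Erlang-A model cited above, and all function and parameter names are hypothetical.

```python
import math


def erlang_c_wait_probability(servers: int, offered_load: float) -> float:
    """Probability that an arrival must wait in an M/M/n queue (Erlang C)."""
    if servers <= offered_load:
        return 1.0  # unstable system: treat every arrival as delayed
    # Erlang B via the numerically stable recursion, then convert to Erlang C.
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    utilization = offered_load / servers
    return blocking / (1.0 - utilization * (1.0 - blocking))


def service_level(servers: int, arrival_rate: float,
                  mean_handle_time: float, target_wait: float) -> float:
    """Fraction of contacts answered within target_wait (same time unit as AHT)."""
    offered_load = arrival_rate * mean_handle_time
    if servers <= offered_load:
        return 0.0
    p_wait = erlang_c_wait_probability(servers, offered_load)
    decay = math.exp(-(servers - offered_load) * target_wait / mean_handle_time)
    return 1.0 - p_wait * decay


def human_agents_required(total_arrival_rate: float, ai_resolution_rate: float,
                          mean_handle_time: float, target_wait: float,
                          target_service_level: float) -> int:
    """Smallest human headcount meeting the service level on escalated volume."""
    if not 0.0 <= ai_resolution_rate <= 1.0:
        raise ValueError("ai_resolution_rate must be between 0 and 1")
    if not 0.0 < target_service_level < 1.0:
        raise ValueError("target_service_level must be strictly between 0 and 1")
    # Assumption: contacts the AI does not resolve all escalate to humans.
    escalation_rate = total_arrival_rate * (1.0 - ai_resolution_rate)
    if escalation_rate == 0.0:
        return 0
    servers = math.floor(escalation_rate * mean_handle_time) + 1
    while service_level(servers, escalation_rate,
                        mean_handle_time, target_wait) < target_service_level:
        servers += 1
    return servers
```

Calling `human_agents_required` across a range of `ai_resolution_rate` values shows how sensitive human headcount is to the AI resolution estimate. A validated model would also need to capture abandonment, non-Poisson escalation streams, and the longer handle times typical of escalated contacts.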

Practical implications: Organizations deploying AI agents need validated models to determine the right mix of AI capacity and human staffing. Current practice relies on vendor-provided estimates and extrapolation from pilot programs, methods that tend to underestimate the complexity of human-AI workforce interaction effects. Validated staffing models would enable more accurate ROI projections and reduce the risk of service degradation during AI deployment transitions.

Real-Time Coaching Effectiveness Measurement

Research questions:

  • Does real-time AI-driven coaching (suggesting responses, detecting customer sentiment, prompting procedural compliance) measurably improve agent performance?
  • What is the dose-response relationship — at what point does coaching frequency become counterproductive, creating cognitive overload?
  • How do coaching effects vary by agent experience level, interaction type, and coaching modality (visual prompt vs. audio vs. haptic)?

Potential methodologies: Randomized controlled trials with crossover design in operational contact centers. Interrupted time series analysis of coaching feature deployments. Multilevel modeling to separate agent-level, team-level, and system-level effects.[2]
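As a hedged illustration of the crossover design, the sketch below estimates a coaching effect from a two-period, two-sequence (AB/BA) trial. It assumes one outcome score per agent per period, no carryover between periods, and an additive period effect. The function name and input layout are hypothetical.

```python
from statistics import mean


def crossover_treatment_effect(coached_first, coached_second):
    """Estimate the coaching effect in a 2x2 (AB/BA) crossover design.

    Each argument is a list of (period_1_score, period_2_score) pairs,
    one pair per agent. Agents in coached_first receive coaching in
    period 1 only; agents in coached_second receive it in period 2 only.
    """
    if not coached_first or not coached_second:
        raise ValueError("both sequences need at least one agent")
    # Within-agent differences remove stable agent-level ability.
    diff_ab = mean(p1 - p2 for p1, p2 in coached_first)
    diff_ba = mean(p1 - p2 for p1, p2 in coached_second)
    # Subtracting the two sequences cancels the additive period effect;
    # halving recovers the treatment effect under the no-carryover assumption.
    return (diff_ab - diff_ba) / 2.0
```

Using each agent as their own control is what makes the design attractive in contact centers, where between-agent variation is large. Carryover from coaching (skills that persist after the prompts are switched off) would violate the assumption above and would call for washout periods or a parallel-group design.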

Practical implications: Real-time coaching technology is a multi-billion-dollar market segment driven almost entirely by vendor claims and case studies rather than rigorous causal evidence. Contact center operators are making significant investments without knowing whether the technology produces net positive effects on the outcomes they care about — resolution rates, customer satisfaction, agent retention — or whether the cognitive load imposed by continuous coaching degrades complex problem-solving performance.

Predictive Scheduling Law Impact Analysis

Research questions:

  • What is the measured operational cost impact of predictive scheduling laws (advance notice requirements, schedule change penalties, right-to-rest provisions) on contact center workforce management?
  • Do predictive scheduling laws achieve their intended policy objectives (schedule stability, worker wellbeing) without disproportionate efficiency costs?
  • What scheduling optimization strategies minimize compliance cost while maintaining service levels?

Potential methodologies: Difference-in-differences analysis comparing contact center operations in jurisdictions with and without predictive scheduling laws. Mathematical programming models incorporating scheduling law constraints to quantify theoretical cost impacts. Employer and employee survey instruments measuring perceived schedule stability and wellbeing outcomes.[3]
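The core difference-in-differences estimate can be sketched in a few lines. This is a minimal two-group, two-period version that assumes parallel trends between jurisdictions. The function name and the choice of outcome metric (for example, weekly labor cost per site) are illustrative assumptions.

```python
from statistics import mean


def difference_in_differences(treated_pre, treated_post,
                              control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    Each argument is a list of outcome observations, for example one
    value per contact center site in that group and period.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control group's change stands in for what would have happened
    # to the treated group without the law (parallel-trends assumption).
    return treated_change - control_change
```

A published analysis would normally use a regression with site and period fixed effects and clustered standard errors. The arithmetic above is only the point estimate that such a regression recovers in the simplest case.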

Practical implications: Predictive scheduling legislation is expanding rapidly across U.S. cities and states, with similar regulatory trends in the European Union, United Kingdom, and Australia. Contact center operators need evidence-based guidance on compliance strategies that maintain operational flexibility. Policymakers need data on whether these laws achieve their intended objectives in the contact center context, where demand volatility creates inherent tension between schedule predictability and service level maintenance.

Medium-Term Research Priorities (3–5 Years)

Autonomous Scheduling Validation Frameworks

Research questions:

  • How should organizations validate that AI-generated schedules are fair, compliant, and operationally effective before deployment?
  • What audit methodologies can detect algorithmic bias in scheduling (disparate impact on protected groups in shift quality, overtime distribution, or preference accommodation)?
  • What human-in-the-loop configurations optimally balance automation efficiency with oversight quality?

Potential methodologies: Formal verification methods adapted from software engineering applied to scheduling algorithm outputs. Statistical fairness audits using disparate impact metrics (four-fifths rule, demographic parity, equalized odds) applied to schedule outcomes rather than employment decisions.[4] Simulation-based stress testing of scheduling algorithms under adversarial demand scenarios and workforce composition variations.
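A four-fifths rule check on schedule outcomes might look like the following sketch. It assumes each record pairs a group label with a boolean "favorable outcome" flag (for example, whether the agent received a preferred shift). How a favorable outcome is defined is itself a research question, and the names used here are hypothetical.

```python
from collections import defaultdict


def selection_rates(records):
    """Favorable-outcome rate per group.

    records is an iterable of (group, received_favorable_outcome) pairs.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        favorable[group] += int(bool(outcome))
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact_ratio(records):
    """Lowest group rate divided by the highest group rate."""
    rates = selection_rates(records)
    if not rates:
        raise ValueError("no records supplied")
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received the outcome, so there is no disparity
    return min(rates.values()) / highest


def passes_four_fifths_rule(records, threshold=0.8):
    """True when the disparate impact ratio is at or above the threshold."""
    return disparate_impact_ratio(records) >= threshold
```

The ratio is a screening statistic, not a finding of bias. Small groups produce noisy rates, so an audit would usually pair it with a significance test and with the other metrics named above.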

Practical implications: As scheduling becomes more automated, the "black box" problem intensifies. Supervisors who previously constructed schedules — and understood the tradeoffs embedded in them — are replaced by algorithms whose tradeoff logic is opaque. Validation frameworks are necessary to maintain trust, ensure compliance, and provide accountability when algorithmic decisions produce adverse outcomes. Regulatory frameworks in the EU (AI Act) and proposed U.S. legislation increasingly require algorithmic audit capabilities for high-impact AI systems, and workforce scheduling will likely fall within scope.

Human-AI Workforce Optimization Theory

Research questions:

  • What is the theoretically optimal allocation of interaction types between human and AI agents as a function of complexity, emotional content, regulatory requirements, and customer value?
  • How do learning effects interact with allocation — does routing complex interactions away from junior agents slow their skill development?
  • What dynamic reallocation policies are robust to changing AI capabilities (improving resolution rates over time)?

Potential methodologies: Multi-armed bandit formulations where arms represent allocation policies and payoffs incorporate customer satisfaction, resolution rate, agent development, and cost. Markov decision process models with state variables capturing agent skill levels, AI capability levels, and demand composition. Asymptotic analysis in the Halfin-Whitt regime extended to heterogeneous server systems with learning.[5]
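The bandit formulation can be made concrete with a minimal Thompson sampling sketch. It assumes each arm is a candidate allocation policy and collapses the composite payoff into a single success or failure per interaction with a fixed, unknown success probability. The simulated success rates and all names are hypothetical.

```python
import random


def thompson_sampling(true_success_rates, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling over allocation policies.

    true_success_rates holds the (normally unknown) success probability
    of each policy; it is used here only to simulate outcomes.
    Returns the number of times each policy was selected.
    """
    rng = random.Random(seed)
    n_arms = len(true_success_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw a plausible success rate for each policy from its posterior
        # (uniform Beta(1, 1) prior) and play the policy with the best draw.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        pulls[arm] += 1
        if rng.random() < true_success_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls
```

This stationary version ignores two features the research questions emphasize: AI capability that improves over time, and agent learning that depends on which interactions agents receive. Both would require non-stationary or state-dependent formulations such as the Markov decision process models mentioned above.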

Practical implications: The human-AI allocation decision is the defining workforce strategy question for the next decade. Erring toward premature AI deployment degrades customer experience; erring toward excessive caution forgoes cost savings and scalability benefits. A theoretical framework would provide principled guidance for a decision currently made by intuition and vendor influence.

Skills-Based Dynamic Capacity Planning

Research questions:

  • How should capacity planning models account for the continuous evolution of required skill sets as AI handles routine interactions and human agents handle increasingly complex escalations?
  • What training and cross-skilling investment strategies optimize long-term workforce capability under uncertainty about AI capability trajectories?
  • How do network effects in skill-based routing (an agent skilled in A and B provides routing flexibility beyond the sum of single-skill coverage) scale with workforce size and skill diversity?

Potential methodologies: Network flow models with dynamic skill adjacencies. Stochastic dynamic programming for training investment decisions under technology uncertainty. Simulation optimization for skill-based routing configurations with time-varying skill requirements. The theoretical foundation draws on Bassamboo, Harrison, and Zeevi's (2006) LP-based framework extended to incorporate skill evolution dynamics.[6]
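The routing flexibility that cross-skilling provides can be illustrated by comparing two dedicated single-skill pools against one fully cross-skilled pool. This sketch assumes Erlang C (M/M/n) queues, identical handle times for both skills, and no abandonment. It is far simpler than the LP-based framework cited above, and the names are hypothetical.

```python
import math


def erlang_c(servers: int, offered_load: float) -> float:
    """Erlang C delay probability for an M/M/n queue."""
    if servers <= offered_load:
        return 1.0
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    utilization = offered_load / servers
    return blocking / (1.0 - utilization * (1.0 - blocking))


def min_agents(arrival_rate: float, mean_handle_time: float,
               target_wait: float, target_service_level: float) -> int:
    """Smallest pool size that answers the target share within target_wait."""
    if not 0.0 < target_service_level < 1.0:
        raise ValueError("target_service_level must be strictly between 0 and 1")
    offered_load = arrival_rate * mean_handle_time
    servers = math.floor(offered_load) + 1
    while True:
        decay = math.exp(-(servers - offered_load) * target_wait / mean_handle_time)
        if 1.0 - erlang_c(servers, offered_load) * decay >= target_service_level:
            return servers
        servers += 1


def pooling_comparison(rate_per_skill: float, mean_handle_time: float,
                       target_wait: float, target_service_level: float):
    """Headcount for two dedicated pools versus one cross-skilled pool."""
    dedicated = 2 * min_agents(rate_per_skill, mean_handle_time,
                               target_wait, target_service_level)
    pooled = min_agents(2 * rate_per_skill, mean_handle_time,
                        target_wait, target_service_level)
    return dedicated, pooled
```

The gap between the two headcounts is a rough upper bound on what cross-skilling can save under these assumptions. In practice, partial cross-skilling, training cost, and slower handling by multi-skilled agents all reduce the benefit, which is why the scaling question remains open.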

Practical implications: Workforce planners currently make skill-mix decisions using static analysis that assumes stable skill requirements. In reality, the skill landscape shifts continuously as AI capabilities expand, new products launch, and customer expectations evolve. Dynamic capacity planning models would enable proactive workforce development strategies rather than reactive hiring and training cycles.

Long-Term Research Priorities (5–10 Years)

Fully Autonomous WFM Systems

Research questions:

  • What are the necessary and sufficient conditions for a WFM system to operate autonomously — forecasting demand, generating schedules, managing real-time operations, and adapting to changing conditions — without human intervention?
  • What failure modes are unique to autonomous WFM (cascading optimization errors, adversarial manipulation, distributional shift in demand patterns), and how should systems detect and recover from them?
  • What is the appropriate level of autonomy for different WFM decisions, and how should this vary with organizational context (industry, scale, regulatory environment)?

Potential methodologies: Control theory frameworks for autonomous system design, incorporating stability analysis, robustness margins, and failure mode enumeration. Adversarial testing methodologies from autonomous vehicle research adapted to WFM contexts. Levels-of-autonomy frameworks adapted from the SAE autonomous driving taxonomy (SAE J3016) applied to WFM decision categories.[7]
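One way a levels-of-autonomy framework could be represented in software is sketched below. The level names, their number, and the assignment of WFM decisions to levels are all hypothetical illustrations loosely inspired by the graded structure of SAE J3016. They are not a mapping of the SAE levels and not an established WFM standard.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """Hypothetical autonomy levels for WFM decisions (not SAE J3016 levels)."""
    MANUAL = 0             # humans decide; the system only records
    DECISION_SUPPORT = 1   # system recommends; humans decide
    APPROVAL_REQUIRED = 2  # system decides; a human approves before execution
    SUPERVISED = 3         # system executes; humans monitor and can override
    AUTONOMOUS = 4         # system executes without routine human involvement


# Illustrative assignment only; real values would depend on industry,
# scale, and regulatory environment.
DECISION_AUTONOMY = {
    "demand_forecasting": AutonomyLevel.SUPERVISED,
    "schedule_generation": AutonomyLevel.APPROVAL_REQUIRED,
    "intraday_reforecast": AutonomyLevel.AUTONOMOUS,
    "overtime_assignment": AutonomyLevel.DECISION_SUPPORT,
}


def requires_human_approval(decision: str) -> bool:
    """True when a human must act before the decision takes effect."""
    return DECISION_AUTONOMY[decision] <= AutonomyLevel.APPROVAL_REQUIRED
```

Making the level explicit per decision category, rather than per system, reflects the research question above: different WFM decisions may warrant different degrees of autonomy within the same organization.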

Practical implications: The trajectory of WFM technology points toward increasing automation. Understanding the theoretical limits and practical requirements of full autonomy — before it is deployed — is essential for safe and effective adoption. The alternative is incremental automation without a coherent framework, which risks catastrophic failure when autonomous components interact in unanticipated ways.

Workforce Intelligence as a Discipline

Research questions:

  • Can workforce management evolve from a set of operational processes (forecasting, scheduling, real-time management) into a unified analytical discipline — "workforce intelligence" — with its own theoretical foundations, methodological standards, and professional identity?
  • What are the core axioms and principles that would define such a discipline?
  • How should workforce intelligence integrate methods from operations research, organizational psychology, labor economics, computer science, and management science into a coherent body of knowledge?

Potential methodologies: Scientometric analysis of the WFM research literature to identify intellectual clusters and integration opportunities. Delphi studies with leading researchers and practitioners to articulate disciplinary boundaries and core principles. Curriculum development research for graduate programs in workforce intelligence. Historical analysis of how related fields (information systems, supply chain management, business analytics) achieved disciplinary identity.[8]

Practical implications: The fragmentation of WFM knowledge across multiple academic disciplines has slowed the development of integrated solutions. A practitioner seeking to understand the full scope of WFM — from queueing theory to organizational psychology to labor law — must synthesize across disconnected literatures. Establishing workforce intelligence as a coherent discipline would accelerate knowledge production, improve practitioner education, and raise the professional standing of WFM leaders.

Ethical Frameworks for Algorithmic Workforce Management

Research questions:

  • What ethical principles should govern algorithmic workforce management decisions — particularly those affecting schedule quality, work intensity, performance evaluation, and employment continuity?
  • How should these principles be operationalized in system design, balancing organizational efficiency objectives with worker wellbeing and autonomy?
  • What governance structures (review boards, audit requirements, transparency obligations, worker participation mechanisms) are appropriate for organizations deploying algorithmic WFM?

Potential methodologies: Applied ethics research using case-based reasoning and stakeholder analysis. Participatory design research involving workers, managers, and technology developers in the articulation of ethical requirements. Comparative regulatory analysis across jurisdictions. Experimental studies measuring worker perceptions of algorithmic fairness under different transparency and participation conditions.[9]

Practical implications: Algorithmic workforce management is already the subject of legislative attention (EU AI Act, U.S. state-level AI employment laws), labor organizing (algorithmic management as a union bargaining issue), and media scrutiny (journalistic investigations of algorithmic scheduling at major retailers and gig platforms). Organizations that proactively develop ethical frameworks will be better positioned to navigate this evolving landscape than those that wait for regulatory mandates. The research community has an obligation to provide the conceptual foundations and empirical evidence that inform both organizational practice and public policy.

Cross-Cutting Themes

Several themes recur across time horizons:

Empirical validation: Across all research areas, the gap between theoretical models and operational reality requires sustained empirical work. The field needs more datasets, more field experiments, and more honest assessment of where models fail. Brown et al. (2005) set the standard for empirical rigor in contact center research; that standard must be maintained and extended to new problem domains.[10]

Interdisciplinary integration: The most important problems span traditional disciplinary boundaries. AI agent staffing requires queueing theory, computer science, and economics. Autonomous scheduling validation requires operations research, algorithmic fairness, and law. Burnout prediction requires organizational psychology, time series analysis, and ethics. No single discipline can address these problems alone.

Industry-academic collaboration: The contact center industry generates enormous volumes of operational data that could fuel breakthrough research but are rarely shared with academics due to confidentiality concerns, competitive sensitivity, and organizational inertia. Creating mechanisms for responsible data sharing — anonymized benchmarks, synthetic datasets, academic-industry partnerships with appropriate governance — is essential for the field's advancement.

Practitioner translation: Academic findings must be translated into actionable guidance for practitioners. The history of WFM research shows a persistent gap between the sophistication of academic models and the simplicity of tools actually used in practice. Closing this gap requires researchers who understand practice and practitioners who value research — a two-way investment that benefits from venues like INFORMS, the International Symposium on Mathematical Programming, and industry conferences.

References

  1. Garnett, O., Mandelbaum, A., & Reiman, M. I. (2002). "Designing a call center with impatient customers." Manufacturing & Service Operations Management, 4(3), 208–227.
  2. Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Houghton Mifflin. ISBN 978-0-395-61556-0.
  3. Autor, D. H. (2003). "Outsourcing at will: The contribution of unjust dismissal doctrine to the growth of employment outsourcing." Journal of Labor Economics, 21(1), 1–42.
  4. Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., & Venkatasubramanian, S. (2015). "Certifying and removing disparate impact." Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 259–268.
  5. Halfin, S., & Whitt, W. (1981). "Heavy-traffic limits for queues with many exponential servers." Operations Research, 29(3), 567–588.
  6. Bassamboo, A., Harrison, J. M., & Zeevi, A. (2006). "Design and control of a large call center: Asymptotic analysis of an LP-based method." Operations Research, 54(3), 419–435.
  7. SAE International. (2021). Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. SAE Standard J3016_202104.
  8. Davenport, T. H., & Harris, J. G. (2007). Competing on Analytics: The New Science of Winning. Harvard Business Press. ISBN 978-1-4221-0332-6.
  9. Kellogg, K. C., Valentine, M. A., & Christin, A. (2020). "Algorithms at work: The new contested terrain of control." Academy of Management Annals, 14(1), 366–410.
  10. Brown, L., Gans, N., Mandelbaum, A., Sakov, A., Shen, H., Zeltyn, S., & Zhao, L. (2005). "Statistical analysis of a telephone call center: A queueing-science perspective." Journal of the American Statistical Association, 100(469), 36–50.