Open Problems in Workforce Management Research
Open Problems in Workforce Management Research surveys the unresolved questions and active frontiers in workforce management (WFM) scholarship. Despite decades of progress in queueing theory, mathematical programming, and human factors research, significant gaps remain between theoretical models and operational reality. This article catalogs the most consequential open problems, organized by domain, and identifies the researchers and institutions advancing the state of the art.
Forecasting
Multi-Channel Demand Correlation
Modern contact centers handle voice, chat, email, social media, and increasingly AI-agent interactions simultaneously. Classical forecasting approaches treat each channel independently, applying univariate time series methods to each queue. This independence assumption breaks down in practice: a service outage generates correlated spikes across voice, chat, and social channels with different lag structures and volume ratios.
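The lag structure described above can be examined with a lagged cross-correlation between two channel volume series. The following is a minimal sketch, not a forecasting model: the function names and the sample series are hypothetical, and it assumes equally spaced interval counts for both channels.

```python
from statistics import mean, pstdev


def lagged_correlation(leader, follower, lag):
    """Pearson correlation between leader[t] and follower[t + lag], lag >= 0."""
    if lag < 0 or lag >= len(leader):
        raise ValueError("lag must satisfy 0 <= lag < len(leader)")
    a = leader[:len(leader) - lag]
    b = follower[lag:lag + len(a)]
    mean_a, mean_b = mean(a), mean(b)
    sd_a, sd_b = pstdev(a), pstdev(b)
    if sd_a == 0 or sd_b == 0:
        return 0.0  # a flat series carries no correlation information
    covariance = mean((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return covariance / (sd_a * sd_b)


def best_lag(leader, follower, max_lag):
    """Lag (in intervals) at which the follower tracks the leader most closely."""
    return max(range(max_lag + 1),
               key=lambda k: lagged_correlation(leader, follower, k))


# Hypothetical interval counts: chat echoes voice two intervals later at half the volume.
voice = [10, 12, 50, 11, 9, 10, 40, 12, 10, 11, 9, 10]
chat = [8, 8] + [0.5 * v + 3 for v in voice[:-2]]
print(best_lag(voice, chat, max_lag=4))
```

A multivariate model such as VAR estimates these cross-channel lags jointly; the pairwise scan here is only a diagnostic for whether such lags are present.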
Current state: Multivariate time series methods such as vector autoregression (VAR) and dynamic factor models have been applied in econometrics and supply chain forecasting but remain underexplored in contact center contexts.[1] Ibrahim and L'Ecuyer (2013) demonstrated that call arrival processes exhibit significant dependencies across time periods within a day, but cross-channel correlation modeling remains nascent.[2]
Why it matters: Inaccurate cross-channel forecasts cascade into staffing errors. If a chat surge is treated independently from a simultaneous voice decline, the organization overstaffs voice and understaffs chat. With channel migration accelerating — particularly toward digital and AI-assisted channels — this problem intensifies.
Key researchers: Pierre L'Ecuyer (Université de Montréal), Rouba Ibrahim (University College London), Zeynep Aksin (Koç University).
Real-Time Bayesian Updating at Scale
Intraday forecast updating — revising demand predictions as actual arrival data accumulates throughout the day — is well-established conceptually. Bayesian methods offer a principled framework: prior forecasts are updated with observed arrivals to produce posterior predictions.[3] The open problem is computational: performing these updates in real time across hundreds of queues with complex dependency structures strains current implementations.
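The prior-to-posterior step can be illustrated with a conjugate Gamma-Poisson model. This is a deliberately simplified sketch and not the model of any paper cited here: it assumes the count in interval i is Poisson with mean θ·p_i, where p_i is a known intraday profile and the daily volume θ has a Gamma(α, β) prior (shape, rate). All names and numbers are hypothetical.

```python
def posterior_parameters(alpha, beta, observed_counts, profile):
    """Gamma posterior (shape, rate) for daily volume after the observed intervals."""
    k = len(observed_counts)
    if k > len(profile):
        raise ValueError("more observations than profile intervals")
    # Poisson likelihood in theta is proportional to theta**sum(n) * exp(-theta * sum(p)).
    return alpha + sum(observed_counts), beta + sum(profile[:k])


def updated_forecast(alpha, beta, observed_counts, profile):
    """Posterior-mean forecast for each interval not yet observed."""
    shape, rate = posterior_parameters(alpha, beta, observed_counts, profile)
    expected_daily_volume = shape / rate
    return [expected_daily_volume * p for p in profile[len(observed_counts):]]


# Prior mean 100 / 0.1 = 1000 contacts; the first two intervals run hot.
profile = [0.1, 0.2, 0.3, 0.4]
remaining = updated_forecast(alpha=100, beta=0.1,
                             observed_counts=[150, 250], profile=profile)
print(remaining)  # approximately [375.0, 500.0]
```

Each update is constant-time per queue because the queues are treated as independent. Dependency structures across queues remove that conjugacy, which is where the computational difficulty arises.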
Current state: Weinberg, Brown, and Stroud (2007) demonstrated Bayesian updating for call center arrivals using doubly stochastic Poisson processes. Shen and Huang (2008) developed interday and intraday forecasting models exploiting the correlation structure of arrival counts.[4] However, scaling these methods to omnichannel environments with hundreds of queues and sub-minute update intervals remains computationally prohibitive without approximation techniques that sacrifice theoretical guarantees.
Why it matters: The gap between daily forecast accuracy and real-time operational need is where service levels are won or lost. A 15-minute delay in detecting a volume shift can produce service level violations that persist for hours.
Key researchers: Haipeng Shen (University of Hong Kong), Lawrence D. Brown (University of Pennsylvania, deceased 2018), Jonathan Weinberg (The Annenberg Public Policy Center).
Transfer Learning for New Queues
When organizations launch new products, enter new markets, or open new service channels, they face a cold-start problem: no historical data exists for the new queue. Traditional approaches rely on analogy-based estimation from similar existing queues, but this is ad hoc and poorly calibrated.
Current state: Transfer learning and domain adaptation methods from machine learning offer a framework for borrowing statistical strength from related queues.[5] Recent work in related service operations contexts has explored Gaussian process priors that encode structural similarities between demand patterns.[6] Application to contact center queue initialization is an active area with limited published results.
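The idea of borrowing strength from a related queue can be shown with a credibility-style shrinkage estimate. This is far simpler than the transfer learning and Gaussian process approaches mentioned above, and is offered only as an illustration; the function name and the prior_weight parameter are assumptions.

```python
def shrunk_rate(new_queue_counts, analog_rate, prior_weight):
    """Blend a new queue's short history with an analog queue's rate.

    prior_weight is the number of pseudo-observations granted to the analog
    queue: the larger it is, the more slowly the new data takes over.
    """
    n = len(new_queue_counts)
    if n == 0:
        return analog_rate  # cold start: nothing observed yet
    weight_on_data = n / (n + prior_weight)
    observed_mean = sum(new_queue_counts) / n
    return weight_on_data * observed_mean + (1 - weight_on_data) * analog_rate


print(shrunk_rate([], analog_rate=100, prior_weight=5))                         # 100
print(shrunk_rate([120, 130, 110, 120, 120], analog_rate=100, prior_weight=5))  # 110.0
```

With five observations and a prior weight of five, the estimate sits halfway between the analog rate and the observed mean; calibrating that weight is the part the ad hoc approach leaves unresolved.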
Why it matters: The proliferation of AI agent queues makes this problem urgent. Organizations deploying conversational AI may stand up dozens of new interaction types within months, each requiring staffing models before sufficient data accumulates.
Key researchers: Nikolay Laptev (formerly Uber, now Meta), Rob Hyndman (Monash University), Slawek Smyl (Uber/NVIDIA).
Scheduling
Multi-Objective Optimization with Fairness Constraints
Workforce scheduling is inherently multi-objective: minimize cost, maximize service level, and satisfy employee preferences. The field has well-developed methods for bi-criteria optimization, but the incorporation of formal fairness constraints — equitable distribution of desirable and undesirable shifts, balanced overtime allocation, uniform schedule quality across demographic groups — introduces mathematical complexity that current methods handle poorly.
Current state: Ernst et al. (2004) surveyed staff scheduling and rostering methods comprehensively, noting that most practical systems use weighted-sum scalarization of multiple objectives.[7] More recent rostering work includes the online stochastic algorithm of Legrain, Omer, and Rosat (2020) for dynamic nurse scheduling.[8] Formal fairness criteria, such as equalized workload variance or max-min individual utility, have not been systematically applied to WFM scheduling.
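Criteria based on workload variance and worst-case burden can at least be measured on an existing roster before they are built into an optimizer. The sketch below computes three such summaries; the roster layout, shift labels, and function names are hypothetical.

```python
from statistics import pvariance


def undesirable_counts(roster, undesirable):
    """Number of undesirable shifts assigned to each agent."""
    return {agent: sum(1 for shift in shifts if shift in undesirable)
            for agent, shifts in roster.items()}


def fairness_summary(roster, undesirable):
    """Worst-case burden, spread, and variance of undesirable-shift counts."""
    values = list(undesirable_counts(roster, undesirable).values())
    return {
        "max_burden": max(values),            # what a min-max objective would minimize
        "spread": max(values) - min(values),  # gap between most and least burdened agent
        "variance": pvariance(values),        # what a variance-equalizing objective would minimize
    }


roster = {
    "agent_a": ["night", "day", "night"],
    "agent_b": ["day", "day", "night"],
    "agent_c": ["day", "day", "day"],
}
summary = fairness_summary(roster, undesirable={"night"})
print(summary)
```

Extending the same counts to demographic groups would require group labels, which this sketch does not assume are available.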
Why it matters: Scheduling fairness directly affects attrition. Perceived inequity in shift assignment is consistently cited among the top drivers of contact center agent turnover. As scheduling becomes more automated, algorithmic fairness becomes both an ethical imperative and a regulatory concern under emerging AI governance frameworks.
Key researchers: Andreas T. Ernst (Monash University), Louis-Martin Rousseau (Polytechnique Montréal), Andrea Lodi (Cornell Tech).
Dynamic Scheduling with AI Agents
The integration of AI agents into service delivery fundamentally disrupts scheduling models. Unlike human agents, AI agents have near-infinite availability but variable capability, no break requirements but potential computational constraints, and zero labor cost but non-zero infrastructure cost. The open problem is how to jointly schedule human and AI workforces when the two resource types have fundamentally different constraint structures.
Current state: The blended workforce scheduling problem has no established formulation in the operations research literature. Related work on parallel machine scheduling with heterogeneous processors exists in manufacturing contexts.[9] The contact center-specific challenge — where AI agents handle routine interactions and escalate complex ones to humans in real time — creates a coupled system where human staffing requirements depend on AI performance, which itself varies with interaction complexity.
Why it matters: Every major contact center technology vendor is deploying AI agents. Without rigorous scheduling models that account for human-AI workforce composition, organizations are making multi-million-dollar staffing decisions based on intuition and vendor claims rather than mathematical analysis.
Key researchers: Avishai Mandelbaum (Technion), Ramandeep Randhawa (USC Marshall), Itai Gurvich (Northwestern Kellogg).
Distributed Scheduling for Remote Workforces
The post-2020 shift to remote and hybrid work introduced scheduling dimensions absent from classical models: timezone-spanning coverage requirements, home-office ergonomic constraints, reduced visibility into agent availability, and employee preferences for flexible scheduling that may not align with coverage needs.
Current state: Distributed scheduling has been studied extensively in computing (distributed systems, cloud resource allocation) but the workforce variant introduces human behavioral constraints that computational models do not face.[10] Emerging research examines preference-aware scheduling with incentive compatibility — agents truthfully reporting preferences because the mechanism rewards honesty — but practical implementations remain limited.
Why it matters: Remote work is permanent for a substantial fraction of the contact center workforce. Scheduling models that assume a co-located workforce with uniform shift structures are increasingly obsolete.
Key researchers: Jonathan F. Bard (University of Texas at Austin), Rainer Kolisch (Technical University of Munich).
Real-Time Management
Online Optimization Under Uncertainty
Real-time workforce management is fundamentally an online optimization problem: decisions must be made sequentially without knowledge of future arrivals, handle times, or agent availability events. Classical approaches use threshold policies (e.g., activate overtime when queue length exceeds threshold X), but a fixed threshold is generally suboptimal in non-stationary environments because the level that is appropriate at one time of day is too aggressive or too lax at another.
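A fixed threshold rule of the kind described above fits in a few lines, which is part of its appeal in practice. The sketch below also counts how often the decision flips, since each flip (activating or releasing overtime) carries a real operational cost. The function names and the queue trace are hypothetical.

```python
def threshold_policy(queue_lengths, threshold):
    """One decision per observation: True means overtime is active."""
    return [length > threshold for length in queue_lengths]


def count_switches(decisions, initially_active=False):
    """Number of times the decision changes state across the trace."""
    switches = 0
    previous = initially_active
    for decision in decisions:
        if decision != previous:
            switches += 1
        previous = decision
    return switches


# A queue hovering around the threshold makes the rule flip on almost every observation.
trace = [3, 6, 4, 7, 5, 8, 2]
decisions = threshold_policy(trace, threshold=5)
print(decisions)
print(count_switches(decisions))  # 6
```

The rule reacts only to the current queue length and ignores the cost of flipping, which is the kind of cost that online optimization with switching costs models explicitly.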
Current state: Competitive analysis and regret-minimization frameworks from online optimization theory provide bounds on achievable performance but are rarely applied to WFM contexts.[11] Recent advances in online convex optimization with switching costs have direct applicability to real-time staffing adjustments (moving agents between queues, activating overflow groups) but have not been adapted to the WFM domain.
Why it matters: The gap between theoretically optimal real-time decisions and current practice (spreadsheet-based threshold rules) represents a major operational efficiency opportunity in modern contact centers.
Key researchers: Mor Harchol-Balter (Carnegie Mellon), Galit Yom-Tov (Technion), Avishai Mandelbaum (Technion).
Reinforcement Learning for Routing
Skills-based routing — directing incoming interactions to agents based on skill match, proficiency, and availability — is a sequential decision problem under uncertainty. Reinforcement learning (RL) offers a framework for learning routing policies from operational data, but the high-dimensional action space (potentially thousands of agents), partial observability (unknown caller intent until interaction begins), and non-stationarity (workforce composition changes with shift boundaries) create formidable challenges.
Current state: Dai and Shi (2019) applied approximate dynamic programming to a large-scale queueing control problem (inpatient overflow), demonstrating feasibility for moderate-scale systems.[12] Separately, Gurvich and Whitt (2009) developed asymptotic optimality results for routing in many-server systems that provide theoretical benchmarks.[13] The intersection, RL-based routing that achieves near-optimal performance in realistic contact center environments, remains an open frontier.
Why it matters: Routing decisions are made millions of times daily in large contact centers. Even small improvements in routing efficiency translate to measurable cost savings and service quality gains.
Key researchers: Jim Dai (Cornell), Ward Whitt (Columbia), Amy Ward (University of Chicago Booth).
Predictive Adherence
Schedule adherence — whether agents are performing assigned activities at assigned times — is traditionally measured retroactively. Predictive adherence would anticipate deviations before they occur, enabling preemptive intervention. The open problem is developing models that reliably predict individual agent adherence behavior in real time using observable signals.
Current state: Limited academic work exists specifically on adherence prediction, though related problems in absenteeism prediction and employee behavior modeling provide foundations.[14] Survival analysis and hazard models for break-return times, combined with contextual features (time of day, day of week, queue pressure), represent a promising but underdeveloped approach.
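As one concrete instance of the survival-analysis approach, the Kaplan-Meier estimator gives the probability that an agent is still away t minutes into a break, while correctly handling breaks whose end was never observed (for example, because the agent logged out). This is a minimal sketch without contextual features; the function name and the data are hypothetical.

```python
def kaplan_meier(durations, returned):
    """Kaplan-Meier survival curve as a list of (time, P(still on break after time)).

    durations: minutes until return, or until observation stopped.
    returned:  True if the return was observed, False if the record is censored.
    """
    records = sorted(zip(durations, returned))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        time = records[i][0]
        events = 0
        leaving = 0
        # Group all records that share this time.
        while i < len(records) and records[i][0] == time:
            events += 1 if records[i][1] else 0
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((time, survival))
        at_risk -= leaving  # observed returns and censored records both leave the risk set
    return curve


curve = kaplan_meier(durations=[5, 7, 7, 10],
                     returned=[True, True, False, True])
print(curve)  # approximately [(5, 0.75), (7, 0.5), (10, 0.0)]
```

Contextual features such as time of day or queue pressure would enter through a regression-style hazard model rather than this unconditional estimate.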
Why it matters: Adherence loss is the primary driver of the gap between planned and actual staffing. A workforce planned for 95% adherence that delivers 88% adherence provides only about 93% of the planned productive time (0.88/0.95 ≈ 0.926), so restoring planned coverage would require roughly 8% more scheduled staff (0.95/0.88 ≈ 1.08).
Key researchers: This area lacks established academic leadership, representing an opportunity for new entrants to the field.
Human Factors
Optimal Occupancy by Task Type
Occupancy — the fraction of logged-in time spent handling interactions or in after-call work — has known performance effects: excessive occupancy increases errors and burnout, while low occupancy increases cost. The open problem is that optimal occupancy thresholds vary by task type, cognitive load, and individual characteristics, but current models treat occupancy as a single universal parameter.
Current state: Gans, Koole, and Mandelbaum (2003) identified occupancy effects as a critical factor in call center operations but noted the absence of rigorous empirical studies linking occupancy to quality and wellbeing outcomes across task types.[15] Research in cognitive psychology on vigilance decrements and task-switching costs provides relevant frameworks but has not been systematically translated to contact center task taxonomies.[16]
Why it matters: As interactions become more complex (routine queries handled by AI, escalations handled by humans), human agents face consistently higher cognitive loads. Occupancy thresholds calibrated for simple, repetitive call handling may be dangerously inappropriate for complex problem-solving interactions.
Key researchers: Ger Koole (Vrije Universiteit Amsterdam), Anat Rafaeli (Technion), Galit Yom-Tov (Technion).
Circadian-Aware Scheduling Algorithms
Human cognitive performance varies predictably with circadian rhythms: alertness, reaction time, working memory, and decision-making quality fluctuate across the 24-hour cycle. Current scheduling algorithms are circadian-blind, assigning tasks based on demand coverage without considering when individual agents are cognitively optimal for different task types.
Current state: Chronobiology research has established robust models of circadian performance variation, including the two-process model of sleep regulation.[17] Shift work research in healthcare has demonstrated that circadian-misaligned scheduling increases medical errors.[18] Integration of these findings into contact center scheduling optimization is absent from the literature.
Why it matters: 24/7 contact centers operate across all circadian phases. Matching task complexity to circadian-optimal periods for each agent could reduce errors, improve customer outcomes, and decrease burnout — without adding staff.
Key researchers: Till Roenneberg (Ludwig Maximilian University of Munich), Steven Lockley (Harvard Medical School/Monash University).
Burnout Prediction Models
Contact center agent burnout is well-documented as a driver of turnover, absenteeism, and quality degradation. Predicting which agents are approaching burnout — and intervening before irreversible disengagement occurs — is an open problem at the intersection of organizational psychology, time series analysis, and workforce analytics.
Current state: Maslach and Leiter's burnout framework identifies emotional exhaustion, depersonalization, and reduced personal accomplishment as burnout dimensions.[19] Operationalizing these constructs using behavioral signals observable in WFM systems (handle time trends, adherence patterns, schedule swap frequency, quality score trajectories) requires validated mapping between system-observable features and psychological states. Preliminary work suggests feasibility but lacks large-scale validation.[20]
Why it matters: Agent attrition costs in contact centers range from $10,000 to $25,000 per departure when accounting for recruitment, training, and productivity ramp. Early identification of burnout trajectories could significantly reduce these costs while improving agent wellbeing.
Key researchers: Christina Maslach (UC Berkeley), Wilmar Schaufeli (Utrecht University/KU Leuven).
AI Integration
Human-AI Handoff Optimization
When AI agents escalate interactions to human agents, the handoff quality critically affects customer experience and human agent workload. The open problem is determining optimal escalation policies: when should the AI escalate, how much context should transfer, and how should the human agent queue incorporate AI-originated interactions alongside direct human contacts?
Current state: Research on human-AI collaboration in decision-making provides frameworks but has not been applied specifically to the contact center escalation problem.[21] The queueing-theoretic formulation — a tandem queue where the first server (AI) has stochastic service completion (may or may not resolve) and incomplete service generates a retrial or transfer — has been studied in general terms but not with the specific structure of AI agent limitations.
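A first-order version of this tandem structure treats escalations as a thinned arrival stream feeding a standard Erlang C human queue. The sketch below rests on strong assumptions that the open problem is precisely about relaxing: Poisson escalations, exponential handle times, no abandonment or retrials, and a fixed AI resolution probability. The function and parameter names are hypothetical.

```python
import math


def erlang_c(agents, load):
    """Probability that an arriving contact must wait (M/M/N queue, load in erlangs)."""
    if load >= agents:
        return 1.0  # unstable regime: every contact waits
    blocking = 1.0
    for n in range(1, agents + 1):  # Erlang B recursion
        blocking = load * blocking / (n + load * blocking)
    return agents * blocking / (agents - load * (1.0 - blocking))


def human_agents_needed(arrival_rate, ai_resolution_prob, human_aht,
                        target_service_level, answer_threshold):
    """Smallest human headcount meeting the service level on escalated contacts.

    arrival_rate is contacts per second; human_aht and answer_threshold are in seconds.
    """
    escalation_rate = arrival_rate * (1.0 - ai_resolution_prob)
    load = escalation_rate * human_aht
    agents = max(1, math.floor(load) + 1)
    while True:
        wait_prob = erlang_c(agents, load)
        service_level = 1.0 - wait_prob * math.exp(
            -(agents - load) * answer_threshold / human_aht)
        if service_level >= target_service_level:
            return agents
        agents += 1


# 360 contacts per hour, 300-second human handle time, 80% answered within 20 seconds.
weak_ai = human_agents_needed(0.1, 0.5, 300, 0.8, 20)
strong_ai = human_agents_needed(0.1, 0.8, 300, 0.8, 20)
print(weak_ai, strong_ai)
```

The human requirement is driven entirely by the AI resolution probability here, which illustrates why errors in that one estimate propagate directly into staffing.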
Why it matters: Poor handoffs erase the efficiency gains from AI automation. Customers who must repeat information after an AI escalation experience worse outcomes than if they had reached a human directly. Optimizing the handoff is essential for AI deployment to achieve net positive customer impact.
Key researchers: Ece Kamar (Microsoft Research), Eric Horvitz (Microsoft Research), Noah Gans (University of Pennsylvania Wharton).
AI Agent Capacity Planning Theory
Classical capacity planning assumes servers (agents) with stationary, human-like service characteristics. AI agents violate these assumptions: they can handle multiple concurrent interactions, their service quality may degrade gracefully rather than failing outright, they have computational rather than labor cost structures, and they can be scaled near-instantaneously. No established theoretical framework exists for capacity planning in blended human-AI workforces.
Current state: Multi-server queueing models with heterogeneous servers provide partial foundations.[22] The specific properties of AI agents — effectively infinite concurrency with quality-dependent completion rates, zero fatigue but potential latency constraints — require new model classes. Early industry frameworks exist but lack academic rigor and validation.
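For concurrent servers, a rough starting point is Little's law, which relates the average number of sessions in progress to the arrival rate and the mean session duration. The sketch below sizes AI capacity from that average only; it says nothing about peaks, latency, or quality-dependent completion, and the function name, per-instance concurrency limit, and headroom factor are all assumptions.

```python
import math


def ai_instances_needed(arrival_rate, mean_session_time,
                        sessions_per_instance, headroom=0.0):
    """Instances needed to carry the average concurrent session count.

    arrival_rate is sessions per second; mean_session_time is in seconds.
    headroom is a fractional safety margin on top of the average (0.25 = 25%).
    """
    average_concurrent = arrival_rate * mean_session_time  # Little's law: L = lambda * W
    return math.ceil(average_concurrent * (1.0 + headroom) / sessions_per_instance)


# 2 sessions per second lasting 180 seconds gives 360 concurrent sessions on average.
print(ai_instances_needed(2, 180, sessions_per_instance=50))                 # 8
print(ai_instances_needed(2, 180, sessions_per_instance=50, headroom=0.25))  # 9
```

Unlike an Erlang calculation, nothing here depends on a waiting-time target, which is one way to see that human-calibrated models and AI capacity questions are answering different things.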
Why it matters: Organizations investing billions in AI agent deployment need capacity planning models that correctly account for the unique properties of AI servers. Using human-calibrated Erlang models for AI capacity planning produces systematically wrong answers.
Key researchers: Mor Armony (New York University Stern), Ramandeep Randhawa (USC Marshall), Achal Bassamboo (Northwestern Kellogg).
Governance Frameworks for Autonomous WFM
As WFM systems incorporate more AI-driven automation — automated scheduling, dynamic routing, real-time staffing adjustments, performance-triggered coaching — questions of algorithmic governance become critical. The open problem spans technical (how to audit algorithmic scheduling decisions), ethical (how to ensure equitable treatment), legal (compliance with scheduling laws and labor regulations), and organizational (accountability structures for automated decisions) dimensions.
Current state: Algorithmic fairness research provides technical frameworks for bias detection and mitigation, but application to workforce scheduling is underdeveloped.[23] Emerging predictive scheduling laws in jurisdictions like San Francisco, New York City, Oregon, and Chicago create compliance requirements that interact with optimization objectives in complex ways. No comprehensive governance framework exists for WFM-specific AI systems.
Why it matters: Regulatory scrutiny of algorithmic workforce management is accelerating. Organizations deploying autonomous WFM systems without governance frameworks face legal, reputational, and operational risks. The academic community has an opportunity to shape governance standards before regulatory mandates outpace available frameworks.
Key researchers: Solon Barocas (Microsoft Research/Cornell), Arvind Narayanan (Princeton), Manish Raghavan (MIT Sloan).
See also
- Workforce Management
- Operations Research in Workforce Management
- Erlang C
- Multi-Objective Optimization in WFM
- Artificial Intelligence in Workforce Management
- WFM Research Agenda
- Landmark Papers in Contact Center Operations Research
References
[1] Lütkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer-Verlag Berlin Heidelberg. ISBN 978-3-540-27752-1.
[2] Ibrahim, R., & L'Ecuyer, P. (2013). "Forecasting call center arrivals: Fixed-effects, mixed-effects, and bivariate models." Manufacturing & Service Operations Management, 15(1), 72–85.
[3] Weinberg, J., Brown, L. D., & Stroud, J. R. (2007). "Bayesian forecasting of an inhomogeneous Poisson process with applications to call center data." Journal of the American Statistical Association, 102(480), 1185–1198.
[4] Shen, H., & Huang, J. Z. (2008). "Interday forecasting and intraday updating of call center arrivals." Manufacturing & Service Operations Management, 10(3), 391–410.
[5] Pan, S. J., & Yang, Q. (2010). "A survey on transfer learning." IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359.
[6] Chapados, N. (2014). "Effective Bayesian modeling of groups of related count time series." Proceedings of the 31st International Conference on Machine Learning (ICML), 1395–1403.
[7] Ernst, A. T., Jiang, H., Krishnamoorthy, M., & Sier, D. (2004). "Staff scheduling and rostering: A review of applications, methods and models." European Journal of Operational Research, 153(1), 3–27.
[8] Legrain, A., Omer, J., & Rosat, S. (2020). "An online stochastic algorithm for a dynamic nurse scheduling problem." European Journal of Operational Research, 285(1), 196–210.
[9] Pinedo, M. L. (2016). Scheduling: Theory, Algorithms, and Systems. 5th ed. Springer. ISBN 978-3-319-26578-0.
[10] Brunner, J. O., Bard, J. F., & Kolisch, R. (2009). "Flexible shift scheduling of physicians." Health Care Management Science, 12(3), 285–305.
[11] Borodin, A., & El-Yaniv, R. (1998). Online Computation and Competitive Analysis. Cambridge University Press. ISBN 978-0-521-56392-5.
[12] Dai, J. G., & Shi, P. (2019). "Inpatient overflow: An approximate dynamic programming approach." Manufacturing & Service Operations Management, 21(4), 894–911.
[13] Gurvich, I., & Whitt, W. (2009). "Queue-and-idleness-ratio controls in many-server service systems." Mathematics of Operations Research, 34(2), 363–396.
[14] Harrison, D. A., & Martocchio, J. J. (1998). "Time for absenteeism: A 20-year review of origins, offshoots, and outcomes." Journal of Management, 24(3), 305–350.
[15] Gans, N., Koole, G., & Mandelbaum, A. (2003). "Telephone call centers: Tutorial, review, and research prospects." Manufacturing & Service Operations Management, 5(2), 79–141.
[16] Wickens, C. D., Hollands, J. G., Banbury, S., & Parasuraman, R. (2013). Engineering Psychology and Human Performance. 4th ed. Pearson. ISBN 978-0-205-02198-7.
[17] Borbély, A. A., Daan, S., Wirz-Justice, A., & Deboer, T. (2016). "The two-process model of sleep regulation: A reappraisal." Journal of Sleep Research, 25(2), 131–143.
[18] Barger, L. K., Ayas, N. T., Cade, B. E., Cronin, J. W., Rosner, B., Speizer, F. E., & Czeisler, C. A. (2006). "Impact of extended-duration shifts on medical errors, adverse events, and attentional failures." PLoS Medicine, 3(12), e487.
[19] Maslach, C., & Leiter, M. P. (2016). "Understanding the burnout experience: Recent research and its implications for psychiatry." World Psychiatry, 15(2), 103–111.
[20] Deery, S., Iverson, R., & Walsh, J. (2002). "Work relationships in telephone call centres: Understanding emotional exhaustion and employee withdrawal." Journal of Management Studies, 39(4), 471–496.
[21] Bansal, G., Nushi, B., Kamar, E., Lasecki, W. S., Weld, D. S., & Horvitz, E. (2019). "Beyond accuracy: The role of mental models in human-AI team performance." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 7, 2–11.
[22] Armony, M. (2005). "Dynamic routing in large-scale service systems with heterogeneous servers." Queueing Systems, 51(3-4), 287–329.
[23] Barocas, S., Hardt, M., & Narayanan, A. (2019). Fairness and Machine Learning: Limitations and Opportunities. MIT Press. ISBN 978-0-262-04861-3.
