Game Theory and Incentive Design in WFM
Game Theory and Incentive Design in WFM applies the mathematics of strategic interaction to workforce management problems where outcomes depend on the decisions of multiple self-interested participants. Agents choose shifts based on preferences. Managers design overtime offers to elicit participation. Scheduling systems create markets for shift swaps and voluntary time off (VTO). In each case, the rules of the game determine whether individual incentives align with operational objectives or undermine them.
Overview

Game theory studies situations where rational decision-makers interact strategically — each player's optimal choice depends on what others choose. Mechanism design is the engineering discipline within game theory: instead of analyzing an existing game, design the rules so that self-interested play produces the desired outcome.
WFM is saturated with strategic interactions that are rarely recognized as such:
- Shift bidding: Agents submit preferences. If agents learn that requesting unpopular shifts gets them better overall schedules (because the system rewards flexibility), the "true preferences" reported to the system are strategic, not honest. The bidding mechanism determines whether truth-telling is optimal.
- VTO and overtime markets: Offering VTO to the entire floor triggers strategic waiting ("if I don't take it now, will a better offer come later?"). Overtime offers face the same problem in reverse.
- Performance incentives: Average handle time (AHT) targets create strategic behavior: agents rush calls, avoid complex issues, or game after-call work codes. Goodhart's Law ("when a measure becomes a target, it ceases to be a good measure") describes exactly this kind of incentive misalignment, which game theory makes precise.
- Schedule swaps: Peer-to-peer swap markets are matching markets. The rules of the swap platform (who can swap with whom, what approval is needed, whether three-way swaps are allowed) determine market efficiency and fairness.
Mathematical Foundation
Normal-Form Games and Nash Equilibrium
A normal-form game consists of:
- N players (agents, managers, the scheduling system)
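- For each player $i$, a strategy set $S_i$
- For each player $i$, a payoff function $u_i : S_1 \times \cdots \times S_N \to \mathbb{R}$
A Nash equilibrium is a strategy profile $s^* = (s_1^*, \ldots, s_N^*)$ where no player can improve their payoff by unilaterally changing their strategy:
$$u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*) \quad \text{for all } i \text{ and all } s_i \in S_i$$
where $s_{-i}^*$ denotes the equilibrium strategies of all players other than $i$.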
Intuition: At Nash equilibrium, every player is doing the best they can given what everyone else is doing. Nobody has an incentive to deviate.
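The definition can be checked mechanically for small games. The sketch below enumerates every strategy profile and keeps those where no unilateral deviation pays. The two-agent "rush vs. resolve" payoffs are hypothetical numbers chosen for illustration, not data from this article.

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a finite normal-form game.

    payoffs maps each strategy profile (one strategy per player) to a tuple
    of payoffs (one payoff per player).
    """
    n_players = len(next(iter(payoffs)))
    # Recover each player's strategy set from the profiles that appear.
    strategy_sets = [sorted({p[i] for p in payoffs}) for i in range(n_players)]
    equilibria = []
    for profile in product(*strategy_sets):
        stable = True
        for i in range(n_players):
            for alt in strategy_sets[i]:
                # Profile in which only player i switches to `alt`.
                deviation = profile[:i] + (alt,) + profile[i + 1:]
                if payoffs[deviation][i] > payoffs[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria

# Hypothetical payoffs: two agents judged on AHT each choose how to handle calls.
payoffs = {
    ("resolve", "resolve"): (3, 3),
    ("resolve", "rush"): (1, 4),
    ("rush", "resolve"): (4, 1),
    ("rush", "rush"): (2, 2),
}
print(pure_nash_equilibria(payoffs))  # [('rush', 'rush')]
```

Under these assumed payoffs both agents rush even though both would be better off if both resolved, which is the misalignment pattern discussed later for AHT targets.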
Mechanism Design
Mechanism design reverses the game theory problem. Instead of: "Given these rules, what will players do?" the question becomes: "What rules should we set so that players do what we want?"
A mechanism is a set of rules that maps player actions to outcomes. A mechanism is incentive-compatible (or truthful) if reporting truthful preferences is every player's optimal strategy. The revelation principle states that any outcome that some mechanism implements in equilibrium can also be implemented by a direct mechanism in which truthful reporting is an equilibrium, so the designer can restrict attention to truthful mechanisms without loss of generality.
Vickrey-Clarke-Groves (VCG) mechanism: Each player reports their valuation. The mechanism allocates efficiently and charges each player the externality they impose on others. Under VCG, truth-telling is a dominant strategy. This provides the theoretical foundation for auction-based shift allocation.
Auction Theory
Auctions are mechanisms for allocating scarce resources. Relevant auction types for WFM:
- First-price sealed bid: Players submit bids; the highest bidder wins and pays their own bid. Encourages bid shading (bidding below true value).
- Second-price sealed bid (Vickrey): The highest bidder wins but pays the second-highest bid. Bidding your true value is a weakly dominant strategy.
- Combinatorial auctions: Players bid on bundles of items (e.g., sets of shifts). Useful when shifts have complementarities (an agent values Monday+Wednesday more than either alone).
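The difference between the first two formats can be seen in a few lines. This is a minimal sketch for a single bidder facing fixed rival bids. The function names, the bid values, and the rule that ties go against the bidder are all illustrative assumptions.

```python
def second_price_utility(my_value, my_bid, rival_bids):
    """Payoff to one bidder in a sealed-bid second-price auction.

    Assumption: ties are resolved against this bidder.
    """
    top_rival = max(rival_bids)
    if my_bid > top_rival:
        return my_value - top_rival  # winner pays the highest losing bid
    return 0

def first_price_utility(my_value, my_bid, rival_bids):
    """Payoff to one bidder in a sealed-bid first-price auction."""
    if my_bid > max(rival_bids):
        return my_value - my_bid  # winner pays their own bid
    return 0

value, rivals = 10, [6, 4]
print(second_price_utility(value, 10, rivals))  # 4: truthful bid wins, pays 6
print(first_price_utility(value, 10, rivals))   # 0: truthful bid wins, pays 10
print(first_price_utility(value, 7, rivals))    # 3: shading the bid pays off
```

In the second-price format the bid only decides whether the bidder wins, not what they pay, so there is nothing to gain from shading. In the first-price format the bid sets the price, so bidding true value leaves no surplus.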
Principal-Agent Theory
The principal-agent problem arises when one party (the principal — management) delegates work to another (the agent — the employee) whose effort or quality is not directly observable. The principal must design incentives (compensation, metrics, consequences) that align the agent's self-interest with organizational objectives.
Key results:
- Moral hazard: When effort is unobservable, agents exert less effort than the principal would want. Solution: tie compensation to observable outcomes.
- Adverse selection: When agent capabilities are private information, the principal cannot distinguish high-performers from low-performers before hiring. Solution: screening mechanisms (trial periods, skill assessments).
- Information rent: Agents with private information capture surplus. The principal must "pay for truth."
WFM Applications
Shift Bidding as Auction Mechanism
Problem: 150 agents must be assigned to shifts. Management allows agents to rank their top 5 shift preferences. How should the algorithm use these preferences?
Naive approach (first-preference-wins): Assign agents to their top preference if available, then second preference, etc. This mechanism is not truthful. Agent Alice, knowing that the 9 AM–5 PM shift is oversubscribed, might strategically list the less popular 10 AM–6 PM as her first choice to guarantee getting it, even though she truly prefers 9–5.
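Truthful mechanism (Deferred Acceptance): Use the agent-proposing Gale-Shapley algorithm. Agents rank shifts; shifts have capacity limits and a priority ordering over agents (for example, seniority). Each unassigned agent proposes to their highest-ranked shift that has not yet rejected them; each shift tentatively holds its highest-priority proposers up to capacity and rejects the rest. The process repeats until no rejected agent has a shift left to propose to, and the resulting assignment is stable. Under agent-proposing Deferred Acceptance, no agent can improve their outcome by misreporting preferences, so truth-telling is a dominant strategy. This guarantee assumes agents may rank every shift they find acceptable; capping the list (such as a top-5 limit) can reintroduce strategic ranking.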
Practical impact: Systems that reward strategic bidding create an arms race where sophisticated agents get better outcomes than honest ones. Truthful mechanisms eliminate this advantage, improving both fairness and data quality (reported preferences reflect actual preferences, enabling better schedule design).
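The Deferred Acceptance procedure can be sketched as follows. This is an illustrative implementation, not a production scheduler: the agent names, shift labels, capacities, and priority order are hypothetical, and every shift is assumed to share one priority list.

```python
def deferred_acceptance(agent_prefs, shift_capacity, priority):
    """Agent-proposing deferred acceptance with shift capacities.

    agent_prefs: dict agent -> list of shifts, most preferred first
    shift_capacity: dict shift -> number of seats
    priority: list of agents, highest priority (e.g. most senior) first
    Returns dict agent -> assigned shift (None if the agent's list runs out).
    """
    rank = {agent: pos for pos, agent in enumerate(priority)}
    next_choice = {agent: 0 for agent in agent_prefs}  # next shift to propose to
    held = {shift: [] for shift in shift_capacity}     # tentative acceptances
    free = list(agent_prefs)
    while free:
        agent = free.pop()
        prefs = agent_prefs[agent]
        if next_choice[agent] >= len(prefs):
            continue  # list exhausted: agent stays unassigned
        shift = prefs[next_choice[agent]]
        next_choice[agent] += 1
        held[shift].append(agent)
        held[shift].sort(key=rank.get)  # highest priority first
        if len(held[shift]) > shift_capacity[shift]:
            free.append(held[shift].pop())  # reject the lowest-priority holder
    assignment = {agent: None for agent in agent_prefs}
    for shift, agents in held.items():
        for agent in agents:
            assignment[agent] = shift
    return assignment

capacity = {"9-5": 1, "10-6": 1, "12-8": 1}
seniority = ["Bob", "Alice", "Cara"]
truthful_prefs = {
    "Alice": ["9-5", "10-6", "12-8"],
    "Bob": ["9-5", "12-8", "10-6"],
    "Cara": ["10-6", "9-5", "12-8"],
}
truthful = deferred_acceptance(truthful_prefs, capacity, seniority)

# Alice tries the "safe first choice" manipulation and lists 10-6 first.
strategic_prefs = dict(truthful_prefs, Alice=["10-6", "9-5", "12-8"])
strategic = deferred_acceptance(strategic_prefs, capacity, seniority)
print(truthful["Alice"], strategic["Alice"])  # 10-6 10-6
```

In this small instance Alice ends up with the same shift whether or not she misreports, so the manipulation that pays under first-preference-wins buys her nothing here.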
VTO as Mechanism Design Problem
Problem: 20 excess agent-hours exist on Wednesday afternoon. Management wants to offer VTO to reduce labor cost. How should VTO be allocated?
Current practice (first-come-first-served): Post VTO to all eligible agents, first responders get it. This mechanism rewards monitoring speed, not need. Agents who constantly watch the app grab VTO; agents actively handling calls miss the notification.
Mechanism design approach: Ask agents to report their VTO willingness and personal cost of leaving (some agents need the hours for income; others prefer time off). Design the allocation to maximize total surplus:
- Agents report: willingness to take VTO and minimum acceptable compensation (e.g., "I'll take VTO if I still get 2 hours of guaranteed pay" vs. "I'll take unpaid VTO anytime").
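- Mechanism: Allocate VTO to agents with the lowest reported cost (those who genuinely prefer time off) and set each winner's compensation according to VCG principles, so that it depends on other agents' reports rather than their own.
- Incentive compatibility: Under VCG, agents report truthfully because misreporting cannot improve their payoff: a report changes whether an agent wins, not what a winner is paid.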
Practical VTO mechanisms need not implement full VCG. Simpler rules can approximate truthfulness: lottery with opt-in (random selection from willing agents) is more truthful than first-come-first-served because it removes speed as a strategic variable.
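For the special case where VTO slots are identical and each agent wants at most one, the VCG rule reduces to a uniform-price reverse auction: the lowest reported costs win, and each winner is paid the lowest losing report. The sketch below assumes that simplified setting. The agent labels and dollar figures are hypothetical.

```python
def vto_reverse_auction(reported_costs, slots):
    """Allocate identical VTO slots to the lowest reported costs.

    reported_costs: dict agent -> minimum compensation the agent claims to need
    Each winner is paid the lowest losing report, which is the VCG payment
    when every agent wants at most one slot.
    Assumption: there are more bidders than slots, so a losing report exists.
    """
    if len(reported_costs) <= slots:
        raise ValueError("need more bidders than slots to set the price")
    ranked = sorted(reported_costs, key=reported_costs.get)  # stable on ties
    winners = ranked[:slots]
    price = reported_costs[ranked[slots]]
    return winners, price

def utility(agent, true_cost, reported_costs, slots):
    """Agent's payoff: compensation received minus true cost, if they win."""
    winners, price = vto_reverse_auction(reported_costs, slots)
    return price - true_cost if agent in winners else 0

true_costs = {"A": 0, "B": 5, "C": 12, "D": 20}  # hypothetical, in dollars
winners, price = vto_reverse_auction(true_costs, slots=2)
print(winners, price)  # ['A', 'B'] 12

# Agent C overstating willingness (reporting 3 instead of 12) wins but loses money.
print(utility("C", 12, dict(true_costs, C=3), slots=2))  # -7
```

Because a winner's payment is set by someone else's report, shading a report can only change whether the agent wins, and winning at a price below true cost is a loss.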
Goodhart's Law and AHT Targets
Goodhart's Law: When a measure becomes a target, it ceases to be a good measure.
The game: Management sets a 4-minute AHT target. Agents are evaluated (and potentially compensated) based on AHT performance. The intended outcome: efficient call handling. The actual Nash equilibrium:
- Agents rush complex calls to reduce AHT, creating repeat contacts (higher total volume).
- Agents transfer difficult calls rather than resolve them (transfer shifts cost to another agent).
- Agents manipulate after-call work codes to reduce reported handle time.
- Some agents cherry-pick simple calls from the queue if the routing allows.
Game-theoretic analysis: The AHT target creates a game where the agent's payoff function (good evaluation) is misaligned with the organization's payoff function (cost-effective resolution). The Nash equilibrium of the agent game is not the socially optimal outcome.
Mechanism redesign: Replace single-metric targets with composite quality scores that internalize externalities:
- First Contact Resolution (FCR) penalizes agents for generating repeat contacts.
- Customer satisfaction scores capture service quality.
- Transfer rate penalties internalize the cost of transfers.
The mechanism design insight: the performance system is a mechanism. Its rules determine agent behavior as surely as an auction's rules determine bidding behavior.
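A toy comparison makes the redesign concrete. All numbers below (handle times, FCR rates, metric weights) are hypothetical and chosen only to illustrate how the agent's best response shifts when the scoring rule changes.

```python
# Hypothetical per-call outcomes for three handling styles (illustrative only).
ACTIONS = {
    "resolve": {"aht_min": 6.0, "fcr": 0.90, "transferred": 0},
    "rush": {"aht_min": 3.5, "fcr": 0.60, "transferred": 0},
    "transfer": {"aht_min": 2.0, "fcr": 0.50, "transferred": 1},
}

def aht_only_score(outcome, target_min=4.0):
    """Single-metric rule: reward only the margin under the AHT target."""
    return target_min - outcome["aht_min"]

def composite_score(outcome, target_min=4.0, w_aht=1.0, w_fcr=10.0, w_transfer=3.0):
    """Composite rule: AHT margin plus FCR credit minus a transfer penalty.

    The weights are assumptions; in practice they would need calibration.
    """
    return (w_aht * (target_min - outcome["aht_min"])
            + w_fcr * outcome["fcr"]
            - w_transfer * outcome["transferred"])

def best_response(score):
    """The handling style that maximizes the agent's score under a given rule."""
    return max(ACTIONS, key=lambda action: score(ACTIONS[action]))

print(best_response(aht_only_score))   # transfer
print(best_response(composite_score))  # resolve
```

Nothing about the agent changes between the two calls; only the scoring rule does. The weights matter, though: with a small enough FCR weight the composite rule would still favor rushing.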
Schedule Swap Markets as Matching Markets
Problem: Agent A wants Tuesday off. Agent B wants Thursday off. A currently has Thursday; B has Tuesday. A swap benefits both. How should the swap market be designed?
Two-way swap platform: Only pairwise swaps allowed. A offers Tuesday, B offers Thursday, they match. Simple but inefficient — many beneficial exchanges require three or more participants (A gives to B, B gives to C, C gives to A).
Multi-way swap platform (Top Trading Cycles): Each agent points to the agent holding their most-preferred remaining shift (possibly themselves). The resulting directed graph always contains at least one cycle. Execute all swaps in a cycle simultaneously, remove those agents and their shifts, and repeat until no agents remain. With strict preferences, this algorithm is strategy-proof (agents should point to their true top choice) and Pareto-efficient (no further mutually beneficial swaps exist after the algorithm terminates).
Practical constraint: Swaps must maintain coverage. Agent A cannot swap away from a shift that would leave the interval understaffed. This adds a feasibility constraint to the matching market — making it a constrained matching problem, computationally harder but still solvable.
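A minimal sketch of Top Trading Cycles, assuming each agent holds exactly one shift and has strict preferences. It ignores the coverage constraint just described, and the agents and days are hypothetical.

```python
def top_trading_cycles(owner_of, prefs):
    """Top Trading Cycles for a one-shift-per-agent swap market.

    owner_of: dict shift -> agent currently holding it
    prefs: dict agent -> list of shifts, most preferred first
           (each agent's list must include their own shift)
    Returns dict agent -> shift held after all cycles are executed.
    """
    owner_of = dict(owner_of)  # work on a copy
    remaining = set(owner_of.values())
    final = {}

    def target_shift(agent):
        # Favorite shift among those still in the market.
        return next(s for s in prefs[agent] if s in owner_of)

    while remaining:
        # Follow pointers from any agent until one repeats: that closes a cycle.
        path = []
        agent = sorted(remaining)[0]
        while agent not in path:
            path.append(agent)
            agent = owner_of[target_shift(agent)]
        cycle = path[path.index(agent):]
        # Fix every member's new shift first, then remove the cycle from the market.
        for member in cycle:
            final[member] = target_shift(member)
        for member in cycle:
            del owner_of[final[member]]
            remaining.discard(member)
    return final

holdings = {"Tue": "A", "Thu": "B", "Fri": "C", "Mon": "D"}
preferences = {
    "A": ["Thu", "Tue", "Fri", "Mon"],
    "B": ["Fri", "Thu", "Tue", "Mon"],
    "C": ["Tue", "Fri", "Thu", "Mon"],
    "D": ["Thu", "Mon", "Tue", "Fri"],
}
print(top_trading_cycles(holdings, preferences))
# {'A': 'Thu', 'B': 'Fri', 'C': 'Tue', 'D': 'Mon'}
```

A, B, and C complete a three-way exchange that no pairwise platform could execute. D wanted Thursday, but it leaves the market in the first cycle, so D keeps Monday.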
Supervisor-Agent Dynamics as Principal-Agent Problem
Setup: A supervisor monitors 15 agents. Agent effort (engagement, attention quality, adherence to process) is partially unobservable. The supervisor observes outcomes (handle time, quality scores, customer satisfaction) that are noisy signals of effort.
Principal-agent model: The supervisor (principal) designs monitoring intensity, feedback frequency, and escalation rules. The agent chooses an effort level. Higher effort produces better outcomes but is costly to the agent.
Optimal monitoring: Monitor enough that the agent internalizes quality expectations, but not so much that monitoring costs exceed the value of improved performance. In the standard model, an agent who is never monitored has no incentive to exert costly effort, while an agent who is always monitored exerts effort but at a high surveillance cost. Mixed monitoring (random quality checks) is more cost-effective than constant surveillance: agents maintain effort as long as the probability of being checked, multiplied by the consequence of a failed check, exceeds the cost of effort.
This is precisely the logic behind random quality sampling in contact center QA — evaluate a random subset of calls, and agents maintain quality across all calls because any call might be sampled.
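The sampling argument can be written as a one-line incentive constraint. The notation is assumed, not from the original: $c$ is the agent's cost of exerting effort on a call, $F$ is the penalty when a sampled call shows low effort, and $p$ is the probability that a call is sampled. The sketch also assumes a risk-neutral agent and that a sampled low-effort call is always detected.

```latex
\begin{aligned}
\text{payoff(effort)} &= -c \\
\text{payoff(shirk)}  &= -pF \\
\text{effort is a best response} &\iff -c \ge -pF \iff p \ge p^{*} = \frac{c}{F}
\end{aligned}
```

For example, if the penalty is twenty times the per-call cost of effort ($F = 20c$), sampling 5% of calls satisfies the constraint. Raising the stakes of a failed check lowers the sampling rate needed, which is why QA programs can evaluate a small fraction of calls.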
Worked Example: VTO Auction
Setup: Wednesday 2 PM–5 PM has 8 excess agents (3 hours × 8 agents = 24 excess agent-hours). Management wants to offer VTO to reduce cost. Agent hourly rates vary:
| Agent | Hourly Rate | VTO Willingness | True Value of Time Off ($/hr) |
|---|---|---|---|
| Agent 1 | $18/hr | Very willing | $22 (values time off more than pay) |
| Agent 2 | $20/hr | Willing | $19 (roughly indifferent) |
| Agent 3 | $22/hr | Unwilling | $12 (needs the pay) |
| Agent 4 | $18/hr | Very willing | $25 (strongly prefers time off) |
| Agent 5 | $20/hr | Willing | $18 (slightly prefers time off) |
| Agent 6 | $22/hr | Unwilling | $10 (needs the pay) |
| Agent 7 | $18/hr | Willing | $16 (mild preference for time off) |
| Agent 8 | $20/hr | Very willing | $24 (strongly prefers time off) |
Allocation rule (maximize total surplus): For each hour of VTO, the agent gains their personal value of time off and the company saves the agent's wage. Total surplus per hour = value of time off + wage saved by the company, or $v_i + w_i$ for agent $i$.
Rank by total surplus (per hour):
- Agent 8: $24 + $20 = $44
- Agent 4: $25 + $18 = $43
- Agent 1: $22 + $18 = $40
- Agent 2: $19 + $20 = $39
- Agent 5: $18 + $20 = $38
- Agent 7: $16 + $18 = $34
- Agent 3: $12 + $22 = $34
- Agent 6: $10 + $22 = $32
All 8 excess agents have positive total surplus, so the ranking alone would send all 8 home. The ranking ignores what the agent gives up, however: an agent on unpaid VTO forgoes their wage, so their private net gain is $v_i - w_i$. Agents 2, 5, and 7 fall slightly below their rate (by 1 to 2 dollars per hour) but report themselves willing. Agents 3 and 6 value their pay far more than time off ($v_i < w_i$, a gap of 10 and 12 dollars per hour) and report themselves unwilling. Giving them VTO destroys value for them.
Efficient VTO allocation: Offer VTO to Agents 1, 4, and 8 (very willing), then Agents 2 and 5 (willing). That is 5 agents × 3 hours = 15 hours against a target of 24. Adding Agent 7 (willing, close to indifferent) brings the total to 18 hours, still short. Agents 3 and 6 would lose value from VTO, so do not force it on them.
Result: 6 agents take VTO (18 hours), 2 excess agents remain on the clock. The mechanism correctly identifies that forcing VTO on unwilling agents is value-destroying even though it reduces labor cost.
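The arithmetic of the worked example can be reproduced directly from the table. The sketch below uses the table's figures; the tuple layout and the rule "offer VTO to every agent not marked Unwilling" are a simplified reading of the example, not a full mechanism.

```python
# (name, hourly_rate, stated_willingness, value_of_time_off_per_hour), from the table
AGENTS = [
    ("Agent 1", 18, "Very willing", 22),
    ("Agent 2", 20, "Willing", 19),
    ("Agent 3", 22, "Unwilling", 12),
    ("Agent 4", 18, "Very willing", 25),
    ("Agent 5", 20, "Willing", 18),
    ("Agent 6", 22, "Unwilling", 10),
    ("Agent 7", 18, "Willing", 16),
    ("Agent 8", 20, "Very willing", 24),
]
HOURS_PER_AGENT = 3  # Wednesday 2 PM to 5 PM

def total_surplus(agent):
    """Per-hour surplus as defined in the example: time-off value + wage saved."""
    _, rate, _, value = agent
    return value + rate

def private_net_gain(agent):
    """What the agent personally gains per hour of unpaid VTO."""
    _, rate, _, value = agent
    return value - rate

ranked = sorted(AGENTS, key=total_surplus, reverse=True)
takers = [agent for agent in ranked if agent[2] != "Unwilling"]
vto_hours = len(takers) * HOURS_PER_AGENT
still_excess = len(AGENTS) - len(takers)

for agent in ranked:
    print(agent[0], total_surplus(agent), private_net_gain(agent))
print(vto_hours, still_excess)  # 18 2
```

Printing the private net gain alongside total surplus shows why the two unwilling agents are excluded: their surplus is positive only because of the company's savings, while their own net gain is strongly negative.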
Behavioral Game Theory: Bounded Rationality
Classical game theory assumes perfect rationality. Real agents have bounded rationality:
- Loss aversion: Agents weight schedule losses (losing a preferred day off) more heavily than equivalent gains. Schedule change framing matters.
- Status quo bias: Agents resist changes to established schedules even when the new schedule is objectively better.
- Fairness preferences: Agents reject offers they perceive as unfair even when accepting would improve their payoff. Schedule systems perceived as unfair generate grievances regardless of mathematical optimality.
- Myopic behavior: Agents optimize for the current bidding round rather than the long-term game. Mechanisms that reward long-term cooperation (repeated game effects, reputation systems) can counteract myopia.
WFM mechanism designers must account for these behavioral realities. A theoretically optimal mechanism that assumes perfect rationality will fail in practice if agents don't behave as rational maximizers.
Maturity Model Position
Game theory and mechanism design map to the WFM Labs Maturity Model:
- Level 1 (Reactive): No recognition of strategic behavior. First-come-first-served for VTO. Simple seniority for scheduling. No analysis of incentive effects.
- Level 2 (Established): Recognition that AHT targets create gaming behavior. Basic fairness rules in schedule assignments.
- Level 3 (Advanced): Shift bidding systems with preference collection. Awareness of mechanism design principles. Composite performance metrics that reduce gaming.
- Level 4 (Optimized): Formal mechanism design for VTO/overtime markets. Multi-way swap platforms. Incentive-compatible preference elicitation. A/B testing of mechanism variations.
- Level 5 (Autonomous): Dynamic market mechanisms that adapt rules based on observed behavior. Automated detection of gaming. Mechanism optimization through reinforcement learning.
See Also
- Operations Research in Workforce Management
- Schedule Optimization
- Self-Scheduling and Flexible Workforce Models
- Multi-Objective Optimization
- Performance Management
- Quality Management
- WFM Labs Maturity Model
References
- Nisan, N. et al. (eds.) Algorithmic Game Theory. Cambridge University Press, 2007. — Comprehensive treatment of mechanism design, auctions, and computational game theory.
- Roth, A.E. and Sotomayor, M.A.O. Two-Sided Matching: A Study in Game-Theoretic Modeling and Analysis. Cambridge University Press, 1990. — Foundation for matching markets.
- Shapley, L. and Shubik, M. "The Assignment Game I: The Core." International Journal of Game Theory 1, 1972. — Assignment markets.
- Myerson, R.B. "Optimal Auction Design." Mathematics of Operations Research 6(1), 1981. — Revenue-optimal mechanisms.
- Koole, G. Call Center Optimization. MG Books, 2013. — Agent routing as a resource allocation game.
- Goodhart, C.A.E. "Problems of Monetary Management: The U.K. Experience." In Monetary Theory and Practice. Macmillan, 1984. — Origin of Goodhart's Law.
