WFM Data Governance and Quality

From WFM Labs

WFM Data Governance and Quality addresses the organizational and technical practices that ensure WFM systems operate on trustworthy data. Every WFM output — forecasts, schedules, adherence scores, staffing recommendations — is only as reliable as its input data. This page provides the framework for governing that data across the WFM ecosystem.

Overview

WFM is uniquely sensitive to data quality problems because its outputs are operationally binding. A forecast error becomes overstaffing or understaffing the next day. A schedule built on phantom agents creates real coverage gaps. An adherence score calculated from misaligned timestamps falsely flags compliant agents.

Unlike BI dashboards where a data error produces a wrong number on a screen, WFM data errors produce wrong decisions that affect labor costs, service levels, and employee experience. This makes data governance not a compliance exercise but an operational necessity.

Data Quality Dimensions for WFM

Completeness

Every expected data point must exist. Missing data in WFM creates silent failures — the system produces results that look valid but are built on partial information.

Data Element | Completeness Check | Impact of Missing Data
Interval volume | 96 intervals per day per queue (15-min grain) | Missing intervals appear as zero volume → forecast underestimates demand
Agent schedule | Every active agent has a schedule for each scheduled day | Unscheduled agents appear as unstaffed → coverage calculations undercount supply
Contact records | CDR count matches ACD contact counter | Missing contacts → understated volume → understated staffing
Agent skills | Every agent has at least one skill assigned | Skill-less agents are unroutable → phantom staffing
AHT components | Talk + hold + ACW = total handle time | Missing ACW → understated AHT → understated staffing

Detection pattern: Run daily completeness checks that compare expected record counts against actual counts. For intervals: expected = queues × 96 × days. For agents: expected = active_agents from HRIS. Alert on any deficit > 1%.
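The daily check described above can be sketched as follows; the function and its threshold default are illustrative, not a specific vendor API:

```python
INTERVALS_PER_DAY = 96  # 15-minute grain: 24 hours x 4 intervals


def completeness_deficit(actual_counts: dict[str, int], queues: list[str],
                         days: int, threshold: float = 0.01) -> list[str]:
    """Return alert strings for queues whose interval-record count
    falls short of the expected count by more than `threshold` (1% default)."""
    expected = INTERVALS_PER_DAY * days
    alerts = []
    for q in queues:
        actual = actual_counts.get(q, 0)
        deficit = (expected - actual) / expected
        if deficit > threshold:
            alerts.append(f"{q}: {actual}/{expected} intervals ({deficit:.1%} missing)")
    return alerts


# One week of data: expected = 96 x 7 = 672 intervals per queue.
# "billing" is missing 32 intervals (4.8%), "tech" is complete.
alerts = completeness_deficit({"billing": 640, "tech": 672},
                              ["billing", "tech"], days=7)
```

The same pattern generalizes to the agent-schedule check: expected count comes from HRIS active headcount rather than `queues × 96 × days`.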

Accuracy

Data values must reflect reality.

Data Element | Accuracy Risk | Verification Method
Handle time | ACW timer left running (agent forgot to close) → inflated AHT | Cap at P99; flag contacts > 2× median AHT for review
Contact volume | Transfers double-counted as separate contacts | Reconcile with unique ANI/session count; compare ACD volume report vs CDR count
Agent state duration | Clock drift between ACD and WFM servers | NTP sync validation; compare state duration totals against shift length
Forecast vs actuals | Forecast generated with wrong timezone → shifted by hours | Compare forecast peak-hour to actual peak-hour; shift should be < 1 interval
Shrinkage rates | Planned shrinkage includes categories that aren't actually shrinkage | Audit shrinkage categories quarterly; validate against actual non-productive time

Timeliness

Data must arrive fast enough for its intended use.

Data Feed | Required Freshness | Common Failure
Agent state (adherence) | < 2 seconds | ACD event API down; webhook queue backing up; network latency
Queue metrics (wallboard) | < 10 seconds | Polling interval too long; API rate-limited
Contact records (intraday) | < 5 minutes | CDR generation delayed; batch job stuck
HRIS updates (scheduling) | < 24 hours | HRIS sync job failed silently; delta detection broken
Forecast refresh | < 15 minutes for intraday | Reforecast engine crashed; model training timed out

Monitoring pattern: Track data freshness for each feed. Maintain a last_successful_sync timestamp per integration. Alert when freshness exceeds the required threshold by 2×.

Consistency

The same entity must be represented the same way across all systems.

The double-count problem: A transferred contact appears as two contacts in the ACD CDR but as one contact in the CRM. If WFM uses ACD volume and CRM resolution rate, the metrics are inconsistent — 200 ACD contacts with 150 CRM resolutions looks like 75% resolution, but 50 of those ACD contacts are transfers, so the real resolution rate is 100%.

The identity problem: Agent "Jane Smith" is agent_id "a-4821" in the ACD, employee_id "E-10042" in HRIS, and user "jsmith@company.com" in the QA system. If the mapping table has a stale entry, Jane's QA scores don't appear on her WFM profile, and her schedule adherence can't be reconciled with her HRIS contracted hours.

The timezone problem: The ACD stores timestamps in UTC. The WFM system converts to site-local time. The HRIS uses corporate HQ timezone. During DST transitions, a 2:30 AM agent state event could map to two different intervals depending on which timezone is applied first.

Common Data Quality Issues

Timezone Mismatches

The most pervasive data quality issue in multi-site WFM operations.

Scenario: Site A is in Eastern Time (UTC-5). Site B is in Pacific Time (UTC-8). The ACD reports all timestamps in UTC. The WFM system's "9:00 AM" interval must map to different UTC ranges for each site.

Failure modes:

  • Forecast trained on UTC intervals without site-local alignment → peak hour is wrong
  • Schedule published in site-local time but adherence evaluated against raw UTC timestamps → every agent appears hours out of adherence (5 hours at Site A, 8 at Site B)
  • DST transition: 2:00 AM event on "spring forward" day doesn't exist in local time; "fall back" creates duplicate 1:00–2:00 AM intervals

Prevention:

  • Store all timestamps in UTC with IANA timezone metadata
  • Convert to local time only at the presentation layer
  • Pre-compute interval boundaries in UTC for each site, accounting for DST
  • Test DST transitions explicitly in your integration test suite — twice per year
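The store-UTC, convert-at-presentation rule can be sketched with Python's standard `zoneinfo` (the function name and interval-label format are illustrative):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_interval(utc_ts: datetime, site_tz: str, grain_min: int = 15) -> str:
    """Map a stored UTC timestamp to its site-local 15-minute interval label.

    Conversion happens only here, at the presentation layer; storage stays UTC,
    so DST transitions cannot corrupt the underlying data.
    """
    local = utc_ts.astimezone(ZoneInfo(site_tz))
    minute = (local.minute // grain_min) * grain_min
    return local.replace(minute=minute, second=0, microsecond=0).strftime(
        "%Y-%m-%d %H:%M %Z")


# Spring-forward day 2024-03-10 in US Eastern: 2:30 AM local does not exist.
# 07:30 UTC lands cleanly in the 03:30 EDT interval after the jump.
ts = datetime(2024, 3, 10, 7, 30, tzinfo=timezone.utc)
print(local_interval(ts, "America/New_York"))  # → 2024-03-10 03:30 EDT
```

Because the ambiguous or nonexistent local hour is never used as a storage key, the spring-forward gap and fall-back duplicate resolve deterministically.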

Double-Counted Transfers

Scenario: A customer calls billing (Queue A). The billing agent transfers to tech support (Queue B). The ACD generates two CDRs: one for Queue A (talk time = 45 seconds, transfer = true) and one for Queue B (talk time = 320 seconds, disposition = resolved).

Impact on WFM:

  • Volume inflated by 100% for the transferred contact (counted in both queues)
  • AHT for Queue A is understated (45 seconds of transfer-out vs a typical 300-second handle)
  • Staffing for Queue A is overstated (the inflated volume and deflated AHT partially cancel, but not exactly)

Remediation:

  • For volume: Count arrivals to Queue A. Separately count arrivals to Queue B. Do not sum them for "total contacts" without deduplicating transfers. Use the transfer correlation ID or session ID to identify linked contacts.
  • For AHT: Decide on a convention. Option 1: attribute full handle time to the originating queue (customer's perspective). Option 2: attribute handle time to each queue separately (agent's perspective). Document the convention and apply it consistently.
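The volume deduplication can be sketched as a set count over a session identifier; the `session_id` field name is illustrative and stands in for whatever transfer correlation ID your ACD emits:

```python
def dedup_volume(cdrs: list[dict]) -> int:
    """Count unique customer contacts by collapsing transfer legs
    that share the same correlation/session ID."""
    return len({cdr["session_id"] for cdr in cdrs})


# The billing-to-tech-support transfer from the scenario above:
cdrs = [
    {"session_id": "s1", "queue": "billing", "transfer": True},   # leg 1
    {"session_id": "s1", "queue": "tech", "transfer": False},     # leg 2, same contact
    {"session_id": "s2", "queue": "billing", "transfer": False},  # separate contact
]
# Raw CDR count is 3; deduplicated contact count is 2.
```

Per-queue arrival counts can still use the raw legs; only cross-queue totals need this deduplication.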

Missing After-Call Work

Scenario: Agents bypass the ACW state by going directly to "available" after a call. The ACD records zero ACW time.

Impact: AHT is understated by 30–60 seconds per contact (typical ACW duration). Over a day, this understatement reduces the Erlang-calculated staffing requirement by 5–10%, creating chronic understaffing.

Detection: Compare the percentage of contacts with zero ACW against a baseline. If historically 5% of contacts have zero ACW and the rate suddenly jumps to 30%, either the agents' desktop configuration changed or agents have found a workaround.

Prevention: Configure the ACD to enforce minimum ACW duration (even 5 seconds) so every contact generates an ACW record. If the business allows immediate availability, accept the AHT measurement gap and add a fixed ACW offset to forecasting inputs.
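The baseline comparison above reduces to a simple threshold check; the baseline rate and tolerance multiplier here are illustrative defaults, not prescribed values:

```python
def acw_drift(zero_acw_contacts: int, total_contacts: int,
              baseline_rate: float = 0.05, tolerance: float = 2.0) -> bool:
    """Flag when today's zero-ACW rate exceeds `tolerance` times the
    historical baseline rate, indicating a config change or workaround."""
    if total_contacts == 0:
        return False
    return (zero_acw_contacts / total_contacts) > tolerance * baseline_rate
```

With the defaults, the check fires once more than 10% of contacts show zero ACW against a 5% historical baseline.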

Phantom Agents

Scenario: Agent terminated in HRIS on Friday. WFM sync runs Monday. Agent has published schedules for the next two weeks. Schedule shows adequate coverage, but a phantom shift covers nobody.

Impact: Understaffing on the day of execution. The severity scales linearly — 5 phantom agents in a 100-agent center means 5% understaffing during their scheduled shifts.

Prevention:

  • Run HRIS-to-WFM reconciliation daily, not weekly
  • Flag any agent scheduled for future dates who is not "active" in HRIS
  • Automate: termination event triggers immediate schedule unpublish for the agent

Data Lineage

Tracing a Forecast

Every forecast number should be traceable back to its source data:

Staffing requirement (45 agents for Queue A, 10:00-10:15 Monday)
  ← Erlang C calculation
    ← Inputs: forecast volume (142 contacts), forecast AHT (295 sec),
       SL target (80/20), shrinkage (32%)
      ← Forecast volume (142)
        ← Model: ARIMA(2,1,1) trained on 52 weeks of historical volume
          ← Historical volume source: ACD CDRs, 15-min aggregation
            ← Data quality: 99.8% interval completeness,
               transfers deduplicated, outliers capped at P99
      ← Forecast AHT (295 sec)
        ← Model: 8-week weighted moving average
          ← Historical AHT source: ACD CDRs, ACW included
      ← Shrinkage (32%)
        ← Components: breaks (8%), training (4%), coaching (3%),
           meetings (2%), PTO (6%), absenteeism (5%), other (4%)
          ← Source: 12-week rolling actual shrinkage from adherence data

Why lineage matters: When the Monday 10:00 AM interval is understaffed by 8 agents, you need to determine if the forecast was wrong (volume was 180, not 142), the AHT was wrong (actual was 340 seconds, not 295), the shrinkage was wrong (38%, not 32%), or the schedule was wrong (only 38 agents scheduled against a 45-agent requirement). Without lineage, you debug by guessing.

Lineage Implementation

  • Tag every forecast with a version ID and the source data date range
  • Store the model parameters used for each forecast version
  • Log shrinkage component values at the time the staffing calculation ran
  • Maintain an immutable audit log of forecast → staffing → schedule transformations
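One way to sketch the version-tagging idea is an immutable record attached to every published forecast; all field names here are illustrative, not a specific product schema:

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class ForecastVersion:
    """Immutable lineage record stored alongside each published forecast."""
    version_id: str
    queue: str
    model: str                  # e.g. "ARIMA(2,1,1)"
    model_params: dict          # parameters used for this version
    source_start: date          # historical range the model trained on
    source_end: date
    shrinkage_components: dict  # component values at calculation time
    created_at: datetime


fv = ForecastVersion(
    version_id="fc-2024-06-03-a",
    queue="billing",
    model="ARIMA(2,1,1)",
    model_params={"order": (2, 1, 1), "weeks_history": 52},
    source_start=date(2023, 6, 5),
    source_end=date(2024, 6, 2),
    shrinkage_components={"breaks": 0.08, "pto": 0.06},
    created_at=datetime(2024, 6, 3, 4, 0),
)
```

The `frozen=True` flag makes the record read-only after creation, which is what "immutable audit log" demands: debugging Monday's miss means reading this record, never editing it.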

Data Ownership and RACI

Data Domain | Responsible | Accountable | Consulted | Informed
ACD contact data | IT / Telecom | VP Technology | WFM, QA | BI, Finance
Agent master data | HRIS / HR Ops | CHRO | WFM, Payroll, IT | All
Schedule data | WFM team | WFM Director | Operations, HR | Agents, Payroll
Forecast data | WFM Forecasting | WFM Director | Operations, Finance | Scheduling, RT
Adherence data | WFM Real-Time | WFM Director | Operations | HR, QA
Quality scores | QA team | QA Director | WFM, Training | Operations
Shrinkage inputs | WFM Planning | WFM Director | HR, Training, Operations | Finance

Key ownership disputes and resolution:

  • Who owns AHT? The ACD produces it, WFM consumes it, QA influences it through coaching. Resolution: IT owns the data pipeline; WFM owns the metric definition (what's included/excluded); QA owns improvement initiatives.
  • Who owns the agent skill profile? HRIS tracks certifications, training tracks completions, WFM maps skills to queues, the ACD uses skills for routing. Resolution: HRIS is the system of record for skill certifications. WFM translates certifications to routing skills. Changes flow HRIS → WFM → ACD.
  • Who owns schedule changes? WFM publishes the schedule. Supervisors swap shifts. Agents trade shifts. The automation platform modifies activities. Resolution: WFM owns the schedule of record. All modifications flow through the WFM system API, with the modification source tagged for audit.

Privacy and Compliance

GDPR and Workforce Data

WFM systems process personal data extensively. Under GDPR (and similar regimes such as the CCPA, LGPD, and PIPA):

  • Data minimization: Collect only what's needed for WFM operations. Agent demographic data beyond what's required for scheduling (e.g., ethnicity, religion) should not be in the WFM system unless required for accommodation scheduling.
  • Purpose limitation: Data collected for scheduling cannot be repurposed for performance management without separate legal basis. Adherence data showing a bathroom break pattern is scheduling data, not performance data.
  • Right to erasure: When an agent leaves and requests deletion, their personal data must be removable from WFM systems. Practical approach: pseudonymize rather than delete — replace agent name with "Agent-XXXX" in historical records to preserve aggregate accuracy while removing PII.
  • Data retention limits: Keeping 10 years of detailed agent adherence data is hard to justify under data minimization. Define retention periods aligned with legitimate business need. See retention policies below.

PCI-DSS for Contact Centers

Contact centers handling payment data must comply with PCI-DSS. WFM implications:

  • WFM systems should never store payment card data; verify this during PCI scoping rather than assuming it
  • Screen recording paused during payment capture — this affects adherence if the "recording paused" state isn't mapped to a valid schedule activity
  • Agent desktop monitoring tools integrated with WFM must not capture card data from screen shares

Agent Monitoring Disclosure

Most jurisdictions require disclosure that agents are being monitored. WFM adherence monitoring qualifies:

  • Disclose adherence monitoring in employment agreements
  • In two-party consent jurisdictions, ensure agents acknowledge real-time state tracking
  • Union environments may have additional constraints — see Union Environments and WFM

Data Retention Policies

Data Category | Operational (Hot) | Analytical (Warm) | Compliance (Cold) | Justification
Contact detail records | 90 days | 2 years (aggregated to interval) | 7 years for regulated industries | Tax, audit, dispute resolution
Agent state events | 30 days | 1 year (daily adherence summaries) | Not retained | No regulatory requirement for second-level state data
Schedules (published) | 90 days + 8 weeks forward | 2 years | 7 years (shift records for labor law) | Wage-and-hour dispute evidence
Forecast versions | Current + 90 days of history | 2 years (for accuracy trending) | Not retained | No regulatory requirement for forecast history
QA evaluations | 90 days | 2 years | Per regulatory requirement | May be needed for dispute resolution
Agent PII | While employed + 90 days | Pseudonymized in historical records | Per jurisdiction | GDPR: minimum necessary; US: varies by state

Aggregation schedule:

Daily job:
  - Raw contact records > 90 days → aggregate to interval grain → archive raw
  - Raw agent state events > 30 days → aggregate to daily adherence summary → delete raw

Weekly job:
  - 15-minute interval data > 2 years → aggregate to daily grain → archive 15-min
  - Daily interval data > 5 years → aggregate to weekly grain → archive daily

Monthly job:
  - Audit: verify aggregation completeness (no gaps)
  - Report: storage consumption by data category
  - Compliance: flag any data past retention policy for review
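The selection step at the heart of the daily job can be sketched as a cutoff filter; the record schema (a `day` date field per raw record) is illustrative:

```python
from datetime import date, timedelta


def records_past_retention(records: list[dict], hot_days: int,
                           today: date) -> list[dict]:
    """Select raw records older than the hot-retention window.

    These are the rows the daily job aggregates to coarser grain
    and then archives or deletes, per the retention table above.
    """
    cutoff = today - timedelta(days=hot_days)
    return [r for r in records if r["day"] < cutoff]


# With a 90-day hot window ending 2024-06-01, the cutoff is 2024-03-03:
records = [
    {"day": date(2024, 1, 15), "queue": "billing", "volume": 142},  # past cutoff
    {"day": date(2024, 5, 1), "queue": "billing", "volume": 150},   # still hot
]
stale = records_past_retention(records, hot_days=90, today=date(2024, 6, 1))
```

The monthly audit job then verifies the complementary property: every record past the cutoff either has an aggregate row or an archive entry, with no gaps.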

Master Data Management

Agent Master

The agent master record is the single source of truth for who an agent is and what they can do:

Attribute | Source System | Sync Frequency | Conflict Resolution
Name, employee ID | HRIS | Daily | HRIS always wins
Team, site assignment | HRIS | Daily | HRIS always wins; WFM overrides for same-day moves logged as exceptions
Skills / proficiency | Training (certification) → WFM (routing skill mapping) | On certification event | Training certifies; WFM maps to routing skill; ACD consumes
Schedule group | WFM | Real-time | WFM owns
ACD agent ID | ACD provisioning | On hire/system change | IT provisions; mapping stored in WFM
Contact information | HRIS | Daily | HRIS always wins

Agent identity reconciliation: Run a daily reconciliation that joins HRIS active employees, ACD provisioned agents, and WFM agent records. Flag:

  • Active in HRIS but not provisioned in ACD (can't receive contacts)
  • Provisioned in ACD but not active in HRIS (potential security issue)
  • Active in HRIS but no WFM agent record (won't be scheduled)
  • WFM agent record but terminated in HRIS (phantom agent)
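The four flag conditions map directly onto set differences, assuming all three systems have first been resolved to a common employee ID (the function and key names are illustrative):

```python
def reconcile_identities(hris_active: set[str], acd_provisioned: set[str],
                         wfm_agents: set[str]) -> dict[str, set[str]]:
    """Daily three-way identity reconciliation across HRIS, ACD, and WFM.

    Keys correspond to the four flag conditions listed above.
    """
    return {
        "not_provisioned_in_acd": hris_active - acd_provisioned,  # can't receive contacts
        "orphaned_acd_account": acd_provisioned - hris_active,    # potential security issue
        "missing_wfm_record": hris_active - wfm_agents,           # won't be scheduled
        "phantom_agent": wfm_agents - hris_active,                # terminated but schedulable
    }


flags = reconcile_identities(
    hris_active={"E-10042", "E-10043", "E-10044"},
    acd_provisioned={"E-10043", "E-10044", "E-10099"},
    wfm_agents={"E-10044", "E-10050"},
)
```

The hard part in practice is the upstream ID mapping (ACD `a-4821` → HRIS `E-10042` → QA `jsmith@company.com`); once that table is trustworthy, the reconciliation itself is trivial.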

Queue Master

Attribute | Source | Notes
Queue ID | ACD | Unique identifier from ACD
Queue name | WFM | Business name may differ from ACD technical name; WFM maintains friendly names
Channel | ACD | Voice, chat, email, messaging
SL target | Business / Operations | Reviewed quarterly; stored in WFM
Concurrency | Business / Operations | Agents handling multiple simultaneous contacts
Skills required | WFM / Routing | Maps queues to agent skills
Forecast group | WFM | Groups queues for forecast modeling

Shift Catalog

Standardized shift definitions that constrain scheduling:

Attribute | Description
Shift ID | Unique identifier
Start time range | Earliest and latest start time (e.g., 7:00–9:00 AM)
Duration | Shift length in hours (e.g., 8.0, 10.0)
Break rules | Break count, duration, earliest/latest placement
Lunch rules | Duration, earliest/latest placement, paid/unpaid
Eligible schedule groups | Which agent groups can work this shift
Effective dates | When this shift pattern is valid

Common Pitfalls

  • No data quality monitoring. Data quality checks run once during implementation, then never again. Implement automated daily checks with alerting. A data quality dashboard reviewed weekly costs less than one bad forecast.
  • Governance by document only. Data ownership policies in a SharePoint document nobody reads. Embed governance in the system: automated reconciliation, enforced naming conventions, required metadata fields.
  • Treating HRIS as optional. "We'll maintain agent data directly in WFM." This creates a maintenance burden that scales linearly with headcount and eventually diverges from HRIS reality.
  • No retention policy enforcement. Policy says "retain 2 years." Database has 7 years of detail data because nobody implemented the cleanup job. Storage costs grow; query performance degrades; GDPR risk accumulates.
  • Ignoring the shrinkage data problem. Shrinkage is the hardest WFM input to measure accurately because it's composed of many small categories (breaks, training, coaching, meetings, PTO, absenteeism), each with its own data source and accuracy issues. Garbage shrinkage data → garbage staffing calculations → chronic over/understaffing.

Maturity Model Position

Level Characteristics
Foundational No formal data governance. Data quality issues discovered when reports look wrong. No retention policy. Agent data maintained in multiple systems with manual reconciliation. No data lineage.
Progressive Data ownership RACI documented. Automated daily quality checks on key metrics (completeness, volume reconciliation). Retention policy defined and partially enforced. Agent master managed in HRIS with automated sync to WFM.
Advanced Full data lineage from source to forecast to staffing decision. Automated reconciliation across all integrated systems with exception alerting. Retention policy enforced by automated jobs. Privacy impact assessment completed for WFM data. Data quality SLAs defined and monitored. Master data management with conflict resolution rules.


References

  • DAMA International. (2017). DAMA-DMBOK: Data Management Body of Knowledge. 2nd ed. Technics Publications. — Comprehensive data governance framework.
  • Redman, T. C. (2008). Data Driven: Profiting from Your Most Important Business Asset. Harvard Business Press. — Business case for data quality.
  • European Parliament. (2016). "General Data Protection Regulation (GDPR)." Regulation (EU) 2016/679.
  • PCI Security Standards Council. (2022). "PCI DSS v4.0." https://www.pcisecuritystandards.org/
  • Cleveland, B. (2012). Call Center Management on Fast Forward. ICMI Press. — Shrinkage measurement and forecasting accuracy.
  • Loshin, D. (2010). Master Data Management. Morgan Kaufmann. — MDM patterns and conflict resolution.