External Regressors in WFM Forecasting

From WFM Labs

External regressors in WFM forecasting are variables from outside a demand series that are used to improve its forecast. A pure time-series model uses only the history of the series itself (past call volumes to predict future call volumes). An external regressor is any outside variable (marketing activity, weather, economic data, technology events) that carries information about future demand beyond what the series history contains.

The core question is not "can we add external data?" but "does adding it measurably improve forecast accuracy?" External regressors add model complexity, data dependencies, and maintenance burden. They must earn their place by improving accuracy on held-out data.

Categories of external regressors

Marketing and commercial events

Marketing drives demand spikes. The forecaster needs to capture both the existence of an event and its expected magnitude.

Common marketing regressors:

  • Direct mail / email campaigns — volume lift correlated with send volume and response rate. The lag between send and call arrival is typically 1–3 days for email, 3–7 days for physical mail.
  • TV and radio advertising — immediate spike within hours of airtime. Magnitude depends on reach and call-to-action strength.
  • Product launches — sustained lift over days to weeks. Often includes an inquiry spike (pre-launch) and a support spike (post-launch).
  • Promotional offers — discount campaigns, free trials, renewal offers. The offer terms predict the volume shape: a deadline-driven promotion produces a hockey-stick curve.
  • Website/app changes — a new self-service feature may reduce call volume; a confusing UX change may increase it.

The practical challenge: marketing teams often finalize campaigns late, giving WFM short notice. Building a structured handoff process, in which marketing provides a campaign calendar with estimated reach and response rate, is as important as the statistical modeling.[1]
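The send-to-arrival lags listed above can be turned into a regressor by spreading each send over the days on which contacts are expected to arrive. The following is a minimal sketch, not a prescribed method: the function name `lagged_campaign_feature`, the send volume, and the lag weights are all hypothetical and would need to be calibrated against an organization's own campaign history.

```python
from datetime import date, timedelta


def lagged_campaign_feature(sends, lag_weights):
    """Spread each day's send volume over later arrival days.

    sends: dict mapping send date -> pieces sent that day
    lag_weights: dict mapping lag in days -> assumed share of the
        response that arrives at that lag
    Returns a dict mapping arrival date -> lag-weighted send volume.
    """
    feature = {}
    for send_day, volume in sends.items():
        for lag, weight in lag_weights.items():
            arrival = send_day + timedelta(days=lag)
            feature[arrival] = feature.get(arrival, 0.0) + volume * weight
    return feature


# Hypothetical email campaign: 10,000 sends on Monday 2024-03-04, with an
# assumed response spread over lags of 1 to 3 days (illustrative weights).
email_sends = {date(2024, 3, 4): 10_000}
email_lags = {1: 0.5, 2: 0.3, 3: 0.2}
campaign_pressure = lagged_campaign_feature(email_sends, email_lags)
```

The resulting `campaign_pressure` series can be used as a continuous regressor in place of a plain 0/1 campaign flag, so that larger sends imply larger expected lift.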

Weather

Weather affects contact volumes through multiple mechanisms:

  • Utility companies — temperature extremes drive billing inquiries, outage reports, and service requests. Heating degree days (HDD) and cooling degree days (CDD) are standard regressors.
  • Insurance — storms, floods, and severe weather events produce claim spikes. The lag between event and call is typically 24–72 hours.
  • Travel and hospitality — weather at destination drives booking and rebooking activity. Severe weather produces cancellation spikes.
  • Retail — extreme weather (heat waves, snowstorms) shifts purchasing patterns and associated customer service contacts.
  • Telecommunications — infrastructure damage from storms drives technical support volume.

Weather data is readily available from public sources (NOAA, national meteorological services) and commercial providers (The Weather Company, AccuWeather). The practical question is granularity: station-level daily temperature from NOAA is free, while hourly hyperlocal forecasts from commercial providers are paid services. For most WFM applications, daily regional weather from public sources is sufficient.[2]
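Heating and cooling degree days are simple transformations of daily mean temperature. The sketch below assumes temperatures in Fahrenheit and the conventional US base of 65°F; the function name `degree_days` is illustrative, and other regions or utilities may use a different base temperature.

```python
def degree_days(mean_temp_f, base_f=65.0):
    """Return (HDD, CDD) for one day given its mean temperature in °F.

    HDD counts degrees below the base (heating demand),
    CDD counts degrees above the base (cooling demand).
    """
    hdd = max(0.0, base_f - mean_temp_f)
    cdd = max(0.0, mean_temp_f - base_f)
    return hdd, cdd


# Hypothetical daily mean temperatures for a cold day and a hot day.
cold_day = degree_days(30.0)
hot_day = degree_days(90.0)
```

Using two one-sided regressors rather than raw temperature lets a linear model represent the fact that both very cold and very hot days raise volume.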

Technology events

  • App releases and updates — a new version rollout produces a support spike. The spike magnitude correlates with the scope of changes and the install base.
  • System outages — planned maintenance windows produce predictable (and avoidable) volume; unplanned outages produce acute spikes followed by a clearing queue.
  • Platform migrations — moving users to a new system produces sustained elevated support volume for weeks to months.
  • Feature deprecation — removing a feature users relied on drives confusion-based contacts.

Technology event data comes from internal sources (release calendars, incident management systems). The forecaster should build a structured feed from IT/engineering that flags upcoming releases and their expected user impact.

Economic indicators

Macroeconomic variables influence contact volumes on longer timescales:

  • Unemployment rate — drives financial services inquiries (hardship programs, payment arrangements, account closures) and government services volume.
  • Consumer confidence index — leading indicator for discretionary purchasing and associated service contacts.
  • Interest rates — affect mortgage, lending, and insurance inquiry volumes.
  • Inflation — drives billing dispute and price complaint volumes.
  • GDP growth — broad demand driver for B2B service organizations.

Economic regressors are most useful for medium-to-long-range forecasts (monthly, quarterly) rather than daily or intraday forecasts. Their low release frequency (most are published monthly or quarterly) limits their value for short-horizon predictions.[3]

Social media and digital signals

  • Viral complaints — a customer complaint goes viral on social media, producing a surge in "me too" contacts from other customers experiencing the same issue.
  • Trending topics — brand mentions on Twitter/X, Reddit, or TikTok that correlate with inbound contact volume.
  • App store reviews — a spike in negative reviews after an update predicts incoming support contacts.
  • Search trends — Google Trends data for brand + "problem" or brand + "contact" as a leading indicator.

Social media signals are noisy, high-frequency, and difficult to operationalize in a production forecast. They are most valuable as anomaly detectors (something is happening) rather than as calibrated regressors (this many contacts will result). Sentiment analysis tools can flag spikes, but converting a sentiment score into a volume multiplier requires organization-specific calibration.

Incorporating regressors: statistical methods

Regression with dummy variables

The simplest approach: add binary (0/1) dummy variables for known events to a regression model.

$$y_t = \beta_0 + \beta_1 \text{Monday}_t + \cdots + \beta_6 \text{Sunday}_t + \beta_7 \text{Campaign}_t + \beta_8 \text{Holiday}_t + \epsilon_t$$

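Each dummy variable captures the average lift (or suppression) of that event type. The coefficient $\beta_7$ is the estimated additional contacts on a campaign day, holding day of week and holiday status fixed. Because the model includes an intercept, only six day-of-week dummies appear; the omitted day is the baseline absorbed into $\beta_0$.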

Strengths: transparent, easy to interpret ("campaigns add 350 calls on average"), minimal infrastructure.

Limitations: assumes constant effect size (every campaign has the same impact), does not capture lag effects without additional dummy variables for post-event days, and does not capture interaction effects (a campaign during holiday season may have a different impact than a campaign in February).
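A dummy-variable regression of this kind can be fitted with ordinary least squares. The sketch below uses NumPy on synthetic, noise-free data so that the recovered campaign coefficient is easy to check. The volumes, the campaign days, and the choice of Monday as the omitted baseline day are assumptions made for illustration; the holiday dummy is left out to keep the example short.

```python
import numpy as np

# Four weeks of synthetic daily volumes (hypothetical numbers, no noise).
n_days = 28
day_of_week = np.arange(n_days) % 7           # 0 = Monday ... 6 = Sunday
campaign = np.zeros(n_days)
campaign[[8, 9, 17]] = 1.0                    # assumed campaign days

base_by_day = np.array([1000, 900, 880, 870, 850, 400, 300], dtype=float)
y = base_by_day[day_of_week] + 350.0 * campaign

# Design matrix: intercept, six day dummies (Monday is the omitted
# baseline to avoid perfect collinearity), and the campaign dummy.
day_dummies = (day_of_week[:, None] == np.arange(1, 7)).astype(float)
X = np.column_stack([np.ones(n_days), day_dummies, campaign])

# Ordinary least squares fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept = beta[0]        # average Monday volume without a campaign
campaign_lift = beta[-1]   # estimated extra contacts on a campaign day
```

With real data the fit will not be exact, and a statistics package that reports standard errors would normally be preferred over a bare least-squares solve.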

ARIMAX / Dynamic regression

ARIMAX extends ARIMA with external regressors. The model captures the time-series structure (trend, seasonality, autocorrelation) in the ARIMA component while adding regressor effects:

$$y_t = \beta_0 + \beta_1 x_{1,t} + \cdots + \beta_k x_{k,t} + \eta_t$$

where $\eta_t$ follows an ARIMA process. This separates the explainable variance (regressors) from the time-dependent residual structure.

Strengths: combines the power of ARIMA for capturing serial correlation with the ability to model external drivers. This is the standard approach in the forecasting literature for incorporating regressors into time-series models.[4]

Limitations: requires regressor values to be known (or forecasted) for the forecast horizon. You cannot use tomorrow's temperature as a regressor unless you have a weather forecast for tomorrow.
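The dependence on future regressor values is visible in the simplest case. As an illustration only, assume a single regressor and AR(1) errors (a choice made here for brevity, not one taken from the text above):

```latex
\begin{aligned}
y_t &= \beta_0 + \beta_1 x_{1,t} + \eta_t, \qquad \eta_t = \phi\,\eta_{t-1} + \epsilon_t \\
\hat{\eta}_T &= y_T - \beta_0 - \beta_1 x_{1,T} \\
\hat{y}_{T+1} &= \beta_0 + \beta_1 x_{1,T+1} + \phi\,\hat{\eta}_T
\end{aligned}
```

The one-step forecast $\hat{y}_{T+1}$ contains $x_{1,T+1}$, so the regressor must be known or forecast for every period in the horizon, and any error in that regressor forecast carries through to the volume forecast.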

Prophet with holidays and events

Facebook's Prophet model has a built-in holidays/events framework. The forecaster provides a dataframe of event dates with event names, and Prophet fits a separate effect for each event type:

$$y(t) = g(t) + s(t) + h(t) + \epsilon_t$$

where $g(t)$ is the trend, $s(t)$ is the seasonal component, and $h(t)$ is the holiday/event component. Prophet also supports custom regressors added as additional columns.

Strengths: low barrier to entry — the holidays interface is well-designed and handles multi-day events, window effects (days before/after), and multiple event types. Prophet's automatic seasonality detection reduces manual feature engineering.[5]

Limitations: Prophet was designed for daily or weekly business metrics, not 15-minute contact center volumes. The event framework is valuable, but the underlying trend/seasonality model may underperform ETS or ARIMA on interval-level, high-frequency WFM series.
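Window effects can be pictured as a separate indicator for each day offset around an event, each of which gets its own fitted effect. The sketch below is a generic illustration of that encoding, not Prophet's API or internals. The function name `window_indicators` and the label format are hypothetical; the parameter names `lower_window` and `upper_window` mirror the column names used in Prophet's holidays dataframe, where a negative lower window extends the event to earlier days.

```python
from datetime import date, timedelta


def window_indicators(event_dates, lower_window, upper_window):
    """Map each affected date to an offset label such as 'event-1' or 'event+2'.

    lower_window: days before the event to include (zero or negative)
    upper_window: days after the event to include (zero or positive)
    """
    labels = {}
    for event_day in event_dates:
        for offset in range(lower_window, upper_window + 1):
            labels[event_day + timedelta(days=offset)] = f"event{offset:+d}"
    return labels


# Hypothetical promotion on 2024-11-29 with an assumed effect from
# one day before to two days after.
promo_window = window_indicators([date(2024, 11, 29)], lower_window=-1, upper_window=2)
```

Each distinct label would become its own 0/1 column, which lets the day before an event carry a different effect than the day after.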

Gradient boosting (XGBoost, LightGBM) with feature columns

Machine learning models treat regressors as feature columns alongside time-based features (day of week, hour of day, week of year, lag values). The model learns non-linear relationships between features and volume.

Strengths: handles many regressors simultaneously, captures non-linear effects and interactions automatically, and scales to large datasets with hundreds of features. Can incorporate categorical regressors (campaign type, weather category) without manual dummy coding.

Limitations: requires more data to train reliably, less interpretable than regression, and requires the same features to be available at forecast time. See ML vs Classical Forecasting Comparison for the full tradeoff analysis.
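Before any tree-based model can be trained, the series has to be reshaped into one row per period with calendar features, lag values, and regressor columns side by side. The sketch below shows that step only, using the standard library. The function name `build_feature_rows`, the column names, and the data are hypothetical, and the resulting rows would still need to be passed to a library such as XGBoost or LightGBM.

```python
from datetime import date, timedelta


def build_feature_rows(start, volumes, campaign_days, lag=7):
    """Return one feature dict per day that has a full lag of history.

    start: date of volumes[0]
    volumes: daily contact volumes in date order
    campaign_days: set of dates with a scheduled campaign
    """
    rows = []
    for i in range(lag, len(volumes)):
        day = start + timedelta(days=i)
        rows.append({
            "date": day,
            "day_of_week": day.weekday(),              # 0 = Monday
            "week_of_year": day.isocalendar()[1],
            f"lag_{lag}": volumes[i - lag],            # same weekday, prior week
            "campaign": 1 if day in campaign_days else 0,
            "target": volumes[i],
        })
    return rows


# Two weeks of hypothetical daily volumes starting Monday 2024-03-04.
example_volumes = list(range(100, 114))
feature_rows = build_feature_rows(
    date(2024, 3, 4), example_volumes, campaign_days={date(2024, 3, 12)}
)
```

Every column built here must also be computable for future dates, which is why lag features are limited to lags at least as long as the forecast horizon.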

Measuring regressor value

Adding a regressor increases model complexity. The question: does the added complexity improve accuracy?

Holdout comparison

The holdout comparison is the gold standard: train two models on the same training data, one with the regressor and one without, and compare their accuracy on the same holdout period.

  • If the regressor model beats the baseline on MASE or MAE across the holdout, the regressor adds value.
  • If the improvement is small (< 1–2% MAE reduction), the regressor may not justify the data dependency and maintenance burden.
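The comparison above reduces to computing the same error metrics for both forecasts on the holdout. The following is a minimal sketch with hypothetical numbers; MASE is scaled here by the in-sample error of a seasonal naive forecast, and the function names are illustrative.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual, forecast, train, season=7):
    """MAE scaled by the in-sample MAE of a seasonal naive forecast."""
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(actual, forecast) / scale


def mae_reduction_pct(actual, baseline_fc, regressor_fc):
    """Percent MAE reduction of the regressor model relative to the baseline."""
    base = mae(actual, baseline_fc)
    return 100.0 * (base - mae(actual, regressor_fc)) / base


# Hypothetical holdout actuals and the two competing forecasts.
holdout_actual = [100, 110, 120]
baseline_forecast = [90, 100, 110]
regressor_forecast = [95, 108, 121]
improvement = mae_reduction_pct(holdout_actual, baseline_forecast, regressor_forecast)
```

A single holdout window can be misleading, so in practice the comparison is usually repeated over several rolling origins before a regressor is accepted or dropped.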

Feature importance analysis

For tree-based models (XGBoost, LightGBM), feature importance scores indicate how much each regressor contributes to the model's predictions. A regressor with near-zero importance can be dropped.

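For regression models, the t-statistic and p-value of the regressor coefficient play a similar role. A regressor whose coefficient is not statistically significant is a candidate for removal, although significance alone does not guarantee better forecasts; the holdout comparison remains the deciding test.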

Information criteria

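AIC and BIC penalize model complexity. If the regressor genuinely helps, the model with the regressor should have a lower AIC/BIC than the same model without it. The comparison is only valid when both models are fitted to the same observations (and, for ARIMA-type models, use the same order of differencing). This is a quick screen before the more expensive holdout comparison.[6]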
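For a least-squares fit with Gaussian errors, both criteria can be computed from the residual sum of squares, up to an additive constant that is the same for every model fitted to the same data. The sketch below uses that form; the function name and the RSS values are hypothetical.

```python
import math


def gaussian_aic_bic(rss, n_obs, n_params):
    """AIC and BIC for a Gaussian least-squares fit, up to a shared constant.

    rss: residual sum of squares on the training data
    n_obs: number of observations
    n_params: number of estimated parameters (count them the same way
        for every model being compared)
    """
    fit_term = n_obs * math.log(rss / n_obs)
    aic = fit_term + 2 * n_params
    bic = fit_term + n_params * math.log(n_obs)
    return aic, bic


# Hypothetical fits on the same 100 observations: adding one regressor
# lowers the residual sum of squares from 1000 to 900.
aic_without, bic_without = gaussian_aic_bic(rss=1000.0, n_obs=100, n_params=8)
aic_with, bic_with = gaussian_aic_bic(rss=900.0, n_obs=100, n_params=9)
regressor_passes_screen = aic_with < aic_without and bic_with < bic_without
```

BIC applies a heavier penalty per parameter than AIC once the sample exceeds a handful of observations, so a regressor that lowers AIC but not BIC is a borderline case worth checking on the holdout.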

The "does weather actually help?" test

A common WFM question: "Should we add weather to our forecast?" The answer is empirical, not theoretical.

Run the holdout comparison. For utility companies and insurance, the answer is almost always yes — weather explains substantial variance in daily volumes. For general-purpose customer service (retail, telecom), the answer is often no — weather effects exist but are small relative to day-of-week and seasonal effects already captured by the time-series model.

The exception: extreme weather events. A single-day ice storm can produce a volume spike on the order of 300% that no amount of day-of-week modeling predicts. These events are better handled as event flags (binary: extreme weather yes/no) than as continuous temperature regressors.
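One way to build such a flag is to threshold the daily temperature range. The sketch below is illustrative only: the function name and the thresholds of 10°F and 100°F are assumptions, and a real implementation might instead key off official weather warnings, which capture events such as ice storms better than temperature alone.

```python
def extreme_weather_flag(min_temp_f, max_temp_f,
                         cold_threshold_f=10.0, heat_threshold_f=100.0):
    """Return 1 if the day counts as extreme weather, else 0.

    Thresholds are hypothetical and should be set per region.
    """
    is_extreme = min_temp_f <= cold_threshold_f or max_temp_f >= heat_threshold_f
    return 1 if is_extreme else 0


# Hypothetical (min, max) temperatures in °F for three days.
daily_ranges = [(35.0, 60.0), (5.0, 20.0), (78.0, 104.0)]
weather_flags = [extreme_weather_flag(lo, hi) for lo, hi in daily_ranges]
```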

Practical implementation considerations

Data availability at forecast time

Every external regressor must be available (or forecastable) for the forecast horizon. This creates a hierarchy:

  • Known at forecast time: marketing campaigns (scheduled), holidays (calendar), product launches (planned) — these are the highest-value regressors because they require no forecasting of the regressor itself.
  • Forecastable: weather (3–7 day forecasts are reasonably accurate), economic indicators (published on known schedules with forward-looking estimates).
  • Unknown: social media spikes, unplanned outages, viral events — these cannot be used as regressors in advance forecasts but can be used in intraday reforecasts once detected. See Intraday Reforecasting and Real-Time Forecast Updates.

Data pipeline reliability

A model that depends on a weather API feed breaks when the API goes down. External regressors create data dependencies that must be monitored. Best practice: the model should gracefully degrade to its baseline (no-regressor) forecast when regressor data is unavailable, rather than failing entirely.
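Graceful degradation can be as simple as checking the regressor feed before choosing which model to run. The sketch below assumes two hypothetical callables, `baseline_model` and `regressor_model`, standing in for whatever forecasting code is in production; it also returns a label so that monitoring can record which path was taken.

```python
def forecast_with_fallback(baseline_model, regressor_model, regressor_values):
    """Use the regressor model only when every regressor value is present.

    baseline_model: callable taking no arguments, returns a forecast
    regressor_model: callable taking the regressor values, returns a forecast
    regressor_values: list of values for the horizon, or None if the feed failed
    Returns (forecast, label) where label names the model that was used.
    """
    feed_missing = regressor_values is None or any(v is None for v in regressor_values)
    if feed_missing:
        return baseline_model(), "baseline"
    return regressor_model(regressor_values), "with_regressors"


# Hypothetical stand-ins: a two-day baseline and a weather adjustment
# that adds 2 contacts per heating degree day.
def example_baseline():
    return [1000.0, 950.0]


def example_weather_model(hdd_values):
    return [b + 2.0 * h for b, h in zip(example_baseline(), hdd_values)]


normal_run = forecast_with_fallback(example_baseline, example_weather_model, [20.0, 35.0])
outage_run = forecast_with_fallback(example_baseline, example_weather_model, [20.0, None])
```

Logging the returned label makes it possible to measure how often the fallback is used and whether accuracy drops when it is.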

Organizational process

The most common failure mode is not statistical but organizational: the marketing team does not inform WFM about upcoming campaigns with enough lead time. Building a structured intake process — a shared calendar, a standard campaign brief template, a weekly sync meeting — is often more impactful than the choice of statistical method.

References

  1. Gans, N., Koole, G., and Mandelbaum, A. (2003). Telephone Call Centers: Tutorial, Review, and Research Prospects. Manufacturing & Service Operations Management, 5(2), 79–141.
  2. Taylor, J.W. (2008). A Comparison of Univariate Time Series Methods for Forecasting Intraday Arrivals at a Call Center. Management Science, 54(2), 253–265.
  3. Ibrahim, R., Ye, H., L'Ecuyer, P., and Shen, H. (2016). Modeling and Forecasting Call Center Arrivals: A Literature Survey and a Case Study. International Journal of Forecasting, 32(3), 865–874.
  4. Hyndman, R.J. and Athanasopoulos, G. (2021). Forecasting: Principles and Practice. 3rd ed. OTexts. Chapter 10: Dynamic Regression Models.
  5. Taylor, S.J. and Letham, B. (2018). Forecasting at Scale. The American Statistician, 72(1), 37–45.
  6. Burnham, K.P. and Anderson, D.R. (2002). Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2nd ed. Springer.