Video in Customer Service

From WFM Labs

Video in Customer Service is an emerging contact center channel that adds visual communication to customer interactions. Video enables agents and customers to see each other, share screens, demonstrate solutions, and conduct visual assessments — capabilities that voice and text channels cannot replicate. While video remains a small fraction of total contact center volume, its adoption is accelerating in specific high-value use cases where visual communication materially improves outcomes.

Video is not "a phone call with a camera." It changes the interaction dynamic, the staffing model, the scheduling requirements, and the performance profile. WFM practitioners must understand these differences as video grows from a niche channel into a mainstream one.

Use Cases

Video finds its strongest application where visual information is essential to resolving the interaction:

Insurance Claims Visual Assessment

Policyholders use their smartphone cameras to show damage — automobile accidents, property damage, water leaks, storm damage — to claims adjusters in real time. The adjuster can guide the camera ("show me the other side"), ask for close-ups, and make preliminary assessments without dispatching a field inspector.

Impact: reduces claim processing time by 1-3 days, eliminates field visit costs ($150-$400 per inspection), and improves customer satisfaction by resolving the claim interaction in a single contact rather than scheduling a follow-up.

Technical Support (Show-Me)

Customers aim their camera at the product — a router, an appliance, a piece of equipment — and the agent provides guided troubleshooting. "Show me the lights on the front panel." "Turn the device around so I can see the cable connections." "Point the camera at the error message on the screen."

Impact: reduces AHT for hardware troubleshooting by 20-40% compared to verbal-only guidance. Improves FCR by enabling the agent to verify the customer's actions rather than relying on the customer's description.

Telehealth

Video is the primary channel for telehealth consultations — physician-patient interactions conducted remotely. Contact centers that support healthcare organizations route telehealth appointments through video-enabled agents and practitioners.

Impact: telehealth adoption surged during COVID-19 and has remained elevated. The contact center's role is scheduling, technical support for the video platform, and in some cases triage before practitioner handoff.

Financial Advisory

Wealth management, mortgage consultation, tax advisory, and complex financial product sales benefit from face-to-face interaction. Video preserves the relationship dynamic of in-person meetings while eliminating geographic constraints.

Impact: conversion rates for complex financial products are 15-25% higher via video than voice, because the trust and engagement dynamics of face-to-face communication transfer to video.

Luxury Retail

High-end brands use video to provide virtual shopping experiences — a sales associate walks the customer through products, demonstrates features, and provides the personalized attention that luxury customers expect. Brands including Gucci, Burberry, and BMW have deployed video customer service.

Impact: average order values for video-assisted sales are 2-5× higher than standard e-commerce or voice-assisted sales.

Government Services

Passport applications, visa consultations, social services assessments, and identity verification processes that traditionally required in-person visits can be conducted via video.

Impact: reduces facility costs and wait times; improves accessibility for citizens with mobility limitations or those in remote areas.

Staffing Implications

Video fundamentally changes the agent utilization model:

No Concurrency

Voice agents handle one call at a time. Chat agents handle 2-4 concurrent conversations. Video agents handle one interaction at a time and cannot multitask during the interaction because the customer can see them. Video is the most attention-intensive channel.

The concurrency factor for video is 1.0: there is no multiplier. Per minute of handle time, video therefore costs about the same as voice and more than chat, where concurrency spreads agent time across 2-4 conversations. Per contact, video usually costs more than voice as well, because video handle times tend to be longer.
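As a rough illustration of the concurrency effect, the sketch below converts contact volume into offered load in erlangs and divides it by the concurrency factor. The function name, the 30-minute interval, and the sample volume are assumptions for illustration. Dividing by concurrency is itself a simplification, since concurrent chats usually stretch each conversation's handle time.

```python
def offered_load_erlangs(contacts_per_interval: float,
                         aht_seconds: float,
                         interval_seconds: float = 1800.0,
                         concurrency: float = 1.0) -> float:
    """Agent workload in erlangs for one interval.

    concurrency=1.0 models voice and video (one contact at a time);
    values above 1.0 model chat, where one agent serves several contacts.
    """
    if concurrency < 1.0:
        raise ValueError("concurrency cannot be below 1.0")
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return contacts_per_interval * aht_seconds / (interval_seconds * concurrency)


# Hypothetical comparison: the same volume and AHT on video versus chat.
video_load = offered_load_erlangs(30, 600)                 # concurrency 1.0
chat_load = offered_load_erlangs(30, 600, concurrency=3)   # three chats per agent
print(f"video: {video_load:.2f} erlangs, chat: {chat_load:.2f} erlangs")
```

With identical volume and handle time, the video queue carries three times the workload of a chat queue running at a concurrency of 3.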

Extended Handle Times

Video contacts are typically longer than their voice equivalents:

Use Case                   | Voice AHT (min) | Video AHT (min) | Notes
Insurance claim assessment | 12-15           | 18-25           | Camera guidance and visual documentation add time
Technical support          | 8-12            | 10-15           | Faster diagnosis partly offsets camera setup time
Financial advisory         | 20-30           | 25-40           | Relationship-building component is more natural on video
Telehealth                 | N/A             | 15-30           | Video is the primary channel, not a substitute

However, the FCR benefit often offsets the longer AHT. If a video interaction resolves in one contact what would have required two voice contacts plus a field visit, the total cost is lower despite the longer single-interaction duration.
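The trade-off can be made concrete with a small cost-per-resolution calculation. This is a sketch under stated assumptions: the cost of $1.00 per agent minute is hypothetical, the handle times are the upper ends of the insurance claim ranges above, and the field visit uses the low end of the $150-$400 inspection cost.

```python
def resolution_cost(minutes_per_contact: float,
                    contacts_to_resolve: int,
                    cost_per_agent_minute: float,
                    field_visit_cost: float = 0.0) -> float:
    """Total cost to fully resolve one customer issue."""
    agent_cost = minutes_per_contact * contacts_to_resolve * cost_per_agent_minute
    return agent_cost + field_visit_cost


COST_PER_AGENT_MINUTE = 1.00  # hypothetical fully loaded cost, in dollars

# Video: one 25-minute contact, no field visit.
video_cost = resolution_cost(25, 1, COST_PER_AGENT_MINUTE)

# Voice: two 15-minute contacts plus a field inspection.
voice_cost = resolution_cost(15, 2, COST_PER_AGENT_MINUTE, field_visit_cost=150.0)

print(f"video: ${video_cost:.2f}  voice + field visit: ${voice_cost:.2f}")
```

Under these assumptions the single video contact is the more expensive interaction but the cheaper resolution. The conclusion depends on how often voice really needs the second contact and the field visit, so the inputs should come from the operation's own data.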

Agent Selection Criteria

Not all agents are equally suited for video:

  • Presentation: Video agents are visible to customers. Professional appearance and background matter. This is not vanity — it is the visual equivalent of clear speech on voice.
  • On-camera comfort: Some agents are uncomfortable on camera. Forcing video capability on reluctant agents produces awkward interactions.
  • Technical fluency: Video introduces additional technical variables (camera operation, screen sharing, lighting, bandwidth) that the agent must manage while handling the interaction.
  • Non-verbal communication: Video enables facial expressions, gestures, and body language. Agents skilled in non-verbal communication perform better on video.

Scheduling Video-Capable Agents

Skill Routing

Video capability is a routing skill in the skill-based routing framework. Only agents with the "video" skill receive video contacts. The skill assignment considers:

  • Agent training completion (video-specific training module)
  • Equipment verification (webcam, lighting, professional background)
  • Bandwidth verification (minimum bandwidth requirements met at the agent's location)
  • Agent willingness (opt-in for video is preferable to mandatory assignment)
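The assignment criteria above can be sketched as a simple eligibility check. The class, field names, and function below are hypothetical and do not correspond to any particular routing platform. The 5 Mbps threshold follows the upload guideline in this article.

```python
from dataclasses import dataclass

MIN_UPLOAD_MBPS = 5.0  # assumption: based on the "5+ Mbps upload" guideline


@dataclass(frozen=True)
class AgentProfile:
    agent_id: str
    video_training_complete: bool
    equipment_verified: bool   # webcam, lighting, background checked
    upload_mbps: float         # measured at the agent's location
    opted_in: bool


def video_skill_blockers(agent: AgentProfile,
                         require_opt_in: bool = True) -> list[str]:
    """Return the criteria the agent still fails; an empty list means eligible."""
    blockers = []
    if not agent.video_training_complete:
        blockers.append("training")
    if not agent.equipment_verified:
        blockers.append("equipment")
    if agent.upload_mbps < MIN_UPLOAD_MBPS:
        blockers.append("bandwidth")
    if require_opt_in and not agent.opted_in:
        blockers.append("opt-in")
    return blockers


example = AgentProfile("agent-001", True, True, 3.0, False)
print(video_skill_blockers(example))
```

Returning the list of unmet criteria, rather than a plain yes or no, tells the supervisor what remains to be done before the skill can be assigned.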

Equipment Requirements

Video agents require:

  • HD webcam: 720p minimum; 1080p preferred. Built-in laptop cameras are usually adequate but dedicated webcams provide better quality.
  • Professional lighting: Ring light or desk lamp positioned to illuminate the face evenly. Backlighting (window behind the agent) is the most common video quality issue.
  • Professional background: Either a physical professional backdrop (clean, branded wall) or a high-quality virtual background. Virtual backgrounds require sufficient processing power to render smoothly.
  • Headset with noise cancellation: Video interactions are more sensitive to background noise because the visual channel sets expectations of presence.
  • Adequate bandwidth: 5+ Mbps upload for stable HD video. Lower bandwidth produces pixelation and lag that degrade the experience.

Scheduling Patterns

Video demand tends to be:

  • Appointment-based: Many video interactions are scheduled (financial advisory, telehealth, complex consultations). The WFM system must support appointment scheduling alongside queue-based routing.
  • Lower volume than voice/chat: Video is a small fraction of total contacts, so the staffing pool is small and interval-to-interval variation is larger relative to the average volume.
  • Time-of-day concentrated: Video contacts cluster during business hours more than voice contacts, because customers treat video as "meeting-like" and prefer daytime scheduling.

The WFM challenge: staffing a small video pool with sufficient coverage while managing the variance inherent in low-volume channels. Options include blended agents (video-capable agents who handle voice when no video contacts are queued) and appointment scheduling to smooth the demand.
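The small-pool penalty can be illustrated with the standard Erlang C model. This is a sketch, not a staffing recommendation: Erlang C assumes Poisson arrivals and no abandonment, and the volumes, the 20-minute AHT, and the target of 80% answered within 60 seconds are hypothetical.

```python
import math


def erlang_c_wait_probability(agents: int, load: float) -> float:
    """Probability that a contact has to wait (Erlang C). Needs agents > load."""
    if agents <= load:
        return 1.0
    blocking = 1.0  # Erlang B, built up with the usual recurrence
    for n in range(1, agents + 1):
        blocking = load * blocking / (n + load * blocking)
    return blocking / (1 - (load / agents) * (1 - blocking))


def service_level(agents: int, load: float,
                  aht_seconds: float, threshold_seconds: float) -> float:
    """Fraction of contacts answered within the threshold."""
    if agents <= load:
        return 0.0
    p_wait = erlang_c_wait_probability(agents, load)
    return 1 - p_wait * math.exp(-(agents - load) * threshold_seconds / aht_seconds)


def agents_for_target(contacts_per_hour: float, aht_seconds: float,
                      target: float = 0.80,
                      threshold_seconds: float = 60.0) -> tuple[int, float]:
    """Smallest agent count meeting the target, plus the resulting occupancy."""
    load = contacts_per_hour * aht_seconds / 3600
    agents = math.floor(load) + 1
    while service_level(agents, load, aht_seconds, threshold_seconds) < target:
        agents += 1
    return agents, load / agents


small_pool = agents_for_target(contacts_per_hour=4, aht_seconds=1200)
large_pool = agents_for_target(contacts_per_hour=120, aht_seconds=1200)
print(f"small pool: {small_pool[0]} agents at {small_pool[1]:.0%} occupancy")
print(f"large pool: {large_pool[0]} agents at {large_pool[1]:.0%} occupancy")
```

Under these assumptions, four video contacts per hour need three agents at roughly 44% occupancy, while the larger queue runs at much higher occupancy. That idle time is what blending with voice or scheduling by appointment is meant to recover.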

Performance Metrics

Video contacts warrant adjusted metrics:

  • AHT: Expected to be higher than voice. Video-specific AHT targets should be set based on video-channel data, not adjusted from voice baselines.
  • FCR: Video typically produces higher FCR (the visual component enables more complete resolution). This is the primary business justification for the channel.
  • CSAT: Video contacts consistently score higher in customer satisfaction than voice contacts for the same contact types. The face-to-face dynamic creates stronger perceived service quality.
  • NPS: Similar pattern — video interactions drive higher Net Promoter Scores.
  • Technical quality: Video introduces platform-specific metrics — video quality score, connection stability, audio/video sync, dropped connection rate. These must be tracked to manage the technical experience.
  • Agent utilization: Lower than voice or chat due to equipment setup, post-interaction notes (video contacts generate more detailed notes), and the attention intensity of the channel.
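A minimal sketch of computing some of these metrics from video-channel data only, so that targets are not derived from voice baselines. The record fields and sample values are hypothetical; real data would come from the contact center platform's reporting export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoContact:
    handle_seconds: int
    resolved_first_contact: bool
    connection_dropped: bool


def channel_metrics(contacts: list[VideoContact]) -> dict[str, float]:
    """AHT in minutes, FCR rate, and dropped connection rate for one channel."""
    if not contacts:
        raise ValueError("no contacts to summarize")
    total = len(contacts)
    return {
        "aht_minutes": sum(c.handle_seconds for c in contacts) / total / 60,
        "fcr_rate": sum(c.resolved_first_contact for c in contacts) / total,
        "dropped_connection_rate": sum(c.connection_dropped for c in contacts) / total,
    }


# Hypothetical sample of four video contacts.
sample = [
    VideoContact(1200, True, False),
    VideoContact(900, True, False),
    VideoContact(1500, False, True),
    VideoContact(1200, True, False),
]
print(channel_metrics(sample))
```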

Privacy and Recording Considerations

Video recording adds complexity to the privacy and compliance framework:

  • Consent: Many jurisdictions require explicit consent for video recording. In some cases the requirement is more stringent than for voice recording because video captures facial images, which some laws treat as biometric data.
  • Storage: Video recordings are dramatically larger than audio recordings (10-50× the storage per minute). Retention policies must account for storage costs.
  • PCI scope: If the customer shows a payment card on camera, the video recording contains card data and falls under PCI-DSS requirements. Operations must train agents to prevent this and implement technical controls.
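  • GDPR: Video recordings of identifiable customers are personal data under GDPR. Facial images become biometric data, a special category under Article 9 with the strictest processing conditions, when they are processed by technical means that allow unique identification, such as facial recognition.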
  • Agent privacy: Recording the agent's face and home workspace (for remote agents) raises employee privacy considerations. Clear policy and consent are required.
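To show how the storage multiplier affects retention planning, the sketch below estimates steady-state storage for recordings. The 0.5 MB per minute audio figure, the monthly volume, and the 12-month retention period are hypothetical; the 10× and 50× multipliers are the range quoted above.

```python
def recording_storage_gb(recorded_minutes_per_month: float,
                         retention_months: int,
                         audio_mb_per_minute: float,
                         video_multiplier: float) -> tuple[float, float]:
    """Steady-state storage in decimal GB as (audio_only, video)."""
    retained_minutes = recorded_minutes_per_month * retention_months
    audio_gb = retained_minutes * audio_mb_per_minute / 1000
    return audio_gb, audio_gb * video_multiplier


AUDIO_MB_PER_MINUTE = 0.5  # hypothetical; depends on the audio codec and bitrate

audio_gb, video_low_gb = recording_storage_gb(10_000, 12, AUDIO_MB_PER_MINUTE, 10)
_, video_high_gb = recording_storage_gb(10_000, 12, AUDIO_MB_PER_MINUTE, 50)
print(f"audio: {audio_gb:.0f} GB, video: {video_low_gb:.0f}-{video_high_gb:.0f} GB")
```

Because storage scales linearly with the retention period, shortening retention for video recordings, where regulation allows it, is one of the few direct levers on this cost.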

Video + AI

AI enhances video interactions:

Screen Sharing and Co-Browsing

The agent shares their own screen, or the agent and customer navigate the customer's screen together. AI can highlight relevant areas, annotate shared content, and generate step-by-step guides from the co-browsing session.

Augmented Reality (AR) Assisted Support

The most advanced application: AR overlays on the customer's camera view. The agent (or AI) places visual markers ("click here," "turn this dial," "the part you need is highlighted in green") directly on the customer's live camera view. Used in field service support, manufacturing, and complex product troubleshooting.

Automated Visual Analysis

AI analyzes the video feed for relevant information:

  • Damage assessment: AI pre-categorizes damage severity from the camera feed, supporting the human adjuster's assessment.
  • Product identification: AI identifies the customer's product model from the camera view, auto-populating the service record.
  • Sentiment analysis: AI reads facial expressions and vocal tone to provide real-time sentiment indicators to the agent.

Maturity Model Position

In the WFM Labs Maturity Model™:

  • Level 1 — Initial organizations do not offer video as a customer service channel.
  • Level 2 — Foundational organizations use video for a specific high-value use case (telehealth, financial advisory) as a separate, manually scheduled channel. No WFM integration — video appointments are managed outside the standard scheduling system.
  • Level 3 — Progressive organizations integrate video into the multi-channel routing and scheduling framework. Video-capable agents are identified as a skill group. WFM models include video demand forecasting. Equipment and bandwidth standards are defined.
  • Level 4 — Advanced organizations blend video with other channels dynamically. AI-assisted video features (co-browsing, AR, automated visual analysis) are deployed. Video performance analytics drive continuous improvement. Video demand is forecast alongside voice, chat, and other channels in a unified model.
  • Level 5 — Pioneering organizations treat video as a natural mode of interaction, not a special channel. The system intelligently escalates from text to voice to video based on interaction complexity and customer preference. AI handles simple visual assessments autonomously, escalating to human agents for complex judgment.
