Building voice security infrastructure for HSA providers

Key takeaways: Voice has become a major fraud risk as deepfakes, voice cloning, and social engineering make traditional call center verification methods easier to bypass. Knowledge-based authentication is no longer enough – a layered approach is needed to ensure voice-based attacks do not negatively impact the member experience.

Traditional knowledge-based authentication is no longer reliable because breached personal data and AI-generated voice attacks make shared-secret questions easy to defeat.
A layered, continuous risk model is more effective than one-time verification, using signals like caller metadata, passive voice biometrics, liveness checks, and behavioral analysis.
Voice channel threats keep evolving, so security teams must continually adapt their responses to new methods, technologies, and bad actors.

The voice channel has shifted from a service convenience to a primary attack surface. For Health Savings Account (HSA) providers, that shift carries unique stakes: a single compromised authentication can expose protected health information, account balances, and the linked banking information behind both.

The economics of voice-based fraud have changed. Open-weight text-to-speech (TTS) models, low-cost voice cloning toolkits, and automated calling infrastructure mean an attacker no longer needs deep technical expertise or insider data to mount a convincing impersonation. A short audio sample pulled from a podcast, a webinar, or a leaked voicemail is enough to generate synthetic speech that defeats listener verification and many legacy voice biometric systems.

How are cybercriminals using voice-based customer service for attacks?

Recent industry telemetry confirms that voice-based fraud is a growing problem. CrowdStrike’s 2026 Global Threat Report observed a 400% year-over-year increase in voice phishing (vishing) campaigns.¹ Pindrop, analyzing more than 1.2 billion call records, reported a 680% rise in deepfake activity and a more than 1,300% increase in overall fraud attempts on the voice channel.²

For healthcare and HSA providers, three categories of attack dominate:

Synthetic-voice impersonation, where TTS or voice-conversion models produce audio matching a target member.
Replay and splicing attacks, where recorded fragments of legitimate audio are stitched together to answer prompts.
Social engineering of contact center agents, often paired with weaponized urgency or emotional cues to override agent judgment.

A deepfaked voice that sounds panicked or emotional may successfully pressure customer service agents to ignore their instincts or deviate from procedure.

Why knowledge-based authentication has failed

Shared-secret authentication (such as the last four digits of a Social Security number, a mother’s maiden name, or the name of a first pet) was already weakened by a decade of large-scale data breaches. The National Institute of Standards and Technology (NIST) Digital Identity Guidelines (SP 800-63)³ effectively deprecated knowledge-based authentication (KBA) built on data of this kind for the same reason: the secrets are not secret. They are searchable.

In an HSA contact center, the failure mode is particularly costly. The same data points an agent uses to verify a member are the data points an attacker has already bought from open and dark-web sources. KBA is not a security step in that environment; it is a checkbox that creates the appearance of security.

What is continuous risk scoring for cybersecurity?

From the moment a call hits the carrier network through the entire member interaction, HSA providers should conduct continuous risk scoring that layers on a number of checkpoints.

Pre-answer signal analysis. Before the call reaches an agent or the Interactive Voice Response (IVR), carrier metadata, Automatic Number Identification (ANI) reputation, Signaling System 7 (SS7) anomaly detection, and audio-path fingerprinting identify spoofed numbers, VoIP routing inconsistencies, and known-bad infrastructure. This is the easiest stage to filter automated attacks at scale.

Passive voice biometrics. Text-independent speaker verification compares the live voiceprint against an enrolled model derived from prior legitimate interactions. Performed passively during natural conversation, it avoids the friction of active enrollment phrases and resists replay attacks tied to fixed challenge text.

Liveness and Presentation Attack Detection (PAD). Aligned with international standards, PAD techniques analyze spectral artifacts, prosodic patterns, and channel characteristics that distinguish synthetic or replayed audio from a live human speaker. This is the layer most directly targeted at TTS and voice-cloning attacks.

Behavioral analytics. Cadence, hesitation, IVR navigation patterns, and consistency with the member’s historical interaction profile contribute to additional risk analysis. A caller who knows the right answers but navigates the menu like someone who has never used it before is a behavioral mismatch worth flagging.

Continuous risk scoring surfaced to the agent. The output of these layers is a real-time score the agent can see, not a binary allow/deny gate. High scores route to step-up verification; low scores enable expedited self-service. The agent stops being the lone arbiter of suspicion.
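To make the layering concrete, here is a minimal sketch of how per-layer signals might be combined into a single score and a routing decision. Everything specific in it is an assumption made for illustration: the `CallSignals` fields, the weights, the two thresholds, and the middle "standard" band are hypothetical and do not describe any vendor's scoring model. A production system would calibrate these values on labeled call data.

```python
from dataclasses import dataclass

# Hypothetical layer weights (assumption); calibrate on labeled calls in practice.
WEIGHTS = {
    "pre_answer": 0.20,      # carrier metadata, ANI reputation, routing anomalies
    "voice_mismatch": 0.30,  # 1 minus passive voiceprint match confidence
    "pad": 0.35,             # likelihood the audio is synthetic or replayed
    "behavior": 0.15,        # deviation from the member's historical profile
}

STEP_UP_THRESHOLD = 0.60   # assumed: at or above this, route to step-up verification
EXPEDITE_THRESHOLD = 0.25  # assumed: at or below this, allow expedited self-service


@dataclass
class CallSignals:
    """Per-layer risk estimates in [0, 1], where 1 is most suspicious."""
    pre_answer: float
    voice_mismatch: float
    pad: float
    behavior: float


def risk_score(signals: CallSignals) -> float:
    """Weighted average of the layer estimates, each clamped to [0, 1]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(getattr(signals, name), 0.0), 1.0)
        total += weight * value
    return total / sum(WEIGHTS.values())


def route(score: float) -> str:
    """Map a score to a next step instead of a binary allow/deny gate."""
    if score >= STEP_UP_THRESHOLD:
        return "step_up_verification"
    if score <= EXPEDITE_THRESHOLD:
        return "expedited_self_service"
    return "standard_agent_handling"  # assumed middle band


# The score is recomputed as new evidence arrives during the call.
early_in_call = CallSignals(pre_answer=0.1, voice_mismatch=0.5, pad=0.5, behavior=0.5)
later_in_call = CallSignals(pre_answer=0.1, voice_mismatch=0.9, pad=0.95, behavior=0.6)

for label, signals in [("early", early_in_call), ("later", later_in_call)]:
    score = risk_score(signals)
    print(f"{label}: score={score:.2f} -> {route(score)}")
```

In this sketch the same call moves from standard handling to step-up verification once the voice and liveness layers have heard enough audio to raise their estimates, which is the practical difference between continuous scoring and a one-time check.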

How can HSA providers balance account security with great member experiences?

HealthEquity integrated Pindrop’s voice security platform across the contact center as part of this layered model.⁴ Two measurable outcomes from the deployment:

The profile match rate in the IVR rose from 31% to 71% within the first month of Pindrop Passport deployment, meaning a larger share of legitimate callers were recognized and routed to self-service without ever touching an agent.
Overall authentication rate exceeded 91%, reducing agent handle time spent on manual identity checks and freeing capacity for substantive member support.

The point worth emphasizing for benefits and security leaders: stronger authentication and lower friction are not a tradeoff. Member experience doesn’t have to be sacrificed for account security. HealthEquity’s broader approach to AI-driven identity verification and advanced fraud detection reflects the same principle applied across other channels.

Combined with stronger authentication like passkeys, this creates a system that’s much harder to fool. Trust isn’t granted once, but verified continuously.

Implementation considerations for HSA and healthcare providers

A few practical notes for teams scoping similar deployments:

Voiceprint enrollment is a data governance question. Biometric templates associated with protected health information (PHI) are regulated under the Health Insurance Portability and Accountability Act (HIPAA) and, in many cases, separately under state biometric privacy statutes. Storage, encryption, retention, and member opt-out paths need to be designed before enrollment scales.
Score interpretation belongs in the agent workflow, not adjacent to it. Risk scores buried in a separate tool are ignored when call centers get busy. They need to be in the line of sight where the agent makes the next decision.
The control must be evaluated against an evolving threat. Voice synthesis quality improves on a quarterly cadence. PAD models that performed well against 2024-era cloning may underperform against 2026 systems. Periodic red-team testing with current-generation synthesis tooling is part of operating the control, not a one-time validation.
Channel parity matters. Attackers probe whichever channel – voice, chat, email, AI agent – has the weakest controls. A hardened voice channel adjacent to a soft chatbot does not reduce overall fraud; it relocates it.
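The recurring red-team check described above can be tracked with very little tooling. The sketch below is one possible approach, not a prescribed method: it groups red-team call outcomes by the generation of synthesis tooling used and flags any generation where the detection rate falls below a floor. The function names, the 90% floor, the generation labels, and the sample outcomes are all hypothetical and invented purely for illustration; they are not measured results.

```python
from collections import defaultdict


def detection_rate_by_generation(results):
    """Return {generation_label: fraction of red-team calls that were detected}.

    `results` is an iterable of (generation_label, was_detected) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [detected, total]
    for label, was_detected in results:
        counts[label][0] += int(was_detected)
        counts[label][1] += 1
    return {label: detected / total for label, (detected, total) in counts.items()}


def generations_below_floor(rates, floor=0.90):
    """List generations whose detection rate is under the assumed floor."""
    return sorted(label for label, rate in rates.items() if rate < floor)


# Made-up outcomes for illustration only.
red_team_results = (
    [("2024-era tooling", True)] * 19
    + [("2024-era tooling", False)] * 1
    + [("2026-era tooling", True)] * 14
    + [("2026-era tooling", False)] * 6
)

rates = detection_rate_by_generation(red_team_results)
for label, rate in sorted(rates.items()):
    print(f"{label}: {rate:.0%} detected")
print("needs attention:", generations_below_floor(rates))
```

Running the same report after each test cycle turns "the control must be evaluated against an evolving threat" into a trend line that can be reviewed alongside other security metrics.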

The infrastructure of voice security and trust

For most of human history, trust in a voice was a perceptual fact. It is now an engineering problem. The organizations that handle it well will be the ones that treat trust the way they treat any other piece of critical infrastructure: instrumented, monitored, and built to fail safely under adversarial conditions.

Visit our Trust Center to see how HealthEquity is building and measuring that infrastructure for clients and members.

HealthEquity does not provide legal, tax, or financial advice.

¹CrowdStrike, 2026 Global Threat Report.

²Pindrop, 2025 Voice Intelligence & Security Report.

³National Institute of Standards and Technology, Digital Identity Guidelines (SP 800-63).

⁴Pindrop and HealthEquity case study, 2026. HealthEquity and Pindrop are separate, unaffiliated companies and are not responsible for each other’s policies or services.
