Financial institutions, home automation products, and offices near universal cryptographic decoders have increasingly used voice fingerprinting as a method for authentication. Recent advances in machine learning and text-to-speech have shown that synthetic, high-quality audio of subjects can be generated using transcripted speech from the target. Are current techniques for audio generation enough to spoof voice authentication algorithms? We demonstrate, using freely available machine learning models and limited budget, that standard speaker recognition and voice authentication systems are indeed fooled by targeted text-to-speech attacks. We further show a method which reduces data required to perform such an attack, demonstrating that more people are at risk for voice impersonation than previously thought.


  • _delta_zero, Senior Data Scientist, Salesforce
  • Azeem Aqil, Senior Security Software Engineer, Salesforce

_delta_zero performs machine learning on log data by day, and writes his dissertation on malware datasets by night. He was voted"most likely to create Skynet" by @alexcpsec, and he toys with offensive uses for machine learning in his free time. He has spoken at BlackHat USA, DEF CON, SecTor, BSidesLV/Charm, and the NIPS workshop on Machine Deception.


Azeem Aqil
Azeem Aqil is a security engineer at Salesforce. He works on building and maintaining the detection and response infrastructure that powers Salesforce security. Azeem is an academic turned hacker who has published and spoken at various academic security conferences.

Detailed Presentation:

(Source: DEF CON 26)
E-mail me when people leave their comments –

You need to be a member of CISO Platform to add comments!

Join CISO Platform

RSAC Meetup Banner

CISO Platform

A global community of 5K+ Senior IT Security executives and 40K+ subscribers with the vision of meaningful collaboration, knowledge, and intelligence sharing to fight the growing cyber security threats.

Join CISO Community Share Your Knowledge (Post A Blog)