SlideShare a Scribd company logo
1 of 28
Download to read offline
YOUR VOICE IS MY PASSPORT
John Seymour and Azeem Aqil
1
Introduction
Voice is starting to be used for authentication
2
Introduction
Voice is starting to be used for authentication
3
Introduction
Voice is starting to be used for authentication
4
Introduction
Voice is starting to be used for authentication
5
Goal:
Break Voice Authentication
(with minimal effort)
6
Obligatory Sneakers Reference
7
Obligatory Sneakers Reference
And here’s how you do it.8
• In Sneakers, they social engineered the target in order to record the
exact words they needed

• In practice, this is hard to do

• Luckily, text-to-speech exists
9
10
Overview of Text to Speech (TTS)
▪Essential idea: you give the algorithm text, and it generates the
equivalent audio representation of that text (e.g. Mel Spectrograms)

▪Model learns the mapping between the transcript and audio

▪The way it does this is you give it labeled (transcribed) audio and
feed it into a deep neural network

▪ Generally models are trained on a single person’s voice

▪ Generally deep learning models require a LARGE amount of labeled data

▪ Open source datasets (e.g. Blizzard, LJ Speech)

▪ >24 hours of labeled data to do well
11
Proof of concept
▪ Create an account

▪ Record >30 sentences

▪ Chosen by Lyrebird, same for all users.

▪ Provide a target sentence that Lyrebird will generate. 

▪ Apple Siri and Microsoft Speaker Recognition API (Public Beta)

▪ Proof of concept is of limited value from a security standpoint
12
Proof of Concept Demo
13
Open source TTS models
▪Several open source models (Tacotron, Wavenet are best known)

▪ WaveNet generates realistic human sounding output, however, needs to be ‘tuned’
significantly.

▪Tacotron simplifies this process greatly

▪ The production of the feature set (which needs tuning in WaveNet) is replaced by another NN
that works directly off data

▪We use Tacotron
14
Model overview
15
Tacotron v1, Tacotron v2, and WaveNet
16
Collecting Data
▪That’s all well and good, but in order to train a model, we need to
feed it data

▪ Can grab audio from e.g. YouTube, but quality/quantity are both important

▪Need to transcribe this data

▪ Youtube/Google Speech API not good enough

▪ We also had to cut pieces with poor quality or with “um”

▪ We had to manually transcribe the data we used for training

▪Most open source models require short (<10 second) snippets of
audio

▪ Use ffmpeg to split the file into chunks
17
Data Augmentation
▪Publicly available data for a specific target is probably limited

▪Transcribing is time intensive since it must be done manually

▪Need LARGE amount of high quality training data

▪Solution: Augment Data
18
Data Augmentation: Shifting Pitch
▪Slow down and speed up audio to generate new examples

▪Libraries (pydub) available for this

▪Measured how far pitch can be raised/lowered by recording “Hey
Siri” and testing how much speed up/slowdown would be accepted

▪ 0.88x to 1.21x for our tests

▪ YMMV for exact parameter (probably different for every person)
19
Data Augmentation: Transfer Learning
1.Initially train on large open source dataset (Blizzard, LJSpeech)

2.Get a good model, stop training

3.Replace open source dataset with the target data

4.Continue to train
20
Transfer Learning Demo
21
Putting it all together
1.Scrape data from target (e.g. Youtube)

2.Select high-quality samples

3.Transcribe and chunk audio

4.Augment audio by shifting pitch

5.Train general TTS model on open source dataset

6.Replace general model training data with target data; finish training

7.Synthesize voice from model
22
Our work in perspective (ML for offense)
▪Attacks on ML systems

▪ Adversarial Attacks

▪ Most prior work attacking voice systems utilize GANs

▪ Pro: hiding commands within benign-sounding audio

▪ Con: method is currently brittle

▪ (We use the simpler approach of generating speech for a given user)

▪ Poisoning the Well

▪ Privacy/Differential Privacy

▪ Attacks using ML systems

▪ Phishing

▪ DeepFakes

▪ Robotics/Social Engineering
23
Mitigation: MFA
▪Defense in Depth

▪Potential issue: Speaker Recognition with unknown vocabulary is
hard

▪Potential issue: Passphrases may not be kept secret
24
Image credit: XKCD.com
Mitigation: Detect CGA
25
Conclusions
▪Speaker authentication and speaker recognition are different
problems. Recognition is only a [weak] signal for “authenticating”.

▪Speaker authentication can be broken if the attacker has speech
data of the target and knows the authentication prompt.

▪Although most TTS systems require 24 hours of speech to train,
transfer learning is an effective way to reduce that time to an amount
realistic for an attacker to abuse. Transfer learning is effective for
reducing data requirements generally.

▪In conclusion, it’s relatively easy to spoof someone’s voice

▪ Will only get easier over time
26
Final note
27 Be afraid. Be very afraid.
?John Seymour and Azeem Aqil

More Related Content

What's hot

Natural Language Processing: Comparing NLTK and OpenNLP
Natural Language Processing: Comparing NLTK and OpenNLPNatural Language Processing: Comparing NLTK and OpenNLP
Natural Language Processing: Comparing NLTK and OpenNLPCodeOps Technologies LLP
 
Transfer_Learning_for_Natural_Language_P_v3_MEAP.pdf
Transfer_Learning_for_Natural_Language_P_v3_MEAP.pdfTransfer_Learning_for_Natural_Language_P_v3_MEAP.pdf
Transfer_Learning_for_Natural_Language_P_v3_MEAP.pdforanisalcani
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Dhruv Gohil
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processingpunedevscom
 
Puzzle Theory 102
Puzzle Theory 102Puzzle Theory 102
Puzzle Theory 102sblom
 

What's hot (7)

Natural Language Processing: Comparing NLTK and OpenNLP
Natural Language Processing: Comparing NLTK and OpenNLPNatural Language Processing: Comparing NLTK and OpenNLP
Natural Language Processing: Comparing NLTK and OpenNLP
 
Transfer_Learning_for_Natural_Language_P_v3_MEAP.pdf
Transfer_Learning_for_Natural_Language_P_v3_MEAP.pdfTransfer_Learning_for_Natural_Language_P_v3_MEAP.pdf
Transfer_Learning_for_Natural_Language_P_v3_MEAP.pdf
 
Blenderbot
BlenderbotBlenderbot
Blenderbot
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical
 
Deeplearning NLP
Deeplearning NLPDeeplearning NLP
Deeplearning NLP
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Puzzle Theory 102
Puzzle Theory 102Puzzle Theory 102
Puzzle Theory 102
 

Similar to YOUR VOICE IS MY PASSPORT

How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersIvo Andreev
 
DIY Jarvis All Things Open 2019
DIY Jarvis All Things Open 2019DIY Jarvis All Things Open 2019
DIY Jarvis All Things Open 2019Wes Widner
 
Building your own open-source voice assistant
Building your own open-source voice assistantBuilding your own open-source voice assistant
Building your own open-source voice assistantAll Things Open
 
final ppt BATCH 3.pptx
final ppt BATCH 3.pptxfinal ppt BATCH 3.pptx
final ppt BATCH 3.pptxMounika715343
 
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Databricks
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsIvo Andreev
 
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsBreaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsJason Anderson
 
From Dream socialbot to Multiskill AI Assistant Platform
From Dream socialbot to Multiskill AI Assistant PlatformFrom Dream socialbot to Multiskill AI Assistant Platform
From Dream socialbot to Multiskill AI Assistant PlatformDaniel Kornev
 
Jingle: Cutting Edge VoIP
Jingle: Cutting Edge VoIPJingle: Cutting Edge VoIP
Jingle: Cutting Edge VoIPmattjive
 
Synthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningSynthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningS N
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Jeongkyu Shin
 
Towards Machine Comprehension of Spoken Content
Towards Machine Comprehension of Spoken ContentTowards Machine Comprehension of Spoken Content
Towards Machine Comprehension of Spoken ContentNVIDIA Taiwan
 
Voice usability testing with WOZ methodology - UX SCOT 2019
Voice usability testing with WOZ methodology - UX SCOT 2019Voice usability testing with WOZ methodology - UX SCOT 2019
Voice usability testing with WOZ methodology - UX SCOT 2019Abi Reynolds
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPMENGSAYLOEM1
 
Ux scot voice usability testing with woz - ar and sf - june 2019
Ux scot   voice usability testing with woz - ar and sf  - june 2019Ux scot   voice usability testing with woz - ar and sf  - june 2019
Ux scot voice usability testing with woz - ar and sf - june 2019User Vision
 
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdf
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdfITB_2023_Chatgpt_Box_Scott_Steinbeck.pdf
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdfOrtus Solutions, Corp
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDatabricks
 

Similar to YOUR VOICE IS MY PASSPORT (20)

How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
 
DIY Jarvis All Things Open 2019
DIY Jarvis All Things Open 2019DIY Jarvis All Things Open 2019
DIY Jarvis All Things Open 2019
 
Building your own open-source voice assistant
Building your own open-source voice assistantBuilding your own open-source voice assistant
Building your own open-source voice assistant
 
final ppt BATCH 3.pptx
final ppt BATCH 3.pptxfinal ppt BATCH 3.pptx
final ppt BATCH 3.pptx
 
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and Misconceptions
 
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsBreaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
 
From Dream socialbot to Multiskill AI Assistant Platform
From Dream socialbot to Multiskill AI Assistant PlatformFrom Dream socialbot to Multiskill AI Assistant Platform
From Dream socialbot to Multiskill AI Assistant Platform
 
Jingle: Cutting Edge VoIP
Jingle: Cutting Edge VoIPJingle: Cutting Edge VoIP
Jingle: Cutting Edge VoIP
 
Os Tucker
Os TuckerOs Tucker
Os Tucker
 
Synthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningSynthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep Learning
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
 
Towards Machine Comprehension of Spoken Content
Towards Machine Comprehension of Spoken ContentTowards Machine Comprehension of Spoken Content
Towards Machine Comprehension of Spoken Content
 
Deep Domain
Deep DomainDeep Domain
Deep Domain
 
Voice usability testing with WOZ methodology - UX SCOT 2019
Voice usability testing with WOZ methodology - UX SCOT 2019Voice usability testing with WOZ methodology - UX SCOT 2019
Voice usability testing with WOZ methodology - UX SCOT 2019
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLP
 
Ux scot voice usability testing with woz - ar and sf - june 2019
Ux scot   voice usability testing with woz - ar and sf  - june 2019Ux scot   voice usability testing with woz - ar and sf  - june 2019
Ux scot voice usability testing with woz - ar and sf - june 2019
 
kornev.pdf
kornev.pdfkornev.pdf
kornev.pdf
 
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdf
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdfITB_2023_Chatgpt_Box_Scott_Steinbeck.pdf
ITB_2023_Chatgpt_Box_Scott_Steinbeck.pdf
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
 

More from Priyanka Aash

Digital Personal Data Protection (DPDP) Practical Approach For CISOs
Digital Personal Data Protection (DPDP) Practical Approach For CISOsDigital Personal Data Protection (DPDP) Practical Approach For CISOs
Digital Personal Data Protection (DPDP) Practical Approach For CISOsPriyanka Aash
 
Verizon Breach Investigation Report (VBIR).pdf
Verizon Breach Investigation Report (VBIR).pdfVerizon Breach Investigation Report (VBIR).pdf
Verizon Breach Investigation Report (VBIR).pdfPriyanka Aash
 
Top 10 Security Risks .pptx.pdf
Top 10 Security Risks .pptx.pdfTop 10 Security Risks .pptx.pdf
Top 10 Security Risks .pptx.pdfPriyanka Aash
 
Simplifying data privacy and protection.pdf
Simplifying data privacy and protection.pdfSimplifying data privacy and protection.pdf
Simplifying data privacy and protection.pdfPriyanka Aash
 
Generative AI and Security (1).pptx.pdf
Generative AI and Security (1).pptx.pdfGenerative AI and Security (1).pptx.pdf
Generative AI and Security (1).pptx.pdfPriyanka Aash
 
EVERY ATTACK INVOLVES EXPLOITATION OF A WEAKNESS.pdf
EVERY ATTACK INVOLVES EXPLOITATION OF A WEAKNESS.pdfEVERY ATTACK INVOLVES EXPLOITATION OF A WEAKNESS.pdf
EVERY ATTACK INVOLVES EXPLOITATION OF A WEAKNESS.pdfPriyanka Aash
 
Cyber Truths_Are you Prepared version 1.1.pptx.pdf
Cyber Truths_Are you Prepared version 1.1.pptx.pdfCyber Truths_Are you Prepared version 1.1.pptx.pdf
Cyber Truths_Are you Prepared version 1.1.pptx.pdfPriyanka Aash
 
Cyber Crisis Management.pdf
Cyber Crisis Management.pdfCyber Crisis Management.pdf
Cyber Crisis Management.pdfPriyanka Aash
 
CISOPlatform journey.pptx.pdf
CISOPlatform journey.pptx.pdfCISOPlatform journey.pptx.pdf
CISOPlatform journey.pptx.pdfPriyanka Aash
 
Chennai Chapter.pptx.pdf
Chennai Chapter.pptx.pdfChennai Chapter.pptx.pdf
Chennai Chapter.pptx.pdfPriyanka Aash
 
Cloud attack vectors_Moshe.pdf
Cloud attack vectors_Moshe.pdfCloud attack vectors_Moshe.pdf
Cloud attack vectors_Moshe.pdfPriyanka Aash
 
Stories From The Web 3 Battlefield
Stories From The Web 3 BattlefieldStories From The Web 3 Battlefield
Stories From The Web 3 BattlefieldPriyanka Aash
 
Lessons Learned From Ransomware Attacks
Lessons Learned From Ransomware AttacksLessons Learned From Ransomware Attacks
Lessons Learned From Ransomware AttacksPriyanka Aash
 
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)Priyanka Aash
 
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)Priyanka Aash
 
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)Priyanka Aash
 
Cloud Security: Limitations of Cloud Security Groups and Flow Logs
Cloud Security: Limitations of Cloud Security Groups and Flow LogsCloud Security: Limitations of Cloud Security Groups and Flow Logs
Cloud Security: Limitations of Cloud Security Groups and Flow LogsPriyanka Aash
 
Cyber Security Governance
Cyber Security GovernanceCyber Security Governance
Cyber Security GovernancePriyanka Aash
 

More from Priyanka Aash (20)

Digital Personal Data Protection (DPDP) Practical Approach For CISOs
Digital Personal Data Protection (DPDP) Practical Approach For CISOsDigital Personal Data Protection (DPDP) Practical Approach For CISOs
Digital Personal Data Protection (DPDP) Practical Approach For CISOs
 
Verizon Breach Investigation Report (VBIR).pdf
Verizon Breach Investigation Report (VBIR).pdfVerizon Breach Investigation Report (VBIR).pdf
Verizon Breach Investigation Report (VBIR).pdf
 
Top 10 Security Risks .pptx.pdf
Top 10 Security Risks .pptx.pdfTop 10 Security Risks .pptx.pdf
Top 10 Security Risks .pptx.pdf
 
Simplifying data privacy and protection.pdf
Simplifying data privacy and protection.pdfSimplifying data privacy and protection.pdf
Simplifying data privacy and protection.pdf
 
Generative AI and Security (1).pptx.pdf
Generative AI and Security (1).pptx.pdfGenerative AI and Security (1).pptx.pdf
Generative AI and Security (1).pptx.pdf
 
EVERY ATTACK INVOLVES EXPLOITATION OF A WEAKNESS.pdf
EVERY ATTACK INVOLVES EXPLOITATION OF A WEAKNESS.pdfEVERY ATTACK INVOLVES EXPLOITATION OF A WEAKNESS.pdf
EVERY ATTACK INVOLVES EXPLOITATION OF A WEAKNESS.pdf
 
DPDP Act 2023.pdf
DPDP Act 2023.pdfDPDP Act 2023.pdf
DPDP Act 2023.pdf
 
Cyber Truths_Are you Prepared version 1.1.pptx.pdf
Cyber Truths_Are you Prepared version 1.1.pptx.pdfCyber Truths_Are you Prepared version 1.1.pptx.pdf
Cyber Truths_Are you Prepared version 1.1.pptx.pdf
 
Cyber Crisis Management.pdf
Cyber Crisis Management.pdfCyber Crisis Management.pdf
Cyber Crisis Management.pdf
 
CISOPlatform journey.pptx.pdf
CISOPlatform journey.pptx.pdfCISOPlatform journey.pptx.pdf
CISOPlatform journey.pptx.pdf
 
Chennai Chapter.pptx.pdf
Chennai Chapter.pptx.pdfChennai Chapter.pptx.pdf
Chennai Chapter.pptx.pdf
 
Cloud attack vectors_Moshe.pdf
Cloud attack vectors_Moshe.pdfCloud attack vectors_Moshe.pdf
Cloud attack vectors_Moshe.pdf
 
Stories From The Web 3 Battlefield
Stories From The Web 3 BattlefieldStories From The Web 3 Battlefield
Stories From The Web 3 Battlefield
 
Lessons Learned From Ransomware Attacks
Lessons Learned From Ransomware AttacksLessons Learned From Ransomware Attacks
Lessons Learned From Ransomware Attacks
 
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
 
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
 
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
 
Cloud Security: Limitations of Cloud Security Groups and Flow Logs
Cloud Security: Limitations of Cloud Security Groups and Flow LogsCloud Security: Limitations of Cloud Security Groups and Flow Logs
Cloud Security: Limitations of Cloud Security Groups and Flow Logs
 
Cyber Security Governance
Cyber Security GovernanceCyber Security Governance
Cyber Security Governance
 
Ethical Hacking
Ethical HackingEthical Hacking
Ethical Hacking
 

Recently uploaded

How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfdanishmna97
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingScyllaDB
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdfMuhammad Subhan
 
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimaginedpanagenda
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuidePixlogix Infotech
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentationyogeshlabana357357
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...FIDO Alliance
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...ScyllaDB
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftshyamraj55
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxFIDO Alliance
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)Wonjun Hwang
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctBrainSell Technologies
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxFIDO Alliance
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxFIDO Alliance
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxFIDO Alliance
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe中 央社
 

Recently uploaded (20)

How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
 
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptx
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 

YOUR VOICE IS MY PASSPORT

  • 1. YOUR VOICE IS MY PASSPORT John Seymour and Azeem Aqil 1
  • 2. Introduction Voice is starting to be used for authentication 2
  • 3. Introduction Voice is starting to be used for authentication 3
  • 4. Introduction Voice is starting to be used for authentication 4
  • 5. Introduction Voice is starting to be used for authentication 5
  • 8. Obligatory Sneakers Reference And here’s how you do it.8
  • 9. • In Sneakers, they social engineered the target in order to record the exact words they needed • In practice, this is hard to do • Luckily, text-to-speech exists 9
  • 10. 10
  • 11. Overview of Text to Speech (TTS) ▪Essential idea: you give the algorithm text, and it generates the equivalent audio representation of that text (e.g. Mel Spectrograms) ▪Model learns the mapping between the transcript and audio ▪The way it does this is you give it labeled (transcribed) audio and feed it into a deep neural network ▪ Generally models are trained on a single person’s voice ▪ Generally deep learning models require a LARGE amount of labeled data ▪ Open source datasets (e.g. Blizzard, LJ Speech) ▪ >24 hours of labeled data to do well 11
  • 12. Proof of concept ▪ Create an account ▪ Record >30 sentences ▪ Chosen by Lyrebird, same for all users. ▪ Provide a target sentence that Lyrebird will generate. ▪ Apple Siri and Microsoft Speaker Recognition API (Public Beta) ▪ Proof of concept is of limited value from a security standpoint 12
  • 13. Proof of Concept Demo 13
  • 14. Open source TTS models ▪Several open source models (Tacotron, Wavenet are best known) ▪ WaveNet generates realistic human sounding output, however, needs to be ‘tuned’ significantly. ▪Tacotron simplifies this process greatly ▪ The production of the feature set (which needs tuning in WaveNet) is replaced by another NN that works directly off data ▪We use Tacotron 14
  • 16. Tacotron v1, Tacotron v2, and WaveNet 16
  • 17. Collecting Data ▪That’s all well and good, but in order to train a model, we need to feed it data ▪ Can grab audio from e.g. YouTube, but quality/quantity are both important ▪Need to transcribe this data ▪ Youtube/Google Speech API not good enough ▪ We also had to cut pieces with poor quality or with “um” ▪ We had to manually transcribe the data we used for training ▪Most open source models require short (<10 second) snippets of audio ▪ Use ffmpeg to split the file into chunks 17
  • 18. Data Augmentation ▪Publicly available data for a specific target is probably limited ▪Transcribing is time intensive since it must be done manually ▪Need LARGE amount of high quality training data ▪Solution: Augment Data 18
  • 19. Data Augmentation: Shifting Pitch ▪Slow down and speed up audio to generate new examples ▪Libraries (pydub) available for this ▪Measured how far pitch can be raised/lowered by recording “Hey Siri” and testing how much speed up/slowdown would be accepted ▪ 0.88x to 1.21x for our tests ▪ YMMV for exact parameter (probably different for every person) 19
  • 20. Data Augmentation: Transfer Learning 1.Initially train on large open source dataset (Blizzard, LJSpeech) 2.Get a good model, stop training 3.Replace open source dataset with the target data 4.Continue to train 20
  • 22. Putting it all together 1.Scrape data from target (e.g. Youtube) 2.Select high-quality samples 3.Transcribe and chunk audio 4.Augment audio by shifting pitch 5.Train general TTS model on open source dataset 6.Replace general model training data with target data; finish training 7.Synthesize voice from model 22
  • 23. Our work in perspective (ML for offense) ▪Attacks on ML systems ▪ Adversarial Attacks ▪ Most prior work attacking voice systems utilize GANs ▪ Pro: hiding commands within benign-sounding audio ▪ Con: method is currently brittle ▪ (We use the simpler approach of generating speech for a given user) ▪ Poisoning the Well ▪ Privacy/Differential Privacy ▪ Attacks using ML systems ▪ Phishing ▪ DeepFakes ▪ Robotics/Social Engineering 23
  • 24. Mitigation: MFA ▪Defense in Depth ▪Potential issue: Speaker Recognition with unknown vocabulary is hard ▪Potential issue: Passphrases may not be kept secret 24 Image credit: XKCD.com
  • 26. Conclusions ▪Speaker authentication and speaker recognition are different problems. Recognition is only a [weak] signal for “authenticating”. ▪Speaker authentication can be broken if the attacker has speech data of the target and knows the authentication prompt. ▪Although most TTS systems require 24 hours of speech to train, transfer learning is an effective way to reduce that time to an amount realistic for an attacker to abuse. Transfer learning is effective for reducing data requirements generally. ▪In conclusion, it’s relatively easy to spoof someone’s voice ▪ Will only get easier over time 26
  • 27. Final note 27 Be afraid. Be very afraid.
  • 28. ?John Seymour and Azeem Aqil