Money has been withdrawn from your account. You don't remember making or authorising that transaction. When you follow up with the bank, they say you called earlier and requested the transfer. It was, after all, you speaking, right? Unbeknownst to you, your voice was stolen, and so was your money. As voice-authentication biometrics become more widespread, so do the opportunities to spoof them. Text-to-speech APIs are constantly improving, and Google's technology is now all but indistinguishable from a real human speaker. Threat actors have access to a target's YouTube videos and social media posts. Even more invasive channels are the vulnerable IoT devices littered throughout homes and offices. Social media posts and IoT devices allow threat actors to listen to your voice, capture it and then manipulate it, all using free online tools. So what exactly can be done with a 'stolen' voice? This research explores the possibility of banking fraud that uses voice spoofing to bypass authentication and withdraw funds.
12. HOW A VOICE IS MADE
• The Part That Makes Sound:
– string of a violin
– vocal folds
• The Part That Shapes Sound:
– body of a violin
– supralaryngeal articulators
13. HOW A VOICE IS MADE
supralaryngeal articulators/vocal tract
14. VOICE RECOGNITION
Phonemes:
• r eh k ao g n ay z s p iy ch
– "recognise speech"
• r eh k ay n ay s b iy ch
– "wreck a nice beach"
15. VOICE RECOGNITION
• Homophones
– there and their
– air and heir
– be and bee
• Speech
– accent
– tempo & mumble
– enunciation
16. VOICE RECOGNITION
• How does software understand WHAT we are saying?
• NATURAL LANGUAGE PROCESSING:
– CONTEXT and/or PROBABILITY
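A minimal sketch of the "probability" idea, assuming a toy bigram language model: both phoneme strings from the earlier slide sound almost the same, so the software prefers the word sequence that is more likely in ordinary language. The counts, the names `BIGRAM_COUNTS`, `UNIGRAM_COUNTS`, `log_prob` and `pick_transcription`, and the add-one smoothing are all assumptions made up for illustration; real recognisers use far larger models.

```python
import math

# Hypothetical counts, invented purely for illustration.
# "<s>" marks the start of a sentence.
BIGRAM_COUNTS = {
    ("<s>", "recognise"): 8,
    ("recognise", "speech"): 6,
    ("<s>", "wreck"): 2,
    ("wreck", "a"): 2,
    ("a", "nice"): 5,
    ("nice", "beach"): 1,
}
UNIGRAM_COUNTS = {
    "<s>": 20, "recognise": 8, "speech": 6,
    "wreck": 2, "a": 30, "nice": 5, "beach": 1,
}
VOCAB_SIZE = len(UNIGRAM_COUNTS)


def log_prob(sentence):
    """Log-probability of a sentence under the toy bigram model."""
    words = ["<s>"] + sentence.lower().split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        # Add-one smoothing so unseen word pairs never get probability zero.
        numerator = BIGRAM_COUNTS.get((prev, cur), 0) + 1
        denominator = UNIGRAM_COUNTS.get(prev, 0) + VOCAB_SIZE
        total += math.log(numerator / denominator)
    return total


def pick_transcription(candidates):
    """Return the candidate transcription the model finds most likely."""
    return max(candidates, key=log_prob)


best = pick_transcription(["wreck a nice beach", "recognise speech"])
print(best)
```

With these made-up counts the model prefers "recognise speech"; different counts, or surrounding context about a day at the seaside, could tip the choice the other way.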
18. VOICE VERIFICATION
How voices are identified as unique:
• Text Dependent:
– Same passphrase during sign-up and verification
• Text Independent:
– Verifying the identity without constraint on the speech content
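The two modes above can be sketched as follows. This is an illustrative assumption, not how any particular bank or vendor does it: a voice is assumed to be reduced to a fixed-length "voiceprint" vector, and two voiceprints are compared with cosine similarity against an arbitrary threshold of 0.8. The function names and the example vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_speaker(enrolled_print, sample_print, threshold=0.8,
                   enrolled_phrase=None, spoken_phrase=None):
    """Accept the caller if the voiceprints are similar enough.

    Text independent: leave the phrase arguments as None.
    Text dependent: the spoken phrase must also match the enrolled one.
    """
    if enrolled_phrase is not None:
        if spoken_phrase is None:
            return False
        if spoken_phrase.strip().lower() != enrolled_phrase.strip().lower():
            return False
    return cosine_similarity(enrolled_print, sample_print) >= threshold


# Hypothetical voiceprints, invented for illustration.
enrolled = [0.9, 0.1, 0.4]
same_speaker = [0.85, 0.15, 0.38]
other_speaker = [0.1, 0.9, 0.2]

print(verify_speaker(enrolled, same_speaker))    # text independent
print(verify_speaker(enrolled, other_speaker))
```

Note that both modes only test whether the audio is close enough to the enrolled voice; neither one tests whether a live human produced it.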
19. EACH VOICE IS “UNIQUE”
Listen up, maggots. You are not special.
You are not a beautiful or unique snowflake.
You're the same decaying organic
matter as everything else.
Tyler Durden - Fight Club
20. THE LAW
• Protection of Personal Information Act
– Biometrics means a technique of personal identification that is based on physical, physiological or behavioural characterisation, including blood typing, fingerprinting, DNA analysis, retinal scanning and voice recognition.
21. THE LAW
• Verbal agreements are no less binding than written agreements.
• Having your voice spoofed is theft!
22. IN THE WILD
Many predictions about voice spoofing attacks…
• Breaking Smart Speakers: We Are Listening To You - Tencent Blade Team
• Your Voice is My Passport – Seymour & Aqil
• This AI Can Clone Any Voice, Including Yours - Bloomberg
• Can-You-Hear-Me Scam - Criminals
• Vocal theft on the horizon - Taylor Armerding
• Busted: Thousands Of Amazon Employees Listening To Alexa Conversations – Durden
• Digital Voice Assistants: The New Front in the War on IoT Hackers – Edwards
• Balancing safety and convenience with biometrics – Gool
25. IN THE WILD
• Attacks require a lot of preparation, time, ‘expertise’ and resources.
• Attack vector will become much easier.
• ‘High profile’ individuals = more risk.
32. DEEPTHROAT – STEP 1
• Input audio
– e.g. recording from IoT, or videos from YouTube
• Cut input into 10-second clips
• Pass to a Subtitle API
– Manually check subtitles/transcription
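The clip-cutting step is ordinary audio preprocessing. A minimal sketch using only Python's standard `wave` module is shown here; it assumes the input is an uncompressed WAV file, and the function name `split_wav` and the output naming scheme are hypothetical, not taken from the tool described in the talk.

```python
import wave


def split_wav(path, clip_seconds=10, prefix="clip"):
    """Split a WAV file into consecutive clips of at most clip_seconds each.

    Returns the list of clip file paths that were written.
    """
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_clip = src.getframerate() * clip_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_clip)
            if not frames:
                break  # end of the input file
            out_path = f"{prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channels, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

The last clip is simply shorter when the recording length is not a multiple of ten seconds.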
33. DEEPTHROAT – STEP 2
• The App takes the input audio and processes it through the Deep Learning model
34. DEEPTHROAT – STEP 3
• Threat Actor prepares the Attack Script
• Once the spoofed voice is generated, the App automatically creates the sentences from the Attack Script
35. DEEPTHROAT – STEP 4
• Connect VAC
– Virtual Audio Cable allows sound from one application to be passed to another.
• Connect to a Phone-Call App (e.g. Skype)
– Can make international calls
– Number cannot be easily traced back
36. DEEPTHROAT – STEP 5
• Ambiance
– Background noise can be added to cover up poorly generated syllables or to make the audio sound more realistic