Voice authentication systems: are they secure? Can AI be used to fool them?
Bhusan Chettri explains how voice authentication systems can be fooled using AI and how they
can be protected.
Although today's speaker verification systems, driven by deep learning and big data, show superior
performance in verifying a speaker, they are not secure: they are prone to spoofing attacks. In this
article, Dr. Bhusan Chettri gives an overview of the technology used to spoof a voice authentication
system that uses automatic speaker verification (ASV) technology.
Spoofing attacks in ASV: an overview by Dr Bhusan Chettri
A spoofing attack (or presentation attack) is an attempt to gain illegitimate access to the personal data
of a targeted user by presenting falsified biometric data. These attacks are performed on a biometric
system to provoke an increase in its false acceptance rate. The security threats posed by such attacks
are now well acknowledged within the speech community. As identified in the ISO/IEC 30107-1 standard,
a biometric system can potentially be attacked at nine different points, which Fig. 1 summarises. The
first two attack points are of specific interest because they are particularly vulnerable: they allow an
adversary to inject spoofed biometric data. Attacks at these two points are commonly referred to as
physical access (PA) and logical access (LA) attacks. As illustrated in the figure, PA attacks present
spoofed speech at the sensor (the microphone, in the case of ASV), whereas LA attacks modify biometric
samples so as to bypass the sensor. Text-to-speech and voice conversion techniques, which produce
artificial speech to bypass an ASV system, are examples of LA attacks. Mimicry and playing back speech
recordings (replay), on the other hand, are examples of PA attacks.
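To make the effect on the false acceptance rate concrete, the following minimal sketch computes it for two sets of trials. The scores, the threshold, and the function name are made up for illustration; they are not taken from any real ASV system.

```python
def false_acceptance_rate(scores, threshold):
    """Fraction of non-target trials whose score reaches the acceptance threshold."""
    if not scores:
        raise ValueError("need at least one trial score")
    accepted = sum(1 for s in scores if s >= threshold)
    return accepted / len(scores)


# Hypothetical ASV similarity scores (higher = more like the claimed speaker).
zero_effort_impostors = [0.10, 0.22, 0.35, 0.41, 0.58]
spoofed_trials = [0.55, 0.63, 0.71, 0.48, 0.80]
THRESHOLD = 0.5  # illustrative operating point, not a recommended value

far_baseline = false_acceptance_rate(zero_effort_impostors, THRESHOLD)
far_spoofed = false_acceptance_rate(spoofed_trials, THRESHOLD)
print(f"FAR, zero-effort impostors: {far_baseline:.2f}")
print(f"FAR, spoofed trials:        {far_spoofed:.2f}")
```

With these invented numbers, one of five ordinary impostor trials is accepted but four of five spoofed trials are, which is the kind of increase a spoofing attack aims for.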
Figure 1: Possible locations [ISO/IEC, 2016] to attack an ASV system. 1: microphone point, 2:
transmission point, 3: override feature extractor, 4: modify features, 5: override classifier, 6: modify
speaker database, 7: modify biometric reference, 8: modify score and 9: override decision.
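The nine attack points in the caption, and the PA/LA split described in the text, can be written down as a small lookup structure. This is only an illustrative encoding: the identifier names are assumptions chosen for readability, not terms from the standard.

```python
from enum import Enum


class AttackPoint(Enum):
    """The nine attack points listed in the caption of Fig. 1."""
    MICROPHONE = 1
    TRANSMISSION = 2
    OVERRIDE_FEATURE_EXTRACTOR = 3
    MODIFY_FEATURES = 4
    OVERRIDE_CLASSIFIER = 5
    MODIFY_SPEAKER_DATABASE = 6
    MODIFY_BIOMETRIC_REFERENCE = 7
    MODIFY_SCORE = 8
    OVERRIDE_DECISION = 9


# Only the first two points carry the PA/LA labels used in the article.
ACCESS_TYPE = {
    AttackPoint.MICROPHONE: "physical access (PA)",
    AttackPoint.TRANSMISSION: "logical access (LA)",
}

# Where each of the four spoofing methods enters the system.
SPOOFING_METHODS = {
    "mimicry": AttackPoint.MICROPHONE,
    "replay": AttackPoint.MICROPHONE,
    "speech synthesis": AttackPoint.TRANSMISSION,
    "voice conversion": AttackPoint.TRANSMISSION,
}


def access_type(method):
    """Return the PA/LA label for one of the four spoofing methods."""
    return ACCESS_TYPE[SPOOFING_METHODS[method]]


for name in SPOOFING_METHODS:
    print(f"{name}: {access_type(name)}")
```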
Below, Bhusan Chettri provides a brief summary of the four spoofing methods used to fool an
ASV system.
1. Mimicry (or Impersonation)
This form of attack involves an attacker modifying their voice characteristics to sound like a target
speaker. In other words, the attacker tries to adapt their lexical and prosodic properties to sound as
close as possible to the target speaker. The attack can be highly effective when the attacker's voice is
naturally similar to the target's, because less adjustment is required than when the two voices differ.
Success also depends on the quality of the impersonation, which suggests that professional
impersonators may be better at mimicking a target speaker's voice than inexperienced ones. Research has
shown that successful attackers were able to shift their F0 (fundamental frequency), and sometimes
their formants, towards those of the target speaker.
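For readers unfamiliar with F0, the sketch below estimates it from a short frame using plain autocorrelation, one simple textbook approach. The signals are synthetic tones standing in for voiced speech, and the 100 Hz and 125 Hz values, the sample rate, and the function names are all illustrative assumptions.

```python
import math


def synth_tone(f0_hz, sample_rate=8000, num_samples=800):
    """Synthetic stand-in for a voiced speech frame: a pure tone at f0_hz."""
    return [math.sin(2 * math.pi * f0_hz * i / sample_rate)
            for i in range(num_samples)]


def estimate_f0(frame, sample_rate=8000, fmin=50.0, fmax=400.0):
    """Estimate F0 as the lag with the largest autocorrelation value."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


attacker_f0 = estimate_f0(synth_tone(100.0))  # hypothetical attacker voice
target_f0 = estimate_f0(synth_tone(125.0))    # hypothetical target voice
print(f"attacker F0 ~ {attacker_f0:.1f} Hz, target F0 ~ {target_f0:.1f} Hz")
print(f"shift an impersonator would need: {target_f0 - attacker_f0:+.1f} Hz")
```

Real speech is far less regular than a pure tone, so practical pitch trackers add voicing detection and smoothing on top of this idea.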
2. Speech synthesis
Speech synthesis, or text-to-speech (TTS), is a method for generating speech from a given text input
that sounds as natural and intelligible as possible. It has a wide range of applications, including
spoken dialogue systems, speech-to-speech translation, assisting people with vocal disorders, and
automatic e-book reading, to name a few. Text analysis and speech waveform generation are the two main
components of a typical TTS system. The text analysis component analyses the input text and produces
a sequence of phonemes defining the linguistic specification of the text. Using these phonemes, the
speech waveform generation module produces the speech waveform. In end-to-end deep learning
frameworks, however, speech waveforms are generated directly from the input text.
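The two-stage structure described above can be sketched as two functions chained together. This is a toy that shows only the data flow: the two-word lexicon and the phoneme-to-tone table are invented, and real waveform generation is far more sophisticated than emitting one tone per phoneme.

```python
import math

# Hypothetical pronunciation lexicon (assumption: tiny, two words only).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# Hypothetical phoneme-to-frequency table, used only to produce some audio.
PHONEME_TONE_HZ = {
    "HH": 200, "AH": 300, "L": 250, "OW": 350,
    "W": 220, "ER": 320, "D": 180,
}


def text_analysis(text):
    """Stage 1: turn input text into a phoneme sequence."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON[word])  # raises KeyError for unknown words
    return phonemes


def generate_waveform(phonemes, sample_rate=8000, phoneme_ms=50):
    """Stage 2: turn the phoneme sequence into audio samples."""
    samples_per_phoneme = sample_rate * phoneme_ms // 1000
    waveform = []
    for phoneme in phonemes:
        freq = PHONEME_TONE_HZ[phoneme]
        waveform.extend(math.sin(2 * math.pi * freq * i / sample_rate)
                        for i in range(samples_per_phoneme))
    return waveform


phonemes = text_analysis("Hello world")
waveform = generate_waveform(phonemes)
print(phonemes)
print(f"{len(waveform)} samples generated")
```

An end-to-end system would replace both functions with a single learned model that maps text directly to a waveform.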
3. Voice conversion
Voice conversion (VC) aims to convert the voice of one speaker into that of another. In the context of
ASV spoofing, the source voice is the attacker's, and it is converted to that of a target speaker in order
to fool an ASV system. Typical VC systems operate directly on the speech signals of the source and target
speakers. They use a parallel corpus of the two speakers (speaking the same utterances), from which a
transformation function is learned that converts the attacker's acoustic parameters to those of the
target speaker. Applications of VC technologies include producing natural-sounding voices for people
with speech disabilities and voice dubbing in the entertainment industry, to name a few.
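As a minimal sketch of learning a transformation function from parallel data, the code below fits a one-dimensional affine map by ordinary least squares. It assumes the paired frames are already time-aligned and uses a single made-up acoustic parameter (F0 in Hz). Real VC systems map high-dimensional spectral features with much richer models, so this only illustrates the idea.

```python
def fit_affine(source, target):
    """Least-squares fit of target ~ slope * source + intercept."""
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("need at least two aligned source/target pairs")
    n = len(source)
    mean_x = sum(source) / n
    mean_y = sum(target) / n
    var_x = sum((x - mean_x) ** 2 for x in source)
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(source, target))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def convert(value, slope, intercept):
    """Apply the learned transformation to a new attacker parameter."""
    return slope * value + intercept


# Hypothetical parallel corpus: the same frames spoken by both speakers.
attacker_f0 = [100.0, 110.0, 120.0, 130.0]
target_f0 = [150.0, 162.0, 174.0, 186.0]

slope, intercept = fit_affine(attacker_f0, target_f0)
converted = convert(105.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"attacker 105.0 Hz -> converted {converted:.1f} Hz")
```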
4. Replay attacks
A replay spoofing attack involves playing back recorded speech samples of a target (enrolled) speaker
to bypass an ASV system. This type of attack requires physical transmission of the spoofed speech
through the system microphone, shown as point 1 in Fig. 1. Replay is the simplest form of spoofing
attack: it can be carried out with a smartphone and requires no specific expertise in either speech
processing or machine learning.
Bona fide (genuine) speech is speech spoken by a target speaker during the enrollment or verification
phase and acquired by the ASV system's microphone. Replayed speech, on the other hand, is the speech
signal obtained by playing back pre-recorded bona fide speech, which is then acquired by the system's
microphone. The acoustic environment in which the bona fide speech and the replayed speech are
acquired can be the same, as when an attacker manages to launch the attack from the same physical
space. In practice, however, the acoustic space is usually different (e.g. a different closed room or
office with no background noise), as an attacker would not want to risk getting caught while launching
the attack. The factors of interest in detecting replay attacks are therefore the changes and noise
introduced into the bona fide speech by the loudspeaker of the playback device, by the recording
device, and by the acoustic environment in which the replay attack is carried out.
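One common simplification, assumed here rather than taken from the article, models the playback chain as linear filtering plus additive noise: the bona fide signal is convolved with a device impulse response and a room impulse response, and noise is added. The impulse responses and noise level below are made-up illustrative values.

```python
import random


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def simulate_replay(bonafide, device_ir, room_ir, noise_std=0.01, seed=0):
    """Pass bona fide speech through a playback device, a room, and noise."""
    rng = random.Random(seed)
    played = convolve(bonafide, device_ir)  # loudspeaker colouring
    captured = convolve(played, room_ir)    # room reverberation
    return [x + rng.gauss(0.0, noise_std) for x in captured]


# Hypothetical short signal and impulse responses.
bonafide = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
device_ir = [0.8, 0.15, 0.05]    # made-up loudspeaker response
room_ir = [1.0, 0.0, 0.3, 0.1]   # made-up room echoes

replayed = simulate_replay(bonafide, device_ir, room_ir)
print(f"bona fide: {len(bonafide)} samples, replayed: {len(replayed)} samples")
```

The differences between `bonafide` and `replayed` in this model are exactly the kind of device and environment traces a replay detector looks for.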
Given these vulnerabilities, it is very important to secure voice authentication systems against
manipulation. To this end, spoofing countermeasures are often integrated within the verification
pipeline, and voice spoofing countermeasures are currently an active research topic within the speech
research community. In the next article, Dr Bhusan Chettri will discuss how AI and big data can be used
to design anti-spoofing solutions that protect voice authentication systems from spoofing attacks.
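One possible way to integrate a countermeasure into the verification pipeline is a simple cascade, in which a trial must pass the spoofing check before the speaker check counts. The sketch below assumes both modules output a score where higher means more trustworthy; the thresholds and function name are illustrative, and other integration schemes exist.

```python
def verify(asv_score, cm_score, asv_threshold=0.5, cm_threshold=0.5):
    """Cascade decision: countermeasure (CM) first, then speaker verification.

    Assumes higher cm_score means more likely bona fide and higher asv_score
    means more likely the claimed speaker (both conventions are assumptions).
    """
    if cm_score < cm_threshold:
        return "reject: spoof suspected"
    if asv_score < asv_threshold:
        return "reject: speaker mismatch"
    return "accept"


# Hypothetical trials as (ASV score, CM score) pairs.
trials = {
    "genuine user": (0.9, 0.8),
    "replayed recording": (0.9, 0.2),
    "different speaker": (0.1, 0.9),
}
for name, (asv_score, cm_score) in trials.items():
    print(f"{name}: {verify(asv_score, cm_score)}")
```

The replayed recording scores well on speaker similarity yet is still rejected, which is the point of placing a countermeasure alongside the ASV system.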
References
[1] Bhusan Chettri. Scholar and personal website.
[2] M. Sahidullah et al. Introduction to Voice Presentation Attack Detection and Recent Advances, 2019.
[3] Bhusan Chettri. Voice biometric system security: Design and analysis of countermeasures for replay
attacks. PhD thesis, Queen Mary University of London, August 2020.
[4] ASVspoof: The automatic speaker verification spoofing and countermeasures challenge website.
