Bhusan Chettri explains how your unique VOICE can be used for automatic
authentication and challenges towards the security of voice authentication systems
· Dr. Bhusan Chettri, a Ph.D. graduate from Queen Mary University of London (QMUL),
explains the science behind voice biometrics and the different types of voice biometric
systems

· Spoofing attacks on voice biometric systems: a growing concern regarding their
security
 

Voice biometrics, in simple words, refers to the technology used to automatically recognise
a person from a sample of their voice. Every person possesses a unique vocal apparatus, and
therefore the characteristics and features of an individual's voice are distinct. This is one of
the key reasons for the wide adoption of voice as a means of person authentication across the
globe. In this article, Dr Bhusan Chettri explains the basics of voice biometrics and briefly
discusses the growing concern about its security against fake voices generated using
computers and AI technology.

Voice biometrics is commonly referred to as automatic speaker verification (ASV). Building
such a system on a computer involves two key steps.
 

Training phase: this involves building a universal voice template, also known as a universal
speaker template (or model), using large amounts of voice samples collected from people of
different cultural backgrounds and ethnicities and from different regions across the world.
The more data collected under diverse environmental conditions from a large speaker
population, the better the universal template will be, because with such diverse data the
template can capture and represent the general voice pattern of speakers across the world.
The voice template (also referred to as a speaker model) is simply a large table (or matrix)
of numbers learned during training, such that each number in the table represents some
meaningful information about the speaker which the computer understands but which is hard
for humans to interpret. As illustrated in the top row of Figure 1, this step is often called
the offline training phase.
 

Figure 1. Training phase. The goal here is to build speaker-specific models by adapting
a background model which is trained on a large speech database.

Here, the feature extraction step simply gathers relevant information from the voice/speech
samples of speakers and uses it for building the voice template. The training step then
takes the features extracted from the voices and applies a computer algorithm to learn
patterns across different voices. As a result, this step produces the so-called background
model, which is nothing but the universal speaker template representing the whole
speaker/voice population. The next key step in the training phase is building a
speaker-specific model, or voice template, for a designated speaker by making use of the
universal speaker template. One interesting point to note is that this step, also called
speaker or voice registration, does not require a huge amount of voice samples from the
specific target speaker. Indeed, it is impractical to collect thousands of hours of
speech/voice samples from one speaker. This is the reason why a universal speaker/voice
template is created and then adapted to build a speaker-specific template. What this means
is that, using a small amount of voice data (usually a 5-10 second or one-minute speech
sample), the large table (universal voice template) is adjusted to represent the specific
speaker. It should also be noted that this speaker registration often happens on the fly. For
example, in a voice-based banking application, the system often asks users to speak a certain
phrase such as “my voice is my password” a couple of times. What is happening here is
that the universal voice template is being adjusted to suit the user’s voice pattern. Once
this succeeds, a voice template/model for the specific user is created.
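The two training steps described above can be sketched in a few lines of Python. This is only an illustration under strong assumptions that the article does not state: each voice is reduced to a list of one-dimensional feature values, the universal template is a single Gaussian (a mean and a variance), and registration moves the universal mean towards the new speaker's data using a hypothetical `relevance` weight. Real systems use far richer models; the function names here are made up for the example.

```python
from statistics import fmean, pvariance


def train_background_model(pooled_features):
    """Fit a 1-D Gaussian (mean, variance) to features pooled over many speakers.

    Assumption: a single Gaussian stands in for the universal template.
    """
    return fmean(pooled_features), pvariance(pooled_features)


def enroll_speaker(background, enrollment_features, relevance=16.0):
    """Adapt the background mean towards a small amount of enrollment data.

    With little data the speaker model stays close to the background model;
    with more data it moves towards the speaker's own average.
    `relevance` is a hypothetical tuning constant, not a value from the article.
    """
    bg_mean, bg_var = background
    n = len(enrollment_features)
    alpha = n / (n + relevance)  # weight given to the new speaker's data
    speaker_mean = alpha * fmean(enrollment_features) + (1 - alpha) * bg_mean
    return speaker_mean, bg_var  # variance is kept from the background model


# Toy data: four pooled feature values, then 16 enrollment values from one user.
background = train_background_model([0.0, 2.0, 4.0, 6.0])
speaker = enroll_speaker(background, [5.0] * 16)
```

The design choice to adapt only the mean and keep the background variance reflects the point made in the text: a few seconds of speech are enough to shift an existing template, but not enough to learn a whole new one from scratch.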

Verification phase: the second step in voice biometrics is called the speaker verification
phase. Here, the system accepts a test speech/voice sample as input and extracts
relevant features from it. The system then matches this new speech/voice sample against the
voice template of the claimed speaker (which was already created during the training
phase). The result is a number, or score, that indicates the degree of match. The system
also scores the new voice sample against the universal voice template. Finally, the
difference between the score from the speaker voice template and the score from the
universal voice template (called the log-likelihood ratio in ASV terminology) is used as the
final score to decide whether to accept or reject the claimed identity. A higher score
difference usually corresponds to a higher probability that the new voice sample belongs to
the claimed identity. This process is illustrated in Figure 2.

Figure 2. Speaker verification phase. For a given speech utterance the system obtains a
verification score and makes a decision whether to accept or reject the claimed identity.
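The scoring step of the verification phase can be illustrated with a small sketch. It assumes, purely for the example, that both the speaker template and the universal template are one-dimensional Gaussians given as `(mean, variance)` pairs and that the decision threshold is 0; the article does not specify the model type or the threshold, and the function names are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log-density of a 1-D Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def llr_score(test_features, speaker_model, background_model):
    """Average log-likelihood ratio: speaker score minus universal score."""
    total = 0.0
    for x in test_features:
        total += log_gaussian(x, *speaker_model) - log_gaussian(x, *background_model)
    return total / len(test_features)


def verify(test_features, speaker_model, background_model, threshold=0.0):
    """Accept the claimed identity when the score reaches the threshold."""
    return llr_score(test_features, speaker_model, background_model) >= threshold


# Toy models: the claimed speaker is centred at 4.0, the population at 0.0.
speaker_model = (4.0, 1.0)
background_model = (0.0, 1.0)
accepted = verify([4.0, 4.0], speaker_model, background_model)  # close to the speaker
rejected = verify([0.0], speaker_model, background_model)       # close to the population
```

A positive score means the test sample is better explained by the claimed speaker's template than by the universal one, which is why a higher score difference supports accepting the claim.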

Types of ASV systems. Depending on the level of user cooperation, ASV systems are
often classified into two types: text-dependent and text-independent systems. In
text-dependent applications, the system has prior knowledge of the text being spoken and
therefore expects the same utterance whenever the user accesses the biometric system.
Banking applications are an example of this scenario. In text-independent systems, by
contrast, there are no such restrictions. Users can speak any phrase during registration and
while accessing the system. An example would be forensic applications, where users may not
cooperate in speaking the phrase they are asked to during interrogations.

Bhusan Chettri further elucidated: one interesting question that might now pop up in the
reader's mind concerns the usage of this technology. Where is this technology used?
What are its applications?

Applications

ASV systems can be used in a wide range of applications across different domains.
1. Access control: controlling access to electronic devices and other facilities using
voice.
2. Speaker diarization: identifying who spoke when.
3. Forensics: matching voice templates against pre-recorded voices of criminals.
4. Retrieval of customer information in call centres using voice indexing.
5. Surveillance applications.

Advantages
There are many advantages to using this technology. One interesting advantage is that with
voice biometrics users no longer have to worry about remembering long, complex
passwords. Just by speaking the unlock phrase (for example, “my voice is my password”),
users can access the application (for example, a banking app or personalised digital
accessories).

Common errors in ASV
Like any other computer system (or machine learning model), an ASV system can make
mistakes while it is up and running. There are two common types of error it can make:
false acceptance and false rejection. False acceptance means that the system has wrongly
accepted an unknown (or unregistered) speaker. False rejection refers to a situation where
the system rejects the true speaker. This may happen, for example, when a user attempts to
access the voice biometric system in very noisy conditions (with severe background noise),
so that the system is not confident in recognising the speaker’s voice.
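The two error types can be counted directly once a set of verification scores is available. The sketch below assumes two hypothetical lists of scores, one from true speakers (genuine trials) and one from other speakers (impostor trials), and assumes a claim is accepted when its score is at or above the threshold. The numbers are invented for illustration.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false acceptance rate, false rejection rate) at a threshold.

    A trial is accepted when its score is >= threshold (an assumption of
    this sketch).
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Invented scores: one genuine trial (0.2) could be a recording made in noise.
genuine = [2.0, 1.5, 0.2, 3.0]
impostor = [-1.0, 0.5, -0.3, 1.8]
far, frr = error_rates(genuine, impostor, threshold=1.0)
```

Raising the threshold makes false acceptances rarer and false rejections more frequent, and lowering it does the opposite, so the two errors always have to be traded against each other.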

How good is voice biometrics? Evaluation metrics

To decide whether the trained biometric system is good or not, an evaluation metric is
required, Bhusan explained. A commonly used metric in ASV is the Equal Error Rate (EER).
The EER corresponds to the operating point at which the false acceptance rate and the false
rejection rate are equal. To reach this point, the decision threshold for accepting or
rejecting a speaker is carefully adjusted during system development (and this adjustment
varies across application domains). Researchers and ASV system developers aim to
minimise these error rates. The lower the EER, the better the ASV system.
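One simple way to estimate the EER from a set of scores is to try every observed score as a threshold and keep the one where the two error rates are closest. The sketch below does exactly that. It is an approximation under assumed inputs (two hypothetical score lists, acceptance when the score is at or above the threshold); evaluation toolkits typically interpolate between thresholds rather than picking the nearest one.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping the threshold over all observed scores.

    Returns (eer, threshold), where eer is the average of the false acceptance
    and false rejection rates at the threshold where they are closest.
    """
    best_gap = None
    best_eer = None
    best_threshold = None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        far = sum(1 for s in impostor_scores if s >= t) / len(impostor_scores)
        frr = sum(1 for s in genuine_scores if s < t) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
            best_threshold = t
    return best_eer, best_threshold


# Invented scores for illustration only.
genuine = [2.0, 1.5, 0.2, 3.0]
impostor = [-1.0, 0.5, -0.3, 1.8]
eer, threshold = equal_error_rate(genuine, impostor)
```

With these toy scores the two error rates meet at a threshold of 1.5, where one of four impostor trials is accepted and one of four genuine trials is rejected, giving an EER of 25%.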

Security of Voice biometrics: a growing concern
One of the key problems with the use of voice biometric applications is the growing
concern about their security. With recent advances in technology, there are
commercial applications (available online) capable of producing voices that sound as
natural as if spoken by a real human. It is very difficult for human ears to detect whether
a voice was created using computer algorithms. Fraudsters and attackers therefore launch
spoofing attacks on voice biometric systems in order to gain illegitimate access to
someone else’s account (say, a banking application, with the aim of stealing money).
However, researchers in the speech community, Bhusan Chettri among them, have been
working hard on the design and development of spoofing countermeasures to safeguard
voice biometrics from fraudulent access. The next article, a follow-up to this one, will
explain more about spoofing attacks on voice biometrics and the mechanisms and
algorithms used to counter such attacks.
 

