A TEXT INDEPENDENT SPEAKER
RECOGNITION SYSTEM FOR
DETECTING CRIMINALS
An M.Sc. Oral Defence SEMINAR
By
AMUSA AFOLARIN IBRAHIM
(PG13 /0025)
Department of Computer Science
Federal University of Agriculture, Abeokuta
Supervisory team:
Prof. A. S. Sodiya
Dr. O. R. Vincent
Prof. A. A. A Agboola
OUTLINE
• Introduction
• Motivation
• Problem statement
• Research objectives
• Literature Review
• Research design and methodology
• Implementation
• Contribution to knowledge
• Recommendation
• Conclusion and future work
• References
INTRODUCTION
• In criminal law, kidnapping is the unlawful taking away or
transportation of a person against the person’s will, usually to
hold the person unlawfully (Gary, 2007).
• Kidnapping is a major challenge in society that is yet to be
eradicated, owing to a number of factors ranging from political to
socio-economic problems.
• Kidnapping may be carried out for ransom, in furtherance of another
crime, or in connection with a child custody dispute. After
successfully abducting a victim, a kidnapper typically contacts the
victim’s family with one or more demands.
INTRODUCTION
• Speaker recognition is the process that enables machines to
understand and interpret human speech using certain algorithms,
and to verify the authenticity of a speaker with the help of a
database (Teunen et al., 2000; Reynolds, 2000).
• Speaker recognition is the process of automatically recognizing a
speaker’s voice on the basis of individual information included
in the input speech waves (Parul and Dubey, 2012).
• Speaker recognition is generally divided into two tasks:
• Speaker Verification (SV)
• Speaker Identification (SI)
INTRODUCTION
Fig. 1 Human speech production system.
INTRODUCTION
The voice consists of five major components:
• Pitch (fundamental frequency)
• Tone
• Amplitude (Volume)
• Quality
• Loudness
The underlying premise of speaker recognition is that each
person’s voice differs in the components above enough to make it
uniquely distinguishable.
INTRODUCTION
Speaker recognition can be divided into three categories:
• Text-dependent (TD)
• Text-independent (TI)
• Text-prompted
Many approaches have been proposed for TI speaker recognition:
• Vector quantization (VQ-based method)
• Autoassociative neural network (AANN)
• Gaussian mixture model
• Matrix representation
• Decision trees
• K-nearest neighbors (K-NN)
INTRODUCTION
The feature extraction techniques used in this system are:
• Mel Frequency Cepstral Coefficient (MFCC)
• Modified Mel Frequency Cepstral Coefficient (MMFCC)
The classifiers used in this system are:
• VQ-based method
• Artificial neural network (autoassociative neural network
(AANN))
MOTIVATION
• Many countries, especially developing countries like
Nigeria, still face insecurity problems such as armed
robbery and kidnapping, and there is a need to develop
systems that can mitigate these problems.
• According to the literature, most existing speaker
recognition systems are not yet efficient or reliable
enough for voice detection, and there is a need to
introduce an improved system to detect unknown
voices (Parul and Dubey, 2012).
PROBLEM STATEMENT
A number of speaker recognition systems have been developed
over the years. However, the most prevalent challenges facing
these systems include:
• Poor clustering of speaker models in the database, leading to high computational
search time for matching a speaker model (Parul and Dubey, 2012).
• Low noise reduction by existing algorithms, leading to poor construction of the speaker
model (Sumithra et al., 2012).
• Inability of existing speaker recognition systems to detect unknown speakers
(Kinnunen et al., 2009).
• Speaker recognition systems that rely on a single model are still susceptible to
poor-quality results (Visalakshi and Dhanalakshmi, 2014).
RESEARCH OBJECTIVES
The objectives of this work are to:
• Identify the strengths and weaknesses of the
existing approaches used for speaker
recognition systems.
• Develop an improved speaker recognition system
for efficient detection of criminals.
• Simulate and evaluate the proposed system.
LITERATURE REVIEW
METHODS USED AND THEIR FEATURE EXTRACTORS
FEATURE EXTRACTION:
• Mel Frequency Cepstral Coefficient (MFCC)
• Mel Frequency Perceptual Linear Predictive (MF-PLP)
• Perceptual Linear Predictive (PLP)
• Modified Mel Frequency Cepstral Coefficient (MMFCC)
CLASSIFIERS:
• Vector quantization
• Artificial neural network (autoassociative neural network and
radial basis function neural network)
• K-nearest neighbors (K-NN)
LITERATURE REVIEW
• Visalakshi and Dhanalakshmi (2014): This work
compared the efficiency of the radial basis function neural
network (RBFNN) and the autoassociative neural network
(AANN) for the design of speaker recognition.
Strength: AANN gives better performance (94.93%)
than RBFNN (89.1%).
Weakness: The robustness of RBFNN affects the performance of
the system.
• Sumithra et al. (2012): This work combined two feature
extraction techniques to design a speaker recognition system.
Strength: It is suitable for highly secured environments.
Weakness: It was not quite efficient.
RELATED WORK
S/N | AUTHORS | FEATURE EXTRACTION | CLASSIFIERS | STRENGTHS | WEAKNESSES
1 | Kinnunen and Haizhou, 2009 | MFCC | VQ, GMM, SVM and others | Addresses text dependency and speech duration | Limited training data and unbalanced text
2 | Visalakshi and Dhanalakshmi, 2014 | LPCC, LPC and MFCC | RBFNN and AANN | AANN gives better performance, about 94.93% | Robustness of RBFNN can affect the performance of the system
3 | Revathi and Venkataramani, 2009 | PLP and MF-PLP | Vector quantization | Improves speech recognition accuracy to 91% | Isolated use of PLP reduces detection accuracy
4 | Nair et al., 2012 | MFCC | Hybrid algorithm | Solves the problem of enhancing speech in a noisy environment | Better methods are needed for speech/silence detection
5 | Sumithra et al., 2012 | MFCC and MMFCC | Vector quantization | Suitable for highly secured environments | Efficiency improves only if speaker modeling techniques are used
6 | Hautamaki et al., 2008 | Affine transformation invariance | Graph matching | Has the potential to complement or replace currently used statistical and template-based methods | Cannot be used in a real-life speaker recognition system if the size of the association graph grows fast for large models
7 | Hanilci and Figen, 2009 | MFCC | VQ-based and PCA-based classifier | Opinion fusion improves the identification rate effectively | PCA classifier is no better on noisy or distorted speech and only slightly better on clean speech
8 | Sekar, 2012 | - | K-nearest neighbor (KNN) classifier | Accuracy rate of 96% when combining RT and DCT to extract features | No improvement in accuracy when the number of DCT coefficients is increased
9 | Parul and Dubey, 2012 | MFCC | Vector quantization | Recognition accuracy of 80% | Challenged by highly variant input speech signals
10 | Cuiling and Tiejun, 2008 | - | - | Effective at defeating FASRS | The last three disguise patterns are weak, with CRRs of 85%
METHODOLOGY
DESIGN CONSIDERATION
The following requirements were considered in designing the system:
• Combining two classifiers (autoassociative
neural network and vector quantization).
• Reducing computational time (search time) in the
database.
• An AANN model is trained on the (noisy) speech, which
ensures that the characteristics of the voices are
similar in both the training and testing phases.
• Reducing the dimensionality of the feature vectors for easier
computation.
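The search-time requirement above can be illustrated with a two-stage lookup: find the closest cluster first, then compare only against the speaker models inside it. The following is a minimal Python sketch under stated assumptions, not the Java implementation used in this work. The function names, the cluster centroids and the two-dimensional speaker vectors are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query, candidates):
    """Return the key of the candidate vector closest to the query."""
    return min(candidates, key=lambda key: euclidean(query, candidates[key]))

def cluster_search(query, cluster_centroids, clusters):
    """Stage 1: pick the closest cluster. Stage 2: search only that cluster."""
    cluster_id = nearest(query, cluster_centroids)
    speaker_id = nearest(query, clusters[cluster_id])
    return cluster_id, speaker_id

# Hypothetical data: two clusters, each holding two speaker model vectors.
cluster_centroids = {"c0": [0.0, 0.0], "c1": [10.0, 10.0]}
clusters = {
    "c0": {"A": [0.5, 0.2], "B": [-0.4, 0.9]},
    "c1": {"C": [9.5, 10.1], "D": [10.6, 9.2]},
}
result = cluster_search([10.4, 9.4], cluster_centroids, clusters)
```

With this layout only the models in one cluster are compared against the query, rather than every model in the database.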
METHODOLOGY
Fig. 2 Proposed architecture for the Text Independent Speaker Recognition System
Blocks shown in the diagram: voice signal, pre-processing, framing, windowing, Fast Fourier transform (FFT), MMFCC feature extractor, speaker database, autoassociative neural network, vector quantization classifier, cluster search, speaker ID.
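The front-end blocks named in Fig. 2 (framing, windowing and the Fourier transform) can be sketched as follows. This is an illustrative Python sketch, not the Java implementation used in this work. The frame length, hop size, sample rate and test tone are assumptions, and a naive DFT stands in for a real FFT to keep the example short.

```python
import cmath
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames of frame_len, advancing by hop."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def hamming(n):
    """Hamming window coefficients for a frame of n samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def power_spectrum(frame):
    """Naive O(N^2) DFT (an FFT would be used in practice); |X[k]|^2 for k = 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(x_k) ** 2)
    return spectrum

# Assumed test input: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

frames = frame_signal(signal, frame_len=64, hop=32)
window = hamming(64)
windowed = [[s * w for s, w in zip(frame, window)] for frame in frames]
spectra = [power_spectrum(frame) for frame in windowed]

# The strongest bin should sit at tone_hz * frame_len / sample_rate.
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
```

Overlapping frames keep each analysis window short enough for the signal to be treated as stationary, and the window tapers the frame edges before the transform.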
METHODOLOGY
• Data collection: To implement this system, real-life datasets
were collected from different speakers in different
environments, such as a market and a hall. Existing datasets
(body-conducted speech datasets) were also used, for both
the training and testing phases.
• The data were split into two sets through which the voice
samples pass, namely training and testing.
IMPLEMENTATION
Hardware requirement
Minimum hardware requirements are:
• PC workstation with Intel Core i3 processor at 2.40 GHz
• 4 GB of RAM
• 5.1 SVGA or VGA with desktop performance for Windows
• 86.6 MB/s primary disk data transfer
• 500 GB hard disk capacity
• A mouse and a keyboard
Software requirement
Minimum software requirements are:
• Windows 7, Windows 8 or Windows 10
• Java programming language (JDK 8), NetBeans 8.0.1
• Sound Forge version 5.0
• AANN Java libraries
• VQ Java libraries
IMPLEMENTATION
Feature extractor performance
Voice features were extracted from voice samples obtained from a group of
persons, using the feature extraction algorithm (MFCC).
IMPLEMENTATION
Feature extractor performance cont’d
MFCC is applied to the voice samples to obtain the feature sets.
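The standard MFCC steps (mel-spaced triangular filters, log energies, then a discrete cosine transform) can be sketched as follows. This is a hedged Python illustration rather than the Java code used in this work. The filter count, coefficient count and the flat test spectrum are assumptions, and the modification that distinguishes MMFCC is not specified in these slides, so only plain MFCC is shown.

```python
import math

def hz_to_mel(hz):
    """Common mel-scale formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(num_filters, num_bins, sample_rate):
    """Triangular filters spaced evenly on the mel scale over 0..sample_rate/2."""
    nyquist = sample_rate / 2.0
    top_mel = hz_to_mel(nyquist)
    # num_filters + 2 edge points, evenly spaced in mel, as fractional bin positions.
    edges = [mel_to_hz(top_mel * i / (num_filters + 1)) * (num_bins - 1) / nyquist
             for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        left, centre, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(num_bins):
            if left < k <= centre:
                row.append((k - left) / (centre - left))    # rising slope
            elif centre < k < right:
                row.append((right - k) / (right - centre))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(power_spectrum, sample_rate, num_filters=12, num_coeffs=6):
    """Log mel-filterbank energies followed by a DCT-II."""
    bank = mel_filterbank(num_filters, len(power_spectrum), sample_rate)
    log_energies = [math.log(sum(w * p for w, p in zip(row, power_spectrum)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energies))
            for c in range(num_coeffs)]

# Assumed input: a flat 33-bin power spectrum from a 64-sample frame at 8 kHz.
coefficients = mfcc([1.0] * 33, sample_rate=8000)
```

Each frame of speech yields one such coefficient vector, and the set of vectors for an utterance forms the feature set passed to the classifier.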
IMPLEMENTATION
Classifier performance
Neural network training on the voices during the training phase
(Plot axes: current error (%) and error improvement (%).)
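As a rough illustration of how an autoassociative network is trained to reproduce its own input, the sketch below fits a tiny linear network with a single bottleneck unit by gradient descent. It is a Python toy under stated assumptions and not the AANN Java library used in this work. The network size, learning rate, epoch count and the two-dimensional training vectors are all hypothetical.

```python
def train_aann(data, epochs=200, lr=0.01):
    """Train a linear dim -> 1 -> dim autoassociative network by batch gradient descent."""
    dim = len(data[0])
    enc = [0.1] * dim  # encoder weights: input -> bottleneck unit
    dec = [0.1] * dim  # decoder weights: bottleneck unit -> reconstruction
    errors = []
    for _ in range(epochs):
        grad_enc = [0.0] * dim
        grad_dec = [0.0] * dim
        total = 0.0
        for x in data:
            h = sum(w * xi for w, xi in zip(enc, x))          # bottleneck activation
            err = [d * h - xi for d, xi in zip(dec, x)]       # reconstruction minus input
            total += sum(e * e for e in err)
            back = sum(e * d for e, d in zip(err, dec))       # error propagated to h
            for i in range(dim):
                grad_dec[i] += 2 * err[i] * h
                grad_enc[i] += 2 * back * x[i]
        errors.append(total / len(data))                      # mean squared error this epoch
        n = len(data)
        enc = [w - lr * g / n for w, g in zip(enc, grad_enc)]
        dec = [d - lr * g / n for d, g in zip(dec, grad_dec)]
    return enc, dec, errors

def reconstruction_error(enc, dec, x):
    """Squared error between x and its reconstruction; low means x fits the model."""
    h = sum(w * xi for w, xi in zip(enc, x))
    return sum((d * h - xi) ** 2 for d, xi in zip(dec, x))

# Hypothetical "speaker" whose feature vectors all lie along one direction.
training_vectors = [[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]]
enc, dec, errors = train_aann(training_vectors)

same_speaker_error = reconstruction_error(enc, dec, [3.0, 3.0])
other_speaker_error = reconstruction_error(enc, dec, [3.0, -3.0])
```

The per-epoch error list plays the role of a training error curve, and the reconstruction error of a new vector can serve as a score for how well it fits the trained speaker model.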
IMPLEMENTATION
• Diagram of the Euclidean distances of different speakers in a cluster
(Plot axes: Euclidean distance against data point.)
IMPLEMENTATION
• Euclidean distance of a speaker
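Matching by Euclidean distance against vector quantization codebooks can be sketched as below: each speaker is represented by a small codebook, a test utterance is scored by its average distance to the nearest codeword, and a threshold rejects unknown voices. This is a Python sketch under stated assumptions, not the VQ Java library used in this work. The codebooks, feature vectors and the threshold of 1.5 are hypothetical values chosen for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    return sum(min(euclidean(f, c) for c in codebook) for f in features) / len(features)

def identify(features, codebooks, threshold):
    """Return (best speaker or None if no codebook is close enough, all scores)."""
    scores = {spk: average_distortion(features, cb) for spk, cb in codebooks.items()}
    best = min(scores, key=scores.get)
    return (best if scores[best] <= threshold else None), scores

# Hypothetical two-dimensional codebooks for two enrolled speakers.
codebooks = {
    "A": [[0.0, 0.0], [1.0, 1.0]],
    "D": [[5.0, 5.0], [6.0, 6.0]],
}
known, known_scores = identify([[5.1, 4.9], [6.2, 5.8]], codebooks, threshold=1.5)
unknown, unknown_scores = identify([[20.0, 20.0]], codebooks, threshold=1.5)
```

Here the first utterance is attributed to speaker D, while the second lies far from every codebook and is reported as unknown (`None`).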
RESULT
Analysis of how the system selects a speaker by comparison with
the speakers in a cluster.
Training phase, where all valid speakers are accounted for:
Speaker A matches with speaker A
Speaker B matches with speaker B
Speaker C matches with speaker C
Speaker D matches with speaker D
Speaker E matches with speaker E
Speaker F matches with speaker F
Speaker G matches with speaker G
Speaker H matches with speaker H
RESULT
Testing phase:
Speaker D matches with speaker A: NO
Speaker D matches with speaker B: NO
Speaker D matches with speaker C: NO
Speaker D matches with speaker D: YES
RESULT
Matching score
The table shows the accuracy of the system on the Biometrics Cyber Security (BCS) datasets, which is 94.12%.
Table 1
Speaker | Number of match | Number of mismatch | Accuracy %
1 | 5 | 1 | 92
2 | 6 | 0 | 100
3 | 6 | 0 | 100
4 | 6 | 0 | 100
5 | 6 | 0 | 100
6 | 6 | 0 | 100
7 | 6 | 0 | 100
8 | 6 | 0 | 100
9 | 6 | 0 | 100
10 | 6 | 0 | 100
11 | 6 | 0 | 100
12 | 6 | 0 | 100
13 | 6 | 0 | 100
14 | 6 | 0 | 100
15 | 6 | 0 | 100
RESULT
Matching score
The accuracy rate of the system on real-life datasets obtained from individuals in different environments is 88.24%.
Table 2
Speaker | Number of match | Number of mismatch | Accuracy %
1 | 5 | 1 | 92
2 | 6 | 0 | 100
3 | 6 | 0 | 100
4 | 6 | 0 | 100
5 | 6 | 0 | 100
6 | 6 | 0 | 100
7 | 6 | 0 | 100
8 | 6 | 0 | 100
9 | 6 | 0 | 100
10 | 6 | 0 | 100
11 | 6 | 0 | 100
12 | 6 | 0 | 100
13 | 6 | 0 | 100
14 | 6 | 0 | 100
15 | 6 | 0 | 100
RESULT
False acceptance and false rejection rates on test datasets with different
thresholds.
Table 4
Threshold | False acceptance rate % | False rejection rate % | Total error rate % | Equal error rate %
0.5 | 3.3 | 4.7 | 8 | 4
1 | 2 | 5.3 | 7.3 | 3.7
1.5 | 0.7 | 4 | 4.7 | 2.4
2 | 2.7 | 6 | 8.7 | 7.5
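In Table 4 the total error rate in each row equals the sum of the false acceptance and false rejection rates. How such rates are obtained at a given threshold can be sketched as follows. This is a Python illustration under stated assumptions: scores are treated as distances (accept when the distance is at or below the threshold), and the genuine and impostor scores are hypothetical, not data from this work.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR %, FRR %, total error %) for distance scores at one threshold.

    A trial is accepted when its distance is <= threshold.
    """
    false_rejects = sum(1 for s in genuine_scores if s > threshold)
    false_accepts = sum(1 for s in impostor_scores if s <= threshold)
    frr = 100.0 * false_rejects / len(genuine_scores)
    far = 100.0 * false_accepts / len(impostor_scores)
    return far, frr, far + frr

# Hypothetical distance scores for same-speaker and different-speaker trials.
genuine = [0.3, 0.6, 0.9, 1.2, 1.8]
impostor = [0.8, 1.6, 2.2, 2.9, 3.5]

rates_by_threshold = {t: error_rates(genuine, impostor, t) for t in (0.5, 1.0, 1.5, 2.0)}
```

Sweeping the threshold in this way shows the trade-off: a stricter threshold rejects more genuine speakers, while a looser one accepts more impostors.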
RESULT
TISRS ANALYSES
Table 5
Speaker ID | Total samples | Recognition accuracy % | Rejection accuracy %
1 | 6 | 91 | 84
2 | 6 | 90 | 81
3 | 6 | 94 | 86
4 | 6 | 91 | 80
5 | 6 | 95 | 84
6 | 6 | 90 | 85
7 | 6 | 90 | 82
8 | 6 | 97 | 80
9 | 6 | 92 | 82
10 | 6 | 95 | 87
11 | 6 | 93 | 81
12 | 6 | 96 | 80
13 | 6 | 94 | 87
14 | 6 | 97 | 88
15 | 6 | 91 | 82
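The per-speaker figures in Table 5 average to the overall TISRS recognition and rejection accuracies reported in the comparison table (93.07% and 83.27%). A short Python check of that arithmetic, using the Table 5 values as read from the slide:

```python
# Per-speaker values from Table 5, speakers 1 to 15.
recognition = [91, 90, 94, 91, 95, 90, 90, 97, 92, 95, 93, 96, 94, 97, 91]
rejection = [84, 81, 86, 80, 84, 85, 82, 80, 82, 87, 81, 80, 87, 88, 82]

mean_recognition = round(sum(recognition) / len(recognition), 2)
mean_rejection = round(sum(rejection) / len(rejection), 2)
```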
RESULT
PARUL AND DUBEY (2012) ANALYSES
Table 6
Speaker ID | Total samples | Recognition accuracy % | Rejection accuracy %
1 | 5 | 80 | 85
2 | 5 | 79 | 80
3 | 5 | 81 | 84
4 | 5 | 83 | 81
5 | 5 | 86 | 80
6 | 5 | 85 | 83
7 | 5 | 87 | 82
RESULT
SEKAR (2012) ANALYSES
The accuracy rate of Sekar (2012) is 96%.
Table 6
Speaker | Number of match | Number of mismatch | Accuracy %
1 | 5 | 0 | 100
2 | 5 | 0 | 100
3 | 4 | 1 | 80
4 | 5 | 0 | 100
5 | 5 | 0 | 100
RESULT
Comparing TISRS with Parul and Dubey (2012) and Sekar (2012)
Table 7
Systems | Accuracy rate/detection speed % | Recognition accuracy % | Rejection accuracy % | FFT calculation | MFCC/MMFCC feature extraction speed | Total recognition time
TISRS | 94.12 | 93.07 | 83.27 | 110 ms | 3 seconds | 6-8 seconds
Parul and Dubey, 2012 | - | 80 | 85 | 100 ms | 4-5 seconds | 5-7 seconds
Sekar, 2012 | 96 | - | - | - | - | -
CONCLUSION
The major goal of this work was to create a speaker recognition system for
detecting criminals:
• The speaker recognition system has good accuracy and performance in recognizing
speakers from their normal voices, irrespective of their language and dialect.
• The system also performed effectively on the disguised voices obtained from each speaker,
although its efficiency varies with the type of voice disguise used by the speaker.
• The current system was designed to evaluate the performance of the algorithms under
different types of input. Based on the experiment, it was observed that the matching error of
the system mainly results from:
--Insufficient number of corresponding extractor
--Large distortion
--Missing features and spurious features.
• The implemented algorithms are more accurate and faster than previous systems such as
Parul and Dubey (2012), and detect a person easily by voice. Sekar (2012) has a better accuracy
rate than this method on the BCS dataset, but the developed system has a better accuracy rate
on real-life datasets.
• It is difficult to achieve a very high classification rate, and it is beneficial to incorporate the
information into the designed algorithm to improve its discrimination performance.
CONTRIBUTION TO KNOWLEDGE
The contributions are listed below:
• The introduction of combined classifier techniques for efficient clustering
and decision making.
• An improved mechanism for detecting criminals was introduced, in response
to today’s high crime rate.
• The capture and use of real-life datasets has also made the new design
practicable.
RECOMMENDATION
To reduce crime in practice, it is recommended that the
system be implemented in the following systems:
• Time and attendance systems
• Access control systems
• Telephone banking/booking
• Biometric login to telephone-aided shopping systems
• Information and reservation services
• Security control for confidential information
• Forensic purposes
FUTURE WORK
• In the future, research should focus on combining one or two
other biometric techniques for better authentication, as stated in
the concluding part of the work.
• Furthermore, research can also consider the adoption of other
areas of biometrics in speaker recognition systems. This will
reduce the rate of crime in the country.
REFERENCES
Cuiling Zhang and Tiejun Tan. 2008. Voice disguise and automatic speaker
recognition. Forensic Science International.
Hanilçi Cemal and Figen Ertaş. 2009. Principal Component Based
Classification for Text-Independent Speaker Identification. Department of
Electronic Engineering, Uludag University, Bursa, Turkey.
Hautamäki Ville, Tomi Kinnunen and Pasi Fränti. 2008. Text-independent speaker
recognition using graph matching. Pattern Recognition Letters.
Kinnunen, T. and Haizhou, L. 2009. An overview of text-independent speaker
recognition: from features to supervectors. Speech Communication.
Parul, G. and Dubey, R.B. 2012. Automatic speaker recognition system.
International Journal of Advanced Computer Research.
Revathi, A. and Venkataramani, Y. 2009. Iterative clustering approach for text
independent speaker identification using multiple features. International
Journal of Computer Science & Information Technology (IJCSIT) 1(2).
THANKS FOR LISTENING.
50

More Related Content

Similar to Text Independent Speaker recognitom framework for detecting criminals.ppt

A Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemA Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemVani011
 
Replicating Speech Experts’ Assessment for Parkinson’s Disease Treatment usin...
Replicating Speech Experts’ Assessment for Parkinson’s Disease Treatment usin...Replicating Speech Experts’ Assessment for Parkinson’s Disease Treatment usin...
Replicating Speech Experts’ Assessment for Parkinson’s Disease Treatment usin...Mark (Mong) Montances
 
Recommendation engine Using Genetic Algorithm
Recommendation engine Using Genetic AlgorithmRecommendation engine Using Genetic Algorithm
Recommendation engine Using Genetic AlgorithmVaibhav Varshney
 
Emotion recognition using facial expressions and speech
Emotion recognition using facial expressions and speechEmotion recognition using facial expressions and speech
Emotion recognition using facial expressions and speechLakshmi Sarvani Videla
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueCSCJournals
 
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...AIRCC Publishing Corporation
 
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...ijcsit
 
FACE RECOGNITION USING ELM-LRF
FACE RECOGNITION USING ELM-LRFFACE RECOGNITION USING ELM-LRF
FACE RECOGNITION USING ELM-LRFAras Masood
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327IJMER
 
Thesis presentation
Thesis presentationThesis presentation
Thesis presentationAras Masood
 
Dynamic Audio-Visual Client Recognition modelling
Dynamic Audio-Visual Client Recognition modellingDynamic Audio-Visual Client Recognition modelling
Dynamic Audio-Visual Client Recognition modellingCSCJournals
 
Virtual personal assistant
Virtual personal assistantVirtual personal assistant
Virtual personal assistantsiddhesh nanche
 
Neural Networks for Pattern Recognition
Neural Networks for Pattern RecognitionNeural Networks for Pattern Recognition
Neural Networks for Pattern RecognitionVipra Singh
 
An interactive approach to multiobjective clustering of gene expression patterns
An interactive approach to multiobjective clustering of gene expression patternsAn interactive approach to multiobjective clustering of gene expression patterns
An interactive approach to multiobjective clustering of gene expression patternsRavi Kumar
 
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...Vienna Data Science Group
 
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...DataScienceConferenc1
 

Similar to Text Independent Speaker recognitom framework for detecting criminals.ppt (20)

A Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemA Survey on Speaker Recognition System
A Survey on Speaker Recognition System
 
Parkinson disease classification recorded v2.0
Parkinson disease classification recorded   v2.0Parkinson disease classification recorded   v2.0
Parkinson disease classification recorded v2.0
 
Replicating Speech Experts’ Assessment for Parkinson’s Disease Treatment usin...
Replicating Speech Experts’ Assessment for Parkinson’s Disease Treatment usin...Replicating Speech Experts’ Assessment for Parkinson’s Disease Treatment usin...
Replicating Speech Experts’ Assessment for Parkinson’s Disease Treatment usin...
 
Parkinson disease classification v2.0
Parkinson disease classification v2.0Parkinson disease classification v2.0
Parkinson disease classification v2.0
 
Recommendation engine Using Genetic Algorithm
Recommendation engine Using Genetic AlgorithmRecommendation engine Using Genetic Algorithm
Recommendation engine Using Genetic Algorithm
 
Emotion recognition using facial expressions and speech
Emotion recognition using facial expressions and speechEmotion recognition using facial expressions and speech
Emotion recognition using facial expressions and speech
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
 
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
 
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIM...
 
FACE RECOGNITION USING ELM-LRF
FACE RECOGNITION USING ELM-LRFFACE RECOGNITION USING ELM-LRF
FACE RECOGNITION USING ELM-LRF
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327
 
Bci
BciBci
Bci
 
Bci
BciBci
Bci
 
Thesis presentation
Thesis presentationThesis presentation
Thesis presentation
 
Dynamic Audio-Visual Client Recognition modelling
Dynamic Audio-Visual Client Recognition modellingDynamic Audio-Visual Client Recognition modelling
Dynamic Audio-Visual Client Recognition modelling
 
Virtual personal assistant
Virtual personal assistantVirtual personal assistant
Virtual personal assistant
 
Neural Networks for Pattern Recognition
Neural Networks for Pattern RecognitionNeural Networks for Pattern Recognition
Neural Networks for Pattern Recognition
 
An interactive approach to multiobjective clustering of gene expression patterns
An interactive approach to multiobjective clustering of gene expression patternsAn interactive approach to multiobjective clustering of gene expression patterns
An interactive approach to multiobjective clustering of gene expression patterns
 
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
 
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
 

Recently uploaded

Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...amilabibi1
 
Introduction to Artificial intelligence.
Introduction to Artificial intelligence.Introduction to Artificial intelligence.
Introduction to Artificial intelligence.thamaeteboho94
 
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...David Celestin
 
Digital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of DrupalDigital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of DrupalFabian de Rijk
 
Zone Chairperson Role and Responsibilities New updated.pptx
Zone Chairperson Role and Responsibilities New updated.pptxZone Chairperson Role and Responsibilities New updated.pptx
Zone Chairperson Role and Responsibilities New updated.pptxlionnarsimharajumjf
 
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfAWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfSkillCertProExams
 
My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Baileyhlharris
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lodhisaajjda
 
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven CuriosityUnlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven CuriosityHung Le
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoKayode Fayemi
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIINhPhngng3
 
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdfSOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdfMahamudul Hasan
 
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...ZurliaSoop
 
Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatmentnswingard
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar TrainingKylaCullinane
 

Recently uploaded (17)

Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
 
ICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdfICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdf
 
Introduction to Artificial intelligence.
Introduction to Artificial intelligence.Introduction to Artificial intelligence.
Introduction to Artificial intelligence.
 
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
 
Digital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of DrupalDigital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of Drupal
 
Zone Chairperson Role and Responsibilities New updated.pptx
Zone Chairperson Role and Responsibilities New updated.pptxZone Chairperson Role and Responsibilities New updated.pptx
Zone Chairperson Role and Responsibilities New updated.pptx
 
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait Cityin kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
 
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfAWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
 
My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Bailey
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven CuriosityUnlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio III
 
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdfSOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
 
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
 
Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatment
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar Training
 

Text Independent Speaker recognitom framework for detecting criminals.ppt

  • 1. A TEXT INDEPENDENT SPEAKER RECOGNITION SYSTEM FOR DETECTING CRIMINALS An M.Sc. Oral Defence SEMINAR By AMUSAAFOLARIN IBRAHIM (PG13 /0025) Department of Computer Science Federal University of Agriculture, Abeokuta Supervisory team: Prof. A. S. Sodiya Dr. O. R. Vincent Prof. A. A. A Agboola
  • 2. OUTLINE • Introduction • Motivation • Problem statement • Research objectives • Literature Review • Research design and methodology • Implementation • Contribution to knowledge • Recommendation • Conclusion and future work • References 2
  • 3. INTRODUCTION • In criminal law, kidnapping is the unlawful taking away or transportation of a person against the person’s will, usually to hold the person unlawfully (Gary, 2007). • Kidnapping is a major challenge in the society that is yet to be eradicated due to a number of factors ranging from political to socio-economic problems. • This may be done for ransom or in furtherance of another crime, or in connection with a child custody dispute. The typical behavior of a kidnapper after successfully abducting a victim entails that the kidnapper contact the victim’s family for one or more demands. 3
  • 4. INTRODUCTION • Speaker recognition is the process that enables machines to understand and interpret human speech by means of certain algorithms and to verify the authenticity of a speaker with the help of a database (Teunen et al., 2000; Reynolds, 2000). • Speaker recognition is the process of automatically recognizing a speaker’s voice on the basis of individual information included in the input speech waves (Parul and Dubey, 2012). • Speaker recognition is generally divided into two tasks: • Speaker Verification (SV) • Speaker Identification (SI)
  • 6. INTRODUCTION The voice consists of five major components: • Pitch (fundamental frequency) • Tone • Amplitude (volume) • Quality • Loudness The underlying premise of speaker recognition is that each person’s voice differs in the components above, which makes it uniquely distinguishable.
  • 7. INTRODUCTION Speaker recognition can be divided into three categories: • Text-dependent (TD) • Text-independent (TI) • Text-prompted Many approaches have been proposed for TI speaker recognition: • Vector Quantization (VQ-based method) • Autoassociative neural network (AANN) • Gaussian Mixture Model • Matrix representation • Decision trees • K-nearest Neighbors (K-NN)
  • 8. INTRODUCTION The feature extraction techniques used for this system are: • Mel Frequency Cepstral Coefficient (MFCC) • Modified Mel Frequency Cepstral Coefficient (MMFCC) The classifiers used for this system are: • VQ-based method • Artificial neural network (autoassociative neural network, AANN)
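MFCC features are built on the mel frequency scale, so a small sketch of that scale may help. The slides do not give the formula; the code below assumes the commonly used form mel = 2595 · log10(1 + f / 700), and the function names are hypothetical. It illustrates the idea only and is not the MMFCC variant used in this work.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (assumed 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Hypothetical example: 10 filters between 0 Hz and 4000 Hz.
centers = mel_center_frequencies(10, 0.0, 4000.0)
```

The filters come out densely packed at low frequencies and sparse at high ones, which is the perceptual weighting that MFCC relies on.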
  • 9. MOTIVATION • Many countries, especially developing countries like Nigeria, still face insecurity problems such as armed robbery and kidnapping, and there is a need to develop systems that can mitigate these problems. • According to the literature, the efficiency and reliability of most existing speaker recognition systems fall short for voice detection, and there is a need to introduce an improved system to detect unknown voices (Parul and Dubey, 2012).
  • 10. PROBLEM STATEMENT A number of speaker recognition systems have been developed over the years. However, the most prevalent challenges facing these systems include: • Poor clustering of speaker models in the database, leading to high computational search time when matching a speaker model (Parul and Dubey, 2012). • Low noise reduction by existing algorithms, leading to poor construction of the speaker model (Sumithra et al., 2012). • Inability of existing speaker recognition systems to detect unknown speakers (Kinnunen et al., 2009). • Speaker recognition systems that rely on a single model remain susceptible to poor-quality results (Visalakshi and Dhanalakshmi, 2014).
  • 11. RESEARCH OBJECTIVES The objectives of this work are to: • Identify the strengths and weaknesses of the existing approaches used for speaker recognition systems. • Develop an improved speaker recognition system for efficient detection of criminals. • Simulate and evaluate the proposed system.
  • 12. LITERATURE REVIEW METHODS USED AND THEIR FEATURE EXTRACTORS FEATURE EXTRACTION: • Mel Frequency Cepstral Coefficient (MFCC) • Mel Frequency Perceptual Linear Predictive (MF-PLP) • Perceptual Linear Predictive (PLP) • Modified Mel Frequency Cepstral Coefficient (MMFCC) CLASSIFIERS: • Vector quantization • Artificial neural network (autoassociative neural network and radial basis function neural network) • K-nearest Neighbors (K-NN)
  • 13. LITERATURE REVIEW • Visalakshi and Dhanalakshmi (2014): This work compared the efficiency of the radial basis function neural network (RBFNN) and the autoassociative neural network (AANN) for the design of speaker recognition. Strength: AANN gives better performance (94.93%) than RBFNN (89.1%). Weakness: The robustness of RBFNN affects the performance of the system. • Sumithra et al. (2012): This work combined the two feature extraction techniques to design a speaker recognition system. Strength: It is suitable for highly secured environments. Weakness: It was not quite efficient.
  • 14. RELATED WORK
    S/N | Authors | Feature extraction | Classifiers | Strengths | Weaknesses
    1 | Kinnunen and Haizhou, 2009 | MFCC | VQ, GMM, SVM and others | Solves text dependency and speech duration | Encounters limited training data and unbalanced text
    2 | Visalakshi and Dhanalakshmi, 2014 | LPCC, LPC and MFCC | RBFNN and AANN | AANN gives better performance, about 94.93% | Robustness of RBFNN can affect the performance of the system
    3 | Revathi and Venkataramani, 2009 | PLP and MF-PLP | Vector quantization | Improves speech recognition accuracy to 91% | Isolated use of PLP reduces detection accuracy
    4 | Nair et al., 2012 | MFCC | Hybrid algorithm | Solves the problem of enhancing speech in noisy environments | Better methods are needed for speech/silence detection
    5 | Sumithra et al., 2012 | MFCC and MMFCC | Vector quantization | Suitable for highly secured environments | Efficiency improves only if speaker modeling techniques are used
  • 15. RELATED WORK (continued)
    S/N | Authors | Feature extraction | Classifiers | Strengths | Weaknesses
    6 | Hautamaki et al., 2008 | Affine transformation invariance | Graph matching | Has the potential to complement or replace currently used statistical and template-based methods | Cannot be used in real-life speaker recognition systems because the size of the association graph grows fast for large models
    7 | Hanilci and Figen, 2009 | MFCC | VQ-based and PCA-based classifiers | Opinion fusion improves the identification rate effectively | PCA classifier is not better on noisy or distorted speech and only slightly better on clean speech
    8 | Sekar, 2012 | - | K-Nearest Neighbor (KNN) classifier | Combining RT and DCT to extract features gives an accuracy rate of 96% | No improvement in accuracy as the number of DCT coefficients increases
    9 | Parul and Dubey, 2012 | MFCC | Vector quantization | Recognition accuracy of 80% | Challenged by highly variant input speech signals
    10 | Cuiling and Tiejun, 2008 | - | - | Effective at defeating FASRS | The last three disguise patterns are weak, with CRRs of 85%
  • 16. METHODOLOGY DESIGN CONSIDERATION The following requirements were considered in designing the system: • Combining two classifiers (autoassociative neural network and vector quantization). • Reducing computational time (search time) in the database. • Using an AANN model to train on the (noisy) speech, which ensures that the characteristics of the voices are similar in both the training and testing phases. • Reducing the dimensionality of the feature vectors for easier computation.
  • 17. METHODOLOGY Fig. 2 Proposed architecture for Text Independent Speaker Recognition System. [Diagram labels: Voice signal, Pre-processing, Framing, Windowing, Fast Fourier transform (FFT), MMFCC, Feature extractor, Speaker database, Autoassociative neural network, Vector quantization, Classifier, Cluster search, Speaker ID]
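The framing and windowing stages named in the architecture can be sketched as follows. This is a minimal illustration under stated assumptions: the slides do not give a frame length, hop size or window type, so the Hamming window and all parameter values and function names here are hypothetical.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def hamming(n):
    """Hamming window of length n (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def window_frames(frames):
    """Multiply every frame by the window to taper its edges before the FFT."""
    if not frames:
        return []
    w = hamming(len(frames[0]))
    return [[s * wi for s, wi in zip(frame, w)] for frame in frames]


# Hypothetical toy signal: 10 samples, frames of 4 samples with a hop of 2.
signal = [1.0] * 10
frames = frame_signal(signal, frame_len=4, hop=2)
windowed = window_frames(frames)
```

Each windowed frame would then go to the FFT stage. The overlap between frames keeps information at frame edges from being lost to the taper.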
  • 27. METHODOLOGY • Data collection: To implement this system, real-life datasets were collected from different speakers in different environments, such as a market and a hall. Existing datasets (body-conducted speech datasets) were also used for both the training and testing phases. • The data were split into two categories through which the voice samples pass, namely training and testing.
  • 28. IMPLEMENTATION Hardware requirement Minimum hardware requirements are: • PC workstation with Intel Core i3 processor at 2.40 GHz • 4 GB of RAM • 5.1 SVGA or VGA with desktop performance for Windows • 86.6 MB/s primary disk data transfer • 500 GB hard disk capacity • A mouse and a keyboard Software requirement Minimum software requirements are: • Windows 7, Windows 8, Windows 10 • Java programming language (JDK 8), NetBeans 8.0.1 • Sound Forge version 5.0 • AANN Java libraries • VQ Java libraries
  • 29. IMPLEMENTATION Feature extractor performance Voice features were extracted from a sample of voices obtained from a group of persons using the feature extraction algorithm (MFCC).
  • 30. IMPLEMENTATION Feature extractor performance (cont’d) MFCC is applied to the voice samples to obtain the feature sets.
  • 31. IMPLEMENTATION Classifier performance Neural network training on the voices during the training phase. [Plot labels: Current error (%), Error improvement (%)]
  • 32. IMPLEMENTATION Diagram of the Euclidean distances of different speakers in a cluster. [Plot labels: Euclidean distance, data point]
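The use of Euclidean distance to compare a test voice with the speakers in a cluster can be sketched as follows. This is an illustration under assumptions, not the Java implementation described in these slides: each enrolled speaker is assumed to be represented by a small VQ codebook, and the function names, toy vectors and rejection threshold are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    total = 0.0
    for vec in features:
        total += min(euclidean(vec, cw) for cw in codebook)
    return total / len(features)


def identify(features, codebooks, threshold=None):
    """Return (speaker_id, distortion) for the closest codebook.

    If a threshold is given and even the best match exceeds it, the speaker
    id is None, which models rejecting an unknown speaker.
    """
    best_id, best_d = None, float("inf")
    for speaker_id, codebook in codebooks.items():
        d = average_distortion(features, codebook)
        if d < best_d:
            best_id, best_d = speaker_id, d
    if threshold is not None and best_d > threshold:
        return None, best_d
    return best_id, best_d


# Hypothetical 2-dimensional codebooks for two enrolled speakers.
codebooks = {
    "A": [[0.0, 0.0], [1.0, 1.0]],
    "D": [[5.0, 5.0], [6.0, 6.0]],
}
test_features = [[5.1, 5.0], [6.0, 5.9]]
match, distortion = identify(test_features, codebooks)
```

With these toy values the test features fall closest to speaker D's codebook. Passing a small `threshold` makes the same call return `None` instead of forcing a match.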
  • 34. RESULT Analysis of how the system selects a speaker by comparison with the speakers in a cluster. Training phase, in which all valid speakers are accounted for: • Speaker A matches with speaker A • Speaker B matches with speaker B • Speaker C matches with speaker C • Speaker D matches with speaker D • Speaker E matches with speaker E • Speaker F matches with speaker F • Speaker G matches with speaker G • Speaker H matches with speaker H
  • 35. RESULT Testing phase • Speaker D matches with speaker A: NO • Speaker D matches with speaker B: NO • Speaker D matches with speaker C: NO • Speaker D matches with speaker D: YES
  • 38. RESULT Matching score The table shows the accuracy of the system on the Biometrics Cyber Security (BCS) datasets, which is 94.12%. Table 1
    Speaker | Number of match | Number of mismatch | Accuracy %
    1 | 5 | 1 | 92
    2 | 6 | 0 | 100
    3 | 6 | 0 | 100
    4 | 6 | 0 | 100
    5 | 6 | 0 | 100
    6 | 6 | 0 | 100
    7 | 6 | 0 | 100
    8 | 6 | 0 | 100
    9 | 6 | 0 | 100
    10 | 6 | 0 | 100
    11 | 6 | 0 | 100
    12 | 6 | 0 | 100
    13 | 6 | 0 | 100
    14 | 6 | 0 | 100
    15 | 6 | 0 | 100
  • 39. RESULT Matching score The accuracy rate of the system on real-life datasets obtained from individuals in different environments is 88.24%. Table 2
    Speaker | Number of match | Number of mismatch | Accuracy %
    1 | 5 | 1 | 92
    2 | 6 | 0 | 100
    3 | 6 | 0 | 100
    4 | 6 | 0 | 100
    5 | 6 | 0 | 100
    6 | 6 | 0 | 100
    7 | 6 | 0 | 100
    8 | 6 | 0 | 100
    9 | 6 | 0 | 100
    10 | 6 | 0 | 100
    11 | 6 | 0 | 100
    12 | 6 | 0 | 100
    13 | 6 | 0 | 100
    14 | 6 | 0 | 100
    15 | 6 | 0 | 100
  • 40. RESULT False acceptance and false rejection rates on test datasets with different thresholds. Table 4
    Threshold | False acceptance rate % | False rejection rate % | Total error rate % | Equal error rate %
    0.5 | 3.3 | 4.7 | 8 | 4
    1 | 2 | 5.3 | 7.3 | 3.7
    1.5 | 0.7 | 4 | 4.7 | 2.4
    2 | 2.7 | 6 | 8.7 | 7.5
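In Table 4 the total error rate in each row equals the false acceptance rate plus the false rejection rate (for example, 3.3 + 4.7 = 8). The sketch below shows how such rates could be computed at one threshold. It assumes that scores are distances, so a claim is accepted when the score is at or below the threshold; the slides do not state the score convention, and the toy scores and function name are hypothetical.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR %, FRR %, total error %) at one threshold.

    Scores are treated as distances: a claim is accepted when its score
    is less than or equal to the threshold (assumed convention).
    """
    false_rejects = sum(1 for s in genuine_scores if s > threshold)
    false_accepts = sum(1 for s in impostor_scores if s <= threshold)
    frr = 100.0 * false_rejects / len(genuine_scores)
    far = 100.0 * false_accepts / len(impostor_scores)
    return far, frr, far + frr


# Hypothetical distance scores, not data from the study.
genuine = [0.2, 0.4, 0.9, 1.6]    # true speaker compared with own model
impostor = [0.8, 1.7, 2.5, 3.0]   # other speakers compared with that model

far, frr, total = error_rates(genuine, impostor, threshold=1.0)
```

Sweeping the threshold and recording the three values at each step would produce a table shaped like Table 4.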
  • 41. RESULT TISRS ANALYSES Table 5
    Speaker ID | Total samples | Recognition accuracy % | Rejection accuracy %
    1 | 6 | 91 | 84
    2 | 6 | 90 | 81
    3 | 6 | 94 | 86
    4 | 6 | 91 | 80
    5 | 6 | 95 | 84
    6 | 6 | 90 | 85
    7 | 6 | 90 | 82
    8 | 6 | 97 | 80
    9 | 6 | 92 | 82
    10 | 6 | 95 | 87
    11 | 6 | 93 | 81
    12 | 6 | 96 | 80
    13 | 6 | 94 | 87
    14 | 6 | 97 | 88
    15 | 6 | 91 | 82
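The per-speaker values in Table 5 can be averaged to check the overall TISRS figures reported in the comparison table (93.07% recognition accuracy and 83.27% rejection accuracy). The short sketch below does that arithmetic. The numbers are copied from Table 5 and the variable names are illustrative.

```python
# Per-speaker percentages for speakers 1 to 15, copied from Table 5.
recognition = [91, 90, 94, 91, 95, 90, 90, 97, 92, 95, 93, 96, 94, 97, 91]
rejection = [84, 81, 86, 80, 84, 85, 82, 80, 82, 87, 81, 80, 87, 88, 82]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


mean_recognition = round(mean(recognition), 2)  # 1396 / 15
mean_rejection = round(mean(rejection), 2)      # 1249 / 15
```

Both averages agree with the TISRS row of the comparison table.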
  • 42. RESULT PARUL AND DUBEY (2012) ANALYSES Table 6
    Speaker ID | Total samples | Recognition accuracy % | Rejection accuracy %
    1 | 5 | 80 | 85
    2 | 5 | 79 | 80
    3 | 5 | 81 | 84
    4 | 5 | 83 | 81
    5 | 5 | 86 | 80
    6 | 5 | 85 | 83
    7 | 5 | 87 | 82
  • 43. RESULT SEKAR (2012) ANALYSES The accuracy rate of Sekar (2012) is 96%. Table 6
    Speaker | Number of match | Number of mismatch | Accuracy %
    1 | 5 | 0 | 100
    2 | 5 | 0 | 100
    3 | 4 | 1 | 80
    4 | 5 | 0 | 100
    5 | 5 | 0 | 100
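The 96% figure for Sekar (2012) follows from the match and mismatch counts in the table: 24 matches out of 25 trials. A small sketch of that calculation is given below. The function name is hypothetical and the counts are copied from the table.

```python
def accuracy_percent(matches, mismatches):
    """Accuracy as the percentage of trials that matched."""
    return 100.0 * matches / (matches + mismatches)


# (matches, mismatches) for speakers 1 to 5, copied from the table.
rows = [(5, 0), (5, 0), (4, 1), (5, 0), (5, 0)]

per_speaker = [accuracy_percent(m, x) for m, x in rows]
overall = accuracy_percent(sum(m for m, _ in rows), sum(x for _, x in rows))
```

Here `per_speaker` reproduces the Accuracy % column and `overall` reproduces the stated accuracy rate.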
  • 44. RESULT Comparing TISRS with Parul and Dubey (2012) and Sekar (2012). Table 7
    System | Accuracy rate/detection speed % | Recognition accuracy % | Rejection accuracy % | FFT calculation | MFCC/MMFCC feature extraction speed | Total recognition time
    TISRS | 94.12 | 93.07 | 83.27 | 110 ms | 3 seconds | 6-8 seconds
    Parul and Dubey, 2012 | - | 80 | 85 | 100 ms | 4-5 seconds | 5-7 seconds
    Sekar, 2012 | 96 | - | - | - | - | -
  • 45. CONCLUSION The major goal of this work was to create a speaker recognition system for detecting criminals. • The speaker recognition system has good accuracy and performance in recognizing speakers from their normal voices, irrespective of their language and dialect. • The system also performed effectively on the disguised voices obtained from each speaker. Its efficiency varies with the type of voice disguise used by the speaker. • The current system was designed to evaluate the performance of the algorithms under different types of input. In the experiments, the matching errors of the system resulted mainly from: -- Insufficient number of corresponding extracted features -- Large distortion -- Missing features and spurious features. • The implemented algorithms are more accurate and faster than previous systems such as Parul and Dubey (2012), and they detect a person easily from the voice. Sekar (2012) has a better accuracy rate than this method on the BCS dataset, but the developed system has a better accuracy rate on real-life datasets. • It is difficult to achieve a very high classification rate, so it is beneficial to incorporate this information into the designed algorithm to improve its discrimination performance.
  • 46. CONTRIBUTION TO KNOWLEDGE The contributions are listed below: • The introduction of combined classifier techniques for efficient clustering and decision making. • Given today’s high crime rate, an improved mechanism for the detection of criminals was introduced. • The capture and use of real-life datasets has also made the new design practicable.
  • 47. RECOMMENDATION To reduce crime in practice, it is recommended that the system be implemented in the following systems: • Time and attendance systems • Access control systems • Telephone banking/booking • Biometric login to telephone-aided shopping systems • Information and reservation services • Security control for confidential information • Forensic purposes
  • 48. FUTURE WORK • In the future, research should focus on combining one or two other biometric techniques for better authentication, as stated in the concluding part of the work. • Furthermore, research can also consider adopting other areas of biometrics in speaker recognition systems. This will reduce the rate of crime in the country.
  • 49. REFERENCES
    Zhang, C. and Tan, T. 2008. Voice disguise and automatic speaker recognition. Forensic Science International.
    Hanilçi, C. and Ertaş, F. 2009. Principal component based classification for text-independent speaker identification. Department of Electronic Engineering, Uludag University, Bursa, Turkey.
    Hautamäki, V., Kinnunen, T. and Fränti, P. 2008. Text-independent speaker recognition using graph matching. Pattern Recognition Letters.
    Kinnunen, T. and Haizhou, L. 2009. An overview of text-independent speaker recognition: from features to supervectors. Speech Communication.
    Parul, G. and Dubey, R.B. 2012. Automatic speaker recognition system. International Journal of Advanced Computer Research.
    Revathi, A. and Venkataramani, Y. 2009. Iterative clustering approach for text independent speaker identification using multiple features. International Journal of Computer Science & Information Technology (IJCSIT) 1(2).