This document summarizes a presentation on baseline speaker verification. It discusses preprocessing speech signals using voice activity detection, extracting mel-frequency cepstral coefficients as features, building Gaussian mixture models during enrollment and testing phases, and evaluating performance using equal error rates. The authors' future plans include generating more training data synthetically and validating their results using i-vector based speaker verification.
3. Speaker recognition is the computing task of validating a person's
identity claim from his/her voice.
Applications:-
Authentication
Forensic test
Security system
ATM Security Key
Personalized user interface
Multi speaker tracking
Surveillance
6/24/2015 N.I.T. PATNA ECE, DEPTT. 3
5. Phases of Speaker Verification
• Enrollment Session or Training Phase
• Operating Session or Testing Phase
6. Training & Testing Phase
[Block diagram]
Training: Speech -> Pre-processing -> Feature extraction -> Model building -> Reference model
Testing: Speech + Identity claim -> Pre-processing -> Feature extraction -> Comparison (with the reference model) -> Decision logic -> Accept/reject
7. Preprocessing
Preprocessing is an important step in a speaker verification system. This step is also called
voice activity detection (VAD).
VAD separates speech regions from non-speech regions [2-3].
It is very difficult to implement a VAD algorithm that works consistently across
different types of data.
VAD algorithms can be classified into two groups:
Feature based approach
Statistical model based approach
Each VAD method has its own merits and demerits in terms of accuracy,
complexity, etc.
Because of its simplicity, most speaker verification systems use signal energy for VAD.
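An energy-based VAD of the kind described above can be sketched in a few lines. This is only an illustration: the function names, the frame length and the hop size are assumptions, and the 0.6 factor mirrors the "0.6 * average energy" threshold listed in the baseline parameters of this deck.

```python
def frame_energies(signal, frame_len, hop):
    """Mean squared amplitude of each frame of the signal."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    return energies


def energy_vad(signal, frame_len=160, hop=80, factor=0.6):
    """Mark a frame as speech when its energy exceeds factor * average energy.

    frame_len and hop are hypothetical defaults (20 ms / 10 ms at 8 kHz).
    """
    energies = frame_energies(signal, frame_len, hop)
    if not energies:
        return []
    threshold = factor * (sum(energies) / len(energies))
    return [e > threshold for e in energies]
```

Only the frames flagged `True` would be passed on to feature extraction.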
8. Feature Extraction
Along with speaker information, the speech signal
contains much redundant information about the
recording sensor, channel, environment, etc.
The speaker-specific information in the speech
signal [2] comes from:
Unique speech production system
Physiological aspects
Behavioral aspects
The feature extraction module transforms speech into a set
of feature vectors of reduced dimension in order to:
Enhance speaker-specific information
Suppress redundant information
9. Selection of Features
An ideal feature should:
• Be robust against noise and distortion
• Occur frequently and naturally in speech
• Be easy to measure from the speech signal
• Be difficult to impersonate/mimic
• Not be affected by the speaker's health or long-term variations in voice
11. Feature Extraction Techniques
A wide range of approaches may be used to parametrically represent the speech
signal for speaker recognition:
Linear Prediction Coding
Linear Predictive Cepstral Coefficients
Mel Frequency Cepstral Coefficients
Perceptual Linear Prediction
Neural Predictive Coding
Most state-of-the-art speaker verification systems use Mel-frequency
cepstral coefficients (MFCC), appended with their first- and second-order derivatives,
as the feature vectors:
Easy to extract
Provide the best performance compared to other features
MFCCs mostly contain information about the resonance structure of the vocal
tract system
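Appending first- and second-order derivatives to the MFCC vectors could look like the sketch below. The regression formula over a window of +/- 2 frames is one common choice and is an assumption here, as are the function names; the slides do not state how the derivatives were computed.

```python
def delta(frames, width=2):
    """Regression-based delta of a list of feature vectors (one vector per frame)."""
    n_frames = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n_frames):
        vec = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n_frames - 1)][d]  # clamp at the last frame
                prv = frames[max(t - k, 0)][d]             # clamp at the first frame
                num += k * (nxt - prv)
            vec.append(num / denom)
        out.append(vec)
    return out


def append_deltas(frames, width=2):
    """Concatenate static, delta and delta-delta coefficients frame by frame."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)
    return [f + a + b for f, a, b in zip(frames, d1, d2)]
```

With 13 static coefficients per frame, each output vector has 39 dimensions.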
12. MFCC steps
1. Analog to digital conversion
2. Pre-emphasis
3. Framing & windowing
4. Fast Fourier Transform
5. Mel scale warping
6. MFCC
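For the mel scale step, a small sketch of the frequency mapping may help. The constants 2595 and 700 belong to one widely used variant of the mel formula; the slides do not say which variant was used, so treat the formula and the helper names as assumptions.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to the mel scale (common 2595/700 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_low, f_high):
    """Centre frequencies in Hz of filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]
```

The centre frequencies come out densely packed at low frequencies and sparser at high frequencies, which is the warping the step refers to.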
13. MFCC
Step 1:- Analog to digital conversion: The analog speech signal is transformed to
digital form by sampling it at a given frequency.
14. MFCC
Step 2:- Pre-emphasis: The energy present in
the high frequencies (important for speech) is boosted.
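Pre-emphasis is usually implemented as a first-order high-pass filter, y[n] = x[n] - alpha * x[n-1]. The sketch below assumes alpha = 0.97, a typical value that is not taken from the slides.

```python
def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through unchanged."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]
```

A slowly varying (low-frequency) input is strongly attenuated, while a rapidly alternating (high-frequency) input is amplified.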
20. MFCC WINDOWING
• The next step is to window each individual frame to
minimize the signal discontinuities at the
beginning and end of the frame.
• The idea is to minimize
spectral distortion by using the window to
taper the signal towards zero at the beginning and
end of each frame.
• We have used a Hamming window.
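Framing and Hamming windowing can be sketched as follows. The function names and the frame and hop sizes used in any call are illustrative assumptions; the window uses the standard 0.54/0.46 Hamming coefficients.

```python
import math


def hamming(n):
    """Hamming window of length n: 0.54 - 0.46 * cos(2*pi*i / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def frame_and_window(signal, frame_len, hop):
    """Cut the signal into overlapping frames and taper each with a Hamming window."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames
```

Note that the Hamming window tapers the frame edges to 0.08 of their original amplitude rather than exactly to zero.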
26. Speaker Modelling
• Vector Quantization
• Gaussian Mixture Model
• Gaussian Mixture Model-UBM
• Hidden Markov Model
• Artificial Neural Networks
• Support Vector Machines
• I-Vector
The Gaussian mixture model assumes that the feature vectors follow a Gaussian mixture distribution,
characterized by mean vectors, covariance matrices and weights.
Data unseen in training that appears in the test data will produce a low
score.
Speaker modelling captures the statistical information present in the
feature vectors; it enhances the speaker information and
suppresses the redundant information.
27. Gaussian Mixture Model
A Gaussian mixture density is defined as a weighted sum of unimodal Gaussian
functions, each of dimension D, where:
D = 8, 16, 32, 64
$\lambda_i = \{w_i, \mu_i, \Sigma_i\}$
$w_i$ = weight
$\mu_i$ = mean
$\Sigma_i$ = covariance
i = model index (number of models M = 356)
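For reference, the standard textbook forms of the two definitions named on this slide can be written as below. The component index k and the component count K are notation assumed here (not taken from the slide) so as not to clash with the slide's model index i; D is the dimension of the feature vector x.

```latex
p(x \mid \lambda) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x; \mu_k, \Sigma_k),
\qquad \sum_{k=1}^{K} w_k = 1

\mathcal{N}(x; \mu_k, \Sigma_k) =
\frac{1}{(2\pi)^{D/2} \, |\Sigma_k|^{1/2}}
\exp\!\left( -\tfrac{1}{2} (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k) \right)
```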
28. For a sequence of T training vectors $X = \{x_1, x_2, \ldots, x_T\}$,
the GMM likelihood, treating the vectors as independent, can be defined as:
$$P(X \mid \lambda) = \prod_{t=1}^{T} p(x_t \mid \lambda)$$
For estimation of the speaker-specific GMM, the
expectation-maximization (EM) algorithm is used.
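In practice the likelihood of a whole utterance is evaluated in the log domain to avoid numerical underflow. The sketch below scores a sequence of frames against a GMM with diagonal covariances; the diagonal assumption and all function names are illustrative choices, not details taken from the slides.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (var holds the diagonal)."""
    d = len(x)
    log_det = sum(math.log(v) for v in var)
    maha = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + maha)


def gmm_log_likelihood(frames, weights, means, variances):
    """Sum over frames of log(sum_k w_k * N(x_t; mu_k, var_k)), via log-sum-exp."""
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gaussian_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total
```

The model parameters passed in would come from EM training on the enrollment data.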
30. $\lambda_{target}$: X (MFCC of the testing data) is from the hypothesized
speaker S
$\lambda_{UBM}$: X (MFCC of the testing data) is not from the
hypothesized speaker S
The likelihood ratio test is given by:
$$LR(X) = \frac{P(X \mid \lambda_{target})}{P(X \mid \lambda_{UBM})}$$
The probability of the alternative hypothesis is:
$$P(X \mid \lambda_{UBM}) = F\big(P(X \mid \lambda_1), P(X \mid \lambda_2), \ldots, P(X \mid \lambda_M)\big)$$
F( ) is a function such as the average or maximum of the likelihood
values $P(X \mid \lambda_i)$ of the background speaker set.
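The likelihood ratio test with a background speaker set can be sketched in the log domain as follows. The function names and the zero decision threshold are assumptions for illustration; in a real system the threshold is tuned on development data.

```python
import math


def background_log_likelihood(bg_log_likes, mode="average"):
    """Combine log P(X | lambda_i) over the background models with F = average or maximum."""
    if mode == "maximum":
        return max(bg_log_likes)
    peak = max(bg_log_likes)
    # Log of the arithmetic mean of the likelihoods, computed stably.
    mean_scaled = sum(math.exp(v - peak) for v in bg_log_likes) / len(bg_log_likes)
    return peak + math.log(mean_scaled)


def log_likelihood_ratio(target_log_like, bg_log_likes, mode="average"):
    """log LR(X) = log P(X | target) - log P(X | background)."""
    return target_log_like - background_log_likelihood(bg_log_likes, mode)


def decide(score, threshold=0.0):
    """Accept the identity claim when the score reaches the (hypothetical) threshold."""
    return "accept" if score >= threshold else "reject"
```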
31. Score Normalisation
$$s_{norm} = \frac{s - \mu_I}{\sigma_I}$$
where:
s = original score = log(LR(X))
$\mu_I$ = estimated mean of s
$\sigma_I$ = standard deviation of s
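Mean and standard deviation normalisation of a score takes only a few lines. The sketch assumes that the statistics are estimated from a set of impostor scores (suggested by the subscript I, but not stated on the slide); the function names are hypothetical.

```python
import statistics


def znorm_params(impostor_scores):
    """Estimate the mean and (population) standard deviation of the impostor scores."""
    return statistics.mean(impostor_scores), statistics.pstdev(impostor_scores)


def normalise(score, mu, sigma):
    """Shift and scale a raw log-likelihood-ratio score."""
    return (score - mu) / sigma
```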
32. PERFORMANCE EVALUATION
NIST has conducted a speaker recognition
benchmarking activity on an annual basis since
1997.
NIST has provided speech files as development
data.
NIST 2003 data:
Testing speech data: 2559
Train speech data: 356
UBM female speech data: 251
UBM male speech data: 251
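Performance is reported as the equal error rate (EER), the operating point where the false acceptance rate equals the false rejection rate. A simple threshold sweep, sketched below with hypothetical names, gives an approximate EER from lists of genuine and impostor trial scores; it is not the official NIST scoring tool.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Return (EER, threshold) at the observed threshold where FAR and FRR are closest."""
    best = None
    for thr in sorted(set(genuine_scores) | set(impostor_scores)):
        # A trial is accepted when its score is at or above the threshold.
        far = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < thr for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, thr)
    return best[1], best[2]
```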
33. For baseline speaker verification, the following parameters are
used:
VAD: Energy-based VAD (0.6 * average energy)
Feature vector: 13-dimensional MFCC appended with delta
and delta-delta
Modeling: GMM
GMM size: 8, 16, 32, 64
Comparison: log-likelihood score
41. Future Plan
Synthetically generating training and testing speech
from limited speech data.
Validating the results on state-of-the-art i-vector
based speaker verification system.