ArmorVox ImpostorMaps: How to Build an Effective Voice Biometric Solution in Three Easy Steps


ImpostorMaps™ is a patented methodology developed by Auraya and available from Auraya resellers worldwide to configure, prove, optimize and deploy voice biometric solutions in major customer-facing applications.

Originally developed to prove the performance and accuracy of voice biometric systems deployed in Australian Government services, ImpostorMaps™ has also been used by the National Australia Bank and The Vanguard Group in the USA to successfully develop and deploy voice biometric solutions now used by millions of citizens and banking customers to authenticate their identity over the telephone. Download this whitepaper for instructions on three simple steps to building an effective voice biometrics solution with ArmorVox:

Stage 1: Prototype Design and Usability Testing
Stage 2: Technology Evaluation and Impostor Maps
Stage 3: Business Rules and Optimization
The reader will also learn to deploy ImpostorMap™ technology combined with the ArmorVox Optimizer to roll out an effective voice biometric application in only three stages.

Published in: Technology


  1. TECHNOLOGY WHITEPAPER
     ArmorVox ImpostorMaps™: How to Build an Effective Voice Biometric Solution in Three Easy Steps
     AURAYA SYSTEMS
     One Tara Boulevard | Nashua, New Hampshire 03062 | +1 603 123 7654 | | linkedin/in/armorvox
  2. ArmorVox ImpostorMaps™: How to Build an Effective Voice Biometric Solution in Three Easy Steps

     ImpostorMaps™ is a methodology developed by Auraya and available from Auraya resellers worldwide to configure, prove, optimize and deploy voice biometric solutions in major customer-facing applications. Originally developed to prove the performance and accuracy of voice biometric systems deployed in Australian Government services, ImpostorMaps™ has also been used by the National Australia Bank and The Vanguard Group in the USA to successfully develop and deploy voice biometric solutions now used by millions of citizens and banking customers to authenticate their identity over the telephone.

     The process is focused on confirming the performance and the configuration of the technology to meet the end-user’s security and customer usability requirements. At the start of the process the end-user specifies their preferred authentication configuration. The process then delivers a result that confirms usability and security, so that the end-user can deploy the system confident that customers can use it effectively and that it will deliver the security the end-user requires for deployment.

     Figure 1: The ImpostorMap™ Voice Biometric Applications Development and Deployment Methodology

     ARMORVOX – ImpostorMaps™ © 2012 Auraya Systems
  3. The ImpostorMap™ methodology comprises three stages:

     Stage 1: Prototype Design and Usability Testing

     The first step in the ImpostorMap™ process is the prototype design and usability analysis. The focus of this initial stage is not security but usability. Stage 1 involves setting up a “prototype application VUI (Voice User Interface)”, implemented using either the customer’s IVR system or a hosted IVR system from one of Auraya’s telephony services partners (Voxeo), to create a prototype “model” of the authentication solution required by the end-user. This “prototype” implements the VUI which the customer believes best meets its customer service requirements. At this stage the “prototype” invokes the core authentication engine in its default settings and implements a voice data capture process that logs focus group callers’ interactions with the system, in order to understand how callers interact with the system, their speaking styles and their speaking environments (telecommunication environments and so on).

     Once the prototype is set up, a focus group of callers (usually end-user employees) uses the “prototype” system and their voice responses are captured. The process requires focus group subjects to telephone the prototype system and respond to the voice prompts in “the way they think they should respond”. During this stage each subject is provided with “fabricated” personal identity information, such as an account number, name, date or other personal information designed to model the information that would be used in the deployed application. The process elicits an authentic voice response from the focus group subjects in their preferred language and accent, including subjects saying the wrong words, hesitating and other behaviors that callers frequently exhibit. Further, because they are speaking over the telephone network (landline, mobile or other), the speech data captured will also contain the noise, distortion, interference and bandwidth limitations introduced by the network. All this information is logged by ArmorVox and is used to optimize the performance of the solution during the following stages of the ImpostorMap™ process.

     The “Over-Imposted Speech Database”

     The crucial consideration at this stage is the design of the database of “fabricated” personal identity information. Auraya uses what is known in the scientific community as an “over-imposted speech database”. In this database each subject is given “sample personal information” to speak into the trial system, presented in the same format as would be used in the final deployed system. The speaker is told that this example information is used to protect their own personal information during testing.
  4. However, the data provided is not unique to each speaker. Depending on how the evaluation is set up, there are typically 10 (or more) other speakers (of the same gender) quoting the exact same example personal information (such as the “fabricated” account numbers, PINs, names, addresses and dates of birth). In such an arrangement the only way to distinguish between different speakers is from their unique voice characteristics, not from the personal information presented. In the ImpostorMap™ process, the groups of speakers with the same information are then used as impostors to break into each other’s accounts.

     In effect, the process simulates a “massive hacker attack”, where impostors use other people’s identity information to gain unauthorized access to other users’ secure services. Given that the speakers in this database are saying the same personal information (i.e. account numbers, names, dates etc.), there is no way that current telephone voice security processes, including PIN, password or proof-of-identity questioning, can separate the legitimate speakers from the impostor speakers. This creates a situation where the current security methods produce a “100% False Accept Rate (FAR)”. This then becomes the benchmark against which the security performance of the voice biometric system can be compared, providing the end-user’s security team with a measure of how much more secure ArmorVox is than the current security solution.

     As the database is collected, the responses can be analyzed to evaluate the usability of the Voice User Interface (VUI). Based on the analysis, the VUI design can be updated and focus group participants invited to retry the system. In this way the design of the VUI can be iteratively improved to achieve the customer usability levels required for production deployment. Further, as this is a voice database of “authentic subject responses”, including all the speaker mistakes and communications network artifacts, the evaluation results generated by the benchmarking process will give an accurate reflection of the performance of a fully deployed system.

     The “over-imposted database design” also delivers the data needed to calibrate the security performance, optimize performance and develop the business rules for accepting or declining a caller’s claimed identity at the application layer. This is described in Stage 2 of the process.

     The data collection process is supported by a quality assurance (QA) module which is built into the ArmorVox system. The Voice QA module detects speaker errors (e.g. inconsistency), noisy samples, distorted or clipped speech items and problematic channel effects, and also measures the quality of the audio utterance using an ArmorVox proprietary algorithm. The module provides the feedback required by the VUI designers to improve the application and measure enhanced usability.

  5. Further, when deployed in the end-user’s telecommunications environment, the Voice QA module identifies early any issues with the telephony network that are likely to affect system performance.

     Deliverables from Stage 1

     At the end of this stage, the process delivers a VUI design confirmed to meet customer usability requirements, and the “over-imposted speech database” specifically set up for the security evaluation, tuning and optimization of the technology to meet the end-user’s requirements.

     Stage 2: Technology Evaluation and Impostor Maps

     As the speech database is being collected during Stage 1, the analysis, optimization and creation of the ImpostorMap™ commences. This process is performed by the tuning module that is incorporated in the ArmorVox product (the ArmorVox Optimizer), plus a separate analysis of the enrolment and verification processes that the end-user can use to assess the security performance and set-up of the system.

     The first step of this process is to optimize the UBMs (Universal Background Models). A UBM represents the acoustic characteristics of the voice samples used by the population of speakers to perform the verification in a given acoustic environment, e.g. the office. The more closely the UBM represents the population of speakers, the better the performance of the system on that population and the better the discrimination of speakers outside that population.

     The ArmorVox product is shipped with eight text-dependent UBMs for different types of speech, ranging from people saying account numbers in English to words and phrases. In addition there is a UBM for text-independent authentication and another for text-prompted authentication. Typically, for text-dependent authentication the closest UBM is used to seed the process, and the tuning process progressively adapts the parameters of the UBM to represent the acoustics of the enrolment and verification samples of the speakers. The process is repeated for all UBMs used in the trial application to maximize performance for each verification process and the overall security performance of the application.
  6. Once UBM optimization is complete, an analysis of the verification performance can commence, based on the measurements made by the system on all the enrolled speakers.

     Once enrollment is complete, the verification samples are verified against their respective enrolled acoustic models to generate the true speaker map. Given that the speech data is produced by the same person (i.e. the legitimate speaker), there should be a good match and the technology should return a high score for the true speakers. The problem is that this is not always the case, and during this stage the analysis focuses on those samples that return a low score and on understanding why the score is low.

     Low results can be generated by the true speaker for a number of reasons. The speaker may have said the wrong word or phrase. They may have said the information differently from the way it was originally enrolled, for example by speaking more quickly or more slowly than in the original enrollment sample. They may have hesitated, “ummed”, “arred”, coughed, sneezed or simply not said anything at all. There could have been high noise on the line, or the line may have been subject to high levels of distortion, cutting out (as often heard in mobile and some VoIP networks) or clipping.

     Information gathered at this stage is used in Stage 3 for business rule refinement, to ensure that failures are handled systematically, correctly and efficiently by the application and that when a true speaker failure occurs the reason is understood and appropriate action is taken.

     The ArmorVox Optimizer generates equal error rates (EERs) at the system level as well as at the speaker level, pre- and post-optimization. The equal error rate is the error rate at the operating point where the false acceptance rate equals the false rejection rate. The objective is of course to drive these EERs to a minimum, resulting in an optimal authentication system.

     The optimizer calibrates the performance of each enrolled voiceprint against the corresponding impostors saying the same information. This is the key process. It simulates a “massive hacker attack” in which each voiceprint enrolled in the system is subjected to a large number of impostor attacks by speakers, of the same gender and age group where possible, saying the same identity information. Such an attack is very uncommon in real deployment, but does represent a worst-case scenario for the technology.
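The trial design and the equal error rate described above can be illustrated with a minimal sketch. It is not the ArmorVox Optimizer: the function names, the data layout (a dictionary from shared identity information to speaker IDs) and the threshold sweep over observed scores are all assumptions made for illustration, and scores are assumed to be "higher means a better match".

```python
from itertools import permutations


def build_trials(groups):
    """Build true-speaker and impostor trials from an over-imposted database.

    groups maps a shared piece of identity information (hypothetical keys)
    to the speakers who were all given that same information.
    Each trial is a (claimed_speaker, actual_speaker) pair.
    """
    true_trials, impostor_trials = [], []
    for speakers in groups.values():
        for speaker in speakers:
            true_trials.append((speaker, speaker))
        # Every speaker attacks every other speaker in the same group.
        impostor_trials.extend(permutations(speakers, 2))
    return true_trials, impostor_trials


def error_rates(true_scores, impostor_scores, threshold):
    """Return (false accept rate, false reject rate) at one threshold."""
    false_rejects = sum(s < threshold for s in true_scores)
    false_accepts = sum(s >= threshold for s in impostor_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(true_scores))


def equal_error_rate(true_scores, impostor_scores):
    """Sweep the observed scores as candidate thresholds and return the
    (error rate, threshold) where the two error rates are closest."""
    best = None
    for threshold in sorted(set(true_scores) | set(impostor_scores)):
        far, frr = error_rates(true_scores, impostor_scores, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]
```

With a group of three speakers sharing one fabricated account number, `build_trials` yields three true-speaker trials and six impostor trials. A real evaluation would obtain the scores for those trials from the verification engine.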
  7. The impostor test, which is also performed by the optimizer, is the opposite of the true speaker test. That is, given that the impostor’s voice is different from the enrolled voice, the technology should return a low score even though the information spoken is the same. The problem is that this is not always the case. Some people may have similar voices, in which case the score may be quite high. Alternatively, the enrolled voiceprint may be weak and vulnerable to impostor attack. In this case impostors consistently score higher than, or close to, the true speaker scores, resulting in higher speaker equal error rates.

     The distribution of results generated by the impostor process is known as the ImpostorMap™. Typically, Auraya designs the over-imposted speech database to have a ratio of 10 impostors for each true speaker. If the database comprises, say, 500 speakers, the ImpostorMap™ thus comprises 5,000 impostor attempts. Typically, the map can be produced in as many dimensions as there are speech types used in the application (this is external to the optimizer but part of the process). For example, if an application uses account number and date of birth (as in the case of current banking deployments), then a two-dimensional ImpostorMap™ is created. Adding another speech type, such as a phrase (“At the bank my voice is my password”), creates a three-dimensional map.

     The ImpostorMap™ is the critical information, as it provides a profile of the security performance of the application under consideration and of its ability to separate true speakers from impostor speakers saying the same information. Typically, current security solutions based on knowledge questions, PINs and passwords would generate a 100% false accept rate. That is, all impostors would breach the system. Auraya’s ImpostorMap™ process provides a tangible measure of how much more secure a voice biometric system is compared to current security solutions, and provides confidence that the system will deliver security when it encounters an impostor.

     Deliverables from Stage 2

     This stage delivers the performance of the technology as an outcome of running the optimizer, together with the ImpostorMap™ (produced externally to the system). Both are used in the next stage to develop the business rules for the application and provide additional optimization.
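The arithmetic and the multi-dimensional map described above can be sketched as follows. This is an illustrative sketch only: representing the map as a list of score tuples (one score per speech type) and accepting a caller only when every score meets its threshold are assumptions, not a description of how ArmorVox combines scores.

```python
def impostor_attempt_count(num_speakers, impostors_per_speaker):
    """Size of the map: every enrolled speaker is attacked by a fixed
    number of impostors saying the same identity information."""
    return num_speakers * impostors_per_speaker


def false_accept_rate(impostor_map, thresholds):
    """Fraction of impostor attempts accepted by a simple AND rule.

    impostor_map is a list of score tuples, one tuple per impostor attempt
    and one score per speech type (e.g. account number, date of birth).
    An attempt is accepted only if every score meets its threshold.
    """
    accepted = sum(
        all(score >= limit for score, limit in zip(scores, thresholds))
        for scores in impostor_map
    )
    return accepted / len(impostor_map)


# Knowledge-only checks (PIN, password, questions) accept every impostor
# who knows the information, so their false accept rate here is 100%.
KNOWLEDGE_ONLY_FAR = 1.0

# Illustrative two-dimensional map: (account number score, date of birth score).
example_map = [(0.2, 0.3), (0.7, 0.4), (0.8, 0.9), (0.1, 0.6)]
example_far = false_accept_rate(example_map, thresholds=(0.6, 0.6))
```

Here `impostor_attempt_count(500, 10)` gives the 5,000 attempts quoted in the text, and `example_far` is 0.25 against the knowledge-only benchmark of 1.0. The scores in `example_map` are made up for the example.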
  8. Stage 3: Business Rules and Optimization

     The optimizer measurements and ImpostorMaps™ can be used to develop and optimize the rules associated with the application and its intended use. This stage is controlled by what the client is looking to achieve as a business outcome.

     Using EER measurements for different speech items as well as ImpostorMap™ analysis, different system configurations can be developed that allow the trade-off between impostor false accepts and true speaker false rejects to be examined. In some cases ImpostorMaps™ have been used to design systems that combine different voice biometric technologies, such as text-dependent and text-independent technologies, to produce systems that exhibit very high security and very low true speaker false reject rates. ImpostorMaps™ also enable rules to be developed to handle ambiguous results, where the authentication scores are on the boundary between the true speaker map and the impostor map. Maps can be used to attach confidence and probability scores to speakers, enabling the end-user to develop rules that limit risk by limiting rights and access privileges based on scores generated by ArmorVox. ImpostorMaps™ have also been used to develop rules for handling infrequent callers, for example, and for adjusting rules in noisy and mobile channels.

     For example, in the case of an Australian bank, ImpostorMaps™ were used to determine the performance of the system as noise in the telephony channel was increased. This highlighted vulnerabilities in their application, especially during the enrolment process. In this case ImpostorMaps™ were used to develop rules to restrict enrolment to quiet environments and channels. In ArmorVox this information can be used to set the parameters within the Voice QA module to flag when noisy conditions lead to performance limitations and situations where the security performance of the system could be compromised (i.e. voice biometric vulnerability analysis).

     In another example, a US financial services firm used ImpostorMaps™ to confirm that the technology met their stringent security requirements for distinguishing between family members accessing the system. Having shown that specific configurations of the technology would meet their requirements, they then proceeded to implement and configure the application to meet their security and business requirements.
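The kind of score-based business rule discussed in Stage 3 can be sketched in a few lines. Everything specific here is hypothetical: the three outcomes, the threshold values and the signal-to-noise limit are placeholders chosen for illustration, and in practice they would be derived from the EER measurements and the ImpostorMap™ for the application.

```python
def access_decision(score, accept_at=0.8, decline_below=0.5):
    """Map a verification score to an access outcome.

    Scores in the ambiguous band between the two thresholds get limited
    rights rather than a hard accept or decline.
    """
    if score >= accept_at:
        return "full_access"
    if score < decline_below:
        return "decline"
    return "limited_access"


def allow_enrolment(snr_db, min_snr_db=20.0):
    """Restrict enrolment to quiet environments and channels.

    snr_db is an estimated signal-to-noise ratio for the enrolment audio;
    the 20 dB default is an arbitrary placeholder, not a recommended value.
    """
    return snr_db >= min_snr_db
```

Keeping the thresholds as parameters means the same rule can be tightened for noisy or mobile channels without changing the application logic.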
  9. Voiceprint Vulnerability Analysis

     A unique feature of ArmorVox is its voiceprint vulnerability analysis. Typically, 10% to 15% of voiceprints stored in the database are potentially vulnerable and susceptible to impostor attacks. Vulnerabilities occur when the enrolment process is corrupted in some way, whether through noisy speaking conditions, transmission interference or degradation, speaker errors or an inconsistent speaking style. Analysis shows that weak and vulnerable voiceprints have a significant impact on the overall security performance of an application.

     Using impostor data automatically generated by ArmorVox, the ArmorVox system can detect and optimize weak and vulnerable voiceprints and strengthen performance for the whole system. It can also identify outliers or “goats” that do not exhibit average speaker behaviour. The tool tests each voiceprint in the database, attaching to each a confidence score that indicates the security strength of that voiceprint. Voiceprints found to have weak security strengths are selected for optimization or re-enrolment, using either existing or newly acquired speech data, typically eliminating weak and vulnerable voiceprints.

     As well as significantly enhancing the overall security performance of the solution, this proprietary process also sets the individual speaker thresholds automatically and allows business rules to be set at the “individual level”, e.g. by manually raising or lowering individual thresholds.

     Deliverable from Stage 3

     This final stage delivers a suggested configuration for the application to meet the end-user’s security requirements, e.g. the best speech items to use and the situations under which the system generates an error response, such as when a voice sample is too noisy, when the caller is saying the incorrect information and so on.
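The idea of testing each voiceprint against impostor data and setting an individual threshold can be illustrated with a simple sketch. The ArmorVox process is proprietary and is not reproduced here: the margin measure, the midpoint threshold and the "weak" criterion below are assumptions chosen only to make the concept concrete.

```python
def analyse_voiceprint(true_scores, impostor_scores):
    """Summarise how well one voiceprint separates its owner from impostors.

    margin is the gap between the owner's worst score and the best impostor
    score; a margin of zero or less means at least one impostor scored as
    well as the true speaker, so the voiceprint is flagged as weak.
    """
    lowest_true = min(true_scores)
    highest_impostor = max(impostor_scores)
    return {
        "margin": lowest_true - highest_impostor,
        # Individual threshold placed midway between the two extremes.
        "threshold": (lowest_true + highest_impostor) / 2,
        "weak": lowest_true <= highest_impostor,
    }


def weak_voiceprints(score_table):
    """Return the speakers whose voiceprints should be optimized or re-enrolled.

    score_table maps a speaker ID to a (true_scores, impostor_scores) pair.
    """
    return sorted(
        speaker
        for speaker, (true_scores, impostor_scores) in score_table.items()
        if analyse_voiceprint(true_scores, impostor_scores)["weak"]
    )
```

A table in which one speaker's best impostor score exceeds that speaker's worst genuine score would return that speaker from `weak_voiceprints`, marking the voiceprint as a candidate for re-enrolment.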
  10. Conclusion

     ArmorVox Optimizer and the ArmorVox ImpostorMap™ methodology ensure:

     a. The dialogue, persona and authentication processes meet the client’s customers’ expectations, and users can effectively authenticate their identity using the system.
     b. The technology performance is known, security settings are optimized, and vulnerabilities are eliminated or minimized to meet the end-user’s requirements.
     c. The business rules implement the end-user’s identity authentication security requirements, allowing the system to integrate with the end-user’s identity management systems.

     At this point the end-user can roll out the voice biometric application based on ArmorVox, confident that the system will work and will deliver the user acceptance and security requirements for successful deployment.

     Next Steps

     The ArmorVox ImpostorMap™ methodology is available from Auraya or an Auraya Certified Reseller Partner. Either contact Auraya directly via our website or contact your local Auraya Reseller Partner (the website also lists resellers worldwide).
  11. About the Author

     Dr. Clive Summerfield is Auraya Systems’ Founder and Chief Executive Officer. Clive is an internationally recognized authority on voice technology and holds numerous patents in Australia, the USA and the UK in radar processing, speech chip design, speech recognition and voice biometrics.

     As a former founding Deputy Director of the National Centre for Biometric Studies (NCBS) at the University of Canberra, in 2005 Clive undertook what was at the time the world’s largest scientific analysis of voice biometric systems, leading to the adoption of voice biometrics for secure services. That experience led Clive to found Auraya in 2006, a business exclusively focused on advanced voice biometric technologies for enterprise and cloud-based services. Visit for Clive Summerfield’s full bio.

     About Auraya Systems

     Founded in 2006, Auraya Systems, the creator of the ArmorVox™ Speaker Identity System, is a global leader in the delivery of advanced voice biometric technologies for security and identity management applications in a wide range of markets including banks, government and health services. Offices are located near Boston, USA, and in Canberra and Sydney, Australia. For more information, please visit