© 2002 – 2012 Versay Solutions, LLC. All rights reserved.
Alphanumeric Speech Recognition
SpeechTEK
August 19, 2013
Crispin Reedy
“The fault, dear Brutus, is not in
our stars, but in ourselves”
-- Julius Caesar, Act I, scene ii
The Problem With Alphanumerics
The Need
• Account Numbers
• Policy Numbers
• Spelling out names and addresses
• Special cases
– VIN, Canadian Postal Code
• And more…
Methods for Addressing
• Project Tactics
• Limit the grammar
– Constraint List
– N-Best + Back-End Data Validation
• Confirmation
• Prefiller
Project Tactics
• Can you avoid it?
– Phone number / SSN / Zip / DOB?
• Set expectations
– Not always easy!
• Describe the problem
• What tools do you have available?
– Constraints / patterns?
– Back-end data source available?
• Can you run a proof of concept / experiment?
Constraints and Patterns
• Does the number have any known pattern
that can be used to limit possible values (and
thereby improve recognition)?
– For example:
• First character is always A
• First three characters are always numbers
• Last characters are always C, G or T.
• If the answer is “no,” consider doing your own
analysis.
– Even if you don’t think there is a pattern, there
may be one.
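
A minimal sketch of "doing your own analysis," assuming you can export a sample of real identifiers. The sample values and the `position_profile` helper are hypothetical. The idea is simply to list which characters actually occur at each position, so that fixed or narrow positions stand out.

```python
from collections import defaultdict


def position_profile(samples):
    """Return, for each character position, the sorted characters seen there."""
    seen = defaultdict(set)
    for value in samples:
        for index, char in enumerate(value.upper()):
            seen[index].add(char)
    return [sorted(seen[i]) for i in range(len(seen))]


# Hypothetical sample; in practice this would come from your own account data.
sample_ids = ["A123XC", "A907QG", "A555ZT", "A014MC"]

profile = position_profile(sample_ids)
for index, chars in enumerate(profile):
    print(f"position {index + 1}: {''.join(chars)}")
```

Positions that show only one character, only digits, or a small set of letters are candidates for constraints. A larger sample gives a more trustworthy profile.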
Applying Constraints
• Writing a grammar specifically for the pattern
– How complicated is it?
• Applying a constraint list.
– How big is it?
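
To make the two options concrete, here is a rough sketch that stands in for them with plain Python checks. It is not a recognizer grammar: the regular expression plays the role of a pattern-specific grammar, and the set plays the role of a constraint list. The pattern and the account values are invented for illustration.

```python
import re

# Hypothetical pattern: "A", three digits, any letter, then C, G or T.
ACCOUNT_PATTERN = re.compile(r"A[0-9]{3}[A-Z][CGT]")

# Hypothetical constraint list: the account numbers that actually exist.
KNOWN_ACCOUNTS = frozenset({"A123XC", "A907QG", "A555ZT"})


def matches_pattern(value):
    """True if the whole value fits the known pattern."""
    return ACCOUNT_PATTERN.fullmatch(value) is not None


def in_constraint_list(value):
    """True if the value is one of the known accounts."""
    return value in KNOWN_ACCOUNTS


print(matches_pattern("A123XG"), in_constraint_list("A123XG"))
```

The pattern stays small no matter how many accounts exist but admits values that are not real. The list is exact but grows with the customer base, which is why its size matters.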
Using nBest + Back-End Data
• Collect using an unconstrained grammar
• Set your recognizer to return an nBest list.
• Use a web service / back-end data dip to
determine which ones are “real.”
• Confirm the first “real” one on the list
– Throw out the ones that are not real.
• If the caller says no, confirm the second “real” one on the
list.
– Potentially collect again after that.
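
The steps above can be sketched roughly as follows. The recognizer and the back end are both stubbed out: `exists` stands for whatever data dip you have, `confirm` stands for the yes/no confirmation dialog, and all of the names and sample values are assumptions for illustration.

```python
def real_candidates(n_best, exists):
    """Keep the hypotheses the back end knows, preserving recognizer order."""
    seen = set()
    kept = []
    for hypothesis in n_best:
        if hypothesis not in seen and exists(hypothesis):
            seen.add(hypothesis)
            kept.append(hypothesis)
    return kept


def collect_with_n_best(n_best, exists, confirm, max_confirmations=2):
    """Confirm up to max_confirmations real hypotheses; return the accepted one."""
    for hypothesis in real_candidates(n_best, exists)[:max_confirmations]:
        if confirm(hypothesis):
            return hypothesis
    return None  # nothing real, or the caller said no each time: collect again


# Hypothetical data: two real accounts and a four-item n-best list.
accounts = {"BZ390", "DZ390"}
n_best = ["PZ390", "BZ390", "DZ390", "BZ380"]
answers = iter([False, True])  # caller says "no" once, then "yes"

result = collect_with_n_best(n_best, accounts.__contains__, lambda h: next(answers))
print(result)
```

Capping the loop at two confirmations mirrors the slide: after a second "no," another collection is usually kinder to the caller than a third guess.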
Confirmation Strategy
• PROTIP: Phonemes that are difficult for the
recognizer to hear … are also difficult for
humans to hear when they are spoken back.
• Confirm using letter names for easily
confusable alphanumerics.
– “You said 8, 2, 7, G as in George, B as in Boy, 9. Is
that right?”
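
A small sketch of building that kind of confirmation prompt. Only "G as in George" and "B as in Boy" come from the slide. The other letter names and both function names are placeholders, and which characters count as confusable is a design decision for your application.

```python
# Letter names for easily confused letters. Only G and B are from the slide;
# the rest are hypothetical placeholders.
LETTER_NAMES = {
    "B": "Boy",
    "D": "David",
    "G": "George",
    "P": "Paul",
    "T": "Tom",
}


def spoken_form(char):
    """Return 'X as in Name' for confusable letters, else the character itself."""
    upper = char.upper()
    name = LETTER_NAMES.get(upper)
    return f"{upper} as in {name}" if name else upper


def confirmation_prompt(value):
    """Build the confirmation text for a recognized alphanumeric string."""
    spoken = ", ".join(spoken_form(c) for c in value if not c.isspace())
    return f"You said {spoken}. Is that right?"


print(confirmation_prompt("827GB9"))
```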
What About Letter Names?
• Yes, with caveats:
– Do you have a special domain that would allow
you to teach the caller letter names?
– Letter names invented by the caller will be quite
variable.
• Some of the “oddballs” will never be recognized
– If letter names are used during confirmation, and
the utterance is re-collected, the caller may tend
to use those letter names during the second
collection.
• So add them.
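
If callers do start saying letter names, the result has to be folded back into plain characters somewhere. Here is one hedged sketch, assuming the recognizer hands back a raw text transcript (many platforms would do this inside the grammar instead). The `normalize` helper is hypothetical.

```python
import re

# "<letter> as in <word>" collapses to just the letter.
LETTER_NAME = re.compile(r"\b([A-Za-z]) as in [A-Za-z]+\b")


def normalize(transcript):
    """Collapse letter-name phrases and remove spacing from a raw transcript."""
    collapsed = LETTER_NAME.sub(r"\1", transcript)
    return "".join(collapsed.split()).upper()


print(normalize("8 2 7 G as in George B as in boy 9"))
```

This trusts the letter the caller states and ignores the word that follows, so an "oddball" letter name still resolves as long as the letter itself was heard correctly.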
What About Letter Names?
• Yes, because:
– Longer utterances such as “B as in Boy” are unlikely to
be falsely accepted in place of shorter
utterances such as “G” or “T.”
• Make them separate rules so they can be
weighted
Using Prefiller
• “The account number is… B Z 3 9 0”
– Noticeable improvement in recognition of first
letter
– Caller may offer it spontaneously
– Consider teaching the caller to say the prefiller
• Especially if you have repeat callers
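
If the prefiller is accepted by the grammar, it also has to be dropped before the value is used. A minimal sketch, assuming a raw text transcript and a short, made-up list of prefiller phrases (the real list should come from listening to your callers):

```python
import re

# Hypothetical prefiller phrases, anchored at the start of the utterance.
PREFILLER = re.compile(
    r"^\s*(?:it's|it is|(?:the|my) (?:account|policy) number is)\b[\s.,…]*",
    re.IGNORECASE,
)


def strip_prefiller(transcript):
    """Remove a leading prefiller phrase, if present, and trim whitespace."""
    return PREFILLER.sub("", transcript, count=1).strip()


print(strip_prefiller("The account number is… B Z 3 9 0"))
```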
Other Suggestions
• Look at speech recognition parameters that
are not directly related to alphanumerics
– Are callers calling from a very noisy environment?
• Adjust overall speech threshold
– Timing of utterance collection?
• Listen to recordings of utterances to make sure
everything is getting collected
Specific Cases
• VIN
– Has specific pattern, but different for each
manufacturer
– 17 characters: nobody will want to re-enter it if you get it
wrong.
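
One part of the VIN pattern does not vary by manufacturer: on 17-character VINs that carry a check digit (required for North American market vehicles; other markets may not use it), position 9 can be recomputed from the other characters. A sketch of that check, which could be used to filter an n-best list before confirming. The function name is hypothetical.

```python
# Letter values used by the VIN check-digit scheme (I, O and Q never appear).
TRANSLITERATION = dict(zip("ABCDEFGH", range(1, 9)))
TRANSLITERATION.update(zip("JKLMN", range(1, 6)))
TRANSLITERATION.update({"P": 7, "R": 9})
TRANSLITERATION.update(zip("STUVWXYZ", range(2, 10)))
TRANSLITERATION.update({str(d): d for d in range(10)})

# Per-position weights; position 9 (the check digit itself) has weight 0.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def vin_check_digit_ok(vin):
    """True if vin has 17 legal characters and a matching check digit."""
    vin = vin.upper()
    if len(vin) != 17 or any(c not in TRANSLITERATION for c in vin):
        return False
    total = sum(TRANSLITERATION[c] * w for c, w in zip(vin, WEIGHTS))
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected


# Hypothetical n-best list: only the first hypothesis passes the check.
n_best = ["1M8GDM9AXKP042788", "1M8GDN9AXKP042788", "1M8GDM9AXKP04278"]
print([v for v in n_best if vin_check_digit_ok(v)])
```

A passing check digit does not prove the VIN exists, so this complements a back-end lookup and does not replace it.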
But which way is “the best”?
IT DEPENDS!


Editor's Notes

  1. Or, in other words, the problem, dear friends, is not with our recognizers but with our language. Specifically, it is with our English letter names, many of which are very confusable (e.g. E, D, T, 3, G, P). Most of these consist of a very short initial phoneme, which is easily missed by recognizers, followed by a longer phoneme, which is quite confusable.