
2013 Speech TEK - Alphanumeric Recognition Discussion

This morning's discussion on Alphanumeric Reco was great. Here are the slides for anyone who is interested. Thanks to all for sharing their experiences!

Published in: Technology, Education

  1. Alphanumeric Speech Recognition
     SpeechTek, August 19, 2013
     Crispin Reedy
     © 2002 – 2012 Versay Solutions, LLC. All rights reserved.
  2. The Problem With Alphanumerics
     “The fault, dear Brutus, is not in our stars, but in ourselves”
     -- Julius Caesar, Act I, scene ii
  3. The Need
     • Account numbers
     • Policy numbers
     • Spelling out names and addresses
     • Special cases
       – VIN, Canadian postal code
     • And more…
  4. Methods for Addressing
     • Project tactics
     • Limit the grammar
       – Constraint list
       – N-Best + back-end data validation
     • Confirmation
     • Prefiller
  5. Project Tactics
     • Can you avoid it?
       – Phone number / SSN / Zip / DOB?
     • Set expectations
       – Not always easy!
     • Describe the problem
     • What tools do you have available?
       – Constraints / patterns?
       – Back-end data source available?
     • Can you run a proof of concept / experiment?
  6. Constraints and Patterns
     • Does the number have any known pattern that can be used to limit possible values (and thereby improve recognition)?
       – For example:
         • First character is always A
         • First three characters are always numbers
         • Last characters are always C, G or T
     • If the answer is “no,” consider doing your own analysis.
       – Even if you don’t think there is a pattern, there may be one.
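A minimal Python sketch of how a pattern like the ones on this slide shrinks the space the recognizer has to search. The five-character format used here ("A", then three digits, then C, G or T) is a hypothetical account-number layout made up for illustration, not one taken from the talk, and a real deployment would express it in the recognizer's grammar format rather than a regular expression.

```python
import re

# Hypothetical format (an assumption for illustration only):
# the letter "A", then three digits, then one of C, G or T.
ACCOUNT_PATTERN = re.compile(r"A[0-9]{3}[CGT]")


def matches_pattern(candidate: str) -> bool:
    """Return True if the whole candidate string fits the assumed format."""
    return ACCOUNT_PATTERN.fullmatch(candidate) is not None


# Size of the search space without and with the constraint.
unconstrained = 36 ** 5          # 26 letters + 10 digits in each of 5 slots
constrained = 1 * 10 ** 3 * 3    # fixed "A", three digits, three endings

print(matches_pattern("A123G"))     # True
print(matches_pattern("B123G"))     # False
print(unconstrained, constrained)   # 60466176 3000
```

Under these assumptions the constraint cuts roughly 60 million possible strings down to 3,000, which is the kind of reduction that makes pattern analysis worth doing.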
  7. Applying Constraints
     • Writing a grammar specifically for the pattern
       – How complicated is it?
     • Applying a constraint list
       – How big is it?
  8. Using nBest + Back-End Data
     • Collect using an unconstrained grammar.
     • Set your recognizer to return an nBest list.
     • Use a web service / back-end data dip to determine which ones are “real.”
     • Confirm the first “real” one on the list.
       – Throw out the ones that are not real.
     • If the caller says no, confirm the second “real” one on the list.
       – Potentially collect again after that.
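The nBest procedure on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: `real_candidates`, the `is_real` callback, and the sample account numbers are hypothetical names and data, and in a real application `is_real` would wrap whatever web service or database lookup is available.

```python
from typing import Callable, Iterable, List


def real_candidates(
    nbest: Iterable[str],
    is_real: Callable[[str], bool],
    max_confirmations: int = 2,
) -> List[str]:
    """Filter an n-best list down to hypotheses the back end knows about.

    Recognizer order is preserved, duplicates are skipped, and the scan
    stops once there are enough candidates to confirm.
    """
    kept: List[str] = []
    for hypothesis in nbest:
        if hypothesis in kept:
            continue
        if is_real(hypothesis):
            kept.append(hypothesis)
            if len(kept) == max_confirmations:
                break
    return kept


# Hypothetical back-end data and recognizer output, for illustration.
known_accounts = {"BZ390", "DZ390"}
nbest = ["PZ390", "BZ390", "EZ390", "DZ390", "TZ390"]

to_confirm = real_candidates(nbest, lambda h: h in known_accounts)
print(to_confirm)  # ['BZ390', 'DZ390']
```

The dialog would confirm `to_confirm[0]` first, fall back to `to_confirm[1]` if the caller says no, and re-collect if the list is exhausted.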
  9. Confirmation Strategy
     • PROTIP: Phonemes that are difficult for the recognizer to hear … are also difficult for humans to hear when they are spoken back.
     • Confirm using letter names for easily confusable alphanumerics.
       – “You said 8, 2, 7, G as in George, B as in Boy, 9. Is that right?”
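A small Python sketch of building a confirmation prompt like the one on this slide. Only "George" and "Boy" come from the talk; the other letter names in the table, and the function name `confirmation_prompt`, are assumptions added for illustration.

```python
# Letter names for easily confused letters. "Boy" and "George" are from the
# slide; the remaining entries are illustrative assumptions.
LETTER_NAMES = {
    "B": "Boy",
    "G": "George",
    "D": "David",
    "P": "Peter",
    "T": "Tom",
}


def confirmation_prompt(value: str) -> str:
    """Spell a collected value back, adding a letter name where one is defined."""
    parts = []
    for ch in value.upper():
        name = LETTER_NAMES.get(ch)
        parts.append(f"{ch} as in {name}" if name else ch)
    return "You said " + ", ".join(parts) + ". Is that right?"


print(confirmation_prompt("827GB9"))
# You said 8, 2, 7, G as in George, B as in Boy, 9. Is that right?
```

Digits and letters without an entry are read back as-is, so the table can stay limited to the characters that are actually confusable.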
  10. What About Letter Names?
      • Yes, with caveats:
        – Do you have a special domain that would allow you to teach the caller letter names?
        – Letter names invented by the caller will be quite variable.
          • Some of the “oddballs” will never be recognized.
        – If letter names are used during confirmation, and the utterance is re-collected, the caller may tend to use those letter names during the second collection.
          • So add those letter names to the grammar.
  11. What About Letter Names?
      • Yes, because:
        – Longer utterances (“B as in Boy”) are less likely to generate false acceptances than shorter utterances such as “G,” “T,” etc.
          • Make them separate rules so they can be weighted.
  12. Using Prefiller
      • “The account number is… B Z 3 9 0”
        – Noticeable improvement in recognition of the first letter
        – Caller may spontaneously offer it
        – Consider teaching the caller to say the prefiller
          • Especially if you have repeat callers
  13. Other Suggestions
      • Look at speech recognition parameters that are not directly related to alphanumerics
        – Are callers calling from a very noisy environment?
          • Adjust overall speech threshold
        – Timing of utterance collection?
          • Listen to recordings of utterances to make sure everything is getting collected
  14. Specific Cases
      • VIN
        – Has specific pattern, but different for each manufacturer
        – 17 characters: nobody will want to re-enter if you get it wrong.
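One VIN constraint that does not depend on the manufacturer is the check digit in position 9, which can be used to discard nBest hypotheses much like a back-end lookup. The talk does not mention this; the Python sketch below is an added illustration, and it assumes North American-market VINs, where the check digit is mandatory (VINs from other markets may not carry a valid one). The function name `vin_check_digit_ok` is hypothetical.

```python
# Standard VIN transliteration: digits keep their value, letters map to 1-9.
# I, O and Q never appear in a VIN, so they are deliberately absent.
TRANSLITERATION = {
    **{str(d): d for d in range(10)},
    **dict(zip("ABCDEFGH", range(1, 9))),
    **dict(zip("JKLMN", range(1, 6))),
    "P": 7,
    "R": 9,
    **dict(zip("STUVWXYZ", range(2, 10))),
}

# Positional weights; position 9 (the check digit itself) has weight 0.
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]


def vin_check_digit_ok(vin: str) -> bool:
    """Return True if a 17-character VIN has a consistent check digit."""
    vin = vin.upper()
    if len(vin) != 17 or any(ch not in TRANSLITERATION for ch in vin):
        return False  # wrong length, or contains I, O, Q or other characters
    total = sum(TRANSLITERATION[ch] * w for ch, w in zip(vin, WEIGHTS))
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected


print(vin_check_digit_ok("1M8GDM9AXKP042788"))  # True
print(vin_check_digit_ok("1N8GDM9AXKP042788"))  # False: "M" misheard as "N"
```

A single substituted character, such as the M/N confusion in the second call, usually changes the expected check digit, so many misrecognitions can be rejected before the caller is asked to confirm anything.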
  15. But which way is “the best”?
      IT DEPENDS!
