Semantic expansion of language models can significantly improve the performance of ASR systems. Noam Ziv, CEO of Semantic Interfaces, explains why and how.
3. SemanticInterfaces
Semantic Interfaces
LTD. Natural Language
Understanding
ASR Glass Ceiling: the missing last mile
ASR Engine               English   Hebrew
Google Voice             74%       37%
Dragon                   78%       46%
Sphinx                   87%       88%
Nuance Recognizer (NR)   84%       83%
Bing                     56%       -
*ASR application: voice commands. Benchmark: April 2016. Success rate = task completion %.
4. SemanticInterfaces
Linguistic parsing
The answer for the last mile
Question: How can a mother understand her two-year-old when the child asks for something and more than 50% of the words it utters are unintelligible?
MAMA WA SEUSS ME PAD = I wanna watch Doctor Seuss on my iPad
Answer: By taking contextual considerations into account and by using linguistic intuitions.
5. SemanticInterfaces
What do contextual considerations and linguistic intuitions really mean?
• Semantic transformations LM expansion
• Word class semantic LM expansion
• Synonyms semantic LM expansion
6. SemanticInterfaces
LM expansion
Transformations
(1) LM based SR toolkit (speaker dependent): quantitative LM expansion (Kaldi, Sphinx, HTK)
(2) Syntax based ASR engines: specifying domain specific syntax (NR, for example)
(3) LVCSR general language engines: post process enhancement (Google Voice, Dragon, Alexa, etc.)
Statement/question + active/passive transformations:
expanding the LM by a factor of 4 (2 sentence types x 2 voices)
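The factor of 4 can be illustrated with a toy sketch: one transitive clause yields an active statement, a passive statement, an active question and a passive question. The `TransitiveClause` type, the `expand` function and the example sentence are hypothetical illustrations, not part of the SI engine; the verb forms are supplied by hand rather than derived by a parser.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TransitiveClause:
    """A pre-parsed simple past-tense clause (hypothetical representation)."""
    agent: str       # e.g. "the dog"
    patient: str     # e.g. "the cat"
    base: str        # base verb form, e.g. "chase"
    past: str        # past tense, e.g. "chased"
    participle: str  # past participle, e.g. "chased"


def expand(c: TransitiveClause) -> List[str]:
    """Return the 2 x 2 variants: statement/question x active/passive."""
    return [
        f"{c.agent} {c.past} {c.patient}",                # active statement
        f"{c.patient} was {c.participle} by {c.agent}",   # passive statement
        f"did {c.agent} {c.base} {c.patient}?",           # active question
        f"was {c.patient} {c.participle} by {c.agent}?",  # passive question
    ]


clause = TransitiveClause("the dog", "the cat", "chase", "chased", "chased")
variants = expand(clause)
```

Each corpus sentence that parses as such a clause contributes four LM entries instead of one.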
7. SemanticInterfaces
Can Chomsky’s transformational grammar
facilitate a significant LM expansion?
Statement/question transformation rule:
the man is1 happy as is2
?
LM expansion
Transformations
8. SemanticInterfaces
Can Chomsky’s transformational grammar
facilitate a significant LM expansion?
Statement/question transformation rule:
the man who is1 tall is2 happy
?
LM expansion
Transformations
9. SemanticInterfaces
Statement/question semantic transformation rule:
Statement: [S [NP the man [who is2 tall]] [VP is1 happy]]
Question: is1 the man [who is2 tall] happy?
LM expansion
Semantic Transformations
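A minimal sketch of why the rule has to work on the parse rather than on the word string. It contrasts a naive linear rule (front the first "is") with a structure-dependent rule (front the auxiliary of the main clause). The `Clause` type and both function names are hypothetical; the parse is supplied by hand and stands in for the output of a real parser.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clause:
    """Hand-built parse: S -> NP VP, with the VP split into aux + predicate."""
    subject: List[str]    # whole subject NP, including any relative clause
    aux: str              # auxiliary of the main clause
    predicate: List[str]  # rest of the VP


def linear_question(tokens: List[str], aux: str = "is") -> List[str]:
    """Naive rule: move the first occurrence of the auxiliary to the front."""
    i = tokens.index(aux)
    return [aux] + tokens[:i] + tokens[i + 1:]


def structural_question(clause: Clause) -> List[str]:
    """Structure-dependent rule: move the main-clause auxiliary to the front."""
    return [clause.aux] + clause.subject + clause.predicate


parsed = Clause(subject="the man who is tall".split(), aux="is", predicate=["happy"])
statement = parsed.subject + [parsed.aux] + parsed.predicate

wrong = " ".join(linear_question(statement)) + "?"
right = " ".join(structural_question(parsed)) + "?"
```

The linear rule pulls the auxiliary out of the relative clause and produces an ungrammatical string; only the structural rule gives a question that is safe to add to the LM.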
10. SemanticInterfaces
September eleven caused Clinton’s administration in Washington to turn 180 degrees back the way they came.
Word class expansion rule: transform all corpus numbers, private names, places, dates, etc. into [any number], [any private name], [any place], [any date]…
[any month] [any number] caused [any private name] administration in [any city/state] to turn [any number] degrees back the way they came.
LM expansion
Word class expansion
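The word class rule can be sketched as a substitution pass over the corpus. This is an illustration under stated assumptions: the tiny `WORD_CLASSES` lexicon, the `NUMBER_WORDS` list and the `generalize` function are hypothetical stand-ins for whatever lexicon and tagger a real system would use.

```python
import re

# Hypothetical toy lexicon: class label -> member words (assumption).
WORD_CLASSES = {
    "any month": ["January", "September", "December"],
    "any private name": ["Clinton", "Trump"],
    "any city/state": ["Washington", "Boston"],
}
# Spelled-out numbers handled by this sketch (assumption, not exhaustive).
NUMBER_WORDS = ["one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten", "eleven", "twelve"]


def generalize(sentence: str) -> str:
    """Replace class members and numbers with their class placeholder."""
    out = sentence
    for label, words in WORD_CLASSES.items():
        # An optional possessive ending is absorbed into the placeholder.
        pattern = r"\b(?:" + "|".join(map(re.escape, words)) + r")\b(?:['’]s)?"
        out = re.sub(pattern, f"[{label}]", out)
    number_pattern = r"\b(?:\d+|" + "|".join(NUMBER_WORDS) + r")\b"
    return re.sub(number_pattern, "[any number]", out)


template = generalize(
    "September eleven caused Clinton’s administration in Washington "
    "to turn 180 degrees back the way they came."
)
```

One corpus sentence becomes a template that covers every combination of class members.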
11. SemanticInterfaces
Once the corpus is parsed:
[any month] [1 to 31] caused [any
USA president] administration in
[any city/state] to turn [1 to 360]
degrees back the way they came.
LM expansion
Semantic Word class expansion
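A rough sketch of what the constrained classes buy: each slot has a finite, meaningful value set, so the expansion is large but bounded. The slot names, the two-entry president and place lists, and both helper functions are hypothetical and chosen only to keep the example short.

```python
import math
from itertools import product
from typing import Dict, Iterator, Sequence

TEMPLATE = ("{month} {day} caused {president} administration in {place} "
            "to turn {degrees} degrees back the way they came.")

# Constrained slot values (the short lists are illustrative assumptions).
SLOTS: Dict[str, Sequence] = {
    "month": ["January", "February", "March", "April", "May", "June", "July",
              "August", "September", "October", "November", "December"],
    "day": range(1, 32),        # [1 to 31]
    "president": ["Clinton", "Bush"],
    "place": ["Washington", "Texas"],
    "degrees": range(1, 361),   # [1 to 360]
}


def count_expansions(slots: Dict[str, Sequence]) -> int:
    """Number of sentences the template stands for."""
    return math.prod(len(values) for values in slots.values())


def iter_expansions(template: str, slots: Dict[str, Sequence]) -> Iterator[str]:
    """Lazily generate every sentence covered by the template."""
    for combo in product(*slots.values()):
        yield template.format(**dict(zip(slots, combo)))


total = count_expansions(SLOTS)
first = next(iter_expansions(TEMPLATE, SLOTS))
```

With these toy lists a single parsed sentence already stands for 12 x 31 x 2 x 2 x 360 sentences, while nonsensical values such as "September 75" stay out of the LM.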
12. SemanticInterfaces
• I found Trump attractive on Sundays
• I found eggs under my pillow on Sundays
Synonyms rule: replace any corpus term with its
synonymous term, e.g. found = discovered
• I discovered Trump attractive on Sundays
• I discovered eggs under my pillow on Sundays
LM expansion
Synonym Expansion
13. SemanticInterfaces
• I found [Trump attractive on Sundays]
• I found [eggs under my pillow on Sundays]
Semantic Synonyms rule: replace any corpus term with a synonymous term only if its lexical requirements are met in the sentential context
(find: [_ NP] [_ S], discover: [_ NP])
• I discovered that [Trump is attractive on Sundays]
• I discovered [eggs under my pillow on Sundays]
LM expansion
Semantic Synonym Expansion
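The difference between the plain and the semantic synonym rule can be sketched with subcategorization frames. This is an illustration, not the SI implementation: the `Complement` type and the frame labels are hypothetical, and the "THAT" frame for "discovered" is an added assumption so that the sketch can reproduce the that-clause rewrite shown on the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Complement:
    kind: str               # "NP" or "SC" (small clause)
    text: str               # surface form as found in the corpus
    sc_subject: str = ""    # only for small clauses
    sc_predicate: str = ""  # only for small clauses


# Complement types each verb accepts (assumption: "THAT" = that-clause).
FRAMES = {"found": {"NP", "SC"}, "discovered": {"NP", "THAT"}}
SYNONYMS = {"found": ["discovered"]}


def naive_expand(subject: str, verb: str, comp: Complement) -> List[str]:
    """Plain synonym rule: swap the verb, ignore its lexical requirements."""
    return [f"{subject} {syn} {comp.text}" for syn in SYNONYMS.get(verb, [])]


def semantic_expand(subject: str, verb: str, comp: Complement) -> List[str]:
    """Swap the verb only if the synonym licenses the complement."""
    results = []
    for syn in SYNONYMS.get(verb, []):
        frames = FRAMES[syn]
        if comp.kind in frames:
            results.append(f"{subject} {syn} {comp.text}")
        elif comp.kind == "SC" and "THAT" in frames:
            # Rewrite the small clause as a that-clause the synonym accepts.
            results.append(
                f"{subject} {syn} that {comp.sc_subject} is {comp.sc_predicate}")
    return results


small_clause = Complement("SC", "Trump attractive on Sundays",
                          "Trump", "attractive on Sundays")
noun_phrase = Complement("NP", "eggs under my pillow on Sundays")
```

The naive rule emits the ungrammatical "I discovered Trump attractive on Sundays"; the frame check either rewrites the complement or leaves the sentence out of the expansion.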
14. SemanticInterfaces
Breaking the ASR Glass Ceiling
English ASR Engine    As is   Semantic enhancements   Result
Google Voice          74%     +12%                    86%
Dragon                78%     +10%                    88%
Sphinx                87%     +3%                     90%
Nuance Recognizer     84%     +6%                     90%
Semantic multi ASR platform                           93%
15. SemanticInterfaces
Breaking the ASR Glass Ceiling
Hebrew ASR Engine     As is   Semantic enhancements   Result
Google Voice          37%     +30%                    67%
Dragon                46%     +25%                    71%
Sphinx                88%     +3%                     91%
Nuance Recognizer     83%     +6%                     89%
Semantic multi ASR platform                           92%
16. SemanticInterfaces
Outputs of ASR #1, ASR #2 and ASR #3, fed into the SI NLU Engine (IDE):
• “Record Amazing Rice”
• “Red Cord Mazing Race tonight”
• “Record Amazing Face too night”
SI NLU Engine (IDE) output:
“Record Amazing Race tonight”
Record Amazing Race tonight
Application Specific Command
STB command = Record
Title name = Amazing Race
Time = @Tonight
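One way the multi-ASR step could be approximated is fuzzy slot matching plus voting. This is a sketch under assumptions, not the SI NLU engine: the `GRAMMAR` lists, the 0.75 cutoff and the function names are hypothetical, and string similarity from Python's standard `difflib` stands in for real linguistic parsing.

```python
import difflib
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical application grammar: slot -> allowed values.
GRAMMAR = {
    "command": ["record", "play", "delete"],
    "title": ["amazing race", "american idol"],
    "time": ["tonight", "tomorrow", "now"],
}


def best_match(words: List[str], candidates: List[str],
               cutoff: float = 0.75) -> Optional[str]:
    """Return the candidate most similar to some word window, if any."""
    best, best_score = None, cutoff
    for cand in candidates:
        target = cand.replace(" ", "")
        k = len(cand.split())
        # Try windows of k-1, k and k+1 words to absorb bad word splits.
        for n in range(max(1, k - 1), k + 2):
            for i in range(len(words) - n + 1):
                window = "".join(words[i:i + n])
                score = difflib.SequenceMatcher(None, window, target).ratio()
                if score > best_score:
                    best, best_score = cand, score
    return best


def combine(hypotheses: List[str]) -> Dict[str, Optional[str]]:
    """Fill each slot with the value most ASR hypotheses agree on."""
    votes = {slot: Counter() for slot in GRAMMAR}
    for hyp in hypotheses:
        words = hyp.lower().split()
        for slot, candidates in GRAMMAR.items():
            match = best_match(words, candidates)
            if match is not None:
                votes[slot][match] += 1
    return {slot: (c.most_common(1)[0][0] if c else None)
            for slot, c in votes.items()}


result = combine([
    "Record Amazing Rice",
    "Red Cord Mazing Race tonight",
    "Record Amazing Face too night",
])
```

No single hypothesis is correct, but each slot is recoverable from at least two of them, which is the intuition behind combining several engines on one platform.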
17. SemanticInterfaces
Problems & Discussion
• Non-grammatical corpus – “can’t” be parsed (?): SMS language, conversational language, mixed-language texts (English, Spanish, Tagalog…), corrupted ASR output texts, etc.
• Domain based ASR engines vs LVCSR? The end of the dream of one overall General Language ASR engine?
• Parsing both corpus and query, query only, or corpus only?