Departamento de Ciencias de la Computación
Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas
Universidad Nacional Autónoma de México

Practical Speech Recognition for
Contextualized Service Robots
Ivan Meza, Caleb Rascón and Luis Pineda

http://golem.iimas.unam.mx/
GrupoGolem
Service robots
● Our future butlers
● They are task oriented
○ Clean up a room
○ Play a game

● Interaction with spoken language
● They work in noisy environments
● Microphone is not close to the speaker
● Poor speech recognition
Proposal
● Improve the system in four aspects
● Contextualized recogniser
● Prompting strategies
● Recovery strategies
● Audio calibration
I. Contextualized recognition
● Use specific language models for the
given expectations
■ YES: yes, okay, all right
■ NO: no, don’t, do not
■ NAVIGATE: go to the kitchen, go to the living
room, go to the bedroom
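
A minimal Python sketch of this idea, assuming each expectation is tied to a small phrase list (the slide's own examples). The names `LANGUAGE_MODELS` and `interpret` are hypothetical, and a plain phrase lookup stands in for a real language model and ASR decoder; it is not the robot's actual implementation.

```python
# Hypothetical phrase lists per expectation, taken from the slide's examples.
LANGUAGE_MODELS = {
    "YES": ["yes", "okay", "all right"],
    "NO": ["no", "don't", "do not"],
    "NAVIGATE": [
        "go to the kitchen",
        "go to the living room",
        "go to the bedroom",
    ],
}


def interpret(hypothesis, expectations):
    """Return the first active expectation whose phrase list contains the
    hypothesis, or None if no active expectation matches."""
    text = hypothesis.strip().lower()
    for expectation in expectations:
        if text in LANGUAGE_MODELS[expectation]:
            return expectation
    return None


# While the robot waits for a confirmation, only YES and NO are active,
# so a navigation command is not accepted in that context.
print(interpret("Okay", ["YES", "NO"]))               # YES
print(interpret("go to the kitchen", ["YES", "NO"]))  # None
```

Restricting the active set to what the dialogue currently expects is what keeps the search space small in a noisy setting.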
ASR module
II. Prompting strategies
● Let the user know when to speak
■ Beep sound

● Speaker volume monitor
■ "Could you speak louder?" or "Could you speak softer?"
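
One way a volume monitor could be sketched in Python, assuming it compares the RMS level of the captured audio against two thresholds. The threshold values and the names `level_dbfs` and `volume_prompt` are assumptions for illustration; the slides do not state how the robot measures speaker volume.

```python
import math

# Assumed thresholds in dBFS; the slides do not give the values used.
TOO_SOFT_DB = -40.0
TOO_LOUD_DB = -6.0


def level_dbfs(samples):
    """RMS level of float samples in [-1.0, 1.0], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def volume_prompt(samples):
    """Return a prompt asking the user to adjust their volume, or None."""
    level = level_dbfs(samples)
    if level < TOO_SOFT_DB:
        return "Could you speak louder?"
    if level > TOO_LOUD_DB:
        return "Could you speak softer?"
    return None
```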
III. Recovery strategy
● Let the user know when something
went wrong
■ Could you repeat?
■ I can't hear you well, could you repeat?
■ Sorry, I'm a little deaf
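
A small Python sketch of a recovery step, assuming that a failed recognition shows up as an empty or missing hypothesis and that the robot rotates through the three prompts. Both assumptions, and the name `make_recovery_prompter`, are illustrative only.

```python
import itertools

# Prompts taken from the slide.
RECOVERY_PROMPTS = [
    "Could you repeat?",
    "I can't hear you well, could you repeat?",
    "Sorry, I'm a little deaf",
]


def make_recovery_prompter():
    """Build a function that returns the next recovery prompt whenever
    recognition failed, and None when a hypothesis was produced."""
    prompts = itertools.cycle(RECOVERY_PROMPTS)

    def next_prompt(hypothesis):
        if hypothesis:  # recognition succeeded, nothing to recover from
            return None
        return next(prompts)

    return next_prompt


prompter = make_recovery_prompter()
print(prompter("yes"))  # None
print(prompter(None))   # Could you repeat?
```

Rotating the wording is a design choice so that repeated failures do not sound identical to the user.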
IV. Calibration of audio setting
● Hardware
■ 1 directional microphone
■ 1 USB interface with 4 channels
■ 2 speakers

● Calibration of SNR in situ
■ Background noise: -58 dB
■ SNR set to 20 dB
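
Since SNR in dB is the speech level minus the noise level, these two figures imply a speech level of about -38 dB at the microphone. A short Python sketch of that arithmetic, assuming both levels use the same dB reference (the slides do not name it); the function names are hypothetical.

```python
import math


def snr_db(signal_db, noise_db):
    """SNR in dB when both levels are already expressed in dB."""
    return signal_db - noise_db


def target_speech_level_db(noise_db, snr_target_db):
    """Speech level needed to reach the target SNR over the given noise."""
    return noise_db + snr_target_db


def snr_from_power(signal_power, noise_power):
    """SNR in dB from linear power measurements."""
    return 10.0 * math.log10(signal_power / noise_power)


# Figures from the slide: noise at -58 dB, SNR set to 20 dB.
print(target_speech_level_db(-58, 20))  # -38
```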
Corpus evaluation
● Logs from the robot performing
RoboCup tasks
■ 2 years of interactions in lab and competition
■ 1,439 utterances
■ 2,472 tokens
■ 120 types
■ 11 tasks
■ 9 of 11 tasks are contextualized
■ 14 language models
Contextualized recognition
We measure word error rate (WER); lower is better
● With a single LM for all tasks: 53.84%
● With task-based LM: 28.28%
● With contextualized LM: 23.42%

17.2% relative error reduction (contextualized vs. task-based LM)
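
For reference, a Python sketch of how WER and relative error reduction are commonly computed: WER as the word-level edit distance divided by the reference length, and relative reduction as the drop in WER divided by the baseline WER. The function names are hypothetical and this is the standard definition, not necessarily the authors' evaluation script.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline, improved):
    """Relative error reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


print(wer("go to the kitchen", "go to kitchen"))   # 0.25
print(round(relative_reduction(28.28, 23.42), 1))  # 17.2
```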
Beep sound
● 79 utterances were recorded without the
beep sound
■ Without beeps: 55.86%
■ With beeps: 39.75%
■ With beeps full: 53.72%

Relative error reduction: about 30% with beeps, about 4% with beeps full
Usage of SoundLoc System
● We measure usage
■ Could have been triggered 174 times
■ 21 soft speech
■ 4 louder

Triggered 14.36% of the time
Recovery strategy
● We measure usage
■ Could have been triggered 504 times
■ Activated 85 times

Activated 16.87% of the time
Conclusions
● Each of these strategies improves
performance by a small amount
● Together they allow practical speech
recognition on a service robot
Thank you
● Questions?

Micai 13 contextualized practical speech
