This document discusses cloud robotics platforms for human-robot dialogues. It introduces Rospeex, a cloud robotics platform developed by NICT for multilingual dialogues. Rospeex has over 30,000 unique users and achieves state-of-the-art performance in speech recognition. The document also discusses NICT's work building domestic service robots through the RoboCup@Home competition, and their research using machine learning to better predict air pollution and reduce its social costs.
Cloud Robotics for Human-Robot Dialogues
1. Cloud Robotics for Human-Robot Dialogues
Komei Sugiura
Senior Researcher,
National Institute of Information and Communications Technology
Trustee,
RoboCup Federation
2. NICT:
Japan’s National Research Institute for ICT
Possible collaborations
• human-robot communication, scene
understanding, multimodal dialogues,
IoT data mining, and other
robotics/machine-learning/CV topics
Annual budget ¥29.7B
(£168M)
# researchers / staff: 434 / 937
Research topics
Spoken language processing, natural
language processing, machine translation,
databases, data mining, etc. @Kyoto
Photonic network, wireless network,
cybersecurity, time standard, neuroscience,
space weather, etc. @Tokyo, Osaka
VoiceTra
>1M downloads
since July 2010
3. Multimodal dialogues with robots: Language processing using
non-linguistic information is challenging
Smartphone and other consumer devices
Language processing using non-linguistic
information is beneficial
cf. Market size of speech recognition
¥88B@2013→¥170B@2018 (£1B)*
Show me today’s
schedule
* Estimation by NEDO, TSC Foresight Vol.8, 2015
Sushi restaurants
around here
Benefit for
QA/search
GPS, contacts, other context info.
Current communication with robots
Limited multi-modality and
scalability in robot intelligence
??
??
Throw them
away.
Is there any milk
in the fridge?
cf. [Steels 2003, Roy 2005, Iwahashi
2007, Kollar+ 2010, Yu+ 2013]
4. Key Question:
How can we make robot intelligence scalable and multimodal?
Major speech recognition engines are trained with large-scale corpora (>1000
hours ≒ 100M utterances), and continuously improved as cloud services
RoboCup@Home: Target user scenario
with service robots
XIMERA 3
(by NICT)
Voice talent
cf. [Sugiura+ ICRA2014]
Can we make such innovations in robotics?
• Training with large-scale datasets and continuous improvements in e.g.
dialogues, object recognition, grasping, simulation, …
5. (1) Rospeex:
We built a cloud robotics platform for multilingual dialogues*
• 30,000 unique users since Sep. 2013
• Non-monologue speech synthesis designed for robots [Sugiura+ 2014]
• Word Error Rate = 7.9% for IWSLT tst2011 (1st Place Winners in
IWSLT12, 13, 14)
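As a rough illustration of the metric quoted above: word error rate is conventionally the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below is not the IWSLT scoring tool and skips the text normalization that official evaluations apply; the function name `word_error_rate` and the sample sentences are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative (made-up) recognition result: one of seven words is dropped.
wer = word_error_rate(
    "is there any milk in the fridge",
    "is there any milk in fridge",
)
print(f"WER = {wer:.1%}")
```

Because insertions are counted, this ratio can exceed 100% when the hypothesis is much longer than the reference.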
Python & C++ samples
are available
Search for "rospeex"
* Research/development-use only
6. Rospeex’s positioning in robot dialogue quadrants
Cloud APIs
(Google, Microsoft, Nuance,
NTT docomo, Wit.ai, …)
Free software
Commercial software
OpenHRI,
PocketSphinx, Festival
Cloud-based
Stand-alone
Robot middleware-compatible
Incompatible
Does not work with
very low-spec PCs
Robotics-specific
logs are lost
Authentication
Low quality
Expensive
IP distributions of rospeex users
Rospeex has been applied to:
Humanoids, web agents, conversational
robots for elderly people, automotive
navigation systems, smart-home interfaces
7. (2) Building domestic service robots
(1st place in 2008 & 2010, 2nd place in 2009 & 2012)
• RoboCup@Home: The largest competition for domestic service robots
– Focuses on human-robot interaction and mobile manipulation
• Challenges
– Navigation in unknown environments (e.g. real shop), handling everyday
objects, spoken dialogues in very noisy environments (70dBA), …
• cf. Social impacts of other RoboCup leagues
– RoboCupRescue: Fukushima power plant investigation
– Aldebaran sold >1000 NAO robots and was acquired by SoftBank for US$100M, …
by Channel 5
8. (3) Machine learning for environments:
Air pollution prediction can reduce social cost
• Loss by PM2.5 and air pollutants
– 3.3M premature deaths per year [Lelieveld, Nature, 2015]
• Prediction can prevent possible exposure, but prediction accuracy is quite low
– Standard approach gave only 42% accuracy in Fukuoka@2013-14*
• Applying the DPT-DRNN method to weather open-data outperforms a
standard weather model-based approach
*threat score=TP/(TP+FP+FN)
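The threat score defined above can be computed directly from binary event forecasts. The following is a minimal sketch, assuming each time step is labeled True when a high-pollution event occurs (observed) or is forecast (predicted); the helper names and the sample data are hypothetical and are not taken from the Fukuoka experiment.

```python
from typing import Sequence, Tuple


def count_outcomes(
    observed: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) for paired binary observations and forecasts."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)
    fp = sum(1 for o, p in zip(observed, predicted) if not o and p)
    fn = sum(1 for o, p in zip(observed, predicted) if o and not p)
    return tp, fp, fn


def threat_score(tp: int, fp: int, fn: int) -> float:
    """Threat score = TP / (TP + FP + FN); true negatives are ignored."""
    denominator = tp + fp + fn
    if denominator == 0:
        raise ValueError("undefined: no events were observed or forecast")
    return tp / denominator


# Made-up example: five time steps, True marks a high-pollution event.
observed = [True, True, False, False, True]
predicted = [True, False, True, False, True]
tp, fp, fn = count_outcomes(observed, predicted)
print(threat_score(tp, fp, fn))  # TP=2, FP=1, FN=1 -> 0.5
```

Unlike plain accuracy, the score does not reward correctly forecast non-events, so a forecaster cannot score well on rare events simply by always predicting that nothing will happen.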
Premature deaths in London
≒2,800 @2010
[Ong & Sugiura, IEEE BigData 2014]
Hefei, China 2015
(Not fog, not cloudy)