Source: Dark Star, 1974
Chatty Devices
Will mouse and monitor soon become a thing of the past?
Sascha Wolter |@saschawolter | wolter.biz
May 2017
Learned or Innate
Source: Star Trek IV (20th Century Fox), 1986
Our Oldest Interface
• The deeply instinctive nature of speech
presents specific constraints and new
challenges. Our brains are fundamentally
wired to interpret the source of speech as
human. […] Thus, a device that speaks to us is
tapping into a deep river of psychological
adaptations, and subject to a set of
assumptions a pixel-based UI will never
encounter.
(Cheryl Platz, 2017, https://medium.com/microsoft-design/voice-user-interface-design-new-solutions-to-old-problems-baa36a64b3e4#.zc46diybh)
• There is no consensus on the ultimate origin
or age of human language (human language
could be 40,000 years old or much older).
Source: https://en.wikipedia.org/wiki/File:Real-time_MRI_-_Speaking_(English).ogv
Conversational User Experience
Amazon Echo: Alexa…
Google Home: Okay Google…
LingLong DingDong : DingDong DingDong …
8 Million sold by the end of 2016
https://www.digitalcommerce360.com/2017/01/23/amazons-us-echo-sales-top-8-million/
Harman Kardon Invoke / Microsoft Home Hub: Hey Cortana
Source: Mattel's Barbie Hello Dreamhouse
Connected Devices (Internet of Things)
I know so much
…about you
Video: My Friend Cayla 2014
Source: MGM Child's Play (The Lakeshore Strangler), Vivid My friend Cayla
Big Brother Award
Internet of Uncanny Things
The German Federal Network Agency says that any toy capable of
transmitting signals and recording images or sound without
detection is banned. (https://t.co/R7UCmI9aj9)
Conversation Experience and Voice
Intersection of text-based conversational user interfaces and voice user interfaces.
Conversational
Experience
Voice User
Interface (VUI)
More than Command and Control
Speech Processing
Natural Language Understanding (NLU)
Image: © Wolter ‘17
Conversation Experience and Voice
What Researchers say and why Investors bet on bots!
Conversational
Experience
Voice User
Interface (VUI)
63% like to use Voice to
control their home.
65% of Smartphone Users
have used Voice Assistants.
Bots are the new apps.
(Satya Nadella, Microsoft CEO)
Every fourth German wants to
use Chatbots.
Active Users: 1 billion
WhatsApp, 800 million
Facebook Messenger…
63 % don’t like to talk to/with
machines.
https://www.quora.com/Why-are-people-saying-Bots-are-the-new-apps
https://www.bitkom.org/Presse/Presseinformation/Jeder-Vierte-will-Chatbots-nutzen.html
http://www.fittkaumaass.de/news/chatbots-von-jedem-zweiten-online-kaeufer-abgelehnt
50 % doubt the reliability.
Google Voice Search, Google Now,
and Google Assistant
• Voice Search (2002)
• Voice Search is merged with Now (2012)
• Google Android and iOS
• Mobile usage scenarios
• Looks up the Internet but doesn’t know you
• Natural sounding voice commands
(https://www.cnet.com/how-to/complete-list-of-ok-google-commands/)
Google Assistant
• Google's Allo chat app*
• Google’s Pixel phone*
• Google Home
• Probably “next gen. Google Now”
• Conversational
• Deeper artificial intelligence
*Multimodal human-computer interaction involving several of the five human
senses (e.g. vision and voice).
Google Voice Search
Google Now
Alexa’s built-in voice capabilities for
your connected products:
• Works the same way it would with an
Amazon Echo
• Access to third-party skills
developed using the Alexa Skills
Kit (ASK)
• Developer kits
• A wake word engine is still needed
(e.g. Sensory Alexa wake word
suite)
Commands/Conversation and Devices
Source: https://developer.amazon.com/alexa-voice-service
Skills Devices
Conversational User Experience
Source: https://developer.amazon.com/alexa-voice-service
Alexa in the Car: Ford, Amazon to Provide
Access to Shop, Search and Control Smart
Home Features on the Road.
The world's first Amazon Alexa-enabled
smartwatch: iMCO CoWatch.
LG puts Amazon Alexa on a fridge.
…operates without a graphical user interface and
is typically controlled via a network connection.
Headless Devices
Source: Discovery Channel 2013
Source: Room E demo by Jared Ficklin, http://www.youtube.com/watch?v=BGaAyBBur3I
Interaction isn't one-dimensional
Multimodal interaction provides multiple modes of input and output.
speechRecognizer = new SpeechRecognitionEngine();
var grammar = new Grammar(new FileStream("commands.grxml...
speechRecognizer.LoadGrammar(grammar);
speechRecognizer.SpeechRecognized += new EventHandler...
speechRecognizer.SpeechHypothesized += new EventHandler...
speechRecognizer.SpeechRecognitionRejected += new EventHandler...
speechRecognizer.SetInputToAudioStream(stream,
new SpeechAudioFormatInfo(EncodingFormat...
speechRecognizer.RecognizeAsync(RecognizeMode...
https://github.com/Uberi/speech_recognition
Gulf between Human and Machine
User and Goals | Physical System (World)
Source: Norman, D. (1986). "User Centered System Design: New Perspectives on Human-computer Interaction". CRC. ISBN 978-0-89859-872-8
Voice Input Changes Lives
Inclusion
• The biggest and most impactful benefit voice
user experiences provide is vastly improved
accessibility. Looking for inspiration? Go read
the reviews of the Amazon Echo. […] Voice
UIs allow us to remain fully human in our
interactions.
(Cheryl Platz, 2017, https://medium.com/microsoft-design/voice-user-interface-design-new-solutions-to-old-problems-baa36a64b3e4#.zc46diybh)
Voice User Interface (VUI)
• Grice’s Maxims (1975)
(https://plato.stanford.edu/entries/grice/)
• Quality: Only say things that are true
• Quantity: Don’t be more or less
informative than needed
• Relevance: Only say things relevant to the
topic
• Manner: Be brief, get to the point, and
avoid ambiguity and obscurity
• Cooperative Principle
• Turn-taking
• Context
• Threading
Herbert Paul Grice (March 13, 1913 – August 28, 1988)
Speech Recognition/ Speech to Text
and Speech Synthesis / Text to Speech
Speech Processing
Source: Echo/Google Home infinite loop, https://youtu.be/ZfCfTYZJWtI
Physiological voice modelling (1791)
Speech Synthesis
Source: Kempelen's speaking machine (1791), http://www.dailymotion.com/video/x363xkr
Source: Hatsune Miku - World is mine– 2011, https://youtu.be/YSyWtESoeOc
Vocaloid (2003)
Hatsune Miku: “First sound from the Future.”
Speech Synthesis
Speech Synthesis
• Text-to-phoneme
• Known/unknown words
• Text normalization
• Henry VIII vs Chapter VIII
• Prosodics and emotional content
• How to sound “natural”?
• …
• Usually based on samples
(versus physiological modelling)
• Discrete symbols to continuous
waveforms
• Stochastic process
(Hidden Markov model)
• Machine Learning / Deep Learning
(https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41539.pdf)
Source: https://en.wikipedia.org/wiki/Speech_synthesis
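The "Henry VIII vs Chapter VIII" case from the text normalization bullet can be made concrete with a small sketch. It is deliberately naive and only an illustration: the function names, the word lists, and the rule "section word means cardinal, any other capitalized word means ordinal" are assumptions, not how a production synthesizer works.

```javascript
// Naive text normalization sketch (all names here are hypothetical).
const ROMAN = { I: 1, V: 5, X: 10 };
const CARDINALS = ['zero', 'one', 'two', 'three', 'four', 'five',
  'six', 'seven', 'eight', 'nine', 'ten'];
const ORDINALS = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth',
  'sixth', 'seventh', 'eighth', 'ninth', 'tenth'];
// Words after which a numeral is read as a plain number (assumed list).
const SECTION_WORDS = new Set(['chapter', 'part', 'section', 'volume']);

// Convert a Roman numeral made of I, V and X to an integer.
function romanToInt(roman) {
  let total = 0;
  for (let i = 0; i < roman.length; i++) {
    const value = ROMAN[roman[i]];
    const next = ROMAN[roman[i + 1]] || 0;
    total += value < next ? -value : value;
  }
  return total;
}

// Expand "<Capitalized word> <Roman numeral>" depending on the word.
function normalize(text) {
  return text.replace(/\b([A-Z][a-z]+) ([IVX]+)\b/g, function (match, word, roman) {
    if (roman === 'I') return match; // too easily confused with the pronoun
    const n = romanToInt(roman);
    if (n > 10) return match; // outside the tiny word tables above
    if (SECTION_WORDS.has(word.toLowerCase())) {
      return word + ' ' + CARDINALS[n];
    }
    return word + ' the ' + ORDINALS[n];
  });
}

console.log(normalize('Henry VIII wrote Chapter VIII'));
```

The same numeral is spoken differently depending on its neighbour, which is why normalization needs context and cannot be a plain lookup.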
Speech Synthesis Markup Language (SSML)
• http://www.w3.org/TR/speech-synthesis/
• XML-based
• Some elements and attributes:
• break
• phoneme
• prosody
• say-as
• currency
• digits
• number
• date
• time
• …
• audio
• ...
<speak>
Welcome! Today is
<say-as interpret-as="date">
20121213
</say-as>
</speak>
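In a skill the SSML is usually assembled in code rather than written by hand. The following is a minimal sketch of that idea; the helper names are hypothetical, and which interpret-as values a given synthesizer accepts is an assumption to check against its documentation.

```javascript
// Escape characters that have a special meaning in XML.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Format a Date as YYYYMMDD, the form used in the example above.
function toSsmlDate(date) {
  const year = String(date.getFullYear());
  const month = String(date.getMonth() + 1).padStart(2, '0');
  const day = String(date.getDate()).padStart(2, '0');
  return year + month + day;
}

// Wrap a value in a say-as element.
function sayAs(interpretAs, value) {
  return '<say-as interpret-as="' + escapeXml(interpretAs) + '">' +
    escapeXml(value) + '</say-as>';
}

// Build the complete document for a greeting followed by a date.
function buildWelcomeSsml(greeting, date) {
  return '<speak>' + escapeXml(greeting) + ' ' +
    sayAs('date', toSsmlDate(date)) + '</speak>';
}

console.log(buildWelcomeSsml('Welcome! Today is', new Date(2012, 11, 13)));
```

Escaping the text before embedding it keeps the document well-formed even when the greeting contains characters such as & or <.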
Speech Recognition
History
• 1950’s: Bell Laboratories designed
“Audrey”, which could understand digits
• 1960’s: IBM demonstrated “Shoebox” which
could understand 16 words
• 1970’s: Carnegie Mellon's "Harpy" speech-
understanding system could understand 1011
words (approximately the vocabulary of an
average three-year-old)
• …
Video: Massive Attack Tour 2008 http://www.uva.co.uk/archives/84
Speech Recognition
Moving from word templates and sound patterns to probability.
• 1980’s: Worlds of Wonder's Julie doll (1987),
which children could train to respond to their
voice.
• 1990’s: In the early 90’s Dragon Dictate (9000
USD) and in the late 90’s Dragon
NaturallySpeaking arrived to recognize
continuous speech
• 2000’s: It’s still guessing with around 80
percent accuracy.
• 2010’s: Google's English Voice Search system
now incorporates 230 billion words from
actual user queries.
• …
Video: https://youtu.be/UkU9SbIictc
Speech Recognition
Techniques
• Voice recognition (biometric) versus Speech
recognition (content)
• Speaker dependence vs. independence
• Detection Algorithms
• Fourier transformation (decorrelate the
spectrum)
• Dynamic time warping (DTW)-based speech
recognition
• Hidden Markov models (“stochastic state
model”)
• Grammar and Vocabulary
• …
• Front-End vs. Back-End
• Natural Voice Control
• Automatic Speech Recognition (ASR)
• Natural Language Understanding (NLU)
• Nonverbal communication
• E.g. lip-reading, McGurk effect
• Context and Grammar:
• Simple Grammar: e.g. just digits or numbers
• Advanced Grammar: Speech Recognition
Grammar Specification (SRGS), W3C Standard
• SRGS can take a variety of forms, with the
most popular being Grammar XML (GRXML).
(http://www.w3.org/TR/speech-grammar/ )
• Subject area (Healthcare, Military etc.)
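Dynamic time warping, one of the detection algorithms listed above, can be sketched in a few lines. This is a simplified illustration under stated assumptions: real recognizers compare sequences of feature vectors, while this sketch uses plain numbers, and the template names are made up.

```javascript
// Dynamic time warping distance between two numeric sequences.
// cost[i][j] holds the cheapest alignment of a[0..i-1] with b[0..j-1].
function dtwDistance(a, b) {
  const n = a.length;
  const m = b.length;
  const cost = Array.from({ length: n + 1 }, function () {
    return new Array(m + 1).fill(Infinity);
  });
  cost[0][0] = 0;
  for (let i = 1; i <= n; i++) {
    for (let j = 1; j <= m; j++) {
      const d = Math.abs(a[i - 1] - b[j - 1]);
      // Extend the cheapest of: insertion, deletion, or match.
      cost[i][j] = d + Math.min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]);
    }
  }
  return cost[n][m];
}

// Pick the stored word template that is closest to the sample.
function closestTemplate(sample, templates) {
  let best = null;
  let bestDistance = Infinity;
  for (const [label, template] of Object.entries(templates)) {
    const distance = dtwDistance(sample, template);
    if (distance < bestDistance) {
      best = label;
      bestDistance = distance;
    }
  }
  return best;
}

// A slower utterance of "up" still matches the "up" template.
console.log(closestTemplate([1, 2, 2, 3], { up: [1, 2, 3], down: [3, 2, 1] }));
```

Because the alignment may stretch or compress time, the same word spoken at a different speed still ends up close to its template.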
Image: © Wolter ‘17
Source: https://youtu.be/tDFfZlQRCwM 2016
Speaker Recognition and Reliability.
Conversational UX
Codified and strict vs.
Conversational
• CLI: Command Line Interface
• Input of Commands via Keyboard
• ELIZA by Joseph Weizenbaum (simulates a psychotherapist), 1966
• Already 1220 chatbots according to the chatbots
directory
(https://www.chatbots.org/)
Source: Subservient Chicken 2011 (http://web.archive.org/web/20110426194400/http://www.bk.com/en/us/campaigns/subservient-chicken.html)
Microsoft Xiaoice, Rinna, and Tay
e.g. Xiaoice has 20 million registered users
Sources: https://en.wikipedia.org/wiki/Xiaoice, https://en.wikipedia.org/wiki/Tay_(bot)
Persona for an Avatar with Personality
Source: http://genieblog.ch/cortana-vs-siri-1-emotionen/
Source: Project Yorick, https://youtu.be/3Nss_2_rwdE
Creepy (Ro)bot
Uncanny Valley
Source: http://www.androidscience.com/theuncannyvalley/proceedings2005/uncannyvalley.html
BB-8, Star Wars VII
Source: Disney
Conversational UX turns real
Source: https://youtu.be/jSVRrJJ2nl4, SNL Julie the Operator 2006
Human?
• Chinese room: Does a machine literally
"understand" Chinese? Or is it merely
simulating the ability to understand Chinese?
Searle calls the first position "strong AI" and
the latter "weak AI".
(https://en.wikipedia.org/wiki/Chinese_room)
• Turing Test: A player C is given the task of
trying to determine which player – A or B – is
a computer and which is a human. C is limited
to using the responses to written questions to
make the determination.
(https://en.wikipedia.org/wiki/Turing_test)
• The Alexa Prize: A social bot that can
converse coherently and engagingly with
humans on popular topics for 20 minutes
(similar to Loebner Prize with 25 minutes).
(https://developer.amazon.com/alexaprize)
Source: Boris Adryan, 2015-10-20, http://iot.ghost.io/is-it-all-machine-learning/
Commonsense Knowledge and Intuition
Source: The Simpsons, 2001
Anticipation and Empathy
What is Machine Ethics?
Source: http://moralmachine.mit.edu/
Source: http://www.youtube.com/Fzo_5q_dhIM, Green Tricycle Studios 2012
Respect Privacy.
Conversational UX
Conversational UX, Topics and Guides
• Microsoft
• Kinect for Windows | Human Interface
Guidelines v2.0
(https://developer.microsoft.com/en-us/windows/kinect/tools)
• Interaction primer
(https://docs.microsoft.com/en-us/windows/uwp/input-and-devices/input-primer)
• Experience Principles and Best Practices
(http://docs.botframework.com/en-us/directory/best-practices/)
• Inclusive Design
(https://www.microsoft.com/en-us/design/inclusive)
• Google
• Conversation Design
(https://developers.google.com/actions/design/)
• Amazon
• Alexa Skills Kit Voice Design
(https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/alexa-skills-kit-voice-design-handbook)
g.co/dev/ActionsChecklist
Source: Kinect for Windows | Human Interface Guidelines v1.7
Wake Word
Invocation
Action/Skill Types
Prompt
Wake Word
• The device’s “name”
• Alexa, Echo, Amazon, or Computer
• OK Google, or Hey Google
• Hey Cortana
• Keyword or trigger vs
Always on, active listening
• Indicate that the device is listening
• Reduce false activations
• Provide alternative input
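The keyword-versus-always-on idea can be illustrated with a small gate that ignores everything unless it starts with a wake word. This is only a sketch on text transcripts: real wake word engines work on the audio signal on the device, and the function name and return convention are assumptions.

```javascript
// Wake words taken from the list above, lower-cased for comparison.
const WAKE_WORDS = ['alexa', 'echo', 'amazon', 'computer',
  'ok google', 'hey google', 'hey cortana'];

// Return the command that follows a wake word, or null when the
// transcript does not start with one (the device stays passive).
function extractCommand(transcript) {
  const text = transcript.trim().toLowerCase();
  for (const wake of WAKE_WORDS) {
    const isExact = text === wake;
    const isPrefix = text.startsWith(wake + ' ') || text.startsWith(wake + ', ');
    if (isExact || isPrefix) {
      // Drop the wake word plus any comma or space that follows it.
      return text.slice(wake.length).replace(/^[,\s]+/, '');
    }
  }
  return null;
}

console.log(extractCommand('Alexa, turn on the light'));
console.log(extractCommand('turn on the light'));
```

Requiring a space or comma after the wake word is one simple way to reduce false activations from words that merely begin with it.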
Invocation Name
• Activating the Agent for your Skill (think of
starting your app)
• Usually two words (without articles), no
trademarks
Image: © Wolter ‘17
Action Types
Direct integration
(for home automation, media etc.)
• Direct Actions* (Google)
• Smart Home Skill (Amazon)
• Flash Briefing Skill (Amazon)
• Cortana Skill* (Microsoft)
*not yet available
Indirect integration
(invocation trigger/name)
• Conversation Actions (Google)
• Custom Skill (Amazon)
• Cortana Skill* (Microsoft)
*not yet available
Some restrictions for publishing! Needs invocation trigger!
Invocation Types
Invocation Type | Conversation
Full Intent     | User: Alexa, ask Astrology Zone for the horoscope for Leo.
                | Astrology Zone: Today’s outlook for Leo: An opportunity presents itself at work.
Partial Intent  | User: Alexa, ask Astrology Daily for my horoscope.
                | Astrology Daily: Horoscope for what sign?
No Intent       | User: Alexa, talk to Astrology Daily.
                | Astrology Daily: You can ask for your horoscope. Which is your sign?
Ask <invocation name> <connecting word> <some action>
<some action> <connecting word> <invocation name>
Tell <invocation name> <connecting word> <some action>
Search <invocation name> for <some action>
Open <invocation name> for <some action>
Talk to <invocation name> and <some action>
Launch <invocation name> and <some action>
Start <invocation name> and <some action>
Resume <invocation name> and <some action>
Run <invocation name> and <some action>
Load <invocation name> and <some action>
Begin <invocation name> and <some action>
Use <invocation name> <connecting word> <some action>
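The phrase patterns above can be read as "starter word, invocation name, optional connecting word, action". The sketch below splits an utterance along those lines. It is an illustration only: the platform does this matching itself, the function name is hypothetical, the list of connecting words is an assumed subset, the wake word is assumed to be stripped already, and the "<some action> <connecting word> <invocation name>" order is not covered.

```javascript
const STARTERS = ['ask', 'tell', 'search', 'open', 'talk to', 'launch',
  'start', 'resume', 'run', 'load', 'begin', 'use'];
// Assumed subset of connecting words, for illustration only.
const CONNECTORS = ['for', 'to', 'about', 'and', 'if', 'whether', 'that'];

// Split an utterance into "was the skill invoked" and "which action".
function parseInvocation(utterance, invocationName) {
  const name = invocationName.toLowerCase();
  const text = utterance.toLowerCase().replace(/[.?!]+$/, '').trim();
  for (const starter of STARTERS) {
    const prefix = starter + ' ' + name;
    if (text === prefix) {
      return { invoked: true, action: null }; // "no intent" invocation
    }
    if (text.startsWith(prefix + ' ')) {
      let rest = text.slice(prefix.length + 1);
      for (const connector of CONNECTORS) {
        if (rest.startsWith(connector + ' ')) {
          rest = rest.slice(connector.length + 1);
          break;
        }
      }
      return { invoked: true, action: rest };
    }
  }
  return { invoked: false, action: null };
}

console.log(parseInvocation('Ask Astrology Daily for my horoscope', 'Astrology Daily'));
console.log(parseInvocation('Talk to Astrology Daily.', 'Astrology Daily'));
```

An utterance without an action corresponds to the "No Intent" row of the invocation types, where the skill has to ask what the user wants.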
Prompt Types
Prompt Type | Conversation
Question (interaction remains open, waiting for a response) | Astrology Daily: Horoscope for which sign?
Statement (interaction will terminate) | Astrology Daily: Today’s outlook for Pisces: You could be questioning your current path…
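The difference between the two prompt types shows up as a single flag in an Alexa-style response: a question keeps the session open, a statement ends it. A minimal sketch, assuming plain-text speech output; the function names and the way the outlook text is passed in are hypothetical.

```javascript
// Build an Alexa-style response body with plain-text speech.
function buildResponse(text, endSession) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: endSession
    }
  };
}

// Without a sign the skill asks a question and keeps listening;
// with a sign it answers with a statement and ends the session.
function horoscopePrompt(sign, outlook) {
  if (!sign) {
    return buildResponse('Horoscope for which sign?', false);
  }
  return buildResponse('Today\'s outlook for ' + sign + ': ' + outlook, true);
}

console.log(JSON.stringify(horoscopePrompt(), null, 2));
```

With alexa-sdk the same choice is expressed by emitting ':ask' instead of ':tell'.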
Wizard of Oz experiment, Image: http://www.kristamcgeebooks.com/
The dirty secrets: JSON behind
Amazon Alexa | Request Response
The dirty secrets: JSON behind
Google Assistant | Request Response
How-to Alexa Skills
https://developer.amazon.com/edw/home.html#/skills/list
General Settings and Invocation Name | Intents, Content, and Utterances
How-to Alexa Skills
https://developer.amazon.com/edw/home.html#/skills/list
Fulfillment via Endpoint/Webhook | Testing
Tip: Development and Debugging
• Prepare node.js
• https://nodejs.org/
• https://expressjs.co
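var express = require('express');
var bodyParser = require('body-parser');
var app = express();
// Parse JSON request bodies (needed by the skill webhooks)
app.use(bodyParser.json());
// Simple check that the server is reachable
app.get('/', function (req, res) {
  res.send('Hello World!');
});
app.listen(3000, function () {
  console.log('Example app listening on port 3000!');
});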
• Use an editor or IDE (e.g. Visual Studio Code)
• https://code.visualstudio.com/Docs/runtimes/nodejs
• Connect to your local server via tunnel
• https://ngrok.com/
• Generates URL like https://b22ec890.ngrok.io/
ngrok http 3000
• Eclipse SmartHome/QIVICON
• REST API http://127.0.0.1:8080/doc/index.html
• Paper UI http://127.0.0.1:8080/ui/index.html
How-to Alexa Skills
var Alexa = require('alexa-sdk');
// Webhook endpoint: run alexa-sdk outside of AWS Lambda
app.post('/', function (req, res) {
  // Minimal stand-in for the Lambda context object
  var context = {
    succeed: function (result) {
      console.log(result);
      res.json(result);
    },
    fail: function (error) {
      console.log(error);
      res.status(500).end(); // always answer, even on error
    }
  };
  var alexa = Alexa.handler(req.body, context);
  alexa.registerHandlers(handlers);
  alexa.execute();
});
// Intent handlers; doRequest() is a helper that is not shown on the slide
var handlers = {
  'SwitchOnIntent': function () {
    var item = this.event.request.intent.slots.item.value;
    doRequest("ON");
    this.emit(':tell', 'Switched ' + item + ' on');
  },
  'SwitchOffIntent': function () {
    var item = this.event.request.intent.slots.item.value;
    doRequest("OFF");
    this.emit(':tell', 'Switched ' + item + ' off');
  }
};
https://developer.amazon.com/edw/home.html#/skills/list
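The doRequest helper used by the handlers is not shown. One plausible shape, assuming the Eclipse SmartHome REST API from the debugging tip accepts a plain-text command posted to /rest/items/<item name> on 127.0.0.1:8080, could look like the sketch below. The path, the extra item and callback parameters, and the function names are assumptions and differ from the one-argument call on the slide.

```javascript
var http = require('http');

// Describe the HTTP request that sends a command to one item.
// Host, port and path are assumptions based on the local URLs above.
function buildCommandRequest(itemName, command) {
  return {
    host: '127.0.0.1',
    port: 8080,
    method: 'POST',
    path: '/rest/items/' + encodeURIComponent(itemName),
    headers: {
      'Content-Type': 'text/plain',
      'Content-Length': Buffer.byteLength(command)
    }
  };
}

// Send the command and report the HTTP status code to the callback.
function doRequest(itemName, command, callback) {
  var req = http.request(buildCommandRequest(itemName, command), function (res) {
    res.resume(); // discard the response body
    callback(null, res.statusCode);
  });
  req.on('error', callback);
  req.end(command);
}

console.log(buildCommandRequest('Kitchen Light', 'ON').path);
```

Keeping the request description in its own function makes it testable without a running smart home server.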
How-to Google Assistant Skills
…
https://developers.google.com/actions/
How-to Google Assistant Skills: Actions SDK
https://developers.google.com/actions/
How-to Google Assistant Skills: Actions SDK
• Conversation API
• gactions CLI
• Specifying Action Package (JSON)
• Testing
• Deployment
• Actions SDK /
ActionsSdkAssistant
• npm install
express body-parser
actions-on-google
--save
• require('actions-on-google')
.ActionsSdkAssistant;
• Web Simulator
https://developers.google.com/actions/
How-to Google Assistant Skills: api.ai
• API.AI webhook protocol
• Web front end
• Specifying Action Package
• Testing
• Deployment
• Actions SDK /
ApiAiAssistant
• npm install
express body-parser
actions-on-google
--save
• require('actions-on-google')
.ApiAiAssistant;
• Web Simulator
https://developers.google.com/actions/ | https://console.api.ai/api-client/#/newAgent
How-to Google Assistant Skills: api.ai
https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents
Create Agent | General Settings
How-to Google Assistant Skills: api.ai
https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents
Intents, Content, and Utterances
How-to Google Assistant Skills: api.ai
https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents
Content | Training
How-to Google Assistant Skills: api.ai
https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents
Integration | Invocation Name
How-to Google Assistant Skills: api.ai
https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents
Testing | Fulfillment via Endpoint/Webhook
How-to Google Assistant Skills: api.ai
GUI Interface
NLU (Natural Language Understanding)
Conversation building features
(e.g. Domains, Entities, State, Context)
https://developers.google.com/actions/ | https://console.api.ai/api-client/#/newAgent
How-to Google Assistant Skills: api.ai
var ApiAiAssistant = require('actions-on-google').ApiAiAssistant;
app.post('/', function (req, res) {
  var assistant = new ApiAiAssistant({
    request: req,
    response: res
  });
  // Map api.ai action names to handler functions
  var actionMap = new Map();
  actionMap.set("switchOnIntent", switchOnIntent);
  actionMap.set("switchOffIntent", switchOffIntent);
  assistant.handleRequest(actionMap);
});
// Handlers receive the assistant instance as their argument.
// ask() keeps the conversation open; tell() would end it.
var switchOnIntent = function (assistant) {
  var item = assistant.getArgument("item");
  doRequest("ON");
  assistant.ask('Turned ' + item + ' on!');
};
var switchOffIntent = function (assistant) {
  var item = assistant.getArgument("item");
  doRequest("OFF");
  assistant.ask('Turned ' + item + ' off!');
};
https://developers.google.com/actions/tools/ngrok
@saschawolter | http://wolter.biz | https://github.com/wolter
Next: Microsoft Cortana Skills
…and some more like Samsung Bixby etc.
• Harman Kardon Speaker announced for 2017
• Cortana
• Fictional AI character in the Halo video game.
• Technology acquired with Tellme Networks (2007).
• Intelligent personal assistant and knowledge navigator.
• Competes with Siri and Google Now since 2014 (Windows Phone 8.1).
• Available on Windows 10, Windows IoT, on Android and iOS, on Xbox, etc.
• Cortana Skills Kit announced for early 2017
• Cortana Devices SDK announced for 2017
• Allows OEMs (original equipment manufacturers) and ODMs (original design manufacturers) to create smart and personal devices.
• Microsoft Bot Framework
• Build and connect intelligent bots to interact with your users naturally.
• Microsoft LUIS - Language Understanding Intelligent Service (part of Cognitive Services)
• Understand language contextually
Source: Dark Star, 1974, Bryanston Pictures
I think, therefore I am.
[…] But how do you know that anything else exists?
My sensory apparatus
reveals it to me.

 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
 
Bài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docxBài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docx
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
 
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
Comptia N+ Standard Networking lesson guide
Comptia N+ Standard Networking lesson guideComptia N+ Standard Networking lesson guide
Comptia N+ Standard Networking lesson guide
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
 
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfMeet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
 

Chatty Devices

  • 1. Source: Dark Star, 1974 Chatty Devices Will mouse and monitor soon become a thing of the past? Sascha Wolter | @saschawolter | wolter.biz May 2017
  • 2. Learned or Innate Source: Star Trek IV (20th Century Fox), 1986
  • 3. Our Oldest Interface • The deeply instinctive nature of speech presents specific constraints and new challenges. Our brains are fundamentally wired to interpret the source of speech as human. […] Thus, a device that speaks to us is tapping into a deep river of psychological adaptations, and subject to a set of assumptions a pixel-based UI will never encounter. (Cheryl Platz, 2017, https://medium.com/microsoft-design/voice-user-interface-design-new-solutions-to-old-problems-baa36a64b3e4#.zc46diybh) • There is no consensus on the ultimate origin or age of human language (human language could be 40,000 years old or much older). Source: https://en.wikipedia.org/wiki/File:Real-time_MRI_-_Speaking_(English).ogv
  • 4. Conversational User Experience Amazon Echo: Alexa… Google Home: Okay Google… LingLong DingDong : DingDong DingDong … 8 Million sold by the end of 2016 https://www.digitalcommerce360.com/2017/01/23/amazons-us-echo-sales-top-8-million/ Harman Kardon Invoke / Microsoft Home Hub: Hey Cortana
  • 5. Amazon Echo: Alexa… 8 Million sold by the end of 2016 https://www.digitalcommerce360.com/2017/01/23/amazons-us-echo-sales-top-8-million/
  • 6. Source: Mattel's Barbie Hello Dreamhouse Connected Devices (Internet of Things)
  • 7. I know so much …about you Video: My Friend Cayla 2014
  • 8. Source: MGM Child's Play (The Lakeshore Strangler), Vivid My friend Cayla Big Brother Award Internet of Uncanny Things German Federal Network Agency says, any toy capable of transmitting signals and recording images or sound without detection is banned. (https://t.co/R7UCmI9aj9)
  • 9. Conversation Experience and Voice Intersection of text-based conversational user interfaces and voice user interfaces. Conversational Experience Voice User Interface (VUI) More than Command and Control Speech Processing Natural Language Understanding (NLU) Image: © Wolter ‘17
  • 10. Conversation Experience and Voice What researchers say and why investors bet on bots! Conversational Experience Voice User Interface (VUI) • 63% like to use voice to control their home. • 65% of smartphone users have used voice assistants. • Bots are the new apps. (Satya Nadella, Microsoft CEO) • Every fourth German wants to use chatbots. • Active users: 1 billion WhatsApp, 800 million Facebook Messenger… • 63% don’t like to talk to/with machines. • 50% doubt the reliability. https://www.quora.com/Why-are-people-saying-Bots-are-the-new-apps https://www.bitkom.org/Presse/Presseinformation/Jeder-Vierte-will-Chatbots-nutzen.html http://www.fittkaumaass.de/news/chatbots-von-jedem-zweiten-online-kaeufer-abgelehnt
  • 11. Google Voice Search, Google Now, and Google Assistant • Voice Search (2002) • Voice Search is merged with Now (2012) • Google Android and iOS • Mobile usage scenarios • Looks up the Internet but doesn’t know you • Natural sounding voice commands (https://www.cnet.com/how-to/complete-list-of-ok-google-commands/) Google Assistant • Google's Allo chat app* • Google’s Pixel phone* • Google Home • Probably “next gen. Google Now” • Conversational • Deeper artificial intelligence *Multimodal human-computer interaction involving several of the five human senses (i.e. vision and voice). Google Voice Search Google Now
  • 12. Alexa’s built-in voice capabilities for your connected products: • Works the same way it would with an Amazon Echo • Access to third-party skills developed using the Alexa Skills Kit (ASK) • Developer kits • A wake word engine is still needed (e.g. the Sensory Alexa wake word suite) Commands/Conversation and Devices Source: https://developer.amazon.com/alexa-voice-service Skills Devices
  • 13. Conversational User Experience Source: https://developer.amazon.com/alexa-voice-service Alexa in the Car: Ford, Amazon to Provide Access to Shop, Search and Control Smart Home Features on the Road. The world's first Amazon Alexa-enabled smartwatch: iMCO CoWatch. LG puts Amazon Alexa on a fridge.
  • 14. …operates without a graphical user interface and is typically controlled via a network connection. Headless Devices Source: Discovery Channel 2013
  • 15. Source: Room E demo by Jared Ficklin, http://www.youtube.com/watch?v=BGaAyBBur3I Interaction isn't one-dimensional Multimodal interaction provides multiple modes of input and output. speechRecognizer = new SpeechRecognitionEngine(); var grammer = new Grammar(new FileStream("commands.grxml... speechRecognizer.LoadGrammar(grammer); speechRecognizer.SpeechRecognized += new EventHandler... speechRecognizer.SpeechHypothesized += new EventHandler... speechRecognizer.SpeechRecognitionRejected += new EventHandler... speechRecognizer.SetInputToAudioStream(stream, new SpeechAudioFormatInfo(EncodingFormat... speechRecognizer.RecognizeAsync(RecognizeMode...
  • 17. Gulf between Human and Machine User and Goals | Physical System (World) Source: Norman, D. (1986). "User Centered System Design: New Perspectives on Human-computer Interaction". CRC. ISBN 978-0-89859-872-8
  • 18. Voice Input Changes Lives Inclusion • The biggest and most impactful benefit voice user experiences provide is vastly improved accessibility. Looking for inspiration? Go read the reviews of the Amazon Echo. […] Voice UIs allow us to remain fully human in our interactions. (Cheryl Platz, 2017, https://medium.com/microsoft-design/voice-user-interface-design-new-solutions-to-old-problems-baa36a64b3e4#.zc46diybh)
  • 19. Voice User Interface (VUI) • Grice’s Maxims (1975) (https://plato.stanford.edu/entries/grice/) • Quality: Only say things that are true • Quantity: Don’t be more or less informative than needed • Relevance: Only say things relevant to the topic • Manner: Be brief, get to the point, and avoid ambiguity and obscurity • Cooperative Principle • Turn-taking • Context • Threading Herbert Paul Grice (March 13, 1913 – August 28, 1988)
  • 20. Speech Recognition/ Speech to Text and Speech Synthesis / Text to Speech Speech Processing Source: Echo/Google Home infinite loop, https://youtu.be/ZfCfTYZJWtI
  • 21. Physiological voice modelling (1791) Speech Synthesis Source: Kempelen's speaking machine (1791), http://www.dailymotion.com/video/x363xkr
  • 22. Source: Hatsune Miku - World is mine– 2011, https://youtu.be/YSyWtESoeOc Vocaloid (2003) Hatsune Miku: “First sound from the Future.” Speech Synthesis
  • 23. Speech Synthesis • Text-to-phoneme • Known/unknown words • Text normalization • Henry VIII vs Chapter VIII • Prosodics and emotional content • How to sound “natural”? • … • Usually based on samples (versus physiological modelling) • Discrete symbols to continuous waveforms • Stochastic process (Hidden Markov model) • Machine Learning / Deep Learning (https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41539.pdf) Source: https://en.wikipedia.org/wiki/Speech_synthesis
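The "Henry VIII vs Chapter VIII" bullet is a text normalization problem: the same Roman numeral is read as an ordinal after a monarch's name and as a cardinal after a word like "Chapter". A minimal sketch of that idea follows. The name list, the word tables (limited to 1 through 10), and the function names are hypothetical and only for illustration; a real front end would use much richer context, and this version ignores single-letter numerals such as "V" to avoid confusing them with the pronoun "I".

```javascript
// Convert a Roman numeral such as "VIII" to a number (subtractive notation supported).
function romanToInt(roman) {
  const values = { I: 1, V: 5, X: 10, L: 50, C: 100, D: 500, M: 1000 };
  let total = 0;
  for (let i = 0; i < roman.length; i++) {
    const current = values[roman[i]];
    const next = values[roman[i + 1]] || 0;
    // A smaller value before a larger one is subtracted (IV = 4, IX = 9).
    total += current < next ? -current : current;
  }
  return total;
}

// Hypothetical lookup tables, limited to 1..10 for brevity.
const CARDINALS = ['zero', 'one', 'two', 'three', 'four', 'five',
                   'six', 'seven', 'eight', 'nine', 'ten'];
const ORDINALS = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth',
                  'sixth', 'seventh', 'eighth', 'ninth', 'tenth'];
// Hypothetical list of names after which a numeral is read as an ordinal.
const REGNAL_NAMES = new Set(['Henry', 'Elizabeth', 'Louis']);

// Expand "<Word> <Roman numeral>" according to the preceding word.
function normalize(text) {
  return text.replace(/\b([A-Z][a-z]+) ([IVX]{2,})\b/g, (match, word, numeral) => {
    const n = romanToInt(numeral);
    if (n < 1 || n > 10) return match; // outside the demo tables: leave untouched
    return REGNAL_NAMES.has(word)
      ? `${word} the ${ORDINALS[n]}`
      : `${word} ${CARDINALS[n]}`;
  });
}

console.log(normalize('Henry VIII wrote Chapter VIII.'));
```

The normalized string is what a text-to-phoneme stage would receive next.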
  • 24. Speech Synthesis Markup Language (SSML) • http://www.w3.org/TR/speech-synthesis/ • XML-based • Some elements and attributes: • break • phoneme • prosody • say-as • currency • digits • number • date • time • … • audio • ... <speak> Welcome! Today is <say-as interpret-as="date" format="ymd">20121213</say-as> </speak>
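Since SSML is plain XML, a skill can assemble it with ordinary string functions. The following is a small sketch under that assumption; the helper names (`escapeXml`, `sayAs`, `pause`, `speak`) are made up for this example and are not part of any SDK. Which `interpret-as` and `format` values are honored depends on the speech engine.

```javascript
// Escape characters that would otherwise break the XML document.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap text in a say-as element; the format attribute is optional.
function sayAs(interpretAs, text, format) {
  const formatAttr = format ? ` format="${escapeXml(format)}"` : '';
  return `<say-as interpret-as="${escapeXml(interpretAs)}"${formatAttr}>${escapeXml(text)}</say-as>`;
}

// Insert a pause of the given length in milliseconds.
function pause(ms) {
  return `<break time="${ms}ms"/>`;
}

// Join already-built parts into one speak document.
function speak(...parts) {
  return `<speak>${parts.join(' ')}</speak>`;
}

const ssml = speak(
  escapeXml('Welcome! Today is'),
  sayAs('date', '20121213', 'ymd')
);
console.log(ssml);
```

Escaping plain text before it enters the document matters in practice, because an unescaped "&" in an item name is enough to make the whole response invalid.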
  • 25. Speech Recognition History • 1950’s: Bell Laboratories designed "Audrey", which could understand digits • 1960’s: IBM demonstrated "Shoebox", which could understand 16 words • 1970’s: Carnegie Mellon's "Harpy" speech-understanding system could understand 1011 words (approximately the vocabulary of an average three-year-old) • … Video: Massive Attack Tour 2008 http://www.uva.co.uk/archives/84
  • 26. Speech Recognition Moving from word templates and sound patterns to probability. • 1980’s: Worlds of Wonder's Julie doll (1987), which children could train to respond to their voice. • 1990’s: In the early 90’s Dragon Dictate (9000 USD) and in the late 90’s Dragon NaturallySpeaking arrived to recognize continuous speech • 2000’s: It’s still guessing with around 80 percent accuracy. • 2010’s: Google's English Voice Search system now incorporates 230 billion words from actual user queries. • … Video: https://youtu.be/UkU9SbIictc
  • 27. Speech Recognition Techniques • Voice recognition (biometric) versus speech recognition (content) • Speaker dependence vs. independence • Detection algorithms • Fourier transformation (decorrelate the spectrum) • Dynamic time warping (DTW)-based speech recognition • Hidden Markov models (“stochastic state model”) • Grammar and vocabulary • … • Front-end vs. back-end • Natural voice control • Automatic Speech Recognition (ASR) • Natural Language Understanding (NLU) • Nonverbal communication • E.g. lip-reading, McGurk effect • Context and grammar: • Simple grammar: e.g. just digits or numbers • Advanced grammar: Speech Recognition Grammar Specification (SRGS), W3C standard • SRGS can take a variety of forms, with the most popular being Grammar XML (GRXML). (http://www.w3.org/TR/speech-grammar/) • Subject area (healthcare, military etc.) Image: © Wolter ‘17
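Dynamic time warping, listed among the detection algorithms above, compares two sequences that may be spoken at different speeds by letting one stretch against the other. A minimal sketch on one-dimensional number sequences follows; real recognizers compare feature vectors per audio frame, and the templates here are invented values used only to show the mechanics.

```javascript
// Classic DTW: cost[i][j] is the cheapest alignment of a[0..i) with b[0..j).
function dtwDistance(a, b) {
  const n = a.length;
  const m = b.length;
  const cost = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(Infinity));
  cost[0][0] = 0;
  for (let i = 1; i <= n; i++) {
    for (let j = 1; j <= m; j++) {
      const d = Math.abs(a[i - 1] - b[j - 1]);
      // Extend the cheapest of: stretch a, stretch b, or advance both.
      cost[i][j] = d + Math.min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]);
    }
  }
  return cost[n][m];
}

// Template matching: return the label whose template is closest to the sample.
function recognize(sample, templates) {
  let bestLabel = null;
  let bestDistance = Infinity;
  for (const label of Object.keys(templates)) {
    const distance = dtwDistance(sample, templates[label]);
    if (distance < bestDistance) {
      bestDistance = distance;
      bestLabel = label;
    }
  }
  return bestLabel;
}

// Invented "word templates"; the sample is a slower version of "yes".
const templates = { yes: [1, 3, 1], no: [2, 2, 2] };
console.log(recognize([1, 3, 3, 1], templates));
```

This is the word-template approach; the move "from word templates and sound patterns to probability" replaces such fixed templates with statistical models like Hidden Markov models.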
  • 28. Source: https://youtu.be/tDFfZlQRCwM 2016 Speaker Recognition and Reliability. Conversational UX
  • 29. Codified and strict vs. Conversational • CLI: Command Line Interface • Input of commands via keyboard • Eliza by Joseph Weizenbaum (simulates a psychotherapist), 1966 • Already 1220 chatbots according to the chatbots directory (https://www.chatbots.org/) [4]
  • 30. Source: Subservient Chicken 2011 (http://web.archive.org/web/20110426194400/http://www.bk.com/en/us/campaigns/subservient-chicken.html)
  • 31. Microsoft Xiaoice, Rinna, and Tay e.g. Xiaoice has 20 million registered users Sources: https://en.wikipedia.org/wiki/Xiaoice, https://en.wikipedia.org/wiki/Tay_(bot)
  • 32. Persona for an Avatar with Personality Source: http://genieblog.ch/cortana-vs-siri-1-emotionen/
  • 33. Source: Project Yorick, https://youtu.be/3Nss_2_rwdE Creepy (Ro)bot
  • 35. Conversational UX turns real Source: https://youtu.be/jSVRrJJ2nl4, SNL Julie the Operator 2006
  • 36. Human? • Chinese room: Does a machine literally "understand" Chinese? Or is it merely simulating the ability to understand Chinese? Searle calls the first position "strong AI" and the latter "weak AI". (https://en.wikipedia.org/wiki/Chinese_room) • Turing Test: A player C is given the task of trying to determine which player – A or B – is a computer and which is a human. C is limited to using the responses to written questions to make the determination. (https://en.wikipedia.org/wiki/Turing_test) • The Alexa Prize: A social bot that can converse coherently and engagingly with humans on popular topics for 20 minutes (similar to Loebner Prize with 25 minutes). (https://developer.amazon.com/alexaprize)
  • 37. Source: Boris Adryan, 2015-10-20, http://iot.ghost.io/is-it-all-machine-learning/
  • 39. Source: The Simpsons, 2001 Anticipation and Empathy
  • 40. What is Machine Ethics? Source: http://moralmachine.mit.edu/
  • 41. Source: http://www.youtube.com/Fzo_5q_dhIM, Green Tricycle Studios 2012 Respect Privacy. Conversational UX
  • 42. Conversational UX, Topics and Guides • Microsoft • Kinect for Windows | Human Interface Guidelines v2.0 (https://developer.microsoft.com/en-us/windows/kinect/tools) • Interaction primer (https://docs.microsoft.com/en-us/windows/uwp/input-and-devices/input-primer) • Experience Principles and Best Practices (http://docs.botframework.com/en-us/directory/best-practices/) • Inclusive Design (https://www.microsoft.com/en-us/design/inclusive) • Google • Conversation Design (https://developers.google.com/actions/design/) • Amazon • Alexa Skills Kit Voice Design (https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/alexa-skills-kit-voice-design-handbook)
  • 44. Source: Kinect for Windows | Human Interface Guidelines v1.7 Wake Word Invocation Action/Skill Types Prompt Wake Word • Devices “Name” • Alexa, Echo, Amazon, or Computer • OK Google, or Hey Google • Hey Cortana • Keyword or trigger vs Always on, active listening • Indicate that the device is listening • Reduced false activation • Provide alternative input Invocation Name • Activating the Agent for your Skill (think of starting your app) • Usually two words (without articles), no trademarks Image: © Wolter ‘17
  • 45. Action Types Direct integration (for home automation, media etc.) • Direct Actions* (Google) • Smart Home Skill (Amazon) • Flash Briefing Skill (Amazon) • Cortana Skill* (Microsoft) *not yet available Indirect integration (invocation trigger/name) • Conversation Actions (Google) • Custom Skill (Amazon) • Cortana Skill* (Microsoft) *not yet available Some restrictions for publishing! Needs invocation trigger!
  • 46. Invocation Types Invocation Type | Conversation • Full Intent | User: Alexa, ask Astrology Zone for the horoscope for Leo. Astrology Zone: Today’s outlook for Leo: An opportunity presents itself at work. • Partial Intent | User: Alexa, ask Astrology Daily for my horoscope. Astrology Daily: Horoscope for what sign? • No Intent | User: Alexa, talk to Astrology Daily. Astrology Daily: You can ask for your horoscope. Which is your sign? Phrases: • Ask <invocation name> <connecting word> <some action> • <some action> <connecting word> <invocation name> • Tell <invocation name> <connecting word> <some action> • Search <invocation name> for <some action> • Open <invocation name> for <some action> • Talk to <invocation name> and <some action> • Launch <invocation name> and <some action> • Start <invocation name> and <some action> • Resume <invocation name> and <some action> • Run <invocation name> and <some action> • Load <invocation name> and <some action> • Begin <invocation name> and <some action> • Use <invocation name> <connecting word> <some action>
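The phrase patterns on this slide can be read as a simple grammar: a launch phrase, the invocation name, an optional connecting word, and an optional action. The platform does this routing itself, so the following is only a sketch to make the pattern concrete. It assumes the wake word has already been stripped, covers only a subset of the launch phrases, and uses word lists and a function name (`parseInvocation`) invented for this example. It separates the "no intent" case (no action) from the others; telling a full intent from a partial one needs the slot values from the NLU step.

```javascript
// Subset of the launch phrases and connecting words from the slide (illustrative).
const LAUNCH_PHRASES = ['ask', 'tell', 'open', 'talk to', 'launch', 'start', 'use'];
const CONNECTING_WORDS = ['for', 'to', 'about', 'and', 'that', 'whether'];

// Returns { invocationName, action } or null if the utterance does not address the skill.
function parseInvocation(utterance, invocationName) {
  const text = utterance.toLowerCase().replace(/[.,!?]/g, '').trim();
  const name = invocationName.toLowerCase();
  for (const phrase of LAUNCH_PHRASES) {
    const prefix = `${phrase} ${name}`;
    if (text !== prefix && !text.startsWith(prefix + ' ')) continue;
    const rest = text.slice(prefix.length).trim();
    if (rest === '') return { invocationName: name, action: null }; // "no intent"
    const [first, ...others] = rest.split(' ');
    // Drop a leading connecting word such as "for" or "and".
    const action = CONNECTING_WORDS.includes(first) ? others.join(' ') : rest;
    return { invocationName: name, action: action || null };
  }
  return null;
}

console.log(parseInvocation('Ask Astrology Zone for the horoscope for Leo.', 'Astrology Zone'));
console.log(parseInvocation('Talk to Astrology Daily.', 'Astrology Daily'));
```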
  • 47. Prompt Types Prompt Type | Conversation • Question: Interaction remains open, waiting for a response. Astrology Daily: Horoscope for which sign? • Statement: Interaction will terminate. Astrology Daily: Today’s outlook for Pisces: You could be questioning your current path… Wizard of Oz experiment, Image: http://www.kristamcgeebooks.com/
  • 48. The dirty secrets: JSON behind Amazon Alexa | Request Response
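The request and response named on this slide are JSON documents exchanged between the Alexa service and the skill endpoint. The sketch below shows a trimmed-down pair, assuming the custom skill JSON interface with `request.intent.slots` on the way in and `response.outputSpeech` plus `shouldEndSession` on the way out. The intent name `GetHoroscopeIntent`, the slot name `Sign`, and the helper names are hypothetical. A question keeps the session open, a statement ends it; a production response to a question would usually also carry a reprompt.

```javascript
// Build a minimal Alexa-style response body.
// expectsReply = true models a question (session stays open),
// expectsReply = false models a statement (session ends).
function buildResponse(text, expectsReply) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: !expectsReply
    }
  };
}

// Answer directly when the slot is filled (full intent), otherwise ask back (partial intent).
function handleRequest(body) {
  const intent = body.request.intent;
  const sign = intent.slots && intent.slots.Sign && intent.slots.Sign.value;
  if (!sign) {
    return buildResponse('Horoscope for which sign?', true);
  }
  return buildResponse(
    "Today's outlook for " + sign + ': An opportunity presents itself at work.',
    false
  );
}

// Trimmed sample request; real requests also carry session and context data.
const sampleRequest = {
  version: '1.0',
  request: {
    type: 'IntentRequest',
    intent: {
      name: 'GetHoroscopeIntent', // hypothetical intent name
      slots: { Sign: { name: 'Sign', value: 'Leo' } }
    }
  }
};

console.log(JSON.stringify(handleRequest(sampleRequest), null, 2));
```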
  • 49. The dirty secrets: JSON behind Google Assistant | Request Response
  • 50. How-to Alexa Skills https://developer.amazon.com/edw/home.html#/skills/list General Settings and Invocation Name Intents, Content, and Utterances
  • 52. Tip: Development and Debugging • Prepare Node.js • https://nodejs.org/ • https://expressjs.co var express = require('express'); var bodyParser = require('body-parser'); var app = express(); app.use(bodyParser.json()); app.get('/', function (req, res) { res.send('Hello World!'); }); app.listen(3000, function () { console.log('Example app listening on port 3000!'); }); • Use an editor or IDE (e.g. Visual Studio Code) • https://code.visualstudio.com/Docs/runtimes/nodejs • Connect to your local server via tunnel • https://ngrok.com/ • Run ngrok http 3000, which generates a URL like https://b22ec890.ngrok.io/ • Eclipse SmartHome/QIVICON • REST API http://127.0.0.1:8080/doc/index.html • Paper UI http://127.0.0.1:8080/ui/index.html
  • 53. How-to Alexa Skills var Alexa = require('alexa-sdk'); app.post('/', function(req, res) { var context = { succeed: function (result) { console.log(result); res.json(result); }, fail: function (error) { console.log(error); } }; var alexa = Alexa.handler(req.body, context); alexa.registerHandlers(handlers); alexa.execute(); }); var handlers = { 'SwitchOnIntent': function () { var item = this.event.request.intent.slots.item.value; doRequest("ON"); this.emit(':tell', 'Switch ' + item); }, 'SwitchOffIntent': function () { var item = this.event.request.intent.slots.item.value; doRequest("OFF"); this.emit(':tell', 'Switch ' + item); } }; https://developer.amazon.com/edw/home.html#/skills/list
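The handlers above call `doRequest`, which the slide does not show. One possible shape is sketched below, assuming it sends the command as plain text to an item endpoint of the local Eclipse SmartHome REST API. Host and port come from the URLs on the previous slide; the `/rest/items/<name>` path, the extra `itemName` and `callback` parameters, and the `buildCommandRequest` helper are assumptions for this example and should be checked against the REST API documentation of the installation.

```javascript
const http = require('http');

// Build the HTTP request for sending a command ("ON"/"OFF") to one item.
// Assumption: commands are POSTed as text/plain to /rest/items/<itemName>.
function buildCommandRequest(itemName, command) {
  return {
    options: {
      hostname: '127.0.0.1',
      port: 8080,
      path: '/rest/items/' + encodeURIComponent(itemName),
      method: 'POST',
      headers: {
        'Content-Type': 'text/plain',
        'Content-Length': Buffer.byteLength(command)
      }
    },
    body: command
  };
}

// Send the command; callback(error, statusCode) reports the outcome.
function doRequest(command, itemName, callback) {
  const request = buildCommandRequest(itemName, command);
  const req = http.request(request.options, function (res) {
    res.resume(); // discard the response body
    callback(null, res.statusCode);
  });
  req.on('error', callback);
  req.end(request.body);
}

console.log(buildCommandRequest('Kitchen Light', 'ON').options.path);
```

Splitting request construction from sending keeps the part that depends on the REST layout easy to inspect without a running server.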
  • 54. How-to Google Assistant Skills … https://developers.google.com/actions/
  • 55. How-to Google Assistant Skills: Actions SDK https://developers.google.com/actions/
  • 56. How-to Google Assistant Skills: Actions SDK • Conversation API • gactions CLI • Specifying Action Package (JSON) • Testing • Deployment • Actions SDK / ActionsSdkAssistant • npm install express body-parser actions-on-google --save • require('actions-on-google') .ActionsSdkAssistant; • Web Simulator https://developers.google.com/actions/
  • 57. How-to Google Assistant Skills: api.ai • API.AI webhook protocol • Webfrontend • Specifying Action Package • Testing • Deployment • Actions SDK / ActionsAssistant • npm install express body-parser actions-on-google --save • require('actions-on-google') .ActionsAssistant; • Web Simulator https://developers.google.com/actions/ | https://console.api.ai/api-client/#/newAgent
  • 58. How-to Google Assistant Skills: api.ai https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents Create Agent General Settings
  • 59. How-to Google Assistant Skills: api.ai https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents Intents, Content, and Utterances Intents, Content, and Utterances
  • 60. How-to Google Assistant Skills: api.ai https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents Content Training
  • 61. How-to Google Assistant Skills: api.ai https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents Integration Invocation Name
  • 62. How-to Google Assistant Skills: api.ai https://developers.google.com/actions/ | https://console.api.ai/api-client/#/agents Testing Fulfillment via Endpoint/Webhook
  • 63. How-to Google Assistant Skills: api.ai GUI Interface NLU (Natural Language Understanding ) Conversation building features (i.e. Domains, Entities, State, Context) https://developers.google.com/actions/ | https://console.api.ai/api-client/#/newAgent
  • 64. How-to Google Assistant Skills: api.ai var ApiAiAssistant = require('actions-on-google').ApiAiAssistant; app.post('/', function(req, res) { var assistant = new ApiAiAssistant( { request: req, response: res }); var actionMap = new Map(); actionMap.set("switchOnIntent", switchOnIntent); actionMap.set("switchOffIntent", switchOffIntent); assistant.handleRequest(actionMap); }); /* each handler receives the assistant instance as its argument */ var switchOnIntent = function (assistant) { var item = assistant.getArgument("item"); doRequest("ON"); assistant.ask('Turned ' + item + ' on!'); }; var switchOffIntent = function (assistant) { var item = assistant.getArgument("item"); doRequest("OFF"); assistant.ask('Turned ' + item + ' off!'); }; https://developers.google.com/actions/tools/ngrok
  • 65. @saschawolter | http://wolter.biz | https://github.com/wolter
  • 66. Next: Microsoft Cortana Skills …and some more like Samsung Bixby etc. • Harman Kardon speaker announced for 2017 • Cortana • Fictional AI character in the Halo video game. • Acquired 2009 as TellMe. • Intelligent personal assistant and knowledge navigator. • Competes with Siri and Google Now since 2013 (Windows Phone 8.1). • Available on Windows 10, Windows IoT, on Android and iOS, on Xbox, etc. • Cortana Skills Kit announced for early 2017 • Cortana Devices SDK announced for 2017 • Allows OEMs (original equipment manufacturers) and ODMs (original design manufacturers) to create smart and personal devices. • Microsoft Bot Framework • Build and connect intelligent bots to interact with your users naturally. • Microsoft LUIS: Language Understanding Intelligent Service (part of Cognitive Services) • Understand language contextually
  • 67. Source: Dark Star, 1974, Bryanston Pictures I think, therefore I am. […] But how do you know that anything else exists? My sensory apparatus reveals it to me.
  • 68. Chatty Devices Will mouse and monitor soon become a thing of the past? Sascha Wolter | @saschawolter | wolter.biz May 2017 Source: Dark Star, 1974, Bryanston Pictures