Applying Science to Conversational UX Design
Bob Moore, Raphael Arar
IBM Research
Interfaces have come
a long way.
© 2017 IBM Research
Conversational Agents
“The sad thing about
artificial intelligence is
that it lacks artifice and
therefore intelligence.”
Jean Baudrillard
Sociologist
Human
Conversation
APIs
Conversational Systems
Speech to Text
Text to Speech
Natural Language Understanding
Dialog Management
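The four components named in the diagram above can be read as stages of one pipeline. The following is a minimal sketch under that assumption. The `Pipeline` class and the `fake_*` stand-ins are hypothetical names invented for illustration, not part of any IBM API, and a real system would call actual speech and language services at each stage.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Hypothetical wiring of the four stages named in the diagram."""
    speech_to_text: Callable[[bytes], str]
    understand: Callable[[str], dict]       # natural language understanding
    manage_dialog: Callable[[dict], str]    # dialog management
    text_to_speech: Callable[[str], bytes]

    def respond(self, audio: bytes) -> bytes:
        text = self.speech_to_text(audio)
        meaning = self.understand(text)
        reply = self.manage_dialog(meaning)
        return self.text_to_speech(reply)


# Toy stand-ins (assumptions): real stages would call speech and NLU services.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_nlu(text: str) -> dict:
    intent = "greeting" if "hello" in text.lower() else "unknown"
    return {"intent": intent, "text": text}


def fake_dialog(meaning: dict) -> str:
    if meaning["intent"] == "greeting":
        return "Hi there."
    return "Sorry, what do you mean?"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = Pipeline(fake_stt, fake_nlu, fake_dialog, fake_tts)
print(pipeline.respond(b"Hello"))
```

Keeping each stage behind a plain callable makes it possible to change dialog management without touching the speech components, which is the layer this talk focuses on.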
USER EXPERIENCE
DESIGN
ARCHITECTURE
INFORMATION
ARCHITECTURE
HUMAN-COMPUTER
INTERACTION
INTERACTION
DESIGN
VISUAL DESIGN
MECHANICAL
ENGINEERING
ELECTRICAL
ENGINEERING
CONTENT CREATION
(Text, Data, Graphics)
Signage
Info Viz
Navigation
INTERFACE DESIGN
Ubicomp
Controls
Interactive
Environments
USABILITY
ENGINEERING
INDUSTRIAL
DESIGN
Dan Saffer, UX Designer
USER EXPERIENCE
DESIGN
ARCHITECTURE
INFORMATION
ARCHITECTURE
HUMAN-COMPUTER
INTERACTION
INTERACTION
DESIGN
CONVERSATION DESIGN
MECHANICAL
ENGINEERING
ELECTRICAL
ENGINEERING
CONTENT CREATION
(Text, Data, Graphics)
Signage
Ontology Management
Navigation
CONVERSATIONAL UI
Ubicomp
Controls
Interactive
Environments
USABILITY
ENGINEERING
INDUSTRIAL
DESIGN
What’s the state-of-the-
art for conversational UI?
Web. 2017.
Web. 1996.
Conversational UI. 2017.
Conversational UI. ????
Human
Conversation
APIs
Conversational Systems
Speech to Text
Text to Speech
Natural Language Understanding
Dialog Management
Conversational Systems
Dialog Management
Conversation
Analysis
Human
Conversation
What does natural
conversation sound like?
01 Des: G'morning. San Juan Hills
02 Country Club?
03 Guy: Guh morning. What’s-w-what
04 kind of a starting time
05 ken:: we get fer::hh
06 sometime this afternoon.
07 (0.7)
08 Guy: Any[time-
09 Des: [Oh:::, [let's see.
10 Guy: [Any time
11 tuhday.
12 Des: Two fordy. One, thirdy.
13 Guy: One thirty?
14 Des: Mm hm::?
15 Guy: One thirty.
16 (0.7)
17 Guy: .hh W'l at sounds like a
18 good time?
19 (0.4)
20 Des: What is the name?
21 Guy: Detweiler. D-e-t,
22 (1.2)
23 Guy: w-e,
24 (0.4)
25 Guy: i-l-e-r-.
26 (2.0)
27 Des: Foursome?
28 Guy: Yah.
29 (0.4)
30 Des: Electric carts?
31 (0.6)
32 Guy: Uh:::, n:no? I don’t
33 think so.
34 Des: Okay. We'll see yuh then,
35 Guy: Righto,
36 Des: Mm hm, Bye?
Not every voice or
text interaction is a
conversation.
1. Speaker-change recurs, or at least occurs.
2. Overwhelmingly, one party talks at a time.
3. Occurrences of more than one speaker at a time are common, but brief.
4. Transitions (from one turn to a next) with no gap and no overlap are common. Together with transitions characterized by slight gap or slight overlap, they make up the vast majority of transitions.
5. Turn order is not fixed, but varies.
6. Turn size is not fixed, but varies.
7. Length of conversation is not specified in advance.
8. What parties say is not specified in advance.
9. Relative distribution of turns is not specified in advance.
10. Number of parties can vary.
11. Talk can be continuous or discontinuous.
12. Turn-allocation techniques are obviously used. A current speaker may select a next speaker (as when he addresses a question to another party); or parties may self-select in starting to talk.
13. Various 'turn-constructional units' are employed; e.g., turns can be projectedly 'one word long', or they can be sentential in length.
14. Repair mechanisms exist for dealing with turn-taking errors and violations; e.g., if two parties find themselves talking at the same time, one of them will stop prematurely, thus repairing the trouble.
— Harvey Sacks, Emanuel A. Schegloff, Gail Jefferson
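The observation about transitions between turns (no gap and no overlap, slight gap, slight overlap) can be made concrete with timestamps. The sketch below is illustrative only: the `Turn` class, the `classify_transition` function, and the 0.2-second threshold for "slight" are assumptions introduced here, not values taken from Sacks, Schegloff, and Jefferson.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str
    start: float  # seconds from the start of the recording
    end: float


def classify_transition(prev: Turn, nxt: Turn, slight: float = 0.2) -> str:
    """Label the transition from one turn to the next.

    `slight` is an assumed threshold in seconds, chosen for illustration.
    """
    delta = nxt.start - prev.end  # positive means gap, negative means overlap
    if abs(delta) < 1e-9:
        return "no gap, no overlap"
    kind = "gap" if delta > 0 else "overlap"
    size = "slight" if abs(delta) <= slight else "long"
    return f"{size} {kind}"


# Hypothetical timings, not measured from the transcript above.
a = Turn("Des", 0.0, 1.0)
b = Turn("Guy", 1.0, 2.0)
c = Turn("Des", 2.7, 3.0)
d = Turn("Guy", 2.9, 3.5)

for prev, nxt in [(a, b), (b, c), (c, d)]:
    print(prev.speaker, "->", nxt.speaker, ":", classify_transition(prev, nxt))
```

A voice agent that logged turn boundaries this way could check how far its own response timing falls from the pattern in the list.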
Meet Alma
Natural Conversation Framework
Natural Language !=
Natural Conversation
intent
distance
cuisine
place
Natural Language
Natural Conversation
action pair
granting
request
dependency
sequence closing
hearing trouble
dependency
dependency
understanding trouble
base second part
base first part
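One way to read the contrast above: natural language analysis labels a single utterance (an intent plus entities such as cuisine, distance, and place), while natural conversation relates utterances to each other across turns (a base first part, the base second part that depends on it, repairs, and a sequence closing). The sketch below is a hypothetical data model written for illustration. It is not the Natural Conversation Framework's actual representation, and every class name, field name, and example value is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Utterance:
    """Natural language view: one utterance, analyzed in isolation."""
    text: str
    intent: str
    entities: dict = field(default_factory=dict)


@dataclass
class Sequence:
    """Natural conversation view: utterances that depend on one another."""
    base_first_part: Utterance                     # e.g. a request
    base_second_part: Optional[Utterance] = None   # e.g. the granting
    repairs: List[Utterance] = field(default_factory=list)  # hearing or understanding trouble
    closed: bool = False                           # set by a sequence-closing turn

    def is_complete(self) -> bool:
        # The action pair is complete once the first part has its second part.
        return self.base_second_part is not None


# Hypothetical example values.
request = Utterance(
    text="Is there a Mexican place within walking distance?",
    intent="find_restaurant",
    entities={"cuisine": "Mexican", "distance": "walking"},
)
seq = Sequence(base_first_part=request)
print(seq.is_complete())  # False: the request has not been granted yet

seq.repairs.append(Utterance("What did you say?", intent="hearing_trouble"))
seq.base_second_part = Utterance("Mañana's is on Fourth and Winchester.", intent="granting")
seq.closed = True
print(seq.is_complete())  # True
```

The point of the second structure is that a repair or a closing only makes sense relative to the sequence it belongs to, which intent classification alone does not capture.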
Conversational UX
a working set of principles
Saying is doing
J: T's- tsuh beautiful day out
isn't it?
L: Yeh it's jus' gorgeous...
CA
A: God izn it dreary.
(0.6)
B: [Y'know I don't think-
A: [.hh- It's warm though,
UX
“A "signifier" is some sort
of indicator, some signal
in the physical or social
world that can be
interpreted meaningfully.”
Don Norman
Cognitive Scientist
Recipient design
CA
“By 'recipient design' we refer to a
multitude of respects in which the
talk by a party in a conversation is
constructed or designed in ways
which display an orientation and
sensitivity to the particular other(s)
who are the co-participants.”
Harvey Sacks, Emanuel A. Schegloff, Gail Jefferson
Sociologists
B: Who's doing your remodel?
A: Dave
CA
C: Who's doing your remodel?
A: My neighbor across the street.
He's a contractor.
UX
“[Human-centered design is] an
approach that puts human needs,
capabilities, and behavior first, then
designs to accommodate those
needs, capabilities, and ways of
behaving.”
Don Norman
Cognitive Scientist
Minimization
D: Who's doing your remodel?
CA
A: Dave
D: Who?
A: You know, my neighbor across the street.
D: Oh!
A: You had a beer with him?
D: Right.
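The exchange above pairs minimization with repair: the speaker tries the shortest form first ("Dave") and expands only when the recipient signals trouble ("Who?"). A conversational agent could follow the same pattern. The sketch below is a toy illustration under that assumption. The `Reference` class, the `respond` function, and the set of repair initiators are hypothetical names and values, not part of any framework described in this talk.

```python
from typing import List, Optional

# Assumed set of repair initiators; a real agent would detect these more robustly.
REPAIR_INITIATORS = {"who?", "what?", "huh?"}


class Reference:
    """Formulations of one referent, ordered from minimal to most elaborate."""

    def __init__(self, formulations: List[str]):
        if not formulations:
            raise ValueError("at least one formulation is required")
        self._formulations = list(formulations)
        self._index = 0

    def answer(self) -> str:
        return self._formulations[self._index]

    def repair(self) -> str:
        # Move to the next, more elaborate form; stay on the last one if exhausted.
        if self._index < len(self._formulations) - 1:
            self._index += 1
        return self.answer()


def respond(ref: Reference, utterance: str) -> Optional[str]:
    """Expand the reference only when the recipient initiates repair."""
    if utterance.strip().lower() in REPAIR_INITIATORS:
        return ref.repair()
    return None  # no trouble signaled, nothing more to add


ref = Reference(["Dave", "You know, my neighbor across the street."])
print(ref.answer())          # Dave
print(respond(ref, "Who?"))  # You know, my neighbor across the street.
print(respond(ref, "Oh!"))   # None
```

Starting short and expanding on demand keeps the common case brief while still recovering when the recipient does not recognize the minimal form.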
CA
B: uh, yeah, I guess I'd like Mexican food
A: Mañana's is on Fourth and Winchester.
It's a great Mexican restaurant within
walking distance. It gets five out of
five stars. Would you like me to make
a reservation for you at Mañana's?
A: What kind of food would you like?
Mexican
Voice inputs are cheap, but voice
outputs are expensive
UX
1. Occam’s Razor
2. Minimize Cognitive Load
3. Eliminate Excise
Understanding is
interactional
CA
UX
1. Mental Models
2. Feedback
Emotions describe
actions
CA
CA
UX
Visceral, Behavioral, Reflective
Norman’s 3 Levels of Emotional Design
The best input method
is situational
UX
Context matters!
ibm.biz/conversational-ux
Bob Moore
rjmoore@us.ibm.com
Raphael Arar
rarar@us.ibm.com
Thank you.