I hear voices: Explorations of multidevice experiences with conversational assistants
JUST LIKE ATMS AND BANK TELLERS, PEOPLE WILL
REALIZE IT’S A MUCH MORE EFFICIENT WAY TO GET THE
INFORMATION YOU WANT AND WE WILL SPEND MUCH OF
OUR TIME TALKING TO MACHINES.
ON SPEECH RECOGNITION IN GENERAL
ACCORDING TO APPLE,
SIRI ALREADY HANDLES
MORE THAN A BILLION
SPOKEN REQUESTS PER
IT’S A FUNNY THING, TRYING TO MAKE SENSE OF A
TECHNOLOGY THAT HAS NO BUILT-IN VISUAL INTERFACE…
THIS GETS AT A DEEPER TRUTH ABOUT CONVERSATIONAL
TECH: YOU ONLY DISCOVER ITS CAPABILITIES IN THE COURSE
OF A PERSONAL RELATIONSHIP WITH IT.
NO VISUAL INTERFACE
▸ “A standardized mental image of a personality or character
that users infer from the application’s voice and language
choices.” It is a vehicle by which companies can brand a
service or project.
▸ Whether implicitly designed or not, users will perceive a
personality. “It is not advisable to leave the perceptions to
chance, especially because branding and image are at stake.”
Source: Voice User Interface Design
by Cohen, Giangola, Balogh
WHAT CAN BE ACHIEVED WITH A PERSONA?
Increase in engagement time
Increase in number of interactions
Higher overall rating of the product
Higher likelihood of recommending
device to a friend
Increase in monetization
Leads to a purchase of another device/
Source: NY Times
Google Now and Siri currently
represent two dramatically different
types of Human-Computer
Interaction styles. While Siri
intentionally and successfully mimics
a human, complete with a wry sense
of humor, Google Now opts instead to
function as a pure informational
oracle, devoid of personality or
OUR ONLINE CONVERSATIONS WILL INCREASINGLY BE
MEDIATED BY CONVERSATION ASSISTANTS WHO WILL
HELP US LAUGH AND BE MORE PRODUCTIVE.
THIS IS JUST THE START
▸ Cortana is confident, caring, competent, loyal; helpful, but not bossy.
▸ “Not only is she AI, she's self-aware, and that principle of transparency
informs a lot of how we handle error messages, our capabilities, tasks
and chitchat. You'll have more faith and trust in us if we do that for
▸ “As soon as the team gave Cortana a boost in confidence, people
immediately began responding to her more positively.”
▸ Cortana places importance on audio
feedback to help complete tasks
▸ Uses audio when processing speech
▸ Speaks like a real person “Sound
▸ Not as formal as some other
personas, uses contractions, is
▸ Siri has “Occasionally a light attitude”.
▸ It has been noted “What can Siri do
better? Have an emotional relationship
with a user.”
▸ More formal “Shall I create it?”
▸ Siri uses visual feedback for specific
details unless plugged into a headset
▸ Google Now does a great job
anticipating needs, and gathering
▸ Its persona is a reflection of its
capabilities and strengths: “Google
Now does not attempt personality,
▸ All business, no chitchat
▸ “ ‘Sorry, I didn’t understand the
question I heard’ is her favorite
response, though honestly she really
doesn’t sound very sorry.”
▸ She is just smart enough to be useful.
And she keeps getting smarter.
▸ “She’s like a genie in a sci-fi-looking
bottle – one not quite at the peak of her
powers, and with a tiny bit of an
▸ Cortana: “I’m sorry, the internet
and I aren’t talking right now”
▸ “Siri not available. Connect to the
▸ Google: No audio – visual says
“Can’t reach Google at the
As a toy, Hello Barbie needs to be both fun, leading girls through imaginative games,
and funny, telling jokes and being goofy. But Mattel also wanted Barbie to have an
empathetic, affirming sensibility aimed at young girls. - NYTimes
IF WE HAD MORE OF A FRAMING
“SHE’S THE PERFECT PERSON TO ASK
ABOUT RESTAURANTS OR SEARCHING
FOR DIRECTIONS” THEN SHE WOULD
WORK 95% OF THE TIME RATHER THAN
FAILING YOU 40% OF THE TIME.
▸ In many cases the assistant is still a layer on top of
the OS; it only goes one layer deep
A great conversational agent is only fully useful when it’s everywhere, when it can
get to know you in multiple contexts—learning your habits, your likes and dislikes,
your routine and schedule. The way to get there is to have your AI colonize as many
apps and devices as possible.
THE NEW SIRI IS PAVING THE WAY TO WHAT YOU MIGHT
CALL “AMBIENT COMPUTING” — A FUTURE IN WHICH
ROBOTIC ASSISTANTS ARE ALWAYS ON HAND TO ANSWER
QUESTIONS, TAKE NOTES, TAKE ORDERS OR OTHERWISE
FUNCTION AS AUXILIARY BRAINS TO WHOM YOU MIGHT
OFFLOAD MANY OF YOUR CHORES.
IPHONE 6S’S HANDS-FREE SIRI IS AN OMEN OF THE FUTURE
Start on one device and shift based on capabilities
of the device.
Hey Siri, what was the
score of the Habs game?
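One way to picture "start on one device and shift based on capabilities" is a small routing rule: stay on the device the user spoke to if it can present the answer, otherwise hand off to a device that can. The sketch below is only an illustration under assumptions. The `Device` class, the capability names ("audio", "glance", "screen", "video") and the `pick_device` function are hypothetical and do not describe how Siri or any other assistant actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    # Hypothetical device record: a name plus the output capabilities it offers.
    name: str
    capabilities: frozenset


def pick_device(devices, needed, current):
    """Stay on the current device if it covers `needed`; otherwise
    return the first listed device that does, or None if none can."""
    if needed <= current.capabilities:
        return current
    for device in devices:
        if needed <= device.capabilities:
            return device
    return None


# Assumed example devices and capabilities, for illustration only.
watch = Device("watch", frozenset({"audio", "glance"}))
phone = Device("phone", frozenset({"audio", "glance", "screen"}))
tv = Device("tv", frozenset({"audio", "screen", "video"}))
devices = [watch, phone, tv]

# A spoken score request made on the watch can be answered on the watch.
score_target = pick_device(devices, frozenset({"glance"}), current=watch)

# Game highlights need video, so the response shifts to the TV.
highlights_target = pick_device(devices, frozenset({"video"}), current=watch)

print(score_target.name, highlights_target.name)
```

Preferring the current device keeps the conversation where the user started it, and a handoff happens only when the response needs something that device lacks.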
Just do what you need it to do no matter how many
“apps” and devices it takes.
Send this to my Mom and
add to favorites.
Direct the output when it’s not the obvious choice.
And then the ability to surface it
Do you still need
Save this to my watch.
TRUE REMOTE CONTROL
Remote as in “on my way home”
Hey Siri, get the next
House of Cards teed up
Amazon has committed $100M through its Alexa Fund to attract developers,
manufacturers, and startups to create voice-driven applications and devices based on
Echo. Companies such as Orange Chef, Scout Alarm, Toymail, and Mojio have received seed
investment from the Alexa Fund.
▸ “Your customers can simply speak
to Alexa through the microphone
on your device and Alexa will
respond through your device's