This presentation summarizes our first event. It walks you through the technical capabilities of the major voice platforms (Amazon Alexa, Google Home, Siri, MS Cortana, Bixby, etc.), examines how they can be leveraged to build better products, and introduces the voice-specific design process.
Voice Tech TO #1
1.
2. Connected Lab
December 22, 2017
370 King St W #300 Toronto, ON M5V 1J9 / (647) 478-7493
An overview
Voice Technologies Today
3. Yours truly
Tim BETTRIDGE
Product Designer at
Connected Lab
Guy TONYE
Software engineer at
Connected Lab
Polina CHERKASHYNA
Product manager at
Connected Lab
6. Voice technologies are a subset of conversational interfaces.
Definition
• A conversational interface is one where a user interacts with
a system through conversation.
• These interfaces have been available for a while. One of the best
known is Interactive Voice Response (IVR), which is often used for
automated customer service (for example, when calling a bank, an
automated voice guides the user by asking them to use the keypad to
provide information).
Conversational interface
7. Chat bot interface
A user interface which
makes human interaction
with computers possible
through chat-like written
conversation.
Hybrid interfaces
Interfaces which combine
natural spoken
conversation, text input,
video etc.
Voice interface
Makes human interaction
with computers possible
through close to natural
conversation by voice.
A closer look at 3 types of conversational interfaces
8. What happens under the hood?
A Virtual Personal Assistant at the
core
• The devices are the interface for the
consumer.
• When users converse in a
conversational interface, the device
forwards the requests to the Virtual
Personal Assistant (VPA) behind the
scenes. The VPA processes the request
and sends an appropriate response
back to the device.
• The VPA is a web server.
VPA
NLP-
NLU
ML
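The round trip between device and VPA can be sketched in a few lines. This is only an illustration under stated assumptions: the names `vpa_handle` and `device_ask`, the JSON fields `utterance` and `speech`, and the canned answers are all hypothetical, and the substring check stands in for the NLP/NLU/ML stage of a real VPA.

```python
import json


def vpa_handle(request_json: str) -> str:
    """Hypothetical VPA endpoint: takes a JSON request, returns a JSON response."""
    request = json.loads(request_json)
    utterance = request["utterance"].lower()
    # Stand-in for the NLP/NLU/ML stage; a real VPA does far more than this.
    if "time" in utterance:
        speech = "It is twelve o'clock."  # canned answer, for illustration only
    else:
        speech = "Sorry, I can't help with that yet."
    return json.dumps({"speech": speech})


def device_ask(utterance: str, send=vpa_handle) -> str:
    """Hypothetical device: forwards the request and returns the text to speak."""
    response_json = send(json.dumps({"utterance": utterance}))
    return json.loads(response_json)["speech"]
```

In a real deployment `send` would be an HTTPS call to the VPA's web server rather than a local function; passing it in as a parameter keeps the sketch runnable offline.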
9. VPA - a closer look at two features
Natural Language
Two powerful features of the VPA are
Natural Language Processing (NLP) and
Natural Language Understanding (NLU)
NLP and NLU are tools and techniques
that help the server convert human
requests into an action or set of actions
that the machine can execute.
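As a rough illustration of turning a request into an executable action, the sketch below maps an utterance to an intent with slots. Everything here is an assumption made for the example: the intent names, the patterns, and the `understand` function are hypothetical, and production NLU relies on trained models rather than hand-written regular expressions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str
    slots: dict = field(default_factory=dict)


# Hypothetical patterns; each named group becomes a slot.
PATTERNS = [
    ("PlayMusic", re.compile(r"^play (?:some )?(?P<query>.+?)(?: music)?$")),
    ("SetTimer", re.compile(r"^set a timer for (?P<minutes>\d+) minutes?$")),
]


def understand(utterance: str) -> Intent:
    """Normalize the utterance and return the first matching intent."""
    text = utterance.lower().strip().rstrip(".!?")
    for name, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return Intent(name, match.groupdict())
    return Intent("Fallback")
```

The returned intent name tells the server which action to run, and the slots carry its arguments (for example, the music query or the timer length).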
Machine learning
The VPA also leverages Machine
Learning (ML) for two purposes.
Understand the request
• Because accents and idioms vary, ML
helps to adapt and improve the
understanding of the user request
Fulfill the request
• ML is also used to tailor the actions
and the response that is given to the
user upon request
12. 15 mln+ Alexa devices
5 mln Google
50% of teens and 40% of adults
$18.30 Billion USD by 2023
All major car manufacturers
55% of U.S. households by 2022
13. When is voice relevant for my business/
product?
Fewer steps
than via
phone or PC
Hands
occupied
Multi-
tasking
Assisting
differently
abled people
Fun/
leisure
Interactive
learning
Based on these core use cases, it’s possible to think of multiple
voice-specific use cases for different businesses and domains.
14. The kitchen is THE place where our hands are occupied. Think of the various voice services that
could be useful in this situation and the brands that could own them.
Step by step
voice recipes
Food delivery
Voice search
for music
Kitchen timer (an
embedded Alexa feature)
15. Cars are another “hands occupied” space. They hold ample opportunities for “aftermarket
products” - devices that help drivers operate navigation or car infotainment by voice if the
vehicle doesn’t have an embedded voice assistant. At the same time, most car manufacturers are
building voice capabilities into their new models. Those will cover multiple use cases: voice
search, navigation input, adjusting the car AC and other features, and even coaching for new drivers.
16. Requesting music is much faster by voice than by phone, and key music service providers
already offer voice-first music discovery and playback capabilities.
Amazon Music launched a feature where users can search for music by voice using any
combination of terms, like “babymaking country music” or “slow Italian dinner music”.
Play some Italian dinner music
17. Feeling sick is not a “hands occupied” situation, but it’s a time when ordering something by voice is
much faster and easier than doing so via a phone or laptop. Multiple brands could leverage
this opportunity and create an additional entry point for their customers.
18. Products for people with special needs could benefit a lot from voice features. Bionik Laboratories
is already using a voice interface to help patients who are unable to use their legs operate an
exoskeleton.
19. Leisure and entertainment are among the most in-demand voice-first experiences. Providers of
news, shows, radio and games are already actively leveraging voice for new product
and service development and as an additional digital touch point with their users.
20. Situations where a voice interface is more suitable than touch aren't limited to our planet.
21. Smart voice assistants offer ample opportunities for education providers to create interactive
“tutors” that can help you learn facts, listen to lectures and test your knowledge.
22. Customer service is a well-known area for voice capabilities. Here the experience will improve as
voice assistants become smarter and offer more human-like communication patterns.
23. Voice platforms offer an opportunity for automation and thus savings for businesses. Think about a
voice capability that could help visitors seamlessly place an order in a restaurant without
waiting for a server.
24. There is space for product improvement,
new product development, marketing
and investment.
25. But there’s always a “BUT” …
Users don’t invest any
effort into new feature/
skill discovery
Users learn by talking to
the assistant - not by
using the companion app
If a feature or capability
didn’t work the first time
they will not try it again
27. Fundamentally lean
Hypothesis/
product vision
Build
Learn Measure
Scale
(Rapid prototyping
and user validation)
When dealing with new products and
technologies it’s important to keep in
mind that risks of failure are inherently
high and “gut instinct” is not enough to
create a successful product/experience.
Start user validation as soon as you have
a clear hypothesis and ensure that user
validation is frequent and continuous
during the discovery, definition and build
phases of your project.
28. Invest more when risks are lower
Time
Investments
Risk
Continuous user validation will allow you
to iterate faster and eliminate risks while
making only small investments of time
and energy.
29. Steps towards effective CUI design
1 Pick the right use cases
• Not everything is a good fit.
• Develop user stories.
• Look for experiences that make things faster & simpler.
2 Create the personality
• How do you want your persona to feel and sound?
3 Write dialogs
• Figure out the ‘happy path’ and then think about the other paths and branches.
• Conversation repair is very important.
4 Test with real users
• A 1:1 operator allows for high experiential fidelity prototypes.
• Interview before and after.
• Become the puppet master.
5 Measure & learn
• Record your test sessions.
• Capture user utterances.
• Use analytics to elicit insights.
6 Iterate & test
• Iterate, test, measure, and repeat.
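To make the ‘happy path’ and conversation repair concrete, here is a minimal sketch of a single-question dialog. The prompts, the category list, the `MAX_REPAIRS` limit and the function names are all hypothetical choices for illustration, not a prescribed design.

```python
from typing import Optional

MAX_REPAIRS = 2  # assumption: offer two repair prompts, then give up
CATEGORIES = ("tech", "business", "economy")  # hypothetical categories


def match_category(utterance: str) -> Optional[str]:
    """Return the first known category mentioned in the utterance, if any."""
    words = utterance.lower().split()
    for category in CATEGORIES:
        if category in words:
            return category
    return None


def run_dialog(user_turns):
    """Return the prompts the assistant speaks for the given user turns."""
    prompts = ["Which news category would you like?"]
    repairs = 0
    for turn in user_turns:
        category = match_category(turn)
        if category:
            # Happy path: the user was understood.
            prompts.append(f"Here are the top {category} headlines.")
            return prompts
        repairs += 1
        if repairs > MAX_REPAIRS:
            prompts.append("Sorry, I'm having trouble. Let's try again later.")
            return prompts
        # Conversation repair: apologize and remind the user of the options.
        prompts.append("Sorry, I didn't catch that. You can say tech, business, or economy.")
    return prompts
```

Writing the flow down this way makes it easy to see which user turns follow the happy path and which trigger a repair prompt before any platform-specific work begins.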
31. Spreadsheet
and OSX TTS
Simple and effective for
early validation.
• Have an external speaker
and mic to create higher
experiential fidelity.
• Be ready to improvise.
• Record your sessions
with an audio recorder.
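One possible way to wire a spreadsheet to the OSX text-to-speech is sketched below. It assumes the spreadsheet is exported as CSV with two columns (a cue and the line to speak) and uses the macOS `say` command with its `-v` voice option; the function names, the CSV layout and the choice of the "Samantha" voice are assumptions for illustration.

```python
import csv
import io
import subprocess


def load_script(csv_text: str) -> dict:
    """Parse 'cue,line' rows into a lookup table the operator can trigger."""
    reader = csv.reader(io.StringIO(csv_text))
    return {cue.strip(): line.strip() for cue, line in reader}


def say_command(line: str, voice: str = "Samantha") -> list:
    """Build the macOS `say` invocation for one line of dialog."""
    return ["say", "-v", voice, line]


def speak(line: str) -> None:
    """Speak a line aloud (macOS only, since it shells out to `say`)."""
    subprocess.run(say_command(line), check=True)
```

During a test session the operator picks a cue, for example `speak(script["repair"])`, while the participant hears only the synthesized voice.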
32. Interactive Keynote
Keynote is a familiar
way to prototype quickly
and effectively.
• Easier to see the dialog
flow and follow along.
• Be ready to improvise.
• Conversational repair is
essential.
• Record your sessions
with an audio recorder.
33. CUI
Prototyping Tools
Tools for CUI design and
testing are in development.
• Highest
experiential fidelity
• Transcripts of tests allow
for later review
• Utterance capture
and analytics
• Eg. Simili and others
35. The hypothesis
Idea: we have daily company stand-ups where we share tech news. Why don’t we create a voice
capability which would provide the latest news on a topic or category (tech, business, economy)?
Hypothesis: it's faster and easier for users to request a news update by voice than via a mobile app.
36. Initial user testing
• Users want to request
news proactively e.g.
ask for the news on a
certain topic
(Facebook), or category
(tech news).
• Users want to get only
the top 3-5 headlines
first and then have an
ability to ask for more
on the headline they
found interesting or
send the full article to
their phone.
37. Iterating on the flow: v1.0
We designed the first flow as an assumption and started to iterate from there, taking each new
iteration to users, collecting feedback and improving the flow before doing any engineering work
other than researching platform capabilities.
38. Iterating on the flow: v1.0 feedback from users
• The intro is too long,
especially when you listen
to it a second time.
39. • Users are not interested in the source as
long as they hear the top 3 news items.
• Users want distinct discourse markers
“Number 1… Number 2… Number 3…”.
Iterating on the flow: v1.0 feedback from users
41. • It's hard to remember 3 options
of what you can do with the
skill. Users prefer a shorter,
conversation-like flow.
Iterating on the flow: v2.0 feedback from users
42. • The added intro to the article
wasn’t creating enough value
for users. We should look at
new ways to summarize
articles
Iterating on the flow: v2.0 feedback from users
44. Ready for next test.
Iterating on the flow: v3.0 - prototype
45. An alpha prototype of Tailor News for Google Home was
quickly built using:
Google Assistant
• When registered as a Google action
(action.developer.google.com), a voice capability can
leverage the Google Assistant to process requests.
Dialogflow
• Interface for defining the model of user utterances and
their mapping to the actions to perform when those
requests are received.
AWS Lambda
• The action is fulfilled by a webhook running on
AWS Lambda.
Quick prototype: Tailor News on Google Home
NLP-NLU
VPA
Assistant
https://newsapi.org
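A fulfillment webhook of this kind might look roughly like the sketch below. It is not the prototype's actual code. It assumes the Dialogflow v2 webhook format (`queryResult.parameters` in the request, `fulfillmentText` in the response), an API Gateway style Lambda event with a JSON `body`, the News API `v2/top-headlines` endpoint, and a hypothetical `category` parameter defined in the Dialogflow agent. `NEWS_API_KEY` is a placeholder.

```python
import json
import urllib.parse
import urllib.request

NEWS_API_KEY = "YOUR_API_KEY"  # placeholder, not a real key


def fetch_headlines(category: str, limit: int = 3) -> list:
    """Fetch top headline titles from News API (performs a network call)."""
    query = urllib.parse.urlencode(
        {"category": category, "country": "us", "pageSize": limit, "apiKey": NEWS_API_KEY}
    )
    with urllib.request.urlopen(f"https://newsapi.org/v2/top-headlines?{query}") as resp:
        payload = json.load(resp)
    return [article["title"] for article in payload.get("articles", [])][:limit]


def build_speech(headlines: list) -> str:
    """Join headlines with spoken discourse markers: Number 1, Number 2, ..."""
    if not headlines:
        return "Sorry, I couldn't find any news right now."
    return " ".join(f"Number {i}: {title}." for i, title in enumerate(headlines, 1))


def lambda_handler(event, context, fetch=fetch_headlines):
    """Lambda entry point; `fetch` is injectable so the handler can run offline."""
    body = json.loads(event.get("body") or "{}")
    params = body.get("queryResult", {}).get("parameters", {})
    category = params.get("category") or "technology"  # assumed default category
    speech = build_speech(fetch(category))
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"fulfillmentText": speech}),
    }
```

The response wording follows the user feedback from the earlier iterations: only the top headlines, each introduced by a distinct discourse marker.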
46. Three key bullet points
Use SMMRY, the API behind
the famous TLDR Reddit bot,
for article summaries (3 bullet
points) rather than the “plus one
sentence” approach.
Beta
Testing the skill with a larger
set of users
Refining the dialogues
We’ve established the flow;
now we need to finalize the
actual wording.
Next steps
48. Summary
1 Lean
Fully execute the lean
methodology to reduce
risks. Remember that gut
instinct is not enough for
new product development.
2 Design process
Pick good use cases, create
a persona, test with real
users, measure and learn.
3 Grab the opportunity
Build your expertise now
and be the first to make your
products better and win
over competition.
49. 1) Using voice technologies to build better products (existing or new).
2) Designing for Voice (dialogue building, guidelines, prototyping).
3) A detailed client case study.
4) Engineering: rapid prototyping, cross-platform capability delivery.
5) Testing voice products (QA).
6) Results of Tailor News launch, available analytics.
Upcoming events: your input is welcome