Getting Started with Voice UI

by Isidore Gotto
Last updated: Feb. 2018
“Hello, I’m _____.
“

TOPICS
Voice Overview
• What is Voice?
• Why should you consider adding Voice?
• Voice Only: Pros & Cons
• Voice: Things to Consider
• Introducing Voice to SDLC
• Crawl, Walk, Run Approach
Designing for Voice
• Intro
• 5 Steps to Designing for Voice before Coding
• 7 Principles for Designing Voice
• Real Life User Conditions
• Error Handling
• Identifying the Problem
• Complexity by Data Inputs
• Voice AI Persona, Personality, Tone …
Resources & Reference Links
• Designer Tool-Kit Downloads
• UX Research Result
• Platform Comparison
• Industry Best Practices URLs
• Prototyping & Development Tools

Voice Overview

What is Voice?
Voice experiences have been around since the 1950s. Today’s advances in technology and the demand for innovation have brought us to the next evolution of human-computer interaction.
In today’s market you may come across many terms for Voice, e.g. voice assistant, voice-enabled speakers, Voice UI (VUI), Conversational UI (CUI), Artificial Intelligence (AI), etc. All you need to understand is that it’s software that listens for grammatical details and attempts to recognize sentence structure to understand the context and meaning of instructions.
Voice User Interface (VUI) is the next generation of human-computer interaction. VUIs allow people to use the power of their voice to interact with computers/systems, instead of using their hands with a mouse, keyboard, or touch screen.
This method of interacting with your product & services has unlimited potential.
* Apple’s Siri, Google’s Assistant, Amazon’s Alexa, and Microsoft’s Cortana are all prime examples of consumer-level AI that can respond to a request, control some physical devices, offer options based on internet searches, and more.
** IBM’s Watson, a business-to-business solution, takes AI to another level by adding the ability to make predictions, assumptions, and even some reasoning about computational outcomes.
Technology has Arrived
As of June 2017, Amazon Alexa has grown to 15,000 skills & 98% speech recognition accuracy.
Tech giants like Google, IBM, Apple, Cisco and even Slack are all investing in voice technology.

Why should you consider introducing Voice Assistance to your products & services?
Problem
• Today it takes an average new user multiple attempts and extensive training to get familiar with your product & services.
• Fact: the majority of call volume across all business types revolves around “how do I…”
Key benefits to users & clients:
• Simplification / Ease of Use - “everyone knows how to talk…”
• Speed & Convenience in Hands Free / Screen Free Situations
• Multi-Tasking - working on one file while requesting info from another
* Taking it beyond Voice Only and introducing multi-modal Voice experiences with a new Voice GUI brings contextual navigation, orientation, personalization & additional benefits to users.
** Voice assistants can help with human empathy, as humans have a difficult time understanding tone via the written word alone. Voice, which includes tone, volume, intonation, personality and rate of speech, conveys a great deal of information.
Introducing Voice Assistance to your product & services will one day help improve your overall client experience.
“Voice is being seen as the future of software & computer interaction.”

Voice Only: Pros vs. Cons
Pros
✓ Get a specific question addressed more easily and faster; Ask/Command & Done!
✓ Great for specific info/data lookup and data analysis tasks that are either buried or not accessible via current navigation
✓ Focused conversation & a limited number of choices lend speed & confidence to decision making
✓ Handy when the user’s situation requires a hands-free setup
✓ More natural interactions – “Humanize the experience”
Cons
✗ May not be obvious to the user that they can initiate a conversation, or what/how to ask
✗ User may need to adjust their work environment
✗ High risk of exceeding the user’s cognitive load when processing a voice response
✗ Not suitable for complex tasks that require visual guidance or user input, or that involve many choices
✗ Privacy & security concerns with speaking out loud

Voice: Things to Consider
Benefit of Introducing a Voice + GUI Experience: “Multi-Modal Interactions” – combining two or more modes of interaction.
• Multi-Modal allows you to compensate for cognitive memory weaknesses & task complexity, through the current visual interface or by introducing a new Voice GUI overlay.
Examples:
• Leverage a voice/chat based experience.
• Visual confirmations (the Hound app does a great job: input is voice only, responses are voice + visual).
UX Challenges to Overcome
• User Input - Speech-to-Text recognition (based on technology selection constraints)
• Type of Data Input - will vary based on complexity of Use Case & Task
• Privacy - Speaking sensitive information out loud (the system needs to be able to identify sensitive information and not respond with audio)
The challenge is making the experience more natural and tackling the wide variety of ambiguity that may occur.
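The privacy challenge above (identify sensitive information and do not respond with audio) could be sketched roughly as follows. This is a minimal illustration only: the patterns, the `Response` type and the function names are assumptions made for this example, not part of any platform's API, and a real product would define its own sensitivity rules.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration; not an exhaustive or production list.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # shaped like a US Social Security number
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),  # shaped like a payment card number
]


@dataclass
class Response:
    speech: str   # what the assistant says out loud
    display: str  # what the assistant shows on screen


def contains_sensitive(text: str) -> bool:
    """Return True if any hypothetical sensitive pattern appears in the text."""
    return any(pattern.search(text) for pattern in SENSITIVE_PATTERNS)


def build_response(answer: str) -> Response:
    """Speak the answer unless it looks sensitive; then show it on screen only."""
    if contains_sensitive(answer):
        return Response(speech="I've put the details on your screen.", display=answer)
    return Response(speech=answer, display=answer)


if __name__ == "__main__":
    print(build_response("Your card number is 4111 1111 1111 1111").speech)
    print(build_response("Your package shipped on 12 March").speech)
```

Note that this only works in a multi-modal setup; a voice-only device would need a different fallback, such as sending the details to another channel.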

Introducing Voice to your SDLC
A conversational or natural language user interface is a method of interacting with computers through text or voice
commands.
With good speech recognition, accurate instruction detection & quick responses, voice interaction is starting to feel natural.

Designing for Voice

Voice User Interface (VUI) systems understand voice commands and respond either by speaking back or by showing a visual response.
The difference between Voice-Only and ‘multi-modal’ interactions is that multi-modal interfaces can convey more information to the user than voice-only devices. Multi-modal interfaces could help drive huge advances in the workplace.
Voice-Only interactions, meanwhile, benefit the user in hands-free situations and provide quick answers to short commands.
Adding voice to any system will give it a sense of life, personality, & character. Moving forward with voice, we must think about how verbal conversations sound, feel, and flow.
“Using a VUI should feel as natural as speaking, and listening, to any other human.”

5 Steps to Designing a Voice Experience before #Coding
1. Discover
What problem can voice solve? How will voice provide value to your users? i.e. consider all environments
2. Define
Voice Persona – Tone, Voice, Personality…
Evaluate Capabilities – Will voice be a good fit for this use case or task? i.e. start with introducing 1 to 5 capabilities. - Download Voice Evaluation Worksheet
3. Detail Conversation Flow
Begin with the “Happy Path”, a conversational flow in which the voice app can respond to the user’s request without any exceptions or errors. Then move on to detailing the conversation flow for exceptions and errors. - Download Design Kit
4. Describe Alternative Words & Phrases for NLP
People don’t always use the same words to say the same thing, and voice apps need to be taught that. Phrase-mapping is an exercise to teach voice apps to accommodate variation in the way users phrase their requests.
5. Refine
Test, learn, measure & refine with user research.
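The phrase-mapping exercise in step 4 could be captured in a small lookup like the sketch below. The intent names and sample phrases are hypothetical and chosen only for illustration; real voice platforms provide their own intent and utterance models, which this plain substring match does not try to reproduce.

```python
import re

# Hypothetical intents and sample phrases, for illustration only.
PHRASE_MAP = {
    "check_order_status": [
        "did my package ship",
        "where is my order",
        "what's my status",
    ],
    "lookup_phone_number": [
        "what is the phone number",
        "how do i call",
    ],
}


def normalize(utterance: str) -> str:
    """Lowercase, replace punctuation (except apostrophes) and collapse spaces."""
    cleaned = re.sub(r"[^a-z0-9' ]+", " ", utterance.lower())
    return " ".join(cleaned.split())


def match_intent(utterance: str):
    """Return the first intent whose sample phrase occurs in the utterance."""
    text = normalize(utterance)
    for intent, phrases in PHRASE_MAP.items():
        if any(phrase in text for phrase in phrases):
            return intent
    return None  # no mapping found; hand over to error handling


if __name__ == "__main__":
    print(match_intent("Hey, did my package ship?"))  # check_order_status
    print(match_intent("Play some jazz"))             # None
```

Each round of user research in step 5 would typically add new phrases to the map as users reveal wording the team did not anticipate.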
Steps to VUI: Discover, Define, Detail, Describe, Refine

Principles for Designing for Voice
* Voice design guides by Google, Amazon, Cortana
Voice UI & Conversational UI Design Kit - Download
Voice Task Evaluation Worksheet - Download

Things to Consider when Designing for Voice
REAL LIFE USER CONDITIONS
• interrupted
• self correction
• cut off too soon
• background noise
• confused
• too many choices
• didn’t understand
• talked too long
• speaks in other terms
• coughs
• hesitation
• connection cuts off
• language
• accents
• soft spoken
• culture
• jumps from one thought to another
• privacy
“It’s hard enough to speak with another human.”

Error Handling
I Don’t Understand You
When a so-called “error” occurs in a conversation, it should be treated simply as a new turn in the dialog, only with different conditions.
Example:
• I did not understand your request. Did you say A or B?
• I currently am not able to process your request, would you prefer A or B?
• I am not able to process your request. Would you like me to connect you with a Service Representative?
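One way to treat an error as "a new turn in the dialog" is to keep a small piece of dialog state and pick the next prompt from it. The sketch below reuses the three example prompts above. The escalation order (clarify, then offer choices, then hand off) and the names `DialogState` and `next_prompt` are assumptions made for illustration, not a prescribed design.

```python
from dataclasses import dataclass


@dataclass
class DialogState:
    failed_turns: int = 0  # consecutive turns the system could not resolve


def next_prompt(state: DialogState, understood: bool, options=("A", "B")) -> str:
    """Return the next system prompt, treating an error as one more dialog turn."""
    if understood:
        state.failed_turns = 0
        return "Okay."
    state.failed_turns += 1
    first, second = options
    if state.failed_turns == 1:
        # First miss: ask the user to confirm between the likely choices.
        return f"I did not understand your request. Did you say {first} or {second}?"
    if state.failed_turns == 2:
        # Second miss: narrow the conversation to what the system can do.
        return (f"I am currently not able to process your request. "
                f"Would you prefer {first} or {second}?")
    # Further misses: offer a hand-off instead of looping.
    return ("I am not able to process your request. "
            "Would you like me to connect you with a Service Representative?")


if __name__ == "__main__":
    state = DialogState()
    for _ in range(3):
        print(next_prompt(state, understood=False))
```

Resetting the counter on a successful turn keeps one earlier misunderstanding from pushing the user toward a hand-off later in the same conversation.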

GET STARTED WITH ASKING: What user problem are you looking to solve?
Identifying if a Voice UI experience is the right solution
• First, identify your intended user persona & personality.
• Then, lay out their typical journey when using your application.
• Next, identify areas where Voice will benefit the user.
• Then, identify what other personas will benefit from the same or a similar Voice experience.
• Design, prototype and test – more on this later
Examples where Voice can make a BIG difference assisting users today:
1. Difficulty finding or navigating applications, i.e. how do I… Where is… Shortcuts...
2. What’s my status? i.e. Did my package ship?
3. What is __________ phone number?
4. I have a specific question on ___________.
5. Look up _________ information or data.
6. Show me _________ report.
7. Calculate total or difference between _________ & _________.
Note: where possible, try to use data/analytics first to identify the areas of your applications that are most frequently used or have the largest call volume. Then use the voice task
evaluation worksheet to evaluate them.
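The note above suggests starting from data. A rough sketch of that shortlist step is shown below. The `AppArea` fields and the sample figures are made up for illustration, and ranking by call volume first and usage second is one possible weighting, not a rule from the worksheet.

```python
from dataclasses import dataclass


@dataclass
class AppArea:
    name: str
    monthly_uses: int           # how often the area is used (hypothetical metric)
    monthly_support_calls: int  # "how do I..." calls about the area (hypothetical metric)


def rank_voice_candidates(areas, top_n=3):
    """Sort areas by support-call volume, then by usage, highest first."""
    ranked = sorted(
        areas,
        key=lambda area: (area.monthly_support_calls, area.monthly_uses),
        reverse=True,
    )
    return ranked[:top_n]


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = [
        AppArea("Reports", monthly_uses=900, monthly_support_calls=40),
        AppArea("Order status", monthly_uses=500, monthly_support_calls=120),
        AppArea("Settings", monthly_uses=100, monthly_support_calls=5),
    ]
    for area in rank_voice_candidates(sample, top_n=2):
        print(area.name)
```

The shortlisted areas would then go through the voice task evaluation worksheet rather than straight into design.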
Our GOAL is to build a complete & seamless Voice Experience across all your products.
Voice UI & Conversational UI Design Kit - Download

Complexity by Data Input Types on Users via Voice/Conversational UI
TYPES OF DATA INPUT, rated for: VOICE ONLY (standalone) | VOICE + GUI (Multi-modal Exp.) | CONVERSATIONAL UI CHAT / TEXT | PRO-ACTIVE CONVERSATIONAL UI w/ AI (Multi-modal Exp.)
On/Off (checkbox, switch)
• Voice Only: Easy
• Voice + GUI: Easy
• Conversational UI Chat / Text: Easy
• Pro-active Conversational UI w/ AI: Easy
Select one or multiple from options offered (radio options, dropdown menus, checkboxes, cards, multi-select)
• Voice Only: Difficult (cognitive load with visual aid)
• Voice + GUI: Easy (Multi-Mode: two or more modes of interaction. GUI used for data entry, selection, validation, confirmation)
• Conversational UI Chat / Text: Difficult. Presentation of choices needs to be limited, especially multiple choice.
• Pro-active Conversational UI w/ AI: Difficult. Presentation of choices needs to be limited, especially multiple choice.
Structured fields (dates, currency, etc.)
• Voice Only: Difficult (inconsistent voice recognition performance)
• Voice + GUI: Easy
• Conversational UI: Easy, but could be tedious when multiple fields are involved. Recommend large input forms to be designed in traditional UI format.
Text fields with variable data (email address, people names, addresses)
• Voice Only: Difficult (voice recognition of variable data)
• Voice + GUI: Easy
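The ratings above can be turned into a quick check when scoping a task. The sketch below encodes only the Voice Only and Voice + GUI ratings, which are the two that are given for every input type. The dictionary keys and the function name are hypothetical labels chosen for this example.

```python
# Ratings for Voice Only and Voice + GUI, taken from the comparison above.
# The keys are hypothetical labels for the four input types.
DIFFICULTY = {
    "on_off": {"voice_only": "easy", "voice_gui": "easy"},
    "select_from_options": {"voice_only": "difficult", "voice_gui": "easy"},
    "structured_field": {"voice_only": "difficult", "voice_gui": "easy"},
    "variable_text": {"voice_only": "difficult", "voice_gui": "easy"},
}


def recommend_mode(input_types):
    """Recommend voice-only when every input is easy by voice, else voice + GUI."""
    unknown = [t for t in input_types if t not in DIFFICULTY]
    if unknown:
        raise ValueError(f"unknown input types: {unknown}")
    if all(DIFFICULTY[t]["voice_only"] == "easy" for t in input_types):
        return "voice_only"
    return "voice_gui"


if __name__ == "__main__":
    print(recommend_mode(["on_off"]))                      # voice_only
    print(recommend_mode(["on_off", "structured_field"]))  # voice_gui
```

A task that mixes input types takes the rating of its hardest input, which is why a single structured field is enough to tip the recommendation toward a multi-modal experience.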

Creating the Voice of A.I. for your Product
Characteristics of Voice for A.I.
1. Tone of Voice
2. Gender of Voice
3. Personality
4. Character
5. Word & Phrase Choices
6. Functional Design
7. Style & Technique
Base your characteristics on:
• Your user population
• Their needs
• The imagery & qualities associated with your brand

Resources
Reference Links, Research Results, Frameworks & more.

Reference & Resource Links
We have created several downloadable tool-kits to help you get started with adopting Voice/Conversational UI experiences in your products.
• Customer Journey & Scripting for Voice – helps you facilitate stakeholder discussions to evaluate where in your customer journey Voice UI would make an impact: Product Discovery, Initial Setup of a new Client, First Benefit/Use, and Re-Use. It also includes samples for designing conversational UI with scripts and prototype references. – download
• Voice Use Case / Task Evaluation Worksheet – helps you quickly evaluate your product use cases for Voice prior to designing. – download
• Voice Personality Development – expands on traditional personas, looking deeper into user personality traits and character, and into your AI personality.

Reference & Resource Links
Industry UX design best practices and heuristics for voice & conversational UI.
Amazon:
https://developer.amazon.com/designing-for-voice/design-process/
Apple Siri:
https://developer.apple.com/sirikit/
Google:
https://developers.google.com/actions/design/checklist
https://developers.google.com/actions/design/principles
Microsoft:
https://docs.microsoft.com/en-us/cortana/skills/design-principles
Samsung Bixby:
http://bixby.samsung.com/

Platform Comparison (as of Oct. 2017)
Amazon Skills
• Available on: Standalone, Mobile (Alexa for Business announced Nov. 2017)
• Pros: 95-98% accuracy; languages US, Europe, German, Japanese …
• Cons: To Be Delivered (TBD)
Apple Siri Kit
• Available on: iPhone, iPad, Mac, MacBook, Apple Watch, HomePod
• Pros: 88% accuracy; multi-language supported …
• Cons: …
Google Assistant
• Available on: Phone, tablet, laptop, standalone devices & web
• Pros: 95-98% accuracy; multi-language supported …
• Cons: …
Microsoft Cortana
• Available on: Laptop, desktop, standalone devices
• Pros: 95-98% accuracy; multi-language supported …
• Cons: …
Samsung Bixby
• Available on: Phone, tablet, TV
• Pros / Cons: To Be Delivered (TBD)
Company Virtual Assistant
• Available on: Company Ecosystem of products & services, online or native app
• Pros / Cons: To Be Delivered (TBD)
Other platforms…

Google Assistant - https://developers.google.com/assistant/sdk/overview | https://developers.google.com/assistant/sdk/
Google Speech - https://cloud.google.com/speech/
Apple Siri Kit - https://developer.apple.com/sirikit/
Microsoft Cortana - https://developer.microsoft.com/en-us/cortana
Microsoft Bing Speech API - https://azure.microsoft.com/en-us/services/cognitive-services/speech/
UWP Speech Recognition - https://docs.microsoft.com/en-us/windows/uwp/input-and-devices/speech-recognition
Microsoft Cortana Skills Kit - https://developer.microsoft.com/en-us/cortana
Microsoft speech recognition reached a 5.1% error rate in Aug. 2017 - https://techcrunch.com/2017/08/20/microsofts-speech-recognition-system-hits-a-new-accuracy-milestone/
Finnish IT company Blucup wanted to find a way for its salespeople to input customer data and generate leads while in the field - https://customers.microsoft.com/en-us/story/blucup-discrete-manufacturing-cognitive-services
Samsung Bixby - http://developer.samsung.com/home.do | https://news.samsung.com/global/bixby-a-new-way-to-interact-with-your-phone
Amazon Alexa - https://developer.amazon.com/alexa
Voice Design Guide - https://developer.amazon.com/designing-for-voice/
Amazon - https://developer.amazon.com/designing-for-voice/
Google - https://developers.google.com/actions/design/
Facebook - https://developers.facebook.com/docs/messenger-platform/introduction/general-best-practices
Slack - https://api.slack.com/best-practices
Apple - https://developer.apple.com/ios/human-interface-guidelines/overview/themes/
Paid Vendors
KeenResearch - http://keenresearch.com/
DialogFlow - Conversational UX Platform for Web, Mobile and IoT -
https://dialogflow.com/
SpeechMatics - https://www.speechmatics.com/
Open Source Vendors
SoundHound “Hound” - https://soundhound.com/hound
CMU Sphinx - https://cmusphinx.github.io/
OpenEars - https://www.politepix.com/openears/
iSpeech - https://www.ispeech.org/

Prototyping & Development Tools
Non-Developer
• Wizard of Oz – set of microphones and speakers
• Sayspring.com (voice only, can be connected to Amazon and Google)
• InvisionApp, Axure, Keynote etc. (used to create GUI part of the experience)
Development Skills Required
• Wit.ai
• Dialogflow.com
• SoundHound.com ‘Houndify’
• Amazon Alexa Skills
• Google Cloud Platform
• Apple Speech Recognition
• IBM Watson – Speech to Text and Text to Speech
Voice Analytics
• VoiceLabs.com
