Getting ready for voice
Maarten Dings
About me
Creative Technologist at CX Company
Finding magic where left brain meets
right brain
Innovator and early-adopter
Connecting technology, business and
people
Some cool facts about CX Company
• Founded in 2004: 15 years of experience
• AI-driven, enterprise-ready conversational platform
DigitalCX
• Content-first developer-friendly platform
• Support for over 20 languages
• Over 4 million digital interactions per day
• Leading provider for chatbots and digital assistants in NL
• ISO 27001 certified. Your data is safe with us
• Pan-European operations with offices in NL, DE and UK
Building voice journeys through an intuitive visual
dialog creator
We also offer easy integrations for backend systems
and devices. Available through NPM
Let's embark on a high-level journey
through the adoption of smart
speakers and voice assistants
What is the current state? What can we do better? What new challenges do we face?
And what can we do to make the experience better?
The current state of
smart speakers
How many people own
smart speakers?
Smart speakers
are a hot market
Of the 30% planning to purchase, 73% of
them want to do so in the next six months.
More speakers are available, and at
lower prices.
75% of households will have at least
one smart speaker by 2020
Source: Gartner
Also by 2020, customers will manage
85% of their relationship with the
enterprise without interacting with a
human
Source: Gartner
You can connect the dots, right?
Use case for smart
speakers →
What are we talking
about?
Jargon, technical phrases, definitions etc.
Smart speakers try to
understand users' intents by
analyzing their utterances.
Smart speakers are called smart because
they use Artificial Intelligence (AI) to do
so.
Do you feel like this at
the moment?
Let's explain
Intent vs. utterance
Intent
What does a user want to achieve?
Utterance
How does a user express that?
The same intent can have multiple utterances: "Book a cab", "Get me a taxi"
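One way to picture the relationship is as a lookup from utterances to intents. The sketch below is a minimal illustration, not how DigitalCX or any smart speaker actually resolves intents; the intent names (`book_taxi`, `order_pizza`) and the `matchIntent` helper are hypothetical.

```typescript
// Hypothetical intent names and example utterances, for illustration only.
type Intent = "book_taxi" | "order_pizza";

const utterances: Record<Intent, string[]> = {
  book_taxi: ["book a cab", "get me a taxi"],
  order_pizza: ["order a pizza"],
};

// Lowercase and strip punctuation so "Book a cab!" matches "book a cab".
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Return the intent whose utterance list contains the normalized text.
function matchIntent(text: string): Intent | undefined {
  const cleaned = normalize(text);
  for (const intent of Object.keys(utterances) as Intent[]) {
    if (utterances[intent].indexOf(cleaned) !== -1) {
      return intent;
    }
  }
  return undefined;
}
```

An exact-match lookup like this only recognizes utterances it has been given; the ML and NLP approaches on the following slides are ways of generalizing beyond such a list.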
Machine Learning vs.
Natural Language Processing
Machine Learning vs.
Natural Language Processing
Machine Learning
A computer tries to make decisions by discovering patterns in large amounts of data. It
will typically use neural networks or deep learning models to achieve that.
Requires lots of data and expensive computing power
Sounds really cool. Ma-chi-ne-Lear-ning, awesome!
Can potentially predict what a user wants to do
Can you
imagine?
"Hey Maarten, your favorite beer is on
discount now at your local supermarket. I
also noticed you're running out of stock.
Since your friends are coming over
tomorrow to watch the football you might
want to have some at home. Shall I go
ahead and place a new order?"
Machine Learning vs.
Natural Language Processing
Natural Language Processing/Understanding
A computer tries to make decisions based on patterns and topology. It will typically use
grammar analysis with lexicons, dictionaries, lemmatization, word weights etc. to achieve
that.
Requires a bit more work to maintain
Can be set up without lots of data. Early results possible
You don't need to train your phrases
Can be fine-grained, for example Mercedes SLK vs. car
Can distinguish grammar rules, for example: book a ship vs. ship a book
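The "book a ship" vs. "ship a book" example can be sketched with a tiny lexicon plus one word-order rule. This is a toy under stated assumptions, not the DigitalCX engine: the `lexicon` contents, the `Command` shape and the single "verb, determiner, noun" pattern are all made up for illustration.

```typescript
interface Command {
  action: string;
  object: string;
}

// Tiny hypothetical lexicon: "book" and "ship" can each be a verb or a noun.
const lexicon: Record<string, string[]> = {
  book: ["verb", "noun"],
  ship: ["verb", "noun"],
  a: ["determiner"],
};

// True when the lexicon lists the given role for the word.
function hasRole(word: string, role: string): boolean {
  if (!Object.prototype.hasOwnProperty.call(lexicon, word)) {
    return false;
  }
  const roles = lexicon[word];
  return roles !== undefined && roles.indexOf(role) !== -1;
}

// Apply one grammar rule: an imperative of the form VERB DETERMINER NOUN.
// Word order, not the words themselves, decides which word is the action.
function parseImperative(text: string): Command | undefined {
  const tokens = text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  const verb = tokens[0];
  const det = tokens[1];
  const noun = tokens[2];
  if (tokens.length !== 3 || verb === undefined || det === undefined || noun === undefined) {
    return undefined;
  }
  if (hasRole(verb, "verb") && hasRole(det, "determiner") && hasRole(noun, "noun")) {
    return { action: verb, object: noun };
  }
  return undefined;
}
```

Both sentences contain the same words, yet the rule yields different commands because position determines each word's role.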
Pyramid of Artificial
Intelligence.
We're currently
somewhere in between
automation and narrow
intelligence
AI technology is improving
rapidly, but we still have a way
to go to Super Intelligence
Some even say we're in an AI winter,
because the industry is currently
overpromising and underperforming
How is CX Company using AI?
We use business rules for the things we
know and AI for the things we do not know
and want to improve
How is CX Company using AI?
• Make life easier for content engineers
(GAP-analysis, pre-trained modules,
AutoAnswers, AutoDialogs)
→ Supervised Machine Learning
• Better understand what a client wants
(context, spellcheck, synonyms etc.)
→ NLP
Where can (must!) we
improve?
41% of users reported concerns
around digital assistants and
voice-enabled technology.
Source: Microsoft
#1
Privacy, baby!
What concerns do people have about digital
assistants?
Source: Microsoft Market Intelligence
This kind of news doesn't really help
Although not directly related to smart speakers
By the
way..
Chihuahua vs. Huawei
pronunciation?
Waaa-waaaaay
A growing majority
now views our online
privacy as a crisis
Source: Axios
The privacy threat is
a crisis, and we need
to force companies
to change.
vs.
Online services are
essential, and we all
have to accept
some risk.
Not everyone agrees
though
#2
Providing a frictionless experience
"A frictionless experience is what
builds consumer loyalty"
Sebastian Reeve, chatbot conference 2019
Excerpt taken from
Curious Rituals: A
Digital Tomorrow by
Near Future Laboratory
Also, integrating smart
speakers into existing
ecosystems like smart homes is
not frictionless. In fact, it can
be quite hard for non-techies.
"Voice search is
one of the most
often talked about,
but least
understood topics
confronting
businesses today."
The average VSR score
Uberall calculated was
44.12%, meaning the
majority of businesses
have not been optimized
to an acceptable
standard for consumer
voice search queries.
We'll see why this is
important later on
#3
Embrace screenager adoption
Some random facts
about screenagers
• 95 percent of teens
have a phone
• Teens spend 11
hours on their
devices each day
• Kids spend twice as
long playing on
screens than they do
outside
• 1 in 3 people would
rather give up sex
than their phone
Sources: Pew Research
Center, KFF, Market
Watch, Elephant Journal
So why should they give up
their smartphone for
something that is currently
less interactive, less
efficient and less personal?
What new challenges are
we facing?
#1
Search is moving from a place of answers
to a state of action
"The new Q&A is
Question & Action"
The way we interact with new Voice User
Interfaces forces us to rethink a consumer's
decision-making process.
"Conversational AI simplifies
computer usage like never
before as it flips the user
dynamic – instead of humans
learning computer code,
computers are learning our
language."
#2
A Voice User Interface is more of a blank canvas
than a Graphical User Interface
Like the rails put up along a bowling lane
to help the ball reach its target, we have
to provide handholds for a user to complete
a journey through voice.
We have to put more effort into
steering the conversation. (1/2)
By creating exceptionally good dialogs.
Conversational copywriting already plays a huge role in this. It
will eventually be - or already is - a job of its own!
For example by catering for non-scripted scenarios
"Alright, I found two accommodations: one is a hotel, the other a campsite. Which one
would you like?"
"The first one"
"The last one"
"Neither, I want an Airbnb"
"Definitely the hotel!"
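The four replies above can each be handled without scripting every literal sentence. The following is a minimal sketch, assuming a simple keyword approach; the `Resolution` type, the `resolveChoice` helper and the option labels are hypothetical and not part of any platform API.

```typescript
type Resolution =
  | { kind: "chosen"; option: string }
  | { kind: "rejected" }
  | { kind: "unclear" };

// Resolve a free-form reply against the options that were just offered.
function resolveChoice(reply: string, options: string[]): Resolution {
  const text = reply.toLowerCase();
  const first = options[0];
  const last = options[options.length - 1];

  // Check rejection first so "Neither, I want ..." never matches an option.
  if (/\b(neither|none)\b/.test(text)) {
    return { kind: "rejected" };
  }
  // Positional answers: "the first one", "the last one".
  if (/\bfirst\b/.test(text) && first !== undefined) {
    return { kind: "chosen", option: first };
  }
  if (/\blast\b/.test(text) && last !== undefined) {
    return { kind: "chosen", option: last };
  }
  // Answers that name the option directly: "Definitely the hotel!".
  for (const option of options) {
    if (text.indexOf(option.toLowerCase()) !== -1) {
      return { kind: "chosen", option };
    }
  }
  return { kind: "unclear" };
}
```

A `rejected` or `unclear` result is the cue for the dialog to offer alternatives or rephrase the question rather than repeat it verbatim.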
Or by managing expectations
"We're at question 3 of 10"
vs.
"Thanks for your patience. I only need 3 more answers and then we're done"
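The friendlier phrasing can be generated from the same counters that would produce "question 3 of 10". A small sketch, assuming the dialog tracks how many answers it has collected; `progressPrompt` is a hypothetical helper name.

```typescript
// Turn a progress counter into a prompt that stresses what is left,
// not how far along the questionnaire is.
function progressPrompt(answered: number, total: number): string {
  const remaining = total - answered;
  if (remaining <= 0) {
    return "Thanks, that's everything I need.";
  }
  if (remaining === 1) {
    return "Thanks for your patience. I only need 1 more answer and then we're done.";
  }
  return (
    "Thanks for your patience. I only need " +
    remaining +
    " more answers and then we're done."
  );
}
```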
We have to put more effort into
steering the conversation. (2/2)
Or by using the hardware's capabilities. Do you have a screen
available? Use it! Quick replies? Show them! Anything that helps
move the user forward in the dialog.
You can get those parameters from a request
Using the hardware's capability
"Take a look at the Nike shoes below I've found for you in size 8. Just tap one to get
more information"
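Branching on a device capability might look like the sketch below. The `DeviceRequest` shape with its `hasScreen` flag is an assumption made for illustration; each platform exposes capabilities in its own request format, so check the platform documentation for the real field names.

```typescript
// Hypothetical request shape; real platforms report capabilities differently.
interface DeviceRequest {
  hasScreen: boolean;
}

interface AssistantResponse {
  speech: string;
  cards: string[]; // Items to show on screen; empty for voice-only devices.
}

// Build a response that uses the screen when there is one and
// falls back to a spoken list when there is not.
function presentProducts(request: DeviceRequest, productNames: string[]): AssistantResponse {
  if (request.hasScreen) {
    return {
      speech: "Take a look at the shoes below. Just tap one to get more information.",
      cards: productNames,
    };
  }
  return {
    speech:
      "I found " +
      productNames.length +
      " pairs: " +
      productNames.join(", ") +
      ". Which one would you like to hear more about?",
    cards: [],
  };
}
```

The spoken fallback names the options explicitly, because a voice-only user has nothing to tap.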
#3
Your brand needs to be heard
Invocation vs. deep-linking
• Explicit invocation
"Hey Google, talk to Domino's pizza"
• Deep linking
Extension on the explicit invocation where the user already has an intent
"Hey Google, ask Domino's pizza about their opening hours"
"Hey Google, ask Domino's pizza about their opening hours this Friday"
• Implicit invocation
Google uses an "action directory" to look up which actions can fulfill this intent.
"Hey Google, order a pizza"
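The three invocation styles differ only in how much the user says up front. The toy classifier below illustrates that difference with regular expressions; it is not how Google resolves invocations, and the `Invocation` type and `classifyInvocation` helper are hypothetical.

```typescript
type Invocation =
  | { type: "explicit"; brand: string }
  | { type: "deep-link"; brand: string; request: string }
  | { type: "implicit"; request: string };

function classifyInvocation(utterance: string): Invocation {
  // Drop the wake phrase so only the command remains.
  const command = utterance.replace(/^hey google,?\s*/i, "").trim();

  // Deep link: the brand is named together with what the user wants.
  const deep = /^ask (.+?) about (.+)$/i.exec(command);
  if (deep !== null) {
    return { type: "deep-link", brand: deep[1] || "", request: deep[2] || "" };
  }
  // Explicit invocation: only the brand is named.
  const explicit = /^talk to (.+)$/i.exec(command);
  if (explicit !== null) {
    return { type: "explicit", brand: explicit[1] || "" };
  }
  // Implicit invocation: no brand at all, so the platform picks the action.
  return { type: "implicit", request: command };
}
```

Only in the implicit case does the platform, rather than the user, decide which brand answers.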
So..
There are two
important things
to keep in mind
1. First-mover
advantage because
of the invocation
2. There's something
like "SEO for Voice"
It brings back memories of the
early days of optimizing our
websites for search engines
(SEO).
No one really knew how it worked
From Google's developer documentation
Again Google's Algorithm be like..
Best-practices for getting
started
#1
Think voice-first
We're moving on from the mobile-first
into the voice-first era. Your brand should also have
a voice-search strategy
So avoid copying your
chatbot's conversation one-to-one
to voice.
I know it's very tempting, but it's far
more difficult to adjust screen content to
voice than the other way around.
But how do you get started
then?
Focus on the high-frequency, low-breadth
journey.
In other words: a journey that is triggered
often without too many steps to complete.
You do have analytics for this, right?
Or run a workshop.
At CX Company we can
run several kinds of
workshops, such as
kickoff or prototyping
workshops.
Let me know if you want
to know more about
this!
#2
Fail fast, fail often
"Failure is nothing
more than a chance
to revise your
strategy"
Leverage the power of
Design Thinking to get
your insights fast and
test them at an early
stage.
Set up a Minimum Viable
Product to enable more
experimentation across
teams
Set up lo-fi prototyping, for example:
• Wizard of Oz testing
Simulate the machine responding
• Alexa's one-breath test
If you need to take a breath in between responses, it's probably too long..
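Reading aloud remains the real test, but a rough word count can flag candidates early. The sketch below is only a proxy: the limit of 30 words is an arbitrary assumption, not a figure from Amazon's guidelines, and `passesOneBreathTest` is a hypothetical helper.

```typescript
// Assumed limit, purely illustrative: tune it by reading responses aloud.
const MAX_WORDS_PER_BREATH = 30;

// Count whitespace-separated words, ignoring leading and trailing spaces.
function countWords(text: string): number {
  return text.split(/\s+/).filter((w) => w.length > 0).length;
}

// Flag responses that are probably too long to say in one breath.
function passesOneBreathTest(
  response: string,
  maxWords: number = MAX_WORDS_PER_BREATH
): boolean {
  return countWords(response) <= maxWords;
}
```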
• Read the script out loud!
Release your inner Shakespeare. Really. Write down your script. Set aside some time
and start "playing" your script with two or more people.
Whatever you do..
ABC!
Always Be
Capturing
#3
Give your voice an identity
"We respond to voice technologies as we respond
to actual people and behave as we would in any
social situation"
The principle behind this is called the Cooperative Principle and can
be understood in terms of four rules: Maxim of Quality, Maxim of
Quantity, Maxim of Relevance and Maxim of Manner
• Maxim of Quality
Share the right quality of information, i.e. info you believe is true
• Maxim of Quantity
Share the right amount of information
• Maxim of Relevance
Share only relevant information
• Maxim of Manner
Share information as briefly and in as orderly a way as possible, avoiding obscurity and
ambiguity
For example, maxim of quantity:
"Hey Google, do you know the time?"
"Of course I do"
vs.
"Hey Google, do you know the time?"
"In The Hague it's currently twenty to twelve"
"Research
repeatedly shows
that men and
women prefer
female voices
when receiving
customer service"
"People attribute human characteristics to spoken
dialogue systems for reasons related to human
evolution."
Keep this in mind when creating a Persona around your voice identity.
A Persona will also make your assistant appear
more consistent, more likable and more reliable.
It should at least:
Define personality traits like friendly, helpful, witty, charming etc.
Follow a style guide (if one exists)
Be treated as a real employee of your customer service center
That's all folks!
Let's connect!
• maarten.dings@cxcompany.com
• linkedin.com/in/dingsmaarten/
