Heidi Young - The Future of Search: How Measuring Satisfaction Will Enhance Our Personal AIs and Our Lives - Seattle Interactive 2016

The Future of Search:
How Measuring Satisfaction Will
Enhance Our Personal AIs and
Our Lives
Heidi Young
VP of Engineering
Ozlo

Who am I?
Search Junkie, Data Scientist,
Engineer
Currently building Ozlo!!!

What is Ozlo?
Next generation assistant
Ozlo is leveraging artificial intelligence, machine
learning and natural language processing to
power the next generation of search
Ozlo is in the early stages of learning to
understand a wide range of human goals and
activities, and the words and ideas that connect
those things to help users find what they
actually need

AI Assistant and Chatbot Landscape
Siri
Alexa Skills Store
Bot Store
Skype Bot Store
Assistants
Platforms for exposing chatbots
Building a chatbot or assistant

https://twitter.com/ashevat/status/786690547733889024/photo/1

https://twitter.com/davidjbland/status/725119174368976897

Why all the hype then?
We’ve moved to mobile where
messaging is the natural
method of communication
We’re moving to connected
smart devices and expect our
interactions to be natural to our
surroundings

Why all the hype then?
There’s a good chunk of
information seeking tasks
that search engines don’t
handle well in their
current form
Say wha?
And they aren’t the really
hard ones that you’re
thinking of
(i.e. research travel, buy a
house)

Why is conversational a better experience?
It isn’t for a lot of things
Alexa, buy me some pants
I can’t buy pants. So I’ve added it
to your shopping list.
😒
I want to order a pizza
Great! What kind of
toppings would you like?
Pepperoni and sausage
with extra cheese
And what kind of crust?
Thin crust
What size pizza would you
like?
…
😒
On average, 73 taps with a
conversational UI vs. 16 taps
with a conventional filtering UI

Why is conversational a better experience?
Rich, robust filtering
Highly visual experience
A lot of variety
It isn’t for a lot of things

Answer? The most natural interaction
The bar should be:
What kind of response would you
expect from a really
knowledgeable friend?
Are there any good movies playing?
Here’s some:
…
Anything more kid friendly?
How about these?
…
Which of these is playing around 9pm?
This is the only one playing close to
9pm, near you
…
Great! Can you get me a ticket?
Here’s a link to buy it on Fandango

Information Task Modes
Remember
• Simple Facts
• Simple 1-2
sentence answers
• Clean, cut, dried
Understand
• Obtaining
knowledge from a
multitude of
sources
• Constructing
meaning from
different content
sources
Analyze
• Breaking material
into constituent
parts
• Determine
relationships
• Make decisions
https://www.microsoft.com/en-us/research/wp-content/uploads/2015/08/fp286-bailey.pdf

Information Task Modes
In typical web search tasks, users have expectations for the number of queries they’ll issue
and documents they’ll review
How many queries they expect to issue
How many documents they expect to review

Back to that hype thing…
Chatbots and AI of today are primarily focused on stuff that’s pretty easy to get with an
existing app or search engine
X X X X
But our expectation is that they can do these

Understand or Analyze Type of Task
What’s a good place to watch the game nearby?
Point of interest that is rated highly or is popular or is known for this type of task
Implies sports bar or point of interest that has a television with sports typically available
Close to your current location
Depending on where you’re located, could mean within walking distance or could mean 20 mins driving distance, depending on density of POIs and sparsity of available content
VERY IMPORTANT!!!
There is not ONE right answer to this question
It is a subjective question. Depending on your content sources, results can vary widely.
It requires a lot of synthesis across multiple sources, and likely presenting multiple
sources, not a definitive answer.
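One way to make "nearby" depend on POI density, as described above, is to widen the search radius until enough candidates turn up. The sketch below is only an illustration: the function names, the radius steps, and the minimum result count are assumptions, not anything stated in the talk.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearby(user, pois, min_results=3, radii_km=(1.0, 5.0, 25.0)):
    """Return (radius, matches) for the smallest radius with enough POIs.

    `user` is a (lat, lon) tuple; each POI is a dict with "lat" and "lon".
    The radii and min_results are illustrative values (assumptions).
    """
    matches = []
    for radius in radii_km:
        matches = [
            p for p in pois
            if haversine_km(user[0], user[1], p["lat"], p["lon"]) <= radius
        ]
        if len(matches) >= min_results:
            return radius, matches
    # Sparse area: fall back to whatever the widest radius found.
    return radii_km[-1], matches
```

In a dense neighborhood this stops at the walking-distance radius; in a sparse one it keeps widening to a driving-distance radius before giving up.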

What you really want
Place A:
Great sports
bar nearby
Place B:
Romantic
restaurant
nearby
Place C:
Coffeeshop
nearby
X
X
Place A:
Great sports
bar nearby
Place D:
Restaurant
known for
sports and tvs

Some existing experiences
Alexa
Google Assistant via Allo

What might a good experience look like?
Present evidence as to why those are
good options
Present multiple options, but not so
many that it’s overwhelming
Establish that you were heard and that
the assistant understood what you actually
meant (e.g. sports bars, nearby)
Offer most likely refinements and
follow on prompts

Successful Measurement of
Conversational UIs

To measure, we must understand
The National Communication Association
publishes a rating scale to assess
conversational skills in interpersonal settings
Scale from 1 to 5:
• 1 = Inadequate: awkward, disruptive, leaving a negative impression
• 5 = Excellent: smooth, controlled, leaving a positive impression
Skills rated:
• Attentiveness: attention to, concern for conversational partner
• Composure: confidence, assertiveness
• Expressiveness: articulation, animation, variation
• Coordination: non-disruptive negotiation of speaking turns

What do REAL messaging conversations look like?
New vs Continuing
Conversations
Identifying satisfaction
of each sub-conversation

How we think about things at Ozlo
Negative conversations
Bottom Line: How did the conversation end?
Negative indicators,
implicit AND explicit
We:
1. Identify conversation boundaries
2. Assign positive or negative assessment of
each interaction
3. Mark as negative if it “ended” negatively
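The three steps above can be sketched roughly as follows. This is a minimal illustration, not Ozlo's implementation: the `Interaction` type, the 30-minute gap used to detect conversation boundaries, and the idea that a positive/negative assessment has already been assigned upstream are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Interaction:
    timestamp: datetime
    positive: bool  # step 2: assessment assumed to be assigned upstream

def split_conversations(interactions, gap=timedelta(minutes=30)):
    """Step 1: start a new conversation whenever the pause exceeds `gap`.

    The 30-minute default is a hypothetical threshold.
    """
    conversations = []
    for item in sorted(interactions, key=lambda i: i.timestamp):
        if conversations and item.timestamp - conversations[-1][-1].timestamp <= gap:
            conversations[-1].append(item)
        else:
            conversations.append([item])
    return conversations

def ended_negatively(conversation):
    """Step 3: a conversation counts as negative if its final interaction is."""
    return not conversation[-1].positive
```

Note that a conversation with a rough start but a positive final interaction is not marked negative here; only the ending counts.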

What’s a negative ending conversation?
Conversations that contain one
of the following in the last N
messages in the interaction:
1. Explicit negative feedback
2. Highly latent
3. Not well understood
4. No follow on
VS
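A minimal sketch of checking the last N messages for the four signals listed above. The `Message` fields, the confidence cutoff, and the default N are hypothetical; the 1-second latency cutoff follows the ">1 second" signal given in the talk.

```python
from dataclasses import dataclass

@dataclass
class Message:
    thumbs_down: bool = False     # explicit negative feedback
    latency_s: float = 0.0        # time the assistant took to respond
    confidence: float = 1.0       # understanding confidence in [0, 1]
    follow_on_shown: bool = True  # were follow-on prompts displayed?

def negative_signals(msg, max_latency_s=1.0, min_confidence=0.5):
    """Return the names of the negative signals present on one message."""
    signals = []
    if msg.thumbs_down:
        signals.append("explicit_negative_feedback")
    if msg.latency_s > max_latency_s:
        signals.append("highly_latent")
    if msg.confidence < min_confidence:  # cutoff is an assumption
        signals.append("not_well_understood")
    if not msg.follow_on_shown:
        signals.append("no_follow_on")
    return signals

def is_negative_ending(messages, n=2):
    """True if any of the last n messages (n >= 1) carries a negative signal."""
    return any(negative_signals(m) for m in messages[-n:])
```

A thumbs-down early in a long exchange does not count unless it falls inside the final N messages, which matches the focus on how the conversation ended.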

What’s a negative ending conversation?
Negative ending, its specific signal, and the NCA ratings it roughly maps to:
Explicit negative feedback
• Specific signal: Thumbs down
• Roughly maps to: Composure (i.e. Didn’t understand, Results could be better); Attentiveness (i.e. Oddly worded response, Didn’t understand); Expressiveness (i.e. Oddly worded response)
Highly latent
• Specific signal: >1 second
• Roughly maps to: Coordination (i.e. Controlling the flow of conversation, “Never leave me hanging”)
Not well understood
• Specific signal: Didn’t understand, low confidence scores
• Roughly maps to: Composure; Expressiveness
No follow on
• Specific signal: Lack of prompts displayed, lack of engagement for non QnA questions
• Roughly maps to: Coordination; Attentiveness

Why this over DAUs?
It’s not one over the other
DAUs/MAUs are lagging indicators
We must optimize for in-the-moment
interactions
Measuring negatively ending conversations
allows us to react in the moment, and to
aggregate and set targets
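As a rough illustration of the "aggregate and set targets" idea, the sketch below computes a daily rate of negatively ending conversations and flags days that exceed a target. The input shape, function names, and the 20% target are assumptions made for the example.

```python
from collections import defaultdict

def negative_ending_rate_by_day(conversations):
    """conversations: iterable of (day, ended_negatively) pairs."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for day, negative in conversations:
        totals[day] += 1
        negatives[day] += int(negative)
    return {day: negatives[day] / totals[day] for day in totals}

def days_over_target(rates, target=0.2):
    """Days whose negative-ending rate exceeds the (hypothetical) target."""
    return sorted(day for day, rate in rates.items() if rate > target)
```

Unlike DAUs or MAUs, a rate like this can be recomputed as soon as conversations close, so a regression shows up the same day.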

Will this result in better AI experiences?
Still early
This is how we learn and reinforce good behavior
Once we successfully measure, we can optimize
