From Queries to Dialogues:
Predicting User Satisfaction with
Intelligent Assistants
Julia Kiseleva, Kyle Williams, Ahmed Hassan Awadallah,
Aidan C. Crook, Imed Zitouni, Tasos Anastasakos
Eindhoven University of Technology
Pennsylvania State University
Microsoft
Google at SIGIR 2016
It brings us new challenges
From Queries to Dialogues
Q1: how is the weather in Chicago
Q2: how is it this weekend
Q3: find me hotels
Q4: which one of these is the cheapest
Q5: which one of these has at least 4 stars
Q6: find me directions from the Chicago airport to number one
User’s dialogue with Cortana: Task is “Finding a hotel in Chicago”
From Queries to Dialogues
Q1: find me a pharmacy nearby
Q2: which of these is highly rated
Q3: show more information about number 2
Q4: how long will it take me to get there
Q5: Thanks
User’s dialogue with Cortana: Task is “Finding a pharmacy”
User: “show restaurants near me”
Cortana: “Here are ten restaurants near you”
User: “show the best ones”
Cortana: “Here are ten restaurants near you that have good reviews”
User: “show directions to the second one”
Cortana: “Getting you directions to the Mayuri Indian Cuisine”
From Queries to Dialogues
Main Research Question
How can we automatically predict user
satisfaction with search dialogues on
intelligent assistants using
click, touch, and voice interactions?
User: “Do I need to have a jacket tomorrow?”
Cortana: “You could probably go without one. The forecast shows …”
Single Task Search Dialogue
Multi-Task Search Dialogues
How to define user satisfaction
with search dialogues?
No Clicks
???
SAT? SAT? SAT?
Overall SAT?
User Frustration
Q1: what's the weather like in San Francisco
Q2: what's the weather like in Mountain View
Q3: can you find me a hotel close to Mountain View
Q4: can you show me the cheapest ones
Q5: show me the third one
Q6: show me the directions from SFO to this hotel
Q7: go back to first hotel (misrecognition)
Q8: show me hotels in Mountain View
Q9: show me cheap hotels in Mountain View
Q10: show me more about the third one
Dialog with Intelligent Assistant: Task is “Planning a weekend”
Restart search
A user is satisfied

What interaction signals can we
track during search dialogues?
Tracking User Interaction:
Click Signals
• Number of queries in a dialogue
• Number of clicks in a dialogue
• Number of SAT clicks (> 30 sec. dwell time) in a dialogue
• Number of DSAT clicks (< 15 sec. dwell time) in a dialogue
• Time (seconds) until the first click in a dialogue
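The click signals listed above can be computed from a per-dialogue click log. The following is a minimal sketch, not the authors' implementation: the `Click` record and its field names are hypothetical, and only the 30-second and 15-second dwell thresholds come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional

SAT_DWELL_SEC = 30.0   # slide: a SAT click has > 30 sec. dwell time
DSAT_DWELL_SEC = 15.0  # slide: a DSAT click has < 15 sec. dwell time


@dataclass
class Click:
    # Hypothetical log record; the field names are assumptions.
    time_sec: float   # seconds since the start of the dialogue
    dwell_sec: float  # seconds spent on the clicked result


def click_features(num_queries: int, clicks: List[Click]) -> dict:
    """Compute the five click signals for one dialogue."""
    first_click: Optional[float] = min(
        (c.time_sec for c in clicks), default=None
    )
    return {
        "num_queries": num_queries,
        "num_clicks": len(clicks),
        "num_sat_clicks": sum(c.dwell_sec > SAT_DWELL_SEC for c in clicks),
        "num_dsat_clicks": sum(c.dwell_sec < DSAT_DWELL_SEC for c in clicks),
        # None when the dialogue has no clicks at all.
        "time_to_first_click_sec": first_click,
    }
```

A dialogue without clicks yields zero counts and `None` for the time to first click, which is the case that motivates the touch and voice signals.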
Tracking User Interaction:
Acoustic Signals
Phonetic Similarity
between consecutive requests
Tracking User Interaction
[Figure: an answer visible for 3 seconds at 33% of the viewport height, 6 seconds at 66%, and 2 seconds at 20%]
Attributed time: 1s + 4s + 0.4s = 5.4s
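The slide's total of 5.4s is consistent with weighting each viewing interval by the fraction of the viewport that the answer occupies. A minimal sketch under that assumption follows; the function name and the input format are hypothetical, and exact thirds are used in place of the rounded 33% and 66%.

```python
from typing import Iterable, Tuple


def attributed_time(intervals: Iterable[Tuple[float, float]]) -> float:
    """Sum of seconds visible weighted by the viewport fraction occupied.

    Each interval is (seconds_visible, viewport_fraction), with the
    fraction expressed in the range 0..1.
    """
    return sum(seconds * fraction for seconds, fraction in intervals)


# Slide example: 3 s at one third, 6 s at two thirds, 2 s at 20%.
total = attributed_time([(3, 1 / 3), (6, 2 / 3), (2, 0.2)])
print(round(total, 1))  # 5.4
```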
Tracking User Interaction
• Number of Swipes
• Number of up-swipes
• Number of down-swipes
• Total distance swiped (pixels)
• Number of swipes normalized by time
• Total distance divided by number of swipes
• Total swiped distance divided by time
• Number of swipe direction changes
• Duration (seconds) for which a SERP answer is shown on screen (even partially)
• Fraction of visible pixels belonging to SERP answer
• Attributed time (seconds) to viewing a particular element (answer) on SERP
• Attributed time (seconds) per unit height (pixels) associated with a particular element on SERP
• Attributed time (milliseconds) per unit area (square pixels) associated with a particular element on SERP
Tracking User Interaction:
Touch Signals
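The swipe-based touch signals can be derived from an ordered list of swipe events. This is a sketch under stated assumptions rather than the authors' code: the `Swipe` record, the "up"/"down" encoding, and the choice to return 0.0 for empty or zero-duration input are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Swipe:
    # Hypothetical event record; the encoding is an assumption.
    direction: str      # "up" or "down"
    distance_px: float  # absolute distance swiped, in pixels


def swipe_features(swipes: List[Swipe], duration_sec: float) -> dict:
    """Compute the swipe signals for one SERP view, in time order."""
    n = len(swipes)
    total = sum(s.distance_px for s in swipes)
    # A direction change is counted between consecutive swipes.
    changes = sum(
        1 for a, b in zip(swipes, swipes[1:]) if a.direction != b.direction
    )
    has_time = duration_sec > 0
    return {
        "num_swipes": n,
        "num_up_swipes": sum(s.direction == "up" for s in swipes),
        "num_down_swipes": sum(s.direction == "down" for s in swipes),
        "total_distance_px": total,
        "swipes_per_sec": n / duration_sec if has_time else 0.0,
        "distance_per_swipe_px": total / n if n else 0.0,
        "distance_per_sec": total / duration_sec if has_time else 0.0,
        "num_direction_changes": changes,
    }
```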
How to collect data?
User Study Participants
• 60 Participants
• Age: 25.53 +/- 5.42 years
• Gender: 75% male, 25% female
• Language: 55% English, 45% other
• Education: 82% Computer Science, 8% Electrical Engineering, 2% Mathematics, 8% Other
You are planning a vacation. Pick a place. Check if the weather is good enough for the period you are planning the vacation. Find a hotel that suits you. Find the driving directions to this place.
Questionnaire
• Were you able to complete the task?
o Yes/No
• How satisfied are you with your experience in this task?
o If the task has sub-tasks, participants indicate their graded satisfaction with each, e.g.
o a. How satisfied are you with your experience in finding a hotel?
o b. How satisfied are you with your experience in finding directions?
• How well did Cortana recognize what you said?
o 5-point Likert scale
• Did you put in a lot of effort to complete the task?
o 5-point Likert scale
8 Tasks:
1 simple,
4 with 2 subtasks,
3 with 3 subtasks
~ 30 Minutes
Search Dialog Dataset
• Total number of queries is 2,040
• Number of unique queries is 1,969
• The average query length is 7.07
• The simple task generated 130 queries
• Tasks with 2 context switches generated 685 queries
• Tasks with 3 context switches generated 1,355 queries
How can we predict user
satisfaction
with search dialogues using
interaction signals?
User’s dialogue about the ‘stomach ache’
Q1: what do you have medicine for the stomach ache
Q2: stomach ache medicine over the counter
(General Web SERP)
Q3: show me the nearest pharmacy
Q4: more information on the second one
(Structured SERP)
General Web and Structured SERP
Aggregating Touch Interactions (I)
1. I( )
2. I( , )
3. I( ), I( )
Quality of Interaction Model
Method                 Accuracy (%)       Average F1 (%)
Baseline               70.62              61.38
Interaction Model 1    78.78* (+11.55)    83.59* (+35.90)
Interaction Model 2    80.21* (+13.58)    83.31* (+35.44)
Interaction Model 3    80.81* (+14.43)    79.08* (+28.83)
* Statistically significant improvement (p < 0.05)
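The parenthesized accuracy figures are consistent with the relative gain over the baseline, expressed in percent. A small sketch of that arithmetic follows; the helper name is hypothetical, and the check covers only the accuracy column as reported on the slide.

```python
def relative_gain(baseline: float, value: float) -> float:
    """Relative improvement of `value` over `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


BASELINE_ACCURACY = 70.62  # from the slide
for name, accuracy in [
    ("Interaction Model 1", 78.78),
    ("Interaction Model 2", 80.21),
    ("Interaction Model 3", 80.81),
]:
    print(name, round(relative_gain(BASELINE_ACCURACY, accuracy), 2))
# Interaction Model 1 11.55
# Interaction Model 2 13.58
# Interaction Model 3 14.43
```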
Which interaction signals have
the highest impact on predicting
user satisfaction with search
dialogues?
Predicting User Satisfaction
• F1: Because the SERP for a query is ordered by a measure of relevance as determined by the system, additional exploration is unlikely to achieve user satisfaction; it is more likely an indication that the best results provided (i.e. the top of the SERP) are insufficient to address the user intent
• F2: In the converse case of F1, when users find content that satisfies their intent, they are less likely to scroll, and they dwell for an extended period on the top viewport
• F3: When users are involved in a complex task, they are dissatisfied when redirected to a general web SERP. Unlike in F2, the absence of scrolling on this landing page is an indication of dissatisfaction
How can we define user satisfaction with search dialogues?
• We define user satisfaction with search dialogues in a generalized form: it is an aggregation of satisfaction with all of the dialogue’s tasks, not satisfaction with each of the dialogue’s queries separately
How can we predict user satisfaction with search dialogues using
interaction signals?
• We showed that features derived from voice and especially from touch and
voice interactions add a significant gain in accuracy over the baseline
How can we predict user satisfaction with search dialogues using
interaction signals?
• Our analysis showed a strong negative correlation between user satisfaction
and swipe actions
Conclusion
• We define user satisfaction with search dialogues in a generalized form: it is an aggregation of satisfaction with all of the dialogue’s tasks, not satisfaction with each of the dialogue’s queries separately
• We showed that features derived from voice and especially from touch and voice interactions add a significant gain in accuracy over the baseline
• Our analysis showed a strong negative correlation between user satisfaction and swipe actions
Thank you!
Questions?

More Related Content

What's hot

The Influence of Multimedia on Recommender System User's Perceptions of Syste...
The Influence of Multimedia on Recommender System User's Perceptions of Syste...The Influence of Multimedia on Recommender System User's Perceptions of Syste...
The Influence of Multimedia on Recommender System User's Perceptions of Syste...
Ashley Farrell
 
Target audience research
Target audience researchTarget audience research
Target audience researchnykelly
 
Power Of Advocacy
Power Of AdvocacyPower Of Advocacy
Power Of Advocacy
Ihab Hatoum
 
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage DataDescribing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
Mounia Lalmas-Roelleke
 
From “Selena Gomez” to “Marlon Brando”: Understanding Explorative Entity Search
From “Selena Gomez” to “Marlon Brando”: Understanding Explorative Entity SearchFrom “Selena Gomez” to “Marlon Brando”: Understanding Explorative Entity Search
From “Selena Gomez” to “Marlon Brando”: Understanding Explorative Entity Search
Mounia Lalmas-Roelleke
 
Social Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the usersSocial Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the users
Mounia Lalmas-Roelleke
 
Presentation2
Presentation2Presentation2
Presentation2
Joe Nash
 

What's hot (7)

The Influence of Multimedia on Recommender System User's Perceptions of Syste...
The Influence of Multimedia on Recommender System User's Perceptions of Syste...The Influence of Multimedia on Recommender System User's Perceptions of Syste...
The Influence of Multimedia on Recommender System User's Perceptions of Syste...
 
Target audience research
Target audience researchTarget audience research
Target audience research
 
Power Of Advocacy
Power Of AdvocacyPower Of Advocacy
Power Of Advocacy
 
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage DataDescribing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
 
From “Selena Gomez” to “Marlon Brando”: Understanding Explorative Entity Search
From “Selena Gomez” to “Marlon Brando”: Understanding Explorative Entity SearchFrom “Selena Gomez” to “Marlon Brando”: Understanding Explorative Entity Search
From “Selena Gomez” to “Marlon Brando”: Understanding Explorative Entity Search
 
Social Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the usersSocial Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the users
 
Presentation2
Presentation2Presentation2
Presentation2
 

Similar to From queries to dialogues

Understanding and Predicting User Satisfaction with Intelligent Assistants
Understanding and Predicting User Satisfaction with Intelligent AssistantsUnderstanding and Predicting User Satisfaction with Intelligent Assistants
Understanding and Predicting User Satisfaction with Intelligent Assistants
Julia Kiseleva
 
Understanding and Predicting User Satisfaction with Intelligent Assistants
Understanding and Predicting User Satisfaction with Intelligent AssistantsUnderstanding and Predicting User Satisfaction with Intelligent Assistants
Understanding and Predicting User Satisfaction with Intelligent Assistants
PromoVeTue
 
Understanding User Satisfaction with Intelligent Assistants
Understanding User Satisfaction with Intelligent AssistantsUnderstanding User Satisfaction with Intelligent Assistants
Understanding User Satisfaction with Intelligent Assistants
Julia Kiseleva
 
Detecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile SearchDetecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile Search
Julia Kiseleva
 
Q3) What have you learned from your audience feedback?
Q3) What have you learned from your audience feedback?Q3) What have you learned from your audience feedback?
Q3) What have you learned from your audience feedback?
Daniel Hunt
 
UI/UX Foundations - Research
UI/UX Foundations - ResearchUI/UX Foundations - Research
UI/UX Foundations - Research
Meg Kurdziolek
 
UX playbook: Real world user exercises
UX playbook: Real world user exercisesUX playbook: Real world user exercises
UX playbook: Real world user exercises
InVision App
 
Measuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kimMeasuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kimJin Young Kim
 
10 Steps to Mapping Your Customer Journey
10 Steps to Mapping Your Customer Journey10 Steps to Mapping Your Customer Journey
10 Steps to Mapping Your Customer Journey
Qualtrics
 
The Voice Search Revolution
The Voice Search RevolutionThe Voice Search Revolution
The Voice Search Revolution
Advice Interactive Group
 
Fmp research
Fmp research Fmp research
Fmp research
WilliamAnderson165
 
What do my customers really want
What do my customers really wantWhat do my customers really want
What do my customers really wantThe URL Dr.
 
Evaluation Question 6 pp
Evaluation Question 6 ppEvaluation Question 6 pp
Evaluation Question 6 pp
chloeyearsleymedia
 
10 tips for a better survey at STC2011
10 tips for a better survey at STC201110 tips for a better survey at STC2011
10 tips for a better survey at STC2011
Caroline Jarrett
 
Fake Your Research - UX Masterclass
Fake Your Research - UX MasterclassFake Your Research - UX Masterclass
Fake Your Research - UX Masterclass
Sherpas
 
Fake Your Research - UX Masterclass
Fake Your Research - UX MasterclassFake Your Research - UX Masterclass
Fake Your Research - UX Masterclass
ExperienceU
 
Evaluating the search experience: from Retrieval Effectiveness to User Engage...
Evaluating the search experience: from Retrieval Effectiveness to User Engage...Evaluating the search experience: from Retrieval Effectiveness to User Engage...
Evaluating the search experience: from Retrieval Effectiveness to User Engage...
Mounia Lalmas-Roelleke
 
Survey knowledge
Survey knowledgeSurvey knowledge
Survey knowledge
Tu Tran
 
Better UX Surveys part 1
Better UX Surveys part 1Better UX Surveys part 1
Better UX Surveys part 1
Caroline Jarrett
 

Similar to From queries to dialogues (20)

Understanding and Predicting User Satisfaction with Intelligent Assistants
Understanding and Predicting User Satisfaction with Intelligent AssistantsUnderstanding and Predicting User Satisfaction with Intelligent Assistants
Understanding and Predicting User Satisfaction with Intelligent Assistants
 
Understanding and Predicting User Satisfaction with Intelligent Assistants
Understanding and Predicting User Satisfaction with Intelligent AssistantsUnderstanding and Predicting User Satisfaction with Intelligent Assistants
Understanding and Predicting User Satisfaction with Intelligent Assistants
 
Understanding User Satisfaction with Intelligent Assistants
Understanding User Satisfaction with Intelligent AssistantsUnderstanding User Satisfaction with Intelligent Assistants
Understanding User Satisfaction with Intelligent Assistants
 
Detecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile SearchDetecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile Search
 
Q3) What have you learned from your audience feedback?
Q3) What have you learned from your audience feedback?Q3) What have you learned from your audience feedback?
Q3) What have you learned from your audience feedback?
 
UI/UX Foundations - Research
UI/UX Foundations - ResearchUI/UX Foundations - Research
UI/UX Foundations - Research
 
UX playbook: Real world user exercises
UX playbook: Real world user exercisesUX playbook: Real world user exercises
UX playbook: Real world user exercises
 
Measuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kimMeasuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kim
 
10 Steps to Mapping Your Customer Journey
10 Steps to Mapping Your Customer Journey10 Steps to Mapping Your Customer Journey
10 Steps to Mapping Your Customer Journey
 
The Voice Search Revolution
The Voice Search RevolutionThe Voice Search Revolution
The Voice Search Revolution
 
Fmp research
Fmp research Fmp research
Fmp research
 
What do my customers really want
What do my customers really wantWhat do my customers really want
What do my customers really want
 
Evaluation Question 6 pp
Evaluation Question 6 ppEvaluation Question 6 pp
Evaluation Question 6 pp
 
10 tips for a better survey at STC2011
10 tips for a better survey at STC201110 tips for a better survey at STC2011
10 tips for a better survey at STC2011
 
Fake Your Research - UX Masterclass
Fake Your Research - UX MasterclassFake Your Research - UX Masterclass
Fake Your Research - UX Masterclass
 
Fake Your Research - UX Masterclass
Fake Your Research - UX MasterclassFake Your Research - UX Masterclass
Fake Your Research - UX Masterclass
 
HCI - online surveys
HCI - online surveysHCI - online surveys
HCI - online surveys
 
Evaluating the search experience: from Retrieval Effectiveness to User Engage...
Evaluating the search experience: from Retrieval Effectiveness to User Engage...Evaluating the search experience: from Retrieval Effectiveness to User Engage...
Evaluating the search experience: from Retrieval Effectiveness to User Engage...
 
Survey knowledge
Survey knowledgeSurvey knowledge
Survey knowledge
 
Better UX Surveys part 1
Better UX Surveys part 1Better UX Surveys part 1
Better UX Surveys part 1
 

More from Julia Kiseleva

Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and Ho...
Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and Ho...Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and Ho...
Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and Ho...
Julia Kiseleva
 
Using Contextual Information to Understand Searching and Browsing Behavior
Using Contextual Information to Understand Searching and Browsing BehaviorUsing Contextual Information to Understand Searching and Browsing Behavior
Using Contextual Information to Understand Searching and Browsing Behavior
Julia Kiseleva
 
Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User P...
Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User P...Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User P...
Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User P...
Julia Kiseleva
 
Modelling and Detecting Changes in User Satisfaction
Modelling and Detecting Changes in User SatisfactionModelling and Detecting Changes in User Satisfaction
Modelling and Detecting Changes in User Satisfaction
Julia Kiseleva
 
Predicting Current User Intent with Contextual Markov Models
Predicting Current User Intent with Contextual Markov ModelsPredicting Current User Intent with Contextual Markov Models
Predicting Current User Intent with Contextual Markov Models
Julia Kiseleva
 
The talk at Twente University on 28 July 2014
The talk at Twente University on 28 July 2014 The talk at Twente University on 28 July 2014
The talk at Twente University on 28 July 2014
Julia Kiseleva
 
Discovering Temporal Hidden Contexts in Web Sessions for User Trail Prediction
Discovering Temporal Hidden Contexts in Web Sessions for User Trail PredictionDiscovering Temporal Hidden Contexts in Web Sessions for User Trail Prediction
Discovering Temporal Hidden Contexts in Web Sessions for User Trail PredictionJulia Kiseleva
 
Context Mining and Integration in Web Predictive Analytics
Context Mining and Integration in Web Predictive AnalyticsContext Mining and Integration in Web Predictive Analytics
Context Mining and Integration in Web Predictive AnalyticsJulia Kiseleva
 

More from Julia Kiseleva (8)

Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and Ho...
Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and Ho...Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and Ho...
Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and Ho...
 
Using Contextual Information to Understand Searching and Browsing Behavior
Using Contextual Information to Understand Searching and Browsing BehaviorUsing Contextual Information to Understand Searching and Browsing Behavior
Using Contextual Information to Understand Searching and Browsing Behavior
 
Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User P...
Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User P...Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User P...
Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User P...
 
Modelling and Detecting Changes in User Satisfaction
Modelling and Detecting Changes in User SatisfactionModelling and Detecting Changes in User Satisfaction
Modelling and Detecting Changes in User Satisfaction
 
Predicting Current User Intent with Contextual Markov Models
Predicting Current User Intent with Contextual Markov ModelsPredicting Current User Intent with Contextual Markov Models
Predicting Current User Intent with Contextual Markov Models
 
The talk at Twente University on 28 July 2014
The talk at Twente University on 28 July 2014 The talk at Twente University on 28 July 2014
The talk at Twente University on 28 July 2014
 
Discovering Temporal Hidden Contexts in Web Sessions for User Trail Prediction
Discovering Temporal Hidden Contexts in Web Sessions for User Trail PredictionDiscovering Temporal Hidden Contexts in Web Sessions for User Trail Prediction
Discovering Temporal Hidden Contexts in Web Sessions for User Trail Prediction
 
Context Mining and Integration in Web Predictive Analytics
Context Mining and Integration in Web Predictive AnalyticsContext Mining and Integration in Web Predictive Analytics
Context Mining and Integration in Web Predictive Analytics
 

Recently uploaded

BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
natyesu
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
Gal Baras
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC
 
Comptia N+ Standard Networking lesson guide
Comptia N+ Standard Networking lesson guideComptia N+ Standard Networking lesson guide
Comptia N+ Standard Networking lesson guide
GTProductions1
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Sanjeev Rampal
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
laozhuseo02
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
Rogerio Filho
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
keoku
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Brad Spiegel Macon GA
 
Internet-Security-Safeguarding-Your-Digital-World (1).pptx
Internet-Security-Safeguarding-Your-Digital-World (1).pptxInternet-Security-Safeguarding-Your-Digital-World (1).pptx
Internet-Security-Safeguarding-Your-Digital-World (1).pptx
VivekSinghShekhawat2
 
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
Javier Lasa
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
nirahealhty
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
laozhuseo02
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
3ipehhoa
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
JeyaPerumal1
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
ufdana
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
JungkooksNonexistent
 

Recently uploaded (20)

BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
From queries to dialogues

  • 1. From Queries to Dialogues: Predicting User Satisfaction with Intelligent Assistants. Julia Kiseleva, Kyle Williams, Ahmed Hassan Awadallah, Aidan C. Crook, Imed Zitouni, Tasos Anastasakos. Eindhoven University of Technology; Pennsylvania State University; Microsoft
  • 6. It brings us new challenges Google at SIGIR 2016
  • 8. From Queries to Dialogues. User’s dialogue with Cortana; task is “Finding a hotel in Chicago”. Q1: how is the weather in Chicago; Q2: how is it this weekend; Q3: find me hotels; Q4: which one of these is the cheapest; Q5: which one of these has at least 4 stars; Q6: find me directions from the Chicago airport to number one
  • 9. From Queries to Dialogues. User’s dialogue with Cortana; task is “Finding a pharmacy”. Q1: find me a pharmacy nearby; Q2: which of these is highly rated; Q3: show more information about number 2; Q4: how long will it take me to get there; Q5: Thanks
  • 10. From Queries to Dialogues. User: “show restaurants near me” Cortana: “Here are ten restaurants near you” User: “show the best ones” Cortana: “Here are ten restaurants near you that have good reviews” User: “show directions to the second one” Cortana: “Getting you direction to the Mayuri Indian Cuisine”
  • 11. Main Research Question How can we automatically predict user satisfaction with search dialogues on intelligent assistants using click, touch, and voice interactions?
  • 12. Single Task Search Dialogue. User: “Do I need to have a jacket tomorrow?” Cortana: “You could probably go without one. The forecast shows …”
  • 13. Multi-Task Search Dialogues. User: “show restaurants near me” Cortana: “Here are ten restaurants near you” User: “show the best ones” Cortana: “Here are ten restaurants near you that have good reviews” User: “show directions to the second one” Cortana: “Getting you direction to the Mayuri Indian Cuisine”
  • 14. How to define user satisfaction with search dialogues?
  • 15. No Clicks ??? User: “show restaurants near me” Cortana: “Here are ten restaurants near you” User: “show the best ones” Cortana: “Here are ten restaurants near you that have good reviews” User: “show directions to the second one” Cortana: “Getting you direction to the Mayuri Indian Cuisine”
  • 16. User: “show restaurants near me” (SAT?) Cortana: “Here are ten restaurants near you” (SAT?) User: “show the best ones” (SAT?) Cortana: “Here are ten restaurants near you that have good reviews” (SAT?) User: “show directions to the second one” (SAT?) Cortana: “Getting you direction to the Mayuri Indian Cuisine” (SAT?) Overall SAT?
  • 17. User Frustration. Dialog with Intelligent Assistant; task is “Planning a weekend”. Q1: what's the weather like in San Francisco Q2: what's the weather like in Mountain View Q3: can you find me a hotel close to Mountain View Q4: can you show me the cheapest ones Q5: show me the third one Q6: show me the directions from SFO to this hotel Q6: show me the directions from SFO to this hotel Q7: go back to first hotel (misrecognition) Q8: show me hotels in Mountain View Q9: show me cheap hotels in Mountain View Q10: show me more about the third one. Slide annotations: “Restart search”; “A user is satisfied”
  • 18. What interaction signals can we track during search dialogues?
  • 19. Tracking User Interaction: Click Signals • Number of queries in a dialogue • Number of clicks in a dialogue • Number of SAT clicks (> 30 sec. dwell time) in a dialogue • Number of DSAT clicks (< 15 sec. dwell time) in a dialogue • Time (seconds) until the first click in a dialogue
  • 20. Tracking User Interaction: Acoustic Signals • Phonetic similarity between consecutive requests
  • 22. Tracking User Interaction [figure: attributed viewing time, measured against ViewPortHeight]. 3 seconds at 33% of the viewport, 6 seconds at 66% of the viewport, 2 seconds at 20% of the viewport: 1 s + 4 s + 0.4 s = 5.4 s
  • 23. Tracking User Interaction: Touch Signals • Number of swipes • Number of up-swipes • Number of down-swipes • Total distance swiped (pixels) • Number of swipes normalized by time • Total distance divided by number of swipes • Total swiped distance divided by time • Number of swipe direction changes • Duration (seconds) for which a SERP answer is shown on screen (even partially) • Fraction of visible pixels belonging to a SERP answer • Attributed time (seconds) to viewing a particular element (answer) on the SERP • Attributed time (seconds) per unit height (pixels) associated with a particular element on the SERP • Attributed time (milliseconds) per unit area (square pixels) associated with a particular element on the SERP
  • 24. How to collect data?
  • 25. User Study Participants • 60 participants • Age: 25.53 +/- 5.42 years • Gender: 75% male, 25% female • Language: 55% English, 45% other • Education: 82% Computer Science, 8% Electrical Engineering, 2% Mathematics, 8% other
  • 26. You are planning a vacation. Pick a place. Check if the weather is good enough for the period you are planning the vacation. Find a hotel that suits you. Find the driving directions to this place.
  • 28-29. Questionnaire (8 tasks: 1 simple, 4 with 2 subtasks, 3 with 3 subtasks; ~30 minutes) • Were you able to complete the task? (Yes/No) • How satisfied are you with your experience in this task? If the task has sub-tasks, participants indicate their graded satisfaction with each, e.g. (a) How satisfied are you with your experience in finding a hotel? (b) How satisfied are you with your experience in finding directions? • How well did Cortana recognize what you said? (5-point Likert scale) • Did you put in a lot of effort to complete the task? (5-point Likert scale)
  • 30-31. Search Dialog Dataset • Total number of queries: 2,040 • Number of unique queries: 1,969 • Average query length: 7.07 • The simple task generated 130 queries • Tasks with 2 context switches generated 685 queries • Tasks with 3 context switches generated 1,355 queries
  • 32. How can we predict user satisfaction with search dialogues using interaction signals?
  • 33-34. User’s dialogue about the ‘stomach ache’. Q1: what do you have medicine for the stomach ache; Q2: stomach ache medicine over the counter (General Web SERP). Q3: show me the nearest pharmacy; Q4: more information on the second one (Structured SERP)
  • 35. General Web and Structured SERP
  • 38-39. Aggregating Touch Interactions (I) [figure: three aggregation variants whose arguments are shown as images on the slides]: 1. I( ) 2. I( , ) 3. I( ), I( )
  • 40. Quality of Interaction Model
      Method                 Accuracy (%)        Average F1 (%)
      Baseline               70.62               61.38
      Interaction Model 1    78.78* (+11.55)     83.59* (+35.90)
      Interaction Model 2    80.21* (+13.58)     83.31* (+35.44)
      Interaction Model 3    80.81* (+14.43)     79.08* (+28.83)
      * Statistically significant improvement (p < 0.05)
  • 41. Which interaction signals have the highest impact on predicting user satisfaction with search dialogues?
  • 42-44. Predicting User Satisfaction • F1: Because the SERP for a query is ordered by a measure of relevance as determined by the system, additional exploration is unlikely to achieve user satisfaction; it is more likely an indication that the best-provided results (i.e. the top of the SERP) are insufficient to address the user intent • F2: In the converse case of F1, when users find content that satisfies their intent, their likelihood of scrolling is reduced, and they dwell for an extended period on the top viewport • F3: When users are involved in a complex task, they are dissatisfied when redirected to a general web SERP. Unlike in F2, the absence of scrolling on this landing page is an indication of dissatisfaction
  • 45. Conclusion. How can we define user satisfaction with search dialogues? • User satisfaction with search dialogues is defined in a generalized form: as an aggregation of satisfaction with all of the dialogue’s tasks, not as satisfaction with each of the dialogue’s queries separately. How can we predict user satisfaction with search dialogues using interaction signals? • We showed that features derived from voice and especially from touch and voice interactions add a significant gain in accuracy over the baseline. Which interaction signals have the highest impact on predicting user satisfaction with search dialogues? • Our analysis showed a strong negative correlation between user satisfaction and swipe actions
  • 46. Thank you! Questions? • User satisfaction with search dialogues is defined in a generalized form: as an aggregation of satisfaction with all of the dialogue’s tasks, not as satisfaction with each of the dialogue’s queries separately • We showed that features derived from voice and especially from touch and voice interactions add a significant gain in accuracy over the baseline • Our analysis showed a strong negative correlation between user satisfaction and swipe actions
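
The click signals on slide 19 and the attributed-time figure on slide 22 can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `Click` record, its field names, and the function names are hypothetical, and the attributed-time rule (seconds on screen weighted by the fraction of the viewport occupied) is one reading of the slide's arithmetic, 1 s + 4 s + 0.4 s = 5.4 s.

```python
from dataclasses import dataclass

# Dwell-time thresholds taken from slide 19.
SAT_DWELL_SEC = 30.0   # a SAT click has more than 30 s of dwell time
DSAT_DWELL_SEC = 15.0  # a DSAT click has less than 15 s of dwell time


@dataclass
class Click:
    """Hypothetical click record; the field names are assumptions."""
    time_from_start: float  # seconds since the dialogue began
    dwell: float            # seconds spent on the clicked result


def click_features(num_queries, clicks):
    """Compute the five dialogue-level click signals listed on slide 19."""
    return {
        "num_queries": num_queries,
        "num_clicks": len(clicks),
        "num_sat_clicks": sum(c.dwell > SAT_DWELL_SEC for c in clicks),
        "num_dsat_clicks": sum(c.dwell < DSAT_DWELL_SEC for c in clicks),
        # None when the dialogue has no clicks at all.
        "time_to_first_click": min(
            (c.time_from_start for c in clicks), default=None
        ),
    }


def attributed_time(exposures):
    """Sum seconds on screen weighted by the fraction of the viewport covered.

    `exposures` is a list of (seconds_visible, viewport_fraction) pairs.
    The weighting rule is an assumption inferred from slide 22.
    """
    return sum(seconds * fraction for seconds, fraction in exposures)


if __name__ == "__main__":
    clicks = [Click(4.0, 45.0), Click(20.0, 5.0), Click(30.0, 20.0)]
    print(click_features(3, clicks))
    # Slide 22 example: 3 s at 33%, 6 s at 66%, 2 s at 20% of the viewport.
    print(attributed_time([(3, 1 / 3), (6, 2 / 3), (2, 0.2)]))
```

A dialogue with no clicks at all (the "No Clicks ???" case on slide 15) yields zero counts and no time-to-first-click, which is why touch and viewport signals such as the attributed time are useful there.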

Editor's Notes

  1. We use acoustic features to characterize the voice interaction that happens in search dialogues. More specifically, we use the phonetic similarity between consecutive requests to identify patterns of repetition. The Metaphone representation [39] indexes words by their pronunciation, which lets us represent words by how they are pronounced rather than how they are written.
  2. Consider movie recommendation
  3. Conclusion from presentation
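
The phonetic-similarity signal from note 1 might be prototyped as below. This is a rough sketch under stated assumptions: the phonetic key is a crude Soundex-style consonant coding used as a stand-in, not the Metaphone algorithm the note cites, and the similarity measure (a `difflib.SequenceMatcher` ratio over the joined keys) is likewise an assumption. The function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Consonant classes borrowed from Soundex. This is a stand-in for Metaphone,
# not an implementation of it.
_CODES = {}
for _letters, _digit in (
    ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
    ("l", "4"), ("mn", "5"), ("r", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def phonetic_key(word):
    """First letter plus consonant-class digits; vowels dropped, runs collapsed."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return ""
    key = word[0]
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code
    return key


def phonetic_similarity(request_a, request_b):
    """Similarity in [0, 1] between the phonetic keys of two requests."""
    keys_a = " ".join(k for k in map(phonetic_key, request_a.split()) if k)
    keys_b = " ".join(k for k in map(phonetic_key, request_b.split()) if k)
    return SequenceMatcher(None, keys_a, keys_b).ratio()


if __name__ == "__main__":
    repeated = phonetic_similarity("show me cheap hotels", "show me cheep hotel")
    unrelated = phonetic_similarity("what's the weather like", "find me a hotel")
    print(repeated, unrelated)
```

A high score between consecutive requests suggests the user is repeating themselves, for example after a misrecognition, which is the repetition pattern the note describes.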