Talk given at RecSys 2017 by Max Harper: The technical barriers for conversing with recommender systems using natural language are vanishing. Already, there are commercial systems that facilitate interactions with an AI agent. For instance, it is possible to say “what should I watch” to an Apple TV remote to get recommendations. In this research, we investigate how users initially interact with a new natural language recommender to deepen our understanding of the range of inputs that these technologies can expect. We deploy a natural language interface to a recommender system, we observe users’ first interactions and follow-up queries, and we measure the differences between speaking- and typing-based interfaces. We employ qualitative methods to derive a categorization of users’ first queries (objective, subjective, and navigation) and follow-up queries (refine, reformulate, start over). We employ quantitative methods to determine the differences between speech and text, finding that speech inputs are typically longer and more conversational.
RecSys 2017 -- Understanding How People Use Natural Language to Ask for Recommendations
1. Understanding How People Use Natural Language to Ask for Recommendations
Jie Kang*, Kyle Condiff*, Shuo Chang**, Joseph A. Konstan, Loren Terveen, Max Harper (presenter)
GroupLens Center for Social and Human-Centered Computing
University of Minnesota
* Now with Facebook
** Now with Quora
2. Overview of this Talk: Natural Language Recommenders
» motivation: why natural language recommenders are cool
» experiment: lab experiment + qualitative analysis
» discussion: dataset, design implications, opportunities
6. librarian as recommender interface
» I can seed the conversation in a natural way (ask for what you want!)
» I can detect that the conversation is going astray, and correct (too scary! too old!)
» I can be vague in my query, or very specific, depending on my mood
7. vs. canonical recommender UI
» endless lists; the best stuff tries to be at the top
» sometimes based on recent activity (“context”)
» downsides?
9. bridging the gap: voice control
» voice interfaces, e.g.:
• Amazon Fire as video player
• Google Home as music player
» voice recognition getting better
» goal: better integration with recommender technologies
10. bridging the gap: chatbots
» chat interfaces, e.g.:
• And Chill on Facebook Messenger
• LunchBot on Slack
» frameworks to build these dialogues (e.g., wit.ai) are very accessible
» goal: richer, more flexible dialogue
14. before we can ask “how do we respond to natural language recommendation requests?” we must ask the following research question:
» how do users ask for recommendations and express their preferences using natural language?
15. experiment overview
» collect dataset of queries
• recruit MovieLens users by email
• assign subjects to speaking and typing conditions
• collect queries and survey responses
» qualitatively code queries
17. follow-up query (N=151)
» show 10 recs
» “I can improve these results. Tell me more about what you want.”
» same speaking/typing UI as first query
18. extracting meaning from queries
» inspired by Rose and Levinson (WWW 2004): goals of users in search (navigational, informational, resource)
» inductive, open coding
• four researchers read through the dataset, iteratively assigning new codes and refining old codes until stable
• final codes were the consensus of two researchers, who discussed and resolved disagreements
» evaluation of coding consistency
• two researchers coded 187 random queries to measure consistency
• Cohen’s kappa: 0.87
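The agreement statistic used on the slide can be computed directly. Below is a minimal sketch of Cohen's kappa for two coders labeling the same queries; the sample labels are invented for illustration (the paper reports kappa = 0.87 over 187 randomly sampled queries, not this toy data).

```python
# Cohen's kappa: agreement between two raters, corrected for chance.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Kappa for two raters over the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where the raters match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the raters labeled independently.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(labels_a) | set(labels_b)
    )
    return (observed - expected) / (1 - expected)

# Invented example labels using the paper's first-query categories.
coder1 = ["objective", "objective", "subjective", "navigation", "subjective"]
coder2 = ["objective", "objective", "subjective", "navigation", "objective"]
print(cohens_kappa(coder1, coder2))  # ~0.69 for this toy data
```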
21. objective
» known attributes
» filtering
genre: “superhero movies”
deep features: “movies with open endings or plot twists”
people: “Brad Pitt”
release date: “can you find me a funny romantic movie made in the 2000s?”
region: “British murder mystery”
language: “show me a list of German movies”
22. subjective
» quality judgments
» ordering
emotion: “sad movie”
quality: “interesting characters, clever plot”
movie-based: “what would you recommend to a fan of Big Lebowski?”
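To make the objective/subjective distinction concrete, here is a hypothetical keyword-based first pass at separating the two query types. The cue lists are illustrative assumptions, not from the paper: the authors coded queries manually, and a real system would need far richer language understanding.

```python
# Illustrative cue lists (assumptions, not the paper's codebook).
# Objective cues name known, filterable attributes; subjective cues
# express quality judgments or emotion.
OBJECTIVE_CUES = {"superhero", "horror", "comedy", "british", "german",
                  "2000s", "starring", "directed"}
SUBJECTIVE_CUES = {"sad", "funny", "scary", "interesting", "clever", "best"}

def rough_category(query: str) -> str:
    """Crude first-pass label for a natural language movie query."""
    tokens = set(query.lower().replace("?", "").split())
    if tokens & OBJECTIVE_CUES:
        return "objective"
    if tokens & SUBJECTIVE_CUES:
        return "subjective"
    return "unknown"

print(rough_category("superhero movies"))  # objective
print(rough_category("sad movie"))         # subjective
```

Checking objective cues first mirrors the slides' coding of mixed queries like “a funny romantic movie made in the 2000s,” which was categorized by its objective attribute (release date).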
27. follow-up query type: refine
» specify additional criteria for the recommender to consider
refine with further constraints:
1: “a mystery drama with a suspenseful ending”
2: “something from the last few years”
refine with clarification:
1: “Horror”
2: “More true horror instead of drama/thriller”
28. follow-up query type: reformulate
» completely restate the query
reformulate with further constraints:
1: “I’m looking for a romantic comedy”
2: “I’d like a romantic comedy that was created after the year 2000”
reformulate with clarification:
1: “a romantic comedy with a happy ending”
2: “romantic comedy with tensions between the couple but ends well”
30. text vs. speech
» speaking: longer queries
» speaking: more likely to be “conversational”
» speaking: more queries with objective deep features, and more subjective queries
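The length comparison above amounts to comparing mean token counts across the two conditions. A minimal sketch, using invented sample queries (the real dataset and statistics are in the paper):

```python
# Mean token count per query, one list per experimental condition.
def mean_length(queries):
    return sum(len(q.split()) for q in queries) / len(queries)

# Invented examples in the style of the talk's typed vs. spoken queries.
typed = ["superhero movies", "sad movie"]
spoken = ["can you find me a funny romantic movie made in the 2000s",
          "what would you recommend to a fan of Big Lebowski"]

print(mean_length(typed))   # 2.0
print(mean_length(spoken))  # 11.0
```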
32. design implications
» objective deep features and subjective features: important (and difficult?) to support
» detect and differentially support objective (filter) vs. subjective (sort) queries
» detect and support the “intent” of follow-up (critiquing) queries
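The filter-vs-sort implication can be sketched as a dispatch: objective criteria narrow the candidate set, subjective criteria reorder it. The `Movie` type, the catalog, and the rating-based scoring below are all assumptions for illustration, not part of the paper.

```python
from dataclasses import dataclass

@dataclass
class Movie:
    title: str
    genres: set
    avg_rating: float  # stand-in for a subjective quality score

# Tiny invented catalog.
CATALOG = [
    Movie("Movie A", {"horror"}, 3.9),
    Movie("Movie B", {"comedy"}, 4.2),
    Movie("Movie C", {"horror", "thriller"}, 4.5),
]

def recommend(objective_genres=None, subjective_sort=False):
    results = CATALOG
    if objective_genres:
        # Objective criteria filter the candidate set.
        results = [m for m in results if objective_genres & m.genres]
    if subjective_sort:
        # Subjective criteria reorder what remains.
        results = sorted(results, key=lambda m: m.avg_rating, reverse=True)
    return [m.title for m in results]

print(recommend(objective_genres={"horror"}, subjective_sort=True))
# horror movies only, best-rated first
```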
33. opportunities
» more than movies! how do people interact differently in different domains?
» different UIs: voice-only, voice + screen (Amazon Echo), typing (Facebook Messenger)
» conversational recommenders
» decision-making with intelligent agents
34. dataset
» queries, survey responses
» link in paper (or search for “understanding how people use natural language to ask for recommendations”)
35. Understanding How People Use Natural Language to Ask for Recommendations
Max Harper
Research Scientist
GroupLens Research, University of Minnesota
max@umn.edu
@maxharp3r
This material is based on work supported by the National Science Foundation under grants IIS-0964695, IIS-1017697, IIS-1111201, IIS-1210863, and IIS-1218826, and by a grant from Google.