Converse framework

ConveRSE:
a domain-independent
Framework for building
Conversational Recommender Systems
Fedelucio Narducci
University of Bari Aldo Moro - Italy
Google Conversational Search & Recommendation Workshop - August 28-29, 2019 - London

BACKGROUND
➤ Conversational
Recommender Systems
(CoRSs) belong to the class of
dialog agents and interact
with the user during the
recommendation process
➤ CoRSs guide the users through an
interactive dialog
➤ preference acquisition
is an incremental process
that does not necessarily have to be
finalized in a single step

PROBLEMS
➤ Developing dialog
agents is becoming more
and more popular for several
domains and applications
➤ The implementation is a
complex task since it requires
knowledge of NLP, HCI,
and machine learning

CONVERSE
➤ a domain-independent framework for building
conversational recommender systems

CAPABILITIES
➤ Acquire preferences
➤ Explore user profile
➤ Exploit feedback
➤ Explain recommendations
➤ Offer different interaction modes
  ➤ natural language
  ➤ buttons
  ➤ hybrid

ARCHITECTURE
[Architecture diagram: Intent Recognizer, Dialog Manager, Entity Recognizer, Sentiment Analyzer, Recommendation Services]

ARCHITECTURE: DIALOG MANAGER
➤ core component of the
framework
➤ supervises the whole
➤ keeps track of the dialog
state
➤ receives/sends messages from/to the user
➤ is completely independent from the client (JSON message)
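The client independence described above can be sketched as a dialog manager that only consumes and produces JSON strings. This is a minimal illustration, not the framework's actual code: the class name, the field names (`user_id`, `text`, `reply`, `state`), and the shape of the dialog state are all hypothetical.

```python
import json


class DialogManager:
    """Toy dialog manager: tracks per-user state and speaks only JSON."""

    def __init__(self):
        # Hypothetical dialog state: one dict per user id.
        self.states = {}

    def handle(self, raw_message: str) -> str:
        message = json.loads(raw_message)
        state = self.states.setdefault(
            message["user_id"], {"turns": 0, "last_text": None}
        )
        state["turns"] += 1
        state["last_text"] = message["text"]
        # A real system would call the intent, entity, and sentiment
        # components here; this sketch only reports the tracked state.
        reply = {
            "user_id": message["user_id"],
            "reply": f"Received turn {state['turns']}",
            "state": state,
        }
        return json.dumps(reply)


dm = DialogManager()
first = json.loads(dm.handle(json.dumps({"user_id": "u1", "text": "I like The Matrix"})))
second = json.loads(dm.handle(json.dumps({"user_id": "u1", "text": "Recommend a movie"})))
```

Because both input and output are plain JSON, any client (a Telegram bot, a web page) could sit in front of the same manager.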

ARCHITECTURE: INTENT RECOGNIZER
➤ identifies the intent of the user expressed by a natural language sentence
➤ is based on DialogFlow
➤ recognizes four user intents
  ➤ preference
  ➤ recommendation
  ➤ show profile
  ➤ help

ARCHITECTURE: INTENT RECOGNIZER
➤ each intent can be composed of a set of sub-intents
  ➤ show profile
    ➤ delete preference
    ➤ update preference
    ➤ reset profile
ARCHITECTURE: SENTIMENT ANALYZER
➤ Sentiment Tagger of Stanford CoreNLP
➤ returns the sentiment tags identified in a sentence
➤ links each sentiment tag to an entity in the sentence
  ➤ "I like The Matrix, but I hate Keanu Reeves"
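Linking a sentiment tag to an entity can be illustrated with a deliberately simple stand-in. The sketch below does not use Stanford CoreNLP: it assumes a tiny hand-written lexicon, splits the sentence into clauses on commas and "but", and gives each entity the polarity of the sentiment word found in the same clause. The function name and the lexicon are hypothetical.

```python
import re

# Tiny stand-in lexicon; the slides use the Stanford CoreNLP tagger instead.
LEXICON = {"like": "+", "love": "+", "hate": "-", "dislike": "-"}


def link_sentiment(sentence: str, entities: list) -> dict:
    """Assign each entity the polarity of the sentiment word in its clause."""
    links = {}
    # Split into clauses on commas and the conjunction "but".
    for clause in re.split(r",|\bbut\b", sentence):
        words = re.findall(r"[a-z']+", clause.lower())
        polarity = next((LEXICON[w] for w in words if w in LEXICON), None)
        if polarity is None:
            continue
        for entity in entities:
            if entity.lower() in clause.lower():
                links[entity] = polarity
    return links


result = link_sentiment(
    "I like The Matrix, but I hate Keanu Reeves",
    ["The Matrix", "Keanu Reeves"],
)
```

This toy version ignores negation ("I don't like ..."), which a real sentiment tagger has to handle.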

ARCHITECTURE: ENTITY RECOGNIZER
➤ Entity Recognizer finds entities in the user sentence
➤ links them to the Knowledge Base Wikidata
➤ does not require annotated data for training
➤ recognizes aliases
  ➤ Steven Allan Spielberg, Spielberg, Steven Spielberg
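Alias recognition without annotated training data can be pictured as a dictionary lookup. The sketch below assumes a small hand-written alias table (in the framework the aliases would come from the knowledge base) and prefers the longest matching alias; the table, the identifier `Steven_Spielberg`, and the function name are illustrative only.

```python
# Hypothetical alias table; in the slides, aliases come from the knowledge base.
ALIASES = {
    "steven allan spielberg": "Steven_Spielberg",
    "steven spielberg": "Steven_Spielberg",
    "spielberg": "Steven_Spielberg",
}


def find_entities(sentence: str) -> list:
    """Return (surface form, entity id) pairs, preferring the longest alias."""
    text = sentence.lower()
    found = []
    # Longest aliases first, so "steven spielberg" wins over "spielberg".
    for alias in sorted(ALIASES, key=len, reverse=True):
        start = text.find(alias)
        if start != -1:
            end = start + len(alias)
            found.append((sentence[start:end], ALIASES[alias]))
            # Blank out the match so shorter aliases do not match inside it.
            text = text[:start] + " " * len(alias) + text[end:]
    return found
```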

ARCHITECTURE: ENTITY RECOGNIZER - PROBLEMS
➤ different surface forms can refer to the same entity
  ➤ Steven Spielberg, Spielberg -> Steven_Spielberg:director
➤ the same surface form can refer to more than one entity
  ➤ Spielberg -> Sasha_Spielberg:actor

ARCHITECTURE: ENTITY RECOGNIZER@WORK
➤ Step 1
  ➤ "I like Spielberg and Jurassic Park" ~ "I like Jurassic Park and its director Spielberg"

➤ Step 2
  ➤ User sentence: I like Spielberg and Jurassic Park
  ➤ Surface form [1]: Spielberg
  ➤ Candidate entities: Steven_Spielberg:director, Sasha_Spielberg:actor
  ➤ Context: Jurassic Park
    sim [2] (Jurassic Park, Steven_Spielberg:director) = 0.90
    sim (Jurassic Park, Sasha_Spielberg:actor) = 0.15
    Spielberg = Steven_Spielberg:director
[1] Basile, P., Caputo, A., Semeraro, G., Narducci, F.: UNIBA: Exploiting a distributional semantic model for disambiguating and linking entities in tweets. Making Sense of Microposts (#Microposts2015) (2015)
[2] Nickel, M., Rosasco, L., Poggio, T. A., et al.: Holographic Embeddings of Knowledge Graphs. In: The Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), pp. 1955–1961 (2016)
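The disambiguation step in the example above amounts to picking the candidate entity that is most similar to the context. The sketch below hard-codes the two similarity scores from the slide in a lookup table, since the embedding model that produces them is not part of this deck; the `SIMILARITY` table and the `disambiguate` function are hypothetical names.

```python
# Similarity scores copied from the example above; in the slides they are
# computed by a semantic model, which this sketch replaces with a lookup table.
SIMILARITY = {
    ("Jurassic Park", "Steven_Spielberg:director"): 0.90,
    ("Jurassic Park", "Sasha_Spielberg:actor"): 0.15,
}


def disambiguate(candidates: list, context: list) -> str:
    """Pick the candidate whose total similarity to the context is highest."""
    def score(candidate):
        return sum(SIMILARITY.get((item, candidate), 0.0) for item in context)
    return max(candidates, key=score)


best = disambiguate(
    ["Steven_Spielberg:director", "Sasha_Spielberg:actor"],
    ["Jurassic Park"],
)
```

Summing over all context items is one possible aggregation; the slide only shows a single context entity, so the choice of aggregation is an assumption.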

ARCHITECTURE: RECOMMENDATION SERVICES
➤ Recommendation algorithm is PageRank with priors
  ➤ nodes are entities from DBpedia (e.g., actors, movies, directors)
➤ Explanation [1]
  ➤ exploits links between liked items and recommendations in the DBpedia graph
[1] Musto, C., Narducci, F., Lops, P., de Gemmis, M., Semeraro, G.: Linked open data-based explanations for transparent recommender systems. International Journal of Human Computer Studies, 121, pp. 93-107 (2019)
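PageRank with priors can be sketched as a power iteration in which the random jump lands on nodes in proportion to a prior distribution rather than uniformly. The toy graph, the damping factor of 0.85, the iteration count, and the choice to give liked items equal prior mass are all assumptions for illustration; the slides do not state how the framework sets them.

```python
def pagerank_with_priors(graph, priors, damping=0.85, iterations=50):
    """Power iteration where the teleport step follows `priors`
    instead of a uniform distribution."""
    nodes = list(graph)
    total = sum(priors.get(n, 0.0) for n in nodes)
    prior = {n: priors.get(n, 0.0) / total for n in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * prior[n] for n in nodes}
        for node in nodes:
            neighbours = graph[node]
            if neighbours:
                share = damping * rank[node] / len(neighbours)
                for neighbour in neighbours:
                    new_rank[neighbour] += share
            else:
                # Dangling node: return its mass according to the priors.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * prior[n]
        rank = new_rank
    return rank


# Toy entity graph (adjacency lists); edges link films to people.
graph = {
    "The Matrix": ["The Wachowskis"],
    "The Wachowskis": ["The Matrix", "Cloud Atlas"],
    "Cloud Atlas": ["The Wachowskis", "Tom Hanks"],
    "Tom Hanks": ["Cloud Atlas", "Saving Private Ryan"],
    "Saving Private Ryan": ["Tom Hanks"],
    "Titanic": ["Bill Paxton"],
    "Bill Paxton": ["Titanic"],
}
# Assumed user profile: two liked films share the prior mass equally.
scores = pagerank_with_priors(graph, {"The Matrix": 1.0, "Saving Private Ryan": 1.0})
best = max(["Cloud Atlas", "Titanic"], key=scores.get)
```

In this toy graph, Cloud Atlas is reachable from both liked films while Titanic is not connected to either, so Cloud Atlas is the higher-ranked candidate.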

ARCHITECTURE: RECOMMENDATION SERVICES
➤ Critiquing
  ➤ the user can provide complex feedback on the recommended items
  ➤ "I like the movie Titanic, but I don’t like the actor Bill Paxton"

EXPLANATION@WORK
[Figure: DBpedia graph linking the recommended film to liked films through the nodes Tom Hanks, The Wachowskis, American Epic Films, and Dystopian Films; edges are labeled dbpedia-owl:starring, dbpedia-owl:director, and dcterms:subject]
"I recommend you Cloud Atlas because you often like films with Tom Hanks as Saving Private Ryan and Da Vinci Code. In addition, you sometimes like films directed by The Wachowskis as The Matrix."
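An explanation of this kind can be approximated by finding the properties that the recommended item shares with liked items and filling a sentence template. The sketch below is not the method of the cited paper: the property table, the templates, and the rule that two or more supporting items read as "often" (otherwise "sometimes") are hypothetical choices made for illustration.

```python
# Toy property table; the slides read these links from the DBpedia graph.
PROPERTIES = {
    "Cloud Atlas": {("starring", "Tom Hanks"), ("director", "The Wachowskis")},
    "Saving Private Ryan": {("starring", "Tom Hanks")},
    "Da Vinci Code": {("starring", "Tom Hanks")},
    "The Matrix": {("director", "The Wachowskis")},
}
TEMPLATES = {"starring": "films with {}", "director": "films directed by {}"}


def explain(recommended: str, liked: list) -> str:
    """Build an explanation from properties shared with liked items."""
    parts = []
    # Reverse alphabetical order puts "starring" before "director".
    for prop, value in sorted(PROPERTIES[recommended], reverse=True):
        sharing = [item for item in liked if (prop, value) in PROPERTIES[item]]
        if not sharing:
            continue
        # Hypothetical rule: two or more supporting items read as "often".
        adverb = "often" if len(sharing) >= 2 else "sometimes"
        parts.append(
            f"you {adverb} like {TEMPLATES[prop].format(value)} "
            f"such as {' and '.join(sharing)}"
        )
    return f"I recommend you {recommended} because " + "; ".join(parts) + "."


text = explain("Cloud Atlas", ["Saving Private Ryan", "Da Vinci Code", "The Matrix"])
```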

THE FRAMEWORK@WORK
➤ three instances on Telegram (movie, music, and
book)
➤ @movierecsys2_bot
➤ @musicrecsys_bot
➤ @bookrecsys_bot

EXPERIMENTAL EVALUATION
➤ first session: in-vitro experiment on two datasets
➤ second session: user study

EXPERIMENTAL EVALUATION: FIRST SESSION
➤ In-vitro experiment
➤ to assess the accuracy of each
component of our framework
and its impact on the

EXPERIMENTAL EVALUATION: GOAL
Test separately
➤ Intent Recognizer
➤ Entity Recognizer
➤ Sentiment Recognizer

EXPERIMENTAL EVALUATION: DATASET
bAbI by Facebook Research collects utterances like:
"Beauty and the Beast, Aladdin, Schindler’s List, and The Silence of the Lambs are movies I loved.
Would you recommend something I might like?"

Intent Recognizer test
Entities and Sentiments are set programmatically
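"Beauty and the Beast, Aladdin, Schindler’s List, and The Silence of the Lambs are movies I loved." (Preference)
"Would you recommend something I might like?" (Recommendation request)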

Entity Recognizer test
Intents and Sentiments are set programmatically

Sentiment Recognizer test
Intents and Entities are set programmatically

➤ Intent Recognizer Test
➤ Entity Recognizer Test
➤ Sentiment Recognizer Test
compared to
➤ Upper bound: recommendations generated by setting intents, entities, and sentiments by code

EXPERIMENTAL EVALUATION: METRIC AND RESULTS
HitRate@n: hits / #recommended items, with n = 5, 10, 20
                      HR@5      HR@10     HR@20
Upper Bound           0.75      1.21      1.93
                      Loss@5    Loss@10   Loss@20
Intent Recognizer     -34.00%   -30.86%   -24.03%
Entity Recognizer     -46.00%   -35.80%   -27.13%
Sentiment Recognizer  -20.00%   -16.05%   -14.73%
Entity Recognizer ~ 85% accuracy
Intent Recognizer ~ 77% accuracy
Sentiment Recognizer ~ 83% accuracy
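The metric and the loss figures can be related with a few lines of arithmetic. The sketch below implements the hit-rate formula as stated on the slide and assumes that Loss@n is the relative change of a component's hit rate with respect to the upper bound; the slide does not spell out that definition or the scale of the reported HR values, so both are assumptions, and the function names are hypothetical.

```python
def hit_rate_at_n(recommended: list, relevant: set, n: int) -> float:
    """Hits among the top-n recommendations divided by the number recommended."""
    top_n = recommended[:n]
    hits = sum(1 for item in top_n if item in relevant)
    return hits / len(top_n)


def relative_loss(component_hr: float, upper_bound_hr: float) -> float:
    """Relative change with respect to the upper bound, as a percentage."""
    return 100.0 * (component_hr - upper_bound_hr) / upper_bound_hr


def hr_from_loss(upper_bound_hr: float, loss_percent: float) -> float:
    """Invert relative_loss to recover a component's hit rate."""
    return upper_bound_hr * (1.0 + loss_percent / 100.0)


# Under the relative-loss assumption, the Intent Recognizer's Loss@5 of
# -34.00% against an upper bound of 0.75 corresponds to 0.75 * 0.66.
intent_hr5 = hr_from_loss(0.75, -34.0)
```

Under this reading, the Entity Recognizer has the largest effect on the recommendations at every cutoff, and the Sentiment Recognizer the smallest.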

EXPERIMENTAL EVALUATION: SECOND IN-VITRO EXPERIMENT
➤ Dataset released by GroupLens [1]
  ➤ collects recommendation requests of real users to a (simulated) conversational recommender system
  ➤ 694 sentences
➤ Results
  ➤ 7.4% intents (very difficult task, requests like "action movies", "exploitations films", "film with sharks", "i’m looking for a hard sci-fi movie")
  ➤ 64.39% entities
[1] Kang, J., Condiff, K., Chang, S., Konstan, J. A., Terveen, L., & Harper, F. M. (2017, August). Understanding how people use natural language to ask for recommendations. In Proceedings of the Eleventh ACM Conference on Recommender Systems (pp. 229-237). ACM.

EXPERIMENTAL EVALUATION: USER STUDY
➤ 50 users tested three domains and three
interaction modes:
➤ movie, book, and music
➤ natural language, buttons, and mixed

➤ Goal
➤ to assess the impact of using natural language in a CoRS
➤ Research Questions
➤ Can natural language improve a CoRS in terms of
cost of interaction?
➤ Can natural language improve a CoRS in terms of
quality of recommendations?
➤ What is the impact of each component of a CoRS on
the accuracy of the recommendations?
➤ What are the most critical aspects to consider when
modelling a natural language-based dialog for a
CoRS?

➤ Metrics
➤ Objective metrics
➤ MAP, Accuracy
➤ Number of Questions, Time per Question,
Interaction Time, Query Density
➤ Questionnaire based on the ResQue model [1]
and the questionnaire proposed in [2]
[1] P. Pu, L. Chen, R. Hu, A user-centric evaluation framework for recommender systems, in: Proceedings of the fifth ACM conference on Recommender systems, ACM, 2011, pp. 157–164
[2] A. Silvervarg, A. Jonsson, Subjective and objective evaluation of conversational agents in learning environments for young teenagers, in: Proceedings of the 7th IJCAI Workshop on Knowledge and Reasoning in Practical Dialogue Systems, 2011
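Of the objective metrics listed above, MAP has a standard definition that can be sketched briefly. The slides do not say which variant was used in the study, so the code below assumes the common one: average precision is the mean of precision@k over the ranks k that hold a relevant item, normalised by the number of relevant items, and MAP averages that value over users. The function names are hypothetical.

```python
def average_precision(ranked: list, relevant: set) -> float:
    """Mean of precision@k taken at each rank k that holds a relevant item."""
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(runs: list) -> float:
    """Average the per-user average precision over (ranked, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two hypothetical users: relevant items at ranks 1 and 3, and at rank 2.
map_score = mean_average_precision([
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y"}),
])
```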

➤ Results 1/2
➤ Pure NL-based interfaces need to help the user when precise input is needed
➤ Pure NL-based interfaces did not perform significantly better than button-based interfaces
➤ Mixed interactions (NL + buttons) generally perform better than NL and buttons taken individually

➤ Results 2/2
➤ Recognition of ratings and entities is a
crucial step for achieving good accuracy,
especially in the first phases of the interaction
➤ When users are asked to choose from a set of
predefined answers, this activity needs to be
facilitated in some way

CONVERSE DATASET
➤ During the user study, we collected three datasets of real dialogs useful for training and testing the Intent Recognizer, Entity Recognizer, or Sentiment Analyzer
➤ Messages
  ➤ 5,318 movie domain
  ➤ 1,862 book domain
  ➤ 2,096 music domain
➤ split into recommendation requests, preferences, criticisms, and explanation requests
➤ https://tinyurl.com/converse-dat

QUESTIONS?
fedelucio.narducci@uniba.it
https://www.linkedin.com/in/fedelucio-narducci-612a0425
@LucioNarducci
This is a collaborative work with: Pierpaolo Basile, Marco de
Gemmis, Andrea Iovine, Pasquale Lops, and Giovanni Semeraro
