Slides presenting a paper published in the proceeding of 22nd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2018), Belgrade, Serbia
A combination of reduction and expansion approaches to handle with long natural language queries
1. A COMBINATION OF REDUCTION AND EXPANSION APPROACHES TO HANDLE WITH LONG NATURAL LANGUAGE QUERIES
Available online at www.sciencedirect.com
Procedia Computer Science 00 (2018) 000–000
www.elsevier.com/locate/procedia
22nd International Conference on Knowledge-Based and
Intelligent Information & Engineering Systems
A Combination of Reduction and Expansion Approaches to Handle with Long Natural Language Queries
Mohamed ETTALEB (a), Chiraz LATIRI (b), Patrice BELLOT (b)
(a) University of Tunis El Manar, Faculty of Sciences of Tunis, LIPAH research Laboratory, Tunis, Tunisia
(b) Aix-Marseille Université, CNRS, LIS UMR 7020, 13397, Marseille, France
Abstract
Most of the queries submitted to search engines are composed of keywords, but this is not enough for users to express their needs.
Through verbose natural language queries, users can express complex or highly specific information needs. However, it is difficult
for search engines to deal with this type of queries. Moreover, the emergence of social media allows users to get opinions, suggestions
or recommendations from other users about complex information needs. In order to increase the understanding of user
needs, tasks such as the CLEF Social Book Search Suggestion Track have been proposed from 2011 to 2016. The aim is to investigate
techniques to support users in searching for books in catalogs of professional metadata and complementary social media. In this
respect, we introduce in the present paper a statistical approach to deal with long verbose queries in Social Information Retrieval (SIR)
3. P. Bellot (AMU-CNRS, LSIS-OpenEdition)
Searching for Books : a difficult task
User needs : very diverse facets or aspects
— topic oriented aspects
— With / without a precise context
e.g. arts in China during the 20th century
— Books dealing with named entities : locations (the book is about a specific location OR
the action takes place at this location), proper names…
— What are the most important / more popular books about …
— style / type / language aspects
— category : fiction, novel, essay, proceedings, position papers…
— target : for experts / for dummies / for children …
— and also… : well illustrated, cheap, short…
=> Keyword queries are not enough => verbose natural language queries
Book Contents : long stories, metaphoric language, several topics…
=> Metadata (tags, ToC, indexes…), summaries, reader reviews can help us
4. (IR based) Book Suggestion System
Hypothesis : Reviews can help to expand the queries
[Slide diagram: book content, metadata, summaries and reviews are indexed by an Information Retrieval System; user needs and example books form the query, which goes through Reduction and Expansion before the system returns a Book Suggestion.]
5. P. Bellot (AMU-CNRS, LSIS-OpenEdition)
http://social-book-search.humanities.uva.nl/#/overview
2 The Amazon collection
The document collection used for this year's Book Track is composed of Amazon pages of
existing books. These pages consist of editorial information such as ISBN number,
title, number of pages, etc. However, in this collection the most important
content resides in social data. Indeed, Amazon is social-oriented, and users can
comment on and rate products they purchased or own. Reviews are identified
by the <review> fields and are unique for a single user: Amazon does not
allow a forum-like discussion. Users can also assign tags of their own creation to a
product. These tags are useful for refining the search of other users because
they are not fixed: they reflect the trends for a specific product. In the
XML documents, they can be found in the <tag> fields. Apart from this user
classification, Amazon provides its own category labels, which are contained in the
<browseNode> fields.
Table 1. Some facts about the Amazon collection.
Number of pages (i.e. books): 2,781,400
Number of reviews: 15,785,133
Number of pages that contain at least one review: 1,915,336
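As an illustration, the fields named above (<review>, <tag>, <browseNode>) can be read with standard XML parsing. The record below is a made-up miniature example using only the field names mentioned in the text, not an actual page from the collection.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal Amazon-page record (illustrative only)
xml_doc = """
<book>
  <title>The Rommel Papers</title>
  <review>Great account of desert warfare.</review>
  <review>Detailed and readable.</review>
  <tag>military</tag>
  <tag>history</tag>
  <browseNode>World War II</browseNode>
</book>
"""

root = ET.fromstring(xml_doc)
reviews = [r.text for r in root.findall("review")]        # user reviews, one per user
tags = [t.text for t in root.findall("tag")]              # user-assigned tags
categories = [b.text for b in root.findall("browseNode")] # Amazon's own category labels
```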
3 Retrieval model
3.1 Sequential Dependence Model
Like the previous year, we used a language modeling approach to retrieval [4].
We use Metzler and Croft's Markov Random Field (MRF) model [5] to integrate
multiword phrases in the query. Specifically, we use the Sequential Dependence Model.
a CLEF lab 2011-2016
8. P. Bellot (AMU-CNRS, LSIS-OpenEdition)
Our proposal for dealing with queries
— 1- Combine query reduction and query expansion
— 2- Apply an Information Retrieval Model over (meta)data and reviews
— Query Reduction : removes the less informative words / the most complex to deal with
— Query Expansion : fights the word-mismatch problem (a concept is described by
different terms in user queries and in source documents)
— Association rules « query words / words in the reviews »
= inter-term correlations (if the query words occur then these words occur as well)
— Use of the examples given by the user : Pseudo-Relevance Feedback (Rocchio)
• stopword removal and stemming for the English language to reduce the verbose queries
• query expansion based on association rules between terms.
• query expansion using similar books mentioned in the topics.
4.1. Query Reduction
Removing stopwords has long been a standard query processing step [12]. We used three different stopword lists
in this study: the standard stopword list, as well as two stopword lists based on morphosyntactic analysis and
on ranking terms by some weight. [1] presents several methods for automatically constructing a collection-dependent
stopword list. These methods generally involve ranking collection terms by some weight, then choosing
some rank threshold above which terms are considered stopwords. [17] constructed a specific stopword list for a given
collection and used the IDF statistical measure to rank terms and decide whether a term is a stopword or not. Next, they applied
these techniques by removing from the query all words which occur in the stopword list. Our proposal is to reduce the
verbose queries in two steps: first, all terms that appear in the standard stopword list are eliminated. Second,
we apply a linguistic filtering method and run TreeTagger, a part-of-speech tagger, on the queries. Then, we
select only the words of noun type (nouns, proper nouns, etc.) and the query words that form a noun
phrase, such as Syrian Civil War. The aim is to keep the appropriate words that can improve the quality of the user
query.
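A minimal sketch of this two-step reduction. The toy stopword list and the stand-in for TreeTagger's part-of-speech output are illustrative assumptions, not the actual resources used in the paper.

```python
# Toy stopword list and a stand-in for TreeTagger POS tags (illustrative assumptions)
STOPWORDS = {"does", "anyone", "know", "of", "a", "on", "the"}
POS = {"good": "ADJ", "book": "NOUN", "battle": "NOUN", "gazala": "NOUN"}

def reduce_query(query):
    """Step 1: drop stopwords. Step 2: keep nouns and adjective+noun phrases."""
    tokens = [t.strip("?.,!").lower() for t in query.split()]
    tokens = [t for t in tokens if t not in STOPWORDS]
    kept = []
    for i, t in enumerate(tokens):
        tag = POS.get(t, "OTHER")
        if tag == "NOUN":
            kept.append(t)
        elif tag == "ADJ" and i + 1 < len(tokens) and POS.get(tokens[i + 1]) == "NOUN":
            kept.append(t)  # adjective opening a noun phrase, e.g. "good book"
    return " ".join(kept)
```

On the query "Does anyone know of a good book on the Battle of Gazala?" this yields "good book battle gazala".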
4.2. Query expansion using association rules between terms (ART)
Query expansion is the process of adding relevant terms to the original queries to improve the performance
of information retrieval systems. However, previous studies showed that automatic query expansion using
association rules does not always lead to an improvement in performance.
The main idea of this approach is to extract a set of non-redundant rules, representing inter-term correlations in a
contextual manner. We use the rules that convey the most interesting correlations amongst terms to extend the initial
queries. Then, we extract a list of books for each query using the BM25 scoring [25].
4.2.1. Representation and Query Expansion:
We represent a query q as a bag of terms, and a set of candidate terms for q:

CT = {w1, ..., wm}

where wi is a candidate term. This set of candidate terms, denoted CT, is selected using association rules between terms, detailed in the next section.

4.2.2. Association Rules:
An association rule between terms is an implication of the form R : T1 ⇒ T2, where T1 and T2 are subsets of τ, with τ := {t1, ..., tl} a finite set of terms in the books collection and T1 ∩ T2 = ∅. The termsets T1 and T2 are, respectively, called the premise and the conclusion of R. The rule R is said to be based on the termset T equal to T1 ∪ T2. The support of a rule R is:

Supp(R) = Supp(T)

while its confidence is computed as:

Conf(R) = Supp(T) / Supp(T1)

An association rule R is said to be valid if its confidence value, i.e., Conf(R), is greater than or equal to a user-defined threshold denoted minconf. This confidence threshold is used to exclude non-valid rules.

4.2.3. Candidate Terms Generation Approach based on Association Rules:
The main idea of this approach is to use the association rules mining technique to discover strong correlations between terms [4]. The set of query terms is expanded using the maximal possible set of terms located in the conclusion parts of the retained rules, while checking that the query terms are located in their premise part. An illustrative example of association rules is highlighted in Table 2.

R | Support | Confidence
military ⇒ warfare | 83 | 0.741
romance ⇒ love | 64 | 0.723
Table 2. Association rules examples.

The process of generating candidate terms for a given query is performed in the following steps:
1) Selection of a subset of 12,000 books according to the query's subject. The books are represented only by their social information; we chose to select the title, the reviews and the tags as the content of each book.
2) Annotation of the selected books using TreeTagger. The choice of TreeTagger was based on the ability of this tool to recognize the nature (morphosyntactic category) of a word in its context.
3) Extraction of nouns (terms) from the annotated books, and removal of the most frequent ones.
4) Generation of the association rules using an efficient algorithm, Closed Association Rule Mining (CHARM) [29], which mines all the closed frequent termsets. As parameters, CHARM takes minsupp = 15 as the relative minimal support and minconf = 0.7 as the minimum confidence of the association rules [18]. Considering the Zipf distribution of the collection, the maximum threshold of the support values is experimentally set in order to prune trivial terms, which occur in most of the documents and would therefore be related to too many terms. The minimal threshold, on the other hand, eliminates marginal terms, which occur in few documents and are then not statistically important when occurring in a rule. CHARM gives as output the association rules with their support and confidence; Table 2 describes this output.

9. Association Rules Mining for Query Expansion

[Slide diagram: for each query (bag of words), association rules link query terms to terms occurring in the books (title + review + metadata); the query is expanded with the maximal set of terms found in the conclusion parts of the retained termset rules.]
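The candidate-term generation can be sketched as follows. The first two rules come from Table 2; the third rule and the minable data structure are invented for illustration, while the minconf = 0.7 threshold matches the paper's setting.

```python
# Mined rules as (premise, conclusion, support, confidence); the first two come
# from Table 2, the third is invented for illustration.
RULES = [
    ({"military"}, {"warfare"}, 83, 0.741),
    ({"romance"}, {"love"}, 64, 0.723),
    ({"battle"}, {"war", "army"}, 40, 0.710),
]

def expand_query(query_terms, rules=RULES, minconf=0.7):
    """Collect conclusion terms of every valid rule whose premise lies in the query."""
    query = set(query_terms)
    candidates = set()
    for premise, conclusion, support, confidence in rules:
        if confidence >= minconf and premise <= query:  # valid rule, premise in query
            candidates |= conclusion - query
    return sorted(query), sorted(candidates)
```

Expanding the reduced query "good book battle gazala" would add "war" and "army" through the (invented) battle rule, while the military and romance rules do not fire because their premises are absent from the query.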
10. Experiments
— Experimental setup
- Terrier Information Retrieval System
- BM25 model
In our experiments, we present experimental results on SBS 2014 and SBS 2016
to compare the performances of the different components of our system. First, we used the Terrier
framework developed at the University of Glasgow [22]. Terrier is a modular platform for large-scale
IR applications; it provides indexing and retrieval functionalities. The BM25 model was used with
the usual parameter values (b = 0, k3 = 1000, k1 = 2). Using the BM25 model, the score of a document D for a query
Q is given by:

S(D, Q) = \sum_{t \in Q} \frac{(k_1 + 1)\, w(t, d)}{k_1 + w(t, d)} \cdot idf(t) \cdot \frac{(k_3 + 1)\, w(t, Q)}{k_3 + w(t, Q)}
4 http://social-book-search.humanities.uva.nl/#/data/suggestion
Where w(t, d) and w(t, Q) are respectively the weights of term t in the document d and in the query Q, and idf(t) is the inverse
document frequency of term t, given as follows:

idf(t) = \log \frac{|D| - df(t) + 0.5}{df(t) + 0.5}

Where df(t) is the number of documents containing t, and |D| is the number of documents in the collection. We
conducted three different runs, namely:
1. Run-RQ: We used only the reduced queries described in Section 4.1.
2. Run-QEART: We added the association rules between terms to extend the reduced queries.
3. Run-QEEB: Query expansion using example books.
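The BM25 scoring used here can be sketched as a minimal, self-contained function. With b = 0 there is no document-length normalization, matching the parameter setting above; taking w(t, d) and w(t, Q) as raw term frequencies is an assumption of this sketch, not Terrier's implementation.

```python
import math

def idf(df_t, num_docs):
    # idf(t) = log((|D| - df(t) + 0.5) / (df(t) + 0.5))
    return math.log((num_docs - df_t + 0.5) / (df_t + 0.5))

def bm25_score(doc_tf, query_tf, df, num_docs, k1=2.0, k3=1000.0):
    """BM25 score of a document for a query, summed over query terms (b = 0)."""
    score = 0.0
    for t, w_tq in query_tf.items():
        w_td = doc_tf.get(t, 0)
        if w_td == 0:
            continue  # terms absent from the document contribute nothing
        score += ((k1 + 1) * w_td / (k1 + w_td)) \
                 * idf(df.get(t, 0), num_docs) \
                 * ((k3 + 1) * w_tq / (k3 + w_tq))
    return score
```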
relevant and contain terms that can be important to the query. In order to exploit these similar books, we expand again the
queries processed in Section 4.2 by automatically adding terms from these similar books.
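This feedback step can be sketched as a simplified Rocchio variant. The 0.4 weight and the 10 terms per example book follow the settings reported in the experiments; the weighting scheme itself is an assumption of this sketch.

```python
from collections import Counter

def rocchio_expand(query_terms, example_books, beta=0.4, terms_per_book=10):
    """Boost query-term weights with the top terms of each example book."""
    weights = Counter({t: 1.0 for t in query_terms})
    for text in example_books:
        tf = Counter(text.lower().split())
        total = sum(tf.values())
        # add the beta-weighted relative frequency of the book's top terms
        for term, count in tf.most_common(terms_per_book):
            weights[term] += beta * count / total
    return [term for term, _ in weights.most_common()]
```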
5. Experiments and results
In this section, we describe the experimental setup we used for our experiments.
5.1. Experimental data
To evaluate our approach, the data provided by the CLEF SBS suggestion track 2016 are used.
• Documents: The document collection consists of 2.8 million book descriptions with metadata from Amazon
and LibraryThing. Each document is represented by book title, author, publisher, publication year, library
classification codes, and user-generated content in the form of user ratings and reviews.
• Queries: For the collection of queries from 2011 to 2016, the organizers of SBS have used the LibraryThing forum to
extract a different set of queries with relevance judgments for each year. In our case, we chose to combine the
title with the narrative as a representation of the queries.
Year | #Queries | Fields
2011 | 211 | Title, Group, Narrative, Type, Genre, Specificity
2012 | 96 | Title, Group, Narrative, Type, Genre
2013 | 370 | Title, Group, Narrative, Query
2014 | 672 | Title, Group, Narrative, Mediated query
2015 | 178 | Title, Group, Narrative, Mediated query
2016 | 119 | Title, Group, Request
Table 3. The six years of topics used for the SBS Suggestion track
For fair comparison, the queries and the corresponding relevance judgments in the other years are utilized as the
Strategy | NDCG@10 | MAP | Improved
Baseline model | 0.1041 | 0.0965 | -
RQ | 0.1158 | 0.1014 | 11.24%
QEART | 0.1429 | 0.1153 | 23.4%
QEEB | 0.1518 | 0.1194 | 6.23%
Table 4. Results of SBS 2014 with different strategies

Strategy | NDCG@10 | MAP | Improved
Baseline model | 0.1175 | 0.0872 | -
RQ | 0.1240 | 0.0904 | 5.53%
QEART | 0.1549 | 0.1013 | 24.92%
QEEB | 0.1688 | 0.1054 | 8.97%
Table 5. Results of SBS 2016 with different strategies
We used two topic sets provided by CLEF SBS in 2014 (680 topics) and 2016 (120 topics). We selected the title and
narrative fields for each topic. In the beginning, we used the techniques described in Section 4.1 to remove the
stop-words and keep the appropriate words in the query. Secondly, the reduced query was expanded by adding new
terms using ART. In this step, we applied the CHARM algorithm with the following parameters: minsup = 15 and
minconf = 0.7. Then, we used the similar books for each topic and applied the pseudo-relevance feedback technique
to expand the query again. The Rocchio function was used with its default parameter setting (0.4), and the number
of terms selected from each similar book was set to 10. Table 6 describes an example of both approaches, based on
query reduction and on query expansion using association rules between terms.
Original Query Does anyone know of a good book on the Battle of Gazala?
Reduced Query good book battle gazala
Query Expansion using ART good book battle gazala / military history gazala war attack army
Table 6. Examples of the reduction and expansion approaches for handling verbose queries
5.3. Experimental results
We first compare our baseline retrieval results with results from the different expansion strategies, which are shown
in Table 4 and Table 5, where the columns RQ, QEART and QEEB represent the results obtained by the reduced
query, the reduced query expanded using association rules, and the expansion (QEART) using example books, respectively. As
can be seen from the two tables, the proposed expansion strategies improve the baseline to some extent. We show that the query reduction technique performs
better than the baseline on the two sets of topics. We also show that when applying the query expansion technique using
pseudo-relevance feedback, the results are better across all the sets of topics. In terms of NDCG@10, the results increase
from 0.1429 to 0.1518 on the 2014 set and from 0.1549 to 0.1688 on the 2016 set. From an overall perspective,
the best performance is obtained by the QEART strategy, with the greatest improvement of the score, 24.92%, on the
2016 set.
11. P. Bellot (AMU-CNRS, LSIS-OpenEdition)
For the run that wins in 2016 (Official run), the authors proposed a search framework which builds at any moment a reading list for
any specific topic, where the relevance between topics and books, the quality of the books, the timeliness of their popularity and the
diversity of the results are respectively embedded into vector representations based on user-generated content and statistics
from social media. The obtained evaluation results also show that our proposed approaches offer interesting results.
Run | NDCG@10 | MAP
Our run | 0.1518 | 0.1194
Official run | 0.1420 | 0.102
Medium run | 0.096 | 0.068
Worst run | 0.010 | 0.007
Table 7. Comparison results on Social Book Search 2014.
Run | NDCG@10 | MAP
Our run | 0.1688 | 0.1054
Official run | 0.2157 | 0.1253
Medium run | 0.0861 | 0.0524
Worst run | 0.0018 | 0.0004
Table 8. Comparison results on Social Book Search 2016.
However, we noticed that QEART worked well in the reviews also; this is justified by the fact that the association
rules allowed us to find the terms having a strong correlation with the query's terms.
Lastly, to further the effectiveness analysis, we present a gain and failure analysis of our approach. Table 9 presents the
percentages of queries R+ and R− for which the QE techniques perform better, or lower/equal, than the different baselines
in terms of NDCG@10. As depicted in Table 9, the average percentage for the set of queries R+ is about 67.40%
for the SBS 2014 collection and 66.11% for the SBS 2016 collection. The highest percentage of R+ queries is reached
when we combine the ART technique with PRF for QE. These results confirm the effectiveness of using the
association rules as well as PRF in query expansion, as shown in the literature.
Run | QEART | QEEB
SBS 2014 collection: R+ | 62.55 | 72.24
SBS 2014 collection: R− | 37.45 | 29.16
SBS 2016 collection: R+ | 61.39 | 70.83
SBS 2016 collection: R− | 38.61 | 29.17
Table 9. Percentage of queries R+ and R− for each set of queries (better or lower/equal than the different baselines in terms of NDCG@10).
(QEART with pseudo relevance feedback)
[Slide: page excerpt from J. Bobadilla et al., "Recommender systems survey", Knowledge-Based Systems 46 (2013) 109–132, presenting recommender-system evaluation: precision, recall and F1 of the recommendation lists; rank measures, namely half-life (which assumes an exponential decrease of user interest down the recommendation list) and discounted cumulative gain (with logarithmic decay); novelty and diversity measures, for which no standard currently exists; and the reliability of predictions (a prediction of 4.5 over 5 is much more reliable when obtained from 200 similar users than from only two).]
12. P. Bellot (AMU-CNRS, LSIS-OpenEdition)
Conclusion
— Dealing with long and verbose natural language queries for information retrieval is still
an open problem
— For Book Search : using social information (reviews) and metadata can efficiently
replace the content of the books
— Combining Query Reduction / Expansion is useful and even mandatory for long queries
— Association Rules are efficient and effective
— Perspectives :
— Retrieving and analysing book reviews (aspect based sentiment analysis)
[Slide: review classification experiments. A naive Bayes classifier (Savoy, 2010) is used as baseline; the classifier chooses between two classes, Review and non-Review, taking the class with the maximum probability. Three indexing schemes are compared: (1) bag of words, where all words are features; (2) feature selection based on the normalized z-score, keeping the first 1000 words according to this score after removing all words that appear less than 5 times; (3) features based on the named-entity distribution in the text.]

Table 4: Performances of the classification models using different indexing schemes on the test set. R/P/F-M are given first for the Review class, then for the non-Review class; the best values for each class were highlighted in the original.
# | Model | Review R / P / F-M | non-Review R / P / F-M
1 | NB | 65.5% / 81.5% / 72.6% | 81.6% / 65.7% / 72.8%
1 | SVM (Linear) | 99.6% / 98.3% / 98.9% | 97.9% / 99.5% / 98.7%
1 | SVM (RBF, C = 5.0, 0.00185) | 89.8% / 97.2% / 93.4% | 96.8% / 88.5% / 92.5%
2 | NB | 90.6% / 64.2% / 75.1% | 37.4% / 76.3% / 50.2%
2 | SVM (Linear) | 87.2% / 81.3% / 84.2% | 75.3% / 82.7% / 78.8%
2 | SVM (RBF, C = 32.0, 0.00781) | 87.2% / 86.5% / 86.8% | 83.1% / 84.0% / 83.6%
3 | NB | 80.0% / 68.4% / 73.7% | 54.2% / 68.7% / 60.6%
3 | SVM (Linear) | 77.0% / 81.9% / 79.4% | 78.9% / 73.5% / 76.1%
3 | SVM (RBF, C = 8.0, 0.03125) | 81.2% / 48.6% / 79.9% | 72.6% / 75.8% / 74.1%
Forthcoming: NLDB conference (Paris, 2018)
LREC 2014
IEEE-ACM WI 2018
https://lab.hypotheses.org
You can follow us / participate: