Contextualised Browsing in a Digital Library’s Living Lab
Zeljko Carevic, Sascha Schüller, Philipp Mayr, Norbert Fuhr
JCDL 2018
Introduction
 Exploratory search (especially browsing/stratagem search) is one of the most frequent search activities in DLs [1-3]
 DLs offer high-quality structured metadata that can be utilised for browsing, e.g.:
 Keywords
 Classifications
 Journals
 System support on this level is rather low, e.g.:
 Browsing a DL by keywords acts as a simple Boolean filter
[Figure: example session "violence and sports", illustrating losing the context when browsing via stratagems]
Contextualised Browsing
 Implement contextualised browsing that tailors search results to the previous search activities of the current user.
 Introduce two contextual re-ranking features:
 Document similarity
 Session context
Contextualised Browsing
[Figure: each interaction leads to a new result list sharing an attribute with the seed document; we re-rank these result lists based on contextual information]
Research Question
 Can we improve the effectiveness of exploratory search on the level of browsing by using contextual ranking features, in comparison to a non-contextual ranking feature?
Approach A: Baseline
 Default ranking based on a query expansion including synonyms and translations (sketched below).
 Browsing is not contextualised.
Q = expanded query, e.g. keyword: "sport"
D = set of documents
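A minimal sketch of what such a query expansion could look like. The synonym/translation tables and the field name `keyword` are illustrative assumptions, not Sowiport's actual resources:

```python
# Hypothetical sketch of the baseline: expand a keyword with synonyms and
# translations, then build a Boolean OR query over the keyword field.
SYNONYMS = {"sport": ["sports", "athletics"]}   # assumed lookup table
TRANSLATIONS = {"sport": ["Sport"]}             # e.g. a German translation

def expand_keyword(keyword):
    terms = [keyword] + SYNONYMS.get(keyword, []) + TRANSLATIONS.get(keyword, [])
    unique = list(dict.fromkeys(terms))         # deduplicate, keep order
    return "keyword:(" + " OR ".join(f'"{t}"' for t in unique) + ")"

print(expand_keyword("sport"))
# keyword:("sport" OR "sports" OR "athletics" OR "Sport")
```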
Approach B: Document Similarity
 Re-rank documents according to their similarity to the seed document.
 To measure the similarity between two documents we employ Solr's "More Like This" query parser (sketched below).
Q = expanded query, e.g. keyword: "sport"
D = set of documents
D_s = seed document
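A hedged sketch of this re-ranking via Solr's "More Like This" (MLT) query parser. The Solr URL, core name, and schema field names are assumptions; the fields follow the speaker notes (keywords, journal information, abstract, author names):

```python
# Sketch: re-rank a browsing result list by MLT similarity to the seed
# document. Endpoint and field names are assumed, not Sowiport's schema.
import requests

SOLR = "http://localhost:8983/solr/sowiport/select"  # hypothetical endpoint

def mlt_scores(seed_doc_id, rows=100):
    """Fetch similarity scores for documents similar to the seed document."""
    params = {
        "q": f"{{!mlt qf=keywords,journal,abstract,authors}}{seed_doc_id}",
        "fl": "id,score",
        "rows": rows,
    }
    docs = requests.get(SOLR, params=params).json()["response"]["docs"]
    return {d["id"]: d["score"] for d in docs}

def rerank_by_similarity(result_ids, seed_doc_id):
    """Reorder the result list; documents unknown to MLT keep score 0."""
    scores = mlt_scores(seed_doc_id)
    return sorted(result_ids, key=lambda i: scores.get(i, 0.0), reverse=True)
```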
Approach C: Session Context
 Re-rank documents based on previous search activities -> session context
Q = expanded query, e.g. keyword: "sport"
D = set of documents
U_c = session context
Approach C: Session Context
 The session context contains information about (see the sketch below):
 Submitted queries (e.g. "violence" and "violence and sports")
 The set of keywords and classifications that were contained in seen documents and in documents within a result set
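A minimal sketch of Approach C under simplifying assumptions: the context is a bag of query terms, keywords, and classifications, and documents are scored by weighted overlap. The exact scoring model is not given in the slides; this is illustrative only:

```python
# Sketch of session-context re-ranking (Approach C). The overlap-count
# scoring below is an illustrative assumption, not the paper's exact model.
from collections import Counter

class SessionContext:
    def __init__(self):
        self.terms = Counter()  # query terms, keywords, classifications

    def observe(self, query_terms, documents):
        """Update the context from a submitted query and seen/result documents."""
        self.terms.update(query_terms)
        for doc in documents:
            self.terms.update(doc.get("keywords", []))
            self.terms.update(doc.get("classifications", []))

    def score(self, doc):
        """Weighted overlap between document attributes and the context."""
        attrs = doc.get("keywords", []) + doc.get("classifications", [])
        return sum(self.terms[a] for a in attrs)

def rerank_by_context(result_docs, context):
    return sorted(result_docs, key=context.score, reverse=True)
```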
Experiment
 For a period of 3 months, each Sowiport user is assigned one approach at the beginning of a session (assignment sketched below):
 A: Baseline (non-contextualised)
 B: Document similarity (contextualised)
 C: Session context (contextualised)
Sowiport, a DL for the Social Sciences, as a living lab: 9.5 million documents, 20,000 unique users per week
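The slides do not specify the assignment mechanism; a plausible sketch is a deterministic per-session assignment, so that every interaction within a session uses the same approach:

```python
# Hypothetical per-session assignment for the living-lab experiment.
# Uniform random choice seeded by the session id is an assumption.
import random

APPROACHES = ["A_baseline", "B_document_similarity", "C_session_context"]

def assign_approach(session_id: str) -> str:
    # Seeding with the session id keeps the choice stable for the whole session.
    return random.Random(session_id).choice(APPROACHES)
```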
Methodology
 Measure the effectiveness of our contextualised ranking features on two levels:
 Mean First Relevant (MFR): the mean rank of the first clicked document in a result set [4] (sketched below)
 Usefulness [5]
 Local usefulness: the immediate relevance of a document
 Global usefulness: the total number of implicit relevance signals for the entire session, starting from stratagem usage
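A small sketch of the MFR computation, assuming 1-based click ranks per result list (lower MFR is better):

```python
# Mean First Relevant (MFR) [4]: mean rank of the first clicked document
# per result list. The input format (lists of 1-based click ranks) is assumed.
def mean_first_relevant(click_ranks_per_list):
    first_ranks = [min(ranks) for ranks in click_ranks_per_list if ranks]
    return sum(first_ranks) / len(first_ranks)

# Example: first clicks at ranks 3, 1, and 5 give an MFR of 3.0.
print(mean_first_relevant([[3, 7], [1], [5, 6, 9]]))
```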
Results
 ~600,000 sessions in total
 Equal distribution for:
 Total stratagem usage
 Interactions per session
 Dwell time
 Document views from stratagem search notably higher for the contextualised approaches
Results: Mean first relevant
 Baseline significantly outperformed by both contextual re-ranking features (significance tested at Bonferroni-corrected p* = 0.016)
 Document similarity performs best.
 As result sets might contain only a few documents, we additionally measure MFR for result lists with ≥ 20 documents
 MFR increases for all approaches under this restriction
Results: Mean first relevant with different history sizes (HS)
 History size is defined as the number of interactions prior to stratagem search.
 MFR increases with growing HS
 Effect most evident for the baseline
 HS has the lowest effect on approach C
 The better the session context, the better the re-ranking
 Sample size rather low.
 Approach C highly depends on the number of interactions resulting in a more meaningful context -> cold-start problem
Results: Usefulness
 Similar observations as for MFR:
 Baseline outperformed by both contextualised approaches
 Document similarity performs best.
 Global usefulness only marginally different
Results (Summary)
 Document views from stratagem search notably higher for the contextualised approaches
 Both contextual ranking features outperform the baseline in terms of MFR.
 Document similarity performs best, esp. for short sessions
 Performance of the session context increases with growing history sizes
 In terms of usefulness, the re-ranking based on document similarity performs best.
 Differences in session-related features like dwell time could not be found.
Strengths and Limitations
 Pros
 Real-life environment with real users
 Large sample of online users
 Strong indication of a need for contextual ranking features
 Cons
 No information about the relevance of the clicked documents
 Users are not aware of the re-ranking and thus not able to tune the results
Outlook
 Evaluate contextualisation in a controlled environment.
 Gather information about the explicit relevance of clicked documents
 Introduce a transparent re-ranking interface that enables users to tune the ranking (e.g. disable contextualisation)
 Implement more sophisticated re-ranking approaches, e.g.:
 Mouse tracking
 Collaborative contextualisation
Conclusion
 Implemented two contextual re-ranking features that rank documents according to:
 Document similarity
 Session context
 Evaluation in a living lab for the Social Sciences
 Contextual ranking significantly outperforms the non-contextualised baseline.
 Contextualisation has an immediate influence on the local usefulness of search results.
References
 [1] Zeljko Carevic, Maria Lusky, Wilko van Hoek, and Philipp Mayr. 2017. Investigating exploratory search activities based on the stratagem level in digital libraries. International Journal on Digital Libraries (2017), 1–21.
 [2] Zeljko Carevic and Philipp Mayr. 2016. Survey on High-level Search Activities based on the Stratagem Level in Digital Libraries. In Proceedings of TPDL 2016. Springer, 54–66.
 [3] Philipp Mayr and Ameni Kacem. 2017. A Complete Year of User Retrieval Sessions in a Social Sciences Academic Search Engine. In Proceedings of TPDL 2017. Springer, 560–565.
 [4] Norbert Fuhr. 2017. Some Common Mistakes In IR Evaluation, And How They Can Be Avoided. Technical Report. University of Duisburg-Essen, Germany.
 [5] Daniel Hienert and Peter Mutschke. 2016. A usefulness-based approach for measuring the local and global effect of IIR services. In Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval. ACM, 153–162.


Editor's Notes

  • #4 Entering a DL via a Google search, finding one good record and quickly losing the context (because of too simple browsing)
  • #6 Each of these interactions (7) leads to a new result list containing documents that share the same attribute with the seed document, which is also part of the result list (8). Our approach is to re-rank these result lists based on contextual information about the user's search sessions.
  • #9 SR is an extension to the default ranking DR. To compute the similarity of all documents to the seed document, we use the keywords, journal information, the abstract (in different languages if available), and the author names of the seed document.
  • #13 MFR is an improvement over MRR (mean reciprocal rank) and RR.