Institut de Recherche en Informatique de Toulouse (IRIT) - UMR 5505




                  Diversity in recommender systems
                  Bridging the gap between users and systems

      Laurent CANDILLIER – Max CHEVALIER – Damien DUDOGNON – Josiane MOTHE




27/10/11
Diversity in recommender systems
  How to recommend documents for a visited one
              Maximizing the chances of retrieving at least one relevant
               document per user [Santos et al., 2010]
              Cover a large range of users’ interests


  Context
     Blog platform
              Unknown user => no profile
              Diversity of users, diversity of their expectations


     => Diversify the recommendations
What is diversity?
  Definitions from the literature
    Topicality
              Related to a particular topic [Xu and Chen, 2006]


        Diversity
              Topical diversity
                Extrinsic: solve ambiguity [Radlinski et al., 2009]

                Intrinsic: avoid redundancy [Clarke et al., 2008]



              Serendipity
                Attractive and surprising documents [Herlocker et al., 2004]



Approaches to diversify IR results
  Topical diversity
     Clustering
              Identify aspects
              Reorder depending on the aspects covered


        Examples
              K-Means [Bi et al., 2009]
              Hierarchical Clustering [Meij et al., 2010]
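
The two clustering steps above (identify aspects, then reorder by the aspects covered) can be sketched as follows. This is a minimal illustration, not the method of the cited systems: it assumes cluster labels were already computed by some clustering step, and the names `reorder_by_cluster`, `ranked` and `clusters` are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def reorder_by_cluster(ranked_docs, cluster_of):
    """Interleave a relevance-ranked list so that every cluster (aspect)
    appears once before any cluster appears a second time."""
    groups = defaultdict(list)
    for doc in ranked_docs:
        # Relevance order is preserved inside each cluster.
        groups[cluster_of[doc]].append(doc)
    reordered = []
    # Round-robin over clusters, in order of each cluster's first appearance.
    for layer in zip_longest(*groups.values()):
        reordered.extend(doc for doc in layer if doc is not None)
    return reordered


# Assumed toy input: a ranked list and a precomputed cluster label per document.
ranked = ["d1", "d2", "d3", "d4", "d5"]
clusters = {"d1": "a", "d2": "a", "d3": "b", "d4": "a", "d5": "c"}
print(reorder_by_cluster(ranked, clusters))
```

Taking only the first round of the interleaving gives one document per cluster.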




Approaches to diversify IR results
  Topical diversity
     Sliding Windows
              Reorder the retrieved documents
              Select documents using metrics
                Similarity with the visited document

                Similarity with the current recommended document list




        Examples
              MMR [Carbonell and Goldstein, 1998]
              Intra-list similarity [Ziegler et al., 2005]
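
A minimal sketch of MMR-style greedy selection, combining the two metrics listed above: similarity with the visited document and similarity with the documents already recommended. The trade-off parameter `lam`, the similarity inputs and the toy values are assumptions made for illustration only.

```python
def mmr(candidates, sim_to_visited, sim_between, k, lam=0.7):
    """Greedy Maximal Marginal Relevance selection.

    sim_to_visited: dict mapping each candidate to its similarity with the visited document.
    sim_between:    function returning the similarity of two candidates.
    lam:            1.0 means pure relevance, lower values penalise redundancy more.
    """
    selected = []
    remaining = list(candidates)

    def score(doc):
        # Redundancy is the highest similarity to anything already selected.
        redundancy = max((sim_between(doc, s) for s in selected), default=0.0)
        return lam * sim_to_visited[doc] - (1 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Assumed toy similarities: "a" and "b" are near-duplicates of each other.
to_visited = {"a": 0.9, "b": 0.85, "c": 0.5}
pair = {frozenset("ab"): 0.95, frozenset("ac"): 0.1, frozenset("bc"): 0.2}
between = lambda x, y: pair[frozenset((x, y))]
print(mmr(["a", "b", "c"], to_visited, between, k=2, lam=0.5))
```

With `lam=0.5` the near-duplicate "b" is skipped in favour of "c"; with `lam=1.0` the selection falls back to plain relevance order.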



Approaches to diversify IR results
  Serendipity
     Alternative to topical diversity
     Similarity not only based on the content


        Examples
              Organizational similarity [Cabanac et al., 2007]
              Temporal diversity [Lathia et al., 2010]




Analysis of the TREC Web 2009 results
  Hypothesis
    Diversity of approaches
              No single approach fits all users’ needs
              Approaches are complementary
              Valuable to combine them


  Goals
     Analyse results obtained with approaches having
            Same goal
            Similar performances

           => To identify if diversity exists

Analysis of the TREC Web 2009 results
  Experimental framework
     Reference IR corpus (TREC Web 2009)


        Two IR contexts
          Adhoc task

          Diversity task



        Compare results (runs) of the 4 best approaches of each task
          Similar performances according to IR metrics

             MAP for adhoc task

             NDCG for diversity task

          Overlap for each pair of runs, as an indicator of diversity
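
The overlap measure is not defined on the slide. The sketch below assumes it is the share of documents common to the top-k of two runs for one topic; in a TREC setting it would then be averaged over topics. The function names and the toy runs are hypothetical.

```python
from itertools import combinations


def overlap_at_k(run_a, run_b, k=10):
    """Share of documents common to the top-k of two ranked lists (assumed definition)."""
    return len(set(run_a[:k]) & set(run_b[:k])) / k


def pairwise_overlaps(runs, k=10):
    """Overlap for every pair of runs; `runs` maps a run name to its ranked list."""
    return {
        (a, b): overlap_at_k(runs[a], runs[b], k)
        for a, b in combinations(sorted(runs), 2)
    }


# Assumed toy runs for a single topic, cut at k=4 for readability.
runs = {
    "run1": ["d1", "d2", "d3", "d4"],
    "run2": ["d3", "d4", "d5", "d6"],
    "run3": ["d7", "d8", "d9", "d1"],
}
overlaps = pairwise_overlaps(runs, k=4)
print(overlaps)
print(sum(overlaps.values()) / len(overlaps))  # mean overlap across pairs
```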


Analysis of the TREC Web 2009 results
  Adhoc Task




    Top 10 documents
      Overlap: 22.4%

      Precision: 0.384



    Overlap max < 30%




Analysis of the TREC Web 2009 results
  Diversity Task




    Top 10 documents
      Overlap: 6.3%



    Overlap max < 15%




Analysis of the TREC Web 2009 results
  Conclusions
     Two distinct approaches are unlikely to return the same
      (relevant) documents
              Low average overlap

        Diversity of approaches
          No approach significantly better than others
          A combination can be valuable



        TREC tasks focused on topicality and topical diversity
          Cannot be used to evaluate other types of diversity
          A user study is necessary [Hayes et al., 2002]


Users’ Study
  Our intuitions
    Most of the time, users want topicality
              Get focused information


        Sometimes, they want diversity
              Topical diversity
                Enlarge the subject

              Serendipity
                Discover new information




Users’ Study
  Goals
     Verify our intuitions
     Prove that diversified recommendations answer a larger
      range of users’ needs

  Context of experimentation
     34 M.Sc. students (Management field)
     Blog post recommendations




Users’ Study
  Experimental Framework
     Select a document




Users’ Study
  Experimental Framework
     Read the selected document




Users’ Study
  Experimental Framework
     Compute the recommendation lists


        [Diagram: Approach 1 to Approach 5 on one side; List 1 (random) and List 2 (fused) on the other]
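
The slides do not say how List 2 (fused) is built. As one possible reading, the sketch below fuses the per-approach lists by round-robin interleaving with duplicate removal; this fusion rule, the function name and the toy lists are assumptions, not the authors' stated procedure.

```python
def fuse_round_robin(lists_by_approach, size):
    """Interleave the ranked lists of several approaches, skipping duplicates,
    until `size` documents have been collected."""
    if size <= 0:
        return []
    fused, seen = [], set()
    depth = max((len(docs) for docs in lists_by_approach.values()), default=0)
    for rank in range(depth):
        # Take the rank-th document of each approach in turn.
        for docs in lists_by_approach.values():
            if rank < len(docs) and docs[rank] not in seen:
                seen.add(docs[rank])
                fused.append(docs[rank])
                if len(fused) == size:
                    return fused
    return fused


# Assumed toy output of the five approaches for one visited document.
lists = {
    "searchsim": ["d1", "d2"],
    "mlt": ["d1", "d3"],
    "kmeans": ["d4", "d5"],
    "blogart": ["d6"],
    "topcateg": ["d7", "d2"],
}
print(fuse_round_robin(lists, size=5))
```

Interleaving by rank gives every approach a chance to contribute before any approach contributes twice, which is one simple way to keep the fused list varied.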
Users’ Study
  Experimental Framework
     Present recommendation lists for assessment
                   Which list best meets your needs?




Users’ Study
  Experimental Framework
     Present recommendation lists for assessment
                   Which list is the most diversified?




Users’ Study
  Experimental Framework
     Assessment of all documents




Users’ Study
  Approaches used
     Topicality
        searchsim
              Vector-space model
              Document title as query
        mlt
              Apache Solr MoreLikeThis module
              Document content as query
     Topical diversity
        kmeans
              K-means clustering
              One element per cluster
     Serendipity
        blogart
              Random selection from the same blog
        topcateg
              Popular documents in the same category
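
To illustrate the searchsim idea (vector-space model, document title as query), here is a minimal sketch using raw term-frequency vectors and cosine similarity. The weighting scheme actually used is not given on the slides, so the plain term counts, the whitespace tokenisation and all names below are assumptions.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (lower-cased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def rank_by_title(title, corpus, k=5):
    """Use the visited document's title as the query against the other posts."""
    query = tf_vector(title)
    scored = [(cosine(query, tf_vector(text)), doc_id) for doc_id, text in corpus.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by id
    return [doc_id for score, doc_id in scored[:k] if score > 0]


# Assumed toy corpus of blog posts.
corpus = {
    "p1": "diversity in recommender systems",
    "p2": "cooking pasta at home",
    "p3": "recommender systems for blogs",
}
print(rank_by_title("recommender systems", corpus))
```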
Users’ Study
  Approaches used



    Same analysis as for the TREC experiments
      Same results

      Overlap is low (< 10%)

     => High diversity




Users’ Study
     Results
        Distribution of relevant documents
        [Charts: one per approach, each compared with the fused list; the three shares in each chart sum to 100%]
          Approach     Share under approach label   Share under "fused" label   Unlabelled share between the two
          blogart      35%                          65%                         0%
          kmeans       52.5%                        21.3%                       26.2%
          mlt          54.7%                        32.8%                       12.5%
          searchsim    52.4%                        38.9%                       8.7%
          topcateg     8.8%                         91.2%                       0%

Users’ Study
  Results
     Distribution of relevant documents
              Relevant documents are mainly retrieved by topical approaches
              But at least 20% are retrieved only by the fused list


        The fused list matches a larger range of needs
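
A distribution of relevant documents between one approach and the fused list can be computed as sketched below. The sketch assumes the relevant documents are split into three disjoint groups (retrieved only by the approach, only by the fused list, or by both); this reading and all names are assumptions, since the slides do not spell out the computation.

```python
def relevant_split(relevant, approach_list, fused_list):
    """Percentage of retrieved relevant documents found only by the approach,
    only by the fused list, or by both."""
    by_approach = set(approach_list) & set(relevant)
    by_fused = set(fused_list) & set(relevant)
    found = by_approach | by_fused
    if not found:
        return {"approach_only": 0.0, "fused_only": 0.0, "both": 0.0}
    total = len(found)
    return {
        "approach_only": 100 * len(by_approach - by_fused) / total,
        "fused_only": 100 * len(by_fused - by_approach) / total,
        "both": 100 * len(by_approach & by_fused) / total,
    }


# Assumed toy judgments and lists for one visited document.
relevant = {"d1", "d2", "d3", "d4"}
split = relevant_split(relevant, ["d1", "d2", "d9"], ["d2", "d3", "d4", "d8"])
print(split)
```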




Conclusions and future work
  Conclusions
     Diversity of users’ expectations
              No one approach to rule them all
              A diversity of approaches
                Complementary

                Fused




        Diversity helps RS to fit more users’ needs




Conclusions and future work
  Future work
     Real-scale experiment
              OverBlog platform


        Repeat the user study
              More users (international call for participation)
              Avoid the biases revealed by this study
                e.g. a more detailed form

               => Deeper analysis




Conclusions and future work
  Future work
     Improve the model
              Refining the fusing process
              Adding a learning process to weight each approach
                For every visited document

                   Find the proportion of documents coming from each
                    approach (log analysis)
                Better match with the real users’ needs
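
The log-analysis step above could look like the following sketch: for every visited document, count which approach produced each clicked recommendation and turn the counts into proportions usable as weights. The log format (pairs of visited document and approach name) and the function name are assumptions.

```python
from collections import Counter, defaultdict


def approach_weights(click_log):
    """click_log: iterable of (visited_doc, approach_name) pairs, one per clicked
    recommendation. Returns, per visited document, the proportion of clicks
    attributed to each approach."""
    counts = defaultdict(Counter)
    for visited, approach in click_log:
        counts[visited][approach] += 1
    return {
        visited: {name: n / sum(per_approach.values()) for name, n in per_approach.items()}
        for visited, per_approach in counts.items()
    }


# Assumed toy click log.
log = [("post1", "mlt"), ("post1", "mlt"), ("post1", "blogart"), ("post2", "topcateg")]
weights = approach_weights(log)
print(weights)
```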




Thank you for your attention

                               Questions?




References
 W. Bi, X. Yu, Y. Liu, F. Guan, Z. Peng, H. Xu, and X. Cheng, “ICTNET at Web Track 2009 diversity task”, Text REtrieval Conf., 2009

 G. Cabanac, M. Chevalier, C. Chrisment, and C. Julien, “An Original Usage-based Metrics for Building a Unified View of Corporate Documents”, Inter. Conf. on Database and Expert Systems Applications, LNCS V. 4653, 2007, pp. 202–212

 J. Carbonell and J. Goldstein, “The use of MMR, diversity-based reranking for reordering documents and producing summaries”, ACM Conf. on Research and Development in Information Retrieval, 1998, pp. 335–336

 C. L. A. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon, “Novelty and Diversity in Information Retrieval Evaluation”, ACM Conf. on Research and Development in Information Retrieval, 2008, pp. 659–666

 C. Hayes, P. Massa, P. Avesani, and P. Cunningham, “An online evaluation framework for recommender systems”, Workshop on Personalization and Recommendation in E-Commerce, 2002

 J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl, “Evaluating Collaborative Filtering Recommender Systems”, ACM Trans. Information Systems, 22(1), 2004, pp. 5–53

 N. Lathia, S. Hailes, L. Capra, and X. Amatriain, “Temporal diversity in recommender systems”, ACM Conf. on Research and Development in Information Retrieval, 2010, pp. 210–217

 E. Meij, J. He, W. Weerkamp, and M. de Rijke, “Topical Diversity and Relevance Feedback”, Text REtrieval Conf., 2010

 F. Radlinski, P. N. Bennett, B. Carterette, and T. Joachims, “Redundancy, diversity and interdependent document relevance”, SIGIR Forum, 43(2), 2009, pp. 46–52

 R. L. T. Santos, C. Macdonald, and I. Ounis, “Selectively Diversifying Web Search Results”, ACM Inter. Conf. on Information and Knowledge Management, 2010

 Y. C. Xu and Z. Chen, “Relevance judgment: What do information users consider beyond topicality”, Journal of the American Society for Information Science and Technology, 57(7), 2006, pp. 961–973

 C. Ziegler, S. McNee, J. A. Konstan, and G. Lausen, “Improving recommendation lists through topic diversification”, Inter. Conf. on World Wide Web, 2005, pp. 22–32

More Related Content

Similar to Diversity in recommender systems - Bridging the gap between users and systems

Ullmann
UllmannUllmann
Ullmann
anesah
 
Ontology for Research in Distance, Open and Online Learning
Ontology for Research in Distance, Open and Online LearningOntology for Research in Distance, Open and Online Learning
Ontology for Research in Distance, Open and Online Learning
Sanjaya Mishra
 

Similar to Diversity in recommender systems - Bridging the gap between users and systems (20)

Staged Models for Interdisciplinary Research
Staged Models for Interdisciplinary ResearchStaged Models for Interdisciplinary Research
Staged Models for Interdisciplinary Research
 
Serve-Learn-Sustain's Linked Courses for Trandisciplinary Learning
Serve-Learn-Sustain's Linked Courses for Trandisciplinary LearningServe-Learn-Sustain's Linked Courses for Trandisciplinary Learning
Serve-Learn-Sustain's Linked Courses for Trandisciplinary Learning
 
Interactive Recommender Systems
Interactive Recommender SystemsInteractive Recommender Systems
Interactive Recommender Systems
 
Visual Analysis of Topic Competition on Social Media
Visual Analysis of Topic Competition on Social Media Visual Analysis of Topic Competition on Social Media
Visual Analysis of Topic Competition on Social Media
 
Ullmann
UllmannUllmann
Ullmann
 
1_Q2-PRACTICAL-RESEARCH.pptx
1_Q2-PRACTICAL-RESEARCH.pptx1_Q2-PRACTICAL-RESEARCH.pptx
1_Q2-PRACTICAL-RESEARCH.pptx
 
metodos qualitativos
metodos qualitativosmetodos qualitativos
metodos qualitativos
 
Interactive recommender systems: opening up the “black box”
Interactive recommender systems: opening up the “black box”Interactive recommender systems: opening up the “black box”
Interactive recommender systems: opening up the “black box”
 
Designing work integrated assessment- tools & techniques for creating 'authe...
Designing work integrated assessment- tools & techniques for creating  'authe...Designing work integrated assessment- tools & techniques for creating  'authe...
Designing work integrated assessment- tools & techniques for creating 'authe...
 
Jan Reichelt Mendeley
Jan Reichelt MendeleyJan Reichelt Mendeley
Jan Reichelt Mendeley
 
Professor Marcia Devlin: "Learning Theories and Interdisciplinary Epistemolog...
Professor Marcia Devlin: "Learning Theories and Interdisciplinary Epistemolog...Professor Marcia Devlin: "Learning Theories and Interdisciplinary Epistemolog...
Professor Marcia Devlin: "Learning Theories and Interdisciplinary Epistemolog...
 
Design based for lisbon 2011
Design based for lisbon 2011Design based for lisbon 2011
Design based for lisbon 2011
 
Ontology for Research in Distance, Open and Online Learning
Ontology for Research in Distance, Open and Online LearningOntology for Research in Distance, Open and Online Learning
Ontology for Research in Distance, Open and Online Learning
 
Navigation Support for Learners in Informal Learning Environments, Recommende...
Navigation Support for Learners in Informal Learning Environments, Recommende...Navigation Support for Learners in Informal Learning Environments, Recommende...
Navigation Support for Learners in Informal Learning Environments, Recommende...
 
Bell.bolouri
Bell.bolouriBell.bolouri
Bell.bolouri
 
these_15-9
these_15-9these_15-9
these_15-9
 
Doing things differently: Re-evaluating our role in participatory research
Doing things differently: Re-evaluating our role in participatory researchDoing things differently: Re-evaluating our role in participatory research
Doing things differently: Re-evaluating our role in participatory research
 
EU4ALL @ UKOU
EU4ALL @ UKOUEU4ALL @ UKOU
EU4ALL @ UKOU
 
Chapter eight (1)
Chapter eight (1)Chapter eight (1)
Chapter eight (1)
 
Mapping the Terrain of Design Thinking: Pedagogies & Outcomes
Mapping the Terrain of Design Thinking: Pedagogies & OutcomesMapping the Terrain of Design Thinking: Pedagogies & Outcomes
Mapping the Terrain of Design Thinking: Pedagogies & Outcomes
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 

Diversity in recommender systems - Bridging the gap between users and systems

  • 1. Institut de Recherche en Informatique de Toulouse (IRIT) - UMR 5505 Bridging the gap between users and systems Laurent CANDILLIER – Max CHEVALIER – Damien DUDOGNON – Josiane MOTHE 27/10/11
  • 2. Diversity in recommender systems  How to recommend documents for a visited one  Maximizing the chances of retrieving at least one relevant document per user [Santos et al., 2010]  Cover a large range of users’ interests  Context  Blog platform  Unknown user => no profile  Diversity of users, diversity of their expectations 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 2
  • 3. Diversity in recommender systems  How to recommend documents for a visited one  Maximizing the chances of retrieving at least one relevant document per user [Santos et al., 2010]  Cover a large range of users’ interests  Context  Blog platform  Unknown user => no profile  Diversity of users, diversity of their expectations => Diversify the recommendations 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 3
  • 4. What is diversity?  Definitions from the literature  Topicality  Related to a particular topic [Xu and Chen, 2006]  Diversity  Topical diversity  Extrinsic: solve ambiguity [Radlinski et al., 2009]  Intrinsic: avoid redundancy [Clarke et al., 2008]  Serendipity  Attractive and surprising documents [Herlocker et al., 2004] 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 4
  • 5. Approaches to diversify IR results  Topical diversity  Clustering  Identify aspects  Reorder depending on the aspects covered  Examples  K-Means [Bi et al., 2009]  Hierarchical Clustering [Meij et al., 2010] 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 5
  • 6. Approaches to diversify IR results  Topical diversity  Sliding Windows  Reorder the retrieved documents  Select documents using metrics  Similarity with the visited document  Similarity with the current recommended document list  Examples  MMR [Carbonell and Goldstein, 1998]  Intra-list similarity [Ziegler et al., 2005] 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 6
  • 7. Approaches to diversify IR results  Serendipity  Alternative to topical diversity  Similarity not only based on the content  Examples  Organizational similarity [Cabanac et al., 2007]  Temporal diversity [Lathia et al., 2010] 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 7
  • 8. Analysis of the TREC Web 2009 results  Hypothesis  Diversity of approaches  No one approach for all users’ needs  Approaches are complementary  Valuable to combine them  Goals  Analyse results obtained with approaches having  Same goal  Similar performances => To identify if diversity exists 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 8
  • 9. Analysis of the TREC Web 2009 results  Experimental framework  Reference IR corpus (TREC Web 2009)  Two IR contexts  Adhoc task  Diversity task  Compare results (runs) of the 4 best approaches of each task  Similar performances according to IR metrics  MAP for adhoc task  NDCG for diversity task  Overlap for each pair of runs underlying diversity 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 9
  • 10. Analysis of the TREC Web 2009 results  Adhoc Task  Top 10 documents  Overlap: 22.4%  Precision: 0.384  Overlap max < 30% 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 10
  • 11. Analysis of the TREC Web 2009 results  Diversity Task  Top 10 documents  Overlap: 6.3%  Overlap max < 15% 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 11
  • 12. Analysis of the TREC Web 2009 results  Conclusions  Two distinct approaches are unlikely to return the same (relevant) documents  Low average overlap  Diversity of approaches  No approach significantly better than others  A combination can be valuable  TREC tasks focused on topicality and topical diversity  Can’t be used to evaluate other types of diversity  Users’ study necessary [Hayes et al., 2002] 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 12
  • 13. Users’ Study  Our intuitions  Most of the time, users want topicality  Get focused information  Sometime, they want diversity  Topical diversity  Enlarge the subject  Serendipity  Discover new information 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 13
  • 14. Users’ Study  Goals  Verify our intuitions  Prove that diversified recommendations answer a larger range of users’ needs  Context of experimentation  34 students in M. Sc. (Management field)  Blog post recommendations 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 14
  • 15. Users’ Study  Experimental Framework  Select a document 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 15
  • 16. Users’ Study  Experimental Framework  Read the selected document 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 16
  • 17. Users’ Study  Experimental Framework  Compute the recommendation lists Approach 1 List 1 (random) Approach 2 Approach 3 Approach 4 List 2 (fused) Approach 5 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M. 17
  • 22. Users’ Study
      Experimental Framework
        Present recommendation lists for assessment
          Which list best meets your needs?
  • 23. Users’ Study
      Experimental Framework
        Present recommendation lists for assessment
          Which list is the most diversified?
  • 24. Users’ Study
      Experimental Framework
        Assessment of all documents
  • 25. Users’ Study
      Approaches used: topicality
        searchsim
          Vector-space model
          Document title as query
        mlt
          Apache Solr MoreLikeThis module
          Document content as query
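A minimal sketch of the searchsim idea (vector-space model, document title as query), assuming TF-IDF weighting with smoothed IDF and cosine similarity. The slides do not give the weighting scheme, so these choices, the function names and the toy posts are all assumptions.

```python
import math
from collections import Counter


def build_idf(corpus_tokens):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(corpus_tokens)
    df = Counter(term for tokens in corpus_tokens for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def vectorize(tokens, idf):
    """Sparse TF-IDF vector; terms unknown to the corpus are dropped."""
    tf = Counter(tokens)
    return {term: freq * idf[term] for term, freq in tf.items() if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(title, posts, k):
    """Rank posts against the visited post's title and keep the top k matches."""
    tokens = {pid: text.lower().split() for pid, text in posts.items()}
    idf = build_idf(list(tokens.values()))
    query = vectorize(title.lower().split(), idf)
    scored = [(cosine(query, vectorize(toks, idf)), pid) for pid, toks in tokens.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for score, pid in scored[:k] if score > 0]


# Hypothetical candidate posts (the visited post itself is assumed excluded).
posts = {
    "p1": "diversity in recommender systems",
    "p2": "cooking pasta at home",
    "p3": "evaluating recommender systems with users",
}
print(recommend("recommender systems diversity", posts, k=2))
```

The mlt approach differs mainly in its query: it uses the document content rather than the title, and delegates term selection and scoring to Solr.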
  • 26. Users’ Study
      Approaches used: topical diversity
        kmeans
          K-means clustering
          One element per cluster
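The "one element per cluster" step can be sketched as follows, assuming cluster labels are already available from any k-means implementation and that the best-ranked document of each cluster is kept. The slide does not say which element is chosen, so that rule, the function name and the data are assumptions.

```python
def one_per_cluster(ranked_docs, cluster_of):
    """Keep the first (best-ranked) document seen for each cluster."""
    picked, used_clusters = [], set()
    for doc in ranked_docs:
        cluster = cluster_of[doc]
        if cluster not in used_clusters:
            used_clusters.add(cluster)
            picked.append(doc)
    return picked


# Hypothetical similarity ranking and k-means labels (k = 3).
ranked_docs = ["d1", "d2", "d3", "d4", "d5"]
cluster_of = {"d1": 0, "d2": 0, "d3": 2, "d4": 1, "d5": 2}
print(one_per_cluster(ranked_docs, cluster_of))
```

Each cluster stands for one aspect of the topic, so the resulting list covers every aspect once instead of repeating the dominant one.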
  • 27. Users’ Study
      Approaches used: serendipity
        blogart
          Random selection from the same blog
        topcateg
          Popular documents in the same category
  • 28. Users’ Study
      Approaches used
        Same analysis as in the TREC experiments
        Same results
          Overlap is low (< 10%) => High diversity
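A sketch of how such an overlap figure could be computed. The overlap measure is not defined on this slide, so the version below assumes the number of shared documents divided by the size of the shorter list, averaged over all pairs of approaches. Function names and lists are hypothetical.

```python
from itertools import combinations


def overlap(list_a, list_b):
    """Share of documents common to two lists (assumed definition)."""
    a, b = set(list_a), set(list_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def mean_pairwise_overlap(lists_by_approach):
    """Average overlap over every pair of approaches."""
    pairs = list(combinations(lists_by_approach.values(), 2))
    if not pairs:
        return 0.0
    return sum(overlap(a, b) for a, b in pairs) / len(pairs)


# Hypothetical top-5 lists for one visited document.
lists_by_approach = {
    "searchsim": ["d1", "d2", "d3", "d4", "d5"],
    "kmeans": ["d5", "d6", "d7", "d8", "d9"],
    "topcateg": ["d10", "d11", "d12", "d13", "d14"],
}
mean_overlap = mean_pairwise_overlap(lists_by_approach)
print(f"{mean_overlap:.1%}")
```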
  • 29. Users’ Study
      Results
        Distribution of relevant documents
        [Figure: one chart per approach (blogart, kmeans, mlt, searchsim, topcateg), each compared with the fused list. Values as extracted, in order: 35%, 65%, 52.5%, 21.3%, 0%, 26.2%, 54.7%, 32.8%, 12.5%, 52.4%, 38.9%, 8.8%, 91.2%, 8.7%, 0%]
  • 30. Users’ Study
      Results
        Distribution of relevant documents
        [Figure: the charts of slide 29, with the kmeans, mlt and searchsim charts labelled]
  • 31. Users’ Study
      Results
        Distribution of relevant documents
        [Figure: the charts of slide 29, with the blogart and topcateg charts labelled]
  • 32. Users’ Study
      Results
        Distribution of relevant documents
          Relevant documents are mainly retrieved by topical approaches
          But at least 20% are retrieved only by the fused list
          The fused list matches a larger range of needs
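A sketch of how a distribution of relevant documents between one approach and the fused list could be computed. It assumes a three-way split (approach only, both, fused only) over the relevant documents retrieved by either list; the slides do not state the exact computation, and the function name and data are hypothetical.

```python
def relevance_distribution(approach_list, fused_list, relevant):
    """Percentage of relevant documents found only by the approach,
    by both lists, or only by the fused list (assumed three-way split)."""
    relevant = set(relevant)
    in_approach = set(approach_list) & relevant
    in_fused = set(fused_list) & relevant
    total = len(in_approach | in_fused)
    if total == 0:
        return {"approach_only": 0.0, "both": 0.0, "fused_only": 0.0}
    return {
        "approach_only": 100 * len(in_approach - in_fused) / total,
        "both": 100 * len(in_approach & in_fused) / total,
        "fused_only": 100 * len(in_fused - in_approach) / total,
    }


# Hypothetical lists and user judgements for one visited document.
distribution = relevance_distribution(
    approach_list=["d1", "d2", "d3"],
    fused_list=["d2", "d4", "d5"],
    relevant={"d1", "d2", "d4", "d5"},
)
print(distribution)
```

In this toy case half of the relevant documents are reachable only through the fused list, which is the kind of share the "retrieved only by fused" observation refers to.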
  • 33. Conclusions and future work
      Conclusions
        Diversity of users’ expectations
          No one approach to rule them all
        A diversity of approaches
          Complementary
          Fused
        Diversity helps recommender systems fit more users’ needs
  • 34. Conclusions and future work
      Future work
        Full-scale experiment
          OverBlog platform
        Repeat the user survey
          More users (international call for participation)
          Avoid the biases revealed by this study
            e.g. more detailed form => deeper analysis
  • 35. Conclusions and future work
      Future work
        Improve the model
          Refine the fusion process
          Add a learning process to weight each approach
            For every visited document
            Find the proportion of documents coming from each approach (log analysis)
            Better match with real users’ needs
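The planned weighting could look like the sketch below, which is only an illustration of the idea and not the authors' model. It assumes a click log of (document, approach) pairs for one visited document, turns click counts into proportions, and allocates fused-list slots with a largest-remainder rule. The log format, function names and allocation rule are assumptions.

```python
from collections import Counter


def approach_proportions(click_log):
    """Share of clicked recommendations that came from each approach."""
    counts = Counter(approach for _doc, approach in click_log)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()} if total else {}


def allocate_slots(proportions, size):
    """Split `size` slots across approaches, largest remainders first."""
    quotas = {a: p * size for a, p in proportions.items()}
    slots = {a: int(q) for a, q in quotas.items()}
    leftover = size - sum(slots.values())
    # Hand the remaining slots to the approaches with the largest fractional part.
    by_remainder = sorted(quotas, key=lambda a: (-(quotas[a] - slots[a]), a))
    for approach in by_remainder[:leftover]:
        slots[approach] += 1
    return slots


# Hypothetical click log for one visited document.
click_log = [
    ("d1", "searchsim"), ("d2", "searchsim"), ("d3", "searchsim"),
    ("d6", "kmeans"), ("d7", "kmeans"),
    ("d10", "topcateg"),
]
proportions = approach_proportions(click_log)
slots = allocate_slots(proportions, size=5)
print(slots)
```

The slot counts could then replace the equal treatment of approaches when the fused list is built for that document.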
  • 36. Thank you for your attention
      Questions?
  • 37. References
      W. Bi, X. Yu, Y. Liu, F. Guan, Z. Peng, H. Xu, and X. Cheng, “ICTNET at Web Track 2009 diversity task”, Text REtrieval Conf., 2009
      G. Cabanac, M. Chevalier, C. Chrisment, and C. Julien, “An Original Usage-based Metrics for Building a Unified View of Corporate Documents”, Inter. Conf. on Database and Expert Systems Applications, LNCS vol. 4653, 2007, pp. 202–212
      J. Carbonell and J. Goldstein, “The use of MMR, diversity-based reranking for reordering documents and producing summaries”, ACM Conf. on Research and Development in Information Retrieval, 1998, pp. 335–336
      C. L. A. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon, “Novelty and Diversity in Information Retrieval Evaluation”, ACM Conf. on Research and Development in Information Retrieval, 2008, pp. 659–666
      C. Hayes, P. Massa, P. Avesani, and P. Cunningham, “An online evaluation framework for recommender systems”, Workshop on Personalization and Recommendation in E-Commerce, 2002
      J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl, “Evaluating Collaborative Filtering Recommender Systems”, ACM Trans. Information Systems, 22(1), 2004, pp. 5–53
      N. Lathia, S. Hailes, L. Capra, and X. Amatriain, “Temporal diversity in recommender systems”, ACM Conf. on Research and Development in Information Retrieval, 2010, pp. 210–217
      E. Meij, J. He, W. Weerkamp, and M. de Rijke, “Topical Diversity and Relevance Feedback”, Text REtrieval Conf., 2010
      F. Radlinski, P. N. Bennett, B. Carterette, and T. Joachims, “Redundancy, diversity and interdependent document relevance”, SIGIR Forum, 43(2), 2009, pp. 46–52
      R. L. T. Santos, C. Macdonald, and I. Ounis, “Selectively Diversifying Web Search Results”, ACM Inter. Conf. on Information and Knowledge Management, 2010
      Y. C. Xu and Z. Chen, “Relevance judgment: What do information users consider beyond topicality”, Journal of the American Society for Information Science and Technology, 57(7), 2006, pp. 961–973
      C. Ziegler, S. McNee, J. A. Konstan, and G. Lausen, “Improving recommendation lists through topic diversification”, Inter. Conf. on World Wide Web, 2005, pp. 22–32