Ph.D. Defense - Enhanced Vector Space Models for Content-based Recommender Systems

Published on 08.06.12
PhD defense: Enhanced Vector Space Models for Content-based Recommender Systems.

Published in: Technology, Business
  • Ph.D. Defense - Enhanced Vector Space Models for Content-based Recommender Systems

    1. Università degli Studi di Bari ‘Aldo Moro’. Dottorato di Ricerca in Informatica (Ph.D. program in Computer Science), Ciclo XXIV. Enhanced Vector Space Models for Content-based Recommender Systems. Cataldo Musto, Ph.D. Candidate. Supervisor: Prof. Giovanni Semeraro. 08.06.12
    2. what will we talk about in the next 40 minutes? Cataldo Musto - Enhanced Vector Space Models for Content-based Recommender Systems - Ph.D. defense - University of Bari Aldo Moro, Italy - 08.06.12
    3. life is all a matter of decisions
    4. life is all a matter of decisions
    5. decision-making is actually challenging
    6. decision-making is actually challenging
    7. decision-making is actually challenging
    8. we need to hold as much knowledge as possible
    9. Leibniz: “In things which are absolutely indifferent there can be no choice and consequently no option or will.”
    10. information age: knowledge is spread through the Web
    11. social media changed the rules for information management and knowledge acquisition
    12. exponential growth of the available information
    13.
    14. it is physiologically impossible to follow the information flow in real time
    15. how much information?
    16. we interact daily with 393 bits of information per second
    17. the human brain can absorb 126 bits of information per second
    18. we can handle 126 bits of information • we deal with 393 bits of information • ratio: more than 3x • consequence: Information Overload (Source: Adrian C. Ott, The 24-Hour Customer)
    19. Information Overload
    20. Information Overload
    21. Information Overload
    22. Information Overload
    23. Information Overload
    24. paradox of choice (Barry Schwartz, TED talk “Why more is less”)
    25. Buridan’s ass paradox: two alternatives. The ass cannot decide. It starves.
    26. Is the information overload actually unbearable?
    27. “It is not information overload. It is filter failure.” Clay Shirky, talk @ Web 2.0 Expo
    28. Solution: we need to improve the techniques for filtering information
    29. Information Filtering (IF): “To expose users only to the information that is relevant for them, thus avoiding information overload.” To filter, as kids do when they play with sand.
    30. IF applications • Example: Recommender Systems • Relevant items (movies, news, books, etc.) are pushed to the user according to her needs.
    31. Recommender Systems are an effective way to face the Information Overload problem
    32. example: Amazon.com recommendations
    33. Information Retrieval (IR): “Finding relevant pieces of information in a collection of (usually unstructured) data”
    34. IR applications • Example: Search Engines • Relevant documents are returned to the user according to her query.
    35. IR vs. IF • IR and IF are two closely related research areas • Same goal: to optimize and ease access to (unstructured) data sources • “Two sides of the same coin” (*) (*) N. Belkin, W. Croft: “Information Filtering and Information Retrieval: Two sides of the same coin”, Communications of the ACM, Volume 35, Issue 12, pp. 29-38, 1992
    36. IR vs. IF: differences • Minor differences • Representation of user needs • Query in IR, user profile in IF • Convergence between IR and IF • Personalized Search!
    37.
    38. Ph.D. dissertation • Research Question: Is it possible to exploit the convergence between IR and IF to introduce a recommendation framework based on IR techniques?
    39. outline.
    40. outline (1/2) • recommender systems • content-based recommender systems (CBRS) • vector space models • VSM for CBRS • strengths and weaknesses
    41. outline (2/2) • eVSM: enhanced vector space models • semantics in VSMs • dimensionality reduction in VSMs • modeling negation in VSMs • applications and experimental evaluation • movie recommendation • Philips TV-guides personalization
    42. recommender systems.
    43. definition: Recommender Systems have the goal of guiding the users in a personalized way to interesting or useful objects in a large space of possible options. Burke, 2002 (*) (*) Robin D. Burke: Hybrid Recommender Systems: Survey and Experiments. UMUAI, volume 12, issue 4, 331-370 (2002)
    44. suggestions • Examples • books or news to read • music to listen to • movies worth watching • restaurants, etc.
    45. Some maths (1/2) • Let • U be the set of users • I be the set of items • Given • user u ∈ U • item i ∈ I
    46. Some maths (2/2) • A recommender system should predict how relevant item i is for user u by defining a scoring function • scoring function f: U × I → [0,1] • The items with the highest value of f are labeled as relevant and returned to the user
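
The scoring-function view above can be sketched in a few lines of Python. This is only an illustration: the function name `recommend`, the user and item names, and the toy scores are assumptions made up for the example, not something taken from the slides.

```python
from typing import Callable, Iterable, List, Tuple


def recommend(
    user: str,
    items: Iterable[str],
    f: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score every item for the user with f and return the top_n best."""
    scored = [(item, f(user, item)) for item in items]
    # Highest score first: these are the items labeled as relevant.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical scores in [0, 1], invented for the example.
toy_scores = {
    ("alice", "football news"): 0.9,
    ("alice", "sports news"): 0.8,
    ("alice", "politics news"): 0.1,
}


def toy_f(user: str, item: str) -> float:
    return toy_scores.get((user, item), 0.0)


top = recommend(
    "alice", ["politics news", "football news", "sports news"], toy_f, top_n=2
)
print(top)
```

Any concrete recommender differs only in how `f` is computed; the ranking step stays the same.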
    47. classes of RSs • In the literature many approaches for building RSs have been introduced. • Collaborative Recommender Systems • Content-based Recommender Systems • Knowledge-based Recommender Systems • Demographic-based Recommender Systems • Social Recommender Systems • Hybrid Recommender Systems
    48. classes of RSs • In the literature many approaches for building RSs have been introduced. • Collaborative Recommender Systems • Content-based Recommender Systems (FOCUS) • Knowledge-based Recommender Systems • Demographic-based Recommender Systems • Social Recommender Systems • Hybrid Recommender Systems
    49. content-based recommenders: suggest items similar to those liked in the past by the user
    50. content-based recommenders: key concepts • Each item has to be described through a set of textual features • Movie plots, content of news, book summaries
    51. content-based recommenders: key concepts • The user profile contains the features that often occur in the items the user liked • The profile of a user interested in basketball will contain keywords related to it (example: basketball teams, players or competitions)
    52. content-based recommenders: key concepts • Recommendations are provided by calculating the overlap between the features stored in the user profile and those that occur in the item. • The bigger the overlap, the higher the relevance
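
One possible way to make "overlap" concrete is sketched below. The slides do not say which overlap measure is used, so the Jaccard coefficient here is an assumption, and the feature sets are invented for the example.

```python
def overlap_score(profile_features: set, item_features: set) -> float:
    """Jaccard overlap between profile and item features (assumed measure)."""
    if not profile_features or not item_features:
        return 0.0
    shared = profile_features & item_features
    combined = profile_features | item_features
    return len(shared) / len(combined)


# Hypothetical profile of a user interested in basketball.
profile = {"basketball", "nba", "playoffs", "lakers"}
item_a = {"nba", "playoffs", "finals"}   # shares two features with the profile
item_b = {"election", "parliament"}      # shares none

print(overlap_score(profile, item_a))
print(overlap_score(profile, item_b))
```

The item with the larger score would be ranked as more relevant for this user.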
    53. content-based recommenders • example: news recommendations • Items ♥ ♥ • User Profile: the user is interested in news articles about sports, football, cycling, etc.
    54. content-based recommenders • example: news recommendations • Items • Recommendations ♥ ♥
    55. content-based recommenders • example: news recommendations • Items • Recommendations ♥ X ♥
    56. content-based recommenders • example: news recommendations • Items • Recommendations ♥ X ♥
    57. main building block: vector space model, the most widely adopted IR model (*) (*) Gerard Salton: A Vector Space Model for Automatic Indexing, Communications of the ACM, vol. 18, nr. 11, pages 613–620
    58. vector space model (VSM) • Given a set of n features (vocabulary) • $f = \{f_1, f_2, \ldots, f_n\}$ • Given a set of M items • Each document (item) is represented as a point in an n-dimensional vector space • $I = (w_{f_1}, \ldots, w_{f_n})$, where $w_{f_i}$ is the weight of feature $f_i$ in the item
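
As a minimal sketch of this representation, the snippet below turns short texts into weight vectors over a shared vocabulary. The slide does not fix a weighting scheme, so raw term counts are used here as an assumption (TF-IDF would be a common alternative); the helper names and the two toy documents are hypothetical.

```python
from collections import Counter
from typing import List


def build_vocabulary(docs: List[str]) -> List[str]:
    """Collect the n distinct features (here: lowercased tokens), sorted."""
    return sorted({token for doc in docs for token in doc.lower().split()})


def to_vector(doc: str, vocabulary: List[str]) -> List[int]:
    """Represent a document as one weight per feature (assumed: raw counts)."""
    counts = Counter(doc.lower().split())
    return [counts[feature] for feature in vocabulary]


docs = ["football match tonight", "election results tonight"]
vocab = build_vocabulary(docs)
vectors = [to_vector(doc, vocab) for doc in docs]

print(vocab)
print(vectors)
```

Each document becomes a point in a space with one dimension per vocabulary entry, which is why the number of dimensions grows with the collection.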
    59. VSM representation [figure: vector space with points labeled football news, sports news, politics news, politics news]
    60. research question: Is it possible to exploit VSM for a recommendation scenario?
    61. VSM for CBRS: how to adapt it? • In VSM each item is represented as a vector • The user profile needs a vector space representation as well • How? • For example, by combining the vectors of the items (documents) the user liked in the past
    62. VSM representation [figure: vector space with points labeled user profile, football news, sports news, politics news, politics news]
    63. VSM representation • Recommendation task seen as similarity calculation between vectors [figure: vector space with points labeled user profile, football news, sports news, politics news, politics news]
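
A small sketch of this idea follows. It assumes that "combining" means summing the vectors of the liked items and that similarity is the cosine of the angle between vectors; both are common choices but are assumptions here, and the three-feature vectors are invented for the example.

```python
import math
from typing import Dict, List, Sequence


def add_vectors(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise sum of equally sized vectors."""
    return [sum(components) for components in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical feature order: (football, sports, politics).
liked_items = [[3, 1, 0], [1, 2, 0]]
profile = add_vectors(liked_items)

candidates: Dict[str, List[float]] = {
    "football news": [2, 1, 0],
    "politics news": [0, 0, 3],
}
ranking = sorted(
    candidates, key=lambda name: cosine(profile, candidates[name]), reverse=True
)
print(profile, ranking)
```

With this profile the football article ranks first, while the politics article has zero similarity because it shares no feature with the liked items.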
    64. VSM representation • the recommender system suggests football and sports news [figure: vector space with points labeled user profile, football news, sports news, politics news, politics news]
    65. Can this model be improved? Yes.
    66. VSM weaknesses • Modeling Negation • VSM does not model negative evidence • The vector space representation depends only on the features that occur in the document; no assumption is made about the features that do not occur • What a specific user dislikes is not considered
    67. VSM weaknesses • High Dimensionality • As the number of documents grows, the number of features grows as well • Large vector spaces are difficult to manage
    68. VSM weaknesses • Language issues • Does not manage the latent semantics of documents • String matching-based approach • A CBRS based on VSM cannot understand the information it manages • Example: apple?
    69. VSM weaknesses • Language issues • Representation is language-dependent • A user profile built in one language cannot be exploited to recommend items described in another language • It would be good to receive (e.g.) recommendations about news written by English newspapers even if I expressed my interest only in Italian news articles!
    70. How to tackle these issues?
    71. eVSM: enhanced Vector Space Model, a novel recommendation framework based on VSM (*) (*) Cataldo Musto: Enhanced Vector Space Models for Content-based Recommender Systems, RECSYS 2010, pages 361-364
    72. eVSM goals • To introduce a CBRS based on VSM • To tackle the representation issues of VSM • No Semantics • High Dimensionality • No modeling of Negative Information • Language-dependent recommendations
    73. eVSM: a novel recommendation framework based on VSM • step 1: modeling semantics • step 2: dimensionality reduction • step 3: modeling negation • step 4: building user profiles • step 5: providing suggestions
    74. how to improve the semantic modeling in VSMs? distributional models (Firth, 1957). Firth, J.R. A synopsis of linguistic theory 1930-1955. In Studies in Linguistic Analysis, pp. 1-32, 1957.
    75. distributional models: “meaning is its use” (L. Wittgenstein)
    76. distributional models • insight: by analyzing large corpora of textual data it is possible to infer information about the usage (about the meaning) of the terms.
    77. distributional models • insight: by analyzing large corpora of textual data it is possible to infer information about the usage (about the meaning) of the terms. • example
    78. distributional models • term/context matrix [table: rows t1 to t4, columns c1 to c9; ✔ marks the contexts each term occurs in (t1: 4, t2: 4, t3: 3, t4: 4)]
    79. distributional models • Key: the definition of what the ‘context’ is • Different granularities are possible • Document • Paragraph • Sentence • Sliding window of words
    80. distributional models • term/context matrix [table: rows t1 to t4, columns c1 to c9; ✔ marks the contexts each term occurs in (t1: 4, t2: 4, t3: 3, t4: 4)]
    81. distributional models • beer vs. glass: good overlap [table: rows t1 to t4, columns c1 to c9; ✔ marks the contexts each term occurs in]
    82. distributional models • beer vs. spoon: no overlap [table: rows t1 to t4, columns c1 to c9; ✔ marks the contexts each term occurs in]
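
The comparison of terms through shared contexts can be sketched as follows. The context assignments are invented for illustration and only mirror the stated outcome (good overlap for beer and glass, none for beer and spoon); they are not the actual ✔ positions from the slides.

```python
from typing import Dict, Set

# Hypothetical term/context matrix, stored as the set of contexts per term.
contexts: Dict[str, Set[str]] = {
    "beer": {"c1", "c2", "c3", "c4"},
    "glass": {"c2", "c3", "c4", "c5"},
    "spoon": {"c6", "c7", "c8", "c9"},
}


def context_overlap(term_a: str, term_b: str) -> int:
    """Number of contexts in which both terms occur."""
    return len(contexts[term_a] & contexts[term_b])


print(context_overlap("beer", "glass"))
print(context_overlap("beer", "spoon"))
```

Terms that tend to appear in the same contexts end up with similar rows, which is the "light semantics" that distributional models provide.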
    83. distributional models: recap • models for representing terms/documents in large vector spaces • light semantics • it is simple to calculate similarities between words • but the high dimensionality problem gets even worse!
    84. eVSM: a novel recommendation framework based on VSM • step 1: modeling semantics • step 2: dimensionality reduction • step 3: modeling negation • step 4: building user profiles • step 5: providing suggestions
    85. Random Indexing (Sahlgren, 2005). Sahlgren, M. An Introduction to Random Indexing. Proceedings of the Methods and Applications of Semantic Indexing Workshop, TKE 2005.
    86. dimensionality reduction: random indexing • Strengths • Incremental approach • Based on the distributional hypothesis • Builds a small-scale semantic vector space representation
    87. random indexing • Input • n-dimensional term-document matrix • Output • k-dimensional term-context matrix • k << n • Approximation built upon the distributional hypothesis • Based on contexts, but much more compact!
    88. random indexing: dimensionality reduction [figure: term/document matrix (rows t1 to t5, columns d1 ... dn) reduced to a term/context matrix (rows t1 to t5, columns c1 ... ck), with n >> k]
    89. random indexing: dimensionality reduction • k is a simple parameter of the model [figure: term/document matrix (rows t1 to t5, columns d1 ... dn) reduced to a term/context matrix (rows t1 to t5, columns c1 ... ck), with n >> k]
    90. random indexing: dimensionality reduction • the smaller k, the higher the efficiency and the greater the loss of information [figure: term/document matrix (rows t1 to t5, columns d1 ... dn) reduced to a term/context matrix (rows t1 to t5, columns c1 ... ck), with n >> k]
    91. random indexing: some literature • Roots • Sparse distributed representations (Kanerva, 1988) • Studies about Random Projection • State-of-the-art applications • Clustering text documents (Kohonen, 2000) • Image data compression (Bingham, 2001) • Information Retrieval (Basile, 2010) • Collaborative filtering (Ciesielczyk, 2010) • Never exploited for CBRS.
    92. How to obtain the smaller k-dimensional representation?
    93. random indexing algorithm • (1) Definition of the context. • Document? Paragraph? Sentence? Word? • (2) Each ‘context’ is assigned a context vector. • Dimension of the vector = k • Allowed values = {-1, 0, 1} • Constraint: the non-zero elements have to be far fewer than the zero ones • Values distributed in a random way
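
Step (2) can be sketched as below. The function name and parameters are hypothetical, and the choice of using exactly as many +1 as -1 entries is an assumption made here for simplicity; the slide only requires values in {-1, 0, 1}, random placement, and few non-zero elements.

```python
import random
from typing import List


def random_context_vector(k: int, nonzero: int, rng: random.Random) -> List[int]:
    """Build a sparse ternary vector of dimension k.

    `nonzero` positions are picked at random; half of them are set to +1
    and half to -1 (assumed balance), all remaining entries stay 0.
    """
    if nonzero % 2 != 0 or nonzero > k:
        raise ValueError("nonzero must be even and at most k")
    vector = [0] * k
    positions = rng.sample(range(k), nonzero)
    for index, position in enumerate(positions):
        vector[position] = 1 if index < nonzero // 2 else -1
    return vector


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
vec = random_context_vector(k=8, nonzero=2, rng=rng)
print(vec)
```

In practice k is in the hundreds or thousands and the number of non-zero entries stays very small, which keeps the vectors sparse.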
    94. random indexing: context vectors, k=8 • rc1 = (0, 0, -1, 1, 0, 0, 0, 0) • rc2 = (1, 0, 0, 0, 0, 0, 0, -1) • rc3 = (0, 0, 0, 0, 0, -1, 1, 0) • rc4 = (-1, 1, 0, 0, 0, 0, 0, 0) • rc5 = (0, 0, 0, -1, 1, 0, 0, 0)
    95. random indexing algorithm • (3) The vector space representation of a term t is obtained by combining the random vectors of the contexts it occurs in. • Example: t1 ∈ {c1, c2} • rc1 = (0, 0, -1, 1, 0, 0, 0, 0) • rc2 = (1, 0, 0, 0, 0, 0, 0, -1) • rc3 = (0, 0, 0, 0, 0, -1, 1, 0) • rc4 = (-1, 1, 0, 0, 0, 0, 0, 0) • rc5 = (0, 0, 0, -1, 1, 0, 0, 0)
    96. random indexing algorithm • (3) The vector space representation of a term t is obtained by combining the random vectors of the contexts it occurs in. • Example: t1 ∈ {c1, c2} • rc1 = (0, 0, -1, 1, 0, 0, 0, 0) • rc2 = (1, 0, 0, 0, 0, 0, 0, -1) • t1 = rc1 + rc2 = (1, 0, -1, 1, 0, 0, 0, -1)
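
The worked example on this slide can be reproduced with a few lines of Python. The sketch assumes that "combining" means component-wise addition, which matches the numbers shown; the dictionary holds a subset of the context vectors from the slides, and the helper name `term_vector` is hypothetical.

```python
from typing import Dict, Iterable, List

# Context vectors rc1, rc2 and rc3 as given on the slides (k = 8).
context_vectors: Dict[str, List[int]] = {
    "c1": [0, 0, -1, 1, 0, 0, 0, 0],
    "c2": [1, 0, 0, 0, 0, 0, 0, -1],
    "c3": [0, 0, 0, 0, 0, -1, 1, 0],
}


def term_vector(
    term_contexts: Iterable[str], vectors: Dict[str, List[int]]
) -> List[int]:
    """Sum the random vectors of all contexts the term occurs in."""
    k = len(next(iter(vectors.values())))
    result = [0] * k
    for context in term_contexts:
        result = [a + b for a, b in zip(result, vectors[context])]
    return result


t1 = term_vector(["c1", "c2"], context_vectors)
print(t1)  # [1, 0, -1, 1, 0, 0, 0, -1]
```

Because the update is a plain sum, a term vector can be refined incrementally each time the term is seen in a new context.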
    97. 97. random indexing algorithm • (3) The vector space representation of a term t is obtained by combining the random vectors of the contexts it occurs in. • (4) The vector space representation of a document d is obtained by combining the vector space representation of the terms that occur in the document.
    98. 98. random indexing algorithm • (3) The vector space representation of a term t is obtained by combining the random vectors of the contexts it occurs in. Output: WordSpace • (4) The vector space representation of a document d is obtained by combining the vector space representation of the terms that occur in the document.
    99. 99. random indexing algorithm • (3) The vector space representation of a term t is obtained by combining the random vectors of the contexts it occurs in. • (4) The vector space representation of a document d is obtained by combining the vector space representation of the terms that occur in the document. Output: DocSpace
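Step (4) can be sketched the same way. The snippet below is an assumption-laden illustration: `document_vector` is a hypothetical helper, the toy `word_space` values are invented for the example, and the sum counts a term once per occurrence, which the slides do not specify.

```python
def combine(vectors):
    """Component-wise sum of equally sized vectors (assumed meaning of 'combining')."""
    return [sum(components) for components in zip(*vectors)]


def document_vector(tokens, word_space):
    """Document vector = sum of the term vectors of the tokens found in the WordSpace."""
    known = [word_space[t] for t in tokens if t in word_space]
    if not known:
        raise ValueError("no token of the document is in the WordSpace")
    return combine(known)


# Toy WordSpace (hypothetical values, k = 8).
word_space = {
    "beer": [1, 0, -1, 1, 0, 0, 0, -1],
    "glass": [1, 0, 0, 0, 0, -1, 1, -1],
}

# "beer" occurs twice, so its vector is added twice; "of" is unknown and skipped.
d1 = document_vector(["beer", "glass", "of", "beer"], word_space)
print(d1)
```

Because terms and documents end up in the same k-dimensional space, the same similarity function can compare terms with terms and documents with documents.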
    100. 100. random indexing • WordSpace: terms t1 ... t5 as rows over the dimensions c1 ... ck (comparison between terms) • DocSpace: documents d1 ... d5 as rows over the same dimensions c1 ... ck (comparison between documents) • Uniform representation
    101. 101. Dimensionality reduction is obtained from a set of random vectors. Does it sound weird?
    102. 102. random indexing theoretical basis • Johnson-Lindenstrauss Lemma (*) • Distances between points are approximately preserved. • Constraint: orthogonal vectors • Random Indexing vectors are nearly orthogonal. • The loss of information depends on the parameter k • (*) Johnson, W. and Lindenstrauss, J. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 1984
    103. 103. random indexing Johnson-Lindenstrauss lemma
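The formula shown on this slide did not survive in the transcript. For reference, a standard statement of the lemma is given below; the symbols are assumptions chosen for this note and are not taken from the slides. For any 0 < ε < 1 and any set X of m points in an n-dimensional space, there is a linear map f into a k-dimensional space, with k on the order of ε⁻² log m, such that all pairwise distances are preserved up to a factor of 1 ± ε:

```latex
% Standard Johnson-Lindenstrauss statement (notation assumed, not from the slides):
% f : \mathbb{R}^{n} \to \mathbb{R}^{k}, \quad k = O(\varepsilon^{-2} \log m)
\[
(1 - \varepsilon)\,\lVert u - v \rVert^{2}
\;\le\; \lVert f(u) - f(v) \rVert^{2}
\;\le\; (1 + \varepsilon)\,\lVert u - v \rVert^{2}
\qquad \text{for all } u, v \in X .
\]
```

This is why the loss of information depends on k: a larger k allows a smaller distortion ε for the same number of points.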
    104. 104. eVSM: a novel recommendation framework based on VSM • step 1: modeling semantics • step 2: dimensionality reduction • step 3: modeling negation • step 4: building user profiles • step 5: providing suggestions
    105. 105. quantum negation (Widdows, 2007) • Sahlgren, M. An Introduction to Random Indexing. Proceedings of the Methods and Applications of Semantic Indexing Workshop, TKE 2005.
    106. 106. negation in VSMs state of the art • State-of-the-art approaches: poor theoretical background • Post-retrieval filtering, Rocchio Algorithm (Rocchio, 1971) • Widdows proposed a different point of view • Negation viewed as a form of orthogonality between vectors • Vision inherited from Quantum Logic
    107. 107. negation in VSMs Quantum Negation • Some theory • Given vector a and vector b • Through quantum negation it is possible to define a vector a not b (a ∧ ¬b) • Projection of vector a onto the subspace orthogonal to the one generated by vector b
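The projection described above has a simple closed form for a single negated vector: a NOT b = a - ((a · b) / (b · b)) b. The sketch below illustrates it in plain Python; the helper names `dot` and `negate` are hypothetical, and only the single-vector case is covered.

```python
def dot(a, b):
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def negate(a, b):
    """Return 'a NOT b': a minus its component along b, so the result is orthogonal to b."""
    bb = dot(b, b)
    if bb == 0:
        # A zero vector spans no direction, so there is nothing to remove.
        return list(a)
    scale = dot(a, b) / bb
    return [x - scale * y for x, y in zip(a, b)]


a = [3.0, 1.0, 0.0]
b = [1.0, 0.0, 0.0]
a_not_b = negate(a, b)
print(a_not_b)           # the component of a along b is removed
print(dot(a_not_b, b))   # 0.0: the result is orthogonal to b
```

The orthogonality is what gives the operator its meaning: the resulting vector has zero similarity with the negated one.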
    108. 108. quantum negation application to CBRS • Vector A models positive feedback • Information about what a user likes • Vector B models negative feedback • Information about what a user does not like • Vector A not B combines both information sources
    109. 109. eVSM building blocks - recap • Distributional Models • Light semantic modeling • Random Indexing (Sahlgren, 2005) • Incremental technique for dimensionality reduction • Quantum Negation (Widdows, 2007) • Negation operator based on Quantum Logic
    110. 110. eVSM building blocks - recap • A content-based recommendation framework needs to: • Represent items • Build user profiles • Provide suggestions • Random Indexing and Quantum Negation provide a novel representation model.
    111. 111. eVSM: a novel recommendation framework based on VSM • step 1: modeling semantics • step 2: dimensionality reduction • step 3: modeling negation • step 4: building user profiles • step 5: providing suggestions
    112. 112. eVSM building user profiles • Represent profiles in eVSM • Vector space representation • Obtained by combining the vectors of the items the user liked • How? • Four different profiling models
    113. 113. User Profiles • Random Indexing-based (RI) • [formula; annotated parts: Items, Rating, Threshold] • VSM representation of RI-based profile for user u
    114. 114. User Profiles • Quantum Negation-based (QN) • [formulas; annotated parts: Positive User Profile Vector, Negative User Profile Vector] • VSM representation of QN-based profile for user u
    115. 115. User Profiles • Weighted Random Indexing-based (w-RI) • [formula; annotated parts: Items, Rating, Threshold] • Higher weight given to the documents with higher rating
    116. 116. User Profiles • Weighted Quantum Negation-based (w-QN) • [formulas; annotated parts: Positive User Profile Vector, Negative User Profile Vector] • VSM representation of w-QN-based profile for user u
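The four profiling models can be sketched with one function and two switches. The formulas on these slides are not in the transcript, so everything below is an assumption made for illustration: items rated above the threshold count as liked, the rest as disliked; the weighted variants scale liked items by `rating / max_rating` and disliked items by `(max_rating - rating) / max_rating`; the QN variants apply "positive NOT negative". The name `build_profile` and its parameters are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(vectors, weights):
    """Component-wise sum of vectors, each scaled by its weight."""
    dims = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) for i in range(dims)]


def negate(a, b):
    """'a NOT b': remove from a its component along b."""
    bb = dot(b, b)
    if bb == 0:
        return list(a)
    scale = dot(a, b) / bb
    return [x - scale * y for x, y in zip(a, b)]


def build_profile(rated_items, threshold, max_rating, weighted=False, use_negation=False):
    """rated_items: list of (item_vector, rating). Returns the profile vector.

    weighted=False, use_negation=False -> RI     weighted=True,  use_negation=False -> w-RI
    weighted=False, use_negation=True  -> QN     weighted=True,  use_negation=True  -> w-QN
    """
    liked = [(v, r) for v, r in rated_items if r > threshold]
    disliked = [(v, r) for v, r in rated_items if r <= threshold]
    if not liked:
        raise ValueError("the user has no item rated above the threshold")
    # Assumed weighting: higher ratings give higher weights to liked items.
    pos_weights = [r / max_rating if weighted else 1.0 for _, r in liked]
    positive = weighted_sum([v for v, _ in liked], pos_weights)
    if not use_negation or not disliked:
        return positive
    # Assumed weighting: lower ratings give higher weights to disliked items.
    neg_weights = [(max_rating - r) / max_rating if weighted else 1.0 for _, r in disliked]
    negative = weighted_sum([v for v, _ in disliked], neg_weights)
    return negate(positive, negative)


# Toy ratings on a 1-5 scale (hypothetical item vectors).
rated = [([1.0, 0.0, 0.0], 5), ([0.0, 1.0, 0.0], 4), ([0.0, 1.0, 1.0], 1)]
ri = build_profile(rated, threshold=3, max_rating=5)
w_ri = build_profile(rated, threshold=3, max_rating=5, weighted=True)
qn = build_profile(rated, threshold=3, max_rating=5, use_negation=True)
w_qn = build_profile(rated, threshold=3, max_rating=5, weighted=True, use_negation=True)
```

With these toy ratings, the QN and w-QN profiles are orthogonal to the vector built from the disliked item, while the RI and w-RI profiles ignore it entirely.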
    117. 117. eVSM: a novel recommendation framework based on VSM • step 1: modeling semantics • step 2: dimensionality reduction • step 3: modeling negation • step 4: building user profiles • step 5: providing suggestions
    118. 118. eVSM providing suggestions - monolingual scenario • DocSpace: items d1 ... d4 and profile p as rows over the dimensions c1 ... ck • All the items are vectors in a DocSpace
    119. 119. eVSM providing suggestions - monolingual scenario • DocSpace: items d1 ... d4 and profile p as rows over the dimensions c1 ... ck • The profile p is a vector in a DocSpace
    120. 120. eVSM providing suggestions - monolingual scenario • DocSpace: items d1 ... d4 and profile p as rows over the dimensions c1 ... ck • Similarity calculation between p and each item
    121. 121. Some maths (1/2) • Let • U be the set of users • I be the set of items • Given • active user u ∈ U
    122. 122. Some maths (2/2) • For each pair (u, i_j) • For both user u and item i a vector space representation is provided • u = (f_u1, f_u2, ..., f_un) • i = (f_i1, f_i2, ..., f_in) • Calculate sim(u, i_j) • Cosine similarity • Order the items i_j by descending similarity • Return the top-k elements
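The ranking procedure above (cosine similarity between the profile and each item, sort in descending order, return the top k) can be sketched as follows. The function names and the toy vectors are hypothetical; the treatment of zero vectors (similarity 0.0) is an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def recommend(profile, items, k):
    """items: dict item_id -> vector. Returns the k (item_id, similarity) pairs with highest similarity."""
    scored = [(item_id, cosine(profile, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy DocSpace with four items and one user profile.
items = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.0, 1.0, 0.0],
    "d3": [1.0, 1.0, 0.0],
    "d4": [0.0, 0.0, 1.0],
}
profile = [2.0, 1.0, 0.0]
top2 = recommend(profile, items, k=2)
print([item_id for item_id, _ in top2])  # d3 first, then d1
```

A production system would typically also exclude items the user has already rated; the slides do not mention this, so the sketch leaves it out.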
    123. 123. Similarity-based recommendations • Relevance of an item seen as a form of similarity • The most similar items are returned to the target user
    124. 124. What about multilingual recommendations?
    125. 125. eVSM providing suggestions - multilingual scenario • eVSM for multilingual recommendations • Assumption • The distribution of the terms is (almost) language-independent • Example: drink / bere, beer / birra, glass / bicchiere
    126. 126. eVSM providing suggestions - multilingual scenario • eVSM for multilingual recommendations • Assumption • The distribution of the terms is (almost) language-independent • The position of the concept of beer in a WordSpace will always be the same, regardless of the language!
    127. 127. (English) WordSpace • beer, wine, spoon, dog
    128. 128. (Italian) WordSpace • birra, vino, cucchiaio, cane • Relationships between terms stay the same regardless of the language!
    129. 129. eVSM providing suggestions - multilingual scenario • Parallel DocSpaces, built upon the same set of random vectors • DocSpace for L1 (Italian): documents d1 ... d5 over the dimensions c1 ... ck • DocSpace for L2 (English): documents d1 ... d5 over the dimensions c1 ... ck
    130. 130. eVSM providing suggestions - multilingual scenario • Parallel DocSpaces, built upon the same set of random vectors • DocSpace for L1 (Italian): documents d1 ... d4 and profile p_L1 over the dimensions c1 ... ck • DocSpace for L2: documents d1 ... d5 over the dimensions c1 ... ck • User profile in L1 (Italian)
    131. 131. eVSM providing suggestions - multilingual scenario • Parallel DocSpaces, built upon the same set of random vectors • DocSpace for L1: documents d1 ... d4 and profile p_L1 over the dimensions c1 ... ck • DocSpace for L2: documents d1 ... d4 and the projected profile p_L1 over the dimensions c1 ... ck • We can project the user profile into the DocSpace of English items
    132. 132. eVSM providing suggestions - multilingual scenario • Parallel DocSpaces, built upon the same set of random vectors • DocSpace for L1: documents d1 ... d4 and profile p_L1 over the dimensions c1 ... ck • DocSpace for L2: documents d1 ... d4 and the projected profile p_L1 over the dimensions c1 ... ck • Similarity computations between the Italian profile and the English items build multilingual recommendations
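The idea of parallel DocSpaces can be illustrated with a toy example. The slides only state that both spaces are built upon the same set of random vectors; how contexts are aligned across languages is not described, so the sketch assumes that each context (for example, the description of one movie) exists in both languages and therefore shares one random vector. All data and names below are hypothetical.

```python
import math


def combine(vectors):
    """Component-wise sum of equally sized vectors."""
    return [sum(components) for components in zip(*vectors)]


def cosine(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# One shared set of random context vectors (toy values, k = 6).
contexts = {
    "c1": [1, -1, 0, 0, 0, 0],
    "c2": [0, 0, 1, -1, 0, 0],
    "c3": [0, 0, 0, 0, 1, -1],
}

# Contexts each term occurs in, per language (assumed aligned across languages).
occurrences_it = {"birra": ["c1", "c2"], "vino": ["c2"], "cane": ["c3"]}
occurrences_en = {"beer": ["c1", "c2"], "wine": ["c2"], "dog": ["c3"]}


def word_space(occurrences):
    return {t: combine([contexts[c] for c in cs]) for t, cs in occurrences.items()}


def doc_vector(tokens, ws):
    return combine([ws[t] for t in tokens])


ws_it = word_space(occurrences_it)
ws_en = word_space(occurrences_en)

# Italian profile (built from Italian content) compared with English items.
profile_it = doc_vector(["birra", "vino"], ws_it)
english_items = {
    "d1": doc_vector(["beer", "wine"], ws_en),
    "d2": doc_vector(["dog"], ws_en),
}
ranking = sorted(english_items, key=lambda d: cosine(profile_it, english_items[d]), reverse=True)
print(ranking)  # the English item about beer and wine ranks first
```

Because "birra" and "beer" occur in the same contexts, they receive the same vector without any translation step, which is the effect the distributional assumption relies on.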
    133. 133. Multilingual recommendations come at no cost, thanks to the distributional hypothesis.
    134. 134. experimental evaluation applications
    135. 135. evaluation of eVSM • selected experiments • movie recommendation • monolingual scenario • Cataldo Musto, Giovanni Semeraro, Pasquale Lops, Marco de Gemmis: Random Indexing and Negative User Preferences for Enhancing Content-Based Recommender Systems. EC-Web 2011. 270-281 • multilingual scenario • Cataldo Musto, Fedelucio Narducci, Pierpaolo Basile, Pasquale Lops, Marco de Gemmis, Giovanni Semeraro: Cross-Language Information Filtering: Word Sense Disambiguation vs. Distributional Models. AI*IA 2011 • EPG personalization • Cataldo Musto, Fedelucio Narducci, Pasquale Lops, Giovanni Semeraro, Marco de Gemmis, Mauro Barbieri, Jan H. M. Korst, Verus Pronk, Ramon Clout: Enhanced Semantic TV-Show Representation for Personalized Electronic Program Guides. UMAP 2012 (to be presented)
    136. 136. movie recommendation ‘in vitro’ experiments • Goal: to provide users with recommendations about movies worth watching. • Subset of the 100k MovieLens dataset + Wikipedia content • Monolingual and multilingual settings
    137. 137. monolingual experiment parameter tuning • Size of context vectors • k = 50, 100, 200, 400 • 99% reduction of DocSpace • original size: 25k • Profiling models • RI, w-RI, QN, w-QN • Weighted vs. unweighted • With negation vs. without negation
    138. 138. experimental design experiments • Experiment 1 • Do the weighting scheme and the introduction of a negation operator improve the predictive accuracy of the recommendation models? • Experiment 2 • How does the model perform with respect to other state-of-the-art approaches?
    139. 139. experiment 1 • size=100 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87] • Weighted vs. unweighted: improvement under 0.2%
    140. 140. experiment 1 • size=100 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87] • Weighted vs. unweighted: improvement under 0.2%
    141. 141. experiment 1 • size=100 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87; annotation: Peak: +0.52] • However, differences are not statistically significant
    142. 142. experiment 1 • size=400 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87] • Negation vs. no negation: improvement under 0.5%
    143. 143. experiment 1 • size=100 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87; annotation: Gap: +1.08] • Some exceptions: P@1 and P@3, comparing W-RI vs. W-QN
    144. 144. experiment 1 • size=100 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87; annotation: Gap: +0.77] • The use of the negation operator improves the accuracy in a significant way.
    145. 145. experiment 1 • size=100 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87; annotation: Gap: +1.08] • Peaks in P@1 and P@3 are statistically significant
    146. 146. experiment 1 • size=100 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87] • Generally speaking, the W-QN configuration outperforms the others.
    147. 147. experiment 1 • size=100 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for RI, WRI, QN and WQN; y-axis from 84 to 87; annotation: Gap: +1.4%] • The combined use of weighting and negation significantly improves the accuracy
    148. 148. experiment 1 impact of negation operator and weighting scheme • context vectors - size: 50, 100, 200, 400 • P@1: ✔ ✔ ✔ • P@3: ✔ ✔ ✔ • P@5: (none) • P@10: ✔ • ✔ = statistical significance • [table; the column each check mark belongs to is lost in the transcript]
    149. 149. experiment 1 impact of negation operator and weighting scheme • context vectors - size: 50, 100, 200, 400 • P@1: ✔ ✔ ✔ • P@3: ✔ ✔ ✔ • P@5: (none) • P@10: ✔ • [table; the column each check mark belongs to is lost in the transcript] • The combined use of weighting and negation significantly improves the accuracy
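The P@1 to P@10 figures in experiment 1 are precision-at-k values. A minimal sketch of the metric is shown below, assuming the common definition (fraction of the top k recommended items that are relevant to the user); the slides do not spell out the exact relevance criterion or the averaging over users, so those details are assumptions and the function name is hypothetical.

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the first k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / k


# Toy ranking for one user: d3 and d2 are the items the user actually liked.
ranked = ["d3", "d1", "d2", "d4"]
relevant = {"d3", "d2"}
p_at_1 = precision_at_k(ranked, relevant, 1)
p_at_3 = precision_at_k(ranked, relevant, 3)
print(p_at_1, p_at_3)
```

A percentage such as those on the charts would then be obtained by averaging this value over all test users and multiplying by 100.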
    150. 150. experiment 2 • size=400 - MovieLens dataset • [bar chart: P@1, P@3, P@5 and P@10 for eVSM, VSM, LSI and Bayes; y-axis from 84 to 87] • Gap always around 1%
