SSSW 2013 - Feeding Recommender Systems with Linked Open Data

3,015 views

Published on

Basic introduction to recommender systems + Implementing a content-based recommender system by leveraging knowledge encoded into Linked Open Data datasets

Published in: Education
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
3,015
On SlideShare
0
From Embeds
0
Number of Embeds
621
Actions
Shares
0
Downloads
48
Comments
0
Likes
4
Embeds 0
No embeds

No notes for slide

SSSW 2013 - Feeding Recommender Systems with Linked Open Data

  1. 1. TOOLS AND TECHNIQUES “FEEDING RECOMMENDER SYSTEMS WITH LINKED OPEN DATA” Tommaso Di Noia t.dinoia@poliba.it Politecnico di Bari ITALY SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  2. 2. Recommender Systems Input Data: A set of users U={u1, …, uM} A set of items X={x1, …, xN} The rating matrix R=[ru,i] Problem Definition: Given user u and target item i Predict the rating ru,i A definition Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of use to a user. [F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors. Recommender Systems Handbook. Springer, 2011.] SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  3. 3. The rating matrix 5 1 2 4 3 ?? 2 4 5 3 5 2 4 3 2 4 1 3 3 5 1 5 2 4 4 4 5 3 5 2 TheMatrix Titanic Iloveshopping Argo LoveActually Thehangover Tommaso Enrico Sean Natasha Valentina SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  4. 4. The rating matrix (in the real world) 5 4 3 ?? 2 4 5 5 3 4 3 3 5 5 2 4 4 5 5 2 TheMatrix Titanic Iloveshopping Argo LoveActually Thehangover Tommaso Enrico Sean Natasha Valentina SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  5. 5. Collaborative Recommender Systems Collaborative RSs recommend items to a user by identifying other users with a similar profile Recommender System User profile Users Item7 Item15 Item11 … Top-N Recommendations Item1, 5 Item2, 1 Item5, 4 Item10, 5 …. …. Item1, 4 Item2, 2 Item5, 5 Item10, 3 …. Item1, 4 Item2, 2 Item5, 5 Item10, 3 …. Item1, 4 Item2, 2 Item5, 5 Item10, 3 …. SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  6. 6. Content-based Recommender Systems Recommender System User profile Item7 Item15 Item11 … Top-N Recommendations Item1, 5 Item2, 1 Item5, 4 Item10, 5 …. Items Item1 Item2 Item100 Item’s descriptions …. CB-RSs recommend items to a user based on their description and on the profile of the user’s interests SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  7. 7. Knowledge-based Recommender Systems Recommender System User profile Item7 Item15 Item11 … Top-N Recommendations Item1, 5 Item2, 1 Item5, 4 Item10, 5 …. Items Item1 Item2 Item100Item’s descriptions …. KB-RSs recommend items to a user based on their description and domain knowledge encoded in a knowledge base Knowledge-base SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  8. 8. User-based Collaborative Recommendation 5 1 2 4 3 ?? 2 4 5 3 5 2 4 3 2 4 1 3 3 5 1 5 2 4 4 4 5 3 5 2 TheMatrix Titanic Iloveshopping Argo LoveActually Thehangover Tommaso Enrico Sean Natasha Valentina 𝑠𝑖𝑚 𝑢𝑖, 𝑢𝑗 = 𝑟𝑢𝑖,𝑥 − 𝑟𝑢𝑖 ∗ 𝑟𝑗,𝑥 − 𝑟𝑢 𝑗𝑥∈𝑋 𝑟𝑢𝑖,𝑥 − 𝑟𝑢𝑖 2 𝑥∈𝑋 ∗ 𝑟𝑢 𝑗,𝑥 − 𝑟𝑢 𝑗 2 𝑥∈𝑋 Pearson’s correlation coefficient Rate prediction 𝑟 𝑢𝑖, 𝑥′ = 𝑟𝑢𝑖 + 𝑠𝑖𝑚 𝑢𝑖, 𝑢𝑗 ∗ 𝑟 𝑢 𝑗,𝑥′ − 𝑟𝑢 𝑗𝑢 𝑗 𝑠𝑖𝑚(𝑢𝑖, 𝑢𝑗)𝑢 𝑗 = 𝑋 SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  9. 9. Item-based Collaborative Recommendation 5 1 2 4 3 ?? 2 4 5 3 5 2 4 3 2 4 1 3 3 5 1 5 2 4 4 4 5 3 5 2 TheMatrix Titanic Iloveshopping Argo LoveActually Thehangover Tommaso Enrico Sean Natasha Valentina 𝑠𝑖𝑚 𝑥𝑖, 𝑥𝑗 = 𝑥𝑖 ⋅ 𝑥𝑗 |𝑥𝑖| ∗ |𝑥𝑗| = 𝑟𝑢,𝑥 𝑖 ∗ 𝑟𝑢,𝑥 𝑗𝑢 𝑟𝑢,𝑥 𝑖 2 𝑢 ∗ 𝑟𝑢,𝑥 2 𝑢 Cosine Similarity Rate prediction 𝑟 𝑢𝑖, 𝑥′ = 𝑠𝑖𝑚 𝑥, 𝑥′ ∗ 𝑟𝑥,𝑢 𝑖𝑥∈𝑋 𝑢 𝑖 𝑠𝑖𝑚 𝑥, 𝑥′𝑥∈𝑋 𝑢 𝑖 𝑠𝑖𝑚 𝑥𝑖, 𝑥𝑗 = 𝑟𝑢,𝑥 𝑖 − 𝑟𝑢 ∗ 𝑟𝑢,𝑥 𝑗 − 𝑟𝑢𝑢 𝑟𝑢,𝑥 𝑖 − 𝑟𝑢 2 𝑢 ∗ 𝑟𝑢,𝑥 𝑗 − 𝑟𝑢 2 𝑢 Adjusted Cosine Similarity = 𝑋 𝑢 𝑖 SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  10. 10. Content-Based Recommender Systems SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web P. Lops, M. de Gemmis, G. Semeraro. Content-based recommender Systems: State of the Art and Trends. In: P. Kantor, F. Ricci, L. Rokach, B. Shapira, editors, Recommender Systems Hankbook: A complete Guide for Research Scientists & Practitioners
  11. 11. Content-Based Recommender Systems • Items are described in terms of attributes/features • A finite set of values is associated to each feature • Items representation is a (Boolean) vector SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  12. 12. Content-Based Recommender Systems • Compute similarity between items 𝑠𝑖𝑚 𝑥𝑖, 𝑥𝑗 = |𝑥𝑖 ∩ 𝑥𝑗| |𝑥𝑖 ∪ 𝑥𝑗| Jaccard similarity SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  13. 13. Content-Based Recommender Systems • Compute similarity between items 𝑠𝑖𝑚 𝑥𝑖, 𝑥𝑗 = 𝑥𝑖 ⋅ 𝑥𝑗 𝑥𝑖 ∗ |𝑥𝑗| Cosine similarity SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  14. 14. Content-Based Recommender Systems • Compute similarity between items 𝑠𝑖𝑚 𝑥𝑖, 𝑥𝑗 = 𝑥𝑖 ⋅ 𝑥𝑗 𝑥𝑖 ∗ |𝑥𝑗| Cosine similarity and TF-IDF 𝑇𝐹 𝑣, 𝑥 = # 𝑣 𝑎𝑝𝑝𝑒𝑎𝑟𝑠 𝑖𝑛 𝑡ℎ𝑒 𝑑𝑒𝑠𝑐𝑟𝑖𝑝𝑡𝑖𝑜𝑛 𝑜𝑓 𝑥 𝐼𝐷𝐹 𝑣 = log |𝑋| #𝑣 𝑎𝑝𝑝𝑒𝑎𝑟𝑠 𝑖𝑛 𝑋 𝑥 = (𝑇𝐹 𝑣1, 𝑥 ∗ 𝐼𝐷𝐹 𝑣1 , … , 𝑇𝐹 𝑣 𝑛, 𝑥 ∗ 𝐼𝐷𝐹 𝑣 𝑛 ) SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  15. 15. Content-Based Recommender Systems Rate prediction 𝑟 𝑢𝑖, 𝑥′ = 𝑠𝑖𝑚 𝑥, 𝑥′ ∗ 𝑟𝑥,𝑢 𝑖𝑥∈𝑋 𝑢 𝑖 𝑠𝑖𝑚 𝑥, 𝑥′𝑥∈𝑋 𝑢 𝑖 SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  16. 16. Content-Based Recommender Systems • Nearest neighbors – Given a set of items representing the user profile, select their most similar items which are not in the user profile – Predict the rate only for the N nearest neighbors SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  17. 17. Content-Based Recommender Systems • Nearest neighbors with kNN 𝑥𝑖,1 𝑥𝑖,2 𝑥𝑖,3 𝑥𝑖,4 𝑥𝑖,5 𝑥𝑖,6 k = 5 SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  18. 18. Content-Based Recommender Systems SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  19. 19. Need of domain knowledge! We need rich descriptions of the items! No suggestion is available if the analyzed content does not contain enough information to discriminate items the user might like from items the user might not like.* (*) P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends. In: P. Kantor, F. Ricci, L. Rokach and B. Shapira, editors, Recommender Systems Handbook: A Complete Guide for Research Scientists & Practitioners The quality of CB recommendations are correlated with the quality of the features that are explicitly associated with the items. Main CB RSs Drawback: Limited Content Analysis SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  20. 20. A solution based on Linked Data Use Linked Data to mitigate the limited content analysis issue • Plenty of structured data available • No Content Analyzer required SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  21. 21. FRED: Transforming Natural Language text to RDF/OWL graphs input SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web Courtesy of Valentina Presutti
  22. 22. SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web FRED: RDF/OWL graph output Resolution and linking Vocabulary alignment Frames/events Taxonomy induction Semantic roles Co-reference Custom namespace Types Induced types (sense tagging) 22 Courtesy of Valentina Presutti
  23. 23. Linked Data as structured information source for items descriptions Rich items descriptions SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  24. 24. Select the domain(s) of your RS SELECT count(?i) AS ?num ?c WHERE { ?i a ?c . FILTER(regex(?c, "^http://dbpedia.org/ontology")) . } ORDER BY DESC(?num) SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  25. 25. Tìpalo: automatic typing of DBpedia entities Input: natural language definition of an entity SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web Courtesy of Valentina Presutti
  26. 26. 26 Linking to WN supersenses DBpedia resource Extracted type Disambiguated sense Inferred superclass Linked resource Disambiguated sense A chaise longue is an upholstered sofa in the shape of a chair that is long enough to support the legs. A chaise longue (English /ˌʃeɪz ˈlɔːŋ/;[1] French pronunciation: [ʃɛzlɔ̃ŋɡ(ə)], "long chair") is an upholstered sofa in the shape of a chair that is long enough to support the legs. Linking to DOLCE http://en.wikipedia.org/wiki/Chaise_longue Rich taxonomical data! SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web Courtesy of Valentina Presutti
  27. 27. Select the sub-graph you are interested in select ?m ?p ?o1 where{ ?m ?p ?o1 . ?m dcterms:subject ?o. dbpedia:The_Matrix dcterms:subject ?o. ?m a dbpedia-owl:Film. filter(regex(?p,"^http://dbpedia.org/ontology/")). } order by ?m SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  28. 28. Enriching Annotations with Resources From the Web Watson: find ontologies on the Semantic Web for enriching the representation of your data Sindice: find other entities for enriching your knowledge base SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web SameAs.org: find other entities to link to from your knowledge base Courtesy of Valentina Presutti
  29. 29. Computing similarity in LOD datasets SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  30. 30. Vector Space Model for LOD Righteous Kill starring director subject/broader genre Heat RobertDeNiro JohnAvnet Serialkillerfilms Drama AlPacino BrianDennehy Heistfilms Crimefilms starring RobertDeNiro AlPacino BrianDennehy Righteous Kill Heat … … SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  31. 31. Vector Space Model for LOD Righteous Kill STARRING Al Pacino (v1) Robert De Niro (v2) Brian Dennehy (v3) Righteous Kill (m1) X X X Heat (m2) X X Heat Righteous Kill (x1) wv1,x1 wv2,x1 wv3,x1 Heat (x2) wv1,x2 wv2,x2 0 𝑤 𝐴𝑙𝑃𝑎𝑐𝑖𝑛𝑜,𝐻𝑒𝑎𝑡 = 𝑡𝑓𝐴𝑙𝑃𝑎𝑐𝑖𝑛𝑜,𝐻𝑒𝑎𝑡 ∗ 𝑖𝑑𝑓𝐴𝑙𝑃𝑎𝑐𝑖𝑛𝑜 SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  32. 32. Vector Space Model for LOD Righteous Kill STARRING Al Pacino (v1) Robert De Niro (v2) Brian Dennehy (v3) Righteous Kill (m1) X X X Heat (m2) X X Heat Righteous Kill (x1) wv1,x1 wv2,x1 wv3,x1 Heat (x2) wv1,x2 wv2,x2 0 𝑤 𝐴𝑙𝑃𝑎𝑐𝑖𝑛𝑜,𝐻𝑒𝑎𝑡 = 𝑡𝑓𝐴𝑙𝑃𝑎𝑐𝑖𝑛𝑜,𝐻𝑒𝑎𝑡 ∗ 𝑖𝑑𝑓𝐴𝑙𝑃𝑎𝑐𝑖𝑛𝑜 𝑡𝑓 ∈ {0,1} SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  33. 33. Vector Space Model for LOD + + + … = 𝒔𝒊𝒎 𝒔𝒕𝒂𝒓𝒓𝒊𝒏𝒈(𝒙𝒊, 𝒙𝒋) = 𝒘 𝒗 𝟏,𝒙𝒊 ∗ 𝒘 𝒗 𝟏,𝒙 𝒋 + 𝒘 𝒗 𝟐,𝒙𝒊 ∗ 𝒘 𝒗 𝟐,𝒙 𝒋 + 𝒘 𝒗 𝟑,𝒙𝒊 ∗ 𝒘 𝒗 𝟑,𝒙 𝒋 𝒘 𝒗 𝟏,𝒙𝒊 𝟐 + 𝒘 𝒗 𝟐,𝒙𝒊 𝟐 + 𝒘 𝒗 𝟑,𝒙𝒊 𝟐 ∗ 𝒘 𝒗 𝟏,𝒙 𝒋 𝟐 + 𝒘 𝒗 𝟐,𝒙 𝒋 𝟐 + 𝒘 𝒗 𝟑,𝒙 𝒋 𝟐 𝜶 𝒔𝒕𝒂𝒓𝒓𝒊𝒏𝒈 ∗ 𝒔𝒊𝒎 𝒔𝒕𝒂𝒓𝒓𝒊𝒏𝒈(𝒙𝒊, 𝒙𝒋) 𝜶 𝒅𝒊𝒓𝒆𝒄𝒕𝒐𝒓 ∗ 𝒔𝒊𝒎 𝒅𝒊𝒓𝒆𝒄𝒕𝒐𝒓(𝒙𝒊, 𝒙𝒋) 𝜶 𝒔𝒖𝒃𝒋𝒆𝒄𝒕 ∗ 𝒔𝒊𝒎 𝒔𝒖𝒃𝒋𝒆𝒄𝒕(𝒙𝒊, 𝒙𝒋) 𝒔𝒊𝒎 𝒔𝒕𝒂𝒓𝒓𝒊𝒏𝒈(𝒙𝒊, 𝒙𝒋) SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  34. 34. A LOD-based CB RS Given a user profile defined as: We can predict the rating using a Nearest Neighbor Classifier wherein the similarity measure is a linear combination of local property similarities 𝑿 𝒖 𝒊 = { < 𝒙𝒊, 𝒓 𝒙 𝒊,𝒖 𝒊 > } 𝑟 𝑢𝑖, 𝑥′ = 𝛼 𝑝 ∗ 𝑠𝑖𝑚 𝑝(𝑥𝑖, 𝑥′)𝑝 𝑃 ∗ 𝑟𝑥 𝑖,𝑢 𝑖<𝑥 𝑖,𝑟 𝑥 𝑖,𝑢 𝑖 > ∈ 𝑋 𝑢 𝑖 |𝑋 𝑢 𝑖 | SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  35. 35. We’ve just scratched the surface… Missing in this presentation (not a complete list) • Learning 𝛼 𝑝 • Explanation – Quite straight with a content-based approach • Hybrid approaches – Mix a content based approach with a collaborative one – Collaborative similarity as a further vector space (property) of the content based approach? – Exploit the graph-based nature of both the rating matrix and the RDF graph • … SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web
  36. 36. Many thanks to Vito Claudio Ostuni and Roberto Mirizzi SSSW'13 The 10th Summer School on Ontology Engineering and the Semantic Web

×