Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Computing the Semantic Similarity of Resources
in DBpedia for Recommendation Purposes
Guangyuan Piao, Safina showkat Ara, ...
Contents
•  Introduction
•  Related Work
•  Resim (Resource similarity) Measure
•  Evaluation Setup and Results
•  Study o...
•  Linked Data (especially DBpedia) has been used for various
applications including recommendations:
•  LOD-enabled Recom...
•  Linked Data for Recommendation Purposes (single domain)
4
Introduction
dbpedia:Cheryl_Cole
•  measure the semantic simi...
•  Linked Data for Recommendation Purposes (social domain)
5
Introduction
dbpedia:Cheryl_Cole
•  user interests can be any...
6
Related Work
•  LDSD (Linked Data Semantic Distance) – Passant, 2010
•  evaluated on music artist recommendations
•  wid...
sim(ra, ra) = sim(rb, rb), for all resources ra and rb
7
Fundamental Axioms
equal self-similarity
sim(ra, rb) = sim(rb, ra...
8
The goal of the paper
propose a semantic similarity measure
- Resim on top of a revised LDSD to
satisfy fundamental axio...
9
Linked Data Semantic Distance (LDSD)
List_of_The_Tonight_Show_with_Jay_Leno
_episodes_(2013–14)
Category:21st-century_Am...
10
Resim (Resource similarity) Measure - 1
sim(ra, ra) = sim(rb, rb), for all resources ra and rb
equal self-similarity
si...
11
Resim (Resource similarity) Measure - 2
sim(ra, rb) = sim(rb, ra), for all resources ra and rb
symmetry
✔
•  to satisfy...
•  incorporating property similarity
•  from the definition of an ontology, the properties of each concept
describe variou...
•  w1 = 1 and w2 = 2 for the experiment
13
Resim (Resource similarity) Measure
final equation for Resim
14
Evaluation Setup and Results
1.  similarity measures evaluated on axioms
2.  evaluation on calculating similarities for...
15
Evaluation Setup and Results
•  Resim performs best compared to other approaches
•  satisfy 23 out of 28 test pairs of ...
Evaluation Setup and Results
•  10 similar music artists from Last.fm for given artist
golden truth
•  200 randomly select...
Evaluation Setup and Results
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
Shakti3 Shakti5 LDSDsim Resim
MRR
0
0.1
0.2
0.3
0.4
0.5...
18
Study of Linked Data Sparsity Problem
•  Linked Data Sparsity Problem:
•  the performance of the recommender system bas...
19
Study of Linked Data Sparsity Problem
H0 : The number(log) of incoming/outgoing links for resources
has no relationship...
20
Conclusions
•  Results show that our proposed similarity measure:
•  satisfy the fundamental axioms
•  outperforms base...
•  extend the current similarity measure (with longer paths)
•  investigate different normalization strategies
•  apply it...
Upcoming SlideShare
Loading in …5
×

JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recommendation Purposes

11,192 views

Published on

Presentation at JIST2015

Published in: Data & Analytics
  • Follow the link, new dating source: ❤❤❤ http://bit.ly/2F7hN3u ❤❤❤
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Dating direct: ♥♥♥ http://bit.ly/2F7hN3u ♥♥♥
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recommendation Purposes

  1. 1. Computing the Semantic Similarity of Resources in DBpedia for Recommendation Purposes Guangyuan Piao, Safina showkat Ara, John G. Breslin Insight Centre for Data Analytics @NUI Galway, Ireland Unit for Social Software The 5th Joint International Semantic Technology Conference Yichang, China, 12/11/2015
  2. 2. Contents •  Introduction •  Related Work •  Resim (Resource similarity) Measure •  Evaluation Setup and Results •  Study of Linked Data Sparsity Problem •  Conclusions 2
  3. 3. •  Linked Data (especially DBpedia) has been used for various applications including recommendations: •  LOD-enabled Recommender Systems Challenge (ESWC’14, 15) •  User Modeling for Personalization in Online Social Networks •  Use entities/resources in a Knowledge Graph (e.g., DBpedia, Freebase) to represent user interests •  measuring the semantic similarity between resources is important 3 Introduction
  4. 4. •  Linked Data for Recommendation Purposes (single domain) 4 Introduction dbpedia:Cheryl_Cole •  measure the semantic similarity in the context of DBpedia •  recommend similar items based on what you like in a single domain (e.g., music, movie) Who is the most similar artist to Cheryl Cole?
  5. 5. •  Linked Data for Recommendation Purposes (social domain) 5 Introduction dbpedia:Cheryl_Cole •  user interests can be any topical resources in DBpedia •  can we reuse the similarity measures that were designed for recommendations in single domain? dbpedia:SIOC dbpedia:Linked_data wi1:preference What news the user will be interested in? 1.  http://smiy.sourceforge.net/wi/spec/weightedinterests.html
  6. 6. 6 Related Work •  LDSD (Linked Data Semantic Distance) – Passant, 2010 •  evaluated on music artist recommendations •  widely used and has comparative performance with supervised learning approaches •  Shakti – Leah, 2012 •  similarity was measured based on proximity: two entities are more similar if they have more number of paths (penalty for longer paths) •  some problems need to be addressed: •  not suitable for measuring the similarity between general resources •  fundamental axioms are violated •  performance over each other is unproven •  supervised learning approaches (Di Noia etc.)
  7. 7. sim(ra, ra) = sim(rb, rb), for all resources ra and rb 7 Fundamental Axioms equal self-similarity sim(ra, rb) = sim(rb, ra), for all resources ra and rb symmetry sim(ra, ra) > sim(ra, rb), for all resources ra ≠ rb minimality •  http://www.scholarpedia.org/article/Similarity_measures
  8. 8. 8 The goal of the paper propose a semantic similarity measure - Resim on top of a revised LDSD to satisfy fundamental axioms be able to measure the semantic similarity between general resources provide a comparative study study Linked Data sparsity problem
  9. 9. 9 Linked Data Semantic Distance (LDSD) List_of_The_Tonight_Show_with_Jay_Leno _episodes_(2013–14) Category:21st-century_American_singers Ariana_Grande Selena_Gomez musicalguests musicalguests subject subject associatedMusicArtist influences Cd(influences, Ariana_Grande, Selena_Gomez) = 1 Cii(musicalguests, Ariana_Grande, Selena Gomez) = 1
  10. 10. 10 Resim (Resource similarity) Measure - 1 sim(ra, ra) = sim(rb, rb), for all resources ra and rb equal self-similarity sim(ra, ra) > sim(ra, rb), for all resources ra ≠ rb minimality ✔ ✔ •  to satisfy “equal self-similarity” and “minimality” axioms
  11. 11. 11 Resim (Resource similarity) Measure - 2 sim(ra, rb) = sim(rb, ra), for all resources ra and rb symmetry ✔ •  to satisfy “symmetry” axiom
  12. 12. •  incorporating property similarity •  from the definition of an ontology, the properties of each concept describe various features and attributes of the concept. •  Thus, property similarity is important when there is no similarity can be indicated using LDSD’ •  property similarity measure •  based on the number of shared incoming/outgoing properties •  Csip: shared incoming properties, Cip: # of incoming properties 12 Resim (Resource similarity) Measure - 3
  13. 13. •  w1 = 1 and w2 = 2 for the experiment 13 Resim (Resource similarity) Measure final equation for Resim
  14. 14. 14 Evaluation Setup and Results 1.  similarity measures evaluated on axioms 2.  evaluation on calculating similarities for general resources Axiom LDSDsim Shakti Resim equal self-similarity ✔ symmetry ✔ ✔ minimality ✔ ✔ (1) extract word pairs from WordSim353 dataset sim(Wa, Wb) > sim(Wa, Wc) the difference is higher than 2 (2) retrieve the corresponding DBpedia resources construct a test pair as sim(ra, rb) > sim(ra, rc)
  15. 15. 15 Evaluation Setup and Results •  Resim performs best compared to other approaches •  satisfy 23 out of 28 test pairs of general resources Test pairs of resources LDSDsim Shakti Resim sim(dbpedia:Money, dbpedia:Currency) > sim(dbpedia:Money, dbpedia:Business_operations) ✔ ✔ sim(dbpedia:Money, dbpedia:Cash) > sim(dbpedia:Money, dbpedia:Demand_deposit) ✔ ✔ … > … … sim(dbpedia:Planet, dbpedia:Moon) > sim(dbpedia:Planet, dbpedia:People) ✔ ✔ sim(dbpedia:Coast, dbpedia:Shore) > sim(dbpedia:Coast, dbpedia:Hill) ✔ ✔ … … … Total: 13 18 23
  16. 16. Evaluation Setup and Results •  10 similar music artists from Last.fm for given artist golden truth •  200 randomly selected music artists from 75,682 resources in DBpedia of type dbpedia-owl:MusicArtist or dbpedia-owl:Band candidate list •  Recall and Mean Reciprocal Rank (MRR) evaluation methods 3.  evaluation on LOD recommender system (music domain)
  17. 17. Evaluation Setup and Results 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Shakti3 Shakti5 LDSDsim Resim MRR 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 Shakti3 Shakti5 LDSDsim Resim R@5 R@10 R@20 Recall@5, 10, 20 MRR 3.  evaluation on LOD recommender system (music domain)
  18. 18. 18 Study of Linked Data Sparsity Problem •  Linked Data Sparsity Problem: •  the performance of the recommender system based on similarity measures of resources decreases when resources lack information (i.e., when they have a lesser number of incoming/outgoing relationships to other resources). 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 R@5 R@10 R@20 Random Popular The average performance of recommendations on popular music artists The average performance of recommendations on random music artists
  19. 19. 19 Study of Linked Data Sparsity Problem H0 : The number(log) of incoming/outgoing links for resources has no relationship to the performance of a recommender system. •  in other words, the performance of the recommender system decreases for the resources with sparsity. Pearson’s correlation of 0.798 thus, we reject H0
  20. 20. 20 Conclusions •  Results show that our proposed similarity measure: •  satisfy the fundamental axioms •  outperforms baselines for measuring the semantic similarity between general resources •  outperforms Sharkti on single-domain recommendations •  Linked Data sparsity problem for LOD recommender system •  on one hand, utilizing Linked Data to build a recommender system can mitigate the traditional sparsity problem of collaborative recommender systems, but on the other hand, the system can also have a Linked Data sparsity problem for resources in the Linked Data set that the recommender system has adopted
  21. 21. •  extend the current similarity measure (with longer paths) •  investigate different normalization strategies •  apply it to social recommendations (e.g., news recommendations in Twitter) 21 Future Work

×