Paolo Pareti
University of Edinburgh
ACM Web Science Conference 29/6/2015
The Semantic Richness
of Linked Data Concepts
A Linked Data Scalability Challenge: Frequently Reused Concepts Lose their Meaning

Paper presented at the ACM Web Science 2015 conference, Oxford, 29/6/15. Paper available here: http://tinyurl.com/pqvm22z


  1. Paolo Pareti, University of Edinburgh. ACM Web Science Conference, 29/6/2015. The Semantic Richness of Linked Data Concepts: Vocabulary Reuse Damages Semantics!
  2. The Problem
  3. What does class membership tell us? (Diagram: :x is a Cat)
  4. Semantic Richness: The more facts we can infer about :x, knowing that :x is a Cat, the more Semantically Rich the concept Cat is.
  5. Semantic Richness: The more facts we can infer about :x, knowing that :x is a Cat, the more Semantically Rich the concept Cat is. Does it have a tail? Is it a mammal?
  6. Semantic Richness is NOT Specificity / Information Content.
  7. For example, this might have been the set of entities in the original definition of the concept Cat.
  8. However, after some time, people started using the term Cat in a more generic way.
  9. Some entities were defined as Cats, despite not being animals.
  10. Even t-shirts could be defined as Cats.
  11. And why not, maybe even some trees...
  12. So what do you actually know about :x, if on the Web anything can be a Cat? (Diagram: :x is a Cat)
  13. A Linked Data Challenge: The more a concept gets reused, the less Semantically Rich it becomes.
  14. A Linked Data Challenge: The more a concept gets reused, the less Semantically Rich it becomes. Frequently reused concepts lose their meaning.
  15. http://www.w3.org/2002/07/owl#sameAs This problem already affects highly reused concepts, such as owl:sameAs *
      * H. Halpin, P. J. Hayes, J. P. McCusker, D. L. McGuinness, and H. S. Thompson. When owl:sameAs Isn’t the Same: An Analysis of Identity in Linked Data. In The Semantic Web - ISWC 2010, volume 6496 of Lecture Notes in Computer Science, pages 305–320. Springer Berlin Heidelberg, 2010.
  16. Diagram: http://dbpedia.org/resource/Edinburgh with two owl:sameAs links.
  17. Diagram: http://dbpedia.org/resource/Edinburgh with two owl:sameAs links. Originally designed to represent strict equality, owl:sameAs is often (mis)used to represent weaker relations.
  18. Diagram: http://dbpedia.org/resource/Edinburgh with two owl:sameAs links. In this example, the usage of owl:sameAs is incorrect, as Edinburgh, a picture of Edinburgh and the location of Edinburgh are three different things.
  19. A Simple Measure of Semantic Richness. We define a measure based on:
      ● the number of common patterns,
      ● and their frequency.
      For example: if X is a cat, what can we say about X?
      ● X is a mammal (frequency: 1.00)
      ● X has a tail (frequency: 0.99)
      ● ...
  20. A Simple Measure of Semantic Richness. Intuitively:
      ● The more patterns, and the more frequent they are, the more semantically rich the concept is.
      Measure motivated by:
      ● Number of Features theory
      ● Inductive Learning
      Main advantage:
      ● Can be automatically and efficiently computed over large datasets.
  21. DBpedia Ontology
  22. DBpedia Ontology: The DBpedia ontology tree, plotted according to the Semantic Richness of its concepts (each line represents a subclass relation). As we would expect, Semantic Richness is highly correlated with specificity.
  23. Loss of Semantic Richness in foaf:Person
  24. Loss of Semantic Richness in foaf:Person: How quickly does Semantic Richness decrease when reusing a concept? We looked at the concept of foaf:Person as defined in ten different datasets.
  25. Loss of Semantic Richness in foaf:Person
  26. Loss of Semantic Richness in foaf:Person: As we add external entities of type foaf:Person into a dataset, the Semantic Richness of this concept quickly decreases. In particular, it falls below the average Semantic Richness of the original datasets (dotted line).
  27. The Challenge: How can concepts be openly reused on the Web, while at the same time remaining semantically rich?
  28. The end, any questions?
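
The measure on slides 19 and 20 can be illustrated with a small sketch. The slides only say the measure depends on the number of common patterns and their frequency, so the details here are assumptions rather than the formula from the paper: a "pattern" is modelled as a plain string attached to an entity, a pattern counts as "common" when its frequency reaches a hypothetical `min_frequency` threshold, and the score is simply the sum of the common patterns' frequencies. The toy data is invented to mirror the Cat example from slides 7 to 11.

```python
from collections import Counter


def semantic_richness(entities, min_frequency=0.5):
    """Score a concept from the patterns observed on its member entities.

    entities: list of sets of pattern names, one set per entity.
    min_frequency: hypothetical cut-off for a pattern to count as "common".
    Returns (score, common), where common maps pattern -> frequency and
    score is the sum of those frequencies (an assumed aggregation).
    """
    if not entities:
        return 0.0, {}
    total = len(entities)
    # Count, for every pattern, how many entities exhibit it.
    counts = Counter(p for patterns in entities for p in set(patterns))
    common = {
        p: c / total for p, c in counts.items() if c / total >= min_frequency
    }
    return sum(common.values()), common


# Invented data: the "original" cats are all animals.
original_cats = [
    {"is_mammal", "has_tail", "has_fur"},
    {"is_mammal", "has_tail", "has_fur"},
    {"is_mammal", "has_tail", "has_fur"},
    {"is_mammal", "has_fur"},
]

# Invented data: later reuse also labels t-shirts and a tree as Cats.
reused_as_cats = [
    {"has_price", "has_size"},
    {"has_price", "has_size"},
    {"has_height"},
    {"has_price"},
]

original_score, original_common = semantic_richness(original_cats)
reused_score, reused_common = semantic_richness(original_cats + reused_as_cats)

print("original:", original_score, original_common)
print("after reuse:", reused_score, reused_common)
```

In this toy setting the score drops once the heterogeneous entities are added, because fewer patterns stay common and the surviving ones become less frequent. That is the qualitative effect described for foaf:Person on slides 24 to 26; the numbers themselves carry no meaning beyond the example.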