
A Semi-Automatic Tool for Linked Data Integration

Linked Data is a set of best practices to publish data in RDF format. Despite the advantages of Linked Data (e.g., discoverability, interoperability, reusability), many datasets are not published in RDF.

Existing structured datasets can be transformed into RDF thanks to RDF mappings. Defining such mappings requires familiarity with Linked Data practices and thorough knowledge of the datasets concerned.

An obstacle to the democratization of Linked Data is that few people satisfy both conditions. Tools that make Linked Data integration easier can foster the growth of Linked Open Data.

Opendatasoft proposes a chatbot-like tool that semi-automatically generates RDF mappings for existing structured datasets. The challenge is to automate the part of the integration process that requires familiarity with Linked Data practices.

Published in: Data & Analytics

  1. 1. Benjamin Moreau (@BenjMoreau), Connected Data London 2019
  2. 2. INTRODUCTION. Linked Data is a set of best practices to publish data in RDF. Data published as Linked Data are described according to ontologies. Linked Data makes data interoperable, discoverable and reusable. An RDF mapping defines the transformation of a structured dataset into an RDF dataset.
  3. 3. RDF MAPPING TO INTEGRATE DATA IN THE LD. Source table (Name | Birth City): Augustus | Rome; Tiberius | Rome; Caligula | Antium. Mapping: ex:$(Name) dbo:birthPlace ex:$(Birth City). Resulting triples (Subject Predicate Object): ex:Augustus dbo:birthPlace ex:Rome; ex:Tiberius dbo:birthPlace ex:Rome; ex:Caligula dbo:birthPlace ex:Antium.
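The mapping on this slide can be read as a triple template that is instantiated once per table row. The following is a minimal sketch of that idea, not the tool's implementation: the names `ROWS`, `TEMPLATE`, `fill` and `apply_template` are hypothetical, and replacing spaces with underscores in generated identifiers is an assumption made only for illustration.

```python
import re

# Rows of the example table from the slide.
ROWS = [
    {"Name": "Augustus", "Birth City": "Rome"},
    {"Name": "Tiberius", "Birth City": "Rome"},
    {"Name": "Caligula", "Birth City": "Antium"},
]

# One triple template (subject, predicate, object) with $(column) placeholders.
TEMPLATE = ("ex:$(Name)", "dbo:birthPlace", "ex:$(Birth City)")

PLACEHOLDER = re.compile(r"\$\(([^)]+)\)")


def fill(term, row):
    """Replace every $(column) placeholder in term with the row's cell value."""
    # Assumption: spaces in cell values become underscores in identifiers.
    return PLACEHOLDER.sub(lambda m: row[m.group(1)].replace(" ", "_"), term)


def apply_template(template, rows):
    """Instantiate the triple template once per row."""
    return [tuple(fill(term, row) for term in template) for row in rows]


triples = apply_template(TEMPLATE, ROWS)
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

Each row yields exactly one triple, which is why the three-row table produces the three triples listed on the slide.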
  4. 4. RDF MAPPING CREATION PROCESS. Which types of resources are contained in the columns? Table (Name | Birth City): Augustus | Rome; Tiberius | Rome; Caligula | Antium. Column Name: Person. Column Birth City: City.
  5. 5. RDF MAPPING CREATION PROCESS. Which types of resources are contained in the columns? What are the relationships between resources? Table (Name | Birth City): Augustus | Rome; Tiberius | Rome; Caligula | Antium. Column Name: Person. Column Birth City: City. Relationship from Name to Birth City: Birth Place.
  6. 6. RDF MAPPING CREATION PROCESS. Which types of resources are contained in the columns? What are the relationships between resources? Which ontologies are relevant to describe these concepts? Table (Name | Birth City): Augustus | Rome; Tiberius | Rome; Caligula | Antium. Column Name: dbo:Person. Column Birth City: dbo:City. Relationship from Name to Birth City: dbo:birthPlace.
  7. 7. LIMITATION. Making an RDF mapping requires you to: ● KNOW THE DATASET PERFECTLY ● BE FAMILIAR WITH RDF
  8. 8. PROBLEM STATEMENT. How can the integration of existing structured datasets as Linked Data be simplified as much as possible?
  9. 9. CHALLENGE. Automate the part of the integration process that requires familiarity with RDF. ● KNOW THE DATASET PERFECTLY ● BE FAMILIAR WITH RDF
  11. 11. ● SPARQL-Generate [1]
      ● RML [2]
        ○ YARRRML [3]
      [1] Lefrançois, M., Zimmermann, A., Bakerally, N.: A SPARQL Extension For Generating RDF From Heterogeneous Formats. In: Extended Semantic Web Conference (ESWC) (2017)
      [2] Dimou, A., Vander Sande, M., Colpaert, P., Verborgh, R., Mannens, E., Van de Walle, R.: RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data. In: Workshop on Linked Data on the Web (LDOW) collocated with WWW (2014)
      [3] Heyvaert, P., De Meester, B., Dimou, A., Verborgh, R.: Declarative Rules for Linked Data Generation at Your Fingertips! In: Extended Semantic Web Conference (ESWC), Poster & Demo (2018)
      mappings:
        Person:
          subject: https://www.example.org/Person/$(name)/
          predicateobjects:
            - [a, 'http://dbpedia.org/ontology/Person']
            - predicates: 'http://dbpedia.org/ontology/birthPlace'
              objects:
                - mapping: City
        City:
          subject: https://www.example.org/City/$(birth_city)/
          predicateobjects:
            - [a, 'http://dbpedia.org/ontology/City']
      Table (Name | Birth City): Augustus | Rome; Tiberius | Rome; Caligula | Antium. Column Name: dbo:Person. Column Birth City: dbo:City. Relationship from Name to Birth City: dbo:birthPlace.
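To make the role of `mapping: City` in the YARRRML example more concrete, here is a hedged sketch of how such a mapping could be evaluated over table rows. It is not an RML engine and covers only the constructs used on the slide. The mapping is transcribed by hand into a Python dict, the column keys `name` and `birth_city` follow the slide's placeholders, and `generate` and `fill` are hypothetical helper names.

```python
import re

# The slide's YARRRML mapping, transcribed by hand into a Python dict.
# A real pipeline would parse the YAML and pass it to an RML processor.
MAPPINGS = {
    "Person": {
        "subject": "https://www.example.org/Person/$(name)/",
        "predicateobjects": [
            ("a", "http://dbpedia.org/ontology/Person"),
            ("http://dbpedia.org/ontology/birthPlace", {"mapping": "City"}),
        ],
    },
    "City": {
        "subject": "https://www.example.org/City/$(birth_city)/",
        "predicateobjects": [("a", "http://dbpedia.org/ontology/City")],
    },
}

ROWS = [
    {"name": "Augustus", "birth_city": "Rome"},
    {"name": "Tiberius", "birth_city": "Rome"},
    {"name": "Caligula", "birth_city": "Antium"},
]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def fill(template, row):
    """Substitute $(column) placeholders with the row's cell values."""
    return re.sub(r"\$\(([^)]+)\)", lambda m: row[m.group(1)], template)


def generate(mappings, rows):
    """Return the set of (subject, predicate, object) triples for all rows."""
    triples = set()
    for row in rows:
        for rule in mappings.values():
            subject = fill(rule["subject"], row)
            for predicate, obj in rule["predicateobjects"]:
                if predicate == "a":  # 'a' is shorthand for rdf:type
                    predicate = RDF_TYPE
                if isinstance(obj, dict):
                    # 'mapping: X' points at the subject generated by mapping X
                    # for the same row.
                    obj = fill(mappings[obj["mapping"]]["subject"], row)
                triples.add((subject, predicate, obj))
    return triples


triples = generate(MAPPINGS, ROWS)
```

Because the result is a set, the type triple for Rome is produced only once even though two rows mention that city.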
  12. 12. ● Karma [4]
      ● RMLEditor [5]
      ● Juma [6]
      [4] Gupta, S., Szekely, P., Knoblock, C.A., Goel, A., Taheriyan, M., Muslea, M.: Karma: A System for Mapping Structured Sources Into the Semantic Web. In: Extended Semantic Web Conference (ESWC), Poster & Demo (2012)
      [5] Heyvaert, P., Dimou, A., Herregodts, A.L., Verborgh, R., Schuurman, D., Mannens, E., Van de Walle, R.: RMLEditor: a Graph-Based Mapping Editor for Linked Data Mappings. In: Extended Semantic Web Conference (ESWC) (2016)
      [6] Junior, A.C., Debruyne, C., O’Sullivan, D.: An Editor that Uses a Block Metaphor for Representing Semantic Mappings in Linked Data. In: Extended Semantic Web Conference (ESWC), Poster & Demo (2018)
  14. 14. github.com/opendatasoft/ontology-mapping-chatbot/ and chatbot.opendatasoft.com/
  15. 15. ENTITIES RECOGNITION (DBpedia, YAGO, Linked Open Vocabularies) → QUESTIONS TO USER → PROPERTIES RECOGNITION (Linked Open Vocabularies) → QUESTIONS TO USER → MAPPING GENERATION (RDFS and OWL) → ✓
  16. 16. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Lookup in DBpedia, YAGO and Linked Open Vocabularies: [Augustus, Tiberius, Claudius] → Person → dbo:Person. Classes so far: Name: dbo:Person.
  17. 17. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Lookup in DBpedia, YAGO and Linked Open Vocabularies: [Rome, Lugdunum] → City → dbo:City. Classes so far: Name: dbo:Person; Birth City: dbo:City.
  18. 18. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Lookup in DBpedia, YAGO and Linked Open Vocabularies: [Gallia Lugdunensis] → ?. Classes so far: Name: dbo:Person; Birth City: dbo:City.
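One plausible way to turn per-cell lookups into a per-column class is a majority vote over the recognised values. The sketch below assumes that approach, which the slides do not spell out. `KNOWLEDGE_BASE` is a hypothetical in-memory stand-in for DBpedia/YAGO queries, and `guess_column_class` and the `min_support` threshold are illustrative names and values.

```python
from collections import Counter

# Hypothetical stand-in for a DBpedia/YAGO lookup: cell value -> class.
KNOWLEDGE_BASE = {
    "Augustus": "dbo:Person",
    "Tiberius": "dbo:Person",
    "Claudius": "dbo:Person",
    "Rome": "dbo:City",
    "Lugdunum": "dbo:City",
}


def guess_column_class(values, kb=KNOWLEDGE_BASE, min_support=0.5):
    """Return the most frequent class among recognised cells, or None."""
    values = [v for v in values if v]  # ignore empty cells
    classes = [kb[v] for v in values if v in kb]
    if not classes:
        return None  # nothing recognised: left for the user to resolve
    best, count = Counter(classes).most_common(1)[0]
    # Require that enough of the non-empty cells agree on the class.
    return best if count / len(values) >= min_support else None


columns = {
    "Name": ["Augustus", "Tiberius", "Claudius"],
    "Birth City": ["Rome", "Rome", "Lugdunum"],
    "Birth Province": ["", "", "Gallia Lugdunensis"],
}
candidates = {name: guess_column_class(cells) for name, cells in columns.items()}
```

With this toy lookup table, the Birth Province column gets no candidate class, which mirrors the "?" shown on slide 18.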
  20. 20. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Question: "Column Name contains entities of type Person?"
  21. 21. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Question: "Column Name contains entities of type Person?" ✓
  22. 22. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Question: "Column Name contains entities of type Person?" ✓ Question: "Column Birth City contains entities of type City?"
  23. 23. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Question: "Column Name contains entities of type Person?" ✓ Question: "Column Birth City contains entities of type City?" ✓
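The confirmation step shown on these slides can be sketched as a loop that turns each candidate class into a yes/no question and keeps only confirmed answers. This is an illustrative sketch: `confirm_classes` is a hypothetical name, and the `ask` callback stands in for the chatbot's dialogue so that the example runs without user input.

```python
def confirm_classes(candidates, ask):
    """Keep only the column -> class guesses that the user confirms."""
    confirmed = {}
    for column, cls in candidates.items():
        if cls is None:
            continue  # no candidate class, so there is nothing to confirm
        label = cls.split(":", 1)[1]  # 'dbo:Person' -> 'Person'
        question = f"Column {column} contains entities of type {label}?"
        if ask(question):
            confirmed[column] = cls
    return confirmed


# Scripted user that records the questions and answers yes to each one.
asked = []


def scripted_user(question):
    asked.append(question)
    return True


confirmed = confirm_classes(
    {"Name": "dbo:Person", "Birth City": "dbo:City", "Birth Province": None},
    scripted_user,
)
```

Passing the dialogue in as a function keeps the decision logic testable; an interactive version could pass a callback that reads the user's reply instead.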
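  25. 25. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Lookup in Linked Open Vocabularies: Name → schema:name. Properties so far: Name: schema:name.
  26. 26. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Lookup in Linked Open Vocabularies: Birth City → dbo:birthPlace. Properties so far: Name: schema:name; Birth City: dbo:birthPlace.
  27. 27. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Lookup in Linked Open Vocabularies: Birth Province → dbo:birthPlace. Properties so far: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace.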
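Property recognition starts from the column header rather than from the cell values. As a rough sketch, a header can be matched against property labels by counting shared words. This is an assumption about how a vocabulary search might rank candidates, not the tool's actual scoring. `PROPERTY_LABELS` is a hypothetical local index standing in for a Linked Open Vocabularies query, and `guess_property` is an illustrative name.

```python
# Hypothetical label index standing in for a Linked Open Vocabularies search:
# property -> words of its human-readable label.
PROPERTY_LABELS = {
    "schema:name": {"name"},
    "dbo:birthPlace": {"birth", "place"},
}


def guess_property(header, index=PROPERTY_LABELS):
    """Return the property whose label shares the most words with the header."""
    tokens = set(header.lower().split())
    scores = {prop: len(tokens & words) for prop, words in index.items()}
    best = max(scores, key=scores.get)
    # No shared word at all means no candidate property.
    return best if scores[best] > 0 else None


properties = {
    header: guess_property(header)
    for header in ["Name", "Birth City", "Birth Province"]
}
```

Both Birth City and Birth Province share only the word "birth" with the label of dbo:birthPlace, which is enough here to propose the same property for both columns, as on slides 26 and 27.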
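  29. 29. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace. Question: "Column Name contains names of entities?"
  30. 30. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace. Question: "Column Name contains names of entities?" ✓
  31. 31. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace. Question: "Column Name contains names of entities?" ✓ Question: "Which column contains entities that have a name?"
  32. 32. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace. Question: "Column Name contains names of entities?" ✓ Question: "Which column contains entities that have a name?" Answer: Name
  33. 33. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace. Question: "Column Birth City contains birth places of entities?" ✓ Question: "Which column contains entities that have a birth place?" Answer: Name
  34. 34. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace. Question: "Column Birth Province contains birth places of entities?" ✓ Question: "Which column contains entities that have a birth place?" Answer: Name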
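  36. 36. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City. Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace. Apply rdfs:domain and rdfs:range rules: Birth Province: dbo:Place.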
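The rdfs:domain and rdfs:range step can be sketched as follows: whenever a property links two columns, its domain is added to the classes of the subject column and its range to the classes of the object column. The `SCHEMA` table is illustrative only. The range dbo:Place follows the slide, while the domain entry is an assumption that should be checked against the actual ontology. `infer_types` is a hypothetical name.

```python
# Illustrative domain/range table. The range follows the slide; the domain
# is an assumption and should be checked against the published ontology.
SCHEMA = {
    "dbo:birthPlace": {"domain": "dbo:Person", "range": "dbo:Place"},
}


def infer_types(links, column_types, schema=SCHEMA):
    """Add domain/range classes to the columns connected by each property.

    links: list of (subject_column, property, object_column) tuples.
    column_types: dict mapping a column name to a set of known classes.
    """
    inferred = {col: set(types) for col, types in column_types.items()}
    for subject_col, prop, object_col in links:
        rule = schema.get(prop, {})
        if "domain" in rule:
            inferred.setdefault(subject_col, set()).add(rule["domain"])
        if "range" in rule:
            inferred.setdefault(object_col, set()).add(rule["range"])
    return inferred


links = [
    ("Name", "dbo:birthPlace", "Birth City"),
    ("Name", "dbo:birthPlace", "Birth Province"),
]
known = {"Name": {"dbo:Person"}, "Birth City": {"dbo:City"}}
inferred = infer_types(links, known)
```

In this sketch the Birth Province column, which had no class after entity recognition, receives dbo:Place from the range of dbo:birthPlace. Birth City also receives dbo:Place alongside dbo:City.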
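  37. 37. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person; Birth City: dbo:City; Birth Province: dbo:Place. Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace. Infer rdfs:label: rdfs:label is added for each of the three columns.
  38. 38. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Apply rdfs:subClassOf and owl:equivalentClass rules. Classes: Name: dbo:Person, foaf:Person, dbo:Agent, ...; Birth City: dbo:City, dbo:Place, dbo:Location, ...; Birth Province: dbo:Place, dbo:location, ... Properties: Name: schema:name; Birth City: dbo:birthPlace; Birth Province: dbo:birthPlace; rdfs:label for each of the three columns.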
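Applying rdfs:subClassOf and owl:equivalentClass rules amounts to collecting every class reachable from a confirmed class. The sketch below does this with a simple graph traversal over a toy hierarchy. `SUBCLASS_OF` and `EQUIVALENT` hold only the few relations suggested by the slide and are assumptions for illustration, not a copy of the DBpedia ontology. `expand_classes` is a hypothetical name.

```python
# Toy hierarchy based on the classes shown on the slide (illustrative only).
SUBCLASS_OF = {
    "dbo:City": {"dbo:Place"},
    "dbo:Person": {"dbo:Agent"},
}
# Equivalences are followed in one direction here to keep the sketch short.
EQUIVALENT = {
    "dbo:Person": {"foaf:Person"},
}


def expand_classes(cls, subclass_of=SUBCLASS_OF, equivalent=EQUIVALENT):
    """Return cls plus every class reachable via subclass/equivalence links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited: also guards against cycles
        seen.add(current)
        stack.extend(subclass_of.get(current, ()))
        stack.extend(equivalent.get(current, ()))
    return seen


person_classes = expand_classes("dbo:Person")
city_classes = expand_classes("dbo:City")
```

The same traversal would apply to properties by swapping in tables for rdfs:subPropertyOf and owl:equivalentProperty.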
  39. 39. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Infer properties with rdfs:subPropertyOf and owl:equivalentProperty. Classes: Name: dbo:Person, foaf:Person, dbo:Agent, ...; Birth City: dbo:City, dbo:Place, dbo:Location, ...; Birth Province: dbo:Place, dbo:location, ... Properties: Name: schema:name; Birth City: dbo:birthPlace, dul:hasLocation; Birth Province: dbo:birthPlace, dul:hasLocation; rdfs:label for each of the three columns.
  40. 40. Table (Name | Birth City | Birth Province): Augustus | Rome | (empty); Tiberius | Rome | (empty); Claudius | Lugdunum | Gallia Lugdunensis. Classes: Name: dbo:Person, foaf:Person, dbo:Agent, ...; Birth City: dbo:City, dbo:Place, dbo:Location, ...; Birth Province: dbo:Place, dbo:location, ... Properties: Name: schema:name; Birth City: dbo:birthPlace, dul:hasLocation; Birth Province: dbo:birthPlace, dul:hasLocation; rdfs:label for each of the three columns. Generated mapping: ✓
      mappings:
        Person:
          subject: https://www.example.org/Person/$(name)/
          predicateobjects:
            - [a, 'http://dbpedia.org/ontology/Person']
            - [a, 'http://xmlns.com/foaf/0.1/Person']
            - [a, 'http://dbpedia.org/ontology/Agent']
            - predicates: 'http://dbpedia.org/ontology/birthPlace'
              objects:
                - mapping: City
            - predicates: 'http://dbpedia.org/ontology/birthPlace'
              objects:
                - mapping: Place
            ...
        City:
          subject: https://www.example.org/City/$(birth_city)/
          predicateobjects:
            - [a, 'http://dbpedia.org/ontology/City']
            - [a, 'http://dbpedia.org/ontology/location']
            ...
        Place:
          subject: https://www.example.org/City/$(birth_province)/
          predicateobjects:
            - [a, 'http://dbpedia.org/ontology/Place']
            - [a, 'http://dbpedia.org/ontology/location']
            ...
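Once classes and links have been confirmed and expanded, producing the YARRRML text is mostly string assembly. The sketch below emits the subset of YARRRML used on the slide. The input structure with `subject`, `classes` and `links` keys and the function name `to_yarrrml` are assumptions made for this example, and the output is not validated against the YARRRML specification.

```python
def to_yarrrml(mappings):
    """Serialise confirmed annotations as YARRRML text (subset used here)."""
    lines = ["mappings:"]
    for name, rule in mappings.items():
        lines.append(f"  {name}:")
        lines.append(f"    subject: {rule['subject']}")
        lines.append("    predicateobjects:")
        # One rdf:type statement per inferred class.
        for cls in rule.get("classes", []):
            lines.append(f"      - [a, '{cls}']")
        # One predicate/object pair per link to another mapping.
        for predicate, target in rule.get("links", []):
            lines.append(f"      - predicates: '{predicate}'")
            lines.append("        objects:")
            lines.append(f"          - mapping: {target}")
    return "\n".join(lines)


annotations = {
    "Person": {
        "subject": "https://www.example.org/Person/$(name)/",
        "classes": [
            "http://dbpedia.org/ontology/Person",
            "http://xmlns.com/foaf/0.1/Person",
        ],
        "links": [("http://dbpedia.org/ontology/birthPlace", "City")],
    },
    "City": {
        "subject": "https://www.example.org/City/$(birth_city)/",
        "classes": ["http://dbpedia.org/ontology/City"],
    },
}

text = to_yarrrml(annotations)
print(text)
```

Writing the quotes from a single format string avoids the unbalanced or typographic quotes that easily slip into hand-written mappings.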
  42. 42. ● A bot for Linked Data integration ● No need to be familiar with RDF ● Works for datasets on the Opendatasoft platform ● Available on GitHub under the MIT license
  43. 43. ● Use machine learning to learn from user interactions ● Identify more classes and properties ● New user interface
  44. 44. THANK YOU! Benjamin Moreau (@BenjMoreau), Connected Data London 2019
