Making Semantic Data Federation Work


Enterprises are drowning in data that they can't find, access, or use. For many years, they have wrestled with how best to combine that data into actionable information without building systems that break as schemas evolve. Approaches like warehousing and ETL can be expensive to create and brittle in the face of changing data sources. Data integration at the application level is common, but it adds significant complexity to the code. Data-oriented web services attempt to provide reusable sources of integrated data; however, they have just added another layer of data access that constrains query and access patterns.
This talk will look at how semantic web technologies can be used to make existing data visible and actionable using standards like RDF (data), R2RML (data translation), OWL (schema definition and integration), SPARQL (federated query), and RIF (rules). The semantic web approach takes the data you already have and makes that data available for query and use across your existing data sources. This base capability is an excellent platform for building federated analytics.


Making Semantic Data Federation Work

  1. Making Semantic Data Federation Work, by Alex Miller
  2. Data Integration Problems
       1. Discovery and description
       2. Internal integration
       3. External integration
       4. Nomadic data
       5. Inflexible interfaces
  3. Problem 1: Discovery and description
       • What data do we have?
       • What does it mean?
       • Who is creating it?
       • Who is using it?
  4. Problem 2: Internal integration
       • Does your order entity have the same fields as my entity?
       • Are your codes for order status the same as my codes for order status?
  5. Problem 3: External integration
       • Does a public source of information exist?
       • How do the entities in the public source relate to the entities in my data?
  6. Problem 4: Nomadic data
       • Where does your data come from?
       • Which version of the data are you using?
       • Why does your data not match my data?
  7. Problem 5: Inflexible interfaces
       • Why can't I see all of my data?
       • Why does it take months to expose a new data element in my application?
  8. Results X Data Information Action
  9. Semantic Technologies
       • Data model: RDF
       • Metadata: RDFS/OWL
       • Entailment: OWL, RIF
       • Relational data: R2RML
       • Query: SPARQL
       • Federation: SPARQL Protocol, Federation
  10. Semantic Data Source (diagram): a layered stack of SPARQL Protocol, SPARQL, RDFS/OWL, and RDF
  11. Semantic Data Source (diagram): RDF is labeled as the data model
  12. Semantic Data Source (diagram): RDFS/OWL is labeled as the metadata
  13. Semantic Data Source (diagram): SPARQL is labeled as the query layer
  14. Semantic Data Source (diagram): SPARQL Protocol is labeled as the API
  15. Relational Access (diagram): a Semantic Data Source (SPARQL Protocol, SPARQL, RDFS/OWL, RDF) sits on a Relational Database, with RDB2RDF between SQL and RDF
  16. Relational Access (diagram): the RDF layer is labeled "Virtual"
  18. Music Database

       Musicians:
       MID  First  Last       Inst_ID
       1    Eddie  Van Halen  10
       2    Yo Yo  Ma         20
       3    Kenny  G          30

       Instruments:
       IID  Instrument  Type
       10   Guitar      String
       20   Cello       String
       30   Saxophone   Woodwind
  19. Musician Schema (diagram): classes are typed rdfs:Class and properties rdf:Property via rdf:type
       • music:firstName rdfs:domain music:Musician
       • music:lastName rdfs:domain music:Musician
       • music:plays rdfs:domain music:Musician; rdfs:range music:Instrument
       • music:instName rdfs:domain music:Instrument
       • music:instType rdfs:domain music:Instrument
  20. Triples From Tables: turn each key into a resource and specify the proper type of each resource.
       artist:1 rdf:type music:Musician
       artist:2 rdf:type music:Musician
       artist:3 rdf:type music:Musician
       instrument:10 rdf:type music:Instrument
       instrument:20 rdf:type music:Instrument
       instrument:30 rdf:type music:Instrument
  21. Triples From Tables: turn each cell into a triple based on the key, property (mapped per column), and value.
       artist:1 music:firstName "Eddie"
       artist:1 music:lastName "Van Halen"
       artist:2 music:firstName "Yo Yo"
       artist:2 music:lastName "Ma"
       artist:3 music:firstName "Kenny"
       artist:3 music:lastName "G"
       instrument:10 music:instName "Guitar"
       instrument:10 music:instType "String"
       instrument:20 music:instName "Cello"
       instrument:20 music:instType "String"
       instrument:30 music:instName "Saxophone"
       instrument:30 music:instType "Woodwind"
  22. Triples From Tables: turn each foreign key reference into a relationship between the foreign and primary resources (here, each musician's Inst_ID points at an instrument's IID).
       artist:1 music:plays instrument:10
       artist:2 music:plays instrument:20
       artist:3 music:plays instrument:30
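The three "Triples From Tables" steps can be sketched in a few lines of Python. This is only an illustration under simple assumptions: triples are plain tuples of strings, the `artist:` and `instrument:` prefixes are taken from the slides, and the function names are hypothetical. The foreign key links follow the Inst_ID column of the Musicians table.

```python
# Rows copied from the Music Database slide.
MUSICIANS = [
    {"MID": 1, "First": "Eddie", "Last": "Van Halen", "Inst_ID": 10},
    {"MID": 2, "First": "Yo Yo", "Last": "Ma", "Inst_ID": 20},
    {"MID": 3, "First": "Kenny", "Last": "G", "Inst_ID": 30},
]
INSTRUMENTS = [
    {"IID": 10, "Instrument": "Guitar", "Type": "String"},
    {"IID": 20, "Instrument": "Cello", "Type": "String"},
    {"IID": 30, "Instrument": "Saxophone", "Type": "Woodwind"},
]


def musician_triples(rows):
    """Apply the three steps to the Musicians table (hypothetical helper)."""
    triples = []
    for row in rows:
        subject = f"artist:{row['MID']}"
        # Step 1: the key becomes a typed resource.
        triples.append((subject, "rdf:type", "music:Musician"))
        # Step 2: each non-key cell becomes a triple with a literal value.
        triples.append((subject, "music:firstName", row["First"]))
        triples.append((subject, "music:lastName", row["Last"]))
        # Step 3: the foreign key becomes a link to another resource.
        triples.append((subject, "music:plays", f"instrument:{row['Inst_ID']}"))
    return triples


def instrument_triples(rows):
    """Apply steps 1 and 2 to the Instruments table (it has no foreign key)."""
    triples = []
    for row in rows:
        subject = f"instrument:{row['IID']}"
        triples.append((subject, "rdf:type", "music:Instrument"))
        triples.append((subject, "music:instName", row["Instrument"]))
        triples.append((subject, "music:instType", row["Type"]))
    return triples


TRIPLES = musician_triples(MUSICIANS) + instrument_triples(INSTRUMENTS)
for triple in TRIPLES:
    print(*triple)
```

A real RDF library would distinguish resources from literals; plain strings are used for both here to keep the sketch dependency-free.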
  23. R2RML Triple Mapping (diagram): the ontology (music:instName and music:instType, each with rdfs:domain music:Instrument) is shown above the Instruments table (IID, Instrument, Type; sample row 10, Guitar, String)
  24. R2RML Triple Mapping (diagram): a Triples Map is attached to the Instruments table with rr:tableName
  25. R2RML Triple Mapping (diagram): a Subject Map is added, with rr:class pointing into the ontology and the template "http://example.com/music/Inst-{iid}"
  26. R2RML Triple Mapping (diagram): a Predicate Object Map is added, with a Predicate Map (rr:predicate) pointing into the ontology and an Object Map (rr:column) pointing at a table column
  27. R2RML Triple Mapping (diagram): the ontology is labeled "Domain ontology"
  29. R2RML Triple Mapping (diagram): the table is labeled "Database"
  31. R2RML Triple Mapping (diagram): the mapping is labeled "R2RML"
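To make the roles of the mapping pieces concrete, here is a rough Python sketch that imitates a triples map with a plain dictionary. It is not R2RML syntax and not an R2RML processor. The dictionary keys and the `apply_triples_map` function are hypothetical, and the choice of music:Instrument as the rr:class target and of the Instrument and Type columns for the two predicates is an assumption read off the diagram. Only the subject template string is taken verbatim from the slides.

```python
# A dictionary that mirrors the parts of the triples map shown on the slides.
INSTRUMENT_MAP = {
    "table_name": "Instruments",                                # plays the role of rr:tableName
    "subject_template": "http://example.com/music/Inst-{iid}",  # Subject Map template
    "subject_class": "music:Instrument",                        # plays the role of rr:class (assumed target)
    "predicate_object_maps": [                                  # rr:predicate / rr:column pairs (assumed pairing)
        {"predicate": "music:instName", "column": "Instrument"},
        {"predicate": "music:instType", "column": "Type"},
    ],
}


def apply_triples_map(triples_map, rows):
    """Generate (subject, predicate, object) tuples for each table row."""
    triples = []
    for row in rows:
        # Simplification: template placeholders are matched to column
        # names case-insensitively, so {iid} picks up the IID column.
        values = {name.lower(): value for name, value in row.items()}
        subject = triples_map["subject_template"].format(**values)
        triples.append((subject, "rdf:type", triples_map["subject_class"]))
        for po_map in triples_map["predicate_object_maps"]:
            triples.append((subject, po_map["predicate"], row[po_map["column"]]))
    return triples


# The sample row from the slides.
ROWS = [{"IID": 10, "Instrument": "Guitar", "Type": "String"}]
TRIPLES = apply_triples_map(INSTRUMENT_MAP, ROWS)
for triple in TRIPLES:
    print(*triple)
```

The point of the design is that the mapping is data, not code: the same function handles any table once a suitable dictionary is supplied, which is the property that lets R2RML mappings be written declaratively.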
  33. Registry
       • Semantic data sources are self-describing and use a common protocol
       • Easy to build into a registry with additional metadata (also described with RDFS/OWL)
  34. Benefits of semantic technology stack
       1. Common data model
       2. Precise description
       3. Uniform access
       4. Federation
  35. Benefit 1: Common data model
       • RDF provides a common model for both data and descriptions of all kinds
       • Very flexible (but also very fine-grained)
  36. Benefit 2: Precise flexible description (diagram)
       • ex:City rdf:type rdfs:Class
       • ex:cityFounded rdf:type rdf:Property
       • ex:cityFounded rdfs:domain ex:City
       • ex:cityFounded rdfs:range xsd:gYear
       • dbp:London rdf:type ex:City
       • dbp:London ex:cityFounded 47
     Prefixes:
       • dbp: http://dbpedia.org/resource/
       • ex: http://example.org/ontology/
       • rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
       • rdfs: http://www.w3.org/2000/01/rdf-schema#
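One practical consequence of describing data this precisely is entailment. Under the RDFS rule usually called rdfs2, a property's rdfs:domain lets a reasoner infer the type of any subject that uses that property. The Python sketch below applies only that one rule to tuples of strings. It assumes, for illustration, that the explicit type statement for dbp:London is absent, and the function name is hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

# Schema and data from the slide, minus the explicit type of dbp:London.
TRIPLES = [
    ("ex:City", RDF_TYPE, "rdfs:Class"),
    ("ex:cityFounded", RDF_TYPE, "rdf:Property"),
    ("ex:cityFounded", RDFS_DOMAIN, "ex:City"),
    ("ex:cityFounded", "rdfs:range", "xsd:gYear"),
    ("dbp:London", "ex:cityFounded", "47"),
]


def infer_domain_types(triples):
    """Return the rdf:type triples entailed by rdfs:domain (rule rdfs2)."""
    # Collect (property, class) pairs declared with rdfs:domain.
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    inferred = set()
    for prop, cls in domains:
        for s, p, _ in triples:
            if p == prop:
                # Any subject using the property is an instance of its domain.
                inferred.add((s, RDF_TYPE, cls))
    # Report only statements that were not already present.
    return inferred - set(triples)


INFERRED = infer_domain_types(TRIPLES)
print(INFERRED)
```

A full RDFS or OWL reasoner applies many more rules than this; the sketch only shows why a schema expressed in the same model as the data can be acted on mechanically.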
  37. Benefit 3: Uniform access
       • SPARQL 1.1
       • SPARQL Protocol
       • HTTP
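Uniform access means that any client able to make an HTTP request can query any source. In the SPARQL Protocol, a query can be sent as the `query` parameter of a GET request. The Python sketch below only builds such a URL and makes no network call. The endpoint address, the full IRI used for music:instName, and the function name are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://example.com/sparql"  # hypothetical endpoint address
QUERY = (
    "SELECT ?name WHERE { "
    "?inst <http://example.com/music/instName> ?name "  # hypothetical property IRI
    "}"
)
# Media type a client can request for SELECT results in JSON.
ACCEPT = "application/sparql-results+json"


def build_sparql_get_url(endpoint, query):
    """Encode a query as a SPARQL Protocol 'query via GET' URL."""
    return endpoint + "?" + urlencode({"query": query})


URL = build_sparql_get_url(ENDPOINT, QUERY)
print(URL)

# Round trip: decoding the URL gives back the original query text.
DECODED = parse_qs(urlsplit(URL).query)["query"][0]
print(DECODED == QUERY)
```

Because the request is ordinary HTTP, the same URL could be issued by `urllib.request`, a browser, or a command-line client, with the `ACCEPT` value sent as the Accept header.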
  38. Benefit 4: Federation (diagram): a Semantic Data Source federates two other Semantic Data Sources, one over a Relational Database and one over DBpedia
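Federation boils down to answering one question from several sources and joining the partial answers. In SPARQL 1.1 this is expressed with the SERVICE keyword. The Python sketch below imitates the idea with two in-memory stand-ins instead of real endpoints. The `Source` class, its `match` method, and `string_players` are hypothetical names, and the data is the music example from the earlier slides split across two sources.

```python
class Source:
    """Stand-in for a semantic data source; a real one would answer SPARQL over HTTP."""

    def __init__(self, triples):
        self.triples = list(triples)

    def match(self, s=None, p=None, o=None):
        """Return the triples matching a pattern; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]


# One source knows about musicians, the other about instruments.
MUSICIAN_SOURCE = Source([
    ("artist:1", "music:lastName", "Van Halen"),
    ("artist:1", "music:plays", "instrument:10"),
    ("artist:2", "music:lastName", "Ma"),
    ("artist:2", "music:plays", "instrument:20"),
    ("artist:3", "music:lastName", "G"),
    ("artist:3", "music:plays", "instrument:30"),
])
INSTRUMENT_SOURCE = Source([
    ("instrument:10", "music:instType", "String"),
    ("instrument:20", "music:instType", "String"),
    ("instrument:30", "music:instType", "Woodwind"),
])


def string_players(musician_source, instrument_source):
    """Last names of musicians who play a string instrument, joined across sources."""
    # Ask the instrument source which instruments are strings.
    strings = {s for s, _, _ in instrument_source.match(p="music:instType", o="String")}
    names = []
    # Ask the musician source who plays what, then join on the instrument.
    for artist, _, instrument in musician_source.match(p="music:plays"):
        if instrument in strings:
            for _, _, last in musician_source.match(s=artist, p="music:lastName"):
                names.append(last)
    return sorted(names)


RESULT = string_players(MUSICIAN_SOURCE, INSTRUMENT_SOURCE)
print(RESULT)
```

The join here is done naively in the client. Deciding which source to ask first, and how much data to pull back, is the federated query optimization problem listed among the challenges.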
  39. Data Integration Solutions (with semantics)
       1. Discovery and description
       2. Internal integration
       3. External integration
       4. Nomadic data
       5. Inflexible interfaces
  40. Challenges
       • Relating data domains
       • Security
       • Unconstrained query access
       • Federated query optimization
  45. Thanks! Visit us at http://revelytix.com or at our booth!
