
8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and Semantics

Juan Sequeda, co-founder of Capsenta, gave an interesting talk on how we can integrate data using graphs and semantics (semantic data virtualization). As Mr. Sequeda said, the idea is to integrate data without needing to move it around. He opened by describing the large gap between IT departments, the guardians of the data, and business development departments, which try to extract insights from that data.

Published in: Technology

  1. Integrating Data using Graphs and Semantics. Juan F. Sequeda, juan@capsenta.com
  2. IT; Biz; Total net sales of all Orders today; Reports
  3. What do you mean by … How many orders were placed in May 2016? 317,595; 317,124; 316,899. Billing; Shipping; E-Commerce
  4. What do you mean by … What is an Order? When a user clicks “Order” on the website. When the customer has received the product. When it comes out of the billing system and the CC has been charged. Billing; Shipping; E-Commerce. Data resides in different sources. Ambiguity. No Shared Understanding. Lack of Semantics.
  5. Status Quo 1. IT; Biz; Data Architect. Total net sales of all Orders today. SELECT .. FROM …; csv; csv; csv; MS Access; XLS; T=1; T=2; T=3. Did the Biz User communicate the correct message to IT? Did IT understand correctly what the Biz User wanted? Did IT deliver the correct/precise results? Reports; XLS; XLS
  6. Status Quo 2. Enterprise Data Warehouse; IT; Biz; Data Architect; Reports; Time and $. ETL; ETL; ETL. Total net sales of all Orders today. Total net sales of all Orders today with FX.
  7. Cross Organizational Data Integration. Organization 1; Organization 2; Organization n
  8. (no text)
  9. GRAPHS ARE COOL!
  10. Flexible: :US_Constitution_1992/section/123 “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.” :text :US_Constitution_1992 “United States of America 1789 (rev. 1992)” :text :isSectionOf :Cruelty :hasTopic “Prohibition of cruel or degrading treatment” :label “inhumane treatment” :keyword
  11. Integration: :US_Constitution_1992/section/123 “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.” :text :US_Constitution_1992 “United States of America 1789 (rev. 1992)” :isSectionOf :Cruelty :hasTopic “Prohibition of cruel or degrading treatment” :label “inhumane treatment” :keyword :text :EighthAmendment_USConstitution :Farmer_vs_Brennan :lawsApplied “A prison official’s ‘deliberate indifference’ to a substantial risk of a serious harm to an inmate violates the Eighth Amendment” :holding :sameAs :Prisons_in_Indiana :LGBT_right_case_laws :subject :subject
  12. Data and Metadata are One: :US_Constitution_1992/section/123 “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.” :text :US_Constitution_1992 “United States of America 1789 (rev. 1992)” :isSectionOf :Cruelty :hasTopic “Prohibition of cruel or degrading treatment” :label “inhumane treatment” :keyword :text :Section :Constitution :Topic :Rights_and_Duties :Physical_Integrity_Rights :subClass :subClass :subClass :hasTopic :isSectionOf :type :type
  13. Common denominator. XML: <constitution id="US_Constitution_1992"> <section id="US_Constitution_1992/section/123"> <text>Excessive bail shall ...</text> </section> <topic>Cruelty</topic> </constitution>. Text: “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.” Tabular (id, text, topic): (123, Excessive bail shall…, Cruelty). Graph: :US_Constitution_1992/section/123 “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.” :text :Cruelty :hasTopic
  14. Traversal, Navigation, Reachability: :US_Constitution_1992/section/123 “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.” :text :US_Constitution_1992 “United States of America 1789 (rev. 1992)” :isSectionOf :Cruelty :hasTopic “Prohibition of cruel or degrading treatment” :label “inhumane treatment” :keyword :text :EighthAmendment_USConstitution :Farmer_vs_Brennan :lawsApplied “A prison official’s ‘deliberate indifference’ to a substantial risk of a serious harm to an inmate violates the Eighth Amendment” :holding :sameAs :Prisons_in_Indiana :LGBT_right_case_laws :subject :subject
  15. Semantics: :US_Constitution_1992/section/123 “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.” :text :Cruelty :hasTopic “Prohibition of cruel or degrading treatment” :label “inhumane treatment” :keyword :Physical_Integrity_Rights :subClass :hasTopic
  16. (Summary) Why are Graphs Cool?
      • Flexible
      • Integration
      • Data and Metadata are one
      • Common Denominator
      • Traversal, Navigation, Reachability
      • Semantics
      ACM Computing Surveys 2008
  17. Integrating Data using Graphs and Semantics. HIVE, Impala, etc; Oracle; SQL Server; Postgres; Unstructured; Semi-Structured. Mappings. Enterprise Knowledge Graph. Search; Reports; API; Dashboard
  18. MAPPING RELATIONAL DATABASES TO GRAPHS
  19. Relational Database to RDF (RDB2RDF). Person (ID, NAME, AGE, CID): (1, Alice, 25, 100), (2, Bob, NULL, 100). City (CID, NAME): (100, Austin), (200, Madrid). Mapping. <Person/1> <City/100> Alice 25 Austin <Person/2> Bob <City/200> Madrid foaf:name foaf:name foaf:age rdfs:label rdfs:label foaf:based_near
  20. W3C RDB2RDF Standards
      • Standards to map Relational Data to RDF
      • A Direct Mapping of Relational Data to RDF: default automatic mapping of relational data to RDF
      • R2RML (RDB to RDF Mapping Language): customizable language to map relational data to RDF
  21. W3C Direct Mapping. Input: Relational Database (Schema and Data), Primary Keys, Foreign Keys. Direct Mapping Engine. Output: RDF graph
  22. W3C Direct Mapping Result. Person (ID, NAME, AGE, CID): (1, Alice, 25, 100), (2, Bob, NULL, 100). City (CID, NAME): (100, Austin), (200, Madrid). Direct Mapping. <Person/ID=1> <City/CID=100> Alice 25 Austin <Person/ID=2> Bob <City/CID=200> Madrid Person#Name Person#Age City#Name City#Name Person#ref-CID Person#Name
  23. R2RML. Relational Database; R2RML File; R2RML Engine; RDF. Target Schema: :Cruelty :Section :Constitution :Topic :Rights_and_Duties :Physical_Integrity_Rights :subClass :subClass :subClass :hasTopic :isSectionOf
  24. Example R2RML
      # Each Person row becomes a foaf:Person, linked to its City through the CID join.
      <TriplesMap1> a rr:TriplesMap;
          rr:logicalTable [ rr:tableName "Person" ];
          rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ];
          rr:predicateObjectMap [
              rr:predicate foaf:based_near;
              rr:objectMap [
                  rr:parentTriplesMap <TriplesMap2>;
                  rr:joinCondition [ rr:child "CID"; rr:parent "CID" ]
              ]
          ] .
      # Each City row becomes an ex:City with a foaf:name taken from a column.
      <TriplesMap2> a rr:TriplesMap;
          rr:logicalTable [ rr:tableName "City" ];
          rr:subjectMap [ rr:template "http://ex.com/City/{CID}"; rr:class ex:City ];
          rr:predicateObjectMap [
              rr:predicate foaf:name;
              rr:objectMap [ rr:column "TITLE" ]
          ] .
  25. Graph Data Virtualization. SPARQL; R2RML Mapping; SQL; RDBMS; SQL Results; SPARQL Results; Graph
  26. NoETL Architecture. SPARQL Federator; 4 × Ultrawrap NoETL; 4 × R2RML; 4 × RDBMS
  27. Hybrid NoETL and ETL Architecture. SPARQL Federator; 2 × Ultrawrap NoETL; Ultrawrap ETL; RDF Triplestore; 4 × R2RML; 4 × RDBMS
  28. Scalability
      • Seconds vs Months
      • Reuse existing relational infrastructure: 30+ years of optimizations; Semantic Query Optimizations
      • Result: SPARQL as fast as SQL under mappings
      Sequeda & Miranker. Ultrawrap: SPARQL Execution on Relational Data. J. of Web Semantics 2013
  29. The Tipping Point Problem. Relational Database; Graphs
      • Flexible
      • Integration
      • Data and Metadata are One
      • Common Denominator
      • Traversal, Navigation, Reachability
      • Semantics
      Sequeda (2015) Integrating Relational Databases with the Semantic Web. An overarching theme is the need to create systematic and real-world benchmarks in order to evaluate different solutions for these features.
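
The "W3C Direct Mapping Result" slide can be illustrated with a small sketch. The code below is a simplified approximation of the direct-mapping idea, not the W3C algorithm itself and not anything shown in the talk: the `TABLES` dictionary, `row_iri`, and `direct_mapping` are hypothetical names, IRIs are left relative, and literals are plain Python values. It reproduces the behaviour visible on the slide: one subject per row built from the primary key, one predicate per column, no triple for a NULL value, and a `ref-` triple for the foreign key.

```python
# Hypothetical in-memory copy of the Person and City tables from the slides.
TABLES = {
    "Person": {
        "pk": "ID",
        "fks": {"CID": "City"},  # foreign-key column -> referenced table
        "rows": [
            {"ID": 1, "NAME": "Alice", "AGE": 25, "CID": 100},
            {"ID": 2, "NAME": "Bob", "AGE": None, "CID": 100},
        ],
    },
    "City": {
        "pk": "CID",
        "fks": {},
        "rows": [
            {"CID": 100, "NAME": "Austin"},
            {"CID": 200, "NAME": "Madrid"},
        ],
    },
}


def row_iri(table, key_column, key_value):
    """Build a relative row IRI such as <Person/ID=1>."""
    return f"<{table}/{key_column}={key_value}>"


def direct_mapping(tables):
    """Return (subject, predicate, object) triples for every row of every table."""
    triples = []
    for name, table in tables.items():
        pk = table["pk"]
        for row in table["rows"]:
            subject = row_iri(name, pk, row[pk])
            triples.append((subject, "rdf:type", f"<{name}>"))
            for column, value in row.items():
                if value is None:
                    continue  # a NULL cell produces no triple
                # Every non-NULL column yields a literal triple.
                triples.append((subject, f"<{name}#{column}>", value))
                # A foreign-key column additionally yields a reference triple.
                if column in table["fks"]:
                    target = table["fks"][column]
                    target_pk = tables[target]["pk"]
                    triples.append(
                        (subject, f"<{name}#ref-{column}>", row_iri(target, target_pk, value))
                    )
    return triples


if __name__ == "__main__":
    for triple in direct_mapping(TABLES):
        print(triple)
```

Bob's NULL age simply disappears from the graph, which is why the slide shows no age edge for `<Person/ID=2>`.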

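The "Graph Data Virtualization" slide describes answering SPARQL by rewriting it to SQL under a mapping, so the data never leaves the relational database. A minimal sketch of that idea follows, using Python's standard `sqlite3` module. It is an assumption-laden illustration, not how Ultrawrap works: the `MAPPING` table, `triple_pattern_to_sql`, `evaluate`, and `build_demo_database` are hypothetical names, and only a single triple pattern of the form `?s <predicate> ?o` is handled.

```python
import sqlite3

# Hypothetical mapping: predicate -> (table, key column, value column, subject IRI template).
MAPPING = {
    "foaf:name": ("Person", "ID", "NAME", "http://www.ex.com/Person/{}"),
    "foaf:age": ("Person", "ID", "AGE", "http://www.ex.com/Person/{}"),
}


def triple_pattern_to_sql(predicate):
    """Translate the pattern `?s <predicate> ?o` into a single SQL query."""
    table, key, column, _template = MAPPING[predicate]
    # Identifiers come from the fixed mapping above, never from user input.
    return f"SELECT {key}, {column} FROM {table} WHERE {column} IS NOT NULL ORDER BY {key}"


def evaluate(connection, predicate):
    """Run the rewritten SQL and return (subject IRI, object value) bindings."""
    template = MAPPING[predicate][3]
    rows = connection.execute(triple_pattern_to_sql(predicate)).fetchall()
    return [(template.format(key), value) for key, value in rows]


def build_demo_database():
    """Create an in-memory copy of the Person table from the slides."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE Person (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER, CID INTEGER)"
    )
    connection.executemany(
        "INSERT INTO Person VALUES (?, ?, ?, ?)",
        [(1, "Alice", 25, 100), (2, "Bob", None, 100)],
    )
    return connection


if __name__ == "__main__":
    demo = build_demo_database()
    print(triple_pattern_to_sql("foaf:age"))
    print(evaluate(demo, "foaf:age"))
```

The `IS NOT NULL` filter mirrors the mapping semantics in which a NULL cell produces no triple, so the virtual graph and a materialized one would give the same answers for this pattern.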
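
The "Traversal, Navigation, Reachability" slide can be made concrete with a breadth-first search over triples. This is a hedged sketch: the edge directions in `TRIPLES` are an assumption, since the flattened slide text lists the nodes and edge labels but not which way each edge points, and `reachable` is a hypothetical helper.

```python
from collections import deque

# Edge directions are assumed; the slide text only lists node and edge labels.
TRIPLES = [
    (":Farmer_vs_Brennan", ":lawsApplied", ":EighthAmendment_USConstitution"),
    (":EighthAmendment_USConstitution", ":sameAs", ":US_Constitution_1992/section/123"),
    (":US_Constitution_1992/section/123", ":isSectionOf", ":US_Constitution_1992"),
    (":US_Constitution_1992/section/123", ":hasTopic", ":Cruelty"),
    (":Farmer_vs_Brennan", ":subject", ":Prisons_in_Indiana"),
    (":Farmer_vs_Brennan", ":subject", ":LGBT_right_case_laws"),
]


def reachable(triples, start):
    """Return every node reachable from `start` by following edges forward."""
    adjacency = {}
    for subject, _predicate, obj in triples:
        adjacency.setdefault(subject, []).append(obj)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)  # report only the nodes other than the start
    return seen


if __name__ == "__main__":
    print(sorted(reachable(TRIPLES, ":Farmer_vs_Brennan")))
```

Under these assumed directions, the court case reaches the `:Cruelty` topic through the `:sameAs` link between the two sources, which is the kind of cross-source navigation the integration slides point to.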