Integrating Semantic Web in the Real World: A Journey between Two Cities

Keynote at The 9th International Conference on Knowledge Capture (KCAP2017), Austin, Texas, Dec 2017

An early vision in Computer Science has been to create intelligent systems capable of reasoning on large amounts of data. Today, this vision can be delivered by integrating Relational Databases with the Semantic Web using the W3C standards: a graph data model (RDF), an ontology language (OWL), a mapping language (R2RML) and a query language (SPARQL). The research community has successfully shown how intelligent systems can be created with Semantic Web technologies, now dubbed Knowledge Graphs.

However, where is the mainstream industry adoption? What are the barriers to adoption? Are these engineering and social barriers or are they open scientific problems that need to be addressed?

This talk will chronicle our journey of deploying Semantic Web technologies with real world users to address Business Intelligence and Data Integration needs, describe technical and social obstacles that are present in large organizations, and scientific challenges that require attention.

Published in: Technology

  1. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com. Integrating Semantic Web in the Real World: A journey between two cities. Juan F. Sequeda. Keynote at The 9th International Conference on Knowledge Capture (K-CAP2017), December 6, 2017
  3. Take Away Message • Reflect on our journey to commercialize semantic web technology to address data integration and business intelligence needs. Question • Why is it so hard to deploy Semantic Web technologies in the real world? • Answer: 1. History 2. Knowledge Engineer 3. Ontology/mapping engineering
  4. History: Let's put […] in Today's Context. [Timeline figure from the 1970s through the 1980s, 1990s and 2000s to today, relating Data and Logic to RDBMS and the Semantic Web. Labels: Relational Algebra; Workshop on Logic and Data Bases, Toulouse 1977 (Gallaire, Nicolas & Minker); Semantic Networks; KL-ONE; Deductive Databases; Workshops on Expert Systems; Japanese 5th Generation Project; MCC, Austin, TX; Description Logic; KRDB; Views; Triggers; SQL99 Recursion; RDF; OWL]
  5. Where we started in 2007… What is the relationship between Relational Databases (Relational Model, Table Definition, Constraints, Triggers, SQL) and the Semantic Web (RDF, RDFS, OWL, Rules, SPARQL) over TIME? [Example figure labels: Programmer, type, 2, “Bob”, name, Literal, ITEmployee, subClassOf] Example query: SELECT ?s ?n { ?s type ITEmployee. ?s name ?n } Sequeda et al. SQL Databases are a Moving Target. W3C Workshop on RDF Access on RDB. 2007
  6. 10 years ago • D2R (Map, Q, Server), Virtuoso RDF Views, SquirrelRDF, R2D2, Relational.OWL, DB2OWL, R2O, Triplify, Dartgrid, RDBToOnto, METAmorphoses, …
  7. Some Issues early on • “Comparing the overall performance […] of the fastest rewriter with the fastest relational database shows an overhead for query rewriting of 106%. This is an indicator that there is still room for improving the rewriting algorithms.” [Bizer and Schultz. Berlin SPARQL Benchmark 2009] • “Current rdb2rdf systems are not capable of providing the query execution performance required [...] it is likely that with more work on query translation, suitable mechanisms for translating queries could be developed. These mechanisms should focus on exploiting the underlying database system’s capabilities to optimize queries and process large quantities of structure data” [Gray et al. 2009]
  8. https://sourceforge.net/p/d2rq-map/mailman/message/28055191/ Sept 2011
  9. Why was this happening if … ISWC 2008
  10. (1) Relational Databases → Semantic Web: Direct Mapping. [Diagram: I and R, Σ are mapped to DM(R, Σ, I)] • Formalization in Datalog • Databases with NULLs • Correctness of a Direct Mapping: Information Preservation, Query Preservation, Monotonicity, Semantics Preservation • No monotone direct mapping is semantics preserving. On Directly Mapping Relational Databases to RDF and OWL. Sequeda, Arenas, Miranker. WWW 2012
  11. (2) Relational Databases ← Semantic Web: Ultrawrap. [Diagram labels: Relational Database, Direct Mapping, Mapping as Views, Mapping Compiler, Tripleview, SPARQL to SQL on Views, SQL Optimizer, Results] • Commercial RDB • Semantic Query Optimization • Detection of Unsatisfiable Conditions • Self Join Elimination. Ultrawrap: SPARQL Execution on Relational Data. Sequeda & Miranker. J. Web Semantics 2013. Related work: • Chakravarthy, Grant and Minker. Logic-Based Approach to Semantic Query Optimization. TODS 1990 • Cheng et al. Implementation of Two Semantic Query Optimization Techniques in DB2 Universal Database. VLDB 1999
  12. (3) Relational Databases ↔ Semantic Web: UltrawrapOBDA. [Diagram labels: Relational Database, Mapping, Saturated Mapping, Mapping as Views, Mapping Compiler, Tripleview, SPARQL to SQL on Views, SQL Optimizer, Results; OWL, SQL, EL, RL, QL, DL] • Commercial RDB • Answering Queries using Views • Rewriting using materialized views • Recursion in SQL. OBDA: Query Rewriting or Materialization? In practice, Both! Sequeda, Arenas, Miranker. ISWC 2014 (Best Paper). Related work: • Gallaire et al. Logic and Databases: A Deductive Approach. ACM Survey 1984 • Chaudhuri et al. Optimizing queries with materialized views. ICDE95 • Harinarayan et al. Implementing Data Cubes Efficiently. SIGMOD96 • Halevy. Answering queries using views: A survey. VLDBJ 2001 • Mami & Bellahsene. A Survey of View Selection Methods. SIGMOD Record 2012
  13. HOW and to what EXTENT can RDB be integrated with the SW? • RDB can be automatically directly mapped to RDF and OWL: Direct Mappings can be Monotone, Information Preserving and Query Preserving. Monotonicity is an obstacle for Semantics Preservation. • RDB can evaluate and optimize SPARQL 1.0 queries: Existing Semantic Query Optimization in commercial RDBMS. • RDB can act as a reasoner for Ontologies with inheritance and transitivity: Saturated Mappings, Query rewriting using Materialized Views and Recursion.
  14. Where did our research journey take us? [Diagram labels: Oracle, SQL Server, Postgres, MySQL, IBM DB2; Enterprise Knowledge Graph] • Sheth & Larson. Federated database systems for managing distributed, heterogeneous, and autonomous databases. ACM Survey. 1990 • Carnot92, Infosleuth92, SIMS93, Information Manifold96, Lore96, TSIMMIS97, Kleisli99, Nimble01, Clio01, Sphinx04
  15. Our Journey. https://constituteproject.org/
  16. SEMANTIC CITY | NON-SEMANTIC CITY
  17. Data Integration and Business Intelligence. [Diagram labels: IT, Biz, Reports, “Total net sales of all Orders today”]
  18. Business Question: How many orders were placed in November 2017? Three different answers (317,595; 317,124; 316,899) come from three systems (Billing, Shipping, E-Commerce).
  19. What do you mean by … What is an Order? • E-Commerce: when a user clicks “Order” on the website • Shipping: when the customer has received the product • Billing: when it comes out of the billing system and the CC has been charged
  20. Status Quo 1. [Diagram labels: Biz, IT, Data Architect, “Total net sales of all Orders today”, SELECT .. FROM …, csv, MS Access, XLS, Reports, T=1, T=2, T=3] • Did the Biz User communicate the correct message to IT? • Did IT understand correctly what the Biz User wanted? • Did IT deliver the correct/precise results? https://www.wsj.com/articles/finance-pros-say-youll-have-to-pry-excel-out-of-their-cold-dead-hands-1512060948
  21. Status Quo 2. [Diagram labels: Biz, IT, Data Architect, Enterprise Data Warehouse, ETL, Reports, Time and $, “Total net sales of all Orders today”, “Total net sales of all Orders today with FX”]
  22. What is actually going on here • Subject Matter Expert knows the business domain • Dialog between users • Understand the Domain → ontology • Find where it is in the data → mappings • Sound familiar? Giarratano & Riley. Expert Systems: Principles and Programming. 1989
  23. Semantic Web can help... right? Who creates this? Using what tools? IT IS NOT EASY! HOWEVER
  24. Chasm between the two Cities. SEMANTIC CITY | NON-SEMANTIC CITY. G. Moore. Crossing the Chasm.
  25. Observation 1: Boiling the Ocean • Ontology Engineering: traditional ontology engineering methodologies; using competency questions; test driven development; ontology design patterns; ... • Mapping Engineering: Ontology Matching/Alignment; Schema Matching/Alignment. “There is not a right ontology. But a useful” - F. van Harmelen. https://www.flickr.com/photos/eclogite/4950276577/
  26. Observation 2: Real World Schemas are Hard
  27. Observation 3: Real World Mappings are Hard. How to deal with NULLs and Duplicates in a mapping?
  28. Observation 4: Tools are made for Semantic City
  29. So what is the solution? • Create tools for citizens of the Non-Semantic City (?) • Knowledge Engineering as a “transfer process” of human knowledge to a KB during the 80s did not succeed: the assumption was that knowledge exists and just has to be collected and implemented; knowledge was obtained by interviewing experts on how they solve specific tasks; this was feasible for small prototypical systems; it failed to produce large, reliable and maintainable knowledge bases. Studer et al. Knowledge Engineering: Principles and methods. Data & Know. Engineering 1998
  30. The Resurrection of the Knowledge Engineer! [Diagram labels: IT, Data Engineers, Data Access, “Geeky Person”; KE, Knowledge Engineer, Business & Data Modeling; Biz, Domain (Biz) Experts, “People Person”] D. Michie. Knowledge Engineering. Kybernetes 1973
  31. The Knowledge Engineer • Analyze graph structures and content and develop new semantic representations. • Make decisions and provide guidance about ontologies and semantic representations. • Write code to gather, process, and analyze data of various kinds. • Work with researchers, engineers, and linguists to develop new techniques for expansion, improvement, and analysis of the Knowledge Graph. https://careers.google.com/jobs#!t=jo&jid=/google/linguist-ontologist-google-knowledge-firebase-345-spear-st-san-francisco-ca-3182490028
  32. Knowledge Engineer vs Data Scientist. [Diagram labels: IT, Data Engineers; KE, Knowledge Engineer; DS, Data Scientist; Biz, Domain (Biz) Experts] “Most data scientists spend only 20 percent of their time on actual data analysis and 80 percent of their time finding, cleaning, and reorganizing huge amounts of data, which is an inefficient data strategy” https://www.infoworld.com/article/3228245/data-science/the-80-20-data-science-dilemma.html
  33. How is the Knowledge Engineer empowered in order to be successful?
  34. Idea 1: Pay as you go Methodology. A Pay-As-You-Go Methodology for Ontology-Based Data Access. Sequeda & Miranker. IEEE Internet Computing 2017. Knowledge Engineering as a modeling process: Studer et al. Knowledge Engineering: Principles and methods. Data & Know. Engineering 1998; CommonKADS, MIKE, PROTÉGÉ, VITAL, EXPECT
  35. Idea 2: Extract mappings from Source Queries. SELECT o.orderid, o.orderdate, o.ordertotal - ot.finaltax - CASE WHEN o.currencyid IN ('USD', 'CAD') THEN o.shippingcost ELSE o.shippingcost - ot.shippingtax END AS netsales, o.currencyid FROM "order" o, ordertax ot WHERE o.orderid = ot.orderid AND o.statusid NOT IN (4, 5)
  36. Idea 3: Tools for the Knowledge Engineer
  37. Bridging the Chasm. SEMANTIC CITY | Knowledge Engineer | NON-SEMANTIC CITY
  38. Our Vision
  39. Thanks • Daniel Miranker • Marcelo Arenas • Oscar Corcho • ... And many more | • Daniel Miranker • Wayne Heideman • Will Briggs • Rick Liao • Bill Rogers • ... And many more
  40. Takeaway Message. Why is it so hard to deploy Semantic Web technologies in the real world? Because we need to bridge the chasm between the Semantic and Non-Semantic Cities. We need Knowledge Engineers, who need to be empowered with methodologies and tools. • Know the History: Don’t reinvent the wheel. Read pre-pdf paper. • Knowledge Engineer: It’s back. And sexy. • Ontology and Mapping Engineering challenges: New Problems. THANK YOU! We are always looking for smart people (and Knowledge Engineers!). Juan Sequeda, Ph.D., Co-Founder, Capsenta. juan@capsenta.com, @juansequeda. Sequeda J. Integrating Relational Databases with the Semantic Web. IOS Press. 2016. http://www.iospress.nl/book/integrating-relational-databases-with-the-semantic-web/
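
The Tripleview idea on slides 11 and 12 (mappings expressed as SQL views, SPARQL answered as SQL over those views) can be illustrated with a small sketch. This is not Ultrawrap's implementation. It assumes a hypothetical `programmer` table modeled on the slide 5 example, string identifiers such as `programmer/2` in place of real IRIs, a view named `tripleview`, and that Programmer is a subclass of ITEmployee (the slide shows the `subClassOf` label but the transcript does not state its direction). The SPARQL query from slide 5 is translated to SQL by hand, and SQLite stands in for a commercial RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical source table, modeled on the slide 5 example.
CREATE TABLE programmer (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO programmer (id, name) VALUES (2, 'Bob');
INSERT INTO programmer (id, name) VALUES (3, NULL);

-- Mapping as a view: every row of the view is one (s, p, o) triple.
CREATE VIEW tripleview AS
    -- Each row is an instance of the class named after its table.
    SELECT 'programmer/' || id AS s, 'type' AS p, 'Programmer' AS o
    FROM programmer
    UNION ALL
    -- Saturated mapping (assumption: Programmer subClassOf ITEmployee),
    -- so the subclass inference is compiled into the view itself.
    SELECT 'programmer/' || id, 'type', 'ITEmployee' FROM programmer
    UNION ALL
    -- One triple per non-NULL attribute value; NULLs yield no triple.
    SELECT 'programmer/' || id, 'name', name
    FROM programmer WHERE name IS NOT NULL;
""")

# Hand translation of: SELECT ?s ?n { ?s type ITEmployee. ?s name ?n }
# Each triple pattern becomes one scan of the view; the shared variable
# ?s becomes the join condition.
SPARQL_AS_SQL = """
SELECT t1.s, t2.o
FROM tripleview AS t1
JOIN tripleview AS t2 ON t1.s = t2.s
WHERE t1.p = 'type' AND t1.o = 'ITEmployee' AND t2.p = 'name'
ORDER BY t1.s
"""

rows = conn.execute(SPARQL_AS_SQL).fetchall()
print(rows)
```

Because the triples live in a view rather than a copied table, the SQL optimizer sees the base table directly, which is where the self-join elimination mentioned on slide 11 would apply in a system that supports it.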

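
Idea 2 on slide 35 (extract mappings from source queries) can be sketched by wrapping the existing net sales query in a view and reading triples from it. Everything outside the query itself is an assumption made for illustration: the column types, the sample rows, the view name `order_netsales`, the identifier scheme `order/<id>` and the property name `netSales`. SQLite is used only as a convenient engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and rows; only the column names come from slide 35.
CREATE TABLE "order" (
    orderid INTEGER PRIMARY KEY, orderdate TEXT, ordertotal REAL,
    shippingcost REAL, currencyid TEXT, statusid INTEGER);
CREATE TABLE ordertax (orderid INTEGER, finaltax REAL, shippingtax REAL);

INSERT INTO "order" VALUES (1, '2017-11-01', 100.0, 10.0, 'USD', 1);
INSERT INTO "order" VALUES (2, '2017-11-02', 200.0, 15.0, 'EUR', 1);
INSERT INTO "order" VALUES (3, '2017-11-03', 50.0, 5.0, 'USD', 4);
INSERT INTO ordertax VALUES (1, 8.0, 2.0);
INSERT INTO ordertax VALUES (2, 20.0, 3.0);
INSERT INTO ordertax VALUES (3, 4.0, 1.0);

-- The source query from slide 35, kept as a view so that the business
-- logic it encodes (what counts as an order, how net sales is computed)
-- becomes the body of a mapping.
CREATE VIEW order_netsales AS
SELECT o.orderid, o.orderdate,
       o.ordertotal - ot.finaltax
         - CASE WHEN o.currencyid IN ('USD', 'CAD') THEN o.shippingcost
                ELSE o.shippingcost - ot.shippingtax END AS netsales,
       o.currencyid
FROM "order" AS o
JOIN ordertax AS ot ON o.orderid = ot.orderid
WHERE o.statusid NOT IN (4, 5);
""")

# Each view row becomes one triple: (order identifier, netSales, value).
triples = [
    (f"order/{orderid}", "netSales", netsales)
    for orderid, netsales in conn.execute(
        "SELECT orderid, netsales FROM order_netsales ORDER BY orderid"
    )
]
print(triples)
```

In this sample data, order 3 has status 4 and is therefore filtered out, so the view's WHERE clause acts as one concrete definition of "Order", the question raised on slide 19.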