Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Graph Query Languages
Juan F. Sequeda, Ph.D
Co-Founder
Ca...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Take	
  away	
  message
• Graph	
  Databases	
  need	
  a...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Linked	
  Data	
  Benchmark	
  Council	
  (LDBC)
• LDBC	
...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
LDBC	
  Organization	
  (non-­‐profit)
“sponsors”
+	
  no...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
LDBC	
  Benchmarks
5
http://ldbcouncil.org/benchmarks
Gra...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Graph	
  Query	
  Language	
  Task	
  Force
• Study	
  qu...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Property	
  Graph	
  Data	
  Model
id:	
  123
name:	
  Ju...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Graph	
  Query	
  Languages	
  Today
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Standard	
  Syntax	
  and	
  Semantics
• Standard
– Vendo...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Closed	
  Language
SQL
Graph
Query
Language*
however	
  …
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Path	
  Support
• Conjunctive	
  Query	
  (CQ)
• Regular	...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Desired	
  Query	
  Functionalities
• Adjacency	
  Querie...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Adjacency	
  queries
• Property	
  access
– Get	
  the	
 ...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Graph	
  Pattern	
  Matching
• Join
– Get	
  the	
  creat...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Navigational	
  queries
• Reachability
– Is	
  there	
  a...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Where	
  are	
  we	
  now?
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Decision:	
  Property	
  Graph	
  Data	
  Model
In	
  the...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Property	
  Graph	
  Data	
  Model
id:	
  123
name:	
  Ju...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Unified	
  Data	
  Model	
  Options
• Extend	
  Relationa...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Unified	
  Data	
  Model	
  Options
• “Natural”	
  Graph	...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
(1)	
  Paths	
  make	
  things	
  “weird”!
SELECT x, y, C...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
(2)	
  Paths	
  make	
  things	
  “weird”!
22
SELECT x, y...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
(1)	
  Graph	
  Projection	
  (instead	
  of	
  Relation	...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
(2)	
  Graph	
  Projection	
  (instead	
  of	
  Relation	...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
(1)	
  Graph	
  Projection	
  with	
  Paths
25
1 2 3 4
5 ...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
(2)	
  Graph	
  Projection	
  with	
  Paths
26
1 2 3 4
5 ...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Wrap	
  up
• We	
  want	
  to	
  hear	
  from	
  you!
• 9...
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
THANK	
  YOU
Juan	
  Sequeda,	
  Ph.D
Co-­‐Founder	
  – C...
Upcoming SlideShare
Loading in …5
×

Graph Query Languages: update from LDBC

768 views

Published on

The Linked Data Benchmark Council (LDBC) is a non-profit organization dedicated to establishing benchmarks, benchmark practices and benchmark results for graph data management software. The Graph Query Language task force of LDBC is studying query languages for graph data management systems, and specifically those systems storing so-called Property Graph data. The goals of the GraphQL task force are to:
Devise a list of desired features and functionalities of a graph query language.
Evaluate a number of existing languages (i.e. Cypher, Gremlin, PGQL, SPARQL, SQL), and identify possible issues.
Provide a better understanding of the design space and state-of-the-art.
Develop proposals for changes to existing query languages or even a new graph query language.
This query language should cover the needs of the most important use-cases for such systems, such as social network and Business Intelligence workloads.

This talk will present an update of the work accomplished by the LDBC GraphQL task force. We also look for input from the graph community.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Graph Query Languages: update from LDBC

  1. 1. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Graph Query Languages Juan F. Sequeda, Ph.D Co-Founder Capsenta 1Smart  Data  – Graphorum Conference  – January  20,  2017 @juansequeda @  LDBCouncil
  2. 2. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Take  away  message • Graph  Databases  need  a  Standardized  Query   Language • It’s  complicated • “Those  who  fail  to  learn  from  history  are   doomed  to  repeat  it”
  3. 3. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Linked  Data  Benchmark  Council  (LDBC) • LDBC  is  a  non-­‐profit  organization  dedicated  to   establishing  benchmarks,  benchmark  practices   and  benchmark  results  for  graph  data   management  software. • LDBC  was  established  as  an  outcome  of  the   LDBC  EU  project  funded  by  the  European   Commission  within  the  7th  Framework   Programme (Grant  Agreement  No.  317548). http://ldbcouncil.org/
  4. 4. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com LDBC  Organization  (non-­‐profit) “sponsors” +  non-­‐profit  members  (FORTH,  STI2)  &  personal  members +  Task  Forces,  volunteers  developing  benchmarks +  TUC:  Technical  User  Community  (8  workshops,  ~40  graph  and   RDF  user  case  studies,  18  vendor  presentations)   9th  TUC   Meeting,  SAP  Headquarters  in  Walldorf Germany,  February  9-­‐10  2017 http://ldbcouncil.org/9th-­‐tuc-­‐meeting-­‐sap-­‐headquarters-­‐walldorf-­‐germany-­‐february-­‐9-­‐10-­‐2017
  5. 5. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com LDBC  Benchmarks 5 http://ldbcouncil.org/benchmarks Graphalytics Semantic  Publishing Social  Network VLDB  2016 SIGMOD  2015
  6. 6. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Graph  Query  Language  Task  Force • Study  query  languages   for  graph  data   management  systems,   specifically  systems   storing  “Property  Graph”   data • Query  language  should   cover  the  needs  of   important  use  cases:   social  network   benchmark,  interactive   and  BI  workloads • Devise  a  list  of  desired   features  and   functionalities   • Renzo  Angles,  Universidad  de  Talca • Marcelo  Arenas,  PUC  Chile  -­‐ task  force  lead • Pablo  Barceló,  Universidad  de  Chile • Peter  Boncz,  Vrije Universiteit Amsterdam • George  Fletcher,  Eindhoven  University  of   Technology • Claudio  Gutierrez,  Universidad  de  Chile • Tobias  Lindaaker,  Neo  Technology • Marcus  Paradies,  SAP • Raquel  Pau,  UPC • Arnau Prat,  UPC  /  Sparsity • Juan  Sequeda,  Capsenta • Oskar  van  Rest,  Oracle  Labs • Hannes  Voigt,  TU  Dresden • YinglongXia,  IBM
  7. 7. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Property  Graph  Data  Model id:  123 name:  Juan  Sequeda Person id:  456 name:  Marcelo  Arenas Person knows since:  2010 • Neo4j’s   Cypher • Oracle’s   PGQL • Tinkerpop’s Gremlin
  8. 8. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Graph  Query  Languages  Today
  9. 9. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Standard  Syntax  and  Semantics • Standard – Vendor  lock-­‐in – Prevents  true  performance   competition   à improvement  of  systems • Syntax – Hard  to  define  benchmarks – Hard  to  compare  Graph  DBMS – Applications  are  not  portable • Semantics – PGQL:  iso/homomorphism – Cypher:  “edge”  isomorphism – Gremlin:  homomorphism – SPARQL:  homomorphism Best  Paper  WWW2012 Angles  and  Gutierrez.   The  Expressive   Power  of  SPARQL.   ISWC2008 Best  Paper  ISWC2006.  10  Year  Award  2016  
  10. 10. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Closed  Language SQL Graph Query Language* however  …
  11. 11. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Path  Support • Conjunctive  Query  (CQ) • Regular  Path  Queries  (RPQ):  regular  expression   over  edge  labels • Conjunctive  Regular  Path  Query  (CRPQ) • Union  Conjunctive  Regular  Path  Query  (UCRPQ) • Path  Matching  supported.  What  about  further   processing  of  Paths? • Require  for  an  (a,b)*  RPQ  that  the  sum  of  some   property  on  all  b  edges  along  the  path  is  greater   some  give  threshold
  12. 12. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Desired  Query  Functionalities • Adjacency  Queries • Graph  Pattern  Matching • Navigational  Queries • Aggregate  Queries • Sub  queries
  13. 13. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Adjacency  queries • Property  access – Get  the  firstName and  lastName of  a  person   having  email  "$email” • Neighborhood  of  a  node – Get  the  firstName and  lastName of  the  friends  of   a  person  identified  by  email  "$email". • K-­‐neighborhood  of  a  node – Get  the  email,  firstName and  lastName of  friends   of  the  friends  of  a  person  having  email  "$email"   (excluding  the  start  person)  (i.e.  get  a  list  of   recommended  friends)  (directed  2-­‐neighborhood)
  14. 14. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Graph  Pattern  Matching • Join – Get  the  creationDate and  content  of  the  messages  created  by  a  person  identified  by  email   "$email1"  and  commented  by  another  person  identified  by  "$email2". • Union – Get  the  creationDate and  content  of  the  messages  either  created  or  liked  by  a  person   identified  by  email  "$email". • Intersection – Get  the  email,  firstNameand  lastName of  the  common  friends  between  two  persons   identified  by  emails  "$email1"  and  "$email2"  respectively. • Difference – Given  two  friends  identified  by  emails  "$email1"  and  "$email2"  respectively,  get  the  email,   firstName and  lastName of  the  friends  of  the  second  person  which  are  not  friends  of  the  first   person  (this  questions  is  relevant  for  friendship  recommendations). • Optional – Given  a  person  identified  by  email  "$email",  get  the  title  of  all  the  messages  created  by  such   person,  and  the  content  of  the  first  comment  replying  each  message  (if  it  exists). • Filter – Get  the  properties  of  the  people  whose  firstNameincludes  the  string  "xxx"  (it  implies  use  of   wildcards).
  15. 15. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Navigational  queries • Reachability – Is  there  a  friendship  connection  between  two  persons  identified  by  emails   "$email1"  and  "$email2"  respectively? • All  Path  Finding – Get  the  friendship  paths  between  two  persons  identified  by  emails  "$email1"   and  "$email2"  respectively. • Shortest  Path  Finding – The  shortest  friendship  path  between  two  persons  identified by  emails   "$email1"  and  "$email2"  respectively". • Regular  Path  Query – Get  the  firstName of  friends  of  the  friends  of  the  friends  of  a  person  identified   by  email  "$email". • Conjunctive  Regular  Path  Queries – Given  a  target  message  created  on  "$dateTime"  by  a  person  identified  by   email  "$email",  for  each  comment  replying  the  target  message,  get  the   comment's  content  and  the  email  of  the  comment´s  creator. • Filtered  regular  path  query – Given  a  person  identified  by  email  "$email",  get  the  title  of  all  the  messages   liked  by  such  person  between  "$dateTime1"  and  "$dateTime2”.
  16. 16. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Where  are  we  now?
  17. 17. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Decision:  Property  Graph  Data  Model In  the  following  definition,  we  assume  the  existence  of  the  following  sets: • L is  an  infinite  set  of  (node  and  edge)  labels; • P is  an infinite  set  of  property  names; • V is  an  infinite  set  of  literals  (actual  values). Moreover,  we  assume  that  SET(X)  is  the  set  of  all  finite  subsets  of  a  given   set X. Then  a  property  graph  is  a  tuple G =  (N,  E, ρ, λ, σ),  where: • nodes: N is  a  finite  set  of  nodes; • edges:   E is  a  finite  set  of  edges  such  that N and E have  no  elements  in   common; ρ : E → (N × N)  is  a  total  function; • labels:   λ :  (N  U E) →  SET(L) is  a  total  function; • properties:   σ :  (N U E) × P → V is  a  partial  function.
  18. 18. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Property  Graph  Data  Model id:  123 name:  Juan  Sequeda Person id:  456 name:  Marcelo  Arenas Person knows since:  2010 L is  an  infinite  set  of  (node  and  edge)  labels P is  an infinite  set  of  property  names V is  an  infinite  set  of  literals  (actual  values) N is  a  finite  set  of  nodes E is  a  finite  set  of  edges  such  that N and E have  no  elements  in  common   L =  {Person,  knows} P =  {id,  name,  since} V =  {“123”,  “456”,  “Juan  Sequeda”,  “Marcelo  Arenas”,  “2010”} N =  {n1,  n2} E =  {e1} ρ (edge) : E → (N × N)  is  a  total  function λ (labels):  (N  U E) →  SET(L) is  a  total  function σ (properties):  (N U E) × P → V is  a  partial  function ρ =  [ρ(e1)  =  (n1,n2) ] λ =  [λ(n1)  =  “Person”,  λ(n2)  =  “Person”,  λ(e1)  =  “knows”  ] σ =  [ σ(n1)  =  {(“id”,  “123”),  (“name”,  “Juan  Sequeda”)}, σ(n2)  =  {(“id”,  “456”),  (“name”,  “Marcelo  Arenas”)}, σ(e1)  =  {(“since”,  “2010”)} ]
  19. 19. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Unified  Data  Model  Options • Extend  Relational  model  (where  cells  can   contain  actual  values  such  as  strings,  integers,   dates,…)  with  objects  of  type  either  NODE,   EDGE,  PATH  or  GRAPH – Embedding  Graphs  in  Relational  Database – Similar  Postgres +  JSON  datatype – What  happens  if  you  join  on  a  GRAPH  column?
  20. 20. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Unified  Data  Model  Options • “Natural”  Graph  Data  model 20 1 2 3 4 5 6 id label E.id E.dest 1 purple 10 2 2 green 11 3 13 5 3 green 12 4 14 6 4 purple no  out edges 5 green 15 6 6 purple 16 4 10 11 12 13 14 15 16 Data  graph  G SELECT x, y FROM G (x:purple)-[e:*]->(y:purple) Query  1: id x y 90 1 4 91 1 6 x=1 y=4 90 x=1 y=6 91 id x y E.id E.dest 90 1 4 no  out edges 91 1 6 no  out edges New  Nodes   Because  of Relation  Projection
  21. 21. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com (1)  Paths  make  things  “weird”! SELECT x, y, CHEAPEST PATH edgeId FROM G (x:purple)-[e:*]->(y:purple) Query  2a: id x y 90 1 4 91 1 6 id x y E.id E.dest E.edgeId E.order 90 1 4 20 90 10 1 21 90 11 2 22 90 12 3 91 1 6 23 91 10 1 24 91 13 2 25 91 15 3 x=1 y=4 90 x=1 y=6 91 edgeId=10 order=1 edgeId=11 order=2 edgeId=12 order=3 edgeId=10 order=1 edgeId=13 order=2 edgeId=15 order=3 1 2 3 4 5 6 10 11 12 13 14 15 16 Data  graph  G Paths  are  Self  Edges  with  an  Order
  22. 22. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com (2)  Paths  make  things  “weird”! 22 SELECT x, y, p FROM G (x:purple)-[CHEAPEST p:+]->(y:purple) Query  2b: id x y 90 1 4 91 1 6 x=1 y=4 90 x=1 y=6 91id x y p E.id E.dest 90 1 4 [10,11,12] no  out edges 91 1 6 [10,13,15] no  out edges x=1 y=4 p=[10,11,12] 90 x=1 y=6 p=[10,13,15] 91 1 2 3 4 5 6 10 11 12 13 14 15 16 Data  graph  G Paths  are  Datatypes
  23. 23. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com (1)  Graph  Projection  (instead  of  Relation  Projection) 23 Query  3a: 1 2 3 4 5 6 10 11 12 13 14 15 16 Data  graph  G SELECT (x), (y) FROM G (x:orange)-[:+]->(y:orange) id label E.id E.dest 1 purple no  out edges 4 purple no  out edges 6 purple no  out edges 1 4 6 id x y 90 1 4 91 1 6 x=1 y=4 90 x=1 y=6 91 id x y E.id E.dest 90 1 4 no  out edges 91 1 6 no  out edges Remember:  this  is  the  Relational  Projection
  24. 24. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com (2)  Graph  Projection  (instead  of  Relation  Projection) 24 id label E.id E.dest 1 purple 21 4 22 6 4 purple 23 6 6 purple no  out edges Query  3b: 1 4 6 21 2322 SELECT (x)--(y) FROM G (x:purple)-[:+]->(y:purple) 1 2 3 4 5 6 10 11 12 13 14 15 16 Data  graph  G
  25. 25. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com (1)  Graph  Projection  with  Paths 25 1 2 3 4 5 6 10 11 12 13 14 15 16 Data  graph  G SELECT (PATH p) FROM G (x:orange)-[CHEAPEST p:*]->(y:orange) WHERE y.id = 4 1 2 3 4 6 10 11 12 16id label E.id E.dest 1 purple 10 2 2 green 11 3 3 green 12 4 4 purple no  out edges 6 purple 16 6 Query  4a:
  26. 26. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com (2)  Graph  Projection  with  Paths 26 1 2 3 4 5 6 10 11 12 13 14 15 16 Data  graph  G Query  4b: 21 23l=1 L=3 SELECT (x)-[:{l=LENGTH OF p}]-(y) FROM G (x:orange)-[CHEAPEST p:*]->(y:orange) WHERE y.id = 4 id label E.id E.dest E.l 1 purple 21 4 3 4 purple 23 6 1 6 purple no  out edges 1 4 6
  27. 27. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Wrap  up • We  want  to  hear  from  you! • 9th  LDBC  Technical  User  Community   Meeting,  SAP  Headquarters  in  Walldorf Germany,  February  9-­‐10  2017 • http://ldbcouncil.org/9th-­‐tuc-­‐meeting-­‐sap-­‐ headquarters-­‐walldorf-­‐germany-­‐february-­‐9-­‐ 10-­‐2017 27
  28. 28. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com THANK  YOU Juan  Sequeda,  Ph.D Co-­‐Founder  – Capsenta juan@capsenta.com @juansequeda 28 Sequeda  J.  Integrating  Relational  Databases  with  the  Semantic  Web.  IOS  Press.  2016 http://www.iospress.nl/book/integrating-­‐relational-­‐databases-­‐with-­‐the-­‐semantic-­‐web/

×