GRADES-NDA 2018 - Gremlinator demo talk - Harsh Thakkar

Two for One -- Querying Property Graphs using SPARQL
via GREMLINATOR
Harsh Thakkar, Dharmen Punjani, Jens Lehmann, Sören Auer

GRADES-NDA’18 ⦿ Houston, TX, USA ⦿ June 10, 2018 ⦿ Two for One - SPARQL Querying of PGs via Gremlinator ⦿ Harsh Thakkar ⦿ University of Bonn

● Graph formalism → intuitive way of modeling complex, highly connected data
● The Resource Description Framework (RDF, W3C standard, 2004) data model and the Property Graph (PG) data model are the most popular graph data models.
● Various Graph Query Languages (GQLs) have been proposed to address:
○ Declarative style - Pattern Matching
○ Imperative style - Traversing
● SPARQL (W3C standard, 2008) for querying RDF databases; no standard language (??) for PG databases
● Lack of standardization → Vendor lock-in → Interoperability gap → many issues!

● Gremlinator advantages:
○ Avoid a steep learning curve for users well versed in SPARQL for querying graph databases
○ Perform both OLTP and OLAP querying using SPARQL
○ Bridge the gap between the two Graph data models (RDF & PG), and
■ between the Semantic Web and Graph database communities
■ Best of both worlds ⇒ get “Two for One”

RDF Data Model
● RDF is a triple-based graph model (W3C, 2004), where:
○ Subject: URI or blank node
○ Predicate: URI (a property)
○ Object: URI, literal, or blank node
URI* = Uniform Resource Identifier, analogous to an ISBN for books
Literals = data values
Blank nodes = descriptions of entities that don’t need to be named
* IRIs, more precisely

@prefix ex: <http://example.org>
ex:Person ex:speaker ex:Event
ex:Person ex:name “Harsh”
ex:Person ex:place ex:Bonn
ex:Person ex:age “28”
ex:Event ex:name “GRADES-NDA’18”
ex:Event ex:year “2018”

[Figure: representation and interpretation of these triples as a graph. Nodes: ex:Person, ex:Event, ex:Bonn, ex:Houston and the literals “Harsh”, “28”, “GRADES-NDA’18”, “2018”, “20”. Edge labels: ex:speaker, ex:name, ex:place, ex:age, ex:year, ex:time.]
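The triple model above can be sketched in a few lines of Python. This is only an illustration, not part of Gremlinator: the names `TRIPLES` and `find` are hypothetical, literals are kept as quoted strings to tell them apart from URIs, and `ex:year` is written in lower case as in the figure label.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# The six triples from the slide. Literals keep their quotes so they can be
# told apart from URIs such as ex:Bonn.
TRIPLES: List[Triple] = [
    ("ex:Person", "ex:speaker", "ex:Event"),
    ("ex:Person", "ex:name", '"Harsh"'),
    ("ex:Person", "ex:place", "ex:Bonn"),
    ("ex:Person", "ex:age", '"28"'),
    ("ex:Event", "ex:name", '"GRADES-NDA\'18"'),
    ("ex:Event", "ex:year", '"2018"'),
]


def find(s: Optional[str] = None,
         p: Optional[str] = None,
         o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for triple in TRIPLES:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


# Everything stated about ex:Person, i.e. the pattern (ex:Person, ?p, ?o).
person_facts = list(find(s="ex:Person"))
for fact in person_facts:
    print(fact)
```

Every statement, including plain attributes such as the name, is one node-edge-node entry here, which is the point the next slide makes about RDF graphs being bulky.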

RDF Graphs (RDFGs)
● Edge-labelled, directed multi-graphs (with entity URIs, blank nodes, and literals)
● Going from information to knowledge using OWL (DLs) and ontologies (RDFS, RDFa, etc.)
● Bulky
○ Everything is a node-edge-node (edges do not have properties)
○ More relationships per node → More total number of triples!
■ Triple/dataset explosion

Property Graph Data Model
● Edge-labelled, directed, attributed, multi-graph
● Vertices and edges both have properties
● Main components:
○ Vertices, edges (source, destination), properties (key-value pairs), labels (strings)
● Super neat (compact), super cute
● Easier to add weighted, reified edges
● Query languages - Cypher, Gremlin, PGQL, etc.
[Figure: a Person vertex (Name: Harsh, Age: 28, Place: Bonn) connected by a speaker edge (Time: 20) to an Event vertex (Name: GRADES-NDA’18, Year: 2018, Place: Houston).]
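A minimal Python sketch of the property graph in the figure, and of why the same data takes more statements as triples. The ids `v1`, `v2`, `e1`, the class names, and the flattening scheme in `to_triples` are assumptions made for this illustration (one of several possible reification-style encodings), and `Time: 20` is read as a property of the speaker edge.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Vertex:
    id: str
    label: str
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class Edge:
    id: str
    label: str
    src: str  # id of the source vertex
    dst: str  # id of the destination vertex
    properties: Dict[str, object] = field(default_factory=dict)


person = Vertex("v1", "Person", {"Name": "Harsh", "Age": 28, "Place": "Bonn"})
event = Vertex("v2", "Event",
               {"Name": "GRADES-NDA'18", "Year": 2018, "Place": "Houston"})
speaker = Edge("e1", "speaker", "v1", "v2", {"Time": 20})


def to_triples(vertices: List[Vertex],
               edges: List[Edge]) -> List[Tuple[str, str, object]]:
    """Flatten a property graph into node-edge-node statements.

    Assumed encoding: each edge becomes a node of its own so that its
    properties have something to attach to.
    """
    triples: List[Tuple[str, str, object]] = []
    for v in vertices:
        triples.append((v.id, "label", v.label))
        triples.extend((v.id, key, value) for key, value in v.properties.items())
    for e in edges:
        triples.append((e.id, "label", e.label))
        triples.append((e.id, "source", e.src))
        triples.append((e.id, "target", e.dst))
        triples.extend((e.id, key, value) for key, value in e.properties.items())
    return triples


triples = to_triples([person, event], [speaker])
print(f"3 graph elements -> {len(triples)} triples")
```

Under this encoding the two vertices and one edge expand to 12 statements, which gives a feel for the compactness claim on the slide.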

Gremlin’s Multi-Graph Query Language (GQL) support
[Figures: http://www.datastax.com/wp-content/uploads/2015/09/many-to-many-mapping.png, http://www.datastax.com/wp-content/uploads/2015/09/gtm-dataflow.png]

…
Multi-DMS & platform support
[Figure: https://tinkerpop.apache.org/images/oltp-and-olap.png]
And thus… ➤

Gremlinator

● Gremlinator is a novel translation approach that maps SPARQL queries to Gremlin pattern matching traversals [1, 2]
Talk @ Graph Day 2017
[1] Thakkar, Harsh, Dharmen Punjani, et al. "Towards an Integrated Graph Algebra for Graph Pattern Matching with Gremlin." In proceedings of DEXA 2017, pp. 81-91. Springer, Cham, (2017).
[2] Thakkar, Harsh, Dharmen Punjani, et al. "A Stitch in Time Saves Nine--SPARQL querying of Property Graphs using Gremlin Traversals." under review at the Semantic Web Journal (submitted Feb, 2018).

[Figure: a Person vertex (Name: Harsh, Age: 28, From: Bonn, DE) connected by a speaker edge (Time: 20) to an Event vertex (Name: GRADES-NDA’18, Year: 2018, Place: Houston).]
* Rodriguez, Marko A., and Peter Neubauer. "The graph traversal pattern." arXiv preprint arXiv:1004.1001 (2010).
Mapping corresponding Gremlin operators

Ex: Tinker Modern-Crew Graph
“Select the persons aged 30 or younger who, together with someone they know, created a software project.”
SELECT ?a ?b ?c WHERE {
?a v:label "person" .
?a e:knows ?b .
?a e:created ?c .
?b e:created ?c .
?a v:age ?d .
FILTER (?d <= 30)
}

Contd…
BGPs ➞ SSTs*: each pattern of the query maps to one traversal fragment
?a v:label "person" . ➞ [MatchStartStep@[a], HasStep([~label.eq(person)]), MatchEndStep]
?a e:knows ?b . ➞ [MatchStartStep@[a], VertexStep(OUT,[knows],vertex)@[b], MatchEndStep]
?a e:created ?c . ➞ [MatchStartStep@[a], VertexStep(OUT,[created],vertex)@[c], MatchEndStep]
?b e:created ?c . ➞ [MatchStartStep@[b], VertexStep(OUT,[created],vertex)@[c], MatchEndStep]
?a v:age ?d . ➞ [MatchStartStep@[a], PropertiesStep([age],value)@[d], MatchEndStep]
FILTER (?d <= 30) ➞ [WhereTraversalStep([WhereStartStep(d), IsStep(leq(30))]), MatchEndStep]
BGP (q) ➞ SST* ( )

Contd…
SPARQL query ➞ Gremlin traversal: the fragments above are combined in a single MatchStep and projected with SelectStep
CGP (Q) ➞ Traversal ( )
[GraphStep(vertex,[]), MatchStep(AND,[
[MatchStartStep(a), HasStep([~label.eq(person)]), MatchEndStep],
[MatchStartStep(a), VertexStep(OUT,[knows],vertex), MatchEndStep(b)],
[MatchStartStep(a), VertexStep(OUT,[created],vertex), MatchEndStep(c)],
[MatchStartStep(b), VertexStep(OUT,[created],vertex), MatchEndStep(c)],
[MatchStartStep(a), PropertiesStep([age],value), MatchEndStep(d)],
[MatchStartStep(d), WhereTraversalStep([WhereStartStep, IsStep(leq(30))]), MatchEndStep] ] ),
SelectStep([a, b, c])]
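The two mapping steps above (one fragment per pattern, then one match() traversal for the whole query) can be sketched as a toy translator. This is not Gremlinator's implementation: `pattern_to_gremlin` and `sparql_to_gremlin` are hypothetical helpers, they cover only the pattern shapes used in the example query, and the Gremlin text they emit is illustrative and has not been checked against the tool's actual output.

```python
import re
from typing import List

# Comparison operators of the toy FILTER subset and their Gremlin predicates.
PREDICATES = {"<=": "lte", ">=": "gte", "<": "lt", ">": "gt", "=": "eq"}

FILTER_RE = re.compile(r"FILTER\s*\(\s*\?(\w+)\s*(<=|>=|<|>|=)\s*(\d+)\s*\)")
TRIPLE_RE = re.compile(r'\?(\w+)\s+(v|e):(\w+)\s+(\?\w+|"[^"]*")')


def pattern_to_gremlin(pattern: str) -> str:
    """Translate one triple pattern or FILTER into a match() fragment."""
    text = pattern.strip().rstrip(".").strip()
    m = FILTER_RE.fullmatch(text)
    if m:
        var, op, number = m.groups()
        return f"__.where(__.as('{var}').is({PREDICATES[op]}({number})))"
    m = TRIPLE_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    subj, kind, name, obj = m.groups()
    is_var = obj.startswith("?")
    value = obj[1:] if is_var else obj.strip('"')
    if kind == "e" and is_var:
        # e:<label> -> follow an outgoing edge with that label
        return f"__.as('{subj}').out('{name}').as('{value}')"
    if kind == "v" and name == "label" and not is_var:
        # v:label "x" -> restrict the vertex label
        return f"__.as('{subj}').hasLabel('{value}')"
    if kind == "v" and name != "label" and is_var:
        # v:<key> ?x -> bind the property value to a variable
        return f"__.as('{subj}').values('{name}').as('{value}')"
    raise ValueError(f"unsupported pattern: {pattern!r}")


def sparql_to_gremlin(select_vars: List[str], patterns: List[str]) -> str:
    """Join the fragments in one match() step and project the SELECT variables."""
    fragments = ",\n  ".join(pattern_to_gremlin(p) for p in patterns)
    projection = ", ".join(f"'{v}'" for v in select_vars)
    return f"g.V().match(\n  {fragments}\n).select({projection})"


traversal = sparql_to_gremlin(
    ["a", "b", "c"],
    ['?a v:label "person" .', "?a e:knows ?b .", "?a e:created ?c .",
     "?b e:created ?c .", "?a v:age ?d .", "FILTER (?d <= 30)"],
)
print(traversal)
```

The printed text follows the shape of the step listing on the slide: a graph step, one match() with six fragments, and a final select of a, b and c.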

Contd…
“Select the persons aged 30 or younger who, together with someone they know, created a software project.”
Result of the SPARQL query above:
{a=v[2], b=v[4], c=v[3]}
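To see what the query asks for, it can be evaluated by hand as nested joins over a small in-memory copy of the TinkerPop "Modern" toy graph. This is only a sketch: the dictionaries and the `answer` function are hypothetical, it does not reflect how Gremlin's match() step plans its execution, and vertices are identified by name rather than id, so the bindings are not directly comparable with the v[…] ids on the slide.

```python
from typing import Dict, List

# The TinkerPop "Modern" toy graph, keyed by vertex name instead of id.
LABELS: Dict[str, str] = {
    "marko": "person", "vadas": "person", "josh": "person", "peter": "person",
    "lop": "software", "ripple": "software",
}
AGES: Dict[str, int] = {"marko": 29, "vadas": 27, "josh": 32, "peter": 35}
EDGES = [
    ("marko", "knows", "vadas"), ("marko", "knows", "josh"),
    ("marko", "created", "lop"), ("josh", "created", "lop"),
    ("josh", "created", "ripple"), ("peter", "created", "lop"),
]


def out(vertex: str, label: str) -> List[str]:
    """Vertices reachable from `vertex` over outgoing edges with `label`."""
    return [dst for src, lbl, dst in EDGES if src == vertex and lbl == label]


def answer(max_age: int = 30) -> List[Dict[str, str]]:
    """Evaluate the example query; each comment names the pattern applied."""
    results = []
    for a, label in LABELS.items():
        if label != "person":                    # ?a v:label "person"
            continue
        if a not in AGES or AGES[a] > max_age:   # ?a v:age ?d . FILTER (?d <= 30)
            continue
        for b in out(a, "knows"):                # ?a e:knows ?b
            for c in out(a, "created"):          # ?a e:created ?c
                if c in out(b, "created"):       # ?b e:created ?c
                    results.append({"a": a, "b": b, "c": c})
    return results


print(answer())
```

On this data the only solution binds a to marko, b to josh and c to lop: marko is 29, knows josh, and both created lop.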

Demo time: http://gremlinator.iai.uni-bonn.de:8080/Demo/

Team
● Dharmen Punjani
● Harsh Thakkar
● Prof. Dr. Sören Auer
● Prof. Dr. Jens Lehmann

Other Resources

Demo: http://195.201.31.31:8080/Demo/ (OR) http://gremlinator.iai.uni-bonn.de:8080/Demo/
Harsh Thakkar
University of Bonn
Twitter: @harsh9t
LinkedIn: thakkarharsh
E-mail: harsh9t@gmail.com
Questions? Comments?
Insults? Injuries?

● A first of its kind, SPARQL-to-Gremlin traversal compiler [1,2] based on the Apache
TinkerPop framework.
[Figure: SPARQL language features (FILTER, GROUP BY, LIMIT + OFFSET, UNION, OPTIONAL, COUNT) ➞ GREMLINATOR]
● It allows querying Property Graphs via SPARQL
● Can query a wide variety of Graph DBs using SPARQL
[1] Thakkar, Harsh, Dharmen Punjani, et al. "Towards an Integrated Graph Algebra for Graph Pattern Matching with Gremlin." In proceedings of DEXA 2017, pp. 81-91. Springer, Cham, (2017).
[2] Thakkar, Harsh, Dharmen Punjani, et al. "A Stitch in Time Saves Nine--SPARQL querying of Property Graphs using Gremlin Traversals." under review at the Semantic Web Journal (submitted Feb, 2018).


GPM using Gremlin*
1. g.V().match(
__.as('x').out('Created').as('y')).
select('x').dedup()
Output
==>x:v[4]
==>x:v[2]
==>x:v[5]
2. g.V(2).match(__.as('x').out('Created').
as('y')).dedup()
Output
==>x:v[3]
*In Gremlin, graph pattern matching (GPM) is executed by the match() step
[Figure: example graph with the vertices bound to x and y marked]
