3 Myths about Graph Query Languages, Busted by Pixy

  • 1. 3 Myths about Graph Query Languages, Busted by Pixy. Sridhar Ramachandran, Founder, LambdaZen LLC
  • 2. Background ● Graph databases are a category of NoSQL databases that model graphs consisting of vertices and edges. ○ The property-graph model from TinkerPop is a graph database standard. ○ It offers a common abstraction for over a dozen graph databases using the Blueprints API. ● There are two querying paradigms for graph DBs, viz. graph query languages (GQLs) and graph traversal languages (GTLs). ○ GQLs are declarative and constraint-driven. ○ GTLs are imperative and step-driven.
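To make the declarative/imperative distinction on this slide concrete, here is a minimal Python sketch. It is an illustration only: the in-memory graph, the `knows` edges, and the function names are assumptions made for this example, and none of it is the Blueprints API, Gremlin, or Pixy.

```python
# Illustrative in-memory property graph; NOT the Blueprints API.
# Vertices carry properties; each edge is (source id, label, target id).
vertices = {
    1: {"name": "alice"},
    2: {"name": "bob"},
    3: {"name": "carol"},
}
edges = [(1, "knows", 2), (2, "knows", 3)]


def out(vertex_id, label):
    """Traversal step: ids reachable from vertex_id over outgoing `label` edges."""
    return [dst for src, lbl, dst in edges if src == vertex_id and lbl == label]


def friends_of_friends_traversal(start):
    """Step-driven (GTL style): spell out each hop, in order."""
    result = []
    for friend in out(start, "knows"):
        for fof in out(friend, "knows"):
            result.append(fof)
    return result


def friends_of_friends_query(start):
    """Constraint-driven (GQL style): state what must hold, not how to walk."""
    return [
        z
        for (x, l1, y) in edges
        for (y2, l2, z) in edges
        if x == start and l1 == "knows" and l2 == "knows" and y == y2
    ]


print(friends_of_friends_traversal(1))
print(friends_of_friends_query(1))
```

Both functions return the same vertices; the difference is whether the caller dictates the order of the hops or only the constraints that the result must satisfy.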
  • 3. Background ● The TinkerPop software stack includes Gremlin, a graph traversal language (GTL) that is a monadic Groovy DSL. ● All other GQLs to date are proprietary and cannot be ported across graph databases. ● Pixy is a new declarative graph query language (GQL) that works on any Blueprints-compatible graph database. ○ Project page: https://github.com/lambdazen/pixy/ ○ Available under the Apache 2.0 license
  • 4. Myth #1: GQLs and GTLs can’t mix ● Myth #1: Graph Query Languages and Graph Traversal Languages are totally different ways to look at the graph query problem. ● Common wisdom dictates that: ○ A graph “access” language must either be a GTL or a GQL. ○ The programmer must choose one paradigm or the other for a specific query.
  • 5. Pixy co-exists with Gremlin ● Pixy queries are run from Gremlin expressions using the ‘pixy’ step. ● The input to the query and the output from it can both be operated on by Gremlin. ● The programmer can use both paradigms in the same query using Pixy + Gremlin. [Diagram labels: Gremlin (GTL), Pixy (GQL), Gremlin]
  • 6. Myth #2: GQLs are slower ● Myth #2: Graph Query Languages are much slower than Graph Traversal Languages because of their declarative nature. ● Common wisdom dictates that: ○ the performance penalty is the price paid for declarative expressiveness. ○ you can’t be sure about the execution plan of a query written in a GQL, just as with SQL.
  • 7. Pixy compiles to Gremlin ● Pixy compiles PROLOG-style rules to Gremlin expressions. ● The execution plan is a Gremlin pipeline and can be tweaked by reordering the clauses. ● Performance should match that of equivalent hand-written Gremlin in most cases.
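The idea that clause order fixes the execution plan can be sketched in Python. This is a toy model under stated assumptions, not Pixy's actual compiler: the clause format `(src_var, label, dst_var)`, the step kinds `out`/`in`/`scan`, and the names `compile_body` and `run` are all hypothetical.

```python
# Hypothetical sketch of compiling a rule body into an ordered list of steps.
# This is NOT Pixy's compiler; every name and format here is assumed.
EDGES = [("A", "father", "B"), ("B", "father", "C"), ("D", "father", "B")]


def compile_body(clauses, bound):
    """Turn clauses (src_var, label, dst_var) into steps, in clause order.

    `bound` is the set of variables that already have values when the
    pipeline starts. Each step is (kind, src_var, label, dst_var).
    """
    bound = set(bound)
    plan = []
    for src, label, dst in clauses:
        if src in bound:
            kind = "out"   # follow outgoing edges from a known vertex
        elif dst in bound:
            kind = "in"    # follow incoming edges into a known vertex
        else:
            kind = "scan"  # nothing known yet: scan every edge with this label
        plan.append((kind, src, label, dst))
        bound.update((src, dst))
    return plan


def run(plan, start):
    """Execute the plan over variable bindings, one step at a time.

    This toy executor filters the full edge list at every step; `kind` only
    records which access path a real engine could use for that step.
    """
    rows = [dict(start)]
    for _kind, src, label, dst in plan:
        next_rows = []
        for row in rows:
            for s, l, d in EDGES:
                if l != label:
                    continue
                # Keep the edge only if it agrees with values bound so far.
                if row.get(src, s) == s and row.get(dst, d) == d:
                    next_rows.append({**row, src: s, dst: d})
        rows = next_rows
    return rows


body = [("X", "father", "Y"), ("Y", "father", "Z")]
plan_a = compile_body(body, {"X"})        # two "out" steps
plan_b = compile_body(body[::-1], {"X"})  # a "scan" step, then an "out" step
print(plan_a)
print(plan_b)
print(run(plan_a, {"X": "A"}) == run(plan_b, {"X": "A"}))
```

In this model, reordering the clauses changes the plan (the second ordering starts with a full scan) but not the answer, which is the sense in which a plan can be tuned by reordering.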
  • 8. Myth #3: GQLs can’t be relational ● Myth #3: A graph query language cannot be based on N-ary predicate calculus or relational algebra, since graphs can only express binary relations/predicates. ● Common wisdom dictates that: ○ Graph-based models, unlike relational models, can only capture binary relationships in edges. ○ Therefore, GQLs can only operate on vertices and edges, not N-ary relations. ○ HyperGraphDB is designed to support “hyper” edges across N vertices to address this perceived weakness of graph-based associative models.
  • 9. Pixy derives N-ary relations ● The property graph model can only capture binary relations between vertices in a graph, aka edges. ● But Pixy can derive N-ary relations across vertices, edges and properties. ● These relations can be used to derive other N-ary relations. ○ These relations form what is called the “domain model” for the graph. ○ When any relation is queried, Pixy compiles the query into a sequence of Gremlin steps.
  • 10. An example
    gremlin> pt = new PixyTheory( '''father(Child, Father) :- out(Child, 'father', Father).''')
    Given a graph whose only 'father' edges are A → B, B → C and D → B (as the statements below imply), the above rule means that:
    - father(A, B), father(B, C), father(D, B) are all true.
    - father(A, C), father(D, E), etc. are false.
    gremlin> pt = pt.extend( '''grandfather(X, Y, Z) :- father(X, Y), father(Y, Z).''')
    The above rule means that:
    - grandfather(A, B, C) and grandfather(D, B, C) are true.
    - All other combinations are false.
    Sample query from the Pixy Tutorial
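The truth claims on this slide can be checked with a small Python model of the two rules. It assumes the graph contains exactly the three 'father' edges that those claims imply (A → B, B → C, D → B). It models the rules as set comprehensions and does not call Pixy or Gremlin.

```python
# Graph implied by the slide's truth claims (an assumption of this sketch):
# 'father' edges A -> B, B -> C, D -> B.
edges = {("A", "father", "B"), ("B", "father", "C"), ("D", "father", "B")}

# father(Child, Father) :- out(Child, 'father', Father).
father = {(child, dad) for child, label, dad in edges if label == "father"}

# grandfather(X, Y, Z) :- father(X, Y), father(Y, Z).
# The shared variable Y becomes an equality (join) condition.
grandfather = {
    (x, y, z)
    for x, y in father
    for y2, z in father
    if y == y2
}

print(sorted(father))
print(sorted(grandfather))
```

Note that `grandfather` is a 3-ary relation built entirely from binary edges: the join on the shared variable is what produces the extra column.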
  • 11. Wrap-up ● Pixy is a declarative graph query language that dispels 3 myths about GQLs. ● Myth #1: GQLs and GTLs can’t mix. ○ Pixy’s querying capability is integrated into Gremlin, bringing the capabilities of both querying paradigms into one combined language. ● Myth #2: GQLs are slower than GTLs. ○ Pixy compiles PROLOG-based queries and rules to Gremlin expressions. ● Myth #3: GQLs can’t be relational. ○ Pixy can derive N-ary relations from graphs. ○ New relations can be derived from existing ones.