
Cypher.PL: Executable Specification of Cypher written in Prolog


Presented at the Fifth openCypher Implementers Group Meeting in September 2017 @ http://www.opencypher.org/ocig/2017/09/28/ocig5/

Published in: Technology

  1. Cypher.PL: Executable Specification of Cypher written in Prolog
     {jan.posiadala,pawel.susicki}@gmail.com
  2. Cypher.PL
     ● an executable specification
     ● of a declarative query language (Cypher)
     ● in a formal declarative language of logic (Prolog)
     ● as close to the semantics as possible
     ● as far from implementation issues as possible
     ● a tool for collective design, verification, and validation
  3. Why Prolog
     Prolog's enticements:
     ● declarative language
     ● with built-in unification...
     ● ...which is more general than pattern matching
     ● super-native data (structure) representation
     ● multiple solutions / evident ambiguity
     ● easy constraint verification
     ● DCG: a notation for grammars
     ● meta-programming
  4. Nothing New Under The Sun
     Series of symposiums: Programming Language Implementation and Logic Programming
     Many papers on the topic, but the most defining one is:
     Specifications Are (Preferably) Executable, by Norbert E. Fuchs, Software Engineering Journal, September 1992
     and many, many others
  5. Cypher.PL in openCypher ecosystem
     [Diagram. Component labels: Cypher.g4; Antlr4ToDCG; TCK features; Cypher Queries; Cypher Grammar in DCG; Cypher Query Parse Tree (AST); Cypher Query Term Representation; AST to IR (DSL) compiler; DCG Grammar of Antlr4; Execution in Specification; TCK features Scenarios; Results; IR; Specification]
  6. Syntactic (and semantic) exercise
     A strict and evident syntactic and semantic ambiguity in the query:

         with 1 as x return [x in [1,2]] as result

     Result 1 (the neo4j one):

         +--------+
         | result |
         +--------+
         | [1,2]  |
         +--------+

     Result 2:

         +--------------+
         | result       |
         +--------------+
         | [true]       |
         +--------------+
  7. oCIG 67th of SEPTEMBER 2017
     [Parse tree of the query under the first reading: singleQuery with a with clause (1 as X) and a return clause whose returnItem expression is a listComprehension. Its filterExpression is idInColl(variable X, cypher_list [1, 2]) with no_where, and its projection is identity; the result variable is _value. The full nested term appears on slide 24.]
  8. oCIG 67th of SEPTEMBER 2017
     [Parse tree of the same query under the second reading: the with clause is unchanged, but the returnItem expression is a cypher_list with a single element, the expression in(variable X, cypher_list [1, 2]); the result variable is _value.]
  9. #206: On the interpretation of range comparison expressions
     Expression: 1<2=3>4

     Three trees are shown; the slide labels them Thobe, Mats, and Intermediate Representation.

     Flat list:

         comparisonExpression([cypher_integer(1), lt, cypher_integer(2), eq, cypher_integer(3), gt, cypher_integer(4)])

     Conjunction of binary comparisons:

         and(binaryTrenaryComparisionExpression([cypher_integer(1), lt, cypher_integer(2)]),
             and(binaryTrenaryComparisionExpression([cypher_integer(2), eq, cypher_integer(3)]),
                 binaryTrenaryComparisionExpression([cypher_integer(3), gt, cypher_integer(4)])))

     Nested comparison:

         binaryTrenaryComparisionExpression([
             binaryTrenaryComparisionExpression([cypher_integer(1), lt, cypher_integer(2), eq, cypher_integer(3)]),
             gt, cypher_integer(4)])
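The conjunctive reading of a comparison chain can be sketched in a few lines of Prolog. This is only an illustration of that one reading, assuming SWI-Prolog; the predicate names chain_pairs/2, holds/1, and chain_holds/1 are hypothetical and not taken from Cypher.PL.

```prolog
% Sketch: desugar a flat comparison chain into binary comparisons
% and require all of them to hold. Names are hypothetical.

% chain_pairs(+Chain, -Pairs):
% [A,Op1,B,Op2,C] becomes [cmp(Op1,A,B), cmp(Op2,B,C)].
chain_pairs([_], []).
chain_pairs([A, Op, B | Rest], [cmp(Op, A, B) | Pairs]) :-
    chain_pairs([B | Rest], Pairs).

% holds(+Cmp): true if a single binary comparison on integers holds.
holds(cmp(lt, cypher_integer(A), cypher_integer(B))) :- A < B.
holds(cmp(eq, cypher_integer(A), cypher_integer(B))) :- A =:= B.
holds(cmp(gt, cypher_integer(A), cypher_integer(B))) :- A > B.

% chain_holds(+Chain): true if every adjacent comparison holds.
chain_holds(Chain) :-
    chain_pairs(Chain, Pairs),
    forall(member(P, Pairs), holds(P)).
```

Under this reading, the chain for 1<2=3>4 fails because 2=3 does not hold. The nested reading would instead have to compare the result of an inner comparison with an integer, which this sketch does not model.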
  10. Cypher.PL use case: implied group by
      Implied group by is neat, but the following two queries give two different results in neo4j:

          unwind [{a:1,b:2,c:3},{a:2,b:3,c:1},{a:3,b:1,c:2}] as x
          return x.a + count(*) + x.b + count(*) + x.c;

          +----------------------------------------+
          | x.a + count(*) + x.b + count(*) + x.c  |
          +----------------------------------------+
          | 8                                      |
          | 8                                      |
          | 8                                      |
          +----------------------------------------+
          3 rows
          96 ms

          unwind [{a:1,b:2,c:3},{a:2,b:3,c:1},{a:3,b:1,c:2}] as x
          return x.a + x.b + x.c + count(*) + count(*);

          +----------------------------------------+
          | x.a + x.b + x.c + count(*) + count(*)  |
          +----------------------------------------+
          | 12                                     |
          +----------------------------------------+
          1 row
          77 ms

      No adjustment in oC TCK
  11. Basic Concepts
      Basic Cypher.PL concepts:
      I. Expression (intermediate representation)
      II. Environment
      III. Evaluation (partial) of expression in environment
      and additionally:
      IV. Equivalences on forms of expression
  12. I. Expression
      Expression:

          %definition of expression
          %literal value from domain
          expression(Value) :- value(Value).
          %identifiers
          expression(var(Id)) :- string(Id).
          %grouping function call: sum, max, count... etc
          expression(gc).
          %plus operator
          expression(plus(X,Y)) :- expression(X),expression(Y).
          %times operator
          expression(times(X,Y)) :- expression(X),expression(Y).

      Domain of values is limited to integers:

          value(Value) :- integer(Value).

      Example expression (drawn as a tree on the slide):

          %expression
          plus(1,times(var("b"),gc))
  13. II. Environment
      A mapping between identifiers and values, expressed as a list of pairs of string identifier name and value.

          %environment as list of pairs
          %of string identifier name and value
          environment(env(Environment)) :-
              maplist(environment_entry,Environment).
          environment_entry(EE) :-
              EE = (Id,Value),
              string(Id),value(Value).

      Example environment (drawn as a tree on the slide):

          %environment
          env([("a",2),("b",3),("c",1)])
  14. III. Expression Evaluation in Environment (full)

          %evaluation of variable
          %eval_expression(++Environment,++Expression,-EvaluatedExpression)
          eval_expression(env(Environment),var(Id),Value) :-
              member((Id,Value),Environment),!.
          %out of environment exception
          eval_expression(_,var(Id),_) :-
              string(Id),throw(out_of_environment).
          %identity evaluation of literal value
          eval_expression(_,X,X) :- value(X).
          %full recursive evaluation of plus operator
          eval_expression(Environment,plus(X,Y),E) :-
              eval_expression(Environment,X,EX),value(EX),
              eval_expression(Environment,Y,EY),value(EY),
              E is EX + EY,!.
          %for multiplication analogously
  15. Expression Evaluation in Environment (partial)

          %identity evaluation of grouping function call
          eval_expression(_,gc,gc).
          %partial evaluation of operator (plus, times)
          eval_expression(Environment,T,ET) :-
              %operator term decomposition
              T =.. [TN|Args],    % plus(times(3,2),gc) =.. [plus | [times(3,2), gc] ]
              %evaluation on operator arguments
              maplist(eval_expression(Environment),Args,EArgs),
              %operator term recomposition
              ET =.. [TN|EArgs],  % plus(6,gc) =.. [plus | [6, gc] ]
              !.

      Example (drawn as two trees on the slide): plus(times(3,2),gc) is partially evaluated to plus(6,gc).
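A self-contained variant of partial evaluation may make the idea easier to try out. The sketch below assumes SWI-Prolog, identifiers wrapped as var(Id), and environments of the form env(ListOfPairs); peval/3 and combine/4 are hypothetical names chosen so as not to clash with eval_expression/3.

```prolog
% Sketch of partial evaluation over plus/times expressions.
% peval/3 and combine/4 are hypothetical names, not Cypher.PL code.

value(V) :- integer(V).

% peval(+Env, +Expression, -Result)
peval(_, V, V) :- value(V), !.          % literal values evaluate to themselves
peval(_, gc, gc) :- !.                  % grouping calls stay symbolic
peval(env(Pairs), var(Id), V) :-        % variables are looked up
    !,
    (   memberchk((Id, V0), Pairs)
    ->  V = V0
    ;   throw(out_of_environment)
    ).
peval(Env, T, R) :-                     % binary operators: evaluate both sides
    T =.. [Op, X, Y],
    memberchk(Op, [plus, times]),
    peval(Env, X, EX),
    peval(Env, Y, EY),
    combine(Op, EX, EY, R).

% combine(+Op, +EX, +EY, -R): fold when both sides are values,
% otherwise rebuild the operator term around the partial results.
combine(plus, X, Y, R) :- value(X), value(Y), !, R is X + Y.
combine(times, X, Y, R) :- value(X), value(Y), !, R is X * Y.
combine(Op, X, Y, R) :- R =.. [Op, X, Y].
```

For example, the goal peval(env([]), plus(times(3,2),gc), R) binds R to plus(6,gc), matching the trees on the slide.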
  16. IV. Equivalences on forms of expression

          %Expression equivalence is reflexive, symmetric, transitive
          ex_equivalence(A,A).
          ex_equivalence(A,B) :- ex_equivalence(B,A).
          ex_equivalence(A,C) :- ex_equivalence(A,B),ex_equivalence(B,C).
          %Operator properties we want to preserve
          %plus commutativity
          ex_equivalence(plus(X,Y),plus(Y,X)).
          %plus associativity
          ex_equivalence(plus(plus(X,Y),Z),plus(X,plus(Y,Z))).
          %times commutativity
          ex_equivalence(times(X,Y),times(Y,X)).
          %times associativity
          ex_equivalence(times(times(X,Y),Z),times(X,times(Y,Z))).
          %plus/times distributivity
          ex_equivalence(times(plus(X,Y),Z),plus(times(X,Z),times(Y,Z))).

      Operator properties to be preserved (drawn as pairs of trees on the slide):
      ● plus(X,Y) and plus(Y,X)
      ● plus(plus(X,Y),Z) and plus(X,plus(Y,Z))
      ● times(plus(X,Y),Z) and plus(times(X,Z),times(Y,Z))
  17. Grouping Definition

          %grouping Environments list with respect
          %to value of Expression evaluated in environment
          cypher_gr_by(Expression,Environments,EnvironmentsGroups) :-
              gr_by(env_equality(Expression),Environments,EnvironmentsGroups).
          %test equality of pair of environments
          %by comparing partial evaluation of equivalent form of expression
          env_equality(Expression,Environment1,Environment2) :-
              %generates all possible equivalent forms of expression
              mapterm(ex_equivalence,Expression,EquivalentExpression),
              %test equality of (partial) evaluation in both environments
              eval_expression(Environment1,EquivalentExpression,Value),
              eval_expression(Environment2,EquivalentExpression,Value).
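The grouping definition above searches over equivalent forms of the expression. As a runnable approximation, the sketch below swaps that search for a canonical form: it handles plus-only expressions, sums the integer summands, and counts the grouping calls. This is a different technique from the one on the slide, assumes SWI-Prolog, and all predicate names (summands/2, leaf_value/3, group_key/3, group_envs/3) are hypothetical.

```prolog
% Sketch: group environments by a canonical partial value of a
% plus-only expression. Not the Cypher.PL implementation.
:- use_module(library(lists)).
:- use_module(library(apply)).
:- use_module(library(pairs)).

% summands(+Expr, -Leaves): flatten nested plus/2 into a list of leaves.
summands(plus(X, Y), Leaves) :-
    !,
    summands(X, LX),
    summands(Y, LY),
    append(LX, LY, Leaves).
summands(Leaf, [Leaf]).

% leaf_value(+Pairs, +Leaf, -Value): integers and bound variables
% become integers, gc stays symbolic.
leaf_value(_, N, N) :- integer(N), !.
leaf_value(_, gc, gc) :- !.
leaf_value(Pairs, var(Id), V) :- memberchk((Id, V), Pairs).

% group_key(+Expr, +Env, -Key): Key is key(Sum, NumberOfGroupingCalls).
group_key(Expr, env(Pairs), key(Sum, GcCount)) :-
    summands(Expr, Leaves),
    maplist(leaf_value(Pairs), Leaves, Values),
    partition(integer, Values, Ints, Gcs),
    sum_list(Ints, Sum),
    length(Gcs, GcCount).

% group_envs(+Expr, +Envs, -Groups): environments with equal keys
% end up in the same group.
group_envs(Expr, Envs, Groups) :-
    maplist(group_key(Expr), Envs, Keys),
    pairs_keys_values(KeyedEnvs, Keys, Envs),
    keysort(KeyedEnvs, Sorted),
    group_pairs_by_key(Sorted, Grouped),
    pairs_values(Grouped, Groups).
```

With the three environments used in the examples that follow, both a + count(*) + b + count(*) + c and a + b + c + count(*) + count(*) produce the key key(6,2) for every environment, so both yield a single group.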
  18. Example: Query 1

          unwind [{a:1,b:2,c:3},{a:3,b:1,c:2},{a:2,b:3,c:1}] as x
          with x.a as a, x.b as b, x.c as c
          return a + count(*) + b + count(*) + c

      The slide draws the following as trees:

          %environment
          [env([("a",1),("b",2),("c",3)]),
           env([("a",3),("b",1),("c",2)]),
           env([("a",2),("b",3),("c",1)])]

          %expression
          plus(var("a"),plus(gc,plus(var("b"),plus(gc,var("c")))))
  19. Example: Query 1

          %equivalent expression form
          plus(gc,plus(gc,plus(var("b"),plus(var("c"),var("a")))))

          %value of equivalent expression form
          plus(gc,plus(gc,6))
  20. Example: Query 1

          %therefore the result is one group
          [[env([("a",2),("b",3),("c",1)]),
            env([("a",3),("b",1),("c",2)]),
            env([("a",1),("b",2),("c",3)])]]
  21. Example: Query 2

          unwind [{a:1,b:2,c:3},{a:3,b:1,c:2},{a:2,b:3,c:1}] as x
          with x.a as a, x.b as b, x.c as c
          return a + b + c + count(*) + count(*)

      The slide draws the following as trees:

          %expression
          plus(var("a"),plus(var("b"),plus(var("c"),plus(gc,gc))))

          %equivalent expression form
          plus(gc,plus(gc,plus(var("a"),plus(var("b"),var("c")))))

          %value of equivalent expression form
          plus(gc,plus(gc,6))
  22. Cypher.PL: Database

          %Property graph model facts
          node(NodeId)                                          %V
          relationship(RelationshipId,NodeStartId,NodeEndId)    %E,st
          label(NodeId,LabelName)                               %L,l
          type(RelationshipId,RelationshipType)                 %T,t
          nodeProperty(Id,Key,Value)                            %Pv,D
          relationshipProperty(Id,Key,Value)                    %Pe,D

          %Asserting/Retracting PGM facts
          %create_node(-NodeId:integer)
          create_node(NodeId)
          %set_label(++NodeId:integer, --LabelName)
          set_label(NodeId,LabelName)
          %remove_label(++NodeId:integer, ++LabelName)
          remove_label(NodeId,LabelName)
          %create_relationship(++NodeStartId:integer,++NodeEndId:integer,RelationshipType,--RelationshipId:integer)
          create_relationship(NodeStartId,NodeEndId,RelationshipType,RelationshipId)
          %delete_relationship(--RelationshipId:integer)
          delete_relationship(RelationshipId)
          %set_property(++Id:integer,++Key,++Value)
          set_node_property(Id,Key,Value)
          set_relationship_property(Id,Key,Value)
          %remove_property(++Id:integer,++Key)
          remove_node_property(Id,Key)
          remove_relationship_property(Id,Key)
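One plausible way to back these facts is with dynamic predicates that are asserted and retracted. The sketch below assumes SWI-Prolog and covers only a subset of the predicates listed above; the bodies, the id counter next_id/1, and the helper fresh_id/1 are assumptions, since the slide gives only the signatures.

```prolog
% Sketch: property graph facts as dynamic predicates.
% Predicate names follow the slide; bodies and next_id/1 are assumed.
:- dynamic node/1, relationship/3, label/2, type/2, nodeProperty/3, next_id/1.

next_id(0).

% fresh_id(-Id): hand out consecutive integer identifiers.
fresh_id(Id) :-
    retract(next_id(Id)),
    Next is Id + 1,
    assertz(next_id(Next)).

create_node(NodeId) :-
    fresh_id(NodeId),
    assertz(node(NodeId)).

% Setting a label twice leaves a single label/2 fact.
set_label(NodeId, LabelName) :-
    node(NodeId),
    (   label(NodeId, LabelName)
    ->  true
    ;   assertz(label(NodeId, LabelName))
    ).

remove_label(NodeId, LabelName) :-
    retractall(label(NodeId, LabelName)).

create_relationship(StartId, EndId, RelationshipType, RelationshipId) :-
    node(StartId),
    node(EndId),
    fresh_id(RelationshipId),
    assertz(relationship(RelationshipId, StartId, EndId)),
    assertz(type(RelationshipId, RelationshipType)).

% Setting a property replaces any previous value for the same key.
set_node_property(Id, Key, Value) :-
    node(Id),
    retractall(nodeProperty(Id, Key, _)),
    assertz(nodeProperty(Id, Key, Value)).
```

Keeping the graph as plain facts means that pattern matching in a query can be expressed as ordinary Prolog goals over node/1, relationship/3, and the property predicates.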
  23. Cypher.PL: Core
      Environment:

          environment(env(Environment)) :-
              maplist(environment_entry,Environment).
          environment_entry(bind(Name,Value)) :-
              variable_name(Name),cypher_value(Value).
          %entities: cypher_node, cypher_relationship, cypher_path
          %primitives: cypher_string, cypher_integer, cypher_float, cypher_boolean,
          %structures: cypher_list (recursive), cypher_map (recursive)

      Query evaluation: folding of subsequent clause evaluations:

          %foldl(:Goal, +List, +V0, -V)
          foldl(eval_clause, Clauses, [env([])], ResultEnvironment).
          %eval_clause(++Clause:clause,++Environments:list,-ReturnEnvironments:list)
          %True if ReturnEnvironments are the evaluation of Clause in Environments
          eval_clause(Clause,Environments,ReturnEnvironments)
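The fold over clauses can be illustrated with two toy clause forms. The sketch below assumes SWI-Prolog; unwind/2 and where_gt/2 are hypothetical stand-ins that are far simpler than the Cypher.PL term representation, and eval_query/2 is a hypothetical wrapper around the foldl call shown above.

```prolog
% Sketch: a query as a fold of clause evaluations over a list of
% environments. Clause forms unwind/2 and where_gt/2 are hypothetical.
:- use_module(library(apply)).
:- use_module(library(lists)).

% eval_clause(+Clause, +Environments, -ReturnEnvironments)
% unwind(Name, Values): extend every environment once per value.
eval_clause(unwind(Name, Values), Envs0, Envs) :-
    findall(env([bind(Name, V) | Bindings]),
            ( member(env(Bindings), Envs0), member(V, Values) ),
            Envs).
% where_gt(Name, Bound): keep environments binding Name above Bound.
eval_clause(where_gt(Name, Bound), Envs0, Envs) :-
    include(binds_above(Name, Bound), Envs0, Envs).

binds_above(Name, Bound, env(Bindings)) :-
    memberchk(bind(Name, V), Bindings),
    V > Bound.

% eval_query(+Clauses, -Envs): start from the single empty environment.
eval_query(Clauses, Envs) :-
    foldl(eval_clause, Clauses, [env([])], Envs).
```

For instance, eval_query([unwind(x,[1,2,3]), where_gt(x,1)], Envs) leaves the two environments in which x is bound to 2 and to 3. Starting from [env([])] rather than an empty list matters: a first clause needs one environment to extend.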
  24. Feedfront: Intermediate Representation
      Machine-oriented
      ● Verbose
      ● Explicit
      ● Unambiguous
      Planner-friendly
      ● Minimal ordering constraints
      ● Unique variable names
      Human-friendly
      ● Mainly for debugging, not a primary goal

          cypher(statement(query(regularQuery(singleQuery([
            clause(with(no_modifier,returnBody(returnItems([returnItem(expression(cypher_integer(1)),variable(symbolicName('X')))]),no_order,no_skip,no_limit),no_where)),
            clause(return(no_modifier,returnBody(returnItems([returnItem(expression(listComprehension(filterExpression(idInColl(variable(symbolicName('X')),expression(cypher_list([expression(cypher_integer(1)),expression(cypher_integer(2))]))),no_where),identity)),variable(symbolicName('_value')))]),no_order,no_skip,no_limit)))
          ]),[]))),[])
  25. Feedfront: Intermediate Representation
      Simple model for query planner
      ● grounded in property graph model
      ● easy formal treatment
      Point of collaboration between implementers
      ● language agnostic
      ● engine agnostic
      ● discuss impact of language changes / extensions
  26. Summary: Cypher.PL features...
      readability, declarativeness, explicitness, consistency, comparability, ease of branching, ease of versioning, evolvability, meatiness, pithiness, preciseness, ease of (remote!) exchange of thoughts (collective thinking), closeness to mathematical formalism, language/engine agnosticism, verifiability, validatability, executability
      ...as an oC artifact of the fourth type: Cypher Language Specification
  27. Q&A
      The general question to the oC community and to the oC leaders:
      Is such an executable specification of Cypher a desired artifact of openCypher?
      Leading question:
      Is there a current issue (CIR/CIP) that could be specified executably in Cypher.PL, as the grouping issue was, to make collaborative thinking easier?
