Learning Timed Automata with Cypher
Gábor Szárnyas, Anna Gujgiczer, Márton Elekes
4th openCypher Implementers Meeting
Sicco Verwer, Mathijs de Weerdt, Cees Witteveen:
An algorithm for learning real-time automata.
Benelearn 2007
Automaton Learning
STATE MACHINE
[Figure: state machine with the states Off, Stop, Continue, Prepare and Go; transitions are labelled switchPhase and onOff]
AUTOMATON
 States: accepting/rejecting
 Run: (finite) event sequence
 Accepted run: ends in an accepting state
 Modelled behaviour: accepted runs
[Figure: automaton with the states Off, Stop, Continue, Prepare and Go, each marked as accepting (+) or rejecting (-); transitions are labelled switchPhase and onOff]
TIMED AUTOMATON
 Time spent in a state
 Guard condition: an interval
[Figure: the same automaton with interval guards on its transitions, e.g. [3,5], [30,35] and [1,2]; the remaining transitions carry the guard [0,∞]]
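The definitions on the automaton slides can be sketched in plain Java. This is a minimal illustration, not the implementation used in the talk: the class and method names are hypothetical, time is assumed to be an integer, and the assignment of the guards [30,35], [1,2] and [3,5] to particular transitions (as well as the choice of Go as the accepting state) is made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of a deterministic timed automaton whose transitions carry interval guards. */
class TimedAutomatonSketch {

    /** Fires on `symbol` if the time spent in `from` lies within [tMin, tMax]. */
    static final class Transition {
        final String from, symbol, to;
        final int tMin, tMax;
        Transition(String from, String symbol, int tMin, int tMax, String to) {
            this.from = from; this.symbol = symbol; this.tMin = tMin; this.tMax = tMax; this.to = to;
        }
    }

    /** One step of a run: an event plus the time spent in the state before it occurred. */
    static final class TimedEvent {
        final String symbol;
        final int time;
        TimedEvent(String symbol, int time) { this.symbol = symbol; this.time = time; }
    }

    private final String initial;
    private final List<Transition> transitions = new ArrayList<>();
    private final Set<String> accepting = new HashSet<>();

    TimedAutomatonSketch(String initial) { this.initial = initial; }

    void addTransition(String from, String symbol, int tMin, int tMax, String to) {
        transitions.add(new Transition(from, symbol, tMin, tMax, to));
    }

    void markAccepting(String state) { accepting.add(state); }

    /** A run is accepted if every event satisfies a guard and the run ends in an accepting state. */
    boolean accepts(List<TimedEvent> run) {
        String current = initial;
        for (TimedEvent e : run) {
            String next = null;
            for (Transition t : transitions) {
                if (t.from.equals(current) && t.symbol.equals(e.symbol)
                        && t.tMin <= e.time && e.time <= t.tMax) {
                    next = t.to;
                    break;
                }
            }
            if (next == null) return false; // no transition is enabled for this event
            current = next;
        }
        return accepting.contains(current);
    }

    /** Hypothetical guard assignment, for illustration only. */
    static TimedAutomatonSketch example() {
        TimedAutomatonSketch ta = new TimedAutomatonSketch("Stop");
        ta.addTransition("Stop", "switchPhase", 30, 35, "Prepare");
        ta.addTransition("Prepare", "switchPhase", 1, 2, "Go");
        ta.addTransition("Go", "switchPhase", 30, 35, "Continue");
        ta.addTransition("Continue", "switchPhase", 3, 5, "Stop");
        ta.markAccepting("Go");
        return ta;
    }

    public static void main(String[] args) {
        TimedAutomatonSketch ta = example();
        // Stop -> Prepare -> Go, both guards satisfied: prints true
        System.out.println(ta.accepts(Arrays.asList(
                new TimedEvent("switchPhase", 31), new TimedEvent("switchPhase", 2))));
        // 10 is outside [30,35]: prints false
        System.out.println(ta.accepts(Arrays.asList(new TimedEvent("switchPhase", 10))));
    }
}
```

An untimed automaton is the special case in which every guard is [0,∞].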
AUTOMATON LEARNING
 System: unknown internal implementation (black box)
o System Under Learning
 Goal: determine the internal operation of the system (~reverse engineering)
o Hypothesis Model
 Approach: observation → execution trajectories → automaton
 Application: monitor synthesis, testing
[Figure: System Under Learning, Trajectory of Runs, Learning Algorithm, Hypothesis Model]
AUTOMATON LEARNING: THE BASICS
[Figure: observed timed events (a,1), (a,2), (a,3), (a,5), (a,7) leading to accepting (+) and rejecting (-) states, alongside automata whose transitions carry the interval guards a,[1,2], a,[3,5] and a,[1,5]]
AUTOMATON LEARNING: AN ALGORITHM
 Repeating three operations
o Split inconsistent states
o Merge similar states
o Colour to finalise a state
 Selecting an operation
o Using scoring metrics based on the result of the operation
 Operations have to be executed before they can be scored
[Figure: split example with inconsistent (+/-) states reached via a,[1,5] and the split guards a,[1,2] and a,[3,5]]
[Figure: merge step on states with a and b transitions]
[Figure: Select operation, followed by Split, Merge or Colour]
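The selection step can be sketched as follows. This is a toy illustration under stated assumptions: `Candidate`, `selectBest` and the integer "model" are hypothetical names and types, operations are assumed to return a new model instead of mutating the current one, and a higher metric is assumed to be better (the split query later in the deck orders by `metric DESC`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;
import java.util.function.UnaryOperator;

/** Sketch: every candidate operation is executed first, then its result is scored. */
class OperationSelectionSketch {

    /** A named operation (e.g. split, merge, colour) that maps a model to a new model. */
    static final class Candidate<M> {
        final String name;
        final UnaryOperator<M> operation;
        Candidate(String name, UnaryOperator<M> operation) {
            this.name = name;
            this.operation = operation;
        }
    }

    /** Returns the candidate whose result has the highest score (first one wins on ties). */
    static <M> Candidate<M> selectBest(M model, List<Candidate<M>> candidates, ToIntFunction<M> metric) {
        Candidate<M> best = null;
        int bestScore = Integer.MIN_VALUE;
        for (Candidate<M> c : candidates) {
            M result = c.operation.apply(model);   // the operation has to be executed...
            int score = metric.applyAsInt(result); // ...before its result can be scored
            if (best == null || score > bestScore) {
                best = c;
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy model: an Integer that directly stands for the metric of the resulting state.
        List<Candidate<Integer>> candidates = new ArrayList<>();
        candidates.add(new Candidate<Integer>("split", m -> 5));
        candidates.add(new Candidate<Integer>("merge", m -> 12));
        candidates.add(new Candidate<Integer>("colour", m -> 8));
        System.out.println(selectBest(0, candidates, m -> m).name); // prints "merge"
    }
}
```

Because each candidate is applied to the same starting model, the results of the losing candidates have to be discarded, which is where the need for backtracking comes from.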
Implementation
DATA MODEL
Interesting Queries
INTERESTING QUERIES
 How to select a “longest path”
o i.e. a maximal path that cannot be extended further
 Merge
o Transitive reachability for the states to-be-merged
 Split: categorisation/copying subtrees
o Merge might result in multiple origNext edges
o Which one to use for categorisation?
WHERE NOT «PATTERN»
 Finding the beginning of a path
MATCH (u)-[:Next*]->(v) WHERE NOT ()-[:Next]->(u)
 Finding the end of a path
MATCH (u)-[:Next*]->(v) WHERE NOT (v)-[:Next]->()
 These two can be used to select a complete path
MATCH (u)-[:Next*]->(v)
WHERE NOT ()-[:Next]->(u)
AND NOT (v)-[:Next]->()
[Figure: a chain of :Next edges in which u is the first node and v is the last node]
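For comparison, the same "maximal path" selection can be written imperatively. The sketch below is an assumption-laden illustration: the graph is a plain adjacency map (hypothetical representation), it is assumed to be acyclic, and node names are strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: paths that cannot be extended at either end, assuming an acyclic graph. */
class MaximalPathSketch {

    /** `next` maps a node to the targets of its outgoing Next edges. */
    static List<List<String>> maximalPaths(Map<String, List<String>> next) {
        Set<String> hasIncoming = new HashSet<>();
        for (List<String> targets : next.values()) hasIncoming.addAll(targets);

        Set<String> nodes = new TreeSet<>(next.keySet()); // sorted for a stable order
        nodes.addAll(hasIncoming);

        List<List<String>> result = new ArrayList<>();
        for (String u : nodes) {
            if (!hasIncoming.contains(u)) { // corresponds to WHERE NOT ()-[:Next]->(u)
                extend(u, next, new ArrayList<String>(), result);
            }
        }
        return result;
    }

    private static void extend(String node, Map<String, List<String>> next,
                               List<String> prefix, List<List<String>> result) {
        prefix.add(node);
        List<String> targets = next.getOrDefault(node, Collections.<String>emptyList());
        if (targets.isEmpty()) { // corresponds to WHERE NOT (v)-[:Next]->()
            // [:Next*] needs at least one edge, so isolated nodes are skipped
            if (prefix.size() > 1) result.add(new ArrayList<>(prefix));
        } else {
            for (String t : targets) extend(t, next, prefix, result);
        }
        prefix.remove(prefix.size() - 1);
    }

    public static void main(String[] args) {
        Map<String, List<String>> next = new HashMap<>();
        next.put("a", Arrays.asList("b"));
        next.put("b", Arrays.asList("c", "d"));
        next.put("x", Arrays.asList("y"));
        System.out.println(maximalPaths(next)); // prints [[a, b, c], [a, b, d], [x, y]]
    }
}
```

The Cypher version expresses both end conditions declaratively in one `WHERE` clause, while the imperative version needs an explicit traversal.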
MERGE
 Select states to merge
 Calculate scoring metrics
 Filter inconsistencies
 Return result
[Figure: accepting (+) and rejecting (-) states connected by :Next edges and paired by :indexPair edges, before and after merging]
MERGE
 Problem:
o Inconsistencies can be transitive
 Solution:
o Variable length path
[Figure: states s1 and s2 joined by :indexPair edges, with :Next successors marked accepting (+) and rejecting (-) that are paired by new :indexPair edges]
// Find critical state pairs
MATCH (s1:IndexedMerge)-[:IndexPair*]-(s2:IndexedMerge),
s1p = (s1)-[:Next*0..]->(s12:State),
s2p = (s2)-[:Next*0..]->(s22:State)
// Paths of the same length
WHERE id(s1) > id(s2) AND length(s1p) = length(s2p)
AND NOT (s12)-[:IndexPair*0..]-(s22)
WITH s12, s22, s1p, s2p,
[s1r IN rels(s1p) | s1r.symbol] AS s1ss, [s2r IN rels(s2p) | s2r.symbol] AS s2ss,
[s1r IN rels(s1p) | s1r.Tmin] AS s1mins, [s2r IN rels(s2p) | s2r.Tmin] AS s2mins,
[s1r IN rels(s1p) | s1r.Tmax] AS s1maxs, [s2r IN rels(s2p) | s2r.Tmax] AS s2maxs
// Same event sequences and guards along both paths
WHERE s1ss = s2ss AND s1mins = s2mins AND s1maxs = s2maxs
WITH collect([s12, s22]) as pairs
MATCH ()-[ip:IndexPair]->()
WITH max(ip.index) + 1 as nextIndex, pairs
UNWIND range(0, size(pairs)-1) as idx
WITH pairs[idx][0] as s12, pairs[idx][1] as s22, idx + nextIndex as edgeIndex
// Mark newly found index pairs
CREATE (s12)-[:IndexPair {index: edgeIndex}]->(s22)
SET s12:IndexedMerge, s22:IndexedMerge
Repeatedly called from Java code until fixed-point:
while (driver.run(createIndexPairs, mergeMap))
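The fixed-point pattern can be illustrated without a database. The following is a simplified sketch, not the query above: states are strings, each transition label (hypothetically encoded as symbol plus guard, e.g. "a[1,5]") maps to a single successor, and each round only pairs direct successors instead of matching variable-length paths or treating pairs as equivalence classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: propagate merge pairs to equally labelled successors until nothing new is found. */
class MergeFixedPointSketch {

    /** Order-independent representation of a state pair. */
    static List<String> pair(String a, String b) {
        return a.compareTo(b) <= 0 ? Arrays.asList(a, b) : Arrays.asList(b, a);
    }

    /** One round: returns the number of newly added pairs (0 means the fixed point is reached). */
    static int createIndexPairs(Map<String, Map<String, String>> next, Set<List<String>> pairs) {
        Set<List<String>> found = new LinkedHashSet<>();
        for (List<String> p : pairs) {
            Map<String, String> out1 = next.getOrDefault(p.get(0), Collections.<String, String>emptyMap());
            Map<String, String> out2 = next.getOrDefault(p.get(1), Collections.<String, String>emptyMap());
            for (Map.Entry<String, String> e : out1.entrySet()) {
                String target2 = out2.get(e.getKey()); // same symbol and guard on both sides
                if (target2 != null && !target2.equals(e.getValue())) {
                    List<String> candidate = pair(e.getValue(), target2);
                    if (!pairs.contains(candidate)) found.add(candidate);
                }
            }
        }
        pairs.addAll(found);
        return found.size();
    }

    /** Starts from one pair to merge and repeats the round until it finds nothing new. */
    static Set<List<String>> closure(Map<String, Map<String, String>> next, String s1, String s2) {
        Set<List<String>> pairs = new LinkedHashSet<>();
        pairs.add(pair(s1, s2));
        while (createIndexPairs(next, pairs) > 0) {
            // keep iterating: new pairs may imply further pairs
        }
        return pairs;
    }

    static void edge(Map<String, Map<String, String>> next, String from, String label, String to) {
        next.computeIfAbsent(from, k -> new HashMap<String, String>()).put(label, to);
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> next = new HashMap<>();
        edge(next, "s1", "a[1,5]", "s3");
        edge(next, "s2", "a[1,5]", "s4");
        edge(next, "s3", "b[0,2]", "s5");
        edge(next, "s4", "b[0,2]", "s6");
        System.out.println(closure(next, "s1", "s2")); // prints [[s1, s2], [s3, s4], [s5, s6]]
    }
}
```

The loop has the same shape as the `while (driver.run(...))` call in the deck: the round is repeated for as long as it reports new pairs.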
SPLIT
 Splits an edge
 Based on timestamp: t (t = 5 in the example)
o Original transition
 The two endpoints…
o Smaller: < t
o Larger: ≥ t
 Results
o [1,4]
o [5,6]
[Figure: an edge with the guard [1,6] between accepting (+) and rejecting (-) states; its :Origin edges lead to original transitions with the timestamps 1, 2 and 6]
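The arithmetic of the example can be sketched as follows, assuming integer timestamps as on the slide. `Guard`, `splitGuard` and `categorise` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: splitting a guard interval at timestamp t and categorising original timestamps. */
class SplitSketch {

    static final class Guard {
        final int tMin, tMax;
        Guard(int tMin, int tMax) { this.tMin = tMin; this.tMax = tMax; }
        @Override public String toString() { return "[" + tMin + "," + tMax + "]"; }
    }

    /** Splits [tMin, tMax] into [tMin, t-1] (smaller: < t) and [t, tMax] (larger: >= t). */
    static Guard[] splitGuard(Guard g, int t) {
        if (t <= g.tMin || t > g.tMax) {
            throw new IllegalArgumentException("t must satisfy tMin < t <= tMax");
        }
        return new Guard[] { new Guard(g.tMin, t - 1), new Guard(t, g.tMax) };
    }

    /** Returns two lists: the timestamps smaller than t, then those greater than or equal to t. */
    static List<List<Integer>> categorise(List<Integer> timestamps, int t) {
        List<Integer> smaller = new ArrayList<>();
        List<Integer> larger = new ArrayList<>();
        for (int ts : timestamps) {
            if (ts < t) smaller.add(ts); else larger.add(ts);
        }
        return Arrays.asList(smaller, larger);
    }

    public static void main(String[] args) {
        Guard[] parts = splitGuard(new Guard(1, 6), 5);
        System.out.println(parts[0] + " " + parts[1]);             // prints [1,4] [5,6]
        System.out.println(categorise(Arrays.asList(1, 6, 2), 5)); // prints [[1, 2], [6]]
    }
}
```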
SPLIT
 Problem
o Node a is a result of an earlier merge operation
o The subtree regenerated from node b should belong to the…
• Larger subtree because of 6
• Smaller subtree because of 1
 Solution
o “Shortest” path to node a
o Based on the last number: 6
[Figure: an edge with the guard [1,6] into node a; :Origin edges link the states to original runs with the timestamps 1, 6 and 6, one of which leads to node b]
SPLIT
MATCH (r:Red)-[n:Next]->(b:Blue)
WITH r, b, n.Tmax as Tmax, n.Tmin as Tmin, n.symbol as symbol
UNWIND range(Tmin, Tmax-1) AS t
MATCH (b)-[:Next*0..]->(s1:State)-[:Origin]->(so1:OrigState),
    trace1=(so1)<-[:OrigNext*0..]-(redOrigNext:OrigState),
    (redOrigNext)<-[no1:OrigNext]-(redOrig:OrigState)<-[:Origin]-(r)
WHERE none(x IN nodes(trace1) WHERE (x)<-[:Origin]-(r))
WITH r, b, t, Tmin, Tmax, symbol, s1, no1, so1
MATCH (b)-[:Next*0..]->(s2:State)-[:Origin]->(so2:OrigState),
    trace2=(so2)<-[:OrigNext*0..]-(redOrigNext:OrigState),
    (redOrigNext)<-[no2:OrigNext]-(redOrig:OrigState)<-[:Origin]-(r)
WHERE none(x IN nodes(trace2) WHERE (x)<-[:Origin]-(r))
WITH t, r, b, s1, s2, toInteger(no1.time)>t as cat1, toInteger(no2.time)>t as cat2, Tmin, Tmax, symbol, so1, so2
WHERE s1 = s2
AND cat1 <> cat2
AND id(so1) < id(so2)
WITH r, b, t, Tmin, Tmax, symbol,
[coalesce(so1.accepting, false), coalesce(so1.rejecting, false)] AS so1ar,
[coalesce(so2.accepting, false), coalesce(so2.rejecting, false)] AS so2ar
WITH
r, b, t, Tmin, Tmax, symbol,
CASE
WHEN so1ar[0] <> so2ar[0] AND so1ar[1] <> so2ar[1] THEN 1
WHEN so1ar = so2ar AND (so1ar = [false, true] OR so1ar = [true, false]) THEN -1 // there is at least one true
ELSE 0
END AS score
WITH r, b, t, sum(score) AS metric, Tmin, Tmax, symbol
ORDER BY metric DESC
LIMIT 1
WITH r, b, t, metric, Tmin, Tmax, symbol
MATCH (b)-[:Next*0..]->(s1:State)-[:Origin]->(so1:OrigState),
    trace1=(so1)<-[:OrigNext*0..]-(redOrigNext:OrigState),
    (redOrigNext)<-[no1:OrigNext]-(redOrig:OrigState)<-[:Origin]-(r)
WHERE none(x IN nodes(trace1) WHERE (x)<-[:Origin]-(r)) AND toInteger(no1.time)>t
WITH r, b, t, metric, Tmin, Tmax, symbol, collect(so1) as so1s
MATCH (b)-[:Next*0..]->(s2:State)-[:Origin]->(so2:OrigState),
    trace2=(so2)<-[:OrigNext*0..]-(redOrigNext:OrigState),
    (redOrigNext)<-[no2:OrigNext]-(redOrig:OrigState)<-[:Origin]-(r)
WHERE none(x IN nodes(trace2) WHERE (x)<-[:Origin]-(r)) AND toInteger(no2.time)<=t
RETURN r, b, t, metric, Tmin, Tmax, symbol, so1s, collect(so2) as so2s
SPLIT
 Solution
o “Shortest” path to node a
o Based on the last number: 6
[Figure: node a and state s2 with :Origin edges to the original nodes ao and b; an is the successor of ao on the original run that leads to b]
The split candidate in the original runs:
MATCH
(a)-[:Next*0..]->(s2:State)-[:Origin]->(b:OrigState),
trace=(b)<-[:OrigNext*0..]-(an:OrigState),
(an)<-[no:OrigNext]-(ao:OrigState)<-[:Origin]-(a)
The shortest possible path, i.e. no node on the trace is itself an origin of node a, so node a is only reached directly:
WHERE none(x IN nodes(trace) WHERE (x)<-[:Origin]-(a))
Technical Details
ADVANTAGES OF NEO4J AND CYPHER
 Visualisation
 Flexible data model
o New labels, properties
o No need to define the schema upfront
 Cypher: high-level declarative graph query language
WORKFLOW
 Neo4j web browser UI
MATCH (n)
RETURN n
We should combine transformation and visualisation
CHAINING QUERIES
 Chaining queries to write a complex step of the algorithm with a single Cypher query
MATCH (node:Blue)
REMOVE node:Blue
SET node:Red
RETURN *
Result:
node
:Red {prop1: "val1"}
:Red {prop1: "val2"}
MATCH (node:Blue)
REMOVE node:Blue
SET node:Red
WITH 1 AS dummy
RETURN *
Result:
dummy
1
1
MATCH (node:Blue)
REMOVE node:Blue
SET node:Red
WITH 1 AS dummy
MATCH (n)
RETURN *
Result:
n                       dummy
:Red {prop1: "val1"}    1
:Red {prop1: "val2"}    1
:Green {value: 13}      1
:Red {prop1: "val1"}    1
:Red {prop1: "val2"}    1
:Green {value: 13}      1
Cartesian product → all rows are displayed twice
CHAINING QUERIES
MATCH (node:Blue)
REMOVE node:Blue
SET node:Red
WITH count(*) as dummy
RETURN *
Result:
dummy
2
MATCH (node:Blue)
REMOVE node:Blue
SET node:Red
WITH count(*) as dummy
MATCH (n)
RETURN *
Result:
n                       dummy
:Red {prop1: "val1"}    2
:Red {prop1: "val2"}    2
:Green {value: 13}      2
Exactly 1 result → the second query will not produce unnecessary duplicates
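The row counts in the result tables follow from how many rows each `WITH` passes on. The sketch below only simulates that row semantics in plain Java; it does not run Cypher, and the method names (`withConstant`, `withCount`, `matchAll`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch: row cardinality of "WITH 1 AS dummy" versus "WITH count(*) AS dummy". */
class ChainingRowsSketch {

    /** WITH 1 AS dummy is a projection: one output row per input row. */
    static List<Integer> withConstant(int inputRows) {
        return new ArrayList<>(Collections.nCopies(inputRows, 1));
    }

    /** WITH count(*) AS dummy aggregates without grouping keys: exactly one output row. */
    static List<Integer> withCount(int inputRows) {
        return Collections.singletonList(inputRows);
    }

    /** A following MATCH (n) is evaluated once per incoming row. */
    static List<String> matchAll(List<Integer> rows, List<String> nodes) {
        List<String> out = new ArrayList<>();
        for (int dummy : rows) {
            for (String n : nodes) {
                out.add(n + " | " + dummy);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList(":Red val1", ":Red val2", ":Green 13");
        int updatedRows = 2; // two :Blue nodes were relabelled to :Red
        System.out.println(matchAll(withConstant(updatedRows), nodes).size()); // prints 6
        System.out.println(matchAll(withCount(updatedRows), nodes).size());    // prints 3
    }
}
```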
RUNNING AND VISUALISING THE ALGORITHM
...
WITH count(*) as dummy
MATCH (n)
RETURN n
EXECUTING QUERIES
Passing information between queries:
 Node labels
 Nodes with temporary information
 Passing nodes/edges as parameters (only supported in embedded mode)
redNode                 num
:Red {prop1: "val1"}    1
Exactly 1 result
WITH $redNode AS redNode
MATCH (g:Green {num: $num})
CREATE (redNode)-[:Edge]->(g)
LOOPS
 Many loop-like problems can be solved using lists:
o List comprehensions: [x IN xs WHERE condition | f(x)]
o Reduce: reduce(acc = "", x IN list | acc + x.prop)
 However, loops still might be necessary:
o Run queries from Java code
o The loop condition checks whether the result has any rows
// repeat the steps while the condition query still returns at least one row
while (gds.execute(conditionQuery).hasNext()) {
    gds.execute(step1Query);
    gds.execute(step2Query);
}
IMPORT-EXPORT
 Save current state of the algorithm
 APOC* GraphML import-export
*Awesome Procedures on Cypher
// save
CALL apoc.export.graphml.all('my.graphml',
{storeNodeIds:true, readLabels:true, useTypes:true})
// load
MATCH (n) // delete all
DETACH DELETE n
WITH count(*) AS dummy //-----------------
CALL apoc.import.graphml('my.graphml',
{storeNodeIds:true, readLabels:true, useTypes:true})
YIELD nodes, relationships
WITH count(*) AS dummy //-----------------
MATCH (n) RETURN n // show all
BACKTRACKING
 In many cases, we need to restore a previous state and continue from that one
o Next step is based on scoring metrics
 Solutions:
o Export -> Re-import
o Use transactions:
• Abort transaction
• Only applicable to one level of backtracking
[Figure: candidate next states with the metrics 5, 12 and 8]
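The difference between the two solutions can be sketched with an in-memory stand-in. This is an analogy under stated assumptions, not the export/re-import code: snapshots held on a stack play the role of exported GraphML files, the "state" is just a list of applied operation names, and all names are hypothetical. Unlike aborting a single transaction, a stack of snapshots supports several levels of backtracking.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Sketch: multi-level backtracking by saving and restoring whole-state snapshots. */
class BacktrackingSketch {
    private final Deque<List<String>> snapshots = new ArrayDeque<>();
    private List<String> current = new ArrayList<>();

    /** Saves a copy of the current state (the analogue of an export). */
    void checkpoint() { snapshots.push(new ArrayList<>(current)); }

    /** Applies an operation; here the state simply records the operation name. */
    void apply(String operation) { current.add(operation); }

    /** Restores the most recent snapshot (the analogue of a re-import); false if none is left. */
    boolean backtrack() {
        if (snapshots.isEmpty()) return false;
        current = snapshots.pop();
        return true;
    }

    List<String> state() { return Collections.unmodifiableList(current); }

    public static void main(String[] args) {
        BacktrackingSketch b = new BacktrackingSketch();
        b.checkpoint();
        b.apply("merge");
        b.checkpoint();
        b.apply("split");
        b.backtrack();
        System.out.println(b.state()); // prints [merge]
        b.backtrack();
        System.out.println(b.state()); // prints []
    }
}
```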
DEBUGGING
 Save and play back the states that occurred during execution
o A counter for the states is stored in a node
o Saved query to step back
DEBUGGING
 For complex errors
o Formulate an error pattern in Cypher
o Use this to define a breakpoint in the code
o Load the state where the error occurred
o Debug step by step
MATCH (node:Faulty)
RETURN node
if (gds.execute(query).hasNext())
// breakpoint
Lessons Learnt
Sicco Verwer:
Efficient identification
of timed automata.
PhD dissertation
PSEUDOCODE
EXPRESSIVE POWER OF CYPHER
 Most fixed-size graph queries can be decided in P-time
 Checking whether two disjoint paths exist is NP-complete
 Cypher is Turing-complete
o reduce
 Cypher 9 (current version)
 Cypher 10 (multiple graph support)
o This would simplify a lot of our queries.
N. Francis et al.,
Cypher: An Evolving Query Language for Property Graphs,
SIGMOD 2018
SUMMARY
 Neo4j prototype
o Rapid development
o Lots of APOC procedures for specific problems (e.g. mergeNodes)
 Issues
o APOC bug (reported)
o Visualisation glitches
o Lack of fixed-point algorithms
 For graph algorithms, we’d recommend
o creating a prototype in Neo4j, then
o once the algorithm is understood, rewriting it in a general-purpose programming language.
OUR TEAM
Co-advisor: Rebeka Farkas, Thanks to: Oszkár Semeráth, Tamás Tóth, András Vörös

Learning Timed Automata with Cypher

  • 1.
    Learning Timed Automatawith Cypher Gábor Szárnyas, Anna Gujgiczer, Márton Elekes 4th openCypher Implementers Meeting
  • 2.
    Sicco Verwer, Mathijsde Weerdt, Cees Witteveen: An algorithm for learning real-time automata. Benelearn 2007
  • 3.
  • 4.
  • 5.
  • 6.
    AUTOMATON  States: accepting/rejecting Run: (finite) event sequence  Accepted run: ends in an accepting state  Modelled behaviour: accepted runsOff Stop Continue Prepare Go switchPhase switchPhase switch Phase onOff onOff onOff onOff onOff switch Phase switch Phase
  • 7.
    AUTOMATON  States: accepting/rejecting Run: (finite) event sequence  Accepted run: ends in an accepting state  Modelled behaviour: accepted runsOff Stop Continue Prepare Go switchPhase switchPhase switch Phase onOff onOff onOff onOff onOff switch Phase + - - - - switch Phase
  • 8.
    TIMED AUTOMATON  Timespent in a state  Guard condition: an interval Off Stop Continue Prepare Go switchPhase switchPhase switch Phase onOff onOff onOff onOff onOff switch Phase + - - - - switch Phase
  • 9.
    TIMED AUTOMATON  Timespent in a state  Guard condition: an interval Off Stop Continue Prepare Go switchPhase switchPhase switch Phase onOff onOff onOff onOff onOff switch Phase + - - - - [3,5] [30,35] [1,2] switch Phase [30,35] [0,∞] [0,∞] [0,∞] [0,∞] [0,∞] [0,∞]
  • 10.
  • 11.
    AUTOMATON LEARNING  System:unknown internal implementation (black box) o System Under Learning
  • 12.
    AUTOMATON LEARNING  System:unknown internal implementation (black box) o System Under Learning  Goal: determine the internal operation of the system (~reverse engineering) o Hypothesis Model
  • 13.
    AUTOMATON LEARNING  System:unknown internal implementation (black box) o System Under Learning  Goal: determine the internal operation of the system (~reverse engineering) o Hypothesis Model  Approach: observation  execution trajectories  automaton
  • 14.
    AUTOMATON LEARNING  System:unknown internal implementation (black box) o System Under Learning  Goal: determine the internal operation of the system (~reverse engineering) o Hypothesis Model  Approach: observation  execution trajectories  automaton Hypothesis Model System Under Learning Trajectory of Runs Learning Algorithm
  • 15.
    AUTOMATON LEARNING  System:unknown internal implementation (black box) o System Under Learning  Goal: determine the internal operation of the system (~reverse engineering) o Hypothesis Model  Approach: observation  execution trajectories  automaton  Application: monitor synthesis, testing Hypothesis Model System Under Learning Trajectory of Runs Learning Algorithm
  • 16.
    AUTOMATON LEARNING: THEBASICS + - - + a,1 a,5 + a,3 a,5 a,7 a,2 + - a,[1,2] + a,[1,5] + a,[3,5] - a,[1,5] + - a, [1,2] a,[1,2] a, [3,5] a, [3,5]
  • 17.
    AUTOMATON LEARNING: ANALGORITHM  Repeating three operations o Split inconsistent states + - a,[1,2] + a,[1,5] + a,[3,5] - a,[1,5] + +/- a,[1,5] +/- a,[1,5] + - - + a,1 a,5 + a,3 a,5 a,7 a,2
  • 18.
    AUTOMATON LEARNING: ANALGORITHM  Repeating three operations o Split inconsistent states o Merge similar states - - b + + a a -b + + a a -b + a Merge step
  • 19.
    AUTOMATON LEARNING: ANALGORITHM  Repeating three operations o Split inconsistent states o Merge similar states o Colour to finalise a state  Selecting an operation o Using scoring metrics based on the result of the operation  Operations have to be executed Select operation Split Merge Colour
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
    INTERESTING QUERIES  Howto select a “longest path” o i.e. a maximal path that cannot be extended further  Merge o Transitive reachability for the states to-be-merged  Split: categorisation/copying subtrees o Merge might result in multiple origNext edges o Which one to use for categorisation?
  • 32.
    WHERE NOT «PATTERN» Finding the beginning of a path  Finding the end of a path  These two can be used to select a complete path MATCH (u)-[:Next*]->(v) WHERE NOT ()-[:Next]->(u) MATCH (u)-[:Next*]->(v) WHERE NOT (v)-[:Next]->() MATCH (u)-[:Next*]->(v) WHERE NOT ()-[:Next]->(u) AND NOT (v)-[:Next]->() u:Next :Next … …:Next :Next :Next … …v :Next
  • 33.
    MERGE :indexPair :indexPair - -:Next :Next + + :Next :Next … … :indexPair -:Next +:Next …  Selectstates to merge  Calculate scoring metrics  Filter inconsistencies  Return result
  • 34.
    MERGE + - :indexPair :indexPair  Problem: oInconsistencies can be transitive  Solution: o Variable length path :indexPair :indexPair + -:Next :Next
  • 35.
    MERGE + - :indexPair :indexPair  Problem: oInconsistencies can be transitive  Solution: o Variable length path :indexPair :indexPair + -:Next :Next while (driver.run(createIndexPairs, mergeMap))
  • 36.
    MERGE + - :indexPair :indexPair  Problem: oInconsistencies can be transitive  Solution: o Variable length path :indexPair :indexPair + -:Next :Next MATCH (s1:IndexedMerge)-[:IndexPair*..]-(s2:IndexedMerge), s1p = (s1)-[:Next*0..]->(s12:State), s2p = (s2)-[:Next*0..]->(s22:State) WHERE id(s1) > id(s2) AND length(s1p) = length(s2p) AND NOT (s12)-[:IndexPair*0..]-(s22) WITH s12, s22, s1p, s2p,[s1r IN rels(s1p) | s1r.symbol] AS s1ss, [s2r IN rels(s2p) | s2r.symbol] AS s2ss, [s1r IN rels(s1p) | s1r.Tmin] AS s1mins, [s2r IN rels(s2p) | s2r.Tmin ] AS s2mins, [s1r IN rels(s1p) | s1r.Tmax] AS s1maxs, [s2r IN rels(s2p) | s2r.Tmax ] AS s2maxs WHERE s1ss = s2ss AND s1mins = s2mins AND s1maxs = s2maxs WITH collect([s12, s22]) as pairs MATCH ()-[ip:IndexPair]->() WITH max(ip.index) + 1 as nextIndex, pairs UNWIND range(0, length(pairs)-1) as idx WITH pairs[idx][0] as s12, pairs[idx][1] as s22, idx + nextIndex as edgeIndex CREATE (s12)-[:IndexPair {index: edgeIndex}]->(s22) SET s12:IndexedMerge, s22:IndexedMerge while (driver.run(createIndexPairs, mergeMap))
  • 42.
    MERGE
    • Problem:
      o Inconsistencies can be transitive
    • Solution:
      o Variable length path

    Find critical state pairs:
      MATCH
        (s1:IndexedMerge)-[:IndexPair*]-(s2:IndexedMerge),
        s1p = (s1)-[:Next*0..]->(s12:State),
        s2p = (s2)-[:Next*0..]->(s22:State)
      WHERE id(s1) > id(s2)
        AND length(s1p) = length(s2p)
        AND NOT (s12)-[:IndexPair*0..]-(s22)
      WITH s12, s22, s1p, s2p,
        [s1r IN rels(s1p) | s1r.symbol] AS s1ss,
        [s2r IN rels(s2p) | s2r.symbol] AS s2ss,
        [s1r IN rels(s1p) | s1r.Tmin] AS s1mins,
        [s2r IN rels(s2p) | s2r.Tmin] AS s2mins,
        [s1r IN rels(s1p) | s1r.Tmax] AS s1maxs,
        [s2r IN rels(s2p) | s2r.Tmax] AS s2maxs
      WHERE s1ss = s2ss AND s1mins = s2mins AND s1maxs = s2maxs
      ...

    Conditions checked by the query:
      o Paths of same length
      o Event sequences of same lengths, etc.
  • 48.
    MERGE
    • Problem:
      o Inconsistencies can be transitive
    • Solution:
      o Variable length path

    Mark newly found index pairs:
      CREATE (s12)-[:IndexPair {index: edgeIndex}]->(s22)
      SET s12:IndexedMerge, s22:IndexedMerge

    Repeatedly called from Java code until a fixed point is reached:
      while (driver.run(createIndexPairs, mergeMap))
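    The fixed-point iteration described on this slide can be sketched in plain Python, without a database. This is only an illustration under simplifying assumptions: the names `close_merge_pairs` and `successors` are hypothetical, transitions are compared by symbol only (the Cypher query also compares the Tmin/Tmax guards), and the loop runs in memory instead of re-running a query through the Java driver.

```python
def close_merge_pairs(pairs, successors):
    """Propagate merge pairs until no new pair appears (fixed point).

    pairs: iterable of 2-tuples of distinct states that are marked for merging.
    successors: dict mapping a state to {symbol: next_state}.
    """
    closed = {frozenset(p) for p in pairs}
    changed = True
    while changed:  # plays the role of the Java `while (...)` loop
        changed = False
        for pair in list(closed):
            a, b = tuple(pair)
            # Symbols on which both states have an outgoing transition.
            shared = successors.get(a, {}).keys() & successors.get(b, {}).keys()
            for symbol in shared:
                implied = frozenset({successors[a][symbol], successors[b][symbol]})
                # Merging a and b forces their successors to be merged as well.
                if len(implied) == 2 and implied not in closed:
                    closed.add(implied)
                    changed = True
    return closed


successors = {
    "s1": {"go": "s3"},
    "s2": {"go": "s4"},
    "s3": {"stop": "s5"},
    "s4": {"stop": "s6"},
}
result = close_merge_pairs([("s1", "s2")], successors)
```

    Starting from the single pair (s1, s2), the first pass adds (s3, s4) and the second adds (s5, s6); the third pass finds nothing new, so the loop stops.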
  • 50.
    SPLIT
    • Splits an edge
    • Based on timestamp: t
      o Original transition
    • The two endpoints…
      o Smaller: < t
      o Larger: ≥ t
    (Figure: a transition with guard [1,6], connected by :Origin edges to the original runs.)
  • 52.
    SPLIT
    • Splits an edge
    • Based on timestamp: 5
      o Original transition
    • The two endpoints…
      o Smaller: < t
      o Larger: ≥ t
    • Results
      o [1,4]
      o [5,6]
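    The arithmetic of the split can be written out as a small Python sketch. It assumes integer time values and closed intervals, as in the [1,6] example; the function names `split_guard` and `partition_times` are hypothetical and not taken from the original implementation.

```python
def split_guard(t_min, t_max, t):
    """Split the closed integer guard [t_min, t_max] at timestamp t.

    Returns the guard for the "smaller" side (< t) and the "larger" side (>= t).
    """
    if not t_min < t <= t_max:
        raise ValueError("t must satisfy t_min < t <= t_max")
    return (t_min, t - 1), (t, t_max)


def partition_times(times, t):
    """Assign observed time values to the smaller (< t) or larger (>= t) side."""
    smaller = [x for x in times if x < t]
    larger = [x for x in times if x >= t]
    return smaller, larger


smaller_guard, larger_guard = split_guard(1, 6, 5)
smaller_times, larger_times = partition_times([1, 2, 6], 5)
```

    With the guard [1,6] and t = 5 this reproduces the two result guards shown on the slide, [1,4] and [5,6].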
  • 53.
    SPLIT
    • Problem
      o Node a is the result of an earlier merge operation
      o The subtree regenerated from node b should belong to the…
        - Larger subtree because of 6
        - Smaller subtree because of 1
    (Figure: transition with guard [1,6] and :Origin links to original runs with time values 1 and 6; nodes a and b are marked.)
  • 54.
    SPLIT
    • Solution
      o “Shortest” path to node a
      o Based on the last number: 6
  • 55.
    SPLIT
      MATCH (r:Red)-[n:Next]->(b:Blue)
      WITH r, b, n.Tmax AS Tmax, n.Tmin AS Tmin, n.symbol AS symbol
      UNWIND range(Tmin, Tmax-1) AS t
      MATCH
        (b)-[:Next*0..]->(s1:State)-[:Origin]->(so1:OrigState),
        trace1 = (so1)<-[:OrigNext*0..]-(redOrigNext:OrigState),
        (redOrigNext)<-[no1:OrigNext]-(redOrig:OrigState)<-[:Origin]-(r)
      WHERE none(x IN nodes(trace1) WHERE (x)<-[:Origin]-(r))
      WITH r, b, t, Tmin, Tmax, symbol, s1, no1, so1
      MATCH
        (b)-[:Next*0..]->(s2:State)-[:Origin]->(so2:OrigState),
        trace2 = (so2)<-[:OrigNext*0..]-(redOrigNext:OrigState),
        (redOrigNext)<-[no2:OrigNext]-(redOrig:OrigState)<-[:Origin]-(r)
      WHERE none(x IN nodes(trace2) WHERE (x)<-[:Origin]-(r))
      WITH t, r, b, s1, s2,
        toInteger(no1.time) > t AS cat1,
        toInteger(no2.time) > t AS cat2,
        Tmin, Tmax, symbol, so1, so2
      WHERE s1 = s2 AND cat1 <> cat2 AND id(so1) < id(so2)
      WITH r, b, t, Tmin, Tmax, symbol,
        [coalesce(so1.accepting, false), coalesce(so1.rejecting, false)] AS so1ar,
        [coalesce(so2.accepting, false), coalesce(so2.rejecting, false)] AS so2ar
      WITH r, b, t, Tmin, Tmax, symbol,
        CASE
          WHEN so1ar[0] <> so2ar[0] AND so1ar[1] <> so2ar[1] THEN 1
          WHEN so1ar = so2ar AND (so1ar = [false, true] OR so1ar = [true, false]) THEN -1 // there is at least one true
          ELSE 0
        END AS score
      WITH r, b, t, sum(score) AS metric, Tmin, Tmax, symbol
      ORDER BY metric DESC LIMIT 1
      WITH r, b, t, metric, Tmin, Tmax, symbol
      MATCH
        (b)-[:Next*0..]->(s1:State)-[:Origin]->(so1:OrigState),
        trace1 = (so1)<-[:OrigNext*0..]-(redOrigNext:OrigState),
        (redOrigNext)<-[no1:OrigNext]-(redOrig:OrigState)<-[:Origin]-(r)
      WHERE none(x IN nodes(trace1) WHERE (x)<-[:Origin]-(r))
        AND toInteger(no1.time) > t
      WITH r, b, t, metric, Tmin, Tmax, symbol, collect(so1) AS so1s
      MATCH
        (b)-[:Next*0..]->(s2:State)-[:Origin]->(so2:OrigState),
        trace2 = (so2)<-[:OrigNext*0..]-(redOrigNext:OrigState),
        (redOrigNext)<-[no2:OrigNext]-(redOrig:OrigState)<-[:Origin]-(r)
      WHERE none(x IN nodes(trace2) WHERE (x)<-[:Origin]-(r))
        AND toInteger(no2.time) <= t
      RETURN r, b, t, metric, Tmin, Tmax, symbol, so1s, collect(so2) AS so2s
  • 57.
    SPLIT
    • Solution
      o “Shortest” path to node a
      o Based on the last number: 6

    The split candidate in the original runs:
      MATCH
        (a)-[:Next*0..]->(s2:State)-[:Origin]->(b:OrigState),
        trace = (b)<-[:OrigNext*0..]-(an:OrigState),
        (an)<-[no:OrigNext]-(ao:OrigState)<-[:Origin]-(a)
  • 58.
    SPLIT
    • Solution
      o “Shortest” path to node a
      o Based on the last number: 6

      MATCH
        (a)-[:Next*0..]->(s2:State)-[:Origin]->(b:OrigState),
        trace = (b)<-[:OrigNext*0..]-(an:OrigState),
        (an)<-[no:OrigNext]-(ao:OrigState)<-[:Origin]-(a)
      WHERE none(x IN nodes(trace) WHERE (x)<-[:Origin]-(a))

    The WHERE clause selects the shortest possible path, i.e. a node from which node a can only be reached directly.
  • 62.
    ADVANTAGES OF NEO4J AND CYPHER
    • Visualisation
    • Flexible data model
      o New labels, properties
      o No need to define the schema upfront
    • Cypher: high-level declarative graph query language
  • 64.
    WORKFLOW
    • Neo4j web browser UI
        MATCH (n) RETURN n
  • 69.
    WORKFLOW
    • Neo4j web browser UI
    We should combine transformation and visualisation
  • 77.
    CHAINING QUERIES
    • Chaining queries to write a complex step of the algorithm with a single Cypher query

    Query 1:
      MATCH (node:Blue)
      REMOVE node:Blue
      SET node:Red
      RETURN *
    Result:
      node
      :Red {prop1: "val1"}
      :Red {prop1: "val2"}

    Query 2:
      MATCH (node:Blue)
      REMOVE node:Blue
      SET node:Red
      WITH 1 AS dummy
      RETURN *
    Result:
      dummy
      1
      1

    Query 3:
      MATCH (node:Blue)
      REMOVE node:Blue
      SET node:Red
      WITH 1 AS dummy
      MATCH (n)
      RETURN *
    Result:
      n                         dummy
      :Red {prop1: "val1"}      1
      :Red {prop1: "val2"}      1
      :Green {value: 13}        1
      :Red {prop1: "val1"}      1
      :Red {prop1: "val2"}      1
      :Green {value: 13}        1

    Cartesian product -> all rows are displayed twice
  • 80.
    CHAINING QUERIES
    • Chaining queries to write a complex step of the algorithm with a single Cypher query

    Query 1:
      MATCH (node:Blue)
      REMOVE node:Blue
      SET node:Red
      RETURN *
    Result:
      node
      :Red {prop1: "val1"}
      :Red {prop1: "val2"}

    Query 2:
      MATCH (node:Blue)
      REMOVE node:Blue
      SET node:Red
      WITH count(*) AS dummy
      RETURN *
    Result:
      dummy
      2

    Query 3:
      MATCH (node:Blue)
      REMOVE node:Blue
      SET node:Red
      WITH count(*) AS dummy
      MATCH (n)
      RETURN *
    Result:
      n                         dummy
      :Red {prop1: "val1"}      2
      :Red {prop1: "val2"}      2
      :Green {value: 13}        2

    Exactly 1 result -> the second query will not produce unnecessary duplicates
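    The difference between `WITH 1 AS dummy` and `WITH count(*) AS dummy` comes down to how many rows are carried into the second `MATCH`. The following Python sketch models only that row arithmetic; it does not talk to Neo4j, and the function `chain` and the string stand-ins for nodes are hypothetical.

```python
def chain(first_part_rows, nodes, aggregate):
    """Rows returned by `<first part> WITH ... AS dummy MATCH (n) RETURN *`.

    first_part_rows: rows produced by the first (updating) part of the query.
    nodes: all nodes matched by the trailing `MATCH (n)`.
    aggregate: True models `WITH count(*) AS dummy`, False models `WITH 1 AS dummy`.
    """
    if aggregate:
        # An aggregation without grouping keys collapses the input to one row.
        carried = [{"dummy": len(first_part_rows)}]
    else:
        # A plain projection keeps one row per input row.
        carried = [{"dummy": 1} for _ in first_part_rows]
    # The second MATCH is evaluated once per carried row (Cartesian product).
    return [{"n": n, **row} for row in carried for n in nodes]


updated = ["red1", "red2"]            # the two nodes relabelled from Blue to Red
all_nodes = ["red1", "red2", "green"]

duplicated = chain(updated, all_nodes, aggregate=False)
deduplicated = chain(updated, all_nodes, aggregate=True)
```

    In this model the projection variant yields 2 x 3 = 6 rows, while the aggregating variant yields 3 rows with `dummy = 2`, matching the two result tables on the slides.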
  • 82.
    RUNNING AND VISUALISING THE ALGORITHM
      ...
      WITH count(*) AS dummy
      MATCH (n)
      RETURN n
  • 101.
    EXECUTING QUERIES
    Passing information between queries:
    • Node labels
    • Nodes with temporary information
    • Passing nodes/edges as parameters (only supported in embedded mode)

    Result of the first query (exactly 1 result):
      redNode                   num
      :Red {prop1: "val1"}      1

    Second query, using the result as parameters:
      WITH $redNode AS redNode
      MATCH (g:Green {num: $num})
      CREATE (redNode)-[:Edge]->(g)
  • 104.
    LOOPS
    • Many loop-like problems can be solved using lists:
      o List comprehensions: [x IN xs WHERE condition | f(x)]
      o Reduce: reduce(acc = "", x IN list | acc + x.prop)
    • However, loops still might be necessary:
      o Run queries from Java code
      o The loop condition checks the number of rows in the result

      while (gds.execute(conditionQuery).hasNext()) {
        gds.execute(step1Query);
        gds.execute(step2Query);
      }
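    For readers less familiar with Cypher's list functions, the following Python sketch shows rough analogues of the two list constructs and of the query-driven loop. Everything here is illustrative: `run_until_empty`, `blue_rows` and `recolour_one` are hypothetical names, and an in-memory list of dictionaries stands in for the graph.

```python
from functools import reduce


def run_until_empty(condition, steps):
    """Run all steps in order for as long as condition() returns any rows."""
    iterations = 0
    while condition():  # like gds.execute(conditionQuery).hasNext()
        for step in steps:
            step()
        iterations += 1
    return iterations


nodes = [
    {"label": "Blue", "prop": "a"},
    {"label": "Blue", "prop": "b"},
    {"label": "Green", "prop": "c"},
]

# Analogue of [x IN xs WHERE condition | f(x)]
blue_props = [n["prop"] for n in nodes if n["label"] == "Blue"]

# Analogue of reduce(acc = "", x IN list | acc + x.prop)
joined = reduce(lambda acc, n: acc + n["prop"], nodes, "")


def blue_rows():
    """Condition "query": the remaining Blue nodes."""
    return [n for n in nodes if n["label"] == "Blue"]


def recolour_one():
    """Step "query": turn one Blue node Red."""
    blue_rows()[0]["label"] = "Red"


rounds = run_until_empty(blue_rows, [recolour_one])
```

    The loop ends when the condition returns no rows, which is the same stopping rule as `hasNext()` returning false in the Java snippet.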
  • 108.
    IMPORT-EXPORT
    • Save current state of the algorithm
    • APOC (Awesome Procedures on Cypher) GraphML import-export

      // save
      CALL apoc.export.graphml.all('my.graphml', {storeNodeIds: true, readLabels: true, useTypes: true})

      // load
      MATCH (n) // delete all
      DETACH DELETE n
      WITH count(*) AS dummy
      //-----------------
      CALL apoc.import.graphml('my.graphml', {storeNodeIds: true, readLabels: true, useTypes: true})
      YIELD nodes, relationships
      WITH count(*) AS dummy
      //-----------------
      MATCH (n)
      RETURN n // show all
  • 116.
    BACKTRACKING
    • In many cases, we need to restore a previous state and continue from that one
      o Next step is based on scoring metrics
    • Solutions:
      o Export -> Re-import
      o Use transactions:
        - Abort transaction
        - Only applicable to one level of backtracking
    (Figure: candidate next steps with metrics 5, 12 and 8.)
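    One level of backtracking can be sketched in Python as "try each candidate on a copy, score it, keep the original untouched". This is an assumption-laden illustration: `best_step`, `apply_step` and `score` are hypothetical names, `copy.deepcopy` stands in for the export/re-import or the aborted transaction, and picking the highest metric is assumed from the `ORDER BY metric DESC LIMIT 1` in the split query.

```python
import copy


def best_step(state, candidates, apply_step, score):
    """Return (metric, candidate) for the best-scoring candidate, or None.

    Each candidate is applied to a deep copy of the state, so the original
    state is never modified (one level of backtracking).
    """
    best = None
    for candidate in candidates:
        trial = copy.deepcopy(state)  # stands in for export / transaction begin
        apply_step(trial, candidate)
        metric = score(trial)
        if best is None or metric > best[0]:
            best = (metric, candidate)
        # Dropping `trial` here corresponds to re-import / transaction abort.
    return best


# Toy data: three candidate steps with the metrics shown on the slide.
metrics = {"m1": 5, "m2": 12, "m3": 8}
state = {"applied": []}

chosen = best_step(
    state,
    ["m1", "m2", "m3"],
    apply_step=lambda s, c: s["applied"].append(c),
    score=lambda s: metrics[s["applied"][-1]],
)
```

    Restoring deeper levels would need a stack of saved states, which matches the slide's remark that the transaction approach only covers one level.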
  • 119.
    DEBUGGING
    • Save and play back the states that occurred during execution
      o A counter for the states is stored in a node
      o Saved query to step back
  • 125.
    DEBUGGING
    • For complex errors
      o Formulate an error pattern in Cypher
      o Use this to define a breakpoint in the code
      o Load the state where the error occurred
      o Debug step by step

    Error pattern (Cypher):
      MATCH (node:Faulty) RETURN node

    Breakpoint (Java):
      if (gds.execute(query).hasNext()) // breakpoint
  • 127.
    PSEUDOCODE
    Sicco Verwer: Efficient identification of timed automata. PhD dissertation
  • 132.
    EXPRESSIVE POWER OF CYPHER
    • Most fixed-size graph queries can be decided in P-time
    • Checking whether two disjoint paths exist is NP-complete
    • Cypher is Turing-complete
      o reduce
    • Cypher 9 (current version)
    • Cypher 10 (multiple graph support)
      o This would simplify a lot of our queries.
    N. Francis et al., Cypher: An Evolving Query Language for Property Graphs, SIGMOD 2018
  • 133.
    SUMMARY
    • Neo4j prototype
      o Rapid development
      o Lots of APOC procedures for specific problems (e.g. mergeNodes)
    • Issues
      o APOC bug (reported)
      o Visualisation glitches
      o Lack of fixed-point algorithms
    • For graph algorithms, we recommend
      o creating a prototype in Neo4j,
      o rewriting it in a general-purpose programming language once the algorithm is understood.
  • 134.
    OUR TEAM
    Co-advisor: Rebeka Farkas
    Thanks to: Oszkár Semeráth, Tamás Tóth, András Vörös