Outrageous Ideas For Graph Databases
Data Day Texas - June 13, 2022
@maxdemarzi


maxdemarzi.com


GitHub.com/maxdemarzi


Max De Marzi
Ten Years
In the Graph game.
Let’s talk about
Money
1.8%
2018
Ideas are Wrong
• Too Many Back-ends (aka Tinkerpop is wrong)
• No lessons applied from Relational Databases
• API is incomplete (bulk)
• Query Languages are Incompetent
Implementations are Wrong
• Nodes as Objects sucks


• No internal algebras


• Incompetent Query Optimizers


• Incompetent Query Executors


• Incompetent Engineering


• A short clip of the talk
GitHub.com/ldbc/lsqb
https://homepages.cwi.nl/~boncz/edbt2022.pdf
Peter Suggests:
https://homepages.cwi.nl/~boncz/edbt2022.pdf
1. Row Storage for Properties of Nodes/Relationships


2. Less Indexing


3. Less Joins


4. Be more Relational then add Graph Functionality


5. Don’t rely on the query optimizer


6. Don’t allow generic recursive queries


7. Limit the query language
Completely Sensible Ideas For Graph Databases
Outrageous Ideas For Graph Databases
Why?
How many Paths are there from the top left node to the bottom right node?
2 Paths
6 Paths
14x14 = 11 minutes


15x15 = 10 Hours


20x20 = Nope
How many Paths are there?
20x20 = 10 Minutes
How many Paths are there?
137 Billion
How many Paths are there?
-[*]-
Death Star Queries
Blows up Alderaaning Servers
How many Paths are there?
20 x 20 in 0.41 Seconds
137 Billion
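The path counts above can be checked with a small dynamic-programming sketch. It assumes paths may only move right or down across an n x n grid of squares, which matches the 2-path and 6-path examples; the function name `count_lattice_paths` is made up for illustration and is not taken from the talk.

```python
from math import comb


def count_lattice_paths(n: int) -> int:
    """Count right/down paths across an n x n grid of squares."""
    # paths[c] = number of ways to reach column c in the current row of nodes.
    paths = [1] * (n + 1)  # top row: the only option is to keep moving right
    for _ in range(n):  # step down one row of nodes at a time
        for c in range(1, n + 1):
            # ways arriving from above (old paths[c]) plus ways arriving from the left
            paths[c] += paths[c - 1]
    return paths[n]


if __name__ == "__main__":
    for n in (1, 2, 20):
        # The closed form for this count is the central binomial coefficient C(2n, n).
        print(n, count_lattice_paths(n), comb(2 * n, n))
```

Counting per node instead of enumerating every path is what keeps the 20 x 20 case cheap: the work grows with the number of nodes, not with the 137 billion paths.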
http://relational.ai
Column Storage
Idea One
Graph Normal Form


Narrow Tables
Key-Value or Key-Key
Traditional 3rd Normal Form
Graph Normal Form
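One way to picture narrow tables is to split a wide row into one key-value relation per column. This is only a sketch under assumptions: the sample `wide_rows` data and the helper `to_narrow_tables` are hypothetical and not from the talk.

```python
# Hypothetical wide rows, as a traditional table might store them (None = missing).
wide_rows = [
    {"id": 1, "name": "bolt", "price": 5, "color": None},
    {"id": 2, "name": "nut", "price": None, "color": "silver"},
]


def to_narrow_tables(rows, key="id"):
    """Split wide rows into one narrow (key, value) relation per column."""
    tables = {}
    for row in rows:
        for column, value in row.items():
            if column == key or value is None:
                continue  # a missing value is simply an absent tuple, not a NULL
            tables.setdefault(column, set()).add((row[key], value))
    return tables


narrow = to_narrow_tables(wide_rows)
for column, tuples in sorted(narrow.items()):
    print(column, sorted(tuples))
```

Each narrow relation holds only the facts that exist, so no placeholder value is needed for a part without a price or a color.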
More Indexes
Idea Two
One Index Per Column
Composite Index Explosion
Dual Indexed Narrow Tables = Dynamic Composite Indexes
More Joins
Idea Three
Problem with Joins
Table 1 IDs: 0, 1, 3, 4, 5, 6, 7, 8, 9, 11
Table 2 IDs: 0, 2, 6, 7, 8, 9
Table 3 IDs: 2, 4, 5, 8, 10
Results
Table 1  Table 2  Table 3
8        8        8
Intermediate Results
Table 1 and Table 2: 0, 6, 7, 8, 9
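The join problem can be reproduced with a sketch of pairwise (binary) joins over the three ID lists from the slide. The helper `binary_join` is a made-up name for illustration, not any particular database's executor.

```python
table1 = [0, 1, 3, 4, 5, 6, 7, 8, 9, 11]
table2 = [0, 2, 6, 7, 8, 9]
table3 = [2, 4, 5, 8, 10]


def binary_join(left, right):
    """Join two ID lists on equality, keeping the order of the left input."""
    right_ids = set(right)
    return [x for x in left if x in right_ids]


# A pairwise plan materializes the Table 1 / Table 2 result first ...
intermediate = binary_join(table1, table2)
# ... and only then discovers that almost none of it survives Table 3.
result = binary_join(intermediate, table3)

print(intermediate)
print(result)
```

Five intermediate rows are built to produce one result row; on real data the intermediate result can be far larger than both the inputs and the output.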
Worst Case Optimal Joins
● Worst-Case Optimal Join Algorithms: Techniques, Results, and
Open Problems. Ngo. (Gems of PODS 2018)
● Worst-Case Optimal Join Algorithms: Techniques, Results, and
Open Problems. Ngo, Porat, Re, Rudra. (Journal of the ACM
2018)
● What do Shannon-type inequalities, submodular width, and
disjunctive datalog have to do with one another? Abo Khamis,
Ngo, Suciu, (PODS 2017 - Invited to Journal of ACM)
● Computing Join Queries with Functional Dependencies. Abo
Khamis, Ngo, Suciu. (PODS 2017)
● Joins via Geometric Resolutions: Worst-case and Beyond. Abo
Khamis, Ngo, Re, Rudra. (PODS 2015, Invited to TODS 2015)
● Beyond Worst-Case Analysis for Joins with Minesweeper. Abo
Khamis, Ngo, Re, Rudra. (PODS 2014)
● Leapfrog Triejoin: A Simple Worst-Case Optimal Join Algorithm.
Veldhuizen (ICDT 2014 - Best Newcomer)
● Skew Strikes Back: New Developments in the Theory of Join
Algorithms. Ngo, Re, Rudra. (Invited to SIGMOD Record 2013)
● Worst Case Optimal Join Algorithms. Ngo, Porat, Re,
Rudra. (PODS 2012 – Best Paper)
LeapFrog Join
Table 1 IDs: 0, 1, 3, 4, 5, 6, 7, 8, 9, 11
Table 2 IDs: 0, 2, 6, 7, 8, 9
Table 3 IDs: 2, 4, 5, 8, 10
Table 1  Table 2  Table 3  Action
0        0        2        Table 1: Seek 2
3        0        2        Table 2: Seek 3
3        6        2        Table 3: Seek 6
3        6        8        Table 1: Seek 8
8        6        8        Table 2: Seek 8
8        8        8        Emit, Table 3: Next
8        8        10       Table 1: Seek 10
11       8        10       Table 2: Seek 11, END
Results
Table 1 Table 2 Table 3
8 8 8
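The seek trace above can be sketched in a few lines. This is a simplified, single-attribute illustration that assumes each table is a sorted, duplicate-free list; the function name `leapfrog_join` is made up, and the order in which cursors advance after a match may differ from the trace in the slide.

```python
from bisect import bisect_left


def leapfrog_join(tables):
    """Intersect sorted, duplicate-free ID lists by leapfrogging over gaps."""
    if not tables or any(not t for t in tables):
        return []
    pos = [0] * len(tables)  # one cursor per table
    results = []
    # The largest key under any cursor is the value every table must reach.
    target = max(t[0] for t in tables)
    i = 0  # index of the table whose turn it is
    matched = 0  # number of tables known to sit on `target`
    while True:
        table = tables[i]
        # Seek: jump to the first key >= target (binary search skips the gap).
        pos[i] = bisect_left(table, target, pos[i])
        if pos[i] == len(table):
            return results  # one table is exhausted: no more matches possible
        if table[pos[i]] == target:
            matched += 1
            if matched >= len(tables):  # every table agrees: emit, then move on
                results.append(target)
                pos[i] += 1  # "Next" on the current table
                if pos[i] == len(table):
                    return results
                target = table[pos[i]]
                matched = 1
        else:
            target = table[pos[i]]  # overshot: this key becomes the new target
            matched = 1
        i = (i + 1) % len(tables)


if __name__ == "__main__":
    t1 = [0, 1, 3, 4, 5, 6, 7, 8, 9, 11]
    t2 = [0, 2, 6, 7, 8, 9]
    t3 = [2, 4, 5, 8, 10]
    print(leapfrog_join([t1, t2, t3]))
```

No intermediate result is materialized: each seek either confirms the current candidate or replaces it with a larger one.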
[Figure: cursor movement from Start to End: Seek 2, Seek 3, Seek 6, Seek 8, Seek 8, Next, Seek 10, Seek 11]
More than 3 Tables
[Figure: leapfrog seeks over sorted keys (a to p) across Brand, Category, Retailer and Rating: 1) seek c, 2) seek c, 3) seek f, 4) seek g, 5) seek m, 6) seek m, 7) seek m]
Worst-Case Optimal Joins take advantage of sorted keys and gaps in the data to
eliminate intermediate results, speed up queries and get rid of the Join problem.
How do you model Flight Data in Legacy GraphDBs?
Don’t we care about Flights only on particular Days?
Group Destinations together!
OMG WAT!
Reduce the Search Space
[Figure: leapfrog seeks over sorted keys (a to p) across Airport, Day, Flight and Destination: 1) seek c, 2) seek c, 3) seek f, 4) seek g, 5) seek m, 6) seek m, 7) seek m]
What if you wanted to earn miles on your frequent flyer program and filter by Airline? No
problem here, the more joins the merrier.
Real Relational
Idea Four
Vision Reality
Relational Databases
Drop The “Null”
What’s wrong with NULL?
SELECT *
FROM parts
WHERE (price <= 99) OR (price > 99)

SELECT *
FROM parts
WHERE (price <= 99) OR (price > 99) OR price IS NULL

SELECT AVG(height)
FROM parts

SELECT orders.id, parts.id
FROM orders LEFT OUTER JOIN
parts ON parts.id = orders.part_id

SELECT orders.id, parts.id
FROM parts LEFT OUTER JOIN
orders ON parts.id = orders.part_id

●(a or NOT(a)) != True
●Aggregation requires special cases
●Outer Joins are not commutative: a x b != b x a
Query Optimizers hate Nulls. The 3-valued logic causes major headaches.
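The NULL behaviour can be reproduced with Python's built-in `sqlite3` module. This is a minimal sketch: the `parts` rows are invented sample data, and only the query shapes come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, price INTEGER, height REAL)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [(1, 50, 2.0), (2, 150, 4.0), (3, None, None)],  # part 3 has NULL price and height
)

total = conn.execute("SELECT COUNT(*) FROM parts").fetchone()[0]

# Looks like a tautology, yet the NULL-priced row evaluates to UNKNOWN and is dropped.
tautology = conn.execute(
    "SELECT COUNT(*) FROM parts WHERE (price <= 99) OR (price > 99)"
).fetchone()[0]

# The NULL case has to be spelled out to get every row back.
with_null_check = conn.execute(
    "SELECT COUNT(*) FROM parts WHERE (price <= 99) OR (price > 99) OR price IS NULL"
).fetchone()[0]

# AVG silently skips NULLs: it averages two heights, not three.
avg_height = conn.execute("SELECT AVG(height) FROM parts").fetchone()[0]

print(total, tautology, with_null_check, avg_height)
```

An optimizer therefore cannot simplify `(price <= 99) OR (price > 99)` to `TRUE`, a rewrite that would be safe in two-valued logic.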
Lose the “Bags”
Sets vs Bags
Sets: {1,2,3}, {8,3,4}
Bags: {1,2,2,3}, {3, 3, 3, 3}
Sets have Unique Values
Bags allow Duplicate Values
●Queries that use only ANDs (no ORs)
are called “conjunctive queries”
●Conjunctive Queries under Set
Semantics are Much Easier to Optimize
Query Optimizers hate Bags. Duplicates cause
major headaches.
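A small sketch of why duplicates matter, reusing the two bags from the slide. `Counter` stands in for a bag here; the join-on-equality semantics is an assumption made for illustration.

```python
from collections import Counter

bag_r = Counter([1, 2, 2, 3])
bag_s = Counter([3, 3, 3, 3])

# Set semantics: a value is either in the join result or it is not.
set_join = set(bag_r) & set(bag_s)

# Bag semantics: every matching pair of duplicates yields its own output row,
# so multiplicities multiply.
bag_join = Counter({k: bag_r[k] * bag_s[k] for k in bag_r if k in bag_s})

print(set_join)
print(bag_join)
```

Under bag semantics two queries are equivalent only if they also agree on every multiplicity, which rules out many rewrites that are valid for sets.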
Smarter Optimizer
Idea Five
Traditional Query Optimizers
• Predicate pushdown (push selection through join)


• Projection pushdown (push projection through join)


• Aggregation pushdown


• Their “pull ups” counterparts


• Split conjunctive predicates (split AND statements)


• Replace cartesian products (use inner joins with predicates)


• (Un)Nesting Sub-Queries


• Etc.
[Diagram labels: Query, Math, Semantic Optimizer, Equivalent Optimized Query, Data, Answer]
Semantic Query Optimizer
Math
You learned this in middle school
• 1 + (2 + 3) = (1 + 2) + 3


• 3 + 4 = 4 + 3


• 3 + 0 = 3


• 1 + (-1) = 0
• 2 x (3 x 4) = (2 x 3) x 4


• 2 x 5 = 5 x 2


• 2 x 1 = 2


• 2 x 0.5 = 1
• 2 x (3 + 4) = (2 x 3) + (2 x 4)


• (3 + 4) x 2 = (3 x 2) + (4 x 2)
Math
You learned this in high school
• a + (b + c) = (a + b) + c


• a + b = b + a


• a + 0 = a


• a + (-a) = 0
• a x (b x c) = (a x b) x c


• a x b = b x a


• a x 1 = a


• a x a^-1 = 1, a != 0
• a x (b + c) = (a x b) + (a x c)


• (a + b) x c = (a x c) + (b x c)
Math
You forgot this in high school
• Addition:


• Associativity:


• a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c


• Commutativity:


• a ⊕ b = b ⊕ a


• Identity: a ⊕ ō = a


• Inverse: a ⊕ (-a) = ō
• Multiplication


• Associativity:


• a ⊗ (b ⊗ c) = (a ⊗ b) ⊗ c


• Commutativity:


• a ⊗ b = b ⊗ a


• Identity: a ⊗ ī = a


• Inverse: a ⊗ a^-1 = ī
• Distribution of Multiplication over Addition:


• a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c)


• (a ⊕ b) ⊗ c = (a ⊗ c) ⊕ (b ⊗ c)
Example 1
Query: count the number of combined rows a, b, c in tables R, S and T

def result = count[a,b,c: R(a) and S(b) and T(c)]

Optimized Query:

def result = count[R] * count[S] * count[T]
n^3 is much slower than 3n
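The rewrite can be sanity-checked in a few lines. The contents of `R`, `S` and `T` are invented sample data; the point is only that enumerating every combination and multiplying the three sizes give the same count, because multiplication distributes over addition.

```python
from itertools import product

# Hypothetical unary relations.
R = {1, 2, 3}
S = {"x", "y"}
T = {10.0, 20.0, 30.0, 40.0}

# Naive plan: enumerate every (a, b, c) combination and count them.
naive = sum(1 for _ in product(R, S, T))

# Optimized plan: count each relation once and multiply.
optimized = len(R) * len(S) * len(T)

print(naive, optimized)
```

The naive plan touches |R| x |S| x |T| combinations, while the optimized plan scans each relation once.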
Example 2
Query: find the minimum sum of rows a, b, c in tables R, S and T:

def result = min[a,b,c,v: v = R[a] + S[b] + T[c]]

Optimized Query:

def result = min[R] + min[S] + min[T]
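The same check works for the minimum. The key-to-value relations below are invented sample data; the rewrite is valid because addition distributes over min.

```python
from itertools import product

# Hypothetical key -> value relations.
R = {"a1": 4, "a2": 9}
S = {"b1": 7, "b2": 1, "b3": 5}
T = {"c1": 3, "c2": 8}

# Naive plan: form every combined row, then take the minimum of the sums.
naive = min(R[a] + S[b] + T[c] for a, b, c in product(R, S, T))

# Optimized plan: take each minimum independently and add them.
optimized = min(R.values()) + min(S.values()) + min(T.values())

print(naive, optimized)
```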
Shortest Path from A to F
[Figure: graph with weighted edges A-B 1, B-C 2, C-D 3, B-D 6, D-F 5, A-E 9, E-F 4]
AEF = 9 + 4 = 13
ABDF = 1 + 6 + 5 = 12
ABCDF = 1 + 2 + 3 + 5 = 11
min{13,12,11} = 11
Maximum Reliability from A to F
[Figure: graph with edge reliabilities A-B 0.9, B-C 0.9, C-D 1.0, B-D 0.2, D-F 0.7, A-E 0.4, E-F 0.8]
AEF = 0.4 x 0.8 = 0.32
ABDF = 0.9 x 0.2 x 0.7 = 0.126
ABCDF = 0.9 x 0.9 x 1.0 x 0.7 = 0.567
max{0.32,0.126,0.567} = 0.567
Words from A to F
[Figure: graph with edge letters A-B T, B-C I, C-D M, B-D H, D-F E, A-E A, E-F T]
AEF = A · T = AT
ABDF = T · H · E = THE
ABCDF = T · I · M · E = TIME
union{at, the, time} = at the time
Math
You skipped this in college
• min { (9 + 4), (1 + 6 + 5), ( 1 + 2 + 3 + 5 ) }


• max { (0.4 x 0.8), (0.9 x 0.2 x 0.7), (0.9 x 0.9 x 1.0 x 0.7) }


• union { (A · T), (T · H · E), (T · I · M · E) }
Math
You skipped this in college
• ⊕ { (9 ⊗ 4), (1 ⊗ 6 ⊗ 5), ( 1 ⊗ 2 ⊗ 3 ⊗ 5 ) }


• ⊕ { (0.4 ⊗ 0.8), (0.9 ⊗ 0.2 ⊗ 0.7), (0.9 ⊗ 0.9 ⊗ 1.0 ⊗ 0.7) }


• ⊕ { (A ⊗ T), (T ⊗ H ⊗ E), (T ⊗ I ⊗ M ⊗ E) }
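The three computations share one shape, which a single generic function can express. This is a sketch: the helper names `aggregate_paths` and `concat` are made up, and edge letters are modelled as one-element sets of strings so that ⊗ can be concatenation and ⊕ can be union.

```python
from functools import reduce


def aggregate_paths(paths, plus, times):
    """Fold each path's edge labels with `times`, then fold the per-path values with `plus`."""
    return reduce(plus, (reduce(times, path) for path in paths))


def concat(a, b):
    """'Multiply' two sets of strings by concatenating every pair."""
    return {x + y for x in a for y in b}


# Shortest path: plus = min, times = +
shortest = aggregate_paths([[9, 4], [1, 6, 5], [1, 2, 3, 5]], min, lambda a, b: a + b)

# Maximum reliability: plus = max, times = x
reliability = aggregate_paths(
    [[0.4, 0.8], [0.9, 0.2, 0.7], [0.9, 0.9, 1.0, 0.7]], max, lambda a, b: a * b
)

# Words: plus = union, times = concatenation
words = aggregate_paths(
    [[{"A"}, {"T"}], [{"T"}, {"H"}, {"E"}], [{"T"}, {"I"}, {"M"}, {"E"}]],
    lambda a, b: a | b,
    concat,
)

print(shortest, reliability, sorted(words))
```

Only the pair of operators changes between the three problems, so an optimizer that reasons about ⊕ and ⊗ abstractly can apply the same rewrites to all of them.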
Example 3
Query: count the number of 3-hop paths per node in a graph
[Figure: path A → B → C → D]

def path3(a, b, c, d) = edge(a,b) and edge(b,c) and edge(c,d)
def result[a] = count[path3[a]]

Optimized Query:

def path1[c] = count[edge[c]]
def path2[b] = sum[path1[c] for c in edge[b]]
def result[a] = sum[path2[b] for b in edge[a]]
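A Python sketch of both plans on a small invented graph. The function names and the edge set are assumptions for illustration; edges are treated as a set of directed pairs.

```python
from collections import Counter, defaultdict

# Hypothetical directed graph.
edges = {(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 1)}


def naive_path3_counts(edges):
    """Enumerate every (a, b, c, d) path explicitly and count per start node."""
    counts = Counter()
    for a, b in edges:
        for b2, c in edges:
            if b2 != b:
                continue
            for c2, d in edges:
                if c2 == c:
                    counts[a] += 1
    return dict(counts)


def factored_path3_counts(edges):
    """Push the count inward: 1-hop counts feed 2-hop counts feed 3-hop counts."""
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
    path1 = {c: len(dsts) for c, dsts in out.items()}
    path2 = {b: sum(path1.get(c, 0) for c in dsts) for b, dsts in out.items()}
    result = {a: sum(path2.get(b, 0) for b in dsts) for a, dsts in out.items()}
    return {a: n for a, n in result.items() if n > 0}


print(naive_path3_counts(edges))
print(factored_path3_counts(edges))
```

The factored version never materializes a path; it makes three passes over the edges regardless of how many 3-hop paths exist.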
Semantic Query Optimizer
It knows math!
• Compute Discrete Fourier Transform in Fast Fourier Transform-time


• Junction Tree Algorithm for inference in Probabilistic Graphical Models


• Message passing, belief propagation


• Viterbi Algorithm, forward/backward for Hidden Markov Models most probable
paths


• Counting sub-graph patterns (motifs)


• Yannakakis Algorithm for acyclic conjunctive queries in Polynomial Time


• Fractional hypertree-width time algorithm for Constraint Satisfaction Problems


• Best known results for Conjunctive Queries and Quantified Conjunctive Queries
Semantic Query Optimizer
It knows math!
• This optimizer produces much better code than the average developer
because it knows a ton more math than the average developer.
• Maryam Mirzakhani


• Terence Tao


• Ramanujan


• Katherine Goble


• Good Will Hunting
Add Recursion
Idea Six
def reachable = edge; reachable.edge
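The recursive definition reads as "reachable is edge, or reachable followed by one more edge". A sketch of how such a rule could be evaluated to a fixpoint in Python follows; the function name `transitive_closure` and the sample edges are assumptions.

```python
def transitive_closure(edges):
    """Repeat 'reachable = edge; reachable.edge' until no new pairs appear."""
    reachable = set(edges)  # base case: every edge is a reachable pair
    while True:
        # Extend each known pair (a, b) by one edge (b, c).
        new_pairs = {(a, c) for (a, b) in reachable for (b2, c) in edges if b == b2}
        if new_pairs <= reachable:
            return reachable  # fixpoint reached: nothing new was derived
        reachable |= new_pairs


if __name__ == "__main__":
    print(sorted(transitive_closure({(1, 2), (2, 3), (3, 4)})))
```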
Recursion
How many Paths are there from the top left node to the bottom right node?
2 Paths
6 Paths
def number_of_paths_of_length(node_number, path_length, path_count) =
    node_number=1, path_length=0, path_count=1

def number_of_paths_of_length[node_number, path_length] =
    sum[other_node, paths_of_length : paths_of_length =
        number_of_paths_of_length[other_node, path_length - 1]
        and edge(other_node, node_number)]

def output = number_of_paths_of_length[number_of_nodes, 2 * lattice_size]
@function @transient
def :_intermediate#0(other_node#1, path_length#0, _t#0) =
reduce[(_x#0, _y#0, _z#0) : :rel_primitive_add(_x#0, _y#0, _z#0),
(x#8, paths_of_length#1) :
:number_of_paths_of_length(other_node#1, x#8, paths_of_length#1) and
:rel_primitive_add(1, x#8, path_length#0),
(_no_init#0) : false](_t#0)


@function @transient
def :_intermediate#1(node_number#0, path_length#0, path_count#0) =
reduce[(_x#1, _y#1, _z#1) : :rel_primitive_add(_x#1, _y#1, _z#1),
(other_node#1, _t#0) :
:edge(other_node#1, node_number#0) and
:_intermediate#0(other_node#1, path_length#0, _t#0),
(_no_init#1) : false](path_count#0)


def :number_of_paths_of_length(node_number#0, path_length#0, path_count#0) =
:_base_case#0(node_number#0, path_length#0, path_count#0) or
:_intermediate#1(node_number#0, path_length#0, path_count#0)
Naive recursion, iteration 1
Evaluating `_intermediate#0`:
(1, 1) => (1,)
Evaluating `_intermediate#1`:
(2, 1) => (1,)
(4, 1) => (1,)
Evaluating `number_of_paths_of_length`:
(1, 0, 1)
(2, 1, 1)
(4, 1, 1)
Naive recursion, iteration 2
Evaluating `_intermediate#0`:
(1, 1) => (1,)
(2, 2) => (1,)
(4, 2) => (1,)
Evaluating `_intermediate#1`:
(2, 1) => (1,)
(3, 2) => (1,)
(4, 1) => (1,)
(5, 2) => (2,)
(7, 2) => (1,)
Evaluating `number_of_paths_of_length`:
(1, 0, 1)
(2, 1, 1)
(3, 2, 1)
(4, 1, 1)
(5, 2, 2)
(7, 2, 1)
Naive recursion, iteration 3
Evaluating `_intermediate#0`:
(1, 1) => (1,)
(2, 2) => (1,)
(3, 3) => (1,)
(4, 2) => (1,)
(5, 3) => (2,)
(7, 3) => (1,)
Evaluating `_intermediate#1`:
(2, 1) => (1,)
(3, 2) => (1,)
(4, 1) => (1,)
(5, 2) => (2,)
(6, 3) => (3,)
(7, 2) => (1,)
(8, 3) => (3,)
Evaluating `number_of_paths_of_length`:
(1, 0, 1)
(2, 1, 1)
(3, 2, 1)
(4, 1, 1)
(5, 2, 2)
(6, 3, 3)
(7, 2, 1)
(8, 3, 3)
Naive recursion, iteration 4
Evaluating `_intermediate#0`:
(1, 1) => (1,)
(2, 2) => (1,)
(3, 3) => (1,)
(4, 2) => (1,)
(5, 3) => (2,)
(6, 4) => (3,)
(7, 3) => (1,)
(8, 4) => (3,)
Evaluating `_intermediate#1`:
(2, 1) => (1,)
(3, 2) => (1,)
(4, 1) => (1,)
(5, 2) => (2,)
(6, 3) => (3,)
(7, 2) => (1,)
(8, 3) => (3,)
(9, 4) => (6,)
Evaluating `number_of_paths_of_length`:
(1, 0, 1)
(2, 1, 1)
(3, 2, 1)
(4, 1, 1)
(5, 2, 2)
(6, 3, 3)
(7, 2, 1)
(8, 3, 3)
(9, 4, 6)
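The iterations above can be mimicked with a naive fixpoint loop in Python. This is a sketch, not the engine's actual evaluation strategy: the helper names are made up, and nodes are assumed to be numbered from 1, row by row, with edges pointing right and down, which matches the tuples shown in the trace.

```python
def lattice_edges(size):
    """Right and down edges of a (size+1) x (size+1) node grid, numbered from 1 row by row."""
    width = size + 1
    edges = set()
    for row in range(width):
        for col in range(width):
            node = row * width + col + 1
            if col < size:
                edges.add((node, node + 1))  # step right
            if row < size:
                edges.add((node, node + width))  # step down
    return edges


def naive_fixpoint(size):
    """Re-derive every (node, path_length) -> path_count fact until nothing changes."""
    successors = {}
    for src, dst in lattice_edges(size):
        successors.setdefault(src, []).append(dst)
    facts = {(1, 0): 1}  # base case: one path of length 0 at node 1
    while True:
        new_facts = {(1, 0): 1}
        for (src, length), count in facts.items():
            for node in successors.get(src, ()):
                key = (node, length + 1)
                new_facts[key] = new_facts.get(key, 0) + count
        if new_facts == facts:
            return facts  # fixpoint reached
        facts = new_facts


if __name__ == "__main__":
    facts = naive_fixpoint(2)
    print(facts[(9, 4)])  # bottom right node of the 2 x 2 lattice
```

Each pass rebuilds all facts from scratch, which is what makes this evaluation "naive"; it still does work proportional to the number of nodes per pass rather than the number of paths.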
No Language Limits
Idea Seven
Graph Analytics
module graph_analytics[G]
    with G use node, edge

    def neighbor(x, y) = edge(x, y) or edge(y, x)
    def outdegree[x] = count[edge[x]]
    def degree[x] = count[neighbor[x]]
    def cn[x, y] = count[intersect[neighbor[x], neighbor[y]]] // Count of Common Neighbors

    def reachable = edge; reachable.edge
    def reachable_undirected = neighbor; reachable_undirected.neighbor

    def scc[x] = min[v: reachable(x, v) and reachable(v, x)] // Strongly Connected Component
    def wcc[x] = min[reachable_undirected[x]] // Weakly Connected Component

    def cosine_sim[x, y] = cn[x, y] / sqrt[degree[x] * degree[y]]
    def jaccard_sim[x, y] = cn[x, y] / (count[neighbor[x]] + count[neighbor[y]] - cn[x, y])
    …
end
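For readers unfamiliar with the notation, a few of these definitions translate to Python roughly as follows. This is a sketch: the edge set is invented, and the function signatures are assumptions rather than any library's API.

```python
from math import sqrt

# Hypothetical directed edge set.
edges = {(1, 2), (1, 3), (2, 3), (3, 4)}


def neighbors(edges, x):
    """Nodes adjacent to x in either direction."""
    return {b for a, b in edges if a == x} | {a for a, b in edges if b == x}


def degree(edges, x):
    return len(neighbors(edges, x))


def common_neighbors(edges, x, y):
    return len(neighbors(edges, x) & neighbors(edges, y))


def cosine_sim(edges, x, y):
    return common_neighbors(edges, x, y) / sqrt(degree(edges, x) * degree(edges, y))


def jaccard_sim(edges, x, y):
    cn = common_neighbors(edges, x, y)
    # Intersection size over union size; the denominator needs its parentheses.
    return cn / (degree(edges, x) + degree(edges, y) - cn)


print(cosine_sim(edges, 1, 2), jaccard_sim(edges, 1, 2))
```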
Betweenness Centrality
Graph Algorithms
One of many graph centrality measures which are useful for assessing the importance of a node.

High Level Definition: Number of times a node appears on shortest paths within a network

Why it’s Useful: Identify which nodes control information flow between different areas of the graph; also called “Bridge Nodes”

Business Use-Cases:

Communication Analysis: Identify important people who communicate across different groups

Retail Purchase Analysis: Which products introduce customers to new categories
Betweenness Centrality
Computation
Brandes Algorithm is applied as follows:

1. For each pair of nodes, compute all shortest paths and capture the nodes (excluding endpoints) on said path(s)

2. For each pair of nodes, assign each node along the path a value of one if there is only one shortest path, or the fractional contribution (1/n) if there are n shortest paths

3. Sum the values from step 2 for each node; this is the Betweenness Centrality
Betweenness Centrality Implementation
// Shortest path between s and t when they are the same is 0.
def shortest_path[s, t] = Min[
    v, w:
    (shortest_path(s, t, w) and v = 1) or
    (w = shortest_path[s,v] +1 and E(v, t))
]

// When s and t are the same, there is only one shortest path between
// them, namely the one with length 0.
def nb_shortest(s, t, n) = V(s) and V(t) and s = t and n = 1

// When s and t are *not* the same, it is the sum of the number of shortest
// paths between s and v for all the v's adjacent to t and on the shortest
// path between s and t.
def nb_shortest(s, t, n) =
    s != t and
    n = sum[v, m:
        shortest_path[s, v] + 1 = shortest_path[s, t] and E(v, t) and
        nb_shortest(s, v, m)
    ]

// sum over all t's such that there is an edge between v and t,
// and v is on the shortest path between s and t
def C[s, v] = sum[t, r:
    E(v, t) and shortest_path[s, t] = shortest_path[s, v] + 1 and
    (
        a = C[s, t] or
        not C(s, t, _) and a = 0.0
    ) and
    r = (nb_shortest[s, v] / nb_shortest[s, t]) * (1 + a)
] from a

// Note that below we divide by 2 because we are double counting every edge.
def betweenness_centrality_brandes[v] =
    sum[s, p : s != v and C[s, v] = p]/2
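For comparison, here is a conventional imperative sketch of Brandes' algorithm in Python. It assumes an unweighted, undirected graph given as a dictionary from node to neighbor set; the function name and the sample graph are assumptions, and this is not the implementation used in the talk.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph {node: set(neighbors)}."""
    centrality = {v: 0.0 for v in adjacency}
    for s in adjacency:
        # Breadth-first search from s: distances and shortest-path counts (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adjacency}
        sigma[s] = 1
        preds = {v: [] for v in adjacency}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes toward s.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                centrality[w] += delta[w]
    # Each unordered pair (s, t) was counted once from each end.
    return {v: c / 2 for v, c in centrality.items()}


if __name__ == "__main__":
    path_graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
    print(betweenness_centrality(path_graph))
```

The accumulation line mirrors the `r = (nb_shortest[s, v] / nb_shortest[s, t]) * (1 + a)` term in the definition above, and the final division by 2 corresponds to the double-counting note.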
Betweenness Centrality ReComputation
After incremental updates to data, recomputation of Betweenness Centrality takes only a few seconds, whereas the entire graph needs to be recomputed in other systems.
Algorithm Change ReComputation
Incremental updates to code are also recomputed incrementally, whereas the entire algorithm needs to be recomputed in other systems.
Code Dependency Graph
Incremental Maintenance
1. Dependency tracking to figure out which views are affected by a change.

2. Demand-driven execution to only compute what users are actively interested in.

3. Differential computation to incrementally maintain even general recursion.

4. Semantic optimization to recover better maintenance algorithms where possible.
http://relational.ai
Raised Money Too
http://relational.ai

More Related Content

What's hot

Neo4j Training Modeling
Neo4j Training ModelingNeo4j Training Modeling
Neo4j Training Modeling
Max De Marzi
 
Workshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data ScienceWorkshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data Science
Neo4j
 
XQuery
XQueryXQuery
XQuery
Raji Ghawi
 
Optimizing Cypher Queries in Neo4j
Optimizing Cypher Queries in Neo4jOptimizing Cypher Queries in Neo4j
Optimizing Cypher Queries in Neo4j
Neo4j
 
Optimizing Your Supply Chain with the Neo4j Graph
Optimizing Your Supply Chain with the Neo4j GraphOptimizing Your Supply Chain with the Neo4j Graph
Optimizing Your Supply Chain with the Neo4j Graph
Neo4j
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Databricks
 
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Databricks
 
Introduction to data analysis using python
Introduction to data analysis using pythonIntroduction to data analysis using python
Introduction to data analysis using python
Guido Luz Percú
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL Joins
Databricks
 
026 Neo4j Data Loading (ETL_ELT) Best Practices - NODES2022 AMERICAS Advanced...
026 Neo4j Data Loading (ETL_ELT) Best Practices - NODES2022 AMERICAS Advanced...026 Neo4j Data Loading (ETL_ELT) Best Practices - NODES2022 AMERICAS Advanced...
026 Neo4j Data Loading (ETL_ELT) Best Practices - NODES2022 AMERICAS Advanced...
Neo4j
 
Domain Driven Design Quickly
Domain Driven Design QuicklyDomain Driven Design Quickly
Domain Driven Design Quickly
Mariam Hakobyan
 
Real World Event Sourcing and CQRS
Real World Event Sourcing and CQRSReal World Event Sourcing and CQRS
Real World Event Sourcing and CQRS
Matthew Hawkins
 
Python dentro de SQL Server
Python dentro de SQL ServerPython dentro de SQL Server
Python dentro de SQL Server
Eduardo Castro
 
Top 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksTop 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & Tricks
Neo4j
 
Kata: Hexagonal Architecture / Ports and Adapters
Kata: Hexagonal Architecture / Ports and AdaptersKata: Hexagonal Architecture / Ports and Adapters
Kata: Hexagonal Architecture / Ports and Adapters
holsky
 
User Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryUser Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love Story
Databricks
 
Boost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined ProceduresBoost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined Procedures
Neo4j
 
Graph Data Modeling Best Practices(Eric_Monk).pptx
Graph Data Modeling Best Practices(Eric_Monk).pptxGraph Data Modeling Best Practices(Eric_Monk).pptx
Graph Data Modeling Best Practices(Eric_Monk).pptx
Neo4j
 
Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]
ercan5
 
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Edureka!
 

What's hot (20)

Neo4j Training Modeling
Neo4j Training ModelingNeo4j Training Modeling
Neo4j Training Modeling
 
Workshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data ScienceWorkshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data Science
 
XQuery
XQueryXQuery
XQuery
 
Optimizing Cypher Queries in Neo4j
Optimizing Cypher Queries in Neo4jOptimizing Cypher Queries in Neo4j
Optimizing Cypher Queries in Neo4j
 
Optimizing Your Supply Chain with the Neo4j Graph
Optimizing Your Supply Chain with the Neo4j GraphOptimizing Your Supply Chain with the Neo4j Graph
Optimizing Your Supply Chain with the Neo4j Graph
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
 
Introduction to data analysis using python
Introduction to data analysis using pythonIntroduction to data analysis using python
Introduction to data analysis using python
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL Joins
 
026 Neo4j Data Loading (ETL_ELT) Best Practices - NODES2022 AMERICAS Advanced...
026 Neo4j Data Loading (ETL_ELT) Best Practices - NODES2022 AMERICAS Advanced...026 Neo4j Data Loading (ETL_ELT) Best Practices - NODES2022 AMERICAS Advanced...
026 Neo4j Data Loading (ETL_ELT) Best Practices - NODES2022 AMERICAS Advanced...
 
Domain Driven Design Quickly
Domain Driven Design QuicklyDomain Driven Design Quickly
Domain Driven Design Quickly
 
Real World Event Sourcing and CQRS
Real World Event Sourcing and CQRSReal World Event Sourcing and CQRS
Real World Event Sourcing and CQRS
 
Python dentro de SQL Server
Python dentro de SQL ServerPython dentro de SQL Server
Python dentro de SQL Server
 
Top 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksTop 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & Tricks
 
Kata: Hexagonal Architecture / Ports and Adapters
Kata: Hexagonal Architecture / Ports and AdaptersKata: Hexagonal Architecture / Ports and Adapters
Kata: Hexagonal Architecture / Ports and Adapters
 
User Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryUser Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love Story
 
Boost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined ProceduresBoost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined Procedures
 
Graph Data Modeling Best Practices(Eric_Monk).pptx
Graph Data Modeling Best Practices(Eric_Monk).pptxGraph Data Modeling Best Practices(Eric_Monk).pptx
Graph Data Modeling Best Practices(Eric_Monk).pptx
 
Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]
 
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
 

Similar to Outrageous Ideas for Graph Databases

Developer Intro Deck-PowerPoint - Download for Speaker Notes
Developer Intro Deck-PowerPoint - Download for Speaker NotesDeveloper Intro Deck-PowerPoint - Download for Speaker Notes
Developer Intro Deck-PowerPoint - Download for Speaker Notes
Max De Marzi
 
Adobe
AdobeAdobe
Introduction to MATLAB
Introduction to MATLABIntroduction to MATLAB
Introduction to MATLAB
Damian T. Gordon
 
3rd Semester Computer Science and Engineering (ACU) Question papers
3rd Semester Computer Science and Engineering  (ACU) Question papers3rd Semester Computer Science and Engineering  (ACU) Question papers
3rd Semester Computer Science and Engineering (ACU) Question papers
BGS Institute of Technology, Adichunchanagiri University (ACU)
 
Unit-1 Basic Concept of Algorithm.pptx
Unit-1 Basic Concept of Algorithm.pptxUnit-1 Basic Concept of Algorithm.pptx
Unit-1 Basic Concept of Algorithm.pptx
ssuser01e301
 
Curvefitting
CurvefittingCurvefitting
Curvefitting
Philberto Saroni
 
Lecture 6 operators
Lecture 6   operatorsLecture 6   operators
Lecture 6 operators
eShikshak
 
Rdio's Alex Gaynor at Heroku's Waza 2013: Why Python, Ruby and Javascript are...
Rdio's Alex Gaynor at Heroku's Waza 2013: Why Python, Ruby and Javascript are...Rdio's Alex Gaynor at Heroku's Waza 2013: Why Python, Ruby and Javascript are...
Rdio's Alex Gaynor at Heroku's Waza 2013: Why Python, Ruby and Javascript are...
Heroku
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
Christian Robert
 
Map reduce and the art of Thinking Parallel - Dr. Shailesh Kumar
Map reduce and the art of Thinking Parallel   - Dr. Shailesh KumarMap reduce and the art of Thinking Parallel   - Dr. Shailesh Kumar
Map reduce and the art of Thinking Parallel - Dr. Shailesh Kumar
Hyderabad Scalability Meetup
 
RDataMining slides-regression-classification
RDataMining slides-regression-classificationRDataMining slides-regression-classification
RDataMining slides-regression-classification
Yanchang Zhao
 
Mid-Term ExamName___________________________________MU.docx
Mid-Term ExamName___________________________________MU.docxMid-Term ExamName___________________________________MU.docx
Mid-Term ExamName___________________________________MU.docx
annandleola
 
Day 2 review with sat
Day 2 review with satDay 2 review with sat
Day 2 review with satjbianco9910
 
Monads and Monoids by Oleksiy Dyagilev
Monads and Monoids by Oleksiy DyagilevMonads and Monoids by Oleksiy Dyagilev
Monads and Monoids by Oleksiy Dyagilev
JavaDayUA
 
CS 542 -- Query Optimization
CS 542 -- Query OptimizationCS 542 -- Query Optimization
CS 542 -- Query OptimizationJ Singh
 
Algorithm Design and Analysis
Algorithm Design and AnalysisAlgorithm Design and Analysis
Algorithm Design and Analysis
Sayed Chhattan Shah
 
VARIOUS FUZZY NUMBERS AND THEIR VARIOUS RANKING APPROACHES
VARIOUS FUZZY NUMBERS AND THEIR VARIOUS RANKING APPROACHESVARIOUS FUZZY NUMBERS AND THEIR VARIOUS RANKING APPROACHES
VARIOUS FUZZY NUMBERS AND THEIR VARIOUS RANKING APPROACHES
IAEME Publication
 
Qp cdsi18-math
Qp cdsi18-mathQp cdsi18-math
Qp cdsi18-math
PRASANTH RAKOTI
 
Lecture1a data types
Lecture1a data typesLecture1a data types
Lecture1a data types
mbadhi barnabas
 

Similar to Outrageous Ideas for Graph Databases (20)

Developer Intro Deck-PowerPoint - Download for Speaker Notes
Outrageous Ideas for Graph Databases

  • 1. Outrageous Ideas For Graph Databases. Data Day Texas - June 13, 2022
  • 4. Ten Years In the Graph game.
  • 13. 1.8%
  • 16. 2018
  • 17. Ideas are Wrong
      • Too Many Back-ends (aka Tinkerpop is wrong)
      • No lessons applied from Relational Databases
      • API is incomplete (bulk)
      • Query Languages are Incompetent
  • 18. Implementations are Wrong
      • Nodes as Objects sucks
      • No internal algebras
      • Incompetent Query Optimizers
      • Incompetent Query Executors
      • Incompetent Engineering
      • A short clip of the talk
  • 31. Peter Suggests: https://homepages.cwi.nl/~boncz/edbt2022.pdf
      1. Row Storage for Properties of Nodes/Relationships
      2. Less Indexing
      3. Less Joins
      4. Be more Relational, then add Graph Functionality
      5. Don’t rely on the query optimizer
      6. Don’t allow generic recursive queries
      7. Limit the query language
  • 33. Completely Sensible Ideas For Graph Databases. Data Day Texas - June 13, 2022
  • 34. Outrageous Ideas For Graph Databases. Data Day Texas - June 13, 2022
  • 35. Why?
  • 36. How many Paths are there from the top left node to the bottom right node? 2 Paths; 6 Paths
  • 37. How many Paths are there? 14x14 = 11 minutes; 15x15 = 10 Hours; 20x20 = Nope
  • 38. How many Paths are there? 20x20 = 10 Minutes; 137 Billion
  • 39. How many Paths are there? -[*]- Death Star Queries: Blows up Alderaaning Servers
  • 40. How many Paths are there? 20 x 20 in 0.41 Seconds; 137 Billion
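The "137 Billion" figure can be checked with a few lines of dynamic programming. This is a minimal sketch, assuming the grid is an n x n lattice of cells in which every step goes right or down; the function name `count_lattice_paths` is illustrative and not from the talk.

```python
from math import comb


def count_lattice_paths(n: int) -> int:
    """Count right/down paths from the top left to the bottom right of an n x n grid."""
    # paths[c] = number of paths reaching column c in the current row of nodes.
    paths = [1] * (n + 1)
    for _ in range(n):
        for c in range(1, n + 1):
            # A node is reached from the node above (old paths[c])
            # or from the node to its left (new paths[c - 1]).
            paths[c] += paths[c - 1]
    return paths[n]


# The closed form is the central binomial coefficient C(2n, n).
assert count_lattice_paths(20) == comb(40, 20)
```

The 1x1 and 2x2 grids give 2 and 6 paths, matching the slide, and the 20x20 grid gives 137,846,528,820. The point of the slide is that counting this way avoids enumerating the paths one by one.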
  • 44. Graph Normal Form Narrow Tables Key-Value or Key-Key
  • 48. One Index Per Column
  • 49. Composite Index Explosion Dual Indexed Narrow Tables = Dynamic Composite Indexes
  • 51. Problem with Joins
      Table 1 ID: 0, 1, 3, 4, 5, 6, 7, 8, 9, 11
      Table 2 ID: 0, 2, 6, 7, 8, 9
      Table 3 ID: 2, 4, 5, 8, 10
      Intermediate Results (Table 1 and Table 2): 0, 6, 7, 8, 9
      Results (Table 1, Table 2, Table 3): 8, 8, 8
  • 52. Worst Case Optimal Joins
      ● Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems. Ngo. (Gems of PODS 2018)
      ● Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems. Ngo, Porat, Ré, Rudra. (Journal of the ACM 2018)
      ● What do Shannon-type inequalities, submodular width, and disjunctive datalog have to do with one another? Abo Khamis, Ngo, Suciu. (PODS 2017 - Invited to Journal of ACM)
      ● Computing Join Queries with Functional Dependencies. Abo Khamis, Ngo, Suciu. (PODS 2017)
      ● Joins via Geometric Resolutions: Worst-case and Beyond. Abo Khamis, Ngo, Ré, Rudra. (PODS 2015, Invited to TODS 2015)
      ● Beyond Worst-Case Analysis for Joins with Minesweeper. Abo Khamis, Ngo, Ré, Rudra. (PODS 2014)
      ● Leapfrog Triejoin: A Simple Worst-Case Optimal Join Algorithm. Veldhuizen. (ICDT 2014 - Best Newcomer)
      ● Skew Strikes Back: New Developments in the Theory of Join Algorithms. Ngo, Ré, Rudra. (Invited to SIGMOD Record 2013)
      ● Worst Case Optimal Join Algorithms. Ngo, Porat, Ré, Rudra. (PODS 2012 - Best Paper)
  • 53. LeapFrog Join
      Table 1 ID: 0, 1, 3, 4, 5, 6, 7, 8, 9, 11
      Table 2 ID: 0, 2, 6, 7, 8, 9
      Table 3 ID: 2, 4, 5, 8, 10
      Table 1   Table 2   Table 3   Action
      0         0         2         Table 1: Seek 2
      3         0         2         Table 2: Seek 3
      3         6         2         Table 3: Seek 6
      3         6         8         Table 1: Seek 8
      8         6         8         Table 2: Seek 8
      8         8         8         Emit, Table 3: Next
      8         8         10        Table 1: Seek 10
      11        8         10        Table 2: Seek 11
      END
      Results (Table 1, Table 2, Table 3): 8, 8, 8
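The seek/next trace above can be sketched in Python. This is an illustrative single-attribute leapfrog intersection over sorted, duplicate-free ID lists, not the full Leapfrog Triejoin from the cited paper; the name `leapfrog_intersect` is an assumption, and the order in which tables are advanced after an emit may differ slightly from the slide.

```python
from bisect import bisect_left


def leapfrog_intersect(*tables):
    """Intersect sorted, duplicate-free ID lists by leapfrogging over the gaps."""
    if not tables or any(not t for t in tables):
        return []
    pos = [0] * len(tables)
    results = []
    # Every table has to catch up with the largest first key.
    target = max(t[0] for t in tables)
    matched = 0  # how many tables in a row currently sit on `target`
    i = 0
    while True:
        table = tables[i]
        # seek: binary-search forward to the first key >= target
        pos[i] = bisect_left(table, target, pos[i])
        if pos[i] == len(table):
            return results  # one table is exhausted, so no more matches exist
        key = table[pos[i]]
        if key == target:
            matched += 1
            if matched >= len(tables):
                results.append(key)  # emit: all tables agree on this key
                pos[i] += 1          # next: step past the emitted key
                if pos[i] == len(table):
                    return results
                target = table[pos[i]]
                matched = 1
        else:
            # This table jumped past the target, so its key becomes the new target.
            target = key
            matched = 1
        i = (i + 1) % len(tables)


table1 = [0, 1, 3, 4, 5, 6, 7, 8, 9, 11]
table2 = [0, 2, 6, 7, 8, 9]
table3 = [2, 4, 5, 8, 10]
print(leapfrog_intersect(table1, table2, table3))
```

On the three tables from the slide the only emitted ID is 8, and the pairwise intermediate result 0, 6, 7, 8, 9 is never materialized.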
  • 54. More than 3 Tables [figure: seek steps 1) to 7) across the sorted columns Brand, Category, Retailer, Rating] Worst-Case Optimal Joins take advantage of sorted keys and gaps in the data to eliminate intermediate results, speed up queries and get rid of the Join problem.
  • 55. How do you model Flight Data in Legacy GraphDBs?
  • 56. How do you model Flight Data? Don’t we care about Flights only on particular Days?
  • 57. How do you model Flight Data? Group Destinations together!
  • 58. How do you model Flight Data? OMG WAT!
  • 59. Reduce the Search Space [figure: seek steps 1) to 7) across the sorted columns Airport, Day, Flight, Destination] What if you wanted to earn miles on your frequent flyer program and filter by Airline? No problem here, the more joins the merrier.
  • 64. What’s wrong with NULL?
 ● (a OR NOT(a)) != TRUE
 SELECT * FROM parts WHERE (price <= 99) OR (price > 99)
 SELECT * FROM parts WHERE (price <= 99) OR (price > 99) OR price IS NULL
 ● Aggregation requires special cases
 SELECT AVG(height) FROM parts
 ● Outer Joins are not commutative: a x b != b x a
 SELECT orders.id, parts.id FROM orders LEFT OUTER JOIN parts ON parts.id = orders.part_id
 SELECT orders.id, parts.id FROM parts LEFT OUTER JOIN orders ON parts.id = orders.part_id
 Query Optimizers hate Nulls. Three-valued logic causes major headaches.
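The first two NULL problems can be reproduced with Python's built-in `sqlite3` module. This is a small sketch with made-up rows; the `parts` columns follow the slide, while the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, price REAL, height REAL)")
conn.executemany(
    "INSERT INTO parts (id, price, height) VALUES (?, ?, ?)",
    [(1, 50.0, 10.0), (2, 150.0, None), (3, None, 30.0)],  # hypothetical sample rows
)


def ids(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# "price <= 99 OR price > 99" looks like a tautology, but it evaluates to NULL
# (not TRUE) for the row whose price is NULL, so that row is filtered out.
tautology_ids = ids(
    "SELECT id FROM parts WHERE (price <= 99) OR (price > 99) ORDER BY id"
)
with_null_ids = ids(
    "SELECT id FROM parts WHERE (price <= 99) OR (price > 99) OR price IS NULL ORDER BY id"
)

# AVG silently skips NULL heights: it averages 10.0 and 30.0 only.
avg_height = conn.execute("SELECT AVG(height) FROM parts").fetchone()[0]

print(tautology_ids, with_null_ids, avg_height)
```

The first query returns two of the three rows, the second returns all three, and the average is taken over two values rather than three.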
  • 67. Sets vs Bags
      Sets: {1,2,3}, {8,3,4}. Sets have Unique Values
      Bags: {1,2,2,3}, {3,3,3,3}. Bags allow Duplicate Values
      ● Queries that use only ANDs (no ORs) are called “conjunctive queries”
      ● Conjunctive Queries under Set Semantics are Much Easier to Optimize
      Query Optimizers hate Bags. Duplicates cause major headaches.
  • 70. Traditional Query Optimizers
      • Predicate pushdown (push selection through join)
      • Projection pushdown (push projection through join)
      • Aggregation pushdown
      • Their “pull up” counterparts
      • Split conjunctive predicates (split AND statements)
      • Replace cartesian products (use inner joins with predicates)
      • (Un)Nesting Sub-Queries
      • Etc.
  • 73. Math You learned this in middle school
      • 1 + (2 + 3) = (1 + 2) + 3
      • 3 + 4 = 4 + 3
      • 3 + 0 = 3
      • 1 + (-1) = 0
      • 2 x (3 x 4) = (2 x 3) x 4
      • 2 x 5 = 5 x 2
      • 2 x 1 = 2
      • 2 x 0.5 = 1
      • 2 x (3 + 4) = (2 x 3) + (2 x 4)
      • (3 + 4) x 2 = (3 x 2) + (4 x 2)
  • 74. Math You learned this in high school
      • a + (b + c) = (a + b) + c
      • a + b = b + a
      • a + 0 = a
      • a + (-a) = 0
      • a x (b x c) = (a x b) x c
      • a x b = b x a
      • a x 1 = a
      • a x a⁻¹ = 1, a != 0
      • a x (b + c) = (a x b) + (a x c)
      • (a + b) x c = (a x c) + (b x c)
  • 75. Math You forgot this in high school
      • Addition:
          • Associativity: a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c
          • Commutativity: a ⊕ b = b ⊕ a
          • Identity: a ⊕ ō = a
          • Inverse: a ⊕ (-a) = ō
      • Multiplication:
          • Associativity: a ⊗ (b ⊗ c) = (a ⊗ b) ⊗ c
          • Commutativity: a ⊗ b = b ⊗ a
          • Identity: a ⊗ ī = a
          • Inverse: a ⊗ a⁻¹ = ī
      • Distribution of Multiplication over Addition:
          • a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c)
          • (a ⊕ b) ⊗ c = (a ⊗ c) ⊕ (b ⊗ c)
  • 76. Example 1 Query: find the count of the combined rows a, b, c in tables R, S and T
 def result = count[a,b,c: R(a) and S(b) and T(c)]
 Mathematical Representation:
  • 80. Example 1 Query: count the number of combined rows a, b, c in tables R, S and T
 def result = count[a,b,c: R(a) and S(b) and T(c)]
 Optimized Query:
 def result = count[R] * count[S] * count[T]
 Enumerating n^3 combinations is much slower than scanning three tables of n rows each (3n)
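The rewrite in Example 1 can be checked on toy data. A minimal sketch, assuming R, S and T are plain sets of keys; the sample values are made up.

```python
from itertools import product

# Hypothetical toy relations.
R = {1, 2, 3}
S = {"x", "y"}
T = {10, 20, 30, 40}

# Naive plan: enumerate every combined row (a, b, c), then count them.
naive_count = sum(1 for _ in product(R, S, T))

# Optimized plan: count each table once and multiply the counts.
optimized_count = len(R) * len(S) * len(T)

print(naive_count, optimized_count)
```

Both plans agree, but the first touches |R| x |S| x |T| combinations while the second touches |R| + |S| + |T| rows.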
  • 81. Example 2 Query: find the minimum sum of rows a, b, c in tables R, S and T:
 def result = min[a,b,c,v: v = R[a] + S[b] + T[c]]
 Mathematical Representation:
  • 83. Example 2 Query: find the minimum sum of rows a, b, c in tables R, S and T:
 def result = min[a,b,c,v: v = R[a] + S[b] + T[c]]
 Optimized Query:
 def result = min[R] + min[S] + min[T]
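Example 2 is the same factorization with (min, +) in place of (count, x). A small sketch, assuming each table maps a key to a numeric value; the keys and values are invented for illustration.

```python
from itertools import product

# Hypothetical toy relations: key -> value.
R = {"a1": 4, "a2": 7}
S = {"b1": 3, "b2": 1, "b3": 9}
T = {"c1": 5, "c2": 2}

# Naive plan: form every sum R[a] + S[b] + T[c], then take the minimum.
naive_min = min(r + s + t for r, s, t in product(R.values(), S.values(), T.values()))

# Optimized plan: addition distributes over min, so take each minimum first.
optimized_min = min(R.values()) + min(S.values()) + min(T.values())

print(naive_min, optimized_min)
```

The rewrite is valid because a + min(b, c) = min(a + b, a + c), which is the distributive law from the earlier "Math" slides with min playing the role of ⊕ and + playing the role of ⊗.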
  • 84. Shortest Path from A to F [figure: weighted graph on nodes A, B, C, D, E, F]
      AEF = 9 + 4 = 13
      ABDF = 1 + 6 + 5 = 12
      ABCDF = 1 + 2 + 3 + 5 = 11
      min{13, 12, 11} = 11
  • 85. Maximum Reliability from A to F [figure: same graph with edge reliabilities]
      AEF = 0.4 x 0.8 = 0.32
      ABDF = 0.9 x 0.2 x 0.7 = 0.126
      ABCDF = 0.9 x 0.9 x 1.0 x 0.7 = 0.567
      max{0.32, 0.126, 0.567} = 0.567
  • 86. Words from A to F [figure: same graph with a letter on each edge]
      AEF = A · T = AT
      ABDF = T · H · E = THE
      ABCDF = T · I · M · E = TIME
      union{at, the, time} = at the time
  • 87. Math You skipped this in college
      • min { (9 + 4), (1 + 6 + 5), (1 + 2 + 3 + 5) }
      • max { (0.4 x 0.8), (0.9 x 0.2 x 0.7), (0.9 x 0.9 x 1.0 x 0.7) }
      • union { (A · T), (T · H · E), (T · I · M · E) }
  • 88. Math You skipped this in college
      • ⊕ { (9 ⊗ 4), (1 ⊗ 6 ⊗ 5), (1 ⊗ 2 ⊗ 3 ⊗ 5) }
      • ⊕ { (0.4 ⊗ 0.8), (0.9 ⊗ 0.2 ⊗ 0.7), (0.9 ⊗ 0.9 ⊗ 1.0 ⊗ 0.7) }
      • ⊕ { (A ⊗ T), (T ⊗ H ⊗ E), (T ⊗ I ⊗ M ⊗ E) }
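The three path problems share one shape: combine edge labels along a path with ⊗, then combine the paths with ⊕. A minimal sketch of that idea, where the helper name `evaluate` is an assumption and the path data is copied from the slides:

```python
from functools import reduce


def evaluate(paths, plus, times):
    """Fold each path with `times` (the ⊗), then fold the path values with `plus` (the ⊕)."""
    return reduce(plus, (reduce(times, path) for path in paths))


# Shortest path: ⊕ = min, ⊗ = +
shortest = evaluate([(9, 4), (1, 6, 5), (1, 2, 3, 5)], min, lambda a, b: a + b)

# Maximum reliability: ⊕ = max, ⊗ = x
most_reliable = evaluate(
    [(0.4, 0.8), (0.9, 0.2, 0.7), (0.9, 0.9, 1.0, 0.7)], max, lambda a, b: a * b
)

# Words: ⊕ = set union, ⊗ = concatenation of every pair of strings
words = evaluate(
    [({"A"}, {"T"}), ({"T"}, {"H"}, {"E"}), ({"T"}, {"I"}, {"M"}, {"E"})],
    lambda a, b: a | b,
    lambda a, b: {x + y for x in a for y in b},
)

print(shortest, most_reliable, sorted(words))
```

Only the two operators change between the three calls, which is what lets an optimizer that understands the algebra reuse the same rewrites for all of them.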
  • 89. Example 3 Query: count the number of 3-hop paths per node in a graph
 def path3(a, b, c, d) = edge(a,b) and edge(b,c) and edge(c,d)
 def result[a] = count[path3[a]]
 Mathematical Representation: [figure: nodes A, B, C, D]
  • 90. Query: count the number of 3-hop paths per node in a graph [figure: nodes A, B, C, D]
  • 91. Example 3 Query: count the number of 3-hop paths per node in a graph
 def path3(a, b, c, d) = edge(a,b) and edge(b,c) and edge(c,d)
 def result[a] = count[path3[a]]
 Optimized Query:
 def path1[c] = count[edge[c]]
 def path2[b] = sum[path1[c] for c in edge[b]]
 def result[a] = sum[path2[b] for b in edge[a]]
 [figure: nodes A, B, C, D]
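The optimized query pushes the count inward one hop at a time. The sketch below mirrors `path1`, `path2` and `result` in Python on a made-up edge list; unlike the Rel version, nodes without outgoing edges get an explicit 0 here, which is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical directed edges.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "A")]

succ = defaultdict(list)
for u, v in edges:
    succ[u].append(v)
nodes = sorted({n for e in edges for n in e})


def naive(a):
    """Enumerate every 3-hop path a -> b -> c -> d and count them."""
    return sum(1 for b in succ[a] for c in succ[b] for d in succ[c])


# Optimized: count 1-hop paths per node, then sum them up hop by hop.
path1 = {c: len(succ[c]) for c in nodes}
path2 = {b: sum(path1[c] for c in succ[b]) for b in nodes}
result = {a: sum(path2[b] for b in succ[a]) for a in nodes}

print(result)
```

The naive version materializes every path, while the rewritten version does one pass over the edges per hop.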
  • 92. Semantic Query Optimizer: It knows math!
      • Compute Discrete Fourier Transform in Fast Fourier Transform-time
      • Junction Tree Algorithm for inference in Probabilistic Graphical Models
      • Message passing, belief propagation
      • Viterbi Algorithm, forward/backward for Hidden Markov Models most probable paths
      • Counting sub-graph patterns (motifs)
      • Yannakakis Algorithm for acyclic conjunctive queries in Polynomial Time
      • Fractional hypertree-width time algorithm for Constraint Satisfaction Problems
      • Best known results for Conjunctive Queries and Quantified Conjunctive Queries
  • 93. Semantic Query Optimizer: It knows math!
      • This optimizer produces much better code than the average developer because it knows a ton more math than the average developer.
      • Maryam Mirzakhani
      • Terence Tao
      • Ramanujan
      • Katherine Goble
      • Good Will Hunting
  • 95. Recursion
 def reachable = edge; reachable.edge
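Read as "reachable is edge, unioned with reachable composed with edge", the one-line recursive definition is a transitive closure. A minimal Python sketch of evaluating it as a fixpoint, assuming `edge` is a set of (source, target) pairs; the function name is illustrative.

```python
def transitive_closure(edge):
    """Iterate reachable = edge ∪ (reachable ∘ edge) until nothing new is derived."""
    reachable = set(edge)
    while True:
        # reachable.edge: join on the middle node y, keep the two end points.
        step = {(x, z) for (x, y) in reachable for (y2, z) in edge if y == y2}
        new = reachable | step
        if new == reachable:
            return reachable
        reachable = new


print(sorted(transitive_closure({(1, 2), (2, 3), (3, 4)})))
```

Each round recomputes the whole join, which is the naive strategy; the "Incremental Maintenance" slide later in the talk lists differential computation as the way to avoid that repeated work.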
  • 96. How many Paths are there from The top left node to the bottom right node? 2 Paths 6 Paths
  • 97. Recursive path counting
 def number_of_paths_of_length(node_number, path_length, path_count) =
     node_number=1, path_length=0, path_count=1
 def number_of_paths_of_length[node_number, path_length] =
     sum[other_node, paths_of_length :
         paths_of_length = number_of_paths_of_length[other_node, path_length - 1]
         and edge(other_node, node_number)]
 def output = number_of_paths_of_length[number_of_nodes, 2 * lattice_size]
  • 98. Naive recursion, iteration 1
 @function @transient def :_intermediate#0(other_node#1, path_length#0, _t#0) =
     reduce[(_x#0, _y#0, _z#0) : :rel_primitive_add(_x#0, _y#0, _z#0),
            (x#8, paths_of_length#1) : :number_of_paths_of_length(other_node#1, x#8, paths_of_length#1)
                and :rel_primitive_add(1, x#8, path_length#0),
            (_no_init#0) : false](_t#0)
 @function @transient def :_intermediate#1(node_number#0, path_length#0, path_count#0) =
     reduce[(_x#1, _y#1, _z#1) : :rel_primitive_add(_x#1, _y#1, _z#1),
            (other_node#1, _t#0) : :edge(other_node#1, node_number#0)
                and :_intermediate#0(other_node#1, path_length#0, _t#0),
            (_no_init#1) : false](path_count#0)
 def :number_of_paths_of_length(node_number#0, path_length#0, path_count#0) =
     :_base_case#0(node_number#0, path_length#0, path_count#0)
     or :_intermediate#1(node_number#0, path_length#0, path_count#0)
 Evaluating `_intermediate#0`: (1, 1) => (1,)
 Evaluating `_intermediate#1`: (2, 1) => (1,) (4, 1) => (1,)
 Evaluating `number_of_paths_of_length`: (1, 0, 1) (2, 1, 1) (4, 1, 1)
  • 99. Naive recursion, iteration 2 (same generated definitions as on slide 98)
 Evaluating `_intermediate#0`: (1, 1) => (1,) (2, 2) => (1,) (4, 2) => (1,)
 Evaluating `_intermediate#1`: (2, 1) => (1,) (3, 2) => (1,) (4, 1) => (1,) (5, 2) => (2,) (7, 2) => (1,)
 Evaluating `number_of_paths_of_length`: (1, 0, 1) (2, 1, 1) (3, 2, 1) (4, 1, 1) (5, 2, 2) (7, 2, 1)
  • 100. Naive recursion, iteration 3 (same generated definitions as on slide 98)
 Evaluating `_intermediate#0`: (1, 1) => (1,) (2, 2) => (1,) (3, 3) => (1,) (4, 2) => (1,) (5, 3) => (2,) (7, 3) => (1,)
 Evaluating `_intermediate#1`: (2, 1) => (1,) (3, 2) => (1,) (4, 1) => (1,) (5, 2) => (2,) (6, 3) => (3,) (7, 2) => (1,) (8, 3) => (3,)
 Evaluating `number_of_paths_of_length`: (1, 0, 1) (2, 1, 1) (3, 2, 1) (4, 1, 1) (5, 2, 2) (6, 3, 3) (7, 2, 1) (8, 3, 3)
  • 101. Naive recursion, iteration 4 (same generated definitions as on slide 98)
 Evaluating `_intermediate#0`: (1, 1) => (1,) (2, 2) => (1,) (3, 3) => (1,) (4, 2) => (1,) (5, 3) => (2,) (6, 4) => (3,) (7, 3) => (1,) (8, 4) => (3,)
 Evaluating `_intermediate#1`: (2, 1) => (1,) (3, 2) => (1,) (4, 1) => (1,) (5, 2) => (2,) (6, 3) => (3,) (7, 2) => (1,) (8, 3) => (3,) (9, 4) => (6,)
 Evaluating `number_of_paths_of_length`: (1, 0, 1) (2, 1, 1) (3, 2, 1) (4, 1, 1) (5, 2, 2) (6, 3, 3) (7, 2, 1) (8, 3, 3) (9, 4, 6)
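The four iterations can be reproduced with a short naive fixpoint loop. This sketch assumes the traces come from a 3 x 3 grid of nodes numbered row by row from 1, with edges pointing right and down; that layout is inferred from the tuples in the traces, and the function names are illustrative.

```python
def lattice_edges(size):
    """Right and down edges of a size x size grid of nodes, numbered row by row from 1."""
    edges = set()
    for row in range(size):
        for col in range(size):
            node = row * size + col + 1
            if col + 1 < size:
                edges.add((node, node + 1))
            if row + 1 < size:
                edges.add((node, node + size))
    return edges


def naive_path_counts(edges):
    """Re-derive every (node, path_length, path_count) fact each round until a fixpoint."""
    base = (1, 0, 1)  # one path of length 0 ends at node 1
    facts = {base}
    iterations = []
    while True:
        sums = {}
        for other, length, count in facts:
            for src, dst in edges:
                if src == other:
                    key = (dst, length + 1)
                    sums[key] = sums.get(key, 0) + count
        new_facts = {base} | {(n, l, c) for (n, l), c in sums.items()}
        if new_facts == facts:
            return facts, iterations
        facts = new_facts
        iterations.append(sorted(facts))


facts, iterations = naive_path_counts(lattice_edges(3))
for number, snapshot in enumerate(iterations, start=1):
    print(number, snapshot)
```

Each round rebuilds all facts from the previous round, so earlier tuples are recomputed again and again, just as the growing traces show. The fourth round derives (9, 4, 6): six paths of length 4 reach node 9.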
  • 103. Graph Analytics
 module graph_analytics[G]
 with G use node, edge
 
 def neighbor(x, y) = edge(x, y) or edge(y, x)
 def outdegree[x] = count[edge[x]]
 def degree[x] = count[neighbor[x]]
 def cn[x, y] = count[intersect[neighbor[x], neighbor[y]]] // Count of Common Neighbors
 
 def reachable = edge; reachable.edge
 def reachable_undirected = neighbor; reachable_undirected.neighbor
 
 def scc[x] = min[v: reachable(x, v) and reachable(v, x)] // Strongly Connected Component
 def wcc[x] = min[reachable_undirected[x]] // Weakly Connected Component
 
 def cosine_sim[x, y] = cn[x, y] / sqrt[degree[x] * degree[y]]
 def jaccard_sim[x, y] = cn[x, y] / (count[neighbor[x]] + count[neighbor[y]] - cn[x, y])
 …
 end
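For readers who do not know Rel, the neighborhood measures in the module can be mirrored in plain Python. A minimal sketch on a made-up edge set; the function names follow the module, and the Jaccard denominator is the size of the union of the two neighborhoods.

```python
from math import sqrt

# Hypothetical directed edges.
edges = {(1, 2), (1, 3), (2, 3), (3, 4)}


def neighbors(x):
    """Undirected neighborhood: nodes connected to x in either direction."""
    return {b for a, b in edges if a == x} | {a for a, b in edges if b == x}


def common_neighbors(x, y):
    return len(neighbors(x) & neighbors(y))


def cosine_sim(x, y):
    return common_neighbors(x, y) / sqrt(len(neighbors(x)) * len(neighbors(y)))


def jaccard_sim(x, y):
    cn = common_neighbors(x, y)
    # |N(x) ∪ N(y)| = |N(x)| + |N(y)| - |N(x) ∩ N(y)|
    return cn / (len(neighbors(x)) + len(neighbors(y)) - cn)


print(cosine_sim(1, 2), jaccard_sim(1, 2))
```

Nodes 1 and 2 share one neighbor (node 3) out of three distinct neighbors overall, so the Jaccard similarity is 1/3 and the cosine similarity is 0.5.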
  • 104. Graph Algorithms: Betweenness Centrality
      One of many graph centrality measures that are useful for assessing the importance of a node.
      High Level Definition: Number of times a node appears on shortest paths within a network
      Why it’s Useful: Identify which nodes control information flow between different areas of the graph; also called “Bridge Nodes”
      Business Use-Cases:
          Communication Analysis: Identify important people who communicate across different groups
          Retail Purchase Analysis: Which products introduce customers to new categories
  • 105. Betweenness Centrality Computation
      Brandes Algorithm is applied as follows:
      1. For each pair of nodes, compute all shortest paths and capture the nodes (excluding endpoints) on said path(s)
      2. For each pair of nodes, assign each node along the path a value of one if there is only one shortest path, or the fractional contribution (1/n) if there are n shortest paths
      3. Sum the values from step 2 for each node; this is the Betweenness Centrality
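The steps above can be sketched in Python using the usual Brandes formulation: a breadth-first search per source followed by a backward accumulation of fractional contributions. The sketch assumes an unweighted, undirected graph given as an adjacency dict; it is an illustration of the technique, not the implementation used in the talk.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes-style betweenness for an unweighted, undirected graph {node: [neighbors]}."""
    centrality = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s to v
        sigma[s] = 1
        preds = {v: [] for v in adj}  # predecessors on shortest paths from s
        order = []                    # nodes in order of non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, handing each predecessor its share.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                centrality[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in centrality.items()}


line = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
square = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(betweenness_centrality(line))
print(betweenness_centrality(square))
```

On the three-node line the middle node scores 1.0 because the only a-to-c shortest path runs through it. On the four-node cycle every node scores 0.5, since each opposite pair has two shortest paths and each intermediate node gets the fractional contribution 1/2.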
  • 106. Betweenness Centrality Implementation
 // Shortest path between s and t when they are the same is 0.
 def shortest_path[s, t] = Min[ v, w:
     (shortest_path(s, t, w) and v = 1) or
     (w = shortest_path[s,v] +1 and E(v, t))
 ]
 // When s and t are the same, there is only one shortest path between
 // them, namely the one with length 0.
 def nb_shortest(s, t, n) = V(s) and V(t) and s = t and n = 1
 // When s and t are *not* the same, it is the sum of the number of shortest
 // paths between s and v for all the v's adjacent to t and on the shortest
 // path between s and t.
 def nb_shortest(s, t, n) =
     s != t and
     n = sum[v, m:
         shortest_path[s, v] + 1 = shortest_path[s, t] and E(v, t) and nb_shortest(s, v, m)
     ]
 // sum over all t's such that there is an edge between v and t,
 // and v is on the shortest path between s and t
 def C[s, v] = sum[t, r:
     E(v, t) and shortest_path[s, t] = shortest_path[s, v] + 1 and
     ( a = C[s, t] or not C(s, t, _) and a = 0.0 ) and
     r = (nb_shortest[s, v] / nb_shortest[s, t]) * (1 + a)
 ] from a
 // Note that below we divide by 2 because we are double counting every edge.
 def betweenness_centrality_brandes[v] = sum[s, p : s != v and C[s, v] = p]/2
  • 107. Betweenness Centrality ReComputation: After an incremental update to the data, recomputing Betweenness Centrality takes only a few seconds, whereas other systems need to re-compute it over the entire graph.
  • 108. Algorithm Change ReComputation: Incremental changes to the code are also recomputed incrementally, whereas other systems need to re-run the entire algorithm.
  • 110. Incremental Maintenance
      1. Dependency tracking to figure out which views are affected by a change.
      2. Demand-driven execution to only compute what users are actively interested in.
      3. Differential computation to incrementally maintain even general recursion.
      4. Semantic optimization to recover better maintenance algorithms where possible.